Dear all,

OK, I did a small test using a modified example 4 from the PP examples. Essentially, instead of using the band path given there, I used this one (chosen at random):

K_POINTS crystal_b
8
0 0 0 10
1 0 0 10
1 1 0 10
1 1 1 10
0 0 0 10
0 1 0 10
0 1 1 10
0 0 0 1

And I switched the pseudopotential among:

Pt.pz-n-rrkjus_psl.0.1.UPF
Pt.pz-n-kjpaw_psl.0.1.UPF

Pt.rel-pz-n-rrkjus_psl.0.1.UPF
Pt.rel-pz-n-kjpaw_psl.0.1.UPF
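
For reference, switching the pseudopotential just means changing the ATOMIC_SPECIES card of the pw.x input, roughly like this (a sketch, using the standard Pt mass; for the rel- variants one additionally needs noncolin = .true. and lspinorb = .true. in &SYSTEM):

ATOMIC_SPECIES
Pt  195.08  Pt.pz-n-rrkjus_psl.0.1.UPF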

Finally, I projected the wave functions onto atomic orbitals using projwfc.x with:

&PROJWFC
    prefix='Pt',
    outdir='$TMP_DIR/',
    ngauss = 0,
    degauss = 0.01,
    Emin = 8,
    Emax = 40,
    DeltaE = 0.01,
    lsym = .false.,
    kresolveddos = .true.,
    filproj = 'pt.band.dat.proj',
/

Or without the kresolveddos flag set to .true., i.e., deleting the last two entries above.
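
The runs were launched along these lines (just a sketch - the input/output file names here are placeholders):

mpirun -np 4 projwfc.x < pt.projwfc.in > pt.projwfc.out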

With kresolveddos = .true. I always observe that the memory used by one of the processes (out of 4 in total) increases to nearly twice that of the others. For example:

for paw (logged with top)
28710 tbrumme   20   0  877776  42480  22016 R 242.0  0.3   0:21.21 projwfc.x
28710 tbrumme   20   0  909748  74836  22592 R  98.0  0.5   0:25.11 projwfc.x

for rel-paw
28921 tbrumme   20   0  888844  52388  21380 R 227.5  0.3   0:35.94 projwfc.x
28921 tbrumme   20   0  920608  86028  22476 R 100.0  0.5   0:40.07 projwfc.x

for us
29285 tbrumme   20   0  870516  34372  21304 R 219.6  0.2   0:23.30 projwfc.x
29285 tbrumme   20   0  906888  71848  22272 R  98.0  0.4   0:25.95 projwfc.x

for rel-us
29102 tbrumme   20   0  878620  43500  21980 R 223.5  0.3   0:34.56 projwfc.x
29102 tbrumme   20   0  914472  79604  22324 R 102.0  0.5   0:39.10 projwfc.x
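
(For what it's worth, these snapshots were taken with plain interactive top; batch mode gives the same numbers, e.g. something like
top -b -d 5 -p $(pidof projwfc.x | tr ' ' ',')
where the PID list is whatever pidof returns on your machine.)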

This also happens in a serial calculation, but not when running with kresolveddos = .false. For the bands calculation the maximum memory usage looks like this (for rel-paw):
28850 tbrumme   20   0  892820  51972  21256 R 178.4  0.3   1:55.15 pw.x

which is comparable to the memory usage before the sudden increase. The estimated memory usage printed in the bands run tells me that I need at most 7.72 MB per process and 30.89 MB in total for the US potentials. The 34 MB given above (before the increase) is already more than that estimate - but OK, I know it is only an estimate, and the estimation was improved in a recent commit. Still, in the end one task uses even twice this estimate. Judging from the PID, I think it is the master process (the ionode?).

In my large calculation of MoS2 on MoS2, projwfc.x does not even reach the point of writing the DOS per atom, i.e., the *.pdos_atm#* files, so the crash must happen before that. One way of reducing the memory usage would obviously be to reduce the number of k points, and reducing the number of energy points apparently helps as well. It turns out that DeltaE crucially affects the memory used by that one process... So, while writing this email, I found a solution - more or less.
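
As a rough back-of-the-envelope check (assuming - I have not verified this in the source - that the k-resolved projections are stored per energy bin, per atomic wavefunction and per k point in double precision):

    number of energy points: ne = (Emax - Emin) / DeltaE = (40 - 8) / 0.01 = 3200
    extra memory ~ ne * natomwfc * nk * 8 bytes

For the small Pt test this is harmless, but with 138 atoms and 151 k points the product quickly reaches the GB range on whichever process holds these arrays.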

To cut a long story short:

If someone experiences the same problem, i.e., memory problems with projwfc.x in the k-resolved case, try reducing the number of energy points - that is, increase DeltaE and/or narrow the Emin/Emax window.
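
For example, something along these lines (the numbers are only illustrative, not the values I finally used):

&PROJWFC
    prefix='Pt',
    outdir='$TMP_DIR/',
    ngauss = 0,
    degauss = 0.01,
    Emin = 8,
    Emax = 40,
    DeltaE = 0.1,
    lsym = .false.,
    kresolveddos = .true.,
    filproj = 'pt.band.dat.proj',
/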

Cheerio

Thomas Brumme

On 08/14/18 12:03, Thomas Brumme wrote:
Dear all,

I'm struggling to project the wave functions onto the atoms in the k-resolved case. The job always crashes because of the memory limit. The system itself is quite large - 2 layers of MoS2, but rotated, 138 atoms in total. The band structure calculation for 151 k points finished without problems, using at most 1.72 GB RAM per core (100 cores in total). Starting the projwfc.x run with the same settings (100 cores, 2 GB RAM per core), the job is killed because it exceeds the memory limit. Increasing to 8 GB per core does not solve the problem.

What are the exact memory requirements of projwfc.x for the k-resolved case? I read in the forums that it shouldn't need more than the corresponding scf or bands run, should it? Then why do those runs finish while projwfc.x does not? I'm using version 6.2.1 compiled with the old XML format (as I started the calculation before the new XML format existed and had to stop in between). Furthermore, the normal (scf and bands) runs are parallelized via the standard R & G space division on 100 cores. And I'm using the relativistic PBE PAW pseudopotentials of the pslibrary, with 55 Ry and 440 Ry cutoffs.

Is the code reading in the wave functions of all k points at once, i.e., would
it help to reduce the number of k points?

Regards

Thomas


--
Dr. rer. nat. Thomas Brumme
Wilhelm-Ostwald-Institute for Physical and Theoretical Chemistry
Leipzig University
Phillipp-Rosenthal-Strasse 31
04103 Leipzig

Tel:  +49 (0)341 97 36456

email: [email protected]
