Dear all,

OK, I did a small test using a modified example 4 from the PP examples. Essentially, instead of using the band path given there, I used this one (chosen at random):

K_POINTS crystal_b
8
0 0 0 10
1 0 0 10
1 1 0 10
1 1 1 10
0 0 0 10
0 1 0 10
0 1 1 10
0 0 0 1

And I switched the pseudopotential among:

Pt.pz-n-rrkjus_psl.0.1.UPF
Pt.pz-n-kjpaw_psl.0.1.UPF

Pt.rel-pz-n-rrkjus_psl.0.1.UPF
Pt.rel-pz-n-kjpaw_psl.0.1.UPF
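
For reference, switching the pseudopotential just means changing the ATOMIC_SPECIES card of the pw.x input, roughly like this (a sketch, using the standard Pt mass; for the rel- variants one additionally needs noncolin = .true. and lspinorb = .true. in &SYSTEM):

ATOMIC_SPECIES
Pt  195.08  Pt.pz-n-rrkjus_psl.0.1.UPF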

Finally, I projected the wave functions onto atomic orbitals using projwfc.x with:

&PROJWFC
    prefix='Pt',
    outdir='$TMP_DIR/',
    ngauss = 0,
    degauss = 0.01,
    Emin = 8,
    Emax = 40,
    DeltaE = 0.01,
    lsym = .false.,
    kresolveddos = .true.,
    filproj = 'pt.band.dat.proj',
/

Or without the kresolveddos flag set to .true., i.e., deleting the last two entries above.
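
The runs were launched along these lines (just a sketch - the input/output file names here are placeholders):

mpirun -np 4 projwfc.x < pt.projwfc.in > pt.projwfc.out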

With kresolveddos = .true. I always observe that the memory used by one of the processes (out of 4 in total) increases to nearly twice that of the others. For example:

for paw (logged with top)
28710 tbrumme   20   0  877776  42480  22016 R 242.0  0.3   0:21.21 projwfc.x
28710 tbrumme   20   0  909748  74836  22592 R  98.0  0.5   0:25.11 projwfc.x

for rel-paw
28921 tbrumme   20   0  888844  52388  21380 R 227.5  0.3   0:35.94 projwfc.x
28921 tbrumme   20   0  920608  86028  22476 R 100.0  0.5   0:40.07 projwfc.x

for us
29285 tbrumme   20   0  870516  34372  21304 R 219.6  0.2   0:23.30 projwfc.x
29285 tbrumme   20   0  906888  71848  22272 R  98.0  0.4   0:25.95 projwfc.x

for rel-us
29102 tbrumme   20   0  878620  43500  21980 R 223.5  0.3   0:34.56 projwfc.x
29102 tbrumme   20   0  914472  79604  22324 R 102.0  0.5   0:39.10 projwfc.x
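
(For what it's worth, these snapshots were taken with plain interactive top; batch mode gives the same numbers, e.g. something like
top -b -d 5 -p $(pidof projwfc.x | tr ' ' ',')
where the PID list is whatever pidof returns on your machine.)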

This also happens in a serial calculation, but not when running with kresolveddos = .false. For the bands calculation the maximum memory usage looks like this (for rel-paw):
28850 tbrumme   20   0  892820  51972  21256 R 178.4  0.3   1:55.15 pw.x

which is comparable to the memory usage before the sudden increase. The estimated memory usage printed in the bands run tells me that I need at most 7.72 MB per process and 30.89 MB in total for the US potentials. The 34 MB given above (before the increase) is already more than that estimate - but OK, I know it is only an estimate, and the estimation was improved in a recent commit. Still, in the end one task uses even twice this estimate. Judging from the PID, I think it is the master process (the ionode?).

In my large calculation of MoS2 on MoS2, projwfc.x does not even reach the point of writing the DOS per atom, i.e., the *.pdos_atm#* files, so the crash must happen before that. One way of reducing the memory usage would obviously be to reduce the number of k points, and reducing the number of energy points apparently helps as well. It turns out that DeltaE crucially affects the memory used by that one process... So, while writing this email, I found a solution - more or less.
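
As a rough back-of-the-envelope check (assuming - I have not verified this in the source - that the k-resolved projections are stored per energy bin, per atomic wavefunction and per k point in double precision):

    number of energy points: ne = (Emax - Emin) / DeltaE = (40 - 8) / 0.01 = 3200
    extra memory ~ ne * natomwfc * nk * 8 bytes

For the small Pt test this is harmless, but with 138 atoms and 151 k points the product quickly reaches the GB range on whichever process holds these arrays.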

To cut a long story short:

If someone experiences the same problem, i.e., memory problems with projwfc.x in the k-resolved case, try reducing the number of energy points - that is, increase DeltaE and/or narrow the Emin/Emax window.
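
For example, something along these lines (the numbers are only illustrative, not the values I finally used):

&PROJWFC
    prefix='Pt',
    outdir='$TMP_DIR/',
    ngauss = 0,
    degauss = 0.01,
    Emin = 8,
    Emax = 40,
    DeltaE = 0.1,
    lsym = .false.,
    kresolveddos = .true.,
    filproj = 'pt.band.dat.proj',
/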

Cheerio

Thomas Brumme

On 08/14/18 12:03, Thomas Brumme wrote:
Dear all,

I'm struggling to project the wave functions onto the atoms in the k-resolved case. The job always crashes because of the memory limit. The system itself is quite large - 2 layers of MoS2, but rotated, 138 atoms in total. The band structure calculation for 151 k points finished without problems, using at most 1.72 GB RAM per core (100 cores in total). Starting the projwfc.x run with the same settings (100 cores, 2 GB RAM per core), the job is killed because it exceeds the memory limit. Increasing to 8 GB per core does not solve the problem.

What are the exact memory requirements of projwfc.x for the k-resolved case? I read in the forums that it shouldn't need more than the corresponding scf or bands run, should it? Then why do those runs finish while projwfc.x does not? I'm using version 6.2.1 compiled with the old XML format (as I started the calculation before the new XML format existed and had to stop in between). Furthermore, the normal (scf and bands) runs are parallelized via the standard R & G space division on 100 cores. And I'm using the relativistic PBE PAW pseudopotentials of the pslibrary, with 55 Ry and 440 Ry cutoffs.

Is the code reading in the wave functions of all k points at once, i.e., would
it help to reduce the number of k points?

Regards

Thomas


--
Dr. rer. nat. Thomas Brumme
Wilhelm-Ostwald-Institute for Physical and Theoretical Chemistry
Leipzig University
Phillipp-Rosenthal-Strasse 31
04103 Leipzig

Tel:  +49 (0)341 97 36456

email: [email protected]
