Thanks Davide,

I am running with the smearing option since the system is metallic.

I also noticed an interesting relation: GIPAW runs succeed only if the number of 
cores (np) is <= the number of k-points per pool (k-points/npool). I checked this 
in the 38-atom case, which kept failing whenever I chose more processors than 
k-points per pool, even though the SCF run finished successfully every time. I 
observed the same in other cases. Is this a general rule?

I will send you the files privately. 

Yasser

-----Original Message-----
From: Davide Ceresoli [mailto:[email protected]] 
Sent: Sunday, July 16, 2017 1:49 PM
To: Yasser Fowad AlWahedi <[email protected]>; PWSCF Forum 
<[email protected]>
Subject: Re: [Pw_forum] GIPAW acceleration

Dear Yasser,
     no problem! First of all, it seems to me that I/O is not a problem:
cputime ~= walltime, and the davcio routines consume only 1.88 s.

I compared calculations of similar size and I've got:
     wollastonite: 30 atoms, 36 k-points: 10h40m
     coesite:      48 atoms, 32 k-points: 19h20m
on a rather old (2008) Xeon E5520 2.27 GHz, 8 cores.

My timings are more favorable than your C1 results. However, if your system is 
a slab, the empty space carries a non-negligible extra cost.
You can try to minimize it as much as possible: NMR interactions are 
short-ranged, unlike electrostatic interactions.

Is your system metallic? Even if it has only a small band gap, I suggest using 
occupations='smearing'. This will speed up the linear response in GIPAW and 
improve convergence with respect to k-points.
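For instance, something along these lines in the &SYSTEM namelist of the SCF input (a sketch only; the smearing type and degauss value here are placeholders that you should converge for your system):

```fortran
&SYSTEM
  occupations = 'smearing'
  smearing    = 'mv'     ! Marzari-Vanderbilt cold smearing
  degauss     = 0.01     ! broadening width in Ry, to be converged
/
```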

Finally, the clock difference between the i7 (3.5 GHz) and the Xeon (2.2 GHz) 
can explain the difference in timing. The clock ratio is ~1.6, similar to the 
walltime ratio.

In any case, if you send me your input and output files privately, I can look 
at them in detail.

Best wishes,
     Davide






On 07/16/2017 10:26 AM, Yasser Fowad AlWahedi wrote:
> Dear Davide,
>
> Thanks for your support and my apologies for the late reply.  PW and GIPAW 
> are compiled using GNU compilers and the intel MKL libs.
>
> I am running DFT calculations of Ni2P clusters with various surfaces on two 
> computational rigs:
>
> 1) The university cluster: each node consists of two 8-core/8-thread Xeon 
> CPUs clocked at 2.2 GHz with 64 GB of RAM. I use only one node per 
> simulation. Storage is a mechanical hard drive. (Later called C1.)
>
> 2) My home PC: equipped with an i7-5930K processor (6 cores/12 threads) 
> clocked at 3.9 GHz and 128 GB of RAM (later called C2). For storage I use a 
> Samsung 850 EVO SSD.
>
> The table below summarizes the cases performed/running and the finish time, 
> or the expected finish time assuming linear extrapolation.
>
>
> # of atoms   npool   Cores   # k-points per pool   Computer   Time (hrs)
> 30           2       16      17                    C1         28.9
> 38           1       16      25                    C1         31.3
> 49           1       16      34                    C1         124.9*
> 50           2       16      17                    C1         474.6*
> 52           1       10      34                    C2         295.2*
>
> * estimated time of finish
>
> I understand that the cases are different and as such will require more or 
> less time to finish.
>
> But I noticed that the 50- and 52-atom cases, which are quite similar (same 
> number of k-points and a similar number of atoms) but were run on two 
> different systems, show substantially different finish times. My guess is 
> that this is due to the SSD being used to write the data: even though C2 
> uses fewer computational threads and handles more atoms, it is expected to 
> finish faster.
>
> I also noticed an interesting relation: GIPAW runs succeed only if the 
> number of cores (np) is <= the number of k-points per pool. I checked this 
> in the 38-atom case, which kept failing whenever I chose more processors 
> than k-points per pool, even though the SCF run finished successfully every 
> time. I observed the same in other cases. Is this a general rule?
>
> Below is the timing output of the 38-atom case:
>
> gipaw_setup  :      0.46s CPU      0.50s WALL (       1 calls)
>
>      Linear response
>      greenf       :  20177.91s CPU  20207.68s WALL (     600 calls)
>      cgsolve      :  20057.24s CPU  20086.82s WALL (     600 calls)
>      ch_psi       :  19536.93s CPU  19563.75s WALL (   44231 calls)
>      h_psiq       :  13685.97s CPU  13707.40s WALL (   44231 calls)
>
>      Apply operators
>      h_psi        :  44527.30s CPU  46802.35s WALL ( 5434310 calls)
>      apply_vel    :    262.98s CPU    263.30s WALL (     525 calls)
>
>      Induced current
>      j_para       :    559.19s CPU    560.39s WALL (     675 calls)
>      biot_savart  :      0.05s CPU      0.06s WALL (       1 calls)
>
>      Other routines
>
>      General routines
>      calbec       :  39849.22s CPU  37474.79s WALL (10917262 calls)
>      fft          :      0.12s CPU      0.15s WALL (      42 calls)
>      ffts         :      0.01s CPU      0.01s WALL (      10 calls)
>      fftw         :   8220.39s CPU   9116.72s WALL (27084278 calls)
>      davcio       :      0.02s CPU      1.88s WALL (     400 calls)
>
>      Parallel routines
>      fft_scatter  :   3533.10s CPU   3242.29s WALL (27084330 calls)
>
>      Plugins
>
>      GIPAW        : 112557.79s CPU 112726.12s WALL (       1 calls)
>
> Yasser
>
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] 
> On Behalf Of Davide Ceresoli
> Sent: Thursday, July 13, 2017 8:30 PM
> To: PWSCF Forum <[email protected]>
> Subject: Re: [Pw_forum] GIPAW acceleration
>
> Dear Yasser,
>      how many atoms? How many k-points? I/O can always be the reason, but in 
> my experience, if the system is very large, the time is dominated by 
> computation, not I/O.
> You should get some speedup with diagonalization='cg' in GIPAW.
>
> Anyway, if I have time, I will introduce a "disk_io" variable in the input 
> file, to try to keep more data in memory instead of on disk.
>
> Best regards,
>      Davide
>
>
> On 07/13/2017 10:02 AM, Yasser Fowad AlWahedi wrote:
>> Dear GIPAW users,
>>
>>
>>
>> For NMR shift calculations, I am suffering from the extreme slowness 
>> of GIPAW. I have noticed that GIPAW writes out results frequently 
>> for restart purposes. In our clusters, this data is stored on 
>> mechanical hard drives. Could that be a reason for the slowness?
>>
>>
>>
>> Yasser Al Wahedi
>>
>> Assistant Professor
>>
>> Khalifa University of Science and Technology
>>
>

-- 
+--------------------------------------------------------------+
   Davide Ceresoli
   CNR Institute of Molecular Science and Technology (CNR-ISTM)
   c/o University of Milan, via Golgi 19, 20133 Milan, Italy
   Email: [email protected]
   Phone: +39-02-50314276, +39-347-1001570 (mobile)
   Skype: dceresoli
+--------------------------------------------------------------+

_______________________________________________
Pw_forum mailing list
[email protected]
http://pwscf.org/mailman/listinfo/pw_forum
