Dear Lorenzo and Michal,

Thank you for your replies. Yes, unfortunately I am using a small cluster
and, because of memory limitations, I cannot use the SSD-based local
scratch, so I am forced to use a non-parallel NFS. From your answers, I
understand that, given this setup, the huge difference between the CPU and
WALL times that I am observing is to be expected.
Since this is a hardware limitation, I guess I cannot do much to improve
the performance. I will try to find a solution with the person in charge of
the cluster.
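
As a rough check of that conclusion, the figures from the timing report
quoted below can be compared directly. This is only a sketch: the numbers
are copied from that report, the variable names are made up for
illustration, and it assumes that the davcio wall time is entirely
contained in the elphon wall time.

```python
# Timings copied from the ph.x report quoted below (all in seconds).
davcio_cpu, davcio_wall, davcio_calls = 107.89, 263856.08, 520331
elphon_cpu, elphon_wall = 41730.55, 309708.87

# Assumption: every davcio call happens inside elphon, so the two
# wall times can be compared directly.
io_share = davcio_wall / elphon_wall        # fraction of wall time spent in I/O
wall_per_call = davcio_wall / davcio_calls  # average wall seconds per davcio call
cpu_per_call = davcio_cpu / davcio_calls    # average CPU seconds per davcio call
non_io_wall = elphon_wall - davcio_wall     # wall time left once I/O is removed

print(f"davcio share of elphon wall time: {io_share:.1%}")
print(f"average wall time per davcio call: {wall_per_call:.3f} s")
print(f"average CPU time per davcio call:  {cpu_per_call:.2e} s")
print(f"elphon wall time without davcio:   {non_io_wall / 3600:.1f} h")
print(f"elphon CPU time:                   {elphon_cpu / 3600:.1f} h")
```

Under that assumption, davcio accounts for most of the wall time, and the
wall time that remains is close to the CPU time, which is consistent with
the processes mostly waiting on the file system.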

Best Regards,
Raffaello Bianco
UPV/EHU - CFM

On Wed, Sep 30, 2020 at 4:13 PM Michal Krompiec <[email protected]>
wrote:

> Dear Raffaello,
> Are you using a local (preferably SSD-based) scratch drive, or a very fast
> parallel file system?
> Best wishes,
> Michal Krompiec
> Merck KGaA
>
> On Wed, 30 Sep 2020 at 15:05, Raffaello Bianco <
> [email protected]> wrote:
>
>> Dear QE users and developers,
>>
>>  I am doing an electron-phonon coupling calculation in this way (I am
>> using QE v 6.6).
>> First, I have done an scf calculation. Then, I have done a phonon
>> calculation where I have printed the dvscf files, with
>>
>>    fildvscf         = 'dvscf'
>>
>> Subsequently, I have done the electron-phonon calculation changing the
>> k-mesh grid, with
>>
>>     trans            = .false.
>>     electron_phonon  = 'simple'
>>
>> The calculation ends correctly, but for some q points I have noticed a
>> huge difference between CPU and WALL time, for example:
>>
>> PHONON       :     15h55m CPU   3d18h56m WALL
>>
>> From the report at the end of the output, the I/O routine davcio seems to
>> be the culprit:
>>
>>
>>     General routines
>>      davcio       :    107.89s CPU 263856.08s WALL (  520331 calls)
>>
>>      Parallel routines
>>
>>       Electron-phonon coupling
>>      elphon       :  41730.55s CPU 309708.87s WALL (       1 calls)
>>      elphel       :  41671.20s CPU 309625.04s WALL (      60 calls)
>>
>>       General routines
>>      davcio       :    107.89s CPU 263856.08s WALL (  520331 calls)
>>
>>  This calculation was done with 10 processors and npool = 10. If I use 40
>> processors and npool = 10 it is worse (as can probably be expected, given
>> the higher number of I/O operations). I have looked at the documentation,
>> but I am not very familiar with these things, so I still have several
>> doubts. Any suggestions on tests to run or on how to improve performance,
>> or at least comments to clarify the problem, would be greatly appreciated.
>>
>> Thank you for your time,
>>
>> Best,
>> Raffaello Bianco
>>
>>
>> _______________________________________________
>> Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
>> users mailing list [email protected]
>> https://lists.quantum-espresso.org/mailman/listinfo/users