ehm ...
I should have read the guide before answering, sorry Guido;
I would surely have been more helpful.
Following the instructions that the CINECA module qe/6.3_knl prints
when it is loaded, and using the cpu-bind option of srun,
srun --cpu-bind=cores pw.x < pwin > pwout
version 6.3 also runs smoothly with cpus-per-task=2 and tasks-per-node=68.
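
For reference, the settings above could be collected in a batch script along the following lines. This is only a sketch: the binding option is written with Slurm's `--cpu-bind` spelling, the `OMP_NUM_THREADS` setting is an assumption (one OpenMP thread per CPU reserved for the task), and account, partition and wall-time directives are left out because they depend on the site. The module line is commented out; the hints the module prints on load take precedence over anything here.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=68   # MPI tasks per node, as in the message above
#SBATCH --cpus-per-task=2      # logical CPUs reserved for each MPI task
# Account, partition and wall-time directives are site specific and omitted here.

# module load qe/6.3_knl       # module named above; follow the hints it prints on load

# Assumption: one OpenMP thread per CPU reserved for the task.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-2}"

# pwin and pwout are the input and output file names used in the message above.
run_cmd="srun --cpu-bind=cores pw.x"
if command -v srun >/dev/null 2>&1; then
    $run_cmd < pwin > pwout
else
    # Outside a Slurm environment, only show what would be launched.
    echo "would run: $run_cmd < pwin > pwout (OMP_NUM_THREADS=$OMP_NUM_THREADS)"
fi
```

On a machine without Slurm the script only prints the command it would launch, which makes it easy to check the settings before submitting.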
Pietro
On 31/01/19 14:41, Guido Fratesi wrote:
Dear Pietro, Paolo and Davide,
Thank you for your hints. Indeed, by changing the number of CPUs the
calculation *may* also converge with QE 6.3. For example:
2pools-x-34cpus-x-2omp (i.e. #MPIxOpenMP cores = #cpus)
2pools-x-8cpus-x-2omp
6pools-x-8cpus-x-2omp
are OK, but
2pools-x-68cpus-x-1omp (i.e. #MPIxOpenMP cores = #cpus)
again fails to converge, even though I am not asking for more tasks than
CPUs (see Pietro's comment). Also, the KNL nodes in A2 should support
4x hyperthreading:
https://wiki.u-gov.it/confluence/display/SCAIUS/UG3.1%3A+MARCONI+UserGuide#UG3.1:MARCONIUserGuide-SystemArchitecture
so I would not expect that asking for a number of threads that is
twice the number of allocated CPUs would be a problem, nor is it one for
QE 6.0 or for the inputs with the molecule/surface.
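
The thread counts behind this argument can be checked with a few lines of shell. This is only a capacity check and says nothing about why the convergence fails. It assumes a node with 68 cores (the "#cpus" in the list above) and 4 hardware threads per core; the last case, 68 MPI tasks with 2 OpenMP threads each, is my reading of "twice the number of allocated CPUs" and is an assumption.

```shell
#!/bin/bash
# Assumed node geometry: 68 cores, each with 4 hardware threads.
cores=68
hw_threads_per_core=4
logical_cpus=$((cores * hw_threads_per_core))

# report MPI_TASKS OMP_THREADS
# Compares the total number of threads with the capacity of the node.
report() {
    local tasks=$1 omp=$2
    local total=$((tasks * omp))
    local verdict="fits on physical cores"
    if [ "$total" -gt "$logical_cpus" ]; then
        verdict="oversubscribed"
    elif [ "$total" -gt "$cores" ]; then
        verdict="needs hyperthreading"
    fi
    echo "${tasks} MPI x ${omp} OpenMP = ${total} threads: ${verdict}"
}

report 34 2   # converges
report 8 2    # converges
report 68 1   # does not converge
report 68 2   # assumed "twice the CPUs" case
```

Only the last layout goes beyond the 68 physical cores, and it still stays well below the 272 logical CPUs that 4x hyperthreading would provide.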
I thought this could be related to the size of the system, since I had
no problems with the heavier molecule/surface case; however, the
problem is also present for larger clean-Au(111) unit cells.
I can now work around the issue, thank you. I'd also be curious to know
what the reason is...
Guido
_______________________________________________
users mailing list
[email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users