Dear all,
(I'm sorry, I forgot to attach the file containing the error messages and
the job script files.)
I constantly get the following error messages when the parallel job is submitted.
They are attached.
Also the generated .machines file is attached, please check whether it is
properly generated or not. I
The dayfile indicates that you are doing a non-MPI, but k-point parallel
calculation using
8 k-parallel lapw1 jobs per node (only lapw0 runs MPI-parallel).
However, the timing is strange:
tachyon1218(1) 527.132u 2.121s 25:49.23 34.1%
indicating that a job which should run 530 seconds (9 minutes)
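
To make the arithmetic behind that timing line explicit, here is a minimal Python sketch. It assumes the line follows the usual csh `time` layout (user seconds, system seconds, elapsed wall-clock time, CPU percentage); the function name `parse_time_line` is made up for illustration and is not part of WIEN2k.

```python
import re

def parse_time_line(line):
    """Return (user_s, system_s, wall_s) from a csh-style timing line.

    Assumed layout: '<label> <user>u <system>s <[hh:]mm:ss.ss> <cpu>%'.
    """
    m = re.search(r"([\d.]+)u\s+([\d.]+)s\s+([\d:.]+)\s+([\d.]+)%", line)
    if m is None:
        raise ValueError("not a recognised timing line: " + line)
    user = float(m.group(1))
    system = float(m.group(2))
    # Convert [hh:]mm:ss.ss to seconds, most significant field first.
    wall = 0.0
    for part in m.group(3).split(":"):
        wall = wall * 60 + float(part)
    return user, system, wall

user, system, wall = parse_time_line(
    "tachyon1218(1) 527.132u 2.121s 25:49.23 34.1%"
)
# Fraction of the elapsed time the job actually spent on the CPU.
cpu_fraction = (user + system) / wall
print(f"CPU time {user + system:.1f} s, wall time {wall:.1f} s, "
      f"CPU fraction {cpu_fraction:.3f}")
```

The job used about 529 s of CPU time but needed about 1549 s of wall-clock time, so it was on the CPU only about a third of the time.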
It is exactly what it says. You are trying to run more tasks on a single
cpu than you have memory for. The idea of mpi is to share cpu and memory.
If you have a cpu with 24 cores (unlikely) you might run (for instance) 3
tasks each using 8 cores, e.g. with three lines of node:8.
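
As a rough illustration of the "three lines of node:8" layout, the sketch below builds such lines in Python. It assumes the usual .machines convention of a leading weight (`1:`) in front of `host:cores`; the host name `node` is the placeholder from the message above, and the helper `machines_lines` is hypothetical. Check the result against your own cluster setup and the WIEN2k user's guide before using it.

```python
def machines_lines(host, tasks, cores_per_task, weight=1):
    """Build one '<weight>:<host>:<cores>' line per parallel task.

    Assumption: each line describes one task that runs MPI-parallel
    on `cores_per_task` cores of `host`.
    """
    return [f"{weight}:{host}:{cores_per_task}" for _ in range(tasks)]

# 3 tasks x 8 cores = 24 cores, matching the example in the text.
lines = machines_lines("node", tasks=3, cores_per_task=8)
total_cores = 3 * 8
print("\n".join(lines))
```

Tasks times cores per task should not exceed the number of physical cores, and the memory needed by all tasks together has to fit on the node.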
You probably only