[Wien] forrtl: severe (41): insufficient virtual memory (file attached!!)

2012-04-15 Thread hyunjung kim
Dear all,

(I'm sorry, I forgot to attach the file containing the error message and the
job script files.)

I constantly get the following error messages when the parallel job is
submitted.

I attach them.
The generated .machines file is also attached; please check whether it was
generated properly. I intended to run a job with 24 k-point-parallel tasks.

The compiler versions are:
fortran : ifort 12.0 (2011.3.174), mpif90 [I got the same error message with
ifort 11.1, so I guess that the Fortran version is not the origin of this
problem.]
openmpi : 1.4.5
FFTW2   : 2.1.5
CC      : icc 12.0 (2011.3.174)
Compiler options:
 O   Compiler options:        -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -mcmodel=medium -i-dynamic -traceback -I$(MKLROOT)/include
 L   Linker Flags:            $(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH) -pthread
 P   Preprocessor flags       '-DParallel'
 R   R_LIB (LAPACK+BLAS):     -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -openmp -lpthread

 RP  RP_LIB(SCALAPACK+PBLAS): -lmkl_scalapack_lp64 -lmkl_solver_lp64 -lmkl_blacs_lp64 -L$(FFTWPATH)/lib -lfftw_mpi -lfftw $(R_LIBS)
 FP  FPOPT(par.comp.options): -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -mcmodel=medium -i-dynamic -traceback -I$(MKLROOT)/include
 MP  MPIRUN commando:         mpirun -mca btl self,openib -mca plm_rsh_num_concurrent 400 -mca oob_tcp_listen_mode listen_thread -mca plm_rsh_tree_spawn 1 -np _NP_ -machinefile _HOSTS_ _EXEC_


The error messages are:
~~ abbreviation ~~
 LAPW0 END 
 LAPW0 END 
 LAPW0 END 
 LAPW0 END 
 LAPW0 END
 LAPW0 END 
 LAPW0 END
 LAPW0 END
 LAPW0 END
 LAPW0 END
 LAPW0 END
 LAPW0 END
 LAPW1 END 
 LAPW1 END 
 LAPW1 END 
 LAPW1 END 
 LAPW1 END
 LAPW1 END 
 LAPW1 END
 LAPW1 END
 LAPW1 END
 LAPW1 END
 LAPW1 END
 LAPW1 END
 LAPW1 END 
 LAPW1 END 
 LAPW1 END 
 LAPW1 END 
 LAPW1 END
 LAPW1 END 
 LAPW1 END
 LAPW1 END
 LAPW1 END
 LAPW1 END
 LAPW1 END
 LAPW1 END
forrtl: severe (41): insufficient virtual memory
Image              PC            Routine            Line     Source
libintlc.so.5      2B0540E88F7A  Unknown            Unknown  Unknown
libintlc.so.5      2B0540E87AF5  Unknown            Unknown  Unknown
libifcoremt.so.5   2B0540058CF2  Unknown            Unknown  Unknown
libifcoremt.so.5   2B053FFCAAAB  Unknown            Unknown  Unknown
libifcoremt.so.5   2B054001AFBA  Unknown            Unknown  Unknown
libifcoremt.so.5   2B054001AE11  Unknown            Unknown  Unknown
lapwso             004281C0      MAIN__             131      lapwso.f
lapwso             00402A9C      Unknown            Unknown  Unknown
libc.so.6          003CFA61D974  Unknown            Unknown  Unknown
lapwso             004029A9      Unknown            Unknown  Unknown
forrtl: severe (41): insufficient virtual memory
Image              PC            Routine            Line     Source
libintlc.so.5      2B5D32256F7A  Unknown            Unknown  Unknown
libintlc.so.5      2B5D32255AF5  Unknown            Unknown  Unknown
libifcoremt.so.5   2B5D31426CF2  Unknown            Unknown  Unknown
libifcoremt.so.5   2B5D31398AAB  Unknown            Unknown  Unknown
libifcoremt.so.5   2B5D313E8FBA  Unknown            Unknown  Unknown
libifcoremt.so.5   2B5D313E8E11  Unknown            Unknown  Unknown
lapwso             00409A6A      hmsout_mp_init_hm  78       modules.f
lapwso             004280E2      MAIN__             130      lapwso.f
lapwso             00402A9C      Unknown            Unknown  Unknown
libc.so.6          003CFA61D974  Unknown            Unknown  Unknown
~~ abbreviation ~~

Note that the compilation completed without any error messages.

Any advice will be greatly appreciated!


Hyun-Jung Kim (Ph.D student)| phone : ++82 10 7335 7889
Department of Physics   | 
Hanyang University  | e-mail: angpangmokjang at hanmail.net 
17 Haengdang-Dong   | 
133-791 Seongdong-Ku,Seoul/Korea|

www: http://physics.hanyang.ac.kr/~sst/

-- next part --
A non-text attachment was scrubbed...
Name: error.zip
Type: application/zip
Size: 8025 bytes
Desc: not available
URL: 
http://zeus.theochem.tuwien.ac.at/pipermail/wien/attachments/20120415/b93c185d/attachment.zip

[Wien] forrtl: severe (41): insufficient virtual memory (file attached!!)

2012-04-15 Thread Peter Blaha
The dayfile indicates that you are running a k-point-parallel (not
mpi-parallel) calculation with 8 k-parallel lapw1 jobs per node. Only lapw0
runs mpi-parallel.

However, the timing is strange:
tachyon1218(1) 527.132u 2.121s 25:49.23 34.1%
indicating that a job which should run for about 530 seconds (9 minutes)
actually needs 3 times as long.
This usually means that i) your memory is insufficient, or ii) somebody else
is using the same node too, or iii) it is not a real 8-core node but e.g.
only a 4-core node.
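
As a rough check of the arithmetic behind that timing line, here is a small
Python sketch. It assumes the fields follow the usual csh/tcsh `time` layout
(user CPU seconds, system CPU seconds, elapsed wall-clock time, CPU
percentage). The function name and the regular expression are only
illustrative and are not part of WIEN2k.

```python
import re

def parse_dayfile_timing(line):
    """Parse a csh-style timing such as '527.132u 2.121s 25:49.23 34.1%'.

    Returns (cpu_seconds, wall_seconds). Assumes the elapsed field is
    written as m:ss.ff or h:mm:ss (an assumption about the dayfile format).
    """
    m = re.search(r"([\d.]+)u\s+([\d.]+)s\s+([\d:.]+)\s+([\d.]+)%", line)
    if m is None:
        raise ValueError("no timing found in: %r" % line)
    user, system = float(m.group(1)), float(m.group(2))
    # Fold the colon-separated elapsed field into seconds.
    wall = 0.0
    for part in m.group(3).split(":"):
        wall = wall * 60.0 + float(part)
    return user + system, wall

cpu, wall = parse_dayfile_timing("tachyon1218(1) 527.132u 2.121s 25:49.23 34.1%")
print("CPU %.0f s, wall clock %.0f s, ratio %.2f" % (cpu, wall, wall / cpu))
```

The ratio of wall-clock to CPU time comes out a little below 3, which is the
"3 times as long" mentioned above.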

In any case, the error is in lapwso (which is never mpi-parallel), and it
seems rather clear that you do not have enough memory to run 8 parallel
lapwso jobs on one node.

Modify your script so that you use only 4 parallel jobs per node. That
should be much faster, and the memory should probably be sufficient.
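
A sketch of what limiting the run to 4 k-parallel jobs per node could look
like in .machines. The small Python helper below only builds the text. The
helper name is made up, the host names other than tachyon1218 are
placeholders, three nodes are assumed (24 jobs at 8 per node), and the
granularity/extrafine lines are commonly used settings rather than something
taken from the attached file.

```python
def kpoint_machines(hosts, jobs_per_node):
    """Return .machines text with `jobs_per_node` k-parallel jobs per host.

    Each '1:<host>' line requests one k-point-parallel job of weight 1
    on that host.
    """
    lines = []
    for host in hosts:
        lines.extend("1:%s" % host for _ in range(jobs_per_node))
    lines.append("granularity:1")
    lines.append("extrafine:1")
    return "\n".join(lines) + "\n"

# Placeholder host names; replace them with the nodes your queue assigns.
hosts = ["tachyon1218", "nodeB", "nodeC"]
text = kpoint_machines(hosts, jobs_per_node=4)
print(text)
```

With these assumptions the file requests 12 k-parallel jobs in total instead
of 24, so each lapwso job has roughly twice the memory available.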


On 15.04.2012 02:49, hyunjung kim wrote:
 [...]

-- 
Peter Blaha
Inst. Materials Chemistry, TU Vienna
Getreidemarkt 9, A-1060 Vienna, Austria
Tel: +43-1-5880115671
Fax: 

[Wien] forrtl: severe (41): insufficient virtual memory (file attached!!)

2012-04-14 Thread Laurence Marks
It is exactly what it says. You are trying to run more tasks on a single
cpu than you have memory for. The idea of mpi is to share cpu and memory.
If you have a cpu with 24 cores (unlikely) you might run (for instance) 3
tasks each using 8 cores, e.g. with three lines of node:8.

You probably only have 8 cores, so for a large job you might use node:8.
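
For illustration, the "three lines of node:8" could be written out as
follows. This is a minimal Python sketch: the helper name is invented, "node"
stands for a real host name, and it assumes the usual .machines form
weight:host:cores for an mpi-parallel task. Note that, according to the other
reply in this thread, lapwso itself does not run mpi-parallel, so this only
affects the programs that do.

```python
def mpi_machines_lines(host, cores_per_task, n_tasks):
    """Return .machines lines for `n_tasks` mpi tasks on `host`.

    Each line '1:<host>:<cores>' asks for one task of weight 1 that
    runs mpi-parallel on `cores_per_task` cores of that host.
    """
    return ["1:%s:%d" % (host, cores_per_task) for _ in range(n_tasks)]

# Hypothetical 24-core node split into three 8-core mpi tasks.
lines = mpi_machines_lines("node", cores_per_task=8, n_tasks=3)
print("\n".join(lines))
```

On an 8-core node the same helper with n_tasks=1 gives the single node:8 line
suggested above.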

Please do a little google searching on the principles of mpi, much better
than any email response.

---
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
Research is to see what everybody else has seen, and to think what nobody
else has thought
Albert Szent-Györgyi
On Apr 14, 2012 7:49 PM, hyunjung kim <angpangmokjang at hanmail.net> wrote:

 [...]