[Wien] Which mpi?
Dear all,

My comments below are based not only on Wien2k but also on other ab-initio/first-principles codes, which consume relatively more memory and run fewer iterations than, say, molecular dynamics codes.

> With the introduction of an mpi benchmark (thanks Peter), I would like to start a thread on this which would help me and perhaps others. Some questions:
> 1) Has anyone tested Intel's mpi to see how much better (if at all) it is?

In our case, three points need to be evaluated for this question: (1) stability, (2) launch procedure, and (3) speed. When I tried Intel's MPI with another pseudo-potential code, I saw no remarkable difference in any of the three points above, and I have never heard opinions strongly in favor of Intel's compared with MPICH1, MPICH2, LAM, or OpenMPI. One advantage of Intel's over the free ones is that if you buy it you receive support.

> 2) Is there much difference between mpich-1 and mpich-2?

In my experience, the statistical error is larger than the difference between mpich-1 and mpich-2, provided you do not hit bugs specific to a particular version. The exception is (2), the launch procedure: MPICH-1 and OpenMPI do not require any daemon to be started prior to an actual parallel run, while MPICH-2 and LAM require a daemon to be booted before a parallel run is executed. And if you use one that requires the daemon, you can kill all the processes of a parallel run safely. I'm (still) using mpich-1 (ver. 1.2.6) and mpich-2 (ver. 1.0.5p4), depending on... what? The weather of the day.

> 3) Is there much effect for 1) and 2) with ethernet versus myrinet or infiniband?

Sorry, I have no idea.

> 4) Should one use rsh/ssh or something different for multiple CPU's on one computer?

If you execute a parallel run using mpiXX, you need to use either rsh or ssh, even if you are only using the other cores/CPUs in the same computer. But configuring the routing table to use the loopback interface instead of the NIC to reach the machine's own other CPUs greatly reduces the loss in communication speed.

Hope this helps. I look forward to hearing others' opinions on Wien2k, since I have only a little experience with parallel Wien2k.

Masato
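As a rough illustration of the remark under 2) that the statistical error can be larger than the difference between two MPI implementations, here is a minimal Python sketch. It assumes you have repeated wall-clock timings of the same benchmark for each implementation. The function names and the numbers are hypothetical placeholders, not measurements, and comparing the difference of the means with the larger sample standard deviation is only one simple criterion among many.

```python
import statistics


def compare_timings(times_a, times_b):
    """Return (difference of mean times, scatter).

    'scatter' is the larger of the two sample standard deviations;
    this is a crude stand-in for the run-to-run statistical error.
    """
    diff = statistics.mean(times_a) - statistics.mean(times_b)
    scatter = max(statistics.stdev(times_a), statistics.stdev(times_b))
    return diff, scatter


def difference_exceeds_scatter(times_a, times_b):
    """True if the mean timings differ by more than the run-to-run scatter."""
    diff, scatter = compare_timings(times_a, times_b)
    return abs(diff) > scatter


# Placeholder wall-clock times in seconds, invented for illustration only.
mpich1_times = [101.0, 99.0, 103.0, 97.0]
mpich2_times = [100.0, 102.0, 98.0, 101.0]

if __name__ == "__main__":
    diff, scatter = compare_timings(mpich1_times, mpich2_times)
    print("difference of means:", diff, "s; scatter:", scatter, "s")
    print("difference exceeds scatter:",
          difference_exceeds_scatter(mpich1_times, mpich2_times))
```

With only a handful of runs per implementation such a check is indicative at best; more runs on an otherwise idle machine would be needed before drawing any conclusion.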
[Wien] strange time using -it switch
I think I am using $SCRATCH. In my .cshrc file, I have the line

    setenv SCRATCH ./

This machine is a shared-memory machine with 4 CPUs in one node. Since the communication between nodes is slow, I only use one node, in k-point parallel style, NOT MPI parallel.

The same thing happens on our IBM Linux cluster, which has 2 CPUs per node: the -it switch only works with the line without $para. It is not a shared-memory machine, and I use ssh for parallelization. Moreover, on this machine the -it switch runs into another problem. The first full-diagonalization iteration is fine, and memory is sufficient for the calculation, but when it switches to -it in the second iteration, after case.vector has been copied into case.vector_old correctly, it reports insufficient virtual memory:

    LAPW0 END
    LAPW1 END
    LAPW1 END
    LAPW1 END
    LAPW1 END
    LAPW2 - FERMI; weighs written
    LAPW2 END
    LAPW2 END
    LAPW2 END
    LAPW2 END
    SUMPARA END
    SUMPARA END
    CORE END
    MIXER END
    LAPW0 END
    forrtl: severe (41): insufficient virtual memory
    Image        PC        Routine   Line      Source
    lapw1        08548873  Unknown   Unknown   Unknown
    lapw1        08547E93  Unknown   Unknown   Unknown
    lapw1        0850C80E  Unknown   Unknown   Unknown
    lapw1        084DBFB8  Unknown   Unknown   Unknown
    lapw1        084F8832  Unknown   Unknown   Unknown
    lapw1        08098779  Unknown   Unknown   Unknown
    lapw1        08091A14  Unknown   Unknown   Unknown
    lapw1        08055F8C  Unknown   Unknown   Unknown
    lapw1        0807832E  Unknown   Unknown   Unknown
    lapw1        0804EA59  Unknown   Unknown   Unknown
    libc.so.6    400BE210  Unknown   Unknown   Unknown
    lapw1        0804E981  Unknown   Unknown   Unknown
    forrtl: severe (41): insufficient virtual memory
    Image        PC        Routine   Line      Source
    lapw1        08548873  Unknown   Unknown   Unknown
    lapw1        08547E93  Unknown   Unknown   Unknown
    lapw1        0850C80E  Unknown   Unknown   Unknown
    lapw1        084DBFB8  Unknown   Unknown   Unknown
    lapw1        084F8832  Unknown   Unknown   Unknown
    lapw1        08098779  Unknown   Unknown   Unknown
    lapw1        08091A14  Unknown   Unknown   Unknown
    lapw1        08055F8C  Unknown   Unknown   Unknown
    lapw1        0807832E  Unknown   Unknown   Unknown
    lapw1        0804EA59  Unknown   Unknown   Unknown
    libc.so.6    400BE210  Unknown   Unknown   Unknown
    lapw1        0804E981  Unknown   Unknown   Unknown
    forrtl: severe (41): insufficient virtual memory

Then I did a test, turning off the -it switch, and the job just ran smoothly.

Thank you very much,
Zhang

Peter Blaha wrote:
> Are you using $SCRATCH ? Is this a shared memory machine, do you use ssh or rsh for parallelization ?
> execute vec2old_lapw $para on the commandline, eventually add the -x switch in the first line of the script.

--
Address: Fritz-Haber-Institut, Abt. Theorie, Faradayweg 4-6, D-14195 Berlin (Germany)
Phone: +49 30 8413 4818
Fax: +49 30 8413 4701
Email: zhang at fhi-berlin.mpg.de
[Wien] strange time using -it switch
I can hardly help without more info. Anyway, without a local SCRATCH dir it should be OK even without $para. (Execute vec2old_lapw -p on the command line in this subdirectory. What do you get? If necessary, change the first line of the script to -fx.)

Yes, of course the iterative diagonalization needs some extra memory (basically two times the vector files plus some auxiliary arrays). So when full diagonalization just fits into memory, it is possible that -it will crash. For such large cases I'd use the mpi-parallel version anyway!

> The same thing happens on our IBM Linux cluster, which has 2 CPUs per node: the -it switch only works with the line without $para. It is not a shared-memory machine, and I use ssh for parallelization. Moreover, on this machine the -it switch runs into another problem. The first full-diagonalization iteration is fine, and memory is sufficient for the calculation, but when it switches to -it in the second iteration, after case.vector has been copied into case.vector_old correctly, it reports insufficient virtual memory:
>
>     LAPW0 END
>     LAPW1 END
>     LAPW1 END
>     LAPW1 END
>     LAPW1 END
>     LAPW2 - FERMI; weighs written
>     LAPW2 END
>     LAPW2 END
>     LAPW2 END
>     LAPW2 END
>     SUMPARA END
>     SUMPARA END
>     CORE END
>     MIXER END
>     LAPW0 END
>     forrtl: severe (41): insufficient virtual memory
>     Image        PC        Routine   Line      Source
>     lapw1        08548873  Unknown   Unknown   Unknown
>     lapw1        08547E93  Unknown   Unknown   Unknown
>     lapw1        0850C80E  Unknown   Unknown   Unknown
>     lapw1        084DBFB8  Unknown   Unknown   Unknown
>     lapw1        084F8832  Unknown   Unknown   Unknown
>     lapw1        08098779  Unknown   Unknown   Unknown
>     lapw1        08091A14  Unknown   Unknown   Unknown
>     lapw1        08055F8C  Unknown   Unknown   Unknown
>     lapw1        0807832E  Unknown   Unknown   Unknown
>     lapw1        0804EA59  Unknown   Unknown   Unknown
>     libc.so.6    400BE210  Unknown   Unknown   Unknown
>     lapw1        0804E981  Unknown   Unknown   Unknown
>     forrtl: severe (41): insufficient virtual memory
>     Image        PC        Routine   Line      Source
>     lapw1        08548873  Unknown   Unknown   Unknown
>     lapw1        08547E93  Unknown   Unknown   Unknown
>     lapw1        0850C80E  Unknown   Unknown   Unknown
>     lapw1        084DBFB8  Unknown   Unknown   Unknown
>     lapw1        084F8832  Unknown   Unknown   Unknown
>     lapw1        08098779  Unknown   Unknown   Unknown
>     lapw1        08091A14  Unknown   Unknown   Unknown
>     lapw1        08055F8C  Unknown   Unknown   Unknown
>     lapw1        0807832E  Unknown   Unknown   Unknown
>     lapw1        0804EA59  Unknown   Unknown   Unknown
>     libc.so.6    400BE210  Unknown   Unknown   Unknown
>     lapw1        0804E981  Unknown   Unknown   Unknown
>     forrtl: severe (41): insufficient virtual memory
>
> Then I did a test, turning off the -it switch, and the job just ran smoothly.
>
> Thank you very much,
> Zhang
>
> Peter Blaha wrote:
>> Are you using $SCRATCH ? Is this a shared memory machine, do you use ssh or rsh for parallelization ?
>> execute vec2old_lapw $para on the commandline, eventually add the -x switch in the first line of the script.

--
P.Blaha
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-15671
FAX: +43-1-58801-15698
Email: blaha at theochem.tuwien.ac.at
WWW: http://info.tuwien.ac.at/theochem/
--
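The memory remark in the reply above ("basically two times the vector files plus some auxiliary arrays") can be turned into a back-of-the-envelope check. The following is a minimal Python sketch under stated assumptions: it reads the rule literally as twice the summed size of the vector files, ignores the auxiliary arrays (so the result is only a lower bound), and leaves open whether the relevant size is per lapw1 process or for all files together. All function names are hypothetical and not part of Wien2k.

```python
import os


def it_extra_memory_lower_bound(vector_file_sizes):
    """Lower bound (bytes) on the extra memory needed by -it.

    Assumption: twice the summed vector-file sizes, taken literally
    from the rule of thumb; auxiliary arrays are NOT included.
    """
    return 2 * sum(vector_file_sizes)


def sizes_of(paths):
    """Sizes in bytes of the given files, e.g. the vector files of one case."""
    return [os.path.getsize(p) for p in paths]


def may_fit_in_memory(full_diag_bytes, vector_file_sizes, available_bytes):
    """True if full-diagonalization memory plus the -it lower bound still
    fits into the available memory. Because auxiliary arrays are ignored,
    True means 'may fit', while False means 'will probably not fit'.
    """
    needed = full_diag_bytes + it_extra_memory_lower_bound(vector_file_sizes)
    return needed <= available_bytes
```

For a real case one would pass something like `sizes_of(["case.vector"])` (the file name here is only an example) together with the memory lapw1 was observed to use in the full-diagonalization iteration. If the check already fails, a crash with -it as described in this thread would not be surprising.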