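> but this does NOT work:
> OMP_NUM_THREAD=1
> ./myprogram
>
> mysteries of bash shell...

I think this is because bash sets a variable only temporarily, for a single command, when the assignment is written on the same line as that command. On a line of its own the assignment creates a plain shell variable, which the program does not see unless it is exported.
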
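The three ways of setting the variable can be compared with a small sketch. It is only an illustration: `sh -c` is used here as a stand-in for ./myprogram (an assumption, any child process should behave the same way), and the values 1, 2 and 4 are arbitrary.

```shell
#!/bin/sh
# Stand-in for ./myprogram: a child shell that prints the value it inherits.
# (Hypothetical command, used only for this illustration.)
child='echo "${OMP_NUM_THREADS:-unset}"'

# Start from a clean state in case the variable is already exported.
unset OMP_NUM_THREADS

# 1) Assignment on a line of its own: a plain, unexported shell variable.
#    The child process does not see it.
OMP_NUM_THREADS=1
own_line=$(sh -c "$child")

# 2) Assignment as a prefix on the same line: placed in the environment
#    of that one command only.
same_line=$(OMP_NUM_THREADS=2 sh -c "$child")

# 3) Exported: inherited by every command started afterwards.
export OMP_NUM_THREADS=4
exported=$(sh -c "$child")

echo "own line:  $own_line"    # unset
echo "same line: $same_line"   # 2
echo "exported:  $exported"    # 4
```

So for a job script, either the prefix form or an explicit export before the mpiexec line should do.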
Anyway, I ran a few tests, as suggested by Axel, on 32 water molecules (example 21). I started to play with version 4.1 and forgot that it is still under testing, so the following results refer to version 4.1CVS... sorry...

I built two executables: v0 is compiled without -ipo, v1 with -ipo. I still have to try the sequential MKL, but there is no lib-sequential in my mkl lib directory, so I am going to try some compiler switches instead.

Here are the results (read OMP as OMP_NUM_THREADS and MPI as the number of CPUs used with mpiexec):

OMP=8, no MPI:
v0: CP : 4m 1.98s CPU time, 3m 0.83s wall time
v1: CP : 3m49.07s CPU time, 2m57.33s wall time

OMP=1, MPI=8:
v0: CP : 0m36.43s CPU time, 1m31.59s wall time
v1: CP : 0m35.39s CPU time, 1m30.67s wall time

There is a clear advantage in using MPI, even though the %CPU of each process (with MPI) is only between 40 and 50%. In fact the CPU time is only about 40% of the wall time. I suspect that Axel is right in pointing to the disk I/O. I'm afraid that to increase the performance I should set up RAID 10...

OMP=2, MPI=8:
v0: CP : 0m46.98s CPU time, 1m36.08s wall time
(a little worse than OMP=1, MPI=8)

OMP=4, MPI=4:
v0: CP : 1m25.82s CPU time, 2m 4.96s wall time

OMP=8, MPI=8:
v0: CP : 1m17.36s CPU time, 2m11.83s wall time

Here there is probably some interference between the two parallel approaches.

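The ratio of CPU time to wall time for the runs above can be worked out with a few lines of shell. This is only a sketch: the function names to_seconds and cpu_wall_ratio are made up for the illustration, and the timings are typed in by hand from the v0 results rather than parsed from the CP output.

```shell
#!/bin/sh
# Use "." as the decimal separator regardless of the user's locale.
export LC_ALL=C

# Convert a time written as "1m31.59s" into seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.2f\n", $1 * 60 + $2 }'
}

# Print CPU time divided by wall time, rounded to two decimals.
cpu_wall_ratio() {
    awk -v cpu="$(to_seconds "$1")" -v wall="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", cpu / wall }'
}

# v0, OMP=1, MPI=8 (32 water molecules)
echo "OMP=1 MPI=8 : $(cpu_wall_ratio 0m36.43s 1m31.59s)"   # 0.40

# v0, OMP=8, no MPI (32 water molecules)
echo "OMP=8 no MPI: $(cpu_wall_ratio 4m1.98s 3m0.83s)"     # 1.34
```

The value above 1 for the OMP=8 run presumably just means that the reported CPU time is summed over the threads, so the two numbers are not directly comparable.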
I also did a couple of runs on 64 water molecules:

OMP=1, MPI=8:
v0: CP : 3m41.49s CPU time, 7m14.67s wall time

OMP=8, no MPI:
v0: CP : 15m27.18s CPU time, 12m56.67s wall time

In this case the ratio of CPU time to wall time is slightly better, and in fact the %CPU usage was between 50 and 60%.

Furthermore, I also ran tests/check-pw.x.j, on version v1 only:

./check-pw.x.j
Checking atom-lsda...passed
Checking atom-pbe...discrepancy in number of scf iterations detected
Reference: 7, You got: 5
discrepancy in pressure detected
Reference: -14.44, You got: -14.48
Checking atom-sigmapbe...discrepancy in pressure detected
Reference: -15.02, You got: -15.01
Checking atom...passed
Checking berry...passed
Checking berry, step 2 ...passed
Checking electric0...discrepancy in number of scf iterations detected
Reference: 8, You got: 9
Checking electric1...passed
Checking electric2...passed
Checking eval_infix...passed
Checking eval_infix, step 2 ...discrepancy in HOMO detected
Reference: -8.4542, You got: -8.4554
discrepancy in LUMO detected
Reference: -0.4297, You got: -0.4300
Checking lattice-ibrav0-abc...passed
Checking lattice-ibrav0-cell_parameters+a...passed
Checking lattice-ibrav0-cell_parameters+celldm...passed
Checking lattice-ibrav0-cell_parameters...passed
Checking lattice-ibrav1-kauto...passed
Checking lattice-ibrav1...passed
Checking lattice-ibrav10-kauto...passed
Checking lattice-ibrav10...passed
Checking lattice-ibrav11-kauto...passed
Checking lattice-ibrav11...passed
Checking lattice-ibrav12-kauto...passed
Checking lattice-ibrav12...passed
Checking lattice-ibrav13-kauto...passed
Checking lattice-ibrav13...passed
Checking lattice-ibrav14-kauto...passed
Checking lattice-ibrav14...passed
Checking lattice-ibrav2-kauto...passed
Checking lattice-ibrav2...passed
Checking lattice-ibrav3-kauto...passed
Checking lattice-ibrav3...passed
Checking lattice-ibrav4-kauto...passed
Checking lattice-ibrav4...passed
Checking lattice-ibrav5-kauto...passed
Checking lattice-ibrav5...passed
Checking lattice-ibrav6-kauto...passed
Checking lattice-ibrav6...passed
Checking lattice-ibrav7-kauto...passed
Checking lattice-ibrav7...passed
Checking lattice-ibrav8-kauto...passed
Checking lattice-ibrav8...passed
Checking lattice-ibrav9-kauto...passed
Checking lattice-ibrav9...passed
Checking lda+U-noU...passed
Checking lda+U-user_ns...passed
Checking lda+U...passed
Checking lsda-cg...passed
Checking lsda-mixing_TF...passed
Checking lsda-mixing_localTF...passed
Checking lsda-mixing_ndim...passed
Checking lsda-nelup+neldw...passed
Checking lsda-tot_magnetization...passed
Checking lsda...passed
Checking lsda, step 2 ...passed
Checking md-pot_extrap1...passed
Checking md-pot_extrap2...passed
Checking md-wfc_extrap1...passed
Checking md-wfc_extrap2...passed
Checking md...passed
Checking metaGGA...passed
Checking metadyn...passed
Checking metal-fermi_dirac...passed
Checking metal-gaussian...passed
Checking metal-tetrahedra...passed
Checking metal-tetrahedra, step 2 ...passed
Checking metal...passed
Checking metal, step 2 ...passed
Checking neb1-H2+H...passed
Checking neb2-H2+H-symm...passed
Checking neb3-H2+H-asym...passed
Checking noncolin-cg...passed
Checking noncolin-constrain_angle...passed
Checking noncolin-constrain_atomic...discrepancy in total energy detected
Reference: -55.690284, You got: -55.690283
discrepancy in number of scf iterations detected
Reference: 12, You got: 14
Checking noncolin-constrain_total...discrepancy in total energy detected
Reference: -55.544783, You got: -55.544784
discrepancy in number of scf iterations detected
Reference: 32, You got: 30
Checking noncolin...discrepancy in pressure detected
Reference: 193.22, You got: 193.53
Checking noncolin, step 2 ...passed
Checking paw-atom...passed
Checking paw-atom_l=2...passed
Checking paw-atom_lda...passed
Checking paw-atom_spin...discrepancy in total energy detected
Reference: -41.264991, You got: -41.265001
discrepancy in number of scf iterations detected
Reference: 6, You got: 5
Checking paw-atom_spin_lda...discrepancy in total energy detected
Reference: -40.244090, You got: -40.244091
Checking paw-bfgs...discrepancy in number of scf iterations detected
Reference: 7, You got: 8
Checking paw-vcbfgs...passed
Checking relax-damped...passed
Checking relax-el...passed
Checking relax...passed
Checking relax2-bfgs_ndim3...passed
Checking relax2...passed
Checking scf-cg...passed
Checking scf-disk_io...passed
Checking scf-gamma...passed
Checking scf-k0...passed
Checking scf-kauto...passed
Checking scf-mixing_TF...passed
Checking scf-mixing_beta...passed
Checking scf-mixing_localTF...passed
Checking scf-mixing_ndim...passed
Checking scf-ncpp...discrepancy in total energy detected
Reference: -15.839765, You got: -15.839767
Checking scf-wf_collect...passed
Checking scf...passed
Checking scf, step 2 ...passed
Checking spinorbit...passed
Checking spinorbit, step 2 ...passed
Checking uspp-cg...passed
Checking uspp-mixing_TF...passed
Checking uspp-mixing_localTF...passed
Checking uspp-mixing_ndim...passed
Checking uspp-singlegrid...passed
Checking uspp...passed
Checking uspp, step 2 ...passed
Checking uspp1-coulomb...passed
Checking uspp1...passed
Checking uspp2...discrepancy in pressure detected
Reference: -30.68, You got: -30.69
Checking vc-relax1...passed
Checking vc-relax2...passed
Total wall time (s) spent in this run: 5274.73
Reference : 720.04

There are a few discrepancies, but they seem reasonable... or not? The run just takes very long (5274.73 s against the 720.04 s reference)...

Carlo

P.S.: I hope this message was not too long...

------------------------------------------------------
Carlo Nervi
carlo.nervi at unito.it
Tel: +39 011 6707507/8
Fax: +39 011 6707855 - Dipartimento di Chimica IFM
via P. Giuria 7, 10125 Torino, Italy
http://lem.ch.unito.it/
