> > how are the timings if you don't use all 8 cores?
> > does the job get faster again?
>
> N_core   total CPU time for one iteration
> ------   --------------------------------
>      1              398.89 sec
>      2              200.14
>      3              134.61
>      4              101.46
>      5               85.24
>      6               71.60
>      7               63.31
>      8               57.16
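For reference, here is a minimal Python sketch that turns the per-iteration CPU times quoted above into speedup and parallel efficiency, with a rough Amdahl's-law estimate of the serial fraction at 8 cores; the fit is only an illustration of how to read the scaling, not a claim about where the time actually goes.

    # speedup and parallel efficiency from the per-iteration CPU times above
    # (times in seconds, copied from the quoted table; cores 1..8)
    timings = {1: 398.89, 2: 200.14, 3: 134.61, 4: 101.46,
               5: 85.24, 6: 71.60, 7: 63.31, 8: 57.16}

    t1 = timings[1]
    print(f"{'cores':>5} {'time/s':>8} {'speedup':>8} {'efficiency':>10}")
    for n, t in sorted(timings.items()):
        speedup = t1 / t          # S(n) = T(1) / T(n)
        efficiency = speedup / n  # E(n) = S(n) / n
        print(f"{n:>5} {t:>8.2f} {speedup:>8.2f} {efficiency:>10.2f}")

    # rough Amdahl's-law serial fraction: S(n) = 1 / (f + (1 - f)/n)
    # => f = (n / S(n) - 1) / (n - 1), evaluated here at n = 8
    s8 = t1 / timings[8]
    f_serial = (8 / s8 - 1) / (8 - 1)
    print(f"estimated serial fraction (Amdahl, n=8): {f_serial:.3f}")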
Is it surprising?

> if you are still running the same "benchmark" that
> you were running before, your comparisons are most
> likely severely flawed.

No, Axel! That was a two-atom Au cluster with "relax" calculations on a
Pentium 4, 3.2 GHz, dual core. Now it is a slab.

> you never could prove to me that you are running a correctly
> compiled executable and mpi installation, so you may be comparing
> apples and oranges. your openmpi timings are highly suspicious.

I do not want to prove anything; I am just reporting my experience.
Anyone who is interested can verify it for themselves.

> i was showing you that openmpi _does_ behave properly
> on an example that specifically tests MPI performance
> and does not depend on anything else (like NFS i/o).

I agree with you. In that case you were comparing apples with oranges!

> MP> > if you see these kinds of differences, then there is something
> MP> > else causing problems.
> MP> >
> MP> > are you using processor and memory affinity with openmpi?
> MP>
> MP> I have no idea about these concepts. I just use (practice, as a
> MP> good(?) student) what you taught me during hpc08.
>
> processor affinity is tying a process to a specific CPU.
> in multi-processor/multi-core environments this has
> severe performance implications, as it improves cpu
> cache utilization. just stick those keywords into google
> and you'll see.
>
> MP> > what kind of processor is this exactly?
> MP> It is 5420.
>
> ok. so that is an intel quad-core. i have a bunch of 5430s
> available to me. please redo those tests with the 32-water
> cp.x input from example21 of the Q-E distribution, and
> then we can start discussing seriously. for as long as
> nobody can reproduce your benchmarks, they are useless.

I do not have any experience with CPMD.

> also, you still have a huge difference between wall
> clock and cpu clock. in short, you are trying to solve
> the least important problem first.
>
> i'd kindly ask you not to make claims about mpi implementations
> being "better" unless you can prove that the differences in
> timings are really due to the mpi implementation and not due
> to improper use of the machine or inadequate hardware.

I just reported my findings and tried to share them.

regards,
mahmoud

> cheers,
> axel.
>
> =======================================================================
> Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
> Center for Molecular Modeling -- University of Pennsylvania
> Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
> tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
> =======================================================================
> If you make something idiot-proof, the universe creates a better idiot.
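On the processor-affinity point quoted above, here is a minimal sketch of what "tying a process to a specific CPU" means on Linux, assuming Python 3.3+ for os.sched_setaffinity. MPI launchers normally handle the binding themselves through their own options, which differ between OpenMPI versions, so this is only meant to illustrate the concept.

    # illustration of processor affinity: pin the current process to one core
    # (Linux-only; os.sched_setaffinity requires Python 3.3+)
    import os

    print("allowed CPUs before:", sorted(os.sched_getaffinity(0)))

    # restrict this process to CPU 0 only; the scheduler then keeps it there,
    # which preserves the per-core cache contents between time slices
    os.sched_setaffinity(0, {0})

    print("allowed CPUs after: ", sorted(os.sched_getaffinity(0)))

Without such pinning, a process can migrate between cores and lose its cache working set each time, which is one reason unbound MPI runs can show erratic timings.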
