Hello

I'm doing some testing with IMB and dicovered a strange thing:

Since I have a system with new AMD opteron 6276 processors I'm using
1.5.5rc3 since it supports binding to cores.

But when I run the barrier test form intel mpi benchmarks, the best I get
is:
#repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
          598     15159.56     15211.05     15184.70
 (/opt/openmpi-1.5.5rc3/intel12/bin/mpirun -x OMP_NUM_THREADS=1  -hostfile
hosts_all2all_2 -npernode 32 --mca btl openib,sm,self -mca
coll_tuned_use_dynamic_rules 1 -mca coll_tuned_barrier_algorithm 1 -np 256
openmpi-1.5.5rc3/intel12/IMB-MPI1 -off_cache 16,64 -msglog 1:16 -npmin 256
barrier)

And with openmpi 1.5.4 the result is much better:
#repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
         1000       113.23       113.33       113.28

(/opt/openmpi-1.5.4/intel12/bin/mpirun -x OMP_NUM_THREADS=1  -hostfile
hosts_all2all_2 -npernode 32 --mca btl openib,sm,self -mca
coll_tuned_use_dynamic_rules 1 -mca coll_tuned_barrier_algorithm 3 -np 256
openmpi-1.5.4/intel12/IMB-MPI1 -off_cache 16,64 -msglog 1:16 -npmin 256
barrier)

and still I couldn't come close to the result I got with mvapich:
#repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
         1000        17.51        17.53        17.53

(/opt/mvapich2-1.8/intel12/bin/mpiexec.hydra -env OMP_NUM_THREADS 1
-hostfile hosts_all2all_2 -np 256 mvapich2-1.8/intel12/IMB-MPI1 -mem 2
-off_cache 16,64 -msglog 1:16 -npmin 256 barrier)

I dunno if this is a bug or me doing something not the way I should. So is
there a way to improve my results?

Best regards,
Pavel Mezentsev

Reply via email to