Hi George,

There seems to be data corruption in the Reduce_scatter function.
I discovered it when I added -DCHECK to the IMB benchmark, and it seems to have been there for ages. The benchmark runs with Voltaire MPI but fails with OMPI. You will get a segv with IMB 3.1 and an error with IMB 3.0:

host# VER=TRUNK ; /home/USERS/lenny/OMPI_ORTE_${VER}/bin/mpirun -np 2 -H witch8 /home/BENCHMARKS/PALLAS/IMB_3.0v/src/IMB-MPI1_${VER} Reduce_scatter
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.0v modified by Voltaire, MPI-1 part
#---------------------------------------------------
# Date                  : Tue Sep 23 18:05:35 2008
# Machine               : x86_64
# System                : Linux
# Release               : 2.6.16.46-0.12-smp
# Version               : #1 SMP Thu May 17 14:00:09 UTC 2007
# MPI Version           : 2.0
# MPI Thread Environment: MPI_THREAD_SINGLE
#
# Minimum message length in bytes: 0
# Maximum message length in bytes: 67108864
#
# MPI_Datatype                : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op                      : MPI_SUM
#
#
# List of Benchmarks to run:
# Reduce_scatter
#-----------------------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 2
#-----------------------------------------------------------------------------
#Benchmarking   #procs  #bytes  #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]  defects
Reduce_scatter       2       0          1000         0.05         0.05         0.05     0.00
0: Error Reduce_scatter, size = 4, sample #0
Process 0: Got invalid buffer:
Buffer entry: 817291591680.000000
pos: 0
Process 0: Expected buffer:
Buffer entry: 0.000000
Reduce_scatter       2       4          1000         0.98         1.06         1.02     1.00
Application error code 1 occurred
[witch8:10190] MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 17
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 10190 on
node witch8 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
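
For reference, here is a rough pure-Python model of what a correct Reduce_scatter with MPI_SUM should deliver, and of the kind of comparison a -DCHECK style validation performs. It is only a sketch of the semantics as the MPI standard describes them, not IMB's actual checking code. The function names (reduce_scatter_sum, find_defects) and the sample buffers are made up for illustration; only the bad value 817291591680.0 is taken from the log above.

```python
from itertools import accumulate


def reduce_scatter_sum(sendbufs, recvcounts):
    """Model of MPI_Reduce_scatter with MPI_SUM (hypothetical helper).

    sendbufs[r] is the send buffer of rank r, recvcounts[r] is the number
    of elements rank r receives. Returns one receive buffer per rank.
    """
    if len(recvcounts) != len(sendbufs):
        raise ValueError("need one recvcount per rank")
    total = sum(recvcounts)
    if any(len(buf) != total for buf in sendbufs):
        raise ValueError("every send buffer must hold sum(recvcounts) elements")
    # Step 1: element-wise sum over all ranks.
    reduced = [sum(column) for column in zip(*sendbufs)]
    # Step 2: rank r gets the r-th segment of the reduced vector.
    offsets = [0] + list(accumulate(recvcounts))
    return [reduced[offsets[r]:offsets[r + 1]] for r in range(len(sendbufs))]


def find_defects(received, expected):
    """List (rank, pos, got, want) for every entry that does not match."""
    return [
        (rank, pos, got, want)
        for rank, (rbuf, ebuf) in enumerate(zip(received, expected))
        for pos, (got, want) in enumerate(zip(rbuf, ebuf))
        if got != want
    ]


# Illustrative data only: two ranks, one float delivered to each.
sendbufs = [[0.0, 1.0], [0.0, 1.0]]
expected = reduce_scatter_sum(sendbufs, recvcounts=[1, 1])

# Pretend rank 0 got the garbage value reported in the log.
received = [[817291591680.0], [2.0]]
defects = find_defects(received, expected)
```

With this made-up input, rank 0 should receive 0.0 at position 0, so the garbage value is flagged as a single defect on rank 0, much like the "Got invalid buffer" / "Expected buffer" pair in the log.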