Jeff,

On Monday 06 July 2009 11:05:16 am Jeff Squyres wrote:
> I notice that in the new HLRS mpi_test_suite, I'm getting oodles of

Well, the test suite is not really new (it was started some time around 2003).
Regular ompi testing is new ;-)) Thanks for that, Jeff!

> errors with the MPI_TYPE_MIX and MPI_SHORT_INT datatypes (Linux/x86_64).
> I have to run with:
>
>   mpirun mpi_test_suite -d All,\!MPI_TYPE_MIX,\!MPI_SHORT_INT
>
> (which excludes these two types)
>
> I can't quite follow the test suite code, but MPI_TYPE_MIX is some
> kind of derived MPI datatype.

Yes. Basically MPI_TYPE_MIX (and MPI_TYPE_MIX_LB_UB) is a struct of 11 basic types:

  MPI_Datatype mix_type[11] = {MPI_CHAR, MPI_SHORT, MPI_INT, MPI_LONG,
                               MPI_FLOAT, MPI_DOUBLE, MPI_FLOAT_INT,
                               MPI_DOUBLE_INT, MPI_LONG_INT, MPI_SHORT_INT,
                               MPI_2INT};

Now, as it contains MPI_SHORT_INT (which contains a hole), the cause of the
problem may be the same for both types! This has to be investigated.

> Is something wrong with our datatype engine? Or are these tests
> faulty?

First of all, the MPI standard requires types such as MPI_FLOAT_INT or
MPI_SHORT_INT to be usable in reduction operations. Nevertheless, they should
be fine here.

Now, MPIch2-1.1 works fine with all the datatypes (including MPI_TYPE_MIX):

------------------------------
mpirun -np 2 ./mpi_test_suite -t 'P2P,Collective' -r FULL -x strict
P2P tests Ring (3/44), comm MPI_COMM_WORLD (1/13), type MPI_CHAR (1/29)
...
Collective tests Alltoall (47/44), comm Intracomm merged of the Halved Intercomm (13/13), type MPI_TYPE_MIX_LB_UB (29/29)
Number of failed tests:0
------------------------------

> I don't know if anyone has run this test suite with any regularity before,
> so I don't know which it is...

Tests with these datatypes have been run on IBM's MPI, NEC's MPI (derived from
MPIch) and Intel MPI (well, also MPIch based), although these were tested some
time ago.
Tests against MPIch-1 and now MPIch2 have been done very often and bugs have
been tracked down, so I believe the core of the test suite itself is fine!
[I am not talking about the correctness of individual tests themselves, e.g.
-t one-sided will definitely show bugs in the test suite]...
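
To make the hole in MPI_SHORT_INT concrete, here is a minimal sketch in plain C (no MPI calls). It assumes that MPI_SHORT_INT corresponds to a C struct of a short followed by an int. The names short_int, hole_size() and untouched_hole_bytes() are made up for illustration and are not taken from the test suite. The sketch prefills a buffer with 0xa5, copies only the two members into place, and counts how many bytes of the hole still hold the pattern. The size of the hole depends on the platform.

```c
#include <stddef.h>
#include <string.h>

/* Assumed C layout behind MPI_SHORT_INT: a short followed by an int. */
struct short_int {
    short s;
    int   i;
};

/* Fill pattern used to detect overwritten bytes. */
#define FILL 0xa5

/* Number of padding bytes between the end of s and the start of i. */
static size_t hole_size(void)
{
    return offsetof(struct short_int, i) - sizeof(short);
}

/*
 * Imitate a well-behaved receive into a pattern-filled buffer:
 * only the bytes of s and i are written, the hole is left alone.
 * Returns how many hole bytes still contain the fill pattern.
 */
static size_t untouched_hole_bytes(short s, int i)
{
    unsigned char buf[sizeof(struct short_int)];
    size_t k, n = 0;

    memset(buf, FILL, sizeof buf);
    memcpy(buf + offsetof(struct short_int, s), &s, sizeof s);
    memcpy(buf + offsetof(struct short_int, i), &i, sizeof i);

    /* The hole spans from the end of s up to the start of i. */
    for (k = sizeof(short); k < offsetof(struct short_int, i); k++)
        if (buf[k] == FILL)
            n++;
    return n;
}
```

If untouched_hole_bytes() returned fewer bytes than hole_size(), something would have written into the hole, which is the kind of overwrite a pattern-filled receive buffer is meant to expose.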

With best regards,
Rainer

PS:
The test suite fills the send buffers with known values according to the
datatype being passed to the test and afterwards checks against expected
values. The send and recv buffers are preset with a definable pattern (0xa5)
to check for overwritten data in holes (see type MPI_TYPE_MIX_LB_UB).

The buffer starts with the MIN, then the MAX value of the given datatype,
followed by (2+rank_of_comm_partner), (3+rank...) etc.

One may check the hex values of ALL communicated buffers using a higher
report level (-r FULL); however, one may want to reduce the number of elements
sent using -n, e.g. -n 10.
Higher values (default is -n 1000), however, have shown problems (that have
hinted at bugs) when switching from eager protocol... These have been fixed in
ompi.
-- 
------------------------------------------------------------------------
Rainer Keller, PhD                  Tel: +1 (865) 241-6293
Oak Ridge National Lab              Fax: +1 (865) 241-4811
PO Box 2008 MS 6164                 Email: kel...@ornl.gov
Oak Ridge, TN 37831-2008            AIM/Skype: rusraink