Thomas,

Thanks for the detailed bug report and the test case. I successfully identified the culprit, and the issue is now fixed (commit r28319).

Regards,
George.

PS: During the debugging process I sketched the datatype representation to help myself understand the issue. I attached the figure here for the delight of whoever might be interested. It contains the 4 datatypes created in main, and the two datatypes created on the second invocation of the do_test function.

On Apr 8, 2013, at 16:08, Thomas Jahns <ja...@dkrz.de> wrote:

> Hello,
>
> a colleague of mine has investigated a difficult problem we traced to OpenMPI
> giving incorrectly delivered data on some struct datatypes which use specific
> offsets (on the stack in our case but the problem can be reproduced when using
> specifically chosen slices of an array). Our library is used to aggregate
> several MPI communications in a generic and transparent manner and therefore we
> need to be able to handle any combination of properly aligned offsets and
> component types.
>
> The attached example program contains the necessary steps to reproduce the
> problem:
>
> 1. create the struct types in question
> 2. send/recv the data
> 3. compare to reference (said comparison works on several MPICH2 versions)
>
> The code then prints any array indices/values not matching the reference.
>
> Our platform is linux x86_64 with Debian squeeze, the tested versions of OpenMPI
> are the 1.4.2 version supplied with squeeze and 1.6.4 compiled ourselves. For
> 1.4.2 I also did a quick test in an i386 chroot and the code fails there too. gcc
> 4.6.1 was used for the x86_64 cases and gcc 4.3.5 for the i386 chroot.
>
> Sorry if the test is not of minimal size, but we were happy once he got this
> down from several 10000 lines Fortran+C and even that took more than a day once
> we understood the problem was unrelated to the Fortran program it originally
> occurred in.
>
> When running the program with OpenMPI:
>
> $ mpicc -std=gnu99 ./mpi_test.c && ./a.out
> first tests:
> second tests:
> results_2[6] = 8
> ref_results_2[6] = 12
> results_2[7] = 9
> ref_results_2[7] = 13
>
> MPICH gives the expected result:
> $ /sw/squeeze-x64/mpi/mpich2-1.4.1p1-gccsys/bin/mpicc -std=gnu99 ./mpi_test.c && ./a.out
> first tests:
> second tests:
>
> Regards, Thomas
> --
> Thomas Jahns
> DKRZ GmbH, Department: Application software
>
> Deutsches Klimarechenzentrum
> Bundesstraße 45a
> D-20146 Hamburg
>
> Phone: +49-40-460094-151
> Fax: +49-40-460094-270
> Email: Thomas Jahns <ja...@dkrz.de>
> <mpi_test.c>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
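
For readers without the attachment, the three steps in the quoted report (describe a struct-like layout, move the data, compare against a reference) can be sketched in plain C without MPI. This is neither the attached mpi_test.c nor how Open MPI implements datatypes. It is only a rough illustration, assuming a layout made of int blocks at byte displacements from a base address. The names int_block, pack_blocks, unpack_blocks and compare_to_ref are hypothetical and invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one block of a struct-like layout:
 * `count` ints starting `disp` bytes after the base address. */
typedef struct {
    int count;
    ptrdiff_t disp;
} int_block;

/* Step 2a: gather the described blocks from `base` into the contiguous
 * buffer `out`. Returns the number of ints copied. */
size_t pack_blocks(const void *base, const int_block *blocks,
                   size_t nblocks, int *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nblocks; ++i) {
        assert(blocks[i].count >= 0);
        memcpy(out + n, (const char *)base + blocks[i].disp,
               (size_t)blocks[i].count * sizeof(int));
        n += (size_t)blocks[i].count;
    }
    return n;
}

/* Step 2b: scatter the contiguous buffer `in` back to the described
 * blocks relative to `base`. Returns the number of ints copied. */
size_t unpack_blocks(const int *in, void *base, const int_block *blocks,
                     size_t nblocks)
{
    size_t n = 0;
    for (size_t i = 0; i < nblocks; ++i) {
        assert(blocks[i].count >= 0);
        memcpy((char *)base + blocks[i].disp, in + n,
               (size_t)blocks[i].count * sizeof(int));
        n += (size_t)blocks[i].count;
    }
    return n;
}

/* Step 3: print every index whose value differs from the reference,
 * in the style of the output quoted above. Returns the mismatch count. */
size_t compare_to_ref(const char *name, const int *results,
                      const int *ref, size_t n)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < n; ++i) {
        if (results[i] != ref[i]) {
            printf("%s[%zu] = %d\n", name, i, results[i]);
            printf("ref_%s[%zu] = %d\n", name, i, ref[i]);
            ++mismatches;
        }
    }
    return mismatches;
}
```

Step 1 corresponds to filling an array of int_block entries, for example two ints at displacement 0 followed by three ints at displacement 8 * sizeof(int). A round trip through pack_blocks and unpack_blocks should then leave compare_to_ref with zero mismatches. In a real MPI program the layout would instead be handed to the library as a struct datatype and the copying done by the send/recv calls.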