Thomas,

Thanks for the detailed bug report and the test case. I successfully identified 
the culprit, and the issue is now fixed (commit r28319).

  Regards,
    George.

PS: During the debugging process I sketched the datatype representation to help 
myself understand the issue. I attached the figure here for the delight of 
whoever might be interested. It contains the 4 datatypes created in main, and 
the two datatypes created on the second invocation of the do_test function.

On Apr 8, 2013, at 16:08 , Thomas Jahns <ja...@dkrz.de> wrote:

> Hello,
> 
> a colleague of mine has investigated a difficult problem that we traced to
> OpenMPI delivering incorrect data for some struct datatypes that use specific
> offsets (on the stack in our case, but the problem can also be reproduced with
> specifically chosen slices of an array). Our library aggregates several MPI
> communications in a generic and transparent manner, and therefore we need to
> be able to handle any combination of properly aligned offsets and component
> types.
> 
> The attached example program contains the necessary steps to reproduce the 
> problem:
> 
> 1. create the struct types in question
> 2. send/recv the data
> 3. compare to reference (said comparison works on several MPICH2 versions)
> 
> The code then prints any array indices/values not matching the reference.
> 
> Our platform is Linux x86_64 with Debian squeeze; the tested versions of
> OpenMPI are the 1.4.2 version supplied with squeeze and 1.6.4, which we
> compiled ourselves. For 1.4.2 I also did a quick test in an i386 chroot, and
> the code fails there too. gcc 4.6.1 was used for the x86_64 cases and gcc
> 4.3.5 for the i386 chroot.
> 
> Sorry if the test is not of minimal size, but we were happy once he got it
> down from several 10000 lines of Fortran+C, and even that took more than a
> day once we understood the problem was unrelated to the Fortran program it
> originally occurred in.
> 
> When running the program with OpenMPI:
> 
> $ mpicc -std=gnu99 ./mpi_test.c && ./a.out
> first tests:
> second tests:
> results_2[6]     = 8
> ref_results_2[6] = 12
> results_2[7]     = 9
> ref_results_2[7] = 13
> 
> MPICH gives the expected result:
> $ /sw/squeeze-x64/mpi/mpich2-1.4.1p1-gccsys/bin/mpicc -std=gnu99 ./mpi_test.c && ./a.out
> first tests:
> second tests:
> 
> Regards, Thomas
> -- 
> Thomas Jahns
> DKRZ GmbH, Department: Application software
> 
> Deutsches Klimarechenzentrum
> Bundesstraße 45a
> D-20146 Hamburg
> 
> Phone: +49-40-460094-151
> Fax: +49-40-460094-270
> Email: Thomas Jahns <ja...@dkrz.de>
> <mpi_test.c>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
