On 05/05/2014 09:20 PM, Richard Shaw wrote:
Hello, I think I've come across a bug when using ROMIO to read in a 2D distributed array. I've attached a test case to this email.
Richard: may I add this test case to ROMIO's test suite? I'm always on the hunt for small self-contained tests.
I found the problem in MPICH, but I don't know how it relates to OpenMPI -- the darray bug is one I introduced on Tuesday, so OpenMPI's ROMIO code should not have a problem with this darray type.
==rob
In the test case I first initialise an array of 25 doubles (which will be a 5x5 grid), then I create a datatype representing a 5x5 matrix distributed in 3x3 blocks over a 2x2 process grid. As a reference I use MPI_Pack to pull out the block-cyclic array elements local to each process (which I think is correct). Then I write the original array of 25 doubles to disk, use MPI-IO to read it back in (performing the Open, Set_view, and Read_all), and compare the result to the reference.

Running this with OMPIO, the two match on all ranks.

> mpirun -mca io ompio -np 4 ./darr_read.x
=== Rank 0 === (9 elements)
Packed: 0.0 1.0 2.0 5.0 6.0 7.0 10.0 11.0 12.0
Read: 0.0 1.0 2.0 5.0 6.0 7.0 10.0 11.0 12.0
=== Rank 1 === (6 elements)
Packed: 15.0 16.0 17.0 20.0 21.0 22.0
Read: 15.0 16.0 17.0 20.0 21.0 22.0
=== Rank 2 === (6 elements)
Packed: 3.0 4.0 8.0 9.0 13.0 14.0
Read: 3.0 4.0 8.0 9.0 13.0 14.0
=== Rank 3 === (4 elements)
Packed: 18.0 19.0 23.0 24.0
Read: 18.0 19.0 23.0 24.0

However, using ROMIO the two differ on two of the ranks:

> mpirun -mca io romio -np 4 ./darr_read.x
=== Rank 0 === (9 elements)
Packed: 0.0 1.0 2.0 5.0 6.0 7.0 10.0 11.0 12.0
Read: 0.0 1.0 2.0 5.0 6.0 7.0 10.0 11.0 12.0
=== Rank 1 === (6 elements)
Packed: 15.0 16.0 17.0 20.0 21.0 22.0
Read: 0.0 1.0 2.0 0.0 1.0 2.0
=== Rank 2 === (6 elements)
Packed: 3.0 4.0 8.0 9.0 13.0 14.0
Read: 3.0 4.0 8.0 9.0 13.0 14.0
=== Rank 3 === (4 elements)
Packed: 18.0 19.0 23.0 24.0
Read: 0.0 1.0 0.0 1.0

My interpretation is that the behaviour with OMPIO is correct. Interestingly, everything matches up using both ROMIO and OMPIO if I set the block shape to 2x2.

This was run on OS X using 1.8.2a1r31632. I have also run this on Linux with OpenMPI 1.7.4, and OMPIO is still correct, but using ROMIO I just get segfaults.

Thanks,
Richard

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
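For anyone who wants to check the "Packed" reference values without running MPI, here is a minimal sketch in plain Python of the block-cyclic ownership rule for this case. It is not the attached test case. The function names are hypothetical, and the column-major (Fortran order) element numbering is an assumption inferred from the output above rather than something stated in the report. The row-major mapping of ranks onto the process grid follows the MPI darray definition.

```python
def block_cyclic_indices(n, block, nprocs, coord):
    """Global indices along one dimension owned by grid coordinate `coord`."""
    return [g for g in range(n) if (g // block) % nprocs == coord]


def local_elements(rank, gsize=(5, 5), bsize=(3, 3), psize=(2, 2)):
    """Values a rank should own when the array holds 0.0, 1.0, ..., 24.0."""
    # Ranks map onto the process grid in row-major order.
    p0, p1 = divmod(rank, psize[1])
    rows = block_cyclic_indices(gsize[0], bsize[0], psize[0], p0)
    cols = block_cyclic_indices(gsize[1], bsize[1], psize[1], p1)
    # Assumption: column-major storage, so dimension 0 varies fastest
    # and element (i, j) sits at linear offset i + gsize[0] * j.
    return [float(i + gsize[0] * j) for j in cols for i in rows]


if __name__ == "__main__":
    for rank in range(4):
        values = local_elements(rank)
        print(f"=== Rank {rank} === ({len(values)} elements)")
        print("Expected:", " ".join(str(v) for v in values))
```

Under those assumptions the sketch gives 9, 6, 6 and 4 elements for ranks 0 to 3, with the same values as the "Packed" lines, which is consistent with the OMPIO result being the correct one.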
-- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA