Rolf, I didn’t see these errors on my check run. Can you run the MPI_Isend_ator test with the mpi_ddt_pack_debug and mpi_ddt_unpack_debug MCA parameters set to 1? I would be interested in the output you get on your machine.
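Something along these lines should produce the pack/unpack traces (this is just your command line from below with the two debug MCA parameters added; hosts and test binary as in your run):

  mpirun --mca btl self,openib -np 2 -host drossetti-ivy0,drossetti-ivy1 \
      --mca btl_openib_warn_default_gid_prefix 0 \
      --mca mpi_ddt_pack_debug 1 --mca mpi_ddt_unpack_debug 1 \
      MPI_Isend_ator_c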
  George.

On Apr 16, 2014, at 14:34, Rolf vandeVaart <rvandeva...@nvidia.com> wrote:

> I have seen errors when running the intel test suite using the openib BTL
> when transferring derived datatypes. I do not see the error with sm or tcp
> BTLs. The errors begin after this checkin.
>
> https://svn.open-mpi.org/trac/ompi/changeset/31370
> Timestamp: 04/11/14 16:06:56 (5 days ago)
> Author: bosilca
> Message: Reshape all the packing/unpacking functions to use the same
> skeleton. Rewrite the
> generic_unpacking to take advantage of the same capabilitites.
>
> Does anyone else see errors? Here is an example running with r31370:
>
> [rvandevaart@drossetti-ivy1 src]$ mpirun --mca btl self,openib -np 2 -host
> drossetti-ivy0,drossetti-ivy1 --mca btl_openib_warn_default_gid_prefix 0
> MPI_Isend_ator_c
> MPITEST error (1): libmpitest.c:1608 i=117, int32_t value=-1, expected 117
> MPITEST error (1): libmpitest.c:1578 i=195, char value=-1, expected -61
> MPITEST error (1): 2 errors in buffer (17,0,12) len 273 commsize 2 commtype
> -10 data_type 13 root 1
> MPITEST error (1): libmpitest.c:1608 i=117, int32_t value=-1, expected 117
> MPITEST error (1): libmpitest.c:1578 i=195, char value=-1, expected -61
> MPITEST error (1): 2 errors in buffer (17,2,12) len 273 commsize 2 commtype
> -16 data_type 13 root 1
> MPITEST info (0): Starting MPI_Isend_ator: All Isend TO Root test
> MPITEST info (0): Node spec MPITEST_comm_sizes[6]=2 too large, using 1
> MPITEST info (0): Node spec MPITEST_comm_sizes[22]=2 too large, using 1
> MPITEST info (0): Node spec MPITEST_comm_sizes[32]=2 too large, using 1
> MPITEST error (0): libmpitest.c:1608 i=117, int32_t value=-1, expected 118
> MPITEST error (0): libmpitest.c:1578 i=195, char value=-1, expected -60
> MPITEST error (0): 2 errors in buffer (17,0,12) len 273 commsize 2 commtype
> -10 data_type 13 root 0
> MPITEST error (0): libmpitest.c:1608 i=117, int32_t value=-1, expected 118
> MPITEST error (0): libmpitest.c:1578 i=195, char value=-1, expected -60
> MPITEST error (0): 2 errors in buffer (17,2,12) len 273 commsize 2 commtype
> -16 data_type 13 root 0
> MPITEST error (1): libmpitest.c:1608 i=117, int32_t value=-1, expected 117
> MPITEST error (1): libmpitest.c:1578 i=195, char value=-1, expected -61
> MPITEST error (1): 2 errors in buffer (17,4,12) len 273 commsize 2 commtype
> -13 data_type 13 root 1
> MPITEST error (0): libmpitest.c:1608 i=117, int32_t value=-1, expected 118
> MPITEST error (0): libmpitest.c:1578 i=195, char value=-1, expected -60
> MPITEST error (0): 2 errors in buffer (17,4,12) len 273 commsize 2 commtype
> -13 data_type 13 root 0
> MPITEST error (1): libmpitest.c:1608 i=117, int32_t value=-1, expected 117
> MPITEST error (1): libmpitest.c:1578 i=195, char value=-1, expected -61
> MPITEST error (1): 2 errors in buffer (17,6,12) len 273 commsize 2 commtype
> -15 data_type 13 root 0
> MPITEST error (0): libmpitest.c:1608 i=117, int32_t value=-1, expected 117
> MPITEST error (0): libmpitest.c:1578 i=195, char value=-1, expected -61
> MPITEST error (0): 2 errors in buffer (17,6,12) len 273 commsize 2 commtype
> -15 data_type 13 root 0
> MPITEST_results: MPI_Isend_ator: All Isend TO Root 8 tests FAILED (of 3744)
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status, thus
> causing
> the job to be terminated. The first process to do so was:
>
> Process name: [[12363,1],0]
> Exit code: 4
> --------------------------------------------------------------------------
> [rvandevaart@drossetti-ivy1 src]$
>
>
> Here is an error with the trunk which is slightly different.
> [rvandevaart@drossetti-ivy1 src]$ mpirun --mca btl self,openib -np 2 -host
> drossetti-ivy0,drossetti-ivy1 --mca btl_openib_warn_default_gid_prefix 0
> MPI_Isend_ator_c
> [drossetti-ivy1.nvidia.com:22875]
> ../../../opal/datatype/opal_datatype_position.c:72
> Pointer 0x1ad414c size 4 is outside [0x1ac1d20,0x1ad1d08] for
> base ptr 0x1ac1d20 count 273 and data
> [drossetti-ivy1.nvidia.com:22875] Datatype 0x1ac0220[] size 104 align 16 id 0
> length 22 used 21
> true_lb 0 true_ub 232 (true_extent 232) lb 0 ub 240 (extent 240)
> nbElems 21 loops 0 flags 1C4 (commited )-c--lu-GD--[---][---]
> contain lb ub OPAL_LB OPAL_UB OPAL_INT1 OPAL_INT2 OPAL_INT4 OPAL_INT8
> OPAL_UINT1 OPAL_UINT2 OPAL_UINT4 OPAL_UINT8 OPAL_FLOAT4 OPAL_FLOAT8
> OPAL_FLOAT16
> --C---P-D--[---][---] OPAL_INT4 count 1 disp 0x0 (0) extent 4 (size 4)
> --C---P-D--[---][---] OPAL_INT2 count 1 disp 0x8 (8) extent 2 (size 2)
> --C---P-D--[---][---] OPAL_INT8 count 1 disp 0x10 (16) extent 8 (size 8)
> --C---P-D--[---][---] OPAL_UINT2 count 1 disp 0x20 (32) extent 2 (size 2)
> --C---P-D--[---][---] OPAL_UINT4 count 1 disp 0x24 (36) extent 4 (size 4)
> --C---P-D--[---][---] OPAL_UINT8 count 1 disp 0x30 (48) extent 8 (size 8)
> --C---P-D--[---][---] OPAL_FLOAT4 count 1 disp 0x40 (64) extent 4 (size 4)
> --C---P-D--[---][---] OPAL_INT1 count 1 disp 0x48 (72) extent 1 (size 1)
> --C---P-D--[---][---] OPAL_FLOAT8 count 1 disp 0x50 (80) extent 8 (size 8)
> --C---P-D--[---][---] OPAL_UINT1 count 1 disp 0x60 (96) extent 1 (size 1)
> --C---P-D--[---][---] OPAL_FLOAT16 count 1 disp 0x70 (112) extent 16 (size
> 16)
> --C---P-D--[---][---] OPAL_INT1 count 1 disp 0x90 (144) extent 1 (size 1)
> --C---P-D--[---][---] OPAL_UINT1 count 1 disp 0x92 (146) extent 1 (size 1)
> --C---P-D--[---][---] OPAL_INT2 count 1 disp 0x94 (148) extent 2 (size 2)
> --C---P-D--[---][---] OPAL_UINT2 count 1 disp 0x98 (152) extent 2 (size 2)
> --C---P-D--[---][---] OPAL_INT4 count 1 disp 0x9c (156) extent 4 (size 4)
> --C---P-D--[---][---] OPAL_UINT4 count 1 disp 0xa4 (164) extent 4 (size 4)
> --C---P-D--[---][---] OPAL_INT8 count 1 disp 0xb0 (176) extent 8 (size 8)
> --C---P-D--[---][---] OPAL_UINT8 count 1 disp 0xc0 (192) extent 8 (size 8)
> --C---P-D--[---][---] OPAL_INT8 count 1 disp 0xd0 (208) extent 8 (size 8)
> --C---P-D--[---][---] OPAL_UINT8 count 1 disp 0xe0 (224) extent 8 (size 8)
> -------G---[---][---] OPAL_END_LOOP prev 21 elements first elem displacement
> 0 size of data 104
> Optimized description
> -cC---P-DB-[---][---] OPAL_INT4 count 1 disp 0x0 (0) extent 4 (size 4)
> -cC---P-DB-[---][---] OPAL_INT2 count 1 disp 0x8 (8) extent 2 (size 2)
> -cC---P-DB-[---][---] OPAL_INT8 count 1 disp 0x10 (16) extent 8 (size 8)
> -cC---P-DB-[---][---] OPAL_UINT2 count 1 disp 0x20 (32) extent 2 (size 2)
> -cC---P-DB-[---][---] OPAL_UINT4 count 1 disp 0x24 (36) extent 4 (size 4)
> -cC---P-DB-[---][---] OPAL_UINT8 count 1 disp 0x30 (48) extent 8 (size 8)
> -cC---P-DB-[---][---] OPAL_FLOAT4 count 1 disp 0x40 (64) extent 4 (size 4)
> -cC---P-DB-[---][---] OPAL_INT1 count 1 disp 0x48 (72) extent 1 (size 1)
> -cC---P-DB-[---][---] OPAL_FLOAT8 count 1 disp 0x50 (80) extent 8 (size 8)
> -cC---P-DB-[---][---] OPAL_UINT1 count 1 disp 0x60 (96) extent 1 (size 1)
> -cC---P-DB-[---][---] OPAL_FLOAT16 count 1 disp 0x70 (112) extent 16 (size
> 16)
> -cC---P-DB-[---][---] OPAL_INT1 count 1 disp 0x90 (144) extent 1 (size 1)
> -cC---P-DB-[---][---] OPAL_UINT1 count 1 disp 0x92 (146) extent 1 (size 1)
> -cC---P-DB-[---][---] OPAL_INT2 count 1 disp 0x94 (148) extent 2 (size 2)
> -cC---P-DB-[---][---] OPAL_UINT2 count 1 disp 0x98 (152) extent 2 (size 2)
> -cC---P-DB-[---][---] OPAL_INT4 count 1 disp 0x9c (156) extent 4 (size 4)
> -cC---P-DB-[---][---] OPAL_UINT4 count 1 disp 0xa4 (164) extent 4 (size 4)
> -cC---P-DB-[---][---] OPAL_INT8 count 1 disp 0xb0 (176) extent 8 (size 8)
> -cC---P-DB-[---][---] OPAL_UINT8 count 1 disp 0xc0 (192) extent 8 (size 8)
> -cC---P-DB-[---][---] OPAL_INT8 count 1 disp 0xd0 (208) extent 8 (size 8)
> -cC---P-DB-[---][---] OPAL_UINT8 count 1 disp 0xe0 (224) extent 8 (size 8)
> -------G---[---][---] OPAL_END_LOOP prev 21 elements first elem displacement
> 0 size of data 104
>
> MPITEST error (1): libmpitest.c:1578 i=0, char value=-61, expected 0
> MPITEST error (1): libmpitest.c:1608 i=0, int32_t value=117, expected 0
> MPITEST error (1): libmpitest.c:1608 i=117, int32_t value=-1, expected 117
> MPITEST error (1): libmpitest.c:1578 i=195, char value=-1, expected -61
> MPITEST error (1): 4 errors in buffer (17,0,12) len 273 commsize 2 commtype
> -10 data_type 13 root 1
> MPITEST info (0): Starting MPI_Isend_ator: All Isend TO Root test
> MPITEST info (0): Node spec MPITEST_comm_sizes[6]=2 too large, using 1
> MPITEST info (0): Node spec MPITEST_comm_sizes[22]=2 too large, using 1
> MPITEST info (0): Node spec MPITEST_comm_sizes[32]=2 too large, using 1
> MPITEST_results: MPI_Isend_ator: All Isend TO Root 1 tests FAILED (of 3744)
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status, thus
> causing
> the job to be terminated. The first process to do so was:
>
> Process name: [[12296,1],1]
> Exit code: 1
> --------------------------------------------------------------------------
> [rvandevaart@drossetti-ivy1 src]$
>
> -----------------------------------------------------------------------------------
> This email message is for the sole use of the intended recipient(s) and may
> contain
> confidential information. Any unauthorized review, use, disclosure or
> distribution
> is prohibited. If you are not the intended recipient, please contact the
> sender by
> reply email and destroy all copies of the original message.
> -----------------------------------------------------------------------------------
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/04/14553.php