Well ok. Report to the Open MPI developers and have --download-openmpi use your patched version.

   Thanks,

   Barry
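For reference, PETSc's configure accepts a local tarball for its --download packages, so pointing the build at a patched Open MPI would look roughly like the following (the path is hypothetical):

    ./configure --download-openmpi=/path/to/patched-openmpi.tar.gz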
> On Apr 29, 2015, at 7:06 PM, Satish Balay <[email protected]> wrote:
>
> The following change to openmpi appears to get rid of this error [and
> valgrind messages]
>
> I haven't checked closely to see what the actual bug is though..
>
> Satish
>
> -------
>
> $ diff -Nru ompi/datatype/ompi_datatype_args.c~ ompi/datatype/ompi_datatype_args.c
> --- ompi/datatype/ompi_datatype_args.c~	2014-10-03 15:32:23.000000000 -0500
> +++ ompi/datatype/ompi_datatype_args.c	2015-04-29 19:00:03.425618061 -0500
> @@ -84,7 +84,7 @@
>      do {                                                                    \
>          int length = sizeof(ompi_datatype_args_t) + (IC) * sizeof(int) +    \
>              (AC) * sizeof(OPAL_PTRDIFF_TYPE) + (DC) * sizeof(MPI_Datatype); \
> -        char* buf = (char*)malloc( length );                                \
> +        char* buf = (char*)malloc( length+8 );                              \
>          ompi_datatype_args_t* pArgs = (ompi_datatype_args_t*)buf;           \
>          pArgs->ci = (IC);                                                   \
>          pArgs->ca = (AC);                                                   \
> $
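As Satish says, the +8 is a workaround rather than a diagnosis: one plausible reading is that the computed length undercounts what ompi_datatype_set_args later stores, for instance because the writer advances through the buffer with aligned strides while the size is summed without padding, so the final memcpy lands just past the end of the allocation. That is exactly the shape of the "Invalid write ... 0 bytes after a block" reports in the valgrind traces below. A minimal standalone illustration of that failure mode (hypothetical code, not the Open MPI source):

    /* Hypothetical illustration of an under-sized packed-args buffer;
     * this is not the Open MPI source, just the same class of bug. */
    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        const int ic = 3, ac = 1;   /* counts of int and pointer-sized args */
        /* Size summed without padding: 3*4 + 1*8 = 20 bytes. */
        size_t length = ic * sizeof(int) + ac * sizeof(ptrdiff_t);
        char  *buf = malloc(length);
        char  *p   = buf;
        int    i;
        ptrdiff_t a = 42;

        for (i = 0; i < ic; i++) {        /* pack the ints: p ends at offset 12 */
            memcpy(p, &i, sizeof(int));
            p += sizeof(int);
        }
        /* A writer that rounds up to an 8-byte boundary before storing the
         * pointer-sized value (12 -> 16) writes bytes 16..23 of a 20-byte
         * block: a 4-byte heap overrun, invisible until free() or valgrind. */
        p = buf + (((p - buf) + sizeof(ptrdiff_t) - 1) & ~(sizeof(ptrdiff_t) - 1));
        memcpy(p, &a, sizeof(ptrdiff_t));

        free(buf);   /* glibc may abort here: "free(): invalid next size" */
        return 0;
    }

Under that reading, the proper fix is to make the length computation match the actual packing layout; the +8 merely pads the allocation enough to hide the overrun in the cases seen here.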
>
> On Fri, 10 Apr 2015, Barry Smith wrote:
>
>> Not testing something that doesn't work in order to not have an error in
>> the tests doesn't seem right to me. Shouldn't the window stuff be fixed or
>> removed rather than leaving buggy code?
>>
>>   Barry
>>
>>> On Apr 10, 2015, at 2:45 AM, Lawrence Mitchell <[email protected]> wrote:
>>>
>>> (cc'ing petsc-dev as well)
>>>
>>>> On 10 Apr 2015, at 00:51, Satish Balay <[email protected]> wrote:
>>>>
>>>> It's likely a code bug somewhere. The MPICH build also gives a valgrind trace.
>>>
>>> It's probable that the OMPI implementation is buggy enough to completely
>>> not work. I'm a little confused by the MPICH issue. I don't understand
>>> enough about the datatype implementation in the window SF type to know if
>>> this is a PETSc issue, or an MPICH one. I note in passing that all the ex1
>>> tests exhibit a similar valgrind trace.
>>>
>>> For ex2 at least, maybe the simplest option is to turn off the window test
>>> entirely. Like this:
>>>
>>> diff --git a/src/vec/is/sf/examples/tutorials/makefile b/src/vec/is/sf/examples/tutorials/makefile
>>> index aeaf1e4..e7774c5 100644
>>> --- a/src/vec/is/sf/examples/tutorials/makefile
>>> +++ b/src/vec/is/sf/examples/tutorials/makefile
>>> @@ -86,7 +86,7 @@ runex2_window:
>>>  	${RM} -f ex2.tmp
>>>
>>>  TESTEXAMPLES_C              = ex1.PETSc runex1_basic runex1_2_basic runex1_3_basic runex1 \
>>> -                              ex2.PETSc runex2_basic runex2_window ex2.rm
>>> +                              ex2.PETSc runex2_basic ex2.rm
>>>  TESTEXAMPLES_C_X            =
>>>  TESTEXAMPLES_FORTRAN        =
>>>  TESTEXAMPLES_FORTRAN_MPIUNI =
>>>
>>> Lawrence
>>>
>>>> Satish
>>>>
>>>> ----------
>>>>
>>>> balay@asterix /home/balay/petsc/src/vec/is/sf/examples/tutorials (master=)
>>>> $ mpiexec -n 2 valgrind --tool=memcheck -q --dsymutil=yes --num-callers=40 --track-origins=yes ./ex2 -sf_type window
>>>> PetscSF Object: 2 MPI processes
>>>>   type: window
>>>>   synchronization=FENCE sort=rank-order
>>>>   [0] Number of roots=1, leaves=2, remote ranks=2
>>>>   [0] 0 <- (0,0)
>>>>   [0] 1 <- (1,0)
>>>>   [1] Number of roots=1, leaves=2, remote ranks=2
>>>>   [1] 0 <- (1,0)
>>>>   [1] 1 <- (0,0)
>>>> ==29265== Syscall param writev(vector[...]) points to uninitialised byte(s)
>>>> ==29265==    at 0x8F474E7: writev (in /usr/lib64/libc-2.20.so)
>>>> ==29265==    by 0x894AD87: MPL_large_writev (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x8941A48: MPIDU_Sock_writev (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x892AC7D: MPIDI_CH3_iStartMsgv (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x8911D08: recv_rma_msg (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x8913D46: MPIDI_Win_fence (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x88C91EB: PMPI_Win_fence (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x50FB025: PetscSFRestoreWindow (sfwindow.c:348)
>>>> ==29265==    by 0x50FD4BF: PetscSFBcastEnd_Window (sfwindow.c:510)
>>>> ==29265==    by 0x5123CD9: PetscSFBcastEnd (sf.c:957)
>>>> ==29265==    by 0x401CAF: main (ex2.c:81)
>>>> ==29265==  Address 0x99436ec is 108 bytes inside a block of size 208 alloc'd
>>>> ==29265==    at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>> ==29265==    by 0x890DA35: MPIDI_Get (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x88C406A: PMPI_Get (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x50FD0DA: PetscSFBcastBegin_Window (sfwindow.c:495)
>>>> ==29265==    by 0x51235B5: PetscSFBcastBegin (sf.c:924)
>>>> ==29265==    by 0x401BD3: main (ex2.c:79)
>>>> ==29265==  Uninitialised value was created by a heap allocation
>>>> ==29265==    at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>> ==29265==    by 0x890DA35: MPIDI_Get (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x88C406A: PMPI_Get (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29265==    by 0x50FD0DA: PetscSFBcastBegin_Window (sfwindow.c:495)
>>>> ==29265==    by 0x51235B5: PetscSFBcastBegin (sf.c:924)
>>>> ==29265==    by 0x401BD3: main (ex2.c:79)
>>>> ==29265==
>>>> ==29266== Syscall param writev(vector[...]) points to uninitialised byte(s)
>>>> ==29266==    at 0x8F474E7: writev (in /usr/lib64/libc-2.20.so)
>>>> ==29266==    by 0x894AD87: MPL_large_writev (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x8941A48: MPIDU_Sock_writev (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x892AC7D: MPIDI_CH3_iStartMsgv (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x8911D08: recv_rma_msg (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x8913D46: MPIDI_Win_fence (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x88C91EB: PMPI_Win_fence (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x50FB025: PetscSFRestoreWindow (sfwindow.c:348)
>>>> ==29266==    by 0x50FD4BF: PetscSFBcastEnd_Window (sfwindow.c:510)
>>>> ==29266==    by 0x5123CD9: PetscSFBcastEnd (sf.c:957)
>>>> ==29266==    by 0x401CAF: main (ex2.c:81)
>>>> ==29266==  Address 0x98d16dc is 108 bytes inside a block of size 208 alloc'd
>>>> ==29266==    at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>> ==29266==    by 0x890DA35: MPIDI_Get (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x88C406A: PMPI_Get (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x50FD0DA: PetscSFBcastBegin_Window (sfwindow.c:495)
>>>> ==29266==    by 0x51235B5: PetscSFBcastBegin (sf.c:924)
>>>> ==29266==    by 0x401BD3: main (ex2.c:79)
>>>> ==29266==  Uninitialised value was created by a heap allocation
>>>> ==29266==    at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>> ==29266==    by 0x890DA35: MPIDI_Get (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x88C406A: PMPI_Get (in /home/balay/soft/mpich-3.1.3/lib/libmpi.so.12.0.4)
>>>> ==29266==    by 0x50FD0DA: PetscSFBcastBegin_Window (sfwindow.c:495)
>>>> ==29266==    by 0x51235B5: PetscSFBcastBegin (sf.c:924)
>>>> ==29266==    by 0x401BD3: main (ex2.c:79)
>>>> ==29266==
>>>> Vec Object: 2 MPI processes
>>>>   type: mpi
>>>> Process [0]
>>>> 0
>>>> 1
>>>> Process [1]
>>>> 1
>>>> 0
>>>> Vec Object: 2 MPI processes
>>>>   type: mpi
>>>> Process [0]
>>>> 10
>>>> 11
>>>> Process [1]
>>>> 11
>>>> 10
>>>> balay@asterix /home/balay/petsc/src/vec/is/sf/examples/tutorials (master=)
>>>> $
>>>>
>>>> On Thu, 9 Apr 2015, Satish Balay wrote:
>>>>
>>>>> here is a better valgrind trace..
>>>>>
>>>>> satish
>>>>>
>>>>> --------
>>>>> balay@asterix /home/balay/petsc/src/vec/is/sf/examples/tutorials (master=)
>>>>> $ /home/balay/petsc/arch-ompi/bin/mpiexec -n 2 valgrind --tool=memcheck -q --dsymutil=yes --num-callers=40 --track-origins=yes ./ex2 -sf_type window
>>>>> PetscSF Object: 2 MPI processes
>>>>>   type: window
>>>>>   synchronization=FENCE sort=rank-order
>>>>>   [0] Number of roots=1, leaves=2, remote ranks=2
>>>>>   [0] 0 <- (0,0)
>>>>>   [0] 1 <- (1,0)
>>>>>   [1] Number of roots=1, leaves=2, remote ranks=2
>>>>>   [1] 0 <- (1,0)
>>>>>   [1] 1 <- (0,0)
>>>>> ==14815== Invalid write of size 2
>>>>> ==14815==    at 0x4C2E36B: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>> ==14815==    by 0x8AFDABD: ompi_datatype_set_args (ompi_datatype_args.c:167)
>>>>> ==14815==    by 0x8AFF0F3: __ompi_datatype_create_from_args (ompi_datatype_args.c:718)
>>>>> ==14815==    by 0x8AFEC0E: __ompi_datatype_create_from_packed_description (ompi_datatype_args.c:649)
>>>>> ==14815==    by 0x8AFF5D6: ompi_datatype_create_from_packed_description (ompi_datatype_args.c:788)
>>>>> ==14815==    by 0xF727F0E: ompi_osc_base_datatype_create (osc_base_obj_convert.h:52)
>>>>> ==14815==    by 0xF728424: datatype_create (osc_rdma_data_move.c:333)
>>>>> ==14815==    by 0xF72887D: process_get (osc_rdma_data_move.c:536)
>>>>> ==14815==    by 0xF72A856: process_frag (osc_rdma_data_move.c:1593)
>>>>> ==14815==    by 0xF72AA35: ompi_osc_rdma_callback (osc_rdma_data_move.c:1656)
>>>>> ==14815==    by 0xECCF0DD: ompi_request_complete (request.h:402)
>>>>> ==14815==    by 0xECCF4EA: recv_request_pml_complete (pml_ob1_recvreq.h:181)
>>>>> ==14815==    by 0xECCFF87: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:243)
>>>>> ==14815==    by 0xE68F875: mca_btl_vader_check_fboxes (btl_vader_fbox.h:220)
>>>>> ==14815==    by 0xE690D82: mca_btl_vader_component_progress (btl_vader_component.c:695)
>>>>> ==14815==    by 0x9A9E9F2: opal_progress (opal_progress.c:187)
>>>>> ==14815==    by 0xECCA70A: opal_condition_wait (condition.h:78)
>>>>> ==14815==    by 0xECCA7F4: ompi_request_wait_completion (request.h:381)
>>>>> ==14815==    by 0xECCAF69: mca_pml_ob1_recv (pml_ob1_irecv.c:109)
>>>>> ==14815==    by 0xFD8938D: ompi_coll_tuned_reduce_intra_basic_linear (coll_tuned_reduce.c:677)
>>>>> ==14815==    by 0xFD79C26: ompi_coll_tuned_reduce_intra_dec_fixed (coll_tuned_decision_fixed.c:386)
>>>>> ==14815==    by 0xF0F3B91: mca_coll_basic_reduce_scatter_block_intra (coll_basic_reduce_scatter_block.c:96)
>>>>> ==14815==    by 0xF72BC58: ompi_osc_rdma_fence (osc_rdma_active_target.c:140)
>>>>> ==14815==    by 0x8B47078: PMPI_Win_fence (pwin_fence.c:59)
>>>>> ==14815==    by 0x5106D8F: PetscSFRestoreWindow (sfwindow.c:348)
>>>>> ==14815==    by 0x51092DA: PetscSFBcastEnd_Window (sfwindow.c:510)
>>>>> ==14815==    by 0x51303D6: PetscSFBcastEnd (sf.c:957)
>>>>> ==14815==    by 0x401DD3: main (ex2.c:81)
>>>>> ==14815==  Address 0x101c3b98 is 0 bytes after a block of size 72 alloc'd
>>>>> ==14815==    at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>> ==14815==    by 0x8AFD755: ompi_datatype_set_args (ompi_datatype_args.c:123)
>>>>> ==14815==    by 0x8AFF0F3: __ompi_datatype_create_from_args (ompi_datatype_args.c:718)
>>>>> ==14815==    by 0x8AFEC0E: __ompi_datatype_create_from_packed_description (ompi_datatype_args.c:649)
>>>>> ==14815==    by 0x8AFF5D6: ompi_datatype_create_from_packed_description (ompi_datatype_args.c:788)
>>>>> ==14815==    by 0xF727F0E: ompi_osc_base_datatype_create (osc_base_obj_convert.h:52)
>>>>> ==14815==    by 0xF728424: datatype_create (osc_rdma_data_move.c:333)
>>>>> ==14815==    by 0xF72887D: process_get (osc_rdma_data_move.c:536)
>>>>> ==14815==    by 0xF72A856: process_frag (osc_rdma_data_move.c:1593)
>>>>> ==14815==    by 0xF72AA35: ompi_osc_rdma_callback (osc_rdma_data_move.c:1656)
>>>>> ==14815==    by 0xECCF0DD: ompi_request_complete (request.h:402)
>>>>> ==14815==    by 0xECCF4EA: recv_request_pml_complete (pml_ob1_recvreq.h:181)
>>>>> ==14815==    by 0xECCFF87: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:243)
>>>>> ==14815==    by 0xE68F875: mca_btl_vader_check_fboxes (btl_vader_fbox.h:220)
>>>>> ==14815==    by 0xE690D82: mca_btl_vader_component_progress (btl_vader_component.c:695)
>>>>> ==14815==    by 0x9A9E9F2: opal_progress (opal_progress.c:187)
>>>>> ==14815==    by 0xECCA70A: opal_condition_wait (condition.h:78)
>>>>> ==14815==    by 0xECCA7F4: ompi_request_wait_completion (request.h:381)
>>>>> ==14815==    by 0xECCAF69: mca_pml_ob1_recv (pml_ob1_irecv.c:109)
>>>>> ==14815==    by 0xFD8938D: ompi_coll_tuned_reduce_intra_basic_linear (coll_tuned_reduce.c:677)
>>>>> ==14815==    by 0xFD79C26: ompi_coll_tuned_reduce_intra_dec_fixed (coll_tuned_decision_fixed.c:386)
>>>>> ==14815==    by 0xF0F3B91: mca_coll_basic_reduce_scatter_block_intra (coll_basic_reduce_scatter_block.c:96)
>>>>> ==14815==    by 0xF72BC58: ompi_osc_rdma_fence (osc_rdma_active_target.c:140)
>>>>> ==14815==    by 0x8B47078: PMPI_Win_fence (pwin_fence.c:59)
>>>>> ==14815==    by 0x5106D8F: PetscSFRestoreWindow (sfwindow.c:348)
>>>>> ==14815==    by 0x51092DA: PetscSFBcastEnd_Window (sfwindow.c:510)
>>>>> ==14815==    by 0x51303D6: PetscSFBcastEnd (sf.c:957)
>>>>> ==14815==    by 0x401DD3: main (ex2.c:81)
>>>>> ==14815==
>>>>> ==14816== Invalid write of size 2
>>>>> ==14816==    at 0x4C2E36B: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>> ==14816==    by 0x8AFDABD: ompi_datatype_set_args (ompi_datatype_args.c:167)
>>>>> ==14816==    by 0x8AFF0F3: __ompi_datatype_create_from_args (ompi_datatype_args.c:718)
>>>>> ==14816==    by 0x8AFEC0E: __ompi_datatype_create_from_packed_description (ompi_datatype_args.c:649)
>>>>> ==14816==    by 0x8AFF5D6: ompi_datatype_create_from_packed_description (ompi_datatype_args.c:788)
>>>>> ==14816==    by 0xF727F0E: ompi_osc_base_datatype_create (osc_base_obj_convert.h:52)
>>>>> ==14816==    by 0xF728424: datatype_create (osc_rdma_data_move.c:333)
>>>>> ==14816==    by 0xF72887D: process_get (osc_rdma_data_move.c:536)
>>>>> ==14816==    by 0xF72A856: process_frag (osc_rdma_data_move.c:1593)
>>>>> ==14816==    by 0xF72AA35: ompi_osc_rdma_callback (osc_rdma_data_move.c:1656)
>>>>> ==14816==    by 0xECCF0DD: ompi_request_complete (request.h:402)
>>>>> ==14816==    by 0xECCF4EA: recv_request_pml_complete (pml_ob1_recvreq.h:181)
>>>>> ==14816==    by 0xECCFF87: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:243)
>>>>> ==14816==    by 0xE68F875: mca_btl_vader_check_fboxes (btl_vader_fbox.h:220)
>>>>> ==14816==    by 0xE690D82: mca_btl_vader_component_progress (btl_vader_component.c:695)
>>>>> ==14816==    by 0x9A9E9F2: opal_progress (opal_progress.c:187)
>>>>> ==14816==    by 0xECCA70A: opal_condition_wait (condition.h:78)
>>>>> ==14816==    by 0xECCA7F4: ompi_request_wait_completion (request.h:381)
>>>>> ==14816==    by 0xECCAF69: mca_pml_ob1_recv (pml_ob1_irecv.c:109)
>>>>> ==14816==    by 0xFD8D951: ompi_coll_tuned_scatter_intra_basic_linear (coll_tuned_scatter.c:231)
>>>>> ==14816==    by 0xFD7A66D: ompi_coll_tuned_scatter_intra_dec_fixed (coll_tuned_decision_fixed.c:769)
>>>>> ==14816==    by 0xF0F3BDB: mca_coll_basic_reduce_scatter_block_intra (coll_basic_reduce_scatter_block.c:102)
>>>>> ==14816==    by 0xF72BC58: ompi_osc_rdma_fence (osc_rdma_active_target.c:140)
>>>>> ==14816==    by 0x8B47078: PMPI_Win_fence (pwin_fence.c:59)
>>>>> ==14816==    by 0x5106D8F: PetscSFRestoreWindow (sfwindow.c:348)
>>>>> ==14816==    by 0x51092DA: PetscSFBcastEnd_Window (sfwindow.c:510)
>>>>> ==14816==    by 0x51303D6: PetscSFBcastEnd (sf.c:957)
>>>>> ==14816==    by 0x401DD3: main (ex2.c:81)
>>>>> ==14816==  Address 0x101bb398 is 0 bytes after a block of size 72 alloc'd
>>>>> ==14816==    at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>> ==14816==    by 0x8AFD755: ompi_datatype_set_args (ompi_datatype_args.c:123)
>>>>> ==14816==    by 0x8AFF0F3: __ompi_datatype_create_from_args (ompi_datatype_args.c:718)
>>>>> ==14816==    by 0x8AFEC0E: __ompi_datatype_create_from_packed_description (ompi_datatype_args.c:649)
>>>>> ==14816==    by 0x8AFF5D6: ompi_datatype_create_from_packed_description (ompi_datatype_args.c:788)
>>>>> ==14816==    by 0xF727F0E: ompi_osc_base_datatype_create (osc_base_obj_convert.h:52)
>>>>> ==14816==    by 0xF728424: datatype_create (osc_rdma_data_move.c:333)
>>>>> ==14816==    by 0xF72887D: process_get (osc_rdma_data_move.c:536)
>>>>> ==14816==    by 0xF72A856: process_frag (osc_rdma_data_move.c:1593)
>>>>> ==14816==    by 0xF72AA35: ompi_osc_rdma_callback (osc_rdma_data_move.c:1656)
>>>>> ==14816==    by 0xECCF0DD: ompi_request_complete (request.h:402)
>>>>> ==14816==    by 0xECCF4EA: recv_request_pml_complete (pml_ob1_recvreq.h:181)
>>>>> ==14816==    by 0xECCFF87: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:243)
>>>>> ==14816==    by 0xE68F875: mca_btl_vader_check_fboxes (btl_vader_fbox.h:220)
>>>>> ==14816==    by 0xE690D82: mca_btl_vader_component_progress (btl_vader_component.c:695)
>>>>> ==14816==    by 0x9A9E9F2: opal_progress (opal_progress.c:187)
>>>>> ==14816==    by 0xECCA70A: opal_condition_wait (condition.h:78)
>>>>> ==14816==    by 0xECCA7F4: ompi_request_wait_completion (request.h:381)
>>>>> ==14816==    by 0xECCAF69: mca_pml_ob1_recv (pml_ob1_irecv.c:109)
>>>>> ==14816==    by 0xFD8D951: ompi_coll_tuned_scatter_intra_basic_linear (coll_tuned_scatter.c:231)
>>>>> ==14816==    by 0xFD7A66D: ompi_coll_tuned_scatter_intra_dec_fixed (coll_tuned_decision_fixed.c:769)
>>>>> ==14816==    by 0xF0F3BDB: mca_coll_basic_reduce_scatter_block_intra (coll_basic_reduce_scatter_block.c:102)
>>>>> ==14816==    by 0xF72BC58: ompi_osc_rdma_fence (osc_rdma_active_target.c:140)
>>>>> ==14816==    by 0x8B47078: PMPI_Win_fence (pwin_fence.c:59)
>>>>> ==14816==    by 0x5106D8F: PetscSFRestoreWindow (sfwindow.c:348)
>>>>> ==14816==    by 0x51092DA: PetscSFBcastEnd_Window (sfwindow.c:510)
>>>>> ==14816==    by 0x51303D6: PetscSFBcastEnd (sf.c:957)
>>>>> ==14816==    by 0x401DD3: main (ex2.c:81)
>>>>> ==14816==
>>>>> Vec Object: 2 MPI processes
>>>>>   type: mpi
>>>>> Process [0]
>>>>> 0
>>>>> 1
>>>>> Process [1]
>>>>> 1
>>>>> 0
>>>>> Vec Object: 2 MPI processes
>>>>>   type: mpi
>>>>> Process [0]
>>>>> 10
>>>>> 11
>>>>> Process [1]
>>>>> 11
>>>>> 10
>>>>> balay@asterix /home/balay/petsc/src/vec/is/sf/examples/tutorials (master=)
>>>>> $
>>>>>
>>>>> On Thu, 9 Apr 2015, Barry Smith wrote:
>>>>>
>>>>>>
>>>>>> Satish,
>>>>>>
>>>>>> Why are you telling me :-). Tell the person who's been pushing this
>>>>>> stuff into PETSc and he can debug it.
>>>>>>
>>>>>>   Barry
>>>>>>
>>>>>> This is why "my part" of PETSc only uses MPI 1.1 :-)
>>>>>>
>>>>>>> On Apr 9, 2015, at 5:48 PM, Satish Balay <[email protected]> wrote:
>>>>>>>
>>>>>>> On Thu, 9 Apr 2015, Barry Smith wrote:
>>>>>>>
>>>>>>>> http://ftp.mcs.anl.gov/pub/petsc/nightlylogs/archive/2015/04/08/examples_master_arch-linux-pkgs-opt_crank.log
>>>>>>>
>>>>>>> The following test is hanging - perhaps --download-openmpi is the trigger.
>>>>>>>
>>>>>>> petsc 14547 0.0 0.0 12312 1220 ? S 13:56 0:00 /bin/sh -c /sandbox/petsc/petsc.clone/arch-linux-pkgs-opt/bin/mpiexec -n 2 ./ex2 -sf_type window > ex2.tmp 2>&1; \
>>>>>>>   /usr/bin/diff -w output/ex2_window.out ex2.tmp || printf "/sandbox/petsc/petsc.clone/src/vec/is/sf/examples/tutorials\nPossible problem with ex2_window, diffs above\n=========================================\n"; \
>>>>>>>   /bin/rm -f -f ex2.tmp
>>>>>>>
>>>>>>> I can reproduce on my laptop [with the following trace].
>>>>>>>
>>>>>>> Satish
>>>>>>>
>>>>>>> ---------
>>>>>>>
>>>>>>> balay@asterix /home/balay/petsc/src/vec/is/sf/examples/tutorials (master=)
>>>>>>> $ /home/balay/petsc/arch-ompi/bin/mpiexec -n 2 ./ex2 -sf_type window
>>>>>>> PetscSF Object: 2 MPI processes
>>>>>>>   type: window
>>>>>>>   synchronization=FENCE sort=rank-order
>>>>>>>   [0] Number of roots=1, leaves=2, remote ranks=2
>>>>>>>   [0] 0 <- (0,0)
>>>>>>>   [0] 1 <- (1,0)
>>>>>>>   [1] Number of roots=1, leaves=2, remote ranks=2
>>>>>>>   [1] 0 <- (1,0)
>>>>>>>   [1] 1 <- (0,0)
>>>>>>> *** Error in `./ex2': free(): invalid next size (fast): 0x0000000002395ed0 ***
>>>>>>> [asterix:14290] *** Process received signal ***
>>>>>>> [asterix:14290] Signal: Aborted (6)
>>>>>>> [asterix:14290] Signal code: (-6)
>>>>>>> ======= Backtrace: =========
>>>>>>> [the two processes' backtraces were written to the terminal concurrently, so rank 0's frames are interleaved with rank 1's output below]
>>>>>>> /lib64/libc.so.6(+0x77d9e)[0x[asterix:14290] [ 0] /lib64/libpthread.so.0(+0x100d0)[0x7f331fac10d0]
>>>>>>> [asterix:14290] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x7f331f7288d7]
>>>>>>> [asterix:14290] [ 2] /home/balay/petsc/arch-ompi/lib/libmpi.so.1(ompi_datatype_release_args/lib64/libc.so.6(abort+0x16a)[0x7f331f72a53a]
>>>>>>> [asterix:14290] [ 3] /home/balay/petsc/arch-ompi/lib/libmpi.so.1(/lib64/libc.so.6(+0x77da3)[0x7f331f76bda3]
>>>>>>> [asterix:14290] [ 4] +0x508e3)[0x7f9018c898e3]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x11773/lib64/libc.so.6(cfree+0x5b5)[0x7f331f7779f5]
>>>>>>> [asterix:14290] [ 5] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x12ece)[0x7f900f73aece]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0x862a)[0x7f900ecdb62a]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0x8a15/home/balay/petsc/arch-ompi/lib/libmpi.so.1(ompi_datatype_release_args+0x12b)[0x7f331ff33627]
>>>>>>> [asterix:14290] [ 6] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0xbac7)[0x7f900ecdeac7]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0xc0de)[0x7f900f7340de]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0xc4eb)[0x7f900f7344eb]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_match+0x2ed)[0x7f900f734f88]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_btl_vader.so(+0x3876)[0x7f9014009876]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_btl_vader.so(+0x4d83)[0x7f901400ad83]
>>>>>>> /home/balay/petsc/arch-ompi/lib/libopen-pal.so.6(opal_progress+0xa2)[0x7f9017cca9f3]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x770b)[0x7f900f72f70b]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x77f5)[0x7f900f72f7f5]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv+0x1c6)[0x7f900f72ff6a]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_scatter_intra_basic_linear+0x76)[0x7f900e689952]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_scatter_intra_dec_fixed+0x112)[0x7f900e67666e]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_basic.so(mca_coll_basic_reduce_scatter_block_intra+0x188)[0x7f900f319bdc]
>>>>>>> /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(ompi_osc_rdma_fence+0x125)[0x7f900ecdfc59]
>>>>>>> /home/balay/petsc/arch-ompi/lib/libmpi.so.1(MPI_Win_fence+0x116)[0x7f9018cd1079]
>>>>>>> /home/balay/petsc/arch-ompi/lib/libmpi.so.1(+0x508e3)[0x7f331ff348e3]
>>>>>>> [asterix:14290] [ 7] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x11773)[0x7f3316910773]
>>>>>>> [asterix:14290] [ 8] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x12ece)[0x7f3316911ece]
>>>>>>> [asterix:14290] [ 9] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0x862a)[0x7f3315eb262a]
>>>>>>> [asterix:14290] [10] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0x8a15)[0x7f3315eb2a15]
>>>>>>> [asterix:14290] [11] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0xbac7)[0x7f3315eb5ac7]
>>>>>>> [asterix:14290] [12] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0xc0de)[0x7f331690b0de]
>>>>>>> [asterix:14290] [13] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0xc4eb)[0x7f331690b4eb]
>>>>>>> [asterix:14290] [14] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_match+0x2ed)[0x7f331690bf88]
>>>>>>> [asterix:14290] [15] /home/balay/petsc/arch-ompi/lib/openmpi/mca_btl_vader.so(+0x3876)[0x7f3316f4a876]
>>>>>>> [asterix:14290] [16] /home/balay/petsc/arch-ompi/lib/openmpi/mca_btl_vader.so(+0x4d83)[0x7f3316f4bd83]
>>>>>>> [asterix:14290] [17] /home/balay/petsc/arch-ompi/lib/libopen-pal.so.6(opal_progress+0xa2)[0x7f331ef759f3]
>>>>>>> [asterix:14290] [18] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x770b)[0x7f331690670b]
>>>>>>> [asterix:14290] [19] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x77f5)[0x7f33169067f5]
>>>>>>> [asterix:14290] [20] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv+0x1c6)[0x7f3316906f6a]
>>>>>>> [asterix:14290] [21] /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_reduce_intra_basic_linear+0x1cb)[0x7f331585c38e]
>>>>>>> [asterix:14290] [22] /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_reduce_intra_dec_fixed+0x1a6)[0x7f331584cc27]
>>>>>>> [asterix:14290] [23] /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_basic.so(mca_coll_basic_reduce_scatter_block_intra+0x13e)[0x7f33164f0b92]
>>>>>>> [asterix:14290] [24] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(ompi_osc_rdma_fence+0x125)[0x7f3315eb6c59]
>>>>>>> [asterix:14290] [25] /home/balay/petsc/arch-ompi/lib/libmpi.so.1(MPI_Win_fence+0x116)[0x7f331ff7c079]
>>>>>>> [asterix:14290] [26] /home/balay/petsc/arch-ompi/lib/libpetsc.so.3.05(+0x2d1d90)[0x7f3322855d90]
>>>>>>> [asterix:14290] [27] /home/balay/petsc/arch-ompi/lib/libpetsc.so.3.05(PetscSFBcastEnd_Window+0x218)[0x7f33228582db]
>>>>>>> [asterix:14290] [28] /home/balay/petsc/arch-ompi/lib/libpetsc.so.3.05(PetscSFBcastEnd+0x4eb)[0x7f332287f3d7]
>>>>>>> [asterix:14290] [29] ./ex2[0x401dd4]
>>>>>>> [asterix:14290] *** End of error message ***
>>>>>>> [asterix:14291] *** Process received signal ***
>>>>>>> [asterix:14291] Signal: Aborted (6)
>>>>>>> [asterix:14291] Signal code: (-6)
>>>>>>> [asterix:14291] [ 0] /lib64/libpthread.so.0(+0x100d0)[0x7f90188160d0]
>>>>>>> [asterix:14291] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x7f901847d8d7]
>>>>>>> [asterix:14291] [ 2] /lib64/libc.so.6(abort+0x16a)[0x7f901847f53a]
>>>>>>> [asterix:14291] [ 3] /lib64/libc.so.6(+0x77da3)[0x7f90184c0da3]
>>>>>>> [asterix:14291] [ 4] /lib64/libc.so.6(cfree+0x5b5)[0x7f90184cc9f5]
>>>>>>> [asterix:14291] [ 5] /home/balay/petsc/arch-ompi/lib/libmpi.so.1(ompi_datatype_release_args+0x12b)[0x7f9018c88627]
>>>>>>> [asterix:14291] [ 6] /home/balay/petsc/arch-ompi/lib/libmpi.so.1(+0x508e3)[0x7f9018c898e3]
>>>>>>> [asterix:14291] [ 7] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x11773)[0x7f900f739773]
>>>>>>> [asterix:14291] [ 8] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x12ece)[0x7f900f73aece]
>>>>>>> [asterix:14291] [ 9] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0x862a)[0x7f900ecdb62a]
>>>>>>> [asterix:14291] [10] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0x8a15)[0x7f900ecdba15]
>>>>>>> [asterix:14291] [11] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(+0xbac7)[0x7f900ecdeac7]
>>>>>>> [asterix:14291] [12] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0xc0de)[0x7f900f7340de]
>>>>>>> [asterix:14291] [13] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0xc4eb)[0x7f900f7344eb]
>>>>>>> [asterix:14291] [14] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_match+0x2ed)[0x7f900f734f88]
>>>>>>> [asterix:14291] [15] /home/balay/petsc/arch-ompi/lib/openmpi/mca_btl_vader.so(+0x3876)[0x7f9014009876]
>>>>>>> [asterix:14291] [16] /home/balay/petsc/arch-ompi/lib/openmpi/mca_btl_vader.so(+0x4d83)[0x7f901400ad83]
>>>>>>> [asterix:14291] [17] /home/balay/petsc/arch-ompi/lib/libopen-pal.so.6(opal_progress+0xa2)[0x7f9017cca9f3]
>>>>>>> [asterix:14291] [18] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x770b)[0x7f900f72f70b]
>>>>>>> [asterix:14291] [19] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(+0x77f5)[0x7f900f72f7f5]
>>>>>>> [asterix:14291] [20] /home/balay/petsc/arch-ompi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv+0x1c6)[0x7f900f72ff6a]
>>>>>>> [asterix:14291] [21] /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_scatter_intra_basic_linear+0x76)[0x7f900e689952]
>>>>>>> [asterix:14291] [22] /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_scatter_intra_dec_fixed+0x112)[0x7f900e67666e]
>>>>>>> [asterix:14291] [23] /home/balay/petsc/arch-ompi/lib/openmpi/mca_coll_basic.so(mca_coll_basic_reduce_scatter_block_intra+0x188)[0x7f900f319bdc]
>>>>>>> [asterix:14291] [24] /home/balay/petsc/arch-ompi/lib/openmpi/mca_osc_rdma.so(ompi_osc_rdma_fence+0x125)[0x7f900ecdfc59]
>>>>>>> [asterix:14291] [25] /home/balay/petsc/arch-ompi/lib/libmpi.so.1(MPI_Win_fence+0x116)[0x7f9018cd1079]
>>>>>>> [asterix:14291] [26] /home/balay/petsc/arch-ompi/lib/libpetsc.so.3.05(+0x2d1d90)[0x7f901b5aad90]
>>>>>>> [asterix:14291] [27] /home/balay/petsc/arch-ompi/lib/libpetsc.so.3.05(PetscSFBcastEnd_Window+0x218)[0x7f901b5ad2db]
>>>>>>> [asterix:14291] [28] /home/balay/petsc/arch-ompi/lib/libpetsc.so.3.05(PetscSFBcastEnd+0x4eb)[0x7f901b5d43d7]
>>>>>>> [asterix:14291] [29] ./ex2[0x401dd4]
>>>>>>> [asterix:14291] *** End of error message ***
>>>>>>> --------------------------------------------------------------------------
>>>>>>> mpiexec noticed that process rank 0 with PID 14290 on node asterix exited on signal 6 (Aborted).
>>>>>>> --------------------------------------------------------------------------
>>>>>>> balay@asterix /home/balay/petsc/src/vec/is/sf/examples/tutorials (master=)
>>>>>>> $ /home/balay/petsc/arch-ompi/bin/mpiexec -n 2 valgrind --tool=memcheck -q ./ex2 -sf_type window
>>>>>>> PetscSF Object: 2 MPI processes
>>>>>>>   type: window
>>>>>>>   synchronization=FENCE sort=rank-order
>>>>>>>   [0] Number of roots=1, leaves=2, remote ranks=2
>>>>>>>   [0] 0 <- (0,0)
>>>>>>>   [0] 1 <- (1,0)
>>>>>>>   [1] Number of roots=1, leaves=2, remote ranks=2
>>>>>>>   [1] 0 <- (1,0)
>>>>>>>   [1] 1 <- (0,0)
>>>>>>> ==14349== Invalid write of size 2
>>>>>>> ==14349==    at 0x4C2E36B: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>>>> ==14349==    by 0x8AFDABD: ompi_datatype_set_args (ompi_datatype_args.c:167)
>>>>>>> ==14349==    by 0x8AFF0F3: __ompi_datatype_create_from_args (ompi_datatype_args.c:718)
>>>>>>> ==14349==    by 0x8AFEC0E: __ompi_datatype_create_from_packed_description (ompi_datatype_args.c:649)
>>>>>>> ==14349==    by 0x8AFF5D6: ompi_datatype_create_from_packed_description (ompi_datatype_args.c:788)
>>>>>>> ==14349==    by 0xF727F0E: ompi_osc_base_datatype_create (osc_base_obj_convert.h:52)
>>>>>>> ==14349==    by 0xF728424: datatype_create (osc_rdma_data_move.c:333)
>>>>>>> ==14349==    by 0xF72887D: process_get (osc_rdma_data_move.c:536)
>>>>>>> ==14349==    by 0xF72A856: process_frag (osc_rdma_data_move.c:1593)
>>>>>>> ==14349==    by 0xF72AA35: ompi_osc_rdma_callback (osc_rdma_data_move.c:1656)
>>>>>>> ==14349==    by 0xECCF0DD: ompi_request_complete (request.h:402)
>>>>>>> ==14349==    by 0xECCF4EA: recv_request_pml_complete (pml_ob1_recvreq.h:181)
>>>>>>> ==14349==  Address 0x101bf188 is 0 bytes after a block of size 72 alloc'd
>>>>>>> ==14349==    at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>>>> ==14349==    by 0x8AFD755: ompi_datatype_set_args (ompi_datatype_args.c:123)
>>>>>>> ==14349==    by 0x8AFF0F3: __ompi_datatype_create_from_args (ompi_datatype_args.c:718)
>>>>>>> ==14349==    by 0x8AFEC0E: __ompi_datatype_create_from_packed_description (ompi_datatype_args.c:649)
>>>>>>> ==14349==    by 0x8AFF5D6: ompi_datatype_create_from_packed_description (ompi_datatype_args.c:788)
>>>>>>> ==14349==    by 0xF727F0E: ompi_osc_base_datatype_create (osc_base_obj_convert.h:52)
>>>>>>> ==14349==    by 0xF728424: datatype_create (osc_rdma_data_move.c:333)
>>>>>>> ==14349==    by 0xF72887D: process_get (osc_rdma_data_move.c:536)
>>>>>>> ==14349==    by 0xF72A856: process_frag (osc_rdma_data_move.c:1593)
>>>>>>> ==14349==    by 0xF72AA35: ompi_osc_rdma_callback (osc_rdma_data_move.c:1656)
>>>>>>> ==14349==    by 0xECCF0DD: ompi_request_complete (request.h:402)
>>>>>>> ==14349==    by 0xECCF4EA: recv_request_pml_complete (pml_ob1_recvreq.h:181)
>>>>>>> ==14349==
>>>>>>> ==14348== Invalid write of size 2
>>>>>>> ==14348==    at 0x4C2E36B: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>>>> ==14348==    by 0x8AFDABD: ompi_datatype_set_args (ompi_datatype_args.c:167)
>>>>>>> ==14348==    by 0x8AFF0F3: __ompi_datatype_create_from_args (ompi_datatype_args.c:718)
>>>>>>> ==14348==    by 0x8AFEC0E: __ompi_datatype_create_from_packed_description (ompi_datatype_args.c:649)
>>>>>>> ==14348==    by 0x8AFF5D6: ompi_datatype_create_from_packed_description (ompi_datatype_args.c:788)
>>>>>>> ==14348==    by 0xF727F0E: ompi_osc_base_datatype_create (osc_base_obj_convert.h:52)
>>>>>>> ==14348==    by 0xF728424: datatype_create (osc_rdma_data_move.c:333)
>>>>>>> ==14348==    by 0xF72887D: process_get (osc_rdma_data_move.c:536)
>>>>>>> ==14348==    by 0xF72A856: process_frag (osc_rdma_data_move.c:1593)
>>>>>>> ==14348==    by 0xF72AA35: ompi_osc_rdma_callback (osc_rdma_data_move.c:1656)
>>>>>>> ==14348==    by 0xECCF0DD: ompi_request_complete (request.h:402)
>>>>>>> ==14348==    by 0xECCF4EA: recv_request_pml_complete (pml_ob1_recvreq.h:181)
>>>>>>> ==14348==  Address 0x101c71b8 is 0 bytes after a block of size 72 alloc'd
>>>>>>> ==14348==    at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>>>> ==14348==    by 0x8AFD755: ompi_datatype_set_args (ompi_datatype_args.c:123)
>>>>>>> ==14348==    by 0x8AFF0F3: __ompi_datatype_create_from_args (ompi_datatype_args.c:718)
>>>>>>> ==14348==    by 0x8AFEC0E: __ompi_datatype_create_from_packed_description (ompi_datatype_args.c:649)
>>>>>>> ==14348==    by 0x8AFF5D6: ompi_datatype_create_from_packed_description (ompi_datatype_args.c:788)
>>>>>>> ==14348==    by 0xF727F0E: ompi_osc_base_datatype_create (osc_base_obj_convert.h:52)
>>>>>>> ==14348==    by 0xF728424: datatype_create (osc_rdma_data_move.c:333)
>>>>>>> ==14348==    by 0xF72887D: process_get (osc_rdma_data_move.c:536)
>>>>>>> ==14348==    by 0xF72A856: process_frag (osc_rdma_data_move.c:1593)
>>>>>>> ==14348==    by 0xF72AA35: ompi_osc_rdma_callback (osc_rdma_data_move.c:1656)
>>>>>>> ==14348==    by 0xECCF0DD: ompi_request_complete (request.h:402)
>>>>>>> ==14348==    by 0xECCF4EA: recv_request_pml_complete (pml_ob1_recvreq.h:181)
>>>>>>> ==14348==
>>>>>>> Vec Object: 2 MPI processes
>>>>>>>   type: mpi
>>>>>>> Process [0]
>>>>>>> 0
>>>>>>> 1
>>>>>>> Process [1]
>>>>>>> 1
>>>>>>> 0
>>>>>>> Vec Object: 2 MPI processes
>>>>>>>   type: mpi
>>>>>>> Process [0]
>>>>>>> 10
>>>>>>> 11
>>>>>>> Process [1]
>>>>>>> 11
>>>>>>> 10
>>>>>>> balay@asterix /home/balay/petsc/src/vec/is/sf/examples/tutorials (master=)
>>>>>>> $
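To make the failing path concrete: the test boils down to the following PetscSF broadcast pattern (a minimal sketch assuming the PetscSFBcastBegin/End signature of this era, which takes no MPI_Op argument; this is not the actual ex2.c source). Run with mpiexec -n 2 and -sf_type window to reach the PetscSFRestoreWindow -> MPI_Win_fence path in the traces above.

    /* Minimal sketch of the 2-rank star-forest exercised by the failing test:
     * each rank owns one root and has two leaves, one fed by the local root
     * and one by the other rank's root.  Requires exactly 2 MPI ranks. */
    #include <petscsf.h>

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;
      PetscSF        sf;
      PetscSFNode    iremote[2];
      PetscMPIInt    rank;
      PetscInt       rootdata[1], leafdata[2];

      ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
      ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);

      iremote[0].rank = rank;     iremote[0].index = 0;  /* leaf 0 <- local root  */
      iremote[1].rank = 1 - rank; iremote[1].index = 0;  /* leaf 1 <- remote root */

      ierr = PetscSFCreate(PETSC_COMM_WORLD, &sf);CHKERRQ(ierr);
      ierr = PetscSFSetGraph(sf, 1, 2, NULL, PETSC_COPY_VALUES, iremote, PETSC_COPY_VALUES);CHKERRQ(ierr);
      ierr = PetscSFSetFromOptions(sf);CHKERRQ(ierr);  /* picks up -sf_type window */
      ierr = PetscSFView(sf, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

      rootdata[0] = 10 + rank;   /* each rank broadcasts its value to its two leaves */
      ierr = PetscSFBcastBegin(sf, MPIU_INT, rootdata, leafdata);CHKERRQ(ierr);
      /* The End call is where PetscSFRestoreWindow calls MPI_Win_fence, and
       * where the Open MPI datatype overrun and crash above are triggered. */
      ierr = PetscSFBcastEnd(sf, MPIU_INT, rootdata, leafdata);CHKERRQ(ierr);

      ierr = PetscSFDestroy(&sf);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }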
