The invalid writes inside uGNI itself are nothing to worry about. I suggest adding a suppression file entry that matches any GNI_* call. The RB-tree invalid write, on the other hand, does look like a bug. I will take a look and see what might be causing it.
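For reference, a minimal Memcheck suppression along those lines might look like the entry below. The entry name and the ugni.supp file name are made up; `Memcheck:Addr8` matches the "Invalid write of size 8" reports quoted further down, and you would add sibling entries for whatever other kinds and sizes Valgrind flags:

```
{
   ugni_invalid_rw
   Memcheck:Addr8
   fun:GNI*
}
```

Pass it to Valgrind with `valgrind --suppressions=ugni.supp ./your_app`, or append the entry to a suppression file you already use.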
BTW, you can add --with-valgrind(=DIR) to configure. This will suppress some uninitialized-value errors in btl/vader and other components. It won't help with btl/ugni right now, though.

-Nathan
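As an illustration, a hypothetical invocation; both paths are placeholders, and DIR should be the install prefix that contains include/valgrind/valgrind.h:

```
./configure --with-valgrind=/usr/local/valgrind \
    --prefix=$HOME/opt-cray/openmpi-valgrind
```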
> On May 17, 2018, at 3:50 AM, Joseph Schuchart <schuch...@hlrs.de> wrote:
>
> Nathan,
>
> I am trying to track down some memory corruption that leads to crashes in my application running on the Cray system using Open MPI (git-6093f2d). Valgrind reports quite a few invalid reads and writes inside Open MPI when running the benchmark that I sent you earlier.
>
> There are plenty of invalid reads in MPI_Init and MPI_Win_allocate. Valgrind also reports some invalid writes during communication:
>
> ```
> ==42751== Invalid write of size 8
> ==42751==    at 0x94C647D: GNII_POST_FMA_GET (in /opt/cray/ugni/6.0.14-6.0.5.0_16.9__g19583bb.ari/lib64/libugni.so.0.6.0)
> ==42751==    by 0x94C8D74: GNI_PostFma (in /opt/cray/ugni/6.0.14-6.0.5.0_16.9__g19583bb.ari/lib64/libugni.so.0.6.0)
> ==42751==    by 0x10FA21D0: mca_btl_ugni_get (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_btl_ugni.so)
> ==42751==    by 0x134AF6C5: ompi_osc_get_data_blocking (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_osc_rdma.so)
> ==42751==    by 0x134D0CC4: ompi_osc_rdma_peer_lookup (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_osc_rdma.so)
> ==42751==    by 0x134B4A1F: ompi_osc_rdma_rget (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_osc_rdma.so)
> ==42751==    by 0x46C1D52: PMPI_Rget (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libmpi.so.0.0.0)
> ==42751==    by 0x20001EA9: main (in /zhome/academic/HLRS/hlrs/hpcjschu/src/test/mpi_test_loop)
> ==42751==  Address 0x2aaaaabc0000 is not stack'd, malloc'd or (recently) free'd
>
> ==42751== Invalid write of size 8
> ==42751==    at 0x94D76BC: GNII_SmsgSend (in /opt/cray/ugni/6.0.14-6.0.5.0_16.9__g19583bb.ari/lib64/libugni.so.0.6.0)
> ==42751==    by 0x94D9D5C: GNI_SmsgSendWTag (in /opt/cray/ugni/6.0.14-6.0.5.0_16.9__g19583bb.ari/lib64/libugni.so.0.6.0)
> ==42751==    by 0x10F9D9E6: mca_btl_ugni_sendi (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_btl_ugni.so)
> ==42751==    by 0x11BE5DDF: mca_pml_ob1_isend (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_pml_ob1.so)
> ==42751==    by 0x1201DC40: NBC_Progress (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_coll_libnbc.so)
> ==42751==    by 0x1201DC91: NBC_Progress (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_coll_libnbc.so)
> ==42751==    by 0x1201C692: ompi_coll_libnbc_progress (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_coll_libnbc.so)
> ==42751==    by 0x631A503: opal_progress (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x632111C: ompi_sync_wait_mt (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x4669A4C: ompi_comm_nextcid (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libmpi.so.0.0.0)
> ==42751==    by 0x4667ECC: ompi_comm_dup_with_info (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libmpi.so.0.0.0)
> ==42751==    by 0x134C15AE: ompi_osc_rdma_component_select (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_osc_rdma.so)
> ==42751==  Address 0x2aaaaabaf000 is not stack'd, malloc'd or (recently) free'd
> ```
>
> And a write-after-free during MPI_Finalize:
>
> ```
> ==42751== Invalid write of size 8
> ==42751==    at 0x6316E64: opal_rb_tree_delete (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x1076BA03: mca_mpool_hugepage_seg_free (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_mpool_hugepage.so)
> ==42751==    by 0x1015EB33: mca_allocator_bucket_cleanup (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_allocator_bucket.so)
> ==42751==    by 0x1015DF5C: mca_allocator_bucket_finalize (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_allocator_bucket.so)
> ==42751==    by 0x1076BAE6: mca_mpool_hugepage_finalize (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_mpool_hugepage.so)
> ==42751==    by 0x1076C202: mca_mpool_hugepage_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_mpool_hugepage.so)
> ==42751==    by 0x633CED9: mca_base_component_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x633CE01: mca_base_components_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x63C6F31: mca_mpool_base_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x634AEF7: mca_base_framework_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x4687B6A: ompi_mpi_finalize (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libmpi.so.0.0.0)
> ==42751==    by 0x20001F35: main (in /zhome/academic/HLRS/hlrs/hpcjschu/src/test/mpi_test_loop)
> ==42751==  Address 0xa3aa348 is 16,440 bytes inside a block of size 16,568 free'd
> ==42751==    at 0x4428CDA: free (vg_replace_malloc.c:530)
> ==42751==    by 0x630FED2: opal_free_list_destruct (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x63160C1: opal_rb_tree_destruct (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x1076BACE: mca_mpool_hugepage_finalize (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_mpool_hugepage.so)
> ==42751==    by 0x1076C202: mca_mpool_hugepage_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/openmpi/mca_mpool_hugepage.so)
> ==42751==    by 0x633CED9: mca_base_component_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x633CE01: mca_base_components_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x63C6F31: mca_mpool_base_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x634AEF7: mca_base_framework_close (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libopen-pal.so.0.0.0)
> ==42751==    by 0x4687B6A: ompi_mpi_finalize (in /zhome/academic/HLRS/hlrs/hpcjschu/opt-cray/openmpi-6093f2d-intel/lib/libmpi.so.0.0.0)
> ==42751==    by 0x20001F35: main (in /zhome/academic/HLRS/hlrs/hpcjschu/src/test/mpi_test_loop)
> ```
>
> I'm not sure whether the invalid writes (and reads) during initialization and communication are caused by Open MPI or by uGNI itself, or whether they are critical (the addresses seem to be "special"). The write-after-free in MPI_Finalize seems suspicious, though. I cannot say whether it causes the memory corruption I am seeing, but I thought I'd report it. I will dig further into this to try to figure out what causes the crashes (they are not deterministically reproducible, unfortunately).
>
> Cheers,
> Joseph
>
> On 05/10/2018 03:24 AM, Nathan Hjelm wrote:
>> Thanks for confirming that it works for you as well. I have a PR open on v3.1.x that brings osc/rdma up to date with master. I will also be bringing in some code that greatly improves multi-threaded RMA performance on Aries systems (at least in benchmarks; see github.com/hpc/rma-mt). That will not make it into v3.1.x but will be in v4.0.0.
>>
>> -Nathan
>>
>>> On May 9, 2018, at 1:26 AM, Joseph Schuchart <schuch...@hlrs.de> wrote:
>>>
>>> Nathan,
>>>
>>> Thank you, I can confirm that it works as expected with master on our system. I will stick to this version until 3.1.1 is out.
>>>
>>> Joseph
>>>
>>> On 05/08/2018 05:34 PM, Nathan Hjelm wrote:
>>>> Looks like it doesn't fail with master, so at some point I fixed this bug. The current plan is to bring all the master changes, including a number of bug fixes, into v3.1.1.
>>>>
>>>> -Nathan
>>>>
>>>> On May 08, 2018, at 08:25 AM, Joseph Schuchart <schuch...@hlrs.de> wrote:
>>>>> Nathan,
>>>>>
>>>>> Thanks for looking into that. My test program is attached.
>>>>>
>>>>> Best
>>>>> Joseph
>>>>>
>>>>> On 05/08/2018 02:56 PM, Nathan Hjelm wrote:
>>>>>> I will take a look today. Can you send me your test program?
>>>>>>
>>>>>> -Nathan
>>>>>>
>>>>>>> On May 8, 2018, at 2:49 AM, Joseph Schuchart <schuch...@hlrs.de> wrote:
>>>>>>>
>>>>>>> All,
>>>>>>>
>>>>>>> I have been experimenting with Open MPI 3.1.0 on our Cray XC40 (Haswell-based nodes, Aries interconnect) for multi-threaded MPI RMA. Unfortunately, a simple (single-threaded) test case consisting of two processes performing an MPI_Rget+MPI_Wait hangs when running on two nodes. It succeeds if both processes run on a single node.
>>>>>>>
>>>>>>> For completeness, I am attaching the config.log. The build environment was set up to build Open MPI for the login nodes (I wasn't sure how to properly cross-compile the libraries):
>>>>>>>
>>>>>>> ```
>>>>>>> # this seems necessary to avoid a linker error during build
>>>>>>> export CRAYPE_LINK_TYPE=dynamic
>>>>>>> module swap PrgEnv-cray PrgEnv-intel
>>>>>>> module sw craype-haswell craype-sandybridge
>>>>>>> module unload craype-hugepages16M
>>>>>>> module unload cray-mpich
>>>>>>> ```
>>>>>>>
>>>>>>> I am using mpirun to launch the test code.
>>>>>>> Below is the BTL debug log (with tcp disabled for clarity; turning it on makes no difference):
>>>>>>>
>>>>>>> ```
>>>>>>> mpirun --mca btl_base_verbose 100 --mca btl ^tcp -n 2 -N 1 ./mpi_test_loop
>>>>>>> [nid03060:36184] mca: base: components_register: registering framework btl components
>>>>>>> [nid03060:36184] mca: base: components_register: found loaded component self
>>>>>>> [nid03060:36184] mca: base: components_register: component self register function successful
>>>>>>> [nid03060:36184] mca: base: components_register: found loaded component sm
>>>>>>> [nid03061:36208] mca: base: components_register: registering framework btl components
>>>>>>> [nid03061:36208] mca: base: components_register: found loaded component self
>>>>>>> [nid03060:36184] mca: base: components_register: found loaded component ugni
>>>>>>> [nid03061:36208] mca: base: components_register: component self register function successful
>>>>>>> [nid03061:36208] mca: base: components_register: found loaded component sm
>>>>>>> [nid03061:36208] mca: base: components_register: found loaded component ugni
>>>>>>> [nid03060:36184] mca: base: components_register: component ugni register function successful
>>>>>>> [nid03060:36184] mca: base: components_register: found loaded component vader
>>>>>>> [nid03061:36208] mca: base: components_register: component ugni register function successful
>>>>>>> [nid03061:36208] mca: base: components_register: found loaded component vader
>>>>>>> [nid03060:36184] mca: base: components_register: component vader register function successful
>>>>>>> [nid03060:36184] mca: base: components_open: opening btl components
>>>>>>> [nid03060:36184] mca: base: components_open: found loaded component self
>>>>>>> [nid03060:36184] mca: base: components_open: component self open function successful
>>>>>>> [nid03060:36184] mca: base: components_open: found loaded component ugni
>>>>>>> [nid03060:36184] mca: base: components_open: component ugni open function successful
>>>>>>> [nid03060:36184] mca: base: components_open: found loaded component vader
>>>>>>> [nid03060:36184] mca: base: components_open: component vader open function successful
>>>>>>> [nid03060:36184] select: initializing btl component self
>>>>>>> [nid03060:36184] select: init of component self returned success
>>>>>>> [nid03060:36184] select: initializing btl component ugni
>>>>>>> [nid03061:36208] mca: base: components_register: component vader register function successful
>>>>>>> [nid03061:36208] mca: base: components_open: opening btl components
>>>>>>> [nid03061:36208] mca: base: components_open: found loaded component self
>>>>>>> [nid03061:36208] mca: base: components_open: component self open function successful
>>>>>>> [nid03061:36208] mca: base: components_open: found loaded component ugni
>>>>>>> [nid03061:36208] mca: base: components_open: component ugni open function successful
>>>>>>> [nid03061:36208] mca: base: components_open: found loaded component vader
>>>>>>> [nid03061:36208] mca: base: components_open: component vader open function successful
>>>>>>> [nid03061:36208] select: initializing btl component self
>>>>>>> [nid03061:36208] select: init of component self returned success
>>>>>>> [nid03061:36208] select: initializing btl component ugni
>>>>>>> [nid03061:36208] select: init of component ugni returned success
>>>>>>> [nid03061:36208] select: initializing btl component vader
>>>>>>> [nid03061:36208] select: init of component vader returned failure
>>>>>>> [nid03061:36208] mca: base: close: component vader closed
>>>>>>> [nid03061:36208] mca: base: close: unloading component vader
>>>>>>> [nid03060:36184] select: init of component ugni returned success
>>>>>>> [nid03060:36184] select: initializing btl component vader
>>>>>>> [nid03060:36184] select: init of component vader returned failure
>>>>>>> [nid03060:36184] mca: base: close: component vader closed
>>>>>>> [nid03060:36184] mca: base: close: unloading component vader
>>>>>>> [nid03061:36208] mca: bml: Using self btl for send to [[54630,1],1] on node nid03061
>>>>>>> [nid03060:36184] mca: bml: Using self btl for send to [[54630,1],0] on node nid03060
>>>>>>> [nid03061:36208] mca: bml: Using ugni btl for send to [[54630,1],0] on node (null)
>>>>>>> [nid03060:36184] mca: bml: Using ugni btl for send to [[54630,1],1] on node (null)
>>>>>>> ```
>>>>>>>
>>>>>>> It looks like the UGNI btl is being initialized correctly but then fails to find the node to communicate with? Is there a way to get more information? There doesn't seem to be an MCA parameter to increase the verbosity of the UGNI btl specifically.
>>>>>>>
>>>>>>> Any help would be appreciated!
>>>>>>>
>>>>>>> Cheers
>>>>>>> Joseph
>>>>>>> <config.log.tgz>
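For readers without the attachment, here is a minimal sketch of the kind of two-rank MPI_Rget+MPI_Wait loop described in the quoted report. This is an illustrative reconstruction, not Joseph's actual mpi_test_loop; the window layout, datatype, and peer choice are assumptions:

```
/* Hypothetical reproducer: each rank fetches one value from its neighbor
 * through a request-based RMA get. Not the original attached program. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* One uint64_t of window memory per process. */
    uint64_t *base;
    MPI_Win win;
    MPI_Win_allocate(sizeof(uint64_t), sizeof(uint64_t), MPI_INFO_NULL,
                     MPI_COMM_WORLD, &base, &win);
    *base = (uint64_t)rank;
    MPI_Barrier(MPI_COMM_WORLD);  /* ensure all windows are initialized */

    /* Passive-target epoch: get the neighbor's value and wait on it. */
    MPI_Win_lock_all(0, win);
    int peer = (rank + 1) % size;
    uint64_t val = 0;
    MPI_Request req;
    MPI_Rget(&val, 1, MPI_UINT64_T, peer, 0, 1, MPI_UINT64_T, win, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);  /* reportedly hangs here across nodes */
    MPI_Win_unlock_all(win);

    printf("rank %d read %" PRIu64 " from rank %d\n", rank, val, peer);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```

Compiled with mpicc and launched as in the quoted log (mpirun -n 2 -N 1 ./mpi_test_loop), the two ranks land on separate nodes, which is the configuration reported to hang with 3.1.0.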
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users