Alltoallv has both a large-count and a large-displacement problem in its API. You can work around the latter with the neighborhood alltoall variants, called on a duplicate of your original communicator that has been made neighborhood-compatible (i.e. given a fully connected graph topology): MPI_Neighbor_alltoallw takes MPI_Aint displacements instead of int.
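Roughly, that workaround looks like this (an untested sketch, not from this thread; the helper name large_displacement_alltoallv is made up, the graph topology is simply fully connected, and error checking is omitted):

#include <mpi.h>
#include <stdlib.h>

/* Sketch: emulate MPI_Alltoallv with byte displacements larger than 2 GiB
 * by calling MPI_Neighbor_alltoallw (which takes MPI_Aint displacements)
 * on a graph communicator where every rank lists every rank as a neighbor. */
void large_displacement_alltoallv(const void *sendbuf, const int *sendcounts,
                                  const MPI_Aint *sdispls,   /* byte offsets */
                                  void *recvbuf, const int *recvcounts,
                                  const MPI_Aint *rdispls,   /* byte offsets */
                                  MPI_Datatype type, MPI_Comm comm)
{
    int size;
    MPI_Comm_size(comm, &size);

    /* "Neighborhood-compatible" duplicate: a distributed graph topology in
     * which the neighbors of every rank are all ranks 0..size-1, in order,
     * so the neighborhood collective reproduces the alltoallv pattern. */
    int *neighbors = malloc(size * sizeof(int));
    for (int i = 0; i < size; ++i) neighbors[i] = i;

    MPI_Comm graph_comm;
    MPI_Dist_graph_create_adjacent(comm, size, neighbors, MPI_UNWEIGHTED,
                                   size, neighbors, MPI_UNWEIGHTED,
                                   MPI_INFO_NULL, 0 /* no reorder */,
                                   &graph_comm);

    /* MPI_Neighbor_alltoallw takes one datatype per neighbor. */
    MPI_Datatype *types = malloc(size * sizeof(MPI_Datatype));
    for (int i = 0; i < size; ++i) types[i] = type;

    MPI_Neighbor_alltoallw(sendbuf, sendcounts, sdispls, types,
                           recvbuf, recvcounts, rdispls, types, graph_comm);

    free(types);
    free(neighbors);
    MPI_Comm_free(&graph_comm);
}

The large-count side is what derived datatypes address: wrap a fixed-size chunk of elements in a contiguous type so the int count passed to MPI stays below 2^31. Again only a sketch (the helper name is made up and the remainder handling is omitted; BigMPI, mentioned below, does this properly):

/* Pack n elements of 'base' into a single derived type, usable with count = 1. */
MPI_Datatype make_big_type(MPI_Datatype base, MPI_Count n)
{
    /* Assumes n is a multiple of CHUNK for brevity. */
    const MPI_Count CHUNK = 1 << 28;            /* elements per chunk */
    MPI_Datatype chunk_type, big_type;
    MPI_Type_contiguous((int)CHUNK, base, &chunk_type);
    MPI_Type_contiguous((int)(n / CHUNK), chunk_type, &big_type);
    MPI_Type_commit(&big_type);
    MPI_Type_free(&chunk_type);                 /* big_type keeps its own copy */
    return big_type;
}

A buffer of n elements can then be passed to the collective as count = 1 of the returned type.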
If you need tests, the https://github.com/jeffhammond/BigMPI test suite is nothing but large-count MPI calls using derived datatypes.

Jeff

On Thu, 2 Jun 2022 at 22:28, Eric Chamberland via users <users@lists.open-mpi.org> wrote:

> Hi Josh,
>
> Ok, thanks for the suggestion. We are in the process of testing with IntelMPI right now. I hope to do it with a newer version of OpenMPI too.
>
> Do you suggest a minimum version for the UCX lib?
>
> Thanks,
>
> Eric
>
> On 2022-06-02 04:05, Josh Hursey via users wrote:
>
> I would suggest trying OMPI v4.1.4 (or the v5 snapshot)
>  * https://www.open-mpi.org/software/ompi/v4.1/
>  * https://www.mail-archive.com/announce@lists.open-mpi.org//msg00152.html
>
> We fixed some large-payload collective issues in that release which might be what you are seeing here with MPI_Alltoallv and the tuned collective component.
>
> On Thu, Jun 2, 2022 at 1:54 AM Mikhail Brinskii via users <users@lists.open-mpi.org> wrote:
>
>> Hi Eric,
>>
>> Yes, UCX is supposed to be stable for large-sized problems.
>>
>> Did you see the same crash with both OMPI-4.0.3 + UCX 1.8.0 and OMPI-4.1.2 + UCX 1.11.2?
>>
>> Have you also tried to run the large-sized problem tests with OMPI-5.0.x?
>>
>> Regarding the application, at some point it invokes MPI_Alltoallv sending more than 2GB to some of the ranks (using derived datatypes), right?
>>
>> //WBR, Mikhail
>>
>> From: users <users-boun...@lists.open-mpi.org> On Behalf Of Eric Chamberland via users
>> Sent: Thursday, June 2, 2022 5:31 AM
>> To: Open MPI Users <users@lists.open-mpi.org>
>> Cc: Eric Chamberland <eric.chamberl...@giref.ulaval.ca>; Thomas Briffard <thomas.briff...@michelin.com>; Vivien Clauzon <vivien.clau...@michelin.com>; dave.mar...@giref.ulaval.ca; Ramses van Zon <r...@scinet.utoronto.ca>; charles.coulomb...@ulaval.ca
>> Subject: [OMPI users] Segfault in ucp_dt_pack function from UCX library 1.8.0 and 1.11.2 for large sized communications using both OpenMPI 4.0.3 and 4.1.2
>>
>> Hi,
>>
>> In the past, we have successfully launched large-sized (finite element) computations using PARMetis as mesh partitioner.
>>
>> It was first in 2012 with OpenMPI (v2.?) and then in March 2019 with OpenMPI 3.1.2 that we succeeded.
>>
>> Today, we have a bunch of nightly (small) tests running nicely and testing all of OpenMPI (4.0.x, 4.1.x and 5.0.x), MPICH-3.3.2 and IntelMPI 2021.6.
>>
>> Preparing to launch the same computation we did in 2012, and even larger ones, we compiled with both OpenMPI 4.0.3 + ucx-1.8.0 and OpenMPI 4.1.2 + ucx-1.11.2 and launched computations from small to large problems (meshes).
>>
>> For small meshes, it goes fine.
>>
>> But when we reach near 2^31 faces in the 3D mesh we are using and call ParMETIS_V3_PartMeshKway, we always get a segfault with the same backtrace pointing into the UCX library:
>>
>> Wed Jun 1 23:04:54 2022<stdout>:chrono::InterfaceParMetis::ParMETIS_V3_PartMeshKway::debut VmSize: 1202304 VmRSS: 349456 VmPeak: 1211736 VmData: 500764 VmHWM: 359012 <etiq_18>
>> Wed Jun 1 23:07:07 2022<stdout>:Erreur : MEF++ Signal recu : 11 : segmentation violation
>> Wed Jun 1 23:07:07 2022<stdout>:Erreur :
>> Wed Jun 1 23:07:07 2022<stdout>:------------------------------ (Début des informations destinées aux développeurs C++) ------------------------------
>> Wed Jun 1 23:07:07 2022<stdout>:La pile d'appels contient 27 symboles.
>> Wed Jun 1 23:07:07 2022<stdout>:# 000: reqBacktrace(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) >>> probGD.opt (probGD.opt(_Z12reqBacktraceRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x4119f1])
>> Wed Jun 1 23:07:07 2022<stdout>:# 001: attacheDebugger() >>> probGD.opt (probGD.opt(_Z15attacheDebuggerv+0x29a) [0x41386a])
>> Wed Jun 1 23:07:07 2022<stdout>:# 002: /gpfs/fs0/project/d/deteix/ericc/GIREF/lib/libgiref_opt_Util.so(traitementSignal+0x1f9f) [0x2ab3aef0e5cf]
>> Wed Jun 1 23:07:07 2022<stdout>:# 003: /lib64/libc.so.6(+0x36400) [0x2ab3bd59a400]
>> Wed Jun 1 23:07:07 2022<stdout>:# 004: /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/libucp.so.0(ucp_dt_pack+0x123) [0x2ab3c966e353]
>> Wed Jun 1 23:07:07 2022<stdout>:# 005: /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/libucp.so.0(+0x536b7) [0x2ab3c968d6b7]
>> Wed Jun 1 23:07:07 2022<stdout>:# 006: /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/ucx/libuct_ib.so.0(uct_dc_mlx5_ep_am_bcopy+0xd7) [0x2ab3ca712137]
>> Wed Jun 1 23:07:07 2022<stdout>:# 007: /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/libucp.so.0(+0x52d3c) [0x2ab3c968cd3c]
>> Wed Jun 1 23:07:07 2022<stdout>:# 008: /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/libucp.so.0(ucp_tag_send_nbx+0x5ad) [0x2ab3c9696dcd]
>> Wed Jun 1 23:07:07 2022<stdout>:# 009: /scinet/niagara/software/2022a/opt/gcc-11.2.0/openmpi/4.1.2+ucx-1.11.2/lib/openmpi/mca_pml_ucx.so(mca_pml_ucx_send+0xf2) [0x2ab3c922e0b2]
>> Wed Jun 1 23:07:07 2022<stdout>:# 010: /scinet/niagara/software/2022a/opt/gcc-11.2.0/openmpi/4.1.2+ucx-1.11.2/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x92) [0x2ab3bbca5a32]
>> Wed Jun 1 23:07:07 2022<stdout>:# 011: /scinet/niagara/software/2022a/opt/gcc-11.2.0/openmpi/4.1.2+ucx-1.11.2/lib/libmpi.so.40(ompi_coll_base_alltoallv_intra_pairwise+0x141) [0x2ab3bbcad941]
>> Wed Jun 1 23:07:07 2022<stdout>:# 012: /scinet/niagara/software/2022a/opt/gcc-11.2.0/openmpi/4.1.2+ucx-1.11.2/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_alltoallv_intra_dec_fixed+0x42) [0x2ab3d4836da2]
>> Wed Jun 1 23:07:07 2022<stdout>:# 013: /scinet/niagara/software/2022a/opt/gcc-11.2.0/openmpi/4.1.2+ucx-1.11.2/lib/libmpi.so.40(PMPI_Alltoallv+0x29) [0x2ab3bbc7bdf9]
>> Wed Jun 1 23:07:07 2022<stdout>:# 014: /scinet/niagara/software/2022a/opt/gcc-11.2.0-openmpi-4.1.2+ucx-1.11.2/petsc-64bits/3.17.1/lib/libparmetis.so(libparmetis__gkMPI_Alltoallv+0x106) [0x2ab3bb0e1c06]
>> Wed Jun 1 23:07:07 2022<stdout>:# 015: /scinet/niagara/software/2022a/opt/gcc-11.2.0-openmpi-4.1.2+ucx-1.11.2/petsc-64bits/3.17.1/lib/libparmetis.so(ParMETIS_V3_Mesh2Dual+0xdd6) [0x2ab3bb0f10b6]
>> Wed Jun 1 23:07:07 2022<stdout>:# 016: /scinet/niagara/software/2022a/opt/gcc-11.2.0-openmpi-4.1.2+ucx-1.11.2/petsc-64bits/3.17.1/lib/libparmetis.so(ParMETIS_V3_PartMeshKway+0x100) [0x2ab3bb0f1ac0]
>>
>> PARMetis is compiled as part of PETSc-3.17.1 with 64-bit indices.
>> Here are the PETSc configure options:
>>
>> --prefix=/scinet/niagara/software/2022a/opt/gcc-11.2.0-openmpi-4.1.2+ucx-1.11.2/petsc-64bits/3.17.1
>> COPTFLAGS=\"-O2 -march=native\"
>> CXXOPTFLAGS=\"-O2 -march=native\"
>> FOPTFLAGS=\"-O2 -march=native\"
>> --download-fftw=1
>> --download-hdf5=1
>> --download-hypre=1
>> --download-metis=1
>> --download-mumps=1
>> --download-parmetis=1
>> --download-plapack=1
>> --download-prometheus=1
>> --download-ptscotch=1
>> --download-scotch=1
>> --download-sprng=1
>> --download-superlu_dist=1
>> --download-triangle=1
>> --with-avx512-kernels=1
>> --with-blaslapack-dir=/scinet/intel/oneapi/2021u4/mkl/2021.4.0
>> --with-cc=mpicc
>> --with-cxx=mpicxx
>> --with-cxx-dialect=C++11
>> --with-debugging=0
>> --with-fc=mpifort
>> --with-mkl_pardiso-dir=/scinet/intel/oneapi/2021u4/mkl/2021.4.0
>> --with-scalapack=1
>> --with-scalapack-lib=\"[/scinet/intel/oneapi/2021u4/mkl/2021.4.0/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/oneapi/2021u4/mkl/2021.4.0/lib/intel64/libmkl_blacs_openmpi_lp64.so]\"
>> --with-x=0
>> --with-64-bit-indices=1
>> --with-memalign=64
>>
>> and the OpenMPI configure options:
>>
>> '--prefix=/scinet/niagara/software/2022a/opt/gcc-11.2.0/openmpi/4.1.2+ucx-1.11.2'
>> '--enable-mpi-cxx'
>> '--enable-mpi1-compatibility'
>> '--with-hwloc=internal'
>> '--with-knem=/opt/knem-1.1.3.90mlnx1'
>> '--with-libevent=internal'
>> '--with-platform=contrib/platform/mellanox/optimized'
>> '--with-pmix=internal'
>> '--with-slurm=/opt/slurm'
>> '--with-ucx=/scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2'
>>
>> I am then wondering:
>>
>> 1) Is the UCX library considered "stable" for production use with very large-sized problems?
>>
>> 2) Is there a way to "bypass" UCX at runtime?
>>
>> 3) Any idea for debugging this?
>>
>> Of course, I do not yet have a "minimum reproducer" that triggers the bug, since it happens only on "large" problems, but I think I could export the data for a 512-process reproducer with the PARMetis call only...
>>
>> Thanks for helping,
>>
>> Eric
>>
>> --
>> Eric Chamberland, ing., M. Ing
>> Professionnel de recherche
>> GIREF/Université Laval
>> (418) 656-2131 poste 41 22 42
>
> --
> Josh Hursey
> IBM Spectrum MPI Developer
>
> --
> Eric Chamberland, ing., M. Ing
> Professionnel de recherche
> GIREF/Université Laval
> (418) 656-2131 poste 41 22 42

--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/