Just trying to understand - why are you saying this is a pmix problem? 
Obviously, something to do with mpirun is failing, but I don't see any 
indication here that it has to do with pmix.

Can you add --enable-debug to your configure line, rebuild, and inspect the 
core file from the crash?
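
Something along these lines should do it. This is only a rough sketch: it 
reuses the configure options from your message below, and it assumes the 
kernel writes the core file as "core" in the current directory. Many 
distributions hand core dumps to systemd-coredump or apport instead, so the 
file may end up somewhere else.

```shell
# Run from the Open MPI source tree (~/system/openmpi-4.1.0 in the report).
# Same options as the original configure line, plus --enable-debug.
./configure --prefix=/usr/local --with-pmix=internal --with-slurm \
    --without-tm --without-moab --without-singularity --without-fca \
    --without-hcoll --without-ime --without-lustre --without-psm \
    --without-psm2 --without-mxm --with-gnu-ld --enable-debug
make
sudo make install

# Allow core files in this shell, then reproduce the crash.
ulimit -c unlimited
mpirun

# Print a backtrace from the core file (path is an assumption, see above).
gdb -batch -ex bt "$(command -v mpirun)" core
```

The backtrace should show which library the corrupted malloc call comes 
from, which would tell us whether pmix is involved at all.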

> On Jan 31, 2021, at 8:18 PM, Andrej Prsa via devel <devel@lists.open-mpi.org> 
> wrote:
> 
> Hello list,
> 
> I just upgraded openmpi from 4.0.3 to 4.1.0 to see if it would solve a weird 
> openpmix problem we've been having; I configured it using:
> 
> ./configure --prefix=/usr/local --with-pmix=internal --with-slurm 
> --without-tm --without-moab --without-singularity --without-fca 
> --without-hcoll --without-ime --without-lustre --without-psm --without-psm2 
> --without-mxm --with-gnu-ld
> 
> (I also have an external pmix version installed and tried using that instead 
> of internal, but it doesn't change anything). Here's the output of configure:
> 
> Open MPI configuration:
> -----------------------
> Version: 4.1.0
> Build MPI C bindings: yes
> Build MPI C++ bindings (deprecated): no
> Build MPI Fortran bindings: mpif.h, use mpi, use mpi_f08
> MPI Build Java bindings (experimental): no
> Build Open SHMEM support: false (no spml)
> Debug build: no
> Platform file: (none)
> 
> Miscellaneous
> -----------------------
> CUDA support: no
> HWLOC support: external
> Libevent support: external
> PMIx support: Internal
> 
> Transports
> -----------------------
> Cisco usNIC: no
> Cray uGNI (Gemini/Aries): no
> Intel Omnipath (PSM2): no
> Intel TrueScale (PSM): no
> Mellanox MXM: no
> Open UCX: no
> OpenFabrics OFI Libfabric: no
> OpenFabrics Verbs: no
> Portals4: no
> Shared memory/copy in+copy out: yes
> Shared memory/Linux CMA: yes
> Shared memory/Linux KNEM: no
> Shared memory/XPMEM: no
> TCP: yes
> 
> Resource Managers
> -----------------------
> Cray Alps: no
> Grid Engine: no
> LSF: no
> Moab: no
> Slurm: yes
> ssh/rsh: yes
> Torque: no
> 
> OMPIO File Systems
> -----------------------
> DDN Infinite Memory Engine: no
> Generic Unix FS: yes
> IBM Spectrum Scale/GPFS: no
> Lustre: no
> PVFS2/OrangeFS: no
> 
> Once configured, make and sudo make install worked without a glitch; but when 
> I run mpirun, I get this:
> 
> andrej@terra:~/system/openmpi-4.1.0$ mpirun --version
> mpirun (Open MPI) 4.1.0
> 
> Report bugs to http://www.open-mpi.org/community/help/
> andrej@terra:~/system/openmpi-4.1.0$ mpirun
> malloc(): corrupted top size
> Aborted (core dumped)
> 
> No matter what I try to run, it always aborts like this. Any suggestions on 
> what I can try to resolve this?
> 
> Oh, I should also mention that I tried to remove the global libevent; openmpi 
> configured its internal copy but then failed to build.
> 
> Thanks,
> Andrej
> 

