Prentice,

libibverbs might also be pulled in by UCX (via either pml/ucx or btl/uct),
so to be 100% sure it is not used, you should run

mpirun --mca pml ob1 --mca btl ^openib,uct ...
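
If you want to double check which components are actually selected at run
time, you can raise the selection verbosity (assuming your build exposes the
usual MCA verbosity parameters, which a standard 4.0.x build does), e.g.

mpirun --mca pml ob1 --mca btl ^openib,uct --mca pml_base_verbose 10 --mca btl_base_verbose 10 ...

and check which pml and btl components report being selected.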

In order to force btl/tcp, you need to make sure pml/ob1 is used,
and you always need the btl/self component as well:

mpirun --mca pml ob1 --mca btl tcp,self ...
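
Note that with only tcp and self, ranks on the same node will also talk to
each other over TCP; if you want shared memory for intra-node traffic, you
can additionally list the vader btl (it shows up in your ompi_info output):

mpirun --mca pml ob1 --mca btl tcp,vader,self ...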

Cheers,

Gilles

----- Original Message -----
> Can anyone explain why my job still calls libibverbs when I run it with
> '-mca btl ^openib'?
> 
> If I instead use '-mca btl tcp', my jobs don't segfault. I would assume
> '-mca btl ^openib' and '-mca btl tcp' to be essentially equivalent, but
> there's obviously a difference between the two.
> 
> Prentice
> 
> On 7/23/20 3:34 PM, Prentice Bisbal wrote:
> > I manage a cluster that is very heterogeneous. Some nodes have
> > InfiniBand, while others have 10 Gb/s Ethernet. We recently upgraded
> > to CentOS 7, and built a new software stack for CentOS 7. We are using
> > OpenMPI 4.0.3, and we are using Slurm 19.05.5 as our job scheduler.
> >
> > We just noticed that when jobs are sent to the nodes with IB, they
> > segfault immediately, with the segfault appearing to come from
> > libibverbs.so. This is what I see in the stderr output for one of
> > these failed jobs:
> >
> > srun: error: greene021: tasks 0-3: Segmentation fault
> >
> > And here is what I see in the log messages of the compute node where 
> > that segfault happened:
> >
> > Jul 23 15:19:41 greene021 kernel: mpihello[7911]: segfault at 7f0635f38910 ip 00007f0635f49405 sp 00007ffe354485a0 error 4
> > Jul 23 15:19:41 greene021 kernel: mpihello[7912]: segfault at 7f23d51ea910 ip 00007f23d51fb405 sp 00007ffef250a9a0 error 4
> > Jul 23 15:19:41 greene021 kernel: in libibverbs.so.1.5.22.4[7f23d51ec000+18000]
> > Jul 23 15:19:41 greene021 kernel:
> > Jul 23 15:19:41 greene021 kernel: mpihello[7909]: segfault at 7ff504ba5910 ip 00007ff504bb6405 sp 00007ffff917ccb0 error 4
> > Jul 23 15:19:41 greene021 kernel: in libibverbs.so.1.5.22.4[7ff504ba7000+18000]
> > Jul 23 15:19:41 greene021 kernel:
> > Jul 23 15:19:41 greene021 kernel: mpihello[7910]: segfault at 7fa58abc5910 ip 00007fa58abd6405 sp 00007ffdde50c0d0 error 4
> > Jul 23 15:19:41 greene021 kernel: in libibverbs.so.1.5.22.4[7fa58abc7000+18000]
> > Jul 23 15:19:41 greene021 kernel:
> > Jul 23 15:19:41 greene021 kernel: in libibverbs.so.1.5.22.4[7f0635f3a000+18000]
> > Jul 23 15:19:41 greene021 kernel
> >
> > Any idea what is going on here, or how to debug further? I've been 
> > using OpenMPI for years, and it usually just works.
> >
> > I normally start my job with srun like this:
> >
> > srun ./mpihello
> >
> > But even if I try to take IB out of the equation by starting the job 
> > like this:
> >
> > mpirun -mca btl ^openib ./mpihello
> >
> > I still get a segfault issue, although the message to stderr is now a
> > little different:
> >
> > --------------------------------------------------------------------------
> > Primary job  terminated normally, but 1 process returned
> > a non-zero exit code. Per user-direction, the job has been aborted.
> > --------------------------------------------------------------------------
> >
> > --------------------------------------------------------------------------
> > mpirun noticed that process rank 1 with PID 8502 on node greene021
> > exited on signal 11 (Segmentation fault).
> > --------------------------------------------------------------------------
> >
> >
> > The segfault happens immediately. It seems to happen as soon as
> > MPI_Init() is called. The program I'm running is a very simple MPI
> > "Hello world!" program.
> >
> > The output of  ompi_info is below my signature, in case that helps.
> >
> > Prentice
> >
> > $ ompi_info
> >                  Package: Open MPI u...@host.example.com Distribution
> >                 Open MPI: 4.0.3
> >   Open MPI repo revision: v4.0.3
> >    Open MPI release date: Mar 03, 2020
> >                 Open RTE: 4.0.3
> >   Open RTE repo revision: v4.0.3
> >    Open RTE release date: Mar 03, 2020
> >                     OPAL: 4.0.3
> >       OPAL repo revision: v4.0.3
> >        OPAL release date: Mar 03, 2020
> >                  MPI API: 3.1.0
> >             Ident string: 4.0.3
> >                   Prefix: /usr/pppl/gcc/9.3-pkgs/openmpi-4.0.3
> >  Configured architecture: x86_64-unknown-linux-gnu
> >           Configure host: dawson027.pppl.gov
> >            Configured by: lglant
> >            Configured on: Mon Jun  1 12:37:07 EDT 2020
> >           Configure host: dawson027.pppl.gov
> >   Configure command line: '--prefix=/usr/pppl/gcc/9.3-pkgs/openmpi-4.0.3'
> >                           '--with-ucx' '--with-verbs' '--with-libfabric'
> >                           '--with-libevent=/usr' '--with-libevent-libdir=/usr/lib64'
> >                           '--with-pmix=/usr/pppl/pmix/3.1.5' '--with-pmi'
> >                 Built by: lglant
> >                 Built on: Mon Jun  1 13:05:40 EDT 2020
> >               Built host: dawson027.pppl.gov
> >               C bindings: yes
> >             C++ bindings: no
> >              Fort mpif.h: yes (all)
> >             Fort use mpi: yes (full: ignore TKR)
> >        Fort use mpi size: deprecated-ompi-info-value
> >         Fort use mpi_f08: yes
> >  Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
> >                           limitations in the gfortran compiler and/or Open
> >                           MPI, does not support the following: array
> >                           subsections, direct passthru (where possible) to
> >                           underlying Open MPI's C functionality
> >   Fort mpi_f08 subarrays: no
> >            Java bindings: no
> >   Wrapper compiler rpath: runpath
> >               C compiler: gcc
> >      C compiler absolute: /usr/pppl/gcc/9.3.0/bin/gcc
> >   C compiler family name: GNU
> >       C compiler version: 9.3.0
> >             C++ compiler: g++
> >    C++ compiler absolute: /usr/pppl/gcc/9.3.0/bin/g++
> >            Fort compiler: gfortran
> >        Fort compiler abs: /usr/pppl/gcc/9.3.0/bin/gfortran
> >          Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
> >    Fort 08 assumed shape: yes
> >       Fort optional args: yes
> >           Fort INTERFACE: yes
> >     Fort ISO_FORTRAN_ENV: yes
> >        Fort STORAGE_SIZE: yes
> >       Fort BIND(C) (all): yes
> >       Fort ISO_C_BINDING: yes
> >  Fort SUBROUTINE BIND(C): yes
> >        Fort TYPE,BIND(C): yes
> >  Fort T,BIND(C,name="a"): yes
> >             Fort PRIVATE: yes
> >           Fort PROTECTED: yes
> >            Fort ABSTRACT: yes
> >        Fort ASYNCHRONOUS: yes
> >           Fort PROCEDURE: yes
> >          Fort USE...ONLY: yes
> >            Fort C_FUNLOC: yes
> >  Fort f08 using wrappers: yes
> >          Fort MPI_SIZEOF: yes
> >              C profiling: yes
> >            C++ profiling: no
> >    Fort mpif.h profiling: yes
> >   Fort use mpi profiling: yes
> >    Fort use mpi_f08 prof: yes
> >           C++ exceptions: no
> >           Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
> >                           OMPI progress: no, ORTE progress: yes, Event lib: yes)
> >            Sparse Groups: no
> >   Internal debug support: no
> >   MPI interface warnings: yes
> >      MPI parameter check: runtime
> > Memory profiling support: no
> > Memory debugging support: no
> >               dl support: yes
> >    Heterogeneous support: no
> >  mpirun default --prefix: no
> >        MPI_WTIME support: native
> >      Symbol vis. support: yes
> >    Host topology support: yes
> >             IPv6 support: no
> >       MPI1 compatibility: no
> >           MPI extensions: affinity, cuda, pcollreq
> >    FT Checkpoint support: no (checkpoint thread: no)
> >    C/R Enabled Debugging: no
> >   MPI_MAX_PROCESSOR_NAME: 256
> >     MPI_MAX_ERROR_STRING: 256
> >      MPI_MAX_OBJECT_NAME: 64
> >         MPI_MAX_INFO_KEY: 36
> >         MPI_MAX_INFO_VAL: 256
> >        MPI_MAX_PORT_NAME: 1024
> >   MPI_MAX_DATAREP_STRING: 128
> >            MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >            MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >            MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA btl: self (MCA v2.1.0, API v3.1.0, Component v4.0.3)
> >                  MCA btl: uct (MCA v2.1.0, API v3.1.0, Component v4.0.3)
> >                  MCA btl: tcp (MCA v2.1.0, API v3.1.0, Component v4.0.3)
> >                  MCA btl: usnic (MCA v2.1.0, API v3.1.0, Component v4.0.3)
> >                  MCA btl: vader (MCA v2.1.0, API v3.1.0, Component v4.0.3)
> >                  MCA btl: openib (MCA v2.1.0, API v3.1.0, Component v4.0.3)
> >             MCA compress: gzip (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >             MCA compress: bzip (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA crs: none (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                   MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                MCA event: external (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA hwloc: hwloc201 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                   MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                   MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >          MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >          MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >               MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >              MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                 MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA pmix: s2 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA pmix: ext3x (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA pmix: s1 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA pmix: pmix3x (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >               MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v4.0.3)
> >            MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >               MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >               MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >               MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >               MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA ess: env (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA ess: tool (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >              MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA iof: tool (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA iof: orted (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA odls: pspawn (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA odls: default (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA regx: fwd (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                 MCA regx: naive (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                 MCA regx: reverse (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA rml: oob (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >               MCA routed: direct (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >               MCA routed: binomial (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >               MCA routed: radix (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >               MCA schizo: slurm (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >               MCA schizo: orte (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >               MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >               MCA schizo: flux (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                MCA state: orted (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                MCA state: novm (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                MCA state: app (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                MCA state: tool (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> >                  MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA coll: self (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA coll: inter (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA coll: sync (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA coll: monitoring (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA coll: basic (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                   MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                   MCA io: romio321 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                   MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA mtl: ofi (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA mtl: psm (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA osc: ucx (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v4.0.3)
> >                  MCA pml: v (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA pml: ucx (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                  MCA rte: orte (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >             MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >             MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >             MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >                 MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component v4.0.3)
> >                 MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v4.0.3)
> >            MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component v4.0.3)
> >
> >
> -- 
> Prentice Bisbal
> Lead Software Engineer
> Research Computing
> Princeton Plasma Physics Laboratory
> http://www.pppl.gov
> 
> 
