In 1.10.x it is possible for the BTLs to be in use by either ob1 or an
oshmem component. In 2.x one-sided components can also use BTLs. The MTL
interface does not provide support for accessing hardware atomics and
RDMA. As for UD, it stands for Unreliable Datagram. Using it gets better
message rates for small messages but really hurts bandwidth. Our
applications are bandwidth bound, not message-rate bound, so we should
be using XRC, not UD.
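
For example, requesting XRC receive queues explicitly with ob1 looks
roughly like this (a sketch reusing the queue specification David has
been testing; the queue sizes and depths should be tuned for the fabric
and application):

$> mpirun -mca pml ob1 \
     -mca btl_openib_receive_queues X,4096,1024:X,12288,512:X,65536,512 \
     ./hello_c.x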

-Nathan

On Thu, Apr 21, 2016 at 09:33:06AM -0600, David Shrader wrote:
>    Hey Nathan,
> 
>    I thought only one pml could be loaded at a time, and the only pml that
>    could use BTLs was ob1. If that is the case, how can the openib btl run
>    at the same time as cm and yalla?
> 
>    Also, what is UD?
> 
>    Thanks,
>    David
> 
>    On 04/21/2016 09:25 AM, Nathan Hjelm wrote:
> 
>  The openib btl should be able to run alongside cm/mxm or yalla. If I
>  have time this weekend I will get on the mustang and see what the
>  problem is. The best answer is to change the openmpi-mca-params.conf in
>  the install to have pml = ob1. I have seen little to no benefit from
>  using MXM on mustang. In fact, the default configuration (which uses UD)
>  gets terrible bandwidth.
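> 
>  Concretely, that is a one-line edit to the params file under the install
>  prefix (a minimal sketch, assuming the default install layout):
> 
>    # <install prefix>/etc/openmpi-mca-params.conf
>    pml = ob1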
> 
>  -Nathan
> 
>  On Thu, Apr 21, 2016 at 01:48:46PM +0300, Alina Sklarevich wrote:
> 
>     David, thanks for the info you provided.
>     I will try to dig in further to see what might be causing this issue.
>     In the meantime, maybe Nathan can please comment about the openib btl
>     behavior here?
>     Thanks,
>     Alina.
>     On Wed, Apr 20, 2016 at 8:01 PM, David Shrader <dshra...@lanl.gov> wrote:
> 
>       Hello Alina,
> 
>       Thank you for the information about how the pml components work. I knew
>       that the other components were being opened and ultimately closed in
>       favor of yalla, but I didn't realize that the initial open would cause a
>       persistent change in the ompi runtime.
> 
>       Here's the information you requested about the ib network:
> 
>       - MOFED version:
>       We are using the OpenFabrics software as bundled by RedHat, and my IB
>       network folks say we're running something close to v1.5.4
>       - ibv_devinfo:
>       [dshrader@mu0001 examples]$ ibv_devinfo
>       hca_id: mlx4_0
>               transport:                      InfiniBand (0)
>               fw_ver:                         2.9.1000
>               node_guid:                      0025:90ff:ff16:78d8
>               sys_image_guid:                 0025:90ff:ff16:78db
>               vendor_id:                      0x02c9
>               vendor_part_id:                 26428
>               hw_ver:                         0xB0
>               board_id:                       SM_2121000001000
>               phys_port_cnt:                  1
>                       port:   1
>                               state:                  PORT_ACTIVE (4)
>                               max_mtu:                4096 (5)
>                               active_mtu:             4096 (5)
>                               sm_lid:                 250
>                               port_lid:               366
>                               port_lmc:               0x00
>                               link_layer:             InfiniBand
> 
>       I still get the seg fault when specifying the hca:
> 
>       $> mpirun -n 1 -mca btl_openib_receive_queues
>       X,4096,1024:X,12288,512:X,65536,512 -mca btl_openib_if_include mlx4_0
>       ./hello_c.x
>       Hello, world, I am 0 of 1, (Open MPI v1.10.2, package: Open MPI
>       dshra...@mu-fey.lanl.gov Distribution, ident: 1.10.2, repo rev:
>       v1.10.1-145-g799148f, Jan 21, 2016, 135)
>       
> --------------------------------------------------------------------------
>       mpirun noticed that process rank 0 with PID 10045 on node mu0001 exited
>       on signal 11 (Segmentation fault).
>       
> --------------------------------------------------------------------------
> 
>       I don't know if this helps, but the first time I tried the command I
>       mistyped the hca name. This got me a warning, but no seg fault:
> 
>       $> mpirun -n 1 -mca btl_openib_receive_queues
>       X,4096,1024:X,12288,512:X,65536,512 -mca btl_openib_if_include ml4_0
>       ./hello_c.x
>       
> --------------------------------------------------------------------------
>       WARNING: One or more nonexistent OpenFabrics devices/ports were
>       specified:
> 
>         Host:                 mu0001
>         MCA parameter:        mca_btl_if_include
>         Nonexistent entities: ml4_0
> 
>       These entities will be ignored.  You can disable this warning by
>       setting the btl_openib_warn_nonexistent_if MCA parameter to 0.
>       
> --------------------------------------------------------------------------
>       Hello, world, I am 0 of 1, (Open MPI v1.10.2, package: Open MPI
>       dshra...@mu-fey.lanl.gov Distribution, ident: 1.10.2, repo rev:
>       v1.10.1-145-g799148f, Jan 21, 2016, 135)
> 
>       So, telling the openib btl to use the actual hca didn't get the seg
>       fault to go away, but giving it a dummy value did.
> 
>       Thanks,
>       David
> 
>       On 04/20/2016 08:13 AM, Alina Sklarevich wrote:
> 
>         Hi David,
>         I was able to reproduce the issue you reported.
>         When the command line doesn't specify the components to use, ompi will
>         try to load/open all the ones available (and close them in the end)
>         and then choose the components according to their priority and whether
>         or not they were opened successfully.
>         This means that even if pml yalla was the one running, other
>         components were opened and closed as well.
>         The parameter you are using, btl_openib_receive_queues, doesn't have
>         an effect on pml yalla. It only affects the openib btl, which is used
>         by pml ob1.
>         Using the verbosity of btl_base_verbose, I see that when the
>         segmentation fault happens, the code doesn't reach the phase of
>         unloading the openib btl, so perhaps the problem originates there
>         (since pml yalla was already unloaded).
>         Can you please try adding this mca parameter to your command line to
>         specify the HCA you are using?
>         -mca btl_openib_if_include <hca>
>         It made the segv go away for me.
>         Can you please attach the output of ibv_devinfo and write the MOFED
>         version you are using?
>         Thank you,
>         Alina.
>         On Wed, Apr 20, 2016 at 2:53 PM, Joshua Ladd <jladd.m...@gmail.com>
>         wrote:
> 
>           Hi, David
> 
>           We are looking into your report.
> 
>           Best,
> 
>           Josh
>           On Tue, Apr 19, 2016 at 4:41 PM, David Shrader <dshra...@lanl.gov>
>           wrote:
> 
>             Hello,
> 
>             I have been investigating using XRC on a cluster with a Mellanox
>             interconnect. I have found that in a certain situation I get a seg
>             fault. I am using 1.10.2 compiled with gcc 5.3.0, and the simplest
>             configure line that I have found that still results in the seg
>             fault is as follows:
> 
>             $> ./configure --with-hcoll --with-mxm --prefix=...
> 
>             I do have mxm 3.4.3065 and hcoll 3.3.768 installed into system
>             space (/usr/lib64). If I use '--without-hcoll --without-mxm,' the
>             seg fault does not happen.
> 
>             The seg fault happens even when using examples/hello_c.c, so here
>             is an example of the seg fault using it:
> 
>             $> mpicc hello_c.c -o hello_c.x
>             $> mpirun -n 1 ./hello_c.x
>             Hello, world, I am 0 of 1, (Open MPI v1.10.2, package: Open MPI
>             dshra...@mu-fey.lanl.gov Distribution, ident: 1.10.2, repo rev:
>             v1.10.1-145-g799148f, Jan 21, 2016, 135)
>             $> mpirun -n 1 -mca btl_openib_receive_queues
>             X,4096,1024:X,12288,512:X,65536,512
>             Hello, world, I am 0 of 1, (Open MPI v1.10.2, package: Open MPI
>             dshra...@mu-fey.lanl.gov Distribution, ident: 1.10.2, repo rev:
>             v1.10.1-145-g799148f, Jan 21, 2016, 135)
>             
> --------------------------------------------------------------------------
>             mpirun noticed that process rank 0 with PID 22819 on node mu0001
>             exited on signal 11 (Segmentation fault).
>             
> --------------------------------------------------------------------------
> 
>             The seg fault happens no matter the number of ranks. I have tried
>             the above command with '-mca pml_base_verbose,' and it shows that
>             I am using the yalla pml:
> 
>             $> mpirun -n 1 -mca btl_openib_receive_queues
>             X,4096,1024:X,12288,512:X,65536,512 -mca pml_base_verbose 100
>             ./hello_c.x
>             ...output snipped...
>             [mu0001.localdomain:22825] select: component cm not selected /
>             finalized
>             [mu0001.localdomain:22825] select: component ob1 not selected /
>             finalized
>             [mu0001.localdomain:22825] select: component yalla selected
>             ...output snipped...
>             
> --------------------------------------------------------------------------
>             mpirun noticed that process rank 0 with PID 22825 on node mu0001
>             exited on signal 11 (Segmentation fault).
>             
> --------------------------------------------------------------------------
> 
>             Interestingly enough, if I tell mpirun what pml to use, the seg
>             fault goes away. The following command does not get the seg fault:
> 
>             $> mpirun -n 1 -mca btl_openib_receive_queues
>             X,4096,1024:X,12288,512:X,65536,512 -mca pml yalla ./hello_c.x
> 
>             Passing either ob1 or cm to '-mca pml' also works. So it seems
>             that the seg fault comes about when the yalla pml is chosen by
>             default, mxm/hcoll is involved, and XRC is being used. I'm not
>             sure if mxm is to blame, however, as using '-mca pml cm -mca mtl
>             mxm' with the XRC parameters doesn't throw the seg fault.
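> 
>             For completeness, that combination spelled out as a full command
>             (same receive-queue specification as above) would be something
>             like:
> 
>             $> mpirun -n 1 -mca btl_openib_receive_queues
>             X,4096,1024:X,12288,512:X,65536,512 -mca pml cm -mca mtl mxm
>             ./hello_c.x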
> 
>             Other information...
>             OS: RHEL 6.7-based (TOSS)
>             OpenFabrics: RedHat provided
>             Kernel: 2.6.32-573.8.1.2chaos.ch5.4.x86_64
>             Config.log and 'ompi_info --all' are in the tarball ompi.tar.bz2
>             which is attached.
> 
>             Is there something else I should be doing with the yalla pml when
>             using XRC? Regardless, I hope reporting the seg fault is useful.
> 
>             Thanks,
>             David
> 
>             --
>             David Shrader
>             HPC-ENV High Performance Computer Systems
>             Los Alamos National Lab
>             Email: dshrader <at> lanl.gov
> 
>   --
>   David Shrader
>   HPC-ENV High Performance Computer Systems
>   Los Alamos National Lab
>   Email: dshrader <at> lanl.gov
> 
>  --
>  David Shrader
>  HPC-ENV High Performance Computer Systems
>  Los Alamos National Lab
>  Email: dshrader <at> lanl.gov
