Thanks, Gus. I'll give it a try and post the results.
I am a newbie at this and very much appreciate any advice.

Cheers
--Boris


On Mon, Jul 17, 2017 at 10:09 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> On 07/17/2017 01:06 PM, Gus Correa wrote:
>
>> Hi Boris
>>
>> The nodes may have standard Gigabit Ethernet interfaces
>> besides the InfiniBand (RoCE) ones.
>> You may want to direct Open MPI to use the InfiniBand interfaces,
>> not Gigabit Ethernet,
>> by adding something like this to "--mca btl self,vader,tcp":
>>
>> "--mca btl_tcp_if_include ib0,ib1"
>>
>> (Where the interface names ib0,ib1 are just my guess for
>> what your nodes may have. Check with your "root" system administrator!)
>>
>> That syntax may also use an IP address or a subnet mask,
>> whichever is simpler for you.
>> It is better explained in this FAQ:
>>
>> https://www.open-mpi.org/faq/?category=all#tcp-selection
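>>
>> For example, a full command line might look like this (the interface
>> names and the subnet are only guesses -- substitute whatever your
>> nodes actually have):
>>
>>   mpiexec --mca btl self,vader,tcp \
>>           --mca btl_tcp_if_include ib0,ib1 \
>>           --hostfile hostfile5 -n 200 DoWork
>>
>>   # or, equivalently, select the interfaces by subnet:
>>   mpiexec --mca btl self,vader,tcp \
>>           --mca btl_tcp_if_include 192.168.1.0/24 \
>>           --hostfile hostfile5 -n 200 DoWork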
>>
>> BTW, some of your questions (and others that you may hit later)
>> are covered in the OpenMPI FAQ:
>>
>> https://www.open-mpi.org/faq/?category=all
>>
>> I hope this helps,
>> Gus Correa
>>
>>
>> On 07/17/2017 12:43 PM, Boris M. Vulovic wrote:
>>
>>> Gus, Gilles, Russell, John:
>>>
>>> Thanks very much for the replies and the help.
>>> I got confirmation from the "root" that it is indeed RoCE with 100G.
>>>
>>> I'll go over the info in the link Russell provided, but I have a quick
>>> question: if I run *mpiexec* with "*-mca btl tcp,self*", do I get the
>>> benefit of *RoCE* (the fastest speed)?
>>>
>>> I'll go over the details of all the replies and post useful feedback.
>>>
>>> Thanks very much all!
>>>
>>> Best,
>>>
>>> --Boris
>>>
>>>
>>>
>>>
>>> On Mon, Jul 17, 2017 at 6:31 AM, Russell Dekema <deke...@umich.edu> wrote:
>>>
>>>     It looks like you have two dual-port Mellanox VPI cards in this
>>>     machine. These cards can be set to run InfiniBand or Ethernet on a
>>>     port-by-port basis, and all four of your ports are set to Ethernet
>>>     mode. Two of your ports have active 100 gigabit Ethernet links, and
>>>     the other two have no link up at all.
>>>
>>>     With no InfiniBand links on the machine, you will, of course, not be
>>>     able to run your OpenMPI job over InfiniBand.
>>>
>>>     If your machines and network are set up for it, you might be able to
>>>     run your job over RoCE (RDMA over Converged Ethernet) using one or
>>>     both of those 100 GbE links. I have never used RoCE myself, but one
>>>     starting point for gathering more information on it might be the
>>>     following section of the OpenMPI FAQ:
>>>
>>>     https://www.open-mpi.org/faq/?category=openfabrics#ompi-over-roce
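>>>
>>>     For what it's worth, that FAQ entry describes using the openib BTL
>>>     together with the RDMA CM connection manager for RoCE. I have not
>>>     tried this myself, so treat the line below only as a sketch (the
>>>     hostfile and process count are taken from your earlier command,
>>>     and vader is the shared-memory BTL Gilles mentioned):
>>>
>>>         mpirun --mca btl openib,vader,self \
>>>                --mca btl_openib_cpc_include rdmacm \
>>>                --hostfile hostfile5 -n 200 DoWork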
>>>
>>>     Sincerely,
>>>     Rusty Dekema
>>>     University of Michigan
>>>     Advanced Research Computing - Technology Services
>>>
>>>
>>>     On Fri, Jul 14, 2017 at 12:34 PM, Boris M. Vulovic
>>>     <boris.m.vulo...@gmail.com> wrote:
>>>      > Gus, Gilles and John,
>>>      >
>>>      > Thanks for the help. Let me first post (below) the output from
>>>      > checkouts of the IB network:
>>>      > ibdiagnet
>>>      > ibhosts
>>>      > ibstat  (for login node, for now)
>>>      >
>>>      > What do you think?
>>>      > Thanks
>>>      > --Boris
>>>      >
>>>      >
>>>      >
>>>      > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>      >
>>>      > -bash-4.1$ ibdiagnet
>>>      > ----------
>>>      > Load Plugins from:
>>>      > /usr/share/ibdiagnet2.1.1/plugins/
>>>      > (You can specify more paths to be looked in with
>>>      > "IBDIAGNET_PLUGINS_PATH" env variable)
>>>      >
>>>      > Plugin Name                                   Result     Comment
>>>      > libibdiagnet_cable_diag_plugin-2.1.1          Succeeded  Plugin loaded
>>>      > libibdiagnet_phy_diag_plugin-2.1.1            Succeeded  Plugin loaded
>>>      >
>>>      > ---------------------------------------------
>>>      > Discovery
>>>      > -E- Failed to initialize
>>>      >
>>>      > -E- Fabric Discover failed, err=IBDiag initialize wasn't done
>>>      > -E- Fabric Discover failed, MAD err=Failed to register SMI class
>>>      >
>>>      > ---------------------------------------------
>>>      > Summary
>>>      > -I- Stage                     Warnings   Errors     Comment
>>>      > -I- Discovery                                       NA
>>>      > -I- Lids Check                                      NA
>>>      > -I- Links Check                                     NA
>>>      > -I- Subnet Manager                                  NA
>>>      > -I- Port Counters                                   NA
>>>      > -I- Nodes Information                               NA
>>>      > -I- Speed / Width checks                            NA
>>>      > -I- Partition Keys                                  NA
>>>      > -I- Alias GUIDs                                     NA
>>>      > -I- Temperature Sensing                             NA
>>>      >
>>>      > -I- You can find detailed errors/warnings in:
>>>      > /var/tmp/ibdiagnet2/ibdiagnet2.log
>>>      >
>>>      > -E- A fatal error occurred, exiting...
>>>      > -bash-4.1$
>>>      >
>>>      > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>      >
>>>      > -bash-4.1$ ibhosts
>>>      > ibwarn: [168221] mad_rpc_open_port: client_register for mgmt 1 failed
>>>      > src/ibnetdisc.c:766; can't open MAD port ((null):0)
>>>      > /usr/sbin/ibnetdiscover: iberror: failed: discover failed
>>>      > -bash-4.1$
>>>      >
>>>      >
>>>      > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>      > -bash-4.1$ ibstat
>>>      > CA 'mlx5_0'
>>>      >         CA type: MT4115
>>>      >         Number of ports: 1
>>>      >         Firmware version: 12.17.2020
>>>      >         Hardware version: 0
>>>      >         Node GUID: 0x248a0703005abb1c
>>>      >         System image GUID: 0x248a0703005abb1c
>>>      >         Port 1:
>>>      >                 State: Active
>>>      >                 Physical state: LinkUp
>>>      >                 Rate: 100
>>>      >                 Base lid: 0
>>>      >                 LMC: 0
>>>      >                 SM lid: 0
>>>      >                 Capability mask: 0x3c010000
>>>      >                 Port GUID: 0x268a07fffe5abb1c
>>>      >                 Link layer: Ethernet
>>>      > CA 'mlx5_1'
>>>      >         CA type: MT4115
>>>      >         Number of ports: 1
>>>      >         Firmware version: 12.17.2020
>>>      >         Hardware version: 0
>>>      >         Node GUID: 0x248a0703005abb1d
>>>      >         System image GUID: 0x248a0703005abb1c
>>>      >         Port 1:
>>>      >                 State: Active
>>>      >                 Physical state: LinkUp
>>>      >                 Rate: 100
>>>      >                 Base lid: 0
>>>      >                 LMC: 0
>>>      >                 SM lid: 0
>>>      >                 Capability mask: 0x3c010000
>>>      >                 Port GUID: 0x0000000000000000
>>>      >                 Link layer: Ethernet
>>>      > CA 'mlx5_2'
>>>      >         CA type: MT4115
>>>      >         Number of ports: 1
>>>      >         Firmware version: 12.17.2020
>>>      >         Hardware version: 0
>>>      >         Node GUID: 0x248a0703005abb30
>>>      >         System image GUID: 0x248a0703005abb30
>>>      >         Port 1:
>>>      >                 State: Down
>>>      >                 Physical state: Disabled
>>>      >                 Rate: 100
>>>      >                 Base lid: 0
>>>      >                 LMC: 0
>>>      >                 SM lid: 0
>>>      >                 Capability mask: 0x3c010000
>>>      >                 Port GUID: 0x268a07fffe5abb30
>>>      >                 Link layer: Ethernet
>>>      > CA 'mlx5_3'
>>>      >         CA type: MT4115
>>>      >         Number of ports: 1
>>>      >         Firmware version: 12.17.2020
>>>      >         Hardware version: 0
>>>      >         Node GUID: 0x248a0703005abb31
>>>      >         System image GUID: 0x248a0703005abb30
>>>      >         Port 1:
>>>      >                 State: Down
>>>      >                 Physical state: Disabled
>>>      >                 Rate: 100
>>>      >                 Base lid: 0
>>>      >                 LMC: 0
>>>      >                 SM lid: 0
>>>      >                 Capability mask: 0x3c010000
>>>      >                 Port GUID: 0x268a07fffe5abb31
>>>      >                 Link layer: Ethernet
>>>      > -bash-4.1$
>>>      >
>>>      > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>      >
>>>      > On Fri, Jul 14, 2017 at 12:37 AM, John Hearns via users
>>>      > <users@lists.open-mpi.org> wrote:
>>>      >>
>>>      >> Boris, as Gilles says - first do some lower-level checkouts of
>>>      >> your InfiniBand network.
>>>      >> I suggest running:
>>>      >> ibdiagnet
>>>      >> ibhosts
>>>      >> and then as Gilles says 'ibstat' on each node
>>>      >>
>>>      >>
>>>      >>
>>>      >> On 14 July 2017 at 03:58, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>>      >>>
>>>      >>> Boris,
>>>      >>>
>>>      >>>
>>>      >>> Open MPI should automatically detect the InfiniBand hardware
>>>      >>> and use openib (and *not* tcp) for inter-node communications,
>>>      >>> and a shared-memory-optimized btl (e.g. sm or vader) for
>>>      >>> intra-node communications.
>>>      >>>
>>>      >>>
>>>      >>> Note that if you pass "-mca btl openib,self", you tell Open MPI
>>>      >>> to use the openib btl between all tasks, including tasks running
>>>      >>> on the same node (which is less efficient than using sm or
>>>      >>> vader).
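>>>      >>>
>>>      >>> In other words, something like this (just a sketch, reusing the
>>>      >>> hostfile and process count from your command) lets openib handle
>>>      >>> inter-node traffic and vader handle intra-node traffic:
>>>      >>>
>>>      >>>     mpirun --mca btl openib,vader,self \
>>>      >>>            --hostfile hostfile5 -n 200 DoWork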
>>>      >>>
>>>      >>>
>>>      >>> First, I suggest you make sure InfiniBand is up and running on
>>>      >>> all your nodes.
>>>      >>>
>>>      >>> (Just run ibstat: at least one port should be listed, its state
>>>      >>> should be Active, and all nodes should have the same SM lid.)
>>>      >>>
>>>      >>>
>>>      >>> Then try to run two tasks on two nodes.
>>>      >>>
>>>      >>>
>>>      >>> If this does not work, you can run
>>>      >>>
>>>      >>> mpirun --mca btl_base_verbose 100 ...
>>>      >>>
>>>      >>> and post the logs so we can investigate from there.
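>>>      >>>
>>>      >>> For instance (two nodes, two tasks, host names taken from your
>>>      >>> earlier command):
>>>      >>>
>>>      >>>     mpirun --mca btl openib,vader,self \
>>>      >>>            --mca btl_base_verbose 100 \
>>>      >>>            -host node01,node02 -n 2 DoWork 2>&1 | tee btl_verbose.log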
>>>      >>>
>>>      >>>
>>>      >>> Cheers,
>>>      >>>
>>>      >>>
>>>      >>> Gilles
>>>      >>>
>>>      >>>
>>>      >>>
>>>      >>> On 7/14/2017 6:43 AM, Boris M. Vulovic wrote:
>>>      >>>>
>>>      >>>>
>>>      >>>> I would like to know how to invoke the InfiniBand hardware on a
>>>      >>>> CentOS 6.x cluster with Open MPI (static libs) when running my
>>>      >>>> C++ code. This is how I compile and run:
>>>      >>>>
>>>      >>>> /usr/local/open-mpi/1.10.7/bin/mpic++ \
>>>      >>>>     -L/usr/local/open-mpi/1.10.7/lib -Bstatic main.cpp -o DoWork
>>>      >>>>
>>>      >>>> /usr/local/open-mpi/1.10.7/bin/mpiexec -mca btl tcp,self \
>>>      >>>>     --hostfile hostfile5 \
>>>      >>>>     -host node01,node02,node03,node04,node05 -n 200 DoWork
>>>      >>>>
>>>      >>>> Here, "*-mca btl tcp,self*" means that *TCP* is used, even
>>>      >>>> though the cluster has InfiniBand.
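>>>      >>>>
>>>      >>>> For completeness, main.cpp can be any MPI program; a minimal
>>>      >>>> stand-in (not my actual code, just enough to reproduce the run)
>>>      >>>> would be:
>>>      >>>>
>>>      >>>>     #include <mpi.h>
>>>      >>>>     #include <cstdio>
>>>      >>>>
>>>      >>>>     int main(int argc, char **argv)
>>>      >>>>     {
>>>      >>>>         MPI_Init(&argc, &argv);
>>>      >>>>
>>>      >>>>         int rank = 0, size = 0;
>>>      >>>>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>      >>>>         MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>      >>>>
>>>      >>>>         // Report where each rank runs, to confirm that the
>>>      >>>>         // processes really land on all five nodes.
>>>      >>>>         char name[MPI_MAX_PROCESSOR_NAME];
>>>      >>>>         int len = 0;
>>>      >>>>         MPI_Get_processor_name(name, &len);
>>>      >>>>         std::printf("rank %d of %d on %s\n", rank, size, name);
>>>      >>>>
>>>      >>>>         MPI_Finalize();
>>>      >>>>         return 0;
>>>      >>>>     }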
>>>      >>>>
>>>      >>>> What should be changed in the compile and run commands for
>>>      >>>> InfiniBand to be used? If I just replace "*-mca btl tcp,self*"
>>>      >>>> with "*-mca btl openib,self*", I get plenty of errors, with the
>>>      >>>> most relevant one saying:
>>>      >>>>
>>>      >>>> /At least one pair of MPI processes are unable to reach each
>>>      >>>> other for MPI communications. This means that no Open MPI device
>>>      >>>> has indicated that it can be used to communicate between these
>>>      >>>> processes. This is an error; Open MPI requires that all MPI
>>>      >>>> processes be able to reach each other. This error can sometimes
>>>      >>>> be the result of forgetting to specify the "self" BTL./
>>>      >>>>
>>>      >>>> Thanks very much!!!
>>>      >>>>
>>>      >>>>
>>>      >>>> *Boris*
>>>      >>>>
>>>      >>>>
>>>      >>>>
>>>      >>>>



-- 

*Boris M. Vulovic*