Thanks, Gus. I'll try that and post the results. I'm a newbie at this and appreciate any advice very much.
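Just so I understand, the full command I'll try is something like this (a
sketch only -- the ib0,ib1 interface names are your guess, and I still need
to confirm the real names with our admin):

    /usr/local/open-mpi/1.10.7/bin/mpiexec --mca btl self,vader,tcp \
        --mca btl_tcp_if_include ib0,ib1 \
        --hostfile hostfile5 -host node01,node02,node03,node04,node05 \
        -n 200 DoWork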
Cheers,
--Boris

On Mon, Jul 17, 2017 at 10:09 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
> On 07/17/2017 01:06 PM, Gus Correa wrote:
>
>> Hi Boris
>>
>> The nodes may have standard Gigabit Ethernet interfaces,
>> besides the InfiniBand (RoCE).
>> You may want to direct Open MPI to use the InfiniBand interfaces,
>> not Gigabit Ethernet,
>> by adding something like this to "--mca btl self,vader,self":
>
> Oops! Typo:
> "--mca btl self,vader,tcp"
>
>> "--mca btl_tcp_if_include ib0,ib1"
>>
>> (The interface names ib0,ib1 are just my guess for
>> what your nodes may have. Check with your "root" system administrator!)
>>
>> That syntax may also take an IP address, or a subnet mask,
>> whichever is simpler for you.
>> It is better explained in this FAQ:
>>
>> https://www.open-mpi.org/faq/?category=all#tcp-selection
>>
>> BTW, some of your questions (and others that you may hit later)
>> are covered in the OpenMPI FAQ:
>>
>> https://www.open-mpi.org/faq/?category=all
>>
>> I hope this helps,
>> Gus Correa
>>
>> On 07/17/2017 12:43 PM, Boris M. Vulovic wrote:
>>
>>> Gus, Gilles, Russell, John:
>>>
>>> Thanks very much for the replies and the help.
>>> I got confirmation from the "root" that it is indeed RoCE at 100G.
>>>
>>> I'll go over the info in the link Russell provided, but have a quick
>>> question: if I run "mpiexec" with "-mca btl tcp,self", do I get the
>>> benefit of RoCE (the fastest speed)?
>>>
>>> I'll go over the details of all the replies and post useful feedback.
>>>
>>> Thanks very much, all!
>>>
>>> Best,
>>>
>>> --Boris
>>>
>>> On Mon, Jul 17, 2017 at 6:31 AM, Russell Dekema <deke...@umich.edu> wrote:
>>>
>>>> It looks like you have two dual-port Mellanox VPI cards in this
>>>> machine. These cards can be set to run InfiniBand or Ethernet on a
>>>> port-by-port basis, and all four of your ports are set to Ethernet
>>>> mode. Two of your ports have active 100-gigabit Ethernet links, and
>>>> the other two have no link up at all.
>>>>
>>>> With no InfiniBand links on the machine, you will, of course, not be
>>>> able to run your Open MPI job over InfiniBand.
>>>>
>>>> If your machines and network are set up for it, you might be able to
>>>> run your job over RoCE (RDMA over Converged Ethernet) using one or
>>>> both of those 100 GbE links. I have never used RoCE myself, but one
>>>> starting point for gathering more information on it might be the
>>>> following section of the OpenMPI FAQ:
>>>>
>>>> https://www.open-mpi.org/faq/?category=openfabrics#ompi-over-roce
>>>>
>>>> Sincerely,
>>>> Rusty Dekema
>>>> University of Michigan
>>>> Advanced Research Computing - Technology Services
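>>> P.S. After a first skim of that FAQ section, am I right that a RoCE run
>>> would look something like the sketch below? (My guess only -- the
>>> "rdmacm" connection manager is what the FAQ says RoCE needs, and I
>>> added vader for on-node traffic per Gilles; nothing tested yet.)
>>>
>>>     /usr/local/open-mpi/1.10.7/bin/mpiexec --mca btl openib,self,vader \
>>>         --mca btl_openib_cpc_include rdmacm \
>>>         --hostfile hostfile5 -n 200 DoWork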
>>>> On Fri, Jul 14, 2017 at 12:34 PM, Boris M. Vulovic <boris.m.vulo...@gmail.com> wrote:
>>>>
>>>>> Gus, Gilles and John,
>>>>>
>>>>> Thanks for the help. Let me first post (below) the output from checks
>>>>> of the IB network:
>>>>> ibdiagnet
>>>>> ibhosts
>>>>> ibstat (for the login node, for now)
>>>>>
>>>>> What do you think?
>>>>> Thanks
>>>>> --Boris
>>>>>
>>>>> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>>>
>>>>> -bash-4.1$ ibdiagnet
>>>>> ----------
>>>>> Load Plugins from:
>>>>> /usr/share/ibdiagnet2.1.1/plugins/
>>>>> (You can specify more paths to be looked in with
>>>>> "IBDIAGNET_PLUGINS_PATH" env variable)
>>>>>
>>>>> Plugin Name                           Result     Comment
>>>>> libibdiagnet_cable_diag_plugin-2.1.1  Succeeded  Plugin loaded
>>>>> libibdiagnet_phy_diag_plugin-2.1.1    Succeeded  Plugin loaded
>>>>>
>>>>> ---------------------------------------------
>>>>> Discovery
>>>>> -E- Failed to initialize
>>>>> -E- Fabric Discover failed, err=IBDiag initialize wasn't done
>>>>> -E- Fabric Discover failed, MAD err=Failed to register SMI class
>>>>>
>>>>> ---------------------------------------------
>>>>> Summary
>>>>> -I- Stage                     Warnings   Errors    Comment
>>>>> -I- Discovery                                      NA
>>>>> -I- Lids Check                                     NA
>>>>> -I- Links Check                                    NA
>>>>> -I- Subnet Manager                                 NA
>>>>> -I- Port Counters                                  NA
>>>>> -I- Nodes Information                              NA
>>>>> -I- Speed / Width checks                           NA
>>>>> -I- Partition Keys                                 NA
>>>>> -I- Alias GUIDs                                    NA
>>>>> -I- Temperature Sensing                            NA
>>>>>
>>>>> -I- You can find detailed errors/warnings in:
>>>>> /var/tmp/ibdiagnet2/ibdiagnet2.log
>>>>>
>>>>> -E- A fatal error occurred, exiting...
>>>>> -bash-4.1$
>>>>>
>>>>> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>>>
>>>>> -bash-4.1$ ibhosts
>>>>> ibwarn: [168221] mad_rpc_open_port: client_register for mgmt 1 failed
>>>>> src/ibnetdisc.c:766; can't open MAD port ((null):0)
>>>>> /usr/sbin/ibnetdiscover: iberror: failed: discover failed
>>>>> -bash-4.1$
>>>>>
>>>>> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>>>
>>>>> -bash-4.1$ ibstat
>>>>> CA 'mlx5_0'
>>>>>     CA type: MT4115
>>>>>     Number of ports: 1
>>>>>     Firmware version: 12.17.2020
>>>>>     Hardware version: 0
>>>>>     Node GUID: 0x248a0703005abb1c
>>>>>     System image GUID: 0x248a0703005abb1c
>>>>>     Port 1:
>>>>>         State: Active
>>>>>         Physical state: LinkUp
>>>>>         Rate: 100
>>>>>         Base lid: 0
>>>>>         LMC: 0
>>>>>         SM lid: 0
>>>>>         Capability mask: 0x3c010000
>>>>>         Port GUID: 0x268a07fffe5abb1c
>>>>>         Link layer: Ethernet
>>>>> CA 'mlx5_1'
>>>>>     CA type: MT4115
>>>>>     Number of ports: 1
>>>>>     Firmware version: 12.17.2020
>>>>>     Hardware version: 0
>>>>>     Node GUID: 0x248a0703005abb1d
>>>>>     System image GUID: 0x248a0703005abb1c
>>>>>     Port 1:
>>>>>         State: Active
>>>>>         Physical state: LinkUp
>>>>>         Rate: 100
>>>>>         Base lid: 0
>>>>>         LMC: 0
>>>>>         SM lid: 0
>>>>>         Capability mask: 0x3c010000
>>>>>         Port GUID: 0x0000000000000000
>>>>>         Link layer: Ethernet
>>>>> CA 'mlx5_2'
>>>>>     CA type: MT4115
>>>>>     Number of ports: 1
>>>>>     Firmware version: 12.17.2020
>>>>>     Hardware version: 0
>>>>>     Node GUID: 0x248a0703005abb30
>>>>>     System image GUID: 0x248a0703005abb30
>>>>>     Port 1:
>>>>>         State: Down
>>>>>         Physical state: Disabled
>>>>>         Rate: 100
>>>>>         Base lid: 0
>>>>>         LMC: 0
>>>>>         SM lid: 0
>>>>>         Capability mask: 0x3c010000
>>>>>         Port GUID: 0x268a07fffe5abb30
>>>>>         Link layer: Ethernet
>>>>> CA 'mlx5_3'
>>>>>     CA type: MT4115
>>>>>     Number of ports: 1
>>>>>     Firmware version: 12.17.2020
>>>>>     Hardware version: 0
>>>>>     Node GUID: 0x248a0703005abb31
>>>>>     System image GUID: 0x248a0703005abb30
>>>>>     Port 1:
>>>>>         State: Down
>>>>>         Physical state: Disabled
>>>>>         Rate: 100
>>>>>         Base lid: 0
>>>>>         LMC: 0
>>>>>         SM lid: 0
>>>>>         Capability mask: 0x3c010000
>>>>>         Port GUID: 0x268a07fffe5abb31
>>>>>         Link layer: Ethernet
>>>>> -bash-4.1$
>>>>>
>>>>> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>>>
>>>>> On Fri, Jul 14, 2017 at 12:37 AM, John Hearns via users <users@lists.open-mpi.org> wrote:
>>>>>
>>>>>> Boris, as Gilles says - first do some lower-level checks of your
>>>>>> InfiniBand network.
>>>>>> I suggest running:
>>>>>> ibdiagnet
>>>>>> ibhosts
>>>>>> and then, as Gilles says, 'ibstat' on each node.
>>>>>>
>>>>>> On 14 July 2017 at 03:58, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>>>>>
>>>>>>> Boris,
>>>>>>>
>>>>>>> Open MPI should automatically detect the InfiniBand hardware, and use
>>>>>>> openib (and *not* tcp) for inter-node communications,
>>>>>>> and a shared-memory-optimized btl (e.g. sm or vader) for intra-node
>>>>>>> communications.
>>>>>>>
>>>>>>> Note that if you use "-mca btl openib,self", you tell Open MPI to use
>>>>>>> the openib btl between any tasks,
>>>>>>> including tasks running on the same node (which is less efficient
>>>>>>> than using sm or vader).
>>>>>>>
>>>>>>> At first, I suggest you make sure InfiniBand is up and running on all
>>>>>>> your nodes.
>>>>>>> (Just run ibstat: at least one port should be listed, its state
>>>>>>> should be Active, and all nodes should have the same SM lid.)
>>>>>>>
>>>>>>> Then try to run two tasks on two nodes.
>>>>>>>
>>>>>>> If this does not work, you can
>>>>>>>
>>>>>>> mpirun --mca btl_base_verbose 100 ...
>>>>>>>
>>>>>>> and post the logs so we can investigate from there.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Gilles
>>>>>>>
>>>>>>> On 7/14/2017 6:43 AM, Boris M. Vulovic wrote:
>>>>>>>
>>>>>>>> I would like to know how to invoke InfiniBand hardware on a CentOS
>>>>>>>> 6.x cluster with OpenMPI (static libs.) for running my C++ code.
>>>>>>>> This is how I compile and run:
>>>>>>>>
>>>>>>>> /usr/local/open-mpi/1.10.7/bin/mpic++ -L/usr/local/open-mpi/1.10.7/lib
>>>>>>>> -Bstatic main.cpp -o DoWork
>>>>>>>>
>>>>>>>> /usr/local/open-mpi/1.10.7/bin/mpiexec -mca btl tcp,self --hostfile
>>>>>>>> hostfile5 -host node01,node02,node03,node04,node05 -n 200 DoWork
>>>>>>>>
>>>>>>>> Here, "-mca btl tcp,self" reveals that TCP is used, and the cluster
>>>>>>>> has InfiniBand.
>>>>>>>>
>>>>>>>> What should be changed in the compile and run commands for
>>>>>>>> InfiniBand to be invoked? If I just replace "-mca btl tcp,self" with
>>>>>>>> "-mca btl openib,self", then I get plenty of errors, with the
>>>>>>>> relevant one saying:
>>>>>>>>
>>>>>>>> At least one pair of MPI processes are unable to reach each other
>>>>>>>> for MPI communications. This means that no Open MPI device has
>>>>>>>> indicated that it can be used to communicate between these
>>>>>>>> processes. This is an error; Open MPI requires that all MPI
>>>>>>>> processes be able to reach each other. This error can sometimes be
>>>>>>>> the result of forgetting to specify the "self" BTL.
>>>>>>>>
>>>>>>>> Thanks very much!!!
>>>>>>>>
>>>>>>>> Boris
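P.S. Before rerunning the full 200-task job, I'll first try two tasks on two
nodes as Gilles suggested. A minimal test program for that might look like
the sketch below (my own untested draft, not something from the thread):

    // minimal_check.cpp: each rank reports its host name, and rank 1
    // sends one integer to rank 0 to confirm point-to-point traffic.
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, size = 0, len = 0;
        char host[MPI_MAX_PROCESSOR_NAME];
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &len);
        printf("rank %d of %d running on %s\n", rank, size, host);
        int token = 42;
        if (rank == 1) {
            MPI_Send(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else if (rank == 0 && size > 1) {
            MPI_Recv(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 0 received %d from rank 1: send/recv works\n",
                   token);
        }
        MPI_Finalize();
        return 0;
    }

compiled and launched the same way as my main code, with the verbose flag
Gilles mentioned so I can post the logs if it fails:

    /usr/local/open-mpi/1.10.7/bin/mpic++ minimal_check.cpp -o MinimalCheck
    /usr/local/open-mpi/1.10.7/bin/mpiexec --mca btl_base_verbose 100 \
        -host node01,node02 -n 2 MinimalCheck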
--
Boris M. Vulovic
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users