Hello Brendan,

This helps some, but it looks like we need more debug output.

Could you build a debug version of Open MPI by adding --enable-debug
to the configure options, then rerun the test with the breakout cable
setup while keeping the --mca btl_base_verbose 100 command line option?
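
For reference, the two steps might look roughly like the sketch below. It
assumes Open MPI is built from a source tree and installed under a
hypothetical prefix ($HOME/ompi-debug), and the make parallelism is
arbitrary. The mpirun options other than btl_base_verbose are copied from
the command quoted further down in this thread. The script only prints the
commands so they can be reviewed and adjusted before being run by hand.

```shell
#!/bin/sh
# Sketch only: PREFIX is a hypothetical install location, not from the thread.
PREFIX="${PREFIX:-$HOME/ompi-debug}"

# Step 1: configure, build, and install a debug-enabled Open MPI.
# Intended to be run from the top of the Open MPI source tree.
BUILD_CMD="./configure --prefix=$PREFIX --enable-debug && make -j4 && make install"

# Step 2: rerun the original job with the debug build and verbose BTL output.
RUN_CMD="$PREFIX/bin/mpirun --mca btl openib,self,sm \
--mca btl_openib_receive_queues P,65536,120,64,32 \
--mca btl_openib_cpc_include rdmacm \
--mca btl_base_verbose 100 \
-hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1"

# Print both commands instead of executing them.
echo "$BUILD_CMD"
echo "$RUN_CMD"
```

Redirecting the mpirun output to a file makes it easier to attach to a
reply on the list.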

Thanks

Howard


2017-01-23 8:23 GMT-07:00 Brendan Myers <brendan.my...@soft-forge.com>:

> Hello Howard,
>
> Thank you for looking into this. Attached is the output you requested.
> Also, I am using Open MPI 2.0.1.
>
>
>
> Thank you,
>
> Brendan
>
>
>
> *From:* users [mailto:users-boun...@lists.open-mpi.org] *On Behalf Of *Howard
> Pritchard
> *Sent:* Friday, January 20, 2017 6:35 PM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Subject:* Re: [OMPI users] Open MPI over RoCE using breakout cable and
> switch
>
>
>
> Hi Brendan
>
>
>
> I doubt this kind of config has gotten any testing with OMPI.  Could you
> rerun with
>
>
>
> --mca btl_base_verbose 100
>
>
>
> added to the command line and post the output to the list?
>
>
>
> Howard
>
>
>
>
>
> Brendan Myers <brendan.my...@soft-forge.com> wrote on Fri, Jan 20, 2017
> at 15:04:
>
> Hello,
>
> I am attempting to get Open MPI to run over 2 nodes using a switch and a
> single breakout cable with this design:
>
> (100GbE)QSFP <--> 2x (50GbE)QSFP
>
>
>
> Hardware Layout:
>
> Breakout cable module A connects to switch (100GbE)
>
> Breakout cable module B1 connects to node 1 RoCE NIC (50GbE)
>
> Breakout cable module B2 connects to node 2 RoCE NIC (50GbE)
>
> Switch is a Mellanox SN2700 100GbE RoCE switch
>
>
>
> - I am able to pass RDMA traffic between the nodes with perftest
> (ib_write_bw) when using the breakout cable as the interconnect (IC)
> from both nodes to the switch.
>
> - When attempting to run a job using the breakout cable as the IC,
> Open MPI aborts with errors about failing to initialize an OpenFabrics
> device.
>
> - If I replace the breakout cable with 2 standard QSFP cables, the
> Open MPI job completes correctly.
>
>
>
>
>
> This is the command I use; it works unless I attempt a run with the
> breakout cable used as the IC:
>
> *mpirun --mca btl openib,self,sm --mca btl_openib_receive_queues
> P,65536,120,64,32 --mca btl_openib_cpc_include rdmacm  -hostfile
> mpi-hosts-ce /usr/local/bin/IMB-MPI1*
>
>
>
> If anyone has any idea why using a breakout cable is causing my jobs
> to fail, please let me know.
>
>
>
> Thank you,
>
>
>
> Brendan T. W. Myers
>
> brendan.my...@soft-forge.com
>
> Software Forge Inc
>
>
>
> _______________________________________________
>
> users mailing list
>
> users@lists.open-mpi.org
>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>