On Wed, Sep 21, 2016 at 9:36 AM, Gilles Gouaillardet < gilles.gouaillar...@gmail.com> wrote:
> if i want to exclude ib0, i might want to
>
>     mpirun --mca btl_tcp_if_exclude ib0 ...
>
> to me, this is an honest mistake, but with your proposal, i would be
> screwed when running on more than one node because i should have
>
>     mpirun --mca btl_tcp_if_exclude ib0,lo ...

My view on this particular honest mistake is that it feels a lot like
failing to include "self" in the btl list. To the best of my knowledge
there is no "safety net" for that user mistake. Instead, there is
documentation in README:

  - If specified, the "btl" parameter must include the "self"
    component, or Open MPI will not be able to deliver messages to the
    same rank as the sender. For example: "mpirun --mca btl tcp,self ..."

So, one could/should do the same for btl_tcp_if_exclude.

BUT IT IS ALREADY IN THE README TODAY! Immediately following the
warning above regarding "self" is the following text:

  - If specified, the "btl_tcp_if_exclude" parameter must include the
    loopback device ("lo" on many Linux platforms), or Open MPI will
    not be able to route MPI messages using the TCP BTL. For example:
    "mpirun --mca btl_tcp_if_exclude lo,eth1 ..."

So, in short, there is *already* documentation that tells the user
*not* to do what Gilles is worried about.

-Paul

--
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
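
For illustration only, the README rule for btl_tcp_if_exclude could be applied mechanically on the user's side with a small shell wrapper. This is a rough sketch, not anything Open MPI provides: "ensure_loopback" is a hypothetical helper name, and it assumes the loopback device is called "lo" (true on many Linux platforms, not everywhere) unless a second argument says otherwise.

```shell
# Hypothetical helper, NOT part of Open MPI: return the given
# comma-separated interface list with the loopback device added
# if it is missing.
# Assumption: the loopback device is named "lo" unless $2 overrides it.
ensure_loopback() {
    list="$1"
    lo_if="${2:-lo}"
    case ",$list," in
        *",$lo_if,"*) printf '%s\n' "$list" ;;         # already listed
        ",,")         printf '%s\n' "$lo_if" ;;        # empty list
        *)            printf '%s\n' "$list,$lo_if" ;;  # append loopback
    esac
}

# Possible usage (left commented out, since the rest of the command
# line is elided in the thread):
#   mpirun --mca btl_tcp_if_exclude "$(ensure_loopback ib0)" ...
```

With this sketch, "ensure_loopback ib0" yields "ib0,lo", while a list that already names the loopback device, such as "lo,eth1", is passed through unchanged.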