Keep in mind that -- in general -- Open MPI has two different kinds of traffic:

1. "Out of band" (OOB) traffic, used for launching/monitoring/killing the job
2. MPI traffic

The OOB traffic generally uses TCP, and can be over whatever network you want.

You generally want the MPI traffic to use your high-performance network, but 
you can tell Open MPI to use whichever network interface(s) you want.
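
As a rough illustration of the interface selection described above, here is a 
minimal sketch that builds an mpirun command line pinning both kinds of TCP 
traffic to one interface. The interface name (eth1), process count, and 
application path (./my_mpi_app) are placeholders, not values from this thread. 
Note that btl_tcp_if_include only affects the TCP BTL; RoCE/InfiniBand traffic 
is handled by a different transport.

```shell
#!/bin/sh
# Sketch only: print an mpirun command that restricts OOB and TCP BTL
# traffic to a single interface. All argument values are hypothetical.
build_mpirun_cmd() {
    iface="$1"   # interface to use, e.g. the internal network port
    np="$2"      # number of MPI processes
    app="$3"     # path to the MPI application
    # oob_tcp_if_include: out-of-band (launch/monitor/kill) traffic
    # btl_tcp_if_include: MPI traffic carried over TCP
    echo "mpirun -np $np --mca oob_tcp_if_include $iface --mca btl_tcp_if_include $iface $app"
}

# Print the command for inspection; run it by hand once it looks right.
build_mpirun_cmd "${IFACE:-eth1}" "${NP:-2}" "${APP:-./my_mpi_app}"
```

Printing the command rather than executing it makes it easy to check the 
parameter names before launching a job.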

Check out this FAQ item: 
https://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems
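
On the firewalld side, one option raised in the quoted thread below is to 
trust all TCP connections between the cluster nodes only. A minimal sketch of 
what that might look like with firewall-cmd follows; the subnet 
192.168.1.0/24 is an assumed placeholder for your internal cluster network, 
and the commands are only printed here, since they need root and should be 
reviewed first.

```shell
#!/bin/sh
# Sketch only: print the firewall-cmd calls that would trust traffic from
# a cluster subnet. The subnet value is a hypothetical placeholder.
trust_subnet_cmds() {
    subnet="$1"
    # Add the subnet as a source of the built-in "trusted" zone (persistent).
    echo "firewall-cmd --permanent --zone=trusted --add-source=$subnet"
    # Reload so the permanent rule takes effect.
    echo "firewall-cmd --reload"
}

trust_subnet_cmds "${SUBNET:-192.168.1.0/24}"
```

Unlike disabling the firewall, this leaves every other source subject to the 
normal rules of the default zone.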





> On May 11, 2016, at 12:33 PM, Llolsten Kaonga <l...@soft-forge.com> wrote:
> 
> Hello Gilles,
> 
> Sorry my last message was a bit garbled. I edited the original and hit the 
> send button while distracted.
> 
> We actually use one internal port and an external port for the RoCE traffic.
> 
> I thank you.
> --
> Llolsten
> 
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles 
> Gouaillardet
> Sent: Wednesday, May 11, 2016 11:03 AM
> To: Open MPI Users <us...@open-mpi.org>
> Subject: Re: [OMPI users] mpirun command won't run unless the firewalld 
> daemon is disabled
> 
> I am not sure I understand your last message.
> 
> if MPI only needs the internal port, and there is no firewall protecting this 
> port, then simply tell ompi to use it and only it
> mpirun --mca oob_tcp_if_include ethxx --mca btl_tcp_if_include ethxx ...
> 
> otherwise, it should work, but only after some internal timeouts expire 
> (because the firewall drops packets on the external port), and that 
> can take a while
> 
> Cheers,
> 
> Gilles
> 
> On Wednesday, May 11, 2016, Llolsten Kaonga <l...@soft-forge.com> wrote:
>> Hello Gilles/Jeff,
>> 
>> Thank you for clarifying this.
>> 
>> We have three ports but the RoCE traffic is supposed to use one of the 
>> internal ports. However, we do allow use of one of the external ports which 
>> we assign a static address.
>> 
>> I thank you.
>> --
>> Llolsten
>> 
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles 
>> Gouaillardet
>> Sent: Tuesday, May 10, 2016 5:06 PM
>> To: Open MPI Users <us...@open-mpi.org>
>> Subject: Re: [OMPI users] mpirun command won't run unless the firewalld 
>> daemon is disabled
>> 
>> I was basically suggesting you open a few ports to anyone (e.g. any IP 
>> address), and Jeff suggests you open all ports to a few trusted IP addresses.
>> 
>> btw, how many network ports do you have?
>> if you have two ports (e.g. eth0 for external access and eth1 for private 
>> network) and MPI should only use the internal network, then you can allow 
>> all traffic on the internal port, and
>> mpirun --mca oob_tcp_if_include eth1 --mca btl_tcp_if_include eth1 ...
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Wednesday, May 11, 2016, Llolsten Kaonga <l...@soft-forge.com> wrote:
>>> Hello Jeff,
>>> 
>>> I think what you suggest is likely exactly what we want to see happen. We
>>> run the interop tests with at least two servers, sometimes more. We also
>>> have other devices (InfiniBand or RoCE switches) between the servers.
>>> 
>>> I will have to ask a stupid question here but when you suggest that we open
>>> the firewall to trust random TCP connections, how is that different from
>>> disabling it? Is there some configuration besides the suggestion by Gilles
>>> to specify ports or a range of ports?
>>> 
>>> I thank you.
>>> --
>>> Llolsten
>>> 
>>> -----Original Message-----
>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
>>> (jsquyres)
>>> Sent: Tuesday, May 10, 2016 3:47 PM
>>> To: Open MPI User's List <us...@open-mpi.org>
>>> Subject: Re: [OMPI users] mpirun command won't run unless the firewalld
>>> daemon is disabled
>>> 
>>> Open MPI generally needs to be able to communicate on random TCP ports
>>> between machines in the MPI job (and the machine where mpirun is invoked, if
>>> that is a different machine).
>>> 
>>> You could also open your firewall to trust random TCP connections just
>>> between the servers in your cluster.
>>> 
>>> 
>>> 
>>>> On May 10, 2016, at 3:44 PM, Llolsten Kaonga <l...@soft-forge.com> wrote:
>>>> 
>>>> Hello Orion,
>>>> 
>>>> I actually rather like the new CentOS 7.2 system better and would like
>>>> to not remove firewalld. We will try Gilles' suggestion and see what
>>>> happens.
>>>> 
>>>> I thank you.
>>>> --
>>>> Llolsten
>>>> 
>>>> -----Original Message-----
>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Orion
>>>> Poplawski
>>>> Sent: Tuesday, May 10, 2016 3:31 PM
>>>> To: Open MPI Users <us...@open-mpi.org>
>>>> Subject: Re: [OMPI users] mpirun command won't run unless the
>>>> firewalld daemon is disabled
>>>> 
>>>> On 05/10/2016 09:24 AM, Llolsten Kaonga wrote:
>>>>> Hello Durga,
>>>>> 
>>>>> As I mentioned earlier, up to version 1.8.2, we would just disable
>>>>> SELinux and the IPv4 firewall and things ran smoothly. It was only
>>>>> when we installed version 1.10.2 (CentOS 7.2) that we ran into these
>>>>> troubles. CentOS 7.2 no longer seems to bother with the IPv4
>>>>> firewall, so you can't do:
>>>>> 
>>>>> 
>>>>> 
>>>>> # service iptables save
>>>>> 
>>>>> # service iptables stop
>>>>> 
>>>>> # chkconfig iptables off
>>>> 
>>>> I'll just note that you can either embrace the new firewalld config
>>>> (and use firewall-cmd to open your needed ports) or you can remove
>>>> firewalld and install iptables-services and go back to the old
>>>> iptables method of configuring the firewall.  If you don't want a
>>>> firewall at all, just remove firewalld.
>>>> 
>>>> --
>>>> Orion Poplawski
>>>> Technical Manager                     303-415-9701 x222
>>>> NWRA, Boulder/CoRA Office             FAX: 303-415-9702
>>>> 3380 Mitchell Lane                       or...@nwra.com
>>>> Boulder, CO 80301                   http://www.nwra.com
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/users/2016/05/29160.php
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
