On Jun 19, 2007, at 2:24 PM, George Bosilca wrote:

While limiting the ports used by Open MPI might be a good idea, I'm skeptical about it, for at least 2 reasons:

1. I don't believe the OS releases the binding as soon as we close the socket. As an example, on Linux the kernel sockets are released at a later moment. That means the socket might still be in use for the next run.

So...? If you define a large enough range, it's not a big deal -- if you use port N for one run and then start another run right after the first one finishes, you'll use port N+1.

2. Multiple processes on the same node will try to bind the ports in the same order. Is this really safe?

Sure. Say that 2 processes try to bind to the same port simultaneously. If they can both succeed binding to the same port, I'd say that the kernel is pretty broken!
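
To make the two answers above concrete, here is a minimal Python sketch of walking a port range in order and taking the first port the kernel will hand out. This is not Open MPI's actual code: the function name bind_in_range, the loopback address, and the range 46000-46015 are made up for illustration. A second socket walking the same range in the same order just gets "address in use" on the port that is already taken and moves on to the next one.

```python
import errno
import socket


def bind_in_range(first_port, last_port, host="127.0.0.1"):
    """Bind a listening TCP socket to the first free port in the range.

    Hypothetical helper for illustration only. Returns (sock, port).
    """
    for port in range(first_port, last_port + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError as exc:
            sock.close()
            if exc.errno == errno.EADDRINUSE:
                continue  # port N is taken (or still held): try N+1
            raise
        sock.listen(1)
        return sock, port
    raise OSError(errno.EADDRINUSE, "no free port in the requested range")


if __name__ == "__main__":
    # Two "processes" walking the same range in the same order.
    first, port_a = bind_in_range(46000, 46015)
    second, port_b = bind_in_range(46000, 46015)
    print(port_a, port_b)  # two different ports from the range
    first.close()
    second.close()
```

The kernel is the arbiter here: only one bind() on a given address/port can succeed (as long as nobody sets address- or port-reuse options), so identical search order costs at most a few failed attempts.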

That being said, I am equally dubious about restricting to specific port ranges, but for different reasons:

1. If you're trying to go through firewalls, this isn't enough. You'll also need "external" IP addresses for each internal IP address. This alone is such a hassle that it really makes the concept not worth it (and no competent network/firewall admin would agree to do it ;-) ). Instead, you'd want a *single* punch-through in the firewall to communicate between processes in front of and behind the firewall, and then have some MPI-level routing to multiplex all relevant MPI communication through that single pinhole.

2. If your range is small enough and you execute lots and lots of short jobs on the same nodes, you could run out of available ports in the range while the kernel is shutting down the sockets from the previous runs.
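
A back-of-the-envelope way to see when point 2 bites, as a rough sketch only. The model is my own simplification: every job uses a fixed number of ports per node, and each port stays unavailable for a fixed time after the job exits. The numbers (100 ports, 4 ports per job, 60 seconds) are assumptions picked for illustration, not measurements, and whether a port really lingers depends on the socket state and options.

```python
def max_job_rate(range_size, ports_per_job, linger_seconds):
    """Highest sustained job rate (jobs per second) before the range runs dry.

    In steady state the number of unavailable ports is
    rate * ports_per_job * linger_seconds, which must stay <= range_size.
    """
    return range_size / (ports_per_job * linger_seconds)


if __name__ == "__main__":
    # Assumed numbers: 100-port range, 4 ports per job per node,
    # ports unavailable for 60 seconds after each job.
    jobs_per_minute = max_job_rate(100, 4, 60) * 60
    print(round(jobs_per_minute, 3))  # 25.0
```

Under those assumptions, anything beyond roughly 25 short jobs per minute on the same node would start failing to find a free port, which is why the range has to be sized against the job turnover rate.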

That being said, I *can* see at least one argument for wanting restricted TCP port ranges: using switches for traffic shaping or other kinds of QoS/filtering based on port range. For example, if you have a TCP-based HPC cluster, you might want to give priority to your MPI traffic. If the MPI traffic is guaranteed to be in a specific port range, you can do that.

This is why I asked about the network topology in my previous mail.

--
Jeff Squyres
Cisco Systems
