Michael wrote:
The primary difference seems to be that you have all communication going over a single interface.

Yes. The Open MPI FAQ clearly states that such a configuration is not supported:

These rules do /not/ cover the following cases:

   * Running an MPI job that spans a bunch of private networks with
     narrowly-scoped netmasks, such as nodes that have IP addresses
     192.168.1.10 and 192.168.2.10 with netmasks of 255.255.255.0
     (i.e., the network fabric makes these two nodes be routable to
     each other, even though the netmask implies that they are on
     different subnets).


This is exactly our case. Anyway, after a discussion with our administrators, we decided on a workaround: I run my program only on worker nodes in the 192.168.12.0 network, and I now have a direct route to these machines from my computer, outside the cluster's private network. In this configuration one of the worker nodes acts as the head, and the cluster's head node is not used at all.
That solved the problem.
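
For anyone else hitting the same FAQ limitation: depending on the Open MPI version, it may also be possible to force all TCP traffic onto a single subnet with MCA parameters instead of bypassing the head node. A rough sketch, untested in our setup (the subnet and ./my_program are just placeholders):

   # Restrict Open MPI's TCP byte-transfer layer (and, on versions that
   # support it, the out-of-band channel) to the 192.168.12.0/24 network.
   # Older releases accept interface names (e.g. eth1) here instead of
   # CIDR notation.
   mpirun --mca btl_tcp_if_include 192.168.12.0/24 \
          --mca oob_tcp_if_include 192.168.12.0/24 \
          -np 16 ./my_program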

Thank you for your support!

regards, Marcin Skoczylas
