Michael wrote:
The primary difference seems to be that you have all communication
going over a single interface.
Yes. It's clearly stated in the OpenMPI FAQ that such a configuration is
not supported:
These rules do /not/ cover the following cases:
* Running an MPI job that spans a bunch of private networks with
narrowly-scoped netmasks, such as nodes that have IP addresses
192.168.1.10 and 192.168.2.10 with netmasks of 255.255.255.0
(i.e., the network fabric makes these two nodes be routable to
each other, even though the netmask implies that they are on
different subnets).
This is exactly our case. Anyway, after a discussion with our
administrators, we decided to use a workaround: I run my program only on
the worker nodes from the 192.168.12.0 network, and I was given a direct
route to these machines from my computer, outside the cluster's private
network. So in this configuration, one of the worker nodes acts as the
head, and the cluster's head node is not used at all.
That solved the problem.
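For anyone hitting the same issue, the launch ends up looking roughly
like this (hostnames and slot counts below are illustrative, not our
actual setup):

    # hostfile listing only workers on the 192.168.12.0 network
    $ cat myhosts
    node12-01 slots=4
    node12-02 slots=4

    # restrict Open MPI's TCP traffic to that subnet
    # (recent Open MPI versions accept CIDR notation here; older ones
    #  expect interface names such as eth1 instead)
    $ mpirun --hostfile myhosts \
        --mca btl_tcp_if_include 192.168.12.0/24 \
        --mca oob_tcp_if_include 192.168.12.0/24 \
        ./my_program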
Thank you for your support!
regards, Marcin Skoczylas