On Wed, Oct 11, 2006 at 11:28:13PM +0200, Adrian Knoth wrote:

> The ringtest also works fine in plain IPv4 environments and
> mixed environments within the same cluster. It fails on
> mixed multi-cluster setups and heterogenous OSs, but I'm
> going to fix these issues on Saturday (or next week).

I've fixed it:

0: sending message (0) to 1
0: got message (3) from 3
[0,1,1][/home/racl/adi/ompi/trunk/src/ompi/mca/btl/tcp/btl_tcp_endpoint.c:194:mca_btl_tcp_endpoint_dump] accepted: 192.168.1.132 - 192.168.1.1 nodelay 1 sndbuf 262144 rcvbuf 262144 flags 00000802
3: got message (3) from 2, sending to 0
2: got message (2) from 1, sending to 3
[0,1,0][/home/racl/adi/ompi/trunk/src/ompi/mca/btl/tcp/btl_tcp_endpoint.c:194:mca_btl_tcp_endpoint_dump] connected: 192.168.1.1 - 192.168.1.132 nodelay 1 sndbuf 262144 rcvbuf 262144 flags 00000802
[0,1,0][/home/racl/adi/ompi/trunk/src/ompi/mca/btl/tcp/btl_tcp_endpoint.c:194:mca_btl_tcp_endpoint_dump] accepted: 141.35.14.189 - 141.35.13.178 nodelay 1 sndbuf 262144 rcvbuf 262144 flags 00000802
[0,1,1][/home/racl/adi/ompi/trunk/src/ompi/mca/btl/tcp/btl_tcp_endpoint.c:194:mca_btl_tcp_endpoint_dump] connected: 2001:638:906:1:20e:a6ff:fe3d:48d6 - 2001:638:906:2:213:d3ff:fec5:3480 nodelay 1 sndbuf 262144 rcvbuf 262144 flags 00000802
1: got message (1) from 0, sending to 2

This is a ringtest between two different Linux machines and two
Solaris hosts (both x86) in a mixed environment.

The two Linux nodes talk via RFC1918 (192.168.1.x) - the fastest
connection between them. One of them talks to one Solaris host via
public IPv4 (141.x.y.z), which is also the fastest connection
available there. The other Linux system (which is 192.168.1.131 and
doesn't have a public IPv4 address) uses IPv6 to communicate with the
second Solaris host, because no other (faster) connection is available.
(It's a node from the formerly RFC1918-only cluster, which now has its
own IPv6 subnet (2001:638:908:1/64).)

Things are getting interesting... ;)

-- 
mail: a...@thur.de
http://adi.thur.de
PGP: v2-key via keyserver

[X] <-- nail here for new monitor