Hi,

Could someone explain what my configuration directives should look like for the following setup?

Total of 18 compute nodes

5 compute nodes with eth0 connected to a switch (192.168.2.* network)
6 compute nodes with eth1 connected to this switch (192.168.2.* network) and eth0 connected to a different switch (10.1.21.* network)
7 compute nodes with eth1 connected to this switch (192.168.2.* network) and eth0 connected to a different switch (10.1.74.* network)

The routing tables on the three groups of nodes look like this:

Group 1:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         192.168.2.254   0.0.0.0         UG    0      0        0 eth0

Group 2:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
10.1.21.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         10.1.21.1       0.0.0.0         UG    0      0        0 eth0

Group 3:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.1.74.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         10.1.74.1       0.0.0.0         UG    0      0        0 eth0

When I use the default configuration on all the nodes (without an mcast_if directive), each group of nodes only shows up within its own subnet, so the collection agent only sees one group of nodes (depending on which node is listed first in the data_source line for that cluster).
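For reference, the gmetad side is just a data_source line like the following sketch (the host names here are hypothetical); as far as I understand, gmetad polls the listed hosts in order and takes the cluster view from the first one that answers, which is why only one group shows up:

   # gmetad.conf (sketch, hypothetical host names)
   data_source "mycluster" node01 node06 node13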

I then kept the default configuration for the first group and changed the configuration on the rest of the nodes by adding "mcast_if eth1" to the "udp_send_channel" and "udp_recv_channel" sections, but the result is still the same.
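Roughly, the channel sections on the group 2 and 3 nodes now look like this (a sketch using the default Ganglia multicast group and port, 239.2.11.71 and 8649; substitute whatever your cluster actually uses):

   udp_send_channel {
     mcast_join = 239.2.11.71
     mcast_if   = eth1
     port       = 8649
     ttl        = 1
   }

   udp_recv_channel {
     mcast_join = 239.2.11.71
     mcast_if   = eth1
     port       = 8649
     bind       = 239.2.11.71
   }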

I only get the desired result of all nodes multicasting to all the other nodes when I add the following route to the tables of the nodes in groups 2 and 3. Is there a reason for this, and is there a way around it? If I make this change to the routing table, I lose the ability to log in to a node directly.

0.0.0.0         192.168.2.254   0.0.0.0         UG    0      0        0 eth1
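I was wondering whether something like the following would work instead, routing only the multicast traffic out eth1 while leaving the default gateway on eth0 alone. This is untested and assumes the default Ganglia multicast group 239.2.11.71:

   # untested: send only the Ganglia multicast group out eth1,
   # keeping the existing default route on eth0
   route add -host 239.2.11.71 dev eth1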

Hoping to get an answer to this rather intriguing issue.

Thanks,
Prakash
