Hi,
Could someone explain what my configuration directives should look like
for the following setup?
Total of 18 compute nodes:

- 5 compute nodes with eth0 connected to a switch (192.168.2.* network).
- 6 compute nodes with eth1 connected to that same switch (192.168.2.*
  network) and eth0 connected to a different switch (10.1.21.* network).
- 7 compute nodes with eth1 connected to that same switch (192.168.2.*
  network) and eth0 connected to a different switch (10.1.74.* network).
The routing table on each group of nodes looks like this:
Group 1:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         192.168.2.254   0.0.0.0         UG    0      0        0 eth0
Group 2:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
10.1.21.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         10.1.21.1       0.0.0.0         UG    0      0        0 eth0
Group 3:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.1.74.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         10.1.74.1       0.0.0.0         UG    0      0        0 eth0
With the default configuration on all the nodes (without an mcast_if
directive), each group of nodes only shows up within its own subnet, so
the collection agent (gmetad) only sees one group of nodes, depending on
which node is listed first in the data_source line for that cluster.
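
In other words, every node is running something like the stock multicast
channel definitions below (assuming the Ganglia 3.x gmond.conf syntax;
239.2.11.71 and port 8649 are the shipped defaults, and the cluster and
node names in the data_source line are placeholders, not my exact
values):

gmond.conf (every node):

udp_send_channel {
  mcast_join = 239.2.11.71   # default multicast group
  port = 8649                # default port
  ttl = 1
}

udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}

gmetad.conf (collector):

data_source "my-cluster" node01 node07 node12   # placeholder node names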
Later I left the first group at the default configuration and changed
the configuration for the rest of the nodes by adding "mcast_if eth1" to
the "udp_send_channel" and "udp_recv_channel" sections, but the result
is still the same.
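
So on the group 2 and group 3 nodes the channels now look roughly like
this (same default address and port as above, with only the mcast_if
line added):

udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
  mcast_if = eth1            # send multicast out the 192.168.2.* interface
}

udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
  mcast_if = eth1            # join the multicast group on eth1
}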
I only get the desired result of all nodes multicasting to all the
other nodes when I add the following route to the tables of the nodes in
group 2 and group 3:

0.0.0.0         192.168.2.254   0.0.0.0         UG    0      0        0 eth1

Is there a reason why, and is there a way around it? If I make this
change to the routing table, I lose the ability to log in directly to
those nodes.
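
For reference, that entry corresponds to something like the following
command on each group 2 and group 3 node (a sketch only; depending on
metrics, the existing default route via eth0 may need to be deleted
first for this one to take effect):

route add default gw 192.168.2.254 dev eth1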
Hoping to get an answer to this rather intriguing issue.
Thanks,
Prakash