On 11/09/15 02:40 PM, Noel Kuntze wrote:
> Hello,
>
> Personally, I would set filters in tc (the packet prioritization/shaping
> part of the network stack) to prioritize cluster packets. That way,
> delivery of the packets is basically guaranteed. I'd do something similar
> on the switch to make sure it prioritizes the packets, too.
>
> The migration traffic must always have a lower priority than the traffic
> of cman and the other components, so the totems get delivered in any case.
> The default queuing behaviour is FIFO. This is obviously not desirable
> here. If the preset queuing mechanism supports traffic prioritization,
> you can possibly get away with a single set of iptables DSCP/TOS rules
> in *mangle OUTPUT to set the correct value so the queue prioritizes it.
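If I'm reading you right, you mean something along these lines? (My sketch,
untested; the interface name eth0 and the CS6 class are placeholders I picked,
and I'm assuming the totem ports 5404/5405 from my firewall rules below.)

====
# Mark totem traffic in *mangle OUTPUT so a priority-aware qdisc sees it
# (eth0 and CS6 are placeholders for illustration):
iptables -t mangle -A OUTPUT -p udp -m multiport --dports 5404,5405 \
         -j DSCP --set-dscp-class CS6

# The default pfifo_fast qdisc already maps TOS/DSCP bits into its three
# bands, so the mark alone may be enough; an explicit prio qdisc makes
# the intent visible:
tc qdisc add dev eth0 root handle 1: prio
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
        match ip dport 5404 0xfffe flowid 1:1   # mask 0xfffe matches 5404+5405
====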
So back when I was designing the initial Anvil!, I had a conversation about
this with one of the core devs. His take was that QoS causes more problems
than it solves. That was what was in my mind when I decided on RRP instead
of QoS. I am a strong believer in "keep it as simple as possible". In a case
like this, it's hard to argue that any option is simple, but given that RRP
is baked into the HA stack, I decided to trust it over QoS.

I am perfectly open to contrary ideas, though. Can you help me understand why
you think tc's additional complexity is worth it? I'm fully willing to believe
it is, but I want to understand the pros and cons first.

As an aside, I've now got the cluster running in either UDP-unicast or
UDP-multicast mode (I wanted to leave my options open) with the following
iptables rules (note: 10.20/16 = BCN, 10.10/16 = SN, 192.168.199/24 = IFN):

====
# Generated by iptables-save v1.4.7 on Sat Sep 12 01:00:09 2015
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [300170:21524114]
-A INPUT -s 10.10.0.0/16 -p udp -m addrtype --dst-type MULTICAST -m conntrack --ctstate NEW -m multiport --dports 5404,5405 -j ACCEPT
-A INPUT -s 10.20.0.0/16 -p udp -m addrtype --dst-type MULTICAST -m conntrack --ctstate NEW -m multiport --dports 5404,5405 -j ACCEPT
-A INPUT -p tcp -m conntrack --ctstate NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -s 10.20.0.0/16 -p sctp -j ACCEPT
-A INPUT -s 10.10.0.0/16 -p sctp -j ACCEPT
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p udp -m conntrack --ctstate NEW -m multiport --dports 5404,5405 -j ACCEPT
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p udp -m conntrack --ctstate NEW -m multiport --dports 5404,5405 -j ACCEPT
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m conntrack --ctstate NEW -m multiport --dports 123,5800,5900:5999,11111,16851,21064,49152:49216 -j ACCEPT
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m conntrack --ctstate NEW -m multiport --dports 7788:7799,11111,16851,21064 -j ACCEPT
-A INPUT -s 192.168.122.0/24 -d 192.168.122.0/24 -p tcp -m conntrack --ctstate NEW -m multiport --dports 123,5800,5900:5999 -j ACCEPT
-A INPUT -p igmp -j ACCEPT
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Sat Sep 12 01:00:09 2015
====
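For reference, the RRP setup I'm trusting looks roughly like this in
corosync.conf terms (a sketch only; under cman this gets generated from
cluster.conf rather than written by hand, and the multicast addresses here
are placeholders, not my real values):

====
totem {
    version: 2
    rrp_mode: passive   # 'passive' round-robins across rings; 'active' sends on all
    # transport: udpu   # uncomment for the UDP-unicast variant
    interface {
        ringnumber: 0
        bindnetaddr: 10.20.0.0     # BCN
        mcastaddr: 239.192.100.1   # placeholder
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 10.10.0.0     # SN
        mcastaddr: 239.192.100.2   # placeholder
        mcastport: 5405
    }
}
====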
In both unicast and multicast mode, I still see the following when I fail
the BCN (ring 0):

====
Sep 12 00:01:24 node2 corosync[26991]: [TOTEM ] ring 0 active with no faults
Sep 12 00:01:36 node2 corosync[26991]: [TOTEM ] Incrementing problem counter for seqid 546 iface 10.20.10.2 to [1 of 3]
Sep 12 00:01:38 node2 corosync[26991]: [TOTEM ] ring 0 active with no faults
Sep 12 00:01:42 node2 corosync[26991]: [TOTEM ] Incrementing problem counter for seqid 548 iface 10.20.10.2 to [1 of 3]
Sep 12 00:01:44 node2 corosync[26991]: [TOTEM ] ring 0 active with no faults
Sep 12 00:01:49 node2 corosync[26991]: [TOTEM ] Incrementing problem counter for seqid 550 iface 10.20.10.2 to [1 of 3]
Sep 12 00:01:51 node2 corosync[26991]: [TOTEM ] ring 0 active with no faults
Sep 12 00:01:56 node2 corosync[26991]: [TOTEM ] Incrementing problem counter for seqid 552 iface 10.20.10.2 to [1 of 3]
Sep 12 00:01:58 node2 corosync[26991]: [TOTEM ] ring 0 active with no faults
Sep 12 00:02:02 node2 corosync[26991]: [TOTEM ] Incrementing problem counter for seqid 554 iface 10.20.10.2 to [1 of 3]
Sep 12 00:02:04 node2 corosync[26991]: [TOTEM ] ring 0 active with no faults
Sep 12 00:02:09 node2 corosync[26991]: [TOTEM ] Incrementing problem counter for seqid 556 iface 10.20.10.2 to [1 of 3]
Sep 12 00:02:11 node2 corosync[26991]: [TOTEM ] ring 0 active with no faults
====

This worries me, though the cluster never breaks. Thanks again!
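For anyone following along, I keep an eye on ring health during these
failure tests with corosync-cfgtool (a quick sketch; the exact output
format varies by corosync version):

====
corosync-cfgtool -s   # print the status of the rings on this node
corosync-cfgtool -r   # re-enable a ring that was marked FAULTY, after repair
====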
-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?