On 12/09/15 19:15, Noel Kuntze wrote:
>
> Hello Digimer,
>
>> I am a strong believer in "keep it as simple as possible". In a case
>> like this, it's hard to argue that any option is simple, but given that
>> RRP is baked into the HA stack, I decided to trust it over QoS. I am
>> perfectly open to contrary ideas though. Can you help me understand why
>> you think tc's additional complexity is worth it? I'm willing to fully
>> believe it is, but I want to understand the pros and cons first.
>
> Whether you want it or not, the kernel always has a queuing policy on any
> network device.
> On CentOS 7, it's a pfifo_fast[1][2] queue, which is a classless FIFO queue,
> but with 3 bands (0, 1, 2).
> Bands with lower numbers have priority over higher numbers. As long as
> packets are in band 0, band 1 won't be worked on, and so on.
> The band a packet gets put into depends on the TOS/DSCP mark on the packet
> (TOS and DSCP use the same field in an IP packet; they're just different
> standards for the values in it).
> Applications that don't set a specific value for that field on the network
> socket they use leave it at the default of 0 (SSH can set it using
> IPQoS[3]).
>
> So, by influencing the TOS/DSCP field value with iptables, we can
> influence which packets get sent out when.
> The goal here is to mark Corosync totem traffic (UDP ports 5404 and 5405)
> with the correct TOS/DSCP value, so it ends up in band 0 and all the other
> stuff in band 2. This leaves you the option to put dlm traffic into band 1.
>
> The priority would thus be: totem > dlm > migration
>
> As pfifo_fast maps different priorities into different bands based on the
> priomap, it must be looked at to figure out what TOS/DSCP value must be set.
> The second table in the linked LARTC article[2] gives you the priorities
> and what bands they're mapped to. It tells us that TOS values from 0x10 to
> 0x16 are mapped to band 0. So we need to set Corosync traffic to that TOS
> value.
> DLM must be set to a value that maps to band 1, and so on.
>
> AFAIK, the TOS target in iptables is a non-terminating target, so packets
> that matched the rule continue to traverse the chain. We don't want that
> here, because it would screw with our prioritization order.
> So we ACCEPT traffic that we TOS'ed.
>
> Example iptables rules:
>
> iptables -t mangle -A OUTPUT -p udp -m multiport --dports 5404,5405 -j TOS --set-tos 0x10
> iptables -t mangle -A OUTPUT -p udp -m multiport --dports 5404,5405 -j ACCEPT
> iptables -t mangle -A OUTPUT -p sctp -j TOS --set-tos 0x18
> iptables -t mangle -A OUTPUT -p sctp -j ACCEPT
> iptables -t mangle -A OUTPUT -j TOS --set-tos 0x8
> (The default policy of *mangle OUTPUT is ACCEPT, so there's no need for an
> additional rule at the end to accept the rest)
>
> Tadaa.
>
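The band selection described in the quoted text can be checked with a small Python sketch. It is only an illustration under stated assumptions: the priomap is copied from the `tc qdisc` output in footnote [1], the TOS-to-priority values are the ones given for the four base TOS values in the LARTC table [2], and the names `PRIOMAP`, `TOS_TO_PRIORITY` and `band_for_tos` are made up for this example.

```python
# priomap as shown by `tc qdisc` in footnote [1]:
# index = Linux packet priority (0..15), value = pfifo_fast band (0..2).
PRIOMAP = [1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]

# Linux priority for the four base TOS values, as listed in the LARTC
# table [2]. Assumption: other TOS values are deliberately left out here.
TOS_TO_PRIORITY = {
    0x00: 0,  # normal service       -> best effort
    0x08: 2,  # maximize throughput  -> bulk
    0x10: 6,  # minimize delay       -> interactive
    0x18: 4,  # delay + throughput   -> interactive bulk
}


def band_for_tos(tos: int) -> int:
    """Return the pfifo_fast band for one of the four base TOS values."""
    priority = TOS_TO_PRIORITY[tos]  # KeyError for values not in the table
    return PRIOMAP[priority]


if __name__ == "__main__":
    # TOS values used by the iptables rules in this thread.
    for label, tos in (("corosync", 0x10), ("sctp/dlm", 0x18), ("rest", 0x08)):
        print(f"{label}: TOS {tos:#04x} -> band {band_for_tos(tos)}")
```

With this priomap, unmarked traffic (TOS 0x00) shares band 1 with the SCTP traffic, which is why the last iptables rule pushes everything else down to 0x8.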
I think it's worth mentioning here that corosync already sets its packets
to TC_PRIO_INTERACTIVE (which DLM does not), so they should not need too
much messing around with in iptables/qdisc.

Chrissie

> [1]
> [root@c7-arch-mirror-1 ~]# cat /etc/redhat-release
> CentOS Linux release 7.1.1503 (Core)
> [root@c7-arch-mirror-1 ~]# tc qdisc
> qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>
> [2] http://lartc.org/howto/lartc.qdisc.classless.html#AEN658
> [3] `man ssh_config`, search for IPQoS
>
> _______________________________________________
> Users mailing list: [email protected]
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
