> On 7 Mar 2019, at 17:40, Toke Høiland-Jørgensen <[email protected]> wrote: > > Kevin Darbyshire-Bryant <[email protected]> writes: > >>> On 7 Mar 2019, at 10:10, Toke Høiland-Jørgensen <[email protected]> wrote: >> >> <snipping some context> >>>>>> >>>>>> The valid bit is set when the ‘getdscp’ function has written a DSCP >>>>>> value into the conntrack (& hence skb) mark. This allows us & other >>>>>> skb->mark/ct->mark applications (eg iptables, cake qdisc) to know that >>>>>> a DSCP value has been placed in the field. We cannot simply use a >>>>>> non-zero DSCP because zero is a valid DSCP. >>>>> >>>>> If someone installs the action, the field is supposedly always copied; >>>>> so why do we need another flag? >>>> >>>> I’m trying to limit the number of times expensive iptables mangle >>>> rules have to run. >>> >>> Right, I see your point, but I'm worried that this can risk becoming a >>> source of hard-to-debug bugs if this bit happens to get set by accident >>> in other places. So, I would suggest to at least make it optional (and >>> configurable). So how about the following configuration options: >> >> Phew, I explained myself clearly enough (just) that time :-). There’s >> a compromise here to be had between setting/using a DSCP for >> connection(fwmark) vs setting/using a DSCP per packet. Mostly it’s a >> (improved) performance vs DSCP accuracy compromise and assumes DSCP >> isn’t going to change after the connection has been established - some >> people have rules that mangle DSCP based on amount of data >> transferred. Personal view: anything that permits some sort of >> classification restoration on ingress has to be an improvement on what >> we have now. > > So I don't think we in general can assume that all packets of the same > flow has the same DSCP mark. If we want to enforce this, that is another > matter entirely. > >>> - fwmark shift (valid values 0-32) >>> specifies how many bits to left-shift the DSCP values before putting >>> them into the fwmark (and how many bits to right-shift the value read >>> from fwmark before writing it to the DSCP field); this could also be >>> inferred from the mask rather than be a separate option (shift = >>> lowest set bit of mask) >> >> I think the inferred choice is best. I can see all manner of confusing >> behaviour with mismatches between mask & shift. Also it’s one less >> parameter to pass … and in my dream world have to add some of these >> parameters into cake as well so it can interpret the DSCP containing >> fwmark as well… the fewer the better :-) > > Yeah, I tend to lean towards that as well. Which could also work for > cake: simply interpret the FWMARK parameter as a mask, and shift by the > lowest set bit. > >>> - get_dscp (boolean; cannot be set along with set_dscp) >>> if set, the DSCP field will be copied to the fwmark field, subject to >>> mask and shift >> >> Yes >> >>> >>> - set_dscp (boolean; cannot be set along with get_dscp) >>> if set, the value in fwmark will be copied to the DSCP field, subject >>> to mask and shift >> >> Yes >> >>> >>> - state mask (32-bit value; probably needs better name) >>> if set: the get_dscp action will OR the resulting fwmark before >>> storing it. the set_dscp value will AND the fwmark with this value >>> before doing anything, and abort if the result is false. >> >> Nearly: If set; the get_dscp action will AND the fwmark with this >> value and abort if true. If false it will OR the resulting fwmark >> before storing it. the set_dscp action will AND the fwmark with this >> value before doing anything and will abort if result is false. >> >> The ‘get_dscp action AND with fwmark and abort if true’ is the >> function that allows DSCP values to ‘stick' into a connection, thus >> obviating the need for iptables rules to mangle the DSCP on every >> egress packet. >> >> I can see a conflict between people who want the DSCP copied into >> fwmark no matter what and people such as myself who wish it to only >> happen if fwmark doesn’t have it set already. >> >> Is the solution to have a ‘get_check_state’ option that enables the >> ‘get_dscp action and with fwmark ’state mask’ abort if true check? >> Maybe even simpler to have a ‘get_state mask’ and a ’set_state mask’ - >> AND the fwmark with the get_state mask, if false copy the dscp, else >> abort. > > Right, if you want to be able to do all those combinations of things we > quickly run into combinatorial explosion, or we risk hard-coding a > policy that some users won't like. However, if we do want to express > these kinds of things, maybe this is better implemented as additional > options to the existing iptables matches? > > I.e., we could add > > --set-mark-from-dscp MASK
Yes, that (I think) could work. I could wrap my own iptables based ‘have I copied the required DSCP to the fwmark’ rules around that by setting my own flag bit in the fwmark. but.. > > to the MARK target, and > > --set-dscp-from-fwmark MASK > > to the DSCP target. Then you can implement whatever policy you want > using iptables rules... Any reason that wouldn't work? On its own I don’t think that would work for ingress traffic - iptables happens too late. So on planet Kevin I still need some sort of flag held in the fwmark that says ‘I hold a DSCP value’ so cake can use it and act_connmarkdscp can (optionally) restore it to the diffserv field. I suspect we’re going around in circles around what I would like which is “a bit DSCP fuzzy but lighter on CPU ‘cos I don’t have to hit iptables mangle rules as much” v what I think you would like is ’update the fwmark DSCP every time but that also requires iptables to mangle the DSCP for every packet’ I think the options of get_mask & set_mask can accommodate both behaviour choices. Phew this is hard - come on all you out there… i know you exist! What am I missing/misunderstanding? Ryan on this list previously said "Perfect! And to me, this functionality truly is the icing on (the) cake that makes it the complete bufferbloat/QoS system I've been dreaming of for ingress.” Thoughts, comments or do Toke & myself make an amusing enough side show? :-) Kevin _______________________________________________ Cake mailing list [email protected] https://lists.bufferbloat.net/listinfo/cake
