> On 4 Mar 2019, at 12:44, Toke Høiland-Jørgensen <t...@redhat.com> wrote: > > Kevin Darbyshire-Bryant <ke...@darbyshire-bryant.me.uk> writes: > >>> On 4 Mar 2019, at 11:17, Toke Høiland-Jørgensen <t...@redhat.com> wrote: >>> >>> Kevin Darbyshire-Bryant <ke...@darbyshire-bryant.me.uk> writes: >>> >>>>> On 4 Mar 2019, at 08:39, Pete Heist <p...@heistp.net> wrote: >>>>> >>>>> >>>>>> On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant >>>>>> <ke...@darbyshire-bryant.me.uk> wrote: >>>>>> >>>>>> The very bad idea: >>>>>> >>>>>> And it’s bad ‘cos it’s sort of incompatible with the existing fwmark >>>>>> implementation as described above. So an awful lot of our >>>>>> shenanigans above is due to DSCP not traversing the internet >>>>>> particularly well. The solution above abstracts DSCP into ’tins’ >>>>>> which we put into fwmarks. Another approach would be to put the DSCP >>>>>> *into* the fwmark. CAKE could (optionally) copy the FWMARK contained >>>>>> DSCP into the diffserv field onto the actual packets. Voila DSCP >>>>>> traversal across ’tinternet with tin/bandwidth allocation in our >>>>>> local domain preserved. >>>>> >>>>> If I understand it right, another use case for this “very bad idea” >>>>> is preserving DSCP locally while traversing upstream WiFi links as >>>>> besteffort, which avoids airtime efficiency problems that can occur >>>>> with 802.11e (WMM). In cases where the router config can’t be changed >>>>> (802.11e is mandatory after all) I’ve used IPIP tunnels for this, as >>>>> it hides DSCP from the WiFi stack while preserving the values through >>>>> the tunnel, but this would be easier. Neat… :) >>>> >>>> Everyone has understood the intent & maybe the implementation >>>> correctly. 2 patches attached, one for cake, one for tc. >>>> >>>> They are naively coded and some of it undoes Toke’s recent tidying up >>>> (sorry!) >>> >>> Heh. First comment: Don't do that ;) >> >> I did say naively coded. >> >>> >>> A few more below. >>> >>>> 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A >>>> diff --git a/pkt_sched.h b/pkt_sched.h >>>> index a2f570c..d1f288d 100644 >>>> --- a/pkt_sched.h >>>> +++ b/pkt_sched.h >>>> @@ -879,6 +879,7 @@ enum { >>>> TCA_CAKE_ACK_FILTER, >>>> TCA_CAKE_SPLIT_GSO, >>>> TCA_CAKE_FWMARK, >>>> + TCA_CAKE_ICING, >>>> __TCA_CAKE_MAX >>>> }; >>>> #define TCA_CAKE_MAX (__TCA_CAKE_MAX - 1) >>>> diff --git a/sch_cake.c b/sch_cake.c >>>> index 733b897..5aca0f3 100644 >>>> --- a/sch_cake.c >>>> +++ b/sch_cake.c >>>> @@ -270,7 +270,8 @@ enum { >>>> CAKE_FLAG_INGRESS = BIT(2), >>>> CAKE_FLAG_WASH = BIT(3), >>>> CAKE_FLAG_SPLIT_GSO = BIT(4), >>>> - CAKE_FLAG_FWMARK = BIT(5) >>>> + CAKE_FLAG_FWMARK = BIT(5), >>>> + CAKE_FLAG_ICING = BIT(6) >>> >>> This implies that icing and fwmark can be enabled completely >>> independently of each other. Are you sure about the semantics for that? >> >> No, I’m not. I sent the patches so others could see my lunacy in action and >> hopefully improve it. >> >> The actual operation links FWMARK, INGRESS & ICING in a variety of >> combinations. > > Right, so obviously this needs to be thought through…
Yes. Volunteers with thoughts & ideally code sought. > >>>> }; >>>> >>>> /* COBALT operates the Codel and BLUE algorithms in parallel, in order to >>>> @@ -333,7 +334,7 @@ static const u8 diffserv8[] = { >>>> }; >>>> >>>> static const u8 diffserv4[] = { >>>> - 0, 2, 0, 0, 2, 0, 0, 0, >>>> + 0, 1, 0, 0, 2, 0, 0, 0, >>>> 1, 0, 0, 0, 0, 0, 0, 0, >>>> 2, 0, 2, 0, 2, 0, 2, 0, >>>> 2, 0, 2, 0, 2, 0, 2, 0, >>>> @@ -344,7 +345,7 @@ static const u8 diffserv4[] = { >>>> }; >>>> >>>> static const u8 diffserv3[] = { >>>> - 0, 0, 0, 0, 2, 0, 0, 0, >>>> + 0, 1, 0, 0, 2, 0, 0, 0, >>> >>> Why are you messing with the diffserv mappings in this patch? >> >> This is a combination patch of Dave’s new LE coding and the >> fwmark/dscp mangling. > > Ah. Well let's keep that separate from this patch/discussion… Done > >>> >>>> 1, 0, 0, 0, 0, 0, 0, 0, >>>> 0, 0, 0, 0, 0, 0, 0, 0, >>>> 0, 0, 0, 0, 0, 0, 0, 0, >>>> @@ -1618,7 +1619,24 @@ static unsigned int cake_drop(struct Qdisc *sch, >>>> struct sk_buff **to_free) >>>> return idx + (tin << 16); >>>> } >>>> >>>> -static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash) >>>> +void cake_update_diffserv(struct sk_buff *skb, u8 dscp) >>>> +{ >>>> + switch (skb->protocol) { >>>> + case htons(ETH_P_IP): >>>> + if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != dscp) >>>> + ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dscp); >>>> + break; >>>> + case htons(ETH_P_IPV6): >>>> + if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != dscp) >>>> + ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, dscp); >>>> + break; >>>> + default: >>>> + break; >>>> + } >>>> + >>>> +} >>> >>> So washing is just a special case of this (wash is >>> cake_update_diffserv(skb,0)). So you shouldn't need to add another >>> function, just augment the existing handling code. >> >> Erm, that’s exactly what I’ve done. > > Ah, right; I guess it's just the reverting of the cleanup patch that is > the issue, then :) With the reverting of the cleaning reverted - if that makes any sense. And I have major twitchiness at the result. > >>>> +static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash) >>>> { >>>> u8 dscp; >>>> >>>> @@ -1644,37 +1662,70 @@ static u8 cake_handle_diffserv(struct sk_buff >>>> *skb, u16 wash) >>>> } >>>> } >>>> >>>> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK) >>> >>> Save an ifdef below by moving the ifdef inside the function definition. >>> >>>> +void cake_update_ct_mark(struct sk_buff *skb, u8 dscp) >>>> +{ >>>> + enum ip_conntrack_info ctinfo; >>>> + struct nf_conn *ct; >>>> + >>>> + ct = nf_ct_get(skb, &ctinfo); >>>> + if (!ct) >>>> + return; >>>> + >>>> + ct->mark &= 0x80ffffff; >>>> + ct->mark |= (0x40 | dscp) << 24; >>> >>> Right, so we *might* have an argument that putting the *tin* into the >>> fwmark is CAKE's business, but copying over the dscp mark is not >>> something a qdisc should be doing… >> >> Why ever not? It’s not the DSCP, it’s a lookup value into the cake >> priority table, it just happens to look like the DSCP ;-) > > If it quacks like a duck… I honestly don’t know where to go from here. I’m clearly trying to do something that the kernel doesn’t want to do. > >> >>>> + nf_conntrack_event_cache(IPCT_MARK, ct); >>>> +} >>>> +#endif >>> >>> Also, are you sure this will work in all permutations of conntrack being >>> a module vs not etc? (we had to jump through quite some hoops to get the >>> conntrack hooks to work last time; this is probably my biggest worry here). >> >> No, I have absolutely no clue here at all. > > Well, another issue that needs fixing, then… Again bright-eyed & bushy tailed volunteers sought! > > -Toke v2 addressing some of the comments attached. Is it best to keep the in progress patches here or should they be github PRs ? Cheers, Kevin D-B 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
v2-0001-Copy-dscp-to-fwmark.patch
Description: v2-0001-Copy-dscp-to-fwmark.patch
_______________________________________________ Cake mailing list Cake@lists.bufferbloat.net https://lists.bufferbloat.net/listinfo/cake