Re: [Cake] Using firewall connmarks as tin selectors

2019-03-05 Thread Kevin Darbyshire-Bryant


> On 4 Mar 2019, at 21:33, Toke Høiland-Jørgensen  wrote:

>> I think so too though I think the mechanism of copying the DSCP bits
>> and adding a ‘I did this’ flag bit should be retained so that other
>> user space tools (iptables etc) can detect when a connmark based DSCP
>> has been set/applied.
> 
> I guess this could be an option as well?

If we don’t do that then potentially we have to look up the DSCP and update the 
conntrack mark for every packet.

>> I think cake ‘fwmark’ should have the smarts to look for the act_dscp
>> DSCP value if nothing else so we don’t have to have the overhead of
>> act_dscp set restoring DSCP to all the packets if we don’t want to.
> 
> Not sure what you mean here?

What I meant was that we can make the diffserv restore part optional.  Our 
qdisc (or whatever) could pick up the fw stored DSCP for tin/bandwidth 
selection and not require the real DSCP to be set and quite possibly 
washed/bleached again anyway.


>> I’m right at the limit of my coding ability with what I’ve sent in so
>> far - the kernel space bits of act_connmark leave me mostly confused -
>> really not sure where to start with act_dscp!
> 
> I think I would start with `cp act_connmark.c act_dscp.c`, adding the
> new file to the Makefile and Kconfig, and working from there. Then rip
> out everything not needed, and copy over what you already added to cake.
> 
> Happy to help you work out the details; but I think we'll make more
> progress on this if you are driving it :)

OK - we’ll see how long it takes before someone screams or laughs themselves to 
death :-)

Kevin
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Toke Høiland-Jørgensen
Toke Høiland-Jørgensen  writes:

> Kevin Darbyshire-Bryant  writes:
>
>>> On 4 Mar 2019, at 17:36, Toke Høiland-Jørgensen  wrote:
>>> < snips >
>>> 
>>> Or act_dscp with 'get' and 'set' options :)
>>> 
 As I said earlier I couldn't work out how m_conntrack did… anything at
 all to be honest!
>>> 
>>> act_connmark is pretty straight forward:
>>> https://elixir.bootlin.com/linux/latest/source/net/sched/act_connmark.c#L34
>>
>> Oh bloody hell I’m an idiot - I was looking in user space tc code for
>> something that obviously lives in kernel land.  Yes *now* I can see
>> what it does.
>
> No comment ;)
>
>> @@ -1661,13 +1695,14 @@ static struct cake_tin_data 
>> *cake_select_tin(struct Qdisc *sch,
>>  tin = 0;
>> 
>>  else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
>> - skb->mark &&
>> - skb->mark <= q->tin_cnt)
>> -tin = q->tin_order[skb->mark - 1];
>> -
>> -else if (TC_H_MAJ(skb->priority) == sch->handle &&
>> - TC_H_MIN(skb->priority) > 0 &&
>> - TC_H_MIN(skb->priority) <= q->tin_cnt)
>> +   skb->mark & 0x4000) {
> 
> I think there's something odd with this mask?  There's only one bit set
> in it…
 
 I use the single bit as a flag to indicate cake has stored the  DSCP
 in the lower 6 bits of the byte.  Otherwise with a DSCP of 0 (the
 default) it’s rather difficult to know if a connection has been
 through the cake ’save dscp to fwmark’ process or not.  That also
 indicates to user space whether it should consider mangling packets or
 not e.g.
>>> 
>>> Ah, right. But that breaks the previous use case where someone just
>>> wants to set fwmarks that get turned into CAKE tins?
>>
>> Yes, which is why I was hoping for upstream to bounce it, not least
>> because  of the unmasked use of fwmark field.  Personally I’d like to
>> see that change reverted before we get trapped into something we’ll
>> regret.
>
> Well, we have plenty of time to try out things; the whole point of the
> rc cycle is testing. But sure, if we don't settle on something, we can
> just revert it :)

And, well, the simple thing to do is just keep the current behaviour, or
add a mask of 0xff, and use the tc action for everything that's more
advanced than this...

>>> I think this definitely is leaning towards decoupling the
>>> fw-mark-to-DSCP settings to an action. And probably making the shift and
>>> mask configurable rather than hard-coding something…
>>
>> I think so too though I think the mechanism of copying the DSCP bits
>> and adding a ‘I did this’ flag bit should be retained so that other
>> user space tools (iptables etc) can detect when a connmark based DSCP
>> has been set/applied.
>
> I guess this could be an option as well?
>
>> I think cake ‘fwmark’ should have the smarts to look for the act_dscp
>> DSCP value if nothing else so we don’t have to have the overhead of
>> act_dscp set restoring DSCP to all the packets if we don’t want to.
>
> Not sure what you mean here?
>
>> I’m right at the limit of my coding ability with what I’ve sent in so
>> far - the kernel space bits of act_connmark leave me mostly confused -
>> really not sure where to start with act_dscp!
>
> I think I would start with `cp act_connmark.c act_dscp.c`, adding the
> new file to the Makefile and Kconfig, and working from there. Then rip
> out everything not needed, and copy over what you already added to
> cake.

Or even simpler, just add new options to act_connmark...

-Toke
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Toke Høiland-Jørgensen
Kevin Darbyshire-Bryant  writes:

>> On 4 Mar 2019, at 17:36, Toke Høiland-Jørgensen  wrote:
>> < snips >
>> 
>> Or act_dscp with 'get' and 'set' options :)
>> 
>>> As I said earlier I couldn't work out how m_conntrack did… anything at
>>> all to be honest!
>> 
>> act_connmark is pretty straight forward:
>> https://elixir.bootlin.com/linux/latest/source/net/sched/act_connmark.c#L34
>
> Oh bloody hell I’m an idiot - I was looking in user space tc code for
> something that obviously lives in kernel land.  Yes *now* I can see
> what it does.

No comment ;)

> @@ -1661,13 +1695,14 @@ static struct cake_tin_data 
> *cake_select_tin(struct Qdisc *sch,
>   tin = 0;
> 
>   else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
> -  skb->mark &&
> -  skb->mark <= q->tin_cnt)
> - tin = q->tin_order[skb->mark - 1];
> -
> - else if (TC_H_MAJ(skb->priority) == sch->handle &&
> -  TC_H_MIN(skb->priority) > 0 &&
> -  TC_H_MIN(skb->priority) <= q->tin_cnt)
> +skb->mark & 0x4000) {
 
 I think there's something odd with this mask?  There's only one bit set
 in it…
>>> 
>>> I use the single bit as a flag to indicate cake has stored the  DSCP
>>> in the lower 6 bits of the byte.  Otherwise with a DSCP of 0 (the
>>> default) it’s rather difficult to know if a connection has been
>>> through the cake ’save dscp to fwmark’ process or not.  That also
>>> indicates to user space whether it should consider mangling packets or
>>> not e.g.
>> 
>> Ah, right. But that breaks the previous use case where someone just
>> wants to set fwmarks that get turned into CAKE tins?
>
> Yes, which is why I was hoping for upstream to bounce it, not least
> because  of the unmasked use of fwmark field.  Personally I’d like to
> see that change reverted before we get trapped into something we’ll
> regret.

Well, we have plenty of time to try out things; the whole point of the
rc cycle is testing. But sure, if we don't settle on something, we can
just revert it :)

>> I think this definitely is leaning towards decoupling the
>> fw-mark-to-DSCP settings to an action. And probably making the shift and
>> mask configurable rather than hard-coding something…
>
> I think so too though I think the mechanism of copying the DSCP bits
> and adding a ‘I did this’ flag bit should be retained so that other
> user space tools (iptables etc) can detect when a connmark based DSCP
> has been set/applied.

I guess this could be an option as well?

> I think cake ‘fwmark’ should have the smarts to look for the act_dscp
> DSCP value if nothing else so we don’t have to have the overhead of
> act_dscp set restoring DSCP to all the packets if we don’t want to.

Not sure what you mean here?

> I’m right at the limit of my coding ability with what I’ve sent in so
> far - the kernel space bits of act_connmark leave me mostly confused -
> really not sure where to start with act_dscp!

I think I would start with `cp act_connmark.c act_dscp.c`, adding the
new file to the Makefile and Kconfig, and working from there. Then rip
out everything not needed, and copy over what you already added to cake.

Happy to help you work out the details; but I think we'll make more
progress on this if you are driving it :)

-Toke
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Kevin Darbyshire-Bryant


> On 4 Mar 2019, at 17:36, Toke Høiland-Jørgensen  wrote:
> < snips >
> 
> Or act_dscp with 'get' and 'set' options :)
> 
>> As I said earlier I couldn't work out how m_conntrack did… anything at
>> all to be honest!
> 
> act_connmark is pretty straight forward:
> https://elixir.bootlin.com/linux/latest/source/net/sched/act_connmark.c#L34

Oh bloody hell I’m an idiot - I was looking in user space tc code for something 
that obviously lives in kernel land.  Yes *now* I can see what it does.

> 
 @@ -1661,13 +1695,14 @@ static struct cake_tin_data 
 *cake_select_tin(struct Qdisc *sch,
tin = 0;
 
else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
 -   skb->mark &&
 -   skb->mark <= q->tin_cnt)
 -  tin = q->tin_order[skb->mark - 1];
 -
 -  else if (TC_H_MAJ(skb->priority) == sch->handle &&
 -   TC_H_MIN(skb->priority) > 0 &&
 -   TC_H_MIN(skb->priority) <= q->tin_cnt)
 + skb->mark & 0x4000) {
>>> 
>>> I think there's something odd with this mask?  There's only one bit set
>>> in it…
>> 
>> I use the single bit as a flag to indicate cake has stored the  DSCP
>> in the lower 6 bits of the byte.  Otherwise with a DSCP of 0 (the
>> default) it’s rather difficult to know if a connection has been
>> through the cake ’save dscp to fwmark’ process or not.  That also
>> indicates to user space whether it should consider mangling packets or
>> not e.g.
> 
> Ah, right. But that breaks the previous use case where someone just
> wants to set fwmarks that get turned into CAKE tins?

Yes, which is why I was hoping for upstream to bounce it, not least because  of 
the unmasked use of fwmark field.  Personally I’d like to see that change 
reverted before we get trapped into something we’ll regret.

> I think this definitely is leaning towards decoupling the
> fw-mark-to-DSCP settings to an action. And probably making the shift and
> mask configurable rather than hard-coding something…

I think so too though I think the mechanism of copying the DSCP bits and adding 
a ‘I did this’ flag bit should be retained so that other user space tools 
(iptables etc) can detect when a connmark based DSCP has been set/applied.  I 
think cake ‘fwmark’ should have the smarts to look for the act_dscp DSCP value 
if nothing else so we don’t have to have the overhead of act_dscp set restoring 
DSCP to all the packets if we don’t want to.

I’m right at the limit of my coding ability with what I’ve sent in so far - the 
kernel space bits of act_connmark leave me mostly confused - really not sure 
where to start with act_dscp!



Cheers,

Kevin D-B

012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A

___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Toke Høiland-Jørgensen
Kevin Darbyshire-Bryant  writes:

>> On 4 Mar 2019, at 16:39, Toke Høiland-Jørgensen  wrote:
>> 
>> Kevin Darbyshire-Bryant  writes:
>> 
>> [ ... snipping a bit of context here ... ]
>> 
>>> +void cake_update_ct_mark(struct sk_buff *skb, u8 dscp)
>>> +{
>>> +   enum ip_conntrack_info ctinfo;
>>> +   struct nf_conn *ct;
>>> +
>>> +   ct = nf_ct_get(skb, );
>>> +   if (!ct)
>>> +   return;
>>> +
>>> +   ct->mark &= 0x80ff;
>>> +   ct->mark |= (0x40 | dscp) << 24;
>> 
>> Right, so we *might* have an argument that putting the *tin* into the
>> fwmark is CAKE's business, but copying over the dscp mark is not
>> something a qdisc should be doing…
> 
> Why ever not?  It’s not the DSCP, it’s a lookup value into the cake
> priority table, it just happens to look like the DSCP ;-)
 
 If it quacks like a duck…
>>> 
>>> I honestly don’t know where to go from here. I’m clearly trying to do
>>> something that the kernel doesn’t want to do.
>> 
>> I'm not disputing that what you're trying to do (moving DSCP field into
>> connmark) is useful. I'm just questioning whether CAKE is the right
>> place to do this. I think it would fit better in a TC action; either
>> modify act_connmark, or create a new action to sync fwmarks and DSCP
>> marks…
>
> Interesting.  Thinks out loud - Two actions - ‘act_storedscp’,
> ‘act_restoredscp'

Or act_dscp with 'get' and 'set' options :)

> As I said earlier I couldn't work out how m_conntrack did… anything at
> all to be honest!

act_connmark is pretty straight forward:
https://elixir.bootlin.com/linux/latest/source/net/sched/act_connmark.c#L34

>>> @@ -1661,13 +1695,14 @@ static struct cake_tin_data *cake_select_tin(struct 
>>> Qdisc *sch,
>>> tin = 0;
>>> 
>>> else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
>>> -skb->mark &&
>>> -skb->mark <= q->tin_cnt)
>>> -   tin = q->tin_order[skb->mark - 1];
>>> -
>>> -   else if (TC_H_MAJ(skb->priority) == sch->handle &&
>>> -TC_H_MIN(skb->priority) > 0 &&
>>> -TC_H_MIN(skb->priority) <= q->tin_cnt)
>>> +  skb->mark & 0x4000) {
>> 
>> I think there's something odd with this mask?  There's only one bit set
>> in it…
>
> I use the single bit as a flag to indicate cake has stored the  DSCP
> in the lower 6 bits of the byte.  Otherwise with a DSCP of 0 (the
> default) it’s rather difficult to know if a connection has been
> through the cake ’save dscp to fwmark’ process or not.  That also
> indicates to user space whether it should consider mangling packets or
> not e.g.

Ah, right. But that breaks the previous use case where someone just
wants to set fwmarks that get turned into CAKE tins?

I think this definitely is leaning towards decoupling the
fw-mark-to-DSCP settings to an action. And probably making the shift and
mask configurable rather than hard-coding something...

-Toke
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Kevin Darbyshire-Bryant


> On 4 Mar 2019, at 16:39, Toke Høiland-Jørgensen  wrote:
> 
> Kevin Darbyshire-Bryant  writes:
> 
> [ ... snipping a bit of context here ... ]
> 
>> +void cake_update_ct_mark(struct sk_buff *skb, u8 dscp)
>> +{
>> +enum ip_conntrack_info ctinfo;
>> +struct nf_conn *ct;
>> +
>> +ct = nf_ct_get(skb, );
>> +if (!ct)
>> +return;
>> +
>> +ct->mark &= 0x80ff;
>> +ct->mark |= (0x40 | dscp) << 24;
> 
> Right, so we *might* have an argument that putting the *tin* into the
> fwmark is CAKE's business, but copying over the dscp mark is not
> something a qdisc should be doing…
 
 Why ever not?  It’s not the DSCP, it’s a lookup value into the cake
 priority table, it just happens to look like the DSCP ;-)
>>> 
>>> If it quacks like a duck…
>> 
>> I honestly don’t know where to go from here. I’m clearly trying to do
>> something that the kernel doesn’t want to do.
> 
> I'm not disputing that what you're trying to do (moving DSCP field into
> connmark) is useful. I'm just questioning whether CAKE is the right
> place to do this. I think it would fit better in a TC action; either
> modify act_connmark, or create a new action to sync fwmarks and DSCP
> marks…

Interesting.  Thinks out loud - Two actions - ‘act_storedscp’, ‘act_restoredscp'

As I said earlier I couldn't work out how m_conntrack did… anything at all to 
be honest!

> 
> This would both sidestep the whole conntrack dependency issue, and make
> the same functionality available outside of CAKE (for an HTB-based
> setup, for instance).
> 
>> v2 addressing some of the comments attached.  Is it best to keep the
>> in progress patches here or should they be github PRs ?
> 
> Patches on the mailing list is fine by me, and it seems there are people
> reading the list, but not github, so let's keep it here for now at
> least. However, I'll hold off on more detailed comments on the patch
> until we've resolved the point above. With one exception:

There’s certainly been some response here that’s for sure :-)

> 
>> @@ -1661,13 +1695,14 @@ static struct cake_tin_data *cake_select_tin(struct 
>> Qdisc *sch,
>>  tin = 0;
>> 
>>  else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
>> - skb->mark &&
>> - skb->mark <= q->tin_cnt)
>> -tin = q->tin_order[skb->mark - 1];
>> -
>> -else if (TC_H_MAJ(skb->priority) == sch->handle &&
>> - TC_H_MIN(skb->priority) > 0 &&
>> - TC_H_MIN(skb->priority) <= q->tin_cnt)
>> +   skb->mark & 0x4000) {
> 
> I think there's something odd with this mask?  There's only one bit set
> in it…

I use the single bit as a flag to indicate cake has stored the  DSCP in the 
lower 6 bits of the byte.  Otherwise with a DSCP of 0 (the default) it’s rather 
difficult to know if a connection has been through the cake ’save dscp to 
fwmark’ process or not.  That also indicates to user space whether it should 
consider mangling packets or not e.g.

#if bit 6 0 then not been marked by cake - go & mangle the DSCP ready for cake 
to find & set
#the mark.
ipt -t mangle -A PREROUTING  -i $IFACE -m mark --mark 0x00/0x4000 -g 
QOS_MARK_${IFACE}
ipt -t mangle -A POSTROUTING -o $IFACE -m mark --mark 0x00/0x4000 -g 
QOS_MARK_${IFACE}



Cheers,

Kevin D-B

012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A

___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Toke Høiland-Jørgensen
Kevin Darbyshire-Bryant  writes:

[ ... snipping a bit of context here ... ]

> +void cake_update_ct_mark(struct sk_buff *skb, u8 dscp)
> +{
> + enum ip_conntrack_info ctinfo;
> + struct nf_conn *ct;
> +
> + ct = nf_ct_get(skb, );
> + if (!ct)
> + return;
> +
> + ct->mark &= 0x80ff;
> + ct->mark |= (0x40 | dscp) << 24;
 
 Right, so we *might* have an argument that putting the *tin* into the
 fwmark is CAKE's business, but copying over the dscp mark is not
 something a qdisc should be doing…
>>> 
>>> Why ever not?  It’s not the DSCP, it’s a lookup value into the cake
>>> priority table, it just happens to look like the DSCP ;-)
>> 
>> If it quacks like a duck…
>
> I honestly don’t know where to go from here. I’m clearly trying to do
> something that the kernel doesn’t want to do.

I'm not disputing that what you're trying to do (moving DSCP field into
connmark) is useful. I'm just questioning whether CAKE is the right
place to do this. I think it would fit better in a TC action; either
modify act_connmark, or create a new action to sync fwmarks and DSCP
marks...

This would both sidestep the whole conntrack dependency issue, and make
the same functionality available outside of CAKE (for an HTB-based
setup, for instance).

> v2 addressing some of the comments attached.  Is it best to keep the
> in progress patches here or should they be github PRs ?

Patches on the mailing list is fine by me, and it seems there are people
reading the list, but not github, so let's keep it here for now at
least. However, I'll hold off on more detailed comments on the patch
until we've resolved the point above. With one exception:

> @@ -1661,13 +1695,14 @@ static struct cake_tin_data *cake_select_tin(struct 
> Qdisc *sch,
>   tin = 0;
>  
>   else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
> -  skb->mark &&
> -  skb->mark <= q->tin_cnt)
> - tin = q->tin_order[skb->mark - 1];
> -
> - else if (TC_H_MAJ(skb->priority) == sch->handle &&
> -  TC_H_MIN(skb->priority) > 0 &&
> -  TC_H_MIN(skb->priority) <= q->tin_cnt)
> +skb->mark & 0x4000) {

I think there's something odd with this mask?  There's only one bit set
in it...

-Toke
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Kevin Darbyshire-Bryant


> On 4 Mar 2019, at 12:44, Toke Høiland-Jørgensen  wrote:
> 
> Kevin Darbyshire-Bryant  writes:
> 
>>> On 4 Mar 2019, at 11:17, Toke Høiland-Jørgensen  wrote:
>>> 
>>> Kevin Darbyshire-Bryant  writes:
>>> 
> On 4 Mar 2019, at 08:39, Pete Heist  wrote:
> 
> 
>> On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant 
>>  wrote:
>> 
>> The very bad idea:
>> 
>> And it’s bad ‘cos it’s sort of incompatible with the existing fwmark
>> implementation as described above. So an awful lot of our
>> shenanigans above is due to DSCP not traversing the internet
>> particularly well. The solution above abstracts DSCP into ’tins’
>> which we put into fwmarks. Another approach would be to put the DSCP
>> *into* the fwmark. CAKE could (optionally) copy the FWMARK contained
>> DSCP into the diffserv field onto the actual packets. Voila DSCP
>> traversal across ’tinternet with tin/bandwidth allocation in our
>> local domain preserved.
> 
> If I understand it right, another use case for this “very bad idea”
> is preserving DSCP locally while traversing upstream WiFi links as
> besteffort, which avoids airtime efficiency problems that can occur
> with 802.11e (WMM). In cases where the router config can’t be changed
> (802.11e is mandatory after all) I’ve used IPIP tunnels for this, as
> it hides DSCP from the WiFi stack while preserving the values through
> the tunnel, but this would be easier. Neat… :)
 
 Everyone has understood the intent & maybe the implementation
 correctly. 2 patches attached, one for cake, one for tc.
 
 They are naively coded and some of it undoes Toke’s recent tidying up
 (sorry!)
>>> 
>>> Heh. First comment: Don't do that ;)
>> 
>> I did say naively coded.
>> 
>>> 
>>> A few more below.
>>> 
 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A
 diff --git a/pkt_sched.h b/pkt_sched.h
 index a2f570c..d1f288d 100644
 --- a/pkt_sched.h
 +++ b/pkt_sched.h
 @@ -879,6 +879,7 @@ enum {
TCA_CAKE_ACK_FILTER,
TCA_CAKE_SPLIT_GSO,
TCA_CAKE_FWMARK,
 +  TCA_CAKE_ICING,
__TCA_CAKE_MAX
 };
 #define TCA_CAKE_MAX   (__TCA_CAKE_MAX - 1)
 diff --git a/sch_cake.c b/sch_cake.c
 index 733b897..5aca0f3 100644
 --- a/sch_cake.c
 +++ b/sch_cake.c
 @@ -270,7 +270,8 @@ enum {
CAKE_FLAG_INGRESS  = BIT(2),
CAKE_FLAG_WASH = BIT(3),
CAKE_FLAG_SPLIT_GSO= BIT(4),
 -  CAKE_FLAG_FWMARK   = BIT(5)
 +  CAKE_FLAG_FWMARK   = BIT(5),
 +  CAKE_FLAG_ICING= BIT(6)
>>> 
>>> This implies that icing and fwmark can be enabled completely
>>> independently of each other. Are you sure about the semantics for that?
>> 
>> No, I’m not.  I sent the patches so others could see my lunacy in action and 
>> hopefully improve it.
>> 
>> The actual operation links FWMARK, INGRESS & ICING in a variety of
>> combinations.
> 
> Right, so obviously this needs to be thought through…

Yes.  Volunteers with thoughts & ideally code sought.

> 
 };
 
 /* COBALT operates the Codel and BLUE algorithms in parallel, in order to
 @@ -333,7 +334,7 @@ static const u8 diffserv8[] = {
 };
 
 static const u8 diffserv4[] = {
 -  0, 2, 0, 0, 2, 0, 0, 0,
 +  0, 1, 0, 0, 2, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0,
2, 0, 2, 0, 2, 0, 2, 0,
2, 0, 2, 0, 2, 0, 2, 0,
 @@ -344,7 +345,7 @@ static const u8 diffserv4[] = {
 };
 
 static const u8 diffserv3[] = {
 -  0, 0, 0, 0, 2, 0, 0, 0,
 +  0, 1, 0, 0, 2, 0, 0, 0,
>>> 
>>> Why are you messing with the diffserv mappings in this patch?
>> 
>> This is a combination patch of Dave’s new LE coding and the
>> fwmark/dscp mangling.
> 
> Ah. Well let's keep that separate from this patch/discussion…

Done
> 
>>> 
1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
 @@ -1618,7 +1619,24 @@ static unsigned int cake_drop(struct Qdisc *sch, 
 struct sk_buff **to_free)
return idx + (tin << 16);
 }
 
 -static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
 +void cake_update_diffserv(struct sk_buff *skb, u8 dscp)
 +{
 +  switch (skb->protocol) {
 +  case htons(ETH_P_IP):
 +  if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != dscp)
 +  ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dscp);
 +  break;
 +  case htons(ETH_P_IPV6):
 +  if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != dscp)
 +  ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, dscp);
 +  break;
 +  default:
 +  break;
 +  }
 +
 +}
>>> 
>>> So washing is just a special case of this (wash is
>>> cake_update_diffserv(skb,0)). So you shouldn't need to add another
>>> function, just augment the 

Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Toke Høiland-Jørgensen
Kevin Darbyshire-Bryant  writes:

>> On 4 Mar 2019, at 11:17, Toke Høiland-Jørgensen  wrote:
>> 
>> Kevin Darbyshire-Bryant  writes:
>> 
 On 4 Mar 2019, at 08:39, Pete Heist  wrote:
 
 
> On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant 
>  wrote:
> 
> The very bad idea:
> 
> And it’s bad ‘cos it’s sort of incompatible with the existing fwmark
> implementation as described above. So an awful lot of our
> shenanigans above is due to DSCP not traversing the internet
> particularly well. The solution above abstracts DSCP into ’tins’
> which we put into fwmarks. Another approach would be to put the DSCP
> *into* the fwmark. CAKE could (optionally) copy the FWMARK contained
> DSCP into the diffserv field onto the actual packets. Voila DSCP
> traversal across ’tinternet with tin/bandwidth allocation in our
> local domain preserved.
 
 If I understand it right, another use case for this “very bad idea”
 is preserving DSCP locally while traversing upstream WiFi links as
 besteffort, which avoids airtime efficiency problems that can occur
 with 802.11e (WMM). In cases where the router config can’t be changed
 (802.11e is mandatory after all) I’ve used IPIP tunnels for this, as
 it hides DSCP from the WiFi stack while preserving the values through
 the tunnel, but this would be easier. Neat… :)
>>> 
>>> Everyone has understood the intent & maybe the implementation
>>> correctly. 2 patches attached, one for cake, one for tc.
>>> 
>>> They are naively coded and some of it undoes Toke’s recent tidying up
>>> (sorry!)
>> 
>> Heh. First comment: Don't do that ;)
>
> I did say naively coded.
>
>> 
>> A few more below.
>> 
>>> 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A
>>> diff --git a/pkt_sched.h b/pkt_sched.h
>>> index a2f570c..d1f288d 100644
>>> --- a/pkt_sched.h
>>> +++ b/pkt_sched.h
>>> @@ -879,6 +879,7 @@ enum {
>>> TCA_CAKE_ACK_FILTER,
>>> TCA_CAKE_SPLIT_GSO,
>>> TCA_CAKE_FWMARK,
>>> +   TCA_CAKE_ICING,
>>> __TCA_CAKE_MAX
>>> };
>>> #define TCA_CAKE_MAX(__TCA_CAKE_MAX - 1)
>>> diff --git a/sch_cake.c b/sch_cake.c
>>> index 733b897..5aca0f3 100644
>>> --- a/sch_cake.c
>>> +++ b/sch_cake.c
>>> @@ -270,7 +270,8 @@ enum {
>>> CAKE_FLAG_INGRESS  = BIT(2),
>>> CAKE_FLAG_WASH = BIT(3),
>>> CAKE_FLAG_SPLIT_GSO= BIT(4),
>>> -   CAKE_FLAG_FWMARK   = BIT(5)
>>> +   CAKE_FLAG_FWMARK   = BIT(5),
>>> +   CAKE_FLAG_ICING= BIT(6)
>> 
>> This implies that icing and fwmark can be enabled completely
>> independently of each other. Are you sure about the semantics for that?
>
> No, I’m not.  I sent the patches so others could see my lunacy in action and 
> hopefully improve it.
>
> The actual operation links FWMARK, INGRESS & ICING in a variety of
> combinations.

Right, so obviously this needs to be thought through...

>>> };
>>> 
>>> /* COBALT operates the Codel and BLUE algorithms in parallel, in order to
>>> @@ -333,7 +334,7 @@ static const u8 diffserv8[] = {
>>> };
>>> 
>>> static const u8 diffserv4[] = {
>>> -   0, 2, 0, 0, 2, 0, 0, 0,
>>> +   0, 1, 0, 0, 2, 0, 0, 0,
>>> 1, 0, 0, 0, 0, 0, 0, 0,
>>> 2, 0, 2, 0, 2, 0, 2, 0,
>>> 2, 0, 2, 0, 2, 0, 2, 0,
>>> @@ -344,7 +345,7 @@ static const u8 diffserv4[] = {
>>> };
>>> 
>>> static const u8 diffserv3[] = {
>>> -   0, 0, 0, 0, 2, 0, 0, 0,
>>> +   0, 1, 0, 0, 2, 0, 0, 0,
>> 
>> Why are you messing with the diffserv mappings in this patch?
>
> This is a combination patch of Dave’s new LE coding and the
> fwmark/dscp mangling.

Ah. Well let's keep that separate from this patch/discussion...

>> 
>>> 1, 0, 0, 0, 0, 0, 0, 0,
>>> 0, 0, 0, 0, 0, 0, 0, 0,
>>> 0, 0, 0, 0, 0, 0, 0, 0,
>>> @@ -1618,7 +1619,24 @@ static unsigned int cake_drop(struct Qdisc *sch, 
>>> struct sk_buff **to_free)
>>> return idx + (tin << 16);
>>> }
>>> 
>>> -static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
>>> +void cake_update_diffserv(struct sk_buff *skb, u8 dscp)
>>> +{
>>> +   switch (skb->protocol) {
>>> +   case htons(ETH_P_IP):
>>> +   if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != dscp)
>>> +   ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dscp);
>>> +   break;
>>> +   case htons(ETH_P_IPV6):
>>> +   if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != dscp)
>>> +   ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, dscp);
>>> +   break;
>>> +   default:
>>> +   break;
>>> +   }
>>> +
>>> +}
>> 
>> So washing is just a special case of this (wash is
>> cake_update_diffserv(skb,0)). So you shouldn't need to add another
>> function, just augment the existing handling code.
>
> Erm, that’s exactly what I’ve done.

Ah, right; I guess it's just the reverting of the cleanup patch that is
the issue, then :)

>>> +static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash)
>>> {
>>> u8 

Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Kevin Darbyshire-Bryant


> On 4 Mar 2019, at 11:17, Toke Høiland-Jørgensen  wrote:
> 
> Kevin Darbyshire-Bryant  writes:
> 
>>> On 4 Mar 2019, at 08:39, Pete Heist  wrote:
>>> 
>>> 
 On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant 
  wrote:
 
 The very bad idea:
 
 And it’s bad ‘cos it’s sort of incompatible with the existing fwmark
 implementation as described above. So an awful lot of our
 shenanigans above is due to DSCP not traversing the internet
 particularly well. The solution above abstracts DSCP into ’tins’
 which we put into fwmarks. Another approach would be to put the DSCP
 *into* the fwmark. CAKE could (optionally) copy the FWMARK contained
 DSCP into the diffserv field onto the actual packets. Voila DSCP
 traversal across ’tinternet with tin/bandwidth allocation in our
 local domain preserved.
>>> 
>>> If I understand it right, another use case for this “very bad idea”
>>> is preserving DSCP locally while traversing upstream WiFi links as
>>> besteffort, which avoids airtime efficiency problems that can occur
>>> with 802.11e (WMM). In cases where the router config can’t be changed
>>> (802.11e is mandatory after all) I’ve used IPIP tunnels for this, as
>>> it hides DSCP from the WiFi stack while preserving the values through
>>> the tunnel, but this would be easier. Neat… :)
>> 
>> Everyone has understood the intent & maybe the implementation
>> correctly. 2 patches attached, one for cake, one for tc.
>> 
>> They are naively coded and some of it undoes Toke’s recent tidying up
>> (sorry!)
> 
> Heh. First comment: Don't do that ;)

I did say naively coded.

> 
> A few more below.
> 
>> 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A
>> diff --git a/pkt_sched.h b/pkt_sched.h
>> index a2f570c..d1f288d 100644
>> --- a/pkt_sched.h
>> +++ b/pkt_sched.h
>> @@ -879,6 +879,7 @@ enum {
>>  TCA_CAKE_ACK_FILTER,
>>  TCA_CAKE_SPLIT_GSO,
>>  TCA_CAKE_FWMARK,
>> +TCA_CAKE_ICING,
>>  __TCA_CAKE_MAX
>> };
>> #define TCA_CAKE_MAX (__TCA_CAKE_MAX - 1)
>> diff --git a/sch_cake.c b/sch_cake.c
>> index 733b897..5aca0f3 100644
>> --- a/sch_cake.c
>> +++ b/sch_cake.c
>> @@ -270,7 +270,8 @@ enum {
>>  CAKE_FLAG_INGRESS  = BIT(2),
>>  CAKE_FLAG_WASH = BIT(3),
>>  CAKE_FLAG_SPLIT_GSO= BIT(4),
>> -CAKE_FLAG_FWMARK   = BIT(5)
>> +CAKE_FLAG_FWMARK   = BIT(5),
>> +CAKE_FLAG_ICING= BIT(6)
> 
> This implies that icing and fwmark can be enabled completely
> independently of each other. Are you sure about the semantics for that?

No, I’m not.  I sent the patches so others could see my lunacy in action and 
hopefully improve it.

The actual operation links FWMARK, INGRESS & ICING in a variety of combinations.

>> };
>> 
>> /* COBALT operates the Codel and BLUE algorithms in parallel, in order to
>> @@ -333,7 +334,7 @@ static const u8 diffserv8[] = {
>> };
>> 
>> static const u8 diffserv4[] = {
>> -0, 2, 0, 0, 2, 0, 0, 0,
>> +0, 1, 0, 0, 2, 0, 0, 0,
>>  1, 0, 0, 0, 0, 0, 0, 0,
>>  2, 0, 2, 0, 2, 0, 2, 0,
>>  2, 0, 2, 0, 2, 0, 2, 0,
>> @@ -344,7 +345,7 @@ static const u8 diffserv4[] = {
>> };
>> 
>> static const u8 diffserv3[] = {
>> -0, 0, 0, 0, 2, 0, 0, 0,
>> +0, 1, 0, 0, 2, 0, 0, 0,
> 
> Why are you messing with the diffserv mappings in this patch?

This is a combination patch of Dave’s new LE coding and the fwmark/dscp 
mangling.

> 
>>  1, 0, 0, 0, 0, 0, 0, 0,
>>  0, 0, 0, 0, 0, 0, 0, 0,
>>  0, 0, 0, 0, 0, 0, 0, 0,
>> @@ -1618,7 +1619,24 @@ static unsigned int cake_drop(struct Qdisc *sch, 
>> struct sk_buff **to_free)
>>  return idx + (tin << 16);
>> }
>> 
>> -static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
>> +void cake_update_diffserv(struct sk_buff *skb, u8 dscp)
>> +{
>> +switch (skb->protocol) {
>> +case htons(ETH_P_IP):
>> +if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != dscp)
>> +ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dscp);
>> +break;
>> +case htons(ETH_P_IPV6):
>> +if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != dscp)
>> +ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, dscp);
>> +break;
>> +default:
>> +break;
>> +}
>> +
>> +}
> 
> So washing is just a special case of this (wash is
> cake_update_diffserv(skb,0)). So you shouldn't need to add another
> function, just augment the existing handling code.

Erm, that’s exactly what I’ve done.

> 
>> +static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash)
>> {
>>  u8 dscp;
>> 
>> @@ -1644,37 +1662,70 @@ static u8 cake_handle_diffserv(struct sk_buff *skb, 
>> u16 wash)
>>  }
>> }
>> 
>> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK)
> 
> Save an ifdef below by moving the ifdef inside the function definition.
> 
>> +void cake_update_ct_mark(struct sk_buff *skb, u8 dscp)
>> +{
>> +enum ip_conntrack_info ctinfo;
>> +

Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread John Sager
Let action connmark continue to do that. You still need mirred anyway.

John


On 4 March 2019 11:04:39 GMT, "Toke Høiland-Jørgensen"  wrote:
>Kevin Darbyshire-Bryant  writes:
>
>>> On 3 Mar 2019, at 12:22, John Sager  wrote:
>>> 
>>> If you are going to do that, I would suggest using a few of the
>upper bits
>>> of the 32-bit fwmark/connmark space available, rather than the
>lowest bits.
>>> Then that would allow to use fwmarks other purposes, and to use the
>lowest
>>> bits, as well in the future. As iptables allows a mask before
>comparison,
>>> then choose a specific mask for the bits you use both for setting
>and testing.
>>
>> That’s a good point and I’m sort of hoping upstream reject the
>current
>> submission on that basis.  I think the ‘use of fwmarks’ needs more
>> thought as to how it’s done for the future - too keen to get
>something
>> out.  Maybe it’s sufficient as is, I don’t know.
>
>https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=0b5c7efdfc6e389ec6840579fe90bdb6f42b08dc
>
>This means it'll be in 5.1; so we have until that is released (~8 weeks
>or so) to set the behaviour in stone.
>
>I do think we at least need to add masking of the mark before using it
>for tin selection; the question is just which bits to use from it.
>
>As for setting the fwmark back in conntrack, I'm not sure I agree that
>this is something CAKE should be doing. Mostly because it means even
>tighter coupling between CAKE and the conntrack subsystem. However, I
>may be convinced by a sufficiently neat implementation, and anyway this
>is definitely something that will need to wait for 5.2 for upstream.
>
>So I think the priority is to agree on semantics for masking the fwmark
>when reading, and getting that implemented in a way that is compatible
>with both other uses of marks, and with anything we else we might want
>to do down the road.
>
>-Toke

-- 
Sent from the Aether.___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Toke Høiland-Jørgensen
Kevin Darbyshire-Bryant  writes:

>> On 4 Mar 2019, at 08:39, Pete Heist  wrote:
>> 
>> 
>>> On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant 
>>>  wrote:
>>> 
>>> The very bad idea:
>>> 
>>> And it’s bad ‘cos it’s sort of incompatible with the existing fwmark
>>> implementation as described above. So an awful lot of our
>>> shenanigans above is due to DSCP not traversing the internet
>>> particularly well. The solution above abstracts DSCP into ’tins’
>>> which we put into fwmarks. Another approach would be to put the DSCP
>>> *into* the fwmark. CAKE could (optionally) copy the FWMARK contained
>>> DSCP into the diffserv field onto the actual packets. Voila DSCP
>>> traversal across ’tinternet with tin/bandwidth allocation in our
>>> local domain preserved.
>> 
>> If I understand it right, another use case for this “very bad idea”
>> is preserving DSCP locally while traversing upstream WiFi links as
>> besteffort, which avoids airtime efficiency problems that can occur
>> with 802.11e (WMM). In cases where the router config can’t be changed
>> (802.11e is mandatory after all) I’ve used IPIP tunnels for this, as
>> it hides DSCP from the WiFi stack while preserving the values through
>> the tunnel, but this would be easier. Neat… :)
>
> Everyone has understood the intent & maybe the implementation
> correctly. 2 patches attached, one for cake, one for tc.
>
> They are naively coded and some of it undoes Toke’s recent tidying up
> (sorry!)

Heh. First comment: Don't do that ;)

A few more below.

> 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A
> diff --git a/pkt_sched.h b/pkt_sched.h
> index a2f570c..d1f288d 100644
> --- a/pkt_sched.h
> +++ b/pkt_sched.h
> @@ -879,6 +879,7 @@ enum {
>   TCA_CAKE_ACK_FILTER,
>   TCA_CAKE_SPLIT_GSO,
>   TCA_CAKE_FWMARK,
> + TCA_CAKE_ICING,
>   __TCA_CAKE_MAX
>  };
>  #define TCA_CAKE_MAX (__TCA_CAKE_MAX - 1)
> diff --git a/sch_cake.c b/sch_cake.c
> index 733b897..5aca0f3 100644
> --- a/sch_cake.c
> +++ b/sch_cake.c
> @@ -270,7 +270,8 @@ enum {
>   CAKE_FLAG_INGRESS  = BIT(2),
>   CAKE_FLAG_WASH = BIT(3),
>   CAKE_FLAG_SPLIT_GSO= BIT(4),
> - CAKE_FLAG_FWMARK   = BIT(5)
> + CAKE_FLAG_FWMARK   = BIT(5),
> + CAKE_FLAG_ICING= BIT(6)

This implies that icing and fwmark can be enabled completely
independently of each other. Are you sure about the semantics for that?
>  };
>  
>  /* COBALT operates the Codel and BLUE algorithms in parallel, in order to
> @@ -333,7 +334,7 @@ static const u8 diffserv8[] = {
>  };
>  
>  static const u8 diffserv4[] = {
> - 0, 2, 0, 0, 2, 0, 0, 0,
> + 0, 1, 0, 0, 2, 0, 0, 0,
>   1, 0, 0, 0, 0, 0, 0, 0,
>   2, 0, 2, 0, 2, 0, 2, 0,
>   2, 0, 2, 0, 2, 0, 2, 0,
> @@ -344,7 +345,7 @@ static const u8 diffserv4[] = {
>  };
>  
>  static const u8 diffserv3[] = {
> - 0, 0, 0, 0, 2, 0, 0, 0,
> + 0, 1, 0, 0, 2, 0, 0, 0,

Why are you messing with the diffserv mappings in this patch?

>   1, 0, 0, 0, 0, 0, 0, 0,
>   0, 0, 0, 0, 0, 0, 0, 0,
>   0, 0, 0, 0, 0, 0, 0, 0,
> @@ -1618,7 +1619,24 @@ static unsigned int cake_drop(struct Qdisc *sch, 
> struct sk_buff **to_free)
>   return idx + (tin << 16);
>  }
>  
> -static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
> +void cake_update_diffserv(struct sk_buff *skb, u8 dscp)
> +{
> + switch (skb->protocol) {
> + case htons(ETH_P_IP):
> + if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != dscp)
> + ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dscp);
> + break;
> + case htons(ETH_P_IPV6):
> + if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != dscp)
> + ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, dscp);
> + break;
> + default:
> + break;
> + }
> +
> +}

So washing is just a special case of this (wash is
cake_update_diffserv(skb,0)). So you shouldn't need to add another
function, just augment the existing handling code.

> +static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash)
>  {
>   u8 dscp;
>  
> @@ -1644,37 +1662,70 @@ static u8 cake_handle_diffserv(struct sk_buff *skb, 
> u16 wash)
>   }
>  }
>  
> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK)

Save an ifdef below by moving the ifdef inside the function definition.

> +void cake_update_ct_mark(struct sk_buff *skb, u8 dscp)
> +{
> + enum ip_conntrack_info ctinfo;
> + struct nf_conn *ct;
> +
> + ct = nf_ct_get(skb, );
> + if (!ct)
> + return;
> +
> + ct->mark &= 0x80ff;
> + ct->mark |= (0x40 | dscp) << 24;

Right, so we *might* have an argument that putting the *tin* into the
fwmark is CAKE's business, but copying over the dscp mark is not
something a qdisc should be doing...

> + nf_conntrack_event_cache(IPCT_MARK, ct);
> +}
> +#endif

Also, are you sure this will work in all permutations of conntrack being
a module vs 

Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Toke Høiland-Jørgensen
Kevin Darbyshire-Bryant  writes:

>> On 3 Mar 2019, at 12:22, John Sager  wrote:
>> 
>> If you are going to do that, I would suggest using a few of the upper bits
>> of the 32-bit fwmark/connmark space available, rather than the lowest bits.
>> Then that would allow to use fwmarks other purposes, and to use the lowest
>> bits, as well in the future. As iptables allows a mask before comparison,
>> then choose a specific mask for the bits you use both for setting and 
>> testing.
>
> That’s a good point and I’m sort of hoping upstream reject the current
> submission on that basis.  I think the ‘use of fwmarks’ needs more
> thought as to how it’s done for the future - too keen to get something
> out.  Maybe it’s sufficient as is, I don’t know.

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=0b5c7efdfc6e389ec6840579fe90bdb6f42b08dc

This means it'll be in 5.1; so we have until that is released (~8 weeks
or so) to set the behaviour in stone.

I do think we at least need to add masking of the mark before using it
for tin selection; the question is just which bits to use from it.

As for setting the fwmark back in conntrack, I'm not sure I agree that
this is something CAKE should be doing. Mostly because it means even
tighter coupling between CAKE and the conntrack subsystem. However, I
may be convinced by a sufficiently neat implementation, and anyway this
is definitely something that will need to wait for 5.2 for upstream.

So I think the priority is to agree on semantics for masking the fwmark
when reading, and getting that implemented in a way that is compatible
with both other uses of marks, and with anything we else we might want
to do down the road.

-Toke
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Kevin Darbyshire-Bryant


> On 4 Mar 2019, at 08:39, Pete Heist  wrote:
> 
> 
>> On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant 
>>  wrote:
>> 
>> The very bad idea:
>> 
>> And it’s bad ‘cos it’s sort of incompatible with the existing fwmark 
>> implementation as described above.  So an awful lot of our shenanigans above 
>> is due to DSCP not traversing the internet particularly well.  The solution 
>> above abstracts DSCP into ’tins’ which we put into fwmarks.  Another 
>> approach would be to put the DSCP *into* the fwmark.  CAKE could 
>> (optionally) copy the FWMARK contained DSCP into the diffserv field onto the 
>> actual packets.  Voila DSCP traversal across ’tinternet with tin/bandwidth 
>> allocation in our local domain preserved.
> 
> If I understand it right, another use case for this “very bad idea” is 
> preserving DSCP locally while traversing upstream WiFi links as besteffort, 
> which avoids airtime efficiency problems that can occur with 802.11e (WMM). 
> In cases where the router config can’t be changed (802.11e is mandatory after 
> all) I’ve used IPIP tunnels for this, as it hides DSCP from the WiFi stack 
> while preserving the values through the tunnel, but this would be easier. 
> Neat… :)

Everyone has understood the intent & maybe the implementation correctly.  2 
patches attached, one for cake, one for tc.

They are naively coded and some of it undoes Toke’s recent tidying up (sorry!)

The ingress path is dependent upon tc actions having restored the fwmark.  I 
cannot for the life of me work out how that actually happens (ie. the code that 
does it) to see if that’s all it does, or if cake could do it itself and so on.

Anyway, the code warts and all….



Cheers,

Kevin D-B

012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A


cake-add-dscp-copy-and-icing.patch
Description: cake-add-dscp-copy-and-icing.patch


my_layer_cake.qos
Description: my_layer_cake.qos


tc-cake-add-fwmark-icing-options.patch
Description: tc-cake-add-fwmark-icing-options.patch
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-04 Thread Pete Heist

> On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant 
>  wrote:
> 
> The very bad idea:
> 
> And it’s bad ‘cos it’s sort of incompatible with the existing fwmark 
> implementation as described above.  So an awful lot of our shenanigans above 
> is due to DSCP not traversing the internet particularly well.  The solution 
> above abstracts DSCP into ’tins’ which we put into fwmarks.  Another approach 
> would be to put the DSCP *into* the fwmark.  CAKE could (optionally) copy the 
> FWMARK contained DSCP into the diffserv field onto the actual packets.  Voila 
> DSCP traversal across ’tinternet with tin/bandwidth allocation in our local 
> domain preserved.

If I understand it right, another use case for this “very bad idea” is 
preserving DSCP locally while traversing upstream WiFi links as besteffort, 
which avoids airtime efficiency problems that can occur with 802.11e (WMM). In 
cases where the router config can’t be changed (802.11e is mandatory after all) 
I’ve used IPIP tunnels for this, as it hides DSCP from the WiFi stack while 
preserving the values through the tunnel, but this would be easier. Neat… :)
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-03 Thread Dave Taht
On Sun, Mar 3, 2019 at 10:37 PM Ryan Mounce  wrote:
>
> On Mon, 4 Mar 2019 at 17:01, Jonathan Morton  wrote:
> > …icing?
>
> Perfect! And to me, this functionality truly is the icing on (the)
> cake that makes it the complete bufferbloat/QoS system I've been
> dreaming of for ingress.
> ___
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake

Goforit. And while you are at it please patch in support for the new
LE bit and see if that works better than CS1?

It's in IESG now, it should have a number in 2-3 months.

https://datatracker.ietf.org/doc/draft-ietf-tsvwg-le-phb/


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-03 Thread Ryan Mounce
On Mon, 4 Mar 2019 at 17:01, Jonathan Morton  wrote:
> …icing?

Perfect! And to me, this functionality truly is the icing on (the)
cake that makes it the complete bufferbloat/QoS system I've been
dreaming of for ingress.
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-03 Thread Jonathan Morton
> On 4 Mar, 2019, at 7:37 am, Ryan Mounce  wrote:
> 
> The overwriting of the DSCP bits from fwmark could be called
> 'staining', as opposed to DSCP 'bleaching'. But this doesn't sound
> very appealing when we're baking delicious CAKE - we're in the kitchen
> not the laundry! So how about dyeing, or colouring?

…icing?

https://www.youtube.com/watch?v=ouRcDRpsRsA

 - Jonathan Morton
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-03 Thread Ryan Mounce
On Sun, 3 Mar 2019 at 22:22, Kevin Darbyshire-Bryant
 wrote:
>
> Be afraid, be very afraid.
>
> I’ve woken up with two ideas in my head, one is bad, the other is very bad.  
> The bad one is already implemented and lurking in the mine branch of my cake 
> git tree:
>
> The bad idea:
>
> An extension of the ‘fwmark’ tin allocation idea is to get cake to 
> automagically update the conntrack mark based on the DSCP tin allocation 
> chosen on egress.  That way, well behaved applications using DSCP (e.g. 
> dropbear) get their return path packets similarly classified on ingress.  
> Badly behaved applications can have iptables rules put in place to ‘manually’ 
> add fwmarks as is already done.
>
>
> The very bad idea:
>
> And it’s bad ‘cos it’s sort of incompatible with the existing fwmark 
> implementation as described above.  So an awful lot of our shenanigans above 
> is due to DSCP not traversing the internet particularly well.  The solution 
> above abstracts DSCP into ’tins’ which we put into fwmarks.  Another approach 
> would be to put the DSCP *into* the fwmark.  CAKE could (optionally) copy the 
> FWMARK contained DSCP into the diffserv field onto the actual packets.  Voila 
> DSCP traversal across ’tinternet with tin/bandwidth allocation in our local 
> domain preserved.

Ooh now you've got my attention - equal parts horrifying and
brilliant. Just making sure I have my head around this, I think it
would be perfect for my use case.

My ISP requires traffic to be washed on egress, and they wash it on
ingress. I set the DSCP in some applications and mangle some others
with iptables on the router. Either way, DSCP is present by the time
packets reach cake on egress, which dutifully uses DSCP to sort into
tins before washing. This works great, albeit with each packet for
certain flows being inefficiently mangled.

So the first change is that cake will copy the DSCP bits of a packet
into the fwmark on egress - if the fwmark hasn't already been set by
cake. This allows the impact of my iptables mangle rules to be
minimised as only packets without a cake-populated fwmark need their
DSCP to be mangled. Cake then uses the fwmark to sort packets into
tins, and in my case the DSCP is washed on egress.

Then on ingress, cake uses the same fwmark to sort packets into tins.
And cake can copy the DSCP bits back from the fwmark to packets on
ingress. Magic!

The overwriting of the DSCP bits from fwmark could be called
'staining', as opposed to DSCP 'bleaching'. But this doesn't sound
very appealing when we're baking delicious CAKE - we're in the kitchen
not the laundry! So how about dyeing, or colouring?

-Ryan
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-03-03 Thread Kevin Darbyshire-Bryant


> On 3 Mar 2019, at 12:22, John Sager  wrote:
> 
> If you are going to do that, I would suggest using a few of the upper bits
> of the 32-bit fwmark/connmark space available, rather than the lowest bits.
> Then that would allow to use fwmarks other purposes, and to use the lowest
> bits, as well in the future. As iptables allows a mask before comparison,
> then choose a specific mask for the bits you use both for setting and testing.

That’s a good point and I’m sort of hoping upstream reject the current 
submission on that basis.  I think the ‘use of fwmarks’ needs more thought as 
to how it’s done for the future - too keen to get something out.  Maybe it’s 
sufficient as is, I don’t know.

What I do know is that after implementing the ‘bad idea’ I subsequently 
implemented the very bad idea!  Using the top byte of mark, bits 5-0 are the 
DSCP, bit 6 is the ‘Cake set this’ flag, and bit 7 is left alone as it tends to 
get mis-interpreted as a sign bit.

I implemented the iptables marking chains as ‘check if cake fwmark bit set, go 
to a DSCP mangle chain if not’, mangle the DSCP, picked up by CAKE which (in 
egress mode) copies the DSCP to the fwmark’ and set the ‘cake did this’ bit. In 
theory connections only go through the dscp mangle once.

Cake in ingress mode will copy the fwmark if it’s been set by cake back into 
the diffserv field on each packet.  I think that last step should be made 
optional but I’ve had enough ‘fun’ for the day.

https://github.com/ldir-EDB0/sch_cake/commits/mine



Cheers,

Kevin D-B

012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A

___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-02-28 Thread Toke Høiland-Jørgensen
Kevin Darbyshire-Bryant  writes:

>> On 28 Feb 2019, at 09:54, Toke Høiland-Jørgensen  wrote:
>> 
>>> I also equally aware that this is ‘creeping featuritis’ and doing
>>> nothing to speed cake up…
>> 
>> Yeah, this is the crux of the issue, really: it's a tradeoff between
>> ease of use and featuritis. Now in this case the actual impact is a
>> single check it might actually be acceptable
>> 
>>> actually I may have improved BESTEFFORT a little - we no longer look
>>> for matching TC Major numbers if there’s no actual choice of tin to be
>>> made :-)
>> 
>> Well, you made the besteffort case slightly faster, but every other mode
>> slightly slower... :)
>
> Ah, ok, tc filter really is king ;-)

Well, I don't mind doing the simplification, but it should be done as a
separate commit; I'll do that after mering your pull request.

>> If you are going to send a patch (or pull request), please leave out the
>> refactoring, and only include the feature. This makes it easier to see
>> the impact of the feature addition on its own.
>
> Ha ha, sending a patch to the kernel lists???  Never again!

Haha, no, I meant to this list. I'll submit upstream, along with the
other stuff.

> The non-refactored version is
> https://github.com/ldir-EDB0/sch_cake/commit/f30bb18adc97c827c81c9a3297ab14bfaf2adcb0
> and I’ve created a PR

Great, thanks!

>> Also, I assume you have a companion patch for iproute2 somewhere?
>
> In the first email: https://github.com/ldir-EDB0/tc-adv/commits/fwmark
> - based on 4.20.0 ‘cos that’s where openwrt is at the moment.

A pull request against tc-adv would be useful :)

> When I’ve found a suitable editor I was even going to update the man
> page!

I usually just use a regular editor and copy-paste the syntax bits from
another entry...

-Toke
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-02-28 Thread Kevin Darbyshire-Bryant


> On 28 Feb 2019, at 09:54, Toke Høiland-Jørgensen  wrote:
> 
>> I also equally aware that this is ‘creeping featuritis’ and doing
>> nothing to speed cake up…
> 
> Yeah, this is the crux of the issue, really: it's a tradeoff between
> ease of use and featuritis. Now in this case the actual impact is a
> single check it might actually be acceptable
> 
>> actually I may have improved BESTEFFORT a little - we no longer look
>> for matching TC Major numbers if there’s no actual choice of tin to be
>> made :-)
> 
> Well, you made the besteffort case slightly faster, but every other mode
> slightly slower... :)

Ah, ok, tc filter really is king ;-)

> 
> If you are going to send a patch (or pull request), please leave out the
> refactoring, and only include the feature. This makes it easier to see
> the impact of the feature addition on its own.

Ha ha, sending a patch to the kernel lists???  Never again!

The non-refactored version is 
https://github.com/ldir-EDB0/sch_cake/commit/f30bb18adc97c827c81c9a3297ab14bfaf2adcb0
 and I’ve created a PR

> 
> Also, I assume you have a companion patch for iproute2 somewhere?

In the first email: https://github.com/ldir-EDB0/tc-adv/commits/fwmark - based 
on 4.20.0 ‘cos that’s where openwrt is at the moment.  When I’ve found a 
suitable editor I was even going to update the man page!

> 
> -Toke


Cheers,

Kevin D-B

012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A

___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-02-28 Thread Toke Høiland-Jørgensen
> I also equally aware that this is ‘creeping featuritis’ and doing
> nothing to speed cake up…

Yeah, this is the crux of the issue, really: it's a tradeoff between
ease of use and featuritis. Now in this case the actual impact is a
single check it might actually be acceptable

> actually I may have improved BESTEFFORT a little - we no longer look
> for matching TC Major numbers if there’s no actual choice of tin to be
> made :-)

Well, you made the besteffort case slightly faster, but every other mode
slightly slower... :)

If you are going to send a patch (or pull request), please leave out the
refactoring, and only include the feature. This makes it easier to see
the impact of the feature addition on its own.

Also, I assume you have a companion patch for iproute2 somewhere?

-Toke
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-02-28 Thread Kevin Darbyshire-Bryant


> On 27 Feb 2019, at 15:14, Toke Høiland-Jørgensen  wrote:
> 
> Kevin Darbyshire-Bryant  writes:
> 
>> How unpopular would the idea of having cake look at skb->mark directly be?
>> 
>> https://github.com/ldir-EDB0/sch_cake/commit/64d0e6ac9368a271221db888ab91a367fcd37ae1
>> 
>> https://github.com/ldir-EDB0/tc-adv/commit/4f16ae5d588d44f8a5c83fe2f2b7dcad97843cbc
> 

Hiya Toke,

OK, so it’s not an instant no :-)

> Hmm, not impossible, but seeing as we already have a way to achieve that
> with BPF, is it really needed?

Well, from a command line usability I’d say a ’no/fwmark’ option is a lot 
easier to grasp/configure than invoking anything BPF related.

Similarly, a suitable BPF program requires building and based on your below 
comments, with hard coded constants e.g. major number.  The ‘entry barrier’ is 
high, I have to write the BPF program, compile it (eBPF ‘toolchain'), maintain 
it, install it and there are certain (lack of) features that make it clunky to 
use even after you’ve done all that.

> 
>> I did the equivalent in eBPF here
>> https://github.com/ldir-EDB0/cls_bpf_connmark_to_caketin but I can’t
>> work out how to make the major number a tc command line argument into
>> the BPF code.
> 
> You can't, but you can set the major number explicitly when you create
> the qdisc:

Indeed the script I’m using does exactly that, but it means I need a ‘hard 
coded’ BPF program (or section at least) per cake instance.  Tail wagging the 
dog springs to mind.

I re-worked the code (again) anyway, (last commit 
https://github.com/ldir-EDB0/sch_cake/commits/mine )

The driver for doing any of this is primarily related to wan ingress 
classification.  DSCP can’t be trusted and can’t be manipulated (save for eBPF) 
 Neither iptables rules nor conntrack NAT lookup will have occurred so an eBPF 
program using internal addresses is challenging.  i *can* persuade tc to 
restore the connmark on ingress, so I can write old style, complicated, slow 
iptables rules to apply a connmark once on egress and have that connmark used 
for both egress & ingress aspects of that connection.

I also equally aware that this is ‘creeping featuritis’ and doing nothing to 
speed cake up…actually I may have improved BESTEFFORT a little - we no longer 
look for matching TC Major numbers if there’s no actual choice of tin to be 
made :-)


Cheers,

Kevin D-B

012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A

___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-02-27 Thread gamanakis
I think it's much simpler to use than tc-filter, BPF or even DSCP bits.
Manipulating DSCP bits seems the simplest of the currently available mechanisms 
to classify traffic. Even in this case, fwmarks are essentially simpler.
E.g. if you want to classify outgoing traffic on the LAN interface:
with DSCP you need to manipulate DSCP bits on incoming packets on the WAN 
interface. 
with fwmark you can directly mark outgoing packets on the LAN interface and 
cake will classify them appropriately.


___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-02-27 Thread Felix Resch
> How unpopular would the idea of having cake look at skb->mark directly be?

loving it
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] Using firewall connmarks as tin selectors

2019-02-27 Thread Toke Høiland-Jørgensen
Kevin Darbyshire-Bryant  writes:

> How unpopular would the idea of having cake look at skb->mark directly be?
>
> https://github.com/ldir-EDB0/sch_cake/commit/64d0e6ac9368a271221db888ab91a367fcd37ae1
>
> https://github.com/ldir-EDB0/tc-adv/commit/4f16ae5d588d44f8a5c83fe2f2b7dcad97843cbc

Hmm, not impossible, but seeing as we already have a way to achieve that
with BPF, is it really needed?

> I did the equivalent in eBPF here
> https://github.com/ldir-EDB0/cls_bpf_connmark_to_caketin but I can’t
> work out how to make the major number a tc command line argument into
> the BPF code.

You can't, but you can set the major number explicitly when you create
the qdisc:

$ sudo tc qdisc replace dev eth0 root handle 1: cake
$ tc qdisc show dev eth0
qdisc cake 1: root refcnt 2 bandwidth unlimited diffserv3 triple-isolate nonat 
nowash no-ack-filter split-gso rtt 100.0ms raw overhead 0


-Toke
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake