> On 7 Mar 2019, at 10:10, Toke Høiland-Jørgensen <[email protected]> wrote:

<snipping some context>
>>>> 
>>>> The valid bit is set when the ‘getdscp’ function has written a DSCP
>>>> value into the conntrack (& hence skb) mark. This allows us & other
>>>> skb->mark/ct->mark applications (eg iptables, cake qdisc) to know that
>>>> a DSCP value has been placed in the field. We cannot simply use a
>>>> non-zero DSCP because zero is a valid DSCP.
>>> 
>>> If someone installs the action, the field is supposedly always copied;
>>> so why do we need another flag?
>> 
>> I’m trying to limit the number of times expensive iptables mangle
>> rules have to run.
> 
> Right, I see your point, but I'm worried that this can risk becoming a
> source of hard-to-debug bugs if this bit happens to get set by accident
> in other places. So, I would suggest to at least make it optional (and
> configurable). So how about the following configuration options:

Phew, I explained myself clearly enough (just) that time :-). There’s a 
compromise here to be had between setting/using a DSCP for connection(fwmark) 
vs setting/using a DSCP per packet.  Mostly it’s a (improved) performance vs 
DSCP accuracy compromise and assumes DSCP isn’t going to change after the 
connection has been established - some people have rules that mangle DSCP based 
on amount of data transferred.  Personal view: anything that permits some sort 
of classification restoration on ingress has to be an improvement on what we 
have now.

> 
> - fwmark mask (32-bit value)
>  specifies a mask to apply to the fwmark field before all operations

Yes - makes sense.

> 
> - fwmark shift (valid values 0-32)
>  specifies how many bits to left-shift the DSCP values before putting
>  them into the fwmark (and how many bits to right-shift the value read
>  from fwmark before writing it to the DSCP field); this could also be
>  inferred from the mask rather than be a separate option (shift =
>  lowest set bit of mask)

I think the inferred choice is best.  I can see all manner of confusing 
behaviour with mismatches between mask & shift.  Also it’s one less parameter 
to pass … and in my dream world have to add some of these parameters into cake 
as well so it can interpret the DSCP containing fwmark as well… the fewer the 
better :-)

> 
> - get_dscp (boolean; cannot be set along with set_dscp)
>  if set, the DSCP field will be copied to the fwmark field, subject to
>  mask and shift

Yes

> 
> - set_dscp (boolean; cannot be set along with get_dscp)
>  if set, the value in fwmark will be copied to the DSCP field, subject
>  to mask and shift

Yes

> 
> - state mask (32-bit value; probably needs better name)
>  if set: the get_dscp action will OR the resulting fwmark before
>  storing it. the set_dscp value will AND the fwmark with this value
>  before doing anything, and abort if the result is false.

Nearly:  If set; the get_dscp action will AND the fwmark with this value and 
abort if true.  If false it will OR the resulting fwmark before storing it.  
the set_dscp action will AND the fwmark with this value before doing anything 
and will abort if result is false.

The ‘get_dscp action AND with fwmark and abort if true’ is the function that 
allows DSCP values to ‘stick' into a connection, thus obviating the need for 
iptables rules to mangle the DSCP on every egress packet.

I can see a conflict between people who want the DSCP copied into fwmark no 
matter what and people such as myself who wish it to only happen if fwmark 
doesn’t have it set already.

Is the solution to have a ‘get_check_state’ option that enables the ‘get_dscp 
action and with fwmark ’state mask’ abort if true check?  Maybe even simpler to 
have a ‘get_state mask’ and a ’set_state mask’ - AND the fwmark with the 
get_state mask, if false copy the dscp, else abort.

> 
> I think this would allow you to implement what you described, without
> hard-coding any behaviour. Right?

Not quite - see above.

> 
> Does anyone else have any opinions / objections to the above API?
> 
> 
>> The reality is that I enjoyed doing this in the cake codebase.  I
>> cannot say the same for act_connmark in fact I hate it so much I’m
>> stopping.  The mental effort for a non-programmer and more importantly
>> a non-kernel programmer is exhausting & I’m completely disillusioned.
>> I really need to concentrate on the job that means I can pay the
>> mortgage, which isn’t bashing my head against the kernel.
> 
> Fair enough; no reason to do this if you're not enjoying it! We can
> iterate on the API, and I guess I can write the code at some point in
> the future if no one else beats me to it. No promises on when, though,
> so if someone else feels like tackling it, please go ahead :)
> 
> -Toke



_______________________________________________
Cake mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/cake

Reply via email to