Hi Sebastian et al,

I'm feeling a bit unwell at the moment with an eye infection and I'm working nights on some tennis coverage for TV so the brain cell is somewhat addled.

It is indeed the missing puzzle piece and represents something of a holy grail for my use case. A *lot* of credit has to go to 'tegularius', who took the idea and ran with it after I'd given up. My only consolation is that the methods are broadly similar; the current implementation is so much neater and is obviously written in a more kernel/conntrack-knowledgeable way (based on net/sched/cls_flow.c).

This really needs to be tested. As I mentioned, the 'ingress' side of things is harder work because the kernel hasn't filled in the conntrack pointer for us, so we have to do our own lookup, and there are some remaining concerns over how reliable that lookup actually is. The conntrack entry 'direction' is apparently determined by where the connection is seen first; 2 tuples are then created, one for the 'original' direction and one for the 'reply' direction. This made me think that a connection initiated by the router vs a connection initiated from outside into it (even if natted) would have the src & destination fields swapped...however in my limited testing 'who started the connection' appeared to make no difference. But conntrack makes my brain cell hurt.
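
For anyone else whose brain cell hurts, here is a toy Python model of the two-tuple idea. It is only a sketch under assumptions: the names (`FlowTuple`, `Conn`, `build_table`, `denat_ingress`) are made up for illustration, ports are ignored, the addresses are documentation examples, and none of this is the kernel's conntrack API or the actual sch_cake code. It shows one way an ingress lookup could recover the internal host regardless of who started the connection: find the entry by whichever tuple matches the packet as seen on the WAN side, then read the addresses from the tuple for the opposite direction, swapped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowTuple:
    src: str
    dst: str


@dataclass(frozen=True)
class Conn:
    original: FlowTuple  # direction in which the connection was first seen
    reply: FlowTuple     # what return traffic is expected to look like


def build_table(conns):
    """Index every connection by both of its tuples (toy stand-in for the hash)."""
    table = {}
    for conn in conns:
        table[conn.original] = (conn, "original")
        table[conn.reply] = (conn, "reply")
    return table


def denat_ingress(table, wire):
    """Map a packet seen on the WAN side (before NAT) to LAN-side addresses."""
    hit = table.get(wire)
    if hit is None:
        return wire  # not tracked: leave the addresses alone
    conn, direction = hit
    # Take the tuple for the opposite direction and swap src/dst.
    other = conn.reply if direction == "original" else conn.original
    return FlowTuple(src=other.dst, dst=other.src)


# Hypothetical router with WAN address 203.0.113.1.
# 1) LAN host 192.168.1.10 starts a masqueraded connection to 8.8.8.8.
outbound = Conn(original=FlowTuple("192.168.1.10", "8.8.8.8"),
                reply=FlowTuple("8.8.8.8", "203.0.113.1"))
# 2) Outside host 198.51.100.7 connects in, port-forwarded to 192.168.1.20.
inbound = Conn(original=FlowTuple("198.51.100.7", "203.0.113.1"),
               reply=FlowTuple("192.168.1.20", "198.51.100.7"))

table = build_table([outbound, inbound])

print(denat_ingress(table, FlowTuple("8.8.8.8", "203.0.113.1")))
print(denat_ingress(table, FlowTuple("198.51.100.7", "203.0.113.1")))
```

In both cases the destination that comes back is the internal host, which in this toy model at least is consistent with 'who started the connection' making no difference in my testing.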

I'm sure there are people on this list who are a) much cleverererer than me and b) know conntrack upside down & backwards. Help is as ever gratefully received.

Regarding IPv6 vs IPv4: As it currently stands the code does conntrack lookups for both, so if someone is translating IPv6 addresses then we know about it. I'm now thinking about making IPv6 lookups a runtime option (default off). From a flow/host fairness point of view I really don't care if a one to one address translation has occurred...and if someone really does implement a 'masquerading many hosts behind one IPv6 address' environment...and they still want per IP & per flow fairness then unmentionable things should be done to them.

I'm not a fan of de-natting by default. Per IP fairness is not the default, and de-natting is only relevant when at least one of the 'dual-???host' or 'triple-isolate' options is in use. I also have concerns about CPU usage.

CPU usage is difficult to quantify. As a rough guesstimate, my Archer C7 used about 10% CPU per megabit. I'd say that has gone up by 2% with this change, so it is heavy!

The code is out there, if you've an itch...scratch it :-) Fork it, improve it etc but please don't think I'm any sort of kernel guru :-)

Incidentally, an obvious gaming of this: a host that has both IPv4 & v6 addresses can get at least double the bandwidth of a host with only one of them. It's per IP fairness really, not per host.
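
A back-of-envelope sketch of that arithmetic in Python, assuming every address is busy and each address gets an equal share of the link. The function name and the host/address values are made up for illustration; this is not how cake actually computes anything.

```python
from fractions import Fraction


def host_shares(addresses_by_host):
    """Share of the link per host when fairness is applied per IP address."""
    total = sum(len(addrs) for addrs in addresses_by_host.values())
    return {host: Fraction(len(addrs), total)
            for host, addrs in addresses_by_host.items()}


# Hypothetical LAN: one dual-stack host and one IPv4-only host.
shares = host_shares({
    "dual_stack": ["192.168.1.10", "2001:db8::10"],
    "v4_only": ["192.168.1.11"],
})

for host, share in shares.items():
    print(host, share)
```

With these inputs the dual-stack host ends up with two thirds of the link and the IPv4-only host with one third.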


On 26/09/16 09:54, moeller0 wrote:
Hi Kevin,

this is like the missing puzzle piece, if you solved this, most home users 
might end up deep in your debt (without them realizing it of course).
Question, if I enable this on my link how will it deal with the typical 
differences between IPv4 and IPv6? I believe that the situation I have at home, 
NAT for IPv4 but no NAT for IPv6 (or if NAT, at least NAT with identifying last 
64 bits of the IPv6 addresses, no port remapping games) is quite common 
nowadays. I assume it will do the right thing for IPv4 but will it still do the 
right thing for IPv6 flows as well? And what if for $DEITY’s sake someone would 
insist on using a port-remapping NAT on IPv6?
If, as I assume, it does the right thing by default, I would vote for 
enabling this by default and introducing keywords to disable it if required 
(which I assume to be one of cake’s main ideas: use reasonable defaults that in 
general do the right thing, but also allow crazy stuff if need be).
Do you have any idea how expensive this is computationally? I realize that this 
is a tad hard to measure as cake will not simply reduce the available bandwidth 
when running out of CPU cycles but first will allow the latency to increase.

Best Regards

On Sep 26, 2016, at 05:20 , Kevin Darbyshire-Bryant 
<ke...@darbyshire-bryant.me.uk> wrote:


A while back I started on a quest to make cake 'nat' aware as the lack of host 
fairness in a typical home router environment was the only thing that prevented 
cake from being the ultimate qdisc in my opinion.  This involves dealing with 
conntrack which on egress is easy (the kernel fills in a data structure for 
us), ingress is less clear.  I hacked something together but wasn't really 
happy with it.

Another github user 'tegularius' presented some beautifully crafted code that did 
the lookups in a much neater way.  Originally it too had an 'ingress' lookup 
problem.  This was worked on and I hacked some conditional 'denat' options into 
cake & tc.

For your 'delight' a denat cake 
https://github.com/kdarbyshirebryant/sch_cake/tree/natoptions along with a 
matching tc https://github.com/kdarbyshirebryant/tc-adv/tree/denat

Typically I use 'dual-srchost srcnat' options on the egress interface, with 
'dual-dsthost dstnat' in the ingress ifb interface.  In *brief* testing, 
bandwidth is shared fairly between hosts, and fairly by flow within each host.  
And it's not crashed yet.

Cake mailing list
