--- Begin Message ---

> On 19 Jun 2018, at 13:26, Pete Heist <[email protected]> wrote:
> 
> 
>> On Jun 19, 2018, at 1:54 PM, Toke Høiland-Jørgensen <[email protected]> wrote:
>> 
>> We also saw a bug on 32-bit MIPS where some combinations of 64-bit
>> netlink attributes would cause stats display in tc to fail. However, I
>> believe this is more a case of Cake exposing a latent bug somewhere in
>> the tc or kernel netlink code (alignment issues, perhaps?), and so I'm
>> not sure it is necessarily a blocker for merging Cake. However, if
>> someone could take a look that would be very helpful. I forget if the
>> current head of the cobalt branch exposes the bug, but I think it does.
>> It's quite obvious when it happens: no stats output whatsoever...
> 
> I have a 32-bit MIPS in my ER-X, but it sounds like what I saw (outrageous 
> refcnt values) was something different:
<snip>

Yes it was.  At one point iproute2’s tc silently promoted some values from 
32-bit to 64-bit types when printing, without updating the printf format 
specifiers to match, so on big-endian systems printf read its arguments from 
the wrong point in memory.  This crept in as part of the move to JSON output.
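
A rough model of that failure, in Python rather than C so it can be run 
anywhere.  It assumes the promoted 64-bit values sit back to back in memory 
and that the formatter still steps through them 32 bits at a time; real 
calling conventions (registers, aligned stack slots) differ per architecture, 
so the exact garbage printed by tc would differ too.  The function name and 
the sample values are made up for illustration.

```python
import struct

def read_as_u32(args_64bit, byteorder):
    """Pack 64-bit values back to back, then read the buffer in 32-bit
    steps, the way a formatter still expecting 32-bit values would."""
    prefix = ">" if byteorder == "big" else "<"
    raw = b"".join(struct.pack(prefix + "Q", v) for v in args_64bit)
    count = len(raw) // 4
    return list(struct.unpack(prefix + "%dI" % count, raw))

# Two hypothetical counters, e.g. a refcnt of 2 followed by a value of 7.
print(read_as_u32([2, 7], "little"))  # [2, 0, 7, 0]: first field looks right
print(read_as_u32([2, 7], "big"))     # [0, 2, 0, 7]: first field reads as 0
```

On little-endian the first 32-bit read happens to land on the low half of the 
value, which is why the mismatch can go unnoticed there; on big-endian it 
lands on the high half, and every later field is shifted.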

Toke took my bug report & patch and made it acceptable to upstream, where it 
now lives as:
https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=4db2ff0db46f6368d89cfb3498a700e1256d2a04
and is included in iproute2 v4.17.

> However, if there’s a way I should try to reproduce something on this 
> hardware to take a look, send any info you’ve got (how to add 64-bit netlink 
> attributes?). I even have a spare ER-X on which I could put OpenWRT in case I 
> need to be working with a more modern kernel…

The lack of stats on recent sch_cake (i.e. after commit
https://github.com/dtaht/sch_cake/commit/af1d7cde7046af55ec867b29854d754816b64bc8
of May 15th) on 32-bit MIPS, both BE & LE, is a mystery.  My hack workaround 
for my own personal OpenWrt builds is
https://github.com/ldir-EDB0/openwrt/tree/tokesiproutedebug
which also includes a debug commit from Toke.

I considered bumping OpenWrt’s master branch to point at the latest commit of 
‘cobalt’, like my build does, so we could judge from the resultant screaming 
whether only MIPS is affected or other 32-bit architectures as well.  I was 
dissuaded from doing so.

I got a little further into collecting info on this courtesy of ‘kmod-netnl’, 
which allows capture of netlink packets as if on a network interface.  The 
captures were sent to Toke IIRC, but they require hand disassembly to 
determine where the packet formatting is going wrong.  At that point 
$real_life intervened, along with some more pressing bugs to ponder, and I’ve 
not looked at it since.

OpenWrt has nearly bumped to iproute2 v4.17, but I haven’t yet got around to 
seeing if that makes any difference.  It looks like netlink_parse_nested 
cannot cope with 64-bit netlink attributes... but this requires a person who 
can code, rather than me, to go any further.
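
As an aid to that hand disassembly, here is a minimal sketch of a netlink 
attribute walker.  It assumes the standard attribute layout (a 16-bit length 
that includes the 4-byte header, a 16-bit type, and payloads padded to 4-byte 
boundaries) and flags 8-byte payloads that do not start on an 8-byte offset.  
Treating every 8-byte payload as a 64-bit value is a guess, the helper names 
are hypothetical, and none of this is a diagnosis of the actual bug.

```python
import struct

NLA_HDRLEN = 4    # u16 length + u16 type
NLA_ALIGNTO = 4   # attributes are padded to 4-byte boundaries

def nla_align(n):
    return (n + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)

def walk_attrs(buf, byteorder="big", base=0):
    """Yield (type, payload_offset, payload) for each attribute in buf.
    base is the offset of buf inside the enclosing message."""
    fmt = (">" if byteorder == "big" else "<") + "HH"
    pos = 0
    while pos + NLA_HDRLEN <= len(buf):
        length, atype = struct.unpack_from(fmt, buf, pos)
        if length < NLA_HDRLEN or pos + length > len(buf):
            raise ValueError("malformed attribute at offset %d" % (base + pos))
        payload = buf[pos + NLA_HDRLEN:pos + length]
        # The top two type bits are flags (nested / network byte order).
        yield atype & 0x3FFF, base + pos + NLA_HDRLEN, payload
        pos += nla_align(length)

def misaligned_u64(buf, byteorder="big", base=0):
    """List (type, offset) of 8-byte payloads not on an 8-byte boundary.
    Assumption: any 8-byte payload is a candidate 64-bit value."""
    return [(t, off) for t, off, p in walk_attrs(buf, byteorder, base)
            if len(p) == 8 and off % 8 != 0]

# Made-up sample: a 32-bit attribute (type 1) followed by a 64-bit one (type 2).
sample = struct.pack(">HHI", 8, 1, 42) + struct.pack(">HHQ", 12, 2, 1234)
print(misaligned_u64(sample))   # [(2, 12)]
```

Nested attributes can be handled by calling walk_attrs again on a payload, 
passing that payload’s offset as base so the reported positions stay relative 
to the start of the message.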

RE: the stalemate.  I swing between an absolute hatred of anything Linux/open 
source/mailing lists and finding some people *incredibly* helpful and thinking 
‘it’s not so bad, actually this is fun’.  A very recent example of the latter: 
I worked with David Woodhouse on a kernel PPPoATM bug (caused by a ticking 
timebomb that one E Dumazet left behind ;-) that stretched me to my absolute 
limits but was carried out in a spirit of helpfulness, curiosity & fun.  So it 
seems to be about finding the right person in kernel land who can see both the 
errors in our code and the value and effort in what we have achieved.  Maybe 
I’m being unfair and not reading the kernel mailing list environment 
correctly, but to me it comes across as abrasive at best (and I swore I’d put 
my head in a tiger’s mouth and tickle its testicles with a spanner before I 
even think of trying to submit another patch upstream).

On the other hand, I can also see that had we approached/involved the kernel 
people earlier on, some of the blind alleys we’ve travelled (I’m thinking 
passing of netlink stats here) could have been avoided.  Instead we’ve invested 
years of work and just presented a fait accompli.  I very much doubt, though, 
that early involvement would have yielded some of the layer-breaking stuff 
we’ve ended up with, and cobalt would have been much, much poorer without it.

The beauty of cake/cobalt is that it does a number of sensible things all in 
one command line (and has to work around some of Linux’s layering decisions, 
e.g. IFB).

Anyway, there’s my opinion.

KDB

> 
> _______________________________________________
> Cake mailing list
> [email protected]
> https://lists.bufferbloat.net/listinfo/cake


Cheers,

Kevin D-B

012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A

--- End Message ---