Kevin Darbyshire-Bryant <[email protected]> writes:

>> On 2 Jul 2018, at 17:59, Kevin Darbyshire-Bryant via Cake 
>> <[email protected]> wrote:
>> 
>> 
>> From: Kevin Darbyshire-Bryant <[email protected]>
>> Subject: Re: [Cake] Cake on openwrt - falling behind
>> Date: 2 July 2018 at 17:59:36 BST
>> To: Toke Høiland-Jørgensen <[email protected]>
>> Cc: Pete Heist <[email protected]>, Cake List <[email protected]>
>> 
>> 
>> 
>> 
>>> On 2 Jul 2018, at 17:14, Toke Høiland-Jørgensen <[email protected]> wrote:
>>> 
>>> Pete Heist <[email protected]> writes:
>>> 
>>>>> On Jul 2, 2018, at 2:03 PM, Toke Høiland-Jørgensen <[email protected]> wrote:
>>>>> 
>>>>> Pete Heist <[email protected]> writes:
>>>>> 
>>>>>> But, um, I find it curious that tb[TCA_PAD] has valid looking values
>>>>>> in it, and if I just go:
>>>>>> 
>>>>>> tb[TCA_STATS2] = tb[TCA_PAD]
>>>>>> 
>>>>>> right after parse_rtattr is called, I start getting tin stats printed
>>>>>> that look valid. There should be zeroes or invalid values at
>>>>>> tb[TCA_PAD], as that’s just supposed to be padding, right?
>>>>> 
>>>>> Hmm, that's interesting. Sounds like you are on the right track. What
>>>>> are the numerical values of TCA_STATS2 and TCA_PAD in the kernel and
>>>>> iproute2, respectively?
>>>> 
>>>> I never would’ve guessed they could be different when compiled at once
>>>> :) but true, there are at least five different versions of rtnetlink.h
>>>> under build_dir based on their md5sums, and I see one belongs to
>>>> iproute2-full. In all of those, it looks in the source like
>>>> TCA_STATS2=7 and TCA_PAD=9.
>>> 
>>> Aha! I think I figured out what is going on:
>>> 
>>> The gen_stats facility will add an nlattr header at the beginning of the
>>> qdisc stats, which is the toplevel TLV that contains all stats (and that
>>> we put our stats inside). It stores a reference to this header, and when
>>> all the per-qdisc callbacks have finished adding their stats, it goes
>>> back and fixes up the length of the containing header.
>>> 
>>> The problem is that on architectures that need padding, the padding TLV
>>> is added *first*, which means that the nlattr pointer that is stored
>>> before the callbacks are performed points to the padding TLV and not the
>>> stats TLV. And so, when the header is fixed up, the result (from the
>>> parser's perspective) is just a very big padding TLV.
>>> 
>>> The options TLV is before the stats TLV, so the bug only occurs if the
>>> options happen to have a length that means the stats will need padding.
>>> Which is why messing with the number of options "fixes" the bug.
>>> 
>>> Could you try applying the patch below (to the kernel) and see if that
>>> resolves the issue, please?
>>> 
>>> -Toke
>>> 
>>> 
>>> @@ -77,8 +77,12 @@ gnet_stats_start_copy_compat(struct sk_buff *skb, int 
>>> type, int tc_stats_type,
>>>             d->lock = lock;
>>>             spin_lock_bh(lock);
>>>     }
>>> -   if (d->tail)
>>> -           return gnet_stats_copy(d, type, NULL, 0, padattr);
>>> +   if (d->tail) {
>>> +           int ret = gnet_stats_copy(d, type, NULL, 0, padattr);
>>> +           if (!ret)
>>> +                   d->tail = ((struct nlattr *)skb_tail_pointer(skb)) - 1;
>>> +           return ret;
>>> +   }
>>> 
>>>     return 0;
>>> }
>> o/cake
>
> Would you like me to bash this into openwrt? Plans for this to go
> upstream? Perhaps now take off list or onto IRC?

I'll submit a patch for netdev. If you want to pack it up and submit it
for openwrt, that would be helpful! :)

-Toke
_______________________________________________
Cake mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/cake

Reply via email to