That first bug report looks decidedly similar to mine, but Toke would have to 
comment on the specifics. So far I see the patch to sch_codel.c you mentioned 
and another two-liner to remove the warning in hfsc.c 
(https://patchwork.ozlabs.org/patch/933611/). It would be really good to know 
that the warning is truly bogus and wasn’t put there by the author for 
good reason, as Toke may have been thinking of a different way to fix hfsc.

Thanks for bringing this up! I see that I ought to search OpenWRT/kernel.org 
next time… :)

> On Jan 5, 2019, at 8:27 PM, Sebastian Moeller <[email protected]> wrote:
> 
> Dear all,
> 
> I am most likely wrong, but did you have a look at 
> https://bugs.openwrt.org/index.php?do=details&task_id=1136 yet?
> Especially https://bugzilla.kernel.org/show_bug.cgi?id=109581 and 
> https://www.spinics.net/lists/netdev/msg450655.html might be related to 
> Pete's bug.
> Then again, I might be wrong as the whole flurry of emails went past my head 
> quickly.
> 
> Best Regards
>       Sebastian
> 
> 
>> On Jan 5, 2019, at 17:32, Toke Høiland-Jørgensen <[email protected]> wrote:
>> 
>> Pete Heist <[email protected]> writes:
>> 
>>>> On Jan 5, 2019, at 2:35 PM, Toke Høiland-Jørgensen <[email protected]> wrote:
>>>> 
>>>> Pete Heist <[email protected]> writes:
>>>> 
>>>>>> On Jan 5, 2019, at 2:10 PM, Toke Høiland-Jørgensen <[email protected]> wrote:
>>>>>> 
>>>>>> Hmm, that's odd. Could you try adding this debugging line in
>>>>>> adjust_parent_qlen(), right before the sch->q.qlen += n line:
>>>>>> 
>>>>>>          net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d",
>>>>>>                               parentid, n, len);
>>>>>> 
>>>>>> And see if you actually get any of those lines in your dmesg?
>>>>> 
>>>>> I do see the messages twice, then not after that in the rest of the
>>>>> output...
>>>> 
>>>> Right. Looking at the HFSC code some more, I think the bug is actually
>>>> caused by another, but related, interaction between HFSC and CAKE.
>>>> 
>>>> Specifically, this line:
>>>> 
>>>> https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1605
>>>> 
>>>> where HFSC checks whether the child queue len is 1, which it interprets
>>>> as the event that activates that queue. However, because CAKE splits the
>>>> packet, this check will fail, and the HFSC class will not be activated.
>>>> This also explains why you only see the bug with HFSC, and not with HTB
>>>> (although I do think that we still need to update the hierarchy).
>>>> 
>>>> The good news is that it is fairly simple to fix in HFSC. The bad news
>>>> is that it's something that's hard to work around from the out-of-tree
>>>> CAKE...
>>> 
>>> Aha, well, I wonder if we’ll see this problem with other qdiscs, maybe
>>> cbq, if I ever get a chance to try it (not hurrying yet). Ideally this
>>> interaction between qdiscs would be clarified somewhere, at some
>>> point. :)
>>> 
>>> Thanks a lot for doing the discovery though!
>> 
>> You're welcome, and thanks for your help :)
>> 
>>> We may not have hfsc+cake with GSO splitting on older kernels very
>>> soon, but what should we do with this? There’s nobody in MAINTAINERS
>>> for hfsc, so we may not get much of a response to any bug
>>> submissions...
>> 
>> $ ./scripts/get_maintainer.pl net/sched/sch_hfsc.c 
>> Jamal Hadi Salim <[email protected]> (maintainer:TC subsystem)
>> Cong Wang <[email protected]> (maintainer:TC subsystem)
>> Jiri Pirko <[email protected]> (maintainer:TC subsystem)
>> "David S. Miller" <[email protected]> (maintainer:NETWORKING [GENERAL])
>> [email protected] (open list:TC subsystem)
>> 
>> I'll submit a patch sometime next week, and also look into the qlen
>> adjustment for CAKE GSO splitting...
>> 
>> -Toke
>> _______________________________________________
>> Cake mailing list
>> [email protected]
>> https://lists.bufferbloat.net/listinfo/cake
> 
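
The interaction Toke describes in the quoted thread (HFSC treating a child queue length of exactly 1 as the activation event, and CAKE splitting one packet into several) can be illustrated with a toy model. This is only a sketch in plain C, not kernel code: every name below (toy_child, toy_child_enqueue, and the two check functions) is hypothetical, and the "was empty before enqueue" variant is an assumed alternative for comparison, not necessarily the fix that was eventually submitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */

/* Hypothetical child qdisc: one enqueued packet may become several
 * queued segments, as happens when a GSO packet is split. */
struct toy_child {
    unsigned int qlen;      /* segments currently queued */
    unsigned int segments;  /* segments produced per enqueued packet */
};

static void toy_child_enqueue(struct toy_child *child)
{
    assert(child->segments >= 1);
    child->qlen += child->segments;
}

/* Activation check as described in the thread: after the child enqueue,
 * a queue length of exactly 1 is taken to mean the queue just became busy. */
static bool toy_activated_by_qlen_check(unsigned int segments)
{
    struct toy_child child = { .qlen = 0, .segments = segments };

    toy_child_enqueue(&child);
    return child.qlen == 1;
}

/* Assumed alternative: record whether the child was empty before the
 * enqueue, so the result does not depend on how many segments it queued. */
static bool toy_activated_by_was_empty_check(unsigned int segments)
{
    struct toy_child child = { .qlen = 0, .segments = segments };
    bool was_empty = (child.qlen == 0);

    toy_child_enqueue(&child);
    return was_empty;
}
```

With one segment per packet, the queue-length check reports activation. With two or more segments it does not, while the was-empty variant reports activation in both cases. That matches the symptom in the thread, where the class is never activated once the child splits the first packet.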
