That first bug report looks decidedly similar to mine, but Toke would have to comment on the specifics. So far I see the patch to sch_codel.c you mentioned and another two-liner to remove the warning in hfsc.c (https://patchwork.ozlabs.org/patch/933611/). It would be really good to know that the warning is truly bogus and wasn’t put there by the author for good reason, since Toke may have been thinking of a different way to fix hfsc.
Thanks for bringing this up! I see that I ought to search OpenWRT/kernel.org next time… :)

> On Jan 5, 2019, at 8:27 PM, Sebastian Moeller <[email protected]> wrote:
>
> Dear all,
>
> I am most likely wrong, but did you have a look at
> https://bugs.openwrt.org/index.php?do=details&task_id=1136 yet?
> Especially https://bugzilla.kernel.org/show_bug.cgi?id=109581 and
> https://www.spinics.net/lists/netdev/msg450655.html might be related to
> Pete's bug.
> Then again, I might be wrong as the whole flurry of emails went past my head
> quickly.
>
> Best Regards
> Sebastian
>
>
>> On Jan 5, 2019, at 17:32, Toke Høiland-Jørgensen <[email protected]> wrote:
>>
>> Pete Heist <[email protected]> writes:
>>
>>>> On Jan 5, 2019, at 2:35 PM, Toke Høiland-Jørgensen <[email protected]> wrote:
>>>>
>>>> Pete Heist <[email protected]> writes:
>>>>
>>>>>> On Jan 5, 2019, at 2:10 PM, Toke Høiland-Jørgensen <[email protected]> wrote:
>>>>>>
>>>>>> Hmm, that's odd. Could you try adding this debugging line in
>>>>>> adjust_parent_qlen(), right before the sch->q.qlen += n line:
>>>>>>
>>>>>> net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d",
>>>>>>                      parentid, n, len);
>>>>>>
>>>>>> And see if you actually get any of those lines in your dmesg?
>>>>>
>>>>> I do see the messages twice, then not after that in the rest of the
>>>>> output...
>>>>
>>>> Right. Looking at the HFSC code some more, I think the bug is actually
>>>> caused by another, but related, interaction between HFSC and CAKE.
>>>>
>>>> Specifically, this line:
>>>>
>>>> https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1605
>>>>
>>>> where HFSC checks whether the child queue len is 1, which it interprets
>>>> as the event that activates that queue. However, because CAKE splits the
>>>> packet, this check will fail, and the HFSC class will not be activated.
>>>> This also explains why you only see the bug with HFSC, and not with HTB
>>>> (although I do think that we still need to update the hierarchy).
>>>>
>>>> The good news is that it is fairly simple to fix in HFSC. The bad news
>>>> is that it's something that's hard to work around from the out-of-tree
>>>> CAKE...
>>>
>>> Aha, well, I wonder if we’ll see this problem with other qdiscs, maybe
>>> cbq, if I ever get a chance to try it (not hurrying yet). Ideally this
>>> interaction between qdiscs would be clarified somewhere, at some
>>> point. :)
>>>
>>> Thanks a lot for doing the discovery though!
>>
>> You're welcome, and thanks for your help :)
>>
>>> We may not have hfsc+cake with GSO splitting on older kernels very
>>> soon, but what should we do with this? There’s nobody in MAINTAINERS
>>> for hfsc, so we may not get much of a response to any bug
>>> submissions...
>>
>> $ ./scripts/get_maintainer.pl net/sched/sch_hfsc.c
>> Jamal Hadi Salim <[email protected]> (maintainer:TC subsystem)
>> Cong Wang <[email protected]> (maintainer:TC subsystem)
>> Jiri Pirko <[email protected]> (maintainer:TC subsystem)
>> "David S. Miller" <[email protected]> (maintainer:NETWORKING [GENERAL])
>> [email protected] (open list:TC subsystem)
>>
>> I'll submit a patch sometime next week, and also look into the qlen
>> adjustment for CAKE GSO splitting...
>>
>> -Toke
>> _______________________________________________
>> Cake mailing list
>> [email protected]
>> https://lists.bufferbloat.net/listinfo/cake
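
The activation problem described in the quoted thread can be modeled in a few lines of user-space C. This is only a toy sketch under stated assumptions, not the kernel code: every toy_* name is hypothetical, the child qdisc is reduced to a packet counter, and the "check before enqueue" variant is just one possible way to detect the empty-to-non-empty transition, not necessarily the fix Toke has in mind.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names are hypothetical, nothing here is kernel API. */
struct toy_child {
    unsigned int qlen;   /* packets currently queued in the child qdisc */
};

struct toy_class {
    struct toy_child child;
    bool active;         /* has the parent put this class on its schedule? */
};

/* One enqueue call may queue several packets if the child splits a GSO skb. */
static void toy_child_enqueue(struct toy_child *c, unsigned int segments)
{
    assert(segments >= 1);
    c->qlen += segments;
}

/* Check as described in the thread: activate only if qlen is exactly 1
 * after the child enqueue. A split packet jumps from 0 straight past 1. */
static void toy_enqueue_check_after(struct toy_class *cl, unsigned int segments)
{
    toy_child_enqueue(&cl->child, segments);
    if (cl->child.qlen == 1)
        cl->active = true;
}

/* Assumed alternative: record whether the child was empty before the
 * enqueue, so the number of packets added no longer matters. */
static void toy_enqueue_check_before(struct toy_class *cl, unsigned int segments)
{
    bool was_empty = (cl->child.qlen == 0);

    toy_child_enqueue(&cl->child, segments);
    if (was_empty)
        cl->active = true;
}
```

Starting from a zeroed struct toy_class each time, toy_enqueue_check_after(&cl, 1) activates the class, while toy_enqueue_check_after(&cl, 3) leaves it inactive with three packets queued, which mirrors the stall seen with HFSC. toy_enqueue_check_before() activates the class in both cases.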
