Pete Heist <[email protected]> writes:

> I don’t know if we want to call this an issue, but...
>
> I’m seeing a lockup with cake (and also sfq, but not either pfifo or
> fq_codel), when run over veth devices. Two network namespaces are
> created, one for client and one for server, each with one veth device.
> Netem is added as the root qdisc with a delay of 1ms, and a leaf qdisc
> may be added. Lockups occur on my box when the leaf qdisc is either
> cake or sfq, and I'm running flent’s tcp_ndown test with >= 4 download
> streams. Note that I happen to be running on a quad-core.
>
> - If no leaf qdisc is added below netem, no lockup occurs.
> - If either pfifo or fq_codel is added below netem, no lockup occurs.
> - If either cake or sfq is the leaf, the lockup occurs.
>
> The symptoms (lockup with >= 4 streams on a quad-core box), and the
> fact that it occurs with both cake and sfq, make me think that it may
> simply have to do with the code not being re-entrant, which may be the
> case for veth, and this is just by design? maybe something that we
> should consider fixing but wouldn’t be a show-stopper? But that should
> be confirmed.
>
> I’ll keep investigating, but am sharing the scripts I’m running
> meanwhile in case anyone else wants to look. See README.txt in the
> attached...

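
For anyone reading without the attachment, the setup described above can be sketched roughly as below. This is not Pete's script, only a guess at its shape: the namespace names, device names, addresses and the handle numbers are made-up placeholders, and the helper build_commands() is hypothetical. It only prints the ip/tc commands, it does not run them. The leaf is attached at parent 1:1, which is the form the netem documentation uses.

```python
def build_commands(leaf=None, delay="1ms"):
    """Return ip/tc commands for a two-namespace veth test bed.

    leaf: optional leaf qdisc placed below netem, e.g. "sfq", "cake",
          "pfifo" or "fq_codel". None means netem only.
    """
    # Namespace names, device names and addresses are assumptions,
    # chosen for illustration only.
    ends = [
        ("client", "veth-c", "10.9.9.1/24"),
        ("server", "veth-s", "10.9.9.2/24"),
    ]
    cmds = [f"ip netns add {ns}" for ns, _, _ in ends]
    # One veth pair, one end moved into each namespace.
    cmds.append("ip link add veth-c type veth peer name veth-s")
    for ns, dev, addr in ends:
        run = f"ip netns exec {ns}"
        cmds.append(f"ip link set {dev} netns {ns}")
        cmds.append(f"{run} ip addr add {addr} dev {dev}")
        cmds.append(f"{run} ip link set {dev} up")
        # netem as the root qdisc with the delay from the report.
        cmds.append(
            f"{run} tc qdisc add dev {dev} root handle 1: netem delay {delay}"
        )
        if leaf:
            # Optional leaf qdisc below netem.
            cmds.append(
                f"{run} tc qdisc add dev {dev} parent 1:1 handle 10: {leaf}"
            )
    return cmds


if __name__ == "__main__":
    # Print the commands for one of the cases reported to lock up.
    for cmd in build_commands(leaf="sfq"):
        print(cmd)
```

Calling build_commands() with no leaf gives the netem-only case, and swapping "sfq" for "pfifo" or "fq_codel" gives the cases reported not to lock up. Running the printed commands needs root, and the flent tcp_ndown run itself is not covered here.
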
Thanks for investigating! I'll take a look later. The fact that it
happens with sfq as well means it's probably not cake-specific, though,
so I don't think we should hold off on the upstream submission until
we've figured it out. Using leaf qdiscs with netem has been dodgy for a
while IIRC...

-Toke
_______________________________________________
Cake mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/cake
