On Mon, 2014-03-24 at 10:09 -0700, Dave Taht wrote:
>
> It has long been my hope that conventional distros would start
> selecting sch_fq and sch_fq_codel up in safe scenarios.
>
> 1) Can an appropriate clocksource be detected from userspace?
>
> if [ have_good_clocksources ]
> then
>     if [ i am a router ]
>     then
>         sysctl -w something=fq_codel # or is it an entry in proc?
>     else
>         sysctl -w something=sch_fq
>     fi
> fi
>

Sure, you can do all this from user space. That's policy, and it does not
belong in the kernel.

sysctl -w net.core.default_qdisc=fq
# force a load/delete to bring default qdisc for all devices already up
for ETH in `list of network devices (excluding virtual devices)`
do
    tc qdisc add dev $ETH root pfifo 2>/dev/null
    tc qdisc del dev $ETH root 2>/dev/null
done

> How early in boot would this have to be to take effect?

It doesn't matter, if you force a load/unload of the qdisc.

>
> 2) In the case of a server machine providing vms, and meeting the
> above precondition(s),
> what would be a more right qdisc, sch_fq or sch_codel?

sch_fq 'works' only for locally generated traffic, as we look at
skb->sk->sk_pacing_rate to read the per socket rate. No way a hypervisor
(or a router 2 hops away) can access the original socket without hacks.

If your linux vm needs TCP pacing, then it also needs the fq packet
scheduler in the vm.

>
> 3) Containers?
>
> 4) The machine in the vm going through the virtual ethernet interface?
>
> (I don't understand to what extent tracking the exit of packets from tcp
> through the stack and vm happens - I imagine a TSO is preserved all the
> way through, and also imagine that tcp small queues doesn't survive
> transit through the vm, but I am known to have a fevered imagination.

Small Queues controls the host queues. Not the queues on external
routers. Consider a hypervisor as a router.

> >
> > Another issue is TCP CUBIC Hystart 'ACK TRAIN' detection that triggers
> > early, since goal of TSO autosizing + FQ/pacing is to get ACK clocking
> > every ms. By design, it tends to get ACK trains, way before the cwnd
> > might reach BDP.
>
> Fascinating! Push on one thing, break another. As best I recall hystart
> had a string of issues like this in its early deployment.
>
> /me looks forward to one day escaping 3.10-land and observing this for
> himself
>
> so some sort of bidirectional awareness of the underlying qdisc would be
> needed to retune hystart properly.
>
> Is ms resolution the best possible at this point?

Nope. Hystart ACK train detection is very lazy and the current algo was
kind of a hack.

If you use better resolution, then you have problems because of ACK
jitter in the reverse path. Really, only looking at the delay between
2 ACKs is not generic enough; we need something else, or we should just
disable ACK TRAIN detection, as it is not that useful. Delay detection
is less noisy.

_______________________________________________
Bloat mailing list
https://lists.bufferbloat.net/listinfo/bloat
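
For anyone wanting to try the userspace policy discussed in this thread, here
is a rough sketch, not a tested recipe. The sysctl name and the tc add/del
trick come from the reply above. Everything else is an assumption made for
illustration: the helper names (is_good_clocksource, choose_qdisc,
list_physical_devs), the APPLY guard variable, the list of clocksources
treated as "good", and the heuristic that a physical device usually has a
"device" entry under /sys/class/net/<dev> while purely virtual ones (lo,
bridges, veth) usually do not. The router/host split follows the quoted
pseudocode.

```shell
#!/bin/sh
# Sketch only: choose a default qdisc from userspace, then cycle the root
# qdisc on physical devices so the new default is picked up.

# Assumption: which clocksources count as "good" is local policy; this
# list is a placeholder, not a recommendation.
is_good_clocksource() {
    case "$1" in
        tsc|arch_sys_counter) return 0 ;;
        *) return 1 ;;
    esac
}

# $1 = clocksource name, $2 = value of net.ipv4.ip_forward
# Prints fq_codel for a forwarding box, fq otherwise (as in the quoted
# pseudocode). Returns non-zero and prints nothing for a "bad" clocksource.
choose_qdisc() {
    is_good_clocksource "$1" || return 1
    if [ "$2" = "1" ]; then
        echo fq_codel
    else
        echo fq
    fi
}

# Print network devices that look physical under a sysfs root
# ($1, default /sys). Heuristic: a "device" entry exists for them.
list_physical_devs() {
    for path in "${1:-/sys}"/class/net/*; do
        if [ -e "$path/device" ]; then
            basename "$path"
        fi
    done
    return 0
}

# Only touch the system when explicitly asked to (needs root).
if [ "${APPLY:-0}" = "1" ]; then
    cs=$(cat /sys/devices/system/clocksource/clocksource0/current_clocksource)
    fwd=$(cat /proc/sys/net/ipv4/ip_forward)
    if qdisc=$(choose_qdisc "$cs" "$fwd"); then
        sysctl -w net.core.default_qdisc="$qdisc"
        for dev in $(list_physical_devs); do
            # add/del of a throwaway root qdisc forces the default back in
            tc qdisc add dev "$dev" root pfifo 2>/dev/null
            tc qdisc del dev "$dev" root 2>/dev/null
        done
    fi
fi
```

Saved as, say, set-default-qdisc.sh (a made-up name), it does nothing unless
run as root with APPLY=1 in the environment. Note that per the reply, fq only
paces locally generated traffic, so the fq branch makes sense on end hosts
and inside VMs, not on a hypervisor forwarding guest traffic.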
