I just got back from vacation - and did not intend message 1 to come across as cranky as it did.

Certainly my take on "finishing cake" was to get more users using it and providing feedback. Getting it into LEDE mainline will bring in more users and make it possible for me and others to easily test it again. Then, once it has shown at least a few benefits, and perhaps grown or lost a few more features, push it towards mainline Linux. I applaud this. Keep at it. No matter how grouchy I sound below, I am rooting for y'all to get it right.

But: *My* principal design goal for "cake" was to pour the existing sqm-scripts into C, where it would be faster, and to scale well across a bandwidth range of 0-40Gbit+ on both older and modern hardware. Nobody else here seems to need performance much higher than a few Mbit, and that colors your viewpoints. I'd at least like to get inbound shaping to work well at 400Mbit...

Cake *was* - last June - significantly faster than htb + fq_codel, enough so to do 100Mbit inbound queue management where htb+fq_codel fell over at 60Mbit on things like the wndr3800 and archer c7v2.

No longer. It benched as slower when I last benched it (in December), with incremental sub-percentage-point improvements or disimprovements along the way, and with many issues in the codel implementation only "presumed" fixed. It was a huge percentage slower than pfifo_fast on 10GigE and higher.

At which point I gave up, went back to htb+fq_codel, and focused all my energies on building up my ability to work on wifi, where we are now showing comfortable order-of-magnitude gains and real progress.

I do see that many - undertested - features "improving" this, that, or the other thing have landed since I last paid attention.

I do fear it is expected of me and Toke to take cake through a serious string of tests. My pushback has been to ask that those working on it and testing it first use the worldwide fleet of flent servers to test the codel implementation, at least, and preferably also have a few boxes locally to test the other features, or something in the cloud, perhaps leveraging mahi-mahi or some other framework.

It will be easier for me to do drive-by tests of cake again once it hits LEDE mainline.

Open issues:

Cake still lacks a "sqm" mode - 3 tiers of shaping - which makes it impossible to benchmark properly vs the sqm-scripts. I still see no proof that more tiers help in any way, nor any testing or proof that it (or the hfsc-fq_codel stuff that landed more recently) is any better. fq_codel's natural characteristics solve for VOIP just fine, in particular. 3 tiers have been enough for every other qdisc (:cough: pfifo_fast, mqprio) since the dawn of linux time.

I also preferred to statically generate the parameters for each diffserv related model, saving tons of code AND resulting in shared data for it (increasingly important with hw mq).

I also thought ISPs and some users would want a more strict prio queue model available, similar to what free.fr is using, which makes managing tv multicast easier.

I think the quantum should be even more dynamic than it is today, scaling up to 3028 as sch_fq does (say, starting at 200-500Mbit), and it should go back to peeling less hard. I am aware I am the one that ripped it out (in favor of testing better what we had)...

I do not see any proof that the triple isolation mode for torrents does any better than the regular mode, against real torrents - or against any other form of normal multi-user traffic. Somebody prove that cake's mode for this actually makes a difference, please.
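
To make the quantum suggestion above concrete, here is a rough sketch in Python. This is not cake's code. The function name, the linear ramp, and the base quantum of 1514 bytes (one full Ethernet frame) are all my assumptions for illustration; only the 3028 ceiling and the "say, 200-500Mbit" range come from the text above.

```python
# Hypothetical sketch: grow the per-flow quantum from one full-size frame
# to two (3028 bytes, the ceiling mentioned above) as the shaped rate
# climbs from 200 Mbit to 500 Mbit. All names and the linear ramp are
# assumptions for illustration, not cake's actual logic.

BASE_QUANTUM = 1514                 # assumed: one full Ethernet frame, in bytes
MAX_QUANTUM = 3028                  # ceiling named in the text
RAMP_START_BPS = 200_000_000        # assumed start of the ramp (200 Mbit)
RAMP_END_BPS = 500_000_000          # assumed end of the ramp (500 Mbit)


def flow_quantum(rate_bps: int) -> int:
    """Return a per-flow quantum in bytes for a given shaped rate."""
    if rate_bps <= RAMP_START_BPS:
        # Below the ramp, whatever low-rate scaling already exists applies;
        # this sketch simply returns the base quantum.
        return BASE_QUANTUM
    if rate_bps >= RAMP_END_BPS:
        return MAX_QUANTUM
    # Integer linear interpolation between the two endpoints.
    span = RAMP_END_BPS - RAMP_START_BPS
    extra = (MAX_QUANTUM - BASE_QUANTUM) * (rate_bps - RAMP_START_BPS) // span
    return BASE_QUANTUM + extra


if __name__ == "__main__":
    for mbit in (100, 200, 350, 500, 1000):
        print(mbit, "Mbit ->", flow_quantum(mbit * 1_000_000), "bytes")
```

Whether a linear ramp is the right shape, and where it should start, is exactly the kind of thing that needs benchmarking at higher rates before anyone believes it.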

I thought the invsqrt cache was pointless, and most of the other tweaks to codel needed testing and evaluation, and all of them cost cpu. Register usage was poor on arm and mips architectures.

I did not see a functional use for the rate estimator. I felt nearly all of the statistics collection could be dropped.

In terms of API - the rate limiter does not work above 40Gbit.

*All* the new hardware I have played with of late does 4 or more hardware queues (on inbound and outbound), and finding ways to handle those within a single qdisc across those cpus sounds like an increasingly good idea. They are doing that for CPU efficiency, not QoS. That said, even basic support for BQL has been lacking in those arches (I'm looking at you, linksys ac1200!)

And I'd hoped that sane ways of leveraging cake from an ISP's perspective would emerge, which seems to involve lots more tc or iptables magic yet to be written.

Conceptually I do love the idea of a set associative cache, but as for actual measurements of its helpfulness, I have very little to show for its benefits thus far.

Please keep banging the rocks together, and *please* benchmark the thing at higher rates and RTTs.
