I had a chance to look over this patch last night.
http://caia.swin.edu.au/freebsd/aqm/patches/dummynet-aqm-patch-0.1-freebsd11-r295345.patch
My comments are by no means complete.
The BSD implementation has a few substantial differences from the Linux
implementation, but I'm not sure if further clarifying the internet
draft is needed or not.
I think section C1 below is potentially the only thing majorly wrong
with the BSD implementation, and even there maybe I'm just not reading
the code right. Well... section F (ecn support) also looks very wrong.
A) The BSD fq_codel implementation has 1 ms (jiffies) timestamp
resolution. (The ns2 one was floating point, the linux version uses
(nanosec >> 10), and cake uses nsec.[1])
In the draft we merely suggest a resolution "less than the target".
I am a little uncomfortable with 1 ms resolution as those trying to use
codel in DCs have used target and interval settings of 500us and 10ms.
Have not seen anyone publish anything using that however...
We have no real data on coarser resolutions. The initial tests at low
rates with this code appeared to work, but I'd be interested
in seeing behaviors at 100mbit+.
Also: does freebsd have no higher precision timers available?
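To put numbers on that worry, here is a rough sketch (mine, not from the
patch) of what a 500us target turns into at each resolution. The helper
names and the CODEL_SHIFT constant are made up for illustration; the
shift of 10 just mirrors the (nanosec >> 10) scaling mentioned above.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from the patch. */
#define NSEC_PER_USEC 1000ULL
#define NSEC_PER_MSEC 1000000ULL
#define CODEL_SHIFT   10   /* (nanosec >> 10): one unit is 1024 ns */

/* Convert a nanosecond interval to whole 1 ms ticks (truncating). */
static uint32_t ns_to_ms_ticks(uint64_t ns)
{
    return (uint32_t)(ns / NSEC_PER_MSEC);
}

/* Convert a nanosecond interval to ~1 us units using the shift. */
static uint32_t ns_to_shifted_units(uint64_t ns)
{
    return (uint32_t)(ns >> CODEL_SHIFT);
}
```

With these, ns_to_ms_ticks(500 * NSEC_PER_USEC) comes out as 0, while
ns_to_shifted_units() of the same value gives 488, so a sub-millisecond
target cannot be represented at all at 1 ms resolution.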
B) This initial BSD fq_codel implementation only extracts the 5 tuple
from tcp and udp packets on ipv6 and ipv4.
As this is the minimum suggested in the draft... and covers a huge
percentage of internet traffic, I'm cool with this.
Over time, however, the linux version has peered more deeply into the 5
tuple, covering encapsulations for ipv6, v4, gre0, and udplite, sctp, etc.
Otherwise it falls back to merely hashing the src/dst ips, as does the
bsd version, which is quite ok in many scenarios.
Recent additions to the linux hashing api include vlan ids and mpls
support, but those have not quite made it to the userspace tc utility.
We touch upon the potential need for hashing on alternate fields in the
draft and I would hope to see similar extensions emerge for the bsd
version over time.
udplite support, in particular, is a one liner to add to the relevant
IPPROTO_UDP case statements.
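Roughly what I mean, as a standalone sketch rather than the patch's
actual classifier: proto_has_ports() is a made-up helper name, and the
fallback #define is only there in case the system headers lack
IPPROTO_UDPLITE. UDP-Lite keeps its ports in the same place as UDP, so
it can share the case.

```c
#include <stdint.h>
#include <netinet/in.h>   /* IPPROTO_TCP, IPPROTO_UDP */

#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE 136   /* IANA protocol number for UDP-Lite */
#endif

/* Hypothetical helper: returns 1 if the transport ports should be
 * folded into the flow hash for this protocol, 0 to fall back to
 * hashing only the src/dst addresses. */
static int proto_has_ports(uint8_t proto)
{
    switch (proto) {
    case IPPROTO_TCP:
    case IPPROTO_UDP:
    case IPPROTO_UDPLITE:   /* the one-line addition */
        return 1;
    default:
        return 0;
    }
}
```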
C) BSD fq_codel's hash is better than linux's default (until recently)
as it uses all the fields in ipv6 sanely.
Linux's original construction of the hash for ipv6 was very poor. It
has improved a few times since, and a subset of the new hashing API in
linux 4.2 and later is equivalent to this BSD implementation.
Again, so long as a good hash with avalanche properties is used, and
combined with a permuted hash, we're good. The jenkins hash used here
and in linux is good. But:
C1) I am not sure what HASHINIT does. I otherwise do not see code that
permutes the hash. (I know full well that the random permutation on
setup requirement in fq_codel makes it hard to have repeatable tests! -
but for shipping code it should be there)
D) The ipv6 support could be improved in that ipv6 can contain multiple
headers and the last decodable one could be used.
E) count is 16 bits here, 32 bits in linux. Damned if I know what the
net effect of that is. We used saturating arithmetic in cake but it's
really hard to hit it at 32 bits.
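By saturating arithmetic I mean something like the following sketch.
The function names are made up, and this is not a claim about what the
patch does with its 16 bit count; it only shows the difference between
letting the counter wrap and pinning it at its maximum.

```c
#include <stdint.h>

/* Plain increment: a 16 bit count of 65535 wraps around to 0. */
static uint16_t count_inc_wrap(uint16_t count)
{
    return (uint16_t)(count + 1);
}

/* Saturating increment: the count sticks at 65535 instead of wrapping. */
static uint16_t count_inc_sat(uint16_t count)
{
    return count == UINT16_MAX ? count : (uint16_t)(count + 1);
}
```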
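F) I think the ecn support is wrong in that an ECT packet that has just
been marked needs to be returned here, where instead, no matter what,
another packet is dequeued. When ecn_mark(m) succeeds, m is neither
freed nor returned before it is overwritten, so this leaks memory and
never does anything sane with ecn.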
while (now >= cst->drop_next_time && cst->dropping) {
+
+	if (!(cprms->flags & CODEL_ECN_ENABLED) || !ecn_mark(m)) {
+		update_stats(q, 0, 1);
+		FREE_PKT(m);
+	}
+
+	m = codel_dodequeue(q, now, &ok_to_drop);
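The control flow I would expect instead, as a toy model and not a patch
against the real code: struct pkt, toy_ecn_mark() and mark_or_drop() are
all invented here, and the freed flag merely stands in for FREE_PKT(m).
The point is only that a successfully marked packet is kept and handed
back, and a new packet is dequeued only after a real drop.

```c
/* Toy packet: ect = ECN capable, ce = congestion experienced mark. */
struct pkt {
    int ect;
    int ce;
    int freed;
};

/* Mark the packet CE if it is ECN capable; returns 1 if it was marked. */
static int toy_ecn_mark(struct pkt *p)
{
    if (!p->ect)
        return 0;
    p->ce = 1;
    return 1;
}

/* One pass of the dropping-state loop body.
 * Returns 1 if p was marked and must be returned to the caller (stop
 * looping), 0 if p was dropped and the next packet should be dequeued. */
static int mark_or_drop(struct pkt *p, int ecn_enabled)
{
    if (ecn_enabled && toy_ecn_mark(p))
        return 1;       /* keep p: deliver it with the CE mark set */
    p->freed = 1;       /* stands in for update_stats() + FREE_PKT() */
    return 0;
}
```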
I do have the ability to run freebsd in a VM, but my last attempt at
building a kernel for it was a disaster. If there is a freebsd kernel
binary to try, I can try that....
[1] Cake uses nsec in the codel portion only because it needed a high
precision timestamp for the shaper portion of the code and the now
variable is reused throughout.
_______________________________________________
aqm mailing list
https://www.ietf.org/mailman/listinfo/aqm