[Bug 276890] Getting fq_codel correct on inbound shaping

2024-02-24 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276890

--- Comment #5 from Dave Taht  ---

I am a little dubious about the "active" idea.

http://fxr.watson.org/fxr/source/netpfil/ipfw/dn_sched_fq_codel.c#L413

Also.

    if (!si->flows[idx].active ) {
        STAILQ_INSERT_TAIL(&si->newflows, &si->flows[idx], flowchain);
        si->flows[idx].deficit = param->quantum;
        si->flows[idx].cst.dropping = false;
        si->flows[idx].cst.first_above_time = 0;
        si->flows[idx].active = 1;
        //D("activate %d",idx);
    }

In the linux version we do not touch the codel state variables at this phase at
all, but retain the previous settings. It may be that dropping and
first_above_time get set in roughly the same way in my version, but perhaps if
I describe the intent of what should happen, I too will understand this code
better. 

When a flow goes out of an "active" state, none of its state needs to be
reset. The overall goal of fq_codel is to reduce the total delay in all the
queues to the target (usually 5ms). It maintains a cache of the last "good"
drop rate.

If a single flow, out of dozens, has an arrival rate like this:

A  AA A

And that is still too much relative to the other flows, it needs to get more
drops. 

Anyway, instead of checking for or maintaining an active or inactive "state" we
just check to see if queue length > 0.
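
A minimal sketch of that intent in C, assuming a simplified per-flow structure. The names struct flow, struct codel_state and flow_enqueue are hypothetical and are neither the dummynet nor the Linux code. On enqueue, an idle flow (queue length 0) gets a fresh deficit, while the CoDel variables keep whatever values they had when the flow last drained. A real scheduler must additionally make sure the flow is not already linked on a new or old flow list before inserting it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified CoDel state; field names follow the quoted code. */
struct codel_state {
    bool     dropping;          /* currently in the dropping state */
    uint64_t first_above_time;  /* when sojourn time first exceeded target */
    uint32_t count;             /* remembered drop count ("good" drop rate) */
};

/* Hypothetical per-flow record: no explicit "active" flag is kept. */
struct flow {
    struct codel_state cst;
    int      deficit;           /* DRR deficit in bytes */
    uint32_t qlen;              /* packets currently queued */
};

/*
 * Account for one packet being enqueued on flow f.
 * Returns true if the flow was idle (qlen == 0) and was re-armed.
 * The CoDel state in f->cst is deliberately left untouched.
 */
static bool
flow_enqueue(struct flow *f, int quantum)
{
    bool was_idle;

    assert(f != NULL);
    was_idle = (f->qlen == 0);
    if (was_idle) {
        /* Only the scheduler state is refreshed for a returning flow. */
        f->deficit = quantum;
    }
    f->qlen++;
    return was_idle;
}
```

With this shape, a flow that keeps returning in short bursts carries its previous drop state forward instead of starting from scratch on each burst.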

Just saving my state here on this subtlety. Also, we use the global queue
length, not the per-queue length, to turn off the global dropper, and I still
have to go looking through this code to see if that part is correct.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 276890] Getting fq_codel correct on inbound shaping

2024-02-24 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276890

--- Comment #4 from Dave Taht  ---
Assuming I am looking at a correct source tree:

http://fxr.watson.org/fxr/source/netpfil/ipfw/dn_sched_fq_codel.c#L330

Logging an overflow event twice, and to the console, while under stress, is not
a good idea.

    if (mainq->ni.length > schk->cfg.limit) { D("over limit");
        /* find first active flow */
        for (maxidx = 0; maxidx < schk->cfg.flows_cnt; maxidx++)
            if (si->flows[maxidx].active)
                break;
        if (maxidx < schk->cfg.flows_cnt) {
            /* find the largest sub- queue */
            for (i = maxidx + 1; i < schk->cfg.flows_cnt; i++)
                if (si->flows[i].active &&
                    si->flows[i].stats.length >
                    si->flows[maxidx].stats.length)
                    maxidx = i;
            codel_drop_head(&si->flows[maxidx], si);
            D("maxidx = %d",maxidx);
            drop = 1;
        }
    }

I would delete both D() calls here. Even then, there are two things we ended
up doing in the Linux version:

1) We added the drop_batch facility (and optimized it) to drop up to 64
packets at a time from the head (all but the last; it helps to always deliver
at least the last packet in the queue).

It was *very* expensive under a packet flood to hit this limit, search the flow
list, then drop a single packet.

2) Merely dropping the packet, without also telling the AQM to drop harder on
its own, led to this spot being hit persistently. So we also incremented the
codel count variable on this drop, in the expectation that the AQM would
eventually catch up.

3) It is possible to keep extra state around to always track the fattest queue
(see fq_codel_fast) and eliminate this non-O(1) search, at the cost of tracking
the fattest queue elsewhere. The expectation is that in normal use, this
routine is rarely hit.
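
Points 1 and 2 can be sketched roughly as follows, in C. This is an illustration of the description above under simplifying assumptions, not the Linux or dummynet implementation: struct flow, fattest_flow, overlimit_drop and DROP_BATCH_MAX are hypothetical names, packets are modelled only as a per-flow counter, and the fattest flow is still found with a linear scan (point 3 would replace that scan with tracked state).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DROP_BATCH_MAX 64   /* batch size taken from the comment above */

/* Hypothetical, simplified flow: only the counters this sketch needs. */
struct flow {
    uint32_t qlen;          /* packets queued in this flow */
    uint32_t codel_count;   /* CoDel drop count for this flow */
};

/* Linear scan for the longest flow; returns nflows if every flow is empty. */
static size_t
fattest_flow(const struct flow *flows, size_t nflows)
{
    size_t best = nflows;
    uint32_t best_len = 0;

    assert(flows != NULL);
    for (size_t i = 0; i < nflows; i++) {
        if (flows[i].qlen > best_len) {
            best_len = flows[i].qlen;
            best = i;
        }
    }
    return best;
}

/*
 * Over-limit handler: drop up to DROP_BATCH_MAX packets from the head of
 * the fattest flow in one pass, always leaving its last packet queued,
 * and raise that flow's CoDel count by the number of drops so the AQM
 * drops harder afterwards. A flow holding a single packet is left alone.
 * Returns the number of packets dropped.
 */
static uint32_t
overlimit_drop(struct flow *flows, size_t nflows)
{
    size_t idx = fattest_flow(flows, nflows);
    struct flow *f;
    uint32_t n;

    if (idx == nflows)
        return 0;               /* nothing queued anywhere */
    f = &flows[idx];
    n = f->qlen - 1;            /* keep the last packet */
    if (n > DROP_BATCH_MAX)
        n = DROP_BATCH_MAX;
    f->qlen -= n;               /* stands in for n head drops */
    f->codel_count += n;        /* tell the AQM to catch up */
    return n;
}
```

The batch amortizes the cost of the search over many drops, so a packet flood pays for one scan per batch rather than one scan per dropped packet.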



[Bug 276890] Getting fq_codel correct on inbound shaping

2024-02-24 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276890

--- Comment #3 from Dave Taht  ---
One of the issues with the FreeBSD implementation appears to be too much
logging:

https://forum.opnsense.org/index.php?topic=39046.0



[Bug 276890] Getting fq_codel correct on inbound shaping

2024-02-13 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276890

--- Comment #2 from Dave Taht  ---
I got some packet captures from an opnsense user, who also turned LRO on and
off on my behalf. One of the test runs was decidedly odd in that it missed the
target for far, far too long. The cause could be just about anything,
including a path change or a TCP bug. I will plot it later, and poke deeper
into the packet captures.

http://london.starlink.taht.net/freebsd/ecn/

The tests with ECN on also showed a known issue with coping with slow start on
short RTTs.



[Bug 276890] Getting fq_codel correct on inbound shaping

2024-02-09 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276890

Mark Johnston  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ma...@freebsd.org
             Status|New                         |Open

--- Comment #1 from Mark Johnston  ---
> I am not a freebsd developer, and the effort required for me to "get in 
> there" and test (or worse, patch) is a bit much (I live on a boat nowadays).  
> What I would like is to find someone that can run 3 simple tests on my 
> behalf, and give me the packet captures, so I can verify correctness.

I suspect you'll have more luck by asking for help on the FreeBSD-net mailing
list.  That'll get you much more access to users who would be willing to run
your test.



[Bug 276890] Getting fq_codel correct on inbound shaping

2024-02-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276890

Mark Linimon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|b...@freebsd.org            |n...@freebsd.org
