Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-16 Thread Roman Yeryomin
On 16 May 2016 at 19:04, Dave Taht  wrote:
> On Mon, May 16, 2016 at 1:14 AM, Roman Yeryomin  wrote:
>> On 16 May 2016 at 01:34, Roman Yeryomin  wrote:
>>> On 6 May 2016 at 22:43, Dave Taht  wrote:
 On Fri, May 6, 2016 at 11:56 AM, Roman Yeryomin  
 wrote:
> On 6 May 2016 at 21:43, Roman Yeryomin  wrote:
>> On 6 May 2016 at 15:47, Jesper Dangaard Brouer  wrote:
>>>
>>> I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
>>> closed Felix's OpenWRT email account (bad choice! emails are bouncing).
>>> It sounds like OpenWRT and the LEDE project (https://www.lede-project.org/)
>>> are in some kind of conflict.
>>>
>>> OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349
>>>
>>> [2] 
>>> http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335
>>
>> OK, so, after porting the patch to 4.1 openwrt kernel and playing a
>> bit with fq_codel limits I was able to get 420Mbps UDP like this:
>> tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256
>
> Forgot to mention, I've reduced drop_batch_size down to 32

 0) Not clear to me if that's the right line, there are 4 wifi queues,
 and the third one
 is the BE queue.
>>>
>>> That was just an example, sorry, I should have stated that. I've applied the
>>> same settings to all 4 queues.
>>>
 That is too low a limit, also, for normal use. And:
 for the purpose of this particular UDP test, flows 16 is ok, but not
 ideal.
>>>
>>> I played with different combinations, it doesn't make any
>>> (significant) difference: 20-30Mbps, not more.
>>> What numbers would you propose?
>>>
 1) What's the tcp number (with a simultaneous ping) with this latest 
 patchset?
 (I care about tcp performance a lot more than udp floods - surviving a
 udp flood yes, performance, no)
>>>
>>> During the test (both TCP and UDP) it's roughly 5ms on average; when not
>>> running tests, ~2ms. Actually I'm now wondering if target is working at
>>> all, because I had the same result with target 80ms...
>>> So, yes, latency is good, but performance is poor.
>>>
 before/after?

 tc -s qdisc show dev wlan0 during/after results?
>>>
>>> during the test:
>>>
>>> qdisc mq 0: root
>>>  Sent 1600496000 bytes 1057194 pkt (dropped 1421568, overlimits 0 requeues 
>>> 17)
>>>  backlog 1545794b 1021p requeues 17
>>> qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
>>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 0
>>> qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
>>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 0
>>> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
>>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>>  Sent 1601271168 bytes 1057706 pkt (dropped 1422304, overlimits 0 requeues 
>>> 17)
>>>  backlog 1541252b 1018p requeues 17
>>>   maxpacket 1514 drop_overlimit 1422304 new_flow_count 35 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 1
>>> qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
>>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 0
>>>
>>>
>>> after the test (60sec):
>>>
>>> qdisc mq 0: root
>>>  Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 
>>> 28)
>>>  backlog 0b 0p requeues 28
>>> qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
>>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 0
>>> qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
>>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 0
>>> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
>>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>>  Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 
>>> 28)
>>>  backlog 0b 0p requeues 28
>>>   maxpacket 1514 drop_overlimit 2770176 

Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-16 Thread Dave Taht
On Mon, May 16, 2016 at 1:14 AM, Roman Yeryomin  wrote:
> On 16 May 2016 at 01:34, Roman Yeryomin  wrote:
>> On 6 May 2016 at 22:43, Dave Taht  wrote:
>>> On Fri, May 6, 2016 at 11:56 AM, Roman Yeryomin  
>>> wrote:
 On 6 May 2016 at 21:43, Roman Yeryomin  wrote:
> On 6 May 2016 at 15:47, Jesper Dangaard Brouer  wrote:
>>
>> I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
>> closed Felix's OpenWRT email account (bad choice! emails are bouncing).
>> It sounds like OpenWRT and the LEDE project (https://www.lede-project.org/)
>> are in some kind of conflict.
>>
>> OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349
>>
>> [2] 
>> http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335
>
> OK, so, after porting the patch to 4.1 openwrt kernel and playing a
> bit with fq_codel limits I was able to get 420Mbps UDP like this:
> tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256

 Forgot to mention, I've reduced drop_batch_size down to 32
>>>
>>> 0) Not clear to me if that's the right line, there are 4 wifi queues,
>>> and the third one
>>> is the BE queue.
>>
>> That was just an example, sorry, I should have stated that. I've applied the
>> same settings to all 4 queues.
>>
>>> That is too low a limit, also, for normal use. And:
>>> for the purpose of this particular UDP test, flows 16 is ok, but not
>>> ideal.
>>
>> I played with different combinations, it doesn't make any
>> (significant) difference: 20-30Mbps, not more.
>> What numbers would you propose?
>>
>>> 1) What's the tcp number (with a simultaneous ping) with this latest 
>>> patchset?
>>> (I care about tcp performance a lot more than udp floods - surviving a
>>> udp flood yes, performance, no)
>>
>> During the test (both TCP and UDP) it's roughly 5ms on average; when not
>> running tests, ~2ms. Actually I'm now wondering if target is working at
>> all, because I had the same result with target 80ms...
>> So, yes, latency is good, but performance is poor.
>>
>>> before/after?
>>>
>>> tc -s qdisc show dev wlan0 during/after results?
>>
>> during the test:
>>
>> qdisc mq 0: root
>>  Sent 1600496000 bytes 1057194 pkt (dropped 1421568, overlimits 0 requeues 
>> 17)
>>  backlog 1545794b 1021p requeues 17
>> qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 1601271168 bytes 1057706 pkt (dropped 1422304, overlimits 0 requeues 
>> 17)
>>  backlog 1541252b 1018p requeues 17
>>   maxpacket 1514 drop_overlimit 1422304 new_flow_count 35 ecn_mark 0
>>   new_flows_len 0 old_flows_len 1
>> qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>>
>>
>> after the test (60sec):
>>
>> qdisc mq 0: root
>>  Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 
>> 28)
>>  backlog 0b 0p requeues 28
>> qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 
>> 28)
>>  backlog 0b 0p requeues 28
>>   maxpacket 1514 drop_overlimit 2770176 new_flow_count 64 ecn_mark 0
>>   new_flows_len 0 old_flows_len 1
>> qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms 

Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-16 Thread Roman Yeryomin
On 16 May 2016 at 01:34, Roman Yeryomin  wrote:
> On 6 May 2016 at 22:43, Dave Taht  wrote:
>> On Fri, May 6, 2016 at 11:56 AM, Roman Yeryomin  
>> wrote:
>>> On 6 May 2016 at 21:43, Roman Yeryomin  wrote:
 On 6 May 2016 at 15:47, Jesper Dangaard Brouer  wrote:
>
> I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
> closed Felix's OpenWRT email account (bad choice! emails are bouncing).
> It sounds like OpenWRT and the LEDE project (https://www.lede-project.org/)
> are in some kind of conflict.
>
> OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349
>
> [2] 
> http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335

 OK, so, after porting the patch to 4.1 openwrt kernel and playing a
 bit with fq_codel limits I was able to get 420Mbps UDP like this:
 tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256
>>>
>>> Forgot to mention, I've reduced drop_batch_size down to 32
>>
>> 0) Not clear to me if that's the right line, there are 4 wifi queues,
>> and the third one
>> is the BE queue.
>
> That was just an example, sorry, I should have stated that. I've applied the
> same settings to all 4 queues.
>
>> That is too low a limit, also, for normal use. And:
>> for the purpose of this particular UDP test, flows 16 is ok, but not
>> ideal.
>
> I played with different combinations, it doesn't make any
> (significant) difference: 20-30Mbps, not more.
> What numbers would you propose?
>
>> 1) What's the tcp number (with a simultaneous ping) with this latest 
>> patchset?
>> (I care about tcp performance a lot more than udp floods - surviving a
>> udp flood yes, performance, no)
>
> During the test (both TCP and UDP) it's roughly 5ms on average; when not
> running tests, ~2ms. Actually I'm now wondering if target is working at
> all, because I had the same result with target 80ms...
> So, yes, latency is good, but performance is poor.
>
>> before/after?
>>
>> tc -s qdisc show dev wlan0 during/after results?
>
> during the test:
>
> qdisc mq 0: root
>  Sent 1600496000 bytes 1057194 pkt (dropped 1421568, overlimits 0 requeues 17)
>  backlog 1545794b 1021p requeues 17
> qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>   new_flows_len 0 old_flows_len 0
> qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>   new_flows_len 0 old_flows_len 0
> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 1601271168 bytes 1057706 pkt (dropped 1422304, overlimits 0 requeues 17)
>  backlog 1541252b 1018p requeues 17
>   maxpacket 1514 drop_overlimit 1422304 new_flow_count 35 ecn_mark 0
>   new_flows_len 0 old_flows_len 1
> qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>   new_flows_len 0 old_flows_len 0
>
>
> after the test (60sec):
>
> qdisc mq 0: root
>  Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 28)
>  backlog 0b 0p requeues 28
> qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>   new_flows_len 0 old_flows_len 0
> qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>   new_flows_len 0 old_flows_len 0
> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 28)
>  backlog 0b 0p requeues 28
>   maxpacket 1514 drop_overlimit 2770176 new_flow_count 64 ecn_mark 0
>   new_flows_len 0 old_flows_len 1
> qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>   new_flows_len 0 old_flows_len 0
>
>
>> IF you are 

Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-15 Thread Roman Yeryomin
On 16 May 2016 at 02:07, Eric Dumazet  wrote:
> On Mon, 2016-05-16 at 01:34 +0300, Roman Yeryomin wrote:
>
>> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 1601271168 bytes 1057706 pkt (dropped 1422304, overlimits 0 requeues 
>> 17)
>>  backlog 1541252b 1018p requeues 17
>>   maxpacket 1514 drop_overlimit 1422304 new_flow_count 35 ecn_mark 0
>>   new_flows_len 0 old_flows_len 1
>
> Why do you have ce_threshold set? You really should not (even if it
> does not matter for the kind of traffic you have at this moment).

No idea, it was always there. How do I unset it? Setting it to 0 doesn't help.

> If your expected link speed is around 1Gbps, or 80,000 packets per
> second, then you have to understand that a 1024-packet limit is about 12
> ms at most.
>
> Even if the queue is full, max sojourn time of a packet would be 12 ms.
>
> I really do not see how 'target 80 ms' could be hit.

Well, as I said, I've tried different options. Neither target 20ms (as
Dave proposed) nor 12ms saves the situation.

> You basically have FQ, with no Codel effect, but with the associated
> cost of Codel (having to take timestamps)
>
>
>


Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-15 Thread Eric Dumazet
On Mon, 2016-05-16 at 01:34 +0300, Roman Yeryomin wrote:

> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>  Sent 1601271168 bytes 1057706 pkt (dropped 1422304, overlimits 0 requeues 17)
>  backlog 1541252b 1018p requeues 17
>   maxpacket 1514 drop_overlimit 1422304 new_flow_count 35 ecn_mark 0
>   new_flows_len 0 old_flows_len 1

Why do you have ce_threshold set? You really should not (even if it
does not matter for the kind of traffic you have at this moment).

If your expected link speed is around 1Gbps, or 80,000 packets per
second, then you have to understand that a 1024-packet limit is about 12
ms at most.
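
(For reference, the arithmetic: 1024 packets / 80,000 packets per second is
about 12.8 ms; equivalently, 1024 full-size packets * 1514 bytes * 8 bits is
roughly 12.4 Mbit, which drains in about 12.4 ms at 1Gbit/s, well below the
80 ms target.)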

Even if the queue is full, max sojourn time of a packet would be 12 ms.

I really do not see how 'target 80 ms' could be hit.

You basically have FQ, with no Codel effect, but with the associated
cost of Codel (having to take timestamps)





Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-15 Thread Roman Yeryomin
On 7 May 2016 at 12:57, Kevin Darbyshire-Bryant
 wrote:
>
>
> On 06/05/16 10:42, Jesper Dangaard Brouer wrote:
>> Hi Felix,
>>
>> This is an important fix for OpenWRT, please read!
>>
>> OpenWRT changed the default fq_codel sch->limit from 10240 to 1024,
>> without also adjusting q->flows_cnt.  Eric explains below that you must
>> also adjust the buckets (q->flows_cnt) for this not to break. (Just
>> adjust it to 128)
>>
>> Problematic OpenWRT commit in question:
>>  http://git.openwrt.org/?p=openwrt.git;a=patch;h=12cd6578084e
>>  12cd6578084e ("kernel: revert fq_codel quantum override to prevent it from 
>> causing too much cpu load with higher speed (#21326)")
> I 'pull requested' this to the lede-staging tree on github.
> https://github.com/lede-project/staging/pull/11
>
> One way or another Felix & co should see the change :-)

If you follow the white rabbit, you will see that it doesn't help

>>
>>
>> I also highly recommend you cherry-pick this very recent commit:
>>  net-next: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
>>  https://git.kernel.org/davem/net-next/c/9d18562a227
>>
>> This should fix very high CPU usage in case fq_codel goes into drop mode.
>> The problem is that drop mode was considered rare, and implementation-wise
>> it was chosen to be more expensive (to save cycles on normal mode).
>> Unfortunately it is easy to trigger with a UDP flood. Drop mode is
>> especially expensive for smaller devices, as it scans a 4KB array,
>> thus 64 cache misses for small devices!
>>
>> The fix is to allow drop-mode to bulk-drop more packets when entering
>> drop-mode (default 64 bulk drop).  That way we don't suddenly
>> experience a significantly higher processing cost per packet, but
>> instead can amortize this.
> I haven't done the above cherry-pick patch & backport patch creation for
> 4.4/4.1/3.18 yet - maybe if $dayjob permits time and no one else beats
> me to it :-)
>
> Kevin
>


Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-15 Thread Roman Yeryomin
On 6 May 2016 at 22:43, Dave Taht  wrote:
> On Fri, May 6, 2016 at 11:56 AM, Roman Yeryomin  wrote:
>> On 6 May 2016 at 21:43, Roman Yeryomin  wrote:
>>> On 6 May 2016 at 15:47, Jesper Dangaard Brouer  wrote:

 I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
 closed Felix's OpenWRT email account (bad choice! emails are bouncing).
 It sounds like OpenWRT and the LEDE project (https://www.lede-project.org/)
 are in some kind of conflict.

 OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349

 [2] 
 http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335
>>>
>>> OK, so, after porting the patch to 4.1 openwrt kernel and playing a
>>> bit with fq_codel limits I was able to get 420Mbps UDP like this:
>>> tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256
>>
>> Forgot to mention, I've reduced drop_batch_size down to 32
>
> 0) Not clear to me if that's the right line, there are 4 wifi queues,
> and the third one
> is the BE queue.

That was just an example, sorry, I should have stated that. I've applied the
same settings to all 4 queues.
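
In concrete terms that was just the same tc line repeated for each mq child,
roughly like this (a sketch only; the parent handles :1 through :4 are assumed
to match the per-queue handles shown in the tc -s output below):

for i in 1 2 3 4; do
    tc qdisc replace dev wlan0 parent :$i fq_codel flows 16 limit 256
done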

> That is too low a limit, also, for normal use. And:
> for the purpose of this particular UDP test, flows 16 is ok, but not
> ideal.

I played with different combinations, it doesn't make any
(significant) difference: 20-30Mbps, not more.
What numbers would you propose?

> 1) What's the tcp number (with a simultaneous ping) with this latest patchset?
> (I care about tcp performance a lot more than udp floods - surviving a
> udp flood yes, performance, no)

During the test (both TCP and UDP) it's roughly 5ms on average; when not
running tests, ~2ms. Actually I'm now wondering if target is working at
all, because I had the same result with target 80ms...
So, yes, latency is good, but performance is poor.

> before/after?
>
> tc -s qdisc show dev wlan0 during/after results?

during the test:

qdisc mq 0: root
 Sent 1600496000 bytes 1057194 pkt (dropped 1421568, overlimits 0 requeues 17)
 backlog 1545794b 1021p requeues 17
qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
target 80.0ms ce_threshold 32us interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
target 80.0ms ce_threshold 32us interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
target 80.0ms ce_threshold 32us interval 100.0ms ecn
 Sent 1601271168 bytes 1057706 pkt (dropped 1422304, overlimits 0 requeues 17)
 backlog 1541252b 1018p requeues 17
  maxpacket 1514 drop_overlimit 1422304 new_flow_count 35 ecn_mark 0
  new_flows_len 0 old_flows_len 1
qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
target 80.0ms ce_threshold 32us interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0


after the test (60sec):

qdisc mq 0: root
 Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 28)
 backlog 0b 0p requeues 28
qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
target 80.0ms ce_threshold 32us interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
target 80.0ms ce_threshold 32us interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
target 80.0ms ce_threshold 32us interval 100.0ms ecn
 Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 28)
 backlog 0b 0p requeues 28
  maxpacket 1514 drop_overlimit 2770176 new_flow_count 64 ecn_mark 0
  new_flows_len 0 old_flows_len 1
qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
target 80.0ms ce_threshold 32us interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0


> IF you are doing builds for the archer c7v2, I can join in on this... (?)

I'm not but I have c7 somewhere, so I can do a build for it and also
test, so we are on the same page.

> I did do a test of the ath10k "before", fq_codel *never engaged*, and
> tcp induced latencies 

Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-07 Thread Kevin Darbyshire-Bryant


On 06/05/16 10:42, Jesper Dangaard Brouer wrote:
> Hi Felix,
>
> This is an important fix for OpenWRT, please read!
>
> OpenWRT changed the default fq_codel sch->limit from 10240 to 1024,
> without also adjusting q->flows_cnt.  Eric explains below that you must
> also adjust the buckets (q->flows_cnt) for this not to break. (Just
> adjust it to 128)
>
> Problematic OpenWRT commit in question:
>  http://git.openwrt.org/?p=openwrt.git;a=patch;h=12cd6578084e
>  12cd6578084e ("kernel: revert fq_codel quantum override to prevent it from 
> causing too much cpu load with higher speed (#21326)")
I 'pull requested' this to the lede-staging tree on github.
https://github.com/lede-project/staging/pull/11

One way or another Felix & co should see the change :-)
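
For anyone wanting to try the recommended combination by hand in the meantime,
the runtime equivalent should be roughly the following (a sketch only; the
interface name and per-queue parent handles are placeholders, and 1024/128
gives the 8 packets per bucket that Eric suggested elsewhere in the thread as
a minimum for Codel to have a chance to trigger):

for i in 1 2 3 4; do
    tc qdisc replace dev wlan0 parent :$i fq_codel limit 1024 flows 128
done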
>
>
> I also highly recommend you cherry-pick this very recent commit:
>  net-next: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
>  https://git.kernel.org/davem/net-next/c/9d18562a227
>
> This should fix very high CPU usage in case fq_codel goes into drop mode.
> The problem is that drop mode was considered rare, and implementation-wise
> it was chosen to be more expensive (to save cycles on normal mode).
> Unfortunately it is easy to trigger with a UDP flood. Drop mode is
> especially expensive for smaller devices, as it scans a 4KB array,
> thus 64 cache misses for small devices!
>
> The fix is to allow drop-mode to bulk-drop more packets when entering
> drop-mode (default 64 bulk drop).  That way we don't suddenly
> experience a significantly higher processing cost per packet, but
> instead can amortize this.
I haven't done the above cherry-pick patch & backport patch creation for
4.4/4.1/3.18 yet - maybe if $dayjob permits time and no one else beats
me to it :-)

Kevin





Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-06 Thread Dave Taht
On Fri, May 6, 2016 at 11:56 AM, Roman Yeryomin  wrote:
> On 6 May 2016 at 21:43, Roman Yeryomin  wrote:
>> On 6 May 2016 at 15:47, Jesper Dangaard Brouer  wrote:
>>>
>>> I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
>>> closed Felix's OpenWRT email account (bad choice! emails are bouncing).
>>> It sounds like OpenWRT and the LEDE project (https://www.lede-project.org/)
>>> are in some kind of conflict.
>>>
>>> OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349
>>>
>>> [2] 
>>> http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335
>>
>> OK, so, after porting the patch to 4.1 openwrt kernel and playing a
>> bit with fq_codel limits I was able to get 420Mbps UDP like this:
>> tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256
>
> Forgot to mention, I've reduced drop_batch_size down to 32

0) Not clear to me if that's the right line, there are 4 wifi queues,
and the third one
is the BE queue. That is too low a limit, also, for normal use. And:
for the purpose of this particular UDP test, flows 16 is ok, but not
ideal.

1) What's the tcp number (with a simultaneous ping) with this latest patchset?
(I care about tcp performance a lot more than udp floods - surviving a
udp flood yes, performance, no)

before/after?

tc -s qdisc show dev wlan0 during/after results?

IF you are doing builds for the archer c7v2, I can join in on this... (?)

I did do a test of the ath10k "before": fq_codel *never engaged*, and
tcp-induced latencies under load at 100mbit cracked 600ms, while
staying flat (20ms) at 100mbit (not the same patches you are testing),
on x86. I have got 300Mbit of tcp out of an osx box, with similar latency,
and have yet to get anything more on anything I currently have,
before/after patchsets.

I'll go add flooding to the tests, I just finished a series comparing
two different speed stations and life was good on that.

"before" - fq_codel never engages, we see seconds of latency under load.

root@apu2:~# tc -s qdisc show dev wlp4s0
qdisc mq 0: root
 Sent 8570563893 bytes 6326983 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514
target 5.0ms interval 100.0ms ecn
 Sent 2262 bytes 17 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514
target 5.0ms interval 100.0ms ecn
 Sent 220486569 bytes 152058 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 18168 drop_overlimit 0 new_flow_count 1 ecn_mark 0
  new_flows_len 0 old_flows_len 1
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514
target 5.0ms interval 100.0ms ecn
 Sent 8340546509 bytes 6163431 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 68130 drop_overlimit 0 new_flow_count 120050 ecn_mark 0
  new_flows_len 1 old_flows_len 3
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514
target 5.0ms interval 100.0ms ecn
 Sent 9528553 bytes 11477 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 66 drop_overlimit 0 new_flow_count 1 ecn_mark 0
  new_flows_len 1 old_flows_len 0


>> This is certainly better than 30Mbps but still less than half of what it
>> was before (900).

The number that I am still not sure we got: were you sending 900mbit udp
and receiving 900mbit on the prior tests?

>> TCP also improved a little (550 to ~590).

The limit is probably a bit low, also.  You might want to try target
20ms as well.
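
For example, something along these lines (a sketch only; the exact limit is a
judgment call, and :3 is the BE queue handle from your tc output):

tc qdisc replace dev wlan0 parent :3 fq_codel flows 16 limit 1024 target 20ms interval 100ms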

>>
>> Felix, others, do you want to see the ported patch, maybe I did something 
>> wrong?
>> Doesn't look like it will save ath10k from performance regression.

what was tcp "before"? (I'm sorry, such a long thread)

>>
>>>
>>> On Fri, 6 May 2016 11:42:43 +0200
>>> Jesper Dangaard Brouer  wrote:
>>>
 Hi Felix,

 This is an important fix for OpenWRT, please read!

 OpenWRT changed the default fq_codel sch->limit from 10240 to 1024,
 without also adjusting q->flows_cnt.  Eric explains below that you must
 also adjust the buckets (q->flows_cnt) for this not to break. (Just
 adjust it to 128)

 Problematic OpenWRT commit in question:
  http://git.openwrt.org/?p=openwrt.git;a=patch;h=12cd6578084e
  12cd6578084e ("kernel: revert fq_codel quantum override to prevent it 
 from causing too much cpu load with higher speed (#21326)")


 I also highly recommend you cherry-pick this very recent commit:
  net-next: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
  https://git.kernel.org/davem/net-next/c/9d18562a227

 This should fix very high CPU usage in case fq_codel goes into drop mode.
 The problem is that drop mode was considered rare, and implementation

Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-06 Thread Roman Yeryomin
On 6 May 2016 at 21:43, Roman Yeryomin  wrote:
> On 6 May 2016 at 15:47, Jesper Dangaard Brouer  wrote:
>>
>> I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
>> closed Felix's OpenWRT email account (bad choice! emails are bouncing).
>> It sounds like OpenWRT and the LEDE project (https://www.lede-project.org/)
>> are in some kind of conflict.
>>
>> OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349
>>
>> [2] 
>> http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335
>
> OK, so, after porting the patch to 4.1 openwrt kernel and playing a
> bit with fq_codel limits I was able to get 420Mbps UDP like this:
> tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256

Forgot to mention, I've reduced drop_batch_size down to 32
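
If the iproute2 in the build is new enough to carry the knob added by commit
9d18562a2278, the same thing can also be set at runtime instead of patching
the kernel default; a sketch, with the option name assumed to be drop_batch:

tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256 drop_batch 32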

> This is certainly better than 30Mbps but still less than half of what it
> was before (900).
> TCP also improved a little (550 to ~590).
>
> Felix, others, do you want to see the ported patch, maybe I did something 
> wrong?
> Doesn't look like it will save ath10k from performance regression.
>
>>
>> On Fri, 6 May 2016 11:42:43 +0200
>> Jesper Dangaard Brouer  wrote:
>>
>>> Hi Felix,
>>>
>>> This is an important fix for OpenWRT, please read!
>>>
>>> OpenWRT changed the default fq_codel sch->limit from 10240 to 1024,
>>> without also adjusting q->flows_cnt.  Eric explains below that you must
>>> also adjust the buckets (q->flows_cnt) for this not to break. (Just
>>> adjust it to 128)
>>>
>>> Problematic OpenWRT commit in question:
>>>  http://git.openwrt.org/?p=openwrt.git;a=patch;h=12cd6578084e
>>>  12cd6578084e ("kernel: revert fq_codel quantum override to prevent it from 
>>> causing too much cpu load with higher speed (#21326)")
>>>
>>>
>>> I also highly recommend you cherry-pick this very recent commit:
>>>  net-next: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
>>>  https://git.kernel.org/davem/net-next/c/9d18562a227
>>>
>>> This should fix very high CPU usage in case fq_codel goes into drop mode.
>>> The problem is that drop mode was considered rare, and implementation-wise
>>> it was chosen to be more expensive (to save cycles on normal mode).
>>> Unfortunately it is easy to trigger with a UDP flood. Drop mode is
>>> especially expensive for smaller devices, as it scans a 4KB array,
>>> thus 64 cache misses for small devices!
>>>
>>> The fix is to allow drop-mode to bulk-drop more packets when entering
>>> drop-mode (default 64 bulk drop).  That way we don't suddenly
>>> experience a significantly higher processing cost per packet, but
>>> instead can amortize this.
>>>
>>> To Eric: should we recommend that OpenWRT adjust the default (max) bulk
>>> drop of 64, given we also recommend a bucket size of 128? (The amount of
>>> memory to scan is then smaller, but their CPUs are also much smaller.)
>>>
>>> --Jesper
>>>
>>>
>>> On Thu, 05 May 2016 12:23:27 -0700 Eric Dumazet  
>>> wrote:
>>>
>>> > On Thu, 2016-05-05 at 19:25 +0300, Roman Yeryomin wrote:
>>> > > On 5 May 2016 at 19:12, Eric Dumazet  wrote:
>>> > > > On Thu, 2016-05-05 at 17:53 +0300, Roman Yeryomin wrote:
>>> > > >
>>> > > >>
>>> > > >> qdisc fq_codel 0: dev eth0 root refcnt 2 limit 1024p flows 1024
>>> > > >> quantum 1514 target 5.0ms interval 100.0ms ecn
>>> > > >>  Sent 12306 bytes 128 pkt (dropped 0, overlimits 0 requeues 0)
>>> > > >>  backlog 0b 0p requeues 0
>>> > > >>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>> > > >>   new_flows_len 0 old_flows_len 0
>>> > > >
>>> > > >
>>> > > > A limit of 1024 packets with 1024 flows is not wise, I think.
>>> > > >
>>> > > > (If all buckets are in use, each bucket has a virtual queue of 1 packet,
>>> > > > which is almost the same as having no queue at all.)
>>> > > >
>>> > > > I suggest having at least 8 packets per bucket, to let Codel have a
>>> > > > chance to trigger.
>>> > > >
>>> > > > So you could either reduce the number of buckets to 128 (if memory is
>>> > > > tight), or increase the limit to 8192.
>>> > >
>>> > > Will try, but what I've posted is the default, I didn't change/configure that.
>>> >
>>> > fq_codel has a default of 10240 packets and 1024 buckets.
>>> >
>>> > http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c#L413
>>> >
>>> > If someone changed that in the linux variant you use, he probably should
>>> > explain the rationale.
>>
>> --
>> Best regards,
>>   Jesper Dangaard Brouer
>>   MSc.CS, Principal Kernel Engineer at Red Hat
>>   Author of http://www.iptv-analyzer.org
>>   LinkedIn: http://www.linkedin.com/in/brouer


Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-06 Thread Roman Yeryomin
On 6 May 2016 at 15:47, Jesper Dangaard Brouer  wrote:
>
> I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
> closed Felix's OpenWRT email account (bad choice! emails are bouncing).
> It sounds like OpenWRT and the LEDE project (https://www.lede-project.org/)
> are in some kind of conflict.
>
> OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349
>
> [2] 
> http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335

OK, so, after porting the patch to 4.1 openwrt kernel and playing a
bit with fq_codel limits I was able to get 420Mbps UDP like this:
tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256

This is certainly better than 30Mbps but still less than half of what it
was before (900).
TCP also improved a little (550 to ~590).

Felix, others, do you want to see the ported patch, maybe I did something wrong?
Doesn't look like it will save ath10k from performance regression.

>
> On Fri, 6 May 2016 11:42:43 +0200
> Jesper Dangaard Brouer  wrote:
>
>> Hi Felix,
>>
>> This is an important fix for OpenWRT, please read!
>>
>> OpenWRT changed the default fq_codel sch->limit from 10240 to 1024,
>> without also adjusting q->flows_cnt.  Eric explains below that you must
>> also adjust the buckets (q->flows_cnt) for this not to break. (Just
>> adjust it to 128)
>>
>> Problematic OpenWRT commit in question:
>>  http://git.openwrt.org/?p=openwrt.git;a=patch;h=12cd6578084e
>>  12cd6578084e ("kernel: revert fq_codel quantum override to prevent it from 
>> causing too much cpu load with higher speed (#21326)")
>>
>>
>> I also highly recommend you cherry-pick this very recent commit:
>>  net-next: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
>>  https://git.kernel.org/davem/net-next/c/9d18562a227
>>
>> This should fix very high CPU usage in case fq_codel goes into drop mode.
>> The problem is that drop mode was considered rare, and implementation-wise
>> it was chosen to be more expensive (to save cycles on normal mode).
>> Unfortunately it is easy to trigger with a UDP flood. Drop mode is
>> especially expensive for smaller devices, as it scans a 4KB array,
>> thus 64 cache misses for small devices!
>>
>> The fix is to allow drop-mode to bulk-drop more packets when entering
>> drop-mode (default 64 bulk drop).  That way we don't suddenly
>> experience a significantly higher processing cost per packet, but
>> instead can amortize this.
>>
>> To Eric: should we recommend that OpenWRT adjust the default (max) bulk
>> drop of 64, given we also recommend a bucket size of 128? (The amount of
>> memory to scan is then smaller, but their CPUs are also much smaller.)
>>
>> --Jesper
>>
>>
>> On Thu, 05 May 2016 12:23:27 -0700 Eric Dumazet  
>> wrote:
>>
>> > On Thu, 2016-05-05 at 19:25 +0300, Roman Yeryomin wrote:
>> > > On 5 May 2016 at 19:12, Eric Dumazet  wrote:
>> > > > On Thu, 2016-05-05 at 17:53 +0300, Roman Yeryomin wrote:
>> > > >
>> > > >>
>> > > >> qdisc fq_codel 0: dev eth0 root refcnt 2 limit 1024p flows 1024
>> > > >> quantum 1514 target 5.0ms interval 100.0ms ecn
>> > > >>  Sent 12306 bytes 128 pkt (dropped 0, overlimits 0 requeues 0)
>> > > >>  backlog 0b 0p requeues 0
>> > > >>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>> > > >>   new_flows_len 0 old_flows_len 0
>> > > >
>> > > >
>> > > > A limit of 1024 packets with 1024 flows is not wise, I think.
>> > > >
>> > > > (If all buckets are in use, each bucket has a virtual queue of 1 packet,
>> > > > which is almost the same as having no queue at all.)
>> > > >
>> > > > I suggest having at least 8 packets per bucket, to let Codel have a
>> > > > chance to trigger.
>> > > >
>> > > > So you could either reduce the number of buckets to 128 (if memory is
>> > > > tight), or increase the limit to 8192.
>> > >
>> > > Will try, but what I've posted is the default, I didn't change/configure that.
>> >
>> > fq_codel has a default of 10240 packets and 1024 buckets.
>> >
>> > http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c#L413
>> >
>> > If someone changed that in the linux variant you use, he probably should
>> > explain the rationale.
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer


Re: OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

2016-05-06 Thread Jesper Dangaard Brouer

I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
closed Felix's OpenWRT email account (bad choice! emails are bouncing).
It sounds like OpenWRT and the LEDE project (https://www.lede-project.org/)
are in some kind of conflict.

OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349

[2] http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335


On Fri, 6 May 2016 11:42:43 +0200
Jesper Dangaard Brouer  wrote:

> Hi Felix,
> 
> This is an important fix for OpenWRT, please read!
> 
> OpenWRT changed the default fq_codel sch->limit from 10240 to 1024,
> without also adjusting q->flows_cnt.  Eric explains below that you must
> also adjust the buckets (q->flows_cnt) for this not to break. (Just
> adjust it to 128)
> 
> Problematic OpenWRT commit in question:
>  http://git.openwrt.org/?p=openwrt.git;a=patch;h=12cd6578084e
>  12cd6578084e ("kernel: revert fq_codel quantum override to prevent it from 
> causing too much cpu load with higher speed (#21326)")
> 
> 
> I also highly recommend you cherry-pick this very recent commit:
>  net-next: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
>  https://git.kernel.org/davem/net-next/c/9d18562a227
> 
> This should fix very high CPU usage in case fq_codel goes into drop mode.
> The problem is that drop mode was considered rare, and implementation-wise
> it was chosen to be more expensive (to save cycles on normal mode).
> Unfortunately it is easy to trigger with a UDP flood. Drop mode is
> especially expensive for smaller devices, as it scans a 4KB array,
> thus 64 cache misses for small devices!
> 
> The fix is to allow drop-mode to bulk-drop more packets when entering
> drop-mode (default 64 bulk drop).  That way we don't suddenly
> experience a significantly higher processing cost per packet, but
> instead can amortize this.
> 
> To Eric: should we recommend that OpenWRT adjust the default (max) bulk
> drop of 64, given we also recommend a bucket size of 128? (The amount of
> memory to scan is then smaller, but their CPUs are also much smaller.)
> 
> --Jesper
> 
> 
> On Thu, 05 May 2016 12:23:27 -0700 Eric Dumazet  
> wrote:
> 
> > On Thu, 2016-05-05 at 19:25 +0300, Roman Yeryomin wrote:  
> > > On 5 May 2016 at 19:12, Eric Dumazet  wrote:
> > > > On Thu, 2016-05-05 at 17:53 +0300, Roman Yeryomin wrote:
> > > >
> > > >>
> > > >> qdisc fq_codel 0: dev eth0 root refcnt 2 limit 1024p flows 1024
> > > >> quantum 1514 target 5.0ms interval 100.0ms ecn
> > > >>  Sent 12306 bytes 128 pkt (dropped 0, overlimits 0 requeues 0)
> > > >>  backlog 0b 0p requeues 0
> > > >>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
> > > >>   new_flows_len 0 old_flows_len 0
> > > >
> > > >
> > > > A limit of 1024 packets with 1024 flows is not wise, I think.
> > > >
> > > > (If all buckets are in use, each bucket has a virtual queue of 1 packet,
> > > > which is almost the same as having no queue at all.)
> > > >
> > > > I suggest having at least 8 packets per bucket, to let Codel have a
> > > > chance to trigger.
> > > >
> > > > So you could either reduce the number of buckets to 128 (if memory is
> > > > tight), or increase the limit to 8192.
> > > 
> > > Will try, but what I've posted is the default, I didn't change/configure that.
> > 
> > fq_codel has a default of 10240 packets and 1024 buckets.
> > 
> > http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c#L413
> > 
> > If someone changed that in the linux variant you use, he probably should
> > explain the rationale.  

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer