Re: ath10k performance, master branch from 20160407

2016-05-15 Thread Rajkumar Manoharan

On 2016-05-16 04:29, Roman Yeryomin wrote:
> On 9 May 2016 at 15:26, Michal Kazior  wrote:
>> Hi Roman,
>>
>> On 22 April 2016 at 19:05, Roman Yeryomin  wrote:
>>> On 19 April 2016 at 18:35, Valo, Kalle  wrote:
>>>> Michal Kazior  writes:
>>>>
>>>>> On 19 April 2016 at 09:31, Roman Yeryomin  wrote:
>>>>>> On 19 April 2016 at 08:28, Michal Kazior  wrote:
>>>>>>
>>>>>>> If my hunch is right there's no easy (and proper) fix for that now.
>>>>>>>
>>>>>>> One of the patchset patches (ath10k: implement wake_tx_queue) starts
>>>>>>> to use mac80211 software queuing. This introduces extra induced
>>>>>>> latency and I'm guessing it results in fill-in-then-drain sequences in
>>>>>>> some cases which end up being long enough to make fq_codel_drop more
>>>>>>> work than normal.
>>>>>>>
>>>>>>> This is required for other changes and MU-MIMO performance
>>>>>>> improvements so this patch can't be removed.
>>>>>>
>>>>>> But qca988x doesn't support MU-MIMO, AFAIK.
>>>>>
>>>>> Correct.
>>>>>
>>>>>
>>>>>> Can this be made chip dependent?
>>>>>
>>>>> I guess it could but it'd arguably make the driver more complex and
>>>>> harder to maintain. What we want is a long-term fix, not a short-term
>>>>> one.
>>>>
>>>> But we should never go backwards and TCP dropping from 750 Mbps to ~550
>>>> Mbps is a huge drop, so this is not ok. We have to do something to fix
>>>> this, be it reverting the wake_tx_queue support, somehow disabling it by
>>>> default or something.
>>>
>>> I would agree with Kalle here. This looks like very serious regression.
>>> But I'm afraid I can only help with testing here.
>>
>> Can you give the following patch a try, please? I didn't get to
>> reproduce your problem on a real AP135/AP152 board and instead tried
>> to simulate a slow uni-proc system via KVM and cooling_device in
>> sysfs. The patch does improve things in this synthetic setup for me.
>>
>>   http://lists.infradead.org/pipermail/ath10k/2016-May/007526.html
>>
>
> Unfortunately doesn't seem to make any difference at all (really, if
> there is, it's less than 10Mbps).
> Please see this thread also:
> https://lists.openwrt.org/pipermail/openwrt-devel/2016-May/041445.html
> That is with your and Eric's patch applied.


Roman,

Can you please try without registering the wake_tx_queue callback? Software
queuing is needed for devices that support peer-flow-control.


diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 6829a08638b2..5df904169ded 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -7313,7 +7313,6 @@ ath10k_mac_op_switch_vif_chanctx(struct ieee80211_hw *hw,
 
 static const struct ieee80211_ops ath10k_ops = {
 	.tx = ath10k_mac_op_tx,
-	.wake_tx_queue = ath10k_mac_op_wake_tx_queue,
 	.start = ath10k_start,
 	.stop = ath10k_stop,
 	.config = ath10k_config,

-Rajkumar

___
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k


Re: ath10k performance, master branch from 20160407

2016-05-15 Thread Roman Yeryomin
On 9 May 2016 at 15:26, Michal Kazior  wrote:
> Hi Roman,
>
> On 22 April 2016 at 19:05, Roman Yeryomin  wrote:
>> On 19 April 2016 at 18:35, Valo, Kalle  wrote:
>>> Michal Kazior  writes:
>>>
>>>> On 19 April 2016 at 09:31, Roman Yeryomin  wrote:
>>>>> On 19 April 2016 at 08:28, Michal Kazior  wrote:
>>>>>
>>>>>> If my hunch is right there's no easy (and proper) fix for that now.
>>>>>>
>>>>>> One of the patchset patches (ath10k: implement wake_tx_queue) starts
>>>>>> to use mac80211 software queuing. This introduces extra induced
>>>>>> latency and I'm guessing it results in fill-in-then-drain sequences in
>>>>>> some cases which end up being long enough to make fq_codel_drop more
>>>>>> work than normal.
>>>>>>
>>>>>> This is required for other changes and MU-MIMO performance
>>>>>> improvements so this patch can't be removed.
>>>>>
>>>>> But qca988x doesn't support MU-MIMO, AFAIK.
>>>>
>>>> Correct.
>>>>
>>>>
>>>>> Can this be made chip dependent?
>>>>
>>>> I guess it could but it'd arguably make the driver more complex and
>>>> harder to maintain. What we want is a long-term fix, not a short-term
>>>> one.
>>>
>>> But we should never go backwards and TCP dropping from 750 Mbps to ~550
>>> Mbps is a huge drop, so this is not ok. We have to do something to fix
>>> this, be it reverting the wake_tx_queue support, somehow disabling it by
>>> default or something.
>>
>> I would agree with Kalle here. This looks like very serious regression.
>> But I'm afraid I can only help with testing here.
>
> Can you give the following patch a try, please? I didn't get to
> reproduce your problem on a real AP135/AP152 board and instead tried
> to simulate a slow uni-proc system via KVM and cooling_device in
> sysfs. The patch does improve things in this synthetic setup for me.
>
>   http://lists.infradead.org/pipermail/ath10k/2016-May/007526.html
>

Unfortunately it doesn't seem to make any difference at all (really, if
there is one, it's less than 10 Mbps).
Please see this thread also:
https://lists.openwrt.org/pipermail/openwrt-devel/2016-May/041445.html
That is with your and Eric's patch applied.

Regards,
Roman



Re: ath10k performance, master branch from 20160407

2016-05-09 Thread Michal Kazior
Hi Roman,

On 22 April 2016 at 19:05, Roman Yeryomin  wrote:
> On 19 April 2016 at 18:35, Valo, Kalle  wrote:
>> Michal Kazior  writes:
>>
>>> On 19 April 2016 at 09:31, Roman Yeryomin  wrote:
>>>> On 19 April 2016 at 08:28, Michal Kazior  wrote:
>>>>
>>>>> If my hunch is right there's no easy (and proper) fix for that now.
>>>>>
>>>>> One of the patchset patches (ath10k: implement wake_tx_queue) starts
>>>>> to use mac80211 software queuing. This introduces extra induced
>>>>> latency and I'm guessing it results in fill-in-then-drain sequences in
>>>>> some cases which end up being long enough to make fq_codel_drop more
>>>>> work than normal.
>>>>>
>>>>> This is required for other changes and MU-MIMO performance
>>>>> improvements so this patch can't be removed.
>>>>
>>>> But qca988x doesn't support MU-MIMO, AFAIK.
>>>
>>> Correct.
>>>
>>>
>>>> Can this be made chip dependent?
>>>
>>> I guess it could but it'd arguably make the driver more complex and
>>> harder to maintain. What we want is a long-term fix, not a short-term
>>> one.
>>
>> But we should never go backwards and TCP dropping from 750 Mbps to ~550
>> Mbps is a huge drop, so this is not ok. We have to do something to fix
>> this, be it reverting the wake_tx_queue support, somehow disabling it by
>> default or something.
>
> I would agree with Kalle here. This looks like very serious regression.
> But I'm afraid I can only help with testing here.

Can you give the following patch a try, please? I didn't get to
reproduce your problem on a real AP135/AP152 board and instead tried
to simulate a slow uni-proc system via KVM and cooling_device in
sysfs. The patch does improve things in this synthetic setup for me.

  http://lists.infradead.org/pipermail/ath10k/2016-May/007526.html


Michał



Re: ath10k performance, master branch from 20160407

2016-04-22 Thread Roman Yeryomin
On 19 April 2016 at 18:35, Valo, Kalle  wrote:
> Michal Kazior  writes:
>
>> On 19 April 2016 at 09:31, Roman Yeryomin  wrote:
>>> On 19 April 2016 at 08:28, Michal Kazior  wrote:
>>>
>>>> If my hunch is right there's no easy (and proper) fix for that now.
>>>>
>>>> One of the patchset patches (ath10k: implement wake_tx_queue) starts
>>>> to use mac80211 software queuing. This introduces extra induced
>>>> latency and I'm guessing it results in fill-in-then-drain sequences in
>>>> some cases which end up being long enough to make fq_codel_drop more
>>>> work than normal.
>>>>
>>>> This is required for other changes and MU-MIMO performance
>>>> improvements so this patch can't be removed.
>>>
>>> But qca988x doesn't support MU-MIMO, AFAIK.
>>
>> Correct.
>>
>>
>>> Can this be made chip dependent?
>>
>> I guess it could but it'd arguably make the driver more complex and
>> harder to maintain. What we want is a long-term fix, not a short-term
>> one.
>
> But we should never go backwards and TCP dropping from 750 Mbps to ~550
> Mbps is a huge drop, so this is not ok. We have to do something to fix
> this, be it reverting the wake_tx_queue support, somehow disabling it by
> default or something.

I would agree with Kalle here. This looks like a very serious regression.
But I'm afraid I can only help with testing here.

Regards,
Roman



Re: ath10k performance, master branch from 20160407

2016-04-22 Thread Roman Yeryomin
On 19 April 2016 at 10:43, Michal Kazior  wrote:
> On 19 April 2016 at 09:31, Roman Yeryomin  wrote:
>> On 19 April 2016 at 08:28, Michal Kazior  wrote:
>>> On 18 April 2016 at 15:00, Roman Yeryomin  wrote:
>>>> So it looks like Michal's patch set "ath10k: implement push-pull tx
>>>> model" introduced this regression - after restoring it from reverts
>>>> fq_codel_drop is hungry again.
>>>> Any ideas how to fix?
>>>
>>> If my hunch is right there's no easy (and proper) fix for that now.
>>>
>>> One of the patchset patches (ath10k: implement wake_tx_queue) starts
>>> to use mac80211 software queuing. This introduces extra induced
>>> latency and I'm guessing it results in fill-in-then-drain sequences in
>>> some cases which end up being long enough to make fq_codel_drop more
>>> work than normal.
>>>
>>> This is required for other changes and MU-MIMO performance
>>> improvements so this patch can't be removed.
>>
>> But qca988x doesn't support MU-MIMO, AFAIK.
>
> Correct.
>
>
>> Can this be made chip dependent?
>
> I guess it could but it'd arguably make the driver more complex and
> harder to maintain. What we want is a long-term fix, not a short-term
> one.
>
> The long-term fix is a work-in-progress which aims at killing
> bufferbloat in general [1][2]. This should, by proxy, improve
> everything.
>
> [1]: https://www.spinics.net/lists/linux-wireless/msg149776.html
> [2]: https://www.spinics.net/lists/linux-wireless/msg148714.html
> [3]: https://www.spinics.net/lists/linux-wireless/msg149039.html
>
> You can try out patchset from [1] (and maybe [3] as well) to see if it
> helps you (assuming you have spare time to play around).

Will try.

Regards,
Roman



Re: ath10k performance, master branch from 20160407

2016-04-20 Thread Michal Kazior
On 18 April 2016 at 01:03, Roman Yeryomin  wrote:
[...]
> CPU usage didn't go down after simply turning off
> CPTCFG_NET_SCH_FQ_CODEL under compat wireless (and yes, I verified it
> was off in the config after recompilation).
> But still I'm not sure it's really off. Turning it off both in kernel
> config and compat-wireless doesn't seem to have effect. I didn't dig
> deeper into this but it looks I didn't find a correct way to turn it
> off completely.

You can check this using `tc qdisc` command. It'll tell you what kind
of qdiscs sit on each interface.
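
As a rough sketch of that check, the output of `tc qdisc show` can be
filtered for fq_codel lines. Here wlan0 is an assumed interface name and
`uses_fq_codel` is a hypothetical helper, not part of iproute2:

```shell
# Hypothetical helper: reads `tc qdisc show` output on stdin and succeeds
# if at least one qdisc line is of kind fq_codel.
uses_fq_codel() {
    grep -q '^qdisc fq_codel '
}

# Intended use on the device (wlan0 is an assumed interface name):
#   tc qdisc show dev wlan0 | uses_fq_codel && echo "fq_codel is attached"
```

If the helper fails for every wireless interface, fq_codel is not attached
there at runtime, whatever the build configuration says.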


Michał



Re: ath10k performance, master branch from 20160407

2016-04-19 Thread Valo, Kalle
Michal Kazior  writes:

> On 19 April 2016 at 09:31, Roman Yeryomin  wrote:
>> On 19 April 2016 at 08:28, Michal Kazior  wrote:
>>
>>> If my hunch is right there's no easy (and proper) fix for that now.
>>>
>>> One of the patchset patches (ath10k: implement wake_tx_queue) starts
>>> to use mac80211 software queuing. This introduces extra induced
>>> latency and I'm guessing it results in fill-in-then-drain sequences in
>>> some cases which end up being long enough to make fq_codel_drop more
>>> work than normal.
>>>
>>> This is required for other changes and MU-MIMO performance
>>> improvements so this patch can't be removed.
>>
>> But qca988x doesn't support MU-MIMO, AFAIK.
>
> Correct.
>
>
>> Can this be made chip dependent?
>
> I guess it could but it'd arguably make the driver more complex and
> harder to maintain. What we want is a long-term fix, not a short-term
> one.

But we should never go backwards and TCP dropping from 750 Mbps to ~550
Mbps is a huge drop, so this is not ok. We have to do something to fix
this, be it reverting the wake_tx_queue support, somehow disabling it by
default or something.
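
To put a number on that drop, a small sketch using integer shell
arithmetic (`pct_drop` is a name made up here, and the result is truncated
rather than rounded):

```shell
# Hypothetical helper: percentage drop from $1 to $2, truncated to an integer.
pct_drop() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

pct_drop 750 550   # TCP figures from this thread, prints 26
```

So the TCP regression reported in this thread is roughly a quarter of the
previous throughput.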

-- 
Kalle Valo


Re: ath10k performance, master branch from 20160407

2016-04-19 Thread Michal Kazior
On 19 April 2016 at 09:31, Roman Yeryomin  wrote:
> On 19 April 2016 at 08:28, Michal Kazior  wrote:
>> On 18 April 2016 at 15:00, Roman Yeryomin  wrote:
>>> So it looks like Michal's patch set "ath10k: implement push-pull tx
>>> model" introduced this regression - after restoring it from reverts
>>> fq_codel_drop is hungry again.
>>> Any ideas how to fix?
>>
>> If my hunch is right there's no easy (and proper) fix for that now.
>>
>> One of the patchset patches (ath10k: implement wake_tx_queue) starts
>> to use mac80211 software queuing. This introduces extra induced
>> latency and I'm guessing it results in fill-in-then-drain sequences in
>> some cases which end up being long enough to make fq_codel_drop more
>> work than normal.
>>
>> This is required for other changes and MU-MIMO performance
>> improvements so this patch can't be removed.
>
> But qca988x doesn't support MU-MIMO, AFAIK.

Correct.


> Can this be made chip dependent?

I guess it could but it'd arguably make the driver more complex and
harder to maintain. What we want is a long-term fix, not a short-term
one.

The long-term fix is a work-in-progress which aims at killing
bufferbloat in general [1][2]. This should, by proxy, improve
everything.

[1]: https://www.spinics.net/lists/linux-wireless/msg149776.html
[2]: https://www.spinics.net/lists/linux-wireless/msg148714.html
[3]: https://www.spinics.net/lists/linux-wireless/msg149039.html

You can try out patchset from [1] (and maybe [3] as well) to see if it
helps you (assuming you have spare time to play around).


Michał



Re: ath10k performance, master branch from 20160407

2016-04-19 Thread Roman Yeryomin
On 19 April 2016 at 08:28, Michal Kazior  wrote:
> On 18 April 2016 at 15:00, Roman Yeryomin  wrote:
>> So it looks like Michal's patch set "ath10k: implement push-pull tx
>> model" introduced this regression - after restoring it from reverts
>> fq_codel_drop is hungry again.
>> Any ideas how to fix?
>
> If my hunch is right there's no easy (and proper) fix for that now.
>
> One of the patchset patches (ath10k: implement wake_tx_queue) starts
> to use mac80211 software queuing. This introduces extra induced
> latency and I'm guessing it results in fill-in-then-drain sequences in
> some cases which end up being long enough to make fq_codel_drop more
> work than normal.
>
> This is required for other changes and MU-MIMO performance
> improvements so this patch can't be removed.

But qca988x doesn't support MU-MIMO, AFAIK.
Can this be made chip dependent?

> I guess you could try forcing fq_codel to use different target time,
> e.g. 20ms (instead of the default 5). You can do this using `tc`
> command like so:
>
>tc qdisc replace dev wlan0 parent :1 fq_codel limit 1024 target 20ms
>tc qdisc replace dev wlan0 parent :2 fq_codel limit 1024 target 20ms
>tc qdisc replace dev wlan0 parent :3 fq_codel limit 1024 target 20ms
>tc qdisc replace dev wlan0 parent :4 fq_codel limit 1024 target 20ms
>
> You might also want to try `pfifo` instead of `fq_codel` for comparison as 
> well.
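
The four per-queue commands quoted above differ only in the parent handle,
so they can be generated in a loop. A minimal sketch, assuming the
interface is wlan0 with four tx queues as in the quoted commands;
`fq_codel_cmds` is a hypothetical helper that only prints the commands so
they can be reviewed before being run as root:

```shell
# Hypothetical helper: print one `tc qdisc replace` command per tx queue.
# $1 = interface (default wlan0), $2 = CoDel target (default 20ms),
# $3 = packet limit (default 1024); the defaults mirror the quoted commands.
fq_codel_cmds() {
    dev=${1:-wlan0}
    target=${2:-20ms}
    limit=${3:-1024}
    for q in 1 2 3 4; do
        echo "tc qdisc replace dev $dev parent :$q fq_codel limit $limit target $target"
    done
}

fq_codel_cmds wlan0 20ms   # review the output, then pipe it to `sh` as root
```

Printing instead of executing keeps the sketch harmless on a machine
without the wireless interface; passing a different target (for example
5ms, the default mentioned above) makes it easy to compare settings.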

Will try.


Regards,
Roman



Re: ath10k performance, master branch from 20160407

2016-04-17 Thread Roman Yeryomin
Rajkumar,

OK, I ended up resolving the (seemingly trivial) conflicts in the revert
list you provided (see comments inline).
Performance is restored and the codel symbols are gone from perf top.
I will try reverting "ath10k: combine txrx and replenish task" alone and
then, if that doesn't help, undoing the reverts one patch set at a time.

Regards,
Roman

On 17 April 2016 at 18:06, Manoharan, Rajkumar
<rmano...@qti.qualcomm.com> wrote:
> Roman,
>
> Hmm.. I just listed ath10k changes alone. So there might be some dependencies.

there were ath10k conflicts, please see below

> In your earlier mail fq_codel_drop was consuming 45% cpu. Have you observed 
> any
> improvement after switching off NET_SCH_FQ_CODEL? Had CPU usage gone down?

CPU usage didn't go down after simply turning off
CPTCFG_NET_SCH_FQ_CODEL under compat-wireless (and yes, I verified it
was off in the config after recompilation).
But I'm still not sure it's really off. Turning it off in both the kernel
config and compat-wireless doesn't seem to have any effect. I didn't dig
deeper into this, but it looks like I haven't found the correct way to turn
it off completely.
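
One way to double-check the build-time side is to grep the generated
config for the symbol being set to y or m. This is a sketch only:
`kconfig_enabled` is a made-up helper, and the .config path depends on the
build tree:

```shell
# Hypothetical helper: succeed if Kconfig symbol $1 is set to y or m in file $2.
kconfig_enabled() {
    grep -Eq "^$1=(y|m)\$" "$2"
}

# Intended use (the path is an assumption, adjust to the actual build dir):
#   kconfig_enabled CPTCFG_NET_SCH_FQ_CODEL .config && echo "still enabled"
```

This only covers the build configuration; `tc qdisc show` on the running
system reports which qdisc is actually attached to each interface.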

Not sure if I stated it correctly: after resetting to
89ef41bfaa46f24a14b776f1cd78c0e0b39e54ce I got same (good enough)
performance as with latest compat-wireless release (20160110).

> Please try to revert the commit "ath10k: combine txrx and replenish task" 
> alone. If you still
> see same behavior (lower numbers), reset master branch to till "ath10k: fix 
> pull-push tx
> threshold handling" and generate backports.
>
> Please make sure that codel is switched off always until regression point is 
> root caused.
>
> -Rajkumar
>
> 
> From: Roman Yeryomin <leroi.li...@gmail.com>
> Sent: Sunday, April 17, 2016 2:58 PM
> To: Manoharan, Rajkumar
> Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
> Subject: Re: ath10k performance, master branch from 20160407
>
> Rajkumar,
>
> Somehow unseting CPTCFG_NET_SCH_FQ_CODEL didn't change anything and
> the patches you listed didn't revert cleanly, I gave up on 3rd
> dependent patch somewhere in the middle and just reset master to
> 89ef41bfaa46f24a14b776f1cd78c0e0b39e54ce, which is the last commit
> just before "ath10k: refactor tx code", and generated new backports.
> The result is that it has same performance as before. But I guess it
> is not a very good test as there were many changes to mac80211 too.
>
> So what do you want me to try next? Maybe you could provide a more
> precise list to revert?
>
>
> Regards,
> Roman
>
> On 9 April 2016 at 07:02, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> 
> wrote:
>> Roman,
>>
>> Need your help to bisect regression point. Can you try w/o 
>> CPTCFG_NET_SCH_FQ_CODEL?
>> If it does not help, try reverting below commits which are major changes in 
>> data path.
>> Instead of generating backports, apply revert commit on top your backports.
>>
>> ath10k: combine txrx and replenish task
>> ath10k: reuse copy engine 5 (htt rx) descriptors
>> ath10k: cleanup copy engine receive next completion
>> ath10k: register ath10k_htt_htc_t2h_msg_handler
>> ath10k: speedup htt rx descriptor processing for rx_ind

this depends on 689de38e37179c6f524dd003e1dae92042f8f5cd

>> ath10k: cleanup amsdu processing for rx indication
>> ath10k: remove unused fw_desc processing
>> ath10k: copy tx fetch indication message
>> ath10k: speedup htt rx descriptor processing for tx completion
>> ath10k: fix null deref if device crashes early
>> ath10k: fix pull-push tx threshold handling
>> ath10k: fix tx hang
>> ath10k: move mgmt descriptor limit handle under mgmt_tx

error: could not revert cac0855... ath10k: move mgmt descriptor limit
handle under mgmt_tx
Not even sure why it fails here, pretty trivial to resolve but still...

>> ath10k: change htt tx desc/qcache peer limit config

error: could not revert 99ad1cb... ath10k: change htt tx desc/qcache
peer limit config
OK, resolved, hopefully correctly

>> ath10k: fix HTT Tx CE ring size
>> ath10k: implement push-pull tx
>> ath10k: keep track of queue depth per txq
>> ath10k: store txq in skb_cb
>> ath10k: implement updating shared htt txq state
>> ath10k: implement wake_tx_queue

depends on 9d71d47eed20f34620e54e29bcc90f959d5873b8 and
750eeed89cf3c466df302e4707491b015531e26c
all three fail to revert cleanly

>> ath10k: add new htt message generation/parsing logic

fails to revert cleanly

>> ath10k: add fast peer_map lookup
>> ath10k: maintain peer_id for each sta and vif
>> ath10k: refactor tx pending management
>> ath10k: unify txpath decision
>> ath10k: refactor tx code
>>

Re: ath10k performance, master branch from 20160407

2016-04-17 Thread Manoharan, Rajkumar
Roman,

Hmm.. I listed only the ath10k changes, so there might be some dependencies.
In your earlier mail fq_codel_drop was consuming 45% CPU. Have you observed any
improvement after switching off NET_SCH_FQ_CODEL? Did CPU usage go down?

Please try to revert the commit "ath10k: combine txrx and replenish task"
alone. If you still see the same behavior (lower numbers), reset the master
branch up to "ath10k: fix pull-push tx threshold handling" and generate
backports.

Please make sure that codel is always switched off until the regression point
is root-caused.

-Rajkumar


From: Roman Yeryomin <leroi.li...@gmail.com>
Sent: Sunday, April 17, 2016 2:58 PM
To: Manoharan, Rajkumar
Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
Subject: Re: ath10k performance, master branch from 20160407

Rajkumar,

Somehow unseting CPTCFG_NET_SCH_FQ_CODEL didn't change anything and
the patches you listed didn't revert cleanly, I gave up on 3rd
dependent patch somewhere in the middle and just reset master to
89ef41bfaa46f24a14b776f1cd78c0e0b39e54ce, which is the last commit
just before "ath10k: refactor tx code", and generated new backports.
The result is that it has same performance as before. But I guess it
is not a very good test as there were many changes to mac80211 too.

So what do you want me to try next? Maybe you could provide a more
precise list to revert?


Regards,
Roman

On 9 April 2016 at 07:02, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> wrote:
> Roman,
>
> Need your help to bisect regression point. Can you try w/o 
> CPTCFG_NET_SCH_FQ_CODEL?
> If it does not help, try reverting below commits which are major changes in 
> data path.
> Instead of generating backports, apply revert commit on top your backports.
>
> ath10k: combine txrx and replenish task
> ath10k: reuse copy engine 5 (htt rx) descriptors
> ath10k: cleanup copy engine receive next completion
> ath10k: register ath10k_htt_htc_t2h_msg_handler
> ath10k: speedup htt rx descriptor processing for rx_ind
> ath10k: cleanup amsdu processing for rx indication
> ath10k: remove unused fw_desc processing
> ath10k: copy tx fetch indication message
> ath10k: speedup htt rx descriptor processing for tx completion
> ath10k: fix null deref if device crashes early
> ath10k: fix pull-push tx threshold handling
> ath10k: fix tx hang
> ath10k: move mgmt descriptor limit handle under mgmt_tx
> ath10k: change htt tx desc/qcache peer limit config
> ath10k: fix HTT Tx CE ring size
> ath10k: implement push-pull tx
> ath10k: keep track of queue depth per txq
> ath10k: store txq in skb_cb
> ath10k: implement updating shared htt txq state
> ath10k: implement wake_tx_queue
> ath10k: add new htt message generation/parsing logic
> ath10k: add fast peer_map lookup
> ath10k: maintain peer_id for each sta and vif
> ath10k: refactor tx pending management
> ath10k: unify txpath decision
> ath10k: refactor tx code
>
> -Rajkumar
> 
> From: Roman Yeryomin <leroi.li...@gmail.com>
> Sent: Friday, April 8, 2016 10:49 PM
> To: Manoharan, Rajkumar
> Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
> Subject: Re: ath10k performance, master branch from 20160407
>
> Latest backports (compat-wireless) released (20160110) has codel
> enabled (CPTCFG_NET_SCH_FQ_CODEL=y) and there are no openwrt patches
> or special configuration for codel. And it runs ok.
> How old commit do you want me to try?
>
> Regards,
> Roman
>
> On 8 April 2016 at 19:41, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> 
> wrote:
>> That should be fine. Is codel running only for latest backports? Are there 
>> any openwrt changes to configure codel? Can you plz try to reset master 
>> branch to older commit and validate?
>>
>> -Rajkumar
>> ________
>> From: Roman Yeryomin [leroi.li...@gmail.com]
>> Sent: Friday, April 8, 2016 9:30 PM
>> To: Manoharan, Rajkumar
>> Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
>> Subject: Re: ath10k performance, master branch from 20160407
>>
>> Rajkumar,
>>
>> I took backports from
>> git://git.kernel.org/pub/scm/linux/kernel/git/backports/backports.git,
>> took latest ath tree from
>> git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git, generated
>> backports-output based on ath master branch, refreshed openwrt
>> patches.
>> And saw big performance degradation. Am I doing something wrong?
>>
>> Regards,
>> Roman
>>
>> On 8 April 2016 at 18:34, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> 
>> wrote:
>>> Roman,
>>>
>>> Which backports version are you using? I d

Re: ath10k performance, master branch from 20160407

2016-04-17 Thread Roman Yeryomin
Rajkumar,

Somehow unsetting CPTCFG_NET_SCH_FQ_CODEL didn't change anything, and
the patches you listed didn't revert cleanly. I gave up on the 3rd
dependent patch somewhere in the middle and just reset master to
89ef41bfaa46f24a14b776f1cd78c0e0b39e54ce, which is the last commit
just before "ath10k: refactor tx code", and generated new backports.
The result is that it has the same performance as before. But I guess it
is not a very good test, as there were many changes to mac80211 too.

So what do you want me to try next? Maybe you could provide a more
precise list to revert?


Regards,
Roman

On 9 April 2016 at 07:02, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> wrote:
> Roman,
>
> Need your help to bisect regression point. Can you try w/o 
> CPTCFG_NET_SCH_FQ_CODEL?
> If it does not help, try reverting below commits which are major changes in 
> data path.
> Instead of generating backports, apply revert commit on top your backports.
>
> ath10k: combine txrx and replenish task
> ath10k: reuse copy engine 5 (htt rx) descriptors
> ath10k: cleanup copy engine receive next completion
> ath10k: register ath10k_htt_htc_t2h_msg_handler
> ath10k: speedup htt rx descriptor processing for rx_ind
> ath10k: cleanup amsdu processing for rx indication
> ath10k: remove unused fw_desc processing
> ath10k: copy tx fetch indication message
> ath10k: speedup htt rx descriptor processing for tx completion
> ath10k: fix null deref if device crashes early
> ath10k: fix pull-push tx threshold handling
> ath10k: fix tx hang
> ath10k: move mgmt descriptor limit handle under mgmt_tx
> ath10k: change htt tx desc/qcache peer limit config
> ath10k: fix HTT Tx CE ring size
> ath10k: implement push-pull tx
> ath10k: keep track of queue depth per txq
> ath10k: store txq in skb_cb
> ath10k: implement updating shared htt txq state
> ath10k: implement wake_tx_queue
> ath10k: add new htt message generation/parsing logic
> ath10k: add fast peer_map lookup
> ath10k: maintain peer_id for each sta and vif
> ath10k: refactor tx pending management
> ath10k: unify txpath decision
> ath10k: refactor tx code
>
> -Rajkumar
> 
> From: Roman Yeryomin <leroi.li...@gmail.com>
> Sent: Friday, April 8, 2016 10:49 PM
> To: Manoharan, Rajkumar
> Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
> Subject: Re: ath10k performance, master branch from 20160407
>
> Latest backports (compat-wireless) released (20160110) has codel
> enabled (CPTCFG_NET_SCH_FQ_CODEL=y) and there are no openwrt patches
> or special configuration for codel. And it runs ok.
> How old commit do you want me to try?
>
> Regards,
> Roman
>
> On 8 April 2016 at 19:41, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> 
> wrote:
>> That should be fine. Is codel running only for latest backports? Are there 
>> any openwrt changes to configure codel? Can you plz try to reset master 
>> branch to older commit and validate?
>>
>> -Rajkumar
>> ________________
>> From: Roman Yeryomin [leroi.li...@gmail.com]
>> Sent: Friday, April 8, 2016 9:30 PM
>> To: Manoharan, Rajkumar
>> Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
>> Subject: Re: ath10k performance, master branch from 20160407
>>
>> Rajkumar,
>>
>> I took backports from
>> git://git.kernel.org/pub/scm/linux/kernel/git/backports/backports.git,
>> took latest ath tree from
>> git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git, generated
>> backports-output based on ath master branch, refreshed openwrt
>> patches.
>> And saw big performance degradation. Am I doing something wrong?
>>
>> Regards,
>> Roman
>>
>> On 8 April 2016 at 18:34, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> 
>> wrote:
>>> Roman,
>>>
>>> Which backports version are you using? I don't see codel changes in 
>>> ath.git/wireless-drivers.git.
>>> Hope you are using same firmware.
>>>
>>> -Rajkumar
>>> 
>>> From: ath10k <ath10k-boun...@lists.infradead.org> on behalf of Roman 
>>> Yeryomin <leroi.li...@gmail.com>
>>> Sent: Friday, April 8, 2016 8:14 PM
>>> To: ath10k@lists.infradead.org
>>> Subject: ath10k performance, master branch from 20160407
>>>
>>> Hello!
>>>
>>> I've seen performance patches were committed so I've decided to give it
>>> a try (using 4.1 kernel and backports).
>>> The results are quite disappointing: TCP download (client pov) dropped
>>> from 750Mbps to ~550 and UDP shows completely weird behaviour - if

Re: ath10k performance, master branch from 20160407

2016-04-08 Thread Manoharan, Rajkumar
Roman,

I need your help to bisect the regression point. Can you try without
CPTCFG_NET_SCH_FQ_CODEL?
If that does not help, try reverting the commits below, which are the major
changes in the data path. Instead of regenerating backports, apply the revert
commits on top of your backports.

ath10k: combine txrx and replenish task 
ath10k: reuse copy engine 5 (htt rx) descriptors
ath10k: cleanup copy engine receive next completion 
ath10k: register ath10k_htt_htc_t2h_msg_handler 
ath10k: speedup htt rx descriptor processing for rx_ind 
ath10k: cleanup amsdu processing for rx indication  
ath10k: remove unused fw_desc processing
ath10k: copy tx fetch indication message
ath10k: speedup htt rx descriptor processing for tx completion  
ath10k: fix null deref if device crashes early  
ath10k: fix pull-push tx threshold handling 
ath10k: fix tx hang 
ath10k: move mgmt descriptor limit handle under mgmt_tx 
ath10k: change htt tx desc/qcache peer limit config 
ath10k: fix HTT Tx CE ring size 
ath10k: implement push-pull tx  
ath10k: keep track of queue depth per txq   
ath10k: store txq in skb_cb
ath10k: implement updating shared htt txq state 
ath10k: implement wake_tx_queue 
ath10k: add new htt message generation/parsing logic
ath10k: add fast peer_map lookup
ath10k: maintain peer_id for each sta and vif   
ath10k: refactor tx pending management  
ath10k: unify txpath decision   
ath10k: refactor tx code

-Rajkumar

From: Roman Yeryomin <leroi.li...@gmail.com>
Sent: Friday, April 8, 2016 10:49 PM
To: Manoharan, Rajkumar
Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
Subject: Re: ath10k performance, master branch from 20160407

Latest backports (compat-wireless) released (20160110) has codel
enabled (CPTCFG_NET_SCH_FQ_CODEL=y) and there are no openwrt patches
or special configuration for codel. And it runs ok.
How old commit do you want me to try?

Regards,
Roman

On 8 April 2016 at 19:41, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> wrote:
> That should be fine. Is codel running only for latest backports? Are there 
> any openwrt changes to configure codel? Can you plz try to reset master 
> branch to older commit and validate?
>
> -Rajkumar
> 
> From: Roman Yeryomin [leroi.li...@gmail.com]
> Sent: Friday, April 8, 2016 9:30 PM
> To: Manoharan, Rajkumar
> Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
> Subject: Re: ath10k performance, master branch from 20160407
>
> Rajkumar,
>
> I took backports from
> git://git.kernel.org/pub/scm/linux/kernel/git/backports/backports.git,
> took latest ath tree from
> git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git, generated
> backports-output based on ath master branch, refreshed openwrt
> patches.
> And saw big performance degradation. Am I doing something wrong?
>
> Regards,
> Roman
>
> On 8 April 2016 at 18:34, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> 
> wrote:
>> Roman,
>>
>> Which backports version are you using? I don't see codel changes in 
>> ath.git/wireless-drivers.git.
>> Hope you are using same firmware.
>>
>> -Rajkumar
>> 
>> From: ath10k <ath10k-boun...@lists.infradead.org> on behalf of Roman 
>> Yeryomin <leroi.li...@gmail.com>
>> Sent: Friday, April 8, 2016 8:14 PM
>> To: ath10k@lists.infradead.org
>> Subject: ath10k performance, master branch from 20160407
>>
>> Hello!
>>
>> I've seen performance patches were committed so I've decided to give it
>> a try (using 4.1 kernel and backports).
>> The results are quite disappointing: TCP download (client pov) dropped
>> from 750Mbps to ~550 and UDP shows completely weird behaviour - if
>> generating 900Mbps it gives 30Mbps max, if generating 300Mbps it gives
>> 250Mbps, before (latest official backports release from January) I was
>> able to get 900Mbps.
>> Hardware is basically ap152 + qca988x 3x3.
>> When running perf top I see that fq_codel_drop eats a lot of cpu.
>> Here is the output when running iperf3 UDP test:
>>
>> 45.78%  [kernel]   [k] fq_codel_drop
>>  3.05%  [kernel]   [k] ag71xx_poll
>>  2.18%  [kernel]   [k] skb_release_data
>>  2.01%  [kernel]   [k] r4k_dma_cache_inv
>>  1.73%  [kernel]   [k] eth_type_trans
>>  1.24%  [kernel]   [k] build_skb
>>  1.20%  [mac80211] [k] ieee80211_tx_dequeue
>>  1.03%  [kernel]   [k] __delay
>>  0.98%  [kernel]   [k] fq_codel_enqueue
>>  0.94%  [kernel]   [k] __netif_receive_skb_core
>>  0.93%  [kernel]   [k] skb_release_head_state
>>  0.88%  [ath10k_core]  [k] ath10k_h

Re: ath10k performance, master branch from 20160407

2016-04-08 Thread Roman Yeryomin
Latest backports (compat-wireless) released (20160110) has codel
enabled (CPTCFG_NET_SCH_FQ_CODEL=y) and there are no openwrt patches
or special configuration for codel. And it runs ok.
How old a commit do you want me to try?

Regards,
Roman

On 8 April 2016 at 19:41, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> wrote:
> That should be fine. Is codel running only for latest backports? Are there 
> any openwrt changes to configure codel? Can you plz try to reset master 
> branch to older commit and validate?
>
> -Rajkumar
> 
> From: Roman Yeryomin [leroi.li...@gmail.com]
> Sent: Friday, April 8, 2016 9:30 PM
> To: Manoharan, Rajkumar
> Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
> Subject: Re: ath10k performance, master branch from 20160407
>
> Rajkumar,
>
> I took backports from
> git://git.kernel.org/pub/scm/linux/kernel/git/backports/backports.git,
> took latest ath tree from
> git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git, generated
> backports-output based on ath master branch, refreshed openwrt
> patches.
> And saw big performance degradation. Am I doing something wrong?
>
> Regards,
> Roman
>
> On 8 April 2016 at 18:34, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> 
> wrote:
>> Roman,
>>
>> Which backports version are you using? I don't see codel changes in 
>> ath.git/wireless-drivers.git.
>> Hope you are using same firmware.
>>
>> -Rajkumar
>> 
>> From: ath10k <ath10k-boun...@lists.infradead.org> on behalf of Roman 
>> Yeryomin <leroi.li...@gmail.com>
>> Sent: Friday, April 8, 2016 8:14 PM
>> To: ath10k@lists.infradead.org
>> Subject: ath10k performance, master branch from 20160407
>>
>> Hello!
>>
>> I've seen performance patches were committed so I've decided to give it
>> a try (using 4.1 kernel and backports).
>> The results are quite disappointing: TCP download (client pov) dropped
>> from 750Mbps to ~550 and UDP shows completely weird behaviour - if
>> generating 900Mbps it gives 30Mbps max, if generating 300Mbps it gives
>> 250Mbps, before (latest official backports release from January) I was
>> able to get 900Mbps.
>> Hardware is basically ap152 + qca988x 3x3.
>> When running perf top I see that fq_codel_drop eats a lot of cpu.
>> Here is the output when running iperf3 UDP test:
>>
>> 45.78%  [kernel]   [k] fq_codel_drop
>>  3.05%  [kernel]   [k] ag71xx_poll
>>  2.18%  [kernel]   [k] skb_release_data
>>  2.01%  [kernel]   [k] r4k_dma_cache_inv
>>  1.73%  [kernel]   [k] eth_type_trans
>>  1.24%  [kernel]   [k] build_skb
>>  1.20%  [mac80211] [k] ieee80211_tx_dequeue
>>  1.03%  [kernel]   [k] __delay
>>  0.98%  [kernel]   [k] fq_codel_enqueue
>>  0.94%  [kernel]   [k] __netif_receive_skb_core
>>  0.93%  [kernel]   [k] skb_release_head_state
>>  0.88%  [ath10k_core]  [k] ath10k_htt_tx
>>  0.87%  [kernel]   [k] __dev_queue_xmit
>>  0.84%  [mac80211] [k] ieee80211_tx_status
>>  0.81%  [kernel]   [k] __build_skb
>>  0.80%  [mac80211] [k] __ieee80211_subif_start_xmit
>>  0.77%  [kernel]   [k] br_handle_frame_finish
>>  0.75%  [kernel]   [k] __qdisc_run
>>  0.73%  [kernel]   [k] skb_recycler_consume
>>  0.72%  [kernel]   [k] kfree_skb
>>  0.72%  [kernel]   [k] get_page_from_freelist
>>  0.69%  [kernel]   [k] br_fdb_update
>>  0.69%  [kernel]   [k] br_handle_frame
>>  0.67%  [kernel]   [k] __copy_user_common
>>  0.66%  [kernel]   [k] __skb_flow_dissect
>>  0.65%  [ath10k_core]  [k] ath10k_txrx_tx_unref
>>  0.60%  [kernel]   [k] kmem_cache_alloc
>>  0.60%  [mac80211] [k] sta_addr_hash
>>  0.56%  [kernel]   [k] fq_codel_dequeue
>>  0.53%  [kernel]   [k] __local_bh_enable_ip
>>  0.50%  [kernel]   [k] __br_fdb_get
>>
>> What could be the reason?
>> I've seen there are some patches from Michal which touch fq_codel,
>> would those help or not?
>>
>>
>> Regards,
>> Roman
>>
>> ___
>> ath10k mailing list
>> ath10k@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/ath10k

___
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k
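
The perf top listing quoted in the message above can be summarised mechanically. The following is only a minimal sketch, not something posted in the thread: the sample rows are copied from that listing, while the function names (parse_perf_top, share_by_prefix, share_by_module) are hypothetical helpers introduced for illustration. It tolerates leading ">" quote markers so quoted copies of the listing can be pasted in directly.

```python
import re
from collections import defaultdict

# A few rows copied from the perf top listing in the thread.
PERF_TOP = """\
>> 45.78%  [kernel]   [k] fq_codel_drop
>>  3.05%  [kernel]   [k] ag71xx_poll
>>  1.20%  [mac80211] [k] ieee80211_tx_dequeue
>>  0.98%  [kernel]   [k] fq_codel_enqueue
>>  0.88%  [ath10k_core]  [k] ath10k_htt_tx
>>  0.56%  [kernel]   [k] fq_codel_dequeue
"""

# Optional quote markers, percentage, [module], [k], symbol name.
LINE_RE = re.compile(
    r"^\s*(?:>\s*)*(\d+(?:\.\d+)?)%\s+\[(\S+?)\]\s+\[\w\]\s+(\S+)\s*$"
)


def parse_perf_top(text):
    """Return a list of (percent, module, symbol) tuples; skip other lines."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rows.append((float(match.group(1)), match.group(2), match.group(3)))
    return rows


def share_by_prefix(rows, prefix):
    """Sum the CPU share of all symbols whose name starts with prefix."""
    return sum(pct for pct, _module, symbol in rows if symbol.startswith(prefix))


def share_by_module(rows):
    """Sum the CPU share per module (kernel, mac80211, ath10k_core, ...)."""
    totals = defaultdict(float)
    for pct, module, _symbol in rows:
        totals[module] += pct
    return dict(totals)


rows = parse_perf_top(PERF_TOP)
print(f"fq_codel_* share: {share_by_prefix(rows, 'fq_codel_'):.2f}%")
for module, pct in sorted(share_by_module(rows).items(), key=lambda kv: -kv[1]):
    print(f"{module:12s} {pct:6.2f}%")
```

On the sample rows, the three fq_codel_* symbols add up to 47.32% of samples, nearly all of it from fq_codel_drop, which matches the observation in the report that fq_codel_drop dominates the profile.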


RE: ath10k performance, master branch from 20160407

2016-04-08 Thread Manoharan, Rajkumar
That should be fine. Is codel running only for latest backports? Are there any 
openwrt changes to configure codel? Can you plz try to reset master branch to 
older commit and validate?

-Rajkumar

From: Roman Yeryomin [leroi.li...@gmail.com]
Sent: Friday, April 8, 2016 9:30 PM
To: Manoharan, Rajkumar
Cc: ath10k@lists.infradead.org; Rajkumar Manoharan
Subject: Re: ath10k performance, master branch from 20160407

Rajkumar,

I took backports from
git://git.kernel.org/pub/scm/linux/kernel/git/backports/backports.git,
took latest ath tree from
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git, generated
backports-output based on ath master branch, refreshed openwrt
patches.
And saw big performance degradation. Am I doing something wrong?

Regards,
Roman

On 8 April 2016 at 18:34, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> wrote:
> Roman,
>
> Which backports version are you using? I don't see codel changes in 
> ath.git/wireless-drivers.git.
> Hope you are using same firmware.
>
> -Rajkumar
> 
> From: ath10k <ath10k-boun...@lists.infradead.org> on behalf of Roman Yeryomin 
> <leroi.li...@gmail.com>
> Sent: Friday, April 8, 2016 8:14 PM
> To: ath10k@lists.infradead.org
> Subject: ath10k performance, master branch from 20160407
>
> Hello!
>
> I've seen performance patches were committed so I've decided to give it
> a try (using 4.1 kernel and backports).
> The results are quite disappointing: TCP download (client pov) dropped
> from 750Mbps to ~550 and UDP shows completely weird behaviour - if
> generating 900Mbps it gives 30Mbps max, if generating 300Mbps it gives
> 250Mbps, before (latest official backports release from January) I was
> able to get 900Mbps.
> Hardware is basically ap152 + qca988x 3x3.
> When running perf top I see that fq_codel_drop eats a lot of cpu.
> Here is the output when running iperf3 UDP test:
>
> 45.78%  [kernel]   [k] fq_codel_drop
>  3.05%  [kernel]   [k] ag71xx_poll
>  2.18%  [kernel]   [k] skb_release_data
>  2.01%  [kernel]   [k] r4k_dma_cache_inv
>  1.73%  [kernel]   [k] eth_type_trans
>  1.24%  [kernel]   [k] build_skb
>  1.20%  [mac80211] [k] ieee80211_tx_dequeue
>  1.03%  [kernel]   [k] __delay
>  0.98%  [kernel]   [k] fq_codel_enqueue
>  0.94%  [kernel]   [k] __netif_receive_skb_core
>  0.93%  [kernel]   [k] skb_release_head_state
>  0.88%  [ath10k_core]  [k] ath10k_htt_tx
>  0.87%  [kernel]   [k] __dev_queue_xmit
>  0.84%  [mac80211] [k] ieee80211_tx_status
>  0.81%  [kernel]   [k] __build_skb
>  0.80%  [mac80211] [k] __ieee80211_subif_start_xmit
>  0.77%  [kernel]   [k] br_handle_frame_finish
>  0.75%  [kernel]   [k] __qdisc_run
>  0.73%  [kernel]   [k] skb_recycler_consume
>  0.72%  [kernel]   [k] kfree_skb
>  0.72%  [kernel]   [k] get_page_from_freelist
>  0.69%  [kernel]   [k] br_fdb_update
>  0.69%  [kernel]   [k] br_handle_frame
>  0.67%  [kernel]   [k] __copy_user_common
>  0.66%  [kernel]   [k] __skb_flow_dissect
>  0.65%  [ath10k_core]  [k] ath10k_txrx_tx_unref
>  0.60%  [kernel]   [k] kmem_cache_alloc
>  0.60%  [mac80211] [k] sta_addr_hash
>  0.56%  [kernel]   [k] fq_codel_dequeue
>  0.53%  [kernel]   [k] __local_bh_enable_ip
>  0.50%  [kernel]   [k] __br_fdb_get
>
> What could be the reason?
> I've seen there are some patches from Michal which touch fq_codel,
> would those help or not?
>
>
> Regards,
> Roman


Re: ath10k performance, master branch from 20160407

2016-04-08 Thread Roman Yeryomin
Rajkumar,

I took backports from
git://git.kernel.org/pub/scm/linux/kernel/git/backports/backports.git,
took latest ath tree from
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git, generated
backports-output based on ath master branch, refreshed openwrt
patches.
And saw big performance degradation. Am I doing something wrong?

Regards,
Roman

On 8 April 2016 at 18:34, Manoharan, Rajkumar <rmano...@qti.qualcomm.com> wrote:
> Roman,
>
> Which backports version are you using? I don't see codel changes in 
> ath.git/wireless-drivers.git.
> Hope you are using same firmware.
>
> -Rajkumar
> 
> From: ath10k <ath10k-boun...@lists.infradead.org> on behalf of Roman Yeryomin
> <leroi.li...@gmail.com>
> Sent: Friday, April 8, 2016 8:14 PM
> To: ath10k@lists.infradead.org
> Subject: ath10k performance, master branch from 20160407
>
> Hello!
>
> I've seen performance patches were committed so I've decided to give it
> a try (using 4.1 kernel and backports).
> The results are quite disappointing: TCP download (client pov) dropped
> from 750Mbps to ~550 and UDP shows completely weird behaviour - if
> generating 900Mbps it gives 30Mbps max, if generating 300Mbps it gives
> 250Mbps, before (latest official backports release from January) I was
> able to get 900Mbps.
> Hardware is basically ap152 + qca988x 3x3.
> When running perf top I see that fq_codel_drop eats a lot of cpu.
> Here is the output when running iperf3 UDP test:
>
> 45.78%  [kernel]   [k] fq_codel_drop
>  3.05%  [kernel]   [k] ag71xx_poll
>  2.18%  [kernel]   [k] skb_release_data
>  2.01%  [kernel]   [k] r4k_dma_cache_inv
>  1.73%  [kernel]   [k] eth_type_trans
>  1.24%  [kernel]   [k] build_skb
>  1.20%  [mac80211] [k] ieee80211_tx_dequeue
>  1.03%  [kernel]   [k] __delay
>  0.98%  [kernel]   [k] fq_codel_enqueue
>  0.94%  [kernel]   [k] __netif_receive_skb_core
>  0.93%  [kernel]   [k] skb_release_head_state
>  0.88%  [ath10k_core]  [k] ath10k_htt_tx
>  0.87%  [kernel]   [k] __dev_queue_xmit
>  0.84%  [mac80211] [k] ieee80211_tx_status
>  0.81%  [kernel]   [k] __build_skb
>  0.80%  [mac80211] [k] __ieee80211_subif_start_xmit
>  0.77%  [kernel]   [k] br_handle_frame_finish
>  0.75%  [kernel]   [k] __qdisc_run
>  0.73%  [kernel]   [k] skb_recycler_consume
>  0.72%  [kernel]   [k] kfree_skb
>  0.72%  [kernel]   [k] get_page_from_freelist
>  0.69%  [kernel]   [k] br_fdb_update
>  0.69%  [kernel]   [k] br_handle_frame
>  0.67%  [kernel]   [k] __copy_user_common
>  0.66%  [kernel]   [k] __skb_flow_dissect
>  0.65%  [ath10k_core]  [k] ath10k_txrx_tx_unref
>  0.60%  [kernel]   [k] kmem_cache_alloc
>  0.60%  [mac80211] [k] sta_addr_hash
>  0.56%  [kernel]   [k] fq_codel_dequeue
>  0.53%  [kernel]   [k] __local_bh_enable_ip
>  0.50%  [kernel]   [k] __br_fdb_get
>
> What could be the reason?
> I've seen there are some patches from Michal which touch fq_codel,
> would those help or not?
>
>
> Regards,
> Roman


Re: ath10k performance, master branch from 20160407

2016-04-08 Thread Manoharan, Rajkumar
Roman,

Which backports version are you using? I don't see codel changes in 
ath.git/wireless-drivers.git.
Hope you are using same firmware.

-Rajkumar

From: ath10k <ath10k-boun...@lists.infradead.org> on behalf of Roman Yeryomin
<leroi.li...@gmail.com>
Sent: Friday, April 8, 2016 8:14 PM
To: ath10k@lists.infradead.org
Subject: ath10k performance, master branch from 20160407

Hello!

I've seen performance patches were committed so I've decided to give it
a try (using 4.1 kernel and backports).
The results are quite disappointing: TCP download (client pov) dropped
from 750Mbps to ~550 and UDP shows completely weird behaviour - if
generating 900Mbps it gives 30Mbps max, if generating 300Mbps it gives
250Mbps, before (latest official backports release from January) I was
able to get 900Mbps.
Hardware is basically ap152 + qca988x 3x3.
When running perf top I see that fq_codel_drop eats a lot of cpu.
Here is the output when running iperf3 UDP test:

45.78%  [kernel]   [k] fq_codel_drop
 3.05%  [kernel]   [k] ag71xx_poll
 2.18%  [kernel]   [k] skb_release_data
 2.01%  [kernel]   [k] r4k_dma_cache_inv
 1.73%  [kernel]   [k] eth_type_trans
 1.24%  [kernel]   [k] build_skb
 1.20%  [mac80211] [k] ieee80211_tx_dequeue
 1.03%  [kernel]   [k] __delay
 0.98%  [kernel]   [k] fq_codel_enqueue
 0.94%  [kernel]   [k] __netif_receive_skb_core
 0.93%  [kernel]   [k] skb_release_head_state
 0.88%  [ath10k_core]  [k] ath10k_htt_tx
 0.87%  [kernel]   [k] __dev_queue_xmit
 0.84%  [mac80211] [k] ieee80211_tx_status
 0.81%  [kernel]   [k] __build_skb
 0.80%  [mac80211] [k] __ieee80211_subif_start_xmit
 0.77%  [kernel]   [k] br_handle_frame_finish
 0.75%  [kernel]   [k] __qdisc_run
 0.73%  [kernel]   [k] skb_recycler_consume
 0.72%  [kernel]   [k] kfree_skb
 0.72%  [kernel]   [k] get_page_from_freelist
 0.69%  [kernel]   [k] br_fdb_update
 0.69%  [kernel]   [k] br_handle_frame
 0.67%  [kernel]   [k] __copy_user_common
 0.66%  [kernel]   [k] __skb_flow_dissect
 0.65%  [ath10k_core]  [k] ath10k_txrx_tx_unref
 0.60%  [kernel]   [k] kmem_cache_alloc
 0.60%  [mac80211] [k] sta_addr_hash
 0.56%  [kernel]   [k] fq_codel_dequeue
 0.53%  [kernel]   [k] __local_bh_enable_ip
 0.50%  [kernel]   [k] __br_fdb_get

What could be the reason?
I've seen there are some patches from Michal which touch fq_codel,
would those help or not?


Regards,
Roman
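
To put the figures in this report on a common scale, the arithmetic can be sketched as below. This is only an illustration: the throughput numbers are the ones quoted in the message (750 to ~550 Mbps for TCP; 900 Mbps offered yielding 30 Mbps and 300 Mbps offered yielding 250 Mbps for UDP), and the helper names relative_drop and delivery_ratio are hypothetical, not part of iperf3 or any ath10k tooling.

```python
def relative_drop(before_mbps, after_mbps):
    """Throughput loss as a percentage of the earlier figure."""
    if before_mbps <= 0:
        raise ValueError("before_mbps must be positive")
    return (before_mbps - after_mbps) / before_mbps * 100.0


def delivery_ratio(offered_mbps, received_mbps):
    """Received throughput as a percentage of the offered load."""
    if offered_mbps <= 0:
        raise ValueError("offered_mbps must be positive")
    return received_mbps / offered_mbps * 100.0


# Figures taken from the report above.
tcp_drop = relative_drop(750, 550)
udp_at_900 = delivery_ratio(900, 30)
udp_at_300 = delivery_ratio(300, 250)

print(f"TCP download regression: {tcp_drop:.1f}% lower")
print(f"UDP at 900 Mbps offered: {udp_at_900:.1f}% delivered")
print(f"UDP at 300 Mbps offered: {udp_at_300:.1f}% delivered")
```

With these inputs the TCP regression is roughly 27%, while UDP delivery falls from about 83% of the offered load at 300 Mbps to about 3% at 900 Mbps, which is the overload behaviour the report describes as weird.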
