Re: ath10k wake_tx_queue issues
On Wed, May 16, 2018 at 07:28:21PM +0200, Erik Stromdahl wrote: > Hello Niklas > > Quick question: > Are you using my patch: "ath10k: add htt_tx num_pending window"? Nope, but I definitely think that your patch should be merged, since the current code can lock/unlock/lock a lot of times for no good reason. (I actually tried it, but I could still reproduce the bug with it.) > > I assume (from your logs below) that you are not... > Thanks a lot for you suggestions Erik! Increasing max_num_pending might be a good idea (perhaps we will get better thoughput in the SDIO case). However, increasing either max_num_pending or HTC_HOST_MAX_MSG_PER_TX_BUNDLE would probably just move the problem, since it would still be possible for us to get hit by the same problem again in the future. I actually took Toke's suggestion and cooked up a patch: https://marc.info/?l=linux-kernel=152659902128543=2 Kind regards, Niklas
Re: ath10k wake_tx_queue issues
Hello Niklas Quick question: Are you using my patch: "ath10k: add htt_tx num_pending window"? I assume (from your logs below) that you are not... See more comments below. I guess the best way to describe this is to show my ftrace buffer: ksoftirqd/2-21[002] .ns474.711744: ath10k_htt_tx_dec_pending: num_pen: 60 ksoftirqd/2-21[002] .ns374.711761: ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: 60 queue: 0 ksoftirqd/2-21[002] .ns474.711765: ath10k_htt_tx_inc_pending: num_pen: 61 ksoftirqd/2-21[002] .ns474.711781: ath10k_htt_tx_inc_pending: num_pen: 62 ksoftirqd/2-21[002] .ns474.711787: ath10k_htt_tx_dec_pending: num_pen: 61 ksoftirqd/2-21[002] .ns374.711803: ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: 61 queue: 0 ksoftirqd/2-21[002] .ns474.711807: ath10k_htt_tx_inc_pending: num_pen: 62 ksoftirqd/2-21[002] .ns474.711823: ath10k_htt_tx_inc_pending: num_pen: 63 ksoftirqd/2-21[002] .ns474.711829: ath10k_htt_tx_dec_pending: num_pen: 62 ksoftirqd/2-21[002] .ns374.711845: ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: 62 queue: 0 ksoftirqd/2-21[002] .ns474.711849: ath10k_htt_tx_inc_pending: num_pen: 63 ksoftirqd/2-21[002] .ns474.711865: ath10k_htt_tx_inc_pending: num_pen: 64 For high latency devices, when the num_pending counter reaches this point (64), ath10k calls ieee80211_stop_queues (from ath10k_htt_tx_inc_pending -> ath10k_mac_tx_lock). This will stop all per-TID queues (16 in total). ksoftirqd/2-21[002] dns574.711870: stop_queue: phy0 queue:0, reason:0 ksoftirqd/2-21[002] dns574.711874: stop_queue: phy0 queue:1, reason:0 ksoftirqd/2-21[002] dns574.711877: stop_queue: phy0 queue:2, reason:0 ksoftirqd/2-21[002] dns574.711880: stop_queue: phy0 queue:3, reason:0 ksoftirqd/2-21[002] dns574.711882: stop_queue: phy0 queue:4, reason:0 ksoftirqd/2-21[002] dns574.711885: stop_queue: phy0 queue:5, reason:0 ksoftirqd/2-21[002] dns574.711887: stop_queue: phy0 queue:6, reason:0 ksoftirqd/2-21[002] dns574.711890: stop_queue: phy0 queue:7, reason:0 ksoftirqd/2-21[002] dns574.711892: stop_queue: phy0 queue:8, reason:0 ksoftirqd/2-21[002] dns574.711895: stop_queue: phy0 queue:9, reason:0 ksoftirqd/2-21[002] dns574.711898: stop_queue: phy0 queue:10, reason:0 ksoftirqd/2-21[002] dns574.711900: stop_queue: phy0 queue:11, reason:0 ksoftirqd/2-21[002] dns574.711903: stop_queue: phy0 queue:12, reason:0 ksoftirqd/2-21[002] dns574.711905: stop_queue: phy0 queue:13, reason:0 ksoftirqd/2-21[002] dns574.711908: stop_queue: phy0 queue:14, reason:0 ksoftirqd/2-21[002] dns574.711910: stop_queue: phy0 queue:15, reason:0 ksoftirqd/2-21[002] .ns474.711917: ath10k_htt_tx_dec_pending: num_pen: 63 At this point, the driver have received a TX_COMPL_IND (since num_pending has been decremented to 63). Without the patch I mentioned above, the logic is that the TX queues will be re-enabled immediately. This is achieved by calling ieee80211_stop_queues. ksoftirqd/2-21[002] dns574.711922: wake_queue: phy0 queue:0, reason:0 ksoftirqd/2-21[002] dns574.711929: wake_queue: phy0 queue:15, reason:0 2 queues have been enabled here. I don't know why not all queues have been re-enabled. Perhaps it is due to that mac80211 immediately gets more data on one of the just re-enabled queues, resulting in... ksoftirqd/2-21[002] .ns374.711948: ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: 63 queue: 0 ...the the wake_tx_queue op beeing invoked again ... ksoftirqd/2-21[002] .ns474.711952: ath10k_htt_tx_inc_pending: num_pen: 64 ... and the num_pending counter beeing incremented as a result of this. This results in all queues beeing ... ksoftirqd/2-21[002] dns574.711956: stop_queue: phy0 queue:0, reason:0 ksoftirqd/2-21[002] dns574.711959: stop_queue: phy0 queue:1, reason:0 ksoftirqd/2-21[002] dns574.711962: stop_queue: phy0 queue:2, reason:0 ksoftirqd/2-21[002] dns574.711964: stop_queue: phy0 queue:3, reason:0 ksoftirqd/2-21[002] dns574.711967: stop_queue: phy0 queue:4, reason:0 ksoftirqd/2-21[002] dns574.711969: stop_queue: phy0 queue:5, reason:0 ksoftirqd/2-21[002] dns574.711972: stop_queue: phy0 queue:6, reason:0 ksoftirqd/2-21[002] dns574.711974: stop_queue: phy0 queue:7, reason:0 ksoftirqd/2-21
Re: ath10k wake_tx_queue issues
Niklas Casselwrites: > [ .. snip .. ] >> > Sure, the regular way ath10k_mac_op_wake_tx_queue is called is via >> > ieee80211_subif_start_xmit => __ieee80211_subif_start_xmit => >> > ieee80211_xmit_fast >> > => ieee80211_queue_skb => drv_wake_tx_queue. >> > >> > But I was expecting the call to ieee80211_wake_queue to somehow trigger a >> > call >> > to ath10k_mac_op_wake_tx_queue, since there is still data in the send >> > buffer/ >> > in the ieee80211_txq that needs to be sent, to allow more data to be >> > written to >> > the socket. But obviously the callback never comes. >> > Or how else is this supposed to work? >> >> The driver should reschedule itself before/after calling >> ieee80211_wake_queue. mt76 does this; I'm not actually sure if ath9k >> does the right thing either, I'm not too familiar with that part of the >> code. There's no direct call to reschedule that I can see, but there may >> be another reason why this is not needed for ath9k. I'm sure Felix >> knows? > > Hello Toke > > Unfortunately, it doesn't look like mt76 uses any ieee80211_* function > to reschedule. It doesn't need to; it just reschedules itself. Basically, the wake_tx_queue() callback is just a way for mac80211 to notify the driver that new packets are available and that it should start its scheduling function. But in this case it is the driver that is restarting the queues, so it already knows that. And so it can just call its internal scheduling function. This is what mt76 does in mt76_dma_tx_cleanup() with the call to mt76_txq_schedule() before calling ieee80211_wake_queue(). I think that what ath10k should be doing is calling ath10k_mac_tx_push_pending() when it restarts the queues. > I just came across a ieee80211_schedule_txq() function in e937b8da5a59 > ("mac80211: Add TXQ scheduling API"). However, this commit was > reverted. Any plans on resubmitting this? Yeah, I have a revised version lying around waiting for Felix to review it. But that wouldn't help this bug; it's just an API change, it doesn't change behaviour... -Toke
Re: ath10k wake_tx_queue issues
On Wed, May 16, 2018 at 07:54:54AM +0200, Alexander Wetzel wrote: > Hello, > > This sounds exactly like the issue I just submitted a patch for. > Can you test https://patchwork.kernel.org/patch/10399613/ if that solves > the issue? Hello Alexander, I just tried you fix, and unfortunately it doesn't solve my problem. Regards, Niklas > > > > Am 15.05.2018 um 22:31 schrieb Niklas Cassel: > > On Tue, May 15, 2018 at 04:13:48PM +0200, Toke Høiland-Jørgensen wrote: > >> [ Adding Felix ] > >> > >> > >> Niklas Casselwrites: > >> > >>> Hello mac80211 and ath10k people > >>> > >>> Using ath10k, TX stops working when running iperf > >>> > >>> [ 3] 0.0- 1.0 sec 2.00 MBytes 16.8 Mbits/sec > >>> [ 3] 1.0- 2.0 sec 3.12 MBytes 26.2 Mbits/sec > >>> [ 3] 2.0- 3.0 sec 3.25 MBytes 27.3 Mbits/sec > >>> [ 3] 3.0- 4.0 sec 655 KBytes 5.36 Mbits/sec > >>> [ 3] 4.0- 5.0 sec 0.00 Bytes 0.00 bits/sec > >>> [ 3] 5.0- 6.0 sec 0.00 Bytes 0.00 bits/sec > >>> [ 3] 6.0- 7.0 sec 0.00 Bytes 0.00 bits/sec > >>> [ 3] 7.0- 8.0 sec 0.00 Bytes 0.00 bits/sec > >>> [ 3] 8.0- 9.0 sec 0.00 Bytes 0.00 bits/sec > >>> [ 3] 9.0-10.0 sec 0.00 Bytes 0.00 bits/sec > >>> [ 3] 0.0-10.3 sec 9.01 MBytes 7.32 Mbits/sec > >>> > >>> The problem can be reproduced without specifying a send buffer size, > >>> however, specifying a small send buffer helps to reproduce the problem > >>> faster. > >>> > >>> What happens is that iperf gets -EAGAIN on write(). > >>> It continues to get -EAGAIN, even if iperf runs for e.g. 300 seconds. > >>> The reason why we get -EAGAIN is because the send socket buffer is full > >>> (iperf uses non-blocking I/O). > >>> > >>> The problem is that the mac80211 wake_tx_queue callback never comes. > >>> > >>> I guess the best way to describe this is to show my ftrace buffer: > >>> > >>> ksoftirqd/2-21[002] .ns474.711744: > >>> ath10k_htt_tx_dec_pending: num_pen: 60 > >>> ksoftirqd/2-21[002] .ns374.711761: > >>> ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: > >>> frame_cnt: 1 byte_cnt: 1534 num_pen: 60 queue: 0 > >>> ksoftirqd/2-21[002] .ns474.711765: > >>> ath10k_htt_tx_inc_pending: num_pen: 61 > >>> ksoftirqd/2-21[002] .ns474.711781: > >>> ath10k_htt_tx_inc_pending: num_pen: 62 > >>> ksoftirqd/2-21[002] .ns474.711787: > >>> ath10k_htt_tx_dec_pending: num_pen: 61 > >>> ksoftirqd/2-21[002] .ns374.711803: > >>> ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: > >>> frame_cnt: 1 byte_cnt: 1534 num_pen: 61 queue: 0 > >>> ksoftirqd/2-21[002] .ns474.711807: > >>> ath10k_htt_tx_inc_pending: num_pen: 62 > >>> ksoftirqd/2-21[002] .ns474.711823: > >>> ath10k_htt_tx_inc_pending: num_pen: 63 > >>> ksoftirqd/2-21[002] .ns474.711829: > >>> ath10k_htt_tx_dec_pending: num_pen: 62 > >>> ksoftirqd/2-21[002] .ns374.711845: > >>> ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: > >>> frame_cnt: 1 byte_cnt: 1534 num_pen: 62 queue: 0 > >>> ksoftirqd/2-21[002] .ns474.711849: > >>> ath10k_htt_tx_inc_pending: num_pen: 63 > >>> ksoftirqd/2-21[002] .ns474.711865: > >>> ath10k_htt_tx_inc_pending: num_pen: 64 > >>> ksoftirqd/2-21[002] dns574.711870: stop_queue: phy0 queue:0, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711874: stop_queue: phy0 queue:1, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711877: stop_queue: phy0 queue:2, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711880: stop_queue: phy0 queue:3, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711882: stop_queue: phy0 queue:4, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711885: stop_queue: phy0 queue:5, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711887: stop_queue: phy0 queue:6, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711890: stop_queue: phy0 queue:7, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711892: stop_queue: phy0 queue:8, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711895: stop_queue: phy0 queue:9, > >>> reason:0 > >>> ksoftirqd/2-21[002] dns574.711898: stop_queue: phy0 > >>> queue:10, reason:0 > >>> ksoftirqd/2-21[002] dns574.711900: stop_queue: phy0 > >>> queue:11, reason:0 > >>> ksoftirqd/2-21[002] dns574.711903: stop_queue: phy0 > >>> queue:12, reason:0 > >>> ksoftirqd/2-21[002] dns574.711905: stop_queue: phy0 > >>> queue:13, reason:0 > >>> ksoftirqd/2-21[002] dns574.711908: stop_queue: phy0 > >>> queue:14, reason:0 > >>> ksoftirqd/2-21[002] dns574.711910: stop_queue: phy0 > >>> queue:15, reason:0 > >>> ksoftirqd/2-21[002] .ns474.711917: > >>> ath10k_htt_tx_dec_pending: num_pen: 63 > >>> ksoftirqd/2-21[002] dns5
Re: ath10k wake_tx_queue issues
Hello, This sounds exactly like the issue I just submitted a patch for. Can you test https://patchwork.kernel.org/patch/10399613/ if that solves the issue? Am 15.05.2018 um 22:31 schrieb Niklas Cassel: > On Tue, May 15, 2018 at 04:13:48PM +0200, Toke Høiland-Jørgensen wrote: >> [ Adding Felix ] >> >> >> Niklas Casselwrites: >> >>> Hello mac80211 and ath10k people >>> >>> Using ath10k, TX stops working when running iperf >>> >>> [ 3] 0.0- 1.0 sec 2.00 MBytes 16.8 Mbits/sec >>> [ 3] 1.0- 2.0 sec 3.12 MBytes 26.2 Mbits/sec >>> [ 3] 2.0- 3.0 sec 3.25 MBytes 27.3 Mbits/sec >>> [ 3] 3.0- 4.0 sec 655 KBytes 5.36 Mbits/sec >>> [ 3] 4.0- 5.0 sec 0.00 Bytes 0.00 bits/sec >>> [ 3] 5.0- 6.0 sec 0.00 Bytes 0.00 bits/sec >>> [ 3] 6.0- 7.0 sec 0.00 Bytes 0.00 bits/sec >>> [ 3] 7.0- 8.0 sec 0.00 Bytes 0.00 bits/sec >>> [ 3] 8.0- 9.0 sec 0.00 Bytes 0.00 bits/sec >>> [ 3] 9.0-10.0 sec 0.00 Bytes 0.00 bits/sec >>> [ 3] 0.0-10.3 sec 9.01 MBytes 7.32 Mbits/sec >>> >>> The problem can be reproduced without specifying a send buffer size, >>> however, specifying a small send buffer helps to reproduce the problem >>> faster. >>> >>> What happens is that iperf gets -EAGAIN on write(). >>> It continues to get -EAGAIN, even if iperf runs for e.g. 300 seconds. >>> The reason why we get -EAGAIN is because the send socket buffer is full >>> (iperf uses non-blocking I/O). >>> >>> The problem is that the mac80211 wake_tx_queue callback never comes. >>> >>> I guess the best way to describe this is to show my ftrace buffer: >>> >>> ksoftirqd/2-21[002] .ns474.711744: ath10k_htt_tx_dec_pending: >>> num_pen: 60 >>> ksoftirqd/2-21[002] .ns374.711761: >>> ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: >>> frame_cnt: 1 byte_cnt: 1534 num_pen: 60 queue: 0 >>> ksoftirqd/2-21[002] .ns474.711765: ath10k_htt_tx_inc_pending: >>> num_pen: 61 >>> ksoftirqd/2-21[002] .ns474.711781: ath10k_htt_tx_inc_pending: >>> num_pen: 62 >>> ksoftirqd/2-21[002] .ns474.711787: ath10k_htt_tx_dec_pending: >>> num_pen: 61 >>> ksoftirqd/2-21[002] .ns374.711803: >>> ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: >>> frame_cnt: 1 byte_cnt: 1534 num_pen: 61 queue: 0 >>> ksoftirqd/2-21[002] .ns474.711807: ath10k_htt_tx_inc_pending: >>> num_pen: 62 >>> ksoftirqd/2-21[002] .ns474.711823: ath10k_htt_tx_inc_pending: >>> num_pen: 63 >>> ksoftirqd/2-21[002] .ns474.711829: ath10k_htt_tx_dec_pending: >>> num_pen: 62 >>> ksoftirqd/2-21[002] .ns374.711845: >>> ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: >>> frame_cnt: 1 byte_cnt: 1534 num_pen: 62 queue: 0 >>> ksoftirqd/2-21[002] .ns474.711849: ath10k_htt_tx_inc_pending: >>> num_pen: 63 >>> ksoftirqd/2-21[002] .ns474.711865: ath10k_htt_tx_inc_pending: >>> num_pen: 64 >>> ksoftirqd/2-21[002] dns574.711870: stop_queue: phy0 queue:0, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711874: stop_queue: phy0 queue:1, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711877: stop_queue: phy0 queue:2, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711880: stop_queue: phy0 queue:3, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711882: stop_queue: phy0 queue:4, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711885: stop_queue: phy0 queue:5, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711887: stop_queue: phy0 queue:6, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711890: stop_queue: phy0 queue:7, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711892: stop_queue: phy0 queue:8, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711895: stop_queue: phy0 queue:9, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711898: stop_queue: phy0 queue:10, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711900: stop_queue: phy0 queue:11, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711903: stop_queue: phy0 queue:12, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711905: stop_queue: phy0 queue:13, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711908: stop_queue: phy0 queue:14, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711910: stop_queue: phy0 queue:15, >>> reason:0 >>> ksoftirqd/2-21[002] .ns474.711917: ath10k_htt_tx_dec_pending: >>> num_pen: 63 >>> ksoftirqd/2-21[002] dns574.711922: wake_queue: phy0 queue:0, >>> reason:0 >>> ksoftirqd/2-21[002] dns574.711929: wake_queue: phy0 queue:15, >>> reason:0 >>> ksoftirqd/2-21[002] .ns374.711948: >>> ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: >>> frame_cnt: 1 byte_cnt: 1534 num_pen: 63 queue: 0 >>> ksoftirqd/2-21[002] .ns474.711952:
Re: ath10k wake_tx_queue issues
On Tue, May 15, 2018 at 04:13:48PM +0200, Toke Høiland-Jørgensen wrote: > [ Adding Felix ] > > > Niklas Casselwrites: > > > Hello mac80211 and ath10k people > > > > Using ath10k, TX stops working when running iperf > > > > [ 3] 0.0- 1.0 sec 2.00 MBytes 16.8 Mbits/sec > > [ 3] 1.0- 2.0 sec 3.12 MBytes 26.2 Mbits/sec > > [ 3] 2.0- 3.0 sec 3.25 MBytes 27.3 Mbits/sec > > [ 3] 3.0- 4.0 sec 655 KBytes 5.36 Mbits/sec > > [ 3] 4.0- 5.0 sec 0.00 Bytes 0.00 bits/sec > > [ 3] 5.0- 6.0 sec 0.00 Bytes 0.00 bits/sec > > [ 3] 6.0- 7.0 sec 0.00 Bytes 0.00 bits/sec > > [ 3] 7.0- 8.0 sec 0.00 Bytes 0.00 bits/sec > > [ 3] 8.0- 9.0 sec 0.00 Bytes 0.00 bits/sec > > [ 3] 9.0-10.0 sec 0.00 Bytes 0.00 bits/sec > > [ 3] 0.0-10.3 sec 9.01 MBytes 7.32 Mbits/sec > > > > The problem can be reproduced without specifying a send buffer size, > > however, specifying a small send buffer helps to reproduce the problem > > faster. > > > > What happens is that iperf gets -EAGAIN on write(). > > It continues to get -EAGAIN, even if iperf runs for e.g. 300 seconds. > > The reason why we get -EAGAIN is because the send socket buffer is full > > (iperf uses non-blocking I/O). > > > > The problem is that the mac80211 wake_tx_queue callback never comes. > > > > I guess the best way to describe this is to show my ftrace buffer: > > > > ksoftirqd/2-21[002] .ns474.711744: ath10k_htt_tx_dec_pending: > > num_pen: 60 > > ksoftirqd/2-21[002] .ns374.711761: > > ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: > > frame_cnt: 1 byte_cnt: 1534 num_pen: 60 queue: 0 > > ksoftirqd/2-21[002] .ns474.711765: ath10k_htt_tx_inc_pending: > > num_pen: 61 > > ksoftirqd/2-21[002] .ns474.711781: ath10k_htt_tx_inc_pending: > > num_pen: 62 > > ksoftirqd/2-21[002] .ns474.711787: ath10k_htt_tx_dec_pending: > > num_pen: 61 > > ksoftirqd/2-21[002] .ns374.711803: > > ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: > > frame_cnt: 1 byte_cnt: 1534 num_pen: 61 queue: 0 > > ksoftirqd/2-21[002] .ns474.711807: ath10k_htt_tx_inc_pending: > > num_pen: 62 > > ksoftirqd/2-21[002] .ns474.711823: ath10k_htt_tx_inc_pending: > > num_pen: 63 > > ksoftirqd/2-21[002] .ns474.711829: ath10k_htt_tx_dec_pending: > > num_pen: 62 > > ksoftirqd/2-21[002] .ns374.711845: > > ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: > > frame_cnt: 1 byte_cnt: 1534 num_pen: 62 queue: 0 > > ksoftirqd/2-21[002] .ns474.711849: ath10k_htt_tx_inc_pending: > > num_pen: 63 > > ksoftirqd/2-21[002] .ns474.711865: ath10k_htt_tx_inc_pending: > > num_pen: 64 > > ksoftirqd/2-21[002] dns574.711870: stop_queue: phy0 queue:0, > > reason:0 > > ksoftirqd/2-21[002] dns574.711874: stop_queue: phy0 queue:1, > > reason:0 > > ksoftirqd/2-21[002] dns574.711877: stop_queue: phy0 queue:2, > > reason:0 > > ksoftirqd/2-21[002] dns574.711880: stop_queue: phy0 queue:3, > > reason:0 > > ksoftirqd/2-21[002] dns574.711882: stop_queue: phy0 queue:4, > > reason:0 > > ksoftirqd/2-21[002] dns574.711885: stop_queue: phy0 queue:5, > > reason:0 > > ksoftirqd/2-21[002] dns574.711887: stop_queue: phy0 queue:6, > > reason:0 > > ksoftirqd/2-21[002] dns574.711890: stop_queue: phy0 queue:7, > > reason:0 > > ksoftirqd/2-21[002] dns574.711892: stop_queue: phy0 queue:8, > > reason:0 > > ksoftirqd/2-21[002] dns574.711895: stop_queue: phy0 queue:9, > > reason:0 > > ksoftirqd/2-21[002] dns574.711898: stop_queue: phy0 queue:10, > > reason:0 > > ksoftirqd/2-21[002] dns574.711900: stop_queue: phy0 queue:11, > > reason:0 > > ksoftirqd/2-21[002] dns574.711903: stop_queue: phy0 queue:12, > > reason:0 > > ksoftirqd/2-21[002] dns574.711905: stop_queue: phy0 queue:13, > > reason:0 > > ksoftirqd/2-21[002] dns574.711908: stop_queue: phy0 queue:14, > > reason:0 > > ksoftirqd/2-21[002] dns574.711910: stop_queue: phy0 queue:15, > > reason:0 > > ksoftirqd/2-21[002] .ns474.711917: ath10k_htt_tx_dec_pending: > > num_pen: 63 > > ksoftirqd/2-21[002] dns574.711922: wake_queue: phy0 queue:0, > > reason:0 > > ksoftirqd/2-21[002] dns574.711929: wake_queue: phy0 queue:15, > > reason:0 > > ksoftirqd/2-21[002] .ns374.711948: > > ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: > > frame_cnt: 1 byte_cnt: 1534 num_pen: 63 queue: 0 > > ksoftirqd/2-21[002] .ns474.711952: ath10k_htt_tx_inc_pending: > > num_pen: 64 > > ksoftirqd/2-21[002] dns574.711956: stop_queue: phy0 queue:0, > > reason:0 > > ksoftirqd/2-21[002] dns574.711959: stop_queue: phy0 queue:1, >
Re: ath10k wake_tx_queue issues
[ Adding Felix ] Niklas Casselwrites: > Hello mac80211 and ath10k people > > Using ath10k, TX stops working when running iperf > > [ 3] 0.0- 1.0 sec 2.00 MBytes 16.8 Mbits/sec > [ 3] 1.0- 2.0 sec 3.12 MBytes 26.2 Mbits/sec > [ 3] 2.0- 3.0 sec 3.25 MBytes 27.3 Mbits/sec > [ 3] 3.0- 4.0 sec 655 KBytes 5.36 Mbits/sec > [ 3] 4.0- 5.0 sec 0.00 Bytes 0.00 bits/sec > [ 3] 5.0- 6.0 sec 0.00 Bytes 0.00 bits/sec > [ 3] 6.0- 7.0 sec 0.00 Bytes 0.00 bits/sec > [ 3] 7.0- 8.0 sec 0.00 Bytes 0.00 bits/sec > [ 3] 8.0- 9.0 sec 0.00 Bytes 0.00 bits/sec > [ 3] 9.0-10.0 sec 0.00 Bytes 0.00 bits/sec > [ 3] 0.0-10.3 sec 9.01 MBytes 7.32 Mbits/sec > > The problem can be reproduced without specifying a send buffer size, > however, specifying a small send buffer helps to reproduce the problem faster. > > What happens is that iperf gets -EAGAIN on write(). > It continues to get -EAGAIN, even if iperf runs for e.g. 300 seconds. > The reason why we get -EAGAIN is because the send socket buffer is full > (iperf uses non-blocking I/O). > > The problem is that the mac80211 wake_tx_queue callback never comes. > > I guess the best way to describe this is to show my ftrace buffer: > > ksoftirqd/2-21[002] .ns474.711744: ath10k_htt_tx_dec_pending: > num_pen: 60 > ksoftirqd/2-21[002] .ns374.711761: ath10k_mac_op_wake_tx_queue: > txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: > 60 queue: 0 > ksoftirqd/2-21[002] .ns474.711765: ath10k_htt_tx_inc_pending: > num_pen: 61 > ksoftirqd/2-21[002] .ns474.711781: ath10k_htt_tx_inc_pending: > num_pen: 62 > ksoftirqd/2-21[002] .ns474.711787: ath10k_htt_tx_dec_pending: > num_pen: 61 > ksoftirqd/2-21[002] .ns374.711803: ath10k_mac_op_wake_tx_queue: > txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: > 61 queue: 0 > ksoftirqd/2-21[002] .ns474.711807: ath10k_htt_tx_inc_pending: > num_pen: 62 > ksoftirqd/2-21[002] .ns474.711823: ath10k_htt_tx_inc_pending: > num_pen: 63 > ksoftirqd/2-21[002] .ns474.711829: ath10k_htt_tx_dec_pending: > num_pen: 62 > ksoftirqd/2-21[002] .ns374.711845: ath10k_mac_op_wake_tx_queue: > txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: > 62 queue: 0 > ksoftirqd/2-21[002] .ns474.711849: ath10k_htt_tx_inc_pending: > num_pen: 63 > ksoftirqd/2-21[002] .ns474.711865: ath10k_htt_tx_inc_pending: > num_pen: 64 > ksoftirqd/2-21[002] dns574.711870: stop_queue: phy0 queue:0, > reason:0 > ksoftirqd/2-21[002] dns574.711874: stop_queue: phy0 queue:1, > reason:0 > ksoftirqd/2-21[002] dns574.711877: stop_queue: phy0 queue:2, > reason:0 > ksoftirqd/2-21[002] dns574.711880: stop_queue: phy0 queue:3, > reason:0 > ksoftirqd/2-21[002] dns574.711882: stop_queue: phy0 queue:4, > reason:0 > ksoftirqd/2-21[002] dns574.711885: stop_queue: phy0 queue:5, > reason:0 > ksoftirqd/2-21[002] dns574.711887: stop_queue: phy0 queue:6, > reason:0 > ksoftirqd/2-21[002] dns574.711890: stop_queue: phy0 queue:7, > reason:0 > ksoftirqd/2-21[002] dns574.711892: stop_queue: phy0 queue:8, > reason:0 > ksoftirqd/2-21[002] dns574.711895: stop_queue: phy0 queue:9, > reason:0 > ksoftirqd/2-21[002] dns574.711898: stop_queue: phy0 queue:10, > reason:0 > ksoftirqd/2-21[002] dns574.711900: stop_queue: phy0 queue:11, > reason:0 > ksoftirqd/2-21[002] dns574.711903: stop_queue: phy0 queue:12, > reason:0 > ksoftirqd/2-21[002] dns574.711905: stop_queue: phy0 queue:13, > reason:0 > ksoftirqd/2-21[002] dns574.711908: stop_queue: phy0 queue:14, > reason:0 > ksoftirqd/2-21[002] dns574.711910: stop_queue: phy0 queue:15, > reason:0 > ksoftirqd/2-21[002] .ns474.711917: ath10k_htt_tx_dec_pending: > num_pen: 63 > ksoftirqd/2-21[002] dns574.711922: wake_queue: phy0 queue:0, > reason:0 > ksoftirqd/2-21[002] dns574.711929: wake_queue: phy0 queue:15, > reason:0 > ksoftirqd/2-21[002] .ns374.711948: ath10k_mac_op_wake_tx_queue: > txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: > 63 queue: 0 > ksoftirqd/2-21[002] .ns474.711952: ath10k_htt_tx_inc_pending: > num_pen: 64 > ksoftirqd/2-21[002] dns574.711956: stop_queue: phy0 queue:0, > reason:0 > ksoftirqd/2-21[002] dns574.711959: stop_queue: phy0 queue:1, > reason:0 > ksoftirqd/2-21[002] dns574.711962: stop_queue: phy0 queue:2, > reason:0 > ksoftirqd/2-21[002] dns574.711964: stop_queue: phy0 queue:3, > reason:0 > ksoftirqd/2-21[002] dns574.711967: stop_queue: phy0 queue:4, > reason:0 >
Re: ath10k wake_tx_queue issues
On 05/15/2018 06:45 AM, Niklas Cassel wrote: Hello mac80211 and ath10k people Using ath10k, TX stops working when running iperf What kernel and firmware and ath10k hardware are you using? Does the same thing happen on older kernels? I had some issues with ath9k having some sort of tx-hang, but my test case was using 200 virtual stations to transmit, and others could not reproduce it, so I am not sure I saw the same issue as you did. Thanks, Ben [ 3] 0.0- 1.0 sec 2.00 MBytes 16.8 Mbits/sec [ 3] 1.0- 2.0 sec 3.12 MBytes 26.2 Mbits/sec [ 3] 2.0- 3.0 sec 3.25 MBytes 27.3 Mbits/sec [ 3] 3.0- 4.0 sec 655 KBytes 5.36 Mbits/sec [ 3] 4.0- 5.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 5.0- 6.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 6.0- 7.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 7.0- 8.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 8.0- 9.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 9.0-10.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 0.0-10.3 sec 9.01 MBytes 7.32 Mbits/sec The problem can be reproduced without specifying a send buffer size, however, specifying a small send buffer helps to reproduce the problem faster. What happens is that iperf gets -EAGAIN on write(). It continues to get -EAGAIN, even if iperf runs for e.g. 300 seconds. The reason why we get -EAGAIN is because the send socket buffer is full (iperf uses non-blocking I/O). The problem is that the mac80211 wake_tx_queue callback never comes. I guess the best way to describe this is to show my ftrace buffer: ksoftirqd/2-21[002] .ns474.711744: ath10k_htt_tx_dec_pending: num_pen: 60 ksoftirqd/2-21[002] .ns374.711761: ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: 60 queue: 0 ksoftirqd/2-21[002] .ns474.711765: ath10k_htt_tx_inc_pending: num_pen: 61 ksoftirqd/2-21[002] .ns474.711781: ath10k_htt_tx_inc_pending: num_pen: 62 ksoftirqd/2-21[002] .ns474.711787: ath10k_htt_tx_dec_pending: num_pen: 61 ksoftirqd/2-21[002] .ns374.711803: ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: 61 queue: 0 ksoftirqd/2-21[002] .ns474.711807: ath10k_htt_tx_inc_pending: num_pen: 62 ksoftirqd/2-21[002] .ns474.711823: ath10k_htt_tx_inc_pending: num_pen: 63 ksoftirqd/2-21[002] .ns474.711829: ath10k_htt_tx_dec_pending: num_pen: 62 ksoftirqd/2-21[002] .ns374.711845: ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: 62 queue: 0 ksoftirqd/2-21[002] .ns474.711849: ath10k_htt_tx_inc_pending: num_pen: 63 ksoftirqd/2-21[002] .ns474.711865: ath10k_htt_tx_inc_pending: num_pen: 64 ksoftirqd/2-21[002] dns574.711870: stop_queue: phy0 queue:0, reason:0 ksoftirqd/2-21[002] dns574.711874: stop_queue: phy0 queue:1, reason:0 ksoftirqd/2-21[002] dns574.711877: stop_queue: phy0 queue:2, reason:0 ksoftirqd/2-21[002] dns574.711880: stop_queue: phy0 queue:3, reason:0 ksoftirqd/2-21[002] dns574.711882: stop_queue: phy0 queue:4, reason:0 ksoftirqd/2-21[002] dns574.711885: stop_queue: phy0 queue:5, reason:0 ksoftirqd/2-21[002] dns574.711887: stop_queue: phy0 queue:6, reason:0 ksoftirqd/2-21[002] dns574.711890: stop_queue: phy0 queue:7, reason:0 ksoftirqd/2-21[002] dns574.711892: stop_queue: phy0 queue:8, reason:0 ksoftirqd/2-21[002] dns574.711895: stop_queue: phy0 queue:9, reason:0 ksoftirqd/2-21[002] dns574.711898: stop_queue: phy0 queue:10, reason:0 ksoftirqd/2-21[002] dns574.711900: stop_queue: phy0 queue:11, reason:0 ksoftirqd/2-21[002] dns574.711903: stop_queue: phy0 queue:12, reason:0 ksoftirqd/2-21[002] dns574.711905: stop_queue: phy0 queue:13, reason:0 ksoftirqd/2-21[002] dns574.711908: stop_queue: phy0 queue:14, reason:0 ksoftirqd/2-21[002] dns574.711910: stop_queue: phy0 queue:15, reason:0 ksoftirqd/2-21[002] .ns474.711917: ath10k_htt_tx_dec_pending: num_pen: 63 ksoftirqd/2-21[002] dns574.711922: wake_queue: phy0 queue:0, reason:0 ksoftirqd/2-21[002] dns574.711929: wake_queue: phy0 queue:15, reason:0 ksoftirqd/2-21[002] .ns374.711948: ath10k_mac_op_wake_tx_queue: txq: frame_cnt: 1 byte_cnt: 1534 f_txq: frame_cnt: 1 byte_cnt: 1534 num_pen: 63 queue: 0 ksoftirqd/2-21[002] .ns474.711952: ath10k_htt_tx_inc_pending: num_pen: 64 ksoftirqd/2-21[002] dns574.711956: stop_queue: phy0 queue:0, reason:0 ksoftirqd/2-21[002] dns574.711959: stop_queue: phy0 queue:1, reason:0 ksoftirqd/2-21[002] dns574.711962: stop_queue: phy0 queue:2, reason:0 ksoftirqd/2-21[002] dns574.711964: stop_queue: phy0 queue:3, reason:0