Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
Wen Gong writes: > On 2020-04-16 20:27, Kalle Valo wrote: >> Kalle Valo writes: >> > >>> How much does it drop? Please add the justification (with numbers) for >>> the new thread to the commit log, so that the reason is properly >>> documented. >> >> I see that you already submitted v10. If you can give the numbers I can >> add them to the commit log. > > I tested for VHT80 mode for 3 thread config: > result: > TCP-RXTCP-TXUDP-RX > UDP-TX > use workqueue_tx_complete(Mbps)423 357 448 412 > change it to ar->workqueue(Mbps) 410 360 449 414 > change it to ar->workqueue_aux(Mbps) 405 339 446 401 Thanks, I added the table to the commit log. -- https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
On 2020-04-16 20:27, Kalle Valo wrote: Kalle Valo writes: How much does it drop? Please add the justification (with numbers) for the new thread to the commit log, so that the reason is properly documented. I see that you already submitted v10. If you can give the numbers I can add them to the commit log. I tested for VHT80 mode for 3 thread config: result: TCP-RXTCP-TXUDP-RX UDP-TX use workqueue_tx_complete(Mbps)423 357 448 412 change it to ar->workqueue(Mbps) 410 360 449 414 change it to ar->workqueue_aux(Mbps) 405 339 446 401 each thread role: tx_bundle_skbs(ar->workqueue), rx_indication(ar->workqueue_aux), sdio_async_tx_request(ar_sdio->workqueue), tx_bundle_complete(ar->workqueue_tx_complete) ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
Kalle Valo writes: > Wen Gong writes: > >> On 2020-04-09 22:53, Kalle Valo wrote: >>> Wen Gong writes: >>> + ar->workqueue_tx_complete = + create_singlethread_workqueue("ath10k_tx_complete_wq"); + if (!ar->workqueue_tx_complete) + goto err_free_aux_wq; >>> >>> We already have three threads: >>> >>> ath/ath10k/core.c: ar->workqueue = >>> create_singlethread_workqueue("ath10k_wq"); >>> ath/ath10k/core.c: ar->workqueue_aux = >>> create_singlethread_workqueue("ath10k_aux_wq"); >>> ath/ath10k/sdio.c: ar_sdio->workqueue = >>> create_singlethread_workqueue("ath10k_sdio_wq"); >>> >>> Do we really need a fourth one? For example, why can't we use >>> ar->workqueue_aux? >> >> For tcp test, it has 4 thread work meanwhile: >> tx_bundle_skbs(ar->workqueue), >> rx_indication(ar->workqueue_aux), >> sdio_async_tx_request(ar_sdio->workqueue), >> tx_bundle_complete(ar->workqueue_tx_complete) >> >> It has 4+ cpu/core in system, if reduced to 3 threads, then tcp >> throughput will drop. only when it only has 1/2/3 cpu/core in system, >> then reduced to 3 threads will not drop. > > How much does it drop? Please add the justification (with numbers) for > the new thread to the commit log, so that the reason is properly > documented. I see that you already submitted v10. If you can give the numbers I can add them to the commit log. -- https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
Wen Gong writes: > On 2020-04-09 22:53, Kalle Valo wrote: >> Wen Gong writes: >> >>> + ar->workqueue_tx_complete = >>> + create_singlethread_workqueue("ath10k_tx_complete_wq"); >>> + if (!ar->workqueue_tx_complete) >>> + goto err_free_aux_wq; >> >> We already have three threads: >> >> ath/ath10k/core.c: ar->workqueue = >> create_singlethread_workqueue("ath10k_wq"); >> ath/ath10k/core.c: ar->workqueue_aux = >> create_singlethread_workqueue("ath10k_aux_wq"); >> ath/ath10k/sdio.c: ar_sdio->workqueue = >> create_singlethread_workqueue("ath10k_sdio_wq"); >> >> Do we really need a fourth one? For example, why can't we use >> ar->workqueue_aux? > > For tcp test, it has 4 thread work meanwhile: > tx_bundle_skbs(ar->workqueue), > rx_indication(ar->workqueue_aux), > sdio_async_tx_request(ar_sdio->workqueue), > tx_bundle_complete(ar->workqueue_tx_complete) > > It has 4+ cpu/core in system, if reduced to 3 threads, then tcp > throughput will drop. only when it only has 1/2/3 cpu/core in system, > then reduced to 3 threads will not drop. How much does it drop? Please add the justification (with numbers) for the new thread to the commit log, so that the reason is properly documented. -- https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
On 2020-04-09 23:42, Kalle Valo wrote: Wen Gong writes: On 2020-04-09 23:05, Kalle Valo wrote: + ep->tx_credits -= credits; + ath10k_dbg(ar, ATH10K_DBG_HTC, + "htc ep %d consumed %d credits (total %d)\n", "htc ep %d consumed %d credits total %d\n" [...] + ath10k_dbg(ar, ATH10K_DBG_HTC, "bundle skb: len:%d\n", bundle_skb->len); "htc bundle skb len %d\n" In other words, start with "htc" and don't use colons or parenthesis. This applies to most of debug messages in this patch. I will change the log and other log and sent v10. but "ath10k: disable TX complete indication of htt for sdio" and "ath10k: change ATH10K_SDIO_BUS_REQUEST_MAX_NUM from 64 to 1024" has appied to ath-next, so I only need to send the left 2 patches: "ath10k: add htt TX bundle for sdio" and "ath10k: enable alt data of TX path for sdio", right? Correct, I already applied patches 1 and 2. But before you resend patches 3-4 did you see my question about creating a new thread, is it really necessary? already replied. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
Wen Gong writes: > On 2020-04-09 23:05, Kalle Valo wrote: > >>> + ep->tx_credits -= credits; >>> + ath10k_dbg(ar, ATH10K_DBG_HTC, >>> + "htc ep %d consumed %d credits (total %d)\n", >> >> "htc ep %d consumed %d credits total %d\n" >> >> [...] >> >>> + ath10k_dbg(ar, ATH10K_DBG_HTC, "bundle skb: len:%d\n", >>> bundle_skb->len); >> >> "htc bundle skb len %d\n" >> >> In other words, start with "htc" and don't use colons or parenthesis. >> This applies to most of debug messages in this patch. > > I will change the log and other log and sent v10. > but "ath10k: disable TX complete indication of htt for sdio" and > "ath10k: change ATH10K_SDIO_BUS_REQUEST_MAX_NUM from 64 to 1024" has > appied to ath-next, > so I only need to send the left 2 patches: > "ath10k: add htt TX bundle for sdio" and "ath10k: enable alt data of > TX path for sdio", right? Correct, I already applied patches 1 and 2. But before you resend patches 3-4 did you see my question about creating a new thread, is it really necessary? -- https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
On 2020-04-09 22:53, Kalle Valo wrote: Wen Gong writes: + ar->workqueue_tx_complete = + create_singlethread_workqueue("ath10k_tx_complete_wq"); + if (!ar->workqueue_tx_complete) + goto err_free_aux_wq; We already have three threads: ath/ath10k/core.c: ar->workqueue = create_singlethread_workqueue("ath10k_wq"); ath/ath10k/core.c: ar->workqueue_aux = create_singlethread_workqueue("ath10k_aux_wq"); ath/ath10k/sdio.c: ar_sdio->workqueue = create_singlethread_workqueue("ath10k_sdio_wq"); Do we really need a fourth one? For example, why can't we use ar->workqueue_aux? For tcp test, it has 4 thread work meanwhile: tx_bundle_skbs(ar->workqueue), rx_indication(ar->workqueue_aux), sdio_async_tx_request(ar_sdio->workqueue), tx_bundle_complete(ar->workqueue_tx_complete) It has 4+ cpu/core in system, if reduced to 3 threads, then tcp throughput will drop. only when it only has 1/2/3 cpu/core in system, then reduced to 3 threads will not drop. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
On 2020-04-09 23:05, Kalle Valo wrote: + ep->tx_credits -= credits; + ath10k_dbg(ar, ATH10K_DBG_HTC, + "htc ep %d consumed %d credits (total %d)\n", "htc ep %d consumed %d credits total %d\n" [...] + ath10k_dbg(ar, ATH10K_DBG_HTC, "bundle skb: len:%d\n", bundle_skb->len); "htc bundle skb len %d\n" In other words, start with "htc" and don't use colons or parenthesis. This applies to most of debug messages in this patch. I will change the log and other log and sent v10. but "ath10k: disable TX complete indication of htt for sdio" and "ath10k: change ATH10K_SDIO_BUS_REQUEST_MAX_NUM from 64 to 1024" has appied to ath-next, so I only need to send the left 2 patches: "ath10k: add htt TX bundle for sdio" and "ath10k: enable alt data of TX path for sdio", right? ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
Wen Gong writes: > The transmission utilization ratio for sdio bus for small packet is > slow, because the space and time cost for sdio bus is same for large > length packet and small length packet. So the speed of data for large > length packet is higher than small length. > > Test result of different length of data: > > data packet(byte) cost time(us) calculated rate(Mbps) > 256 2873 > 512 33 124 > 1024 35 234 > 1792 45 318 > 14336 168 682 > 28672 333 688 > 57344 660 695 > > This patch change the TX packet from single packet to a large length > bundle packet, max size is 32, it results in significant performance > improvement on TX path. > > This patch only effect sdio chip, it will not effect PCI, SNOC etc. > It only enable bundle for sdio chip. > > Tested with QCA6174 SDIO with firmware > WLAN.RMH.4.4.1-00017-QCARMSWP-1. > > Signed-off-by: Wen Gong [...] > + if (ep->tx_credits < credits) { > + ath10k_dbg(ar, ATH10K_DBG_HTC, > +"htc insufficient credits ep %d required %d > available %d consume %d\n", > +eid, credits, ep->tx_credits, consume); > + ret = -EAGAIN; > + goto unlock; > + } > + > + if (consume) { > + ep->tx_credits -= credits; > + ath10k_dbg(ar, ATH10K_DBG_HTC, > +"htc ep %d consumed %d credits (total %d)\n", "htc ep %d consumed %d credits total %d\n" [...] > + ath10k_dbg(ar, ATH10K_DBG_HTC, "bundle skb: len:%d\n", bundle_skb->len); "htc bundle skb len %d\n" In other words, start with "htc" and don't use colons or parenthesis. This applies to most of debug messages in this patch. -- https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v9 3/4] ath10k: add htt TX bundle for sdio
Wen Gong writes: > The transmission utilization ratio for sdio bus for small packet is > slow, because the space and time cost for sdio bus is same for large > length packet and small length packet. So the speed of data for large > length packet is higher than small length. > > Test result of different length of data: > > data packet(byte) cost time(us) calculated rate(Mbps) > 256 2873 > 512 33 124 > 1024 35 234 > 1792 45 318 > 14336 168 682 > 28672 333 688 > 57344 660 695 > > This patch change the TX packet from single packet to a large length > bundle packet, max size is 32, it results in significant performance > improvement on TX path. > > This patch only effect sdio chip, it will not effect PCI, SNOC etc. > It only enable bundle for sdio chip. > > Tested with QCA6174 SDIO with firmware > WLAN.RMH.4.4.1-00017-QCARMSWP-1. > > Signed-off-by: Wen Gong [...] > --- a/drivers/net/wireless/ath/ath10k/core.c > +++ b/drivers/net/wireless/ath/ath10k/core.c > @@ -3208,6 +3208,11 @@ struct ath10k *ath10k_core_create(size_t priv_size, > struct device *dev, > if (!ar->workqueue_aux) > goto err_free_wq; > > + ar->workqueue_tx_complete = > + create_singlethread_workqueue("ath10k_tx_complete_wq"); > + if (!ar->workqueue_tx_complete) > + goto err_free_aux_wq; We already have three threads: ath/ath10k/core.c: ar->workqueue = create_singlethread_workqueue("ath10k_wq"); ath/ath10k/core.c: ar->workqueue_aux = create_singlethread_workqueue("ath10k_aux_wq"); ath/ath10k/sdio.c: ar_sdio->workqueue = create_singlethread_workqueue("ath10k_sdio_wq"); Do we really need a fourth one? For example, why can't we use ar->workqueue_aux? -- https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k