Re: [PATCH net-next 7/8] net: ethernet: ti: am65-cpsw: prepare xmit/rx path for multi-port devices in mac-only mode

2020-10-05 Thread Grygorii Strashko




On 03/10/2020 05:09, David Miller wrote:

From: Grygorii Strashko 
Date: Thu, 1 Oct 2020 13:52:57 +0300


This patch adds multi-port support to TI AM65x CPSW driver xmit/rx path in
preparation for adding support for multi-port devices, like Main CPSW0 on
K3 J721E SoC or future CPSW3g on K3 AM64x SoC.
Since the DMA channels are common/shared for all ext Ports, and the RX/TX NAPI
and DMA processing are going to be assigned to the first netdev, this patch:
  - ensures all RX descriptor fields are initialized;
  - adds synchronization (locking) for the TX DMA push/pop operations, as
locking by the networking core is not enough any more;
  - updates TX BQL processing to be done per packet in
am65_cpsw_nuss_tx_compl_packets(), as every completed TX skb can have a
different ndev assigned (i.e. come from a different netdev).

Signed-off-by: Grygorii Strashko 


This locking is unnecessary in single-port non-shared DMA situations
and therefore will impose unnecessary performance loss for basically
all existing supported setups.

Please do this another way.


OK. I'll try to add lock-less push/pop operations and use them for the single-port case.



In fact, I would encourage you to find a way to avoid the new atomic
operations even in multi-port configurations.


I'm not sure how :( The DMA channels are shared, while the net_device TX queues
are separate.
One thought: since there are 8 TX DMA channels, it should be possible to use a
qdisc like mqprio to segregate traffic between ports and TX DMA channels, in
which case no blocking on the TX DMA locks should happen in .xmit().
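[Editorial note: for illustration, a hypothetical mqprio setup along those lines might look like the config sketch below. The device name eth0, the number of traffic classes, and the priority-to-tc map are assumptions, not taken from the thread; the point is only that each traffic class is pinned to its own TX queue, and hence its own DMA channel.]

```sh
# Hypothetical sketch: map priorities to 2 traffic classes, and give each
# class exactly one hardware TX queue (1@0 = one queue starting at offset 0),
# so traffic for different ports never contends on the same tx_chn in .xmit().
tc qdisc replace dev eth0 root mqprio num_tc 2 \
        map 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 \
        queues 1@0 1@1 hw 0
```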

Thank you.
--
Best regards,
grygorii


Re: [PATCH net-next 7/8] net: ethernet: ti: am65-cpsw: prepare xmit/rx path for multi-port devices in mac-only mode

2020-10-02 Thread David Miller
From: Grygorii Strashko 
Date: Thu, 1 Oct 2020 13:52:57 +0300

> This patch adds multi-port support to TI AM65x CPSW driver xmit/rx path in
> preparation for adding support for multi-port devices, like Main CPSW0 on
> K3 J721E SoC or future CPSW3g on K3 AM64x SoC.
> Since the DMA channels are common/shared for all ext Ports, and the RX/TX NAPI
> and DMA processing are going to be assigned to the first netdev, this patch:
>  - ensures all RX descriptor fields are initialized;
>  - adds synchronization (locking) for the TX DMA push/pop operations, as
> locking by the networking core is not enough any more;
>  - updates TX BQL processing to be done per packet in
> am65_cpsw_nuss_tx_compl_packets(), as every completed TX skb can have a
> different ndev assigned (i.e. come from a different netdev).
> 
> Signed-off-by: Grygorii Strashko 

This locking is unnecessary in single-port non-shared DMA situations
and therefore will impose unnecessary performance loss for basically
all existing supported setups.

Please do this another way.

In fact, I would encourage you to find a way to avoid the new atomic
operations even in multi-port configurations.

Thank you.


[PATCH net-next 7/8] net: ethernet: ti: am65-cpsw: prepare xmit/rx path for multi-port devices in mac-only mode

2020-10-01 Thread Grygorii Strashko
This patch adds multi-port support to TI AM65x CPSW driver xmit/rx path in
preparation for adding support for multi-port devices, like Main CPSW0 on
K3 J721E SoC or future CPSW3g on K3 AM64x SoC.
Since the DMA channels are common/shared for all ext Ports, and the RX/TX NAPI
and DMA processing are going to be assigned to the first netdev, this patch:
  - ensures all RX descriptor fields are initialized;
  - adds synchronization (locking) for the TX DMA push/pop operations, as
locking by the networking core is not enough any more;
  - updates TX BQL processing to be done per packet in
am65_cpsw_nuss_tx_compl_packets(), as every completed TX skb can have a
different ndev assigned (i.e. come from a different netdev).

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/am65-cpsw-nuss.c | 41 +---
 drivers/net/ethernet/ti/am65-cpsw-nuss.h |  1 +
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
index 0bc0eec46709..a8094e8e49ca 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
@@ -375,7 +375,7 @@ static int am65_cpsw_nuss_rx_push(struct am65_cpsw_common *common,
 
 	cppi5_hdesc_init(desc_rx, CPPI5_INFO0_HDESC_EPIB_PRESENT,
 			 AM65_CPSW_NAV_PS_DATA_SIZE);
-	cppi5_hdesc_attach_buf(desc_rx, 0, 0, buf_dma, skb_tailroom(skb));
+	cppi5_hdesc_attach_buf(desc_rx, buf_dma, skb_tailroom(skb), buf_dma, skb_tailroom(skb));
 	swdata = cppi5_hdesc_get_swdata(desc_rx);
 	*((void **)swdata) = skb;
 
@@ -933,7 +933,9 @@ static int am65_cpsw_nuss_tx_compl_packets(struct am65_cpsw_common *common,
 		struct am65_cpsw_ndev_priv *ndev_priv;
 		struct am65_cpsw_ndev_stats *stats;
 
+		spin_lock(&tx_chn->lock);
 		res = k3_udma_glue_pop_tx_chn(tx_chn->tx_chn, &desc_dma);
+		spin_unlock(&tx_chn->lock);
 		if (res == -ENODATA)
			break;
 
@@ -960,31 +962,29 @@ static int am65_cpsw_nuss_tx_compl_packets(struct am65_cpsw_common *common,
 		stats->tx_bytes += skb->len;
 		u64_stats_update_end(&stats->syncp);
 
-		total_bytes += skb->len;
+		total_bytes = skb->len;
 		napi_consume_skb(skb, budget);
 		num_tx++;
-	}
-
-	if (!num_tx)
-		return 0;
 
-	netif_txq = netdev_get_tx_queue(ndev, chn);
+		netif_txq = netdev_get_tx_queue(ndev, chn);
 
-	netdev_tx_completed_queue(netif_txq, num_tx, total_bytes);
+		netdev_tx_completed_queue(netif_txq, num_tx, total_bytes);
 
-	if (netif_tx_queue_stopped(netif_txq)) {
-		/* Check whether the queue is stopped due to stalled tx dma,
-		 * if the queue is stopped then wake the queue as
-		 * we have free desc for tx
-		 */
-		__netif_tx_lock(netif_txq, smp_processor_id());
-		if (netif_running(ndev) &&
-		    (k3_cppi_desc_pool_avail(tx_chn->desc_pool) >=
-		     MAX_SKB_FRAGS))
-			netif_tx_wake_queue(netif_txq);
+		if (netif_tx_queue_stopped(netif_txq)) {
+			/* Check whether the queue is stopped due to stalled
+			 * tx dma, if the queue is stopped then wake the queue
+			 * as we have free desc for tx
+			 */
+			__netif_tx_lock(netif_txq, smp_processor_id());
+			if (netif_running(ndev) &&
+			    (k3_cppi_desc_pool_avail(tx_chn->desc_pool) >=
+			     MAX_SKB_FRAGS))
+				netif_tx_wake_queue(netif_txq);
 
-		__netif_tx_unlock(netif_txq);
+			__netif_tx_unlock(netif_txq);
+		}
 	}
+
 	dev_dbg(dev, "%s:%u pkt:%d\n", __func__, chn, num_tx);
 
 	return num_tx;
@@ -1141,7 +1141,9 @@ static netdev_tx_t am65_cpsw_nuss_ndo_slave_xmit(struct sk_buff *skb,
 
 	cppi5_hdesc_set_pktlen(first_desc, pkt_len);
 	desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool, first_desc);
+	spin_lock_bh(&tx_chn->lock);
 	ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
+	spin_unlock_bh(&tx_chn->lock);
 	if (ret) {
 		dev_err(dev, "can't push desc %d\n", ret);
 		/* inform bql */
@@ -1498,6 +1500,7 @@ static int am65_cpsw_nuss_init_tx_chns(struct am65_cpsw_common *common)
 		snprintf(tx_chn->tx_chn_name,
 			 sizeof(tx_chn->tx_chn_name), "tx%d", i);
 
+		spin_lock_init(&tx_chn->lock);
 		tx_chn->common = common;
 		tx_chn->id = i;
 		tx_chn->descs_num = max_desc_num;
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.h b/drivers/net/ethernet/ti/am65-cpsw-nuss.h
index