Nori, Sekhar wrote:
From: Stephen Irons [mailto:[email protected]]
Sent: Thursday, April 16, 2009 10:37 AM
This is my preferred solution -- the appended patch is different from
the previous one.

1. I believe that emac_rx_bdproc() and emac_tx_bodproc() only set the
pending flag if there really is more work to do. These functions must
include the test (pkt_processeds == budget) to decide how the while()
loop ended. I think that this is correct as implemented, quite
independently of the test (num_rxpkts==budget) in emac_poll(), though
I
do not understand the implications of ((frame_status &
EMAC_CPPI_OWNERSHIP_BIT) == 0))

That translated to English means: Set pending to true if we have consumed our 
entire budget and there is still a frame in the queue and the DMA on that frame 
is complete.

IOW, the pending flag currently makes sure work is *really* pending.

2. in emac_poll(), we want to stop polling and re-enable interrupts if
there is more work to do. However, net_rx_action() imposes another
constraint: we must not play around with the NAPI list if
(work==budget). So in this special case, we cannot stop polling.

Yes. So work < budget should be the only condition we are actually interested 
in.

3. In the case where we have used up the entire budget and there is
now
no more work to do, we are forced to do another poll by the constraint
imposed by net_rx_action(). But next time around, we should use less
than the budget (unless there has been a sudden flurry of packets). We
can live with the extra poll.


Yes. And this makes me think we can relax the pending condition in 
emac_rx_bdproc().

4. In the case where we want to continue polling, emac_poll() does
nothing, the same as e100 device. net_rx_action() will keep calling
the
poll() routine until we explicitly call netif_rx_complete().

Yes. The __netif_rx_schedule() was superfluous.

5.  We never want to return the number of tx packets processed. At
present, the Tx budget is less than the Rx budget. But if this were to
change in the future, and we returned the number of tx packets, then
it
is possible that we could return a value bigger than the Rx budget.
This
would earn us a WARN_ON_ONCE warning from net_rx_action() and taint
the
kernel. Therefore, keep num_txpkts separate from num_rxpkts, and
she'll
be right.

Yes, Agreed.

6. I am currently running this code on a busy corporate network with
an
additional change (not shown in this patch) that sensitizes the system
to the error -- reduce EMAC_POLL_WEIGHT, EMAC_DEF_TX_MAX_SERVICE and
EMAC_DEF_RX_MAX_SERVICE to 4. When sensitized, and without the
attached
patch, the kernel panic occurs soon after then EMAC device is
initialised. When sensitized, and with the patch, the kernel has been
running for 24 hours and more.

Yes, surely you have identified a corner case.

Signed-off-by: Stephen Irons <[email protected]>
---
 drivers/net/davinci_emac.c |   18 ++++++++----------
 1 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/net/davinci_emac.c b/drivers/net/davinci_emac.c
index 89ff48b..a5354b0 100755
--- a/drivers/net/davinci_emac.c
+++ b/drivers/net/davinci_emac.c
@@ -2157,7 +2157,8 @@ static int emac_poll(struct napi_struct *napi,
int
budget)
     struct emac_priv *priv = container_of(napi, struct emac_priv,
napi);
     struct net_device *ndev = priv->ndev;
     u32 status = 0;
-    u32 num_pkts = 0;
+    u32 num_txpkts = 0;
+    u32 num_rxpkts = 0;
     u32 txpending = 0;
     u32 rxpending = 0;

@@ -2173,7 +2174,7 @@ static int emac_poll(struct napi_struct *napi,
int
budget)
         mask = EMAC_DM646X_MAC_IN_VECTOR_TX_INT_VEC;

     if (status & mask) {
-        num_pkts = emac_tx_bdproc(priv, EMAC_DEF_TX_CH,
+        num_txpkts = emac_tx_bdproc(priv, EMAC_DEF_TX_CH,
                       EMAC_DEF_TX_MAX_SERVICE,
                       &txpending);
     } /* TX processing */
@@ -2184,16 +2185,13 @@ static int emac_poll(struct napi_struct *napi,
int budget)
         mask = EMAC_DM646X_MAC_IN_VECTOR_RX_INT_VEC;

     if (status & mask) {
-        num_pkts = emac_rx_bdproc(priv, EMAC_DEF_RX_CH,
+        num_rxpkts = emac_rx_bdproc(priv, EMAC_DEF_RX_CH,
                       budget, &rxpending);
     } /* RX processing */

-    if (txpending || rxpending) {
-        if (likely(netif_rx_schedule_prep(&priv->napi))) {
-            emac_int_disable(priv);
-            __netif_rx_schedule(&priv->napi);
-        }
-    } else {
+    /* If there is nothing to transmit and nothing to receive, and we
have not consumed the entire budget,
+       then stop polling and re-enable interrupts */
+    if (!txpending && !rxpending && (num_rxpkts != budget)) {


I think just checking for (num_rxpkts < budget) should be sufficient instead of 
!rxpending && (num_rxpkts != budget). This is what is being done in other drivers 
using NAPI (like 8139too or e100).

Thanks,
Sekhar


I agree that (num_rxpkts < budget) is enough to make the system work. However, I feel that the complete condition explains exactly WHY we are doing things. In fact, I would prefer to write it:

if (!txpending && !rxpending) /* There is no more work to do, so we can stop 
polling... */
{
  if (num_rxpkts != budget) /* ...except in the special case that we have use 
the entire budget (see comment net_rx_action() in file net/core/dev.c). We will 
stop polling next time around unless, of course, we have suddenly been given 
more work to do. */
  {
     netif_rx_complete(napi);
     emac_int_enable(priv);
  }
}

It depends whether we want the code to be explicit or 'efficient'. If we use the minimum condition, we have code that reads

if (!txpending && (num_rxpkts < budget))
{
  netif_rx_complete(napi);
  emac_int_enable(priv);
}

and we have to explain why we are not using rxpending: the tx_bdproc() and rx_bdproc() function appear symmetrical, so I would expect to see identical code using the results from those functions. And we still have to explain about not playing around with the NAPI state in the special case.

My feeling is that the requirement to leave the NAPI state alone in one special case is artificial. It should be made explicit right where the problem is. That condition might go away in the future.

Sure, we all understand the issue now, but in 6 months time, someone else will come along and ask why we are not using rxpending. I don't want to drag this whole thing up again...

Stephen Irons


=======================================================================
This email, including any attachments, is only for the intended
addressee.  It is subject to copyright, is confidential and may be
the subject of legal or other privilege, none of which is waived or
lost by reason of this transmission.
If the receiver is not the intended addressee, please accept our
apologies, notify us by return, delete all copies and perform no
other act on the email.
Unfortunately, we cannot warrant that the email has not been
altered or corrupted during transmission.
=======================================================================


_______________________________________________
Davinci-linux-open-source mailing list
[email protected]
http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source

Reply via email to