Re: [E1000-devel] [PATCH] e1000e : Avoid wrong check on TX hang

Michael Wang Thu, 17 Nov 2011 18:08:32 -0800

Hi

This patch fixes a false TX hang detection in burst mode.


The false detection happens as follows:

In burst mode, descriptors are not written back immediately; the
hardware waits until enough of them have accumulated.
So when a hang is detected, the cause may simply be that the
descriptors have not been written back yet.

We therefore need to flush the descriptors and recheck, rather than
stopping the TX queue as soon as a hang is detected.
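
To make the flush-and-recheck idea concrete, here is a minimal standalone
sketch of the logic. It is only a model, not driver code: struct sim_adapter,
sim_handle_hang() and sim_clean_tx() are hypothetical names made up for
illustration, the flush is represented by a counter instead of a register
write, and the "looks hung" input stands in for the timeout check done in
the TX clean-up path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the adapter state; not the driver's struct. */
struct sim_adapter {
	bool dma_burst;        /* models FLAG2_DMA_BURST being set */
	bool tx_hang_recheck;  /* true after one flush has been issued */
	bool queue_stopped;    /* models the effect of netif_stop_queue() */
	int flushes;           /* counts forced descriptor write-backs */
};

/*
 * Models the hang handler: on the first suspected hang in burst mode,
 * flush and defer; on the second consecutive one, treat it as real.
 * Returns true only when the hang is confirmed.
 */
static bool sim_handle_hang(struct sim_adapter *a)
{
	assert(a != NULL);

	if (!a->tx_hang_recheck && a->dma_burst) {
		a->flushes++;              /* stands in for the TIDV flush */
		a->tx_hang_recheck = true; /* next detection is final */
		return false;
	}

	/* Either no burst mode or the recheck also failed: real hang. */
	a->tx_hang_recheck = false;
	a->queue_stopped = true;
	return true;
}

/*
 * Models the TX clean-up path. looks_hung is the outcome of the
 * timeout check; any pass without a suspected hang resets the flag.
 */
static bool sim_clean_tx(struct sim_adapter *a, bool looks_hung)
{
	assert(a != NULL);

	if (looks_hung)
		return sim_handle_hang(a);

	a->tx_hang_recheck = false;
	return false;
}
```

In this model, two consecutive sim_clean_tx(&a, true) calls in burst mode
give one flush followed by a confirmed hang, while a call with
looks_hung == false in between resets the recheck flag, so the next
suspected hang is flushed again rather than reported.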

Flavio:

Please sign off on the patch and provide the test information.

Thanks & Best regards,
Michael Wang

On 11/18/2011 09:37 AM, Michael Wang wrote:

> From: Michael Wang <[email protected]>
> 
> Descriptors may not have been written back yet when the TX hang check
> runs with FLAG2_DMA_BURST set.
> So when we detect a hang, we flush the descriptors and run the
> detection once more.
> 
> Signed-off-by: Michael Wang <[email protected]>
> ---
>  drivers/net/e1000e/e1000.h  |    1 +
>  drivers/net/e1000e/netdev.c |   18 +++++++++++++++++-
>  2 files changed, 18 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
> index 8533ad7..be2f25a 100644
> --- a/drivers/net/e1000e/e1000.h
> +++ b/drivers/net/e1000e/e1000.h
> @@ -309,6 +309,7 @@ struct e1000_adapter {
>       u32 txd_cmd;
> 
>       bool detect_tx_hung;
> +     bool tx_hang_recheck;
>       u8 tx_timeout_factor;
> 
>       u32 tx_int_delay;
> diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
> index 2198e61..eab82ed 100644
> --- a/drivers/net/e1000e/netdev.c
> +++ b/drivers/net/e1000e/netdev.c
> @@ -1000,6 +1000,7 @@ static void e1000_print_hw_hang(struct work_struct *work)
>       struct e1000_adapter *adapter = container_of(work,
>                                                    struct e1000_adapter,
>                                                    print_hang_task);
> +     struct net_device *netdev = adapter->netdev;
>       struct e1000_ring *tx_ring = adapter->tx_ring;
>       unsigned int i = tx_ring->next_to_clean;
>       unsigned int eop = tx_ring->buffer_info[i].next_to_watch;
> @@ -1011,6 +1012,19 @@ static void e1000_print_hw_hang(struct work_struct *work)
>       if (test_bit(__E1000_DOWN, &adapter->state))
>               return;
> 
> +     if ((!adapter->tx_hang_recheck) && (adapter->flags2 & FLAG2_DMA_BURST)) {
> +             /* May be block on write-back, flush and detect again*/
> +             /* flush pending descriptor writebacks to memory */
> +             ew32(TIDV, adapter->tx_int_delay | E1000_TIDV_FPD);
> +             /* execute the writes immediately */
> +             e1e_flush();
> +             adapter->tx_hang_recheck = 1;
> +             return;
> +     }
> +     /* Real hang detected */
> +     adapter->tx_hang_recheck = 0;
> +     netif_stop_queue(netdev);
> +
>       e1e_rphy(hw, PHY_STATUS, &phy_status);
>       e1e_rphy(hw, PHY_1000T_STATUS, &phy_1000t_status);
>       e1e_rphy(hw, PHY_EXT_STATUS, &phy_ext_status);
> @@ -1126,7 +1140,8 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter)
>                              + (adapter->tx_timeout_factor * HZ)) &&
>                   !(er32(STATUS) & E1000_STATUS_TXOFF)) {
>                       schedule_work(&adapter->print_hang_task);
> -                     netif_stop_queue(netdev);
> +             } else {
> +                     adapter->tx_hang_recheck = 0;
>               }
>       }
>       adapter->total_tx_bytes += total_tx_bytes;
> @@ -3753,6 +3768,7 @@ static int e1000_open(struct net_device *netdev)
> 
>       e1000_irq_enable(adapter);
> 
> +     adapter->tx_hang_recheck = 0;
>       netif_start_queue(netdev);
> 
>       adapter->idle_check = true;


