Re: [PATCH] via-rhine: Fix tx_timeout handling
On Sun, 2013-07-21 at 18:32 +0200, Richard Weinberger wrote: > Am 21.07.2013 18:18, schrieb Ben Hutchings: > > On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote: > >> rhine_reset_task() misses to call netif_stop_queue(), > >> this can lead to a crash if work is still scheduled while > >> we're resetting the tx queue. > >> > >> Fixes: > >> [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at > >> 004c > >> [ 93.595514] IP: [] rhine_napipoll+0x491/0x6e > >> > >> Signed-off-by: Richard Weinberger > >> --- > >> drivers/net/ethernet/via/via-rhine.c | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/drivers/net/ethernet/via/via-rhine.c > >> b/drivers/net/ethernet/via/via-rhine.c > >> index b75eb9e..57e1b40 100644 > >> --- a/drivers/net/ethernet/via/via-rhine.c > >> +++ b/drivers/net/ethernet/via/via-rhine.c > >> @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct > >> *work) > >>goto out_unlock; > >> > >>napi_disable(>napi); > >> + netif_stop_queue(dev); > > > > This is not really fixing the bug because there is no synchronisation > > with the TX scheduler. You can call netif_tx_disable() instead to do > > that. > > I guess other drivers suffer from that too. > A quick grep showed that not many drivers are using netif_tx_disable(). > > > (I also think that it is preferable to use > > netif_device_{detach,attach}() to stop the queue during reconfiguration, > > as this is independent of TX completions and the watchdog.) Actually, this is not independent of TX completions - netif_wake_queue() will still start the TX scheduler while the device is not present, so you have to avoid that. > So the correct down sequence is napi_disable() -> netif_tx_disable() -> > netif_device_detach()? No, that's redundant. You can do: napi_disable(); netif_tx_lock_bh(); /* sync with TX scheduler */ netif_device_detach(); netif_tx_unlock_bh(); and then when the queue is ready to use again: netif_device_attach(); napi_enable(); Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] via-rhine: Fix tx_timeout handling
Am 21.07.2013 18:18, schrieb Ben Hutchings: > On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote: >> rhine_reset_task() misses to call netif_stop_queue(), >> this can lead to a crash if work is still scheduled while >> we're resetting the tx queue. >> >> Fixes: >> [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at >> 004c >> [ 93.595514] IP: [] rhine_napipoll+0x491/0x6e >> >> Signed-off-by: Richard Weinberger >> --- >> drivers/net/ethernet/via/via-rhine.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/net/ethernet/via/via-rhine.c >> b/drivers/net/ethernet/via/via-rhine.c >> index b75eb9e..57e1b40 100644 >> --- a/drivers/net/ethernet/via/via-rhine.c >> +++ b/drivers/net/ethernet/via/via-rhine.c >> @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work) >> goto out_unlock; >> >> napi_disable(>napi); >> +netif_stop_queue(dev); > > This is not really fixing the bug because there is no synchronisation > with the TX scheduler. You can call netif_tx_disable() instead to do > that. I guess other drivers suffer from that too. A quick grep showed that not many drivers are using netif_tx_disable(). > (I also think that it is preferable to use > netif_device_{detach,attach}() to stop the queue during reconfiguration, > as this is independent of TX completions and the watchdog.) So the correct down sequence is napi_disable() -> netif_tx_disable() -> netif_device_detach()? Thanks, //richard -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] via-rhine: Fix tx_timeout handling
On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote: > rhine_reset_task() misses to call netif_stop_queue(), > this can lead to a crash if work is still scheduled while > we're resetting the tx queue. > > Fixes: > [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at > 004c > [ 93.595514] IP: [] rhine_napipoll+0x491/0x6e > > Signed-off-by: Richard Weinberger > --- > drivers/net/ethernet/via/via-rhine.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/net/ethernet/via/via-rhine.c > b/drivers/net/ethernet/via/via-rhine.c > index b75eb9e..57e1b40 100644 > --- a/drivers/net/ethernet/via/via-rhine.c > +++ b/drivers/net/ethernet/via/via-rhine.c > @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work) > goto out_unlock; > > napi_disable(>napi); > + netif_stop_queue(dev); This is not really fixing the bug because there is no synchronisation with the TX scheduler. You can call netif_tx_disable() instead to do that. (I also think that it is preferable to use netif_device_{detach,attach}() to stop the queue during reconfiguration, as this is independent of TX completions and the watchdog.) Ben. > spin_lock_bh(>lock); > > /* clear all descriptors */ -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] via-rhine: Fix tx_timeout handling
On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote: rhine_reset_task() misses to call netif_stop_queue(), this can lead to a crash if work is still scheduled while we're resetting the tx queue. Fixes: [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at 004c [ 93.595514] IP: [c119d10d] rhine_napipoll+0x491/0x6e Signed-off-by: Richard Weinberger rich...@nod.at --- drivers/net/ethernet/via/via-rhine.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c index b75eb9e..57e1b40 100644 --- a/drivers/net/ethernet/via/via-rhine.c +++ b/drivers/net/ethernet/via/via-rhine.c @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work) goto out_unlock; napi_disable(rp-napi); + netif_stop_queue(dev); This is not really fixing the bug because there is no synchronisation with the TX scheduler. You can call netif_tx_disable() instead to do that. (I also think that it is preferable to use netif_device_{detach,attach}() to stop the queue during reconfiguration, as this is independent of TX completions and the watchdog.) Ben. spin_lock_bh(rp-lock); /* clear all descriptors */ -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] via-rhine: Fix tx_timeout handling
Am 21.07.2013 18:18, schrieb Ben Hutchings: On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote: rhine_reset_task() misses to call netif_stop_queue(), this can lead to a crash if work is still scheduled while we're resetting the tx queue. Fixes: [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at 004c [ 93.595514] IP: [c119d10d] rhine_napipoll+0x491/0x6e Signed-off-by: Richard Weinberger rich...@nod.at --- drivers/net/ethernet/via/via-rhine.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c index b75eb9e..57e1b40 100644 --- a/drivers/net/ethernet/via/via-rhine.c +++ b/drivers/net/ethernet/via/via-rhine.c @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work) goto out_unlock; napi_disable(rp-napi); +netif_stop_queue(dev); This is not really fixing the bug because there is no synchronisation with the TX scheduler. You can call netif_tx_disable() instead to do that. I guess other drivers suffer from that too. A quick grep showed that not many drivers are using netif_tx_disable(). (I also think that it is preferable to use netif_device_{detach,attach}() to stop the queue during reconfiguration, as this is independent of TX completions and the watchdog.) So the correct down sequence is napi_disable() - netif_tx_disable() - netif_device_detach()? Thanks, //richard -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] via-rhine: Fix tx_timeout handling
On Sun, 2013-07-21 at 18:32 +0200, Richard Weinberger wrote: Am 21.07.2013 18:18, schrieb Ben Hutchings: On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote: rhine_reset_task() misses to call netif_stop_queue(), this can lead to a crash if work is still scheduled while we're resetting the tx queue. Fixes: [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at 004c [ 93.595514] IP: [c119d10d] rhine_napipoll+0x491/0x6e Signed-off-by: Richard Weinberger rich...@nod.at --- drivers/net/ethernet/via/via-rhine.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c index b75eb9e..57e1b40 100644 --- a/drivers/net/ethernet/via/via-rhine.c +++ b/drivers/net/ethernet/via/via-rhine.c @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work) goto out_unlock; napi_disable(rp-napi); + netif_stop_queue(dev); This is not really fixing the bug because there is no synchronisation with the TX scheduler. You can call netif_tx_disable() instead to do that. I guess other drivers suffer from that too. A quick grep showed that not many drivers are using netif_tx_disable(). (I also think that it is preferable to use netif_device_{detach,attach}() to stop the queue during reconfiguration, as this is independent of TX completions and the watchdog.) Actually, this is not independent of TX completions - netif_wake_queue() will still start the TX scheduler while the device is not present, so you have to avoid that. So the correct down sequence is napi_disable() - netif_tx_disable() - netif_device_detach()? No, that's redundant. You can do: napi_disable(); netif_tx_lock_bh(); /* sync with TX scheduler */ netif_device_detach(); netif_tx_unlock_bh(); and then when the queue is ready to use again: netif_device_attach(); napi_enable(); Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] via-rhine: Fix tx_timeout handling
rhine_reset_task() misses to call netif_stop_queue(), this can lead to a crash if work is still scheduled while we're resetting the tx queue. Fixes: [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at 004c [ 93.595514] IP: [] rhine_napipoll+0x491/0x6e Signed-off-by: Richard Weinberger --- drivers/net/ethernet/via/via-rhine.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c index b75eb9e..57e1b40 100644 --- a/drivers/net/ethernet/via/via-rhine.c +++ b/drivers/net/ethernet/via/via-rhine.c @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work) goto out_unlock; napi_disable(>napi); + netif_stop_queue(dev); spin_lock_bh(>lock); /* clear all descriptors */ -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] via-rhine: Fix tx_timeout handling
rhine_reset_task() misses to call netif_stop_queue(), this can lead to a crash if work is still scheduled while we're resetting the tx queue. Fixes: [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at 004c [ 93.595514] IP: [c119d10d] rhine_napipoll+0x491/0x6e Signed-off-by: Richard Weinberger rich...@nod.at --- drivers/net/ethernet/via/via-rhine.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c index b75eb9e..57e1b40 100644 --- a/drivers/net/ethernet/via/via-rhine.c +++ b/drivers/net/ethernet/via/via-rhine.c @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work) goto out_unlock; napi_disable(rp-napi); + netif_stop_queue(dev); spin_lock_bh(rp-lock); /* clear all descriptors */ -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/