Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-20 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> On Thursday 19 July 2007 21:56, Ingo Molnar wrote:
> > nope - with this patch applied the box still has no network, symptoms 
> > are similar. (should i apply the WARN_ON() patch too?)
> 
> Yes, that would be nice. If that doesn't help, you can also throw in
> the one below.

> - np->dev = ndev;
>   if (!ndev->npinfo) {
>   npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
>   if (!npinfo) {
> @@ -722,6 +721,8 @@ int netpoll_setup(struct netpoll *np)
>   }
>   }
>  
> + np->dev = ndev;

no change in behavior from this patch. If i add the WARN_ON() it 
triggers once and it makes the network work as well (probably due to the 
side-effect of the packets generated by the WARN_ON() printks).

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-20 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 On Thursday 19 July 2007 21:56, Ingo Molnar wrote:
  nope - with this patch applied the box still has no network, symptoms 
  are similar. (should i apply the WARN_ON() patch too?)
 
 Yes, that would be nice. If that doesn't help, you can also throw in
 the one below.

 - np-dev = ndev;
   if (!ndev-npinfo) {
   npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
   if (!npinfo) {
 @@ -722,6 +721,8 @@ int netpoll_setup(struct netpoll *np)
   }
   }
  
 + np-dev = ndev;

no change in behavior from this patch. If i add the WARN_ON() it 
triggers once and it makes the network work as well (probably due to the 
side-effect of the packets generated by the WARN_ON() printks).

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 21:56, Ingo Molnar wrote:
> nope - with this patch applied the box still has no network, symptoms 
> are similar. (should i apply the WARN_ON() patch too?)

Yes, that would be nice. If that doesn't help, you can also throw in
the one below.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
Index: build-2.6/net/core/netpoll.c
===
--- build-2.6.orig/net/core/netpoll.c
+++ build-2.6/net/core/netpoll.c
@@ -650,7 +650,6 @@ int netpoll_setup(struct netpoll *np)
return -ENODEV;
}
 
-   np->dev = ndev;
if (!ndev->npinfo) {
npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
if (!npinfo) {
@@ -722,6 +721,8 @@ int netpoll_setup(struct netpoll *np)
}
}
 
+   np->dev = ndev;
+
if (is_zero_ether_addr(np->local_mac) && ndev->dev_addr)
memcpy(np->local_mac, ndev->dev_addr, 6);
 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> Does the following help?

> --- build-2.6.orig/drivers/net/netconsole.c
> +++ build-2.6/drivers/net/netconsole.c
> @@ -70,7 +70,7 @@ static void write_msg(struct console *co
>   int frag, left;
>   unsigned long flags;
>  
> - if (!np.dev)
> + if (!np.dev || !netif_running(np.dev))
>   return;

nope - with this patch applied the box still has no network, symptoms 
are similar. (should i apply the WARN_ON() patch too?)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch

Does the following help?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
Test patch
---
Index: build-2.6/drivers/net/netconsole.c
===
--- build-2.6.orig/drivers/net/netconsole.c
+++ build-2.6/drivers/net/netconsole.c
@@ -70,7 +70,7 @@ static void write_msg(struct console *co
int frag, left;
unsigned long flags;
 
-   if (!np.dev)
+   if (!np.dev || !netif_running(np.dev))
return;
 
local_irq_save(flags);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> Here's a somewhat drastic modification that should not change any 
> timing, but just verifies whether my patch is to blame at all. Can you 
> give it a try?

> @@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
>* But at least it doesn't penalize the non-netpoll
>* code path. */
>   if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state))
> - return;
> + BUG();

ok, i tried the patch below, and it gave this (single) warning:

 Calling initcall 0xc02f5c17: init_netconsole+0x0/0x67()
 netconsole: device eth0 not up yet, forcing it
 netconsole: timeout waiting for carrier
 console [netcon0] enabled
 WARNING: at include/linux/netdevice.h:1030 netif_rx_complete()
  [] show_trace_log_lvl+0x19/0x2e
  [] show_trace+0x12/0x14
  [] dump_stack+0x14/0x16
  [] e1000_clean+0x1f4/0x26f
  [] netpoll_poll+0x8b/0x357
  [] netpoll_send_skb+0xe8/0x14c
  [] netpoll_send_udp+0x258/0x260
  [] write_msg+0x53/0x8d
  [] __call_console_drivers+0x4e/0x5a
  [] _call_console_drivers+0x5d/0x61
  [] release_console_sem+0x120/0x1c1
  [] register_console+0x22e/0x236
  [] init_netconsole+0x55/0x67
  [] kernel_init+0x154/0x2d9
  [] kernel_thread_helper+0x7/0x10
  ===
 e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX/TX
 netconsole: network logging started
 initcall 0xc02f5c17: init_netconsole+0x0/0x67() returned 0.
 initcall 0xc02f5c17 ran for 4012 msecs: init_netconsole+0x0/0x67()
 Calling initcall 0xc0600ff9: spi_transport_init+0x0/0x27()
 initcall 0xc0600ff9: spi_transport_init+0x0/0x27() returned 0.


Ingo

---
 include/linux/netdevice.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-cfs-2.6.23-git.q.prev3/include/linux/netdevice.h
===
--- linux-cfs-2.6.23-git.q.prev3.orig/include/linux/netdevice.h
+++ linux-cfs-2.6.23-git.q.prev3/include/linux/netdevice.h
@@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
 * But at least it doesn't penalize the non-netpoll
 * code path. */
if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state))
-   return;
+   WARN_ON(1);
 #endif
 
local_irq_save(flags);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 18:07, Ingo Molnar wrote:
> because i dont seem to be able to trigger Olaf's WARN_ON(), can you see 
> anything in the ethtool output that i sent in the previous mail(s)?

If the WARN_ON doesn't trigger, I cannot see how my patch would affect
your system.

 -  IF we enter the if() branch in poll_napi, then the following
must hold:
 -  an interrupt from the e1000 arrived
 -  e1000_intr disables interrupts, increments irq_sem
(which is now 1) and schedules rx_action

 -  Now, inside poll_napi, the following happens:
 -  poll_napi is called, finds the device is marked for
polling, invokes dev->poll
 -  dev->poll calls netif_rx_complete (which does *not*
remove the device from the poll list), and re-enables
interrupts. irq_sem is now 0.

 -  Finally, the rx_action softirq is run, which calls dev->poll
again, which ends up invoking netif_rx_complete once more,
and tries to re-enable interrupts. The latter doesn't do
anything except decrementing irq_sem once more, which now
goes negative.

Which would trigger the WARN_ON.

Now, as you say the WARN_ON is never triggered, it follows that
we never end up in the if() branch of poll_napi. But that is
where the only substantial modification of my patch is.

Here's a somewhat drastic modification that should not change any
timing, but just verifies whether my patch is to blame at all. Can
you give it a try?

Thanks,
Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
Test patch
---
 include/linux/netdevice.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: build-2.6/include/linux/netdevice.h
===
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
 * But at least it doesn't penalize the non-netpoll
 * code path. */
if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state))
-   return;
+   BUG();
 #endif
 
local_irq_save(flags);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 19:36, Olaf Kirch wrote:
> Can you confirm this by spraying the laptop with arp packets
> or broadcast pings while it's booting?

Sorry for the noise - didn't see your other message where you
described just that.

This sounds more like a hardware issue - Rx interrupt seems to
go through (triggering all the usual NAPI cleanup), but the Tx
descriptor writeback interrupt doesn't.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> On Thursday 19 July 2007 18:05, Ingo Molnar wrote:
> > that network-intense test also produced periodic broadcast packets that 
> > got the e1000 out of its weird state before the tx timeout could hit. 
> > Now that i've stopped the test, the network is quiescent again and the 
> > e1000 hangs.
> 
> Can you confirm this by spraying the laptop with arp packets or 
> broadcast pings while it's booting?

yes, i did exactly that, and it made the box boot.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 18:05, Ingo Molnar wrote:
> that network-intense test also produced periodic broadcast packets that 
> got the e1000 out of its weird state before the tx timeout could hit. 
> Now that i've stopped the test, the network is quiescent again and the 
> e1000 hangs.

Can you confirm this by spraying the laptop with arp packets
or broadcast pings while it's booting?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > i'll now check whether removing ignore_on_loglevel (no other 
> > changes) makes the hang go away. Maybe ignore_on_loglevel is buggy - 
> > or it produces an immediate printk (going out to the interface) 
> > during a particularly sensitive period of network initialization.
> 
> nope, that didnt have any effect. I have another theory: i had a 
> network-intense stress-test going on in the past few hours on the same 
> network (not involving the laptop) and i stopped it recently. Perhaps 
> that network-intense test also produced periodic broadcast packets 
> that got the e1000 out of its weird state before the tx timeout could 
> hit. Now that i've stopped the test, the network is quiescent again 
> and the e1000 hangs.

yep - i can confirm this theory now: if i start a broadcast ping on 
another box while the laptop is booting up, the network hang does not 
happen. If i stop the ping and do another bootup with the very same 
kernel, it hangs. So for the hang to happen, the network has to be very 
quiet. At least one mystery solved :-/

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Kok, Auke <[EMAIL PROTECTED]> wrote:

> > I don't have a fix ready yet - I hope I'll have something later this 
> > afternoon.
> 
> interesting, you seem to found the cause allright. I can't confirm the 
> problem but I know that netpoll and NAPI has historically been an 
> issue. I look forward to your suggestions...

because i dont seem to be able to trigger Olaf's WARN_ON(), can you see 
anything in the ethtool output that i sent in the previous mail(s)?

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> i'll now check whether removing ignore_on_loglevel (no other changes) 
> makes the hang go away. Maybe ignore_on_loglevel is buggy - or it 
> produces an immediate printk (going out to the interface) during a 
> particularly sensitive period of network initialization.

nope, that didnt have any effect. I have another theory: i had a 
network-intense stress-test going on in the past few hours on the same 
network (not involving the laptop) and i stopped it recently. Perhaps 
that network-intense test also produced periodic broadcast packets that 
got the e1000 out of its weird state before the tx timeout could hit. 
Now that i've stopped the test, the network is quiescent again and the 
e1000 hangs.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> ah! Just found the reason: the bug apparently depends on the precise 
> kernel command-line contents. I accidentally dropped ignore_loglevel 
> (found this while comparing with the older logs i sent to you), adding 
> it back in produces hung networking too. So it appears that a 
> netconsole printout while e1000 is initializing (or while some other 
> networking component is initializing) might be the culprit?

and the WARN_ON() below does not seem to trigger.

i'll now check whether removing ignore_on_loglevel (no other changes) 
makes the hang go away. Maybe ignore_on_loglevel is buggy - or it 
produces an immediate printk (going out to the interface) during a 
particularly sensitive period of network initialization.

Ingo

->
Index: linux/drivers/net/e1000/e1000_main.c
===
--- linux.orig/drivers/net/e1000/e1000_main.c
+++ linux/drivers/net/e1000/e1000_main.c
@@ -358,6 +358,7 @@ e1000_irq_enable(struct e1000_adapter *a
E1000_WRITE_REG(>hw, IMS, IMS_ENABLE_MASK);
E1000_WRITE_FLUSH(>hw);
}
+   WARN_ON_ONCE(atomic_read(>irq_sem) < 0);
 }
 
 static void
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Kok, Auke

Olaf Kirch wrote:

On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
hickup symptoms, with no other bad symptoms such as lockups or crashes.


Duh, I found it.

The e1000 poll routine does this to leave polling mode.

netif_rx_complete(poll_dev);
e1000_irq_enable(adapter);
return 0;

Which looks innocent enough, except that e1000_irq_enable has
this little irq_sem counter:

if (likely(atomic_dec_and_test(>irq_sem))) {
E1000_WRITE_REG(>hw, IMS, IMS_ENABLE_MASK);
E1000_WRITE_FLUSH(>hw);
}

So as poll_napi calls the poll() routine repeatedly, the irq_sem
counter is decremented by one each time. During the first call,
it re-enables the interrupt. During the next calls, irq_sem goes
negative.

Then an interrupt comes in, e1000_intr disables the interrupt,
increments irq_sem by one, and schedules the device for rx_action.
rx_action calls dev->poll(), which finishes cleaning rx/rx rings,
and when it finds there's no more work, it calls rx_complete and
irq_enable. Except irq_enable doesn't enable anything now, since
irq_sem is <= 0, and dec_and_test returns false.

The whole irq_sem accounting in the e1000 does not rhyme well with
netpoll's way of exercising dev->poll().


it's been accused of worse things ;)


The reason my patch triggers
the problem reliably for you is that now, we always get at least
two invocations of dev->poll: once from poll_napi - where we do not
remove the device from the poll list any longer - and another one
from net_rx_action.

I don't have a fix ready yet - I hope I'll have something later
this afternoon.


interesting, you seem to found the cause allright. I can't confirm the problem 
but I know that netpoll and NAPI has historically been an issue. I look forward 
to your suggestions...


Cheers,

Auke
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> ugh. Something really weird happened with this e1000 problem.
> 
> i crashed the laptop in a weird way and had to power-cycle it in an 
> unusual fashion. After that i wanted to try your latest BUG_ON() 
> theory but the network hang went away!
> 
> For 3 hours i tried to reproduce the hang (i went back to the original 
> git tree and the .config under which i found it, i power cycled it 
> again, i unplugged the power cord to make it go off battery, unplugged 
> the ethernet, recreated a completely new tree, etc. etc.) but with no 
> success! Total Heisenbug - and the really annoying thing is that you 
> just had a good theory about what might be happening. I'm now at 
> kernel build iteration #75 in this tree ...

ah! Just found the reason: the bug apparently depends on the precise 
kernel command-line contents. I accidentally dropped ignore_loglevel 
(found this while comparing with the older logs i sent to you), adding 
it back in produces hung networking too. So it appears that a netconsole 
printout while e1000 is initializing (or while some other networking 
component is initializing) might be the culprit?

Ingo

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 17:07, Ingo Molnar wrote:
> i crashed the laptop in a weird way and had to power-cycle it in an 
> unusual fashion. After that i wanted to try your latest BUG_ON() theory 
> but the network hang went away!

Should I rejoice, or regret? :-)

> maybe it's not the power-cycling that somehow brought the e1000 out of 
> its weird state but the ethtool ops (and the related firmware readouts, 
> etc.)?

Oh, don't tell me there's a --scrub-bad-karma switch in ethtool.

> (i have your BUG_ON() applied (as a WARN_ON()), but it doesnt trigger. 
> Not that this means anything under these circumstances ...)

Too bad. Now where do we take it from here? I'm currently thinking of
ways to do this patch differently. But that is kind of relying on a
good testbed to verify whether a different patch works better for you
or not...

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

ugh. Something really weird happened with this e1000 problem.

i crashed the laptop in a weird way and had to power-cycle it in an 
unusual fashion. After that i wanted to try your latest BUG_ON() theory 
but the network hang went away!

For 3 hours i tried to reproduce the hang (i went back to the original 
git tree and the .config under which i found it, i power cycled it 
again, i unplugged the power cord to make it go off battery, unplugged 
the ethernet, recreated a completely new tree, etc. etc.) but with no 
success! Total Heisenbug - and the really annoying thing is that you 
just had a good theory about what might be happening. I'm now at kernel 
build iteration #75 in this tree ...

maybe it's not the power-cycling that somehow brought the e1000 out of 
its weird state but the ethtool ops (and the related firmware readouts, 
etc.)?

(i have your BUG_ON() applied (as a WARN_ON()), but it doesnt trigger. 
Not that this means anything under these circumstances ...)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 14:52, Olaf Kirch wrote:
> On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
> > i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
> > hickup symptoms, with no other bad symptoms such as lockups or crashes.
> 
> Duh, I found it.

The following patch should confirm my theory. Can you give it a try,
please?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
Index: build-2.6/drivers/net/e1000/e1000_main.c
===
--- build-2.6.orig/drivers/net/e1000/e1000_main.c
+++ build-2.6/drivers/net/e1000/e1000_main.c
@@ -358,6 +358,7 @@ e1000_irq_enable(struct e1000_adapter *a
E1000_WRITE_REG(>hw, IMS, IMS_ENABLE_MASK);
E1000_WRITE_FLUSH(>hw);
}
+   BUG_ON(atomic_read(>irq_sem) < 0);
 }
 
 static void

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
> i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
> hickup symptoms, with no other bad symptoms such as lockups or crashes.

Duh, I found it.

The e1000 poll routine does this to leave polling mode.

netif_rx_complete(poll_dev);
e1000_irq_enable(adapter);
return 0;

Which looks innocent enough, except that e1000_irq_enable has
this little irq_sem counter:

if (likely(atomic_dec_and_test(>irq_sem))) {
E1000_WRITE_REG(>hw, IMS, IMS_ENABLE_MASK);
E1000_WRITE_FLUSH(>hw);
}

So as poll_napi calls the poll() routine repeatedly, the irq_sem
counter is decremented by one each time. During the first call,
it re-enables the interrupt. During the next calls, irq_sem goes
negative.

Then an interrupt comes in, e1000_intr disables the interrupt,
increments irq_sem by one, and schedules the device for rx_action.
rx_action calls dev->poll(), which finishes cleaning rx/rx rings,
and when it finds there's no more work, it calls rx_complete and
irq_enable. Except irq_enable doesn't enable anything now, since
irq_sem is <= 0, and dec_and_test returns false.

The whole irq_sem accounting in the e1000 does not rhyme well with
netpoll's way of exercising dev->poll(). The reason my patch triggers
the problem reliably for you is that now, we always get at least
two invocations of dev->poll: once from poll_napi - where we do not
remove the device from the poll list any longer - and another one
from net_rx_action.

I don't have a fix ready yet - I hope I'll have something later
this afternoon.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> * Olaf Kirch <[EMAIL PROTECTED]> wrote:
> 
> > On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
> > >  Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
> > >  initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
> > >  initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
> > >  Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
> > >  NET: Registered protocol family 16
> > > 
> > > and no output ever since - and the box has been up for a few minutes.
> > 
> > Okay, I need to ask a stupid question - did you verify that it's not 
> > spinning on a spinlock?

ok, i just to make my description clearer: 'netconsole output hung' 
means that on the netconsole-receiving box (which is not the laptop with 
the problem) i dont get any netconsole output printed. (i.e. UDP packets 
are not being sent by the laptop.) The above snippet is the last i get. 
Otherwise the laptop boots up fine, but has that early tx timeout and 
subsequently no networking. I can still do things on the laptop as 
normal, but eth0 irqs do not advance (are stuck at 5) and networking is 
stuck.

i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
hickup symptoms, with no other bad symptoms such as lockups or crashes.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
> >  Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
> >  initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
> >  initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
> >  Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
> >  NET: Registered protocol family 16
> > 
> > and no output ever since - and the box has been up for a few minutes.
> 
> Okay, I need to ask a stupid question - did you verify that it's not 
> spinning on a spinlock?

the box is fully usable after it has booted up and (as you can see it in 
the config i've uploaded) i've got every kernel debug option enabled, 
including lockdep.

> Specifically, I'm wondering whether the net_rx_action softirq may be 
> scheduled while we're in poll_napi holding the poll_lock. 
> net_rx_action would try to take the poll_lock as well, and we'd be 
> hung for good. The patch with local_bh_disable/enable was supposed to 
> test that idea (this is the "trickle" patch)

the box isnt hung - it just has no networking. (i couldnt get the logs 
out if it were hung - netconsole is the only remote debugging option and 
with that broken i can only get info out if it boots up fine)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
>  Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
>  initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
>  initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
>  Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
>  NET: Registered protocol family 16
> 
> and no output ever since - and the box has been up for a few minutes.

Okay, I need to ask a stupid question - did you verify that it's not
spinning on a spinlock?

Specifically, I'm wondering whether the net_rx_action softirq may
be scheduled while we're in poll_napi holding the poll_lock.
net_rx_action would try to take the poll_lock as well, and we'd
be hung for good. The patch with local_bh_disable/enable was
supposed to test that idea (this is the "trickle" patch)

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> the e1000 in this laptop is historically pretty robust. The only 
> problem i ever had with it were some rx/tx hw-engine latency problems 
> [pings from the outside took up to 1 second to propagate] that were 
> quickly fixed by the e1000 driver guys. Maybe that's related. 
> (although it never caused total inavailability of networking - it was 
> only latency problems)

i've Cc:-ed Auke - maybe he has some ideas? I've also attached all the 
other possibly relevant ethtool dumps of eth0 below, done when the hung 
interface happens. (Auke, see more details about this problem in this 
thread on lkml.)

another thing: how can i make the tx timeout much longer than it is 
right now? Maybe an interrupt is delayed for long and we just dont 
tolerate things long enough - and once the device tx timeout has been 
noticed we kill the interface from subsequent use in essence?

there's one oddity i noticed:

 0x02810: RDH   (Receive desc head)   0x
 0x02818: RDT   (Receive desc tail)   0x00FE

 0x03810: TDH   (Transmit desc head)  0x
 0x03818: TDT   (Transmit desc tail)  0x

doesnt this suggest that we have a receive packet, just the irq didnt 
come and we didnt process it?

although .. the tx timeout disabled the transmitter and the receiver of 
the device, correct? That in itself could have destroyed any evidence of 
what happened earlier on. But ... i'm really only guessing here.

Ingo

>

MAC Registers
-
0x0: CTRL (Device control register)  0x00140248
  Endian mode (buffers): little
  Link reset:reset
  Set link up:   1
  Invert Loss-Of-Signal: no
  Receive flow control:  disabled
  Transmit flow control: disabled
  VLAN mode: disabled
  Auto speed detect: disabled
  Speed select:  1000Mb/s
  Force speed:   no
  Force duplex:  no
0x8: STATUS (Device status register) 0x80080783
  Duplex:full
  Link up:   link config
  TBI mode:  disabled
  Link speed:1000Mb/s
  Bus type:  PCI
  Bus speed: 33MHz
  Bus width: 32-bit
0x00100: RCTL (Receive control register) 0x8002
  Receiver:  enabled
  Store bad packets: disabled
  Unicast promiscuous:   disabled
  Multicast promiscuous: disabled
  Long packet:   disabled
  Descriptor minimum threshold size: 1/2
  Broadcast accept mode: accept
  VLAN filter:   disabled
  Cononical form indicator:  disabled
  Discard pause frames:  filtered
  Pass MAC control frames:   don't pass
  Receive buffer size:   2048
0x02808: RDLEN (Receive desc length) 0x1000
0x02810: RDH   (Receive desc head)   0x
0x02818: RDT   (Receive desc tail)   0x00FE
0x02820: RDTR  (Receive delay timer) 0x
0x00400: TCTL (Transmit ctrl register)   0x31F8
  Transmitter:   disabled
  Pad short packets: enabled
  Software XOFF Transmission:disabled
  Re-transmit on late collision: enabled
0x03808: TDLEN (Transmit desc length)0x1000
0x03810: TDH   (Transmit desc head)  0x
0x03818: TDT   (Transmit desc tail)  0x
0x03820: TIDV  (Transmit delay timer)0x0008
PHY type:M88
Offset  Values
--  --
0x  00 16 41 17 49 d2 30 0b b2 ff 51 00 ff ff ff ff 
0x0010  53 00 03 01 6b 02 01 20 aa 17 9a 10 86 80 df 80 
0x0020  00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27 
0x0030  c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f 
0x0040  08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff 
0x0050  14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00 
0x0060  00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff 
0x0070  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ef 9f 
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini:0
RX Jumbo:   0
TX: 4096
Current hardware settings:
RX: 256
RX Mini:0
RX Jumbo:   0
TX: 256

driver: e1000
version: 7.3.20-k2-NAPI
firmware-version: 0.5-1
bus-info: :02:00.0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
NIC statistics:
 rx_packets: 0
 tx_packets: 591
   

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

>  -You say that netconsole output continues to trickle after
>   the network gets wedged. This could be caused by the
>   e1000 watchdog, which triggers a NIC interrupt "to ensure
>   rx ring is cleaned". I assume that this triggers the
>   regular e1000_intr, which succeeds in putting the NIC on
>   the poll_list, and net_rx_action call dev->poll once.

no - it appears that 'trickle' only happened with one of your patches 
(to which i replied with that 'trickle' mail). With what i have booted 
now (only your original patch and nothing else, 100 Hz and !dynticks), 
netconsole output stopped here:

 Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
 initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
 initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
 Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
 NET: Registered protocol family 16

and no output ever since - and the box has been up for a few minutes.

> So, can you verify whether there are any interrupts arriving on the 
> NIC after the network got wedged? You could also try ethtool -s eth0 
> msglevel 65535 - would be interesting to see what dmesg contains. If 
> there's little to no debug output from the driver, let it run for 10 
> seconds or so, in order to catch the e1000 watchdog timer a few times.

eth0's irq count is stuck at 5 interrupts - and has not changed for 
minutes.

i tried ethtool -s eth0 msglvl 65535, but (sa expected) there's no 
output. I've attached below ifconfig output and ethtool -S output - 
maybe that tells you something new about the state of eth0. (to me it 
only tells what we already know: tx timed out once and eth0 is stuck 
ever since.)

Btw., i definitely need your help with this bug as it's now hopelessly 
out of my league :-/

Ingo

-->
eth0  Link encap:Ethernet  HWaddr 00:16:41:17:49:D2
  inet addr:10.0.1.15  Bcast:10.255.255.255  Mask:255.0.0.0
  UP BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:873 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000 
  RX bytes:0 (0.0 b)  TX bytes:87076 (85.0 KiB)
  Base address:0x2000 Memory:ee00-ee02 

NIC statistics:
 rx_packets: 0
 tx_packets: 873
 rx_bytes: 0
 tx_bytes: 87076
 rx_broadcast: 0
 tx_broadcast: 0
 rx_multicast: 0
 tx_multicast: 0
 rx_errors: 0
 tx_errors: 0
 tx_dropped: 0
 multicast: 0
 collisions: 0
 rx_length_errors: 0
 rx_over_errors: 0
 rx_crc_errors: 0
 rx_frame_errors: 0
 rx_no_buffer_count: 0
 rx_missed_errors: 0
 tx_aborted_errors: 0
 tx_carrier_errors: 0
 tx_fifo_errors: 0
 tx_heartbeat_errors: 0
 tx_window_errors: 0
 tx_abort_late_coll: 0
 tx_deferred_ok: 0
 tx_single_coll_ok: 0
 tx_multi_coll_ok: 0
 tx_timeout_count: 1
 tx_restart_queue: 0
 rx_long_length_errors: 0
 rx_short_length_errors: 0
 rx_align_errors: 0
 tx_tcp_seg_good: 0
 tx_tcp_seg_failed: 0
 rx_flow_control_xon: 0
 rx_flow_control_xoff: 0
 tx_flow_control_xon: 0
 tx_flow_control_xoff: 0
 rx_long_byte_count: 0
 rx_csum_offload_good: 0
 rx_csum_offload_errors: 0
 rx_header_split: 0
 alloc_rx_buff_failed: 0
 tx_smbus: 0
 rx_smbus: 0
 dropped_smbus: 0
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 11:09, Ingo Molnar wrote:
> the e1000 in this laptop is historically pretty robust. The only problem 
> i ever had with it were some rx/tx hw-engine latency problems [pings 
> from the outside took up to 1 second to propagate] that were quickly 
> fixed by the e1000 driver guys. Maybe that's related. (although it never 
> caused total inavailability of networking - it was only latency 
> problems)

I've been poring over this code for 3 days now, and I'm facing a blank
wall, mind-wise :-)

 -  it is pretty clear that net_rx_action is invoked every once
in a while only. netdev watchdog timeouts are a pretty
unmistakable sign for that.

 -  You say that netconsole output continues to trickle after
the network gets wedged. This could be caused by the
e1000 watchdog, which triggers a NIC interrupt "to ensure
rx ring is cleaned". I assume that this triggers the
regular e1000_intr, which succeeds in putting the NIC on
the poll_list, and net_rx_action call dev->poll once.

If this assumption is true, this means that
 -  once an interrupt gets through, NAPI is working
as designed
 -  no other interrupts are arriving (Rx, Tx-completion)

So, can you verify whether there are any interrupts arriving on the
NIC after the network got wedged? You could also try
ethtool -s eth0 msglevel 65535 - would be interesting to see what
dmesg contains. If there's little to no debug output from the
driver, let it run for 10 seconds or so, in order to catch the
e1000 watchdog timer a few times.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

i have your original patch applied to my working tree to be able to 
observe this bug's behavior, and here's another observation: the problem 
seems to go away if i turn on CONFIG_NO_HZ. So it looks timing related 
indeed ...

but when the bug happens, it happens all the time, reboot after reboot. 
When it doesnt happen, networking and netlogging is robust for hours, 
reboot after reboot. That seems atypical for timing problems. I'm 
puzzled.

the e1000 in this laptop is historically pretty robust. The only problem 
i ever had with it were some rx/tx hw-engine latency problems [pings 
from the outside took up to 1 second to propagate] that were quickly 
fixed by the e1000 driver guys. Maybe that's related. (although it never 
caused total inavailability of networking - it was only latency 
problems)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Jarek Poplawski
On Wed, Jul 18, 2007 at 01:48:20PM +0200, Jarek Poplawski wrote:
...
> I'd be very glad if it could be verified and/or tested.

Jarek,

This patch is verified crap!

Regards,
Jarek P.

PS: Olaf,

You've written earlier that one of the main reasons for poll_napi is
to work when the kernel "doesn't even service softirqs anymore". But
in your patch poll_napi leaves netif_rx_complete for softirqs, so
even if it starts to work for Ingo in normal conditions, probably
something else is needed, anyway.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Jarek Poplawski
On Wed, Jul 18, 2007 at 01:48:20PM +0200, Jarek Poplawski wrote:
...
 I'd be very glad if it could be verified and/or tested.

Jarek,

This patch is verified crap!

Regards,
Jarek P.

PS: Olaf,

You've written earlier that one of the main reasons for poll_napi is
to work when the kernel doesn't even service softirqs anymore. But
in your patch poll_napi leaves netif_rx_complete for softirqs, so
even if it starts to work for Ingo in normal conditions, probably
something else is needed, anyway.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

i have your original patch applied to my working tree to be able to 
observe this bug's behavior, and here's another observation: the problem 
seems to go away if i turn on CONFIG_NO_HZ. So it looks timing related 
indeed ...

but when the bug happens, it happens all the time, reboot after reboot. 
When it doesnt happen, networking and netlogging is robust for hours, 
reboot after reboot. That seems atypical for timing problems. I'm 
puzzled.

the e1000 in this laptop is historically pretty robust. The only problem 
i ever had with it were some rx/tx hw-engine latency problems [pings 
from the outside took up to 1 second to propagate] that were quickly 
fixed by the e1000 driver guys. Maybe that's related. (although it never 
caused total inavailability of networking - it was only latency 
problems)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 11:09, Ingo Molnar wrote:
 the e1000 in this laptop is historically pretty robust. The only problem 
 i ever had with it were some rx/tx hw-engine latency problems [pings 
 from the outside took up to 1 second to propagate] that were quickly 
 fixed by the e1000 driver guys. Maybe that's related. (although it never 
 caused total inavailability of networking - it was only latency 
 problems)

I've been poring over this code for 3 days now, and I'm facing a blank
wall, mind-wise :-)

 -  it is pretty clear that net_rx_action is invoked every once
in a while only. netdev watchdog timeouts are a pretty
unmistakable sign for that.

 -  You say that netconsole output continues to trickle after
the network gets wedged. This could be caused by the
e1000 watchdog, which triggers a NIC interrupt to ensure
rx ring is cleaned. I assume that this triggers the
regular e1000_intr, which succeeds in putting the NIC on
the poll_list, and net_rx_action call dev-poll once.

If this assumption is true, this means that
 -  once an interrupt gets through, NAPI is working
as designed
 -  no other interrupts are arriving (Rx, Tx-completion)

So, can you verify whether there are any interrupts arriving on the
NIC after the network got wedged? You could also try
ethtool -s eth0 msglevel 65535 - would be interesting to see what
dmesg contains. If there's little to no debug output from the
driver, let it run for 10 seconds or so, in order to catch the
e1000 watchdog timer a few times.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

  -You say that netconsole output continues to trickle after
   the network gets wedged. This could be caused by the
   e1000 watchdog, which triggers a NIC interrupt to ensure
   rx ring is cleaned. I assume that this triggers the
   regular e1000_intr, which succeeds in putting the NIC on
   the poll_list, and net_rx_action call dev-poll once.

no - it appears that 'trickle' only happened with one of your patches 
(to which i replied with that 'trickle' mail). With what i have booted 
now (only your original patch and nothing else, 100 Hz and !dynticks), 
netconsole output stopped here:

 Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
 initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
 initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
 Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
 NET: Registered protocol family 16

and no output ever since - and the box has been up for a few minutes.

 So, can you verify whether there are any interrupts arriving on the 
 NIC after the network got wedged? You could also try ethtool -s eth0 
 msglevel 65535 - would be interesting to see what dmesg contains. If 
 there's little to no debug output from the driver, let it run for 10 
 seconds or so, in order to catch the e1000 watchdog timer a few times.

eth0's irq count is stuck at 5 interrupts - and has not changed for 
minutes.

i tried ethtool -s eth0 msglvl 65535, but (sa expected) there's no 
output. I've attached below ifconfig output and ethtool -S output - 
maybe that tells you something new about the state of eth0. (to me it 
only tells what we already know: tx timed out once and eth0 is stuck 
ever since.)

Btw., i definitely need your help with this bug as it's now hopelessly 
out of my league :-/

Ingo

--
eth0  Link encap:Ethernet  HWaddr 00:16:41:17:49:D2
  inet addr:10.0.1.15  Bcast:10.255.255.255  Mask:255.0.0.0
  UP BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:873 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000 
  RX bytes:0 (0.0 b)  TX bytes:87076 (85.0 KiB)
  Base address:0x2000 Memory:ee00-ee02 

NIC statistics:
 rx_packets: 0
 tx_packets: 873
 rx_bytes: 0
 tx_bytes: 87076
 rx_broadcast: 0
 tx_broadcast: 0
 rx_multicast: 0
 tx_multicast: 0
 rx_errors: 0
 tx_errors: 0
 tx_dropped: 0
 multicast: 0
 collisions: 0
 rx_length_errors: 0
 rx_over_errors: 0
 rx_crc_errors: 0
 rx_frame_errors: 0
 rx_no_buffer_count: 0
 rx_missed_errors: 0
 tx_aborted_errors: 0
 tx_carrier_errors: 0
 tx_fifo_errors: 0
 tx_heartbeat_errors: 0
 tx_window_errors: 0
 tx_abort_late_coll: 0
 tx_deferred_ok: 0
 tx_single_coll_ok: 0
 tx_multi_coll_ok: 0
 tx_timeout_count: 1
 tx_restart_queue: 0
 rx_long_length_errors: 0
 rx_short_length_errors: 0
 rx_align_errors: 0
 tx_tcp_seg_good: 0
 tx_tcp_seg_failed: 0
 rx_flow_control_xon: 0
 rx_flow_control_xoff: 0
 tx_flow_control_xon: 0
 tx_flow_control_xoff: 0
 rx_long_byte_count: 0
 rx_csum_offload_good: 0
 rx_csum_offload_errors: 0
 rx_header_split: 0
 alloc_rx_buff_failed: 0
 tx_smbus: 0
 rx_smbus: 0
 dropped_smbus: 0
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 the e1000 in this laptop is historically pretty robust. The only 
 problem i ever had with it were some rx/tx hw-engine latency problems 
 [pings from the outside took up to 1 second to propagate] that were 
 quickly fixed by the e1000 driver guys. Maybe that's related. 
 (although it never caused total inavailability of networking - it was 
 only latency problems)

i've Cc:-ed Auke - maybe he has some ideas? I've also attached all the 
other possibly relevant ethtool dumps of eth0 below, done when the hung 
interface happens. (Auke, see more details about this problem in this 
thread on lkml.)

another thing: how can i make the tx timeout much longer than it is 
right now? Maybe an interrupt is delayed for long and we just dont 
tolerate things long enough - and once the device tx timeout has been 
noticed we kill the interface from subsequent use in essence?

there's one oddity i noticed:

 0x02810: RDH   (Receive desc head)   0x
 0x02818: RDT   (Receive desc tail)   0x00FE

 0x03810: TDH   (Transmit desc head)  0x
 0x03818: TDT   (Transmit desc tail)  0x

doesnt this suggest that we have a receive packet, just the irq didnt 
come and we didnt process it?

although .. the tx timeout disabled the transmitter and the receiver of 
the device, correct? That in itself could have destroyed any evidence of 
what happened earlier on. But ... i'm really only guessing here.

Ingo



MAC Registers
-
0x0: CTRL (Device control register)  0x00140248
  Endian mode (buffers): little
  Link reset:reset
  Set link up:   1
  Invert Loss-Of-Signal: no
  Receive flow control:  disabled
  Transmit flow control: disabled
  VLAN mode: disabled
  Auto speed detect: disabled
  Speed select:  1000Mb/s
  Force speed:   no
  Force duplex:  no
0x8: STATUS (Device status register) 0x80080783
  Duplex:full
  Link up:   link config
  TBI mode:  disabled
  Link speed:1000Mb/s
  Bus type:  PCI
  Bus speed: 33MHz
  Bus width: 32-bit
0x00100: RCTL (Receive control register) 0x8002
  Receiver:  enabled
  Store bad packets: disabled
  Unicast promiscuous:   disabled
  Multicast promiscuous: disabled
  Long packet:   disabled
  Descriptor minimum threshold size: 1/2
  Broadcast accept mode: accept
  VLAN filter:   disabled
  Cononical form indicator:  disabled
  Discard pause frames:  filtered
  Pass MAC control frames:   don't pass
  Receive buffer size:   2048
0x02808: RDLEN (Receive desc length) 0x1000
0x02810: RDH   (Receive desc head)   0x
0x02818: RDT   (Receive desc tail)   0x00FE
0x02820: RDTR  (Receive delay timer) 0x
0x00400: TCTL (Transmit ctrl register)   0x31F8
  Transmitter:   disabled
  Pad short packets: enabled
  Software XOFF Transmission:disabled
  Re-transmit on late collision: enabled
0x03808: TDLEN (Transmit desc length)0x1000
0x03810: TDH   (Transmit desc head)  0x
0x03818: TDT   (Transmit desc tail)  0x
0x03820: TIDV  (Transmit delay timer)0x0008
PHY type:M88
Offset  Values
--  --
0x  00 16 41 17 49 d2 30 0b b2 ff 51 00 ff ff ff ff 
0x0010  53 00 03 01 6b 02 01 20 aa 17 9a 10 86 80 df 80 
0x0020  00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27 
0x0030  c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f 
0x0040  08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff 
0x0050  14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00 
0x0060  00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff 
0x0070  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ef 9f 
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini:0
RX Jumbo:   0
TX: 4096
Current hardware settings:
RX: 256
RX Mini:0
RX Jumbo:   0
TX: 256

driver: e1000
version: 7.3.20-k2-NAPI
firmware-version: 0.5-1
bus-info: :02:00.0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
NIC statistics:
 rx_packets: 0
 tx_packets: 591
 

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
  Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
  initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
  initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
  Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
  NET: Registered protocol family 16
 
 and no output ever since - and the box has been up for a few minutes.

Okay, I need to ask a stupid question - did you verify that it's not
spinning on a spinlock?

Specifically, I'm wondering whether the net_rx_action softirq may
be scheduled while we're in poll_napi holding the poll_lock.
net_rx_action would try to take the poll_lock as well, and we'd
be hung for good. The patch with local_bh_disable/enable was
supposed to test that idea (this is the trickle patch)

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
   Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
   initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
   initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
   Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
   NET: Registered protocol family 16
  
  and no output ever since - and the box has been up for a few minutes.
 
 Okay, I need to ask a stupid question - did you verify that it's not 
 spinning on a spinlock?

the box is fully usable after it has booted up and (as you can see it in 
the config i've uploaded) i've got every kernel debug option enabled, 
including lockdep.

 Specifically, I'm wondering whether the net_rx_action softirq may be 
 scheduled while we're in poll_napi holding the poll_lock. 
 net_rx_action would try to take the poll_lock as well, and we'd be 
 hung for good. The patch with local_bh_disable/enable was supposed to 
 test that idea (this is the trickle patch)

the box isnt hung - it just has no networking. (i couldnt get the logs 
out if it were hung - netconsole is the only remote debugging option and 
with that broken i can only get info out if it boots up fine)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 * Olaf Kirch [EMAIL PROTECTED] wrote:
 
  On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
NET: Registered protocol family 16
   
   and no output ever since - and the box has been up for a few minutes.
  
  Okay, I need to ask a stupid question - did you verify that it's not 
  spinning on a spinlock?

ok, i just to make my description clearer: 'netconsole output hung' 
means that on the netconsole-receiving box (which is not the laptop with 
the problem) i dont get any netconsole output printed. (i.e. UDP packets 
are not being sent by the laptop.) The above snippet is the last i get. 
Otherwise the laptop boots up fine, but has that early tx timeout and 
subsequently no networking. I can still do things on the laptop as 
normal, but eth0 irqs do not advance (are stuck at 5) and networking is 
stuck.

i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
hickup symptoms, with no other bad symptoms such as lockups or crashes.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
 i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
 hickup symptoms, with no other bad symptoms such as lockups or crashes.

Duh, I found it.

The e1000 poll routine does this to leave polling mode.

netif_rx_complete(poll_dev);
e1000_irq_enable(adapter);
return 0;

Which looks innocent enough, except that e1000_irq_enable has
this little irq_sem counter:

if (likely(atomic_dec_and_test(adapter-irq_sem))) {
E1000_WRITE_REG(adapter-hw, IMS, IMS_ENABLE_MASK);
E1000_WRITE_FLUSH(adapter-hw);
}

So as poll_napi calls the poll() routine repeatedly, the irq_sem
counter is decremented by one each time. During the first call,
it re-enables the interrupt. During the next calls, irq_sem goes
negative.

Then an interrupt comes in, e1000_intr disables the interrupt,
increments irq_sem by one, and schedules the device for rx_action.
rx_action calls dev-poll(), which finishes cleaning rx/rx rings,
and when it finds there's no more work, it calls rx_complete and
irq_enable. Except irq_enable doesn't enable anything now, since
irq_sem is = 0, and dec_and_test returns false.

The whole irq_sem accounting in the e1000 does not rhyme well with
netpoll's way of exercising dev-poll(). The reason my patch triggers
the problem reliably for you is that now, we always get at least
two invocations of dev-poll: once from poll_napi - where we do not
remove the device from the poll list any longer - and another one
from net_rx_action.

I don't have a fix ready yet - I hope I'll have something later
this afternoon.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 14:52, Olaf Kirch wrote:
 On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
  i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
  hickup symptoms, with no other bad symptoms such as lockups or crashes.
 
 Duh, I found it.

The following patch should confirm my theory. Can you give it a try,
please?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
Index: build-2.6/drivers/net/e1000/e1000_main.c
===
--- build-2.6.orig/drivers/net/e1000/e1000_main.c
+++ build-2.6/drivers/net/e1000/e1000_main.c
@@ -358,6 +358,7 @@ e1000_irq_enable(struct e1000_adapter *a
E1000_WRITE_REG(adapter-hw, IMS, IMS_ENABLE_MASK);
E1000_WRITE_FLUSH(adapter-hw);
}
+   BUG_ON(atomic_read(adapter-irq_sem)  0);
 }
 
 static void

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

ugh. Something really weird happened with this e1000 problem.

i crashed the laptop in a weird way and had to power-cycle it in an 
unusual fashion. After that i wanted to try your latest BUG_ON() theory 
but the network hang went away!

For 3 hours i tried to reproduce the hang (i went back to the original 
git tree and the .config under which i found it, i power cycled it 
again, i unplugged the power cord to make it go off battery, unplugged 
the ethernet, recreated a completely new tree, etc. etc.) but with no 
success! Total Heisenbug - and the really annoying thing is that you 
just had a good theory about what might be happening. I'm now at kernel 
build iteration #75 in this tree ...

maybe it's not the power-cycling that somehow brought the e1000 out of 
its weird state but the ethtool ops (and the related firmware readouts, 
etc.)?

(i have your BUG_ON() applied (as a WARN_ON()), but it doesnt trigger. 
Not that this means anything under these circumstances ...)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 17:07, Ingo Molnar wrote:
 i crashed the laptop in a weird way and had to power-cycle it in an 
 unusual fashion. After that i wanted to try your latest BUG_ON() theory 
 but the network hang went away!

Should I rejoice, or regret? :-)

 maybe it's not the power-cycling that somehow brought the e1000 out of 
 its weird state but the ethtool ops (and the related firmware readouts, 
 etc.)?

Oh, don't tell me there's a --scrub-bad-karma switch in ethtool.

 (i have your BUG_ON() applied (as a WARN_ON()), but it doesnt trigger. 
 Not that this means anything under these circumstances ...)

Too bad. Now where do we take it from here? I'm currently thinking of
ways to do this patch differently. But that is kind of relying on a
good testbed to verify whether a different patch works better for you
or not...

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 ugh. Something really weird happened with this e1000 problem.
 
 i crashed the laptop in a weird way and had to power-cycle it in an 
 unusual fashion. After that i wanted to try your latest BUG_ON() 
 theory but the network hang went away!
 
 For 3 hours i tried to reproduce the hang (i went back to the original 
 git tree and the .config under which i found it, i power cycled it 
 again, i unplugged the power cord to make it go off battery, unplugged 
 the ethernet, recreated a completely new tree, etc. etc.) but with no 
 success! Total Heisenbug - and the really annoying thing is that you 
 just had a good theory about what might be happening. I'm now at 
 kernel build iteration #75 in this tree ...

ah! Just found the reason: the bug apparently depends on the precise 
kernel command-line contents. I accidentally dropped ignore_loglevel 
(found this while comparing with the older logs i sent to you), adding 
it back in produces hung networking too. So it appears that a netconsole 
printout while e1000 is initializing (or while some other networking 
component is initializing) might be the culprit?

Ingo

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Kok, Auke

Olaf Kirch wrote:

On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
hickup symptoms, with no other bad symptoms such as lockups or crashes.


Duh, I found it.

The e1000 poll routine does this to leave polling mode.

netif_rx_complete(poll_dev);
e1000_irq_enable(adapter);
return 0;

Which looks innocent enough, except that e1000_irq_enable has
this little irq_sem counter:

if (likely(atomic_dec_and_test(adapter-irq_sem))) {
E1000_WRITE_REG(adapter-hw, IMS, IMS_ENABLE_MASK);
E1000_WRITE_FLUSH(adapter-hw);
}

So as poll_napi calls the poll() routine repeatedly, the irq_sem
counter is decremented by one each time. During the first call,
it re-enables the interrupt. During the next calls, irq_sem goes
negative.

Then an interrupt comes in, e1000_intr disables the interrupt,
increments irq_sem by one, and schedules the device for rx_action.
rx_action calls dev-poll(), which finishes cleaning rx/rx rings,
and when it finds there's no more work, it calls rx_complete and
irq_enable. Except irq_enable doesn't enable anything now, since
irq_sem is = 0, and dec_and_test returns false.

The whole irq_sem accounting in the e1000 does not rhyme well with
netpoll's way of exercising dev-poll().


it's been accused of worse things ;)


The reason my patch triggers
the problem reliably for you is that now, we always get at least
two invocations of dev-poll: once from poll_napi - where we do not
remove the device from the poll list any longer - and another one
from net_rx_action.

I don't have a fix ready yet - I hope I'll have something later
this afternoon.


interesting, you seem to found the cause allright. I can't confirm the problem 
but I know that netpoll and NAPI has historically been an issue. I look forward 
to your suggestions...


Cheers,

Auke
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 ah! Just found the reason: the bug apparently depends on the precise 
 kernel command-line contents. I accidentally dropped ignore_loglevel 
 (found this while comparing with the older logs i sent to you), adding 
 it back in produces hung networking too. So it appears that a 
 netconsole printout while e1000 is initializing (or while some other 
 networking component is initializing) might be the culprit?

and the WARN_ON() below does not seem to trigger.

i'll now check whether removing ignore_on_loglevel (no other changes) 
makes the hang go away. Maybe ignore_on_loglevel is buggy - or it 
produces an immediate printk (going out to the interface) during a 
particularly sensitive period of network initialization.

Ingo

-
Index: linux/drivers/net/e1000/e1000_main.c
===
--- linux.orig/drivers/net/e1000/e1000_main.c
+++ linux/drivers/net/e1000/e1000_main.c
@@ -358,6 +358,7 @@ e1000_irq_enable(struct e1000_adapter *a
E1000_WRITE_REG(adapter-hw, IMS, IMS_ENABLE_MASK);
E1000_WRITE_FLUSH(adapter-hw);
}
+   WARN_ON_ONCE(atomic_read(adapter-irq_sem)  0);
 }
 
 static void
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 i'll now check whether removing ignore_on_loglevel (no other changes) 
 makes the hang go away. Maybe ignore_on_loglevel is buggy - or it 
 produces an immediate printk (going out to the interface) during a 
 particularly sensitive period of network initialization.

nope, that didnt have any effect. I have another theory: i had a 
network-intense stress-test going on in the past few hours on the same 
network (not involving the laptop) and i stopped it recently. Perhaps 
that network-intense test also produced periodic broadcast packets that 
got the e1000 out of its weird state before the tx timeout could hit. 
Now that i've stopped the test, the network is quiescent again and the 
e1000 hangs.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Kok, Auke [EMAIL PROTECTED] wrote:

  I don't have a fix ready yet - I hope I'll have something later this 
  afternoon.
 
 interesting, you seem to found the cause allright. I can't confirm the 
 problem but I know that netpoll and NAPI has historically been an 
 issue. I look forward to your suggestions...

because i dont seem to be able to trigger Olaf's WARN_ON(), can you see 
anything in the ethtool output that i sent in the previous mail(s)?

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

  i'll now check whether removing ignore_on_loglevel (no other 
  changes) makes the hang go away. Maybe ignore_on_loglevel is buggy - 
  or it produces an immediate printk (going out to the interface) 
  during a particularly sensitive period of network initialization.
 
 nope, that didnt have any effect. I have another theory: i had a 
 network-intense stress-test going on in the past few hours on the same 
 network (not involving the laptop) and i stopped it recently. Perhaps 
 that network-intense test also produced periodic broadcast packets 
 that got the e1000 out of its weird state before the tx timeout could 
 hit. Now that i've stopped the test, the network is quiescent again 
 and the e1000 hangs.

yep - i can confirm this theory now: if i start a broadcast ping on 
another box while the laptop is booting up, the network hang does not 
happen. If i stop the ping and do another bootup with the very same 
kernel, it hangs. So for the hang to happen, the network has to be very 
quiet. At least one mystery solved :-/

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 18:05, Ingo Molnar wrote:
 that network-intense test also produced periodic broadcast packets that 
 got the e1000 out of its weird state before the tx timeout could hit. 
 Now that i've stopped the test, the network is quiescent again and the 
 e1000 hangs.

Can you confirm this by spraying the laptop with arp packets
or broadcast pings while it's booting?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 On Thursday 19 July 2007 18:05, Ingo Molnar wrote:
  that network-intense test also produced periodic broadcast packets that 
  got the e1000 out of its weird state before the tx timeout could hit. 
  Now that i've stopped the test, the network is quiescent again and the 
  e1000 hangs.
 
 Can you confirm this by spraying the laptop with arp packets or 
 broadcast pings while it's booting?

yes, i did exactly that, and it made the box boot.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 19:36, Olaf Kirch wrote:
 Can you confirm this by spraying the laptop with arp packets
 or broadcast pings while it's booting?

Sorry for the noise - didn't see your other message where you
described just that.

This sounds more like a hardware issue - Rx interrupt seems to
go through (triggering all the usual NAPI cleanup), but the Tx
descriptor writeback interrupt doesn't.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 18:07, Ingo Molnar wrote:
 because i dont seem to be able to trigger Olaf's WARN_ON(), can you see 
 anything in the ethtool output that i sent in the previous mail(s)?

If the WARN_ON doesn't trigger, I cannot see how my patch would affect
your system.

 -  IF we enter the if() branch in poll_napi, then the following
must hold:
 -  an interrupt from the e1000 arrived
 -  e1000_intr disables interrupts, increments irq_sem
(which is now 1) and schedules rx_action

 -  Now, inside poll_napi, the following happens:
 -  poll_napi is called, finds the device is marked for
polling, invokes dev-poll
 -  dev-poll calls netif_rx_complete (which does *not*
remove the device from the poll list), and re-enables
interrupts. irq_sem is now 0.

 -  Finally, the rx_action softirq is run, which calls dev-poll
again, which ends up invoking netif_rx_complete once more,
and tries to re-enable interrupts. The latter doesn't do
anything except decrementing irq_sem once more, which now
goes negative.

Which would trigger the WARN_ON.

Now, as you say the WARN_ON is never triggered, it follows that
we never end up in the if() branch of poll_napi. But that is
where the only substantial modification of my patch is.

Here's a somewhat drastic modification that should not change any
timing, but just verifies whether my patch is to blame at all. Can
you give it a try?

Thanks,
Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
Test patch
---
 include/linux/netdevice.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: build-2.6/include/linux/netdevice.h
===
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
 * But at least it doesn't penalize the non-netpoll
 * code path. */
if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state))
-   return;
+   BUG();
 #endif
 
local_irq_save(flags);
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 Here's a somewhat drastic modification that should not change any 
 timing, but just verifies whether my patch is to blame at all. Can you 
 give it a try?

 @@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
* But at least it doesn't penalize the non-netpoll
* code path. */
   if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state))
 - return;
 + BUG();

ok, i tried the patch below, and it gave this (single) warning:

 Calling initcall 0xc02f5c17: init_netconsole+0x0/0x67()
 netconsole: device eth0 not up yet, forcing it
 netconsole: timeout waiting for carrier
 console [netcon0] enabled
 WARNING: at include/linux/netdevice.h:1030 netif_rx_complete()
  [c0105e3e] show_trace_log_lvl+0x19/0x2e
  [c0105f37] show_trace+0x12/0x14
  [c0105f4d] dump_stack+0x14/0x16
  [c02c4fef] e1000_clean+0x1f4/0x26f
  [c03d0f6f] netpoll_poll+0x8b/0x357
  [c03d0e80] netpoll_send_skb+0xe8/0x14c
  [c03d1502] netpoll_send_udp+0x258/0x260
  [c02f5cea] write_msg+0x53/0x8d
  [c012c5ba] __call_console_drivers+0x4e/0x5a
  [c012c623] _call_console_drivers+0x5d/0x61
  [c012cc42] release_console_sem+0x120/0x1c1
  [c012d446] register_console+0x22e/0x236
  [c02f5c6c] init_netconsole+0x55/0x67
  [c05e48e5] kernel_init+0x154/0x2d9
  [c0105c53] kernel_thread_helper+0x7/0x10
  ===
 e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX/TX
 netconsole: network logging started
 initcall 0xc02f5c17: init_netconsole+0x0/0x67() returned 0.
 initcall 0xc02f5c17 ran for 4012 msecs: init_netconsole+0x0/0x67()
 Calling initcall 0xc0600ff9: spi_transport_init+0x0/0x27()
 initcall 0xc0600ff9: spi_transport_init+0x0/0x27() returned 0.


Ingo

---
 include/linux/netdevice.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-cfs-2.6.23-git.q.prev3/include/linux/netdevice.h
===
--- linux-cfs-2.6.23-git.q.prev3.orig/include/linux/netdevice.h
+++ linux-cfs-2.6.23-git.q.prev3/include/linux/netdevice.h
@@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
 * But at least it doesn't penalize the non-netpoll
 * code path. */
if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state))
-   return;
+   WARN_ON(1);
 #endif
 
local_irq_save(flags);
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch

Does the following help?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
Test patch
---
Index: build-2.6/drivers/net/netconsole.c
===
--- build-2.6.orig/drivers/net/netconsole.c
+++ build-2.6/drivers/net/netconsole.c
@@ -70,7 +70,7 @@ static void write_msg(struct console *co
int frag, left;
unsigned long flags;
 
-   if (!np.dev)
+   if (!np.dev || !netif_running(np.dev))
return;
 
local_irq_save(flags);

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 Does the following help?

 --- build-2.6.orig/drivers/net/netconsole.c
 +++ build-2.6/drivers/net/netconsole.c
 @@ -70,7 +70,7 @@ static void write_msg(struct console *co
   int frag, left;
   unsigned long flags;
  
 - if (!np.dev)
 + if (!np.dev || !netif_running(np.dev))
   return;

nope - with this patch applied the box still has no network, symptoms 
are similar. (should i apply the WARN_ON() patch too?)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 21:56, Ingo Molnar wrote:
 nope - with this patch applied the box still has no network, symptoms 
 are similar. (should i apply the WARN_ON() patch too?)

Yes, that would be nice. If that doesn't help, you can also throw in
the one below.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
Index: build-2.6/net/core/netpoll.c
===
--- build-2.6.orig/net/core/netpoll.c
+++ build-2.6/net/core/netpoll.c
@@ -650,7 +650,6 @@ int netpoll_setup(struct netpoll *np)
return -ENODEV;
}
 
-   np-dev = ndev;
if (!ndev-npinfo) {
npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
if (!npinfo) {
@@ -722,6 +721,8 @@ int netpoll_setup(struct netpoll *np)
}
}
 
+   np-dev = ndev;
+
if (is_zero_ether_addr(np-local_mac)  ndev-dev_addr)
memcpy(np-local_mac, ndev-dev_addr, 6);
 

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> > also, i'm using netconsole via the command line (both the network 
> > driver and netconsole is built into the bzImage), maybe that makes a 
> > difference?
> 
> Possibly - but so far there's nothing in the code that jumped at me.
> 
> Can you try the following please? I'm still pretty much in the dark; 
> so I'm adding a bunch of printks. Let's hope this doesn't change the 
> timing in a way that makes the bug disappear.

hm ... the box wouldnt even boot up because the 'keep on polling' and 
'done' messages are scrolling across so fast. Perhaps every printk 
triggers a packet which triggers another printk ... etc?

here's the 15MB log of it:

  http://redhat.com/~mingo/misc/100hz.3.log.bz2

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Olaf Kirch
On Wednesday 18 July 2007 14:48, Ingo Molnar wrote:
> something i noticed: netconsole output seems to trickle through though, 
> but very, very slowly (a packet once every 4 seconds or so). TCP/IP is 
> not functional.
> 
> also, i'm using netconsole via the command line (both the network driver 
> and netconsole is built into the bzImage), maybe that makes a 
> difference?

Possibly - but so far there's nothing in the code that jumped at me.

Can you try the following please? I'm still pretty much in the dark;
so I'm adding a bunch of printks. Let's hope this doesn't change the
timing in a way that makes the bug disappear.

Thanks
Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-

---
 include/linux/netdevice.h |4 
 net/core/dev.c|   14 ++
 2 files changed, 18 insertions(+)

Index: build-2.6/include/linux/netdevice.h
===
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -692,6 +692,7 @@ struct softnet_data
 #ifdef CONFIG_NET_DMA
struct dma_chan *net_dma;
 #endif
+   unsigned long   scheduled;
 };
 
 DECLARE_PER_CPU(struct softnet_data,softnet_data);
@@ -1026,6 +1027,9 @@ static inline void netif_rx_complete(str
/* Prevent race with netpoll - yes, this is a kludge.
 * But at least it doesn't penalize the non-netpoll
 * code path. */
+   printk(KERN_DEBUG "netif_rx_complete(%s)%s\n", dev->name,
+   test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state)?
+   " - frozen" : "");
if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state))
return;
 #endif
Index: build-2.6/net/core/dev.c
===
--- build-2.6.orig/net/core/dev.c
+++ build-2.6/net/core/dev.c
@@ -1192,6 +1192,7 @@ void __netif_rx_schedule(struct net_devi
 {
unsigned long flags;
 
+   printk(KERN_DEBUG "__netif_rx_schedule(%s)\n", dev->name);
local_irq_save(flags);
dev_hold(dev);
list_add_tail(>poll_list, &__get_cpu_var(softnet_data).poll_list);
@@ -1199,6 +1200,7 @@ void __netif_rx_schedule(struct net_devi
dev->quota += dev->weight;
else
dev->quota = dev->weight;
+   __get_cpu_var(softnet_data).scheduled = jiffies;
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
local_irq_restore(flags);
 }
@@ -2028,6 +2030,9 @@ static void net_rx_action(struct softirq
 
local_irq_disable();
 
+   /* Complain if we're scheduled way too late. */
+   WARN_ON(time_after(jiffies, __get_cpu_var(softnet_data).scheduled + 
HZ));
+
while (!list_empty(>poll_list)) {
struct net_device *dev;
 
@@ -2038,9 +2043,13 @@ static void net_rx_action(struct softirq
 
dev = list_entry(queue->poll_list.next,
 struct net_device, poll_list);
+
have = netpoll_poll_lock(dev);
 
+   BUG_ON(test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state));
if (dev->quota <= 0 || dev->poll(dev, )) {
+   printk(KERN_DEBUG "netif_rx_action(%s) - keep on 
polling\n",
+   dev->name);
netpoll_poll_unlock(have);
local_irq_disable();
list_move_tail(>poll_list, >poll_list);
@@ -2049,6 +2058,10 @@ static void net_rx_action(struct softirq
else
dev->quota = dev->weight;
} else {
+   printk(KERN_DEBUG "netif_rx_action(%s) - done%s\n",
+   dev->name,
+   test_bit(__LINK_STATE_RX_SCHED, 
>state)?
+"; rx_sched set" : "");
netpoll_poll_unlock(have);
dev_put(dev);
local_irq_disable();
@@ -2074,6 +2087,7 @@ out:
 
 softnet_break:
__get_cpu_var(netdev_rx_stat).time_squeeze++;
+   __get_cpu_var(softnet_data).scheduled = jiffies;
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
goto out;
 }

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
> > i logged these not via netconsole but via logging on over the console 
> > and using dmesg, so it should include everything. in the 100hz case the 
> > following seems to show the anomaly:
> > 
> >   NETDEV WATCHDOG: eth0: transmit timed out
> 
> So, it seems as if for some reason, dev->poll isn't called frequently 
> enough.
> 
> Here's a debugging patch that tries to locate the problem - can you 
> give it a try, please?

it triggers here:

 netconsole: device eth0 not up yet, forcing it
 netconsole: timeout waiting for carrier
 console [netcon0] enabled
 WARNING: at kernel/softirq.c:138 local_bh_enable()
  [] show_trace_log_lvl+0x19/0x2e
  [] show_trace+0x12/0x14
  [] dump_stack+0x14/0x16
  [] local_bh_enable+0x95/0x15d
  [] netpoll_poll+0xaf/0x361
  [] netpoll_send_skb+0xe8/0x14c
  [] netpoll_send_udp+0x258/0x260
  [] write_msg+0x53/0x8d
  [] __call_console_drivers+0x4e/0x5a
  [] _call_console_drivers+0x5d/0x61
  [] release_console_sem+0x120/0x1c1
  [] register_console+0x22e/0x236
  [] init_netconsole+0x55/0x67
  [] kernel_init+0x154/0x2d9
  [] kernel_thread_helper+0x7/0x10

I've uploaded the full log to:

 http://redhat.com/~mingo/misc/100hz.2.log

something i noticed: netconsole output seems to trickle through though, 
but very, very slowly (a packet once every 4 seconds or so). TCP/IP is 
not functional.

also, i'm using netconsole via the command line (both the network driver 
and netconsole is built into the bzImage), maybe that makes a 
difference?

(if there's any other data you'd like to see, let me know.)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
> > i logged these not via netconsole but via logging on over the console 
> > and using dmesg, so it should include everything. in the 100hz case the 
> > following seems to show the anomaly:
> > 
> >   NETDEV WATCHDOG: eth0: transmit timed out
> 
> So, it seems as if for some reason, dev->poll isn't called frequently 
> enough.
> 
> Here's a debugging patch that tries to locate the problem - can you 
> give it a try, please?

sure, give me a minute.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Olaf Kirch
On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
> i logged these not via netconsole but via logging on over the console 
> and using dmesg, so it should include everything. in the 100hz case the 
> following seems to show the anomaly:
> 
>   NETDEV WATCHDOG: eth0: transmit timed out

So, it seems as if for some reason, dev->poll isn't called frequently
enough.

Here's a debugging patch that tries to locate the problem - can you give it
a try, please?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

patching file drivers/net/e1000/e1000_main.c
---
 include/linux/netdevice.h |1 +
 net/core/dev.c|7 +++
 net/core/netpoll.c|2 ++
 3 files changed, 10 insertions(+)

Index: build-2.6/include/linux/netdevice.h
===
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -692,6 +692,7 @@ struct softnet_data
 #ifdef CONFIG_NET_DMA
struct dma_chan *net_dma;
 #endif
+   unsigned long   scheduled;
 };
 
 DECLARE_PER_CPU(struct softnet_data,softnet_data);
Index: build-2.6/net/core/dev.c
===
--- build-2.6.orig/net/core/dev.c
+++ build-2.6/net/core/dev.c
@@ -1199,6 +1199,7 @@ void __netif_rx_schedule(struct net_devi
dev->quota += dev->weight;
else
dev->quota = dev->weight;
+   __get_cpu_var(softnet_data).scheduled = jiffies;
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
local_irq_restore(flags);
 }
@@ -2028,6 +2029,9 @@ static void net_rx_action(struct softirq
 
local_irq_disable();
 
+   /* Complain if we're scheduled way too late. */
+   WARN_ON(time_after(jiffies, __get_cpu_var(softnet_data).scheduled + 
HZ));
+
while (!list_empty(>poll_list)) {
struct net_device *dev;
 
@@ -2038,8 +2042,10 @@ static void net_rx_action(struct softirq
 
dev = list_entry(queue->poll_list.next,
 struct net_device, poll_list);
+
have = netpoll_poll_lock(dev);
 
+   BUG_ON(test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state));
if (dev->quota <= 0 || dev->poll(dev, )) {
netpoll_poll_unlock(have);
local_irq_disable();
@@ -2074,6 +2080,7 @@ out:
 
 softnet_break:
__get_cpu_var(netdev_rx_stat).time_squeeze++;
+   __get_cpu_var(softnet_data).scheduled = jiffies;
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
goto out;
 }
Index: build-2.6/net/core/netpoll.c
===
--- build-2.6.orig/net/core/netpoll.c
+++ build-2.6/net/core/netpoll.c
@@ -130,6 +130,7 @@ static void poll_napi(struct netpoll *np
 * Setting POLL_LIST_FROZEN tells netif_rx_complete
 * to leave the NAPI state alone.
 */
+   local_bh_disable();
set_bit(__LINK_STATE_POLL_LIST_FROZEN, >dev->state);
npinfo->rx_flags |= NETPOLL_RX_DROP;
atomic_inc();
@@ -140,6 +141,7 @@ static void poll_napi(struct netpoll *np
npinfo->rx_flags &= ~NETPOLL_RX_DROP;
clear_bit(__LINK_STATE_POLL_LIST_FROZEN, >dev->state);
spin_unlock(>poll_lock);
+   local_bh_enable();
}
 }
 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Jarek Poplawski
Hi,

Here is my proposal of a solution based on dev->state flag,
but intended mainly to prevent poll_napi from disturbing
while net_rx_action is running and polling the device.

It doesn't look very nice or clean but I hope it could
guard net_rx_action enough with some room for netpoll too.

I'd be very glad if it could be verified and/or tested.

Regards,
Jarek P.

PS: not tested!

---

diff -Nurp 2.6.22-git9-/include/linux/netdevice.h 
2.6.22-git9/include/linux/netdevice.h
--- 2.6.22-git9-/include/linux/netdevice.h  2007-07-18 07:50:56.0 
+0200
+++ 2.6.22-git9/include/linux/netdevice.h   2007-07-18 13:21:12.0 
+0200
@@ -262,6 +262,8 @@ enum netdev_state_t
__LINK_STATE_LINKWATCH_PENDING,
__LINK_STATE_DORMANT,
__LINK_STATE_QDISC_RUNNING,
+   /* Set by net_rx_action vs poll_napi */
+   __LINK_STATE_POLL_LIST_FROZEN,
 };
 
 
diff -Nurp 2.6.22-git9-/net/core/dev.c 2.6.22-git9/net/core/dev.c
--- 2.6.22-git9-/net/core/dev.c 2007-07-18 07:50:58.0 +0200
+++ 2.6.22-git9/net/core/dev.c  2007-07-18 13:18:23.0 +0200
@@ -2025,6 +2025,7 @@ static void net_rx_action(struct softirq
unsigned long start_time = jiffies;
int budget = netdev_budget;
void *have;
+   struct net_device *dev_frozen = NULL;
 
local_irq_disable();
 
@@ -2040,15 +2041,35 @@ static void net_rx_action(struct softirq
 struct net_device, poll_list);
have = netpoll_poll_lock(dev);
 
+#ifdef CONFIG_NETPOLL
+   /* prevent a race with poll_napi() */
+   if (dev_frozen && dev_frozen != dev) {
+   clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+   _frozen->state);
+   set_bit(__LINK_STATE_POLL_LIST_FROZEN,
+>state);
+   dev_frozen = dev;
+   } else if (!dev_frozen) {
+   set_bit(__LINK_STATE_POLL_LIST_FROZEN,
+>state);
+   dev_frozen = dev;
+   }
+#endif
if (dev->quota <= 0 || dev->poll(dev, )) {
netpoll_poll_unlock(have);
local_irq_disable();
-   list_move_tail(>poll_list, >poll_list);
+   list_move_tail(>poll_list,
+>poll_list);
if (dev->quota < 0)
dev->quota += dev->weight;
else
dev->quota = dev->weight;
} else {
+#ifdef CONFIG_NETPOLL
+   clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+   >state);
+   dev_frozen = NULL;
+#endif
netpoll_poll_unlock(have);
dev_put(dev);
local_irq_disable();
@@ -2056,6 +2077,14 @@ static void net_rx_action(struct softirq
}
 out:
local_irq_enable();
+#ifdef CONFIG_NETPOLL
+   if (dev_frozen) {
+   have = netpoll_poll_lock(dev_frozen);
+   clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+   _frozen->state);
+   netpoll_poll_unlock(have);
+   }
+#endif
 #ifdef CONFIG_NET_DMA
/*
 * There may not be any more sk_buffs coming right now, so push
diff -Nurp 2.6.22-git9-/net/core/netpoll.c 2.6.22-git9/net/core/netpoll.c
--- 2.6.22-git9-/net/core/netpoll.c 2007-07-18 07:50:58.0 +0200
+++ 2.6.22-git9/net/core/netpoll.c  2007-07-18 12:09:43.0 +0200
@@ -124,13 +124,20 @@ static void poll_napi(struct netpoll *np
if (test_bit(__LINK_STATE_RX_SCHED, >dev->state) &&
npinfo->poll_owner != smp_processor_id() &&
spin_trylock(>poll_lock)) {
-   npinfo->rx_flags |= NETPOLL_RX_DROP;
-   atomic_inc();
+   /*
+* If POLL_LIST_FROZEN is set net_rx_action
+* is working with this device.
+*/
+   if (!test_bit(__LINK_STATE_POLL_LIST_FROZEN,
+>dev->state)) {
+   npinfo->rx_flags |= NETPOLL_RX_DROP;
+   atomic_inc();
 
-   np->dev->poll(np->dev, );
+   np->dev->poll(np->dev, );
 
-   atomic_dec();
-   npinfo->rx_flags &= ~NETPOLL_RX_DROP;
+   atomic_dec();
+   npinfo->rx_flags &= ~NETPOLL_RX_DROP;
+   }
spin_unlock(>poll_lock);
}
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Jarek Poplawski
Hi,

Here is my proposal of a solution based on dev-state flag,
but intended mainly to prevent poll_napi from disturbing
while net_rx_action is running and polling the device.

It doesn't look very nice or clean but I hope it could
guard net_rx_action enough with some room for netpoll too.

I'd be very glad if it could be verified and/or tested.

Regards,
Jarek P.

PS: not tested!

---

diff -Nurp 2.6.22-git9-/include/linux/netdevice.h 
2.6.22-git9/include/linux/netdevice.h
--- 2.6.22-git9-/include/linux/netdevice.h  2007-07-18 07:50:56.0 
+0200
+++ 2.6.22-git9/include/linux/netdevice.h   2007-07-18 13:21:12.0 
+0200
@@ -262,6 +262,8 @@ enum netdev_state_t
__LINK_STATE_LINKWATCH_PENDING,
__LINK_STATE_DORMANT,
__LINK_STATE_QDISC_RUNNING,
+   /* Set by net_rx_action vs poll_napi */
+   __LINK_STATE_POLL_LIST_FROZEN,
 };
 
 
diff -Nurp 2.6.22-git9-/net/core/dev.c 2.6.22-git9/net/core/dev.c
--- 2.6.22-git9-/net/core/dev.c 2007-07-18 07:50:58.0 +0200
+++ 2.6.22-git9/net/core/dev.c  2007-07-18 13:18:23.0 +0200
@@ -2025,6 +2025,7 @@ static void net_rx_action(struct softirq
unsigned long start_time = jiffies;
int budget = netdev_budget;
void *have;
+   struct net_device *dev_frozen = NULL;
 
local_irq_disable();
 
@@ -2040,15 +2041,35 @@ static void net_rx_action(struct softirq
 struct net_device, poll_list);
have = netpoll_poll_lock(dev);
 
+#ifdef CONFIG_NETPOLL
+   /* prevent a race with poll_napi() */
+   if (dev_frozen  dev_frozen != dev) {
+   clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+   dev_frozen-state);
+   set_bit(__LINK_STATE_POLL_LIST_FROZEN,
+dev-state);
+   dev_frozen = dev;
+   } else if (!dev_frozen) {
+   set_bit(__LINK_STATE_POLL_LIST_FROZEN,
+dev-state);
+   dev_frozen = dev;
+   }
+#endif
if (dev-quota = 0 || dev-poll(dev, budget)) {
netpoll_poll_unlock(have);
local_irq_disable();
-   list_move_tail(dev-poll_list, queue-poll_list);
+   list_move_tail(dev-poll_list,
+queue-poll_list);
if (dev-quota  0)
dev-quota += dev-weight;
else
dev-quota = dev-weight;
} else {
+#ifdef CONFIG_NETPOLL
+   clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+   dev-state);
+   dev_frozen = NULL;
+#endif
netpoll_poll_unlock(have);
dev_put(dev);
local_irq_disable();
@@ -2056,6 +2077,14 @@ static void net_rx_action(struct softirq
}
 out:
local_irq_enable();
+#ifdef CONFIG_NETPOLL
+   if (dev_frozen) {
+   have = netpoll_poll_lock(dev_frozen);
+   clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+   dev_frozen-state);
+   netpoll_poll_unlock(have);
+   }
+#endif
 #ifdef CONFIG_NET_DMA
/*
 * There may not be any more sk_buffs coming right now, so push
diff -Nurp 2.6.22-git9-/net/core/netpoll.c 2.6.22-git9/net/core/netpoll.c
--- 2.6.22-git9-/net/core/netpoll.c 2007-07-18 07:50:58.0 +0200
+++ 2.6.22-git9/net/core/netpoll.c  2007-07-18 12:09:43.0 +0200
@@ -124,13 +124,20 @@ static void poll_napi(struct netpoll *np
if (test_bit(__LINK_STATE_RX_SCHED, np-dev-state) 
npinfo-poll_owner != smp_processor_id() 
spin_trylock(npinfo-poll_lock)) {
-   npinfo-rx_flags |= NETPOLL_RX_DROP;
-   atomic_inc(trapped);
+   /*
+* If POLL_LIST_FROZEN is set net_rx_action
+* is working with this device.
+*/
+   if (!test_bit(__LINK_STATE_POLL_LIST_FROZEN,
+np-dev-state)) {
+   npinfo-rx_flags |= NETPOLL_RX_DROP;
+   atomic_inc(trapped);
 
-   np-dev-poll(np-dev, budget);
+   np-dev-poll(np-dev, budget);
 
-   atomic_dec(trapped);
-   npinfo-rx_flags = ~NETPOLL_RX_DROP;
+   atomic_dec(trapped);
+   npinfo-rx_flags = ~NETPOLL_RX_DROP;
+   }
spin_unlock(npinfo-poll_lock);
}
 }
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL 

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Olaf Kirch
On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
 i logged these not via netconsole but via logging on over the console 
 and using dmesg, so it should include everything. in the 100hz case the 
 following seems to show the anomaly:
 
   NETDEV WATCHDOG: eth0: transmit timed out

So, it seems as if for some reason, dev-poll isn't called frequently
enough.

Here's a debugging patch that tries to locate the problem - can you give it
a try, please?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

patching file drivers/net/e1000/e1000_main.c
---
 include/linux/netdevice.h |1 +
 net/core/dev.c|7 +++
 net/core/netpoll.c|2 ++
 3 files changed, 10 insertions(+)

Index: build-2.6/include/linux/netdevice.h
===
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -692,6 +692,7 @@ struct softnet_data
 #ifdef CONFIG_NET_DMA
struct dma_chan *net_dma;
 #endif
+   unsigned long   scheduled;
 };
 
 DECLARE_PER_CPU(struct softnet_data,softnet_data);
Index: build-2.6/net/core/dev.c
===
--- build-2.6.orig/net/core/dev.c
+++ build-2.6/net/core/dev.c
@@ -1199,6 +1199,7 @@ void __netif_rx_schedule(struct net_devi
dev-quota += dev-weight;
else
dev-quota = dev-weight;
+   __get_cpu_var(softnet_data).scheduled = jiffies;
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
local_irq_restore(flags);
 }
@@ -2028,6 +2029,9 @@ static void net_rx_action(struct softirq
 
local_irq_disable();
 
+   /* Complain if we're scheduled way too late. */
+   WARN_ON(time_after(jiffies, __get_cpu_var(softnet_data).scheduled + 
HZ));
+
while (!list_empty(queue-poll_list)) {
struct net_device *dev;
 
@@ -2038,8 +2042,10 @@ static void net_rx_action(struct softirq
 
dev = list_entry(queue-poll_list.next,
 struct net_device, poll_list);
+
have = netpoll_poll_lock(dev);
 
+   BUG_ON(test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state));
if (dev-quota = 0 || dev-poll(dev, budget)) {
netpoll_poll_unlock(have);
local_irq_disable();
@@ -2074,6 +2080,7 @@ out:
 
 softnet_break:
__get_cpu_var(netdev_rx_stat).time_squeeze++;
+   __get_cpu_var(softnet_data).scheduled = jiffies;
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
goto out;
 }
Index: build-2.6/net/core/netpoll.c
===
--- build-2.6.orig/net/core/netpoll.c
+++ build-2.6/net/core/netpoll.c
@@ -130,6 +130,7 @@ static void poll_napi(struct netpoll *np
 * Setting POLL_LIST_FROZEN tells netif_rx_complete
 * to leave the NAPI state alone.
 */
+   local_bh_disable();
set_bit(__LINK_STATE_POLL_LIST_FROZEN, np-dev-state);
npinfo-rx_flags |= NETPOLL_RX_DROP;
atomic_inc(trapped);
@@ -140,6 +141,7 @@ static void poll_napi(struct netpoll *np
npinfo-rx_flags = ~NETPOLL_RX_DROP;
clear_bit(__LINK_STATE_POLL_LIST_FROZEN, np-dev-state);
spin_unlock(npinfo-poll_lock);
+   local_bh_enable();
}
 }
 

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
  i logged these not via netconsole but via logging on over the console 
  and using dmesg, so it should include everything. in the 100hz case the 
  following seems to show the anomaly:
  
NETDEV WATCHDOG: eth0: transmit timed out
 
 So, it seems as if for some reason, dev-poll isn't called frequently 
 enough.
 
 Here's a debugging patch that tries to locate the problem - can you 
 give it a try, please?

sure, give me a minute.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
  i logged these not via netconsole but via logging on over the console 
  and using dmesg, so it should include everything. in the 100hz case the 
  following seems to show the anomaly:
  
NETDEV WATCHDOG: eth0: transmit timed out
 
 So, it seems as if for some reason, dev-poll isn't called frequently 
 enough.
 
 Here's a debugging patch that tries to locate the problem - can you 
 give it a try, please?

it triggers here:

 netconsole: device eth0 not up yet, forcing it
 netconsole: timeout waiting for carrier
 console [netcon0] enabled
 WARNING: at kernel/softirq.c:138 local_bh_enable()
  [c0105e4a] show_trace_log_lvl+0x19/0x2e
  [c0105f43] show_trace+0x12/0x14
  [c0105f59] dump_stack+0x14/0x16
  [c0130f8d] local_bh_enable+0x95/0x15d
  [c03cf35b] netpoll_poll+0xaf/0x361
  [c03cf248] netpoll_send_skb+0xe8/0x14c
  [c03cf8d4] netpoll_send_udp+0x258/0x260
  [c02f4016] write_msg+0x53/0x8d
  [c012c87e] __call_console_drivers+0x4e/0x5a
  [c012c8e7] _call_console_drivers+0x5d/0x61
  [c012cf06] release_console_sem+0x120/0x1c1
  [c012d70a] register_console+0x22e/0x236
  [c02f3f98] init_netconsole+0x55/0x67
  [c05e28e5] kernel_init+0x154/0x2d9
  [c0105c5f] kernel_thread_helper+0x7/0x10

I've uploaded the full log to:

 http://redhat.com/~mingo/misc/100hz.2.log

something i noticed: netconsole output seems to trickle through though, 
but very, very slowly (a packet once every 4 seconds or so). TCP/IP is 
not functional.

also, i'm using netconsole via the command line (both the network driver 
and netconsole is built into the bzImage), maybe that makes a 
difference?

(if there's any other data you'd like to see, let me know.)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Olaf Kirch
On Wednesday 18 July 2007 14:48, Ingo Molnar wrote:
 something i noticed: netconsole output seems to trickle through though, 
 but very, very slowly (a packet once every 4 seconds or so). TCP/IP is 
 not functional.
 
 also, i'm using netconsole via the command line (both the network driver 
 and netconsole is built into the bzImage), maybe that makes a 
 difference?

Possibly - but so far there's nothing in the code that jumped at me.

Can you try the following please? I'm still pretty much in the dark;
so I'm adding a bunch of printks. Let's hope this doesn't change the
timing in a way that makes the bug disappear.

Thanks
Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-

---
 include/linux/netdevice.h |4 
 net/core/dev.c|   14 ++
 2 files changed, 18 insertions(+)

Index: build-2.6/include/linux/netdevice.h
===
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -692,6 +692,7 @@ struct softnet_data
 #ifdef CONFIG_NET_DMA
struct dma_chan *net_dma;
 #endif
+   unsigned long   scheduled;
 };
 
 DECLARE_PER_CPU(struct softnet_data,softnet_data);
@@ -1026,6 +1027,9 @@ static inline void netif_rx_complete(str
/* Prevent race with netpoll - yes, this is a kludge.
 * But at least it doesn't penalize the non-netpoll
 * code path. */
+   printk(KERN_DEBUG netif_rx_complete(%s)%s\n, dev-name,
+   test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state)?
+- frozen : );
if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state))
return;
 #endif
Index: build-2.6/net/core/dev.c
===
--- build-2.6.orig/net/core/dev.c
+++ build-2.6/net/core/dev.c
@@ -1192,6 +1192,7 @@ void __netif_rx_schedule(struct net_devi
 {
unsigned long flags;
 
+   printk(KERN_DEBUG __netif_rx_schedule(%s)\n, dev-name);
local_irq_save(flags);
dev_hold(dev);
list_add_tail(dev-poll_list, __get_cpu_var(softnet_data).poll_list);
@@ -1199,6 +1200,7 @@ void __netif_rx_schedule(struct net_devi
dev-quota += dev-weight;
else
dev-quota = dev-weight;
+   __get_cpu_var(softnet_data).scheduled = jiffies;
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
local_irq_restore(flags);
 }
@@ -2028,6 +2030,9 @@ static void net_rx_action(struct softirq
 
local_irq_disable();
 
+   /* Complain if we're scheduled way too late. */
+   WARN_ON(time_after(jiffies, __get_cpu_var(softnet_data).scheduled + 
HZ));
+
while (!list_empty(queue-poll_list)) {
struct net_device *dev;
 
@@ -2038,9 +2043,13 @@ static void net_rx_action(struct softirq
 
dev = list_entry(queue-poll_list.next,
 struct net_device, poll_list);
+
have = netpoll_poll_lock(dev);
 
+   BUG_ON(test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state));
if (dev-quota = 0 || dev-poll(dev, budget)) {
+   printk(KERN_DEBUG netif_rx_action(%s) - keep on 
polling\n,
+   dev-name);
netpoll_poll_unlock(have);
local_irq_disable();
list_move_tail(dev-poll_list, queue-poll_list);
@@ -2049,6 +2058,10 @@ static void net_rx_action(struct softirq
else
dev-quota = dev-weight;
} else {
+   printk(KERN_DEBUG netif_rx_action(%s) - done%s\n,
+   dev-name,
+   test_bit(__LINK_STATE_RX_SCHED, 
dev-state)?
+; rx_sched set : );
netpoll_poll_unlock(have);
dev_put(dev);
local_irq_disable();
@@ -2074,6 +2087,7 @@ out:
 
 softnet_break:
__get_cpu_var(netdev_rx_stat).time_squeeze++;
+   __get_cpu_var(softnet_data).scheduled = jiffies;
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
goto out;
 }

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

  also, i'm using netconsole via the command line (both the network 
  driver and netconsole is built into the bzImage), maybe that makes a 
  difference?
 
 Possibly - but so far there's nothing in the code that jumped at me.
 
 Can you try the following please? I'm still pretty much in the dark; 
 so I'm adding a bunch of printks. Let's hope this doesn't change the 
 timing in a way that makes the bug disappear.

hm ... the box wouldnt even boot up because the 'keep on polling' and 
'done' messages are scrolling across so fast. Perhaps every printk 
triggers a packet which triggers another printk ... etc?

here's the 15MB log of it:

  http://redhat.com/~mingo/misc/100hz.3.log.bz2

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> On Tuesday 17 July 2007 20:18, Ingo Molnar wrote:
> > (one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
> > like HZ=250.)
> > 
> > no 'rx_sched set' messages in either case. Network still hung for 
> > HZ=100, and is working for HZ=1000.
> 
> Is this from dmesg or the netconsole output? I don't see any Tx Unit 
> Hang messages from e1000 or netdev watchdog messages present in your 
> earlier dmesg logs. So maybe these messages are there, but never get 
> logged?

i logged these not via netconsole but via logging on over the console 
and using dmesg, so it should include everything. in the 100hz case the 
following seems to show the anomaly:

  NETDEV WATCHDOG: eth0: transmit timed out

indeed it's not followed by the e1000 messages. (perhaps i didnt wait 
long enough to reboot the box - i zapped it after a minute of uptime)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 20:18, Ingo Molnar wrote:
> (one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
> like HZ=250.)
> 
> no 'rx_sched set' messages in either case. Network still hung for 
> HZ=100, and is working for HZ=1000.

Is this from dmesg or the netconsole output? I don't see any Tx Unit
Hang messages from e1000 or netdev watchdog messages present in your
earlier dmesg logs. So maybe these messages are there, but never
get logged?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> Hi Ingo,
> 
> On Tuesday 17 July 2007 18:57, Ingo Molnar wrote:
> > i've done the patch below, but it did not change the timeouts nor did it 
> > solve the 'no network' problem. netconsole output hung earlier as well.
> Hm, pity.
> 
> To rule out any e1000 problem, can you try the the following please, 
> both with HZ=250 and HZ=1000?

done, find the logs here:

 http://redhat.com/~mingo/misc/100hz.log
 http://redhat.com/~mingo/misc/1000hz.log

(one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
like HZ=250.)

no 'rx_sched set' messages in either case. Network still hung for 
HZ=100, and is working for HZ=1000.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

* David Miller <[EMAIL PROTECTED]> wrote:

> From: Ingo Molnar <[EMAIL PROTECTED]>
> Date: Tue, 17 Jul 2007 00:37:18 +0200
> 
> > I think if you leaned back and thought it through, and if you 
> > applied this scenario to a bad scheduler commit from me that broke 
> > your box, you'd readily agree with me =B-) (which scenario is purely 
> > hypothetical, my scheduler commits are all 100% perfect of course 
> > ;-)
> 
> Actually I'd probably send you a patch for any bug I found that 
> triggered on sparc64, since that's faster than asking you to fix a bug 
> that you are unlikely to be able to trigger on your own systems.
> 
> But that's just how I operate.
> 
> Ask Thomas Gleixner.  Every single hrtimers/nohz bug I've found on 
> sparc64, I sent him either a full fix or a full analysis of the bug 
> and a description of exactly what is going on and what needs to be 
> changed so he can compose the bug fix patch with minimal effort.

yeah, i too usually try to fix bugs straight away - just check:

   git-log net drivers/net | grep "Author: Ingo Molnar"

but in this current case i was freshly back from a few days off, had 
quite a backlog after a rather complex set of merges and i just didnt 
have the time.

And i really applaud you for the clockevents/dynticks contributions (i 
loved your dynticks patches and the clockevents framework fixes that 
resulted out of it), but i really, really think this case is apples to 
oranges. I had some urgent hacking to do in some completely different 
area (not networking) and i ran across that networking regression and 
spent more than an hour on bisecting it.

The bad commit looked simple so i thought the person who is doing it 
(Olaf) would immediately see the bug (i certainly didnt see it). It's 
also not that easy to debug that box, the only remote debugging 
interface to it is ... netconsole. If i were into networking changes 
right now i'd probably have tried to debug and fix it straight away.

So instead i opted to bisect it, to give you the exact bad commit, give 
you full logs of the incident and an offer to run arbitrary patches 
immediately.

There's also little loss from the revert AFAICS: the commit went into 
your tree on July 11th and went upstream July 12th and i found it on 
July 16th. So (unless the commit dates are deceiving me) it does not 
seem to be an over-tested patch with lots of QA value (yet), and it's 
not some critical commit that another 100 commits grew to depend on. 
It's nevertheless a fix we want to have, and i'm willing to test 
anything.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
Hi Ingo,

On Tuesday 17 July 2007 18:57, Ingo Molnar wrote:
> i've done the patch below, but it did not change the timeouts nor did it 
> solve the 'no network' problem. netconsole output hung earlier as well.
Hm, pity.

To rule out any e1000 problem, can you try the the following please,
both with HZ=250 and HZ=1000?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
---
 drivers/net/e1000/e1000_main.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: build-2.6/drivers/net/e1000/e1000_main.c
===
--- build-2.6.orig/drivers/net/e1000/e1000_main.c
+++ build-2.6/drivers/net/e1000/e1000_main.c
@@ -3871,10 +3871,17 @@ e1000_intr(int irq, void *data)
adapter->total_rx_bytes = 0;
adapter->total_rx_packets = 0;
__netif_rx_schedule(netdev);
-   } else
+   } else {
/* this really should not happen! if it does it is basically a
 * bug, but not a hard error, so enable ints and continue */
+   static unsigned int been_here = 0;
+
+   been_here++;
+   if (net_ratelimit())
+   printk(KERN_NOTICE "e1000_intr and rx_sched set (%u); 
state=0x%lx\n",
+   been_here, netdev->state);
e1000_irq_enable(adapter);
+   }
 #else
/* Writing IMC and IMS is needed for 82547.
 * Due to Hub Link bus being occupied, an interrupt
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Linus Torvalds


On Tue, 17 Jul 2007, Ingo Molnar wrote:
> 
> i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
> the problem go away. So it's somehow also related to jiffies.

No, I suspect it's just related to timing: you need to hit that window 
when the LIST_FROZEN bit is set, and since it was so reliable for you 
before (and others didn't see it), your timing probably happened to hit it 
exactly. And changing HZ probably just changed some timeout or softirq 
getting called, and you didn't hit the bug any more.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

* Olaf Kirch <[EMAIL PROTECTED]> wrote:

> Can you try what happens if you change netif_rx_complete to something 
> like this:
> 
>   if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state)) {
>   dev->quota = dev->weight;
>   return;
>   }
> 
> This is just a hack to make sure that we don't go to insanely negative 
> quotas while sending packets through netpoll.

i've done the patch below, but it did not change the timeouts nor did it 
solve the 'no network' problem. netconsole output hung earlier as well.

i can try other things too. (it would be best if you sent me 
test/debug-patches, instead of letting me hack in the changes - that's 
the surest way of ensuring that i test exactly what you intended.)

Ingo

->
Index: linux/include/linux/netdevice.h
===
--- linux.orig/include/linux/netdevice.h
+++ linux/include/linux/netdevice.h
@@ -1007,6 +1007,10 @@ static inline int netif_rx_reschedule(st
  */
 static inline void __netif_rx_complete(struct net_device *dev)
 {
+   if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state)) {
+   dev->quota = dev->weight;
+   return;
+   }
BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, >state));
list_del(>poll_list);
smp_mb__before_clear_bit();
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 10:57, Ingo Molnar wrote:
> i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
> the problem go away. So it's somehow also related to jiffies.

There are several "Tx Hang detected" messages in the log, which looks
a lot as if net_rx_action never runs, or at least never calls
dev->poll on the e1000 nic.

Can you check whether/how often it bails out of net_rx_action taking
the softnet_break path?

My suspicion right now is that dev->quota goes way negative when
pushing out netconsole output. Normally, we bump the quota in
__net_rx_schedule. But the whole point of the patch is that
netpoll has no business removing the device from the poll_list,
so it stays there, and we don't end up calling __net_rx_schedule
as often as we would otherwise.

Can you try what happens if you change netif_rx_complete to
something like this:

if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, >state)) {
dev->quota = dev->weight;
return;
}

This is just a hack to make sure that we don't go to insanely
negative quotas while sending packets through netpoll.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 10:57:48AM +0200, Ingo Molnar wrote:
> 
> Olaf,
> 
> i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
> the problem go away. So it's somehow also related to jiffies.

IMHO it could be related with __LINK_STATE_RX_SCHED beeing set
too long e.g. between two softirqs. So, it could be really
a matter of timing out - not necessarily permanent error state.

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 10:28:34AM +0200, Olaf Kirch wrote:
> On Tuesday 17 July 2007 09:55, Olaf Kirch wrote:
> > What I find more problematic about this portion of code though
> > is that once a net_device is over quota, net_rx_action will
> > loop for up to one jiffy, even if there's just this one device on
> > the poll_list.
> 
> Duh, wrong. For every loop, it'll add dev->weight to dev->quota,
> so it'll be fine very soon.

After your patch only two things can be different here: dev->poll_list
and this flag. Since Ingo offered testing it should be easy to check
after moving the check of "your" flag to __netif_rx_complete, and
skipping only list_del on true.

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

Olaf,

i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
the problem go away. So it's somehow also related to jiffies.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 09:55, Olaf Kirch wrote:
> What I find more problematic about this portion of code though
> is that once a net_device is over quota, net_rx_action will
> loop for up to one jiffy, even if there's just this one device on
> the poll_list.

Duh, wrong. For every loop, it'll add dev->weight to dev->quota,
so it'll be fine very soon.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Monday 16 July 2007 23:40, Linus Torvalds wrote:
>  - The change seems to always set the LIST_FROZEN bit when calling 
>->poll(), and at least on e1000, the NAPI poll() routine ends up doing 
>that netif_rx_complete(), so we're *guaranteed* to always take the 
>early exit and not do anything.

That is only in the poll_napi case, where netpoll tries to push pending
frames. The default caller for dev->poll() is net_rx_action.

We only ever get into this particular code branch in poll_napi when
the RX_SCHED bit is already set, ie someone put the device on the NAPI
poll_list and raised the softirq. poll_napi is just doing manually what
net_rx_action would do anyway once the softirq is serviced.

>  - The early return from netif_rx_complete() ends up meaning that an 
>edge-triggered interrupt isn't handled properly, and will this never 
>happen again, since it never goes away.

Sorry, I may be sitting on my brain this morning, but I don't understand
how skipping netif_rx_complete would affect ACKing of interrupts.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 08:14, Jarek Poplawski wrote:
> > If after poll_napi dev->quota <= 0 dev->poll is not run and
> > __LINK_STATE_RX_SCHED bit (plus dev->poll_list) stays uncleared.
> 
> Or, more precisely dev->poll_list will be cleared just after this,
> and net_rx_action returns with __LINK_STATE_RX_SCHED bit set.

I don't think so. dev will remain on poll_list.

What I find more problematic about this portion of code though
is that once a net_device is over quota, net_rx_action will
loop for up to one jiffy, even if there's just this one device on
the poll_list.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 00:08, David Miller wrote:
> Sure, but I thought it would be nice to give Olaf a day or two to
> figure out what's going on rather than have the knee-jerk reaction to
> just revert.

Oh, reverting is fine with me. I'll just resubmit the patch.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 07:46:39AM +0200, Jarek Poplawski wrote:
...
> > static void net_rx_action(struct softirq_action *h)
> > {
> > struct softnet_data *queue = &__get_cpu_var(softnet_data);
> > unsigned long start_time = jiffies;
> > int budget = netdev_budget;
> > void *have;
> > 
> > local_irq_disable();
> > 
> > while (!list_empty(>poll_list)) {
> > struct net_device *dev;
> > 
> > if (budget <= 0 || jiffies - start_time > 1)
> > goto softnet_break;
> > 
> > local_irq_enable();
> > 
> > dev = list_entry(queue->poll_list.next,
> >  struct net_device, poll_list);
> > have = netpoll_poll_lock(dev);
> > 
> > if (dev->quota <= 0 || dev->poll(dev, )) {
> 
> If after poll_napi dev->quota <= 0 dev->poll is not run and
> __LINK_STATE_RX_SCHED bit (plus dev->poll_list) stays uncleared.

Or, more precisely dev->poll_list will be cleared just after this,
and net_rx_action returns with __LINK_STATE_RX_SCHED bit set.

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 07:46:39AM +0200, Jarek Poplawski wrote:
...
  static void net_rx_action(struct softirq_action *h)
  {
  struct softnet_data *queue = __get_cpu_var(softnet_data);
  unsigned long start_time = jiffies;
  int budget = netdev_budget;
  void *have;
  
  local_irq_disable();
  
  while (!list_empty(queue-poll_list)) {
  struct net_device *dev;
  
  if (budget = 0 || jiffies - start_time  1)
  goto softnet_break;
  
  local_irq_enable();
  
  dev = list_entry(queue-poll_list.next,
   struct net_device, poll_list);
  have = netpoll_poll_lock(dev);
  
  if (dev-quota = 0 || dev-poll(dev, budget)) {
 
 If after poll_napi dev-quota = 0 dev-poll is not run and
 __LINK_STATE_RX_SCHED bit (plus dev-poll_list) stays uncleared.

Or, more precisely dev-poll_list will be cleared just after this,
and net_rx_action returns with __LINK_STATE_RX_SCHED bit set.

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 00:08, David Miller wrote:
 Sure, but I thought it would be nice to give Olaf a day or two to
 figure out what's going on rather than have the knee-jerk reaction to
 just revert.

Oh, reverting is fine with me. I'll just resubmit the patch.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 08:14, Jarek Poplawski wrote:
  If after poll_napi dev-quota = 0 dev-poll is not run and
  __LINK_STATE_RX_SCHED bit (plus dev-poll_list) stays uncleared.
 
 Or, more precisely dev-poll_list will be cleared just after this,
 and net_rx_action returns with __LINK_STATE_RX_SCHED bit set.

I don't think so. dev will remain on poll_list.

What I find more problematic about this portion of code though
is that once a net_device is over quota, net_rx_action will
loop for up to one jiffy, even if there's just this one device on
the poll_list.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Monday 16 July 2007 23:40, Linus Torvalds wrote:
  - The change seems to always set the LIST_FROZEN bit when calling 
-poll(), and at least on e1000, the NAPI poll() routine ends up doing 
that netif_rx_complete(), so we're *guaranteed* to always take the 
early exit and not do anything.

That is only in the poll_napi case, where netpoll tries to push pending
frames. The default caller for dev-poll() is net_rx_action.

We only ever get into this particular code branch in poll_napi when
the RX_SCHED bit is already set, ie someone put the device on the NAPI
poll_list and raised the softirq. poll_napi is just doing manually what
net_rx_action would do anyway once the softirq is serviced.

  - The early return from netif_rx_complete() ends up meaning that an 
edge-triggered interrupt isn't handled properly, and will this never 
happen again, since it never goes away.

Sorry, I may be sitting on my brain this morning, but I don't understand
how skipping netif_rx_complete would affect ACKing of interrupts.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 09:55, Olaf Kirch wrote:
 What I find more problematic about this portion of code though
 is that once a net_device is over quota, net_rx_action will
 loop for up to one jiffy, even if there's just this one device on
 the poll_list.

Duh, wrong. For every loop, it'll add dev-weight to dev-quota,
so it'll be fine very soon.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

Olaf,

i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
the problem go away. So it's somehow also related to jiffies.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 10:28:34AM +0200, Olaf Kirch wrote:
 On Tuesday 17 July 2007 09:55, Olaf Kirch wrote:
  What I find more problematic about this portion of code though
  is that once a net_device is over quota, net_rx_action will
  loop for up to one jiffy, even if there's just this one device on
  the poll_list.
 
 Duh, wrong. For every loop, it'll add dev-weight to dev-quota,
 so it'll be fine very soon.

After your patch only two things can be different here: dev-poll_list
and this flag. Since Ingo offered testing it should be easy to check
after moving the check of your flag to __netif_rx_complete, and
skipping only list_del on true.

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 10:57:48AM +0200, Ingo Molnar wrote:
 
 Olaf,
 
 i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
 the problem go away. So it's somehow also related to jiffies.

IMHO it could be related with __LINK_STATE_RX_SCHED beeing set
too long e.g. between two softirqs. So, it could be really
a matter of timing out - not necessarily permanent error state.

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 10:57, Ingo Molnar wrote:
 i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
 the problem go away. So it's somehow also related to jiffies.

There are several Tx Hang detected messages in the log, which looks
a lot as if net_rx_action never runs, or at least never calls
dev-poll on the e1000 nic.

Can you check whether/how often it bails out of net_rx_action taking
the softnet_break path?

My suspicion right now is that dev-quota goes way negative when
pushing out netconsole output. Normally, we bump the quota in
__net_rx_schedule. But the whole point of the patch is that
netpoll has no business removing the device from the poll_list,
so it stays there, and we don't end up calling __net_rx_schedule
as often as we would otherwise.

Can you try what happens if you change netif_rx_complete to
something like this:

if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state)) {
dev-quota = dev-weight;
return;
}

This is just a hack to make sure that we don't go to insanely
negative quotas while sending packets through netpoll.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 Can you try what happens if you change netif_rx_complete to something 
 like this:
 
   if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state)) {
   dev-quota = dev-weight;
   return;
   }
 
 This is just a hack to make sure that we don't go to insanely negative 
 quotas while sending packets through netpoll.

i've done the patch below, but it did not change the timeouts nor did it 
solve the 'no network' problem. netconsole output hung earlier as well.

i can try other things too. (it would be best if you sent me 
test/debug-patches, instead of letting me hack in the changes - that's 
the surest way of ensuring that i test exactly what you intended.)

Ingo

-
Index: linux/include/linux/netdevice.h
===
--- linux.orig/include/linux/netdevice.h
+++ linux/include/linux/netdevice.h
@@ -1007,6 +1007,10 @@ static inline int netif_rx_reschedule(st
  */
 static inline void __netif_rx_complete(struct net_device *dev)
 {
+   if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, dev-state)) {
+   dev-quota = dev-weight;
+   return;
+   }
BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, dev-state));
list_del(dev-poll_list);
smp_mb__before_clear_bit();
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Linus Torvalds


On Tue, 17 Jul 2007, Ingo Molnar wrote:
 
 i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
 the problem go away. So it's somehow also related to jiffies.

No, I suspect it's just related to timing: you need to hit that window 
when the LIST_FROZEN bit is set, and since it was so reliable for you 
before (and others didn't see it), your timing probably happened to hit it 
exactly. And changing HZ probably just changed some timeout or softirq 
getting called, and you didn't hit the bug any more.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
Hi Ingo,

On Tuesday 17 July 2007 18:57, Ingo Molnar wrote:
 i've done the patch below, but it did not change the timeouts nor did it 
 solve the 'no network' problem. netconsole output hung earlier as well.
Hm, pity.

To rule out any e1000 problem, can you try the the following please,
both with HZ=250 and HZ=1000?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
---
 drivers/net/e1000/e1000_main.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: build-2.6/drivers/net/e1000/e1000_main.c
===
--- build-2.6.orig/drivers/net/e1000/e1000_main.c
+++ build-2.6/drivers/net/e1000/e1000_main.c
@@ -3871,10 +3871,17 @@ e1000_intr(int irq, void *data)
adapter-total_rx_bytes = 0;
adapter-total_rx_packets = 0;
__netif_rx_schedule(netdev);
-   } else
+   } else {
/* this really should not happen! if it does it is basically a
 * bug, but not a hard error, so enable ints and continue */
+   static unsigned int been_here = 0;
+
+   been_here++;
+   if (net_ratelimit())
+   printk(KERN_NOTICE e1000_intr and rx_sched set (%u); 
state=0x%lx\n,
+   been_here, netdev-state);
e1000_irq_enable(adapter);
+   }
 #else
/* Writing IMC and IMS is needed for 82547.
 * Due to Hub Link bus being occupied, an interrupt
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

* David Miller [EMAIL PROTECTED] wrote:

 From: Ingo Molnar [EMAIL PROTECTED]
 Date: Tue, 17 Jul 2007 00:37:18 +0200
 
  I think if you leaned back and thought it through, and if you 
  applied this scenario to a bad scheduler commit from me that broke 
  your box, you'd readily agree with me =B-) (which scenario is purely 
  hypothetical, my scheduler commits are all 100% perfect of course 
  ;-)
 
 Actually I'd probably send you a patch for any bug I found that 
 triggered on sparc64, since that's faster than asking you to fix a bug 
 that you are unlikely to be able to trigger on your own systems.
 
 But that's just how I operate.
 
 Ask Thomas Gleixner.  Every single hrtimers/nohz bug I've found on 
 sparc64, I sent him either a full fix or a full analysis of the bug 
 and a description of exactly what is going on and what needs to be 
 changed so he can compose the bug fix patch with minimal effort.

yeah, i too usually try to fix bugs straight away - just check:

   git-log net drivers/net | grep Author: Ingo Molnar

but in this current case i was freshly back from a few days off, had 
quite a backlog after a rather complex set of merges and i just didnt 
have the time.

And i really applaud you for the clockevents/dynticks contributions (i 
loved your dynticks patches and the clockevents framework fixes that 
resulted out of it), but i really, really think this case is apples to 
oranges. I had some urgent hacking to do in some completely different 
area (not networking) and i ran across that networking regression and 
spent more than an hour on bisecting it.

The bad commit looked simple so i thought the person who is doing it 
(Olaf) would immediately see the bug (i certainly didnt see it). It's 
also not that easy to debug that box, the only remote debugging 
interface to it is ... netconsole. If i were into networking changes 
right now i'd probably have tried to debug and fix it straight away.

So instead i opted to bisect it, to give you the exact bad commit, give 
you full logs of the incident and an offer to run arbitrary patches 
immediately.

There's also little loss from the revert AFAICS: the commit went into 
your tree on July 11th and went upstream July 12th and i found it on 
July 16th. So (unless the commit dates are deceiving me) it does not 
seem to be an over-tested patch with lots of QA value (yet), and it's 
not some critical commit that another 100 commits grew to depend on. 
It's nevertheless a fix we want to have, and i'm willing to test 
anything.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 Hi Ingo,
 
 On Tuesday 17 July 2007 18:57, Ingo Molnar wrote:
  i've done the patch below, but it did not change the timeouts nor did it 
  solve the 'no network' problem. netconsole output hung earlier as well.
 Hm, pity.
 
 To rule out any e1000 problem, can you try the the following please, 
 both with HZ=250 and HZ=1000?

done, find the logs here:

 http://redhat.com/~mingo/misc/100hz.log
 http://redhat.com/~mingo/misc/1000hz.log

(one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
like HZ=250.)

no 'rx_sched set' messages in either case. Network still hung for 
HZ=100, and is working for HZ=1000.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 20:18, Ingo Molnar wrote:
 (one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
 like HZ=250.)
 
 no 'rx_sched set' messages in either case. Network still hung for 
 HZ=100, and is working for HZ=1000.

Is this from dmesg or the netconsole output? I don't see any Tx Unit
Hang messages from e1000 or netdev watchdog messages present in your
earlier dmesg logs. So maybe these messages are there, but never
get logged?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar

* Olaf Kirch [EMAIL PROTECTED] wrote:

 On Tuesday 17 July 2007 20:18, Ingo Molnar wrote:
  (one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
  like HZ=250.)
  
  no 'rx_sched set' messages in either case. Network still hung for 
  HZ=100, and is working for HZ=1000.
 
 Is this from dmesg or the netconsole output? I don't see any Tx Unit 
 Hang messages from e1000 or netdev watchdog messages present in your 
 earlier dmesg logs. So maybe these messages are there, but never get 
 logged?

i logged these not via netconsole but via logging on over the console 
and using dmesg, so it should include everything. in the 100hz case the 
following seems to show the anomaly:

  NETDEV WATCHDOG: eth0: transmit timed out

indeed it's not followed by the e1000 messages. (perhaps i didnt wait 
long enough to reboot the box - i zapped it after a minute of uptime)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Jarek Poplawski
On 16-07-2007 11:12, Ingo Molnar wrote:
> current -git broke my main testbox. No TCP/IP networking to/from the box 
> and e1000 would time out in xmit:
> 
>  NETDEV WATCHDOG: eth0: transmit timed out
>  e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
...

Olaf, I think this error can trigger in this place:

> static void net_rx_action(struct softirq_action *h)
> {
> struct softnet_data *queue = &__get_cpu_var(softnet_data);
> unsigned long start_time = jiffies;
> int budget = netdev_budget;
> void *have;
> 
> local_irq_disable();
> 
> while (!list_empty(>poll_list)) {
> struct net_device *dev;
> 
> if (budget <= 0 || jiffies - start_time > 1)
> goto softnet_break;
> 
> local_irq_enable();
> 
> dev = list_entry(queue->poll_list.next,
>  struct net_device, poll_list);
> have = netpoll_poll_lock(dev);
> 
> if (dev->quota <= 0 || dev->poll(dev, )) {

If after poll_napi dev->quota <= 0 dev->poll is not run and
__LINK_STATE_RX_SCHED bit (plus dev->poll_list) stays uncleared.

Regards,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Linus Torvalds


On Mon, 16 Jul 2007, Matt Mackall wrote:
> 
> Unfortunately the particular patch from Olaf is presumably covering up
> another bug that other people (including Olaf) had hit. So reverting
> it is going to introduce a different regression.

It's not a regression, it's an old problem.

And the rule is: machines that _used_ to work are more important. That 
rule got hewn into stone when the ACPI layer changes constantly kept 
breaking things that used to work, while "fixing" other things.

So we don't fix bugs by introducing new problems.  That way lies madness, 
and nobody ever knows if you actually make any real progress at all. Is it 
two steps forwards, one step back, or one step forward and two steps back? 
Different people will give different answers.

That's why regressions are _so_ much more important than new bugfixes. 
Because it's much more important to make slow but _steady_ progress, and 
have people know things improve (or at least not "deprove"). We don't want 
any kind of "brownian motion development".

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   >