Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-20 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > On Thursday 19 July 2007 21:56, Ingo Molnar wrote: > > nope - with this patch applied the box still has no network, symptoms > > are similar. (should i apply the WARN_ON() patch too?) > > Yes, that would be nice. If that doesn't help, you can also throw

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 21:56, Ingo Molnar wrote: > nope - with this patch applied the box still has no network, symptoms > are similar. (should i apply the WARN_ON() patch too?) Yes, that would be nice. If that doesn't help, you can also throw in the one below. Olaf -- Olaf Kirch | --- o --

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > Does the following help? > --- build-2.6.orig/drivers/net/netconsole.c > +++ build-2.6/drivers/net/netconsole.c > @@ -70,7 +70,7 @@ static void write_msg(struct console *co > int frag, left; > unsigned long flags; > > - if (!np.dev) > +

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
Does the following help? Olaf -- Olaf Kirch | --- o --- Nous sommes du soleil we love when we play [EMAIL PROTECTED] |/ | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax Test patch --- Index: build-2.6/drivers/net/netconsole.c ==

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > Here's a somewhat drastic modification that should not change any > timing, but just verifies whether my patch is to blame at all. Can you > give it a try? > @@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str >* But at least it does

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 18:07, Ingo Molnar wrote: > because i dont seem to be able to trigger Olaf's WARN_ON(), can you see > anything in the ethtool output that i sent in the previous mail(s)? If the WARN_ON doesn't trigger, I cannot see how my patch would affect your system. - IF we ent

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 19:36, Olaf Kirch wrote: > Can you confirm this by spraying the laptop with arp packets > or broadcast pings while it's booting? Sorry for the noise - didn't see your other message where you described just that. This sounds more like a hardware issue - Rx interrupt seems

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > On Thursday 19 July 2007 18:05, Ingo Molnar wrote: > > that network-intense test also produced periodic broadcast packets that > > got the e1000 out of its weird state before the tx timeout could hit. > > Now that i've stopped the test, the network is q

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 18:05, Ingo Molnar wrote: > that network-intense test also produced periodic broadcast packets that > got the e1000 out of its weird state before the tx timeout could hit. > Now that i've stopped the test, the network is quiescent again and the > e1000 hangs. Can you co

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > > i'll now check whether removing ignore_on_loglevel (no other > > changes) makes the hang go away. Maybe ignore_on_loglevel is buggy - > > or it produces an immediate printk (going out to the interface) > > during a particularly sensitive period of n

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Kok, Auke <[EMAIL PROTECTED]> wrote: > > I don't have a fix ready yet - I hope I'll have something later this > > afternoon. > > interesting, you seem to have found the cause all right. I can't confirm the > problem but I know that netpoll and NAPI have historically been an > issue. I look forward

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > i'll now check whether removing ignore_on_loglevel (no other changes) > makes the hang go away. Maybe ignore_on_loglevel is buggy - or it > produces an immediate printk (going out to the interface) during a > particularly sensitive period of network i

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > ah! Just found the reason: the bug apparently depends on the precise > kernel command-line contents. I accidentally dropped ignore_loglevel > (found this while comparing with the older logs i sent to you), adding > it back in produces hung networking

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Kok, Auke
Olaf Kirch wrote: On Thursday 19 July 2007 12:58, Ingo Molnar wrote: i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine hiccup symptoms, with no other bad symptoms such as lockups or crashes. Duh, I found it. The e1000 poll routine does this to leave polling mode. net

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > ugh. Something really weird happened with this e1000 problem. > > i crashed the laptop in a weird way and had to power-cycle it in an > unusual fashion. After that i wanted to try your latest BUG_ON() > theory but the network hang went away! > > For

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 17:07, Ingo Molnar wrote: > i crashed the laptop in a weird way and had to power-cycle it in an > unusual fashion. After that i wanted to try your latest BUG_ON() theory > but the network hang went away! Should I rejoice, or regret? :-) > maybe it's not the power-cyclin

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
ugh. Something really weird happened with this e1000 problem. i crashed the laptop in a weird way and had to power-cycle it in an unusual fashion. After that i wanted to try your latest BUG_ON() theory but the network hang went away! For 3 hours i tried to reproduce the hang (i went back to th

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 14:52, Olaf Kirch wrote: > On Thursday 19 July 2007 12:58, Ingo Molnar wrote: > > i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine > > hiccup symptoms, with no other bad symptoms such as lockups or crashes. > > Duh, I found it. The following patch shoul

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 12:58, Ingo Molnar wrote: > i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine > hiccup symptoms, with no other bad symptoms such as lockups or crashes. Duh, I found it. The e1000 poll routine does this to leave polling mode. netif_rx_complete(po

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > * Olaf Kirch <[EMAIL PROTECTED]> wrote: > > > On Thursday 19 July 2007 12:01, Ingo Molnar wrote: > > > Calling initcall 0xc0603f55: netpoll_init+0x0/0x39() > > > initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0. > > > initcall 0xc0603f55 ran f

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > On Thursday 19 July 2007 12:01, Ingo Molnar wrote: > > Calling initcall 0xc0603f55: netpoll_init+0x0/0x39() > > initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0. > > initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39() > > Calling initc

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 12:01, Ingo Molnar wrote: > Calling initcall 0xc0603f55: netpoll_init+0x0/0x39() > initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0. > initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39() > Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a() > NE

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > the e1000 in this laptop is historically pretty robust. The only > problem i ever had with it were some rx/tx hw-engine latency problems > [pings from the outside took up to 1 second to propagate] that were > quickly fixed by the e1000 driver guys. Ma

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > -You say that netconsole output continues to trickle after > the network gets wedged. This could be caused by the > e1000 watchdog, which triggers a NIC interrupt "to ensure > rx ring is cleaned". I assume that this triggers the >

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Olaf Kirch
On Thursday 19 July 2007 11:09, Ingo Molnar wrote: > the e1000 in this laptop is historically pretty robust. The only problem > i ever had with it were some rx/tx hw-engine latency problems [pings > from the outside took up to 1 second to propagate] that were quickly > fixed by the e1000 driver

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Ingo Molnar
i have your original patch applied to my working tree to be able to observe this bug's behavior, and here's another observation: the problem seems to go away if i turn on CONFIG_NO_HZ. So it looks timing related indeed ... but when the bug happens, it happens all the time, reboot after reboot.

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Jarek Poplawski
On Wed, Jul 18, 2007 at 01:48:20PM +0200, Jarek Poplawski wrote: ... > I'd be very glad if it could be verified and/or tested. Jarek, This patch is verified crap! Regards, Jarek P. PS: Olaf, You've written earlier that one of the main reasons for poll_napi is to work when the kernel "doesn't e

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > > also, i'm using netconsole via the command line (both the network > > driver and netconsole is built into the bzImage), maybe that makes a > > difference? > > Possibly - but so far there's nothing in the code that jumped at me. > > Can you try the f

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Olaf Kirch
On Wednesday 18 July 2007 14:48, Ingo Molnar wrote: > something i noticed: netconsole output seems to trickle through though, > but very, very slowly (a packet once every 4 seconds or so). TCP/IP is > not functional. > > also, i'm using netconsole via the command line (both the network driver >

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > On Tuesday 17 July 2007 20:56, Ingo Molnar wrote: > > i logged these not via netconsole but via logging on over the console > > and using dmesg, so it should include everything. in the 100hz case the > > following seems to show the anomaly: > > > > N

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > On Tuesday 17 July 2007 20:56, Ingo Molnar wrote: > > i logged these not via netconsole but via logging on over the console > > and using dmesg, so it should include everything. in the 100hz case the > > following seems to show the anomaly: > > > > N

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Olaf Kirch
On Tuesday 17 July 2007 20:56, Ingo Molnar wrote: > i logged these not via netconsole but via logging on over the console > and using dmesg, so it should include everything. in the 100hz case the > following seems to show the anomaly: > > NETDEV WATCHDOG: eth0: transmit timed out So, it seems

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-18 Thread Jarek Poplawski
Hi, Here is my proposal of a solution based on dev->state flag, but intended mainly to prevent poll_napi from disturbing while net_rx_action is running and polling the device. It doesn't look very nice or clean but I hope it could guard net_rx_action enough with some room for netpoll too. I'd be

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > On Tuesday 17 July 2007 20:18, Ingo Molnar wrote: > > (one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just > > like HZ=250.) > > > > no 'rx_sched set' messages in either case. Network still hung for > > HZ=100, and is working for HZ=1

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 20:18, Ingo Molnar wrote: > (one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just > like HZ=250.) > > no 'rx_sched set' messages in either case. Network still hung for > HZ=100, and is working for HZ=1000. Is this from dmesg or the netconsole output? I d

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > Hi Ingo, > > On Tuesday 17 July 2007 18:57, Ingo Molnar wrote: > > i've done the patch below, but it did not change the timeouts nor did it > > solve the 'no network' problem. netconsole output hung earlier as well. > Hm, pity. > > To rule out any e100

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar
* David Miller <[EMAIL PROTECTED]> wrote: > From: Ingo Molnar <[EMAIL PROTECTED]> > Date: Tue, 17 Jul 2007 00:37:18 +0200 > > > I think if you leaned back and thought it through, and if you > > applied this scenario to a bad scheduler commit from me that broke > > your box, you'd readily agree

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
Hi Ingo, On Tuesday 17 July 2007 18:57, Ingo Molnar wrote: > i've done the patch below, but it did not change the timeouts nor did it > solve the 'no network' problem. netconsole output hung earlier as well. Hm, pity. To rule out any e1000 problem, can you try the following please, both with

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Linus Torvalds
On Tue, 17 Jul 2007, Ingo Molnar wrote: > > i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes > the problem go away. So it's somehow also related to jiffies. No, I suspect it's just related to timing: you need to hit that window when the LIST_FROZEN bit is set, and since i

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > Can you try what happens if you change netif_rx_complete to something > like this: > > if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state)) { > dev->quota = dev->weight; > return; > } > > This is just a hack

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 10:57, Ingo Molnar wrote: > i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes > the problem go away. So it's somehow also related to jiffies. There are several "Tx Hang detected" messages in the log, which looks a lot as if net_rx_action never runs, or

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 10:57:48AM +0200, Ingo Molnar wrote: > > Olaf, > > i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes > the problem go away. So it's somehow also related to jiffies. IMHO it could be related to __LINK_STATE_RX_SCHED being set too long e.g. between t

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 10:28:34AM +0200, Olaf Kirch wrote: > On Tuesday 17 July 2007 09:55, Olaf Kirch wrote: > > What I find more problematic about this portion of code though > > is that once a net_device is over quota, net_rx_action will > > loop for up to one jiffy, even if there's just this o

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Ingo Molnar
Olaf, i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes the problem go away. So it's somehow also related to jiffies. Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 09:55, Olaf Kirch wrote: > What I find more problematic about this portion of code though > is that once a net_device is over quota, net_rx_action will > loop for up to one jiffy, even if there's just this one device on > the poll_list. Duh, wrong. For every loop, it'll add

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Monday 16 July 2007 23:40, Linus Torvalds wrote: > - The change seems to always set the LIST_FROZEN bit when calling >->poll(), and at least on e1000, the NAPI poll() routine ends up doing >that netif_rx_complete(), so we're *guaranteed* to always take the >early exit and not do

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 08:14, Jarek Poplawski wrote: > > If after poll_napi dev->quota <= 0 dev->poll is not run and > > __LINK_STATE_RX_SCHED bit (plus dev->poll_list) stays uncleared. > > Or, more precisely dev->poll_list will be cleared just after this, > and net_rx_action returns with __LINK_

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-17 Thread Olaf Kirch
On Tuesday 17 July 2007 00:08, David Miller wrote: > Sure, but I thought it would be nice to give Olaf a day or two to > figure out what's going on rather than have the knee-jerk reaction to > just revert. Oh, reverting is fine with me. I'll just resubmit the patch. Olaf -- Olaf Kirch | --- o

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Jarek Poplawski
On Tue, Jul 17, 2007 at 07:46:39AM +0200, Jarek Poplawski wrote: ... > > static void net_rx_action(struct softirq_action *h) > > { > > struct softnet_data *queue = &__get_cpu_var(softnet_data); > > unsigned long start_time = jiffies; > > int budget = netdev_budget; > >

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Jarek Poplawski
On 16-07-2007 11:12, Ingo Molnar wrote: > current -git broke my main testbox. No TCP/IP networking to/from the box > and e1000 would time out in xmit: > > NETDEV WATCHDOG: eth0: transmit timed out > e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang ... Olaf, I think this error can trigger

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Linus Torvalds
On Mon, 16 Jul 2007, Matt Mackall wrote: > > Unfortunately the particular patch from Olaf is presumably covering up > another bug that other people (including Olaf) had hit. So reverting > it is going to introduce a different regression. It's not a regression, it's an old problem. And the rule

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Matt Mackall
On Mon, Jul 16, 2007 at 03:29:15PM -0700, Linus Torvalds wrote: > > > On Mon, 16 Jul 2007, David Miller wrote: > > > > Ingo is the only person hitting and reporting this and last time I > > checked he is competent enough to revert the thing locally in his own > > trees, right? :-) > > Umm. And

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]> Date: Tue, 17 Jul 2007 00:37:18 +0200 > I think if you leaned back and thought it through, and if you applied > this scenario to a bad scheduler commit from me that broke your box, > you'd readily agree with me =B-) (which scenario is purely hypothetical, >

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread David Miller
From: Linus Torvalds <[EMAIL PROTECTED]> Date: Mon, 16 Jul 2007 15:29:15 -0700 (PDT) > If we knew something was wrong before the -rc1 release, all the better: we > can avoid having that bug in -rc1, and the people who test it will tell us > about the problems we did *not* know about. > > In con

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Ingo Molnar
* David Miller <[EMAIL PROTECTED]> wrote: > From: Ingo Molnar <[EMAIL PROTECTED]> > Date: Mon, 16 Jul 2007 23:51:17 +0200 > > > i also offered to quickly try any test-version of the fixed patch, so > > there's a real and deterministic path towards fixing the patch. The > > regression is obviou

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Linus Torvalds
On Mon, 16 Jul 2007, David Miller wrote: > > Ingo is the only person hitting and reporting this and last time I > checked he is competent enough to revert the thing locally in his own > trees, right? :-) Umm. And your suggestion is what? Wait until -rc1, when non-developers (the kinds of peopl

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]> Date: Mon, 16 Jul 2007 23:09:24 +0200 > so ... i can promise to test whatever new version of the patch Olaf > sends me (the problem is easy to reproduce and easy to test, so i can > check it all in a heartbeat), so to get things back on track, and to > valu

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread David Miller
From: Linus Torvalds <[EMAIL PROTECTED]> Date: Mon, 16 Jul 2007 14:40:38 -0700 (PDT) > If we don't know what caused a problem in the first place, or if the fix > is known to be required for something else and reverting it would cause > *another* regression, it would be another issue. But as it i

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]> Date: Mon, 16 Jul 2007 23:51:17 +0200 > i also offered to quickly try any test-version of the fixed patch, so > there's a real and deterministic path towards fixing the patch. The > regression is obvious and triggers all the time. For you.

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote: >With MSI, edge-triggered interrupts are making a comeback in a big >way, and yeah, e1000 is one of the drivers that do MSI. Ingo might >want to confirm whether it's actually enabled for him, and whether >turning it off might hide the

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Linus Torvalds
On Mon, 16 Jul 2007, David Miller wrote: > > Well, let's figure out why before we revert because it > is attempting to fix a legitimate bug. I'm reverting it. I don't think there is any excuse for not reverting something that provably breaks somebody's machine. I don't want this to be on the

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Ingo Molnar
* David Miller <[EMAIL PROTECTED]> wrote: > From: Ingo Molnar <[EMAIL PROTECTED]> > Date: Mon, 16 Jul 2007 11:12:36 +0200 > > > Applying the revert patch below makes it work again. > > Well, let's figure out why before we revert because it is attempting > to fix a legitimate bug. yeah, no dou

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Ingo Molnar
* Olaf Kirch <[EMAIL PROTECTED]> wrote: > On Monday 16 July 2007 13:26, David Miller wrote: > > Well, let's figure out why before we revert because it > > is attempting to fix a legitimate bug. > > > > Olaf, any ideas? > > It seems as if the card is stuck in NAPI mode without being serviced >

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Olaf Kirch
On Monday 16 July 2007 13:26, David Miller wrote: > Well, let's figure out why before we revert because it > is attempting to fix a legitimate bug. > > Olaf, any ideas? It seems as if the card is stuck in NAPI mode without being serviced by net_rx_action. Ingo, is this a UP or SMP machine? Are y

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]> Date: Mon, 16 Jul 2007 11:12:36 +0200 > Applying the revert patch below makes it work again. Well, let's figure out why before we revert because it is attempting to fix a legitimate bug. Olaf, any ideas?

Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Olaf Kirch
On Monday 16 July 2007 11:12, Ingo Molnar wrote: > After a bisection session the bad commit turned out to be: > > 29578624e354f56143d92510fff33a8b2aaa2c03 is first bad commit > commit 29578624e354f56143d92510fff33a8b2aaa2c03 > Author: Olaf Kirch <[EMAIL PROTECTED]> > Date: Wed Jul 11 19:32:0

[patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-16 Thread Ingo Molnar
Applying the revert patch below makes it work again. Ingo ----------> Subject: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll From: Ingo Molnar <[EMAIL PROTECTED]> commit 29578624 causes netconsole failures: NETDEV WATCHDOG: eth0: transmit timed out e1000: eth0: e1000_c