On Fri, Aug 10, 2007 at 12:43:43PM +0200, Marcin Ślusarz wrote:
> 2007/8/10, Jarek Poplawski <[EMAIL PROTECTED]>:
> > (..)
> > I think, there is this one possible for your testing yet?:
> > Subject: [patch] genirq: temporary fix for level-triggered IRQ resend
> > Date: Wed, 8 Aug 2007 13:00:37
2007/8/10, Jarek Poplawski <[EMAIL PROTECTED]>:
> (..)
> I think, there is this one possible for your testing yet?:
> Subject: [patch] genirq: temporary fix for level-triggered IRQ resend
> Date: Wed, 8 Aug 2007 13:00:37 +0200
I think I already tested this patch, but this thread is sooo big and I
* Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> All correct! There was also checked a possibility it can be not hw
> itself, but wrong way of handling after hw (acking too late). This was
> false idea (or bad implementation), so it looks like hw vs lapic
> problem.
i think the problem is that
On Fri, Aug 10, 2007 at 11:08:33AM +0200, Ingo Molnar wrote:
>
> * Jarek Poplawski <[EMAIL PROTECTED]> wrote:
>
> > On 10-08-2007 10:05, Thomas Gleixner wrote:
> > ...
> > > But suppressing the resend is not fixing the driver problem. The
> > > problem can show up with spurious interrupts and
* Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> On 10-08-2007 10:05, Thomas Gleixner wrote:
> ...
> > But suppressing the resend is not fixing the driver problem. The
> > problem can show up with spurious interrupts and with interrupts on
> > a shared PCI interrupt line at any time. It just
On Fri, Aug 10, 2007 at 10:48:41AM +0200, Ingo Molnar wrote:
>
> * Jarek Poplawski <[EMAIL PROTECTED]> wrote:
>
> > On Fri, Aug 10, 2007 at 10:15:53AM +0200, Jean-Baptiste Vignaud wrote:
> > ...
> > > I was still testing on -rc2:
> > > Subject: [patch] genirq: temporary fix for level-triggered
* Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> On Fri, Aug 10, 2007 at 10:15:53AM +0200, Jean-Baptiste Vignaud wrote:
> ...
> > I was still testing on -rc2:
> > Subject: [patch] genirq: temporary fix for level-triggered IRQ resend
> > Date: Wed, 8 Aug 2007 13:00:37 +0200
> >
> > For me after
> For me it's enough too but Thomas seems to doubt.
>
> You've written earlier that you've 2.6.23-rc1 with HARDIRQS_SW_RESEND
> prepared too. So, if this is not a great problem maybe you could try
> this first. Tomorrow Thomas may send something, so this 100HZ could
> wait yet, I hope?
Ok, i'll
On Fri, Aug 10, 2007 at 10:15:53AM +0200, Jean-Baptiste Vignaud wrote:
...
> I was still testing on -rc2:
> Subject: [patch] genirq: temporary fix for level-triggered IRQ resend
> Date: Wed, 8 Aug 2007 13:00:37 +0200
>
> For me after 1day 20hours, the network is still up, with more than 1To
> of
> So, we still have to wait for the exact explanation...
>
> Thanks very much Marcin!
>
> I think, there is this one possible for your testing yet?:
> Subject: [patch] genirq: temporary fix for level-triggered IRQ resend
> Date: Wed, 8 Aug 2007 13:00:37 +0200
>
> If it's not a great problem it
On Fri, Aug 10, 2007 at 08:33:27AM +0200, Marcin Ślusarz wrote:
> 2007/8/9, Jarek Poplawski <[EMAIL PROTECTED]>:
...
> > diff -Nurp 2.6.23-rc1-/kernel/irq/chip.c 2.6.23-rc1/kernel/irq/chip.c
> > --- 2.6.23-rc1-/kernel/irq/chip.c 2007-07-09 01:32:17.0 +0200
> > +++
On Thu, Aug 09, 2007 at 06:04:34PM +0200, Andi Kleen wrote:
> Jarek Poplawski <[EMAIL PROTECTED]> writes:
>
> > It seems, we can start to think about some preferred solutions,
> > already. Here are some of my preliminary conclusions and suggestions.
> >
> > The problem of timeouts with some
It seems, we can start to think about some preferred solutions,
already. Here are some of my preliminary conclusions and suggestions.
The problem of timeouts with some 'older' network cards seems to hit
mainly x86_64 arch, and after diagnosing and testing (still being
done) it's caused by
On Wed, Aug 08, 2007 at 01:42:43PM +0200, Jarek Poplawski wrote:
> Read below please:
>
> On Wed, Aug 08, 2007 at 01:09:36PM +0200, Marcin Ślusarz wrote:
> > 2007/8/7, Jarek Poplawski <[EMAIL PROTECTED]>:
> > > So, the let's try this idea yet: modified Ingo's "x86: activate
> > >
On Wed, Aug 08, 2007 at 10:59:22AM +0200, Jean-Baptiste Vignaud wrote:
...
> > If you would like to read something more about testing (then of
> > course my suggestions could occur invalid - I'm a very bad tester
> > myself...) you can try this:
> > http://www.stardust.webpages.pl/files/handbook/
On Wed, Aug 08, 2007 at 01:42:43PM +0200, Jarek Poplawski wrote:
...
> So, it looks like x86_64 io_apic's IPI code was unused too long...
To be fair it's x86_64 lapic's IPI code.
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL
Read below please:
On Wed, Aug 08, 2007 at 01:09:36PM +0200, Marcin Ślusarz wrote:
> 2007/8/7, Jarek Poplawski <[EMAIL PROTECTED]>:
> > So, the let's try this idea yet: modified Ingo's "x86: activate
> > HARDIRQS_SW_RESEND" patch.
> > (Don't forget about make oldconfig before make.)
> > For
2007/8/7, Jarek Poplawski <[EMAIL PROTECTED]>:
> And here is one more patch to test the same idea (chip->retrigger()).
> Let's try i386 way! (I hope I will not be arrested for this...)
> (Should be tested without any previous patches.)
>
> Jarek P.
>
> PS: as above
>
> ---
>
> diff -Nurp
2007/8/7, Jarek Poplawski <[EMAIL PROTECTED]>:
> So, the let's try this idea yet: modified Ingo's "x86: activate
> HARDIRQS_SW_RESEND" patch.
> (Don't forget about make oldconfig before make.)
> For testing only.
>
> Cheers,
> Jarek P.
>
> PS: alas there was not even time for "compile checking"...
On Wed, Aug 08, 2007 at 10:59:22AM +0200, Jean-Baptiste Vignaud wrote:
> > Jean-Baptiste: I'm not sure how much of this testing you can afford?
> > If you can spare some time for this and your box isn't for
> > 'production' it could be very precious to diagnose such reproducible
> > bug.
>
> Well
> Jean-Baptiste: I'm not sure how much of this testing you can afford?
> If you can spare some time for this and your box isn't for
> 'production' it could be very precious to diagnose such reproducible
> bug.
Well i can continue testing patches for sure.
> Then, I'd have a few suggestions (you
On Wed, Aug 08, 2007 at 09:21:14AM +0200, Jarek Poplawski wrote:
> On Tue, Aug 07, 2007 at 07:16:33PM +0200, Jean-Baptiste Vignaud wrote:
...
> Marcin has done this successfully using the most professional
> way: git bisect (which btw. I did learn yet), but, IMHO, it could be
...
Let me say
On Tue, Aug 07, 2007 at 07:16:33PM +0200, Jean-Baptiste Vignaud wrote:
...
> So this afternoon i compiled 2.6.23-rc2 with same options as 2.6.23-rc1
> and edited grub.conf to add nosmp but after reboot the box did not
> respond. Back home, i saw that the kernel failed because it was unable
> to
> On Tue, Aug 07, 2007 at 11:21:07AM +0200, Jean-Baptiste Vignaud wrote:
> >
> > > > * interrupts (i use irqbalance, but problem was the same without)
> > >
> > > I wonder if you tried without SMP too?
> >
> > No i did not. Do you think that this can be a problem ?
> > To test with no SMP, do i
On Tue, Aug 07, 2007 at 02:13:39PM +0200, Jarek Poplawski wrote:
> On Tue, Aug 07, 2007 at 11:52:46AM +0200, Jarek Poplawski wrote:
> > On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote:
...
> > > No, i don't need a break. I'll have more time in next weeks.
> >
> > Great! So, I'll
On Tue, Aug 07, 2007 at 11:52:46AM +0200, Jarek Poplawski wrote:
> On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote:
> > 2007/8/7, Jarek Poplawski <[EMAIL PROTECTED]>:
> > > On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote:
> > > > Network card still locks up (tested on
On Mon, Aug 06, 2007 at 01:43:48PM -0400, Chuck Ebbert wrote:
> On 08/06/2007 03:03 AM, Ingo Molnar wrote:
> >
> > But, since level types don't need these retriggers much, I think
> > this "don't mask interrupts by default" idea should be rethought:
> > is there enough gain to risk such hard to
On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote:
> 2007/8/7, Jarek Poplawski <[EMAIL PROTECTED]>:
> > On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote:
> > > Network card still locks up (tested on 2.6.22.1). I had to upload more
> > > data than usual (~350 MB vs ~1-100
On Tue, Aug 07, 2007 at 11:21:07AM +0200, Jean-Baptiste Vignaud wrote:
>
> > > * interrupts (i use irqbalance, but problem was the same without)
> >
> > I wonder if you tried without SMP too?
>
> No i did not. Do you think that this can be a problem ?
> To test with no SMP, do i need to
> > * interrupts (i use irqbalance, but problem was the same without)
>
> I wonder if you tried without SMP too?
No i did not. Do you think that this can be a problem ?
To test with no SMP, do i need to recompile kernel or is there a kernel
parameter ?
> BTW, Jean-Baptiste and Chuck -
On Tue, Aug 07, 2007 at 10:10:34AM +0200, Jean-Baptiste Vignaud wrote:
>
> > BTW: Jean-Babtiste, could you send or point to you current configs?
Oops! I'm very sorry for misspelling!
> > I mean at least proc/interrupts, but with dmesg and .config it would
> > be even better. (I assume this last
On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote:
> 2007/8/6, Ingo Molnar <[EMAIL PROTECTED]>:
> > (..)
> > please try Jarek's second patch too - there was a missing unmask.
> >
> > Ingo
> >
> > -->
> > Subject: genirq: fix simple and fasteoi irq handlers
> >
> BTW: Jean-Babtiste, could you send or point to you current configs?
> I mean at least proc/interrupts, but with dmesg and .config it would
> be even better. (I assume this last report was about the revert patch
> mentioned by Chuck, not the one below your message?)
Sure.
Last reports are
2007/8/6, Ingo Molnar <[EMAIL PROTECTED]>:
> (..)
> please try Jarek's second patch too - there was a missing unmask.
>
> Ingo
>
> -->
> Subject: genirq: fix simple and fasteoi irq handlers
> From: Jarek Poplawski <[EMAIL PROTECTED]>
>
> After the "genirq: do not mask
On Mon, Aug 06, 2007 at 05:19:03PM -0400, Chuck Ebbert wrote:
> On 08/06/2007 04:42 PM, Jean-Baptiste Vignaud wrote:
> > Mmm, bad news, after 4 hours of intensive network stressing, one of the 2
> > 3com card failed with the latest fedora kernel.
> >
> > Aug 6 22:31:09 loki kernel: NETDEV
Jean-Baptiste Vignaud wrote:
> Mmm, bad news, after 4 hours of intensive network stressing, one of the 2
> 3com card failed with the latest fedora kernel.
>
> Aug 6 22:31:09 loki kernel: NETDEV WATCHDOG: eth2: transmit timed out
> Aug 6 22:31:09 loki kernel: eth2: transmit timed out, tx_status
On 08/06/2007 04:42 PM, Jean-Baptiste Vignaud wrote:
> Mmm, bad news, after 4 hours of intensive network stressing, one of the 2
> 3com card failed with the latest fedora kernel.
>
> Aug 6 22:31:09 loki kernel: NETDEV WATCHDOG: eth2: transmit timed out
> Aug 6 22:31:09 loki kernel: eth2:
Mmm, bad news, after 4 hours of intensive network stressing, one of the 2 3com
card failed with the latest fedora kernel.
Aug 6 22:31:09 loki kernel: NETDEV WATCHDOG: eth2: transmit timed out
Aug 6 22:31:09 loki kernel: eth2: transmit timed out, tx_status 00 status e601.
Aug 6 22:31:09 loki
> * Chuck Ebbert <[EMAIL PROTECTED]> wrote:
>
> > Before, they would print:
> >
> > eth0: transmit timed out, tx_status 00 status e601.
> > diagnostics: net 0ccc media 8880 dma 003a fifo
> > eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
> > Flags;
* Chuck Ebbert <[EMAIL PROTECTED]> wrote:
> Before, they would print:
>
> eth0: transmit timed out, tx_status 00 status e601.
> diagnostics: net 0ccc media 8880 dma 003a fifo
> eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
> Flags; bus-master 1, dirty
On 08/06/2007 03:03 AM, Ingo Molnar wrote:
>
> But, since level types don't need these retriggers much, I think
> this "don't mask interrupts by default" idea should be rethought:
> is there enough gain to risk such hard to diagnose errors?
>
>
I reverted those masking changes in Fedora
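The point quoted above, that level-triggered interrupts do not need a software retrigger, can be sketched as a small stand-alone decision function. This is only an illustration under assumptions: the SK_* flag names, their values, and the sk_irq_desc structure are made up for this example and are not the kernel's definitions. The idea it shows is that a level-triggered line stays asserted until the device is serviced, so the hardware raises it again by itself, while an edge that arrived during a disabled window is lost unless it is replayed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status bits; the names only echo kernel concepts. */
#define SK_IRQ_PENDING 0x1u  /* an interrupt arrived while the line was disabled */
#define SK_IRQ_LEVEL   0x2u  /* the line is level-triggered */
#define SK_IRQ_REPLAY  0x4u  /* a resend has already been queued */

/* Hypothetical, minimal stand-in for a per-IRQ descriptor. */
struct sk_irq_desc {
    unsigned int status;
};

/*
 * Decide whether a pending interrupt should be resent when the line is
 * re-enabled. Returns true and marks the descriptor as replayed only for
 * a pending, non-level interrupt that has not been replayed yet.
 */
static bool sk_check_resend(struct sk_irq_desc *desc)
{
    assert(desc != NULL);

    unsigned int status = desc->status;

    /* Level-triggered, not pending, or already replayed: nothing to do. */
    if ((status & (SK_IRQ_LEVEL | SK_IRQ_PENDING | SK_IRQ_REPLAY)) != SK_IRQ_PENDING)
        return false;

    desc->status = (status & ~SK_IRQ_PENDING) | SK_IRQ_REPLAY;
    return true;
}
```

With a descriptor whose status is SK_IRQ_PENDING alone the function queues a replay; with SK_IRQ_PENDING | SK_IRQ_LEVEL it leaves the descriptor untouched and relies on the hardware.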
* Marcin Ślusarz <[EMAIL PROTECTED]> wrote:
> 2007/7/31, Jarek Poplawski <[EMAIL PROTECTED]>:
> > Marcin,
> >
> > I see you're quite busy, but if after testing this next Ingo's patch
> > you are alive yet, maybe you could try one more "idea"? No patch this
> > time, but if you could try this
2007/7/31, Jarek Poplawski <[EMAIL PROTECTED]>:
> Marcin,
>
> I see you're quite busy, but if after testing this next Ingo's patch
> you are alive yet, maybe you could try one more "idea"? No patch this
> time, but if you could try this after adding boot option "noirqdebug"
> (I'd like to be sure
2007/8/1, Ingo Molnar <[EMAIL PROTECTED]>:
> ok, it wasnt supposed to be _that_ easy i guess :-) Can you please
> (re-)confirm that the workaround below indeed fixes the hung card
> problem? (after producing a single WARN_ON message into the syslog)
yes, with this patch everything works fine
end
* Marcin Ślusarz <[EMAIL PROTECTED]> wrote:
> > ei_outb_p(ENISR_ALL, e8390_base + EN0_IMR);
> > + /* force POST: */
> > + ei_inb_p(e8390_base + EN0_IMR);
> >
> > spin_unlock(&ei_local->page_lock);
> > enable_irq_lockdep_irqrestore(dev->irq, &flags);
> >
>
> Bad news.
2007/7/30, Ingo Molnar <[EMAIL PROTECTED]>:
> (..)
> does the patch below fix those timeouts? It tests the theory whether any
> POST latency could expose this problem.
>
> Ingo
>
> Index: linux/drivers/net/lib8390.c
> ===
>
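The POST-latency theory quoted above can be illustrated with a toy model. It is a sketch only, not the lib8390 code: the posted_dev structure and the dev_write_imr()/dev_read_imr() helpers are hypothetical names invented here. It models a write that is accepted by a bridge but has not yet reached the device, and a read-back of the same register that forces the buffered write to be delivered first, which is the "force POST" pattern the test patch adds after writing EN0_IMR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a device register behind a write-posting bridge. */
struct posted_dev {
    uint8_t imr;         /* value the device itself currently holds */
    uint8_t posted_val;  /* write still sitting in the bridge buffer */
    bool    posted;      /* true while that write is in flight */
};

/* Posted write: returns at once, the device has not seen the value yet. */
static void dev_write_imr(struct posted_dev *d, uint8_t val)
{
    assert(d != NULL);
    d->posted_val = val;
    d->posted = true;
}

/*
 * Read: the buffered write must be delivered before the read completes,
 * so after this call the device state is known to be up to date.
 */
static uint8_t dev_read_imr(struct posted_dev *d)
{
    assert(d != NULL);
    if (d->posted) {
        d->imr = d->posted_val;
        d->posted = false;
    }
    return d->imr;
}
```

In this model, code that unmasks interrupts and then immediately drops its lock could run ahead of the device; a read-back between the two steps closes that window.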
On Mon, Jul 30, 2007 at 09:29:38AM +0200, Marcin Ślusarz wrote:
...
> ps: I retested all patches posted in this thread on top of 2.6.22.1
> and behavior from 2.6.21.3 didn't change. My next tests will be on
> 2.6.22.x only.
Marcin,
I see you're quite busy, but if after testing this next Ingo's
Jarek Poplawski wrote:
Hi,
Very below is my patch proposal with a comment, which in my opinion
is precious enough to save it for future help in reading and
understanding the code.
I hope Alan will not blame me I've not asked for his permission before
sending, and he would ack this patch as it
> So the whole locking is to be able to keep irqs enabled for a long time,
> without risking entry of the same IRQ handler on this same CPU, correct?
As implemented - on any CPU.
We also need to know that the IRQ handler is not doing useful work on
another processor which is why we take the
* Marcin Ślusarz <[EMAIL PROTECTED]> wrote:
> > Subject: x86: activate HARDIRQS_SW_RESEND
> > From: Ingo Molnar <[EMAIL PROTECTED]>
> >
> > activate the software-triggered IRQ-resend logic.
> This patch didn't help (tested on 2.6.22.1) - ne2k_pci timed out.
ok. This makes it more likely that
* Alan Cox <[EMAIL PROTECTED]> wrote:
> Ok the logic behind the 8390 is very simple:
thanks for the explanation Alan! A few comments and a question:
> Things to know
> - IRQ delivery is asynchronous to the PCI bus
> - Blocking the local CPU IRQ via spin locks was too slow
> -
2007/7/26, Ingo Molnar <[EMAIL PROTECTED]>:
> (..)
> yeah - i meant to cover both arches but forgot about x86_64 - updated
> patch attached below.
>
> Ingo
>
> ->
> Subject: x86: activate HARDIRQS_SW_RESEND
> From: Ingo Molnar <[EMAIL PROTECTED]>
>
> activate the