Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-14 Thread Jorge Nerin

Frank de Lange wrote:
> 
> On Fri, Jan 12, 2001 at 09:51:36PM +0100, Ingo Molnar wrote:
> > great. Back when i had the same problem, flood pinging another host (on
> > the local network) was the quickest way to reproduce the hang:
> >
> >   ping -f -s 10 otherhost
> >
> > this produced an IOAPIC-hang within seconds.
> 
> Apart from killing streaming audio and interactive network use, nothing hangs.
> As soon as the ping flood is stopped, audio streams on and ssh sessions are
> useable again. So, it seems to fix it...
> 
> Frank

I have a 3c503 and an ne2k-pci; both use the 8390. I can hang the
ne2k-pci easily with ping -f: the bigger the packet size, the earlier
the hang. But I cannot hang the 3c503 this way.

Now with 2.4.0 the ne2k-pci behaves like this: ping -f works for some
time, then stops for a LONG time (several minutes), and then it appears
to work again, but at a much lower speed, and if you test it with a
normal ping (ping host) you don't get replies.

The packets really do go out on the wire and replies even come back,
but I don't receive them.

Previous 2.4.0-testX versions caused the ne2k-pci to hang and stay
hung until reboot.

System: Gigabyte 586dx motherboard, 2x200MMX, 96MB RAM

-- 
Jorge Nerin
<[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Alan Cox

> > The spin_lock_irqsave() is absolutely my preferred fix, and if I remember
> > correctly this is in fact how some early 2.1.x code fixed the ne2000
> > driver when the original irq scalability stuff happened (for some time
> > during development we did not have a working "disable_irq()" AT ALL
> > because the irq-disabling counters etc logic hadn't been done).
> 
> And that's the patch I meant... Manfred's
> spin_lock_irqsave/spin_unlock_irqrestore based one, not my
> (spin_lock_irq/spin_unlock_irq) based patch. That is also the one I'm running
> now.

The old code did it with #ifdef __SMP__ tests so it only screwed up SMP boxes,
which at the time was quite acceptable because real people didn't have them
and certainly at the price didn't put ne2000's in them 8)

The basic problem is that you cannot allow

-   set multicast list
-   open/close
-   irq
-   transmit

to occur except when serialized because of the indirection and register
gunge on the chip. The copies are slow and long enough that they prevent
28.8 modem sessions being usable.

I'd have to check the chip manual to be sure you can even disable the irqs
without corrupting an in-progress transfer.
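
[Editorial sketch] The serialization requirement above can be pictured with a small user-space model. This is not driver code: `nic_model`, `nic_enter`, `nic_irq` and `nic_set_multicast` are hypothetical names, and the single `lock_held` flag merely stands in for whatever lock the driver uses. The model assumes the chip exposes its registers through a page selection, so a path that changes the selected page in the middle of another path would make that other path touch the wrong registers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a NIC whose registers sit behind a page selector. */
struct nic_model {
    bool lock_held;      /* stands in for the driver's single lock */
    int  selected_page;  /* which register page the chip currently exposes */
};

/* Try to enter the chip; fails if another path is already inside. */
static bool nic_enter(struct nic_model *nic)
{
    if (nic->lock_held)
        return false;
    nic->lock_held = true;
    return true;
}

static void nic_exit(struct nic_model *nic)
{
    nic->lock_held = false;
}

/* Interrupt path: works on page 0. Returns false if it had to be deferred. */
static bool nic_irq(struct nic_model *nic)
{
    if (!nic_enter(nic))
        return false;
    nic->selected_page = 0;
    nic_exit(nic);
    return true;
}

/* Multicast setup: works on page 1. An interrupt may arrive midway.
 * Returns the page that is selected when the setup writes its registers. */
static int nic_set_multicast(struct nic_model *nic, bool irq_arrives_midway)
{
    bool entered = nic_enter(nic);
    int page_seen;

    assert(entered);
    nic->selected_page = 1;
    if (irq_arrives_midway)
        (void)nic_irq(nic);          /* refused: the lock is held */
    page_seen = nic->selected_page;  /* still 1 because paths are serialized */
    nic_exit(nic);
    return page_seen;
}
```

In this model the multicast path always sees page 1, even when an interrupt arrives midway, because the interrupt path is refused until the lock is released. Without the `nic_enter` check in `nic_irq`, the page would silently flip to 0 underneath the multicast code.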




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Frank de Lange

On Fri, Jan 12, 2001 at 04:15:37PM -0800, Linus Torvalds wrote:
> On Fri, 12 Jan 2001, Frank de Lange wrote:
> > 
> > Gentlemen, this (the patch to 8390.c) seems to fix the problem.
> 
> The problem with this patch is that anybody with a slow ISA ne2000 clone
> will basically have absolutely _horrible_ interrupt latency because we
> hold the irq lock over some quite expensive operations.
> 
> The spin_lock_irqsave() is absolutely my preferred fix, and if I remember
> correctly this is in fact how some early 2.1.x code fixed the ne2000
> driver when the original irq scalability stuff happened (for some time
> during development we did not have a working "disable_irq()" AT ALL
> because the irq-disabling counters etc logic hadn't been done).

And that's the patch I meant... Manfred's
spin_lock_irqsave/spin_unlock_irqrestore based one, not my
(spin_lock_irq/spin_unlock_irq) based patch. That is also the one I'm running
now.

Frank

-- 
  W  ___
 ## o o\/ Frank de Lange \
 }#   \|   /  \
  ##---# _/   \
      \  +31-320-252965/
   \[EMAIL PROTECTED]/
-
 [ "Omnis enim res, quae dando non deficit, dum habetur
et non datur, nondum habetur, quomodo habenda est."  ]



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Linus Torvalds



On Fri, 12 Jan 2001, Frank de Lange wrote:
> 
> Gentlemen, this (the patch to 8390.c) seems to fix the problem.

The problem with this patch is that anybody with a slow ISA ne2000 clone
will basically have absolutely _horrible_ interrupt latency because we
hold the irq lock over some quite expensive operations.

The spin_lock_irqsave() is absolutely my preferred fix, and if I remember
correctly this is in fact how some early 2.1.x code fixed the ne2000
driver when the original irq scalability stuff happened (for some time
during development we did not have a working "disable_irq()" AT ALL
because the irq-disabling counters etc logic hadn't been done).

The spinlock was changed to "disable_irq()" by a patch from Alan, if I
remember correctly, exactly because people couldn't access serial lines at
any kind of high speeds otherwise - even on "reasonable" hardware.

Alan may remember details better. The fact is that as a general design
principle we should _not_ be using "disable_irq/enable_irq" anyway. BUT
that there are some real-world concerns that make the "better" spinlock
handling have some problems too.
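
[Editorial sketch] The trade-off described above can be illustrated with a toy user-space model. It is not kernel code and calls no kernel API: `cpu_model`, `can_deliver` and the two `serial_ticks_blocked_*` functions are hypothetical names, and "ticks" are an arbitrary unit for the duration of the slow copy. The model only captures one point: a spinlock taken with interrupts saved and disabled blocks every interrupt on that CPU for the whole copy, while masking a single line leaves the other lines (for example a serial port) deliverable.

```c
#include <assert.h>
#include <stdbool.h>

enum { IRQ_NIC = 0, IRQ_SERIAL = 1, NR_MODEL_IRQS = 2 };

/* Toy model of one CPU plus an interrupt controller. */
struct cpu_model {
    bool local_irqs_on;               /* the CPU's own interrupt-enable flag */
    bool line_masked[NR_MODEL_IRQS];  /* per-line masking at the controller */
};

/* An interrupt is delivered only if the CPU accepts interrupts
 * and that particular line is not masked. */
static bool can_deliver(const struct cpu_model *cpu, int irq)
{
    return cpu->local_irqs_on && !cpu->line_masked[irq];
}

/* irqsave-style locking: all local interrupts are off during the copy. */
static int serial_ticks_blocked_irqsave(int copy_ticks)
{
    struct cpu_model cpu = { .local_irqs_on = true };
    bool saved = cpu.local_irqs_on;   /* "save" */
    int blocked = 0;

    assert(copy_ticks >= 0);
    cpu.local_irqs_on = false;
    for (int t = 0; t < copy_ticks; t++)
        if (!can_deliver(&cpu, IRQ_SERIAL))
            blocked++;
    cpu.local_irqs_on = saved;        /* "restore" */
    return blocked;
}

/* disable_irq-style: only the NIC line is masked during the copy. */
static int serial_ticks_blocked_disable_irq(int copy_ticks)
{
    struct cpu_model cpu = { .local_irqs_on = true };
    int blocked = 0;

    assert(copy_ticks >= 0);
    cpu.line_masked[IRQ_NIC] = true;
    for (int t = 0; t < copy_ticks; t++)
        if (!can_deliver(&cpu, IRQ_SERIAL))
            blocked++;
    cpu.line_masked[IRQ_NIC] = false;
    return blocked;
}
```

For a copy of 100 ticks the first function reports 100 blocked ticks for the serial line and the second reports 0. The model deliberately ignores the IO-APIC hang this thread is about, which is the reason the per-line approach is in question at all.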

So yes, I bet the spinlock fixes it. But..

Linus




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Alan Cox

> WITH. patched 8390.c, patched apic.c, stock io_apic.c. My very strong
> feeling is that this will be a stable combination, and that this is what
> we want as a final solution.

If you do that please #ifdef SMP all the changes. It's impossible to use a modem
and an ne2K together on a typical PC otherwise. The copy from the NE2K with
irq disabled is just _SO_ slow you drop bytes continually.
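
[Editorial sketch] A rough back-of-the-envelope model of why a long interrupts-off window drops serial bytes. Every number used with it here is an assumption chosen for illustration and is not taken from this thread: 10 bits on the wire per byte (8N1 framing), a 16-byte receive FIFO as on a 16550A-class UART, and interrupts-off windows of 1000 or 2000 microseconds. The function names are hypothetical.

```c
#include <assert.h>

/* Bytes that arrive on a serial line while interrupts stay off.
 * bits_per_byte is the on-the-wire size of one byte (10 for 8N1). */
static double bytes_during_window(double baud, double bits_per_byte,
                                  double window_us)
{
    assert(baud > 0.0 && bits_per_byte > 0.0 && window_us >= 0.0);
    return (baud / bits_per_byte) * (window_us / 1e6);
}

/* Returns 1 if more bytes arrive than the receive buffer can hold. */
static int would_overrun(double baud, double window_us, int fifo_depth)
{
    return bytes_during_window(baud, 10.0, window_us) > (double)fifo_depth;
}
```

With these assumed figures, a 115200 baud line delivers about 23 bytes in a 2000 microsecond window, which overruns a 16-byte FIFO, and about 11.5 bytes in 1000 microseconds, which fits a 16-byte FIFO but overruns a UART that buffers only a single byte.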

I did all the horrible magic in the ne2k driver for a reason. The other 
alternative is to provide a way to force the system back out of apic mode
so the ne2K driver can do a

goodbye_apic_crap()

type call




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Manfred Spraul

Frank de Lange wrote:
> 
> On Fri, Jan 12, 2001 at 09:54:31PM +0100, Manfred Spraul wrote:
> > I have found one combination that doesn't hang with the unpatched
> > 8390.c, but network throughput is down to 1/2. I hope that's due to the
> > debugging changes.
> 
> Hm, could it be that the fact that network throughput is halved causes the
> problem not to appear?

No. The problem is still there, but now it shows up as lots of lost
packets instead of a total hang.

Because of my modification to mask_irq, disable_irq_nosync and
enable_irq now act as if I pressed SysRQ+q every millisecond, so the
IO APIC is reset immediately whenever it gets stuck.

Btw, my initial assumption about EOI to masked interrupt must be wrong:
2.2 always first masks the irq, then it sends the EOI, and 2.2 doesn't
hang.

--
Manfred



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Frank de Lange

On Fri, Jan 12, 2001 at 09:54:31PM +0100, Manfred Spraul wrote:
> I have found one combination that doesn't hang with the unpatched
> 8390.c, but network throughput is down to 1/2. I hope that's due to the
> debugging changes.

Hm, could it be that the fact that network throughput is halved causes the
problem not to appear? Remember, it only appears under HEAVY network load. A
single nfs cp -rd  was not enough to hang my network, I needed to add
at least another cp -rd or some streaming audio or something else...

Cheers//Frank




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Frank de Lange

On Fri, Jan 12, 2001 at 09:51:36PM +0100, Ingo Molnar wrote:
> great. Back when i had the same problem, flood pinging another host (on
> the local network) was the quickest way to reproduce the hang:
> 
>   ping -f -s 10 otherhost
> 
> this produced an IOAPIC-hang within seconds.

Apart from killing streaming audio and interactive network use, nothing hangs.
As soon as the ping flood is stopped, audio streams on and ssh sessions are
useable again. So, it seems to fix it...

Frank



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Manfred Spraul

Ingo Molnar wrote:
> 
> 
> okay - i just wanted to hear a definitive word from you that this fixes
> your problem, because this is what we'll have to do as a final solution.
> (barring any other solution.)
> 
Ingo, is that possible?

The current fix is "disable_irq_nosync() and enable_irq() cause
deadlocks with level triggered ioapic irqs, do not use them" - I'm sure
ne2k-pci isn't the only driver that uses these functions.

I have found one combination that doesn't hang with the unpatched
8390.c, but network throughput is down to 1/2. I hope that's due to the
debugging changes.

I'll restart now from a fresh 2.4.0 tree:
Changes:

1) enable focus cpu.
2) apply the attached patch.

I'm not sure if it's a real fix or if it just hides the problem: my
sysrq patch has shown that clearing and setting the "level trigger" bit
in the io apic reanimates the IO APIC.
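
[Editorial sketch] The bit manipulation that the attached patch performs can be shown on a plain 32-bit value. This is a stand-alone illustration, not the patch itself: the helper names `rte_mask_level` and `rte_unmask_level` are hypothetical, and the value passed in stands for the low 32 bits of an IO-APIC redirection table entry, where bit 15 selects the trigger mode (1 = level) and bit 16 is the mask bit (1 = masked).

```c
#include <assert.h>
#include <stdint.h>

#define RTE_TRIGGER_LEVEL (UINT32_C(1) << 15)  /* 0x8000: 1 = level, 0 = edge */
#define RTE_MASKED        (UINT32_C(1) << 16)  /* 0x10000: 1 = masked */

/* Same order as mask_level_IO_APIC_irq in the patch:
 * mask the pin first, then switch it to edge-triggered. */
static uint32_t rte_mask_level(uint32_t rte)
{
    rte |= RTE_MASKED;
    rte &= ~RTE_TRIGGER_LEVEL;   /* clears bit 15 only */
    return rte;
}

/* Same order as unmask_level_IO_APIC_irq in the patch:
 * switch back to level-triggered first, then unmask. */
static uint32_t rte_unmask_level(uint32_t rte)
{
    rte |= RTE_TRIGGER_LEVEL;
    rte &= ~RTE_MASKED;
    return rte;
}

/* Usage: a level-triggered, unmasked entry with vector 0x31. */
static void rte_selftest(void)
{
    uint32_t rte = UINT32_C(0x8031);

    rte = rte_mask_level(rte);
    assert(rte == UINT32_C(0x10031));   /* masked, edge */
    rte = rte_unmask_level(rte);
    assert(rte == UINT32_C(0x8031));    /* unmasked, level again */
}
```

One difference is worth noting: the sketch clears bit 15 with `~RTE_TRIGGER_LEVEL`, whereas the posted patch uses `&= 0x7fff`. If that operation is applied to a full 32-bit register value, it also clears every bit above bit 15, including the mask bit.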

--
Manfred

--- build-2.4/arch/i386/kernel/io_apic.c.orig   Fri Jan 12 20:17:36 2001
+++ build-2.4/arch/i386/kernel/io_apic.c        Fri Jan 12 21:26:31 2001
@@ -134,6 +134,30 @@
    spin_unlock_irqrestore(&ioapic_lock, flags);
 }
 
+DO_ACTION( __trigger_level,0, |= 0x8000, io_apic_sync(entry->apic))/* mask = 1 */
+DO_ACTION( __trigger_edge,  0, &= 0x7fff, )/* mask = 0 */
+
+
+static void unmask_level_IO_APIC_irq (unsigned int irq)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioapic_lock, flags);
+   __trigger_level_IO_APIC_irq(irq);
+   __unmask_IO_APIC_irq(irq);
+   spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
+static void mask_level_IO_APIC_irq (unsigned int irq)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioapic_lock, flags);
+   __mask_IO_APIC_irq(irq);
+   __trigger_edge_IO_APIC_irq(irq);
+   spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
 static void unmask_IO_APIC_irq (unsigned int irq)
 {
    unsigned long flags;
@@ -143,6 +167,7 @@
    spin_unlock_irqrestore(&ioapic_lock, flags);
 }
 
+
 void clear_IO_APIC_pin(unsigned int apic, unsigned int pin)
 {
    struct IO_APIC_route_entry entry;
@@ -1181,14 +1206,14 @@
  */
 static unsigned int startup_level_ioapic_irq (unsigned int irq)
 {
-   unmask_IO_APIC_irq(irq);
+   unmask_level_IO_APIC_irq(irq);
 
    return 0; /* don't check for pending */
 }
 
-#define shutdown_level_ioapic_irq  mask_IO_APIC_irq
-#define enable_level_ioapic_irq    unmask_IO_APIC_irq
-#define disable_level_ioapic_irq   mask_IO_APIC_irq
+#define shutdown_level_ioapic_irq  mask_level_IO_APIC_irq
+#define enable_level_ioapic_irq    unmask_level_IO_APIC_irq
+#define disable_level_ioapic_irq   mask_level_IO_APIC_irq
 
 static void end_level_ioapic_irq (unsigned int i)
 {



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Ingo Molnar


On Fri, 12 Jan 2001, Frank de Lange wrote:

> PATCHED 8390.c (using irq_safe spinlocks instead of disable_irq)
> PATCHED apic.c (focus cpu ENABLED)
> STOCK io_apic.c
>
> No problems under heavy network load.
>
> Gentlemen, this (the patch to 8390.c) seems to fix the problem.

great. Back when i had the same problem, flood pinging another host (on
the local network) was the quickest way to reproduce the hang:

ping -f -s 10 otherhost

this produced an IOAPIC-hang within seconds.

Ingo




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread David Woodhouse

On Fri, 12 Jan 2001, Ingo Molnar wrote:

> okay - i just wanted to hear a definitive word from you that this fixes
> your problem, because this is what we'll have to do as a final solution.
> (barring any other solution.)

Patching 8390.c won't fix this for me. The only thing on IRQ19 when I saw
interrupts die was usb-uhci, and that doesn't appear to use disable_irq.

But then again, I've only ever seen this happen once. It's not repeatable.

-- 
dwmw2





Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Frank de Lange

On Fri, Jan 12, 2001 at 09:37:24PM +0100, Ingo Molnar wrote:
> okay - i just wanted to hear a definitive word from you that this fixes
> your problem, because this is what we'll have to do as a final solution.
> (barring any other solution.)

Now running with this config:

PATCHED 8390.c (using irq_safe spinlocks instead of disable_irq)
PATCHED apic.c (focus cpu ENABLED)
STOCK io_apic.c

No problems under heavy network load.

Gentlemen, this (the patch to 8390.c) seems to fix the problem.

Cheers//Frank




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Frank de Lange

On Fri, Jan 12, 2001 at 09:34:03PM +0100, Ingo Molnar wrote:
> ? this is x86-only code. There is no hot-pluggable CPU support for Linux
> AFAIK. (But in any case, the code is basically ready for hot-pluggable
> CPUs, just take a few precautions and change cpu_online_mask and a couple
> of other things.)

OK, maybe the Sun example was not the best to give for this code... But if
there are no hot-pluggable x86's around now (I think there are, but can not
recollect who made 'm...) and nobody is complaining, then it is fine with me...
I won't hot-unplug my BP6's CPU's anyway...

Cheers//Frank



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Ingo Molnar


On Fri, 12 Jan 2001, Frank de Lange wrote:

> It is. As I already mentioned in other messages, I already tested with
> JUST the patched 8390.c driver, no other patches. It was stable. I
> then patched apic.c AND io_apic.c, which did not introduce new
> instabilities. Unless you think that reverting back to a stock
> io_apic.c would cause instabilities (which would be weird, since I had
> no instabilities running only a patched 8390.c), I think the patch to
> 8390.c DOES remove the symptoms all by itself. No other patches seem
> necessary to get a stable box.

okay - i just wanted to hear a definitive word from you that this fixes
your problem, because this is what we'll have to do as a final solution.
(barring any other solution.)

Ingo




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Frank de Lange

On Fri, Jan 12, 2001 at 09:31:15PM +0100, Ingo Molnar wrote:
> 
> On Fri, 12 Jan 2001, Frank de Lange wrote:
> 
> > WITH or WITHOUT the changed 8390 driver? I can already give you the
> > results for running WITH the changed driver: it works. I have not yet
> > tried it WITHOUT the changed 8390 driver (so that would be stock 8390,
> > patched apic.c, stock io_apic.c). Please let me know which you want...
> 
> WITH. patched 8390.c, patched apic.c, stock io_apic.c. My very strong
> feeling is that this will be a stable combination, and that this is what
> we want as a final solution.

It is. As I already mentioned in other messages, I already tested with JUST the
patched 8390.c driver, no other patches. It was stable. I then patched apic.c
AND io_apic.c, which did not introduce new instabilities. Unless you think that
reverting back to a stock io_apic.c would cause instabilities (which would be
weird, since I had no instabilities running only a patched 8390.c), I think the
patch to 8390.c DOES remove the symptoms all by itself. No other patches seem
necessary to get a stable box.

But I'll patch the mess again just for kicks :-)

Cheers//Frank




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Ingo Molnar


On Fri, 12 Jan 2001, Frank de Lange wrote:

> BTW, does this (TARGET_CPUS cpu_online_mask) not wreak havoc with
> systems with hot-pluggable CPUs (many Suns, etc...)? Wouldn't it be
> better to make this a config option (like the optional PCI fixes for
> crappy BIOSs)?

? this is x86-only code. There is no hot-pluggable CPU support for Linux
AFAIK. (But in any case, the code is basically ready for hot-pluggable
CPUs, just take a few precautions and change cpu_online_mask and a couple
of other things.)

Ingo




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Ingo Molnar


On Fri, 12 Jan 2001, Frank de Lange wrote:

> WITH or WITHOUT the changed 8390 driver? I can already give you the
> results for running WITH the changed driver: it works. I have not yet
> tried it WITHOUT the changed 8390 driver (so that would be stock 8390,
> patched apic.c, stock io_apic.c). Please let me know which you want...

WITH. patched 8390.c, patched apic.c, stock io_apic.c. My very strong
feeling is that this will be a stable combination, and that this is what
we want as a final solution.

Ingo




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Frank de Lange

On Fri, Jan 12, 2001 at 09:19:53PM +0100, Ingo Molnar wrote:
> > In addition, I patched apic.c (focus cpu enabled)
> > In addition, I patched io_apic (TARGET_CPUS 0xff)
> 
> please try it with the focus CPU enabling change (we want to enable that
> feature, i only disabled it due to the stuck-ne2k bug), but with
> TARGET_CPUS set to cpu_online_mask. (the latter is needed for certain
> crappy BIOSes.)

WITH or WITHOUT the changed 8390 driver? I can already give you the results for
running WITH the changed driver: it works. I have not yet tried it WITHOUT the
changed 8390 driver (so that would be stock 8390, patched apic.c, stock
io_apic.c). Please let me know which you want...

Frank



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Ingo Molnar


On Fri, 12 Jan 2001, Frank de Lange wrote:

> In addition, I patched apic.c (focus cpu enabled)
> In addition, I patched io_apic (TARGET_CPUS 0xff)

please try it with the focus CPU enabling change (we want to enable that
feature, i only disabled it due to the stuck-ne2k bug), but with
TARGET_CPUS set to cpu_online_mask. (the latter is needed for certain
crappy BIOSes.)
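
[Editorial sketch] The difference between the two TARGET_CPUS settings can be made concrete with plain bitmask arithmetic. This is only an illustration: `online_mask` and `absent_targets` are hypothetical helpers, and the model assumes CPUs are numbered contiguously from 0, which the thread does not state. It says nothing about why particular BIOSes need the narrower mask.

```c
#include <assert.h>
#include <limits.h>

/* Bitmask with one bit per online CPU, assuming CPUs are numbered 0..n-1. */
static unsigned long online_mask(unsigned int nr_online)
{
    unsigned int width = (unsigned int)(sizeof(unsigned long) * CHAR_BIT);

    assert(nr_online >= 1);
    return nr_online >= width ? ~0UL : (1UL << nr_online) - 1UL;
}

/* Bits of `target` that name CPUs which are not online. */
static unsigned long absent_targets(unsigned long target, unsigned long online)
{
    return target & ~online;
}
```

On a two-CPU board such as the BP6, `online_mask(2)` is 0x3. A target of 0xff then contains six bits (0xfc) that refer to CPUs which are not present, while a target equal to the online mask contains none.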

i believe the ne2k driver change is the key.

> > I have a first idea: we send an EOI to an interrupt that is masked on
> > the IO apic, perhaps that causes the problems.
>
> Sounds plausible...

does not help. I've tried it (and many other combinations). I did not find
any direct workaround for this problem. (i tried very hard.)

Ingo




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Frank de Lange

On Fri, Jan 12, 2001 at 09:11:29PM +0100, Manfred Spraul wrote:
> Frank, please clarify:
> you still run without disable_irq_nosync() in 8390.c?

I am running with your patched version of 8390.c (so WITHOUT
disable_irq_nosync()).

In addition, I patched apic.c (focus cpu enabled)
In addition, I patched io_apic (TARGET_CPUS 0xff)

> I have a first idea: we send an EOI to an interrupt that is masked on
> the IO apic, perhaps that causes the problems.

Sounds plausible...

> I'm right now typing a patch.

I'll await yours instead of making my own patch this time... :-)

Cheers//Frank



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Manfred Spraul

Linus Torvalds wrote:
> 
> 
> I'd like to know _which_ of the two makes a difference (or does it only
> trigger with both of them enabled)? And even then I'm not sure that it is
> "the" solution - both changes to io-apic handling had some reason for
> them. Ingo, what was the focus-cpu thing?
> 

Frank, please clarify:
you still run without disable_irq_nosync() in 8390.c?

I have a first idea: we send an EOI to an interrupt that is masked on
the IO apic, perhaps that causes the problems.

I'm right now typing a patch.

--
Manfred



Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Ingo Molnar


On Fri, 12 Jan 2001, Linus Torvalds wrote:

> [...] Ingo, what was the focus-cpu thing?

well, some time ago i had an ne2k card in an SMP system as well, and found
this very problem. Disabling/enabling focus-cpu appeared to make a
difference, but later on i made experiments that show that in both cases
the hang happens. I spent a good deal of time trying to fix this problem,
but failed - so any fresh ideas are more than welcome.

Ingo




Re: QUESTION: Network hangs with BP6 and 2.4.x kernels, hardwarerelated?

2001-01-12 Thread Linus Torvalds



On Fri, 12 Jan 2001, Frank de Lange wrote:

> On Fri, Jan 12, 2001 at 08:33:15PM +0100, Manfred Spraul wrote:
> > Frank, the 2.4.0 contains 2 band aids that were added for ne2k smp:
> > 
> > * From Ingo: focus cpu disabled, in arch/i386/kernel/apic.c
> > * From myself: TARGET_CPU = cpu_online_mask, was 0xFF.
> > 
> > Could you disable both bandaids? I disabled them, no problems so far.
> 
> I disabled both (I guess you meant the 'define TARGET_CPUS cpu_online' in
> io_apic.c?), and reverted my own patch, added your patch... Now running with
> the usual heavy network load, no problems so far... Also made USB produce
> interrupts (shares irq with network), no problems...
> 
> Could this really be the solution?

I'd like to know _which_ of the two makes a difference (or does it only
trigger with both of them enabled)? And even then I'm not sure that it is
"the" solution - both changes to io-apic handling had some reason for
them. Ingo, what was the focus-cpu thing?

Linus
