Re: 2.4.3: still experiencing APIC-related hangs
On Fri, Mar 30, 2001 at 08:32:39AM -0800, [EMAIL PROTECTED] wrote:
> On Fri, Mar 30, 2001 at 02:32:24PM +0200, Frank de Lange wrote:
> > Subject says it all: 2.4.3 (unpatched) is still causing the dreaded
> > APIC-related hangs on SMP BX systems (Abit BP-6, maybe Gigabyte). I still need
> > to apply one of Maciej's patches to get rid of these hangs. The source comments
> > in arch/i386/kernel/apic.c ("If focus CPU is disabled then the hang goes away")
> > are incorrect, as the hang does not go away by simply disabling focus CPU. The
> > only way for me to get rid of the hangs is to apply patch-2.4.1-io_apic-46
> > (which does the LEVEL->EDGE->LEVEL triggered trick to 'free' the IO_APIC). I've
> > been running with this patch for quite some time now, and have not experienced
> > any problems with it. Maybe is it time to include it in the main kernel,
> > perhaps as a configurable option ("BROKEN_IO_APIC")?

Same for me.

> > Maciej, did you submit the patch to Linus? It really seems to solve the
> > (occurrence of the) problems with these boards...
>
> Where is this patch found? I am not seeing it so far on kernel.org.

Attached, as I assume more people are interested in it ...

Regards,
--
Kurt Garloff  <[EMAIL PROTECTED]>          Eindhoven, NL
GPG key: See mail header, key servers      Linux kernel development
SuSE GmbH, Nuernberg, FRG                  SCSI, Security

patch-2.4.1-io_apic-46

diff -up --recursive --new-file linux-2.4.1.macro/arch/i386/kernel/apic.c linux-2.4.1/arch/i386/kernel/apic.c
--- linux-2.4.1.macro/arch/i386/kernel/apic.c	Wed Dec 13 23:54:27 2000
+++ linux-2.4.1/arch/i386/kernel/apic.c	Mon Feb 12 16:11:15 2001
@@ -23,6 +23,7 @@
 #include <linux/mc146818rtc.h>
 #include <linux/kernel_stat.h>
+#include <asm/atomic.h>
 #include <asm/smp.h>
 #include <asm/mtrr.h>
 #include <asm/mpspec.h>
@@ -270,7 +271,13 @@ void __init setup_local_APIC (void)
 	 *   PCI Ne2000 networking cards and PII/PIII processors, dual
 	 *   BX chipset. ]
 	 */
-#if 0
+	/*
+	 * Actually disabling the focus CPU check just makes the hang less
+	 * frequent as it makes the interrupt distributon model be more
+	 * like LRU than MRU (the short-term load is more even across CPUs).
+	 * See also the comment in end_level_ioapic_irq().  --macro
+	 */
+#if 1
 	/* Enable focus processor (bit==0) */
 	value &= ~(1<<9);
 #else
@@ -764,7 +771,7 @@ asmlinkage void smp_error_interrupt(void
 	apic_write(APIC_ESR, 0);
 	v1 = apic_read(APIC_ESR);
 	ack_APIC_irq();
-	irq_err_count++;
+	atomic_inc(&irq_err_count);
 	/* Here is what the APIC error bits mean:
 	   0: Send CS error
diff -up --recursive --new-file linux-2.4.1.macro/arch/i386/kernel/i8259.c linux-2.4.1/arch/i386/kernel/i8259.c
--- linux-2.4.1.macro/arch/i386/kernel/i8259.c	Mon Nov 20 18:01:58 2000
+++ linux-2.4.1/arch/i386/kernel/i8259.c	Sun Feb 11 19:54:33 2001
@@ -12,6 +12,7 @@
 #include <linux/init.h>
 #include <linux/kernel_stat.h>
+#include <asm/atomic.h>
 #include <asm/system.h>
 #include <asm/io.h>
 #include <asm/irq.h>
@@ -321,7 +322,7 @@ spurious_8259A_irq:
 			printk("spurious 8259A interrupt: IRQ%d.\n", irq);
 			spurious_irq_mask |= irqmask;
 		}
-		irq_err_count++;
+		atomic_inc(&irq_err_count);
 		/*
 		 * Theoretically we do not have to handle this IRQ,
 		 * but in Linux this does not cause problems and is
diff -up --recursive --new-file linux-2.4.1.macro/arch/i386/kernel/io_apic.c linux-2.4.1/arch/i386/kernel/io_apic.c
--- linux-2.4.1.macro/arch/i386/kernel/io_apic.c	Sat Feb  3 12:05:49 2001
+++ linux-2.4.1/arch/i386/kernel/io_apic.c	Tue Feb 13 19:59:55 2001
@@ -33,6 +33,8 @@
 #include <asm/smp.h>
 #include <asm/desc.h>
+#define APIC_LOCKUP_DEBUG
+
 static spinlock_t ioapic_lock = SPIN_LOCK_UNLOCKED;
 /*
@@ -122,8 +124,14 @@ static void add_pin_to_irq(unsigned int
 	static void name##_IO_APIC_irq (unsigned int irq)		\
 	__DO_ACTION(R, ACTION, FINAL)
-DO_ACTION( __mask,    0, |= 0x00010000, io_apic_sync(entry->apic))/* mask = 1 */
-DO_ACTION( __unmask,  0, &= 0xfffeffff, )				/* mask = 0 */
+DO_ACTION( __mask,             0, |= 0x00010000, io_apic_sync(entry->apic) )
+						/* mask = 1 */
+DO_ACTION( __unmask,           0, &= 0xfffeffff, )
+						/* mask = 0 */
+DO_ACTION( __mask_and_edge,    0, = (reg & 0xffff7fff) | 0x00010000, )
+						/* mask = 1, trigger = 0 */
+DO_ACTION( __unmask_and_level, 0, = (reg & 0xfffeffff) | 0x00008000, )
+						/* mask = 0, trigger = 1 */
 static void mask_IO_APIC_irq (unsigned int irq)
 {
@@ -847,6 +855,8 @@ void /*__init*/ print_local_APIC(void *
 	v = apic_read(APIC_EOI);
 	printk(KERN_DEBUG "... APIC EOI: %08x\n", v);
+	v = apic_read(APIC_RRR);
+	printk(KERN_DEBUG "... APIC RRR: %08x\n", v);
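For readers who want to see what the new __mask_and_edge and __unmask_and_level actions do to a redirection entry, here is a minimal user-space sketch. It assumes the usual I/O APIC layout in which bit 15 of the low word selects the trigger mode and bit 16 is the mask bit. The constant names and the helper level_edge_level() are made up for illustration, the register is only simulated in memory, and the excerpt above does not show where the kernel actually applies the sequence, so treat this as a model of the bit arithmetic rather than of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions in the low 32 bits of a redirection entry. */
#define RTE_TRIGGER_LEVEL 0x00008000u   /* bit 15: 1 = level, 0 = edge */
#define RTE_MASKED        0x00010000u   /* bit 16: 1 = interrupt masked */

/* Mirrors "= (reg & 0xffff7fff) | 0x00010000": mask = 1, trigger = 0. */
static uint32_t mask_and_edge(uint32_t reg)
{
    return (reg & ~RTE_TRIGGER_LEVEL) | RTE_MASKED;
}

/* Mirrors "= (reg & 0xfffeffff) | 0x00008000": mask = 0, trigger = 1. */
static uint32_t unmask_and_level(uint32_t reg)
{
    return (reg & ~RTE_MASKED) | RTE_TRIGGER_LEVEL;
}

/*
 * Hypothetical helper: run the LEVEL->EDGE->LEVEL sequence on a simulated
 * register and return the intermediate (masked, edge-triggered) value.
 */
static uint32_t level_edge_level(uint32_t *rte_low)
{
    uint32_t intermediate;

    assert(rte_low != NULL);
    *rte_low = mask_and_edge(*rte_low);     /* LEVEL -> EDGE, masked */
    intermediate = *rte_low;
    *rte_low = unmask_and_level(*rte_low);  /* EDGE -> LEVEL, unmasked */
    return intermediate;
}
```

Starting from a level-triggered, unmasked entry, the sequence ends with the same register contents it started with; only the intermediate state differs, which is the point of the trick.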
Re: 2.4.3: still experiencing APIC-related hangs
On Fri, Mar 30, 2001 at 02:32:24PM +0200, Frank de Lange wrote:
> Hi'all,
>
> Subject says it all: 2.4.3 (unpatched) is still causing the dreaded
> APIC-related hangs on SMP BX systems (Abit BP-6, maybe Gigabyte). I still need
> to apply one of Maciej's patches to get rid of these hangs. The source comments
> in arch/i386/kernel/apic.c ("If focus CPU is disabled then the hang goes away")
> are incorrect, as the hang does not go away by simply disabling focus CPU. The
> only way for me to get rid of the hangs is to apply patch-2.4.1-io_apic-46
> (which does the LEVEL->EDGE->LEVEL triggered trick to 'free' the IO_APIC). I've
> been running with this patch for quite some time now, and have not experienced
> any problems with it. Maybe is it time to include it in the main kernel,
> perhaps as a configurable option ("BROKEN_IO_APIC")?
>
> Maciej, did you submit the patch to Linus? It really seems to solve the
> (occurrence of the) problems with these boards...

Where is this patch found? I am not seeing it so far on kernel.org.

-- Ferret

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
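As a side note on the focus CPU setting discussed above: the patch in this thread toggles it with "value &= ~(1<<9)", commented as "Enable focus processor (bit==0)". The sketch below models just that bit on a plain integer. The name SPIV_FOCUS_DISABLED and the function spiv_set_focus() are hypothetical, and the "disable" branch is simply the inverse operation, which I am assuming rather than quoting from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 9 of the register value: 0 = focus processor checking enabled. */
#define SPIV_FOCUS_DISABLED (1u << 9)

/*
 * Hypothetical helper: return the register value with focus processor
 * checking switched on (enable_focus == 1) or off (enable_focus == 0).
 * All other bits are left untouched.
 */
static uint32_t spiv_set_focus(uint32_t value, int enable_focus)
{
    assert(enable_focus == 0 || enable_focus == 1);
    if (enable_focus)
        return value & ~SPIV_FOCUS_DISABLED;  /* bit == 0: checking on */
    return value | SPIV_FOCUS_DISABLED;       /* bit == 1: checking off */
}
```

Note that, per the report above, flipping this bit alone does not cure the hang; the comment added by the patch says it only makes the hang less frequent.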
Re: 2.4.3: still experiencing APIC-related hangs
On Fri, Mar 30, 2001 at 08:32:39AM -0800, [EMAIL PROTECTED] wrote:
> On Fri, Mar 30, 2001 at 02:32:24PM +0200, Frank de Lange wrote:
> >
> > Maciej, did you submit the patch to Linus? It really seems to solve the
> > (occurrence of the) problems with these boards...
>
> Where is this patch found? I am not seeing it so far on kernel.org.

It is almost ancient history, from days long gone when men were men, women
were women and Linux had only reached 2.4.1...

I can send you a copy, if you need it...

Cheers//Frank
--
  WWWWW      _______________________
 ## o o\    /     Frank de Lange     \
 }#   \|   /                          \
  ##---# _/     <Hacker for Hire>      \
   ####   \      +31-320-252965        /
           \    [EMAIL PROTECTED]     /
            -------------------------
 [ "Omnis enim res, quae dando non deficit, dum habetur
    et non datur, nondum habetur, quomodo habenda est."  ]
Re: 2.4.3: still experiencing APIC-related hangs
On Fri, 30 Mar 2001, Frank de Lange wrote:

> Maciej, did you submit the patch to Linus? It really seems to solve the
> (occurrence of the) problems with these boards...

I suppose Alan is going to pass the patch to Linus eventually. I think
there are actually a number of people interested in the fix, so I may pass it
to Linus independently, but I'm really time-constrained these days, so it
might not happen before 2.4.3 (I don't feel safe about submitting a patch
without actually run-time testing it against whatever test kernel is
current at the moment).

--
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: [EMAIL PROTECTED], PGP key available          +