Re: SATA ahci Bug in 2.6.19.x
Hi! Any news?

Stefan

Stefan Priebe - FH wrote:

Hi! acpi=off does not help; I've already tried that. OK, here are some outputs:

1.) Complete dmesg with 2.6.16.27 (works)

Linux version 2.6.16.27amd ([EMAIL PROTECTED]) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #6 SMP Sat Aug 26 14:29:07 CEST 2006
BIOS-provided physical RAM map:
BIOS-e820: - 0009fc00 (usable)
BIOS-e820: 0009fc00 - 000a (reserved)
BIOS-e820: 000e4000 - 0010 (reserved)
BIOS-e820: 0010 - 3bfb (usable)
BIOS-e820: 3bfb - 3bfbe000 (ACPI data)
BIOS-e820: 3bfbe000 - 3bfe (ACPI NVS)
BIOS-e820: 3bfe - 3c00 (reserved)
BIOS-e820: fec0 - fec01000 (reserved)
BIOS-e820: fecc - fecc1000 (reserved)
BIOS-e820: ff7c - 0001 (reserved)
ACPI: RSDP (v002 ACPIAM) @ 0x000fa850
ACPI: XSDT (v001 A M I OEMXSDT 0x12000527 MSFT 0x0097) @ 0x3bfb0100
ACPI: FADT (v003 A M I OEMFACP 0x12000527 MSFT 0x0097) @ 0x3bfb0290
ACPI: MADT (v001 A M I OEMAPIC 0x12000527 MSFT 0x0097) @ 0x3bfb0390
ACPI: MCFG (v001 A M I OEMMCFG 0x12000527 MSFT 0x0097) @ 0x3bfb0400
ACPI: OEMB (v001 A M I AMI_OEM 0x12000527 MSFT 0x0097) @ 0x3bfbe040
ACPI: DSDT (v001 A0339 A0339000 0x INTL 0x02002026) @ 0x
Scanning NUMA topology in Northbridge 24
Number of nodes 1
Node 0 MemBase Limit 3bfb
NUMA: Using 63 for the hash shift.
Using node hash shift of 63
Bootmem setup node 0 -3bfb
On node 0 totalpages: 240991
DMA zone: 2709 pages, LIFO batch:0
DMA32 zone: 238282 pages, LIFO batch:31
Normal zone: 0 pages, LIFO batch:0
HighMem zone: 0 pages, LIFO batch:0
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:15 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x81] disabled)
ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 1, version 3, address 0xfec0, GSI 0-23
ACPI: IOAPIC (id[0x02] address[0xfecc] gsi_base[24])
IOAPIC[1]: apic_id 2, version 3, address 0xfecc, GSI 24-47
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 4000 (gap: 3c00:c2c0)
Checking aperture...
CPU 0: aperture @ f000 size 128 MB
Built 1 zonelists
Kernel command line: root=/dev/sda6 ro rootflags=quota
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
time.c: Detected 2400.214 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Memory: 962212k/982720k available (2939k kernel code, 20120k reserved, 1327k data, 220k init)
Calibrating delay using timer specific routine.. 4810.51 BogoMIPS (lpj=9621030)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0(1) -> Node 0 -> Core 0
Using local APIC timer interrupts.
result 12501128
Detected 12.501 MHz APIC timer.
Brought up 1 CPUs
testing NMI watchdog ... OK.
migration_cost=0
DMI 2.3 present.
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
PCI: Using MMCONFIG at e000
ACPI: Subsystem revision 20060127
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is :01:00.0
PCI: Transparent bridge - :00:13.1
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.NBPG._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.NBP0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P7._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0PA._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 *15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11 12 *14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 *10 11 12 14 15)
Linux Plug and Play Support v0.97 (c
Re: XFS or Kernel Problem / Bug
Hi! Any news?

Stefan

Stefan Priebe - FH wrote:

Hi! OK, I rechecked everything. We have 22 servers with the DFI PM-12 mainboard with a VIA chipset, but only the 5 oldest of them (from before 2004/01/20; we bought them all over a span of 10 months) have this problem. So I think it is a mixture of a software and a hardware problem. Perhaps DFI changed something on the mainboard (e.g. a new revision), or there was a new BIOS version on it. But something must also have changed in the kernel.

> OK, can you post configs for one that works and one that doesn't?

You mean kernel .configs?

> And which C compiler(s) do you use? The same for all, I hope...

On all 32-bit machines:

gcc -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.5/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --enable-__cxa_atexit --with-system-zlib --enable-nls --without-included-gettext --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.5 (Debian 1:3.3.5-13)

Stefan

Chuck Ebbert wrote:

Stefan Priebe - FH wrote:

> > What is different about these servers?
>
> All 300 machines are mostly different. We have Dual Opteron, single P4 with HT, single P4 without HT, Dual Xeon, Athlon 64 X2, and many more, with different mainboards etc. The only thing I found out is that all the servers where the problem exists are using a DFI PM-12 mainboard with a VIA chipset.

Any others with VIA chipsets?

> > Are you building different kernels for them, or is it just different drivers loaded?
>
> No, every machine builds its own kernel.

OK, can you post configs for one that works and one that doesn't? And which C compiler(s) do you use? The same for all, I hope...
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
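Once a working and a failing kernel .config are both at hand, the comparison Chuck asks for could be done roughly as follows. This is only a sketch: the helper names `parse_kconfig` and `diff_kconfig` are invented here, and the option values in `GOOD` and `BAD` are made up for illustration, not taken from any machine in this thread.

```python
def parse_kconfig(text):
    """Map option name to value; '# CONFIG_X is not set' is recorded as 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
    return options


def diff_kconfig(good, bad):
    """Return sorted (option, good_value, bad_value) for options that differ."""
    names = set(good) | set(bad)
    return sorted(
        (name, good.get(name, "n"), bad.get(name, "n"))
        for name in names
        if good.get(name, "n") != bad.get(name, "n")
    )


# Hypothetical config fragments (assumption, not from the thread).
GOOD = """\
CONFIG_SMP=y
CONFIG_XFS_FS=y
# CONFIG_REGPARM is not set
"""
BAD = """\
CONFIG_SMP=y
CONFIG_XFS_FS=y
CONFIG_REGPARM=y
"""

for name, good_value, bad_value in diff_kconfig(parse_kconfig(GOOD), parse_kconfig(BAD)):
    print(f"{name}: works={good_value} fails={bad_value}")
```

Treating a missing option the same as "not set" keeps the output short when the two configs come from slightly different kernel versions; that choice is a simplification.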
Re: SATA ahci Bug in 2.6.19.x
Hi! acpi=off does not help; I've already tried that. OK, here are some outputs:

1.) Complete dmesg with 2.6.16.27 (works)

Linux version 2.6.16.27amd ([EMAIL PROTECTED]) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #6 SMP Sat Aug 26 14:29:07 CEST 2006
BIOS-provided physical RAM map:
BIOS-e820: - 0009fc00 (usable)
BIOS-e820: 0009fc00 - 000a (reserved)
BIOS-e820: 000e4000 - 0010 (reserved)
BIOS-e820: 0010 - 3bfb (usable)
BIOS-e820: 3bfb - 3bfbe000 (ACPI data)
BIOS-e820: 3bfbe000 - 3bfe (ACPI NVS)
BIOS-e820: 3bfe - 3c00 (reserved)
BIOS-e820: fec0 - fec01000 (reserved)
BIOS-e820: fecc - fecc1000 (reserved)
BIOS-e820: ff7c - 0001 (reserved)
ACPI: RSDP (v002 ACPIAM) @ 0x000fa850
ACPI: XSDT (v001 A M I OEMXSDT 0x12000527 MSFT 0x0097) @ 0x3bfb0100
ACPI: FADT (v003 A M I OEMFACP 0x12000527 MSFT 0x0097) @ 0x3bfb0290
ACPI: MADT (v001 A M I OEMAPIC 0x12000527 MSFT 0x0097) @ 0x3bfb0390
ACPI: MCFG (v001 A M I OEMMCFG 0x12000527 MSFT 0x0097) @ 0x3bfb0400
ACPI: OEMB (v001 A M I AMI_OEM 0x12000527 MSFT 0x0097) @ 0x3bfbe040
ACPI: DSDT (v001 A0339 A0339000 0x INTL 0x02002026) @ 0x
Scanning NUMA topology in Northbridge 24
Number of nodes 1
Node 0 MemBase Limit 3bfb
NUMA: Using 63 for the hash shift.
Using node hash shift of 63
Bootmem setup node 0 -3bfb
On node 0 totalpages: 240991
DMA zone: 2709 pages, LIFO batch:0
DMA32 zone: 238282 pages, LIFO batch:31
Normal zone: 0 pages, LIFO batch:0
HighMem zone: 0 pages, LIFO batch:0
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:15 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x81] disabled)
ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 1, version 3, address 0xfec0, GSI 0-23
ACPI: IOAPIC (id[0x02] address[0xfecc] gsi_base[24])
IOAPIC[1]: apic_id 2, version 3, address 0xfecc, GSI 24-47
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 4000 (gap: 3c00:c2c0)
Checking aperture...
CPU 0: aperture @ f000 size 128 MB
Built 1 zonelists
Kernel command line: root=/dev/sda6 ro rootflags=quota
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
time.c: Detected 2400.214 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Memory: 962212k/982720k available (2939k kernel code, 20120k reserved, 1327k data, 220k init)
Calibrating delay using timer specific routine.. 4810.51 BogoMIPS (lpj=9621030)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0(1) -> Node 0 -> Core 0
Using local APIC timer interrupts.
result 12501128
Detected 12.501 MHz APIC timer.
Brought up 1 CPUs
testing NMI watchdog ... OK.
migration_cost=0
DMI 2.3 present.
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
PCI: Using MMCONFIG at e000
ACPI: Subsystem revision 20060127
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is :01:00.0
PCI: Transparent bridge - :00:13.1
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.NBPG._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.NBP0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P7._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0PA._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 *15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11 12 *14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 *10 11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 12
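Since the point of posting the dmesg from the working kernel is to compare it with the boot log of a failing one, here is a small sketch of how such a comparison might be automated with Python's standard `difflib`. The function name `changed_lines` is invented for this example, the "works" sample lines are copied from the log above, and the "fails" sample is a hypothetical variation, not output from any machine in this thread.

```python
import difflib


def changed_lines(good_log, bad_log):
    """Return only the '-'/'+' lines of a unified diff between two boot logs."""
    good = [line.strip() for line in good_log.splitlines() if line.strip()]
    bad = [line.strip() for line in bad_log.splitlines() if line.strip()]
    diff = difflib.unified_diff(
        good, bad, fromfile="works", tofile="fails", n=0, lineterm=""
    )
    # Skip the two file header lines and the '@@' hunk markers.
    return [line for line in list(diff)[2:] if not line.startswith("@@")]


# Lines taken from the 2.6.16.27 dmesg above.
WORKS = """\
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
PCI: Probing PCI hardware (bus 00)
"""
# Hypothetical log from a failing boot (assumption, for illustration only).
FAILS = """\
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
PCI: Probing PCI hardware (bus 00)
"""

for line in changed_lines(WORKS, FAILS):
    print(line)
```

Real logs would also differ in version strings and memory sizes, so the result is a starting point for reading, not a diagnosis.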
Re: XFS or Kernel Problem / Bug
Hi! OK, I rechecked everything. We have 22 servers with the DFI PM-12 mainboard with a VIA chipset, but only the 5 oldest of them (from before 2004/01/20; we bought them all over a span of 10 months) have this problem. So I think it is a mixture of a software and a hardware problem. Perhaps DFI changed something on the mainboard (e.g. a new revision), or there was a new BIOS version on it. But something must also have changed in the kernel.

> OK, can you post configs for one that works and one that doesn't?

You mean kernel .configs?

> And which C compiler(s) do you use? The same for all, I hope...

On all 32-bit machines:

gcc -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.5/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --enable-__cxa_atexit --with-system-zlib --enable-nls --without-included-gettext --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.5 (Debian 1:3.3.5-13)

Stefan

Chuck Ebbert wrote:

Stefan Priebe - FH wrote:

> > What is different about these servers?
>
> All 300 machines are mostly different. We have Dual Opteron, single P4 with HT, single P4 without HT, Dual Xeon, Athlon 64 X2, and many more, with different mainboards etc. The only thing I found out is that all the servers where the problem exists are using a DFI PM-12 mainboard with a VIA chipset.

Any others with VIA chipsets?

> > Are you building different kernels for them, or is it just different drivers loaded?
>
> No, every machine builds its own kernel.

OK, can you post configs for one that works and one that doesn't? And which C compiler(s) do you use? The same for all, I hope...
Re: SATA ahci Bug in 2.6.19.x
Hello,

Nobody here who cares?

Stefan

Stephen Evanchik wrote:

On 1/22/07, Stefan Priebe - FH <[EMAIL PROTECTED]> wrote:

> I've an Asus A8V mainboard which works wonderfully with a 2.6.18.x kernel, but I cannot use the SATA controller with a 2.6.19.x kernel.

I also have an Asus A8V motherboard that cannot boot a newer kernel because the SATA controller does not come up properly. I have tried kernels 2.6.19.2 and 2.6.20-rc5 with no luck. It looks like later kernels don't recognize the proper IRQ of the device as compared to the 2.6.18 boot logs.

> "ACPI: PCI Interrupt :00:0f.0[B] -> GSI 21 (level, low) -> IRQ 21"
> "ahci :00:0f.0: AHCI 0001. 32 slots 4 ports 3 Gbps 0xf impl IDE mode"
> "ahci :00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part "
> "ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 1277"
> "ata2: SATA max UDMA/133 cmd 0xC2004D80 ctl 0x0 bmdma 0x0 irq 1277"
> "ata3: SATA max UDMA/133 cmd 0xC2004E00 ctl 0x0 bmdma 0x0 irq 1277"
> "ata4: SATA max UDMA/133 cmd 0xC2004E80 ctl 0x0 bmdma 0x0 irq 1277"

Similar output as above. Does anyone have any ideas?

Stephen
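The discrepancy Stephen points at (the ACPI line routes the controller to IRQ 21 while every ata port reports irq 1277) can be pulled out of a boot log mechanically. The sketch below uses log lines copied from the message above; the function names `routed_irq` and `port_irqs` are made up for this example. It only surfaces the mismatch and makes no claim about whether the differing number is actually the cause of the failure.

```python
import re

# Log lines copied from the message above (surrounding quotes dropped).
LOG = """\
ACPI: PCI Interrupt :00:0f.0[B] -> GSI 21 (level, low) -> IRQ 21
ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 1277
ata2: SATA max UDMA/133 cmd 0xC2004D80 ctl 0x0 bmdma 0x0 irq 1277
ata3: SATA max UDMA/133 cmd 0xC2004E00 ctl 0x0 bmdma 0x0 irq 1277
ata4: SATA max UDMA/133 cmd 0xC2004E80 ctl 0x0 bmdma 0x0 irq 1277
"""


def routed_irq(log):
    """IRQ announced on the 'ACPI: PCI Interrupt' line, or None if absent."""
    match = re.search(r"ACPI: PCI Interrupt .*IRQ (\d+)", log)
    return int(match.group(1)) if match else None


def port_irqs(log):
    """Map each ataN port to the irq number printed on its line."""
    pattern = re.compile(r"^(ata\d+): .* irq (\d+)$", re.MULTILINE)
    return {m.group(1): int(m.group(2)) for m in pattern.finditer(log)}


expected = routed_irq(LOG)
for port, irq in sorted(port_irqs(LOG).items()):
    note = "matches" if irq == expected else "differs from"
    print(f"{port}: irq {irq} {note} ACPI-routed IRQ {expected}")
```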
Re: XFS or Kernel Problem / Bug
Hi Chuck, hi Eric,

Because you both asked me nearly the same thing, I will answer you both in one mail.

> What is different about these servers?

All 300 machines are mostly different. We have Dual Opteron, single P4 with HT, single P4 without HT, Dual Xeon, Athlon 64 X2, and many more, with different mainboards etc. The only thing I found out is that all the servers where the problem exists are using a DFI PM-12 mainboard with a VIA chipset.

> Are you building different kernels for them, or is it just different
> drivers loaded?

No, every machine builds its own kernel.

Stefan

Chuck Ebbert wrote:

Stefan Priebe - FH wrote:

> Hi! Hmm, are you sure? I mean, I have this problem on 5 servers, all with the same mainboard. I cannot believe that all 5 servers have a hardware problem that starts on the same day. The other thing is that they all work fine with 2.6.16.x and all other kernels before. I mean, some of them have been running 2.6.x for two years without any problem...

OK it's probably not hardware, but a bit is flipped somehow.

What is different about these servers? Are you building different kernels for them, or is it just different drivers loaded?
Re: XFS or Kernel Problem / Bug

Hi!

Hmm, are you sure? I have this problem on 5 servers, all with the same
mainboard. I cannot believe that all 5 servers have a hardware problem that
starts on the same day. The other thing is that they all work fine with
2.6.16.x and all earlier kernels. Some of them have been running 2.6.x for
two years without any problem...

Stefan

Chuck Ebbert wrote:
> Stefan Priebe - FH wrote:
>> Sorry, that is not possible because it is a production machine. But I've
>> caught the error and the files from another machine; perhaps this helps.
>>
>> BUG: unable to handle kernel NULL pointer dereference at virtual address 0288
>>  printing eip:
>> c0142ff7
>> *pde =
>> Oops: [#1]
>> SMP
>> Modules linked in: iptable_filter ip_tables x_tables
>> CPU:0
>> EIP:0060:[<c0142ff7>] Not tainted VLI
>> EFLAGS: 00010246 (2.6.18.6 #1)
>> EIP is at generic_file_buffered_write+0x390/0x6cf
>> eax: ebx: 01ec ecx: ea029a40 edx: 8002
>> esi: edi: e3b28c9c ebp: 01ec esp: dd04bd18
>> ds: 007b es: 007b ss: 0068
>> Process proftpd (pid: 3615, ti=dd04a000 task=eba88a70 task.ti=dd04a000)
>> Stack: e3b28d44 0001 0010 01fc c036d793 01fc c14765c0 0010
>>        080d404c 01ec e3b28c9c c03e78c0 e3b28d44 ea029a40 01fc
>>        01ec dd04beac 00d420b1 dd04bd80 45b1fa67
>> Call Trace:
>>  [<c036d793>] sock_def_readable+0x7f/0x81
>>  [<c017a03a>] file_update_time+0xad/0xcb
>>  [<c0232015>] xfs_iunlock+0x55/0x9f
>>  [<c0262eeb>] xfs_write+0xa74/0xc61
>>  [<c036a253>] sock_aio_read+0x95/0x99
>>  [<c025d9fb>] xfs_file_aio_write+0x8f/0xa0
>>  [<c015fb94>] do_sync_write+0xc9/0x10f
>>  [<c0133ad6>] autoremove_wake_function+0x0/0x57
>>  [<c015f3d5>] generic_file_llseek+0x95/0xbc
>>  [<c015facb>] do_sync_write+0x0/0x10f
>>  [<c015fc80>] vfs_write+0xa6/0x179
>>  [<c015fe24>] sys_write+0x51/0x80
>>  [<c0102d3f>] syscall_call+0x7/0xb
>> Code: 04 89 10 8b 44 24 40 85 c0 0f 85 db 00 00 00 8b 5c 24 24 85 db 0f 88 c3 00 00 00 8b 4c 24 34 8b 51 18 f6 c6 10 75 73 8b 7c 24 28 <8b> 85 9c 00 00 00 f6 40 30 10 75 63 f6 87 48 01 00 00 01 75 5a
>> EIP: [<c0142ff7>] generic_file_buffered_write+0x390/0x6cf SS:ESP 0068:dd04bd18
>>
>> Files:
>> http://server113-han.de-nserver.de/filemap.s
>> http://server113-han.de-nserver.de/filemap.o
>
> You seem to have some kind of hardware/memory problem.
>
> Disassembly of the failing instruction from the oops:
>
>    8b 7c 24 28          mov    0x28(%esp),%edi
>    8b 85 9c 00 00 00    mov    0x9c(%ebp),%eax   <=
>
> Dump of the object code:
>
>    8b 7c 24 28          mov    0x28(%esp),%edi
>    8b 87 9c 00 00 00    mov    0x9c(%edi),%eax
>
> Looks like a bit is flipped.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

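The "bit is flipped" observation can be checked by hand. The sketch below is only an illustration, not part of the original analysis: it takes the second byte of each instruction (the x86 ModRM byte, 0x85 in the oops and 0x87 in the object file), splits it into its fields, and XORs the two. The helper names are made up for this example.

```python
# ModRM bytes taken from the two disassemblies quoted in the message.
OOPS_MODRM = 0x85    # from the oops "Code:" line: 8b 85 9c 00 00 00
OBJECT_MODRM = 0x87  # from filemap.o:             8b 87 9c 00 00 00

# 32-bit x86 register numbering used by the ModRM reg and r/m fields.
REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def flipped_bits(a, b):
    """Return the bit positions (0 = least significant) where two bytes differ."""
    diff = a ^ b
    return [bit for bit in range(8) if diff & (1 << bit)]


for label, byte in (("oops", OOPS_MODRM), ("object", OBJECT_MODRM)):
    mod, reg, rm = decode_modrm(byte)
    # With mod == 2 and rm != 4, rm names the base register directly.
    print(f"{label}: mod={mod} dest={REGISTERS[reg]} base={REGISTERS[rm]}")

print("differing bits:", flipped_bits(OOPS_MODRM, OBJECT_MODRM))
```

The two bytes differ in a single bit, and that bit alone changes the base register from %edi to %ebp, which is consistent with the disassembly shown in the message.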
Re: XFS or Kernel Problem / Bug

Hi!

Sorry, that is not possible because it is a production machine. But I've
caught the error and the files from another machine; perhaps this helps.

BUG: unable to handle kernel NULL pointer dereference at virtual address 0288
 printing eip:
c0142ff7
*pde =
Oops: [#1]
SMP
Modules linked in: iptable_filter ip_tables x_tables
CPU:0
EIP:0060:[<c0142ff7>] Not tainted VLI
EFLAGS: 00010246 (2.6.18.6 #1)
EIP is at generic_file_buffered_write+0x390/0x6cf
eax: ebx: 01ec ecx: ea029a40 edx: 8002
esi: edi: e3b28c9c ebp: 01ec esp: dd04bd18
ds: 007b es: 007b ss: 0068
Process proftpd (pid: 3615, ti=dd04a000 task=eba88a70 task.ti=dd04a000)
Stack: e3b28d44 0001 0010 01fc c036d793 01fc c14765c0 0010
       080d404c 01ec e3b28c9c c03e78c0 e3b28d44 ea029a40 01fc
       01ec dd04beac 00d420b1 dd04bd80 45b1fa67
Call Trace:
 [<c036d793>] sock_def_readable+0x7f/0x81
 [<c017a03a>] file_update_time+0xad/0xcb
 [<c0232015>] xfs_iunlock+0x55/0x9f
 [<c0262eeb>] xfs_write+0xa74/0xc61
 [<c036a253>] sock_aio_read+0x95/0x99
 [<c025d9fb>] xfs_file_aio_write+0x8f/0xa0
 [<c015fb94>] do_sync_write+0xc9/0x10f
 [<c0133ad6>] autoremove_wake_function+0x0/0x57
 [<c015f3d5>] generic_file_llseek+0x95/0xbc
 [<c015facb>] do_sync_write+0x0/0x10f
 [<c015fc80>] vfs_write+0xa6/0x179
 [<c015fe24>] sys_write+0x51/0x80
 [<c0102d3f>] syscall_call+0x7/0xb
Code: 04 89 10 8b 44 24 40 85 c0 0f 85 db 00 00 00 8b 5c 24 24 85 db 0f 88 c3 00 00 00 8b 4c 24 34 8b 51 18 f6 c6 10 75 73 8b 7c 24 28 <8b> 85 9c 00 00 00 f6 40 30 10 75 63 f6 87 48 01 00 00 01 75 5a
EIP: [<c0142ff7>] generic_file_buffered_write+0x390/0x6cf SS:ESP 0068:dd04bd18

Files:
http://server113-han.de-nserver.de/filemap.s
http://server113-han.de-nserver.de/filemap.o

Stefan

Chuck Ebbert wrote:
> Stefan Priebe - FH wrote:
>> It could be that the options are different now, because my first try
>> was to change the kernel options; when that did not help I switched
>> back to 2.6.16.37. Any idea what I can do?
>>
>> Chuck Ebbert wrote:
>>> That doesn't match your oops at all. Did you use a different compiler
>>> and/or different kernel build options?
>
> If you don't know what changed you can try different options until the
> filemap.s is the same. You should see
>
>    movl  156(%ebp),%eax
>    testb $16, 48(%eax)
>
> in generic_file_buffered_write. And you need to regenerate filemap.s
> manually each time.
>
> (Did you test the kernel that you posted these pieces from? If you can
> get it to oops the same way, just post that instead.)

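Checking each regenerated filemap.s by eye gets tedious, so a small script can look for the instruction pair Chuck describes. This is a rough sketch under assumptions: the function name is hypothetical, the `$` immediate prefix on `16` is assumed (AT&T syntax requires it, though it may be missing in archived copies of the message), and the search is not restricted to the body of generic_file_buffered_write.

```python
import re

# The two instructions expected back to back, with flexible whitespace.
EXPECTED = [
    re.compile(r"movl\s+156\(%ebp\),\s*%eax"),
    re.compile(r"testb\s+\$?16,\s*48\(%eax\)"),
]


def has_expected_sequence(asm_text):
    """True if the two expected instructions appear on consecutive non-empty lines."""
    lines = [line.strip() for line in asm_text.splitlines() if line.strip()]
    for first, second in zip(lines, lines[1:]):
        if EXPECTED[0].match(first) and EXPECTED[1].match(second):
            return True
    return False


# A fragment shaped like gcc's assembler output (illustrative, not real output).
sample = "\tmovl\t40(%esp), %edi\n\tmovl\t156(%ebp), %eax\n\ttestb\t$16, 48(%eax)\n"
print(has_expected_sequence(sample))
```

In practice one would pass it the contents of the file produced by 'make mm/filemap.s', for example `has_expected_sequence(open("mm/filemap.s").read())`, after each change of build options.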
Re: XFS or Kernel Problem / Bug

Hi!

It could be that the options are different now, because my first try was
to change the kernel options; when that did not help I switched back to
2.6.16.37. Any idea what I can do?

Stefan

Chuck Ebbert wrote:
> Stefan Priebe - FH wrote:
>> Hi!
>>
>> I'll do anything you like :-) if we can find the bug. So here are the
>> files (2.6.18.6):
>> http://server055.de-nserver.de/filemap.o
>> http://server055.de-nserver.de/filemap.s
>>
>>> If you can, post the file mm/filemap.o from your build directory to
>>> some website. And do 'make mm/filemap.s' and post that file too.
>
> That doesn't match your oops at all. Did you use a different compiler
> and/or different kernel build options?

Re: XFS or Kernel Problem / Bug

Hi Chuck, hi Eric,

since you both asked me nearly the same thing, I will answer you both in
one mail.

> What is different about these servers?

The 300 machines are mostly different from one another. We have dual
Opteron, single P4 with HT, single P4 without HT, dual Xeon, Athlon 64 X2,
and many more, with different mainboards etc. The only thing I found out is
that all the servers where the problem exists use a DFI PM-12 mainboard
with a VIA chipset.

> Are you building different kernels for them, or is it just different
> drivers loaded?

No, every machine builds its own kernel.

Stefan

Chuck Ebbert wrote:
> Stefan Priebe - FH wrote:
>> Hi!
>>
>> Hmm, are you sure? I have this problem on 5 servers, all with the same
>> mainboard. I cannot believe that all 5 servers have a hardware problem
>> that starts on the same day. The other thing is that they all work fine
>> with 2.6.16.x and all earlier kernels. Some of them have been running
>> 2.6.x for two years without any problem...
>
> OK it's probably not hardware, but a bit is flipped somehow.
>
> What is different about these servers? Are you building different
> kernels for them, or is it just different drivers loaded?

Re: XFS or Kernel Problem / Bug

Hi!

I'll do anything you like :-) if we can find the bug. So here are the files
(2.6.18.6):
http://server055.de-nserver.de/filemap.o
http://server055.de-nserver.de/filemap.s

Stefan

Chuck Ebbert wrote:
> Stefan Priebe - FH wrote:
>> I have 3 servers which work wonderfully with 2.6.16.X (also tested the
>> latest 2.6.16.37), but with 2.6.18.6 I get these errors:
>>
>> general protection fault: [#1]
>> Modules linked in:
>> CPU:0
>> EIP:0060:[<c01c8fd2>] Not tainted VLI
>> EFLAGS: 00010246 (2.6.18.6 #1)
>> EIP is at xfs_bmap_add_extent_hole_delay+0x58d/0x59b
>> eax: ebx: fffe0007 ecx: 0071a4cd edx:
>> esi: edi: ebp: 0015 esp: ce35f8f0
>> ds: es: 007b ss: 0068
>> Process mysqld (pid: 1836, ti=ce35e000 task=ee618550 task.ti=ce35e000)
>> Stack: 0232 0233 000c
>>        0007 eca90250 eca90278 0001 eca90200 03c3
>>        010003c3 ffc0 ce35fa58 ce35fa58 0001
>> Call Trace:
>>  [<c01b6c58>] xfs_trans_dqresv+0x3f9/0x405
>>  [<c01c6485>] xfs_bmap_add_extent+0x163/0x377
>>  [<c01cd2c3>] xfs_bmapi+0xa4e/0x1109
>>  [<c01ebbe3>] xfs_iomap_write_delay+0x233/0x2fa
>>  [<c01eaa31>] xfs_imap_to_bmap+0x29/0x1d6
>>  [<c01eae1a>] xfs_iomap+0x23c/0x3e1
>>  [<c01eaebe>] xfs_iomap+0x2e0/0x3e1
>>  [<c020a71a>] xfs_bmap+0x1a/0x1e
>>  [<c020471e>] __xfs_get_blocks+0x5d/0x195
>
> Without the "Code:" line it's hard to tell what happened...
>
>> and sometimes this one:
>>
>> BUG: unable to handle kernel NULL pointer dereference at virtual address 0288
>>  printing eip:
>> c0142ff7
>> *pde =
>> Oops: [#1]
>> SMP
>> Modules linked in: iptable_filter ip_tables x_tables
>> CPU:0
>> EIP:0060:[<c0142ff7>] Not tainted VLI
>> EFLAGS: 00010246 (2.6.18.6 #1)
>> EIP is at generic_file_buffered_write+0x390/0x6cf
>> eax: ebx: 01ec ecx: ea029a40 edx: 8002
>> esi: edi: e3b28c9c ebp: 01ec esp: dd04bd18
>> ds: 007b es: 007b ss: 0068
>> Process proftpd (pid: 3615, ti=dd04a000 task=eba88a70 task.ti=dd04a000)
>> Stack: e3b28d44 0001 0010 01fc c036d793 01fc c14765c0 0010
>>        080d404c 01ec e3b28c9c c03e78c0 e3b28d44 ea029a40 01fc
>>        01ec dd04beac 00d420b1 dd04bd80 45b1fa67
>> Call Trace:
>>  [<c036d793>] sock_def_readable+0x7f/0x81
>>  [<c017a03a>] file_update_time+0xad/0xcb
>>  [<c0232015>] xfs_iunlock+0x55/0x9f
>>  [<c0262eeb>] xfs_write+0xa74/0xc61
>>  [<c036a253>] sock_aio_read+0x95/0x99
>>  [<c025d9fb>] xfs_file_aio_write+0x8f/0xa0
>>  [<c015fb94>] do_sync_write+0xc9/0x10f
>>  [<c0133ad6>] autoremove_wake_function+0x0/0x57
>>  [<c015f3d5>] generic_file_llseek+0x95/0xbc
>>  [<c015facb>] do_sync_write+0x0/0x10f
>>  [<c015fc80>] vfs_write+0xa6/0x179
>>  [<c015fe24>] sys_write+0x51/0x80
>>  [<c0102d3f>] syscall_call+0x7/0xb
>> Code: 04 89 10 8b 44 24 40 85 c0 0f 85 db 00 00 00 8b 5c 24 24 85 db 0f 88 c3 00 00 00 8b 4c 24 34 8b 51 18 f6 c6 10 75 73 8b 7c 24 28 <8b> 85 9c 00 00 00 f6 40 30 10 75 63 f6 87 48 01 00 00 01 75 5a
>> EIP: [<c0142ff7>] generic_file_buffered_write+0x390/0x6cf SS:ESP 0068:dd04bd18
>
> Well that's strange. It's here in mm/filemap.c line 2201:
>
>     /*
>      * For now, when the user asks for O_SYNC, we'll actually give O_DSYNC
>      */
>     if (likely(status >= 0)) {
>         if (unlikely((file->f_flags & O_SYNC) || IS_SYNC(inode))) {   <===
>             if (!a_ops->writepage || !is_sync_kiocb(iocb))
>                 status = generic_osync_inode(inode, mapping,
>                         OSYNC_METADATA|OSYNC_DATA);
>         }
>     }
>
> ebp holds the value of 'inode' and it's obviously wrong (it's also
> the same as 'written', which is in ebx.) So when it tries to read
> inode->i_sb, it dies.
>
> If you can, post the file mm/filemap.o from your build directory to
> some website. And do 'make mm/filemap.s' and post that file too.

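The reasoning about ebp can be tied back to the faulting address with a little arithmetic. In the sketch below (illustrative only, with a made-up function name) the base value comes from the register dump and the displacement from the marked instruction bytes `<8b> 85 9c 00 00 00` in the "Code:" line. That the displacement 0x9c is the offset of i_sb in this build is an assumption drawn from Chuck's explanation.

```python
def effective_address(base, displacement):
    """Address computed by a 32-bit base+displacement memory operand."""
    return (base + displacement) & 0xFFFFFFFF


EBP = 0x01EC         # ebp from the oops register dump (same value as ebx, 'written')
DISPLACEMENT = 0x9C  # from the bytes 8b 85 9c 00 00 00; 0x9c is 156 decimal

print(hex(effective_address(EBP, DISPLACEMENT)))
```

The result, 0x288, matches the "virtual address 0288" reported in the first line of the oops, which supports the reading that the kernel dereferenced a small integer sitting in ebp instead of a real inode pointer.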
Re: XFS or Kernel Problem / Bug

Hi!

I can give you an idea of the workload :-) I have the same problem on a
nearly idle server. Only a few cron jobs run there (the normal Debian
system crons). The load on this system was never higher than 0.01 during
the last 3 days, and this morning it crashed with the same error.

I have not tested 2.6.19.x because it has some problems with the SATA AHCI
driver, which we need. But I can manually update only this system to
2.6.19.x and wait some days.

There were no other messages in the log.

Cheers,
Stefan

David Chinner wrote:
> On Mon, Jan 22, 2007 at 09:07:23AM +0100, Stefan Priebe - FH wrote:
>> Hi!
>>
>> The update of the IDE layer was in 2.6.19. I don't think it is a
>> hardware bug, because all 5 of these machines have run fine for a few
>> years with 2.6.16.X and before. We switched to 2.6.18.6 on Monday last
>> week and all machines began to crash periodically. On Friday last week
>> we downgraded them all to 2.6.16.37 and all 5 machines run fine again.
>> So I don't believe it is a hardware problem. Do you really think that
>> could be?
>
> I was thinking more of a driver change that is being triggered on that
> particular hardware. FWIW, did you test 2.6.19?
>
> I really need a better idea of the workload these servers are running
> and, ideally, a reproducible test case to track something like this
> down. At the moment I have no idea what is going on and no real
> information on which to even base a guess.
>
> Were there any other messages in the log?
>
> On Mon, Jan 22, 2007 at 10:42:36AM +0100, Stefan Priebe - FH wrote:
>> Hi!
>>
>> I've another idea... could it be a barrier problem? Since barriers are
>> enabled by default from 2.6.17 on ...
>
> You could try turning it off. If it does fix the problem, then I'd be
> pointing once again at hardware ;)
>
> Cheers,
>
> Dave.

Re: XFS or Kernel Problem / Bug

Hi!

I've another idea... could it be a barrier problem? Since barriers are
enabled by default from 2.6.17 on ...

Stefan

David Chinner wrote:
> On Mon, Jan 22, 2007 at 08:51:10AM +0100, Stefan Priebe - FH wrote:
>> Hi!
>>
>> I'm not sure, but perhaps it isn't an XFS bug.
>>
>> Here is what I found out:
>>
>> We have about 300 servers at the moment, and 5 of them are "old" Intel
>> Pentium 4 machines with a DFI PM-12 mainboard with VIA chipset. It only
>> happens on THESE machines.
>
> Hmmm - that points more to a hardware problem than a software problem;
> crashes in generic_file_buffered_write() are relatively uncommon, and to
> have them all isolated to a specific type of hardware is suspicious
>
> Wasn't there a major update of the IDE layer in 2.6.18? or was that
> 2.6.19 that I'm thinking of?
>
> BTW, have you run memtest86 on these boxes to rule out dodgy memory?
>
> Cheers,
>
> Dave.

Re: XFS or Kernel Problem / Bug

Hi!

The update of the IDE layer was in 2.6.19. I don't think it is a hardware
bug, because all 5 of these machines have run fine for a few years with
2.6.16.X and before. We switched to 2.6.18.6 on Monday last week and all
machines began to crash periodically. On Friday last week we downgraded
them all to 2.6.16.37 and all 5 machines run fine again. So I don't believe
it is a hardware problem. Do you really think that could be?

Stefan

David Chinner wrote:
> On Mon, Jan 22, 2007 at 08:51:10AM +0100, Stefan Priebe - FH wrote:
>> Hi!
>>
>> I'm not sure, but perhaps it isn't an XFS bug.
>>
>> Here is what I found out:
>>
>> We have about 300 servers at the moment, and 5 of them are "old" Intel
>> Pentium 4 machines with a DFI PM-12 mainboard with VIA chipset. It only
>> happens on THESE machines.
>
> Hmmm - that points more to a hardware problem than a software problem;
> crashes in generic_file_buffered_write() are relatively uncommon, and to
> have them all isolated to a specific type of hardware is suspicious
>
> Wasn't there a major update of the IDE layer in 2.6.18? or was that
> 2.6.19 that I'm thinking of?
>
> BTW, have you run memtest86 on these boxes to rule out dodgy memory?
>
> Cheers,
>
> Dave.

SATA ahci Bug in 2.6.19.x

Hello!

I have an Asus A8V mainboard which works wonderfully with a 2.6.18.X
kernel. But I cannot use the SATA controller with a 2.6.19.x kernel.

dmesg output from 2.6.18.3, where it works perfectly:

libata version 2.00 loaded.
ahci :00:0f.0: version 2.0
GSI 19 sharing vector 0xD9 and IRQ 19
ACPI: PCI Interrupt :00:0f.0[B] -> GSI 21 (level, low) -> IRQ 217
ahci :00:0f.0: AHCI 0001. 32 slots 4 ports 3 Gbps 0xf impl IDE mode
ahci :00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 225
ata2: SATA max UDMA/133 cmd 0xC2004D80 ctl 0x0 bmdma 0x0 irq 225
ata3: SATA max UDMA/133 cmd 0xC2004E00 ctl 0x0 bmdma 0x0 irq 225
ata4: SATA max UDMA/133 cmd 0xC2004E80 ctl 0x0 bmdma 0x0 irq 225
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA7, 312581808 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-7, max UDMA7, 312581808 sectors: LBA48 NCQ (depth 0/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
  Vendor: ATA  Model: SAMSUNG HD160JJ  Rev: ZM10
  Type: Direct-Access  ANSI SCSI revision: 05
  Vendor: ATA  Model: SAMSUNG HD160JJ  Rev: ZM10
  Type: Direct-Access  ANSI SCSI revision: 05
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 < sda5 sda6 sda7 >
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 312581808 512-byte hdwr sectors (160042 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 312581808 512-byte hdwr sectors (160042 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
 sdb: sdb1 < sdb5 sdb6 sdb7 >
sd 1:0:0:0: Attached scsi disk sdb

Output with 2.6.19.2 (logged via netconsole because the system can't boot):

ACPI: PCI Interrupt :00:0f.0[B] -> GSI 21 (level, low) -> IRQ 21
ahci :00:0f.0: AHCI 0001. 32 slots 4 ports 3 Gbps 0xf impl IDE mode
ahci :00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 1277
ata2: SATA max UDMA/133 cmd 0xC2004D80 ctl 0x0 bmdma 0x0 irq 1277
ata3: SATA max UDMA/133 cmd 0xC2004E00 ctl 0x0 bmdma 0x0 irq 1277
ata4: SATA max UDMA/133 cmd 0xC2004E80 ctl 0x0 bmdma 0x0 irq 1277
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: reset failed, giving up
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: reset failed, giving up
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)

Stefan

SATA ahci Bug in 2.6.19.x
Hello!

I have an Asus A8V mainboard which works wonderfully with a 2.6.18.x kernel, but I cannot use the SATA controller with a 2.6.19.x kernel.

dmesg output from 2.6.18.3, where it works perfectly:

libata version 2.00 loaded.
ahci :00:0f.0: version 2.0
GSI 19 sharing vector 0xD9 and IRQ 19
ACPI: PCI Interrupt :00:0f.0[B] - GSI 21 (level, low) - IRQ 217
ahci :00:0f.0: AHCI 0001. 32 slots 4 ports 3 Gbps 0xf impl IDE mode
ahci :00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 225
ata2: SATA max UDMA/133 cmd 0xC2004D80 ctl 0x0 bmdma 0x0 irq 225
ata3: SATA max UDMA/133 cmd 0xC2004E00 ctl 0x0 bmdma 0x0 irq 225
ata4: SATA max UDMA/133 cmd 0xC2004E80 ctl 0x0 bmdma 0x0 irq 225
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA7, 312581808 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-7, max UDMA7, 312581808 sectors: LBA48 NCQ (depth 0/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
Vendor: ATA Model: SAMSUNG HD160JJ Rev: ZM10
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: SAMSUNG HD160JJ Rev: ZM10
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda: sda1 sda5 sda6 sda7
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 312581808 512-byte hdwr sectors (160042 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 312581808 512-byte hdwr sectors (160042 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb5 sdb6 sdb7
sd 1:0:0:0: Attached scsi disk sdb

Output with 2.6.19.2 (logged via netconsole because the system can't boot):

ACPI: PCI Interrupt :00:0f.0[B] - GSI 21 (level, low) - IRQ 21
ahci :00:0f.0: AHCI 0001. 32 slots 4 ports 3 Gbps 0xf impl IDE mode
ahci :00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 1277
ata2: SATA max UDMA/133 cmd 0xC2004D80 ctl 0x0 bmdma 0x0 irq 1277
ata3: SATA max UDMA/133 cmd 0xC2004E00 ctl 0x0 bmdma 0x0 irq 1277
ata4: SATA max UDMA/133 cmd 0xC2004E80 ctl 0x0 bmdma 0x0 irq 1277
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: reset failed, giving up
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: reset failed, giving up
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)

Stefan

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
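One difference that is visible in the two logs above is the IRQ number reported for the ata ports: 225 under 2.6.18.3 and 1277 under 2.6.19.2. The following is only a minimal Python sketch for extracting those numbers from two saved dmesg excerpts and listing the ports whose IRQ differs. The helper names (port_irqs, irq_changes) are hypothetical, the sample strings are shortened copies of the lines above, and whether the IRQ difference has anything to do with the IDENTIFY timeout is not established here.

```python
import re

# Matches port setup lines such as
# "ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 225".
PORT_IRQ = re.compile(
    r"(ata\d+): SATA max \S+ cmd \S+ ctl \S+ bmdma \S+ irq (\d+)"
)


def port_irqs(dmesg_text):
    """Return {port: irq} for every ataN setup line in a dmesg excerpt."""
    return {port: int(irq) for port, irq in PORT_IRQ.findall(dmesg_text)}


def irq_changes(good_text, bad_text):
    """Return {port: (irq_in_good, irq_in_bad)} for ports whose IRQ differs."""
    good, bad = port_irqs(good_text), port_irqs(bad_text)
    return {
        port: (good[port], bad[port])
        for port in good
        if port in bad and good[port] != bad[port]
    }


# Shortened excerpts taken from the two logs in this message.
GOOD = """\
ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 225
ata2: SATA max UDMA/133 cmd 0xC2004D80 ctl 0x0 bmdma 0x0 irq 225
"""
BAD = """\
ata1: SATA max UDMA/133 cmd 0xC2004D00 ctl 0x0 bmdma 0x0 irq 1277
ata2: SATA max UDMA/133 cmd 0xC2004D80 ctl 0x0 bmdma 0x0 irq 1277
"""

for port, (old, new) in sorted(irq_changes(GOOD, BAD).items()):
    print(f"{port}: irq {old} -> {new}")
```

In practice the two excerpts would be read from files holding the full dmesg and netconsole captures rather than from inline strings.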
Re: XFS or Kernel Problem / Bug
Hi!

The update of the IDE layer was in 2.6.19.

I don't think it is a hardware bug, because all 5 of these machines have run fine for a few years with 2.6.16.x and earlier. We switched to 2.6.18.6 on Monday of last week, and all of the machines began to crash periodically. On Friday of last week we downgraded them all to 2.6.16.37, and all 5 machines run fine again. So I don't believe it is a hardware problem. Do you really think that could be?

Stefan

David Chinner wrote:
> On Mon, Jan 22, 2007 at 08:51:10AM +0100, Stefan Priebe - FH wrote:
> > Hi!
> >
> > I'm not sure, but perhaps it isn't an XFS bug. Here is what I found out:
> >
> > We have about 300 servers at the moment, and 5 of them are old Intel Pentium 4 machines with a DFI PM-12 mainboard with a VIA chipset. It only happens on THESE machines.
>
> Hmmm - that points more to a hardware problem than a software problem; crashes in generic_file_buffered_write() are relatively uncommon, and to have them all isolated to a specific type of hardware is suspicious.
>
> Wasn't there a major update of the IDE layer in 2.6.18? Or was that 2.6.19 that I'm thinking of?
>
> BTW, have you run memtest86 on these boxes to rule out dodgy memory?
>
> Cheers,
> Dave.
Re: XFS or Kernel Problem / Bug
Hi!

I have another idea: could it be a barrier problem? Barriers have been enabled by default from 2.6.17 on.

Stefan

David Chinner wrote:
> On Mon, Jan 22, 2007 at 08:51:10AM +0100, Stefan Priebe - FH wrote:
> > Hi!
> >
> > I'm not sure, but perhaps it isn't an XFS bug. Here is what I found out:
> >
> > We have about 300 servers at the moment, and 5 of them are old Intel Pentium 4 machines with a DFI PM-12 mainboard with a VIA chipset. It only happens on THESE machines.
>
> Hmmm - that points more to a hardware problem than a software problem; crashes in generic_file_buffered_write() are relatively uncommon, and to have them all isolated to a specific type of hardware is suspicious.
>
> Wasn't there a major update of the IDE layer in 2.6.18? Or was that 2.6.19 that I'm thinking of?
>
> BTW, have you run memtest86 on these boxes to rule out dodgy memory?
>
> Cheers,
> Dave.
Re: XFS or Kernel Problem / Bug
Hi!

I'm not sure, but perhaps it isn't an XFS bug. Here is what I found out:

We have about 300 servers at the moment, and 5 of them are "old" Intel Pentium 4 machines with a DFI PM-12 mainboard with a VIA chipset. It only happens on THESE machines. Other P4 machines with a Tyan or a Gigabyte mainboard are not affected. All 300 machines run the same Debian 3.0 with a self-built kernel. Some of these 5 use a 3ware controller and some of them use the mainboard controller. All systems are using IDE.

But I cannot say what is happening on these machines at the time of failure. Sometimes the servers crashed just a few minutes after booting; sometimes they ran for about 2-3 days.

I have now downgraded all servers to 2.6.16.37, because they are production machines, but I have one machine where we can test if you need something.

Here is the output, running 2.6.16.37 at the moment:

xfs_growfs -n /
meta-data=/dev/root              isize=256    agcount=16, agsize=603855 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=9661680, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=4717, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

Stefan

David Chinner wrote:
> On Sun, Jan 21, 2007 at 01:30:15PM +0100, Stefan Priebe - FH wrote:
> > Hello!
> >
> > I have 3 servers which work wonderfully with 2.6.16.x (I also tested the latest 2.6.16.37), but with 2.6.18.6 I get these errors:
> >
> > [ EIP is at xfs_bmap_add_extent_hole_delay+0x58d/0x59b ]
> > [ EIP is at generic_file_buffered_write+0x390/0x6cf ]
>
> Do you have a reproducible test case for these? If not, do you have any idea what is going on in the system at the time of the failure?
>
> Can you describe the storage subsystem you are using and post the output of xfs_growfs -n <mntpt> on the filesystem that is causing problems?
>
> Cheers,
> Dave.
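For readers who want to turn the xfs_growfs -n numbers above into sizes, here is a small Python sketch of the arithmetic. The constant and function names are made up for illustration; the values are copied from the output in this message. In this particular output agcount times agsize happens to equal the data block count exactly, which is not assumed to hold for every filesystem.

```python
# Geometry values copied from the xfs_growfs -n output in this message.
AGCOUNT = 16             # number of allocation groups
AGSIZE_BLOCKS = 603855   # blocks per allocation group
DATA_BLOCKS = 9661680    # blocks in the data section
LOG_BLOCKS = 4717        # blocks in the internal log
BLOCK_SIZE = 4096        # bsize, in bytes


def to_gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


def to_mib(n_bytes):
    """Convert a byte count to MiB (2**20 bytes)."""
    return n_bytes / 2**20


ag_total_blocks = AGCOUNT * AGSIZE_BLOCKS
data_bytes = DATA_BLOCKS * BLOCK_SIZE
log_bytes = LOG_BLOCKS * BLOCK_SIZE

print("allocation groups cover all data blocks:", ag_total_blocks == DATA_BLOCKS)
print(f"data section: {data_bytes} bytes ({to_gib(data_bytes):.2f} GiB)")
print(f"internal log: {log_bytes} bytes ({to_mib(log_bytes):.2f} MiB)")
```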
XFS or Kernel Problem / Bug
Hello!

I have 3 servers which work wonderfully with 2.6.16.x (I also tested the latest 2.6.16.37), but with 2.6.18.6 I get these errors:

general protection fault: [#1]
Modules linked in:
CPU:0
EIP:0060:[c01c8fd2]Not tainted VLI
EFLAGS: 00010246 (2.6.18.6 #1)
EIP is at xfs_bmap_add_extent_hole_delay+0x58d/0x59b
eax: ebx: fffe0007 ecx: 0071a4cd edx:
esi: edi: ebp: 0015 esp: ce35f8f0
ds: es: 007b ss: 0068
Process mysqld (pid: 1836, ti=ce35e000 task=ee618550 task.ti=ce35e000)
Stack: 0232 0233 000c
 0007 eca90250 eca90278 0001 eca90200 03c3
 010003c3 ffc0 ce35fa58 ce35fa58 0001
Call Trace:
 [c01b6c58] xfs_trans_dqresv+0x3f9/0x405
 [c01c6485] xfs_bmap_add_extent+0x163/0x377
 [c01cd2c3] xfs_bmapi+0xa4e/0x1109
 [c01ebbe3] xfs_iomap_write_delay+0x233/0x2fa
 [c01eaa31] xfs_imap_to_bmap+0x29/0x1d6
 [c01eae1a] xfs_iomap+0x23c/0x3e1
 [c01eaebe] xfs_iomap+0x2e0/0x3e1
 [c020a71a] xfs_bmap+0x1a/0x1e
 [c020471e] __xfs_get_blocks+0x5d/0x195

and sometimes this one:

BUG: unable to handle kernel NULL pointer dereference at virtual address 0288
 printing eip:
c0142ff7
*pde =
Oops: [#1]
SMP
Modules linked in: iptable_filter ip_tables x_tables
CPU:0
EIP:0060:[c0142ff7]Not tainted VLI
EFLAGS: 00010246 (2.6.18.6 #1)
EIP is at generic_file_buffered_write+0x390/0x6cf
eax: ebx: 01ec ecx: ea029a40 edx: 8002
esi: edi: e3b28c9c ebp: 01ec esp: dd04bd18
ds: 007b es: 007b ss: 0068
Process proftpd (pid: 3615, ti=dd04a000 task=eba88a70 task.ti=dd04a000)
Stack: e3b28d44 0001 0010 01fc c036d793 01fc c14765c0 0010
 080d404c 01ec e3b28c9c c03e78c0 e3b28d44 ea029a40 01fc
 01ec dd04beac 00d420b1 dd04bd80 45b1fa67
Call Trace:
 [c036d793] sock_def_readable+0x7f/0x81
 [c017a03a] file_update_time+0xad/0xcb
 [c0232015] xfs_iunlock+0x55/0x9f
 [c0262eeb] xfs_write+0xa74/0xc61
 [c036a253] sock_aio_read+0x95/0x99
 [c025d9fb] xfs_file_aio_write+0x8f/0xa0
 [c015fb94] do_sync_write+0xc9/0x10f
 [c0133ad6] autoremove_wake_function+0x0/0x57
 [c015f3d5] generic_file_llseek+0x95/0xbc
 [c015facb] do_sync_write+0x0/0x10f
 [c015fc80] vfs_write+0xa6/0x179
 [c015fe24] sys_write+0x51/0x80
 [c0102d3f] syscall_call+0x7/0xb
Code: 04 89 10 8b 44 24 40 85 c0 0f 85 db 00 00 00 8b 5c 24 24 85 db 0f 88 c3 00 00 00 8b 4c 24 34 8b 51 18 f6 c6 10 75 73 8b 7c 24 28 <8b> 85 9c 00 00 00 f6 40 30 10 75 63 f6 87 48 01 00 00 01 75 5a
EIP: [c0142ff7] generic_file_buffered_write+0x390/0x6cf SS:ESP 0068:dd04bd18

Stefan
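Each call trace entry above has the form [address] symbol+0xoffset/0xsize, so the start address of a function is simply address minus offset. The following Python sketch (the names FRAME, parse_frames and symbol_starts are invented for this example) parses a few entries copied from the traces above and checks that entries naming the same symbol agree on where that symbol starts. It is only a reading aid for the trace format, not a diagnosis of the oops.

```python
import re
from collections import defaultdict

# One entry looks like "[c01eae1a] xfs_iomap+0x23c/0x3e1":
# return address, symbol name, offset into the symbol, symbol size.
FRAME = re.compile(r"\[([0-9a-f]{8})\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(trace_text):
    """Return one dict per trace entry found in the text."""
    frames = []
    for addr, symbol, offset, size in FRAME.findall(trace_text):
        address = int(addr, 16)
        off = int(offset, 16)
        frames.append({
            "symbol": symbol,
            "address": address,
            "offset": off,
            "size": int(size, 16),
            "start": address - off,  # where the function begins
        })
    return frames


def symbol_starts(frames):
    """Map each symbol to the set of start addresses implied by its entries."""
    starts = defaultdict(set)
    for frame in frames:
        starts[frame["symbol"]].add(frame["start"])
    return dict(starts)


# Entries copied from the two traces in this message.
TRACE = """\
[c01eae1a] xfs_iomap+0x23c/0x3e1
[c01eaebe] xfs_iomap+0x2e0/0x3e1
[c015fb94] do_sync_write+0xc9/0x10f
[c015facb] do_sync_write+0x0/0x10f
EIP: [c0142ff7] generic_file_buffered_write+0x390/0x6cf
"""

for symbol, starts in symbol_starts(parse_frames(TRACE)).items():
    print(symbol, [hex(s) for s in sorted(starts)])
```

A symbol that maps to more than one start address would point to a copy-and-paste error in the trace; the entries sampled here are consistent.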