Re: APIC error on 32-bit kernel
Thank you very much for looking at this, Len.

On Fri, 11 May 2007 23:28:58 -0400
Len Brown <[EMAIL PROTECTED]> wrote:

> > > [ 94.754852] APIC error on CPU0: 08(40)
> > > [ 94.806045] APIC error on CPU0: 40(08)
>
> /* Here is what the APIC error bits mean:
>    0: Send CS error
>    1: Receive CS error
>    2: Send accept error
>    3: Receive accept error
>    4: Reserved
>    5: Send illegal vector
>    6: Received illegal vector
>    7: Illegal register address
> */
>
> So the 40 means the APIC got an illegal vector.
> Certainly this is consistent with the fact that
> the errors start when a specific device is being
> used. I assume that device is using MSI?

Yes, the device is using MSI.

> Curious that it is different in 32-bit and 64-bit mode.

Agreed, although I had one user back in March report APIC errors on the
Asus M2V board while running Debian x86_64. I personally have never
encountered the problem under a 64-bit kernel, but I admit that just
might be random luck.

> > > We also do not see this problem on Intel-based motherboards, with
> > > either 32- or 64-bit kernels.
> >
> > A full raft of documentation -- including acpidump and
> > linux-firmware-kit output, console capture, kernel config, lspci
> > -vvxxx (with apic=debug boot option), dmesg, and /proc/interrupts
> > -- is available at http://www.hogchain.net/m2v/apic-problem/
>
> [06Dh 109 2] Boot Architecture Flags : 0003
>
> for what it is worth, the bit in ACPI that is used to
> disable MSI support is not set -- so as far as the BIOS
> is concerned, this system should support MSI.
>
> Is it an add-in card, or lan-on-motherboard?

This is a PCIe LAN-on-motherboard.

My goal is to understand whether this is a problem in the atl1 driver,
or a problem on the motherboard. If it's the former, obviously I want
to fix it. If it's the latter, then I want to disable MSI in the driver
when we discover we're running on this motherboard.

Thanks again for taking time to look at this. Any advice or hints you
provide will be greatly appreciated.

Jay
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
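
Len's reading of the "Boot Architecture Flags : 0003" line can be checked with a few lines of Python. This is only a sketch: the bit names and positions below are an assumption taken from the ACPI FADT "IA-PC Boot Architecture Flags" field as commonly documented (bit 3 = "MSI Not Supported"), and the helper names are made up for illustration. Check the positions against the ACPI revision your firmware claims before relying on them.

```python
from typing import List

# Assumed bit layout of the FADT IA-PC Boot Architecture Flags field.
# Only the low four bits are listed; names are illustrative labels.
IAPC_BOOT_ARCH_BITS = {
    0: "LEGACY_DEVICES",
    1: "8042",
    2: "VGA Not Present",
    3: "MSI Not Supported",
}

MSI_NOT_SUPPORTED_BIT = 3


def decode_boot_arch(flags: int) -> List[str]:
    """Return the names of the flag bits that are set, lowest bit first."""
    return [name for bit, name in IAPC_BOOT_ARCH_BITS.items()
            if flags & (1 << bit)]


def bios_allows_msi(flags: int) -> bool:
    """True when the firmware has NOT set the 'MSI Not Supported' bit."""
    return not (flags & (1 << MSI_NOT_SUPPORTED_BIT))


if __name__ == "__main__":
    flags = 0x0003  # value from the acpidump line quoted in the thread
    print(decode_boot_arch(flags))
    print("BIOS allows MSI:", bios_allows_msi(flags))
```

With the value 0003 only bits 0 and 1 are set, which matches the statement that the BIOS does not advertise MSI as unsupported. It says nothing about whether the chipset actually delivers MSI correctly.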
Re: APIC error on 32-bit kernel
> > We're trying to track down the source of a problem that occurs
> > whenever the atl1 network driver is activated on a 32-bit 2.6.21-rc4
>
> and -rc5, -rc6, 2.6.20.x, 2.6.19.3, and probably others.
>
> > We can load the driver just fine, but whenever we activate the
> > network, we see APIC errors (a sample of them are shown here,
> > captured from a serial console):
> >
> > [EMAIL PROTECTED] ~]# echo 8 > /proc/sys/kernel/printk
> > [EMAIL PROTECTED] ~]# [ 93.942012] process `sysctl' is using deprecated
> > sysctl (sysc.
> > [ 94.396609] atl1: eth0 link is up 1000 Mbps full duplex
> > [ 94.498887] APIC error on CPU0: 00(08)
> > [ 94.498534] APIC error on CPU1: 00(08)
> > [ 94.550079] APIC error on CPU0: 08(08)
> > [ 94.549725] APIC error on CPU1: 08(08)
> > [ 94.600915] APIC error on CPU1: 08(08)
> > [ 94.601276] APIC error on CPU0: 08(08)
> > [ 94.652108] APIC error on CPU1: 08(08)
> > [ 94.652470] APIC error on CPU0: 08(08)
> > [ 94.703659] APIC error on CPU0: 08(08)
> > [ 94.703305] APIC error on CPU1: 08(08)
> > [ 94.754852] APIC error on CPU0: 08(40)
> > [ 94.806045] APIC error on CPU0: 40(08)

/* Here is what the APIC error bits mean:
   0: Send CS error
   1: Receive CS error
   2: Send accept error
   3: Receive accept error
   4: Reserved
   5: Send illegal vector
   6: Received illegal vector
   7: Illegal register address
*/

So the 40 means the APIC got an illegal vector.
Certainly this is consistent with the fact that
the errors start when a specific device is being
used. I assume that device is using MSI?

Curious that it is different in 32-bit and 64-bit mode.

> > [ 94.805692] APIC error on CPU1: 08(08)
> > [ 94.857238] APIC error on CPU0: 08(08)
> > [ 94.856884] APIC error on CPU1: 08(08)
> > [ 94.908432] APIC error on CPU0: 08(08)
> > [ 94.908078] APIC error on CPU1: 08(08)
> > [snip, more of the same]
> > [ 98.901156] APIC error on CPU1: 08(08)
> > [ 98.952702] APIC error on CPU0: 08(08)
> > [ 98.952349] APIC error on CPU1: 08(08)
> > [ 99.003895] APIC error on CPU0: 08(08)
> > [ 99.003542] APIC error on CPU1: 08(08)
> >
> > The machine hangs for about 5-10 seconds, then spontaneously reboots
> > without further console output.
>
> I can prompt an oops by pinging my router while the apic errors are
> scrolling by.
>
> >
> > This is an Asus M2V (Via K8T890) motherboard.
> >
> > The problem does not occur on a 32-bit kernel if we boot with
> > pci=nomsi, and it doesn't occur at all on a 64-bit kernel on the same
> > motherboard.

pci=nomsi, works, okay...

> > We also do not see this problem on Intel-based motherboards, with
> > either 32- or 64-bit kernels.
>
> A full raft of documentation -- including acpidump and
> linux-firmware-kit output, console capture, kernel config, lspci -vvxxx
> (with apic=debug boot option), dmesg, and /proc/interrupts -- is
> available at http://www.hogchain.net/m2v/apic-problem/

[06Dh 109 2] Boot Architecture Flags : 0003

for what it is worth, the bit in ACPI that is used to
disable MSI support is not set -- so as far as the BIOS
is concerned, this system should support MSI.

Is it an add-in card, or lan-on-motherboard?

-Len
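
The bit table Len quotes maps directly onto a small decoder. The sketch below (Python, with hypothetical names `ESR_BITS` and `decode_esr`) copies the bit meanings from his comment and applies them to the two values that appear in the log, 08 and 40. It treats each two-digit hex field as a plain bitmask and makes no claim about what the pair of fields in each log line represents.

```python
from typing import List

# Bit meanings copied from the comment quoted in the thread.
ESR_BITS = {
    0: "Send CS error",
    1: "Receive CS error",
    2: "Send accept error",
    3: "Receive accept error",
    4: "Reserved",
    5: "Send illegal vector",
    6: "Received illegal vector",
    7: "Illegal register address",
}


def decode_esr(value: int) -> List[str]:
    """Return the meaning of every bit set in an 8-bit APIC error value."""
    return [name for bit, name in ESR_BITS.items() if value & (1 << bit)]


if __name__ == "__main__":
    for field in ("08", "40"):  # the two non-zero values seen in the log
        print(field, "->", decode_esr(int(field, 16)))
```

Decoded this way, 40 is bit 6 ("Received illegal vector"), as Len says, and the far more frequent 08 is bit 3 ("Receive accept error").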
Re: APIC error on 32-bit kernel
Chuck Ebbert wrote:
> Where is the text of the oops?

In one of the files on the website I referenced. Here's the text...

[ 173.584000] APIC error on CPU1: 08(08)
[ 173.665000] APIC error on CPU0: 08(08)
[ 173.665000] APIC error on CPU1: 08(08)
[ 173.746000] APIC error on CPU0: 08(08)
[ 173.746000] APIC error on CPU1: 08(08)
[ 173.827000] APIC error on CPU0: 08(08)
[ 173.827000] APIC error on CPU1: 08(08)
[ 173.908000] APIC error on CPU0: 08(08)
[ 173.908000] APIC error on CPU1: 08(08)
[ 173.989000] APIC error on CPU0: 08(08)
[ 173.989000] APIC error on CPU1: 08(08)

pinged my router somewhere along about here...

[ 174.069000] BUG: unable to handle kernel NULL pointer dereference<1>BUG: unable to 0
[ 174.069000] printing eip:
[ 174.069000]
[ 174.069000] *pde = 1feb8067
[ 174.069000] Oops: [#1]
[ 174.069000] SMP
[ 174.069000] Modules linked in: nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4d
[ 174.069000] CPU:1
[ 174.069000] EIP:0060:[<>]Not tainted VLI
[ 174.069000] EFLAGS: 00010006 (2.6.21-rc5-git1 #1)
[ 174.069000] EIP is at 0x0
[ 174.069000] eax: 00a0 ebx: dfe99f98 ecx: c07bb000 edx: c074de00
[ 174.069000] esi: 00a0 edi: ebp: esp: c07bbffc
[ 174.069000] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
[ 174.069000] Process beagled-helper (pid: 3393, ti=c07bb000 task=dfe28270 task.ti=df)
[ 174.069000] Stack: c040704b
[ 174.069000] Call Trace:
[ 174.069000] [<c040704b>] do_IRQ+0xac/0xd1
[ 174.069000] [<c040580e>] common_interrupt+0x2e/0x34
[ 174.069000] ===
[ 174.069000] Code: Bad EIP value.
[ 174.069000] EIP: [<>] 0x0 SS:ESP 0068:c07bbffc
[ 174.069000] Kernel panic - not syncing: Fatal exception in interrupt
[ 174.069000] BUG: at arch/i386/kernel/smp.c:546 smp_call_function()
[ 174.069000] [<c0417b4f>] smp_call_function+0x5c/0xc8
[ 174.069000] [<c054052e>] do_unblank_screen+0x2a/0x120
[ 174.069000] [<c0417bd6>] smp_send_stop+0x1b/0x2e
[ 174.069000] [<c04271ca>] panic+0x54/0xf2
[ 174.069000] [<c04062c5>] die+0x1f8/0x22c
[ 174.069000] [<c0623d13>] do_page_fault+0x40c/0x4df
[ 174.069000] [<c0623907>] do_page_fault+0x0/0x4df
[ 174.069000] [<c0622574>] error_code+0x7c/0x84
[ 174.069000] [<c040704b>] do_IRQ+0xac/0xd1
[ 174.069000] [<c040580e>] common_interrupt+0x2e/0x34
[ 174.069000] ===
[ 174.069000] at virtual address
[ 174.069000] printing eip:
[ 174.069000]
[ 174.069000] *pde = 20bd3067
[ 174.069000] Oops: [#2]
[ 174.069000] SMP
[ 174.069000] Modules linked in: nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4d
[ 174.069000] CPU:0
[ 174.069000] EIP:0060:[<>]Not tainted VLI
[ 174.069000] EFLAGS: 00010087 (2.6.21-rc5-git1 #1)
[ 174.069000] EIP is at 0x0
[ 174.069000] eax: 00a0 ebx: c0753f74 ecx: c07ba000 edx: c074de00
[ 174.069000] esi: 00a0 edi: ebp: esp: c07baffc
[ 174.069000] ds: 007b es: 007b fs: 00d8 gs: ss: 0068
[ 174.069000] Process swapper (pid: 0, ti=c07ba000 task=c07094c0 task.ti=c0753000)
[ 174.069000] Stack: c040704b
[ 174.069000] Call Trace:
[ 174.069000] [<c040704b>] do_IRQ+0xac/0xd1
[ 174.069000] [<c040580e>] common_interrupt+0x2e/0x34
[ 174.069000] [<c0403c74>] default_idle+0x3d/0x54
[ 174.069000] [<c040339b>] cpu_idle+0xa3/0xbc
[ 174.069000] [<c0758a37>] start_kernel+0x45d/0x465
[ 174.069000] [<c07581ae>] unknown_bootoption+0x0/0x202
[ 174.069000] ===
[ 174.069000] Code: Bad EIP value.
[ 174.069000] EIP: [<>] 0x0 SS:ESP 0068:c07baffc
[ 174.069000] Kernel panic - not syncing: Fatal exception in interrupt

Short hang, then spontaneous reboot.
Re: APIC error on 32-bit kernel
Jay Cliburn wrote:
> [Adding linux-kernel to the cc list, hoping for wider exposure.]
>
> On Fri, 23 Mar 2007 20:08:17 -0500
> Jay Cliburn <[EMAIL PROTECTED]> wrote:
>
>> We're trying to track down the source of a problem that occurs
>> whenever the atl1 network driver is activated on a 32-bit 2.6.21-rc4
>
> and -rc5, -rc6, 2.6.20.x, 2.6.19.3, and probably others.
>
>> We can load the driver just fine, but whenever we activate the
>> network, we see APIC errors (a sample of them are shown here,
>> captured from a serial console):
>>
>> [EMAIL PROTECTED] ~]# echo 8 > /proc/sys/kernel/printk
>> [EMAIL PROTECTED] ~]# [ 93.942012] process `sysctl' is using deprecated
>> sysctl (sysc.
>> [ 94.396609] atl1: eth0 link is up 1000 Mbps full duplex
>> [ 94.498887] APIC error on CPU0: 00(08)
>> [ 94.498534] APIC error on CPU1: 00(08)
>> [ 94.550079] APIC error on CPU0: 08(08)
>> [ 94.549725] APIC error on CPU1: 08(08)
>> [ 94.600915] APIC error on CPU1: 08(08)
>> [ 94.601276] APIC error on CPU0: 08(08)
>> [ 94.652108] APIC error on CPU1: 08(08)
>> [ 94.652470] APIC error on CPU0: 08(08)
>> [ 94.703659] APIC error on CPU0: 08(08)
>> [ 94.703305] APIC error on CPU1: 08(08)
>> [ 94.754852] APIC error on CPU0: 08(40)
>> [ 94.806045] APIC error on CPU0: 40(08)
>> [ 94.805692] APIC error on CPU1: 08(08)
>> [ 94.857238] APIC error on CPU0: 08(08)
>> [ 94.856884] APIC error on CPU1: 08(08)
>> [ 94.908432] APIC error on CPU0: 08(08)
>> [ 94.908078] APIC error on CPU1: 08(08)
>> [snip, more of the same]
>> [ 98.901156] APIC error on CPU1: 08(08)
>> [ 98.952702] APIC error on CPU0: 08(08)
>> [ 98.952349] APIC error on CPU1: 08(08)
>> [ 99.003895] APIC error on CPU0: 08(08)
>> [ 99.003542] APIC error on CPU1: 08(08)
>>
>> The machine hangs for about 5-10 seconds, then spontaneously reboots
>> without further console output.
>
> I can prompt an oops by pinging my router while the apic errors are
> scrolling by.

Where is the text of the oops?
Re: APIC error on 32-bit kernel
[Adding linux-kernel to the cc list, hoping for wider exposure.]

On Fri, 23 Mar 2007 20:08:17 -0500
Jay Cliburn <[EMAIL PROTECTED]> wrote:

> We're trying to track down the source of a problem that occurs
> whenever the atl1 network driver is activated on a 32-bit 2.6.21-rc4

and -rc5, -rc6, 2.6.20.x, 2.6.19.3, and probably others.

> We can load the driver just fine, but whenever we activate the
> network, we see APIC errors (a sample of them are shown here,
> captured from a serial console):
>
> [EMAIL PROTECTED] ~]# echo 8 > /proc/sys/kernel/printk
> [EMAIL PROTECTED] ~]# [ 93.942012] process `sysctl' is using deprecated
> sysctl (sysc.
> [ 94.396609] atl1: eth0 link is up 1000 Mbps full duplex
> [ 94.498887] APIC error on CPU0: 00(08)
> [ 94.498534] APIC error on CPU1: 00(08)
> [ 94.550079] APIC error on CPU0: 08(08)
> [ 94.549725] APIC error on CPU1: 08(08)
> [ 94.600915] APIC error on CPU1: 08(08)
> [ 94.601276] APIC error on CPU0: 08(08)
> [ 94.652108] APIC error on CPU1: 08(08)
> [ 94.652470] APIC error on CPU0: 08(08)
> [ 94.703659] APIC error on CPU0: 08(08)
> [ 94.703305] APIC error on CPU1: 08(08)
> [ 94.754852] APIC error on CPU0: 08(40)
> [ 94.806045] APIC error on CPU0: 40(08)
> [ 94.805692] APIC error on CPU1: 08(08)
> [ 94.857238] APIC error on CPU0: 08(08)
> [ 94.856884] APIC error on CPU1: 08(08)
> [ 94.908432] APIC error on CPU0: 08(08)
> [ 94.908078] APIC error on CPU1: 08(08)
> [snip, more of the same]
> [ 98.901156] APIC error on CPU1: 08(08)
> [ 98.952702] APIC error on CPU0: 08(08)
> [ 98.952349] APIC error on CPU1: 08(08)
> [ 99.003895] APIC error on CPU0: 08(08)
> [ 99.003542] APIC error on CPU1: 08(08)
>
> The machine hangs for about 5-10 seconds, then spontaneously reboots
> without further console output.

I can prompt an oops by pinging my router while the apic errors are
scrolling by.

>
> This is an Asus M2V (Via K8T890) motherboard.
>
> The problem does not occur on a 32-bit kernel if we boot with
> pci=nomsi, and it doesn't occur at all on a 64-bit kernel on the same
> motherboard.
>
> We also do not see this problem on Intel-based motherboards, with
> either 32- or 64-bit kernels.

A full raft of documentation -- including acpidump and
linux-firmware-kit output, console capture, kernel config, lspci -vvxxx
(with apic=debug boot option), dmesg, and /proc/interrupts -- is
available at http://www.hogchain.net/m2v/apic-problem/

If this is a motherboard problem, that's fine; I'd just like to know
the details so I tell users something more than "it's a motherboard
problem."

Thanks,
Jay
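
For anyone trying to compare captures (for example with and without pci=nomsi), a quick tally of the "APIC error on CPUn: xx(yy)" lines can make the odd entries such as 08(40) stand out from the 08(08) noise. The following Python sketch is illustrative only: the regular expression is written against the line format shown in this report, the function name `tally_apic_errors` is made up, and the two hex fields are kept as opaque strings rather than interpreted.

```python
import re
from collections import Counter

# Matches lines of the form shown in the report, e.g.
#   [ 94.754852] APIC error on CPU0: 08(40)
LINE_RE = re.compile(r"APIC error on CPU(\d+): ([0-9a-f]{2})\(([0-9a-f]{2})\)")


def tally_apic_errors(log_text: str) -> Counter:
    """Count occurrences of each (cpu, first_field, second_field) triple."""
    counts = Counter()
    for match in LINE_RE.finditer(log_text):
        cpu = int(match.group(1))
        counts[(cpu, match.group(2), match.group(3))] += 1
    return counts


# Four lines copied from the console capture above.
SAMPLE = """\
[ 94.703305] APIC error on CPU1: 08(08)
[ 94.754852] APIC error on CPU0: 08(40)
[ 94.806045] APIC error on CPU0: 40(08)
[ 94.805692] APIC error on CPU1: 08(08)
"""

if __name__ == "__main__":
    for (cpu, first, second), n in sorted(tally_apic_errors(SAMPLE).items()):
        print(f"CPU{cpu} {first}({second}): {n}")
```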