Re: [Ipmitool-devel] IPMI problem with FAI and wheezy
On Fri, Sep 07, 2012 at 03:02:05PM +0200, Steffen Grunewald wrote:
> Hi,
>
> I'm at my wits' end now with this old system, perhaps one of you can come up with another idea:
>
> The hardware is somewhat old, SuperMicro H8SSL board with IPMI card (BMC) looped into eth0 (Broadcom Tigon3).
> Excerpts from the dmesg file:
>
> [0.00] Linux version 3.2.0-3-amd64 (Debian 3.2.23-1) (debian-ker...@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) #1 SMP Mon Jul 23 02:45:17 UTC 2012
> [0.00] ACPI: FACP 7ffe0290 000F4 (v03 A M I OEMFACP 12000606 MSFT 0097)
> [0.00] ACPI: DSDT 7ffe0410 033A8 (v01 0ABSW 0ABSW005 0005 INTL 02002026)
> [0.884954] tg3 :02:03.0: eth0: Tigon3 [partno(BCM95704A6) rev 2100] (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx
>
> I used to set console=ttyS1,19200n1 in the pxelinux.cfg file, and watch FAI running via serial-over-LAN, but that stops right at the beginning - and the IPMI card cannot be reached afterwards, not by rebooting, nor by applying other tricks. The only way to get the connection back is power-cycling the whole box.
>
> This behaviour did not show up with Squeeze (2.6.32-5 kernel).
>
> I'm suspecting a change in the handling of the eth0/BMC bridge by the tg3 driver, but that's only part of the story: it gets worse.

Actually, the problem has gone away with the latest (3.2.32 vs 3.2.23) kernel now available for Wheezy.

S

___
Ipmitool-devel mailing list
Ipmitool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ipmitool-devel
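Editor's note on the console= setting quoted above: the kernel's serial-console documentation describes the option string after the device name as BBBBPNF (speed, parity n/o/e, number of bits, optional 'r' for RTS flow control), with 9600n8 as the default. The following is only a minimal sketch of how such a parameter breaks down into fields. The names SerialConsole and parse_console are hypothetical helpers made up for this illustration; they are not part of FAI, pxelinux or the kernel.

```python
import re
from typing import NamedTuple, Optional


class SerialConsole(NamedTuple):
    """Fields of a console=<device>,<options> kernel parameter (hypothetical helper type)."""
    device: str
    baud: Optional[int]
    parity: Optional[str]
    bits: Optional[int]
    flow: Optional[str]


# Device name, then an optional ",BBBBPNF" option string:
# BBBB = speed, P = parity (n/o/e), N = number of bits, F = 'r' for RTS.
_CONSOLE_RE = re.compile(
    r"^console=(?P<device>[A-Za-z]+\d*)"
    r"(?:,(?P<baud>\d+)(?P<parity>[noe])?(?P<bits>\d)?(?P<flow>r)?)?$"
)


def parse_console(param: str) -> SerialConsole:
    """Split a console= parameter into its fields; raise ValueError if it does not match."""
    m = _CONSOLE_RE.match(param)
    if m is None:
        raise ValueError(f"not a serial console parameter: {param!r}")
    baud = m.group("baud")
    bits = m.group("bits")
    return SerialConsole(
        device=m.group("device"),
        baud=int(baud) if baud else None,
        parity=m.group("parity"),
        bits=int(bits) if bits else None,
        flow=m.group("flow"),
    )


if __name__ == "__main__":
    # The value used in the message above.
    print(parse_console("console=ttyS1,19200n1"))
```

Read this way, the setting from the message selects ttyS1 at 19200 baud with no parity, and the trailing digit is taken as the number of bits.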
Re: [Ipmitool-devel] IPMI problem with FAI and wheezy
On Mon, Sep 10, 2012 at 10:40:01AM -0700, Albert Chu wrote:
> On Mon, 2012-09-10 at 10:32 -0700, Andy Cress wrote:
> > For this symptom:
> > > Trying to shut down the machine (actually, a whole set of machines, all behaving the same, so it's not a single fault), by running shutdown -h now, will not halt but reboot it. The only way to reliably switch it off seems to be to run ipmitool chassis power soft, then shutdown -h now. The machine will then stay off for exactly 24 hours, then magically restart.
> >
> > It sounds to me like someone is doing one of these every 24 hours:
> > * sending a Wake-On-LAN magic packet to eth0
> > * sending an IPMI LAN chassis control power on command.

Since it happens at random times, and the box affected had been disconnected from mains power, *and* the BMC is not reachable from the network side, I can exclude both of these.
(If it were a WOL packet, other nodes would be affected too. If it were chassis power on, it must have been sent from somewhere that has access to the BMC - certainly not a powered-down mainboard.)

> The get system restart cause IPMI command might be useful for debugging these possibilities. In ipmitool I believe it's the chassis restart_cause command.

# ipmitool chassis restart_cause
System restart cause: unknown

Not very helpful, I guess...

Thanks for your ideas, anything else you can imagine?

S
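Editor's note: since a whole set of machines shows the same behaviour, the restart cause check above could be scripted and logged after each unexpected power-up. The following is a minimal sketch, assuming ipmitool is installed and run locally with sufficient privileges, and that its output has the "System restart cause: ..." form shown in the message. The function names are hypothetical and chosen only for this illustration.

```python
import re
import subprocess
from typing import Optional


def parse_restart_cause(output: str) -> Optional[str]:
    """Extract the cause text from 'System restart cause: <cause>' output, or None."""
    for line in output.splitlines():
        m = re.match(r"\s*System restart cause:\s*(.+?)\s*$", line)
        if m:
            return m.group(1)
    return None


def is_informative(cause: Optional[str]) -> bool:
    """True if the BMC reported something other than nothing or 'unknown'."""
    return cause is not None and cause.strip().lower() != "unknown"


def read_restart_cause() -> Optional[str]:
    """Run the command quoted in the thread on the local machine and parse its output."""
    result = subprocess.run(
        ["ipmitool", "chassis", "restart_cause"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_restart_cause(result.stdout)


if __name__ == "__main__":
    # Parse the exact output reported in the message above.
    cause = parse_restart_cause("System restart cause: unknown\n")
    print(cause, is_informative(cause))
```

On the machine in question this would only confirm the "unknown" result, but on a BMC that does record a cause it gives a quick way to compare many nodes.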
Re: [Ipmitool-devel] IPMI problem with FAI and wheezy
On Mon, Sep 10, 2012 at 3:37 PM, Steffen Grunewald steffen.grunew...@aei.mpg.de wrote:
> On Fri, Sep 07, 2012 at 08:02:32AM -0700, Andy Cress wrote:
> > Steffen,
> > Sounds like a firmware bug to me.
> > Is there a later firmware version for this board?
>
> Nothing I'm aware of - as I said, those boxen are 6 years old now.
>
> Looking for an explanation of the IPMI behaviour, I found that chassis power soft is connected with a sysctl named IPMI_CHASSIS_CTL_ACPI_SOFT, and Debian Wheezy's kernel doesn't have any /proc/acpi structure anymore (and supposedly, some other acpi functionality probably has moved as well) - that's why the expected shutdown doesn't happen...
>
> Why there's an alarm being set that wakes up the machine after 24 hours, that's still unknown, probably there's a date-less clock in the BMC? (Cutting power, and re-connecting to mains, doesn't change the behaviour.)
>
> With Squeeze kernels, everything worked. I wouldn't expect buggy (or old) firmware to interact with kernels in such a way, and am suspecting a bug in the tg3 driver instead :(
>
> Got to UTS, I guess.

Have you considered the possibility that this might be a kernel bug? Have you tried a vanilla kernel instead of the Debian stock kernel? Or compared kernel configs to guess/bisect the problem? Maybe it's just some missing kernel feature that wasn't compiled in. Things do get broken, even in the kernel.

Regards,
Z.

> > -----Original Message-----
> > From: Steffen Grunewald [mailto:steffen.grunew...@aei.mpg.de]
> > Sent: Friday, September 07, 2012 9:02 AM
> > To: FAI mailing list
> > Cc: ipmitool developers list
> > Subject: [Ipmitool-devel] IPMI problem with FAI and wheezy
> >
> > Hi,
> >
> > I'm at my wits' end now with this old system, perhaps one of you can come up with another idea:
> >
> > The hardware is somewhat old, SuperMicro H8SSL board with IPMI card (BMC) looped into eth0 (Broadcom Tigon3).
> > Excerpts from the dmesg file:
> >
> > [0.00] Linux version 3.2.0-3-amd64 (Debian 3.2.23-1) (debian-ker...@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) #1 SMP Mon Jul 23 02:45:17 UTC 2012
> > [0.00] ACPI: FACP 7ffe0290 000F4 (v03 A M I OEMFACP 12000606 MSFT 0097)
> > [0.00] ACPI: DSDT 7ffe0410 033A8 (v01 0ABSW 0ABSW005 0005 INTL 02002026)
> > [0.884954] tg3 :02:03.0: eth0: Tigon3 [partno(BCM95704A6) rev 2100] (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx
> >
> > I used to set console=ttyS1,19200n1 in the pxelinux.cfg file, and watch FAI running via serial-over-LAN, but that stops right at the beginning - and the IPMI card cannot be reached afterwards, not by rebooting, nor by applying other tricks. The only way to get the connection back is power-cycling the whole box.
> >
> > This behaviour did not show up with Squeeze (2.6.32-5 kernel).
> >
> > I'm suspecting a change in the handling of the eth0/BMC bridge by the tg3 driver, but that's only part of the story: it gets worse.
> >
> > Trying to shut down the machine (actually, a whole set of machines, all behaving the same, so it's not a single fault), by running shutdown -h now, will not halt but reboot it. The only way to reliably switch it off seems to be to run ipmitool chassis power soft, then shutdown -h now. The machine will then stay off for exactly 24 hours, then magically restart.
> >
> > Needless to say I didn't change any BIOS settings, nor implemented kind of a watchdog on the BMC.
> >
> > Is there anything I can do to nail down the problem?
> >
> > Thank you in advance for your suggestions.
> >
> > Steffen
> --
> Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
> Cluster Admin * - * http://www.aei.mpg.de/
> * e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7274,fax:7298}
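Editor's note: the suggestion in this message to compare kernel configs can be sketched in a few lines. Debian installs each kernel's build configuration as /boot/config-<kernel version>, so the Squeeze and Wheezy files can be read and compared option by option. The code below is only an illustration: the function names are hypothetical, and the two sample config snippets in the usage part are made up for the example, not taken from the actual Debian kernels.

```python
import re
from typing import Dict, Tuple

_SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text: str) -> Dict[str, str]:
    """Map option names to values; '# CONFIG_X is not set' is recorded as 'n'."""
    options: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        m = _SET_RE.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET_RE.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_kernel_configs(old: str, new: str) -> Dict[str, Tuple[str, str]]:
    """Return {option: (old value, new value)} for every option that differs."""
    a = parse_kernel_config(old)
    b = parse_kernel_config(new)
    return {
        name: (a.get(name, "absent"), b.get(name, "absent"))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    old_cfg = "CONFIG_ACPI_PROCFS=y\nCONFIG_TIGON3=m\n"
    new_cfg = "# CONFIG_ACPI_PROCFS is not set\nCONFIG_TIGON3=m\n"
    for name, (before, after) in diff_kernel_configs(old_cfg, new_cfg).items():
        print(f"{name}: {before} -> {after}")
```

In practice the two strings would come from reading the config files of the working and the failing kernel. Filtering the result for ACPI- or tg3-related option names would narrow the list to the areas suspected in the thread.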
Re: [Ipmitool-devel] IPMI problem with FAI and wheezy
On Mon, 2012-09-10 at 10:32 -0700, Andy Cress wrote:
> For this symptom:
> > Trying to shut down the machine (actually, a whole set of machines, all behaving the same, so it's not a single fault), by running shutdown -h now, will not halt but reboot it. The only way to reliably switch it off seems to be to run ipmitool chassis power soft, then shutdown -h now. The machine will then stay off for exactly 24 hours, then magically restart.
>
> It sounds to me like someone is doing one of these every 24 hours:
> * sending a Wake-On-LAN magic packet to eth0
> * sending an IPMI LAN chassis control power on command.

The get system restart cause IPMI command might be useful for debugging these possibilities. In ipmitool I believe it's the chassis restart_cause command.

Al

> Andy

--
Albert Chu
ch...@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
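Editor's note: for readers unfamiliar with the first possibility listed above, a Wake-On-LAN magic packet is simply six 0xFF bytes followed by sixteen repetitions of the target's MAC address, and it may sit anywhere inside a frame's payload. The sketch below builds such a payload and checks captured bytes for it, which could help rule this cause in or out from a packet capture. The function names and the MAC address in the usage part are hypothetical, chosen only for this illustration.

```python
def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte magic packet payload for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Synchronization stream (6 x 0xFF) followed by the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def contains_magic_packet(payload: bytes, mac: str) -> bool:
    """True if the captured payload contains a magic packet addressed to this MAC."""
    return build_magic_packet(mac) in payload


if __name__ == "__main__":
    # Placeholder MAC address, not the one from the thread (which was redacted).
    packet = build_magic_packet("00:11:22:33:44:55")
    print(len(packet), contains_magic_packet(b"header" + packet, "00:11:22:33:44:55"))
```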
Re: [Ipmitool-devel] IPMI problem with FAI and wheezy
Steffen,

Sounds like a firmware bug to me.
Is there a later firmware version for this board?

Andy

-----Original Message-----
From: Steffen Grunewald [mailto:steffen.grunew...@aei.mpg.de]
Sent: Friday, September 07, 2012 9:02 AM
To: FAI mailing list
Cc: ipmitool developers list
Subject: [Ipmitool-devel] IPMI problem with FAI and wheezy

Hi,

I'm at my wits' end now with this old system, perhaps one of you can come up with another idea:

The hardware is somewhat old, SuperMicro H8SSL board with IPMI card (BMC) looped into eth0 (Broadcom Tigon3).
Excerpts from the dmesg file:

[0.00] Linux version 3.2.0-3-amd64 (Debian 3.2.23-1) (debian-ker...@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) #1 SMP Mon Jul 23 02:45:17 UTC 2012
[0.00] ACPI: FACP 7ffe0290 000F4 (v03 A M I OEMFACP 12000606 MSFT 0097)
[0.00] ACPI: DSDT 7ffe0410 033A8 (v01 0ABSW 0ABSW005 0005 INTL 02002026)
[0.884954] tg3 :02:03.0: eth0: Tigon3 [partno(BCM95704A6) rev 2100] (PCIX:133MHz:64-bit) MAC address xx:xx:xx:xx:xx:xx

I used to set console=ttyS1,19200n1 in the pxelinux.cfg file, and watch FAI running via serial-over-LAN, but that stops right at the beginning - and the IPMI card cannot be reached afterwards, not by rebooting, nor by applying other tricks. The only way to get the connection back is power-cycling the whole box.

This behaviour did not show up with Squeeze (2.6.32-5 kernel).

I'm suspecting a change in the handling of the eth0/BMC bridge by the tg3 driver, but that's only part of the story: it gets worse.

Trying to shut down the machine (actually, a whole set of machines, all behaving the same, so it's not a single fault), by running shutdown -h now, will not halt but reboot it. The only way to reliably switch it off seems to be to run ipmitool chassis power soft, then shutdown -h now. The machine will then stay off for exactly 24 hours, then magically restart.

Needless to say I didn't change any BIOS settings, nor implemented kind of a watchdog on the BMC.
Is there anything I can do to nail down the problem?

Thank you in advance for your suggestions.

Steffen