http://bugzilla.kernel.org/show_bug.cgi?id=9772
Summary: 2.6.24-rc8 + patches: CPU hot removal while CPU is online leaves system in bad state
Product: ACPI
Version: 2.5
KernelVersion: 2.6.24-rc8 + patch for bug 2884
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Config-Hotplug
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]

Latest working kernel version: Unknown
Earliest failing kernel version: Unknown
Distribution: sles10
Hardware Environment: x86_64
Software Environment:

Problem Description:
I've applied the patches attached to bug 2884 in the kernel bugzilla so that I can work around the deadlock that occurs when you write to the eject node for an ACPI object.

The kernel doesn't properly handle the case of ejecting a CPU that is still online. Doing so leaves the CPU online: it still shows up in /proc/cpuinfo, and its /sys/devices/system/cpu/cpuX node still exists. But the kernel goes ahead and calls the ACPI eject method anyway, which means the hardware is free to be removed at that point. It also cleans up the /sys nodes for the ACPI tree, so it's not possible to request another ejection. This leads to system instability, as things gradually hang when they attempt to interact with the CPU that has gone away.

Writing to the "eject" node for an online CPU should probably result in the ejection request being ignored, because the CPU is still online. The /sys tree node for the ACPI device should be left intact, so that another ejection can be requested after the CPU has been taken offline. The write to the eject sysfs node should probably fail with an error, but that's optional.

Alternatively, the kernel could automatically offline an online CPU before ejecting it. I think it's a matter of taste which behavior you prefer.

As far as I can see, acpi_processor_remove returns -EINVAL if the CPU is online (see acpi_processor_handle_eject), but I don't think this return value is ever looked at, so the eject request is never stopped.
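
To make the difference between the behaviors concrete, here is a minimal user-space C model. It is a sketch only, not kernel code and not a proposed patch: struct model_cpu and all of the model_* and eject_* names are hypothetical and exist only for this illustration. It assumes, as described above, that the remove step reports -EINVAL for an online CPU, and it contrasts an eject path that ignores that result with one that checks it and with one that offlines the CPU first.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of one hot-removable CPU (not kernel code). */
struct model_cpu {
    bool online;        /* the OS is still using the CPU */
    bool sysfs_present; /* the ACPI device node and its "eject" file exist */
    bool ejected;       /* the firmware eject method has been run */
};

/* Stand-in for the driver's remove step: refuse while the CPU is online. */
static int model_remove(const struct model_cpu *cpu)
{
    assert(cpu != NULL);
    return cpu->online ? -EINVAL : 0;
}

/* Tear down the ACPI node and run the (modelled) eject method. */
static void model_do_eject(struct model_cpu *cpu)
{
    cpu->sysfs_present = false;
    cpu->ejected = true;
}

/* Behavior described in the report: the remove result is discarded. */
static int eject_ignoring_result(struct model_cpu *cpu)
{
    (void)model_remove(cpu);
    model_do_eject(cpu);
    return 0;
}

/* First suggestion: stop on error and leave the ACPI node intact. */
static int eject_checking_result(struct model_cpu *cpu)
{
    int ret = model_remove(cpu);

    if (ret)
        return ret; /* the caller can fail the write to "eject" */
    model_do_eject(cpu);
    return 0;
}

/* Alternative suggestion: take the CPU offline first, then eject. */
static int eject_auto_offline(struct model_cpu *cpu)
{
    assert(cpu != NULL);
    cpu->online = false; /* models a successful offline operation */
    return eject_checking_result(cpu);
}
```

In this model, eject_ignoring_result() on an online CPU ends with the CPU still online, the hardware ejected, and no node left to retry with, which is the bad state described above. eject_checking_result() returns -EINVAL and changes nothing, so the request can be repeated once the CPU is offline.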

Steps to reproduce:
Run

    echo 1 > /sys/devices/LNXSYSTM:00/device:00/ACPI0007:01/eject

while the CPU is still online. Watch your shell hang. Wait a few minutes and watch the whole system hang.

--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
_______________________________________________
acpi-bugzilla mailing list
acpi-bugzilla@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/acpi-bugzilla