From: "Gautham R. Shenoy" <e...@linux.vnet.ibm.com>

Currently on pseries Linux guests, an offlined CPU can be put into one
of the following two states:
   - Long term processor cede (also called extended cede)
   - Returned to the Hypervisor via RTAS "stop-self" call.

This is controlled by the kernel boot parameter "cede_offline=on/off".

By default, offlined CPUs enter extended cede. The PHYP hypervisor
considers CPUs in extended cede to be "active" since they are still
under the control of the Linux guest. Hence, when we change the SMT
mode by offlining the secondary CPUs, the PURR and the RWMR SPRs
continue to count for the offlined CPUs in extended cede as if they
were online. This breaks the accounting in tools such as lparstat.

To fix this, ensure that by default the offlined CPUs are returned to
the Hypervisor via RTAS "stop-self" call by changing the default value
of "cede_offline_enabled" to false.
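
For illustration only, the resulting policy can be modelled in a few
lines of standalone userspace C. This is a sketch, not the kernel code:
the enum and the helpers parse_cede_offline() and pick_offline_state()
are hypothetical names made up for this example, and the
"cede_supported" flag stands in for "on supported pseries platforms".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the two offline states described above. */
enum offline_state {
	OFFLINE_STOP_SELF,      /* returned to the hypervisor via RTAS "stop-self" */
	OFFLINE_EXTENDED_CEDE   /* long term processor cede, still owned by the guest */
};

/*
 * Parse the value of a "cede_offline=" style switch ("on" or "off").
 * Returns 0 on success and -1 on an unrecognised value, in which case
 * *enabled is left unchanged.
 */
static int parse_cede_offline(const char *val, bool *enabled)
{
	assert(val != NULL && enabled != NULL);

	if (strcmp(val, "on") == 0) {
		*enabled = true;
		return 0;
	}
	if (strcmp(val, "off") == 0) {
		*enabled = false;
		return 0;
	}
	return -1;
}

/*
 * Extended cede is chosen only when it was requested and the platform
 * supports it; otherwise the CPU goes back to the hypervisor.
 */
static enum offline_state pick_offline_state(bool cede_enabled,
					     bool cede_supported)
{
	if (cede_enabled && cede_supported)
		return OFFLINE_EXTENDED_CEDE;
	return OFFLINE_STOP_SELF;
}
```

With the new default (cede_enabled == false), pick_offline_state()
yields OFFLINE_STOP_SELF unless "cede_offline=on" is passed explicitly.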

Signed-off-by: Gautham R. Shenoy <e...@linux.vnet.ibm.com>
---
 Documentation/core-api/cpu_hotplug.rst       |  2 +-
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 12 +++++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
index 4a50ab7..5319593 100644
--- a/Documentation/core-api/cpu_hotplug.rst
+++ b/Documentation/core-api/cpu_hotplug.rst
@@ -53,7 +53,7 @@ Command Line Switches
 ``cede_offline={"off","on"}``
   Use this option to disable/enable putting offlined processors to an extended
   ``H_CEDE`` state on supported pseries platforms. If nothing is specified,
-  ``cede_offline`` is set to "on".
+  ``cede_offline`` is set to "off".
 
   This option is limited to the PowerPC architecture.
 
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index bbda646..f9d0366 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -46,7 +46,17 @@ static DEFINE_PER_CPU(enum cpu_state_vals, preferred_offline_state) =
 
 static enum cpu_state_vals default_offline_state = CPU_STATE_OFFLINE;
 
-static bool cede_offline_enabled __read_mostly = true;
+/*
+ * Determines whether the offlined CPUs should be put to a long term
+ * processor cede (called extended cede) for power-saving
+ * purposes. The CPUs in extended cede are still with the Linux Guest
+ * and are not returned to the Hypervisor.
+ *
+ * By default, the offlined CPUs are returned to the hypervisor via
+ * RTAS "stop-self". This behaviour can be changed by passing the
+ * kernel commandline parameter "cede_offline=on".
+ */
+static bool cede_offline_enabled __read_mostly;
 
 /*
  * Enable/disable cede_offline when available.
-- 
1.9.4
