This patch ports microcode improvement patches from linux kernel.
Before you read any further: the early loading method is still the
preferred one and you should always do that. The following patch is
improving the late loading mechanism for long running jobs and cloud use
cases.
Gather all cores and serialize the microcode update on them by doing it
one-by-one to make the late update process as reliable as possible and
avoid potential issues caused by the microcode update.
Signed-off-by: Chao Gao
Tested-by: Chao Gao
[linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
[linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
Cc: Kevin Tian
Cc: Jun Nakajima
Cc: Ashok Raj
Cc: Borislav Petkov
Cc: Thomas Gleixner
Cc: Andrew Cooper
Cc: Jan Beulich
---
Changes in v11:
- Use the sample code of wait_for_state() provided by Jan
- make wait_cpu_call{in,out} take unsigned int to avoid type casting
- do assignment in while clause in control_thread_fn() to eliminate
duplication.
Changes in v10:
- introduce wait_for_state() and set_state() helper functions
- make wait_for_condition() return bool and take const void *
- disable/enable watchdog in control thread
- rename "master" and "slave" thread to "primary" and "secondary"
Changes in v9:
- log __buildin_return_address(0) when timeout
- divide CPUs into three logical sets and they will call different
functions during ucode loading. The 'control thread' is chosen to
coordinate ucode loading on all CPUs. Since only control thread would
set 'loading_state', we can get rid of 'cmpxchg' stuff in v8.
- s/rep_nop/cpu_relax
- each thread updates its revision number itself
- add XENLOG_ERR prefix for each line of multi-line log messages
Changes in v8:
- to support blocking #NMI handling during loading ucode
* introduce a flag, 'loading_state', to mark the start or end of
ucode loading.
* use a bitmap for cpu callin since if cpu may stay in #NMI handling,
there are two places for a cpu to call in. bitmap won't be counted
twice.
* don't wait for all CPUs callout, just wait for CPUs that perform the
update. We have to do this because some threads may be stuck in NMI
handling (where cannot reach the rendezvous).
- emit a warning if the system stays in stop_machine context for more
than 1s
- comment that rdtsc is fine while loading an update
- use cmpxchg() to avoid panic being called on multiple CPUs
- Propagate revision number to other threads
- refine comments and prompt messages
Changes in v7:
- Check whether 'timeout' is 0 rather than "<=0" since it is unsigned int.
- reword the comment above microcode_update_cpu() to clearly state that
one thread per core should do the update.
---
xen/arch/x86/microcode.c | 297 ++-
1 file changed, 267 insertions(+), 30 deletions(-)
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 9c0e5c4..6c23879 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -30,18 +30,52 @@
#include
#include
#include
+#include
#include
#include
#include
+#include
+#include
#include
#include
#include
#include
+/*
+ * Before performing a late microcode update on any thread, we
+ * rendezvous all cpus in stop_machine context. The timeout for
+ * waiting for cpu rendezvous is 30ms. It is the timeout used by
+ * live patching
+ */
+#define MICROCODE_CALLIN_TIMEOUT_US 3
+
+/*
+ * Timeout for each thread to complete update is set to 1s. It is a
+ * conservative choice considering all possible interference.
+ */
+#define MICROCODE_UPDATE_TIMEOUT_US 100
+
static module_t __initdata ucode_mod;
static signed int __initdata ucode_mod_idx;
static bool_t __initdata ucode_mod_forced;
+static unsigned int nr_cores;
+
+/*
+ * These states help to coordinate CPUs during loading an update.
+ *
+ * The semantics of each state is as follow:
+ * - LOADING_PREPARE: initial state of 'loading_state'.
+ * - LOADING_CALLIN: CPUs are allowed to callin.
+ * - LOADING_ENTER: all CPUs have called in. Initiate ucode loading.
+ * - LOADING_EXIT: ucode loading is done or aborted.
+ */
+static enum {
+LOADING_PREPARE,
+LOADING_CALLIN,
+LOADING_ENTER,
+LOADING_EXIT,
+} loading_state;
/*
* If we scan the initramfs.cpio for the early microcode code
@@ -190,6 +224,16 @@ static DEFINE_SPINLOCK(microcode_mutex);
DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
/*
+ * Count the CPUs that have entered, exited the rendezvous and succeeded in
+ * microcode update during late microcode update respectively.
+ *
+ * Note that a bitmap is used for callin to allow cpu to set a bit multiple
+ * times. It is required to do busy-loop in #NMI handling.
+ */
+static cpumask_t cpu_callin_map;
+static atomic_t cpu_out, cpu_updated;
+
+/*
* Return a patch that covers current CPU. If there are multiple patches,
* return the one with the highest revision number. Return error If no
* patch is