On 12.01.2018 18:49, Woodhouse, David wrote:
> When we context switch from a shallow call stack to a deeper one, as we
> 'ret' up the deeper side we may encounter RSB entries (predictions for
> where the 'ret' goes to) which were populated in userspace. This is
> problematic if we have neither SMEP nor KPTI (the latter of which marks
> userspace pages as NX for the kernel), as malicious code in userspace
> may then be executed speculatively. So overwrite the CPU's return
> prediction stack with calls which are predicted to return to an infinite
> loop, to "capture" speculation if this happens. This is required both
> for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI.
>
> On Skylake+ the problem is slightly different, and an *underflow* of the
> RSB may cause errant branch predictions to occur. So there it's not so
> much overwrite, as *filling* the RSB to attempt to prevent it getting
> empty. This is only a partial solution for Skylake+ since there are many
> other conditions which may result in the RSB becoming empty. The full
> solution on Skylake+ is to use IBRS, which will prevent the problem even
> when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
> required on context switch.
>
> Signed-off-by: David Woodhouse <[email protected]>
> Acked-by: Arjan van de Ven <[email protected]>
> ---
(..)
> @@ -213,6 +230,23 @@ static void __init spectre_v2_select_mitigation(void)
>
>  	spectre_v2_enabled = mode;
>  	pr_info("%s\n", spectre_v2_strings[mode]);
> +
> +	/*
> +	 * If we don't have SMEP or KPTI, then we run the risk of hitting
> +	 * userspace addresses in the RSB after a context switch from a
> +	 * shallow call stack to a deeper one. We must fill the entire
> +	 * RSB to avoid that, even when using IBRS.
> +	 *
> +	 * Skylake era CPUs have a separate issue with *underflow* of the
> +	 * RSB, when they will predict 'ret' targets from the generic BTB.
> +	 * IBRS makes that safe, but we need to fill the RSB on context
> +	 * switch if we're using retpoline.
> +	 */
> +	if ((!boot_cpu_has(X86_FEATURE_PTI) &&
> +	     !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
> +		setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> +		pr_info("Filling RSB on context switch\n");
> +	}
Shouldn't the RSB filling on context switch also be done on non-IBPB CPUs, to protect (retpolined) user space tasks from other user space tasks? We already issue an IBPB when switching to high-value user space tasks to protect them from other user space tasks.

Thanks,
Maciej

