On Tue, Mar 05, 2024 at 04:18:24PM -0600, Jeremy Linton wrote:
> The existing arm64 stack randomization uses the kernel rng to acquire
> 5 bits of address space randomization. This is problematic because it
> creates non determinism in the syscall path when the rng needs to be
> generated or reseeded. This shows up as large tail latencies in some
> benchmarks and directly affects the minimum RT latencies as seen by
> cyclictest.
>
> Other architectures are using timers/cycle counters for this function,
> which is sketchy from a randomization perspective because it should be
> possible to estimate this value from knowledge of the syscall return
> time, and from reading the current value of the timer/counters.
>
> So, a poor rng should be better than the cycle counter if it is hard
> to extract the stack offsets sufficiently to be able to detect the
> PRNG's period. Lets downgrade from get_random_u16() to
> prandom_u32_state() under the theory that the danger of someone
> guessing the 1 in 32 per call offset, is larger than that of being
> able to extract sufficient history to accurately predict future
> offsets. Further it should be safer to run with prandom_u32_state than
> disabling stack randomization for those subset of applications where the
> difference in latency is on the order of ~5X worse.
>
> Reported-by: James Yang <[email protected]>
> Reported-by: Shiyou Huang <[email protected]>
> Signed-off-by: Jeremy Linton <[email protected]>
> ---
>  arch/arm64/kernel/syscall.c | 42 ++++++++++++++++++++++++++++++++++++-
>  1 file changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 9a70d9746b66..33b3ea4adff8 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -5,6 +5,7 @@
>  #include <linux/errno.h>
>  #include <linux/nospec.h>
>  #include <linux/ptrace.h>
> +#include <linux/prandom.h>
>  #include <linux/randomize_kstack.h>
>  #include <linux/syscalls.h>
>
> @@ -37,6 +38,45 @@ static long __invoke_syscall(struct pt_regs *regs, syscall_fn_t syscall_fn)
>  	return syscall_fn(regs);
>  }
>
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> +DEFINE_PER_CPU(struct rnd_state, kstackrng);
> +
> +static u16 kstack_rng(void)
> +{
> +	u32 rng = prandom_u32_state(this_cpu_ptr(&kstackrng));
> +
> +	return rng & 0x1ff;
> +}
> +
> +/* Should we reseed? */
> +static int kstack_rng_setup(unsigned int cpu)
> +{
> +	u32 rng_seed;
> +
> +	/* zero should be avoided as a seed */
> +	do {
> +		rng_seed = get_random_u32();
> +	} while (!rng_seed);
> +	prandom_seed_state(this_cpu_ptr(&kstackrng), rng_seed);
> +	return 0;
> +}
> +
> +static int kstack_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/cpuinfo:kstackrandomize",
> +				kstack_rng_setup, NULL);

This will run initial seeding, but don't we need to reseed this with
some kind of frequency?

Otherwise, seems fine to me.

-- 
Kees Cook
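
To make the reseeding question concrete, here is a minimal userspace sketch of
one possible shape: count draws and pull a fresh seed once a budget is used up.
Everything in it is an assumption for illustration, not part of the patch: the
kstack_rng_sketch names, the interval of 4096, and the demo seed source are
hypothetical, and a plain xorshift32 stands in for prandom_u32_state(). Only
the 0x1ff mask and the "avoid a zero seed" loop mirror the quoted code.

```c
#include <stdint.h>

/* Hypothetical reseed budget; the thread does not propose a value. */
#define KSTACK_RESEED_INTERVAL 4096u

struct kstack_rng_sketch {
	uint32_t state;            /* xorshift32 state, must never be zero */
	uint32_t calls_left;       /* draws remaining before the next reseed */
	uint32_t (*seed_fn)(void); /* stand-in for get_random_u32() */
};

/* Deterministic demo seed source so the sketch is testable (hypothetical). */
static uint32_t demo_seed_calls;

static uint32_t demo_seed_source(void)
{
	demo_seed_calls++;
	return 0x9e3779b9u * demo_seed_calls;
}

static void kstack_rng_sketch_reseed(struct kstack_rng_sketch *r)
{
	uint32_t seed;

	/* zero should be avoided as a seed, as in the quoted patch */
	do {
		seed = r->seed_fn();
	} while (!seed);

	r->state = seed;
	r->calls_left = KSTACK_RESEED_INTERVAL;
}

static void kstack_rng_sketch_init(struct kstack_rng_sketch *r,
				   uint32_t (*seed_fn)(void))
{
	r->seed_fn = seed_fn;
	kstack_rng_sketch_reseed(r);
}

static uint16_t kstack_rng_sketch_next(struct kstack_rng_sketch *r)
{
	uint32_t x;

	/* Reseed once the draw budget is exhausted. */
	if (r->calls_left == 0)
		kstack_rng_sketch_reseed(r);
	r->calls_left--;

	/* xorshift32 step (a stand-in, not the kernel's generator) */
	x = r->state;
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	r->state = x;

	/* Same 9-bit mask as the patch's kstack_rng(). */
	return x & 0x1ff;
}
```

Note that reseeding inline like this would put the seed fetch back on the
draw path, which is the latency source the patch is trying to avoid, so an
in-kernel version would presumably want to reseed from somewhere off the
syscall path instead.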
