On Tue, Feb 20, 2024 at 08:02:58PM -0600, Jeremy Linton wrote:
> The existing arm64 stack randomization uses the kernel rng to acquire
> 5 bits of address space randomization. This is problematic because it
> creates non determinism in the syscall path when the rng needs to be
> generated or reseeded. This shows up as large tail latencies in some
> benchmarks and directly affects the minimum RT latencies as seen by
> cyclictest.
Some questions:

- for benchmarks, why not disable kstack randomization?
- if the existing pRNG reseeding is a problem here, why isn't it a
  problem in the many other places it's used?
- I thought the pRNG already did out-of-line reseeding?

> Other architectures are using timers/cycle counters for this function,
> which is sketchy from a randomization perspective because it should be
> possible to estimate this value from knowledge of the syscall return
> time, and from reading the current value of the timer/counters.

The expectation is that it would be, at best, unstable.

> So, a poor rng should be better than the cycle counter if it is hard
> to extract the stack offsets sufficiently to be able to detect the
> PRNG's period.
>
> So, we can potentially choose a 'better' or larger PRNG, going as far
> as using one of the CSPRNGs already in the kernel, but the overhead
> increases appropriately. Further, there are a few options for
> reseeding, possibly out of the syscall path, but is it even useful in
> this case?

I'd love to find a way to avoid a pRNG that could be reconstructed
given enough samples. (But perhaps this xorshift RNG resists that?)
-Kees

> Reported-by: James Yang <[email protected]>
> Reported-by: Shiyou Huang <[email protected]>
> Signed-off-by: Jeremy Linton <[email protected]>
> ---
>  arch/arm64/kernel/syscall.c | 55 ++++++++++++++++++++++++++++++++++++-
>  1 file changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 9a70d9746b66..70143cb8c7be 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -37,6 +37,59 @@ static long __invoke_syscall(struct pt_regs *regs, syscall_fn_t syscall_fn)
>  	return syscall_fn(regs);
>  }
>
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> +DEFINE_PER_CPU(u32, kstackrng);
> +static u32 xorshift32(u32 state)
> +{
> +	/*
> +	 * From top of page 4 of Marsaglia, "Xorshift RNGs"
> +	 * This algorithm is intended to have a period 2^32 -1
> +	 * And should not be used anywhere else outside of this
> +	 * code path.
> +	 */
> +	state ^= state << 13;
> +	state ^= state >> 17;
> +	state ^= state << 5;
> +	return state;
> +}
> +
> +static u16 kstack_rng(void)
> +{
> +	u32 rng = raw_cpu_read(kstackrng);
> +
> +	rng = xorshift32(rng);
> +	raw_cpu_write(kstackrng, rng);
> +	return rng & 0x1ff;
> +}
> +
> +/* Should we reseed? */
> +static int kstack_rng_setup(unsigned int cpu)
> +{
> +	u32 rng_seed;
> +
> +	do {
> +		rng_seed = get_random_u32();
> +	} while (!rng_seed);
> +	raw_cpu_write(kstackrng, rng_seed);
> +	return 0;
> +}
> +
> +static int kstack_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/cpuinfo:kstackrandomize",
> +				kstack_rng_setup, NULL);
> +	if (ret < 0)
> +		pr_err("kstack: failed to register rng callbacks.\n");
> +	return 0;
> +}
> +
> +arch_initcall(kstack_init);
> +#else
> +static u16 kstack_rng(void) { return 0; }
> +#endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
> +
>  static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
>  			   unsigned int sc_nr,
>  			   const syscall_fn_t syscall_table[])
> @@ -66,7 +119,7 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
>  	 *
>  	 * The resulting 5 bits of entropy is seen in SP[8:4].
>  	 */
> -	choose_random_kstack_offset(get_random_u16() & 0x1FF);
> +	choose_random_kstack_offset(kstack_rng());
>  }
>
>  static inline bool has_syscall_work(unsigned long flags)
> --
> 2.43.0

-- 
Kees Cook
