On Tue, 05 Dec 2017 14:04:42 +1100 Michael Ellerman <m...@ellerman.id.au> wrote:

> Hi Nick,
>
> Sorry I didn't reply sooner.
>
> Nicholas Piggin <npig...@gmail.com> writes:
>
> > kexec can leave MMU registers set when booting into a new kernel, PIDR
> > in particular. The boot sequence does not zero PIDR, so it only gets
> > set when CPUs first switch to a userspace process (until then it's
> > running a kernel thread with effective PID = 0).
> >
> > This leaves a window where a process table entry and page tables are
> > set up due to user processes running on other CPUs, that happen to
> > match with a stale PID. The CPU with that PID may cause speculative
> > accesses that address quadrant 0, which will result in cached
> > translations and PWC for that process, on a CPU which is not in the
> > mm_cpumask and so they will not get invalidated properly.
> >
> > The most common result is the kernel hanging in infinite page fault
> > loops soon after kexec (usually in schedule_tail, which is usually the
> > first non-speculative quadrant 0 access to a new PID) due to a stale
> > PWC. However being a stale translation error, it could result in
> > anything up to security and data corruption errors.
> >
> > Fix this by zeroing out PIDR before setting PTCR.
> >
> > LPIDR is also not initialized, and may cause a similar issue with
> > speculative access to quadrant 1/2. This has not been observed, but
> > LPIDR is cleared to prevent that possibility.
>
> Isn't LPID initialised in __setup_cpu_power9() and __restore_cpu_power9() ?
>
> eg:
>
> _GLOBAL(__setup_cpu_power9)
>	mflr	r11
>	bl	__init_FSCR
>	bl	__init_PMU
>	bl	__init_hvmode_206
>	mtlr	r11
>	beqlr
>	li	r0,0
>	mtspr	SPRN_PSSCR,r0
>	mtspr	SPRN_LPID,r0
>
>
> Similarly, shouldn't we be doing the PID initialisation there as well?

Hmm, yes I must have missed that! Yes that would be the best place to
put it.

Thanks,
Nick