On 09/02/18 10:25, Joerg Roedel wrote:
> here is the second version of my PTI implementation for
> x86_32, based on tip/x86-pti-for-linus. It took a lot longer
> than I had hoped, but there have been a number of obstacles
> on the way. It also isn't the small patch-set anymore that v1
> was, but compared to it this one actually works :)
>
> The biggest changes were necessary in the entry code, a lot
> of it is moving code around, but there are also significant
> changes to get all cases covered. This includes NMIs and
> exceptions on the kernel exit-path where we are already on
> the entry-stack. To make this work I decided to mostly split
> up the common kernel-exit path into a return-to-kernel,
> return-to-user and return-from-nmi part.
>
> On the page-table side I had to do a lot of special cases
> for PAE because PAE paging is so, well, special. The biggest
> example here is the LDT mapping code, which needs to work on
> the PMD level instead of PGD when PAE is enabled.
>
> During development I also experimented with unshared PMDs
> between the kernel and the user page-tables for PAE. It
> worked by allocating 8k PMDs and using the lower half for
> the kernel and the upper half for the user page-table. While
> this worked and allowed me to NX-protect the user-space
> address-range in the kernel page-table, it also required 5
> order-1 allocations in low-mem for each process. In my
> testing I got this to fail pretty quickly and trigger OOM,
> so I abandoned the approach for now.
>
> Here is how I tested these patches:
>
>   * Booted on a real machine (4C/8T, 16GB RAM) and ran
>     an overnight load-test with 'perf top' running
>     (for the NMIs), the ldt_gdt selftest running in a
>     loop (for more stress on the entry/exit path) and
>     a -j16 kernel compile also running in a loop. The
>     box survived the test, which ran for more than 18
>     hours.
>
>   * Tested most x86 selftests in the kernel on the
>     real machine. This showed no regressions. I did
>     not run the mpx and protection-key tests, as the
>     machine does not support these features, and I
>     also skipped the check_initial_reg_state test, as
>     it caused problems while compiling and it didn't
>     seem relevant enough to fix that for this
>
>   * Boot tested all valid combinations of [NO]HIGHMEM* vs.
>     VMSPLIT* vs. PAE in KVM. All booted fine.
>
>   * Did compile-tests with various configs (allyes,
>     allmod, defconfig, ..., basically what I usually
>     use to test the iommu-tree as well). All compiled.
>
>   * Some basic compile, boot and runtime testing of
>     64 bit to make sure I didn't break anything there.
>
> I did not explicitly test wine and dosemu, but since the
> vm86 and the ldt_gdt self-tests all passed fine I am
> confident that those will also still work.
>
> XENPV is also untested from my side, but I added checks to
> not do the stack switches in the entry-code when XENPV is
> enabled, so hopefully it works. But someone should test it,
> of course.
That's unfortunate. A 32-bit XENPV kernel is vulnerable to Meltdown, too.
I'll check whether 32-bit XENPV still works, though.
Adding support for KPTI with Xen PV should probably be done later. :-)