Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

Ingo Molnar Sat, 20 Jan 2018 05:13:37 -0800

* Ingo Molnar <[email protected]> wrote:

> 2)
> 
> using global variables, which is unsafe in early code if the kernel is 
> relocatable.
> 
> The bisected to commit uses a new sme_populate_pgd_data to collect variables 
> that 
> were already on the stack, which should be position independent and safe.
> 
> But the other commits use sme_active(), which does:
> 
> bool sme_active(void)
> {
>         return sme_me_mask && !sev_enabled;
> }
> EXPORT_SYMBOL(sme_active);
> 
> And that looks PIC-unsafe to me, as both are globals:
> 
> u64 sme_me_mask __section(.data) = 0;
> EXPORT_SYMBOL(sme_me_mask);
> 
> Does the code start working if you force sme_active() to 0 while keeping the 
> function call, i.e. something like the hack below?


BTW., this aspect of the boot code is really fragile, and depending on compiler 
there could be unsafe relocations generated without it being 'obvious' from the 
patch itself. It's also pretty compiler and code layout dependent ...

A good way to check this I think would be to turn off CONFIG_RELOCATABLE=y in 
the 
.config - does that make the kernel boot again?

If that makes a difference then we need to take a look at the relocations in 
the 
two key files, with CONFIG_RELOCATABLE=y turned back on:

  objdump -r arch/x86/kernel/head64.o
  objdump -r arch/x86/mm/mem_encrypt.o

There's three types of relocations that should be there normally:

#define R_X86_64_64             1       /* Direct 64 bit  */
#define R_X86_64_PC32           2       /* PC relative 32 bit signed */
#define R_X86_64_32S            11      /* Direct 32 bit sign extended */

Only R_X86_64_PC32 is safe as-is, R_X86_64_32S needs to be used via 
fixup_pointer().

What makes this difficult in the SME context is that the early boot portion of 
arch/x86/mm/mem_encrypt.c is not separated out, but mixed in with later code.

I missed this aspect when reviewing and merging this code :-(

Maybe a diff of the list of relocations of the before/after commit points would 
be 
nice.

I.e. does something like:

  git checkout <last_working_commit_sha1>
  objdump -r arch/x86/mm/mem_encrypt.o  | grep R_X86 | cut -d' ' -f2- > 
working.relocs

  git checkout <first_broken_commit_sha1>
  objdump -r arch/x86/mm/mem_encrypt.o  | grep R_X86 | cut -d' ' -f2- > 
broken.relocs

  diff -up working.relocs broken.relocs

show any changes to the relocations?

Side note:

Regardless of whether it's the root cause for this regression we definitely 
need 
to improve the relocations robustness of early boot code: at minimum we should 
isolate all critical functionality into a separate section, and then add 
tooling 
checks to make sure all relocations are safe.

Thanks,

        Ingo

Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

Reply via email to