RE: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Prakhya, Sai Praneeth
> >  /*
> >   * The testcases use internal knowledge of the implementation that 
> > shouldn't
> >   * be exposed to the rest of the kernel. Include these directly here.
> > diff --git a/arch/x86/platform/efi/quirks.c
> > b/arch/x86/platform/efi/quirks.c index 669babcaf245..fb1c44b11235
> > 100644
> > --- a/arch/x86/platform/efi/quirks.c
> > +++ b/arch/x86/platform/efi/quirks.c
> 
> While you are at it, can you please split the EFI part out into a separate 
> patch?

Sure! I will do it in V3.

Regards,
Sai


RE: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Prakhya, Sai Praneeth
> > In general it's good to do on list because others can learn from the
> > answers as well.
> 
> So you somehow managed to keep everyone and the lists in CC nevertheless.
> Did not pay attention to the CC list until I got my reply on the list :)

Sorry! my bad.. yes, the conversation did happen on the list and it's by 
accident.
> 
> > > > > > +   retval = __change_page_attr_set_clr(, 0);

[]

> > > So some questions I have are,
> > > 1. What did Peter Z mean here? "How is that not a TLB invalidation bug ?"
> >
> > __flush_tlb_all() flushes only the mapping on the current CPU. So in a
> > SMP environment it's not sufficient.

Thanks for the explanation. The issue makes sense to me now.

[.]

> > I'll have a second look at the whole thing and reply on list.
> 
> So actually for the problem at hand __flush_tlb_all() works, because that is
> called way before the secondary CPUs are up.
> 
> Ditto for kernel_map_pages_in_pgd().

Yes, you are right. Both the functions are *always* called before SMP 
initialization
and as you rightly pointed out they should be marked with __init.

> But both functions want to be marked
> __init and aside of that they both should contain a warning when they are 
> called
> after the secondary CPUs have been brought up.

Makes sense too.. I will include these changes in V3.

Regards,
Sai


RE: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Thomas Gleixner
Sai,

On Mon, 29 Oct 2018, Thomas Gleixner wrote:
> On Mon, 29 Oct 2018, Prakhya, Sai Praneeth wrote:
> 
> > Hi Thomas and Peter,
> > 
> > [off the mailing list because I wasn't sure if it's a good idea to spam 
> > others with my questions]
> 
> In general it's good to do on list because others can learn from the
> answers as well.

So you somehow managed to keep everyone and the lists in CC
nevertheless. Did not pay attention to the CC list until I got my reply on
the list :)

> > > > > + retval = __change_page_attr_set_clr(, 0);
> > > > > + __flush_tlb_all();
> > > >
> > > > So this looks like you copied it from kernel_map_pages_in_pgd() which
> > > > has been discussed before to be not sufficient, but it can't be
> > > > changed right now due to locking issues.
> > >
> > > Managed to confuse myself. The place which cannot be changed is a 
> > > different
> > > one, but still for your call site __flush_tlb_all() might not be 
> > > sufficient.
> > 
> > Since the mappings are changed, I thought a flush_tlb() might be needed.
> > But, (to be honest), I don't completely understand the x86/mm code. So, 
> > could you 
> > please elaborate the issue more for my better understanding?
> > 
> > So some questions I have are,
> > 1. What did Peter Z mean here? "How is that not a TLB invalidation bug ?"
> 
> __flush_tlb_all() flushes only the mapping on the current CPU. So in a SMP
> environment it's not sufficient.
> 
> > 2. I assumed kernel_map_pages_in_pgd() to be a good code but you said that 
> > it has some locking issues. So, could you please elaborate more on that.
> 
> I corrected myself. It's the other function which cannot use the proper
> cpa_flush.*() functions.
> 
> > Or, could you please provide me some pointers in the source code that I
> > can take a look at so that I could understand the issue much better.
> 
> I'll have a second look at the whole thing and reply on list.

So actually for the problem at hand __flush_tlb_all() works, because that
is called way before the secondary CPUs are up.

Ditto for kernel_map_pages_in_pgd(). But both functions want to be marked
__init and aside of that they both should contain a warning when they are
called after the secondary CPUs have been brought up.

Thanks,

tglx


RE: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Thomas Gleixner
Sai,

On Mon, 29 Oct 2018, Prakhya, Sai Praneeth wrote:

> Hi Thomas and Peter,
> 
> [off the mailing list because I wasn't sure if it's a good idea to spam 
> others with my questions]

In general it's good to do on list because others can learn from the
answers as well.

> > > > +   retval = __change_page_attr_set_clr(, 0);
> > > > +   __flush_tlb_all();
> > >
> > > So this looks like you copied it from kernel_map_pages_in_pgd() which
> > > has been discussed before to be not sufficient, but it can't be
> > > changed right now due to locking issues.
> >
> > Managed to confuse myself. The place which cannot be changed is a different
> > one, but still for your call site __flush_tlb_all() might not be sufficient.
> 
> Since the mappings are changed, I thought a flush_tlb() might be needed.
> But, (to be honest), I don't completely understand the x86/mm code. So, could 
> you 
> please elaborate the issue more for my better understanding?
> 
> So some questions I have are,
> 1. What did Peter Z mean here? "How is that not a TLB invalidation bug ?"

__flush_tlb_all() flushes only the mapping on the current CPU. So in a SMP
environment it's not sufficient.

> 2. I assumed kernel_map_pages_in_pgd() to be a good code but you said that 
> it has some locking issues. So, could you please elaborate more on that.

I corrected myself. It's the other function which cannot use the proper
cpa_flush.*() functions.

> Or, could you please provide me some pointers in the source code that I
> can take a look at so that I could understand the issue much better.

I'll have a second look at the whole thing and reply on list.

Thanks,

tglx


RE: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Prakhya, Sai Praneeth
Hi Thomas and Peter,

[off the mailing list because I wasn't sure if it's a good idea to spam others 
with my questions]

> > > + /*
> > > +  * The typical sequence for unmapping is to find a pte through
> > > +  * lookup_address_in_pgd() (ideally, it should never return NULL
> because
> > > +  * the address is already mapped) and change it's protections.
> > > +  * As pfn is the *target* of a mapping, it's not useful while unmapping.
> > > +  */
> > > + struct cpa_data cpa = {
> > > + .vaddr  = ,
> > > + .pgd= pgd,
> > > + .numpages   = numpages,
> > > + .mask_set   = __pgprot(0),
> > > + .mask_clr   = __pgprot(_PAGE_PRESENT | _PAGE_RW),
> > > + .flags  = 0,
> > > + };
> > > +
> > > + retval = __change_page_attr_set_clr(, 0);
> > > + __flush_tlb_all();
> >
> > So this looks like you copied it from kernel_map_pages_in_pgd() which
> > has been discussed before to be not sufficient, but it can't be
> > changed right now due to locking issues.
>
> Managed to confuse myself. The place which cannot be changed is a different
> one, but still for your call site __flush_tlb_all() might not be sufficient.

Since the mappings are changed, I thought a flush_tlb() might be needed.
But, (to be honest), I don't completely understand the x86/mm code. So, could 
you 
please elaborate the issue more for my better understanding?

So some questions I have are,
1. What did Peter Z mean here? "How is that not a TLB invalidation bug ?"
2. I assumed kernel_map_pages_in_pgd() to be a good code but you said that 
it has some locking issues. So, could you please elaborate more on that.

Or, could you please provide me some pointers in the source code that I can 
take a look at so that I could understand the issue much better.

Regards,
Sai


Re: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Thomas Gleixner
Sai,

On Mon, 29 Oct 2018, Thomas Gleixner wrote:
> On Fri, 26 Oct 2018, Sai Praneeth Prakhya wrote:
> >  
> > +int kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
> > + unsigned long numpages)
> > +{
> > +   int retval;
> > +
> > +   /*
> > +* The typical sequence for unmapping is to find a pte through
> > +* lookup_address_in_pgd() (ideally, it should never return NULL because
> > +* the address is already mapped) and change it's protections.
> > +* As pfn is the *target* of a mapping, it's not useful while unmapping.
> > +*/
> > +   struct cpa_data cpa = {
> > +   .vaddr  = ,
> > +   .pgd= pgd,
> > +   .numpages   = numpages,
> > +   .mask_set   = __pgprot(0),
> > +   .mask_clr   = __pgprot(_PAGE_PRESENT | _PAGE_RW),
> > +   .flags  = 0,
> > +   };
> > +
> > +   retval = __change_page_attr_set_clr(, 0);
> > +   __flush_tlb_all();
> 
> So this looks like you copied it from kernel_map_pages_in_pgd() which has
> been discussed before to be not sufficient, but it can't be changed right
> now due to locking issues.

Managed to confuse myself. The place which cannot be changed is a different
one, but still for your call site __flush_tlb_all() might not be sufficient.

Thanks,

tglx


Re: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Thomas Gleixner
Sai,

On Fri, 26 Oct 2018, Sai Praneeth Prakhya wrote:
>  
> +int kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
> +   unsigned long numpages)
> +{
> + int retval;
> +
> + /*
> +  * The typical sequence for unmapping is to find a pte through
> +  * lookup_address_in_pgd() (ideally, it should never return NULL because
> +  * the address is already mapped) and change it's protections.
> +  * As pfn is the *target* of a mapping, it's not useful while unmapping.
> +  */
> + struct cpa_data cpa = {
> + .vaddr  = ,
> + .pgd= pgd,
> + .numpages   = numpages,
> + .mask_set   = __pgprot(0),
> + .mask_clr   = __pgprot(_PAGE_PRESENT | _PAGE_RW),
> + .flags  = 0,
> + };
> +
> + retval = __change_page_attr_set_clr(, 0);
> + __flush_tlb_all();

So this looks like you copied it from kernel_map_pages_in_pgd() which has
been discussed before to be not sufficient, but it can't be changed right
now due to locking issues.

For this particular use case, this should not be an issue at all, so please
use the proper cpa_flush_*() variant.

> +
> + return retval;
> +}
> +
>  /*
>   * The testcases use internal knowledge of the implementation that shouldn't
>   * be exposed to the rest of the kernel. Include these directly here.
> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index 669babcaf245..fb1c44b11235 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c

While you are at it, can you please split the EFI part out into a separate
patch?

Thanks,

tglx


Re: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Peter Zijlstra
On Fri, Oct 26, 2018 at 02:38:44PM -0700, Sai Praneeth Prakhya wrote:
> +int kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
> +   unsigned long numpages)
> +{
> + int retval;
> +
> + /*
> +  * The typical sequence for unmapping is to find a pte through
> +  * lookup_address_in_pgd() (ideally, it should never return NULL because
> +  * the address is already mapped) and change it's protections.
> +  * As pfn is the *target* of a mapping, it's not useful while unmapping.
> +  */
> + struct cpa_data cpa = {
> + .vaddr  = ,
> + .pgd= pgd,
> + .numpages   = numpages,
> + .mask_set   = __pgprot(0),
> + .mask_clr   = __pgprot(_PAGE_PRESENT | _PAGE_RW),
> + .flags  = 0,
> + };
> +
> + retval = __change_page_attr_set_clr(, 0);
> + __flush_tlb_all();

How is that not a TLB invalidation bug ?

> +
> + return retval;
> +}


RE: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Prakhya, Sai Praneeth
> > +int kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
> > + unsigned long numpages)
> > +{
> > +   int retval;
> > +
> > +   /*
> > +* The typical sequence for unmapping is to find a pte through
> > +* lookup_address_in_pgd() (ideally, it should never return NULL
> because
> > +* the address is already mapped) and change it's protections.
> > +* As pfn is the *target* of a mapping, it's not useful while unmapping.
> > +*/
> > +   struct cpa_data cpa = {
> > +   .vaddr  = ,
> > +   .pgd= pgd,
> > +   .numpages   = numpages,
> > +   .mask_set   = __pgprot(0),
> > +   .mask_clr   = __pgprot(_PAGE_PRESENT | _PAGE_RW),
> > +   .flags  = 0,
> > +   };
> > +
> > +   retval = __change_page_attr_set_clr(, 0);
> > +   __flush_tlb_all();
> 
> So, just to clarify, 'pfn' is kept at 0 here? Might make sense to write it out
> explicitly like 'flags', even if it's not used by 
> __change_page_attr_set_clr().

Sure! Makes sense. I will add it.

Regards,
Sai


Re: [PATCH V2 1/2] x86/efi: Unmap EFI boot services code/data regions from efi_pgd

2018-10-29 Thread Ingo Molnar


* Sai Praneeth Prakhya  wrote:

> Ideally, after kernel assumes control of the platform, firmware shouldn't
> access EFI boot services code/data regions. But, it's noticed that this is
> not so true in many x86 platforms. Hence, during boot, kernel reserves EFI
> boot services code/data regions [1] and maps [2] them to efi_pgd so that
> call to set_virtual_address_map() doesn't fail. After returning from
> set_virtual_address_map(), kernel frees the reserved regions [3] but they
> still remain mapped.
> 
> This means that any code that's running in efi_pgd address space (e.g: any
> EFI runtime service) would still be able to access EFI boot services
> code/data regions but the contents of these regions would have long been
> over written by someone else as they are freed by efi_free_boot_services().
> So, it's important to unmap these regions. After unmapping EFI boot
> services code/data regions, any illegal access by buggy firmware to these
> regions would result in page fault which will be handled by efi specific
> fault handler.
> 
> Unmapping EFI boot services code/data regions will result in clearing
> PAGE_PRESENT bit and it shouldn't bother L1TF cases because it's already
> handled by protnone_mask() at arch/x86/include/asm/pgtable-invert.h.
> 
> [1] Please see efi_reserve_boot_services()
> [2] Please see efi_map_region() -> __map_region()
> [3] Please see efi_free_boot_services()
> 
> Signed-off-by: Sai Praneeth Prakhya 
> Cc: Borislav Petkov 
> Cc: Ingo Molnar 
> Cc: Andy Lutomirski 
> Cc: Dave Hansen 
> Cc: Bhupesh Sharma 
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: Ard Biesheuvel 
> ---
>  arch/x86/include/asm/pgtable_types.h |  2 ++
>  arch/x86/mm/pageattr.c   | 26 ++
>  arch/x86/platform/efi/quirks.c   | 25 +
>  3 files changed, 53 insertions(+)
> 
> diff --git a/arch/x86/include/asm/pgtable_types.h 
> b/arch/x86/include/asm/pgtable_types.h
> index b64acb08a62b..cda04ecf5432 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -566,6 +566,8 @@ extern pmd_t *lookup_pmd_address(unsigned long address);
>  extern phys_addr_t slow_virt_to_phys(void *__address);
>  extern int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long 
> address,
>  unsigned numpages, unsigned long page_flags);
> +extern int kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
> +  unsigned long numpages);
>  #endif   /* !__ASSEMBLY__ */
>  
>  #endif /* _ASM_X86_PGTABLE_DEFS_H */
> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> index 51a5a69ecac9..248f16181bed 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -2147,6 +2147,32 @@ int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, 
> unsigned long address,
>   return retval;
>  }
>  
> +int kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
> +   unsigned long numpages)
> +{
> + int retval;
> +
> + /*
> +  * The typical sequence for unmapping is to find a pte through
> +  * lookup_address_in_pgd() (ideally, it should never return NULL because
> +  * the address is already mapped) and change it's protections.
> +  * As pfn is the *target* of a mapping, it's not useful while unmapping.
> +  */
> + struct cpa_data cpa = {
> + .vaddr  = ,
> + .pgd= pgd,
> + .numpages   = numpages,
> + .mask_set   = __pgprot(0),
> + .mask_clr   = __pgprot(_PAGE_PRESENT | _PAGE_RW),
> + .flags  = 0,
> + };
> +
> + retval = __change_page_attr_set_clr(, 0);
> + __flush_tlb_all();

So, just to clarify, 'pfn' is kept at 0 here? Might make sense to write 
it out explicitly like 'flags', even if it's not used by 
__change_page_attr_set_clr().


> +
> + return retval;
> +}
> +
>  /*
>   * The testcases use internal knowledge of the implementation that shouldn't
>   * be exposed to the rest of the kernel. Include these directly here.
> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index 669babcaf245..fb1c44b11235 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -370,6 +370,24 @@ void __init efi_reserve_boot_services(void)
>   }
>  }
>  
> +/*
> + * Apart from having VA mappings for EFI boot services code/data regions,
> + * (duplicate) 1:1 mappings were also created as a quirk for buggy firmware. 
> So,
> + * unmap both 1:1 and VA mappings.
> + */
> +static void __init efi_unmap_pages(efi_memory_desc_t *md)
> +{
> + pgd_t *pgd = efi_mm.pgd;
> + u64 pa = md->phys_addr;
> + u64 va = md->virt_addr;
> +
> + if (kernel_unmap_pages_in_pgd(pgd, pa, md->num_pages))
> + pr_err("Failed to unmap 1:1 mapping for 0x%llx\n", pa);
> +
> + if (kernel_unmap_pages_in_pgd(pgd, va,