Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries

2020-08-13 Thread Ingo Molnar


* Mike Rapoport  wrote:

> On Mon, Aug 10, 2020 at 07:27:33AM -0700, Dave Hansen wrote:
> > ... adding Kirill
> > 
> > On 8/7/20 1:40 AM, Joerg Roedel wrote:
> > > +		lvl = "p4d";
> > > +		p4d = p4d_alloc(&init_mm, pgd, addr);
> > > +		if (!p4d)
> > > +			goto failed;
> > >  
> > > +		/*
> > > +		 * With 5-level paging the P4D level is not folded. So the PGDs
> > > +		 * are now populated and there is no need to walk down to the
> > > +		 * PUD level.
> > > +		 */
> > >  		if (pgtable_l5_enabled())
> > >  			continue;
> > 
> > It's early and I'm a coffee or two short of awake, but I had to stare at
> > the comment for a bit to make sense of it.
> > 
> > It feels wrong, I think, because the 5-level code usually ends up doing
> > *more* allocations and in this case, it is _appearing_ to do fewer.
> > Would something like this make sense?
> 
> Unless I miss something, with 5 levels vmalloc mappings are shared at
> p4d level, so allocating a p4d page would be enough. With 4 levels,
> p4d_alloc() is a nop and pud is the first actually populated level below
> pgd.
> 
> > /*
> >  * The goal here is to allocate all possibly required
> >  * hardware page tables pointed to by the top hardware
> >  * level.
> >  *
> >  * On 4-level systems, the p4d layer is folded away and
> >  * the above code does no preallocation.  Below, go down
> >  * to the pud _software_ level to ensure the second
> >  * hardware level is allocated.
> >  */

Would be nice to integrate all these explanations into the comment itself?

Thanks,

Ingo


Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries

2020-08-10 Thread Mike Rapoport
On Mon, Aug 10, 2020 at 07:27:33AM -0700, Dave Hansen wrote:
> ... adding Kirill
> 
> On 8/7/20 1:40 AM, Joerg Roedel wrote:
> > +		lvl = "p4d";
> > +		p4d = p4d_alloc(&init_mm, pgd, addr);
> > +		if (!p4d)
> > +			goto failed;
> >  
> > +		/*
> > +		 * With 5-level paging the P4D level is not folded. So the PGDs
> > +		 * are now populated and there is no need to walk down to the
> > +		 * PUD level.
> > +		 */
> >  		if (pgtable_l5_enabled())
> >  			continue;
> 
> It's early and I'm a coffee or two short of awake, but I had to stare at
> the comment for a bit to make sense of it.
> 
> It feels wrong, I think, because the 5-level code usually ends up doing
> *more* allocations and in this case, it is _appearing_ to do fewer.
> Would something like this make sense?

Unless I miss something, with 5 levels vmalloc mappings are shared at
p4d level, so allocating a p4d page would be enough. With 4 levels,
p4d_alloc() is a nop and pud is the first actually populated level below
pgd.

> /*
>  * The goal here is to allocate all possibly required
>  * hardware page tables pointed to by the top hardware
>  * level.
>  *
>  * On 4-level systems, the p4d layer is folded away and
>  * the above code does no preallocation.  Below, go down
>  * to the pud _software_ level to ensure the second
>  * hardware level is allocated.
>  */
> 
> 
> > -   pud = pud_offset(p4d, addr);
> > -   if (pud_none(*pud)) {
> > -   /* Ends up here only with 4-level paging */
> > -   pud = pud_alloc(_mm, p4d, addr);
> > -   if (!pud) {
> > -   lvl = "pud";
> > -   goto failed;
> > -   }
> > -   }
> > +   lvl = "pud";
> > +   pud = pud_alloc(_mm, p4d, addr);
> > +   if (!pud)
> > +   goto failed;
> > }

-- 
Sincerely yours,
Mike.
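
A user-space sketch of the point above, for readers following along. It is a toy model under stated assumptions, not kernel code: toy_entry, toy_populate(), toy_p4d_alloc(), toy_pud_alloc() and toy_preallocate() are hypothetical names, and a NULL table pointer stands in for a "none" entry. It only illustrates that both paging modes end up with exactly one page hanging off the top-level entry.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy entry: a NULL table pointer models a "none" entry. */
typedef struct toy_entry {
	struct toy_entry *table;
} toy_entry;

/* Counts how many table pages the model has allocated so far. */
static int toy_pages_allocated;

/* Populate the parent if it is empty, then return the first child entry. */
static toy_entry *toy_populate(toy_entry *parent)
{
	if (!parent->table) {
		parent->table = calloc(4, sizeof(toy_entry));
		if (!parent->table)
			return NULL;
		toy_pages_allocated++;
	}
	return &parent->table[0];
}

/*
 * p4d step: a real allocation with 5 levels; with 4 levels the level is
 * folded, so the "p4d" is just the pgd entry and nothing is allocated.
 */
static toy_entry *toy_p4d_alloc(toy_entry *pgd, bool l5)
{
	return l5 ? toy_populate(pgd) : pgd;
}

/* pud step: always a real allocation in this model. */
static toy_entry *toy_pud_alloc(toy_entry *p4d)
{
	return toy_populate(p4d);
}

/* Mirrors the shape of the patched loop body; false on allocation failure. */
static bool toy_preallocate(toy_entry *pgd, bool l5)
{
	toy_entry *p4d = toy_p4d_alloc(pgd, l5);

	if (!p4d)
		return false;
	if (l5)
		return true;	/* the p4d page already populated the pgd */
	return toy_pud_alloc(p4d) != NULL;
}
```

Calling toy_preallocate() once with l5 false and once with l5 true on two empty entries raises toy_pages_allocated by one each time, which is why the 5-level path only appears to do less work.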


Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries

2020-08-10 Thread Dave Hansen
... adding Kirill

On 8/7/20 1:40 AM, Joerg Roedel wrote:
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
>  
> +		/*
> +		 * With 5-level paging the P4D level is not folded. So the PGDs
> +		 * are now populated and there is no need to walk down to the
> +		 * PUD level.
> +		 */
>  		if (pgtable_l5_enabled())
>  			continue;

It's early and I'm a coffee or two short of awake, but I had to stare at
the comment for a bit to make sense of it.

It feels wrong, I think, because the 5-level code usually ends up doing
*more* allocations and in this case, it is _appearing_ to do fewer.
Would something like this make sense?

/*
 * The goal here is to allocate all possibly required
 * hardware page tables pointed to by the top hardware
 * level.
 *
 * On 4-level systems, the p4d layer is folded away and
 * the above code does no preallocation.  Below, go down
 * to the pud _software_ level to ensure the second
 * hardware level is allocated.
 */


> -		pud = pud_offset(p4d, addr);
> -		if (pud_none(*pud)) {
> -			/* Ends up here only with 4-level paging */
> -			pud = pud_alloc(&init_mm, p4d, addr);
> -			if (!pud) {
> -				lvl = "pud";
> -				goto failed;
> -			}
> -		}
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
>  	}


Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries

2020-08-07 Thread Mike Rapoport
On Fri, Aug 07, 2020 at 10:40:13AM +0200, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> The code for preallocate_vmalloc_pages() was written under the
> assumption that the p4d_offset() and pud_offset() functions will perform
> present checks before dereferencing the parent entries.
> 
> This assumption is wrong and leads to a bug in the code which causes the
> physical address found in the PGD to be used as a page-table page, even if
> the PGD is not present.
> 
> So the code flow currently is:
> 
> 	pgd = pgd_offset_k(addr);
> 	p4d = p4d_offset(pgd, addr);
> 	if (p4d_none(*p4d))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
> 
> This lacks a check for pgd_none() at least; the correct flow would be:
> 
> 	pgd = pgd_offset_k(addr);
> 	if (pgd_none(*pgd))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
> 	else
> 		p4d = p4d_offset(pgd, addr);
> 
> But this is the same flow that the p4d_alloc() and the pud_alloc()
> functions use internally, so there is no need to duplicate them.
> 
> Remove the p?d_none() checks from the function and just call into
> p4d_alloc() and pud_alloc() to correctly pre-allocate the PGD entries.
> 
> Reported-by: Jason A. Donenfeld 
> Fixes: 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area")
> Signed-off-by: Joerg Roedel 

LGTM,

Reviewed-by: Mike Rapoport 

> ---
>  arch/x86/mm/init_64.c | 31 +++++++++++++------------------
>  1 file changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 3f4e29a78f2b..449e071240e1 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1253,28 +1253,23 @@ static void __init preallocate_vmalloc_pages(void)
>  		p4d_t *p4d;
>  		pud_t *pud;
>  
> -		p4d = p4d_offset(pgd, addr);
> -		if (p4d_none(*p4d)) {
> -			/* Can only happen with 5-level paging */
> -			p4d = p4d_alloc(&init_mm, pgd, addr);
> -			if (!p4d) {
> -				lvl = "p4d";
> -				goto failed;
> -			}
> -		}
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
>  
> +		/*
> +		 * With 5-level paging the P4D level is not folded. So the PGDs
> +		 * are now populated and there is no need to walk down to the
> +		 * PUD level.
> +		 */
>  		if (pgtable_l5_enabled())
>  			continue;
>  
> -		pud = pud_offset(p4d, addr);
> -		if (pud_none(*pud)) {
> -			/* Ends up here only with 4-level paging */
> -			pud = pud_alloc(&init_mm, p4d, addr);
> -			if (!pud) {
> -				lvl = "pud";
> -				goto failed;
> -			}
> -		}
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
>  	}
>  
>  	return;
> -- 
> 2.26.2
> 

-- 
Sincerely yours,
Mike.


Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries

2020-08-07 Thread Jason A. Donenfeld
On Fri, Aug 7, 2020 at 10:40 AM Joerg Roedel  wrote:
>
> From: Joerg Roedel 
>
> The code for preallocate_vmalloc_pages() was written under the
> assumption that the p4d_offset() and pud_offset() functions will perform
> present checks before dereferencing the parent entries.
>
> This assumption is wrong and leads to a bug in the code which causes the
> physical address found in the PGD to be used as a page-table page, even if
> the PGD is not present.
>
> So the code flow currently is:
>
> 	pgd = pgd_offset_k(addr);
> 	p4d = p4d_offset(pgd, addr);
> 	if (p4d_none(*p4d))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
>
> This lacks a check for pgd_none() at least; the correct flow would be:
>
> 	pgd = pgd_offset_k(addr);
> 	if (pgd_none(*pgd))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
> 	else
> 		p4d = p4d_offset(pgd, addr);
>
> But this is the same flow that the p4d_alloc() and the pud_alloc()
> functions use internally, so there is no need to duplicate them.
>
> Remove the p?d_none() checks from the function and just call into
> p4d_alloc() and pud_alloc() to correctly pre-allocate the PGD entries.
>
> Reported-by: Jason A. Donenfeld 
> Fixes: 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area")
> Signed-off-by: Joerg Roedel 
> ---
>  arch/x86/mm/init_64.c | 31 +++++++++++++------------------
>  1 file changed, 13 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 3f4e29a78f2b..449e071240e1 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1253,28 +1253,23 @@ static void __init preallocate_vmalloc_pages(void)
>  		p4d_t *p4d;
>  		pud_t *pud;
>
> -		p4d = p4d_offset(pgd, addr);
> -		if (p4d_none(*p4d)) {
> -			/* Can only happen with 5-level paging */
> -			p4d = p4d_alloc(&init_mm, pgd, addr);
> -			if (!p4d) {
> -				lvl = "p4d";
> -				goto failed;
> -			}
> -		}
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
>
> +		/*
> +		 * With 5-level paging the P4D level is not folded. So the PGDs
> +		 * are now populated and there is no need to walk down to the
> +		 * PUD level.
> +		 */
>  		if (pgtable_l5_enabled())
>  			continue;
>
> -		pud = pud_offset(p4d, addr);
> -		if (pud_none(*pud)) {
> -			/* Ends up here only with 4-level paging */
> -			pud = pud_alloc(&init_mm, p4d, addr);
> -			if (!pud) {
> -				lvl = "pud";
> -				goto failed;
> -			}
> -		}
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
>  	}
>
>  	return;
> --
> 2.26.2


This appears to fix the issue, so:

Tested-by: Jason A. Donenfeld
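

For readers of the archive, the difference between the offset-style and alloc-style helpers described in the commit message can be sketched in plain user-space C. This is a minimal model under stated assumptions: toy_upper, toy_lower, toy_upper_none(), toy_lower_offset() and toy_lower_alloc() are hypothetical names invented here, a NULL pointer stands in for a non-present entry, and the assert() only makes visible a precondition that the commit message says the real offset helpers do not check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_ENTRIES 4

/* Hypothetical toy types; these are not the kernel's pgd_t/p4d_t. */
typedef struct toy_lower {
	int slot[TOY_ENTRIES];
} toy_lower;

typedef struct toy_upper {
	toy_lower *next;	/* NULL models a non-present upper entry */
} toy_upper;

static bool toy_upper_none(const toy_upper *up)
{
	return up->next == NULL;
}

/*
 * Offset-style helper: follows the parent entry without a present check.
 * The caller must guarantee the parent is populated; the assert documents
 * that precondition for this model only.
 */
static int *toy_lower_offset(toy_upper *up, size_t idx)
{
	assert(up->next != NULL);
	return &up->next->slot[idx % TOY_ENTRIES];
}

/*
 * Alloc-style helper: populate the parent if it is empty, then take the
 * offset. Returns NULL if the allocation fails.
 */
static int *toy_lower_alloc(toy_upper *up, size_t idx)
{
	if (toy_upper_none(up)) {
		toy_lower *tbl = calloc(1, sizeof(*tbl));

		if (!tbl)
			return NULL;
		up->next = tbl;
	}
	return toy_lower_offset(up, idx);
}
```

In this model a caller that starts from an empty toy_upper can call toy_lower_alloc() directly and needs no separate none check, which is the simplification the patch relies on. Calling toy_lower_offset() first on an empty entry corresponds to the bug being fixed.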