Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
* Mike Rapoport wrote:

> On Mon, Aug 10, 2020 at 07:27:33AM -0700, Dave Hansen wrote:
> > ... adding Kirill
> >
> > On 8/7/20 1:40 AM, Joerg Roedel wrote:
> > > +		lvl = "p4d";
> > > +		p4d = p4d_alloc(&init_mm, pgd, addr);
> > > +		if (!p4d)
> > > +			goto failed;
> > >
> > > +		/*
> > > +		 * With 5-level paging the P4D level is not folded. So the PGDs
> > > +		 * are now populated and there is no need to walk down to the
> > > +		 * PUD level.
> > > +		 */
> > >  		if (pgtable_l5_enabled())
> > >  			continue;
> >
> > It's early and I'm a coffee or two short of awake, but I had to stare at
> > the comment for a bit to make sense of it.
> >
> > It feels wrong, I think, because the 5-level code usually ends up doing
> > *more* allocations and in this case, it is _appearing_ to do fewer.
> > Would something like this make sense?
>
> Unless I miss something, with 5 levels vmalloc mappings are shared at
> p4d level, so allocating a p4d page would be enough. With 4 levels,
> p4d_alloc() is a nop and pud is the first actually populated level below
> pgd.
>
> > 		/*
> > 		 * The goal here is to allocate all possibly required
> > 		 * hardware page tables pointed to by the top hardware
> > 		 * level.
> > 		 *
> > 		 * On 4-level systems, the p4d layer is folded away and
> > 		 * the above code does no preallocation.  Below, go down
> > 		 * to the pud _software_ level to ensure the second
> > 		 * hardware level is allocated.
> > 		 */

Would be nice to integrate all these explanations into the comment itself?

Thanks,

	Ingo
Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
On Mon, Aug 10, 2020 at 07:27:33AM -0700, Dave Hansen wrote:
> ... adding Kirill
>
> On 8/7/20 1:40 AM, Joerg Roedel wrote:
> > +		lvl = "p4d";
> > +		p4d = p4d_alloc(&init_mm, pgd, addr);
> > +		if (!p4d)
> > +			goto failed;
> >
> > +		/*
> > +		 * With 5-level paging the P4D level is not folded. So the PGDs
> > +		 * are now populated and there is no need to walk down to the
> > +		 * PUD level.
> > +		 */
> >  		if (pgtable_l5_enabled())
> >  			continue;
>
> It's early and I'm a coffee or two short of awake, but I had to stare at
> the comment for a bit to make sense of it.
>
> It feels wrong, I think, because the 5-level code usually ends up doing
> *more* allocations and in this case, it is _appearing_ to do fewer.
> Would something like this make sense?

Unless I miss something, with 5 levels vmalloc mappings are shared at
p4d level, so allocating a p4d page would be enough. With 4 levels,
p4d_alloc() is a nop and pud is the first actually populated level below
pgd.

> 		/*
> 		 * The goal here is to allocate all possibly required
> 		 * hardware page tables pointed to by the top hardware
> 		 * level.
> 		 *
> 		 * On 4-level systems, the p4d layer is folded away and
> 		 * the above code does no preallocation.  Below, go down
> 		 * to the pud _software_ level to ensure the second
> 		 * hardware level is allocated.
> 		 */
>
> > -		pud = pud_offset(p4d, addr);
> > -		if (pud_none(*pud)) {
> > -			/* Ends up here only with 4-level paging */
> > -			pud = pud_alloc(&init_mm, p4d, addr);
> > -			if (!pud) {
> > -				lvl = "pud";
> > -				goto failed;
> > -			}
> > -		}
> > +		lvl = "pud";
> > +		pud = pud_alloc(&init_mm, p4d, addr);
> > +		if (!pud)
> > +			goto failed;
> >  	}

-- 
Sincerely yours,
Mike.
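As a rough illustration of the folding argument in the message above, here is a toy C model (not kernel code) of what the loop ends up allocating per top-level entry. The names toy_prealloc_stats and toy_preallocate() are hypothetical, and the model assumes every top-level entry in the range starts out empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters: pages allocated at each software level. */
struct toy_prealloc_stats {
	int p4d_pages;
	int pud_pages;
};

/*
 * Toy version of the preallocation loop. For each top-level (pgd) entry:
 *  - 5-level paging: the p4d level is real, so one p4d page populates
 *    the pgd entry and the loop moves on.
 *  - 4-level paging: the p4d level is folded, the p4d step allocates
 *    nothing, and the pud page is the first real table below the pgd.
 */
static struct toy_prealloc_stats toy_preallocate(bool l5_enabled, int top_entries)
{
	struct toy_prealloc_stats st = { 0, 0 };

	for (int i = 0; i < top_entries; i++) {
		if (l5_enabled) {
			st.p4d_pages++;
			continue;
		}
		st.pud_pages++;
	}
	return st;
}
```

Calling toy_preallocate(true, n) counts only p4d pages and toy_preallocate(false, n) counts only pud pages. Either way exactly one page is allocated per top-level entry, which is why the 5-level path only appears to do less work.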
Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
... adding Kirill

On 8/7/20 1:40 AM, Joerg Roedel wrote:
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
>
> +		/*
> +		 * With 5-level paging the P4D level is not folded. So the PGDs
> +		 * are now populated and there is no need to walk down to the
> +		 * PUD level.
> +		 */
>  		if (pgtable_l5_enabled())
>  			continue;

It's early and I'm a coffee or two short of awake, but I had to stare at
the comment for a bit to make sense of it.

It feels wrong, I think, because the 5-level code usually ends up doing
*more* allocations and in this case, it is _appearing_ to do fewer.
Would something like this make sense?

		/*
		 * The goal here is to allocate all possibly required
		 * hardware page tables pointed to by the top hardware
		 * level.
		 *
		 * On 4-level systems, the p4d layer is folded away and
		 * the above code does no preallocation.  Below, go down
		 * to the pud _software_ level to ensure the second
		 * hardware level is allocated.
		 */

> -		pud = pud_offset(p4d, addr);
> -		if (pud_none(*pud)) {
> -			/* Ends up here only with 4-level paging */
> -			pud = pud_alloc(&init_mm, p4d, addr);
> -			if (!pud) {
> -				lvl = "pud";
> -				goto failed;
> -			}
> -		}
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
>  	}
Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
On Fri, Aug 07, 2020 at 10:40:13AM +0200, Joerg Roedel wrote:
> From: Joerg Roedel
>
> The code for preallocate_vmalloc_pages() was written under the
> assumption that the p4d_offset() and pud_offset() functions will perform
> present checks before dereferencing the parent entries.
>
> This assumption is wrong and leads to a bug in the code which causes the
> physical address found in the PGD to be used as a page-table page, even if
> the PGD is not present.
>
> So the code flow currently is:
>
> 	pgd = pgd_offset_k(addr);
> 	p4d = p4d_offset(pgd, addr);
> 	if (p4d_none(*p4d))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
>
> This lacks a check for pgd_none() at least, the correct flow would be:
>
> 	pgd = pgd_offset_k(addr);
> 	if (pgd_none(*pgd))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
> 	else
> 		p4d = p4d_offset(pgd, addr);
>
> But this is the same flow that the p4d_alloc() and the pud_alloc()
> functions use internally, so there is no need to duplicate them.
>
> Remove the p?d_none() checks from the function and just call into
> p4d_alloc() and pud_alloc() to correctly pre-allocate the PGD entries.
>
> Reported-by: Jason A. Donenfeld
> Fixes: 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area")
> Signed-off-by: Joerg Roedel

LGTM,

Reviewed-by: Mike Rapoport

> ---
>  arch/x86/mm/init_64.c | 31 +--
>  1 file changed, 13 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 3f4e29a78f2b..449e071240e1 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1253,28 +1253,23 @@ static void __init preallocate_vmalloc_pages(void)
>  		p4d_t *p4d;
>  		pud_t *pud;
>
> -		p4d = p4d_offset(pgd, addr);
> -		if (p4d_none(*p4d)) {
> -			/* Can only happen with 5-level paging */
> -			p4d = p4d_alloc(&init_mm, pgd, addr);
> -			if (!p4d) {
> -				lvl = "p4d";
> -				goto failed;
> -			}
> -		}
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
>
> +		/*
> +		 * With 5-level paging the P4D level is not folded. So the PGDs
> +		 * are now populated and there is no need to walk down to the
> +		 * PUD level.
> +		 */
>  		if (pgtable_l5_enabled())
>  			continue;
>
> -		pud = pud_offset(p4d, addr);
> -		if (pud_none(*pud)) {
> -			/* Ends up here only with 4-level paging */
> -			pud = pud_alloc(&init_mm, p4d, addr);
> -			if (!pud) {
> -				lvl = "pud";
> -				goto failed;
> -			}
> -		}
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
>  	}
>
>  	return;
> --
> 2.26.2
>

-- 
Sincerely yours,
Mike.
Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
On Fri, Aug 7, 2020 at 10:40 AM Joerg Roedel wrote:
>
> From: Joerg Roedel
>
> The code for preallocate_vmalloc_pages() was written under the
> assumption that the p4d_offset() and pud_offset() functions will perform
> present checks before dereferencing the parent entries.
>
> This assumption is wrong and leads to a bug in the code which causes the
> physical address found in the PGD to be used as a page-table page, even if
> the PGD is not present.
>
> So the code flow currently is:
>
> 	pgd = pgd_offset_k(addr);
> 	p4d = p4d_offset(pgd, addr);
> 	if (p4d_none(*p4d))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
>
> This lacks a check for pgd_none() at least, the correct flow would be:
>
> 	pgd = pgd_offset_k(addr);
> 	if (pgd_none(*pgd))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
> 	else
> 		p4d = p4d_offset(pgd, addr);
>
> But this is the same flow that the p4d_alloc() and the pud_alloc()
> functions use internally, so there is no need to duplicate them.
>
> Remove the p?d_none() checks from the function and just call into
> p4d_alloc() and pud_alloc() to correctly pre-allocate the PGD entries.
>
> Reported-by: Jason A. Donenfeld
> Fixes: 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area")
> Signed-off-by: Joerg Roedel
> ---
>  arch/x86/mm/init_64.c | 31 +--
>  1 file changed, 13 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 3f4e29a78f2b..449e071240e1 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1253,28 +1253,23 @@ static void __init preallocate_vmalloc_pages(void)
>  		p4d_t *p4d;
>  		pud_t *pud;
>
> -		p4d = p4d_offset(pgd, addr);
> -		if (p4d_none(*p4d)) {
> -			/* Can only happen with 5-level paging */
> -			p4d = p4d_alloc(&init_mm, pgd, addr);
> -			if (!p4d) {
> -				lvl = "p4d";
> -				goto failed;
> -			}
> -		}
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
>
> +		/*
> +		 * With 5-level paging the P4D level is not folded. So the PGDs
> +		 * are now populated and there is no need to walk down to the
> +		 * PUD level.
> +		 */
>  		if (pgtable_l5_enabled())
>  			continue;
>
> -		pud = pud_offset(p4d, addr);
> -		if (pud_none(*pud)) {
> -			/* Ends up here only with 4-level paging */
> -			pud = pud_alloc(&init_mm, p4d, addr);
> -			if (!pud) {
> -				lvl = "pud";
> -				goto failed;
> -			}
> -		}
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
>  	}
>
>  	return;
> --
> 2.26.2

This appears to fix the issue, so:

Tested-by: Jason A. Donenfeld
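The difference between the two code flows in the patch description can be sketched in userspace with a toy two-level table. This is only an illustration under simplifying assumptions, not the kernel implementation: the toy_* names, the single present bit, and the use of calloc() are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_ENTRIES 4
#define TOY_PRESENT ((uintptr_t)0x1)

/* Toy entries: an upper entry stores a lower-table address plus a present bit. */
typedef struct { uintptr_t val; } toy_upper_t;
typedef struct { uintptr_t val; } toy_lower_t;

static int toy_upper_none(toy_upper_t e)
{
	return e.val == 0;
}

/*
 * Stand-in for the *_offset() helpers: it trusts the upper entry and
 * does no present check. On an empty entry the result is a bogus
 * address that does not point into any table.
 */
static toy_lower_t *toy_lower_offset(toy_upper_t *up, unsigned int idx)
{
	uintptr_t base = up->val & ~TOY_PRESENT;

	return (toy_lower_t *)(base + idx * sizeof(toy_lower_t));
}

/*
 * Stand-in for the *_alloc() helpers: populate the upper entry if it
 * is empty, and only then compute the offset into the lower table.
 */
static toy_lower_t *toy_lower_alloc(toy_upper_t *up, unsigned int idx)
{
	if (toy_upper_none(*up)) {
		toy_lower_t *table = calloc(TOY_ENTRIES, sizeof(*table));

		if (!table)
			return NULL;
		up->val = (uintptr_t)table | TOY_PRESENT;
	}
	return toy_lower_offset(up, idx);
}
```

Calling toy_lower_alloc() twice on the same upper entry returns the same slot and allocates only once, so the caller needs no separate none check. That is the property the patch relies on when it drops the open-coded p?d_none() tests.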