Re: [GIT PULL] x86/mm changes for v5.10

2020-10-13 Thread Linus Torvalds
On Tue, Oct 13, 2020 at 1:06 AM Joerg Roedel wrote:
>
> So pre-allocating has its implications. If we decide to pre-allocate on
> x86-32 too, then we should be prepared for the fallout of the higher
> memory usage.

Ok, fair enough. Probably not worth worrying about then, particularly
since 32-bit x86 is becoming more and more just legacy.

   Linus


Re: [GIT PULL] x86/mm changes for v5.10

2020-10-13 Thread Joerg Roedel
On Mon, Oct 12, 2020 at 03:07:45PM -0700, Linus Torvalds wrote:
> On Mon, Oct 12, 2020 at 10:24 AM Ingo Molnar wrote:
> >
> > Do not sync vmalloc/ioremap mappings on x86-64 kernels.
> >
> > Hopefully now without the bugs!
> 
> Let's hope so.
> 
> If this turns out to work this time, can we do a similar preallocation
> of the page directories on 32-bit? Because I think now x86-32 is the
> only remaining case of doing that arch_sync_kernel_mappings() thing.
> 
> Or is there some reason that won't work that I've lost sight of?

There were two reasons which made me decide not to pre-allocate on
x86-32:

1) The sync-level is the same as the huge-page level (PMD) in
   both paging modes, so with huge ioremap mappings the
   synchronization is always needed. Huge ioremap mappings
   could possibly be disabled on x86-32 without much
   performance impact.

2) The vmalloc area has a variable size and grows as the machine
   has less RAM. And when the vmalloc area gets larger, more
   pages are needed. Another factor is the configurable
   vm-split. With a 1G/3G split on a machine with 128MB of RAM
   there would be:

	VMalloc area size (hole ignored): 3072MB - 128MB = 2944MB
	PTE pages needed (with PAE):      2944MB / 2MB per page = 1472 4k pages
	Memory needed:                    1472 * 4k = 5888kB

   So on such a machine the pre-allocation would need 5.75MB of
   the 128MB RAM. Without PAE it is half of that; the sketch
   below works through these numbers. This is an exotic
   configuration and I am not sure it matters much in practice.
   It could also be worked around by setting limits such as, for
   example, not making the vmalloc area larger than the
   available memory in the system.
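
To put numbers on that, here is a quick userspace sketch of the
arithmetic (a back-of-the-envelope illustration; the constants are
the assumptions above, not values read from a running kernel):

#include <stdio.h>

/*
 * Pre-allocation cost for the x86-32 vmalloc area, using the
 * assumptions from above: 1G/3G vm-split (3G of kernel address
 * space), 128MB of RAM, lowmem hole ignored.
 */
int main(void)
{
	const unsigned long kernel_mb = 3072;	/* 3G kernel space */
	const unsigned long ram_mb    = 128;	/* directly mapped */

	/* One 4k PTE page maps 2MB with PAE (512 entries * 4k) */
	const unsigned long pae_mb_per_pte_page   = 2;
	/* and 4MB without PAE (1024 entries * 4k) */
	const unsigned long nopae_mb_per_pte_page = 4;

	unsigned long vmalloc_mb  = kernel_mb - ram_mb;
	unsigned long pae_pages   = vmalloc_mb / pae_mb_per_pte_page;
	unsigned long nopae_pages = vmalloc_mb / nopae_mb_per_pte_page;

	printf("vmalloc area:  %lu MB\n", vmalloc_mb);
	printf("PAE:     %4lu PTE pages = %lu kB\n",
	       pae_pages, pae_pages * 4);
	printf("non-PAE: %4lu PTE pages = %lu kB\n",
	       nopae_pages, nopae_pages * 4);

	return 0;
}

This prints 2944 MB, 1472 pages (5888 kB) with PAE and 736 pages
(2944 kB) without, matching the 5.75MB figure above.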

So pre-allocating has its implications. If we decide to pre-allocate on
x86-32 too, then we should be prepared for the fallout of the higher
memory usage.

Regards,

Joerg


Re: [GIT PULL] x86/mm changes for v5.10

2020-10-12 Thread pr-tracker-bot
The pull request you sent on Mon, 12 Oct 2020 19:24:15 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-mm-2020-10-12

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/c1b4ec85ee40cc7a9f7b48bea9013094f2d88203

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] x86/mm changes for v5.10

2020-10-12 Thread Linus Torvalds
On Mon, Oct 12, 2020 at 10:24 AM Ingo Molnar wrote:
>
> Do not sync vmalloc/ioremap mappings on x86-64 kernels.
>
> Hopefully now without the bugs!

Let's hope so.

If this turns out to work this time, can we do a similar preallocation
of the page directories on 32-bit? Because I think now x86-32 is the
only remaining case of doing that arch_sync_kernel_mappings() thing.

Or is there some reason that won't work that I've lost sight of?

   Linus
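
For context: arch_sync_kernel_mappings() is the hook that core mm
calls after changing kernel page tables, so that architectures which
share top-level entries lazily can propagate new entries into all
page tables in the system. Roughly, the pattern in mm/vmalloc.c
looks like this (a simplified sketch, not a verbatim kernel quote):

/*
 * While populating kernel mappings, core mm records which
 * page-table levels it modified as PGTBL_*_MODIFIED bits in
 * "mask".  An architecture opts in to synchronization by
 * defining ARCH_PAGE_TABLE_SYNC_MASK for the levels it shares
 * lazily between page tables:
 */
if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
	arch_sync_kernel_mappings(start, end);

/*
 * If the affected levels are pre-allocated at boot for the whole
 * vmalloc range, they never change afterwards, so the mask can be
 * empty and the hook can go away -- which is what this pull
 * request does for x86-64.
 */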


[GIT PULL] x86/mm changes for v5.10

2020-10-12 Thread Ingo Molnar
Linus,

Please pull the latest x86/mm git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-mm-2020-10-12

   # HEAD: 7a27ef5e83089090f3a4073a9157c862ef00acfc x86/mm/64: Update comment in preallocate_vmalloc_pages()

Do not sync vmalloc/ioremap mappings on x86-64 kernels.

Hopefully now without the bugs!

 Thanks,

Ingo

-->
Joerg Roedel (2):
  x86/mm/64: Do not sync vmalloc/ioremap mappings
  x86/mm/64: Update comment in preallocate_vmalloc_pages()


 arch/x86/include/asm/pgtable_64_types.h |  2 --
 arch/x86/mm/init_64.c   | 20 ++--
 2 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 8f63efb2a2cc..52e5f5f2240d 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -159,6 +159,4 @@ extern unsigned int ptrs_per_p4d;
 
 #define PGD_KERNEL_START   ((PAGE_SIZE / 2) / sizeof(pgd_t))
 
-#define ARCH_PAGE_TABLE_SYNC_MASK  (pgtable_l5_enabled() ? PGTBL_PGD_MODIFIED : PGTBL_P4D_MODIFIED)
-
 #endif /* _ASM_X86_PGTABLE_64_DEFS_H */
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index a4ac13cc3fdc..b5a3fa4033d3 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -217,11 +217,6 @@ static void sync_global_pgds(unsigned long start, unsigned long end)
sync_global_pgds_l4(start, end);
 }
 
-void arch_sync_kernel_mappings(unsigned long start, unsigned long end)
-{
-   sync_global_pgds(start, end);
-}
-
 /*
  * NOTE: This function is marked __ref because it calls __init function
  * (alloc_bootmem_pages). It's safe to do it ONLY when after_bootmem == 0.
@@ -1257,14 +1252,19 @@ static void __init preallocate_vmalloc_pages(void)
if (!p4d)
goto failed;
 
-   /*
-* With 5-level paging the P4D level is not folded. So the PGDs
-* are now populated and there is no need to walk down to the
-* PUD level.
-*/
if (pgtable_l5_enabled())
continue;
 
+   /*
+* The goal here is to allocate all possibly required
+* hardware page tables pointed to by the top hardware
+* level.
+*
+* On 4-level systems, the P4D layer is folded away and
+* the above code does no preallocation.  Below, go down
+* to the pud _software_ level to ensure the second
+* hardware level is allocated on 4-level systems too.
+*/
lvl = "pud";
	pud = pud_alloc(&init_mm, p4d, addr);
if (!pud)
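
(The hunk above is truncated in this archive. For orientation, here
is a reconstruction of the whole function after this series, pieced
together from the diff context; treat it as a paraphrase rather than
the exact mainline code, since loop bounds and messages may differ
between kernel versions:)

static void __init preallocate_vmalloc_pages(void)
{
	unsigned long addr;
	const char *lvl;

	for (addr = VMALLOC_START; addr <= VMALLOC_END;
	     addr = ALIGN(addr + 1, PGDIR_SIZE)) {
		pgd_t *pgd = pgd_offset_k(addr);
		p4d_t *p4d;
		pud_t *pud;

		/* Populate the PGD entry; with 5-level paging this
		 * already allocates the second hardware level. */
		lvl = "p4d";
		p4d = p4d_alloc(&init_mm, pgd, addr);
		if (!p4d)
			goto failed;

		if (pgtable_l5_enabled())
			continue;

		/* With 4-level paging the p4d is folded into the pgd,
		 * so descend one more software level to allocate the
		 * second hardware level (the PUD pages) here too. */
		lvl = "pud";
		pud = pud_alloc(&init_mm, p4d, addr);
		if (!pud)
			goto failed;
	}

	return;

failed:
	panic("Failed to pre-allocate %s pages for vmalloc area\n", lvl);
}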