Re: [PATCH v1 1/9] mm/memory: factor out zapping of present pte into zap_present_pte()

2024-01-30 Thread David Hildenbrand
On 30.01.24 09:13, Ryan Roberts wrote: On 29/01/2024 14:32, David Hildenbrand wrote: Let's prepare for further changes by factoring out processing of present PTEs. Signed-off-by: David Hildenbrand --- mm/memory.c | 92 ++--- 1 file changed

Re: [PATCH v1 3/9] mm/memory: further separate anon and pagecache folio handling in zap_present_pte()

2024-01-30 Thread David Hildenbrand
On 30.01.24 09:31, Ryan Roberts wrote: On 29/01/2024 14:32, David Hildenbrand wrote: We don't need up-to-date accessed-dirty information for anon folios and can simply work with the ptent we already have. Also, we know the RSS counter we want to update. We can safely move arch_check_zapped_pte

[PATCH v1 9/9] mm/memory: optimize unmap/zap with PTE-mapped THP

2024-01-29 Thread David Hildenbrand
with nr=1. Signed-off-by: David Hildenbrand --- include/linux/pgtable.h | 66 + mm/memory.c | 92 + 2 files changed, 132 insertions(+), 26 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index

[PATCH v1 8/9] mm/mmu_gather: add tlb_remove_tlb_entries()

2024-01-29 Thread David Hildenbrand
() a macro). Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/tlb.h | 2 ++ include/asm-generic/tlb.h | 20 2 files changed, 22 insertions(+) diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h index b3de6102a907..1ca7d4c4b90d 100644

[PATCH v1 7/9] mm/mmu_gather: add __tlb_remove_folio_pages()

2024-01-29 Thread David Hildenbrand
the whole mmu_gather. For now, let's keep it simple and add "nr_pages" only. Signed-off-by: David Hildenbrand --- arch/s390/include/asm/tlb.h | 17 +++ include/asm-generic/tlb.h | 8 + include/linux/mm_types.h| 20 mm/mmu_gathe

[PATCH v1 6/9] mm/mmu_gather: define ENCODED_PAGE_FLAG_DELAY_RMAP

2024-01-29 Thread David Hildenbrand
the defines to reflect their context (e.g., ENCODED_PAGE_FLAG_MMU_GATHER_DELAY_RMAP). For now, let's keep it simple. This is a preparation for using the remaining spare bit to indicate that the next item in an array of encoded pages is a "nr_pages" argument and not an encoded page. Signed-off
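The encoding idea in this snippet (flag bits stored in the low bits of a page pointer, with a spare bit announcing that the next array slot holds "nr_pages") can be sketched in userspace C. This is a minimal model under stated assumptions, not the kernel code: the bit values, `MODEL_FLAG_NR_PAGES_NEXT`, the shift used for the count, and every helper name below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's struct page; only its alignment matters here. */
struct page { unsigned long flags; };

/* Assumed bit assignments for this model (not taken from the patch). */
#define ENCODED_PAGE_FLAG_DELAY_RMAP 1UL  /* name from the patch subject */
#define MODEL_FLAG_NR_PAGES_NEXT     2UL  /* hypothetical "next slot is nr_pages" bit */
#define MODEL_FLAG_MASK              3UL

/* Pack a page pointer and flag bits into one array slot. */
static inline uintptr_t encode_page(struct page *page, unsigned long flags)
{
	/* The pointer must leave its two low bits free for flags. */
	assert(((uintptr_t)page & MODEL_FLAG_MASK) == 0);
	return (uintptr_t)page | (flags & MODEL_FLAG_MASK);
}

static inline struct page *decode_page(uintptr_t slot)
{
	return (struct page *)(slot & ~(uintptr_t)MODEL_FLAG_MASK);
}

static inline unsigned long slot_flags(uintptr_t slot)
{
	return slot & MODEL_FLAG_MASK;
}

/* Store a count in a slot; shifted so it never looks like a flagged pointer. */
static inline uintptr_t encode_nr_pages(unsigned long nr)
{
	return (uintptr_t)nr << 2;
}

static inline unsigned long decode_nr_pages(uintptr_t slot)
{
	return (unsigned long)(slot >> 2);
}
```

A consumer walking such an array would check `slot_flags()` on each entry and, when the "next slot" bit is set, read the following entry with `decode_nr_pages()` instead of treating it as a page.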

[PATCH v1 5/9] mm/mmu_gather: pass "delay_rmap" instead of encoded page to __tlb_remove_page_size()

2024-01-29 Thread David Hildenbrand
that the next encoded page pointer in an array is actually "nr_pages". So pass page + delay_rmap flag instead of an encoded page, to handle the encoding internally. Signed-off-by: David Hildenbrand --- arch/s390/include/asm/tlb.h | 13 ++--- include/asm-generic/tlb.h | 12 ++

[PATCH v1 4/9] mm/memory: factor out zapping folio pte into zap_present_folio_pte()

2024-01-29 Thread David Hildenbrand
Let's prepare for further changes by factoring it out into a separate function. Signed-off-by: David Hildenbrand --- mm/memory.c | 53 - 1 file changed, 32 insertions(+), 21 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index

[PATCH v1 3/9] mm/memory: further separate anon and pagecache folio handling in zap_present_pte()

2024-01-29 Thread David Hildenbrand
and RSS. While at it, only call zap_install_uffd_wp_if_needed() if there is even any chance that pte_install_uffd_wp_if_needed() would do *something*. That is, just don't bother if uffd-wp does not apply. Signed-off-by: David Hildenbrand --- mm/memory.c | 16 +++- 1 file changed, 11

[PATCH v1 2/9] mm/memory: handle !page case in zap_present_pte() separately

2024-01-29 Thread David Hildenbrand
page (i.e., zeropage). Add a sanity check to make sure this remains true. should_zap_folio() no longer has to handle NULL pointers. This change replaces 2/3 "!page/!folio" checks by a single "!page" one. Signed-off-by: David Hildenbrand --- mm/memory.c | 20 ++--

[PATCH v1 1/9] mm/memory: factor out zapping of present pte into zap_present_pte()

2024-01-29 Thread David Hildenbrand
Let's prepare for further changes by factoring out processing of present PTEs. Signed-off-by: David Hildenbrand --- mm/memory.c | 92 ++--- 1 file changed, 52 insertions(+), 40 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index

[PATCH v1 0/9] mm/memory: optimize unmap/zap with PTE-mapped THP

2024-01-29 Thread David Hildenbrand
Ellerman Cc: Christophe Leroy Cc: "Naveen N. Rao" Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Alexander Gordeev Cc: Christian Borntraeger Cc: Sven Schnelle Cc: Arnd Bergmann Cc: linux-a...@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s...@vger.kernel.org David Hildenbrand (

[PATCH v3 15/15] mm/memory: ignore writable bit in folio_pte_batch()

2024-01-29 Thread David Hildenbrand
... and conditionally return to the caller if any PTE except the first one is writable. fork() has to make sure to properly write-protect in case any PTE is writable. Other users (e.g., page unmapping) are expected to not care. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm

[PATCH v3 14/15] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()

2024-01-29 Thread David Hildenbrand
-wp bit. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm/memory.c | 36 +++- 1 file changed, 31 insertions(+), 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 86f8a0021c8e..b2ec2b6b54c7 100644 --- a/mm/memory.c +++ b/mm/memory.c

[PATCH v3 12/15] mm/memory: pass PTE to copy_present_pte()

2024-01-29 Thread David Hildenbrand
We already read it, let's just forward it. This patch is based on work by Ryan Roberts. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm/memory.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index a3bdb25f4c8d

[PATCH v3 11/15] mm/memory: factor out copying the actual PTE in copy_present_pte()

2024-01-29 Thread David Hildenbrand
Let's prepare for further changes. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm/memory.c | 63 - 1 file changed, 33 insertions(+), 30 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 8d14ba440929..a3bdb25f4c8d

[PATCH v3 10/15] powerpc/mm: use pte_next_pfn() in set_ptes()

2024-01-29 Thread David Hildenbrand
Let's use our handy new helper. Note that the implementation is slightly different, but shouldn't really make a difference in practice. Reviewed-by: Christophe Leroy Signed-off-by: David Hildenbrand --- arch/powerpc/mm/pgtable.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff

[PATCH v3 09/15] arm/mm: use pte_next_pfn() in set_ptes()

2024-01-29 Thread David Hildenbrand
Let's use our handy helper now that it's available on all archs. Signed-off-by: David Hildenbrand --- arch/arm/mm/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index 674ed71573a8..c24e29c0b9a4 100644 --- a/arch/arm/mm/mmu.c +++ b

[PATCH v3 08/15] mm/pgtable: make pte_next_pfn() independent of set_ptes()

2024-01-29 Thread David Hildenbrand
Let's provide pte_next_pfn(), independently of set_ptes(). This allows for using the generic pte_next_pfn() version in some arch-specific set_ptes() implementations, and prepares for reusing pte_next_pfn() in other context. Reviewed-by: Christophe Leroy Signed-off-by: David Hildenbrand
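To illustrate why pte_next_pfn() only needs PFN_PTE_SHIFT, here is a small userspace sketch. It assumes a made-up PTE layout (flags in the low 12 bits, PFN above them); `pte_t` is modelled as a plain 64-bit wrapper, and `make_pte()`, `pte_pfn()` and `pte_flags()` are hypothetical helpers for this model only. Real architectures lay out their PTEs differently.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this model: PFN starts at bit 12, flags live below it. */
#define PFN_PTE_SHIFT 12
#define MODEL_FLAGS_MASK ((1ULL << PFN_PTE_SHIFT) - 1)

typedef struct { uint64_t val; } pte_t;

/* Build a model PTE from a PFN and flag bits. */
static inline pte_t make_pte(uint64_t pfn, uint64_t flags)
{
	/* Flags must fit below the PFN field. */
	assert((flags & ~MODEL_FLAGS_MASK) == 0);
	return (pte_t){ (pfn << PFN_PTE_SHIFT) | flags };
}

static inline uint64_t pte_pfn(pte_t pte)
{
	return pte.val >> PFN_PTE_SHIFT;
}

static inline uint64_t pte_flags(pte_t pte)
{
	return pte.val & MODEL_FLAGS_MASK;
}

/* Advance the PFN by one; adding 1 << PFN_PTE_SHIFT leaves the flags alone. */
static inline pte_t pte_next_pfn(pte_t pte)
{
	return (pte_t){ pte.val + (1ULL << PFN_PTE_SHIFT) };
}
```

In this model the helper is a single addition, which is why an architecture only has to say where the PFN field begins. If bits above the PFN field were used for something else, an overflow out of the PFN field would corrupt them.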

[PATCH v3 07/15] sparc/pgtable: define PFN_PTE_SHIFT

2024-01-29 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/sparc/include/asm/pgtable_64.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch

[PATCH v3 06/15] s390/pgtable: define PFN_PTE_SHIFT

2024-01-29 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/s390/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390

[PATCH v3 05/15] riscv/pgtable: define PFN_PTE_SHIFT

2024-01-29 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Reviewed-by: Alexandre Ghiti Signed-off-by: David Hildenbrand --- arch/riscv/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/riscv

[PATCH v3 04/15] powerpc/pgtable: define PFN_PTE_SHIFT

2024-01-29 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Reviewed-by: Christophe Leroy Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc

[PATCH v3 03/15] nios2/pgtable: define PFN_PTE_SHIFT

2024-01-29 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/nios2/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2

[PATCH v3 02/15] arm/pgtable: define PFN_PTE_SHIFT

2024-01-29 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/arm/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include

[PATCH v3 01/15] arm64/mm: Make set_ptes() robust when OAs cross 48-bit boundary

2024-01-29 Thread David Hildenbrand
...@arm.com Fixes: 4a169d61c2ed ("arm64: implement the new page table range API") Closes: https://lore.kernel.org/linux-mm/fdaeb9a5-d890-499a-92c8-d171df43a...@arm.com/ Signed-off-by: Ryan Roberts Reviewed-by: Catalin Marinas Reviewed-by: David Hildenbrand Signed-off-by: David H

[PATCH v3 00/15] mm/memory: optimize fork() with PTE-mapped THP

2024-01-29 Thread David Hildenbrand
ed for a resend based on latest mm-unstable. I am sending this out earlier than I would usually have sent out the next version, so we can pull it into mm-unstable again now that v1 was dropped. David Hildenbrand (14): arm/pgtable: define PFN_PTE_SHIFT nios2/pgtable: define PFN_PTE_SHIFT powerpc/pg

Re: [PATCH v2 13/15] mm/memory: optimize fork() with PTE-mapped THP

2024-01-26 Thread David Hildenbrand
On 25.01.24 20:32, David Hildenbrand wrote: Let's implement PTE batching when consecutive (present) PTEs map consecutive pages of the same large folio, and all other PTE bits besides the PFNs are equal. We will optimize folio_pte_batch() separately, to ignore selected PTE bits. This patch
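The batching rule quoted above (consecutive present PTEs mapping consecutive pages of the same large folio, with all other PTE bits equal) can be modelled in userspace as follows. This is a sketch under assumptions, not the kernel's folio_pte_batch(): the PTE layout, the `pte_batch_model()` name and its parameters are hypothetical, and the folio is represented only by the PFN one past its last page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for this model: PFN starts at bit 12, flags live below it. */
#define PFN_PTE_SHIFT 12

typedef struct { uint64_t val; } pte_t;

static inline pte_t pte_next_pfn(pte_t pte)
{
	return (pte_t){ pte.val + (1ULL << PFN_PTE_SHIFT) };
}

/*
 * Count how many entries, starting at ptes[0], form one batch: each entry
 * must equal the previous one with its PFN advanced by one (so all non-PFN
 * bits match), and must still fall inside the folio, whose end is given as
 * folio_end_pfn (exclusive). At most max_nr entries are examined.
 */
static size_t pte_batch_model(const pte_t *ptes, size_t max_nr,
			      uint64_t folio_end_pfn)
{
	assert(max_nr >= 1);

	pte_t expected = pte_next_pfn(ptes[0]);
	size_t nr = 1;

	while (nr < max_nr) {
		if (ptes[nr].val != expected.val)
			break;	/* PFN not consecutive, or some other bit differs */
		if ((ptes[nr].val >> PFN_PTE_SHIFT) >= folio_end_pfn)
			break;	/* ran past the end of the folio */
		expected = pte_next_pfn(expected);
		nr++;
	}
	return nr;
}
```

Because the comparison is on the whole PTE value, any differing bit ends the batch. Ignoring selected bits, as the later patches in the series describe, would amount to masking those bits out of both sides before comparing.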

[PATCH v2 15/15] mm/memory: ignore writable bit in folio_pte_batch()

2024-01-25 Thread David Hildenbrand
... and conditionally return to the caller if any PTE except the first one is writable. fork() has to make sure to properly write-protect in case any PTE is writable. Other users (e.g., page unmapping) are expected to not care. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm

[PATCH v2 14/15] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()

2024-01-25 Thread David Hildenbrand
-wp bit. Signed-off-by: David Hildenbrand --- mm/memory.c | 36 +++- 1 file changed, 31 insertions(+), 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 4d1be89a01ee0..b3f035fe54c8d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -953,24 +953,44 @@ static

[PATCH v2 13/15] mm/memory: optimize fork() with PTE-mapped THP

2024-01-25 Thread David Hildenbrand
and allocate a single page to copy a single page. For now we won't care as pinned pages are a corner case, and we should rather look into maintaining only a single PageAnonExclusive bit for large folios. Signed-off-by: David Hildenbrand --- include/linux/pgtable.h | 31 +++ mm/memory.c

[PATCH v2 12/15] mm/memory: pass PTE to copy_present_pte()

2024-01-25 Thread David Hildenbrand
We already read it, let's just forward it. This patch is based on work by Ryan Roberts. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm/memory.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 10fc14ff8e49b

[PATCH v2 11/15] mm/memory: factor out copying the actual PTE in copy_present_pte()

2024-01-25 Thread David Hildenbrand
Let's prepare for further changes. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm/memory.c | 63 - 1 file changed, 33 insertions(+), 30 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 7e1f4849463aa..10fc14ff8e49b

[PATCH v2 10/15] powerpc/mm: use pte_next_pfn() in set_ptes()

2024-01-25 Thread David Hildenbrand
Let's use our handy new helper. Note that the implementation is slightly different, but shouldn't really make a difference in practice. Signed-off-by: David Hildenbrand --- arch/powerpc/mm/pgtable.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/arch/powerpc/mm

[PATCH v2 09/15] arm/mm: use pte_next_pfn() in set_ptes()

2024-01-25 Thread David Hildenbrand
Let's use our handy helper now that it's available on all archs. Signed-off-by: David Hildenbrand --- arch/arm/mm/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index 674ed71573a84..c24e29c0b9a48 100644 --- a/arch/arm/mm/mmu.c

[PATCH v2 08/15] mm/pgtable: make pte_next_pfn() independent of set_ptes()

2024-01-25 Thread David Hildenbrand
Let's provide pte_next_pfn(), independently of set_ptes(). This allows for using the generic pte_next_pfn() version in some arch-specific set_ptes() implementations, and prepares for reusing pte_next_pfn() in other context. Signed-off-by: David Hildenbrand --- include/linux/pgtable.h | 2 +- 1

[PATCH v2 07/15] sparc/pgtable: define PFN_PTE_SHIFT

2024-01-25 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/sparc/include/asm/pgtable_64.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch

[PATCH v2 06/15] s390/pgtable: define PFN_PTE_SHIFT

2024-01-25 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/s390/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390

[PATCH v2 05/15] riscv/pgtable: define PFN_PTE_SHIFT

2024-01-25 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Reviewed-by: Alexandre Ghiti Signed-off-by: David Hildenbrand --- arch/riscv/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/riscv

[PATCH v2 03/15] nios2/pgtable: define PFN_PTE_SHIFT

2024-01-25 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/nios2/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2

[PATCH v2 04/15] powerpc/pgtable: define PFN_PTE_SHIFT

2024-01-25 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc/include/asm/pgtable.h b/arch

[PATCH v2 02/15] arm/pgtable: define PFN_PTE_SHIFT

2024-01-25 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/arm/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include

[PATCH v2 01/15] arm64/mm: Make set_ptes() robust when OAs cross 48-bit boundary

2024-01-25 Thread David Hildenbrand
...@arm.com Fixes: 4a169d61c2ed ("arm64: implement the new page table range API") Closes: https://lore.kernel.org/linux-mm/fdaeb9a5-d890-499a-92c8-d171df43a...@arm.com/ Signed-off-by: Ryan Roberts Reviewed-by: Catalin Marinas Reviewed-by: David Hildenbrand Signed-off-by: David H

[PATCH v2 00/15] mm/memory: optimize fork() with PTE-mapped THP

2024-01-25 Thread David Hildenbrand
asily Gorbik Cc: Christian Borntraeger Cc: Sven Schnelle Cc: "David S. Miller" Cc: linux-arm-ker...@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-ri...@lists.infradead.org Cc: linux-s...@vger.kernel.org Cc: sparcli...@vger.kernel.org David Hildenbrand (14): arm/pgtable: d

Re: [PATCH v1 00/11] mm/memory: optimize fork() with PTE-mapped THP

2024-01-23 Thread David Hildenbrand
On 23.01.24 20:43, Ryan Roberts wrote: On 23/01/2024 19:33, David Hildenbrand wrote: On 23.01.24 20:15, Ryan Roberts wrote: On 22/01/2024 19:41, David Hildenbrand wrote: Now that the rmap overhaul[1] is upstream that provides a clean interface for rmap batching, let's implement PTE batching

Re: [PATCH v1 00/11] mm/memory: optimize fork() with PTE-mapped THP

2024-01-23 Thread David Hildenbrand
On 23.01.24 20:15, Ryan Roberts wrote: On 22/01/2024 19:41, David Hildenbrand wrote: Now that the rmap overhaul[1] is upstream that provides a clean interface for rmap batching, let's implement PTE batching during fork when processing PTE-mapped THPs. This series is partially based on Ryan's

Re: [PATCH v1 10/11] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()

2024-01-23 Thread David Hildenbrand
14e83ff2a422a96ce5701f9c8454a49f9ed947e3 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Sat, 30 Dec 2023 12:54:35 +0100 Subject: [PATCH] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch() Let's always ignore the accessed/young bit: we'll always mark the PTE as old in our child process during

Re: [PATCH v1 10/11] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()

2024-01-23 Thread David Hildenbrand
On 23.01.24 14:42, Ryan Roberts wrote: On 23/01/2024 13:06, David Hildenbrand wrote: On 23.01.24 13:25, Ryan Roberts wrote: On 22/01/2024 19:41, David Hildenbrand wrote: Let's ignore these bits: they are irrelevant for fork, and will likely be irrelevant for upcoming users such as page

Re: [PATCH v1 10/11] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()

2024-01-23 Thread David Hildenbrand
On 23.01.24 13:25, Ryan Roberts wrote: On 22/01/2024 19:41, David Hildenbrand wrote: Let's ignore these bits: they are irrelevant for fork, and will likely be irrelevant for upcoming users such as page unmapping. Signed-off-by: David Hildenbrand --- mm/memory.c | 10 -- 1 file

Re: [PATCH v1 09/11] mm/memory: optimize fork() with PTE-mapped THP

2024-01-23 Thread David Hildenbrand
[...] I wrote some documentation for this (based on Matthew's docs for set_ptes()) in my version. Perhaps it makes sense to add it here, given this is overridable by the arch. /** * wrprotect_ptes - Write protect a consecutive set of pages. * @mm: Address space that the pages are mapped

Re: [PATCH v1 01/11] arm/pgtable: define PFN_PTE_SHIFT on arm and arm64

2024-01-23 Thread David Hildenbrand
On 23.01.24 12:48, Christophe Leroy wrote: Le 23/01/2024 à 12:38, Ryan Roberts a écrit : On 23/01/2024 11:31, David Hildenbrand wrote: If high bits are used for something else, then we might produce a garbage PTE on overflow, but that shouldn't really matter I concluded for folio_pte_batch

Re: [PATCH v1 01/11] arm/pgtable: define PFN_PTE_SHIFT on arm and arm64

2024-01-23 Thread David Hildenbrand
On 23.01.24 12:38, Ryan Roberts wrote: On 23/01/2024 11:31, David Hildenbrand wrote: If high bits are used for something else, then we might produce a garbage PTE on overflow, but that shouldn't really matter I concluded for folio_pte_batch() purposes, we'd not detect "belongs to this

Re: [PATCH v1 01/11] arm/pgtable: define PFN_PTE_SHIFT on arm and arm64

2024-01-23 Thread David Hildenbrand
On 23.01.24 12:17, Ryan Roberts wrote: On 23/01/2024 11:02, David Hildenbrand wrote: On 23.01.24 11:48, David Hildenbrand wrote: On 23.01.24 11:34, Ryan Roberts wrote: On 22/01/2024 19:41, David Hildenbrand wrote: We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply

Re: [PATCH v1 01/11] arm/pgtable: define PFN_PTE_SHIFT on arm and arm64

2024-01-23 Thread David Hildenbrand
If high bits are used for something else, then we might produce a garbage PTE on overflow, but that shouldn't really matter I concluded for folio_pte_batch() purposes, we'd not detect "belongs to this folio batch" either way. Exactly. Maybe it's likely cleaner to also have a custom

Re: [PATCH v1 01/11] arm/pgtable: define PFN_PTE_SHIFT on arm and arm64

2024-01-23 Thread David Hildenbrand
On 23.01.24 11:48, David Hildenbrand wrote: On 23.01.24 11:34, Ryan Roberts wrote: On 22/01/2024 19:41, David Hildenbrand wrote: We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand

Re: [PATCH v1 01/11] arm/pgtable: define PFN_PTE_SHIFT on arm and arm64

2024-01-23 Thread David Hildenbrand
On 23.01.24 11:34, Ryan Roberts wrote: On 22/01/2024 19:41, David Hildenbrand wrote: We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/arm/include/asm/pgtable.h | 2

Re: [PATCH v1 04/11] risc: pgtable: define PFN_PTE_SHIFT

2024-01-22 Thread David Hildenbrand
On 22.01.24 21:03, Alexandre Ghiti wrote: Hi David, On 22/01/2024 20:41, David Hildenbrand wrote: We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/riscv/include/asm

[PATCH v1 11/11] mm/memory: ignore writable bit in folio_pte_batch()

2024-01-22 Thread David Hildenbrand
... and conditionally return to the caller if any PTE except the first one is writable. fork() has to make sure to properly write-protect in case any PTE is writable. Other users (e.g., page unmapping) won't care. Signed-off-by: David Hildenbrand --- mm/memory.c | 26

[PATCH v1 10/11] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()

2024-01-22 Thread David Hildenbrand
Let's ignore these bits: they are irrelevant for fork, and will likely be irrelevant for upcoming users such as page unmapping. Signed-off-by: David Hildenbrand --- mm/memory.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index

[PATCH v1 09/11] mm/memory: optimize fork() with PTE-mapped THP

2024-01-22 Thread David Hildenbrand
and will always stay within VMA boundaries. Signed-off-by: David Hildenbrand --- include/linux/pgtable.h | 17 +- mm/memory.c | 113 +--- 2 files changed, 109 insertions(+), 21 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h

[PATCH v1 08/11] mm/memory: pass PTE to copy_present_pte()

2024-01-22 Thread David Hildenbrand
We already read it, let's just forward it. This patch is based on work by Ryan Roberts. Signed-off-by: David Hildenbrand --- mm/memory.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 2aa2051ee51d3..185b4aff13d62 100644 --- a/mm

[PATCH v1 07/11] mm/memory: factor out copying the actual PTE in copy_present_pte()

2024-01-22 Thread David Hildenbrand
Let's prepare for further changes. Signed-off-by: David Hildenbrand --- mm/memory.c | 60 - 1 file changed, 32 insertions(+), 28 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 7e1f4849463aa..2aa2051ee51d3 100644 --- a/mm/memory.c

[PATCH v1 06/11] sparc/pgtable: define PFN_PTE_SHIFT

2024-01-22 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/sparc/include/asm/pgtable_64.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch

[PATCH v1 05/11] s390/pgtable: define PFN_PTE_SHIFT

2024-01-22 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/s390/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390

[PATCH v1 04/11] risc: pgtable: define PFN_PTE_SHIFT

2024-01-22 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/riscv/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv

[PATCH v1 03/11] powerpc/pgtable: define PFN_PTE_SHIFT

2024-01-22 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc/include/asm/pgtable.h b/arch

[PATCH v1 02/11] nios2/pgtable: define PFN_PTE_SHIFT

2024-01-22 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/nios2/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2

[PATCH v1 01/11] arm/pgtable: define PFN_PTE_SHIFT on arm and arm64

2024-01-22 Thread David Hildenbrand
We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Signed-off-by: David Hildenbrand --- arch/arm/include/asm/pgtable.h | 2 ++ arch/arm64/include/asm/pgtable.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/arch

[PATCH v1 00/11] mm/memory: optimize fork() with PTE-mapped THP

2024-01-22 Thread David Hildenbrand
ux-arm-ker...@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-ri...@lists.infradead.org Cc: linux-s...@vger.kernel.org Cc: sparcli...@vger.kernel.org David Hildenbrand (11): arm/pgtable: define PFN_PTE_SHIFT on arm and arm64 nios2/pgtable: define PFN_PTE_SHIFT powerpc/pgt

Re: mm/debug_vm_pgtable.c:860 warning triggered

2023-11-06 Thread David Hildenbrand
On 06.11.23 07:06, Michael Ellerman wrote: Anshuman Khandual writes: Hello Daniel, This test just ensures that PFN is preserved during pte <--> swap pte transformations, and the warning here seems to have been caused by powerpc platform specific helpers and/or its pte_t representation.

Re: [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files

2023-10-12 Thread David Hildenbrand
On 10.10.23 02:20, Andrew Morton wrote: On Sun, 08 Oct 2023 09:54:22 +1100 Michael Ellerman wrote: I don't know why powerpc's PTE_INDEX_SIZE is variable. To allow a single vmlinux to boot using either the Hashed Page Table MMU, or Radix Tree MMU, which have different page table geometry.

Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd

2023-08-22 Thread David Hildenbrand
On 22.08.23 17:30, Jann Horn wrote: On Tue, Aug 22, 2023 at 5:23 PM Matthew Wilcox wrote: On Tue, Aug 22, 2023 at 04:39:43PM +0200, Jann Horn wrote: Perhaps something else will want that same behaviour in future (it's tempting, but difficult to guarantee correctness); for now, it is just

Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd

2023-08-22 Thread David Hildenbrand
On 22.08.23 16:39, Jann Horn wrote: On Tue, Aug 22, 2023 at 4:51 AM Hugh Dickins wrote: On Mon, 21 Aug 2023, Jann Horn wrote: On Mon, Aug 21, 2023 at 9:51 PM Hugh Dickins wrote: Just for this case, take the pmd_lock() two steps earlier: not because it gives any protection against this case

Re: [BUG] Re: [PATCH v3 10/13] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-08-15 Thread David Hildenbrand
On 15.08.23 08:34, Hugh Dickins wrote: On Mon, 14 Aug 2023, Jann Horn wrote: On Wed, Jul 12, 2023 at 6:42 AM Hugh Dickins wrote: Bring collapse_and_free_pmd() back into collapse_pte_mapped_thp(). It does need mmap_read_lock(), but it does not need mmap_write_lock(), nor vma_start_write() nor

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-08 Thread David Hildenbrand
On 08.08.23 08:29, Aneesh Kumar K V wrote: On 8/8/23 12:05 AM, David Hildenbrand wrote: On 07.08.23 14:41, David Hildenbrand wrote: On 07.08.23 14:27, Michal Hocko wrote: On Sat 05-08-23 19:54:23, Aneesh Kumar K V wrote: [...] Do you see a need for firmware-managed memory to be hotplugged

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-07 Thread David Hildenbrand
On 07.08.23 14:41, David Hildenbrand wrote: On 07.08.23 14:27, Michal Hocko wrote: On Sat 05-08-23 19:54:23, Aneesh Kumar K V wrote: [...] Do you see a need for firmware-managed memory to be hotplugged in with different memory block sizes? In short. Yes. Slightly longer, a fixed size memory

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-07 Thread David Hildenbrand
On 03.08.23 13:30, Michal Hocko wrote: On Thu 03-08-23 11:24:08, David Hildenbrand wrote: [...] would be readable only when the block is offline and it would reallocate vmemmap on the change. Makes sense? Are there any risks? Maybe pfn walkers? The question is: is it of any real value

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-07 Thread David Hildenbrand
On 07.08.23 14:27, Michal Hocko wrote: On Sat 05-08-23 19:54:23, Aneesh Kumar K V wrote: [...] Do you see a need for firmware-managed memory to be hotplugged in with different memory block sizes? In short. Yes. Slightly longer, a fixed size memory block semantic is just standing in the way

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-03 Thread David Hildenbrand
On 03.08.23 10:52, Michal Hocko wrote: On Wed 02-08-23 18:59:04, Michal Hocko wrote: On Wed 02-08-23 17:54:04, David Hildenbrand wrote: On 02.08.23 17:50, Michal Hocko wrote: On Wed 02-08-23 10:15:04, Aneesh Kumar K V wrote: On 8/1/23 4:20 PM, Michal Hocko wrote: On Tue 01-08-23 14:58:29

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-02 Thread David Hildenbrand
On 02.08.23 17:50, Michal Hocko wrote: On Wed 02-08-23 10:15:04, Aneesh Kumar K V wrote: On 8/1/23 4:20 PM, Michal Hocko wrote: On Tue 01-08-23 14:58:29, Aneesh Kumar K V wrote: On 8/1/23 2:28 PM, Michal Hocko wrote: On Tue 01-08-23 10:11:16, Aneesh Kumar K.V wrote: Allow updating

Re: [PATCH v6 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-07-27 Thread David Hildenbrand
On 27.07.23 10:02, Aneesh Kumar K.V wrote: Acked-by: David Hildenbrand Signed-off-by: Aneesh Kumar K.V --- mm/memory_hotplug.c | 35 +++ 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index

Re: [PATCH v6 6/7] mm/memory_hotplug: Embed vmem_altmap details in memory block

2023-07-27 Thread David Hildenbrand
if (altmap) { + WARN(altmap->alloc, "Altmap not fully unmapped"); + kfree(altmap); + } + if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) { memblock_phys_free(start, size); memblock_remove(start, size); Acked-by: David Hildenbrand -- Cheers, David / dhildenb

Re: [PATCH v5 6/7] mm/hotplug: Embed vmem_altmap details in memory block

2023-07-26 Thread David Hildenbrand
On 26.07.23 12:31, Aneesh Kumar K.V wrote: David Hildenbrand writes: On 25.07.23 12:02, Aneesh Kumar K.V wrote: With memmap on memory, some architecture needs more details w.r.t altmap such as base_pfn, end_pfn, etc to unmap vmemmap memory. Instead of computing them again when we remove

Re: [PATCH v5 4/7] mm/hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks

2023-07-26 Thread David Hildenbrand
On 26.07.23 11:57, Aneesh Kumar K V wrote: On 7/26/23 2:34 PM, David Hildenbrand wrote:    /* @@ -1310,7 +1400,10 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)    {    struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };    enum

Re: [PATCH v5 6/7] mm/hotplug: Embed vmem_altmap details in memory block

2023-07-26 Thread David Hildenbrand
On 25.07.23 12:02, Aneesh Kumar K.V wrote: With memmap on memory, some architecture needs more details w.r.t altmap such as base_pfn, end_pfn, etc to unmap vmemmap memory. Instead of computing them again when we remove a memory block, embed vmem_altmap details in struct memory_block if we are

Re: [PATCH v5 4/7] mm/hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks

2023-07-26 Thread David Hildenbrand
/* @@ -1310,7 +1400,10 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) { struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) }; enum memblock_flags memblock_flags = MEMBLOCK_NONE; - struct vmem_altmap mhp_altmap = {}; +

Re: [PATCH v5 4/7] mm/hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks

2023-07-25 Thread David Hildenbrand
truct vmem_altmap mhp_altmap = {}; + struct vmem_altmap mhp_altmap = { + .base_pfn = PHYS_PFN(res->start), + .end_pfn = PHYS_PFN(res->end), Is it required to set .end_pfn, and if so, shouldn't we also set it to base_pfn + memory_block_memmap_on_memory_pa

Re: [PATCH v5 7/7] mm/hotplug: Enable runtime update of memmap_on_memory parameter

2023-07-25 Thread David Hildenbrand
use to remove %#llx - %#llx," + "wrong granularity\n", + start, start + size); + return -EINVAL; } + altmap = _altmap; } /* remove memmap entry */ Acked-by: David Hildenbrand -- Cheers, David / dhildenb

Re: [PATCH v5 5/7] powerpc/book3s64/memhotplug: Enable memmap on memory for radix

2023-07-25 Thread David Hildenbrand
e_addr, block_sz, mhp_flags); ... this becomes rc = __add_memory(nid, lmb->base_addr, block_sz, MHP_MEMMAP_ON_MEMORY); With that Reviewed-by: David Hildenbrand -- Cheers, David / dhildenb

Re: [PATCH v5 0/7] Add support for memmap on memory feature on ppc64

2023-07-25 Thread David Hildenbrand
On 25.07.23 12:02, Aneesh Kumar K.V wrote: This patch series update memmap on memory feature to fall back to memmap allocation outside the memory block if the alignment rules are not met. This makes the feature more useful on architectures like ppc64 where alignment rules are different with 64K

Re: [PATCH] powerpc/mm/altmap: Fix altmap boundary check

2023-07-24 Thread David Hildenbrand
On 24.07.23 20:13, Aneesh Kumar K.V wrote: altmap->free includes the entire free space from which altmap blocks can be allocated. So when checking whether the kernel is doing altmap block free, compute the boundary correctly. Cc: David Hildenbrand Cc: Dan Williams Fixes: 9ef34630a

Re: [PATCH] mm/hotplug: Enable runtime update of memmap_on_memory parameter

2023-07-24 Thread David Hildenbrand
On 24.07.23 19:31, Andrew Morton wrote: On Fri, 21 Jul 2023 18:49:50 +0530 "Aneesh Kumar K.V" wrote: Signed-off-by: Aneesh Kumar K.V --- This is dependent on patches posted at https://lore.kernel.org/linux-mm/20230718024409.95742-1-aneesh.ku...@linux.ibm.com/ It appears that the

Re: [PATCH v4 4/6] mm/hotplug: Allow pageblock alignment via altmap reservation

2023-07-24 Thread David Hildenbrand
On 24.07.23 18:02, Aneesh Kumar K V wrote: On 7/24/23 9:11 PM, David Hildenbrand wrote: On 24.07.23 17:16, Aneesh Kumar K V wrote: /*   * In "forced" memmap_on_memory mode, we always align the vmemmap size up to cover   * full pageblocks. That way, we can add memory even if t

Re: [PATCH v4 5/6] powerpc/book3s64/memhotplug: Enable memmap on memory for radix

2023-07-24 Thread David Hildenbrand
+    mhp_altmap.base_pfn = PHYS_PFN(start); +    mhp_altmap.free = PHYS_PFN(size) - nr_vmemmap_pages; That change does not belong into this patch. I kept that change with ppc64 enablement because only ppc64 arch got check against those values in the free path. Let's make

Re: [PATCH v4 4/6] mm/hotplug: Allow pageblock alignment via altmap reservation

2023-07-24 Thread David Hildenbrand
On 24.07.23 17:16, Aneesh Kumar K V wrote: /*  * In "forced" memmap_on_memory mode, we always align the vmemmap size up to cover  * full pageblocks. That way, we can add memory even if the vmemmap size is not properly  * aligned, however, we might waste memory.  */ I am finding that

Re: [PATCH v4 5/6] powerpc/book3s64/memhotplug: Enable memmap on memory for radix

2023-07-24 Thread David Hildenbrand
On 18.07.23 04:44, Aneesh Kumar K.V wrote: Radix vmemmap mapping can map things correctly at the PMD level or PTE level based on different device boundary checks. Hence we skip the restrictions w.r.t vmemmap size to be multiple of PMD_SIZE. This also makes the feature widely useful because to

Re: [PATCH v4 4/6] mm/hotplug: Allow pageblock alignment via altmap reservation

2023-07-24 Thread David Hildenbrand
On 18.07.23 04:44, Aneesh Kumar K.V wrote: Add a new kconfig option that can be selected if we want to allow That description seems outdated. pageblock alignment by reserving pages in the vmemmap altmap area. This implies we will be reserving some pages for every memoryblock This also allows

Re: [PATCH v4 3/6] mm/hotplug: Allow architecture to override memmap on memory support check

2023-07-24 Thread David Hildenbrand
On 18.07.23 04:44, Aneesh Kumar K.V wrote: Some architectures would want different restrictions. Hence add an architecture-specific override. Both the PMD_SIZE check and pageblock alignment check are moved there. Signed-off-by: Aneesh Kumar K.V --- mm/memory_hotplug.c | 22

Re: [PATCH v4 3/6] mm/hotplug: Allow architecture to override memmap on memory support check

2023-07-24 Thread David Hildenbrand
NED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT)); + IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT)) && + arch_supports_memmap_on_memory(size); } /* Acked-by: David Hildenbrand -- Cheers, David / dhildenb

Re: [PATCH v4 2/6] mm/hotplug: Allow memmap on memory hotplug request to fallback

2023-07-24 Thread David Hildenbrand
On 18.07.23 04:44, Aneesh Kumar K.V wrote: If not supported, fall back to not using memmap on memory. This avoids the need for callers to do the fallback. Signed-off-by: Aneesh Kumar K.V --- Acked-by: David Hildenbrand -- Cheers, David / dhildenb
