vunmap will remove ptes.
Cc: Christoph Hellwig
Cc: Marek Szyprowski
Cc: Robin Murphy
Cc: io...@lists.linux-foundation.org
Signed-off-by: Nicholas Piggin
---
kernel/dma/remap.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
index 905c3fa005f1
This is a shim around vmap_pages_range, get rid of it.
Move the main API comment from the _noflush variant to the normal
variant, and make _noflush internal to mm/.
Signed-off-by: Nicholas Piggin
---
Documentation/core-api/cachetlb.rst | 2 +-
include/linux/vmalloc.h | 11
Christoph pointed out some overdue cleanups required after the huge
page series, and I had some other comment and warning changes.
Thanks,
Nick
Nicholas Piggin (5):
mm/vmalloc: remove map_kernel_range
kernel/dma: remove unnecessary unmap_kernel_range
powerpc/xive: remove unnecessary
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Nicholas Piggin
---
.../admin-guide/kernel-parameters.txt | 2 ++
arch/powerpc/Kconfig | 1 +
arch/powerpc/kernel/module.c | 21 +++
3 files changed, 20 insertions(+), 4 deletions
misses by nearly 30x on a `git diff` workload on a 2-node
POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%.
This can result in more internal fragmentation and memory overhead for a
given allocation, an option nohugevmalloc is added to disable at boot.
Signed-off-by: Nicholas Pig
As a side-effect, the order of flush_cache_vmap() and
arch_sync_kernel_mappings() calls are switched, but that now matches
the other callers in this file.
Reviewed-by: Christoph Hellwig
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 16 +---
1 file changed, 13 insertions(+), 3
This is a generic kernel virtual memory mapper, not specific to ioremap.
Code is unchanged other than making vmap_range non-static.
Reviewed-by: Christoph Hellwig
Signed-off-by: Nicholas Piggin
---
include/linux/vmalloc.h | 3 +
mm/ioremap.c| 203
If an architecture doesn't support a particular page table level as
a huge vmap page size then allow it to skip defining the support
query function.
Suggested-by: Christoph Hellwig
Signed-off-by: Nicholas Piggin
---
arch/arm64/include/asm/vmalloc.h | 7 +++
arch/powerpc/include/asm
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: Catalin Marinas
Cc: Will Deacon
Cc: linux-arm-ker...@lists.infradead.org
Acked-by: Catalin Marinas
Signed-off-by: Nicholas Piggin
---
arch/arm64/include
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: linuxppc-...@lists.ozlabs.org
Acked-by: Michael Ellerman
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/vmalloc.h | 19
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: x...@kernel.org
Cc: "H. Peter Anvin"
Acked-by: Catalin Marinas [arm64]
Signed-off-by: Nicholas Piggin
---
arch/arm64/include/asm/vmalloc.h | 8 ++
arch/arm64/mm/mmu.c | 10 +--
arch/powerpc/i
This will be used as a generic kernel virtual mapping function, so
re-name it in preparation.
Signed-off-by: Nicholas Piggin
---
mm/ioremap.c | 64 +++-
1 file changed, 33 insertions(+), 31 deletions(-)
diff --git a/mm/ioremap.c b/mm/ioremap.c
The vmalloc mapper operates on a struct page * array rather than a
linear physical address, re-name it to make this distinction clear.
Reviewed-by: Christoph Hellwig
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff
apply_to_pte_range might mistake a large pte for bad, or treat it as a
page table, resulting in a crash or corruption. Add a test to warn and
return error if large entries are found.
Reviewed-by: Christoph Hellwig
Signed-off-by: Nicholas Piggin
---
mm/memory.c | 66
Excerpts from Christophe Leroy's message of January 25, 2021 6:42 pm:
>
>
> Le 24/01/2021 à 09:22, Nicholas Piggin a écrit :
>> This allows unsupported levels to be constant folded away, and so
>> p4d_free_pud_page can be removed because it's no longer linked to.
>
&
Excerpts from Christophe Leroy's message of January 25, 2021 7:14 pm:
>
>
> Le 24/01/2021 à 09:22, Nicholas Piggin a écrit :
>> Support huge page vmalloc mappings. Config option HAVE_ARCH_HUGE_VMALLOC
>> enables support on architectures that define HAVE_ARCH_HUGE_VMAP and
Excerpts from Christophe Leroy's message of January 26, 2021 12:48 am:
> When r3 is not modified, reload it from regs->orig_r3 to free
> volatile registers. This avoids a stack frame for the likely part
> of system_call_exception()
>
> Before the patch:
>
> c000b4d4 :
> c000b4d4: 7c 08 02 a6
Excerpts from Christophe Leroy's message of January 26, 2021 12:48 am:
> Save r3 in regs->orig_r3 in system_call_exception()
>
> Signed-off-by: Christophe Leroy
> ---
> arch/powerpc/kernel/entry_64.S | 1 -
> arch/powerpc/kernel/syscall.c | 2 ++
> 2 files changed, 2 insertions(+), 1
Excerpts from Ding Tianhong's message of January 26, 2021 4:59 pm:
> On 2021/1/26 12:45, Nicholas Piggin wrote:
>> Support huge page vmalloc mappings. Config option HAVE_ARCH_HUGE_VMALLOC
>> enables support on architectures that define HAVE_ARCH_HUGE_VMAP and
>> supports PM
Excerpts from Christophe Leroy's message of January 26, 2021 12:48 am:
> syscall_64.c will be reused almost as is for PPC32.
>
> Rename it syscall.c
Could you rename it to interrupt.c instead? A system call is an
interrupt, and the file now also has code to return from other
interrupts as well,
Excerpts from Christophe Leroy's message of January 26, 2021 12:48 am:
> Only PPC64 has scv. No need to check the 0x7ff0 trap on PPC32.
>
> And ignore the scv parameter in syscall_exit_prepare (Save 14 cycles
> 346 => 332 cycles)
>
> Signed-off-by: Christophe Leroy
> ---
>
Excerpts from David Laight's message of January 25, 2021 10:24 pm:
> From: Christophe Leroy
>> Sent: 25 January 2021 09:15
>>
>> Le 24/01/2021 à 09:22, Nicholas Piggin a écrit :
>> > Support huge page vmalloc mappings. Config option HAVE_ARCH_HUGE_VMALLOC
>>
Excerpts from Christoph Hellwig's message of January 25, 2021 1:07 am:
> On Sun, Jan 24, 2021 at 06:22:29PM +1000, Nicholas Piggin wrote:
>> diff --git a/arch/Kconfig b/arch/Kconfig
>> index 24862d15f3a3..f87feb616184 100644
>> --- a/arch/Kconfig
>> +++ b/arch/K
Excerpts from Christoph Hellwig's message of January 24, 2021 9:40 pm:
>> diff --git a/arch/arm64/include/asm/vmalloc.h
>> b/arch/arm64/include/asm/vmalloc.h
>> index 2ca708ab9b20..597b40405319 100644
>> --- a/arch/arm64/include/asm/vmalloc.h
>> +++ b/arch/arm64/include/asm/vmalloc.h
>> @@ -1,4
Excerpts from Christoph Hellwig's message of January 24, 2021 9:36 pm:
> On Sun, Jan 24, 2021 at 06:22:22PM +1000, Nicholas Piggin wrote:
>> This will be used as a generic kernel virtual mapping function, so
>> re-name it in preparation.
>
> The new name looks ok, but s
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: Catalin Marinas
Cc: Will Deacon
Cc: linux-arm-ker...@lists.infradead.org
Acked-by: Catalin Marinas
Signed-off-by: Nicholas Piggin
---
arch/arm64/include
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Nicholas Piggin
---
Documentation/admin-guide/kernel-parameters.txt | 2 ++
arch/powerpc/Kconfig| 1 +
arch/powerpc/kernel/module.c| 13 +++--
3 files changed, 14 insertions(+), 2 deletions
misses by nearly 30x on a `git diff` workload on a 2-node
POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%.
This can result in more internal fragmentation and memory overhead for a
given allocation, an option nohugevmalloc is added to disable at boot.
Signed-off-by: Nicholas Pig
As a side-effect, the order of flush_cache_vmap() and
arch_sync_kernel_mappings() calls are switched, but that now matches
the other callers in this file.
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 16 +---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/mm
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: x...@kernel.org
Cc: "H. Peter Anvin"
Signed-off-by: Nicholas Piggin
---
arch/x86/i
This is a generic kernel virtual memory mapper, not specific to ioremap.
Signed-off-by: Nicholas Piggin
---
include/linux/vmalloc.h | 3 +
mm/ioremap.c| 197
mm/vmalloc.c| 196 +++
3 files
This will be used as a generic kernel virtual mapping function, so
re-name it in preparation.
Signed-off-by: Nicholas Piggin
---
mm/ioremap.c | 64 +++-
1 file changed, 33 insertions(+), 31 deletions(-)
diff --git a/mm/ioremap.c b/mm/ioremap.c
apply_to_pte_range might mistake a large pte for bad, or treat it as a
page table, resulting in a crash or corruption. Add a test to warn and
return error if large entries are found.
Signed-off-by: Nicholas Piggin
---
mm/memory.c | 66 +++--
1
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: x...@kernel.org
Cc: "H. Peter Anvin"
Acked-by: Catalin Marinas [arm64]
Signed-off-by: Nicholas Piggin
---
arch/arm64/include/asm/vmalloc.h | 8 +++
arch/arm64/mm/mmu.c | 10 +--
arch/powerpc/i
The vmalloc mapper operates on a struct page * array rather than a
linear physical address, re-name it to make this distinction clear.
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: linuxppc-...@lists.ozlabs.org
Acked-by: Michael Ellerman
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/vmalloc.h | 19
pings")
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 41 ++---
1 file changed, 26 insertions(+), 15 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e6f352bf0498..62372f9e0167 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -34,7 +34,7 @@
ut.
- Made an architecture config option, powerpc only for now.
Since v3:
- Fixed an off-by-one bug in a loop
- Fix !CONFIG_HAVE_ARCH_HUGE_VMAP build fail
*** BLURB HERE ***
Nicholas Piggin (12):
mm/vmalloc: fix vmalloc_to_page for huge vmap mappings
mm: apply_to_pte_range warn and fail if a
Excerpts from Ding Tianhong's message of January 4, 2021 10:33 pm:
> On 2020/12/5 14:57, Nicholas Piggin wrote:
>> This changes the awkward approach where architectures provide init
>> functions to determine which levels they can provide large mappings for,
>> to one wher
Excerpts from Russell King - ARM Linux admin's message of December 30, 2020
8:58 pm:
> On Wed, Dec 30, 2020 at 10:00:28AM +, Russell King - ARM Linux admin
> wrote:
>> On Wed, Dec 30, 2020 at 12:33:02PM +1000, Nicholas Piggin wrote:
>> > Excerpts from Russell King - ARM
Excerpts from Russell King - ARM Linux admin's message of December 29, 2020
8:44 pm:
> On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
>> I think it should certainly be documented in terms of what guarantees
>> it provides to application, _not_ the kinds of inst
Excerpts from Andy Lutomirski's message of December 29, 2020 10:36 am:
> On Mon, Dec 28, 2020 at 4:11 PM Nicholas Piggin wrote:
>>
>> Excerpts from Andy Lutomirski's message of December 28, 2020 4:28 am:
>> > The old sync_core_before_usermode() comments said that a non-ic
Excerpts from Andy Lutomirski's message of December 29, 2020 10:56 am:
> On Mon, Dec 28, 2020 at 4:36 PM Nicholas Piggin wrote:
>>
>> Excerpts from Andy Lutomirski's message of December 29, 2020 7:06 am:
>> > On Mon, Dec 28, 2020 at 12:32 PM Mathieu Desnoyers
>> >
Excerpts from Andy Lutomirski's message of December 29, 2020 7:06 am:
> On Mon, Dec 28, 2020 at 12:32 PM Mathieu Desnoyers
> wrote:
>>
>> - On Dec 28, 2020, at 2:44 PM, Andy Lutomirski l...@kernel.org wrote:
>>
>> > On Mon, Dec 28, 2020 at 11:09 AM Russell King - ARM Linux admin
>> > wrote:
; Cc: Michael Ellerman
> Cc: Benjamin Herrenschmidt
> Cc: Paul Mackerras
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: Nicholas Piggin
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: Mathieu Desnoyers
> Cc: x...@kernel.org
> Cc
Excerpts from Christophe Leroy's message of December 22, 2020 11:28 pm:
> The address argument is not used by bad_page_fault().
>
> Remove it.
>
> Suggested-by: Nicholas Piggin
> Signed-off-by: Christophe Leroy
> ---
> arch/powerpc/include/asm/bug.h | 4 ++-
Excerpts from Christophe Leroy's message of December 22, 2020 11:28 pm:
> Let do_break() retrieve address and errorcode from regs.
>
> This simplifies the code and shouldn't impeed performance as
> address and errorcode are likely still hot in the cache.
>
> Suggested-b
Excerpts from Boqun Feng's message of November 14, 2020 1:30 am:
> Hi Nicholas,
>
> On Wed, Nov 11, 2020 at 09:07:23PM +1000, Nicholas Piggin wrote:
>> All the cool kids are doing it.
>>
>> Signed-off-by: Nicholas Piggin
>> ---
>> a
Excerpts from Michael Ellerman's message of December 14, 2020 8:43 pm:
> Nicholas Piggin writes:
>> Excerpts from Geert Uytterhoeven's message of December 10, 2020 7:06 pm:
>>> Hi Nicholas,
>>>
>>> On Fri, Nov 20, 2020 at 4:01 AM Nicholas Piggin wrote:
>
Use _lazy_tlb functions for lazy mm refcounting in powerpc, to prepare
to move to MMU_LAZY_TLB_SHOOTDOWN.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/smp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index
On a 16-socket 192-core POWER8 system, a context switching benchmark
with as many software threads as CPUs (so each switch will go in and
out of idle), upstream can achieve a rate of about 1 million context
switches per second. After this patch it goes up to 118 million.
Signed-off-by: Nicholas
provides an alternate scheme.
Signed-off-by: Nicholas Piggin
---
arch/Kconfig | 17 +
include/linux/sched/mm.h | 13 +--
kernel/sched/core.c | 75 ++--
kernel/sched/sched.h | 4 ++-
4 files changed, 87 insertions(+), 22
tional interrupts on a 144 CPU system during a kernel compile).
There are a number of strategies that could be employed to reduce IPIs
if they turn out to be a problem for some workload.
Signed-off-by: Nicholas Piggin
---
arch/Kconfig | 17 +++--
kernel/fork.
Add explicit _lazy_tlb annotated functions for lazy mm refcounting.
This makes things a bit more explicit, and allows explicit refcounting
to be removed if it is not used.
Signed-off-by: Nicholas Piggin
---
arch/arm/mach-rpc/ecard.c| 2 +-
arch/powerpc/mm/book3s64/radix_tlb.c | 4
This is another rebase, on top of mainline now (don't need the
asm-generic tree), and without any x86 or membarrier changes.
This makes the series far smaller and more manageable and
without the controversial bits.
Thanks,
Nick
Nicholas Piggin (5):
lazy tlb: introduce lazy mm refcount helper
Excerpts from Nicholas Piggin's message of December 14, 2020 2:07 pm:
> Excerpts from Andy Lutomirski's message of December 11, 2020 10:11 am:
>>> On Dec 5, 2020, at 7:59 PM, Nicholas Piggin wrote:
>>>
>>
>>> I'm still going to persue shoot-lazies for the mer
Excerpts from Geert Uytterhoeven's message of December 10, 2020 7:06 pm:
> Hi Nicholas,
>
> On Fri, Nov 20, 2020 at 4:01 AM Nicholas Piggin wrote:
>>
>> When offlining a CPU, powerpc/64s does not flush TLBs, rather it just
>> leaves the CPU set in mm_cpumasks, so it
Excerpts from Andy Lutomirski's message of December 11, 2020 10:11 am:
>> On Dec 5, 2020, at 7:59 PM, Nicholas Piggin wrote:
>>
>
>> I'm still going to persue shoot-lazies for the merge window. As you
>> see it's about a dozen lines and a if (IS_ENABLED(... in core cod
Excerpts from Andy Lutomirski's message of December 6, 2020 10:36 am:
> On Sat, Dec 5, 2020 at 3:15 PM Nicholas Piggin wrote:
>>
>> Excerpts from Andy Lutomirski's message of December 6, 2020 2:11 am:
>> >
>
>> If an mm was lazy tlb for a kernel
Excerpts from Andy Lutomirski's message of December 6, 2020 2:11 am:
>
>> On Dec 5, 2020, at 12:00 AM, Nicholas Piggin wrote:
>>
>>
>> I disagree. Until now nobody following it noticed that the mm gets
>> un-lazied in other cases, because that was not
Excerpts from Andy Lutomirski's message of December 3, 2020 3:09 pm:
> On Tue, Dec 1, 2020 at 6:50 PM Nicholas Piggin wrote:
>>
>> Excerpts from Andy Lutomirski's message of November 29, 2020 3:55 am:
>> > On Sat, Nov 28, 2020 at 8:02 AM Nicholas Piggin wrote:
As a side-effect, the order of flush_cache_vmap() and
arch_sync_kernel_mappings() calls are switched, but that now matches
the other callers in this file.
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 16 +---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/mm
misses by nearly 30x on a `git diff` workload on a 2-node
POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%.
This can result in more internal fragmentation and memory overhead for a
given allocation, an option nohugevmalloc is added to disable at boot.
Signed-off-by: Nicholas Pig
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Nicholas Piggin
---
Documentation/admin-guide/kernel-parameters.txt | 2 ++
arch/powerpc/Kconfig| 1 +
arch/powerpc/kernel/module.c| 13 +++--
3 files changed, 14 insertions(+), 2 deletions
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: x...@kernel.org
Cc: "H. Peter Anvin"
Acked-by: Catalin Marinas [arm64]
Signed-off-by: Nicholas Piggin
---
arch/arm64/include/asm/vmalloc.h | 8 +++
arch/arm64/mm/mmu.c | 10 +--
arch/powerpc/i
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: x...@kernel.org
Cc: "H. Peter Anvin"
Signed-off-by: Nicholas Piggin
---
arch/x86/i
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: Catalin Marinas
Cc: Will Deacon
Cc: linux-arm-ker...@lists.infradead.org
Acked-by: Catalin Marinas
Signed-off-by: Nicholas Piggin
---
arch/arm64/include
This is a generic kernel virtual memory mapper, not specific to ioremap.
Signed-off-by: Nicholas Piggin
---
include/linux/vmalloc.h | 3 +
mm/ioremap.c| 197
mm/vmalloc.c| 196 +++
3 files
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: linuxppc-...@lists.ozlabs.org
Acked-by: Michael Ellerman
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/vmalloc.h | 19
This will be used as a generic kernel virtual mapping function, so
re-name it in preparation.
Signed-off-by: Nicholas Piggin
---
mm/ioremap.c | 64 +++-
1 file changed, 33 insertions(+), 31 deletions(-)
diff --git a/mm/ioremap.c b/mm/ioremap.c
apply_to_pte_range might mistake a large pte for bad, or treat it as a
page table, resulting in a crash or corruption. Add a test to warn and
return error if large entries are found.
Signed-off-by: Nicholas Piggin
---
mm/memory.c | 66 +++--
1
The vmalloc mapper operates on a struct page * array rather than a
linear physical address, re-name it to make this distinction clear.
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
pings")
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 41 ++---
1 file changed, 26 insertions(+), 15 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6ae491a8b210..f85124e88bdb 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -34,7 +34,7 @@
- Rebased on vmalloc cleanups, split series into simpler pieces.
- Fixed several compile errors and warnings
- Keep the page array and accounting in small page units because
struct vm_struct is an interface (this should fix x86 vmap stack debug
assert). [Thanks Zefan]
Nicholas Piggin (12):
Excerpts from Andy Lutomirski's message of December 5, 2020 12:37 am:
>
>
>> On Dec 3, 2020, at 11:54 PM, Nicholas Piggin wrote:
>>
>> Excerpts from Andy Lutomirski's message of December 4, 2020 3:26 pm:
>>> This is a mockup. It's designed to illustrate t
Excerpts from Edgecombe, Rick P's message of December 5, 2020 4:33 am:
> On Fri, 2020-12-04 at 18:12 +1000, Nicholas Piggin wrote:
>> Excerpts from Edgecombe, Rick P's message of December 1, 2020 6:21
>> am:
>> > On Sun, 2020-11-29 at 01:25 +1000, Nicholas Piggin wrote:
Excerpts from Edgecombe, Rick P's message of December 1, 2020 6:21 am:
> On Sun, 2020-11-29 at 01:25 +1000, Nicholas Piggin wrote:
>> Support huge page vmalloc mappings. Config option
>> HAVE_ARCH_HUGE_VMALLOC
>> enables support on architectures that define HAVE_ARCH_HUGE_VMA
Excerpts from Andy Lutomirski's message of December 4, 2020 3:26 pm:
> This is a mockup. It's designed to illustrate the algorithm and how the
> code might be structured. There are several things blatantly wrong with
> it:
>
> The coding stype is not up to kernel standards. I have prototypes
Excerpts from Andy Lutomirski's message of December 4, 2020 3:26 pm:
> The core scheduler isn't a great place for
> membarrier_mm_sync_core_before_usermode() -- the core scheduler doesn't
> actually know whether we are lazy. With the old code, if a CPU is
> running a membarrier-registered task,
Excerpts from Peter Zijlstra's message of December 3, 2020 6:44 pm:
> On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote:
>
>> power: same as ARM, except that the loop may be rather larger since
>> the systems are bigger. But I imagine it's still faster than Nick's
>> approach -- a
gt;
>> On Sat, Nov 28, 2020 at 7:54 PM Andy Lutomirski wrote:
>> >
>> > On Sat, Nov 28, 2020 at 8:02 AM Nicholas Piggin wrote:
>> > >
>> > > On big systems, the mm refcount can become highly contented when doing
>> > > a lot of co
Excerpts from Andy Lutomirski's message of November 29, 2020 1:54 pm:
> On Sat, Nov 28, 2020 at 8:02 AM Nicholas Piggin wrote:
>>
>> On big systems, the mm refcount can become highly contented when doing
>> a lot of context switching with threaded applications (particularly
Excerpts from Andy Lutomirski's message of November 29, 2020 10:38 am:
> On Sat, Nov 28, 2020 at 8:01 AM Nicholas Piggin wrote:
>>
>> This is called at points where a lazy mm is switched away or made not
>> lazy (by its owner switching back).
>>
>> Signed-off-by:
Excerpts from Andy Lutomirski's message of November 29, 2020 3:55 am:
> On Sat, Nov 28, 2020 at 8:02 AM Nicholas Piggin wrote:
>>
>> And get rid of the generic sync_core_before_usermode facility. This is
>> functionally a no-op in the core scheduler code, but it also catc
Excerpts from Andy Lutomirski's message of November 29, 2020 10:36 am:
> On Sat, Nov 28, 2020 at 8:02 AM Nicholas Piggin wrote:
>>
>> NOMMU systems could easily go without this and save a bit of code
>> and the refcount atomics, because their mm switch is a no-op. I
>>
Switch remaining x86-specific users to asm/sync_core.h, remove the
linux/sync_core.h header and ARCH_ option.
Signed-off-by: Nicholas Piggin
---
arch/x86/Kconfig| 1 -
arch/x86/kernel/alternative.c | 2 +-
arch/x86/kernel/cpu/mce/core.c | 2 +-
drivers/misc/sgi
).
This makes lazy tlb code a bit more modular.
Signed-off-by: Nicholas Piggin
---
.../membarrier-sync-core/arch-support.txt | 6 -
arch/x86/include/asm/mmu_context.h| 27 +++
include/linux/sched/mm.h | 14 --
kernel/cpu.c
This is called at points where a lazy mm is switched away or made not
lazy (by its owner switching back).
Signed-off-by: Nicholas Piggin
---
arch/arm/mach-rpc/ecard.c| 1 +
arch/powerpc/mm/book3s64/radix_tlb.c | 1 +
fs/exec.c| 6 --
include/asm
go in a generic
mm/scheduler series if we get arch acks because it's really just
refactoring wrappers.
The main result is reduced contention on lazy tlb mm refcount that
helps very big systems.
Thanks,
Nick
Nicholas Piggin (8):
lazy tlb: introduce exit_lazy_tlb
x86: use exit_lazy_tlb rather
NOMMU systems could easily go without this and save a bit of code
and the refcount atomics, because their mm switch is a no-op. I
haven't flipped them over because haven't audited all arch code to
convert over to using the _lazy_tlb refcounting.
Signed-off-by: Nicholas Piggin
---
arch/Kconfig
Add explicit _lazy_tlb annotated functions for lazy mm refcounting.
This makes things a bit more explicit, and allows explicit refcounting
to be removed if it is not used.
Signed-off-by: Nicholas Piggin
---
arch/arm/mach-rpc/ecard.c| 2 +-
arch/powerpc/mm/book3s64/radix_tlb.c | 4
Use _lazy_tlb functions for lazy mm refcounting in powerpc, to prepare
to move to MMU_LAZY_TLB_SHOOTDOWN.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/smp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index
On a 16-socket 192-core POWER8 system, a context switching benchmark
with as many software threads as CPUs (so each switch will go in and
out of idle), upstream can achieve a rate of about 1 million context
switches per second. After this patch it goes up to 118 million.
Signed-off-by: Nicholas
tional interrupts on a 144 CPU system during a kernel compile).
There are a number of strategies that could be employed to reduce IPIs
if they turn out to be a problem for some workload.
Signed-off-by: Nicholas Piggin
---
arch/Kconfig | 13 +
kernel/fork.
apply_to_pte_range might mistake a large pte for bad, or treat it as a
page table, resulting in a crash or corruption. Add a test to warn and
return error if large entries are found.
Signed-off-by: Nicholas Piggin
---
mm/memory.c | 66 +++--
1
pings")
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 41 ++---
1 file changed, 26 insertions(+), 15 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6ae491a8b210..f85124e88bdb 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -34,7 +34,7 @@
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.
Cc: linuxppc-...@lists.ozlabs.org
Acked-by: Michael Ellerman
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/vmalloc.h | 19
As a side-effect, the order of flush_cache_vmap() and
arch_sync_kernel_mappings() calls are switched, but that now matches
the other callers in this file.
Signed-off-by: Nicholas Piggin
---
mm/vmalloc.c | 16 +---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/mm
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Nicholas Piggin
---
Documentation/admin-guide/kernel-parameters.txt | 2 ++
arch/powerpc/Kconfig| 1 +
2 files changed, 3 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt
b/Documentation
This will be used as a generic kernel virtual mapping function, so
re-name it in preparation.
Signed-off-by: Nicholas Piggin
---
mm/ioremap.c | 64 +++-
1 file changed, 33 insertions(+), 31 deletions(-)
diff --git a/mm/ioremap.c b/mm/ioremap.c
101 - 200 of 1466 matches
Mail list logo