[PATCH v4 10/14] treewide: Use initializer for struct vm_unmapped_area_info

2024-03-25 Thread Rick Edgecombe
Future changes will need to add a new member to struct
vm_unmapped_area_info. This would cause trouble for any call site that
doesn't initialize the struct. Currently every caller sets each member
manually, so if new ones are added they will be uninitialized and the
core code parsing the struct will see garbage in the new member.

It would be possible to initialize the new member manually to 0 at each
call site. This and a couple of other options were discussed. Having some
struct vm_unmapped_area_info instances not zero initialized will put
those sites at risk of feeding garbage into vm_unmapped_area(), if the
convention is to zero initialize the struct and any new field addition
misses a call site that initializes each field manually. So it is
useful to do things similarly across the kernel.

The consensus (see links) was that in general the best way to accomplish
this, taking into account both code cleanliness and minimizing the chance
of introducing bugs, was to do C99 static initialization. As in:
struct vm_unmapped_area_info info = {};

With this method of initialization, the whole struct will be zero
initialized, and any statements setting fields to zero will be unneeded.
The change should not leave cleanup at the call sites.
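
As a rough user-space sketch of the point above (not kernel code): the
struct below is a hypothetical stand-in for struct vm_unmapped_area_info
with one extra member, new_member, imagined to have been added later. All
demo_* names are assumptions made for illustration. The sketch spells the
initializer as { 0 } for portability; the = {} form used in the patch is
the GNU C / C23 spelling and zeroes the members in the same way.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for struct vm_unmapped_area_info. */
struct demo_area_info {
	unsigned long flags;
	unsigned long length;
	unsigned long low_limit;
	unsigned long high_limit;
	unsigned long align_mask;
	unsigned long align_offset;
	unsigned long new_member;	/* imagine this was added later */
};

/* Stand-in for the core code that parses the struct. */
static unsigned long demo_consume(const struct demo_area_info *info)
{
	/* Garbage here is the bug class the patch is guarding against. */
	return info->new_member;
}

/* A call site that only knows about the members it cares about. */
static unsigned long demo_call_site(unsigned long len, unsigned long low,
				    unsigned long high)
{
	/* Every member, including new_member, starts out as zero. */
	struct demo_area_info info = { 0 };

	info.length = len;
	info.low_limit = low;
	info.high_limit = high;
	return demo_consume(&info);
}
```

With the initializer in place, demo_call_site() hands a zero new_member to
the consumer even though the call site never mentions that member. Without
it, the value read would be indeterminate.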

While iterating through the possible solutions a few archs kindly acked
other variations that still zero initialized the struct. These sites have
been modified in previous changes using the pattern acked by the respective
arch.

So to reduce the chance of bugs via uninitialized fields, perform a
tree-wide change using the consensus for the best general way to do this
change. Use C99 static initialization to zero the struct and remove any
statements that simply set members to zero.

Signed-off-by: Rick Edgecombe 
Reviewed-by: Kees Cook 
Cc: linux...@kvack.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: loonga...@lists.linux.dev
Cc: linux-m...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Link: https://lore.kernel.org/lkml/202402280912.33AEE7A9CF@keescook/#t
Link: 
https://lore.kernel.org/lkml/j7bfvig3gew3qruouxrh7z7ehjjafrgkbcmg6tcghhfh3rhmzi@wzlcoecgy5rs/
Link: 
https://lore.kernel.org/lkml/ec3e377a-c0a0-4dd3-9cb9-96517e54d...@csgroup.eu/
---
v4:
 - Trivial rebase conflict in s390

Hi archs,

For some context, this is part of a larger series to improve shadow stack
guard gaps. It involves plumbing a new field via
struct vm_unmapped_area_info. The first user is x86, but arm and riscv may
likely use it as well. The change is compile tested only for non-x86.

Thanks,

Rick
---
 arch/alpha/kernel/osf_sys.c  | 5 +
 arch/arc/mm/mmap.c   | 4 +---
 arch/arm/mm/mmap.c   | 5 ++---
 arch/loongarch/mm/mmap.c | 3 +--
 arch/mips/mm/mmap.c  | 3 +--
 arch/s390/mm/hugetlbpage.c   | 7 ++-
 arch/s390/mm/mmap.c  | 5 ++---
 arch/sh/mm/mmap.c| 5 ++---
 arch/sparc/kernel/sys_sparc_32.c | 3 +--
 arch/sparc/kernel/sys_sparc_64.c | 5 ++---
 arch/sparc/mm/hugetlbpage.c  | 7 ++-
 arch/x86/kernel/sys_x86_64.c | 7 ++-
 arch/x86/mm/hugetlbpage.c| 7 ++-
 fs/hugetlbfs/inode.c | 7 ++-
 mm/mmap.c| 9 ++---
 15 files changed, 25 insertions(+), 57 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 5db88b627439..e5f881bc8288 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1218,14 +1218,11 @@ static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
 unsigned long limit)
 {
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
-   info.flags = 0;
info.length = len;
info.low_limit = addr;
info.high_limit = limit;
-   info.align_mask = 0;
-   info.align_offset = 0;
return vm_unmapped_area(&info);
 }
 
diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c
index 3c1c7ae73292..69a915297155 100644
--- a/arch/arc/mm/mmap.c
+++ b/arch/arc/mm/mmap.c
@@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 {
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
/*
 * We enforce the MAP_FIXED case.
@@ -51,11 +51,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
return addr;
}
 
-   info.flags = 0;
info.length = len;
info.low_limit = mm->mmap_base;
info.high_limit = TASK_SIZE;
-   info.align_mask = 0;
info.align_offset = pgoff << PAGE_SHIFT;
return vm_unmapped_area(&info);
 }
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index a0f8a0ca0788..d65d0e6ed10a 100644
--- 

Re: [PATCH v3 08/12] treewide: Use initializer for struct vm_unmapped_area_info

2024-03-13 Thread Edgecombe, Rick P
On Tue, 2024-03-12 at 20:18 -0700, Kees Cook wrote:
> 
> Thanks! This looks to do exactly what it describes. :)
> 
> Reviewed-by: Kees Cook 

Thanks!


Re: [PATCH v3 08/12] treewide: Use initializer for struct vm_unmapped_area_info

2024-03-12 Thread Kees Cook
On Tue, Mar 12, 2024 at 03:28:39PM -0700, Rick Edgecombe wrote:
> So to reduce the chance of bugs via uninitialized fields, perform a
> tree-wide change using the consensus for the best general way to do this
> change. Use C99 static initialization to zero the struct and remove any
> statements that simply set members to zero.
> 
> Signed-off-by: Rick Edgecombe 

Thanks! This looks to do exactly what it describes. :)

Reviewed-by: Kees Cook 

-- 
Kees Cook



[PATCH v3 08/12] treewide: Use initializer for struct vm_unmapped_area_info

2024-03-12 Thread Rick Edgecombe
Future changes will need to add a new member to struct
vm_unmapped_area_info. This would cause trouble for any call site that
doesn't initialize the struct. Currently every caller sets each member
manually, so if new ones are added they will be uninitialized and the
core code parsing the struct will see garbage in the new member.

It would be possible to initialize the new member manually to 0 at each
call site. This and a couple of other options were discussed. Having some
struct vm_unmapped_area_info instances not zero initialized will put
those sites at risk of feeding garbage into vm_unmapped_area(), if the
convention is to zero initialize the struct and any new field addition
misses a call site that initializes each field manually. So it is
useful to do things similarly across the kernel.

The consensus (see links) was that in general the best way to accomplish
this, taking into account both code cleanliness and minimizing the chance
of introducing bugs, was to do C99 static initialization. As in:
struct vm_unmapped_area_info info = {};

With this method of initialization, the whole struct will be zero
initialized, and any statements setting fields to zero will be unneeded.
The change should not leave cleanup at the call sites.

While iterating through the possible solutions a few archs kindly acked
other variations that still zero initialized the struct. These sites have
been modified in previous changes using the pattern acked by the respective
arch.

So to reduce the chance of bugs via uninitialized fields, perform a
tree-wide change using the consensus for the best general way to do this
change. Use C99 static initialization to zero the struct and remove any
statements that simply set members to zero.

Signed-off-by: Rick Edgecombe 
Cc: linux...@kvack.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: loonga...@lists.linux.dev
Cc: linux-m...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Link: https://lore.kernel.org/lkml/202402280912.33AEE7A9CF@keescook/#t
Link: 
https://lore.kernel.org/lkml/j7bfvig3gew3qruouxrh7z7ehjjafrgkbcmg6tcghhfh3rhmzi@wzlcoecgy5rs/
Link: 
https://lore.kernel.org/lkml/ec3e377a-c0a0-4dd3-9cb9-96517e54d...@csgroup.eu/
---
Hi archs,

For some context, this is part of a larger series to improve shadow stack
guard gaps. It involves plumbing a new field via
struct vm_unmapped_area_info. The first user is x86, but arm and riscv may
likely use it as well. The change is compile tested only for non-x86.

Thanks,

Rick
---
 arch/alpha/kernel/osf_sys.c  |  5 +
 arch/arc/mm/mmap.c   |  4 +---
 arch/arm/mm/mmap.c   |  5 ++---
 arch/loongarch/mm/mmap.c |  3 +--
 arch/mips/mm/mmap.c  |  3 +--
 arch/s390/mm/hugetlbpage.c   |  7 ++-
 arch/s390/mm/mmap.c  | 11 ---
 arch/sh/mm/mmap.c|  5 ++---
 arch/sparc/kernel/sys_sparc_32.c |  3 +--
 arch/sparc/kernel/sys_sparc_64.c |  5 ++---
 arch/sparc/mm/hugetlbpage.c  |  7 ++-
 arch/x86/kernel/sys_x86_64.c |  7 ++-
 arch/x86/mm/hugetlbpage.c|  7 ++-
 fs/hugetlbfs/inode.c |  7 ++-
 mm/mmap.c|  9 ++---
 15 files changed, 27 insertions(+), 61 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 5db88b627439..e5f881bc8288 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1218,14 +1218,11 @@ static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
 unsigned long limit)
 {
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
-   info.flags = 0;
info.length = len;
info.low_limit = addr;
info.high_limit = limit;
-   info.align_mask = 0;
-   info.align_offset = 0;
return vm_unmapped_area(&info);
 }
 
diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c
index 3c1c7ae73292..69a915297155 100644
--- a/arch/arc/mm/mmap.c
+++ b/arch/arc/mm/mmap.c
@@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 {
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
/*
 * We enforce the MAP_FIXED case.
@@ -51,11 +51,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
return addr;
}
 
-   info.flags = 0;
info.length = len;
info.low_limit = mm->mmap_base;
info.high_limit = TASK_SIZE;
-   info.align_mask = 0;
info.align_offset = pgoff << PAGE_SHIFT;
return vm_unmapped_area(&info);
 }
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index a0f8a0ca0788..d65d0e6ed10a 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ 

Re: [v2 PATCH 0/3] arch: mm, vdso: consolidate PAGE_SIZE definition

2024-03-08 Thread Vincenzo Frascino



On 06/03/2024 14:14, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Naresh noticed that the newly added usage of the PAGE_SIZE macro in
> include/vdso/datapage.h introduced a build regression. I had an older
> patch that I revived to have this defined through Kconfig rather than
> through including asm/page.h, which is not allowed in vdso code.
> 
> The vdso patch series now has a temporary workaround, but I still want to
> get this into v6.9 so we can replace the hack with CONFIG_PAGE_SIZE
> in the vdso.
> 
> I've applied this to the asm-generic tree already, please let me know if
> there are still remaining issues. It's really close to the merge window
> already, so I'd probably give this a few more days before I send a pull
> request, or defer it to v6.10 if anything goes wrong.
> 
> Sorry for the delay, I was still waiting to resolve the m68k question,
> but there were no further replies in the end, so I kept my original
> version.
> 
> Changes from v1:
> 
>  - improve Kconfig help texts
>  - remove an extraneous line in hexagon
> 
>   Arnd
>

Thanks Arnd, looks good to me.

Reviewed-by: Vincenzo Frascino 



Re: [RFC PATCH 00/14] Introducing TIF_NOTIFY_IPI flag

2024-03-07 Thread Julia Lawall



On Wed, 6 Mar 2024, Vincent Guittot wrote:

> Hi Prateek,
>
> Adding Julia who could be interested in this patchset. Your patchset
> should trigger idle load balance instead of newly idle load balance
> now when the polling is used. This was one reason for not migrating
> task in idle CPU

My situation is roughly as follows:

The machine is an Intel 6130 with two sockets and 32 hardware threads
(subsequently referred to as cores) per socket.  The test is bt.B of the
OpenMP version of the NAS benchmark suite.  Initially there is one
thread per core.  NUMA balancing occurs, resulting in a move, and thus 31
threads on one socket and 33 on the other.

Load balancing should result in the idle core pulling one of the threads
from the other socket.  But that doesn't happen in normal load balancing,
because all 33 threads on the overloaded socket are considered to have a
preference for that socket.  Active balancing could pull a thread, but it
is not triggered because the idle core is seen as being newly idle.

The question is then why a core that has been idle for up to multiple
seconds is continually seen as newly idle.  Every 4ms, a scheduler tick
submits some work to try to load balance.  This submission process
previously broke out of the idle loop due to a need_resched, hence the
same issue as involved in this patch series.  The need_resched caused
invocation of schedule, which would then see that there was no task to
pick, making the core be considered to be newly idle.  The classification
as newly idle doesn't take into account whether any task was running prior
to the call to schedule.

The load balancing work that was submitted every 4ms is also a NOP due to a
test for need_resched.

This patch series no longer makes need_resched be the only way out of the
idle loop.  Without the need_resched, the load balancing work that is
submitted every 4ms can actually try to do load balancing.  The core is
not newly idle, so active balancing could in principle occur.  But now
nothing happens because the work is run by ksoftirqd.  The presence of
ksoftirqd on the idle core means that the core is no longer idle.  Thus
there is no more need for load balancing.

So this patch series in itself doesn't solve the problem.  I did 500 runs
with this patch series and 500 runs with the Linux kernel that this patch
series builds on, and there is essentially no difference in the
performance.

julia


>
> On Tue, 20 Feb 2024 at 18:15, K Prateek Nayak  wrote:
> >
> > Hello everyone,
> >
> > Before jumping into the issue, let me clarify the Cc list. Everyone has
> > been cc'ed on Patch 0 through Patch 3. Respective arch maintainers,
> > reviewers, and committers returned by scripts/get_maintainer.pl have
> > been cc'ed on the respective arch side changes. Scheduler and CPU Idle
> > maintainers and reviewers have been included for the entire series. If I
> > have missed anyone, please do add them. If you would like to be dropped
> > from the cc list, wholly or partially, for the future iterations, please
> > do let me know.
> >
> > With that out of the way ...
> >
> > Problem statement
> > =================
> >
> > When measuring IPI throughput using a modified version of Anton
> > Blanchard's ipistorm benchmark [1], configured to measure time taken to
> > perform a fixed number of smp_call_function_single() (with wait set to
> > 1), an increase in benchmark time was observed between v5.7 and the
> > current upstream release (v6.7-rc6 at the time of encounter).
> >
> > Bisection pointed to commit b2a02fc43a1f ("smp: Optimize
> > send_call_function_single_ipi()") as the reason behind this increase in
> > runtime.
> >
> >
> > Experiments
> > ===========
> >
> > Since the commit cannot be cleanly reverted on top of the current
> > tip:sched/core, the effects of the optimizations were reverted by:
> >
> > 1. Removing the check for call_function_single_prep_ipi() in
> >send_call_function_single_ipi(). With this change
> >send_call_function_single_ipi() always calls
> >arch_send_call_function_single_ipi()
> >
> > 2. Removing the call to flush_smp_call_function_queue() in do_idle()
> >since every smp_call_function, with (1.), would unconditionally send
> >an IPI to an idle CPU in TIF_POLLING mode.
> >
> > Following is the diff of the above described changes which will be
> > henceforth referred to as the "revert":
> >
> > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > index 31231925f1ec..735184d98c0f 100644
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -332,11 +332,6 @@ static void do_idle(void)
> >  */
> > smp_mb__after_atomic();
> >
> > -   /*
> > -* RCU relies on this call to be done outside of an RCU read-side
> > -* critical section.
> > -*/
> > -   flush_smp_call_function_queue();
> > schedule_idle();
> >
> > if (unlikely(klp_patch_pending(current)))
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index 

Re: [PATCH v2 3/3] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-03-07 Thread Andreas Larsson
On 2024-03-06 15:14, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Most architectures only support a single hardcoded page size. In order
> to ensure that each one of these sets the corresponding Kconfig symbols,
> change over the PAGE_SHIFT definition to the common one and allow
> only the hardware page size to be selected.
> 
> Acked-by: Guo Ren 
> Acked-by: Heiko Carstens 
> Acked-by: Stafford Horne 
> Acked-by: Johannes Berg 
> Signed-off-by: Arnd Bergmann 
> ---
> No changes from v1

>  arch/sparc/Kconfig | 2 ++
>  arch/sparc/include/asm/page_32.h   | 2 +-
>  arch/sparc/include/asm/page_64.h   | 3 +--

Acked-by: Andreas Larsson 

Thanks,
Andreas




Re: [PATCH v2 2/3] arch: simplify architecture specific page size configuration

2024-03-07 Thread Michael Ellerman
Arnd Bergmann  writes:
> From: Arnd Bergmann 
>
> arc, arm64, parisc and powerpc all have their own Kconfig symbols
> in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
> so the common symbols are the ones that are actually used, while
> leaving the architecture specific ones as the user visible
> place for configuring it, to avoid breaking user configs.
>
> Reviewed-by: Christophe Leroy  (powerpc32)
> Acked-by: Catalin Marinas 
> Acked-by: Helge Deller  # parisc
> Signed-off-by: Arnd Bergmann 
> ---
> No changes from v1
>
>  arch/arc/Kconfig  |  3 +++
>  arch/arc/include/uapi/asm/page.h  |  6 ++
>  arch/arm64/Kconfig| 29 +
>  arch/arm64/include/asm/page-def.h |  2 +-
>  arch/parisc/Kconfig   |  3 +++
>  arch/parisc/include/asm/page.h| 10 +-
>  arch/powerpc/Kconfig  | 31 ++-
>  arch/powerpc/include/asm/page.h   |  2 +-
>  scripts/gdb/linux/constants.py.in |  2 +-
>  scripts/gdb/linux/mm.py   |  2 +-
>  10 files changed, 32 insertions(+), 58 deletions(-)

Acked-by: Michael Ellerman  (powerpc)

cheers



Re: [PATCH v2 1/3] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-03-06 Thread Michael Ellerman
Hi Arnd,

Arnd Bergmann  writes:
> From: Arnd Bergmann 
>
> These four architectures define the same Kconfig symbols for configuring
> the page size. Move the logic into a common place where it can be shared
> with all other architectures.
>
> Signed-off-by: Arnd Bergmann 
> ---
> Changes from v1:
>  - improve Kconfig help texts
>  - fix Hexagon Kconfig
>
>  arch/Kconfig  | 92 ++-
>  arch/hexagon/Kconfig  | 24 ++--
>  arch/hexagon/include/asm/page.h   |  6 +-
>  arch/loongarch/Kconfig| 21 ++-
>  arch/loongarch/include/asm/page.h | 10 +---
>  arch/mips/Kconfig | 58 ++-
>  arch/mips/include/asm/page.h  | 16 +-
>  arch/sh/include/asm/page.h| 13 +
>  arch/sh/mm/Kconfig| 42 --
>  9 files changed, 121 insertions(+), 161 deletions(-)

There's a few "help" lines missing, which breaks the build:

  arch/Kconfig:1134: syntax error
  arch/Kconfig:1133: invalid statement
  arch/Kconfig:1134: invalid statement
  arch/Kconfig:1135:warning: ignoring unsupported character '.'
  arch/Kconfig:1135:warning: ignoring unsupported character '.'
  arch/Kconfig:1135: invalid statement
  arch/Kconfig:1136: invalid statement
  arch/Kconfig:1137:warning: ignoring unsupported character '.'
  arch/Kconfig:1137: invalid statement
  arch/Kconfig:1143: syntax error
  arch/Kconfig:1142: invalid statement
  arch/Kconfig:1143: invalid statement
  arch/Kconfig:1144:warning: ignoring unsupported character '.'
  arch/Kconfig:1144: invalid statement
  arch/Kconfig:1145: invalid statement
  arch/Kconfig:1146: invalid statement
  arch/Kconfig:1147: invalid statement
  arch/Kconfig:1148:warning: ignoring unsupported character '.'
  arch/Kconfig:1148: invalid statement
  make[4]: *** [../scripts/kconfig/Makefile:85: syncconfig] Error 1

Fixup diff is:

diff --git a/arch/Kconfig b/arch/Kconfig
index 56d45a75f625..f2295fa3b48c 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1130,6 +1130,7 @@ config PAGE_SIZE_16KB
 config PAGE_SIZE_32KB
bool "32KiB pages"
depends on HAVE_PAGE_SIZE_32KB
+   help
  Using 32KiB page size will result in slightly higher performance
  kernel at the price of higher memory consumption compared to
  16KiB pages.  This option is available only on cnMIPS cores.
@@ -1139,6 +1140,7 @@ config PAGE_SIZE_32KB
 config PAGE_SIZE_64KB
bool "64KiB pages"
depends on HAVE_PAGE_SIZE_64KB
+   help
  Using 64KiB page size will result in slightly higher performance
  kernel at the price of much higher memory consumption compared to
  4KiB or 16KiB pages.


cheers



Re: [v2 PATCH 0/3] arch: mm, vdso: consolidate PAGE_SIZE definition

2024-03-06 Thread Thomas Gleixner
On Wed, Mar 06 2024 at 15:14, Arnd Bergmann wrote:
> From: Arnd Bergmann 
>
> Naresh noticed that the newly added usage of the PAGE_SIZE macro in
> include/vdso/datapage.h introduced a build regression. I had an older
> patch that I revived to have this defined through Kconfig rather than
> through including asm/page.h, which is not allowed in vdso code.
>
> The vdso patch series now has a temporary workaround, but I still want to
> get this into v6.9 so we can replace the hack with CONFIG_PAGE_SIZE
> in the vdso.

Thank you for cleaning this up!

  tglx



Re: [PATCH v2 3/3] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-03-06 Thread Thomas Gleixner
On Wed, Mar 06 2024 at 15:14, Arnd Bergmann wrote:

> From: Arnd Bergmann 
>
> Most architectures only support a single hardcoded page size. In order
> to ensure that each one of these sets the corresponding Kconfig symbols,
> change over the PAGE_SHIFT definition to the common one and allow
> only the hardware page size to be selected.
>
> Acked-by: Guo Ren 
> Acked-by: Heiko Carstens 
> Acked-by: Stafford Horne 
> Acked-by: Johannes Berg 
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Thomas Gleixner 



Re: [PATCH v2 2/3] arch: simplify architecture specific page size configuration

2024-03-06 Thread Thomas Gleixner
On Wed, Mar 06 2024 at 15:14, Arnd Bergmann wrote:
> From: Arnd Bergmann 
>
> arc, arm64, parisc and powerpc all have their own Kconfig symbols
> in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
> so the common symbols are the ones that are actually used, while
> leaving the architecture specific ones as the user visible
> place for configuring it, to avoid breaking user configs.
>
> Reviewed-by: Christophe Leroy  (powerpc32)
> Acked-by: Catalin Marinas 
> Acked-by: Helge Deller  # parisc
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Thomas Gleixner 



Re: [PATCH v2 1/3] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-03-06 Thread Thomas Gleixner
On Wed, Mar 06 2024 at 15:14, Arnd Bergmann wrote:
> From: Arnd Bergmann 
>
> These four architectures define the same Kconfig symbols for configuring
> the page size. Move the logic into a common place where it can be shared
> with all other architectures.
>
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Thomas Gleixner 



Re: [PATCH v2 3/3] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-03-06 Thread Geert Uytterhoeven
On Wed, Mar 6, 2024 at 3:15 PM Arnd Bergmann  wrote:
> From: Arnd Bergmann 
>
> Most architectures only support a single hardcoded page size. In order
> to ensure that each one of these sets the corresponding Kconfig symbols,
> change over the PAGE_SHIFT definition to the common one and allow
> only the hardware page size to be selected.
>
> Acked-by: Guo Ren 
> Acked-by: Heiko Carstens 
> Acked-by: Stafford Horne 
> Acked-by: Johannes Berg 
> Signed-off-by: Arnd Bergmann 
> ---
> No changes from v1

>  arch/m68k/Kconfig  | 3 +++
>  arch/m68k/Kconfig.cpu  | 2 ++
>  arch/m68k/include/asm/page.h   | 6 +-

Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds



[PATCH v2 3/3] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-03-06 Thread Arnd Bergmann
From: Arnd Bergmann 

Most architectures only support a single hardcoded page size. In order
to ensure that each one of these sets the corresponding Kconfig symbols,
change over the PAGE_SHIFT definition to the common one and allow
only the hardware page size to be selected.
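
For illustration only, the relationship that the converted headers rely on
can be sketched in user-space C. DEMO_CONFIG_PAGE_SHIFT is a hypothetical
stand-in for the Kconfig-generated CONFIG_PAGE_SHIFT (13 here, matching
the 8KB alpha case in the diff below), and the demo_* helpers are
assumptions, not kernel API.

```c
#include <assert.h>

/* Hypothetical stand-in for the value Kconfig would generate. */
#define DEMO_CONFIG_PAGE_SHIFT 13

/* Same shape as the definitions in the converted asm/page.h headers. */
#define DEMO_PAGE_SHIFT DEMO_CONFIG_PAGE_SHIFT
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Round an address down to the start of its page. */
static unsigned long demo_page_align_down(unsigned long addr)
{
	return addr & DEMO_PAGE_MASK;
}

/* Offset of an address within its page. */
static unsigned long demo_page_offset(unsigned long addr)
{
	return addr & ~DEMO_PAGE_MASK;
}
```

Because the size and the mask are both derived from the shift, moving the
shift into a single common symbol is enough to keep all three consistent.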

Acked-by: Guo Ren 
Acked-by: Heiko Carstens 
Acked-by: Stafford Horne 
Acked-by: Johannes Berg 
Signed-off-by: Arnd Bergmann 
---
No changes from v1

 arch/alpha/Kconfig | 1 +
 arch/alpha/include/asm/page.h  | 2 +-
 arch/arm/Kconfig   | 1 +
 arch/arm/include/asm/page.h| 2 +-
 arch/csky/Kconfig  | 1 +
 arch/csky/include/asm/page.h   | 2 +-
 arch/m68k/Kconfig  | 3 +++
 arch/m68k/Kconfig.cpu  | 2 ++
 arch/m68k/include/asm/page.h   | 6 +-
 arch/microblaze/Kconfig| 1 +
 arch/microblaze/include/asm/page.h | 2 +-
 arch/nios2/Kconfig | 1 +
 arch/nios2/include/asm/page.h  | 2 +-
 arch/openrisc/Kconfig  | 1 +
 arch/openrisc/include/asm/page.h   | 2 +-
 arch/riscv/Kconfig | 1 +
 arch/riscv/include/asm/page.h  | 2 +-
 arch/s390/Kconfig  | 1 +
 arch/s390/include/asm/page.h   | 2 +-
 arch/sparc/Kconfig | 2 ++
 arch/sparc/include/asm/page_32.h   | 2 +-
 arch/sparc/include/asm/page_64.h   | 3 +--
 arch/um/Kconfig| 1 +
 arch/um/include/asm/page.h | 2 +-
 arch/x86/Kconfig   | 1 +
 arch/x86/include/asm/page_types.h  | 2 +-
 arch/xtensa/Kconfig| 1 +
 arch/xtensa/include/asm/page.h | 2 +-
 28 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index d6968d090d49..4f490250d323 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -14,6 +14,7 @@ config ALPHA
select PCI_DOMAINS if PCI
select PCI_SYSCALL if PCI
select HAVE_ASM_MODVERSIONS
+   select HAVE_PAGE_SIZE_8KB
select HAVE_PCSPKR_PLATFORM
select HAVE_PERF_EVENTS
select NEED_DMA_MAP_STATE
diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
index 4db1ebc0ed99..70419e6be1a3 100644
--- a/arch/alpha/include/asm/page.h
+++ b/arch/alpha/include/asm/page.h
@@ -6,7 +6,7 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT 13
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0af6709570d1..9d52ba3a8ad1 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -116,6 +116,7 @@ config ARM
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI
select HAVE_OPTPROBES if !THUMB2_KERNEL
+   select HAVE_PAGE_SIZE_4KB
select HAVE_PCI if MMU
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 119aa85d1feb..62af9f7f9e96 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -8,7 +8,7 @@
 #define _ASMARM_PAGE_H
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT 12
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~((1 << PAGE_SHIFT) - 1))
 
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index cf2a6fd7dff8..9c2723ab1c94 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -89,6 +89,7 @@ config CSKY
select HAVE_KPROBES if !CPU_CK610
select HAVE_KPROBES_ON_FTRACE if !CPU_CK610
select HAVE_KRETPROBES if !CPU_CK610
+   select HAVE_PAGE_SIZE_4KB
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
diff --git a/arch/csky/include/asm/page.h b/arch/csky/include/asm/page.h
index 866855e1ab43..0ca6c408c07f 100644
--- a/arch/csky/include/asm/page.h
+++ b/arch/csky/include/asm/page.h
@@ -10,7 +10,7 @@
 /*
  * PAGE_SHIFT determines the page size: 4KB
  */
-#define PAGE_SHIFT 12
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1, UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE - 1))
 #define THREAD_SIZE(PAGE_SIZE * 2)
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 4b3e93cac723..7b709453d5e7 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -84,12 +84,15 @@ config MMU
 
 config MMU_MOTOROLA
bool
+   select HAVE_PAGE_SIZE_4KB
 
 config MMU_COLDFIRE
+   select HAVE_PAGE_SIZE_8KB
bool
 
 config MMU_SUN3
bool
+   select HAVE_PAGE_SIZE_8KB
depends on MMU && !MMU_MOTOROLA && !MMU_COLDFIRE
 
 config ARCH_SUPPORTS_KEXEC
diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index 9dcf245c9cbf..c777a129768a 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -30,6 +30,7 @@ config COLDFIRE
select GENERIC_CSUM
select GPIOLIB
   

[PATCH v2 2/3] arch: simplify architecture specific page size configuration

2024-03-06 Thread Arnd Bergmann
From: Arnd Bergmann 

arc, arm64, parisc and powerpc all have their own Kconfig symbols
in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
so the common symbols are the ones that are actually used, while
leaving the architecture specific ones as the user visible
place for configuring it, to avoid breaking user configs.

Reviewed-by: Christophe Leroy  (powerpc32)
Acked-by: Catalin Marinas 
Acked-by: Helge Deller  # parisc
Signed-off-by: Arnd Bergmann 
---
No changes from v1

 arch/arc/Kconfig  |  3 +++
 arch/arc/include/uapi/asm/page.h  |  6 ++
 arch/arm64/Kconfig| 29 +
 arch/arm64/include/asm/page-def.h |  2 +-
 arch/parisc/Kconfig   |  3 +++
 arch/parisc/include/asm/page.h| 10 +-
 arch/powerpc/Kconfig  | 31 ++-
 arch/powerpc/include/asm/page.h   |  2 +-
 scripts/gdb/linux/constants.py.in |  2 +-
 scripts/gdb/linux/mm.py   |  2 +-
 10 files changed, 32 insertions(+), 58 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 1b0483c51cc1..4092bec198be 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -284,14 +284,17 @@ choice
 
 config ARC_PAGE_SIZE_8K
bool "8KB"
+   select HAVE_PAGE_SIZE_8KB
help
  Choose between 8k vs 16k
 
 config ARC_PAGE_SIZE_16K
+   select HAVE_PAGE_SIZE_16KB
bool "16KB"
 
 config ARC_PAGE_SIZE_4K
bool "4KB"
+   select HAVE_PAGE_SIZE_4KB
depends on ARC_MMU_V3 || ARC_MMU_V4
 
 endchoice
diff --git a/arch/arc/include/uapi/asm/page.h b/arch/arc/include/uapi/asm/page.h
index 2a4ad619abfb..7fd9e741b527 100644
--- a/arch/arc/include/uapi/asm/page.h
+++ b/arch/arc/include/uapi/asm/page.h
@@ -13,10 +13,8 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#if defined(CONFIG_ARC_PAGE_SIZE_16K)
-#define PAGE_SHIFT 14
-#elif defined(CONFIG_ARC_PAGE_SIZE_4K)
-#define PAGE_SHIFT 12
+#ifdef __KERNEL__
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #else
 /*
  * Default 8k
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index aa7c1d435139..29290b8cb36d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -277,27 +277,21 @@ config 64BIT
 config MMU
def_bool y
 
-config ARM64_PAGE_SHIFT
-   int
-   default 16 if ARM64_64K_PAGES
-   default 14 if ARM64_16K_PAGES
-   default 12
-
 config ARM64_CONT_PTE_SHIFT
int
-   default 5 if ARM64_64K_PAGES
-   default 7 if ARM64_16K_PAGES
+   default 5 if PAGE_SIZE_64KB
+   default 7 if PAGE_SIZE_16KB
default 4
 
 config ARM64_CONT_PMD_SHIFT
int
-   default 5 if ARM64_64K_PAGES
-   default 5 if ARM64_16K_PAGES
+   default 5 if PAGE_SIZE_64KB
+   default 5 if PAGE_SIZE_16KB
default 4
 
 config ARCH_MMAP_RND_BITS_MIN
-   default 14 if ARM64_64K_PAGES
-   default 16 if ARM64_16K_PAGES
+   default 14 if PAGE_SIZE_64KB
+   default 16 if PAGE_SIZE_16KB
default 18
 
 # max bits determined by the following formula:
@@ -1259,11 +1253,13 @@ choice
 
 config ARM64_4K_PAGES
bool "4KB"
+   select HAVE_PAGE_SIZE_4KB
help
  This feature enables 4KB pages support.
 
 config ARM64_16K_PAGES
bool "16KB"
+   select HAVE_PAGE_SIZE_16KB
help
  The system will use 16KB pages support. AArch32 emulation
  requires applications compiled with 16K (or a multiple of 16K)
@@ -1271,6 +1267,7 @@ config ARM64_16K_PAGES
 
 config ARM64_64K_PAGES
bool "64KB"
+   select HAVE_PAGE_SIZE_64KB
help
  This feature enables 64KB pages support (4KB by default)
  allowing only two levels of page tables and faster TLB
@@ -1291,19 +1288,19 @@ choice
 
 config ARM64_VA_BITS_36
bool "36-bit" if EXPERT
-   depends on ARM64_16K_PAGES
+   depends on PAGE_SIZE_16KB
 
 config ARM64_VA_BITS_39
bool "39-bit"
-   depends on ARM64_4K_PAGES
+   depends on PAGE_SIZE_4KB
 
 config ARM64_VA_BITS_42
bool "42-bit"
-   depends on ARM64_64K_PAGES
+   depends on PAGE_SIZE_64KB
 
 config ARM64_VA_BITS_47
bool "47-bit"
-   depends on ARM64_16K_PAGES
+   depends on PAGE_SIZE_16KB
 
 config ARM64_VA_BITS_48
bool "48-bit"
diff --git a/arch/arm64/include/asm/page-def.h b/arch/arm64/include/asm/page-def.h
index 2403f7b4cdbf..792e9fe881dc 100644
--- a/arch/arm64/include/asm/page-def.h
+++ b/arch/arm64/include/asm/page-def.h
@@ -11,7 +11,7 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT CONFIG_ARM64_PAGE_SHIFT
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1, UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 5c845e8d59d9..b180e684fa0d 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -273,6 +273,7 @@ choice
 
 config PARISC_PAGE_SIZE_4KB

[PATCH v2 1/3] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-03-06 Thread Arnd Bergmann
From: Arnd Bergmann 

These four architectures define the same Kconfig symbols for configuring
the page size. Move the logic into a common place where it can be shared
with all other architectures.
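
As a rough illustration of what the shared symbols encode, the C sketch
below maps a page size in KiB to the shift that the common PAGE_SHIFT
symbol defaults to. The helper names page_shift_for_kib() and
check_page_shifts() are hypothetical and exist only for this example; the
values mirror the "default ... if PAGE_SIZE_*KB" lines in the arch/Kconfig
hunk of this patch.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the patch: return the page shift for a
 * page size given in KiB, mirroring the "default N if PAGE_SIZE_*KB" lines
 * of the common PAGE_SHIFT symbol. Returns -1 for an unsupported size.
 */
int page_shift_for_kib(unsigned int kib)
{
	switch (kib) {
	case 4:   return 12;
	case 8:   return 13;
	case 16:  return 14;
	case 32:  return 15;
	case 64:  return 16;
	case 256: return 18;
	default:  return -1;
	}
}

/* Sanity check: 1 << shift must give the size back in bytes. */
void check_page_shifts(void)
{
	static const unsigned int sizes[] = { 4, 8, 16, 32, 64, 256 };
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		assert((1UL << page_shift_for_kib(sizes[i])) == sizes[i] * 1024UL);
}
```

Note that there is no 128KiB entry, so the shift jumps from 16 to 18.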

Signed-off-by: Arnd Bergmann 
---
Changes from v1:
 - improve Kconfig help texts
 - fix Hexagon Kconfig

 arch/Kconfig                      | 92 ++-
 arch/hexagon/Kconfig              | 24 ++--
 arch/hexagon/include/asm/page.h   |  6 +-
 arch/loongarch/Kconfig            | 21 ++-
 arch/loongarch/include/asm/page.h | 10 +---
 arch/mips/Kconfig                 | 58 ++-
 arch/mips/include/asm/page.h      | 16 +-
 arch/sh/include/asm/page.h        | 13 +
 arch/sh/mm/Kconfig                | 42 --
 9 files changed, 121 insertions(+), 161 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index a5af0edd3eb8..c63034e092d0 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1078,17 +1078,105 @@ config HAVE_ARCH_COMPAT_MMAP_BASES
  and vice-versa 32-bit applications to call 64-bit mmap().
  Required for applications doing different bitness syscalls.
 
+config HAVE_PAGE_SIZE_4KB
+   bool
+
+config HAVE_PAGE_SIZE_8KB
+   bool
+
+config HAVE_PAGE_SIZE_16KB
+   bool
+
+config HAVE_PAGE_SIZE_32KB
+   bool
+
+config HAVE_PAGE_SIZE_64KB
+   bool
+
+config HAVE_PAGE_SIZE_256KB
+   bool
+
+choice
+   prompt "MMU page size"
+
+config PAGE_SIZE_4KB
+   bool "4KiB pages"
+   depends on HAVE_PAGE_SIZE_4KB
+   help
+ This option select the standard 4KiB Linux page size and the only
+ available option on many architectures. Using 4KiB page size will
+ minimize memory consumption and is therefore recommended for low
+ memory systems.
+ Some software that is written for x86 systems makes incorrect
+ assumptions about the page size and only runs on 4KiB pages.
+
+config PAGE_SIZE_8KB
+   bool "8KiB pages"
+   depends on HAVE_PAGE_SIZE_8KB
+   help
+ This option is the only supported page size on a few older
+ processors, and can be slightly faster than 4KiB pages.
+
+config PAGE_SIZE_16KB
+   bool "16KiB pages"
+   depends on HAVE_PAGE_SIZE_16KB
+   help
+ This option is usually a good compromise between memory
+ consumption and performance for typical desktop and server
+ workloads, often saving a level of page table lookups compared
+ to 4KB pages as well as reducing TLB pressure and overhead of
+ per-page operations in the kernel at the expense of a larger
+ page cache.
+
+config PAGE_SIZE_32KB
+   bool "32KiB pages"
+   depends on HAVE_PAGE_SIZE_32KB
+ Using 32KiB page size will result in slightly higher performance
+ kernel at the price of higher memory consumption compared to
+ 16KiB pages.  This option is available only on cnMIPS cores.
+ Note that you will need a suitable Linux distribution to
+ support this.
+
+config PAGE_SIZE_64KB
+   bool "64KiB pages"
+   depends on HAVE_PAGE_SIZE_64KB
+ Using 64KiB page size will result in slightly higher performance
+ kernel at the price of much higher memory consumption compared to
+ 4KiB or 16KiB pages.
+ This is not suitable for general-purpose workloads but the
+ better performance may be worth the cost for certain types of
+ supercomputing or database applications that work mostly with
+ large in-memory data rather than small files.
+
+config PAGE_SIZE_256KB
+   bool "256KiB pages"
+   depends on HAVE_PAGE_SIZE_256KB
+   help
+ 256KiB pages have little practical value due to their extreme
+ memory usage.  The kernel will only be able to run applications
+ that have been compiled with '-zmax-page-size' set to 256KiB
+ (the default is 64KiB or 4KiB on most architectures).
+
+endchoice
+
 config PAGE_SIZE_LESS_THAN_64KB
def_bool y
-   depends on !ARM64_64K_PAGES
depends on !PAGE_SIZE_64KB
-   depends on !PARISC_PAGE_SIZE_64KB
depends on PAGE_SIZE_LESS_THAN_256KB
 
 config PAGE_SIZE_LESS_THAN_256KB
def_bool y
depends on !PAGE_SIZE_256KB
 
+config PAGE_SHIFT
+   int
+   default 12 if PAGE_SIZE_4KB
+   default 13 if PAGE_SIZE_8KB
+   default 14 if PAGE_SIZE_16KB
+   default 15 if PAGE_SIZE_32KB
+   default 16 if PAGE_SIZE_64KB
+   default 18 if PAGE_SIZE_256KB
+
 # This allows to use a set of generic functions to determine mmap base
 # address by giving priority to top-down scheme only if the process
 # is not in legacy mode (compat task, unlimited stack size or
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index a880ee067d2e..1414052e7d6b 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -8,6 +8,10 @@ config HEXAGON
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select 

[v2 PATCH 0/3] arch: mm, vdso: consolidate PAGE_SIZE definition

2024-03-06 Thread Arnd Bergmann
From: Arnd Bergmann 

Naresh noticed that the newly added usage of the PAGE_SIZE macro in
include/vdso/datapage.h introduced a build regression. I had an older
patch that I revived to have this defined through Kconfig rather than
through including asm/page.h, which is not allowed in vdso code.
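
A minimal sketch of the idea, assuming CONFIG_PAGE_SHIFT is provided by
Kconfig: the page size and mask can then be derived without including
asm/page.h. The DEMO_* macros and demo_* helpers are hypothetical names
for this example only, and CONFIG_PAGE_SHIFT is defined by hand so that
the fragment compiles on its own.

```c
#include <assert.h>

/*
 * Assumption for this sketch: in a real build CONFIG_PAGE_SHIFT comes from
 * Kconfig (include/generated/autoconf.h). It is defined by hand here so the
 * example compiles on its own.
 */
#ifndef CONFIG_PAGE_SHIFT
#define CONFIG_PAGE_SHIFT 12
#endif

#define DEMO_PAGE_SHIFT	CONFIG_PAGE_SHIFT
#define DEMO_PAGE_SIZE	(1UL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK	(~(DEMO_PAGE_SIZE - 1))

/* Round an address down to the start of its page. */
unsigned long demo_page_align_down(unsigned long addr)
{
	return addr & DEMO_PAGE_MASK;
}

/* Round an address up to the next page boundary. */
unsigned long demo_page_align_up(unsigned long addr)
{
	return (addr + DEMO_PAGE_SIZE - 1) & DEMO_PAGE_MASK;
}
```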

The vdso patch series now has a temporary workaround, but I still want to
get this into v6.9 so we can replace the hack with CONFIG_PAGE_SIZE
in the vdso.

I've applied this to the asm-generic tree already, please let me know if
there are still remaining issues. It's really close to the merge window
already, so I'd probably give this a few more days before I send a pull
request, or defer it to v6.10 if anything goes wrong.

Sorry for the delay, I was still waiting to resolve the m68k question,
but there were no further replies in the end, so I kept my original
version.

Changes from v1:

 - improve Kconfig help texts
 - remove an extraneous line in hexagon

  Arnd

Link: https://lore.kernel.org/lkml/ca+g9fytrxxm_ko9fnpz3xarxhv7ud_yqp-teupqrnrhu+_0...@mail.gmail.com/
Link: https://lore.kernel.org/all/65dc6c14.170a0220.f4a3f.9...@mx.google.com/
Link: https://lore.kernel.org/lkml/20240226161414.2316610-1-a...@kernel.org/

Arnd Bergmann (3):
  arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions
  arch: simplify architecture specific page size configuration
  arch: define CONFIG_PAGE_SIZE_*KB on all architectures

 arch/Kconfig   | 92 +-
 arch/alpha/Kconfig |  1 +
 arch/alpha/include/asm/page.h  |  2 +-
 arch/arc/Kconfig   |  3 +
 arch/arc/include/uapi/asm/page.h   |  6 +-
 arch/arm/Kconfig   |  1 +
 arch/arm/include/asm/page.h|  2 +-
 arch/arm64/Kconfig | 29 +-
 arch/arm64/include/asm/page-def.h  |  2 +-
 arch/csky/Kconfig  |  1 +
 arch/csky/include/asm/page.h   |  2 +-
 arch/hexagon/Kconfig   | 24 ++--
 arch/hexagon/include/asm/page.h|  6 +-
 arch/loongarch/Kconfig | 21 ++-
 arch/loongarch/include/asm/page.h  | 10 +---
 arch/m68k/Kconfig  |  3 +
 arch/m68k/Kconfig.cpu  |  2 +
 arch/m68k/include/asm/page.h   |  6 +-
 arch/microblaze/Kconfig|  1 +
 arch/microblaze/include/asm/page.h |  2 +-
 arch/mips/Kconfig  | 58 ++-
 arch/mips/include/asm/page.h   | 16 +-
 arch/nios2/Kconfig |  1 +
 arch/nios2/include/asm/page.h  |  2 +-
 arch/openrisc/Kconfig  |  1 +
 arch/openrisc/include/asm/page.h   |  2 +-
 arch/parisc/Kconfig|  3 +
 arch/parisc/include/asm/page.h | 10 +---
 arch/powerpc/Kconfig   | 31 ++
 arch/powerpc/include/asm/page.h|  2 +-
 arch/riscv/Kconfig |  1 +
 arch/riscv/include/asm/page.h  |  2 +-
 arch/s390/Kconfig  |  1 +
 arch/s390/include/asm/page.h   |  2 +-
 arch/sh/include/asm/page.h | 13 +
 arch/sh/mm/Kconfig | 42 --
 arch/sparc/Kconfig |  2 +
 arch/sparc/include/asm/page_32.h   |  2 +-
 arch/sparc/include/asm/page_64.h   |  3 +-
 arch/um/Kconfig|  1 +
 arch/um/include/asm/page.h |  2 +-
 arch/x86/Kconfig   |  1 +
 arch/x86/include/asm/page_types.h  |  2 +-
 arch/xtensa/Kconfig|  1 +
 arch/xtensa/include/asm/page.h |  2 +-
 scripts/gdb/linux/constants.py.in  |  2 +-
 scripts/gdb/linux/mm.py|  2 +-
 47 files changed, 185 insertions(+), 238 deletions(-)

-- 
2.39.2

To: Thomas Gleixner 
To: Vincenzo Frascino 
To: Kees Cook 
To: Anna-Maria Behnsen 
Cc: Matt Turner 
Cc: Vineet Gupta 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Guo Ren 
Cc: Brian Cain 
Cc: Huacai Chen 
Cc: Geert Uytterhoeven 
Cc: Michal Simek 
Cc: Thomas Bogendoerfer 
Cc: Helge Deller 
Cc: Michael Ellerman 
Cc: Christophe Leroy 
Cc: Palmer Dabbelt 
Cc: John Paul Adrian Glaubitz 
Cc: Andreas Larsson 
Cc: Richard Weinberger 
Cc: x...@kernel.org
Cc: Max Filippov 
Cc: Andy Lutomirski 
Cc: Vincenzo Frascino 
Cc: Jan Kiszka 
Cc: Kieran Bingham 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: linux-ker...@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: linux-hexa...@vger.kernel.org
Cc: loonga...@lists.linux.dev
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: linux-openr...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org



Re: [RFC PATCH 00/14] Introducing TIF_NOTIFY_IPI flag

2024-03-06 Thread Vincent Guittot
On Wed, 6 Mar 2024 at 11:18, K Prateek Nayak  wrote:
>
> Hello Vincent,
>
> Thank you for taking a look at the series.
>
> On 3/6/2024 3:29 PM, Vincent Guittot wrote:
> > Hi Prateek,
> >
> > Adding Julia who could be interested in this patchset. Your patchset
> > should trigger idle load balance instead of newly idle load balance
> > now when the polling is used. This was one reason for not migrating
> > task in idle CPU
>
> Thank you.
>
> >
> > On Tue, 20 Feb 2024 at 18:15, K Prateek Nayak wrote:
> >>
> >> Hello everyone,
> >>
> >> [..snip..]
> >>
> >>
> >> Skipping newidle_balance()
> >> ==
> >>
> >> In an earlier attempt to solve the challenge of the long IRQ disabled
> >> section, newidle_balance() was skipped when a CPU waking up from idle
> >> was found to have no runnable tasks, and was transitioning back to
> >> idle [2]. Tim [3] and David [4] had pointed out that newidle_balance()
> >> may be viable for CPUs that are idling with tick enabled, where the
> >> newidle_balance() has the opportunity to pull tasks onto the idle CPU.
> >>
> >> Vincent [5] pointed out a case where the idle load kick will fail to
> >> run on an idle CPU since the IPI handler launching the ILB will check
> >> for need_resched(). In such cases, the idle CPU relies on
> >> newidle_balance() to pull tasks towards itself.
> >
> > Calling newidle_balance() instead of the normal idle load balance
> > prevents the CPU to pull tasks from other groups
>
> Thank you for the correction.
>
> >
> >>
> >> Using an alternate flag instead of NEED_RESCHED to indicate a pending
> >> IPI was suggested as the correct approach to solve this problem on the
> >> same thread.
> >>
> >>
> >> Proposed solution: TIF_NOTIFY_IPI
> >> =
> >>
> >> Instead of reusing TIF_NEED_RESCHED bit to pull an TIF_POLLING CPU out
> >> of idle, TIF_NOTIFY_IPI is a newly introduced flag that
> >> call_function_single_prep_ipi() sets on a target TIF_POLLING CPU to
> >> indicate a pending IPI, which the idle CPU promises to process soon.
> >>
> >> On architectures that do not support the TIF_NOTIFY_IPI flag (this
> >> series only adds support for x86 and ARM processors for now),
> >
> > I'm surprised that you are mentioning ARM processors because they
> > don't use TIF_POLLING.
>
> Yup I just realised that after Linus Walleij pointed it out on the
> thread.
>
> >
> >> call_function_single_prep_ipi() will fallback to setting
> >> TIF_NEED_RESCHED bit to pull the TIF_POLLING CPU out of idle.
> >>
> >> Since the pending IPI handlers are processed before the call to
> >> schedule_idle() in do_idle(), schedule_idle() will only be called if the
> >> IPI handler have woken / migrated a new task on the idle CPU and has set
> >> TIF_NEED_RESCHED bit to indicate the same. This avoids running into the
> >> long IRQ disabled section in schedule_idle() unnecessarily, and any
> >> need_resched() check within a call function will accurately notify if a
> >> task is waiting for CPU time on the CPU handling the IPI.
> >>
> >> Following is the crude visualization of how the situation changes with
> >> the newly introduced TIF_NOTIFY_IPI flag:
> >> --
> >> CPU0CPU1
> >> 
> >> do_idle() {
> >> 
> >> __current_set_polling();
> >> ...
> >> 
> >> monitor(addr);
> >> if 
> >> (!need_resched_or_ipi())
> >> 
> >> mwait() {
> >> /* 
> >> Waiting */
> >> smp_call_function_single(CPU1, func, wait = 1) {   
> >>  ...
> >> ...
> >>  ...
> >> set_nr_if_polling(CPU1) {  
> >>  ...
> >> /* Realizes CPU1 is polling */ 
> >>  ...
> >> try_cmpxchg(addr,  
> >>  ...
> >> ,  
> >>  ...
> >> val | _TIF_NOTIFY_IPI);
> >>  ...
> >> } /* Does not send an IPI */   
> >>  ...
> >> ... } 
> >> /* mwait exit due to write at addr */
> >> csd_lock_wait() {   ...
> >> /* Waiting */   
> >> 

Re: [RFC PATCH 00/14] Introducing TIF_NOTIFY_IPI flag

2024-03-06 Thread K Prateek Nayak
Hello Vincent,

Thank you for taking a look at the series.

On 3/6/2024 3:29 PM, Vincent Guittot wrote:
> Hi Prateek,
> 
> Adding Julia who could be interested in this patchset. Your patchset
> should trigger idle load balance instead of newly idle load balance
> now when the polling is used. This was one reason for not migrating
> task in idle CPU

Thank you.

> 
> On Tue, 20 Feb 2024 at 18:15, K Prateek Nayak  wrote:
>>
>> Hello everyone,
>>
>> [..snip..]
>>
>>
>> Skipping newidle_balance()
>> ==========================
>>
>> In an earlier attempt to solve the challenge of the long IRQ disabled
>> section, newidle_balance() was skipped when a CPU waking up from idle
>> was found to have no runnable tasks, and was transitioning back to
>> idle [2]. Tim [3] and David [4] had pointed out that newidle_balance()
>> may be viable for CPUs that are idling with tick enabled, where the
>> newidle_balance() has the opportunity to pull tasks onto the idle CPU.
>>
>> Vincent [5] pointed out a case where the idle load kick will fail to
>> run on an idle CPU since the IPI handler launching the ILB will check
>> for need_resched(). In such cases, the idle CPU relies on
>> newidle_balance() to pull tasks towards itself.
> 
> Calling newidle_balance() instead of the normal idle load balance
> prevents the CPU to pull tasks from other groups

Thank you for the correction.

> 
>>
>> Using an alternate flag instead of NEED_RESCHED to indicate a pending
>> IPI was suggested as the correct approach to solve this problem on the
>> same thread.
>>
>>
>> Proposed solution: TIF_NOTIFY_IPI
>> =================================
>>
>> Instead of reusing TIF_NEED_RESCHED bit to pull an TIF_POLLING CPU out
>> of idle, TIF_NOTIFY_IPI is a newly introduced flag that
>> call_function_single_prep_ipi() sets on a target TIF_POLLING CPU to
>> indicate a pending IPI, which the idle CPU promises to process soon.
>>
>> On architectures that do not support the TIF_NOTIFY_IPI flag (this
>> series only adds support for x86 and ARM processors for now),
> 
> I'm surprised that you are mentioning ARM processors because they
> don't use TIF_POLLING.

Yup I just realised that after Linus Walleij pointed it out on the
thread.

> 
>> call_function_single_prep_ipi() will fallback to setting
>> TIF_NEED_RESCHED bit to pull the TIF_POLLING CPU out of idle.
>>
>> Since the pending IPI handlers are processed before the call to
>> schedule_idle() in do_idle(), schedule_idle() will only be called if the
>> IPI handler have woken / migrated a new task on the idle CPU and has set
>> TIF_NEED_RESCHED bit to indicate the same. This avoids running into the
>> long IRQ disabled section in schedule_idle() unnecessarily, and any
>> need_resched() check within a call function will accurately notify if a
>> task is waiting for CPU time on the CPU handling the IPI.
>>
>> Following is the crude visualization of how the situation changes with
>> the newly introduced TIF_NOTIFY_IPI flag:
>> --
>> CPU0                                                CPU1
>>                                                     do_idle() {
>>                                                         __current_set_polling();
>>                                                         ...
>>                                                         monitor(addr);
>>                                                         if (!need_resched_or_ipi())
>>                                                             mwait() {
>>                                                             /* Waiting */
>> smp_call_function_single(CPU1, func, wait = 1) {                ...
>>     ...                                                         ...
>>     set_nr_if_polling(CPU1) {                                   ...
>>         /* Realizes CPU1 is polling */                          ...
>>         try_cmpxchg(addr,                                       ...
>>                     &val,                                       ...
>>                     val | _TIF_NOTIFY_IPI);                     ...
>>     } /* Does not send an IPI */                                ...
>>     ...                                                     } /* mwait exit due to write at addr */
>>     csd_lock_wait() {                                   ...
>>     /* Waiting */                                       preempt_fold_need_resched(); /* fold if NEED_RESCHED */
>>         ...                                             __current_clr_polling();
>>         ...                                             flush_smp_call_function_queue() {
>>         ...
Re: [RFC PATCH 00/14] Introducing TIF_NOTIFY_IPI flag

2024-03-06 Thread Vincent Guittot
Hi Prateek,

Adding Julia, who could be interested in this patchset. Your patchset
should now trigger idle load balance instead of newly idle load balance
when polling is used. This was one reason for tasks not being migrated
to an idle CPU.

On Tue, 20 Feb 2024 at 18:15, K Prateek Nayak  wrote:
>
> Hello everyone,
>
> Before jumping into the issue, let me clarify the Cc list. Everyone have
> been cc'ed on Patch 0 through Patch 3. Respective arch maintainers,
> reviewers, and committers returned by scripts/get_maintainer.pl have
> been cc'ed on the respective arch side changes. Scheduler and CPU Idle
> maintainers and reviewers have been included for the entire series. If I
> have missed anyone, please do add them. If you would like to be dropped
> from the cc list, wholly or partially, for the future iterations, please
> do let me know.
>
> With that out of the way ...
>
> Problem statement
> =================
>
> When measuring IPI throughput using a modified version of Anton
> Blanchard's ipistorm benchmark [1], configured to measure time taken to
> perform a fixed number of smp_call_function_single() (with wait set to
> 1), an increase in benchmark time was observed between v5.7 and the
> current upstream release (v6.7-rc6 at the time of encounter).
>
> Bisection pointed to commit b2a02fc43a1f ("smp: Optimize
> send_call_function_single_ipi()") as the reason behind this increase in
> runtime.
>
>
> Experiments
> ===========
>
> Since the commit cannot be cleanly reverted on top of the current
> tip:sched/core, the effects of the optimizations were reverted by:
>
> 1. Removing the check for call_function_single_prep_ipi() in
>send_call_function_single_ipi(). With this change
>send_call_function_single_ipi() always calls
>arch_send_call_function_single_ipi()
>
> 2. Removing the call to flush_smp_call_function_queue() in do_idle()
>since every smp_call_function, with (1.), would unconditionally send
>an IPI to an idle CPU in TIF_POLLING mode.
>
> Following is the diff of the above described changes which will be
> henceforth referred to as the "revert":
>
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 31231925f1ec..735184d98c0f 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -332,11 +332,6 @@ static void do_idle(void)
>  */
> smp_mb__after_atomic();
>
> -   /*
> -* RCU relies on this call to be done outside of an RCU read-side
> -* critical section.
> -*/
> -   flush_smp_call_function_queue();
> schedule_idle();
>
> if (unlikely(klp_patch_pending(current)))
> diff --git a/kernel/smp.c b/kernel/smp.c
> index f085ebcdf9e7..2ff100c41885 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -111,11 +111,9 @@ void __init call_function_init(void)
>  static __always_inline void
>  send_call_function_single_ipi(int cpu)
>  {
> -   if (call_function_single_prep_ipi(cpu)) {
> -   trace_ipi_send_cpu(cpu, _RET_IP_,
> -  generic_smp_call_function_single_interrupt);
> -   arch_send_call_function_single_ipi(cpu);
> -   }
> +   trace_ipi_send_cpu(cpu, _RET_IP_,
> +  generic_smp_call_function_single_interrupt);
> +   arch_send_call_function_single_ipi(cpu);
>  }
>
>  static __always_inline void
> --
>
> With the revert, the time taken to complete a fixed set of IPIs using
> ipistorm improves significantly. Following are the numbers from a dual
> socket 3rd Generation EPYC system (2 x 64C/128T) (boost on, C2 disabled)
> running ipistorm between CPU8 and CPU16:
>
> cmdline: insmod ipistorm.ko numipi=10 single=1 offset=8 cpulist=8 wait=1
>
> (tip:sched/core at tag "sched-core-2024-01-08" for all the testing done
> below)
>
>   ==================================================
>   Test          : ipistorm (modified)
>   Units         : Normalized runtime
>   Interpretation: Lower is better
>   Statistic     : AMean
>   ==================================================
>   kernel:                     time [pct imp]
>   tip:sched/core              1.00 [0.00]
>   tip:sched/core + revert     0.81 [19.36]
> Although the revert improves ipistorm performance, it also regresses
> tbench and netperf, supporting the validity of the optimization.
> Following are netperf and tbench numbers from the same machine comparing
> vanilla tip:sched/core and the revert applied on top:
>
>   ==================================================================
>   Test          : tbench
>   Units         : Normalized throughput
>   Interpretation: Higher is better
>   Statistic     : AMean
>   ==================================================================
>   Clients:    tip[pct imp](CV)       revert[pct imp](CV)
>       1     1.00 [  0.00]( 0.24)     0.91 [ -8.96]( 0.30)
>       2     1.00 [  0.00]( 0.25)     0.92 [ -8.20]( 0.97)
>       4     1.00 [  0.00]( 0.23)     0.91 [ -9.20]( 1.75)
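
For reading the "pct imp" columns, the sketch below assumes the usual
relative-change definition; the thread does not spell out the formula, so
both function names and the formula itself are assumptions. The
percentages in the tables were presumably computed from unrounded
measurements, so feeding in the rounded normalized values reproduces them
only approximately.

```c
#include <assert.h>

/* "Lower is better" metrics such as runtime: positive means faster. */
double pct_imp_lower_is_better(double baseline, double value)
{
	return (baseline - value) / baseline * 100.0;
}

/* "Higher is better" metrics such as throughput: positive means more work done. */
double pct_imp_higher_is_better(double baseline, double value)
{
	return (value - baseline) / baseline * 100.0;
}
```

With these definitions a normalized runtime of 0.81 is an improvement of
about 19%, and a normalized throughput of 0.91 is a regression of about 9%.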

Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-03-05 Thread Johannes Berg
On Mon, 2024-02-26 at 17:14 +0100, Arnd Bergmann wrote:
> 
>  arch/um/Kconfig| 1 +
>  arch/um/include/asm/page.h | 2 +-


LGTM, thanks.

Acked-by: Johannes Berg 

johannes



Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-03-04 Thread Edgecombe, Rick P
On Mon, 2024-03-04 at 18:00 +, Christophe Leroy wrote:
> > Personally, I think a single patch that sets "= {}" for all of them
> > and
> > drop the all the "= 0" or "= NULL" assignments would be the
> > cleanest way
> > to go.
> 
> I agree with Kees, set = {} and drop all the "something = 0;" stuff.

Thanks. Some of the archs have very nicely acked and reviewed the
existing patches. I'll leave those as is, and do this for any that
don't respond.


Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-03-04 Thread Christophe Leroy


On 02/03/2024 at 02:51, Kees Cook wrote:
> On Sat, Mar 02, 2024 at 12:47:08AM +, Edgecombe, Rick P wrote:
>> On Wed, 2024-02-28 at 09:21 -0800, Kees Cook wrote:
>>> I totally understand. If the "uninitialized" warnings were actually
>>> reliable, I would agree. I look at it this way:
>>>
>>> - initializations can be missed either in static initializers or via
>>>    run time initializers. (So the risk of mistake here is matched --
>>>    though I'd argue it's easier to *find* static initializers when
>>> adding
>>>    new struct members.)
>>> - uninitialized warnings are inconsistent (this becomes an unknown
>>> risk)
>>> - when a run time initializer is missed, the contents are whatever
>>> was
>>>    on the stack (high risk)
>>> - what a static initializer is missed, the content is 0 (low risk)
>>>
>>> I think unambiguous state (always 0) is significantly more important
>>> for
>>> the safety of the system as a whole. Yes, individual cases maybe bad
>>> ("what uid should this be? root?!") but from a general memory safety
>>> perspective the value doesn't become potentially influenced by order
>>> of
>>> operations, leftover stack memory, etc.
>>>
>>> I'd agree, lifting everything into a static initializer does seem
>>> cleanest of all the choices.
>>
>> Hi Kees,
>>
>> Well, I just gave this a try. It is giving me flashbacks of when I last
>> had to do a tree wide change that I couldn't fully test and the
>> breakage was caught by Linus.
> 
> Yeah, testing isn't fun for these kinds of things. This is traditionally
> why the "obviously correct" changes tend to have an easier time landing
> (i.e. adding "= {}" to all of them).
> 
>> Could you let me know if you think this is additionally worthwhile
>> cleanup outside of the guard gap improvements of this series? Because I
>> was thinking a more cowardly approach could be a new vm_unmapped_area()
>> variant that takes the new start gap member as a separate argument
>> outside of struct vm_unmapped_area_info. It would be kind of strange to
>> keep them separate, but it would be less likely to bump something.
> 
> I think you want a new member -- AIUI, that's what that struct is for.
> 
> Looking at this resulting set of patches, I do kinda think just adding
> the "= {}" in a single patch is more sensible. Having to split things
> that are know at the top of the function from the stuff known at the
> existing initialization time is rather awkward.
> 
> Personally, I think a single patch that sets "= {}" for all of them and
> drop the all the "= 0" or "= NULL" assignments would be the cleanest way
> to go.

I agree with Kees, set = {} and drop all the "something = 0;" stuff.

Christophe


Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-03-01 Thread Kees Cook
On Sat, Mar 02, 2024 at 12:47:08AM +, Edgecombe, Rick P wrote:
> On Wed, 2024-02-28 at 09:21 -0800, Kees Cook wrote:
> > I totally understand. If the "uninitialized" warnings were actually
> > reliable, I would agree. I look at it this way:
> > 
> > - initializations can be missed either in static initializers or via
> >   run time initializers. (So the risk of mistake here is matched --
> >   though I'd argue it's easier to *find* static initializers when
> > adding
> >   new struct members.)
> > - uninitialized warnings are inconsistent (this becomes an unknown
> > risk)
> > - when a run time initializer is missed, the contents are whatever
> > was
> >   on the stack (high risk)
> > - what a static initializer is missed, the content is 0 (low risk)
> > 
> > I think unambiguous state (always 0) is significantly more important
> > for
> > the safety of the system as a whole. Yes, individual cases maybe bad
> > ("what uid should this be? root?!") but from a general memory safety
> > perspective the value doesn't become potentially influenced by order
> > of
> > operations, leftover stack memory, etc.
> > 
> > I'd agree, lifting everything into a static initializer does seem
> > cleanest of all the choices.
> 
> Hi Kees,
> 
> Well, I just gave this a try. It is giving me flashbacks of when I last
> had to do a tree wide change that I couldn't fully test and the
> breakage was caught by Linus.

Yeah, testing isn't fun for these kinds of things. This is traditionally
why the "obviously correct" changes tend to have an easier time landing
(i.e. adding "= {}" to all of them).

> Could you let me know if you think this is additionally worthwhile
> cleanup outside of the guard gap improvements of this series? Because I
> was thinking a more cowardly approach could be a new vm_unmapped_area()
> variant that takes the new start gap member as a separate argument
> outside of struct vm_unmapped_area_info. It would be kind of strange to
> keep them separate, but it would be less likely to bump something.

I think you want a new member -- AIUI, that's what that struct is for.

Looking at this resulting set of patches, I do kinda think just adding
the "= {}" in a single patch is more sensible. Having to split things
that are known at the top of the function from the stuff known at the
existing initialization time is rather awkward.

Personally, I think a single patch that sets "= {}" for all of them and
drops all the "= 0" or "= NULL" assignments would be the cleanest way
to go.
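
A minimal sketch of the pattern under discussion, using a hypothetical
stand-in struct rather than the real struct vm_unmapped_area_info; the
struct name, the helper demo_make_info() and the start_gap member as
written here are assumptions for illustration only.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for struct vm_unmapped_area_info; the member names
 * are illustrative. "start_gap" plays the role of a newly added member.
 */
struct demo_area_info {
	unsigned long flags;
	unsigned long length;
	unsigned long low_limit;
	unsigned long high_limit;
	unsigned long align_mask;
	unsigned long align_offset;
	unsigned long start_gap;	/* new member, unknown to old call sites */
};

/*
 * Call site converted to the "= {}" pattern: every member starts at zero,
 * so only the non-zero values are assigned and a later-added member needs
 * no change here. (Empty braces are a GNU C extension before C23.)
 */
struct demo_area_info demo_make_info(unsigned long len, unsigned long low,
				     unsigned long high)
{
	struct demo_area_info info = {};

	info.length = len;
	info.low_limit = low;
	info.high_limit = high;
	return info;
}
```

Without the initializer, start_gap in this function would hold whatever
was on the stack, which is the high-risk case described above.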

-Kees

-- 
Kees Cook



Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-03-01 Thread Edgecombe, Rick P
On Wed, 2024-02-28 at 09:21 -0800, Kees Cook wrote:
> I totally understand. If the "uninitialized" warnings were actually
> reliable, I would agree. I look at it this way:
> 
> - initializations can be missed either in static initializers or via
>   run time initializers. (So the risk of mistake here is matched --
>   though I'd argue it's easier to *find* static initializers when
> adding
>   new struct members.)
> - uninitialized warnings are inconsistent (this becomes an unknown
> risk)
> - when a run time initializer is missed, the contents are whatever
> was
>   on the stack (high risk)
> - what a static initializer is missed, the content is 0 (low risk)
> 
> I think unambiguous state (always 0) is significantly more important
> for
> the safety of the system as a whole. Yes, individual cases maybe bad
> ("what uid should this be? root?!") but from a general memory safety
> perspective the value doesn't become potentially influenced by order
> of
> operations, leftover stack memory, etc.
> 
> I'd agree, lifting everything into a static initializer does seem
> cleanest of all the choices.

Hi Kees,

Well, I just gave this a try. It is giving me flashbacks of when I last
had to do a tree-wide change that I couldn't fully test and the
breakage was caught by Linus.

Could you let me know if you think this is additionally worthwhile
cleanup outside of the guard gap improvements of this series? Because I
was thinking a more cowardly approach could be a new vm_unmapped_area()
variant that takes the new start gap member as a separate argument
outside of struct vm_unmapped_area_info. It would be kind of strange to
keep them separate, but it would be less likely to bump something.
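
To make the alternative concrete, here is a toy sketch of what such a
variant could look like. Everything in it is hypothetical: the struct is a
trimmed-down stand-in, the function names are invented, and the placement
logic is a trivial bounds check rather than the real gap search.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for struct vm_unmapped_area_info. */
struct demo_info {
	unsigned long length;
	unsigned long low_limit;
	unsigned long high_limit;
};

/*
 * Hypothetical variant: the start gap travels as a separate argument, so
 * existing callers that fill the struct member by member are untouched.
 * Returns the lowest address that fits, or 0 if nothing fits.
 */
unsigned long demo_unmapped_area_gap(const struct demo_info *info,
				     unsigned long start_gap)
{
	unsigned long addr = info->low_limit + start_gap;

	if (addr < info->low_limit)			/* overflow */
		return 0;
	if (addr > info->high_limit ||
	    info->high_limit - addr < info->length)
		return 0;
	return addr;
}

/* Existing entry point keeps its signature and passes a zero gap. */
unsigned long demo_unmapped_area(const struct demo_info *info)
{
	return demo_unmapped_area_gap(info, 0);
}
```

The cost is visible in the signatures: one parameter lives outside the
struct that is meant to carry all of them.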

Thanks,

Rick


Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-28 Thread Christophe Leroy


On 28/02/2024 at 18:01, Edgecombe, Rick P wrote:
> On Wed, 2024-02-28 at 13:22 +, Christophe Leroy wrote:
>>> Any preference? Or maybe am I missing your point and talking
>>> nonsense?
>>>
>>
>> So my preference would go to the addition of:
>>
>>  info.new_field = 0;
>>
>> But that's very minor and if you think it is easier to manage and
>> maintain by performing {} initialisation at declaration, lets go for
>> that.
> 
> Appreciate the clarification and help getting this right. I'm thinking
> Kees' and now Kirill's point about this patch resulting in unnecessary
> manual zero initialization of the structs is probably something that
> needs to be addressed.
> 
> If I created a bunch of patches to change each call site, I think the
> the best is probably to do the designated field zero initialization
> way.
> 
> But I can do something for powerpc special if you want. I'll first try
> with powerpc matching the others, and if it seems objectionable, please
> let me know.
> 

My comments were generic, not powerpc-specific. Please keep
powerpc as similar as possible to the others.

Christophe


Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-28 Thread Stafford Horne
On Mon, Feb 26, 2024 at 05:14:13PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Most architectures only support a single hardcoded page size. In order
> to ensure that each one of these sets the corresponding Kconfig symbols,
> change over the PAGE_SHIFT definition to the common one and allow
> only the hardware page size to be selected.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/alpha/Kconfig | 1 +
>  arch/alpha/include/asm/page.h  | 2 +-
>  arch/arm/Kconfig   | 1 +
>  arch/arm/include/asm/page.h| 2 +-
>  arch/csky/Kconfig  | 1 +
>  arch/csky/include/asm/page.h   | 2 +-
>  arch/m68k/Kconfig  | 3 +++
>  arch/m68k/Kconfig.cpu  | 2 ++
>  arch/m68k/include/asm/page.h   | 6 +-
>  arch/microblaze/Kconfig| 1 +
>  arch/microblaze/include/asm/page.h | 2 +-
>  arch/nios2/Kconfig | 1 +
>  arch/nios2/include/asm/page.h  | 2 +-
>  arch/openrisc/Kconfig  | 1 +
>  arch/openrisc/include/asm/page.h   | 2 +-
>  arch/riscv/Kconfig | 1 +
>  arch/riscv/include/asm/page.h  | 2 +-
>  arch/s390/Kconfig  | 1 +
>  arch/s390/include/asm/page.h   | 2 +-
>  arch/sparc/Kconfig | 2 ++
>  arch/sparc/include/asm/page_32.h   | 2 +-
>  arch/sparc/include/asm/page_64.h   | 3 +--
>  arch/um/Kconfig| 1 +
>  arch/um/include/asm/page.h | 2 +-
>  arch/x86/Kconfig   | 1 +
>  arch/x86/include/asm/page_types.h  | 2 +-
>  arch/xtensa/Kconfig| 1 +
>  arch/xtensa/include/asm/page.h | 2 +-
>  28 files changed, 32 insertions(+), 19 deletions(-)

> diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
> index fd9bb76a610b..3586cda55bde 100644
> --- a/arch/openrisc/Kconfig
> +++ b/arch/openrisc/Kconfig
> @@ -25,6 +25,7 @@ config OPENRISC
>   select GENERIC_CPU_DEVICES
>   select HAVE_PCI
>   select HAVE_UID16
> + select HAVE_PAGE_SIZE_8KB
>   select GENERIC_ATOMIC64
>   select GENERIC_CLOCKEVENTS_BROADCAST
>   select GENERIC_SMP_IDLE_THREAD
> diff --git a/arch/openrisc/include/asm/page.h 
> b/arch/openrisc/include/asm/page.h
> index 44fc1fd56717..7925ce09ab5a 100644
> --- a/arch/openrisc/include/asm/page.h
> +++ b/arch/openrisc/include/asm/page.h
> @@ -18,7 +18,7 @@
>  
>  /* PAGE_SHIFT determines the page size */
>  
> -#define PAGE_SHIFT  13
> +#define PAGE_SHIFT  CONFIG_PAGE_SHIFT
>  #ifdef __ASSEMBLY__
>  #define PAGE_SIZE   (1 << PAGE_SHIFT)
>  #else

For the openrisc bits,

Acked-by: Stafford Horne 



Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-28 Thread Kees Cook
On Wed, Feb 28, 2024 at 01:22:09PM +, Christophe Leroy wrote:
> [...]
> My worry with initialisation at declaration is it often hides missing 
> assignments. Let's take the following simple example:
> 
> char *colour(int num)
> {
>   char *name;
> 
>   if (num == 0) {
>   name = "black";
>   } else if (num == 1) {
>   name = "white";
>   } else if (num == 2) {
>   } else {
>   name = "no colour";
>   }
> 
>   return name;
> }
> 
> Here, GCC warns about a missing initialisation of variable 'name'.

Sometimes. :( We build with -Wno-maybe-uninitialized because GCC gets
this wrong too often. Also, like with large structs like this, all
uninit warnings get suppressed if anything takes it by reference. So, if
before your "return name" statement above, you had something like:

do_something(&name);

it won't warn with any option enabled.

> But if I declare it as
> 
>   char *name = "no colour";
> 
> Then GCC won't warn anymore that we are missing a value for when num is 2.
> 
> During my life I have so many times spent huge amount of time 
> investigating issues and bugs due to missing assignments that were going 
> undetected due to default initialisation at declaration.

I totally understand. If the "uninitialized" warnings were actually
reliable, I would agree. I look at it this way:

- initializations can be missed either in static initializers or via
  run time initializers. (So the risk of mistake here is matched --
  though I'd argue it's easier to *find* static initializers when adding
  new struct members.)
- uninitialized warnings are inconsistent (this becomes an unknown risk)
- when a run time initializer is missed, the contents are whatever was
  on the stack (high risk)
- when a static initializer is missed, the content is 0 (low risk)

I think unambiguous state (always 0) is significantly more important for
the safety of the system as a whole. Yes, individual cases may be bad
("what uid should this be? root?!") but from a general memory safety
perspective the value doesn't become potentially influenced by order of
operations, leftover stack memory, etc.

I'd agree, lifting everything into a static initializer does seem
cleanest of all the choices.

-Kees

-- 
Kees Cook



Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-28 Thread Edgecombe, Rick P
On Wed, 2024-02-28 at 13:22 +, Christophe Leroy wrote:
> > Any preference? Or maybe am I missing your point and talking
> > nonsense?
> > 
> 
> So my preference would go to the addition of:
> 
> info.new_field = 0;
> 
> But that's very minor and if you think it is easier to manage and 
> maintain by performing {} initialisation at declaration, lets go for
> that.

Appreciate the clarification and help getting this right. I'm thinking
Kees' and now Kirill's point about this patch resulting in unnecessary
manual zero initialization of the structs is probably something that
needs to be addressed.

If I created a bunch of patches to change each call site, I think the
best is probably to do the designated field zero initialization
way.

But I can do something for powerpc special if you want. I'll first try
with powerpc matching the others, and if it seems objectionable, please
let me know.

Thanks,

Rick


Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-28 Thread Christophe Leroy


Le 27/02/2024 à 21:25, Edgecombe, Rick P a écrit :
> On Tue, 2024-02-27 at 18:16 +, Christophe Leroy wrote:
>>>> Why doing a full init of the struct when all fields are re-
>>>> written a few
>>>> lines after ?
>>>
>>> It's a nice change for robustness and makes future changes easier.
>>> It's
>>> not actually wasteful since the compiler will throw away all
>>> redundant
>>> stores.
>>
>> Well, I tend to dislike default init at declaration because it often
>> hides missed real init. When a field is not initialized GCC should
>> emit
>> a Warning, at least when built with W=2 which sets
>> -Wmissing-field-initializers ?
> 
> Sorry, I'm not following where you are going with this. There aren't
> any struct vm_unmapped_area_info users that use initializers today, so
> that warning won't apply in this case. Meanwhile, designated style
> struct initialization (which would zero new members) is very common, as
> well as not get anything checked by that warning. Anything with this
> many members is probably going to use the designated style.
> 
> If we are optimizing to avoid bugs, the way this struct is used today
> is not great. It is essentially being used as an argument passer.
> Normally when a function signature changes, but a caller is missed, of
> course the compiler will notice loudly. But not here. So I think
> probably zero initializing it is safer than being setup to pass
> garbage.

No worry, if everybody thinks that init at declaration is worth it in 
that case it is OK for me and I'm not going to ask for something special 
on powerpc, my comment was more general although I used powerpc as an 
example.

My worry with initialisation at declaration is it often hides missing 
assignments. Let's take the following simple example:

char *colour(int num)
{
	char *name;

	if (num == 0) {
		name = "black";
	} else if (num == 1) {
		name = "white";
	} else if (num == 2) {
	} else {
		name = "no colour";
	}

	return name;
}


Here, GCC warns about a missing initialisation of variable 'name'.

But if I declare it as

char *name = "no colour";

Then GCC won't warn anymore that we are missing a value for when num is 2.

During my life I have so many times spent huge amount of time 
investigating issues and bugs due to missing assignments that were going 
undetected due to default initialisation at declaration.
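
A self-contained sketch of the second variant, for illustration only. The name colour_with_default is made up here and this is not code from the series; it just shows how the default at declaration lets the empty branch pass unnoticed:

```c
/*
 * Illustrative only: same logic as colour() above, but with a default
 * value given at declaration. The branch for num == 2 still forgets
 * its assignment, yet 'name' is always initialised, so the compiler
 * has nothing to warn about.
 */
const char *colour_with_default(int num)
{
	const char *name = "no colour";	/* default masks the missing case */

	if (num == 0) {
		name = "black";
	} else if (num == 1) {
		name = "white";
	} else if (num == 2) {
		/* assignment forgotten: falls back to the default silently */
	}

	return name;
}
```

Calling colour_with_default(2) hands back the default string, which may or may not be what was intended; nothing at build time points at the empty branch.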

> 
> I'm trying to figure out what to do here. If I changed it so that just
> powerpc set the new field manually, then the convention across the
> kernel would be for everything to be default zero, and future other new
> parameters could have a greater chance of turning into garbage on
> powerpc. Since it could be easy to miss that powerpc was special. Would
> you prefer it?
> 
> Or maybe I could try a new vm_unmapped_area() that takes the extra
> argument separately? The old callers could call the old function and
> not need any arch updates. It all seems strange though, because
> automatic zero initializing struct members is so common in the kernel.
> But it also wouldn't add the cleanup Kees was pointing out. Hmm.
> 
> Any preference? Or maybe am I missing your point and talking nonsense?
> 

So my preference would go to the addition of:

info.new_field = 0;

But that's very minor and if you think it is easier to manage and 
maintain by performing {} initialisation at declaration, lets go for that.

Christophe


Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-28 Thread Kirill A. Shutemov
On Mon, Feb 26, 2024 at 11:09:47AM -0800, Rick Edgecombe wrote:
> diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
> index 5db88b627439..dd6801bb9240 100644
> --- a/arch/alpha/kernel/osf_sys.c
> +++ b/arch/alpha/kernel/osf_sys.c
> @@ -1218,7 +1218,7 @@ static unsigned long
>  arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
>unsigned long limit)
>  {
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>  
>   info.flags = 0;
>   info.length = len;

Can we go a step further and actually move the initialization inside the
initializer? Something like below.

I understand that it is substantially more work, but I think it is useful.

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 5db88b627439..c40ddede3b13 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1218,14 +1218,12 @@ static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
 unsigned long limit)
 {
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {
+   .length = len,
+   .low_limit = addr,
+   .high_limit = limit,
+   };

-   info.flags = 0;
-   info.length = len;
-   info.low_limit = addr;
-   info.high_limit = limit;
-   info.align_mask = 0;
-   info.align_offset = 0;
    return vm_unmapped_area(&info);
 }
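
For anyone who wants to try the pattern outside the kernel, here is a minimal standalone sketch. The struct area_info below is a hypothetical stand-in, not the real struct vm_unmapped_area_info, and start_gap is only an imagined "newly added" member. It shows that members left out of a designated initializer end up as zero:

```c
/* Hypothetical stand-in for the kernel struct; not the real definition. */
struct area_info {
	unsigned long flags;
	unsigned long length;
	unsigned long low_limit;
	unsigned long high_limit;
	unsigned long align_mask;
	unsigned long align_offset;
	unsigned long start_gap;	/* imagined new member, never named below */
};

/* Fill the argument struct the way the diff above does. */
struct area_info make_info(unsigned long addr, unsigned long len,
			   unsigned long limit)
{
	/* Members not listed here are implicitly zero-initialised. */
	struct area_info info = {
		.length = len,
		.low_limit = addr,
		.high_limit = limit,
	};

	return info;
}
```

With this shape, adding another member to the struct later needs no change in make_info(); the new member simply reads as 0 unless a caller sets it.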

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-27 Thread Edgecombe, Rick P
On Tue, 2024-02-27 at 18:16 +, Christophe Leroy wrote:
> > > Why doing a full init of the struct when all fields are re-
> > > written a few
> > > lines after ?
> > 
> > It's a nice change for robustness and makes future changes easier.
> > It's
> > not actually wasteful since the compiler will throw away all
> > redundant
> > stores.
> 
> Well, I tend to dislike default init at declaration because it often 
> hides missed real init. When a field is not initialized GCC should
> emit 
> a Warning, at least when built with W=2 which sets 
> -Wmissing-field-initializers ?

Sorry, I'm not following where you are going with this. There aren't
any struct vm_unmapped_area_info users that use initializers today, so
that warning won't apply in this case. Meanwhile, designated style
struct initialization (which would zero new members) is very common, and
it doesn't get anything checked by that warning either. Anything with
this many members is probably going to use the designated style.

If we are optimizing to avoid bugs, the way this struct is used today
is not great. It is essentially being used as an argument passer.
Normally when a function signature changes, but a caller is missed, of
course the compiler will notice loudly. But not here. So I think
probably zero initializing it is safer than being setup to pass
garbage.

I'm trying to figure out what to do here. If I changed it so that just
powerpc set the new field manually, then the convention across the
kernel would be for everything to be default zero, and future other new
parameters could have a greater chance of turning into garbage on
powerpc. Since it could be easy to miss that powerpc was special. Would
you prefer it?

Or maybe I could try a new vm_unmapped_area() that takes the extra
argument separately? The old callers could call the old function and
not need any arch updates. It all seems strange though, because
automatic zero initializing struct members is so common in the kernel.
But it also wouldn't add the cleanup Kees was pointing out. Hmm.

Any preference? Or maybe am I missing your point and talking nonsense?



Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-27 Thread Christophe Leroy


Le 27/02/2024 à 19:07, Kees Cook a écrit :
> On Tue, Feb 27, 2024 at 07:02:59AM +, Christophe Leroy wrote:
>>
>>
>> Le 26/02/2024 à 20:09, Rick Edgecombe a écrit :
>>> Future changes will need to add a field to struct vm_unmapped_area_info.
>>> This would cause trouble for any archs that don't initialize the
>>> struct. Currently every user sets each field, so if new fields are
>>> added, the core code parsing the struct will see garbage in the new
>>> field.
>>>
>>> It could be possible to initialize the new field for each arch to 0, but
>>> instead simply initialize the field with a C99 struct initializing syntax.
>>
>> Why doing a full init of the struct when all fields are re-written a few
>> lines after ?
> 
> It's a nice change for robustness and makes future changes easier. It's
> not actually wasteful since the compiler will throw away all redundant
> stores.

Well, I tend to dislike default init at declaration because it often 
hides missed real init. When a field is not initialized GCC should emit 
a Warning, at least when built with W=2 which sets 
-Wmissing-field-initializers ?

> 
>> If I take the example of powerpc function slice_find_area_bottomup():
>>
>>  struct vm_unmapped_area_info info;
>>
>>  info.flags = 0;
>>  info.length = len;
>>  info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
>>  info.align_offset = 0;
> 
> But one cleanup that is possible from explicitly zero-initializing the
> whole structure would be dropping all the individual "= 0" assignments.
> :)
> 

Sure, if we decide to go in that direction, all those 0 assignments become void.


Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-27 Thread Kees Cook
On Tue, Feb 27, 2024 at 07:02:59AM +, Christophe Leroy wrote:
> 
> 
> Le 26/02/2024 à 20:09, Rick Edgecombe a écrit :
> > Future changes will need to add a field to struct vm_unmapped_area_info.
> > This would cause trouble for any archs that don't initialize the
> > struct. Currently every user sets each field, so if new fields are
> > added, the core code parsing the struct will see garbage in the new
> > field.
> > 
> > It could be possible to initialize the new field for each arch to 0, but
> > instead simply initialize the field with a C99 struct initializing syntax.
> 
> Why doing a full init of the struct when all fields are re-written a few 
> lines after ?

It's a nice change for robustness and makes future changes easier. It's
not actually wasteful since the compiler will throw away all redundant
stores.

> If I take the example of powerpc function slice_find_area_bottomup():
> 
>   struct vm_unmapped_area_info info;
> 
>   info.flags = 0;
>   info.length = len;
>   info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
>   info.align_offset = 0;

But one cleanup that is possible from explicitly zero-initializing the
whole structure would be dropping all the individual "= 0" assignments.
:)

-- 
Kees Cook



Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Arnd Bergmann
On Tue, Feb 27, 2024, at 16:44, Christophe Leroy wrote:
> Le 27/02/2024 à 16:40, Arnd Bergmann a écrit :
>> On Mon, Feb 26, 2024, at 17:55, Samuel Holland wrote:
>
>
> For 256K pages, powerpc has the following help. I think you should have 
> it too:
>
> The kernel will only be able to run applications that have been
> compiled with '-zmax-page-size' set to 256K (the default is 64K) using
> binutils later than 2.17.50.0.3, or by patching the ELF_MAXPAGESIZE
> definition from 0x10000 to 0x40000 in older versions.

I don't think we need to mention pre-2.18 binutils any more, but the
rest seems useful, changed the text now to

config PAGE_SIZE_256KB
bool "256KiB pages"
depends on HAVE_PAGE_SIZE_256KB
help
  256KiB pages have little practical value due to their extreme
  memory usage.  The kernel will only be able to run applications
  that have been compiled with '-zmax-page-size' set to 256KiB
  (the default is 64KiB or 4KiB on most architectures).

  Arnd



Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Christophe Leroy


Le 27/02/2024 à 16:40, Arnd Bergmann a écrit :
> On Mon, Feb 26, 2024, at 17:55, Samuel Holland wrote:
>> On 2024-02-26 10:14 AM, Arnd Bergmann wrote:
>>>   
>>> +config HAVE_PAGE_SIZE_4KB
>>> +   bool
>>> +
>>> +config HAVE_PAGE_SIZE_8KB
>>> +   bool
>>> +
>>> +config HAVE_PAGE_SIZE_16KB
>>> +   bool
>>> +
>>> +config HAVE_PAGE_SIZE_32KB
>>> +   bool
>>> +
>>> +config HAVE_PAGE_SIZE_64KB
>>> +   bool
>>> +
>>> +config HAVE_PAGE_SIZE_256KB
>>> +   bool
>>> +
>>> +choice
>>> +   prompt "MMU page size"
>>
>> Should this have some generic help text (at least a warning about
>> compatibility)?
> 
> Good point. I've added some of this now, based on the mips
> text with some generalizations for other architectures:
> 
> config PAGE_SIZE_4KB
>  bool "4KiB pages"
>  depends on HAVE_PAGE_SIZE_4KB
>  help
>This option selects the standard 4KiB Linux page size and the only
>available option on many architectures. Using 4KiB page size will
>minimize memory consumption and is therefore recommended for low
>memory systems.
>Some software that is written for x86 systems makes incorrect
>assumptions about the page size and only runs on 4KiB pages.
> 
> config PAGE_SIZE_8KB
>  bool "8KiB pages"
>  depends on HAVE_PAGE_SIZE_8KB
>  help
>This option is the only supported page size on a few older
>processors, and can be slightly faster than 4KiB pages.
> 
> config PAGE_SIZE_16KB
>  bool "16KiB pages"
>  depends on HAVE_PAGE_SIZE_16KB
>  help
>This option is usually a good compromise between memory
>consumption and performance for typical desktop and server
>workloads, often saving a level of page table lookups compared
>to 4KB pages as well as reducing TLB pressure and overhead of
>per-page operations in the kernel at the expense of a larger
>page cache.
> 
> config PAGE_SIZE_32KB
>  bool "32KiB pages"
>  depends on HAVE_PAGE_SIZE_32KB
>  help
>Using 32KiB page size will result in slightly higher performance
>kernel at the price of higher memory consumption compared to
>16KiB pages.  This option is available only on cnMIPS cores.
>Note that you will need a suitable Linux distribution to
>support this.
> 
> config PAGE_SIZE_64KB
>  bool "64KiB pages"
>  depends on HAVE_PAGE_SIZE_64KB
>  help
>Using 64KiB page size will result in slightly higher performance
>kernel at the price of much higher memory consumption compared to
>4KiB or 16KiB pages.
>This is not suitable for general-purpose workloads but the
>better performance may be worth the cost for certain types of
>supercomputing or database applications that work mostly with
>large in-memory data rather than small files.
> 
> config PAGE_SIZE_256KB
>  bool "256KiB pages"
>  depends on HAVE_PAGE_SIZE_256KB
>  help
>256KB pages have little practical value due to their extreme
>memory usage.


For 256K pages, powerpc has the following help. I think you should have 
it too:

  The kernel will only be able to run applications that have been
  compiled with '-zmax-page-size' set to 256K (the default is 64K) using
  binutils later than 2.17.50.0.3, or by patching the ELF_MAXPAGESIZE
  definition from 0x10000 to 0x40000 in older versions.


Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Arnd Bergmann
On Tue, Feb 27, 2024, at 09:45, Geert Uytterhoeven wrote:
>
>> +config PAGE_SIZE_4KB
>> +   bool "4KB pages"
>
> Now you got rid of the 4000-byte ("4kB") pages and friends, please
> do not replace these by Kelvin-bytes, and use the official binary
> prefixes => "4 KiB".
>

Done, thanks.

Arnd



Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Arnd Bergmann
On Mon, Feb 26, 2024, at 20:02, Christophe Leroy wrote:
> Le 26/02/2024 à 17:14, Arnd Bergmann a écrit :
>> From: Arnd Bergmann 
>
> That's a nice re-factor.
>
> The only drawback I see is that we are losing several interesting 
> arch-specific comments/help text. Don't know if there could be an easy 
> way to keep them.

This is what I have now, trying to write it as generic as
possible while still giving useful advice:

config PAGE_SIZE_4KB
bool "4KiB pages"
depends on HAVE_PAGE_SIZE_4KB
help
  This option selects the standard 4KiB Linux page size and the only
  available option on many architectures. Using 4KiB page size will
  minimize memory consumption and is therefore recommended for low
  memory systems.
  Some software that is written for x86 systems makes incorrect
  assumptions about the page size and only runs on 4KiB pages.

config PAGE_SIZE_8KB
bool "8KiB pages"
depends on HAVE_PAGE_SIZE_8KB
help
  This option is the only supported page size on a few older
  processors, and can be slightly faster than 4KiB pages.

config PAGE_SIZE_16KB
bool "16KiB pages"
depends on HAVE_PAGE_SIZE_16KB
help
  This option is usually a good compromise between memory
  consumption and performance for typical desktop and server
  workloads, often saving a level of page table lookups compared
  to 4KB pages as well as reducing TLB pressure and overhead of
  per-page operations in the kernel at the expense of a larger
  page cache.

config PAGE_SIZE_32KB
bool "32KiB pages"
depends on HAVE_PAGE_SIZE_32KB
help
  Using 32KiB page size will result in slightly higher performance
  kernel at the price of higher memory consumption compared to
  16KiB pages.  This option is available only on cnMIPS cores.
  Note that you will need a suitable Linux distribution to
  support this.

config PAGE_SIZE_64KB
bool "64KiB pages"
depends on HAVE_PAGE_SIZE_64KB
help
  Using 64KiB page size will result in slightly higher performance
  kernel at the price of much higher memory consumption compared to
  4KiB or 16KiB pages.
  This is not suitable for general-purpose workloads but the
  better performance may be worth the cost for certain types of
  supercomputing or database applications that work mostly with
  large in-memory data rather than small files.

config PAGE_SIZE_256KB
bool "256KiB pages"
depends on HAVE_PAGE_SIZE_256KB
help
  256KB pages have little practical value due to their extreme
  memory usage.

Let me know if you think some of this should be adapted further.

>>   
>> +#define PAGE_SHIFT CONFIG_PAGE_SHIFT
>>   #define PAGE_SIZE  (1UL << PAGE_SHIFT)
>>   #define PAGE_MASK  (~((1 << PAGE_SHIFT) - 1))
>>   
>
> Could we move PAGE_SIZE and PAGE_MASK in a generic/core header instead 
> of having it duplicated for each arch ?

Yes, but I'm leaving this for a follow-up series, since I had
to stop somewhere and there is always room for cleaning up headers
further ;-)
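
As a rough illustration of what such a shared definition boils down to, here is a standalone sketch. The EX_ prefixed names are made up for this example, the shift of 13 (8KiB pages) is just an assumed value standing in for whatever Kconfig would provide, and the mask is derived from the unsigned long size rather than from an int shift:

```c
/* Assumed stand-in for the Kconfig-provided shift; 13 means 8KiB pages. */
#define EX_PAGE_SHIFT	13

/* Size and mask both follow from the shift alone. */
#define EX_PAGE_SIZE	(1UL << EX_PAGE_SHIFT)
#define EX_PAGE_MASK	(~(EX_PAGE_SIZE - 1))

/* Round an address down to the start of the page containing it. */
unsigned long ex_page_align_down(unsigned long addr)
{
	return addr & EX_PAGE_MASK;
}

/* Round an address up to the next page boundary (no-op if aligned). */
unsigned long ex_page_align_up(unsigned long addr)
{
	return (addr + EX_PAGE_SIZE - 1) & EX_PAGE_MASK;
}
```

Since everything is derived from one number, only the shift needs to differ between configurations.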

  Arnd



Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Arnd Bergmann
On Mon, Feb 26, 2024, at 17:55, Samuel Holland wrote:
> On 2024-02-26 10:14 AM, Arnd Bergmann wrote:
>>  
>> +config HAVE_PAGE_SIZE_4KB
>> +bool
>> +
>> +config HAVE_PAGE_SIZE_8KB
>> +bool
>> +
>> +config HAVE_PAGE_SIZE_16KB
>> +bool
>> +
>> +config HAVE_PAGE_SIZE_32KB
>> +bool
>> +
>> +config HAVE_PAGE_SIZE_64KB
>> +bool
>> +
>> +config HAVE_PAGE_SIZE_256KB
>> +bool
>> +
>> +choice
>> +prompt "MMU page size"
>
> Should this have some generic help text (at least a warning about 
> compatibility)?

Good point. I've added some of this now, based on the mips
text with some generalizations for other architectures:

config PAGE_SIZE_4KB
bool "4KiB pages" 
depends on HAVE_PAGE_SIZE_4KB
help 
  This option selects the standard 4KiB Linux page size and the only
  available option on many architectures. Using 4KiB page size will
  minimize memory consumption and is therefore recommended for low
  memory systems.
  Some software that is written for x86 systems makes incorrect
  assumptions about the page size and only runs on 4KiB pages.

config PAGE_SIZE_8KB
bool "8KiB pages"
depends on HAVE_PAGE_SIZE_8KB
help
  This option is the only supported page size on a few older
  processors, and can be slightly faster than 4KiB pages.

config PAGE_SIZE_16KB
bool "16KiB pages"
depends on HAVE_PAGE_SIZE_16KB
help 
  This option is usually a good compromise between memory
  consumption and performance for typical desktop and server
  workloads, often saving a level of page table lookups compared
  to 4KB pages as well as reducing TLB pressure and overhead of
  per-page operations in the kernel at the expense of a larger
  page cache. 

config PAGE_SIZE_32KB
bool "32KiB pages"
depends on HAVE_PAGE_SIZE_32KB
help
  Using 32KiB page size will result in slightly higher performance
  kernel at the price of higher memory consumption compared to
  16KiB pages.  This option is available only on cnMIPS cores.
  Note that you will need a suitable Linux distribution to
  support this.

config PAGE_SIZE_64KB
bool "64KiB pages"
depends on HAVE_PAGE_SIZE_64KB
help
  Using 64KiB page size will result in slightly higher performance
  kernel at the price of much higher memory consumption compared to
  4KiB or 16KiB pages.
  This is not suitable for general-purpose workloads but the
  better performance may be worth the cost for certain types of
  supercomputing or database applications that work mostly with
  large in-memory data rather than small files.

config PAGE_SIZE_256KB
bool "256KiB pages"
depends on HAVE_PAGE_SIZE_256KB
help
  256KB pages have little practical value due to their extreme
  memory usage.

>> diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
>> index a880ee067d2e..aac46ee1a000 100644
>> --- a/arch/hexagon/Kconfig
>> +++ b/arch/hexagon/Kconfig
>> @@ -8,6 +8,11 @@ config HEXAGON
>>  select ARCH_HAS_SYNC_DMA_FOR_DEVICE
>>  select ARCH_NO_PREEMPT
>>  select DMA_GLOBAL_POOL
>> +select FRAME_POINTER
>
> Looks like a paste error.
>

Fixed, thanks! I think that happened during a rebase.

>>  #ifdef CONFIG_PAGE_SIZE_1MB
>> -#define PAGE_SHIFT 20
>>  #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_1MB
>>  #endif
>
> The corresponding Kconfig option does not exist (and did not exist before this
> patch).

Yes, I noticed that as well. It's clearly harmless.

 Arnd



Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-27 Thread Edgecombe, Rick P
On Tue, 2024-02-27 at 07:02 +, Christophe Leroy wrote:
> > It could be possible to initialize the new field for each arch to
> > 0, but
> > instead simply initialize the field with a C99 struct initializing
> > syntax.
> 
> Why doing a full init of the struct when all fields are re-written a
> few 
> lines after ?
> 
> If I take the example of powerpc function slice_find_area_bottomup():
> 
> struct vm_unmapped_area_info info;
> 
> info.flags = 0;
> info.length = len;
> info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
> info.align_offset = 0;
> 
> For me it looks better to just add:
> 
> info.new_field = 0; /* or whatever value it needs to have */

Hi,

Thanks for taking a look. Yes, I guess that should have some
justification. I was thinking of two reasons:
1. No future additions of optional parameters would need to make tree
wide changes like this.
2. The change is easier to review and get correct because the necessary
context is within a single line. For example, in that function some of
the members are set within a while loop. The place you pointed seems to
be the correct one, but a diff that had the new field set after:
   info.high_limit = addr;
...would look correct too, but not be.

What is the concern with C99 initialization? FWIW, the full series also
removes an indirect branch, and probably is a net win for performance
in this path.



Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-27 Thread Arnd Bergmann
On Tue, Feb 27, 2024, at 12:12, Geert Uytterhoeven wrote:
> On Tue, Feb 27, 2024 at 11:59 AM Arnd Bergmann  wrote:
>> On Tue, Feb 27, 2024, at 09:54, Geert Uytterhoeven wrote:
>> I was a bit unsure about how to best do this since there
>> is not really a need for a fixed page size on nommu kernels,
>> whereas the three MMU configs clearly tie the page size to
>> the MMU rather than the platform.
>>
>> There should be no reason for coldfire to have a different
>> page size from dragonball if neither of them actually uses
>> hardware pages, so one of them could be changed later.
>
> Indeed, in theory, PAGE_SIZE doesn't matter for nommu, but the concept
> of pages is used all over the place in Linux.
>
> I'm mostly worried about some Coldfire code relying on the actual value
> of PAGE_SIZE in some other context. e.g. for configuring non-cacheable
> regions.

Right, any change here would have to be carefully tested. I would
expect that a 4K page size would reduce memory consumption even on
NOMMU systems that should have the same tradeoffs for representing
files in the page cache and in mem_map[].

> And does this impact running nommu binaries on a system with MMU?
> I.e. if nommu binaries were built with a 4 KiB PAGE_SIZE, do they
> still run on MMU systems with an 8 KiB PAGE_SIZE (coldfire and sun3),
> or are there some subtleties to take into account?

As far as I understand, binaries have to be built and linked for
the largest page size they can run on, so running them on a kernel
with smaller page size usually works.

One notable exception is sys_mmap2(), which on most architectures
takes units of 4KiB but on m68k is actually written to take
PAGE_SIZE units. As Al pointed out in f8b7256096a2 ("Unify
sys_mmap*"), it has always been wrong on sun3, presumably
because users of that predate modern glibc. Running coldfire
nommu binaries on coldfire mmu kernels would run into the same
bug if either of them changes PAGE_SIZE. If you can run
coldfire nommu binaries on classic m68k, that is already
broken in the same way.
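
To spell out the unit mismatch with numbers, here is a small sketch. The helper names are invented for illustration and this is not the kernel's syscall code; it only compares how the same page-offset argument turns into a byte offset under the two conventions:

```c
/* Byte offset when the page-offset argument counts fixed 4KiB units. */
unsigned long long off_fixed_4k(unsigned long pgoff)
{
	return (unsigned long long)pgoff << 12;
}

/* Byte offset when the same argument is read in PAGE_SIZE units. */
unsigned long long off_page_units(unsigned long pgoff, unsigned int page_shift)
{
	return (unsigned long long)pgoff << page_shift;
}
```

With 4KiB pages (shift 12) both give the same answer; with 8KiB pages (shift 13) the second convention maps at twice the byte offset the caller meant if the caller assumed 4KiB units.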

  Arnd



Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-27 Thread Heiko Carstens
On Mon, Feb 26, 2024 at 05:14:13PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Most architectures only support a single hardcoded page size. In order
> to ensure that each one of these sets the corresponding Kconfig symbols,
> change over the PAGE_SHIFT definition to the common one and allow
> only the hardware page size to be selected.
> 
> Signed-off-by: Arnd Bergmann 
> ---
...
>  arch/s390/Kconfig  | 1 +
>  arch/s390/include/asm/page.h   | 2 +-
...
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index fe565f3a3a91..b61c74c10050 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -199,6 +199,7 @@ config S390
>   select HAVE_MOD_ARCH_SPECIFIC
>   select HAVE_NMI
>   select HAVE_NOP_MCOUNT
> + select HAVE_PAGE_SIZE_4KB
>   select HAVE_PCI
>   select HAVE_PERF_EVENTS
>   select HAVE_PERF_REGS
> diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
> index 73b9c3bf377f..ded9548d11d9 100644
> --- a/arch/s390/include/asm/page.h
> +++ b/arch/s390/include/asm/page.h
> @@ -11,7 +11,7 @@
>  #include 
>  #include 
>  
> -#define _PAGE_SHIFT  12
> +#define _PAGE_SHIFT  CONFIG_PAGE_SHIFT

Acked-by: Heiko Carstens 



Re: [PATCH 2/4] arch: simplify architecture specific page size configuration

2024-02-27 Thread Helge Deller

On 2/26/24 17:14, Arnd Bergmann wrote:

From: Arnd Bergmann 

arc, arm64, parisc and powerpc all have their own Kconfig symbols
in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
so the common symbols are the ones that are actually used, while
leaving the architecture specific ones as the user visible
place for configuring it, to avoid breaking user configs.

Signed-off-by: Arnd Bergmann 
---
  arch/arc/Kconfig  |  3 +++
  arch/arc/include/uapi/asm/page.h  |  6 ++
  arch/arm64/Kconfig| 29 +
  arch/arm64/include/asm/page-def.h |  2 +-
  arch/parisc/Kconfig   |  3 +++
  arch/parisc/include/asm/page.h| 10 +-


Acked-by: Helge Deller  # parisc

Thanks for the cleanups!

Helge




Re: [PATCH 4/4] vdso: avoid including asm/page.h

2024-02-27 Thread Catalin Marinas
On Mon, Feb 26, 2024 at 05:14:14PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The recent change to the vdso_data_store broke building compat VDSO
> on at least arm64 because it includes headers outside of the include/vdso/
> namespace:
> 
> In file included from arch/arm64/include/asm/lse.h:5,
>  from arch/arm64/include/asm/cmpxchg.h:14,
>  from arch/arm64/include/asm/atomic.h:16,
>  from include/linux/atomic.h:7,
>  from include/asm-generic/bitops/atomic.h:5,
>  from arch/arm64/include/asm/bitops.h:25,
>  from include/linux/bitops.h:68,
>  from arch/arm64/include/asm/memory.h:209,
>  from arch/arm64/include/asm/page.h:46,
>  from include/vdso/datapage.h:22,
>  from lib/vdso/gettimeofday.c:5,
>  from :
> arch/arm64/include/asm/atomic_ll_sc.h:298:9: error: unknown type name 'u128'
>   298 | u128 full;
> 
> Use an open-coded page size calculation based on the new CONFIG_PAGE_SHIFT
> Kconfig symbol instead.
> 
> Reported-by: Linux Kernel Functional Testing 
> Fixes: a0d2fcd62ac2 ("vdso/ARM: Make union vdso_data_store available for all architectures")
> Link: https://lore.kernel.org/lkml/ca+g9fytrxxm_ko9fnpz3xarxhv7ud_yqp-teupqrnrhu+_0...@mail.gmail.com/
> Signed-off-by: Arnd Bergmann 

Acked-by: Catalin Marinas 
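
A rough sketch of the "open-coded page size calculation" idea, for illustration only. It assumes a union that pads some data out to exactly one page and sizes the padding from the configured shift directly, so that no page header (and nothing that header pulls in) is needed. The demo_* types and DEMO_CONFIG_PAGE_SHIFT are hypothetical stand-ins, not the actual vdso definitions.

```c
#include <stddef.h>

/* Stand-in for the Kconfig-generated CONFIG_PAGE_SHIFT. */
#ifndef DEMO_CONFIG_PAGE_SHIFT
#define DEMO_CONFIG_PAGE_SHIFT 12
#endif

/* Hypothetical payload; the real data page holds clock state. */
struct demo_vdso_data {
	unsigned int seq;
	unsigned long long cycle_last;
};

/*
 * The page[] member pads the union to one full page. Its size is
 * computed from the shift alone, without including any page header.
 */
union demo_vdso_data_store {
	struct demo_vdso_data data[2];
	unsigned char page[1UL << DEMO_CONFIG_PAGE_SHIFT];
};

/* Compile-time check that the union really is one page in size. */
_Static_assert(sizeof(union demo_vdso_data_store) ==
	       (1UL << DEMO_CONFIG_PAGE_SHIFT),
	       "data store must be exactly one page");
```

Because the only dependency is an integer from the configuration, this compiles in a restricted environment such as a compat build where the full kernel headers are not usable.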



Re: [PATCH 2/4] arch: simplify architecture specific page size configuration

2024-02-27 Thread Catalin Marinas
On Mon, Feb 26, 2024 at 05:14:12PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> arc, arm64, parisc and powerpc all have their own Kconfig symbols
> in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
> so the common symbols are the ones that are actually used, while
> leaving the architecture specific ones as the user visible
> place for configuring it, to avoid breaking user configs.
> 
> Signed-off-by: Arnd Bergmann 

For arm64:

Acked-by: Catalin Marinas 



Re: [PATCH 4/4] vdso: avoid including asm/page.h

2024-02-27 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 26/02/2024 à 17:14, Arnd Bergmann a écrit :
>> From: Arnd Bergmann 
>> 
>> The recent change to the vdso_data_store broke building compat VDSO
>> on at least arm64 because it includes headers outside of the include/vdso/
>> namespace:
>
> I understand that powerpc64 also has an issue, see 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20231221120410.2226678-1-...@ellerman.id.au/

Yeah, and that patch would silently conflict with this series, which is
not ideal.

I could delay merging my patch above until after this series goes in;
mine only fixes a fairly obscure build warning.

cheers



Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-27 Thread Geert Uytterhoeven
Hi Arnd,

CC Greg

On Tue, Feb 27, 2024 at 11:59 AM Arnd Bergmann  wrote:
> On Tue, Feb 27, 2024, at 09:54, Geert Uytterhoeven wrote:
> >> diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
> >> index 9dcf245c9cbf..c777a129768a 100644
> >> --- a/arch/m68k/Kconfig.cpu
> >> +++ b/arch/m68k/Kconfig.cpu
> >> @@ -30,6 +30,7 @@ config COLDFIRE
> >> select GENERIC_CSUM
> >> select GPIOLIB
> >> select HAVE_LEGACY_CLK
> >> +   select HAVE_PAGE_SIZE_8KB if !MMU
> >
> >  if you would drop the !MMU-dependency here.
> >
> >>
> >>  endchoice
> >>
> >> @@ -45,6 +46,7 @@ config M68000
> >> select GENERIC_CSUM
> >> select CPU_NO_EFFICIENT_FFS
> >> select HAVE_ARCH_HASH
> >> +   select HAVE_PAGE_SIZE_4KB
> >
> > Perhaps replace this by
> >
> > config M68KCLASSIC
> > bool "Classic M68K CPU family support"
> > select HAVE_ARCH_PFN_VALID
> >   + select HAVE_PAGE_SIZE_4KB if !MMU
> >
> > so it covers all 680x0 CPUs without MMU?
>
> I was a bit unsure about how to best do this since there
> is not really a need for a fixed page size on nommu kernels,
> whereas the three MMU configs clearly tie the page size to
> the MMU rather than the platform.
>
> There should be no reason for coldfire to have a different
> page size from dragonball if neither of them actually uses
> hardware pages, so one of them could be changed later.

Indeed, in theory, PAGE_SIZE doesn't matter for nommu, but the concept
of pages is used all over the place in Linux.

I'm mostly worried about some Coldfire code relying on the actual value
of PAGE_SIZE in some other context, e.g. for configuring non-cacheable
regions.

And does this impact running nommu binaries on a system with MMU?
I.e. if nommu binaries were built with a 4 KiB PAGE_SIZE, do they
still run on MMU systems with an 8 KiB PAGE_SIZE (coldfire and sun3),
or are there some subtleties to take into account?

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds



Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-27 Thread Arnd Bergmann
On Tue, Feb 27, 2024, at 09:54, Geert Uytterhoeven wrote:
> Hi Arnd,
>> diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
>> index 9dcf245c9cbf..c777a129768a 100644
>> --- a/arch/m68k/Kconfig.cpu
>> +++ b/arch/m68k/Kconfig.cpu
>> @@ -30,6 +30,7 @@ config COLDFIRE
>> select GENERIC_CSUM
>> select GPIOLIB
>> select HAVE_LEGACY_CLK
>> +   select HAVE_PAGE_SIZE_8KB if !MMU
>
>  if you would drop the !MMU-dependency here.
>
>>
>>  endchoice
>>
>> @@ -45,6 +46,7 @@ config M68000
>> select GENERIC_CSUM
>> select CPU_NO_EFFICIENT_FFS
>> select HAVE_ARCH_HASH
>> +   select HAVE_PAGE_SIZE_4KB
>
> Perhaps replace this by
>
> config M68KCLASSIC
> bool "Classic M68K CPU family support"
> select HAVE_ARCH_PFN_VALID
>   + select HAVE_PAGE_SIZE_4KB if !MMU
>
> so it covers all 680x0 CPUs without MMU?

I was a bit unsure about how to best do this since there
is not really a need for a fixed page size on nommu kernels,
whereas the three MMU configs clearly tie the page size to
the MMU rather than the platform.

There should be no reason for coldfire to have a different
page size from dragonball if neither of them actually uses
hardware pages, so one of them could be changed later.

Let me know if that makes sense to you, or you still
prefer me to change it like you suggested.

  Arnd



Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-27 Thread Geert Uytterhoeven
Hi Arnd,

On Mon, Feb 26, 2024 at 5:15 PM Arnd Bergmann  wrote:
> From: Arnd Bergmann 
>
> Most architectures only support a single hardcoded page size. In order
> to ensure that each one of these sets the corresponding Kconfig symbols,
> change over the PAGE_SHIFT definition to the common one and allow
> only the hardware page size to be selected.
>
> Signed-off-by: Arnd Bergmann 

Thanks for your patch!

> --- a/arch/m68k/Kconfig
> +++ b/arch/m68k/Kconfig
> @@ -84,12 +84,15 @@ config MMU
>
>  config MMU_MOTOROLA
> bool
> +   select HAVE_PAGE_SIZE_4KB
>
>  config MMU_COLDFIRE
> +   select HAVE_PAGE_SIZE_8KB

I think you can do without this...

> bool
>
>  config MMU_SUN3
> bool
> +   select HAVE_PAGE_SIZE_8KB
> depends on MMU && !MMU_MOTOROLA && !MMU_COLDFIRE
>
>  config ARCH_SUPPORTS_KEXEC
> diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
> index 9dcf245c9cbf..c777a129768a 100644
> --- a/arch/m68k/Kconfig.cpu
> +++ b/arch/m68k/Kconfig.cpu
> @@ -30,6 +30,7 @@ config COLDFIRE
> select GENERIC_CSUM
> select GPIOLIB
> select HAVE_LEGACY_CLK
> +   select HAVE_PAGE_SIZE_8KB if !MMU

... if you would drop the !MMU-dependency here.

>
>  endchoice
>
> @@ -45,6 +46,7 @@ config M68000
> select GENERIC_CSUM
> select CPU_NO_EFFICIENT_FFS
> select HAVE_ARCH_HASH
> +   select HAVE_PAGE_SIZE_4KB

Perhaps replace this by

config M68KCLASSIC
bool "Classic M68K CPU family support"
select HAVE_ARCH_PFN_VALID
  + select HAVE_PAGE_SIZE_4KB if !MMU

so it covers all 680x0 CPUs without MMU?

> select LEGACY_TIMER_TICK
> help
>   The Freescale (was Motorola) 68000 CPU is the first generation of

Gr{oetje,eeting}s,

Geert




Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-27 Thread Geert Uytterhoeven
Hi Arnd,

On Mon, Feb 26, 2024 at 5:14 PM Arnd Bergmann  wrote:
> From: Arnd Bergmann 
>
> These four architectures define the same Kconfig symbols for configuring
> the page size. Move the logic into a common place where it can be shared
> with all other architectures.
>
> Signed-off-by: Arnd Bergmann 

Thanks for your patch!

> --- a/arch/Kconfig
> +++ b/arch/Kconfig

> +config PAGE_SIZE_4KB
> +   bool "4KB pages"

Now you got rid of the 4000-byte ("4kB") pages and friends, please
do not replace these by Kelvin-bytes, and use the official binary
prefixes => "4 KiB".

> +   depends on HAVE_PAGE_SIZE_4KB

Gr{oetje,eeting}s,

Geert




Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-26 Thread Christophe Leroy


Le 26/02/2024 à 20:09, Rick Edgecombe a écrit :
> Future changes will need to add a field to struct vm_unmapped_area_info.
> This would cause trouble for any archs that don't initialize the
> struct. Currently every user sets each field, so if new fields are
> added, the core code parsing the struct will see garbage in the new
> field.
> 
> It could be possible to initialize the new field for each arch to 0, but
> instead simply initialize the field with C99 struct initializer syntax.

Why do a full init of the struct when all fields are rewritten a few
lines later?

If I take the example of the powerpc function slice_find_area_bottomup():

struct vm_unmapped_area_info info;

info.flags = 0;
info.length = len;
info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
info.align_offset = 0;

For me it looks better to just add:

info.new_field = 0; /* or whatever value it needs to have */

Christophe
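
For comparison, a minimal userspace sketch of the initializer approach being debated. struct demo_area_info is a hypothetical stand-in for struct vm_unmapped_area_info, with new_field playing the role of a member added later. The sketch uses designated initializers, which are standard C99; the empty-brace form `= {}` used in the patch is accepted by GCC and Clang as an extension and only became standard in C23.

```c
/*
 * Hypothetical stand-in for struct vm_unmapped_area_info. new_field
 * represents a member added after the call sites were written.
 */
struct demo_area_info {
	unsigned long flags;
	unsigned long length;
	unsigned long align_mask;
	unsigned long align_offset;
	unsigned long new_field;
};

/*
 * Build the struct with an initializer. Members that are not named
 * (flags, align_offset, new_field) are implicitly zero, so a later
 * addition to the struct needs no change at this call site.
 */
static inline struct demo_area_info demo_make_info(unsigned long len,
						   unsigned long align_mask)
{
	struct demo_area_info info = {
		.length = len,
		.align_mask = align_mask,
	};

	return info;
}
```

With the per-field assignment style, every call site would have to gain an explicit `info.new_field = 0;` line, and any site that is missed passes an indeterminate value on. With the initializer, the zeroing comes for free, at the cost of the compiler possibly emitting stores that are overwritten shortly after.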


> 
> Cc: linux...@kvack.org
> Cc: linux-alpha@vger.kernel.org
> Cc: linux-snps-...@lists.infradead.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-c...@vger.kernel.org
> Cc: loonga...@lists.linux.dev
> Cc: linux-m...@vger.kernel.org
> Cc: linux-par...@vger.kernel.org
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: linux-s...@vger.kernel.org
> Cc: linux...@vger.kernel.org
> Cc: sparcli...@vger.kernel.org
> Cc: x...@kernel.org
> Suggested-by: Kirill A. Shutemov 
> Signed-off-by: Rick Edgecombe 
> Link: 
> https://lore.kernel.org/lkml/3ynogxcgokc6i6xojbxzzwqectg472laes24u7jmtktlxcch5e@dfytra3ia3zc/#t
> ---
> Hi archs,
> 
> For some context, this is part of a larger series to improve shadow stack
> guard gaps. It involves plumbing a new field via
> struct vm_unmapped_area_info. The first user is x86, but arm and riscv may
> likely use it as well. The change is compile tested only for non-x86 but
> seems like a relatively safe one.
> 
> Thanks,
> 
> Rick
> 
> v2:
>   - New patch
> ---
>   arch/alpha/kernel/osf_sys.c  | 2 +-
>   arch/arc/mm/mmap.c   | 2 +-
>   arch/arm/mm/mmap.c   | 4 ++--
>   arch/csky/abiv1/mmap.c   | 2 +-
>   arch/loongarch/mm/mmap.c | 2 +-
>   arch/mips/mm/mmap.c  | 2 +-
>   arch/parisc/kernel/sys_parisc.c  | 2 +-
>   arch/powerpc/mm/book3s64/slice.c | 4 ++--
>   arch/s390/mm/hugetlbpage.c   | 4 ++--
>   arch/s390/mm/mmap.c  | 4 ++--
>   arch/sh/mm/mmap.c| 4 ++--
>   arch/sparc/kernel/sys_sparc_32.c | 2 +-
>   arch/sparc/kernel/sys_sparc_64.c | 4 ++--
>   arch/sparc/mm/hugetlbpage.c  | 4 ++--
>   arch/x86/kernel/sys_x86_64.c | 4 ++--
>   arch/x86/mm/hugetlbpage.c| 4 ++--
>   fs/hugetlbfs/inode.c | 4 ++--
>   mm/mmap.c| 4 ++--
>   18 files changed, 29 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
> index 5db88b627439..dd6801bb9240 100644
> --- a/arch/alpha/kernel/osf_sys.c
> +++ b/arch/alpha/kernel/osf_sys.c
> @@ -1218,7 +1218,7 @@ static unsigned long
>   arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
>unsigned long limit)
>   {
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>   
>   info.flags = 0;
>   info.length = len;
> diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c
> index 3c1c7ae73292..6549b3375f54 100644
> --- a/arch/arc/mm/mmap.c
> +++ b/arch/arc/mm/mmap.c
> @@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
>   {
>   struct mm_struct *mm = current->mm;
>   struct vm_area_struct *vma;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>   
>   /*
>* We enforce the MAP_FIXED case.
> diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
> index a0f8a0ca0788..525795578c29 100644
> --- a/arch/arm/mm/mmap.c
> +++ b/arch/arm/mm/mmap.c
> @@ -34,7 +34,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
>   struct vm_area_struct *vma;
>   int do_align = 0;
>   int aliasing = cache_is_vipt_aliasing();
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>   
>   /*
>* We only need to do colour alignment if either the I or D
> @@ -87,7 +87,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
>   unsigned long addr = addr0;
>   int do_align = 0;
>   int aliasing = cache_is_vipt_aliasing();
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>   
>   /*
>* We only need to do colour alignment if either the I or D
> diff --git a/arch/csky/abiv1/mmap.c b/arch/csky/abiv1/mmap.c
> index 6792aca4..726659d41fa9 100644
> --- a/arch/csky/abiv1/mmap.c
> +++ b/arch/csky/abiv1/mmap.c
> @@ -28,7 +28,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
>   struct mm_struct *mm = 

Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-26 Thread Guo Ren
On Tue, Feb 27, 2024 at 12:15 AM Arnd Bergmann  wrote:
>
> From: Arnd Bergmann 
>
> Most architectures only support a single hardcoded page size. In order
> to ensure that each one of these sets the corresponding Kconfig symbols,
> change over the PAGE_SHIFT definition to the common one and allow
> only the hardware page size to be selected.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/alpha/Kconfig | 1 +
>  arch/alpha/include/asm/page.h  | 2 +-
>  arch/arm/Kconfig   | 1 +
>  arch/arm/include/asm/page.h| 2 +-
>  arch/csky/Kconfig  | 1 +
>  arch/csky/include/asm/page.h   | 2 +-
>  arch/m68k/Kconfig  | 3 +++
>  arch/m68k/Kconfig.cpu  | 2 ++
>  arch/m68k/include/asm/page.h   | 6 +-
>  arch/microblaze/Kconfig| 1 +
>  arch/microblaze/include/asm/page.h | 2 +-
>  arch/nios2/Kconfig | 1 +
>  arch/nios2/include/asm/page.h  | 2 +-
>  arch/openrisc/Kconfig  | 1 +
>  arch/openrisc/include/asm/page.h   | 2 +-
>  arch/riscv/Kconfig | 1 +
>  arch/riscv/include/asm/page.h  | 2 +-
>  arch/s390/Kconfig  | 1 +
>  arch/s390/include/asm/page.h   | 2 +-
>  arch/sparc/Kconfig | 2 ++
>  arch/sparc/include/asm/page_32.h   | 2 +-
>  arch/sparc/include/asm/page_64.h   | 3 +--
>  arch/um/Kconfig| 1 +
>  arch/um/include/asm/page.h | 2 +-
>  arch/x86/Kconfig   | 1 +
>  arch/x86/include/asm/page_types.h  | 2 +-
>  arch/xtensa/Kconfig| 1 +
>  arch/xtensa/include/asm/page.h | 2 +-
>  28 files changed, 32 insertions(+), 19 deletions(-)
>
> diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
> index d6968d090d49..4f490250d323 100644
> --- a/arch/alpha/Kconfig
> +++ b/arch/alpha/Kconfig
> @@ -14,6 +14,7 @@ config ALPHA
> select PCI_DOMAINS if PCI
> select PCI_SYSCALL if PCI
> select HAVE_ASM_MODVERSIONS
> +   select HAVE_PAGE_SIZE_8KB
> select HAVE_PCSPKR_PLATFORM
> select HAVE_PERF_EVENTS
> select NEED_DMA_MAP_STATE
> diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
> index 4db1ebc0ed99..70419e6be1a3 100644
> --- a/arch/alpha/include/asm/page.h
> +++ b/arch/alpha/include/asm/page.h
> @@ -6,7 +6,7 @@
>  #include 
>
>  /* PAGE_SHIFT determines the page size */
> -#define PAGE_SHIFT 13
> +#define PAGE_SHIFT CONFIG_PAGE_SHIFT
>  #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
>  #define PAGE_MASK  (~(PAGE_SIZE-1))
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 0af6709570d1..9d52ba3a8ad1 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -116,6 +116,7 @@ config ARM
> select HAVE_MOD_ARCH_SPECIFIC
> select HAVE_NMI
> select HAVE_OPTPROBES if !THUMB2_KERNEL
> +   select HAVE_PAGE_SIZE_4KB
> select HAVE_PCI if MMU
> select HAVE_PERF_EVENTS
> select HAVE_PERF_REGS
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 119aa85d1feb..62af9f7f9e96 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -8,7 +8,7 @@
>  #define _ASMARM_PAGE_H
>
>  /* PAGE_SHIFT determines the page size */
> -#define PAGE_SHIFT 12
> +#define PAGE_SHIFT CONFIG_PAGE_SHIFT
>  #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
>  #define PAGE_MASK  (~((1 << PAGE_SHIFT) - 1))
>
> diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
> index cf2a6fd7dff8..9c2723ab1c94 100644
> --- a/arch/csky/Kconfig
> +++ b/arch/csky/Kconfig
> @@ -89,6 +89,7 @@ config CSKY
> select HAVE_KPROBES if !CPU_CK610
> select HAVE_KPROBES_ON_FTRACE if !CPU_CK610
> select HAVE_KRETPROBES if !CPU_CK610
> +   select HAVE_PAGE_SIZE_4KB
> select HAVE_PERF_EVENTS
> select HAVE_PERF_REGS
> select HAVE_PERF_USER_STACK_DUMP
> diff --git a/arch/csky/include/asm/page.h b/arch/csky/include/asm/page.h
> index 4a0502e324a6..f70f37402d75 100644
> --- a/arch/csky/include/asm/page.h
> +++ b/arch/csky/include/asm/page.h
> @@ -10,7 +10,7 @@
>  /*
>   * PAGE_SHIFT determines the page size: 4KB
>   */
> -#define PAGE_SHIFT 12
> +#define PAGE_SHIFT CONFIG_PAGE_SHIFT
LGTM, thx.
Acked-by: Guo Ren 

>  #define PAGE_SIZE  (_AC(1, UL) << PAGE_SHIFT)
>  #define PAGE_MASK  (~(PAGE_SIZE - 1))
>  #define THREAD_SIZE(PAGE_SIZE * 2)
> diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
> index 4b3e93cac723..7b709453d5e7 100644
> --- a/arch/m68k/Kconfig
> +++ b/arch/m68k/Kconfig
> @@ -84,12 +84,15 @@ config MMU
>
>  config MMU_MOTOROLA
> bool
> +   select HAVE_PAGE_SIZE_4KB
>
>  config MMU_COLDFIRE
> +   select HAVE_PAGE_SIZE_8KB
> bool
>
>  config MMU_SUN3
> bool
> +   select HAVE_PAGE_SIZE_8KB
> depends on MMU && !MMU_MOTOROLA && !MMU_COLDFIRE
>
>  config ARCH_SUPPORTS_KEXEC
> diff --git 

[PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

2024-02-26 Thread Rick Edgecombe
Future changes will need to add a field to struct vm_unmapped_area_info.
This would cause trouble for any archs that don't initialize the
struct. Currently every user sets each field, so if new fields are
added, the core code parsing the struct will see garbage in the new
field.

It could be possible to initialize the new field for each arch to 0, but
instead simply initialize the field with C99 struct initializer syntax.

Cc: linux...@kvack.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: loonga...@lists.linux.dev
Cc: linux-m...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: x...@kernel.org
Suggested-by: Kirill A. Shutemov 
Signed-off-by: Rick Edgecombe 
Link: 
https://lore.kernel.org/lkml/3ynogxcgokc6i6xojbxzzwqectg472laes24u7jmtktlxcch5e@dfytra3ia3zc/#t
---
Hi archs,

For some context, this is part of a larger series to improve shadow stack
guard gaps. It involves plumbing a new field via
struct vm_unmapped_area_info. The first user is x86, but arm and riscv may
likely use it as well. The change is compile tested only for non-x86 but
seems like a relatively safe one.

Thanks,

Rick

v2:
 - New patch
---
 arch/alpha/kernel/osf_sys.c  | 2 +-
 arch/arc/mm/mmap.c   | 2 +-
 arch/arm/mm/mmap.c   | 4 ++--
 arch/csky/abiv1/mmap.c   | 2 +-
 arch/loongarch/mm/mmap.c | 2 +-
 arch/mips/mm/mmap.c  | 2 +-
 arch/parisc/kernel/sys_parisc.c  | 2 +-
 arch/powerpc/mm/book3s64/slice.c | 4 ++--
 arch/s390/mm/hugetlbpage.c   | 4 ++--
 arch/s390/mm/mmap.c  | 4 ++--
 arch/sh/mm/mmap.c| 4 ++--
 arch/sparc/kernel/sys_sparc_32.c | 2 +-
 arch/sparc/kernel/sys_sparc_64.c | 4 ++--
 arch/sparc/mm/hugetlbpage.c  | 4 ++--
 arch/x86/kernel/sys_x86_64.c | 4 ++--
 arch/x86/mm/hugetlbpage.c| 4 ++--
 fs/hugetlbfs/inode.c | 4 ++--
 mm/mmap.c| 4 ++--
 18 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 5db88b627439..dd6801bb9240 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1218,7 +1218,7 @@ static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
 unsigned long limit)
 {
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
info.flags = 0;
info.length = len;
diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c
index 3c1c7ae73292..6549b3375f54 100644
--- a/arch/arc/mm/mmap.c
+++ b/arch/arc/mm/mmap.c
@@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 {
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
/*
 * We enforce the MAP_FIXED case.
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index a0f8a0ca0788..525795578c29 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -34,7 +34,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
struct vm_area_struct *vma;
int do_align = 0;
int aliasing = cache_is_vipt_aliasing();
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
/*
 * We only need to do colour alignment if either the I or D
@@ -87,7 +87,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
unsigned long addr = addr0;
int do_align = 0;
int aliasing = cache_is_vipt_aliasing();
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
/*
 * We only need to do colour alignment if either the I or D
diff --git a/arch/csky/abiv1/mmap.c b/arch/csky/abiv1/mmap.c
index 6792aca4..726659d41fa9 100644
--- a/arch/csky/abiv1/mmap.c
+++ b/arch/csky/abiv1/mmap.c
@@ -28,7 +28,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
int do_align = 0;
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
/*
 * We only need to do colour alignment if either the I or D
diff --git a/arch/loongarch/mm/mmap.c b/arch/loongarch/mm/mmap.c
index a9630a81b38a..664bf4abfdcf 100644
--- a/arch/loongarch/mm/mmap.c
+++ b/arch/loongarch/mm/mmap.c
@@ -24,7 +24,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
struct vm_area_struct *vma;
unsigned long addr = addr0;
int do_color_align;
-   struct vm_unmapped_area_info info;
+   struct vm_unmapped_area_info info = {};
 
if (unlikely(len > TASK_SIZE))
   

Re: [PATCH 2/4] arch: simplify architecture specific page size configuration

2024-02-26 Thread Christophe Leroy


Le 26/02/2024 à 17:14, Arnd Bergmann a écrit :
> From: Arnd Bergmann 
> 
> arc, arm64, parisc and powerpc all have their own Kconfig symbols
> in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
> so the common symbols are the ones that are actually used, while
> leaving the architecture specific ones as the user visible
> place for configuring it, to avoid breaking user configs.
> 
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Christophe Leroy  (powerpc32)

> ---
>   arch/arc/Kconfig  |  3 +++
>   arch/arc/include/uapi/asm/page.h  |  6 ++
>   arch/arm64/Kconfig| 29 +
>   arch/arm64/include/asm/page-def.h |  2 +-
>   arch/parisc/Kconfig   |  3 +++
>   arch/parisc/include/asm/page.h| 10 +-
>   arch/powerpc/Kconfig  | 31 ++-
>   arch/powerpc/include/asm/page.h   |  2 +-
>   scripts/gdb/linux/constants.py.in |  2 +-
>   scripts/gdb/linux/mm.py   |  2 +-
>   10 files changed, 32 insertions(+), 58 deletions(-)
> 
> diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
> index 1b0483c51cc1..4092bec198be 100644
> --- a/arch/arc/Kconfig
> +++ b/arch/arc/Kconfig
> @@ -284,14 +284,17 @@ choice
>   
>   config ARC_PAGE_SIZE_8K
>   bool "8KB"
> + select HAVE_PAGE_SIZE_8KB
>   help
> Choose between 8k vs 16k
>   
>   config ARC_PAGE_SIZE_16K
> + select HAVE_PAGE_SIZE_16KB
>   bool "16KB"
>   
>   config ARC_PAGE_SIZE_4K
>   bool "4KB"
> + select HAVE_PAGE_SIZE_4KB
>   depends on ARC_MMU_V3 || ARC_MMU_V4
>   
>   endchoice
> diff --git a/arch/arc/include/uapi/asm/page.h b/arch/arc/include/uapi/asm/page.h
> index 2a4ad619abfb..7fd9e741b527 100644
> --- a/arch/arc/include/uapi/asm/page.h
> +++ b/arch/arc/include/uapi/asm/page.h
> @@ -13,10 +13,8 @@
>   #include 
>   
>   /* PAGE_SHIFT determines the page size */
> -#if defined(CONFIG_ARC_PAGE_SIZE_16K)
> -#define PAGE_SHIFT 14
> -#elif defined(CONFIG_ARC_PAGE_SIZE_4K)
> -#define PAGE_SHIFT 12
> +#ifdef __KERNEL__
> +#define PAGE_SHIFT CONFIG_PAGE_SHIFT
>   #else
>   /*
>* Default 8k
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index aa7c1d435139..29290b8cb36d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -277,27 +277,21 @@ config 64BIT
>   config MMU
>   def_bool y
>   
> -config ARM64_PAGE_SHIFT
> - int
> - default 16 if ARM64_64K_PAGES
> - default 14 if ARM64_16K_PAGES
> - default 12
> -
>   config ARM64_CONT_PTE_SHIFT
>   int
> - default 5 if ARM64_64K_PAGES
> - default 7 if ARM64_16K_PAGES
> + default 5 if PAGE_SIZE_64KB
> + default 7 if PAGE_SIZE_16KB
>   default 4
>   
>   config ARM64_CONT_PMD_SHIFT
>   int
> - default 5 if ARM64_64K_PAGES
> - default 5 if ARM64_16K_PAGES
> + default 5 if PAGE_SIZE_64KB
> + default 5 if PAGE_SIZE_16KB
>   default 4
>   
>   config ARCH_MMAP_RND_BITS_MIN
> - default 14 if ARM64_64K_PAGES
> - default 16 if ARM64_16K_PAGES
> + default 14 if PAGE_SIZE_64KB
> + default 16 if PAGE_SIZE_16KB
>   default 18
>   
>   # max bits determined by the following formula:
> @@ -1259,11 +1253,13 @@ choice
>   
>   config ARM64_4K_PAGES
>   bool "4KB"
> + select HAVE_PAGE_SIZE_4KB
>   help
> This feature enables 4KB pages support.
>   
>   config ARM64_16K_PAGES
>   bool "16KB"
> + select HAVE_PAGE_SIZE_16KB
>   help
> The system will use 16KB pages support. AArch32 emulation
> requires applications compiled with 16K (or a multiple of 16K)
> @@ -1271,6 +1267,7 @@ config ARM64_16K_PAGES
>   
>   config ARM64_64K_PAGES
>   bool "64KB"
> + select HAVE_PAGE_SIZE_64KB
>   help
> This feature enables 64KB pages support (4KB by default)
> allowing only two levels of page tables and faster TLB
> @@ -1291,19 +1288,19 @@ choice
>   
>   config ARM64_VA_BITS_36
>   bool "36-bit" if EXPERT
> - depends on ARM64_16K_PAGES
> + depends on PAGE_SIZE_16KB
>   
>   config ARM64_VA_BITS_39
>   bool "39-bit"
> - depends on ARM64_4K_PAGES
> + depends on PAGE_SIZE_4KB
>   
>   config ARM64_VA_BITS_42
>   bool "42-bit"
> - depends on ARM64_64K_PAGES
> + depends on PAGE_SIZE_64KB
>   
>   config ARM64_VA_BITS_47
>   bool "47-bit"
> - depends on ARM64_16K_PAGES
> + depends on PAGE_SIZE_16KB
>   
>   config ARM64_VA_BITS_48
>   bool "48-bit"
> diff --git a/arch/arm64/include/asm/page-def.h b/arch/arm64/include/asm/page-def.h
> index 2403f7b4cdbf..792e9fe881dc 100644
> --- a/arch/arm64/include/asm/page-def.h
> +++ b/arch/arm64/include/asm/page-def.h
> @@ -11,7 +11,7 @@
>   #include 
>   
>   /* PAGE_SHIFT determines the page size */
> -#define PAGE_SHIFT   CONFIG_ARM64_PAGE_SHIFT
> +#define PAGE_SHIFT   CONFIG_PAGE_SHIFT
>   #define PAGE_SIZE   (_AC(1, UL) << PAGE_SHIFT)
>   #define PAGE_MASK   

Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-26 Thread Christophe Leroy


Le 26/02/2024 à 17:14, Arnd Bergmann a écrit :
> From: Arnd Bergmann 
> 
> These four architectures define the same Kconfig symbols for configuring
> the page size. Move the logic into a common place where it can be shared
> with all other architectures.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>   arch/Kconfig  | 58 +--
>   arch/hexagon/Kconfig  | 25 +++--
>   arch/hexagon/include/asm/page.h   |  6 +---
>   arch/loongarch/Kconfig| 21 ---
>   arch/loongarch/include/asm/page.h | 10 +-
>   arch/mips/Kconfig | 58 +++
>   arch/mips/include/asm/page.h  | 16 +
>   arch/sh/include/asm/page.h| 13 +--
>   arch/sh/mm/Kconfig| 42 +++---
>   9 files changed, 88 insertions(+), 161 deletions(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index a5af0edd3eb8..237cea01ed9b 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1078,17 +1078,71 @@ config HAVE_ARCH_COMPAT_MMAP_BASES
> and vice-versa 32-bit applications to call 64-bit mmap().
> Required for applications doing different bitness syscalls.
>   
> +config HAVE_PAGE_SIZE_4KB
> + bool
> +
> +config HAVE_PAGE_SIZE_8KB
> + bool
> +
> +config HAVE_PAGE_SIZE_16KB
> + bool
> +
> +config HAVE_PAGE_SIZE_32KB
> + bool
> +
> +config HAVE_PAGE_SIZE_64KB
> + bool
> +
> +config HAVE_PAGE_SIZE_256KB
> + bool
> +
> +choice
> + prompt "MMU page size"
> +

That's a nice re-factor.

The only drawback I see is that we are losing several interesting
arch-specific comments/help texts. I don't know if there could be an easy
way to keep them.


> +config PAGE_SIZE_4KB
> + bool "4KB pages"
> + depends on HAVE_PAGE_SIZE_4KB
> +
> +config PAGE_SIZE_8KB
> + bool "8KB pages"
> + depends on HAVE_PAGE_SIZE_8KB
> +
> +config PAGE_SIZE_16KB
> + bool "16KB pages"
> + depends on HAVE_PAGE_SIZE_16KB
> +
> +config PAGE_SIZE_32KB
> + bool "32KB pages"
> + depends on HAVE_PAGE_SIZE_32KB
> +
> +config PAGE_SIZE_64KB
> + bool "64KB pages"
> + depends on HAVE_PAGE_SIZE_64KB
> +
> +config PAGE_SIZE_256KB
> + bool "256KB pages"
> + depends on HAVE_PAGE_SIZE_256KB

Hexagon seems to also use CONFIG_PAGE_SIZE_1MB?

> +
> +endchoice
> +
>   config PAGE_SIZE_LESS_THAN_64KB
>   def_bool y
> - depends on !ARM64_64K_PAGES
>   depends on !PAGE_SIZE_64KB
> - depends on !PARISC_PAGE_SIZE_64KB
>   depends on PAGE_SIZE_LESS_THAN_256KB
>   
>   config PAGE_SIZE_LESS_THAN_256KB
>   def_bool y
>   depends on !PAGE_SIZE_256KB
>   
> +config PAGE_SHIFT
> + int
> + default 12 if PAGE_SIZE_4KB
> + default 13 if PAGE_SIZE_8KB
> + default 14 if PAGE_SIZE_16KB
> + default 15 if PAGE_SIZE_32KB
> + default 16 if PAGE_SIZE_64KB
> + default 18 if PAGE_SIZE_256KB
> +
>   # This allows to use a set of generic functions to determine mmap base
>   # address by giving priority to top-down scheme only if the process
>   # is not in legacy mode (compat task, unlimited stack size or
> diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
> index a880ee067d2e..aac46ee1a000 100644
> --- a/arch/hexagon/Kconfig
> +++ b/arch/hexagon/Kconfig
> @@ -8,6 +8,11 @@ config HEXAGON
>   select ARCH_HAS_SYNC_DMA_FOR_DEVICE
>   select ARCH_NO_PREEMPT
>   select DMA_GLOBAL_POOL
> + select FRAME_POINTER
> + select HAVE_PAGE_SIZE_4KB
> + select HAVE_PAGE_SIZE_16KB
> + select HAVE_PAGE_SIZE_64KB
> + select HAVE_PAGE_SIZE_256KB
>   # Other pending projects/to-do items.
>   # select HAVE_REGS_AND_STACK_ACCESS_API
>   # select HAVE_HW_BREAKPOINT if PERF_EVENTS
> @@ -120,26 +125,6 @@ config NR_CPUS
> This is purely to save memory - each supported CPU adds
> approximately eight kilobytes to the kernel image.
>   
> -choice
> - prompt "Kernel page size"
> - default PAGE_SIZE_4KB
> - help
> -   Changes the default page size; use with caution.
> -
> -config PAGE_SIZE_4KB
> - bool "4KB"
> -
> -config PAGE_SIZE_16KB
> - bool "16KB"
> -
> -config PAGE_SIZE_64KB
> - bool "64KB"
> -
> -config PAGE_SIZE_256KB
> - bool "256KB"
> -
> -endchoice
> -
>   source "kernel/Kconfig.hz"
>   
>   endmenu
> diff --git a/arch/hexagon/include/asm/page.h b/arch/hexagon/include/asm/page.h
> index 10f1bc07423c..65c9bac639fa 100644
> --- a/arch/hexagon/include/asm/page.h
> +++ b/arch/hexagon/include/asm/page.h
> @@ -13,27 +13,22 @@
>   /*  This is probably not the most graceful way to handle this.  */
>   
>   #ifdef CONFIG_PAGE_SIZE_4KB
> -#define PAGE_SHIFT 12
>   #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_4KB
>   #endif
>   
>   #ifdef CONFIG_PAGE_SIZE_16KB
> -#define PAGE_SHIFT 14
>   #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_16KB
>   #endif
>   
>   #ifdef CONFIG_PAGE_SIZE_64KB
> -#define PAGE_SHIFT 16
>   #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_64KB
>   

Re: [PATCH 4/4] vdso: avoid including asm/page.h

2024-02-26 Thread Christophe Leroy


Le 26/02/2024 à 17:14, Arnd Bergmann a écrit :
> From: Arnd Bergmann 
> 
> The recent change to the vdso_data_store broke building compat VDSO
> on at least arm64 because it includes headers outside of the include/vdso/
> namespace:

I understand that powerpc64 also has an issue, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20231221120410.2226678-1-...@ellerman.id.au/

> 
> In file included from arch/arm64/include/asm/lse.h:5,
>   from arch/arm64/include/asm/cmpxchg.h:14,
>   from arch/arm64/include/asm/atomic.h:16,
>   from include/linux/atomic.h:7,
>   from include/asm-generic/bitops/atomic.h:5,
>   from arch/arm64/include/asm/bitops.h:25,
>   from include/linux/bitops.h:68,
>   from arch/arm64/include/asm/memory.h:209,
>   from arch/arm64/include/asm/page.h:46,
>   from include/vdso/datapage.h:22,
>   from lib/vdso/gettimeofday.c:5,
>   from :
> arch/arm64/include/asm/atomic_ll_sc.h:298:9: error: unknown type name 'u128'
>298 | u128 full;
> 
> Use an open-coded page size calculation based on the new CONFIG_PAGE_SHIFT
> Kconfig symbol instead.
> 
> Reported-by: Linux Kernel Functional Testing 
> Fixes: a0d2fcd62ac2 ("vdso/ARM: Make union vdso_data_store available for all architectures")
> Link: https://lore.kernel.org/lkml/ca+g9fytrxxm_ko9fnpz3xarxhv7ud_yqp-teupqrnrhu+_0...@mail.gmail.com/
> Signed-off-by: Arnd Bergmann 
> ---
>   include/vdso/datapage.h | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/include/vdso/datapage.h b/include/vdso/datapage.h
> index 7ba44379a095..2c39a67d7e23 100644
> --- a/include/vdso/datapage.h
> +++ b/include/vdso/datapage.h
> @@ -19,8 +19,6 @@
>   #include 
>   #include 
>   
> -#include <asm/page.h>
> -
>   #ifdef CONFIG_ARCH_HAS_VDSO_DATA
>   #include 
>   #else
> @@ -128,7 +126,7 @@ extern struct vdso_data _timens_data[CS_BASES] 
> __attribute__((visibility("hidden
>*/
>   union vdso_data_store {
>   struct vdso_datadata[CS_BASES];
> - u8  page[PAGE_SIZE];
> + u8  page[1ul << CONFIG_PAGE_SHIFT];

Usually 1UL is used (capital letter)

Maybe better to (re)define PAGE_SIZE instead, something like:

#define PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)


>   };
>   
>   /*
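
A minimal sketch of the "(re)define PAGE_SIZE" idea, written as a userspace
model rather than the actual header. Assumptions: CONFIG_PAGE_SHIFT is set by
hand here (in the kernel it would come from Kconfig), and struct vdso_data and
CS_BASES are simplified stand-ins, not the real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: Kconfig normally provides this; 12 (4KB pages) is a stand-in. */
#ifndef CONFIG_PAGE_SHIFT
#define CONFIG_PAGE_SHIFT 12
#endif

/* (Re)define PAGE_SIZE locally so that no asm/page.h include is needed. */
#undef PAGE_SIZE
#define PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)

/* Hypothetical stand-ins for the real vdso definitions. */
#define CS_BASES 2
struct vdso_data {
	unsigned int seq;
	unsigned long long cycle_last;
};

union vdso_data_store {
	struct vdso_data data[CS_BASES];
	unsigned char page[PAGE_SIZE];
};

/* The store is meant to occupy exactly one page. */
static_assert(sizeof(union vdso_data_store) == PAGE_SIZE,
	      "vdso_data_store must be one page");

/* Returns the size of the data store in bytes. */
size_t vdso_store_size(void)
{
	return sizeof(union vdso_data_store);
}
```

Building the same file with -DCONFIG_PAGE_SHIFT=16 would size the union for
64KB pages without touching the source.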


Re: [PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-26 Thread Samuel Holland
On 2024-02-26 10:14 AM, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> These four architectures define the same Kconfig symbols for configuring
> the page size. Move the logic into a common place where it can be shared
> with all other architectures.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/Kconfig  | 58 +--
>  arch/hexagon/Kconfig  | 25 +++--
>  arch/hexagon/include/asm/page.h   |  6 +---
>  arch/loongarch/Kconfig| 21 ---
>  arch/loongarch/include/asm/page.h | 10 +-
>  arch/mips/Kconfig | 58 +++
>  arch/mips/include/asm/page.h  | 16 +
>  arch/sh/include/asm/page.h| 13 +--
>  arch/sh/mm/Kconfig| 42 +++---
>  9 files changed, 88 insertions(+), 161 deletions(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index a5af0edd3eb8..237cea01ed9b 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1078,17 +1078,71 @@ config HAVE_ARCH_COMPAT_MMAP_BASES
> and vice-versa 32-bit applications to call 64-bit mmap().
> Required for applications doing different bitness syscalls.
>  
> +config HAVE_PAGE_SIZE_4KB
> + bool
> +
> +config HAVE_PAGE_SIZE_8KB
> + bool
> +
> +config HAVE_PAGE_SIZE_16KB
> + bool
> +
> +config HAVE_PAGE_SIZE_32KB
> + bool
> +
> +config HAVE_PAGE_SIZE_64KB
> + bool
> +
> +config HAVE_PAGE_SIZE_256KB
> + bool
> +
> +choice
> + prompt "MMU page size"

Should this have some generic help text (at least a warning about 
compatibility)?

> +
> +config PAGE_SIZE_4KB
> + bool "4KB pages"
> + depends on HAVE_PAGE_SIZE_4KB
> +
> +config PAGE_SIZE_8KB
> + bool "8KB pages"
> + depends on HAVE_PAGE_SIZE_8KB
> +
> +config PAGE_SIZE_16KB
> + bool "16KB pages"
> + depends on HAVE_PAGE_SIZE_16KB
> +
> +config PAGE_SIZE_32KB
> + bool "32KB pages"
> + depends on HAVE_PAGE_SIZE_32KB
> +
> +config PAGE_SIZE_64KB
> + bool "64KB pages"
> + depends on HAVE_PAGE_SIZE_64KB
> +
> +config PAGE_SIZE_256KB
> + bool "256KB pages"
> + depends on HAVE_PAGE_SIZE_256KB
> +
> +endchoice
> +
>  config PAGE_SIZE_LESS_THAN_64KB
>   def_bool y
> - depends on !ARM64_64K_PAGES
>   depends on !PAGE_SIZE_64KB
> - depends on !PARISC_PAGE_SIZE_64KB
>   depends on PAGE_SIZE_LESS_THAN_256KB
>  
>  config PAGE_SIZE_LESS_THAN_256KB
>   def_bool y
>   depends on !PAGE_SIZE_256KB
>  
> +config PAGE_SHIFT
> + int
> + default 12 if PAGE_SIZE_4KB
> + default 13 if PAGE_SIZE_8KB
> + default 14 if PAGE_SIZE_16KB
> + default 15 if PAGE_SIZE_32KB
> + default 16 if PAGE_SIZE_64KB
> + default 18 if PAGE_SIZE_256KB
> +
>  # This allows to use a set of generic functions to determine mmap base
>  # address by giving priority to top-down scheme only if the process
>  # is not in legacy mode (compat task, unlimited stack size or
> diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
> index a880ee067d2e..aac46ee1a000 100644
> --- a/arch/hexagon/Kconfig
> +++ b/arch/hexagon/Kconfig
> @@ -8,6 +8,11 @@ config HEXAGON
>   select ARCH_HAS_SYNC_DMA_FOR_DEVICE
>   select ARCH_NO_PREEMPT
>   select DMA_GLOBAL_POOL
> + select FRAME_POINTER

Looks like a paste error.

> + select HAVE_PAGE_SIZE_4KB
> + select HAVE_PAGE_SIZE_16KB
> + select HAVE_PAGE_SIZE_64KB
> + select HAVE_PAGE_SIZE_256KB
>   # Other pending projects/to-do items.
>   # select HAVE_REGS_AND_STACK_ACCESS_API
>   # select HAVE_HW_BREAKPOINT if PERF_EVENTS
> @@ -120,26 +125,6 @@ config NR_CPUS
> This is purely to save memory - each supported CPU adds
> approximately eight kilobytes to the kernel image.
>  
> -choice
> - prompt "Kernel page size"
> - default PAGE_SIZE_4KB
> - help
> -   Changes the default page size; use with caution.
> -
> -config PAGE_SIZE_4KB
> - bool "4KB"
> -
> -config PAGE_SIZE_16KB
> - bool "16KB"
> -
> -config PAGE_SIZE_64KB
> - bool "64KB"
> -
> -config PAGE_SIZE_256KB
> - bool "256KB"
> -
> -endchoice
> -
>  source "kernel/Kconfig.hz"
>  
>  endmenu
> diff --git a/arch/hexagon/include/asm/page.h b/arch/hexagon/include/asm/page.h
> index 10f1bc07423c..65c9bac639fa 100644
> --- a/arch/hexagon/include/asm/page.h
> +++ b/arch/hexagon/include/asm/page.h
> @@ -13,27 +13,22 @@
>  /*  This is probably not the most graceful way to handle this.  */
>  
>  #ifdef CONFIG_PAGE_SIZE_4KB
> -#define PAGE_SHIFT 12
>  #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_4KB
>  #endif
>  
>  #ifdef CONFIG_PAGE_SIZE_16KB
> -#define PAGE_SHIFT 14
>  #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_16KB
>  #endif
>  
>  #ifdef CONFIG_PAGE_SIZE_64KB
> -#define PAGE_SHIFT 16
>  #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_64KB
>  #endif
>  
>  #ifdef CONFIG_PAGE_SIZE_256KB
> -#define PAGE_SHIFT 18
>  #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_256KB
>  #endif
>  
>  #ifdef CONFIG_PAGE_SIZE_1MB
> 

[PATCH 4/4] vdso: avoid including asm/page.h

2024-02-26 Thread Arnd Bergmann
From: Arnd Bergmann 

The recent change to the vdso_data_store broke building compat VDSO
on at least arm64 because it includes headers outside of the include/vdso/
namespace:

In file included from arch/arm64/include/asm/lse.h:5,
 from arch/arm64/include/asm/cmpxchg.h:14,
 from arch/arm64/include/asm/atomic.h:16,
 from include/linux/atomic.h:7,
 from include/asm-generic/bitops/atomic.h:5,
 from arch/arm64/include/asm/bitops.h:25,
 from include/linux/bitops.h:68,
 from arch/arm64/include/asm/memory.h:209,
 from arch/arm64/include/asm/page.h:46,
 from include/vdso/datapage.h:22,
 from lib/vdso/gettimeofday.c:5,
 from :
arch/arm64/include/asm/atomic_ll_sc.h:298:9: error: unknown type name 'u128'
  298 | u128 full;

Use an open-coded page size calculation based on the new CONFIG_PAGE_SHIFT
Kconfig symbol instead.

Reported-by: Linux Kernel Functional Testing 
Fixes: a0d2fcd62ac2 ("vdso/ARM: Make union vdso_data_store available for all architectures")
Link: https://lore.kernel.org/lkml/ca+g9fytrxxm_ko9fnpz3xarxhv7ud_yqp-teupqrnrhu+_0...@mail.gmail.com/
Signed-off-by: Arnd Bergmann 
---
 include/vdso/datapage.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/include/vdso/datapage.h b/include/vdso/datapage.h
index 7ba44379a095..2c39a67d7e23 100644
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -19,8 +19,6 @@
 #include 
 #include 
 
-#include <asm/page.h>
-
 #ifdef CONFIG_ARCH_HAS_VDSO_DATA
 #include 
 #else
@@ -128,7 +126,7 @@ extern struct vdso_data _timens_data[CS_BASES] 
__attribute__((visibility("hidden
  */
 union vdso_data_store {
struct vdso_datadata[CS_BASES];
-   u8  page[PAGE_SIZE];
+   u8  page[1ul << CONFIG_PAGE_SHIFT];
 };
 
 /*
-- 
2.39.2




[PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-26 Thread Arnd Bergmann
From: Arnd Bergmann 

Most architectures only support a single hardcoded page size. In order
to ensure that each one of these sets the corresponding Kconfig symbols,
change over the PAGE_SHIFT definition to the common one and allow
only the hardware page size to be selected.
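
As a rough illustration of what the converted headers end up with, the sketch
below derives PAGE_SIZE and PAGE_MASK from a single shift value. It is a
userspace model: CONFIG_PAGE_SHIFT is set by hand (13, matching the 8KB
selection for alpha), and page_align_down() and page_align_up() are
hypothetical helpers named for this example only.

```c
#include <assert.h>

/* Assumption: Kconfig normally provides this; 13 means 8KB pages. */
#ifndef CONFIG_PAGE_SHIFT
#define CONFIG_PAGE_SHIFT 13
#endif

/* Drop any libc-provided definitions so the example is self-consistent. */
#undef PAGE_SHIFT
#undef PAGE_SIZE
#undef PAGE_MASK

#define PAGE_SHIFT	CONFIG_PAGE_SHIFT
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))

static_assert((PAGE_SIZE & (PAGE_SIZE - 1)) == 0,
	      "PAGE_SIZE must be a power of two");

/* Hypothetical helper: round an address down to its page boundary. */
unsigned long page_align_down(unsigned long addr)
{
	return addr & PAGE_MASK;
}

/* Hypothetical helper: round an address up to the next page boundary. */
unsigned long page_align_up(unsigned long addr)
{
	return (addr + PAGE_SIZE - 1) & PAGE_MASK;
}
```

Only the shift changes between architectures; the size and mask follow from it.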

Signed-off-by: Arnd Bergmann 
---
 arch/alpha/Kconfig | 1 +
 arch/alpha/include/asm/page.h  | 2 +-
 arch/arm/Kconfig   | 1 +
 arch/arm/include/asm/page.h| 2 +-
 arch/csky/Kconfig  | 1 +
 arch/csky/include/asm/page.h   | 2 +-
 arch/m68k/Kconfig  | 3 +++
 arch/m68k/Kconfig.cpu  | 2 ++
 arch/m68k/include/asm/page.h   | 6 +-
 arch/microblaze/Kconfig| 1 +
 arch/microblaze/include/asm/page.h | 2 +-
 arch/nios2/Kconfig | 1 +
 arch/nios2/include/asm/page.h  | 2 +-
 arch/openrisc/Kconfig  | 1 +
 arch/openrisc/include/asm/page.h   | 2 +-
 arch/riscv/Kconfig | 1 +
 arch/riscv/include/asm/page.h  | 2 +-
 arch/s390/Kconfig  | 1 +
 arch/s390/include/asm/page.h   | 2 +-
 arch/sparc/Kconfig | 2 ++
 arch/sparc/include/asm/page_32.h   | 2 +-
 arch/sparc/include/asm/page_64.h   | 3 +--
 arch/um/Kconfig| 1 +
 arch/um/include/asm/page.h | 2 +-
 arch/x86/Kconfig   | 1 +
 arch/x86/include/asm/page_types.h  | 2 +-
 arch/xtensa/Kconfig| 1 +
 arch/xtensa/include/asm/page.h | 2 +-
 28 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index d6968d090d49..4f490250d323 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -14,6 +14,7 @@ config ALPHA
select PCI_DOMAINS if PCI
select PCI_SYSCALL if PCI
select HAVE_ASM_MODVERSIONS
+   select HAVE_PAGE_SIZE_8KB
select HAVE_PCSPKR_PLATFORM
select HAVE_PERF_EVENTS
select NEED_DMA_MAP_STATE
diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
index 4db1ebc0ed99..70419e6be1a3 100644
--- a/arch/alpha/include/asm/page.h
+++ b/arch/alpha/include/asm/page.h
@@ -6,7 +6,7 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT 13
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0af6709570d1..9d52ba3a8ad1 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -116,6 +116,7 @@ config ARM
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI
select HAVE_OPTPROBES if !THUMB2_KERNEL
+   select HAVE_PAGE_SIZE_4KB
select HAVE_PCI if MMU
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 119aa85d1feb..62af9f7f9e96 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -8,7 +8,7 @@
 #define _ASMARM_PAGE_H
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT 12
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~((1 << PAGE_SHIFT) - 1))
 
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index cf2a6fd7dff8..9c2723ab1c94 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -89,6 +89,7 @@ config CSKY
select HAVE_KPROBES if !CPU_CK610
select HAVE_KPROBES_ON_FTRACE if !CPU_CK610
select HAVE_KRETPROBES if !CPU_CK610
+   select HAVE_PAGE_SIZE_4KB
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
diff --git a/arch/csky/include/asm/page.h b/arch/csky/include/asm/page.h
index 4a0502e324a6..f70f37402d75 100644
--- a/arch/csky/include/asm/page.h
+++ b/arch/csky/include/asm/page.h
@@ -10,7 +10,7 @@
 /*
  * PAGE_SHIFT determines the page size: 4KB
  */
-#define PAGE_SHIFT 12
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1, UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE - 1))
 #define THREAD_SIZE(PAGE_SIZE * 2)
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 4b3e93cac723..7b709453d5e7 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -84,12 +84,15 @@ config MMU
 
 config MMU_MOTOROLA
bool
+   select HAVE_PAGE_SIZE_4KB
 
 config MMU_COLDFIRE
+   select HAVE_PAGE_SIZE_8KB
bool
 
 config MMU_SUN3
bool
+   select HAVE_PAGE_SIZE_8KB
depends on MMU && !MMU_MOTOROLA && !MMU_COLDFIRE
 
 config ARCH_SUPPORTS_KEXEC
diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index 9dcf245c9cbf..c777a129768a 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -30,6 +30,7 @@ config COLDFIRE
select GENERIC_CSUM
select GPIOLIB
select HAVE_LEGACY_CLK
+   select HAVE_PAGE_SIZE_8KB if !MMU
 
 endchoice
 
@@ -45,6 +46,7 @@ config M68000
   

[PATCH 2/4] arch: simplify architecture specific page size configuration

2024-02-26 Thread Arnd Bergmann
From: Arnd Bergmann 

arc, arm64, parisc and powerpc all have their own Kconfig symbols
in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
so the common symbols are the ones that are actually used, while
leaving the architecture specific ones as the user visible
place for configuring it, to avoid breaking user configs.

Signed-off-by: Arnd Bergmann 
---
 arch/arc/Kconfig  |  3 +++
 arch/arc/include/uapi/asm/page.h  |  6 ++
 arch/arm64/Kconfig| 29 +
 arch/arm64/include/asm/page-def.h |  2 +-
 arch/parisc/Kconfig   |  3 +++
 arch/parisc/include/asm/page.h| 10 +-
 arch/powerpc/Kconfig  | 31 ++-
 arch/powerpc/include/asm/page.h   |  2 +-
 scripts/gdb/linux/constants.py.in |  2 +-
 scripts/gdb/linux/mm.py   |  2 +-
 10 files changed, 32 insertions(+), 58 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 1b0483c51cc1..4092bec198be 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -284,14 +284,17 @@ choice
 
 config ARC_PAGE_SIZE_8K
bool "8KB"
+   select HAVE_PAGE_SIZE_8KB
help
  Choose between 8k vs 16k
 
 config ARC_PAGE_SIZE_16K
+   select HAVE_PAGE_SIZE_16KB
bool "16KB"
 
 config ARC_PAGE_SIZE_4K
bool "4KB"
+   select HAVE_PAGE_SIZE_4KB
depends on ARC_MMU_V3 || ARC_MMU_V4
 
 endchoice
diff --git a/arch/arc/include/uapi/asm/page.h b/arch/arc/include/uapi/asm/page.h
index 2a4ad619abfb..7fd9e741b527 100644
--- a/arch/arc/include/uapi/asm/page.h
+++ b/arch/arc/include/uapi/asm/page.h
@@ -13,10 +13,8 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#if defined(CONFIG_ARC_PAGE_SIZE_16K)
-#define PAGE_SHIFT 14
-#elif defined(CONFIG_ARC_PAGE_SIZE_4K)
-#define PAGE_SHIFT 12
+#ifdef __KERNEL__
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #else
 /*
  * Default 8k
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index aa7c1d435139..29290b8cb36d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -277,27 +277,21 @@ config 64BIT
 config MMU
def_bool y
 
-config ARM64_PAGE_SHIFT
-   int
-   default 16 if ARM64_64K_PAGES
-   default 14 if ARM64_16K_PAGES
-   default 12
-
 config ARM64_CONT_PTE_SHIFT
int
-   default 5 if ARM64_64K_PAGES
-   default 7 if ARM64_16K_PAGES
+   default 5 if PAGE_SIZE_64KB
+   default 7 if PAGE_SIZE_16KB
default 4
 
 config ARM64_CONT_PMD_SHIFT
int
-   default 5 if ARM64_64K_PAGES
-   default 5 if ARM64_16K_PAGES
+   default 5 if PAGE_SIZE_64KB
+   default 5 if PAGE_SIZE_16KB
default 4
 
 config ARCH_MMAP_RND_BITS_MIN
-   default 14 if ARM64_64K_PAGES
-   default 16 if ARM64_16K_PAGES
+   default 14 if PAGE_SIZE_64KB
+   default 16 if PAGE_SIZE_16KB
default 18
 
 # max bits determined by the following formula:
@@ -1259,11 +1253,13 @@ choice
 
 config ARM64_4K_PAGES
bool "4KB"
+   select HAVE_PAGE_SIZE_4KB
help
  This feature enables 4KB pages support.
 
 config ARM64_16K_PAGES
bool "16KB"
+   select HAVE_PAGE_SIZE_16KB
help
  The system will use 16KB pages support. AArch32 emulation
  requires applications compiled with 16K (or a multiple of 16K)
@@ -1271,6 +1267,7 @@ config ARM64_16K_PAGES
 
 config ARM64_64K_PAGES
bool "64KB"
+   select HAVE_PAGE_SIZE_64KB
help
  This feature enables 64KB pages support (4KB by default)
  allowing only two levels of page tables and faster TLB
@@ -1291,19 +1288,19 @@ choice
 
 config ARM64_VA_BITS_36
bool "36-bit" if EXPERT
-   depends on ARM64_16K_PAGES
+   depends on PAGE_SIZE_16KB
 
 config ARM64_VA_BITS_39
bool "39-bit"
-   depends on ARM64_4K_PAGES
+   depends on PAGE_SIZE_4KB
 
 config ARM64_VA_BITS_42
bool "42-bit"
-   depends on ARM64_64K_PAGES
+   depends on PAGE_SIZE_64KB
 
 config ARM64_VA_BITS_47
bool "47-bit"
-   depends on ARM64_16K_PAGES
+   depends on PAGE_SIZE_16KB
 
 config ARM64_VA_BITS_48
bool "48-bit"
diff --git a/arch/arm64/include/asm/page-def.h b/arch/arm64/include/asm/page-def.h
index 2403f7b4cdbf..792e9fe881dc 100644
--- a/arch/arm64/include/asm/page-def.h
+++ b/arch/arm64/include/asm/page-def.h
@@ -11,7 +11,7 @@
 #include 
 
 /* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT CONFIG_ARM64_PAGE_SHIFT
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (_AC(1, UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 5c845e8d59d9..b180e684fa0d 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -273,6 +273,7 @@ choice
 
 config PARISC_PAGE_SIZE_4KB
bool "4KB"
+   select HAVE_PAGE_SIZE_4KB
help
  This lets you select the page size of the kernel.  For 

[PATCH 1/4] arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions

2024-02-26 Thread Arnd Bergmann
From: Arnd Bergmann 

These four architectures define the same Kconfig symbols for configuring
the page size. Move the logic into a common place where it can be shared
with all other architectures.
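
The PAGE_SHIFT defaults in the arch/Kconfig hunk below are the base-2
logarithm of each page size in bytes. A small sketch of that arithmetic, where
page_shift_for_kb() is a hypothetical helper and not something the patch adds:

```c
#include <assert.h>

/*
 * Hypothetical helper: returns the shift for a page size given in KB.
 * size_kb must be a non-zero power of two.
 */
unsigned int page_shift_for_kb(unsigned long size_kb)
{
	unsigned int shift = 10;	/* 1KB == 1 << 10 bytes */

	assert(size_kb != 0 && (size_kb & (size_kb - 1)) == 0);

	/* Each doubling of the size adds one to the shift. */
	while (size_kb > 1) {
		size_kb >>= 1;
		shift++;
	}
	return shift;
}
```

For the six sizes offered by the choice this gives 12, 13, 14, 15, 16 and 18.
A 1MB page would give 20, the value the old hexagon header used for
CONFIG_PAGE_SIZE_1MB.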

Signed-off-by: Arnd Bergmann 
---
 arch/Kconfig  | 58 +--
 arch/hexagon/Kconfig  | 25 +++--
 arch/hexagon/include/asm/page.h   |  6 +---
 arch/loongarch/Kconfig| 21 ---
 arch/loongarch/include/asm/page.h | 10 +-
 arch/mips/Kconfig | 58 +++
 arch/mips/include/asm/page.h  | 16 +
 arch/sh/include/asm/page.h| 13 +--
 arch/sh/mm/Kconfig| 42 +++---
 9 files changed, 88 insertions(+), 161 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index a5af0edd3eb8..237cea01ed9b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1078,17 +1078,71 @@ config HAVE_ARCH_COMPAT_MMAP_BASES
  and vice-versa 32-bit applications to call 64-bit mmap().
  Required for applications doing different bitness syscalls.
 
+config HAVE_PAGE_SIZE_4KB
+   bool
+
+config HAVE_PAGE_SIZE_8KB
+   bool
+
+config HAVE_PAGE_SIZE_16KB
+   bool
+
+config HAVE_PAGE_SIZE_32KB
+   bool
+
+config HAVE_PAGE_SIZE_64KB
+   bool
+
+config HAVE_PAGE_SIZE_256KB
+   bool
+
+choice
+   prompt "MMU page size"
+
+config PAGE_SIZE_4KB
+   bool "4KB pages"
+   depends on HAVE_PAGE_SIZE_4KB
+
+config PAGE_SIZE_8KB
+   bool "8KB pages"
+   depends on HAVE_PAGE_SIZE_8KB
+
+config PAGE_SIZE_16KB
+   bool "16KB pages"
+   depends on HAVE_PAGE_SIZE_16KB
+
+config PAGE_SIZE_32KB
+   bool "32KB pages"
+   depends on HAVE_PAGE_SIZE_32KB
+
+config PAGE_SIZE_64KB
+   bool "64KB pages"
+   depends on HAVE_PAGE_SIZE_64KB
+
+config PAGE_SIZE_256KB
+   bool "256KB pages"
+   depends on HAVE_PAGE_SIZE_256KB
+
+endchoice
+
 config PAGE_SIZE_LESS_THAN_64KB
def_bool y
-   depends on !ARM64_64K_PAGES
depends on !PAGE_SIZE_64KB
-   depends on !PARISC_PAGE_SIZE_64KB
depends on PAGE_SIZE_LESS_THAN_256KB
 
 config PAGE_SIZE_LESS_THAN_256KB
def_bool y
depends on !PAGE_SIZE_256KB
 
+config PAGE_SHIFT
+   int
+   default 12 if PAGE_SIZE_4KB
+   default 13 if PAGE_SIZE_8KB
+   default 14 if PAGE_SIZE_16KB
+   default 15 if PAGE_SIZE_32KB
+   default 16 if PAGE_SIZE_64KB
+   default 18 if PAGE_SIZE_256KB
+
 # This allows to use a set of generic functions to determine mmap base
 # address by giving priority to top-down scheme only if the process
 # is not in legacy mode (compat task, unlimited stack size or
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index a880ee067d2e..aac46ee1a000 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -8,6 +8,11 @@ config HEXAGON
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_NO_PREEMPT
select DMA_GLOBAL_POOL
+   select FRAME_POINTER
+   select HAVE_PAGE_SIZE_4KB
+   select HAVE_PAGE_SIZE_16KB
+   select HAVE_PAGE_SIZE_64KB
+   select HAVE_PAGE_SIZE_256KB
# Other pending projects/to-do items.
# select HAVE_REGS_AND_STACK_ACCESS_API
# select HAVE_HW_BREAKPOINT if PERF_EVENTS
@@ -120,26 +125,6 @@ config NR_CPUS
  This is purely to save memory - each supported CPU adds
  approximately eight kilobytes to the kernel image.
 
-choice
-   prompt "Kernel page size"
-   default PAGE_SIZE_4KB
-   help
- Changes the default page size; use with caution.
-
-config PAGE_SIZE_4KB
-   bool "4KB"
-
-config PAGE_SIZE_16KB
-   bool "16KB"
-
-config PAGE_SIZE_64KB
-   bool "64KB"
-
-config PAGE_SIZE_256KB
-   bool "256KB"
-
-endchoice
-
 source "kernel/Kconfig.hz"
 
 endmenu
diff --git a/arch/hexagon/include/asm/page.h b/arch/hexagon/include/asm/page.h
index 10f1bc07423c..65c9bac639fa 100644
--- a/arch/hexagon/include/asm/page.h
+++ b/arch/hexagon/include/asm/page.h
@@ -13,27 +13,22 @@
 /*  This is probably not the most graceful way to handle this.  */
 
 #ifdef CONFIG_PAGE_SIZE_4KB
-#define PAGE_SHIFT 12
 #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_4KB
 #endif
 
 #ifdef CONFIG_PAGE_SIZE_16KB
-#define PAGE_SHIFT 14
 #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_16KB
 #endif
 
 #ifdef CONFIG_PAGE_SIZE_64KB
-#define PAGE_SHIFT 16
 #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_64KB
 #endif
 
 #ifdef CONFIG_PAGE_SIZE_256KB
-#define PAGE_SHIFT 18
 #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_256KB
 #endif
 
 #ifdef CONFIG_PAGE_SIZE_1MB
-#define PAGE_SHIFT 20
 #define HEXAGON_L1_PTE_SIZE __HVM_PDE_S_1MB
 #endif
 
@@ -50,6 +45,7 @@
 #define HVM_HUGEPAGE_SIZE 0x5
 #endif
 
+#define PAGE_SHIFT CONFIG_PAGE_SHIFT
 #define PAGE_SIZE  (1UL << PAGE_SHIFT)
 #define PAGE_MASK  (~((1 << PAGE_SHIFT) - 1))
 
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 929f68926b34..b274784c2e26 100644
--- 

[PATCH 0/4] arch: mm, vdso: consolidate PAGE_SIZE definition

2024-02-26 Thread Arnd Bergmann
From: Arnd Bergmann 

Naresh noticed that the newly added usage of the PAGE_SIZE macro in
include/vdso/datapage.h introduced a build regression. I had an older
patch that I revived to have this defined through Kconfig rather than
through including asm/page.h, which is not allowed in vdso code.

I rebased and tested on top of the tip/timers/core branch that
introduced the regression. If these patches get added, the
compat VDSOs all build again, but the changes are a bit invasive.

  Arnd

Link: https://lore.kernel.org/lkml/ca+g9fytrxxm_ko9fnpz3xarxhv7ud_yqp-teupqrnrhu+_0...@mail.gmail.com/
Link: https://lore.kernel.org/all/65dc6c14.170a0220.f4a3f.9...@mx.google.com/

Arnd Bergmann (4):
  arch: consolidate existing CONFIG_PAGE_SIZE_*KB definitions
  arch: simplify architecture specific page size configuration
  arch: define CONFIG_PAGE_SIZE_*KB on all architectures
  vdso: avoid including asm/page.h

 arch/Kconfig   | 58 --
 arch/alpha/Kconfig |  1 +
 arch/alpha/include/asm/page.h  |  2 +-
 arch/arc/Kconfig   |  3 ++
 arch/arc/include/uapi/asm/page.h   |  6 ++--
 arch/arm/Kconfig   |  1 +
 arch/arm/include/asm/page.h|  2 +-
 arch/arm64/Kconfig | 29 +++
 arch/arm64/include/asm/page-def.h  |  2 +-
 arch/csky/Kconfig  |  1 +
 arch/csky/include/asm/page.h   |  2 +-
 arch/hexagon/Kconfig   | 25 +++--
 arch/hexagon/include/asm/page.h|  6 +---
 arch/loongarch/Kconfig | 21 ---
 arch/loongarch/include/asm/page.h  | 10 +-
 arch/m68k/Kconfig  |  3 ++
 arch/m68k/Kconfig.cpu  |  2 ++
 arch/m68k/include/asm/page.h   |  6 +---
 arch/microblaze/Kconfig|  1 +
 arch/microblaze/include/asm/page.h |  2 +-
 arch/mips/Kconfig  | 58 +++---
 arch/mips/include/asm/page.h   | 16 +
 arch/nios2/Kconfig |  1 +
 arch/nios2/include/asm/page.h  |  2 +-
 arch/openrisc/Kconfig  |  1 +
 arch/openrisc/include/asm/page.h   |  2 +-
 arch/parisc/Kconfig|  3 ++
 arch/parisc/include/asm/page.h | 10 +-
 arch/powerpc/Kconfig   | 31 
 arch/powerpc/include/asm/page.h|  2 +-
 arch/riscv/Kconfig |  1 +
 arch/riscv/include/asm/page.h  |  2 +-
 arch/s390/Kconfig  |  1 +
 arch/s390/include/asm/page.h   |  2 +-
 arch/sh/include/asm/page.h | 13 +--
 arch/sh/mm/Kconfig | 42 +++---
 arch/sparc/Kconfig |  2 ++
 arch/sparc/include/asm/page_32.h   |  2 +-
 arch/sparc/include/asm/page_64.h   |  3 +-
 arch/um/Kconfig|  1 +
 arch/um/include/asm/page.h |  2 +-
 arch/x86/Kconfig   |  1 +
 arch/x86/include/asm/page_types.h  |  2 +-
 arch/xtensa/Kconfig|  1 +
 arch/xtensa/include/asm/page.h |  2 +-
 include/vdso/datapage.h|  4 +--
 scripts/gdb/linux/constants.py.in  |  2 +-
 scripts/gdb/linux/mm.py|  2 +-
 48 files changed, 153 insertions(+), 241 deletions(-)

-- 
2.39.2
To: Thomas Gleixner 
To: Vincenzo Frascino 
To: Kees Cook 
To: Anna-Maria Behnsen 
Cc: Matt Turner 
Cc: Vineet Gupta 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Guo Ren 
Cc: Brian Cain 
Cc: Huacai Chen 
Cc: Geert Uytterhoeven 
Cc: Michal Simek 
Cc: Thomas Bogendoerfer 
Cc: Helge Deller 
Cc: Michael Ellerman 
Cc: Christophe Leroy 
Cc: Palmer Dabbelt 
Cc: John Paul Adrian Glaubitz 
Cc: Andreas Larsson 
Cc: Richard Weinberger 
Cc: x...@kernel.org
Cc: Max Filippov 
Cc: Andy Lutomirski 
Cc: Vincenzo Frascino 
Cc: Jan Kiszka 
Cc: Kieran Bingham 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: linux-ker...@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: linux-hexa...@vger.kernel.org
Cc: loonga...@lists.linux.dev
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: linux-openr...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org



[RFC PATCH 03/14] sched/core: Use TIF_NOTIFY_IPI to notify an idle CPU in TIF_POLLING mode of pending IPI

2024-02-20 Thread K Prateek Nayak
From: "Gautham R. Shenoy" 

Problem statement
=================

When measuring IPI throughput using a modified version of Anton
Blanchard's ipistorm benchmark [1], configured to measure time taken to
perform a fixed number of smp_call_function_single() (with wait set to
1), an increase in benchmark time was observed between v5.7 and the
upstream kernel (v6.7-rc6).

Bisection pointed to commit b2a02fc43a1f ("smp: Optimize
send_call_function_single_ipi()") as the reason behind this increase in
runtime. Reverting the optimization introduced by the above commit fixed
the regression in ipistorm; however, benchmarks like tbench and netperf
regressed with the revert, supporting the validity of the optimization.

Following are the benchmark results on top of tip:sched/core with the
optimization reverted on a dual socket 3rd Generation AMD EPYC system
(2 x 64C/128T) running with boost enabled and C2 disabled:

(tip:sched/core at tag "sched-core-2024-01-08" for all the testing done
below)

  ==
  Test  : ipistorm (modified)
  Units : Normalized runtime
  Interpretation: Lower is better
  Statistic : AMean
  cmdline   : insmod ipistorm.ko numipi=10 single=1 offset=8 cpulist=8 wait=1
  ==
  kernel:   time [pct imp]
  tip:sched/core1.00 [0.00]
  tip:sched/core + revert   0.81 [19.36]

  ==
  Test  : tbench
  Units : Normalized throughput
  Interpretation: Higher is better
  Statistic : AMean
  ==
  Clients:tip[pct imp](CV)   revert[pct imp](CV)
  1 1.00 [  0.00]( 0.24) 0.91 [ -8.96]( 0.30)
  2 1.00 [  0.00]( 0.25) 0.92 [ -8.20]( 0.97)
  4 1.00 [  0.00]( 0.23) 0.91 [ -9.20]( 1.75)
  8 1.00 [  0.00]( 0.69) 0.91 [ -9.48]( 1.56)
 16 1.00 [  0.00]( 0.66) 0.92 [ -8.49]( 2.43)
 32 1.00 [  0.00]( 0.96) 0.89 [-11.13]( 0.96)
 64 1.00 [  0.00]( 1.06) 0.90 [ -9.72]( 2.49)
128 1.00 [  0.00]( 0.70) 0.92 [ -8.36]( 1.26)
256 1.00 [  0.00]( 0.72) 0.97 [ -3.30]( 1.10)
512 1.00 [  0.00]( 0.42) 0.98 [ -1.73]( 0.37)
   1024 1.00 [  0.00]( 0.28) 0.99 [ -1.39]( 0.43)

  ==
  Test  : netperf
  Units : Normalized throughput
  Interpretation: Higher is better
  Statistic : AMean
  ==
  Clients: tip[pct imp](CV)   revert[pct imp](CV)
   1-clients 1.00 [  0.00]( 0.50) 0.89 [-10.51]( 0.20)
   2-clients 1.00 [  0.00]( 1.16) 0.89 [-11.10]( 0.59)
   4-clients 1.00 [  0.00]( 1.03) 0.89 [-10.68]( 0.38)
   8-clients 1.00 [  0.00]( 0.99) 0.89 [-10.54]( 0.50)
  16-clients 1.00 [  0.00]( 0.87) 0.89 [-10.92]( 0.95)
  32-clients 1.00 [  0.00]( 1.24) 0.89 [-10.85]( 0.63)
  64-clients 1.00 [  0.00]( 1.58) 0.90 [-10.11]( 1.18)
  128-clients1.00 [  0.00]( 0.87) 0.89 [-10.94]( 1.11)
  256-clients1.00 [  0.00]( 4.77) 1.00 [ -0.16]( 3.45)
  512-clients1.00 [  0.00](56.16) 1.02 [  2.10](56.05)

Since a simple revert is not a viable solution, the changes in the code
path of call_function_single_prep_ipi(), with and without the
optimization were audited to better understand the effect of the commit.

Effects of call_function_single_prep_ipi()
==========================================

To pull a TIF_POLLING thread out of idle to process an IPI, the sender
sets the TIF_NEED_RESCHED bit in the idle task's thread info in
call_function_single_prep_ipi() and avoids sending an actual IPI to the
target. As a result, the scheduler expects a task to be enqueued when
exiting the idle path. This is not the case with non-polling idle states
where the idle CPU exits the non-polling idle state to process the
interrupt, and since need_resched() returns false, soon goes back to
idle again.
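
The sender-side behaviour described above can be modelled in userspace roughly
as follows. This is a sketch under stated assumptions, not the kernel code:
the bit values, struct idle_flags and prep_ipi() are hypothetical names, and
C11 atomics stand in for the kernel's own primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values; the real ones are architecture specific. */
#define TIF_NEED_RESCHED_BIT	(1u << 0)
#define TIF_POLLING_BIT		(1u << 1)

/* Stand-in for the idle task's thread info flags word. */
struct idle_flags {
	atomic_uint flags;
};

/*
 * Model of the sender side: if the target is polling, set NEED_RESCHED
 * and skip the IPI; otherwise report that a real IPI is required.
 * Returns true when an IPI must be sent.
 */
bool prep_ipi(struct idle_flags *idle)
{
	unsigned int old;

	assert(idle != NULL);
	old = atomic_load(&idle->flags);

	do {
		if (!(old & TIF_POLLING_BIT))
			return true;	/* not polling: send the IPI */
		/* On failure the compare-exchange reloads 'old' and we retry. */
	} while (!atomic_compare_exchange_weak(&idle->flags, &old,
					       old | TIF_NEED_RESCHED_BIT));

	return false;	/* the polling target will notice the flag itself */
}
```

In this model the polling CPU leaves idle with NEED_RESCHED set even though no
task was queued, which is the case the text above identifies as the source of
the extra schedule_idle() work.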

When the TIF_NEED_RESCHED flag is set, do_idle() will call schedule_idle(),
a large part of which runs with local IRQs disabled. In the case of ipistorm,
when measuring IPI throughput, this large IRQ-disabled section delays
processing of IPIs. Further auditing revealed that in the absence of any
runnable tasks, pick_next_task_fair(), which is called from the
pick_next_task() fast path, will always call newidle_balance() in this
scenario, further increasing the time spent in the IRQ-disabled section.

Following is a crude visualization of the problem with the relevant
functions expanded:
--
CPU0CPU1

do_idle() {
  

[RFC PATCH 02/14] sched: Define a need_resched_or_ipi() helper and use it treewide

2024-02-20 Thread K Prateek Nayak
From: "Gautham R. Shenoy" 

Currently TIF_NEED_RESCHED is overloaded to wake up an idle CPU in
TIF_POLLING mode to service an IPI even if no new tasks are being
woken up on that CPU.

In preparation for a proper fix, introduce a new helper,
need_resched_or_ipi(), which is intended to return true if either
the TIF_NEED_RESCHED flag or the TIF_NOTIFY_IPI flag is set. Use this
helper function in place of need_resched() in idle loops where
TIF_POLLING_NRFLAG is set.

To preserve bisectibility and avoid unbreakable idle loops, all the
need_resched() checks within TIF_POLLING_NRFLAGS sections, have been
replaced tree-wide with the need_resched_or_ipi() check.

[ prateek: Replaced some of the missed occurrences of
  need_resched() within TIF_POLLING sections with
  need_resched_or_ipi() ]
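
The intended semantics of the helper can be sketched in userspace as
follows. This is only a model: the bit numbers, the ti_flags word and
poll_idle() are hypothetical stand-ins, and the actual helper added to
include/linux/sched.h may well be written differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers; the real values are per-architecture. */
#define TIF_NEED_RESCHED 3
#define TIF_NOTIFY_IPI   9

/* Stand-in for the idle task's thread_info::flags word. */
static atomic_ulong ti_flags;

static bool test_ti_flag(int bit)
{
	assert(bit >= 0 && bit < (int)(sizeof(unsigned long) * 8));
	return (atomic_load(&ti_flags) & (1UL << bit)) != 0;
}

static bool need_resched(void)
{
	return test_ti_flag(TIF_NEED_RESCHED);
}

/* True if either a reschedule or a pending IPI notification is flagged. */
static bool need_resched_or_ipi(void)
{
	return need_resched() || test_ti_flag(TIF_NOTIFY_IPI);
}

/*
 * Model of a polling idle loop: spin until either flag is set or the
 * spin budget runs out, and report how many iterations were spent.
 */
static int poll_idle(int max_spins)
{
	int spins = 0;

	while (!need_resched_or_ipi() && spins < max_spins)
		spins++;

	return spins;
}
```

If such a loop kept checking only need_resched(), a sender that sets
TIF_NOTIFY_IPI alone would never break it out of idle, which is why the
replacement has to be done in the same step as the flag's introduction.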

Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: Russell King 
Cc: Guo Ren 
Cc: Michal Simek 
Cc: Dinh Nguyen 
Cc: Jonas Bonn 
Cc: Stefan Kristiansson 
Cc: Stafford Horne 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: "Aneesh Kumar K.V" 
Cc: "Naveen N. Rao" 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: John Paul Adrian Glaubitz 
Cc: "David S. Miller" 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: "H. Peter Anvin" 
Cc: "Rafael J. Wysocki" 
Cc: Daniel Lezcano 
Cc: Peter Zijlstra 
Cc: Juri Lelli 
Cc: Vincent Guittot 
Cc: Dietmar Eggemann 
Cc: Steven Rostedt 
Cc: Ben Segall 
Cc: Mel Gorman 
Cc: Daniel Bristot de Oliveira 
Cc: Valentin Schneider 
Cc: Al Viro 
Cc: Linus Walleij 
Cc: Ard Biesheuvel 
Cc: Andrew Donnellan 
Cc: Nicholas Miehlbradt 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Josh Poimboeuf 
Cc: "Kirill A. Shutemov" 
Cc: Rick Edgecombe 
Cc: Tony Battersby 
Cc: Brian Gerst 
Cc: Tim Chen 
Cc: David Vernet 
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: linux-openr...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@vger.kernel.org
Signed-off-by: Gautham R. Shenoy 
Co-developed-by: K Prateek Nayak 
Signed-off-by: K Prateek Nayak 
---
 arch/x86/include/asm/mwait.h  | 2 +-
 arch/x86/kernel/process.c | 2 +-
 drivers/cpuidle/cpuidle-powernv.c | 2 +-
 drivers/cpuidle/cpuidle-pseries.c | 2 +-
 drivers/cpuidle/poll_state.c  | 2 +-
 include/linux/sched.h | 5 +
 include/linux/sched/idle.h| 4 ++--
 kernel/sched/idle.c   | 7 ---
 8 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 778df05f8539..ac1370143407 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -115,7 +115,7 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
}
 
__monitor((void *)&current_thread_info()->flags, 0, 0);
-   if (!need_resched())
+   if (!need_resched_or_ipi())
__mwait(eax, ecx);
}
current_clr_polling();
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b6f4e8399fca..ca6cb7e28cba 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -925,7 +925,7 @@ static __cpuidle void mwait_idle(void)
}
 
__monitor((void *)&current_thread_info()->flags, 0, 0);
-   if (!need_resched()) {
+   if (!need_resched_or_ipi()) {
__sti_mwait(0, 0);
raw_local_irq_disable();
}
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 9ebedd972df0..77c3bb371f56 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -79,7 +79,7 @@ static int snooze_loop(struct cpuidle_device *dev,
dev->poll_time_limit = false;
ppc64_runlatch_off();
HMT_very_low();
-   while (!need_resched()) {
+   while (!need_resched_or_ipi()) {
if (likely(snooze_timeout_en) && get_tb() > snooze_exit_time) {
/*
 * Task has not woken up but we are exiting the polling
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 14db9b7d985d..4f2b490f8b73 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -46,7 +46,7 @@ int snooze_loop(struct cpuidle_device *dev, struct cpuidle_driver *drv,
snooze_exit_time = get_tb() + snooze_timeout;
dev->poll_time_limit = false;
 
-   while (!need_resched()) {
+   while (!need_resched_or_ipi()) {
HMT_low();
HMT_very_low();
if (likely(snooze_timeout_en) && get_tb() > snooze_exit_time) {
diff --git 

[RFC PATCH 01/14] thread_info: Add helpers to test and clear TIF_NOTIFY_IPI

2024-02-20 Thread K Prateek Nayak
From: "Gautham R. Shenoy" 

Introduce the notion of a TIF_NOTIFY_IPI flag. When a processor in
TIF_POLLING mode needs to process an IPI, the sender sets the
NEED_RESCHED bit in the idle task's thread_info to pull the target out
of idle and avoids sending an interrupt to the idle CPU. When
NEED_RESCHED is set, the scheduler assumes that a new task has been
queued on the idle CPU and calls schedule_idle(). However, an IPI on an
idle CPU does not necessarily end up waking a task on that CPU. To avoid
spurious calls to schedule_idle() made on the assumption that an IPI on
an idle CPU always wakes a task there, TIF_NOTIFY_IPI will be used to
pull a TIF_POLLING CPU out of idle.

Since the IPI handlers are processed before the call to schedule_idle(),
schedule_idle() will be called only if one of the handlers has woken up
a new task on the CPU and has set NEED_RESCHED.

Add tif_notify_ipi() and current_clr_notify_ipi() helpers to test
whether TIF_NOTIFY_IPI is set in the current task's thread_info and to
clear it, respectively. These interfaces will be used in subsequent
patches as the TIF_NOTIFY_IPI notion is integrated into the scheduler
and the idle path.
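
The ordering argument above (handlers first, schedule_idle() only if a
handler asked for it) can be modelled with a small userspace sketch. The
struct, the bit positions and the notify_ipi()/idle_exit() names are
assumptions made for illustration only; they are not the helpers this
patch adds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit positions for this model. */
#define TIF_NEED_RESCHED (1UL << 0)
#define TIF_NOTIFY_IPI   (1UL << 1)

struct cpu_model {
	atomic_ulong flags;	/* stand-in for thread_info::flags */
	bool ipi_wakes_task;	/* does the queued IPI handler wake a task? */
	int ipis_handled;
	int schedule_idle_calls;
};

/* Sender side: nudge a polling idle CPU without a real interrupt. */
static void notify_ipi(struct cpu_model *cpu)
{
	assert(cpu != NULL);
	atomic_fetch_or(&cpu->flags, TIF_NOTIFY_IPI);
}

/* Target side: one pass through the modelled idle-exit path. */
static void idle_exit(struct cpu_model *cpu)
{
	unsigned long old;

	assert(cpu != NULL);

	/* Test and clear the notification, then run the "handler". */
	old = atomic_fetch_and(&cpu->flags, ~TIF_NOTIFY_IPI);
	if (old & TIF_NOTIFY_IPI) {
		cpu->ipis_handled++;
		if (cpu->ipi_wakes_task)
			atomic_fetch_or(&cpu->flags, TIF_NEED_RESCHED);
	}

	/* Only a handler that woke a task leads to schedule_idle(). */
	if (atomic_load(&cpu->flags) & TIF_NEED_RESCHED) {
		atomic_fetch_and(&cpu->flags, ~TIF_NEED_RESCHED);
		cpu->schedule_idle_calls++;
	}
}
```

In this model an IPI that wakes nothing is handled without any call to
the schedule_idle() stand-in, whereas overloading NEED_RESCHED would have
counted one call per IPI.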

[ prateek: Split the changes into a separate patch, add commit log ]

Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: Russell King 
Cc: Guo Ren 
Cc: Michal Simek 
Cc: Dinh Nguyen 
Cc: Jonas Bonn 
Cc: Stefan Kristiansson 
Cc: Stafford Horne 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: "Aneesh Kumar K.V" 
Cc: "Naveen N. Rao" 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: John Paul Adrian Glaubitz 
Cc: "David S. Miller" 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: "H. Peter Anvin" 
Cc: "Rafael J. Wysocki" 
Cc: Daniel Lezcano 
Cc: Peter Zijlstra 
Cc: Juri Lelli 
Cc: Vincent Guittot 
Cc: Dietmar Eggemann 
Cc: Steven Rostedt 
Cc: Ben Segall 
Cc: Mel Gorman 
Cc: Daniel Bristot de Oliveira 
Cc: Valentin Schneider 
Cc: Al Viro 
Cc: Linus Walleij 
Cc: Ard Biesheuvel 
Cc: Andrew Donnellan 
Cc: Nicholas Miehlbradt 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Josh Poimboeuf 
Cc: "Kirill A. Shutemov" 
Cc: Rick Edgecombe 
Cc: Tony Battersby 
Cc: Brian Gerst 
Cc: Tim Chen 
Cc: David Vernet 
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: linux-openr...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@vger.kernel.org
Signed-off-by: Gautham R. Shenoy 
Co-developed-by: K Prateek Nayak 
Signed-off-by: K Prateek Nayak 
---
 include/linux/thread_info.h | 43 +
 1 file changed, 43 insertions(+)

diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 9ea0b28068f4..1e10dd8c0227 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -195,6 +195,49 @@ static __always_inline bool tif_need_resched(void)
 
 #endif /* _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H */
 
+#ifdef TIF_NOTIFY_IPI
+
+#ifdef _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H
+
+static __always_inline bool tif_notify_ipi(void)
+{
+   return arch_test_bit(TIF_NOTIFY_IPI,
+(unsigned long *)(&current_thread_info()->flags));
+}
+
+static __always_inline void current_clr_notify_ipi(void)
+{
+   arch_clear_bit(TIF_NOTIFY_IPI,
+  (unsigned long *)(&current_thread_info()->flags));
+}
+
+#else
+
+static __always_inline bool tif_notify_ipi(void)
+{
+   return test_bit(TIF_NOTIFY_IPI,
+   (unsigned long *)(&current_thread_info()->flags));
+}
+
+static __always_inline void current_clr_notify_ipi(void)
+{
+   clear_bit(TIF_NOTIFY_IPI,
+ (unsigned long *)(&current_thread_info()->flags));
+}
+
+#endif /* _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H */
+
+#else /* !TIF_NOTIFY_IPI */
+
+static __always_inline bool tif_notify_ipi(void)
+{
+   return false;
+}
+
+static __always_inline void current_clr_notify_ipi(void) { }
+
+#endif /* TIF_NOTIFY_IPI */
+
 #ifndef CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES
 static inline int arch_within_stack_frames(const void * const stack,
   const void * const stackend,
-- 
2.34.1




[RFC PATCH 00/14] Introducing TIF_NOTIFY_IPI flag

2024-02-20 Thread K Prateek Nayak
Hello everyone,

Before jumping into the issue, let me clarify the Cc list. Everyone has
been cc'ed on Patch 0 through Patch 3. Respective arch maintainers,
reviewers, and committers returned by scripts/get_maintainer.pl have
been cc'ed on the respective arch side changes. Scheduler and CPU Idle
maintainers and reviewers have been included for the entire series. If I
have missed anyone, please do add them. If you would like to be dropped
from the cc list, wholly or partially, for future iterations, please
do let me know.

With that out of the way ...

Problem statement
=================

When measuring IPI throughput using a modified version of Anton
Blanchard's ipistorm benchmark [1], configured to measure time taken to
perform a fixed number of smp_call_function_single() (with wait set to
1), an increase in benchmark time was observed between v5.7 and the
current upstream release (v6.7-rc6 at the time of encounter).

Bisection pointed to commit b2a02fc43a1f ("smp: Optimize
send_call_function_single_ipi()") as the reason behind this increase in
runtime.


Experiments
===========

Since the commit cannot be cleanly reverted on top of the current
tip:sched/core, the effects of the optimizations were reverted by:

1. Removing the check for call_function_single_prep_ipi() in
   send_call_function_single_ipi(). With this change
   send_call_function_single_ipi() always calls
   arch_send_call_function_single_ipi()

2. Removing the call to flush_smp_call_function_queue() in do_idle()
   since every smp_call_function, with (1.), would unconditionally send
   an IPI to an idle CPU in TIF_POLLING mode.

Following is the diff of the above described changes which will be
henceforth referred to as the "revert":

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 31231925f1ec..735184d98c0f 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -332,11 +332,6 @@ static void do_idle(void)
 */
smp_mb__after_atomic();
 
-   /*
-* RCU relies on this call to be done outside of an RCU read-side
-* critical section.
-*/
-   flush_smp_call_function_queue();
schedule_idle();
 
if (unlikely(klp_patch_pending(current)))
diff --git a/kernel/smp.c b/kernel/smp.c
index f085ebcdf9e7..2ff100c41885 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -111,11 +111,9 @@ void __init call_function_init(void)
 static __always_inline void
 send_call_function_single_ipi(int cpu)
 {
-   if (call_function_single_prep_ipi(cpu)) {
-   trace_ipi_send_cpu(cpu, _RET_IP_,
-  generic_smp_call_function_single_interrupt);
-   arch_send_call_function_single_ipi(cpu);
-   }
+   trace_ipi_send_cpu(cpu, _RET_IP_,
+  generic_smp_call_function_single_interrupt);
+   arch_send_call_function_single_ipi(cpu);
 }
 
 static __always_inline void
--

With the revert, the time taken to complete a fixed set of IPIs using
ipistorm improves significantly. Following are the numbers from a dual
socket 3rd Generation EPYC system (2 x 64C/128T) (boost on, C2 disabled)
running ipistorm between CPU8 and CPU16:

cmdline: insmod ipistorm.ko numipi=10 single=1 offset=8 cpulist=8 wait=1

(tip:sched/core at tag "sched-core-2024-01-08" for all the testing done
below)

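  ==================================================================
  Test          : ipistorm (modified)
  Units         : Normalized runtime
  Interpretation: Lower is better
  Statistic     : AMean
  ==================================================================
  kernel:                       time [pct imp]
  tip:sched/core                1.00 [0.00]
  tip:sched/core + revert       0.81 [19.36]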
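
For readers unfamiliar with the "[pct imp]" column, the sketch below shows
one way such a figure can be derived from normalized results. The formula
and the pct_imp() name are assumptions on my part. The bracketed values in
the tables were presumably computed from unrounded data, so feeding in the
rounded 0.81 gives roughly 19.0 rather than 19.36.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Percentage improvement of val over base. For "lower is better"
 * metrics (runtime) a smaller val is an improvement; for "higher is
 * better" metrics (throughput) a larger val is.
 */
static double pct_imp(double base, double val, bool lower_is_better)
{
	double delta;

	assert(base > 0.0);
	delta = lower_is_better ? base - val : val - base;

	return 100.0 * delta / base;
}
```

With this convention, pct_imp(1.00, 0.81, true) is positive (a shorter
runtime), while a throughput drop such as pct_imp(1.00, 0.91, false)
comes out negative.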

Although the revert improves ipistorm performance, it also regresses
tbench and netperf, supporting the validity of the optimization.
Following are netperf and tbench numbers from the same machine comparing
vanilla tip:sched/core and the revert applied on top:

  ==================================================================
  Test          : tbench
  Units         : Normalized throughput
  Interpretation: Higher is better
  Statistic     : AMean
  ==================================================================
  Clients:    tip[pct imp](CV)        revert[pct imp](CV)
      1     1.00 [  0.00]( 0.24)     0.91 [ -8.96]( 0.30)
      2     1.00 [  0.00]( 0.25)     0.92 [ -8.20]( 0.97)
      4     1.00 [  0.00]( 0.23)     0.91 [ -9.20]( 1.75)
      8     1.00 [  0.00]( 0.69)     0.91 [ -9.48]( 1.56)
     16     1.00 [  0.00]( 0.66)     0.92 [ -8.49]( 2.43)
     32     1.00 [  0.00]( 0.96)     0.89 [-11.13]( 0.96)
     64     1.00 [  0.00]( 1.06)     0.90 [ -9.72]( 2.49)
    128     1.00 [  0.00]( 0.70)     0.92 [ -8.36]( 1.26)
    256     1.00 [  0.00]( 0.72)     0.97 [ -3.30]( 1.10)
    512     1.00 [  0.00]( 0.42)     0.98 [ -1.73]( 0.37)
   1024     1.00 [  0.00]( 0.28)     0.99 [ -1.39]( 0.43)

  

[linux-next:master] BUILD REGRESSION abb240f7a2bd14567ab53e602db562bb683391e6

2023-12-12 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: abb240f7a2bd14567ab53e602db562bb683391e6  Add linux-next specific files for 20231212

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202312121926.gc7oytbz-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202312130153.ebbunfqa-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

Warning: MAINTAINERS references a file that doesn't exist: Documentation/devicetree/bindings/display/panel/synaptics,r63353.yaml

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- alpha-randconfig-r113-20231212
|   |-- arch-alpha-mm-fault.c:sparse:sparse:Using-plain-integer-as-NULL-pointer
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   |-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|   |-- lib-zstd-compress-zstd_fast.c:sparse:sparse:Using-plain-integer-as-NULL-pointer
|   |-- sound-soc-codecs-cs42l43.c:sparse:sparse:symbol-cs42l43_hp_ilimit_clear_work-was-not-declared.-Should-it-be-static
|   `-- sound-soc-codecs-cs42l43.c:sparse:sparse:symbol-cs42l43_hp_ilimit_work-was-not-declared.-Should-it-be-static
|-- arc-allmodconfig
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arc-allyesconfig
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arc-randconfig-002-20231212
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arm-allmodconfig
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arm-allyesconfig
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arm-randconfig-001-20231212
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arm-randconfig-002-20231212
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arm-randconfig-003-20231212
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arm-randconfig-004-20231212
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- arm-randconfig-r133-20231212
|   |-- fs-ntfs3-ntfs.h:sparse:sparse:static-assertion-failed:sizeof(struct-ATTR_LIST_ENTRY)
|   `-- lib-zstd-compress-zstd_fast.c:sparse:sparse:Using-plain-integer-as-NULL-pointer
|-- arm64-randconfig-002-20231212
|   `-- WARNING:modpost:missing-MODULE_DESCRIPTION()-in-lib-zlib_inflate-zlib_inflate.o
|-- arm64-randconfig-003-20231212
|   `-- WARNING:modpost:missing-MODULE_DESCRIPTION()-in-lib-zlib_inflate-zlib_inflate.o
|-- arm64-randconfig-004-20231212
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- csky-allmodconfig
|   |-- fs-bcachefs-chardev.c:warning:function-run_thread_with_file-might-be-a-candidate-for-gnu_printf-format-attribute
|   `-- fs-bcachefs-super.c:warning:function-__bch2_print-might-be-a-candidate-for-gnu_printf-format-attribute
|-- csky-allyesconfig
|   |-- 

Re: [PATCH] tty: virtio: drop virtio_cons_early_init()

2023-11-30 Thread Jason Wang
On Thu, Nov 30, 2023 at 7:31 PM Jiri Slaby (SUSE)  wrote:
>
> The last user of virtio_cons_early_init() was dropped in commit
> 7fb2b2d51244 ("s390/virtio: remove the old KVM virtio transport").
>
> So now, drop virtio_cons_early_init() and the logic and headers behind
> too.
>
> Signed-off-by: Jiri Slaby (SUSE) 
> Cc: Richard Henderson 
> Cc: Ivan Kokshaysky 
> Cc: Matt Turner 
> Cc: Amit Shah 
> Cc: Arnd Bergmann 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: Xuan Zhuo 
> Cc: linux-alpha@vger.kernel.org
> Cc: virtualizat...@lists.linux.dev
> ---

Acked-by: Jason Wang 

Thanks




PSA: this list has been migrated (no action required)

2023-11-06 Thread Konstantin Ryabitsev
Hello:

This list has been migrated to the new vger infrastructure. You shouldn't need
to change anything about how you participate with the list or how you receive
mail.

If something isn't working right, please reach out to helpd...@kernel.org.

Best regards,
Konstantin



Re: [PATCH 2/2] rtc/alpha: remove legacy rtc driver

2019-10-23 Thread Paul Gortmaker
[[PATCH 2/2] rtc/alpha: remove legacy rtc driver] On 23/10/2019 (Wed 17:01) Arnd Bergmann wrote:

> The old drivers/char/rtc.c driver was originally the implementation
> for x86 PCs but got subsequently replaced by the rtc class driver
> on all architectures except alpha.
> 
> Move alpha over to the portable driver and remove the old one
> for good.

Git history will show I'm in favour of shoving old code and old drivers
to the curb - even if it is stuff that I wrote myself 20+ years ago!  So
if all users are now on the formalized rtc framework, then this relic
should go away, and you can add my ack to the commit.

Thanks,
Paul.
--

> 
> The CONFIG_JS_RTC option was only ever used on SPARC32 but
> has not been available for many years, this was used to build
> the same rtc driver with a different module name.
> 
> Cc: Richard Henderson 
> Cc: Ivan Kokshaysky 
> Cc: Matt Turner 
> Cc: linux-alpha@vger.kernel.org
> Cc: Paul Gortmaker 
> Signed-off-by: Arnd Bergmann 
> ---
> This was last discussed in early 2018 in
> https://lore.kernel.org/lkml/CAK8P3a0QZNY+K+V1HG056xCerz=_l2jh5ufz+2lwkdqkw5z...@mail.gmail.com/
> 
> Nobody ever replied there, so let's try this instead.
> If there is any reason to keep the driver after all,
> please let us know.
> ---
>  arch/alpha/configs/defconfig |3 +-
>  drivers/char/Kconfig |   56 --
>  drivers/char/Makefile|4 -
>  drivers/char/rtc.c   | 1311 --
>  4 files changed, 2 insertions(+), 1372 deletions(-)
>  delete mode 100644 drivers/char/rtc.c
> 
> diff --git a/arch/alpha/configs/defconfig b/arch/alpha/configs/defconfig
> index f4ec420d7f2d..e10c1be3c0d1 100644
> --- a/arch/alpha/configs/defconfig
> +++ b/arch/alpha/configs/defconfig
> @@ -53,7 +53,8 @@ CONFIG_NET_PCI=y
>  CONFIG_YELLOWFIN=y
>  CONFIG_SERIAL_8250=y
>  CONFIG_SERIAL_8250_CONSOLE=y
> -CONFIG_RTC=y
> +CONFIG_RTC_CLASS=y
> +CONFIG_RTC_DRV_CMOS=y
>  CONFIG_EXT2_FS=y
>  CONFIG_REISERFS_FS=m
>  CONFIG_ISO9660_FS=y
> diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
> index dabbf3f519c6..c2ac4f257c82 100644
> --- a/drivers/char/Kconfig
> +++ b/drivers/char/Kconfig
> @@ -243,62 +243,6 @@ config NVRAM
> To compile this driver as a module, choose M here: the
> module will be called nvram.
>  
> -#
> -# These legacy RTC drivers just cause too many conflicts with the generic
> -# RTC framework ... let's not even try to coexist any more.
> -#
> -if RTC_LIB=n
> -
> -config RTC
> - tristate "Enhanced Real Time Clock Support (legacy PC RTC driver)"
> - depends on ALPHA
> - ---help---
> -   If you say Y here and create a character special file /dev/rtc with
> -   major number 10 and minor number 135 using mknod ("man mknod"), you
> -   will get access to the real time clock (or hardware clock) built
> -   into your computer.
> -
> -   Every PC has such a clock built in. It can be used to generate
> -   signals from as low as 1Hz up to 8192Hz, and can also be used
> -   as a 24 hour alarm. It reports status information via the file
> -   /proc/driver/rtc and its behaviour is set by various ioctls on
> -   /dev/rtc.
> -
> -   If you run Linux on a multiprocessor machine and said Y to
> -   "Symmetric Multi Processing" above, you should say Y here to read
> -   and set the RTC in an SMP compatible fashion.
> -
> -   If you think you have a use for such a device (such as periodic data
> -   sampling), then say Y here, and read 
> 
> -   for details.
> -
> -   To compile this driver as a module, choose M here: the
> -   module will be called rtc.
> -
> -config JS_RTC
> - tristate "Enhanced Real Time Clock Support"
> - depends on SPARC32 && PCI
> - ---help---
> -   If you say Y here and create a character special file /dev/rtc with
> -   major number 10 and minor number 135 using mknod ("man mknod"), you
> -   will get access to the real time clock (or hardware clock) built
> -   into your computer.
> -
> -   Every PC has such a clock built in. It can be used to generate
> -   signals from as low as 1Hz up to 8192Hz, and can also be used
> -   as a 24 hour alarm. It reports status information via the file
> -   /proc/driver/rtc and its behaviour is set by various ioctls on
> -   /dev/rtc.
> -
> -   If you think you have a use for such a device (such as periodic data
> -   sampling), then say Y here, and read 
> 
> -   for details.
> -
> -   To compile this driver as a module, choose M here: the
> -   module will be called js-rtc.
> -
> -endif # RTC_LIB
> -
>  config DTLK
>   tristate "Double Talk PC internal speech card support"
>   depends on ISA
> diff --git a/drivers/char/Makefile b/drivers/char/Makefile
> index abe3138b1f5a..ffce287ef415 100644
> --- a/drivers/char/Makefile
> +++ b/drivers/char/Makefile
> @@ -20,7 +20,6 @@ obj-$(CONFIG_APM_EMULATION) += apm-emulation.o
>  

Re: [PATCH 2/2] rtc/alpha: remove legacy rtc driver

2019-10-23 Thread Alexandre Belloni
On 23/10/2019 17:01:59+0200, Arnd Bergmann wrote:
> The old drivers/char/rtc.c driver was originally the implementation
> for x86 PCs but got subsequently replaced by the rtc class driver
> on all architectures except alpha.
> 
> Move alpha over to the portable driver and remove the old one
> for good.
> 
> The CONFIG_JS_RTC option was only ever used on SPARC32 but
> has not been available for many years, this was used to build
> the same rtc driver with a different module name.
> 
> Cc: Richard Henderson 
> Cc: Ivan Kokshaysky 
> Cc: Matt Turner 
> Cc: linux-alpha@vger.kernel.org
> Cc: Paul Gortmaker 
> Signed-off-by: Arnd Bergmann 
Acked-by: Alexandre Belloni 

> ---
> This was last discussed in early 2018 in
> https://lore.kernel.org/lkml/CAK8P3a0QZNY+K+V1HG056xCerz=_l2jh5ufz+2lwkdqkw5z...@mail.gmail.com/
> 
> Nobody ever replied there, so let's try this instead.
> If there is any reason to keep the driver after all,
> please let us know.
> ---
>  arch/alpha/configs/defconfig |3 +-
>  drivers/char/Kconfig |   56 --
>  drivers/char/Makefile|4 -
>  drivers/char/rtc.c   | 1311 --
>  4 files changed, 2 insertions(+), 1372 deletions(-)
>  delete mode 100644 drivers/char/rtc.c
> 
> diff --git a/arch/alpha/configs/defconfig b/arch/alpha/configs/defconfig
> index f4ec420d7f2d..e10c1be3c0d1 100644
> --- a/arch/alpha/configs/defconfig
> +++ b/arch/alpha/configs/defconfig
> @@ -53,7 +53,8 @@ CONFIG_NET_PCI=y
>  CONFIG_YELLOWFIN=y
>  CONFIG_SERIAL_8250=y
>  CONFIG_SERIAL_8250_CONSOLE=y
> -CONFIG_RTC=y
> +CONFIG_RTC_CLASS=y
> +CONFIG_RTC_DRV_CMOS=y
>  CONFIG_EXT2_FS=y
>  CONFIG_REISERFS_FS=m
>  CONFIG_ISO9660_FS=y
> diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
> index dabbf3f519c6..c2ac4f257c82 100644
> --- a/drivers/char/Kconfig
> +++ b/drivers/char/Kconfig
> @@ -243,62 +243,6 @@ config NVRAM
> To compile this driver as a module, choose M here: the
> module will be called nvram.
>  
> -#
> -# These legacy RTC drivers just cause too many conflicts with the generic
> -# RTC framework ... let's not even try to coexist any more.
> -#
> -if RTC_LIB=n
> -
> -config RTC
> - tristate "Enhanced Real Time Clock Support (legacy PC RTC driver)"
> - depends on ALPHA
> - ---help---
> -   If you say Y here and create a character special file /dev/rtc with
> -   major number 10 and minor number 135 using mknod ("man mknod"), you
> -   will get access to the real time clock (or hardware clock) built
> -   into your computer.
> -
> -   Every PC has such a clock built in. It can be used to generate
> -   signals from as low as 1Hz up to 8192Hz, and can also be used
> -   as a 24 hour alarm. It reports status information via the file
> -   /proc/driver/rtc and its behaviour is set by various ioctls on
> -   /dev/rtc.
> -
> -   If you run Linux on a multiprocessor machine and said Y to
> -   "Symmetric Multi Processing" above, you should say Y here to read
> -   and set the RTC in an SMP compatible fashion.
> -
> -   If you think you have a use for such a device (such as periodic data
> -   sampling), then say Y here, and read 
> 
> -   for details.
> -
> -   To compile this driver as a module, choose M here: the
> -   module will be called rtc.
> -
> -config JS_RTC
> - tristate "Enhanced Real Time Clock Support"
> - depends on SPARC32 && PCI
> - ---help---
> -   If you say Y here and create a character special file /dev/rtc with
> -   major number 10 and minor number 135 using mknod ("man mknod"), you
> -   will get access to the real time clock (or hardware clock) built
> -   into your computer.
> -
> -   Every PC has such a clock built in. It can be used to generate
> -   signals from as low as 1Hz up to 8192Hz, and can also be used
> -   as a 24 hour alarm. It reports status information via the file
> -   /proc/driver/rtc and its behaviour is set by various ioctls on
> -   /dev/rtc.
> -
> -   If you think you have a use for such a device (such as periodic data
> -   sampling), then say Y here, and read 
> 
> -   for details.
> -
> -   To compile this driver as a module, choose M here: the
> -   module will be called js-rtc.
> -
> -endif # RTC_LIB
> -
>  config DTLK
>   tristate "Double Talk PC internal speech card support"
>   depends on ISA
> diff --git a/drivers/char/Makefile b/drivers/char/Makefile
> index abe3138b1f5a..ffce287ef415 100644
> --- a/drivers/char/Makefile
> +++ b/drivers/char/Makefile
> @@ -20,7 +20,6 @@ obj-$(CONFIG_APM_EMULATION) += apm-emulation.o
>  obj-$(CONFIG_DTLK)   += dtlk.o
>  obj-$(CONFIG_APPLICOM)   += applicom.o
>  obj-$(CONFIG_SONYPI) += sonypi.o
> -obj-$(CONFIG_RTC)+= rtc.o
>  obj-$(CONFIG_HPET)   += hpet.o
>  obj-$(CONFIG_XILINX_HWICAP)  += xilinx_hwicap/
>  obj-$(CONFIG_NVRAM)  += nvram.o
> @@ -45,9 

[PATCH 2/2] rtc/alpha: remove legacy rtc driver

2019-10-23 Thread Arnd Bergmann
The old drivers/char/rtc.c driver was originally the implementation
for x86 PCs but got subsequently replaced by the rtc class driver
on all architectures except alpha.

Move alpha over to the portable driver and remove the old one
for good.

The CONFIG_JS_RTC option was only ever used on SPARC32 but
has not been available for many years, this was used to build
the same rtc driver with a different module name.

Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: linux-alpha@vger.kernel.org
Cc: Paul Gortmaker 
Signed-off-by: Arnd Bergmann 
---
This was last discussed in early 2018 in
https://lore.kernel.org/lkml/CAK8P3a0QZNY+K+V1HG056xCerz=_l2jh5ufz+2lwkdqkw5z...@mail.gmail.com/

Nobody ever replied there, so let's try this instead.
If there is any reason to keep the driver after all,
please let us know.
---
 arch/alpha/configs/defconfig |3 +-
 drivers/char/Kconfig |   56 --
 drivers/char/Makefile|4 -
 drivers/char/rtc.c   | 1311 --
 4 files changed, 2 insertions(+), 1372 deletions(-)
 delete mode 100644 drivers/char/rtc.c

diff --git a/arch/alpha/configs/defconfig b/arch/alpha/configs/defconfig
index f4ec420d7f2d..e10c1be3c0d1 100644
--- a/arch/alpha/configs/defconfig
+++ b/arch/alpha/configs/defconfig
@@ -53,7 +53,8 @@ CONFIG_NET_PCI=y
 CONFIG_YELLOWFIN=y
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
-CONFIG_RTC=y
+CONFIG_RTC_CLASS=y
+CONFIG_RTC_DRV_CMOS=y
 CONFIG_EXT2_FS=y
 CONFIG_REISERFS_FS=m
 CONFIG_ISO9660_FS=y
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index dabbf3f519c6..c2ac4f257c82 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -243,62 +243,6 @@ config NVRAM
  To compile this driver as a module, choose M here: the
  module will be called nvram.
 
-#
-# These legacy RTC drivers just cause too many conflicts with the generic
-# RTC framework ... let's not even try to coexist any more.
-#
-if RTC_LIB=n
-
-config RTC
-   tristate "Enhanced Real Time Clock Support (legacy PC RTC driver)"
-   depends on ALPHA
-   ---help---
- If you say Y here and create a character special file /dev/rtc with
- major number 10 and minor number 135 using mknod ("man mknod"), you
- will get access to the real time clock (or hardware clock) built
- into your computer.
-
- Every PC has such a clock built in. It can be used to generate
- signals from as low as 1Hz up to 8192Hz, and can also be used
- as a 24 hour alarm. It reports status information via the file
- /proc/driver/rtc and its behaviour is set by various ioctls on
- /dev/rtc.
-
- If you run Linux on a multiprocessor machine and said Y to
- "Symmetric Multi Processing" above, you should say Y here to read
- and set the RTC in an SMP compatible fashion.
-
- If you think you have a use for such a device (such as periodic data
- sampling), then say Y here, and read 

- for details.
-
- To compile this driver as a module, choose M here: the
- module will be called rtc.
-
-config JS_RTC
-   tristate "Enhanced Real Time Clock Support"
-   depends on SPARC32 && PCI
-   ---help---
- If you say Y here and create a character special file /dev/rtc with
- major number 10 and minor number 135 using mknod ("man mknod"), you
- will get access to the real time clock (or hardware clock) built
- into your computer.
-
- Every PC has such a clock built in. It can be used to generate
- signals from as low as 1Hz up to 8192Hz, and can also be used
- as a 24 hour alarm. It reports status information via the file
- /proc/driver/rtc and its behaviour is set by various ioctls on
- /dev/rtc.
-
- If you think you have a use for such a device (such as periodic data
- sampling), then say Y here, and read 

- for details.
-
- To compile this driver as a module, choose M here: the
- module will be called js-rtc.
-
-endif # RTC_LIB
-
 config DTLK
tristate "Double Talk PC internal speech card support"
depends on ISA
diff --git a/drivers/char/Makefile b/drivers/char/Makefile
index abe3138b1f5a..ffce287ef415 100644
--- a/drivers/char/Makefile
+++ b/drivers/char/Makefile
@@ -20,7 +20,6 @@ obj-$(CONFIG_APM_EMULATION)   += apm-emulation.o
 obj-$(CONFIG_DTLK) += dtlk.o
 obj-$(CONFIG_APPLICOM) += applicom.o
 obj-$(CONFIG_SONYPI)   += sonypi.o
-obj-$(CONFIG_RTC)  += rtc.o
 obj-$(CONFIG_HPET) += hpet.o
 obj-$(CONFIG_XILINX_HWICAP)+= xilinx_hwicap/
 obj-$(CONFIG_NVRAM)+= nvram.o
@@ -45,9 +44,6 @@ obj-$(CONFIG_TCG_TPM) += tpm/
 
 obj-$(CONFIG_PS3_FLASH)+= ps3flash.o
 
-obj-$(CONFIG_JS_RTC)   += js-rtc.o
-js-rtc-y = rtc.o
-
 obj-$(CONFIG_XILLYBUS) += xillybus/
 obj-$(CONFIG_POWERNV_OP_PANEL) 

Re: [PATCH 00/12] mm: remove __ARCH_HAS_4LEVEL_HACK

2019-10-23 Thread Linus Torvalds
On Wed, Oct 23, 2019 at 5:29 AM Mike Rapoport  wrote:
>
> These patches convert several architectures to use page table folding and
> remove __ARCH_HAS_4LEVEL_HACK along with include/asm-generic/4level-fixup.h.

Thanks for doing this.

The patches look sane from a quick scan, and it's definitely the right
thing to do. So ack on my part, but obviously testing the different
architectures would be a really good thing...

Linus


Re: [PATCH 08/12] parisc: use pgtable-nopXd instead of 4level-fixup

2019-10-23 Thread Rolf Eike Beer
diff --git a/arch/parisc/include/asm/page.h 
b/arch/parisc/include/asm/page.h

index 93caf17..1d339ee 100644
--- a/arch/parisc/include/asm/page.h
+++ b/arch/parisc/include/asm/page.h
@@ -42,48 +42,54 @@ typedef struct { unsigned long pte; } pte_t; /*
either 32 or 64bit */

 /* NOTE: even on 64 bits, these entries are __u32 because we allocate
  * the pmd and pgd in ZONE_DMA (i.e. under 4GB) */
-typedef struct { __u32 pmd; } pmd_t;
 typedef struct { __u32 pgd; } pgd_t;
 typedef struct { unsigned long pgprot; } pgprot_t;

-#define pte_val(x) ((x).pte)
-/* These do not work lvalues, so make sure we don't use them as such. 
*/

+#if CONFIG_PGTABLE_LEVELS == 3
+typedef struct { __u32 pmd; } pmd_t;
+#define __pmd(x)   ((pmd_t) { (x) } )
+/* pXd_val() do not work lvalues, so make sure we don't use them as 
such. */


For me it sounds like there is something missing, maybe an "as" before 
lvalues?

And it was "These", so plural, and now it is singular, so do -> does?
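
For context, here is a minimal sketch of what the comment describes. The demo_* names are made up for illustration and are not the parisc definitions. The "+ 0" turns the accessor into an rvalue, so assigning through it does not compile and writes have to go through a setter:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a strictly typed page-table entry. */
typedef struct { uint32_t pmd; } demo_pmd_t;

/* "+ 0" yields an rvalue: "demo_pmd_val(e) = 1" is rejected by the compiler. */
#define demo_pmd_val(x)		((x).pmd + 0)
/* Entries are built with a compound literal. */
#define demo_mk_pmd(x)		((demo_pmd_t) { (x) })
/* Writes go through an explicit setter that takes a pointer. */
#define demo_set_pmd(p, v)	(*(p) = (v))

static uint32_t demo_roundtrip(uint32_t raw)
{
	demo_pmd_t entry = demo_mk_pmd(0);

	assert(demo_pmd_val(entry) == 0);
	demo_set_pmd(&entry, demo_mk_pmd(raw));
	/* demo_pmd_val(entry) = raw;  would not compile: not an lvalue */
	return demo_pmd_val(entry);
}
```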

Eike


Re: [PATCH 02/12] arm: nommu: use pgtable-nopud instead of 4level-fixup

2019-10-23 Thread Russell King - ARM Linux admin
On Wed, Oct 23, 2019 at 12:28:51PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> The generic nommu implementation of page table manipulation takes care of
> folding of the upper levels and does not require fixups.
> 
> Simply replace include/asm-generic/4level-fixup.h with
> include/asm-generic/pgtable-nopud.h.
> 
> Signed-off-by: Mike Rapoport 

Acked-by: Russell King 

Thanks.

> ---
>  arch/arm/include/asm/pgtable.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index 3ae120c..eabcb48 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -12,7 +12,7 @@
>  
>  #ifndef CONFIG_MMU
>  
> -#include 
> +#include 
>  #include 
>  
>  #else
> -- 
> 2.7.4
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


[PATCH 08/12] parisc: use pgtable-nopXd instead of 4level-fixup

2019-10-23 Thread Mike Rapoport
From: Mike Rapoport 

parisc has two or three levels of page tables and can use appropriate
pgtable-nopXd and folding of the upper layers.

Replace usage of include/asm-generic/4level-fixup.h and explicit
definitions of __PAGETABLE_PxD_FOLDED in parisc with
include/asm-generic/pgtable-nopmd.h for two-level configurations and with
include/asm-generic/pgtable-nopud.h for three-level configurations and
adjust page table manipulation macros and functions accordingly.

Signed-off-by: Mike Rapoport 
---
 arch/parisc/include/asm/page.h| 30 +-
 arch/parisc/include/asm/pgalloc.h | 41 +++---
 arch/parisc/include/asm/pgtable.h | 52 +++
 arch/parisc/include/asm/tlb.h |  2 ++
 arch/parisc/kernel/cache.c| 13 ++
 arch/parisc/kernel/pci-dma.c  |  9 +--
 arch/parisc/mm/fixmap.c   | 10 +---
 7 files changed, 81 insertions(+), 76 deletions(-)

diff --git a/arch/parisc/include/asm/page.h b/arch/parisc/include/asm/page.h
index 93caf17..1d339ee 100644
--- a/arch/parisc/include/asm/page.h
+++ b/arch/parisc/include/asm/page.h
@@ -42,48 +42,54 @@ typedef struct { unsigned long pte; } pte_t; /* either 32 
or 64bit */
 
 /* NOTE: even on 64 bits, these entries are __u32 because we allocate
  * the pmd and pgd in ZONE_DMA (i.e. under 4GB) */
-typedef struct { __u32 pmd; } pmd_t;
 typedef struct { __u32 pgd; } pgd_t;
 typedef struct { unsigned long pgprot; } pgprot_t;
 
-#define pte_val(x) ((x).pte)
-/* These do not work lvalues, so make sure we don't use them as such. */
+#if CONFIG_PGTABLE_LEVELS == 3
+typedef struct { __u32 pmd; } pmd_t;
+#define __pmd(x)   ((pmd_t) { (x) } )
+/* pXd_val() do not work lvalues, so make sure we don't use them as such. */
 #define pmd_val(x) ((x).pmd + 0)
+#endif
+
+#define pte_val(x) ((x).pte)
 #define pgd_val(x) ((x).pgd + 0)
 #define pgprot_val(x)  ((x).pgprot)
 
 #define __pte(x)   ((pte_t) { (x) } )
-#define __pmd(x)   ((pmd_t) { (x) } )
 #define __pgd(x)   ((pgd_t) { (x) } )
 #define __pgprot(x)((pgprot_t) { (x) } )
 
-#define __pmd_val_set(x,n) (x).pmd = (n)
-#define __pgd_val_set(x,n) (x).pgd = (n)
-
 #else
 /*
  * .. while these make it easier on the compiler
  */
 typedef unsigned long pte_t;
+
+#if CONFIG_PGTABLE_LEVELS == 3
 typedef __u32 pmd_t;
+#define pmd_val(x)  (x)
+#define __pmd(x)   (x)
+#endif
+
 typedef __u32 pgd_t;
 typedef unsigned long pgprot_t;
 
 #define pte_val(x)  (x)
-#define pmd_val(x)  (x)
 #define pgd_val(x)  (x)
 #define pgprot_val(x)   (x)
 
 #define __pte(x)(x)
-#define __pmd(x)   (x)
 #define __pgd(x)(x)
 #define __pgprot(x) (x)
 
-#define __pmd_val_set(x,n) (x) = (n)
-#define __pgd_val_set(x,n) (x) = (n)
-
 #endif /* STRICT_MM_TYPECHECKS */
 
+#define set_pmd(pmdptr, pmdval) (*(pmdptr) = (pmdval))
+#if CONFIG_PGTABLE_LEVELS == 3
+#define set_pud(pudptr, pudval) (*(pudptr) = (pudval))
+#endif
+
 typedef struct page *pgtable_t;
 
 typedef struct __physmem_range {
diff --git a/arch/parisc/include/asm/pgalloc.h 
b/arch/parisc/include/asm/pgalloc.h
index d98647c..9ac74da 100644
--- a/arch/parisc/include/asm/pgalloc.h
+++ b/arch/parisc/include/asm/pgalloc.h
@@ -34,13 +34,13 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
/* Populate first pmd with allocated memory.  We mark it
 * with PxD_FLAG_ATTACHED as a signal to the system that this
 * pmd entry may not be cleared. */
-   __pgd_val_set(*actual_pgd, (PxD_FLAG_PRESENT | 
-   PxD_FLAG_VALID | 
-   PxD_FLAG_ATTACHED) 
-   + (__u32)(__pa((unsigned long)pgd) >> PxD_VALUE_SHIFT));
+   set_pgd(actual_pgd, __pgd((PxD_FLAG_PRESENT |
+   PxD_FLAG_VALID |
+   PxD_FLAG_ATTACHED)
+   + (__u32)(__pa((unsigned long)pgd) >> 
PxD_VALUE_SHIFT)));
/* The first pmd entry also is marked with PxD_FLAG_ATTACHED as
 * a signal that this pmd may not be freed */
-   __pgd_val_set(*pgd, PxD_FLAG_ATTACHED);
+   set_pgd(pgd, __pgd(PxD_FLAG_ATTACHED));
 #endif
}
spin_lock_init(pgd_spinlock(actual_pgd));
@@ -59,10 +59,10 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t 
*pgd)
 
 /* Three Level Page Table Support for pmd's */
 
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pmd_t *pmd)
+static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 {
-   __pgd_val_set(*pgd, (PxD_FLAG_PRESENT | PxD_FLAG_VALID) +
-   (__u32)(__pa((unsigned long)pmd) >> PxD_VALUE_SHIFT));
+   set_pud(pud, __pud((PxD_FLAG_PRESENT | PxD_FLAG_VALID) +
+   (__u32)(__pa((unsigned long)pmd) >> PxD_VALUE_SHIFT)));
 }
 
 static inline 

[PATCH 09/12] sparc32: use pgtable-nopud instead of 4level-fixup

2019-10-23 Thread Mike Rapoport
From: Mike Rapoport 

32-bit version of sparc has three-level page tables and can use
pgtable-nopud and folding of the upper layers.

Replace usage of include/asm-generic/4level-fixup.h with
include/asm-generic/pgtable-nopud.h and adjust page table manipulation
macros and functions accordingly.
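
As a rough illustration of what "folding" means here, the sketch below uses toy_* types that are made up for this example and are not the kernel definitions. A folded level is a wrapper around the level above it, and its offset helper hands back the same entry, so a pgd -> p4d -> pud walk costs nothing for the missing levels:

```c
#include <assert.h>
#include <stddef.h>

/* Toy top-level entry; the real pgd_t is architecture specific. */
typedef struct { unsigned long val; } toy_pgd_t;
/* Folded levels only wrap the level above them. */
typedef struct { toy_pgd_t pgd; } toy_p4d_t;
typedef struct { toy_p4d_t p4d; } toy_pud_t;

/* A folded level does no lookup: it returns the same entry, retyped. */
static inline toy_p4d_t *toy_p4d_offset(toy_pgd_t *pgd, unsigned long addr)
{
	(void)addr;
	return (toy_p4d_t *)pgd;
}

static inline toy_pud_t *toy_pud_offset(toy_p4d_t *p4d, unsigned long addr)
{
	(void)addr;
	return (toy_pud_t *)p4d;
}

/* Returns 1 when walking through the folded levels lands on the pgd entry. */
static int toy_walk_is_folded(toy_pgd_t *pgd, unsigned long addr)
{
	toy_p4d_t *p4d;
	toy_pud_t *pud;

	assert(pgd != NULL);
	p4d = toy_p4d_offset(pgd, addr);
	pud = toy_pud_offset(p4d, addr);
	return (void *)pud == (void *)pgd;
}
```

With this shape, callers can always be written against the full set of levels, and only the real levels do any work.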

Signed-off-by: Mike Rapoport 
---
 arch/sparc/include/asm/pgalloc_32.h |  6 ++---
 arch/sparc/include/asm/pgtable_32.h | 28 ++--
 arch/sparc/mm/fault_32.c| 11 ++--
 arch/sparc/mm/highmem.c |  6 -
 arch/sparc/mm/io-unit.c |  6 -
 arch/sparc/mm/iommu.c   |  6 -
 arch/sparc/mm/srmmu.c   | 51 +
 7 files changed, 81 insertions(+), 33 deletions(-)

diff --git a/arch/sparc/include/asm/pgalloc_32.h 
b/arch/sparc/include/asm/pgalloc_32.h
index 10538a4..eae0c92 100644
--- a/arch/sparc/include/asm/pgalloc_32.h
+++ b/arch/sparc/include/asm/pgalloc_32.h
@@ -26,14 +26,14 @@ static inline void free_pgd_fast(pgd_t *pgd)
 #define pgd_free(mm, pgd)  free_pgd_fast(pgd)
 #define pgd_alloc(mm)  get_pgd_fast()
 
-static inline void pgd_set(pgd_t * pgdp, pmd_t * pmdp)
+static inline void pud_set(pud_t * pudp, pmd_t * pmdp)
 {
unsigned long pa = __nocache_pa(pmdp);
 
-   set_pte((pte_t *)pgdp, __pte((SRMMU_ET_PTD | (pa >> 4))));
+   set_pte((pte_t *)pudp, __pte((SRMMU_ET_PTD | (pa >> 4))));
 }
 
-#define pgd_populate(MM, PGD, PMD)  pgd_set(PGD, PMD)
+#define pud_populate(MM, PGD, PMD)  pud_set(PGD, PMD)
 
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm,
   unsigned long address)
diff --git a/arch/sparc/include/asm/pgtable_32.h 
b/arch/sparc/include/asm/pgtable_32.h
index 31da448..6d6f44c 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -12,7 +12,7 @@
 #include 
 
 #ifndef __ASSEMBLY__
-#include 
+#include 
 
 #include 
 #include 
@@ -132,12 +132,12 @@ static inline struct page *pmd_page(pmd_t pmd)
return pfn_to_page((pmd_val(pmd) & SRMMU_PTD_PMASK) >> (PAGE_SHIFT-4));
 }
 
-static inline unsigned long pgd_page_vaddr(pgd_t pgd)
+static inline unsigned long pud_page_vaddr(pud_t pud)
 {
-   if (srmmu_device_memory(pgd_val(pgd))) {
+   if (srmmu_device_memory(pud_val(pud))) {
return ~0;
} else {
-   unsigned long v = pgd_val(pgd) & SRMMU_PTD_PMASK;
+   unsigned long v = pud_val(pud) & SRMMU_PTD_PMASK;
return (unsigned long)__nocache_va(v << 4);
}
 }
@@ -184,24 +184,24 @@ static inline void pmd_clear(pmd_t *pmdp)
set_pte((pte_t *)>pmdv[i], __pte(0));
 }
 
-static inline int pgd_none(pgd_t pgd)  
+static inline int pud_none(pud_t pud)
 {
-   return !(pgd_val(pgd) & 0xFFF);
+   return !(pud_val(pud) & 0xFFF);
 }
 
-static inline int pgd_bad(pgd_t pgd)
+static inline int pud_bad(pud_t pud)
 {
-   return (pgd_val(pgd) & SRMMU_ET_MASK) != SRMMU_ET_PTD;
+   return (pud_val(pud) & SRMMU_ET_MASK) != SRMMU_ET_PTD;
 }
 
-static inline int pgd_present(pgd_t pgd)
+static inline int pud_present(pud_t pud)
 {
-   return ((pgd_val(pgd) & SRMMU_ET_MASK) == SRMMU_ET_PTD);
+   return ((pud_val(pud) & SRMMU_ET_MASK) == SRMMU_ET_PTD);
 }
 
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void pud_clear(pud_t *pudp)
 {
-   set_pte((pte_t *)pgdp, __pte(0));
+   set_pte((pte_t *)pudp, __pte(0));
 }
 
 /*
@@ -319,9 +319,9 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #define pgd_offset_k(address) pgd_offset(_mm, address)
 
 /* Find an entry in the second-level page table.. */
-static inline pmd_t *pmd_offset(pgd_t * dir, unsigned long address)
+static inline pmd_t *pmd_offset(pud_t * dir, unsigned long address)
 {
-   return (pmd_t *) pgd_page_vaddr(*dir) +
+   return (pmd_t *) pud_page_vaddr(*dir) +
((address >> PMD_SHIFT) & (PTRS_PER_PMD - 1));
 }
 
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index 8d69de1..89976c9 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -351,6 +351,8 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int 
text_fault, int write,
 */
int offset = pgd_index(address);
pgd_t *pgd, *pgd_k;
+   p4d_t *p4d, *p4d_k;
+   pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;
 
pgd = tsk->active_mm->pgd + offset;
@@ -363,8 +365,13 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int 
text_fault, int write,
return;
}
 
-   pmd = pmd_offset(pgd, address);
-   pmd_k = pmd_offset(pgd_k, address);
+   p4d = p4d_offset(pgd, address);
+   pud = pud_offset(p4d, address);
+   pmd = pmd_offset(pud, address);
+
+   p4d_k = p4d_offset(pgd_k, address);
+   

[PATCH 03/12] c6x: use pgtable-nopud instead of 4level-fixup

2019-10-23 Thread Mike Rapoport
From: Mike Rapoport 

c6x is a nommu architecture and does not require fixup for upper layers of
the page tables because it is already handled by the generic nommu
implementation.

Replace usage of include/asm-generic/4level-fixup.h with
include/asm-generic/pgtable-nopud.h

Signed-off-by: Mike Rapoport 
---
 arch/c6x/include/asm/pgtable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/c6x/include/asm/pgtable.h b/arch/c6x/include/asm/pgtable.h
index 0b6919c..197c473 100644
--- a/arch/c6x/include/asm/pgtable.h
+++ b/arch/c6x/include/asm/pgtable.h
@@ -8,7 +8,7 @@
 #ifndef _ASM_C6X_PGTABLE_H
 #define _ASM_C6X_PGTABLE_H
 
-#include 
+#include 
 
 #include 
 #include 
-- 
2.7.4



Re: [PATCH 03/21] ia64: rename ioremap_nocache to ioremap_uc

2019-10-21 Thread Sergei Shtylyov

Hello!

On 17.10.2019 20:45, Christoph Hellwig wrote:


On ia64 ioremap_nocache fails if attributs don't match.  Not other


  Attributes?


architectures does this, and we plan to get rid of ioremap_nocache.
So get rid of the special semantics and define ioremap_nocache in
terms of ioremap as no portable driver could rely on the behavior
anyway.

However x86 implements ioremap_uc with a in a similar way as the ia64


   With a what?


version of ioremap_nocache, so implement that instead.

Signed-off-by: Christoph Hellwig 
---
  arch/ia64/include/asm/io.h | 6 +++---
  arch/ia64/mm/ioremap.c | 4 ++--
  2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/ia64/include/asm/io.h b/arch/ia64/include/asm/io.h
index 54e70c21352a..fec9df9609ed 100644
--- a/arch/ia64/include/asm/io.h
+++ b/arch/ia64/include/asm/io.h

[...]

MBR, Sergei


Re: [PATCH 20/21] csky: remove ioremap_cache

2019-10-21 Thread Guo Ren
Acked-by: Guo Ren 

On Fri, Oct 18, 2019 at 1:47 AM Christoph Hellwig  wrote:
>
> No driver that can be used on csky uses ioremap_cache, and this
> interface has been deprecated in favor of memremap.
>
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/csky/include/asm/io.h | 2 --
>  arch/csky/mm/ioremap.c | 7 ---
>  2 files changed, 9 deletions(-)
>
> diff --git a/arch/csky/include/asm/io.h b/arch/csky/include/asm/io.h
> index a4b9fb616faa..f572605d5ad5 100644
> --- a/arch/csky/include/asm/io.h
> +++ b/arch/csky/include/asm/io.h
> @@ -36,13 +36,11 @@
>  /*
>   * I/O memory mapping functions.
>   */
> -extern void __iomem *ioremap_cache(phys_addr_t addr, size_t size);
>  extern void __iomem *__ioremap(phys_addr_t addr, size_t size, pgprot_t prot);
>  extern void iounmap(void *addr);
>
>  #define ioremap(addr, size)__ioremap((addr), (size), 
> pgprot_noncached(PAGE_KERNEL))
>  #define ioremap_wc(addr, size) __ioremap((addr), (size), 
> pgprot_writecombine(PAGE_KERNEL))
> -#define ioremap_cache  ioremap_cache
>
>  #include 
>
> diff --git a/arch/csky/mm/ioremap.c b/arch/csky/mm/ioremap.c
> index e13cd3497628..ae78256a56fd 100644
> --- a/arch/csky/mm/ioremap.c
> +++ b/arch/csky/mm/ioremap.c
> @@ -44,13 +44,6 @@ void __iomem *__ioremap(phys_addr_t phys_addr, size_t 
> size, pgprot_t prot)
>  }
>  EXPORT_SYMBOL(__ioremap);
>
> -void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
> -{
> -   return __ioremap_caller(phys_addr, size, PAGE_KERNEL,
> -   __builtin_return_address(0));
> -}
> -EXPORT_SYMBOL(ioremap_cache);
> -
>  void iounmap(void __iomem *addr)
>  {
> vunmap((void *)((unsigned long)addr & PAGE_MASK));
> --
> 2.20.1
>


-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/


Re: [PATCH 13/21] m68k: rename __iounmap and mark it static

2019-10-18 Thread Geert Uytterhoeven
Hi Christoph,

On Thu, Oct 17, 2019 at 7:53 PM Christoph Hellwig  wrote:
> m68k uses __iounmap as the name for an internal helper that is only
> used for some CPU types.  Mark it static and give it a better name.
>
> Signed-off-by: Christoph Hellwig 

Thanks for your patch!

> --- a/arch/m68k/mm/kmap.c
> +++ b/arch/m68k/mm/kmap.c
> @@ -52,6 +52,7 @@ static inline void free_io_area(void *addr)
>
>  #define IO_SIZE(256*1024)
>
> +static void __free_io_area(void *addr, unsigned long size);
>  static struct vm_struct *iolist;
>
>  static struct vm_struct *get_io_area(unsigned long size)
> @@ -90,7 +91,7 @@ static inline void free_io_area(void *addr)
> if (tmp->addr == addr) {
> *p = tmp->next;
> /* remove gap added in get_io_area() */
> -   __iounmap(tmp->addr, tmp->size - IO_SIZE);
> +   __free_io_area(tmp->addr, tmp->size - IO_SIZE);
> kfree(tmp);
> return;
> }
> @@ -249,12 +250,13 @@ void iounmap(void __iomem *addr)
>  }
>  EXPORT_SYMBOL(iounmap);
>
> +#ifndef CPU_M68040_OR_M68060_ONLY

Can you please move this block up, instead of adding more #ifdef cluttery?
That would also remove the need for a forward declaration.

>  /*
> - * __iounmap unmaps nearly everything, so be careful
> + * __free_io_area unmaps nearly everything, so be careful
>   * Currently it doesn't free pointer/page tables anymore but this
>   * wasn't used anyway and might be added later.
>   */
> -void __iounmap(void *addr, unsigned long size)
> +static void __free_io_area(void *addr, unsigned long size)
>  {
> unsigned long virtaddr = (unsigned long)addr;
> pgd_t *pgd_dir;
> @@ -297,6 +299,7 @@ void __iounmap(void *addr, unsigned long size)
>
> flush_tlb_all();
>  }
> +#endif /* CPU_M68040_OR_M68060_ONLY */
>
>  /*
>   * Set new cache mode for some kernel address space.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Some Alphas broken by f75b99d5a77d (PCI: Enforce bus address limits in resource allocation)

2019-10-17 Thread Matt Turner
On Mon, Apr 23, 2018 at 10:34 AM Ivan Kokshaysky
 wrote:
>
> On Sun, Apr 22, 2018 at 01:07:38PM -0700, Matt Turner wrote:
> > On Wed, Apr 18, 2018 at 1:48 PM, Ivan Kokshaysky
> >  wrote:
> > > On Tue, Apr 17, 2018 at 02:43:44PM -0500, Bjorn Helgaas wrote:
> > >> On Mon, Apr 16, 2018 at 09:43:42PM -0700, Matt Turner wrote:
> > >> > On Mon, Apr 16, 2018 at 2:50 PM, Bjorn Helgaas  
> > >> > wrote:
> > >> > > Hi Matt,
> > >> > >
> > >> > > First of all, sorry about breaking Nautilus, and thanks very much for
> > >> > > tracking it down to this commit.
> > >> >
> > >> > It's a particularly weird case, as far as I've been able to discern :)
> > >> >
> > >> > > On Mon, Apr 16, 2018 at 07:33:57AM -0700, Matt Turner wrote:
> > >> > >> Commit f75b99d5a77d63f20e07bd276d5a427808ac8ef6 (PCI: Enforce bus
> > >> > >> address limits in resource allocation) broke Alpha systems using
> > >> > >> CONFIG_ALPHA_NAUTILUS. Alpha is 64-bit, but Nautilus systems use a
> > >> > >> 32-bit AMD 751/761 chipset. arch/alpha/kernel/sys_nautilus.c maps 
> > >> > >> PCI
> > >> > >> into the upper addresses just below 4GB.
> > >> > >>
> > >> > >> I can get a working kernel by ifdef'ing out the code in
> > >> > >> drivers/pci/bus.c:pci_bus_alloc_resource. We can't tie
> > >> > >> PCI_BUS_ADDR_T_64BIT to ALPHA_NAUTILUS without breaking generic
> > >> > >> kernels.
> > >> > >>
> > >> > >> How can we get Nautilus working again?
> > >> > >
> > >> > > Can you collect a complete dmesg log, ideally both before and after
> > >> > > f75b99d5a77d?  I assume the problem is that after f75b99d5a77d? we
> > >> > > erroneously assign space for something above 4GB.  But if we know the
> > >> > > correct host bridge apertures, we shouldn't assign space outside 
> > >> > > them,
> > >> > > regardless of the PCI bus address size.
> > >> >
> > >> > I made a mistake in my initial report. Commit f75b99d5a77d is actually
> > >> > the last *working* commit. My apologies. The next commit is
> > >> > d56dbf5bab8c (PCI: Allocate 64-bit BARs above 4G when possible) and it
> > >> > breaks Nautilus I've confirmed.
> > >> >
> > >> > Please find attached dmesgs from those two commits, from the commit
> > >> > immediately before them, and another from 4.17-rc1 with my hack of #if
> > >> > 0'ing out the pci_bus_alloc_from_region(..., _high) code.
> > >> >
> > >> > Thanks for having a look!
> > >>
> > >> We're telling the PCI core that the host bridge MMIO aperture is the
> > >> entire 64-bit address space, so when we assign BARs, some of them end
> > >> up above 4GB:
> > >>
> > >>   pci_bus :00: root bus resource [mem 0x-0x]
> > >>   pci :00:09.0: BAR 0: assigned [mem 0x1-0x1 64bit]
> > >>
> > >> But it sounds like the MMIO aperture really ends at 0x, so
> > >> that's not going to work.
> > >
> > > Correct... This would do as a quick fix, I think:
> > >
> > > diff --git a/arch/alpha/kernel/sys_nautilus.c 
> > > b/arch/alpha/kernel/sys_nautilus.c
> > > index ff4f54b..477ba65 100644
> > > --- a/arch/alpha/kernel/sys_nautilus.c
> > > +++ b/arch/alpha/kernel/sys_nautilus.c
> > > @@ -193,6 +193,8 @@ static struct resource irongate_io = {
> > >  };
> > >  static struct resource irongate_mem = {
> > > .name   = "Irongate PCI MEM",
> > > +   .start  = 0,
> > > +   .end= 0x,
> > > .flags  = IORESOURCE_MEM,
> > >  };
> > >  static struct resource busn_resource = {
> > > @@ -218,7 +220,7 @@ nautilus_init_pci(void)
> > > return;
> > >
> > > pci_add_resource(>windows, _resource);
> > > -   pci_add_resource(>windows, _resource);
> > > +   pci_add_resource(>windows, _mem);
> > > pci_add_resource(>windows, _resource);
> > > bridge->dev.parent = NULL;
> > > bridge->sysdata = hose;
> >
> > Thanks. But with that I get
> >
> > PCI host bridge to bus :00
> > pci_bus :00: root bus resource [io  0x-0x]
> > pci_bus :00: root bus resource [mem 0x-0x]
> > pci_bus :00: root bus resource [bus 00-ff]
> > pci :00:10.0: [Firmware Bug]: reg 0x10: invalid BAR (can't size)
> > pci :00:10.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size)
> > pci :00:10.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size)
> > pci :00:10.0: [Firmware Bug]: reg 0x1c: invalid BAR (can't size)
> > pci :00:10.0: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
> > pci :00:10.0: legacy IDE quirk: reg 0x14: [io  0x03f6]
> > pci :00:10.0: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
> > pci :00:10.0: legacy IDE quirk: reg 0x1c: [io  0x0376]
> > pci :00:11.0: quirk: [io  0x4000-0x403f] claimed by ali7101 ACPI
> > pci :00:11.0: quirk: [io  0x5000-0x501f] claimed by ali7101 SMB
> > pci :00:01.0: BAR 9: assigned [mem 0xc000-0xc2ff pref]
> > pci :00:01.0: BAR 8: assigned [mem 0xc300-0xc3bf]
> > pci :00:0b.0: BAR 6: assigned [mem 0xc3c0-0xc3c3 pref]
> > pci :00:08.0: 

Re: [PATCH 18/21] riscv: use the generic ioremap code

2019-10-17 Thread Paul Walmsley
On Thu, 17 Oct 2019, Christoph Hellwig wrote:

> Use the generic ioremap code instead of providing a local version.
> Note that this relies on the asm-generic no-op definition of
> pgprot_noncached.
> 
> Signed-off-by: Christoph Hellwig 

According to the series introduction E-mail:

https://lore.kernel.org/linux-riscv/20191017174554.29840-1-...@lst.de/T/#m9ac4010fd725c8c84179fa99aa391a6f701a32de

nothing substantive related to RISC-V or the common code has changed since 
the first version of this series, and this RISC-V-specific patch appears 
to be quite close (if not identical) to the first version of the patch:

https://lore.kernel.org/linux-riscv/alpine.deb.2.21..1908171421560.4...@viisi.sifive.com/

Thus the Tested-by, Reviewed-by, and Acked-by for RISC-V should all still 
apply:

https://lore.kernel.org/linux-riscv/alpine.deb.2.21..1908171421560.4...@viisi.sifive.com/


- Paul


Re: [PATCH 07/21] parisc: remove __ioremap

2019-10-17 Thread Rolf Eike Beer
Christoph Hellwig wrote:
> __ioremap is always called with the _PAGE_NO_CACHE, so fold the whole
> thing and rename it to ioremap.  This allows allows to remove the
^
> special EISA quirk to force _PAGE_NO_CACHE.

Eike



[PATCH 07/21] parisc: remove __ioremap

2019-10-17 Thread Christoph Hellwig
__ioremap is always called with the _PAGE_NO_CACHE, so fold the whole
thing and rename it to ioremap.  This allows removing the
special EISA quirk to force _PAGE_NO_CACHE.

Signed-off-by: Christoph Hellwig 
---
 arch/parisc/include/asm/io.h | 11 +--
 arch/parisc/mm/ioremap.c | 10 --
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/arch/parisc/include/asm/io.h b/arch/parisc/include/asm/io.h
index 93d37010b375..46212b52c23e 100644
--- a/arch/parisc/include/asm/io.h
+++ b/arch/parisc/include/asm/io.h
@@ -127,16 +127,7 @@ static inline void gsc_writeq(unsigned long long val, 
unsigned long addr)
 /*
  * The standard PCI ioremap interfaces
  */
-
-extern void __iomem * __ioremap(unsigned long offset, unsigned long size, 
unsigned long flags);
-
-/* Most machines react poorly to I/O-space being cacheable... Instead let's
- * define ioremap() in terms of ioremap_nocache().
- */
-static inline void __iomem * ioremap(unsigned long offset, unsigned long size)
-{
-   return __ioremap(offset, size, _PAGE_NO_CACHE);
-}
+void __iomem *ioremap(unsigned long offset, unsigned long size);
 #define ioremap_nocache(off, sz)   ioremap((off), (sz))
 #define ioremap_wc ioremap_nocache
 #define ioremap_uc ioremap_nocache
diff --git a/arch/parisc/mm/ioremap.c b/arch/parisc/mm/ioremap.c
index f29f682352f0..6e7c005aa09b 100644
--- a/arch/parisc/mm/ioremap.c
+++ b/arch/parisc/mm/ioremap.c
@@ -25,7 +25,7 @@
  * have to convert them into an offset in a page-aligned mapping, but the
  * caller shouldn't need to know that small detail.
  */
-void __iomem * __ioremap(unsigned long phys_addr, unsigned long size, unsigned 
long flags)
+void __iomem *ioremap(unsigned long phys_addr, unsigned long size)
 {
void __iomem *addr;
struct vm_struct *area;
@@ -36,10 +36,8 @@ void __iomem * __ioremap(unsigned long phys_addr, unsigned 
long size, unsigned l
unsigned long end = phys_addr + size - 1;
/* Support EISA addresses */
if ((phys_addr >= 0x0008 && end < 0x000f) ||
-   (phys_addr >= 0x0050 && end < 0x03bf)) {
+   (phys_addr >= 0x0050 && end < 0x03bf))
phys_addr |= F_EXTEND(0xfc00);
-   flags |= _PAGE_NO_CACHE;
-   }
 #endif
 
/* Don't allow wraparound or zero size */
@@ -65,7 +63,7 @@ void __iomem * __ioremap(unsigned long phys_addr, unsigned 
long size, unsigned l
}
 
pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY |
- _PAGE_ACCESSED | flags);
+ _PAGE_ACCESSED | _PAGE_NO_CACHE);
 
/*
 * Mappings have to be page-aligned
@@ -90,7 +88,7 @@ void __iomem * __ioremap(unsigned long phys_addr, unsigned 
long size, unsigned l
 
return (void __iomem *) (offset + (char __iomem *)addr);
 }
-EXPORT_SYMBOL(__ioremap);
+EXPORT_SYMBOL(ioremap);
 
 void iounmap(const volatile void __iomem *io_addr)
 {
-- 
2.20.1



[PATCH 17/21] lib: provide a simple generic ioremap implementation

2019-10-17 Thread Christoph Hellwig
A lot of architectures reuse the same simple ioremap implementation, so
start lifting the simplest variant to lib/ioremap.c.  It provides
ioremap_prot and iounmap, plus a default ioremap that uses prot_noncached,
although that can be overridden by asm/io.h.
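
The page-alignment arithmetic the generic implementation relies on can be sketched in plain userspace C as below. The demo_* names and the fixed 4096-byte page size are assumptions made for this example only; the kernel code uses PAGE_MASK and PAGE_ALIGN() and then maps the aligned range.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the sketch; the kernel uses PAGE_SIZE/PAGE_MASK. */
#define DEMO_PAGE_SIZE	4096ULL
#define DEMO_PAGE_MASK	(~(DEMO_PAGE_SIZE - 1))

struct demo_span {
	uint64_t base;		/* page-aligned start of the mapping */
	uint64_t size;		/* length rounded up to whole pages */
	uint64_t offset;	/* offset of the original address in its page */
	int valid;		/* 0 if the request was rejected */
};

static struct demo_span demo_align(uint64_t addr, uint64_t size)
{
	struct demo_span s = { 0 };
	uint64_t last = addr + size - 1;

	/* Disallow zero size and address wrap-around. */
	if (!size || last < addr)
		return s;

	s.offset = addr & ~DEMO_PAGE_MASK;
	s.base = addr - s.offset;
	s.size = (size + s.offset + DEMO_PAGE_SIZE - 1) & DEMO_PAGE_MASK;
	s.valid = 1;

	assert((s.base & ~DEMO_PAGE_MASK) == 0);
	return s;
}
```

The caller would get back base + offset, which is why a driver can pass an unaligned physical address and still receive a pointer to exactly that byte.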

Signed-off-by: Christoph Hellwig 
---
 include/asm-generic/io.h | 20 
 lib/Kconfig  |  3 +++
 lib/ioremap.c| 39 +++
 3 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 4e45e1cb6560..4a661fdd1937 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -923,9 +923,10 @@ static inline void *phys_to_virt(unsigned long address)
  * DOC: ioremap() and ioremap_*() variants
  *
  * Architectures with an MMU are expected to provide ioremap() and iounmap()
- * themselves.  For NOMMU architectures we provide a default nop-op
- * implementation that expect that the physical address used for MMIO are
- * already marked as uncached, and can be used as kernel virtual addresses.
+ * themselves or rely on GENERIC_IOREMAP.  For NOMMU architectures we provide
+ * a default nop-op implementation that expect that the physical address used
+ * for MMIO are already marked as uncached, and can be used as kernel virtual
+ * addresses.
  *
  * ioremap_wc() and ioremap_wt() can provide more relaxed caching attributes
  * for specific drivers if the architecture choses to implement them.  If they
@@ -946,7 +947,18 @@ static inline void iounmap(void __iomem *addr)
 {
 }
 #endif
-#endif /* CONFIG_MMU */
+#elif defined(CONFIG_GENERIC_IOREMAP)
+#include 
+
+void __iomem *ioremap_prot(phys_addr_t addr, size_t size, unsigned long prot);
+void iounmap(volatile void __iomem *addr);
+
+static inline void __iomem *ioremap(phys_addr_t addr, size_t size)
+{
+   /* _PAGE_IOREMAP needs to be supplied by the architecture */
+   return ioremap_prot(addr, size, _PAGE_IOREMAP);
+}
+#endif /* !CONFIG_MMU || CONFIG_GENERIC_IOREMAP */
 
 #ifndef ioremap_nocache
 #define ioremap_nocache ioremap
diff --git a/lib/Kconfig b/lib/Kconfig
index 183f92a297ca..afc78aaf2b25 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -638,6 +638,9 @@ config STRING_SELFTEST
 
 endmenu
 
+config GENERIC_IOREMAP
+   bool
+
 config GENERIC_LIB_ASHLDI3
bool
 
diff --git a/lib/ioremap.c b/lib/ioremap.c
index 0a2ffadc6d71..3f0e18543de8 100644
--- a/lib/ioremap.c
+++ b/lib/ioremap.c
@@ -231,3 +231,42 @@ int ioremap_page_range(unsigned long addr,
 
return err;
 }
+
+#ifdef CONFIG_GENERIC_IOREMAP
+void __iomem *ioremap_prot(phys_addr_t addr, size_t size, unsigned long prot)
+{
+   unsigned long offset, vaddr;
+   phys_addr_t last_addr;
+   struct vm_struct *area;
+
+   /* Disallow wrap-around or zero size */
+   last_addr = addr + size - 1;
+   if (!size || last_addr < addr)
+   return NULL;
+
+   /* Page-align mappings */
+   offset = addr & (~PAGE_MASK);
+   addr -= offset;
+   size = PAGE_ALIGN(size + offset);
+
+   area = get_vm_area_caller(size, VM_IOREMAP,
+   __builtin_return_address(0));
+   if (!area)
+   return NULL;
+   vaddr = (unsigned long)area->addr;
+
+   if (ioremap_page_range(vaddr, vaddr + size, addr, __pgprot(prot))) {
+   free_vm_area(area);
+   return NULL;
+   }
+
+   return (void __iomem *)(vaddr + offset);
+}
+EXPORT_SYMBOL(ioremap_prot);
+
+void iounmap(volatile void __iomem *addr)
+{
+   vunmap((void *)((unsigned long)addr & PAGE_MASK));
+}
+EXPORT_SYMBOL(iounmap);
+#endif /* CONFIG_GENERIC_IOREMAP */
-- 
2.20.1



[PATCH 12/21] arch: rely on asm-generic/io.h for default ioremap_* definitions

2019-10-17 Thread Christoph Hellwig
Various architectures that use asm-generic/io.h still defined their
own default versions of ioremap_nocache, ioremap_wt and ioremap_wc
that point back to plain ioremap directly or indirectly.  Remove these
definitions and rely on asm-generic/io.h instead.  For this to work
the backup ioremap_* definitions need to be changed to pure cpp
macros instead of inlines to cover for architectures like openrisc
that only define ioremap after including that header.
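
To illustrate why a cpp macro fallback works where an inline would not,
here is a minimal user-space sketch.  It is not the kernel code: the
identity-mapping ioremap() body and the map_write_combined() helper are
assumptions made up for the example.  The point is that the macro is only
expanded at its point of use, so ioremap may be declared after the
fallback section.

```c
#include <stddef.h>
#include <stdint.h>

/* Fallback section (stands in for the generic header): pure cpp macros. */
#ifndef ioremap_wc
#define ioremap_wc ioremap
#endif
#ifndef ioremap_wt
#define ioremap_wt ioremap
#endif

/*
 * "Architecture" code that only declares ioremap after the fallbacks.
 * The identity mapping is an assumption for illustration only.
 */
static inline void *ioremap(uintptr_t phys, size_t size)
{
	(void)size;
	return (void *)phys;
}

/* Hypothetical caller: ioremap_wc expands to ioremap here, where it is visible. */
static void *map_write_combined(uintptr_t phys, size_t size)
{
	return ioremap_wc(phys, size);
}
```

An inline fallback such as "static inline ... ioremap_wc(...) { return
ioremap(...); }" placed in the fallback section would instead reference
ioremap before any declaration of it had been seen.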

Signed-off-by: Christoph Hellwig 
---
 arch/arc/include/asm/io.h|  4 
 arch/arm/include/asm/io.h|  1 -
 arch/arm64/include/asm/io.h  |  2 --
 arch/csky/include/asm/io.h   |  1 -
 arch/ia64/include/asm/io.h   |  1 -
 arch/microblaze/include/asm/io.h |  3 ---
 arch/nios2/include/asm/io.h  |  4 
 arch/openrisc/include/asm/io.h   |  1 -
 arch/riscv/include/asm/io.h  | 10 --
 arch/s390/include/asm/io.h   |  4 
 arch/x86/include/asm/io.h|  1 -
 arch/xtensa/include/asm/io.h |  4 
 include/asm-generic/io.h | 18 +++---
 13 files changed, 3 insertions(+), 51 deletions(-)

diff --git a/arch/arc/include/asm/io.h b/arch/arc/include/asm/io.h
index 72f7929736f8..8f777d6441a5 100644
--- a/arch/arc/include/asm/io.h
+++ b/arch/arc/include/asm/io.h
@@ -34,10 +34,6 @@ static inline void ioport_unmap(void __iomem *addr)
 
 extern void iounmap(const void __iomem *addr);
 
-#define ioremap_nocache(phy, sz)   ioremap(phy, sz)
-#define ioremap_wc(phy, sz)ioremap(phy, sz)
-#define ioremap_wt(phy, sz)ioremap(phy, sz)
-
 /*
  * io{read,write}{16,32}be() macros
  */
diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index 924f9dd502ed..aefdabdbeb84 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -392,7 +392,6 @@ static inline void memcpy_toio(volatile void __iomem *to, 
const void *from,
  */
 void __iomem *ioremap(resource_size_t res_cookie, size_t size);
 #define ioremap ioremap
-#define ioremap_nocache ioremap
 
 /*
  * Do not use ioremap_cache for mapping memory. Use memremap instead.
diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 323cb306bd28..4e531f57147d 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -167,9 +167,7 @@ extern void iounmap(volatile void __iomem *addr);
 extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size);
 
 #define ioremap(addr, size)__ioremap((addr), (size), 
__pgprot(PROT_DEVICE_nGnRE))
-#define ioremap_nocache(addr, size)__ioremap((addr), (size), 
__pgprot(PROT_DEVICE_nGnRE))
 #define ioremap_wc(addr, size) __ioremap((addr), (size), 
__pgprot(PROT_NORMAL_NC))
-#define ioremap_wt(addr, size) __ioremap((addr), (size), 
__pgprot(PROT_DEVICE_nGnRE))
 
 /*
  * PCI configuration space mapping function.
diff --git a/arch/csky/include/asm/io.h b/arch/csky/include/asm/io.h
index 80d071e2567f..a4b9fb616faa 100644
--- a/arch/csky/include/asm/io.h
+++ b/arch/csky/include/asm/io.h
@@ -42,7 +42,6 @@ extern void iounmap(void *addr);
 
 #define ioremap(addr, size)__ioremap((addr), (size), 
pgprot_noncached(PAGE_KERNEL))
 #define ioremap_wc(addr, size) __ioremap((addr), (size), 
pgprot_writecombine(PAGE_KERNEL))
-#define ioremap_nocache(addr, size)ioremap((addr), (size))
 #define ioremap_cache  ioremap_cache
 
 #include 
diff --git a/arch/ia64/include/asm/io.h b/arch/ia64/include/asm/io.h
index fec9df9609ed..3d666a11a2de 100644
--- a/arch/ia64/include/asm/io.h
+++ b/arch/ia64/include/asm/io.h
@@ -263,7 +263,6 @@ static inline void __iomem * ioremap_cache (unsigned long 
phys_addr, unsigned lo
return ioremap(phys_addr, size);
 }
 #define ioremap ioremap
-#define ioremap_nocache ioremap
 #define ioremap_cache ioremap_cache
 #define ioremap_uc ioremap_uc
 #define iounmap iounmap
diff --git a/arch/microblaze/include/asm/io.h b/arch/microblaze/include/asm/io.h
index 86c95b2a1ce1..d33c61737b8b 100644
--- a/arch/microblaze/include/asm/io.h
+++ b/arch/microblaze/include/asm/io.h
@@ -39,9 +39,6 @@ extern resource_size_t isa_mem_base;
 extern void iounmap(volatile void __iomem *addr);
 
 extern void __iomem *ioremap(phys_addr_t address, unsigned long size);
-#define ioremap_nocache(addr, size)ioremap((addr), (size))
-#define ioremap_wc(addr, size) ioremap((addr), (size))
-#define ioremap_wt(addr, size) ioremap((addr), (size))
 
 #endif /* CONFIG_MMU */
 
diff --git a/arch/nios2/include/asm/io.h b/arch/nios2/include/asm/io.h
index 74ab34aa6731..d108937c321e 100644
--- a/arch/nios2/include/asm/io.h
+++ b/arch/nios2/include/asm/io.h
@@ -33,10 +33,6 @@ static inline void iounmap(void __iomem *addr)
__iounmap(addr);
 }
 
-#define ioremap_nocache ioremap
-#define ioremap_wc ioremap
-#define ioremap_wt ioremap
-
 /* Pages to physical address... */
 #define page_to_phys(page) virt_to_phys(page_to_virt(page))
 
diff --git 

[PATCH 15/21] nios2: remove __iounmap

2019-10-17 Thread Christoph Hellwig
No need to indirect iounmap for nios2.

Signed-off-by: Christoph Hellwig 
---
 arch/nios2/include/asm/io.h | 7 +--
 arch/nios2/mm/ioremap.c | 6 +++---
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/nios2/include/asm/io.h b/arch/nios2/include/asm/io.h
index d108937c321e..746853ac7d8d 100644
--- a/arch/nios2/include/asm/io.h
+++ b/arch/nios2/include/asm/io.h
@@ -26,12 +26,7 @@
 #define writel_relaxed(x, addr)writel(x, addr)
 
 void __iomem *ioremap(unsigned long physaddr, unsigned long size);
-extern void __iounmap(void __iomem *addr);
-
-static inline void iounmap(void __iomem *addr)
-{
-   __iounmap(addr);
-}
+void iounmap(void __iomem *addr);
 
 /* Pages to physical address... */
 #define page_to_phys(page) virt_to_phys(page_to_virt(page))
diff --git a/arch/nios2/mm/ioremap.c b/arch/nios2/mm/ioremap.c
index 7a1a27f3daa3..b56af759dcdf 100644
--- a/arch/nios2/mm/ioremap.c
+++ b/arch/nios2/mm/ioremap.c
@@ -157,11 +157,11 @@ void __iomem *ioremap(unsigned long phys_addr, unsigned 
long size)
 EXPORT_SYMBOL(ioremap);
 
 /*
- * __iounmap unmaps nearly everything, so be careful
+ * iounmap unmaps nearly everything, so be careful
  * it doesn't free currently pointer/page tables anymore but it
  * wasn't used anyway and might be added later.
  */
-void __iounmap(void __iomem *addr)
+void iounmap(void __iomem *addr)
 {
struct vm_struct *p;
 
@@ -173,4 +173,4 @@ void __iounmap(void __iomem *addr)
pr_err("iounmap: bad address %p\n", addr);
kfree(p);
 }
-EXPORT_SYMBOL(__iounmap);
+EXPORT_SYMBOL(iounmap);
-- 
2.20.1



[PATCH 13/21] m68k: rename __iounmap and mark it static

2019-10-17 Thread Christoph Hellwig
m68k uses __iounmap as the name for an internal helper that is only
used for some CPU types.  Mark it static and give it a better name.

Signed-off-by: Christoph Hellwig 
---
 arch/m68k/include/asm/kmap.h | 1 -
 arch/m68k/mm/kmap.c  | 9 ++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/m68k/include/asm/kmap.h b/arch/m68k/include/asm/kmap.h
index 421b6c9c769d..559cb91bede1 100644
--- a/arch/m68k/include/asm/kmap.h
+++ b/arch/m68k/include/asm/kmap.h
@@ -20,7 +20,6 @@ extern void __iomem *__ioremap(unsigned long physaddr, 
unsigned long size,
   int cacheflag);
 #define iounmap iounmap
 extern void iounmap(void __iomem *addr);
-extern void __iounmap(void *addr, unsigned long size);
 
 #define ioremap ioremap
 static inline void __iomem *ioremap(unsigned long physaddr, unsigned long size)
diff --git a/arch/m68k/mm/kmap.c b/arch/m68k/mm/kmap.c
index 40a3b327da07..4c279cf0bcc8 100644
--- a/arch/m68k/mm/kmap.c
+++ b/arch/m68k/mm/kmap.c
@@ -52,6 +52,7 @@ static inline void free_io_area(void *addr)
 
 #define IO_SIZE(256*1024)
 
+static void __free_io_area(void *addr, unsigned long size);
 static struct vm_struct *iolist;
 
 static struct vm_struct *get_io_area(unsigned long size)
@@ -90,7 +91,7 @@ static inline void free_io_area(void *addr)
if (tmp->addr == addr) {
*p = tmp->next;
/* remove gap added in get_io_area() */
-   __iounmap(tmp->addr, tmp->size - IO_SIZE);
+   __free_io_area(tmp->addr, tmp->size - IO_SIZE);
kfree(tmp);
return;
}
@@ -249,12 +250,13 @@ void iounmap(void __iomem *addr)
 }
 EXPORT_SYMBOL(iounmap);
 
+#ifndef CPU_M68040_OR_M68060_ONLY
 /*
- * __iounmap unmaps nearly everything, so be careful
+ * __free_io_area unmaps nearly everything, so be careful
  * Currently it doesn't free pointer/page tables anymore but this
  * wasn't used anyway and might be added later.
  */
-void __iounmap(void *addr, unsigned long size)
+static void __free_io_area(void *addr, unsigned long size)
 {
unsigned long virtaddr = (unsigned long)addr;
pgd_t *pgd_dir;
@@ -297,6 +299,7 @@ void __iounmap(void *addr, unsigned long size)
 
flush_tlb_all();
 }
+#endif /* CPU_M68040_OR_M68060_ONLY */
 
 /*
  * Set new cache mode for some kernel address space.
-- 
2.20.1



[PATCH 18/21] riscv: use the generic ioremap code

2019-10-17 Thread Christoph Hellwig
Use the generic ioremap code instead of providing a local version.
Note that this relies on the asm-generic no-op definition of
pgprot_noncached.
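
The generic ioremap_prot() rejects zero-sized or wrapping requests and
page-aligns the range before mapping it.  That arithmetic can be sketched
in user space as below.  The sketch_* names and the fixed 4096-byte page
size are assumptions for the example, not kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the sketch; real kernels take this from the arch. */
#define SKETCH_PAGE_SIZE UINT64_C(4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

struct sketch_span {
	uint64_t base;   /* page-aligned start of the mapping */
	uint64_t size;   /* length rounded up to whole pages */
	uint64_t offset; /* offset of the requested address in the first page */
};

/* Mirrors the checks and alignment steps of the generic helper. */
static bool sketch_align(uint64_t addr, uint64_t size, struct sketch_span *out)
{
	uint64_t last_addr = addr + size - 1;

	/* Disallow wrap-around or zero size */
	if (!size || last_addr < addr)
		return false;

	/* Page-align the mapping and remember the offset for the caller */
	out->offset = addr & ~SKETCH_PAGE_MASK;
	out->base = addr - out->offset;
	out->size = (size + out->offset + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
	return true;
}
```

The caller gets back the virtual base plus the saved offset, which is why
iounmap() can recover the mapping by masking the address with PAGE_MASK.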

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/Kconfig   |  1 +
 arch/riscv/include/asm/io.h  |  3 --
 arch/riscv/include/asm/pgtable.h |  6 +++
 arch/riscv/mm/Makefile   |  1 -
 arch/riscv/mm/ioremap.c  | 84 
 5 files changed, 7 insertions(+), 88 deletions(-)
 delete mode 100644 arch/riscv/mm/ioremap.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 8eebbc8860bb..a02e91ed747a 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -30,6 +30,7 @@ config RISCV
select GENERIC_STRNLEN_USER
select GENERIC_SMP_IDLE_THREAD
select GENERIC_ATOMIC64 if !64BIT
+   select GENERIC_IOREMAP
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ASM_MODVERSIONS
select HAVE_MEMBLOCK_NODE_MAP
diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
index c1de6875cc77..df4c8812ff64 100644
--- a/arch/riscv/include/asm/io.h
+++ b/arch/riscv/include/asm/io.h
@@ -14,9 +14,6 @@
 #include 
 #include 
 
-extern void __iomem *ioremap(phys_addr_t offset, unsigned long size);
-extern void iounmap(volatile void __iomem *addr);
-
 /* Generic IO read/write.  These perform native-endian accesses. */
 #define __raw_writeb __raw_writeb
 static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 7255f2d8395b..65a216e91df2 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -61,6 +61,12 @@
 
 #define PAGE_TABLE __pgprot(_PAGE_TABLE)
 
+/*
+ * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
+ * change the properties of memory regions.
+ */
+#define _PAGE_IOREMAP _PAGE_KERNEL
+
 extern pgd_t swapper_pg_dir[];
 
 /* MAP_PRIVATE permissions: xwr (copy-on-write) */
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 9d9a17335686..b3a356c80c1f 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -8,7 +8,6 @@ endif
 obj-y += init.o
 obj-y += fault.o
 obj-y += extable.o
-obj-y += ioremap.o
 obj-y += cacheflush.o
 obj-y += context.o
 obj-y += sifive_l2_cache.o
diff --git a/arch/riscv/mm/ioremap.c b/arch/riscv/mm/ioremap.c
deleted file mode 100644
index ac621ddb45c0..
--- a/arch/riscv/mm/ioremap.c
+++ /dev/null
@@ -1,84 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * (C) Copyright 1995 1996 Linus Torvalds
- * (C) Copyright 2012 Regents of the University of California
- */
-
-#include 
-#include 
-#include 
-#include 
-
-#include 
-
-/*
- * Remap an arbitrary physical address space into the kernel virtual
- * address space. Needed when the kernel wants to access high addresses
- * directly.
- *
- * NOTE! We need to allow non-page-aligned mappings too: we will obviously
- * have to convert them into an offset in a page-aligned mapping, but the
- * caller shouldn't need to know that small detail.
- */
-static void __iomem *__ioremap_caller(phys_addr_t addr, size_t size,
-   pgprot_t prot, void *caller)
-{
-   phys_addr_t last_addr;
-   unsigned long offset, vaddr;
-   struct vm_struct *area;
-
-   /* Disallow wrap-around or zero size */
-   last_addr = addr + size - 1;
-   if (!size || last_addr < addr)
-   return NULL;
-
-   /* Page-align mappings */
-   offset = addr & (~PAGE_MASK);
-   addr -= offset;
-   size = PAGE_ALIGN(size + offset);
-
-   area = get_vm_area_caller(size, VM_IOREMAP, caller);
-   if (!area)
-   return NULL;
-   vaddr = (unsigned long)area->addr;
-
-   if (ioremap_page_range(vaddr, vaddr + size, addr, prot)) {
-   free_vm_area(area);
-   return NULL;
-   }
-
-   return (void __iomem *)(vaddr + offset);
-}
-
-/*
- * ioremap -   map bus memory into CPU space
- * @offset:bus address of the memory
- * @size:  size of the resource to map
- *
- * ioremap performs a platform specific sequence of operations to
- * make bus memory CPU accessible via the readb/readw/readl/writeb/
- * writew/writel functions and the other mmio helpers. The returned
- * address is not guaranteed to be usable directly as a virtual
- * address.
- *
- * Must be freed with iounmap.
- */
-void __iomem *ioremap(phys_addr_t offset, unsigned long size)
-{
-   return __ioremap_caller(offset, size, PAGE_KERNEL,
-   __builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap);
-
-
-/**
- * iounmap - Free a IO remapping
- * @addr: virtual address from ioremap_*
- *
- * Caller must ensure there is only one unmapping for the same pointer.
- */
-void iounmap(volatile void __iomem *addr)
-{
-   vunmap((void *)((unsigned long)addr & PAGE_MASK));
-}
-EXPORT_SYMBOL(iounmap);
-- 
2.20.1



[PATCH 14/21] hexagon: remove __iounmap

2019-10-17 Thread Christoph Hellwig
No need to indirect iounmap for hexagon.

Signed-off-by: Christoph Hellwig 
---
 arch/hexagon/include/asm/io.h   | 7 +--
 arch/hexagon/kernel/hexagon_ksyms.c | 2 +-
 arch/hexagon/mm/ioremap.c   | 2 +-
 3 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/hexagon/include/asm/io.h b/arch/hexagon/include/asm/io.h
index 89537dc1cf97..539e3efcf39c 100644
--- a/arch/hexagon/include/asm/io.h
+++ b/arch/hexagon/include/asm/io.h
@@ -27,7 +27,7 @@
 extern int remap_area_pages(unsigned long start, unsigned long phys_addr,
unsigned long end, unsigned long flags);
 
-extern void __iounmap(const volatile void __iomem *addr);
+extern void iounmap(const volatile void __iomem *addr);
 
 /* Defined in lib/io.c, needed for smc91x driver. */
 extern void __raw_readsw(const void __iomem *addr, void *data, int wordlen);
@@ -175,11 +175,6 @@ void __iomem *ioremap(unsigned long phys_addr, unsigned 
long size);
 #define ioremap_nocache ioremap
 
 
-static inline void iounmap(volatile void __iomem *addr)
-{
-   __iounmap(addr);
-}
-
 #define __raw_writel writel
 
 static inline void memcpy_fromio(void *dst, const volatile void __iomem *src,
diff --git a/arch/hexagon/kernel/hexagon_ksyms.c 
b/arch/hexagon/kernel/hexagon_ksyms.c
index b3dbb472572e..6fb1aaab1c29 100644
--- a/arch/hexagon/kernel/hexagon_ksyms.c
+++ b/arch/hexagon/kernel/hexagon_ksyms.c
@@ -14,7 +14,7 @@
 EXPORT_SYMBOL(__clear_user_hexagon);
 EXPORT_SYMBOL(raw_copy_from_user);
 EXPORT_SYMBOL(raw_copy_to_user);
-EXPORT_SYMBOL(__iounmap);
+EXPORT_SYMBOL(iounmap);
 EXPORT_SYMBOL(__strnlen_user);
 EXPORT_SYMBOL(__vmgetie);
 EXPORT_SYMBOL(__vmsetie);
diff --git a/arch/hexagon/mm/ioremap.c b/arch/hexagon/mm/ioremap.c
index b103d83b5fbb..255c5b1ee1a7 100644
--- a/arch/hexagon/mm/ioremap.c
+++ b/arch/hexagon/mm/ioremap.c
@@ -38,7 +38,7 @@ void __iomem *ioremap(unsigned long phys_addr, unsigned long 
size)
return (void __iomem *) (offset + addr);
 }
 
-void __iounmap(const volatile void __iomem *addr)
+void iounmap(const volatile void __iomem *addr)
 {
vunmap((void *) ((unsigned long) addr & PAGE_MASK));
 }
-- 
2.20.1



[PATCH 11/21] asm-generic: don't provide ioremap for CONFIG_MMU

2019-10-17 Thread Christoph Hellwig
All MMU-enabled ports have a non-trivial ioremap and should thus provide
the prototype for their implementation themselves, instead of relying on
a generic one that is provided unless a symbol is defined.  Note that this
only affects sparc32 and nds32, as all others already provide their own
version.

Also update the kerneldoc comments in asm-generic/io.h to explain the
situation around the default ioremap* implementations correctly.
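
The override convention used by these defaults can be sketched in user
space as follows.  An architecture that supplies its own helper defines
the name as a macro expanding to itself, so the "#ifndef" guarded default
is skipped.  The sketch_unmap_calls counter and the identity-mapping
bodies are assumptions for the example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "architecture" part: own iounmap, announced via self-define. */
static int sketch_unmap_calls;

static inline void iounmap(void *addr)
{
	(void)addr;
	sketch_unmap_calls++;
}
#define iounmap iounmap

/* "Generic" part: defaults are used only where nothing was announced. */
#ifndef ioremap
#define ioremap ioremap
static inline void *ioremap(uintptr_t offset, size_t size)
{
	(void)size;
	/* NOMMU-style default: the physical address is used as-is. */
	return (void *)offset;
}
#endif

#ifndef iounmap
#define iounmap iounmap
static inline void iounmap(void *addr)
{
	(void)addr;
}
#endif
```

Here the default ioremap is compiled in, while the default iounmap is
skipped in favour of the version announced earlier.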

Signed-off-by: Christoph Hellwig 
---
 arch/nds32/include/asm/io.h|  2 ++
 arch/sparc/include/asm/io_32.h |  1 +
 include/asm-generic/io.h   | 29 -
 3 files changed, 11 insertions(+), 21 deletions(-)

diff --git a/arch/nds32/include/asm/io.h b/arch/nds32/include/asm/io.h
index 16f262322b8f..fb0e8a24c7af 100644
--- a/arch/nds32/include/asm/io.h
+++ b/arch/nds32/include/asm/io.h
@@ -6,6 +6,7 @@
 
 #include 
 
+void __iomem *ioremap(phys_addr_t phys_addr, size_t size);
 extern void iounmap(volatile void __iomem *addr);
 #define __raw_writeb __raw_writeb
 static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
@@ -80,4 +81,5 @@ static inline u32 __raw_readl(const volatile void __iomem 
*addr)
 #define writew(v,c)({ __iowmb(); writew_relaxed((v),(c)); })
 #define writel(v,c)({ __iowmb(); writel_relaxed((v),(c)); })
 #include 
+
 #endif /* __ASM_NDS32_IO_H */
diff --git a/arch/sparc/include/asm/io_32.h b/arch/sparc/include/asm/io_32.h
index df2dc1784673..9a52d9506f80 100644
--- a/arch/sparc/include/asm/io_32.h
+++ b/arch/sparc/include/asm/io_32.h
@@ -127,6 +127,7 @@ static inline void sbus_memcpy_toio(volatile void __iomem 
*dst,
  * Bus number may be embedded in the higher bits of the physical address.
  * This is why we have no bus number argument to ioremap().
  */
+void __iomem *ioremap(phys_addr_t offset, size_t size);
 void iounmap(volatile void __iomem *addr);
 /* Create a virtual mapping cookie for an IO port range */
 void __iomem *ioport_map(unsigned long port, unsigned int nr);
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index a98ed6325727..6a5edc23afe2 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -922,28 +922,16 @@ static inline void *phys_to_virt(unsigned long address)
 /**
  * DOC: ioremap() and ioremap_*() variants
  *
- * If you have an IOMMU your architecture is expected to have both ioremap()
- * and iounmap() implemented otherwise the asm-generic helpers will provide a
- * direct mapping.
+ * Architectures with an MMU are expected to provide ioremap() and iounmap()
+ * themselves.  For NOMMU architectures we provide a default nop-op
+ * implementation that expect that the physical address used for MMIO are
+ * already marked as uncached, and can be used as kernel virtual addresses.
  *
- * There are ioremap_*() call variants, if you have no IOMMU we naturally will
- * default to direct mapping for all of them, you can override these defaults.
- * If you have an IOMMU you are highly encouraged to provide your own
- * ioremap variant implementation as there currently is no safe architecture
- * agnostic default. To avoid possible improper behaviour default asm-generic
- * ioremap_*() variants all return NULL when an IOMMU is available. If you've
- * defined your own ioremap_*() variant you must then declare your own
- * ioremap_*() variant as defined to itself to avoid the default NULL return.
+ * ioremap_wc() and ioremap_wt() can provide more relaxed caching attributes
+ * for specific drivers if the architecture choses to implement them.  If they
+ * are not implemented we fall back to plain ioremap.
  */
 #ifndef CONFIG_MMU
-
-/*
- * Change "struct page" to physical address.
- *
- * This implementation is for the no-MMU case only... if you have an MMU
- * you'll need to provide your own definitions.
- */
-
 #ifndef ioremap
 #define ioremap ioremap
 static inline void __iomem *ioremap(phys_addr_t offset, size_t size)
@@ -954,14 +942,13 @@ static inline void __iomem *ioremap(phys_addr_t offset, 
size_t size)
 
 #ifndef iounmap
 #define iounmap iounmap
-
 static inline void iounmap(void __iomem *addr)
 {
 }
 #endif
 #endif /* CONFIG_MMU */
+
 #ifndef ioremap_nocache
-void __iomem *ioremap(phys_addr_t phys_addr, size_t size);
 #define ioremap_nocache ioremap_nocache
 static inline void __iomem *ioremap_nocache(phys_addr_t offset, size_t size)
 {
-- 
2.20.1



[PATCH 16/21] sh: remove __iounmap

2019-10-17 Thread Christoph Hellwig
No need to indirect iounmap for sh.

Signed-off-by: Christoph Hellwig 
---
 arch/sh/include/asm/io.h | 9 ++---
 arch/sh/mm/ioremap.c | 4 ++--
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/sh/include/asm/io.h b/arch/sh/include/asm/io.h
index ac0561960c52..1495489225ac 100644
--- a/arch/sh/include/asm/io.h
+++ b/arch/sh/include/asm/io.h
@@ -267,7 +267,7 @@ unsigned long long poke_real_address_q(unsigned long long 
addr,
 #ifdef CONFIG_MMU
 void __iomem *__ioremap_caller(phys_addr_t offset, unsigned long size,
   pgprot_t prot, void *caller);
-void __iounmap(void __iomem *addr);
+void iounmap(void __iomem *addr);
 
 static inline void __iomem *
 __ioremap(phys_addr_t offset, unsigned long size, pgprot_t prot)
@@ -328,7 +328,7 @@ __ioremap_mode(phys_addr_t offset, unsigned long size, 
pgprot_t prot)
 #else
 #define __ioremap(offset, size, prot)  ((void __iomem *)(offset))
 #define __ioremap_mode(offset, size, prot) ((void __iomem *)(offset))
-#define __iounmap(addr)do { } while (0)
+#define iounmap(addr)  do { } while (0)
 #endif /* CONFIG_MMU */
 
 static inline void __iomem *ioremap(phys_addr_t offset, unsigned long size)
@@ -370,11 +370,6 @@ static inline int iounmap_fixed(void __iomem *addr) { 
return -EINVAL; }
 #define ioremap_nocacheioremap
 #define ioremap_uc ioremap
 
-static inline void iounmap(void __iomem *addr)
-{
-   __iounmap(addr);
-}
-
 /*
  * Convert a physical pointer to a virtual kernel pointer for /dev/mem
  * access
diff --git a/arch/sh/mm/ioremap.c b/arch/sh/mm/ioremap.c
index d09ddfe58fd8..f6d02246d665 100644
--- a/arch/sh/mm/ioremap.c
+++ b/arch/sh/mm/ioremap.c
@@ -103,7 +103,7 @@ static inline int iomapping_nontranslatable(unsigned long 
offset)
return 0;
 }
 
-void __iounmap(void __iomem *addr)
+void iounmap(void __iomem *addr)
 {
unsigned long vaddr = (unsigned long __force)addr;
struct vm_struct *p;
@@ -134,4 +134,4 @@ void __iounmap(void __iomem *addr)
 
kfree(p);
 }
-EXPORT_SYMBOL(__iounmap);
+EXPORT_SYMBOL(iounmap);
-- 
2.20.1



[PATCH 21/21] csky: use generic ioremap

2019-10-17 Thread Christoph Hellwig
Use the generic ioremap_prot and iounmap helpers.
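
The ioremap_wc definition below derives its protection value by clearing
the cache-mode field of _PAGE_IOREMAP and setting a new mode.  A minimal
sketch of that mask-and-set idiom follows.  The SK_* names and bit
positions are hypothetical and do not reflect the real csky page table
layout.

```c
#include <stdint.h>

/* Hypothetical bit layout, chosen only to make the example concrete. */
#define SK_PAGE_PRESENT   (UINT32_C(1) << 0)
#define SK_PAGE_WRITE     (UINT32_C(1) << 1)
#define SK_CACHE_MASK     (UINT32_C(7) << 9)
#define SK_CACHE_CACHED   (UINT32_C(3) << 9)
#define SK_CACHE_UNCACHED (UINT32_C(2) << 9)

/* Replace the cache-mode field of prot and leave all other bits untouched. */
static uint32_t sk_set_cache_mode(uint32_t prot, uint32_t mode)
{
	return (prot & ~SK_CACHE_MASK) | (mode & SK_CACHE_MASK);
}
```

Clearing the field first matters: OR-ing a new mode into a value that
still carries the old mode would combine the two bit patterns.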

Signed-off-by: Christoph Hellwig 
---
 arch/csky/Kconfig   |  1 +
 arch/csky/include/asm/io.h  |  8 +++---
 arch/csky/include/asm/pgtable.h |  4 +++
 arch/csky/mm/ioremap.c  | 45 -
 4 files changed, 8 insertions(+), 50 deletions(-)

diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index 3973847b5f42..da09c884cc30 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -17,6 +17,7 @@ config CSKY
select IRQ_DOMAIN
select HANDLE_DOMAIN_IRQ
select DW_APB_TIMER_OF
+   select GENERIC_IOREMAP
select GENERIC_LIB_ASHLDI3
select GENERIC_LIB_ASHRDI3
select GENERIC_LIB_LSHRDI3
diff --git a/arch/csky/include/asm/io.h b/arch/csky/include/asm/io.h
index f572605d5ad5..332f51bc68fb 100644
--- a/arch/csky/include/asm/io.h
+++ b/arch/csky/include/asm/io.h
@@ -36,11 +36,9 @@
 /*
  * I/O memory mapping functions.
  */
-extern void __iomem *__ioremap(phys_addr_t addr, size_t size, pgprot_t prot);
-extern void iounmap(void *addr);
-
-#define ioremap(addr, size)__ioremap((addr), (size), 
pgprot_noncached(PAGE_KERNEL))
-#define ioremap_wc(addr, size) __ioremap((addr), (size), 
pgprot_writecombine(PAGE_KERNEL))
+#define ioremap_wc(addr, size) \
+   ioremap_prot((addr), (size), \
+   (_PAGE_IOREMAP & ~_CACHE_MASK) | _CACHE_UNCACHED)
 
 #include 
 
diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h
index 7c21985c60dc..4b2a41e15f2e 100644
--- a/arch/csky/include/asm/pgtable.h
+++ b/arch/csky/include/asm/pgtable.h
@@ -86,6 +86,10 @@
 #define PAGE_USERIO__pgprot(_PAGE_PRESENT | _PAGE_READ | _PAGE_WRITE | \
_CACHE_CACHED)
 
+#define _PAGE_IOREMAP \
+   (_PAGE_PRESENT | __READABLE | __WRITEABLE | _PAGE_GLOBAL | \
+_CACHE_UNCACHED | _PAGE_SO)
+
 #define __P000 PAGE_NONE
 #define __P001 PAGE_READONLY
 #define __P010 PAGE_COPY
diff --git a/arch/csky/mm/ioremap.c b/arch/csky/mm/ioremap.c
index ae78256a56fd..70c8268d3b2b 100644
--- a/arch/csky/mm/ioremap.c
+++ b/arch/csky/mm/ioremap.c
@@ -3,53 +3,8 @@
 
 #include 
 #include 
-#include 
 #include 
 
-#include 
-
-static void __iomem *__ioremap_caller(phys_addr_t addr, size_t size,
- pgprot_t prot, void *caller)
-{
-   phys_addr_t last_addr;
-   unsigned long offset, vaddr;
-   struct vm_struct *area;
-
-   last_addr = addr + size - 1;
-   if (!size || last_addr < addr)
-   return NULL;
-
-   offset = addr & (~PAGE_MASK);
-   addr &= PAGE_MASK;
-   size = PAGE_ALIGN(size + offset);
-
-   area = get_vm_area_caller(size, VM_IOREMAP, caller);
-   if (!area)
-   return NULL;
-
-   vaddr = (unsigned long)area->addr;
-
-   if (ioremap_page_range(vaddr, vaddr + size, addr, prot)) {
-   free_vm_area(area);
-   return NULL;
-   }
-
-   return (void __iomem *)(vaddr + offset);
-}
-
-void __iomem *__ioremap(phys_addr_t phys_addr, size_t size, pgprot_t prot)
-{
-   return __ioremap_caller(phys_addr, size, prot,
-   __builtin_return_address(0));
-}
-EXPORT_SYMBOL(__ioremap);
-
-void iounmap(void __iomem *addr)
-{
-   vunmap((void *)((unsigned long)addr & PAGE_MASK));
-}
-EXPORT_SYMBOL(iounmap);
-
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
  unsigned long size, pgprot_t vma_prot)
 {
-- 
2.20.1



[PATCH 20/21] csky: remove ioremap_cache

2019-10-17 Thread Christoph Hellwig
No driver that can be used on csky uses ioremap_cache, and this
interface has been deprecated in favor of memremap.

Signed-off-by: Christoph Hellwig 
---
 arch/csky/include/asm/io.h | 2 --
 arch/csky/mm/ioremap.c | 7 ---
 2 files changed, 9 deletions(-)

diff --git a/arch/csky/include/asm/io.h b/arch/csky/include/asm/io.h
index a4b9fb616faa..f572605d5ad5 100644
--- a/arch/csky/include/asm/io.h
+++ b/arch/csky/include/asm/io.h
@@ -36,13 +36,11 @@
 /*
  * I/O memory mapping functions.
  */
-extern void __iomem *ioremap_cache(phys_addr_t addr, size_t size);
 extern void __iomem *__ioremap(phys_addr_t addr, size_t size, pgprot_t prot);
 extern void iounmap(void *addr);
 
 #define ioremap(addr, size)__ioremap((addr), (size), 
pgprot_noncached(PAGE_KERNEL))
 #define ioremap_wc(addr, size) __ioremap((addr), (size), 
pgprot_writecombine(PAGE_KERNEL))
-#define ioremap_cache  ioremap_cache
 
 #include 
 
diff --git a/arch/csky/mm/ioremap.c b/arch/csky/mm/ioremap.c
index e13cd3497628..ae78256a56fd 100644
--- a/arch/csky/mm/ioremap.c
+++ b/arch/csky/mm/ioremap.c
@@ -44,13 +44,6 @@ void __iomem *__ioremap(phys_addr_t phys_addr, size_t size, 
pgprot_t prot)
 }
 EXPORT_SYMBOL(__ioremap);
 
-void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
-{
-   return __ioremap_caller(phys_addr, size, PAGE_KERNEL,
-   __builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap_cache);
-
 void iounmap(void __iomem *addr)
 {
vunmap((void *)((unsigned long)addr & PAGE_MASK));
-- 
2.20.1



[PATCH 19/21] nds32: use generic ioremap

2019-10-17 Thread Christoph Hellwig
Use the generic ioremap_prot and iounmap helpers.

Note that the io.h include in pgtable.h had to be removed to not create
an include loop.  As far as I can tell there was no need for it to
start with.

Signed-off-by: Christoph Hellwig 
---
 arch/nds32/Kconfig   |  1 +
 arch/nds32/include/asm/io.h  |  3 +-
 arch/nds32/include/asm/pgtable.h |  4 ++-
 arch/nds32/mm/Makefile   |  3 +-
 arch/nds32/mm/ioremap.c  | 62 
 5 files changed, 6 insertions(+), 67 deletions(-)
 delete mode 100644 arch/nds32/mm/ioremap.c

diff --git a/arch/nds32/Kconfig b/arch/nds32/Kconfig
index fbd68329737f..12c06a833b7c 100644
--- a/arch/nds32/Kconfig
+++ b/arch/nds32/Kconfig
@@ -20,6 +20,7 @@ config NDS32
select GENERIC_CLOCKEVENTS
select GENERIC_IRQ_CHIP
select GENERIC_IRQ_SHOW
+   select GENERIC_IOREMAP
select GENERIC_LIB_ASHLDI3
select GENERIC_LIB_ASHRDI3
select GENERIC_LIB_CMPDI2
diff --git a/arch/nds32/include/asm/io.h b/arch/nds32/include/asm/io.h
index fb0e8a24c7af..e57378d04006 100644
--- a/arch/nds32/include/asm/io.h
+++ b/arch/nds32/include/asm/io.h
@@ -6,8 +6,6 @@
 
 #include 
 
-void __iomem *ioremap(phys_addr_t phys_addr, size_t size);
-extern void iounmap(volatile void __iomem *addr);
 #define __raw_writeb __raw_writeb
 static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
 {
@@ -80,6 +78,7 @@ static inline u32 __raw_readl(const volatile void __iomem 
*addr)
 #define writeb(v,c)({ __iowmb(); writeb_relaxed((v),(c)); })
 #define writew(v,c)({ __iowmb(); writew_relaxed((v),(c)); })
 #define writel(v,c)({ __iowmb(); writel_relaxed((v),(c)); })
+
 #include 
 
 #endif /* __ASM_NDS32_IO_H */
diff --git a/arch/nds32/include/asm/pgtable.h b/arch/nds32/include/asm/pgtable.h
index 0588ec99725c..6fbf251cfc26 100644
--- a/arch/nds32/include/asm/pgtable.h
+++ b/arch/nds32/include/asm/pgtable.h
@@ -12,7 +12,6 @@
 #include 
 #ifndef __ASSEMBLY__
 #include 
-#include 
 #include 
 #endif
 
@@ -130,6 +129,9 @@ extern void __pgd_error(const char *file, int line, 
unsigned long val);
 #define _PAGE_CACHE_PAGE_C_MEM_WB
 #endif
 
+#define _PAGE_IOREMAP \
+   (_PAGE_V | _PAGE_M_KRW | _PAGE_D | _PAGE_G | _PAGE_C_DEV)
+
 /*
  * + Level 1 descriptor (PMD)
  */
diff --git a/arch/nds32/mm/Makefile b/arch/nds32/mm/Makefile
index bd360e4583b5..897ecaf5cf54 100644
--- a/arch/nds32/mm/Makefile
+++ b/arch/nds32/mm/Makefile
@@ -1,6 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-y  := extable.o tlb.o \
-  fault.o init.o ioremap.o mmap.o \
+obj-y  := extable.o tlb.o fault.o init.o mmap.o \
mm-nds32.o cacheflush.o proc.o
 
 obj-$(CONFIG_ALIGNMENT_TRAP)   += alignment.o
diff --git a/arch/nds32/mm/ioremap.c b/arch/nds32/mm/ioremap.c
deleted file mode 100644
index 690140bb23a2..
--- a/arch/nds32/mm/ioremap.c
+++ /dev/null
@@ -1,62 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-// Copyright (C) 2005-2017 Andes Technology Corporation
-
-#include 
-#include 
-#include 
-#include 
-
-void __iomem *ioremap(phys_addr_t phys_addr, size_t size);
-
-static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
- void *caller)
-{
-   struct vm_struct *area;
-   unsigned long addr, offset, last_addr;
-   pgprot_t prot;
-
-   /* Don't allow wraparound or zero size */
-   last_addr = phys_addr + size - 1;
-   if (!size || last_addr < phys_addr)
-   return NULL;
-
-   /*
-* Mappings have to be page-aligned
-*/
-   offset = phys_addr & ~PAGE_MASK;
-   phys_addr &= PAGE_MASK;
-   size = PAGE_ALIGN(last_addr + 1) - phys_addr;
-
-   /*
-* Ok, go for it..
-*/
-   area = get_vm_area_caller(size, VM_IOREMAP, caller);
-   if (!area)
-   return NULL;
-
-   area->phys_addr = phys_addr;
-   addr = (unsigned long)area->addr;
-   prot = __pgprot(_PAGE_V | _PAGE_M_KRW | _PAGE_D |
-   _PAGE_G | _PAGE_C_DEV);
-   if (ioremap_page_range(addr, addr + size, phys_addr, prot)) {
-   vunmap((void *)addr);
-   return NULL;
-   }
-   return (__force void __iomem *)(offset + (char *)addr);
-
-}
-
-void __iomem *ioremap(phys_addr_t phys_addr, size_t size)
-{
-   return __ioremap_caller(phys_addr, size,
-   __builtin_return_address(0));
-}
-
-EXPORT_SYMBOL(ioremap);
-
-void iounmap(volatile void __iomem * addr)
-{
-   vunmap((void *)(PAGE_MASK & (unsigned long)addr));
-}
-
-EXPORT_SYMBOL(iounmap);
-- 
2.20.1



[PATCH 08/21] x86: clean up ioremap

2019-10-17 Thread Christoph Hellwig
Use ioremap as the main implemented function, and define
ioremap_nocache as a deprecated alias for it.

Signed-off-by: Christoph Hellwig 
---
 arch/x86/include/asm/io.h | 8 ++--
 arch/x86/mm/ioremap.c | 8 
 arch/x86/mm/pageattr.c| 4 ++--
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 6bed97ff6db2..6b5cc41319a7 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -180,8 +180,6 @@ static inline unsigned int isa_virt_to_bus(volatile void 
*address)
  * The default ioremap() behavior is non-cached; if you need something
  * else, you probably want one of the following.
  */
-extern void __iomem *ioremap_nocache(resource_size_t offset, unsigned long 
size);
-#define ioremap_nocache ioremap_nocache
 extern void __iomem *ioremap_uc(resource_size_t offset, unsigned long size);
 #define ioremap_uc ioremap_uc
 extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
@@ -205,11 +203,9 @@ extern void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long
  * If the area you are trying to map is a PCI BAR you should have a
  * look at pci_iomap().
  */
-static inline void __iomem *ioremap(resource_size_t offset, unsigned long size)
-{
-   return ioremap_nocache(offset, size);
-}
+void __iomem *ioremap(resource_size_t offset, unsigned long size);
 #define ioremap ioremap
+#define ioremap_nocache ioremap
 
 extern void iounmap(volatile void __iomem *addr);
 #define iounmap iounmap
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index a39dcdb5ae34..7985233dfb8d 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -280,11 +280,11 @@ __ioremap_caller(resource_size_t phys_addr, unsigned long size,
 }
 
 /**
- * ioremap_nocache -   map bus memory into CPU space
+ * ioremap -   map bus memory into CPU space
  * @phys_addr:bus address of the memory
  * @size:  size of the resource to map
  *
- * ioremap_nocache performs a platform specific sequence of operations to
+ * ioremap performs a platform specific sequence of operations to
  * make bus memory CPU accessible via the readb/readw/readl/writeb/
  * writew/writel functions and the other mmio helpers. The returned
  * address is not guaranteed to be usable directly as a virtual
@@ -300,7 +300,7 @@ __ioremap_caller(resource_size_t phys_addr, unsigned long size,
  *
  * Must be freed with iounmap.
  */
-void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
+void __iomem *ioremap(resource_size_t phys_addr, unsigned long size)
 {
/*
 * Ideally, this should be:
@@ -315,7 +315,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
return __ioremap_caller(phys_addr, size, pcm,
__builtin_return_address(0), false);
 }
-EXPORT_SYMBOL(ioremap_nocache);
+EXPORT_SYMBOL(ioremap);
 
 /**
  * ioremap_uc -   map bus memory into CPU space as strongly uncachable
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 0d09cc5aad61..1b99ad05b117 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1784,7 +1784,7 @@ static inline int cpa_clear_pages_array(struct page **pages, int numpages,
 int _set_memory_uc(unsigned long addr, int numpages)
 {
/*
-* for now UC MINUS. see comments in ioremap_nocache()
+* for now UC MINUS. see comments in ioremap()
 * If you really need strong UC use ioremap_uc(), but note
 * that you cannot override IO areas with set_memory_*() as
 * these helpers cannot work with IO memory.
@@ -1799,7 +1799,7 @@ int set_memory_uc(unsigned long addr, int numpages)
int ret;
 
/*
-* for now UC MINUS. see comments in ioremap_nocache()
+* for now UC MINUS. see comments in ioremap()
 */
ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
  _PAGE_CACHE_MODE_UC_MINUS, NULL);
-- 
2.20.1



[PATCH 04/21] hexagon: clean up ioremap

2019-10-17 Thread Christoph Hellwig
Use ioremap as the main implemented function, and define
ioremap_nocache as a deprecated alias for it.

Signed-off-by: Christoph Hellwig 
---
 arch/hexagon/include/asm/io.h   | 11 ++-
 arch/hexagon/kernel/hexagon_ksyms.c |  2 +-
 arch/hexagon/mm/ioremap.c   |  2 +-
 3 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/hexagon/include/asm/io.h b/arch/hexagon/include/asm/io.h
index ba1a444d55b3..89537dc1cf97 100644
--- a/arch/hexagon/include/asm/io.h
+++ b/arch/hexagon/include/asm/io.h
@@ -171,16 +171,9 @@ static inline void writel(u32 data, volatile void __iomem *addr)
 #define writew_relaxed __raw_writew
 #define writel_relaxed __raw_writel
 
-/*
- * Need an mtype somewhere in here, for cache type deals?
- * This is probably too long for an inline.
- */
-void __iomem *ioremap_nocache(unsigned long phys_addr, unsigned long size);
+void __iomem *ioremap(unsigned long phys_addr, unsigned long size);
+#define ioremap_nocache ioremap
 
-static inline void __iomem *ioremap(unsigned long phys_addr, unsigned long size)
-{
-   return ioremap_nocache(phys_addr, size);
-}
 
 static inline void iounmap(volatile void __iomem *addr)
 {
diff --git a/arch/hexagon/kernel/hexagon_ksyms.c b/arch/hexagon/kernel/hexagon_ksyms.c
index cf8974beb500..b3dbb472572e 100644
--- a/arch/hexagon/kernel/hexagon_ksyms.c
+++ b/arch/hexagon/kernel/hexagon_ksyms.c
@@ -20,7 +20,7 @@ EXPORT_SYMBOL(__vmgetie);
 EXPORT_SYMBOL(__vmsetie);
 EXPORT_SYMBOL(__vmyield);
 EXPORT_SYMBOL(empty_zero_page);
-EXPORT_SYMBOL(ioremap_nocache);
+EXPORT_SYMBOL(ioremap);
 EXPORT_SYMBOL(memcpy);
 EXPORT_SYMBOL(memset);
 
diff --git a/arch/hexagon/mm/ioremap.c b/arch/hexagon/mm/ioremap.c
index 77d8e1e69e9b..b103d83b5fbb 100644
--- a/arch/hexagon/mm/ioremap.c
+++ b/arch/hexagon/mm/ioremap.c
@@ -9,7 +9,7 @@
 #include 
 #include 
 
-void __iomem *ioremap_nocache(unsigned long phys_addr, unsigned long size)
+void __iomem *ioremap(unsigned long phys_addr, unsigned long size)
 {
unsigned long last_addr, addr;
unsigned long offset = phys_addr & ~PAGE_MASK;
-- 
2.20.1


