Re: [PATCH] KVM: PPC: Book3S HV: Treat unrecognized TM instructions as illegal

2020-02-16 Thread Segher Boessenkool
On Mon, Feb 17, 2020 at 05:23:07PM +1100, Michael Neuling wrote:
> > > Hence, we should NOP this, not generate an illegal.
> > 
> > It is not a reserved bit.
> > 
> > The IMC entry for it matches op1=01 op2=101110 presumably, which
> > catches all TM instructions and nothing else (bits 0..5 and bits 21..30).
> > That does not look at bit 31; the softpatch handler has to deal with this.
> > 
> > Some TM insns have bit 31 as 1 and some have it as /.  All instructions
> > with a "." in the mnemonic have bit 31 as 1; all others have it reserved.
> > The tables in appendices D, E, F show tend. and tsr. as having it
> > reserved, which contradicts the individual instruction description (and
> > does not make much sense).  (Only tcheck has /, everything else has 1;
> > everything else has a mnemonic with a dot, and always writes CR0.)
> 
> Wow, interesting. 
> 
> P8 seems to be treating bit 31 as a reserved bit (with the table definition rather
> than the individual instruction description). I'm inclined to match P8 even
> though it's inconsistent with the dot mnemonic as you say.

"The POWER8 core ignores the state of reserved bits in the instructions
(denoted by “///” in the instruction definition) and executes the
instruction normally. Software should set these bits to ‘0’ per the
Power ISA." (p8 UM, 3.1.1.3; same in the p9 UM).


Segher


Re: [PATCH 2/5] mm/vma: Make vma_is_accessible() available for general use

2020-02-16 Thread Geert Uytterhoeven
On Mon, Feb 17, 2020 at 6:04 AM Anshuman Khandual
 wrote:
> Let's move the vma_is_accessible() helper to include/linux/mm.h, which makes it
> available for general use. While here, this replaces all remaining open
> encodings for the VMA access check with vma_is_accessible().
>
> Cc: Guo Ren 
> Cc: Geert Uytterhoeven 
> Cc: Ralf Baechle 
> Cc: Paul Burton 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: Yoshinori Sato 
> Cc: Rich Felker 
> Cc: Dave Hansen 
> Cc: Andy Lutomirski 
> Cc: Peter Zijlstra 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Andrew Morton 
> Cc: Steven Rostedt 
> Cc: Mel Gorman 
> Cc: linux-ker...@vger.kernel.org
> Cc: linux-m...@lists.linux-m68k.org
> Cc: linux-m...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux...@vger.kernel.org
> Cc: linux...@kvack.org
> Signed-off-by: Anshuman Khandual 

>  arch/m68k/mm/fault.c| 2 +-

For m68k:
Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [RFC PATCH v2 00/12] Reduce ifdef mess in ptrace

2020-02-16 Thread Christophe Leroy

Hi Mikey,

On 28/06/2019 at 17:47, Christophe Leroy wrote:

The purpose of this series is to reduce the amount of #ifdefs
in ptrace.c
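
The typical pattern behind such a split (a hypothetical sketch, not code
taken from the series itself): each feature gets its own compilation unit,
and a shared declaration header provides static inline stubs so that
ptrace.c itself needs no #ifdef:

    /* hypothetical excerpt of a ptrace-decl.h style header */
    #ifdef CONFIG_SPE
    int evr_active(struct task_struct *target, const struct user_regset *regset);
    #else
    static inline int evr_active(struct task_struct *target,
                                 const struct user_regset *regset)
    {
            return 0;
    }
    #endif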



Any feedback on this series, which aims at fixing the issue you opened at 
https://github.com/linuxppc/issues/issues/128?


Thanks
Christophe


This is a first try. Most of it is done; there are still some #ifdefs that
could go away.

Please comment and tell whether it is worth continuing in that direction.

v2:
- Fixed several build failures. Now builds cleanly on kisskb, see 
http://kisskb.ellerman.id.au/kisskb/head/840e53cf913d6096dd60181a085f102c85d6e526/
- Dropped the last patch, which is not related to ptrace and can be applied 
independently.

Christophe Leroy (12):
   powerpc: move ptrace into a subdirectory.
   powerpc/ptrace: drop unnecessary #ifdefs CONFIG_PPC64
   powerpc/ptrace: drop PARAMETER_SAVE_AREA_OFFSET
   powerpc/ptrace: split out VSX related functions.
   powerpc/ptrace: split out ALTIVEC related functions.
   powerpc/ptrace: split out SPE related functions.
   powerpc/ptrace: split out TRANSACTIONAL_MEM related functions.
   powerpc/ptrace: move register viewing functions out of ptrace.c
   powerpc/ptrace: split out ADV_DEBUG_REGS related functions.
   powerpc/ptrace: create ptrace_get_debugreg()
   powerpc/ptrace: create ppc_gethwdinfo()
   powerpc/ptrace: move ptrace_triggered() into hw_breakpoint.c

  arch/powerpc/include/asm/ptrace.h   |9 +-
  arch/powerpc/include/uapi/asm/ptrace.h  |   12 +-
  arch/powerpc/kernel/Makefile|7 +-
  arch/powerpc/kernel/hw_breakpoint.c |   16 +
  arch/powerpc/kernel/ptrace.c| 3402 ---
  arch/powerpc/kernel/ptrace/Makefile |   20 +
  arch/powerpc/kernel/ptrace/ptrace-adv.c |  511 
  arch/powerpc/kernel/ptrace/ptrace-altivec.c |  151 ++
  arch/powerpc/kernel/ptrace/ptrace-decl.h|  184 ++
  arch/powerpc/kernel/ptrace/ptrace-noadv.c   |  291 +++
  arch/powerpc/kernel/ptrace/ptrace-novsx.c   |   83 +
  arch/powerpc/kernel/ptrace/ptrace-spe.c |   92 +
  arch/powerpc/kernel/ptrace/ptrace-tm.c  |  879 +++
  arch/powerpc/kernel/ptrace/ptrace-view.c|  953 
  arch/powerpc/kernel/ptrace/ptrace-vsx.c |  177 ++
  arch/powerpc/kernel/ptrace/ptrace.c |  430 
  arch/powerpc/kernel/{ => ptrace}/ptrace32.c |0
  17 files changed, 3798 insertions(+), 3419 deletions(-)
  delete mode 100644 arch/powerpc/kernel/ptrace.c
  create mode 100644 arch/powerpc/kernel/ptrace/Makefile
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-adv.c
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-altivec.c
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-decl.h
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-noadv.c
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-novsx.c
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-spe.c
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-tm.c
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-view.c
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace-vsx.c
  create mode 100644 arch/powerpc/kernel/ptrace/ptrace.c
  rename arch/powerpc/kernel/{ => ptrace}/ptrace32.c (100%)



Re: [PATCH v7 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support

2020-02-16 Thread Christophe Leroy




On 17/02/2020 at 00:08, Michael Neuling wrote:

Daniel.

Can you start this commit message with a simple description of what you are
actually doing? This reads like you've been on a long journey to Mordor and
back, which, as a reader of this patch in the distant future, I don't care
about. I just want to know what you're implementing.

Also I'm struggling to review this as I don't know what software or hardware
mechanisms you are using to perform sanitisation.


KASAN is standard: it simply uses GCC's ASAN in kernel mode, i.e. the kernel 
is built with -fsanitize=kernel-address 
(https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html)


You have more details there: 
https://www.kernel.org/doc/html/latest/dev-tools/kasan.html
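
For illustration, a minimal sketch of the kind of bug this instrumentation
catches (a hypothetical kernel snippet, assuming CONFIG_KASAN=y; KASAN
prints a report instead of allowing silent corruption):

    #include <linux/slab.h>

    static void kasan_demo(void)
    {
            char *p = kmalloc(8, GFP_KERNEL);

            if (!p)
                    return;
            p[8] = 'x';     /* slab-out-of-bounds write, reported by KASAN */
            kfree(p);
            p[0] = 'y';     /* use-after-free, also reported */
    }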


Christophe


Re: [PATCH] powerpc/chrp: Fix enter_rtas() with CONFIG_VMAP_STACK

2020-02-16 Thread Christophe Leroy




On 16/02/2020 at 23:40, Michael Neuling wrote:

On Fri, 2020-02-14 at 08:33 +, Christophe Leroy wrote:

With CONFIG_VMAP_STACK, data MMU has to be enabled
to read data on the stack.


Can you describe what goes wrong without this? Some oops message? rtas blows up?
Get corrupt data?


Larry reported a machine check. Or in fact, he reported an Oops in 
kprobe_handler(), that Oops being a bug in kprobe_handler() triggered by 
this machine check.


By converting a VM address to a phys-like address as if it was linear 
memory, you are in the dark. Either there is some physical memory at that 
address and you corrupt it, or there is none and you get a machine check.
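
In other words, a minimal sketch of the broken arithmetic (assuming the
usual ppc32 linear-map convention, where tophys() amounts to subtracting
PAGE_OFFSET):

    /* only valid for linear-map addresses, where virt == phys + PAGE_OFFSET;
     * a CONFIG_VMAP_STACK stack lives in vmalloc space, so this yields an
     * unrelated "physical" address */
    static unsigned long linear_tophys(unsigned long virt)
    {
            return virt - PAGE_OFFSET;
    }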




Also can you say what you're actually doing (ie turning on MSR[DR])


Er ... I'm saying that the data MMU has to be enabled, so I'm enabling it.





Signed-off-by: Christophe Leroy 
---
  arch/powerpc/kernel/entry_32.S | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 0713daa651d9..bc056d906b51 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -1354,12 +1354,17 @@ _GLOBAL(enter_rtas)
mtspr   SPRN_SRR0,r8
mtspr   SPRN_SRR1,r9
RFI
-1: tophys(r9,r1)
+1: tophys_novmstack r9, r1
+#ifdef CONFIG_VMAP_STACK
+   li  r0, MSR_KERNEL & ~MSR_IR/* can take DTLB miss */


You're potentially turning on more than MSR DR here. This should be clear in the
commit message.


Am I ?

At the time of the RFI just above, SRR1 contains the value of r9, which 
was set two lines earlier to MSR_KERNEL & ~(MSR_IR|MSR_DR).


What should be clear in the commit message ?




+   mtmsr   r0
+   isync
+#endif
lwz r8,INT_FRAME_SIZE+4(r9) /* get return address */
lwz r9,8(r9)/* original msr value */
addir1,r1,INT_FRAME_SIZE
li  r0,0
-   tophys(r7, r2)
+   tophys_novmstack r7, r2
stw r0, THREAD + RTAS_SP(r7)
mtspr   SPRN_SRR0,r8
mtspr   SPRN_SRR1,r9


Christophe


Re: [PATCH 1/1] powerpc/cputable: Remove unnecessary copy of cpu_spec->oprofile_type

2020-02-16 Thread Christophe Leroy




On 15/02/2020 at 06:36, Leonardo Bras wrote:

Before checking for cpu_type == NULL, this same copy happens, so doing
it here will just write the same value to t->oprofile_type
again.

Remove the repeated copy, as it is unnecessary.

Signed-off-by: Leonardo Bras 
---
  arch/powerpc/kernel/cputable.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index e745abc5457a..5a87ec96582f 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -2197,7 +2197,6 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
 */
if (old.oprofile_cpu_type != NULL) {
t->oprofile_cpu_type = old.oprofile_cpu_type;
-   t->oprofile_type = old.oprofile_type;
}


Since the action is reduced to a single line, the { } should be removed as well.
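
That is, the hunk would end up as (a sketch of the suggested result):

    if (old.oprofile_cpu_type != NULL)
            t->oprofile_cpu_type = old.oprofile_cpu_type;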

Christophe


Re: [PATCH v2 00/13] mm: remove __ARCH_HAS_5LEVEL_HACK

2020-02-16 Thread Mike Rapoport
On Sun, Feb 16, 2020 at 08:22:30AM +, Russell King - ARM Linux admin wrote:
> On Sun, Feb 16, 2020 at 10:18:30AM +0200, Mike Rapoport wrote:
> > From: Mike Rapoport 
> > 
> > Hi,
> > 
> > These patches convert several architectures to use page table folding and
> > remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.
> > 
> > The changes are mostly about mechanical replacement of pgd accessors with 
> > p4d
> > ones and the addition of higher levels to page table traversals.
> > 
> > All the patches were sent separately to the respective arch lists and
> > maintainers hence the "v2" prefix.
> 
> You fail to explain why this change, which adds 488 additional lines of
> code, is desirable.

Right, I should have been more explicit about it.

As Christophe mentioned in his reply, removing 'HACK' and 'fixup' is an
improvement.
Another thing is that when all architectures behave the same, it opens
opportunities for cleaning up repeated definitions of page table
manipulation primitives.
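
For example, with the generic pgtable-nop4d.h style of folding, the extra
level costs nothing on architectures that do not use it. A minimal sketch
of the folded accessor (the actual generic header may differ in detail):

    static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
    {
            return (p4d_t *)pgd;    /* folded: the p4d entry is the pgd entry */
    }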

 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
> According to speedtest.net: 11.9Mbps down 500kbps up

-- 
Sincerely yours,
Mike.


Re: [PATCH] KVM: PPC: Book3S HV: Treat unrecognized TM instructions as illegal

2020-02-16 Thread Michael Neuling
On Sun, 2020-02-16 at 23:57 -0600, Segher Boessenkool wrote:
> On Mon, Feb 17, 2020 at 12:07:31PM +1100, Michael Neuling wrote:
> > On Thu, 2020-02-13 at 10:15 -0500, Gustavo Romero wrote:
> > > On P9 DD2.2, due to a CPU defect, some TM instructions need to be
> > > emulated by KVM. This is handled at first by the hardware raising a
> > > softpatch interrupt when certain TM instructions that need KVM
> > > assistance are executed in the guest. Some TM instructions, although
> > > not defined in the Power ISA, might raise a softpatch interrupt. For
> > > instance, the 'tresume.' instruction as defined in the ISA must have
> > > bit 31 set (1), but an instruction that matches the 'tresume.' OP and
> > > XO opcodes yet has bit 31 not set (0), like 0x7cfe9ddc, also raises a
> > > softpatch interrupt. For example, if code like the following is
> > > executed in the guest, it will raise a softpatch interrupt just like a
> > > 'tresume.' when the TM facility is enabled:
> > > 
> > > int main() { asm("tabort. 0; .long 0x7cfe9ddc;"); }
> > > and then treats the executed instruction as 'nop' whilst it should
> > > actually
> > > be treated as an illegal instruction since it's not defined by the ISA.
> > 
> > The ISA has this: 
> > 
> >1.3.3 Reserved Fields, Reserved Values, and Reserved SPRs
> > 
> >    Reserved fields in instructions are ignored by the processor.
> > 
> > Hence the hardware will ignore reserved bits. For example executing your
> > little program on P8 just exits normally with 0x7cfe9ddc being executed as a NOP.
> > 
> > Hence, we should NOP this, not generate an illegal.
> 
> It is not a reserved bit.
> 
> The IMC entry for it matches op1=01 op2=101110 presumably, which
> catches all TM instructions and nothing else (bits 0..5 and bits 21..30).
> That does not look at bit 31; the softpatch handler has to deal with this.
> 
> Some TM insns have bit 31 as 1 and some have it as /.  All instructions
> with a "." in the mnemonic have bit 31 as 1; all others have it reserved.
> The tables in appendices D, E, F show tend. and tsr. as having it
> reserved, which contradicts the individual instruction description (and
> does not make much sense).  (Only tcheck has /, everything else has 1;
> everything else has a mnemonic with a dot, and always writes CR0.)

Wow, interesting. 

P8 seems to be treating bit 31 as a reserved bit (with the table definition rather
than the individual instruction description). I'm inclined to match P8 even
though it's inconsistent with the dot mnemonic as you say.

Mikey


Re: [PATCH] KVM: PPC: Book3S HV: Treat unrecognized TM instructions as illegal

2020-02-16 Thread Segher Boessenkool
On Mon, Feb 17, 2020 at 12:07:31PM +1100, Michael Neuling wrote:
> On Thu, 2020-02-13 at 10:15 -0500, Gustavo Romero wrote:
> > On P9 DD2.2, due to a CPU defect, some TM instructions need to be emulated
> > by KVM. This is handled at first by the hardware raising a softpatch
> > interrupt when certain TM instructions that need KVM assistance are
> > executed in the guest. Some TM instructions, although not defined in the
> > Power ISA, might raise a softpatch interrupt. For instance, the 'tresume.'
> > instruction as defined in the ISA must have bit 31 set (1), but an
> > instruction that matches the 'tresume.' OP and XO opcodes yet has bit 31
> > not set (0), like 0x7cfe9ddc, also raises a softpatch interrupt. For
> > example, if code like the following is executed in the guest, it will
> > raise a softpatch interrupt just like a 'tresume.' when the TM facility is
> > enabled:
> > 
> > int main() { asm("tabort. 0; .long 0x7cfe9ddc;"); }

> > and then treats the executed instruction as 'nop' whilst it should actually
> > be treated as an illegal instruction since it's not defined by the ISA.
> 
> The ISA has this: 
> 
>1.3.3 Reserved Fields, Reserved Values, and Reserved SPRs
> 
>    Reserved fields in instructions are ignored by the processor.
> 
> Hence the hardware will ignore reserved bits. For example executing your
> little program on P8 just exits normally with 0x7cfe9ddc being executed as a NOP.
> 
> Hence, we should NOP this, not generate an illegal.

It is not a reserved bit.

The IMC entry for it matches op1=01 op2=101110 presumably, which
catches all TM instructions and nothing else (bits 0..5 and bits 21..30).
That does not look at bit 31; the softpatch handler has to deal with this.

Some TM insns have bit 31 as 1 and some have it as /.  All instructions
with a "." in the mnemonic have bit 31 as 1; all others have it reserved.
The tables in appendices D, E, F show tend. and tsr. as having it
reserved, which contradicts the individual instruction description (and
does not make much sense).  (Only tcheck has /, everything else has 1;
everything else has a mnemonic with a dot, and always writes CR0.)
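
For illustration, a minimal userspace sketch decoding the example word
0x7cfe9ddc into the fields discussed above (plain C, not kernel code;
bit numbers follow the ISA's big-endian convention, bit 0 being the MSB):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            uint32_t insn = 0x7cfe9ddc;
            unsigned int op = insn >> 26;          /* primary opcode, bits 0..5 */
            unsigned int xo = (insn >> 1) & 0x3ff; /* extended opcode, bits 21..30 */
            unsigned int rc = insn & 1;            /* bit 31 (Rc) */

            /* prints op=31 xo=750 rc=0: the tsr. ('tresume.') opcodes with
             * bit 31 clear, i.e. the ambiguous case discussed above */
            printf("op=%u xo=%u rc=%u\n", op, xo, rc);
            return 0;
    }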


Segher


[PATCH 3/5] mm/vma: Replace all remaining open encodings with is_vm_hugetlb_page()

2020-02-16 Thread Anshuman Khandual
This replaces all remaining open encodings with is_vm_hugetlb_page().
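
For reference, the helper is essentially a wrapper around the same flag
test being replaced (a sketch; the actual definition lives in
include/linux/hugetlb_inline.h and compiles to false when
CONFIG_HUGETLB_PAGE is off):

    static inline bool is_vm_hugetlb_page(struct vm_area_struct *vma)
    {
            return !!(vma->vm_flags & VM_HUGETLB);
    }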

Cc: Paul Mackerras 
Cc: Benjamin Herrenschmidt 
Cc: Michael Ellerman 
Cc: Alexander Viro 
Cc: Will Deacon 
Cc: "Aneesh Kumar K.V" 
Cc: Andrew Morton 
Cc: Nick Piggin 
Cc: Peter Zijlstra 
Cc: Arnd Bergmann 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: kvm-...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux...@kvack.org
Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/kvm/e500_mmu_host.c | 2 +-
 fs/binfmt_elf.c  | 2 +-
 include/asm-generic/tlb.h| 2 +-
 kernel/events/core.c | 3 ++-
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 425d13806645..3922575a1c31 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -422,7 +422,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
break;
}
} else if (vma && hva >= vma->vm_start &&
-  (vma->vm_flags & VM_HUGETLB)) {
+  (is_vm_hugetlb_page(vma))) {
unsigned long psize = vma_kernel_pagesize(vma);
 
tsize = (gtlbe->mas1 & MAS1_TSIZE_MASK) >>
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index f4713ea76e82..6bc97ede10ba 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1317,7 +1317,7 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
}
 
/* Hugetlb memory check */
-   if (vma->vm_flags & VM_HUGETLB) {
+   if (is_vm_hugetlb_page(vma)) {
if ((vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_SHARED))
goto whole;
if (!(vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_PRIVATE))
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index f391f6b500b4..d42c236d4965 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -398,7 +398,7 @@ tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma)
 * We rely on tlb_end_vma() to issue a flush, such that when we reset
 * these values the batch is empty.
 */
-   tlb->vma_huge = !!(vma->vm_flags & VM_HUGETLB);
+   tlb->vma_huge = is_vm_hugetlb_page(vma);
tlb->vma_exec = !!(vma->vm_flags & VM_EXEC);
 }
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e453589da97c..eb0ee3c5f322 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -7693,7 +7694,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
flags |= MAP_EXECUTABLE;
if (vma->vm_flags & VM_LOCKED)
flags |= MAP_LOCKED;
-   if (vma->vm_flags & VM_HUGETLB)
+   if (is_vm_hugetlb_page(vma))
flags |= MAP_HUGETLB;
 
if (file) {
-- 
2.20.1



[PATCH 2/5] mm/vma: Make vma_is_accessible() available for general use

2020-02-16 Thread Anshuman Khandual
Let's move the vma_is_accessible() helper to include/linux/mm.h, which makes it
available for general use. While here, this replaces all remaining open
encodings for the VMA access check with vma_is_accessible().

Cc: Guo Ren 
Cc: Geert Uytterhoeven 
Cc: Paul Burton 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Andrew Morton 
Cc: Steven Rostedt 
Cc: Mel Gorman 
Cc: linux-ker...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@vger.kernel.org
Cc: linux...@kvack.org
Signed-off-by: Anshuman Khandual 
---
 arch/csky/mm/fault.c| 2 +-
 arch/m68k/mm/fault.c| 2 +-
 arch/mips/mm/fault.c| 2 +-
 arch/powerpc/mm/fault.c | 2 +-
 arch/sh/mm/fault.c  | 2 +-
 arch/x86/mm/fault.c | 2 +-
 include/linux/mm.h  | 5 +
 kernel/sched/fair.c | 2 +-
 mm/gup.c| 2 +-
 mm/memory.c | 5 -
 mm/mempolicy.c  | 3 +--
 11 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/csky/mm/fault.c b/arch/csky/mm/fault.c
index f76618b630f9..4b3511b8298d 100644
--- a/arch/csky/mm/fault.c
+++ b/arch/csky/mm/fault.c
@@ -137,7 +137,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
if (!(vma->vm_flags & VM_WRITE))
goto bad_area;
} else {
-   if (!(vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)))
+   if (!vma_is_accessible(vma))
goto bad_area;
}
 
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index e9b1d7585b43..d5131ec5d923 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -125,7 +125,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
case 1: /* read, present */
goto acc_err;
case 0: /* read, not present */
-   if (!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)))
+   if (!vma_is_accessible(vma))
goto acc_err;
}
 
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 1e8d00793784..5b9f947bfa32 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -142,7 +142,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
goto bad_area;
}
} else {
-   if (!(vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)))
+   if (!vma_is_accessible(vma))
goto bad_area;
}
}
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 8db0507619e2..71a3658c516b 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -314,7 +314,7 @@ static bool access_error(bool is_write, bool is_exec,
return false;
}
 
-   if (unlikely(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE))))
+   if (unlikely(!vma_is_accessible(vma)))
return true;
/*
 * We should ideally do the vma pkey access check here. But in the
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 5f51456f4fc7..a8c4253f37d7 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -355,7 +355,7 @@ static inline int access_error(int error_code, struct vm_area_struct *vma)
return 1;
 
/* read, not present: */
-   if (unlikely(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE))))
+   if (unlikely(!vma_is_accessible(vma)))
return 1;
 
return 0;
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fa4ea09593ab..c461eaab0306 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1200,7 +1200,7 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
return 1;
 
/* read, not present: */
-   if (unlikely(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE))))
+   if (unlikely(!vma_is_accessible(vma)))
return 1;
 
return 0;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 52269e56c514..b0e53ef13ff1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -541,6 +541,11 @@ static inline bool vma_is_anonymous(struct vm_area_struct *vma)
return !vma->vm_ops;
 }
 
+static inline bool vma_is_accessible(struct vm_area_struct *vma)
+{
+   return vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+}
+
 #ifdef CONFIG_SHMEM
 /*
  * The vma_is_shmem is not inline because it is used only by slow
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fe4e0d775375..6ce54d57dd09 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2573,7 +2573,7 @@ static void task_numa_work(struct callback_head *work)
 * Skip inaccessible 

[PATCH 0/5] mm/vma: Use available wrappers when possible

2020-02-16 Thread Anshuman Khandual
Apart from adding a readable VMA flag name for tracing purposes, this series
does some open encoding replacements with available VMA-specific wrappers.
This skips the VM_HUGETLB check in vma_migratable() as it's already being done
by another patch (https://patchwork.kernel.org/patch/11347831/) which
is yet to be merged.

This series applies on 5.6-rc2. This has been build tested on multiple
platforms, though boot and runtime testing was limited to arm64 and x86.

Cc: linux-ker...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@vger.kernel.org
Cc: kvm-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux...@kvack.org

Anshuman Khandual (5):
  mm/vma: Add missing VMA flag readable name for VM_SYNC
  mm/vma: Make vma_is_accessible() available for general use
  mm/vma: Replace all remaining open encodings with is_vm_hugetlb_page()
  mm/vma: Replace all remaining open encodings with vma_set_anonymous()
  mm/vma: Replace all remaining open encodings with vma_is_anonymous()

 arch/csky/mm/fault.c  | 2 +-
 arch/m68k/mm/fault.c  | 2 +-
 arch/mips/mm/fault.c  | 2 +-
 arch/powerpc/kvm/e500_mmu_host.c  | 2 +-
 arch/powerpc/mm/fault.c   | 2 +-
 arch/sh/mm/fault.c| 2 +-
 arch/x86/mm/fault.c   | 2 +-
 drivers/misc/mic/scif/scif_mmap.c | 2 +-
 fs/binfmt_elf.c   | 2 +-
 include/asm-generic/tlb.h | 2 +-
 include/linux/mm.h| 5 +
 include/trace/events/mmflags.h| 1 +
 kernel/events/core.c  | 3 ++-
 kernel/sched/fair.c   | 2 +-
 mm/gup.c  | 5 +++--
 mm/memory.c   | 5 -
 mm/mempolicy.c| 3 +--
 17 files changed, 23 insertions(+), 21 deletions(-)

-- 
2.20.1



[PATCH] powerpc/xmon: Fix whitespace handling in getstring()

2020-02-16 Thread Oliver O'Halloran
The ls (lookup symbol) and zr (reboot) commands use xmon's getstring()
helper to read a string argument from the xmon prompt. This function skips
over leading whitespace, but doesn't check if the first "non-whitespace"
character is a newline, which causes some odd behaviour (<enter> indicates
the enter key was pressed):

0:mon> ls printk<enter>
printk: c01680c4

0:mon> ls<enter>
printk<enter>
Symbol '
printk' not found.
0:mon>

With commit 2d9b332d99b ("powerpc/xmon: Allow passing an argument
to ppc_md.restart()") we have a similar problem with the zr command.
Previously zr took no arguments so "zr<enter>" would trigger a reboot.
With that patch applied a second newline needs to be sent in order for
the reboot to occur. Fix this by checking if the leading whitespace
ended on a newline:

0:mon> ls<enter>
Symbol '' not found.

Fixes: 2d9b332d99b ("powerpc/xmon: Allow passing an argument to ppc_md.restart()")
Reported-by: Michael Ellerman 
Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/xmon/xmon.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index e8c84d26..0ec9640 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -3435,6 +3435,11 @@ getstring(char *s, int size)
int c;
 
c = skipbl();
+   if (c == '\n') {
+   *s = 0;
+   return;
+   }
+
do {
if( size > 1 ){
*s++ = c;
-- 
2.9.5



[PATCH V14] mm/debug: Add tests validating architecture page table helpers

2020-02-16 Thread Anshuman Khandual
This adds tests which will validate architecture page table helpers and
other accessors in their compliance with expected generic MM semantics.
This will help various architectures in validating changes to existing
page table helpers or addition of new ones.

This test covers basic page table entry transformations including but not
limited to old, young, dirty, clean, write, write protect etc. at various
levels, along with populating intermediate entries with the next page table
page and validating them.

Test page table pages are allocated from system memory with required size
and alignments. The mapped pfns at page table levels are derived from a
real pfn representing a valid kernel text symbol. This test gets called
inside kernel_init() right after async_synchronize_full().

This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any
architecture which is willing to subscribe to this test will need to select
ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390
and ppc32 platforms where the test is known to build and run successfully.
Going forward, other architectures too can subscribe to the test after fixing
any build or runtime problems with their page table helpers. Meanwhile, for
better platform coverage, the test can also be enabled with CONFIG_EXPERT
even without ARCH_HAS_DEBUG_VM_PGTABLE.

Folks interested in making sure that a given platform's page table helpers
conform to expected generic MM semantics should enable the above config,
which will just trigger this test during boot. Any non-conformity here will
be reported as a warning which would need to be fixed. This test will help
catch any changes to the agreed-upon semantics expected from generic MM and
enable platforms to accommodate them thereafter.
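
As a rough illustration of what one of these transformation checks looks
like (a sketch only; the names and exact checks in the patch may differ):

    static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
    {
            pte_t pte = pfn_pte(pfn, prot);

            /* each helper pair must round-trip its bit at PTE level */
            WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
            WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
            WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte))));
            WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
            WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
            WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
    }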

Cc: Andrew Morton 
Cc: Mike Rapoport 
Cc: Vineet Gupta 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Kirill A. Shutemov 
Cc: Paul Walmsley 
Cc: Palmer Dabbelt 
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux-ri...@lists.infradead.org
Cc: x...@kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org

Suggested-by: Catalin Marinas 
Reviewed-by: Ingo Molnar 
Tested-by: Gerald Schaefer  # s390
Tested-by: Christophe Leroy # ppc32
Signed-off-by: Andrew Morton 
Signed-off-by: Christophe Leroy 
Signed-off-by: Anshuman Khandual 
---
This adds a test validation for architecture exported page table helpers.
Patch adds basic transformation tests at various levels of the page table.

This test was originally suggested by Catalin during arm64 THP migration
RFC discussion earlier. Going forward it can include more specific tests
with respect to various generic MM functions like THP, HugeTLB etc and
platform specific tests.

https://lore.kernel.org/linux-mm/20190628102003.ga56...@arrakis.emea.arm.com/

Needs to be applied on linux V5.6-rc2

Changes in V14:

- Disabled DEBUG_VM_PGFLAGS for IA64 and ARM (32 Bit) per Andrew and Christophe
- Updated DEBUG_VM_PGFLAGS documentation wrt EXPERT and disabled platforms
- Updated RANDOM_[OR|NZ]VALUE open encodings with GENMASK() per Catalin
- Updated s390 constraint bits from 12 to 4 (S390_MASK_BITS) per Gerald
- Updated in-code documentation for RANDOM_ORVALUE per Gerald
- Updated pxx_basic_tests() to use invert functions first per Catalin
- Dropped ARCH_HAS_4LEVEL_HACK check from pud_basic_tests()
- Replaced __ARCH_HAS_[4|5]LEVEL_HACK with __PAGETABLE_[PUD|P4D]_FOLDED per 
Catalin
- Trimmed the CC list on the commit message per Catalin

Changes in V13: 
(https://patchwork.kernel.org/project/linux-mm/list/?series=237125)

- Subscribed s390 platform and updated debug-vm-pgtable/arch-support.txt per 
Gerald
- Dropped keyword 'extern' from debug_vm_pgtable() declaration per Christophe
- Moved debug_vm_pgtable() declarations to  per Christophe
- Moved debug_vm_pgtable() call site into kernel_init() per Christophe
- Changed CONFIG_DEBUG_VM_PGTABLE rules per Christophe
- Updated commit to include new supported platforms and changed config selection

Changes in V12: 
(https://patchwork.kernel.org/project/linux-mm/list/?series=233905)

- Replaced __mmdrop() with mmdrop()
- Enable ARCH_HAS_DEBUG_VM_PGTABLE on X86 for non CONFIG_X86_PAE platforms as
  the test procedure interferes with pre-allocated PMDs attached to the PGD,
  resulting in runtime failures with VM_BUG_ON()

Changes in V11: 
(https://patchwork.kernel.org/project/linux-mm/list/?series=221135)

- Rebased the patch on V5.4

Changes in V10: 
(https://patchwork.kernel.org/project/linux-mm/list/?series=205529)

- Always enable DEBUG_VM_PGTABLE when DEBUG_VM is enabled per Ingo
- Added tags from Ingo

Changes in V9: 

Re: QEMU/KVM snapshot restore bug

2020-02-16 Thread dftxbs3e
On 2/16/20 7:16 PM, Cédric Le Goater wrote:
>
> I think this is fixed by commit f55750e4e4fb ("spapr/xive: Mask the EAS when 
> allocating an IRQ") which is not in QEMU 4.1.1. The same problem should also 
> occur with LE guests. 
>
> Could you possibly regenerate the QEMU rpm with this patch ? 
>
> Thanks,
>
> C.

Hello!

I applied the patch and reinstalled the RPM then tried to restore the
snapshot I created previously and it threw the same error.

Do I need to re-create the snapshot and/or restart the machine? I have
important workloads running so that'll be possible only in a few days if
needed.

Thanks






[PATCH 2/2] powerpc/powernv: Add explicit fast-reboot support

2020-02-16 Thread Oliver O'Halloran
Add a way to manually invoke a fast-reboot rather than setting the NVRAM
flag. The idea is to allow userspace to invoke a fast-reboot using the
optional string argument to the reboot() system call, or using the xmon
zr command, so we don't need to leave around persistent changes on
a system to use the feature.

Signed-off-by: Oliver O'Halloran 
---
Companion skiboot patch:
http://lists.ozlabs.org/pipermail/skiboot/2020-February/016420.html
---
 arch/powerpc/include/asm/opal-api.h| 1 +
 arch/powerpc/platforms/powernv/setup.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index c1f25a7..1dffa3c 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -1067,6 +1067,7 @@ enum {
OPAL_REBOOT_PLATFORM_ERROR  = 1,
OPAL_REBOOT_FULL_IPL= 2,
OPAL_REBOOT_MPIPL   = 3,
+   OPAL_REBOOT_FAST= 4,
 };
 
 /* Argument to OPAL_PCI_TCE_KILL */
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index a8fe630..3bc188d 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -237,6 +237,8 @@ static void  __noreturn pnv_restart(char *cmd)
rc = opal_cec_reboot2(OPAL_REBOOT_MPIPL, NULL);
else if (strcmp(cmd, "error") == 0)
rc = opal_cec_reboot2(OPAL_REBOOT_PLATFORM_ERROR, NULL);
+   else if (strcmp(cmd, "fast") == 0)
+   rc = opal_cec_reboot2(OPAL_REBOOT_FAST, NULL);
else
rc = OPAL_UNSUPPORTED;
 
-- 
2.9.5
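
For reference, a userspace sketch of how the optional string reaches
pnv_restart() (assumes this patch and the companion skiboot change are
applied; LINUX_REBOOT_CMD_RESTART2 is the reboot(2) command that carries
a string argument):

    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/reboot.h>

    int main(void)
    {
            /* glibc's reboot() wrapper drops the argument, so use syscall() */
            return syscall(SYS_reboot, LINUX_REBOOT_MAGIC1,
                           LINUX_REBOOT_MAGIC2, LINUX_REBOOT_CMD_RESTART2,
                           "fast");
    }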



[PATCH 1/2] powerpc/powernv: Treat an empty reboot string as default

2020-02-16 Thread Oliver O'Halloran
Treat an empty reboot cmd string the same as a NULL string. This squashes a
spurious "unsupported reboot" message that sometimes gets printed when using
xmon.

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/platforms/powernv/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 11fdae8..a8fe630 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -229,7 +229,7 @@ static void  __noreturn pnv_restart(char *cmd)
pnv_prepare_going_down();
 
do {
-   if (!cmd)
+   if (!cmd || !strlen(cmd))
rc = opal_cec_reboot();
else if (strcmp(cmd, "full") == 0)
rc = opal_cec_reboot2(OPAL_REBOOT_FULL_IPL, NULL);
-- 
2.9.5



Re: [PATCH] KVM: PPC: Book3S HV: Treat unrecognized TM instructions as illegal

2020-02-16 Thread Michael Neuling
On Thu, 2020-02-13 at 10:15 -0500, Gustavo Romero wrote:
> On P9 DD2.2, due to a CPU defect, some TM instructions need to be emulated by
> KVM. This is handled at first by the hardware raising a softpatch interrupt
> when certain TM instructions that need KVM assistance are executed in the
> guest. Some TM instructions, although not defined in the Power ISA, might
> raise a softpatch interrupt. For instance, the 'tresume.' instruction as
> defined in the ISA must have bit 31 set (1), but an instruction that
> matches the 'tresume.' OP and XO opcodes yet has bit 31 not set (0), like
> 0x7cfe9ddc, also raises a softpatch interrupt. For example, if code like
> the following is executed in the guest, it will raise a softpatch
> interrupt just like a 'tresume.' when the TM facility is enabled:
> 
> int main() { asm("tabort. 0; .long 0x7cfe9ddc;"); }
> 
> Currently in such a case KVM throws a complete trace like the following:
> 
> [345523.705984] WARNING: CPU: 24 PID: 64413 at 
> arch/powerpc/kvm/book3s_hv_tm.c:211 kvmhv_p9_tm_emulation+0x68/0x620 [kvm_hv]
> [345523.705985] Modules linked in: kvm_hv(E) xt_conntrack ipt_REJECT 
> nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat
> iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
> ebtable_filter ebtables ip6table_filter
> ip6_tables iptable_filter bridge stp llc sch_fq_codel ipmi_powernv at24 
> vmx_crypto ipmi_devintf ipmi_msghandler
> ibmpowernv uio_pdrv_genirq kvm opal_prd uio leds_powernv ib_iser rdma_cm 
> iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp
> libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs 
> blake2b_generic zstd_compress raid10 raid456
> async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor 
> raid6_pq raid1 raid0 multipath linear tg3
> crct10dif_vpmsum crc32c_vpmsum ipr [last unloaded: kvm_hv]
> [345523.706030] CPU: 24 PID: 64413 Comm: CPU 0/KVM Tainted: GW   E
>  5.5.0+ #1
> [345523.706031] NIP:  c008072cb9c0 LR: c008072b5e80 CTR: 
> c008085c7850
> [345523.706034] REGS: c00399467680 TRAP: 0700   Tainted: GW   E   
>(5.5.0+)
> [345523.706034] MSR:  90010282b033 
>   CR: 24022428  XER: 
> [345523.706042] CFAR: c008072b5e7c IRQMASK: 0
> GPR00: c008072b5e80 c00399467910 c008072db500 
> c00375ccc720
> GPR04: c00375ccc720 0003fbec a10395dda5a6 
> 
> GPR08: 7cfe9ddc 7cfe9ddc05dc 7cfe9ddc7c0005dc 
> c008072cd530
> GPR12: c008085c7850 c003fffeb800 0001 
> 7dfb737f
> GPR16: c0002001edcca558   
> 0001
> GPR20: c1b21258 c0002001edcca558 0018 
> 
> GPR24: 0100  0001 
> 1500
> GPR28: c0002001edcc4278 c0037dd8 80050280f033 
> c00375ccc720
> [345523.706062] NIP [c008072cb9c0] kvmhv_p9_tm_emulation+0x68/0x620 
> [kvm_hv]
> [345523.706065] LR [c008072b5e80] 
> kvmppc_handle_exit_hv.isra.53+0x3e8/0x798 [kvm_hv]
> [345523.706066] Call Trace:
> [345523.706069] [c00399467910] [c00399467940] 0xc00399467940 
> (unreliable)
> [345523.706071] [c00399467950] [c00399467980] 0xc00399467980
> [345523.706075] [c003994679f0] [c008072bd1c4] 
> kvmhv_run_single_vcpu+0xa1c/0xb80 [kvm_hv]
> [345523.706079] [c00399467ac0] [c008072bd8e0] 
> kvmppc_vcpu_run_hv+0x5b8/0xb00 [kvm_hv]
> [345523.706087] [c00399467b90] [c008085c93cc] 
> kvmppc_vcpu_run+0x34/0x48 [kvm]
> [345523.706095] [c00399467bb0] [c008085c582c] 
> kvm_arch_vcpu_ioctl_run+0x244/0x420 [kvm]
> [345523.706101] [c00399467c40] [c008085b7498] 
> kvm_vcpu_ioctl+0x3d0/0x7b0 [kvm]
> [345523.706105] [c00399467db0] [c04adf9c] ksys_ioctl+0x13c/0x170
> [345523.706107] [c00399467e00] [c04adff8] sys_ioctl+0x28/0x80
> [345523.706111] [c00399467e20] [c000b278] system_call+0x5c/0x68
> [345523.706112] Instruction dump:
> [345523.706114] 419e0390 7f8a4840 409d0048 6d497c00 2f89075d 419e021c 
> 6d497c00 2f8907dd
> [345523.706119] 419e01c0 6d497c00 2f8905dd 419e00a4 <0fe0> 38210040 
> 3860 ebc1fff0
> 
> and then treats the executed instruction as 'nop' whilst it should actually
> be treated as an illegal instruction since it's not defined by the ISA.

The ISA has this: 

   1.3.3 Reserved Fields, Reserved Values, and Reserved SPRs

   Reserved fields in instructions are ignored by the processor.

Hence the hardware will ignore reserved bits. For example, executing your little
program on P8 just exits normally with 0x7cfe9ddc being executed as a NOP.

Hence, we should NOP this, not generate an illegal.

Mikey

> This commit changes the handling of the case above by treating the
> unrecognized TM instructions that can raise a softpatch but are not
> 

Re: [PATCH v7 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support

2020-02-16 Thread Michael Neuling
Daniel. 

Can you start this commit message with a simple description of what you are
actually doing? This reads like you've been on a long journey to Mordor and
back, which, as a reader of this patch in the distant future, I don't care
about. I just want to know what you're implementing.

Also I'm struggling to review this as I don't know what software or hardware
mechanisms you are using to perform sanitisation.

Mikey

On Thu, 2020-02-13 at 11:47 +1100, Daniel Axtens wrote:
> KASAN support on Book3S is a bit tricky to get right:
> 
>  - It would be good to support inline instrumentation so as to be able to
>catch stack issues that cannot be caught with outline mode.
> 
>  - Inline instrumentation requires a fixed offset.
> 
>  - Book3S runs code in real mode after booting. Most notably a lot of KVM
>runs in real mode, and it would be good to be able to instrument it.
> 
>  - Because code runs in real mode after boot, the offset has to point to
>valid memory both in and out of real mode.
> 
> [ppc64 mm note: The kernel installs a linear mapping at effective
> address c000... onward. This is a one-to-one mapping with physical
> memory from ... onward. Because of how memory accesses work on
> powerpc 64-bit Book3S, a kernel pointer in the linear map accesses the
> same memory both with translations on (accessing as an 'effective
> address'), and with translations off (accessing as a 'real
> address'). This works in both guests and the hypervisor. For more
> details, see s5.7 of Book III of version 3 of the ISA, in particular
> the Storage Control Overview, s5.7.3, and s5.7.5 - noting that this
> KASAN implementation currently only supports Radix.]
> 
> One approach is just to give up on inline instrumentation. This way all
> checks can be delayed until after everything set is up correctly, and the
> address-to-shadow calculations can be overridden. However, the features and
> speed boost provided by inline instrumentation are worth trying to do
> better.
> 
> If _at compile time_ it is known how much contiguous physical memory a
> system has, the top 1/8th of the first block of physical memory can be set
> aside for the shadow. This is a big hammer and comes with 3 big
> consequences:
> 
>  - there's no nice way to handle physically discontiguous memory, so only
>the first physical memory block can be used.
> 
>  - kernels will simply fail to boot on machines with less memory than
>specified when compiling.
> 
>  - kernels running on machines with more memory than specified when
>compiling will simply ignore the extra memory.
> 
> Implement and document KASAN this way. The current implementation is Radix
> only.
> 
> Despite the limitations, it can still find bugs,
> e.g. http://patchwork.ozlabs.org/patch/1103775/
> 
> At the moment, this physical memory limit must be set _even for outline
> mode_. This may be changed in a later series - a different implementation
> could be added for outline mode that dynamically allocates shadow at a
> fixed offset. For example, see https://patchwork.ozlabs.org/patch/795211/
> 
> Suggested-by: Michael Ellerman 
> Cc: Balbir Singh  # ppc64 out-of-line radix version
> Cc: Christophe Leroy  # ppc32 version
> Signed-off-by: Daniel Axtens 
> 
> ---
> Changes since v6:
>  - rework kasan_late_init support, which also fixes book3e problem that
> snowpatch
>picked up (I think)
>  - fix a checkpatch error that snowpatch picked up
>  - don't needlessly move the include in kasan.h
> 
> Changes since v5:
>  - rebase on powerpc/merge, with Christophe's latest changes integrating
>kasan-vmalloc
>  - documentation tweaks based on latest 32-bit changes
> 
> Changes since v4:
>  - fix some ppc32 build issues
>  - support ptdump
>  - clean up the header file. It turns out we don't need or use
> KASAN_SHADOW_SIZE,
>so just dump it, and make KASAN_SHADOW_END the thing that varies between 32
>and 64 bit. As part of this, make sure KASAN_SHADOW_OFFSET is only
> configured for
>32 bit - it is calculated in the Makefile for ppc64.
>  - various cleanups
> 
> Changes since v3:
>  - Address further feedback from Christophe.
>  - Drop changes to stack walking, it looks like the issue I observed is
>related to that particular stack, not stack-walking generally.
> 
> Changes since v2:
> 
>  - Address feedback from Christophe around cleanups and docs.
>  - Address feedback from Balbir: at this point I don't have a good solution
>for the issues you identify around the limitations of the inline
> implementation
>but I think that it's worth trying to get the stack instrumentation
> support.
>I'm happy to have an alternative and more flexible outline mode - I had
>envisoned this would be called 'lightweight' mode as it imposes fewer
> restrictions.
>I've linked to your implementation. I think it's best to add it in a
> follow-up series.
>  - Made the default PHYS_MEM_SIZE_FOR_KASAN value 

Re: Kernel (little-endian) crashing on POWER8 on heavy PowerKVM load

2020-02-16 Thread Michael Neuling
Paulus,

Something below for you I think


> We have an IBM POWER server (8247-42L) running Linux kernel 5.4.13 on Debian 
> unstable
> hosting a big-endian ppc64 virtual machine running the same kernel in 
> big-endian
> mode.
> 
> When building OpenJDK-11 on the big-endian VM, the testsuite crashes the 
> *host* system
> with the following kernel backtrace. The problem reproduces both with kernel 
> 4.19.98
> as well as 5.4.13.
> 
> Backtrace has been attached at the end of this mail.
> 
> Thanks,
> Adrian
> 
> watson login: [17667518570.438744] BUG: Unable to handle kernel data access 
> at 0xc2bfd038
> [17667518570.438772] Faulting instruction address: 0xc017a778
> [17667518570.438777] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08
> [17667518570.438781] Faulting instruction address: 0xc02659a0
> [17667518570.438785] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08
> [17667518570.438789] Faulting instruction address: 0xc02659a0
> [17667518570.438793] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08
> [17667518570.438797] Faulting instruction address: 0xc02659a0
> [17667518570.438801] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08
> [17667518570.438804] Faulting instruction address: 0xc02659a0
> [17667518570.438808] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08



> [17667518570.439197] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08
> [ 8142.397983]  async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) 
> raid6_pq(E) libcrc32c(E) crc32c_generic(E)
> [17667518570.439207] Faulting instruction address: 0xc02659a0
> [ 8142.397992]  raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) 
> xhci_pci(E) xhci_hcd(E)
> [17667518570.439215] Thread overran stack, or stack corrupted
> [ 8142.398000]  e1000e(E) usbcore(E) ptp(E) pps_core(E) ipr(E) usb_common(E)
> [ 8142.398011] CPU: 48 PID: 2571 Comm: CPU 0/KVM Tainted: GE 
> 5.4.0-0.bpo.3-powerpc64le #1 Debian 5.4.13-1~bpo10+1
> [ 8142.398014] NIP:  c00fe3117a00 LR: c0196b9c CTR: 
> c00fe3117a00
> [17667518570.439234] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08
> [ 8142.398026] REGS: c00fe315f4c0 TRAP: 0400   Tainted: GE
>   (5.4.0-0.bpo.3-powerpc64le Debian 5.4.13-1~bpo10+1)
> [17667518570.439243] Faulting instruction address: 0xc02659a0
> [17667518570.439245] Thread overran stack, or stack corrupted
> [ 8142.398038] MSR:  900010009033   CR: 28448484 
>  XER: 
> [ 8142.398046] CFAR: c0196b98 IRQMASK: 1 
> [ 8142.398046] GPR00: c0196e0c c00fe315f750 c12e0800 
> c00fe31179c0 
> [ 8142.398046] GPR04: 0003   
>  
> [ 8142.398046] GPR08: c00fe315f7f0 c00fe3117a00 8030 
> c008082bcd80 
> [ 8142.398046] GPR12: c00fe3117a00 c00f5a00  
> 0008 
> [ 8142.398046] GPR16: c13a5c18 c00ff1035e00 c00fe315f8e8 
> 0001 
> [ 8142.398046] GPR20:  c00fe315f8e8 c00fe31179c0 
>  
> [ 8142.398046] GPR24: c00fe315f7f0 0001  
> 0003 
> [ 8142.398046] GPR28:  c00fedc6e750 0010 
> c00fe311f8d0 
> [ 8142.398079] NIP [c00fe3117a00] 0xc00fe3117a00
> [ 8142.398087] LR [c0196b9c] __wake_up_common+0xcc/0x290
> [17667518570.439321] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08
> [ 8142.398109] Call Trace:
> [17667518570.439328] Faulting instruction address: 0xc02659a0
> [17667518570.439330] Thread overran stack, or stack corrupted
> [ 8142.398122] [c00fe315f750] [c0196b9c] 
> __wake_up_common+0xcc/0x290 (unreliable)
> [ 8142.398127] [c00fe315f7d0] [c0196e0c] 
> __wake_up_common_lock+0xac/0x110
> [ 8142.398134] [c00fe315f850] [c008082a9760] 
> kvmppc_run_core+0x12f8/0x18c0 [kvm_hv]
> [ 8142.398140] [c00fe315fa10] [c008082acf14] 
> kvmppc_vcpu_run_hv+0x62c/0xb20 [kvm_hv]
> [ 8142.398149] [c00fe315fae0] [c008081098cc] 
> kvmppc_vcpu_run+0x34/0x48 [kvm]
> [ 8142.398158] [c00fe315fb00] [c0080810587c] 
> kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
> [ 8142.398166] [c00fe315fb90] [c008080f7ac8] 
> kvm_vcpu_ioctl+0x340/0x7d0 [kvm]
> [ 8142.398172] [c00fe315fd00] [c0445410] do_vfs_ioctl+0xe0/0xac0
> [ 8142.398176] [c00fe315fdb0] [c0445eb4] ksys_ioctl+0xc4/0x110
> [ 8142.398180] [c00fe315fe00] [c0445f28] sys_ioctl+0x28/0x80
> [ 8142.398184] [c00fe315fe20] [c000b9c8] system_call+0x5c/0x68
> [ 8142.398186] Instruction dump:
> [17667518570.439406] BUG: Unable to handle kernel data access at 
> 0xc007f9070c08
> [ 8142.398196]        
>  

Re: [PATCH] powerpc/chrp: Fix enter_rtas() with CONFIG_VMAP_STACK

2020-02-16 Thread Michael Neuling
On Fri, 2020-02-14 at 08:33 +, Christophe Leroy wrote:
> With CONFIG_VMAP_STACK, data MMU has to be enabled
> to read data on the stack.

Can you describe what goes wrong without this? Some oops message? rtas blows up?
Get corrupt data?

Also can you say what you're actually doing (ie turning on MSR[DR])


> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/kernel/entry_32.S | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index 0713daa651d9..bc056d906b51 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -1354,12 +1354,17 @@ _GLOBAL(enter_rtas)
>   mtspr   SPRN_SRR0,r8
>   mtspr   SPRN_SRR1,r9
>   RFI
> -1:   tophys(r9,r1)
> +1:   tophys_novmstack r9, r1
> +#ifdef CONFIG_VMAP_STACK
> + li  r0, MSR_KERNEL & ~MSR_IR/* can take DTLB miss */

You're potentially turning on more than MSR DR here. This should be clear in the
commit message.

> + mtmsr   r0
> + isync
> +#endif
>   lwz r8,INT_FRAME_SIZE+4(r9) /* get return address */
>   lwz r9,8(r9)/* original msr value */
>   addir1,r1,INT_FRAME_SIZE
>   li  r0,0
> - tophys(r7, r2)
> + tophys_novmstack r7, r2
>   stw r0, THREAD + RTAS_SP(r7)
>   mtspr   SPRN_SRR0,r8
>   mtspr   SPRN_SRR1,r9



Re: [PATCH 1/1] powerpc/cputable: Remove unnecessary copy of cpu_spec->oprofile_type

2020-02-16 Thread Michael Neuling
On Sat, 2020-02-15 at 02:36 -0300, Leonardo Bras wrote:
> Before checking for cpu_type == NULL, this same copy happens, so doing
> it here will just write the same value to t->oprofile_type
> again.
> 
> Remove the repeated copy, as it is unnecessary.
> 
> Signed-off-by: Leonardo Bras 

LGTM

Reviewed-by: Michael Neuling 

> ---
>  arch/powerpc/kernel/cputable.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
> index e745abc5457a..5a87ec96582f 100644
> --- a/arch/powerpc/kernel/cputable.c
> +++ b/arch/powerpc/kernel/cputable.c
> @@ -2197,7 +2197,6 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
>*/
>   if (old.oprofile_cpu_type != NULL) {
>   t->oprofile_cpu_type = old.oprofile_cpu_type;
> - t->oprofile_type = old.oprofile_type;
>   }
>   }
>  



Re: QEMU/KVM snapshot restore bug

2020-02-16 Thread Cédric Le Goater
On 2/11/20 4:57 AM, dftxbs3e wrote:
> Hello,
> 
> I took a snapshot of a ppc64 (big endian) VM from a ppc64 (little endian) 
> host using `virsh snapshot-create-as --domain  --name `
>
> Then I restarted my system and tried restoring the snapshot:
> 
> # virsh snapshot-revert --domain  --snapshotname 
> error: internal error: process exited while connecting to monitor: 
> 2020-02-11T03:18:08.110582Z qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: 
> Group 3 attr 0x1309: Device or resource busy
> 2020-02-11T03:18:08.110605Z qemu-system-ppc64: error while loading state for 
> instance 0x0 of device 'spapr'
> 2020-02-11T03:18:08.112843Z qemu-system-ppc64: Error -1 while loading VM state
> 
> And dmesg shows each time the restore command is executed:
> 
> [  180.176606] WARNING: CPU: 16 PID: 5528 at 
> arch/powerpc/kvm/book3s_xive.c:345 xive_try_pick_queue+0x40/0xb8 [kvm]
> [  180.176608] Modules linked in: vhost_net vhost tap kvm_hv kvm xt_CHECKSUM 
> xt_MASQUERADE nf_nat_tftp nf_conntrack_tftp tun bridge 8021q garp mrp stp llc 
> rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_REJECT 
> nf_reject_ipv6 ip6t_rpfilter ipt_REJECT nf_reject_ipv4 xt_conntrack 
> ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw 
> ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw 
> iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nfnetlink 
> ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc 
> raid1 at24 regmap_i2c snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg 
> joydev snd_hda_codec snd_hda_core ofpart snd_hwdep crct10dif_vpmsum snd_seq 
> ipmi_powernv powernv_flash ipmi_devintf snd_seq_device mtd ipmi_msghandler 
> rtc_opal snd_pcm opal_prd i2c_opal snd_timer snd soundcore lz4 lz4_compress 
> zram ip_tables xfs libcrc32c dm_crypt amdgpu ast drm_vram_helper mfd_core 
> i2c_algo_bit gpu_sched drm_kms_helper mpt3sas
> [  180.176652]  syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm 
> vmx_crypto tg3 crc32c_vpmsum nvme raid_class scsi_transport_sas nvme_core 
> drm_panel_orientation_quirks i2c_core fuse
> [  180.176663] CPU: 16 PID: 5528 Comm: qemu-system-ppc Not tainted 
> 5.4.17-200.fc31.ppc64le #1
> [  180.176665] NIP:  c0080a883c80 LR: c0080a886db8 CTR: 
> c0080a88a9e0
> [  180.176667] REGS: c00767a17890 TRAP: 0700   Not tainted  
> (5.4.17-200.fc31.ppc64le)
> [  180.176668] MSR:  90029033   CR: 48224248 
>  XER: 2004
> [  180.176673] CFAR: c0080a886db4 IRQMASK: 0 
>GPR00: c0080a886db8 c00767a17b20 c0080a8aed00 
> c0002005468a4480 
>GPR04:    
> 0001 
>GPR08: c0002007142b2400 c0002007142b2400  
> c0080a8910f0 
>GPR12: c0080a88a488 c007fffed000  
>  
>GPR16: 000149524180 739bda78 739bda30 
> 025c 
>GPR20:  0003 c0002006f13a 
>  
>GPR24: 1359  c002f8c96c38 
> c002f8c8 
>GPR28:  c0002006f13a c0002006f13a4038 
> c00767a17be4 
> [  180.176688] NIP [c0080a883c80] xive_try_pick_queue+0x40/0xb8 [kvm]
> [  180.176693] LR [c0080a886db8] kvmppc_xive_select_target+0x100/0x210 
> [kvm]
> [  180.176694] Call Trace:
> [  180.176696] [c00767a17b20] [c00767a17b70] 0xc00767a17b70 
> (unreliable)
> [  180.176701] [c00767a17b70] [c0080a88b420] 
> kvmppc_xive_native_set_attr+0xf98/0x1760 [kvm]
> [  180.176705] [c00767a17cc0] [c0080a86392c] 
> kvm_device_ioctl+0xf4/0x180 [kvm]
> [  180.176710] [c00767a17d10] [c05380b0] do_vfs_ioctl+0xaa0/0xd90
> [  180.176712] [c00767a17dd0] [c0538464] sys_ioctl+0xc4/0x110
> [  180.176716] [c00767a17e20] [c000b9d0] system_call+0x5c/0x68
> [  180.176717] Instruction dump:
> [  180.176719] 794ad182 0b0a 2c29 41820080 89490010 2c0a 41820074 
> 78883664 
> [  180.176723] 7d094214 e9480070 7d470074 78e7d182 <0b07> 2c2a 
> 41820054 81480078 
> [  180.176727] ---[ end trace 056a6dd275e20684 ]---
> 
> Let me know if I can provide more information

I think this is fixed by commit f55750e4e4fb ("spapr/xive: Mask the EAS when 
allocating an IRQ") which is not in QEMU 4.1.1. The same problem should also 
occur with LE guests. 

Could you possibly regenerate the QEMU rpm with this patch ? 

Thanks,

C.


Re: Surprising code generated for vdso_read_begin()

2020-02-16 Thread Arnd Bergmann
On Sat, Jan 11, 2020 at 12:33 PM Segher Boessenkool
 wrote:
>
> On Fri, Jan 10, 2020 at 07:45:44AM +0100, Christophe Leroy wrote:
> > > On 09/01/2020 at 21:07, Segher Boessenkool wrote:
> > >It looks like the compiler did loop peeling.  What GCC version is this?
> > >Please try current trunk (to become GCC 10), or at least GCC 9?
> >
> > It is with GCC 5.5
> >
> > https://mirrors.edge.kernel.org/pub/tools/crosstool/ doesn't have more
> > recent than 8.1
>
> Arnd, can you update the tools?  We are at 8.3 and 9.2 now :-)  Or is
> this hard and/or painful to do?

To follow up on this older thread, I have now uploaded 6.5, 7.5, 8.3 and 9.2
binaries, as well as a recent 10.0 snapshot.

I hope these work, let me know if there are problems.

   Arnd


Re: [PATCH] powerpc/kprobes: Fix trap address when trap happened in real mode

2020-02-16 Thread Masami Hiramatsu
On Sat, 15 Feb 2020 11:28:49 +0100
Christophe Leroy  wrote:

> Hi,
> 
> On 14/02/2020 at 14:54, Masami Hiramatsu wrote:
> > Hi,
> > 
> > On Fri, 14 Feb 2020 12:47:49 + (UTC)
> > Christophe Leroy  wrote:
> > 
> >> When a program check exception happens while MMU translation is
> >> disabled, the following Oops happens in kprobe_handler() in the following
> >> test:
> >>
> >>} else if (*addr != BREAKPOINT_INSTRUCTION) {
> > 
> > Thanks for the report and patch. I'm not so sure about the powerpc
> > implementation, but where the MMU translation is disabled, can the
> > handler work correctly?
> > (And where did you put the probe on?)
> > 
> > Your fix may fix this Oops, but if the handler needs special care, it is an
> > option to blacklist such place (if possible).
> 
> I guess that's another story. Here we are not talking about a place 
> where kprobe has been illegitimately activated, but a place where there 
> is a valid trap, which generated a valid 'program check exception'. And 
> kprobe was off at that time.

Ah, I got it. It is not a kprobe breakpoint, but to check that correctly,
the handler has to know the address where the breakpoint happened. OK.

> 
> As any 'program check exception' due to a trap (ie a BUG_ON, a WARN_ON, 
> a debugger breakpoint, a perf breakpoint, etc...) calls 
> kprobe_handler(), kprobe_handler() must be prepared to handle the case 
> where the MMU translation is disabled, even if probes are not supposed 
> to be set for functions running with MMU translation disabled.

Can't we check whether the MMU is disabled there (the same way we check
whether the exception happened in user space)?
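
Something like this is what I have in mind, as a rough, untested sketch
for kprobe_handler() (MSR_IR is the instruction-translation bit in
regs->msr):

	/* Bail out if the trap was taken with MMU translation disabled:
	 * nip then holds a real address and must not be dereferenced
	 * through the usual virtual-mode accessors.
	 */
	if (!(regs->msr & MSR_IR))
		return 0;	/* not ours, let the normal trap path run */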

Thank you,

-- 
Masami Hiramatsu 


Re: [PATCH v2 00/13] mm: remove __ARCH_HAS_5LEVEL_HACK

2020-02-16 Thread Christophe Leroy




On 16/02/2020 at 09:22, Russell King - ARM Linux admin wrote:

On Sun, Feb 16, 2020 at 10:18:30AM +0200, Mike Rapoport wrote:

From: Mike Rapoport 

Hi,

These patches convert several architectures to use page table folding and
remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.

The changes are mostly about mechanical replacement of pgd accessors with p4d
ones and the addition of higher levels to page table traversals.

All the patches were sent separately to the respective arch lists and
maintainers hence the "v2" prefix.


You fail to explain why this change which adds 488 additional lines of
code is desirable.



The purpose of the series, i.e. dropping a HACK, is worth it.

However, looking at the powerpc patch I have the feeling that this series 
goes beyond its purpose.


The number of additional lines could be greatly reduced, I think, if we 
limit the patches to the strict minimum, i.e. just do things like below 
instead of adding lots of handling for useless levels.


Instead of doing things like:

-   pud = NULL;
+   p4d = NULL;
if (pgd_present(*pgd))
-   pud = pud_offset(pgd, gpa);
+   p4d = p4d_offset(pgd, gpa);
+   else
+   new_p4d = p4d_alloc_one(kvm->mm, gpa);
+
+   pud = NULL;
+   if (p4d_present(*p4d))
+   pud = pud_offset(p4d, gpa);
else
new_pud = pud_alloc_one(kvm->mm, gpa);

It could be limited to:

if (pgd_present(*pgd))
-   pud = pud_offset(pgd, gpa);
+   pud = pud_offset(p4d_offset(pgd, gpa), gpa);
else
new_pud = pud_alloc_one(kvm->mm, gpa);


Christophe


Re: [PATCH v2 07/13] powerpc: add support for folded p4d page tables

2020-02-16 Thread Christophe Leroy




On 16/02/2020 at 09:18, Mike Rapoport wrote:

From: Mike Rapoport 

Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.


I don't think it is worth adding all these additional walks of the p4d 
level; this patch could be limited to changes like:


-   pud = pud_offset(pgd, gpa);
+   pud = pud_offset(p4d_offset(pgd, gpa), gpa);

The additional walks should be added through another patch the day 
powerpc needs them.


See below for more comments.



Signed-off-by: Mike Rapoport 
Tested-by: Christophe Leroy  # 8xx and 83xx
---
  arch/powerpc/include/asm/book3s/32/pgtable.h  |  1 -
  arch/powerpc/include/asm/book3s/64/hash.h |  4 +-
  arch/powerpc/include/asm/book3s/64/pgalloc.h  |  4 +-
  arch/powerpc/include/asm/book3s/64/pgtable.h  | 58 ++
  arch/powerpc/include/asm/book3s/64/radix.h|  6 +-
  arch/powerpc/include/asm/nohash/32/pgtable.h  |  1 -
  arch/powerpc/include/asm/nohash/64/pgalloc.h  |  2 +-
  .../include/asm/nohash/64/pgtable-4k.h| 32 +-
  arch/powerpc/include/asm/nohash/64/pgtable.h  |  6 +-
  arch/powerpc/include/asm/pgtable.h|  8 +++
  arch/powerpc/kvm/book3s_64_mmu_radix.c| 59 ---
  arch/powerpc/lib/code-patching.c  |  7 ++-
  arch/powerpc/mm/book3s32/mmu.c|  2 +-
  arch/powerpc/mm/book3s32/tlb.c|  4 +-
  arch/powerpc/mm/book3s64/hash_pgtable.c   |  4 +-
  arch/powerpc/mm/book3s64/radix_pgtable.c  | 19 --
  arch/powerpc/mm/book3s64/subpage_prot.c   |  6 +-
  arch/powerpc/mm/hugetlbpage.c | 28 +
  arch/powerpc/mm/kasan/kasan_init_32.c |  8 +--
  arch/powerpc/mm/mem.c |  4 +-
  arch/powerpc/mm/nohash/40x.c  |  4 +-
  arch/powerpc/mm/nohash/book3e_pgtable.c   | 15 +++--
  arch/powerpc/mm/pgtable.c | 25 +++-
  arch/powerpc/mm/pgtable_32.c  | 28 +
  arch/powerpc/mm/pgtable_64.c  | 10 ++--
  arch/powerpc/mm/ptdump/hashpagetable.c| 20 ++-
  arch/powerpc/mm/ptdump/ptdump.c   | 22 ++-
  arch/powerpc/xmon/xmon.c  | 17 +-
  28 files changed, 284 insertions(+), 120 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 5b39c11e884a..39ec11371be0 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -2,7 +2,6 @@
  #ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
  #define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
  
-#define __ARCH_USE_5LEVEL_HACK

  #include <asm-generic/pgtable-nopmd.h>
  
  #include <asm/book3s/32/hash.h>

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 2781ebf6add4..876d1528c2cf 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -134,9 +134,9 @@ static inline int get_region_id(unsigned long ea)
  
  #define	hash__pmd_bad(pmd)		(pmd_val(pmd) & H_PMD_BAD_BITS)

  #define   hash__pud_bad(pud)  (pud_val(pud) & H_PUD_BAD_BITS)
-static inline int hash__pgd_bad(pgd_t pgd)
+static inline int hash__p4d_bad(p4d_t p4d)
  {
-   return (pgd_val(pgd) == 0);
+   return (p4d_val(p4d) == 0);
  }
  #ifdef CONFIG_STRICT_KERNEL_RWX
  extern void hash__mark_rodata_ro(void);
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index a41e91bd0580..69c5b051734f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -85,9 +85,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
  }
  
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)

+static inline void p4d_populate(struct mm_struct *mm, p4d_t *pgd, pud_t *pud)
  {
-   *pgd =  __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
+   *pgd =  __p4d(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
  }
  
  static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 201a69e6a355..bafff0ab 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -2,7 +2,7 @@
  #ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
  #define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
  
-#include <asm-generic/5level-fixup.h>

+#include <asm-generic/pgtable-nop4d.h>
  
  #ifndef __ASSEMBLY__

  #include <linux/mmdebug.h>
@@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
  /* Bits to mask out from a PUD to get to the PMD page */
  #define PUD_MASKED_BITS   0xc0ffUL
  /* Bits to mask out from a PGD to get to the PUD page */
-#define PGD_MASKED_BITS0xc0ffUL
+#define P4D_MASKED_BITS

[Bug 206525] BUG: KASAN: stack-out-of-bounds in test_bit+0x30/0x44 (kernel 5.6-rc1)

2020-02-16 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=206525

--- Comment #4 from Christophe Leroy (christophe.le...@c-s.fr) ---
Feedback from Nikolay:

I think we can just cap these at min(BITS_PER_TYPE(u32), nlk->ngroups), since
"groups" comes from sockaddr_nl's "nl_groups", which is a u32; for any
groups beyond u32 one has to use setsockopt().
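
A rough sketch of that capping, illustrative only and not a tested patch
(loop shape and variable names assumed to match netlink_bind() in
net/netlink/af_netlink.c):

	for (group = 0; group < min_t(unsigned int, BITS_PER_TYPE(u32),
				      nlk->ngroups); group++) {
		if (!(groups & (1U << group)))
			continue;
		/* per-group subscribe/unsubscribe work */
	}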

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v2 00/13] mm: remove __ARCH_HAS_5LEVEL_HACK

2020-02-16 Thread Russell King - ARM Linux admin
On Sun, Feb 16, 2020 at 10:18:30AM +0200, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> Hi,
> 
> These patches convert several architectures to use page table folding and
> remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.
> 
> The changes are mostly about mechanical replacement of pgd accessors with p4d
> ones and the addition of higher levels to page table traversals.
> 
> All the patches were sent separately to the respective arch lists and
> maintainers hence the "v2" prefix.

You fail to explain why this change which adds 488 additional lines of
code is desirable.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


[PATCH v2 13/13] mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

There are no architectures that use include/asm-generic/5level-fixup.h,
therefore it can be removed along with the __ARCH_HAS_5LEVEL_HACK define and
the code it surrounds.
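
With the fixup header gone, the folded p4d level comes from
include/asm-generic/pgtable-nop4d.h, which wraps pgd roughly like this
(abridged excerpt for illustration):

	typedef struct { pgd_t pgd; } p4d_t;

	static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
	{
		return (p4d_t *)pgd;
	}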

Signed-off-by: Mike Rapoport 
---
 include/asm-generic/5level-fixup.h | 58 --
 include/linux/mm.h |  6 
 mm/kasan/init.c| 11 --
 mm/memory.c|  8 -
 4 files changed, 83 deletions(-)
 delete mode 100644 include/asm-generic/5level-fixup.h

diff --git a/include/asm-generic/5level-fixup.h 
b/include/asm-generic/5level-fixup.h
deleted file mode 100644
index 4c74b1c1d13b..000000000000
--- a/include/asm-generic/5level-fixup.h
+++ /dev/null
@@ -1,58 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _5LEVEL_FIXUP_H
-#define _5LEVEL_FIXUP_H
-
-#define __ARCH_HAS_5LEVEL_HACK
-#define __PAGETABLE_P4D_FOLDED 1
-
-#define P4D_SHIFT  PGDIR_SHIFT
-#define P4D_SIZE   PGDIR_SIZE
-#define P4D_MASK   PGDIR_MASK
-#define MAX_PTRS_PER_P4D   1
-#define PTRS_PER_P4D   1
-
-#define p4d_t  pgd_t
-
-#define pud_alloc(mm, p4d, address) \
-   ((unlikely(pgd_none(*(p4d))) && __pud_alloc(mm, p4d, address)) ? \
-   NULL : pud_offset(p4d, address))
-
-#define p4d_alloc(mm, pgd, address)(pgd)
-#define p4d_offset(pgd, start) (pgd)
-
-#ifndef __ASSEMBLY__
-static inline int p4d_none(p4d_t p4d)
-{
-   return 0;
-}
-
-static inline int p4d_bad(p4d_t p4d)
-{
-   return 0;
-}
-
-static inline int p4d_present(p4d_t p4d)
-{
-   return 1;
-}
-#endif
-
-#define p4d_ERROR(p4d) do { } while (0)
-#define p4d_clear(p4d) pgd_clear(p4d)
-#define p4d_val(p4d)   pgd_val(p4d)
-#define p4d_populate(mm, p4d, pud) pgd_populate(mm, p4d, pud)
-#define p4d_populate_safe(mm, p4d, pud)pgd_populate(mm, p4d, pud)
-#define p4d_page(p4d)  pgd_page(p4d)
-#define p4d_page_vaddr(p4d)pgd_page_vaddr(p4d)
-
-#define __p4d(x)   __pgd(x)
-#define set_p4d(p4dp, p4d) set_pgd(p4dp, p4d)
-
-#undef p4d_free_tlb
-#define p4d_free_tlb(tlb, x, addr) do { } while (0)
-#define p4d_free(mm, x)do { } while (0)
-
-#undef  p4d_addr_end
-#define p4d_addr_end(addr, end)(end)
-
-#endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 52269e56c514..69fb46e1d91b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1841,11 +1841,6 @@ int __pte_alloc_kernel(pmd_t *pmd);
 
 #if defined(CONFIG_MMU)
 
-/*
- * The following ifdef needed to get the 5level-fixup.h header to work.
- * Remove it when 5level-fixup.h has been removed.
- */
-#ifndef __ARCH_HAS_5LEVEL_HACK
 static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
unsigned long address)
 {
@@ -1859,7 +1854,6 @@ static inline pud_t *pud_alloc(struct mm_struct *mm, 
p4d_t *p4d,
return (unlikely(p4d_none(*p4d)) && __pud_alloc(mm, p4d, address)) ?
NULL : pud_offset(p4d, address);
 }
-#endif /* !__ARCH_HAS_5LEVEL_HACK */
 
 static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long 
address)
 {
diff --git a/mm/kasan/init.c b/mm/kasan/init.c
index ce45c491ebcd..fe6be0be1f76 100644
--- a/mm/kasan/init.c
+++ b/mm/kasan/init.c
@@ -250,20 +250,9 @@ int __ref kasan_populate_early_shadow(const void 
*shadow_start,
 * 3,2 - level page tables where we don't have
 * puds,pmds, so pgd_populate(), pud_populate()
 * is noops.
-*
-* The ifndef is required to avoid build breakage.
-*
-* With 5level-fixup.h, pgd_populate() is not nop and
-* we reference kasan_early_shadow_p4d. It's not defined
-* unless 5-level paging enabled.
-*
-* The ifndef can be dropped once all KASAN-enabled
-* architectures will switch to pgtable-nop4d.h.
 */
-#ifndef __ARCH_HAS_5LEVEL_HACK
	pgd_populate(&init_mm, pgd,
lm_alias(kasan_early_shadow_p4d));
-#endif
p4d = p4d_offset(pgd, addr);
	p4d_populate(&init_mm, p4d,
lm_alias(kasan_early_shadow_pud));
diff --git a/mm/memory.c b/mm/memory.c
index 0bccc622e482..10cc147db1b8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4252,19 +4252,11 @@ int __pud_alloc(struct mm_struct *mm, p4d_t *p4d, 
unsigned long address)
smp_wmb(); /* See comment in __pte_alloc */
 
	spin_lock(&mm->page_table_lock);
-#ifndef __ARCH_HAS_5LEVEL_HACK
if (!p4d_present(*p4d)) {
mm_inc_nr_puds(mm);
  

[PATCH v2 12/13] asm-generic: remove pgtable-nop4d-hack.h

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

No architecture defines __ARCH_USE_5LEVEL_HACK, and therefore
pgtable-nop4d-hack.h will never actually be included.

Remove it.

Signed-off-by: Mike Rapoport 
---
 include/asm-generic/pgtable-nop4d-hack.h | 64 
 include/asm-generic/pgtable-nopud.h  |  4 --
 2 files changed, 68 deletions(-)
 delete mode 100644 include/asm-generic/pgtable-nop4d-hack.h

diff --git a/include/asm-generic/pgtable-nop4d-hack.h 
b/include/asm-generic/pgtable-nop4d-hack.h
deleted file mode 100644
index 829bdb0d6327..000000000000
--- a/include/asm-generic/pgtable-nop4d-hack.h
+++ /dev/null
@@ -1,64 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _PGTABLE_NOP4D_HACK_H
-#define _PGTABLE_NOP4D_HACK_H
-
-#ifndef __ASSEMBLY__
-#include <asm-generic/5level-fixup.h>
-
-#define __PAGETABLE_PUD_FOLDED 1
-
-/*
- * Having the pud type consist of a pgd gets the size right, and allows
- * us to conceptually access the pgd entry that this pud is folded into
- * without casting.
- */
-typedef struct { pgd_t pgd; } pud_t;
-
-#define PUD_SHIFT  PGDIR_SHIFT
-#define PTRS_PER_PUD   1
-#define PUD_SIZE   (1UL << PUD_SHIFT)
-#define PUD_MASK   (~(PUD_SIZE-1))
-
-/*
- * The "pgd_xxx()" functions here are trivial for a folded two-level
- * setup: the pud is never bad, and a pud always exists (as it's folded
- * into the pgd entry)
- */
-static inline int pgd_none(pgd_t pgd)  { return 0; }
-static inline int pgd_bad(pgd_t pgd)   { return 0; }
-static inline int pgd_present(pgd_t pgd)   { return 1; }
-static inline void pgd_clear(pgd_t *pgd)   { }
-#define pud_ERROR(pud) (pgd_ERROR((pud).pgd))
-
-#define pgd_populate(mm, pgd, pud) do { } while (0)
-#define pgd_populate_safe(mm, pgd, pud)do { } while (0)
-/*
- * (puds are folded into pgds so this doesn't get actually called,
- * but the define is needed for a generic inline function.)
- */
-#define set_pgd(pgdptr, pgdval)set_pud((pud_t *)(pgdptr), (pud_t) { 
pgdval })
-
-static inline pud_t *pud_offset(pgd_t *pgd, unsigned long address)
-{
-   return (pud_t *)pgd;
-}
-
-#define pud_val(x) (pgd_val((x).pgd))
-#define __pud(x)   ((pud_t) { __pgd(x) })
-
-#define pgd_page(pgd)  (pud_page((pud_t){ pgd }))
-#define pgd_page_vaddr(pgd)(pud_page_vaddr((pud_t){ pgd }))
-
-/*
- * allocating and freeing a pud is trivial: the 1-entry pud is
- * inside the pgd, so has no extra memory associated with it.
- */
-#define pud_alloc_one(mm, address) NULL
-#define pud_free(mm, x)do { } while (0)
-#define __pud_free_tlb(tlb, x, a)  do { } while (0)
-
-#undef  pud_addr_end
-#define pud_addr_end(addr, end)(end)
-
-#endif /* __ASSEMBLY__ */
-#endif /* _PGTABLE_NOP4D_HACK_H */
diff --git a/include/asm-generic/pgtable-nopud.h 
b/include/asm-generic/pgtable-nopud.h
index d3776cb494c0..ad05c1684bfc 100644
--- a/include/asm-generic/pgtable-nopud.h
+++ b/include/asm-generic/pgtable-nopud.h
@@ -4,9 +4,6 @@
 
 #ifndef __ASSEMBLY__
 
-#ifdef __ARCH_USE_5LEVEL_HACK
-#include <asm-generic/pgtable-nop4d-hack.h>
-#else
 #include <asm-generic/pgtable-nop4d.h>
 
 #define __PAGETABLE_PUD_FOLDED 1
@@ -65,5 +62,4 @@ static inline pud_t *pud_offset(p4d_t *p4d, unsigned long 
address)
 #define pud_addr_end(addr, end)(end)
 
 #endif /* __ASSEMBLY__ */
-#endif /* !__ARCH_USE_5LEVEL_HACK */
 #endif /* _PGTABLE_NOPUD_H */
-- 
2.24.0



[PATCH v2 11/13] unicore32: remove __ARCH_USE_5LEVEL_HACK

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

The unicore32 architecture has 2-level page tables and uses
asm-generic/pgtable-nopmd.h and explicit casts from pud_t to pgd_t for page
table folding.

Add p4d walk in the only place that actually unfolds the pud level and
remove __ARCH_USE_5LEVEL_HACK.

Signed-off-by: Mike Rapoport 
---
 arch/unicore32/include/asm/pgtable.h | 1 -
 arch/unicore32/kernel/hibernate.c| 4 +++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/unicore32/include/asm/pgtable.h 
b/arch/unicore32/include/asm/pgtable.h
index c8f7ba12f309..82030c32fc05 100644
--- a/arch/unicore32/include/asm/pgtable.h
+++ b/arch/unicore32/include/asm/pgtable.h
@@ -9,7 +9,6 @@
 #ifndef __UNICORE_PGTABLE_H__
 #define __UNICORE_PGTABLE_H__
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 #include 
 
diff --git a/arch/unicore32/kernel/hibernate.c 
b/arch/unicore32/kernel/hibernate.c
index f3812245cc00..ccad051a79b6 100644
--- a/arch/unicore32/kernel/hibernate.c
+++ b/arch/unicore32/kernel/hibernate.c
@@ -33,9 +33,11 @@ struct swsusp_arch_regs swsusp_arch_regs_cpu0;
 static pmd_t *resume_one_md_table_init(pgd_t *pgd)
 {
pud_t *pud;
+   p4d_t *p4d;
pmd_t *pmd_table;
 
-   pud = pud_offset(pgd, 0);
+   p4d = p4d_offset(pgd, 0);
+   pud = pud_offset(p4d, 0);
pmd_table = pmd_offset(pud, 0);
 
return pmd_table;
-- 
2.24.0



[PATCH v2 10/13] sh: add support for folded p4d page tables

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate and remove usage of __ARCH_USE_5LEVEL_HACK.

Signed-off-by: Mike Rapoport 
---
 arch/sh/include/asm/pgtable-2level.h |  1 -
 arch/sh/include/asm/pgtable-3level.h |  1 -
 arch/sh/kernel/io_trapped.c  |  7 ++-
 arch/sh/mm/cache-sh4.c   |  4 +++-
 arch/sh/mm/cache-sh5.c   |  7 ++-
 arch/sh/mm/fault.c   | 26 +++---
 arch/sh/mm/hugetlbpage.c | 28 ++--
 arch/sh/mm/init.c|  9 -
 arch/sh/mm/kmap.c|  2 +-
 arch/sh/mm/tlbex_32.c|  6 +-
 arch/sh/mm/tlbex_64.c|  7 ++-
 11 files changed, 76 insertions(+), 22 deletions(-)

diff --git a/arch/sh/include/asm/pgtable-2level.h 
b/arch/sh/include/asm/pgtable-2level.h
index bf1eb51c3ee5..08bff93927ff 100644
--- a/arch/sh/include/asm/pgtable-2level.h
+++ b/arch/sh/include/asm/pgtable-2level.h
@@ -2,7 +2,6 @@
 #ifndef __ASM_SH_PGTABLE_2LEVEL_H
 #define __ASM_SH_PGTABLE_2LEVEL_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 /*
diff --git a/arch/sh/include/asm/pgtable-3level.h 
b/arch/sh/include/asm/pgtable-3level.h
index 779260b721ca..0f80097e5c9c 100644
--- a/arch/sh/include/asm/pgtable-3level.h
+++ b/arch/sh/include/asm/pgtable-3level.h
@@ -2,7 +2,6 @@
 #ifndef __ASM_SH_PGTABLE_3LEVEL_H
 #define __ASM_SH_PGTABLE_3LEVEL_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 
 /*
diff --git a/arch/sh/kernel/io_trapped.c b/arch/sh/kernel/io_trapped.c
index 60c828a2b8a2..037aab2708b7 100644
--- a/arch/sh/kernel/io_trapped.c
+++ b/arch/sh/kernel/io_trapped.c
@@ -136,6 +136,7 @@ EXPORT_SYMBOL_GPL(match_trapped_io_handler);
 static struct trapped_io *lookup_tiop(unsigned long address)
 {
pgd_t *pgd_k;
+   p4d_t *p4d_k;
pud_t *pud_k;
pmd_t *pmd_k;
pte_t *pte_k;
@@ -145,7 +146,11 @@ static struct trapped_io *lookup_tiop(unsigned long 
address)
if (!pgd_present(*pgd_k))
return NULL;
 
-   pud_k = pud_offset(pgd_k, address);
+   p4d_k = p4d_offset(pgd_k, address);
+   if (!p4d_present(*p4d_k))
+   return NULL;
+
+   pud_k = pud_offset(p4d_k, address);
if (!pud_present(*pud_k))
return NULL;
 
diff --git a/arch/sh/mm/cache-sh4.c b/arch/sh/mm/cache-sh4.c
index eee911422cf9..45943bcb7042 100644
--- a/arch/sh/mm/cache-sh4.c
+++ b/arch/sh/mm/cache-sh4.c
@@ -209,6 +209,7 @@ static void sh4_flush_cache_page(void *args)
unsigned long address, pfn, phys;
int map_coherent = 0;
pgd_t *pgd;
+   p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -224,7 +225,8 @@ static void sh4_flush_cache_page(void *args)
return;
 
pgd = pgd_offset(vma->vm_mm, address);
-   pud = pud_offset(pgd, address);
+   p4d = p4d_offset(pgd, address);
+   pud = pud_offset(p4d, address);
pmd = pmd_offset(pud, address);
pte = pte_offset_kernel(pmd, address);
 
diff --git a/arch/sh/mm/cache-sh5.c b/arch/sh/mm/cache-sh5.c
index 445b5e69b73c..442a77cc2957 100644
--- a/arch/sh/mm/cache-sh5.c
+++ b/arch/sh/mm/cache-sh5.c
@@ -383,6 +383,7 @@ static void sh64_dcache_purge_user_pages(struct mm_struct 
*mm,
unsigned long addr, unsigned long end)
 {
pgd_t *pgd;
+   p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -397,7 +398,11 @@ static void sh64_dcache_purge_user_pages(struct mm_struct 
*mm,
if (pgd_bad(*pgd))
return;
 
-   pud = pud_offset(pgd, addr);
+   p4d = p4d_offset(pgd, addr);
+   if (p4d_none(*p4d) || p4d_bad(*p4d))
+   return;
+
+   pud = pud_offset(p4d, addr);
if (pud_none(*pud) || pud_bad(*pud))
return;
 
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index a2b0275413e8..ebd30003fd06 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -53,6 +53,7 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
 (u64)pgd_val(*pgd));
 
do {
+   p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -65,7 +66,20 @@ static void show_pte(struct mm_struct *mm, unsigned long 
addr)
break;
}
 
-   pud = pud_offset(pgd, addr);
+   p4d = p4d_offset(pgd, addr);
+   if (PTRS_PER_P4D != 1)
+   pr_cont(", *p4d=%0*Lx", (u32)(sizeof(*p4d) * 2),
+   (u64)p4d_val(*p4d));
+
+   if (p4d_none(*p4d))
+   break;
+
+   if (p4d_bad(*p4d)) {
+   pr_cont("(bad)");
+   break;
+   }
+
+   pud = pud_offset(p4d, addr);
if (PTRS_PER_PUD != 1)
  

[PATCH v2 09/13] sh: drop __pXd_offset() macros that duplicate pXd_index() ones

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

The __pXd_offset() macros are identical to the pXd_index() macros and there
is no point in keeping both of them. All architectures define and use
pXd_index(), so let's keep only those to make sh consistent with the rest
of the kernel.

Signed-off-by: Mike Rapoport 
---
 arch/sh/include/asm/pgtable_32.h | 5 ++---
 arch/sh/include/asm/pgtable_64.h | 5 ++---
 arch/sh/mm/init.c| 6 +++---
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h
index 29274f0e428e..4acce5f2cbf9 100644
--- a/arch/sh/include/asm/pgtable_32.h
+++ b/arch/sh/include/asm/pgtable_32.h
@@ -407,13 +407,12 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t 
newprot)
 /* to find an entry in a page-table-directory. */
 #define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
 #define pgd_offset(mm, address)((mm)->pgd + pgd_index(address))
-#define __pgd_offset(address)  pgd_index(address)
 
 /* to find an entry in a kernel page-table-directory */
 #define pgd_offset_k(address)  pgd_offset(&init_mm, address)
 
-#define __pud_offset(address)  (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
-#define __pmd_offset(address)  (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
+#define pud_index(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+#define pmd_index(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
 
 /* Find an entry in the third-level page table.. */
 #define pte_index(address) ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
diff --git a/arch/sh/include/asm/pgtable_64.h b/arch/sh/include/asm/pgtable_64.h
index 1778bc5971e7..27cc282ec6c0 100644
--- a/arch/sh/include/asm/pgtable_64.h
+++ b/arch/sh/include/asm/pgtable_64.h
@@ -46,14 +46,13 @@ static __inline__ void set_pte(pte_t *pteptr, pte_t pteval)
 
 /* To find an entry in a generic PGD. */
 #define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
-#define __pgd_offset(address) pgd_index(address)
 #define pgd_offset(mm, address) ((mm)->pgd+pgd_index(address))
 
 /* To find an entry in a kernel PGD. */
 #define pgd_offset_k(address) pgd_offset(&init_mm, address)
 
-#define __pud_offset(address)  (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
-#define __pmd_offset(address)  (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
+#define pud_index(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+/* #define pmd_index(address)  (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1)) */
 
 /*
  * PMD level access routines. Same notes as above.
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index d1b1ff2be17a..4bab79baee75 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -172,9 +172,9 @@ void __init page_table_range_init(unsigned long start, 
unsigned long end,
unsigned long vaddr;
 
vaddr = start;
-   i = __pgd_offset(vaddr);
-   j = __pud_offset(vaddr);
-   k = __pmd_offset(vaddr);
+   i = pgd_index(vaddr);
+   j = pud_index(vaddr);
+   k = pmd_index(vaddr);
pgd = pgd_base + i;
 
for ( ; (i < PTRS_PER_PGD) && (vaddr != end); pgd++, i++) {
-- 
2.24.0



[PATCH v2 08/13] sh: fault: Modernize printing of kernel messages

2020-02-16 Thread Mike Rapoport
From: Geert Uytterhoeven 

  - Convert from printk() to pr_*(),
  - Add missing continuations,
  - Use "%llx" to format u64,
  - Join multiple prints in show_fault_oops() into a single print.

Signed-off-by: Geert Uytterhoeven 
Signed-off-by: Mike Rapoport 
---
 arch/sh/mm/fault.c | 39 ++-
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 5f51456f4fc7..a2b0275413e8 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -47,10 +47,10 @@ static void show_pte(struct mm_struct *mm, unsigned long 
addr)
pgd = swapper_pg_dir;
}
 
-   printk(KERN_ALERT "pgd = %p\n", pgd);
+   pr_alert("pgd = %p\n", pgd);
pgd += pgd_index(addr);
-   printk(KERN_ALERT "[%08lx] *pgd=%0*Lx", addr,
-  (u32)(sizeof(*pgd) * 2), (u64)pgd_val(*pgd));
+   pr_alert("[%08lx] *pgd=%0*llx", addr, (u32)(sizeof(*pgd) * 2),
+(u64)pgd_val(*pgd));
 
do {
pud_t *pud;
@@ -61,33 +61,33 @@ static void show_pte(struct mm_struct *mm, unsigned long 
addr)
break;
 
if (pgd_bad(*pgd)) {
-   printk("(bad)");
+   pr_cont("(bad)");
break;
}
 
pud = pud_offset(pgd, addr);
if (PTRS_PER_PUD != 1)
-   printk(", *pud=%0*Lx", (u32)(sizeof(*pud) * 2),
-  (u64)pud_val(*pud));
+   pr_cont(", *pud=%0*llx", (u32)(sizeof(*pud) * 2),
+   (u64)pud_val(*pud));
 
if (pud_none(*pud))
break;
 
if (pud_bad(*pud)) {
-   printk("(bad)");
+   pr_cont("(bad)");
break;
}
 
pmd = pmd_offset(pud, addr);
if (PTRS_PER_PMD != 1)
-   printk(", *pmd=%0*Lx", (u32)(sizeof(*pmd) * 2),
-  (u64)pmd_val(*pmd));
+   pr_cont(", *pmd=%0*llx", (u32)(sizeof(*pmd) * 2),
+   (u64)pmd_val(*pmd));
 
if (pmd_none(*pmd))
break;
 
if (pmd_bad(*pmd)) {
-   printk("(bad)");
+   pr_cont("(bad)");
break;
}
 
@@ -96,11 +96,11 @@ static void show_pte(struct mm_struct *mm, unsigned long 
addr)
break;
 
pte = pte_offset_kernel(pmd, addr);
-   printk(", *pte=%0*Lx", (u32)(sizeof(*pte) * 2),
-  (u64)pte_val(*pte));
+   pr_cont(", *pte=%0*llx", (u32)(sizeof(*pte) * 2),
+   (u64)pte_val(*pte));
} while (0);
 
-   printk("\n");
+   pr_cont("\n");
 }
 
 static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
@@ -188,14 +188,11 @@ show_fault_oops(struct pt_regs *regs, unsigned long 
address)
if (!oops_may_print())
return;
 
-   printk(KERN_ALERT "BUG: unable to handle kernel ");
-   if (address < PAGE_SIZE)
-   printk(KERN_CONT "NULL pointer dereference");
-   else
-   printk(KERN_CONT "paging request");
-
-   printk(KERN_CONT " at %08lx\n", address);
-   printk(KERN_ALERT "PC:");
+   pr_alert("BUG: unable to handle kernel %s at %08lx\n",
+address < PAGE_SIZE ? "NULL pointer dereference"
+: "paging request",
+address);
+   pr_alert("PC:");
printk_address(regs->pc, 1);
 
show_pte(NULL, address);
-- 
2.24.0



[PATCH v2 07/13] powerpc: add support for folded p4d page tables

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.

Signed-off-by: Mike Rapoport 
Tested-by: Christophe Leroy  # 8xx and 83xx
---
 arch/powerpc/include/asm/book3s/32/pgtable.h  |  1 -
 arch/powerpc/include/asm/book3s/64/hash.h |  4 +-
 arch/powerpc/include/asm/book3s/64/pgalloc.h  |  4 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 58 ++
 arch/powerpc/include/asm/book3s/64/radix.h|  6 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h  |  1 -
 arch/powerpc/include/asm/nohash/64/pgalloc.h  |  2 +-
 .../include/asm/nohash/64/pgtable-4k.h| 32 +-
 arch/powerpc/include/asm/nohash/64/pgtable.h  |  6 +-
 arch/powerpc/include/asm/pgtable.h|  8 +++
 arch/powerpc/kvm/book3s_64_mmu_radix.c| 59 ---
 arch/powerpc/lib/code-patching.c  |  7 ++-
 arch/powerpc/mm/book3s32/mmu.c|  2 +-
 arch/powerpc/mm/book3s32/tlb.c|  4 +-
 arch/powerpc/mm/book3s64/hash_pgtable.c   |  4 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c  | 19 --
 arch/powerpc/mm/book3s64/subpage_prot.c   |  6 +-
 arch/powerpc/mm/hugetlbpage.c | 28 +
 arch/powerpc/mm/kasan/kasan_init_32.c |  8 +--
 arch/powerpc/mm/mem.c |  4 +-
 arch/powerpc/mm/nohash/40x.c  |  4 +-
 arch/powerpc/mm/nohash/book3e_pgtable.c   | 15 +++--
 arch/powerpc/mm/pgtable.c | 25 +++-
 arch/powerpc/mm/pgtable_32.c  | 28 +
 arch/powerpc/mm/pgtable_64.c  | 10 ++--
 arch/powerpc/mm/ptdump/hashpagetable.c| 20 ++-
 arch/powerpc/mm/ptdump/ptdump.c   | 22 ++-
 arch/powerpc/xmon/xmon.c  | 17 +-
 28 files changed, 284 insertions(+), 120 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 5b39c11e884a..39ec11371be0 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -2,7 +2,6 @@
 #ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
 #define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include 
 
 #include 
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 2781ebf6add4..876d1528c2cf 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -134,9 +134,9 @@ static inline int get_region_id(unsigned long ea)
 
 #definehash__pmd_bad(pmd)  (pmd_val(pmd) & H_PMD_BAD_BITS)
 #definehash__pud_bad(pud)  (pud_val(pud) & H_PUD_BAD_BITS)
-static inline int hash__pgd_bad(pgd_t pgd)
+static inline int hash__p4d_bad(p4d_t p4d)
 {
-   return (pgd_val(pgd) == 0);
+   return (p4d_val(p4d) == 0);
 }
 #ifdef CONFIG_STRICT_KERNEL_RWX
 extern void hash__mark_rodata_ro(void);
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index a41e91bd0580..69c5b051734f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -85,9 +85,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
 }
 
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+static inline void p4d_populate(struct mm_struct *mm, p4d_t *pgd, pud_t *pud)
 {
-   *pgd =  __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
+   *pgd =  __p4d(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
 }
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 201a69e6a355..bafff0ab 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
 #define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
 
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
 
 #ifndef __ASSEMBLY__
 #include <linux/mmdebug.h>
@@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
 /* Bits to mask out from a PUD to get to the PMD page */
 #define PUD_MASKED_BITS	0xc0000000000000ffUL
 /* Bits to mask out from a PGD to get to the PUD page */
-#define PGD_MASKED_BITS	0xc0000000000000ffUL
+#define P4D_MASKED_BITS	0xc0000000000000ffUL
 
 /*
  * Used as an indicator for rcu callback functions
@@ -949,54 +949,60 @@ static inline bool pud_access_permitted(pud_t pud, bool 
write)
return pte_access_permitted(pud_pte(pud), write);
 }
 
-#define pgd_write(pgd) pte_write(pgd_pte(pgd))
+#define __p4d_raw(x)   ((p4d_t) { __pgd_raw(x) })
+static inline __be64 p4d_raw(p4d_t x)
+{
+   return pgd_raw(x.pgd);
+}
+
+#define p4d_write(p4d) 

[PATCH v2 06/13] openrisc: add support for folded p4d page tables

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate and remove usage of __ARCH_USE_5LEVEL_HACK.

Signed-off-by: Mike Rapoport 
---
 arch/openrisc/include/asm/pgtable.h |  1 -
 arch/openrisc/mm/fault.c| 10 --
 arch/openrisc/mm/init.c |  4 +++-
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/openrisc/include/asm/pgtable.h 
b/arch/openrisc/include/asm/pgtable.h
index 248d22d8faa7..c072943fc721 100644
--- a/arch/openrisc/include/asm/pgtable.h
+++ b/arch/openrisc/include/asm/pgtable.h
@@ -21,7 +21,6 @@
 #ifndef __ASM_OPENRISC_PGTABLE_H
 #define __ASM_OPENRISC_PGTABLE_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 #ifndef __ASSEMBLY__
diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
index 5d4d3a9691d0..44aa04545de3 100644
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -296,6 +296,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, 
unsigned long address,
 
int offset = pgd_index(address);
pgd_t *pgd, *pgd_k;
+   p4d_t *p4d, *p4d_k;
pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;
pte_t *pte_k;
@@ -322,8 +323,13 @@ asmlinkage void do_page_fault(struct pt_regs *regs, 
unsigned long address,
 * it exists.
 */
 
-   pud = pud_offset(pgd, address);
-   pud_k = pud_offset(pgd_k, address);
+   p4d = p4d_offset(pgd, address);
+   p4d_k = p4d_offset(pgd_k, address);
+   if (!p4d_present(*p4d_k))
+   goto no_context;
+
+   pud = pud_offset(p4d, address);
+   pud_k = pud_offset(p4d_k, address);
if (!pud_present(*pud_k))
goto no_context;
 
diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c
index 1f87b524db78..2536aeae0975 100644
--- a/arch/openrisc/mm/init.c
+++ b/arch/openrisc/mm/init.c
@@ -71,6 +71,7 @@ static void __init map_ram(void)
unsigned long v, p, e;
pgprot_t prot;
pgd_t *pge;
+   p4d_t *p4e;
pud_t *pue;
pmd_t *pme;
pte_t *pte;
@@ -90,7 +91,8 @@ static void __init map_ram(void)
 
while (p < e) {
int j;
-   pue = pud_offset(pge, v);
+   p4e = p4d_offset(pge, v);
+   pue = pud_offset(p4e, v);
pme = pmd_offset(pue, v);
 
if ((u32) pue != (u32) pge || (u32) pme != (u32) pge) {
-- 
2.24.0



[PATCH v2 05/13] nios2: add support for folded p4d page tables

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate and remove usage of __ARCH_USE_5LEVEL_HACK.

Signed-off-by: Mike Rapoport 
---
 arch/nios2/include/asm/pgtable.h | 3 +--
 arch/nios2/mm/fault.c| 9 +++--
 arch/nios2/mm/ioremap.c  | 6 +-
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index 99985d8b7166..54305aa09b74 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -22,7 +22,6 @@
 #include 
 
 #include <asm/pgtable-bits.h>
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 #define FIRST_USER_ADDRESS 0UL
@@ -100,7 +99,7 @@ extern pte_t invalid_pte_table[PAGE_SIZE/sizeof(pte_t)];
  */
 static inline void set_pmd(pmd_t *pmdptr, pmd_t pmdval)
 {
-   pmdptr->pud.pgd.pgd = pmdval.pud.pgd.pgd;
+   *pmdptr = pmdval;
 }
 
 /* to find an entry in a page-table-directory */
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 6a2e716b959f..d3da995665c3 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -245,6 +245,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, 
unsigned long cause,
 */
int offset = pgd_index(address);
pgd_t *pgd, *pgd_k;
+   p4d_t *p4d, *p4d_k;
pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;
pte_t *pte_k;
@@ -256,8 +257,12 @@ asmlinkage void do_page_fault(struct pt_regs *regs, 
unsigned long cause,
goto no_context;
set_pgd(pgd, *pgd_k);
 
-   pud = pud_offset(pgd, address);
-   pud_k = pud_offset(pgd_k, address);
+   p4d = p4d_offset(pgd, address);
+   p4d_k = p4d_offset(pgd_k, address);
+   if (!p4d_present(*p4d_k))
+   goto no_context;
+   pud = pud_offset(p4d, address);
+   pud_k = pud_offset(p4d_k, address);
if (!pud_present(*pud_k))
goto no_context;
pmd = pmd_offset(pud, address);
diff --git a/arch/nios2/mm/ioremap.c b/arch/nios2/mm/ioremap.c
index 819bdfcc2e71..fe821efb9a99 100644
--- a/arch/nios2/mm/ioremap.c
+++ b/arch/nios2/mm/ioremap.c
@@ -86,11 +86,15 @@ static int remap_area_pages(unsigned long address, unsigned 
long phys_addr,
if (address >= end)
BUG();
do {
+   p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
 
error = -ENOMEM;
-		pud = pud_alloc(&init_mm, dir, address);
+		p4d = p4d_alloc(&init_mm, dir, address);
+		if (!p4d)
+			break;
+		pud = pud_alloc(&init_mm, p4d, address);
 		if (!pud)
 			break;
 		pmd = pmd_alloc(&init_mm, pud, address);
-- 
2.24.0



[PATCH v2 04/13] ia64: add support for folded p4d page tables

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate, remove usage of __ARCH_USE_5LEVEL_HACK and replace
5level-fixup.h with pgtable-nop4d.h

Signed-off-by: Mike Rapoport 
---
 arch/ia64/include/asm/pgalloc.h |  4 ++--
 arch/ia64/include/asm/pgtable.h | 17 -
 arch/ia64/mm/fault.c|  7 ++-
 arch/ia64/mm/hugetlbpage.c  | 18 --
 arch/ia64/mm/init.c | 28 
 5 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/arch/ia64/include/asm/pgalloc.h b/arch/ia64/include/asm/pgalloc.h
index f4c491044882..2a3050345099 100644
--- a/arch/ia64/include/asm/pgalloc.h
+++ b/arch/ia64/include/asm/pgalloc.h
@@ -36,9 +36,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 
 #if CONFIG_PGTABLE_LEVELS == 4
 static inline void
-pgd_populate(struct mm_struct *mm, pgd_t * pgd_entry, pud_t * pud)
+p4d_populate(struct mm_struct *mm, p4d_t * p4d_entry, pud_t * pud)
 {
-   pgd_val(*pgd_entry) = __pa(pud);
+   p4d_val(*p4d_entry) = __pa(pud);
 }
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index d602e7c622db..c87f789bc914 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -283,12 +283,12 @@ extern unsigned long VMALLOC_END;
 #define pud_page(pud)  virt_to_page((pud_val(pud) + 
PAGE_OFFSET))
 
 #if CONFIG_PGTABLE_LEVELS == 4
-#define pgd_none(pgd)  (!pgd_val(pgd))
-#define pgd_bad(pgd)   (!ia64_phys_addr_valid(pgd_val(pgd)))
-#define pgd_present(pgd)   (pgd_val(pgd) != 0UL)
-#define pgd_clear(pgdp)(pgd_val(*(pgdp)) = 0UL)
-#define pgd_page_vaddr(pgd)((unsigned long) __va(pgd_val(pgd) & 
_PFN_MASK))
-#define pgd_page(pgd)  virt_to_page((pgd_val(pgd) + 
PAGE_OFFSET))
+#define p4d_none(p4d)  (!p4d_val(p4d))
+#define p4d_bad(p4d)   (!ia64_phys_addr_valid(p4d_val(p4d)))
+#define p4d_present(p4d)   (p4d_val(p4d) != 0UL)
+#define p4d_clear(p4dp)(p4d_val(*(p4dp)) = 0UL)
+#define p4d_page_vaddr(p4d)((unsigned long) __va(p4d_val(p4d) & 
_PFN_MASK))
+#define p4d_page(p4d)  virt_to_page((p4d_val(p4d) + 
PAGE_OFFSET))
 #endif
 
 /*
@@ -388,7 +388,7 @@ pgd_offset (const struct mm_struct *mm, unsigned long 
address)
 #if CONFIG_PGTABLE_LEVELS == 4
 /* Find an entry in the second-level page table.. */
 #define pud_offset(dir,addr) \
-   ((pud_t *) pgd_page_vaddr(*(dir)) + (((addr) >> PUD_SHIFT) & 
(PTRS_PER_PUD - 1)))
+   ((pud_t *) p4d_page_vaddr(*(dir)) + (((addr) >> PUD_SHIFT) & 
(PTRS_PER_PUD - 1)))
 #endif
 
 /* Find an entry in the third-level page table.. */
@@ -582,10 +582,9 @@ extern struct page *zero_page_memmap_ptr;
 
 
 #if CONFIG_PGTABLE_LEVELS == 3
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 #endif
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
 #include <asm-generic/pgtable.h>
 
 #endif /* _ASM_IA64_PGTABLE_H */
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index c2f299fe9e04..ec994135cb74 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -29,6 +29,7 @@ static int
 mapped_kernel_page_is_present (unsigned long address)
 {
pgd_t *pgd;
+   p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *ptep, pte;
@@ -37,7 +38,11 @@ mapped_kernel_page_is_present (unsigned long address)
if (pgd_none(*pgd) || pgd_bad(*pgd))
return 0;
 
-   pud = pud_offset(pgd, address);
+   p4d = p4d_offset(pgd, address);
+   if (p4d_none(*p4d) || p4d_bad(*p4d))
+   return 0;
+
+   pud = pud_offset(p4d, address);
if (pud_none(*pud) || pud_bad(*pud))
return 0;
 
diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c
index d16e419fd712..32352a73df0c 100644
--- a/arch/ia64/mm/hugetlbpage.c
+++ b/arch/ia64/mm/hugetlbpage.c
@@ -30,12 +30,14 @@ huge_pte_alloc(struct mm_struct *mm, unsigned long addr, 
unsigned long sz)
 {
unsigned long taddr = htlbpage_to_page(addr);
pgd_t *pgd;
+   p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
 
pgd = pgd_offset(mm, taddr);
-   pud = pud_alloc(mm, pgd, taddr);
+   p4d = p4d_offset(pgd, taddr);
+   pud = pud_alloc(mm, p4d, taddr);
if (pud) {
pmd = pmd_alloc(mm, pud, taddr);
if (pmd)
@@ -49,17 +51,21 @@ huge_pte_offset (struct mm_struct *mm, unsigned long addr, 
unsigned long sz)
 {
unsigned long taddr = htlbpage_to_page(addr);
pgd_t *pgd;
+   p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
 
pgd = pgd_offset(mm, taddr);
if (pgd_present(*pgd)) {
-   pud = pud_offset(pgd, taddr);
-   if (pud_present(*pud)) {

[PATCH v2 03/13] hexagon: remove __ARCH_USE_5LEVEL_HACK

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

The hexagon architecture has 2-level page tables, and as such most of the
page table folding is already implemented in asm-generic/pgtable-nopmd.h.

Fixup the only place in arch/hexagon to unfold the p4d level and remove
__ARCH_USE_5LEVEL_HACK.

Signed-off-by: Mike Rapoport 
---
 arch/hexagon/include/asm/fixmap.h  | 4 ++--
 arch/hexagon/include/asm/pgtable.h | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/hexagon/include/asm/fixmap.h 
b/arch/hexagon/include/asm/fixmap.h
index 933dac167504..97b1b062e750 100644
--- a/arch/hexagon/include/asm/fixmap.h
+++ b/arch/hexagon/include/asm/fixmap.h
@@ -16,7 +16,7 @@
 #include 
 
 #define kmap_get_fixmap_pte(vaddr) \
-   pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr), \
-   (vaddr)), (vaddr)), (vaddr))
+   pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr), 
\
+   (vaddr)), (vaddr)), (vaddr)), (vaddr))
 
 #endif
diff --git a/arch/hexagon/include/asm/pgtable.h 
b/arch/hexagon/include/asm/pgtable.h
index 2fec20ad939e..83b544936eed 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -12,7 +12,6 @@
  * Page table definitions for Qualcomm Hexagon processor.
  */
 #include <asm/page.h>
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 /* A handy thing to have if one has the RAM. Declared in head.S */
-- 
2.24.0



[PATCH v2 02/13] h8300: remove usage of __ARCH_USE_5LEVEL_HACK

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

h8300 is a nommu architecture and does not require fixup for upper layers
of the page tables because it is already handled by the generic nommu
implementation.

Remove definition of __ARCH_USE_5LEVEL_HACK in
arch/h8300/include/asm/pgtable.h

Signed-off-by: Mike Rapoport 
---
 arch/h8300/include/asm/pgtable.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/h8300/include/asm/pgtable.h b/arch/h8300/include/asm/pgtable.h
index 4d00152fab58..f00828720dc4 100644
--- a/arch/h8300/include/asm/pgtable.h
+++ b/arch/h8300/include/asm/pgtable.h
@@ -1,7 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _H8300_PGTABLE_H
 #define _H8300_PGTABLE_H
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 #include <asm-generic/pgtable.h>
 extern void paging_init(void);
-- 
2.24.0



[PATCH v2 01/13] arm/arm64: add support for folded p4d page tables

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
remove __ARCH_USE_5LEVEL_HACK.

Since arm and arm64 share kvm memory management bits, make the conversion
for both variants at once to avoid breaking the builds in the middle.

Signed-off-by: Mike Rapoport 
---
 arch/arm/include/asm/kvm_mmu.h  |   5 +-
 arch/arm/include/asm/pgtable.h  |   1 -
 arch/arm/include/asm/stage2_pgtable.h   |  15 +-
 arch/arm/lib/uaccess_with_memcpy.c  |   9 +-
 arch/arm/mach-sa1100/assabet.c  |   2 +-
 arch/arm/mm/dump.c  |  29 +++-
 arch/arm/mm/fault-armv.c|   7 +-
 arch/arm/mm/fault.c |  28 +++-
 arch/arm/mm/idmap.c |   3 +-
 arch/arm/mm/init.c  |   2 +-
 arch/arm/mm/ioremap.c   |  12 +-
 arch/arm/mm/mm.h|   2 +-
 arch/arm/mm/mmu.c   |  35 +++-
 arch/arm/mm/pgd.c   |  40 -
 arch/arm64/include/asm/kvm_mmu.h|  10 +-
 arch/arm64/include/asm/pgalloc.h|  10 +-
 arch/arm64/include/asm/pgtable-types.h  |   5 +-
 arch/arm64/include/asm/pgtable.h|  37 +++--
 arch/arm64/include/asm/stage2_pgtable.h |  48 --
 arch/arm64/kernel/hibernate.c   |  44 -
 arch/arm64/mm/fault.c   |   9 +-
 arch/arm64/mm/hugetlbpage.c |  15 +-
 arch/arm64/mm/kasan_init.c  |  26 ++-
 arch/arm64/mm/mmu.c |  52 --
 arch/arm64/mm/pageattr.c|   7 +-
 virt/kvm/arm/mmu.c  | 209 
 26 files changed, 522 insertions(+), 140 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 0d84d50bf9ba..8c511bb99e4c 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -68,7 +68,8 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_mk_pmd(ptep)   __pmd(__pa(ptep) | PMD_TYPE_TABLE)
 #define kvm_mk_pud(pmdp)   __pud(__pa(pmdp) | PMD_TYPE_TABLE)
-#define kvm_mk_pgd(pudp)   ({ BUILD_BUG(); 0; })
+#define kvm_mk_p4d(pudp)   ({ BUILD_BUG(); __p4d(0); })
+#define kvm_mk_pgd(p4dp)   ({ BUILD_BUG(); 0; })
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
@@ -194,10 +195,12 @@ static inline bool kvm_page_empty(void *ptr)
 #define kvm_pte_table_empty(kvm, ptep) kvm_page_empty(ptep)
 #define kvm_pmd_table_empty(kvm, pmdp) kvm_page_empty(pmdp)
 #define kvm_pud_table_empty(kvm, pudp) false
+#define kvm_p4d_table_empty(kvm, p4dp) false
 
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 #define hyp_pmd_table_empty(pmdp) kvm_page_empty(pmdp)
 #define hyp_pud_table_empty(pudp) false
+#define hyp_p4d_table_empty(p4dp) false
 
 struct kvm;
 
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index eabcb48a7840..9e3464842dfc 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -17,7 +17,6 @@
 
 #else
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 #include <asm/memory.h>
 #include <asm/pgtable-hwdef.h>
diff --git a/arch/arm/include/asm/stage2_pgtable.h 
b/arch/arm/include/asm/stage2_pgtable.h
index aaceec7855ec..7ed66e216a5e 100644
--- a/arch/arm/include/asm/stage2_pgtable.h
+++ b/arch/arm/include/asm/stage2_pgtable.h
@@ -19,8 +19,17 @@
 #define stage2_pgd_none(kvm, pgd)  pgd_none(pgd)
 #define stage2_pgd_clear(kvm, pgd) pgd_clear(pgd)
 #define stage2_pgd_present(kvm, pgd)   pgd_present(pgd)
-#define stage2_pgd_populate(kvm, pgd, pud) pgd_populate(NULL, pgd, pud)
-#define stage2_pud_offset(kvm, pgd, address)   pud_offset(pgd, address)
+#define stage2_pgd_populate(kvm, pgd, p4d) pgd_populate(NULL, pgd, p4d)
+
+#define stage2_p4d_offset(kvm, pgd, address)   p4d_offset(pgd, address)
+#define stage2_p4d_free(kvm, p4d)  do { } while (0)
+
+#define stage2_p4d_none(kvm, p4d)  p4d_none(p4d)
+#define stage2_p4d_clear(kvm, p4d) p4d_clear(p4d)
+#define stage2_p4d_present(kvm, p4d)   p4d_present(p4d)
+#define stage2_p4d_populate(kvm, p4d, pud) p4d_populate(NULL, p4d, pud)
+
+#define stage2_pud_offset(kvm, p4d, address)   pud_offset(p4d, address)
 #define stage2_pud_free(kvm, pud)  do { } while (0)
 
 #define stage2_pud_none(kvm, pud)  pud_none(pud)
@@ -41,6 +50,7 @@ stage2_pgd_addr_end(struct kvm *kvm, phys_addr_t addr, 
phys_addr_t end)
return (boundary - 1 < end - 1) ? boundary : end;
 }
 
+#define stage2_p4d_addr_end(kvm, addr, end)(end)
 #define stage2_pud_addr_end(kvm, addr, end)(end)
 
 static inline phys_addr_t
@@ -56,6 +66,7 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, 
phys_addr_t end)
 #define stage2_pte_table_empty(kvm, ptep)  kvm_page_empty(ptep)
 #define stage2_pmd_table_empty(kvm, pmdp)  

[PATCH v2 00/13] mm: remove __ARCH_HAS_5LEVEL_HACK

2020-02-16 Thread Mike Rapoport
From: Mike Rapoport 

Hi,

These patches convert several architectures to use page table folding and
remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.

The changes are mostly about mechanical replacement of pgd accessors with p4d
ones and the addition of higher levels to page table traversals.

All the patches were sent separately to the respective arch lists and
maintainers hence the "v2" prefix.
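
For reference, the mechanical pattern is the same everywhere; a typical
page table walk gains one explicit step (illustrative sketch, not taken
from any single patch):

	pgd = pgd_offset(mm, addr);
	p4d = p4d_offset(pgd, addr);	/* new level, folds away on 4-level archs */
	pud = pud_offset(p4d, addr);	/* was: pud_offset(pgd, addr) */
	pmd = pmd_offset(pud, addr);
	pte = pte_offset_kernel(pmd, addr);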

Geert Uytterhoeven (1):
  sh: fault: Modernize printing of kernel messages

Mike Rapoport (12):
  arm/arm64: add support for folded p4d page tables
  h8300: remove usage of __ARCH_USE_5LEVEL_HACK
  hexagon: remove __ARCH_USE_5LEVEL_HACK
  ia64: add support for folded p4d page tables
  nios2: add support for folded p4d page tables
  openrisc: add support for folded p4d page tables
  powerpc: add support for folded p4d page tables
  sh: drop __pXd_offset() macros that duplicate pXd_index() ones
  sh: add support for folded p4d page tables
  unicore32: remove __ARCH_USE_5LEVEL_HACK
  asm-generic: remove pgtable-nop4d-hack.h
  mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h

 arch/arm/include/asm/kvm_mmu.h|   5 +-
 arch/arm/include/asm/pgtable.h|   1 -
 arch/arm/include/asm/stage2_pgtable.h |  15 +-
 arch/arm/lib/uaccess_with_memcpy.c|   9 +-
 arch/arm/mach-sa1100/assabet.c|   2 +-
 arch/arm/mm/dump.c|  29 ++-
 arch/arm/mm/fault-armv.c  |   7 +-
 arch/arm/mm/fault.c   |  28 ++-
 arch/arm/mm/idmap.c   |   3 +-
 arch/arm/mm/init.c|   2 +-
 arch/arm/mm/ioremap.c |  12 +-
 arch/arm/mm/mm.h  |   2 +-
 arch/arm/mm/mmu.c |  35 ++-
 arch/arm/mm/pgd.c |  40 +++-
 arch/arm64/include/asm/kvm_mmu.h  |  10 +-
 arch/arm64/include/asm/pgalloc.h  |  10 +-
 arch/arm64/include/asm/pgtable-types.h|   5 +-
 arch/arm64/include/asm/pgtable.h  |  37 ++--
 arch/arm64/include/asm/stage2_pgtable.h   |  48 +++-
 arch/arm64/kernel/hibernate.c |  44 +++-
 arch/arm64/mm/fault.c |   9 +-
 arch/arm64/mm/hugetlbpage.c   |  15 +-
 arch/arm64/mm/kasan_init.c|  26 ++-
 arch/arm64/mm/mmu.c   |  52 +++--
 arch/arm64/mm/pageattr.c  |   7 +-
 arch/h8300/include/asm/pgtable.h  |   1 -
 arch/hexagon/include/asm/fixmap.h |   4 +-
 arch/hexagon/include/asm/pgtable.h|   1 -
 arch/ia64/include/asm/pgalloc.h   |   4 +-
 arch/ia64/include/asm/pgtable.h   |  17 +-
 arch/ia64/mm/fault.c  |   7 +-
 arch/ia64/mm/hugetlbpage.c|  18 +-
 arch/ia64/mm/init.c   |  28 ++-
 arch/nios2/include/asm/pgtable.h  |   3 +-
 arch/nios2/mm/fault.c |   9 +-
 arch/nios2/mm/ioremap.c   |   6 +-
 arch/openrisc/include/asm/pgtable.h   |   1 -
 arch/openrisc/mm/fault.c  |  10 +-
 arch/openrisc/mm/init.c   |   4 +-
 arch/powerpc/include/asm/book3s/32/pgtable.h  |   1 -
 arch/powerpc/include/asm/book3s/64/hash.h |   4 +-
 arch/powerpc/include/asm/book3s/64/pgalloc.h  |   4 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  58 +++--
 arch/powerpc/include/asm/book3s/64/radix.h|   6 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h  |   1 -
 arch/powerpc/include/asm/nohash/64/pgalloc.h  |   2 +-
 .../include/asm/nohash/64/pgtable-4k.h|  32 +--
 arch/powerpc/include/asm/nohash/64/pgtable.h  |   6 +-
 arch/powerpc/include/asm/pgtable.h|   8 +
 arch/powerpc/kvm/book3s_64_mmu_radix.c|  59 -
 arch/powerpc/lib/code-patching.c  |   7 +-
 arch/powerpc/mm/book3s32/mmu.c|   2 +-
 arch/powerpc/mm/book3s32/tlb.c|   4 +-
 arch/powerpc/mm/book3s64/hash_pgtable.c   |   4 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c  |  19 +-
 arch/powerpc/mm/book3s64/subpage_prot.c   |   6 +-
 arch/powerpc/mm/hugetlbpage.c |  28 ++-
 arch/powerpc/mm/kasan/kasan_init_32.c |   8 +-
 arch/powerpc/mm/mem.c |   4 +-
 arch/powerpc/mm/nohash/40x.c  |   4 +-
 arch/powerpc/mm/nohash/book3e_pgtable.c   |  15 +-
 arch/powerpc/mm/pgtable.c |  25 ++-
 arch/powerpc/mm/pgtable_32.c  |  28 ++-
 arch/powerpc/mm/pgtable_64.c  |  10 +-
 arch/powerpc/mm/ptdump/hashpagetable.c|  20 +-
 arch/powerpc/mm/ptdump/ptdump.c   |  22 +-
 arch/powerpc/xmon/xmon.c  |  17 +-
 arch/sh/include/asm/pgtable-2level.h  |   1 -
 arch/sh/include/asm/pgtable-3level.h