Re: [PATCH v10 3/3] tty: samsung_tty: 32-bit access for TX/RX hold registers

2020-05-18 Thread Jiri Slaby
On 06. 05. 20, 10:02, Hyunki Koo wrote:
> Support 32-bit access for the TX/RX hold registers UTXH and URXH.
> 
> This is required for some newer SoCs.
> 
> Signed-off-by: Hyunki Koo 
> Reviewed-by: Krzysztof Kozlowski 
> Tested on Odroid HC1 (Exynos5422):
> Tested-by: Krzysztof Kozlowski 
> ---
>  drivers/tty/serial/samsung_tty.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 57 insertions(+), 5 deletions(-)
> 
> 
> diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
> index 326b0164609c..6ef614d8648c 100644
> --- a/drivers/tty/serial/samsung_tty.c
> +++ b/drivers/tty/serial/samsung_tty.c
...
> @@ -2000,10 +2023,27 @@ static int s3c24xx_serial_probe(struct platform_device *pdev)
>   dev_get_platdata(&pdev->dev) :
>   ourport->drv_data->def_cfg;
>  
> - if (np)
> + if (np) {
>   of_property_read_u32(np,
>   "samsung,uart-fifosize", &ourport->port.fifosize);
>  
> + if (of_property_read_u32(np, "reg-io-width", &prop) == 0) {
> + switch (prop) {
> + case 1:
> + ourport->port.iotype = UPIO_MEM;
> + break;
> + case 4:
> + ourport->port.iotype = UPIO_MEM32;
> + break;
> + default:
> + dev_warn(&pdev->dev, "unsupported reg-io-width (%d)\n",
> +          prop);
> + ret = -EINVAL;
> + break;

This ret value is unused. Did you intend to return here?
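
For the record, a sketch of what the error path could look like, based only
on the hunk quoted above (untested; return directly as shown, or jump to the
existing unwind label if anything needs releasing):

		if (of_property_read_u32(np, "reg-io-width", &prop) == 0) {
			switch (prop) {
			case 1:
				ourport->port.iotype = UPIO_MEM;
				break;
			case 4:
				ourport->port.iotype = UPIO_MEM32;
				break;
			default:
				dev_warn(&pdev->dev,
					 "unsupported reg-io-width (%d)\n", prop);
				return -EINVAL;	/* propagate instead of dropping ret */
			}
		}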

thanks,
-- 
js
suse labs


Re: [PATCH] mm: use only pidfd for process_madvise syscall

2020-05-18 Thread Minchan Kim
Hi Andrew,

On Mon, May 18, 2020 at 04:06:56PM -0700, Andrew Morton wrote:
> On Mon, 18 May 2020 14:13:50 -0700 Minchan Kim  wrote:
> 
> > Andrew, I sent this patch without folding into previous syscall introducing
> > patches because it could be arguable. If you want to fold it into each
> > patchset (i.e., introducing process_madvise syscall and introducing
> > compat_syscall), let me know it. I will send partial diff to each
> > patchset.
> 
> It doesn't seem necessary - I believe we'll get a clean result if I
> squish all of these:
> 
> mm-support-vector-address-ranges-for-process_madvise-fix.patch
> mm-support-vector-address-ranges-for-process_madvise-fix-fix.patch
> mm-support-vector-address-ranges-for-process_madvise-fix-fix-fix.patch
> mm-support-vector-address-ranges-for-process_madvise-fix-fix-fix-fix.patch
> mm-support-vector-address-ranges-for-process_madvise-fix-fix-fix-fix-fix.patch
> mm-use-only-pidfd-for-process_madvise-syscall.patch
> 
> into mm-support-vector-address-ranges-for-process_madvise.patch and
> make the appropriate changelog adjustments?
> 

If you want to fold them all, please use the description below for
mm-support-vector-address-ranges-for-process_madvise.patch.

Thanks.

===== 8< =====

Subject: [PATCH] mm: support vector address ranges for process_madvise

This patch changes the process_madvise interface:
  a) support vector address ranges in a single system call
  b) support the vector address ranges for the local process as well as
     an external process
  c) remove pid and keep only pidfd in the arguments - [1][2]
  d) change the type of flags to unsigned int

An Android app has thousands of vmas due to zygote, so it is a total waste
of CPU and power to issue the syscall once per vma.  (Comparing 2000
one-vma syscalls against a single vectored syscall showed a 15%
performance improvement.  I think it would be bigger in real practice
because the testing ran in a very cache-friendly environment.)

Another potential use case for the vector range is to amortize the cost of
TLB shootdowns for multiple ranges when using MADV_DONTNEED; this could
benefit users like TCP receive zerocopy and malloc implementations.  In the
future, we may find more use cases for other advice values, so let's make
this part of the API now, while we are introducing a new syscall.  With
that, existing madvise(2) users can switch to process_madvise(2) with
their own pid if they want batched address range support.

So finally, the API is as follows,

  ssize_t process_madvise(int pidfd, const struct iovec *iovec,
unsigned long vlen, int advice, unsigned int flags);

DESCRIPTION
  The process_madvise() system call is used to give advice or directions
  to the kernel about address ranges of an external process as well as
  the local process. It applies the advice to the address ranges of the
  process described by iovec and vlen. The goal of such advice is to
  improve system or application performance.

  The pidfd argument selects the process referred to by the PID file
  descriptor specified in pidfd. (See pidfd_open(2) for further
  information.)

  The pointer iovec points to an array of iovec structures, defined in
  <sys/uio.h> as:

struct iovec {
void *iov_base; /* starting address */
size_t iov_len; /* number of bytes to be advised */
};

  Each iovec element describes an address range beginning at the
  address (iov_base) and extending for the given number of bytes
  (iov_len).

  The vlen represents the number of elements in iovec.

  The advice to apply is indicated in the advice argument; if the target
  process specified by pidfd is external, it must currently be one of
  the following:

MADV_COLD
MADV_PAGEOUT
MADV_MERGEABLE
MADV_UNMERGEABLE

  Permission to provide a hint to an external process is governed by a
  ptrace access mode PTRACE_MODE_ATTACH_FSCREDS check; see ptrace(2).

  process_madvise() supports every advice madvise(2) has if the target
  process is in the same thread group as the calling process, so users
  can treat process_madvise(2) as an extension of madvise(2) with vector
  address range support.

RETURN VALUE
  On success, process_madvise() returns the number of bytes advised.
  This return value may be less than the total number of requested
  bytes if an error occurred. The caller should check the return value
  to determine whether partial advice occurred.
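
For illustration, a minimal user-space sketch of the batched call under the
API above.  The syscall number, the MADV_COLD fallback value, and the lack
of a libc wrapper are assumptions here; check your headers before relying
on them:

    #include <sys/mman.h>
    #include <sys/syscall.h>
    #include <sys/uio.h>
    #include <unistd.h>

    #ifndef __NR_process_madvise
    #define __NR_process_madvise 440	/* assumption; verify on your arch */
    #endif
    #ifndef MADV_COLD
    #define MADV_COLD 20		/* from <asm-generic/mman-common.h> */
    #endif

    /* Advise two ranges of the target process in one call. */
    static ssize_t advise_cold(int pidfd, void *a, size_t la, void *b, size_t lb)
    {
    	struct iovec vec[2] = {
    		{ .iov_base = a, .iov_len = la },
    		{ .iov_base = b, .iov_len = lb },
    	};

    	return syscall(__NR_process_madvise, pidfd, vec, 2UL, MADV_COLD, 0U);
    }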

[1] 
https://lore.kernel.org/linux-mm/20200509124817.xmrvsrq3mla6b76k@wittgenstein/
[2] 
https://lore.kernel.org/linux-mm/9d849087-3359-c4ab-fbec-859e8186c...@virtuozzo.com/
Reviewed-by: Suren Baghdasaryan 
Signed-off-by: Minchan Kim 


Re: [PATCH V2] ifcvf: move IRQ request/free to status change handlers

2020-05-18 Thread Jason Wang



On 2020/5/19 9:51 AM, Cindy Lu wrote:

Hi, Jason
It works OK in the latest version of the qemu vdpa code, so I think the
patch is fine.
Thanks
Cindy



Thanks for the testing. (Btw, we'd better not do top posting when 
discussing in the community.)


So,

Acked-by: Jason Wang 




On Wed, May 13, 2020 at 3:18 PM Jason Wang  wrote:


On 2020/5/13 12:42 PM, Zhu, Lingshan wrote:


On 5/13/2020 12:12 PM, Jason Wang wrote:

On 2020/5/12 4:00 PM, Zhu Lingshan wrote:

This commit moves the IRQ request and free operations from probe()
to the VIRTIO status change handler to comply with the VIRTIO spec.

VIRTIO spec 1.1, section 2.1.2 Device Requirements: Device Status Field
The device MUST NOT consume buffers or send any used buffer
notifications to the driver before DRIVER_OK.


This comment needs to be checked as I said previously. It's only
needed if we're sure ifcvf can generate interrupts before DRIVER_OK.



Signed-off-by: Zhu Lingshan 
---
changes from V1:
remove ifcvf_stop_datapath() in status == 0 handler, we don't need
to do this
twice; handle status == 0 after DRIVER_OK -> !DRIVER_OK handler
(Jason Wang)


Patch looks good to me, but with this patch ping cannot work on my
machine. (It works without this patch).

Thanks

This is strange, it works on my machines, let's have a check offline.

Thanks,
BR
Zhu Lingshan


I gave it a try with virtio-vdpa and a tiny userspace. Either works.

So it could be an issue of qemu codes.

Let's wait for Cindy to test if it really works.

Thanks
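
For readers following along, the pattern under discussion boils down to
something like this sketch (illustrative only, not the actual ifcvf code;
the request/free helpers are hypothetical):

	static void set_status_sketch(struct my_vdpa_dev *dev, u8 status)
	{
		u8 old = dev->status;

		/* Request IRQs only once the driver sets DRIVER_OK... */
		if ((status & VIRTIO_CONFIG_S_DRIVER_OK) &&
		    !(old & VIRTIO_CONFIG_S_DRIVER_OK))
			my_request_irqs(dev);		/* hypothetical helper */

		/* ...and free them as soon as DRIVER_OK is cleared, so the
		 * device cannot notify the driver outside DRIVER_OK. */
		if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
		    (old & VIRTIO_CONFIG_S_DRIVER_OK))
			my_free_irqs(dev);		/* hypothetical helper */

		dev->status = status;
	}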






[PATCH v4 11/45] powerpc/ptdump: Standardise display of BAT flags

2020-05-18 Thread Christophe Leroy
Display BAT flags the same way as page flags: rwx and wimg

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/ptdump/bats.c | 37 ++-
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/bats.c b/arch/powerpc/mm/ptdump/bats.c
index d6c660f63d71..cebb58c7e289 100644
--- a/arch/powerpc/mm/ptdump/bats.c
+++ b/arch/powerpc/mm/ptdump/bats.c
@@ -15,12 +15,12 @@
 static char *pp_601(int k, int pp)
 {
if (pp == 0)
-   return k ? "NA" : "RWX";
+   return k ? "   " : "rwx";
if (pp == 1)
-   return k ? "ROX" : "RWX";
+   return k ? "r x" : "rwx";
if (pp == 2)
-   return k ? "RWX" : "RWX";
-   return k ? "ROX" : "ROX";
+   return "rwx";
+   return "r x";
 }
 
 static void bat_show_601(struct seq_file *m, int idx, u32 lower, u32 upper)
@@ -48,12 +48,9 @@ static void bat_show_601(struct seq_file *m, int idx, u32 lower, u32 upper)
 
	seq_printf(m, "Kernel %s User %s", pp_601(k & 2, pp), pp_601(k & 1, pp));
 
-   if (lower & _PAGE_WRITETHRU)
-   seq_puts(m, "write through ");
-   if (lower & _PAGE_NO_CACHE)
-   seq_puts(m, "no cache ");
-   if (lower & _PAGE_COHERENT)
-   seq_puts(m, "coherent ");
+   seq_puts(m, lower & _PAGE_WRITETHRU ? "w " : "  ");
+   seq_puts(m, lower & _PAGE_NO_CACHE ? "i " : "  ");
+   seq_puts(m, lower & _PAGE_COHERENT ? "m " : "  ");
seq_puts(m, "\n");
 }
 
@@ -101,20 +98,16 @@ static void bat_show_603(struct seq_file *m, int idx, u32 lower, u32 upper, bool
seq_puts(m, "Kernel/User ");
 
if (lower & BPP_RX)
-   seq_puts(m, is_d ? "RO " : "EXEC ");
+   seq_puts(m, is_d ? "r   " : "  x ");
else if (lower & BPP_RW)
-   seq_puts(m, is_d ? "RW " : "EXEC ");
+   seq_puts(m, is_d ? "rw  " : "  x ");
else
-   seq_puts(m, is_d ? "NA " : "NX   ");
-
-   if (lower & _PAGE_WRITETHRU)
-   seq_puts(m, "write through ");
-   if (lower & _PAGE_NO_CACHE)
-   seq_puts(m, "no cache ");
-   if (lower & _PAGE_COHERENT)
-   seq_puts(m, "coherent ");
-   if (lower & _PAGE_GUARDED)
-   seq_puts(m, "guarded ");
+   seq_puts(m, is_d ? "    " : "    ");
+
+   seq_puts(m, lower & _PAGE_WRITETHRU ? "w " : "  ");
+   seq_puts(m, lower & _PAGE_NO_CACHE ? "i " : "  ");
+   seq_puts(m, lower & _PAGE_COHERENT ? "m " : "  ");
+   seq_puts(m, lower & _PAGE_GUARDED ? "g " : "  ");
seq_puts(m, "\n");
 }
 
-- 
2.25.0



[PATCH v4 10/45] powerpc/ptdump: Display size of BATs

2020-05-18 Thread Christophe Leroy
Display the size of areas mapped with BATs.

For that, the size display used for pages is refactored into a helper.
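
As a side note, the unit-reduction logic being factored out behaves like
this stand-alone sketch (user-space C, for illustration only; the input is
a size in kilobytes, as in dump_addr()):

	#include <stdio.h>

	static void dump_size(unsigned long size)
	{
		static const char units[] = "KMGTPE";
		const char *unit = units;

		/* Divide by 1024 while the size is a whole multiple of the
		 * next unit, then print with the matching unit letter. */
		while (!(size & 1023) && unit[1]) {
			size >>= 10;
			unit++;
		}
		printf("%9lu%c\n", size, *unit);
	}

	int main(void)
	{
		dump_size(512);		/* prints "      512K" */
		dump_size(8192);	/* prints "        8M" */
		return 0;
	}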

Signed-off-by: Christophe Leroy 
---
v2: Add missing include of linux/seq_file.h (Thanks to kbuild test robot)
---
 arch/powerpc/mm/ptdump/bats.c   |  4 
 arch/powerpc/mm/ptdump/ptdump.c | 23 ++-
 arch/powerpc/mm/ptdump/ptdump.h |  3 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/bats.c b/arch/powerpc/mm/ptdump/bats.c
index d3a5d6b318d1..d6c660f63d71 100644
--- a/arch/powerpc/mm/ptdump/bats.c
+++ b/arch/powerpc/mm/ptdump/bats.c
@@ -10,6 +10,8 @@
 #include 
 #include 
 
+#include "ptdump.h"
+
 static char *pp_601(int k, int pp)
 {
if (pp == 0)
@@ -42,6 +44,7 @@ static void bat_show_601(struct seq_file *m, int idx, u32 lower, u32 upper)
 #else
seq_printf(m, "0x%08x ", pbn);
 #endif
+   pt_dump_size(m, size);
 
	seq_printf(m, "Kernel %s User %s", pp_601(k & 2, pp), pp_601(k & 1, pp));
 
@@ -88,6 +91,7 @@ static void bat_show_603(struct seq_file *m, int idx, u32 lower, u32 upper, bool
 #else
seq_printf(m, "0x%08x ", brpn);
 #endif
+   pt_dump_size(m, size);
 
if (k == 1)
seq_puts(m, "User ");
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index d92bb8ea229c..1f97668853e3 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -112,6 +112,19 @@ static struct addr_marker address_markers[] = {
seq_putc(m, c); \
 })
 
+void pt_dump_size(struct seq_file *m, unsigned long size)
+{
+   static const char units[] = "KMGTPE";
+   const char *unit = units;
+
+   /* Work out what appropriate unit to use */
+   while (!(size & 1023) && unit[1]) {
+   size >>= 10;
+   unit++;
+   }
+   pt_dump_seq_printf(m, "%9lu%c ", size, *unit);
+}
+
 static void dump_flag_info(struct pg_state *st, const struct flag_info
*flag, u64 pte, int num)
 {
@@ -146,8 +159,6 @@ static void dump_flag_info(struct pg_state *st, const struct flag_info
 
 static void dump_addr(struct pg_state *st, unsigned long addr)
 {
-   static const char units[] = "KMGTPE";
-   const char *unit = units;
unsigned long delta;
 
 #ifdef CONFIG_PPC64
@@ -164,13 +175,7 @@ static void dump_addr(struct pg_state *st, unsigned long addr)
pt_dump_seq_printf(st->seq, " " REG " ", st->start_pa);
delta = (addr - st->start_address) >> 10;
}
-   /* Work out what appropriate unit to use */
-   while (!(delta & 1023) && unit[1]) {
-   delta >>= 10;
-   unit++;
-   }
-   pt_dump_seq_printf(st->seq, "%9lu%c", delta, *unit);
-
+   pt_dump_size(st->seq, delta);
 }
 
 static void note_prot_wx(struct pg_state *st, unsigned long addr)
diff --git a/arch/powerpc/mm/ptdump/ptdump.h b/arch/powerpc/mm/ptdump/ptdump.h
index 5d513636de73..154efae96ae0 100644
--- a/arch/powerpc/mm/ptdump/ptdump.h
+++ b/arch/powerpc/mm/ptdump/ptdump.h
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include 
+#include 
 
 struct flag_info {
u64 mask;
@@ -17,3 +18,5 @@ struct pgtable_level {
 };
 
 extern struct pgtable_level pg_level[5];
+
+void pt_dump_size(struct seq_file *m, unsigned long delta);
-- 
2.25.0



[PATCH v4 00/45] Use hugepages to map kernel mem on 8xx

2020-05-18 Thread Christophe Leroy
The main purpose of this big series is to:
- reorganise huge page handling to avoid using mm_slices.
- use huge pages to map kernel memory on the 8xx.

The 8xx supports 4 page sizes: 4k, 16k, 512k and 8M.
It uses 2 Level page tables, PGD having 1024 entries, each entry
covering 4M address space. Then each page table has 1024 entries.

At the time being, page sizes are managed in PGD entries, implying
the use of mm_slices, as a single page table cannot mix several
page sizes.

The first purpose of this series is to reorganise things so that
standard page tables can also handle 512k pages. This is done by
adding a new _PAGE_HUGE flag which will be copied into the Level 1
entry in the TLB miss handler. That done, we have 2 types of PGD entries:
- PGD entries to regular page tables handling 4k/16k and 512k pages
- PGD entries to hugepd tables handling 8M pages.

There is no need to mix 8M pages with other sizes, because an 8M page
uses more than what a single PGD entry covers.

Then comes the second purpose of this series. At the time being, the
8xx has implemented special handling in the TLB miss handlers in order
to transparently map kernel linear address space and the IMMR using
huge pages by building the TLB entries in assembly at the time of the
exception.

As mm_slices is only for user space pages, and also because it would
not be convenient to slice the kernel address space anyway, it was not
possible to use huge pages for the kernel address space. After step
one of the series, using huge pages there becomes much more flexible.

This series drops all assembly 'just in time' handling of huge pages
and uses huge pages in page tables instead.

Once the above is done, then comes the icing on the cake:
- Use huge pages for KASAN shadow mapping
- Allow pinned TLBs with strict kernel rwx
- Allow pinned TLBs with debug pagealloc

Then, last but not least, those modifications for the 8xx allow the
following improvements on book3s/32:
- Mapping KASAN shadow with BATs
- Allowing BATs with debug pagealloc

All this considerably simplifies the TLB miss handlers and the associated
initialisation. The overhead of reading page tables is negligible
compared to the reduction of the miss handlers.

While we were touching pte_update(), some cleanup was done
there too.

Tested widely on 8xx and 832x. Boot tested on QEMU MAC99.

Changes in v4:
- Rebased on top of powerpc/next following the merge of prefix instructions 
series.

Changes in v3:
- Fixed the handling of leaf pages' page size, which didn't build on PPC64 
and was invisibly bogus on PPC32 (patch 12)

Changes in v2:
- Select HUGETLBFS instead of HUGETLB_PAGE, which led to a link failure.
- Rebase on latest powerpc/merge branch
- Reworked the way TLBs 28 to 31 are pinned because it was not working.

Christophe Leroy (45):
  powerpc/kasan: Fix error detection on memory allocation
  powerpc/kasan: Fix issues by lowering KASAN_SHADOW_END
  powerpc/kasan: Fix shadow pages allocation failure
  powerpc/kasan: Remove unnecessary page table locking
  powerpc/kasan: Refactor update of early shadow mappings
  powerpc/kasan: Declare kasan_init_region() weak
  powerpc/ptdump: Limit size of flags text to 1/2 chars on PPC32
  powerpc/ptdump: Reorder flags
  powerpc/ptdump: Add _PAGE_COHERENT flag
  powerpc/ptdump: Display size of BATs
  powerpc/ptdump: Standardise display of BAT flags
  powerpc/ptdump: Properly handle non standard page size
  powerpc/ptdump: Handle hugepd at PGD level
  powerpc/32s: Don't warn when mapping RO data ROX.
  powerpc/mm: Allocate static page tables for fixmap
  powerpc/mm: Fix conditions to perform MMU specific management by
blocks on PPC32.
  powerpc/mm: PTE_ATOMIC_UPDATES is only for 40x
  powerpc/mm: Refactor pte_update() on nohash/32
  powerpc/mm: Refactor pte_update() on book3s/32
  powerpc/mm: Standardise __ptep_test_and_clear_young() params between
PPC32 and PPC64
  powerpc/mm: Standardise pte_update() prototype between PPC32 and PPC64
  powerpc/mm: Create a dedicated pte_update() for 8xx
  powerpc/mm: Reduce hugepd size for 8M hugepages on 8xx
  powerpc/8xx: Drop CONFIG_8xx_COPYBACK option
  powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.
  powerpc/8xx: Manage 512k huge pages as standard pages.
  powerpc/8xx: Only 8M pages are hugepte pages now
  powerpc/8xx: MM_SLICE is not needed anymore
  powerpc/8xx: Move PPC_PIN_TLB options into 8xx Kconfig
  powerpc/8xx: Add function to set pinned TLBs
  powerpc/8xx: Don't set IMMR map anymore at boot
  powerpc/8xx: Always pin TLBs at startup.
  powerpc/8xx: Drop special handling of Linear and IMMR mappings in I/D
TLB handlers
  powerpc/8xx: Remove now unused TLB miss functions
  powerpc/8xx: Move DTLB perf handling closer.
  powerpc/mm: Don't be too strict with _etext alignment on PPC32
  powerpc/8xx: Refactor kernel address boundary comparison
  powerpc/8xx: Add a function to early map kernel via huge pages
  powerpc/8xx: Map IMMR with a huge page
  powerpc/8xx: Map linear memory with huge 

[PATCH v4 07/45] powerpc/ptdump: Limit size of flags text to 1/2 chars on PPC32

2020-05-18 Thread Christophe Leroy
In order to have all flags fit on an 80-char-wide screen,
reduce the flags to 1 char (2 where needed to avoid ambiguity).

No cache is 'i'
User is 'ur' (Supervisor would be 'sr')
Shared (for 8xx) becomes 'sh' (it was displayed as 'user' when not
shared, but that was misleading as not entirely accurate)

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/ptdump/8xx.c| 33 ---
 arch/powerpc/mm/ptdump/shared.c | 35 +
 2 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/8xx.c b/arch/powerpc/mm/ptdump/8xx.c
index 9e2d8e847d6e..ca9ce94672f5 100644
--- a/arch/powerpc/mm/ptdump/8xx.c
+++ b/arch/powerpc/mm/ptdump/8xx.c
@@ -12,9 +12,9 @@
 static const struct flag_info flag_array[] = {
{
.mask   = _PAGE_SH,
-   .val= 0,
-   .set= "user",
-   .clear  = "",
+   .val= _PAGE_SH,
+   .set= "sh",
+   .clear  = "  ",
}, {
.mask   = _PAGE_RO | _PAGE_NA,
.val= 0,
@@ -30,37 +30,38 @@ static const struct flag_info flag_array[] = {
}, {
.mask   = _PAGE_EXEC,
.val= _PAGE_EXEC,
-   .set= " X ",
-   .clear  = "   ",
+   .set= "x",
+   .clear  = " ",
}, {
.mask   = _PAGE_PRESENT,
.val= _PAGE_PRESENT,
-   .set= "present",
-   .clear  = "   ",
+   .set= "p",
+   .clear  = " ",
}, {
.mask   = _PAGE_GUARDED,
.val= _PAGE_GUARDED,
-   .set= "guarded",
-   .clear  = "   ",
+   .set= "g",
+   .clear  = " ",
}, {
.mask   = _PAGE_DIRTY,
.val= _PAGE_DIRTY,
-   .set= "dirty",
-   .clear  = " ",
+   .set= "d",
+   .clear  = " ",
}, {
.mask   = _PAGE_ACCESSED,
.val= _PAGE_ACCESSED,
-   .set= "accessed",
-   .clear  = "",
+   .set= "a",
+   .clear  = " ",
}, {
.mask   = _PAGE_NO_CACHE,
.val= _PAGE_NO_CACHE,
-   .set= "no cache",
-   .clear  = "",
+   .set= "i",
+   .clear  = " ",
}, {
.mask   = _PAGE_SPECIAL,
.val= _PAGE_SPECIAL,
-   .set= "special",
+   .set= "s",
+   .clear  = " ",
}
 };
 
diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index f7ed2f187cb0..44a8a64a664f 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -13,8 +13,8 @@ static const struct flag_info flag_array[] = {
{
.mask   = _PAGE_USER,
.val= _PAGE_USER,
-   .set= "user",
-   .clear  = "",
+   .set= "ur",
+   .clear  = "  ",
}, {
.mask   = _PAGE_RW,
.val= _PAGE_RW,
@@ -23,42 +23,43 @@ static const struct flag_info flag_array[] = {
}, {
.mask   = _PAGE_EXEC,
.val= _PAGE_EXEC,
-   .set= " X ",
-   .clear  = "   ",
+   .set= "x",
+   .clear  = " ",
}, {
.mask   = _PAGE_PRESENT,
.val= _PAGE_PRESENT,
-   .set= "present",
-   .clear  = "   ",
+   .set= "p",
+   .clear  = " ",
}, {
.mask   = _PAGE_GUARDED,
.val= _PAGE_GUARDED,
-   .set= "guarded",
-   .clear  = "   ",
+   .set= "g",
+   .clear  = " ",
}, {
.mask   = _PAGE_DIRTY,
.val= _PAGE_DIRTY,
-   .set= "dirty",
-   .clear  = " ",
+   .set= "d",
+   .clear  = " ",
}, {
.mask   = _PAGE_ACCESSED,
.val= _PAGE_ACCESSED,
-   .set= "accessed",
-   .clear  = "",
+   .set= "a",
+   .clear  = " ",
}, {
.mask   = _PAGE_WRITETHRU,
.val= _PAGE_WRITETHRU,
-   .set= "write through",
-   .clear  = " ",
+   .set= "w",
+   .clear  = " ",
}, {
.mask   = _PAGE_NO_CACHE,
.val= _PAGE_NO_CACHE,
-   .set= "no cache",
-   .clear  = "",
+   .set= "i",
+   .clear  = " ",
   

[PATCH v4 03/45] powerpc/kasan: Fix shadow pages allocation failure

2020-05-18 Thread Christophe Leroy
Doing KASAN page allocation in MMU_init is too early: the kernel doesn't
have access yet to the entire memory space, and memblock_alloc() fails
when the kernel is a bit big.

Do it from kasan_init() instead.

Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/kasan.h  | 2 --
 arch/powerpc/mm/init_32.c | 2 --
 arch/powerpc/mm/kasan/kasan_init_32.c | 4 +++-
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
index fc900937f653..4769bbf7173a 100644
--- a/arch/powerpc/include/asm/kasan.h
+++ b/arch/powerpc/include/asm/kasan.h
@@ -27,12 +27,10 @@
 
 #ifdef CONFIG_KASAN
 void kasan_early_init(void);
-void kasan_mmu_init(void);
 void kasan_init(void);
 void kasan_late_init(void);
 #else
 static inline void kasan_init(void) { }
-static inline void kasan_mmu_init(void) { }
 static inline void kasan_late_init(void) { }
 #endif
 
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 872df48ae41b..a6991ef8727d 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -170,8 +170,6 @@ void __init MMU_init(void)
btext_unmap();
 #endif
 
-   kasan_mmu_init();
-
setup_kup();
 
/* Shortly after that, the entire linear mapping will be available */
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index 8b15fe09b967..b7c287adfd59 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -131,7 +131,7 @@ static void __init kasan_unmap_early_shadow_vmalloc(void)
flush_tlb_kernel_range(k_start, k_end);
 }
 
-void __init kasan_mmu_init(void)
+static void __init kasan_mmu_init(void)
 {
int ret;
struct memblock_region *reg;
@@ -159,6 +159,8 @@ void __init kasan_mmu_init(void)
 
 void __init kasan_init(void)
 {
+   kasan_mmu_init();
+
kasan_remap_early_shadow_ro();
 
clear_page(kasan_early_shadow_page);
-- 
2.25.0



[PATCH v4 29/45] powerpc/8xx: Move PPC_PIN_TLB options into 8xx Kconfig

2020-05-18 Thread Christophe Leroy
PPC_PIN_TLB options are dedicated to the 8xx, move them into
the 8xx Kconfig.

While we are at it, add some text to explain what it does.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig   | 20 ---
 arch/powerpc/platforms/8xx/Kconfig | 41 ++
 2 files changed, 41 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 30e2111ca15d..1d4ef4f27dec 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1227,26 +1227,6 @@ config TASK_SIZE
hex "Size of user task space" if TASK_SIZE_BOOL
	default "0x80000000" if PPC_8xx
	default "0xc0000000"
-
-config PIN_TLB
-   bool "Pinned Kernel TLBs (860 ONLY)"
-   depends on ADVANCED_OPTIONS && PPC_8xx && \
-  !DEBUG_PAGEALLOC && !STRICT_KERNEL_RWX
-
-config PIN_TLB_DATA
-   bool "Pinned TLB for DATA"
-   depends on PIN_TLB
-   default y
-
-config PIN_TLB_IMMR
-   bool "Pinned TLB for IMMR"
-   depends on PIN_TLB || PPC_EARLY_DEBUG_CPM
-   default y
-
-config PIN_TLB_TEXT
-   bool "Pinned TLB for TEXT"
-   depends on PIN_TLB
-   default y
 endmenu
 
 if PPC64
diff --git a/arch/powerpc/platforms/8xx/Kconfig b/arch/powerpc/platforms/8xx/Kconfig
index b37de62d7e7f..0d036cd868ef 100644
--- a/arch/powerpc/platforms/8xx/Kconfig
+++ b/arch/powerpc/platforms/8xx/Kconfig
@@ -162,4 +162,45 @@ config UCODE_PATCH
default y
depends on !NO_UCODE_PATCH
 
+menu "8xx advanced setup"
+   depends on PPC_8xx
+
+config PIN_TLB
+   bool "Pinned Kernel TLBs"
+   depends on ADVANCED_OPTIONS && !DEBUG_PAGEALLOC && !STRICT_KERNEL_RWX
+   help
+ On the 8xx, we have 32 instruction TLBs and 32 data TLBs. In each
+ table 4 TLBs can be pinned.
+
+ It reduces the amount of usable TLBs to 28 (i.e. by 12%). That's the
+ reason why we make it selectable.
+
+ This option does nothing by itself; it just enables selecting what
+ to pin.
+
+config PIN_TLB_DATA
+   bool "Pinned TLB for DATA"
+   depends on PIN_TLB
+   default y
+   help
+ This pins the first 32 Mbytes of memory with 8M pages.
+
+config PIN_TLB_IMMR
+   bool "Pinned TLB for IMMR"
+   depends on PIN_TLB || PPC_EARLY_DEBUG_CPM
+   default y
+   help
+ This pins the IMMR area with a 512kbytes page. In case
+ CONFIG_PIN_TLB_DATA is also selected, it will reduce
+ CONFIG_PIN_TLB_DATA to 24 Mbytes.
+
+config PIN_TLB_TEXT
+   bool "Pinned TLB for TEXT"
+   depends on PIN_TLB
+   default y
+   help
+ This pins kernel text with 8M pages.
+
+endmenu
+
 endmenu
-- 
2.25.0



[PATCH v4 23/45] powerpc/mm: Reduce hugepd size for 8M hugepages on 8xx

2020-05-18 Thread Christophe Leroy
Commit 55c8fc3f4930 ("powerpc/8xx: reintroduce 16K pages with HW
assistance") redefined pte_t as a struct of 4 pte_basic_t, because
in 16K pages mode there are four identical entries in the page table.
But hugepd entries for 8M pages require only one entry of size
pte_basic_t. So there is no point in creating a cache for 4-entry
page tables.

Calculate PTE_T_ORDER using the size of pte_basic_t instead of pte_t.
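
For concreteness, the arithmetic on an 8xx/16k configuration works out as
follows (a worked example, not part of the patch):

	/* 16k mode: pte_t is 4 x pte_basic_t = 16 bytes; pte_basic_t and
	 * void * are both 4 bytes.
	 *
	 * old: __builtin_ffs(sizeof(pte_t)) - __builtin_ffs(sizeof(void *))
	 *    = __builtin_ffs(16) - __builtin_ffs(4) = 5 - 3 = 2,
	 *      i.e. each entry sized as 4 pointers (the full 4-entry pte_t);
	 * new: __builtin_ffs(sizeof(pte_basic_t)) - __builtin_ffs(sizeof(void *))
	 *    = __builtin_ffs(4) - __builtin_ffs(4) = 3 - 3 = 0,
	 *      i.e. each entry sized as a single pte_basic_t, which is all
	 *      an 8M hugepd entry needs.
	 */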

Define specific huge_pte helpers (set_huge_pte_at(), huge_pte_clear(),
huge_ptep_set_wrprotect()) to write the pte in a single entry instead
of using set_pte_at() which writes 4 identical entries in 16k pages
mode. Also make sure that __ptep_set_access_flags() properly handle
the huge_pte case.

Define set_pte_filter() inline otherwise GCC doesn't inline it anymore
because it is now used twice, and that gives a pretty suboptimal code
because of pte_t being a struct of 4 entries.

Those functions are also used for 512k pages, which only require one
entry as well, although replicating it four times was harmless, as 512k
page entries are spread every 128 bytes in the table.

Signed-off-by: Christophe Leroy 
---
 .../include/asm/nohash/32/hugetlb-8xx.h   | 20 ++
 arch/powerpc/include/asm/nohash/32/pgtable.h  |  3 ++-
 arch/powerpc/mm/hugetlbpage.c |  3 ++-
 arch/powerpc/mm/pgtable.c | 26 ---
 4 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
index a46616937d20..785437323576 100644
--- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
@@ -41,4 +41,24 @@ static inline int check_and_get_huge_psize(int shift)
return shift_to_mmu_psize(shift);
 }
 
+#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte);
+
+#define __HAVE_ARCH_HUGE_PTE_CLEAR
+static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, unsigned long sz)
+{
+   pte_update(mm, addr, ptep, ~0UL, 0, 1);
+}
+
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
+static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+  unsigned long addr, pte_t *ptep)
+{
+   unsigned long clr = ~pte_val(pte_wrprotect(__pte(~0)));
+   unsigned long set = pte_val(pte_wrprotect(__pte(0)));
+
+   pte_update(mm, addr, ptep, clr, set, 1);
+}
+
 #endif /* _ASM_POWERPC_NOHASH_32_HUGETLB_8XX_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 5fb3f6798e22..ff78bf25f832 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -314,8 +314,9 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
	pte_t pte_clr = pte_mkyoung(pte_mkdirty(pte_mkwrite(pte_mkexec(__pte(~0)))));
unsigned long set = pte_val(entry) & pte_val(pte_set);
unsigned long clr = ~pte_val(entry) & ~pte_val(pte_clr);
+   int huge = psize > mmu_virtual_psize ? 1 : 0;
 
-   pte_update(vma->vm_mm, address, ptep, clr, set, 0);
+   pte_update(vma->vm_mm, address, ptep, clr, set, huge);
 
flush_tlb_page(vma, address);
 }
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index d06efb946c7d..521929a371af 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -30,7 +30,8 @@ bool hugetlb_disabled = false;
 
 #define hugepd_none(hpd)   (hpd_val(hpd) == 0)
 
-#define PTE_T_ORDER    (__builtin_ffs(sizeof(pte_t)) - __builtin_ffs(sizeof(void *)))
+#define PTE_T_ORDER    (__builtin_ffs(sizeof(pte_basic_t)) - \
+                        __builtin_ffs(sizeof(void *)))
 
pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz)
 {
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index e3759b69f81b..214a5f4beb6c 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -100,7 +100,7 @@ static pte_t set_pte_filter_hash(pte_t pte) { return pte; }
  * as we don't have two bits to spare for _PAGE_EXEC and _PAGE_HWEXEC so
  * instead we "filter out" the exec permission for non clean pages.
  */
-static pte_t set_pte_filter(pte_t pte)
+static inline pte_t set_pte_filter(pte_t pte)
 {
struct page *pg;
 
@@ -249,16 +249,34 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 
 #else
/*
-* Not used on non book3s64 platforms. But 8xx
-* can possibly use tsize derived from hstate.
+* Not used on non book3s64 platforms.
+* 8xx compares it with mmu_virtual_psize to
+* know if it is a huge page or not.
 */
-   psize = 0;
+   psize = MMU_PAGE_COUNT;
 #endif

[PATCH v4 21/45] powerpc/mm: Standardise pte_update() prototype between PPC32 and PPC64

2020-05-18 Thread Christophe Leroy
PPC64 takes 3 additional parameters compared to PPC32:
- mm
- address
- huge

These 3 parameters will be needed in order to perform different
actions depending on the page size on the 8xx.

Make pte_update() prototype identical for PPC32 and PPC64.

This allows dropping an #ifdef in huge_ptep_get_and_clear().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 15 ---
 arch/powerpc/include/asm/hugetlb.h   |  4 
 arch/powerpc/include/asm/nohash/32/pgtable.h | 13 +++--
 3 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 25c59511fcab..8a091d125f2d 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -218,7 +218,7 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot);
  */
 
 #define pte_clear(mm, addr, ptep) \
-   do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
+   do { pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0); } while (0)
 
 #define pmd_none(pmd)  (!pmd_val(pmd))
 #definepmd_bad(pmd)(pmd_val(pmd) & _PMD_BAD)
@@ -254,7 +254,8 @@ extern void flush_hash_entry(struct mm_struct *mm, pte_t *ptep,
  * when using atomic updates, only the low part of the PTE is
  * accessed atomically.
  */
-static inline pte_basic_t pte_update(pte_t *p, unsigned long clr, unsigned long set)
+static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *p,
+                                    unsigned long clr, unsigned long set, int huge)
 {
pte_basic_t old;
unsigned long tmp;
@@ -292,7 +293,7 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
  unsigned long addr, pte_t *ptep)
 {
unsigned long old;
-   old = pte_update(ptep, _PAGE_ACCESSED, 0);
+   old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
if (old & _PAGE_HASHPTE) {
unsigned long ptephys = __pa(ptep) & PAGE_MASK;
flush_hash_pages(mm->context.id, addr, ptephys, 1);
@@ -306,14 +307,14 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
   pte_t *ptep)
 {
-   return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
+   return __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0));
 }
 
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
  pte_t *ptep)
 {
-   pte_update(ptep, _PAGE_RW, 0);
+   pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
 }
 
 static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
@@ -324,7 +325,7 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
unsigned long set = pte_val(entry) &
(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
 
-   pte_update(ptep, 0, set);
+   pte_update(vma->vm_mm, address, ptep, 0, set, 0);
 
flush_tlb_page(vma, address);
 }
@@ -522,7 +523,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
  | (pte_val(pte) & ~_PAGE_HASHPTE));
else
-   pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
+   pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, pte_val(pte), 0);
 
 #elif defined(CONFIG_PTE_64BIT)
/* Second case is 32-bit with 64-bit PTE.  In this case, we
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index bd6504c28c2f..e4276af034e9 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -40,11 +40,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
unsigned long addr, pte_t *ptep)
 {
-#ifdef CONFIG_PPC64
return __pte(pte_update(mm, addr, ptep, ~0UL, 0, 1));
-#else
-   return __pte(pte_update(ptep, ~0UL, 0));
-#endif
 }
 
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index e963e6880d7c..474dd1db065f 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -166,7 +166,7 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot);
 #ifndef __ASSEMBLY__
 
 #define pte_clear(mm, addr, ptep) \
-   do { pte_update(ptep, ~0, 0); } while (0)
+   do { pte_update(mm, addr, ptep, ~0, 0, 0); } while (0)
 
 #ifndef pte_mkwrite
 static inline pte_t pte_mkwrite(pte_t pte)
@@ -222,7 +222,8 @@ static inline void 

[PATCH v4 06/45] powerpc/kasan: Declare kasan_init_region() weak

2020-05-18 Thread Christophe Leroy
In order to allow sub-arches to allocate KASAN regions using optimised
methods (huge pages on 8xx, BATs on BOOK3S, ...), declare
kasan_init_region() weak.

Also make kasan_init_shadow_page_tables() accessible from outside,
so that it can be called from the specific kasan_init_region()
functions if needed.

And populate the remaining KASAN address space only once the region
mapping has been done, to allow the 8xx to allocate hugepd instead of
standard page tables for mappings via 8M hugepages.
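
A sub-arch override would then look roughly like this (an illustrative
sketch, not the actual 8xx implementation; block_map_shadow() is a
hypothetical helper):

	/* Strong definition overriding the weak generic kasan_init_region(). */
	int __init kasan_init_region(void *start, size_t size)
	{
		unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
		unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);

		/* Map the shadow of this region with an optimised method
		 * (huge pages, BATs, ...) instead of 4k page tables. */
		return block_map_shadow(k_start, k_end);
	}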

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/kasan.h  |  3 +++
 arch/powerpc/mm/kasan/kasan_init_32.c | 21 +++--
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
index 4769bbf7173a..107a24c3f7b3 100644
--- a/arch/powerpc/include/asm/kasan.h
+++ b/arch/powerpc/include/asm/kasan.h
@@ -34,5 +34,8 @@ static inline void kasan_init(void) { }
 static inline void kasan_late_init(void) { }
 #endif
 
+int kasan_init_shadow_page_tables(unsigned long k_start, unsigned long k_end);
+int kasan_init_region(void *start, size_t size);
+
 #endif /* __ASSEMBLY */
 #endif
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index 10481d904fea..76d418af4ce8 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -28,7 +28,7 @@ static void __init kasan_populate_pte(pte_t *ptep, pgprot_t prot)
	__set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot), 0);
 }
 
-static int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned long k_end)
+int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned long k_end)
 {
pmd_t *pmd;
unsigned long k_cur, k_next;
@@ -52,7 +52,7 @@ static int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned
return 0;
 }
 
-static int __init kasan_init_region(void *start, size_t size)
+int __init __weak kasan_init_region(void *start, size_t size)
 {
unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);
@@ -122,14 +122,6 @@ static void __init kasan_mmu_init(void)
int ret;
struct memblock_region *reg;
 
-   if (early_mmu_has_feature(MMU_FTR_HPTE_TABLE) ||
-   IS_ENABLED(CONFIG_KASAN_VMALLOC)) {
-   ret = kasan_init_shadow_page_tables(KASAN_SHADOW_START, KASAN_SHADOW_END);
-
-   if (ret)
-   panic("kasan: kasan_init_shadow_page_tables() failed");
-   }
-
for_each_memblock(memory, reg) {
phys_addr_t base = reg->base;
phys_addr_t top = min(base + reg->size, total_lowmem);
@@ -141,6 +133,15 @@ static void __init kasan_mmu_init(void)
if (ret)
panic("kasan: kasan_init_region() failed");
}
+
+   if (early_mmu_has_feature(MMU_FTR_HPTE_TABLE) ||
+   IS_ENABLED(CONFIG_KASAN_VMALLOC)) {
+   ret = kasan_init_shadow_page_tables(KASAN_SHADOW_START, KASAN_SHADOW_END);
+
+   if (ret)
+   panic("kasan: kasan_init_shadow_page_tables() failed");
+   }
+
 }
 
 void __init kasan_init(void)
-- 
2.25.0



[PATCH v4 20/45] powerpc/mm: Standardise __ptep_test_and_clear_young() params between PPC32 and PPC64

2020-05-18 Thread Christophe Leroy
On PPC32, __ptep_test_and_clear_young() takes the mm->context.id

In preparation for standardising the pte_update() params between PPC32 and
PPC64, __ptep_test_and_clear_young() needs mm instead of mm->context.id.

Replace context param by mm.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 7 ---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index d2fc324cdf07..25c59511fcab 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -288,18 +288,19 @@ static inline pte_basic_t pte_update(pte_t *p, unsigned long clr, unsigned long
  * for our hash-based implementation, we fix that up here.
  */
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-static inline int __ptep_test_and_clear_young(unsigned int context, unsigned long addr, pte_t *ptep)
+static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
 {
unsigned long old;
old = pte_update(ptep, _PAGE_ACCESSED, 0);
if (old & _PAGE_HASHPTE) {
unsigned long ptephys = __pa(ptep) & PAGE_MASK;
-   flush_hash_pages(context, addr, ptephys, 1);
+   flush_hash_pages(mm->context.id, addr, ptephys, 1);
}
return (old & _PAGE_ACCESSED) != 0;
 }
 #define ptep_test_and_clear_young(__vma, __addr, __ptep) \
-   __ptep_test_and_clear_young((__vma)->vm_mm->context.id, __addr, __ptep)
+   __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep)
 
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long 
addr,
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index db17f50d6ac3..e963e6880d7c 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -256,14 +256,15 @@ static inline pte_basic_t pte_update(pte_t *p, unsigned long clr, unsigned long
 }
 
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-static inline int __ptep_test_and_clear_young(unsigned int context, unsigned long addr, pte_t *ptep)
+static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
 {
unsigned long old;
old = pte_update(ptep, _PAGE_ACCESSED, 0);
return (old & _PAGE_ACCESSED) != 0;
 }
 #define ptep_test_and_clear_young(__vma, __addr, __ptep) \
-   __ptep_test_and_clear_young((__vma)->vm_mm->context.id, __addr, __ptep)
+   __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep)
 
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long 
addr,
-- 
2.25.0



[PATCH v4 15/45] powerpc/mm: Allocate static page tables for fixmap

2020-05-18 Thread Christophe Leroy
Allocate static page tables for the fixmap area. This allows
setting mappings through page tables before memblock is ready.
That's needed to use early_ioremap() early and to use standard
page mappings with fixmap.
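
For context, this is the kind of early use the static tables enable
(generic early_ioremap() API; the device address is a placeholder):

	/* Peek at device registers before memblock/paging is fully set up. */
	static void __init early_peek_regs(phys_addr_t regs_pa)
	{
		void __iomem *regs = early_ioremap(regs_pa, PAGE_SIZE);

		if (regs) {
			/* ... read/write a few registers ... */
			early_iounmap(regs, PAGE_SIZE);
		}
	}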

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/fixmap.h |  4 
 arch/powerpc/kernel/setup_32.c|  2 +-
 arch/powerpc/mm/pgtable_32.c  | 16 
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/fixmap.h b/arch/powerpc/include/asm/fixmap.h
index 2ef155a3c821..ccbe2e83c950 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -86,6 +86,10 @@ enum fixed_addresses {
 #define __FIXADDR_SIZE (__end_of_fixed_addresses << PAGE_SHIFT)
 #define FIXADDR_START  (FIXADDR_TOP - __FIXADDR_SIZE)
 
+#define FIXMAP_ALIGNED_SIZE    (ALIGN(FIXADDR_TOP, PGDIR_SIZE) - \
+                                ALIGN_DOWN(FIXADDR_START, PGDIR_SIZE))
+#define FIXMAP_PTE_SIZE        (FIXMAP_ALIGNED_SIZE / PGDIR_SIZE * PTE_TABLE_SIZE)
+
 #define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_NCG
 #define FIXMAP_PAGE_IO PAGE_KERNEL_NCG
 
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index 15f0a7c84944..d642e42eabb1 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -80,7 +80,7 @@ notrace void __init machine_init(u64 dt_ptr)
/* Configure static keys first, now that we're relocated. */
setup_feature_keys();
 
-   early_ioremap_setup();
+   early_ioremap_init();
 
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index f62de06e3d07..9934659cb871 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -29,11 +29,27 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
 extern char etext[], _stext[], _sinittext[], _einittext[];
 
+static u8 early_fixmap_pagetable[FIXMAP_PTE_SIZE] __page_aligned_data;
+
+notrace void __init early_ioremap_init(void)
+{
+   unsigned long addr = ALIGN_DOWN(FIXADDR_START, PGDIR_SIZE);
+   pte_t *ptep = (pte_t *)early_fixmap_pagetable;
+   pmd_t *pmdp = pmd_ptr_k(addr);
+
+   for (; (s32)(FIXADDR_TOP - addr) > 0;
+        addr += PGDIR_SIZE, ptep += PTRS_PER_PTE, pmdp++)
+       pmd_populate_kernel(&init_mm, pmdp, ptep);
+
+   early_ioremap_setup();
+}
+
 static void __init *early_alloc_pgtable(unsigned long size)
 {
void *ptr = memblock_alloc(size, size);
-- 
2.25.0



[PATCH v4 33/45] powerpc/8xx: Drop special handling of Linear and IMMR mappings in I/D TLB handlers

2020-05-18 Thread Christophe Leroy
Up to now, linear and IMMR mappings are managed via huge TLB entries
through specific code directly in TLB miss handlers. This implies
some patching of the TLB miss handlers at startup, and a lot of
dedicated code.

Remove all this specific dedicated code.

For now we are back to normal handling via standard 4k pages. In the
next patches, linear memory mapping and IMMR mapping will be managed
through huge pages.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_8xx.S |  29 +
 arch/powerpc/mm/nohash/8xx.c   | 106 +
 2 files changed, 3 insertions(+), 132 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index b0cceee6405c..d1546f379757 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -207,31 +207,21 @@ InstructionTLBMiss:
mfspr   r10, SPRN_SRR0  /* Get effective address of fault */
INVALIDATE_ADJACENT_PAGES_CPU15(r10)
mtspr   SPRN_MD_EPN, r10
-   /* Only modules will cause ITLB Misses as we always
-* pin the first 8MB of kernel memory */
 #ifdef ITLB_MISS_KERNEL
	mfcr    r11
-#if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT)
+#if defined(SIMPLE_KERNEL_ADDRESS)
	cmpi    cr0, r10, 0 /* Address >= 0x80000000 */
 #else
rlwinm  r10, r10, 16, 0xfff8
cmpli   cr0, r10, PAGE_OFFSET@h
-#ifndef CONFIG_PIN_TLB_TEXT
-   /* It is assumed that kernel code fits into the first 32M */
-0: cmpli   cr7, r10, (PAGE_OFFSET + 0x2000000)@h
-   patch_site  0b, patch__itlbmiss_linmem_top
-#endif
 #endif
 #endif
mfspr   r10, SPRN_M_TWB /* Get level 1 table */
 #ifdef ITLB_MISS_KERNEL
-#if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT)
+#if defined(SIMPLE_KERNEL_ADDRESS)
	bge+    3f
 #else
	blt+    3f
-#endif
-#ifndef CONFIG_PIN_TLB_TEXT
-   blt cr7, ITLBMissLinear
 #endif
	rlwinm  r10, r10, 0, 20, 31
	oris    r10, r10, (swapper_pg_dir - PAGE_OFFSET)@ha
@@ -327,19 +317,9 @@ DataStoreTLBMiss:
mfspr   r10, SPRN_MD_EPN
rlwinm  r10, r10, 16, 0xfff8
cmpli   cr0, r10, PAGE_OFFSET@h
-#ifndef CONFIG_PIN_TLB_IMMR
-   cmpli   cr6, r10, VIRT_IMMR_BASE@h
-#endif
-0: cmpli   cr7, r10, (PAGE_OFFSET + 0x2000000)@h
-   patch_site  0b, patch__dtlbmiss_linmem_top
 
mfspr   r10, SPRN_M_TWB /* Get level 1 table */
	blt+    3f
-#ifndef CONFIG_PIN_TLB_IMMR
-0: beq-    cr6, DTLBMissIMMR
-   patch_site  0b, patch__dtlbmiss_immr_jmp
-#endif
-   blt     cr7, DTLBMissLinear
	rlwinm  r10, r10, 0, 20, 31
	oris    r10, r10, (swapper_pg_dir - PAGE_OFFSET)@ha
 3:
@@ -571,14 +551,9 @@ FixupDAR:/* Entry point for dcbx workaround. */
cmpli   cr1, r11, PAGE_OFFSET@h
mfspr   r11, SPRN_M_TWB /* Get level 1 table */
	blt+    cr1, 3f
-   rlwinm  r11, r10, 16, 0xfff8
-
-0: cmpli   cr7, r11, (PAGE_OFFSET + 0x1800000)@h
-   patch_site  0b, patch__fixupdar_linmem_top
 
/* create physical page address from effective address */
tophys(r11, r10)
-   blt-    cr7, 201f
mfspr   r11, SPRN_M_TWB /* Get level 1 table */
rlwinm  r11, r11, 0, 20, 31
	oris    r11, r11, (swapper_pg_dir - PAGE_OFFSET)@ha
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index 2c480e35b426..b735482e1529 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -55,8 +55,6 @@ unsigned long p_block_mapped(phys_addr_t pa)
return 0;
 }
 
-#define LARGE_PAGE_SIZE_8M (1<<23)
-
 /*
  * MMU_init_hw does the chip-specific initialization of the MMU hardware.
  */
@@ -81,122 +79,20 @@ void __init mmu_mapin_immr(void)
map_kernel_page(v + offset, p + offset, PAGE_KERNEL_NCG);
 }
 
-static void mmu_patch_cmp_limit(s32 *site, unsigned long mapped)
-{
-   modify_instruction_site(site, 0xffff, (unsigned long)__va(mapped) >> 16);
-}
-
-static void mmu_patch_addis(s32 *site, long simm)
-{
-   unsigned int instr = *(unsigned int *)patch_site_addr(site);
-
-   instr &= 0xffff0000;
-   instr |= ((unsigned long)simm) >> 16;
-   patch_instruction_site(site, ppc_inst(instr));
-}
-
-static void mmu_mapin_ram_chunk(unsigned long offset, unsigned long top, 
pgprot_t prot)
-{
-   unsigned long s = offset;
-   unsigned long v = PAGE_OFFSET + s;
-   phys_addr_t p = memstart_addr + s;
-
-   for (; s < top; s += PAGE_SIZE) {
-   map_kernel_page(v, p, prot);
-   v += PAGE_SIZE;
-   p += PAGE_SIZE;
-   }
-}
-
 unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
 {
-   unsigned long mapped;
-
mmu_mapin_immr();
 
-   if (__map_without_ltlbs) {
-   mapped = 0;
-   if (!IS_ENABLED(CONFIG_PIN_TLB_IMMR))
-           patch_instruction_site(&patch__dtlbmiss_immr_jmp, 

[PATCH v4 16/45] powerpc/mm: Fix conditions to perform MMU specific management by blocks on PPC32.

2020-05-18 Thread Christophe Leroy
Setting init mem to NX shall depend on sinittext being mapped by
block, not on stext being mapped by block.

Setting text and rodata to RO shall depend on stext being mapped by
block, not on sinittext being mapped by block.

Fixes: 63b2bc619565 ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/pgtable_32.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 9934659cb871..bd0cb6e3573e 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -185,7 +185,7 @@ void mark_initmem_nx(void)
unsigned long numpages = PFN_UP((unsigned long)_einittext) -
 PFN_DOWN((unsigned long)_sinittext);
 
-   if (v_block_mapped((unsigned long)_stext + 1))
+   if (v_block_mapped((unsigned long)_sinittext))
mmu_mark_initmem_nx();
else
change_page_attr(page, numpages, PAGE_KERNEL);
@@ -197,7 +197,7 @@ void mark_rodata_ro(void)
struct page *page;
unsigned long numpages;
 
-   if (v_block_mapped((unsigned long)_sinittext)) {
+   if (v_block_mapped((unsigned long)_stext + 1)) {
mmu_mark_rodata_ro();
ptdump_check_wx();
return;
-- 
2.25.0



[PATCH v4 24/45] powerpc/8xx: Drop CONFIG_8xx_COPYBACK option

2020-05-18 Thread Christophe Leroy
CONFIG_8xx_COPYBACK was there to help disable copyback cache mode
when debugging hardware. But nobody will design new boards with the 8xx now.

All 8xx platforms select it, so make it the default and remove
the option.

Also remove the Mx_RESETVAL values which are pretty useless and hide
the real value while reading code.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/configs/adder875_defconfig  |  1 -
 arch/powerpc/configs/ep88xc_defconfig|  1 -
 arch/powerpc/configs/mpc866_ads_defconfig|  1 -
 arch/powerpc/configs/mpc885_ads_defconfig|  1 -
 arch/powerpc/configs/tqm8xx_defconfig|  1 -
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h |  2 --
 arch/powerpc/kernel/head_8xx.S   | 15 +--
 arch/powerpc/platforms/8xx/Kconfig   |  9 -
 8 files changed, 1 insertion(+), 30 deletions(-)

diff --git a/arch/powerpc/configs/adder875_defconfig b/arch/powerpc/configs/adder875_defconfig
index f55e23cb176c..5326bc739279 100644
--- a/arch/powerpc/configs/adder875_defconfig
+++ b/arch/powerpc/configs/adder875_defconfig
@@ -10,7 +10,6 @@ CONFIG_EXPERT=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_PPC_ADDER875=y
-CONFIG_8xx_COPYBACK=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_1000=y
 # CONFIG_SECCOMP is not set
diff --git a/arch/powerpc/configs/ep88xc_defconfig b/arch/powerpc/configs/ep88xc_defconfig
index 0e2e5e81a359..f5c3e72da719 100644
--- a/arch/powerpc/configs/ep88xc_defconfig
+++ b/arch/powerpc/configs/ep88xc_defconfig
@@ -12,7 +12,6 @@ CONFIG_EXPERT=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_PPC_EP88XC=y
-CONFIG_8xx_COPYBACK=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_100=y
 # CONFIG_SECCOMP is not set
diff --git a/arch/powerpc/configs/mpc866_ads_defconfig b/arch/powerpc/configs/mpc866_ads_defconfig
index 5320735395e7..5c56d36cdfc5 100644
--- a/arch/powerpc/configs/mpc866_ads_defconfig
+++ b/arch/powerpc/configs/mpc866_ads_defconfig
@@ -12,7 +12,6 @@ CONFIG_EXPERT=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_MPC86XADS=y
-CONFIG_8xx_COPYBACK=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_1000=y
 CONFIG_MATH_EMULATION=y
diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig
index 82a008c04eae..949ff9ccda5e 100644
--- a/arch/powerpc/configs/mpc885_ads_defconfig
+++ b/arch/powerpc/configs/mpc885_ads_defconfig
@@ -11,7 +11,6 @@ CONFIG_EXPERT=y
 # CONFIG_VM_EVENT_COUNTERS is not set
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
-CONFIG_8xx_COPYBACK=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_100=y
 # CONFIG_SECCOMP is not set
diff --git a/arch/powerpc/configs/tqm8xx_defconfig b/arch/powerpc/configs/tqm8xx_defconfig
index eda8bfb2d0a3..77857d513022 100644
--- a/arch/powerpc/configs/tqm8xx_defconfig
+++ b/arch/powerpc/configs/tqm8xx_defconfig
@@ -15,7 +15,6 @@ CONFIG_MODULE_SRCVERSION_ALL=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_TQM8XX=y
-CONFIG_8xx_COPYBACK=y
 # CONFIG_8xx_CPU15 is not set
 CONFIG_GEN_RTC=y
 CONFIG_HZ_100=y
diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
index 76af5b0cb16e..26b7cee34dfe 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
@@ -19,7 +19,6 @@
 #define MI_RSV4I	0x08000000	/* Reserve 4 TLB entries */
 #define MI_PPCS		0x02000000	/* Use MI_RPN prob/priv state */
 #define MI_IDXMASK	0x00001f00	/* TLB index to be loaded */
-#define MI_RESETVAL	0x00000000	/* Value of register at reset */
 
 /* These are the Ks and Kp from the PowerPC books.  For proper operation,
  * Ks = 0, Kp = 1.
@@ -95,7 +94,6 @@
 #define MD_TWAM		0x04000000	/* Use 4K page hardware assist */
 #define MD_PPCS		0x02000000	/* Use MI_RPN prob/priv state */
 #define MD_IDXMASK	0x00001f00	/* TLB index to be loaded */
-#define MD_RESETVAL	0x04000000	/* Value of register at reset */
 
 #define SPRN_M_CASID   793 /* Address space ID (context) to match */
#define MC_ASIDMASK	0x0000000f	/* Bits used for ASID value */
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 073a651787df..905205c79a25 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -779,10 +779,7 @@ start_here:
 initial_mmu:
li  r8, 0
mtspr   SPRN_MI_CTR, r8 /* remove PINNED ITLB entries */
-   lis     r10, MD_RESETVAL@h
-#ifndef CONFIG_8xx_COPYBACK
-   oris    r10, r10, MD_WTDEF@h
-#endif
+   lis r10, MD_TWAM@h
mtspr   SPRN_MD_CTR, r10/* remove PINNED DTLB entries */
 
tlbia   /* Invalidate all TLB entries */
@@ -857,17 +854,7 @@ initial_mmu:
mtspr   SPRN_DC_CST, r8
lis r8, IDC_ENABLE@h
mtspr   SPRN_IC_CST, r8
-#ifdef CONFIG_8xx_COPYBACK
-   mtspr   SPRN_DC_CST, r8
-#else
-   /* For a 

[PATCH v4 14/45] powerpc/32s: Don't warn when mapping RO data ROX.

2020-05-18 Thread Christophe Leroy
Mapping RO data as ROX is not an issue since that data
cannot be modified to introduce an exploit.

PPC64 already accepts having RO data mapped ROX, as a trade-off
between kernel size and strictness of protection.

On PPC32, kernel size is even more critical as amount of
memory is usually small.

Depending on the number of available IBATs, the last IBAT
might overflow the end of text. Only warn if it crosses
the end of RO data.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/book3s32/mmu.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 39ba53ca5bb5..a9b2cbc74797 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -187,6 +187,7 @@ void mmu_mark_initmem_nx(void)
int i;
unsigned long base = (unsigned long)_stext - PAGE_OFFSET;
unsigned long top = (unsigned long)_etext - PAGE_OFFSET;
+   unsigned long border = (unsigned long)__init_begin - PAGE_OFFSET;
unsigned long size;
 
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601))
@@ -201,9 +202,10 @@ void mmu_mark_initmem_nx(void)
size = block_size(base, top);
size = max(size, 128UL << 10);
if ((top - base) > size) {
-   if (strict_kernel_rwx_enabled())
-   pr_warn("Kernel _etext not properly aligned\n");
size <<= 1;
+   if (strict_kernel_rwx_enabled() && base + size > border)
+   pr_warn("Some RW data is getting mapped X. "
+           "Adjust CONFIG_DATA_SHIFT to avoid that.\n");
}
setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
base += size;
-- 
2.25.0



[PATCH v4 35/45] powerpc/8xx: Move DTLB perf handling closer.

2020-05-18 Thread Christophe Leroy
Now that space has been freed next to the DTLB miss handler,
its associated DTLB perf handling can be brought back to
the same place.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_8xx.S | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index fb5d17187772..9f3f7f3d03a7 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -344,6 +344,17 @@ DataStoreTLBMiss:
rfi
patch_site  0b, patch__dtlbmiss_exit_1
 
+#ifdef CONFIG_PERF_EVENTS
+   patch_site  0f, patch__dtlbmiss_perf
+0: lwz r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0)
+   addi    r10, r10, 1
+   stw r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0)
+   mfspr   r10, SPRN_DAR
+   mtspr   SPRN_DAR, r11   /* Tag DAR */
+   mfspr   r11, SPRN_M_TW
+   rfi
+#endif
+
 /* This is an instruction TLB error on the MPC8xx.  This could be due
  * to many reasons, such as executing guarded memory or illegal instruction
  * addresses.  There is nothing to do but handle a big time error fault.
@@ -390,18 +401,6 @@ DARFixed:/* Return from dcbx instruction bug workaround */
/* 0x300 is DataAccess exception, needed by bad_page_fault() */
EXC_XFER_LITE(0x300, handle_page_fault)
 
-/* Called from DataStoreTLBMiss when perf TLB misses events are activated */
-#ifdef CONFIG_PERF_EVENTS
-   patch_site  0f, patch__dtlbmiss_perf
-0: lwz r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0)
-   addir10, r10, 1
-   stw r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0)
-   mfspr   r10, SPRN_DAR
-   mtspr   SPRN_DAR, r11   /* Tag DAR */
-   mfspr   r11, SPRN_M_TW
-   rfi
-#endif
-
 stack_overflow:
vmap_stack_overflow_exception
 
-- 
2.25.0



Re: clean up and streamline probe_kernel_* and friends v2

2020-05-18 Thread Christoph Hellwig
On Thu, May 14, 2020 at 01:04:38AM +0200, Daniel Borkmann wrote:
> Aside from comments on list, the series looks reasonable to me. For BPF
> the bpf_probe_read() helper would be slightly penalized for probing user
> memory given we now test on copy_from_kernel_nofault() first and if that
> fails only then fall back to copy_from_user_nofault(), but it seems
> small enough that it shouldn't matter too much and aside from that we have
> the newer bpf_probe_read_kernel() and bpf_probe_read_user() anyway that
> BPF progs should use instead, so I think it's okay.
>
> For patch 14 and patch 15, do you roughly know the performance gain with
> the new probe_kernel_read_loop() + arch_kernel_read() approach?

I don't think there should be any measurable difference in performance
for typical use cases.  We'll save the stac/clac pair, but that's it.
The real reason is to avoid that stac/clac pair that opens up a window
for exploits, and to act as a significant enabler for killing off set_fs based
address limit overrides entirely.
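
For illustration, the fallback order described above boils down to
something like this sketch (function names follow the series; exact
kernel signatures may differ):

/* Try a kernel-safe copy first; only on failure retry the same
 * address as a user pointer. */
static long probe_read_compat_sketch(void *dst, size_t size,
				     const void *unsafe_ptr)
{
	if (!copy_from_kernel_nofault(dst, unsafe_ptr, size))
		return 0;
	return copy_from_user_nofault(dst, (const void __user *)unsafe_ptr,
				      size);
}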


[PATCH v4 25/45] powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.

2020-05-18 Thread Christophe Leroy
Prepare the ITLB handler to handle _PAGE_HUGE when CONFIG_HUGETLBFS
is enabled. This means that the L1 entry has to be kept in r11
until the L2 entry is read, in order to insert _PAGE_HUGE into it.

Also move the pgd_offset helpers before pte_update() as they
will be needed there in the next patch.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 13 ++---
 arch/powerpc/kernel/head_8xx.S   | 15 +--
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index ff78bf25f832..9a287a95acad 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -206,6 +206,12 @@ static inline void pmd_clear(pmd_t *pmdp)
 }
 
 
+/* to find an entry in a kernel page-table-directory */
+#define pgd_offset_k(address) pgd_offset(&init_mm, address)
+
+/* to find an entry in a page-table-directory */
+#define pgd_index(address)  ((address) >> PGDIR_SHIFT)
+#define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
 
 /*
  * PTE updates. This function is called whenever an existing
@@ -348,13 +354,6 @@ static inline int pte_young(pte_t pte)
pfn_to_page((__pa(pmd_val(pmd)) >> PAGE_SHIFT))
 #endif
 
-/* to find an entry in a kernel page-table-directory */
-#define pgd_offset_k(address) pgd_offset(&init_mm, address)
-
-/* to find an entry in a page-table-directory */
-#define pgd_index(address)  ((address) >> PGDIR_SHIFT)
-#define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
-
 /* Find an entry in the third-level page table.. */
 #define pte_index(address) \
(((address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 905205c79a25..adad8baadcf5 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -196,7 +196,7 @@ SystemCall:
 
 InstructionTLBMiss:
mtspr   SPRN_SPRG_SCRATCH0, r10
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
+#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP) || 
defined(CONFIG_HUGETLBFS)
mtspr   SPRN_SPRG_SCRATCH1, r11
 #endif
 
@@ -235,16 +235,19 @@ InstructionTLBMiss:
rlwinm  r10, r10, 0, 20, 31
oris    r10, r10, (swapper_pg_dir - PAGE_OFFSET)@ha
 3:
+   mtcr    r11
 #endif
+#ifdef CONFIG_HUGETLBFS
+   lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r10)/* Get level 1 
entry */
+   mtspr   SPRN_MI_TWC, r11/* Set segment attributes */
+   mtspr   SPRN_MD_TWC, r11
+#else
lwz r10, (swapper_pg_dir-PAGE_OFFSET)@l(r10)/* Get level 1 
entry */
mtspr   SPRN_MI_TWC, r10/* Set segment attributes */
-
mtspr   SPRN_MD_TWC, r10
+#endif
mfspr   r10, SPRN_MD_TWC
lwz r10, 0(r10) /* Get the pte */
-#ifdef ITLB_MISS_KERNEL
-   mtcrr11
-#endif
 #ifdef CONFIG_SWAP
rlwinm  r11, r10, 32-5, _PAGE_PRESENT
and r11, r11, r10
@@ -263,7 +266,7 @@ InstructionTLBMiss:
 
/* Restore registers */
 0: mfspr   r10, SPRN_SPRG_SCRATCH0
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
+#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP) || 
defined(CONFIG_HUGETLBFS)
mfspr   r11, SPRN_SPRG_SCRATCH1
 #endif
rfi
-- 
2.25.0



[PATCH v4 17/45] powerpc/mm: PTE_ATOMIC_UPDATES is only for 40x

2020-05-18 Thread Christophe Leroy
Only 40x still uses PTE_ATOMIC_UPDATES.
40x cannot select CONFIG_PTE_64BIT.

Drop handling of PTE_ATOMIC_UPDATES:
- In nohash/64
- In nohash/32 for CONFIG_PTE_64BIT

Keep PTE_ATOMIC_UPDATES only for nohash/32 for !CONFIG_PTE_64BIT

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 17 
 arch/powerpc/include/asm/nohash/64/pgtable.h | 28 +---
 2 files changed, 1 insertion(+), 44 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 4315d40906a0..7e908a176e9e 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -262,25 +262,8 @@ static inline unsigned long long pte_update(pte_t *p,
unsigned long clr,
unsigned long set)
 {
-#ifdef PTE_ATOMIC_UPDATES
-   unsigned long long old;
-   unsigned long tmp;
-
-   __asm__ __volatile__("\
-1: lwarx   %L0,0,%4\n\
-   lwzx%0,0,%3\n\
-   andc%1,%L0,%5\n\
-   or  %1,%1,%6\n"
-   PPC405_ERR77(0,%3)
-"  stwcx.  %1,0,%4\n\
-   bne-1b"
-   : "=" (old), "=" (tmp), "=m" (*p)
-   : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), "m" (*p)
-   : "cc" );
-#else /* PTE_ATOMIC_UPDATES */
unsigned long long old = pte_val(*p);
*p = __pte((old & ~(unsigned long long)clr) | set);
-#endif /* !PTE_ATOMIC_UPDATES */
 
 #ifdef CONFIG_44x
if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 9a33b8bd842d..9c703b140d64 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -211,22 +211,9 @@ static inline unsigned long pte_update(struct mm_struct 
*mm,
   unsigned long set,
   int huge)
 {
-#ifdef PTE_ATOMIC_UPDATES
-   unsigned long old, tmp;
-
-   __asm__ __volatile__(
-   "1: ldarx   %0,0,%3 # pte_update\n\
-   andc%1,%0,%4 \n\
-   or  %1,%1,%6\n\
-   stdcx.  %1,0,%3 \n\
-   bne-1b"
-   : "=" (old), "=" (tmp), "=m" (*ptep)
-   : "r" (ptep), "r" (clr), "m" (*ptep), "r" (set)
-   : "cc" );
-#else
unsigned long old = pte_val(*ptep);
*ptep = __pte((old & ~clr) | set);
-#endif
+
/* huge pages use the old page table lock */
if (!huge)
assert_pte_locked(mm, addr);
@@ -310,21 +297,8 @@ static inline void __ptep_set_access_flags(struct 
vm_area_struct *vma,
unsigned long bits = pte_val(entry) &
(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
 
-#ifdef PTE_ATOMIC_UPDATES
-   unsigned long old, tmp;
-
-   __asm__ __volatile__(
-   "1: ldarx   %0,0,%4\n\
-   or  %0,%3,%0\n\
-   stdcx.  %0,0,%4\n\
-   bne-1b"
-   :"=" (old), "=" (tmp), "=m" (*ptep)
-   :"r" (bits), "r" (ptep), "m" (*ptep)
-   :"cc");
-#else
unsigned long old = pte_val(*ptep);
*ptep = __pte(old | bits);
-#endif
 
flush_tlb_page(vma, address);
 }
-- 
2.25.0



[PATCH v4 34/45] powerpc/8xx: Remove now unused TLB miss functions

2020-05-18 Thread Christophe Leroy
The code to set up the linear and IMMR mappings via huge TLB entries is
not called anymore. Remove it.

Also remove the handling of the removed code's exit patch sites in the
perf driver.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h |  8 +-
 arch/powerpc/kernel/head_8xx.S   | 83 
 arch/powerpc/perf/8xx-pmu.c  | 10 ---
 3 files changed, 1 insertion(+), 100 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h 
b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
index 4d3ef3841b00..e82368838416 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
@@ -240,13 +240,7 @@ static inline unsigned int mmu_psize_to_shift(unsigned int 
mmu_psize)
 }
 
 /* patch sites */
-extern s32 patch__itlbmiss_linmem_top, patch__itlbmiss_linmem_top8;
-extern s32 patch__dtlbmiss_linmem_top, patch__dtlbmiss_immr_jmp;
-extern s32 patch__fixupdar_linmem_top;
-extern s32 patch__dtlbmiss_romem_top, patch__dtlbmiss_romem_top8;
-
-extern s32 patch__itlbmiss_exit_1, patch__itlbmiss_exit_2;
-extern s32 patch__dtlbmiss_exit_1, patch__dtlbmiss_exit_2, 
patch__dtlbmiss_exit_3;
+extern s32 patch__itlbmiss_exit_1, patch__dtlbmiss_exit_1;
 extern s32 patch__itlbmiss_perf, patch__dtlbmiss_perf;
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index d1546f379757..fb5d17187772 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -278,33 +278,6 @@ InstructionTLBMiss:
rfi
 #endif
 
-#ifndef CONFIG_PIN_TLB_TEXT
-ITLBMissLinear:
-   mtcrr11
-#if defined(CONFIG_STRICT_KERNEL_RWX) && CONFIG_ETEXT_SHIFT < 23
-   patch_site  0f, patch__itlbmiss_linmem_top8
-
-   mfspr   r10, SPRN_SRR0
-0: subis   r11, r10, (PAGE_OFFSET - 0x8000)@ha
-   rlwinm  r11, r11, 4, MI_PS8MEG ^ MI_PS512K
-   ori r11, r11, MI_PS512K | MI_SVALID
-   rlwinm  r10, r10, 0, 0x0ff8 /* 8xx supports max 256Mb RAM */
-#else
-   /* Set 8M byte page and mark it valid */
-   li  r11, MI_PS8MEG | MI_SVALID
-   rlwinm  r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */
-#endif
-   mtspr   SPRN_MI_TWC, r11
-   ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_SH | _PAGE_DIRTY | \
- _PAGE_PRESENT
-   mtspr   SPRN_MI_RPN, r10/* Update TLB entry */
-
-0: mfspr   r10, SPRN_SPRG_SCRATCH0
-   mfspr   r11, SPRN_SPRG_SCRATCH1
-   rfi
-   patch_site  0b, patch__itlbmiss_exit_2
-#endif
-
. = 0x1200
 DataStoreTLBMiss:
mtspr   SPRN_DAR, r10
@@ -371,62 +344,6 @@ DataStoreTLBMiss:
rfi
patch_site  0b, patch__dtlbmiss_exit_1
 
-DTLBMissIMMR:
-   mtcrr11
-   /* Set 512k byte guarded page and mark it valid */
-   li  r10, MD_PS512K | MD_GUARDED | MD_SVALID
-   mtspr   SPRN_MD_TWC, r10
-   mfspr   r10, SPRN_IMMR  /* Get current IMMR */
-   rlwinm  r10, r10, 0, 0xfff8 /* Get 512 kbytes boundary */
-   ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \
- _PAGE_PRESENT | _PAGE_NO_CACHE
-   mtspr   SPRN_MD_RPN, r10/* Update TLB entry */
-
-   li  r11, RPN_PATTERN
-
-0: mfspr   r10, SPRN_DAR
-   mtspr   SPRN_DAR, r11   /* Tag DAR */
-   mfspr   r11, SPRN_M_TW
-   rfi
-   patch_site  0b, patch__dtlbmiss_exit_2
-
-DTLBMissLinear:
-   mtcrr11
-   rlwinm  r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */
-#if defined(CONFIG_STRICT_KERNEL_RWX) && CONFIG_DATA_SHIFT < 23
-   patch_site  0f, patch__dtlbmiss_romem_top8
-
-0: subis   r11, r10, (PAGE_OFFSET - 0x8000)@ha
-   rlwinm  r11, r11, 0, 0xff80
-   neg r10, r11
-   or  r11, r11, r10
-   rlwinm  r11, r11, 4, MI_PS8MEG ^ MI_PS512K
-   ori r11, r11, MI_PS512K | MI_SVALID
-   mfspr   r10, SPRN_MD_EPN
-   rlwinm  r10, r10, 0, 0x0ff8 /* 8xx supports max 256Mb RAM */
-#else
-   /* Set 8M byte page and mark it valid */
-   li  r11, MD_PS8MEG | MD_SVALID
-#endif
-   mtspr   SPRN_MD_TWC, r11
-#ifdef CONFIG_STRICT_KERNEL_RWX
-   patch_site  0f, patch__dtlbmiss_romem_top
-
-0: subis   r11, r10, 0
-   rlwimi  r10, r11, 11, _PAGE_RO
-#endif
-   ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \
- _PAGE_PRESENT
-   mtspr   SPRN_MD_RPN, r10/* Update TLB entry */
-
-   li  r11, RPN_PATTERN
-
-0: mfspr   r10, SPRN_DAR
-   mtspr   SPRN_DAR, r11   /* Tag DAR */
-   mfspr   r11, SPRN_M_TW
-   rfi
-   patch_site  0b, patch__dtlbmiss_exit_3
-
 /* This is an instruction TLB error on the MPC8xx.  This could be due
  * to many reasons, such as executing guarded memory or illegal instruction
  * addresses.  There is nothing to do but handle a big time error 

[PATCH v4 36/45] powerpc/mm: Don't be too strict with _etext alignment on PPC32

2020-05-18 Thread Christophe Leroy
Similar to PPC64, accept mapping RO data as ROX as a trade-off between
security and memory usage.

Having RO data executable is not a high risk as RO data can't be
modified to forge an exploit.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig  | 26 --
 arch/powerpc/kernel/vmlinux.lds.S |  3 +--
 2 files changed, 1 insertion(+), 28 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1d4ef4f27dec..d147d379b1b9 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -778,32 +778,6 @@ config THREAD_SHIFT
  Used to define the stack size. The default is almost always what you
  want. Only change this if you know what you are doing.
 
-config ETEXT_SHIFT_BOOL
-   bool "Set custom etext alignment" if STRICT_KERNEL_RWX && \
-(PPC_BOOK3S_32 || PPC_8xx)
-   depends on ADVANCED_OPTIONS
-   help
- This option allows you to set the kernel end of text alignment. When
- RAM is mapped by blocks, the alignment needs to fit the size and
- number of possible blocks. The default should be OK for most configs.
-
- Say N here unless you know what you are doing.
-
-config ETEXT_SHIFT
-   int "_etext shift" if ETEXT_SHIFT_BOOL
-   range 17 28 if STRICT_KERNEL_RWX && PPC_BOOK3S_32
-   range 19 23 if STRICT_KERNEL_RWX && PPC_8xx
-   default 17 if STRICT_KERNEL_RWX && PPC_BOOK3S_32
-   default 19 if STRICT_KERNEL_RWX && PPC_8xx
-   default PPC_PAGE_SHIFT
-   help
- On Book3S 32 (603+), IBATs are used to map kernel text.
- Smaller is the alignment, greater is the number of necessary IBATs.
-
- On 8xx, large pages (512kb or 8M) are used to map kernel linear
- memory. Aligning to 8M reduces TLB misses as only 8M pages are used
- in that case.
-
 config DATA_SHIFT_BOOL
bool "Set custom data alignment" if STRICT_KERNEL_RWX && \
(PPC_BOOK3S_32 || PPC_8xx)
diff --git a/arch/powerpc/kernel/vmlinux.lds.S 
b/arch/powerpc/kernel/vmlinux.lds.S
index 31a0f201fb6f..54f23205c2b9 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -15,7 +15,6 @@
 #include 
 
 #define STRICT_ALIGN_SIZE  (1 << CONFIG_DATA_SHIFT)
-#define ETEXT_ALIGN_SIZE   (1 << CONFIG_ETEXT_SHIFT)
 
 ENTRY(_stext)
 
@@ -116,7 +115,7 @@ SECTIONS
 
} :text
 
-   . = ALIGN(ETEXT_ALIGN_SIZE);
+   . = ALIGN(PAGE_SIZE);
_etext = .;
PROVIDE32 (etext = .);
 
-- 
2.25.0



[PATCH v4 22/45] powerpc/mm: Create a dedicated pte_update() for 8xx

2020-05-18 Thread Christophe Leroy
pte_update() is a bit special for the 8xx. At the time
being, that's an #ifdef inside the nohash/32 pte_update().

As we are going to make it even more special in the coming
patches, create a dedicated version of pte_update() for the 8xx.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 29 +---
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 474dd1db065f..5fb3f6798e22 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -221,7 +221,31 @@ static inline void pmd_clear(pmd_t *pmdp)
  * that an executable user mapping was modified, which is needed
  * to properly flush the virtually tagged instruction cache of
  * those implementations.
+ *
+ * On the 8xx, the page tables are a bit special. For 16k pages, we have
+ * 4 identical entries. For other page sizes, we have a single entry in the
+ * table.
  */
+#ifdef CONFIG_PPC_8xx
+static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, 
pte_t *p,
+unsigned long clr, unsigned long set, int 
huge)
+{
+   pte_basic_t *entry = &p->pte;
+   pte_basic_t old = pte_val(*p);
+   pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
+   int num, i;
+
+   if (!huge)
+   num = PAGE_SIZE / SZ_4K;
+   else
+   num = 1;
+
+   for (i = 0; i < num; i++, entry++)
+   *entry = new;
+
+   return old;
+}
+#else
 static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, 
pte_t *p,
 unsigned long clr, unsigned long set, int 
huge)
 {
@@ -242,11 +266,7 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, 
unsigned long addr, p
pte_basic_t old = pte_val(*p);
pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
 
-#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
-   p->pte = p->pte1 = p->pte2 = p->pte3 = new;
-#else
*p = __pte(new);
-#endif
 #endif /* !PTE_ATOMIC_UPDATES */
 
 #ifdef CONFIG_44x
@@ -255,6 +275,7 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, 
unsigned long addr, p
 #endif
return old;
 }
+#endif
 
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
-- 
2.25.0
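
To make the replication above concrete, here is a standalone user-space
sketch (assuming 16k kernel pages; an illustration only, not kernel code):

#include <stdio.h>

#define SZ_4K		0x1000UL
#define PAGE_SIZE	(4 * SZ_4K)	/* assume CONFIG_PPC_16K_PAGES */

/* One logical PTE is written into PAGE_SIZE / 4k = 4 consecutive 4k
 * slots, so the hardware walker, which only reads 4k slots, always
 * finds an identical copy. */
static unsigned long set_pte_replicated(unsigned long *slot, unsigned long new)
{
	unsigned long old = slot[0];
	unsigned int i;

	for (i = 0; i < PAGE_SIZE / SZ_4K; i++)
		slot[i] = new;
	return old;
}

int main(void)
{
	unsigned long table[PAGE_SIZE / SZ_4K] = { 0 };
	unsigned int i;

	set_pte_replicated(table, 0x1234);
	for (i = 0; i < PAGE_SIZE / SZ_4K; i++)
		printf("slot %u: %#lx\n", i, table[i]);
	return 0;
}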



[PATCH v4 41/45] powerpc/8xx: Allow STRICT_KERNEL_RWX with pinned TLB

2020-05-18 Thread Christophe Leroy
Pinned TLBs are 8M. Now that there is no strict boundary anymore
between text and RO data, it is possible to use 8M pinned executable
TLB that covers both text and RO data.

When PIN_TLB_DATA or PIN_TLB_TEXT is selected, enforce 8M RW data
alignment and allow STRICT_KERNEL_RWX.

Signed-off-by: Christophe Leroy 
---
v2: Use the new function that sets all pinned TLBs at once.
---
 arch/powerpc/Kconfig   | 8 +---
 arch/powerpc/mm/nohash/8xx.c   | 9 +++--
 arch/powerpc/platforms/8xx/Kconfig | 2 +-
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index d147d379b1b9..f5e82629e2cd 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -779,9 +779,10 @@ config THREAD_SHIFT
  want. Only change this if you know what you are doing.
 
 config DATA_SHIFT_BOOL
-   bool "Set custom data alignment" if STRICT_KERNEL_RWX && \
-   (PPC_BOOK3S_32 || PPC_8xx)
+   bool "Set custom data alignment"
depends on ADVANCED_OPTIONS
+   depends on STRICT_KERNEL_RWX
+   depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && !PIN_TLB_TEXT)
help
  This option allows you to set the kernel data alignment. When
  RAM is mapped by blocks, the alignment needs to fit the size and
@@ -803,7 +804,8 @@ config DATA_SHIFT
 
  On 8xx, large pages (512kb or 8M) are used to map kernel linear
  memory. Aligning to 8M reduces TLB misses as only 8M pages are used
- in that case.
+ in that case. If PIN_TLB is selected, it must be aligned to 8M as
+ 8M pages will be pinned.
 
 config FORCE_MAX_ZONEORDER
int "Maximum zone order"
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index ec3ef75895d8..d8697f535c3e 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -127,8 +127,8 @@ void __init mmu_mapin_immr(void)
PAGE_KERNEL_NCG, MMU_PAGE_512K, true);
 }
 
-static void __init mmu_mapin_ram_chunk(unsigned long offset, unsigned long top,
-  pgprot_t prot, bool new)
+static void mmu_mapin_ram_chunk(unsigned long offset, unsigned long top,
+   pgprot_t prot, bool new)
 {
unsigned long v = PAGE_OFFSET + offset;
unsigned long p = offset;
@@ -181,6 +181,9 @@ void mmu_mark_initmem_nx(void)
 
mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, false);
mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
+
+   if (IS_ENABLED(CONFIG_PIN_TLB_TEXT))
+   mmu_pin_tlb(block_mapped_ram, false);
 }
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
@@ -189,6 +192,8 @@ void mmu_mark_rodata_ro(void)
unsigned long sinittext = __pa(_sinittext);
 
mmu_mapin_ram_chunk(0, sinittext, PAGE_KERNEL_ROX, false);
+   if (IS_ENABLED(CONFIG_PIN_TLB_DATA))
+   mmu_pin_tlb(block_mapped_ram, true);
 }
 #endif
 
diff --git a/arch/powerpc/platforms/8xx/Kconfig 
b/arch/powerpc/platforms/8xx/Kconfig
index 04ea1a8a0bdc..05669f2fadce 100644
--- a/arch/powerpc/platforms/8xx/Kconfig
+++ b/arch/powerpc/platforms/8xx/Kconfig
@@ -167,7 +167,7 @@ menu "8xx advanced setup"
 
 config PIN_TLB
bool "Pinned Kernel TLBs"
-   depends on ADVANCED_OPTIONS && !DEBUG_PAGEALLOC && !STRICT_KERNEL_RWX
+   depends on ADVANCED_OPTIONS && !DEBUG_PAGEALLOC
help
  On the 8xx, we have 32 instruction TLBs and 32 data TLBs. In each
  table 4 TLBs can be pinned.
-- 
2.25.0



[PATCH v4 38/45] powerpc/8xx: Add a function to early map kernel via huge pages

2020-05-18 Thread Christophe Leroy
Add a function to early map kernel memory using huge pages.

For 512k pages, just use standard page table and map in using 512k
pages.

For 8M pages, create a hugepd table and populate the two PGD
entries with it.

This function can only be used to create page tables at startup. Once
the regular SLAB allocation functions replace memblock functions,
this function cannot allocate new pages anymore. However it can still
update existing mappings with new protections.

hugepd_none() macro is moved into asm/hugetlb.h to be usable outside
of mm/hugetlbpage.c

early_pte_alloc_kernel() is made visible.

_PAGE_HUGE flag is now displayed by ptdump.

Signed-off-by: Christophe Leroy 
---
v2: Select CONFIG_HUGETLBFS instead of CONFIG_HUGETLB_PAGE, which leads to a
link-time failure
---
 .../include/asm/nohash/32/hugetlb-8xx.h   |  5 ++
 arch/powerpc/include/asm/pgtable.h|  2 +
 arch/powerpc/mm/nohash/8xx.c  | 52 +++
 arch/powerpc/mm/pgtable_32.c  |  2 +-
 arch/powerpc/mm/ptdump/8xx.c  |  5 ++
 arch/powerpc/platforms/Kconfig.cputype|  1 +
 6 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h 
b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
index 1c7d4693a78e..e752a5807a59 100644
--- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
@@ -35,6 +35,11 @@ static inline void hugepd_populate(hugepd_t *hpdp, pte_t 
*new, unsigned int pshi
*hpdp = __hugepd(__pa(new) | _PMD_USER | _PMD_PRESENT | _PMD_PAGE_8M);
 }
 
+static inline void hugepd_populate_kernel(hugepd_t *hpdp, pte_t *new, unsigned 
int pshift)
+{
+   *hpdp = __hugepd(__pa(new) | _PMD_PRESENT | _PMD_PAGE_8M);
+}
+
 static inline int check_and_get_huge_psize(int shift)
 {
return shift_to_mmu_psize(shift);
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index b1f1d5339735..961895be932a 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -107,6 +107,8 @@ unsigned long vmalloc_to_phys(void *vmalloc_addr);
 
 void pgtable_cache_add(unsigned int shift);
 
+pte_t *early_pte_alloc_kernel(pmd_t *pmdp, unsigned long va);
+
 #if defined(CONFIG_STRICT_KERNEL_RWX) || defined(CONFIG_PPC32)
 void mark_initmem_nx(void);
 #else
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index b735482e1529..72fb75f2a5f1 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -9,9 +9,11 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -55,6 +57,56 @@ unsigned long p_block_mapped(phys_addr_t pa)
return 0;
 }
 
+static pte_t __init *early_hugepd_alloc_kernel(hugepd_t *pmdp, unsigned long 
va)
+{
+   if (hpd_val(*pmdp) == 0) {
+   pte_t *ptep = memblock_alloc(sizeof(pte_basic_t), SZ_4K);
+
+   if (!ptep)
+   return NULL;
+
+   hugepd_populate_kernel((hugepd_t *)pmdp, ptep, PAGE_SHIFT_8M);
+   hugepd_populate_kernel((hugepd_t *)pmdp + 1, ptep, 
PAGE_SHIFT_8M);
+   }
+   return hugepte_offset(*(hugepd_t *)pmdp, va, PGDIR_SHIFT);
+}
+
+static int __ref __early_map_kernel_hugepage(unsigned long va, phys_addr_t pa,
+pgprot_t prot, int psize, bool new)
+{
+   pmd_t *pmdp = pmd_ptr_k(va);
+   pte_t *ptep;
+
+   if (WARN_ON(psize != MMU_PAGE_512K && psize != MMU_PAGE_8M))
+   return -EINVAL;
+
+   if (new) {
+   if (WARN_ON(slab_is_available()))
+   return -EINVAL;
+
+   if (psize == MMU_PAGE_512K)
+   ptep = early_pte_alloc_kernel(pmdp, va);
+   else
+   ptep = early_hugepd_alloc_kernel((hugepd_t *)pmdp, va);
+   } else {
+   if (psize == MMU_PAGE_512K)
+   ptep = pte_offset_kernel(pmdp, va);
+   else
+   ptep = hugepte_offset(*(hugepd_t *)pmdp, va, 
PGDIR_SHIFT);
+   }
+
+   if (WARN_ON(!ptep))
+   return -ENOMEM;
+
+   /* The PTE should never be already present */
+   if (new && WARN_ON(pte_present(*ptep) && pgprot_val(prot)))
+   return -EINVAL;
+
+   set_huge_pte_at(&init_mm, va, ptep, pte_mkhuge(pfn_pte(pa >> PAGE_SHIFT, prot)));
+
+   return 0;
+}
+
 /*
  * MMU_init_hw does the chip-specific initialization of the MMU hardware.
  */
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index bd0cb6e3573e..05902bbff8d6 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -61,7 +61,7 @@ static void __init *early_alloc_pgtable(unsigned long size)
return ptr;
 }
 
-static pte_t __init *early_pte_alloc_kernel(pmd_t *pmdp, unsigned long va)
+pte_t __init *early_pte_alloc_kernel(pmd_t *pmdp, unsigned 
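
As a usage example, a later patch in this series (39/45) calls the new
helper to map the IMMR area with a single 512k huge page:

	__early_map_kernel_hugepage(VIRT_IMMR_BASE, PHYS_IMMR_BASE,
				    PAGE_KERNEL_NCG, MMU_PAGE_512K, true);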

[PATCH v4 44/45] powerpc/32s: Allow mapping with BATs with DEBUG_PAGEALLOC

2020-05-18 Thread Christophe Leroy
DEBUG_PAGEALLOC only manages RW data.

Text and RO data can still be mapped with BATs.

In order to map with BATs, also enforce data alignment. Set
by default to 256k (matching the Kconfig default of 18), which is
a good compromise for keeping enough BATs for KASAN and IMMR as well.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig   | 1 +
 arch/powerpc/mm/book3s32/mmu.c | 6 ++
 arch/powerpc/mm/init_32.c  | 5 ++---
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index fcb0a9ae9872..752deddc9ed9 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -797,6 +797,7 @@ config DATA_SHIFT
range 17 28 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC) && PPC_BOOK3S_32
range 19 23 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC) && PPC_8xx
default 22 if STRICT_KERNEL_RWX && PPC_BOOK3S_32
+   default 18 if DEBUG_PAGEALLOC && PPC_BOOK3S_32
default 23 if STRICT_KERNEL_RWX && PPC_8xx
default 23 if DEBUG_PAGEALLOC && PPC_8xx && PIN_TLB_DATA
default 19 if DEBUG_PAGEALLOC && PPC_8xx
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index a9b2cbc74797..a6dcc708eee3 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -170,6 +170,12 @@ unsigned long __init mmu_mapin_ram(unsigned long base, 
unsigned long top)
pr_debug("RAM mapped without BATs\n");
return base;
}
+   if (debug_pagealloc_enabled()) {
+   if (base >= border)
+   return base;
+   if (top >= border)
+   top = border;
+   }
 
if (!strict_kernel_rwx_enabled() || base >= border || top <= border)
return __mmu_mapin_ram(base, top);
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 8977a7c2543d..36c39bd37256 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -99,10 +99,9 @@ static void __init MMU_setup(void)
if (IS_ENABLED(CONFIG_PPC_8xx))
return;
 
-   if (debug_pagealloc_enabled()) {
-   __map_without_bats = 1;
+   if (debug_pagealloc_enabled())
__map_without_ltlbs = 1;
-   }
+
if (strict_kernel_rwx_enabled())
__map_without_ltlbs = 1;
 }
-- 
2.25.0
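
A minimal sketch of the clamping added to mmu_mapin_ram() above
(an illustration only; 'border' is the end of RO data as in the patch):

/* With DEBUG_PAGEALLOC, only text and RO data below 'border' may be
 * block-mapped with BATs; RW data above it stays page-mapped so that
 * protections can be toggled page by page. */
static unsigned long mapin_clamp_sketch(unsigned long base, unsigned long top,
					unsigned long border)
{
	if (base >= border)	/* all RW data: nothing to block-map */
		return base;
	if (top >= border)	/* clamp the BAT-mapped range to RO data */
		top = border;
	return top;		/* [base, top) can be mapped with BATs */
}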



[PATCH v4 39/45] powerpc/8xx: Map IMMR with a huge page

2020-05-18 Thread Christophe Leroy
Map the IMMR area with a single 512k huge page.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/nohash/8xx.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index 72fb75f2a5f1..f8fff1fa72e3 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -118,17 +118,13 @@ static bool immr_is_mapped __initdata;
 
 void __init mmu_mapin_immr(void)
 {
-   unsigned long p = PHYS_IMMR_BASE;
-   unsigned long v = VIRT_IMMR_BASE;
-   int offset;
-
if (immr_is_mapped)
return;
 
immr_is_mapped = true;
 
-   for (offset = 0; offset < IMMR_SIZE; offset += PAGE_SIZE)
-   map_kernel_page(v + offset, p + offset, PAGE_KERNEL_NCG);
+   __early_map_kernel_hugepage(VIRT_IMMR_BASE, PHYS_IMMR_BASE,
+   PAGE_KERNEL_NCG, MMU_PAGE_512K, true);
 }
 
 unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
-- 
2.25.0



[PATCH v4 26/45] powerpc/8xx: Manage 512k huge pages as standard pages.

2020-05-18 Thread Christophe Leroy
At the time being, 512k huge pages are handled through hugepd page
tables. The PMD entry is flagged as a hugepd pointer and it
means that only 512k hugepages can be managed in that 4M block.
However, the hugepd table has the same size as a normal page
table, and 512k entries can therefore be nested with normal pages.

On the 8xx, TLB loading is performed by software and although the
page tables are organised to match the L1 and L2 levels defined by
the HW, all TLB entries have independent L1 and L2 entries.
It means that even if two TLB entries are associated with the same
PMD entry, they can be loaded with different values in L1 part.

The L1 entry contains the page size (PS field):
- 00 for 4k and 16k pages
- 01 for 512k pages
- 11 for 8M pages

By adding a flag for hugepages in the PTE (_PAGE_HUGE) and copying it
into the lower bit of PS, we can then manage 512k pages with normal
page tables:
- PMD entry has PS=11 for 8M pages
- PMD entry has PS=00 for other pages.

As a PMD entry covers 4M areas, a PMD will either point to a hugepd
table having a single entry to an 8M page, or the PMD will point to
a standard page table which will have either entries to 4k or 16k or
512k pages. For 512k pages, as the L1 entry will not know it is a
512k page before the PTE is read, there will be 128 entries in the
PTE as if it was 4k pages. But when loading the TLB, it will be
flagged as a 512k page.

Note that we can't use pmd_ptr() in asm/nohash/32/pgtable.h because
it is not defined yet.

In the ITLB miss handler, we keep the possibility to opt it out, as when
kernel text is pinned and no user hugepages are used, we can save several
instructions by not using r11.

In DTLB miss, that's just one instruction so it's not worth bothering
with it.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 10 ++---
 arch/powerpc/include/asm/nohash/32/pte-8xx.h |  4 +++-
 arch/powerpc/include/asm/nohash/pgtable.h|  2 +-
 arch/powerpc/kernel/head_8xx.S   | 12 +--
 arch/powerpc/mm/hugetlbpage.c| 22 +---
 arch/powerpc/mm/pgtable.c| 10 -
 6 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 9a287a95acad..717f995d21b8 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -229,8 +229,9 @@ static inline void pmd_clear(pmd_t *pmdp)
  * those implementations.
  *
  * On the 8xx, the page tables are a bit special. For 16k pages, we have
- * 4 identical entries. For other page sizes, we have a single entry in the
- * table.
+ * 4 identical entries. For 512k pages, we have 128 entries as if it was
+ * 4k pages, but they are flagged as 512k pages for the hardware.
+ * For other page sizes, we have a single entry in the table.
  */
 #ifdef CONFIG_PPC_8xx
 static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, 
pte_t *p,
@@ -240,13 +241,16 @@ static inline pte_basic_t pte_update(struct mm_struct 
*mm, unsigned long addr, p
pte_basic_t old = pte_val(*p);
pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
int num, i;
+   pmd_t *pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
 
if (!huge)
num = PAGE_SIZE / SZ_4K;
+   else if ((pmd_val(*pmd) & _PMD_PAGE_MASK) != _PMD_PAGE_8M)
+   num = SZ_512K / SZ_4K;
else
num = 1;
 
-   for (i = 0; i < num; i++, entry++)
+   for (i = 0; i < num; i++, entry++, new += SZ_4K)
*entry = new;
 
return old;
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h 
b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index c9e4b2d90f65..66f403a7da44 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -46,6 +46,8 @@
 #define _PAGE_NA   0x0200  /* Supervisor NA, User no access */
 #define _PAGE_RO   0x0600  /* Supervisor RO, User no access */
 
+#define _PAGE_HUGE 0x0800  /* Copied to L1 PS bit 29 */
+
 /* cache related flags non existing on 8xx */
 #define _PAGE_COHERENT 0
 #define _PAGE_WRITETHRU0
@@ -128,7 +130,7 @@ static inline pte_t pte_mkuser(pte_t pte)
 
 static inline pte_t pte_mkhuge(pte_t pte)
 {
-   return __pte(pte_val(pte) | _PAGE_SPS);
+   return __pte(pte_val(pte) | _PAGE_SPS | _PAGE_HUGE);
 }
 
 #define pte_mkhuge pte_mkhuge
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 7fed9dc0f147..f27c967d9269 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -267,7 +267,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, 
unsigned long pfn,
 static inline int hugepd_ok(hugepd_t hpd)
 {
 #ifdef CONFIG_PPC_8xx
-   return ((hpd_val(hpd) & 0x4) != 0);
+   return 
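
To make the PS encoding above concrete, here is a hypothetical helper
(not part of the patch) computing the level-1 PS field following the
commit message's rules:

/* PS values from the commit message: 00 = 4k/16k, 01 = 512k, 11 = 8M. */
enum l1_ps { PS_4K_16K = 0x0, PS_512K = 0x1, PS_8M = 0x3 };

#define _PAGE_HUGE	0x0800	/* as defined in this patch */

static enum l1_ps l1_ps_field(unsigned int pte, int pmd_is_8m)
{
	if (pmd_is_8m)		/* 8M pages are identified at PMD level */
		return PS_8M;
	if (pte & _PAGE_HUGE)	/* copied into the lower bit of PS */
		return PS_512K;
	return PS_4K_16K;
}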

[PATCH v4 42/45] powerpc/8xx: Allow large TLBs with DEBUG_PAGEALLOC

2020-05-18 Thread Christophe Leroy
DEBUG_PAGEALLOC only manages RW data.

Text and RO data can still be mapped with hugepages and pinned TLB.

In order to map with hugepages, also enforce a 512kB data alignment
minimum. That's a trade-off between size and speed, taking into
account that DEBUG_PAGEALLOC is a debug option. Anyway the alignment
is still tunable.

We also allow tuning of alignment for book3s to limit the complexity
of the test in Kconfig that will anyway disappear in the following
patches once DEBUG_PAGEALLOC is handled together with BATs.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig   | 11 +++
 arch/powerpc/mm/init_32.c  |  5 -
 arch/powerpc/mm/nohash/8xx.c   | 11 ---
 arch/powerpc/platforms/8xx/Kconfig |  2 +-
 4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index f5e82629e2cd..fcb0a9ae9872 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -781,8 +781,9 @@ config THREAD_SHIFT
 config DATA_SHIFT_BOOL
bool "Set custom data alignment"
depends on ADVANCED_OPTIONS
-   depends on STRICT_KERNEL_RWX
-   depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && !PIN_TLB_TEXT)
+   depends on STRICT_KERNEL_RWX || DEBUG_PAGEALLOC
+   depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && \
+(!PIN_TLB_TEXT || !STRICT_KERNEL_RWX))
help
  This option allows you to set the kernel data alignment. When
  RAM is mapped by blocks, the alignment needs to fit the size and
@@ -793,10 +794,12 @@ config DATA_SHIFT_BOOL
 config DATA_SHIFT
int "Data shift" if DATA_SHIFT_BOOL
default 24 if STRICT_KERNEL_RWX && PPC64
-   range 17 28 if STRICT_KERNEL_RWX && PPC_BOOK3S_32
-   range 19 23 if STRICT_KERNEL_RWX && PPC_8xx
+   range 17 28 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC) && PPC_BOOK3S_32
+   range 19 23 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC) && PPC_8xx
default 22 if STRICT_KERNEL_RWX && PPC_BOOK3S_32
default 23 if STRICT_KERNEL_RWX && PPC_8xx
+   default 23 if DEBUG_PAGEALLOC && PPC_8xx && PIN_TLB_DATA
+   default 19 if DEBUG_PAGEALLOC && PPC_8xx
default PPC_PAGE_SHIFT
help
  On Book3S 32 (603+), DBATs are used to map kernel text and rodata RO.
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index a6991ef8727d..8977a7c2543d 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -96,11 +96,14 @@ static void __init MMU_setup(void)
if (strstr(boot_command_line, "noltlbs")) {
__map_without_ltlbs = 1;
}
+   if (IS_ENABLED(CONFIG_PPC_8xx))
+   return;
+
if (debug_pagealloc_enabled()) {
__map_without_bats = 1;
__map_without_ltlbs = 1;
}
-   if (strict_kernel_rwx_enabled() && !IS_ENABLED(CONFIG_PPC_8xx))
+   if (strict_kernel_rwx_enabled())
__map_without_ltlbs = 1;
 }
 
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index d8697f535c3e..286441bbbe49 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -150,7 +150,8 @@ unsigned long __init mmu_mapin_ram(unsigned long base, 
unsigned long top)
 {
unsigned long etext8 = ALIGN(__pa(_etext), SZ_8M);
unsigned long sinittext = __pa(_sinittext);
-   unsigned long boundary = strict_kernel_rwx_enabled() ? sinittext : 
etext8;
+   bool strict_boundary = strict_kernel_rwx_enabled() || 
debug_pagealloc_enabled();
+   unsigned long boundary = strict_boundary ? sinittext : etext8;
unsigned long einittext8 = ALIGN(__pa(_einittext), SZ_8M);
 
WARN_ON(top < einittext8);
@@ -161,8 +162,12 @@ unsigned long __init mmu_mapin_ram(unsigned long base, 
unsigned long top)
return 0;
 
mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, true);
-   mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL_TEXT, true);
-   mmu_mapin_ram_chunk(einittext8, top, PAGE_KERNEL, true);
+   if (debug_pagealloc_enabled()) {
+   top = boundary;
+   } else {
+   mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL_TEXT, 
true);
+   mmu_mapin_ram_chunk(einittext8, top, PAGE_KERNEL, true);
+   }
 
if (top > SZ_32M)
memblock_set_current_limit(top);
diff --git a/arch/powerpc/platforms/8xx/Kconfig 
b/arch/powerpc/platforms/8xx/Kconfig
index 05669f2fadce..abb2b45b2789 100644
--- a/arch/powerpc/platforms/8xx/Kconfig
+++ b/arch/powerpc/platforms/8xx/Kconfig
@@ -167,7 +167,7 @@ menu "8xx advanced setup"
 
 config PIN_TLB
bool "Pinned Kernel TLBs"
-   depends on ADVANCED_OPTIONS && !DEBUG_PAGEALLOC
+   depends on ADVANCED_OPTIONS
help
  On the 8xx, we have 32 instruction TLBs and 32 data TLBs. In each
  table 4 TLBs can be pinned.
-- 
2.25.0



[PATCH v4 37/45] powerpc/8xx: Refactor kernel address boundary comparison

2020-05-18 Thread Christophe Leroy
Now that linear and IMMR dedicated TLB handling is gone, kernel
boundary address comparison is similar in ITLB miss handler and
in DTLB miss handler.

Create a macro named compare_to_kernel_boundary.

When TASK_SIZE is strictly below 0x80000000 and PAGE_OFFSET is
above 0x80000000, it is enough to compare to 0x80000000, and this
can be done with a single instruction.

Using the not. instruction, we get to use the 'blt' conditional branch as
when doing a regular comparison:

0x00000000 <= addr <= 0x7fffffff  ==>
0xffffffff >= NOT(addr) >= 0x80000000
The above test corresponds to a 'blt'

Otherwise, do a regular comparison using two instructions.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_8xx.S | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 9f3f7f3d03a7..9a117b9f0998 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -32,10 +32,15 @@
 
 #include "head_32.h"
 
+.macro compare_to_kernel_boundary scratch, addr
#if CONFIG_TASK_SIZE <= 0x80000000 && CONFIG_PAGE_OFFSET >= 0x80000000
/* By simply checking Address >= 0x80000000, we know if it's a kernel address */
-#define SIMPLE_KERNEL_ADDRESS  1
+   not.    \scratch, \addr
+#else
+   rlwinm  \scratch, \addr, 16, 0xfff8
+   cmpli   cr0, \scratch, PAGE_OFFSET@h
 #endif
+.endm
 
 /*
  * We need an ITLB miss handler for kernel addresses if:
@@ -209,20 +214,11 @@ InstructionTLBMiss:
mtspr   SPRN_MD_EPN, r10
 #ifdef ITLB_MISS_KERNEL
mfcr    r11
-#if defined(SIMPLE_KERNEL_ADDRESS)
-   cmpi    cr0, r10, 0 /* Address >= 0x80000000 */
-#else
-   rlwinm  r10, r10, 16, 0xfff8
-   cmpli   cr0, r10, PAGE_OFFSET@h
-#endif
+   compare_to_kernel_boundary r10, r10
 #endif
mfspr   r10, SPRN_M_TWB /* Get level 1 table */
 #ifdef ITLB_MISS_KERNEL
-#if defined(SIMPLE_KERNEL_ADDRESS)
-   bge+3f
-#else
blt+3f
-#endif
rlwinm  r10, r10, 0, 20, 31
oris    r10, r10, (swapper_pg_dir - PAGE_OFFSET)@ha
 3:
@@ -288,9 +284,7 @@ DataStoreTLBMiss:
 * kernel page tables.
 */
mfspr   r10, SPRN_MD_EPN
-   rlwinm  r10, r10, 16, 0xfff8
-   cmpli   cr0, r10, PAGE_OFFSET@h
-
+   compare_to_kernel_boundary r10, r10
mfspr   r10, SPRN_M_TWB /* Get level 1 table */
blt+3f
rlwinm  r10, r10, 0, 20, 31
-- 
2.25.0
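
The equivalence the commit message relies on can be checked exhaustively
with a few lines of user-space C (a standalone sanity check, not kernel
code):

#include <assert.h>
#include <stdint.h>

int main(void)
{
	uint64_t a;

	/* addr <= 0x7fffffff  <=>  ~addr >= 0x80000000, which is why a
	 * single not. lets the handler keep using the 'blt' branch. */
	for (a = 0; a <= 0xffffffffULL; a++)
		assert(((uint32_t)a <= 0x7fffffffU) ==
		       (~(uint32_t)a >= 0x80000000U));
	return 0;
}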



[PATCH v4 31/45] powerpc/8xx: Don't set IMMR map anymore at boot

2020-05-18 Thread Christophe Leroy
Only early debug requires IMMR to be mapped early.

No need to set it up and pin it in assembly. Map it
through page tables at udbg init when necessary.

If CONFIG_PIN_TLB_IMMR is selected, pin it once we
don't need the 32 Mb pinned RAM anymore.

Signed-off-by: Christophe Leroy 
---
v2: Disable TLB reservation to modify entry 31
---
 arch/powerpc/kernel/head_8xx.S | 39 +-
 arch/powerpc/mm/mmu_decl.h |  4 +++
 arch/powerpc/mm/nohash/8xx.c   | 15 +---
 arch/powerpc/platforms/8xx/Kconfig |  2 +-
 arch/powerpc/sysdev/cpm_common.c   |  2 ++
 5 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index c9e3d54e6a6f..d607f4b53e0f 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -749,6 +749,23 @@ start_here:
rfi
 /* Load up the kernel context */
 2:
+#ifdef CONFIG_PIN_TLB_IMMR
+   lis r0, MD_TWAM@h
+   oris    r0, r0, 0x1f00
+   mtspr   SPRN_MD_CTR, r0
+   LOAD_REG_IMMEDIATE(r0, VIRT_IMMR_BASE | MD_EVALID)
+   tlbie   r0
+   mtspr   SPRN_MD_EPN, r0
+   LOAD_REG_IMMEDIATE(r0, MD_SVALID | MD_PS512K | MD_GUARDED)
+   mtspr   SPRN_MD_TWC, r0
+   mfspr   r0, SPRN_IMMR
+   rlwinm  r0, r0, 0, 0xfff80000
+   ori r0, r0, 0xf0 | _PAGE_DIRTY | _PAGE_SPS | _PAGE_SH | \
+   _PAGE_NO_CACHE | _PAGE_PRESENT
+   mtspr   SPRN_MD_RPN, r0
+   lis r0, (MD_TWAM | MD_RSV4I)@h
+   mtspr   SPRN_MD_CTR, r0
+#endif
tlbia   /* Clear all TLB entries */
sync/* wait for tlbia/tlbie to finish */
 
@@ -797,28 +814,6 @@ initial_mmu:
ori r8, r8, MD_APG_INIT@l
mtspr   SPRN_MD_AP, r8
 
-   /* Map a 512k page for the IMMR to get the processor
-* internal registers (among other things).
-*/
-#ifdef CONFIG_PIN_TLB_IMMR
-   orisr10, r10, MD_RSV4I@h
-   ori r10, r10, 0x1c00
-   mtspr   SPRN_MD_CTR, r10
-
-   mfspr   r9, 638 /* Get current IMMR */
-   andis.  r9, r9, 0xfff8  /* Get 512 kbytes boundary */
-
-   lis r8, VIRT_IMMR_BASE@h/* Create vaddr for TLB */
-   ori r8, r8, MD_EVALID   /* Mark it valid */
-   mtspr   SPRN_MD_EPN, r8
-   li  r8, MD_PS512K | MD_GUARDED  /* Set 512k byte page */
-   ori r8, r8, MD_SVALID   /* Make it valid */
-   mtspr   SPRN_MD_TWC, r8
-   mr  r8, r9  /* Create paddr for TLB */
-   ori r8, r8, MI_BOOTINIT|0x2 /* Inhibit cache -- Cort */
-   mtspr   SPRN_MD_RPN, r8
-#endif
-
/* Now map the lower RAM (up to 32 Mbytes) into the ITLB. */
 #ifdef CONFIG_PIN_TLB_TEXT
lis r8, MI_RSV4I@h
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 7097e07a209a..1b6d39e9baed 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -182,6 +182,10 @@ static inline void mmu_mark_initmem_nx(void) { }
 static inline void mmu_mark_rodata_ro(void) { }
 #endif
 
+#ifdef CONFIG_PPC_8xx
+void __init mmu_mapin_immr(void);
+#endif
+
 #ifdef CONFIG_PPC_DEBUG_WX
 void ptdump_check_wx(void);
 #else
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index bda5290af751..a9313aa6f1cd 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -66,7 +66,7 @@ void __init MMU_init_hw(void)
if (IS_ENABLED(CONFIG_PIN_TLB_DATA)) {
unsigned long ctr = mfspr(SPRN_MD_CTR) & 0xfe000000;
unsigned long flags = 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY;
-   int i = IS_ENABLED(CONFIG_PIN_TLB_IMMR) ? 29 : 28;
+   int i = 28;
unsigned long addr = 0;
unsigned long mem = total_lowmem;
 
@@ -81,12 +81,19 @@ void __init MMU_init_hw(void)
}
 }
 
-static void __init mmu_mapin_immr(void)
+static bool immr_is_mapped __initdata;
+
+void __init mmu_mapin_immr(void)
 {
unsigned long p = PHYS_IMMR_BASE;
unsigned long v = VIRT_IMMR_BASE;
int offset;
 
+   if (immr_is_mapped)
+   return;
+
+   immr_is_mapped = true;
+
for (offset = 0; offset < IMMR_SIZE; offset += PAGE_SIZE)
map_kernel_page(v + offset, p + offset, PAGE_KERNEL_NCG);
 }
@@ -122,9 +129,10 @@ unsigned long __init mmu_mapin_ram(unsigned long base, 
unsigned long top)
 {
unsigned long mapped;
 
+   mmu_mapin_immr();
+
if (__map_without_ltlbs) {
mapped = 0;
-   mmu_mapin_immr();
if (!IS_ENABLED(CONFIG_PIN_TLB_IMMR))
patch_instruction_site(__dtlbmiss_immr_jmp, 
ppc_inst(PPC_INST_NOP));
if (!IS_ENABLED(CONFIG_PIN_TLB_TEXT))
@@ -143,7 +151,6 @@ unsigned long __init mmu_mapin_ram(unsigned long base, 
unsigned long top)
 */

[PATCH v4 30/45] powerpc/8xx: Add function to set pinned TLBs

2020-05-18 Thread Christophe Leroy
Pinned TLBs cannot be modified when the MMU is enabled.

Create a function to rewrite the pinned TLB entries with MMU off.

To set pinned TLBs, we have to turn off the MMU, disable pinning,
do a TLB flush (either with tlbie or tlbia), then reprogram
the TLB entries, enable pinning and turn the MMU back on.

If using tlbie, it clears entries in both the instruction and data
TLBs regardless of whether pinning is disabled or not.
If using tlbia, it clears all entries of the TLB which has
pinning disabled.

To make it easy, just clear all entries in both TLBs, and
reprogram them.

The function takes two arguments, the top of the memory to
consider and whether data is RO under _sinittext.
When DEBUG_PAGEALLOC is set, the top is the end of kernel rodata.
Otherwise, that's the top of physical RAM.

Everything below _sinittext is set RX, over _sinittext that's RW.

Signed-off-by: Christophe Leroy 
---
v2: Function rewritten to manage all entries at once.
---
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h |   2 +
 arch/powerpc/kernel/head_8xx.S   | 103 +++
 2 files changed, 105 insertions(+)

diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h 
b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
index a092e6434bda..4d3ef3841b00 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
@@ -193,6 +193,8 @@
 
 #include 
 
+void mmu_pin_tlb(unsigned long top, bool readonly);
+
 typedef struct {
unsigned int id;
unsigned int active;
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 423465b10c82..c9e3d54e6a6f 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -16,6 +16,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -866,6 +867,108 @@ initial_mmu:
mtspr   SPRN_DER, r8
blr
 
+#ifdef CONFIG_PIN_TLB
+_GLOBAL(mmu_pin_tlb)
+   lis r9, (1f - PAGE_OFFSET)@h
+   ori r9, r9, (1f - PAGE_OFFSET)@l
+   mfmsr   r10
+   mflr    r11
+   li  r12, MSR_KERNEL & ~(MSR_IR | MSR_DR | MSR_RI)
+   rlwinm  r0, r10, 0, ~MSR_RI
+   rlwinm  r0, r0, 0, ~MSR_EE
+   mtmsr   r0
+   isync
+   .align  4
+   mtspr   SPRN_SRR0, r9
+   mtspr   SPRN_SRR1, r12
+   rfi
+1:
+   li  r5, 0
+   lis r6, MD_TWAM@h
+   mtspr   SPRN_MI_CTR, r5
+   mtspr   SPRN_MD_CTR, r6
+   tlbia
+
+#ifdef CONFIG_PIN_TLB_TEXT
+   LOAD_REG_IMMEDIATE(r5, 28 << 8)
+   LOAD_REG_IMMEDIATE(r6, PAGE_OFFSET)
+   LOAD_REG_IMMEDIATE(r7, MI_SVALID | MI_PS8MEG)
+   LOAD_REG_IMMEDIATE(r8, 0xf0 | _PAGE_RO | _PAGE_SPS | _PAGE_SH | 
_PAGE_PRESENT)
+   LOAD_REG_ADDR(r9, _sinittext)
+   li  r0, 4
+   mtctr   r0
+
+2: ori r0, r6, MI_EVALID
+   mtspr   SPRN_MI_CTR, r5
+   mtspr   SPRN_MI_EPN, r0
+   mtspr   SPRN_MI_TWC, r7
+   mtspr   SPRN_MI_RPN, r8
+   addi    r5, r5, 0x100
+   addis   r6, r6, SZ_8M@h
+   addis   r8, r8, SZ_8M@h
+   cmplw   r6, r9
+   bdnzt   lt, 2b
+   lis r0, MI_RSV4I@h
+   mtspr   SPRN_MI_CTR, r0
+#endif
+   LOAD_REG_IMMEDIATE(r5, 28 << 8 | MD_TWAM)
+#ifdef CONFIG_PIN_TLB_DATA
+   LOAD_REG_IMMEDIATE(r6, PAGE_OFFSET)
+   LOAD_REG_IMMEDIATE(r7, MD_SVALID | MD_PS8MEG)
+#ifdef CONFIG_PIN_TLB_IMMR
+   li  r0, 3
+#else
+   li  r0, 4
+#endif
+   mtctr   r0
+   cmpwi   r4, 0
+   beq 4f
+   LOAD_REG_IMMEDIATE(r8, 0xf0 | _PAGE_RO | _PAGE_SPS | _PAGE_SH | 
_PAGE_PRESENT)
+   LOAD_REG_ADDR(r9, _sinittext)
+
+2: ori r0, r6, MD_EVALID
+   mtspr   SPRN_MD_CTR, r5
+   mtspr   SPRN_MD_EPN, r0
+   mtspr   SPRN_MD_TWC, r7
+   mtspr   SPRN_MD_RPN, r8
+   addi    r5, r5, 0x100
+   addis   r6, r6, SZ_8M@h
+   addis   r8, r8, SZ_8M@h
+   cmplw   r6, r9
+   bdnzt   lt, 2b
+
+4: LOAD_REG_IMMEDIATE(r8, 0xf0 | _PAGE_SPS | _PAGE_SH | _PAGE_PRESENT)
+2: ori r0, r6, MD_EVALID
+   mtspr   SPRN_MD_CTR, r5
+   mtspr   SPRN_MD_EPN, r0
+   mtspr   SPRN_MD_TWC, r7
+   mtspr   SPRN_MD_RPN, r8
+   addi    r5, r5, 0x100
+   addis   r6, r6, SZ_8M@h
+   addis   r8, r8, SZ_8M@h
+   cmplw   r6, r3
+   bdnzt   lt, 2b
+#endif
+#ifdef CONFIG_PIN_TLB_IMMR
+   LOAD_REG_IMMEDIATE(r0, VIRT_IMMR_BASE | MD_EVALID)
+   LOAD_REG_IMMEDIATE(r7, MD_SVALID | MD_PS512K | MD_GUARDED)
+   mfspr   r8, SPRN_IMMR
+   rlwinm  r8, r8, 0, 0xfff80000
+   ori r8, r8, 0xf0 | _PAGE_DIRTY | _PAGE_SPS | _PAGE_SH | \
+   _PAGE_NO_CACHE | _PAGE_PRESENT
+   mtspr   SPRN_MD_CTR, r5
+   mtspr   SPRN_MD_EPN, r0
+   mtspr   SPRN_MD_TWC, r7
+   mtspr   SPRN_MD_RPN, r8
+#endif
+#if defined(CONFIG_PIN_TLB_IMMR) || defined(CONFIG_PIN_TLB_DATA)
+   lis r0, (MD_RSV4I | MD_TWAM)@h
+   mtspr   SPRN_MD_CTR, r0
+#endif
+   mtspr   SPRN_SRR1, r10
+   mtspr   SPRN_SRR0, r11
+   rfi
+#endif 
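
For context, patch 41/45 of this series ends up calling this helper once
the final protections are known:

	if (IS_ENABLED(CONFIG_PIN_TLB_TEXT))
		mmu_pin_tlb(block_mapped_ram, false);

	if (IS_ENABLED(CONFIG_PIN_TLB_DATA))
		mmu_pin_tlb(block_mapped_ram, true);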

[PATCH v4 19/45] powerpc/mm: Refactor pte_update() on book3s/32

2020-05-18 Thread Christophe Leroy
When CONFIG_PTE_64BIT is set, pte_update() operates on
'unsigned long long'
When CONFIG_PTE_64BIT is not set, pte_update() operates on
'unsigned long'

In asm/page.h, we have pte_basic_t which is 'unsigned long long'
when CONFIG_PTE_64BIT is set and 'unsigned long' otherwise.

Refactor pte_update() using pte_basic_t.

While we are at it, drop the comment on 44x which is not applicable
to book3s version of pte_update().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 58 +++-
 1 file changed, 20 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 0d4bccb4b9f2..d2fc324cdf07 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -253,53 +253,35 @@ extern void flush_hash_entry(struct mm_struct *mm, pte_t 
*ptep,
  * and the PTE may be either 32 or 64 bit wide. In the later case,
  * when using atomic updates, only the low part of the PTE is
  * accessed atomically.
- *
- * In addition, on 44x, we also maintain a global flag indicating
- * that an executable user mapping was modified, which is needed
- * to properly flush the virtually tagged instruction cache of
- * those implementations.
  */
-#ifndef CONFIG_PTE_64BIT
-static inline unsigned long pte_update(pte_t *p,
-  unsigned long clr,
-  unsigned long set)
+static inline pte_basic_t pte_update(pte_t *p, unsigned long clr, unsigned 
long set)
 {
-   unsigned long old, tmp;
-
-   __asm__ __volatile__("\
-1: lwarx   %0,0,%3\n\
-   andc%1,%0,%4\n\
-   or  %1,%1,%5\n"
-"  stwcx.  %1,0,%3\n\
-   bne-1b"
-   : "=" (old), "=" (tmp), "=m" (*p)
-   : "r" (p), "r" (clr), "r" (set), "m" (*p)
-   : "cc" );
-
-   return old;
-}
-#else /* CONFIG_PTE_64BIT */
-static inline unsigned long long pte_update(pte_t *p,
-   unsigned long clr,
-   unsigned long set)
-{
-   unsigned long long old;
+   pte_basic_t old;
unsigned long tmp;
 
-   __asm__ __volatile__("\
-1: lwarx   %L0,0,%4\n\
-   lwzx%0,0,%3\n\
-   andc%1,%L0,%5\n\
-   or  %1,%1,%6\n"
-"  stwcx.  %1,0,%4\n\
-   bne-1b"
+   __asm__ __volatile__(
+#ifndef CONFIG_PTE_64BIT
+"1:lwarx   %0, 0, %3\n"
+"  andc%1, %0, %4\n"
+#else
+"1:lwarx   %L0, 0, %3\n"
+"  lwz %0, -4(%3)\n"
+"  andc%1, %L0, %4\n"
+#endif
+"  or  %1, %1, %5\n"
+"  stwcx.  %1, 0, %3\n"
+"  bne-1b"
: "=" (old), "=" (tmp), "=m" (*p)
-   : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), "m" (*p)
+#ifndef CONFIG_PTE_64BIT
+   : "r" (p),
+#else
+   : "b" ((unsigned long)(p) + 4),
+#endif
+ "r" (clr), "r" (set), "m" (*p)
: "cc" );
 
return old;
 }
-#endif /* CONFIG_PTE_64BIT */
 
 /*
  * 2.6 calls this without flushing the TLB entry; this is wrong
-- 
2.25.0
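
For readers less familiar with lwarx/stwcx., the 32-bit case above is
morally equivalent to this user-space compare-and-swap sketch (GCC
builtins; an illustration only):

/* Atomically apply (old & ~clr) | set and return the previous value,
 * retrying if another CPU updated the PTE in between; the same
 * semantics as the lwarx/stwcx. loop. */
static unsigned long pte_update_sketch(unsigned long *p,
				       unsigned long clr, unsigned long set)
{
	unsigned long old = *p;

	while (!__atomic_compare_exchange_n(p, &old, (old & ~clr) | set,
					    0, __ATOMIC_RELAXED,
					    __ATOMIC_RELAXED))
		;	/* on failure, 'old' is reloaded from *p */
	return old;
}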



[PATCH v4 18/45] powerpc/mm: Refactor pte_update() on nohash/32

2020-05-18 Thread Christophe Leroy
When CONFIG_PTE_64BIT is set, pte_update() operates on
'unsigned long long'
When CONFIG_PTE_64BIT is not set, pte_update() operates on
'unsigned long'

In asm/page.h, we have pte_basic_t which is 'unsigned long long'
when CONFIG_PTE_64BIT is set and 'unsigned long' otherwise.

Refactor pte_update() using pte_basic_t.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 26 +++-
 1 file changed, 4 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 7e908a176e9e..db17f50d6ac3 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -222,12 +222,9 @@ static inline void pmd_clear(pmd_t *pmdp)
  * to properly flush the virtually tagged instruction cache of
  * those implementations.
  */
-#ifndef CONFIG_PTE_64BIT
-static inline unsigned long pte_update(pte_t *p,
-  unsigned long clr,
-  unsigned long set)
+static inline pte_basic_t pte_update(pte_t *p, unsigned long clr, unsigned 
long set)
 {
-#ifdef PTE_ATOMIC_UPDATES
+#if defined(PTE_ATOMIC_UPDATES) && !defined(CONFIG_PTE_64BIT)
unsigned long old, tmp;
 
__asm__ __volatile__("\
@@ -241,8 +238,8 @@ static inline unsigned long pte_update(pte_t *p,
: "r" (p), "r" (clr), "r" (set), "m" (*p)
: "cc" );
 #else /* PTE_ATOMIC_UPDATES */
-   unsigned long old = pte_val(*p);
-   unsigned long new = (old & ~clr) | set;
+   pte_basic_t old = pte_val(*p);
+   pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
 
 #if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
p->pte = p->pte1 = p->pte2 = p->pte3 = new;
@@ -257,21 +254,6 @@ static inline unsigned long pte_update(pte_t *p,
 #endif
return old;
 }
-#else /* CONFIG_PTE_64BIT */
-static inline unsigned long long pte_update(pte_t *p,
-   unsigned long clr,
-   unsigned long set)
-{
-   unsigned long long old = pte_val(*p);
-   *p = __pte((old & ~(unsigned long long)clr) | set);
-
-#ifdef CONFIG_44x
-   if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
-   icache_44x_need_flush = 1;
-#endif
-   return old;
-}
-#endif /* CONFIG_PTE_64BIT */
 
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 static inline int __ptep_test_and_clear_young(unsigned int context, unsigned 
long addr, pte_t *ptep)
-- 
2.25.0



[PATCH v4 12/45] powerpc/ptdump: Properly handle non standard page size

2020-05-18 Thread Christophe Leroy
In order to properly display information regardless of the page size,
it is necessary to take into account the real page size.

Signed-off-by: Christophe Leroy 
Fixes: cabe8138b23c ("powerpc: dump as a single line areas mapping a single 
physical page.")
Cc: sta...@vger.kernel.org
---
v3: Fixed sizes which were shifted one level (went unnoticed on PPC32 as PMD and
PUD levels don't exist)
---
 arch/powerpc/mm/ptdump/ptdump.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 1f97668853e3..98d82dcf6f0b 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -60,6 +60,7 @@ struct pg_state {
unsigned long start_address;
unsigned long start_pa;
unsigned long last_pa;
+   unsigned long page_size;
unsigned int level;
u64 current_flags;
bool check_wx;
@@ -168,9 +169,9 @@ static void dump_addr(struct pg_state *st, unsigned long 
addr)
 #endif
 
pt_dump_seq_printf(st->seq, REG "-" REG " ", st->start_address, addr - 
1);
-   if (st->start_pa == st->last_pa && st->start_address + PAGE_SIZE != 
addr) {
+   if (st->start_pa == st->last_pa && st->start_address + st->page_size != 
addr) {
pt_dump_seq_printf(st->seq, "[" REG "]", st->start_pa);
-   delta = PAGE_SIZE >> 10;
+   delta = st->page_size >> 10;
} else {
pt_dump_seq_printf(st->seq, " " REG " ", st->start_pa);
delta = (addr - st->start_address) >> 10;
@@ -195,7 +196,7 @@ static void note_prot_wx(struct pg_state *st, unsigned long 
addr)
 }
 
 static void note_page(struct pg_state *st, unsigned long addr,
-  unsigned int level, u64 val)
+  unsigned int level, u64 val, unsigned long page_size)
 {
u64 flag = val & pg_level[level].mask;
u64 pa = val & PTE_RPN_MASK;
@@ -207,6 +208,7 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
st->start_address = addr;
st->start_pa = pa;
st->last_pa = pa;
+   st->page_size = page_size;
pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
/*
 * Dump the section of virtual memory when:
@@ -218,7 +220,7 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
 */
} else if (flag != st->current_flags || level != st->level ||
   addr >= st->marker[1].start_address ||
-  (pa != st->last_pa + PAGE_SIZE &&
+  (pa != st->last_pa + st->page_size &&
(pa != st->start_pa || st->start_pa != st->last_pa))) {
 
/* Check the PTE flags */
@@ -246,6 +248,7 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
st->start_address = addr;
st->start_pa = pa;
st->last_pa = pa;
+   st->page_size = page_size;
st->current_flags = flag;
st->level = level;
} else {
@@ -261,7 +264,7 @@ static void walk_pte(struct pg_state *st, pmd_t *pmd, 
unsigned long start)
 
for (i = 0; i < PTRS_PER_PTE; i++, pte++) {
addr = start + i * PAGE_SIZE;
-   note_page(st, addr, 4, pte_val(*pte));
+   note_page(st, addr, 4, pte_val(*pte), PAGE_SIZE);
 
}
 }
@@ -278,7 +281,7 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, 
unsigned long start)
/* pmd exists */
walk_pte(st, pmd, addr);
else
-   note_page(st, addr, 3, pmd_val(*pmd));
+   note_page(st, addr, 3, pmd_val(*pmd), PMD_SIZE);
}
 }
 
@@ -294,7 +297,7 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, 
unsigned long start)
/* pud exists */
walk_pmd(st, pud, addr);
else
-   note_page(st, addr, 2, pud_val(*pud));
+   note_page(st, addr, 2, pud_val(*pud), PUD_SIZE);
}
 }
 
@@ -313,7 +316,7 @@ static void walk_pagetables(struct pg_state *st)
/* pgd exists */
walk_pud(st, pgd, addr);
else
-   note_page(st, addr, 1, pgd_val(*pgd));
+   note_page(st, addr, 1, pgd_val(*pgd), PGDIR_SIZE);
}
 }
 
@@ -368,7 +371,7 @@ static int ptdump_show(struct seq_file *m, void *v)
 
/* Traverse kernel page tables */
walk_pagetables();
-   note_page(&st, 0, 0, 0);
+   note_page(&st, 0, 0, 0, 0);
return 0;
 }
 
-- 
2.25.0
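
A small userspace sketch of the dump_addr() size computation above shows why
the entry's real page size matters; the single 8M mapping at 0xc0000000 and
its physical address are invented, not from a real dump.

#include <stdio.h>

int main(void)
{
	unsigned long start = 0xc0000000, addr = 0xc0800000;	/* one 8M page */
	unsigned long page_size = 0x800000;		/* was PAGE_SIZE before */
	unsigned long start_pa = 0x2000000, last_pa = 0x2000000;
	unsigned long delta;

	if (start_pa == last_pa && start + page_size != addr)
		delta = page_size >> 10;	/* one page mapped repeatedly */
	else
		delta = (addr - start) >> 10;	/* contiguous range */

	printf("%luK\n", delta);	/* 8192K; with PAGE_SIZE the wrong branch ran */
	return 0;
}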



[PATCH v4 43/45] powerpc/8xx: Implement dedicated kasan_init_region()

2020-05-18 Thread Christophe Leroy
Implement a kasan_init_region() dedicated to 8xx that
allocates KASAN regions using huge pages.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/kasan/8xx.c| 74 ++
 arch/powerpc/mm/kasan/Makefile |  1 +
 2 files changed, 75 insertions(+)
 create mode 100644 arch/powerpc/mm/kasan/8xx.c

diff --git a/arch/powerpc/mm/kasan/8xx.c b/arch/powerpc/mm/kasan/8xx.c
new file mode 100644
index 000000000000..db4ef44af22f
--- /dev/null
+++ b/arch/powerpc/mm/kasan/8xx.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define DISABLE_BRANCH_PROFILING
+
+#include 
+#include 
+#include 
+#include 
+
+static int __init
+kasan_init_shadow_8M(unsigned long k_start, unsigned long k_end, void *block)
+{
+   pmd_t *pmd = pmd_ptr_k(k_start);
+   unsigned long k_cur, k_next;
+
+   for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd += 2, block 
+= SZ_8M) {
+   pte_basic_t *new;
+
+   k_next = pgd_addr_end(k_cur, k_end);
+   k_next = pgd_addr_end(k_next, k_end);
+   if ((void *)pmd_page_vaddr(*pmd) != kasan_early_shadow_pte)
+   continue;
+
+   new = memblock_alloc(sizeof(pte_basic_t), SZ_4K);
+   if (!new)
+   return -ENOMEM;
+
+   *new = pte_val(pte_mkhuge(pfn_pte(PHYS_PFN(__pa(block)), 
PAGE_KERNEL)));
+
+   hugepd_populate_kernel((hugepd_t *)pmd, (pte_t *)new, 
PAGE_SHIFT_8M);
+   hugepd_populate_kernel((hugepd_t *)pmd + 1, (pte_t *)new, 
PAGE_SHIFT_8M);
+   }
+   return 0;
+}
+
+int __init kasan_init_region(void *start, size_t size)
+{
+   unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
+   unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);
+   unsigned long k_cur;
+   int ret;
+   void *block;
+
+   block = memblock_alloc(k_end - k_start, SZ_8M);
+   if (!block)
+   return -ENOMEM;
+
+   if (IS_ALIGNED(k_start, SZ_8M)) {
+   kasan_init_shadow_8M(k_start, ALIGN_DOWN(k_end, SZ_8M), block);
+   k_cur = ALIGN_DOWN(k_end, SZ_8M);
+   if (k_cur == k_end)
+   goto finish;
+   } else {
+   k_cur = k_start;
+   }
+
+   ret = kasan_init_shadow_page_tables(k_start, k_end);
+   if (ret)
+   return ret;
+
+   for (; k_cur < k_end; k_cur += PAGE_SIZE) {
+   pmd_t *pmd = pmd_ptr_k(k_cur);
+   void *va = block + k_cur - k_start;
+   pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
+
+   if (k_cur < ALIGN_DOWN(k_end, SZ_512K))
+   pte = pte_mkhuge(pte);
+
+   __set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), 
pte, 0);
+   }
+finish:
+   flush_tlb_kernel_range(k_start, k_end);
+   return 0;
+}
diff --git a/arch/powerpc/mm/kasan/Makefile b/arch/powerpc/mm/kasan/Makefile
index 6577897673dd..440038ea79f1 100644
--- a/arch/powerpc/mm/kasan/Makefile
+++ b/arch/powerpc/mm/kasan/Makefile
@@ -3,3 +3,4 @@
 KASAN_SANITIZE := n
 
 obj-$(CONFIG_PPC32)   += kasan_init_32.o
+obj-$(CONFIG_PPC_8xx)  += 8xx.o
-- 
2.25.0
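
As a sanity check on the sizes involved, here is a userspace sketch of the
shadow arithmetic behind this region mapping (the generic kasan_mem_to_shadow()
scaling); the 0xe0000000 shadow offset and 0xc0000000 linear address are
illustrative assumptions consistent with the ppc32 layout, not a specific
board's config.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t offset = 0xe0000000u;	/* assumed KASAN_SHADOW_OFFSET */
	uint32_t addr = 0xc0000000u;	/* start of a 64 MB linear block */

	uint32_t shadow = offset + (addr >> 3);	/* 1 shadow byte per 8 bytes */
	printf("shadow of 0x%08x = 0x%08x\n", addr, shadow);
	/* 64 MB of memory needs 8 MB of shadow: exactly one 8M hugepage. */
	return 0;
}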



[PATCH v4 27/45] powerpc/8xx: Only 8M pages are hugepte pages now

2020-05-18 Thread Christophe Leroy
512k pages are now standard pages, so only 8M pages
are hugepte pages.

No more handling of normal page tables through hugepd allocation
and freeing, and hugepte helpers can also be simplified.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h |  7 +++
 arch/powerpc/mm/hugetlbpage.c| 16 +++-
 2 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h 
b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
index 785437323576..1c7d4693a78e 100644
--- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
@@ -13,13 +13,13 @@ static inline pte_t *hugepd_page(hugepd_t hpd)
 
 static inline unsigned int hugepd_shift(hugepd_t hpd)
 {
-   return ((hpd_val(hpd) & _PMD_PAGE_MASK) >> 1) + 17;
+   return PAGE_SHIFT_8M;
 }
 
 static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr,
unsigned int pdshift)
 {
-   unsigned long idx = (addr & ((1UL << pdshift) - 1)) >> PAGE_SHIFT;
+   unsigned long idx = (addr & (SZ_4M - 1)) >> PAGE_SHIFT;
 
return hugepd_page(hpd) + idx;
 }
@@ -32,8 +32,7 @@ static inline void flush_hugetlb_page(struct vm_area_struct 
*vma,
 
 static inline void hugepd_populate(hugepd_t *hpdp, pte_t *new, unsigned int 
pshift)
 {
-   *hpdp = __hugepd(__pa(new) | _PMD_USER | _PMD_PRESENT |
-(pshift == PAGE_SHIFT_8M ? _PMD_PAGE_8M : 
_PMD_PAGE_512K));
+   *hpdp = __hugepd(__pa(new) | _PMD_USER | _PMD_PRESENT | _PMD_PAGE_8M);
 }
 
 static inline int check_and_get_huge_psize(int shift)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 38bad839e608..cfacd364c7aa 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -54,24 +54,17 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t 
*hpdp,
if (pshift >= pdshift) {
cachep = PGT_CACHE(PTE_T_ORDER);
num_hugepd = 1 << (pshift - pdshift);
-   new = NULL;
-   } else if (IS_ENABLED(CONFIG_PPC_8xx)) {
-   cachep = NULL;
-   num_hugepd = 1;
-   new = pte_alloc_one(mm);
} else {
cachep = PGT_CACHE(pdshift - pshift);
num_hugepd = 1;
-   new = NULL;
}
 
-   if (!cachep && !new) {
+   if (!cachep) {
WARN_ONCE(1, "No page table cache created for hugetlb tables");
return -ENOMEM;
}
 
-   if (cachep)
-   new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, 
GFP_KERNEL));
+   new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL));
 
BUG_ON(pshift > HUGEPD_SHIFT_MASK);
BUG_ON((unsigned long)new & HUGEPD_SHIFT_MASK);
@@ -102,10 +95,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t 
*hpdp,
if (i < num_hugepd) {
for (i = i - 1 ; i >= 0; i--, hpdp--)
*hpdp = __hugepd(0);
-   if (cachep)
-   kmem_cache_free(cachep, new);
-   else
-   pte_free(mm, new);
+   kmem_cache_free(cachep, new);
} else {
kmemleak_ignore(new);
}
-- 
2.25.0



[PATCH v4 28/45] powerpc/8xx: MM_SLICE is not needed anymore

2020-05-18 Thread Christophe Leroy
As the 8xx now manages 512k pages in standard page tables,
it doesn't need CONFIG_PPC_MM_SLICES anymore.

Don't select it anymore and remove all related code.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 64 
 arch/powerpc/include/asm/nohash/32/slice.h   | 20 --
 arch/powerpc/include/asm/slice.h |  2 -
 arch/powerpc/platforms/Kconfig.cputype   |  1 -
 4 files changed, 87 deletions(-)
 delete mode 100644 arch/powerpc/include/asm/nohash/32/slice.h

diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h 
b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
index 26b7cee34dfe..a092e6434bda 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
@@ -176,12 +176,6 @@
  */
 #define SPRN_M_TW  799
 
-#ifdef CONFIG_PPC_MM_SLICES
-#include 
-#define SLICE_ARRAY_SIZE   (1 << (32 - SLICE_LOW_SHIFT - 1))
-#define LOW_SLICE_ARRAY_SZ SLICE_ARRAY_SIZE
-#endif
-
 #if defined(CONFIG_PPC_4K_PAGES)
 #define mmu_virtual_psize  MMU_PAGE_4K
 #elif defined(CONFIG_PPC_16K_PAGES)
@@ -199,71 +193,13 @@
 
 #include 
 
-struct slice_mask {
-   u64 low_slices;
-   DECLARE_BITMAP(high_slices, 0);
-};
-
 typedef struct {
unsigned int id;
unsigned int active;
unsigned long vdso_base;
-#ifdef CONFIG_PPC_MM_SLICES
-   u16 user_psize; /* page size index */
-   unsigned char low_slices_psize[SLICE_ARRAY_SIZE];
-   unsigned char high_slices_psize[0];
-   unsigned long slb_addr_limit;
-   struct slice_mask mask_base_psize; /* 4k or 16k */
-   struct slice_mask mask_512k;
-   struct slice_mask mask_8m;
-#endif
void *pte_frag;
 } mm_context_t;
 
-#ifdef CONFIG_PPC_MM_SLICES
-static inline u16 mm_ctx_user_psize(mm_context_t *ctx)
-{
-   return ctx->user_psize;
-}
-
-static inline void mm_ctx_set_user_psize(mm_context_t *ctx, u16 user_psize)
-{
-   ctx->user_psize = user_psize;
-}
-
-static inline unsigned char *mm_ctx_low_slices(mm_context_t *ctx)
-{
-   return ctx->low_slices_psize;
-}
-
-static inline unsigned char *mm_ctx_high_slices(mm_context_t *ctx)
-{
-   return ctx->high_slices_psize;
-}
-
-static inline unsigned long mm_ctx_slb_addr_limit(mm_context_t *ctx)
-{
-   return ctx->slb_addr_limit;
-}
-
-static inline void mm_ctx_set_slb_addr_limit(mm_context_t *ctx, unsigned long 
limit)
-{
-   ctx->slb_addr_limit = limit;
-}
-
-static inline struct slice_mask *slice_mask_for_size(mm_context_t *ctx, int 
psize)
-{
-   if (psize == MMU_PAGE_512K)
-   return &ctx->mask_512k;
-   if (psize == MMU_PAGE_8M)
-   return &ctx->mask_8m;
-
-   BUG_ON(psize != mmu_virtual_psize);
-
-   return &ctx->mask_base_psize;
-}
-#endif /* CONFIG_PPC_MM_SLICE */
-
 #define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff8)
 #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
 
diff --git a/arch/powerpc/include/asm/nohash/32/slice.h 
b/arch/powerpc/include/asm/nohash/32/slice.h
deleted file mode 100644
index 39eb0154ae2d..000000000000
--- a/arch/powerpc/include/asm/nohash/32/slice.h
+++ /dev/null
@@ -1,20 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_POWERPC_NOHASH_32_SLICE_H
-#define _ASM_POWERPC_NOHASH_32_SLICE_H
-
-#ifdef CONFIG_PPC_MM_SLICES
-
-#define SLICE_LOW_SHIFT26  /* 64 slices */
-#define SLICE_LOW_TOP  (0x100000000ull)
-#define SLICE_NUM_LOW  (SLICE_LOW_TOP >> SLICE_LOW_SHIFT)
-#define GET_LOW_SLICE_INDEX(addr)  ((addr) >> SLICE_LOW_SHIFT)
-
-#define SLICE_HIGH_SHIFT   0
-#define SLICE_NUM_HIGH 0ul
-#define GET_HIGH_SLICE_INDEX(addr) (addr & 0)
-
-#define SLB_ADDR_LIMIT_DEFAULT DEFAULT_MAP_WINDOW
-
-#endif /* CONFIG_PPC_MM_SLICES */
-
-#endif /* _ASM_POWERPC_NOHASH_32_SLICE_H */
diff --git a/arch/powerpc/include/asm/slice.h b/arch/powerpc/include/asm/slice.h
index c6f466f4c241..0bdd9c62eca0 100644
--- a/arch/powerpc/include/asm/slice.h
+++ b/arch/powerpc/include/asm/slice.h
@@ -4,8 +4,6 @@
 
 #ifdef CONFIG_PPC_BOOK3S_64
 #include 
-#elif defined(CONFIG_PPC_MMU_NOHASH_32)
-#include 
 #endif
 
 #ifndef __ASSEMBLY__
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index 27a81c291be8..5774a55a9c58 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -55,7 +55,6 @@ config PPC_8xx
select SYS_SUPPORTS_HUGETLBFS
select PPC_HAVE_KUEP
select PPC_HAVE_KUAP
-   select PPC_MM_SLICES if HUGETLB_PAGE
select HAVE_ARCH_VMAP_STACK
 
 config 40x
-- 
2.25.0



[PATCH v4 32/45] powerpc/8xx: Always pin TLBs at startup.

2020-05-18 Thread Christophe Leroy
At startup, map 32 Mbytes of memory through 4 pages of 8M,
and pin them unconditionally. They need to be pinned because
KASAN is using page tables early and the TLBs might be
dynamically replaced otherwise.

Remove the RSV4I flag after installing the mappings unless
one of the CONFIG_PIN_TLB_* options is selected.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_8xx.S | 31 +--
 arch/powerpc/mm/nohash/8xx.c   | 19 +--
 2 files changed, 18 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index d607f4b53e0f..b0cceee6405c 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -765,6 +765,14 @@ start_here:
mtspr   SPRN_MD_RPN, r0
lis r0, (MD_TWAM | MD_RSV4I)@h
mtspr   SPRN_MD_CTR, r0
+#endif
+#ifndef CONFIG_PIN_TLB_TEXT
+   li  r0, 0
+   mtspr   SPRN_MI_CTR, r0
+#endif
+#if !defined(CONFIG_PIN_TLB_DATA) && !defined(CONFIG_PIN_TLB_IMMR)
+   lis r0, MD_TWAM@h
+   mtspr   SPRN_MD_CTR, r0
 #endif
tlbia   /* Clear all TLB entries */
sync/* wait for tlbia/tlbie to finish */
@@ -802,10 +810,6 @@ initial_mmu:
	mtspr	SPRN_MD_CTR, r10	/* remove PINNED DTLB entries */
 
tlbia   /* Invalidate all TLB entries */
-#ifdef CONFIG_PIN_TLB_DATA
-	oris	r10, r10, MD_RSV4I@h
-	mtspr	SPRN_MD_CTR, r10	/* Set data TLB control */
-#endif
 
lis r8, MI_APG_INIT@h   /* Set protection modes */
ori r8, r8, MI_APG_INIT@l
@@ -814,33 +818,32 @@ initial_mmu:
ori r8, r8, MD_APG_INIT@l
mtspr   SPRN_MD_AP, r8
 
-   /* Now map the lower RAM (up to 32 Mbytes) into the ITLB. */
-#ifdef CONFIG_PIN_TLB_TEXT
+   /* Map the lower RAM (up to 32 Mbytes) into the ITLB and DTLB */
lis r8, MI_RSV4I@h
ori r8, r8, 0x1c00
-#endif
+	oris	r12, r10, MD_RSV4I@h
+   ori r12, r12, 0x1c00
li  r9, 4   /* up to 4 pages of 8M */
mtctr   r9
	lis	r9, KERNELBASE@h	/* Create vaddr for TLB */
	li	r10, MI_PS8MEG | MI_SVALID	/* Set 8M byte page */
	li	r11, MI_BOOTINIT	/* Create RPN for address 0 */
-   lis r12, _einittext@h
-   ori r12, r12, _einittext@l
 1:
-#ifdef CONFIG_PIN_TLB_TEXT
mtspr   SPRN_MI_CTR, r8 /* Set instruction MMU control */
	addi	r8, r8, 0x100
-#endif
-
ori r0, r9, MI_EVALID   /* Mark it valid */
mtspr   SPRN_MI_EPN, r0
mtspr   SPRN_MI_TWC, r10
mtspr   SPRN_MI_RPN, r11/* Store TLB entry */
+   mtspr   SPRN_MD_CTR, r12
+	addi	r12, r12, 0x100
+   mtspr   SPRN_MD_EPN, r0
+   mtspr   SPRN_MD_TWC, r10
+   mtspr   SPRN_MD_RPN, r11
addis   r9, r9, 0x80
addis   r11, r11, 0x80
 
-	cmpl	cr0, r9, r12
-   bdnzf   gt, 1b
+	bdnz	1b
 
/* Since the cache is enabled according to the information we
 * just loaded into the TLB, invalidate and enable the caches here.
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index a9313aa6f1cd..2c480e35b426 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -62,23 +62,6 @@ unsigned long p_block_mapped(phys_addr_t pa)
  */
 void __init MMU_init_hw(void)
 {
-   /* PIN up to the 3 first 8Mb after IMMR in DTLB table */
-   if (IS_ENABLED(CONFIG_PIN_TLB_DATA)) {
-   unsigned long ctr = mfspr(SPRN_MD_CTR) & 0xfe000000;
-   unsigned long flags = 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY;
-   int i = 28;
-   unsigned long addr = 0;
-   unsigned long mem = total_lowmem;
-
-   for (; i < 32 && mem >= LARGE_PAGE_SIZE_8M; i++) {
-   mtspr(SPRN_MD_CTR, ctr | (i << 8));
-   mtspr(SPRN_MD_EPN, (unsigned long)__va(addr) | 
MD_EVALID);
-   mtspr(SPRN_MD_TWC, MD_PS8MEG | MD_SVALID);
-   mtspr(SPRN_MD_RPN, addr | flags | _PAGE_PRESENT);
-   addr += LARGE_PAGE_SIZE_8M;
-   mem -= LARGE_PAGE_SIZE_8M;
-   }
-   }
 }
 
 static bool immr_is_mapped __initdata;
@@ -226,7 +209,7 @@ void __init setup_initial_memory_limit(phys_addr_t 
first_memblock_base,
BUG_ON(first_memblock_base != 0);
 
/* 8xx can only access 32MB at the moment */
-   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x02000000));
+   memblock_set_current_limit(min_t(u64, first_memblock_size, SZ_32M));
 }
 
 /*
-- 
2.25.0



Re: [PATCH 4.19 02/80] shmem: fix possible deadlocks on shmlock_user_lock

2020-05-18 Thread Greg Kroah-Hartman
On Mon, May 18, 2020 at 06:10:58PM -0700, Hugh Dickins wrote:
> Hi Pavel,
> 
> On Mon, 18 May 2020, Pavel Machek wrote:
> 
> > Hi!
> > 
> > > This may not risk an actual deadlock, since shmem inodes do not take
> > > part in writeback accounting, but there are several easy ways to avoid
> > > it.
> > 
> > ...
> > 
> > > Take info->lock out of the chain and the possibility of deadlock or
> > > lockdep warning goes away.
> > 
> > It is unclear to me if actual possibility of deadlock exists or not,
> > but anyway:
> > 
> > >   int retval = -ENOMEM;
> > >  
> > > - spin_lock_irq(&info->lock);
> > > + /*
> > > +  * What serializes the accesses to info->flags?
> > > +  * ipc_lock_object() when called from shmctl_do_lock(),
> > > +  * no serialization needed when called from shm_destroy().
> > > +  */
> > >   if (lock && !(info->flags & VM_LOCKED)) {
> > >   if (!user_shm_lock(inode->i_size, user))
> > >   goto out_nomem;
> > 
> > Should we have READ_ONCE() here? If it is okay, are concurency
> > sanitizers smart enough to realize that it is okay? Replacing warning
> > with different one would not be exactly a win...
> 
> If a sanitizer comes to question this change, I don't see how a
> READ_ONCE() anywhere near here (on info->flags?) is likely to be
> enough to satisfy it - it would be asking for a locking scheme that
> it understands (being unable to read the comment) - and might then
> ask for that same locking in the numerous other places that read
> info->flags (and a few that write it).  Add data_race()s all over?
> 
> (Or are you concerned about that inode->i_size, which I suppose ought
> really to be i_size_read(inode) on some 32-bit configurations; though
> that's of very long standing, and has never caused any concern before.)
> 
> I am not at all willing to add annotations speculatively, in case this
> or that tool turns out to want help later.  So far I've not heard of
> any such complaint on 5.7-rc[3456] or linux-next: but maybe this is
> too soon to hear a complaint, and you feel this should not be rushed
> into -stable?
> 
> This was an AUTOSEL selection, to which I have no objection, but it
> isn't something we were desperate to push into -stable: so I've also
> no objection if Greg shares your concern, and prefers to withdraw it.
> (That choice may depend on to what extent he expects to be keeping
> -stable clean against upcoming sanitizers in future.)

Sanitizers run on stable trees all the time, as those are the releases that
end up in products, where people run them.  That's why I like to take
those types of fixes, especially when tools report them.

thanks,

greg k-h


[PATCH v4 40/45] powerpc/8xx: Map linear memory with huge pages

2020-05-18 Thread Christophe Leroy
Map linear memory space with 512k and 8M pages whenever
possible.

Three mappings are performed:
- One for kernel text
- One for RO data
- One for the rest

Separating the mappings is done to be able to update the
protection later when using STRICT_KERNEL_RWX.

The ITLB miss handler now needs to also handle huge TLBs,
unless kernel text is pinned.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_8xx.S |  4 +--
 arch/powerpc/mm/nohash/8xx.c   | 50 +-
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 9a117b9f0998..abb71fad7d6a 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -224,7 +224,7 @@ InstructionTLBMiss:
 3:
	mtcr	r11
 #endif
-#ifdef CONFIG_HUGETLBFS
+#if defined(CONFIG_HUGETLBFS) || !defined(CONFIG_PIN_TLB_TEXT)
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r10)/* Get level 1 
entry */
mtspr   SPRN_MD_TWC, r11
 #else
@@ -234,7 +234,7 @@ InstructionTLBMiss:
 #endif
mfspr   r10, SPRN_MD_TWC
lwz r10, 0(r10) /* Get the pte */
-#ifdef CONFIG_HUGETLBFS
+#if defined(CONFIG_HUGETLBFS) || !defined(CONFIG_PIN_TLB_TEXT)
rlwimi  r11, r10, 32 - 9, _PMD_PAGE_512K
mtspr   SPRN_MI_TWC, r11
 #endif
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index f8fff1fa72e3..ec3ef75895d8 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -127,20 +127,68 @@ void __init mmu_mapin_immr(void)
PAGE_KERNEL_NCG, MMU_PAGE_512K, true);
 }
 
+static void __init mmu_mapin_ram_chunk(unsigned long offset, unsigned long top,
+  pgprot_t prot, bool new)
+{
+   unsigned long v = PAGE_OFFSET + offset;
+   unsigned long p = offset;
+
+   WARN_ON(!IS_ALIGNED(offset, SZ_512K) || !IS_ALIGNED(top, SZ_512K));
+
+   for (; p < ALIGN(p, SZ_8M) && p < top; p += SZ_512K, v += SZ_512K)
+   __early_map_kernel_hugepage(v, p, prot, MMU_PAGE_512K, new);
+   for (; p < ALIGN_DOWN(top, SZ_8M) && p < top; p += SZ_8M, v += SZ_8M)
+   __early_map_kernel_hugepage(v, p, prot, MMU_PAGE_8M, new);
+   for (; p < ALIGN_DOWN(top, SZ_512K) && p < top; p += SZ_512K, v += 
SZ_512K)
+   __early_map_kernel_hugepage(v, p, prot, MMU_PAGE_512K, new);
+
+   if (!new)
+   flush_tlb_kernel_range(PAGE_OFFSET + v, PAGE_OFFSET + top);
+}
+
 unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
 {
+   unsigned long etext8 = ALIGN(__pa(_etext), SZ_8M);
+   unsigned long sinittext = __pa(_sinittext);
+   unsigned long boundary = strict_kernel_rwx_enabled() ? sinittext : 
etext8;
+   unsigned long einittext8 = ALIGN(__pa(_einittext), SZ_8M);
+
+   WARN_ON(top < einittext8);
+
mmu_mapin_immr();
 
-   return 0;
+   if (__map_without_ltlbs)
+   return 0;
+
+   mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, true);
+   mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL_TEXT, true);
+   mmu_mapin_ram_chunk(einittext8, top, PAGE_KERNEL, true);
+
+   if (top > SZ_32M)
+   memblock_set_current_limit(top);
+
+   block_mapped_ram = top;
+
+   return top;
 }
 
 void mmu_mark_initmem_nx(void)
 {
+   unsigned long etext8 = ALIGN(__pa(_etext), SZ_8M);
+   unsigned long sinittext = __pa(_sinittext);
+   unsigned long boundary = strict_kernel_rwx_enabled() ? sinittext : 
etext8;
+   unsigned long einittext8 = ALIGN(__pa(_einittext), SZ_8M);
+
+   mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, false);
+   mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
 }
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
 void mmu_mark_rodata_ro(void)
 {
+   unsigned long sinittext = __pa(_sinittext);
+
+   mmu_mapin_ram_chunk(0, sinittext, PAGE_KERNEL_ROX, false);
 }
 #endif
 
-- 
2.25.0
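
The three-pass loop in mmu_mapin_ram_chunk() above is easiest to see with a
userspace sketch; the 0x00280000..0x01f80000 range is made up to exercise all
three passes (512k head, 8M middle, 512k tail).

#include <stdio.h>

#define SZ_512K	0x80000UL
#define SZ_8M	0x800000UL
#define ALIGN_UP(x, a)		(((x) + (a) - 1) & ~((a) - 1))
#define ALIGN_DOWN(x, a)	((x) & ~((a) - 1))

int main(void)
{
	unsigned long p = 0x00280000, top = 0x01f80000;

	for (; p < ALIGN_UP(p, SZ_8M) && p < top; p += SZ_512K)
		printf("512k page at 0x%08lx\n", p);	/* head up to 8M boundary */
	for (; p < ALIGN_DOWN(top, SZ_8M) && p < top; p += SZ_8M)
		printf("8M page at 0x%08lx\n", p);	/* aligned middle */
	for (; p < ALIGN_DOWN(top, SZ_512K) && p < top; p += SZ_512K)
		printf("512k page at 0x%08lx\n", p);	/* tail */
	return 0;
}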



[PATCH v4 45/45] powerpc/32s: Implement dedicated kasan_init_region()

2020-05-18 Thread Christophe Leroy
Implement a kasan_init_region() dedicated to book3s/32 that
allocates KASAN regions using BATs.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/kasan.h  |  1 +
 arch/powerpc/mm/kasan/Makefile|  1 +
 arch/powerpc/mm/kasan/book3s_32.c | 57 +++
 arch/powerpc/mm/kasan/kasan_init_32.c |  2 +-
 4 files changed, 60 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/mm/kasan/book3s_32.c

diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
index 107a24c3f7b3..be85c7005fb1 100644
--- a/arch/powerpc/include/asm/kasan.h
+++ b/arch/powerpc/include/asm/kasan.h
@@ -34,6 +34,7 @@ static inline void kasan_init(void) { }
 static inline void kasan_late_init(void) { }
 #endif
 
+void kasan_update_early_region(unsigned long k_start, unsigned long k_end, 
pte_t pte);
 int kasan_init_shadow_page_tables(unsigned long k_start, unsigned long k_end);
 int kasan_init_region(void *start, size_t size);
 
diff --git a/arch/powerpc/mm/kasan/Makefile b/arch/powerpc/mm/kasan/Makefile
index 440038ea79f1..bb1a5408b86b 100644
--- a/arch/powerpc/mm/kasan/Makefile
+++ b/arch/powerpc/mm/kasan/Makefile
@@ -4,3 +4,4 @@ KASAN_SANITIZE := n
 
 obj-$(CONFIG_PPC32)   += kasan_init_32.o
 obj-$(CONFIG_PPC_8xx)  += 8xx.o
+obj-$(CONFIG_PPC_BOOK3S_32)+= book3s_32.o
diff --git a/arch/powerpc/mm/kasan/book3s_32.c 
b/arch/powerpc/mm/kasan/book3s_32.c
new file mode 100644
index 000000000000..4bc491a4a1fd
--- /dev/null
+++ b/arch/powerpc/mm/kasan/book3s_32.c
@@ -0,0 +1,57 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define DISABLE_BRANCH_PROFILING
+
+#include 
+#include 
+#include 
+#include 
+
+int __init kasan_init_region(void *start, size_t size)
+{
+   unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
+   unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);
+   unsigned long k_cur = k_start;
+   int k_size = k_end - k_start;
+   int k_size_base = 1 << (ffs(k_size) - 1);
+   int ret;
+   void *block;
+
+   block = memblock_alloc(k_size, k_size_base);
+
+   if (block && k_size_base >= SZ_128K && k_start == ALIGN(k_start, 
k_size_base)) {
+   int k_size_more = 1 << (ffs(k_size - k_size_base) - 1);
+
+   setbat(-1, k_start, __pa(block), k_size_base, PAGE_KERNEL);
+   if (k_size_more >= SZ_128K)
+   setbat(-1, k_start + k_size_base, __pa(block) + 
k_size_base,
+  k_size_more, PAGE_KERNEL);
+   if (v_block_mapped(k_start))
+   k_cur = k_start + k_size_base;
+   if (v_block_mapped(k_start + k_size_base))
+   k_cur = k_start + k_size_base + k_size_more;
+
+   update_bats();
+   }
+
+   if (!block)
+   block = memblock_alloc(k_size, PAGE_SIZE);
+   if (!block)
+   return -ENOMEM;
+
+   ret = kasan_init_shadow_page_tables(k_start, k_end);
+   if (ret)
+   return ret;
+
+   kasan_update_early_region(k_start, k_cur, __pte(0));
+
+   for (; k_cur < k_end; k_cur += PAGE_SIZE) {
+   pmd_t *pmd = pmd_ptr_k(k_cur);
+   void *va = block + k_cur - k_start;
+   pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
+
+   __set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), 
pte, 0);
+   }
+   flush_tlb_kernel_range(k_start, k_end);
+   return 0;
+}
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c 
b/arch/powerpc/mm/kasan/kasan_init_32.c
index 76d418af4ce8..c42085801c04 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -79,7 +79,7 @@ int __init __weak kasan_init_region(void *start, size_t size)
return 0;
 }
 
-static void __init
+void __init
 kasan_update_early_region(unsigned long k_start, unsigned long k_end, pte_t 
pte)
 {
unsigned long k_cur;
-- 
2.25.0



[PATCH v4 02/45] powerpc/kasan: Fix issues by lowering KASAN_SHADOW_END

2020-05-18 Thread Christophe Leroy
At the time being, KASAN_SHADOW_END is 0x100000000, which
is 0 in 32-bit representation.

This leads to a couple of issues:
- kasan_remap_early_shadow_ro() does nothing because the comparison
k_cur < k_end is always false.
- In ptdump, address comparison for markers display fails and the
marker's name is printed at the start of the KASAN area instead of
being printed at the end.

However, there is no need to shadow the KASAN shadow area itself,
so the KASAN shadow area can stop shadowing memory at the start
of itself.

With a PAGE_OFFSET set to 0xc0000000, the KASAN shadow area then goes
from 0xf8000000 to 0xff000000.

Signed-off-by: Christophe Leroy 
Fixes: cbd18991e24f ("powerpc/mm: Fix an Oops in kasan_mmu_init()")
Cc: sta...@vger.kernel.org
---
 arch/powerpc/include/asm/kasan.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
index fbff9ff9032e..fc900937f653 100644
--- a/arch/powerpc/include/asm/kasan.h
+++ b/arch/powerpc/include/asm/kasan.h
@@ -23,9 +23,7 @@
 
 #define KASAN_SHADOW_OFFSETASM_CONST(CONFIG_KASAN_SHADOW_OFFSET)
 
-#define KASAN_SHADOW_END   0UL
-
-#define KASAN_SHADOW_SIZE  (KASAN_SHADOW_END - KASAN_SHADOW_START)
+#define KASAN_SHADOW_END   (-(-KASAN_SHADOW_START >> 
KASAN_SHADOW_SCALE_SHIFT))
 
 #ifdef CONFIG_KASAN
 void kasan_early_init(void);
-- 
2.25.0
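
The new definition can be checked with 32-bit arithmetic; this standalone
sketch assumes the 0xf8000000 shadow start mentioned above.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t start = 0xf8000000u;	/* KASAN_SHADOW_START (assumed) */
	uint32_t end = -(-start >> 3);	/* new KASAN_SHADOW_END formula */

	printf("KASAN_SHADOW_END = 0x%08x\n", end);	/* 0xff000000 */
	return 0;
}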



[PATCH v4 01/45] powerpc/kasan: Fix error detection on memory allocation

2020-05-18 Thread Christophe Leroy
In case (k_start & PAGE_MASK) doesn't equal k_start, 'va' will never be
NULL although 'block' is NULL.

Check the return of memblock_alloc() directly instead of
the resulting address in the loop.

Fixes: 509cd3f2b473 ("powerpc/32: Simplify KASAN init")
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/kasan/kasan_init_32.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c 
b/arch/powerpc/mm/kasan/kasan_init_32.c
index cbcad369fcb2..8b15fe09b967 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -76,15 +76,14 @@ static int __init kasan_init_region(void *start, size_t 
size)
return ret;
 
block = memblock_alloc(k_end - k_start, PAGE_SIZE);
+   if (!block)
+   return -ENOMEM;
 
for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_ptr_k(k_cur);
void *va = block + k_cur - k_start;
pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
 
-   if (!va)
-   return -ENOMEM;
-
__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), 
pte, 0);
}
flush_tlb_kernel_range(k_start, k_end);
-- 
2.25.0
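
The bug is easy to reproduce in a standalone sketch: when k_start is not page
aligned, the derived pointer is non-NULL even though the allocation failed.
Addresses are invented, and the arithmetic is done on uintptr_t to keep the
illustration well-defined.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uintptr_t block = 0;				/* failed memblock_alloc() */
	uintptr_t k_start = 0xf8000123;			/* not page aligned */
	uintptr_t k_cur = k_start & ~(uintptr_t)0xfff;	/* k_start & PAGE_MASK */
	uintptr_t va = block + k_cur - k_start;		/* non-zero: check missed */

	printf("va = %#lx\n", (unsigned long)va);
	return 0;
}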



[PATCH v4 04/45] powerpc/kasan: Remove unnecessary page table locking

2020-05-18 Thread Christophe Leroy
Commit 45ff3c559585 ("powerpc/kasan: Fix parallel loading of
modules.") added spinlocks to manage parallel module loading.

Since then, commit 47febbeeec44 ("powerpc/32: Force KASAN_VMALLOC for
modules") converted the module loading to KASAN_VMALLOC.

The spinlocking has thus become unneeded and can be removed to
simplify kasan_init_shadow_page_tables().

Also remove the inclusion of linux/moduleloader.h and linux/vmalloc.h
which are not needed anymore since the removal of module management.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/kasan/kasan_init_32.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c 
b/arch/powerpc/mm/kasan/kasan_init_32.c
index b7c287adfd59..91e2ade75192 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -5,9 +5,7 @@
 #include 
 #include 
 #include 
-#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -34,31 +32,22 @@ static int __init kasan_init_shadow_page_tables(unsigned 
long k_start, unsigned
 {
pmd_t *pmd;
unsigned long k_cur, k_next;
-   pte_t *new = NULL;
 
pmd = pmd_ptr_k(k_start);
 
for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd++) {
+   pte_t *new;
+
k_next = pgd_addr_end(k_cur, k_end);
if ((void *)pmd_page_vaddr(*pmd) != kasan_early_shadow_pte)
continue;
 
-   if (!new)
-   new = memblock_alloc(PTE_FRAG_SIZE, PTE_FRAG_SIZE);
+   new = memblock_alloc(PTE_FRAG_SIZE, PTE_FRAG_SIZE);
 
if (!new)
return -ENOMEM;
kasan_populate_pte(new, PAGE_KERNEL);
-
-   smp_wmb(); /* See comment in __pte_alloc */
-
-   spin_lock(&init_mm.page_table_lock);
-   /* Has another populated it ? */
-   if (likely((void *)pmd_page_vaddr(*pmd) == 
kasan_early_shadow_pte)) {
-   pmd_populate_kernel(&init_mm, pmd, new);
-   new = NULL;
-   }
-   spin_unlock(&init_mm.page_table_lock);
+   pmd_populate_kernel(&init_mm, pmd, new);
}
return 0;
 }
-- 
2.25.0



[PATCH v4 05/45] powerpc/kasan: Refactor update of early shadow mappings

2020-05-18 Thread Christophe Leroy
kasan_remap_early_shadow_ro() and kasan_unmap_early_shadow_vmalloc()
are both updating the early shadow mapping: the first one sets
the mapping read-only while the other clears the mapping.

Refactor both into a common kasan_update_early_region().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/kasan/kasan_init_32.c | 39 +--
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c 
b/arch/powerpc/mm/kasan/kasan_init_32.c
index 91e2ade75192..10481d904fea 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -79,45 +79,42 @@ static int __init kasan_init_region(void *start, size_t 
size)
return 0;
 }
 
-static void __init kasan_remap_early_shadow_ro(void)
+static void __init
+kasan_update_early_region(unsigned long k_start, unsigned long k_end, pte_t 
pte)
 {
-   pgprot_t prot = kasan_prot_ro();
-   unsigned long k_start = KASAN_SHADOW_START;
-   unsigned long k_end = KASAN_SHADOW_END;
unsigned long k_cur;
phys_addr_t pa = __pa(kasan_early_shadow_page);
 
-   kasan_populate_pte(kasan_early_shadow_pte, prot);
-
-   for (k_cur = k_start & PAGE_MASK; k_cur != k_end; k_cur += PAGE_SIZE) {
+   for (k_cur = k_start; k_cur != k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_ptr_k(k_cur);
pte_t *ptep = pte_offset_kernel(pmd, k_cur);
 
if ((pte_val(*ptep) & PTE_RPN_MASK) != pa)
continue;
 
-   __set_pte_at(&init_mm, k_cur, ptep, pfn_pte(PHYS_PFN(pa), 
prot), 0);
+   __set_pte_at(&init_mm, k_cur, ptep, pte, 0);
}
-   flush_tlb_kernel_range(KASAN_SHADOW_START, KASAN_SHADOW_END);
+
+   flush_tlb_kernel_range(k_start, k_end);
 }
 
-static void __init kasan_unmap_early_shadow_vmalloc(void)
+static void __init kasan_remap_early_shadow_ro(void)
 {
-   unsigned long k_start = (unsigned long)kasan_mem_to_shadow((void 
*)VMALLOC_START);
-   unsigned long k_end = (unsigned long)kasan_mem_to_shadow((void 
*)VMALLOC_END);
-   unsigned long k_cur;
+   pgprot_t prot = kasan_prot_ro();
phys_addr_t pa = __pa(kasan_early_shadow_page);
 
-   for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
-   pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), 
k_cur);
-   pte_t *ptep = pte_offset_kernel(pmd, k_cur);
+   kasan_populate_pte(kasan_early_shadow_pte, prot);
 
-   if ((pte_val(*ptep) & PTE_RPN_MASK) != pa)
-   continue;
+   kasan_update_early_region(KASAN_SHADOW_START, KASAN_SHADOW_END,
+ pfn_pte(PHYS_PFN(pa), prot));
+}
 
-   __set_pte_at(&init_mm, k_cur, ptep, __pte(0), 0);
-   }
-   flush_tlb_kernel_range(k_start, k_end);
+static void __init kasan_unmap_early_shadow_vmalloc(void)
+{
+   unsigned long k_start = (unsigned long)kasan_mem_to_shadow((void 
*)VMALLOC_START);
+   unsigned long k_end = (unsigned long)kasan_mem_to_shadow((void 
*)VMALLOC_END);
+
+   kasan_update_early_region(k_start, k_end, __pte(0));
 }
 
 static void __init kasan_mmu_init(void)
-- 
2.25.0



[PATCH v4 08/45] powerpc/ptdump: Reorder flags

2020-05-18 Thread Christophe Leroy
Reorder flags in a more logical way:
- Page size (huge) first
- User
- RWX
- Present
- WIMG
- Special
- Dirty and Accessed

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/ptdump/8xx.c| 30 +++---
 arch/powerpc/mm/ptdump/shared.c | 30 +++---
 2 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/8xx.c b/arch/powerpc/mm/ptdump/8xx.c
index ca9ce94672f5..a3169677dced 100644
--- a/arch/powerpc/mm/ptdump/8xx.c
+++ b/arch/powerpc/mm/ptdump/8xx.c
@@ -11,11 +11,6 @@
 
 static const struct flag_info flag_array[] = {
{
-   .mask   = _PAGE_SH,
-   .val= _PAGE_SH,
-   .set= "sh",
-   .clear  = "  ",
-   }, {
.mask   = _PAGE_RO | _PAGE_NA,
.val= 0,
.set= "rw",
@@ -37,11 +32,26 @@ static const struct flag_info flag_array[] = {
.val= _PAGE_PRESENT,
.set= "p",
.clear  = " ",
+   }, {
+   .mask   = _PAGE_NO_CACHE,
+   .val= _PAGE_NO_CACHE,
+   .set= "i",
+   .clear  = " ",
}, {
.mask   = _PAGE_GUARDED,
.val= _PAGE_GUARDED,
.set= "g",
.clear  = " ",
+   }, {
+   .mask   = _PAGE_SH,
+   .val= _PAGE_SH,
+   .set= "sh",
+   .clear  = "  ",
+   }, {
+   .mask   = _PAGE_SPECIAL,
+   .val= _PAGE_SPECIAL,
+   .set= "s",
+   .clear  = " ",
}, {
.mask   = _PAGE_DIRTY,
.val= _PAGE_DIRTY,
@@ -52,16 +62,6 @@ static const struct flag_info flag_array[] = {
.val= _PAGE_ACCESSED,
.set= "a",
.clear  = " ",
-   }, {
-   .mask   = _PAGE_NO_CACHE,
-   .val= _PAGE_NO_CACHE,
-   .set= "i",
-   .clear  = " ",
-   }, {
-   .mask   = _PAGE_SPECIAL,
-   .val= _PAGE_SPECIAL,
-   .set= "s",
-   .clear  = " ",
}
 };
 
diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index 44a8a64a664f..dab5d8028a9b 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -30,21 +30,6 @@ static const struct flag_info flag_array[] = {
.val= _PAGE_PRESENT,
.set= "p",
.clear  = " ",
-   }, {
-   .mask   = _PAGE_GUARDED,
-   .val= _PAGE_GUARDED,
-   .set= "g",
-   .clear  = " ",
-   }, {
-   .mask   = _PAGE_DIRTY,
-   .val= _PAGE_DIRTY,
-   .set= "d",
-   .clear  = " ",
-   }, {
-   .mask   = _PAGE_ACCESSED,
-   .val= _PAGE_ACCESSED,
-   .set= "a",
-   .clear  = " ",
}, {
.mask   = _PAGE_WRITETHRU,
.val= _PAGE_WRITETHRU,
@@ -55,11 +40,26 @@ static const struct flag_info flag_array[] = {
.val= _PAGE_NO_CACHE,
.set= "i",
.clear  = " ",
+   }, {
+   .mask   = _PAGE_GUARDED,
+   .val= _PAGE_GUARDED,
+   .set= "g",
+   .clear  = " ",
}, {
.mask   = _PAGE_SPECIAL,
.val= _PAGE_SPECIAL,
.set= "s",
.clear  = " ",
+   }, {
+   .mask   = _PAGE_DIRTY,
+   .val= _PAGE_DIRTY,
+   .set= "d",
+   .clear  = " ",
+   }, {
+   .mask   = _PAGE_ACCESSED,
+   .val= _PAGE_ACCESSED,
+   .set= "a",
+   .clear  = " ",
}
 };
 
-- 
2.25.0



[PATCH v4 13/45] powerpc/ptdump: Handle hugepd at PGD level

2020-05-18 Thread Christophe Leroy
The 8xx is about to map kernel linear space and IMMR using huge
pages.

In order to display those pages properly, ptdump needs to handle
hugepd tables at PGD level.

For the time being do it only at PGD level. Further patches may
add handling of hugepd tables at lower level for other platforms
when needed in the future.

Signed-off-by: Christophe Leroy 
---
v3: note_page() now takes page size instead of page shift
---
 arch/powerpc/mm/ptdump/ptdump.c | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 98d82dcf6f0b..5fc880e30175 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -269,6 +270,26 @@ static void walk_pte(struct pg_state *st, pmd_t *pmd, 
unsigned long start)
}
 }
 
+static void walk_hugepd(struct pg_state *st, hugepd_t *phpd, unsigned long 
start,
+   int pdshift, int level)
+{
+#ifdef CONFIG_ARCH_HAS_HUGEPD
+   unsigned int i;
+   int shift = hugepd_shift(*phpd);
+   int ptrs_per_hpd = pdshift - shift > 0 ? 1 << (pdshift - shift) : 1;
+
+   if (start & ((1 << shift) - 1))
+   return;
+
+   for (i = 0; i < ptrs_per_hpd; i++) {
+   unsigned long addr = start + (i << shift);
+   pte_t *pte = hugepte_offset(*phpd, addr, pdshift);
+
+   note_page(st, addr, level + 1, pte_val(*pte), 1 << shift);
+   }
+#endif
+}
+
 static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
 {
pmd_t *pmd = pmd_offset(pud, 0);
@@ -312,11 +333,13 @@ static void walk_pagetables(struct pg_state *st)
 * the hash pagetable.
 */
for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += 
PGDIR_SIZE) {
-   if (!pgd_none(*pgd) && !pgd_is_leaf(*pgd))
+   if (pgd_none(*pgd) || pgd_is_leaf(*pgd))
+   note_page(st, addr, 1, pgd_val(*pgd), PGDIR_SIZE);
+   else if (is_hugepd(__hugepd(pgd_val(*pgd))))
+   walk_hugepd(st, (hugepd_t *)pgd, addr, PGDIR_SHIFT, 1);
+   else
/* pgd exists */
walk_pud(st, pgd, addr);
-   else
-   note_page(st, addr, 1, pgd_val(*pgd), PGDIR_SIZE);
}
 }
 
-- 
2.25.0
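
The entry-count computation in walk_hugepd() handles both a hugepage smaller
than a PGD entry (several PTEs per hugepd) and one spanning several PGD
entries (a single shared PTE). A userspace sketch; the 22/23/19 shift values
are illustrative, not taken from a particular config.

#include <stdio.h>

static int ptrs_per_hpd(int pdshift, int shift)
{
	return pdshift - shift > 0 ? 1 << (pdshift - shift) : 1;
}

int main(void)
{
	printf("%d\n", ptrs_per_hpd(22, 23));	/* 8M page over 4M entries: 1 */
	printf("%d\n", ptrs_per_hpd(22, 19));	/* 512k pages under one entry: 8 */
	return 0;
}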



Re: [PATCH v2] driver core: Fix SYNC_STATE_ONLY device link implementation

2020-05-18 Thread Greg Kroah-Hartman
On Mon, May 18, 2020 at 08:00:25PM -0700, Saravana Kannan wrote:
> When SYNC_STATE_ONLY support was added in commit 05ef983e0d65 ("driver
> core: Add device link support for SYNC_STATE_ONLY flag"),
> device_link_add() incorrectly skipped adding the new SYNC_STATE_ONLY
> device link to the supplier's and consumer's "device link" list.
> 
> This causes multiple issues:
> - The device link is lost forever from driver core if the caller
>   didn't keep track of it (caller typically isn't expected to). This is
>   a memory leak.
> - The device link is also never visible to any other code path after
>   device_link_add() returns.
> 
> If we fix the "device link" list handling, that exposes a bunch of
> issues.
> 
> 1. The device link "status" state management code rightfully doesn't
> handle the case where a DL_FLAG_MANAGED device link exists between a
> supplier and consumer, but the consumer manages to probe successfully
> before the supplier. The addition of DL_FLAG_SYNC_STATE_ONLY links break
> this assumption. This causes device_links_driver_bound() to throw a
> warning when this happens.
> 
> Since DL_FLAG_SYNC_STATE_ONLY device links are mainly used for creating
> proxy device links for child device dependencies and aren't useful once
> the consumer device probes successfully, this patch just deletes
> DL_FLAG_SYNC_STATE_ONLY device links once its consumer device probes.
> This way, we avoid the warning, free up some memory and avoid
> complicating the device links "status" state management code.
> 
> 2. Creating a DL_FLAG_STATELESS device link between two devices that
> already have a DL_FLAG_SYNC_STATE_ONLY device link will result in the
> DL_FLAG_STATELESS flag not getting set correctly. This patch also fixes
> this.
> 
> Lastly, this patch also fixes minor whitespace issues.
> 
> Cc: sta...@vger.kernel.org
> Fixes: 05ef983e0d65 ("driver core: Add device link support for 
> SYNC_STATE_ONLY flag")
> Signed-off-by: Saravana Kannan 
> ---
>  drivers/base/core.c | 61 +
>  1 file changed, 39 insertions(+), 22 deletions(-)

If this is v2, what changed from v1?

That always goes below the --- line, you know this :)

v3 please?

thanks,

greg k-h


[PATCH v4 09/45] powerpc/ptdump: Add _PAGE_COHERENT flag

2020-05-18 Thread Christophe Leroy
For platforms using shared.c (4xx, Book3e, Book3s/32),
also handle the _PAGE_COHERENT flag, which corresponds to the
M bit of the WIMG flags.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/ptdump/shared.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index dab5d8028a9b..634b83aa3487 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -40,6 +40,11 @@ static const struct flag_info flag_array[] = {
.val= _PAGE_NO_CACHE,
.set= "i",
.clear  = " ",
+   }, {
+   .mask   = _PAGE_COHERENT,
+   .val= _PAGE_COHERENT,
+   .set= "m",
+   .clear  = " ",
}, {
.mask   = _PAGE_GUARDED,
.val= _PAGE_GUARDED,
-- 
2.25.0



[PATCH 5.6 000/192] 5.6.14-rc2 review

2020-05-18 Thread Greg Kroah-Hartman
This is the start of the stable review cycle for the 5.6.14 release.
There are 192 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu, 21 May 2020 05:45:41 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:

https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.6.14-rc2.gz
or in the git tree and branch at:

git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
linux-5.6.y
and the diffstat can be found below.

thanks,

greg k-h

-
Pseudo-Shortlog of commits:

Greg Kroah-Hartman 
Linux 5.6.14-rc2

Sergei Trofimovich 
Makefile: disallow data races on gcc-10 as well

Daniel Borkmann 
bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier

Yonghong Song 
selftests/bpf: Enforce returning 0 for fentry/fexit programs

Yonghong Song 
bpf: Enforce returning 0 for fentry/fexit progs

Jim Mattson 
KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce

Kefeng Wang 
riscv: perf: RISCV_BASE_PMU should be independent

Jason Gunthorpe 
RDMA/uverbs: Move IB_EVENT_DEVICE_FATAL to destroy_uobj

Jason Gunthorpe 
RDMA/uverbs: Do not discard the IB_EVENT_DEVICE_FATAL event

Nayna Jain 
powerpc/ima: Fix secure boot rules in ima arch policy

Nicholas Piggin 
powerpc/uaccess: Evaluate macro arguments once, before user access is 
allowed

Xiyu Yang 
bpf: Fix sk_psock refcnt leak when receiving message

Chuck Lever 
SUNRPC: Revert 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()")

Michael Walle 
dt-bindings: dma: fsl-edma: fix ls1028a-edma compatible

Geert Uytterhoeven 
ARM: dts: r8a7740: Add missing extal2 to CPG node

Yoshihiro Shimoda 
arm64: dts: renesas: r8a77980: Fix IPMMU VIP[01] nodes

Geert Uytterhoeven 
ARM: dts: r8a73a4: Add missing CMT1 interrupts

Adam Ford 
arm64: dts: imx8mn: Change SDMA1 ahb clock for imx8mn

Chen-Yu Tsai 
arm64: dts: rockchip: Rename dwc3 device nodes on rk3399 to make dtc happy

Chen-Yu Tsai 
arm64: dts: rockchip: Replace RK805 PMIC node name with "pmic" on rk3328 
boards

Neil Armstrong 
arm64: dts: meson-g12-common: fix dwc2 clock names

Bjorn Andersson 
arm64: dts: qcom: msm8996: Reduce vdd_apc voltage

Neil Armstrong 
arm64: dts: meson-g12b-khadas-vim3: add missing frddr_a status property

Marc Zyngier 
clk: Unlink clock if failed to prepare or enable

Tero Kristo 
clk: ti: clkctrl: Fix Bad of_node_put within clkctrl_get_name

Kai-Heng Feng 
Revert "ALSA: hda/realtek: Fix pop noise on ALC225"

Wei Yongjun 
usb: gadget: legacy: fix error return code in cdc_bind()

Wei Yongjun 
usb: gadget: legacy: fix error return code in gncm_bind()

Christophe JAILLET 
usb: gadget: audio: Fix a missing error return value in audio_bind()

Christophe JAILLET 
usb: gadget: net2272: Fix a memory leak in an error handling path in 
'net2272_plat_probe()'

Thierry Reding 
usb: gadget: tegra-xudc: Fix idle suspend/resume

Neil Armstrong 
arm64: dts: meson-g12b-ugoos-am6: fix usb vbus-supply

Amir Goldstein 
fanotify: fix merging marks masks with FAN_ONDIR

John Stultz 
dwc3: Remove check for HWO flag in dwc3_gadget_ep_reclaim_trb_sg()

Justin Swartz 
clk: rockchip: fix incorrect configuration of rk3228 aclk_gpu* clocks

Eric W. Biederman 
exec: Move would_dump into flush_old_exec

Josh Poimboeuf 
x86/unwind/orc: Fix error handling in __unwind_start()

Borislav Petkov 
x86: Fix early boot crash on gcc-10, third try

Babu Moger 
KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c

Adam McCoy 
cifs: fix leaked reference on requeued write

Christophe Leroy 
powerpc/32s: Fix build failure with CONFIG_PPC_KUAP_DEBUG

Christophe Leroy 
powerpc/vdso32: Fallback on getres syscall when clock is unknown

Imre Deak 
drm/i915/tgl+: Fix interrupt handling for DP AUX transactions

Tom St Denis 
drm/amd/amdgpu: add raven1 part to the gfxoff quirk list

Simon Ser 
drm/amd/display: add basic atomic check for cursor plane

Michal Vokáč 
ARM: dts: imx6dl-yapp4: Fix Ursa board Ethernet connection

Fabio Estevam 
ARM: dts: imx27-phytec-phycard-s-rdk: Fix the I2C1 pinctrl entries

Kishon Vijay Abraham I 
ARM: dts: dra7: Fix bus_dma_limit for PCIe

Peter Jones 
Make the "Reducing compressed framebufer size" message be DRM_INFO_ONCE()

Sriharsha Allenki 
usb: xhci: Fix NULL pointer dereference when enqueuing trbs from urb sg list

Kyungtae Kim 
USB: gadget: fix illegal array access in binding with UDC

Peter Chen 
usb: cdns3: gadget: prev_req->trb is NULL for ep0

Li Jun 
usb: host: xhci-plat: keep runtime active when removing host

Eugeniu Rosca 
usb: core: hub: limit HUB_QUIRK_DISABLE_AUTOSUSPEND to USB5534B

Jesus Ramos 
ALSA: usb-audio: 

Re: [PATCH 4.4 17/86] phy: micrel: Disable auto negotiation on startup

2020-05-18 Thread Henri Rosten
On Mon, May 18, 2020 at 07:35:48PM +0200, Greg Kroah-Hartman wrote:
> From: Alexandre Belloni 
> 
> [ Upstream commit 99f81afc139c6edd14d77a91ee91685a414a1c66 ]

I notice 99f81afc139c has been reverted in mainline with commit b43bd72835a5.  
The revert commit points out that:

"It was papering over the real problem, which is fixed by commit
f555f34fdc58 ("net: phy: fix auto-negotiation stall due to unavailable
interrupt")"
 
Therefore, consider backporting f555f34fdc58 instead of 99f81afc139c.

Notice if f555f34fdc58 is taken, then I believe 215d08a85b9a should also 
be backported.

Thanks,
-- Henri

> 
> Disable auto negotiation on init to properly detect an already plugged
> cable at boot.
> 
> At boot, when the phy is started, it is in the PHY_UP state.
> However, if a cable is plugged at boot, because auto negociation is already
> enabled at the time we get the first interrupt, the phy is already running.
> But the state machine then switches from PHY_UP to PHY_AN and calls
> phy_start_aneg(). phy_start_aneg() will not do anything because aneg is
> already enabled on the phy. It will then wait for a interrupt before going
> further. This interrupt will never happen unless the cable is unplugged and
> then replugged.
> 
> It was working properly before 321beec5047a (net: phy: Use interrupts when
> available in NOLINK state) because switching to NOLINK meant starting
> polling the phy, even if IRQ were enabled.
> 
 


Re: [PATCH 10/18] maccess: unify the probe kernel arch hooks

2020-05-18 Thread Christoph Hellwig
On Thu, May 14, 2020 at 10:13:18AM +0900, Masami Hiramatsu wrote:
> > +   bool strict)
> >  {
> > long ret;
> > mm_segment_t old_fs = get_fs();
> >  
> > +   if (!probe_kernel_read_allowed(dst, src, size, strict))
> > +   return -EFAULT;
> 
> Could you make this return -ERANGE instead of -EFAULT so that
> the caller can notice that the address might be user space?

That is clearly a behavior change, so I don't want to mix it into
this patch.  But I can add it as another patch at the end.


Re: [PATCH 5.6 061/194] drm/amdgpu: bump version for invalidate L2 before SDMA IBs

2020-05-18 Thread Greg Kroah-Hartman
On Mon, May 18, 2020 at 10:34:34PM +, Olsak, Marek wrote:
> [AMD Official Use Only - Internal Distribution Only]
> 
> Hi Greg,
> 
> I disagree with this. Bumping the driver version will have implications on 
> other new features, because it's like an ABI barrier exposing new 
> functionality.

And yet another reason why driver versions are a total mess and
shouldn't be in in-kernel drivers :(

Ugh.

I'll go drop this, thanks.

greg k-h


Re: [PATCH 5.6 000/194] 5.6.14-rc1 review

2020-05-18 Thread Greg Kroah-Hartman
On Mon, May 18, 2020 at 07:10:45PM -0700, Guenter Roeck wrote:
> On 5/18/20 10:34 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.6.14 release.
> > There are 194 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Wed, 20 May 2020 17:32:42 +0000.
> > Anything received after that time might be too late.
> > 
> 
> Quick feedback:
> 
> You also need to pull in commit 92db978f0d68 ("net: ethernet: ti: Remove 
> TI_CPTS_MOD workaround")
> which fixes commit b6d49cab44b5 ("net: Make PTP-specific drivers depend on 
> PTP_1588_CLOCK").
> This is necessary to avoid compile errors.

Sasha has now dropped that original patch, so we should be fine.  I'll
push out a -rc2 soon with that change.

thanks,

greg k-h


[PATCH v2 2/5] soundwire: bus_type: introduce sdw_slave_type and sdw_master_type

2020-05-18 Thread Bard Liao
From: Pierre-Louis Bossart 

This is a preparatory patch before the introduction of the
sdw_master_type. The SoundWire Slave support is slightly modified with
the use of a sdw_slave_type, and the uevent handling moves to
slave.c (since it's not necessary for the Master).

No functionality change other than moving code around.

Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Bard Liao 
---
 drivers/soundwire/bus_type.c   | 19 +--
 drivers/soundwire/slave.c  |  8 +++-
 include/linux/soundwire/sdw_type.h |  9 -
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/soundwire/bus_type.c b/drivers/soundwire/bus_type.c
index 17f096dd6806..2c1a19caba51 100644
--- a/drivers/soundwire/bus_type.c
+++ b/drivers/soundwire/bus_type.c
@@ -33,13 +33,21 @@ sdw_get_device_id(struct sdw_slave *slave, struct 
sdw_driver *drv)
 
 static int sdw_bus_match(struct device *dev, struct device_driver *ddrv)
 {
-   struct sdw_slave *slave = dev_to_sdw_dev(dev);
-   struct sdw_driver *drv = drv_to_sdw_driver(ddrv);
+   struct sdw_slave *slave;
+   struct sdw_driver *drv;
+   int ret = 0;
+
+   if (is_sdw_slave(dev)) {
+   slave = dev_to_sdw_dev(dev);
+   drv = drv_to_sdw_driver(ddrv);
 
-   return !!sdw_get_device_id(slave, drv);
+   ret = !!sdw_get_device_id(slave, drv);
+   }
+   return ret;
 }
 
-int sdw_slave_modalias(const struct sdw_slave *slave, char *buf, size_t size)
+static int sdw_slave_modalias(const struct sdw_slave *slave, char *buf,
+ size_t size)
 {
	/* modalias is sdw:m<mfg_id>p<part_id> */
 
@@ -47,7 +55,7 @@ int sdw_slave_modalias(const struct sdw_slave *slave, char 
*buf, size_t size)
	return snprintf(buf, size, "sdw:m%04Xp%04X\n",
			slave->id.mfg_id, slave->id.part_id);
 }
 
-static int sdw_uevent(struct device *dev, struct kobj_uevent_env *env)
+int sdw_slave_uevent(struct device *dev, struct kobj_uevent_env *env)
 {
struct sdw_slave *slave = dev_to_sdw_dev(dev);
char modalias[32];
@@ -63,7 +71,6 @@ static int sdw_uevent(struct device *dev, struct 
kobj_uevent_env *env)
 struct bus_type sdw_bus_type = {
.name = "soundwire",
.match = sdw_bus_match,
-   .uevent = sdw_uevent,
 };
 EXPORT_SYMBOL_GPL(sdw_bus_type);
 
diff --git a/drivers/soundwire/slave.c b/drivers/soundwire/slave.c
index aace57fae7f8..ed068a004bd9 100644
--- a/drivers/soundwire/slave.c
+++ b/drivers/soundwire/slave.c
@@ -14,6 +14,12 @@ static void sdw_slave_release(struct device *dev)
kfree(slave);
 }
 
+struct device_type sdw_slave_type = {
+   .name = "sdw_slave",
+   .release =  sdw_slave_release,
+   .uevent =   sdw_slave_uevent,
+};
+
 static int sdw_slave_add(struct sdw_bus *bus,
 struct sdw_slave_id *id, struct fwnode_handle *fwnode)
 {
@@ -41,9 +47,9 @@ static int sdw_slave_add(struct sdw_bus *bus,
 id->class_id, id->unique_id);
}
 
-   slave->dev.release = sdw_slave_release;
	slave->dev.bus = &sdw_bus_type;
slave->dev.of_node = of_node_get(to_of_node(fwnode));
+   slave->dev.type = &sdw_slave_type;
slave->bus = bus;
slave->status = SDW_SLAVE_UNATTACHED;
	init_completion(&slave->enumeration_complete);
diff --git a/include/linux/soundwire/sdw_type.h 
b/include/linux/soundwire/sdw_type.h
index aaa7f4267c14..52eb66cd11bc 100644
--- a/include/linux/soundwire/sdw_type.h
+++ b/include/linux/soundwire/sdw_type.h
@@ -5,6 +5,13 @@
 #define __SOUNDWIRE_TYPES_H
 
 extern struct bus_type sdw_bus_type;
+extern struct device_type sdw_slave_type;
+extern struct device_type sdw_master_type;
+
+static inline int is_sdw_slave(const struct device *dev)
+{
+   return dev->type == &sdw_slave_type;
+}
 
 #define drv_to_sdw_driver(_drv) container_of(_drv, struct sdw_driver, driver)
 
@@ -14,7 +21,7 @@ extern struct bus_type sdw_bus_type;
 int __sdw_register_driver(struct sdw_driver *drv, struct module *owner);
 void sdw_unregister_driver(struct sdw_driver *drv);
 
-int sdw_slave_modalias(const struct sdw_slave *slave, char *buf, size_t size);
+int sdw_slave_uevent(struct device *dev, struct kobj_uevent_env *env);
 
 /**
  * module_sdw_driver() - Helper macro for registering a Soundwire driver
-- 
2.17.1
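
The modalias format produced by sdw_slave_modalias() above can be sketched in
userspace; the 0x025d/0x0700 mfg/part IDs are made up for the example.

#include <stdio.h>

int main(void)
{
	unsigned int mfg_id = 0x025d, part_id = 0x0700;	/* hypothetical IDs */
	char modalias[32];

	snprintf(modalias, sizeof(modalias), "sdw:m%04Xp%04X", mfg_id, part_id);
	printf("%s\n", modalias);	/* sdw:m025Dp0700 */
	return 0;
}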



[PATCH v2 5/5] soundwire: master: add runtime pm support

2020-05-18 Thread Bard Liao
We need to enable runtime PM on the Master device with the generic
helpers, so that a Slave-initiated wake is propagated to the bus parent.

Signed-off-by: Bard Liao 
---
 drivers/soundwire/master.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/soundwire/master.c b/drivers/soundwire/master.c
index 6be0a027def7..5411791e6aff 100644
--- a/drivers/soundwire/master.c
+++ b/drivers/soundwire/master.c
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "bus.h"
@@ -14,9 +15,15 @@ static void sdw_master_device_release(struct device *dev)
kfree(md);
 }
 
+static const struct dev_pm_ops master_dev_pm = {
+   SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend,
+  pm_generic_runtime_resume, NULL)
+};
+
 struct device_type sdw_master_type = {
.name = "soundwire_master",
.release =  sdw_master_device_release,
+   .pm = &master_dev_pm,
 };
 
 /**
-- 
2.17.1
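
A hedged sketch (not part of this series) of how the wake propagation works:
a hypothetical Slave driver taking a runtime-PM reference on its own device
forces the parent sdw_master_device to resume first, invoking the generic
callbacks installed above. my_slave_wakeup() is an invented name.

#include <linux/pm_runtime.h>
#include <linux/soundwire/sdw.h>

static int my_slave_wakeup(struct sdw_slave *slave)
{
	/* Runtime PM resumes parents before children, so this reaches the
	 * Master device's pm_generic_runtime_resume() first. */
	return pm_runtime_get_sync(&slave->dev);
}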



[PATCH v2 4/5] soundwire: bus_type: add sdw_master_device support

2020-05-18 Thread Bard Liao
From: Pierre-Louis Bossart 

In the existing SoundWire code, Master Devices are not explicitly
represented - only SoundWire Slave Devices are exposed (the use of
capital letters follows the SoundWire specification conventions).

With the existing code, the bus is handled without using a proper device,
and bus->dev typically points to a platform device. The right thing to
do, as discussed in multiple reviews, is to use a device for each bus.

The sdw_master_device addition is done with minimal internal plumbing
and not exposed externally. The existing API based on
sdw_bus_master_add() and sdw_bus_master_delete() will deal with the
sdw_master_device life cycle, which minimizes changes to existing
drivers.

Note that the Intel code will be modified in follow-up patches (no
impact on any platform since the connection with ASoC is not supported
upstream so far).

Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Bard Liao 
---
 drivers/soundwire/Makefile|  2 +-
 drivers/soundwire/bus.c   | 14 --
 drivers/soundwire/bus.h   |  3 ++
 drivers/soundwire/intel.c |  1 -
 drivers/soundwire/master.c| 81 +++
 drivers/soundwire/qcom.c  |  1 -
 include/linux/soundwire/sdw.h | 17 +++-
 7 files changed, 112 insertions(+), 7 deletions(-)
 create mode 100644 drivers/soundwire/master.c

diff --git a/drivers/soundwire/Makefile b/drivers/soundwire/Makefile
index e2cdff990e9f..7319918e0aec 100644
--- a/drivers/soundwire/Makefile
+++ b/drivers/soundwire/Makefile
@@ -4,7 +4,7 @@
 #
 
 #Bus Objs
-soundwire-bus-objs := bus_type.o bus.o slave.o mipi_disco.o stream.o
+soundwire-bus-objs := bus_type.o bus.o master.o slave.o mipi_disco.o stream.o
 obj-$(CONFIG_SOUNDWIRE) += soundwire-bus.o
 
 ifdef CONFIG_DEBUG_FS
diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index 2d24f183061d..c31a1c2788a9 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -37,14 +37,21 @@ int sdw_bus_master_add(struct sdw_bus *bus, struct device 
*parent,
struct sdw_master_prop *prop = NULL;
int ret;
 
-   if (!bus->dev) {
-   pr_err("SoundWire bus has no device\n");
+   if (!parent) {
+   pr_err("SoundWire parent device is not set\n");
return -ENODEV;
}
 
ret = sdw_get_id(bus);
if (ret) {
-   dev_err(bus->dev, "Failed to get bus id\n");
+   dev_err(parent, "Failed to get bus id\n");
+   return ret;
+   }
+
+   ret = sdw_master_device_add(bus, parent, fwnode);
+   if (ret) {
+   dev_err(parent, "Failed to add master device at link %d\n",
+   bus->link_id);
return ret;
}
 
@@ -161,6 +168,7 @@ static int sdw_delete_slave(struct device *dev, void *data)
 void sdw_bus_master_delete(struct sdw_bus *bus)
 {
device_for_each_child(bus->dev, NULL, sdw_delete_slave);
+   sdw_master_device_del(bus);
 
sdw_bus_debugfs_exit(bus);
	ida_free(&sdw_ida, bus->id);
diff --git a/drivers/soundwire/bus.h b/drivers/soundwire/bus.h
index 204204a26db8..93ab0234a491 100644
--- a/drivers/soundwire/bus.h
+++ b/drivers/soundwire/bus.h
@@ -19,6 +19,9 @@ static inline int sdw_acpi_find_slaves(struct sdw_bus *bus)
 int sdw_of_find_slaves(struct sdw_bus *bus);
 void sdw_extract_slave_id(struct sdw_bus *bus,
  u64 addr, struct sdw_slave_id *id);
+int sdw_master_device_add(struct sdw_bus *bus, struct device *parent,
+ struct fwnode_handle *fwnode);
+int sdw_master_device_del(struct sdw_bus *bus);
 
 #ifdef CONFIG_DEBUG_FS
 void sdw_bus_debugfs_init(struct sdw_bus *bus);
diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 210459390046..3562f2106e30 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1099,7 +1099,6 @@ static int intel_probe(struct platform_device *pdev)
sdw->cdns.registers = sdw->link_res->registers;
sdw->cdns.instance = sdw->instance;
sdw->cdns.msg_count = 0;
-	sdw->cdns.bus.dev = &pdev->dev;
sdw->cdns.bus.link_id = pdev->id;
 
sdw_cdns_probe(>cdns);
diff --git a/drivers/soundwire/master.c b/drivers/soundwire/master.c
new file mode 100644
index ..6be0a027def7
--- /dev/null
+++ b/drivers/soundwire/master.c
@@ -0,0 +1,81 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2019-2020 Intel Corporation.
+
+#include 
+#include 
+#include 
+#include 
+#include "bus.h"
+
+static void sdw_master_device_release(struct device *dev)
+{
+   struct sdw_master_device *md = dev_to_sdw_master_device(dev);
+
+   kfree(md);
+}
+
+struct device_type sdw_master_type = {
+   .name = "soundwire_master",
+   .release =  sdw_master_device_release,
+};
+
+/**
+ * sdw_master_device_add() - create a Linux Master Device representation.
+ * @bus: SDW bus instance
+ * @parent: parent device
+ * @fwnode: firmware node handle
+ */

[PATCH v2 1/5] soundwire: bus: rename sdw_bus_master_add/delete, add arguments

2020-05-18 Thread Bard Liao
From: Pierre-Louis Bossart 

In preparation for future extensions, rename functions to use
sdw_bus_master prefix and add a parent and fwnode argument to
sdw_bus_master_add to help with device registration in follow-up
patches.

No functionality change, just renames and additional arguments.

The Intel code is currently unused, the two additional arguments are
only needed for compilation.

Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Bard Liao 
---
 Documentation/driver-api/soundwire/summary.rst |  7 ---
 drivers/soundwire/bus.c| 15 +--
 drivers/soundwire/intel.c  |  8 
 drivers/soundwire/qcom.c   |  6 +++---
 include/linux/soundwire/sdw.h  |  5 +++--
 5 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/Documentation/driver-api/soundwire/summary.rst 
b/Documentation/driver-api/soundwire/summary.rst
index 8193125a2bfb..01dcb954f6d7 100644
--- a/Documentation/driver-api/soundwire/summary.rst
+++ b/Documentation/driver-api/soundwire/summary.rst
@@ -101,10 +101,11 @@ Following is the Bus API to register the SoundWire Bus:
 
 .. code-block:: c
 
-   int sdw_add_bus_master(struct sdw_bus *bus)
+   int sdw_bus_master_add(struct sdw_bus *bus,
+   struct device *parent,
+   struct fwnode_handle)
{
-   if (!bus->dev)
-   return -ENODEV;
+   sdw_master_device_add(bus, parent, fwnode);
 
	mutex_init(&bus->lock);
	INIT_LIST_HEAD(&bus->slaves);
diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index 32de017f08d5..24064dbd74fa 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -10,13 +10,16 @@
 #include "bus.h"
 
 /**
- * sdw_add_bus_master() - add a bus Master instance
+ * sdw_bus_master_add() - add a bus Master instance
  * @bus: bus instance
+ * @parent: parent device
+ * @fwnode: firmware node handle
  *
  * Initializes the bus instance, read properties and create child
  * devices.
  */
-int sdw_add_bus_master(struct sdw_bus *bus)
+int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent,
+  struct fwnode_handle *fwnode)
 {
struct sdw_master_prop *prop = NULL;
int ret;
@@ -107,7 +110,7 @@ int sdw_add_bus_master(struct sdw_bus *bus)
 
return 0;
 }
-EXPORT_SYMBOL(sdw_add_bus_master);
+EXPORT_SYMBOL(sdw_bus_master_add);
 
 static int sdw_delete_slave(struct device *dev, void *data)
 {
@@ -131,18 +134,18 @@ static int sdw_delete_slave(struct device *dev, void 
*data)
 }
 
 /**
- * sdw_delete_bus_master() - delete the bus master instance
+ * sdw_bus_master_delete() - delete the bus master instance
  * @bus: bus to be deleted
  *
  * Remove the instance, delete the child devices.
  */
-void sdw_delete_bus_master(struct sdw_bus *bus)
+void sdw_bus_master_delete(struct sdw_bus *bus)
 {
device_for_each_child(bus->dev, NULL, sdw_delete_slave);
 
sdw_bus_debugfs_exit(bus);
 }
-EXPORT_SYMBOL(sdw_delete_bus_master);
+EXPORT_SYMBOL(sdw_bus_master_delete);
 
 /*
  * SDW IO Calls
diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 3c83e76c6bf9..210459390046 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1110,9 +1110,9 @@ static int intel_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, sdw);
 
-	ret = sdw_add_bus_master(&sdw->cdns.bus);
+	ret = sdw_bus_master_add(&sdw->cdns.bus, &pdev->dev, pdev->dev.fwnode);
	if (ret) {
-		dev_err(&pdev->dev, "sdw_add_bus_master fail: %d\n", ret);
+		dev_err(&pdev->dev, "sdw_bus_master_add fail: %d\n", ret);
return ret;
}
 
@@ -1173,7 +1173,7 @@ static int intel_probe(struct platform_device *pdev)
	sdw_cdns_enable_interrupt(&sdw->cdns, false);
	free_irq(sdw->link_res->irq, sdw);
 err_init:
-	sdw_delete_bus_master(&sdw->cdns.bus);
+	sdw_bus_master_delete(&sdw->cdns.bus);
return ret;
 }
 
@@ -1189,7 +1189,7 @@ static int intel_remove(struct platform_device *pdev)
free_irq(sdw->link_res->irq, sdw);
snd_soc_unregister_component(sdw->cdns.dev);
}
-	sdw_delete_bus_master(&sdw->cdns.bus);
+	sdw_bus_master_delete(&sdw->cdns.bus);
 
return 0;
 }
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
index e08a17c13f92..401811d6627e 100644
--- a/drivers/soundwire/qcom.c
+++ b/drivers/soundwire/qcom.c
@@ -821,7 +821,7 @@ static int qcom_swrm_probe(struct platform_device *pdev)
goto err_clk;
}
 
-	ret = sdw_add_bus_master(&ctrl->bus);
+	ret = sdw_bus_master_add(&ctrl->bus, dev, dev->fwnode);
if (ret) {
dev_err(dev, "Failed to register Soundwire controller (%d)\n",
ret);
@@ -840,7 +840,7 @@ static int qcom_swrm_probe(struct platform_device *pdev)
return 0;
 
 err_master_add:
-   

[PATCH v2 3/5] soundwire: bus: add unique bus id

2020-05-18 Thread Bard Liao
Add a unique id for each bus.

Suggested-by: Vinod Koul 
Signed-off-by: Bard Liao 
---
 drivers/soundwire/bus.c   | 20 
 include/linux/soundwire/sdw.h |  2 ++
 2 files changed, 22 insertions(+)

diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index 24064dbd74fa..2d24f183061d 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -9,6 +9,19 @@
 #include 
 #include "bus.h"
 
+static DEFINE_IDA(sdw_ida);
+
+static int sdw_get_id(struct sdw_bus *bus)
+{
+	int rc = ida_alloc(&sdw_ida, GFP_KERNEL);
+
+   if (rc < 0)
+   return rc;
+
+   bus->id = rc;
+   return 0;
+}
+
 /**
  * sdw_bus_master_add() - add a bus Master instance
  * @bus: bus instance
@@ -29,6 +42,12 @@ int sdw_bus_master_add(struct sdw_bus *bus, struct device 
*parent,
return -ENODEV;
}
 
+   ret = sdw_get_id(bus);
+   if (ret) {
+   dev_err(bus->dev, "Failed to get bus id\n");
+   return ret;
+   }
+
if (!bus->ops) {
dev_err(bus->dev, "SoundWire Bus ops are not set\n");
return -EINVAL;
@@ -144,6 +163,7 @@ void sdw_bus_master_delete(struct sdw_bus *bus)
device_for_each_child(bus->dev, NULL, sdw_delete_slave);
 
sdw_bus_debugfs_exit(bus);
+	ida_free(&sdw_ida, bus->id);
 }
 EXPORT_SYMBOL(sdw_bus_master_delete);
 
diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index 2003e8c55538..a32cb26f1815 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -789,6 +789,7 @@ struct sdw_master_ops {
  * struct sdw_bus - SoundWire bus
  * @dev: Master linux device
  * @link_id: Link id number, can be 0 to N, unique for each Master
+ * @id: bus system-wide unique id
  * @slaves: list of Slaves on this bus
  * @assigned: Bitmap for Slave device numbers.
  * Bit set implies used number, bit clear implies unused number.
@@ -813,6 +814,7 @@ struct sdw_master_ops {
 struct sdw_bus {
struct device *dev;
unsigned int link_id;
+   int id;
struct list_head slaves;
DECLARE_BITMAP(assigned, SDW_MAX_DEVICES);
struct mutex bus_lock;
-- 
2.17.1



[PATCH v2 0/5] soundwire: bus_type: add sdw_master_device support

2020-05-18 Thread Bard Liao
This series adds sdw master devices support.

changes in v2:
 - Allocate sdw_master_device dynamically
 - Use unique bus id as master id
 - Keep checking parent devices
 - Enable runtime_pm on Master device

Bard Liao (2):
  soundwire: bus: add unique bus id
  soundwire: master: add runtime pm support

Pierre-Louis Bossart (3):
  soundwire: bus: rename sdw_bus_master_add/delete, add arguments
  soundwire: bus_type: introduce sdw_slave_type and sdw_master_type
  soundwire: bus_type: add sdw_master_device support

 .../driver-api/soundwire/summary.rst  |  7 +-
 drivers/soundwire/Makefile|  2 +-
 drivers/soundwire/bus.c   | 47 --
 drivers/soundwire/bus.h   |  3 +
 drivers/soundwire/bus_type.c  | 19 ++--
 drivers/soundwire/intel.c |  9 +-
 drivers/soundwire/master.c| 88 +++
 drivers/soundwire/qcom.c  |  7 +-
 drivers/soundwire/slave.c |  8 +-
 include/linux/soundwire/sdw.h | 24 -
 include/linux/soundwire/sdw_type.h|  9 +-
 11 files changed, 191 insertions(+), 32 deletions(-)
 create mode 100644 drivers/soundwire/master.c

-- 
2.17.1



[PATCH v2] drm/etnaviv: fix perfmon domain iteration

2020-05-18 Thread Christian Gmeiner
The GC860 has one GPU device which has a 2d and 3d core. In this case
we want to expose perfmon information for both cores.

The driver has one array which contains all possible perfmon domains
with some meta data - doms_meta. Here we can see that for the GC860
two elements of that array are relevant:

  doms_3d: is at index 0 in the doms_meta array with 8 perfmon domains
  doms_2d: is at index 1 in the doms_meta array with 1 perfmon domain

The userspace driver wants to get a list of all perfmon domains and
their perfmon signals. This is done by iterating over all domains and
their signals. If the userspace driver wants to access the domain with
id 8, the kernel driver fails and returns invalid data from doms_3d with
an invalid offset.

This results in:
  Unable to handle kernel paging request at virtual address 

On such a device it is not possible to use the userspace driver at all.

The fix for this off-by-one error is quite simple.

Reported-by: Paul Cercueil 
Tested-by: Paul Cercueil 
Fixes: ed1dd899baa3 ("drm/etnaviv: rework perfmon query infrastructure")
Cc: sta...@vger.kernel.org
Signed-off-by: Christian Gmeiner 

---
 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c 
b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
index e6795bafcbb9..75f9db8f7bec 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
@@ -453,7 +453,7 @@ static const struct etnaviv_pm_domain *pm_domain(const 
struct etnaviv_gpu *gpu,
if (!(gpu->identity.features & meta->feature))
continue;
 
-   if (meta->nr_domains < (index - offset)) {
+   if (index - offset >= meta->nr_domains) {
offset += meta->nr_domains;
continue;
}
-- 
2.26.2
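
For readers following the GC860 numbers above, the off-by-one can be
traced concretely (a worked illustration, not part of the patch):

	/* GC860: doms_3d covers indices 0..7 (8 domains, offset 0),
	 * doms_2d covers index 8 (1 domain). Look up index 8:
	 *
	 * old test: meta->nr_domains < (index - offset) -> 8 < 8 is false,
	 * so the loop wrongly stays on doms_3d and then reads domain entry
	 * (index - offset) == 8, one past the end of the array.
	 *
	 * new test: (index - offset) >= meta->nr_domains -> 8 >= 8 is true,
	 * so offset advances to 8 and doms_2d is correctly selected.
	 */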



Re: [RESEND PATCH v8 0/3] Add Intel ComboPhy driver

2020-05-18 Thread Dilip Kota



On 5/19/2020 1:17 PM, Kishon Vijay Abraham I wrote:

Dilip,

On 5/19/2020 9:26 AM, Dilip Kota wrote:

On 5/18/2020 9:49 PM, Kishon Vijay Abraham I wrote:

Dilip,

On 5/15/2020 1:43 PM, Dilip Kota wrote:

This patch series adds Intel ComboPhy driver, respective yaml schemas

Changes on v8:
    As per PHY Maintainer's request add description in comments for doing
    register access through register map framework.

Changes on v7:
    As per System control driver maintainer's inputs remove
  fwnode_to_regmap() definition and use device_node_get_regmap()

Can you fix this warning and resend the patch?
drivers/phy/intel/phy-intel-combo.c:229:6: warning: ‘cb_mode’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
    ret = regmap_write(cbphy->hsiocfg, REG_COMBO_MODE(cbphy->bid), cb_mode);
    ^~~
drivers/phy/intel/phy-intel-combo.c:204:24: note: ‘cb_mode’ was declared here
    enum intel_combo_mode cb_mode;
  ^~~

I noticed this warning while preparing the patch.
It sounds like a false warning because:
1.) "cb_mode" is initialized in the switch case based on the "mode =
cbphy->phy_mode;"
2.) cbphy->phy_mode is initialized during the probe in
"intel_cbphy_fwnode_parse()" with one of the 3 values.
PHY_PCIE_MODE, PHY_SATA_MODE, PHY_XPCS_MODE.
3.) There is no chance of "cbphy->phy_mode" having different value.
4.) And "cb_mode" will be initialized according to the "mode = cbphy->phy_mode;"
5.) Hence, there is no chance of "cb_mode" getting accessed uninitialized.

Let's try to keep the compiler happy. Please fix this warning.

Sure, will fix it and send the patch series.


Thanks
Kishon
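
For readers following along: the usual way to satisfy
-Wmaybe-uninitialized in this pattern is to give the switch an
exhaustive default. A sketch, using hypothetical enumerator names since
the driver's actual intel_combo_mode values are not shown in this
thread:

	enum intel_combo_mode cb_mode;
	int ret;

	switch (cbphy->phy_mode) {
	case PHY_PCIE_MODE:
		cb_mode = COMBO_MODE_PCIE;	/* hypothetical name */
		break;
	case PHY_SATA_MODE:
		cb_mode = COMBO_MODE_SATA;	/* hypothetical name */
		break;
	case PHY_XPCS_MODE:
		cb_mode = COMBO_MODE_XPCS;	/* hypothetical name */
		break;
	default:
		/* unreachable today, but proves to gcc that cb_mode
		 * is always initialized before use */
		return -EINVAL;
	}
	ret = regmap_write(cbphy->hsiocfg, REG_COMBO_MODE(cbphy->bid), cb_mode);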


Re: [PATCH] x86/mm: Don't try to change poison pages to uncacheable in a guest

2020-05-18 Thread Sean Christopherson
On Mon, May 18, 2020 at 11:26:29AM -0700, Luck, Tony wrote:
> Maybe it isn't pretty. But I don't see another practical solution.
> 
> The VMM is doing exactly the right thing here. It should not trust
> that the guest will behave and not touch the poison location again.
> If/when the guest does touch the poison, the right action is
> for the VMM to fake a new machine check to the guest.
> 
> Theoretically the VMM could decode the instruction that the guest
> was trying to use on the poison page and decide "oh, this is that
> weird case in Linux where it's just trying to CLFLUSH the page. I'll
> just step the return IP past the CLFLUSH and let the guest continue".

That's actually doable in KVM, e.g. a hack could be done in <10 lines of
code.  A proper fix that integrates with KVM's emulator would be
substantially more code and effort though.

> But that doesn't sound at all reasonable to me (especially as the
> next step is to realize that Linux is going to repeat that for every
> cache line in the page, so you also want the VMM to fudge the register
> contents to skip to the end of the loop and avoid another 63 VMEXITs).

Eh, 63 VM-Exits is peanuts in the grand scheme.  Even with the host-side
gup() that's probably less than 50us.

> N.B. Linux wants to switch the page to uncacheable so that in the
> persistent memory case the filesystem code can continue to access
> the other "blocks" in the page, rather than lose all of them. That's
> futile in the case where the VMM took the whole 4K away. Maybe Dan
> needs to think about the guest case too.

This is where I'm unclear as to the guest behavior.  Is it doing *just*
CLFLUSH, or is it doing CLFLUSH followed by other accesses to the poisoned
page?  If it's the former, then it's probably worth at least exploring a
KVM fix.  If it's the latter, then yeah, emulating CLFLUSH for a poisoned
#MC is pointless.  I assume it's the latter since the goal is to recover
data?

Oh, and FWIW, the guest won't actually get UC for that page.


Re: [PATCH 3/3] hwmon: (ina2xx) Add support for ina260

2020-05-18 Thread Michal Simek
On 26. 02. 20 3:16, Guenter Roeck wrote:
> On 2/24/20 3:26 PM, Franz Forstmayr wrote:
>> Add initial support for INA260 power monitor with integrated shunt.
>> Registers are different from other INA2xx devices, that's why a small
>> translation table is used.
>>
>> Signed-off-by: Franz Forstmayr 
> 
> I think the chip is sufficiently different to other chips that a separate
> driver would make much more sense than adding support to the existing
> driver.
> There is no calibration, registers are different, the retry logic is
> not needed. A new driver could use the with_info API and would be much
> simpler while at the same time not messing up the existing driver.

Isn't it also better to switch to the IIO framework?
As we discussed in the past, there are two ina226 drivers: one in hwmon and
a second based on the IIO framework (the more advanced one?), and it would
be good to deprecate the hwmon one.
That's why a separate driver is necessary.

Thanks,
Michal
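
For reference, the with_info API Guenter refers to keeps a standalone
driver small; a skeletal sketch (callback bodies and names are
hypothetical, not from a submitted driver):

	static const struct hwmon_channel_info *ina260_info[] = {
		HWMON_CHANNEL_INFO(in, HWMON_I_INPUT),
		HWMON_CHANNEL_INFO(curr, HWMON_C_INPUT),
		HWMON_CHANNEL_INFO(power, HWMON_P_INPUT),
		NULL
	};

	static const struct hwmon_ops ina260_hwmon_ops = {
		.is_visible = ina260_is_visible,	/* hypothetical */
		.read = ina260_read,			/* hypothetical */
	};

	static const struct hwmon_chip_info ina260_chip_info = {
		.ops = &ina260_hwmon_ops,
		.info = ina260_info,
	};

	/* in probe(); no calibration register or retry logic needed */
	struct device *hwmon_dev =
		devm_hwmon_device_register_with_info(dev, "ina260", data,
						     &ina260_chip_info, NULL);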



[PATCH] tee: convert get_user_pages() --> pin_user_pages()

2020-05-18 Thread John Hubbard
This code was using get_user_pages*(), in a "Case 2" scenario
(DMA/RDMA), using the categorization from [1]. That means that it's
time to convert the get_user_pages*() + put_page() calls to
pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small
part of fixing a long-standing disconnect between pinning pages, and
file systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/

Cc: Jens Wiklander 
Cc: Sumit Semwal 
Cc: tee-...@lists.linaro.org
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Signed-off-by: John Hubbard 
---

Note that I have only compile-tested this patch, although that does
also include cross-compiling for a few other arches.

thanks,
John Hubbard
NVIDIA

 drivers/tee/tee_shm.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
index bd679b72bd05..7dffc42d8d5a 100644
--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -31,16 +31,13 @@ static void tee_shm_release(struct tee_shm *shm)
 
poolm->ops->free(poolm, shm);
} else if (shm->flags & TEE_SHM_REGISTER) {
-   size_t n;
int rc = teedev->desc->ops->shm_unregister(shm->ctx, shm);
 
if (rc)
dev_err(teedev->dev.parent,
"unregister shm %p failed: %d", shm, rc);
 
-   for (n = 0; n < shm->num_pages; n++)
-   put_page(shm->pages[n]);
-
+   unpin_user_pages(shm->pages, shm->num_pages);
kfree(shm->pages);
}
 
@@ -226,7 +223,7 @@ struct tee_shm *tee_shm_register(struct tee_context *ctx, 
unsigned long addr,
goto err;
}
 
-   rc = get_user_pages_fast(start, num_pages, FOLL_WRITE, shm->pages);
+   rc = pin_user_pages_fast(start, num_pages, FOLL_WRITE, shm->pages);
if (rc > 0)
shm->num_pages = rc;
if (rc != num_pages) {
@@ -271,16 +268,13 @@ struct tee_shm *tee_shm_register(struct tee_context *ctx, 
unsigned long addr,
return shm;
 err:
if (shm) {
-   size_t n;
-
if (shm->id >= 0) {
			mutex_lock(&teedev->mutex);
			idr_remove(&teedev->idr, shm->id);
			mutex_unlock(&teedev->mutex);
}
if (shm->pages) {
-   for (n = 0; n < shm->num_pages; n++)
-   put_page(shm->pages[n]);
+   unpin_user_pages(shm->pages, shm->num_pages);
kfree(shm->pages);
}
}
-- 
2.26.2
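
The mechanical shape of this class of conversion, for readers new to it
(a schematic only, not the driver code):

	/* Before: "Case 2" (DMA/RDMA) callers took ordinary page refs. */
	rc = get_user_pages_fast(start, nr_pages, FOLL_WRITE, pages);
	/* ... DMA to/from the pages ... */
	for (i = 0; i < nr_pages; i++)
		put_page(pages[i]);

	/* After: FOLL_PIN-based helpers let the mm layer distinguish
	 * DMA pins from ordinary references; the put_page() loop
	 * collapses into one call.
	 */
	rc = pin_user_pages_fast(start, nr_pages, FOLL_WRITE, pages);
	/* ... DMA to/from the pages ... */
	unpin_user_pages(pages, nr_pages);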



Re: [PATCH] usb: xhci: fix USB_XHCI_PCI depends

2020-05-18 Thread Bjorn Andersson
On Mon 18 May 22:06 PDT 2020, Vinod Koul wrote:

> The xhci-pci-renesas module exports symbols for xhci-pci to load the
> RAM/ROM on Renesas xhci controllers. We had a dependency which works
> when both modules are built-in or both are modules.
> 
> But if xhci-pci is built-in and xhci-pci-renesas is a module, we get the below
> linker error:
> drivers/usb/host/xhci-pci.o: In function `xhci_pci_remove':
> drivers/usb/host/xhci-pci.c:411: undefined reference to 
> `renesas_xhci_pci_exit'
> drivers/usb/host/xhci-pci.o: In function `xhci_pci_probe':
> drivers/usb/host/xhci-pci.c:345: undefined reference to 
> `renesas_xhci_check_request_fw'
> 
> Fix this by making USB_XHCI_PCI depend on USB_XHCI_PCI_RENESAS
> || !USB_XHCI_PCI_RENESAS so that both can be either built-in or modules.
> 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> Reported-by: Anders Roxell 
> Fixes: a66d21d7dba8 ("usb: xhci: Add support for Renesas controller with 
> memory")
> Tested-by: Anders Roxell 
> Signed-off-by: Vinod Koul 
> ---
>  drivers/usb/host/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig
> index b5c542d6a1c5..92783d175b3f 100644
> --- a/drivers/usb/host/Kconfig
> +++ b/drivers/usb/host/Kconfig
> @@ -40,11 +40,11 @@ config USB_XHCI_DBGCAP
>  config USB_XHCI_PCI
>   tristate
>   depends on USB_PCI
> + depends on USB_XHCI_PCI_RENESAS || !USB_XHCI_PCI_RENESAS
>   default y
>  
>  config USB_XHCI_PCI_RENESAS
>   tristate "Support for additional Renesas xHCI controller with firmware"
> - depends on USB_XHCI_PCI
>   ---help---
> Say 'Y' to enable the support for the Renesas xHCI controller with
> firmware. Make sure you have the firmware for the device and
> -- 
> 2.25.4
> 


Re: [RESEND PATCH v8 0/3] Add Intel ComboPhy driver

2020-05-18 Thread Kishon Vijay Abraham I
Dilip,

On 5/19/2020 9:26 AM, Dilip Kota wrote:
> 
> On 5/18/2020 9:49 PM, Kishon Vijay Abraham I wrote:
>> Dilip,
>>
>> On 5/15/2020 1:43 PM, Dilip Kota wrote:
>>> This patch series adds Intel ComboPhy driver, respective yaml schemas
>>>
>>> Changes on v8:
>>>    As per PHY Maintainer's request add description in comments for doing
>>>    register access through register map framework.
>>>
>>> Changes on v7:
>>>    As per System control driver maintainer's inputs remove
>>>  fwnode_to_regmap() definition and use device_node_get_regmap()
>> Can you fix this warning and resend the patch?
>> drivers/phy/intel/phy-intel-combo.c:229:6: warning: ‘cb_mode’ may be used
>> uninitialized in this function [-Wmaybe-uninitialized]
>>    ret = regmap_write(cbphy->hsiocfg, REG_COMBO_MODE(cbphy->bid), cb_mode);
>>    ^~~
>> drivers/phy/intel/phy-intel-combo.c:204:24: note: ‘cb_mode’ was declared here
>>    enum intel_combo_mode cb_mode;
>>  ^~~
> I noticed this warning while preparing the patch.
> It sounds like a false warning because:
> 1.) "cb_mode" is initialized in the switch case based on the "mode =
> cbphy->phy_mode;"
> 2.) cbphy->phy_mode is initialized during the probe in
> "intel_cbphy_fwnode_parse()" with one of the 3 values.
> PHY_PCIE_MODE, PHY_SATA_MODE, PHY_XPCS_MODE.
> 3.) There is no chance of "cbphy->phy_mode" having different value.
> 4.) And "cb_mode" will be initialized according to the "mode = 
> cbphy->phy_mode;"
> 5.) Hence, there is no chance of "cb_mode" getting accessed uninitialized.

Let's try to keep the compiler happy. Please fix this warning.

Thanks
Kishon


Re: [PATCH 10/10] mm/migrate.c: call detach_page_private to cleanup code

2020-05-18 Thread Andrew Morton
On Sun, 17 May 2020 23:47:18 +0200 Guoqing Jiang 
 wrote:

> We can cleanup code a little by call detach_page_private here.
> 
> ...
>
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -804,10 +804,7 @@ static int __buffer_migrate_page(struct address_space 
> *mapping,
>   if (rc != MIGRATEPAGE_SUCCESS)
>   goto unlock_buffers;
>  
> - ClearPagePrivate(page);
> - set_page_private(newpage, page_private(page));
> - set_page_private(page, 0);
> - put_page(page);
> + set_page_private(newpage, detach_page_private(page));
>   get_page(newpage);
>  
>   bh = head;

mm/migrate.c: In function '__buffer_migrate_page':
./include/linux/mm_types.h:243:52: warning: assignment makes integer from 
pointer without a cast [-Wint-conversion]
 #define set_page_private(page, v) ((page)->private = (v))
^
mm/migrate.c:800:2: note: in expansion of macro 'set_page_private'
  set_page_private(newpage, detach_page_private(page));
  ^~~~

The fact that set_page_private(detach_page_private()) generates a type
mismatch warning seems deeply wrong, surely.

Please let's get the types sorted out - either unsigned long or void *,
not half-one and half-the other.  Whatever needs the least typecasting
at callsites, I suggest.

And can we please implement set_page_private() and page_private() with
inlined C code?  There is no need for these to be macros.
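
For what it's worth, that suggestion could look like the following
minimal sketch, assuming page->private stays an unsigned long as in
today's struct page:

	static inline unsigned long page_private(struct page *page)
	{
		return page->private;
	}

	static inline void set_page_private(struct page *page,
					    unsigned long private)
	{
		page->private = private;
	}

With both sides typed as unsigned long, a call like
set_page_private(newpage, detach_page_private(page)) would then only
warn if detach_page_private() really does return a pointer.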



Re: [PATCH] init/main.c: Print all command line when boot

2020-05-18 Thread Joe Perches
On Mon, 2020-05-18 at 20:44 -0700, Andrew Morton wrote:
> On Tue, 19 May 2020 11:29:46 +0800 王程刚  wrote:
> 
> > Function pr_notice print max length maybe less than the command line length,
> > need more times to print all.
> > For example, arm64 has 2048 bytes command line length, but printk maximum
> > length is only 1024 bytes.
> 
> I can see why that might be a problem!
> 
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -825,6 +825,16 @@ void __init __weak arch_call_rest_init(void)
> > rest_init();
> >  }
> >  
> > +static void __init print_cmdline(void)
> > +{
> > +   const char *prefix = "Kernel command line: ";
> 
> const char prefix[] = "...";
> 
> might generate slightly more efficient code.
> 
> > +   int len = -strlen(prefix);
> 
> hm, tricky.  What the heck does printk() actually return to the caller?
> Seems that we forgot to document this, and there are so many different
> paths which a printk call can take internally that I'm not confident
> that they all got it right!

There is no use of the return value of any pr_<level> or
dev_<level> or netdev_<level> call; these mechanisms (as functions) should return void.
https://lore.kernel.org/lkml/1466739971-30399-1-git-send-email-...@perches.com/

> > +   len += pr_notice("%s%s\n", prefix, boot_command_line);
> > +   while (boot_command_line[len])
> > +   len += pr_notice("%s\n", &boot_command_line[len]);
> > +}

More likely it'd be better to use a strlen(boot_command_line)
and perhaps do something like print multiple lines with args
using strchr(..., ' ') at some largish value, say 132 or 256 chars
maximum per line.
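
One possible shape of that suggestion, sketched with made-up names and
an arbitrary chunk size:

	static void __init print_cmdline_chunked(void)
	{
		const char *pos = boot_command_line;
		size_t left = strlen(boot_command_line);
		const size_t max_chunk = 256;	/* arbitrary per-line max */

		pr_notice("Kernel command line:\n");
		while (left) {
			size_t len = left;

			if (len > max_chunk) {
				/* break at the last space inside the limit */
				const char *p = pos + max_chunk;

				while (p > pos && *p != ' ')
					p--;
				len = (p > pos) ? (size_t)(p - pos) : max_chunk;
			}
			pr_notice("  %.*s\n", (int)len, pos);
			pos += len;
			left -= len;
			if (left && *pos == ' ') {	/* skip separator */
				pos++;
				left--;
			}
		}
	}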





linux-next: build failure after merge of the drm-msm tree

2020-05-18 Thread Stephen Rothwell
Hi all,

After merging the drm-msm tree, today's linux-next build (arm
multi_v7_defconfig) failed like this:

ERROR: modpost: "__aeabi_ldivmod" [drivers/gpu/drm/msm/msm.ko] undefined!
ERROR: modpost: "__aeabi_uldivmod" [drivers/gpu/drm/msm/msm.ko] undefined!

Caused by commit

  04d9044f6c57 ("drm/msm/dpu: add support for clk and bw scaling for display")

I applied the following patch for today (this is mechanical, there may
be a better way):

From: Stephen Rothwell 
Date: Tue, 19 May 2020 14:12:39 +1000
Subject: [PATCH] drm/msm/dpu: fix up u64/u32 division for 32 bit architectures

Signed-off-by: Stephen Rothwell 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 23 ++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 15 
 2 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
index 9697abcbec3f..85c2a4190840 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "dpu_kms.h"
 #include "dpu_trace.h"
@@ -53,8 +54,11 @@ static u64 _dpu_core_perf_calc_bw(struct dpu_kms *kms,
}
 
bw_factor = kms->catalog->perf.bw_inefficiency_factor;
-   if (bw_factor)
-   crtc_plane_bw = mult_frac(crtc_plane_bw, bw_factor, 100);
+   if (bw_factor) {
+   u64 quot = crtc_plane_bw;
+   u32 rem = do_div(quot, 100);
+   crtc_plane_bw = (quot * bw_factor) + ((rem * bw_factor) / 100);
+   }
 
return crtc_plane_bw;
 }
@@ -89,8 +93,11 @@ static u64 _dpu_core_perf_calc_clk(struct dpu_kms *kms,
}
 
clk_factor = kms->catalog->perf.clk_inefficiency_factor;
-   if (clk_factor)
-   crtc_clk = mult_frac(crtc_clk, clk_factor, 100);
+   if (clk_factor) {
+   u64 quot = crtc_clk;
+   u32 rem = do_div(quot, 100);
+   crtc_clk = (quot * clk_factor) + ((rem * clk_factor) / 100);
+   }
 
return crtc_clk;
 }
@@ -234,8 +241,12 @@ static int _dpu_core_perf_crtc_update_bus(struct dpu_kms 
*kms,
}
}
 
-   avg_bw = kms->num_paths ?
-   perf.bw_ctl / kms->num_paths : 0;
+   if (kms->num_paths) {
+   avg_bw = perf.bw_ctl;
+   do_div(avg_bw, kms->num_paths);
+   } else {
+   avg_bw = 0;
+   }
 
for (i = 0; i < kms->num_paths; i++)
icc_set_bw(kms->path[i],
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index c2a6e3dacd68..ad95f32eac13 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -9,6 +9,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -174,7 +175,11 @@ static void _dpu_plane_calc_bw(struct drm_plane *plane,
plane_prefill_bw =
src_width * hw_latency_lines * fps * fmt->bpp * scale_factor;
 
-   plane_prefill_bw = mult_frac(plane_prefill_bw, mode->vtotal, (vbp+vpw));
+   {
+   u64 quot = plane_prefill_bw;
+   u32 rem = do_div(quot, vbp + vpw);
+   plane_prefill_bw = quot * mode->vtotal + rem * mode->vtotal / (vbp + vpw);
+   }
 
pstate->plane_fetch_bw = max(plane_bw, plane_prefill_bw);
 }
@@ -204,9 +209,11 @@ static void _dpu_plane_calc_clk(struct drm_plane *plane)
pstate->plane_clk =
dst_width * mode->vtotal * fps;
 
-   if (src_height > dst_height)
-   pstate->plane_clk = mult_frac(pstate->plane_clk,
-   src_height, dst_height);
+   if (src_height > dst_height) {
+   u64 quot = pstate->plane_clk;
+   u32 rem = do_div(quot, dst_height);
+   pstate->plane_clk = quot * src_height + rem * src_height / dst_height;
+   }
 }
 
 /**
-- 
2.26.2
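
As an aside, the open-coded quotient/remainder blocks above could be
factored into one helper built on div_u64_rem() from linux/math64.h —
a sketch of the "better way" alluded to (helper name made up):

	/* Scale a u64 by num/den without 64-bit division on 32-bit
	 * architectures; same intent as mult_frac(), but links on ARM.
	 */
	static inline u64 scale_u64(u64 val, u32 num, u32 den)
	{
		u32 rem;
		u64 quot = div_u64_rem(val, den, &rem);

		return quot * num + div_u64((u64)rem * num, den);
	}

With it, e.g. the bandwidth case becomes
crtc_plane_bw = scale_u64(crtc_plane_bw, bw_factor, 100);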

-- 
Cheers,
Stephen Rothwell


pgpWYfofksFzW.pgp
Description: OpenPGP digital signature


Re: [RFC PATCH 0/8] Qualcomm Cloud AI 100 driver

2020-05-18 Thread Dave Airlie
On Fri, 15 May 2020 at 00:12, Jeffrey Hugo  wrote:
>
> Introduction:
> Qualcomm Cloud AI 100 is a PCIe adapter card which contains a dedicated
> SoC ASIC for the purpose of efficiently running Deep Learning inference
> workloads in a data center environment.
>
> The official press release can be found at -
> https://www.qualcomm.com/news/releases/2019/04/09/qualcomm-brings-power-efficient-artificial-intelligence-inference
>
> The official product website is -
> https://www.qualcomm.com/products/datacenter-artificial-intelligence
>
> At the time of the official press release, numerous technology news sites
> also covered the product.  Doing a search of your favorite site is likely
> to find their coverage of it.
>
> It is our goal to have the kernel driver for the product fully upstream.
> The purpose of this RFC is to start that process.  We are still doing
> development (see below), and thus not looking to gain acceptance quite
> yet, but now that we have a working driver we believe we are at the stage
> where meaningful conversation with the community can occur.


Hi Jeffrey,

Just wondering what the userspace/testing plans are for this driver.

This introduces a new user-facing API for a device without pointers to
users or tests for that API.

Although this isn't a graphics driver, and Greg will likely merge
anything to the kernel you throw at him, I do wonder how to validate
the uapi from a security perspective. It's always interesting when
someone wraps a DMA engine with user ioctls without enough
information to decide whether the DMA engine is secure against userspace
misprogramming.

Also if we don't understand the programming API on board the device,
we can't tell whether the "cores" on the device are able to reprogram the
device engines either.

Figuring this out is difficult at the best of times, it helps if there
is access to the complete device documentation or user space side
drivers in order to facilitate this.

The other area I'd mention is testing the uAPI: how do you envisage
regression testing and long-term sustainability of the uAPI?

Thanks,
Dave.


RE: [PATCH V2 1/3] dt-bindings: timer: Convert i.MX GPT to json-schema

2020-05-18 Thread Aisheng Dong
> From: Anson Huang 
> Sent: Tuesday, May 19, 2020 11:56 AM
> 
> Convert the i.MX GPT binding to DT schema format using json-schema.
> 
> Signed-off-by: Anson Huang 

Reviewed-by: Dong Aisheng 

Regards
Aisheng


[PATCH] usb: xhci: fix USB_XHCI_PCI depends

2020-05-18 Thread Vinod Koul
The xhci-pci-renesas module exports symbols for xhci-pci to load the
RAM/ROM on Renesas xhci controllers. We had a dependency which works
when both modules are built-in or both are modules.

But if xhci-pci is built-in and xhci-pci-renesas is a module, we get the below
linker error:
drivers/usb/host/xhci-pci.o: In function `xhci_pci_remove':
drivers/usb/host/xhci-pci.c:411: undefined reference to `renesas_xhci_pci_exit'
drivers/usb/host/xhci-pci.o: In function `xhci_pci_probe':
drivers/usb/host/xhci-pci.c:345: undefined reference to 
`renesas_xhci_check_request_fw'

Fix this by making USB_XHCI_PCI depend on USB_XHCI_PCI_RENESAS
|| !USB_XHCI_PCI_RENESAS so that both can be either built-in or modules.

Reported-by: Anders Roxell 
Fixes: a66d21d7dba8 ("usb: xhci: Add support for Renesas controller with 
memory")
Tested-by: Anders Roxell 
Signed-off-by: Vinod Koul 
---
 drivers/usb/host/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig
index b5c542d6a1c5..92783d175b3f 100644
--- a/drivers/usb/host/Kconfig
+++ b/drivers/usb/host/Kconfig
@@ -40,11 +40,11 @@ config USB_XHCI_DBGCAP
 config USB_XHCI_PCI
tristate
depends on USB_PCI
+   depends on USB_XHCI_PCI_RENESAS || !USB_XHCI_PCI_RENESAS
default y
 
 config USB_XHCI_PCI_RENESAS
	tristate "Support for additional Renesas xHCI controller with firmware"
-   depends on USB_XHCI_PCI
---help---
  Say 'Y' to enable the support for the Renesas xHCI controller with
	  firmware. Make sure you have the firmware for the device and
-- 
2.25.4
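
A note on the "depends on A || !A" idiom used above: as a tristate
expression it evaluates to y when A is y or n, but only to m when A is
m, which caps the depending symbol at m too — so a built-in user can
never link against a modular provider of exported symbols. A minimal
illustration (symbol names made up):

	config FOO
		tristate
		# FOO can be y only while BAR is y or n;
		# BAR=m limits FOO to m as well.
		depends on BAR || !BAR

	config BAR
		tristate "Example provider of exported symbols"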



Re: [PATCH] x86/mm: Don't try to change poison pages to uncacheable in a guest

2020-05-18 Thread Sean Christopherson
On Mon, May 18, 2020 at 06:55:00PM +0200, Borislav Petkov wrote:
> On Mon, May 18, 2020 at 08:36:25AM -0700, Luck, Tony wrote:
> > The VMM gets the page fault (because the unmapping of the guest
> > physical address is at the VMM EPT level).  The VMM can't map a new
> > page into that guest physical address because it has no way to
> > replace the contents of the old page.  The VMM could pass the #PF
> > to the guest, but that would just confuse the guest (its page tables
> > all say that the page is still valid). In this particular case the
> > page is part of the 1:1 kernel map. So the kernel will OOPS (I think).
> 
> ...
> 
> > PLease explain how a guest (that doesn't even know that it is a guest)
> > is going to figure out that the EPT tables (that it has no way to access)
> > have marked this page invalid in guest physical address space.
> 
> So somewhere BUS_MCEERR_AR was mentioned. So I'm assuming the error
> severity was "action required". What does happen in the kernel, on
> baremetal, with an AR error in kernel space, i.e., kernel memory?
> 
> If we can't fixup the exception, we die.
> 
> So why should the guest behave any differently?
> 
> Now, if you want for the guest to be more "robust" and handle that
> thing, fine. But then you'd need an explicit way to tell the guest
> kernel: "you've just had an MCE and I unmapped the page" so that the
> guest kernel can figure out what do to. Even if it means, to panic.
> 
> I.e., signal in an explicit way that EPT violation Jue is talking about
> in the other mail.

Well, technically the CLFLUSH thing is a KVM emulation bug, but it sounds
like that's a moot point since the pmem-enabled guest will make real
accesses to the poisoned page shortly thereafter.  E.g. teaching KVM to
eat the -EHWPOISON on CLFLUSH would only postpone the guest's death.

As for how the second #MC occurs, on the EPT violation, KVM does a gup() to
translate the virtual address to a pfn (KVM maintains a simple GPA->HVA
lookup).  gup() returns -EHWPOISON for the poisoned page, which KVM
redirects into a BUS_MCEERR_AR.  The userspace VMM, e.g. Qemu, sees the
BUS_MCEERR_AR and sends it back into the guest as a virtual #MC.

> You can inject a #PF or better yet the *first* MCE which is being
> injected should say with a bit somewhere "I unmapped the address in
> m->addr". So that the guest kernel can handle that properly and know
> what *exactly* it is getting an MCE for.
> 
> What I don't like is the "am I running as a guest" check. Because
> someone else would come later and say, err, I'm not virtualizing this
> portion of MCA either, lemme add another "am I guest" check.
> 
> Sure, it is a lot easier but when stuff like that starts spreading
> around in the MCE code, then we can just as well disable MCE when
> virtualized altogether. It would be a lot easier for everybody.


[PATCH] scsi: st: convert get_user_pages() --> pin_user_pages()

2020-05-18 Thread John Hubbard
This code was using get_user_pages*(), in a "Case 2" scenario
(DMA/RDMA), using the categorization from [1]. That means that it's
time to convert the get_user_pages*() + put_page() calls to
pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small
part of fixing a long-standing disconnect between pinning pages, and
file systems' use of those pages.

Note that this effectively changes the code's behavior as well: it now
ultimately calls set_page_dirty_lock(), instead of SetPageDirty(). This
is probably more accurate.

As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
dealing with a file backed page where we have reference on the inode it
hangs off." [3]

Also, this deletes one of the two FIXME comments (about refcounting),
because there is nothing wrong with the refcounting at this point.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/

[3] https://lore.kernel.org/r/20190723153640.gb...@lst.de

Cc: "Kai Mäkisara" 
Cc: James E.J. Bottomley 
Cc: Martin K. Petersen 
Cc: linux-s...@vger.kernel.org
Signed-off-by: John Hubbard 
---
 drivers/scsi/st.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index c5f9b348b438..0369c7edfd94 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -4922,7 +4922,7 @@ static int sgl_map_user_pages(struct st_buffer *STbp,
unsigned long end = (uaddr + count + PAGE_SIZE - 1) >> PAGE_SHIFT;
unsigned long start = uaddr >> PAGE_SHIFT;
const int nr_pages = end - start;
-   int res, i, j;
+   int res, i;
struct page **pages;
	struct rq_map_data *mdata = &STbp->map_data;
 
@@ -4944,7 +4944,7 @@ static int sgl_map_user_pages(struct st_buffer *STbp,
 
 /* Try to fault in all of the necessary pages */
 /* rw==READ means read from drive, write into memory area */
-   res = get_user_pages_fast(uaddr, nr_pages, rw == READ ? FOLL_WRITE : 0,
+   res = pin_user_pages_fast(uaddr, nr_pages, rw == READ ? FOLL_WRITE : 0,
  pages);
 
/* Errors and no page mapped should return here */
@@ -4964,8 +4964,7 @@ static int sgl_map_user_pages(struct st_buffer *STbp,
return nr_pages;
  out_unmap:
if (res > 0) {
-   for (j=0; j < res; j++)
-   put_page(pages[j]);
+   unpin_user_pages(pages, res);
res = 0;
}
kfree(pages);
@@ -4977,18 +4976,10 @@ static int sgl_map_user_pages(struct st_buffer *STbp,
 static int sgl_unmap_user_pages(struct st_buffer *STbp,
const unsigned int nr_pages, int dirtied)
 {
-   int i;
+   /* FIXME: cache flush missing for rw==READ */
 
-   for (i=0; i < nr_pages; i++) {
-   struct page *page = STbp->mapped_pages[i];
+   unpin_user_pages_dirty_lock(STbp->mapped_pages, nr_pages, dirtied);
 
-   if (dirtied)
-   SetPageDirty(page);
-   /* FIXME: cache flush missing for rw==READ
-* FIXME: call the correct reference counting function
-*/
-   put_page(page);
-   }
kfree(STbp->mapped_pages);
STbp->mapped_pages = NULL;
 
-- 
2.26.2



Re: [PATCH v13 3/5] usb: xhci: Add support for Renesas controller with memory

2020-05-18 Thread Vinod Koul
On 19-05-20, 00:37, Anders Roxell wrote:
> On Mon, 18 May 2020 at 21:57, Vinod Koul  wrote:
> >
> > Hi Anders,
> 
> Hi Vinod,
> 
> >
> > On 18-05-20, 19:53, Anders Roxell wrote:
> > > On Wed, 6 May 2020 at 08:01, Vinod Koul  wrote:
> > > >
> > > > Some Renesas controllers like uPD720201 and uPD720202 need firmware to be
> > > > loaded. Add these devices to the PCI table and invoke the Renesas firmware 
> > > > loader
> > > > functions to check and load the firmware into device memory when
> > > > required.
> > > >
> > > > Signed-off-by: Vinod Koul 
> > >
> > > Hi, I got a build error when I built an arm64 allmodconfig kernel.
> >
> > Thanks for this. This is happening as we have default y for USB_XHCI_PCI
> > and then we make USB_XHCI_PCI_RENESAS=m. That should not be allowed: as
> > we export symbols, both can be built-in or both modules, but USB_XHCI_PCI=y
> > and USB_XHCI_PCI_RENESAS=m can't work. It is still valid to have
> > USB_XHCI_PCI=y|m and USB_XHCI_PCI_RENESAS=n.
> >
> > So this seems to get fixed by below for me. I have tested with
> >  - both y and m (easy)
> >  - make USB_XHCI_PCI_RENESAS=n, USB_XHCI_PCI=y|m works
> >  - try making USB_XHCI_PCI=y and USB_XHCI_PCI_RENESAS=m, then
> >USB_XHCI_PCI=m by kbuild :)
> >  - try making USB_XHCI_PCI=m and USB_XHCI_PCI_RENESAS=y, kbuild gives
> >error prompt that it will be m due to depends
> >
> > Thanks to all the fixes done by Arnd which pointed me to this. Pls
> > verify
> 
> I was able to build an arm64 allmodconfig kernel with this change.

I will send the formal patch and add your name as Reported-by and
Tested-by. Thanks for the quick verification.

> 
> Cheers,
> Anders
> 
> > and I will send the fix with you as reported :)
> >
> >  >8 
> >
> > diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig
> > index b5c542d6a1c5..92783d175b3f 100644
> > --- a/drivers/usb/host/Kconfig
> > +++ b/drivers/usb/host/Kconfig
> > @@ -40,11 +40,11 @@ config USB_XHCI_DBGCAP
> >  config USB_XHCI_PCI
> > tristate
> > depends on USB_PCI
> > +   depends on USB_XHCI_PCI_RENESAS || !USB_XHCI_PCI_RENESAS
> > default y
> >
> >  config USB_XHCI_PCI_RENESAS
> > tristate "Support for additional Renesas xHCI controller with 
> > firwmare"
> > -   depends on USB_XHCI_PCI
> > ---help---
> >   Say 'Y' to enable the support for the Renesas xHCI controller with
> >   firmware. Make sure you have the firmware for the device and
> >
> > --
> > ~Vinod

-- 
~Vinod


Re: [PATCH v6 1/2] dt-bindings: mtd: Add Nand Flash Controller support for Intel LGM SoC

2020-05-18 Thread Ramuthevar, Vadivel MuruganX

Hi Rob,

On 19/5/2020 2:27 am, Rob Herring wrote:

On Thu, May 14, 2020 at 8:08 PM Ramuthevar, Vadivel MuruganX
 wrote:


Hi Rob,

On 14/5/2020 8:57 pm, Rob Herring wrote:

On Wed, 13 May 2020 18:46:14 +0800, Ramuthevar,Vadivel MuruganX wrote:

From: Ramuthevar Vadivel Murugan 

Add YAML file for dt-bindings to support NAND Flash Controller
on Intel's Lightning Mountain SoC.

Signed-off-by: Ramuthevar Vadivel Murugan 

---
   .../devicetree/bindings/mtd/intel,lgm-nand.yaml| 83 
++
   1 file changed, 83 insertions(+)
   create mode 100644 Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml




My bot found errors running 'make dt_binding_check' on your patch:

/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/mtd/intel,lgm-nand.example.dt.yaml:
 nand-controller@e0f00000: 'dmas' is a dependency of 'dma-names'

See https://patchwork.ozlabs.org/patch/1289160

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure dt-schema is up to date:

pip3 install git+https://github.com/devicetree-org/dt-schema.git@master 
--upgrade

Please check and re-submit.

Thank you very much for review comments...
I didn't see the build errors; it built successfully.


You need to build without DT_SCHEMA_FILES set or be on 5.7-rc (you
should be on a current -rcX at least for any patch submission). This
comes from the core schema.

Yes, reproduced the issue as above mentioned and fixed it. Thanks!

Regards
Vadivel


Rob



Re: [PATCH v4 2/4] sysctl: Move some boundary constants form sysctl.c to sysctl_vals

2020-05-18 Thread Tetsuo Handa
On 2020/05/19 12:31, Xiaoming Ni wrote:
> Some boundary (.extra1, .extra2) constants (e.g. neg_one, two) in
> sysctl.c are used in multiple features. Move these variables to
> sysctl_vals to avoid adding duplicate variables when cleaning up
> sysctls table.
> 
> Signed-off-by: Xiaoming Ni 
> Reviewed-by: Kees Cook 

I feel that it is the use of

void *extra1;
void *extra2;

in "struct ctl_table" that requires the constant-value indirection.
Can't we get rid of sysctl_vals using some "union" like below?

struct ctl_table {
const char *procname;   /* Text ID for /proc/sys, or zero */
void *data;
int maxlen;
umode_t mode;
struct ctl_table *child;/* Deprecated */
proc_handler *proc_handler; /* Callback for text formatting */
struct ctl_table_poll *poll;
union {
void *min_max_ptr[2];
int min_max_int[2];
long min_max_long[2];
};
} __randomize_layout;
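
A hypothetical table entry under that layout, to show how the bounds
could then live inline instead of pointing into sysctl_vals (this
assumes the proc handler is also taught to read min_max_int[] rather
than extra1/extra2):

	static struct ctl_table example_table[] = {
		{
			.procname	= "example_value",
			.data		= &example_value,	/* made up */
			.maxlen		= sizeof(int),
			.mode		= 0644,
			.proc_handler	= proc_dointvec_minmax,
			/* was: .extra1/.extra2 pointing at shared constants */
			.min_max_int	= { -1, 2 },
		},
		{ }
	};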


[PATCH v8 2/2] mtd: rawnand: Add NAND controller support on Intel LGM SoC

2020-05-18 Thread Ramuthevar,Vadivel MuruganX
From: Ramuthevar Vadivel Murugan 

This patch adds support for the new NAND Flash Controller (NFC) IP
on Intel's Lightning Mountain (LGM) SoC.

DMA is used for burst data transfer operations; the DMA HW supports
aligned 32-bit memory addresses and aligned data access by default.
A DMA burst size of 8 is supported. The data register is used to
support read/write operations from/to the device.

The NAND controller driver implements ->exec_op() to replace the legacy
hooks; these specific callbacks execute the NAND operations.

Signed-off-by: Ramuthevar Vadivel Murugan 

---
 drivers/mtd/nand/raw/Kconfig |   8 +
 drivers/mtd/nand/raw/Makefile|   1 +
 drivers/mtd/nand/raw/intel-nand-controller.c | 743 +++
 3 files changed, 752 insertions(+)
 create mode 100644 drivers/mtd/nand/raw/intel-nand-controller.c

diff --git a/drivers/mtd/nand/raw/Kconfig b/drivers/mtd/nand/raw/Kconfig
index a80a46bb5b8b..75ab2afb78cf 100644
--- a/drivers/mtd/nand/raw/Kconfig
+++ b/drivers/mtd/nand/raw/Kconfig
@@ -457,6 +457,14 @@ config MTD_NAND_CADENCE
  Enable the driver for NAND flash on platforms using a Cadence NAND
  controller.
 
+config MTD_NAND_INTEL_LGM
+   tristate "Support for NAND controller on Intel LGM SoC"
+   depends on OF || COMPILE_TEST
+   depends on HAS_IOMEM
+   help
+ Enables support for NAND Flash chips on Intel's LGM SoC.
+ NAND flash controller interfaced through the External Bus Unit.
+
 comment "Misc"
 
 config MTD_SM_COMMON
diff --git a/drivers/mtd/nand/raw/Makefile b/drivers/mtd/nand/raw/Makefile
index 2d136b158fb7..bfc8fe4d2cb0 100644
--- a/drivers/mtd/nand/raw/Makefile
+++ b/drivers/mtd/nand/raw/Makefile
@@ -58,6 +58,7 @@ obj-$(CONFIG_MTD_NAND_TEGRA)  += tegra_nand.o
 obj-$(CONFIG_MTD_NAND_STM32_FMC2)  += stm32_fmc2_nand.o
 obj-$(CONFIG_MTD_NAND_MESON)   += meson_nand.o
 obj-$(CONFIG_MTD_NAND_CADENCE) += cadence-nand-controller.o
+obj-$(CONFIG_MTD_NAND_INTEL_LGM)   += intel-nand-controller.o
 
 nand-objs := nand_base.o nand_legacy.o nand_bbt.o nand_timings.o nand_ids.o
 nand-objs += nand_onfi.o
diff --git a/drivers/mtd/nand/raw/intel-nand-controller.c 
b/drivers/mtd/nand/raw/intel-nand-controller.c
new file mode 100644
index ..0e1079a4fbb5
--- /dev/null
+++ b/drivers/mtd/nand/raw/intel-nand-controller.c
@@ -0,0 +1,743 @@
+// SPDX-License-Identifier: GPL-2.0+
+/* Copyright (c) 2020 Intel Corporation. */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define EBU_CLC			0x000
+#define EBU_CLC_RST		0x00000000u
+
+#define EBU_ADDR_SEL(n)		(0x20 + (n) * 4)
+/* 5 bits 26:22 included for comparison in the ADDR_SELx */
+#define EBU_ADDR_MASK(x)	((x) << 4)
+#define EBU_ADDR_SEL_REGEN	0x1
+
+#define EBU_BUSCON(n)		(0x60 + (n) * 4)
+#define EBU_BUSCON_CMULT_V4	0x1
+#define EBU_BUSCON_RECOVC(n)	((n) << 2)
+#define EBU_BUSCON_HOLDC(n)	((n) << 4)
+#define EBU_BUSCON_WAITRDC(n)	((n) << 6)
+#define EBU_BUSCON_WAITWRC(n)	((n) << 8)
+#define EBU_BUSCON_BCGEN_CS	0x0
+#define EBU_BUSCON_SETUP_EN	BIT(22)
+#define EBU_BUSCON_ALEC		0xC000
+
+#define EBU_CON			0x0B0
+#define EBU_CON_NANDM_EN	BIT(0)
+#define EBU_CON_NANDM_DIS	0x0
+#define EBU_CON_CSMUX_E_EN	BIT(1)
+#define EBU_CON_ALE_P_LOW	BIT(2)
+#define EBU_CON_CLE_P_LOW	BIT(3)
+#define EBU_CON_CS_P_LOW	BIT(4)
+#define EBU_CON_SE_P_LOW	BIT(5)
+#define EBU_CON_WP_P_LOW	BIT(6)
+#define EBU_CON_PRE_P_LOW	BIT(7)
+#define EBU_CON_IN_CS_S(n)	((n) << 8)
+#define EBU_CON_OUT_CS_S(n)	((n) << 10)
+#define EBU_CON_LAT_EN_CS_P	((0x3D) << 18)
+
+#define EBU_WAIT		0x0B4
+#define EBU_WAIT_RDBY		BIT(0)
+#define EBU_WAIT_WR_C		BIT(3)
+
+#define HSNAND_CTL1		0x110
+#define HSNAND_CTL1_ADDR_SHIFT	24
+
+#define HSNAND_CTL2		0x114
+#define HSNAND_CTL2_ADDR_SHIFT	8
+#define HSNAND_CTL2_CYC_N_V5	(0x2 << 16)
+
+#define HSNAND_INT_MSK_CTL	0x124
+#define HSNAND_INT_MSK_CTL_WR_C	BIT(4)
+
+#define HSNAND_INT_STA		0x128
+#define HSNAND_INT_STA_WR_C	BIT(4)
+
+#define HSNAND_CTL		0x130
+#define HSNAND_CTL_ENABLE_ECC	BIT(0)
+#define HSNAND_CTL_GO		BIT(2)
+#define HSNAND_CTL_CE_SEL_CS(n)	BIT(3 + (n))
+#define HSNAND_CTL_RW_READ	0x0
+#define HSNAND_CTL_RW_WRITE	BIT(10)
+#define HSNAND_CTL_ECC_OFF_V8TH	BIT(11)
+#define HSNAND_CTL_CKFF_EN	0x0
+#define HSNAND_CTL_MSG_EN	BIT(17)
+
+#define HSNAND_PARA0		0x13c
+#define HSNAND_PARA0_PAGE_V8192	0x3
+#define HSNAND_PARA0_PIB_V256	(0x3 << 4)
+#define HSNAND_PARA0_BYP_EN_NP	0x0
+#define HSNAND_PARA0_BYP_DEC_NP	0x0
+#define HSNAND_PARA0_TYPE_ONFI	BIT(18)

[PATCH v8 1/2] dt-bindings: mtd: Add Nand Flash Controller support for Intel LGM SoC

2020-05-18 Thread Ramuthevar,Vadivel MuruganX
From: Ramuthevar Vadivel Murugan 

Add YAML file for dt-bindings to support NAND Flash Controller
on Intel's Lightning Mountain SoC.

Signed-off-by: Ramuthevar Vadivel Murugan 

---
 .../devicetree/bindings/mtd/intel,lgm-nand.yaml| 91 ++
 1 file changed, 91 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml

diff --git a/Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml 
b/Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml
new file mode 100644
index ..cd4e983a449e
--- /dev/null
+++ b/Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml
@@ -0,0 +1,91 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/mtd/intel,lgm-nand.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Intel LGM SoC NAND Controller Device Tree Bindings
+
+allOf:
+  - $ref: "nand-controller.yaml"
+
+maintainers:
+  - Ramuthevar Vadivel Murugan 
+
+properties:
+  compatible:
+const: intel,lgm-nand-controller
+
+  reg:
+items:
+   - description: ebunand registers
+   - description: hsnand registers
+   - description: nand_cs0 external flash access
+   - description: nand_cs1 external flash access
+   - description: addr_sel0 memory region enable and access
+   - description: addr_sel1 memory region enable and access
+
+  clocks:
+maxItems: 1
+
+  dmas:
+maxItems: 2
+
+  dma-names:
+items:
+  - const: tx
+  - const: rx
+
+patternProperties:
+  "^nand@[a-f0-9]+$":
+type: object
+properties:
+  reg:
+minimum: 0
+maximum: 7
+
+  nand-ecc-mode: true
+
+  nand-ecc-algo:
+const: hw
+
+additionalProperties: false
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - dmas
+  - dma-names
+
+additionalProperties: false
+
+examples:
+  - |
    nand-controller@e0f00000 {
      compatible = "intel,lgm-nand";
      reg = <0xe0f00000 0x100>,
<0xe1000000 0x300>,
<0xe1400000 0x8000>,
<0xe1c00000 0x1000>,
<0x17400000 0x4>,
<0x17c00000 0x4>;
+  reg-names = "ebunand", "hsnand", "nand_cs0", "nand_cs1",
+"addr_sel0","addr_sel1";
+  clocks = < 125>;
+  dmas = < 8>, < 9>;
+  dma-names = "tx", "rx";
+  #address-cells = <1>;
+  #size-cells = <0>;
+  #clock-cells = <1>;
+
+  nand@0 {
+reg = <0>;
+nand-on-flash-bbt;
+#address-cells = <1>;
+#size-cells = <1>;
+  };
+};
+
+...
-- 
2.11.0



Re: [RFC] dt-bindings: mailbox: add doorbell support to ARM MHU

2020-05-18 Thread Viresh Kumar
On 18-05-20, 19:53, Jassi Brar wrote:
> That is a client/protocol property and has nothing to do with the
> controller dt node.

That's what I am concerned about, i.e. different ways of passing the
doorbell number via DT.

-- 
viresh


[PATCH v8 0/2] mtd: rawnand: Add NAND controller support on Intel LGM SoC

2020-05-18 Thread Ramuthevar,Vadivel MuruganX
This patch adds support for the new NAND Flash Controller (NFC) IP
on Intel's Lightning Mountain (LGM) SoC.

DMA is used for burst data transfer operations; the DMA HW supports
aligned 32-bit memory addresses and aligned data access by default.
A DMA burst size of 8 is supported. The data register is used to
support read/write operations from/to the device.

The NAND controller also supports an in-built HW ECC engine.

The NAND controller driver implements ->exec_op() to replace the legacy
hooks; these specific callbacks execute the NAND operations.

Thanks Boris, Andy and Arnd for the review comments and suggestions.
---
v8:
  - fix the kbuild bot warnings
  - correct the typo's
v7:
  - indentation issue is fixed
  - add error check for retrieve the resource from dt
  - Rob's review comments addressed
  - dt-schema build issue fixed with upgraded dt-schema
v6:
  - update EBU_ADDR_SELx register base value build it from DT
  - Add tabs in in Kconfig
  - Rob's review comments addressed in YAML file
  - add addr_sel0 and addr_sel1 reg-names in YAML example
v5:
  - replace by 'HSNAND_CLE_OFFS | HSNAND_CS_OFFS' to NAND_WRITE_CMD and 
NAND_WRITE_ADDR
  - remove the unused macros
  - update EBU_ADDR_MASK(x) macro
  - update the EBU_ADDR_SELx register values to be written
  - add the example in YAML file
v4:
  - add ebu_nand_cs structure for multiple-CS support
  - mask/offset encoding for 0x51 value
  - update macro HSNAND_CTL_ENABLE_ECC
  - drop the op argument and un-used macros.
  - updated the datatype and macros
  - add function disable nand module
  - remove ebu_host->dma_rx = NULL;
  - rename MMIO address range variables to ebu and hsnand
  - implement ->setup_data_interface()
  - update label err_cleanup_nand and err_cleanup_dma
  - add return value check in the nand_remove function
  - add/remove tabs and spaces as per coding standard
  - encoded CS ids by reg property
v3:
  - Add depends on MACRO in Kconfig
  - file name update in Makefile
  - file name update to intel-nand-controller
  - modification of MACRO divided like EBU, HSNAND and NAND
  - add NAND_ALE_OFFS, NAND_CLE_OFFS and NAND_CS_OFFS
  - rename lgm_ to ebu_ and _va suffix is removed in the whole file
  - rename structure and varaibles as per review comments.
  - remove lgm_read_byte(), lgm_dev_ready() and cmd_ctrl() un-used function
  - update in exec_op() as per review comments
  - rename function lgm_dma_exit() by lgm_dma_cleanup()
  - hardcoded magic value  for base and offset replaced by MACRO defined
  - mtd_device_unregister() + nand_cleanup() instead of nand_release()
v2:
  - implement the ->exec_op() to replaces the legacy hook-up.
  - update the commit message
  - YAML compatible string update to intel, lgm-nand-controller
  - add MIPS maintainers and xway_nand driver author in CC

v1:
 - initial version

Ramuthevar Vadivel Murugan (2):
  dt-bindings: mtd: Add Nand Flash Controller support for Intel LGM SoC
  mtd: rawnand: Add NAND controller support on Intel LGM SoC

 .../devicetree/bindings/mtd/intel,lgm-nand.yaml|  91 +++
 drivers/mtd/nand/raw/Kconfig   |   8 +
 drivers/mtd/nand/raw/Makefile  |   1 +
 drivers/mtd/nand/raw/intel-nand-controller.c   | 743 +
 4 files changed, 843 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml
 create mode 100644 drivers/mtd/nand/raw/intel-nand-controller.c

-- 
2.11.0



Re: memory offline infinite loop after soft offline

2020-05-18 Thread Qian Cai



> On May 14, 2020, at 11:48 PM, HORIGUCHI NAOYA(堀口 直也) 
>  wrote:
> 
> I'm very sorry to have been quiet for so long, but I think I agree with
> this patchset and will try to see what happens if it is merged into mmotm,
> although we need rebasing to the latest mmotm and some basic testing.

Looks like Oscar has been busy these days. Would you have time to take it over
and rebase it?

[PATCH V2 2/3] dt-bindings: timer: Convert i.MX TPM to json-schema

2020-05-18 Thread Anson Huang
Convert the i.MX TPM binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
Reviewed-by: Dong Aisheng 
---
Changes since V1:
- remove unnecessary maxItems for clocks/clock-names.
---
 .../devicetree/bindings/timer/nxp,tpm-timer.txt| 28 --
 .../devicetree/bindings/timer/nxp,tpm-timer.yaml   | 61 ++
 2 files changed, 61 insertions(+), 28 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/timer/nxp,tpm-timer.txt
 create mode 100644 Documentation/devicetree/bindings/timer/nxp,tpm-timer.yaml

diff --git a/Documentation/devicetree/bindings/timer/nxp,tpm-timer.txt 
b/Documentation/devicetree/bindings/timer/nxp,tpm-timer.txt
deleted file mode 100644
index f82087b..0000000
--- a/Documentation/devicetree/bindings/timer/nxp,tpm-timer.txt
+++ /dev/null
@@ -1,28 +0,0 @@
-NXP Low Power Timer/Pulse Width Modulation Module (TPM)
-
-The Timer/PWM Module (TPM) supports input capture, output compare,
-and the generation of PWM signals to control electric motor and power
-management applications. The counter, compare and capture registers
-are clocked by an asynchronous clock that can remain enabled in low
-power modes. TPM can support global counter bus where one TPM drives
-the counter bus for the others, provided bit width is the same.
-
-Required properties:
-
-- compatible : should be "fsl,imx7ulp-tpm"
-- reg :Specifies base physical address and size of the 
register sets
-   for the clock event device and clock source device.
-- interrupts : Should be the clock event device interrupt.
-- clocks : The clocks provided by the SoC to drive the timer, must contain
-   an entry for each entry in clock-names.
-- clock-names : Must include the following entries: "ipg" and "per".
-
-Example:
-tpm5: tpm@40260000 {
-   compatible = "fsl,imx7ulp-tpm";
-   reg = <0x40260000 0x1000>;
-   interrupts = ;
-   clocks = < IMX7ULP_CLK_NIC1_BUS_DIV>,
-< IMX7ULP_CLK_LPTPM5>;
-   clock-names = "ipg", "per";
-};
diff --git a/Documentation/devicetree/bindings/timer/nxp,tpm-timer.yaml 
b/Documentation/devicetree/bindings/timer/nxp,tpm-timer.yaml
new file mode 100644
index 0000000..edd9585
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/nxp,tpm-timer.yaml
@@ -0,0 +1,61 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/timer/nxp,tpm-timer.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: NXP Low Power Timer/Pulse Width Modulation Module (TPM)
+
+maintainers:
+  - Dong Aisheng 
+
+description: |
+  The Timer/PWM Module (TPM) supports input capture, output compare,
+  and the generation of PWM signals to control electric motor and power
+  management applications. The counter, compare and capture registers
+  are clocked by an asynchronous clock that can remain enabled in low
+  power modes. TPM can support global counter bus where one TPM drives
+  the counter bus for the others, provided bit width is the same.
+
+properties:
+  compatible:
+const: fsl,imx7ulp-tpm
+
+  reg:
+maxItems: 1
+
+  interrupts:
+maxItems: 1
+
+  clocks:
+items:
+  - description: SoC TPM ipg clock
+  - description: SoC TPM per clock
+
+  clock-names:
+items:
+  - const: ipg
+  - const: per
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+
+timer@40260000 {
+compatible = "fsl,imx7ulp-tpm";
+reg = <0x40260000 0x1000>;
+interrupts = ;
+clocks = < IMX7ULP_CLK_NIC1_BUS_DIV>,
+ < IMX7ULP_CLK_LPTPM5>;
+clock-names = "ipg", "per";
+};
-- 
2.7.4
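
For context, a minimal sketch of how a timer driver might consume the
resources this binding describes (the two named clocks plus the single
interrupt); the probe function below is illustrative, not the actual
imx-tpm driver.

#include <linux/clk.h>
#include <linux/err.h>
#include <linux/platform_device.h>

static int tpm_timer_probe_sketch(struct platform_device *pdev)
{
	struct clk *ipg, *per;
	int irq;

	/* clock-names from the binding: "ipg" first, then "per" */
	ipg = devm_clk_get(&pdev->dev, "ipg");
	if (IS_ERR(ipg))
		return PTR_ERR(ipg);

	per = devm_clk_get(&pdev->dev, "per");
	if (IS_ERR(per))
		return PTR_ERR(per);

	/* the binding allows exactly one interrupt (maxItems: 1) */
	irq = platform_get_irq(pdev, 0);
	if (irq < 0)
		return irq;

	/* enable the clocks and register the clockevent device here */
	return 0;
}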



[PATCH V2 1/3] dt-bindings: timer: Convert i.MX GPT to json-schema

2020-05-18 Thread Anson Huang
Convert the i.MX GPT binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
---
Changes since V1:
- remove unnecessary compatible item descriptions;
- remove unnecessary maxItems for clocks/clock-names;
---
 .../devicetree/bindings/timer/fsl,imxgpt.txt   | 45 
 .../devicetree/bindings/timer/fsl,imxgpt.yaml  | 80 ++
 2 files changed, 80 insertions(+), 45 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/timer/fsl,imxgpt.txt
 create mode 100644 Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml

diff --git a/Documentation/devicetree/bindings/timer/fsl,imxgpt.txt 
b/Documentation/devicetree/bindings/timer/fsl,imxgpt.txt
deleted file mode 100644
index 5d8fd5b..0000000
--- a/Documentation/devicetree/bindings/timer/fsl,imxgpt.txt
+++ /dev/null
@@ -1,45 +0,0 @@
-Freescale i.MX General Purpose Timer (GPT)
-
-Required properties:
-
-- compatible : should be one of following:
-  for i.MX1:
-  - "fsl,imx1-gpt";
-  for i.MX21:
-  - "fsl,imx21-gpt";
-  for i.MX27:
-  - "fsl,imx27-gpt", "fsl,imx21-gpt";
-  for i.MX31:
-  - "fsl,imx31-gpt";
-  for i.MX25:
-  - "fsl,imx25-gpt", "fsl,imx31-gpt";
-  for i.MX50:
-  - "fsl,imx50-gpt", "fsl,imx31-gpt";
-  for i.MX51:
-  - "fsl,imx51-gpt", "fsl,imx31-gpt";
-  for i.MX53:
-  - "fsl,imx53-gpt", "fsl,imx31-gpt";
-  for i.MX6Q:
-  - "fsl,imx6q-gpt", "fsl,imx31-gpt";
-  for i.MX6DL:
-  - "fsl,imx6dl-gpt";
-  for i.MX6SL:
-  - "fsl,imx6sl-gpt", "fsl,imx6dl-gpt";
-  for i.MX6SX:
-  - "fsl,imx6sx-gpt", "fsl,imx6dl-gpt";
-- reg : specifies base physical address and size of the registers.
-- interrupts : should be the gpt interrupt.
-- clocks : the clocks provided by the SoC to drive the timer, must contain
-   an entry for each entry in clock-names.
-- clock-names : must include "ipg" entry first, then "per" entry.
-
-Example:
-
-gpt1: timer@10003000 {
-   compatible = "fsl,imx27-gpt", "fsl,imx21-gpt";
-   reg = <0x10003000 0x1000>;
-   interrupts = <26>;
-   clocks = < IMX27_CLK_GPT1_IPG_GATE>,
-< IMX27_CLK_PER1_GATE>;
-   clock-names = "ipg", "per";
-};
diff --git a/Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml 
b/Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml
new file mode 100644
index 0000000..5479290
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml
@@ -0,0 +1,80 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/timer/fsl,imxgpt.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX General Purpose Timer (GPT)
+
+maintainers:
+  - Sascha Hauer 
+
+properties:
+  compatible:
+oneOf:
+  - const: "fsl,imx1-gpt"
+  - const: "fsl,imx21-gpt"
+  - items:
+  - const: "fsl,imx27-gpt"
+  - const: "fsl,imx21-gpt"
+  - const: "fsl,imx31-gpt"
+  - items:
+  - const: "fsl,imx25-gpt"
+  - const: "fsl,imx31-gpt"
+  - items:
+  - const: "fsl,imx50-gpt"
+  - const: "fsl,imx31-gpt"
+  - items:
+  - const: "fsl,imx51-gpt"
+  - const: "fsl,imx31-gpt"
+  - items:
+  - const: "fsl,imx53-gpt"
+  - const: "fsl,imx31-gpt"
+  - items:
+  - const: "fsl,imx6q-gpt"
+  - const: "fsl,imx31-gpt"
+  - const: "fsl,imx6dl-gpt"
+  - items:
+  - const: "fsl,imx6sl-gpt"
+  - const: "fsl,imx6dl-gpt"
+  - items:
+  - const: "fsl,imx6sx-gpt"
+  - const: "fsl,imx6dl-gpt"
+
+  reg:
+maxItems: 1
+
+  interrupts:
+maxItems: 1
+
+  clocks:
+items:
+  - description: SoC GPT ipg clock
+  - description: SoC GPT per clock
+
+  clock-names:
+items:
+  - const: ipg
+  - const: per
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+
+timer@10003000 {
+compatible = "fsl,imx27-gpt", "fsl,imx21-gpt";
+reg = <0x10003000 0x1000>;
+interrupts = <26>;
+clocks = < IMX27_CLK_GPT1_IPG_GATE>,
+ < IMX27_CLK_PER1_GATE>;
+clock-names = "ipg", "per";
+};
-- 
2.7.4



[PATCH V2 0/3] Covert i.MX GPT/TPM/SYSCTR timer binding to json-schema

2020-05-18 Thread Anson Huang
This patch series converts the i.MX GPT, TPM and system counter timer
bindings to json-schema; test build passed.

Changes compared to V1 are listed in each patch.

Anson Huang (3):
  dt-bindings: timer: Convert i.MX GPT to json-schema
  dt-bindings: timer: Convert i.MX TPM to json-schema
  dt-bindings: timer: Convert i.MX SYSCTR to json-schema

 .../devicetree/bindings/timer/fsl,imxgpt.txt   | 45 
 .../devicetree/bindings/timer/fsl,imxgpt.yaml  | 80 ++
 .../devicetree/bindings/timer/nxp,sysctr-timer.txt | 25 ---
 .../bindings/timer/nxp,sysctr-timer.yaml   | 54 +++
 .../devicetree/bindings/timer/nxp,tpm-timer.txt| 28 
 .../devicetree/bindings/timer/nxp,tpm-timer.yaml   | 61 +
 6 files changed, 195 insertions(+), 98 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/timer/fsl,imxgpt.txt
 create mode 100644 Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml
 delete mode 100644 Documentation/devicetree/bindings/timer/nxp,sysctr-timer.txt
 create mode 100644 
Documentation/devicetree/bindings/timer/nxp,sysctr-timer.yaml
 delete mode 100644 Documentation/devicetree/bindings/timer/nxp,tpm-timer.txt
 create mode 100644 Documentation/devicetree/bindings/timer/nxp,tpm-timer.yaml

-- 
2.7.4



Re: [RFC] dt-bindings: mailbox: add doorbell support to ARM MHU

2020-05-18 Thread Jassi Brar
On Mon, May 18, 2020 at 10:40 PM Viresh Kumar  wrote:
>
> On 18-05-20, 18:29, Bjorn Andersson wrote:
> > On Thu 14 May 22:17 PDT 2020, Viresh Kumar wrote:
> > > This stuff has been doing rounds on the mailing list since several years
> > > now with no agreed conclusion by all the parties. And here is another
> > > attempt to get some feedback from everyone involved to close this once
> > > and for ever. Your comments will very much be appreciated.
> > >
> > > The ARM MHU is defined here in the TRM [1] for your reference, which
> > > states following:
> > >
> > > "The MHU drives the signal using a 32-bit register, with all 32
> > > bits logically ORed together. The MHU provides a set of
> > > registers to enable software to set, clear, and check the status
> > > of each of the bits of this register independently.  The use of
> > > 32 bits for each interrupt line enables software to provide more
> > > information about the source of the interrupt. For example, each
> > > bit of the register can be associated with a type of event that
> > > can contribute to raising the interrupt."
> > >
> >
> > Does this mean that there are 32 different signals and they are all ORed
> > into the same interrupt line to trigger software action when something
> > happens?
> >
> > Or does it mean that this register is used to pass multi-bit information
> > and when any such information is passed an interrupt will be triggered?
> > If so, what does that information mean? How is it tied into other Linux
> > drivers/subsystems?
>
> I have started to believe the hardware is written badly at this point
> :)
>
H/W is actually fine :)   It's just that the driver is written to
_also_ support a platform (my original) that doesn't have shmem and
needs to pass data via 32-bit registers.
Frankly, I am not against the doorbell mode, I am against implementing
two modes in a driver. If it really helped (note the past tense) the
SCMI, we could implement the driver only in doorbell mode but
unfortunately SCMI would still be _broken_ for non-doorbell
controllers.
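
To make the register description above concrete, here is a small doorbell
sketch. The STAT/SET/CLR offsets follow drivers/mailbox/arm_mhu.c, but treat
the whole thing as an illustration of "one ORed interrupt, 32 independently
settable bits" rather than as driver code.

#include <linux/bits.h>
#include <linux/io.h>

#define INTR_STAT_OFS	0x0	/* offsets as in arm_mhu.c (assumed here) */
#define INTR_SET_OFS	0x8
#define INTR_CLR_OFS	0x10

/* Sender: raise doorbell 'bit' (0..31); all 32 bits OR into one irq line. */
static void mhu_doorbell_ring(void __iomem *chan_base, unsigned int bit)
{
	writel_relaxed(BIT(bit), chan_base + INTR_SET_OFS);
}

/* Receiver: see which doorbells are pending, then ack the one handled. */
static u32 mhu_doorbell_pending(void __iomem *chan_base)
{
	return readl_relaxed(chan_base + INTR_STAT_OFS);
}

static void mhu_doorbell_clear(void __iomem *chan_base, unsigned int bit)
{
	writel_relaxed(BIT(bit), chan_base + INTR_CLR_OFS);
}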


[PATCH V2 3/3] dt-bindings: timer: Convert i.MX SYSCTR to json-schema

2020-05-18 Thread Anson Huang
Convert the i.MX SYSCTR binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
Reviewed-by: Dong Aisheng 
---
No changes.
---
 .../devicetree/bindings/timer/nxp,sysctr-timer.txt | 25 --
 .../bindings/timer/nxp,sysctr-timer.yaml   | 54 ++
 2 files changed, 54 insertions(+), 25 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/timer/nxp,sysctr-timer.txt
 create mode 100644 
Documentation/devicetree/bindings/timer/nxp,sysctr-timer.yaml

diff --git a/Documentation/devicetree/bindings/timer/nxp,sysctr-timer.txt 
b/Documentation/devicetree/bindings/timer/nxp,sysctr-timer.txt
deleted file mode 100644
index d576599..0000000
--- a/Documentation/devicetree/bindings/timer/nxp,sysctr-timer.txt
+++ /dev/null
@@ -1,25 +0,0 @@
-NXP System Counter Module(sys_ctr)
-
-The system counter(sys_ctr) is a programmable system counter which provides
-a shared time base to Cortex A15, A7, A53, A73, etc. it is intended for use in
-applications where the counter is always powered and support multiple,
-unrelated clocks. The compare frame inside can be used for timer purpose.
-
-Required properties:
-
-- compatible :  should be "nxp,sysctr-timer"
-- reg : Specifies the base physical address and size of the comapre
-frame and the counter control, read & compare.
-- interrupts :  should be the first compare frames' interrupt
-- clocks : Specifies the counter clock.
-- clock-names: Specifies the clock's name of this module
-
-Example:
-
-   system_counter: timer@306a0000 {
-   compatible = "nxp,sysctr-timer";
-   reg = <0x306a0000 0x20000>;/* system-counter-rd & compare */
-   clocks = <&clk_8m>;
-   clock-names = "per";
-   interrupts = ;
-   };
diff --git a/Documentation/devicetree/bindings/timer/nxp,sysctr-timer.yaml 
b/Documentation/devicetree/bindings/timer/nxp,sysctr-timer.yaml
new file mode 100644
index 0000000..830211c
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/nxp,sysctr-timer.yaml
@@ -0,0 +1,54 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/timer/nxp,sysctr-timer.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: NXP System Counter Module(sys_ctr)
+
+maintainers:
+  - Bai Ping 
+
+description: |
+  The system counter(sys_ctr) is a programmable system counter
+  which provides a shared time base to Cortex A15, A7, A53, A73,
+  etc. it is intended for use in applications where the counter
+  is always powered and support multiple, unrelated clocks. The
+  compare frame inside can be used for timer purpose.
+
+properties:
+  compatible:
+const: nxp,sysctr-timer
+
+  reg:
+maxItems: 1
+
+  interrupts:
+maxItems: 1
+
+  clocks:
+maxItems: 1
+
+  clock-names:
+const: per
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+
+timer@306a0000 {
+compatible = "nxp,sysctr-timer";
+reg = <0x306a0000 0x20000>;
+clocks = <&clk_8m>;
+clock-names = "per";
+interrupts = ;
+ };
-- 
2.7.4



Re: [PATCH] ceph: don't return -ESTALE if there's still an open file

2020-05-18 Thread Amir Goldstein
On Tue, May 19, 2020 at 1:30 AM Gregory Farnum  wrote:
>
> Maybe we resolved this conversation; I can't quite tell...

I think v2 patch wraps it up...

[...]

> > >
> > > Questions:
> > > 1. Does sync() result in fully purging inodes on MDS?
> >
> > I don't think so, but again, that code is not trivial to follow. I do
> > know that the MDS keeps around a "strays directory" which contains
> > unlinked inodes that are lazily cleaned up. My suspicion is that it's
> > satisfying lookups out of this cache as well.
> >
> > Which may be fine...the MDS is not required to be POSIX compliant after
> > all. Only the fs drivers are.
>
> I don't think this is quite that simple. Yes, the MDS is certainly
> giving back stray inodes in response to a lookup-by-ino request. But
> that's for a specific purpose: we need to be able to give back caps on
> unlinked-but-open files. For NFS specifically, I don't know what the
> rules are on NFS file handles and unlinked files, but the Ceph MDS
> won't know when files are closed everywhere, and it translates from
> NFS fh to Ceph inode using that lookup-by-ino functionality.
>

There is no protocol rule that an NFS server MUST return ESTALE for the
file handle of a deleted file, but there is a rule that it MAY return
ESTALE for a deleted file. For example, on server restart with a
traditional block filesystem, there is not much choice.

So returning ESTALE when a file is deleted but opened on another ceph
client is definitely allowed by the protocol standard; the question is
whether changing the behavior will break any existing workloads...

> >
> > > 2. Is i_nlink synchronized among nodes on deferred delete?
> > > IOW, can an inode come back from the dead on a client if another node
> > > has linked it before i_nlink 0 was observed?
> >
> > No, that shouldn't happen. The caps mechanism should ensure that it
> > can't be observed by other clients until after the change.
> >
> > That said, Luis' current patch doesn't ensure we have the correct caps
> > to check the i_nlink. We may need to add that in before we can roll with
> > this.
> >
> > > 3. Can an NFS client be "migrated" from one ceph node to another
> > > with an open but unlinked file?
> > >
> >
> > No. Open files in ceph are generally per-client. You can't pass around a
> > fd (or equivalent).
>
> But the NFS file handles I think do work across clients, right?
>

Maybe they can, but that would be like an NFS server restart, so
all bets are off w.r.t. open-but-deleted files.

Thanks,
Amir.


Re: [RESEND PATCH v8 0/3] Add Intel ComboPhy driver

2020-05-18 Thread Dilip Kota



On 5/18/2020 9:49 PM, Kishon Vijay Abraham I wrote:

Dilip,

On 5/15/2020 1:43 PM, Dilip Kota wrote:

This patch series adds the Intel ComboPhy driver and its respective YAML schemas.

Changes on v8:
   As per the PHY maintainer's request, add a description in comments for
   doing register access through the regmap framework.

Changes on v7:
   As per the system control driver maintainer's input, remove the
 fwnode_to_regmap() definition and use device_node_get_regmap()

Can you fix this warning and resend the patch?
drivers/phy/intel/phy-intel-combo.c:229:6: warning: ‘cb_mode’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
   ret = regmap_write(cbphy->hsiocfg, REG_COMBO_MODE(cbphy->bid), cb_mode);
   ^~~
drivers/phy/intel/phy-intel-combo.c:204:24: note: ‘cb_mode’ was declared here
   enum intel_combo_mode cb_mode;
 ^~~

I noticed this warning while preparing the patch.
It sounds like a false warning because:
1.) "cb_mode" is initialized in the switch case based on the "mode = 
cbphy->phy_mode;"
2.) cbphy->phy_mode is initialized during the probe in 
"intel_cbphy_fwnode_parse()" with one of the 3 values.

PHY_PCIE_MODE, PHY_SATA_MODE, PHY_XPCS_MODE.
3.) There is no chance of "cbphy->phy_mode" having different value.
4.) And "cb_mode" will be initialized according to the "mode = 
cbphy->phy_mode;"

5.) Hence, there is no chance of "cb_mode" getting accessed uninitialized.
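
Dilip's reasoning can be reproduced with a standalone example: every
enumerator is handled, yet GCC 9's -Wmaybe-uninitialized can still fire
because the compiler cannot prove the enum never holds an out-of-range
value. The names below are made up for illustration; adding a default case
(or an initializer) is the usual way to silence it without changing
behaviour.

enum combo_mode_sketch { SKETCH_PCIE, SKETCH_SATA, SKETCH_XPCS };

int mode_to_reg(enum combo_mode_sketch mode)
{
	int cb_mode;	/* the warning points at uses of this variable */

	switch (mode) {
	case SKETCH_PCIE:
		cb_mode = 1;
		break;
	case SKETCH_SATA:
		cb_mode = 2;
		break;
	case SKETCH_XPCS:
		cb_mode = 3;
		break;
	}

	return cb_mode;
}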

Regards,
Dilip

Thanks
Kishon
 
Changes on v6:

   Rebase patches on the latest maintainer's branch
   
https://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy.git/?h=phy-for-5.7
Dilip Kota (3):
   dt-bindings: phy: Add PHY_TYPE_XPCS definition
   dt-bindings: phy: Add YAML schemas for Intel ComboPhy
   phy: intel: Add driver support for ComboPhy

  .../devicetree/bindings/phy/intel,combo-phy.yaml   | 101 
  drivers/phy/intel/Kconfig  |  14 +
  drivers/phy/intel/Makefile |   1 +
  drivers/phy/intel/phy-intel-combo.c| 632 +
  include/dt-bindings/phy/phy.h  |   1 +
  5 files changed, 749 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/phy/intel,combo-phy.yaml
  create mode 100644 drivers/phy/intel/phy-intel-combo.c



[PATCH V3] dt-bindings: reset: Convert i.MX reset to json-schema

2020-05-18 Thread Anson Huang
Convert the i.MX reset binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
Reviewed-by: Dong Aisheng 
---
Changes since V2:
- remove unnecessary compatible item descriptions.
---
 .../devicetree/bindings/reset/fsl,imx-src.txt  | 49 -
 .../devicetree/bindings/reset/fsl,imx-src.yaml | 82 ++
 2 files changed, 82 insertions(+), 49 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/reset/fsl,imx-src.txt
 create mode 100644 Documentation/devicetree/bindings/reset/fsl,imx-src.yaml

diff --git a/Documentation/devicetree/bindings/reset/fsl,imx-src.txt 
b/Documentation/devicetree/bindings/reset/fsl,imx-src.txt
deleted file mode 100644
index 6ed79e6..0000000
--- a/Documentation/devicetree/bindings/reset/fsl,imx-src.txt
+++ /dev/null
@@ -1,49 +0,0 @@
-Freescale i.MX System Reset Controller
-==
-
-Please also refer to reset.txt in this directory for common reset
-controller binding usage.
-
-Required properties:
-- compatible: Should be "fsl,-src"
-- reg: should be register base and length as documented in the
-  datasheet
-- interrupts: Should contain SRC interrupt and CPU WDOG interrupt,
-  in this order.
-- #reset-cells: 1, see below
-
-example:
-
-src: src@20d8000 {
-compatible = "fsl,imx6q-src";
-reg = <0x020d8000 0x4000>;
-interrupts = <0 91 0x04 0 96 0x04>;
-#reset-cells = <1>;
-};
-
-Specifying reset lines connected to IP modules
-==
-
-The system reset controller can be used to reset the GPU, VPU,
-IPU, and OpenVG IP modules on i.MX5 and i.MX6 ICs. Those device
-nodes should specify the reset line on the SRC in their resets
-property, containing a phandle to the SRC device node and a
-RESET_INDEX specifying which module to reset, as described in
-reset.txt
-
-example:
-
-ipu1: ipu@240 {
-resets = < 2>;
-};
-ipu2: ipu@280 {
-resets = < 4>;
-};
-
-The following RESET_INDEX values are valid for i.MX5:
-GPU_RESET 0
-VPU_RESET 1
-IPU1_RESET2
-OPEN_VG_RESET 3
-The following additional RESET_INDEX value is valid for i.MX6:
-IPU2_RESET4
diff --git a/Documentation/devicetree/bindings/reset/fsl,imx-src.yaml 
b/Documentation/devicetree/bindings/reset/fsl,imx-src.yaml
new file mode 100644
index 0000000..27c5e34
--- /dev/null
+++ b/Documentation/devicetree/bindings/reset/fsl,imx-src.yaml
@@ -0,0 +1,82 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/reset/fsl,imx-src.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX System Reset Controller
+
+maintainers:
+  - Philipp Zabel 
+
+description: |
+  The system reset controller can be used to reset the GPU, VPU,
+  IPU, and OpenVG IP modules on i.MX5 and i.MX6 ICs. Those device
+  nodes should specify the reset line on the SRC in their resets
+  property, containing a phandle to the SRC device node and a
+  RESET_INDEX specifying which module to reset, as described in
+  reset.txt
+
+  The following RESET_INDEX values are valid for i.MX5:
+GPU_RESET 0
+VPU_RESET 1
+IPU1_RESET2
+OPEN_VG_RESET 3
+  The following additional RESET_INDEX value is valid for i.MX6:
+IPU2_RESET4
+
+properties:
+  compatible:
+oneOf:
+  - const: "fsl,imx51-src"
+  - items:
+  - const: "fsl,imx50-src"
+  - const: "fsl,imx51-src"
+  - items:
+  - const: "fsl,imx53-src"
+  - const: "fsl,imx51-src"
+  - items:
+  - const: "fsl,imx6q-src"
+  - const: "fsl,imx51-src"
+  - items:
+  - const: "fsl,imx6sx-src"
+  - const: "fsl,imx51-src"
+  - items:
+  - const: "fsl,imx6sl-src"
+  - const: "fsl,imx51-src"
+  - items:
+  - const: "fsl,imx6ul-src"
+  - const: "fsl,imx51-src"
+  - items:
+  - const: "fsl,imx6sll-src"
+  - const: "fsl,imx51-src"
+
+  reg:
+maxItems: 1
+
+  interrupts:
+items:
+  - description: SRC interrupt
+  - description: CPU WDOG interrupts out of SRC
+minItems: 1
+maxItems: 2
+
+  '#reset-cells':
+const: 1
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - '#reset-cells'
+
+additionalProperties: false
+
+examples:
+  - |
+reset-controller@73fd0000 {
+compatible = "fsl,imx51-src";
+reg = <0x73fd0000 0x4000>;
+interrupts = <75>;
+#reset-cells = <1>;
+};
-- 
2.7.4
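
As a companion to the binding, a minimal sketch of a consumer driver pulsing
the module reset line described by 'resets = <&src N>;' (the probe function
and its name are illustrative):

#include <linux/err.h>
#include <linux/platform_device.h>
#include <linux/reset.h>

static int ipu_reset_sketch(struct platform_device *pdev)
{
	struct reset_control *rstc;

	/* resolves the phandle + RESET_INDEX from the 'resets' property */
	rstc = devm_reset_control_get_exclusive(&pdev->dev, NULL);
	if (IS_ERR(rstc))
		return PTR_ERR(rstc);

	/* assert, wait, deassert -- one reset pulse on the SRC line */
	return reset_control_reset(rstc);
}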



drivers/clk/socfpga/clk-gate.c:100:10: note: in expansion of macro 'GENMASK'

2020-05-18 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   642b151f45dd54809ea00ecd3976a56c1ec9b53d
commit: 295bcca84916cb5079140a89fccb472bb8d1f6e2 linux/bits.h: add compile time 
sanity check of GENMASK inputs
date:   6 weeks ago
config: arm-defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 295bcca84916cb5079140a89fccb472bb8d1f6e2
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All warnings (new ones prefixed by >>, old ones prefixed by <<):

In file included from include/linux/bits.h:23,
from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from include/asm-generic/bug.h:19,
from arch/arm/include/asm/bug.h:60,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/gfp.h:5,
from include/linux/slab.h:15,
from drivers/clk/socfpga/clk-gate.c:8:
drivers/clk/socfpga/clk-gate.c: In function 'socfpga_clk_recalc_rate':
include/linux/bits.h:26:28: warning: comparison of unsigned expression < 0 is 
always false [-Wtype-limits]
26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
|^
include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
|  ^
include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
|   ^~~
>> drivers/clk/socfpga/clk-gate.c:100:10: note: in expansion of macro 'GENMASK'
100 |   val &= GENMASK(socfpgaclk->width - 1, 0);
|  ^~~
include/linux/bits.h:26:40: warning: comparison of unsigned expression < 0 is 
always false [-Wtype-limits]
26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
|^
include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
|  ^
include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
|   ^~~
>> drivers/clk/socfpga/clk-gate.c:100:10: note: in expansion of macro 'GENMASK'
100 |   val &= GENMASK(socfpgaclk->width - 1, 0);
|  ^~~
--
In file included from include/linux/bits.h:23,
from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from include/asm-generic/bug.h:19,
from arch/arm/include/asm/bug.h:60,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/gfp.h:5,
from include/linux/slab.h:15,
from drivers/clk/socfpga/clk-periph.c:8:
drivers/clk/socfpga/clk-periph.c: In function 'clk_periclk_recalc_rate':
include/linux/bits.h:26:28: warning: comparison of unsigned expression < 0 is 
always false [-Wtype-limits]
26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
|^
include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
|  ^
include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
|   ^~~
>> drivers/clk/socfpga/clk-periph.c:28:11: note: in expansion of macro 'GENMASK'
28 |val &= GENMASK(socfpgaclk->width - 1, 0);
|   ^~~
include/linux/bits.h:26:40: warning: comparison of unsigned expression < 0 is 
always false [-Wtype-limits]
26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
|^
include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
|  ^
include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
|   ^~~
>> drivers/clk/socfpga/clk-periph.c:28:11: note: in expansion of macro 'GENMASK'
28 |val &= GENMASK(socfpgaclk->width - 1, 0);
|   ^~~
--
In file included from include/linux/bits.h:23,
from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from include/asm-generic/bug.h:19,
from arch/arm/include/asm/bug.h:60,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/gfp.h:5,
from include/linux/slab.h:15,
from drivers/clk/socfpga/clk-periph-a10.c:5:
drivers/clk/socfpga/clk-periph-a10.c: In 
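
The warning itself is benign: GENMASK_INPUT_CHECK() only triggers
BUILD_BUG_ON_ZERO() for constant inputs, but the '(l) > (h)' expression is
still parsed for run-time inputs, and with l == 0 and an unsigned h it
degenerates to 'h < 0', which -Wtype-limits flags as always false. For
run-time widths such as 'socfpgaclk->width - 1', one conventional way to
build the mask without tripping the check is sketched below (an
illustration, not the fix that was eventually merged):

#include <linux/types.h>

/* Low mask of 'width' bits for a run-time width in 1..32. */
static inline u32 low_mask(unsigned int width)
{
	if (width >= 32)
		return ~0U;		/* avoid the undefined 1U << 32 */
	return (1U << width) - 1;
}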

Re: zswap z3fold + memory offline = infinite loop

2020-05-18 Thread Qian Cai
On Wed, May 13, 2020 at 4:28 AM Vitaly Wool  wrote:
>
>
>
> On Wed, May 13, 2020, 2:36 AM Qian Cai  wrote:
>>
>> Putting zswap z3fold pages into memory and then offlining that memory
>> triggers an infinite loop here in
>>
>> __offline_pages() --> do_migrate_range() because there is no error handling,
>>
>> if (pfn) {
>> /*
>>  * TODO: fatal migration failures should bail
>>  * out
>>  */
>> do_migrate_range(pfn, end_pfn);
>>
>> There, isolate_movable_page() will always return -EBUSY  because,
>>
>> if (!mapping->a_ops->isolate_page(page, mode))
>> goto out_no_isolated;
>>
>> i.e., z3fold_page_isolate() will always return false because,
>>
>> zhdr->mapped_count == 2
>
>
> So who mapped these pages? The whole zswap operation presumes that objects 
> are mapped for a short while to run some I/O and so, most of the time 
> zhdr->mapped_count would be 0.

I have no clue why those pages have been mapped for so long, but it is
trivial to reproduce using the above reproducer. Also, zbud has no
such issue. Alternatively, if you could send me some debug patches to
narrow it down, I'll be happy to run them for you.

>
> Removing that check in ->isolate() is not a big deal, but ->migratepage() 
> shall not allow actual migration anyway if there are mapped objects.

Is that worse than an endless loop here?


Re: [PATCH v2] ceph: don't return -ESTALE if there's still an open file

2020-05-18 Thread Amir Goldstein
On Mon, May 18, 2020 at 8:47 PM Luis Henriques  wrote:
>
> Similarly to commit 03f219041fdb ("ceph: check i_nlink while converting
> a file handle to dentry"), this fixes another corner case with
> name_to_handle_at/open_by_handle_at.  The issue has been detected by
> xfstest generic/467, when doing:
>
>  - name_to_handle_at("/cephfs/myfile")
>  - open("/cephfs/myfile")
>  - unlink("/cephfs/myfile")
>  - sync; sync;
>  - drop caches
>  - open_by_handle_at()
>
> The call to open_by_handle_at should not fail because the file hasn't been
> deleted yet (only unlinked) and we do have a valid handle to it.  -ESTALE
> shall be returned only if i_nlink is 0 *and* i_count is 1.
>
> This patch also makes sure we have LINK caps before checking i_nlink.
>
> Signed-off-by: Luis Henriques 
> ---
> Hi!
>
> (and sorry for the delay in my reply!)
>
> So, from the discussion thread and some IRC chat with Jeff, I'm sending
> v2.  What changed?  Everything! :-)
>
> - Use i_count instead of __ceph_is_file_opened to check for open files
> - Add call to ceph_do_getattr to make sure we have LINK caps before
>   accessing i_nlink
>
> Cheers,
> --
> Luis

Acked-by: Amir Goldstein 

Thanks,
Amir.
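
For anyone wanting to replay the corner case outside xfstests, here is a
user-space sketch of the sequence from the changelog. The /cephfs paths are
assumptions about the test setup, most error handling is trimmed, and
open_by_handle_at() needs CAP_DAC_READ_SEARCH.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	struct file_handle *fh;
	int mount_id, mount_fd, fd, fd2;

	fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
	fh->handle_bytes = MAX_HANDLE_SZ;

	if (name_to_handle_at(AT_FDCWD, "/cephfs/myfile", fh, &mount_id, 0))
		perror("name_to_handle_at");

	fd = open("/cephfs/myfile", O_RDONLY);	/* keep the file open */
	unlink("/cephfs/myfile");
	sync();
	/* echo 3 > /proc/sys/vm/drop_caches (as root) goes here */

	mount_fd = open("/cephfs", O_RDONLY | O_DIRECTORY);
	fd2 = open_by_handle_at(mount_fd, fh, O_RDONLY);
	if (fd2 < 0)
		perror("open_by_handle_at");	/* -ESTALE was the bug */

	close(fd);
	free(fh);
	return 0;
}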


Re: [PATCH] init/main.c: Print all command line when boot

2020-05-18 Thread Andrew Morton
On Tue, 19 May 2020 11:29:46 +0800 王程刚  wrote:

> The maximum length pr_notice() can print may be less than the command line
> length, so multiple calls are needed to print it all.
> For example, arm64 allows a 2048-byte command line, but the printk maximum
> length is only 1024 bytes.

I can see why that might be a problem!

> --- a/init/main.c
> +++ b/init/main.c
> @@ -825,6 +825,16 @@ void __init __weak arch_call_rest_init(void)
>   rest_init();
>  }
>  
> +static void __init print_cmdline(void)
> +{
> + const char *prefix = "Kernel command line: ";

const char prefix[] = "...";

might generate slightly more efficient code.

> + int len = -strlen(prefix);

hm, tricky.  What the heck does printk() actually return to the caller?
Seems that we forgot to document this, and there are so many different
paths which a printk call can take internally that I'm not confident
that they all got it right!

> + len += pr_notice("%s%s\n", prefix, boot_command_line);
> + while (boot_command_line[len])
> + len += pr_notice("%s\n", &boot_command_line[len]);
> +}

Did you really intend to insert a \n into the output every 1024'th
character?

And what effect does this additional \n have upon the code logic? 
Doesn't this cause the printk() return value to be one greater than
expected each time it is called?

>
> ...
>
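
One way to sidestep both questions is to not depend on the printk() return
value at all and chunk explicitly, e.g. (an untested sketch; the chunk size
is arbitrary, and each chunk still gets its own line break, but the
accounting no longer hinges on what printk() returns):

static void __init print_cmdline(void)
{
	const char *s = boot_command_line;
	size_t len = strlen(s);
	size_t off, chunk = 768;	/* safely below printk's ~1k limit */

	pr_notice("Kernel command line:\n");
	for (off = 0; off < len; off += chunk)
		pr_notice(" %.*s\n",
			  (int)min_t(size_t, chunk, len - off), s + off);
}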


Re: [PATCH] interconnect: Disallow interconnect core to be built as a module

2020-05-18 Thread Viresh Kumar
On 18-05-20, 20:37, Bjorn Andersson wrote:
> On Mon 18 May 20:31 PDT 2020, Viresh Kumar wrote:
> 
> > On 18-05-20, 11:40, Bjorn Andersson wrote:
> > > It most certainly does.
> > > 
> > > With INTERCONNECT as a bool we can handle its absence with stub
> > > functions - like every other framework does. But as a tristate,
> > > every driver that calls the interconnect API needs an entry in
> > > Kconfig to ensure the client driver must be a module if the interconnect
> > > framework is.
> > 
> > This patch has been pushed to linux-next a few days back.
> > 
> 
> Thanks Viresh, I had missed that.

Not your fault, we didn't resend it but simply applied the old version
itself :)

-- 
viresh
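
The stub convention Bjorn refers to looks roughly like this in a framework
header (a sketch; of_icc_get() is the real interconnect API, the ifdef
shape is the common pattern):

#include <linux/kconfig.h>

struct device;
struct icc_path;

#if IS_ENABLED(CONFIG_INTERCONNECT)
struct icc_path *of_icc_get(struct device *dev, const char *name);
#else
static inline struct icc_path *of_icc_get(struct device *dev,
					  const char *name)
{
	return NULL;	/* clients just proceed without bandwidth requests */
}
#endif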


[PATCH 1/1] riscv: sort select statements alphanumerically

2020-05-18 Thread Zong Li
Like commit b1b3f49 ("ARM: config: sort select statements alphanumerically"),
we sort all our select statements alphanumerically, using the Perl script
from that commit.

As suggested by Andrew Morton:

  This is a pet peeve of mine.  Any time there's a long list of items
  (header file inclusions, kconfig entries, array initalisers, etc) and
  someone wants to add a new item, they *always* go and stick it at the
  end of the list.

  Guys, don't do this.  Either put the new item into a randomly-chosen
  position or, probably better, alphanumerically sort the list.

Signed-off-by: Zong Li 
---
 arch/riscv/Kconfig | 70 +++---
 1 file changed, 35 insertions(+), 35 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 74ce5c5249e9..8244b8f7e7c3 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -12,64 +12,64 @@ config 32BIT
 
 config RISCV
def_bool y
-   select OF
-   select OF_EARLY_FLATTREE
-   select OF_IRQ
select ARCH_HAS_BINFMT_FLAT
+   select ARCH_HAS_DEBUG_VIRTUAL if MMU
+   select ARCH_HAS_DEBUG_WX
+   select ARCH_HAS_GCOV_PROFILE_ALL
+   select ARCH_HAS_GIGANTIC_PAGE
+   select ARCH_HAS_MMIOWB
+   select ARCH_HAS_PTE_SPECIAL
+   select ARCH_HAS_SET_DIRECT_MAP
+   select ARCH_HAS_SET_MEMORY
+   select ARCH_HAS_STRICT_KERNEL_RWX if MMU
+   select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_FRAME_POINTERS
+   select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
select CLONE_BACKWARDS
select COMMON_CLK
+   select EDAC_SUPPORT
+   select GENERIC_ARCH_TOPOLOGY if SMP
+   select GENERIC_ATOMIC64 if !64BIT
select GENERIC_CLOCKEVENTS
+   select GENERIC_IOREMAP
+   select GENERIC_IRQ_MULTI_HANDLER
select GENERIC_IRQ_SHOW
select GENERIC_PCI_IOMAP
+   select GENERIC_PTDUMP if MMU
select GENERIC_SCHED_CLOCK
+   select GENERIC_SMP_IDLE_THREAD
select GENERIC_STRNCPY_FROM_USER if MMU
select GENERIC_STRNLEN_USER if MMU
-   select GENERIC_SMP_IDLE_THREAD
-   select GENERIC_ATOMIC64 if !64BIT
-   select GENERIC_IOREMAP
-   select GENERIC_PTDUMP if MMU
select HAVE_ARCH_AUDITSYSCALL
+   select HAVE_ARCH_KASAN if MMU && 64BIT
+   select HAVE_ARCH_KGDB
+   select HAVE_ARCH_KGDB_QXFER_PKT
+   select HAVE_ARCH_MMAP_RND_BITS if MMU
select HAVE_ARCH_SECCOMP_FILTER
+   select HAVE_ARCH_TRACEHOOK
select HAVE_ASM_MODVERSIONS
+   select HAVE_COPY_THREAD_TLS
select HAVE_DMA_CONTIGUOUS if MMU
+   select HAVE_EBPF_JIT if MMU
select HAVE_FUTEX_CMPXCHG if FUTEX
+   select HAVE_PCI
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select HAVE_SYSCALL_TRACEPOINTS
select IRQ_DOMAIN
-   select SPARSE_IRQ
-   select SYSCTL_EXCEPTION_TRACE
-   select HAVE_ARCH_TRACEHOOK
-   select HAVE_PCI
select MODULES_USE_ELF_RELA if MODULES
select MODULE_SECTIONS if MODULES
-   select THREAD_INFO_IN_TASK
+   select OF
+   select OF_EARLY_FLATTREE
+   select OF_IRQ
select PCI_DOMAINS_GENERIC if PCI
select PCI_MSI if PCI
select RISCV_TIMER
-   select GENERIC_IRQ_MULTI_HANDLER
-   select GENERIC_ARCH_TOPOLOGY if SMP
-   select ARCH_HAS_PTE_SPECIAL
-   select ARCH_HAS_MMIOWB
-   select ARCH_HAS_DEBUG_VIRTUAL if MMU
-   select HAVE_EBPF_JIT if MMU
-   select EDAC_SUPPORT
-   select ARCH_HAS_GIGANTIC_PAGE
-   select ARCH_HAS_SET_DIRECT_MAP
-   select ARCH_HAS_SET_MEMORY
-   select ARCH_HAS_STRICT_KERNEL_RWX if MMU
-   select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
select SPARSEMEM_STATIC if 32BIT
-   select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
-   select HAVE_ARCH_MMAP_RND_BITS if MMU
-   select ARCH_HAS_GCOV_PROFILE_ALL
-   select HAVE_COPY_THREAD_TLS
-   select HAVE_ARCH_KASAN if MMU && 64BIT
-   select HAVE_ARCH_KGDB
-   select HAVE_ARCH_KGDB_QXFER_PKT
-   select ARCH_HAS_DEBUG_WX
+   select SPARSE_IRQ
+   select SYSCTL_EXCEPTION_TRACE
+   select THREAD_INFO_IN_TASK
 
 config ARCH_MMAP_RND_BITS_MIN
default 18 if 64BIT
@@ -196,11 +196,11 @@ config ARCH_RV64I
bool "RV64I"
select 64BIT
select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && GCC_VERSION >= 5
-   select HAVE_FUNCTION_TRACER
-   select HAVE_FUNCTION_GRAPH_TRACER
-   select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_DYNAMIC_FTRACE if MMU
select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
+   select HAVE_FTRACE_MCOUNT_RECORD
+   select HAVE_FUNCTION_GRAPH_TRACER
+   select HAVE_FUNCTION_TRACER
select SWIOTLB if MMU
 
 endchoice
-- 
2.26.2


