date:20200115

Re: [RFC PATCH] skiboot machine check handler

2020-01-15 Thread Mahesh J Salgaonkar

On 2019-12-11 20:01:18 Wed, Nicholas Piggin wrote:
> Provide facilities to decode machine checks into human readable
> strings, with only sufficient information required to deal with
> them sanely.
> 
> The old machine check stuff was over engineered. The philosophy
> here is that OPAL should correct anything it possibly can, what
> it can't handle but the OS might be able to do something with
> (e.g., uncorrected memory error or SLB multi-hit), it passes back
> to Linux. Anything else, the OS doesn't care. It doesn't want a
> huge struct of severities and levels and originators etc that it
> can't do anything with -- just provide human readable strings
> for what happened and what was done with it.
> 
> A Linux driver for this will be able to cope with new processors.
> 
> This also uses the same facility to decode machine checks in OPAL
> boot.
> 
> The code is a bit in flux because it's sitting on top of a few
> other RFC patches and not quite complete, just wanted opinions
> about it.

opal_handle_mce() may have to be treated as special opal call. For MCE
that occurs in OPAL context, Linux making opal call will clobber
original opal call stack which hit MCE. Same is true with nested MCE in
OPAL. Should it just continue using same r1 to avoid clobbering or have
a separate stack for mce opal call ?

Thanks,
-Mahesh.

Re: [PATCH] crypto: vmx/xts - reject inputs that are too short

2020-01-15 Thread Herbert Xu

On Wed, Jan 08, 2020 at 04:06:46PM +1100, Daniel Axtens wrote:
> When the kernel XTS implementation was extended to deal with ciphertext
> stealing in commit 8083b1bf8163 ("crypto: xts - add support for ciphertext
> stealing"), a check was added to reject inputs that were too short.
> 
> However, in the vmx enablement - commit 239668419349 ("crypto: vmx/xts -
> use fallback for ciphertext stealing"), that check wasn't added to the
> vmx implementation. This disparity leads to errors like the following:
> 
> alg: skcipher: p8_aes_xts encryption unexpectedly succeeded on test vector 
> "random: len=0 klen=64"; expected_error=-22, cfg="random: inplace may_sleep 
> use_finup src_divs=[66.99%@+10, 33.1%@alignmask+1155]"
> 
> Return -EINVAL if asked to operate with a cryptlen smaller than the AES
> block size. This brings vmx in line with the generic implementation.
> 
> Reported-by: Erhard Furtner 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206049
> Fixes: 239668419349 ("crypto: vmx/xts - use fallback for ciphertext stealing")
> Cc: Ard Biesheuvel 
> Cc: sta...@vger.kernel.org # v5.4+
> Signed-off-by: Michael Ellerman 
> [dja: commit message]
> Signed-off-by: Daniel Axtens 
> ---
>  drivers/crypto/vmx/aes_xts.c | 3 +++
>  1 file changed, 3 insertions(+)

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH 7/9] powerpc/configs/skiroot: Enable security features

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 07:10, Oliver O'Halloran  wrote:
>
> On Thu, Jan 16, 2020 at 4:00 PM Daniel Axtens  wrote:
> >
> > Michael Ellerman  writes:
> >
> > > From: Joel Stanley 
> > >
> > > This turns on HARDENED_USERCOPY with HARDENED_USERCOPY_PAGESPAN, and
> > > FORTIFY_SOURCE.
> > >
> > > It also enables SECURITY_LOCKDOWN_LSM with _EARLY and
> > > LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY options enabled.
> > >
> >
> > As I said before, this will disable xmon entirely. If we want to set
> > this, we should compile out xmon. But if we want xmon in read-only mode
> > to be an option, we should pick integrity mode.
> >
> > I don't really mind, because I don't work with skiroot very
> > much. Oliver, Joel, Nayna, you all do stuff around this sort of level -
> > is this a problem for any of you?
>
> Keep it enabled and force INTEGRITY mode. There are some cases where
> xmon is the only method for debugging a crashing skiroot (hello SMC
> BMCs) so I'd rather it remained available. If there's some actual
> security benefit to disabling it entirely then someone should
> articulate that.

Ack.

Re: [PATCH 7/9] powerpc/configs/skiroot: Enable security features

2020-01-15 Thread Oliver O'Halloran

On Thu, Jan 16, 2020 at 4:00 PM Daniel Axtens  wrote:
>
> Michael Ellerman  writes:
>
> > From: Joel Stanley 
> >
> > This turns on HARDENED_USERCOPY with HARDENED_USERCOPY_PAGESPAN, and
> > FORTIFY_SOURCE.
> >
> > It also enables SECURITY_LOCKDOWN_LSM with _EARLY and
> > LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY options enabled.
> >
>
> As I said before, this will disable xmon entirely. If we want to set
> this, we should compile out xmon. But if we want xmon in read-only mode
> to be an option, we should pick integrity mode.
>
> I don't really mind, because I don't work with skiroot very
> much. Oliver, Joel, Nayna, you all do stuff around this sort of level -
> is this a problem for any of you?

Keep it enabled and force INTEGRITY mode. There are some cases where
xmon is the only method for debugging a crashing skiroot (hello SMC
BMCs) so I'd rather it remained available. If there's some actual
security benefit to disabling it entirely then someone should
articulate that.

Oliver

Re: [PATCH v4 3/9] asm-generic/tlb: Avoid potential double flush

2020-01-15 Thread Aneesh Kumar K.V


On 1/16/20 12:15 PM, Aneesh Kumar K.V wrote:

From: Peter Zijlstra 

Aneesh reported that:

tlb_flush_mmu()
  tlb_flush_mmu_tlbonly()
tlb_flush() <-- #1
  tlb_flush_mmu_free()
tlb_table_flush()
  tlb_table_invalidate()
tlb_flush_mmu_tlbonly()
  tlb_flush()   <-- #2

does two TLBIs when tlb->fullmm, because __tlb_reset_range() will not
clear tlb->end in that case.

Observe that any caller to __tlb_adjust_range() also sets at least one
of the tlb->freed_tables || tlb->cleared_p* bits, and those are
unconditionally cleared by __tlb_reset_range().

Change the condition for actually issuing TLBI to having one of those
bits set, as opposed to having tlb->end != 0.




We should possibly get this to stable too along with the first two 
patches. I am not quiet sure if this will qualify for a stable backport. 
Hence avoided adding Cc:sta...@kernel.org




Reported-by: "Aneesh Kumar K.V" 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
  include/asm-generic/tlb.h | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 9e22ac369d1d..b36b3bef5661 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -402,7 +402,12 @@ tlb_update_vma_flags(struct mmu_gather *tlb, struct 
vm_area_struct *vma) { }
  
  static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)

  {
-   if (!tlb->end)
+   /*
+* Anything calling __tlb_adjust_range() also sets at least one of
+* these bits.
+*/
+   if (!(tlb->freed_tables || tlb->cleared_ptes || tlb->cleared_pmds ||
+ tlb->cleared_puds || tlb->cleared_p4ds))
return;
  
  	tlb_flush(tlb);

[PATCH v4 9/9] asm-generic/tlb: Provide MMU_GATHER_TABLE_FREE

2020-01-15 Thread Aneesh Kumar K.V

From: Peter Zijlstra 

As described in the comment, the correct order for freeing pages is:

 1) unhook page
 2) TLB invalidate page
 3) free page

This order equally applies to page directories.

Currently there are two correct options:

 - use tlb_remove_page(), when all page directores are full pages and
   there are no futher contraints placed by things like software
   walkers (HAVE_FAST_GUP).

 - use MMU_GATHER_RCU_TABLE_FREE and tlb_remove_table() when the
   architecture does not do IPI based TLB invalidate and has
   HAVE_FAST_GUP (or software TLB fill).

This however leaves architectures that don't have page based
directories but don't need RCU in a bind. For those, provide
MMU_GATHER_TABLE_FREE, which provides the independent batching for
directories without the additional RCU freeing.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/Kconfig   |   5 ++
 arch/arm/include/asm/tlb.h |   4 --
 include/asm-generic/tlb.h  |  72 +++---
 mm/mmu_gather.c| 120 +++--
 4 files changed, 130 insertions(+), 71 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index c35668fbf4d4..98de654b79b3 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -393,8 +393,12 @@ config HAVE_ARCH_JUMP_LABEL
 config HAVE_ARCH_JUMP_LABEL_RELATIVE
bool
 
+config MMU_GATHER_TABLE_FREE
+   bool
+
 config MMU_GATHER_RCU_TABLE_FREE
bool
+   select MMU_GATHER_TABLE_FREE
 
 config MMU_GATHER_PAGE_SIZE
bool
@@ -404,6 +408,7 @@ config MMU_GATHER_NO_RANGE
 
 config MMU_GATHER_NO_GATHER
bool
+   depends on MMU_GATHER_TABLE_FREE
 
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
bool
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 46a21cee3442..4d4e7b6aabff 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -37,10 +37,6 @@ static inline void __tlb_remove_table(void *_table)
 
 #include 
 
-#ifndef CONFIG_MMU_GATHER_RCU_TABLE_FREE
-#define tlb_remove_table(tlb, entry) tlb_remove_page(tlb, entry)
-#endif
-
 static inline void
 __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
 {
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index ca0fe75b5355..f391f6b500b4 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -56,6 +56,15 @@
  *Defaults to flushing at tlb_end_vma() to reset the range; helps when
  *there's large holes between the VMAs.
  *
+ *  - tlb_remove_table()
+ *
+ *tlb_remove_table() is the basic primitive to free page-table directories
+ *(__p*_free_tlb()).  In it's most primitive form it is an alias for
+ *tlb_remove_page() below, for when page directories are pages and have no
+ *additional constraints.
+ *
+ *See also MMU_GATHER_TABLE_FREE and MMU_GATHER_RCU_TABLE_FREE.
+ *
  *  - tlb_remove_page() / __tlb_remove_page()
  *  - tlb_remove_page_size() / __tlb_remove_page_size()
  *
@@ -129,17 +138,24 @@
  *  This might be useful if your architecture has size specific TLB
  *  invalidation instructions.
  *
- *  MMU_GATHER_RCU_TABLE_FREE
+ *  MMU_GATHER_TABLE_FREE
  *
  *  This provides tlb_remove_table(), to be used instead of tlb_remove_page()
- *  for page directores (__p*_free_tlb()). This provides separate freeing of
- *  the page-table pages themselves in a semi-RCU fashion (see comment below).
- *  Useful if your architecture doesn't use IPIs for remote TLB invalidates
- *  and therefore doesn't naturally serialize with software page-table walkers.
+ *  for page directores (__p*_free_tlb()).
+ *
+ *  Useful if your architecture has non-page page directories.
  *
  *  When used, an architecture is expected to provide __tlb_remove_table()
  *  which does the actual freeing of these pages.
  *
+ *  MMU_GATHER_RCU_TABLE_FREE
+ *
+ *  Like MMU_GATHER_TABLE_FREE, and adds semi-RCU semantics to the free (see
+ *  comment below).
+ *
+ *  Useful if your architecture doesn't use IPIs for remote TLB invalidates
+ *  and therefore doesn't naturally serialize with software page-table walkers.
+ *
  *  MMU_GATHER_NO_RANGE
  *
  *  Use this if your architecture lacks an efficient flush_tlb_range().
@@ -155,37 +171,12 @@
  *  various ptep_get_and_clear() functions.
  */
 
-#ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
-/*
- * Semi RCU freeing of the page directories.
- *
- * This is needed by some architectures to implement software pagetable 
walkers.
- *
- * gup_fast() and other software pagetable walkers do a lockless page-table
- * walk and therefore needs some synchronization with the freeing of the page
- * directories. The chosen means to accomplish that is by disabling IRQs over
- * the walk.
- *
- * Architectures that use IPIs to flush TLBs will then automagically DTRT,
- * since we unlink the page, flush TLBs, free the page. Since the disabling of
- * IRQs delays the completion of the TLB flush we can never observe an already
- * freed page.
- *
- *

[PATCH v4 8/9] asm-generic/tlb: Rename HAVE_MMU_GATHER_NO_GATHER

2020-01-15 Thread Aneesh Kumar K.V

From: Peter Zijlstra 

Towards a more consistent naming scheme.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/Kconfig  |  2 +-
 arch/s390/Kconfig |  2 +-
 include/asm-generic/tlb.h | 14 --
 mm/mmu_gather.c   | 10 +-
 4 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index e8548211b6a9..c35668fbf4d4 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -402,7 +402,7 @@ config MMU_GATHER_PAGE_SIZE
 config MMU_GATHER_NO_RANGE
bool
 
-config HAVE_MMU_GATHER_NO_GATHER
+config MMU_GATHER_NO_GATHER
bool
 
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index e2cde82a1a3c..de39c2e92435 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -163,7 +163,7 @@ config S390
select HAVE_PERF_USER_STACK_DUMP
select HAVE_MEMBLOCK_NODE_MAP
select HAVE_MEMBLOCK_PHYS_MAP
-   select HAVE_MMU_GATHER_NO_GATHER
+   select MMU_GATHER_NO_GATHER
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NOP_MCOUNT
select HAVE_OPROFILE
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 53befa5acb27..ca0fe75b5355 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -143,6 +143,16 @@
  *  MMU_GATHER_NO_RANGE
  *
  *  Use this if your architecture lacks an efficient flush_tlb_range().
+ *
+ *  MMU_GATHER_NO_GATHER
+ *
+ *  If the option is set the mmu_gather will not track individual pages for
+ *  delayed page free anymore. A platform that enables the option needs to
+ *  provide its own implementation of the __tlb_remove_page_size() function to
+ *  free pages.
+ *
+ *  This is useful if your architecture already flushes TLB entries in the
+ *  various ptep_get_and_clear() functions.
  */
 
 #ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
@@ -202,7 +212,7 @@ extern void tlb_remove_table(struct mmu_gather *tlb, void 
*table);
 #endif /* CONFIG_MMU_GATHER_RCU_TABLE_FREE */
 
 
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+#ifndef CONFIG_MMU_GATHER_NO_GATHER
 /*
  * If we can't allocate a page to make a big batch of page pointers
  * to work on, then just handle a few from the on-stack structure.
@@ -277,7 +287,7 @@ struct mmu_gather {
 
unsigned intbatch_count;
 
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+#ifndef CONFIG_MMU_GATHER_NO_GATHER
struct mmu_gather_batch *active;
struct mmu_gather_batch local;
struct page *__pages[MMU_GATHER_BUNDLE];
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 297c70307367..a28c74328085 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+#ifndef CONFIG_MMU_GATHER_NO_GATHER
 
 static bool tlb_next_batch(struct mmu_gather *tlb)
 {
@@ -89,7 +89,7 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct 
page *page, int page_
return false;
 }
 
-#endif /* HAVE_MMU_GATHER_NO_GATHER */
+#endif /* MMU_GATHER_NO_GATHER */
 
 #ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
 
@@ -180,7 +180,7 @@ static void tlb_flush_mmu_free(struct mmu_gather *tlb)
 #ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
tlb_table_flush(tlb);
 #endif
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+#ifndef CONFIG_MMU_GATHER_NO_GATHER
tlb_batch_pages_flush(tlb);
 #endif
 }
@@ -211,7 +211,7 @@ void tlb_gather_mmu(struct mmu_gather *tlb, struct 
mm_struct *mm,
/* Is it from 0 to ~0? */
tlb->fullmm = !(start | (end+1));
 
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+#ifndef CONFIG_MMU_GATHER_NO_GATHER
tlb->need_flush_all = 0;
tlb->local.next = NULL;
tlb->local.nr   = 0;
@@ -271,7 +271,7 @@ void tlb_finish_mmu(struct mmu_gather *tlb,
 
tlb_flush_mmu(tlb);
 
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+#ifndef CONFIG_MMU_GATHER_NO_GATHER
tlb_batch_list_free(tlb);
 #endif
dec_tlb_flush_pending(tlb->mm);
-- 
2.24.1

[PATCH v4 7/9] asm-generic/tlb: Rename HAVE_MMU_GATHER_PAGE_SIZE

2020-01-15 Thread Aneesh Kumar K.V

From: Peter Zijlstra 

Towards a more consistent naming scheme.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/Kconfig  | 2 +-
 arch/powerpc/Kconfig  | 2 +-
 include/asm-generic/tlb.h | 9 ++---
 mm/mmu_gather.c   | 4 ++--
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 501d565690b5..e8548211b6a9 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -396,7 +396,7 @@ config HAVE_ARCH_JUMP_LABEL_RELATIVE
 config MMU_GATHER_RCU_TABLE_FREE
bool
 
-config HAVE_MMU_GATHER_PAGE_SIZE
+config MMU_GATHER_PAGE_SIZE
bool
 
 config MMU_GATHER_NO_RANGE
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 955759234776..cefacb9c8f48 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -223,7 +223,7 @@ config PPC
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select MMU_GATHER_RCU_TABLE_FREE
-   select HAVE_MMU_GATHER_PAGE_SIZE
+   select MMU_GATHER_PAGE_SIZE
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE if PPC_BOOK3S_64 && 
CPU_LITTLE_ENDIAN
select HAVE_SYSCALL_TRACEPOINTS
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 04a1b8f08eea..53befa5acb27 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -121,11 +121,14 @@
  *
  * Additionally there are a few opt-in features:
  *
- *  HAVE_MMU_GATHER_PAGE_SIZE
+ *  MMU_GATHER_PAGE_SIZE
  *
  *  This ensures we call tlb_flush() every time tlb_change_page_size() actually
  *  changes the size and provides mmu_gather::page_size to tlb_flush().
  *
+ *  This might be useful if your architecture has size specific TLB
+ *  invalidation instructions.
+ *
  *  MMU_GATHER_RCU_TABLE_FREE
  *
  *  This provides tlb_remove_table(), to be used instead of tlb_remove_page()
@@ -279,7 +282,7 @@ struct mmu_gather {
struct mmu_gather_batch local;
struct page *__pages[MMU_GATHER_BUNDLE];
 
-#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
+#ifdef CONFIG_MMU_GATHER_PAGE_SIZE
unsigned int page_size;
 #endif
 #endif
@@ -435,7 +438,7 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, 
struct page *page)
 static inline void tlb_change_page_size(struct mmu_gather *tlb,
 unsigned int page_size)
 {
-#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
+#ifdef CONFIG_MMU_GATHER_PAGE_SIZE
if (tlb->page_size && tlb->page_size != page_size) {
if (!tlb->fullmm && !tlb->need_flush_all)
tlb_flush_mmu(tlb);
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 86bb2176e173..297c70307367 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -69,7 +69,7 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct 
page *page, int page_
 
VM_BUG_ON(!tlb->end);
 
-#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
+#ifdef CONFIG_MMU_GATHER_PAGE_SIZE
VM_WARN_ON(tlb->page_size != page_size);
 #endif
 
@@ -223,7 +223,7 @@ void tlb_gather_mmu(struct mmu_gather *tlb, struct 
mm_struct *mm,
 #ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
tlb->batch = NULL;
 #endif
-#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
+#ifdef CONFIG_MMU_GATHER_PAGE_SIZE
tlb->page_size = 0;
 #endif
 
-- 
2.24.1

[PATCH v4 6/9] asm-generic/tlb: Rename HAVE_RCU_TABLE_FREE

2020-01-15 Thread Aneesh Kumar K.V

From: Peter Zijlstra 

Towards a more consistent naming scheme.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/Kconfig|  2 +-
 arch/arm/Kconfig|  2 +-
 arch/arm/include/asm/tlb.h  |  2 +-
 arch/arm64/Kconfig  |  2 +-
 arch/powerpc/Kconfig|  2 +-
 arch/s390/Kconfig   |  2 +-
 arch/sparc/Kconfig  |  2 +-
 arch/sparc/include/asm/tlb_64.h |  2 +-
 arch/x86/Kconfig|  2 +-
 arch/x86/include/asm/tlb.h  |  4 ++--
 include/asm-generic/tlb.h   | 10 +-
 mm/gup.c|  2 +-
 mm/mmu_gather.c |  8 
 13 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 5e907a954532..501d565690b5 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -393,7 +393,7 @@ config HAVE_ARCH_JUMP_LABEL
 config HAVE_ARCH_JUMP_LABEL_RELATIVE
bool
 
-config HAVE_RCU_TABLE_FREE
+config MMU_GATHER_RCU_TABLE_FREE
bool
 
 config HAVE_MMU_GATHER_PAGE_SIZE
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 96dab76da3b3..36445579243c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -102,7 +102,7 @@ config ARM
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
-   select HAVE_RCU_TABLE_FREE if SMP && ARM_LPAE
+   select MMU_GATHER_RCU_TABLE_FREE if SMP && ARM_LPAE
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 669474add486..46a21cee3442 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -37,7 +37,7 @@ static inline void __tlb_remove_table(void *_table)
 
 #include 
 
-#ifndef CONFIG_HAVE_RCU_TABLE_FREE
+#ifndef CONFIG_MMU_GATHER_RCU_TABLE_FREE
 #define tlb_remove_table(tlb, entry) tlb_remove_page(tlb, entry)
 #endif
 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e688dfad0b72..a434f7c2438f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -162,7 +162,7 @@ config ARM64
select HAVE_PERF_USER_STACK_DUMP
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_FUNCTION_ARG_ACCESS_API
-   select HAVE_RCU_TABLE_FREE
+   select MMU_GATHER_RCU_TABLE_FREE
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
select HAVE_SYSCALL_TRACEPOINTS
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index f9970f87612e..955759234776 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -222,7 +222,7 @@ config PPC
select HAVE_HARDLOCKUP_DETECTOR_PERFif PERF_EVENTS && 
HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
-   select HAVE_RCU_TABLE_FREE
+   select MMU_GATHER_RCU_TABLE_FREE
select HAVE_MMU_GATHER_PAGE_SIZE
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE if PPC_BOOK3S_64 && 
CPU_LITTLE_ENDIAN
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index bc88841d335d..e2cde82a1a3c 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -169,7 +169,7 @@ config S390
select HAVE_OPROFILE
select HAVE_PCI
select HAVE_PERF_EVENTS
-   select HAVE_RCU_TABLE_FREE
+   select MMU_GATHER_RCU_TABLE_FREE
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE
select HAVE_RSEQ
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 18e9fb6fcf1b..c703eb6b7461 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -64,7 +64,7 @@ config SPARC64
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_KRETPROBES
select HAVE_KPROBES
-   select HAVE_RCU_TABLE_FREE if SMP
+   select MMU_GATHER_RCU_TABLE_FREE
select HAVE_MEMBLOCK_NODE_MAP
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_DYNAMIC_FTRACE
diff --git a/arch/sparc/include/asm/tlb_64.h b/arch/sparc/include/asm/tlb_64.h
index 8cb8f3833239..6820d357581c 100644
--- a/arch/sparc/include/asm/tlb_64.h
+++ b/arch/sparc/include/asm/tlb_64.h
@@ -33,7 +33,7 @@ void flush_tlb_pending(void);
  * and therefore we don't need a TLBI when freeing page-table pages.
  */
 
-#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+#ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
 #define tlb_needs_table_invalidate()   (false)
 #endif
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5e8949953660..f809bed408dd 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -200,7 +200,7 @@ config X86
select HAVE_PCI
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
-   select HAVE_RCU_TABLE_FREE  if PARAVIRT
+   select MMU_GATHER_RCU_TABLE_FREEif PARAVIRT
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE if X86_64 && 
(UNWINDER_FRAME_POINTER ||

[PATCH v4 5/9] asm-generic/tlb: Add missing CONFIG symbol

2020-01-15 Thread Aneesh Kumar K.V

From: Peter Zijlstra 

Without this the symbol will not actually end up in .config files.

Fixes: a30e32bd79e9 ("asm-generic/tlb: Provide generic tlb_flush() based on 
flush_tlb_mm()")
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/Kconfig | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 208aad121630..5e907a954532 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -399,6 +399,9 @@ config HAVE_RCU_TABLE_FREE
 config HAVE_MMU_GATHER_PAGE_SIZE
bool
 
+config MMU_GATHER_NO_RANGE
+   bool
+
 config HAVE_MMU_GATHER_NO_GATHER
bool
 
-- 
2.24.1

[PATCH v4 4/9] asm-gemeric/tlb: Remove stray function declarations

2020-01-15 Thread Aneesh Kumar K.V

From: Peter Zijlstra 

We removed the actual functions a while ago.

Fixes: 1808d65b55e4 ("asm-generic/tlb: Remove arch_tlb*_mmu()")
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
 include/asm-generic/tlb.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index b36b3bef5661..1a4cea5f95df 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -285,11 +285,7 @@ struct mmu_gather {
 #endif
 };
 
-void arch_tlb_gather_mmu(struct mmu_gather *tlb,
-   struct mm_struct *mm, unsigned long start, unsigned long end);
 void tlb_flush_mmu(struct mmu_gather *tlb);
-void arch_tlb_finish_mmu(struct mmu_gather *tlb,
-unsigned long start, unsigned long end, bool force);
 
 static inline void __tlb_adjust_range(struct mmu_gather *tlb,
  unsigned long address,
-- 
2.24.1

[PATCH v4 3/9] asm-generic/tlb: Avoid potential double flush

2020-01-15 Thread Aneesh Kumar K.V

From: Peter Zijlstra 

Aneesh reported that:

tlb_flush_mmu()
  tlb_flush_mmu_tlbonly()
tlb_flush() <-- #1
  tlb_flush_mmu_free()
tlb_table_flush()
  tlb_table_invalidate()
tlb_flush_mmu_tlbonly()
  tlb_flush()   <-- #2

does two TLBIs when tlb->fullmm, because __tlb_reset_range() will not
clear tlb->end in that case.

Observe that any caller to __tlb_adjust_range() also sets at least one
of the tlb->freed_tables || tlb->cleared_p* bits, and those are
unconditionally cleared by __tlb_reset_range().

Change the condition for actually issuing TLBI to having one of those
bits set, as opposed to having tlb->end != 0.

Reported-by: "Aneesh Kumar K.V" 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
 include/asm-generic/tlb.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 9e22ac369d1d..b36b3bef5661 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -402,7 +402,12 @@ tlb_update_vma_flags(struct mmu_gather *tlb, struct 
vm_area_struct *vma) { }
 
 static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
 {
-   if (!tlb->end)
+   /*
+* Anything calling __tlb_adjust_range() also sets at least one of
+* these bits.
+*/
+   if (!(tlb->freed_tables || tlb->cleared_ptes || tlb->cleared_pmds ||
+ tlb->cleared_puds || tlb->cleared_p4ds))
return;
 
tlb_flush(tlb);
-- 
2.24.1

[PATCH v4 2/9] mm/mmu_gather: Invalidate TLB correctly on batch allocation failure and flush

2020-01-15 Thread Aneesh Kumar K.V

From: Peter Zijlstra 

Architectures for which we have hardware walkers of Linux page table should
flush TLB on mmu gather batch allocation failures and batch flush. Some
architectures like POWER supports multiple translation modes (hash and radix)
and in the case of POWER only radix translation mode needs the above TLBI.
This is because for hash translation mode kernel wants to avoid this extra
flush since there are no hardware walkers of linux page table. With radix
translation, the hardware also walks linux page table and with that, kernel
needs to make sure to TLB invalidate page walk cache before page table pages are
freed.

More details in
commit: d86564a2f085 ("mm/tlb, x86/mm: Support invalidating TLB caches for 
RCU_TABLE_FREE")

The changes to sparc are to make sure we keep the old behavior since we are now
removing HAVE_RCU_TABLE_NO_INVALIDATE. The default value for
tlb_needs_table_invalidate is to always force an invalidate and sparc can avoid
the table invalidate. Hence we define tlb_needs_table_invalidate to false for
sparc architecture.

Cc: 
Fixes: a46cc7a90fd8 ("powerpc/mm/radix: Improve TLB/PWC flushes")
Signed-off-by: Peter Zijlstra (Intel) 
---
 arch/Kconfig|  3 ---
 arch/powerpc/Kconfig|  1 -
 arch/powerpc/include/asm/tlb.h  | 11 +++
 arch/sparc/Kconfig  |  1 -
 arch/sparc/include/asm/tlb_64.h |  9 +
 include/asm-generic/tlb.h   | 22 +++---
 mm/mmu_gather.c | 16 
 7 files changed, 43 insertions(+), 20 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 48b5e103bdb0..208aad121630 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -396,9 +396,6 @@ config HAVE_ARCH_JUMP_LABEL_RELATIVE
 config HAVE_RCU_TABLE_FREE
bool
 
-config HAVE_RCU_TABLE_NO_INVALIDATE
-   bool
-
 config HAVE_MMU_GATHER_PAGE_SIZE
bool
 
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 04240205f38c..f9970f87612e 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -223,7 +223,6 @@ config PPC
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select HAVE_RCU_TABLE_FREE
-   select HAVE_RCU_TABLE_NO_INVALIDATE if HAVE_RCU_TABLE_FREE
select HAVE_MMU_GATHER_PAGE_SIZE
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE if PPC_BOOK3S_64 && 
CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
index b2c0be93929d..7f3a8b902325 100644
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -26,6 +26,17 @@
 
 #define tlb_flush tlb_flush
 extern void tlb_flush(struct mmu_gather *tlb);
+/*
+ * book3s:
+ * Hash does not use the linux page-tables, so we can avoid
+ * the TLB invalidate for page-table freeing, Radix otoh does use the
+ * page-tables and needs the TLBI.
+ *
+ * nohash:
+ * We still do TLB invalidate in the __pte_free_tlb routine before we
+ * add the page table pages to mmu gather table batch.
+ */
+#define tlb_needs_table_invalidate()   radix_enabled()
 
 /* Get the generic bits... */
 #include 
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index eb24cb1afc11..18e9fb6fcf1b 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -65,7 +65,6 @@ config SPARC64
select HAVE_KRETPROBES
select HAVE_KPROBES
select HAVE_RCU_TABLE_FREE if SMP
-   select HAVE_RCU_TABLE_NO_INVALIDATE if HAVE_RCU_TABLE_FREE
select HAVE_MEMBLOCK_NODE_MAP
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_DYNAMIC_FTRACE
diff --git a/arch/sparc/include/asm/tlb_64.h b/arch/sparc/include/asm/tlb_64.h
index a2f3fa61ee36..8cb8f3833239 100644
--- a/arch/sparc/include/asm/tlb_64.h
+++ b/arch/sparc/include/asm/tlb_64.h
@@ -28,6 +28,15 @@ void flush_tlb_pending(void);
 #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
 #define tlb_flush(tlb) flush_tlb_pending()
 
+/*
+ * SPARC64's hardware TLB fill does not use the Linux page-tables
+ * and therefore we don't need a TLBI when freeing page-table pages.
+ */
+
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+#define tlb_needs_table_invalidate()   (false)
+#endif
+
 #include 
 
 #endif /* _SPARC64_TLB_H */
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 2b10036fefd0..9e22ac369d1d 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -137,13 +137,6 @@
  *  When used, an architecture is expected to provide __tlb_remove_table()
  *  which does the actual freeing of these pages.
  *
- *  HAVE_RCU_TABLE_NO_INVALIDATE
- *
- *  This makes HAVE_RCU_TABLE_FREE avoid calling tlb_flush_mmu_tlbonly() before
- *  freeing the page-table pages. This can be avoided if you use
- *  HAVE_RCU_TABLE_FREE and your architecture does _NOT_ use the Linux
- *  page-tables natively.
- *
  *  MMU_GATHER_NO_RANGE
  *
  *  Use this if your architecture lacks an efficient flush_tlb_range().
@@ -189,8 +182,23 @@ struct

[PATCH v4 0/9] Fixup page directory freeing

2020-01-15 Thread Aneesh Kumar K.V

This is a repost of patch series from Peter with the arch specific changes 
except ppc64 dropped.
ppc64 changes are added here because we are redoing the patch series on top of 
ppc64 changes. This makes it
easy to backport these changes. Only the first 2 patches need to be backported 
to stable. 

The thing is, on anything SMP, freeing page directories should observe the
exact same order as normal page freeing:

 1) unhook page/directory
 2) TLB invalidate
 3) free page/directory

Without this, any concurrent page-table walk could end up with a Use-after-Free.
This is esp. trivial for anything that has software page-table walkers
(HAVE_FAST_GUP / software TLB fill) or the hardware caches partial page-walks
(ie. caches page directories).

Even on UP this might give issues since mmu_gather is preemptible these days.
An interrupt or preempted task accessing user pages might stumble into the free
page if the hardware caches page directories.

This patch series fixup ppc64 and add generic MMU_GATHER changes to support the 
conversion of other architectures.
I haven't added patches w.r.t other architecture because they are yet to be 
acked.

Changes from V3:
* Added Cc:stable for first two patches
* Explained why we have sparc related changes in patch 2

Aneesh Kumar K.V (1):
  powerpc/mmu_gather: Enable RCU_TABLE_FREE even for !SMP case

Peter Zijlstra (8):
  mm/mmu_gather: Invalidate TLB correctly on batch allocation failure
and flush
  asm-generic/tlb: Avoid potential double flush
  asm-gemeric/tlb: Remove stray function declarations
  asm-generic/tlb: Add missing CONFIG symbol
  asm-generic/tlb: Rename HAVE_RCU_TABLE_FREE
  asm-generic/tlb: Rename HAVE_MMU_GATHER_PAGE_SIZE
  asm-generic/tlb: Rename HAVE_MMU_GATHER_NO_GATHER
  asm-generic/tlb: Provide MMU_GATHER_TABLE_FREE

 arch/Kconfig |  13 +-
 arch/arm/Kconfig |   2 +-
 arch/arm/include/asm/tlb.h   |   4 -
 arch/arm64/Kconfig   |   2 +-
 arch/powerpc/Kconfig |   5 +-
 arch/powerpc/include/asm/book3s/32/pgalloc.h |   8 --
 arch/powerpc/include/asm/book3s/64/pgalloc.h |   2 -
 arch/powerpc/include/asm/nohash/pgalloc.h|   8 --
 arch/powerpc/include/asm/tlb.h   |  11 ++
 arch/powerpc/mm/book3s64/pgtable.c   |   7 -
 arch/s390/Kconfig|   4 +-
 arch/sparc/Kconfig   |   3 +-
 arch/sparc/include/asm/tlb_64.h  |   9 ++
 arch/x86/Kconfig |   2 +-
 arch/x86/include/asm/tlb.h   |   4 +-
 include/asm-generic/tlb.h| 120 ++---
 mm/gup.c |   2 +-
 mm/mmu_gather.c  | 134 +--
 18 files changed, 207 insertions(+), 133 deletions(-)

-- 
2.24.1

[PATCH v4 1/9] powerpc/mmu_gather: Enable RCU_TABLE_FREE even for !SMP case

2020-01-15 Thread Aneesh Kumar K.V

A follow up patch is going to make sure we correctly invalidate page walk cache
before we free page table pages. In order to keep things simple enable
RCU_TABLE_FREE even for !SMP so that we don't have to fixup the !SMP case
differently in the followup patch

!SMP case is right now broken for radix translation w.r.t page walk cache flush.
We can get interrupted in between page table free and that would imply we
have page walk cache entries pointing to tables which got freed already.

Cc: 
Acked-by: Peter Zijlstra (Intel) 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/Kconfig | 2 +-
 arch/powerpc/include/asm/book3s/32/pgalloc.h | 8 
 arch/powerpc/include/asm/book3s/64/pgalloc.h | 2 --
 arch/powerpc/include/asm/nohash/pgalloc.h| 8 
 arch/powerpc/mm/book3s64/pgtable.c   | 7 ---
 5 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1ec34e16ed65..04240205f38c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -222,7 +222,7 @@ config PPC
select HAVE_HARDLOCKUP_DETECTOR_PERFif PERF_EVENTS && 
HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
-   select HAVE_RCU_TABLE_FREE  if SMP
+   select HAVE_RCU_TABLE_FREE
select HAVE_RCU_TABLE_NO_INVALIDATE if HAVE_RCU_TABLE_FREE
select HAVE_MMU_GATHER_PAGE_SIZE
select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h 
b/arch/powerpc/include/asm/book3s/32/pgalloc.h
index 998317702630..dc5c039eb28e 100644
--- a/arch/powerpc/include/asm/book3s/32/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h
@@ -49,7 +49,6 @@ static inline void pgtable_free(void *table, unsigned 
index_size)
 
 #define get_hugepd_cache_index(x)  (x)
 
-#ifdef CONFIG_SMP
 static inline void pgtable_free_tlb(struct mmu_gather *tlb,
void *table, int shift)
 {
@@ -66,13 +65,6 @@ static inline void __tlb_remove_table(void *_table)
 
pgtable_free(table, shift);
 }
-#else
-static inline void pgtable_free_tlb(struct mmu_gather *tlb,
-   void *table, int shift)
-{
-   pgtable_free(table, shift);
-}
-#endif
 
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
  unsigned long address)
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index f6968c811026..a41e91bd0580 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -19,9 +19,7 @@ extern struct vmemmap_backing *vmemmap_list;
 extern pmd_t *pmd_fragment_alloc(struct mm_struct *, unsigned long);
 extern void pmd_fragment_free(unsigned long *);
 extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
-#ifdef CONFIG_SMP
 extern void __tlb_remove_table(void *_table);
-#endif
 void pte_frag_destroy(void *pte_frag);
 
 static inline pgd_t *radix__pgd_alloc(struct mm_struct *mm)
diff --git a/arch/powerpc/include/asm/nohash/pgalloc.h 
b/arch/powerpc/include/asm/nohash/pgalloc.h
index 332b13b4ecdb..29c43665a753 100644
--- a/arch/powerpc/include/asm/nohash/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/pgalloc.h
@@ -46,7 +46,6 @@ static inline void pgtable_free(void *table, int shift)
 
 #define get_hugepd_cache_index(x)  (x)
 
-#ifdef CONFIG_SMP
 static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int 
shift)
 {
unsigned long pgf = (unsigned long)table;
@@ -64,13 +63,6 @@ static inline void __tlb_remove_table(void *_table)
pgtable_free(table, shift);
 }
 
-#else
-static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int 
shift)
-{
-   pgtable_free(table, shift);
-}
-#endif
-
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
  unsigned long address)
 {
diff --git a/arch/powerpc/mm/book3s64/pgtable.c 
b/arch/powerpc/mm/book3s64/pgtable.c
index 75483b40fcb1..2bf7e1b4fd82 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -378,7 +378,6 @@ static inline void pgtable_free(void *table, int index)
}
 }
 
-#ifdef CONFIG_SMP
 void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int index)
 {
unsigned long pgf = (unsigned long)table;
@@ -395,12 +394,6 @@ void __tlb_remove_table(void *_table)
 
return pgtable_free(table, index);
 }
-#else
-void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int index)
-{
-   return pgtable_free(table, index);
-}
-#endif
 
 #ifdef CONFIG_PROC_FS
 atomic_long_t direct_pages_count[MMU_PAGE_COUNT];
-- 
2.24.1

Re: [linux-next/mainline][bisected 3acac06][ppc] Oops when unloading mpt3sas driver

2020-01-15 Thread Michael Ellerman

Abdul Haleem  writes:
> On Thu, 2020-01-09 at 06:22 -0800, Christoph Hellwig wrote:
>> On Thu, Jan 09, 2020 at 02:27:25PM +0530, Abdul Haleem wrote:
>> > + CC Christoph Hellwig
>> 
>> The only thing this commit changed for the dma coherent case (which
>> ppc64 uses) is that we now look up the page to free by the DMA address
>> instead of the virtual address passed in.  Which suggests this call
>> stack passes in a broken dma address.  I suspect we somehow managed
>> to disable the ppc iommu bypass mode after allocating memory, which
>> would cause symptoms like this, and thus the commit is just exposing
>> a pre-existing problem.
>
> Trace with printk added for page->addr, will this help ?
>
> mpt3sas_cm0: removing handle(0x000f), sas_addr(0x500304801f080d3d)
> mpt3sas_cm0: enclosure logical id(0x500304801f080d3f), slot(12)
> mpt3sas_cm0: enclosure level(0x), connector name( )
> mpt3sas_cm0: mpt3sas_transport_port_remove: removed:
> sas_addr(0x500304801f080d3f)
> mpt3sas_cm0: expander_remove: handle(0x0009),
> sas_addr(0x500304801f080d3f)
> mpt3sas_cm0: sending diag reset !!
> mpt3sas_cm0: diag reset: SUCCESS 
> page->vaddr = 0xc03f2d20
> page->vaddr = 0xc03f2ef0
> page->vaddr = 0xc03f3843
> page->vaddr = 0xc03f3d7d
> page->vaddr = 0xc03f7576
> BUG: Unable to handle kernel data access on write at 0xc04a00017c34

We also want the dma address, Abdul did another run resulting in:

  mpt3sas_cm0: mpt3sas_transport_port_remove: removed: 
sas_addr(0x500304801f080d3f)
  mpt3sas_cm0: expander_remove: handle(0x0009), sas_addr(0x500304801f080d3f)
  mpt3sas_cm0: sending diag reset !!
  mpt3sas_cm0: diag reset: SUCCESS
  page->vaddr = 0xc03fc588
  page->dma = 0x83fc588
  page->vaddr = 0xc03fc590
  page->dma = 0x83fc590
  page->vaddr = 0xc03fc598
  page->dma = 0x83fc598
  page->vaddr = 0xc03fc599
  page->dma = 0x83fc599
  page->vaddr = 0xc03fc7c7
  page->dma = 0x5f0
  BUG: Unable to handle kernel data access on write at 0xc04a00017c34
  Faulting instruction address: 0xc0300780
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA PowerNV
  Modules linked in: iptable_mangle xt_MASQUERADE iptable_nat nf_nat 
xt_conntrack nf_conntrack nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp 
tun bridge stp llc iptable_filter btrfs blake2b_generic xor zstd_decompress 
zstd_compress lzo_compress raid6_pq vmx_crypto gf128mul powernv_rng rng_core 
kvm_hv nfsd kvm binfmt_misc ip_tables x_tables xfs libcrc32c qla2xxx ixgbe 
nvme_fc nvme_fabrics mdio nvme_core i40e mpt3sas(-) raid_class 
scsi_transport_sas autofs4
  CPU: 149 PID: 17518 Comm: rmmod Not tainted 
5.5.0-rc5-next-20200108-autotest-2-g36e1367-dirty #2
  NIP:  c0300780 LR: c01aabe4 CTR: c004a030
  REGS: c078ffab75d0 TRAP: 0380   Not tainted  
(5.5.0-rc5-next-20200108-autotest-2-g36e1367-dirty)
  MSR:  90009033   CR: 24002424  XER: 2000
  CFAR: c01aabe0 IRQMASK: 0
  GPR00: c004a0c8 c078ffab7860 c1321a00 c04a00017c00
  GPR04:  c03fc7c7 003e00017c00 
  GPR08:  c13cd000 c04a00017c34 0230
  GPR12: c004a030 c07ffef35000  
  GPR16:   0100140c0180 10020098
  GPR20: 10020050 10020038 05f0 c0d60870
  GPR24: c0d60890 c0d608a8  c12a9818
  GPR28: 05f0 c03fc7c7 0001 c03fdaa4c8a8
  NIP [c0300780] __free_pages+0x10/0x50
  LR [c01aabe4] dma_direct_free_pages+0x54/0x90
  Call Trace:
  [c078ffab7880] [c004a0c8] dma_iommu_free_coherent+0x98/0xd0
  [c078ffab78d0] [c01a9c10] dma_free_attrs+0x110/0x120
  [c078ffab7920] [c0317750] dma_pool_destroy+0x1d0/0x270
  [c078ffab79d0] [c0080dc51e98] _base_release_memory_pools+0x1d8/0x4b0 
[mpt3sas]
  [c078ffab7a60] [c0080dc5b9f0] mpt3sas_base_detach+0x40/0x150 [mpt3sas]
  [c078ffab7ad0] [c0080dc6c92c] scsih_remove+0x24c/0x3e0 [mpt3sas]
  [c078ffab7b90] [c06199a4] pci_device_remove+0x64/0x110
  [c078ffab7bd0] [c06cf1a4] 
device_release_driver_internal+0x154/0x260
  [c078ffab7c10] [c06cf37c] driver_detach+0x8c/0x140
  [c078ffab7c50] [c06cd488] bus_remove_driver+0x78/0x100
  [c078ffab7c80] [c06d0090] driver_unregister+0x40/0x90
  [c078ffab7cf0] [c06190c8] pci_unregister_driver+0x38/0x110
  [c078ffab7d40] [c0080dc7f188] _mpt3sas_exit+0x50/0x4118 [mpt3sas]
  [c078ffab7da0] [c01dda18] sys_delete_module+0x1a8/0x2a0
  [c078ffab7e20] [c000b9d0] system_call+0x5c/0x68
  Instruction dump:
  88830051 2fa4 41de0008 4bffe7fc 7d234b78

[PATCH v2 3/3] kasan: initialise array in kasan_memcmp test

2020-01-15 Thread Daniel Axtens

memcmp may bail out before accessing all the memory if the buffers
contain differing bytes. kasan_memcmp calls memcmp with a stack array.
Stack variables are not necessarily initialised (in the absence of a
compiler plugin, at least). Sometimes this causes the memcpy to bail
early thus fail to trigger kasan.

Make sure the array initialised to zero in the code.

No other test is dependent on the contents of an array on the stack.

Cc: Andrey Ryabinin 
Cc: Alexander Potapenko 
Cc: Dmitry Vyukov 
Signed-off-by: Daniel Axtens 
---
 lib/test_kasan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index a130d75b9385..519b0f259e97 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -619,7 +619,7 @@ static noinline void __init kasan_memcmp(void)
 {
char *ptr;
size_t size = 24;
-   int arr[9];
+   int arr[9] = {};
 
pr_info("out-of-bounds in memcmp\n");
ptr = kmalloc(size, GFP_KERNEL | __GFP_ZERO);
-- 
2.20.1

[PATCH v2 2/3] string.h: fix incompatibility between FORTIFY_SOURCE and KASAN

2020-01-15 Thread Daniel Axtens

The memcmp KASAN self-test fails on a kernel with both KASAN and
FORTIFY_SOURCE.

When FORTIFY_SOURCE is on, a number of functions are replaced with
fortified versions, which attempt to check the sizes of the operands.
However, these functions often directly invoke __builtin_foo() once they
have performed the fortify check. Using __builtins may bypass KASAN
checks if the compiler decides to inline it's own implementation as
sequence of instructions, rather than emit a function call that goes out
to a KASAN-instrumented implementation.

Why is only memcmp affected?


Of the string and string-like functions that kasan_test tests, only memcmp
is replaced by an inline sequence of instructions in my testing on x86 with
gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu2).

I believe this is due to compiler heuristics. For example, if I annotate
kmalloc calls with the alloc_size annotation (and disable some fortify
compile-time checking!), the compiler will replace every memset except the
one in kmalloc_uaf_memset with inline instructions. (I have some WIP
patches to add this annotation.)

Does this affect other functions in string.h?
=

Yes. Anything that uses __builtin_* rather than __real_* could be
affected. This looks like:

 - strncpy
 - strcat
 - strlen
 - strlcpy maybe, under some circumstances?
 - strncat under some circumstances
 - memset
 - memcpy
 - memmove
 - memcmp (as noted)
 - memchr
 - strcpy

Whether a function call is emitted always depends on the compiler. Most
bugs should get caught by FORTIFY_SOURCE, but the missed memcmp test shows
that this is not always the case.

Isn't FORTIFY_SOURCE disabled with KASAN?
-

The string headers on all arches supporting KASAN disable fortify with
kasan, but only when address sanitisation is _also_ disabled. For example
from x86:

 #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
 /*
  * For files that are not instrumented (e.g. mm/slub.c) we
  * should use not instrumented version of mem* functions.
  */
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
 #define memmove(dst, src, len) __memmove(dst, src, len)
 #define memset(s, c, n) __memset(s, c, n)

 #ifndef __NO_FORTIFY
 #define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
 #endif

 #endif

This comes from commit 6974f0c4555e ("include/linux/string.h: add the
option of fortified string.h functions"), and doesn't work when KASAN is
enabled and the file is supposed to be sanitised - as with test_kasan.c

I'm pretty sure this is not wrong, but not as expansive it should be:

 * we shouldn't use __builtin_memcpy etc in files where we don't have
   instrumentation - it could devolve into a function call to memcpy,
   which will be instrumented. Rather, we should use __memcpy which
   by convention is not instrumented.

 * we also shouldn't be using __builtin_memcpy when we have a KASAN
   instrumented file, because it could be replaced with inline asm
   that will not be instrumented.

What is correct behaviour?
==

Firstly, there is some overlap between fortification and KASAN: both
provide some level of _runtime_ checking. Only fortify provides
compile-time checking.

KASAN and fortify can pick up different things at runtime:

 - Some fortify functions, notably the string functions, could easily be
   modified to consider sub-object sizes (e.g. members within a struct),
   and I have some WIP patches to do this. KASAN cannot detect these
   because it cannot insert poision between members of a struct.

 - KASAN can detect many over-reads/over-writes when the sizes of both
   operands are unknown, which fortify cannot.

So there are a couple of options:

 1) Flip the test: disable fortify in santised files and enable it in
unsanitised files. This at least stops us missing KASAN checking, but
we lose the fortify checking.

 2) Make the fortify code always call out to real versions. Do this only
for KASAN, for fear of losing the inlining opportunities we get from
__builtin_*.

(We can't use kasan_check_{read,write}: because the fortify functions are
_extern inline_, you can't include _static_ inline functions without a
compiler warning. kasan_check_{read,write} are static inline so we can't
use them even when they would otherwise be suitable.)

Take approach 2 and call out to real versions when KASAN is enabled.

Use __underlying_foo to distinguish from __real_foo: __real_foo always
refers to the kernel's implementation of foo, __underlying_foo could be
either the kernel implementation or the __builtin_foo implementation.

This is sometimes enough to make the memcmp test succeed with
FORTIFY_SOURCE enabled. It is at least enough to get the function call
into the module. One more fix is needed to make it reliable: see the next
patch.

Cc: Daniel Micay 
Cc: Andrey Ryabinin 
Cc: Alexander Potapenko 
Cc: Dmitry Vyukov 
Fixes:

[PATCH v2 1/3] kasan: stop tests being eliminated as dead code with FORTIFY_SOURCE

2020-01-15 Thread Daniel Axtens

3 KASAN self-tests fail on a kernel with both KASAN and FORTIFY_SOURCE:
memchr, memcmp and strlen.

When FORTIFY_SOURCE is on, a number of functions are replaced with
fortified versions, which attempt to check the sizes of the operands.
However, these functions often directly invoke __builtin_foo() once they
have performed the fortify check. The compiler can detect that the results
of these functions are not used, and knows that they have no other side
effects, and so can eliminate them as dead code.

Why are only memchr, memcmp and strlen affected?


Of string and string-like functions, kasan_test tests:

 * strchr  ->  not affected, no fortified version
 * strrchr ->  likewise
 * strcmp  ->  likewise
 * strncmp ->  likewise

 * strnlen ->  not affected, the fortify source implementation calls the
   underlying strnlen implementation which is instrumented, not
   a builtin

 * strlen  ->  affected, the fortify souce implementation calls a __builtin
   version which the compiler can determine is dead.

 * memchr  ->  likewise
 * memcmp  ->  likewise

 * memset ->   not affected, the compiler knows that memset writes to its
   first argument and therefore is not dead.

Why does this not affect the functions normally?


In string.h, these functions are not marked as __pure, so the compiler
cannot know that they do not have side effects. If relevant functions are
marked as __pure in string.h, we see the following warnings and the
functions are elided:

lib/test_kasan.c: In function ‘kasan_memchr’:
lib/test_kasan.c:606:2: warning: statement with no effect [-Wunused-value]
  memchr(ptr, '1', size + 1);
  ^~
lib/test_kasan.c: In function ‘kasan_memcmp’:
lib/test_kasan.c:622:2: warning: statement with no effect [-Wunused-value]
  memcmp(ptr, arr, size+1);
  ^~~~
lib/test_kasan.c: In function ‘kasan_strings’:
lib/test_kasan.c:645:2: warning: statement with no effect [-Wunused-value]
  strchr(ptr, '1');
  ^~~~
...

This annotation would make sense to add and could be added at any point, so
the behaviour of test_kasan.c should change.

The fix
===

Make all the functions that are pure write their results to a global,
which makes them live. The strlen and memchr tests now pass.

The memcmp test still fails to trigger, which is addressed in the next
patch.

Cc: Daniel Micay 
Cc: Andrey Ryabinin 
Cc: Alexander Potapenko 
Cc: Dmitry Vyukov 
Fixes: 0c96350a2d2f ("lib/test_kasan.c: add tests for several string/memory API 
functions")
Reviewed-by: Dmitry Vyukov 
Signed-off-by: Daniel Axtens 

---

v2: rename variables to have kasan_ prefixes
---
 lib/test_kasan.c | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 328d33beae36..a130d75b9385 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -23,6 +23,14 @@
 
 #include 
 
+/*
+ * We assign some test results to these globals to make sure the tests
+ * are not eliminated as dead code.
+ */
+
+int kasan_int_result;
+void *kasan_ptr_result;
+
 /*
  * Note: test functions are marked noinline so that their names appear in
  * reports.
@@ -603,7 +611,7 @@ static noinline void __init kasan_memchr(void)
if (!ptr)
return;
 
-   memchr(ptr, '1', size + 1);
+   kasan_ptr_result = memchr(ptr, '1', size + 1);
kfree(ptr);
 }
 
@@ -618,8 +626,7 @@ static noinline void __init kasan_memcmp(void)
if (!ptr)
return;
 
-   memset(arr, 0, sizeof(arr));
-   memcmp(ptr, arr, size+1);
+   kasan_int_result = memcmp(ptr, arr, size + 1);
kfree(ptr);
 }
 
@@ -642,22 +649,22 @@ static noinline void __init kasan_strings(void)
 * will likely point to zeroed byte.
 */
ptr += 16;
-   strchr(ptr, '1');
+   kasan_ptr_result = strchr(ptr, '1');
 
pr_info("use-after-free in strrchr\n");
-   strrchr(ptr, '1');
+   kasan_ptr_result = strrchr(ptr, '1');
 
pr_info("use-after-free in strcmp\n");
-   strcmp(ptr, "2");
+   kasan_int_result = strcmp(ptr, "2");
 
pr_info("use-after-free in strncmp\n");
-   strncmp(ptr, "2", 1);
+   kasan_int_result = strncmp(ptr, "2", 1);
 
pr_info("use-after-free in strlen\n");
-   strlen(ptr);
+   kasan_int_result = strlen(ptr);
 
pr_info("use-after-free in strnlen\n");
-   strnlen(ptr, 1);
+   kasan_int_result = strnlen(ptr, 1);
 }
 
 static noinline void __init kasan_bitops(void)
@@ -724,11 +731,12 @@ static noinline void __init kasan_bitops(void)
__test_and_change_bit(BITS_PER_LONG + BITS_PER_BYTE, bits);
 
pr_info("out-of-bounds in test_bit\n");
-   (void)test_bit(BITS_PER_LONG + BITS_PER_BYTE, bits);
+   kasan_int_result = test_bit(BITS_PER_LONG + BITS_PER_BYTE, bits);

[PATCH v2 0/3] Fix some incompatibilites between KASAN and FORTIFY_SOURCE

2020-01-15 Thread Daniel Axtens

3 KASAN self-tests fail on a kernel with both KASAN and FORTIFY_SOURCE:
memchr, memcmp and strlen. I have observed this on x86 and powerpc.

When FORTIFY_SOURCE is on, a number of functions are replaced with
fortified versions, which attempt to check the sizes of the
operands. However, these functions often directly invoke __builtin_foo()
once they have performed the fortify check.

This breaks things in 2 ways:

 - the three function calls are technically dead code, and can be
   eliminated. When __builtin_ versions are used, the compiler can detect
   this.

 - Using __builtins may bypass KASAN checks if the compiler decides to
   inline it's own implementation as sequence of instructions, rather than
   emit a function call that goes out to a KASAN-instrumented
   implementation.

The patches address each reason in turn. Finally, test_memcmp used a
stack array without explicit initialisation, which can sometimes break
too, so fix that up.

v2: - some cleanups, don't mess with arch code as I missed some wrinkles.
- add stack array init (patch 3)

Daniel Axtens (3):
  kasan: stop tests being eliminated as dead code with FORTIFY_SOURCE
  string.h: fix incompatibility between FORTIFY_SOURCE and KASAN
  kasan: initialise array in kasan_memcmp test

 include/linux/string.h | 60 +-
 lib/test_kasan.c   | 32 +-
 2 files changed, 68 insertions(+), 24 deletions(-)

-- 
2.20.1

z constraint in powerpc inline assembly ?

2020-01-15 Thread Christophe Leroy


Hi Segher,

I'm trying to see if we could enhance TCP checksum calculations by 
splitting inline assembly blocks to give GCC the opportunity to mix it 
with other stuff, but I'm getting difficulties with the carry.


As far as I can read in the documentation, the z constraint represents 
'‘XER[CA]’ carry bit (part of the XER register)'


I've tried the following, but I get errors. Can you help ?

unsigned long cksum(unsigned long a, unsigned long b, unsigned long c)
{
unsigned long sum;
unsigned long carry;

asm("addc %0, %2, %3" : "=r"(sum), "=z"(carry) : "r"(a), "r"(b));
asm("adde %0, %0, %2" : "+r"(sum), "+z"(carry) : "r"(c));
asm("addze %0, %0" : "+r"(sum) : "z"(carry));

return sum;
}



csum.c: In function 'cksum':
csum.c:6:2: error: inconsistent operand constraints in an 'asm'
  asm("addc %0, %2, %3" : "=r"(sum), "=z"(carry) : "r"(a), "r"(b));
  ^
csum.c:7:2: error: inconsistent operand constraints in an 'asm'
  asm("adde %0, %0, %2" : "+r"(sum), "+z"(carry) : "r"(c));
  ^
csum.c:8:2: error: inconsistent operand constraints in an 'asm'
  asm("addze %0, %0" : "+r"(sum) : "z"(carry));
  ^

Thanks
Christophe

Re: [PATCH 14/18] powerpc/kprobes: Support kprobes on prefixed instructions

2020-01-15 Thread Michael Ellerman

Balamuruhan S  writes:
> On Tue, Nov 26, 2019 at 04:21:37PM +1100, Jordan Niethe wrote:
...
>> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
>> index 7303fe3856cc..aa15b3480385 100644
>> --- a/arch/powerpc/kernel/kprobes.c
>> +++ b/arch/powerpc/kernel/kprobes.c
>> @@ -104,17 +104,30 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, 
>> unsigned int offset)
>>  
>>  int arch_prepare_kprobe(struct kprobe *p)
>>  {
>> +int len;
>>  int ret = 0;
>> +struct kprobe *prev;
>>  kprobe_opcode_t insn = *p->addr;
>> +kprobe_opcode_t prfx = *(p->addr - 1);
>>  
>> +preempt_disable();
>>  if ((unsigned long)p->addr & 0x03) {
>>  printk("Attempt to register kprobe at an unaligned address\n");
>>  ret = -EINVAL;
>>  } else if (IS_MTMSRD(insn) || IS_RFID(insn) || IS_RFI(insn)) {
>>  printk("Cannot register a kprobe on rfi/rfid or mtmsr[d]\n");
>>  ret = -EINVAL;
>> +} else if (IS_PREFIX(prfx)) {
>> +printk("Cannot register a kprobe on the second word of prefixed 
>> instruction\n");
>
> Let's have line with in 80 columns length.

We don't split printk strings across lines. So this is fine.

cheers

Re: [PATCH] powerpc/xive: discard ESB load value when interrupt is invalid

2020-01-15 Thread Michael Ellerman

Greg Kurz  writes:
> On Tue, 14 Jan 2020 08:44:54 +0100
> Cédric Le Goater  wrote:
>> On 1/14/20 2:14 AM, Michael Ellerman wrote:
>> > Cédric Le Goater  writes:
>> >> On 1/13/20 2:01 PM, Cédric Le Goater wrote:
>> >>> From: Frederic Barrat 
>> >>>
>> >>> A load on an ESB page returning all 1's means that the underlying
>> >>> device has invalidated the access to the PQ state of the interrupt
>> >>> through mmio. It may happen, for example when querying a PHB interrupt
>> >>> while the PHB is in an error state.
>> >>>
>> >>> In that case, we should consider the interrupt to be invalid when
>> >>> checking its state in the irq_get_irqchip_state() handler.
>> >>
>> >>
>> >> and we need also these tags :
>> >>
>> >> Fixes: da15c03b047d ("powerpc/xive: Implement get_irqchip_state method 
>> >> for XIVE to fix shutdown race")
>> >> Cc: sta...@vger.kernel.org # v5.3+
>> > 
>> > I added those, although it's v5.4+, as the offending commit was first
>> > included in v5.4-rc1.
>> 
>> Ah yes. I mistook the merge tag of the branch used for the PR (v5.3-rc2)
>> 
>
> You might want to use 'git tag --contains':
>
> [greg@bahia kernel-linus]$ git tag --contains da15c03b047d
> for-linus
> kvm-5.4-2
> next-20191118
> next-20191126
> tags/kvm-5.4-1
> tags/kvm-5.4-2
> v5.4
> v5.4-rc1

Or:

  $ git describe --match "v[0-9]*" --contains da15c03b047d
  v5.4-rc1~99^2~134^2~17

cheers

Re: [PATCH V6 3/4] ASoC: pcm_dmaengine: Extract snd_dmaengine_pcm_refine_runtime_hwparams

2020-01-15 Thread John Stultz

On Wed, Jan 8, 2020 at 8:58 PM John Stultz  wrote:
> On Thu, Sep 26, 2019 at 6:50 PM Shengjiu Wang  wrote:
> >
> > When set the runtime hardware parameters, we may need to query
> > the capability of DMA to complete the parameters.
> >
> > This patch is to Extract this operation from
> > dmaengine_pcm_set_runtime_hwparams function to a separate function
> > snd_dmaengine_pcm_refine_runtime_hwparams, that other components
> > which need this feature can call this function.
> >
> > Signed-off-by: Shengjiu Wang 
> > Reviewed-by: Nicolin Chen 
>
> As a heads up, this patch seems to be causing a regression on the HiKey board.
>
> On boot up I'm seeing:
> [   17.721424] hi6210_i2s f7118000.i2s: ASoC: can't open component
> f7118000.i2s: -6
>
> And HDMI audio isn't working. With this patch reverted, audio works again.
>
>
> > diff --git a/sound/core/pcm_dmaengine.c b/sound/core/pcm_dmaengine.c
> > index 89a05926ac73..5749a8a49784 100644
> > --- a/sound/core/pcm_dmaengine.c
> > +++ b/sound/core/pcm_dmaengine.c
> > @@ -369,4 +369,87 @@ int snd_dmaengine_pcm_close_release_chan(struct 
> > snd_pcm_substream *substream)
> ...
> > +   ret = dma_get_slave_caps(chan, _caps);
> > +   if (ret == 0) {
> > +   if (dma_caps.cmd_pause && dma_caps.cmd_resume)
> > +   hw->info |= SNDRV_PCM_INFO_PAUSE | 
> > SNDRV_PCM_INFO_RESUME;
> > +   if (dma_caps.residue_granularity <= 
> > DMA_RESIDUE_GRANULARITY_SEGMENT)
> > +   hw->info |= SNDRV_PCM_INFO_BATCH;
> > +
> > +   if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
> > +   addr_widths = dma_caps.dst_addr_widths;
> > +   else
> > +   addr_widths = dma_caps.src_addr_widths;
> > +   }
>
> It seems a failing ret from dma_get_slave_caps() here is being returned...
>
> > +
> > +   /*
> > +* If SND_DMAENGINE_PCM_DAI_FLAG_PACK is set keep
> > +* hw.formats set to 0, meaning no restrictions are in place.
> > +* In this case it's the responsibility of the DAI driver to
> > +* provide the supported format information.
> > +*/
> > +   if (!(dma_data->flags & SND_DMAENGINE_PCM_DAI_FLAG_PACK))
> > +   /*
> > +* Prepare formats mask for valid/allowed sample types. If 
> > the
> > +* dma does not have support for the given physical word 
> > size,
> > +* it needs to be masked out so user space can not use the
> > +* format which produces corrupted audio.
> > +* In case the dma driver does not implement the slave_caps 
> > the
> > +* default assumption is that it supports 1, 2 and 4 bytes
> > +* widths.
> > +*/
> > +   for (i = SNDRV_PCM_FORMAT_FIRST; i <= 
> > SNDRV_PCM_FORMAT_LAST; i++) {
> > +   int bits = snd_pcm_format_physical_width(i);
> > +
> > +   /*
> > +* Enable only samples with DMA supported physical
> > +* widths
> > +*/
> > +   switch (bits) {
> > +   case 8:
> > +   case 16:
> > +   case 24:
> > +   case 32:
> > +   case 64:
> > +   if (addr_widths & (1 << (bits / 8)))
> > +   hw->formats |= 
> > pcm_format_to_bits(i);
> > +   break;
> > +   default:
> > +   /* Unsupported types */
> > +   break;
> > +   }
> > +   }
> > +
> > +   return ret;
>
> ... down here.
>
> Where as in the old code...
>
> > diff --git a/sound/soc/soc-generic-dmaengine-pcm.c 
> > b/sound/soc/soc-generic-dmaengine-pcm.c
> > index 748f5f641002..b9f147eaf7c4 100644
> > --- a/sound/soc/soc-generic-dmaengine-pcm.c
> > +++ b/sound/soc/soc-generic-dmaengine-pcm.c
>
> > @@ -145,56 +140,12 @@ static int dmaengine_pcm_set_runtime_hwparams(struct 
> > snd_pcm_substream *substrea
> > if (pcm->flags & SND_DMAENGINE_PCM_FLAG_NO_RESIDUE)
> > hw.info |= SNDRV_PCM_INFO_BATCH;
> >
> > -   ret = dma_get_slave_caps(chan, _caps);
> > -   if (ret == 0) {
> > -   if (dma_caps.cmd_pause && dma_caps.cmd_resume)
> > -   hw.info |= SNDRV_PCM_INFO_PAUSE | 
> > SNDRV_PCM_INFO_RESUME;
> > -   if (dma_caps.residue_granularity <= 
> > DMA_RESIDUE_GRANULARITY_SEGMENT)
> > -   hw.info |= SNDRV_PCM_INFO_BATCH;
> > -
> > -   if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
> > -   addr_widths = dma_caps.dst_addr_widths;
> > -   else
> > -   addr_widths = dma_caps.src_addr_widths;
> > -   }
>
> ...the ret from

Re: [PATCH v2] Fix display of Maximum Memory

2020-01-15 Thread Michael Ellerman

Michael Bringmann  writes:
> Correct overflow problem in calculation+display of Maximum Memory
> value to syscfg where 32bits is insufficient.
>
> Signed-off-by: Michael Bringmann 
> ---
>  arch/powerpc/platforms/pseries/lparcfg.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
> b/arch/powerpc/platforms/pseries/lparcfg.c
> index e33e8bc..f00411c 100644
> --- a/arch/powerpc/platforms/pseries/lparcfg.c
> +++ b/arch/powerpc/platforms/pseries/lparcfg.c
> @@ -433,12 +433,12 @@ static void parse_em_data(struct seq_file *m)
>  
>  static void maxmem_data(struct seq_file *m)
>  {
> - unsigned long maxmem = 0;
> + u64 maxmem = 0;

This is 64-bit only code, so u64 == unsigned long.

> - maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
> - maxmem += hugetlb_total_pages() * PAGE_SIZE;
> + maxmem += (u64)drmem_info->n_lmbs * drmem_info->lmb_size;

The only problem AFAICS is n_lmbs is int and lmb_size is u32, so this
multiplication will overflow.

> + maxmem += (u64)hugetlb_total_pages() * PAGE_SIZE;

hugetlb_total_pages() already returns unsigned long.

> - seq_printf(m, "MaxMem=%ld\n", maxmem);
> + seq_printf(m, "MaxMem=%llu\n", maxmem);
>  }

This should be sufficient?

diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
b/arch/powerpc/platforms/pseries/lparcfg.c
index e33e8bc4b69b..38c306551f76 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -435,10 +435,10 @@ static void maxmem_data(struct seq_file *m)
 {
unsigned long maxmem = 0;
 
-   maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
+   maxmem += (unsigned long)drmem_info->n_lmbs * drmem_info->lmb_size;
maxmem += hugetlb_total_pages() * PAGE_SIZE;
 
-   seq_printf(m, "MaxMem=%ld\n", maxmem);
+   seq_printf(m, "MaxMem=%lu\n", maxmem);
 }
 
 static int pseries_lparcfg_data(struct seq_file *m, void *v)


cheers

Re: [PATCH 1/2] kasan: stop tests being eliminated as dead code with FORTIFY_SOURCE

2020-01-15 Thread Daniel Axtens

>> >> +/*
>> >> + * We assign some test results to these globals to make sure the tests
>> >> + * are not eliminated as dead code.
>> >> + */
>> >> +
>> >> +int int_result;
>> >> +void *ptr_result;
>> >
>> > These are globals, but are not static and don't have kasan_ prefix.
>> > But I guess this does not matter for modules?
>> > Otherwise:
>> >
>> > Reviewed-by: Dmitry Vyukov 
>> >
>>
>> I think if you make them static, GCC will see they aren't used and will
>> eliminate everything still ?
>
> static volatile? :)

Yeah so these are module globals. They'd be accessible from any other
files you linked into the module (currently there are no such
files). They're not visible outside the module because they're not
EXPORTed.

Making them static does lead to them getting eliminated, and 'static
volatile' seems both gross and like something checkpatch would complain
about. I'll leave them as they are but stick a kasan_ prefix on them
just for the additional tidiness.

Regards,
Daniel

Re: [PATCH 7/9] powerpc/configs/skiroot: Enable security features

2020-01-15 Thread Daniel Axtens

Michael Ellerman  writes:

> From: Joel Stanley 
>
> This turns on HARDENED_USERCOPY with HARDENED_USERCOPY_PAGESPAN, and
> FORTIFY_SOURCE.
>
> It also enables SECURITY_LOCKDOWN_LSM with _EARLY and
> LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY options enabled.
>

As I said before, this will disable xmon entirely. If we want to set
this, we should compile out xmon. But if we want xmon in read-only mode
to be an option, we should pick integrity mode.

I don't really mind, because I don't work with skiroot very
much. Oliver, Joel, Nayna, you all do stuff around this sort of level -
is this a problem for any of you?

Regards,
Daniel

> MODULE_SIG is selected by lockdown, so it is still enabled.
>
> Signed-off-by: Joel Stanley 
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/configs/skiroot_defconfig | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/configs/skiroot_defconfig 
> b/arch/powerpc/configs/skiroot_defconfig
> index 24a210fe0049..bd661a9a9410 100644
> --- a/arch/powerpc/configs/skiroot_defconfig
> +++ b/arch/powerpc/configs/skiroot_defconfig
> @@ -49,7 +49,6 @@ CONFIG_JUMP_LABEL=y
>  CONFIG_STRICT_KERNEL_RWX=y
>  CONFIG_MODULES=y
>  CONFIG_MODULE_UNLOAD=y
> -CONFIG_MODULE_SIG=y
>  CONFIG_MODULE_SIG_FORCE=y
>  CONFIG_MODULE_SIG_SHA512=y
>  CONFIG_PARTITION_ADVANCED=y
> @@ -272,6 +271,16 @@ CONFIG_NLS_ASCII=y
>  CONFIG_NLS_ISO8859_1=y
>  CONFIG_NLS_UTF8=y
>  CONFIG_ENCRYPTED_KEYS=y
> +CONFIG_SECURITY=y
> +CONFIG_HARDENED_USERCOPY=y
> +# CONFIG_HARDENED_USERCOPY_FALLBACK is not set
> +CONFIG_HARDENED_USERCOPY_PAGESPAN=y
> +CONFIG_FORTIFY_SOURCE=y
> +CONFIG_SECURITY_LOCKDOWN_LSM=y
> +CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
> +CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY=y
> +# CONFIG_INTEGRITY is not set
> +CONFIG_LSM="yama,loadpin,safesetid,integrity"
>  # CONFIG_CRYPTO_HW is not set
>  CONFIG_CRC16=y
>  CONFIG_CRC_ITU_T=y
> -- 
> 2.21.1

Re: [PATCH 2/2] string.h: fix incompatibility between FORTIFY_SOURCE and KASAN

2020-01-15 Thread Daniel Axtens

Dmitry Vyukov  writes:

> On Wed, Jan 15, 2020 at 7:37 AM Daniel Axtens  wrote:
>>
>> The memcmp KASAN self-test fails on a kernel with both KASAN and
>> FORTIFY_SOURCE.
>>
>> When FORTIFY_SOURCE is on, a number of functions are replaced with
>> fortified versions, which attempt to check the sizes of the operands.
>> However, these functions often directly invoke __builtin_foo() once they
>> have performed the fortify check. Using __builtins may bypass KASAN
>> checks if the compiler decides to inline it's own implementation as
>> sequence of instructions, rather than emit a function call that goes out
>> to a KASAN-instrumented implementation.
>>
>> Why is only memcmp affected?
>> 
>>
>> Of the string and string-like functions that kasan_test tests, only memcmp
>> is replaced by an inline sequence of instructions in my testing on x86 with
>> gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu2).
>>
>> I believe this is due to compiler heuristics. For example, if I annotate
>> kmalloc calls with the alloc_size annotation (and disable some fortify
>> compile-time checking!), the compiler will replace every memset except the
>> one in kmalloc_uaf_memset with inline instructions. (I have some WIP
>> patches to add this annotation.)
>>
>> Does this affect other functions in string.h?
>> =
>>
>> Yes. Anything that uses __builtin_* rather than __real_* could be
>> affected. This looks like:
>>
>>  - strncpy
>>  - strcat
>>  - strlen
>>  - strlcpy maybe, under some circumstances?
>>  - strncat under some circumstances
>>  - memset
>>  - memcpy
>>  - memmove
>>  - memcmp (as noted)
>>  - memchr
>>  - strcpy
>>
>> Whether a function call is emitted always depends on the compiler. Most
>> bugs should get caught by FORTIFY_SOURCE, but the missed memcmp test shows
>> that this is not always the case.
>>
>> Isn't FORTIFY_SOURCE disabled with KASAN?
>> -
>>
>> The string headers on all arches supporting KASAN disable fortify with
>> kasan, but only when address sanitisation is _also_ disabled. For example
>> from x86:
>>
>>  #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
>>  /*
>>   * For files that are not instrumented (e.g. mm/slub.c) we
>>   * should use not instrumented version of mem* functions.
>>   */
>>  #define memcpy(dst, src, len) __memcpy(dst, src, len)
>>  #define memmove(dst, src, len) __memmove(dst, src, len)
>>  #define memset(s, c, n) __memset(s, c, n)
>>
>>  #ifndef __NO_FORTIFY
>>  #define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
>>  #endif
>>
>>  #endif
>>
>> This comes from commit 6974f0c4555e ("include/linux/string.h: add the
>> option of fortified string.h functions"), and doesn't work when KASAN is
>> enabled and the file is supposed to be sanitised - as with test_kasan.c
>
> Hi Daniel,
>
> Thanks for addressing this. And special kudos for description detail level! :)
>
> Phew, this layering of checking tools is a bit messy...
>
>> I'm pretty sure this is backwards: we shouldn't be using __builtin_memcpy
>> when we have a KASAN instrumented file, but we can use __builtin_* - and in
>> many cases all fortification - in files where we don't have
>> instrumentation.
>
> I think if we use __builtin_* in a non-instrumented file, the compiler
> can emit a call to normal mem* function which will be intercepted by
> kasan and we will get instrumentation in a file which should not be
> instrumented. Moreover this behavior will depend on optimization level
> and compiler internals.
> But as far as I see this does not affect any of the following and the
> code change.
>

mmm OK - you are right, when I consider this and your other point...

>>  #if !defined(__NO_FORTIFY) && defined(__OPTIMIZE__) && 
>> defined(CONFIG_FORTIFY_SOURCE)
>> +
>> +#ifdef CONFIG_KASAN
>> +extern void *__underlying_memchr(const void *p, int c, __kernel_size_t 
>> size) __RENAME(memchr);
>
>
> arch headers do:
>
> #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
> #define memcpy(dst, src, len) __memcpy(dst, src, len)
> ...
>
> to disable instrumentation. Does this still work with this change?
> Previously they disabled fortify. What happens now? Will define of
> memcpy to __memcpy also affect __RENAME(memcpy), so that
> __underlying_memcpy will be an alias to __memcpy?

This is a good question. It's a really intricate set of interactions!!

Between these two things, I think I'm going to just drop the removal of
architecture changes, which means that fortify will continue to be
disabled for files that disable KASAN sanitisation. It's just too
complicated to reason through and satisfy myself that we're not going to
get weird bugs, and the payoff is really small.

>> +extern int __underlying_memcmp(const void *p, const void *q, 
>> __kernel_size_t size) __RENAME(memcmp);
>
> All of these macros are leaking from the header file. Tomorrow we will
> discover __underlying_memcpy uses

Re: [PATCH v5 4/5] powerpc/numa: Early request for home node associativity

2020-01-15 Thread Michael Ellerman

Srikar Dronamraju  writes:
> Currently the kernel detects if its running on a shared lpar platform
> and requests home node associativity before the scheduler sched_domains
> are setup. However between the time NUMA setup is initialized and the
> request for home node associativity, workqueue initializes its per node
> cpumask. The per node workqueue possible cpumask may turn invalid
> after home node associativity resulting in weird situations like
> workqueue possible cpumask being a subset of workqueue online cpumask.
>
> This can be fixed by requesting home node associativity earlier just
> before NUMA setup. However at the NUMA setup time, kernel may not be in
> a position to detect if its running on a shared lpar platform. So
> request for home node associativity and if the request fails, fallback
> on the device tree property.
>
> Signed-off-by: Srikar Dronamraju 
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Cc: Nathan Lynch 
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Abdul Haleem 
> Cc: Satheesh Rajendran 
> Reported-by: Abdul Haleem 
> Reviewed-by: Nathan Lynch 
> ---
> Changelog (v2->v3):
> - Handled comments from Nathan Lynch
>   * Use first thread of the core for cpu-to-node map.
>   * get hardware-id in numa_setup_cpu
>
> Changelog (v1->v2):
> - Handled comments from Nathan Lynch
>   * Dont depend on pacas to be setup for the hwid
>
>
>  arch/powerpc/mm/numa.c | 45 +-
>  1 file changed, 40 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 63ec0c3c817f..f837a0e725bc 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -461,13 +461,27 @@ static int of_drconf_to_nid_single(struct drmem_lmb 
> *lmb)
>   return nid;
>  }
>  
> +static int vphn_get_nid(long hwid)
> +{
> + __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
> + long rc;
> +
> + rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);

This breaks the build for some defconfigs.

eg. ppc64_book3e_allmodconfig:

  arch/powerpc/mm/numa.c: In function ‘vphn_get_nid’:
  arch/powerpc/mm/numa.c:469:7: error: implicit declaration of function 
‘hcall_vphn’ [-Werror=implicit-function-declaration]
469 |  rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);
|   ^~

It needs to be inside #ifdef CONFIG_PPC_SPLPAR.

> + if (rc == H_SUCCESS)
> + return associativity_to_nid(associativity);
> +
> + return NUMA_NO_NODE;
> +}
> +
>  /*
>   * Figure out to which domain a cpu belongs and stick it there.
> + * cpu_to_phys_id is only valid between smp_setup_cpu_maps() and
> + * smp_setup_pacas(). If called outside this window, set get_hwid to true.
>   * Return the id of the domain used.
>   */
> -static int numa_setup_cpu(unsigned long lcpu)
> +static int numa_setup_cpu(unsigned long lcpu, bool get_hwid)

I really dislike this bool.

> @@ -485,6 +499,27 @@ static int numa_setup_cpu(unsigned long lcpu)
>   return nid;
>   }
>  
> + /*
> +  * On a shared lpar, device tree will not have node associativity.
> +  * At this time lppaca, or its __old_status field may not be
> +  * updated. Hence kernel cannot detect if its on a shared lpar. So
> +  * request an explicit associativity irrespective of whether the
> +  * lpar is shared or dedicated. Use the device tree property as a
> +  * fallback.
> +  */
> + if (firmware_has_feature(FW_FEATURE_VPHN)) {
> + long hwid;
> +
> + if (get_hwid)
> + hwid = get_hard_smp_processor_id(lcpu);
> + else
> + hwid = cpu_to_phys_id[lcpu];

This should move inside vphn_get_nid(), and just do:

if (cpu_to_phys_id)
hwid = cpu_to_phys_id[lcpu];
else
hwid = get_hard_smp_processor_id(lcpu);


> + nid = vphn_get_nid(hwid);
> + }
> +
> + if (nid != NUMA_NO_NODE)
> + goto out_present;
> +
>   cpu = of_get_cpu_node(lcpu, NULL);


cheers

Re: [PATCH] evh_bytechan: fix out of bounds accesses

2020-01-15 Thread Timur Tabi


On 1/15/20 2:01 PM, Scott Wood wrote:

FWIW I'd rather see the original patch,
that keeps the raw asm hcall stuff as simple wrappers in one place.


I can live with that.

Re: [PATCH 5/9] powerpc/configs/skiroot: Drop default n CONFIG_CRYPTO_ECHAINIV

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>
> It's default n so we don't need to disable it.
>
> Signed-off-by: Michael Ellerman 

Acked-by: Joel Stanley

Re: [PATCH 4/9] powerpc/configs/skiroot: Drop HID_LOGITECH

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>
> Commit bdd08fff4915 ("HID: logitech: Add depends on LEDS_CLASS to
> Logitech Kconfig entry") made HID_LOGITECH depend on LEDS_CLASS which
> we do not enable, meaning we are not actually enabling those drivers
> any more.
>
> The Kconfig help text suggests USB HID compliant Logictech devices
> will continue to work without HID_LOGITECH, so just drop it.
>
> Signed-off-by: Michael Ellerman 

Acked-by: Joel Stanley

Re: [PATCH 3/9] powerpc/configs: Drop NET_VENDOR_HP which moved to staging

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>
> The HP network driver moved to staging in commit 52340b82cf1a ("hp100:
> Move 100BaseVG AnyLAN driver to staging") meaning we don't need to
> disable it any more in our defconfigs.
>
> Signed-off-by: Michael Ellerman 

Acked-by: Joel Stanley

Re: [PATCH 1/9] powerpc/configs: Drop CONFIG_QLGE which moved to staging

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>
> The QLGE driver moved to staging in commit 955315b0dc8c ("qlge: Move
> drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/"), meaning
> our defconfigs that enable it have no effect as we don't enable
> CONFIG_STAGING.
>
> It sounds like the device is obsolete, so drop the driver.
>
> Signed-off-by: Michael Ellerman 

Acked-by: Joel Stanley

Re: [PATCH 2/9] powerpc/configs: NET_CADENCE became NET_VENDOR_CADENCE

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>
> The NET_CADENCE symbol was renamed to NET_VENDOR_CADENCE, so we don't
> need to disable the former, see commit 0df5f81c481e ("net: ethernet:
> Add missing VENDOR to Cadence and Packet Engines symbols").
>
> Signed-off-by: Michael Ellerman 

Acked-by: Joel Stanley

Re: [RFC PATCH 8/9] powerpc/configs/skiroot: Disable xmon default & enable reboot on panic

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>
> If the skiroot kernel crashes we don't want it sitting at an xmon
> prompt forever. Instead it's more helpful to reboot and bring the
> boot loader back up, and if the crash was transient we can then boot
> successfully.
>
> Similarly if we panic we should reboot, with a short timeout in case
> someone is watching the console.
>
> Signed-off-by: Michael Ellerman 

Acked-by: Joel Stanley 

> ---
>  arch/powerpc/configs/skiroot_defconfig | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/configs/skiroot_defconfig 
> b/arch/powerpc/configs/skiroot_defconfig
> index bd661a9a9410..12c96c8b0c1d 100644
> --- a/arch/powerpc/configs/skiroot_defconfig
> +++ b/arch/powerpc/configs/skiroot_defconfig
> @@ -29,6 +29,7 @@ CONFIG_ALTIVEC=y
>  CONFIG_VSX=y
>  CONFIG_NR_CPUS=2048
>  CONFIG_CPU_LITTLE_ENDIAN=y
> +CONFIG_PANIC_TIMEOUT=30
>  # CONFIG_PPC_VAS is not set
>  # CONFIG_PPC_PSERIES is not set
>  # CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
> @@ -293,6 +294,7 @@ CONFIG_LIBCRC32C=y
>  CONFIG_PRINTK_TIME=y
>  CONFIG_MAGIC_SYSRQ=y
>  CONFIG_DEBUG_STACKOVERFLOW=y
> +CONFIG_PANIC_ON_OOPS=y
>  CONFIG_SOFTLOCKUP_DETECTOR=y
>  CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
>  CONFIG_HARDLOCKUP_DETECTOR=y
> @@ -301,5 +303,4 @@ CONFIG_WQ_WATCHDOG=y
>  # CONFIG_SCHED_DEBUG is not set
>  # CONFIG_FTRACE is not set
>  CONFIG_XMON=y
> -CONFIG_XMON_DEFAULT=y
>  # CONFIG_RUNTIME_TESTING_MENU is not set
> --
> 2.21.1
>

Re: [PATCH 6/9] powerpc/configs/skiroot: Update for symbol movement only

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>
> Signed-off-by: Michael Ellerman 

Acked-by: Joel Stanley

Re: [RFC PATCH 9/9] powerpc/configs/skiroot: Enable some more hardening options

2020-01-15 Thread Joel Stanley

On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>
> Enable more hardening options.
>
> Note BUG_ON_DATA_CORRUPTION selects DEBUG_LIST and is essentially just
> a synonym for it.
>
> DEBUG_SG, DEBUG_NOTIFIERS, DEBUG_LIST, DEBUG_CREDENTIALS and
> SCHED_STACK_END_CHECK should all be low overhead and just add a few
> extra checks.
>
> Unselecting SLAB_MERGE_DEFAULT causes the SLAB to use more memory, but
> the skiroot kernel shouldn't be memory constrained on any of our
> systems, all it does is run a small bootloader.

Why do we unselect it?

> SLAB_FREELIST_RANDOM, and SLUB_DEBUG_ON will add some overhead to the
> SLAB allocator, but nothing that should be meaningful for skiroot.
>
> Signed-off-by: Michael Ellerman 

Acked-by: Joel Stanley 


> ---
>  arch/powerpc/configs/skiroot_defconfig | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/powerpc/configs/skiroot_defconfig 
> b/arch/powerpc/configs/skiroot_defconfig
> index 12c96c8b0c1d..59c2de904fda 100644
> --- a/arch/powerpc/configs/skiroot_defconfig
> +++ b/arch/powerpc/configs/skiroot_defconfig
> @@ -23,6 +23,8 @@ CONFIG_EXPERT=y
>  # CONFIG_AIO is not set
>  CONFIG_PERF_EVENTS=y
>  # CONFIG_COMPAT_BRK is not set
> +# CONFIG_SLAB_MERGE_DEFAULT is not set
> +CONFIG_SLAB_FREELIST_RANDOM=y
>  CONFIG_SLAB_FREELIST_HARDENED=y
>  CONFIG_PPC64=y
>  CONFIG_ALTIVEC=y
> @@ -293,6 +295,8 @@ CONFIG_LIBCRC32C=y
>  # CONFIG_XZ_DEC_SPARC is not set
>  CONFIG_PRINTK_TIME=y
>  CONFIG_MAGIC_SYSRQ=y
> +CONFIG_SLUB_DEBUG_ON=y
> +CONFIG_SCHED_STACK_END_CHECK=y
>  CONFIG_DEBUG_STACKOVERFLOW=y
>  CONFIG_PANIC_ON_OOPS=y
>  CONFIG_SOFTLOCKUP_DETECTOR=y
> @@ -301,6 +305,10 @@ CONFIG_HARDLOCKUP_DETECTOR=y
>  CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
>  CONFIG_WQ_WATCHDOG=y
>  # CONFIG_SCHED_DEBUG is not set
> +CONFIG_DEBUG_SG=y
> +CONFIG_DEBUG_NOTIFIERS=y
> +CONFIG_BUG_ON_DATA_CORRUPTION=y
> +CONFIG_DEBUG_CREDENTIALS=y
>  # CONFIG_FTRACE is not set
>  CONFIG_XMON=y
>  # CONFIG_RUNTIME_TESTING_MENU is not set
> --
> 2.21.1
>

[RFC PATCH 9/9] powerpc/configs/skiroot: Enable some more hardening options

2020-01-15 Thread Michael Ellerman

Enable more hardening options.

Note BUG_ON_DATA_CORRUPTION selects DEBUG_LIST and is essentially just
a synonym for it.

DEBUG_SG, DEBUG_NOTIFIERS, DEBUG_LIST, DEBUG_CREDENTIALS and
SCHED_STACK_END_CHECK should all be low overhead and just add a few
extra checks.

Unselecting SLAB_MERGE_DEFAULT causes the SLAB to use more memory, but
the skiroot kernel shouldn't be memory constrained on any of our
systems, all it does is run a small bootloader.

SLAB_FREELIST_RANDOM, and SLUB_DEBUG_ON will add some overhead to the
SLAB allocator, but nothing that should be meaningful for skiroot.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 12c96c8b0c1d..59c2de904fda 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -23,6 +23,8 @@ CONFIG_EXPERT=y
 # CONFIG_AIO is not set
 CONFIG_PERF_EVENTS=y
 # CONFIG_COMPAT_BRK is not set
+# CONFIG_SLAB_MERGE_DEFAULT is not set
+CONFIG_SLAB_FREELIST_RANDOM=y
 CONFIG_SLAB_FREELIST_HARDENED=y
 CONFIG_PPC64=y
 CONFIG_ALTIVEC=y
@@ -293,6 +295,8 @@ CONFIG_LIBCRC32C=y
 # CONFIG_XZ_DEC_SPARC is not set
 CONFIG_PRINTK_TIME=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_SLUB_DEBUG_ON=y
+CONFIG_SCHED_STACK_END_CHECK=y
 CONFIG_DEBUG_STACKOVERFLOW=y
 CONFIG_PANIC_ON_OOPS=y
 CONFIG_SOFTLOCKUP_DETECTOR=y
@@ -301,6 +305,10 @@ CONFIG_HARDLOCKUP_DETECTOR=y
 CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
 CONFIG_WQ_WATCHDOG=y
 # CONFIG_SCHED_DEBUG is not set
+CONFIG_DEBUG_SG=y
+CONFIG_DEBUG_NOTIFIERS=y
+CONFIG_BUG_ON_DATA_CORRUPTION=y
+CONFIG_DEBUG_CREDENTIALS=y
 # CONFIG_FTRACE is not set
 CONFIG_XMON=y
 # CONFIG_RUNTIME_TESTING_MENU is not set
-- 
2.21.1

[RFC PATCH 8/9] powerpc/configs/skiroot: Disable xmon default & enable reboot on panic

2020-01-15 Thread Michael Ellerman

If the skiroot kernel crashes we don't want it sitting at an xmon
prompt forever. Instead it's more helpful to reboot and bring the
boot loader back up, and if the crash was transient we can then boot
successfully.

Similarly if we panic we should reboot, with a short timeout in case
someone is watching the console.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index bd661a9a9410..12c96c8b0c1d 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -29,6 +29,7 @@ CONFIG_ALTIVEC=y
 CONFIG_VSX=y
 CONFIG_NR_CPUS=2048
 CONFIG_CPU_LITTLE_ENDIAN=y
+CONFIG_PANIC_TIMEOUT=30
 # CONFIG_PPC_VAS is not set
 # CONFIG_PPC_PSERIES is not set
 # CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
@@ -293,6 +294,7 @@ CONFIG_LIBCRC32C=y
 CONFIG_PRINTK_TIME=y
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_DEBUG_STACKOVERFLOW=y
+CONFIG_PANIC_ON_OOPS=y
 CONFIG_SOFTLOCKUP_DETECTOR=y
 CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
 CONFIG_HARDLOCKUP_DETECTOR=y
@@ -301,5 +303,4 @@ CONFIG_WQ_WATCHDOG=y
 # CONFIG_SCHED_DEBUG is not set
 # CONFIG_FTRACE is not set
 CONFIG_XMON=y
-CONFIG_XMON_DEFAULT=y
 # CONFIG_RUNTIME_TESTING_MENU is not set
-- 
2.21.1

[PATCH 7/9] powerpc/configs/skiroot: Enable security features

2020-01-15 Thread Michael Ellerman

From: Joel Stanley 

This turns on HARDENED_USERCOPY with HARDENED_USERCOPY_PAGESPAN, and
FORTIFY_SOURCE.

It also enables SECURITY_LOCKDOWN_LSM with _EARLY and
LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY options enabled.

MODULE_SIG is selected by lockdown, so it is still enabled.

Signed-off-by: Joel Stanley 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 24a210fe0049..bd661a9a9410 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -49,7 +49,6 @@ CONFIG_JUMP_LABEL=y
 CONFIG_STRICT_KERNEL_RWX=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
-CONFIG_MODULE_SIG=y
 CONFIG_MODULE_SIG_FORCE=y
 CONFIG_MODULE_SIG_SHA512=y
 CONFIG_PARTITION_ADVANCED=y
@@ -272,6 +271,16 @@ CONFIG_NLS_ASCII=y
 CONFIG_NLS_ISO8859_1=y
 CONFIG_NLS_UTF8=y
 CONFIG_ENCRYPTED_KEYS=y
+CONFIG_SECURITY=y
+CONFIG_HARDENED_USERCOPY=y
+# CONFIG_HARDENED_USERCOPY_FALLBACK is not set
+CONFIG_HARDENED_USERCOPY_PAGESPAN=y
+CONFIG_FORTIFY_SOURCE=y
+CONFIG_SECURITY_LOCKDOWN_LSM=y
+CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
+CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY=y
+# CONFIG_INTEGRITY is not set
+CONFIG_LSM="yama,loadpin,safesetid,integrity"
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC16=y
 CONFIG_CRC_ITU_T=y
-- 
2.21.1

[PATCH 6/9] powerpc/configs/skiroot: Update for symbol movement only

2020-01-15 Thread Michael Ellerman

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 42 +-
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 0aa060eef06c..24a210fe0049 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -1,8 +1,3 @@
-CONFIG_PPC64=y
-CONFIG_ALTIVEC=y
-CONFIG_VSX=y
-CONFIG_NR_CPUS=2048
-CONFIG_CPU_LITTLE_ENDIAN=y
 CONFIG_KERNEL_XZ=y
 # CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
@@ -29,16 +24,11 @@ CONFIG_EXPERT=y
 CONFIG_PERF_EVENTS=y
 # CONFIG_COMPAT_BRK is not set
 CONFIG_SLAB_FREELIST_HARDENED=y
-CONFIG_JUMP_LABEL=y
-CONFIG_STRICT_KERNEL_RWX=y
-CONFIG_MODULES=y
-CONFIG_MODULE_UNLOAD=y
-CONFIG_MODULE_SIG=y
-CONFIG_MODULE_SIG_FORCE=y
-CONFIG_MODULE_SIG_SHA512=y
-CONFIG_PARTITION_ADVANCED=y
-# CONFIG_MQ_IOSCHED_DEADLINE is not set
-# CONFIG_MQ_IOSCHED_KYBER is not set
+CONFIG_PPC64=y
+CONFIG_ALTIVEC=y
+CONFIG_VSX=y
+CONFIG_NR_CPUS=2048
+CONFIG_CPU_LITTLE_ENDIAN=y
 # CONFIG_PPC_VAS is not set
 # CONFIG_PPC_PSERIES is not set
 # CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
@@ -49,14 +39,24 @@ CONFIG_KEXEC=y
 CONFIG_PRESERVE_FA_DUMP=y
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_NUMA=y
-# CONFIG_COMPACTION is not set
-# CONFIG_MIGRATION is not set
 CONFIG_PPC_64K_PAGES=y
 CONFIG_SCHED_SMT=y
 CONFIG_CMDLINE_BOOL=y
 CONFIG_CMDLINE="console=tty0 console=hvc0 ipr.fast_reboot=1 quiet"
 # CONFIG_SECCOMP is not set
 # CONFIG_PPC_MEM_KEYS is not set
+CONFIG_JUMP_LABEL=y
+CONFIG_STRICT_KERNEL_RWX=y
+CONFIG_MODULES=y
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULE_SIG=y
+CONFIG_MODULE_SIG_FORCE=y
+CONFIG_MODULE_SIG_SHA512=y
+CONFIG_PARTITION_ADVANCED=y
+# CONFIG_MQ_IOSCHED_DEADLINE is not set
+# CONFIG_MQ_IOSCHED_KYBER is not set
+# CONFIG_COMPACTION is not set
+# CONFIG_MIGRATION is not set
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
@@ -153,7 +153,6 @@ CONFIG_IGB=m
 CONFIG_IXGB=m
 CONFIG_IXGBE=m
 CONFIG_I40E=m
-CONFIG_S2IO=m
 # CONFIG_NET_VENDOR_MARVELL is not set
 CONFIG_MLX4_EN=m
 # CONFIG_MLX4_CORE_GEN2 is not set
@@ -164,6 +163,7 @@ CONFIG_MLX5_CORE_EN=y
 # CONFIG_NET_VENDOR_MICROSEMI is not set
 CONFIG_MYRI10GE=m
 # CONFIG_NET_VENDOR_NATSEMI is not set
+CONFIG_S2IO=m
 # CONFIG_NET_VENDOR_NETRONOME is not set
 # CONFIG_NET_VENDOR_NI is not set
 # CONFIG_NET_VENDOR_NVIDIA is not set
@@ -271,6 +271,8 @@ CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ASCII=y
 CONFIG_NLS_ISO8859_1=y
 CONFIG_NLS_UTF8=y
+CONFIG_ENCRYPTED_KEYS=y
+# CONFIG_CRYPTO_HW is not set
 CONFIG_CRC16=y
 CONFIG_CRC_ITU_T=y
 CONFIG_LIBCRC32C=y
@@ -289,8 +291,6 @@ CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
 CONFIG_WQ_WATCHDOG=y
 # CONFIG_SCHED_DEBUG is not set
 # CONFIG_FTRACE is not set
-# CONFIG_RUNTIME_TESTING_MENU is not set
 CONFIG_XMON=y
 CONFIG_XMON_DEFAULT=y
-CONFIG_ENCRYPTED_KEYS=y
-# CONFIG_CRYPTO_HW is not set
+# CONFIG_RUNTIME_TESTING_MENU is not set
-- 
2.21.1

[PATCH 5/9] powerpc/configs/skiroot: Drop default n CONFIG_CRYPTO_ECHAINIV

2020-01-15 Thread Michael Ellerman

It's default n so we don't need to disable it.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 74cffb854c0f..0aa060eef06c 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -293,5 +293,4 @@ CONFIG_WQ_WATCHDOG=y
 CONFIG_XMON=y
 CONFIG_XMON_DEFAULT=y
 CONFIG_ENCRYPTED_KEYS=y
-# CONFIG_CRYPTO_ECHAINIV is not set
 # CONFIG_CRYPTO_HW is not set
-- 
2.21.1

[PATCH 3/9] powerpc/configs: Drop NET_VENDOR_HP which moved to staging

2020-01-15 Thread Michael Ellerman

The HP network driver moved to staging in commit 52340b82cf1a ("hp100:
Move 100BaseVG AnyLAN driver to staging") meaning we don't need to
disable it any more in our defconfigs.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/44x/akebono_defconfig | 1 -
 arch/powerpc/configs/skiroot_defconfig | 1 -
 2 files changed, 2 deletions(-)

diff --git a/arch/powerpc/configs/44x/akebono_defconfig 
b/arch/powerpc/configs/44x/akebono_defconfig
index f0c8a07cc274..7705a5c3f4ea 100644
--- a/arch/powerpc/configs/44x/akebono_defconfig
+++ b/arch/powerpc/configs/44x/akebono_defconfig
@@ -59,7 +59,6 @@ CONFIG_BLK_DEV_SD=y
 # CONFIG_NET_VENDOR_DLINK is not set
 # CONFIG_NET_VENDOR_EMULEX is not set
 # CONFIG_NET_VENDOR_EXAR is not set
-# CONFIG_NET_VENDOR_HP is not set
 CONFIG_IBM_EMAC=y
 # CONFIG_NET_VENDOR_MARVELL is not set
 # CONFIG_NET_VENDOR_MELLANOX is not set
diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index eaaffe9ae8b9..3eee39c50941 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -146,7 +146,6 @@ CONFIG_CHELSIO_T1=m
 # CONFIG_NET_VENDOR_DLINK is not set
 CONFIG_BE2NET=m
 # CONFIG_NET_VENDOR_EZCHIP is not set
-# CONFIG_NET_VENDOR_HP is not set
 # CONFIG_NET_VENDOR_HUAWEI is not set
 CONFIG_E1000=m
 CONFIG_E1000E=m
-- 
2.21.1

[PATCH 4/9] powerpc/configs/skiroot: Drop HID_LOGITECH

2020-01-15 Thread Michael Ellerman

Commit bdd08fff4915 ("HID: logitech: Add depends on LEDS_CLASS to
Logitech Kconfig entry") made HID_LOGITECH depend on LEDS_CLASS which
we do not enable, meaning we are not actually enabling those drivers
any more.

The Kconfig help text suggests USB HID compliant Logictech devices
will continue to work without HID_LOGITECH, so just drop it.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 3eee39c50941..74cffb854c0f 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -235,7 +235,6 @@ CONFIG_HID_CYPRESS=y
 CONFIG_HID_EZKEY=y
 CONFIG_HID_ITE=y
 CONFIG_HID_KENSINGTON=y
-CONFIG_HID_LOGITECH=y
 CONFIG_HID_MICROSOFT=y
 CONFIG_HID_MONTEREY=y
 CONFIG_USB_HIDDEV=y
-- 
2.21.1

[PATCH 2/9] powerpc/configs: NET_CADENCE became NET_VENDOR_CADENCE

2020-01-15 Thread Michael Ellerman

The NET_CADENCE symbol was renamed to NET_VENDOR_CADENCE, so we don't
need to disable the former, see commit 0df5f81c481e ("net: ethernet:
Add missing VENDOR to Cadence and Packet Engines symbols").

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 7ff1ff1ddc28..eaaffe9ae8b9 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -138,7 +138,6 @@ CONFIG_TIGON3=m
 CONFIG_BNX2X=m
 # CONFIG_NET_VENDOR_BROCADE is not set
 # CONFIG_NET_VENDOR_CADENCE is not set
-# CONFIG_NET_CADENCE is not set
 # CONFIG_NET_VENDOR_CAVIUM is not set
 CONFIG_CHELSIO_T1=m
 # CONFIG_NET_VENDOR_CISCO is not set
-- 
2.21.1

[PATCH 1/9] powerpc/configs: Drop CONFIG_QLGE which moved to staging

2020-01-15 Thread Michael Ellerman

The QLGE driver moved to staging in commit 955315b0dc8c ("qlge: Move
drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/"), meaning
our defconfigs that enable it have no effect as we don't enable
CONFIG_STAGING.

It sounds like the device is obsolete, so drop the driver.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/powernv_defconfig | 1 -
 arch/powerpc/configs/ppc64_defconfig   | 1 -
 arch/powerpc/configs/ppc6xx_defconfig  | 1 -
 arch/powerpc/configs/pseries_defconfig | 1 -
 arch/powerpc/configs/skiroot_defconfig | 1 -
 5 files changed, 5 deletions(-)

diff --git a/arch/powerpc/configs/powernv_defconfig 
b/arch/powerpc/configs/powernv_defconfig
index 32841456a573..71749377d164 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -181,7 +181,6 @@ CONFIG_MLX5_FPGA=y
 CONFIG_MLX5_CORE_EN=y
 CONFIG_MLX5_CORE_IPOIB=y
 CONFIG_MYRI10GE=m
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_USB_NET_DRIVERS=m
 # CONFIG_WLAN is not set
diff --git a/arch/powerpc/configs/ppc64_defconfig 
b/arch/powerpc/configs/ppc64_defconfig
index b250e6f5a7ca..7e68cb222c7b 100644
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -189,7 +189,6 @@ CONFIG_MLX4_EN=m
 CONFIG_MYRI10GE=m
 CONFIG_S2IO=m
 CONFIG_PASEMI_MAC=y
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_SUNGEM=y
 CONFIG_GELIC_NET=m
diff --git a/arch/powerpc/configs/ppc6xx_defconfig 
b/arch/powerpc/configs/ppc6xx_defconfig
index 7e28919041cf..3e2f44f38ac5 100644
--- a/arch/powerpc/configs/ppc6xx_defconfig
+++ b/arch/powerpc/configs/ppc6xx_defconfig
@@ -507,7 +507,6 @@ CONFIG_FORCEDETH=m
 CONFIG_HAMACHI=m
 CONFIG_YELLOWFIN=m
 CONFIG_QLA3XXX=m
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_8139CP=m
 CONFIG_8139TOO=m
diff --git a/arch/powerpc/configs/pseries_defconfig 
b/arch/powerpc/configs/pseries_defconfig
index 26126b4d4de3..6b68109e248f 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -169,7 +169,6 @@ CONFIG_IXGBE=m
 CONFIG_I40E=m
 CONFIG_MLX4_EN=m
 CONFIG_MYRI10GE=m
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_PPP=m
 CONFIG_PPP_BSDCOMP=m
diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 069f67f12731..7ff1ff1ddc28 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -171,7 +171,6 @@ CONFIG_MYRI10GE=m
 # CONFIG_NET_VENDOR_NVIDIA is not set
 # CONFIG_NET_VENDOR_OKI is not set
 # CONFIG_NET_VENDOR_PACKET_ENGINES is not set
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_QED=m
 CONFIG_QEDE=m
-- 
2.21.1

Re: [PATCH 00/12] i2c: convert subsystem to use i2c_new_client_device()

2020-01-15 Thread Wolfram Sang

On Tue, Jan 07, 2020 at 06:47:34PM +0100, Wolfram Sang wrote:
> This patch series converts the I2C subsystem to use the new API. Drivers
> have been build tested. There is one user left in the SMBus part of the
> core which will need a seperate series because all users of this
> function need to be checked/converted, too.
> 
> Except for documentation patches, the conversion has been done with a
> coccinelle script and further simplification have been applied when
> proofreading the patches.
> 
> A branch is here:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git 
> renesas/i2c/new_client_device
> 
> Looking forward to comments...

Thanks for all the quick reviews and tests \o/

Series applied to for-next, thanks!



signature.asc
Description: PGP signature

Re: [PATCH] evh_bytechan: fix out of bounds accesses

2020-01-15 Thread Stephen Rothwell

Hi Scott,

On Wed, 15 Jan 2020 14:01:35 -0600 Scott Wood  wrote:
>
> On Thu, 2020-01-16 at 06:42 +1100, Stephen Rothwell wrote:
> > Hi Timur,
> > 
> > On Wed, 15 Jan 2020 07:25:45 -0600 Timur Tabi  wrote:  
> > > On 1/14/20 12:31 AM, Stephen Rothwell wrote:  
> > > > +/**
> > > > + * ev_byte_channel_send - send characters to a byte stream
> > > > + * @handle: byte stream handle
> > > > + * @count: (input) num of chars to send, (output) num chars sent
> > > > + * @bp: pointer to chars to send
> > > > + *
> > > > + * Returns 0 for success, or an error code.
> > > > + */
> > > > +static unsigned int ev_byte_channel_send(unsigned int handle,
> > > > +   unsigned int *count, const char *bp)
> > > 
> > > Well, now you've moved this into the .c file and it is no longer 
> > > available to other callers.  Anything wrong with keeping it in the .h
> > > file?  
> > 
> > There are currently no other callers - are there likely to be in the
> > future?  Even if there are, is it time critical enough that it needs to
> > be inlined everywhere?  
> 
> It's not performance critical and there aren't likely to be other users --
> just a matter of what's cleaner.  FWIW I'd rather see the original patch,
> that keeps the raw asm hcall stuff as simple wrappers in one place.

And I don't mind either way :-)

I just want to get rid of the warnings.

-- 
Cheers,
Stephen Rothwell


pgp_NDSCYduOm.pgp
Description: OpenPGP digital signature

Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Thomas Gleixner

Hsin-Yi Wang  writes:

> Currently system reboots uses architecture specific codes (smp_send_stop)
> to offline non reboot CPUs. Most architecture's implementation is looping
> through all non reboot online CPUs and call ipi function to each of them. Some
> architecture like arm64, arm, and x86... would set offline masks to cpu 
> without
> really offline them. This causes some race condition and kernel warning comes
> out sometimes when system reboots.

'some race condition and kernel warning' is pretty useless information.
Please describe exactly which kind of issues are caused by the current
mechanism. Especially the race conditions are the interesting part (the
warnings are just a consequence).

> This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would
> offline cpus in

Please read Documentation/process/submitting-patches.rst and search for
'This patch'.

> migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> for
> checking online cpus would be an empty loop.

This does not make any sense. The issues which you are trying to solve
are going to be still there when CONFIG_HOTPLUG_CPU is disabled.

> If architecture don't enable this config, or some cpus somehow fails
> to offline, it would fallback to ipi function.

This is really a half baken solution which keeps the various pointlessly
different pseudo reboot/kexec offlining implementations around. So with
this we have yet more code which only works depending on kernel
configuration and has the issue of potentially not being able to offline
a CPU. IOW this is going to fail completely in cases where a system is
in a state which prevents regular hotplug.

The existing pseudo-offline functions have timeouts and eventually a
fallback, e.g. the NMI fallback on x86. With this proposed regular
offline solution this will just get stuck w/o a chance to force
recovery.

While I like the idea and surely agree that the ideal solution is to
properly shutdown the CPUs on reboot, we need to take a step back and
look at the minimum requirements for a regular shutdown/reboot and at
the same time have a look at the requirements for emergency shutdown and
kexec/kcrash. Having proper information about the race conditions and
warnings you mentioned would be a good starting point.

> Opt in this config for architectures that support CONFIG_HOTPLUG_CPU.

This is not opt-in. You force that on all architectures which support
CONFIG_HOTPLUG_CPU. The way we do this normally is to provide the
infrastructure first and then have separate patches (one per
architecture) enabling this, which allows the architecture maintainers
to decide individually.

Thanks,

tglx

Re: [PATCH v2] Fix display of Maximum Memory

2020-01-15 Thread Tyrel Datwyler

On 1/15/20 6:53 AM, Michael Bringmann wrote:
> Correct overflow problem in calculation+display of Maximum Memory
> value to syscfg where 32bits is insufficient.
> 

Probably needs the following Fixes tag:

Fixes: 772b039fd9a7 ("powerpc/pseries: Export maximum memory value")

otherwise,

Reviewed-by: Tyrel Datwyler 

> Signed-off-by: Michael Bringmann 
> ---
>  arch/powerpc/platforms/pseries/lparcfg.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
> b/arch/powerpc/platforms/pseries/lparcfg.c
> index e33e8bc..f00411c 100644
> --- a/arch/powerpc/platforms/pseries/lparcfg.c
> +++ b/arch/powerpc/platforms/pseries/lparcfg.c
> @@ -433,12 +433,12 @@ static void parse_em_data(struct seq_file *m)
> 
>  static void maxmem_data(struct seq_file *m)
>  {
> - unsigned long maxmem = 0;
> + u64 maxmem = 0;
> 
> - maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
> - maxmem += hugetlb_total_pages() * PAGE_SIZE;
> + maxmem += (u64)drmem_info->n_lmbs * drmem_info->lmb_size;
> + maxmem += (u64)hugetlb_total_pages() * PAGE_SIZE;
> 
> - seq_printf(m, "MaxMem=%ld\n", maxmem);
> + seq_printf(m, "MaxMem=%llu\n", maxmem);
>  }
> 
>  static int pseries_lparcfg_data(struct seq_file *m, void *v)
>

Re: [PATCH] powerpc: Do not consider weak unresolved symbol relocations as bad

2020-01-15 Thread Stephen Rothwell

Hi Alexandre,

Thanks for sorting this out.  Just a few comments below.

On Wed, 15 Jan 2020 15:46:48 -0500 Alexandre Ghiti  wrote:
>

>  
>  # Have Kbuild supply the path to objdump so we handle cross compilation.
^
"and nm"

> +# Remove from the bad relocations those that match an undefined weak symbol
> +# which will result in an absolute relocation to 0.
> +# Weak unresolved symbols are of that form in nm output:
> +# "  w _binary__btf_vmlinux_bin_end"
> +undef_weak_symbols=$($nm "$vmlinux" | awk -e '$1 ~ /w/ { print $2 }')
> +
> +while IFS= read -r weak_symbol; do
> + bad_relocs="$(echo -n "$bad_relocs" | sed "/$weak_symbol/d")"
> +done <<< "$undef_weak_symbols"

This is not a bash script, and the above is a bashism :-(
Also, my version of awk (mawk) doesn't have a -e option.

How about something like :

undef_weak_symbols=$($nm "$vmlinux" | awk '$1 ~ /w/ { print $2 }')
if [ "$undef_weak_symbols" ]; then
bad_relocs="$(echo "$bad_relocs" | grep -F -w -v "$undef_weak_symbols")"
fi

Or do this near the top and add the grep to the others.
-- 
Cheers,
Stephen Rothwell


pgpLKZxaAoJ0C.pgp
Description: OpenPGP digital signature

Re: [PATCH] net/wan/fsl_ucc_hdlc: fix out of bounds write on array utdm_info

2020-01-15 Thread David Miller

From: Colin King 
Date: Tue, 14 Jan 2020 14:54:48 +

> From: Colin Ian King 
> 
> Array utdm_info is declared as an array of MAX_HDLC_NUM (4) elements
> however up to UCC_MAX_NUM (8) elements are potentially being written
> to it.  Currently we have an array out-of-bounds write error on the
> last 4 elements. Fix this by making utdm_info UCC_MAX_NUM elements in
> size.
> 
> Addresses-Coverity: ("Out-of-bounds write")
> Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
> Signed-off-by: Colin Ian King 

Applied and queued up for -stable.

Re: [PATCH v12 11/22] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2020-01-15 Thread John Hubbard

On 1/15/20 7:30 AM, Christoph Hellwig wrote:
> On Tue, Jan 07, 2020 at 02:45:47PM -0800, John Hubbard wrote:
>> Introduce pin_user_pages*() variations of get_user_pages*() calls,
>> and also pin_longterm_pages*() variations.
>>
>> For now, these are placeholder calls, until the various call sites
>> are converted to use the correct get_user_pages*() or
>> pin_user_pages*() API.
> 
> What do the pure placeholders buy us?  The API itself looks ok,
> but until it actually is properly implemented it doesn't help at
> all, and we've had all kinds of bad experiences with these sorts
> of stub APIs.
> 

Hi Christoph,

Absolutely agreed, and in fact, after spending some time in this area I 
am getting a much better understanding of just how problematic "this will 
be used soon" APIs really are. However, this is not quite that case.

The following things make this different from a "pure placeholder" API:

1) These APIs are all exercised in the following patches in this series, 
unless I've overlooked one, and

2) The pages are actually tracked in the very next patch that I want to
post. That patch was posted as part of the v11 series [1], but 
Leon Romanovsky reported a problem with it, and so I'm going to add in
the ability to handle larger "pin" refcounts for the huge page cases.

Meanwhile, I wanted to get these long-simmering simpler preparatory
patches submitted, because it's clear that the API is unaffected by the
huge page refcount fix. (That fix will likely use the second struct page of
the compound page, to count up higher.)

[1] https://lore.kernel.org/r/20191216222537.491123-24-jhubb...@nvidia.com  
[PATCH v11 23/25] mm/gup: track FOLL_PIN pages

thanks,
-- 
John Hubbard
NVIDIA

Re: [PATCH v12 04/22] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages

2020-01-15 Thread John Hubbard

On 1/15/20 7:23 AM, Christoph Hellwig wrote:
...
> 
> I'm really not sold on this scheme.  Note that I think it is
> particularly bad, but it also doesn't seem any better than what
> we had before, and it introduced quite a bit more code.
> 

Hi Christoph,

All by itself, yes. But the very next patch (which needs a little 
rework for other reasons, so not included here) needs to reuse some of 
these functions within __unpin_devmap_managed_user_page():

page_is_devmap_managed()
free_devmap_managed_page()

That patch was posted as part of the v11 series [1], and it did this:

+#ifdef CONFIG_DEV_PAGEMAP_OPS
+static bool __unpin_devmap_managed_user_page(struct page *page)
+{
+   int count;
+
+   if (!page_is_devmap_managed(page))
+   return false;
+
+   count = page_ref_sub_return(page, GUP_PIN_COUNTING_BIAS);
+
+   __update_proc_vmstat(page, NR_FOLL_PIN_RETURNED, 1);
+   /*
+* devmap page refcounts are 1-based, rather than 0-based: if
+* refcount is 1, then the page is free and the refcount is
+* stable because nobody holds a reference on the page.
+*/
+   if (count == 1)
+   free_devmap_managed_page(page);
+   else if (!count)
+   __put_page(page);
+
+   return true;
+}
+#else
+static bool __unpin_devmap_managed_user_page(struct page *page)
+{
+   return false;
+}
+#endif /* CONFIG_DEV_PAGEMAP_OPS */
+
+/**
+ * unpin_user_page() - release a dma-pinned page
+ * @page:pointer to page to be released
+ *
+ * Pages that were pinned via pin_user_pages*() must be released via either
+ * unpin_user_page(), or one of the unpin_user_pages*() routines. This is so
+ * that such pages can be separately tracked and uniquely handled. In
+ * particular, interactions with RDMA and filesystems need special handling.
+ */
+void unpin_user_page(struct page *page)
+{
+   page = compound_head(page);
+
+   /*
+* For devmap managed pages we need to catch refcount transition from
+* GUP_PIN_COUNTING_BIAS to 1, when refcount reach one it means the
+* page is free and we need to inform the device driver through
+* callback. See include/linux/memremap.h and HMM for details.
+*/
+   if (__unpin_devmap_managed_user_page(page))
+   return;
+
+   if (page_ref_sub_and_test(page, GUP_PIN_COUNTING_BIAS))
+   __put_page(page);
+
+   __update_proc_vmstat(page, NR_FOLL_PIN_RETURNED, 1);
+}
+EXPORT_SYMBOL(unpin_user_page);


[1] https://lore.kernel.org/r/20191216222537.491123-24-jhubb...@nvidia.com  
[PATCH v11 23/25] mm/gup: track FOLL_PIN pages

thanks,
-- 
John Hubbard
NVIDIA

[PATCH] powerpc: Do not consider weak unresolved symbol relocations as bad

2020-01-15 Thread Alexandre Ghiti

Commit 8580ac9404f6 ("bpf: Process in-kernel BTF") introduced two weak
symbols that may be unresolved at link time which result in an absolute
relocation to 0. relocs_check.sh emits the following warning:

"WARNING: 2 bad relocations
c1a41478 R_PPC64_ADDR64_binary__btf_vmlinux_bin_start
c1a41480 R_PPC64_ADDR64_binary__btf_vmlinux_bin_end"

whereas those relocations are legitimate even for a relocatable kernel
compiled with -pie option.

relocs_check.sh already excluded some weak unresolved symbols explicitly:
remove those hardcoded symbols and add some logic that parses the symbols
using nm, retrieves all the weak unresolved symbols and excludes those from
the list of the potential bad relocations.

Reported-by: Stephen Rothwell 
Signed-off-by: Alexandre Ghiti 
---
 arch/powerpc/Makefile.postlink |  4 ++--
 arch/powerpc/tools/relocs_check.sh | 27 +++
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/Makefile.postlink b/arch/powerpc/Makefile.postlink
index 134f12f89b92..2268396ff4bb 100644
--- a/arch/powerpc/Makefile.postlink
+++ b/arch/powerpc/Makefile.postlink
@@ -17,11 +17,11 @@ quiet_cmd_head_check = CHKHEAD $@
 quiet_cmd_relocs_check = CHKREL  $@
 ifdef CONFIG_PPC_BOOK3S_64
   cmd_relocs_check =   \
-   $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh 
"$(OBJDUMP)" "$@" ; \
+   $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh 
"$(OBJDUMP)" "$(NM)" "$@" ; \
$(BASH) $(srctree)/arch/powerpc/tools/unrel_branch_check.sh 
"$(OBJDUMP)" "$@"
 else
   cmd_relocs_check =   \
-   $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh 
"$(OBJDUMP)" "$@"
+   $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh 
"$(OBJDUMP)" "$(NM)" "$@"
 endif
 
 # `@true` prevents complaint when there is nothing to be done
diff --git a/arch/powerpc/tools/relocs_check.sh 
b/arch/powerpc/tools/relocs_check.sh
index 7b9fe0a567cf..783281b75d9d 100755
--- a/arch/powerpc/tools/relocs_check.sh
+++ b/arch/powerpc/tools/relocs_check.sh
@@ -10,14 +10,15 @@
 # based on relocs_check.pl
 # Copyright © 2009 IBM Corporation
 
-if [ $# -lt 2 ]; then
-   echo "$0 [path to objdump] [path to vmlinux]" 1>&2
+if [ $# -lt 3 ]; then
+   echo "$0 [path to objdump] [path to nm] [path to vmlinux]" 1>&2
exit 1
 fi
 
 # Have Kbuild supply the path to objdump so we handle cross compilation.
 objdump="$1"
-vmlinux="$2"
+nm="$2"
+vmlinux="$3"
 
 bad_relocs=$(
 $objdump -R "$vmlinux" |
@@ -26,8 +27,6 @@ $objdump -R "$vmlinux" |
# These relocations are okay
# On PPC64:
#   R_PPC64_RELATIVE, R_PPC64_NONE
-   #   R_PPC64_ADDR64 mach_
-   #   R_PPC64_ADDR64 __crc_
# On PPC:
#   R_PPC_RELATIVE, R_PPC_ADDR16_HI,
#   R_PPC_ADDR16_HA,R_PPC_ADDR16_LO,
@@ -38,15 +37,27 @@ R_PPC_ADDR16_LO
 R_PPC_ADDR16_HI
 R_PPC_ADDR16_HA
 R_PPC_RELATIVE
-R_PPC_NONE' |
-   grep -E -v '\

Re: linux-next: build warning after merge of the bpf-next tree

2020-01-15 Thread Alexandre Ghiti


On 1/14/20 6:23 AM, Alexei Starovoitov wrote:

On Sun, Jan 12, 2020 at 8:33 PM Zong Li  wrote:

I'm not quite familiar with btf, so I have no idea why there are two
weak symbols be added in 8580ac9404f6 ("bpf: Process in-kernel BTF")

I can explain what these weak symbols are for, but that won't change
the fact that compiler or linker are buggy. The weak symbols should work
in all cases and compiler should pick correct relocation.
In this case it sounds that compiler picked relative relocation and failed
to reach zero from that address.


Sorry for the response delay: I now agree that there is nothing weird 
about those
relocations. All compiler/linker I took a look at (arm64, ppc64 and 
riscv64) correctly
emit an absolute relocation to the address 0 in case of a weak 
unresolved symbol,

so there's no buggy compiler/linker.

And regarding ppc warning, the kernel being compiled as -pie, the 
scripts looks
for absolute relocations which it considers as "bad", except for one 
that is known
to be weak and that is ignored: I have just sent a patch to fix this 
script so that weak

undefined symbol relocations are not considered as bad.

Thanks,

Alex

Re: [PATCH] evh_bytechan: fix out of bounds accesses

2020-01-15 Thread Scott Wood

On Thu, 2020-01-16 at 06:42 +1100, Stephen Rothwell wrote:
> Hi Timur,
> 
> On Wed, 15 Jan 2020 07:25:45 -0600 Timur Tabi  wrote:
> > On 1/14/20 12:31 AM, Stephen Rothwell wrote:
> > > +/**
> > > + * ev_byte_channel_send - send characters to a byte stream
> > > + * @handle: byte stream handle
> > > + * @count: (input) num of chars to send, (output) num chars sent
> > > + * @bp: pointer to chars to send
> > > + *
> > > + * Returns 0 for success, or an error code.
> > > + */
> > > +static unsigned int ev_byte_channel_send(unsigned int handle,
> > > + unsigned int *count, const char *bp)  
> > 
> > Well, now you've moved this into the .c file and it is no longer 
> > available to other callers.  Anything wrong with keeping it in the .h
> > file?
> 
> There are currently no other callers - are there likely to be in the
> future?  Even if there are, is it time critical enough that it needs to
> be inlined everywhere?

It's not performance critical and there aren't likely to be other users --
just a matter of what's cleaner.  FWIW I'd rather see the original patch,
that keeps the raw asm hcall stuff as simple wrappers in one place.

-Scott

Re: [PATCH] evh_bytechan: fix out of bounds accesses

2020-01-15 Thread Stephen Rothwell

Hi Timur,

On Wed, 15 Jan 2020 07:25:45 -0600 Timur Tabi  wrote:
>
> On 1/14/20 12:31 AM, Stephen Rothwell wrote:
> > +/**
> > + * ev_byte_channel_send - send characters to a byte stream
> > + * @handle: byte stream handle
> > + * @count: (input) num of chars to send, (output) num chars sent
> > + * @bp: pointer to chars to send
> > + *
> > + * Returns 0 for success, or an error code.
> > + */
> > +static unsigned int ev_byte_channel_send(unsigned int handle,
> > +   unsigned int *count, const char *bp)  
> 
> Well, now you've moved this into the .c file and it is no longer 
> available to other callers.  Anything wrong with keeping it in the .h file?

There are currently no other callers - are there likely to be in the
future?  Even if there are, is it time critical enough that it needs to
be inlined everywhere?

-- 
Cheers,
Stephen Rothwell


pgpWNHPvSOiXD.pgp
Description: OpenPGP digital signature

Re: [PATCH] net/wan/fsl_ucc_hdlc: fix out of bounds write on array utdm_info

2020-01-15 Thread Joakim Tjernlund

On Tue, 2020-01-14 at 14:54 +, Colin King wrote:
> 
> From: Colin Ian King 
> 
> Array utdm_info is declared as an array of MAX_HDLC_NUM (4) elements
> however up to UCC_MAX_NUM (8) elements are potentially being written
> to it.  Currently we have an array out-of-bounds write error on the
> last 4 elements. Fix this by making utdm_info UCC_MAX_NUM elements in
> size.
> 
> Addresses-Coverity: ("Out-of-bounds write")
> Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
> Signed-off-by: Colin Ian King 

This should be sent to stable as well
Cc:  # 4.19.x+

> ---
>  drivers/net/wan/fsl_ucc_hdlc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
> index 94e870f48e21..9edd94679283 100644
> --- a/drivers/net/wan/fsl_ucc_hdlc.c
> +++ b/drivers/net/wan/fsl_ucc_hdlc.c
> @@ -73,7 +73,7 @@ static struct ucc_tdm_info utdm_primary_info = {
> },
>  };
> 
> -static struct ucc_tdm_info utdm_info[MAX_HDLC_NUM];
> +static struct ucc_tdm_info utdm_info[UCC_MAX_NUM];
> 
>  static int uhdlc_init(struct ucc_hdlc_private *priv)
>  {
> --
> 2.24.0
>

Re: [PATCH v12 11/22] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2020-01-15 Thread Christoph Hellwig

On Tue, Jan 07, 2020 at 02:45:47PM -0800, John Hubbard wrote:
> Introduce pin_user_pages*() variations of get_user_pages*() calls,
> and also pin_longterm_pages*() variations.
> 
> For now, these are placeholder calls, until the various call sites
> are converted to use the correct get_user_pages*() or
> pin_user_pages*() API.

What do the pure placeholders buy us?  The API itself looks ok,
but until it actually is properly implemented it doesn't help at
all, and we've had all kinds of bad experiences with these sorts
of stub APIs.

Re: [PATCH v12 22/22] mm, tree-wide: rename put_user_page() to unpin_user_page()

2020-01-15 Thread Christoph Hellwig

Looks sensible:

Reviewed-by: Christoph Hellwig

Re: [PATCH v12 08/22] mm/gup: allow FOLL_FORCE for get_user_pages_fast()

2020-01-15 Thread Christoph Hellwig

Looks good,

Reviewed-by: Christoph Hellwig

Re: [PATCH v12 06/22] mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM

2020-01-15 Thread Christoph Hellwig

Looks good,

Reviewed-by: Christoph Hellwig

Re: [PATCH v12 05/22] goldish_pipe: rename local pin_user_pages() routine

2020-01-15 Thread Christoph Hellwig

Looks good,

Reviewed-by: Christoph Hellwig

Re: [PATCH v12 04/22] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages

2020-01-15 Thread Christoph Hellwig

On Tue, Jan 07, 2020 at 02:45:40PM -0800, John Hubbard wrote:
> An upcoming patch changes and complicates the refcounting and
> especially the "put page" aspects of it. In order to keep
> everything clean, refactor the devmap page release routines:
> 
> * Rename put_devmap_managed_page() to page_is_devmap_managed(),
>   and limit the functionality to "read only": return a bool,
>   with no side effects.
> 
> * Add a new routine, put_devmap_managed_page(), to handle
>   decrementing the refcount for ZONE_DEVICE pages.
> 
> * Change callers (just release_pages() and put_page()) to check
>   page_is_devmap_managed() before calling the new
>   put_devmap_managed_page() routine. This is a performance
>   point: put_page() is a hot path, so we need to avoid non-
>   inline function calls where possible.
> 
> * Rename __put_devmap_managed_page() to free_devmap_managed_page(),
>   and limit the functionality to unconditionally freeing a devmap
>   page.
> 
> This is originally based on a separate patch by Ira Weiny, which
> applied to an early version of the put_user_page() experiments.
> Since then, Jérôme Glisse suggested the refactoring described above.

I'm really not sold on this scheme.  Note that I think it is
particularly bad, but it also doesn't seem any better than what
we had before, and it introduced quite a bit more code.

Re: Bug 205201 - Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M

2020-01-15 Thread Christoph Hellwig

On Fri, Jan 10, 2020 at 08:10:28AM +0100, Christian Zigotzky wrote:
> Hi All,
> 
> The SCSI cards work again. [1, 2]
> 
> Sorry for bothering you.

No problem, and sorry for not following up earlier.  The Christmas
holiday and catch up phase led to a lot of delay.

Thanks a lot for taking care of these ppc platforms!

Re: [PATCH 2/2] string.h: fix incompatibility between FORTIFY_SOURCE and KASAN

2020-01-15 Thread Dmitry Vyukov

On Wed, Jan 15, 2020 at 7:37 AM Daniel Axtens  wrote:
>
> The memcmp KASAN self-test fails on a kernel with both KASAN and
> FORTIFY_SOURCE.
>
> When FORTIFY_SOURCE is on, a number of functions are replaced with
> fortified versions, which attempt to check the sizes of the operands.
> However, these functions often directly invoke __builtin_foo() once they
> have performed the fortify check. Using __builtins may bypass KASAN
> checks if the compiler decides to inline it's own implementation as
> sequence of instructions, rather than emit a function call that goes out
> to a KASAN-instrumented implementation.
>
> Why is only memcmp affected?
> 
>
> Of the string and string-like functions that kasan_test tests, only memcmp
> is replaced by an inline sequence of instructions in my testing on x86 with
> gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu2).
>
> I believe this is due to compiler heuristics. For example, if I annotate
> kmalloc calls with the alloc_size annotation (and disable some fortify
> compile-time checking!), the compiler will replace every memset except the
> one in kmalloc_uaf_memset with inline instructions. (I have some WIP
> patches to add this annotation.)
>
> Does this affect other functions in string.h?
> =
>
> Yes. Anything that uses __builtin_* rather than __real_* could be
> affected. This looks like:
>
>  - strncpy
>  - strcat
>  - strlen
>  - strlcpy maybe, under some circumstances?
>  - strncat under some circumstances
>  - memset
>  - memcpy
>  - memmove
>  - memcmp (as noted)
>  - memchr
>  - strcpy
>
> Whether a function call is emitted always depends on the compiler. Most
> bugs should get caught by FORTIFY_SOURCE, but the missed memcmp test shows
> that this is not always the case.
>
> Isn't FORTIFY_SOURCE disabled with KASAN?
> -
>
> The string headers on all arches supporting KASAN disable fortify with
> kasan, but only when address sanitisation is _also_ disabled. For example
> from x86:
>
>  #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
>  /*
>   * For files that are not instrumented (e.g. mm/slub.c) we
>   * should use not instrumented version of mem* functions.
>   */
>  #define memcpy(dst, src, len) __memcpy(dst, src, len)
>  #define memmove(dst, src, len) __memmove(dst, src, len)
>  #define memset(s, c, n) __memset(s, c, n)
>
>  #ifndef __NO_FORTIFY
>  #define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
>  #endif
>
>  #endif
>
> This comes from commit 6974f0c4555e ("include/linux/string.h: add the
> option of fortified string.h functions"), and doesn't work when KASAN is
> enabled and the file is supposed to be sanitised - as with test_kasan.c

Hi Daniel,

Thanks for addressing this. And special kudos for description detail level! :)

Phew, this layering of checking tools is a bit messy...

> I'm pretty sure this is backwards: we shouldn't be using __builtin_memcpy
> when we have a KASAN instrumented file, but we can use __builtin_* - and in
> many cases all fortification - in files where we don't have
> instrumentation.

I think if we use __builtin_* in a non-instrumented file, the compiler
can emit a call to normal mem* function which will be intercepted by
kasan and we will get instrumentation in a file which should not be
instrumented. Moreover this behavior will depend on optimization level
and compiler internals.
But as far as I see this does not affect any of the following and the
code change.



> What is correct behaviour?
> ==
>
> Firstly, there is some overlap between fortification and KASAN: both
> provide some level of _runtime_ checking. Only fortify provides
> compile-time checking.
>
> KASAN and fortify can pick up different things at runtime:
>
>  - Some fortify functions, notably the string functions, could easily be
>modified to consider sub-object sizes (e.g. members within a struct),
>and I have some WIP patches to do this. KASAN cannot detect these
>because it cannot insert poision between members of a struct.
>
>  - KASAN can detect many over-reads/over-writes when the sizes of both
>operands are unknown, which fortify cannot.
>
> So there are a couple of options:
>
>  1) Flip the test: disable fortify in santised files and enable it in
> unsanitised files. This at least stops us missing KASAN checking, but
> we lose the fortify checking.
>
>  2) Make the fortify code always call out to real versions. Do this only
> for KASAN, for fear of losing the inlining opportunities we get from
> __builtin_*.
>
> (We can't use kasan_check_{read,write}: because the fortify functions are
> _extern inline_, you can't include _static_ inline functions without a
> compiler warning. kasan_check_{read,write} are static inline so we can't
> use them even when they would otherwise be suitable.)
>
> Take approach 2 and call out to real versions when KASAN is

Re: [PATCH v2] Fix display of Maximum Memory

2020-01-15 Thread Christophe Leroy





Le 15/01/2020 à 15:53, Michael Bringmann a écrit :

Correct overflow problem in calculation+display of Maximum Memory
value to syscfg where 32bits is insufficient.

Signed-off-by: Michael Bringmann 


Reviewed-by: Christophe Leroy 


---
  arch/powerpc/platforms/pseries/lparcfg.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
b/arch/powerpc/platforms/pseries/lparcfg.c
index e33e8bc..f00411c 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -433,12 +433,12 @@ static void parse_em_data(struct seq_file *m)
  
  static void maxmem_data(struct seq_file *m)

  {
-   unsigned long maxmem = 0;
+   u64 maxmem = 0;
  
-	maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;

-   maxmem += hugetlb_total_pages() * PAGE_SIZE;
+   maxmem += (u64)drmem_info->n_lmbs * drmem_info->lmb_size;
+   maxmem += (u64)hugetlb_total_pages() * PAGE_SIZE;
  
-	seq_printf(m, "MaxMem=%ld\n", maxmem);

+   seq_printf(m, "MaxMem=%llu\n", maxmem);
  }
  
  static int pseries_lparcfg_data(struct seq_file *m, void *v)

[PATCH v2] Fix display of Maximum Memory

2020-01-15 Thread Michael Bringmann

Correct overflow problem in calculation+display of Maximum Memory
value to syscfg where 32bits is insufficient.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/platforms/pseries/lparcfg.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
b/arch/powerpc/platforms/pseries/lparcfg.c
index e33e8bc..f00411c 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -433,12 +433,12 @@ static void parse_em_data(struct seq_file *m)
 
 static void maxmem_data(struct seq_file *m)
 {
-   unsigned long maxmem = 0;
+   u64 maxmem = 0;
 
-   maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
-   maxmem += hugetlb_total_pages() * PAGE_SIZE;
+   maxmem += (u64)drmem_info->n_lmbs * drmem_info->lmb_size;
+   maxmem += (u64)hugetlb_total_pages() * PAGE_SIZE;
 
-   seq_printf(m, "MaxMem=%ld\n", maxmem);
+   seq_printf(m, "MaxMem=%llu\n", maxmem);
 }
 
 static int pseries_lparcfg_data(struct seq_file *m, void *v)
-- 
1.8.3.1

Re: [PATCH 1/2] kasan: stop tests being eliminated as dead code with FORTIFY_SOURCE

2020-01-15 Thread Christophe Leroy





Le 15/01/2020 à 15:43, Dmitry Vyukov a écrit :

On Wed, Jan 15, 2020 at 7:37 AM Daniel Axtens  wrote:


3 KASAN self-tests fail on a kernel with both KASAN and FORTIFY_SOURCE:
memchr, memcmp and strlen.

When FORTIFY_SOURCE is on, a number of functions are replaced with
fortified versions, which attempt to check the sizes of the operands.
However, these functions often directly invoke __builtin_foo() once they
have performed the fortify check. The compiler can detect that the results
of these functions are not used, and knows that they have no other side
effects, and so can eliminate them as dead code.

Why are only memchr, memcmp and strlen affected?


Of string and string-like functions, kasan_test tests:

  * strchr  ->  not affected, no fortified version
  * strrchr ->  likewise
  * strcmp  ->  likewise
  * strncmp ->  likewise

  * strnlen ->  not affected, the fortify source implementation calls the
underlying strnlen implementation which is instrumented, not
a builtin

  * strlen  ->  affected, the fortify souce implementation calls a __builtin
version which the compiler can determine is dead.

  * memchr  ->  likewise
  * memcmp  ->  likewise

  * memset ->   not affected, the compiler knows that memset writes to its
first argument and therefore is not dead.

Why does this not affect the functions normally?


In string.h, these functions are not marked as __pure, so the compiler
cannot know that they do not have side effects. If relevant functions are
marked as __pure in string.h, we see the following warnings and the
functions are elided:

lib/test_kasan.c: In function ‘kasan_memchr’:
lib/test_kasan.c:606:2: warning: statement with no effect [-Wunused-value]
   memchr(ptr, '1', size + 1);
   ^~
lib/test_kasan.c: In function ‘kasan_memcmp’:
lib/test_kasan.c:622:2: warning: statement with no effect [-Wunused-value]
   memcmp(ptr, arr, size+1);
   ^~~~
lib/test_kasan.c: In function ‘kasan_strings’:
lib/test_kasan.c:645:2: warning: statement with no effect [-Wunused-value]
   strchr(ptr, '1');
   ^~~~
...

This annotation would make sense to add and could be added at any point, so
the behaviour of test_kasan.c should change.

The fix
===

Make all the functions that are pure write their results to a global,
which makes them live. The strlen and memchr tests now pass.

The memcmp test still fails to trigger, which is addressed in the next
patch.

Cc: Daniel Micay 
Cc: Andrey Ryabinin 
Cc: Alexander Potapenko 
Cc: Dmitry Vyukov 
Fixes: 0c96350a2d2f ("lib/test_kasan.c: add tests for several string/memory API 
functions")
Signed-off-by: Daniel Axtens 
---
  lib/test_kasan.c | 30 +++---
  1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 328d33beae36..58a8cef0d7a2 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -23,6 +23,14 @@

  #include 

+/*
+ * We assign some test results to these globals to make sure the tests
+ * are not eliminated as dead code.
+ */
+
+int int_result;
+void *ptr_result;


These are globals, but are not static and don't have kasan_ prefix.
But I guess this does not matter for modules?
Otherwise:

Reviewed-by: Dmitry Vyukov 



I think if you make them static, GCC will see they aren't used and will 
eliminate everything still ?


Christophe

Re: [PATCH 1/2] kasan: stop tests being eliminated as dead code with FORTIFY_SOURCE

2020-01-15 Thread Dmitry Vyukov

On Wed, Jan 15, 2020 at 7:37 AM Daniel Axtens  wrote:
>
> 3 KASAN self-tests fail on a kernel with both KASAN and FORTIFY_SOURCE:
> memchr, memcmp and strlen.
>
> When FORTIFY_SOURCE is on, a number of functions are replaced with
> fortified versions, which attempt to check the sizes of the operands.
> However, these functions often directly invoke __builtin_foo() once they
> have performed the fortify check. The compiler can detect that the results
> of these functions are not used, and knows that they have no other side
> effects, and so can eliminate them as dead code.
>
> Why are only memchr, memcmp and strlen affected?
> 
>
> Of string and string-like functions, kasan_test tests:
>
>  * strchr  ->  not affected, no fortified version
>  * strrchr ->  likewise
>  * strcmp  ->  likewise
>  * strncmp ->  likewise
>
>  * strnlen ->  not affected, the fortify source implementation calls the
>underlying strnlen implementation which is instrumented, not
>a builtin
>
>  * strlen  ->  affected, the fortify souce implementation calls a __builtin
>version which the compiler can determine is dead.
>
>  * memchr  ->  likewise
>  * memcmp  ->  likewise
>
>  * memset ->   not affected, the compiler knows that memset writes to its
>first argument and therefore is not dead.
>
> Why does this not affect the functions normally?
> 
>
> In string.h, these functions are not marked as __pure, so the compiler
> cannot know that they do not have side effects. If relevant functions are
> marked as __pure in string.h, we see the following warnings and the
> functions are elided:
>
> lib/test_kasan.c: In function ‘kasan_memchr’:
> lib/test_kasan.c:606:2: warning: statement with no effect [-Wunused-value]
>   memchr(ptr, '1', size + 1);
>   ^~
> lib/test_kasan.c: In function ‘kasan_memcmp’:
> lib/test_kasan.c:622:2: warning: statement with no effect [-Wunused-value]
>   memcmp(ptr, arr, size+1);
>   ^~~~
> lib/test_kasan.c: In function ‘kasan_strings’:
> lib/test_kasan.c:645:2: warning: statement with no effect [-Wunused-value]
>   strchr(ptr, '1');
>   ^~~~
> ...
>
> This annotation would make sense to add and could be added at any point, so
> the behaviour of test_kasan.c should change.
>
> The fix
> ===
>
> Make all the functions that are pure write their results to a global,
> which makes them live. The strlen and memchr tests now pass.
>
> The memcmp test still fails to trigger, which is addressed in the next
> patch.
>
> Cc: Daniel Micay 
> Cc: Andrey Ryabinin 
> Cc: Alexander Potapenko 
> Cc: Dmitry Vyukov 
> Fixes: 0c96350a2d2f ("lib/test_kasan.c: add tests for several string/memory 
> API functions")
> Signed-off-by: Daniel Axtens 
> ---
>  lib/test_kasan.c | 30 +++---
>  1 file changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> index 328d33beae36..58a8cef0d7a2 100644
> --- a/lib/test_kasan.c
> +++ b/lib/test_kasan.c
> @@ -23,6 +23,14 @@
>
>  #include 
>
> +/*
> + * We assign some test results to these globals to make sure the tests
> + * are not eliminated as dead code.
> + */
> +
> +int int_result;
> +void *ptr_result;

These are globals, but are not static and don't have kasan_ prefix.
But I guess this does not matter for modules?
Otherwise:

Reviewed-by: Dmitry Vyukov 

> +
>  /*
>   * Note: test functions are marked noinline so that their names appear in
>   * reports.
> @@ -603,7 +611,7 @@ static noinline void __init kasan_memchr(void)
> if (!ptr)
> return;
>
> -   memchr(ptr, '1', size + 1);
> +   ptr_result = memchr(ptr, '1', size + 1);
> kfree(ptr);
>  }
>
> @@ -618,8 +626,7 @@ static noinline void __init kasan_memcmp(void)
> if (!ptr)
> return;
>
> -   memset(arr, 0, sizeof(arr));
> -   memcmp(ptr, arr, size+1);
> +   int_result = memcmp(ptr, arr, size + 1);
> kfree(ptr);
>  }
>
> @@ -642,22 +649,22 @@ static noinline void __init kasan_strings(void)
>  * will likely point to zeroed byte.
>  */
> ptr += 16;
> -   strchr(ptr, '1');
> +   ptr_result = strchr(ptr, '1');
>
> pr_info("use-after-free in strrchr\n");
> -   strrchr(ptr, '1');
> +   ptr_result = strrchr(ptr, '1');
>
> pr_info("use-after-free in strcmp\n");
> -   strcmp(ptr, "2");
> +   int_result = strcmp(ptr, "2");
>
> pr_info("use-after-free in strncmp\n");
> -   strncmp(ptr, "2", 1);
> +   int_result = strncmp(ptr, "2", 1);
>
> pr_info("use-after-free in strlen\n");
> -   strlen(ptr);
> +   int_result = strlen(ptr);
>
> pr_info("use-after-free in strnlen\n");
> -   strnlen(ptr, 1);
> +   int_result = strnlen(ptr, 1);
>  }
>
>  static noinline void __init kasan_bitops(void)

Re: [PATCH] Fix display of Maximum Memory

2020-01-15 Thread Michael Bringmann

On 1/14/20 11:41 PM, Christophe Leroy wrote:
> 
> 
> Le 14/01/2020 à 22:07, Michael Bringmann a écrit :
>> Correct overflow problem in calculation+display of Maximum Memory
>> value to syscfg where 32bits is insufficient.
>>
>> Signed-off-by: Michael Bringmann 
>> ---
>>   arch/powerpc/platforms/pseries/lparcfg.c | 8 
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
>> b/arch/powerpc/platforms/pseries/lparcfg.c
>> index 4ee2594..183aeb7 100644
>> --- a/arch/powerpc/platforms/pseries/lparcfg.c
>> +++ b/arch/powerpc/platforms/pseries/lparcfg.c
>> @@ -435,12 +435,12 @@ static void parse_em_data(struct seq_file *m)
>>
>>   static void maxmem_data(struct seq_file *m)
>>   {
>> -   unsigned long maxmem = 0;
>> +   unsigned long long maxmem = 0;
> 
> What about using u64 instead, for readability ?

Okay.
> 
>>
>> -   maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
>> -   maxmem += hugetlb_total_pages() * PAGE_SIZE;
>> +   maxmem += (unsigned long long)drmem_info->n_lmbs * (unsigned long 
>> long)drmem_info->lmb_size;
> 
> This line is likely too long. You only need to cast one of the two operants 
> to force a 64 bits multiply. And using u64 would shorten the line.
> 
> Can both multiplications overflow ?

Yes.

> 
> Christophe
> 
>> +   maxmem += (unsigned long long)hugetlb_total_pages() * (unsigned long 
>> long)PAGE_SIZE;
>>
>> -   seq_printf(m, "MaxMem=%ld\n", maxmem);
>> +   seq_printf(m, "MaxMem=%llu\n", maxmem);
>>   }
>>
>>   static int pseries_lparcfg_data(struct seq_file *m, void *v)
>>
> 

Thanks.
-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.ibm.com

Re: [PATCH] evh_bytechan: fix out of bounds accesses

2020-01-15 Thread Timur Tabi


On 1/14/20 12:31 AM, Stephen Rothwell wrote:

+/**
+ * ev_byte_channel_send - send characters to a byte stream
+ * @handle: byte stream handle
+ * @count: (input) num of chars to send, (output) num chars sent
+ * @bp: pointer to chars to send
+ *
+ * Returns 0 for success, or an error code.
+ */
+static unsigned int ev_byte_channel_send(unsigned int handle,
+   unsigned int *count, const char *bp)


Well, now you've moved this into the .c file and it is no longer 
available to other callers.  Anything wrong with keeping it in the .h file?

RE: [PATCH] evh_bytechan: fix out of bounds accesses

2020-01-15 Thread Laurentiu Tudor



> -Original Message-
> From: Linuxppc-dev  bounces+laurentiu.tudor=nxp@lists.ozlabs.org> On Behalf Of Stephen
> Rothwell
> Sent: Tuesday, January 14, 2020 8:32 AM
> To: Timur Tabi 
> Cc: b08...@gmail.com; Greg Kroah-Hartman ;
> Jiri Slaby ; York Sun ; PowerPC Mailing
> List ; sw...@redhat.com
> Subject: Re: [PATCH] evh_bytechan: fix out of bounds accesses
> 
> Hi Timur,
> 
> On Mon, 13 Jan 2020 19:10:11 -0600 Timur Tabi  wrote:
> >
> > On 1/13/20 2:25 PM, Stephen Rothwell wrote:
> > > The problem is not really the declaration, the problem is that
> > > ev_byte_channel_send always accesses 16 bytes from the buffer and it
> is
> > > not always passed a buffer that long (in one case it is passed a
> > > pointer to a single byte).  So the alternative to the memcpy approach
> I
> > > have take is to complicate ev_byte_channel_send so that only accesses
> > > count bytes from the buffer.
> >
> > Ah, I see now.  This is all coming back to me.
> >
> > I would prefer that ev_byte_channel_send() is updated to access only
> > 'count' bytes.  If that means adding a memcpy to the
> > ev_byte_channel_send() itself, then so be it.  Trying to figure out how
> > to stuff n bytes into 4 32-bit registers is probably not worth the
> effort.
> 
> Like this (I have compile tested this):
> 
> From: Stephen Rothwell 
> Date: Thu, 9 Jan 2020 18:23:48 +1100
> Subject: [PATCH v2] evh_bytechan: fix out of bounds accesses
> 
> ev_byte_channel_send() assumes that its third argument is a 16 byte array.
> Some places where it is called it may not be (or we can't easily tell
> if it is).  Newer compilers have started producing warnings about this,
> so copy the bytes to send into a local array if the passed length is
> to short.
> 
> Since all the calls of ev_byte_channel_send() are in one file, lets move
> it there from the header file and let the compiler decide if it wants
> to inline it.
> 
> The warnings are:
> 
> In file included from include/linux/byteorder/big_endian.h:5,
>  from arch/powerpc/include/uapi/asm/byteorder.h:14,
>  from include/asm-generic/bitops/le.h:6,
>  from arch/powerpc/include/asm/bitops.h:250,
>  from include/linux/bitops.h:29,
>  from include/linux/kernel.h:12,
>  from include/asm-generic/bug.h:19,
>  from arch/powerpc/include/asm/bug.h:109,
>  from include/linux/bug.h:5,
>  from include/linux/mmdebug.h:5,
>  from include/linux/gfp.h:5,
>  from include/linux/slab.h:15,
>  from drivers/tty/ehv_bytechan.c:24:
> drivers/tty/ehv_bytechan.c: In function ‘ehv_bc_udbg_putc’:
> arch/powerpc/include/asm/epapr_hcalls.h:298:20: warning: array subscript 1
> is outside array bounds of ‘const char[1]’ [-Warray-bounds]
>   298 |  r6 = be32_to_cpu(p[1]);
> include/uapi/linux/byteorder/big_endian.h:40:51: note: in definition of
> macro ‘__be32_to_cpu’
>40 | #define __be32_to_cpu(x) ((__force __u32)(__be32)(x))
>   |   ^
> arch/powerpc/include/asm/epapr_hcalls.h:298:7: note: in expansion of macro
> ‘be32_to_cpu’
>   298 |  r6 = be32_to_cpu(p[1]);
>   |   ^~~
> drivers/tty/ehv_bytechan.c:166:13: note: while referencing ‘data’
>   166 | static void ehv_bc_udbg_putc(char c)
>   | ^~~~
> In file included from include/linux/byteorder/big_endian.h:5,
>  from arch/powerpc/include/uapi/asm/byteorder.h:14,
>  from include/asm-generic/bitops/le.h:6,
>  from arch/powerpc/include/asm/bitops.h:250,
>  from include/linux/bitops.h:29,
>  from include/linux/kernel.h:12,
>  from include/asm-generic/bug.h:19,
>  from arch/powerpc/include/asm/bug.h:109,
>  from include/linux/bug.h:5,
>  from include/linux/mmdebug.h:5,
>  from include/linux/gfp.h:5,
>  from include/linux/slab.h:15,
>  from drivers/tty/ehv_bytechan.c:24:
> arch/powerpc/include/asm/epapr_hcalls.h:299:20: warning: array subscript 2
> is outside array bounds of ‘const char[1]’ [-Warray-bounds]
>   299 |  r7 = be32_to_cpu(p[2]);
> include/uapi/linux/byteorder/big_endian.h:40:51: note: in definition of
> macro ‘__be32_to_cpu’
>40 | #define __be32_to_cpu(x) ((__force __u32)(__be32)(x))
>   |   ^
> arch/powerpc/include/asm/epapr_hcalls.h:299:7: note: in expansion of macro
> ‘be32_to_cpu’
>   299 |  r7 = be32_to_cpu(p[2]);
>   |   ^~~
> drivers/tty/ehv_bytechan.c:166:13: note: while referencing ‘data’
>   166 | static void ehv_bc_udbg_putc(char c)
>   | ^~~~
> In file included from include/linux/byteorder/big_endian.h:5,
>  from arch/powerpc/include/uapi/asm/byteorder.h:14,
>

[PATCH vdsotest] Use vdso wrapper for gettimeofday()

2020-01-15 Thread Christophe Leroy

To properly handle errors returned by gettimeofday(), the
DO_VDSO_CALL() macro has to be used, otherwise vdsotest
misinterpret VDSO function return on error.

This has gone unnoticed until now because the powerpc VDSO
gettimeofday() always succeed, but while porting powerpc to
generic C VDSO, the following has been encountered:

gettimeofday(valid, UINTPTR_MAX) (VDSO): unexpected return value 14, expected -1
gettimeofday(valid, UINTPTR_MAX) (VDSO): exited with status 1, expected 0
gettimeofday(valid, page (PROT_NONE)) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(valid, page (PROT_NONE)) (VDSO): exited with status 1, expected 0
gettimeofday(valid, page (PROT_READ)) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(valid, page (PROT_READ)) (VDSO): exited with status 1, expected 0
gettimeofday(UINTPTR_MAX, valid) (VDSO): unexpected return value 14, expected -1
gettimeofday(UINTPTR_MAX, valid) (VDSO): exited with status 1, expected 0
gettimeofday(UINTPTR_MAX, NULL) (VDSO): unexpected return value 14, expected -1
gettimeofday(UINTPTR_MAX, NULL) (VDSO): exited with status 1, expected 0
gettimeofday(UINTPTR_MAX, UINTPTR_MAX) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(UINTPTR_MAX, UINTPTR_MAX) (VDSO): exited with status 1, expected 0
gettimeofday(UINTPTR_MAX, page (PROT_NONE)) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(UINTPTR_MAX, page (PROT_NONE)) (VDSO): exited with status 1, 
expected 0
gettimeofday(UINTPTR_MAX, page (PROT_READ)) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(UINTPTR_MAX, page (PROT_READ)) (VDSO): exited with status 1, 
expected 0
gettimeofday(page (PROT_NONE), valid) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(page (PROT_NONE), valid) (VDSO): exited with status 1, expected 0
gettimeofday(page (PROT_NONE), NULL) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(page (PROT_NONE), NULL) (VDSO): exited with status 1, expected 0
Failure threshold (10) reached; stopping test.
gettimeofday(page (PROT_NONE), UINTPTR_MAX) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(page (PROT_NONE), UINTPTR_MAX) (VDSO): exited with status 1, 
expected 0
Failure threshold (10) reached; stopping test.
gettimeofday(page (PROT_NONE), page (PROT_NONE)) (VDSO): unexpected return 
value 14, expected -1
gettimeofday(page (PROT_NONE), page (PROT_NONE)) (VDSO): exited with status 1, 
expected 0
Failure threshold (10) reached; stopping test.
gettimeofday(page (PROT_NONE), page (PROT_READ)) (VDSO): unexpected return 
value 14, expected -1
gettimeofday(page (PROT_NONE), page (PROT_READ)) (VDSO): exited with status 1, 
expected 0
Failure threshold (10) reached; stopping test.
gettimeofday(page (PROT_READ), valid) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(page (PROT_READ), valid) (VDSO): exited with status 1, expected 0
Failure threshold (10) reached; stopping test.
gettimeofday(page (PROT_READ), NULL) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(page (PROT_READ), NULL) (VDSO): exited with status 1, expected 0
Failure threshold (10) reached; stopping test.
gettimeofday(page (PROT_READ), UINTPTR_MAX) (VDSO): unexpected return value 14, 
expected -1
gettimeofday(page (PROT_READ), UINTPTR_MAX) (VDSO): exited with status 1, 
expected 0
Failure threshold (10) reached; stopping test.
gettimeofday(page (PROT_READ), page (PROT_NONE)) (VDSO): unexpected return 
value 14, expected -1
gettimeofday(page (PROT_READ), page (PROT_NONE)) (VDSO): exited with status 1, 
expected 0
Failure threshold (10) reached; stopping test.
gettimeofday(page (PROT_READ), page (PROT_READ)) (VDSO): unexpected return 
value 14, expected -1
gettimeofday(page (PROT_READ), page (PROT_READ)) (VDSO): exited with status 1, 
expected 0
Failure threshold (10) reached; stopping test.
gettimeofday/abi: 18 failures/inconsistencies encountered

Signed-off-by: Christophe Leroy 
---
 src/gettimeofday.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gettimeofday.c b/src/gettimeofday.c
index c50ecea..472f372 100644
--- a/src/gettimeofday.c
+++ b/src/gettimeofday.c
@@ -54,11 +54,16 @@ static void gettimeofday_syscall_nofail(struct timeval *tv, 
struct timezone *tz)
error(EXIT_FAILURE, errno, "SYS_gettimeofday");
 }
 
+static int gettimeofday_vdso_wrapper(struct timeval *tv, struct timezone *tz)
+{
+   return DO_VDSO_CALL(gettimeofday_vdso, int, 2, tv, tz);
+}
+
 static void gettimeofday_vdso_nofail(struct timeval *tv, struct timezone *tz)
 {
int err;
 
-   err = gettimeofday_vdso(tv, tz);
+   err = gettimeofday_vdso_wrapper(tv, tz);
if (err)
error(EXIT_FAILURE, errno, "gettimeofday");
 }
@@ -153,7 +158,7 @@ static void gettimeofday_bench(struct ctx *ctx, struct 
bench_results *res)
struct timeval tv;
 
if (vdso_has_gettimeofday()) {
-   BENCH(ctx, gettimeofday_vdso(, NULL),
+

Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Sudeep Holla

On Wed, Jan 15, 2020 at 02:34:10PM +0800, Hsin-Yi Wang wrote:
> Currently system reboots uses architecture specific codes (smp_send_stop)
> to offline non reboot CPUs. Most architecture's implementation is looping
> through all non reboot online CPUs and call ipi function to each of them. Some
> architecture like arm64, arm, and x86... would set offline masks to cpu 
> without
> really offline them. This causes some race condition and kernel warning comes
> out sometimes when system reboots.
>
> This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would offline 
> cpus in
> migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> for
> checking online cpus would be an empty loop. If architecture don't enable this
> config, or some cpus somehow fails to offline, it would fallback to ipi
> function.
>

What's the timing impact on systems with large number of CPUs(say 256 or
more) ? I remember we added some change to reduce the wait times for
offlining CPUs in system suspend path on arm64, still not negligible.

--
Regards,
Sudeep

Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Hsin-Yi Wang

On Wed, Jan 15, 2020 at 5:49 PM Rafael J. Wysocki  wrote:
>
> On Wed, Jan 15, 2020 at 7:35 AM Hsin-Yi Wang  wrote:
> >
> > Currently system reboots uses architecture specific codes (smp_send_stop)
> > to offline non reboot CPUs. Most architecture's implementation is looping
> > through all non reboot online CPUs and call ipi function to each of them. 
> > Some
> > architecture like arm64, arm, and x86... would set offline masks to cpu 
> > without
> > really offline them. This causes some race condition and kernel warning 
> > comes
> > out sometimes when system reboots.
> >
> > This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would offline 
> > cpus in
> > migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> > for
> > checking online cpus would be an empty loop. If architecture don't enable 
> > this
> > config, or some cpus somehow fails to offline, it would fallback to ipi
> > function.
> >
> > Opt in this config for architectures that support CONFIG_HOTPLUG_CPU.
> >
> > Signed-off-by: Hsin-Yi Wang 
> > ---
> > Change from v4:
> > * fix a few nits: naming, comments, remove Kconfig text...
> >
> > Change from v3:
> > * Opt in config for architectures that support CONFIG_HOTPLUG_CPU
> > * Merge function offline_secondary_cpus() and freeze_secondary_cpus()
> >   with an additional flag.
>
> This does not seem to be a very good idea, since
> freeze_secondary_cpus() does much more than you need for reboot.
>
> For reboot, you basically only need to do something like this AFAICS:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(i) {
> if (i != cpu)
> _cpu_down(i, 1, CPUHP_OFFLINE);
> }
> cpu_hotplug_disabled++;
>
> cpu_maps_update_done();
>
> And you may put this into a function defined outside of CONFIG_PM_SLEEP.
>
v2's implementation is similar to this. The conclusion in v2[1] is
that since these 2 functions are similar, we should merge them. I'm
fine both ways but slightly prefer v2's. Maybe wait for others to
comment?

[1] https://lore.kernel.org/lkml/87muarpcwm@vitty.brq.redhat.com/
> >
> > Change from v2:
> > * Add another config instead of configed by CONFIG_HOTPLUG_CPU
>
> So why exactly is this new Kconfig option needed?
>
> Everybody supporting CPU hotplug seems to opt in anyway.
>
Currently we opt-in for all arch that supports HOTPLUG_CPU, but if
some arch decides that this would make reboot slow (or maybe other
reasons), they can choose to opt-out.
I have only tested on arm64 and x86 for now.
> [cut]
>
> >
> > -int freeze_secondary_cpus(int primary)
> > +int freeze_secondary_cpus(int primary, bool reboot)
> >  {
> > int cpu, error = 0;
> >
> > @@ -1237,11 +1237,13 @@ int freeze_secondary_cpus(int primary)
> > if (cpu == primary)
> > continue;
> >
> > -   if (pm_wakeup_pending()) {
> > +#ifdef CONFIG_PM_SLEEP
> > +   if (!reboot && pm_wakeup_pending()) {
> > pr_info("Wakeup pending. Abort CPU freeze\n");
> > error = -EBUSY;
> > break;
> > }
> > +#endif
>
> Please avoid using #ifdefs in function bodies.  This makes the code
> hard to maintain in the long term.
>
> >
> > trace_suspend_resume(TPS("CPU_OFF"), cpu, true);
> > error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
> > @@ -1250,7 +1252,9 @@ int freeze_secondary_cpus(int primary)
> > cpumask_set_cpu(cpu, frozen_cpus);
> > else {
> > pr_err("Error taking CPU%d down: %d\n", cpu, error);
> > -   break;
> > +   /* When rebooting, offline as many CPUs as 
> > possible. */
> > +   if (!reboot)
> > +   break;
> > }
> > }
> >
> > diff --git a/kernel/reboot.c b/kernel/reboot.c
> > index c4d472b7f1b4..12f643b66e57 100644
> > --- a/kernel/reboot.c
> > +++ b/kernel/reboot.c
> > @@ -7,6 +7,7 @@
> >
> >  #define pr_fmt(fmt)"reboot: " fmt
> >
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -220,7 +221,9 @@ void migrate_to_reboot_cpu(void)
> > /* The boot cpu is always logical cpu 0 */
> > int cpu = reboot_cpu;
> >
> > +#if !IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> > cpu_hotplug_disable();
> > +#endif
>
> You can write this as
>
> if (!IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT))
> cpu_hotplug_disable();
>
> That's what IS_ENABLED() is there for.
>
> >
> > /* Make certain the cpu I'm about to reboot on is online */
> > if (!cpu_online(cpu))
> > @@ -231,6 +234,11 @@ void migrate_to_reboot_cpu(void)
> >
> > /* Make certain I only run on the appropriate processor */
> > set_cpus_allowed_ptr(current, cpumask_of(cpu));
> > +
> > +#if IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> > +   /* Offline other cpus if possible */
> > +

Re: [PATCH v5] reboot: support offline CPUs before reboot

2020-01-15 Thread Rafael J. Wysocki

On Wed, Jan 15, 2020 at 7:35 AM Hsin-Yi Wang  wrote:
>
> Currently system reboots uses architecture specific codes (smp_send_stop)
> to offline non reboot CPUs. Most architecture's implementation is looping
> through all non reboot online CPUs and call ipi function to each of them. Some
> architecture like arm64, arm, and x86... would set offline masks to cpu 
> without
> really offline them. This causes some race condition and kernel warning comes
> out sometimes when system reboots.
>
> This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would offline 
> cpus in
> migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop 
> for
> checking online cpus would be an empty loop. If architecture don't enable this
> config, or some cpus somehow fails to offline, it would fallback to ipi
> function.
>
> Opt in this config for architectures that support CONFIG_HOTPLUG_CPU.
>
> Signed-off-by: Hsin-Yi Wang 
> ---
> Change from v4:
> * fix a few nits: naming, comments, remove Kconfig text...
>
> Change from v3:
> * Opt in config for architectures that support CONFIG_HOTPLUG_CPU
> * Merge function offline_secondary_cpus() and freeze_secondary_cpus()
>   with an additional flag.

This does not seem to be a very good idea, since
freeze_secondary_cpus() does much more than you need for reboot.

For reboot, you basically only need to do something like this AFAICS:

cpu_maps_update_begin();

for_each_online_cpu(i) {
if (i != cpu)
_cpu_down(i, 1, CPUHP_OFFLINE);
}
cpu_hotplug_disabled++;

cpu_maps_update_done();

And you may put this into a function defined outside of CONFIG_PM_SLEEP.

>
> Change from v2:
> * Add another config instead of configed by CONFIG_HOTPLUG_CPU

So why exactly is this new Kconfig option needed?

Everybody supporting CPU hotplug seems to opt in anyway.

[cut]

>
> -int freeze_secondary_cpus(int primary)
> +int freeze_secondary_cpus(int primary, bool reboot)
>  {
> int cpu, error = 0;
>
> @@ -1237,11 +1237,13 @@ int freeze_secondary_cpus(int primary)
> if (cpu == primary)
> continue;
>
> -   if (pm_wakeup_pending()) {
> +#ifdef CONFIG_PM_SLEEP
> +   if (!reboot && pm_wakeup_pending()) {
> pr_info("Wakeup pending. Abort CPU freeze\n");
> error = -EBUSY;
> break;
> }
> +#endif

Please avoid using #ifdefs in function bodies.  This makes the code
hard to maintain in the long term.

>
> trace_suspend_resume(TPS("CPU_OFF"), cpu, true);
> error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
> @@ -1250,7 +1252,9 @@ int freeze_secondary_cpus(int primary)
> cpumask_set_cpu(cpu, frozen_cpus);
> else {
> pr_err("Error taking CPU%d down: %d\n", cpu, error);
> -   break;
> +   /* When rebooting, offline as many CPUs as possible. 
> */
> +   if (!reboot)
> +   break;
> }
> }
>
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index c4d472b7f1b4..12f643b66e57 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -7,6 +7,7 @@
>
>  #define pr_fmt(fmt)"reboot: " fmt
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -220,7 +221,9 @@ void migrate_to_reboot_cpu(void)
> /* The boot cpu is always logical cpu 0 */
> int cpu = reboot_cpu;
>
> +#if !IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> cpu_hotplug_disable();
> +#endif

You can write this as

if (!IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT))
cpu_hotplug_disable();

That's what IS_ENABLED() is there for.

>
> /* Make certain the cpu I'm about to reboot on is online */
> if (!cpu_online(cpu))
> @@ -231,6 +234,11 @@ void migrate_to_reboot_cpu(void)
>
> /* Make certain I only run on the appropriate processor */
> set_cpus_allowed_ptr(current, cpumask_of(cpu));
> +
> +#if IS_ENABLED(CONFIG_ARCH_OFFLINE_CPUS_ON_REBOOT)
> +   /* Offline other cpus if possible */
> +   freeze_secondary_cpus(cpu, true);
> +#endif

The above comment applies here too.

>  }
>
>  /**
> --

79 matches

Mail list logo