Re: [PATCH] powerpc/makefile: remove check on obsolete GCC versions

2018-09-17 Thread Christophe LEROY




On 18/09/2018 at 07:48, Joel Stanley wrote:

Hey Christophe,

On Tue, 18 Sep 2018 at 15:13, Christophe Leroy  wrote:


Since commit cafa0010cd51 ("Raise the minimum required gcc version
to 4.6"), it is no longer possible to build the kernel with a GCC older than 4.6.

This patch removes checkbin tests addressing older versions of GCC.


This is the same as Nick's patch:

  https://patchwork.ozlabs.org/patch/969624/



Oops, thanks, I missed that.

And even before Nick's, there was this one:
https://patchwork.ozlabs.org/patch/962319/


So I missed twice :(

Christophe


Re: [RFC PATCH 10/11] powerpc/tm: Set failure summary

2018-09-17 Thread Michael Neuling
On Wed, 2018-09-12 at 16:40 -0300, Breno Leitao wrote:
> Since the transaction will be doomed with trecheckpoint, TEXASR[FS]
> should be set to reflect that the transaction is a failure. This patch
> ensures it is set before recheckpointing, and removes the changes from
> other places that were calling recheckpoint.

TEXASR[FS] should be set by the reclaim. I don't know why you'd need to set this
explicitly in process.c. The only case is when the user supplies a bad signal
context, but we should check that in the signals code, not process.c.

Hence I think this patch is wrong.
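
For the bad-context case, the check could live in the signal restore code,
along these lines (rough, hypothetical sketch):

	/* Reject a user-supplied TEXASR that claims the transaction has
	 * not failed, rather than silently fixing it up at recheckpoint. */
	if (!(tsk->thread.tm_texasr & TEXASR_FS))
		return -EINVAL;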

Also, according to the architecture, TEXASR[FS] HAS TO BE SET on trecheckpoint,
otherwise you'll get a TM Bad Thing. You should say that rather than suggesting
it's because the transaction is doomed. It's illegal to not do it. That's why we
have this check in arch/powerpc/kernel/tm.S:


/* Do final sanity check on TEXASR to make sure FS is set.  Do this
 * here before we load up the userspace r1 so any bugs we hit will get
 * a call chain */
mfspr   r5, SPRN_TEXASR
srdi    r5, r5, 16
li  r6, (TEXASR_FS)@h
and r6, r6, r5
1:  tdeqi   r6, 0
EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,0


Mikey

> Signed-off-by: Breno Leitao 
> ---
>  arch/powerpc/kernel/process.c   | 6 ++
>  arch/powerpc/kernel/signal_32.c | 2 --
>  arch/powerpc/kernel/signal_64.c | 2 --
>  3 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 5cace1b744b1..77725b2e4dc1 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -937,6 +937,12 @@ void tm_recheckpoint(struct thread_struct *thread)
>   local_irq_save(flags);
>   hard_irq_disable();
>  
> + /*
> +  * Make sure the failure summary is set, since the transaction will be
> +  * doomed.
> +  */
> + thread->tm_texasr |= TEXASR_FS;
> +
>   /* The TM SPRs are restored here, so that TEXASR.FS can be set
>* before the trecheckpoint and no explosion occurs.
>*/
> diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
> index 4a1b17409bf3..96956d50538e 100644
> --- a/arch/powerpc/kernel/signal_32.c
> +++ b/arch/powerpc/kernel/signal_32.c
> @@ -851,8 +851,6 @@ static long restore_tm_user_regs(struct pt_regs *regs,
>   /* Pull in the MSR TM bits from the user context */
>   regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr_hi & MSR_TS_MASK);
>  
> - /* Make sure the transaction is marked as failed */
> - current->thread.tm_texasr |= TEXASR_FS;
>   /* Make sure restore_tm_state will be called */
>   set_thread_flag(TIF_RESTORE_TM);
>  
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 32402aa23a5e..c84501711b14 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -569,8 +569,6 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
>   }
>   }
>  #endif
> - /* Make sure the transaction is marked as failed */
> - tsk->thread.tm_texasr |= TEXASR_FS;
>   /* Guarantee that restore_tm_state() will be called */
>   set_thread_flag(TIF_RESTORE_TM);
>  


Re: [PATCH] powerpc/makefile: remove check on obsolete GCC versions

2018-09-17 Thread Joel Stanley
Hey Christophe,

On Tue, 18 Sep 2018 at 15:13, Christophe Leroy  wrote:
>
> Since commit cafa0010cd51 ("Raise the minimum required gcc version
> to 4.6"), it is no longer possible to build the kernel with a GCC older than 4.6.
>
> This patch removes checkbin tests addressing older versions of GCC.

This is the same as Nick's patch:

 https://patchwork.ozlabs.org/patch/969624/

Cheers,

Joel


[PATCH] powerpc/makefile: remove check on obsolete GCC versions

2018-09-17 Thread Christophe Leroy
Since commit cafa0010cd51 ("Raise the minimum required gcc version
to 4.6"), it is no longer possible to build the kernel with a GCC older than 4.6.

This patch removes checkbin tests addressing older versions of GCC.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Makefile | 28 
 1 file changed, 28 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 8397c7bd5880..b33083bad840 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -405,35 +405,7 @@ archprepare: checkbin
 TOUT   := .tmp_gas_check
 
 # Check gcc and binutils versions:
-# - gcc-3.4 and binutils-2.14 are a fatal combination
-# - Require gcc 4.0 or above on 64-bit
-# - gcc-4.2.0 has issues compiling modules on 64-bit
 checkbin:
-   @if test "$(cc-name)" != "clang" \
-   && test "$(cc-version)" = "0304" ; then \
-   if ! /bin/echo mftb 5 | $(AS) -v -mppc -many -o $(TOUT) 
>/dev/null 2>&1 ; then \
-   echo -n '*** ${VERSION}.${PATCHLEVEL} kernels no longer 
build '; \
-   echo 'correctly with gcc-3.4 and your version of 
binutils.'; \
-   echo '*** Please upgrade your binutils or downgrade 
your gcc'; \
-   false; \
-   fi ; \
-   fi
-   @if test "$(cc-name)" != "clang" \
-   && test "$(cc-version)" -lt "0400" \
-   && test "x${CONFIG_PPC64}" = "xy" ; then \
-echo -n "Sorry, GCC v4.0 or above is required to build " ; \
-echo "the 64-bit powerpc kernel." ; \
-false ; \
-fi
-   @if test "$(cc-name)" != "clang" \
-   && test "$(cc-fullversion)" = "040200" \
-   && test "x${CONFIG_MODULES}${CONFIG_PPC64}" = "xyy" ; then \
-   echo -n '*** GCC-4.2.0 cannot compile the 64-bit powerpc ' ; \
-   echo 'kernel with modules enabled.' ; \
-   echo -n '*** Please use a different GCC version or ' ; \
-   echo 'disable kernel modules' ; \
-   false ; \
-   fi
@if test "x${CONFIG_CPU_LITTLE_ENDIAN}" = "xy" \
&& $(LD) --version | head -1 | grep ' 2\.24$$' >/dev/null ; then \
echo -n '*** binutils 2.24 miscompiles weak symbols ' ; \
-- 
2.13.3



Re: [RFC PATCH 09/11] powerpc/tm: Do not restore default DSCR

2018-09-17 Thread Michael Neuling
On Wed, 2018-09-12 at 16:40 -0300, Breno Leitao wrote:
> In the previous TM code, trecheckpoint was executed in the middle of an
> exception; thus, the DSCR was restored to the default kernel DSCR value
> after trecheckpoint was done.
> 
> With this patchset, trecheckpoint is executed just before returning to
> userspace, at ret_from_except_lite for example. Thus, we no longer need to
> set the default kernel DSCR value, as we are leaving kernel space.  It
> is OK to keep the checkpointed DSCR value in the live SPR, mainly because
> the transaction is doomed and will fail soon (after RFID), 

What if we are going back to a suspended transaction?  It will remain live until
userspace does a tresume.

> so,
> continuing with the pre-checkpointed DSCR value is what seems correct.

Reading this description suggests this patch isn't really needed. Right?

Mikey

> That said, we must set the DSCR value that will be used in userspace now.
> The current trecheckpoint() function sets it to the pre-checkpointed value
> just before the lines being changed in this patch, so removing these lines
> keeps the pre-checkpointed value.
> 
> It is important to say that we do not need to do the same thing with
> tm_reclaim, since it already sets the DSCR to the default value after
> TRECLAIM is called, in the following lines:
> 
> /* Load CPU's default DSCR */
> ld  r0, PACA_DSCR_DEFAULT(r13)
> mtspr   SPRN_DSCR, r0
> 
> Signed-off-by: Breno Leitao 
> ---
>  arch/powerpc/kernel/tm.S | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> index 6bffbc5affe7..5427eda69846 100644
> --- a/arch/powerpc/kernel/tm.S
> +++ b/arch/powerpc/kernel/tm.S
> @@ -493,10 +493,6 @@ restore_gprs:
>   mtlrr0
>   ld  r2, STK_GOT(r1)
>  
> - /* Load CPU's default DSCR */
> - ld  r0, PACA_DSCR_DEFAULT(r13)
> - mtspr   SPRN_DSCR, r0
> -
>   blr
>  
>   /* ** */


Re: [RFC PATCH 08/11] powerpc/tm: Do not reclaim on ptrace

2018-09-17 Thread Michael Neuling
On Wed, 2018-09-12 at 16:40 -0300, Breno Leitao wrote:
> Make sure that we are not suspended on ptrace and that the registers were
> already reclaimed.
> 
> Since the data was already reclaimed, there is nothing to be done here
> except to save the SPRs.
> 
> Signed-off-by: Breno Leitao 
> ---
>  arch/powerpc/kernel/ptrace.c | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
> index 9667666eb18e..cf6ee9154b11 100644
> --- a/arch/powerpc/kernel/ptrace.c
> +++ b/arch/powerpc/kernel/ptrace.c
> @@ -136,12 +136,10 @@ static void flush_tmregs_to_thread(struct task_struct *tsk)
>   if ((!cpu_has_feature(CPU_FTR_TM)) || (tsk != current))
>   return;
>  
> - if (MSR_TM_SUSPENDED(mfmsr())) {
> - tm_reclaim_current(TM_CAUSE_SIGNAL);
> - } else {
> - tm_enable();
> - tm_save_sprs(&(tsk->thread));
> - }
> + WARN_ON(MSR_TM_SUSPENDED(mfmsr()));
> +
> + tm_enable();
> + tm_save_sprs(&(tsk->thread));

Do we need to check if TM was enabled in the task before saving the TM SPRs?

What happens if TM was lazily off and hence we had someone else's TM SPRs in the
CPU currently?  Wouldn't this flush the wrong values to the task_struct?

I think we need to check the process's MSR before doing this.
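
Something like this, perhaps (untested sketch):

	/* Only save the TM SPRs if this task actually has TM enabled in
	 * its MSR; otherwise the CPU may still hold some other task's
	 * TM SPRs (lazy TM), and we'd flush the wrong values. */
	if (!(tsk->thread.regs->msr & MSR_TM))
		return;

	tm_enable();
	tm_save_sprs(&tsk->thread);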

Mikey

>  }
>  #else
>  static inline void flush_tmregs_to_thread(struct task_struct *tsk) { }


Re: [RFC PATCH 07/11] powerpc/tm: Do not recheckpoint at sigreturn

2018-09-17 Thread Michael Neuling
On Wed, 2018-09-12 at 16:40 -0300, Breno Leitao wrote:
> Do not recheckpoint at signal code return. Just make sure TIF_RESTORE_TM is
> set, which will cause restore_tm_state() to do the restore on exit to
> userspace.

Cool, but what about the same for reclaim? Why not avoid treclaim since it's
done on entry?

Mikey

> 
> All the FP and VEC lazy restore was already done by tm_reclaim_current(),
> which checked whether FP/VEC was set and filled out the ckfp and ckvr
> register areas with the expected values.
> 
> The current FP/VEC restoration is not necessary, since the transaction will
> be aborted and the checkpointed values will be restored.
> 
> Signed-off-by: Breno Leitao 
> ---
>  arch/powerpc/kernel/signal_32.c | 23 +++
>  arch/powerpc/kernel/signal_64.c | 15 ++-
>  2 files changed, 5 insertions(+), 33 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
> index e6474a45cef5..4a1b17409bf3 100644
> --- a/arch/powerpc/kernel/signal_32.c
> +++ b/arch/powerpc/kernel/signal_32.c
> @@ -850,28 +850,11 @@ static long restore_tm_user_regs(struct pt_regs *regs,
>   return 1;
>   /* Pull in the MSR TM bits from the user context */
>   regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr_hi & MSR_TS_MASK);
> - /* Now, recheckpoint.  This loads up all of the checkpointed (older)
> -  * registers, including FP and V[S]Rs.  After recheckpointing, the
> -  * transactional versions should be loaded.
> -  */
> - tm_enable();
> +
>   /* Make sure the transaction is marked as failed */
>   current->thread.tm_texasr |= TEXASR_FS;
> - /* This loads the checkpointed FP/VEC state, if used */
> - tm_recheckpoint(&current->thread);
> -
> - /* This loads the speculative FP/VEC state, if used */
> - msr_check_and_set(msr & (MSR_FP | MSR_VEC));
> - if (msr & MSR_FP) {
> - load_fp_state(&current->thread.fp_state);
> - regs->msr |= (MSR_FP | current->thread.fpexc_mode);
> - }
> -#ifdef CONFIG_ALTIVEC
> - if (msr & MSR_VEC) {
> - load_vr_state(&current->thread.vr_state);
> - regs->msr |= MSR_VEC;
> - }
> -#endif
> + /* Make sure restore_tm_state will be called */
> + set_thread_flag(TIF_RESTORE_TM);
>  
>   return 0;
>  }
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 83d51bf586c7..32402aa23a5e 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -569,21 +569,10 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
>   }
>   }
>  #endif
> - tm_enable();
>   /* Make sure the transaction is marked as failed */
>   tsk->thread.tm_texasr |= TEXASR_FS;
> - /* This loads the checkpointed FP/VEC state, if used */
> - tm_recheckpoint(&tsk->thread);
> -
> - msr_check_and_set(msr & (MSR_FP | MSR_VEC));
> - if (msr & MSR_FP) {
> - load_fp_state(&tsk->thread.fp_state);
> - regs->msr |= (MSR_FP | tsk->thread.fpexc_mode);
> - }
> - if (msr & MSR_VEC) {
> - load_vr_state(&tsk->thread.vr_state);
> - regs->msr |= MSR_VEC;
> - }
> + /* Guarantee that restore_tm_state() will be called */
> + set_thread_flag(TIF_RESTORE_TM);
>  
>   return err;
>  }


Re: [RFC PATCH 3/3] powerpc/mm/iommu: Allow migration of cma allocated pages during mm_iommu_get

2018-09-17 Thread David Gibson
On Mon, Sep 03, 2018 at 10:07:33PM +0530, Aneesh Kumar K.V wrote:
> Current code doesn't do page migration if the allocated page is a compound
> page. With HugeTLB migration support, we can end up allocating hugetlb
> pages from the CMA region. THP pages can also be allocated from the CMA
> region. This patch updates the code to handle compound pages correctly.
> 
> This adds a new helper, get_user_pages_cma_migrate. It does one
> get_user_pages with the right count, instead of doing one get_user_pages
> per page. That avoids reading the page table multiple times. The helper
> could possibly be used by other subsystems if we have more users.
> 
> The patch also converts the hpas member of mm_iommu_table_group_mem_t to a
> union. We use the same storage location to store pointers to struct page.
> We cannot update all the code paths to use struct page *, because we access
> hpas in real mode and we can't do the struct page * to pfn conversion in
> real mode.
> 
> Signed-off-by: Aneesh Kumar K.V 

This approach doesn't seem quite right to me.  It's specific to pages
mapped into the IOMMU.  It's true that it will address the obvious case
we have, of vfio-using guests fragmenting the CMA for other guests.

But AFAICT, fragmenting the CMA could happen with *any* locked memory,
not just things that are IOMMU mapped for VFIO.  So, for example, a
guest not using vfio, but using -realtime mlock=on, or an unrelated
program using locked memory (e.g. gpg or something else that locks
memory for security reasons).

AFAICT this approach won't fix the problem for that case.

> ---
>  arch/powerpc/mm/mmu_context_iommu.c | 195 ++--
>  1 file changed, 123 insertions(+), 72 deletions(-)
> 
> diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
> index f472965f7638..597b88a0abce 100644
> --- a/arch/powerpc/mm/mmu_context_iommu.c
> +++ b/arch/powerpc/mm/mmu_context_iommu.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  static DEFINE_MUTEX(mem_list_mutex);
>  
> @@ -30,8 +31,18 @@ struct mm_iommu_table_group_mem_t {
>   atomic64_t mapped;
>   unsigned int pageshift;
>   u64 ua; /* userspace address */
> - u64 entries;/* number of entries in hpas[] */
> - u64 *hpas;  /* vmalloc'ed */
> + u64 entries;/* number of entries in hpages[] */
> + /*
> +  * in mm_iommu_get we temporarily use this to store
> +  * struct page address.
> +  *
> +  * We need to convert ua to hpa in real mode. Make it
> +  * simpler by storing the physical address.
> +  */
> + union {
> + struct page **hpages;   /* vmalloc'ed */
> + phys_addr_t *hpas;
> + };
>  };
>  
>  static long mm_iommu_adjust_locked_vm(struct mm_struct *mm,
> @@ -75,62 +86,112 @@ bool mm_iommu_preregistered(struct mm_struct *mm)
>  EXPORT_SYMBOL_GPL(mm_iommu_preregistered);
>  
>  /*
> - * Taken from alloc_migrate_target with changes to remove CMA allocations
> + * Taken from alloc_migrate_target/alloc_migrate_huge_page with changes
> + * to remove CMA allocations
> + * Is this the right allocator for hugetlb?
>   */
>  struct page *new_iommu_non_cma_page(struct page *page, unsigned long private)
>  {
> - gfp_t gfp_mask = GFP_USER;
> - struct page *new_page;
> + /* is this the right nid? */
> + int nid = numa_mem_id();
> + gfp_t gfp_mask = GFP_HIGHUSER;
>  
> - if (PageCompound(page))
> - return NULL;
> + if (PageHuge(page)) {
>  
> - if (PageHighMem(page))
> - gfp_mask |= __GFP_HIGHMEM;
> + struct hstate *h = page_hstate(page);
> + /*
> +  * We don't want to dequeue from the pool because pool pages will
> +  * mostly be from the CMA region.
> +  */
> + return alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
>  
> - /*
> -  * We don't want the allocation to force an OOM if possibe
> -  */
> - new_page = alloc_page(gfp_mask | __GFP_NORETRY | __GFP_NOWARN);
> - return new_page;
> + } else if (PageTransHuge(page)) {
> + struct page *thp;
> + gfp_t thp_gfpmask = GFP_TRANSHUGE & ~__GFP_MOVABLE;
> +
> + thp = __alloc_pages_node(nid, thp_gfpmask, HPAGE_PMD_ORDER);
> + if (!thp)
> + return NULL;
> + prep_transhuge_page(thp);
> + return thp;
> + }
> + return __alloc_pages_node(nid, gfp_mask, 0);
>  }
>  
> -static int mm_iommu_move_page_from_cma(struct page *page)
> +int get_user_pages_cma_migrate(unsigned long start, int nr_pages, int write,
> +struct page **pages)
>  {
> - int ret = 0;
> - LIST_HEAD(cma_migrate_pages);
> -
> - /* Ignore huge pages for now */
> - if (PageCompound(page))
> - return -EBUSY;
> -
> - lru_add_drain();
> - ret = isolate_lru_page(page);
> - if 

Re: [RFC PATCH 06/11] powerpc/tm: Refactor the __switch_to_tm code

2018-09-17 Thread Michael Neuling
On Wed, 2018-09-12 at 16:40 -0300, Breno Leitao wrote:
> __switch_to_tm is the function that switches between two tasks which might
> have TM enabled. This function is clearly split in two parts: the task that
> is leaving the CPU, known as 'prev', and the task that is being scheduled,
> known as 'new'.
> 
> It starts by checking if the previous task had TM enabled; if so, it
> increments load_tm (this is the only place we increment load_tm). It also
> saves the TM SPRs here.
> 
> If the previous task was scheduled out with a transaction active, the
> failure cause needs to be updated, since it might contain the failure cause
> that caused the exception, such as TM_CAUSE_MISC. In this case, since there
> was a context switch, overwrite the failure cause.
> 
> If the previous task has overflowed load_tm, disable TM, putting the
> facility save/restore lazy mechanism into lazy mode.
> 
> Regarding the new task, when loading it, it basically restore the SPRs, and
> TIF_RESTORE_TM (already set by tm_reclaim_current if the transaction was
> active) would invoke the recheckpoint process later in restore_tm_state()
> if recheckpoint is somehow required.

This paragraph is a little awkwardly worded.  Can you rewrite?

> On top of that, both tm_reclaim_task() and tm_recheckpoint_new_task()
> functions are not used anymore, removing them.

What about tm_reclaim_current()?  This is being used in places like signals,
which I would have thought we could avoid with this series.

> 
> Signed-off-by: Breno Leitao 
> ---
>  arch/powerpc/kernel/process.c | 163 +++---
>  1 file changed, 74 insertions(+), 89 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index fe063c0142e3..5cace1b744b1 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -921,48 +921,6 @@ void tm_reclaim_current(uint8_t cause)
>   tm_reclaim_thread(&current->thread, cause);
>  }
>  
> -static inline void tm_reclaim_task(struct task_struct *tsk)
> -{
> - /* We have to work out if we're switching from/to a task that's in the
> -  * middle of a transaction.
> -  *
> -  * In switching we need to maintain a 2nd register state as
> -  * oldtask->thread.ckpt_regs.  We tm_reclaim(oldproc); this saves the
> -  * checkpointed (tbegin) state in ckpt_regs, ckfp_state and
> -  * ckvr_state
> -  *
> -  * We also context switch (save) TFHAR/TEXASR/TFIAR in here.
> -  */
> - struct thread_struct *thr = &tsk->thread;
> -
> - if (!thr->regs)
> - return;
> -
> - if (!MSR_TM_ACTIVE(thr->regs->msr))
> - goto out_and_saveregs;
> -
> - WARN_ON(tm_suspend_disabled);
> -
> - TM_DEBUG("--- tm_reclaim on pid %d (NIP=%lx, "
> -  "ccr=%lx, msr=%lx, trap=%lx)\n",
> -  tsk->pid, thr->regs->nip,
> -  thr->regs->ccr, thr->regs->msr,
> -  thr->regs->trap);
> -
> - tm_reclaim_thread(thr, TM_CAUSE_RESCHED);
> -
> - TM_DEBUG("--- tm_reclaim on pid %d complete\n",
> -  tsk->pid);
> -
> -out_and_saveregs:
> - /* Always save the regs here, even if a transaction's not active.
> -  * This context-switches a thread's TM info SPRs.  We do it here to
> -  * be consistent with the restore path (in recheckpoint) which
> -  * cannot happen later in _switch().
> -  */
> - tm_save_sprs(thr);
> -}
> -
>  extern void __tm_recheckpoint(struct thread_struct *thread);
>  
>  void tm_recheckpoint(struct thread_struct *thread)
> @@ -997,59 +955,87 @@ static void tm_fix_failure_cause(struct task_struct *task, uint8_t cause)
>   task->thread.tm_texasr |= (unsigned long) cause << 56;
>  }
>  
> -static inline void tm_recheckpoint_new_task(struct task_struct *new)
> +static inline void __switch_to_tm(struct task_struct *prev,

Can we just drop the __ ?

> + struct task_struct *new)
>  {
>   if (!cpu_has_feature(CPU_FTR_TM))
>   return;
>  
> - /* Recheckpoint the registers of the thread we're about to switch to.
> -  *
> -  * If the task was using FP, we non-lazily reload both the original and
> -  * the speculative FP register states.  This is because the kernel
> -  * doesn't see if/when a TM rollback occurs, so if we take an FP
> -  * unavailable later, we are unable to determine which set of FP regs
> -  * need to be restored.
> -  */
> - if (!tm_enabled(new))
> - return;
> -
> - if (!MSR_TM_ACTIVE(new->thread.regs->msr)){
> - tm_restore_sprs(&new->thread);
> - return;
> - }
> - /* Recheckpoint to restore original checkpointed register state. */
> - TM_DEBUG("*** tm_recheckpoint of pid %d (new->msr 0x%lx)\n",
> -  new->pid, new->thread.regs->msr);
> -
> - tm_recheckpoint(&new->thread);
> -
> - /*
> -  * The checkpointed state has been restored but the live state has
> -  * not, ensure all the math functionality is 

[PATCH v2] powerpc/configs: Update skiroot defconfig

2018-09-17 Thread Joel Stanley
Disable new features from recent releases, and clean out some other
unused options:

  - Enable EXPERT, so we can disable some things
  - Disable non-powerpc BPF decoders
  - Disable TASKSTATS
  - Disable unused syscalls
  - Set more things to be modules
  - Turn off unused network vendors
  - PPC_OF_BOOT_TRAMPOLINE and FB_OF are unused on powernv
  - Drop unused Radeon and Matrox GPU drivers
  - IPV6 support landed in petitboot
  - Bringup related command line powersave=off dropped, switch to quiet

Set CONFIG_I2C_CHARDEV=y as the module is not loaded automatically, and
without this i2cget etc. will fail in the skiroot environment.

This defconfig gets us build coverage of KERNEL_XZ, which was broken in
the 4.19 merge window for powerpc.

Signed-off-by: Joel Stanley 
---
v2: re-sync with version used in op-build
---
 arch/powerpc/configs/skiroot_defconfig | 154 +
 1 file changed, 108 insertions(+), 46 deletions(-)

diff --git a/arch/powerpc/configs/skiroot_defconfig b/arch/powerpc/configs/skiroot_defconfig
index 6bd5e7261335..cfdd08897a06 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -3,20 +3,17 @@ CONFIG_ALTIVEC=y
 CONFIG_VSX=y
 CONFIG_NR_CPUS=2048
 CONFIG_CPU_LITTLE_ENDIAN=y
+CONFIG_KERNEL_XZ=y
 # CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 # CONFIG_CROSS_MEMORY_ATTACH is not set
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
-CONFIG_TASKSTATS=y
-CONFIG_TASK_DELAY_ACCT=y
-CONFIG_TASK_XACCT=y
-CONFIG_TASK_IO_ACCOUNTING=y
+# CONFIG_CPU_ISOLATION is not set
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_LOG_BUF_SHIFT=20
-CONFIG_RELAY=y
 CONFIG_BLK_DEV_INITRD=y
 # CONFIG_RD_GZIP is not set
 # CONFIG_RD_BZIP2 is not set
@@ -24,8 +21,14 @@ CONFIG_BLK_DEV_INITRD=y
 # CONFIG_RD_LZO is not set
 # CONFIG_RD_LZ4 is not set
 CONFIG_CC_OPTIMIZE_FOR_SIZE=y
+CONFIG_EXPERT=y
+# CONFIG_SGETMASK_SYSCALL is not set
+# CONFIG_SYSFS_SYSCALL is not set
+# CONFIG_SHMEM is not set
+# CONFIG_AIO is not set
 CONFIG_PERF_EVENTS=y
 # CONFIG_COMPAT_BRK is not set
+CONFIG_SLAB_FREELIST_HARDENED=y
 CONFIG_JUMP_LABEL=y
 CONFIG_STRICT_KERNEL_RWX=y
 CONFIG_MODULES=y
@@ -35,7 +38,9 @@ CONFIG_MODULE_SIG_FORCE=y
 CONFIG_MODULE_SIG_SHA512=y
 CONFIG_PARTITION_ADVANCED=y
 # CONFIG_IOSCHED_DEADLINE is not set
+# CONFIG_PPC_VAS is not set
 # CONFIG_PPC_PSERIES is not set
+# CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
 CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
 CONFIG_CPU_IDLE=y
 CONFIG_HZ_100=y
@@ -48,8 +53,9 @@ CONFIG_NUMA=y
 CONFIG_PPC_64K_PAGES=y
 CONFIG_SCHED_SMT=y
 CONFIG_CMDLINE_BOOL=y
-CONFIG_CMDLINE="console=tty0 console=hvc0 powersave=off"
+CONFIG_CMDLINE="console=tty0 console=hvc0 ipr.fast_reboot=1 quiet"
 # CONFIG_SECCOMP is not set
+# CONFIG_PPC_MEM_KEYS is not set
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
@@ -60,7 +66,6 @@ CONFIG_SYN_COOKIES=y
 # CONFIG_INET_XFRM_MODE_TRANSPORT is not set
 # CONFIG_INET_XFRM_MODE_TUNNEL is not set
 # CONFIG_INET_XFRM_MODE_BEET is not set
-# CONFIG_IPV6 is not set
 CONFIG_DNS_RESOLVER=y
 # CONFIG_WIRELESS is not set
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
@@ -73,8 +78,10 @@ CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=65536
 CONFIG_VIRTIO_BLK=m
 CONFIG_BLK_DEV_NVME=m
-CONFIG_EEPROM_AT24=y
+CONFIG_NVME_MULTIPATH=y
+CONFIG_EEPROM_AT24=m
 # CONFIG_CXL is not set
+# CONFIG_OCXL is not set
 CONFIG_BLK_DEV_SD=m
 CONFIG_BLK_DEV_SR=m
 CONFIG_BLK_DEV_SR_VENDOR=y
@@ -85,7 +92,6 @@ CONFIG_SCSI_FC_ATTRS=y
 CONFIG_SCSI_CXGB3_ISCSI=m
 CONFIG_SCSI_CXGB4_ISCSI=m
 CONFIG_SCSI_BNX2_ISCSI=m
-CONFIG_BE2ISCSI=m
 CONFIG_SCSI_AACRAID=m
 CONFIG_MEGARAID_NEWGEN=y
 CONFIG_MEGARAID_MM=m
@@ -102,7 +108,7 @@ CONFIG_SCSI_VIRTIO=m
 CONFIG_SCSI_DH=y
 CONFIG_SCSI_DH_ALUA=m
 CONFIG_ATA=y
-CONFIG_SATA_AHCI=y
+CONFIG_SATA_AHCI=m
 # CONFIG_ATA_SFF is not set
 CONFIG_MD=y
 CONFIG_BLK_DEV_MD=m
@@ -119,25 +125,72 @@ CONFIG_DM_SNAPSHOT=m
 CONFIG_DM_MIRROR=m
 CONFIG_DM_ZERO=m
 CONFIG_DM_MULTIPATH=m
+# CONFIG_NET_VENDOR_3COM is not set
+# CONFIG_NET_VENDOR_ADAPTEC is not set
+# CONFIG_NET_VENDOR_AGERE is not set
+# CONFIG_NET_VENDOR_ALACRITECH is not set
 CONFIG_ACENIC=m
 CONFIG_ACENIC_OMIT_TIGON_I=y
-CONFIG_TIGON3=y
+# CONFIG_NET_VENDOR_AMAZON is not set
+# CONFIG_NET_VENDOR_AMD is not set
+# CONFIG_NET_VENDOR_AQUANTIA is not set
+# CONFIG_NET_VENDOR_ARC is not set
+# CONFIG_NET_VENDOR_ATHEROS is not set
+CONFIG_TIGON3=m
 CONFIG_BNX2X=m
-CONFIG_CHELSIO_T1=y
+# CONFIG_NET_VENDOR_BROCADE is not set
+# CONFIG_NET_CADENCE is not set
+# CONFIG_NET_VENDOR_CAVIUM is not set
+CONFIG_CHELSIO_T1=m
+# CONFIG_NET_VENDOR_CISCO is not set
+# CONFIG_NET_VENDOR_CORTINA is not set
+# CONFIG_NET_VENDOR_DEC is not set
+# CONFIG_NET_VENDOR_DLINK is not set
 CONFIG_BE2NET=m
-CONFIG_S2IO=m
-CONFIG_E100=m
+# CONFIG_NET_VENDOR_EZCHIP is not set
+# CONFIG_NET_VENDOR_HP is not set
+# CONFIG_NET_VENDOR_HUAWEI is not set
 CONFIG_E1000=m
-CONFIG_E1000E=m
+CONFIG_IGB=m
 CONFIG_IXGB=m
 CONFIG_IXGBE=m
+CONFIG_I40E=m
+CONFIG_S2IO=m
+# CONFIG_NET_VENDOR_MARVELL is not set
 

Re: [RFC PATCH 05/11] powerpc/tm: Function that updates the failure code

2018-09-17 Thread Michael Neuling
On Wed, 2018-09-12 at 16:40 -0300, Breno Leitao wrote:
> Now that the transaction reclaim happens much earlier, in the trap handler,
> it is impossible to know precisely, at that early time, what should be set
> as the failure cause for some specific cases. For example, if the task is
> later rescheduled, the transaction abort cause should be updated from
> TM_CAUSE_MISC to TM_CAUSE_RESCHED.
> 
> This patch creates a function that updates the TEXASR special purpose
> register in the task thread and sets the failure code, which will be
> moved to the live register afterward.
> 
> Signed-off-by: Breno Leitao 
> ---
>  arch/powerpc/kernel/process.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 54fddf03b97a..fe063c0142e3 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -85,6 +85,7 @@ extern unsigned long _get_SP(void);
>   * other paths that we should never reach with suspend disabled.
>   */
>  bool tm_suspend_disabled __ro_after_init = false;
> +static void tm_fix_failure_cause(struct task_struct *task, uint8_t cause);
>  
>  static void check_if_tm_restore_required(struct task_struct *tsk)
>  {
> @@ -988,6 +989,14 @@ void tm_recheckpoint(struct thread_struct *thread)
>   local_irq_restore(flags);
>  }
>  
> +/* Change thread->tm.texasr failure code */
> +static void tm_fix_failure_cause(struct task_struct *task, uint8_t cause)

I would just call this tm_change_failure_cause() and drop the comment above.

> +{
> + /* Clear the cause first */
> + task->thread.tm_texasr &= ~TEXASR_FC;
> + task->thread.tm_texasr |= (unsigned long) cause << 56;

56 == TEXASR_FC_LG;
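
So the body would read (sketch):

	task->thread.tm_texasr &= ~TEXASR_FC;
	task->thread.tm_texasr |= (unsigned long) cause << TEXASR_FC_LG;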


> +}
> +
>  static inline void tm_recheckpoint_new_task(struct task_struct *new)
>  {
>   if (!cpu_has_feature(CPU_FTR_TM))


Re: [RFC PATCH 01/11] powerpc/tm: Reclaim transaction on kernel entry

2018-09-17 Thread Michael Neuling
On Wed, 2018-09-12 at 16:40 -0300, Breno Leitao wrote:
> This patch creates a macro that will be invoked on every entry to the
> kernel, so that in kernel space the transaction will be completely
> reclaimed and no longer suspended.

There are still some calls to tm_reclaim_current() in process.c. These should
probably go now, right?

Mikey

> This patch checks if we are coming from PR; if not, it skips. This is useful
> when there is an irq_replay() being called after recheckpoint, when the IRQ
> is re-enabled. In this case, we do not want to re-reclaim and
> re-recheckpoint, thus, if not coming from PR, skip it completely.
> 
> This macro does not touch the TM SPRs either; they will only be saved and
> restored in the context switch code from now on.
> 
> This macro returns 0 or 1 in the r3 register, to specify whether a reclaim
> was executed or not.
> 
> This patchset is based on initial work done by Cyril:
> https://patchwork.ozlabs.org/cover/875341/
> 
> Signed-off-by: Breno Leitao 
> ---
>  arch/powerpc/include/asm/exception-64s.h | 46 
>  arch/powerpc/kernel/entry_64.S   | 10 ++
>  arch/powerpc/kernel/exceptions-64s.S | 12 +--
>  3 files changed, 66 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index a86fead0..db90b6d7826e 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -36,6 +36,7 @@
>   */
>  #include 
>  #include 
> +#include 
>  
>  /* PACA save area offsets (exgen, exmc, etc) */
>  #define EX_R90
> @@ -686,10 +687,54 @@ BEGIN_FTR_SECTION   \
>   beql    ppc64_runlatch_on_trampoline;   \
>  END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
>  
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +
> +/*
> + * This macro will reclaim a transaction if called when coming from userspace
> + * (MSR.PR = 1) and if the transaction state is active or suspended.
> + *
> + * Since we don't want to reclaim when coming from the kernel, for instance
> + * after a trecheckpoint or an IRQ replay, the live MSR is not useful;
> + * instead, the MSR from the thread stack is used to check the MSR.PR bit.
> + * This macro has one argument, which is the cause that will be used by
> + * treclaim, and returns '1' in r3 if the reclaim happens or '0' if it
> + * didn't, which is useful to know which registers were clobbered.
> + *
> + * NOTE: If additional registers are clobbered here, make sure the callee
> + * function restores them before proceeding.
> + */
> +#define TM_KERNEL_ENTRY(cause)  \
> + ld  r3, _MSR(r1);   \
> + andi.   r0, r3, MSR_PR; /* Coming from userspace? */\
> + beq 1f; /* Skip reclaim if MSR.PR != 1 */   \
> + rldicl. r0, r3, (64-MSR_TM_LG), 63; /* Is TM enabled? */\
> + beq 1f; /* Skip reclaim if TM is off */ \
> + rldicl. r0, r3, (64-MSR_TS_LG), 62; /* Is active */ \
> + beq 1f; /* Skip reclaim if neither */   \
> + /*  \
> +  * If there is a transaction active or suspended, save the  \
> +  * non-volatile GPRs if they are not already saved. \
> +  */ \
> + bl  save_nvgprs;\
> + /*  \
> +  * Soft disable the IRQs, otherwise it might cause a CPU hang.  \
> +  */ \
> + RECONCILE_IRQ_STATE(r10, r11);  \
> + li  r3, cause;  \
> + bl  tm_reclaim_current; \
> + li  r3, 1;  /* Reclaim happened */  \
> + b   2f; \
> +1:   li  r3, 0;  /* Reclaim didn't happen */ \
> +2:
> +#else
> +#define TM_KERNEL_ENTRY(cause)
> +#endif
> +
>  #define EXCEPTION_COMMON(area, trap, label, hdlr, ret, additions) \
>   EXCEPTION_PROLOG_COMMON(trap, area);\
>   /* Volatile regs are potentially clobbered here */  \
>   additions;  \
> + TM_KERNEL_ENTRY(TM_CAUSE_MISC); \
>   addir3,r1,STACK_FRAME_OVERHEAD; \
>   bl  hdlr;   \
>   b   ret
> @@ -704,6 +749,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
>   EXCEPTION_PROLOG_COMMON_3(trap);\
>   /* Volatile regs are potentially clobbered here */  \
>   additions;  

Re: [RFC PATCH 05/11] powerpc/tm: Function that updates the failure code

2018-09-17 Thread Michael Neuling
On Wed, 2018-09-12 at 16:40 -0300, Breno Leitao wrote:
> Now that the transaction reclaim happens much earlier, in the trap handler,
> it is impossible to know precisely, at that early time, what should be set
> as the failure cause for some specific cases. For example, if the task is
> later rescheduled, the transaction abort cause should be updated from
> TM_CAUSE_MISC to TM_CAUSE_RESCHED.

Please add comments where this is used (in the EXCEPTION_COMMON macro, I think)
that say this might happen.

> 
> This patch creates a function that updates the TEXASR special purpose
> register in the task thread and sets the failure code, which will be
> moved to the live register afterward.
> 
> Signed-off-by: Breno Leitao 
> ---
>  arch/powerpc/kernel/process.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 54fddf03b97a..fe063c0142e3 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -85,6 +85,7 @@ extern unsigned long _get_SP(void);
>   * other paths that we should never reach with suspend disabled.
>   */
>  bool tm_suspend_disabled __ro_after_init = false;
> +static void tm_fix_failure_cause(struct task_struct *task, uint8_t cause);
>  
>  static void check_if_tm_restore_required(struct task_struct *tsk)
>  {
> @@ -988,6 +989,14 @@ void tm_recheckpoint(struct thread_struct *thread)
>   local_irq_restore(flags);
>  }
>  
> +/* Change thread->tm.texasr failure code */
> +static void tm_fix_failure_cause(struct task_struct *task, uint8_t cause)
> +{
> + /* Clear the cause first */
> + task->thread.tm_texasr &= ~TEXASR_FC;
> + task->thread.tm_texasr |= (unsigned long) cause << 56;
> +}
> +
>  static inline void tm_recheckpoint_new_task(struct task_struct *new)
>  {
>   if (!cpu_has_feature(CPU_FTR_TM))


Re: [PATCH v2 2/5] powerpc/boot: Fix crt0.S syntax for clang

2018-09-17 Thread Joel Stanley
On Tue, 18 Sep 2018 at 06:11, Nick Desaulniers  wrote:
>
> On Fri, Sep 14, 2018 at 2:08 PM Segher Boessenkool
>  wrote:
> >
> > On Fri, Sep 14, 2018 at 10:47:08AM -0700, Nick Desaulniers wrote:
> > > On Thu, Sep 13, 2018 at 9:07 PM Joel Stanley  wrote:
> > > >  10:    addis   r12,r12,(-RELACOUNT)@ha
> > > > -   cmpdi   r12,RELACOUNT@l
> > > > +   cmpdi   r12,(RELACOUNT)@l
> > >
> > > Yep, as we can see above, when RELACOUNT is negated, it's wrapped in
> > > parens.
>
> Looks like this was just fixed in Clang-8:
> https://bugs.llvm.org/show_bug.cgi?id=38945
> https://reviews.llvm.org/D52188

Nice!

mpe, given we need the local references to labels fix, which is also in
clang-8, I suggest we drop this patch.

Cheers,

Joel


RE: [PATCH] powerpc/mpc85xx: fix issues in clock node

2018-09-17 Thread Andy Tang
Hi Scott,

Could you please take a look at this patch?

Thanks,
Andy

> -Original Message-
> From: andy.t...@nxp.com 
> Sent: September 11, 2018 10:12
> To: o...@buserror.net
> Cc: robh...@kernel.org; mark.rutl...@arm.com;
> b...@kernel.crashing.org; devicet...@vger.kernel.org;
> linuxppc-dev@lists.ozlabs.org; Andy Tang 
> Subject: [PATCH] powerpc/mpc85xx: fix issues in clock node
> 
> From: Yuantian Tang 
> 
> The compatible string in the clock node is not correct.
> The clocks property refers to the wrong node too.
> This patch fixes them.
> 
> Signed-off-by: Tang Yuantian 
> ---
>  arch/powerpc/boot/dts/fsl/t1023si-post.dtsi |8 
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/boot/dts/fsl/t1023si-post.dtsi b/arch/powerpc/boot/dts/fsl/t1023si-post.dtsi
> index 4908af5..763caf4 100644
> --- a/arch/powerpc/boot/dts/fsl/t1023si-post.dtsi
> +++ b/arch/powerpc/boot/dts/fsl/t1023si-post.dtsi
> @@ -348,7 +348,7 @@
>   mux0: mux0@0 {
>   #clock-cells = <0>;
>   reg = <0x0 4>;
> - compatible = "fsl,core-mux-clock";
> + compatible = "fsl,qoriq-core-mux-2.0";
> clocks = <&pll0 0>, <&pll0 1>;
>   clock-names = "pll0_0", "pll0_1";
>   clock-output-names = "cmux0";
> @@ -356,9 +356,9 @@
>   mux1: mux1@20 {
>   #clock-cells = <0>;
>   reg = <0x20 4>;
> - compatible = "fsl,core-mux-clock";
> - clocks = <&pll0 0>, <&pll0 1>;
> - clock-names = "pll0_0", "pll0_1";
> + compatible = "fsl,qoriq-core-mux-2.0";
> + clocks = <&pll1 0>, <&pll1 1>;
> + clock-names = "pll1_0", "pll1_1";
>   clock-output-names = "cmux1";
>   };
>   };
> --
> 1.7.1



[PATCH 4.18 084/158] perf tools: Allow overriding MAX_NR_CPUS at compile time

2018-09-17 Thread Greg Kroah-Hartman
4.18-stable review patch.  If anyone has any objections, please let me know.

--

From: Christophe Leroy 

[ Upstream commit 21b8732eb4479b579bda9ee38e62b2c312c2a0e5 ]

After update of kernel, the perf tool doesn't run anymore on my 32MB RAM
powerpc board, but still runs on a 128MB RAM board:

  ~# strace perf
  execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot 
allocate memory)
  --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  +++ killed by SIGSEGV +++
  Segmentation fault

objdump -x shows that .bss section has a huge size of 24Mbytes:

 27 .bss  016baca8  101cebb8  101cebb8  001cd988  2**3

With especially the following objects having quite big size:

  10205f80 l O .bss 00140000 runtime_cycles_stats
  10345f80 l O .bss 00140000 runtime_stalled_cycles_front_stats
  10485f80 l O .bss 00140000 runtime_stalled_cycles_back_stats
  105c5f80 l O .bss 00140000 runtime_branches_stats
  10705f80 l O .bss 00140000 runtime_cacherefs_stats
  10845f80 l O .bss 00140000 runtime_l1_dcache_stats
  10985f80 l O .bss 00140000 runtime_l1_icache_stats
  10ac5f80 l O .bss 00140000 runtime_ll_cache_stats
  10c05f80 l O .bss 00140000 runtime_itlb_cache_stats
  10d45f80 l O .bss 00140000 runtime_dtlb_cache_stats
  10e85f80 l O .bss 00140000 runtime_cycles_in_tx_stats
  10fc5f80 l O .bss 00140000 runtime_transaction_stats
  11105f80 l O .bss 00140000 runtime_elision_stats
  11245f80 l O .bss 00140000 runtime_topdown_total_slots
  11385f80 l O .bss 00140000 runtime_topdown_slots_retired
  114c5f80 l O .bss 00140000 runtime_topdown_slots_issued
  11605f80 l O .bss 00140000 runtime_topdown_fetch_bubbles
  11745f80 l O .bss 00140000 runtime_topdown_recovery_bubbles

This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus
to 1024"), because many tables are sized with MAX_NR_CPUS

This patch gives the opportunity to redefine MAX_NR_CPUS via

  $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1

Signed-off-by: Christophe Leroy 
Cc: Alexander Shishkin 
Cc: Peter Zijlstra 
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/20170922112043.8349468...@po15668-vm-win7.idsi0.si.c-s.fr
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 tools/perf/perf.h |2 ++
 1 file changed, 2 insertions(+)

--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -25,7 +25,9 @@ static inline unsigned long long rdclock
return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
 }
 
+#ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS 1024
+#endif
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;




[PATCH 4.14 046/126] perf tools: Allow overriding MAX_NR_CPUS at compile time

2018-09-17 Thread Greg Kroah-Hartman
4.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Christophe Leroy 

[ Upstream commit 21b8732eb4479b579bda9ee38e62b2c312c2a0e5 ]

After update of kernel, the perf tool doesn't run anymore on my 32MB RAM
powerpc board, but still runs on a 128MB RAM board:

  ~# strace perf
  execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot 
allocate memory)
  --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  +++ killed by SIGSEGV +++
  Segmentation fault

objdump -x shows that .bss section has a huge size of 24Mbytes:

 27 .bss  016baca8  101cebb8  101cebb8  001cd988  2**3

With especially the following objects having quite big size:

  10205f80 l O .bss 00140000 runtime_cycles_stats
  10345f80 l O .bss 00140000 runtime_stalled_cycles_front_stats
  10485f80 l O .bss 00140000 runtime_stalled_cycles_back_stats
  105c5f80 l O .bss 00140000 runtime_branches_stats
  10705f80 l O .bss 00140000 runtime_cacherefs_stats
  10845f80 l O .bss 00140000 runtime_l1_dcache_stats
  10985f80 l O .bss 00140000 runtime_l1_icache_stats
  10ac5f80 l O .bss 00140000 runtime_ll_cache_stats
  10c05f80 l O .bss 00140000 runtime_itlb_cache_stats
  10d45f80 l O .bss 00140000 runtime_dtlb_cache_stats
  10e85f80 l O .bss 00140000 runtime_cycles_in_tx_stats
  10fc5f80 l O .bss 00140000 runtime_transaction_stats
  11105f80 l O .bss 00140000 runtime_elision_stats
  11245f80 l O .bss 00140000 runtime_topdown_total_slots
  11385f80 l O .bss 00140000 runtime_topdown_slots_retired
  114c5f80 l O .bss 00140000 runtime_topdown_slots_issued
  11605f80 l O .bss 00140000 runtime_topdown_fetch_bubbles
  11745f80 l O .bss 00140000 runtime_topdown_recovery_bubbles

This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus
to 1024"), because many tables are sized with MAX_NR_CPUS

This patch gives the opportunity to redefine MAX_NR_CPUS via

  $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1

Signed-off-by: Christophe Leroy 
Cc: Alexander Shishkin 
Cc: Peter Zijlstra 
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/20170922112043.8349468...@po15668-vm-win7.idsi0.si.c-s.fr
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 tools/perf/perf.h |2 ++
 1 file changed, 2 insertions(+)

--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -24,7 +24,9 @@ static inline unsigned long long rdclock
return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
 }
 
+#ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS 1024
+#endif
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;




Re: [PATCH RFC 1/4] PCI: hotplug: Add parameter to put devices to reset during rescan

2018-09-17 Thread Bjorn Helgaas
[+cc Russell, Ben, Oliver, linuxppc-dev]

On Mon, Sep 17, 2018 at 11:55:43PM +0300, Sergey Miroshnichenko wrote:
> Hello Sam,
> 
> On 9/17/18 8:28 AM, Sam Bobroff wrote:
> > Hi Sergey,
> > 
> > On Fri, Sep 14, 2018 at 07:14:01PM +0300, Sergey Miroshnichenko wrote:
> >> Introduce a new command line option "pci=pcie_movable_bars" that indicates
> >> support of PCIe hotplug without prior reservation of memory regions by
> >> BIOS/bootloader.
> >>
> >> If a new PCIe device has been hot-plugged between two active ones, which
> >> have no (or not big enough) gap between their BARs, allocating new BARs
> >> requires to move BARs of the following working devices:
> >>
> >> 1)   dev 4
> >>|
> >>v
> >> .. |  dev 3  |  dev 3  |  dev 5  |  dev 7  |
> >> .. |  BAR 0  |  BAR 1  |  BAR 0  |  BAR 0  |
> >>
> >> 2) dev 4
> >>  |
> >>  v
> >> .. |  dev 3  |  dev 3  | -->   --> |  dev 5  |  dev 7  |
> >> .. |  BAR 0  |  BAR 1  | -->   --> |  BAR 0  |  BAR 0  |
> >>
> >> 3)
> >>
> >> .. |  dev 3  |  dev 3  |  dev 4  |  dev 4  |  dev 5  |  dev 7  |
> >> .. |  BAR 0  |  BAR 1  |  BAR 0  |  BAR 1  |  BAR 0  |  BAR 0  |
> >>
> >> Not only BARs, but also bridge windows can be updated during a PCIe rescan,
> >> threatening all memory transactions during this procedure, so the PCI
> >> subsystem will instruct the drivers to pause by calling the reset_prepare()
> >> and reset_done() callbacks.
> >>
> >> If a device may be affected by BAR movement, the BAR changes tracking must
> >> be implemented in its driver.
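
For illustration, a driver that can cope with its BARs moving would hook
this through the existing pci_error_handlers callbacks, roughly like the
sketch below (the foo_* names and helpers are hypothetical):

static void foo_reset_prepare(struct pci_dev *pdev)
{
	struct foo_priv *priv = pci_get_drvdata(pdev);

	/* quiesce DMA and MMIO before the BARs are rewritten */
	foo_pause_io(priv);
}

static void foo_reset_done(struct pci_dev *pdev)
{
	struct foo_priv *priv = pci_get_drvdata(pdev);

	/* re-read the (possibly moved) BARs, remap and resume */
	foo_resume_io(priv);
}

static const struct pci_error_handlers foo_err_handlers = {
	.reset_prepare	= foo_reset_prepare,
	.reset_done	= foo_reset_done,
};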
> >>
> >> Signed-off-by: Sergey Miroshnichenko 
> >> ---
> >>  .../admin-guide/kernel-parameters.txt |  6 +++
> >>  drivers/pci/pci.c |  2 +
> >>  drivers/pci/probe.c   | 43 +++
> >>  include/linux/pci.h   |  1 +
> >>  4 files changed, 52 insertions(+)
> >>
> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> >> index 64a3bf54b974..f8132a709061 100644
> >> --- a/Documentation/admin-guide/kernel-parameters.txt
> >> +++ b/Documentation/admin-guide/kernel-parameters.txt
> >> @@ -3311,6 +3311,12 @@
> >>bridges without forcing it upstream. Note:
> >>this removes isolation between devices and
> >>may put more devices in an IOMMU group.
> >> +  pcie_movable_bars   Arrange a space at runtime for BARs of
> >> +  hotplugged PCIe devices - usable if bootloader
> >> +  doesn't reserve memory regions for them. Freeing
> >> +  a space may require moving BARs of active devices
> >> +  to higher addresses, so device drivers will be
> >> +  paused during rescan.
> >>  
> >>pcie_aspm=  [PCIE] Forcibly enable or disable PCIe Active State 
> >> Power
> >>Management.
> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> >> index 1835f3a7aa8d..5f07a59b5924 100644
> >> --- a/drivers/pci/pci.c
> >> +++ b/drivers/pci/pci.c
> >> @@ -6105,6 +6105,8 @@ static int __init pci_setup(char *str)
> >>pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
> >>} else if (!strncmp(str, "disable_acs_redir=", 18)) {
> >>disable_acs_redir_param = str + 18;
> >> +  } else if (!strncmp(str, "pcie_movable_bars", 17)) {
> >> +  pci_add_flags(PCI_MOVABLE_BARS);
> >>} else {
> >>printk(KERN_ERR "PCI: Unknown option `%s'\n",
> >>str);
> >> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> >> index 201f9e5ff55c..bdaafc48dc4c 100644
> >> --- a/drivers/pci/probe.c
> >> +++ b/drivers/pci/probe.c
> >> @@ -3138,6 +3138,45 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
> >>return max;
> >>  }
> >>  
> >> +/*
> >> + * Put all devices of the bus and its children to reset
> >> + */
> >> +static void pci_bus_reset_prepare(struct pci_bus *bus)
> >> +{
> >> +  struct pci_dev *dev;
> >> +
> >> +  list_for_each_entry(dev, &bus->devices, bus_list) {
> >> +  struct pci_bus *child = dev->subordinate;
> >> +
> >> +  if (child) {
> >> +  pci_bus_reset_prepare(child);
> >> +  } else if (dev->driver &&
> >> + dev->driver->err_handler &&
> >> + dev->driver->err_handler->reset_prepare) {
> >> +  dev->driver->err_handler->reset_prepare(dev);
> >> +  }
> > 
> > What about devices with drivers that don't have reset_prepare()?  It
> > looks like it will just reconfigure them anyway. Is that 

[PATCH 4.9 31/70] perf tools: Allow overriding MAX_NR_CPUS at compile time

2018-09-17 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Christophe Leroy 

[ Upstream commit 21b8732eb4479b579bda9ee38e62b2c312c2a0e5 ]

After update of kernel, the perf tool doesn't run anymore on my 32MB RAM
powerpc board, but still runs on a 128MB RAM board:

  ~# strace perf
  execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot 
allocate memory)
  --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  +++ killed by SIGSEGV +++
  Segmentation fault

objdump -x shows that .bss section has a huge size of 24Mbytes:

 27 .bss  016baca8  101cebb8  101cebb8  001cd988  2**3

With especially the following objects having quite big size:

  10205f80 l O .bss 00140000 runtime_cycles_stats
  10345f80 l O .bss 00140000 runtime_stalled_cycles_front_stats
  10485f80 l O .bss 00140000 runtime_stalled_cycles_back_stats
  105c5f80 l O .bss 00140000 runtime_branches_stats
  10705f80 l O .bss 00140000 runtime_cacherefs_stats
  10845f80 l O .bss 00140000 runtime_l1_dcache_stats
  10985f80 l O .bss 00140000 runtime_l1_icache_stats
  10ac5f80 l O .bss 00140000 runtime_ll_cache_stats
  10c05f80 l O .bss 00140000 runtime_itlb_cache_stats
  10d45f80 l O .bss 00140000 runtime_dtlb_cache_stats
  10e85f80 l O .bss 00140000 runtime_cycles_in_tx_stats
  10fc5f80 l O .bss 00140000 runtime_transaction_stats
  11105f80 l O .bss 00140000 runtime_elision_stats
  11245f80 l O .bss 00140000 runtime_topdown_total_slots
  11385f80 l O .bss 00140000 runtime_topdown_slots_retired
  114c5f80 l O .bss 00140000 runtime_topdown_slots_issued
  11605f80 l O .bss 00140000 runtime_topdown_fetch_bubbles
  11745f80 l O .bss 00140000 runtime_topdown_recovery_bubbles

This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus
to 1024"), because many tables are sized with MAX_NR_CPUS

This patch gives the opportunity to redefine MAX_NR_CPUS via

  $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1

Signed-off-by: Christophe Leroy 
Cc: Alexander Shishkin 
Cc: Peter Zijlstra 
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/20170922112043.8349468...@po15668-vm-win7.idsi0.si.c-s.fr
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 tools/perf/perf.h |2 ++
 1 file changed, 2 insertions(+)

--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -22,7 +22,9 @@ static inline unsigned long long rdclock
return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
 }
 
+#ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS 1024
+#endif
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;




[PATCH 4.4 24/56] perf tools: Allow overriding MAX_NR_CPUS at compile time

2018-09-17 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Christophe Leroy 

[ Upstream commit 21b8732eb4479b579bda9ee38e62b2c312c2a0e5 ]

After update of kernel, the perf tool doesn't run anymore on my 32MB RAM
powerpc board, but still runs on a 128MB RAM board:

  ~# strace perf
  execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot 
allocate memory)
  --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  +++ killed by SIGSEGV +++
  Segmentation fault

objdump -x shows that .bss section has a huge size of 24Mbytes:

 27 .bss  016baca8  101cebb8  101cebb8  001cd988  2**3

With especially the following objects having quite big size:

  10205f80 l O .bss 00140000 runtime_cycles_stats
  10345f80 l O .bss 00140000 runtime_stalled_cycles_front_stats
  10485f80 l O .bss 00140000 runtime_stalled_cycles_back_stats
  105c5f80 l O .bss 00140000 runtime_branches_stats
  10705f80 l O .bss 00140000 runtime_cacherefs_stats
  10845f80 l O .bss 00140000 runtime_l1_dcache_stats
  10985f80 l O .bss 00140000 runtime_l1_icache_stats
  10ac5f80 l O .bss 00140000 runtime_ll_cache_stats
  10c05f80 l O .bss 00140000 runtime_itlb_cache_stats
  10d45f80 l O .bss 00140000 runtime_dtlb_cache_stats
  10e85f80 l O .bss 00140000 runtime_cycles_in_tx_stats
  10fc5f80 l O .bss 00140000 runtime_transaction_stats
  11105f80 l O .bss 00140000 runtime_elision_stats
  11245f80 l O .bss 00140000 runtime_topdown_total_slots
  11385f80 l O .bss 00140000 runtime_topdown_slots_retired
  114c5f80 l O .bss 00140000 runtime_topdown_slots_issued
  11605f80 l O .bss 00140000 runtime_topdown_fetch_bubbles
  11745f80 l O .bss 00140000 runtime_topdown_recovery_bubbles

This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus
to 1024"), because many tables are sized with MAX_NR_CPUS

This patch gives the opportunity to redefine MAX_NR_CPUS via

  $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1

Signed-off-by: Christophe Leroy 
Cc: Alexander Shishkin 
Cc: Peter Zijlstra 
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/20170922112043.8349468...@po15668-vm-win7.idsi0.si.c-s.fr
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 tools/perf/perf.h |2 ++
 1 file changed, 2 insertions(+)

--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -29,7 +29,9 @@ static inline unsigned long long rdclock
return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
 }
 
+#ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS 1024
+#endif
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;




Re: [PATCH -next] fsl/qe: Fix copy-paste error in ucc_get_tdm_sync_shift

2018-09-17 Thread Li Yang
On Sat, Sep 15, 2018 at 6:11 AM YueHaibing  wrote:
>
> If 'mode' is COMM_DIR_TX, 'shift' should use TX_SYNC_SHIFT_BASE.
>
> Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode")
> Signed-off-by: YueHaibing 

Thanks for submitting the patch, but there is already the same fix in
the queue from Zhao Qiang and Dan Carpenter.

> ---
>  drivers/soc/fsl/qe/ucc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/soc/fsl/qe/ucc.c b/drivers/soc/fsl/qe/ucc.c
> index c646d87..681f7d4 100644
> --- a/drivers/soc/fsl/qe/ucc.c
> +++ b/drivers/soc/fsl/qe/ucc.c
> @@ -626,7 +626,7 @@ static u32 ucc_get_tdm_sync_shift(enum comm_dir mode, u32 tdm_num)
>  {
> u32 shift;
>
> -   shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : RX_SYNC_SHIFT_BASE;
> +   shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : TX_SYNC_SHIFT_BASE;
> shift -= tdm_num * 2;
>
> return shift;
> --
> 1.8.3.1
>
>


Re: [PATCH] powerpc/pseries: Disable CPU hotplug across migrations

2018-09-17 Thread Tyrel Datwyler
On 09/17/2018 12:14 PM, Nathan Fontenot wrote:
> When performing partition migrations all present CPUs must be online
> as all present CPUs must make the H_JOIN call as part of the migration
> process. Once all present CPUs make the H_JOIN call, one CPU is returned
> to make the rtas call to perform the migration to the destination system.
> 
> During testing of migration and changing the SMT state we have found
> instances where CPUs are offlined, as part of the SMT state change,
> before they make the H_JOIN call. This results in a hung system where
> every CPU is either in H_JOIN or offline.
> 
> To prevent this, this patch disables CPU hotplug during the migration
> process.
> 
> Signed-off-by: Nathan Fontenot 

Reviewed-by: Tyrel Datwyler 



[PATCH] powerpc/pseries: Disable CPU hotplug across migrations

2018-09-17 Thread Nathan Fontenot
When performing partition migrations all present CPUs must be online
as all present CPUs must make the H_JOIN call as part of the migration
process. Once all present CPUs make the H_JOIN call, one CPU is returned
to make the rtas call to perform the migration to the destination system.

During testing of migration and changing the SMT state we have found
instances where CPUs are offlined, as part of the SMT state change,
before they make the H_JOIN call. This results in a hung system where
every CPU is either in H_JOIN or offline.

To prevent this, this patch disables CPU hotplug during the migration
process.

Signed-off-by: Nathan Fontenot 
---
 arch/powerpc/kernel/rtas.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 8afd146bc9c7..2c7ed31c736e 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -981,6 +981,7 @@ int rtas_ibm_suspend_me(u64 handle)
goto out;
}
 
+   cpu_hotplug_disable();
stop_topology_update();
 
/* Call function on all CPUs.  One of us will make the
@@ -995,6 +996,7 @@ int rtas_ibm_suspend_me(u64 handle)
printk(KERN_ERR "Error doing global join\n");
 
start_topology_update();
+   cpu_hotplug_enable();
 
/* Take down CPUs not online prior to suspend */
cpuret = rtas_offline_cpus_mask(offline_mask);



Re: [PATCH 3/3] mm: optimise pte dirty/accessed bit setting by demand based pte insertion

2018-09-17 Thread Nicholas Piggin
On Wed, 5 Sep 2018 07:29:51 -0700
Guenter Roeck  wrote:

> Hi,
> 
> On Tue, Aug 28, 2018 at 09:20:34PM +1000, Nicholas Piggin wrote:
> > Similarly to the previous patch, this tries to optimise dirty/accessed
> > bits in ptes to avoid access costs of hardware setting them.
> >   
> 
> This patch results in silent nios2 boot failures, silent meaning that
> the boot stalls.

Okay I just got back to looking at this. The reason for the hang is
I think a bug in the nios2 TLB code, but maybe other archs have similar
issues.

In case of a missing / !present Linux pte, nios2 installs a TLB entry
with no permissions via its fast TLB exception handler (software TLB
fill). Then it relies on that causing a TLB permission exception in a
slower handler that calls handle_mm_fault to set the Linux pte and
flushes the old TLB. Then the fast exception handler will find the new
Linux pte.

With this patch, nios2 has a case where handle_mm_fault does not flush
the old TLB, which results in the TLB permission exception continually
being retried.

What happens now is that fault paths like do_read_fault will install a
Linux pte with the young bit clear and return. That will cause nios2 to
fault again but this time go down the bottom of handle_pte_fault and to
the access flags update with the young bit set. The young bit is seen to
be different, so that causes ptep_set_access_flags to do a TLB flush and
that finally allows the fast TLB handler to fire and pick up the new
Linux pte.

With this patch, the young bit is set in the first handle_mm_fault, so
the second handle_mm_fault no longer sees the ptes are different and
does not flush the TLB. The spurious fault handler also does not flush
them unless FAULT_FLAG_WRITE is set.

What nios2 should do is invalidate the TLB in update_mmu_cache. What it
*really* should do is install the new TLB entry; I have some patches to
make that work in qemu that I can submit. But I would like to try getting
these dirty/accessed bit optimisations into 4.20, so I will send a simple
patch to just do the TLB invalidate that could go in Andrew's git tree.
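
For reference, a minimal sketch of that simple invalidate, assuming
nios2's flush_tlb_page() can be called from here; this only illustrates
the idea, it is not the actual patch:

#include <linux/mm.h>
#include <asm/tlbflush.h>

void update_mmu_cache(struct vm_area_struct *vma,
                      unsigned long address, pte_t *ptep)
{
        /*
         * The fast handler may have installed a no-permission TLB entry
         * for this address; drop it so the next fast-path fill picks up
         * the new Linux pte instead of retrying the permission exception
         * forever.
         */
        flush_tlb_page(vma, address);
}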

Is that agreeable with the nios2 maintainers?

Thanks,
Nick


Re: [PATCH 2/2] powerpc/32: stack protector: change the canary value per task

2018-09-17 Thread Segher Boessenkool
On Mon, Sep 17, 2018 at 12:15:08PM +, Christophe Leroy wrote:
>  I would have liked to use -mstack-protector-guard=tls -mstack-protector-guard-reg=r2
>  -mstack-protector-guard-offset=offsetof(struct task_struct, stack_canary)
>  but I have not found how to set the value of offsetof(struct task_struct,
>  stack_canary) in the Makefile.

By far the easiest is to have the canary at a fixed offset from r2.


Segher


Re: [PATCH 1/2] powerpc: initial stack protector (-fstack-protector) support

2018-09-17 Thread kbuild test robot
Hi Christophe,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.19-rc4 next-20180913]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-initial-stack-protector-fstack-protector-support/20180917-202227
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-ppc6xx_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/powermac/bootx_init.o: In function `bootx_printf':
>> bootx_init.c:(.init.text+0x2bc): undefined reference to 
>> `__stack_chk_fail_local'
   arch/powerpc/platforms/powermac/bootx_init.o: In function 
`bootx_add_display_props.isra.1':
   bootx_init.c:(.init.text+0x750): undefined reference to 
`__stack_chk_fail_local'
   arch/powerpc/platforms/powermac/bootx_init.o: In function 
`bootx_scan_dt_build_struct':
   bootx_init.c:(.init.text+0xa84): undefined reference to 
`__stack_chk_fail_local'
   arch/powerpc/platforms/powermac/bootx_init.o: In function `bootx_init':
   bootx_init.c:(.init.text+0xf48): undefined reference to 
`__stack_chk_fail_local'
   powerpc-linux-gnu-ld: .tmp_vmlinux1: hidden symbol `__stack_chk_fail_local' 
isn't defined
   powerpc-linux-gnu-ld: final link failed: Bad value

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] powerpc: fix csum_ipv6_magic() on little endian platforms

2018-09-17 Thread Christophe LEROY

Hi Michael,

On 10/09/2018 at 16:28, Xin Long wrote:

On Mon, Sep 10, 2018 at 2:09 PM Christophe Leroy
 wrote:


On little endian platforms, csum_ipv6_magic() keeps len and proto in
CPU byte order. This generates bad results, leading to ICMPv6 packets
from other hosts being dropped by powerpc64le platforms.

In order to fix this, len and proto should be converted to network
byte order, ie big endian byte order. However, checksumming 0x12345678
and 0x56341278 provides the exact same result, so it is enough to
rotate the sum of len and proto by 1 byte.

PPC32 only supports big endian, so the fix is needed for PPC64 only.
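
To see why the rotation trick is safe: bytes that keep their even/odd
column contribute identically to a 16-bit ones' complement sum. A small
stand-alone user-space check of the claim above (illustration only, not
part of the patch):

#include <stdint.h>
#include <stdio.h>

static uint16_t csum16(uint32_t v)
{
        uint32_t s = (v >> 16) + (v & 0xffff);  /* add the two 16-bit words */

        s = (s >> 16) + (s & 0xffff);           /* fold any carry back in */
        return (uint16_t)s;
}

int main(void)
{
        /* prints "68ac 68ac": both byte orderings checksum identically */
        printf("%04x %04x\n", csum16(0x12345678), csum16(0x56341278));
        return 0;
}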

Fixes: e9c4943a107b ("powerpc: Implement csum_ipv6_magic in assembly")
Reported-by: Jianlin Shi 
Reported-by: Xin Long 
Cc:  # 4.18+
Signed-off-by: Christophe Leroy 
---
  arch/powerpc/lib/checksum_64.S | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/lib/checksum_64.S b/arch/powerpc/lib/checksum_64.S
index 886ed94b9c13..2a68c43e13f5 100644
--- a/arch/powerpc/lib/checksum_64.S
+++ b/arch/powerpc/lib/checksum_64.S
@@ -443,6 +443,9 @@ _GLOBAL(csum_ipv6_magic)
 addcr0, r8, r9
 ld  r10, 0(r4)
 ld  r11, 8(r4)
+#ifndef CONFIG_CPU_BIG_ENDIAN
+   rotldi  r5, r5, 8
+#endif
 adder0, r0, r10
 add r5, r5, r7
 adder0, r0, r11
--
2.13.3


Tested-by: Xin Long 



Could you take this fix for 4.19 ?

Unless someone takes it through the netdev tree ?

Thanks
Christophe


[PATCH 2/2] powerpc/32: stack protector: change the canary value per task

2018-09-17 Thread Christophe Leroy
Partially copied from commit df0698be14c66 ("ARM: stack protector:
change the canary value per task")

A new random value for the canary is stored in the task struct whenever
a new task is forked.  This is meant to allow for different canary values
per task.  On powerpc, GCC expects the canary value to be found in a global
variable called __stack_chk_guard.  So this variable has to be updated
with the value stored in the task struct whenever a task switch occurs.

Because the variable GCC expects is global, this cannot work on SMP
unfortunately.  So, on SMP, the same initial canary value is kept
throughout, making this feature a bit less effective although it is still
useful.

Signed-off-by: Christophe Leroy 
---
 I would have liked to use -mstack-protector-guard=tls -mstack-protector-guard-reg=r2
 -mstack-protector-guard-offset=offsetof(struct task_struct, stack_canary)
 but I have not found how to set the value of offsetof(struct task_struct,
 stack_canary) in the Makefile.
 Any idea ?

 arch/powerpc/kernel/asm-offsets.c | 3 +++
 arch/powerpc/kernel/entry_32.S| 5 +
 2 files changed, 8 insertions(+)

diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 89cf15566c4e..cb02d23764ca 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -89,6 +89,9 @@ int main(void)
DEFINE(THREAD_INFO_GAP, _ALIGN_UP(sizeof(struct thread_info), 16));
OFFSET(KSP_LIMIT, thread_struct, ksp_limit);
 #endif /* CONFIG_PPC64 */
+#ifdef CONFIG_CC_STACKPROTECTOR
+   DEFINE(TSK_STACK_CANARY, offsetof(struct task_struct, stack_canary));
+#endif
 
 #ifdef CONFIG_LIVEPATCH
OFFSET(TI_livepatch_sp, thread_info, livepatch_sp);
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index e58c3f467db5..0cdb4170a21d 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -721,6 +721,11 @@ BEGIN_FTR_SECTION
mtspr   SPRN_SPEFSCR,r0 /* restore SPEFSCR reg */
 END_FTR_SECTION_IFSET(CPU_FTR_SPE)
 #endif /* CONFIG_SPE */
+#if defined(CONFIG_CC_STACKPROTECTOR) && !defined(CONFIG_SMP)
+   lwz r0, TSK_STACK_CANARY(r2)
+   lis r4, __stack_chk_guard@ha
+   stw r0, __stack_chk_guard@l(r4)
+#endif
 
lwz r0,_CCR(r1)
mtcrf   0xFF,r0
-- 
2.13.3
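
As an aside, a C rendering of the entry_32.S hunk above may help; this
is an illustrative sketch under the same non-SMP assumption, with a
made-up helper name, not code from the patch:

#include <linux/sched.h>

extern unsigned long __stack_chk_guard;

/* On task switch, copy the incoming task's canary into the global that
 * code built with -mstack-protector-guard=global checks in epilogues. */
static inline void update_stack_canary(struct task_struct *next)
{
#if defined(CONFIG_CC_STACKPROTECTOR) && !defined(CONFIG_SMP)
        __stack_chk_guard = next->stack_canary;
#endif
}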



[PATCH 1/2] powerpc: initial stack protector (-fstack-protector) support

2018-09-17 Thread Christophe Leroy
Partialy copied from commit c743f38013aef ("ARM: initial stack protector
(-fstack-protector) support")

This is the very basic stuff without the changing canary upon
task switch yet.  Just the Kconfig option and a constant canary
value initialized at boot time.

This patch was tentatively added in the past (commit 6533b7c16ee5
("powerpc: Initial stack protector (-fstack-protector) support"))
but had to be reverted (commit f2574030b0e3 ("powerpc: Revert the
initial stack protector support")) because GCC implemented it
differently depending on whether it had been built with libc support
or not.

Now, GCC offers the possibility to manually set the
stack-protector mode (global or tls) regardless of libc support.

This time, the patch selects HAVE_STACKPROTECTOR only if
-mstack-protector-guard=global is supported by GCC.

 $ echo CORRUPT_STACK > /sys/kernel/debug/provoke-crash/DIRECT
[  134.943666] Kernel panic - not syncing: stack-protector: Kernel stack is 
corrupted in: lkdtm_CORRUPT_STACK+0x64/0x64
[  134.943666]
[  134.955414] CPU: 0 PID: 283 Comm: sh Not tainted 
4.18.0-s3k-dev-12143-ga3272be41209 #835
[  134.963380] Call Trace:
[  134.965860] [c6615d60] [c001f76c] panic+0x118/0x260 (unreliable)
[  134.971775] [c6615dc0] [c001f654] panic+0x0/0x260
[  134.976435] [c6615dd0] [c032c368] lkdtm_CORRUPT_STACK_STRONG+0x0/0x64
[  134.982769] [c6615e00] [] 0x

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/Makefile |  4 +++
 arch/powerpc/include/asm/stackprotector.h | 41 +++
 arch/powerpc/kernel/Makefile  |  4 +++
 arch/powerpc/kernel/process.c |  6 +
 5 files changed, 56 insertions(+)
 create mode 100644 arch/powerpc/include/asm/stackprotector.h

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index db0b6eebbfa5..3f5776ed99d3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -181,6 +181,7 @@ config PPC
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_CBPF_JITif !PPC64
+   select HAVE_STACKPROTECTOR  if $(cc-option,-mstack-protector-guard=global)
select HAVE_CONTEXT_TRACKINGif PPC64
select HAVE_DEBUG_KMEMLEAK
select HAVE_DEBUG_STACKOVERFLOW
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 8397c7bd5880..0dbfdb6a145d 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -112,6 +112,10 @@ LDFLAGS+= -m elf$(BITS)$(LDEMULATION)
 KBUILD_ARFLAGS += --target=elf$(BITS)-$(GNUTARGET)
 endif
 
+ifdef CONFIG_STACKPROTECTOR
+KBUILD_CFLAGS  += -mstack-protector-guard=global
+endif
+
 LDFLAGS_vmlinux-y := -Bstatic
 LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie
 LDFLAGS_vmlinux:= $(LDFLAGS_vmlinux-y)
diff --git a/arch/powerpc/include/asm/stackprotector.h 
b/arch/powerpc/include/asm/stackprotector.h
new file mode 100644
index ..2556e227cdb2
--- /dev/null
+++ b/arch/powerpc/include/asm/stackprotector.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * GCC stack protector support.
+ *
+ * Stack protector works by putting predefined pattern at the start of
+ * the stack frame and verifying that it hasn't been overwritten when
+ * returning from the function.  The pattern is called stack canary
+ * and gcc expects it to be defined by a global variable called
+ * "__stack_chk_guard" on PPC.  This unfortunately means that on SMP
+ * we cannot have a different canary value per task.
+ */
+
+#ifndef _ASM_STACKPROTECTOR_H
+#define _ASM_STACKPROTECTOR_H
+
+#include <linux/random.h>
+#include <linux/version.h>
+#include <asm/reg.h>
+
+extern unsigned long __stack_chk_guard;
+
+/*
+ * Initialize the stackprotector canary value.
+ *
+ * NOTE: this must only be called from functions that never return,
+ * and it must always be inlined.
+ */
+static __always_inline void boot_init_stack_canary(void)
+{
+   unsigned long canary;
+
+   /* Try to get a semi random initial value. */
+   get_random_bytes(&canary, sizeof(canary));
+   canary ^= mftb();
+   canary ^= LINUX_VERSION_CODE;
+
+   current->stack_canary = canary;
+   __stack_chk_guard = current->stack_canary;
+}
+
+#endif /* _ASM_STACKPROTECTOR_H */
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 3b66f2c19c84..0556a7243d2a 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -20,6 +20,10 @@ CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 
+# -fstack-protector triggers protection checks in this code,
+# but it is being used too early to link to meaningful stack_chk logic.
+CFLAGS_prom_init.o += $(call cc-option, -fno-stack-protector)
+
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace early boot code
 CFLAGS_REMOVE_cputable.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
diff --git a/arch/powerpc/kernel/process.c 

Re: [PATCH 4.4.y] crypto: vmx - Fix sleep-in-atomic bugs

2018-09-17 Thread Greg Kroah-Hartman
On Mon, Sep 10, 2018 at 09:42:04AM +0200, Ondrej Mosnacek wrote:
> commit 0522236d4f9c5ab2e79889cb020d1acbe5da416e upstream.
> 
> Conflicts:
>   drivers/crypto/vmx/
> aes_cbc.c - adapted enable/disable calls to v4.4 state
> aes_xts.c - did not exist yet in v4.4

Now applied, thanks.

greg k-h


Re: [PATCH 4.4.y] crypto: vmx - Fix sleep-in-atomic bugs

2018-09-17 Thread Greg Kroah-Hartman
On Mon, Sep 10, 2018 at 09:42:04AM +0200, Ondrej Mosnacek wrote:
> commit 0522236d4f9c5ab2e79889cb020d1acbe5da416e upstream.
> 
> Conflicts:
>   drivers/crypto/vmx/
> aes_cbc.c - adapted enable/disable calls to v4.4 state
> aes_xts.c - did not exist yet in v4.4

We don't need these lines here...  I'll go hand edit it out...



Re: [PATCH v2 05/17] compat_ioctl: move more drivers to generic_compat_ioctl_ptrarg

2018-09-17 Thread Jonathan Cameron
On Wed, 12 Sep 2018 17:08:52 +0200
Arnd Bergmann  wrote:

> The .ioctl and .compat_ioctl file operations have the same prototype so
> they can both point to the same function, which works great almost all
> the time when all the commands are compatible.
> 
> One exception is the s390 architecture, where a compat pointer is only
> 31 bit wide, and converting it into a 64-bit pointer requires calling
> compat_ptr(). Most drivers here will never run on s390, but since we now
> have a generic helper for it, it's easy enough to use it consistently.
> 
> I double-checked all these drivers to ensure that all ioctl arguments
> are used as pointers or are ignored, but are not interpreted as integer
> values.
> 
> Signed-off-by: Arnd Bergmann 
> ---

For IIO part.

Acked-by: Jonathan Cameron 

Thanks,
> diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
> index a062cfddc5af..22844b94b0e9 100644
> --- a/drivers/iio/industrialio-core.c
> +++ b/drivers/iio/industrialio-core.c
> @@ -1630,7 +1630,7 @@ static const struct file_operations iio_buffer_fileops 
> = {
>   .owner = THIS_MODULE,
>   .llseek = noop_llseek,
>   .unlocked_ioctl = iio_ioctl,
> - .compat_ioctl = iio_ioctl,
> + .compat_ioctl = generic_compat_ioctl_ptrarg,
>  };
>  
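
For readers unfamiliar with the helper named in Arnd's commit message, a
sketch of what a generic compat handler for pointer-argument ioctls can
look like; this is inferred from the description above, not necessarily
the exact helper from the series:

#include <linux/compat.h>
#include <linux/fs.h>

static long generic_compat_ioctl_ptrarg(struct file *file, unsigned int cmd,
                                        unsigned long arg)
{
        if (!file->f_op->unlocked_ioctl)
                return -ENOIOCTLCMD;

        /* widen the 31-bit compat pointer before calling the native handler */
        return file->f_op->unlocked_ioctl(file, cmd,
                                          (unsigned long)compat_ptr(arg));
}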



Re: How to handle PTE tables with non contiguous entries ?

2018-09-17 Thread Christophe LEROY




On 17/09/2018 at 11:03, Aneesh Kumar K.V wrote:

Christophe Leroy  writes:


Hi,

I'm having a hard time figuring out the best way to handle the following
situation:

On the powerpc8xx, handling 16k size pages requires page tables
with 4 identical entries.


I assume that hugetlb page size? If so isn't that similar to FSL hugetlb
page table layout?


No, it is not for a 16k hugepage size with a standard page size of 4k.

Here I'm trying to handle the case of CONFIG_PPC_16K_PAGES.
As of today, it is implemented by using the standard Linux page layout,
ie one PTE entry for each 16k page. This forbids the use of the 8xx HW
assistance.






Initially I was thinking about handling this by simply modifying
pte_index() and changing the pte_t type in order to have one entry every
16 bytes, then replicating the PTE value at *ptep, *ptep+1, *ptep+2 and
*ptep+3 in both set_pte_at() and pte_update().

However, this doesn't work because many, many places in the mm core part
of the kernel use loops on ptep with a single ptep++ increment.

Therefore I did it with the following hack:

   /* PTE level */
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t;
+#else
   typedef struct { pte_basic_t pte; } pte_t;
+#endif

@@ -181,7 +192,13 @@ static inline unsigned long pte_update(pte_t *p,
  : "cc" );
   #else /* PTE_ATOMIC_UPDATES */
  unsigned long old = pte_val(*p);
-   *p = __pte((old & ~clr) | set);
+   unsigned long new = (old & ~clr) | set;
+
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+   p->pte = p->pte1 = p->pte2 = p->pte3 = new;
+#else
+   *p = __pte(new);
+#endif
   #endif /* !PTE_ATOMIC_UPDATES */

   #ifdef CONFIG_44x


@@ -161,7 +161,11 @@ static inline void __set_pte_at(struct mm_struct
*mm, unsigned long addr,
  /* Anything else just stores the PTE normally. That covers all
64-bit
   * cases, and 32-bit non-hash with 32-bit PTEs.
   */
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+   ptep->pte = ptep->pte1 = ptep->pte2 = ptep->pte3 = pte_val(pte);
+#else
  *ptep = pte;
+#endif



But I'm not too happy with it as it means pte_t is not a single type
anymore so passing it from one function to the other is quite heavy.


Would someone have an idea of an elegant way to handle that ?

Thanks
Christophe


Why would pte_update bother about updating all the 4 entries? Can you
help me understand the issue?


Because the 8xx HW assistance expects 4 identical entries for each 16k
page, so every time a PTE is updated the 4 entries have to be updated.
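
A minimal sketch of how that replication could be centralised, assuming
the 4-word pte_t from the quoted hack; the helper name is made up for
illustration:

static inline void pte_set_replicated(pte_t *p, pte_basic_t val)
{
        /* the 8xx HW assistance wants all 4 sub-entries identical */
        p->pte = p->pte1 = p->pte2 = p->pte3 = val;
}

set_pte_at() and pte_update() could then share this single code path.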


Christophe


Re: How to handle PTE tables with non contiguous entries ?

2018-09-17 Thread Aneesh Kumar K.V
Christophe Leroy  writes:

> Hi,
>
> I'm having a hard time figuring out the best way to handle the following 
> situation:
>
> On the powerpc8xx, handling 16k size pages requires page tables
> with 4 identical entries.

I assume that hugetlb page size? If so isn't that similar to FSL hugetlb
page table layout?

>
> Initially I was thinking about handling this by simply modifying
> pte_index() and changing the pte_t type in order to have one entry every
> 16 bytes, then replicating the PTE value at *ptep, *ptep+1, *ptep+2 and
> *ptep+3 in both set_pte_at() and pte_update().
>
> However, this doesn't work because many, many places in the mm core part
> of the kernel use loops on ptep with a single ptep++ increment.
>
> Therefore I did it with the following hack:
>
>   /* PTE level */
> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
> +typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t;
> +#else
>   typedef struct { pte_basic_t pte; } pte_t;
> +#endif
>
> @@ -181,7 +192,13 @@ static inline unsigned long pte_update(pte_t *p,
>  : "cc" );
>   #else /* PTE_ATOMIC_UPDATES */
>  unsigned long old = pte_val(*p);
> -   *p = __pte((old & ~clr) | set);
> +   unsigned long new = (old & ~clr) | set;
> +
> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
> +   p->pte = p->pte1 = p->pte2 = p->pte3 = new;
> +#else
> +   *p = __pte(new);
> +#endif
>   #endif /* !PTE_ATOMIC_UPDATES */
>
>   #ifdef CONFIG_44x
>
>
> @@ -161,7 +161,11 @@ static inline void __set_pte_at(struct mm_struct 
> *mm, unsigned long addr,
>  /* Anything else just stores the PTE normally. That covers all 
> 64-bit
>   * cases, and 32-bit non-hash with 32-bit PTEs.
>   */
> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
> +   ptep->pte = ptep->pte1 = ptep->pte2 = ptep->pte3 = pte_val(pte);
> +#else
>  *ptep = pte;
> +#endif
>
>
>
> But I'm not too happy with it as it means pte_t is not a single type 
> anymore so passing it from one function to the other is quite heavy.
>
>
> Would someone have an idea of an elegant way to handle that ?
>
> Thanks
> Christophe

Why would pte_update bother about updating all the 4 entries? Can you
help me understand the issue?

-aneesh



Re: [PATCH 05/12] powerpc/64s/hash: Use POWER6 SLBIA IH=1 variant in switch_slb

2018-09-17 Thread Nicholas Piggin
On Mon, 17 Sep 2018 11:38:35 +0530
"Aneesh Kumar K.V"  wrote:

> Nicholas Piggin  writes:
> 
> > The SLBIA IH=1 hint will remove all non-zero SLBEs, but only
> > invalidate ERAT entries associated with a class value of 1, for
> > processors that support the hint (e.g., POWER6 and newer), which
> > Linux assigns to user addresses.
> >
> > This prevents kernel ERAT entries from being invalidated when
> > context switching (if the thread faulted in more than 8 user SLBEs).
> 
> 
> how about renaming stuff to indicate kernel ERAT entries are kept?
> something like slb_flush_and_rebolt_user()? 

User mappings aren't bolted though. I consider rebolt to mean update
the bolted kernel mappings when something has changed (like vmalloc
segment update). That doesn't need to be done here, so I think this
is okay. I can add a comment though.

Thanks,
Nick

> 
> >
> > Signed-off-by: Nicholas Piggin 
> > ---
> >  arch/powerpc/mm/slb.c | 38 +++---
> >  1 file changed, 23 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> > index a5e58f11d676..03fa1c663ccf 100644
> > --- a/arch/powerpc/mm/slb.c
> > +++ b/arch/powerpc/mm/slb.c
> > @@ -128,13 +128,21 @@ void slb_flush_all_realmode(void)
> > asm volatile("slbmte %0,%0; slbia" : : "r" (0));
> >  }
> >  
> > -static void __slb_flush_and_rebolt(void)
> > +void slb_flush_and_rebolt(void)
> >  {
> > /* If you change this make sure you change SLB_NUM_BOLTED
> >  * and PR KVM appropriately too. */
> > unsigned long linear_llp, lflags;
> > unsigned long ksp_esid_data, ksp_vsid_data;
> >  
> > +   WARN_ON(!irqs_disabled());
> > +
> > +   /*
> > +* We can't take a PMU exception in the following code, so hard
> > +* disable interrupts.
> > +*/
> > +   hard_irq_disable();
> > +
> > linear_llp = mmu_psize_defs[mmu_linear_psize].sllp;
> > lflags = SLB_VSID_KERNEL | linear_llp;
> >  
> > @@ -160,20 +168,7 @@ static void __slb_flush_and_rebolt(void)
> >  :: "r"(ksp_vsid_data),
> > "r"(ksp_esid_data)
> >  : "memory");
> > -}
> >  
> > -void slb_flush_and_rebolt(void)
> > -{
> > -
> > -   WARN_ON(!irqs_disabled());
> > -
> > -   /*
> > -* We can't take a PMU exception in the following code, so hard
> > -* disable interrupts.
> > -*/
> > -   hard_irq_disable();
> > -
> > -   __slb_flush_and_rebolt();
> > get_paca()->slb_cache_ptr = 0;
> >  }
> >  
> > @@ -248,7 +243,20 @@ void switch_slb(struct task_struct *tsk, struct 
> > mm_struct *mm)
> >  
> > asm volatile("isync" : : : "memory");
> > } else {
> > -   __slb_flush_and_rebolt();
> > +   struct slb_shadow *p = get_slb_shadow();
> > +   unsigned long ksp_esid_data =
> > +   be64_to_cpu(p->save_area[KSTACK_INDEX].esid);
> > +   unsigned long ksp_vsid_data =
> > +   be64_to_cpu(p->save_area[KSTACK_INDEX].vsid);
> > +
> > +   asm volatile("isync\n"
> > +PPC_SLBIA(1) "\n"
> > +"slbmte%0,%1\n"
> > +"isync"
> > +:: "r"(ksp_vsid_data),
> > +   "r"(ksp_esid_data));
> > +
> > +   asm volatile("isync" : : : "memory");
> > }
> >  
> > get_paca()->slb_cache_ptr = 0;
> > -- 
> > 2.18.0  
> 



Re: [PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed

2018-09-17 Thread Nicholas Piggin
On Mon, 17 Sep 2018 11:30:16 +0530
"Aneesh Kumar K.V"  wrote:

> Nicholas Piggin  writes:
> 
> > The POWER5 < DD2.1 issue is that slbie needs to be issued more than
> > once. It came in with this change:
> >
> > ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, da...@gibson.dropbear.id.au
> >   [PATCH] POWER5 erratum workaround
> >
> >   Early POWER5 revisions (<DD2.1) require slbie
> >   instructions to be repeated under some circumstances.  The patch below
> >   adds a workaround (patch made by Anton Blanchard).  
> 
> Thanks for extracting this. Can we add this to the code?

The comment? Sure.

> Also I am not
> sure what is repeated here? Is it that we just need one extra slbie (hence
> only applicable to offset == 1) or is it that we need to make sure there
> is always one extra slbie? The code does the former.

Yeah it has always done the former, so my assumption is that you just
need more than one slbie. I don't think we need to bother revisiting
that assumption unless someone can pull up something definitive.

What I did change is that slbia no longer has the additional slbie, but
I think there are strong reasons not to need that.

>  Do you have a link to
> that email patch?

I tried looking through the archives around that date but could not
find it. That came from a bitkeeper log.

Thanks,
Nick


Re: [PATCH v2 11/24] powerpc/mm: don't use _PAGE_EXEC for calling hash_preload()

2018-09-17 Thread Aneesh Kumar K.V
Christophe Leroy  writes:

> The 'access' parameter of hash_preload() is either 0 or _PAGE_EXEC.
> Among the two versions of hash_preload(), only the PPC64 one is
> doing something with this 'access' parameter.
>
> In order to remove the use of _PAGE_EXEC outside platform code,
> 'access' parameter is replaced by 'is_exec' which will be either
> true of false, and the PPC64 version of hash_preload() creates
> the access flag based on 'is_exec'.
>

Reviewed-by: Aneesh Kumar K.V 

> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/mm/hash_utils_64.c | 3 ++-
>  arch/powerpc/mm/mem.c   | 9 +
>  arch/powerpc/mm/mmu_decl.h  | 2 +-
>  arch/powerpc/mm/pgtable_32.c| 2 +-
>  arch/powerpc/mm/ppc_mmu_32.c| 2 +-
>  5 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index f23a89d8e4ce..b8ce0e8cc608 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -1482,7 +1482,7 @@ static bool should_hash_preload(struct mm_struct *mm, 
> unsigned long ea)
>  #endif
>  
>  void hash_preload(struct mm_struct *mm, unsigned long ea,
> -   unsigned long access, unsigned long trap)
> +   bool is_exec, unsigned long trap)
>  {
>   int hugepage_shift;
>   unsigned long vsid;
> @@ -1490,6 +1490,7 @@ void hash_preload(struct mm_struct *mm, unsigned long 
> ea,
>   pte_t *ptep;
>   unsigned long flags;
>   int rc, ssize, update_flags = 0;
> + unsigned long access = _PAGE_PRESENT | _PAGE_READ | (is_exec ? _PAGE_EXEC : 0);
>  
>   BUG_ON(REGION_ID(ea) != USER_REGION_ID);
>  
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 31bd9b53c358..0ba0cdb3f759 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -507,7 +507,8 @@ void update_mmu_cache(struct vm_area_struct *vma, 
> unsigned long address,
>* We don't need to worry about _PAGE_PRESENT here because we are
>* called with either mm->page_table_lock held or ptl lock held
>*/
> - unsigned long access, trap;
> + unsigned long trap;
> + bool is_exec;
>  
>   if (radix_enabled()) {
>   prefetch((void *)address);
> @@ -529,16 +530,16 @@ void update_mmu_cache(struct vm_area_struct *vma, 
> unsigned long address,
>   trap = current->thread.regs ? TRAP(current->thread.regs) : 0UL;
>   switch (trap) {
>   case 0x300:
> - access = 0UL;
> + is_exec = false;
>   break;
>   case 0x400:
> - access = _PAGE_EXEC;
> + is_exec = true;
>   break;
>   default:
>   return;
>   }
>  
> - hash_preload(vma->vm_mm, address, access, trap);
> + hash_preload(vma->vm_mm, address, is_exec, trap);
>  #endif /* CONFIG_PPC_STD_MMU */
>  #if (defined(CONFIG_PPC_BOOK3E_64) || defined(CONFIG_PPC_FSL_BOOK3E)) \
>   && defined(CONFIG_HUGETLB_PAGE)
> diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
> index e5d779eed181..dd7f9b951d25 100644
> --- a/arch/powerpc/mm/mmu_decl.h
> +++ b/arch/powerpc/mm/mmu_decl.h
> @@ -82,7 +82,7 @@ static inline void _tlbivax_bcast(unsigned long address, 
> unsigned int pid,
>  #else /* CONFIG_PPC_MMU_NOHASH */
>  
>  extern void hash_preload(struct mm_struct *mm, unsigned long ea,
> -  unsigned long access, unsigned long trap);
> +  bool is_exec, unsigned long trap);
>  
>  
>  extern void _tlbie(unsigned long address);
> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
> index 0bbc7b7d8a05..01f348938328 100644
> --- a/arch/powerpc/mm/pgtable_32.c
> +++ b/arch/powerpc/mm/pgtable_32.c
> @@ -261,7 +261,7 @@ static void __init __mapin_ram_chunk(unsigned long 
> offset, unsigned long top)
>   map_kernel_page(v, p, ktext ? PAGE_KERNEL_TEXT : PAGE_KERNEL);
>  #ifdef CONFIG_PPC_STD_MMU_32
>   if (ktext)
> - hash_preload(&init_mm, v, 0, 0x300);
> + hash_preload(&init_mm, v, false, 0x300);
>  #endif
>   v += PAGE_SIZE;
>   p += PAGE_SIZE;
> diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c
> index bea6c544e38f..38a793bfca37 100644
> --- a/arch/powerpc/mm/ppc_mmu_32.c
> +++ b/arch/powerpc/mm/ppc_mmu_32.c
> @@ -163,7 +163,7 @@ void __init setbat(int index, unsigned long virt, 
> phys_addr_t phys,
>   * Preload a translation in the hash table
>   */
>  void hash_preload(struct mm_struct *mm, unsigned long ea,
> -   unsigned long access, unsigned long trap)
> +   bool is_exec, unsigned long trap)
>  {
>   pmd_t *pmd;
>  
> -- 
> 2.13.3



[PATCH] powerpc: Disable -Wbuiltin-requires-header when setjmp is used

2018-09-17 Thread Joel Stanley
The powerpc kernel uses setjmp which causes a warning when building with
clang:

CC  arch/powerpc/xmon/xmon.o
  In file included from arch/powerpc/xmon/xmon.c:51:
  ./arch/powerpc/include/asm/setjmp.h:15:13: error: declaration of
  built-in function 'setjmp' requires inclusion of the header <setjmp.h>
  [-Werror,-Wbuiltin-requires-header]
  extern long setjmp(long *);
  ^
  ./arch/powerpc/include/asm/setjmp.h:16:13: error: declaration of
  built-in function 'longjmp' requires inclusion of the header <setjmp.h>
  [-Werror,-Wbuiltin-requires-header]
  extern void longjmp(long *, long);
  ^

This *is* the header, and we're not using the built-in setjmp but
rather the one in arch/powerpc/kernel/misc.S. As the compiler warning
does not make sense, disable it for the files where setjmp is used.

Signed-off-by: Joel Stanley 
---
We could instead disable this for all of the kernel, as I don't think the
warning is ever going to provide useful information for the kernel.

 arch/powerpc/kernel/Makefile | 3 +++
 arch/powerpc/xmon/Makefile   | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 1e64cfe22a83..9845a94f5f68 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -7,6 +7,9 @@ CFLAGS_ptrace.o += -DUTS_MACHINE='"$(UTS_MACHINE)"'
 
 subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
 
+# Disable clang warning for using setjmp without setjmp.h header
+CFLAGS_crash.o += $(call cc-disable-warning, builtin-requires-header)
+
 ifdef CONFIG_PPC64
 CFLAGS_prom_init.o += $(NO_MINIMAL_TOC)
 endif
diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
index 93cc1f1b8b61..a38db48f9f6d 100644
--- a/arch/powerpc/xmon/Makefile
+++ b/arch/powerpc/xmon/Makefile
@@ -14,6 +14,9 @@ ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
 
 obj-y  += xmon.o nonstdio.o spr_access.o
 
+# Disable clang warning for using setjmp without setjmp.h header
+subdir-ccflags-y := $(call cc-disable-warning, builtin-requires-header)
+
 ifdef CONFIG_XMON_DISASSEMBLY
 obj-y  += ppc-dis.o ppc-opc.o
 obj-$(CONFIG_SPU_BASE) += spu-dis.o spu-opc.o
-- 
2.17.1



Re: [PATCH 01/12] powerpc/64s/hash: Fix stab_rr off by one initialization

2018-09-17 Thread Nicholas Piggin
On Mon, 17 Sep 2018 16:21:51 +0930
Joel Stanley  wrote:

> On Sat, 15 Sep 2018 at 01:03, Nicholas Piggin  wrote:
> >
> > This causes SLB alloation to start 1 beyond the start of the SLB.

allocation

> > There is no real problem because after it wraps it stats behaving  
> 
> starts?
> 
> > properly, it's just surprisig to see when looking at SLB traces.  
> 
> surprising

My keyboard is dying :(


Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock

2018-09-17 Thread David Hildenbrand
On 03.09.18 at 02:36, Rashmica wrote:
> Hi David,
> 
> 
> On 21/08/18 20:44, David Hildenbrand wrote:
> 
>> There seem to be some problems as a result of 30467e0b3be ("mm, hotplug:
>> fix concurrent memory hot-add deadlock"), which tried to fix a possible
>> lock inversion reported and discussed in [1] due to the two locks
>>  a) device_lock()
>>  b) mem_hotplug_lock
>>
>> While add_memory() first takes b), followed by a) during
>> bus_probe_device(), onlining of memory from user space first took b),
>> followed by a), exposing a possible deadlock.
> 
> Do you mean "onlining of memory from user space first took a),
> followed by b)"? 

Very right, thanks.

> 
>> In [1], it was decided to not make use of device_hotplug_lock, but
>> rather to enforce a locking order.
>>
>> The problems I spotted related to this:
>>
>> 1. Memory block device attributes: While .state first calls
>>mem_hotplug_begin() and the calls device_online() - which takes
>>device_lock() - .online does no longer call mem_hotplug_begin(), so
>>effectively calls online_pages() without mem_hotplug_lock.
>>
>> 2. device_online() should be called under device_hotplug_lock, however
>>onlining memory during add_memory() does not take care of that.
>>
>> In addition, I think there is also something wrong about the locking in
>>
>> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>>without locks. This was introduced after 30467e0b3be. And skimming over
>>the code, I assume it could need some more care in regards to locking
>>(e.g. device_online() called without device_hotplug_lock - but I'll
>>not touch that for now).
> 
> Can you mention that you fixed this in later patches?

Sure!

> 
> 
> The series looks good to me. Feel free to add my reviewed-by:
> 
> Reviewed-by: Rashmica Gupta 
> 

Thanks, r-b only for this patch or all of the series?

-- 

Thanks,

David / dhildenb
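
For readers following the locking discussion, a minimal sketch of the
ordering argued for above when onlining a memory block from sysfs;
illustrative only, with error handling trimmed and a made-up function
name:

#include <linux/device.h>
#include <linux/memory.h>

static int online_block_locked(struct memory_block *mem)
{
        int ret;

        lock_device_hotplug();          /* device_hotplug_lock first */
        ret = device_online(&mem->dev); /* takes device_lock(); the online
                                         * path then takes mem_hotplug_lock
                                         * via mem_hotplug_begin() */
        unlock_device_hotplug();

        return ret;
}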


Re: [PATCH kernel RFC 0/3] powerpc/pseries/iommu: GPU coherent memory pass through

2018-09-17 Thread Alexey Kardashevskiy
Ping?

The problem is still there...


On 24/08/2018 13:04, Alexey Kardashevskiy wrote:
> 
> 
> On 09/08/2018 14:41, Alexey Kardashevskiy wrote:
>>
>>
>> On 25/07/2018 19:50, Alexey Kardashevskiy wrote:
>>> I am trying to pass through a 3D controller:
>>> [0302]: NVIDIA Corporation GV100GL [Tesla V100 SXM2] [10de:1db1] (rev a1)
>>>
>>> which has a quite unique feature as coherent memory directly accessible
>>> from a POWER9 CPU via an NVLink2 transport.
>>>
>>> So in addition to passing a PCI device + accompanying NPU devices,
>>> we will also be passing the host physical address range as it is done
>>> on the bare metal system.
>>>
>>> The memory on the host is presented as:
>>>
>>> ===
>>> [aik@yc02goos ~]$ lsprop /proc/device-tree/memory@420
>>> ibm,chip-id  00fe (254)
>>> device_type  "memory"
>>> compatible   "ibm,coherent-device-memory"
>>> reg  0420  0020 
>>> linux,usable-memory
>>>  0420   
>>> phandle  0726 (1830)
>>> name "memory"
>>> ibm,associativity
>>>  0004 00fe 00fe 00fe 00fe
>>> ===
>>>
>>> and the host does not touch it as the second 64bit value of
>>> "linux,usable-memory" - the size - is null. Later on the NVIDIA driver
>>> trains the NVLink2 and probes this memory and this is how it becomes
>>> onlined.
>>>
>>> In the virtual environment I am planning on doing the same thing,
>>> however there is a difference in 64bit DMA handling. The powernv
>>> platform uses a PHB3 bypass mode and that just works but
>>> the pseries platform uses DDW RTAS API to achieve the same
>>> result and the problem with this is that we need a huge DMA
>>> window to start from zero (because this GPU supports less than
>>> 50 bits of DMA address space) and cover not just present memory
>>> but also this new coherent memory.
>>>
>>>
>>> This is based on sha1
>>> d72e90f3 Linus Torvalds "Linux 4.18-rc6".
>>>
>>> Please comment. Thanks.
>>
>>
>> Ping?
> 
> 
> Ping?
> 
>>
>>
>>>
>>>
>>>
>>> Alexey Kardashevskiy (3):
>>>   powerpc/pseries/iommu: Allow dynamic window to start from zero
>>>   powerpc/pseries/iommu: Force default DMA window removal
>>>   powerpc/pseries/iommu: Use memory@ nodes in max RAM address
>>> calculation
>>>
>>>  arch/powerpc/platforms/pseries/iommu.c | 77 
>>> ++
>>>  1 file changed, 70 insertions(+), 7 deletions(-)
>>>
>>
> 

-- 
Alexey


Re: [PATCH 01/12] powerpc/64s/hash: Fix stab_rr off by one initialization

2018-09-17 Thread Joel Stanley
On Sat, 15 Sep 2018 at 01:03, Nicholas Piggin  wrote:
>
> This causes SLB alloation to start 1 beyond the start of the SLB.
> There is no real problem because after it wraps it stats behaving

starts?

> properly, it's just surprisig to see when looking at SLB traces.

surprising

>
> Signed-off-by: Nicholas Piggin 

> ---
>  arch/powerpc/mm/slb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index 9f574e59d178..2f162c6e52d4 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -355,7 +355,7 @@ void slb_initialize(void)
>  #endif
> }
>
> -   get_paca()->stab_rr = SLB_NUM_BOLTED;
> +   get_paca()->stab_rr = SLB_NUM_BOLTED - 1;
>
> lflags = SLB_VSID_KERNEL | linear_llp;
> vflags = SLB_VSID_KERNEL | vmalloc_llp;
> --
> 2.18.0
>


[PATCH v3 3/3] dt-bindings: watchdog: add mpc8xxx-wdt support

2018-09-17 Thread Christophe Leroy
Add description of DT bindings for mpc8xxx-wdt driver which
handles the CPU watchdog timer on the mpc83xx, mpc86xx and mpc8xx.

Signed-off-by: Christophe Leroy 
---
 .../devicetree/bindings/watchdog/mpc8xxx-wdt.txt   | 25 ++
 1 file changed, 25 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/watchdog/mpc8xxx-wdt.txt

diff --git a/Documentation/devicetree/bindings/watchdog/mpc8xxx-wdt.txt 
b/Documentation/devicetree/bindings/watchdog/mpc8xxx-wdt.txt
new file mode 100644
index ..e176face472a
--- /dev/null
+++ b/Documentation/devicetree/bindings/watchdog/mpc8xxx-wdt.txt
@@ -0,0 +1,25 @@
+* Freescale mpc8xxx watchdog driver (For 83xx, 86xx and 8xx)
+
+Required properties:
+- compatible: Shall contain one of the following:
+   "mpc83xx_wdt" for an mpc83xx
+   "fsl,mpc8610-wdt" for an mpc86xx
+   "fsl,mpc823-wdt" for an mpc8xx
+- reg: base physical address and length of the area hosting the
+   watchdog registers.
+   On the 83xx, "Watchdog Timer Registers" area:   <0x200 0x100>
+   On the 86xx, "Watchdog Timer Registers" area:   <0xe4000 0x100>
+   On the 8xx, "General System Interface Unit" area: <0x0 0x10>
+
+Optional properties:
+- reg: additional physical address and length (4) of location of the
+   Reset Status Register (called RSTRSCR on the mpc86xx)
+   On the 83xx, it is located at offset 0x910
+   On the 86xx, it is located at offset 0xe0094
+   On the 8xx, it is located at offset 0x288
+
+Example:
+   WDT: watchdog@0 {
+   compatible = "fsl,mpc823-wdt";
+   reg = <0x0 0x10 0x288 0x4>;
+   };
-- 
2.13.3



[PATCH v3 2/3] watchdog: mpc8xxx: provide boot status

2018-09-17 Thread Christophe Leroy
mpc8xxx watchdog driver supports the following platforms:
- mpc8xx
- mpc83xx
- mpc86xx

Those three platforms have a 32 bits register which provides the
reason of the last boot, including whether it was caused by the
watchdog.

mpc8xx: Register RSR, bit SWRS (bit 3)
mpc83xx: Register RSR, bit SWRS (bit 28)
mpc86xx: Register RSTRSCR, bit WDT_RR (bit 11)

This patch maps the register as defined in the device tree and updates
wdt.bootstatus based on the value of the watchdog related bit. Then
the information can be retrieved via the WDIOC_GETBOOTSTATUS ioctl.
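
As an illustration (not part of the patch), a small user-space program
that reads this status through the standard watchdog ABI; the device
node name is an assumption:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

int main(void)
{
        int flags = 0;
        int fd = open("/dev/watchdog0", O_RDWR);

        if (fd < 0)
                return 1;

        ioctl(fd, WDIOC_GETBOOTSTATUS, &flags);
        printf("last boot %s caused by the watchdog\n",
               (flags & WDIOF_CARDRESET) ? "was" : "was not");

        /* opening the node arms the watchdog; disarm with the magic close */
        write(fd, "V", 1);
        close(fd);
        return 0;
}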

Hereunder is an example of devicetree for mpc8xx,
the Reset Status Register being at offset 0x288:

WDT: watchdog@0 {
compatible = "fsl,mpc823-wdt";
reg = <0x0 0x10 0x288 0x4>;
};

On the mpc83xx, RSR is at offset 0x910
On the mpc86xx, RSTRSCR is at offset 0xe0094

Suggested-by: Radu Rendec 
Tested-by: Christophe Leroy  # On mpc885
Signed-off-by: Christophe Leroy 
---
 drivers/watchdog/mpc8xxx_wdt.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/watchdog/mpc8xxx_wdt.c b/drivers/watchdog/mpc8xxx_wdt.c
index 1dcf5f10cdd9..069072e6747d 100644
--- a/drivers/watchdog/mpc8xxx_wdt.c
+++ b/drivers/watchdog/mpc8xxx_wdt.c
@@ -47,6 +47,7 @@ struct mpc8xxx_wdt {
 struct mpc8xxx_wdt_type {
int prescaler;
bool hw_enabled;
+   u32 rsr_mask;
 };
 
 struct mpc8xxx_wdt_ddata {
@@ -159,6 +160,24 @@ static int mpc8xxx_wdt_probe(struct platform_device *ofdev)
return -ENODEV;
}
 
+   res = platform_get_resource(ofdev, IORESOURCE_MEM, 1);
+   if (res) {
+   bool status;
+   u32 __iomem *rsr = ioremap(res->start, resource_size(res));
+
+   if (!rsr)
+   return -ENOMEM;
+
+   status = in_be32(rsr) & wdt_type->rsr_mask;
+   ddata->wdd.bootstatus = status ? WDIOF_CARDRESET : 0;
+               /* clear reset status bits related to watchdog timer */
+   out_be32(rsr, wdt_type->rsr_mask);
+   iounmap(rsr);
+
+   dev_info(dev, "Last boot was %scaused by watchdog\n",
+status ? "" : "not ");
+   }
+
	spin_lock_init(&ddata->lock);
 
	ddata->wdd.info = &mpc8xxx_wdt_info,
@@ -216,6 +235,7 @@ static const struct of_device_id mpc8xxx_wdt_match[] = {
.compatible = "mpc83xx_wdt",
.data = &(struct mpc8xxx_wdt_type) {
.prescaler = 0x1,
+   .rsr_mask = BIT(3), /* RSR Bit SWRS */
},
},
{
@@ -223,6 +243,7 @@ static const struct of_device_id mpc8xxx_wdt_match[] = {
.data = &(struct mpc8xxx_wdt_type) {
.prescaler = 0x1,
.hw_enabled = true,
+   .rsr_mask = BIT(20), /* RSTRSCR Bit WDT_RR */
},
},
{
@@ -230,6 +251,7 @@ static const struct of_device_id mpc8xxx_wdt_match[] = {
.data = &(struct mpc8xxx_wdt_type) {
.prescaler = 0x800,
.hw_enabled = true,
+   .rsr_mask = BIT(28), /* RSR Bit SWRS */
},
},
{},
-- 
2.13.3



[PATCH v3 1/3] watchdog: mpc8xxx: use dev_xxxx() instead of pr_xxxx()

2018-09-17 Thread Christophe Leroy
mpc8xxx watchdog driver is a platform device drivers, it is
therefore possible to use dev_xxx() messaging rather than pr_xxx()

Reviewed-by: Guenter Roeck 
Signed-off-by: Christophe Leroy 
---
 drivers/watchdog/mpc8xxx_wdt.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/watchdog/mpc8xxx_wdt.c b/drivers/watchdog/mpc8xxx_wdt.c
index aca2d6323f8a..1dcf5f10cdd9 100644
--- a/drivers/watchdog/mpc8xxx_wdt.c
+++ b/drivers/watchdog/mpc8xxx_wdt.c
@@ -17,8 +17,6 @@
  * option) any later version.
  */
 
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
 #include <linux/fs.h>
 #include <linux/init.h>
 #include <linux/kernel.h>
@@ -137,26 +135,27 @@ static int mpc8xxx_wdt_probe(struct platform_device 
*ofdev)
struct mpc8xxx_wdt_ddata *ddata;
u32 freq = fsl_get_sys_freq();
bool enabled;
+   struct device *dev = &ofdev->dev;
 
-   wdt_type = of_device_get_match_data(&ofdev->dev);
+   wdt_type = of_device_get_match_data(dev);
if (!wdt_type)
return -EINVAL;
 
if (!freq || freq == -1)
return -EINVAL;
 
-   ddata = devm_kzalloc(&ofdev->dev, sizeof(*ddata), GFP_KERNEL);
+   ddata = devm_kzalloc(dev, sizeof(*ddata), GFP_KERNEL);
if (!ddata)
return -ENOMEM;
 
res = platform_get_resource(ofdev, IORESOURCE_MEM, 0);
-   ddata->base = devm_ioremap_resource(&ofdev->dev, res);
+   ddata->base = devm_ioremap_resource(dev, res);
if (IS_ERR(ddata->base))
return PTR_ERR(ddata->base);
 
	enabled = in_be32(&ddata->base->swcrr) & SWCRR_SWEN;
if (!enabled && wdt_type->hw_enabled) {
-   pr_info("could not be enabled in software\n");
+   dev_info(dev, "could not be enabled in software\n");
return -ENODEV;
}
 
@@ -166,7 +165,7 @@ static int mpc8xxx_wdt_probe(struct platform_device *ofdev)
	ddata->wdd.ops = &mpc8xxx_wdt_ops,
 
ddata->wdd.timeout = WATCHDOG_TIMEOUT;
-   watchdog_init_timeout(&ddata->wdd, timeout, &ofdev->dev);
+   watchdog_init_timeout(&ddata->wdd, timeout, dev);
 
	watchdog_set_nowayout(&ddata->wdd, nowayout);
 
@@ -189,12 +188,13 @@ static int mpc8xxx_wdt_probe(struct platform_device 
*ofdev)
 
	ret = watchdog_register_device(&ddata->wdd);
if (ret) {
-   pr_err("cannot register watchdog device (err=%d)\n", ret);
+   dev_err(dev, "cannot register watchdog device (err=%d)\n", ret);
return ret;
}
 
-   pr_info("WDT driver for MPC8xxx initialized. mode:%s timeout=%d sec\n",
-   reset ? "reset" : "interrupt", ddata->wdd.timeout);
+   dev_info(dev,
+"WDT driver for MPC8xxx initialized. mode:%s timeout=%d sec\n",
+reset ? "reset" : "interrupt", ddata->wdd.timeout);
 
platform_set_drvdata(ofdev, ddata);
return 0;
@@ -204,8 +204,8 @@ static int mpc8xxx_wdt_remove(struct platform_device *ofdev)
 {
struct mpc8xxx_wdt_ddata *ddata = platform_get_drvdata(ofdev);
 
-   pr_crit("Watchdog removed, expect the %s soon!\n",
-   reset ? "reset" : "machine check exception");
+   dev_crit(&ofdev->dev, "Watchdog removed, expect the %s soon!\n",
+reset ? "reset" : "machine check exception");
	watchdog_unregister_device(&ddata->wdd);
 
return 0;
-- 
2.13.3



Re: [PATCH 05/12] powerpc/64s/hash: Use POWER6 SLBIA IH=1 variant in switch_slb

2018-09-17 Thread Aneesh Kumar K.V
Nicholas Piggin  writes:

> The SLBIA IH=1 hint will remove all non-zero SLBEs, but only
> invalidate ERAT entries associated with a class value of 1, for
> processors that support the hint (e.g., POWER6 and newer), which
> Linux assigns to user addresses.
>
> This prevents kernel ERAT entries from being invalidated when
> context switching (if the thread faulted in more than 8 user SLBEs).


how about renaming stuff to indicate kernel ERAT entries are kept?
something like slb_flush_and_rebolt_user()? 

>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/mm/slb.c | 38 +++---
>  1 file changed, 23 insertions(+), 15 deletions(-)
>
> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index a5e58f11d676..03fa1c663ccf 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -128,13 +128,21 @@ void slb_flush_all_realmode(void)
>   asm volatile("slbmte %0,%0; slbia" : : "r" (0));
>  }
>  
> -static void __slb_flush_and_rebolt(void)
> +void slb_flush_and_rebolt(void)
>  {
>   /* If you change this make sure you change SLB_NUM_BOLTED
>* and PR KVM appropriately too. */
>   unsigned long linear_llp, lflags;
>   unsigned long ksp_esid_data, ksp_vsid_data;
>  
> + WARN_ON(!irqs_disabled());
> +
> + /*
> +  * We can't take a PMU exception in the following code, so hard
> +  * disable interrupts.
> +  */
> + hard_irq_disable();
> +
>   linear_llp = mmu_psize_defs[mmu_linear_psize].sllp;
>   lflags = SLB_VSID_KERNEL | linear_llp;
>  
> @@ -160,20 +168,7 @@ static void __slb_flush_and_rebolt(void)
>:: "r"(ksp_vsid_data),
>   "r"(ksp_esid_data)
>: "memory");
> -}
>  
> -void slb_flush_and_rebolt(void)
> -{
> -
> - WARN_ON(!irqs_disabled());
> -
> - /*
> -  * We can't take a PMU exception in the following code, so hard
> -  * disable interrupts.
> -  */
> - hard_irq_disable();
> -
> - __slb_flush_and_rebolt();
>   get_paca()->slb_cache_ptr = 0;
>  }
>  
> @@ -248,7 +243,20 @@ void switch_slb(struct task_struct *tsk, struct 
> mm_struct *mm)
>  
>   asm volatile("isync" : : : "memory");
>   } else {
> - __slb_flush_and_rebolt();
> + struct slb_shadow *p = get_slb_shadow();
> + unsigned long ksp_esid_data =
> + be64_to_cpu(p->save_area[KSTACK_INDEX].esid);
> + unsigned long ksp_vsid_data =
> + be64_to_cpu(p->save_area[KSTACK_INDEX].vsid);
> +
> + asm volatile("isync\n"
> +  PPC_SLBIA(1) "\n"
> +  "slbmte%0,%1\n"
> +  "isync"
> +  :: "r"(ksp_vsid_data),
> + "r"(ksp_esid_data));
> +
> + asm volatile("isync" : : : "memory");
>   }
>  
>   get_paca()->slb_cache_ptr = 0;
> -- 
> 2.18.0



Re: [PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed

2018-09-17 Thread Aneesh Kumar K.V
Nicholas Piggin  writes:

> The POWER5 < DD2.1 issue is that slbie needs to be issued more than
> once. It came in with this change:
>
> ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, da...@gibson.dropbear.id.au
>   [PATCH] POWER5 erratum workaround
>
>   Early POWER5 revisions (<DD2.1) require slbie
>   instructions to be repeated under some circumstances.  The patch below
>   adds a workaround (patch made by Anton Blanchard).

Thanks for extracting this. Can we add this to the code? Also I am not
sure what is repeated here? Is it that we just need one extra slbie (hence
only applicable to offset == 1) or is it that we need to make sure there
is always one extra slbie? The code does the former.  Do you have a link to
that email patch?


>
> The extra slbie in switch_slb is done even for the case where slbia is
> called (slb_flush_and_rebolt). I don't believe that is required
> because there are other slb_flush_and_rebolt callers which do not
> issue the workaround slbie, which would be broken if it was required.
>
> It also seems to be fine inside the isync with the first slbie, as it
> is in the kernel stack switch code.
>
> So move this workaround to where it is required. This is not much of
> an optimisation because this is the fast path, but it makes the code
> more understandable and neater.
>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/mm/slb.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index 1c7128c63a4b..d952ece3abf7 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -226,7 +226,6 @@ static inline int esids_match(unsigned long addr1, 
> unsigned long addr2)
>  void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>  {
>   unsigned long offset;
> - unsigned long slbie_data = 0;
>   unsigned long pc = KSTK_EIP(tsk);
>   unsigned long stack = KSTK_ESP(tsk);
>   unsigned long exec_base;
> @@ -241,7 +240,9 @@ void switch_slb(struct task_struct *tsk, struct mm_struct 
> *mm)
>   offset = get_paca()->slb_cache_ptr;
>   if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
>   offset <= SLB_CACHE_ENTRIES) {
> + unsigned long slbie_data;
>   int i;
> +
>   asm volatile("isync" : : : "memory");
>   for (i = 0; i < offset; i++) {
>   slbie_data = (unsigned long)get_paca()->slb_cache[i]
> @@ -251,15 +252,14 @@ void switch_slb(struct task_struct *tsk, struct 
> mm_struct *mm)
>   slbie_data |= SLBIE_C; /* C set for user addresses */
>   asm volatile("slbie %0" : : "r" (slbie_data));
>   }
> - asm volatile("isync" : : : "memory");
> - } else {
> - __slb_flush_and_rebolt();
> - }
>  
> - if (!cpu_has_feature(CPU_FTR_ARCH_207S)) {
>   /* Workaround POWER5 < DD2.1 issue */
> - if (offset == 1 || offset > SLB_CACHE_ENTRIES)
> + if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1)
>   asm volatile("slbie %0" : : "r" (slbie_data));
> +
> + asm volatile("isync" : : : "memory");
> + } else {
> + __slb_flush_and_rebolt();
>   }
>  
>   get_paca()->slb_cache_ptr = 0;
> -- 
> 2.18.0