Re: [PATCH 2/2] workqueue:Fix affinity of an unbound worker of a node with 1 online CPU

2016-06-07 Thread Abdul Haleem

Hi Gautham,

Thanks a lot for the fix.

With your patches applied, 4.7.0-rc2 builds fine on ppc64le bare metal.
Boot was successful with no call traces.

Thanks for all your support!

Regards,
Abdul

On Tuesday 07 June 2016 08:44 PM, Gautham R. Shenoy wrote:


With commit e9d867a67fd03ccc ("sched: Allow per-cpu kernel threads to
run on online && !active"), __set_cpus_allowed_ptr() expects that only
strict per-cpu kernel threads can have affinity to an online CPU which
is not yet active.

This assumption is currently broken in the CPU_ONLINE notification
handler for the workqueues where restore_unbound_workers_cpumask()
calls set_cpus_allowed_ptr() when the first cpu in the unbound
worker's pool->attr->cpumask comes online. Since
set_cpus_allowed_ptr() is called with pool->attr->cpumask in which
only one CPU is online which is not yet active, we get the following
WARN_ON during a CPU online operation.

[ cut here ]
WARNING: CPU: 40 PID: 248 at kernel/sched/core.c:1166 __set_cpus_allowed_ptr+0x228/0x2e0
Modules linked in:
CPU: 40 PID: 248 Comm: cpuhp/40 Not tainted 4.6.0-autotest+ #4
<..snip..>
Call Trace:
[c00f273ff920] [c010493c] __set_cpus_allowed_ptr+0x2cc/0x2e0 (unreliable)
[c00f273ffac0] [c00ed4b0] workqueue_cpu_up_callback+0x2c0/0x470
[c00f273ffb70] [c00f5c58] notifier_call_chain+0x98/0x100
[c00f273ffbc0] [c00c5ed0] __cpu_notify+0x70/0xe0
[c00f273ffc00] [c00c6028] notify_online+0x38/0x50
[c00f273ffc30] [c00c5214] cpuhp_invoke_callback+0x84/0x250
[c00f273ffc90] [c00c562c] cpuhp_up_callbacks+0x5c/0x120
[c00f273ffce0] [c00c64d4] cpuhp_thread_fun+0x184/0x1c0
[c00f273ffd20] [c00fa050] smpboot_thread_fn+0x290/0x2a0
[c00f273ffd80] [c00f45b0] kthread+0x110/0x130
[c00f273ffe30] [c0009570] ret_from_kernel_thread+0x5c/0x6c
---[ end trace 00f1456578b2a3b2 ]---

This patch sets the affinity of the worker to
a) the only online CPU in the cpumask of the worker pool when it comes
online.
b) the cpumask of the worker pool when the second CPU in the pool's
cpumask comes online.
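
The two cases above can be modelled in user space roughly as follows.
This is only a sketch and not the code from the patch: CPUs are bits in
a 64-bit mask, and worker_affinity_on_cpu_online(), popcount64() and the
convention of returning 0 for "leave the affinity alone" are all
hypothetical, invented here for illustration.

```c
#include <stdint.h>

/* Count set bits; plays the role cpumask_weight() has in the kernel. */
static unsigned int popcount64(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Hypothetical model: bit n set means CPU n is in the mask (cpu < 64).
 * online_mask already includes the CPU that is coming online.
 * Returns the affinity to apply, or 0 to leave it unchanged.
 */
static uint64_t worker_affinity_on_cpu_online(uint64_t pool_mask,
					      uint64_t online_mask,
					      unsigned int cpu)
{
	uint64_t cpu_bit = UINT64_C(1) << cpu;
	uint64_t online_in_pool = pool_mask & online_mask;

	if (!(pool_mask & cpu_bit))
		return 0;		/* CPU is not part of this pool */
	if (online_in_pool == cpu_bit)
		return cpu_bit;		/* (a) only online CPU in the pool */
	if (popcount64(online_in_pool) == 2)
		return pool_mask;	/* (b) second CPU of the pool */
	return 0;			/* later CPUs: nothing to restore */
}
```

With a pool covering CPUs 2 and 3, bringing CPU 2 online first yields
an affinity of CPU 2 alone, and bringing CPU 3 online afterwards yields
the full pool mask.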

Reported-by: Abdul Haleem 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tejun Heo 
Cc: Michael Ellerman 
Signed-off-by: Gautham R. Shenoy 
---


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH, RFC] cxl: Add support for CAPP DMA mode

2016-06-07 Thread Ian Munsie
From: Ian Munsie 

This adds support for using CAPP DMA mode, which is required for XSL
based cards such as the Mellanox CX4 to function.

This is currently an RFC as it depends on the corresponding support to
be merged into skiboot first, which was submitted here:
http://patchwork.ozlabs.org/patch/625582/

In the event that the skiboot on the system does not have the above
support, it will indicate as such in the kernel log and abort the init
process.
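
As a rough illustration of the behaviour described above, here is a
small user-space model. It is a sketch under assumptions, not the kernel
code: every model_* name is hypothetical, the firmware is faked by a
flag, and the value chosen for MODEL_UNSUPPORTED is only a placeholder.

```c
#include <stdio.h>

/* Local stand-ins; the mode numbers follow the enum in the diff below. */
enum model_mode { MODEL_MODE_CAPI = 1, MODEL_MODE_DMA = 4 };
/* Placeholder return codes for this model only. */
enum model_rc { MODEL_SUCCESS = 0, MODEL_UNSUPPORTED = -7 };

struct model_sl_ops {
	int capi_mode;		/* mode this service layer needs */
};

/* Pretend firmware: old firmware knows nothing about DMA mode. */
static int model_set_phb_mode(int mode, int old_firmware)
{
	if (old_firmware && mode == MODEL_MODE_DMA)
		return MODEL_UNSUPPORTED;
	return MODEL_SUCCESS;
}

/* Same shape as the error handling in the patch: report, return rc. */
static int model_switch_phb(const struct model_sl_ops *ops, int old_firmware)
{
	int rc = model_set_phb_mode(ops->capi_mode, old_firmware);

	if (rc == MODEL_UNSUPPORTED)
		fprintf(stderr, "Required cxl mode not supported by firmware - update skiboot\n");
	else if (rc)
		fprintf(stderr, "set mode failed: %i\n", rc);
	return rc;
}
```

The caller treats any non-zero return as fatal, so an XSL card on old
firmware stops at this step while a PSL card is unaffected.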

Signed-off-by: Ian Munsie 
---
 arch/powerpc/include/asm/opal-api.h   | 1 +
 arch/powerpc/platforms/powernv/pci-ioda.c | 4 +++-
 drivers/misc/cxl/cxl.h| 1 +
 drivers/misc/cxl/pci.c| 4 +++-
 4 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 9bb8ddf..d29c584 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -825,6 +825,7 @@ enum {
OPAL_PHB_CAPI_MODE_CAPI = 1,
OPAL_PHB_CAPI_MODE_SNOOP_OFF= 2,
OPAL_PHB_CAPI_MODE_SNOOP_ON = 3,
+   OPAL_PHB_CAPI_MODE_DMA  = 4,
 };
 
 /* OPAL I2C request */
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3a5ea82..5a42e98 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2793,7 +2793,9 @@ int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode)
pe_info(pe, "Switching PHB to CXL\n");
 
rc = opal_pci_set_phb_cxl_mode(phb->opal_id, mode, pe->pe_number);
-   if (rc)
+   if (rc == OPAL_UNSUPPORTED)
+   dev_err(&dev->dev, "Required cxl mode not supported by firmware - update skiboot\n");
+   else if (rc)
dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc);
 
return rc;
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 92e7f19..aa69a84 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -542,6 +542,7 @@ struct cxl_service_layer_ops {
void (*debugfs_stop_trace)(struct cxl *adapter);
void (*write_timebase_ctrl)(struct cxl *adapter);
u64 (*timebase_read)(struct cxl *adapter);
+   int capi_mode;
 };
 
 struct cxl_native {
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 556718d..648817a 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1249,7 +1249,7 @@ static int cxl_configure_adapter(struct cxl *adapter, struct pci_dev *dev)
if ((rc = adapter->native->sl_ops->adapter_regs_init(adapter, dev)))
goto err;
 
-   if ((rc = pnv_phb_to_cxl_mode(dev, OPAL_PHB_CAPI_MODE_CAPI)))
+   if ((rc = pnv_phb_to_cxl_mode(dev, adapter->native->sl_ops->capi_mode)))
goto err;
 
/* If recovery happened, the last step is to turn on snooping.
@@ -1293,6 +1293,7 @@ static const struct cxl_service_layer_ops psl_ops = {
.debugfs_stop_trace = cxl_stop_trace,
.write_timebase_ctrl = write_timebase_ctrl_psl,
.timebase_read = timebase_read_psl,
+   .capi_mode = OPAL_PHB_CAPI_MODE_CAPI,
 };
 
 static const struct cxl_service_layer_ops xsl_ops = {
@@ -1300,6 +1301,7 @@ static const struct cxl_service_layer_ops xsl_ops = {
.debugfs_add_adapter_sl_regs = cxl_debugfs_add_adapter_xsl_regs,
.write_timebase_ctrl = write_timebase_ctrl_xsl,
.timebase_read = timebase_read_xsl,
+   .capi_mode = OPAL_PHB_CAPI_MODE_DMA,
 };
 
 static void set_sl_ops(struct cxl *adapter, struct pci_dev *dev)
-- 
2.8.1


Re: powerpc/mm/radix: Update LPCR HR bit as per ISA

2016-06-07 Thread Michael Ellerman
On Thu, 2016-02-06 at 09:40:57 UTC, "Aneesh Kumar K.V" wrote:
> We need to set the HR bit in LPCR for radix partitions.
 
Please update the change log with something similar to what Ben sent.

> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/reg.h  | 1 +
>  arch/powerpc/mm/pgtable-radix.c | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index c1e82e968506..652147ef5ae3 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -348,6 +348,7 @@
>  #define   LPCR_RMI 0x0002  /* real mode is cache inhibit */
>  #define   LPCR_HDICE   0x0001  /* Hyp Decr enable (HV,PR,EE) */
>  #define   LPCR_UPRT    0x0040  /* Use Process Table (ISA 3) */
> +#defineLPCR_HR  0x0010

What is this bit? Where is it documented?

Also white space is wrong.

cheers

[PATCH 1/5] selftests/powerpc: Check for VSX preservation across userspace preemption

2016-06-07 Thread Cyril Bur
Ensure the kernel switches VSX registers correctly. VSX registers are
all volatile, and although the kernel currently preserves VSX across
syscalls, it doesn't have to. Test that the VSX regs remain the same
across interrupts and the end of a timeslice.

Signed-off-by: Cyril Bur 
---
 tools/testing/selftests/powerpc/math/Makefile  |   4 +-
 tools/testing/selftests/powerpc/math/vsx_asm.S |  57 +
 tools/testing/selftests/powerpc/math/vsx_preempt.c | 140 +
 tools/testing/selftests/powerpc/vsx_asm.h  |  71 +++
 4 files changed, 271 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/math/vsx_asm.S
 create mode 100644 tools/testing/selftests/powerpc/math/vsx_preempt.c
 create mode 100644 tools/testing/selftests/powerpc/vsx_asm.h

diff --git a/tools/testing/selftests/powerpc/math/Makefile b/tools/testing/selftests/powerpc/math/Makefile
index 5b88875..aa6598b 100644
--- a/tools/testing/selftests/powerpc/math/Makefile
+++ b/tools/testing/selftests/powerpc/math/Makefile
@@ -1,4 +1,4 @@
-TEST_PROGS := fpu_syscall fpu_preempt fpu_signal vmx_syscall vmx_preempt vmx_signal
+TEST_PROGS := fpu_syscall fpu_preempt fpu_signal vmx_syscall vmx_preempt vmx_signal vsx_preempt
 
 all: $(TEST_PROGS)
 
@@ -13,6 +13,8 @@ vmx_syscall: vmx_asm.S
 vmx_preempt: vmx_asm.S
 vmx_signal: vmx_asm.S
 
+vsx_preempt: vsx_asm.S
+
 include ../../lib.mk
 
 clean:
diff --git a/tools/testing/selftests/powerpc/math/vsx_asm.S b/tools/testing/selftests/powerpc/math/vsx_asm.S
new file mode 100644
index 000..4ceaf37
--- /dev/null
+++ b/tools/testing/selftests/powerpc/math/vsx_asm.S
@@ -0,0 +1,57 @@
+/*
+ * Copyright 2015, Cyril Bur, IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include "../basic_asm.h"
+#include "../vsx_asm.h"
+
+FUNC_START(check_vsx)
+   PUSH_BASIC_STACK(32)
+   std r3,STACK_FRAME_PARAM(0)(sp)
+   addi r3, r3, 16 * 12 #Second half of array
+   bl store_vsx
+   ld r3,STACK_FRAME_PARAM(0)(sp)
+   bl vsx_memcmp
+   POP_BASIC_STACK(32)
+   blr
+FUNC_END(check_vsx)
+
+# int preempt_vsx(vector int *varray, int *threads_starting, int *running)
+# On starting will (atomically) decrement threads_starting as a signal that
+# the VSX have been loaded with varray. Will proceed to check the validity of
+# the VSX registers while running is not zero.
+FUNC_START(preempt_vsx)
+   PUSH_BASIC_STACK(512)
+   std r3,STACK_FRAME_PARAM(0)(sp) # vector int *varray
+   std r4,STACK_FRAME_PARAM(1)(sp) # int *threads_starting
+   std r5,STACK_FRAME_PARAM(2)(sp) # int *running
+
+   bl load_vsx
+   nop
+
+   sync
+   # Atomic DEC
+   ld r3,STACK_FRAME_PARAM(1)(sp)
+1: lwarx r4,0,r3
+   addi r4,r4,-1
+   stwcx. r4,0,r3
+   bne- 1b
+
+2: ld r3,STACK_FRAME_PARAM(0)(sp)
+   bl check_vsx
+   nop
+   cmpdi r3,0
+   bne 3f
+   ld r4,STACK_FRAME_PARAM(2)(sp)
+   ld r5,0(r4)
+   cmpwi r5,0
+   bne 2b
+
+3: POP_BASIC_STACK(512)
+   blr
+FUNC_END(preempt_vsx)
diff --git a/tools/testing/selftests/powerpc/math/vsx_preempt.c b/tools/testing/selftests/powerpc/math/vsx_preempt.c
new file mode 100644
index 000..706dbaa
--- /dev/null
+++ b/tools/testing/selftests/powerpc/math/vsx_preempt.c
@@ -0,0 +1,140 @@
+/*
+ * Copyright 2015, Cyril Bur, IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * This test attempts to see if the VSX registers change across preemption.
+ * There is no way to be sure preemption happened so this test just
+ * uses many threads and a long wait. As such, a successful test
+ * doesn't mean much but a failure is bad.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "utils.h"
+
+/* Time to wait for workers to get preempted (seconds) */
+#define PREEMPT_TIME 20
+/*
+ * Factor by which to multiply number of online CPUs for total number of
+ * worker threads
+ */
+#define THREAD_FACTOR 8
+
+__thread vector int varray[24] = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10,11,12},
+   {13,14,15,16},{17,18,19,20},{21,22,23,24},
+   {25,26,27,28},{29,30,31,32},{33,34,35,36},
+   {37,38,39,40},{41,42,43,44},{45,46,47,48}};
+
+int threads_starting;
+int running;
+
+extern long preempt_vsx(vector int *varray, int *threads_starting, int *running);
+
+long vsx_memcmp(vector int *a) {
+   vector int zero = {0,0,0,0};
+   int i;
+
+   FAIL_IF(a != varray);
+
+   for(i = 0; i < 12; i++) {
+   if (memcmp(&a[i 

[PATCH 3/5] powerpc: tm: Always use fp_state and vr_state to store live registers

2016-06-07 Thread Cyril Bur
There is currently an inconsistency as to how the entire CPU register
state is saved and restored when a thread uses transactional memory
(TM).

Using transactional memory results in the CPU having duplicated
(almost) all of its register state. This duplication results in a set
of registers which can be considered 'live', those being currently
modified by the instructions being executed, and another set that is
frozen at a point in time.

On context switch, both sets of state have to be saved and (later)
restored. These two states are often called a variety of different
things. Common terms for the state which only exists after the thread
has entered a transaction (performed a TBEGIN instruction) in hardware
are 'transactional' or 'speculative'.

Between a TBEGIN and a TEND or TABORT (or an event that causes the
hardware to abort), regardless of the use of TSUSPEND the
transactional state can be referred to as the live state.

The second state is often referred to as the 'checkpointed' state
and is a duplication of the live state when the TBEGIN instruction is
executed. This state is kept in the hardware and will be rolled back
to on transaction failure.

Currently all the registers stored in pt_regs are ALWAYS the live
registers, that is, when a thread has transactional registers their
values are stored in pt_regs and the checkpointed state is in
ckpt_regs. A strange opposite is true for fp_state. When a thread is
non-transactional fp_state holds the live registers. When a thread has
initiated a transaction fp_state holds the checkpointed state and
transact_fp becomes the structure which holds the live state (at this
point it is a transactional state). The same is true for vr_state.

This method creates confusion as to where the live state is. In some
circumstances it requires extra work to determine where to put the
live state, and it prevents the use of common functions designed
(probably before TM) to save the live state.

With this patch pt_regs, fp_state and vr_state all represent the same
thing and the other structures [pending rename] are for checkpointed
state.
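
To make the before/after difference concrete, here is a small model in
plain C. It is a sketch only: struct thread_model, the in_transaction
flag and the three helper functions are hypothetical and do not exist
in the kernel; transact_fp is simply the field name in use at this
point in the series.

```c
#include <stddef.h>

struct fp_regs {
	double fpr[32];
};

/* Hypothetical cut-down stand-in for the FP part of thread_struct. */
struct thread_model {
	int in_transaction;
	struct fp_regs fp_state;
	struct fp_regs transact_fp;
};

/* Before: the live FP registers move once a transaction starts. */
static const struct fp_regs *live_fp_before(const struct thread_model *t)
{
	return t->in_transaction ? &t->transact_fp : &t->fp_state;
}

/* After: fp_state is always live, exactly like pt_regs. */
static const struct fp_regs *live_fp_after(const struct thread_model *t)
{
	return &t->fp_state;
}

/* After: the other structure only ever holds checkpointed state. */
static const struct fp_regs *ckpt_fp_after(const struct thread_model *t)
{
	return t->in_transaction ? &t->transact_fp : NULL;
}
```

The point of the model is that live_fp_after() needs no condition at
all, which is what lets the common save functions be reused.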

Signed-off-by: Cyril Bur 
---
 arch/powerpc/kernel/process.c   | 44 +--
 arch/powerpc/kernel/signal_32.c | 50 ++
 arch/powerpc/kernel/signal_64.c | 53 +++
 arch/powerpc/kernel/tm.S| 95 ++---
 arch/powerpc/kernel/traps.c | 12 --
 5 files changed, 116 insertions(+), 138 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index ea8a28f..696e0236 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -763,24 +763,12 @@ static void tm_reclaim_thread(struct thread_struct *thr,
 {
unsigned long msr_diff = 0;
 
-   /*
-* If FP/VSX registers have been already saved to the
-* thread_struct, move them to the transact_fp array.
-* We clear the TIF_RESTORE_TM bit since after the reclaim
-* the thread will no longer be transactional.
-*/
if (test_ti_thread_flag(ti, TIF_RESTORE_TM)) {
-   msr_diff = thr->ckpt_regs.msr & ~thr->regs->msr;
-   if (msr_diff & MSR_FP)
-   memcpy(&thr->transact_fp, &thr->fp_state,
-  sizeof(struct thread_fp_state));
-   if (msr_diff & MSR_VEC)
-   memcpy(&thr->transact_vr, &thr->vr_state,
-  sizeof(struct thread_vr_state));
+   msr_diff = (thr->ckpt_regs.msr & ~thr->regs->msr)
+   & (MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 | MSR_FE1);
+
clear_ti_thread_flag(ti, TIF_RESTORE_TM);
-   msr_diff &= MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 | MSR_FE1;
}
-
/*
 * Use the current MSR TM suspended bit to track if we have
 * checkpointed state outstanding.
@@ -799,6 +787,8 @@ static void tm_reclaim_thread(struct thread_struct *thr,
if (!MSR_TM_SUSPENDED(mfmsr()))
return;
 
+   save_all(container_of(thr, struct task_struct, thread));
+
tm_reclaim(thr, thr->regs->msr, cause);
 
/* Having done the reclaim, we now have the checkpointed
@@ -901,7 +891,7 @@ static inline void tm_recheckpoint_new_task(struct task_struct *new)
 * If the task was using FP, we non-lazily reload both the original and
 * the speculative FP register states.  This is because the kernel
 * doesn't see if/when a TM rollback occurs, so if we take an FP
-* unavoidable later, we are unable to determine which set of FP regs
+* unavailable later, we are unable to determine which set of FP regs
 * need to be restored.
 */
if (!new->thread.regs)
@@ -917,24 +907,10 @@ static inline void tm_recheckpoint_new_task(struct task_struct *new)
 "(new->msr 0x%lx, new->origmsr 0x%lx)\n",
 new->pid, new->thread.regs->msr

[PATCH 5/5] powerpc: Remove do_load_up_transact_{fpu,altivec}

2016-06-07 Thread Cyril Bur
Previous rework of TM code leaves these functions unused.

Signed-off-by: Cyril Bur 
---
 arch/powerpc/include/asm/tm.h |  5 -
 arch/powerpc/kernel/fpu.S | 26 --
 arch/powerpc/kernel/vector.S  | 25 -
 3 files changed, 56 deletions(-)

diff --git a/arch/powerpc/include/asm/tm.h b/arch/powerpc/include/asm/tm.h
index c22d704..82e06ca 100644
--- a/arch/powerpc/include/asm/tm.h
+++ b/arch/powerpc/include/asm/tm.h
@@ -9,11 +9,6 @@
 
 #ifndef __ASSEMBLY__
 
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-extern void do_load_up_transact_fpu(struct thread_struct *thread);
-extern void do_load_up_transact_altivec(struct thread_struct *thread);
-#endif
-
 extern void tm_enable(void);
 extern void tm_reclaim(struct thread_struct *thread,
   unsigned long orig_msr, uint8_t cause);
diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index 181c187..08d14b0 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -50,32 +50,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX);  \
 #define REST_32FPVSRS(n,c,base) __REST_32FPVSRS(n,__REG_##c,__REG_##base)
 #define SAVE_32FPVSRS(n,c,base) __SAVE_32FPVSRS(n,__REG_##c,__REG_##base)
 
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-/* void do_load_up_transact_fpu(struct thread_struct *thread)
- *
- * This is similar to load_up_fpu but for the transactional version of the FP
- * register set.  It doesn't mess with the task MSR or valid flags.
- * Furthermore, we don't do lazy FP with TM currently.
- */
-_GLOBAL(do_load_up_transact_fpu)
-   mfmsr   r6
-   ori r5,r6,MSR_FP
-#ifdef CONFIG_VSX
-BEGIN_FTR_SECTION
-   oris    r5,r5,MSR_VSX@h
-END_FTR_SECTION_IFSET(CPU_FTR_VSX)
-#endif
-   SYNC
-   MTMSRD(r5)
-
-   addi    r7,r3,THREAD_CKFPSTATE
-   lfd fr0,FPSTATE_FPSCR(r7)
-   MTFSF_L(fr0)
-   REST_32FPVSRS(0, R4, R7)
-
-   blr
-#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
-
 /*
  * Load state from memory into FP registers including FPSCR.
  * Assumes the caller has enabled FP in the MSR.
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index b5d5025..84b19ab 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -7,31 +7,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-/* void do_load_up_transact_altivec(struct thread_struct *thread)
- *
- * This is similar to load_up_altivec but for the transactional version of the
- * vector regs.  It doesn't mess with the task MSR or valid flags.
- * Furthermore, VEC laziness is not supported with TM currently.
- */
-_GLOBAL(do_load_up_transact_altivec)
-   mfmsr   r6
-   oris    r5,r6,MSR_VEC@h
-   MTMSRD(r5)
-   isync
-
-   li  r4,1
-   stw r4,THREAD_USED_VR(r3)
-
-   li  r10,THREAD_CKVRSTATE+VRSTATE_VSCR
-   lvx v0,r10,r3
-   mtvscr  v0
-   addi    r10,r3,THREAD_CKVRSTATE
-   REST_32VRS(0,r4,r10)
-
-   blr
-#endif
-
 /*
  * Load state from memory into VMX registers including VSCR.
  * Assumes the caller has enabled VMX in the MSR.
-- 
2.8.3


[PATCH 0/5] Consistent TM structures

2016-06-07 Thread Cyril Bur
Hi,

The reason for this series is outlined in 3/5. I'll reexplain here
quickly.

If userspace doesn't use TM at all then pt_regs, fp_state and vr_state
hold (almost) all the register state of the CPU.

If userspace uses TM then pt_regs is ALWAYS the live state. This may
be a transactional speculative state or if the thread is between
transactions it is just the regular live state. The checkpointed state
(if needed) always exists in ckpt_regs.
This is not true of fp_state and vr_state which MAY hold a live state
when the thread has not entered a transaction but will then contain
checkpointed values once a thread enters a transaction.
transact_fp and transact_vr are used only when a thread is in a
transaction (active or suspended) to keep the live (but speculative)
state.

Here I aim to remove this disconnect and have everything behave like
pt_regs.

For ease of review I've left patches 3, 4 and 5 separate. It probably
makes sense for them to be squashed into one, since the naming
inconsistency between 3 and 4 can't be a good idea.

A few apologies for this series:
 - I had to write tests to have an idea whether what I've done is correct;
they're still a bit rough around the edges.
 - In the process I made more of the asm helpers shared, as the powerpc/math
selftests had quite a few things I found useful.
 - This pretty much means the 2/5 monster should be a few patches. I'll
split them up.

I didn't want this series held up from initial review while I cleaned
up tests.

Thanks,

Cyril

Cyril Bur (5):
  selftests/powerpc: Check for VSX preservation across userspace
preemption
  selftests/powerpc: Add test to check TM ucontext creation
  powerpc: tm: Always use fp_state and vr_state to store live registers
  powerpc: tm: Rename transct_(*) to ck(\1)_state
  powerpc: Remove do_load_up_transact_{fpu,altivec}

 arch/powerpc/include/asm/processor.h   |  20 +--
 arch/powerpc/include/asm/tm.h  |   5 -
 arch/powerpc/kernel/asm-offsets.c  |  12 +-
 arch/powerpc/kernel/fpu.S  |  26 
 arch/powerpc/kernel/process.c  |  48 ++-
 arch/powerpc/kernel/signal.h   |   8 +-
 arch/powerpc/kernel/signal_32.c|  84 ++---
 arch/powerpc/kernel/signal_64.c|  59 -
 arch/powerpc/kernel/tm.S   |  95 +++---
 arch/powerpc/kernel/traps.c|  12 +-
 arch/powerpc/kernel/vector.S   |  25 
 tools/testing/selftests/powerpc/basic_asm.h|   4 +
 tools/testing/selftests/powerpc/fpu_asm.h  |  72 +++
 tools/testing/selftests/powerpc/gpr_asm.h  |  96 ++
 tools/testing/selftests/powerpc/math/Makefile  |   4 +-
 tools/testing/selftests/powerpc/math/fpu_asm.S |  73 +--
 tools/testing/selftests/powerpc/math/vmx_asm.S |  85 +
 tools/testing/selftests/powerpc/math/vsx_asm.S |  57 +
 tools/testing/selftests/powerpc/math/vsx_preempt.c | 140 +
 tools/testing/selftests/powerpc/tm/Makefile|   9 +-
 .../powerpc/tm/tm-signal-context-chk-fpu.c |  94 ++
 .../powerpc/tm/tm-signal-context-chk-gpr.c |  96 ++
 .../powerpc/tm/tm-signal-context-chk-vmx.c | 112 +
 .../powerpc/tm/tm-signal-context-chk-vsx.c | 127 +++
 .../selftests/powerpc/tm/tm-signal-context-chk.c   | 102 +++
 tools/testing/selftests/powerpc/tm/tm-signal.S | 105 
 tools/testing/selftests/powerpc/vmx_asm.h  |  98 +++
 tools/testing/selftests/powerpc/vsx_asm.h  |  71 +++
 28 files changed, 1343 insertions(+), 396 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/fpu_asm.h
 create mode 100644 tools/testing/selftests/powerpc/gpr_asm.h
 create mode 100644 tools/testing/selftests/powerpc/math/vsx_asm.S
 create mode 100644 tools/testing/selftests/powerpc/math/vsx_preempt.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk-fpu.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk-gpr.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk-vmx.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk-vsx.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal.S
 create mode 100644 tools/testing/selftests/powerpc/vmx_asm.h
 create mode 100644 tools/testing/selftests/powerpc/vsx_asm.h

-- 
2.8.3


[PATCH 4/5] powerpc: tm: Rename transct_(*) to ck(\1)_state

2016-06-07 Thread Cyril Bur
Make the structures being used for checkpointed state named
consistently with the pt_regs/ckpt_regs.

Signed-off-by: Cyril Bur 
---
 arch/powerpc/include/asm/processor.h | 20 +++-
 arch/powerpc/kernel/asm-offsets.c| 12 
 arch/powerpc/kernel/fpu.S|  2 +-
 arch/powerpc/kernel/process.c|  4 +--
 arch/powerpc/kernel/signal.h |  8 ++---
 arch/powerpc/kernel/signal_32.c  | 60 ++--
 arch/powerpc/kernel/signal_64.c  | 32 +--
 arch/powerpc/kernel/tm.S | 12 
 arch/powerpc/kernel/vector.S |  4 +--
 9 files changed, 71 insertions(+), 83 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 009fab1..6fd0f00 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -147,7 +147,7 @@ typedef struct {
 } mm_segment_t;
 
 #define TS_FPR(i) fp_state.fpr[i][TS_FPROFFSET]
-#define TS_TRANS_FPR(i) transact_fp.fpr[i][TS_FPROFFSET]
+#define TS_CKFPR(i) ckfp_state.fpr[i][TS_FPROFFSET]
 
 /* FP and VSX 0-31 register set */
 struct thread_fp_state {
@@ -266,21 +266,9 @@ struct thread_struct {
unsigned long   tm_ppr;
unsigned long   tm_dscr;
 
-   /*
-* Transactional FP and VSX 0-31 register set.
-* NOTE: the sense of these is the opposite of the integer ckpt_regs!
-*
-* When a transaction is active/signalled/scheduled etc., *regs is the
-* most recent set of/speculated GPRs with ckpt_regs being the older
-* checkpointed regs to which we roll back if transaction aborts.
-*
-* However, fpr[] is the checkpointed 'base state' of FP regs, and
-* transact_fpr[] is the new set of transactional values.
-* VRs work the same way.
-*/
-   struct thread_fp_state transact_fp;
-   struct thread_vr_state transact_vr;
-   unsigned long   transact_vrsave;
+   struct thread_fp_state ckfp_state; /* Checkpointed FP state */
+   struct thread_vr_state ckvr_state; /* Checkpointed VR state */
+   unsigned long   ckvrsave; /* Checkpointed VRSAVE */
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
void*   kvm_shadow_vcpu; /* KVM internal data */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9ea0955..e67741f 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -152,12 +152,12 @@ int main(void)
DEFINE(THREAD_TM_PPR, offsetof(struct thread_struct, tm_ppr));
DEFINE(THREAD_TM_DSCR, offsetof(struct thread_struct, tm_dscr));
DEFINE(PT_CKPT_REGS, offsetof(struct thread_struct, ckpt_regs));
-   DEFINE(THREAD_TRANSACT_VRSTATE, offsetof(struct thread_struct,
-transact_vr));
-   DEFINE(THREAD_TRANSACT_VRSAVE, offsetof(struct thread_struct,
-   transact_vrsave));
-   DEFINE(THREAD_TRANSACT_FPSTATE, offsetof(struct thread_struct,
-transact_fp));
+   DEFINE(THREAD_CKVRSTATE, offsetof(struct thread_struct,
+ckvr_state));
+   DEFINE(THREAD_CKVRSAVE, offsetof(struct thread_struct,
+   ckvrsave));
+   DEFINE(THREAD_CKFPSTATE, offsetof(struct thread_struct,
+ckfp_state));
/* Local pt_regs on stack for Transactional Memory funcs. */
DEFINE(TM_FRAME_SIZE, STACK_FRAME_OVERHEAD +
   sizeof(struct pt_regs) + 16);
diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index 15da2b5..181c187 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -68,7 +68,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
SYNC
MTMSRD(r5)
 
-   addi    r7,r3,THREAD_TRANSACT_FPSTATE
+   addi    r7,r3,THREAD_CKFPSTATE
lfd fr0,FPSTATE_FPSCR(r7)
MTFSF_L(fr0)
REST_32FPVSRS(0, R4, R7)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 696e0236..15462c9 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -813,8 +813,8 @@ static inline void tm_reclaim_task(struct task_struct *tsk)
 *
 * In switching we need to maintain a 2nd register state as
 * oldtask->thread.ckpt_regs.  We tm_reclaim(oldproc); this saves the
-* checkpointed (tbegin) state in ckpt_regs and saves the transactional
-* (current) FPRs into oldtask->thread.transact_fpr[].
+* checkpointed (tbegin) state in ckpt_regs, ckfp_state and
+* ckvr_state
 *
 * We also context switch (save) TFHAR/TEXASR/TFIAR in here.
 */
diff --git a/arch/powerpc/kernel/signal.h b/arch/powerpc/kernel/signal.h
index be305c8..b1cebf66 100644
--- a/

[PATCH 2/5] selftests/powerpc: Add test to check TM ucontext creation

2016-06-07 Thread Cyril Bur
Signed-off-by: Cyril Bur 
---
 tools/testing/selftests/powerpc/basic_asm.h|   4 +
 tools/testing/selftests/powerpc/fpu_asm.h  |  72 
 tools/testing/selftests/powerpc/gpr_asm.h  |  96 
 tools/testing/selftests/powerpc/math/fpu_asm.S |  73 +---
 tools/testing/selftests/powerpc/math/vmx_asm.S |  85 +-
 tools/testing/selftests/powerpc/tm/Makefile|   9 +-
 .../powerpc/tm/tm-signal-context-chk-fpu.c |  94 +++
 .../powerpc/tm/tm-signal-context-chk-gpr.c |  96 
 .../powerpc/tm/tm-signal-context-chk-vmx.c | 112 ++
 .../powerpc/tm/tm-signal-context-chk-vsx.c | 127 +
 .../selftests/powerpc/tm/tm-signal-context-chk.c   | 102 +
 tools/testing/selftests/powerpc/tm/tm-signal.S | 105 +
 tools/testing/selftests/powerpc/vmx_asm.h  |  98 
 13 files changed, 920 insertions(+), 153 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/fpu_asm.h
 create mode 100644 tools/testing/selftests/powerpc/gpr_asm.h
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk-fpu.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk-gpr.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk-vmx.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk-vsx.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal-context-chk.c
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-signal.S
 create mode 100644 tools/testing/selftests/powerpc/vmx_asm.h

diff --git a/tools/testing/selftests/powerpc/basic_asm.h b/tools/testing/selftests/powerpc/basic_asm.h
index 3349a07..5131059 100644
--- a/tools/testing/selftests/powerpc/basic_asm.h
+++ b/tools/testing/selftests/powerpc/basic_asm.h
@@ -4,6 +4,10 @@
 #include 
 #include 
 
+#define TBEGIN .long 0x7C00051D
+#define TSUSPEND .long 0x7C0005DD
+#define TRESUME .long 0x7C2005DD
+
 #define LOAD_REG_IMMEDIATE(reg,expr) \
lis reg,(expr)@highest; \
ori reg,reg,(expr)@higher;  \
diff --git a/tools/testing/selftests/powerpc/fpu_asm.h b/tools/testing/selftests/powerpc/fpu_asm.h
new file mode 100644
index 000..a73a7a9
--- /dev/null
+++ b/tools/testing/selftests/powerpc/fpu_asm.h
@@ -0,0 +1,72 @@
+#ifndef _SELFTESTS_POWERPC_FPU_ASM_H
+#define _SELFTESTS_POWERPC_FPU_ASM_H
+#include "basic_asm.h"
+
+#define PUSH_FPU(stack_size) \
+   stfd f31,(stack_size + STACK_FRAME_MIN_SIZE)(%r1); \
+   stfd f30,(stack_size + STACK_FRAME_MIN_SIZE - 8)(%r1); \
+   stfd f29,(stack_size + STACK_FRAME_MIN_SIZE - 16)(%r1); \
+   stfd f28,(stack_size + STACK_FRAME_MIN_SIZE - 24)(%r1); \
+   stfd f27,(stack_size + STACK_FRAME_MIN_SIZE - 32)(%r1); \
+   stfd f26,(stack_size + STACK_FRAME_MIN_SIZE - 40)(%r1); \
+   stfd f25,(stack_size + STACK_FRAME_MIN_SIZE - 48)(%r1); \
+   stfd f24,(stack_size + STACK_FRAME_MIN_SIZE - 56)(%r1); \
+   stfd f23,(stack_size + STACK_FRAME_MIN_SIZE - 64)(%r1); \
+   stfd f22,(stack_size + STACK_FRAME_MIN_SIZE - 72)(%r1); \
+   stfd f21,(stack_size + STACK_FRAME_MIN_SIZE - 80)(%r1); \
+   stfd f20,(stack_size + STACK_FRAME_MIN_SIZE - 88)(%r1); \
+   stfd f19,(stack_size + STACK_FRAME_MIN_SIZE - 96)(%r1); \
+   stfd f18,(stack_size + STACK_FRAME_MIN_SIZE - 104)(%r1); \
+   stfd f17,(stack_size + STACK_FRAME_MIN_SIZE - 112)(%r1); \
+   stfd f16,(stack_size + STACK_FRAME_MIN_SIZE - 120)(%r1); \
+   stfd f15,(stack_size + STACK_FRAME_MIN_SIZE - 128)(%r1); \
+   stfd f14,(stack_size + STACK_FRAME_MIN_SIZE - 136)(%r1);
+
+#define POP_FPU(stack_size) \
+   lfd f31,(stack_size + STACK_FRAME_MIN_SIZE)(%r1); \
+   lfd f30,(stack_size + STACK_FRAME_MIN_SIZE - 8)(%r1); \
+   lfd f29,(stack_size + STACK_FRAME_MIN_SIZE - 16)(%r1); \
+   lfd f28,(stack_size + STACK_FRAME_MIN_SIZE - 24)(%r1); \
+   lfd f27,(stack_size + STACK_FRAME_MIN_SIZE - 32)(%r1); \
+   lfd f26,(stack_size + STACK_FRAME_MIN_SIZE - 40)(%r1); \
+   lfd f25,(stack_size + STACK_FRAME_MIN_SIZE - 48)(%r1); \
+   lfd f24,(stack_size + STACK_FRAME_MIN_SIZE - 56)(%r1); \
+   lfd f23,(stack_size + STACK_FRAME_MIN_SIZE - 64)(%r1); \
+   lfd f22,(stack_size + STACK_FRAME_MIN_SIZE - 72)(%r1); \
+   lfd f21,(stack_size + STACK_FRAME_MIN_SIZE - 80)(%r1); \
+   lfd f20,(stack_size + STACK_FRAME_MIN_SIZE - 88)(%r1); \
+   lfd f19,(stack_size + STACK_FRAME_MIN_SIZE - 96)(%r1); \
+   lfd f18,(stack_size + STACK_FRAME_MIN_SIZE - 104)(%r1); \
+   lfd f17,(stack_size + STACK_FRAME_MIN_SIZE - 112)(%r1); \
+   lfd f16,(stack_size + STACK_FRAME_MIN_SIZE - 120)(%r1); \
+   lfd f15,(stack_size + STACK_FRAME_MIN_SIZE - 128)(%r1); 

Re: powerpc/mm/radix: Make the pid unsigned long

2016-06-07 Thread Michael Ellerman
On Thu, 2016-02-06 at 09:44:48 UTC, "Aneesh Kumar K.V" wrote:
> Semantic Issue: comparison of constant 18446744073709551615 with
> expression of type 'unsigned int' is always false.
> 
> Signed-off-by: Aneesh Kumar K.V 
> Reviewed-by: Balbir Singh 
> ---
>  arch/powerpc/mm/tlb-radix.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)

I'm going to take this as a fix, I rewrote the change log to:

powerpc/mm/radix: Fix always false comparison against MMU_NO_CONTEXT

In some of the radix TLB flush routines, we use a local to store the
mm->context.id, AKA the PID.

Currently we use an int, but the PID is unsigned long, so large values
of PID will be truncated. In particular MMU_NO_CONTEXT is -1, which
means all our comparisons against that value can never be true.

This means we'll issue TLB flushes when we shouldn't on radix enabled
machines.

Fix it by using an unsigned long for the local. Discovered by Coverity.

Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines")
Signed-off-by: Aneesh Kumar K.V 
Reviewed-by: Balbir Singh 
[mpe: Write change log]
Signed-off-by: Michael Ellerman 

cheers
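
For anyone who wants to see the always-false comparison in isolation, here is a
minimal user-space sketch. It is not the kernel code: NO_CONTEXT_SKETCH stands
in for MMU_NO_CONTEXT, both function names are made up for illustration, and
fixed-width types are used so the result does not depend on the host's word
size.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for MMU_NO_CONTEXT: all ones in a 64-bit value (assumption). */
#define NO_CONTEXT_SKETCH (~(uint64_t)0)

/* Narrow local, as before the fix: the id is truncated to 32 bits. */
static int is_no_context_narrow(uint64_t context_id, uint64_t no_context)
{
    uint32_t pid = (uint32_t)context_id;

    /*
     * pid is zero-extended back to 64 bits for the comparison, so its
     * upper half is always zero and it can never equal an all-ones
     * 64-bit sentinel.
     */
    return pid == no_context;
}

/* Wide local, as after the fix: no truncation, the sentinel is matched. */
static int is_no_context_wide(uint64_t context_id, uint64_t no_context)
{
    uint64_t pid = context_id;

    return pid == no_context;
}
```

With the narrow local the sentinel is never recognised, so a caller that should
have skipped the flush goes ahead and issues it; the wide local matches as
intended.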
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V3 8/9] cpufreq: Keep policy->freq_table sorted in ascending order

2016-06-07 Thread Viresh Kumar
On 08-06-16, 02:38, Rafael J. Wysocki wrote:
> On Tuesday, June 07, 2016 09:58:07 AM Viresh Kumar wrote:
> > On 06-06-16, 23:56, Rafael J. Wysocki wrote:
> > > Since you are adding new code, you can write it so it doesn't do
> > > unnecessary checks from the start.
> > 
> > Hmm, I will do all that in this series only now.
> > 
> > > While at it, the "if ((freq < policy->min) || (freq > policy->max))"
> > > checks in cpufreq_find_index_l() and cpufreq_find_index_h() don't look
> > > good to me, because they very well may cause those function to return
> > > -EINVAL even when there's a valid table and that may cause
> > > acpi_cpufreq_fast_switch() to do bad things.
> > 
> > Hmm. So, the checks are for sure required here, otherwise we may end up
> > returning a frequency which we aren't allowed to. Also note that 'freq' here
> > isn't the target-freq, but the entry in the freq-table.
> > 
> > This routine should be returning a valid freq within the ranges specified by
> > policy->min/max.
> 
> Which in principle may not be possible if the range doesn't include any
> frequency in the table, eg. min == max and between the table entries.

By within ranges I meant, policy->min <= freq <= policy->max, and that's how all
our checks are. So even if the table will have a single valid frequency, we will
return that only.

> However, the CPU has to run at *some* frequency, even if there's none in the
> min/max range.

I completely agree. But the error will be fired only if there is no frequency
within ranges we can switch to. And that's a bug somewhere else then.

> And if we are sure that there is at least one valid frequency between min
> and max, please note that target_freq has already been clamped between them,

Yeah, it's already clamped by the freq-change helpers in the cpufreq core, but others
may not be doing it properly.

> > Also note that these routines shall *never* return -EINVAL, otherwise it is
> > mostly a bug we are hitting.
> 
> So make them explicitly return a valid frequency every time.

I thought about returning index 0 on such errors; will that be fine? Anyway, the
new patches have added a WARN() for such cases.
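
As a rough illustration of the "always return a valid index" idea under
discussion, a lookup could be shaped like the sketch below. This is not the code
from the series: find_index_l_sketch() and its parameters are hypothetical, the
table is assumed to be sorted in ascending order, and the fallback to index 0
is just the option floated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the kernel's cpufreq_find_index_l().
 * Returns the index of the lowest in-range entry at or above the target.
 * If every in-range entry is below the target, returns the highest
 * in-range entry. If no entry lies within [min, max], returns index 0
 * instead of an error code.
 */
static size_t find_index_l_sketch(const unsigned int *table, size_t n,
                                  unsigned int target,
                                  unsigned int min, unsigned int max)
{
    size_t best = 0; /* fallback when nothing is in range */
    size_t i;

    /* Clamp the request into the policy limits first. */
    if (target < min)
        target = min;
    if (target > max)
        target = max;

    for (i = 0; i < n; i++) {
        unsigned int freq = table[i];

        if (freq < min || freq > max)
            continue;

        best = i; /* highest in-range entry seen so far */
        if (freq >= target)
            return i;
    }

    return best;
}
```

The point of the shape is that the caller never has to handle -EINVAL; whether
index 0 is the right fallback, and whether a WARN() should accompany it, is
exactly what is still being discussed here.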

> > We have enough checks in place to make sure that there is at least one valid
> > entry in the freq-table which is >= policy->min and <= policy->max.
> 
> That assuming that the driver will always do the right thing in its ->verify
> callback.

Yeah.

-- 
viresh

Re: [PATCH v10 09/18] powerpc/powernv: Extend PCI bridge resources

2016-06-07 Thread Alexey Kardashevskiy
On 20/05/16 16:41, Gavin Shan wrote:
> The PCI slots are associated with root port or downstream ports
> of the PCIe switch connected to root port. When adapter is hot
> added to the PCI slot, it usually requests more IO or memory
> resource from the directly connected parent bridge (port) and
> update the bridge's windows accordingly. The resource windows
> of upstream bridges can't be updated automatically. It possibly
> leads to unbalanced resource across the bridges: The window of
> downstream bridge is overruning that of upstream bridge. The
> IO or MMIO path won't work.
> 
> This resolves the above issue by extending bridge windows of
> root port and upstream port of the PCIe switch connected to
> the root port to PHB's windows.
> 
> Signed-off-by: Gavin Shan 


This breaks Garrison machine (g86l):

EEH: Frozen PE#f9 on PHB#5 detected
EEH: PE location: Backplane PLX, PHB location: N/A
EEH: This PCI device has failed 1 times in the last hour
EEH: Notify device drivers to shutdown
EEH: Collect temporary log


PHB#5 has a boot device so we end up in initramdisk.


│0005:03:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0
xHCI Host Controller (rev 02)
│0005:04:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9235 PCIe
2.0 x2 4-port SATA 6 Gb/s Controller (rev 11)
│0005:05:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge
(rev 03)
│0005:06:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED
Graphics Family (rev 30)



> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 46 
> +++
>  1 file changed, 46 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 3186a29..e97a5fa 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3221,6 +3221,49 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
>   return phb->ioda.io_segsize;
>  }
>  
> +/*
> + * We are updating root port or the upstream port of the
> + * bridge behind the root port with PHB's windows in order
> + * to accommodate the changes on required resources during
> + * PCI (slot) hotplug, which is connected to either root
> + * port or the downstream ports of PCIe switch behind the
> + * root port.
> + */
> +static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
> +unsigned long type)
> +{
> + struct pci_controller *hose = pci_bus_to_host(bus);
> + struct pnv_phb *phb = hose->private_data;
> + struct pci_dev *bridge = bus->self;
> + struct resource *r, *w;
> + int i;
> +
> + /* Check if we need apply fixup to the bridge's windows */
> + if (!pci_is_root_bus(bridge->bus) &&
> + !pci_is_root_bus(bridge->bus->self->bus))
> + return;
> +
> + /* Fixup the resources */
> + for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
> + r = &bridge->resource[PCI_BRIDGE_RESOURCES + i];
> + if (!r->flags || !r->parent)
> + continue;
> +
> + w = NULL;
> + if (r->flags & type & IORESOURCE_IO)
> + w = &hose->io_resource;
> + else if (pnv_pci_is_mem_pref_64(r->flags) &&
> +  (type & IORESOURCE_PREFETCH) &&
> +  phb->ioda.m64_segsize)
> + w = &hose->mem_resources[1];
> + else if (r->flags & type & IORESOURCE_MEM)
> + w = &hose->mem_resources[0];
> +
> + r->start = w->start;
> + r->end = w->end;
> + }
> +}
> +
>  static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>  {
>   struct pci_controller *hose = pci_bus_to_host(bus);
> @@ -3229,6 +3272,9 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>   struct pnv_ioda_pe *pe;
>   bool all = (pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE);
>  
> + /* Extend bridge's windows if necessary */
> + pnv_pci_fixup_bridge_resources(bus, type);
> +
>   /* The PE for root bus should be realized before any one else */
>   if (!phb->ioda.root_pe_populated) {
>   pe = pnv_ioda_setup_bus_PE(phb->hose->bus, false);
> 


-- 
Alexey

Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-07 Thread Michael Ellerman
On Tue, 2016-06-07 at 22:17 +0200, Christian Zigotzky wrote:
> 
> 764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
> commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
> Author: Aneesh Kumar K.V 
> Date:   Fri Apr 29 23:26:09 2016 +1000
> 
>  powerpc/mm/radix: Add checks in slice code to catch radix usage
> 
>  Radix doesn't need slice support. Catch incorrect usage of slice code
>  when radix is enabled.
> 
>  Signed-off-by: Aneesh Kumar K.V 
>  Signed-off-by: Michael Ellerman 
> 

Hmm, I find that hard to believe. But maybe I'm missing something.

Can you checkout Linus' master and then revert that commit?

cheers


Re: [PATCH 1/2] Add PowerPC AT_HWCAP2 definitions

2016-06-07 Thread David Gibson
On Tue, Jun 07, 2016 at 10:28:42PM +1000, Anton Blanchard wrote:
> From: Anton Blanchard 
> 
> We need the PPC_FEATURE2_HAS_HTM bit in a subsequent patch, so
> add the PowerPC AT_HWCAP2 definitions.
> 
> Signed-off-by: Anton Blanchard 

Applied to ppc-for-2.7.

Paolo or Peter: since this is a change to PPC specific bits, it seems
reasonable to go through my tree although it's technically a generic
header.  If someone wants to drop an explicit Ack, that wouldn't hurt
of course.

> ---
> 
> diff --git a/include/elf.h b/include/elf.h
> index 28d448b..8533b2a 100644
> --- a/include/elf.h
> +++ b/include/elf.h
> @@ -477,6 +477,19 @@ typedef struct {
>  #define PPC_FEATURE_TRUE_LE 0x0002
>  #define PPC_FEATURE_PPC_LE  0x0001
>  
> +/* Bits present in AT_HWCAP2 for PowerPC.  */
> +
> +#define PPC_FEATURE2_ARCH_2_07      0x8000
> +#define PPC_FEATURE2_HAS_HTM        0x4000
> +#define PPC_FEATURE2_HAS_DSCR       0x2000
> +#define PPC_FEATURE2_HAS_EBB        0x1000
> +#define PPC_FEATURE2_HAS_ISEL       0x0800
> +#define PPC_FEATURE2_HAS_TAR        0x0400
> +#define PPC_FEATURE2_HAS_VEC_CRYPTO 0x0200
> +#define PPC_FEATURE2_HTM_NOSC       0x0100
> +#define PPC_FEATURE2_ARCH_3_00      0x0080
> +#define PPC_FEATURE2_HAS_IEEE128    0x0040
> +
>  /* Bits present in AT_HWCAP for Sparc.  */
>  
>  #define HWCAP_SPARC_FLUSH   0x0001
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH 2/2] spapr: Better handling of ibm,pa-features TM bit

2016-06-07 Thread David Gibson
On Tue, Jun 07, 2016 at 10:32:10PM +1000, Anton Blanchard wrote:
> From: Anton Blanchard 
> 
> There are a few issues with our handling of the ibm,pa-features
> TM bit:
> 
> - We don't support transactional memory in PR KVM, so don't tell
>   the OS that we do.
> 
> - In full emulation we have a minimal implementation of TM that always
>   fails, so for performance reasons lets not tell the OS that we
>   support it either.
> 
> - In HV KVM mode, we should mirror the host TM enabled state by
>   looking at the AT_HWCAP2 bit.
> 
> Signed-off-by: Anton Blanchard 

So, we certainly need a change like this.  I'm not entirely happy with
the current implementation though.

> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 0636642..c403fbb 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -620,7 +620,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>  0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
>  0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
>  0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
> -0x80, 0x00, 0x80, 0x00, 0x80, 0x00 };
> +0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
>  uint8_t *pa_features;
>  size_t pa_size;
>  
> @@ -697,6 +697,19 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>  } else /* env->mmu_model == POWERPC_MMU_2_07 */ {
>  pa_features = pa_features_207;
>  pa_size = sizeof(pa_features_207);
> +
> +#ifdef CONFIG_KVM
> +/* Only enable TM in HV KVM mode */
> +if (kvm_enabled() &&
> +!kvm_vm_check_extension(cs->kvm_state, KVM_CAP_PPC_GET_PVINFO)) {
> +unsigned long hwcap2 = qemu_getauxval(AT_HWCAP2);
> +
> +/* Guest should inherit host TM enabled bit */
> +if (hwcap2 & PPC_FEATURE2_HAS_HTM) {
> +pa_features[24] |= 0x80;
> +}
> +}
> +#endif

So first, I think this stanza wants to move into target-ppc/kvm.c -
maybe a kvm_filter_pa_features() call or something.

Second, although using PVINFO to determine if we have HV KVM is a
standard trick, we don't want to use it as our first option.  We
really want to introduce an actual KVM CAP flag for TM support, then
fall back to checking PVINFO if we can't use that.

I wonder if we actually want to just blanket disable TM in one patch -
since it doesn't work at all with PR KVM, and "works" only in the most
rules-lawyering and useless way on TCG.  Then re-enable it on HV KVM
in a second patch.
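
To make the "disable everywhere, then re-enable on HV KVM" shape concrete, here
is a small stand-alone sketch of such a filter. It is not QEMU code:
sketch_filter_pa_features() and the SKETCH_* names are hypothetical, the HV KVM
decision and the host AT_HWCAP2 word are passed in as plain arguments rather
than queried, and the HTM constant is assumed to be the glibc value
(0x40000000). The byte index and bit come from the quoted hunk
(pa_features[24] |= 0x80).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of PPC_FEATURE2_HAS_HTM, taken from glibc's hwcap header. */
#define SKETCH_FEATURE2_HAS_HTM 0x40000000UL

/* Byte and bit used for TM in the quoted patch. */
#define SKETCH_PA_TM_BYTE 24
#define SKETCH_PA_TM_BIT  0x80

/*
 * Hypothetical filter: clear the TM bit unconditionally, then set it again
 * only when running under HV KVM and the host itself advertises HTM.
 */
static void sketch_filter_pa_features(uint8_t *pa_features, size_t pa_size,
                                      int hv_kvm, unsigned long host_hwcap2)
{
    /* Older, shorter vectors have no TM byte at all. */
    if (pa_size <= SKETCH_PA_TM_BYTE)
        return;

    pa_features[SKETCH_PA_TM_BYTE] &= (uint8_t)~SKETCH_PA_TM_BIT;

    if (hv_kvm && (host_hwcap2 & SKETCH_FEATURE2_HAS_HTM))
        pa_features[SKETCH_PA_TM_BYTE] |= SKETCH_PA_TM_BIT;
}
```

How hv_kvm is determined (a dedicated KVM capability first, PVINFO as the
fallback) is deliberately left out of the sketch, since that is the open
question in this review.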

>  }
>  if (env->ci_large_pages) {
>  pa_features[3] |= 0x20;
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-07 Thread Michael Ellerman
On Wed, 2016-06-08 at 00:14 +0200, Christian Zigotzky wrote:
> Hi All,
> 
> I replaced the file "slice.c" with the old one from kernel 4.6. It 
> compiled but unfortunately it doesn't boot.

I would expect nothing else.

You can't just replace whole files from different versions, the unit of work is
a commit, not a file.

cheers


Re: [PATCH] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call

2016-06-07 Thread Balbir Singh


On 31/05/16 20:32, Michael Ellerman wrote:
> On Tue, 2016-05-31 at 12:19 +0200, Thomas Huth wrote:
>> On 31.05.2016 12:04, Michael Ellerman wrote:
>>> On Tue, 2016-05-31 at 07:51 +0200, Thomas Huth wrote:
>>>> If we do not provide the PVR for POWER8NVL, a guest on this
>>>> system currently ends up in PowerISA 2.06 compatibility mode on
>>>> KVM, since QEMU does not provide a generic PowerISA 2.07 mode yet.
>>>> So some new instructions from POWER8 (like "mtvsrd") get disabled
>>>> for the guest, resulting in crashes when using code compiled
>>>> explicitly for POWER8 (e.g. with the "-mcpu=power8" option of GCC).
>>>>
>>>> Signed-off-by: Thomas Huth 
>>>
>>> So this should say:
>>>
>>>   Fixes: ddee09c099c3 ("powerpc: Add PVR for POWER8NVL processor")
>>>
>>> And therefore:
>>>
>>>   Cc: sta...@vger.kernel.org # v4.0+
>>>
>>> Am I right?
>>
>> Right. (At least for virtualized systems ... for bare-metal systems,
>> that original patch was enough). So shall I resubmit my patch with these
>> two lines, or could you add them when you pick this patch up?
> 
> Thanks, I'll add them here.

Don't we need to update IBM_ARCH_VEC_NRCORES_OFFSET as well?

Balbir Singh

Re: [PATCH V3 8/9] cpufreq: Keep policy->freq_table sorted in ascending order

2016-06-07 Thread Rafael J. Wysocki
On Tuesday, June 07, 2016 09:58:07 AM Viresh Kumar wrote:
> On 06-06-16, 23:56, Rafael J. Wysocki wrote:
> > Since you are adding new code, you can write it so it doesn't do
> > unnecessary checks from the start.
> 
> Hmm, I will do all that in this series only now.
> 
> > While at it, the "if ((freq < policy->min) || (freq > policy->max))"
> > checks in cpufreq_find_index_l() and cpufreq_find_index_h() don't look
> > good to me, because they very well may cause those function to return
> > -EINVAL even when there's a valid table and that may cause
> > acpi_cpufreq_fast_switch() to do bad things.
> 
> Hmm. So, the checks are for sure required here, otherwise we may end up
> returning a frequency which we aren't allowed to. Also note that 'freq' here
> isn't the target-freq, but the entry in the freq-table.
> 
> This routine should be returning a valid freq within the ranges specified by
> policy->min/max.

Which in principle may not be possible if the range doesn't include any
frequency in the table, eg. min == max and between the table entries.

However, the CPU has to run at *some* frequency, even if there's none in the
min/max range.

And if we are sure that there is at least one valid frequency between min
and max, please note that target_freq has already been clamped between them,
so clamping again is rather unuseful.  And of course it is racy in general,
which makes it even more unuseful.

> Also note that these routines shall *never* return -EINVAL, otherwise it is
> mostly a bug we are hitting.

So make them explicitly return a valid frequency every time.

> We have enough checks in place to make sure that there is at least one valid
> entry in the freq-table which is >= policy->min and <= policy->max.

That assuming that the driver will always do the right thing in its ->verify
callback.

> I will take care of rest of the comments though. Thanks.

Thanks!


Re: [Patch v3 2/5] fsl/qe: setup clock source for TDM mode

2016-06-07 Thread David Miller
From: Zhao Qiang 
Date: Mon, 6 Jun 2016 14:29:59 +0800

> Add tdm clock configuration in both qe clock system and ucc
> fast controller.
> 
> Signed-off-by: Zhao Qiang 

Applied.

Re: [Patch v3 5/5] drivers/net: support hdlc function for QE-UCC

2016-06-07 Thread David Miller
From: Zhao Qiang 
Date: Mon, 6 Jun 2016 14:30:02 +0800

> The driver add hdlc support for Freescale QUICC Engine.
> It support NMSI and TSA mode.
> 
> Signed-off-by: Zhao Qiang 

Applied.

Re: [Patch v3 1/5] fsl/qe: add rx_sync and tx_sync for TDM mode

2016-06-07 Thread David Miller
From: Zhao Qiang 
Date: Mon, 6 Jun 2016 14:29:58 +0800

> Rx_sync and tx_sync are used by QE-TDM mode,
> add them to struct ucc_fast_info.
> 
> Signed-off-by: Zhao Qiang 

Applied.

Re: [Patch v3 3/5] fsl/qe: Make regs resouce_size_t

2016-06-07 Thread David Miller
From: Zhao Qiang 
Date: Mon, 6 Jun 2016 14:30:00 +0800

> Signed-off-by: Zhao Qiang 

Applied.

Re: [Patch v3 4/5] fsl/qe: Add QE TDM lib

2016-06-07 Thread David Miller
From: Zhao Qiang 
Date: Mon, 6 Jun 2016 14:30:01 +0800

> QE has module to support TDM, some other protocols
> supported by QE are based on TDM.
> add a qe-tdm lib, this lib provides functions to the protocols
> using TDM to configurate QE-TDM.
> 
> Signed-off-by: Zhao Qiang 

Applied.

Re: [PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-07 Thread Alexei Starovoitov
On Tue, Jun 07, 2016 at 07:02:23PM +0530, Naveen N. Rao wrote:
> PPC64 eBPF JIT compiler.
> 
> Enable with:
> echo 1 > /proc/sys/net/core/bpf_jit_enable
> or
> echo 2 > /proc/sys/net/core/bpf_jit_enable
> 
> ... to see the generated JIT code. This can further be processed with
> tools/net/bpf_jit_disasm.
> 
> With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
> test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]
> 
> ... on both ppc64 BE and LE.

Nice. That's even better than on x64 which cannot jit one test:
test_bpf: #262 BPF_MAXINSNS: Jump, gap, jump, ... jited:0 168 PASS
which was designed specifically to hit x64 jit pass limit.
The ppc jit has a predictable number of passes and doesn't have this problem,
as expected. Great.

> The details of the approach are documented through various comments in
> the code.
> 
> Cc: Matt Evans 
> Cc: Denis Kirjanov 
> Cc: Michael Ellerman 
> Cc: Paul Mackerras 
> Cc: Alexei Starovoitov 
> Cc: Daniel Borkmann 
> Cc: "David S. Miller" 
> Cc: Ananth N Mavinakayanahalli 
> Signed-off-by: Naveen N. Rao 
> ---
>  arch/powerpc/Kconfig  |   3 +-
>  arch/powerpc/include/asm/asm-compat.h |   2 +
>  arch/powerpc/include/asm/ppc-opcode.h |  20 +-
>  arch/powerpc/net/Makefile |   4 +
>  arch/powerpc/net/bpf_jit.h|  53 +-
>  arch/powerpc/net/bpf_jit64.h  | 102 
>  arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
>  arch/powerpc/net/bpf_jit_comp64.c | 956 
> ++
>  8 files changed, 1317 insertions(+), 3 deletions(-)
>  create mode 100644 arch/powerpc/net/bpf_jit64.h
>  create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
>  create mode 100644 arch/powerpc/net/bpf_jit_comp64.c

don't see any issues with the code.
Thank you for working on this.

Acked-by: Alexei Starovoitov 


[PATCH] powerpc: Fix IBM_ARCH_VEC_NRCORES_OFFSET value

2016-06-07 Thread Benjamin Herrenschmidt
Commit 7cc851039d643a2ee7df4d18177150f2c3a484f5
"powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call"
introduced a regression by adding fields to the beginning of the
ibm_architecture_vec structure without updating IBM_ARCH_VEC_NRCORES_OFFSET.

This causes the kernel to print a warning at boot and to fail to adjust
the number of cores based on the number of threads before doing the CAS
call to firmware.

This is quite a fragile piece of code sadly, we should try to find a way
to avoid that hard coded offset at some point, but for now this fixes it.

Signed-off-by: Benjamin Herrenschmidt 
---

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index ccd2037..6ee4b72 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -719,7 +719,7 @@ unsigned char ibm_architecture_vec[] = {
     * must match by the macro below. Update the definition if
     * the structure layout changes.
     */
-#define IBM_ARCH_VEC_NRCORES_OFFSET 125
+#define IBM_ARCH_VEC_NRCORES_OFFSET 133
    W(NR_CPUS), /* number of cores supported */
    0,
    0,
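
One possible way to avoid the hard-coded offset mentioned in the change log is
to describe the vector as a struct and let the compiler derive the position
with offsetof(). The sketch below only illustrates that idea: the struct, its
field names and sizes are invented for the example and do not match the real
ibm_architecture_vec layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only. All members are byte arrays,
 * so the compiler inserts no padding and offsets are exact byte positions.
 */
struct arch_vec_sketch {
    uint8_t pvr_entries[8];   /* growing this moves everything below it */
    uint8_t option_header[3];
    uint8_t nr_cores[4];      /* big-endian core count */
    uint8_t trailer[2];
};

/* Derived by the compiler, so it follows any change to the fields above. */
#define SKETCH_NRCORES_OFFSET offsetof(struct arch_vec_sketch, nr_cores)

/* Store the core count big-endian at the derived offset. */
static void sketch_set_nr_cores(struct arch_vec_sketch *vec, uint32_t cores)
{
    uint8_t *p = (uint8_t *)vec + SKETCH_NRCORES_OFFSET;

    p[0] = (uint8_t)(cores >> 24);
    p[1] = (uint8_t)(cores >> 16);
    p[2] = (uint8_t)(cores >> 8);
    p[3] = (uint8_t)cores;
}

/* Read the big-endian core count back from the same offset. */
static uint32_t sketch_get_nr_cores(const struct arch_vec_sketch *vec)
{
    const uint8_t *p = (const uint8_t *)vec + SKETCH_NRCORES_OFFSET;

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

With this shape, adding entries in front of nr_cores changes the derived offset
automatically, which is the class of regression this patch fixes by hand.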

Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-07 Thread Christian Zigotzky

Hi All,

I replaced the file "slice.c" with the old one from kernel 4.6. It 
compiled but unfortunately it doesn't boot.


Cheers,

Christian

On 07 June 2016 at 10:17 PM, Christian Zigotzky wrote:

Hi Michael,

On 06 June 2016 at 02:51 AM, Michael Ellerman wrote:

On Sat, 2016-06-04 at 17:07 +0200, Christian Zigotzky wrote:


Aneesh,

Shall I bisect the kernel from the powerpc git?

No just use linus' tree.


Shall I start with the following commit?

https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/commit/?id=8ffb4103f5e28d7e7890ed4774d8e009f253f56e 


Yeah that would be a good one to start with.

Then mark rc1 as bad and bisect should do the rest.

cheers


"range.size, pgprot_val(pgprot_noncached(__pgprot(0))));" isn't the 
problem. :-) It works.


764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
Author: Aneesh Kumar K.V 
Date:   Fri Apr 29 23:26:09 2016 +1000

powerpc/mm/radix: Add checks in slice code to catch radix usage

Radix doesn't need slice support. Catch incorrect usage of slice code
when radix is enabled.

Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: Michael Ellerman 

Cheers,

Christian


Git Log:

git clone 
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
linux-git


git bisect start

git bisect good 8ffb4103f5e28d7e7890ed4774d8e009f253f56e

git bisect bad 1a695a905c18548062509178b98bc91e67510864 (Linux 4.7-rc1)

Output:

Bisecting: 6333 revisions left to test after this (roughly 13 steps)
[4741526b83c5d3a3d661d1896f9e7414c5730bcb] mm, page_alloc: restore the 
original nodemask if the fast path allocation failed




git bisect good (Linux AmigaoneX1000 
4.6.0_A-EON_AmigaONE_X1000_Nemo-05145-g4741526-dirty #1 SMP Mon Jun 6 
14:35:01 CEST 2016 ppc64 GNU/Linux)


Output:

Bisecting: 3014 revisions left to test after this (roughly 12 steps)
[2f37dd131c5d3a2eac21cd5baf80658b1b02a8ac] Merge tag 'staging-4.7-rc1' 
of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging




I had to replace

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, _PAGE_NO_CACHE|_PAGE_GUARDED);

with

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, 
pgprot_val(pgprot_noncached(__pgprot(0))));



in the file "pci-common.c". After that it compiled but it doesn't boot 
so "git bisect bad"


Output:

Bisecting: 1693 revisions left to test after this (roughly 11 steps)
[54cf809b9512be95f53ed4a5e3b631d1ac42f0fa] locking,qspinlock: Fix 
spin_is_locked() and spin_unlock_wait()




I had to replace

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, _PAGE_NO_CACHE|_PAGE_GUARDED);

with

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, 
pgprot_val(pgprot_noncached(__pgprot(0))));



in the file "pci-common.c". After that it compiled but it doesn't boot 
so "git bisect bad"


Output:

Bisecting: 721 revisions left to test after this (roughly 10 steps)
[f4c80d5a16eb4b08a0d9ade154af1ebdc63f5752] Merge tag 'sound-4.7-rc1' 
of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound




git bisect good (Linux AmigaoneX1000 
4.6.0_A-EON_AmigaONE_X1000_Nemo-05931-gf4c80d5-dirty #1 SMP Mon Jun 6 
19:13:42 CEST 2016 ppc64 GNU/Linux)


Output:

Bisecting: 324 revisions left to test after this (roughly 9 steps)
[c04a5880299eab3da8c10547db96ea9cdffd44a6] Merge tag 'powerpc-4.7-1' 
of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux




I had to replace

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, _PAGE_NO_CACHE|_PAGE_GUARDED);

with

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, 
pgprot_val(pgprot_noncached(__pgprot(0))));



in the file "pci-common.c". After that it compiled but it doesn't boot 
so "git bisect bad"


Output:

Bisecting: 186 revisions left to test after this (roughly 8 steps)
[e9ad9b9bd3a3b95c89a29b2a197476e662db4233] Merge tag 'docs-for-linus' 
of git://git.lwn.net/linux




git bisect good (Linux AmigaoneX1000 
4.6.0_A-EON_AmigaONE_X1000_Nemo-06141-ge9ad9b9-dirty #1 SMP Tue Jun 7 
08:32:33 CEST 2016 ppc64 GNU/Linux)


Bisecting: 93 revisions left to test after this (roughly 7 steps)
[0e5b5ba17ac33a05d9f4a48b5eb8b5e30f2274d7] cxl: Remove duplicate #defines



I had to replace

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, _PAGE_NO_CACHE|_PAGE_GUARDED);

with

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, 
pgprot_val(pgprot_noncached(__pgprot(0))));



in the file "pci-common.c". After that it compiled but it doesn't boot 
so "git bisect bad"


Output:

Bisecting: 46 revisions left to test after this (roughly 6 steps)
[764041e0f43cc7846f6d8eb246d65b53cc06c764] powerpc/mm/radix: Add 
checks in slice code t

[PATCH 8/8] dmaengine: Remove site specific OOM error messages on kzalloc

2016-06-07 Thread Peter Griffin
If kzalloc() fails it will issue its own error message including
a dump_stack(). So remove the site specific error messages.

Signed-off-by: Peter Griffin 
---
 drivers/dma/amba-pl08x.c| 10 +-
 drivers/dma/bestcomm/bestcomm.c |  2 --
 drivers/dma/edma.c  | 16 
 drivers/dma/fsldma.c|  2 --
 drivers/dma/k3dma.c | 10 --
 drivers/dma/mmp_tdma.c  |  5 ++---
 drivers/dma/moxart-dma.c|  4 +---
 drivers/dma/nbpfaxi.c   |  5 ++---
 drivers/dma/pl330.c |  5 +
 drivers/dma/ppc4xx/adma.c   |  2 --
 drivers/dma/s3c24xx-dma.c   |  5 +
 drivers/dma/sh/shdmac.c |  9 ++---
 drivers/dma/sh/sudmac.c |  9 ++---
 drivers/dma/sirf-dma.c  |  5 ++---
 drivers/dma/ste_dma40.c |  4 +---
 drivers/dma/tegra20-apb-dma.c   | 11 +++
 drivers/dma/timb_dma.c  |  8 ++--
 17 files changed, 28 insertions(+), 84 deletions(-)

diff --git a/drivers/dma/amba-pl08x.c b/drivers/dma/amba-pl08x.c
index 81db1c4..939a7c3 100644
--- a/drivers/dma/amba-pl08x.c
+++ b/drivers/dma/amba-pl08x.c
@@ -1443,8 +1443,6 @@ static struct dma_async_tx_descriptor *pl08x_prep_dma_memcpy(
dsg = kzalloc(sizeof(struct pl08x_sg), GFP_NOWAIT);
if (!dsg) {
pl08x_free_txd(pl08x, txd);
-   dev_err(&pl08x->adev->dev, "%s no memory for pl080 sg\n",
-   __func__);
return NULL;
}
list_add_tail(&dsg->node, &txd->dsg_list);
@@ -1901,11 +1899,8 @@ static int pl08x_dma_init_virtual_channels(struct pl08x_driver_data *pl08x,
 */
for (i = 0; i < channels; i++) {
chan = kzalloc(sizeof(*chan), GFP_KERNEL);
-   if (!chan) {
-   dev_err(&pl08x->adev->dev,
-   "%s no memory for channel\n", __func__);
+   if (!chan)
return -ENOMEM;
-   }
 
chan->host = pl08x;
chan->state = PL08X_CHAN_IDLE;
@@ -2360,9 +2355,6 @@ static int pl08x_probe(struct amba_device *adev, const struct amba_id *id)
pl08x->phy_chans = kzalloc((vd->channels * sizeof(*pl08x->phy_chans)),
GFP_KERNEL);
if (!pl08x->phy_chans) {
-   dev_err(&adev->dev, "%s failed to allocate "
-   "physical channel holders\n",
-   __func__);
ret = -ENOMEM;
goto out_no_phychans;
}
diff --git a/drivers/dma/bestcomm/bestcomm.c b/drivers/dma/bestcomm/bestcomm.c
index 180fedb..7ce8437 100644
--- a/drivers/dma/bestcomm/bestcomm.c
+++ b/drivers/dma/bestcomm/bestcomm.c
@@ -397,8 +397,6 @@ static int mpc52xx_bcom_probe(struct platform_device *op)
/* Get a clean struct */
bcom_eng = kzalloc(sizeof(struct bcom_engine), GFP_KERNEL);
if (!bcom_eng) {
-   printk(KERN_ERR DRIVER_NAME ": "
-   "Can't allocate state structure\n");
rv = -ENOMEM;
goto error_sramclean;
}
diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 8181ed1..3c84cd8 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -1069,10 +1069,8 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
 
edesc = kzalloc(sizeof(*edesc) + sg_len * sizeof(edesc->pset[0]),
GFP_ATOMIC);
-   if (!edesc) {
-   dev_err(dev, "%s: Failed to allocate a descriptor\n", __func__);
+   if (!edesc)
return NULL;
-   }
 
edesc->pset_nr = sg_len;
edesc->residue = 0;
@@ -1173,10 +1171,8 @@ static struct dma_async_tx_descriptor *edma_prep_dma_memcpy(
 
edesc = kzalloc(sizeof(*edesc) + nslots * sizeof(edesc->pset[0]),
GFP_ATOMIC);
-   if (!edesc) {
-   dev_dbg(dev, "Failed to allocate a descriptor\n");
+   if (!edesc)
return NULL;
-   }
 
edesc->pset_nr = nslots;
edesc->residue = edesc->residue_stat = len;
@@ -1298,10 +1294,8 @@ static struct dma_async_tx_descriptor *edma_prep_dma_cyclic(
 
edesc = kzalloc(sizeof(*edesc) + nslots * sizeof(edesc->pset[0]),
GFP_ATOMIC);
-   if (!edesc) {
-   dev_err(dev, "%s: Failed to allocate a descriptor\n", __func__);
+   if (!edesc)
return NULL;
-   }
 
edesc->cyclic = 1;
edesc->pset_nr = nslots;
@@ -2207,10 +2201,8 @@ static int edma_probe(struct platform_device *pdev)
return ret;
 
ecc = devm_kzalloc(dev, sizeof(*ecc), GFP_KERNEL);
-   if (!ecc) {
-   dev_err(dev, "Can't allocate controller\n");
+   if (!ecc)
return -ENOMEM;
-   }
 
ecc->dev = dev;
ecc->id = pdev->id;
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index a8828ed..911b717 

[PATCH 7/8] dmaengine: tegra20-apb-dma: Only calculate residue if txstate exists.

2016-06-07 Thread Peter Griffin
There is no point calculating the residue if there is
no txstate to store the value.

Signed-off-by: Peter Griffin 
---
 drivers/dma/tegra20-apb-dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
index 01e316f..7f4af8c 100644
--- a/drivers/dma/tegra20-apb-dma.c
+++ b/drivers/dma/tegra20-apb-dma.c
@@ -814,7 +814,7 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan 
*dc,
unsigned int residual;
 
ret = dma_cookie_status(dc, cookie, txstate);
-   if (ret == DMA_COMPLETE)
+   if (ret == DMA_COMPLETE || !txstate)
return ret;
 
spin_lock_irqsave(&tdc->lock, flags);
-- 
1.9.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
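
The early return described in this patch can be sketched outside the kernel as follows. This is a minimal user-space model, not the driver code: the `model_*` names, the status enum and the residue value of 128 are all hypothetical stand-ins chosen for illustration. It only shows that the residue calculation is skipped when the transfer is complete or when the caller passed no state to store the result in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the dmaengine types; not the kernel definitions. */
enum model_dma_status { MODEL_DMA_COMPLETE, MODEL_DMA_IN_PROGRESS };

struct model_tx_state {
    unsigned int residue;
};

/* Counts how often the (potentially expensive) residue calculation runs. */
static int model_residue_calls;

static unsigned int model_compute_residue(void)
{
    model_residue_calls++;
    return 128; /* made-up number of bytes left */
}

/*
 * Same shape as the patched check: return early when the transfer is
 * complete or when there is no state structure to fill in.
 */
static enum model_dma_status
model_tx_status(enum model_dma_status cookie_status,
                struct model_tx_state *txstate)
{
    if (cookie_status == MODEL_DMA_COMPLETE || !txstate)
        return cookie_status;

    txstate->residue = model_compute_residue();
    return cookie_status;
}

/* Small usage example: a NULL state must not trigger the calculation. */
static void model_tx_status_demo(void)
{
    model_residue_calls = 0;
    model_tx_status(MODEL_DMA_IN_PROGRESS, NULL);
    assert(model_residue_calls == 0);
}
```

The same shape applies to the other "only calculate residue" patches in this series; ste_dma40 expresses it as the inverted condition guarding the store instead of an early return.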

[PATCH 6/8] dmaengine: sun6i-dma: Only calculate residue if state exists.

2016-06-07 Thread Peter Griffin
There is no point in calculating the residue if state does not
exist to store the value.

Signed-off-by: Peter Griffin 
---
 drivers/dma/sun6i-dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/sun6i-dma.c b/drivers/dma/sun6i-dma.c
index 5065ca4..3835fcd 100644
--- a/drivers/dma/sun6i-dma.c
+++ b/drivers/dma/sun6i-dma.c
@@ -865,7 +865,7 @@ static enum dma_status sun6i_dma_tx_status(struct dma_chan 
*chan,
size_t bytes = 0;
 
ret = dma_cookie_status(chan, cookie, state);
-   if (ret == DMA_COMPLETE)
+   if (ret == DMA_COMPLETE || !state)
return ret;
 
spin_lock_irqsave(&vchan->vc.lock, flags);
-- 
1.9.1


[PATCH 5/8] dmaengine: ste_dma40: Only calculate residue if txstate exists.

2016-06-07 Thread Peter Griffin
There is no point calculating the residue if there is
no txstate to store the value.

Signed-off-by: Peter Griffin 
---
 drivers/dma/ste_dma40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/ste_dma40.c b/drivers/dma/ste_dma40.c
index 6fb8307..378cc47 100644
--- a/drivers/dma/ste_dma40.c
+++ b/drivers/dma/ste_dma40.c
@@ -2588,7 +2588,7 @@ static enum dma_status d40_tx_status(struct dma_chan 
*chan,
}
 
ret = dma_cookie_status(chan, cookie, txstate);
-   if (ret != DMA_COMPLETE)
+   if (ret != DMA_COMPLETE && txstate)
dma_set_residue(txstate, stedma40_residue(chan));
 
if (d40_is_paused(d40c))
-- 
1.9.1


[PATCH 4/8] dmaengine: s3c24xx: Simplify code in s3c24xx_dma_tx_status()

2016-06-07 Thread Peter Griffin
Folding the DMA_COMPLETE check into the existing !txstate early-return
path saves a few lines of code in the driver.

Signed-off-by: Peter Griffin 
---
 drivers/dma/s3c24xx-dma.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/dma/s3c24xx-dma.c b/drivers/dma/s3c24xx-dma.c
index 17ccdfd..f7d2c7a 100644
--- a/drivers/dma/s3c24xx-dma.c
+++ b/drivers/dma/s3c24xx-dma.c
@@ -768,16 +768,12 @@ static enum dma_status s3c24xx_dma_tx_status(struct 
dma_chan *chan,
 
spin_lock_irqsave(&s3cchan->vc.lock, flags);
ret = dma_cookie_status(chan, cookie, txstate);
-   if (ret == DMA_COMPLETE) {
-   spin_unlock_irqrestore(&s3cchan->vc.lock, flags);
-   return ret;
-   }
 
/*
 * There's no point calculating the residue if there's
 * no txstate to store the value.
 */
-   if (!txstate) {
+   if (ret == DMA_COMPLETE || !txstate) {
spin_unlock_irqrestore(&s3cchan->vc.lock, flags);
return ret;
}
-- 
1.9.1


[PATCH 3/8] dmaengine: coh901318: Only calculate residue if txstate exists.

2016-06-07 Thread Peter Griffin
There is no point in calculating the residue if there is no
txstate to store the value.

Signed-off-by: Peter Griffin 
---
 drivers/dma/coh901318.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/coh901318.c b/drivers/dma/coh901318.c
index c340ca9..c100616 100644
--- a/drivers/dma/coh901318.c
+++ b/drivers/dma/coh901318.c
@@ -2422,7 +2422,7 @@ coh901318_tx_status(struct dma_chan *chan, dma_cookie_t 
cookie,
enum dma_status ret;
 
ret = dma_cookie_status(chan, cookie, txstate);
-   if (ret == DMA_COMPLETE)
+   if (ret == DMA_COMPLETE || !txstate)
return ret;
 
dma_set_residue(txstate, coh901318_get_bytes_left(chan));
-- 
1.9.1


[PATCH 2/8] dmaengine: fsl-edma: print error code in error messages.

2016-06-07 Thread Peter Griffin
It is useful to print the error code as part of the error
message.

Signed-off-by: Peter Griffin 
---
 drivers/dma/fsl-edma.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/fsl-edma.c b/drivers/dma/fsl-edma.c
index 7208fc9..cc06eea 100644
--- a/drivers/dma/fsl-edma.c
+++ b/drivers/dma/fsl-edma.c
@@ -963,14 +963,16 @@ static int fsl_edma_probe(struct platform_device *pdev)
 
ret = dma_async_device_register(&fsl_edma->dma_dev);
if (ret) {
-   dev_err(&pdev->dev, "Can't register Freescale eDMA engine.\n");
+   dev_err(&pdev->dev,
+   "Can't register Freescale eDMA engine. (%d)\n", ret);
fsl_disable_clocks(fsl_edma);
return ret;
}
 
ret = of_dma_controller_register(np, fsl_edma_xlate, fsl_edma);
if (ret) {
-   dev_err(&pdev->dev, "Can't register Freescale eDMA of_dma.\n");
+   dev_err(&pdev->dev,
+   "Can't register Freescale eDMA of_dma. (%d)\n", ret);
dma_async_device_unregister(&fsl_edma->dma_dev);
fsl_disable_clocks(fsl_edma);
return ret;
-- 
1.9.1


[PATCH 1/8] dmaengine: fsl-edma: Fix clock handling error paths

2016-06-07 Thread Peter Griffin
Currently fsl-edma doesn't clk_disable_unprepare()
its clocks on error conditions. This patch adds a
fsl_disable_clocks helper for this, and also only
disables clocks which were enabled if encountering
an error whilst enabling clocks.

Signed-off-by: Peter Griffin 
---
 drivers/dma/fsl-edma.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/fsl-edma.c b/drivers/dma/fsl-edma.c
index be2e62b..7208fc9 100644
--- a/drivers/dma/fsl-edma.c
+++ b/drivers/dma/fsl-edma.c
@@ -852,6 +852,14 @@ fsl_edma_irq_init(struct platform_device *pdev, struct 
fsl_edma_engine *fsl_edma
return 0;
 }
 
+static void fsl_disable_clocks(struct fsl_edma_engine *fsl_edma)
+{
+   int i;
+
+   for (i = 0; i < DMAMUX_NR; i++)
+   clk_disable_unprepare(fsl_edma->muxclk[i]);
+}
+
 static int fsl_edma_probe(struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
@@ -897,6 +905,10 @@ static int fsl_edma_probe(struct platform_device *pdev)
 
ret = clk_prepare_enable(fsl_edma->muxclk[i]);
if (ret) {
+   /* disable only clks which were enabled on error */
+   for (; i >= 0; i--)
+   clk_disable_unprepare(fsl_edma->muxclk[i]);
+
dev_err(&pdev->dev, "DMAMUX clk block failed.\n");
return ret;
}
@@ -952,6 +964,7 @@ static int fsl_edma_probe(struct platform_device *pdev)
ret = dma_async_device_register(&fsl_edma->dma_dev);
if (ret) {
dev_err(&pdev->dev, "Can't register Freescale eDMA engine.\n");
+   fsl_disable_clocks(fsl_edma);
return ret;
}
 
@@ -959,6 +972,7 @@ static int fsl_edma_probe(struct platform_device *pdev)
if (ret) {
dev_err(&pdev->dev, "Can't register Freescale eDMA of_dma.\n");
dma_async_device_unregister(&fsl_edma->dma_dev);
+   fsl_disable_clocks(fsl_edma);
return ret;
}
 
@@ -972,13 +986,10 @@ static int fsl_edma_remove(struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
struct fsl_edma_engine *fsl_edma = platform_get_drvdata(pdev);
-   int i;
 
of_dma_controller_free(np);
dma_async_device_unregister(&fsl_edma->dma_dev);
-
-   for (i = 0; i < DMAMUX_NR; i++)
-   clk_disable_unprepare(fsl_edma->muxclk[i]);
+   fsl_disable_clocks(fsl_edma);
 
return 0;
 }
-- 
1.9.1
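
The "disable only clocks which were enabled" idea from this patch can be sketched as a generic unwind loop. This is a user-space model under stated assumptions: `struct model_clk`, `MODEL_NR_CLKS` (standing in for DMAMUX_NR) and the `fail_on_enable` knob are hypothetical. Note one deliberate difference: this sketch starts unwinding one index below the clock that failed, whereas the loop in the diff starts at the failing index itself. Which of the two is right for the real driver is a question for review of the patch, not something this model settles.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CLKS 4 /* hypothetical clock count, stands in for DMAMUX_NR */

struct model_clk {
    bool enabled;
    bool fail_on_enable; /* test knob: make enabling this clock fail */
};

static int model_clk_enable(struct model_clk *clk)
{
    if (clk->fail_on_enable)
        return -1;
    clk->enabled = true;
    return 0;
}

static void model_clk_disable(struct model_clk *clk)
{
    /* In this model, disabling a clock that was never enabled is a bug. */
    assert(clk->enabled);
    clk->enabled = false;
}

/* Enable all clocks; on failure undo only the ones already enabled. */
static int model_enable_all(struct model_clk clks[MODEL_NR_CLKS])
{
    int i, ret;

    for (i = 0; i < MODEL_NR_CLKS; i++) {
        ret = model_clk_enable(&clks[i]);
        if (ret) {
            /* Indices 0..i-1 succeeded; index i did not. */
            while (--i >= 0)
                model_clk_disable(&clks[i]);
            return ret;
        }
    }
    return 0;
}
```

After a failed call every clock in the array is back in the disabled state, so a later retry starts from a clean slate.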


[PATCH 0/8] Various dmaengine cleanups

2016-06-07 Thread Peter Griffin
Hi Vinod,

This series is a bunch of cleanup updates to various
dmaengine drivers, based on some of the review feedback to my fdma series.

regards,

Peter.

Peter Griffin (8):
  dmaengine: fsl-edma: Fix clock handling error paths
  dmaengine: fsl-edma: print error code in error messages.
  dmaengine: coh901318: Only calculate residue if txstate exists.
  dmaengine: s3c24xx: Simplify code in s3c24xx_dma_tx_status()
  dmaengine: ste_dma40: Only calculate residue if txstate exists.
  dmaengine: sun6i-dma: Only calculate residue if state exists.
  dmaengine: tegra20-apb-dma: Only calculate residue if txstate exists.
  dmaengine: Remove site specific OOM error messages on kzalloc

 drivers/dma/amba-pl08x.c| 10 +-
 drivers/dma/bestcomm/bestcomm.c |  2 --
 drivers/dma/coh901318.c |  2 +-
 drivers/dma/edma.c  | 16 
 drivers/dma/fsl-edma.c  | 25 +++--
 drivers/dma/fsldma.c|  2 --
 drivers/dma/k3dma.c | 10 --
 drivers/dma/mmp_tdma.c  |  5 ++---
 drivers/dma/moxart-dma.c|  4 +---
 drivers/dma/nbpfaxi.c   |  5 ++---
 drivers/dma/pl330.c |  5 +
 drivers/dma/ppc4xx/adma.c   |  2 --
 drivers/dma/s3c24xx-dma.c   | 11 ++-
 drivers/dma/sh/shdmac.c |  9 ++---
 drivers/dma/sh/sudmac.c |  9 ++---
 drivers/dma/sirf-dma.c  |  5 ++---
 drivers/dma/ste_dma40.c |  6 ++
 drivers/dma/sun6i-dma.c |  2 +-
 drivers/dma/tegra20-apb-dma.c   | 13 -
 drivers/dma/timb_dma.c  |  8 ++--
 20 files changed, 52 insertions(+), 99 deletions(-)

-- 
1.9.1


Kernel 4.7: PAGE_GUARDED and _PAGE_NO_CACHE

2016-06-07 Thread Christian Zigotzky

Hi Michael,

On 06 June 2016 at 02:51 AM, Michael Ellerman wrote:

On Sat, 2016-06-04 at 17:07 +0200, Christian Zigotzky wrote:


Aneesh,

Shall I bisect the kernel from the powerpc git?

No just use linus' tree.


Shall I start with the following commit?

https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/commit/?id=8ffb4103f5e28d7e7890ed4774d8e009f253f56e

Yeah that would be a good one to start with.

Then mark rc1 as bad and bisect should do the rest.

cheers


"range.size, pgprot_val(pgprot_noncached(__pgprot(0))));" isn't the
problem. :-) It works.


764041e0f43cc7846f6d8eb246d65b53cc06c764 is the first bad commit
commit 764041e0f43cc7846f6d8eb246d65b53cc06c764
Author: Aneesh Kumar K.V 
Date:   Fri Apr 29 23:26:09 2016 +1000

powerpc/mm/radix: Add checks in slice code to catch radix usage

Radix doesn't need slice support. Catch incorrect usage of slice code
when radix is enabled.

Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: Michael Ellerman 

Cheers,

Christian


Git Log:

git clone 
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git


git bisect start

git bisect good 8ffb4103f5e28d7e7890ed4774d8e009f253f56e

git bisect bad 1a695a905c18548062509178b98bc91e67510864 (Linux 4.7-rc1)

Output:

Bisecting: 6333 revisions left to test after this (roughly 13 steps)
[4741526b83c5d3a3d661d1896f9e7414c5730bcb] mm, page_alloc: restore the 
original nodemask if the fast path allocation failed




git bisect good (Linux AmigaoneX1000 
4.6.0_A-EON_AmigaONE_X1000_Nemo-05145-g4741526-dirty #1 SMP Mon Jun 6 
14:35:01 CEST 2016 ppc64 GNU/Linux)


Output:

Bisecting: 3014 revisions left to test after this (roughly 12 steps)
[2f37dd131c5d3a2eac21cd5baf80658b1b02a8ac] Merge tag 'staging-4.7-rc1' 
of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging




I had to replace

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, _PAGE_NO_CACHE|_PAGE_GUARDED);

with

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, pgprot_val(pgprot_noncached(__pgprot(0))));



in the file "pci-common.c". After that it compiled but it doesn't boot 
so "git bisect bad"


Output:

Bisecting: 1693 revisions left to test after this (roughly 11 steps)
[54cf809b9512be95f53ed4a5e3b631d1ac42f0fa] locking,qspinlock: Fix 
spin_is_locked() and spin_unlock_wait()




I had to replace

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, _PAGE_NO_CACHE|_PAGE_GUARDED);

with

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, pgprot_val(pgprot_noncached(__pgprot(0))));



in the file "pci-common.c". After that it compiled but it doesn't boot 
so "git bisect bad"


Output:

Bisecting: 721 revisions left to test after this (roughly 10 steps)
[f4c80d5a16eb4b08a0d9ade154af1ebdc63f5752] Merge tag 'sound-4.7-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound




git bisect good (Linux AmigaoneX1000 
4.6.0_A-EON_AmigaONE_X1000_Nemo-05931-gf4c80d5-dirty #1 SMP Mon Jun 6 
19:13:42 CEST 2016 ppc64 GNU/Linux)


Output:

Bisecting: 324 revisions left to test after this (roughly 9 steps)
[c04a5880299eab3da8c10547db96ea9cdffd44a6] Merge tag 'powerpc-4.7-1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux




I had to replace

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, _PAGE_NO_CACHE|_PAGE_GUARDED);

with

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, pgprot_val(pgprot_noncached(__pgprot(0))));



in the file "pci-common.c". After that it compiled but it doesn't boot 
so "git bisect bad"


Output:

Bisecting: 186 revisions left to test after this (roughly 8 steps)
[e9ad9b9bd3a3b95c89a29b2a197476e662db4233] Merge tag 'docs-for-linus' of 
git://git.lwn.net/linux




git bisect good (Linux AmigaoneX1000 
4.6.0_A-EON_AmigaONE_X1000_Nemo-06141-ge9ad9b9-dirty #1 SMP Tue Jun 7 
08:32:33 CEST 2016 ppc64 GNU/Linux)


Bisecting: 93 revisions left to test after this (roughly 7 steps)
[0e5b5ba17ac33a05d9f4a48b5eb8b5e30f2274d7] cxl: Remove duplicate #defines



I had to replace

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, _PAGE_NO_CACHE|_PAGE_GUARDED);

with

__ioremap_at(range.cpu_addr, (void *)ISA_IO_BASE,
   range.size, pgprot_val(pgprot_noncached(__pgprot(0))));



in the file "pci-common.c". After that it compiled but it doesn't boot 
so "git bisect bad"


Output:

Bisecting: 46 revisions left to test after this (roughly 6 steps)
[764041e0f43cc7846f6d8eb246d65b53cc06c764] powerpc/mm/radix: Add checks 
in slice code to catch radix usage




It doesn't boot. Booting Linux via __start()... didn't appear in the CFE.

git bisect bad

Output:

Bisecting: 22 revisions left to test after this (roughly 5 steps)
[2e87351
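
The "roughly N steps" figures in the bisect output above follow from halving the candidate range on every test. A minimal sketch, assuming the estimate is simply the ceiling of log2 of the remaining revision count: that assumption matches every figure quoted in this log, but git's actual estimator may differ in corner cases, and `model_bisect_steps` is a hypothetical helper, not a git function.

```c
#include <assert.h>

/* Smallest k such that 2^k >= n, for n >= 1. */
static int model_bisect_steps(unsigned int n)
{
    int k = 0;
    unsigned long long span = 1;

    assert(n >= 1);
    while (span < n) {
        span <<= 1;
        k++;
    }
    return k;
}
```

For example, 6333 revisions lie between 2^12 = 4096 and 2^13 = 8192, which gives the 13 steps reported at the start of the log.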

[PATCH v12.update 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer

2016-06-07 Thread Yinghai Lu
This is a preparatory patch for the next one:
  PCI: Let pci_mmap_page_range() take resource addr

We need to pass an extra resource pointer so that the powerpc and
microblaze prot-setting code does not have to search for it again.

Updated to fix a bisectability problem found by the build test robot.

Signed-off-by: Yinghai Lu 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-cris-ker...@axis.com
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@linux-mips.org
Cc: linux-am33-l...@redhat.com
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux-xte...@linux-xtensa.org
---
 arch/arm/include/asm/pci.h  | 2 --
 arch/arm/kernel/bios32.c| 3 ++-
 arch/cris/arch-v32/drivers/pci/bios.c   | 3 ++-
 arch/cris/include/asm/pci.h | 3 ---
 arch/ia64/include/asm/pci.h | 2 --
 arch/ia64/pci/pci.c | 3 ++-
 arch/microblaze/include/asm/pci.h   | 3 ---
 arch/microblaze/pci/pci-common.c| 3 ++-
 arch/mips/include/asm/pci.h | 3 ---
 arch/mips/pci/pci.c | 3 ++-
 arch/mn10300/include/asm/pci.h  | 3 ---
 arch/mn10300/unit-asb2305/pci-asb2305.c | 3 ++-
 arch/parisc/include/asm/pci.h   | 3 ---
 arch/parisc/kernel/pci.c| 3 ++-
 arch/powerpc/include/asm/pci.h  | 3 ---
 arch/powerpc/kernel/pci-common.c| 3 ++-
 arch/sh/drivers/pci/pci.c   | 3 ++-
 arch/sh/include/asm/pci.h   | 2 --
 arch/sparc/include/asm/pci_64.h | 4 
 arch/sparc/kernel/pci.c | 3 ++-
 arch/unicore32/include/asm/pci.h| 2 --
 arch/unicore32/kernel/pci.c | 3 ++-
 arch/x86/include/asm/pci.h  | 4 
 arch/x86/pci/i386.c | 3 ++-
 arch/xtensa/include/asm/pci.h   | 4 
 arch/xtensa/kernel/pci.c| 3 ++-
 drivers/pci/pci-sysfs.c | 2 +-
 drivers/pci/proc.c  | 2 +-
 include/linux/pci.h | 6 ++
 29 files changed, 34 insertions(+), 53 deletions(-)

diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h
index 057d381..51118a0 100644
--- a/arch/arm/include/asm/pci.h
+++ b/arch/arm/include/asm/pci.h
@@ -29,8 +29,6 @@ static inline int pci_proc_domain(struct pci_bus *bus)
 #define PCI_DMA_BUS_IS_PHYS (1)
 
 #define HAVE_PCI_MMAP
-extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
-   enum pci_mmap_state mmap_state, int 
write_combine);
 
 static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
 {
diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index 05e61a2..d3245d1 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -602,7 +602,8 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
return pci_enable_resources(dev, mask);
 }
 
-int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
+int pci_mmap_page_range(struct pci_dev *dev, struct resource *res,
+   struct vm_area_struct *vma,
enum pci_mmap_state mmap_state, int write_combine)
 {
if (mmap_state == pci_mmap_io)
diff --git a/arch/cris/arch-v32/drivers/pci/bios.c 
b/arch/cris/arch-v32/drivers/pci/bios.c
index 64a5fb9..082efb9 100644
--- a/arch/cris/arch-v32/drivers/pci/bios.c
+++ b/arch/cris/arch-v32/drivers/pci/bios.c
@@ -14,7 +14,8 @@ void pcibios_set_master(struct pci_dev *dev)
pci_write_config_byte(dev, PCI_LATENCY_TIMER, lat);
 }
 
-int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
+int pci_mmap_page_range(struct pci_dev *dev, struct resource *res,
+   struct vm_area_struct *vma,
enum pci_mmap_state mmap_state, int write_combine)
 {
unsigned long prot;
diff --git a/arch/cris/include/asm/pci.h b/arch/cris/include/asm/pci.h
index b1b289d..65198cb 100644
--- a/arch/cris/include/asm/pci.h
+++ b/arch/cris/include/asm/pci.h
@@ -42,9 +42,6 @@ struct pci_dev;
 #define PCI_DMA_BUS_IS_PHYS(1)
 
 #define HAVE_PCI_MMAP
-extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
-  enum pci_mmap_state mmap_state, int 
write_combine);
-
 
 #endif /* __KERNEL__ */
 
diff --git a/arch/ia64/include/asm/pci.h b/arch/ia64/include/asm/pci.h
index c0835b0..6a2f5d8 100644
--- a/arch/ia64/include/asm/pci.h
+++ b/arch/ia64/include/asm/pci.h
@@ -51,8 +51,6 @@ extern unsigned long ia64_max_iommu_merge_mask;
 #define PCI_DMA_BUS_IS_PHYS(ia64_max_iommu_merge_mask == ~0UL)
 
 #define HAVE_PCI_MMAP
-extern int pci_mmap_page_range (struct pci_dev *dev, struct vm_area_struct 
*vma,
-   enum pci_mmap_state mmap_state, int 
write_combine);
 #define HAVE_PCI_LEGACY
 extern int pci_mmap_legacy_page_range(struct pci_bus *bus,
  struct vm_area_struct *vma

[PATCH v12.update 02/15] PCI: Let pci_mmap_page_range() take resource address

2016-06-07 Thread Yinghai Lu
Commit 8c05cd08a7 ("PCI: fix offset check for sysfs mmapped files") tries
to check the exposed value against the resource start/end in the proc
mmap path.

|start = vma->vm_pgoff;
|size = ((pci_resource_len(pdev, resno) - 1) >> PAGE_SHIFT) + 1;
|pci_start = (mmap_api == PCI_MMAP_PROCFS) ?
|pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0;
|if (start >= pci_start && start < pci_start + size &&
|start + nr <= pci_start + size)

That breaks sparc, where the exposed value is the BAR value and needs to
be offset to the resource address.

The original pci_mmap_page_range() takes the PCI BAR value, aka usr_address.

Bjorn found that it would be much simpler to pass the resource address
directly and avoid those extra __pci_mmap_make_offset helpers.

In this patch:
1. in proc path: proc_bus_pci_mmap converts the user value back to a
   resource address before calling pci_mmap_page_range
2. in sysfs path: pci_mmap_resource just offsets with the resource start.
3. all pci_mmap_page_range callers get vma->vm_pgoff within the resource
   range instead of the BAR value.
4. remove __pci_mmap_make_offset, as the checking is done
   in pci_mmap_fits().
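
The offset check quoted from 8c05cd08a7 earlier in this message can be modeled in user space as below. This is a sketch under stated assumptions: `MODEL_PAGE_SHIFT` is fixed at 12 (4 KiB pages), `model_mmap_fits` is a hypothetical name, and the resource start and length are passed in directly instead of being read from a pci_dev.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * vm_pgoff and nr_pages describe the requested mapping in pages.
 * For the procfs API the offset is compared against the resource start;
 * for the sysfs API the offset is relative to the resource (starts at 0).
 */
static bool model_mmap_fits(unsigned long long res_start,
                            unsigned long long res_len,
                            unsigned long long vm_pgoff,
                            unsigned long long nr_pages,
                            bool procfs)
{
    unsigned long long size, pci_start;

    assert(res_len > 0);
    size = ((res_len - 1) >> MODEL_PAGE_SHIFT) + 1;
    pci_start = procfs ? res_start >> MODEL_PAGE_SHIFT : 0;

    return vm_pgoff >= pci_start && vm_pgoff < pci_start + size &&
           vm_pgoff + nr_pages <= pci_start + size;
}
```

With a four-page resource at 0x10000000, a procfs offset derived from any other address (for instance a BAR value that differs from the resource start) fails the check, which is the kind of mismatch the message describes for sparc.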

-v2: add pci_user_to_resource and remove __pci_mmap_make_offset
-v3: pass resource pointer with pci_mmap_page_range()

Updated to fix a bisectability problem found by the build test robot.

Signed-off-by: Yinghai Lu 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparcli...@vger.kernel.org
Cc: linux-xte...@linux-xtensa.org
---
 arch/microblaze/pci/pci-common.c |  80 +++---
 arch/powerpc/kernel/pci-common.c |  80 +++---
 arch/sparc/kernel/pci.c  | 117 ---
 arch/xtensa/kernel/pci.c |  75 -
 drivers/pci/pci-sysfs.c  |  33 ---
 drivers/pci/pci.h|   2 +-
 drivers/pci/proc.c   |  63 ++---
 7 files changed, 107 insertions(+), 343 deletions(-)

diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index fd2b013..9471383 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -156,69 +156,6 @@ void pcibios_set_master(struct pci_dev *dev)
  */
 
 /*
- * Adjust vm_pgoff of VMA such that it is the physical page offset
- * corresponding to the 32-bit pci bus offset for DEV requested by the user.
- *
- * Basically, the user finds the base address for his device which he wishes
- * to mmap.  They read the 32-bit value from the config space base register,
- * add whatever PAGE_SIZE multiple offset they wish, and feed this into the
- * offset parameter of mmap on /proc/bus/pci/XXX for that device.
- *
- * Returns negative error code on failure, zero on success.
- */
-static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
-  resource_size_t *offset,
-  enum pci_mmap_state mmap_state)
-{
-   struct pci_controller *hose = pci_bus_to_host(dev->bus);
-   unsigned long io_offset = 0;
-   int i, res_bit;
-
-   if (!hose)
-   return NULL;/* should never happen */
-
-   /* If memory, add on the PCI bridge address offset */
-   if (mmap_state == pci_mmap_mem) {
-#if 0 /* See comment in pci_resource_to_user() for why this is disabled */
-   *offset += hose->pci_mem_offset;
-#endif
-   res_bit = IORESOURCE_MEM;
-   } else {
-   io_offset = (unsigned long)hose->io_base_virt - _IO_BASE;
-   *offset += io_offset;
-   res_bit = IORESOURCE_IO;
-   }
-
-   /*
-* Check that the offset requested corresponds to one of the
-* resources of the device.
-*/
-   for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
-   struct resource *rp = &dev->resource[i];
-   int flags = rp->flags;
-
-   /* treat ROM as memory (should be already) */
-   if (i == PCI_ROM_RESOURCE)
-   flags |= IORESOURCE_MEM;
-
-   /* Active and same type? */
-   if ((flags & res_bit) == 0)
-   continue;
-
-   /* In the range of this resource? */
-   if (*offset < (rp->start & PAGE_MASK) || *offset > rp->end)
-   continue;
-
-   /* found it! construct the final physical address */
-   if (mmap_state == pci_mmap_io)
-   *offset += hose->io_base_phys - io_offset;
-   return rp;
-   }
-
-   return NULL;
-}
-
-/*
  * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
  * device mapping.
  */
@@ -310,15 +247,18 @@ int pci_mmap_page_range(struct pci_dev *dev, struct 
resource *res,
 {
resource_size_t offset =
((resource_size_t)vma->vm_pgoff) << PAGE_SHIFT;
-   struct resource *rp;
int ret;
 
-   rp = __pci_mmap_make_off

[PATCH 3/6] ppc: bpf/jit: Introduce rotate immediate instructions

2016-06-07 Thread Naveen N. Rao
Since we will be using the rotate immediate instructions for extended
BPF JIT, let's introduce macros for the same. And since the shift
immediate operations use the rotate immediate instructions, let's redo
those macros to use the newly introduced instructions.

Cc: Matt Evans 
Cc: Denis Kirjanov 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: "David S. Miller" 
Cc: Ananth N Mavinakayanahalli 
Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/include/asm/ppc-opcode.h |  2 ++
 arch/powerpc/net/bpf_jit.h| 20 +++-
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 1d035c1..fd8d640 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -272,6 +272,8 @@
 #define __PPC_SH(s)__PPC_WS(s)
 #define __PPC_MB(s)(((s) & 0x1f) << 6)
 #define __PPC_ME(s)(((s) & 0x1f) << 1)
+#define __PPC_MB64(s)  (__PPC_MB(s) | ((s) & 0x20))
+#define __PPC_ME64(s)  __PPC_MB64(s)
 #define __PPC_BI(s)(((s) & 0x1f) << 16)
 #define __PPC_CT(t)(((t) & 0x0f) << 21)
 
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 4c1e055..95d0e38 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -210,18 +210,20 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)   EMIT(PPC_INST_SRW | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_RLWINM(d, a, i, mb, me)EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
+   ___PPC_RS(a) | __PPC_SH(i) |  \
+   __PPC_MB(mb) | __PPC_ME(me))
+#define PPC_RLDICR(d, a, i, me)EMIT(PPC_INST_RLDICR | 
___PPC_RA(d) | \
+   ___PPC_RS(a) | __PPC_SH(i) |  \
+   __PPC_ME64(me) | (((i) & 0x20) >> 4))
+
 /* slwi = rlwinm Rx, Ry, n, 0, 31-n */
-#define PPC_SLWI(d, a, i)  EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(i) | \
-__PPC_MB(0) | __PPC_ME(31-(i)))
+#define PPC_SLWI(d, a, i)  PPC_RLWINM(d, a, i, 0, 31-(i))
 /* srwi = rlwinm Rx, Ry, 32-n, n, 31 */
-#define PPC_SRWI(d, a, i)  EMIT(PPC_INST_RLWINM | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(32-(i)) |\
-__PPC_MB(i) | __PPC_ME(31))
+#define PPC_SRWI(d, a, i)  PPC_RLWINM(d, a, 32-(i), i, 31)
 /* sldi = rldicr Rx, Ry, n, 63-n */
-#define PPC_SLDI(d, a, i)  EMIT(PPC_INST_RLDICR | ___PPC_RA(d) | \
-___PPC_RS(a) | __PPC_SH(i) | \
-__PPC_MB(63-(i)) | (((i) & 0x20) >> 4))
+#define PPC_SLDI(d, a, i)  PPC_RLDICR(d, a, i, 63-(i))
+
 #define PPC_NEG(d, a)  EMIT(PPC_INST_NEG | ___PPC_RT(d) | ___PPC_RA(a))
 
 /* Long jump; (unconditional 'branch') */
-- 
2.8.2
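
The comments in the diff ("slwi = rlwinm Rx, Ry, n, 0, 31-n" and "srwi = rlwinm Rx, Ry, 32-n, n, 31") can be checked with a small software model of the 32-bit rotate-and-mask operation. This is an illustrative sketch only: the `model_*` functions are hypothetical, they compute the result of the instruction and do not produce its encoding, and they are unrelated to the EMIT() macros in bpf_jit.h.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by sh without ever shifting by 32. */
static uint32_t model_rotl32(uint32_t x, unsigned int sh)
{
    sh &= 31;
    return (x << sh) | (x >> ((32 - sh) & 31));
}

/* Mask with bits mb..me set, PowerPC numbering (bit 0 is the MSB). */
static uint32_t model_mask32(unsigned int mb, unsigned int me)
{
    uint32_t from_mb, upto_me;

    assert(mb <= 31 && me <= 31);
    from_mb = 0xFFFFFFFFu >> mb;
    upto_me = 0xFFFFFFFFu << (31 - me);
    /* mb > me denotes a mask that wraps around the word. */
    return mb <= me ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* Result of rlwinm: rotate left, then AND with the mb..me mask. */
static uint32_t model_rlwinm(uint32_t rs, unsigned int sh,
                             unsigned int mb, unsigned int me)
{
    return model_rotl32(rs, sh) & model_mask32(mb, me);
}

/* Shift left by n (0..31) expressed through the rotate, as in the diff. */
static uint32_t model_slwi(uint32_t rs, unsigned int n)
{
    return model_rlwinm(rs, n, 0, 31 - n);
}

/* Shift right by n (0..31) expressed through the rotate, as in the diff. */
static uint32_t model_srwi(uint32_t rs, unsigned int n)
{
    return model_rlwinm(rs, 32 - n, n, 31);
}
```

For shift counts 0..31 the two helpers agree with the plain C shifts `x << n` and `x >> n` on a `uint32_t`, which is why the shift macros can be rewritten in terms of PPC_RLWINM without changing what they emit.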


[PATCH 2/2] workqueue:Fix affinity of an unbound worker of a node with 1 online CPU

2016-06-07 Thread Gautham R. Shenoy
With commit e9d867a67fd03ccc ("sched: Allow per-cpu kernel threads to
run on online && !active"), __set_cpus_allowed_ptr() expects that only
strict per-cpu kernel threads can have affinity to an online CPU which
is not yet active.

This assumption is currently broken in the CPU_ONLINE notification
handler for the workqueues where restore_unbound_workers_cpumask()
calls set_cpus_allowed_ptr() when the first cpu in the unbound
worker's pool->attr->cpumask comes online. Since
set_cpus_allowed_ptr() is called with pool->attr->cpumask in which
only one CPU is online which is not yet active, we get the following
WARN_ON during a CPU online operation.

[ cut here ]
WARNING: CPU: 40 PID: 248 at kernel/sched/core.c:1166
__set_cpus_allowed_ptr+0x228/0x2e0
Modules linked in:
CPU: 40 PID: 248 Comm: cpuhp/40 Not tainted 4.6.0-autotest+ #4
<..snip..>
Call Trace:
[c00f273ff920] [c010493c] __set_cpus_allowed_ptr+0x2cc/0x2e0 
(unreliable)
[c00f273ffac0] [c00ed4b0] workqueue_cpu_up_callback+0x2c0/0x470
[c00f273ffb70] [c00f5c58] notifier_call_chain+0x98/0x100
[c00f273ffbc0] [c00c5ed0] __cpu_notify+0x70/0xe0
[c00f273ffc00] [c00c6028] notify_online+0x38/0x50
[c00f273ffc30] [c00c5214] cpuhp_invoke_callback+0x84/0x250
[c00f273ffc90] [c00c562c] cpuhp_up_callbacks+0x5c/0x120
[c00f273ffce0] [c00c64d4] cpuhp_thread_fun+0x184/0x1c0
[c00f273ffd20] [c00fa050] smpboot_thread_fn+0x290/0x2a0
[c00f273ffd80] [c00f45b0] kthread+0x110/0x130
[c00f273ffe30] [c0009570] ret_from_kernel_thread+0x5c/0x6c
---[ end trace 00f1456578b2a3b2 ]---

This patch sets the affinity of the worker to
a) the only online CPU in the cpumask of the worker pool when it comes
   online.
b) the cpumask of the worker pool when the second CPU in the pool's
   cpumask comes online.

Reported-by: Abdul Haleem 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tejun Heo 
Cc: Michael Ellerman 
Signed-off-by: Gautham R. Shenoy 
---
 kernel/workqueue.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index e412794..1199f73 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4586,7 +4586,7 @@ static void rebind_workers(struct worker_pool *pool)
  *
  * An unbound pool may end up with a cpumask which doesn't have any online
  * CPUs.  When a worker of such pool get scheduled, the scheduler resets
- * its cpus_allowed.  If @cpu is in @pool's cpumask which didn't have any
+ * its cpus_allowed.  If @cpu is in @pool's cpumask which had at most one
  * online CPU before, cpus_allowed of all its workers should be restored.
  */
 static void restore_unbound_workers_cpumask(struct worker_pool *pool, int cpu)
@@ -4600,15 +4600,26 @@ static void restore_unbound_workers_cpumask(struct 
worker_pool *pool, int cpu)
if (!cpumask_test_cpu(cpu, pool->attrs->cpumask))
return;
 
-   /* is @cpu the only online CPU? */
cpumask_and(&cpumask, pool->attrs->cpumask, cpu_online_mask);
-   if (cpumask_weight(&cpumask) != 1)
+
+   /*
+* The affinity needs to be set
+* a) to @cpu when that is the only online CPU in
+*pool->attrs->cpumask.
+* b) to pool->attrs->cpumask when exactly two CPUs in
+*pool->attrs->cpumask are online. This affinity will be
+*retained when subsequent CPUs come online.
+*/
+   if (cpumask_weight(&cpumask) > 2)
return;
 
+   if (cpumask_weight(&cpumask) == 2)
+   cpumask_copy(&cpumask, pool->attrs->cpumask);
+
/* as we're called from CPU_ONLINE, the following shouldn't fail */
for_each_pool_worker(worker, pool)
WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
- pool->attrs->cpumask) < 0);
+ &cpumask) < 0);
 }
 
 /*
-- 
1.9.3
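
The affinity decision made by the patched restore_unbound_workers_cpumask() can be modeled with plain bitmasks. This is a sketch under stated assumptions: each bit of a `uint64_t` stands for one CPU, `model_restore_affinity` and `model_weight` are hypothetical names, and the function reports the mask that would be applied instead of calling set_cpus_allowed_ptr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of set bits; each bit models one CPU in a cpumask. */
static int model_weight(uint64_t mask)
{
    int w = 0;

    for (; mask; mask &= mask - 1)
        w++;
    return w;
}

/*
 * Returns true and stores the mask to apply when the workers' affinity
 * should be updated for the CPU that just came online, false otherwise.
 * online_mask is expected to already include cpu.
 */
static bool model_restore_affinity(uint64_t pool_mask, uint64_t online_mask,
                                   int cpu, uint64_t *affinity)
{
    uint64_t m;

    assert(cpu >= 0 && cpu < 64);
    if (!(pool_mask & (UINT64_C(1) << cpu)))
        return false;

    m = pool_mask & online_mask;
    if (model_weight(m) > 2)
        return false;

    /* Second CPU online: widen to the whole pool mask. */
    if (model_weight(m) == 2)
        m = pool_mask;

    *affinity = m;
    return true;
}
```

With a pool covering CPUs 4 to 7, the first CPU to come online yields a mask containing just that CPU, the second yields the full pool mask, and later CPUs leave the affinity untouched.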


[PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-07 Thread Gautham R. Shenoy
Currently in the CPU_ONLINE workqueue handler, the
restore_unbound_workers_cpumask() will never call
set_cpus_allowed_ptr() for a newly created unbound worker thread.

This is because the function which creates a new unbound worker thread
when the first CPU in the node comes online [wq_update_unbound_numa()]
is invoked after the call to restore_unbound_workers_cpumask(). Thus
the affinity is never set for this worker thread when the first CPU in
the node comes online.

Furthermore, due to an optimization in
restore_unbound_workers_cpumask(), set_cpus_allowed_ptr() is not
called when subsequent CPUs in the node come online since it assumes
that the affinity would have been set when the first CPU has come
online.

This patch fixes this issue by invoking wq_update_unbound_numa()
before calling restore_unbound_workers_cpumask().

Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tejun Heo 
Cc: Michael Ellerman 
Signed-off-by: Gautham R. Shenoy 
---
 kernel/workqueue.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index e1c0e99..e412794 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4638,6 +4638,10 @@ static int workqueue_cpu_up_callback(struct 
notifier_block *nfb,
case CPU_ONLINE:
mutex_lock(&wq_pool_mutex);
 
+   /* update NUMA affinity of unbound workqueues */
+   list_for_each_entry(wq, &workqueues, list)
+   wq_update_unbound_numa(wq, cpu, true);
+
for_each_pool(pool, pi) {
mutex_lock(&pool->attach_mutex);
 
@@ -4649,10 +4653,6 @@ static int workqueue_cpu_up_callback(struct 
notifier_block *nfb,
mutex_unlock(&pool->attach_mutex);
}
 
-   /* update NUMA affinity of unbound workqueues */
-   list_for_each_entry(wq, &workqueues, list)
-   wq_update_unbound_numa(wq, cpu, true);
-
mutex_unlock(&wq_pool_mutex);
break;
}
-- 
1.9.3


[PATCH 0/2] Fix CPU Online handling for unbounded worker threads

2016-06-07 Thread Gautham R. Shenoy
Hi,

This patchset fixes a couple of issues in the CPU_ONLINE notification
handling for the workqueues with respect to unbounded worker threads.

Patch 1 ensures that the affinity of an unbound worker thread
associated with a node whose very first CPU has come online is set
correctly. In the existing code path we will never call
set_cpus_allowed_ptr() for unbound worker threads that have been
created on a CPU Online operation after boot.

Patch 2 fixes the following WARN_ON() reported by Abdul when
set_cpus_allowed_ptr() for an unbound worker thread is invoked when
only one of the CPUs in its cpumask is online but not yet active.

 [ cut here ]
 WARNING: CPU: 40 PID: 248 at kernel/sched/core.c:1166 __set_cpus_allowed_ptr+0x21c/0x290
 Modules linked in:
CPU: 40 PID: 248 Comm: cpuhp/40 Not tainted 4.6.0-autotest #1
task: c00f27284200 ti: c00f273fc000 task.ti: c00f273fc000
NIP: c010488c LR: c0104874 CTR: 
REGS: c00f273ff7d0 TRAP: 0700   Not tainted  (4.6.0-autotest)
MSR: 900100029033   CR: 28002804  XER: 
2000
CFAR: c05b0888 SOFTE: 0
GPR00: c010478c c00f273ffa50 c13ce400 
GPR04: c140ed98 0800 c007f64d9408 
GPR08:  0028 c140ee90 0020
GPR12: 2200 cfb96800 c00f44a8 c007fa158480
GPR16: c007fc621a70 c00f2721f800  0001
GPR20: c1571ef0  c134879f c12bc510
GPR24: 0100  c140ea98 c007f64d9408
GPR28: c007fbc21c00 ffea  c00f2728
NIP [c010488c] __set_cpus_allowed_ptr+0x21c/0x290
LR [c0104874] __set_cpus_allowed_ptr+0x204/0x290
Call Trace:
[c00f273ffa50] [c010478c] __set_cpus_allowed_ptr+0x11c/0x290 (unreliable)
[c00f273ffac0] [c00ed4b0] workqueue_cpu_up_callback+0x2c0/0x470
[c00f273ffb70] [c00f5c58] notifier_call_chain+0x98/0x100
[c00f273ffbc0] [c00c5ed0] __cpu_notify+0x70/0xe0
[c00f273ffc00] [c00c6028] notify_online+0x38/0x50
[c00f273ffc30] [c00c5214] cpuhp_invoke_callback+0x84/0x250
[c00f273ffc90] [c00c562c] cpuhp_up_callbacks+0x5c/0x120
[c00f273ffce0] [c00c64d4] cpuhp_thread_fun+0x184/0x1c0
[c00f273ffd20] [c00fa050] smpboot_thread_fn+0x290/0x2a0
[c00f273ffd80] [c00f45b0] kthread+0x110/0x130
[c00f273ffe30] [c0009570] ret_from_kernel_thread+0x5c/0x6c
Instruction dump:
419eff3c 3d420004 38a00800 388a0998 7f63db78 484abfa1 6000 2fa3
409eff1c 813f0378 2f890001 419eff10 <0fe0> 4b08 6000 6000
 ---[ end trace cbc1c5cfbc9591d0 ]---

The patches are based on 4.7-rc2. I have tested the patches on a
multi-node x86_64 and a ppc64

Gautham R. Shenoy (2):
  workqueue: Move wq_update_unbound_numa() to the beginning of
CPU_ONLINE
  workqueue:Fix affinity of an unbound worker of a node with 1 online
CPU

 kernel/workqueue.c | 27 +++
 1 file changed, 19 insertions(+), 8 deletions(-)

-- 
1.9.3


[PATCH 6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-07 Thread Naveen N. Rao
PPC64 eBPF JIT compiler.

Enable with:
echo 1 > /proc/sys/net/core/bpf_jit_enable
or
echo 2 > /proc/sys/net/core/bpf_jit_enable

... to see the generated JIT code. This can further be processed with
tools/net/bpf_jit_disasm.

With CONFIG_TEST_BPF=m and 'modprobe test_bpf':
test_bpf: Summary: 305 PASSED, 0 FAILED, [297/297 JIT'ed]

... on both ppc64 BE and LE.

The details of the approach are documented through various comments in
the code.

Cc: Matt Evans 
Cc: Denis Kirjanov 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: "David S. Miller" 
Cc: Ananth N Mavinakayanahalli 
Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/Kconfig  |   3 +-
 arch/powerpc/include/asm/asm-compat.h |   2 +
 arch/powerpc/include/asm/ppc-opcode.h |  20 +-
 arch/powerpc/net/Makefile |   4 +
 arch/powerpc/net/bpf_jit.h|  53 +-
 arch/powerpc/net/bpf_jit64.h  | 102 
 arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
 arch/powerpc/net/bpf_jit_comp64.c | 956 ++
 8 files changed, 1317 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit64.h
 create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
 create mode 100644 arch/powerpc/net/bpf_jit_comp64.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 01f7464..ee82f9a 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -128,7 +128,8 @@ config PPC
select IRQ_FORCED_THREADING
select HAVE_RCU_TABLE_FREE if SMP
select HAVE_SYSCALL_TRACEPOINTS
-   select HAVE_CBPF_JIT
+   select HAVE_CBPF_JIT if !PPC64
+   select HAVE_EBPF_JIT if PPC64
select HAVE_ARCH_JUMP_LABEL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h
index dc85dcb..cee3aa0 100644
--- a/arch/powerpc/include/asm/asm-compat.h
+++ b/arch/powerpc/include/asm/asm-compat.h
@@ -36,11 +36,13 @@
 #define PPC_MIN_STKFRM 112
 
 #ifdef __BIG_ENDIAN__
+#define LHZX_BE	stringify_in_c(lhzx)
 #define LWZX_BE	stringify_in_c(lwzx)
 #define LDX_BE	stringify_in_c(ldx)
 #define STWX_BE	stringify_in_c(stwx)
 #define STDX_BE	stringify_in_c(stdx)
 #else
+#define LHZX_BE	stringify_in_c(lhbrx)
 #define LWZX_BE	stringify_in_c(lwbrx)
 #define LDX_BE	stringify_in_c(ldbrx)
 #define STWX_BE	stringify_in_c(stwbrx)
diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index fd8d640..6a77d130 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -142,9 +142,11 @@
 #define PPC_INST_ISEL          0x7c00001e
 #define PPC_INST_ISEL_MASK     0xfc00003e
 #define PPC_INST_LDARX         0x7c0000a8
+#define PPC_INST_STDCX         0x7c0001ad
 #define PPC_INST_LSWI          0x7c0004aa
 #define PPC_INST_LSWX          0x7c00042a
 #define PPC_INST_LWARX         0x7c000028
+#define PPC_INST_STWCX         0x7c00012d
 #define PPC_INST_LWSYNC        0x7c2004ac
 #define PPC_INST_SYNC          0x7c0004ac
 #define PPC_INST_SYNC_MASK     0xfc0007fe
@@ -211,8 +213,11 @@
 #define PPC_INST_LBZ           0x88000000
 #define PPC_INST_LD            0xe8000000
 #define PPC_INST_LHZ           0xa0000000
-#define PPC_INST_LHBRX         0x7c00062c
 #define PPC_INST_LWZ           0x80000000
+#define PPC_INST_LHBRX         0x7c00062c
+#define PPC_INST_LDBRX         0x7c000428
+#define PPC_INST_STB           0x98000000
+#define PPC_INST_STH           0xb0000000
 #define PPC_INST_STD           0xf8000000
 #define PPC_INST_STDU          0xf8000001
 #define PPC_INST_STW           0x90000000
@@ -221,22 +226,34 @@
 #define PPC_INST_MTLR          0x7c0803a6
 #define PPC_INST_CMPWI         0x2c000000
 #define PPC_INST_CMPDI         0x2c200000
+#define PPC_INST_CMPW          0x7c000000
+#define PPC_INST_CMPD          0x7c200000
 #define PPC_INST_CMPLW         0x7c000040
+#define PPC_INST_CMPLD         0x7c200040
 #define PPC_INST_CMPLWI        0x28000000
+#define PPC_INST_CMPLDI        0x28200000
 #define PPC_INST_ADDI          0x38000000
 #define PPC_INST_ADDIS         0x3c000000
 #define PPC_INST_ADD           0x7c000214
 #define PPC_INST_SUB           0x7c000050
 #define PPC_INST_BLR           0x4e800020
 #define PPC_INST_BLRL          0x4e800021
+#define PPC_INST_MULLD         0x7c0001d2
 #define PPC_INST_MULLW         0x7c0001d6
 #define PPC_INST_MULHWU        0x7c000016
 #define PPC_INST_MULLI         0x1c000000
 #define PPC_INST_DIVWU         0x7c000396
+#define PPC_

[PATCH 1/6] ppc: bpf/jit: Fix/enhance 32-bit Load Immediate implementation

2016-06-07 Thread Naveen N. Rao
The existing LI32() macro can sometimes result in a sign-extended 32-bit
load that does not clear the top 32-bits properly. As an example,
loading 0x7fffffff results in the register containing
0xffffffff7fffffff. While this does not impact classic BPF JIT
implementation (since that only uses the lower word for all operations),
we would like to share this macro between classic BPF JIT and extended
BPF JIT, wherein the entire 64-bit value in the register matters. Fix
this by first doing a shifted LI followed by ORI.

An additional optimization applies to values between -32768 and -1,
which now need only a single LI.

The new implementation generates the same number of instructions or
fewer.
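
The difference between the old and new sequences can be checked with a
small user-space simulation. This is a hedged sketch, not the JIT code:
the sim_* helpers model the li/addis/ori semantics on a 64-bit register,
and old_li32()/new_li32() are hypothetical names that follow the old and
new PPC_LI32() macro bodies in the diff below.

```c
#include <stdint.h>

/* Sign-extend a 16-bit immediate, as li/addi/addis do. */
static int64_t sext16(uint32_t imm)
{
	int64_t v = imm & 0xffff;

	return (v & 0x8000) ? v - 0x10000 : v;
}

/* li rD, imm: rD = sign-extended 16-bit immediate */
static uint64_t sim_li(uint32_t imm)
{
	return (uint64_t)sext16(imm);
}

/* addis rD, rA, imm: rD = rA + (sign-extended imm << 16);
 * lis rD, imm is addis with rA = 0 */
static uint64_t sim_addis(uint64_t ra, uint32_t imm)
{
	return ra + (uint64_t)(sext16(imm) * 65536);
}

/* ori rD, rS, imm: rD = rS | zero-extended 16-bit immediate */
static uint64_t sim_ori(uint64_t rs, uint32_t imm)
{
	return rs | (imm & 0xffff);
}

/* Old sequence: li with the low half, then addis with IMM_HA(i). */
uint64_t old_li32(uint32_t i)
{
	uint64_t d = sim_li(i & 0xffff);

	if (i >= 32768)
		d = sim_addis(d, (i >> 16) + ((i & 0x8000) >> 15));
	return d;
}

/* New sequence: a single li for [-32768, 32768), else lis + ori. */
uint64_t new_li32(uint32_t i)
{
	uint64_t d;

	if (i < 0x8000u || i >= 0xffff8000u)
		return sim_li(i);
	d = sim_addis(0, i >> 16);
	if (i & 0xffff)
		d = sim_ori(d, i);
	return d;
}
```

For 0x7fffffff the old sequence leaves the upper word set, while the new
one yields the expected value.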

Cc: Matt Evans 
Cc: Denis Kirjanov 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: "David S. Miller" 
Cc: Ananth N Mavinakayanahalli 
Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/net/bpf_jit.h | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 889fd19..a9882db 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -232,10 +232,17 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 (((cond) & 0x3ff) << 16) |   \
 (((dest) - (ctx->idx * 4)) & \
  0xfffc))
-#define PPC_LI32(d, i) do { PPC_LI(d, IMM_L(i)); \
-   if ((u32)(uintptr_t)(i) >= 32768) {   \
-   PPC_ADDIS(d, d, IMM_HA(i));   \
+/* Sign-extended 32-bit immediate load */
+#define PPC_LI32(d, i) do {  \
+   if ((int)(uintptr_t)(i) >= -32768 &&  \
+   (int)(uintptr_t)(i) < 32768)  \
+   PPC_LI(d, i); \
+   else {\
+   PPC_LIS(d, IMM_H(i)); \
+   if (IMM_L(i)) \
+   PPC_ORI(d, d, IMM_L(i));  \
} } while(0)
+
 #define PPC_LI64(d, i) do {  \
if (!((uintptr_t)(i) & 0xULL))\
PPC_LI32(d, i);   \
-- 
2.8.2


[PATCH 5/6] ppc: bpf/jit: Isolate classic BPF JIT specifics into a separate header

2016-06-07 Thread Naveen N. Rao
Break out classic BPF JIT specifics into a separate header in
preparation for eBPF JIT implementation. Note that ppc32 will still need
the classic BPF JIT.

Cc: Matt Evans 
Cc: Denis Kirjanov 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: "David S. Miller" 
Cc: Ananth N Mavinakayanahalli 
Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/net/bpf_jit.h  | 121 +-
 arch/powerpc/net/bpf_jit32.h| 139 
 arch/powerpc/net/bpf_jit_asm.S  |   2 +-
 arch/powerpc/net/bpf_jit_comp.c |   2 +-
 4 files changed, 143 insertions(+), 121 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit32.h

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 9041d3f..313cfaf 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -1,4 +1,5 @@
-/* bpf_jit.h: BPF JIT compiler for PPC64
+/*
+ * bpf_jit.h: BPF JIT compiler for PPC
  *
  * Copyright 2011 Matt Evans , IBM Corporation
  *
@@ -10,66 +11,8 @@
 #ifndef _BPF_JIT_H
 #define _BPF_JIT_H
 
-#ifdef CONFIG_PPC64
-#define BPF_PPC_STACK_R3_OFF	48
-#define BPF_PPC_STACK_LOCALS	32
-#define BPF_PPC_STACK_BASIC	(48+64)
-#define BPF_PPC_STACK_SAVE	(18*8)
-#define BPF_PPC_STACKFRAME	(BPF_PPC_STACK_BASIC+BPF_PPC_STACK_LOCALS+ \
-				 BPF_PPC_STACK_SAVE)
-#define BPF_PPC_SLOWPATH_FRAME	(48+64)
-#else
-#define BPF_PPC_STACK_R3_OFF	24
-#define BPF_PPC_STACK_LOCALS	16
-#define BPF_PPC_STACK_BASIC	(24+32)
-#define BPF_PPC_STACK_SAVE	(18*4)
-#define BPF_PPC_STACKFRAME	(BPF_PPC_STACK_BASIC+BPF_PPC_STACK_LOCALS+ \
-				 BPF_PPC_STACK_SAVE)
-#define BPF_PPC_SLOWPATH_FRAME	(24+32)
-#endif
-
-#define REG_SZ (BITS_PER_LONG/8)
-
-/*
- * Generated code register usage:
- *
- * As normal PPC C ABI (e.g. r1=sp, r2=TOC), with:
- *
- * skb		r3	(Entry parameter)
- * A register	r4
- * X register	r5
- * addr param	r6
- * r7-r10	scratch
- * skb->data	r14
- * skb headlen	r15	(skb->len - skb->data_len)
- * m[0]		r16
- * m[...]	...
- * m[15]	r31
- */
-#define r_skb		3
-#define r_ret		3
-#define r_A		4
-#define r_X		5
-#define r_addr		6
-#define r_scratch1	7
-#define r_scratch2	8
-#define r_D		14
-#define r_HL		15
-#define r_M		16
-
 #ifndef __ASSEMBLY__
 
-/*
- * Assembly helpers from arch/powerpc/net/bpf_jit.S:
- */
-#define DECLARE_LOAD_FUNC(func)\
-   extern u8 func[], func##_negative_offset[], func##_positive_offset[]
-
-DECLARE_LOAD_FUNC(sk_load_word);
-DECLARE_LOAD_FUNC(sk_load_half);
-DECLARE_LOAD_FUNC(sk_load_byte);
-DECLARE_LOAD_FUNC(sk_load_byte_msh);
-
 #ifdef CONFIG_PPC64
 #define FUNCTION_DESCR_SIZE	24
 #else
@@ -131,46 +74,6 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_BPF_STLU(r, base, i) do { PPC_STWU(r, base, i); } while(0)
 #endif
 
-/* Convenience helpers for the above with 'far' offsets: */
-#define PPC_LBZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LBZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LBZ(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LD_OFFS(r, base, i) do { if ((i) < 32768) PPC_LD(r, base, i); \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LD(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LWZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LWZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LWZ(r, r, IMM_L(i)); } } while(0)
-
-#define PPC_LHZ_OFFS(r, base, i) do { if ((i) < 32768) PPC_LHZ(r, base, i);   \
-   else {  PPC_ADDIS(r, base, IMM_HA(i));\
-   PPC_LHZ(r, r, IMM_L(i)); } } while(0)
-
-#ifdef CONFIG_PPC64
-#define PPC_LL_OFFS(r, base, i) do { PPC_LD_OFFS(r, base, i); } while(0)
-#else
-#define PPC_LL_OFFS(r, base, i) do { PPC_LWZ_OFFS(r, base, i); } while(0)
-#endif
-
-#ifdef CONFIG_SMP
-#ifdef CONFIG_PPC64
-#define PPC_BPF_LOAD_CPU(r)		\
-	do { BUILD_BUG_ON(FIELD_SIZEOF(struct paca_struct, paca_index) != 2);	\
-	PPC_LHZ_OFFS(r, 13, offsetof(struct paca_struct, paca_index));	\
-	} while (0)
-#else
-#define PPC_BPF_LOAD_CPU(r)     \
-	do { BUILD_BUG_ON(FIELD_SIZEOF(struct thread_info, cpu) != 4);	\
-	PPC_LHZ_OFFS(r, (1 & ~(THREAD_SIZE - 1)),	\
-			offsetof(struct thread_info, cpu));	\
-	} while(0)
-#endif
-#else
-#define PPC_BPF_LOAD_CPU(r) do { PPC_LI(r, 0); } while(0)
-#endif
-
 #define PPC_CMPWI(a, i)	EMIT(PPC_INST_CMPWI | ___PPC_RA(a) | IMM_L(i))
 #define PPC_CMP

[PATCH 4/6] ppc: bpf/jit: A few cleanups

2016-06-07 Thread Naveen N. Rao
1. Per the ISA, ADDIS actually uses RT, rather than RS. Though
the result is the same, make the usage clear.
2. The multiply instruction used is a 32-bit multiply. Rename PPC_MUL()
to PPC_MULW() to make the same clear.
3. PPC_STW[U] take the entire 16-bit immediate value and do not require
word-alignment, per the ISA. Change the macros to use IMM_L().
4. A few white-space cleanups to satisfy checkpatch.pl.
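
Item 3 can be illustrated with a short encoding sketch. The helpers
below are hypothetical (toy_* names are not from the patch); they only
build a D-form stw word the way the old and new PPC_STW() bodies would,
assuming the usual field layout of RS at bit 21 and RA at bit 16.

```c
#include <stdint.h>

#define TOY_INST_STW 0x90000000u	/* stw: primary opcode 36, D-form */

/* Old macro body: the low two displacement bits are masked off, which
 * is only appropriate for DS-form instructions such as std. */
uint32_t toy_stw_masked(unsigned int rs, unsigned int ra, int off)
{
	return TOY_INST_STW | ((rs & 0x1f) << 21) | ((ra & 0x1f) << 16) |
	       ((uint32_t)off & 0xfffc);
}

/* New macro body: the full 16-bit displacement is kept. */
uint32_t toy_stw_full(unsigned int rs, unsigned int ra, int off)
{
	return TOY_INST_STW | ((rs & 0x1f) << 21) | ((ra & 0x1f) << 16) |
	       ((uint32_t)off & 0xffff);
}
```

With an offset of 6, the masked variant silently encodes a displacement
of 4, while the full variant encodes 6 as requested.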

Cc: Matt Evans 
Cc: Denis Kirjanov 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: "David S. Miller" 
Cc: Ananth N Mavinakayanahalli 
Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/net/bpf_jit.h  | 13 +++--
 arch/powerpc/net/bpf_jit_comp.c |  8 
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 95d0e38..9041d3f 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -83,7 +83,7 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
  */
 #define IMM_H(i)   ((uintptr_t)(i)>>16)
 #define IMM_HA(i)  (((uintptr_t)(i)>>16) +   \
-(((uintptr_t)(i) & 0x8000) >> 15))
+   (((uintptr_t)(i) & 0x8000) >> 15))
 #define IMM_L(i)   ((uintptr_t)(i) & 0xffff)
 
 #define PLANT_INSTR(d, idx, instr)   \
@@ -99,16 +99,16 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_MR(d, a)   PPC_OR(d, a, a)
 #define PPC_LI(r, i)   PPC_ADDI(r, 0, i)
 #define PPC_ADDIS(d, a, i) EMIT(PPC_INST_ADDIS | \
-___PPC_RS(d) | ___PPC_RA(a) | IMM_L(i))
+___PPC_RT(d) | ___PPC_RA(a) | IMM_L(i))
 #define PPC_LIS(r, i)  PPC_ADDIS(r, 0, i)
 #define PPC_STD(r, base, i)    EMIT(PPC_INST_STD | ___PPC_RS(r) |\
 ___PPC_RA(base) | ((i) & 0xfffc))
 #define PPC_STDU(r, base, i)   EMIT(PPC_INST_STDU | ___PPC_RS(r) |   \
 ___PPC_RA(base) | ((i) & 0xfffc))
 #define PPC_STW(r, base, i)    EMIT(PPC_INST_STW | ___PPC_RS(r) |\
-___PPC_RA(base) | ((i) & 0xfffc))
+___PPC_RA(base) | IMM_L(i))
 #define PPC_STWU(r, base, i)   EMIT(PPC_INST_STWU | ___PPC_RS(r) |   \
-___PPC_RA(base) | ((i) & 0xfffc))
+___PPC_RA(base) | IMM_L(i))
 
 #define PPC_LBZ(r, base, i)    EMIT(PPC_INST_LBZ | ___PPC_RT(r) |\
 ___PPC_RA(base) | IMM_L(i))
@@ -174,13 +174,14 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
 #define PPC_CMPWI(a, i)    EMIT(PPC_INST_CMPWI | ___PPC_RA(a) | IMM_L(i))
 #define PPC_CMPDI(a, i)    EMIT(PPC_INST_CMPDI | ___PPC_RA(a) | IMM_L(i))
 #define PPC_CMPLWI(a, i)   EMIT(PPC_INST_CMPLWI | ___PPC_RA(a) | IMM_L(i))
-#define PPC_CMPLW(a, b)    EMIT(PPC_INST_CMPLW | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_CMPLW(a, b)    EMIT(PPC_INST_CMPLW | ___PPC_RA(a) |  \
+   ___PPC_RB(b))
 
 #define PPC_SUB(d, a, b)   EMIT(PPC_INST_SUB | ___PPC_RT(d) |\
 ___PPC_RB(a) | ___PPC_RA(b))
 #define PPC_ADD(d, a, b)   EMIT(PPC_INST_ADD | ___PPC_RT(d) |\
 ___PPC_RA(a) | ___PPC_RB(b))
-#define PPC_MUL(d, a, b)   EMIT(PPC_INST_MULLW | ___PPC_RT(d) |  \
+#define PPC_MULW(d, a, b)  EMIT(PPC_INST_MULLW | ___PPC_RT(d) |  \
 ___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_MULHWU(d, a, b)    EMIT(PPC_INST_MULHWU | ___PPC_RT(d) | \
 ___PPC_RA(a) | ___PPC_RB(b))
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 2d66a84..6012aac 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -161,14 +161,14 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
break;
case BPF_ALU | BPF_MUL | BPF_X: /* A *= X; */
ctx->seen |= SEEN_XREG;
-   PPC_MUL(r_A, r_A, r_X);
+   PPC_MULW(r_A, r_A, r_X);
break;
case BPF_ALU | BPF_MUL | BPF_K: /* A *= K */
if (K < 32768)
PPC_MULI(r_A, r_A, K);
else {
PPC_LI32(r_scratch1, K);
-   PPC_MUL(r_A, r_A, r_scratch1);
+   PPC_MULW(r_A, r_A, r_scratch1);
}
break;
case BPF_ALU | BPF_MOD | BPF_X: /* A %= X; */
@@ -184,7 +184,7 

[PATCH 2/6] ppc: bpf/jit: Optimize 64-bit Immediate loads

2016-06-07 Thread Naveen N. Rao
Similar to the LI32() optimization, if the value can be represented
in 32-bits, use LI32(). Also handle loading a few specific forms of
immediate values in an optimum manner.
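
The effect on instruction count can be explored with a user-space
model. This is a hedged sketch rather than the JIT itself: struct
toy_reg and the toy_* helpers are invented here, the 64-bit masks are
written out in full as an assumption about what the macro intends, and
toy_li64() simply follows the branch structure of the new PPC_LI64()
in the diff below while counting emitted instructions.

```c
#include <stdint.h>

/* Register value plus a count of simulated instructions. */
struct toy_reg {
	uint64_t val;
	int insns;
};

/* Sign-extend a 16-bit immediate, as li/lis do. */
static int64_t sext16(uint64_t imm)
{
	int64_t v = (int64_t)(imm & 0xffff);

	return (v & 0x8000) ? v - 0x10000 : v;
}

static void toy_li(struct toy_reg *r, uint64_t imm)
{
	r->val = (uint64_t)sext16(imm);
	r->insns++;
}

static void toy_lis(struct toy_reg *r, uint64_t imm)
{
	r->val = (uint64_t)(sext16(imm) * 65536);
	r->insns++;
}

static void toy_ori(struct toy_reg *r, uint64_t imm)
{
	r->val |= imm & 0xffff;
	r->insns++;
}

static void toy_oris(struct toy_reg *r, uint64_t imm)
{
	r->val |= (imm & 0xffff) << 16;
	r->insns++;
}

static void toy_sldi(struct toy_reg *r, int n)
{
	r->val <<= n;
	r->insns++;
}

/* Mirrors the new PPC_LI32(): one li for small values, else lis + ori. */
static void toy_li32(struct toy_reg *r, uint64_t i)
{
	uint32_t lo = (uint32_t)i;

	if (lo < 0x8000u || lo >= 0xffff8000u) {
		toy_li(r, lo);
		return;
	}
	toy_lis(r, lo >> 16);
	if (lo & 0xffff)
		toy_ori(r, lo);
}

/* Mirrors the branch structure of the new PPC_LI64(). */
struct toy_reg toy_li64(uint64_t i)
{
	struct toy_reg r = { 0, 0 };

	/* Value fits in a sign-extended 32-bit immediate. */
	if (i <= 0x7fffffffULL || i >= 0xffffffff80000000ULL) {
		toy_li32(&r, i);
		return r;
	}
	if (!(i & 0xffff800000000000ULL)) {
		/* Upper word is a small positive number: one li. */
		toy_li(&r, (i >> 32) & 0xffff);
	} else {
		toy_lis(&r, i >> 48);
		if (i & 0x0000ffff00000000ULL)
			toy_ori(&r, (i >> 32) & 0xffff);
	}
	toy_sldi(&r, 32);
	if (i & 0x00000000ffff0000ULL)
		toy_oris(&r, (i >> 16) & 0xffff);
	if (i & 0x000000000000ffffULL)
		toy_ori(&r, i & 0xffff);
	return r;
}
```

In this model a value such as 1 << 32 takes two instructions (li, sldi),
whereas a fully general 64-bit constant still takes five.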

Cc: Matt Evans 
Cc: Denis Kirjanov 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: "David S. Miller" 
Cc: Ananth N Mavinakayanahalli 
Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/net/bpf_jit.h | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index a9882db..4c1e055 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -244,20 +244,25 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
} } while(0)
 
 #define PPC_LI64(d, i) do {  \
-   if (!((uintptr_t)(i) & 0xffffffff00000000ULL))\
+   if ((long)(i) >= -2147483648 &&   \
+   (long)(i) < 2147483648)   \
 PPC_LI32(d, i);   \
 else {\
-   PPC_LIS(d, ((uintptr_t)(i) >> 48));   \
-   if ((uintptr_t)(i) & 0x0000ffff00000000ULL)   \
-   PPC_ORI(d, d, \
-   ((uintptr_t)(i) >> 32) & 0xffff); \
+   if (!((uintptr_t)(i) & 0xffff800000000000ULL))\
+   PPC_LI(d, ((uintptr_t)(i) >> 32) & 0xffff);   \
+   else {\
+   PPC_LIS(d, ((uintptr_t)(i) >> 48));   \
+   if ((uintptr_t)(i) & 0x0000ffff00000000ULL)   \
+   PPC_ORI(d, d, \
+ ((uintptr_t)(i) >> 32) & 0xffff);   \
+   } \
 PPC_SLDI(d, d, 32);   \
 if ((uintptr_t)(i) & 0x00000000ffff0000ULL)   \
 PPC_ORIS(d, d,\
  ((uintptr_t)(i) >> 16) & 0xffff);\
 if ((uintptr_t)(i) & 0x000000000000ffffULL)   \
 PPC_ORI(d, d, (uintptr_t)(i) & 0xffff);   \
-   } } while (0);
+   } } while (0)
 
 #ifdef CONFIG_PPC64
 #define PPC_FUNC_ADDR(d,i) do { PPC_LI64(d, i); } while(0)
-- 
2.8.2


[PATCH 0/6] eBPF JIT for PPC64

2016-06-07 Thread Naveen N. Rao
Implement extended BPF JIT for ppc64. We retain the classic BPF JIT for
ppc32 and move ppc64 BE/LE to use the new JIT. Classic BPF filters will
be converted to extended BPF (see convert_filter()) and JIT'ed with the
new compiler.

Most of the existing macros are retained and fixed/enhanced where
appropriate. Patches 1-4 are geared towards this.

Patch 5 breaks out the classic BPF JIT specifics into a separate
bpf_jit32.h header file, while retaining all the generic instruction
macros in bpf_jit.h.

Patch 6 implements eBPF JIT for ppc64.

Since the RFC patchset [1], powerpc JIT has now gained support for skb
access helpers and now passes all tests in test_bpf.ko. Review comments
on the RFC patches have been addressed (use of an ABI macro [2] and use
of bpf_jit_binary_alloc()), along with a few other generic fixes and
updates.

Prominent TODOs:
 - implement BPF tail calls
 - support for BPF constant blinding

Please note that patch [2] is a pre-requisite for this patchset, and is
not yet upstream.


- Naveen

[1] http://thread.gmane.org/gmane.linux.kernel/2188694
[2] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/96514


Naveen N. Rao (6):
  ppc: bpf/jit: Fix/enhance 32-bit Load Immediate implementation
  ppc: bpf/jit: Optimize 64-bit Immediate loads
  ppc: bpf/jit: Introduce rotate immediate instructions
  ppc: bpf/jit: A few cleanups
  ppc: bpf/jit: Isolate classic BPF JIT specifics into a separate header
  ppc: ebpf/jit: Implement JIT compiler for extended BPF

 arch/powerpc/Kconfig  |   3 +-
 arch/powerpc/include/asm/asm-compat.h |   2 +
 arch/powerpc/include/asm/ppc-opcode.h |  22 +-
 arch/powerpc/net/Makefile |   4 +
 arch/powerpc/net/bpf_jit.h| 235 -
 arch/powerpc/net/bpf_jit32.h  | 139 +
 arch/powerpc/net/bpf_jit64.h  | 102 
 arch/powerpc/net/bpf_jit_asm.S|   2 +-
 arch/powerpc/net/bpf_jit_asm64.S  | 180 +++
 arch/powerpc/net/bpf_jit_comp.c   |  10 +-
 arch/powerpc/net/bpf_jit_comp64.c | 956 ++
 11 files changed, 1504 insertions(+), 151 deletions(-)
 create mode 100644 arch/powerpc/net/bpf_jit32.h
 create mode 100644 arch/powerpc/net/bpf_jit64.h
 create mode 100644 arch/powerpc/net/bpf_jit_asm64.S
 create mode 100644 arch/powerpc/net/bpf_jit_comp64.c

-- 
2.8.2


Re: [PATCH 2/2] KVM: PPC: hypervisor large decrementer support

2016-06-07 Thread Michael Ellerman
On Fri, 2016-06-03 at 07:46 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2016-06-01 at 16:23 +1000, Michael Neuling wrote:
> > FWIW you can use:
> > andis. reg,reg,(LPCR_LD)@ha
> 
> @h in that case. Probably the same result but technically @ha is for
> arithmetic operations.

In this case it doesn't matter, because LPCR_LD is a single bit constant.

But if bit 15 of your constant happens to be 1, using @ha will mean 1 is added
to the high 16 bits of the value.

eg. if you have:

#define CONSTANT 0x00208000

lis r4,CONSTANT@ha

You get:

1118:   21 00 80 3c lis r4,33
                           ^
                           = 0x21


Documented (sort of) here:

  http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html#RELOC-TYPE
  
  #ha(value) denotes the high adjusted value: bits 16 through 31 of the
  indicated value, compensating for #lo() being treated as a signed number:
  
  #ha(x) = (((x >> 16) + ((x & 0x8000) ? 1 : 0)) & 0xffff)
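
The same arithmetic as a small C sketch, for anyone who wants to try
other constants. The function names are made up for this example; the
formulas are the plain high half, the high-adjusted half as quoted from
the ABI document, and the sign-extended low half that @ha compensates
for.

```c
#include <stdint.h>

/* @h: plain upper 16 bits of a 32-bit value. */
uint32_t ppc_hi(uint32_t x)
{
	return (x >> 16) & 0xffff;
}

/* @ha: upper 16 bits, plus one if bit 15 of the value is set. */
uint32_t ppc_ha(uint32_t x)
{
	return ((x >> 16) + ((x & 0x8000) ? 1 : 0)) & 0xffff;
}

/* @l as seen by addi and friends: low 16 bits, sign-extended. */
int32_t ppc_lo_signed(uint32_t x)
{
	int32_t lo = (int32_t)(x & 0xffff);

	return (lo & 0x8000) ? lo - 0x10000 : lo;
}

/* lis rD,x@ha followed by addi rD,rD,x@l rebuilds the value. */
uint32_t ppc_recombine(uint32_t x)
{
	return (ppc_ha(x) << 16) + (uint32_t)ppc_lo_signed(x);
}
```

For 0x00208000, ppc_hi() gives 0x20 and ppc_ha() gives 0x21, which is
why @ha is only the right choice when the low half is going to be added
as a signed quantity.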

cheers


[PATCH 2/2] spapr: Better handling of ibm,pa-features TM bit

2016-06-07 Thread Anton Blanchard
From: Anton Blanchard 

There are a few issues with our handling of the ibm,pa-features
TM bit:

- We don't support transactional memory in PR KVM, so don't tell
  the OS that we do.

- In full emulation we have a minimal implementation of TM that always
  fails, so for performance reasons let's not tell the OS that we
  support it either.

- In HV KVM mode, we should mirror the host TM enabled state by
  looking at the AT_HWCAP2 bit.
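
The three cases boil down to a small decision function. The sketch
below is illustrative only: enum toy_accel and toy_advertise_tm() are
hypothetical names, not QEMU APIs, and TOY_FEATURE2_HAS_HTM is assumed
to carry the same value as PPC_FEATURE2_HAS_HTM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match PPC_FEATURE2_HAS_HTM from the AT_HWCAP2 bits. */
#define TOY_FEATURE2_HAS_HTM 0x40000000u

/* The three execution modes discussed above. */
enum toy_accel {
	TOY_ACCEL_TCG,		/* full emulation */
	TOY_ACCEL_KVM_PR,	/* PR KVM */
	TOY_ACCEL_KVM_HV	/* HV KVM */
};

/* Returns true if the TM bit should be set in ibm,pa-features. */
bool toy_advertise_tm(enum toy_accel accel, uint32_t host_hwcap2)
{
	/* Neither PR KVM nor full emulation should advertise TM. */
	if (accel != TOY_ACCEL_KVM_HV)
		return false;

	/* HV KVM: mirror the host's TM enabled state. */
	return (host_hwcap2 & TOY_FEATURE2_HAS_HTM) != 0;
}
```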

Signed-off-by: Anton Blanchard 
---

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0636642..c403fbb 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -620,7 +620,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
 0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
 0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
-0x80, 0x00, 0x80, 0x00, 0x80, 0x00 };
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
 uint8_t *pa_features;
 size_t pa_size;
 
@@ -697,6 +697,19 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
 } else /* env->mmu_model == POWERPC_MMU_2_07 */ {
 pa_features = pa_features_207;
 pa_size = sizeof(pa_features_207);
+
+#ifdef CONFIG_KVM
+/* Only enable TM in HV KVM mode */
+if (kvm_enabled() &&
+!kvm_vm_check_extension(cs->kvm_state, KVM_CAP_PPC_GET_PVINFO)) {
+unsigned long hwcap2 = qemu_getauxval(AT_HWCAP2);
+
+/* Guest should inherit host TM enabled bit */
+if (hwcap2 & PPC_FEATURE2_HAS_HTM) {
+pa_features[24] |= 0x80;
+}
+}
+#endif
 }
 if (env->ci_large_pages) {
 pa_features[3] |= 0x20;


[PATCH 1/2] Add PowerPC AT_HWCAP2 definitions

2016-06-07 Thread Anton Blanchard
From: Anton Blanchard 

We need the PPC_FEATURE2_HAS_HTM bit in a subsequent patch, so
add the PowerPC AT_HWCAP2 definitions.

Signed-off-by: Anton Blanchard 
---

diff --git a/include/elf.h b/include/elf.h
index 28d448b..8533b2a 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -477,6 +477,19 @@ typedef struct {
 #define PPC_FEATURE_TRUE_LE         0x00000002
 #define PPC_FEATURE_PPC_LE          0x00000001
 
+/* Bits present in AT_HWCAP2 for PowerPC.  */
+
+#define PPC_FEATURE2_ARCH_2_07      0x80000000
+#define PPC_FEATURE2_HAS_HTM        0x40000000
+#define PPC_FEATURE2_HAS_DSCR       0x20000000
+#define PPC_FEATURE2_HAS_EBB        0x10000000
+#define PPC_FEATURE2_HAS_ISEL       0x08000000
+#define PPC_FEATURE2_HAS_TAR        0x04000000
+#define PPC_FEATURE2_HAS_VEC_CRYPTO 0x02000000
+#define PPC_FEATURE2_HTM_NOSC       0x01000000
+#define PPC_FEATURE2_ARCH_3_00      0x00800000
+#define PPC_FEATURE2_HAS_IEEE128    0x00400000
+
 /* Bits present in AT_HWCAP for Sparc.  */
 
 #define HWCAP_SPARC_FLUSH   0x0001


Re: [RFC v3 20/45] xen: dma-mapping: Use unsigned long for dma_attrs

2016-06-07 Thread David Vrabel
On 02/06/16 16:39, Krzysztof Kozlowski wrote:
> Split out subsystem specific changes for easier reviews. This will be
> squashed with main commit.

Acked-by: David Vrabel 

David

Re: [PATCH] powerpc/mm/hash: Fix the reference bit update when handling hash fault

2016-06-07 Thread Michael Ellerman
On Thu, 2016-06-02 at 08:12 -0700, Hugh Dickins wrote:
> On Tue, 31 May 2016, Hugh Dickins wrote:
> > 
> > But all my evidence so far is that it is now right: I'll continue
> > testing v4.6+fix on a couple of loads until this evening: all is
> > well so far.  And then switch to testing v4.5+fix on those loads
> > for another day and a half.
> 
> I'm glad to confirm: your patch to htab_convert_pte_flags() fixes all
> the elusive sigsegv issues I was seeing under load on v4.5 and v4.6.
> Thank you!

Thanks for all the testing Hugh, and sorry for the bug.

cheers


Re: powerpc/nvram: Fix an incorrect partition merge

2016-06-07 Thread Michael Ellerman
On Mon, 2016-06-06 at 13:31 +0800, xinhui wrote:
> On 2016-06-03 19:47, Michael Ellerman wrote:
> > On Thu, 2015-10-12 at 07:30:02 UTC, xinhui wrote:
> > > From: Pan Xinhui 
> > > 
> > > When we merge two contiguous partitions whose signatures are marked
> > > NVRAM_SIG_FREE, we need to update prev's length and checksum, then write
> > > it to nvram, not cur's. So let's fix this mistake now.
> > > 
> > > Also use memset instead of strncpy to set the partition's name. It's
> > > more readable if we want to fill up with duplicate chars.
> > 
> > Does this ever happen in practice? ie. should we backport the fix to stable
> > kernels?
> 
> I did not see that nvram warning in practice. BUT I suggest to backport it to 
> stable kernel. :)
> 
> Let me recall the story. :)
> In past days, I was using pstore to keep some kernel logs. and sometimes I 
> found my own logs and the panic logs did not show.
> pstore use a fixed-address reserved memory In x86 while nvram instead in ppc.
> 
> Then I spent some days to review the nvram codes.
> And worked out three patches to fix all issues that I found in nvram. BUT 
> looks like I only sent out two of them. :)
> I lost the third patch maybe...

OK.

> > Has it always been broken?
> 
> no. after nvram partition corruption hit, all nvram partitions will be erased 
> and re-alloc after the second machine reboot.
> I don't know who does it but i guess it is the firmware. :)

Actually I meant: has the code always contained the bug, or was it added
recently?

cheers


[PATCH V4 2/2] cpufreq: Reuse new freq-table helpers

2016-06-07 Thread Viresh Kumar
This patch migrates a few users of cpufreq tables to the new helpers that
work on sorted freq-tables.
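
For readers unfamiliar with the relations involved, the sketch below
shows what such lookups amount to on a table sorted in ascending order.
It is a user-space illustration with hypothetical toy_* names and a
made-up table, not the kernel helpers themselves; tie-breaking in the
"closest" case is an arbitrary choice made for this example.

```c
#include <stddef.h>

/* Example table in kHz, sorted ascending (made-up values). */
const unsigned int toy_table[] = { 800000, 1600000, 2400000 };
const size_t toy_table_len = 3;

/* Relation L: lowest frequency at or above target. Falls back to the
 * highest entry when target is above the whole table. */
size_t toy_find_index_l(const unsigned int *table, size_t n,
			unsigned int target)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i] >= target)
			return i;
	return n - 1;
}

/* Relation H: highest frequency at or below target. Falls back to the
 * lowest entry when target is below the whole table. */
size_t toy_find_index_h(const unsigned int *table, size_t n,
			unsigned int target)
{
	size_t i, best = 0;

	for (i = 0; i < n; i++) {
		if (table[i] > target)
			break;
		best = i;
	}
	return best;
}

/* Relation C: closest frequency to target. On a tie this sketch picks
 * the lower entry. */
size_t toy_find_index_c(const unsigned int *table, size_t n,
			unsigned int target)
{
	size_t lo = toy_find_index_h(table, n, target);
	size_t hi = toy_find_index_l(table, n, target);
	unsigned int dlo = target > table[lo] ? target - table[lo]
					      : table[lo] - target;
	unsigned int dhi = target > table[hi] ? target - table[hi]
					      : table[hi] - target;

	return dhi < dlo ? hi : lo;
}
```

Because the table is sorted, each lookup can stop at the first entry
that crosses the target instead of scanning every entry and tracking
the best candidate.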

Signed-off-by: Viresh Kumar 
---
 drivers/cpufreq/acpi-cpufreq.c | 14 --
 drivers/cpufreq/amd_freq_sensitivity.c |  4 ++--
 drivers/cpufreq/cpufreq_ondemand.c |  6 ++
 drivers/cpufreq/powernv-cpufreq.c  |  3 +--
 drivers/cpufreq/s5pv210-cpufreq.c  |  3 +--
 5 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 32a15052f363..11c9a078e0fd 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -468,20 +468,14 @@ unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
struct acpi_cpufreq_data *data = policy->driver_data;
struct acpi_processor_performance *perf;
struct cpufreq_frequency_table *entry;
-   unsigned int next_perf_state, next_freq, freq;
+   unsigned int next_perf_state, next_freq, index;
 
/*
 * Find the closest frequency above target_freq.
-*
-* The table is sorted in the reverse order with respect to the
-* frequency and all of the entries are valid (see the initialization).
 */
-   entry = policy->freq_table;
-   do {
-   entry++;
-   freq = entry->frequency;
-   } while (freq >= target_freq && freq != CPUFREQ_TABLE_END);
-   entry--;
+   index = cpufreq_table_find_index_dl(policy, target_freq);
+
+   entry = &policy->freq_table[index];
next_freq = entry->frequency;
next_perf_state = entry->driver_data;
 
diff --git a/drivers/cpufreq/amd_freq_sensitivity.c b/drivers/cpufreq/amd_freq_sensitivity.c
index 6d5dc04c3a37..042023bbbf62 100644
--- a/drivers/cpufreq/amd_freq_sensitivity.c
+++ b/drivers/cpufreq/amd_freq_sensitivity.c
@@ -91,8 +91,8 @@ static unsigned int amd_powersave_bias_target(struct cpufreq_policy *policy,
else {
unsigned int index;
 
-   index = cpufreq_frequency_table_target(policy,
-   policy->cur - 1, CPUFREQ_RELATION_H);
+   index = cpufreq_table_find_index_h(policy,
+  policy->cur - 1);
freq_next = policy->freq_table[index].frequency;
}
 
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 0c93cd9dee99..3a1f49f5f4c6 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -85,11 +85,9 @@ static unsigned int generic_powersave_bias_target(struct cpufreq_policy *policy,
freq_avg = freq_req - freq_reduc;
 
/* Find freq bounds for freq_avg in freq_table */
-   index = cpufreq_frequency_table_target(policy, freq_avg,
-  CPUFREQ_RELATION_H);
+   index = cpufreq_table_find_index_h(policy, freq_avg);
freq_lo = freq_table[index].frequency;
-   index = cpufreq_frequency_table_target(policy, freq_avg,
-  CPUFREQ_RELATION_L);
+   index = cpufreq_table_find_index_l(policy, freq_avg);
freq_hi = freq_table[index].frequency;
 
/* Find out how long we have to be in hi and lo freqs */
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index b29c5c20c3a1..2a2920c4fdf9 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -760,8 +760,7 @@ void powernv_cpufreq_work_fn(struct work_struct *work)
struct cpufreq_policy policy;
 
cpufreq_get_policy(&policy, cpu);
-   index = cpufreq_frequency_table_target(&policy, policy.cur,
-  CPUFREQ_RELATION_C);
+   index = cpufreq_table_find_index_c(&policy, policy.cur);
powernv_cpufreq_target_index(&policy, index);
cpumask_andnot(&mask, &mask, policy.cpus);
}
diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c
index 4f4e9df9b7fc..9e07588ea9f5 100644
--- a/drivers/cpufreq/s5pv210-cpufreq.c
+++ b/drivers/cpufreq/s5pv210-cpufreq.c
@@ -246,8 +246,7 @@ static int s5pv210_target(struct cpufreq_policy *policy, unsigned int index)
new_freq = s5pv210_freq_table[index].frequency;
 
/* Finding current running level index */
-   priv_index = cpufreq_frequency_table_target(policy, old_freq,
-   CPUFREQ_RELATION_H);
+   priv_index = cpufreq_table_find_index_h(policy, old_freq);
 
arm_volt = dvs_conf[index].arm_volt;
int_volt = dvs_conf[index].int_volt;
-- 
2.7.1.410.g6faf27b
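
For readers following the conversion above: the three relations the callers used to pass select different table entries. The following is only a rough userspace sketch of that selection on an ascending table, not the kernel helpers themselves. The names find_index_h/l/c, the plain array standing in for the frequency table, the fallback indices, and the tie-break in the "closest" case are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the H/L/C lookups, operating on a plain
 * array of frequencies (kHz) sorted in ascending order. None of these
 * names or fallbacks are taken from the kernel sources.
 */

/* "H": index of the highest frequency at or below target.
 * Assumed fallback: index 0 when every entry is above target. */
static size_t find_index_h(const unsigned int *freqs, size_t n,
			   unsigned int target)
{
	size_t i, best = 0;

	assert(n > 0);
	for (i = 0; i < n; i++)
		if (freqs[i] <= target)
			best = i;
	return best;
}

/* "L": index of the lowest frequency at or above target.
 * Assumed fallback: last index when every entry is below target. */
static size_t find_index_l(const unsigned int *freqs, size_t n,
			   unsigned int target)
{
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		if (freqs[i] >= target)
			return i;
	return n - 1;
}

/* "C": index of the entry closest to target.
 * Ties go to the lower entry; that tie-break is an arbitrary choice here. */
static size_t find_index_c(const unsigned int *freqs, size_t n,
			   unsigned int target)
{
	size_t lo = find_index_h(freqs, n, target);
	size_t hi = find_index_l(freqs, n, target);
	unsigned int dlo = target > freqs[lo] ? target - freqs[lo]
					      : freqs[lo] - target;
	unsigned int dhi = target > freqs[hi] ? target - freqs[hi]
					      : freqs[hi] - target;

	return dhi < dlo ? hi : lo;
}
```

With a table of 800000, 1200000, 1600000 and 2000000 kHz and a target of 1500000, the "H" lookup picks the 1200000 entry and the "L" lookup picks 1600000. That matches how generic_powersave_bias_target() brackets freq_avg with freq_lo and freq_hi in the hunk above.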

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer

2016-06-07 Thread Jesper Nilsson
On Fri, Jun 03, 2016 at 05:06:28PM -0700, Yinghai Lu wrote:
> This one is preparing patch for next one:
>   PCI: Let pci_mmap_page_range() take resource addr
> 
> We need to pass extra resource pointer to avoid searching that again
> for powerpc and microblaze prot set operation.
> 
> Signed-off-by: Yinghai Lu 
> Cc: linux-arm-ker...@lists.infradead.org

For the CRIS part:

Acked-by: Jesper Nilsson 

> diff --git a/arch/cris/arch-v32/drivers/pci/bios.c b/arch/cris/arch-v32/drivers/pci/bios.c
> index 64a5fb9..082efb9 100644
> --- a/arch/cris/arch-v32/drivers/pci/bios.c
> +++ b/arch/cris/arch-v32/drivers/pci/bios.c
> @@ -14,7 +14,8 @@ void pcibios_set_master(struct pci_dev *dev)
>   pci_write_config_byte(dev, PCI_LATENCY_TIMER, lat);
>  }
>  
> -int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
> +int pci_mmap_page_range(struct pci_dev *dev, struct resource *res,
> + struct vm_area_struct *vma,
>   enum pci_mmap_state mmap_state, int write_combine)
>  {
>   unsigned long prot;
> diff --git a/arch/cris/include/asm/pci.h b/arch/cris/include/asm/pci.h
> index b1b289d..65198cb 100644
> --- a/arch/cris/include/asm/pci.h
> +++ b/arch/cris/include/asm/pci.h
> @@ -42,9 +42,6 @@ struct pci_dev;
>  #define PCI_DMA_BUS_IS_PHYS  (1)
>  
>  #define HAVE_PCI_MMAP
> -extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
> -enum pci_mmap_state mmap_state, int write_combine);
> -
>  
>  #endif /* __KERNEL__ */

/^JN - Jesper Nilsson
-- 
   Jesper Nilsson -- jesper.nils...@axis.com