[PATCH] rcu: Fix up pending cbs check in rcu_prepare_for_idle

2017-08-06 Thread Neeraj Upadhyay
The pending-callbacks check in rcu_prepare_for_idle() is inverted:
the loop should accelerate callbacks when there are pending
callbacks, but the check does the opposite. Fix it.

Fixes: 15fecf89e46a ("srcu: Abstract multi-tail callback list handling")
Signed-off-by: Neeraj Upadhyay 
---
 kernel/rcu/tree_plugin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 908b309..b8f51df 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1493,7 +1493,7 @@ static void rcu_prepare_for_idle(void)
rdtp->last_accelerate = jiffies;
for_each_rcu_flavor(rsp) {
rdp = this_cpu_ptr(rsp->rda);
-   if (rcu_segcblist_pend_cbs(&rdp->cblist))
+   if (!rcu_segcblist_pend_cbs(&rdp->cblist))
continue;
rnp = rdp->mynode;
raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
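
The effect of the one-line change can be modelled in user space. The
sketch below is not kernel code: struct mock_rdp, mock_pend_cbs() and
mock_prepare_for_idle() are hypothetical stand-ins for the per-CPU RCU
data, rcu_segcblist_pend_cbs() and the loop in rcu_prepare_for_idle(),
and locking is omitted entirely. It only shows which entries are
skipped by the old test and which by the new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-flavor, per-CPU RCU data. */
struct mock_rdp {
	int pending_cbs;	/* callbacks still waiting for a grace period */
	bool accelerated;	/* set when the loop "accelerates" this entry */
};

/* Stand-in for rcu_segcblist_pend_cbs(): true if callbacks are pending. */
static bool mock_pend_cbs(const struct mock_rdp *rdp)
{
	return rdp->pending_cbs > 0;
}

/*
 * Model of the loop touched by the patch. With fixed == false the
 * pre-patch (inverted) test is used; with fixed == true the patched one.
 * Returns the number of entries that were accelerated.
 */
static int mock_prepare_for_idle(struct mock_rdp *rdps, size_t n, bool fixed)
{
	int accelerated = 0;

	assert(rdps != NULL);
	for (size_t i = 0; i < n; i++) {
		bool pending = mock_pend_cbs(&rdps[i]);
		bool skip = fixed ? !pending : pending;

		rdps[i].accelerated = false;
		if (skip)
			continue;
		/* The real loop takes the rcu_node lock at this point. */
		rdps[i].accelerated = true;
		accelerated++;
	}
	return accelerated;
}
```

With entries holding 2, 0 and 5 pending callbacks, the patched test
accelerates the first and third entries, while the pre-patch test
touches only the empty one.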



Re: [PATCH 1/5] mtip32xx: Delete an error message for a failed memory allocation in five functions

2017-08-06 Thread kbuild test robot
Hi Markus,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc4 next-20170804]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/SF-Markus-Elfring/mtip32xx-Adjustments-for-some-function-implementations/20170807-033055
config: x86_64-randconfig-b0-08071209 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/uuid.h:21,
                    from include/linux/uuid.h:19,
                    from include/linux/mod_devicetable.h:12,
                    from include/linux/pci.h:20,
                    from drivers/block/mtip32xx/mtip32xx.c:21:
   include/linux/string.h: In function 'strncpy':
   include/linux/string.h:209: warning: '__f' is static but declared in 
inline function 'strncpy' which is not static
   include/linux/string.h:211: warning: '__f' is static but declared in 
inline function 'strncpy' which is not static
   include/linux/string.h: In function 'strcat':
   include/linux/string.h:219: warning: '__f' is static but declared in 
inline function 'strcat' which is not static
   include/linux/string.h:221: warning: '__f' is static but declared in 
inline function 'strcat' which is not static
   include/linux/string.h: In function 'strlen':
   include/linux/string.h:230: warning: '__f' is static but declared in 
inline function 'strlen' which is not static
   include/linux/string.h:233: warning: '__f' is static but declared in 
inline function 'strlen' which is not static
   include/linux/string.h: In function 'strnlen':
   include/linux/string.h:243: warning: '__f' is static but declared in 
inline function 'strnlen' which is not static
   include/linux/string.h: In function 'strlcpy':
   include/linux/string.h:255: warning: '__f' is static but declared in 
inline function 'strlcpy' which is not static
   include/linux/string.h:258: warning: '__f' is static but declared in 
inline function 'strlcpy' which is not static
   include/linux/string.h:260: warning: '__f' is static but declared in 
inline function 'strlcpy' which is not static
   include/linux/string.h:262: warning: '__f' is static but declared in 
inline function 'strlcpy' which is not static
   include/linux/string.h: In function 'strncat':
   include/linux/string.h:276: warning: '__f' is static but declared in 
inline function 'strncat' which is not static
   include/linux/string.h:280: warning: '__f' is static but declared in 
inline function 'strncat' which is not static
   include/linux/string.h: In function 'memset':
   include/linux/string.h:290: warning: '__f' is static but declared in 
inline function 'memset' which is not static
   include/linux/string.h:292: warning: '__f' is static but declared in 
inline function 'memset' which is not static
   include/linux/string.h: In function 'memcpy':
   include/linux/string.h:301: warning: '__f' is static but declared in 
inline function 'memcpy' which is not static
   include/linux/string.h:302: warning: '__f' is static but declared in 
inline function 'memcpy' which is not static
   include/linux/string.h:304: warning: '__f' is static but declared in 
inline function 'memcpy' which is not static
   include/linux/string.h:307: warning: '__f' is static but declared in 
inline function 'memcpy' which is not static
   include/linux/string.h: In function 'memmove':
   include/linux/string.h:316: warning: '__f' is static but declared in 
inline function 'memmove' which is not static
   include/linux/string.h:317: warning: '__f' is static but declared in 
inline function 'memmove' which is not static
   include/linux/string.h:319: warning: '__f' is static but declared in 
inline function 'memmove' which is not static
   include/linux/string.h:322: warning: '__f' is static but declared in 
inline function 'memmove' which is not static
   include/linux/string.h: In function 'memscan':
   include/linux/string.h:331: warning: '__f' is static but declared in 
inline function 'memscan' which is not static
   include/linux/string.h:333: warning: '__f' is static but declared in 
inline function 'memscan' which is not static
   include/linux/string.h: In function 'memcmp':
   include/linux/string.h:342: warning: '__f' is static but declared in 
inline function 'memcmp' which is not static
   include/linux/string.h:343: warning: '__f' is static but declared in 
inline function 'memcmp' which is not static
   include/linux/string.h:345: warning: '__f' is static but declared in 
inline function 'memcmp' which is not static
   include/linux/string.h:348: warning: '__f' is static but declared in 
inline function 'memcmp' which is not static
   include/linux/string.h: In function 

[v4 PATCH 1/2] powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle

2017-08-06 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

The stop4 idle state on POWER9 is a deep idle state which loses
hypervisor resources, but whose latency is low enough that it can be
exposed via cpuidle.

Until now, the deep idle states which lose hypervisor resources (e.g.
winkle) were only exposed via CPU-Hotplug. Hence, on wakeup from such
states, only a few SPRs are restored to their previous values, and the
rest of the SPRs are reinitialized to their boot-time values.

When stop4 is used in the context of cpuidle, we want these additional
SPRs to be restored to their previous values as well, to ensure that
the context on the CPU coming back from idle is the same as it was
before going idle.

In this patch, we define an SPR save area in the PACA (since we have
used up the volatile register space in the stack) and on POWER9, we
restore SPRN_PID, SPRN_LDBAR, SPRN_FSCR, SPRN_HFSCR, SPRN_MMCRA,
SPRN_MMCR1, SPRN_MMCR2 to the values they had before entering stop.

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/include/asm/cpuidle.h | 11 +++
 arch/powerpc/include/asm/paca.h    |  7 
 arch/powerpc/kernel/asm-offsets.c  |  8 +
 arch/powerpc/kernel/idle_book3s.S  | 65 --
 4 files changed, 89 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h 
b/arch/powerpc/include/asm/cpuidle.h
index 52586f9..8a174cb 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -67,6 +67,17 @@
 #define ERR_DEEP_STATE_ESL_MISMATCH -2
 
 #ifndef __ASSEMBLY__
+/* Additional SPRs that need to be saved/restored during stop */
+struct stop_sprs {
+   u64 pid;
+   u64 ldbar;
+   u64 fscr;
+   u64 hfscr;
+   u64 mmcr1;
+   u64 mmcr2;
+   u64 mmcra;
+};
+
 extern u32 pnv_fastsleep_workaround_at_entry[];
 extern u32 pnv_fastsleep_workaround_at_exit[];
 
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index dc88a31..04b60af 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -31,6 +31,7 @@
 #endif
 #include 
 #include 
+#include 
 
 register struct paca_struct *local_paca asm("r13");
 
@@ -183,6 +184,12 @@ struct paca_struct {
struct paca_struct **thread_sibling_pacas;
/* The PSSCR value that the kernel requested before going to stop */
u64 requested_psscr;
+
+   /*
+* Save area for additional SPRs that need to be
+* saved/restored during cpuidle stop.
+*/
+   struct stop_sprs stop_sprs;
 #endif
 
 #ifdef CONFIG_PPC_STD_MMU_64
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 6e95c2c..8cfb20e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -746,6 +746,14 @@ int main(void)
OFFSET(PACA_SUBCORE_SIBLING_MASK, paca_struct, subcore_sibling_mask);
OFFSET(PACA_SIBLING_PACA_PTRS, paca_struct, thread_sibling_pacas);
OFFSET(PACA_REQ_PSSCR, paca_struct, requested_psscr);
+#define STOP_SPR(x, f) OFFSET(x, paca_struct, stop_sprs.f)
+   STOP_SPR(STOP_PID, pid);
+   STOP_SPR(STOP_LDBAR, ldbar);
+   STOP_SPR(STOP_FSCR, fscr);
+   STOP_SPR(STOP_HFSCR, hfscr);
+   STOP_SPR(STOP_MMCR1, mmcr1);
+   STOP_SPR(STOP_MMCR2, mmcr2);
+   STOP_SPR(STOP_MMCRA, mmcra);
 #endif
 
DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
diff --git a/arch/powerpc/kernel/idle_book3s.S 
b/arch/powerpc/kernel/idle_book3s.S
index 516ebef..4621568 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -85,7 +85,61 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
std r3,_WORT(r1)
mfspr   r3,SPRN_WORC
std r3,_WORC(r1)
+/*
+ * On POWER9, there are idle states such as stop4, invoked via cpuidle,
+ * that lose hypervisor resources. In such cases, we need to save
+ * additional SPRs before entering those idle states so that they can
+ * be restored to their older values on wakeup from the idle state.
+ *
+ * On POWER8, the only such deep idle state is winkle which is used
+ * only in the context of CPU-Hotplug, where these additional SPRs are
+ * reinitialized to a sane value. Hence there is no need to save/restore
+ * these SPRs.
+ */
+BEGIN_FTR_SECTION
+   blr
+END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
+
+power9_save_additional_sprs:
+   mfspr   r3, SPRN_PID
+   mfspr   r4, SPRN_LDBAR
+   std r3, STOP_PID(r13)
+   std r4, STOP_LDBAR(r13)
 
+   mfspr   r3, SPRN_FSCR
+   mfspr   r4, SPRN_HFSCR
+   std r3, STOP_FSCR(r13)
+   std r4, STOP_HFSCR(r13)
+
+   mfspr   r3, SPRN_MMCRA
+   mfspr   r4, SPRN_MMCR1
+   std r3, STOP_MMCRA(r13)
+   std r4, STOP_MMCR1(r13)
+
+   mfspr   r3, SPRN_MMCR2
+   std r3, STOP_MMCR2(r13)
+   blr
+
+power9_restore_additional_sprs:
+   ld  r3,_LPCR(r1)
+   ld  r4, STOP_PID(r13)
+   mtspr   SPRN_LPCR,r3
+  
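
The save-area scheme above can be illustrated in plain C. This is a
user-space sketch under stated assumptions, not the kernel
implementation: struct mock_paca and struct mock_cpu are hypothetical,
heavily reduced stand-ins for the PACA and for the live SPRs, and
STOP_OFF() only mirrors the idea of the STOP_SPR() offsets, i.e. a
byte offset from the PACA base that the assembly uses as
STOP_xxx(r13).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as struct stop_sprs in the patch. */
struct stop_sprs {
	uint64_t pid;
	uint64_t ldbar;
	uint64_t fscr;
	uint64_t hfscr;
	uint64_t mmcr1;
	uint64_t mmcr2;
	uint64_t mmcra;
};

/* Hypothetical, heavily reduced stand-in for struct paca_struct. */
struct mock_paca {
	uint64_t requested_psscr;
	struct stop_sprs stop_sprs;
};

/* Byte offset of a save slot from the PACA base, like STOP_PID etc. */
#define STOP_OFF(f) offsetof(struct mock_paca, stop_sprs.f)

/* Hypothetical model of the live SPRs of one CPU. */
struct mock_cpu {
	uint64_t pid, ldbar, fscr, hfscr, mmcra, mmcr1, mmcr2;
};

/* Models "std rX, OFFSET(r13)": store 64 bits at base + offset. */
static void store64(void *base, size_t off, uint64_t val)
{
	memcpy((char *)base + off, &val, sizeof(val));
}

/* Models "ld rX, OFFSET(r13)": load 64 bits from base + offset. */
static uint64_t load64(const void *base, size_t off)
{
	uint64_t val;

	memcpy(&val, (const char *)base + off, sizeof(val));
	return val;
}

/* Copy the additional SPRs into the PACA save area before idling. */
static void mock_save_additional_sprs(const struct mock_cpu *cpu,
				      struct mock_paca *paca)
{
	assert(cpu != NULL && paca != NULL);
	store64(paca, STOP_OFF(pid), cpu->pid);
	store64(paca, STOP_OFF(ldbar), cpu->ldbar);
	store64(paca, STOP_OFF(fscr), cpu->fscr);
	store64(paca, STOP_OFF(hfscr), cpu->hfscr);
	store64(paca, STOP_OFF(mmcra), cpu->mmcra);
	store64(paca, STOP_OFF(mmcr1), cpu->mmcr1);
	store64(paca, STOP_OFF(mmcr2), cpu->mmcr2);
}

/* Copy the saved values back into the SPRs on wakeup. */
static void mock_restore_additional_sprs(struct mock_cpu *cpu,
					 const struct mock_paca *paca)
{
	assert(cpu != NULL && paca != NULL);
	cpu->pid = load64(paca, STOP_OFF(pid));
	cpu->ldbar = load64(paca, STOP_OFF(ldbar));
	cpu->fscr = load64(paca, STOP_OFF(fscr));
	cpu->hfscr = load64(paca, STOP_OFF(hfscr));
	cpu->mmcra = load64(paca, STOP_OFF(mmcra));
	cpu->mmcr1 = load64(paca, STOP_OFF(mmcr1));
	cpu->mmcr2 = load64(paca, STOP_OFF(mmcr2));
}
```

Using a structure with named fields, rather than an array indexed by
position, keeps each save slot tied to a register name, so the order
in which the assembly saves the registers does not have to match the
field order.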

[v4 PATCH 0/2] powerpc/powernv: Enable stop4 via cpuidle

2017-08-06 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Hi,

This is the fourth iteration of the patchset to enable exploitation of
stop4 idle state on POWER9 via cpuidle.

The earlier versions can be found here:
[v3]: https://lkml.org/lkml/2017/7/21/209
[v2]: https://lkml.org/lkml/2017/7/19/152
[v1]: https://lkml.org/lkml/2017/7/18/691

The changes across the versions are as follows:
v3-->v4:
- Modified the subject line to be consistent with the convention. No changes to 
code.

v2-->v3:
- Use a structure instead of an array for the stop sprs save area.
- Name the offsets into the paca->stop_sprs as STOP_XXX instead of PACA_XXX.
- Add comments in the assembly code explaining why saving/restoring
  is not needed on POWER8.
- Program the LPCR during platform idle entry/exit on both POWER8 and POWER9
  as suggested by Nicholas Piggin.

v1 --> v2:
- Move the LPCR manipulations for CPU-Hotplug into
arch/powerpc/platforms/powernv/idle.c as per Nicholas Piggin's
suggestion.

=== Description ===
The stop4 idle state on POWER9 is a deep idle state which loses
hypervisor resources, but whose latency is low enough that it can be
exposed via cpuidle.

Until now, the deep idle states which lose hypervisor resources (e.g.
winkle) were only exposed via CPU-Hotplug. Hence, on wakeup from such
states, only a few SPRs are restored to their previous values, and the
rest of the SPRs are reinitialized to their boot-time values. When
stop4 is used in the context of cpuidle, we want these additional SPRs
to be restored to their previous values as well, to ensure that the
context on the CPU coming back from idle is the same as it was before
going idle.

Additionally, a CPU that is in stop4 while idling must remain wakeable
by decrementer interrupts. So we need to ensure that the LPCR is
programmed with the PECE1 bit cleared via the stop-api only for the
CPU-Hotplug case and not for cpuidle.

The two patches in the series address these problems.

Gautham R. Shenoy (2):
  powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle
  powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug

 arch/powerpc/include/asm/cpuidle.h| 11 ++
 arch/powerpc/include/asm/paca.h   |  7 
 arch/powerpc/kernel/asm-offsets.c |  8 +
 arch/powerpc/kernel/idle_book3s.S | 65 +--
 arch/powerpc/platforms/powernv/idle.c | 34 +-
 arch/powerpc/platforms/powernv/smp.c  |  8 -
 6 files changed, 122 insertions(+), 11 deletions(-)

-- 
1.9.4



[v4 PATCH 2/2] powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug

2017-08-06 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Currently we use the stop-api provided by the firmware to program the
SLW engine to restore the values of hypervisor resources that get lost
on deeper idle states (such as winkle). Since the deep states were
only used for CPU-Hotplug on POWER8 systems, we would program the LPCR
to have the PECE1 bit cleared, since hotplugged CPUs shouldn't be
spuriously woken up by the decrementer.

On POWER9, some of the deep platform idle states such as stop4 can be
used in cpuidle as well. In this case, we want the CPU in stop4 to be
woken up by the decrementer when some timer on the CPU expires.

In this patch, we program the stop-api for LPCR with PECE1
bit cleared only when we are offlining the CPU and set it
back once the CPU is online.

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/platforms/powernv/idle.c | 34 +-
 arch/powerpc/platforms/powernv/smp.c  |  8 
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c 
b/arch/powerpc/platforms/powernv/idle.c
index 2abee07..a1296e7 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -68,7 +68,7 @@ static int pnv_save_sprs_for_deep_states(void)
 * all cpus at boot. Get these reg values of current cpu and use the
 * same across all cpus.
 */
-   uint64_t lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1;
+   uint64_t lpcr_val = mfspr(SPRN_LPCR);
uint64_t hid0_val = mfspr(SPRN_HID0);
uint64_t hid1_val = mfspr(SPRN_HID1);
uint64_t hid4_val = mfspr(SPRN_HID4);
@@ -355,6 +355,14 @@ void power9_idle(void)
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
+static void pnv_program_cpu_hotplug_lpcr(unsigned int cpu, u64 lpcr_val)
+{
+   u64 pir = get_hard_smp_processor_id(cpu);
+
+   mtspr(SPRN_LPCR, lpcr_val);
+   opal_slw_set_reg(pir, SPRN_LPCR, lpcr_val);
+}
+
 /*
  * pnv_cpu_offline: A function that puts the CPU into the deepest
  * available platform idle state on a CPU-Offline.
@@ -364,6 +372,20 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
 {
unsigned long srr1;
u32 idle_states = pnv_get_supported_cpuidle_states();
+   u64 lpcr_val;
+
+   /*
+* We don't want to take decrementer interrupts while we are
+* offline, so clear LPCR:PECE1. We keep PECE2 (and
+* LPCR_PECE_HVEE on P9) enabled as to let IPIs in.
+*
+* If the CPU gets woken up by a special wakeup, ensure that
+* the SLW engine sets LPCR with decrementer bit cleared, else
+* the CPU will come back to the kernel due to a spurious
+* wakeup.
+*/
+   lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1;
+   pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
 
__ppc64_runlatch_off();
 
@@ -394,6 +416,16 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
 
__ppc64_runlatch_on();
 
+   /*
+* Re-enable decrementer interrupts in LPCR.
+*
+* Further, we want stop states to be woken up by decrementer
+* for non-hotplug cases. So program the LPCR via stop api as
+* well.
+*/
+   lpcr_val = mfspr(SPRN_LPCR) | (u64)LPCR_PECE1;
+   pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
+
return srr1;
 }
 #endif
diff --git a/arch/powerpc/platforms/powernv/smp.c 
b/arch/powerpc/platforms/powernv/smp.c
index 40dae96..536b07b 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -164,12 +164,6 @@ static void pnv_smp_cpu_kill_self(void)
if (cpu_has_feature(CPU_FTR_ARCH_207S))
wmask = SRR1_WAKEMASK_P8;
 
-   /* We don't want to take decrementer interrupts while we are offline,
-* so clear LPCR:PECE1. We keep PECE2 (and LPCR_PECE_HVEE on P9)
-* enabled as to let IPIs in.
-*/
-   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1);
-
while (!generic_check_cpu_restart(cpu)) {
/*
 * Clear IPI flag, since we don't handle IPIs while
@@ -219,8 +213,6 @@ static void pnv_smp_cpu_kill_self(void)
 
}
 
-   /* Re-enable decrementer interrupts */
-   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_PECE1);
DBG("CPU%d coming online...\n", cpu);
 }
 
-- 
1.9.4



[v4 PATCH 0/2] powerpc/powernv: Enable stop4 via cpuidle

2017-08-06 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Hi,

This is the fourth iteration of the patchset to enable exploitation of
stop4 idle state on POWER9 via cpuidle.

The earlier version can be found here :
[v3]: https://lkml.org/lkml/2017/7/21/209
[v2]: https://lkml.org/lkml/2017/7/19/152
[v1]: https://lkml.org/lkml/2017/7/18/691

The changes across the versions are as follows:
v3-->v4:
- Modified the subject line to be consistent with the convention. No changes to 
code.

v2-->v3:
- Use a structure instead of an array for the stop sprs save area.
- Name the offsets into the paca->stop_sprs as STOP_XXX instead of PACA_XXX.
- Add comments in the assembly code explaining why saving/restoring
  is not needed on POWER8.
- Program the LPCR during platform idle entry/exit on both POWER8 and POWER9
  as suggested by Nicholas Piggin.

v1 --> v2:
- Move the LPCR manipulations for CPU-Hotplug into
arch/powerpc/platforms/powernv/idle.c as per Nicholas Piggin's
suggestion.

== Description ===
The stop4 idle state on POWER9 is a deep idle state which loses
hypervisor resources, but whose latency is low enough that it can be
exposed via cpuidle.

Until now, the deep idle states which lose hypervisor resources (eg:
winkle) were only exposed via CPU-Hotplug.  Hence currently on wakeup
from such states, barring a few SPRs which need to be restored to
their older value, rest of the SPRS are reinitialized to their values
corresponding to that at boot time. When stop4 is used in the context
of cpuidle, we want these additional SPRs to be restored to their
older value, to ensure that the context on the CPU coming back from
idle is same as it was before going idle.

Additionally, the CPU which is in stop4 while idling can be woken up
by the decrementer interrupts. So we need to ensure that the LPCR is
programmed with PECE1 bit cleared via the stop-api only for the
CPU-Hotplug case and not for cpuidle.

The two patches in the series address this problem.

Gautham R. Shenoy (2):
  powernv/powerpc:Save/Restore additional SPRs for stop4 cpuidle
  powernv/powerpc: Clear PECE1 in LPCR via stop-api only on Hotplug


Gautham R. Shenoy (2):
  powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle
  powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug

 arch/powerpc/include/asm/cpuidle.h| 11 ++
 arch/powerpc/include/asm/paca.h   |  7 
 arch/powerpc/kernel/asm-offsets.c |  8 +
 arch/powerpc/kernel/idle_book3s.S | 65 +--
 arch/powerpc/platforms/powernv/idle.c | 34 +-
 arch/powerpc/platforms/powernv/smp.c  |  8 -
 6 files changed, 122 insertions(+), 11 deletions(-)

-- 
1.9.4



[v4 PATCH 2/2] powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug

2017-08-06 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Currently we use the stop-api provided by the firmware to program the
SLW engine to restore the values of hypervisor resources that get lost
on deeper idle states (such as winkle). Since the deep states were
only used for CPU-Hotplug on POWER8 systems, we would program the LPCR
to have the PECE1 bit since Hotplugged CPUs shouldn't be spuriously
woken up by decrementer.

On POWER9, some of the deep platform idle states such as stop4 can be
used in cpuidle as well. In this case, we want the CPU in stop4 to be
woken up by the decrementer when some timer on the CPU expires.

In this patch, we program the stop-api for LPCR with PECE1
bit cleared only when we are offlining the CPU and set it
back once the CPU is online.

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/platforms/powernv/idle.c | 34 +-
 arch/powerpc/platforms/powernv/smp.c  |  8 
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c 
b/arch/powerpc/platforms/powernv/idle.c
index 2abee07..a1296e7 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -68,7 +68,7 @@ static int pnv_save_sprs_for_deep_states(void)
 * all cpus at boot. Get these reg values of current cpu and use the
 * same across all cpus.
 */
-   uint64_t lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1;
+   uint64_t lpcr_val = mfspr(SPRN_LPCR);
uint64_t hid0_val = mfspr(SPRN_HID0);
uint64_t hid1_val = mfspr(SPRN_HID1);
uint64_t hid4_val = mfspr(SPRN_HID4);
@@ -355,6 +355,14 @@ void power9_idle(void)
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
+static void pnv_program_cpu_hotplug_lpcr(unsigned int cpu, u64 lpcr_val)
+{
+   u64 pir = get_hard_smp_processor_id(cpu);
+
+   mtspr(SPRN_LPCR, lpcr_val);
+   opal_slw_set_reg(pir, SPRN_LPCR, lpcr_val);
+}
+
 /*
  * pnv_cpu_offline: A function that puts the CPU into the deepest
  * available platform idle state on a CPU-Offline.
@@ -364,6 +372,20 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
 {
unsigned long srr1;
u32 idle_states = pnv_get_supported_cpuidle_states();
+   u64 lpcr_val;
+
+   /*
+* We don't want to take decrementer interrupts while we are
+* offline, so clear LPCR:PECE1. We keep PECE2 (and
+* LPCR_PECE_HVEE on P9) enabled as to let IPIs in.
+*
+* If the CPU gets woken up by a special wakeup, ensure that
+* the SLW engine sets LPCR with decrementer bit cleared, else
+* the CPU will come back to the kernel due to a spurious
+* wakeup.
+*/
+   lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1;
+   pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
 
__ppc64_runlatch_off();
 
@@ -394,6 +416,16 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
 
__ppc64_runlatch_on();
 
+   /*
+* Re-enable decrementer interrupts in LPCR.
+*
+* Further, we want stop states to be woken up by decrementer
+* for non-hotplug cases. So program the LPCR via stop api as
+* well.
+*/
+   lpcr_val = mfspr(SPRN_LPCR) | (u64)LPCR_PECE1;
+   pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
+
return srr1;
 }
 #endif
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 40dae96..536b07b 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -164,12 +164,6 @@ static void pnv_smp_cpu_kill_self(void)
if (cpu_has_feature(CPU_FTR_ARCH_207S))
wmask = SRR1_WAKEMASK_P8;
 
-   /* We don't want to take decrementer interrupts while we are offline,
-* so clear LPCR:PECE1. We keep PECE2 (and LPCR_PECE_HVEE on P9)
-* enabled as to let IPIs in.
-*/
-   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1);
-
while (!generic_check_cpu_restart(cpu)) {
/*
 * Clear IPI flag, since we don't handle IPIs while
@@ -219,8 +213,6 @@ static void pnv_smp_cpu_kill_self(void)
 
}
 
-   /* Re-enable decrementer interrupts */
-   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_PECE1);
DBG("CPU%d coming online...\n", cpu);
 }
 
-- 
1.9.4



[PATCH -mm -v4 5/5] mm, swap: Don't use VMA based swap readahead if HDD is used as swap

2017-08-06 Thread Huang, Ying
From: Huang Ying 

VMA based swap readahead reads ahead the virtual pages that are
contiguous in the virtual address space, while the original swap
readahead reads ahead the swap slots that are contiguous on the swap
device.  Although VMA based swap readahead chooses the swap slots to
read ahead more accurately, it triggers more small random reads, which
may degrade the performance of an HDD (hard disk) heavily and may
outweigh the benefit.

To avoid the issue, in this patch, if an HDD is used as swap, VMA
based swap readahead is disabled and the original swap readahead is
used instead.
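
The rule above can be modelled as a counter of active rotational swap
devices plus a predicate that consults it. This is a minimal userspace
sketch using C11 atomics, not the kernel implementation; every demo_*
name is hypothetical, and the "rotational" flag stands for a device the
patch counts (no block device, or a rotational queue).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Number of active swap devices that count as rotational. */
static atomic_int demo_nr_rotate_swap;

/* Models the swapon path: count the device if it is rotational. */
static void demo_swapon(bool rotational)
{
	if (rotational)
		atomic_fetch_add(&demo_nr_rotate_swap, 1);
}

/* Models the swapoff path: undo the accounting done at swapon time. */
static void demo_swapoff(bool rotational)
{
	if (rotational)
		atomic_fetch_sub(&demo_nr_rotate_swap, 1);
}

/* VMA based readahead is used only if enabled and no rotational swap exists. */
static bool demo_use_vma_readahead(bool vma_ra_enabled)
{
	return vma_ra_enabled && atomic_load(&demo_nr_rotate_swap) == 0;
}
```

A single rotational device is enough to switch every fault back to the
original readahead, which keeps the check cheap on the fault path.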

Signed-off-by: "Huang, Ying" 
Cc: Johannes Weiner 
Cc: Minchan Kim 
Cc: Rik van Riel 
Cc: Shaohua Li 
Cc: Hugh Dickins 
Cc: Fengguang Wu 
Cc: Tim Chen 
Cc: Dave Hansen 
---
 include/linux/swap.h | 11 ++-
 mm/swapfile.c|  8 +++-
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 61d63379e956..9c4ae6f14eea 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -400,16 +400,17 @@ extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
   struct vm_fault *vmf,
   struct vma_swap_readahead *swap_ra);
 
-static inline bool swap_use_vma_readahead(void)
-{
-   return READ_ONCE(swap_vma_readahead);
-}
-
 /* linux/mm/swapfile.c */
 extern atomic_long_t nr_swap_pages;
 extern long total_swap_pages;
+extern atomic_t nr_rotate_swap;
 extern bool has_usable_swap(void);
 
+static inline bool swap_use_vma_readahead(void)
+{
+   return READ_ONCE(swap_vma_readahead) && !atomic_read(&nr_rotate_swap);
+}
+
 /* Swap 50% full? Release swapcache more aggressively.. */
 static inline bool vm_swap_full(void)
 {
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 42eff9e4e972..4f8b3e08a547 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -96,6 +96,8 @@ static DECLARE_WAIT_QUEUE_HEAD(proc_poll_wait);
 /* Activity counter to indicate that a swapon or swapoff has occurred */
 static atomic_t proc_poll_event = ATOMIC_INIT(0);
 
+atomic_t nr_rotate_swap = ATOMIC_INIT(0);
+
 static inline unsigned char swap_count(unsigned char ent)
 {
return ent & ~SWAP_HAS_CACHE;   /* may include SWAP_HAS_CONT flag */
@@ -2569,6 +2571,9 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
if (p->flags & SWP_CONTINUED)
free_swap_count_continuations(p);
 
+   if (!p->bdev || !blk_queue_nonrot(bdev_get_queue(p->bdev)))
+   atomic_dec(&nr_rotate_swap);
+
mutex_lock(&swapon_mutex);
spin_lock(&swap_lock);
spin_lock(&p->lock);
@@ -3145,7 +3150,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
cluster = per_cpu_ptr(p->percpu_cluster, cpu);
cluster_set_null(&cluster->index);
}
-   }
+   } else
+   atomic_inc(&nr_rotate_swap);
 
error = swap_cgroup_swapon(p->type, maxpages);
if (error)
-- 
2.11.0



[PATCH -mm -v4 1/5] mm, swap: Add swap readahead hit statistics

2017-08-06 Thread Huang, Ying
From: Huang Ying 

The statistics for total readahead pages and total readahead hits are
recorded and exported via the following sysfs interface.

/sys/kernel/mm/swap/ra_hits
/sys/kernel/mm/swap/ra_total

With them, the efficiency of swap readahead can be measured, so that
the swap readahead algorithm and its parameters can be tuned
accordingly.
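
One way to turn the two counters into an efficiency figure is a simple
ratio. The helper below is a userspace sketch with a hypothetical name;
it reports the hit rate in hundredths of a percent so that it needs no
floating point, and it assumes the counters are small enough that
ra_hits * 10000 does not overflow 64 bits.

```c
#include <stdint.h>

/*
 * Readahead hit rate in hundredths of a percent (5269 means 52.69%).
 * Returns 0 when no pages have been read ahead yet.
 */
static uint64_t demo_ra_hit_rate(uint64_t ra_hits, uint64_t ra_total)
{
	if (ra_total == 0)
		return 0;
	return ra_hits * 10000 / ra_total;
}
```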

Signed-off-by: "Huang, Ying" 
Cc: Johannes Weiner 
Cc: Minchan Kim 
Cc: Rik van Riel 
Cc: Shaohua Li 
Cc: Hugh Dickins 
Cc: Fengguang Wu 
Cc: Tim Chen 
Cc: Dave Hansen 
---
 include/linux/vm_event_item.h | 2 ++
 mm/swap_state.c   | 9 +++--
 mm/vmstat.c   | 3 +++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index e02820fc2861..27e3339cfd65 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -106,6 +106,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
VMACACHE_FIND_HITS,
VMACACHE_FULL_FLUSHES,
 #endif
+   SWAP_RA,
+   SWAP_RA_HIT,
NR_VM_EVENT_ITEMS
 };
 
diff --git a/mm/swap_state.c b/mm/swap_state.c
index b68c93014f50..d1bdb31cab13 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -305,8 +305,10 @@ struct page * lookup_swap_cache(swp_entry_t entry)
 
if (page && likely(!PageTransCompound(page))) {
INC_CACHE_INFO(find_success);
-   if (TestClearPageReadahead(page))
+   if (TestClearPageReadahead(page)) {
atomic_inc(&swapin_readahead_hits);
+   count_vm_event(SWAP_RA_HIT);
+   }
}
 
INC_CACHE_INFO(find_total);
@@ -516,8 +518,11 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
gfp_mask, vma, addr, false);
if (!page)
continue;
-   if (offset != entry_offset && likely(!PageTransCompound(page)))
+   if (offset != entry_offset &&
+   likely(!PageTransCompound(page))) {
SetPageReadahead(page);
+   count_vm_event(SWAP_RA);
+   }
put_page(page);
}
blk_finish_plug(&plug);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index ba9b202e8500..4c2121a8b877 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1095,6 +1095,9 @@ const char * const vmstat_text[] = {
"vmacache_find_hits",
"vmacache_full_flushes",
 #endif
+
+   "swap_ra",
+   "swap_ra_hit",
 #endif /* CONFIG_VM_EVENTS_COUNTERS */
 };
 #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA */
-- 
2.11.0



[PATCH -mm -v4 4/5] mm, swap: Add sysfs interface for VMA based swap readahead

2017-08-06 Thread Huang, Ying
From: Huang Ying 

The sysfs interface to control VMA based swap readahead is added as
follows:

/sys/kernel/mm/swap/vma_ra_enabled

Enable the VMA based swap readahead algorithm, or use the original
global swap readahead algorithm.

/sys/kernel/mm/swap/vma_ra_max_order

Set the max order of the readahead window size for the VMA based swap
readahead algorithm.

The corresponding ABI documentation is added too.
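
The input handling for the two files can be sketched in userspace C as
follows. This is an illustration only: the demo_* names are hypothetical,
strtol() stands in for the kernel's kstrtoint(), and the ceiling is passed
in as a parameter rather than taken from a kernel constant. The prefix
matching on "true"/"1" and "false"/"0" mirrors the strncmp() calls in the
patch, so input such as "1abc" is accepted as true.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Parse an enable/disable setting using prefix matching, as the patch does. */
static int demo_parse_enabled(const char *buf, bool *enabled)
{
	if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1))
		*enabled = true;
	else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1))
		*enabled = false;
	else
		return -EINVAL;
	return 0;
}

/* Parse a max order; only values in 1..ceiling are accepted. */
static int demo_parse_max_order(const char *buf, int ceiling, int *order)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(buf, &end, 10);
	/* Reject empty input, trailing junk (a final newline is fine), overflow. */
	if (errno || end == buf || (*end != '\0' && *end != '\n'))
		return -EINVAL;
	if (v > ceiling || v <= 0)
		return -EINVAL;
	*order = (int)v;
	return 0;
}
```

With a ceiling of 5, the largest accepted order corresponds to a window of
1 << 5 = 32 pages.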

Signed-off-by: "Huang, Ying" 
Cc: Johannes Weiner 
Cc: Minchan Kim 
Cc: Rik van Riel 
Cc: Shaohua Li 
Cc: Hugh Dickins 
Cc: Fengguang Wu 
Cc: Tim Chen 
Cc: Dave Hansen 
---
 Documentation/ABI/testing/sysfs-kernel-mm-swap | 26 +
 mm/swap_state.c| 80 ++
 2 files changed, 106 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-swap

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-swap b/Documentation/ABI/testing/sysfs-kernel-mm-swap
new file mode 100644
index ..587db52084c7
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-swap
@@ -0,0 +1,26 @@
+What:  /sys/kernel/mm/swap/
+Date:  August 2017
+Contact:   Linux memory management mailing list 
+Description:   Interface for swapping
+
+What:  /sys/kernel/mm/swap/vma_ra_enabled
+Date:  August 2017
+Contact:   Linux memory management mailing list 
+Description:   Enable/disable VMA based swap readahead.
+
+   If set to true, the VMA based swap readahead algorithm
+   will be used for swappable anonymous pages mapped in a
+   VMA, and the global swap readahead algorithm will be
+   still used for tmpfs etc. other users.  If set to
+   false, the global swap readahead algorithm will be
+   used for all swappable pages.
+
+What:  /sys/kernel/mm/swap/vma_ra_max_order
+Date:  August 2017
+Contact:   Linux memory management mailing list 
+Description:   The max readahead size in order for VMA based swap readahead
+
+   VMA based swap readahead algorithm will readahead at
+   most 1 << max_order pages for each readahead.  The
+   real readahead size for each readahead will be scaled
+   according to the estimation algorithm.
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3885fef7bdf5..71ce2d1ccbf7 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -751,3 +751,83 @@ struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
return read_swap_cache_async(fentry, gfp_mask, vma, vmf->address,
 swap_ra->win == 1);
 }
+
+#ifdef CONFIG_SYSFS
+static ssize_t vma_ra_enabled_show(struct kobject *kobj,
+struct kobj_attribute *attr, char *buf)
+{
+   return sprintf(buf, "%s\n", swap_vma_readahead ? "true" : "false");
+}
+static ssize_t vma_ra_enabled_store(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+   if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1))
+   swap_vma_readahead = true;
+   else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1))
+   swap_vma_readahead = false;
+   else
+   return -EINVAL;
+
+   return count;
+}
+static struct kobj_attribute vma_ra_enabled_attr =
+   __ATTR(vma_ra_enabled, 0644, vma_ra_enabled_show,
+  vma_ra_enabled_store);
+
+static ssize_t vma_ra_max_order_show(struct kobject *kobj,
+struct kobj_attribute *attr, char *buf)
+{
+   return sprintf(buf, "%d\n", swap_ra_max_order);
+}
+static ssize_t vma_ra_max_order_store(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+   int err, v;
+
+   err = kstrtoint(buf, 10, &v);
+   if (err || v > SWAP_RA_ORDER_CEILING || v <= 0)
+   return -EINVAL;
+
+   swap_ra_max_order = v;
+
+   return count;
+}
+static struct kobj_attribute vma_ra_max_order_attr =
+   __ATTR(vma_ra_max_order, 0644, vma_ra_max_order_show,
+  vma_ra_max_order_store);
+
+static struct attribute *swap_attrs[] = {
+   &vma_ra_enabled_attr.attr,
+   &vma_ra_max_order_attr.attr,
+   NULL,
+};
+
+static struct attribute_group swap_attr_group = {
+   .attrs = swap_attrs,
+};
+
+static int __init swap_init_sysfs(void)
+{
+   int err;
+   struct kobject *swap_kobj;
+
+   swap_kobj = kobject_create_and_add("swap", mm_kobj);
+   if (!swap_kobj) {
+   pr_err("failed to create swap kobject\n");
+   return -ENOMEM;
+   }
+   err = sysfs_create_group(swap_kobj, &swap_attr_group);
+   if (err) {
+   pr_err("failed to register swap group\n");
+   goto delete_obj;
+   }
+   return 

[PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

2017-08-06 Thread Huang, Ying
From: Huang Ying 

Swap readahead is an important mechanism to reduce swap-in latency.
Although a purely sequential memory access pattern isn't very common
for anonymous memory, spatial locality is still considered valid.

In the original swap readahead implementation, consecutive blocks on
the swap device are read ahead based on a global spatial locality
estimation.  But consecutive blocks on the swap device just reflect
the order of page reclaiming; they don't necessarily reflect the
access pattern in virtual memory.  And different tasks in the system
may have different access patterns, which makes the global spatial
locality estimation incorrect.

In this patch, when a page fault occurs, the virtual pages near the
fault address are read ahead instead of the swap slots near the
faulting swap slot on the swap device.  This avoids reading ahead
unrelated swap slots.  At the same time, swap readahead is changed to
work per-VMA instead of globally, so that the different access
patterns of different VMAs can be distinguished and a different
readahead policy can be applied to each.  The original core readahead
detection and scaling algorithm is reused, because it is an effective
algorithm for detecting spatial locality.
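
To make "the virtual pages near the fault address" concrete, the sketch
below picks a window of virtual page numbers around a faulting page and
clamps it to the VMA. It is only an illustration under stated
assumptions: the aligned-window policy, the structure, and all demo_*
names are hypothetical and are not taken from the patch, whose actual
window selection is not shown here. The caller is assumed to pass a
fault page that lies inside the VMA.

```c
/* A half-open range [start, end) of virtual page numbers. */
struct demo_ra_window {
	unsigned long start;
	unsigned long end;
};

/*
 * Choose the naturally aligned block of (1 << order) pages that contains
 * the faulting page, then clamp it to the VMA's page range.
 */
static struct demo_ra_window demo_vma_ra_window(unsigned long fault_pn,
						unsigned long vma_start_pn,
						unsigned long vma_end_pn,
						unsigned int order)
{
	unsigned long nr = 1UL << order;
	struct demo_ra_window w;

	w.start = fault_pn & ~(nr - 1);
	w.end = w.start + nr;
	if (w.start < vma_start_pn)
		w.start = vma_start_pn;
	if (w.end > vma_end_pn)
		w.end = vma_end_pn;
	return w;
}
```

Clamping to the VMA is what keeps pages from a neighbouring mapping, which
may have a completely different access pattern, out of the window.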

The tests and results are as follows.

Common test condition
=====================

Test Machine: Xeon E5 v3 (2 sockets, 72 threads, 32G RAM)
Swap device: NVMe disk

Micro-benchmark with combined access pattern
============================================

vm-scalability, sequential swap test case: 4 processes eat 50G of
virtual memory space and repeat sequential memory writes for 300
seconds.  The first round of writes triggers swap-out; the following
rounds trigger sequential swap-in and swap-out.

At the same time, the vm-scalability random swap test case runs in
the background: 8 processes eat 30G of virtual memory space and
repeat random memory writes for 300 seconds.  This triggers random
swap-in in the background.

This is a combined workload with sequential and random memory
accesses at the same time.  The results (for the sequential workload)
are as follows:

                    Base            Optimized
                    ----            ---------
throughput          345413 KB/s     414029 KB/s (+19.9%)
latency.average     97.14 us        61.06 us (-37.1%)
latency.50th        2 us            1 us
latency.60th        2 us            1 us
latency.70th        98 us           2 us
latency.80th        160 us          2 us
latency.90th        260 us          217 us
latency.95th        346 us          369 us
latency.99th        1.34 ms         1.09 ms
ra_hit%             52.69%          99.98%

The original swap readahead algorithm is confused by the background
random access workload, so its readahead hit rate is lower.  The
VMA-based readahead algorithm works much better.

Linpack
=======

The test memory size is bigger than RAM to trigger swapping.

                    Base            Optimized
                    ----            ---------
elapsed_time        393.49 s        329.88 s (-16.2%)
ra_hit%             86.21%          98.82%

The scores of the base and optimized kernels show no visible change.
But the elapsed time is reduced and the readahead hit rate is
improved, so the optimized kernel runs better in the startup and
tear-down stages.  And the absolute value of the readahead hit rate is
high, which shows that spatial locality is still valid in some
practical workloads.

Signed-off-by: "Huang, Ying" 
Cc: Johannes Weiner 
Cc: Minchan Kim 
Cc: Rik van Riel 
Cc: Shaohua Li 
Cc: Hugh Dickins 
Cc: Fengguang Wu 
Cc: Tim Chen 
Cc: Dave Hansen 
---
 include/linux/mm_types.h |   1 +
 include/linux/swap.h |  57 -
 mm/memory.c  |  23 +++--
 mm/shmem.c   |   2 +-
 mm/swap_state.c  | 215 +++
 5 files changed, 273 insertions(+), 25 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7f384bb62d8e..5c02027050a2 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -335,6 +335,7 @@ struct vm_area_struct {
struct file * vm_file;  /* File we map to (can be NULL). */
void * vm_private_data; /* was vm_pte (shared mem) */
 
+   atomic_long_t swap_readahead_info;
 #ifndef CONFIG_MMU
struct vm_region *vm_region;/* NOMMU mapping region */
 #endif
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 76f1632eea5a..61d63379e956 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -251,6 +251,25 @@ struct swap_info_struct {
struct swap_cluster_list discard_clusters; /* discard clusters list */
 };
 
+#ifdef CONFIG_64BIT
+#define SWAP_RA_ORDER_CEILING  5
+#else
+/* Avoid stack overflow, because we need to save part of page table */
+#define SWAP_RA_ORDER_CEILING  3
+#define SWAP_RA_PTE_CACHE_SIZE (1 << SWAP_RA_ORDER_CEILING)
+#endif
+
+struct 

[PATCH -mm -v4 2/5] mm, swap: Fix swap readahead marking

2017-08-06 Thread Huang, Ying
From: Huang Ying 

In the original implementation, it is possible that existing pages in
the swap cache (not newly read ahead) are marked as readahead pages.
This makes the swap readahead statistics wrong and influences the
swap readahead algorithm too.

This is fixed by marking a page as a readahead page only if it is
newly allocated and read from the disk.

When testing with linpack, after the fix the swap readahead hit rate
increased from ~66% to ~86%.
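
The corrected marking condition can be written as a single predicate.
The function below is a userspace sketch with a hypothetical name and
plain boolean inputs in place of the kernel's page flags; it only
restates the condition that the diff below applies inside the readahead
loop.

```c
#include <stdbool.h>

/*
 * A page is marked as a readahead page only if this readahead pass newly
 * allocated it, it is not the page that was actually faulted on, and it
 * is not part of a transparent huge (compound) page.
 */
static bool demo_should_mark_readahead(bool page_allocated,
				       unsigned long offset,
				       unsigned long entry_offset,
				       bool trans_compound)
{
	return page_allocated && offset != entry_offset && !trans_compound;
}
```

Before the fix, the first condition was missing, so a page that was
already sitting in the swap cache could be counted as a readahead page.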

Signed-off-by: "Huang, Ying" 
Cc: Johannes Weiner 
Cc: Minchan Kim 
Cc: Rik van Riel 
Cc: Shaohua Li 
Cc: Hugh Dickins 
Cc: Fengguang Wu 
Cc: Tim Chen 
Cc: Dave Hansen 
---
 mm/swap_state.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index d1bdb31cab13..a901afe9da61 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -498,7 +498,7 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
unsigned long start_offset, end_offset;
unsigned long mask;
struct blk_plug plug;
-   bool do_poll = true;
+   bool do_poll = true, page_allocated;
 
mask = swapin_nr_pages(offset) - 1;
if (!mask)
@@ -514,14 +514,18 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
blk_start_plug(&plug);
for (offset = start_offset; offset <= end_offset ; offset++) {
/* Ok, do the async read-ahead now */
-   page = read_swap_cache_async(swp_entry(swp_type(entry), offset),
-   gfp_mask, vma, addr, false);
+   page = __read_swap_cache_async(
+   swp_entry(swp_type(entry), offset),
+   gfp_mask, vma, addr, _allocated);
if (!page)
continue;
-   if (offset != entry_offset &&
-   likely(!PageTransCompound(page))) {
-   SetPageReadahead(page);
-   count_vm_event(SWAP_RA);
+   if (page_allocated) {
+   swap_readpage(page, false);
+   if (offset != entry_offset &&
+   likely(!PageTransCompound(page))) {
+   SetPageReadahead(page);
+   count_vm_event(SWAP_RA);
+   }
}
put_page(page);
}
-- 
2.11.0
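
The marking rule in the patch above can be illustrated with a small
userspace model. This is only a sketch, not the kernel code: the
in_cache array, struct ra_stats and readahead_window() are hypothetical
names, and the simulated "swap cache" is just an array of flags. It
shows that only pages newly brought in by the readahead loop are
counted as readahead pages, while pages already in the cache are
skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters, standing in for SWAP_RA accounting and disk reads. */
struct ra_stats {
	unsigned int marked;	/* pages flagged as readahead pages */
	unsigned int reads;	/* pages actually read from "disk" */
};

/*
 * Toy model of the fixed loop: in_cache[off] says whether swap slot
 * 'off' already has a page in the (simulated) swap cache.
 */
static void readahead_window(bool *in_cache, size_t start, size_t end,
			     size_t fault, struct ra_stats *st)
{
	assert(in_cache && st && start <= end);

	for (size_t off = start; off <= end; off++) {
		bool page_allocated = !in_cache[off];

		if (!page_allocated)
			continue;	/* existing page: no read, no mark */

		in_cache[off] = true;	/* models allocation plus swap_readpage() */
		st->reads++;
		if (off != fault)
			st->marked++;	/* models SetPageReadahead() + SWAP_RA */
	}
}
```

With slots 0 and 3 already cached and a fault on slot 2 in a window of
slots 0 to 4, the model reads three pages and marks two of them; the
pre-patch logic would have marked all four non-faulting slots.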



[PATCH -mm -v4 0/5] mm, swap: VMA based swap readahead

2017-08-06 Thread Huang, Ying
Swap readahead is an important mechanism for reducing swap-in
latency.  Although a purely sequential memory access pattern isn't
very common for anonymous memory, space locality is still considered
valid.

In the original swap readahead implementation, consecutive blocks in
the swap device are read ahead based on a global space locality
estimate.  But consecutive blocks in the swap device just reflect the
order of page reclaim; they don't necessarily reflect the access
pattern in the virtual memory space.  And different tasks in the
system may have different access patterns, which makes the global
space locality estimate inaccurate.

With this patchset, when a page fault occurs, the virtual pages near
the fault address are read ahead instead of the swap slots near the
faulting swap slot in the swap device.  This avoids reading ahead
unrelated swap slots.  At the same time, swap readahead is changed to
work per-VMA instead of globally, so that the different access
patterns of different VMAs can be distinguished and different
readahead policies applied accordingly.  The original core readahead
detection and scaling algorithm is reused, because it is an effective
algorithm for detecting space locality.
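
As a rough illustration of reading ahead by virtual address rather
than by swap slot, the sketch below picks a window of pages around the
fault address and clamps it to the faulting VMA. It is a
simplification under stated assumptions, not the patchset's actual
window computation: SKETCH_PAGE_SIZE, struct ra_window and
vma_ra_window() are hypothetical names, the page size is assumed to be
4096 bytes, and the VMA bounds are assumed to be page aligned.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */

/* Half-open range [start, end) of page-aligned virtual addresses. */
struct ra_window {
	unsigned long start;
	unsigned long end;
};

static struct ra_window vma_ra_window(unsigned long fault_addr,
				      unsigned long vma_start,
				      unsigned long vma_end,
				      unsigned long nr_pages)
{
	struct ra_window win;
	unsigned long page, span;

	assert(nr_pages > 0);
	assert(vma_start <= fault_addr && fault_addr < vma_end);

	page = fault_addr & ~(SKETCH_PAGE_SIZE - 1);
	span = nr_pages * SKETCH_PAGE_SIZE;

	/* Window of nr_pages pages, aligned to its own size, holding the fault. */
	win.start = page - (page % span);
	win.end = win.start + span;

	/* Never read ahead outside the faulting VMA. */
	if (win.start < vma_start)
		win.start = vma_start;
	if (win.end > vma_end)
		win.end = vma_end;
	return win;
}
```

Because the window is derived from the fault address and the VMA
bounds only, two VMAs with different access patterns never share a
window, which is the property the per-VMA design relies on.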

In addition to the swap readahead changes, some new sysfs interfaces
are added to show the efficiency of the readahead algorithm and some
other swap statistics.

The new implementation incurs more small random reads.  On SSD, the
improved accuracy of the estimate and of the readahead target should
outweigh the potential increase in overhead; this is also illustrated
by the test results below.  But on HDD, the overhead may outweigh the
benefit, so the original implementation is used by default.

The tests and results are as follows.

Common test condition
=====================

Test Machine: Xeon E5 v3 (2 sockets, 72 threads, 32G RAM)
Swap device: NVMe disk

Micro-benchmark with combined access pattern
============================================

vm-scalability, sequential swap test case: 4 processes eat 50G of
virtual memory space and repeat sequential memory writes for 300
seconds.  The first round of writes triggers swap-out; the following
rounds trigger sequential swap-in and swap-out.

At the same time, the vm-scalability random swap test case runs in the
background: 8 processes eat 30G of virtual memory space and repeat
random memory writes for 300 seconds.  This triggers random swap-in
in the background.

This is a combined workload with sequential and random memory
accesses at the same time.  The results (for the sequential workload)
are as follows:

                 Base           Optimized
                 ----           ---------
throughput       345413 KB/s    414029 KB/s (+19.9%)
latency.average  97.14 us       61.06 us (-37.1%)
latency.50th     2 us           1 us
latency.60th     2 us           1 us
latency.70th     98 us          2 us
latency.80th     160 us         2 us
latency.90th     260 us         217 us
latency.95th     346 us         369 us
latency.99th     1.34 ms        1.09 ms
ra_hit%          52.69%         99.98%

The original swap readahead algorithm is confused by the background
random access workload, so its readahead hit rate is lower.  The
VMA-based readahead algorithm works much better.
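
The percentages quoted next to the throughput and average latency
figures are plain relative changes against the base kernel. A minimal
sketch of that arithmetic follows; percent_change() is a hypothetical
helper name, not something from the patchset.

```c
#include <assert.h>

/* Relative change from base to optimized, in percent. */
static double percent_change(double base, double optimized)
{
	assert(base != 0.0);
	return (optimized - base) / base * 100.0;
}
```

Feeding in the throughput pair (345413, 414029) gives a value just
under 19.9, and the average latency pair (97.14, 61.06) gives a value
just below -37.1, which matches the rounded figures in the table.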

Linpack
=======

The test memory size is bigger than RAM to trigger swapping.

                 Base           Optimized
                 ----           ---------
elapsed_time     393.49 s       329.88 s (-16.2%)
ra_hit%          86.21%         98.82%

The Linpack score shows no visible change between the base and
optimized kernels.  But the elapsed time is reduced and the readahead
hit rate is improved, so the optimized kernel runs better during the
startup and tear-down stages.  And the absolute readahead hit rate is
high, which shows that space locality is still valid in some
practical workloads.

Changelogs:

v4:

- Rebased on latest -mm tree.

- Remove swap cache statistics interface, because we found that the
  interface for readahead statistics should be sufficient.

- Use /proc/vmstat for swap readahead statistics, because that is the
  interface used by other similar statistics.

- Add ABI document for newly added sysfs interface.

v3:

- Rebased on latest -mm tree

- Use percpu_counter for swap readahead statistics per Dave Hansen's comment.

Best Regards,
Huang, Ying


[PATCH] ASoC: mediatek: Fix an error checking code

2017-08-06 Thread Christophe JAILLET
Check the value returned by 'devm_clk_get()' instead of the clock
identifier which can never be an ERR code.

Fixes: d6f3710a56e1 ("ASoC: mediatek: add structure define and clock control for 2701")
Signed-off-by: Christophe JAILLET 
---
 sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
index b815ecc6bbf6..affa7fb25dd9 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
@@ -75,7 +75,7 @@ int mt2701_init_clock(struct mtk_base_afe *afe)
 
 	for (i = 0; i < MT2701_CLOCK_NUM; i++) {
 		afe_priv->clocks[i] = devm_clk_get(afe->dev, aud_clks[i]);
-		if (IS_ERR(aud_clks[i])) {
+		if (IS_ERR(afe_priv->clocks[i])) {
 			dev_warn(afe->dev, "%s devm_clk_get %s fail\n",
 				 __func__, aud_clks[i]);
 			return PTR_ERR(aud_clks[i]);
-- 
2.11.0
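
The idiom the patch restores, testing the pointer returned by the
lookup rather than the name passed to it, can be sketched in
userspace. The helpers err_ptr(), ptr_err() and is_err() are stand-ins
that mimic the kernel's ERR_PTR(), PTR_ERR() and IS_ERR();
fake_clk_get() and init_clocks() are hypothetical. Note that the
sketch also takes the error code from the returned pointer, whereas
the context line in the hunk above still passes aud_clks[i] to
PTR_ERR(); treating that as a further candidate for fixing is an
assumption, not something the posted patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *err_ptr(long error) { return (void *)(intptr_t)error; }
static long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup: only the name "ok_clk" resolves to a clock. */
static void *fake_clk_get(const char *name)
{
	static int clk;	/* dummy object standing in for struct clk */

	return strcmp(name, "ok_clk") == 0 ? (void *)&clk : err_ptr(-ENOENT);
}

static long init_clocks(const char *const *names, int n, void **clocks)
{
	assert(names && clocks);

	for (int i = 0; i < n; i++) {
		clocks[i] = fake_clk_get(names[i]);
		/* Test the returned pointer; names[i] is never an error value. */
		if (is_err(clocks[i]))
			return ptr_err(clocks[i]);
	}
	return 0;
}
```

A name string is an ordinary valid pointer, so is_err() on it is
always false; that is why the original check could never fire.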



Re: [PATCH] Allow automatic kernel taint on unsigned module load to be disabled

2017-08-06 Thread Matthew Garrett
On Sun, Aug 6, 2017 at 9:47 PM, Rusty Russell  wrote:
> Matthew Garrett  writes:
>> And then you need an entire trusted userland, at which point you can
>> assert that the modules are trustworthy without having to validate
>> them so you don't need CONFIG_MODULE_SIG anyway.
>
> Yep.  But your patch already gives userland that power, to silently load
> unsigned modules.

Only if sig_enforce isn't set.

>>> With your patch, you don't get tainting in the environment where you can
>>> verify.
>>
>> You don't anyway, do you? Loading has already failed before this point
>> if sig_enforce is set.
>
> No.  You used to get a warning and a taint when you had a kernel
> configured to expect signatures and it didn't get one.  You want to
> remove that warning, to silently accept unsigned modules.

I'm very confused here. If sig_enforce is set, the kernel will refuse
to load an unsigned module - it won't be tainted, modprobe will just
return an error. If sig_enforce is not set, any attacker in a position
to provide unsigned modules is also in a position to just subvert
modprobe, so you aren't in an environment where you can verify
anything. The taint is informational, not any form of security. You're
only able to securely verify module signatures in userland in very
constrained setups.

>>> You'd be better adding a sysctl or equiv. to turn off force loading, and
>>> use that in your can-verify system.
>>
>> I'm not sure what you mean by "force loading" here - if sig_enforce is
>> set, you can't force load an unsigned module. If sig_enforce isn't
>> set, you'll taint regardless of whether or not you force.
>>
>> Wait. Hang on - are you confusing CONFIG_MODULE_SIG with CONFIG_MODVERSIONS?
>
> No, I mean stripping the signatures.  (I thought modprobe could do this
> these days, but apparently not!)
>
> So, you're actually building the same kernel, but building two sets of
> modules: one without signatures, one with?
>
> And when deploying the one with signatures, you're setting sig_enforce.
> On the other, you don't want signatures because um, reasons?  And you
> want to suppress the message?

No. A distribution may ship a kernel with signed modules. In some
configurations, the signatures are irrelevant - there's no mechanism
to verify that the correct kernel was loaded in the first place, so
for all you know the signature validation code has already been
removed at runtime. In that scenario you're fine with users loading
unsigned kernel modules and there's no benefit in tainting the kernel.
But the same kernel may be booted under circumstances where it *is*
possible to validate the kernel, and in those circumstances you want
to enforce module signatures and so sig_enforce is set.

Right now you have two choices:

1) unsigned modules taint the kernel if sig_enforce is false, unsigned
modules can't be loaded if sig_enforce is true (ie, CONFIG_MODULE_SIG
is set)
2) unsigned modules do not taint the kernel, unsigned modules can
always be loaded (ie, CONFIG_MODULE_SIG is unset)

What I want is:

3) unsigned modules do not taint the kernel if sig_enforce is false,
unsigned modules can't be loaded if sig_enforce is true

This is currently impossible to express, and as a result some
distributions ship with CONFIG_MODULE_SIG disabled in order to avoid
dealing with user questions about why loading locally built modules
now taints the kernel. Being able to build a single kernel that
satisfies more use cases seems like a win.

But maybe there's a cleaner way. How about adding a parameter like
sig_enforce (say taint_on_unsigned) and then adding a config parameter
equivalent to CONFIG_MODULE_SIG_FORCE? That way the default policy can
be set at build time, but can also be overridden by end users who
still want to be able to taint on unsigned module load.
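
To make the three policies concrete, here is a minimal sketch of the
decision for an unsigned module under the proposed pair of knobs. It
is only a model of the behaviour described in this mail, not kernel
code: struct sig_policy, enum load_result and load_unsigned_module()
are hypothetical names, and taint_on_unsigned is the parameter name
floated above.

```c
#include <assert.h>
#include <stdbool.h>

enum load_result {
	LOAD_REFUSED,		/* module load fails */
	LOAD_OK,		/* module loads, kernel not tainted */
	LOAD_OK_TAINTED,	/* module loads, kernel tainted */
};

struct sig_policy {
	bool sig_enforce;	/* existing knob */
	bool taint_on_unsigned;	/* proposed knob, name taken from the mail */
};

/* Outcome of loading an unsigned module under the given policy. */
static enum load_result load_unsigned_module(const struct sig_policy *p)
{
	assert(p);

	if (p->sig_enforce)
		return LOAD_REFUSED;
	return p->taint_on_unsigned ? LOAD_OK_TAINTED : LOAD_OK;
}
```

Option 1 corresponds to taint_on_unsigned being true, option 3 to it
being false; in both cases sig_enforce still refuses the load outright.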


Re: [PATCH] devfreq: replace sscanf with kstrtol

2017-08-06 Thread Chanwoo Choi
Hi,

On 2017-08-07 13:47, gsant...@codeaurora.org wrote:
> On 2017-08-04 20:42, Chanwoo Choi wrote:
>> Hi,
>>
>> On Fri, Aug 4, 2017 at 12:57 PM,   wrote:
>>> Hi,
>>>
>>> Adding error checks to devfreq userspace governor, the current
>>> implementation results in setting wrong
>>> frequency when sscanf returns error.
>>>
>>>
>>> From 12e0a347addd70529b2c378299b27b65f0766f99 Mon Sep 17 00:00:00 2001
>>> From: Santosh Mardi 
>>> Date: Tue, 25 Jul 2017 18:47:11 +0530
>>> Subject: [PATCH] devfreq: replace sscanf with kstrtol
>>>
>>> store_freq function of devfreq userspace governor
>>> executes further, even if error is returned from sscanf,
>>> this will result in setting up wrong frequency value.
>>>
>>> The usage for the sscanf is only for single variable so
>>> replace sscanf with kstrtol along with error check to
>>> bail out if any error is returned.
>>>
>>> Signed-off-by: Santosh Mardi 
>>> ---
>>>  drivers/devfreq/governor_userspace.c | 5 -
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/devfreq/governor_userspace.c
>>> b/drivers/devfreq/governor_userspace.c
>>> index 77028c2..a84796d 100644
>>> --- a/drivers/devfreq/governor_userspace.c
>>> +++ b/drivers/devfreq/governor_userspace.c
>>> @@ -53,12 +53,15 @@ static ssize_t store_freq(struct device *dev, struct
>>> device_attribute *attr,
>>> mutex_lock(&devfreq->lock);
>>> data = devfreq->data;
>>>
>>> -   sscanf(buf, "%lu", &wanted);
>>> +   err = kstrtol(buf, 0, &wanted);
>>> +   if (err < 0)
>>> +   goto out;
>>
>> I think that just you can check the return value as following:
>> The other point of devfreq already uses the following style
>> to check the return value of sscanf. I think kstrtol is not necessary.
>>
>>  err = sscanf(buf, "%lu", &wanted);
>>  if (err != 1)
>>   goto out;
>>
> 
> [Santosh] - I agree we need to have this error check as mentioned by you if 
> we are scanning an array with sscanf,
> but in the above code we are only scanning one variable, and there is a rule 
> in the checkpatch scripts not to use sscanf for a single variable, so I 
> need to replace sscanf with kstrtol

IMHO, even if checkpatch shows the warning about sscanf,
I'd like you to use 'sscanf' in order to maintain
the consistency and readability when handling the sscanf.

For example, drivers/devfreq/devfreq.c and drivers/cpufreq/cpufreq.c
have the same warnings on many points.
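
The two styles under discussion can be compared in userspace. This is
a sketch under assumptions, not the governor code: parse_freq_sscanf()
and parse_freq_strict() are hypothetical names, and strtoul() with
explicit end-of-string checks is only an approximation of what the
kernel's kstrtoul() family does (for instance, strtoul() also accepts
leading whitespace). If wanted is an unsigned long, as the "%lu"
format suggests, the unsigned variant would be the closer match than
kstrtol.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Style suggested in the review: check that sscanf converted one item. */
static int parse_freq_sscanf(const char *buf, unsigned long *freq)
{
	unsigned long wanted;

	assert(buf && freq);
	if (sscanf(buf, "%lu", &wanted) != 1)
		return -EINVAL;
	*freq = wanted;
	return 0;
}

/* Stricter style: the whole string must be a number, plus optional newline. */
static int parse_freq_strict(const char *buf, unsigned long *freq)
{
	unsigned long wanted;
	char *end;

	assert(buf && freq);
	errno = 0;
	wanted = strtoul(buf, &end, 0);
	if (end == buf)
		return -EINVAL;		/* no digits at all */
	if (errno == ERANGE)
		return -ERANGE;		/* value does not fit */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	*freq = wanted;
	return 0;
}
```

Both variants leave *freq untouched on failure, which is the property
the patch is after. They differ on input such as "1000abc": the sscanf
version accepts the leading 1000, while the strict version rejects the
string.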

> 
> I have added all the mails I got as output from scripts/get_maintainer.pl 
> scripts in this mail.

Maybe you forgot to include me (reviewer) in the cc list.

MyungJoo Ham  (maintainer:DEVICE FREQUENCY (DEVFREQ))
Kyungmin Park  (maintainer:DEVICE FREQUENCY (DEVFREQ))
Chanwoo Choi  (reviewer:DEVICE FREQUENCY (DEVFREQ))
linux...@vger.kernel.org (open list:DEVICE FREQUENCY (DEVFREQ))
linux-kernel@vger.kernel.org (open list)

> 
> 
>> And please use the scripts/get_maintainer.pl
>> in order to prevent the missing of the reviewer.
>>
>>> data->user_frequency = wanted;
>>> data->valid = true;
>>> err = update_devfreq(devfreq);
>>> if (err == 0)
>>> err = count;
>>> +out:
>>> mutex_unlock(&devfreq->lock);
>>> return err;
>>>  }
>>> -- 
>>>
>>> Regards,
>>> Santosh M G.
>>> Qualcomm Innovation Center
> 
> 
> 

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [v3 PATCH 1/2] powernv/powerpc:Save/Restore additional SPRs for stop4 cpuidle

2017-08-06 Thread Gautham R Shenoy
Hi Michael,


On Tue, Aug 01, 2017 at 08:56:18PM +1000, Michael Ellerman wrote:
> "Gautham R. Shenoy"  writes:
> >
> > Subject: [v3 PATCH 1/2] powernv/powerpc:Save/Restore additional SPRs for 
> > stop4 cpuidle
> 
> I know it's not a big deal, but can we agree on the subject format?
> 
>   powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle

Sure. I will repost with the updated subject.

> 
> cheers
> 
--
Thanks and Regards
gautham.



Re: [PATCH 2/2] Save current timestamp part of dmesg while writing oops message to pstore

2017-08-06 Thread Ankit Kumar

Hi Kees,



On Tuesday 23 May 2017 02:19 PM, Ankit Kumar wrote:

Hi Kees,



On Tuesday 23 May 2017 05:21 AM, Kees Cook wrote:
On Mon, May 22, 2017 at 3:20 AM, Ankit Kumar 
 wrote:

Currently on panic or Oops, kernel saves the last few bytes from dmesg
buffer to nvram. Usually kdump does capture kernel memory and provide
dmesg logs as well. But in some cases where kdump fails to capture
vmcore, the dmesg buffer stored in nvram/pstore turns out to be very
helpful in analyzing root cause.

The present code creates the pstore dump file (/sys/fs/pstore/dmesg-***)
with a timestamp retrieved from the header. The dump file can then be
analyzed based on its file creation time, and we can make out whether the
dump file has the latest data or not.

But when we transfer the pstore dump file (/sys/fs/pstore/dmesg-***) to
another machine, or collect the file using utilities such as sosreport or
supportconfig, the file timestamp gets changed, and hence by looking at the
dump file (dmesg-***) we won't be able to identify whether the dump has the
latest data or not.

The above issue can be fixed if we also write the timestamp (dump creation
time) as the initial few bytes while capturing the dmesg buffer to the
pstore dump file (/sys/fs/pstore/dmesg-***).

This patch enhances the pstore write code to also write the timestamp as
part of the data.

Here is a sample log from a dump file (/sys/fs/pstore/dmesg-***):
Oops#1 Part1 [timestamp:1494939359.590463]

While I understand your rationale about possibly losing file timestamp
information in userspace, I think this is a solvable problem on the
collection side. If an additional header is needed, perhaps copy the
dmesg files like this:

for i in dmesg-*; do
 (stat --format=%y /sys/fs/pstore/$i; \
  cat /sys/fs/pstore/$i) > $collect_dir/$i
done


Yes. We can handle this in userspace. But we wanted to see if we can
add this as part of the pstore log itself.

One of the primary concerns for pstore is the stored dump size,

I understand. How about adding the timestamp to the file name itself?
Something like below

How about appending the time as part of the file name itself?
Did you get time to look at the above approach?
The code could be something like the piece below.

~Ankit


index 792a4e5..0837365 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -349,9 +349,10 @@ int pstore_mkfile(struct dentry *root, struct pstore_record *record)

 	switch (record->type) {
 	case PSTORE_TYPE_DMESG:
-		scnprintf(name, sizeof(name), "dmesg-%s-%lld%s",
+		scnprintf(name, sizeof(name), "dmesg-%s-%lld%s-%lu.%lu",
 			  record->psi->name, record->id,
-			  record->compressed ? ".enc.z" : "");
+			  record->compressed ? ".enc.z" : "",
+			  record->time.tv_sec, record->time.tv_nsec / 1000);
 		break;
 	case PSTORE_TYPE_CONSOLE:
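
The naming scheme proposed in the hunk above can be tried out in
userspace with snprintf(). This is a sketch, not the pstore code:
dmesg_file_name() is a hypothetical helper and "erst" is only an
example backend name. One deliberate difference from the hunk is the
"%06ld" conversion, which zero-pads the microseconds so that, for
example, 5 microseconds prints as .000005 rather than .5; whether the
real patch should do the same is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Build "dmesg-<backend>-<id>[.enc.z]-<sec>.<usec>" into name.
 * Returns the length snprintf() reports for the formatted name.
 */
static int dmesg_file_name(char *name, size_t len, const char *backend,
			   long long id, bool compressed,
			   long sec, long nsec)
{
	assert(name && backend && nsec >= 0 && nsec < 1000000000L);

	return snprintf(name, len, "dmesg-%s-%lld%s-%ld.%06ld",
			backend, id, compressed ? ".enc.z" : "",
			sec, nsec / 1000);
}
```

Called with the timestamp from the sample log earlier in the thread
(1494939359 seconds, 590463000 nanoseconds), a compressed record with
id 1 on the example backend yields
dmesg-erst-1.enc.z-1494939359.590463.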


~Ankit




[PATCHv2] arm64:kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores

2017-08-06 Thread Hoeun Ryu
 Commit 0ee5941 : (x86/panic: replace smp_send_stop() with kdump friendly
version in panic path) introduced crash_smp_send_stop() which is a weak
function and can be overridden by architecture code to fix the side effect
caused by commit f06e515 : (kernel/panic.c: add "crash_kexec_post_
notifiers" option).

 ARM64 architecture uses the weak version function and the problem is that
the weak function simply calls smp_send_stop() which makes other CPUs
offline and takes away the chance to save crash information for nonpanic
CPUs in machine_crash_shutdown() when crash_kexec_post_notifiers kernel
option is enabled.

 Calling smp_send_crash_stop() in machine_crash_shutdown() is useless
because all nonpanic CPUs are already offline by smp_send_stop() in this
case and smp_send_crash_stop() only works against online CPUs.

 The result is that /proc/vmcore is not available with the error messages;
"Warning: Zero PT_NOTE entries found", "Kdump: vmcore not initialized".

 crash_smp_send_stop() is implemented to fix this problem by replacing the
existing smp_send_crash_stop() and adding a check for repeated calls to
the function. The function (strong symbol version) saves crash information
for nonpanic CPUs and machine_crash_shutdown() tries to save crash
information for nonpanic CPUs only when crash_kexec_post_notifiers kernel
option is disabled.

* crash_kexec_post_notifiers : false

  panic()
    __crash_kexec()
      machine_crash_shutdown()
        crash_smp_send_stop()    <= save crash dump for nonpanic cores

* crash_kexec_post_notifiers : true

  panic()
    crash_smp_send_stop()        <= save crash dump for nonpanic cores
    __crash_kexec()
      machine_crash_shutdown()
        crash_smp_send_stop()    <= just return.

Signed-off-by: Hoeun Ryu 
---
 v2:
   - replace the existing smp_send_crash_stop() with crash_smp_send_stop()
 and adding called-twice logic to it.
   - modify the commit message

 arch/arm64/include/asm/smp.h  |  2 +-
 arch/arm64/kernel/machine_kexec.c |  2 +-
 arch/arm64/kernel/smp.c   | 12 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 55f08c5..f82b447 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -148,7 +148,7 @@ static inline void cpu_panic_kernel(void)
  */
 bool cpus_are_stuck_in_kernel(void);
 
-extern void smp_send_crash_stop(void);
+extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 
 #endif /* ifndef __ASSEMBLY__ */
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index 481f54a..11121f6 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -252,7 +252,7 @@ void machine_crash_shutdown(struct pt_regs *regs)
local_irq_disable();
 
/* shutdown non-crashing cpus */
-   smp_send_crash_stop();
+   crash_smp_send_stop();
 
/* for crashing cpu */
crash_save_cpu(regs, smp_processor_id());
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index dc66e6e..73d8f5e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -977,11 +977,21 @@ void smp_send_stop(void)
 }
 
 #ifdef CONFIG_KEXEC_CORE
-void smp_send_crash_stop(void)
+void crash_smp_send_stop(void)
 {
+   static int cpus_stopped;
cpumask_t mask;
unsigned long timeout;
 
+   /*
+* This function can be called twice in panic path, but obviously
+* we execute this only once.
+*/
+   if (cpus_stopped)
+   return;
+
+   cpus_stopped = 1;
+
if (num_online_cpus() == 1)
return;
 
-- 
2.7.4
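
The called-twice guard added to crash_smp_send_stop() can be sketched in userspace as follows. This is a minimal illustration, not the arm64 code: crash_stop_once() and stop_requests are hypothetical names, and the counter merely stands in for the real work of stopping the other CPUs and saving their state.

```c
#include <stdbool.h>

/* Hypothetical counter standing in for the real work of stopping CPUs. */
static int stop_requests;

/*
 * Sketch of the called-twice guard: the first call does the work, any
 * later call returns immediately. Returns true if the work was done.
 * Like the patch, this uses a plain static flag, so it assumes that
 * callers are serialized; it is not safe against concurrent callers.
 */
static bool crash_stop_once(void)
{
    static bool cpus_stopped;

    if (cpus_stopped)
        return false;
    cpus_stopped = true;

    stop_requests++;    /* placeholder for the IPI and register save */
    return true;
}
```

Calling crash_stop_once() twice in a row performs the work once, which matches the "just return" step in the crash_kexec_post_notifiers : true flow above.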



[PATCH] perf report: calculate the average cycles of iterations

2017-08-06 Thread Jin Yao
The branch history code has a loop detection function. With
this, we can get the number of iterations by calculating the
removed loops.

It would also be useful to know the average number of cycles per
iteration. This patch adds up the cycles in the branch entries
of the removed loops and saves the result to the next branch entry
(e.g. branch entry A).

Finally it will display the iteration number and average
cycles at the "from" of branch entry A.

For example:
perf record -g -j any,save_type ./div
perf report --branch-history --no-children --stdio

--22.63%--main div.c:42 (RET CROSS_2M)
  compute_flag div.c:28 (cycles:2 iter:173115 avg_cycles:2)
  |
   --10.73%--compute_flag div.c:27 (RET CROSS_2M)
 rand rand.c:28 (cycles:1)
 rand rand.c:28 (RET CROSS_2M)
 __random random.c:298 (cycles:1)
 __random random.c:297 (COND_BWD CROSS_2M)
 __random random.c:295 (cycles:1)
 __random random.c:295 (COND_BWD CROSS_2M)
 __random random.c:295 (cycles:1)
 __random random.c:295 (RET CROSS_2M)

Signed-off-by: Jin Yao 
---
 tools/perf/ui/browsers/hists.c |  8 +---
 tools/perf/ui/stdio/hist.c | 10 ++---
 tools/perf/util/callchain.c| 49 +++
 tools/perf/util/callchain.h|  9 ++---
 tools/perf/util/machine.c  | 88 +-
 5 files changed, 85 insertions(+), 79 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index f4bc246..13dfb0a 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -931,12 +931,8 @@ static int hist_browser__show_callchain_list(struct 
hist_browser *browser,
   browser->show_dso);
 
if (symbol_conf.show_branchflag_count) {
-   if (need_percent)
-   callchain_list_counts__printf_value(node, chain, NULL,
-   buf, sizeof(buf));
-   else
-   callchain_list_counts__printf_value(NULL, chain, NULL,
-   buf, sizeof(buf));
+   callchain_list_counts__printf_value(chain, NULL,
+   buf, sizeof(buf));
 
if (asprintf(&alloc_str2, "%s%s", str, buf) < 0)
str = "Not enough memory!";
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 5c95b83..8bdb7a5 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -124,12 +124,8 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct 
callchain_node *node,
str = callchain_list__sym_name(chain, bf, sizeof(bf), false);
 
if (symbol_conf.show_branchflag_count) {
-   if (!period)
-   callchain_list_counts__printf_value(node, chain, NULL,
-   buf, sizeof(buf));
-   else
-   callchain_list_counts__printf_value(NULL, chain, NULL,
-   buf, sizeof(buf));
+   callchain_list_counts__printf_value(chain, NULL,
+   buf, sizeof(buf));
 
if (asprintf(&alloc_str, "%s%s", str, buf) < 0)
str = "Not enough memory!";
@@ -313,7 +309,7 @@ static size_t callchain__fprintf_graph(FILE *fp, struct 
rb_root *root,
 
if (symbol_conf.show_branchflag_count)
ret += callchain_list_counts__printf_value(
-   NULL, chain, fp, NULL, 0);
+   chain, fp, NULL, 0);
ret += fprintf(fp, "\n");
 
if (++entries_printed == callchain_param.print_limit)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index f320b07..510b513 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -588,7 +588,7 @@ fill_node(struct callchain_node *node, struct 
callchain_cursor *cursor)
call->cycles_count =
cursor_node->branch_flags.cycles;
call->iter_count = cursor_node->nr_loop_iter;
-   call->samples_count = cursor_node->samples;
+   call->iter_cycles = cursor_node->iter_cycles;
}
}
 
@@ -722,7 +722,7 @@ static enum match_result match_chain(struct 
callchain_cursor_node *node,
cnode->cycles_count +=
node->branch_flags.cycles;
cnode->iter_count += node->nr_loop_iter;
-
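
The arithmetic described in the commit message (sum the cycles of the removed loop iterations, then report an average next to the iteration count) can be sketched as below. The diff is truncated in this archive, so this is an assumption about the calculation rather than the patch's actual code: struct iter_stats and both helpers are hypothetical, and only the field names iter_cycles and iter_count are borrowed from the visible hunks.

```c
#include <stdint.h>

/*
 * Hypothetical accumulator: total cycles spent in removed loop
 * iterations and the number of iterations they represent.
 */
struct iter_stats {
    uint64_t iter_cycles;
    uint64_t iter_count;
};

/* Fold one removed loop (its cycles and iteration count) into the totals. */
static void iter_stats_add(struct iter_stats *s, uint64_t cycles,
                           uint64_t iters)
{
    s->iter_cycles += cycles;
    s->iter_count += iters;
}

/* Integer average cycles per iteration; 0 when nothing was recorded. */
static uint64_t iter_stats_avg(const struct iter_stats *s)
{
    return s->iter_count ? s->iter_cycles / s->iter_count : 0;
}
```

The explicit zero check matters because a branch entry with no removed loops would otherwise divide by zero.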

Re: [PATCH] osq_lock: fix osq_lock queue corruption

2017-08-06 Thread Prateek Sood
On 07/31/2017 10:54 PM, Prateek Sood wrote:
> Fix ordering of link creation between node->prev and prev->next in
> osq_lock(). Consider a case in which the optimistic spin queue is
> CPU6->CPU2 and CPU6 has acquired the lock.
> 
>          tail
>           v
>   ,-. <- ,-.
>   |6|    |2|
>   `-' -> `-'
> 
> At this point if CPU0 comes in to acquire osq_lock, it will update the
> tail count.
> 
>   CPU2                  CPU0
>   ----------------------------------
> 
>                 tail
>                   v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-' -> `-'    `-'
> 
> After tail count update if CPU2 starts to unqueue itself from
> optimistic spin queue, it will find updated tail count with CPU0 and
> update CPU2 node->next to NULL in osq_wait_next().
> 
>   unqueue-A
> 
>                 tail
>                   v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
> 
>   unqueue-B
> 
>   ->tail != curr && !node->next
> 
> If the following stores are reordered, then prev->next (where prev is
> the CPU2 node) would be updated to point to the CPU0 node:
> 
>                 tail
>                   v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-' -> `-' -> `-'
> 
>   osq_wait_next()
> node->next <- 0
> xchg(node->next, NULL)
> 
>                 tail
>                   v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
> 
>   unqueue-C
> 
> At this point if next instruction
> WRITE_ONCE(next->prev, prev);
> in CPU2 path is committed before the update of CPU0 node->prev = prev then
> CPU0 node->prev will point to CPU6 node.
> 
>                  tail
>    V------------. v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
>    `-------------^
> 
> If CPU0's node->prev = prev is then committed, CPU0's prev changes
> back to the CPU2 node. CPU2's node->next is currently NULL,
> 
>                 tail
>                   v
>   ,-. <- ,-. <- ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
>    `-------------^
> 
> so if CPU0 gets into unqueue path of osq_lock it will keep spinning
> in infinite loop as condition prev->next == node will never be true.
> 
> Signed-off-by: Prateek Sood 
> ---
>  kernel/locking/osq_lock.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index a316794..9f4afa3 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -109,6 +109,19 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>  
>   prev = decode_cpu(old);
>   node->prev = prev;
> +
> + /*
> +  * osq_lock()                   unqueue
> +  *
> +  * node->prev = prev            osq_wait_next()
> +  * WMB                          MB
> +  * prev->next = node            next->prev = prev //unqueue-C
> +  *
> +  * Here 'node->prev' and 'next->prev' are the same variable and we need
> +  * to ensure these stores happen in-order to avoid corrupting the list.
> +  */
> + smp_wmb();
> +
>   WRITE_ONCE(prev->next, node);
>  
>   /*
> 

Hi Peter,

I have updated the change log and comments in code.

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project
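
The ordering that the smp_wmb() enforces can be sketched in portable C11 atomics. This is an analogy under stated assumptions, not the kernel implementation: struct qnode and qnode_link() are hypothetical, a release fence is used as the rough C11 counterpart of a write barrier, and the sketch only helps if the other side observes prev->next with at least acquire semantics (the kernel's unqueue path uses xchg(), which is a full barrier).

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical queue node, loosely modelled on the prev/next links above. */
struct qnode {
    struct qnode *_Atomic prev;
    struct qnode *_Atomic next;
};

/*
 * Link 'node' behind 'prev'. The release fence orders the store to
 * node->prev before the store to prev->next. A thread that finds 'node'
 * through an acquire load of prev->next and then writes node->prev is
 * therefore ordered after our write, so its value is not overwritten by
 * our stale one.
 */
static void qnode_link(struct qnode *node, struct qnode *prev)
{
    atomic_store_explicit(&node->prev, prev, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&prev->next, node, memory_order_relaxed);
}
```

Without the fence, the two relaxed stores may become visible in either order, which corresponds to the window the commit message walks through.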


[PATCH] i2c: imx: Remove a useless test in 'i2c_imx_init_recovery_info()'

2017-08-06 Thread Christophe JAILLET
'devm_pinctrl_get()' never returns NULL, so this test can be simplified.

Signed-off-by: Christophe JAILLET 
---
 drivers/i2c/busses/i2c-imx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index 54a47b40546f..7e84662fe1c0 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -997,7 +997,7 @@ static int i2c_imx_init_recovery_info(struct imx_i2c_struct 
*i2c_imx,
struct i2c_bus_recovery_info *rinfo = &i2c_imx->rinfo;
 
i2c_imx->pinctrl = devm_pinctrl_get(&pdev->dev);
-   if (!i2c_imx->pinctrl || IS_ERR(i2c_imx->pinctrl)) {
+   if (IS_ERR(i2c_imx->pinctrl)) {
dev_info(&pdev->dev, "can't get pinctrl, bus recovery not supported\n");
return PTR_ERR(i2c_imx->pinctrl);
}
-- 
2.11.0
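
Why the NULL test was not just redundant but slightly misleading can be shown with a userspace re-creation of the error-pointer convention. The helpers below (err_ptr(), ptr_err(), is_err()) are illustrative stand-ins written for this note, modelled on the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); they are not the kernel macros themselves.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that can be encoded in a pointer. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode a pointer back into a (possibly negative) integer. */
static inline long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for the top MAX_ERRNO addresses, i.e. encoded -errno values. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

With these definitions, is_err(NULL) is false and ptr_err(NULL) is 0. So had the pointer ever been NULL, the old branch would have logged a failure and then returned 0, which callers normally read as success.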



[PATCH v2] rtlwifi: constify rate_control_ops structure

2017-08-06 Thread Bhumika Goyal
rate_control_ops structure is only passed as an argument to the
function ieee80211_rate_control_{register/unregister}. This argument
is of type const, so declare the structure as const.

Signed-off-by: Bhumika Goyal 
---
Changes in v2:
* Change subject line.

 drivers/net/wireless/realtek/rtlwifi/rc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rc.c 
b/drivers/net/wireless/realtek/rtlwifi/rc.c
index 951d257..02811ed 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rc.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rc.c
@@ -283,7 +283,7 @@ static void rtl_rate_free_sta(void *rtlpriv,
kfree(rate_priv);
 }
 
-static struct rate_control_ops rtl_rate_ops = {
+static const struct rate_control_ops rtl_rate_ops = {
.name = "rtl_rc",
.alloc = rtl_rate_alloc,
.free = rtl_rate_free,
-- 
1.9.1
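
The pattern behind this kind of constification can be sketched generically. Everything in the example is hypothetical (struct rate_ops, rate_ops_register(), demo_ops); it only illustrates that a table of function pointers can be declared const when the registration function takes a pointer-to-const and nothing writes to the table afterwards.

```c
#include <stddef.h>

/* Hypothetical ops table, standing in for a real rate-control table. */
struct rate_ops {
    const char *name;
    int (*pick_rate)(int signal);
};

/* Hypothetical registry holding at most one table. */
static const struct rate_ops *registered;

/* Takes a pointer-to-const, so callers may pass a const table. */
static int rate_ops_register(const struct rate_ops *ops)
{
    if (!ops || registered)
        return -1;
    registered = ops;
    return 0;
}

static int demo_pick_rate(int signal)
{
    return signal > 50 ? 54 : 11;
}

/* const allows the compiler to place the table in read-only data. */
static const struct rate_ops demo_ops = {
    .name = "demo_rc",
    .pick_rate = demo_pick_rate,
};
```

An accidental assignment such as demo_ops.name = "x" then fails at compile time instead of silently modifying a shared table.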



Re: linux-next: manual merge of the net-next tree with the net tree

2017-08-06 Thread Stephen Rothwell
Hi Neal,

On Sun, 6 Aug 2017 22:21:43 -0400 Neal Cardwell  wrote:
>
> > I fixed it up (see below) and can carry the fix as necessary. This
> > is now fixed as far as linux-next is concerned, but any non trivial
> > conflicts should be mentioned to your upstream maintainer when your tree
> > is submitted for merging.  You may also want to consider cooperating
> > with the maintainer of the conflicting tree to minimise any particularly
> > complex conflicts.  
> 
> Sorry about that. Will try to follow that procedure in the future.

The above is a generic statement I add to all these emails.  It is
aimed more at the maintainers of the trees involved, not the developers
of patches.  I don't think you need to do anything different in these
cases with the "net" and "net-next" tree.  Dave Miller will fix up any
conflicts when he next merges the net tree into the net-next tree.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] devfreq: replace sscanf with kstrtol

2017-08-06 Thread gsantosh

On 2017-08-04 20:42, Chanwoo Choi wrote:

Hi,

On Fri, Aug 4, 2017 at 12:57 PM,   wrote:

Hi,

Adding error checks to devfreq userspace governor, the current
implementation results in setting wrong
frequency when sscanf returns error.


From 12e0a347addd70529b2c378299b27b65f0766f99 Mon Sep 17 00:00:00 2001
From: Santosh Mardi 
Date: Tue, 25 Jul 2017 18:47:11 +0530
Subject: [PATCH] devfreq: replace sscanf with kstrtol

store_freq function of devfreq userspace governor
executes further, even if error is returned from sscanf,
this will result in setting up wrong frequency value.

The usage for the sscanf is only for single variable so
replace sscanf with kstrtol along with error check to
bail out if any error is returned.

Signed-off-by: Santosh Mardi 
---
 drivers/devfreq/governor_userspace.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/devfreq/governor_userspace.c
b/drivers/devfreq/governor_userspace.c
index 77028c2..a84796d 100644
--- a/drivers/devfreq/governor_userspace.c
+++ b/drivers/devfreq/governor_userspace.c
@@ -53,12 +53,15 @@ static ssize_t store_freq(struct device *dev, struct device_attribute *attr,
mutex_lock(&devfreq->lock);
data = devfreq->data;

-   sscanf(buf, "%lu", &wanted);
+   err = kstrtol(buf, 0, &wanted);
+   if (err < 0)
+   goto out;


I think you can just check the return value as follows.
Other parts of devfreq already use this style to check the
return value of sscanf, so I think kstrtol is not necessary.

 err = sscanf(buf, "%lu", &wanted);
 if (err != 1)
  goto out;



[Santosh] - I agree that we need the error check you mention if we are
scanning an array with sscanf, but in the above code we are only scanning
one variable, and there is a rule in the checkpatch script not to use
sscanf for a single variable. So I need to replace sscanf with kstrtol.


I have added all the mails I got as output from 
scripts/get_maintainer.pl scripts in this mail.




And please use the scripts/get_maintainer.pl
in order to prevent the missing of the reviewer.


data->user_frequency = wanted;
data->valid = true;
err = update_devfreq(devfreq);
if (err == 0)
err = count;
+out:
mutex_unlock(&devfreq->lock);
return err;
 }
--

Regards,
Santosh M G.
Qualcomm Innovation Center
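
Both approaches discussed in this thread come down to the same requirement: do not use the parsed value unless parsing succeeded. The userspace sketch below shows strict single-value parsing with strtoul(). It is an illustration only: parse_ulong() is a hypothetical helper, not a kernel API, and its exact acceptance rules (one optional trailing newline, no leading whitespace, no minus sign) are assumptions chosen to resemble how the kernel's kstrto*() helpers behave. Since the target variable in the patch is printed with "%lu", the unsigned variant kstrtoul() may be the closer match than kstrtol(); that is a suggestion, not something settled in the thread.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Userspace sketch of strict single-value parsing. Returns 0 and stores
 * the value in *out on success, or a negative errno and leaves *out
 * untouched on failure. One trailing newline is accepted, since writes
 * from "echo" usually carry one.
 */
static int parse_ulong(const char *buf, unsigned long *out)
{
    char *end;
    unsigned long val;

    /* strtoul() would skip whitespace and silently wrap negatives. */
    if (*buf == '-' || isspace((unsigned char)*buf))
        return -EINVAL;

    errno = 0;
    val = strtoul(buf, &end, 0);
    if (end == buf)
        return -EINVAL;         /* no digits at all */
    if (errno == ERANGE)
        return -ERANGE;         /* value does not fit */
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -EINVAL;         /* trailing garbage */

    *out = val;
    return 0;
}
```

A caller in the style of store_freq() would bail out on a non-zero return before touching the stored frequency, which is the behaviour both the kstrtol version and the checked sscanf version aim for.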


Re: [PATCH] Allow automatic kernel taint on unsigned module load to be disabled

2017-08-06 Thread Rusty Russell
Matthew Garrett  writes:
> On Sun, Aug 6, 2017 at 7:49 PM, Rusty Russell  wrote:
>> Matthew Garrett  writes:
>>> Binary modules will still be tainted by the license checker. The issue
>>> is that if you want to enforce module signatures under *some*
>>> circumstances, you need to build with CONFIG_MODULE_SIG
>>
>> Not at all!  You can validate them in userspace.
>
> And then you need an entire trusted userland, at which point you can
> assert that the modules are trustworthy without having to validate
> them so you don't need CONFIG_MODULE_SIG anyway.

Yep.  But your patch already gives userland that power, to silently load
unsigned modules.

>>> but that
>>> changes the behaviour of the kernel even when you're not enforcing
>>> module signatures. The same kernel may be used in environments where
>>> you can verify the kernel and environments where you can't, and in the
>>> latter you may not care that modules are unsigned. In that scenario,
>>> tainting doesn't buy you anything.
>>
>> With your patch, you don't get tainting in the environment where you can
>> verify.
>
> You don't anyway, do you? Loading has already failed before this point
> if sig_enforce is set.

No.  You used to get a warning and a taint when you had a kernel
configured to expect signatures and it didn't get one.  You want to
remove that warning, to silently accept unsigned modules.

>> You'd be better adding a sysctl or equiv. to turn off force loading, and
>> use that in your can-verify system.
>
> I'm not sure what you mean by "force loading" here - if sig_enforce is
> set, you can't force load an unsigned module. If sig_enforce isn't
> set, you'll taint regardless of whether or not you force.
>
> Wait. Hang on - are you confusing CONFIG_MODULE_SIG with CONFIG_MODVERSIONS?

No, I mean stripping the signatures.  (I thought modprobe could do this
these days, but apparently not!)

So, you're actually building the same kernel, but building two sets of
modules: one without signatures, one with?

And when deploying the one with signatures, you're setting sig_enforce.
On the other, you don't want signatures because um, reasons?  And you
want to suppress the message?

This seems so convoluted already, I can see how you considered an
upstream patch your most productive path forward.

But it's possible that this scenario makes sense to Jeyu and I'm just
incapable of seeing its beauty?

Cheers,
Rusty.
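
The disagreement above is easier to follow with the policy written out. The following is a simplified userspace model, not the kernel's code: the struct, the enum, and load_module_model() are hypothetical names, and `taint_unsigned` stands for whatever switch the proposed patch would add.

```c
#include <stdbool.h>

enum load_result { LOAD_REJECTED, LOAD_OK, LOAD_OK_TAINTED };

/* Hypothetical policy knobs modelled on the discussion in this thread. */
struct sig_policy {
	bool sig_enforce;	/* refuse modules without a valid signature */
	bool taint_unsigned;	/* taint when an unsigned module is accepted */
};

/* Decide what happens to a module given the policy and its signature. */
static enum load_result load_module_model(const struct sig_policy *p,
					  bool signature_ok)
{
	if (signature_ok)
		return LOAD_OK;
	if (p->sig_enforce)
		return LOAD_REJECTED;	/* loading fails before any taint */
	return p->taint_unsigned ? LOAD_OK_TAINTED : LOAD_OK;
}
```

In this model the case being argued about is `sig_enforce == false` with `taint_unsigned == false`: an unsigned module then loads with no taint and no warning.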


Re: Suspend-resume failure on Intel Eagle Lake Core2Duo

2017-08-06 Thread Masahiro Yamada
Hi Marc,


2017-08-03 22:30 GMT+09:00 Marc Zyngier :
> On 03/08/17 13:52, Masahiro Yamada wrote:
>> Hi Marc,
>>
>> 2017-08-03 17:41 GMT+09:00 Marc Zyngier :
>>> Hi Masahiro,
>>>
>>> On 03/08/17 08:32, Masahiro Yamada wrote:
>>>> Hi.
>>>>
>>>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner :
>>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>>>> Can you please remove the patch. And try the following:
>>>>>>>
>>>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>>>
>>>>>>> # echo mem > /sys/power/state
>>>>>>>
>>>>>>> and log the output of the serial console. That way we might get a clue
>>>>>>> where it gets stuck.
>>>>>>
>>>>>> I'm afraid it hangs right away. No response from SSH, no output to
>>>>>> serial.
>>>>>
>>>>> What means hangs right away? Is there no output at all on the serial
>>>>> console? Or does it just stop at some point?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> tglx
>>>>>
>>>>
>>>> Sorry for jumping in.
>>>> Finally, I found this thread.
>>>>
>>>>
>>>> My environment is completely different (ARM64 board),
>>>> I am also suffering from a hibernation problem
>>>> since this commit.
>>>>
>>>>
>>>> I get no response on the serial console
>>>> after "Restarting tasks ... done." log message.
>>>>
>>>>
>>>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
>>>> irq function calls"), I can get hibernation working again.
>>>>
>>>>
>>>> SW info:
>>>> defconfig:  arch/arm64/configs/defconfig
>>>> DT   :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>>>> PSCI :  ARM Trusted Firmware
>>>>
>>>>
>>>> SoC info:
>>>> CPU  :  Cortex-A72 * 2 + Cortex-A53 * 2
>>>> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)
>>>
>>> Let me take an educated guess: It feels like your firmware doesn't
>>> save/restore the GIC context across suspend/resume. Is that something
>>> you could check, assuming you have access to the firmware source code?
>>
>> Thanks for your comments.
>>
>>
>> I do not know much about the manner of preserving GICv3 context.
>>
>> I can see this patch  (rejected?) :
>> https://patchwork.kernel.org/patch/9343061/
>>
>>
>> Is it something that should be completely cared by firmware
>> instead of kernel?
>
> That was definitely the intention, but it looks like something that ATF
> has only started supporting very recently:
>
> https://github.com/ARM-software/arm-trusted-firmware/pull/1047
>
>> ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
>> is open source software, and I pushed my platform code to the upstream.
>>
>> So, yes, I (and everybody) can have access to the firmware source code.
>>
>>
>> I am not sure how ATF saves the context during hibernation, though.
>
> See the above link. Is there any chance of you trying this into your
> firmware?
>
> Thanks,

Thanks for the pointer.


Yes.  I will try that once GIC-v3 context save/restore is supported in ATF.

I think that will basically work for suspend-to-ram
because all contexts including both non-secure and secure worlds will
be retained in the main memory.

However, I still do not understand how the context is preserved during
the hibernation (suspend-to-disk).


If my understanding is correct, hibernation on Linux works as follows:

[1] Freeze all tasks
[2] CPU_OFF for non-boot CPUs
[3] Create a hibernation image
[4] CPU_ON for non-boot CPUs
[5] Write the hibernation image to the disk (=swap area)
[6] SYSTEM_OFF


IIUC, [5] only writes the context Linux takes care of (only non-secure).

If so, where and how does the firmware write the GIC-v3 context
to the disk?


-- 
Best Regards
Masahiro Yamada


Re: [PATCH net-next] net: dsa: User per-cpu 64-bit statistics

2017-08-06 Thread David Miller
From: Florian Fainelli 
Date: Thu,  3 Aug 2017 21:33:27 -0700

> During testing with a background iperf pushing 1Gbit/sec worth of
> traffic and having both ifconfig and ethtool collect statistics, we
> could see quite frequent deadlocks. Convert the often accessed DSA slave
> network devices statistics to per-cpu 64-bit statistics to remove these
> deadlocks and provide fast efficient statistics updates.
> 
> Signed-off-by: Florian Fainelli 

Applied with appropriate Fixes: tag added.

Thanks.
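
As a rough illustration of why per-CPU counters sidestep the contention described in the patch, here is a minimal single-threaded userspace sketch. It is not the DSA code: NR_SLOTS, slot_stats, account_rx() and read_totals() are hypothetical names, and the sketch ignores what the kernel's per-CPU allocator and u64_stats_sync helpers take care of (real per-CPU storage and consistent 64-bit reads on 32-bit machines).

```c
#include <stdint.h>

#define NR_SLOTS 4	/* stand-in for the number of CPUs (assumption) */

struct slot_stats {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* One private slot per CPU: a writer only ever touches its own slot. */
static struct slot_stats stats[NR_SLOTS];

/* Hot path: update the counters that belong to the calling CPU. */
static void account_rx(unsigned int cpu, uint64_t len)
{
	stats[cpu].rx_packets++;
	stats[cpu].rx_bytes += len;
}

/* Slow path: readers fold all slots into one total on demand. */
static struct slot_stats read_totals(void)
{
	struct slot_stats total = { 0, 0 };
	unsigned int cpu;

	for (cpu = 0; cpu < NR_SLOTS; cpu++) {
		total.rx_packets += stats[cpu].rx_packets;
		total.rx_bytes += stats[cpu].rx_bytes;
	}
	return total;
}
```

The design trade-off is that updates stay cheap and local while the summation cost moves to the comparatively rare readers such as ifconfig and ethtool.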


Re: [PATCH v9 0/4] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

2017-08-06 Thread Ding Tianhong


On 2017/8/7 11:47, David Miller wrote:
> From: Ding Tianhong 
> Date: Sat, 5 Aug 2017 15:15:09 +0800
> 
>> Some devices have problems with Transaction Layer Packets with the Relaxed
>> Ordering Attribute set.  This patch set adds a new PCIe Device Flag,
>> PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known
>> devices with Relaxed Ordering issues, and a use of this new flag by the
>> cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex
>> Ports.
>>
>> It's been years since I've submitted kernel.org patches, I apologise for the
>> almost certain submission errors.
> 
> Which tree should merge this?  The PCI tree or my networking tree?
> 

Hi David:

I think merging it through the networking tree is the better choice, as it
is mainly used to tell the NIC drivers how to use the Relaxed Ordering
Attribute, and later we need to send a patch to enable RO for the ixgbe
driver based on this patch. But I am not sure whether Bjorn has his own
view. :)

Hi Bjorn:

Could you help review this patch or give some feedback?

Thanks
Ding
> .
> 



Re: [PATCH v2] drm: dw-hdmi-i2s: add missing company name on Copyright

2017-08-06 Thread Archit Taneja



On 08/07/2017 09:39 AM, Kuninori Morimoto wrote:


From: Kuninori Morimoto 

This driver's Copyright is under Renesas Solutions Corp.
This patch updates the year, because this driver was moved
into synopsys folder in 2017.


Thanks. Queued to drm-misc-next.

Archit



Signed-off-by: Kuninori Morimoto 
---
v1 -> v2

  - update year 2016 -> 2017

  drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
index b2cf59f..3b7e5c5 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
@@ -1,7 +1,8 @@
  /*
   * dw-hdmi-i2s-audio.c
   *
- * Copyright (c) 2016 Kuninori Morimoto 
+ * Copyright (c) 2017 Renesas Solutions Corp.
+ * Kuninori Morimoto 
   *
   * This program is free software; you can redistribute it and/or modify
   * it under the terms of the GNU General Public License version 2 as



--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH v2] drm: dw-hdmi-i2s: add missing company name on Copyright

2017-08-06 Thread Kuninori Morimoto

From: Kuninori Morimoto 

This driver's Copyright is under Renesas Solutions Corp.
This patch updates the year, because this driver was moved
into synopsys folder in 2017.

Signed-off-by: Kuninori Morimoto 
---
v1 -> v2

 - update year 2016 -> 2017

 drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
index b2cf59f..3b7e5c5 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
@@ -1,7 +1,8 @@
 /*
  * dw-hdmi-i2s-audio.c
  *
- * Copyright (c) 2016 Kuninori Morimoto 
+ * Copyright (c) 2017 Renesas Solutions Corp.
+ * Kuninori Morimoto 
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
-- 
1.9.1



Re: [PATCH][resend] drm: dw-hdmi-i2s: add missing company name on Copyright

2017-08-06 Thread Kuninori Morimoto

Hi Archit

> >> On 08/07/2017 07:41 AM, Kuninori Morimoto wrote:
> >>>
> >>> From: Kuninori Morimoto 
> >>>
> >>> This driver's Copyright is under Renesas Solutions Corp
> >>
> >> Can we update the year to 2017 while we're at it?
> >
> > The original patch was created and applied on 2016
> >
> > 2761ba6c0925ca9c5b917a95f68135d9dce443fb
> > ("drm: bridge: add DesignWare HDMI I2S audio support")
> >
> > And moved into new synopsys folder on 2017, I think.
> 
> We're allowed to update the copyright year as we continue to
> make changes to a file. So, I think updating to 2017 should be
> okay.

OK, will do in v2


Best regards
---
Kuninori Morimoto


Re: [PATCH 2/5] edac: synopsys: Add EDAC ECC support for ZynqMP DDRC

2017-08-06 Thread Borislav Petkov
On Fri, Aug 04, 2017 at 02:00:24PM +0200, Michal Simek wrote:
> From: Naga Sureshkumar Relli 
> 
> This patch adds EDAC ECC support for ZynqMP DDRC IP
> 
> Signed-off-by: Naga Sureshkumar Relli 
> Signed-off-by: Michal Simek 
> ---
> 
>  drivers/edac/Kconfig |   2 +-
>  drivers/edac/synopsys_edac.c | 306 
> ++-
>  2 files changed, 302 insertions(+), 6 deletions(-)

...

> @@ -440,9 +706,12 @@ static int synps_edac_mc_init(struct mem_ctl_info *mci,
>   mci->dev_name = SYNPS_EDAC_MOD_STRING;
>   mci->mod_name = SYNPS_EDAC_MOD_VER;
>   mci->mod_ver = "1";
> -
> - edac_op_state = EDAC_OPSTATE_POLL;
> - mci->edac_check = synps_edac_check;
> + if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
> + edac_op_state = EDAC_OPSTATE_INT;
> + } else {
> + edac_op_state = EDAC_OPSTATE_POLL;
> + mci->edac_check = synps_edac_check;
> + }
>   mci->ctl_page_to_phys = NULL;
>  
>   status = synps_edac_init_csrows(mci);

This hunk doesn't apply cleanly:

$ test-apply.sh -q 
/tmp/02-edac-synopsys-add_edac_ecc_support_for_zynqmp_ddrc.patch 
checking file drivers/edac/Kconfig
checking file drivers/edac/synopsys_edac.c
Hunk #11 FAILED at 706.
Hunk #12 succeeded at 723 (offset -1 lines).
Hunk #13 succeeded at 754 (offset -1 lines).
Hunk #14 succeeded at 803 (offset -1 lines).
1 out of 14 hunks FAILED

Please redo your patches against this branch:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=for-next

Thx.

> @@ -458,8 +727,18 @@ static int synps_edac_mc_init(struct mem_ctl_info *mci,
>   .quirks = 0,
>  };
>  
> +static const struct synps_platform_data zynqmp_enh_edac_def = {
> + .synps_edac_geterror_info   = synps_enh_edac_geterror_info,
> + .synps_edac_get_mtype   = synps_enh_edac_get_mtype,
> + .synps_edac_get_dtype   = synps_enh_edac_get_dtype,
> + .synps_edac_get_eccstate= synps_enh_edac_get_eccstate,
> + .quirks = DDR_ECC_INTR_SUPPORT,
> +};
> +
>  static const struct of_device_id synps_edac_match[] = {
>   { .compatible = "xlnx,zynq-ddrc-a05", .data = (void *)&zynq_edac_def },
> + { .compatible = "xlnx,zynqmp-ddrc-2.40a",
> + .data = (void *)&zynqmp_enh_edac_def},

WARNING: DT compatible string "xlnx,zynqmp-ddrc-2.40a" appears un-documented -- 
check ./Documentation/devicetree/bindings/
#414: FILE: drivers/edac/synopsys_edac.c:740:
+   { .compatible = "xlnx,zynqmp-ddrc-2.40a",

Please integrate checkpatch.pl into your patch creation workflow.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH v4 net-next 02/13] nfp: change bpf verifier hooks to match new verifier data structures

2017-08-06 Thread David Miller
From: Edward Cree 
Date: Thu, 3 Aug 2017 17:11:34 +0100

> Signed-off-by: Edward Cree 

Sorry, this doesn't work.

The entire source tree must compile properly after each patch in the
patch series.

So if you change a datastructure, you have to update all of the users
in that patch to keep everything compiling and working.


[regression] tty console panic for 4.13-rcx

2017-08-06 Thread Shawn Lin

Hi,

I saw the log at the bottom and bisected the issue to these commits:

065ea0a7afd64d6c ("tty: improve tty_insert_flip_char() slow path")
979990c628481461 ("tty: improve tty_insert_flip_char() fast path")

I can reproduce this nearly 100% of the time. Any thoughts?


 [  154.823106] Unable to handle kernel NULL pointer dereference at 
virtual address 000d

[  154.823885] user pgtable: 4k pages, 48-bit VAs, pgd = 800066e79000
[  154.824464] [000d] *pgd=6768a003, 
*pud=6a7da003, *pmd=669c3003, *pte=

[  154.825460] Internal error: Oops: 9607 [#1] PREEMPT SMP
[  154.825957] Modules linked in:
[  154.826258] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.13.0-rc3-next-20170802-2-gd66440a-dirty #112

[  154.827091] Hardware name: Firefly-RK3399 Board (DT)
[  154.827539] task: 28f42b00 task.stack: 28f3
[  154.828083] PC is at llist_del_first+0x8/0x74
[  154.828481] LR is at __tty_buffer_request_room+0x114/0x148
[  154.828972] pc : [] lr : [] 
pstate: 61c5

[  154.829625] sp : 80007ef10d00
[  154.829925] x29: 80007ef10d00 x28: 28f42b00
[  154.830409] x27: 28cc7458 x26: 28cc7430
[  154.830892] x25: 0026 x24: 
[  154.831373] x23:  x22: 0001
[  154.831854] x21: 80006b37b600 x20: 0100
[  154.832337] x19: 80006a8a5840 x18: 
[  154.832819] x17:  x16: 2821b398
[  154.833300] x15:  x14: 3d097d00
[  154.833781] x13: 00017700 x12: 009a
[  154.834263] x11: 7fff x10: 0002
[  154.834744] x9 : 0003 x8 : 
[  154.835225] x7 : 003d0900 x6 : 
[  154.835706] x5 : 0100 x4 : 0200
[  154.836187] x3 : 0001 x2 : 000d
[  154.836668] x1 : 0200 x0 : 80006a8a58b0
[  154.837153] Process swapper/0 (pid: 0, stack limit = 0x28f3)
[  154.837750] Stack: (0x80007ef10d00 to 0x28f34000)
[  154.838261] Call trace:
[  154.838493] Exception stack(0x80007ef10b30 to 0x80007ef10c60)
[  154.839068] 0b20:   80006a8a5840 
0001
[  154.839766] 0b40: 80007ef10d00 283b813c 800069219180 
80007ef1c968
[  154.840464] 0b60: 80007ef10b70 280f31d0 80007ef10bf0 
280e7cf0
[  154.841161] 0b80: 80007ef1c900 80007ef1c900 0004 
01c0
[  154.841859] 0ba0: 80007ef1c900 28f39ba8  
0001
[  154.842557] 0bc0: 80007ef10be0 280e7e5c 80006a8a58b0 
0200
[  154.843252] 0be0: 000d 0001 0200 
0100
[  154.843949] 0c00:  003d0900  
0003
[  154.844645] 0c20: 0002 7fff 009a 
00017700
[  154.845342] 0c40: 3d097d00  2821b398 


[  154.846040] [] llist_del_first+0x8/0x74
[  154.846528] [] __tty_insert_flip_char+0x2c/0x78
[  154.847076] [] uart_insert_char+0x54/0x13c
[  154.847589] [] serial8250_rx_chars+0x98/0x1e8
[  154.848124] [] serial8250_handle_irq.part.23+0x70/0xec
[  154.848725] [] serial8250_handle_irq+0x14/0x24
[  154.849264] [] dw8250_handle_irq+0x40/0xfc
[  154.849774] [] serial8250_interrupt+0x6c/0xec
[  154.850309] [] __handle_irq_event_percpu+0xa0/0x128
[  154.850887] [] handle_irq_event_percpu+0x1c/0x54
[  154.851442] [] handle_irq_event+0x44/0x74
[  154.851947] [] handle_fasteoi_irq+0x9c/0x154
[  154.852470] [] generic_handle_irq+0x24/0x38
[  154.852986] [] __handle_domain_irq+0x60/0xac
[  154.853510] [] gic_handle_irq+0xd4/0x17c
[  154.854001] Exception stack(0x28f33da0 to 0x28f33ed0)
[  154.854576] 3da0:   0001
[  154.855273] 3dc0:  600075ff6000 0001
[  154.855970] 3de0: 28f43560 28f33e50 0a00
[  154.85] 3e00: 014f ffd42070 f7934d7b
[  154.857365] 3e20: 2821b398   28f17000
[  154.858064] 3e40: 28f39000 28f39000 28f25820 28f39e98
[  154.858761] 3e60:   28f42b00
[  154.859458] 3e80: 02e00018 28f33ed0 280852ec 28f33ed0
[  154.860156] 3ea0: 280852f0 6145 7df19878 7ffa7010
[  154.860851] 3ec0:  2813744c
[  154.861293] [] el1_irq+0xb4/0x128
[  154.861737] [] arch_cpu_idle+0x10/0x18
[  154.86] [] default_idle_call+0x18/0x2c
[  154.862735] [] do_idle+0x170/0x1fc
[  154.863187] [] cpu_startup_entry+0x1c/0x24
[  

Re: [PATCH][resend] drm: dw-hdmi-i2s: add missing company name on Copyright

2017-08-06 Thread Archit Taneja



On 08/07/2017 09:25 AM, Kuninori Morimoto wrote:
> Hi Archit
>
> Thank you for your feedback
>
> > On 08/07/2017 07:41 AM, Kuninori Morimoto wrote:
> > >
> > > From: Kuninori Morimoto 
> > >
> > > This driver's Copyright is under Renesas Solutions Corp
> >
> > Can we update the year to 2017 while we're at it?
>
> The original patch was created and applied in 2016
>
> 2761ba6c0925ca9c5b917a95f68135d9dce443fb
> ("drm: bridge: add DesignWare HDMI I2S audio support")
>
> It was moved into the new synopsys folder in 2017, I think.

We're allowed to update the copyright year as we continue to
make changes to a file. So, I think updating to 2017 should be
okay.

Archit

> Best regards
> ---
> Kuninori Morimoto


--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH][resend] drm: dw-hdmi-i2s: add missing company name on Copyright

2017-08-06 Thread Kuninori Morimoto

Hi Archit

Thank you for your feedback

> On 08/07/2017 07:41 AM, Kuninori Morimoto wrote:
> >
> > From: Kuninori Morimoto 
> >
> > This driver's Copyright is under Renesas Solutions Corp
> 
> Can we update the year to 2017 while we're at it?

The original patch was created and applied in 2016

2761ba6c0925ca9c5b917a95f68135d9dce443fb
("drm: bridge: add DesignWare HDMI I2S audio support")

It was moved into the new synopsys folder in 2017, I think.


Best regards
---
Kuninori Morimoto


Re: [PATCH] intel-vbtn: match power button on press rather than release

2017-08-06 Thread Darren Hart
On Mon, Aug 07, 2017 at 08:59:30AM +0800, AceLan Kao wrote:
> Looks like I'm one hour late to ack the patch.
> Thanks anyway for the quick response.

Thanks for chiming in all the same - and normally I'd have provided for
more time. In this case, I will be away for a few days, and it was
important to get this in sooner rather than later in the RC cycle.

-- 
Darren Hart
VMware Open Source Technology Center


[PATCH v6 0/2] Make find_later_rq() choose a closer cpu in topology

2017-08-06 Thread Byungchul Park
When cpudl_find() returns an arbitrary cpu from free_cpus, that cpu might
not be the closest one with respect to the sched domains. For example:

   this_cpu: 15
   free_cpus: 0, 1,..., 14 (== later_mask)
   best_cpu: 0

   topology:

   0 --+
       +--+
   1 --+  |
          +-- ... --+
   2 --+  |         |
       +--+         |
   3 --+            |

   ...             ...

   12 --+           |
        +--+        |
   13 --+  |        |
           +-- ... -+
   14 --+  |
        +--+
   15 --+

In this case, it would be best to select 14, since it is a free cpu and
the closest to 15 (this_cpu). However, the code currently selects 0 (best_cpu),
even though that is just an arbitrary member of free_cpus. Fix it.

Change from v5
   -. exclude two patches already picked up by peterz
  (sched/deadline: Make find_later_rq() choose a closer cpu in topology)
  (sched/deadline: Change return value of cpudl_find())
   -. apply what peterz fixed for 'prefer sibling', into deadline and rt

Change from v4
   -. remove a patch that might cause huge lock contention
  (by spin lock() in a hot path of scheduler)

Change from v3
   -. rename closest_cpu to best_cpu so that it aligns with rt
   -. protect referring cpudl.elements with cpudl.lock
   -. change return value of cpudl_find() to bool

Change from v2
   -. add support for SD_PREFER_SIBLING

Change from v1
   -. clean up the patch

Byungchul Park (2):
  sched/deadline: Add support for SD_PREFER_SIBLING on find_later_rq()
  sched/rt: Add support for SD_PREFER_SIBLING on find_lowest_rq()

 kernel/sched/deadline.c | 46 +++++++++++++++++++++++++++++++++++++++++++---
 kernel/sched/rt.c       | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 87 insertions(+), 6 deletions(-)

-- 
1.9.1
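
For readers following the cover letter's example, here is a minimal
user-space sketch of the "choose the closest free cpu" idea. It is not
the kernel code: it assumes a strictly binary topology of 16 cpus
(pairs, then groups of 4, 8 and 16) and uses a plain 64-bit bitmask in
place of struct cpumask. The names domain_span() and closest_free_cpu()
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 16	/* assumption: matches the 0..15 example above */

/*
 * Span of the domain at 'level' that contains 'cpu'.
 * Level 0 covers 2 cpus, level 1 covers 4, and so on (assumed topology).
 */
static uint64_t domain_span(int cpu, int level)
{
	int size = 2 << level;
	int first = cpu - (cpu % size);

	return ((1ULL << size) - 1) << first;
}

/*
 * Walk the domains from the smallest to the largest and return the
 * lowest-numbered free cpu found in the first domain that has one.
 * Returns -1 if free_cpus is empty.
 */
static int closest_free_cpu(int this_cpu, uint64_t free_cpus)
{
	assert(this_cpu >= 0 && this_cpu < NR_CPUS);

	for (int level = 0; (2 << level) <= NR_CPUS; level++) {
		uint64_t candidates = free_cpus & domain_span(this_cpu, level);

		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			if (candidates & (1ULL << cpu))
				return cpu;
	}

	return -1;
}
```

With this_cpu = 15 and free_cpus = 0..14, the walk stops at the
innermost domain {14, 15} and picks 14 rather than 0.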



[PATCH v6 2/2] sched/rt: Add support for SD_PREFER_SIBLING on find_lowest_rq()

2017-08-06 Thread Byungchul Park
It is better to avoid pushing tasks to another cpu within a
SD_PREFER_SIBLING domain. Instead, keep searching so that cpus in
other sibling domains get a chance to be checked first.

Signed-off-by: Byungchul Park 
---
 kernel/sched/rt.c | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 979b734..50639e5 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1618,12 +1618,35 @@ static struct task_struct *pick_highest_pushable_task(struct rq *rq, int cpu)
 
 static DEFINE_PER_CPU(cpumask_var_t, local_cpu_mask);
 
+/*
+ * Find the first cpu in: mask & sd & ~prefer
+ */
+static int find_cpu(const struct cpumask *mask,
+		    const struct sched_domain *sd,
+		    const struct sched_domain *prefer)
+{
+	const struct cpumask *sds = sched_domain_span(sd);
+	const struct cpumask *ps  = prefer ? sched_domain_span(prefer) : NULL;
+	int cpu = -1;
+
+	while ((cpu = cpumask_next(cpu, mask)) < nr_cpu_ids) {
+		if (!cpumask_test_cpu(cpu, sds))
+			continue;
+		if (ps && cpumask_test_cpu(cpu, ps))
+			continue;
+		break;
+	}
+
+	return cpu;
+}
+
 static int find_lowest_rq(struct task_struct *task)
 {
-	struct sched_domain *sd;
+	struct sched_domain *sd, *prefer = NULL;
 	struct cpumask *lowest_mask = this_cpu_cpumask_var_ptr(local_cpu_mask);
 	int this_cpu = smp_processor_id();
 	int cpu      = task_cpu(task);
+	int fallback_cpu = -1;
 
 	/* Make sure the mask is initialized first */
 	if (unlikely(!lowest_mask))
@@ -1668,9 +1691,20 @@ static int find_lowest_rq(struct task_struct *task)
 				return this_cpu;
 			}
 
-			best_cpu = cpumask_first_and(lowest_mask,
-						     sched_domain_span(sd));
+			best_cpu = find_cpu(lowest_mask, sd, prefer);
+
 			if (best_cpu < nr_cpu_ids) {
+				/*
+				 * If current domain is SD_PREFER_SIBLING
+				 * flagged, we have to get more chances to
+				 * check other siblings.
+				 */
+				if (sd->flags & SD_PREFER_SIBLING) {
+					prefer = sd;
+					if (fallback_cpu == -1)
+						fallback_cpu = best_cpu;
+					continue;
+				}
 				rcu_read_unlock();
 				return best_cpu;
 			}
@@ -1679,6 +1713,13 @@ static int find_lowest_rq(struct task_struct *task)
 	rcu_read_unlock();
 
 	/*
+	 * If fallback_cpu is valid, all our guesses failed *except* for
+	 * SD_PREFER_SIBLING domain. Now, we can return the fallback cpu.
+	 */
+	if (fallback_cpu != -1)
+		return fallback_cpu;
+
+	/*
 	 * And finally, if there were no matches within the domains
 	 * just give the caller *something* to work with from the compatible
 	 * locations.
-- 
1.9.1
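
The selection rule in the patch above ("first cpu in mask & sd & ~prefer",
plus a remembered fallback) can be tried outside the kernel with plain
bitmasks. The following is only a sketch under stated assumptions:
struct domain, PREFER_SIBLING, first_cpu_excluding() and pick_cpu() are
hypothetical stand-ins for struct sched_domain, SD_PREFER_SIBLING,
find_cpu() and the domain loop, and the SD_WAKE_AFFINE and this_cpu
checks of the real function are left out.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS		64
#define PREFER_SIBLING	0x1	/* stand-in for SD_PREFER_SIBLING */

struct domain {
	uint64_t span;		/* cpus covered by this domain */
	unsigned int flags;
};

/* First cpu in mask & sd_span & ~prefer_span, or NR_CPUS if there is none. */
static int first_cpu_excluding(uint64_t mask, uint64_t sd_span,
			       uint64_t prefer_span)
{
	uint64_t candidates = mask & sd_span & ~prefer_span;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (candidates & (1ULL << cpu))
			return cpu;

	return NR_CPUS;
}

/*
 * Walk the domains from the innermost to the outermost. A hit inside a
 * PREFER_SIBLING domain is only remembered as a fallback; the walk goes
 * on, excluding that domain's span, to look for a cpu further out.
 */
static int pick_cpu(uint64_t mask, const struct domain *doms, int nr_doms)
{
	uint64_t prefer_span = 0;
	int fallback_cpu = -1;

	assert(nr_doms >= 0);

	for (int i = 0; i < nr_doms; i++) {
		int best = first_cpu_excluding(mask, doms[i].span, prefer_span);

		if (best >= NR_CPUS)
			continue;

		if (doms[i].flags & PREFER_SIBLING) {
			prefer_span = doms[i].span;
			if (fallback_cpu == -1)
				fallback_cpu = best;
			continue;
		}

		return best;
	}

	return fallback_cpu;	/* -1 if nothing matched at all */
}
```

For example, with an inner PREFER_SIBLING domain covering cpus {0, 1}, an
outer domain covering {0..3} and a candidate mask of {1, 2, 3}, the sketch
skips cpu 1 and returns cpu 2. If the mask is just {1}, the outer domain
yields nothing and the fallback, cpu 1, is returned.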



[PATCH v6 1/2] sched/deadline: Add support for SD_PREFER_SIBLING on find_later_rq()

2017-08-06 Thread Byungchul Park
It is better to avoid pushing tasks to another cpu within a
SD_PREFER_SIBLING domain. Instead, keep searching so that cpus in
other sibling domains get a chance to be checked first.

Signed-off-by: Byungchul Park 
---
 kernel/sched/deadline.c | 46 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 43 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 0223694..2fd1591 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1319,12 +1319,35 @@ static struct task_struct *pick_earliest_pushable_dl_task(struct rq *rq, int cpu
 
 static DEFINE_PER_CPU(cpumask_var_t, local_cpu_mask_dl);
 
+/*
+ * Find the first cpu in: mask & sd & ~prefer
+ */
+static int find_cpu(const struct cpumask *mask,
+		    const struct sched_domain *sd,
+		    const struct sched_domain *prefer)
+{
+	const struct cpumask *sds = sched_domain_span(sd);
+	const struct cpumask *ps  = prefer ? sched_domain_span(prefer) : NULL;
+	int cpu = -1;
+
+	while ((cpu = cpumask_next(cpu, mask)) < nr_cpu_ids) {
+		if (!cpumask_test_cpu(cpu, sds))
+			continue;
+		if (ps && cpumask_test_cpu(cpu, ps))
+			continue;
+		break;
+	}
+
+	return cpu;
+}
+
 static int find_later_rq(struct task_struct *task)
 {
-	struct sched_domain *sd;
+	struct sched_domain *sd, *prefer = NULL;
 	struct cpumask *later_mask = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
 	int this_cpu = smp_processor_id();
 	int cpu = task_cpu(task);
+	int fallback_cpu = -1;
 
 	/* Make sure the mask is initialized first */
 	if (unlikely(!later_mask))
@@ -1376,8 +1399,7 @@ static int find_later_rq(struct task_struct *task)
 				return this_cpu;
 			}
 
-			best_cpu = cpumask_first_and(later_mask,
-							sched_domain_span(sd));
+			best_cpu = find_cpu(later_mask, sd, prefer);
 			/*
 			 * Last chance: if a cpu being in both later_mask
 			 * and current sd span is valid, that becomes our
@@ -1385,6 +1407,17 @@ static int find_later_rq(struct task_struct *task)
 			 * already under consideration through later_mask.
 			 */
 			if (best_cpu < nr_cpu_ids) {
+				/*
+				 * If current domain is SD_PREFER_SIBLING
+				 * flagged, we have to get more chances to
+				 * check other siblings.
+				 */
+				if (sd->flags & SD_PREFER_SIBLING) {
+					prefer = sd;
+					if (fallback_cpu == -1)
+						fallback_cpu = best_cpu;
+					continue;
+				}
 				rcu_read_unlock();
 				return best_cpu;
 			}
@@ -1393,6 +1426,13 @@ static int find_later_rq(struct task_struct *task)
 	rcu_read_unlock();
 
 	/*
+	 * If fallback_cpu is valid, all our guesses failed *except* for
+	 * SD_PREFER_SIBLING domain. Now, we can return the fallback cpu.
+	 */
+	if (fallback_cpu != -1)
+		return fallback_cpu;
+
+	/*
 	 * At this point, all our guesses failed, we just return
 	 * 'something', and let the caller sort the things out.
 	 */
-- 
1.9.1



Re: [RFC Part1 PATCH v3 12/17] x86/mm: DMA support for SEV memory encryption

2017-08-06 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 02:07:52PM -0500, Brijesh Singh wrote:
> From: Tom Lendacky 
> 
> DMA access to memory mapped as encrypted while SEV is active can not be
> encrypted during device write or decrypted during device read.

Yeah, definitely rewrite that sentence.

> In order
> for DMA to properly work when SEV is active, the SWIOTLB bounce buffers
> must be used.
> 
> Signed-off-by: Tom Lendacky 
> Signed-off-by: Brijesh Singh 
> ---
>  arch/x86/mm/mem_encrypt.c | 86 +++
>  lib/swiotlb.c             |  5 +--
>  2 files changed, 89 insertions(+), 2 deletions(-)

...

> @@ -202,6 +280,14 @@ void __init mem_encrypt_init(void)
>   /* Call into SWIOTLB to update the SWIOTLB DMA buffers */
>   swiotlb_update_mem_attributes();
>  
> + /*
> +  * With SEV, DMA operations cannot use encryption. New DMA ops
> +  * are required in order to mark the DMA areas as decrypted or
> +  * to use bounce buffers.
> +  */
> +	if (sev_active())
> +		dma_ops = &sme_dma_ops;

Well, we do differentiate between SME and SEV and the check is
sev_active but the ops are called sme_dma_ops. Call them sev_dma_ops
instead for less confusion.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


Re: [PATCH v9 0/4] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

2017-08-06 Thread David Miller
From: Ding Tianhong 
Date: Sat, 5 Aug 2017 15:15:09 +0800

> Some devices have problems with Transaction Layer Packets with the Relaxed
> Ordering Attribute set.  This patch set adds a new PCIe Device Flag,
> PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known
> devices with Relaxed Ordering issues, and a use of this new flag by the
> cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex
> Ports.
> 
> It's been years since I've submitted kernel.org patches, I apologise for the
> almost certain submission errors.

Which tree should merge this?  The PCI tree or my networking tree?


Re: [PATCH v2 4/5] PCI: mediatek: Add new generation controller support

2017-08-06 Thread Honghui Zhang
On Sat, 2017-08-05 at 14:16 +0800, Ryder Lee wrote:
> On Sat, 2017-08-05 at 12:52 +0800, Ryder Lee wrote:
> > Hi Honghui, Bjorn,
> > 
> > On Fri, 2017-08-04 at 08:18 -0500, Bjorn Helgaas wrote:
> > > On Fri, Aug 04, 2017 at 04:39:36PM +0800, Honghui Zhang wrote:
> > > > On Thu, 2017-08-03 at 17:42 -0500, Bjorn Helgaas wrote:
> > > > > > +
> > > > > > +static struct mtk_pcie_port *mtk_pcie_find_port(struct mtk_pcie *pcie,
> > > > > > +						struct pci_bus *bus, int devfn)
> > > > > > +{
> > > > > > +	struct pci_dev *dev;
> > > > > > +	struct pci_bus *pbus;
> > > > > > +	struct mtk_pcie_port *port, *tmp;
> > > > > > +
> > > > > > +	list_for_each_entry_safe(port, tmp, &pcie->ports, list) {
> > > > > > +		if (bus->number == 0 && port->index == PCI_SLOT(devfn)) {
> > > > > > +			return port;
> > > > > > +		} else if (bus->number != 0) {
> > > > > > +			pbus = bus;
> > > > > > +			do {
> > > > > > +				dev = pbus->self;
> > > > > > +				if (port->index == PCI_SLOT(dev->devfn))
> > > > > > +					return port;
> > > > > > +				pbus = dev->bus;
> > > > > > +			} while (dev->bus->number != 0);
> > > > > > +		}
> > > > > > +	}
> > > > > > +
> > > > > > +	return NULL;
> > > > > 
> > > > > You should be able to use sysdata to avoid searching the list.
> > > > > See drivers/pci/host/pci-aardvark.c, for example.
> > > > > 
> > > > 
> > > > I could put the mtk_pcie * in sysdata, but I would still need to search the
> > > > list to get the mtk_pcie_port *. How about:
> > > > 
> > > > list_for_each_entry_safe(port, tmp, &pcie->ports, list) {
> > > > 	if (port->index == PCI_SLOT(devfn))
> > > > 		return port;
> > > > }
> > > 
> > > No.  Other drivers don't need to search the list.  Please take a look
> > > at them and see how they solve this problem.  I don't think your
> > > hardware is fundamentally different in a way that means you need to
> > > search when the others don't.
> > > 
> > 
> > I'm not directly involved in this generation, but I guess the main reason
> > Honghui needs to do that is that this hardware accesses configuration
> > space via per-port registers, not just for the guard.
> > Currently, we have a host bridge with two ports (two subnodes in the
> > binding text), so he tried to tell them apart in order to get the correct
> > registers.
> > 
> > Some platforms don't need to do that since they just have a single port (no
> > more subnodes); others might have specific/shared registers to access
> > configuration space (e.g. Tegra, MTK legacy IP block).
> > Or he can split them into two independent nodes, but doing so would break
> > the common probing flow. (I'd prefer to use subnodes.)
> > 
> > Ryder
> > 
> 
> Sorry for the typesetting in the previous mail, and for the noise again.
> 
> I've taken a look at pci-rcar-gen2.c; it is a similar case to Honghui's
> that I could find. It gathers the reg regions of two ports into one and uses
> the "slot id" to calculate the cfg base of each port.
> 
> Perhaps this is an example for those who need to use subnodes and use
> port registers for cfg operations. I'm not sure whether it's worthwhile doing
> that, since we would need to change the ports/host structures.
> 
> Ryder.
> 
As Ryder described, MediaTek's new generation HW block has two
separate ports, each with its own control register base address. We
must touch the per-port control registers to access an EP's
configuration space: a port's control registers are the only way to
access the configuration space of the EP connected under that very
port.
Given an EP device, we need to determine which port it is connected
to, and get the base address for that port. It's a bit like
pci-tegra/pci-mvebu.

It seems a list is not forbidden: pci-tegra searches a list to identify
the port[1], and mvebu uses a pointer array to look up the port[2]; they
achieve the same functionality through different approaches. I can propose
another patch to make the code like mvebu[2] if you insist, but I prefer
the current list approach.

[1]http://elixir.free-electrons.com/linux/v4.13-rc4/source/drivers/pci/host/pci-tegra.c#L456
[2]http://elixir.free-electrons.com/linux/v4.13-rc4/source/drivers/pci/host/pci-mvebu.c#L780

thanks.
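
To make the lookup being discussed concrete, here is a small user-space
sketch of "given an EP, find the port it hangs under". It is a simplified
variant and not the driver code from the patch: it first climbs from the
EP's bus up to the root bus, then matches the slot seen on the root bus
against the port index. The struct layouts and the name find_port() are
hypothetical stand-ins for struct pci_bus, struct pci_dev and
struct mtk_pcie_port.

```c
#include <assert.h>
#include <stddef.h>

struct bus;

struct dev {
	struct bus *bus;	/* bus this device sits on */
	int slot;		/* slot number on that bus */
};

struct bus {
	int number;		/* 0 for the root bus */
	struct dev *self;	/* bridge leading to this bus, NULL for root */
};

struct port {
	int index;		/* slot of the port on the root bus */
};

/*
 * Return the port under which the device at (bus, slot) lives,
 * or NULL if no port has a matching index.
 */
static struct port *find_port(struct port *ports, size_t nr_ports,
			      const struct bus *bus, int slot)
{
	/* Climb to the root bus, remembering the bridge's slot each time. */
	while (bus->number != 0) {
		assert(bus->self != NULL);
		slot = bus->self->slot;
		bus = bus->self->bus;
	}

	for (size_t i = 0; i < nr_ports; i++)
		if (ports[i].index == slot)
			return &ports[i];

	return NULL;
}
```

With at most a handful of ports, the final loop could equally be an array
indexed by slot, which is roughly the pointer-array style mentioned for
mvebu above.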




Re: [PATCH][resend] drm: dw-hdmi-i2s: add missing company name on Copyright

2017-08-06 Thread Archit Taneja



On 08/07/2017 07:41 AM, Kuninori Morimoto wrote:


From: Kuninori Morimoto 

This driver's Copyright is under Renesas Solutions Corp


Can we update the year to 2017 while we're at it?

Archit



Signed-off-by: Kuninori Morimoto 
---
  drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
index b2cf59f..d487b6b 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
@@ -1,7 +1,8 @@
  /*
   * dw-hdmi-i2s-audio.c
   *
- * Copyright (c) 2016 Kuninori Morimoto 
+ * Copyright (c) 2016 Renesas Solutions Corp.
+ * Kuninori Morimoto 
   *
   * This program is free software; you can redistribute it and/or modify
   * it under the terms of the GNU General Public License version 2 as



--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH] cpufreq: Simplify cpufreq_can_do_remote_dvfs()

2017-08-06 Thread Viresh Kumar
On 04-08-17, 14:57, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> The if () in cpufreq_can_do_remote_dvfs() is superfluous, so drop
> it and simply return the value of the expression under it.
> 
> Signed-off-by: Rafael J. Wysocki 
> ---
> 
> On top of the current linux-next.
> 
> ---
>  include/linux/cpufreq.h |7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> Index: linux-pm/include/linux/cpufreq.h
> ===
> --- linux-pm.orig/include/linux/cpufreq.h
> +++ linux-pm/include/linux/cpufreq.h
> @@ -578,11 +578,8 @@ static inline bool cpufreq_can_do_remote
>* - dvfs_possible_from_any_cpu flag is set
>* - the local and remote CPUs share cpufreq policy
>*/
> - if (policy->dvfs_possible_from_any_cpu ||
> - cpumask_test_cpu(smp_processor_id(), policy->cpus))
> - return true;
> -
> - return false;
> + return policy->dvfs_possible_from_any_cpu ||
> + cpumask_test_cpu(smp_processor_id(), policy->cpus);
>  }
>  
>  /*

Acked-by: Viresh Kumar 

-- 
viresh
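
The simplification above relies on the fact that "if (x) return true;
return false;" and "return x;" are equivalent for a boolean expression.
A small user-space sketch of that equivalence, assuming a hypothetical
mock_policy type and a plain bitmask in place of the kernel's cpumask
(none of these names come from the patch):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields the real helper looks at. */
struct mock_policy {
	bool dvfs_possible_from_any_cpu;
	unsigned long cpus;	/* bitmask of CPUs sharing this policy */
};

/* Stand-in for cpumask_test_cpu(); cpu must be below the width of long. */
static bool cpu_in_policy(const struct mock_policy *p, unsigned int cpu)
{
	return (p->cpus >> cpu) & 1UL;
}

/* Shape of the code before the patch. */
static bool can_do_remote_dvfs_old(const struct mock_policy *p, unsigned int cpu)
{
	if (p->dvfs_possible_from_any_cpu || cpu_in_policy(p, cpu))
		return true;

	return false;
}

/* Shape of the code after the patch: return the expression directly. */
static bool can_do_remote_dvfs_new(const struct mock_policy *p, unsigned int cpu)
{
	return p->dvfs_possible_from_any_cpu || cpu_in_policy(p, cpu);
}
```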


Re: [PATCH V3] get_maintainer: Prepare for separate MAINTAINERS files

2017-08-06 Thread Joe Perches
On Sun, 2017-08-06 at 19:16 -0700, Frank Rowand wrote:
> On 08/04/17 21:45, Joe Perches wrote:
> > Allow for MAINTAINERS to become a directory and if it is,
> > read all the files in the directory for maintained sections.
> > 
> > Optionally look for all files named MAINTAINERS in directories
> > excluding the .git directory by using --find-maintainer-files.
> > 
> > This optional feature adds ~.3 seconds of CPU on an Intel
> > i5-6200 with an SSD.
> > 
> > Miscellanea:
> > 
> > o Create a read_maintainer_file subroutine from the existing code
> > o Test only the existence of MAINTAINERS, not whether it's a file
> > 
> > Signed-off-by: Joe Perches 
> > ---
> 
> < snip > 
> 
> Hi Joe,
> 
> In the three versions of this patch, I have not seen any description
> of what is wrong with the current single MAINTAINERS file, or why the
> proposed change is an improvement. Could you please add that
> information?

It's really up to Linus.

He's the one who wants to separate the MAINTAINERS
file as he's the one that has to deal with the
merges.

This is only to enable the script to still function
if the file is split up.


[PATCH v10 2/4] irqchip/qeic: merge qeic init code from platforms to a common function

2017-08-06 Thread Zhao Qiang
The qe_ic init code is duplicated across a variety of platforms;
merge it into a common function and put it in irqchip/irq-qeic.c

For non-p1021_mds mpc85xx_mds boards, use "qe_ic_init(np, 0,
qe_ic_cascade_low_mpic, qe_ic_cascade_high_mpic);" instead of
"qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);".

qe_ic_cascade_muxed_mpic was used for boards that have the same
interrupt number for the low and high interrupts; qe_ic_init already
checks whether "low interrupt == high interrupt".
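
The "low interrupt == high interrupt" check can be pictured with the
following user-space sketch. It is an illustration only, not the code
in qe_ic_init: cascade_fn, mock_cascade, pick_cascade() and the use of
0 for "no interrupt" are assumptions made for the example. It models
only which lines get a handler, not what the handlers do.

```c
#include <stddef.h>

typedef void (*cascade_fn)(void);	/* placeholder for a chained handler */

struct mock_cascade {
	cascade_fn low;
	cascade_fn high;	/* NULL when no separate high handler is installed */
};

static void cascade_low(void) { }
static void cascade_high(void) { }

/*
 * Always install the low handler; install the high handler only when a
 * high interrupt exists and differs from the low one. In this sketch a
 * virq of 0 means "no interrupt".
 */
static struct mock_cascade pick_cascade(unsigned int virq_low,
					unsigned int virq_high)
{
	struct mock_cascade c = { cascade_low, NULL };

	if (virq_high != 0 && virq_high != virq_low)
		c.high = cascade_high;

	return c;
}
```

With a check of this shape inside the init function, callers can pass
both handlers unconditionally, which is what lets the per-board
special case go away.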

Signed-off-by: Zhao Qiang 
---
 arch/powerpc/platforms/83xx/misc.c| 15 ---
 arch/powerpc/platforms/85xx/corenet_generic.c |  9 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c | 14 --
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 16 
 arch/powerpc/platforms/85xx/twr_p102x.c   | 14 --
 drivers/irqchip/irq-qeic.c| 13 +
 6 files changed, 13 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/platforms/83xx/misc.c 
b/arch/powerpc/platforms/83xx/misc.c
index d75c981..c09a135 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -93,24 +93,9 @@ void __init mpc83xx_ipic_init_IRQ(void)
 }
 
 #ifdef CONFIG_QUICC_ENGINE
-void __init mpc83xx_qe_init_IRQ(void)
-{
-   struct device_node *np;
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (!np) {
-   np = of_find_node_by_type(NULL, "qeic");
-   if (!np)
-   return;
-   }
-   qe_ic_init(np, 0, qe_ic_cascade_low_ipic, qe_ic_cascade_high_ipic);
-   of_node_put(np);
-}
-
 void __init mpc83xx_ipic_and_qe_init_IRQ(void)
 {
mpc83xx_ipic_init_IRQ();
-   mpc83xx_qe_init_IRQ();
 }
 #endif /* CONFIG_QUICC_ENGINE */
 
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c 
b/arch/powerpc/platforms/85xx/corenet_generic.c
index ac191a7..1b385ac 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -41,8 +41,6 @@ void __init corenet_gen_pic_init(void)
unsigned int flags = MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU |
MPIC_NO_RESET;
 
-   struct device_node *np;
-
if (ppc_md.get_irq == mpic_get_coreint_irq)
flags |= MPIC_ENABLE_COREINT;
 
@@ -50,13 +48,6 @@ void __init corenet_gen_pic_init(void)
BUG_ON(mpic == NULL);
 
mpic_init(mpic);
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (np) {
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   of_node_put(np);
-   }
 }
 
 /*
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c 
b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index d7e440e..06f34a9 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -283,20 +283,6 @@ static void __init mpc85xx_mds_qeic_init(void)
of_node_put(np);
return;
}
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (!np) {
-   np = of_find_node_by_type(NULL, "qeic");
-   if (!np)
-   return;
-   }
-
-   if (machine_is(p1021_mds))
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   else
-   qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);
-   of_node_put(np);
 }
 #else
 static void __init mpc85xx_mds_qe_init(void) { }
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c 
b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 1006950..000d385 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -48,10 +48,6 @@ void __init mpc85xx_rdb_pic_init(void)
 {
struct mpic *mpic;
 
-#ifdef CONFIG_QUICC_ENGINE
-   struct device_node *np;
-#endif
-
if (of_machine_is_compatible("fsl,MPC85XXRDB-CAMP")) {
mpic = mpic_alloc(NULL, 0, MPIC_NO_RESET |
MPIC_BIG_ENDIAN |
@@ -66,18 +62,6 @@ void __init mpc85xx_rdb_pic_init(void)
 
BUG_ON(mpic == NULL);
mpic_init(mpic);
-
-#ifdef CONFIG_QUICC_ENGINE
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (np) {
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   of_node_put(np);
-
-   } else
-   pr_err("%s: Could not find qe-ic node\n", __func__);
-#endif
-
 }
 
 /*
diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c 
b/arch/powerpc/platforms/85xx/twr_p102x.c
index 360f625..6be9b33 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ b/arch/powerpc/platforms/85xx/twr_p102x.c
@@ -35,26 +35,12 @@ static void __init twr_p1025_pic_init(void)
 {
struct mpic *mpic;
 
-#ifdef CONFIG_QUICC_ENGINE
-   struct 

[PATCH v10 4/4] irqchip/qeic: remove PPCisms for QEIC

2017-08-06 Thread Zhao Qiang
QEIC was supported on PowerPC and was dependent on PPC.
Now it is supported on other platforms too, so remove the PPCisms.

Signed-off-by: Zhao Qiang 
---
 arch/powerpc/platforms/83xx/km83xx.c  |   1 -
 arch/powerpc/platforms/83xx/misc.c|   1 -
 arch/powerpc/platforms/83xx/mpc832x_mds.c |   1 -
 arch/powerpc/platforms/83xx/mpc832x_rdb.c |   1 -
 arch/powerpc/platforms/83xx/mpc836x_mds.c |   1 -
 arch/powerpc/platforms/83xx/mpc836x_rdk.c |   1 -
 arch/powerpc/platforms/85xx/corenet_generic.c |   1 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c |   1 -
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c |   1 -
 arch/powerpc/platforms/85xx/twr_p102x.c   |   1 -
 drivers/irqchip/irq-qeic.c| 188 +++---
 include/soc/fsl/qe/qe_ic.h| 132 --
 12 files changed, 80 insertions(+), 250 deletions(-)
 delete mode 100644 include/soc/fsl/qe/qe_ic.h

diff --git a/arch/powerpc/platforms/83xx/km83xx.c 
b/arch/powerpc/platforms/83xx/km83xx.c
index d8642a4..b1cef0a 100644
--- a/arch/powerpc/platforms/83xx/km83xx.c
+++ b/arch/powerpc/platforms/83xx/km83xx.c
@@ -38,7 +38,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/misc.c 
b/arch/powerpc/platforms/83xx/misc.c
index c09a135..07a0e61 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/83xx/mpc832x_mds.c 
b/arch/powerpc/platforms/83xx/mpc832x_mds.c
index bb7b25a..a1cadf4 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_mds.c
@@ -37,7 +37,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/mpc832x_rdb.c 
b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
index d7c9b18..6c66527 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
@@ -26,7 +26,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/83xx/mpc836x_mds.c 
b/arch/powerpc/platforms/83xx/mpc836x_mds.c
index 4fc3051..9234d63 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_mds.c
@@ -45,7 +45,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/mpc836x_rdk.c 
b/arch/powerpc/platforms/83xx/mpc836x_rdk.c
index 93f024f..82fa344 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_rdk.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_rdk.c
@@ -21,7 +21,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c 
b/arch/powerpc/platforms/85xx/corenet_generic.c
index 1b385ac..9ca27b1 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c 
b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index 06f34a9..8102e5f 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -49,7 +49,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include "smp.h"
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c 
b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 000d385..f806b6b 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c 
b/arch/powerpc/platforms/85xx/twr_p102x.c
index 6be9b33..4f620f2 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ b/arch/powerpc/platforms/85xx/twr_p102x.c
@@ -23,7 +23,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
index a2d8084..26bfcbd 100644
--- a/drivers/irqchip/irq-qeic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -18,8 +18,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -27,9 +30,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
-#include 
 
 #define NR_QE_IC_INTS  64
 
@@ -87,6 +89,43 @@
 #define SIGNAL_HIGH2
 #define SIGNAL_LOW 0
 
+#define NUM_OF_QE_IC_GROUPS6
+
+/* Flags when we init the QE IC */
+#define QE_IC_SPREADMODE_GRP_W 0x0001
+#define QE_IC_SPREADMODE_GRP_X 0x0002
+#define QE_IC_SPREADMODE_GRP_Y 0x0004
+#define QE_IC_SPREADMODE_GRP_Z 0x0008
+#define QE_IC_SPREADMODE_GRP_RISCA 0x0010
+#define 

[PATCH v10 1/4] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe

2017-08-06 Thread Zhao Qiang
Move the driver from drivers/soc/fsl/qe to drivers/irqchip, and
merge qe_ic.h and qe_ic.c into irq-qeic.c.

Signed-off-by: Zhao Qiang 
---
 MAINTAINERS|   6 ++
 drivers/irqchip/Makefile   |   1 +
 drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} |  95 ++-
 drivers/soc/fsl/qe/Makefile|   2 +-
 drivers/soc/fsl/qe/qe_ic.h | 103 -
 5 files changed, 100 insertions(+), 107 deletions(-)
 rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (85%)
 delete mode 100644 drivers/soc/fsl/qe/qe_ic.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 567343b..1288329 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5462,6 +5462,12 @@ F:   drivers/soc/fsl/qe/
 F: include/soc/fsl/*qe*.h
 F: include/soc/fsl/*ucc*.h
 
+FREESCALE QEIC DRIVERS
+M: Qiang Zhao 
+L: linux-kernel@vger.kernel.org
+S: Maintained
+F: drivers/irqchip/irq-qeic.c
+
 FREESCALE QUICC ENGINE UCC ETHERNET DRIVER
 M: Li Yang 
 L: net...@vger.kernel.org
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index e88d856..b8eae87 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -78,3 +78,4 @@ obj-$(CONFIG_EZNPS_GIC)   += irq-eznps.o
 obj-$(CONFIG_ARCH_ASPEED)  += irq-aspeed-vic.o irq-aspeed-i2c-ic.o
 obj-$(CONFIG_STM32_EXTI)   += irq-stm32-exti.o
 obj-$(CONFIG_QCOM_IRQ_COMBINER)+= qcom-irq-combiner.o
+obj-$(CONFIG_QUICC_ENGINE) += irq-qeic.o
diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/irqchip/irq-qeic.c
similarity index 85%
rename from drivers/soc/fsl/qe/qe_ic.c
rename to drivers/irqchip/irq-qeic.c
index ec2ca86..9b4660c 100644
--- a/drivers/soc/fsl/qe/qe_ic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -1,7 +1,7 @@
 /*
- * arch/powerpc/sysdev/qe_lib/qe_ic.c
+ * drivers/irqchip/irq-qeic.c
  *
- * Copyright (C) 2006 Freescale Semiconductor, Inc.  All rights reserved.
+ * Copyright (C) 2016 Freescale Semiconductor, Inc.  All rights reserved.
  *
  * Author: Li Yang 
  * Based on code from Shlomi Gridish 
@@ -30,7 +30,96 @@
 #include 
 #include 
 
-#include "qe_ic.h"
+#define NR_QE_IC_INTS  64
+
+/* QE IC registers offset */
+#define QEIC_CICR  0x00
+#define QEIC_CIVEC 0x04
+#define QEIC_CRIPNR0x08
+#define QEIC_CIPNR 0x0c
+#define QEIC_CIPXCC0x10
+#define QEIC_CIPYCC0x14
+#define QEIC_CIPWCC0x18
+#define QEIC_CIPZCC0x1c
+#define QEIC_CIMR  0x20
+#define QEIC_CRIMR 0x24
+#define QEIC_CICNR 0x28
+#define QEIC_CIPRTA0x30
+#define QEIC_CIPRTB0x34
+#define QEIC_CRICR 0x3c
+#define QEIC_CHIVEC0x60
+
+/* Interrupt priority registers */
+#define CIPCC_SHIFT_PRI0   29
+#define CIPCC_SHIFT_PRI1   26
+#define CIPCC_SHIFT_PRI2   23
+#define CIPCC_SHIFT_PRI3   20
+#define CIPCC_SHIFT_PRI4   13
+#define CIPCC_SHIFT_PRI5   10
+#define CIPCC_SHIFT_PRI6   7
+#define CIPCC_SHIFT_PRI7   4
+
+/* CICR priority modes */
+#define CICR_GWCC  0x0004
+#define CICR_GXCC  0x0002
+#define CICR_GYCC  0x0001
+#define CICR_GZCC  0x0008
+#define CICR_GRTA  0x0020
+#define CICR_GRTB  0x0040
+#define CICR_HPIT_SHIFT8
+#define CICR_HPIT_MASK 0x0300
+#define CICR_HP_SHIFT  24
+#define CICR_HP_MASK   0x3f00
+
+/* CICNR */
+#define CICNR_WCC1T_SHIFT  20
+#define CICNR_ZCC1T_SHIFT  28
+#define CICNR_YCC1T_SHIFT  12
+#define CICNR_XCC1T_SHIFT  4
+
+/* CRICR */
+#define CRICR_RTA1T_SHIFT  20
+#define CRICR_RTB1T_SHIFT  28
+
+/* Signal indicator */
+#define SIGNAL_MASK3
+#define SIGNAL_HIGH2
+#define SIGNAL_LOW 0
+
+struct qe_ic {
+   /* Control registers offset */
+   u32 __iomem *regs;
+
+   /* The remapper for this QEIC */
+   struct irq_domain *irqhost;
+
+   /* The "linux" controller struct */
+   struct irq_chip hc_irq;
+
+   /* VIRQ numbers of QE high/low irqs */
+   unsigned int virq_high;
+   unsigned int virq_low;
+};
+
+/*
+ * QE interrupt controller internal structure
+ */
+struct qe_ic_info {
+   /* location of this source at the QIMR register. */
+   u32 mask;
+
+   /* Mask register offset */
+   u32 mask_reg;
+
+   /*
+* for grouped interrupts sources - the interrupt
+* code as appears at the group priority register
+*/
+   u8  pri_code;
+
+   /* Group priority register offset */
+   u32 pri_reg;
+};
 
 static DEFINE_RAW_SPINLOCK(qe_ic_lock);
 
diff --git a/drivers/soc/fsl/qe/Makefile 

[PATCH v10 4/4] irqchip/qeic: remove PPCisms for QEIC

2017-08-06 Thread Zhao Qiang
QEIC was supported on PowerPC, and dependent on PPC,
Now it is supported on other platforms, so remove PPCisms.

Signed-off-by: Zhao Qiang 
---
 arch/powerpc/platforms/83xx/km83xx.c  |   1 -
 arch/powerpc/platforms/83xx/misc.c|   1 -
 arch/powerpc/platforms/83xx/mpc832x_mds.c |   1 -
 arch/powerpc/platforms/83xx/mpc832x_rdb.c |   1 -
 arch/powerpc/platforms/83xx/mpc836x_mds.c |   1 -
 arch/powerpc/platforms/83xx/mpc836x_rdk.c |   1 -
 arch/powerpc/platforms/85xx/corenet_generic.c |   1 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c |   1 -
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c |   1 -
 arch/powerpc/platforms/85xx/twr_p102x.c   |   1 -
 drivers/irqchip/irq-qeic.c| 188 +++---
 include/soc/fsl/qe/qe_ic.h| 132 --
 12 files changed, 80 insertions(+), 250 deletions(-)
 delete mode 100644 include/soc/fsl/qe/qe_ic.h

diff --git a/arch/powerpc/platforms/83xx/km83xx.c 
b/arch/powerpc/platforms/83xx/km83xx.c
index d8642a4..b1cef0a 100644
--- a/arch/powerpc/platforms/83xx/km83xx.c
+++ b/arch/powerpc/platforms/83xx/km83xx.c
@@ -38,7 +38,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/misc.c 
b/arch/powerpc/platforms/83xx/misc.c
index c09a135..07a0e61 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/83xx/mpc832x_mds.c 
b/arch/powerpc/platforms/83xx/mpc832x_mds.c
index bb7b25a..a1cadf4 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_mds.c
@@ -37,7 +37,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/mpc832x_rdb.c 
b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
index d7c9b18..6c66527 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
@@ -26,7 +26,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/83xx/mpc836x_mds.c 
b/arch/powerpc/platforms/83xx/mpc836x_mds.c
index 4fc3051..9234d63 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_mds.c
@@ -45,7 +45,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mpc83xx.h"
 
diff --git a/arch/powerpc/platforms/83xx/mpc836x_rdk.c 
b/arch/powerpc/platforms/83xx/mpc836x_rdk.c
index 93f024f..82fa344 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_rdk.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_rdk.c
@@ -21,7 +21,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c 
b/arch/powerpc/platforms/85xx/corenet_generic.c
index 1b385ac..9ca27b1 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c 
b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index 06f34a9..8102e5f 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -49,7 +49,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include "smp.h"
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c 
b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 000d385..f806b6b 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c 
b/arch/powerpc/platforms/85xx/twr_p102x.c
index 6be9b33..4f620f2 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ b/arch/powerpc/platforms/85xx/twr_p102x.c
@@ -23,7 +23,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
index a2d8084..26bfcbd 100644
--- a/drivers/irqchip/irq-qeic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -18,8 +18,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -27,9 +30,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
-#include 
 
 #define NR_QE_IC_INTS  64
 
@@ -87,6 +89,43 @@
 #define SIGNAL_HIGH 2
 #define SIGNAL_LOW 0
 
+#define NUM_OF_QE_IC_GROUPS 6
+
+/* Flags when we init the QE IC */
+#define QE_IC_SPREADMODE_GRP_W 0x0001
+#define QE_IC_SPREADMODE_GRP_X 0x0002
+#define QE_IC_SPREADMODE_GRP_Y 0x0004
+#define QE_IC_SPREADMODE_GRP_Z 0x0008
+#define QE_IC_SPREADMODE_GRP_RISCA 0x0010
+#define QE_IC_SPREADMODE_GRP_RISCB 

[PATCH v10 1/4] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe

2017-08-06 Thread Zhao Qiang
Move the driver from drivers/soc/fsl/qe to drivers/irqchip, and
merge qe_ic.h and qe_ic.c into irq-qeic.c.

Signed-off-by: Zhao Qiang 
---
 MAINTAINERS|   6 ++
 drivers/irqchip/Makefile   |   1 +
 drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} |  95 ++-
 drivers/soc/fsl/qe/Makefile|   2 +-
 drivers/soc/fsl/qe/qe_ic.h | 103 -
 5 files changed, 100 insertions(+), 107 deletions(-)
 rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (85%)
 delete mode 100644 drivers/soc/fsl/qe/qe_ic.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 567343b..1288329 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5462,6 +5462,12 @@ F:   drivers/soc/fsl/qe/
 F: include/soc/fsl/*qe*.h
 F: include/soc/fsl/*ucc*.h
 
+FREESCALE QEIC DRIVERS
+M: Qiang Zhao 
+L: linux-kernel@vger.kernel.org
+S: Maintained
+F: drivers/irqchip/irq-qeic.c
+
 FREESCALE QUICC ENGINE UCC ETHERNET DRIVER
 M: Li Yang 
 L: net...@vger.kernel.org
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index e88d856..b8eae87 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -78,3 +78,4 @@ obj-$(CONFIG_EZNPS_GIC)   += irq-eznps.o
 obj-$(CONFIG_ARCH_ASPEED)  += irq-aspeed-vic.o irq-aspeed-i2c-ic.o
 obj-$(CONFIG_STM32_EXTI)   += irq-stm32-exti.o
 obj-$(CONFIG_QCOM_IRQ_COMBINER)+= qcom-irq-combiner.o
+obj-$(CONFIG_QUICC_ENGINE) += irq-qeic.o
diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/irqchip/irq-qeic.c
similarity index 85%
rename from drivers/soc/fsl/qe/qe_ic.c
rename to drivers/irqchip/irq-qeic.c
index ec2ca86..9b4660c 100644
--- a/drivers/soc/fsl/qe/qe_ic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -1,7 +1,7 @@
 /*
- * arch/powerpc/sysdev/qe_lib/qe_ic.c
+ * drivers/irqchip/irq-qeic.c
  *
- * Copyright (C) 2006 Freescale Semiconductor, Inc.  All rights reserved.
+ * Copyright (C) 2016 Freescale Semiconductor, Inc.  All rights reserved.
  *
  * Author: Li Yang 
  * Based on code from Shlomi Gridish 
@@ -30,7 +30,96 @@
 #include 
 #include 
 
-#include "qe_ic.h"
+#define NR_QE_IC_INTS  64
+
+/* QE IC registers offset */
+#define QEIC_CICR    0x00
+#define QEIC_CIVEC   0x04
+#define QEIC_CRIPNR  0x08
+#define QEIC_CIPNR   0x0c
+#define QEIC_CIPXCC  0x10
+#define QEIC_CIPYCC  0x14
+#define QEIC_CIPWCC  0x18
+#define QEIC_CIPZCC  0x1c
+#define QEIC_CIMR    0x20
+#define QEIC_CRIMR   0x24
+#define QEIC_CICNR   0x28
+#define QEIC_CIPRTA  0x30
+#define QEIC_CIPRTB  0x34
+#define QEIC_CRICR   0x3c
+#define QEIC_CHIVEC  0x60
+
+/* Interrupt priority registers */
+#define CIPCC_SHIFT_PRI0   29
+#define CIPCC_SHIFT_PRI1   26
+#define CIPCC_SHIFT_PRI2   23
+#define CIPCC_SHIFT_PRI3   20
+#define CIPCC_SHIFT_PRI4   13
+#define CIPCC_SHIFT_PRI5   10
+#define CIPCC_SHIFT_PRI6   7
+#define CIPCC_SHIFT_PRI7   4
+
+/* CICR priority modes */
+#define CICR_GWCC  0x0004
+#define CICR_GXCC  0x0002
+#define CICR_GYCC  0x0001
+#define CICR_GZCC  0x0008
+#define CICR_GRTA  0x0020
+#define CICR_GRTB  0x0040
+#define CICR_HPIT_SHIFT 8
+#define CICR_HPIT_MASK 0x0300
+#define CICR_HP_SHIFT  24
+#define CICR_HP_MASK   0x3f00
+
+/* CICNR */
+#define CICNR_WCC1T_SHIFT  20
+#define CICNR_ZCC1T_SHIFT  28
+#define CICNR_YCC1T_SHIFT  12
+#define CICNR_XCC1T_SHIFT  4
+
+/* CRICR */
+#define CRICR_RTA1T_SHIFT  20
+#define CRICR_RTB1T_SHIFT  28
+
+/* Signal indicator */
+#define SIGNAL_MASK 3
+#define SIGNAL_HIGH 2
+#define SIGNAL_LOW 0
+
+struct qe_ic {
+   /* Control registers offset */
+   u32 __iomem *regs;
+
+   /* The remapper for this QEIC */
+   struct irq_domain *irqhost;
+
+   /* The "linux" controller struct */
+   struct irq_chip hc_irq;
+
+   /* VIRQ numbers of QE high/low irqs */
+   unsigned int virq_high;
+   unsigned int virq_low;
+};
+
+/*
+ * QE interrupt controller internal structure
+ */
+struct qe_ic_info {
+   /* location of this source at the QIMR register. */
+   u32 mask;
+
+   /* Mask register offset */
+   u32 mask_reg;
+
+   /*
+* for grouped interrupts sources - the interrupt
+* code as appears at the group priority register
+*/
+   u8  pri_code;
+
+   /* Group priority register offset */
+   u32 pri_reg;
+};
 
 static DEFINE_RAW_SPINLOCK(qe_ic_lock);
 
diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile
index 2031d38..51e4726 100644
--- a/drivers/soc/fsl/qe/Makefile
+++ 

[PATCH v10 3/4] irqchip/qeic: merge qeic_of_init into qe_ic_init

2017-08-06 Thread Zhao Qiang
qeic_of_init only gets the device_node of the qeic from the dtb and
calls qe_ic_init, passing the device_node to it.
So merge qeic_of_init into qe_ic_init, so that the qeic node is
handled in qe_ic_init itself.

Signed-off-by: Zhao Qiang 
---
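A minimal userspace sketch of the goto-based error unwinding that the reworked qe_ic_init() in the diff below adopts: each failure jumps to a label that releases only what has been acquired so far, and every exit path drops the node reference. All names here (fake_node, fake_ic_init, tracked_alloc, the fail_at values) are hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted device_node. */
struct fake_node { int refcount; };

/* Which step of the init sequence should be made to fail. */
enum fail_at { FAIL_NONE, FAIL_RESOURCE, FAIL_DOMAIN, FAIL_IRQ_MAP };

/* Counts outstanding allocations so leaks are observable. */
static int live_allocs;

static void *tracked_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Mirrors the label ordering of the patch: later labels undo earlier steps. */
static int fake_ic_init(struct fake_node *node, enum fail_at fail)
{
	void *ic, *domain;
	int ret;

	if (!node)
		return -ENODEV;

	if (fail == FAIL_RESOURCE) {		/* stands in for the resource lookup */
		ret = -ENODEV;
		goto err_put_node;
	}

	ic = tracked_alloc(64);			/* stands in for kzalloc() */
	if (!ic) {
		ret = -ENOMEM;
		goto err_put_node;
	}

	domain = (fail == FAIL_DOMAIN) ? NULL : tracked_alloc(64);
	if (!domain) {
		ret = -ENOMEM;
		goto err_free_ic;
	}

	if (fail == FAIL_IRQ_MAP) {		/* stands in for the low IRQ mapping */
		ret = -ENOMEM;
		goto err_domain_remove;
	}

	/* Success: the controller and domain stay allocated. */
	node->refcount--;
	return 0;

err_domain_remove:
	tracked_free(domain);
err_free_ic:
	tracked_free(ic);
err_put_node:
	node->refcount--;
	return ret;
}
```

Calling fake_ic_init() with each fail_at value should leave live_allocs at zero on every error path, which is the property the label ordering is meant to guarantee.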
 drivers/irqchip/irq-qeic.c | 90 --
 include/soc/fsl/qe/qe_ic.h |  7 
 2 files changed, 39 insertions(+), 58 deletions(-)

diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
index 8287c22..a2d8084 100644
--- a/drivers/irqchip/irq-qeic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -407,27 +407,33 @@ unsigned int qe_ic_get_high_irq(struct qe_ic *qe_ic)
return irq_linear_revmap(qe_ic->irqhost, irq);
 }
 
-void __init qe_ic_init(struct device_node *node, unsigned int flags,
-  void (*low_handler)(struct irq_desc *desc),
-  void (*high_handler)(struct irq_desc *desc))
+static int __init qe_ic_init(struct device_node *node, unsigned int flags)
 {
struct qe_ic *qe_ic;
struct resource res;
-   u32 temp = 0, ret, high_active = 0;
+   u32 temp = 0, high_active = 0;
+   int ret = 0;
+
+   if (!node)
+   return -ENODEV;
 
ret = of_address_to_resource(node, 0, &res);
-   if (ret)
-   return;
+   if (ret) {
+   ret = -ENODEV;
+   goto err_put_node;
+   }
 
qe_ic = kzalloc(sizeof(*qe_ic), GFP_KERNEL);
-   if (qe_ic == NULL)
-   return;
+   if (qe_ic == NULL) {
+   ret = -ENOMEM;
+   goto err_put_node;
+   }
 
qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS,
   &qe_ic_host_ops, qe_ic);
if (qe_ic->irqhost == NULL) {
-   kfree(qe_ic);
-   return;
+   ret = -ENOMEM;
+   goto err_free_qe_ic;
}
 
qe_ic->regs = ioremap(res.start, resource_size(&res));
@@ -438,9 +444,9 @@ void __init qe_ic_init(struct device_node *node, unsigned int flags,
qe_ic->virq_low = irq_of_parse_and_map(node, 1);
 
if (qe_ic->virq_low == NO_IRQ) {
-   printk(KERN_ERR "Failed to map QE_IC low IRQ\n");
-   kfree(qe_ic);
-   return;
+   pr_err("Failed to map QE_IC low IRQ\n");
+   ret = -ENOMEM;
+   goto err_domain_remove;
}
 
/* default priority scheme is grouped. If spread mode is*/
@@ -467,13 +473,24 @@ void __init qe_ic_init(struct device_node *node, unsigned int flags,
qe_ic_write(qe_ic->regs, QEIC_CICR, temp);
 
irq_set_handler_data(qe_ic->virq_low, qe_ic);
-   irq_set_chained_handler(qe_ic->virq_low, low_handler);
+   irq_set_chained_handler(qe_ic->virq_low, qe_ic_cascade_low_mpic);
 
if (qe_ic->virq_high != NO_IRQ &&
qe_ic->virq_high != qe_ic->virq_low) {
irq_set_handler_data(qe_ic->virq_high, qe_ic);
-   irq_set_chained_handler(qe_ic->virq_high, high_handler);
+   irq_set_chained_handler(qe_ic->virq_high,
+   qe_ic_cascade_high_mpic);
}
+   of_node_put(node);
+   return 0;
+
+err_domain_remove:
+   irq_domain_remove(qe_ic->irqhost);
+err_free_qe_ic:
+   kfree(qe_ic);
+err_put_node:
+   of_node_put(node);
+   return ret;
 }
 
 void qe_ic_set_highest_priority(unsigned int virq, int high)
@@ -570,45 +587,16 @@ int qe_ic_set_high_priority(unsigned int virq, unsigned int priority, int high)
return 0;
 }
 
-static struct bus_type qe_ic_subsys = {
-   .name = "qe_ic",
-   .dev_name = "qe_ic",
-};
-
-static struct device device_qe_ic = {
-   .id = 0,
-   .bus = &qe_ic_subsys,
-};
-
-static int __init init_qe_ic_sysfs(void)
+static int __init init_qe_ic(struct device_node *node,
+struct device_node *parent)
 {
-   int rc;
-
-   printk(KERN_DEBUG "Registering qe_ic with sysfs...\n");
+   int ret;
 
-   rc = subsys_system_register(&qe_ic_subsys, NULL);
-   if (rc) {
-   printk(KERN_ERR "Failed registering qe_ic sys class\n");
-   return -ENODEV;
-   }
-   rc = device_register(&device_qe_ic);
-   if (rc) {
-   printk(KERN_ERR "Failed registering qe_ic sys device\n");
-   return -ENODEV;
-   }
-   return 0;
-}
+   ret = qe_ic_init(node, 0);
+   if (ret)
+   return ret;
 
-static int __init qeic_of_init(struct device_node *node,
-  struct device_node *parent)
-{
-   if (!node)
-   return -ENODEV;
-   qe_ic_init(node, 0, qe_ic_cascade_low_mpic,
-  qe_ic_cascade_high_mpic);
-   of_node_put(node);
return 0;
 }
 
-IRQCHIP_DECLARE(qeic, "fsl,qe-ic", qeic_of_init);
-subsys_initcall(init_qe_ic_sysfs);
+IRQCHIP_DECLARE(qeic, "fsl,qe-ic", init_qe_ic);
diff --git a/include/soc/fsl/qe/qe_ic.h 

[PATCH v10 0/4] this patchset is to remove PPCisms for QEIC

2017-08-06 Thread Zhao Qiang
QEIC is supported on more than just powerpc boards, so remove PPCisms.

changelog:
Changes for v8:
- use IRQCHIP_DECLARE() instead of subsys_initcall in qeic driver
- remove include/soc/fsl/qe/qe_ic.h
Changes for v9:
- rebase 
- fix the compile issue seen when applying only the second patch; there was
  no compile issue when applying all the patches of this patchset
Changes for v10:
- simplify code, remove duplicated code
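
For context on the v8 change from subsys_initcall to IRQCHIP_DECLARE(): in the kernel, IRQCHIP_DECLARE() places an entry in a linker-built table of compatible strings and init functions, and the irqchip core calls the init function for matching device tree nodes. The userspace sketch below only imitates that idea with an ordinary array; irqchip_entry, irqchip_table, fake_qeic_init and probe_irqchips are hypothetical names, and the real matching code is considerably more involved.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical init callback type: returns 0 on success. */
typedef int (*init_fn)(const char *compatible);

/* One table entry: a compatible string and the init function to run. */
struct irqchip_entry {
	const char *compatible;
	init_fn init;
};

/* Counts how often the fake QEIC init ran, so the effect is observable. */
static int qeic_calls;

static int fake_qeic_init(const char *compatible)
{
	(void)compatible;
	qeic_calls++;
	return 0;
}

/* Plain array standing in for the linker-section table. */
static const struct irqchip_entry irqchip_table[] = {
	{ "fsl,qe-ic", fake_qeic_init },
};

/* Run the init function of every table entry matching a node's compatible. */
static int probe_irqchips(const char *const *compat, size_t n)
{
	size_t i, j;
	int matched = 0;

	for (i = 0; i < n; i++) {
		for (j = 0; j < sizeof(irqchip_table) / sizeof(irqchip_table[0]); j++) {
			if (strcmp(compat[i], irqchip_table[j].compatible) != 0)
				continue;
			if (irqchip_table[j].init(compat[i]) == 0)
				matched++;
		}
	}
	return matched;
}
```

The point of the table-driven approach is that platform code no longer has to call the driver's init function explicitly; a node with a matching compatible string is enough to trigger it.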

Zhao Qiang (4):
  irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
Changes for v2:
- modify the subject and commit msg
Changes for v3:
- merge .h file to .c, rename it with irq-qeic.c
Changes for v4:
- modify comments
Changes for v5:
- disable rename detection
Changes for v6:
- rebase
Changes for v7:
- na

  irqchip/qeic: merge qeic init code from platforms to a common function
Changes for v2:
- modify subject and commit msg
- add check for qeic by type
Changes for v3:
- na
Changes for v4:
- na
Changes for v5:
- na
Changes for v6:
- rebase
Changes for v7:
- na
Changes for v8:
- use IRQCHIP_DECLARE() instead of subsys_initcall

  irqchip/qeic: merge qeic_of_init into qe_ic_init
Changes for v2:
- modify subject and commit msg
- return 0 and add put node when return in qe_ic_init
Changes for v3:
- na
Changes for v4:
- na
Changes for v5:
- na
Changes for v6:
- rebase
Changes for v7:
- na

  irqchip/qeic: remove PPCisms for QEIC
Changes for v6:
- newly added
Changes for v7:
- fix warning
Changes for v8:
- remove include/soc/fsl/qe/qe_ic.h

Zhao Qiang (4):
  irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
  irqchip/qeic: merge qeic init code from platforms to a common function
  irqchip/qeic: merge qeic_of_init into qe_ic_init
  irqchip/qeic: remove PPCisms for QEIC

 MAINTAINERS|   6 +
 arch/powerpc/platforms/83xx/km83xx.c   |   1 -
 arch/powerpc/platforms/83xx/misc.c |  16 -
 arch/powerpc/platforms/83xx/mpc832x_mds.c  |   1 -
 arch/powerpc/platforms/83xx/mpc832x_rdb.c  |   1 -
 arch/powerpc/platforms/83xx/mpc836x_mds.c  |   1 -
 arch/powerpc/platforms/83xx/mpc836x_rdk.c  |   1 -
 arch/powerpc/platforms/85xx/corenet_generic.c  |  10 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c  |  15 -
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c  |  17 -
 arch/powerpc/platforms/85xx/twr_p102x.c|  15 -
 drivers/irqchip/Makefile   |   1 +
 drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} | 358 -
 drivers/soc/fsl/qe/Makefile|   2 +-
 drivers/soc/fsl/qe/qe_ic.h | 103 --
 include/soc/fsl/qe/qe_ic.h | 139 
 16 files changed, 218 insertions(+), 469 deletions(-)
 rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (58%)
 delete mode 100644 drivers/soc/fsl/qe/qe_ic.h
 delete mode 100644 include/soc/fsl/qe/qe_ic.h

-- 
2.1.0.27.g96db324


