[PATCH] rcu: Fix up pending cbs check in rcu_prepare_for_idle
The pending-callbacks check in rcu_prepare_for_idle() is inverted: the
function should accelerate if there are pending callbacks, but the check
does the opposite. Fix it.

Fixes: 15fecf89e46a ("srcu: Abstract multi-tail callback list handling")
Signed-off-by: Neeraj Upadhyay
---
 kernel/rcu/tree_plugin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 908b309..b8f51df 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1493,7 +1493,7 @@ static void rcu_prepare_for_idle(void)
 	rdtp->last_accelerate = jiffies;
 	for_each_rcu_flavor(rsp) {
 		rdp = this_cpu_ptr(rsp->rda);
-		if (rcu_segcblist_pend_cbs(&rdp->cblist))
+		if (!rcu_segcblist_pend_cbs(&rdp->cblist))
 			continue;
 		rnp = rdp->mynode;
 		raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation
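To make the effect of the one-character fix concrete, here is a minimal userspace sketch of the corrected loop. It is not kernel code: struct toy_rdp, toy_pend_cbs() and toy_prepare_for_idle() are hypothetical stand-ins for the per-CPU rcu_data, rcu_segcblist_pend_cbs() and rcu_prepare_for_idle(), and "acceleration" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU, per-flavor callback state. */
struct toy_rdp {
	int pending_cbs;	/* callbacks still waiting for a grace period */
	bool accelerated;	/* set when this entry was "accelerated" */
};

/* Toy analogue of rcu_segcblist_pend_cbs(): true if callbacks are pending. */
static bool toy_pend_cbs(const struct toy_rdp *rdp)
{
	return rdp->pending_cbs > 0;
}

/*
 * Walk every entry and accelerate only those with pending callbacks,
 * mirroring the fixed check: skip an entry when nothing is pending.
 * Returns the number of accelerated entries.
 */
static int toy_prepare_for_idle(struct toy_rdp *rdps, size_t n)
{
	int accelerated = 0;

	for (size_t i = 0; i < n; i++) {
		if (!toy_pend_cbs(&rdps[i]))
			continue;	/* nothing to accelerate here */
		rdps[i].accelerated = true;
		accelerated++;
	}
	return accelerated;
}
```

With the pre-fix condition (no negation), the loop would skip exactly the entries that have pending callbacks and do the acceleration work only for empty lists.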
Re: [PATCH 1/5] mtip32xx: Delete an error message for a failed memory allocation in five functions
Hi Markus,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc4 next-20170804]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/SF-Markus-Elfring/mtip32xx-Adjustments-for-some-function-implementations/20170807-033055
config: x86_64-randconfig-b0-08071209 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

All warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/uuid.h:21,
                    from include/linux/uuid.h:19,
                    from include/linux/mod_devicetable.h:12,
                    from include/linux/pci.h:20,
                    from drivers/block/mtip32xx/mtip32xx.c:21:
   include/linux/string.h: In function 'strncpy':
   include/linux/string.h:209: warning: '__f' is static but declared in inline function 'strncpy' which is not static
   include/linux/string.h:211: warning: '__f' is static but declared in inline function 'strncpy' which is not static
   include/linux/string.h: In function 'strcat':
   include/linux/string.h:219: warning: '__f' is static but declared in inline function 'strcat' which is not static
   include/linux/string.h:221: warning: '__f' is static but declared in inline function 'strcat' which is not static
   include/linux/string.h: In function 'strlen':
   include/linux/string.h:230: warning: '__f' is static but declared in inline function 'strlen' which is not static
   include/linux/string.h:233: warning: '__f' is static but declared in inline function 'strlen' which is not static
   include/linux/string.h: In function 'strnlen':
   include/linux/string.h:243: warning: '__f' is static but declared in inline function 'strnlen' which is not static
   include/linux/string.h: In function 'strlcpy':
   include/linux/string.h:255: warning: '__f' is static but declared in inline function 'strlcpy' which is not static
   include/linux/string.h:258: warning: '__f' is static but declared in inline function 'strlcpy' which is not static
   include/linux/string.h:260: warning: '__f' is static but declared in inline function 'strlcpy' which is not static
   include/linux/string.h:262: warning: '__f' is static but declared in inline function 'strlcpy' which is not static
   include/linux/string.h: In function 'strncat':
   include/linux/string.h:276: warning: '__f' is static but declared in inline function 'strncat' which is not static
   include/linux/string.h:280: warning: '__f' is static but declared in inline function 'strncat' which is not static
   include/linux/string.h: In function 'memset':
   include/linux/string.h:290: warning: '__f' is static but declared in inline function 'memset' which is not static
   include/linux/string.h:292: warning: '__f' is static but declared in inline function 'memset' which is not static
   include/linux/string.h: In function 'memcpy':
   include/linux/string.h:301: warning: '__f' is static but declared in inline function 'memcpy' which is not static
   include/linux/string.h:302: warning: '__f' is static but declared in inline function 'memcpy' which is not static
   include/linux/string.h:304: warning: '__f' is static but declared in inline function 'memcpy' which is not static
   include/linux/string.h:307: warning: '__f' is static but declared in inline function 'memcpy' which is not static
   include/linux/string.h: In function 'memmove':
   include/linux/string.h:316: warning: '__f' is static but declared in inline function 'memmove' which is not static
   include/linux/string.h:317: warning: '__f' is static but declared in inline function 'memmove' which is not static
   include/linux/string.h:319: warning: '__f' is static but declared in inline function 'memmove' which is not static
   include/linux/string.h:322: warning: '__f' is static but declared in inline function 'memmove' which is not static
   include/linux/string.h: In function 'memscan':
   include/linux/string.h:331: warning: '__f' is static but declared in inline function 'memscan' which is not static
   include/linux/string.h:333: warning: '__f' is static but declared in inline function 'memscan' which is not static
   include/linux/string.h: In function 'memcmp':
   include/linux/string.h:342: warning: '__f' is static but declared in inline function 'memcmp' which is not static
   include/linux/string.h:343: warning: '__f' is static but declared in inline function 'memcmp' which is not static
   include/linux/string.h:345: warning: '__f' is static but declared in inline function 'memcmp' which is not static
   include/linux/string.h:348: warning: '__f' is static but declared in inline function 'memcmp' which is not static
   include/linux/string.h: In function
[v4 PATCH 1/2] powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle
From: "Gautham R. Shenoy"

The stop4 idle state on POWER9 is a deep idle state which loses
hypervisor resources, but whose latency is low enough that it can be
exposed via cpuidle.

Until now, the deep idle states which lose hypervisor resources (eg:
winkle) were only exposed via CPU-Hotplug. Hence currently on wakeup
from such states, barring a few SPRs which need to be restored to
their older value, the rest of the SPRs are reinitialized to the values
they had at boot time.

When stop4 is used in the context of cpuidle, we want these additional
SPRs to be restored to their older value, to ensure that the context
on the CPU coming back from idle is the same as it was before going idle.

In this patch, we define an SPR save area in PACA (since we have used
up the volatile register space in the stack) and on POWER9, we restore
SPRN_PID, SPRN_LDBAR, SPRN_FSCR, SPRN_HFSCR, SPRN_MMCRA, SPRN_MMCR1,
SPRN_MMCR2 to the values they had before entering stop.

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/include/asm/cpuidle.h | 11 +++
 arch/powerpc/include/asm/paca.h    |  7
 arch/powerpc/kernel/asm-offsets.c  |  8 +
 arch/powerpc/kernel/idle_book3s.S  | 65 --
 4 files changed, 89 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 52586f9..8a174cb 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -67,6 +67,17 @@
 #define ERR_DEEP_STATE_ESL_MISMATCH	-2
 
 #ifndef __ASSEMBLY__
+/* Additional SPRs that need to be saved/restored during stop */
+struct stop_sprs {
+	u64 pid;
+	u64 ldbar;
+	u64 fscr;
+	u64 hfscr;
+	u64 mmcr1;
+	u64 mmcr2;
+	u64 mmcra;
+};
+
 extern u32 pnv_fastsleep_workaround_at_entry[];
 extern u32 pnv_fastsleep_workaround_at_exit[];
 
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index dc88a31..04b60af 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -31,6 +31,7 @@
 #endif
 #include
 #include
+#include
 
 register struct paca_struct *local_paca asm("r13");
 
@@ -183,6 +184,12 @@ struct paca_struct {
 	struct paca_struct **thread_sibling_pacas;
 	/* The PSSCR value that the kernel requested before going to stop */
 	u64 requested_psscr;
+
+	/*
+	 * Save area for additional SPRs that need to be
+	 * saved/restored during cpuidle stop.
+	 */
+	struct stop_sprs stop_sprs;
 #endif
 
 #ifdef CONFIG_PPC_STD_MMU_64
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 6e95c2c..8cfb20e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -746,6 +746,14 @@ int main(void)
 	OFFSET(PACA_SUBCORE_SIBLING_MASK, paca_struct, subcore_sibling_mask);
 	OFFSET(PACA_SIBLING_PACA_PTRS, paca_struct, thread_sibling_pacas);
 	OFFSET(PACA_REQ_PSSCR, paca_struct, requested_psscr);
+#define STOP_SPR(x, f)	OFFSET(x, paca_struct, stop_sprs.f)
+	STOP_SPR(STOP_PID, pid);
+	STOP_SPR(STOP_LDBAR, ldbar);
+	STOP_SPR(STOP_FSCR, fscr);
+	STOP_SPR(STOP_HFSCR, hfscr);
+	STOP_SPR(STOP_MMCR1, mmcr1);
+	STOP_SPR(STOP_MMCR2, mmcr2);
+	STOP_SPR(STOP_MMCRA, mmcra);
 #endif
 
 	DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 516ebef..4621568 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -85,7 +85,61 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
 	std	r3,_WORT(r1)
 	mfspr	r3,SPRN_WORC
 	std	r3,_WORC(r1)
+/*
+ * On POWER9, there are idle states such as stop4, invoked via cpuidle,
+ * that lose hypervisor resources. In such cases, we need to save
+ * additional SPRs before entering those idle states so that they can
+ * be restored to their older values on wakeup from the idle state.
+ *
+ * On POWER8, the only such deep idle state is winkle which is used
+ * only in the context of CPU-Hotplug, where these additional SPRs are
+ * reinitialized to a sane value. Hence there is no need to save/restore
+ * these SPRs.
+ */
+BEGIN_FTR_SECTION
+	blr
+END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
+
+power9_save_additional_sprs:
+	mfspr	r3, SPRN_PID
+	mfspr	r4, SPRN_LDBAR
+	std	r3, STOP_PID(r13)
+	std	r4, STOP_LDBAR(r13)
+	mfspr	r3, SPRN_FSCR
+	mfspr	r4, SPRN_HFSCR
+	std	r3, STOP_FSCR(r13)
+	std	r4, STOP_HFSCR(r13)
+
+	mfspr	r3, SPRN_MMCRA
+	mfspr	r4, SPRN_MMCR1
+	std	r3, STOP_MMCRA(r13)
+	std	r4, STOP_MMCR1(r13)
+
+	mfspr	r3, SPRN_MMCR2
+	std	r3, STOP_MMCR2(r13)
+	blr
+
+power9_restore_additional_sprs:
+	ld	r3,_LPCR(r1)
+	ld	r4, STOP_PID(r13)
+	mtspr	SPRN_LPCR,r3
+
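The save-area idea in the patch above can be sketched in portable C, purely for illustration. struct stop_sprs keeps the field order from the patch; everything else (struct toy_paca, struct toy_cpu, the toy_* helpers, and this one-argument STOP_SPR macro) is a hypothetical stand-in. The real code performs the save and restore in assembly with mfspr/mtspr, using offsets generated by asm-offsets.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as struct stop_sprs in the patch. */
struct stop_sprs {
	uint64_t pid, ldbar, fscr, hfscr, mmcr1, mmcr2, mmcra;
};

/* Hypothetical, heavily reduced stand-in for struct paca_struct. */
struct toy_paca {
	uint64_t requested_psscr;
	struct stop_sprs stop_sprs;
};

/* Byte offset of one saved SPR inside the toy PACA (cf. STOP_SPR in asm-offsets.c). */
#define STOP_SPR(f) offsetof(struct toy_paca, stop_sprs.f)

/* Hypothetical model of the SPRs that a deep stop state loses. */
struct toy_cpu {
	uint64_t pid, ldbar, fscr, hfscr, mmcr1, mmcr2, mmcra;
};

/* Copy the live register values into the save area before idling. */
static void toy_save_additional_sprs(const struct toy_cpu *cpu, struct toy_paca *paca)
{
	struct stop_sprs *s = &paca->stop_sprs;

	s->pid = cpu->pid;	s->ldbar = cpu->ldbar;
	s->fscr = cpu->fscr;	s->hfscr = cpu->hfscr;
	s->mmcra = cpu->mmcra;	s->mmcr1 = cpu->mmcr1;
	s->mmcr2 = cpu->mmcr2;
}

/* Model the loss of hypervisor state in a deep idle state. */
static void toy_lose_state(struct toy_cpu *cpu)
{
	*cpu = (struct toy_cpu){ 0 };
}

/* Write the saved values back after wakeup. */
static void toy_restore_additional_sprs(struct toy_cpu *cpu, const struct toy_paca *paca)
{
	const struct stop_sprs *s = &paca->stop_sprs;

	cpu->pid = s->pid;	cpu->ldbar = s->ldbar;
	cpu->fscr = s->fscr;	cpu->hfscr = s->hfscr;
	cpu->mmcra = s->mmcra;	cpu->mmcr1 = s->mmcr1;
	cpu->mmcr2 = s->mmcr2;
}
```

A caller would invoke toy_save_additional_sprs() before entering the idle state and toy_restore_additional_sprs() on wakeup. Using a named structure rather than an array (the v2 to v3 change mentioned in the cover letter) lets each offset be derived from a field name instead of a hand-maintained index.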
[v4 PATCH 0/2] powerpc/powernv: Enable stop4 via cpuidle
From: "Gautham R. Shenoy"

Hi,

This is the fourth iteration of the patchset to enable exploitation of
stop4 idle state on POWER9 via cpuidle.

The earlier versions can be found here:
[v3]: https://lkml.org/lkml/2017/7/21/209
[v2]: https://lkml.org/lkml/2017/7/19/152
[v1]: https://lkml.org/lkml/2017/7/18/691

The changes across the versions are as follows:

v3 --> v4:
    - Modified the subject line to be consistent with the
      convention. No changes to code.

v2 --> v3:
    - Use a structure instead of an array for the stop sprs save area.
    - Name the offsets into the paca->stop_sprs as STOP_XXX instead
      of PACA_XXX.
    - Add comments in the assembly code explaining why
      saving/restoring is not needed on POWER8.
    - Program the LPCR during platform idle entry/exit on both
      POWER8 and POWER9 as suggested by Nicholas Piggin.

v1 --> v2:
    - Move the LPCR manipulations for CPU-Hotplug into
      arch/powerpc/platforms/powernv/idle.c as per Nicholas Piggin's
      suggestion.

== Description ===

The stop4 idle state on POWER9 is a deep idle state which loses
hypervisor resources, but whose latency is low enough that it can be
exposed via cpuidle.

Until now, the deep idle states which lose hypervisor resources (eg:
winkle) were only exposed via CPU-Hotplug. Hence currently on wakeup
from such states, barring a few SPRs which need to be restored to
their older value, the rest of the SPRs are reinitialized to the values
they had at boot time.

When stop4 is used in the context of cpuidle, we want these additional
SPRs to be restored to their older value, to ensure that the context
on the CPU coming back from idle is the same as it was before going idle.

Additionally, the CPU which is in stop4 while idling can be woken up
by the decrementer interrupts. So we need to ensure that the LPCR is
programmed with PECE1 bit cleared via the stop-api only for the
CPU-Hotplug case and not for cpuidle.

The two patches in the series address this problem.

Gautham R. Shenoy (2):
  powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle
  powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug

 arch/powerpc/include/asm/cpuidle.h    | 11 ++
 arch/powerpc/include/asm/paca.h       |  7
 arch/powerpc/kernel/asm-offsets.c     |  8 +
 arch/powerpc/kernel/idle_book3s.S     | 65 +--
 arch/powerpc/platforms/powernv/idle.c | 34 +-
 arch/powerpc/platforms/powernv/smp.c  |  8 -
 6 files changed, 122 insertions(+), 11 deletions(-)

-- 
1.9.4
[v4 PATCH 2/2] powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug
From: "Gautham R. Shenoy"

Currently we use the stop-api provided by the firmware to program the
SLW engine to restore the values of hypervisor resources that get lost
on deeper idle states (such as winkle). Since the deep states were
only used for CPU-Hotplug on POWER8 systems, we would program the LPCR
to have the PECE1 bit cleared, since Hotplugged CPUs shouldn't be
spuriously woken up by the decrementer.

On POWER9, some of the deep platform idle states such as stop4 can be
used in cpuidle as well. In this case, we want the CPU in stop4 to be
woken up by the decrementer when some timer on the CPU expires.

In this patch, we program the stop-api for LPCR with PECE1 bit cleared
only when we are offlining the CPU and set it back once the CPU is
online.

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/platforms/powernv/idle.c | 34 +-
 arch/powerpc/platforms/powernv/smp.c  |  8
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 2abee07..a1296e7 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -68,7 +68,7 @@ static int pnv_save_sprs_for_deep_states(void)
 	 * all cpus at boot. Get these reg values of current cpu and use the
 	 * same across all cpus.
 	 */
-	uint64_t lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1;
+	uint64_t lpcr_val = mfspr(SPRN_LPCR);
 	uint64_t hid0_val = mfspr(SPRN_HID0);
 	uint64_t hid1_val = mfspr(SPRN_HID1);
 	uint64_t hid4_val = mfspr(SPRN_HID4);
@@ -355,6 +355,14 @@ void power9_idle(void)
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
+static void pnv_program_cpu_hotplug_lpcr(unsigned int cpu, u64 lpcr_val)
+{
+	u64 pir = get_hard_smp_processor_id(cpu);
+
+	mtspr(SPRN_LPCR, lpcr_val);
+	opal_slw_set_reg(pir, SPRN_LPCR, lpcr_val);
+}
+
 /*
  * pnv_cpu_offline: A function that puts the CPU into the deepest
  * available platform idle state on a CPU-Offline.
@@ -364,6 +372,20 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
 {
 	unsigned long srr1;
 	u32 idle_states = pnv_get_supported_cpuidle_states();
+	u64 lpcr_val;
+
+	/*
+	 * We don't want to take decrementer interrupts while we are
+	 * offline, so clear LPCR:PECE1. We keep PECE2 (and
+	 * LPCR_PECE_HVEE on P9) enabled as to let IPIs in.
+	 *
+	 * If the CPU gets woken up by a special wakeup, ensure that
+	 * the SLW engine sets LPCR with decrementer bit cleared, else
+	 * the CPU will come back to the kernel due to a spurious
+	 * wakeup.
+	 */
+	lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1;
+	pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
 
 	__ppc64_runlatch_off();
 
@@ -394,6 +416,16 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
 
 	__ppc64_runlatch_on();
 
+	/*
+	 * Re-enable decrementer interrupts in LPCR.
+	 *
+	 * Further, we want stop states to be woken up by decrementer
+	 * for non-hotplug cases. So program the LPCR via stop api as
+	 * well.
+	 */
+	lpcr_val = mfspr(SPRN_LPCR) | (u64)LPCR_PECE1;
+	pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
+
 	return srr1;
 }
 #endif
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 40dae96..536b07b 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -164,12 +164,6 @@ static void pnv_smp_cpu_kill_self(void)
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		wmask = SRR1_WAKEMASK_P8;
 
-	/* We don't want to take decrementer interrupts while we are offline,
-	 * so clear LPCR:PECE1. We keep PECE2 (and LPCR_PECE_HVEE on P9)
-	 * enabled as to let IPIs in.
-	 */
-	mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1);
-
 	while (!generic_check_cpu_restart(cpu)) {
 		/*
 		 * Clear IPI flag, since we don't handle IPIs while
@@ -219,8 +213,6 @@ static void pnv_smp_cpu_kill_self(void)
 
 	}
 
-	/* Re-enable decrementer interrupts */
-	mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_PECE1);
 	DBG("CPU%d coming online...\n", cpu);
 }
 
-- 
1.9.4
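The offline/online handling of the PECE1 bit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: TOY_LPCR_PECE1 is an illustrative bit value, not taken from this patch, and struct toy_cpu_state with its two fields is a hypothetical stand-in for the live LPCR register and the value the firmware's SLW engine would restore after a deep idle state.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit value for "wake on decrementer"; an assumption, not from the patch. */
#define TOY_LPCR_PECE1 0x2000ULL

/* Hypothetical model of one CPU's LPCR state. */
struct toy_cpu_state {
	uint64_t lpcr;		/* live register value */
	uint64_t slw_lpcr;	/* value the firmware would restore after deep idle */
};

/* Toy analogue of pnv_program_cpu_hotplug_lpcr(): update both copies together. */
static void toy_program_hotplug_lpcr(struct toy_cpu_state *c, uint64_t val)
{
	c->lpcr = val;		/* mtspr(SPRN_LPCR, val) in the patch */
	c->slw_lpcr = val;	/* opal_slw_set_reg(pir, SPRN_LPCR, val) in the patch */
}

/* Entering offline: the decrementer must not wake the CPU. */
static void toy_cpu_offline_enter(struct toy_cpu_state *c)
{
	toy_program_hotplug_lpcr(c, c->lpcr & ~TOY_LPCR_PECE1);
}

/* Coming back online: allow decrementer wakeups again (needed for cpuidle). */
static void toy_cpu_offline_exit(struct toy_cpu_state *c)
{
	toy_program_hotplug_lpcr(c, c->lpcr | TOY_LPCR_PECE1);
}
```

Keeping the register and the firmware copy in step is the point of the helper: if only the live register were updated, a wakeup that goes through the firmware restore path could reinstate a stale LPCR value.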
[PATCH -mm -v4 5/5] mm, swap: Don't use VMA based swap readahead if HDD is used as swap
From: Huang Ying

VMA based swap readahead will read ahead the virtual pages that are
contiguous in the virtual address space, while the original swap
readahead will read ahead the swap slots that are contiguous in the
swap device. Although VMA based swap readahead picks more accurate
swap slots to read ahead, it will trigger more small random reads,
which may cause the performance of an HDD (hard disk) to degrade
heavily, and the cost may finally exceed the benefit.

To avoid the issue, in this patch, if an HDD is used as swap, the VMA
based swap readahead will be disabled, and the original swap readahead
will be used instead.

Signed-off-by: "Huang, Ying"
Cc: Johannes Weiner
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Fengguang Wu
Cc: Tim Chen
Cc: Dave Hansen
---
 include/linux/swap.h | 11 ++-
 mm/swapfile.c        |  8 +++-
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 61d63379e956..9c4ae6f14eea 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -400,16 +400,17 @@ extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 					   struct vm_fault *vmf,
 					   struct vma_swap_readahead *swap_ra);
 
-static inline bool swap_use_vma_readahead(void)
-{
-	return READ_ONCE(swap_vma_readahead);
-}
-
 /* linux/mm/swapfile.c */
 extern atomic_long_t nr_swap_pages;
 extern long total_swap_pages;
+extern atomic_t nr_rotate_swap;
 extern bool has_usable_swap(void);
 
+static inline bool swap_use_vma_readahead(void)
+{
+	return READ_ONCE(swap_vma_readahead) && !atomic_read(&nr_rotate_swap);
+}
+
 /* Swap 50% full? Release swapcache more aggressively.. */
 static inline bool vm_swap_full(void)
 {
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 42eff9e4e972..4f8b3e08a547 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -96,6 +96,8 @@ static DECLARE_WAIT_QUEUE_HEAD(proc_poll_wait);
 /* Activity counter to indicate that a swapon or swapoff has occurred */
 static atomic_t proc_poll_event = ATOMIC_INIT(0);
 
+atomic_t nr_rotate_swap = ATOMIC_INIT(0);
+
 static inline unsigned char swap_count(unsigned char ent)
 {
 	return ent & ~SWAP_HAS_CACHE;	/* may include SWAP_HAS_CONT flag */
@@ -2569,6 +2571,9 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 	if (p->flags & SWP_CONTINUED)
 		free_swap_count_continuations(p);
 
+	if (!p->bdev || !blk_queue_nonrot(bdev_get_queue(p->bdev)))
+		atomic_dec(&nr_rotate_swap);
+
 	mutex_lock(&swapon_mutex);
 	spin_lock(&swap_lock);
 	spin_lock(&p->lock);
@@ -3145,7 +3150,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 			cluster = per_cpu_ptr(p->percpu_cluster, cpu);
 			cluster_set_null(&cluster->index);
 		}
-	}
+	} else
+		atomic_inc(&nr_rotate_swap);
 
 	error = swap_cgroup_swapon(p->type, maxpages);
 	if (error)
-- 
2.11.0
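The gating logic of this patch can be sketched in userspace as follows. This
is only an illustration under stated assumptions: the demo_ names are
hypothetical, C11 atomics stand in for the kernel's atomic_t, and a boolean
argument stands in for the blk_queue_nonrot() check. The idea is a counter of
active rotational swap devices; VMA based readahead is used only while that
counter is zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the nr_rotate_swap counter added by the patch. */
static atomic_int demo_nr_rotate_swap;

/* Hypothetical model of the swap_vma_readahead switch. */
static bool demo_vma_readahead_enabled = true;

/* A swap device is activated; count it if it is rotational. */
static void demo_swapon(bool rotational)
{
	if (rotational)
		atomic_fetch_add(&demo_nr_rotate_swap, 1);
}

/* A swap device is deactivated; uncount it if it was rotational. */
static void demo_swapoff(bool rotational)
{
	if (rotational) {
		int old = atomic_fetch_sub(&demo_nr_rotate_swap, 1);

		assert(old > 0);	/* swapoff must pair with an earlier swapon */
		(void)old;
	}
}

/* VMA based readahead is used only if enabled and no rotational swap is active. */
static bool demo_use_vma_readahead(void)
{
	return demo_vma_readahead_enabled &&
	       atomic_load(&demo_nr_rotate_swap) == 0;
}
```

A single counter keeps the check on the fault path cheap: one atomic read,
with no need to walk the list of swap devices.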
[PATCH -mm -v4 4/5] mm, swap: Add sysfs interface for VMA based swap readahead
From: Huang Ying

The sysfs interface to control the VMA based swap readahead is added
as follows:

/sys/kernel/mm/swap/vma_ra_enabled

Enable the VMA based swap readahead algorithm, or use the original
global swap readahead algorithm.

/sys/kernel/mm/swap/vma_ra_max_order

Set the max order of the readahead window size for the VMA based swap
readahead algorithm.

The corresponding ABI documentation is added too.

Signed-off-by: "Huang, Ying"
Cc: Johannes Weiner
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Fengguang Wu
Cc: Tim Chen
Cc: Dave Hansen
---
 Documentation/ABI/testing/sysfs-kernel-mm-swap | 26 +
 mm/swap_state.c                                | 80 ++
 2 files changed, 106 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-swap

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-swap b/Documentation/ABI/testing/sysfs-kernel-mm-swap
new file mode 100644
index ..587db52084c7
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-swap
@@ -0,0 +1,26 @@
+What:		/sys/kernel/mm/swap/
+Date:		August 2017
+Contact:	Linux memory management mailing list
+Description:	Interface for swapping
+
+What:		/sys/kernel/mm/swap/vma_ra_enabled
+Date:		August 2017
+Contact:	Linux memory management mailing list
+Description:	Enable/disable VMA based swap readahead.
+
+		If set to true, the VMA based swap readahead algorithm
+		will be used for swappable anonymous pages mapped in a
+		VMA, and the global swap readahead algorithm will be
+		still used for tmpfs etc. other users.  If set to
+		false, the global swap readahead algorithm will be
+		used for all swappable pages.
+
+What:		/sys/kernel/mm/swap/vma_ra_max_order
+Date:		August 2017
+Contact:	Linux memory management mailing list
+Description:	The max readahead size in order for VMA based swap readahead
+
+		VMA based swap readahead algorithm will readahead at
+		most 1 << max_order pages for each readahead.  The
+		real readahead size for each readahead will be scaled
+		according to the estimation algorithm.
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3885fef7bdf5..71ce2d1ccbf7 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -751,3 +751,83 @@ struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 	return read_swap_cache_async(fentry, gfp_mask, vma, vmf->address,
 				     swap_ra->win == 1);
 }
+
+#ifdef CONFIG_SYSFS
+static ssize_t vma_ra_enabled_show(struct kobject *kobj,
+				     struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%s\n", swap_vma_readahead ? "true" : "false");
+}
+static ssize_t vma_ra_enabled_store(struct kobject *kobj,
+				      struct kobj_attribute *attr,
+				      const char *buf, size_t count)
+{
+	if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1))
+		swap_vma_readahead = true;
+	else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1))
+		swap_vma_readahead = false;
+	else
+		return -EINVAL;
+
+	return count;
+}
+static struct kobj_attribute vma_ra_enabled_attr =
+	__ATTR(vma_ra_enabled, 0644, vma_ra_enabled_show,
+	       vma_ra_enabled_store);
+
+static ssize_t vma_ra_max_order_show(struct kobject *kobj,
+				       struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", swap_ra_max_order);
+}
+static ssize_t vma_ra_max_order_store(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					const char *buf, size_t count)
+{
+	int err, v;
+
+	err = kstrtoint(buf, 10, &v);
+	if (err || v > SWAP_RA_ORDER_CEILING || v <= 0)
+		return -EINVAL;
+
+	swap_ra_max_order = v;
+
+	return count;
+}
+static struct kobj_attribute vma_ra_max_order_attr =
+	__ATTR(vma_ra_max_order, 0644, vma_ra_max_order_show,
+	       vma_ra_max_order_store);
+
+static struct attribute *swap_attrs[] = {
+	&vma_ra_enabled_attr.attr,
+	&vma_ra_max_order_attr.attr,
+	NULL,
+};
+
+static struct attribute_group swap_attr_group = {
+	.attrs = swap_attrs,
+};
+
+static int __init swap_init_sysfs(void)
+{
+	int err;
+	struct kobject *swap_kobj;
+
+	swap_kobj = kobject_create_and_add("swap", mm_kobj);
+	if (!swap_kobj) {
+		pr_err("failed to create swap kobject\n");
+		return -ENOMEM;
+	}
+	err = sysfs_create_group(swap_kobj, &swap_attr_group);
+	if (err) {
+		pr_err("failed to register swap group\n");
+		goto delete_obj;
+	}
+	return
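The input validation done by the two store handlers can be tried out in
userspace with a sketch like the one below. It is an approximation, not the
kernel code: the demo_ names are hypothetical, strtol() stands in for
kstrtoint() (their handling of leading whitespace differs), and the ceiling
of 5 is the 64-bit value of SWAP_RA_ORDER_CEILING from patch 3/5, assumed
here for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed ceiling for illustration (64-bit value in patch 3/5). */
#define DEMO_RA_ORDER_CEILING 5

/* Accept "true"/"1" or "false"/"0" by prefix; returns 0 or -EINVAL. */
static int demo_parse_enabled(const char *buf, bool *out)
{
	assert(buf != NULL && out != NULL);
	if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1))
		*out = true;
	else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1))
		*out = false;
	else
		return -EINVAL;
	return 0;
}

/* Accept a decimal order in 1..DEMO_RA_ORDER_CEILING; returns 0 or -EINVAL. */
static int demo_parse_max_order(const char *buf, int *out)
{
	char *end;
	long v;

	assert(buf != NULL && out != NULL);
	errno = 0;
	v = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;		/* overflow or no digits at all */
	if (*end == '\n')
		end++;			/* tolerate the newline that echo appends */
	if (*end != '\0' || v <= 0 || v > DEMO_RA_ORDER_CEILING)
		return -EINVAL;
	*out = (int)v;
	return 0;
}
```

With an accepted order of 3, the readahead window is at most 1 << 3 = 8
pages, matching the "1 << max_order" wording of the ABI document.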
[PATCH -mm -v4 1/5] mm, swap: Add swap readahead hit statistics
From: Huang Ying

The statistics for total readahead pages and total readahead hits are
recorded and exported via the following counters in /proc/vmstat:

swap_ra
swap_ra_hit

With them, the efficiency of the swap readahead can be measured, so
that the swap readahead algorithm and parameters can be tuned
accordingly.

Signed-off-by: "Huang, Ying"
Cc: Johannes Weiner
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Fengguang Wu
Cc: Tim Chen
Cc: Dave Hansen
---
 include/linux/vm_event_item.h | 2 ++
 mm/swap_state.c               | 9 +++--
 mm/vmstat.c                   | 3 +++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index e02820fc2861..27e3339cfd65 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -106,6 +106,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		VMACACHE_FIND_HITS,
 		VMACACHE_FULL_FLUSHES,
 #endif
+		SWAP_RA,
+		SWAP_RA_HIT,
 		NR_VM_EVENT_ITEMS
 };
 
diff --git a/mm/swap_state.c b/mm/swap_state.c
index b68c93014f50..d1bdb31cab13 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -305,8 +305,10 @@ struct page * lookup_swap_cache(swp_entry_t entry)
 
 	if (page && likely(!PageTransCompound(page))) {
 		INC_CACHE_INFO(find_success);
-		if (TestClearPageReadahead(page))
+		if (TestClearPageReadahead(page)) {
 			atomic_inc(&swapin_readahead_hits);
+			count_vm_event(SWAP_RA_HIT);
+		}
 	}
 
 	INC_CACHE_INFO(find_total);
@@ -516,8 +518,11 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 						gfp_mask, vma, addr, false);
 		if (!page)
 			continue;
-		if (offset != entry_offset && likely(!PageTransCompound(page)))
+		if (offset != entry_offset &&
+		    likely(!PageTransCompound(page))) {
 			SetPageReadahead(page);
+			count_vm_event(SWAP_RA);
+		}
 		put_page(page);
 	}
 	blk_finish_plug(&plug);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index ba9b202e8500..4c2121a8b877 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1095,6 +1095,9 @@ const char * const vmstat_text[] = {
 	"vmacache_find_hits",
 	"vmacache_full_flushes",
 #endif
+
+	"swap_ra",
+	"swap_ra_hit",
 #endif /* CONFIG_VM_EVENTS_COUNTERS */
 };
 #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA */
-- 
2.11.0
[PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead
From: Huang Ying

The swap readahead is an important mechanism to reduce the swap-in
latency. Although a pure sequential memory access pattern isn't very
common for anonymous memory, spatial locality is still considered
valid.

In the original swap readahead implementation, the consecutive blocks
in the swap device are read ahead based on a global spatial locality
estimation. But the consecutive blocks in the swap device just reflect
the order of page reclaiming; they don't necessarily reflect the access
pattern in virtual memory. And the different tasks in the system may
have different access patterns, which makes the global spatial
locality estimation incorrect.

In this patch, when a page fault occurs, the virtual pages near the
fault address will be read ahead instead of the swap slots near the
faulting swap slot in the swap device. This avoids reading ahead
unrelated swap slots. At the same time, the swap readahead is changed
to work per-VMA instead of globally, so that the different access
patterns of different VMAs can be distinguished, and a different
readahead policy can be applied accordingly. The original core
readahead detection and scaling algorithm is reused, because it is an
effective algorithm to detect spatial locality.

The tests and results are as follows.

Common test condition
=====================

Test Machine: Xeon E5 v3 (2 sockets, 72 threads, 32G RAM)
Swap device: NVMe disk

Micro-benchmark with combined access pattern
============================================

vm-scalability, sequential swap test case, 4 processes to eat 50G
virtual memory space, repeat the sequential memory writing until 300
seconds. The first round of writing will trigger swap out, the
following rounds will trigger sequential swap in and out.

At the same time, run the vm-scalability random swap test case in the
background, 8 processes to eat 30G virtual memory space, repeat the
random memory write until 300 seconds. This will trigger random
swap-in in the background.

This is a combined workload with sequential and random memory
accesses at the same time. The result (for the sequential workload) is
as follows:

                        Base            Optimized
                        ----            ---------
throughput              345413 KB/s     414029 KB/s (+19.9%)
latency.average         97.14 us        61.06 us (-37.1%)
latency.50th            2 us            1 us
latency.60th            2 us            1 us
latency.70th            98 us           2 us
latency.80th            160 us          2 us
latency.90th            260 us          217 us
latency.95th            346 us          369 us
latency.99th            1.34 ms         1.09 ms
ra_hit%                 52.69%          99.98%

The original swap readahead algorithm is confused by the background
random access workload, so its readahead hit rate is lower. The
VMA based readahead algorithm works much better.

Linpack
=======

The test memory size is bigger than RAM to trigger swapping.

                        Base            Optimized
                        ----            ---------
elapsed_time            393.49 s        329.88 s (-16.2%)
ra_hit%                 86.21%          98.82%

The Linpack score shows no visible change between the base and
optimized kernels. But the elapsed time is reduced and the readahead
hit rate is improved, so the optimized kernel runs better in the
startup and tear down stages. And the absolute value of the readahead
hit rate is high, which shows that spatial locality is still valid in
some practical workloads.

Signed-off-by: "Huang, Ying"
Cc: Johannes Weiner
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Fengguang Wu
Cc: Tim Chen
Cc: Dave Hansen
---
 include/linux/mm_types.h |   1 +
 include/linux/swap.h     |  57 -
 mm/memory.c              |  23 +++--
 mm/shmem.c               |   2 +-
 mm/swap_state.c          | 215 +++
 5 files changed, 273 insertions(+), 25 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7f384bb62d8e..5c02027050a2 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -335,6 +335,7 @@ struct vm_area_struct {
 	struct file * vm_file;		/* File we map to (can be NULL). */
 	void * vm_private_data;		/* was vm_pte (shared mem) */
 
+	atomic_long_t swap_readahead_info;
 #ifndef CONFIG_MMU
 	struct vm_region *vm_region;	/* NOMMU mapping region */
 #endif
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 76f1632eea5a..61d63379e956 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -251,6 +251,25 @@ struct swap_info_struct {
 	struct swap_cluster_list discard_clusters; /* discard clusters list */
 };
 
+#ifdef CONFIG_64BIT
+#define SWAP_RA_ORDER_CEILING	5
+#else
+/* Avoid stack overflow, because we need to save part of page table */
+#define SWAP_RA_ORDER_CEILING	3
+#define SWAP_RA_PTE_CACHE_SIZE	(1 << SWAP_RA_ORDER_CEILING)
+#endif
+
+struct
[PATCH -mm -v4 2/5] mm, swap: Fix swap readahead marking
From: Huang Ying

In the original implementation, it is possible that existing pages in
the swap cache (not newly read ahead) are marked as readahead pages.
This makes the swap readahead statistics wrong and influences the swap
readahead algorithm too.

This is fixed by marking a page as a readahead page only if it is
newly allocated and read from the disk.

When testing with linpack, after the fix the swap readahead hit rate
increased from ~66% to ~86%.

Signed-off-by: "Huang, Ying"
Cc: Johannes Weiner
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Fengguang Wu
Cc: Tim Chen
Cc: Dave Hansen
---
 mm/swap_state.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index d1bdb31cab13..a901afe9da61 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -498,7 +498,7 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	unsigned long start_offset, end_offset;
 	unsigned long mask;
 	struct blk_plug plug;
-	bool do_poll = true;
+	bool do_poll = true, page_allocated;
 
 	mask = swapin_nr_pages(offset) - 1;
 	if (!mask)
@@ -514,14 +514,18 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	blk_start_plug(&plug);
 	for (offset = start_offset; offset <= end_offset ; offset++) {
 		/* Ok, do the async read-ahead now */
-		page = read_swap_cache_async(swp_entry(swp_type(entry), offset),
-						gfp_mask, vma, addr, false);
+		page = __read_swap_cache_async(
+			swp_entry(swp_type(entry), offset),
+			gfp_mask, vma, addr, &page_allocated);
 		if (!page)
 			continue;
-		if (offset != entry_offset &&
-		    likely(!PageTransCompound(page))) {
-			SetPageReadahead(page);
-			count_vm_event(SWAP_RA);
+		if (page_allocated) {
+			swap_readpage(page, false);
+			if (offset != entry_offset &&
+			    likely(!PageTransCompound(page))) {
+				SetPageReadahead(page);
+				count_vm_event(SWAP_RA);
+			}
 		}
 		put_page(page);
 	}
-- 
2.11.0
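The rule this patch introduces (mark only pages that were newly allocated
and read from disk, and never the faulting page itself) can be modelled with
a small userspace sketch. Everything here is hypothetical and for
illustration only: struct demo_page and demo_readahead() are not kernel
interfaces, and a boolean flag stands in for the swap cache lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one slot in the readahead window. */
struct demo_page {
	bool in_cache;	/* page is already present in the swap cache */
	bool readahead;	/* page is marked as a readahead page */
};

/*
 * Walk the readahead window. Pages already in the cache are left alone.
 * Pages that have to be brought in are marked as readahead pages, except
 * the page at fault_idx, which is the fault target itself.
 * Returns the number of pages marked.
 */
static size_t demo_readahead(struct demo_page *pages, size_t n, size_t fault_idx)
{
	size_t marked = 0;

	assert(pages != NULL && fault_idx < n);
	for (size_t i = 0; i < n; i++) {
		bool page_allocated = !pages[i].in_cache;

		if (!page_allocated)
			continue;		/* existing page: do not touch its flag */
		pages[i].in_cache = true;	/* stands in for reading from disk */
		if (i != fault_idx) {
			pages[i].readahead = true;
			marked++;
		}
	}
	return marked;
}
```

Before the fix, the equivalent of the first and last pages in a window like
{cached, new, new, cached} would also have been marked, inflating the
readahead count without any I/O having been issued for them.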
[PATCH -mm -v4 0/5] mm, swap: VMA based swap readahead
The swap readahead is an important mechanism to reduce the swap-in
latency.  Although a pure sequential memory access pattern isn't very
popular for anonymous memory, the space locality is still considered
valid.

In the original swap readahead implementation, the consecutive blocks
in the swap device are read ahead based on the global space locality
estimation.  But the consecutive blocks in the swap device just reflect
the order of page reclaiming and don't necessarily reflect the access
pattern in virtual memory space.  And the different tasks in the system
may have different access patterns, which makes the global space
locality estimation incorrect.

In this patchset, when a page fault occurs, the virtual pages near the
fault address are read ahead instead of the swap slots near the fault
swap slot in the swap device.  This avoids reading ahead unrelated swap
slots.  At the same time, the swap readahead is changed to work per-VMA
instead of globally, so that the different access patterns of the
different VMAs can be distinguished and a different readahead policy
can be applied accordingly.  The original core readahead detection and
scaling algorithm is reused, because it is an effective algorithm to
detect the space locality.

In addition to the swap readahead changes, some new sysfs interface is
added to show the efficiency of the readahead algorithm and some other
swap statistics.

This new implementation will incur more small random reads.  On SSD,
the improved correctness of estimation and readahead target should beat
the potential increased overhead; this is also illustrated in the test
results below.  But on HDD, the overhead may beat the benefit, so the
original implementation will be used by default.

The tests and results are as follows.

Common test condition
=====================

Test Machine: Xeon E5 v3 (2 sockets, 72 threads, 32G RAM)
Swap device: NVMe disk

Micro-benchmark with combined access pattern
============================================

vm-scalability, sequential swap test case, 4 processes to eat 50G
virtual memory space, repeat the sequential memory writing until 300
seconds.  The first round of writing will trigger swap out, the
following rounds will trigger sequential swap in and out.

At the same time, run the vm-scalability random swap test case in the
background, 8 processes to eat 30G virtual memory space, repeat the
random memory write until 300 seconds.  This will trigger random
swap-in in the background.

This is a combined workload with sequential and random memory accessing
at the same time.  The result (for the sequential workload) is as
follows:

                    Base            Optimized
                    ----            ---------
throughput          345413 KB/s     414029 KB/s (+19.9%)
latency.average     97.14 us        61.06 us (-37.1%)
latency.50th        2 us            1 us
latency.60th        2 us            1 us
latency.70th        98 us           2 us
latency.80th        160 us          2 us
latency.90th        260 us          217 us
latency.95th        346 us          369 us
latency.99th        1.34 ms         1.09 ms
ra_hit%             52.69%          99.98%

The original swap readahead algorithm is confused by the background
random access workload, so its readahead hit rate is lower.  The
VMA-based readahead algorithm works much better.

Linpack
=======

The test memory size is bigger than RAM to trigger swapping.

                    Base            Optimized
                    ----            ---------
elapsed_time        393.49 s        329.88 s (-16.2%)
ra_hit%             86.21%          98.82%

The score of the base and optimized kernels shows no visible change.
But the elapsed time is reduced and the readahead hit rate improved, so
the optimized kernel runs better for the startup and tear down stages.
And the absolute value of the readahead hit rate is high, which shows
that the space locality is still valid in some practical workloads.

Changelogs:

v4:

- Rebased on latest -mm tree.
- Remove swap cache statistics interface, because we found that the
  interface for readahead statistics should be sufficient.
- Use /proc/vmstat for swap readahead statistics, because that is the
  interface used by other similar statistics.
- Add ABI document for newly added sysfs interface.

v3:

- Rebased on latest -mm tree
- Use percpu_counter for swap readahead statistics per Dave Hansen's
  comment.

Best Regards,
Huang, Ying
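To make the per-VMA idea in the cover letter concrete, here is a minimal userspace sketch of choosing a readahead window around the fault address and clipping it to the VMA. It is not the patchset's code: the names ra_window and vma_ra_window, the 4 KiB page size, and the "centre the window on the faulting page" policy are all assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed 4 KiB pages; the real value is architecture dependent. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical readahead window, expressed as byte addresses. */
struct ra_window {
	uintptr_t start;	/* inclusive, page aligned */
	uintptr_t end;		/* exclusive, page aligned */
};

/*
 * Centre a window of nr_pages pages on the faulting page, then clip it
 * to the VMA [vma_start, vma_end).  Assumes vma_start and vma_end are
 * page aligned and vma_start <= fault_addr < vma_end.
 */
static struct ra_window vma_ra_window(uintptr_t fault_addr,
				      unsigned long nr_pages,
				      uintptr_t vma_start, uintptr_t vma_end)
{
	uintptr_t fault_page = fault_addr & ~(SKETCH_PAGE_SIZE - 1);
	uintptr_t before = (uintptr_t)(nr_pages / 2) * SKETCH_PAGE_SIZE;
	uintptr_t span = (uintptr_t)nr_pages * SKETCH_PAGE_SIZE;
	struct ra_window w;

	/* Do not start before the VMA begins. */
	if (fault_page - vma_start > before)
		w.start = fault_page - before;
	else
		w.start = vma_start;

	/* Do not run past the end of the VMA. */
	w.end = w.start + span;
	if (w.end > vma_end)
		w.end = vma_end;

	return w;
}
```

The point of the sketch is only that the window is expressed in virtual addresses of one VMA, so pages belonging to unrelated mappings or tasks are never candidates, whatever their position in the swap device.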
[PATCH] ASoC: mediatek: Fix an error checking code
Check the value returned by 'devm_clk_get()' instead of the clock
identifier which can never be an ERR code.

Fixes: d6f3710a56e1 ("ASoC: mediatek: add structure define and clock control for 2701")
Signed-off-by: Christophe JAILLET
---
 sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
index b815ecc6bbf6..affa7fb25dd9 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
@@ -75,7 +75,7 @@ int mt2701_init_clock(struct mtk_base_afe *afe)
 	for (i = 0; i < MT2701_CLOCK_NUM; i++) {
 		afe_priv->clocks[i] = devm_clk_get(afe->dev, aud_clks[i]);
-		if (IS_ERR(aud_clks[i])) {
+		if (IS_ERR(afe_priv->clocks[i])) {
 			dev_warn(afe->dev, "%s devm_clk_get %s fail\n",
 				 __func__, aud_clks[i]);
 			return PTR_ERR(aud_clks[i]);
-- 
2.11.0
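The bug class fixed here can be modelled in userspace. The sketch below re-implements ERR_PTR(), PTR_ERR() and IS_ERR() with the same "top 4095 addresses encode an errno" convention as the kernel's linux/err.h; fake_clk_get(), the clock name "known_clk" and the two init helpers are hypothetical stand-ins, not MediaTek driver code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Userspace model of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_clk;	/* stands in for a real clock object */

/* Hypothetical lookup: only "known_clk" exists. */
static void *fake_clk_get(const char *id)
{
	if (strcmp(id, "known_clk") == 0)
		return &dummy_clk;
	return ERR_PTR(-ENOENT);
}

/* Pre-patch shape: tests the name string, which is an ordinary pointer. */
static long init_clock_buggy(const char *id)
{
	void *clk = fake_clk_get(id);

	(void)clk;
	if (IS_ERR(id))		/* never true for a valid string */
		return PTR_ERR(id);
	return 0;
}

/* Post-patch shape: tests the handle that the lookup returned. */
static long init_clock_fixed(const char *id)
{
	void *clk = fake_clk_get(id);

	if (IS_ERR(clk))
		return PTR_ERR(clk);
	return 0;
}
```

With a clock name that the lookup does not know, init_clock_buggy() reports success while init_clock_fixed() returns -ENOENT. Note that the sketch also returns PTR_ERR() of the handle, whereas the hunk above leaves the return statement as an unchanged context line.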
Re: [PATCH] Allow automatic kernel taint on unsigned module load to be disabled
On Sun, Aug 6, 2017 at 9:47 PM, Rusty Russell wrote:
> Matthew Garrett writes:
>> And then you need an entire trusted userland, at which point you can
>> assert that the modules are trustworthy without having to validate
>> them so you don't need CONFIG_MODULE_SIG anyway.
>
> Yep.  But your patch already gives userland that power, to silently load
> unsigned modules.

Only if sig_enforce isn't set.

>>> With your patch, you don't get tainting in the environment where you can
>>> verify.
>>
>> You don't anyway, do you? Loading has already failed before this point
>> if sig_enforce is set.
>
> No.  You used to get a warning and a taint when you had a kernel
> configured to expect signatures and it didn't get one.  You want to
> remove that warning, to silently accept unsigned modules.

I'm very confused here. If sig_enforce is set, the kernel will refuse
to load an unsigned module - it won't be tainted, modprobe will just
return an error. If sig_enforce is not set, any attacker in a position
to provide unsigned modules is also in a position to just subvert
modprobe, so you aren't in an environment where you can verify
anything. The taint is informational, not any form of security. You're
only able to securely verify module signatures in userland in very
constrained setups.

>>> You'd be better adding a sysctl or equiv. to turn off force loading, and
>>> use that in your can-verify system.
>>
>> I'm not sure what you mean by "force loading" here - if sig_enforce is
>> set, you can't force load an unsigned module. If sig_enforce isn't
>> set, you'll taint regardless of whether or not you force.
>>
>> Wait. Hang on - are you confusing CONFIG_MODULE_SIG with CONFIG_MODVERSIONS?
>
> No, I mean stripping the signatures.  (I thought modprobe could do this
> these days, but apparently not!)
>
> So, you're actually building the same kernel, but building two sets of
> modules: one without signatures, one with?
>
> And when deploying the one with signatures, you're setting sig_enforce.
> On the other, you don't want signatures because um, reasons?  And you
> want to suppress the message?

No. A distribution may ship a kernel with signed modules. In some
configurations, the signatures are irrelevant - there's no mechanism to
verify that the correct kernel was loaded in the first place, so for
all you know the signature validation code has already been removed at
runtime. In that scenario you're fine with users loading unsigned
kernel modules and there's no benefit in tainting the kernel. But the
same kernel may be booted under circumstances where it *is* possible to
validate the kernel, and in those circumstances you want to enforce
module signatures and so sig_enforce is set.

Right now you have two choices:

1) unsigned modules taint the kernel if sig_enforce is false, unsigned
   modules can't be loaded if sig_enforce is true (ie, CONFIG_MODULE_SIG
   is set)
2) unsigned modules do not taint the kernel, unsigned modules can always
   be loaded (ie, CONFIG_MODULE_SIG is unset)

What I want is:

3) unsigned modules do not taint the kernel if sig_enforce is false,
   unsigned modules can't be loaded if sig_enforce is true

This is currently impossible to express, and as a result some
distributions ship with CONFIG_MODULE_SIG disabled in order to avoid
dealing with user questions about why loading locally built modules now
taints the kernel. Being able to build a single kernel that satisfies
more use cases seems like a win.

But maybe there's a cleaner way. How about adding a parameter like
sig_enforce (say taint_on_unsigned) and then adding a config parameter
equivalent to CONFIG_MODULE_SIG_FORCE? That way the default policy can
be set at build time, but can also be overridden by end users who still
want to be able to taint on unsigned module load.
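The policy being asked for can be written down as a small decision function. This is only a sketch of the behaviour described in the mail for a kernel built with CONFIG_MODULE_SIG: taint_on_unsigned is the parameter name floated above, not an existing kernel parameter, and the enum and function names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical outcomes of loading an unsigned module. */
enum load_result {
	LOAD_REJECTED,		/* module load fails with an error */
	LOAD_OK,		/* loaded, kernel not tainted */
	LOAD_OK_TAINTED		/* loaded, kernel tainted */
};

/*
 * Decide what happens to an unsigned module.
 * sig_enforce: existing knob, signatures are mandatory when true.
 * taint_on_unsigned: proposed knob, only matters when not enforcing.
 */
static enum load_result unsigned_module_policy(bool sig_enforce,
					       bool taint_on_unsigned)
{
	if (sig_enforce)
		return LOAD_REJECTED;
	return taint_on_unsigned ? LOAD_OK_TAINTED : LOAD_OK;
}
```

Choice 1) in the mail corresponds to taint_on_unsigned being true, and the requested choice 3) to it being false; in both cases sig_enforce still rejects the module outright.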
Re: [PATCH] devfreq: replace sscanf with kstrtol
Hi,

On 2017-08-07 13:47, gsant...@codeaurora.org wrote:
> On 2017-08-04 20:42, Chanwoo Choi wrote:
>> Hi,
>>
>> On Fri, Aug 4, 2017 at 12:57 PM, wrote:
>>> Hi,
>>>
>>> Adding error checks to devfreq userspace governor, the current
>>> implementation results in setting wrong
>>> frequency when sscanf returns error.
>>>
>>>
>>> From 12e0a347addd70529b2c378299b27b65f0766f99 Mon Sep 17 00:00:00 2001
>>> From: Santosh Mardi
>>> Date: Tue, 25 Jul 2017 18:47:11 +0530
>>> Subject: [PATCH] devfreq: replace sscanf with kstrtol
>>>
>>> store_freq function of devfreq userspace governor
>>> executes further, even if error is returned from sscanf,
>>> this will result in setting up wrong frequency value.
>>>
>>> The usage for the sscanf is only for single variable so
>>> replace sscanf with kstrtol along with error check to
>>> bail out if any error is returned.
>>>
>>> Signed-off-by: Santosh Mardi
>>> ---
>>>  drivers/devfreq/governor_userspace.c | 5 -
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/devfreq/governor_userspace.c
>>> b/drivers/devfreq/governor_userspace.c
>>> index 77028c2..a84796d 100644
>>> --- a/drivers/devfreq/governor_userspace.c
>>> +++ b/drivers/devfreq/governor_userspace.c
>>> @@ -53,12 +53,15 @@ static ssize_t store_freq(struct device *dev, struct
>>> device_attribute *attr,
>>>         mutex_lock(&devfreq->lock);
>>>         data = devfreq->data;
>>>
>>> -       sscanf(buf, "%lu", &wanted);
>>> +       err = kstrtol(buf, 0, &wanted);
>>> +       if (err < 0)
>>> +               goto out;
>>
>> I think that just you can check the return value as following:
>> The other point of devfreq already uses the following style
>> to check the return value of sscanf. I think kstrtol is not necessary.
>>
>>         err = sscanf(buf, "%lu", &wanted);
>>         if (err != 1)
>>                 goto out;
>>
>
> [Santosh] - I agree we need to have this error check as mentioned by you if
> we are scanning an array from the sscanf,
> but in the above code we are only scanning one variable and there is a rule
> in the checkpatch scripts, not to use sscanf if it is a single variable, so I
> need to replace sscanf with strtol

IMHO, even if checkpatch shows the warning about sscanf,
I'd like you to use 'sscanf' in order to maintain the consistency
and readability when handling the sscanf.

For example, drivers/devfreq/devfreq.c and drivers/cpufreq/cpufreq.c
have the same warnings on many points.

>
> I have added all the mails I got as output from scripts/get_maintainer.pl
> scripts in this mail.

Maybe, you missed including me (reviewer) to cc list.

MyungJoo Ham (maintainer:DEVICE FREQUENCY (DEVFREQ))
Kyungmin Park (maintainer:DEVICE FREQUENCY (DEVFREQ))
Chanwoo Choi (reviewer:DEVICE FREQUENCY (DEVFREQ))
linux...@vger.kernel.org (open list:DEVICE FREQUENCY (DEVFREQ))
linux-kernel@vger.kernel.org (open list)

>
>
>> And please use the scripts/get_maintainer.pl
>> in order to prevent the missing of the reviewer.
>>
>>>         data->user_frequency = wanted;
>>>         data->valid = true;
>>>         err = update_devfreq(devfreq);
>>>         if (err == 0)
>>>                 err = count;
>>> +out:
>>>         mutex_unlock(&devfreq->lock);
>>>         return err;
>>>  }
>>> --
>>>
>>> Regards,
>>> Santosh M G.
>>> Qualcomm Innovation Center
>
>
>

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics
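For readers comparing the two styles under discussion, here is a userspace sketch of both checks. strtoul() stands in for the kernel's kstrto*() helpers, which are not available outside the kernel; the function names parse_freq_sscanf and parse_freq_strtoul are hypothetical, and the "allow one trailing newline, reject anything else" rule is an assumption chosen to mimic how the kstrto*() family treats sysfs input.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* sscanf style: succeed only if exactly one conversion happened. */
static int parse_freq_sscanf(const char *buf, unsigned long *out)
{
	unsigned long v;

	if (sscanf(buf, "%lu", &v) != 1)
		return -EINVAL;
	*out = v;
	return 0;
}

/*
 * strtoul style, as a stand-in for a kstrto*() call: additionally
 * rejects trailing garbage and reports overflow.
 */
static int parse_freq_strtoul(const char *buf, unsigned long *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(buf, &end, 0);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == buf)
		return -EINVAL;		/* no digits at all */
	if (*end == '\n')
		end++;			/* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	*out = v;
	return 0;
}
```

Both variants reject input such as "abc", which is the case the patch is about. They differ on input such as "1000abc": the sscanf check accepts it and yields 1000, while the strtoul check rejects it.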
Re: [v3 PATCH 1/2] powernv/powerpc:Save/Restore additional SPRs for stop4 cpuidle
Hi Michael,

On Tue, Aug 01, 2017 at 08:56:18PM +1000, Michael Ellerman wrote:
> "Gautham R. Shenoy" writes:
> >
> > Subject: [v3 PATCH 1/2] powernv/powerpc:Save/Restore additional SPRs for
> > stop4 cpuidle
>
> I know it's not a big deal, but can we agree on the subject format?
>
>   powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle

Sure. I will repost with the updated subject.

>
> cheers
>

--
Thanks and Regards
gautham.
Re: [PATCH 2/2] Save current timestamp part of dmesg while writing oops message to pstore
Hi Kees,

On Tuesday 23 May 2017 02:19 PM, Ankit Kumar wrote:

Hi Kees,

On Tuesday 23 May 2017 05:21 AM, Kees Cook wrote:

On Mon, May 22, 2017 at 3:20 AM, Ankit Kumar wrote:

Currently on panic or Oops, kernel saves the last few bytes from dmesg
buffer to nvram. Usually kdump does capture kernel memory and provide
dmesg logs as well. But in some cases where kdump fails to capture
vmcore, the dmesg buffer stored in nvram/pstore turns out to be very
helpful in analyzing root cause.

Present code creates pstore dump file(/sys/fs/pstore/dmesg-***) based
on timestamp(retrieved from header). Current pstore code creates dump
file (/sys/fs/pstore/dmesg-***) with that timestamp. Dump file can be
analyzed based on file creation time and we can make out whether dump
file has latest data or not.

But when we transfer pstore dump file(/sys/fs/pstore/dmesg-***) to
other machine or collect file using some utilities
(sosreport/supportconfig) then file timestamp gets changed and hence by
looking at device file (dmesg-***) we won't be able to identify whether
dump has latest data or not.

Above issue can be fixed if we also have timestamp(dump creation time)
as initial few bytes while capturing dmesg buffer to pstore dump file
(/sys/fs/pstore/dmesg-***).

This patch enhances pstore write code to also write timestamp as part
of data.

Here is sample log of dump file:(/sys/fs/pstore/dmesg-***)

Oops#1 Part1 [timestamp:1494939359.590463]

While I understand your rationale about possibly losing file timestamp
information in userspace, I think this is a solvable problem on the
collection side. If an additional header is needed, perhaps copy the
dmesg files like this:

for i in dmesg-*; do
    (stat --format=%y /sys/fs/pstore/$i; \
     cat /sys/fs/pstore/$i) > $collect_dir/$i
done

Yes. We can handle this in userspace. But we wanted to see if we can
add this as part of pstore log itself.

One of the primary concerns for pstore is the stored dump size,

I understand. How about adding timestamp to file name itself?
Something like below

How about appending time as part of file name itself. ?
Did you get time to look at above approach.
Code can be something like below piece.

~Ankit

index 792a4e5..0837365 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -349,9 +349,10 @@ int pstore_mkfile(struct dentry *root, struct pstore_record *record)
 	switch (record->type) {
 	case PSTORE_TYPE_DMESG:
-		scnprintf(name, sizeof(name), "dmesg-%s-%lld%s",
+		scnprintf(name, sizeof(name), "dmesg-%s-%lld%s-%lu.%lu",
 			  record->psi->name, record->id,
-			  record->compressed ? ".enc.z" : "");
+			  record->compressed ? ".enc.z" : "",
+			  record->time.tv_sec, record->time.tv_nsec / 1000);
 		break;
 	case PSTORE_TYPE_CONSOLE:

~Ankit
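The file-name idea can be tried out in userspace with snprintf(). The sketch below follows the field order of the proposed scnprintf() call; the function name dmesg_name and the backend name used in the usage note are hypothetical, and the zero-padded microsecond field (%06ld instead of the proposal's plain %lu) is an assumption added here so that a small microsecond value cannot be misread.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build "dmesg-<backend>-<id>[.enc.z]-<sec>.<usec>" into name.
 * Returns what snprintf() returns: the length the full name needs.
 */
static int dmesg_name(char *name, size_t len, const char *backend,
		      long long id, int compressed,
		      long tv_sec, long tv_nsec)
{
	return snprintf(name, len, "dmesg-%s-%lld%s-%ld.%06ld",
			backend, id, compressed ? ".enc.z" : "",
			tv_sec, tv_nsec / 1000);
}
```

For a compressed record with id 1 from a backend called "nvram", stamped 1494939359 s and 590463000 ns, this produces dmesg-nvram-1.enc.z-1494939359.590463. As in the proposed hunk, the timestamp lands after the ".enc.z" suffix, so tools that match on the file extension would need to account for that.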
[PATCHv2] arm64:kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores
Commit 0ee5941 : (x86/panic: replace smp_send_stop() with kdump friendly
version in panic path) introduced crash_smp_send_stop() which is a weak
function and can be overridden by architecture codes to fix the side
effect caused by commit f06e515 : (kernel/panic.c: add
"crash_kexec_post_notifiers" option).

ARM64 architecture uses the weak version function and the problem is
that the weak function simply calls smp_send_stop() which makes other
CPUs offline and takes away the chance to save crash information for
nonpanic CPUs in machine_crash_shutdown() when
crash_kexec_post_notifiers kernel option is enabled.

Calling smp_send_crash_stop() in machine_crash_shutdown() is useless
because all nonpanic CPUs are already offline by smp_send_stop() in this
case and smp_send_crash_stop() only works against online CPUs.

The result is that /proc/vmcore is not available with the error
messages; "Warning: Zero PT_NOTE entries found", "Kdump: vmcore not
initialized".

crash_smp_send_stop() is implemented to fix this problem by replacing
the existing smp_send_crash_stop() and adding a check for multiple
calling to the function. The function (strong symbol version) saves
crash information for nonpanic CPUs and machine_crash_shutdown() tries
to save crash information for nonpanic CPUs only when
crash_kexec_post_notifiers kernel option is disabled.

* crash_kexec_post_notifiers : false

  panic()
    __crash_kexec()
      machine_crash_shutdown()
        crash_smp_send_stop()    <= save crash dump for nonpanic cores

* crash_kexec_post_notifiers : true

  panic()
    crash_smp_send_stop()        <= save crash dump for nonpanic cores
    __crash_kexec()
      machine_crash_shutdown()
        crash_smp_send_stop()    <= just return.

Signed-off-by: Hoeun Ryu
---
v2:
 - replace the existing smp_send_crash_stop() with crash_smp_send_stop()
   and adding called-twice logic to it.
 - modify the commit message

 arch/arm64/include/asm/smp.h      |  2 +-
 arch/arm64/kernel/machine_kexec.c |  2 +-
 arch/arm64/kernel/smp.c           | 12 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 55f08c5..f82b447 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -148,7 +148,7 @@ static inline void cpu_panic_kernel(void)
  */
 bool cpus_are_stuck_in_kernel(void);
 
-extern void smp_send_crash_stop(void);
+extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 
 #endif /* ifndef __ASSEMBLY__ */
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 481f54a..11121f6 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -252,7 +252,7 @@ void machine_crash_shutdown(struct pt_regs *regs)
 	local_irq_disable();
 
 	/* shutdown non-crashing cpus */
-	smp_send_crash_stop();
+	crash_smp_send_stop();
 
 	/* for crashing cpu */
 	crash_save_cpu(regs, smp_processor_id());
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index dc66e6e..73d8f5e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -977,11 +977,21 @@ void smp_send_stop(void)
 }
 
 #ifdef CONFIG_KEXEC_CORE
-void smp_send_crash_stop(void)
+void crash_smp_send_stop(void)
 {
+	static int cpus_stopped;
 	cpumask_t mask;
 	unsigned long timeout;
 
+	/*
+	 * This function can be called twice in panic path, but obviously
+	 * we execute this only once.
+	 */
+	if (cpus_stopped)
+		return;
+
+	cpus_stopped = 1;
+
 	if (num_online_cpus() == 1)
 		return;
 
-- 
2.7.4
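The called-twice logic added to crash_smp_send_stop() is a plain run-once guard. A minimal userspace sketch of the same pattern follows; crash_stop_once() is a hypothetical name, the boolean return value exists only so the behaviour can be observed, and the comment marks where the real function would go on to stop the other CPUs.

```c
#include <stdbool.h>

/*
 * Run-once guard in the style of the patch: a function-local static
 * flag, checked and set on entry.  Returns true only on the call that
 * actually performs the work.
 */
static bool crash_stop_once(void)
{
	static int cpus_stopped;

	if (cpus_stopped)
		return false;	/* second call in the panic path: just return */

	cpus_stopped = 1;

	/* The real function would stop the other CPUs here. */
	return true;
}
```

Like the patch, the sketch uses an ordinary flag with no locking or atomics, so it only guards against repeated calls from the same context, which matches the two call sites shown in the flow above.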
[PATCH] perf report: calculate the average cycles of iterations
The branch history code has a loop detection function. With this, we can
get the number of iterations by counting the removed loops. It would also
be useful to know the average cycles per iteration.

This patch adds up the cycles in the branch entries of removed loops and
saves the result to the next branch entry (e.g. branch entry A).

Finally it displays the iteration number and average cycles at the
"from" of branch entry A.

For example:

perf record -g -j any,save_type ./div
perf report --branch-history --no-children --stdio

--22.63%--main div.c:42 (RET CROSS_2M)
          compute_flag div.c:28 (cycles:2 iter:173115 avg_cycles:2)
          |
           --10.73%--compute_flag div.c:27 (RET CROSS_2M)
                     rand rand.c:28 (cycles:1)
                     rand rand.c:28 (RET CROSS_2M)
                     __random random.c:298 (cycles:1)
                     __random random.c:297 (COND_BWD CROSS_2M)
                     __random random.c:295 (cycles:1)
                     __random random.c:295 (COND_BWD CROSS_2M)
                     __random random.c:295 (cycles:1)
                     __random random.c:295 (RET CROSS_2M)

Signed-off-by: Jin Yao
---
 tools/perf/ui/browsers/hists.c |  8 +---
 tools/perf/ui/stdio/hist.c     | 10 ++---
 tools/perf/util/callchain.c    | 49 +++
 tools/perf/util/callchain.h    |  9 ++---
 tools/perf/util/machine.c      | 88 +-
 5 files changed, 85 insertions(+), 79 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index f4bc246..13dfb0a 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -931,12 +931,8 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
 				       browser->show_dso);
 
 	if (symbol_conf.show_branchflag_count) {
-		if (need_percent)
-			callchain_list_counts__printf_value(node, chain, NULL,
-							    buf, sizeof(buf));
-		else
-			callchain_list_counts__printf_value(NULL, chain, NULL,
-							    buf, sizeof(buf));
+		callchain_list_counts__printf_value(chain, NULL,
+						    buf, sizeof(buf));
 
 		if (asprintf(&alloc_str2, "%s%s", str, buf) < 0)
 			str = "Not enough memory!";
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 5c95b83..8bdb7a5 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -124,12 +124,8 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
 	str = callchain_list__sym_name(chain, bf, sizeof(bf), false);
 
 	if (symbol_conf.show_branchflag_count) {
-		if (!period)
-			callchain_list_counts__printf_value(node, chain, NULL,
-							    buf, sizeof(buf));
-		else
-			callchain_list_counts__printf_value(NULL, chain, NULL,
-							    buf, sizeof(buf));
+		callchain_list_counts__printf_value(chain, NULL,
+						    buf, sizeof(buf));
 
 		if (asprintf(&alloc_str, "%s%s", str, buf) < 0)
 			str = "Not enough memory!";
@@ -313,7 +309,7 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,
 
 			if (symbol_conf.show_branchflag_count)
 				ret += callchain_list_counts__printf_value(
-						NULL, chain, fp, NULL, 0);
+						chain, fp, NULL, 0);
 			ret += fprintf(fp, "\n");
 
 			if (++entries_printed == callchain_param.print_limit)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index f320b07..510b513 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -588,7 +588,7 @@ fill_node(struct callchain_node *node, struct callchain_cursor *cursor)
 			call->cycles_count = cursor_node->branch_flags.cycles;
 			call->iter_count = cursor_node->nr_loop_iter;
-			call->samples_count = cursor_node->samples;
+			call->iter_cycles = cursor_node->iter_cycles;
 		}
 	}
 
@@ -722,7 +722,7 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
 			cnode->cycles_count += node->branch_flags.cycles;
 			cnode->iter_count += node->nr_loop_iter;
-
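The arithmetic described in the commit message (sum the cycles of the branch entries in each removed loop, then divide by the number of iterations) can be sketched in plain C. This is not the perf implementation: struct branch_counts and both helper functions are hypothetical, the field names only borrow iter_count and iter_cycles from the diff, and the sketch assumes that each removed loop counts as exactly one iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-entry counters; field names borrowed from the diff. */
struct branch_counts {
	uint64_t iter_count;	/* number of removed loop iterations */
	uint64_t iter_cycles;	/* cycles summed over those iterations */
};

/* Account one removed loop: add the cycles of its n branch entries. */
static void account_removed_loop(struct branch_counts *c,
				 const uint64_t *entry_cycles, size_t n)
{
	for (size_t i = 0; i < n; i++)
		c->iter_cycles += entry_cycles[i];
	c->iter_count++;
}

/* Average cycles per iteration; 0 when no loop was removed. */
static uint64_t avg_iter_cycles(const struct branch_counts *c)
{
	return c->iter_count ? c->iter_cycles / c->iter_count : 0;
}
```

The division is integer division, so the value is truncated, and the zero check avoids dividing by zero for entries that were never part of a removed loop.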
Re: [PATCH] osq_lock: fix osq_lock queue corruption
On 07/31/2017 10:54 PM, Prateek Sood wrote:
> Fix ordering of link creation between node->prev and prev->next in
> osq_lock(). A case in which the status of optimistic spin queue is
> CPU6->CPU2 in which CPU6 has acquired the lock.
>
>         tail
>           v
>   ,-. <- ,-.
>   |6|    |2|
>   `-' -> `-'
>
> At this point if CPU0 comes in to acquire osq_lock, it will update the
> tail count.
>
>   CPU2                CPU0
>   ----------------------------------
>
>                tail
>                  v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-' -> `-'    `-'
>
> After tail count update if CPU2 starts to unqueue itself from
> optimistic spin queue, it will find updated tail count with CPU0 and
> update CPU2 node->next to NULL in osq_wait_next().
>
>   unqueue-A
>
>                tail
>                  v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
>
>   unqueue-B
>
>   ->tail != curr && !node->next
>
> If reordering of following stores happen then
> prev->next where prev being CPU2 would be updated to point to CPU0 node:
>
>                tail
>                  v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-' -> `-' -> `-'
>
>   osq_wait_next()
>     node->next <- 0
>     xchg(node->next, NULL)
>
>                tail
>                  v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
>
>   unqueue-C
>
> At this point if next instruction
>   WRITE_ONCE(next->prev, prev);
> in CPU2 path is committed before the update of CPU0 node->prev = prev then
> CPU0 node->prev will point to CPU6 node.
>
>                tail
>    V----------.  v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
>    `-------------^
>
> At this point if CPU0 path's node->prev = prev is committed resulting
> in change of CPU0 prev back to CPU2 node. CPU2 node->next is NULL
> currently,
>
>                tail
>                  v
>   ,-. <- ,-. <- ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
>    `-------------^
>
> so if CPU0 gets into unqueue path of osq_lock it will keep spinning
> in infinite loop as condition prev->next == node will never be true.
>
> Signed-off-by: Prateek Sood
> ---
>  kernel/locking/osq_lock.c | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index a316794..9f4afa3 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -109,6 +109,19 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>
>  	prev = decode_cpu(old);
>  	node->prev = prev;
> +
> +	/*
> +	 * osq_lock()			unqueue
> +	 *
> +	 * node->prev = prev		osq_wait_next()
> +	 * WMB				MB
> +	 * prev->next = node		next->prev = prev // unqueue-C
> +	 *
> +	 * Here 'node->prev' and 'next->prev' are the same variable and we need
> +	 * to ensure these stores happen in-order to avoid corrupting the list.
> +	 */
> +	smp_wmb();
> +
>  	WRITE_ONCE(prev->next, node);
>
>  	/*
>

Hi Peter,

I have updated the change log and comments in code.

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
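The store ordering that the patch enforces with smp_wmb() can be approximated in userspace with C11 atomics. The sketch below is an analogy under stated assumptions, not kernel code: struct qnode and link_behind() are hypothetical, and a release fence is used as the nearest portable stand-in for smp_wmb(), which is a kernel primitive with its own semantics.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical queue node with the two links discussed in the patch. */
struct qnode {
	struct qnode *_Atomic next;
	struct qnode *_Atomic prev;
};

/*
 * Link 'node' behind 'prev'. The store to node->prev must become visible
 * before the store to prev->next, because once prev->next is published an
 * unqueueing CPU may write to node->prev (next->prev = prev in unqueue-C).
 */
static void link_behind(struct qnode *node, struct qnode *prev)
{
	atomic_store_explicit(&node->prev, prev, memory_order_relaxed);

	/* Orders the store above before the store below (role of smp_wmb()). */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&prev->next, node, memory_order_relaxed);
}
```

Without the fence, the two relaxed stores could become visible in either order, which is the reordering the change log walks through.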
[PATCH] i2c: imx: Remove a useless test in 'i2c_imx_init_recovery_info()'
'devm_pinctrl_get()' never returns NULL, so this test can be simplified.

Signed-off-by: Christophe JAILLET
---
 drivers/i2c/busses/i2c-imx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index 54a47b40546f..7e84662fe1c0 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -997,7 +997,7 @@ static int i2c_imx_init_recovery_info(struct imx_i2c_struct *i2c_imx,
 	struct i2c_bus_recovery_info *rinfo = &i2c_imx->rinfo;
 
 	i2c_imx->pinctrl = devm_pinctrl_get(&pdev->dev);
-	if (!i2c_imx->pinctrl || IS_ERR(i2c_imx->pinctrl)) {
+	if (IS_ERR(i2c_imx->pinctrl)) {
 		dev_info(&pdev->dev, "can't get pinctrl, bus recovery not supported\n");
 		return PTR_ERR(i2c_imx->pinctrl);
 	}
-- 
2.11.0
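For readers unfamiliar with the error-pointer convention the patch relies on, here is a simplified userspace imitation. These are not the kernel's definitions from include/linux/err.h, only small stand-ins with the same names written for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer lies in the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

In this model NULL is not an error pointer and PTR_ERR(NULL) evaluates to 0, so a NULL test combined with "return PTR_ERR(...)" would have returned 0 on that branch anyway.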
[PATCH v2] rtlwifi: constify rate_control_ops structure
rate_control_ops structure is only passed as an argument to the function
ieee80211_rate_control_{register/unregister}. This argument is of type
const, so declare the structure as const.

Signed-off-by: Bhumika Goyal
---
Changes in v2:
* Change subject line.

 drivers/net/wireless/realtek/rtlwifi/rc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rc.c b/drivers/net/wireless/realtek/rtlwifi/rc.c
index 951d257..02811ed 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rc.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rc.c
@@ -283,7 +283,7 @@ static void rtl_rate_free_sta(void *rtlpriv,
 	kfree(rate_priv);
 }
 
-static struct rate_control_ops rtl_rate_ops = {
+static const struct rate_control_ops rtl_rate_ops = {
 	.name = "rtl_rc",
 	.alloc = rtl_rate_alloc,
 	.free = rtl_rate_free,
-- 
1.9.1
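The constification pattern is easy to reproduce in a small standalone program. Everything in this sketch is hypothetical (struct rate_ops, rate_register(), demo_ops); it only illustrates that an ops table can be declared const when the registration function takes a pointer to const.

```c
#include <stddef.h>

/* Hypothetical ops table, loosely shaped like a rate-control ops struct. */
struct rate_ops {
	const char *name;
	int (*pick)(int current_rate);
};

static const struct rate_ops *active_ops;

/* Registration takes a pointer to const, so callers may pass const tables. */
static int rate_register(const struct rate_ops *ops)
{
	if (ops == NULL || ops->pick == NULL)
		return -1;
	active_ops = ops;
	return 0;
}

static int pick_next(int current_rate)
{
	return current_rate + 1;
}

/* The table is never modified after initialization, so it can be const. */
static const struct rate_ops demo_ops = {
	.name = "demo_rc",
	.pick = pick_next,
};
```

A const table can be placed in read-only data, and an accidental write to one of its function pointers becomes a compile-time error.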
Re: linux-next: manual merge of the net-next tree with the net tree
Hi Neal,

On Sun, 6 Aug 2017 22:21:43 -0400 Neal Cardwell wrote:
>
> > I fixed it up (see below) and can carry the fix as necessary. This
> > is now fixed as far as linux-next is concerned, but any non trivial
> > conflicts should be mentioned to your upstream maintainer when your tree
> > is submitted for merging. You may also want to consider cooperating
> > with the maintainer of the conflicting tree to minimise any particularly
> > complex conflicts.
>
> Sorry about that. Will try to follow that procedure in the future.

The above is a generic statement I add to all these emails. It is aimed
more at the maintainers of the trees involved, not the developers of
patches. I don't think you need to do anything different in these cases
with the "net" and "net-next" trees. Dave Miller will fix up any
conflicts when he next merges the net tree into the net-next tree.

-- 
Cheers,
Stephen Rothwell
Re: [PATCH] devfreq: replace sscanf with kstrtol
On 2017-08-04 20:42, Chanwoo Choi wrote:
> Hi,
>
> On Fri, Aug 4, 2017 at 12:57 PM, wrote:
>> Hi,
>>
>> Adding error checks to devfreq userspace governor, the current
>> implementation results in setting wrong frequency when sscanf returns
>> error.
>>
>> From 12e0a347addd70529b2c378299b27b65f0766f99 Mon Sep 17 00:00:00 2001
>> From: Santosh Mardi
>> Date: Tue, 25 Jul 2017 18:47:11 +0530
>> Subject: [PATCH] devfreq: replace sscanf with kstrtol
>>
>> store_freq function of devfreq userspace governor executes further,
>> even if error is returned from sscanf, this will result in setting up
>> wrong frequency value.
>>
>> The usage for the sscanf is only for single variable so replace
>> sscanf with kstrtol along with error check to bail out if any
>> error is returned.
>>
>> Signed-off-by: Santosh Mardi
>> ---
>>  drivers/devfreq/governor_userspace.c | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/devfreq/governor_userspace.c b/drivers/devfreq/governor_userspace.c
>> index 77028c2..a84796d 100644
>> --- a/drivers/devfreq/governor_userspace.c
>> +++ b/drivers/devfreq/governor_userspace.c
>> @@ -53,12 +53,15 @@ static ssize_t store_freq(struct device *dev, struct device_attribute *attr,
>>  	mutex_lock(&devfreq->lock);
>>  	data = devfreq->data;
>>
>> -	sscanf(buf, "%lu", &wanted);
>> +	err = kstrtol(buf, 0, &wanted);
>> +	if (err < 0)
>> +		goto out;
>
> I think that just you can check the return value as following:
> The other point of devfreq already uses the following style
> to check the return value of sscanf. I think kstrtol is not necessary.
>
> 	err = sscanf(buf, "%lu", &wanted);
> 	if (err != 1)
> 		goto out;

[Santosh] - I agree we would need the error check you mention if we were
scanning an array with sscanf, but in the above code we are only scanning
one variable, and there is a rule in the checkpatch scripts not to use
sscanf for a single variable, so I need to replace sscanf with kstrtol.

I have added all the addresses I got as output from the
scripts/get_maintainer.pl script to this mail.

> And please use the scripts/get_maintainer.pl in order to prevent
> the missing of the reviewer.
>
>>  	data->user_frequency = wanted;
>>  	data->valid = true;
>>  	err = update_devfreq(devfreq);
>>  	if (err == 0)
>>  		err = count;
>> +out:
>>  	mutex_unlock(&devfreq->lock);
>>  	return err;
>>  }

-- 
Regards,
Santosh M G.
Qualcomm Innovation Center
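The point under discussion, rejecting malformed input before it is used as a frequency, can be shown with a userspace sketch. The kstrto*() helpers are kernel-only, so this example uses strtoul() from the C library instead; parse_ulong() is a hypothetical helper and its behaviour (digits only, optional trailing newline, base auto-detection) is an assumption chosen for the illustration, not a copy of the kernel semantics.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an unsigned long from 's'.
 * Returns 0 on success or a negative errno value; '*out' is written
 * only on success, so a failed parse never changes the stored value.
 */
static int parse_ulong(const char *s, unsigned long *out)
{
	char *end;
	unsigned long val;

	/* Reject empty input, signs and leading whitespace up front. */
	if (!isdigit((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, 0);
	if (errno == ERANGE)
		return -ERANGE;

	/* Allow one trailing newline, as sysfs writes usually carry one. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = val;
	return 0;
}
```

A caller that checks the return value and bails out early, as the patch does with "goto out", leaves the previously stored frequency untouched when the input is invalid.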
Re: [PATCH] Allow automatic kernel taint on unsigned module load to be disabled
Matthew Garrett writes:
> On Sun, Aug 6, 2017 at 7:49 PM, Rusty Russell wrote:
>> Matthew Garrett writes:
>>> Binary modules will still be tainted by the license checker. The issue
>>> is that if you want to enforce module signatures under *some*
>>> circumstances, you need to build with CONFIG_MODULE_SIG
>>
>> Not at all! You can validate them in userspace.
>
> And then you need an entire trusted userland, at which point you can
> assert that the modules are trustworthy without having to validate
> them so you don't need CONFIG_MODULE_SIG anyway.

Yep. But your patch already gives userland that power, to silently load
unsigned modules.

>>> but that
>>> changes the behaviour of the kernel even when you're not enforcing
>>> module signatures. The same kernel may be used in environments where
>>> you can verify the kernel and environments where you can't, and in the
>>> latter you may not care that modules are unsigned. In that scenario,
>>> tainting doesn't buy you anything.
>>
>> With your patch, you don't get tainting in the environment where you can
>> verify.
>
> You don't anyway, do you? Loading has already failed before this point
> if sig_enforce is set.

No. You used to get a warning and a taint when you had a kernel
configured to expect signatures and it didn't get one. You want to
remove that warning, to silently accept unsigned modules.

>> You'd be better adding a sysctl or equiv. to turn off force loading, and
>> use that in your can-verify system.
>
> I'm not sure what you mean by "force loading" here - if sig_enforce is
> set, you can't force load an unsigned module. If sig_enforce isn't
> set, you'll taint regardless of whether or not you force.
>
> Wait. Hang on - are you confusing CONFIG_MODULE_SIG with CONFIG_MODVERSIONS?

No, I mean stripping the signatures. (I thought modprobe could do this
these days, but apparently not!)

So, you're actually building the same kernel, but building two sets of
modules: one without signatures, one with? And when deploying the one
with signatures, you're setting sig_enforce. On the other, you don't
want signatures because um, reasons? And you want to suppress the
message?

This seems so convoluted already, I can see how you considered an
upstream patch your most productive path forward. But it's possible
that this scenario makes sense to Jeyu and I'm just incapable of seeing
its beauty?

Cheers,
Rusty.
Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
Hi Marc,

2017-08-03 22:30 GMT+09:00 Marc Zyngier:
> On 03/08/17 13:52, Masahiro Yamada wrote:
>> Hi Marc,
>>
>> 2017-08-03 17:41 GMT+09:00 Marc Zyngier:
>>> Hi Masahiro,
>>>
>>> On 03/08/17 08:32, Masahiro Yamada wrote:
>>>> Hi.
>>>>
>>>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner:
>>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>>>> Can you please remove the patch. And try the following:
>>>>>>>
>>>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>>>
>>>>>>> # echo mem > /sys/power/state
>>>>>>>
>>>>>>> and log the output of the serial console. That way we might get a clue
>>>>>>> where it gets stuck.
>>>>>>
>>>>>> I'm afraid it hangs right away. No response from SSH, no output to
>>>>>> serial.
>>>>>
>>>>> What means hangs right away? Is there no output at all on the serial
>>>>> console? Or does it just stop at some point?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> tglx
>>>>
>>>> Sorry for jumping in.
>>>>
>>>> Finally, I found this thread.
>>>>
>>>> My environment is completely different (ARM64 board),
>>>> I am also suffering from a hibernation problem since this commit.
>>>>
>>>> I get no response on the serial console
>>>> after "Restarting tasks ... done." log message.
>>>>
>>>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level irq
>>>> function calls"), I can get hibernation working again.
>>>>
>>>> SW info:
>>>>   defconfig: arch/arm64/configs/defconfig
>>>>   DT       : arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>>>>   PSCI     : ARM Trusted Firmware
>>>>
>>>> SoC info:
>>>>   CPU      : Cortex-A72 * 2 + Cortex-A53 * 2
>>>>   irqchip  : GICv3 (drivers/irq/irq-gic-v3.c)
>>>
>>> Let me take an educated guess: It feels like your firmware doesn't
>>> save/restore the GIC context across suspend/resume. Is that something
>>> you could check, assuming you have access to the firmware source code?
>>
>> Thanks for your comments.
>>
>>
>> I do not know much about the manner of preserving GICv3 context.
>>
>> I can see this patch (rejected?) :
>> https://patchwork.kernel.org/patch/9343061/
>>
>>
>> Is it something that should be completely cared by firmware
>> instead of kernel?
>
> That was definitely the intention, but it looks like something that ATF
> has only started supporting very recently:
>
> https://github.com/ARM-software/arm-trusted-firmware/pull/1047
>
>> ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
>> is open source software, and I pushed my platform code to the upstream.
>>
>> So, yes, I (and everybody) can have access to the firmware source code.
>>
>>
>> I am not sure how ATF saves the context during hibernation, though.
>
> See the above link. Is there any chance of you trying this into your
> firmware?
>
> Thanks,

Thanks for the pointer.

Yes. I will try that once GIC-v3 context save/restore is supported in ATF.

I think that will basically work for suspend-to-ram
because all contexts including both non-secure and secure worlds
will be retained in the main memory.

However, I still do not understand how the context is preserved
during the hibernation (suspend-to-disk).

If my understanding is correct, hibernation on Linux works as follows:

 [1] Freeze all tasks
 [2] CPU_OFF for non-boot CPUs
 [3] Create a hibernation image
 [4] CPU_ON for non-boot CPUs
 [5] Write the hibernation image to the disk (=swap area)
 [6] SYSTEM_OFF

IIUC, [5] only writes the context Linux takes care of (only non-secure).

If so, where and how does the firmware write the GIC-v3 context to the disk?

-- 
Best Regards
Masahiro Yamada
Re: [PATCH net-next] net: dsa: User per-cpu 64-bit statistics
From: Florian Fainelli
Date: Thu, 3 Aug 2017 21:33:27 -0700

> During testing with a background iperf pushing 1Gbit/sec worth of
> traffic and having both ifconfig and ethtool collect statistics, we
> could see quite frequent deadlocks. Convert the often accessed DSA slave
> network devices statistics to per-cpu 64-bit statistics to remove these
> deadlocks and provide fast efficient statistics updates.
>
> Signed-off-by: Florian Fainelli

Applied with appropriate Fixes: tag added.

Thanks.
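
For readers unfamiliar with the idea, here is a minimal user-space sketch of per-cpu counters. It is not the DSA patch itself: `NR_CPUS`, `struct pcpu_stats`, `stats_rx()` and `stats_sum()` are hypothetical names made up for illustration, and the sketch leaves out the synchronization kernel code would need (for example the `u64_stats_sync` helpers that keep 64-bit reads consistent on 32-bit machines). It only shows the shape of the approach: each CPU updates its own slot, and a reader folds the slots together.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* One private set of counters per CPU; field names are illustrative. */
struct pcpu_stats {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* Hot path: a CPU only ever writes its own slot, so no shared lock is taken. */
static void stats_rx(struct pcpu_stats *stats, int cpu, uint64_t len)
{
	stats[cpu].rx_packets++;
	stats[cpu].rx_bytes += len;
}

/* Slow path (an ifconfig/ethtool style read): fold all slots into one total. */
static struct pcpu_stats stats_sum(const struct pcpu_stats *stats)
{
	struct pcpu_stats total = { 0, 0 };

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		total.rx_packets += stats[cpu].rx_packets;
		total.rx_bytes += stats[cpu].rx_bytes;
	}
	return total;
}
```

The trade-off is that updates become cheap and contention-free, while a read has to visit every CPU's slot.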
Re: [PATCH v9 0/4] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
On 2017/8/7 11:47, David Miller wrote:
> From: Ding Tianhong
> Date: Sat, 5 Aug 2017 15:15:09 +0800
>
>> Some devices have problems with Transaction Layer Packets with the Relaxed
>> Ordering Attribute set. This patch set adds a new PCIe Device Flag,
>> PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known
>> devices with Relaxed Ordering issues, and a use of this new flag by the
>> cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex
>> Ports.
>>
>> It's been years since I've submitted kernel.org patches, I apologise for the
>> almost certain submission errors.
>
> Which tree should merge this? The PCI tree or my networking tree?
>

Hi David:

I think merging it through the networking tree is the better choice, as it
is mainly used to tell the NIC drivers how to use the Relaxed Ordering
Attribute, and later we need to send a patch to enable RO for the ixgbe
driver based on this patch. But I am not sure whether Bjorn has his own
view. :)

Hi Bjorn:

Could you help review this patch or give some feedback?

Thanks
Ding

> .
>
Re: [PATCH v2] drm: dw-hdmi-i2s: add missing company name on Copyright
On 08/07/2017 09:39 AM, Kuninori Morimoto wrote:
> From: Kuninori Morimoto
>
> This driver's Copyright is under Renesas Solutions Corp.
> This patch updates the year, because this driver was moved
> into synopsys folder in 2017.

Thanks. Queued to drm-misc-next.

Archit

> Signed-off-by: Kuninori Morimoto
> ---
> v1 -> v2
>
>  - update year 2016 -> 2017
>
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> index b2cf59f..3b7e5c5 100644
> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> @@ -1,7 +1,8 @@
>  /*
>   * dw-hdmi-i2s-audio.c
>   *
> - * Copyright (c) 2016 Kuninori Morimoto
> + * Copyright (c) 2017 Renesas Solutions Corp.
> + * Kuninori Morimoto
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License version 2 as

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2] drm: dw-hdmi-i2s: add missing company name on Copyright
From: Kuninori Morimoto

This driver's Copyright is under Renesas Solutions Corp.
This patch updates the year, because this driver was moved
into synopsys folder in 2017.

Signed-off-by: Kuninori Morimoto
---
v1 -> v2

 - update year 2016 -> 2017

 drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
index b2cf59f..3b7e5c5 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
@@ -1,7 +1,8 @@
 /*
  * dw-hdmi-i2s-audio.c
  *
- * Copyright (c) 2016 Kuninori Morimoto
+ * Copyright (c) 2017 Renesas Solutions Corp.
+ * Kuninori Morimoto
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
-- 
1.9.1
Re: [PATCH][resend] drm: dw-hdmi-i2s: add missing company name on Copyright
Hi Archit

> >> On 08/07/2017 07:41 AM, Kuninori Morimoto wrote:
> >>>
> >>> From: Kuninori Morimoto
> >>>
> >>> This driver's Copyright is under Renesas Solutions Corp
> >>
> >> Can we update the year to 2017 while we're at it?
> >
> > The original patch was created and applied on 2016
> >
> >    2761ba6c0925ca9c5b917a95f68135d9dce443fb
> >    ("drm: bridge: add DesignWare HDMI I2S audio support")
> >
> > And moved into new synopsys folder on 2017, I think.
>
> We're allowed to update the copyright year as we continue to
> make changes to a file. So, I think updating to 2017 should be
> okay.

OK, will do in v2

Best regards
---
Kuninori Morimoto
Re: [PATCH 2/5] edac: synopsys: Add EDAC ECC support for ZynqMP DDRC
On Fri, Aug 04, 2017 at 02:00:24PM +0200, Michal Simek wrote:
> From: Naga Sureshkumar Relli
>
> This patch adds EDAC ECC support for ZynqMP DDRC IP
>
> Signed-off-by: Naga Sureshkumar Relli
> Signed-off-by: Michal Simek
> ---
>
>  drivers/edac/Kconfig         |   2 +-
>  drivers/edac/synopsys_edac.c | 306 ++-
>  2 files changed, 302 insertions(+), 6 deletions(-)

...

> @@ -440,9 +706,12 @@ static int synps_edac_mc_init(struct mem_ctl_info *mci,
>  	mci->dev_name = SYNPS_EDAC_MOD_STRING;
>  	mci->mod_name = SYNPS_EDAC_MOD_VER;
>  	mci->mod_ver = "1";
> -
> -	edac_op_state = EDAC_OPSTATE_POLL;
> -	mci->edac_check = synps_edac_check;
> +	if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
> +		edac_op_state = EDAC_OPSTATE_INT;
> +	} else {
> +		edac_op_state = EDAC_OPSTATE_POLL;
> +		mci->edac_check = synps_edac_check;
> +	}
>  	mci->ctl_page_to_phys = NULL;
>
>  	status = synps_edac_init_csrows(mci);

This hunk doesn't apply cleanly:

$ test-apply.sh -q /tmp/02-edac-synopsys-add_edac_ecc_support_for_zynqmp_ddrc.patch
checking file drivers/edac/Kconfig
checking file drivers/edac/synopsys_edac.c
Hunk #11 FAILED at 706.
Hunk #12 succeeded at 723 (offset -1 lines).
Hunk #13 succeeded at 754 (offset -1 lines).
Hunk #14 succeeded at 803 (offset -1 lines).
1 out of 14 hunks FAILED

Please redo your patches against this branch:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=for-next

Thx.

> @@ -458,8 +727,18 @@ static int synps_edac_mc_init(struct mem_ctl_info *mci,
>  	.quirks = 0,
>  };
>
> +static const struct synps_platform_data zynqmp_enh_edac_def = {
> +	.synps_edac_geterror_info = synps_enh_edac_geterror_info,
> +	.synps_edac_get_mtype = synps_enh_edac_get_mtype,
> +	.synps_edac_get_dtype = synps_enh_edac_get_dtype,
> +	.synps_edac_get_eccstate = synps_enh_edac_get_eccstate,
> +	.quirks = DDR_ECC_INTR_SUPPORT,
> +};
> +
>  static const struct of_device_id synps_edac_match[] = {
>  	{ .compatible = "xlnx,zynq-ddrc-a05", .data = (void *)&zynq_edac_def },
> +	{ .compatible = "xlnx,zynqmp-ddrc-2.40a",
> +	  .data = (void *)&zynqmp_enh_edac_def},

WARNING: DT compatible string "xlnx,zynqmp-ddrc-2.40a" appears un-documented -- check ./Documentation/devicetree/bindings/
#414: FILE: drivers/edac/synopsys_edac.c:740:
+	{ .compatible = "xlnx,zynqmp-ddrc-2.40a",

Please integrate checkpatch.pl into your patch creation workflow.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
Re: [PATCH v4 net-next 02/13] nfp: change bpf verifier hooks to match new verifier data structures
From: Edward Cree
Date: Thu, 3 Aug 2017 17:11:34 +0100

> Signed-off-by: Edward Cree

Sorry, this doesn't work.

The entire source tree must compile properly after each patch in the
patch series.

So if you change a datastructure, you have to update all of the users
in that patch to keep everything compiling and working.
[regression] tty console panic for 4.13-rcx
Hi,

I saw the log at the bottom and bisected the issue to these commits:

065ea0a7afd64d6c ("tty: improve tty_insert_flip_char() slow path")
979990c628481461 ("tty: improve tty_insert_flip_char() fast path")

I can reproduce this nearly 100% of the time. Any thoughts?

[ 154.823106] Unable to handle kernel NULL pointer dereference at virtual address 000d
[ 154.823885] user pgtable: 4k pages, 48-bit VAs, pgd = 800066e79000
[ 154.824464] [000d] *pgd=6768a003, *pud=6a7da003, *pmd=669c3003, *pte=
[ 154.825460] Internal error: Oops: 9607 [#1] PREEMPT SMP
[ 154.825957] Modules linked in:
[ 154.826258] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0-rc3-next-20170802-2-gd66440a-dirty #112
[ 154.827091] Hardware name: Firefly-RK3399 Board (DT)
[ 154.827539] task: 28f42b00 task.stack: 28f3
[ 154.828083] PC is at llist_del_first+0x8/0x74
[ 154.828481] LR is at __tty_buffer_request_room+0x114/0x148
[ 154.828972] pc : [] lr : [] pstate: 61c5
[ 154.829625] sp : 80007ef10d00
[ 154.829925] x29: 80007ef10d00 x28: 28f42b00
[ 154.830409] x27: 28cc7458 x26: 28cc7430
[ 154.830892] x25: 0026 x24:
[ 154.831373] x23: x22: 0001
[ 154.831854] x21: 80006b37b600 x20: 0100
[ 154.832337] x19: 80006a8a5840 x18:
[ 154.832819] x17: x16: 2821b398
[ 154.833300] x15: x14: 3d097d00
[ 154.833781] x13: 00017700 x12: 009a
[ 154.834263] x11: 7fff x10: 0002
[ 154.834744] x9 : 0003 x8 :
[ 154.835225] x7 : 003d0900 x6 :
[ 154.835706] x5 : 0100 x4 : 0200
[ 154.836187] x3 : 0001 x2 : 000d
[ 154.836668] x1 : 0200 x0 : 80006a8a58b0
[ 154.837153] Process swapper/0 (pid: 0, stack limit = 0x28f3)
[ 154.837750] Stack: (0x80007ef10d00 to 0x28f34000)
[ 154.838261] Call trace:
[ 154.838493] Exception stack(0x80007ef10b30 to 0x80007ef10c60)
[ 154.839068] 0b20: 80006a8a5840 0001
[ 154.839766] 0b40: 80007ef10d00 283b813c 800069219180 80007ef1c968
[ 154.840464] 0b60: 80007ef10b70 280f31d0 80007ef10bf0 280e7cf0
[ 154.841161] 0b80: 80007ef1c900 80007ef1c900 0004 01c0
[ 154.841859] 0ba0: 80007ef1c900 28f39ba8 0001
[ 154.842557] 0bc0: 80007ef10be0 280e7e5c 80006a8a58b0 0200
[ 154.843252] 0be0: 000d 0001 0200 0100
[ 154.843949] 0c00: 003d0900 0003
[ 154.844645] 0c20: 0002 7fff 009a 00017700
[ 154.845342] 0c40: 3d097d00 2821b398
[ 154.846040] [] llist_del_first+0x8/0x74
[ 154.846528] [] __tty_insert_flip_char+0x2c/0x78
[ 154.847076] [] uart_insert_char+0x54/0x13c
[ 154.847589] [] serial8250_rx_chars+0x98/0x1e8
[ 154.848124] [] serial8250_handle_irq.part.23+0x70/0xec
[ 154.848725] [] serial8250_handle_irq+0x14/0x24
[ 154.849264] [] dw8250_handle_irq+0x40/0xfc
[ 154.849774] [] serial8250_interrupt+0x6c/0xec
[ 154.850309] [] __handle_irq_event_percpu+0xa0/0x128
[ 154.850887] [] handle_irq_event_percpu+0x1c/0x54
[ 154.851442] [] handle_irq_event+0x44/0x74
[ 154.851947] [] handle_fasteoi_irq+0x9c/0x154
[ 154.852470] [] generic_handle_irq+0x24/0x38
[ 154.852986] [] __handle_domain_irq+0x60/0xac
[ 154.853510] [] gic_handle_irq+0xd4/0x17c
[ 154.854001] Exception stack(0x28f33da0 to 0x28f33ed0)
[ 154.854576] 3da0: 0001
[ 154.855273] 3dc0: 600075ff6000 0001
[ 154.855970] 3de0: 28f43560 28f33e50 0a00
[ 154.85] 3e00: 014f ffd42070 f7934d7b
[ 154.857365] 3e20: 2821b398 28f17000
[ 154.858064] 3e40: 28f39000 28f39000 28f25820 28f39e98
[ 154.858761] 3e60: 28f42b00
[ 154.859458] 3e80: 02e00018 28f33ed0 280852ec 28f33ed0
[ 154.860156] 3ea0: 280852f0 6145 7df19878 7ffa7010
[ 154.860851] 3ec0: 2813744c
[ 154.861293] [] el1_irq+0xb4/0x128
[ 154.861737] [] arch_cpu_idle+0x10/0x18
[ 154.86] [] default_idle_call+0x18/0x2c
[ 154.862735] [] do_idle+0x170/0x1fc
[ 154.863187] [] cpu_startup_entry+0x1c/0x24
[
Re: [PATCH][resend] drm: dw-hdmi-i2s: add missing company name on Copyright
On 08/07/2017 09:25 AM, Kuninori Morimoto wrote:
> Hi Archit
>
> Thank you for your feedback
>
>> On 08/07/2017 07:41 AM, Kuninori Morimoto wrote:
>>>
>>> From: Kuninori Morimoto
>>>
>>> This driver's Copyright is under Renesas Solutions Corp
>>
>> Can we update the year to 2017 while we're at it?
>
> The original patch was created and applied on 2016
>
>    2761ba6c0925ca9c5b917a95f68135d9dce443fb
>    ("drm: bridge: add DesignWare HDMI I2S audio support")
>
> And moved into new synopsys folder on 2017, I think.

We're allowed to update the copyright year as we continue to
make changes to a file. So, I think updating to 2017 should be
okay.

Archit

> Best regards
> ---
> Kuninori Morimoto

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
Re: [PATCH][resend] drm: dw-hdmi-i2s: add missing company name on Copyright
Hi Archit

Thank you for your feedback

> On 08/07/2017 07:41 AM, Kuninori Morimoto wrote:
> >
> > From: Kuninori Morimoto
> >
> > This driver's Copyright is under Renesas Solutions Corp
>
> Can we update the year to 2017 while we're at it?

The original patch was created and applied on 2016

   2761ba6c0925ca9c5b917a95f68135d9dce443fb
   ("drm: bridge: add DesignWare HDMI I2S audio support")

And moved into new synopsys folder on 2017, I think.

Best regards
---
Kuninori Morimoto
Re: [PATCH] intel-vbtn: match power button on press rather than release
On Mon, Aug 07, 2017 at 08:59:30AM +0800, AceLan Kao wrote:
> Looks like I'm one hour late to ack the patch.
> Thanks any way for the quick response.

Thanks for chiming in all the same - and normally I'd have provided for more
time. In this case, I will be away for a few days, and it was important to get
this in sooner rather than later in the RC cycle.

-- 
Darren Hart
VMware Open Source Technology Center
[PATCH v6 2/2] sched/rt: Add support for SD_PREFER_SIBLING on find_lowest_rq()
It would be better to avoid pushing tasks to other cpu within
a SD_PREFER_SIBLING domain, instead, get more chances to check other
siblings.

Signed-off-by: Byungchul Park
---
 kernel/sched/rt.c | 47 ---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 979b734..50639e5 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1618,12 +1618,35 @@ static struct task_struct *pick_highest_pushable_task(struct rq *rq, int cpu)
 
 static DEFINE_PER_CPU(cpumask_var_t, local_cpu_mask);
 
+/*
+ * Find the first cpu in: mask & sd & ~prefer
+ */
+static int find_cpu(const struct cpumask *mask,
+		    const struct sched_domain *sd,
+		    const struct sched_domain *prefer)
+{
+	const struct cpumask *sds = sched_domain_span(sd);
+	const struct cpumask *ps = prefer ? sched_domain_span(prefer) : NULL;
+	int cpu = -1;
+
+	while ((cpu = cpumask_next(cpu, mask)) < nr_cpu_ids) {
+		if (!cpumask_test_cpu(cpu, sds))
+			continue;
+		if (ps && cpumask_test_cpu(cpu, ps))
+			continue;
+		break;
+	}
+
+	return cpu;
+}
+
 static int find_lowest_rq(struct task_struct *task)
 {
-	struct sched_domain *sd;
+	struct sched_domain *sd, *prefer = NULL;
 	struct cpumask *lowest_mask = this_cpu_cpumask_var_ptr(local_cpu_mask);
 	int this_cpu = smp_processor_id();
 	int cpu = task_cpu(task);
+	int fallback_cpu = -1;
 
 	/* Make sure the mask is initialized first */
 	if (unlikely(!lowest_mask))
@@ -1668,9 +1691,20 @@ static int find_lowest_rq(struct task_struct *task)
 				return this_cpu;
 			}
 
-			best_cpu = cpumask_first_and(lowest_mask,
-						     sched_domain_span(sd));
+			best_cpu = find_cpu(lowest_mask, sd, prefer);
+
 			if (best_cpu < nr_cpu_ids) {
+				/*
+				 * If current domain is SD_PREFER_SIBLING
+				 * flagged, we have to get more chances to
+				 * check other siblings.
+				 */
+				if (sd->flags & SD_PREFER_SIBLING) {
+					prefer = sd;
+					if (fallback_cpu == -1)
+						fallback_cpu = best_cpu;
+					continue;
+				}
 				rcu_read_unlock();
 				return best_cpu;
 			}
@@ -1679,6 +1713,13 @@ static int find_lowest_rq(struct task_struct *task)
 	rcu_read_unlock();
 
 	/*
+	 * If fallback_cpu is valid, all our guesses failed *except* for
+	 * SD_PREFER_SIBLING domain. Now, we can return the fallback cpu.
+	 */
+	if (fallback_cpu != -1)
+		return fallback_cpu;
+
+	/*
 	 * And finally, if there were no matches within the domains
 	 * just give the caller *something* to work with from the compatible
 	 * locations.
--
1.9.1
[PATCH v6 1/2] sched/deadline: Add support for SD_PREFER_SIBLING on find_later_rq()
It would be better to avoid pushing tasks to other cpu within
a SD_PREFER_SIBLING domain, instead, get more chances to check other
siblings.

Signed-off-by: Byungchul Park
---
 kernel/sched/deadline.c | 46 +++---
 1 file changed, 43 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 0223694..2fd1591 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1319,12 +1319,35 @@ static struct task_struct *pick_earliest_pushable_dl_task(struct rq *rq, int cpu
 
 static DEFINE_PER_CPU(cpumask_var_t, local_cpu_mask_dl);
 
+/*
+ * Find the first cpu in: mask & sd & ~prefer
+ */
+static int find_cpu(const struct cpumask *mask,
+		    const struct sched_domain *sd,
+		    const struct sched_domain *prefer)
+{
+	const struct cpumask *sds = sched_domain_span(sd);
+	const struct cpumask *ps = prefer ? sched_domain_span(prefer) : NULL;
+	int cpu = -1;
+
+	while ((cpu = cpumask_next(cpu, mask)) < nr_cpu_ids) {
+		if (!cpumask_test_cpu(cpu, sds))
+			continue;
+		if (ps && cpumask_test_cpu(cpu, ps))
+			continue;
+		break;
+	}
+
+	return cpu;
+}
+
 static int find_later_rq(struct task_struct *task)
 {
-	struct sched_domain *sd;
+	struct sched_domain *sd, *prefer = NULL;
 	struct cpumask *later_mask = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
 	int this_cpu = smp_processor_id();
 	int cpu = task_cpu(task);
+	int fallback_cpu = -1;
 
 	/* Make sure the mask is initialized first */
 	if (unlikely(!later_mask))
@@ -1376,8 +1399,7 @@ static int find_later_rq(struct task_struct *task)
 				return this_cpu;
 			}
 
-			best_cpu = cpumask_first_and(later_mask,
-							sched_domain_span(sd));
+			best_cpu = find_cpu(later_mask, sd, prefer);
 			/*
 			 * Last chance: if a cpu being in both later_mask
 			 * and current sd span is valid, that becomes our
@@ -1385,6 +1407,17 @@ static int find_later_rq(struct task_struct *task)
 			 * already under consideration through later_mask.
 			 */
 			if (best_cpu < nr_cpu_ids) {
+				/*
+				 * If current domain is SD_PREFER_SIBLING
+				 * flagged, we have to get more chances to
+				 * check other siblings.
+				 */
+				if (sd->flags & SD_PREFER_SIBLING) {
+					prefer = sd;
+					if (fallback_cpu == -1)
+						fallback_cpu = best_cpu;
+					continue;
+				}
 				rcu_read_unlock();
 				return best_cpu;
 			}
@@ -1393,6 +1426,13 @@ static int find_later_rq(struct task_struct *task)
 	rcu_read_unlock();
 
 	/*
+	 * If fallback_cpu is valid, all our guesses failed *except* for
+	 * SD_PREFER_SIBLING domain. Now, we can return the fallback cpu.
+	 */
+	if (fallback_cpu != -1)
+		return fallback_cpu;
+
+	/*
 	 * At this point, all our guesses failed, we just return
 	 * 'something', and let the caller sort the things out.
 	 */
--
1.9.1
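The control flow shared by both patches (find a cpu in mask & sd & ~prefer, remember the first hit inside a SD_PREFER_SIBLING domain as a fallback, keep walking outwards) can be tried in userspace with the sketch below. It is only a model under stated assumptions: struct toy_domain, the toy_ function names and the 64-bit masks are hypothetical stand-ins for struct sched_domain and struct cpumask, and the SD_WAKE_AFFINE and this_cpu checks of the real functions are left out.

```c
#include <stdint.h>

#define TOY_NR_CPUS 64

/* Hypothetical stand-in for a sched domain: its span and one flag. */
struct toy_domain {
	uint64_t span;		/* cpus covered by this domain */
	int prefer_sibling;	/* models SD_PREFER_SIBLING */
};

/* First cpu in: mask & span & ~exclude, or TOY_NR_CPUS if there is none. */
static int toy_find_cpu(uint64_t mask, uint64_t span, uint64_t exclude)
{
	uint64_t m = mask & span & ~exclude;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (m & (UINT64_C(1) << cpu))
			return cpu;
	return TOY_NR_CPUS;
}

/* Walk the domains from the innermost outwards, as for_each_domain() does. */
static int toy_pick_cpu(uint64_t mask, const struct toy_domain *doms, int ndoms)
{
	uint64_t exclude = 0;	/* span of the last prefer-sibling domain */
	int fallback_cpu = -1;

	for (int i = 0; i < ndoms; i++) {
		int best_cpu = toy_find_cpu(mask, doms[i].span, exclude);

		if (best_cpu >= TOY_NR_CPUS)
			continue;
		if (doms[i].prefer_sibling) {
			/* Remember the hit, but try a wider domain first. */
			exclude = doms[i].span;
			if (fallback_cpu == -1)
				fallback_cpu = best_cpu;
			continue;
		}
		return best_cpu;
	}
	/* Only prefer-sibling domains matched (or nothing did: -1). */
	return fallback_cpu;
}
```

For example, with an inner prefer-sibling domain spanning cpus 0-1, an outer domain spanning cpus 0-3 and candidate cpus 1-3, the walk skips cpu 1 and returns cpu 2. If cpu 1 is the only candidate, it is returned as the fallback.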
Re: [RFC Part1 PATCH v3 12/17] x86/mm: DMA support for SEV memory encryption
On Mon, Jul 24, 2017 at 02:07:52PM -0500, Brijesh Singh wrote:
> From: Tom Lendacky
>
> DMA access to memory mapped as encrypted while SEV is active can not be
> encrypted during device write or decrypted during device read.

Yeah, definitely rewrite that sentence.

> In order
> for DMA to properly work when SEV is active, the SWIOTLB bounce buffers
> must be used.
>
> Signed-off-by: Tom Lendacky
> Signed-off-by: Brijesh Singh
> ---
>  arch/x86/mm/mem_encrypt.c | 86 +++
>  lib/swiotlb.c             |  5 +--
>  2 files changed, 89 insertions(+), 2 deletions

...

> @@ -202,6 +280,14 @@ void __init mem_encrypt_init(void)
>  	/* Call into SWIOTLB to update the SWIOTLB DMA buffers */
>  	swiotlb_update_mem_attributes();
>
> +	/*
> +	 * With SEV, DMA operations cannot use encryption. New DMA ops
> +	 * are required in order to mark the DMA areas as decrypted or
> +	 * to use bounce buffers.
> +	 */
> +	if (sev_active())
> +		dma_ops = &sme_dma_ops;

Well, we do differentiate between SME and SEV and the check is
sev_active but the ops are called sme_dma_ops. Call them sev_dma_ops
instead for less confusion.

--
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
Re: [PATCH v9 0/4] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
From: Ding Tianhong
Date: Sat, 5 Aug 2017 15:15:09 +0800

> Some devices have problems with Transaction Layer Packets with the Relaxed
> Ordering Attribute set. This patch set adds a new PCIe Device Flag,
> PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known
> devices with Relaxed Ordering issues, and a use of this new flag by the
> cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex
> Ports.
>
> It's been years since I've submitted kernel.org patches, I apologise for the
> almost certain submission errors.

Which tree should merge this? The PCI tree or my networking tree?
Re: [PATCH v2 4/5] PCI: mediatek: Add new generation controller support
On Sat, 2017-08-05 at 14:16 +0800, Ryder Lee wrote:
> On Sat, 2017-08-05 at 12:52 +0800, Ryder Lee wrote:
> > Hi Honghui, Bjorn,
> >
> > On Fri, 2017-08-04 at 08:18 -0500, Bjorn Helgaas wrote:
> > > On Fri, Aug 04, 2017 at 04:39:36PM +0800, Honghui Zhang wrote:
> > > > On Thu, 2017-08-03 at 17:42 -0500, Bjorn Helgaas wrote:
> > > > > > +
> > > > > > +static struct mtk_pcie_port *mtk_pcie_find_port(struct mtk_pcie *pcie,
> > > > > > +						struct pci_bus *bus, int devfn)
> > > > > > +{
> > > > > > +	struct pci_dev *dev;
> > > > > > +	struct pci_bus *pbus;
> > > > > > +	struct mtk_pcie_port *port, *tmp;
> > > > > > +
> > > > > > +	list_for_each_entry_safe(port, tmp, &pcie->ports, list) {
> > > > > > +		if (bus->number == 0 && port->index == PCI_SLOT(devfn)) {
> > > > > > +			return port;
> > > > > > +		} else if (bus->number != 0) {
> > > > > > +			pbus = bus;
> > > > > > +			do {
> > > > > > +				dev = pbus->self;
> > > > > > +				if (port->index == PCI_SLOT(dev->devfn))
> > > > > > +					return port;
> > > > > > +				pbus = dev->bus;
> > > > > > +			} while (dev->bus->number != 0);
> > > > > > +		}
> > > > > > +	}
> > > > > > +
> > > > > > +	return NULL;
> > > > >
> > > > > You should be able to use sysdata to avoid searching the list.
> > > > > See drivers/pci/host/pci-aardvark.c, for example.
> > > > >
> > > >
> > > > I could put the mtk_pcie * in sysdata, but still need to searching the
> > > > list to get the mtk_pcie_port *, how about:
> > > >
> > > > list_for_each_entry_safe(port, tmp, &pcie->ports, list) {
> > > > 	if (port->index == PCI_SLOT(devfn))
> > > > 		return port;
> > > > }
> > >
> > > No. Other drivers don't need to search the list. Please take a look
> > > at them and see how they solve this problem. I don't think your
> > > hardware is fundamentally different in a way that means you need to
> > > search when the others don't.
> > >
> >
> > I'm not directly involved in this generation, but I guess the main reason
> > why Honghui need to do that is just because this hardware access
> > configuration space via per-port registers, not just for the guard.
> > Currently, We had a host bridge with two ports (two subnodes in binding
> > text), thus he tried to tells them apart so that he can get the correct
> > registers.
> >
> > Some platforms don't need to do that since they just have a single port (no
> > more subnodes), the others might have specific/shared registers to access
> > configuration space. (e.g. Tegra, MTK legacy IP block).
> > Or, he can split them into two independent nodes, but it will break common
> > probing flow by doing so. (I'd prefer to use subnodes.)
> >
> > Ryder
> >
>
> Sorry for the typesetting in previous mail and noise again,
>
> I've took a look at pci-rcar-gen2.c, this is a similar case I can found
> for Honghui's case. It gathers two ports reg regions into one, and uses
> the "slot id" to calculate the cfg base of each port.
>
> Perhaps this is a example for those who need to use subnodes and use
> port registers for cfg operation. Not sure whether it's worthwhile doing
> that since we need to changes ports/host structures.
>
> Ryder.
>

As Ryder described, Mediatek's new generation HW block has two separate
ports, and each has its own control register base address. We must go
through the per-port control register to access an EP's configuration
space: a port's control register is the only way to reach the
configuration space of the EP connected under that port.

Given an EP device, we need to determine which port it is connected to
and get the base address for that port. It's a bit like
pci-tegra/pci-mvebu.

A list does not seem to be forbidden: pci-tegra searches a list to
identify the port [1], and mvebu uses a pointer array to look up the
port [2]. They provide the same functionality through different
approaches.

I can propose another patch to make the code look like mvebu [2] if you
insist, but I prefer the current list approach.

[1] http://elixir.free-electrons.com/linux/v4.13-rc4/source/drivers/pci/host/pci-tegra.c#L456
[2] http://elixir.free-electrons.com/linux/v4.13-rc4/source/drivers/pci/host/pci-mvebu.c#L780

thanks.
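The two lookup styles discussed above (walking a list of ports versus indexing a per-slot pointer array) can be compared with a small userspace sketch. None of this is taken from the mediatek, tegra or mvebu drivers: struct toy_port, struct toy_pcie, TOY_MAX_PORTS and the helper names are hypothetical, and the slot number is assumed to be known already (the kernel code derives it with PCI_SLOT()).

```c
#include <stddef.h>

#define TOY_MAX_PORTS 4	/* hypothetical upper bound on root ports */

struct toy_port {
	int index;		/* slot number of the root port */
	unsigned long base;	/* per-port register base (made-up value) */
	struct toy_port *next;	/* singly linked list of ports */
};

struct toy_pcie {
	struct toy_port *ports;				/* list head */
	struct toy_port *by_slot[TOY_MAX_PORTS];	/* direct lookup table */
};

/* Register a port in both the list and the per-slot table. */
static void toy_add_port(struct toy_pcie *pcie, struct toy_port *port)
{
	port->next = pcie->ports;
	pcie->ports = port;
	if (port->index >= 0 && port->index < TOY_MAX_PORTS)
		pcie->by_slot[port->index] = port;
}

/* List style: scan every port until the slot matches. */
static struct toy_port *toy_find_port_list(struct toy_pcie *pcie, int slot)
{
	for (struct toy_port *p = pcie->ports; p; p = p->next)
		if (p->index == slot)
			return p;
	return NULL;
}

/* Array style: the slot number is the index, no search is needed. */
static struct toy_port *toy_find_port_array(struct toy_pcie *pcie, int slot)
{
	if (slot < 0 || slot >= TOY_MAX_PORTS)
		return NULL;
	return pcie->by_slot[slot];
}
```

Both helpers return the same port for a given slot. With only two ports the cost difference is negligible, so the choice is mostly about which structure reads better in the driver.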
Re: [PATCH][resend] drm: dw-hdmi-i2s: add missing company name on Copyright
On 08/07/2017 07:41 AM, Kuninori Morimoto wrote:
> From: Kuninori Morimoto
>
> This driver's Copyright is under Renesas Solutions Corp

Can we update the year to 2017 while we're at it?

Archit

> Signed-off-by: Kuninori Morimoto
> ---
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> index b2cf59f..d487b6b 100644
> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> @@ -1,7 +1,8 @@
>  /*
>   * dw-hdmi-i2s-audio.c
>   *
> - * Copyright (c) 2016 Kuninori Morimoto
> + * Copyright (c) 2016 Renesas Solutions Corp.
> + * Kuninori Morimoto
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License version 2 as

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
Re: [PATCH] cpufreq: Simplify cpufreq_can_do_remote_dvfs()
On 04-08-17, 14:57, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki
>
> The if () in cpufreq_can_do_remote_dvfs() is superfluous, so drop
> it and simply return the value of the expression under it.
>
> Signed-off-by: Rafael J. Wysocki
> ---
>
> On top of the current linux-next.
>
> ---
>  include/linux/cpufreq.h |    7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> Index: linux-pm/include/linux/cpufreq.h
> ===
> --- linux-pm.orig/include/linux/cpufreq.h
> +++ linux-pm/include/linux/cpufreq.h
> @@ -578,11 +578,8 @@ static inline bool cpufreq_can_do_remote
>  	 * - dvfs_possible_from_any_cpu flag is set
>  	 * - the local and remote CPUs share cpufreq policy
>  	 */
> -	if (policy->dvfs_possible_from_any_cpu ||
> -	    cpumask_test_cpu(smp_processor_id(), policy->cpus))
> -		return true;
> -
> -	return false;
> +	return policy->dvfs_possible_from_any_cpu ||
> +		cpumask_test_cpu(smp_processor_id(), policy->cpus);
>  }
>
>  /*

Acked-by: Viresh Kumar

--
viresh
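That the simplification keeps the behaviour can be checked with a userspace sketch. The struct toy_policy type, the plain bitmask standing in for policy->cpus and the explicit this_cpu argument are assumptions made for illustration; the kernel function uses cpumask_test_cpu() and smp_processor_id() instead.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of the fields the check looks at. */
struct toy_policy {
	bool dvfs_possible_from_any_cpu;
	unsigned long cpus;	/* bitmask of cpus sharing the policy */
};

/* Old shape: test the condition, then return true or false explicitly. */
static bool toy_can_do_remote_dvfs_old(const struct toy_policy *policy,
				       int this_cpu)
{
	if (policy->dvfs_possible_from_any_cpu ||
	    (policy->cpus & (1UL << this_cpu)))
		return true;

	return false;
}

/* New shape: return the value of the condition directly. */
static bool toy_can_do_remote_dvfs_new(const struct toy_policy *policy,
				       int this_cpu)
{
	return policy->dvfs_possible_from_any_cpu ||
		(policy->cpus & (1UL << this_cpu));
}
```

The || operator already yields 0 or 1, so both functions return the same value for every input.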
Re: [PATCH V3] get_maintainer: Prepare for separate MAINTAINERS files
On Sun, 2017-08-06 at 19:16 -0700, Frank Rowand wrote:
> On 08/04/17 21:45, Joe Perches wrote:
> > Allow for MAINTAINERS to become a directory and if it is,
> > read all the files in the directory for maintained sections.
> >
> > Optionally look for all files named MAINTAINERS in directories
> > excluding the .git directory by using --find-maintainer-files.
> >
> > This optional feature adds ~.3 seconds of CPU on an Intel
> > i5-6200 with an SSD.
> >
> > Miscellanea:
> >
> > o Create a read_maintainer_file subroutine from the existing code
> > o Test only the existence of MAINTAINERS, not whether it's a file
> >
> > Signed-off-by: Joe Perches
> > ---
>
> < snip >
>
> Hi Joe,
>
> In the three versions of this patch, I have not seen any description
> of what is wrong with the current single MAINTAINERS file, or why the
> proposed change is an improvement. Could you please add that
> information?

It's really up to Linus.

He's the one who wants to separate the MAINTAINERS file
as he's the one that has to deal with the merges.

This is only to enable the script to still function
if the file is split up.
[PATCH v10 2/4] irqchip/qeic: merge qeic init code from platforms to a common function
The qe_ic init code is duplicated across a variety of platforms. Merge it
into a common function and put it in irqchip/irq-qeic.c.

For non-p1021_mds mpc85xx_mds boards, use
"qe_ic_init(np, 0, qe_ic_cascade_low_mpic, qe_ic_cascade_high_mpic);"
instead of "qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);".
qe_ic_cascade_muxed_mpic was used for boards that have the same interrupt
number for the low and high interrupts; qe_ic_init already checks whether
"low interrupt == high interrupt".

Signed-off-by: Zhao Qiang
---
 arch/powerpc/platforms/83xx/misc.c            | 15 ---
 arch/powerpc/platforms/85xx/corenet_generic.c |  9 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c     | 14 --
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c     | 16
 arch/powerpc/platforms/85xx/twr_p102x.c       | 14 --
 drivers/irqchip/irq-qeic.c                    | 13 +
 6 files changed, 13 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/platforms/83xx/misc.c b/arch/powerpc/platforms/83xx/misc.c
index d75c981..c09a135 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -93,24 +93,9 @@ void __init mpc83xx_ipic_init_IRQ(void)
 }

 #ifdef CONFIG_QUICC_ENGINE
-void __init mpc83xx_qe_init_IRQ(void)
-{
-	struct device_node *np;
-
-	np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-	if (!np) {
-		np = of_find_node_by_type(NULL, "qeic");
-		if (!np)
-			return;
-	}
-	qe_ic_init(np, 0, qe_ic_cascade_low_ipic, qe_ic_cascade_high_ipic);
-	of_node_put(np);
-}
-
 void __init mpc83xx_ipic_and_qe_init_IRQ(void)
 {
 	mpc83xx_ipic_init_IRQ();
-	mpc83xx_qe_init_IRQ();
 }
 #endif /* CONFIG_QUICC_ENGINE */

diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index ac191a7..1b385ac 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -41,8 +41,6 @@ void __init corenet_gen_pic_init(void)
 	unsigned int flags = MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU |
 		MPIC_NO_RESET;

-	struct device_node *np;
-
 	if (ppc_md.get_irq == mpic_get_coreint_irq)
 		flags |= MPIC_ENABLE_COREINT;

@@ -50,13 +48,6 @@ void __init corenet_gen_pic_init(void)
 	BUG_ON(mpic == NULL);

 	mpic_init(mpic);
-
-	np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-	if (np) {
-		qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-				qe_ic_cascade_high_mpic);
-		of_node_put(np);
-	}
 }

 /*
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index d7e440e..06f34a9 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -283,20 +283,6 @@ static void __init mpc85xx_mds_qeic_init(void)
 		of_node_put(np);
 		return;
 	}
-
-	np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-	if (!np) {
-		np = of_find_node_by_type(NULL, "qeic");
-		if (!np)
-			return;
-	}
-
-	if (machine_is(p1021_mds))
-		qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-				qe_ic_cascade_high_mpic);
-	else
-		qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);
-	of_node_put(np);
 }
 #else
 static void __init mpc85xx_mds_qe_init(void) { }
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 1006950..000d385 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -48,10 +48,6 @@ void __init mpc85xx_rdb_pic_init(void)
 {
 	struct mpic *mpic;

-#ifdef CONFIG_QUICC_ENGINE
-	struct device_node *np;
-#endif
-
 	if (of_machine_is_compatible("fsl,MPC85XXRDB-CAMP")) {
 		mpic = mpic_alloc(NULL, 0, MPIC_NO_RESET |
 			MPIC_BIG_ENDIAN |
@@ -66,18 +62,6 @@ void __init mpc85xx_rdb_pic_init(void)

 	BUG_ON(mpic == NULL);
 	mpic_init(mpic);
-
-#ifdef CONFIG_QUICC_ENGINE
-	np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-	if (np) {
-		qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-				qe_ic_cascade_high_mpic);
-		of_node_put(np);
-
-	} else
-		pr_err("%s: Could not find qe-ic node\n", __func__);
-#endif
-
 }

 /*
diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c b/arch/powerpc/platforms/85xx/twr_p102x.c
index 360f625..6be9b33 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ b/arch/powerpc/platforms/85xx/twr_p102x.c
@@ -35,26 +35,12 @@ static void __init twr_p1025_pic_init(void)
 {
 	struct mpic *mpic;

-#ifdef CONFIG_QUICC_ENGINE
-	struct device_node *np;
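
The argument for dropping qe_ic_cascade_muxed_mpic can be sketched in plain C. This is only a user-space illustration, assuming (as the commit message states) that the common init installs a separate high handler only when a high interrupt exists and differs from the low one. The names fake_qe_ic, FAKE_NO_IRQ and count_cascade_handlers are hypothetical and do not exist in the kernel.

```c
#include <stdio.h>

/* Hypothetical marker for "no interrupt mapped" (not the kernel's NO_IRQ). */
#define FAKE_NO_IRQ 0u

/* Hypothetical stand-in for the two parent interrupt lines of a QEIC. */
struct fake_qe_ic {
	unsigned int virq_low;
	unsigned int virq_high;
};

/*
 * Count how many chained handlers a common init routine would install.
 * A board that muxes both lines onto one interrupt number ends up with
 * a single handler, so it needs no dedicated "muxed" variant.
 */
static int count_cascade_handlers(const struct fake_qe_ic *ic)
{
	int handlers = 0;

	if (ic->virq_low == FAKE_NO_IRQ)
		return 0;	/* nothing can be installed without a low line */

	handlers++;		/* the low handler is always installed */

	if (ic->virq_high != FAKE_NO_IRQ && ic->virq_high != ic->virq_low)
		handlers++;	/* a distinct high line gets its own handler */

	return handlers;
}
```

With low and high both set to the same number the function reports one handler; with two distinct numbers it reports two, which is the case the low/high MPIC handlers cover.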
[PATCH v10 4/4] irqchip/qeic: remove PPCisms for QEIC
QEIC was originally supported only on PowerPC and depended on PPC.
It is now supported on other platforms, so remove the PPCisms.

Signed-off-by: Zhao Qiang
---
 arch/powerpc/platforms/83xx/km83xx.c          |   1 -
 arch/powerpc/platforms/83xx/misc.c            |   1 -
 arch/powerpc/platforms/83xx/mpc832x_mds.c     |   1 -
 arch/powerpc/platforms/83xx/mpc832x_rdb.c     |   1 -
 arch/powerpc/platforms/83xx/mpc836x_mds.c     |   1 -
 arch/powerpc/platforms/83xx/mpc836x_rdk.c     |   1 -
 arch/powerpc/platforms/85xx/corenet_generic.c |   1 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c     |   1 -
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c     |   1 -
 arch/powerpc/platforms/85xx/twr_p102x.c       |   1 -
 drivers/irqchip/irq-qeic.c                    | 188 +++--
 include/soc/fsl/qe/qe_ic.h                    | 132 --
 12 files changed, 80 insertions(+), 250 deletions(-)
 delete mode 100644 include/soc/fsl/qe/qe_ic.h

diff --git a/arch/powerpc/platforms/83xx/km83xx.c b/arch/powerpc/platforms/83xx/km83xx.c
index d8642a4..b1cef0a 100644
--- a/arch/powerpc/platforms/83xx/km83xx.c
+++ b/arch/powerpc/platforms/83xx/km83xx.c
@@ -38,7 +38,6 @@
 #include
 #include
 #include
-#include

 #include "mpc83xx.h"

diff --git a/arch/powerpc/platforms/83xx/misc.c b/arch/powerpc/platforms/83xx/misc.c
index c09a135..07a0e61 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -17,7 +17,6 @@
 #include
 #include
 #include
-#include
 #include
 #include

diff --git a/arch/powerpc/platforms/83xx/mpc832x_mds.c b/arch/powerpc/platforms/83xx/mpc832x_mds.c
index bb7b25a..a1cadf4 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_mds.c
@@ -37,7 +37,6 @@
 #include
 #include
 #include
-#include

 #include "mpc83xx.h"

diff --git a/arch/powerpc/platforms/83xx/mpc832x_rdb.c b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
index d7c9b18..6c66527 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
@@ -26,7 +26,6 @@
 #include
 #include
 #include
-#include
 #include
 #include

diff --git a/arch/powerpc/platforms/83xx/mpc836x_mds.c b/arch/powerpc/platforms/83xx/mpc836x_mds.c
index 4fc3051..9234d63 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_mds.c
@@ -45,7 +45,6 @@
 #include
 #include
 #include
-#include

 #include "mpc83xx.h"

diff --git a/arch/powerpc/platforms/83xx/mpc836x_rdk.c b/arch/powerpc/platforms/83xx/mpc836x_rdk.c
index 93f024f..82fa344 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_rdk.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_rdk.c
@@ -21,7 +21,6 @@
 #include
 #include
 #include
-#include
 #include
 #include

diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index 1b385ac..9ca27b1 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -27,7 +27,6 @@
 #include
 #include
 #include
-#include
 #include
 #include

diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index 06f34a9..8102e5f 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -49,7 +49,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include "smp.h"
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 000d385..f806b6b 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -27,7 +27,6 @@
 #include
 #include
 #include
-#include
 #include
 #include

diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c b/arch/powerpc/platforms/85xx/twr_p102x.c
index 6be9b33..4f620f2 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ b/arch/powerpc/platforms/85xx/twr_p102x.c
@@ -23,7 +23,6 @@
 #include
 #include
 #include
-#include
 #include
 #include

diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
index a2d8084..26bfcbd 100644
--- a/drivers/irqchip/irq-qeic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -18,8 +18,11 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -27,9 +30,8 @@
 #include
 #include
 #include
-#include
+#include
 #include
-#include

 #define NR_QE_IC_INTS		64

@@ -87,6 +89,43 @@
 #define SIGNAL_HIGH		2
 #define SIGNAL_LOW		0

+#define NUM_OF_QE_IC_GROUPS	6
+
+/* Flags when we init the QE IC */
+#define QE_IC_SPREADMODE_GRP_W			0x0001
+#define QE_IC_SPREADMODE_GRP_X			0x0002
+#define QE_IC_SPREADMODE_GRP_Y			0x0004
+#define QE_IC_SPREADMODE_GRP_Z			0x0008
+#define QE_IC_SPREADMODE_GRP_RISCA		0x0010
+#define QE_IC_SPREADMODE_GRP_RISCB
[PATCH v10 1/4] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
Move the driver from drivers/soc/fsl/qe to drivers/irqchip, and merge
qe_ic.h and qe_ic.c into irq-qeic.c.

Signed-off-by: Zhao Qiang
---
 MAINTAINERS                                        |   6 ++
 drivers/irqchip/Makefile                           |   1 +
 drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} |  95 ++-
 drivers/soc/fsl/qe/Makefile                        |   2 +-
 drivers/soc/fsl/qe/qe_ic.h                         | 103 -
 5 files changed, 100 insertions(+), 107 deletions(-)
 rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (85%)
 delete mode 100644 drivers/soc/fsl/qe/qe_ic.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 567343b..1288329 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5462,6 +5462,12 @@ F:	drivers/soc/fsl/qe/
 F:	include/soc/fsl/*qe*.h
 F:	include/soc/fsl/*ucc*.h

+FREESCALE QEIC DRIVERS
+M:	Qiang Zhao
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	drivers/irqchip/irq-qeic.c
+
 FREESCALE QUICC ENGINE UCC ETHERNET DRIVER
 M:	Li Yang
 L:	net...@vger.kernel.org
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index e88d856..b8eae87 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -78,3 +78,4 @@ obj-$(CONFIG_EZNPS_GIC)		+= irq-eznps.o
 obj-$(CONFIG_ARCH_ASPEED)		+= irq-aspeed-vic.o irq-aspeed-i2c-ic.o
 obj-$(CONFIG_STM32_EXTI)		+= irq-stm32-exti.o
 obj-$(CONFIG_QCOM_IRQ_COMBINER)		+= qcom-irq-combiner.o
+obj-$(CONFIG_QUICC_ENGINE)		+= irq-qeic.o
diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/irqchip/irq-qeic.c
similarity index 85%
rename from drivers/soc/fsl/qe/qe_ic.c
rename to drivers/irqchip/irq-qeic.c
index ec2ca86..9b4660c 100644
--- a/drivers/soc/fsl/qe/qe_ic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -1,7 +1,7 @@
 /*
- * arch/powerpc/sysdev/qe_lib/qe_ic.c
+ * drivers/irqchip/irq-qeic.c
  *
- * Copyright (C) 2006 Freescale Semiconductor, Inc. All rights reserved.
+ * Copyright (C) 2016 Freescale Semiconductor, Inc. All rights reserved.
  *
  * Author: Li Yang
  * Based on code from Shlomi Gridish
@@ -30,7 +30,96 @@
 #include
 #include

-#include "qe_ic.h"
+#define NR_QE_IC_INTS		64
+
+/* QE IC registers offset */
+#define QEIC_CICR		0x00
+#define QEIC_CIVEC		0x04
+#define QEIC_CRIPNR		0x08
+#define QEIC_CIPNR		0x0c
+#define QEIC_CIPXCC		0x10
+#define QEIC_CIPYCC		0x14
+#define QEIC_CIPWCC		0x18
+#define QEIC_CIPZCC		0x1c
+#define QEIC_CIMR		0x20
+#define QEIC_CRIMR		0x24
+#define QEIC_CICNR		0x28
+#define QEIC_CIPRTA		0x30
+#define QEIC_CIPRTB		0x34
+#define QEIC_CRICR		0x3c
+#define QEIC_CHIVEC		0x60
+
+/* Interrupt priority registers */
+#define CIPCC_SHIFT_PRI0	29
+#define CIPCC_SHIFT_PRI1	26
+#define CIPCC_SHIFT_PRI2	23
+#define CIPCC_SHIFT_PRI3	20
+#define CIPCC_SHIFT_PRI4	13
+#define CIPCC_SHIFT_PRI5	10
+#define CIPCC_SHIFT_PRI6	7
+#define CIPCC_SHIFT_PRI7	4
+
+/* CICR priority modes */
+#define CICR_GWCC		0x0004
+#define CICR_GXCC		0x0002
+#define CICR_GYCC		0x0001
+#define CICR_GZCC		0x0008
+#define CICR_GRTA		0x0020
+#define CICR_GRTB		0x0040
+#define CICR_HPIT_SHIFT		8
+#define CICR_HPIT_MASK		0x0300
+#define CICR_HP_SHIFT		24
+#define CICR_HP_MASK		0x3f00
+
+/* CICNR */
+#define CICNR_WCC1T_SHIFT	20
+#define CICNR_ZCC1T_SHIFT	28
+#define CICNR_YCC1T_SHIFT	12
+#define CICNR_XCC1T_SHIFT	4
+
+/* CRICR */
+#define CRICR_RTA1T_SHIFT	20
+#define CRICR_RTB1T_SHIFT	28
+
+/* Signal indicator */
+#define SIGNAL_MASK		3
+#define SIGNAL_HIGH		2
+#define SIGNAL_LOW		0
+
+struct qe_ic {
+	/* Control registers offset */
+	u32 __iomem *regs;
+
+	/* The remapper for this QEIC */
+	struct irq_domain *irqhost;
+
+	/* The "linux" controller struct */
+	struct irq_chip hc_irq;
+
+	/* VIRQ numbers of QE high/low irqs */
+	unsigned int virq_high;
+	unsigned int virq_low;
+};
+
+/*
+ * QE interrupt controller internal structure
+ */
+struct qe_ic_info {
+	/* location of this source at the QIMR register. */
+	u32	mask;
+
+	/* Mask register offset */
+	u32	mask_reg;
+
+	/*
+	 * for grouped interrupts sources - the interrupt
+	 * code as appears at the group priority register
+	 */
+	u8	pri_code;
+
+	/* Group priority register offset */
+	u32	pri_reg;
+};

 static DEFINE_RAW_SPINLOCK(qe_ic_lock);

diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile
index 2031d38..51e4726 100644
--- a/drivers/soc/fsl/qe/Makefile
+++
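
The shift and mask pairs in the listing are typically used to read or update one field of a register without touching the others. The sketch below shows that pattern in user-space C for a "highest priority" style field. It assumes a 6-bit field at bit 24; SKETCH_HP_SHIFT, SKETCH_HP_MASK and both helpers are hypothetical names, and the mask value is derived from that assumption rather than copied from the listing.

```c
#include <stdint.h>

/* Assumed layout: a 6-bit field starting at bit 24 (hypothetical values). */
#define SKETCH_HP_SHIFT	24u
#define SKETCH_HP_MASK	(UINT32_C(0x3f) << SKETCH_HP_SHIFT)

/* Replace the field with src and leave every other bit untouched. */
static uint32_t sketch_set_hp(uint32_t cicr, uint32_t src)
{
	cicr &= ~SKETCH_HP_MASK;				/* clear the old field */
	cicr |= (src << SKETCH_HP_SHIFT) & SKETCH_HP_MASK;	/* insert the new one */
	return cicr;
}

/* Extract the field as a small integer. */
static uint32_t sketch_get_hp(uint32_t cicr)
{
	return (cicr & SKETCH_HP_MASK) >> SKETCH_HP_SHIFT;
}
```

In a driver the same read-modify-write would go through the register accessors and, where needed, run under the controller lock; the sketch only covers the bit arithmetic.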
[PATCH v10 3/4] irqchip/qeic: merge qeic_of_init into qe_ic_init
qeic_of_init only gets the qeic device_node from the dtb and passes it to
qe_ic_init. Merge qeic_of_init into qe_ic_init so that the qeic node is
handled in qe_ic_init.

Signed-off-by: Zhao Qiang
---
 drivers/irqchip/irq-qeic.c | 90 --
 include/soc/fsl/qe/qe_ic.h |  7
 2 files changed, 39 insertions(+), 58 deletions(-)

diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
index 8287c22..a2d8084 100644
--- a/drivers/irqchip/irq-qeic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -407,27 +407,33 @@ unsigned int qe_ic_get_high_irq(struct qe_ic *qe_ic)
 	return irq_linear_revmap(qe_ic->irqhost, irq);
 }

-void __init qe_ic_init(struct device_node *node, unsigned int flags,
-		       void (*low_handler)(struct irq_desc *desc),
-		       void (*high_handler)(struct irq_desc *desc))
+static int __init qe_ic_init(struct device_node *node, unsigned int flags)
 {
 	struct qe_ic *qe_ic;
 	struct resource res;
-	u32 temp = 0, ret, high_active = 0;
+	u32 temp = 0, high_active = 0;
+	int ret = 0;
+
+	if (!node)
+		return -ENODEV;

 	ret = of_address_to_resource(node, 0, &res);
-	if (ret)
-		return;
+	if (ret) {
+		ret = -ENODEV;
+		goto err_put_node;
+	}

 	qe_ic = kzalloc(sizeof(*qe_ic), GFP_KERNEL);
-	if (qe_ic == NULL)
-		return;
+	if (qe_ic == NULL) {
+		ret = -ENOMEM;
+		goto err_put_node;
+	}

 	qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS,
 					       &qe_ic_host_ops, qe_ic);
 	if (qe_ic->irqhost == NULL) {
-		kfree(qe_ic);
-		return;
+		ret = -ENOMEM;
+		goto err_free_qe_ic;
 	}

 	qe_ic->regs = ioremap(res.start, resource_size(&res));
@@ -438,9 +444,9 @@ void __init qe_ic_init(struct device_node *node, unsigned int flags,
 	qe_ic->virq_low = irq_of_parse_and_map(node, 1);

 	if (qe_ic->virq_low == NO_IRQ) {
-		printk(KERN_ERR "Failed to map QE_IC low IRQ\n");
-		kfree(qe_ic);
-		return;
+		pr_err("Failed to map QE_IC low IRQ\n");
+		ret = -ENOMEM;
+		goto err_domain_remove;
 	}

 	/* default priority scheme is grouped. If spread mode is */
@@ -467,13 +473,24 @@ void __init qe_ic_init(struct device_node *node, unsigned int flags,
 	qe_ic_write(qe_ic->regs, QEIC_CICR, temp);

 	irq_set_handler_data(qe_ic->virq_low, qe_ic);
-	irq_set_chained_handler(qe_ic->virq_low, low_handler);
+	irq_set_chained_handler(qe_ic->virq_low, qe_ic_cascade_low_mpic);

 	if (qe_ic->virq_high != NO_IRQ &&
 			qe_ic->virq_high != qe_ic->virq_low) {
 		irq_set_handler_data(qe_ic->virq_high, qe_ic);
-		irq_set_chained_handler(qe_ic->virq_high, high_handler);
+		irq_set_chained_handler(qe_ic->virq_high,
+					qe_ic_cascade_high_mpic);
 	}
+	of_node_put(node);
+	return 0;
+
+err_domain_remove:
+	irq_domain_remove(qe_ic->irqhost);
+err_free_qe_ic:
+	kfree(qe_ic);
+err_put_node:
+	of_node_put(node);
+	return ret;
 }

 void qe_ic_set_highest_priority(unsigned int virq, int high)
@@ -570,45 +587,16 @@ int qe_ic_set_high_priority(unsigned int virq, unsigned int priority, int high)
 	return 0;
 }

-static struct bus_type qe_ic_subsys = {
-	.name = "qe_ic",
-	.dev_name = "qe_ic",
-};
-
-static struct device device_qe_ic = {
-	.id = 0,
-	.bus = &qe_ic_subsys,
-};
-
-static int __init init_qe_ic_sysfs(void)
+static int __init init_qe_ic(struct device_node *node,
+			     struct device_node *parent)
 {
-	int rc;
-
-	printk(KERN_DEBUG "Registering qe_ic with sysfs...\n");
+	int ret;

-	rc = subsys_system_register(&qe_ic_subsys, NULL);
-	if (rc) {
-		printk(KERN_ERR "Failed registering qe_ic sys class\n");
-		return -ENODEV;
-	}
-	rc = device_register(&device_qe_ic);
-	if (rc) {
-		printk(KERN_ERR "Failed registering qe_ic sys device\n");
-		return -ENODEV;
-	}
-	return 0;
-}
+	ret = qe_ic_init(node, 0);
+	if (ret)
+		return ret;

-static int __init qeic_of_init(struct device_node *node,
-			       struct device_node *parent)
-{
-	if (!node)
-		return -ENODEV;
-	qe_ic_init(node, 0, qe_ic_cascade_low_mpic,
-		   qe_ic_cascade_high_mpic);
-	of_node_put(node);
 	return 0;
 }

-IRQCHIP_DECLARE(qeic, "fsl,qe-ic", qeic_of_init);
-subsys_initcall(init_qe_ic_sysfs);
+IRQCHIP_DECLARE(qeic, "fsl,qe-ic", init_qe_ic);
diff --git a/include/soc/fsl/qe/qe_ic.h
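
The error handling this patch introduces follows the usual goto-unwind idiom: each failure jumps to a label that releases exactly what has been acquired so far, in reverse order. A minimal user-space sketch of the same shape is given below. It assumes malloc/free as stand-ins for the kernel allocations, and fake_ctrl, fake_node_put and fake_init are hypothetical names, not kernel APIs.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the controller state (not a kernel type). */
struct fake_ctrl {
	int *domain;
};

static int node_puts;			/* counts fake_node_put() calls */
static struct fake_ctrl *live_ctrl;	/* stays allocated after success */

/* Stand-in for dropping the device node reference. */
static void fake_node_put(void)
{
	node_puts++;
}

/* have_node and irq_maps let the caller force the failure paths. */
static int fake_init(int have_node, int irq_maps)
{
	struct fake_ctrl *ctrl;
	int ret;

	if (!have_node)
		return -ENODEV;		/* nothing acquired yet */

	ctrl = calloc(1, sizeof(*ctrl));
	if (!ctrl) {
		ret = -ENOMEM;
		goto err_put_node;
	}

	ctrl->domain = malloc(sizeof(*ctrl->domain));
	if (!ctrl->domain) {
		ret = -ENOMEM;
		goto err_free_ctrl;
	}

	if (!irq_maps) {
		ret = -ENOMEM;
		goto err_domain_remove;
	}

	live_ctrl = ctrl;		/* success: keep the controller */
	fake_node_put();
	return 0;

err_domain_remove:
	free(ctrl->domain);
err_free_ctrl:
	free(ctrl);
err_put_node:
	fake_node_put();
	return ret;
}
```

The labels fall through, so a failure late in the function releases everything acquired before it, while an early failure releases only the node reference.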
[PATCH v10 0/4] this patchset is to remove PPCisms for QEIC
QEIC is supported on more than just PowerPC boards, so remove the PPCisms.

changelog:
  Changes for v8:
  - use IRQCHIP_DECLARE() instead of subsys_initcall in qeic driver
  - remove include/soc/fsl/qe/qe_ic.h
  Changes for v9:
  - rebase
  - fix the compile issue seen when applying only the second patch; there
    was no compile issue when all patches of this patchset were applied
  Changes for v10:
  - simplify the code, remove duplicated code

Zhao Qiang (4):
  irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
    Changes for v2:
    - modify the subject and commit msg
    Changes for v3:
    - merge .h file to .c, rename it with irq-qeic.c
    Changes for v4:
    - modify comments
    Changes for v5:
    - disable rename detection
    Changes for v6:
    - rebase
    Changes for v7:
    - na
  irqchip/qeic: merge qeic init code from platforms to a common function
    Changes for v2:
    - modify subject and commit msg
    - add check for qeic by type
    Changes for v3:
    - na
    Changes for v4:
    - na
    Changes for v5:
    - na
    Changes for v6:
    - rebase
    Changes for v7:
    - na
    Changes for v8:
    - use IRQCHIP_DECLARE() instead of subsys_initcall
  irqchip/qeic: merge qeic_of_init into qe_ic_init
    Changes for v2:
    - modify subject and commit msg
    - return 0 and add put node when return in qe_ic_init
    Changes for v3:
    - na
    Changes for v4:
    - na
    Changes for v5:
    - na
    Changes for v6:
    - rebase
    Changes for v7:
    - na
  irqchip/qeic: remove PPCisms for QEIC
    Changes for v6:
    - new added
    Changes for v7:
    - fix warning
    Changes for v8:
    - remove include/soc/fsl/qe/qe_ic.h

Zhao Qiang (4):
  irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
  irqchip/qeic: merge qeic init code from platforms to a common function
  irqchip/qeic: merge qeic_of_init into qe_ic_init
  irqchip/qeic: remove PPCisms for QEIC

 MAINTAINERS                                        |   6 +
 arch/powerpc/platforms/83xx/km83xx.c               |   1 -
 arch/powerpc/platforms/83xx/misc.c                 |  16 -
 arch/powerpc/platforms/83xx/mpc832x_mds.c          |   1 -
 arch/powerpc/platforms/83xx/mpc832x_rdb.c          |   1 -
 arch/powerpc/platforms/83xx/mpc836x_mds.c          |   1 -
 arch/powerpc/platforms/83xx/mpc836x_rdk.c          |   1 -
 arch/powerpc/platforms/85xx/corenet_generic.c      |  10 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c          |  15 -
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c          |  17 -
 arch/powerpc/platforms/85xx/twr_p102x.c            |  15 -
 drivers/irqchip/Makefile                           |   1 +
 drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} | 358 -
 drivers/soc/fsl/qe/Makefile                        |   2 +-
 drivers/soc/fsl/qe/qe_ic.h                         | 103 --
 include/soc/fsl/qe/qe_ic.h                         | 139
 16 files changed, 218 insertions(+), 469 deletions(-)
 rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (58%)
 delete mode 100644 drivers/soc/fsl/qe/qe_ic.h
 delete mode 100644 include/soc/fsl/qe/qe_ic.h

--
2.1.0.27.g96db324