Re: [PATCH v1] fs2dt: Refine kdump device_tree sort
Hi Simon,

How about this patch?

Thanks
Wei

On 06/12/2014 01:16 PM, wei.y...@windriver.com wrote:
> From: Yang Wei <wei.y...@windriver.com>
>
> Commit b02d735bf rearranged the device-tree entries and assumed that
> these entries are sorted in ascending order. But when I was validating
> kexec and kdump, the order of the serial nodes still changed. We should
> compare not only the length of the directory name but also the
> directory name itself, which ensures that the device nodes really are
> in ascending order.
>
> Signed-off-by: Yang Wei <wei.y...@windriver.com>
> ---
>  kexec/fs2dt.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> It is validated on Freescale t4240qds.
>
> diff --git a/kexec/fs2dt.c b/kexec/fs2dt.c
> index 1e5f074..0bffaf5 100644
> --- a/kexec/fs2dt.c
> +++ b/kexec/fs2dt.c
> @@ -479,6 +479,9 @@ static int comparefunc(const struct dirent **dentry1,
>  {
>  	char *str1 = (*(struct dirent **)dentry1)->d_name;
>  	char *str2 = (*(struct dirent **)dentry2)->d_name;
> +	char* ptr1 = strchr(str1, '@');
> +	char* ptr2 = strchr(str2, '@');
> +	int len1, len2;
>  
>  	/*
>  	 * strcmp scans from left to right and fails to idetify for some
> @@ -486,9 +489,13 @@ static int comparefunc(const struct dirent **dentry1,
>  	 * Therefore, we get the wrong sorted order like memory@10000000 and
>  	 * memory@f000000.
>  	 */
> -	if (strchr(str1, '@') && strchr(str2, '@') &&
> -		(strlen(str1) > strlen(str2)))
> -		return 1;
> +	if (ptr1 && ptr2) {
> +		len1 = ptr1 - str1;
> +		len2 = ptr2 - str2;
> +		if (!strncmp(str1, str2, len1 > len2 ? len1: len2) &&
> +			(strlen(str1) > strlen(str2)))
> +			return 1;
> +	}
>  
>  	return strcmp(str1, str2);
>  }

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 1/5] powerpc: Add ppc_global_function_entry()
ABIv2 has the concept of a global and local entry point to a function.
In most cases we are interested in the local entry point, and so that is
what ppc_function_entry() returns.

However we have a case in the ftrace code where we want the global entry
point, and there may be other places we need it too. Rather than special
casing each, add an accessor.

For ABIv1 and 32-bit there is only a single entry point, so we return
that. That means it's safe for the caller to use this without also
checking the ABI version.

Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
---
 arch/powerpc/include/asm/code-patching.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index 37991e1..840a550 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -88,4 +88,15 @@ static inline unsigned long ppc_function_entry(void *func)
 #endif
 }
 
+static inline unsigned long ppc_global_function_entry(void *func)
+{
+#if defined(CONFIG_PPC64) && defined(_CALL_ELF) && _CALL_ELF == 2
+	/* PPC64 ABIv2 the global entry point is at the address */
+	return (unsigned long)func;
+#else
+	/* All other cases there is no change vs ppc_function_entry() */
+	return ppc_function_entry(func);
+#endif
+}
+
 #endif /* _ASM_POWERPC_CODE_PATCHING_H */
-- 
1.9.1
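As a rough illustration of the two entry points, here is a user-space sketch, not kernel code. It assumes the usual ELFv2 prologue, in which the global entry point begins with an "addis r2,r12,..." / "addi r2,r2,..." pair that sets up the TOC pointer and the local entry point is the instruction after that pair. The helper names model_global_entry() and model_local_entry() are hypothetical, and the opcode constants are derived from the instruction encodings rather than taken from this series.

```c
#include <stdint.h>

/* Assumed encodings: opcode/RT/RA fields of "addis r2,r12,x" and "addi r2,r2,x". */
#define OP_RT_RA_MASK 0xffff0000u
#define ADDIS_R2_R12  0x3c4c0000u
#define ADDI_R2_R2    0x38420000u

/* Global entry point: the address of the function's first instruction. */
static const uint32_t *model_global_entry(const uint32_t *func)
{
	return func;
}

/* Local entry point: skip the TOC setup pair when the function has one. */
static const uint32_t *model_local_entry(const uint32_t *func)
{
	if ((func[0] & OP_RT_RA_MASK) == ADDIS_R2_R12 &&
	    (func[1] & OP_RT_RA_MASK) == ADDI_R2_R2)
		return func + 2;

	/* No TOC setup: both entry points coincide. */
	return func;
}
```

For a function with the TOC setup pair the two helpers differ by two instructions (8 bytes), which is why a caller that compares against a trampoline target has to know which of the two it is holding.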
[PATCH 3/5] powerpc/ftrace: Fix inverted check of create_branch()
In commit 24a1bdc35, "Fix ABIv2 issues with __ftrace_make_call", Anton
changed the logic that creates and patches the branch, and added a
thinko in the check of create_branch(). create_branch() returns the
instruction that was generated, so if we get zero then it failed.

The result is we can't ftrace modules:

  Branch out of range
  WARNING: at ../kernel/trace/ftrace.c:1638
  ftrace failed to modify [d4ba001c] fuse_req_init_context+0x1c/0x90 [fuse]

We should probably fix patch_instruction() to do that check and make the
API saner, but that's a separate patch. For now just invert the test.

Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
---
 arch/powerpc/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index f5d1a34..8fc0c17 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -320,7 +320,7 @@ __ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 	}
 
 	/* Ensure branch is within 24 bits */
-	if (create_branch(ip, rec->arch.mod->arch.tramp, BRANCH_SET_LINK)) {
+	if (!create_branch(ip, rec->arch.mod->arch.tramp, BRANCH_SET_LINK)) {
 		printk(KERN_ERR "Branch out of range");
 		return -EINVAL;
 	}
-- 
1.9.1
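To make the return convention concrete, here is a minimal user-space model of a relative branch encoder. It is a sketch under assumptions, not the kernel's create_branch(): the names model_create_branch(), model_patch_call() and MODEL_BRANCH_SET_LINK are hypothetical, and the encoding assumes the PowerPC I-form branch (opcode 18, 24-bit LI field shifted left by two, LK in the lowest bit).

```c
#include <stdint.h>

#define MODEL_BRANCH_SET_LINK 0x1u	/* assumed: LK bit of an I-form branch */

/*
 * Encode a relative branch from 'from' to 'to'. Returns the instruction
 * word, or 0 when the offset cannot be represented.
 */
static uint32_t model_create_branch(uint64_t from, uint64_t to, uint32_t flags)
{
	int64_t offset = (int64_t)(to - from);

	/* The offset must fit in a signed 26-bit field and be word aligned. */
	if (offset < -0x2000000 || offset > 0x1fffffc || (offset & 0x3))
		return 0;

	return 0x48000000u | (flags & 0x3u) | ((uint32_t)offset & 0x03fffffcu);
}

/* Caller pattern after the fix: a zero return value means failure. */
static int model_patch_call(uint64_t ip, uint64_t tramp)
{
	if (!model_create_branch(ip, tramp, MODEL_BRANCH_SET_LINK))
		return -1;	/* branch out of range */

	return 0;
}
```

With the test inverted, as it was before this patch, every encodable branch would be reported as out of range, which matches the failure described above.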
[PATCH 2/5] powerpc/ftrace: Fix typo in mask of opcode
In commit 24a1bdc35, "Fix ABIv2 issues with __ftrace_make_call", Anton
changed the logic that checks for the expected code sequence when
patching a module.

We missed the typo in the mask, 0xffff00000 should be 0xffff0000, which
has the effect of making the test always true.

That makes it impossible to ftrace against modules, eg:

  Unexpected call sequence: 48000008 e8410018
  WARNING: at ../kernel/trace/ftrace.c:1638
  ftrace failed to modify [d7cf001c] rng_dev_open+0x1c/0x70 [rng_core]

Reported-by: David Binderman <dcb...@hotmail.com>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
---
 arch/powerpc/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index f202d07..f5d1a34 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -307,7 +307,7 @@ __ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 	 * The load offset is different depending on the ABI. For simplicity
 	 * just mask it out when doing the compare.
 	 */
-	if ((op[0] != 0x48000008) || ((op[1] & 0xffff00000) != 0xe8410000)) {
+	if ((op[0] != 0x48000008) || ((op[1] & 0xffff0000) != 0xe8410000)) {
 		printk(KERN_ERR "Unexpected call sequence: %x %x\n",
 			op[0], op[1]);
 		return -EINVAL;
-- 
1.9.1
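The effect of the extra digit can be checked in isolation. The sketch below assumes the intended mask is 0xffff0000 and that the typo added one more zero, giving a 36-bit constant; the function names are hypothetical. Applied to a 32-bit instruction word, the 36-bit mask keeps only bits 20 to 31, so bit 16 of the expected value 0xe8410000 can never match.

```c
#include <stdint.h>

/* Returns 1 when the instruction is rejected by the mistyped compare. */
static int rejected_with_typo(uint32_t insn)
{
	/* 36-bit mask: on a 32-bit value only bits 20..31 survive. */
	return (insn & 0xffff00000ULL) != 0xe8410000ULL;
}

/* Returns 1 when the instruction is rejected by the corrected compare. */
static int rejected_with_fix(uint32_t insn)
{
	/* 32-bit mask: the low 16 bits (the load offset) are ignored. */
	return (insn & 0xffff0000u) != 0xe8410000u;
}
```

The instruction word e8410018 from the log above is rejected by the first helper and accepted by the second.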
[PATCH 4/5] powerpc/ftrace: Fix nop of modules on 64bit LE (ABIv2)
There is a bug in the handling of the function entry when we are nopping
out a branch from a module in ftrace.

We compare the result of module_trampoline_target() with the value of
ppc_function_entry(), and expect them to be equal. But they never will
be.

module_trampoline_target() will always return the global entry point of
the function, whereas ppc_function_entry() will always return the local.

Fix it by using the newly added ppc_global_function_entry().

Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
---
 arch/powerpc/kernel/ftrace.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index 8fc0c17..96efc66 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -105,7 +105,7 @@ __ftrace_make_nop(struct module *mod,
 		  struct dyn_ftrace *rec, unsigned long addr)
 {
 	unsigned int op;
-	unsigned long ptr;
+	unsigned long entry, ptr;
 	unsigned long ip = rec->ip;
 	void *tramp;
 
@@ -136,10 +136,11 @@ __ftrace_make_nop(struct module *mod,
 
 	pr_devel("trampoline target %lx", ptr);
 
+	entry = ppc_global_function_entry((void *)addr);
 	/* This should match what was called */
-	if (ptr != ppc_function_entry((void *)addr)) {
+	if (ptr != entry) {
 		printk(KERN_ERR "addr %lx does not match expected %lx\n",
-			ptr, ppc_function_entry((void *)addr));
+			ptr, entry);
 		return -EINVAL;
 	}
 
-- 
1.9.1
[PATCH 5/5] powerpc/ftrace: Use pr_fmt() to namespace error messages
The printks() in our ftrace code have no prefix, so they appear on the
console with very little context, eg:

  Branch out of range

Use pr_fmt() & pr_err() to add a prefix. While we're at it, collapse a
few split lines that don't need to be, and add a missing newline to one
message.

Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
---
 arch/powerpc/kernel/ftrace.c | 43 ++++++++++++++++++++-----------------------
 1 file changed, 20 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index 96efc66..d178834 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -10,6 +10,8 @@
  *
  */
 
+#define pr_fmt(fmt) "ftrace-powerpc: " fmt
+
 #include <linux/spinlock.h>
 #include <linux/hardirq.h>
 #include <linux/uaccess.h>
@@ -115,7 +117,7 @@ __ftrace_make_nop(struct module *mod,
 
 	/* Make sure that that this is still a 24bit jump */
 	if (!is_bl_op(op)) {
-		printk(KERN_ERR "Not expected bl: opcode is %x\n", op);
+		pr_err("Not expected bl: opcode is %x\n", op);
 		return -EINVAL;
 	}
 
@@ -125,12 +127,12 @@ __ftrace_make_nop(struct module *mod,
 	pr_devel("ip:%lx jumps to %p", ip, tramp);
 
 	if (!is_module_trampoline(tramp)) {
-		printk(KERN_ERR "Not a trampoline\n");
+		pr_err("Not a trampoline\n");
 		return -EINVAL;
 	}
 
 	if (module_trampoline_target(mod, tramp, &ptr)) {
-		printk(KERN_ERR "Failed to get trampoline target\n");
+		pr_err("Failed to get trampoline target\n");
 		return -EFAULT;
 	}
 
@@ -139,8 +141,7 @@ __ftrace_make_nop(struct module *mod,
 	entry = ppc_global_function_entry((void *)addr);
 	/* This should match what was called */
 	if (ptr != entry) {
-		printk(KERN_ERR "addr %lx does not match expected %lx\n",
-			ptr, entry);
+		pr_err("addr %lx does not match expected %lx\n", ptr, entry);
 		return -EINVAL;
 	}
 
@@ -180,7 +181,7 @@ __ftrace_make_nop(struct module *mod,
 
 	/* Make sure that that this is still a 24bit jump */
 	if (!is_bl_op(op)) {
-		printk(KERN_ERR "Not expected bl: opcode is %x\n", op);
+		pr_err("Not expected bl: opcode is %x\n", op);
 		return -EINVAL;
 	}
 
@@ -199,7 +200,7 @@ __ftrace_make_nop(struct module *mod,
 
 	/* Find where the trampoline jumps to */
 	if (probe_kernel_read(jmp, (void *)tramp, sizeof(jmp))) {
-		printk(KERN_ERR "Failed to read %lx\n", tramp);
+		pr_err("Failed to read %lx\n", tramp);
 		return -EFAULT;
 	}
 
@@ -210,7 +211,7 @@ __ftrace_make_nop(struct module *mod,
 	    ((jmp[1] & 0xffff0000) != 0x398c0000) ||
 	    (jmp[2] != 0x7d8903a6) ||
 	    (jmp[3] != 0x4e800420)) {
-		printk(KERN_ERR "Not a trampoline\n");
+		pr_err("Not a trampoline\n");
 		return -EINVAL;
 	}
 
@@ -222,8 +223,7 @@ __ftrace_make_nop(struct module *mod,
 	pr_devel(" %lx ", tramp);
 
 	if (tramp != addr) {
-		printk(KERN_ERR
-		       "Trampoline location %08lx does not match addr\n",
+		pr_err("Trampoline location %08lx does not match addr\n",
 		       tramp);
 		return -EINVAL;
 	}
@@ -264,15 +264,13 @@ int ftrace_make_nop(struct module *mod,
 	 */
 	if (!rec->arch.mod) {
 		if (!mod) {
-			printk(KERN_ERR "No module loaded addr=%lx\n",
-			       addr);
+			pr_err("No module loaded addr=%lx\n", addr);
 			return -EFAULT;
 		}
 		rec->arch.mod = mod;
 	} else if (mod) {
 		if (mod != rec->arch.mod) {
-			printk(KERN_ERR
-			       "Record mod %p not equal to passed in mod %p\n",
+			pr_err("Record mod %p not equal to passed in mod %p\n",
 			       rec->arch.mod, mod);
 			return -EINVAL;
 		}
@@ -309,25 +307,24 @@ __ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 	 * just mask it out when doing the compare.
 	 */
 	if ((op[0] != 0x48000008) || ((op[1] & 0xffff0000) != 0xe8410000)) {
-		printk(KERN_ERR "Unexpected call sequence: %x %x\n",
-			op[0], op[1]);
+		pr_err("Unexpected call sequence: %x %x\n", op[0], op[1]);
 		return -EINVAL;
 	}
 
 	/* If we never set up a trampoline to ftrace_caller, then bail */
 	if (!rec->arch.mod->arch.tramp) {
-		printk(KERN_ERR "No ftrace trampoline\n");
+		pr_err("No ftrace trampoline\n");
 		return -EINVAL;
 	}
 
 	/* Ensure branch is within 24 bits */
 	if (!create_branch(ip,
Re: [PATCH 4/4] powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest.
On Wed, Jun 11, 2014 at 02:18:21PM +0530, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>
>
> Currently we forward MCEs to guest which have been recovered by guest.
> And for unhandled errors we do not deliver the MCE to guest. It looks like
> with no support of FWNMI in qemu, guest just panics whenever we deliver the
> recovered MCEs to guest. Also, the existing code used to return to host for
> unhandled errors which was causing guest to hang with soft lockups inside
> guest and makes it difficult to recover guest instance.
>
> This patch now forwards all fatal MCEs to guest causing guest to crash/panic.
> And, for recovered errors we just go back to normal functioning of guest
> instead of returning to host.

... having corrupted possibly live values that the guest had in SRR0/1.

Ideally the guest should have cleared MSR[RI] before putting values in
SRR0/1, so perhaps you could check that and return to the guest
without giving it a machine check if MSR[RI] is set. But if MSR[RI]
is clear, the guest is unfixably corrupted because the machine check
overwrote SRR0/1, and the only thing we can do, in the absence of
FWNMI support, is give the guest a machine check interrupt and let it
crash.

Paul.
[PATCH v1] kexec:fs2dt: Refine kdump device_tree sort
From: Yang Wei <wei.y...@windriver.com>

Commit b02d735bf rearranged the device-tree entries and assumed that
these entries are sorted in ascending order. But when I was validating
kexec and kdump, the order of the serial nodes still changed. We should
compare not only the length of the directory name but also the
directory name itself, which ensures that the device nodes really are
in ascending order.

Signed-off-by: Yang Wei <wei.y...@windriver.com>
---

Hi Simon,

Please help me take a look at this patch. I validated it on
Freescale t4240qds.

Thanks
Wei

 kexec/fs2dt.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kexec/fs2dt.c b/kexec/fs2dt.c
index 1e5f074..0bffaf5 100644
--- a/kexec/fs2dt.c
+++ b/kexec/fs2dt.c
@@ -479,6 +479,9 @@ static int comparefunc(const struct dirent **dentry1,
 {
 	char *str1 = (*(struct dirent **)dentry1)->d_name;
 	char *str2 = (*(struct dirent **)dentry2)->d_name;
+	char* ptr1 = strchr(str1, '@');
+	char* ptr2 = strchr(str2, '@');
+	int len1, len2;
 
 	/*
 	 * strcmp scans from left to right and fails to idetify for some
@@ -486,9 +489,13 @@ static int comparefunc(const struct dirent **dentry1,
 	 * Therefore, we get the wrong sorted order like memory@10000000 and
 	 * memory@f000000.
 	 */
-	if (strchr(str1, '@') && strchr(str2, '@') &&
-		(strlen(str1) > strlen(str2)))
-		return 1;
+	if (ptr1 && ptr2) {
+		len1 = ptr1 - str1;
+		len2 = ptr2 - str2;
+		if (!strncmp(str1, str2, len1 > len2 ? len1: len2) &&
+			(strlen(str1) > strlen(str2)))
+			return 1;
+	}
 
 	return strcmp(str1, str2);
 }
-- 
1.7.9.5
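For readers who want to try the ordering outside kexec-tools, the sketch below is a stand-alone variant of the idea, not the patch's comparefunc(). It works on plain strings instead of struct dirent, assumes unit addresses are hexadecimal without leading zeros, and uses the hypothetical names node_name_cmp() and sort_node_names(). It compares the unit-address length only when the part before '@' is identical, and it returns -1 in the mirrored case so that the comparator is symmetric, which sort routines such as qsort() expect.

```c
#include <stdlib.h>
#include <string.h>

/* Order device-tree node names of the form "name@unit-address". */
static int node_name_cmp(const char *a, const char *b)
{
	const char *at_a = strchr(a, '@');
	const char *at_b = strchr(b, '@');

	if (at_a && at_b) {
		size_t len_a = (size_t)(at_a - a);
		size_t len_b = (size_t)(at_b - b);

		/* Same base name: the shorter unit address sorts first. */
		if (len_a == len_b && strncmp(a, b, len_a) == 0) {
			size_t unit_a = strlen(at_a);
			size_t unit_b = strlen(at_b);

			if (unit_a != unit_b)
				return unit_a < unit_b ? -1 : 1;
		}
	}

	/* Different base names, or equal-length addresses: plain strcmp. */
	return strcmp(a, b);
}

/* qsort() adapter for an array of string pointers. */
static int node_name_cmp_qsort(const void *pa, const void *pb)
{
	return node_name_cmp(*(const char *const *)pa,
			     *(const char *const *)pb);
}

static void sort_node_names(const char **names, size_t count)
{
	qsort(names, count, sizeof(names[0]), node_name_cmp_qsort);
}
```

Sorting an array that contains memory@10000000 and memory@f000000 with sort_node_names() places memory@f000000 first, while nodes with different base names keep their plain strcmp() order.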
Re: [PATCH v1] fs2dt: Refine kdump device_tree sort
Simon,

I missed the kexec string in the subject, so please ignore this version.
I will resend it with kexec added to the subject.

Thanks
Wei

On 06/17/2014 02:01 PM, Yang,Wei wrote:
> Hi Simon,
>
> How about this patch?
>
> Thanks
> Wei
>
> On 06/12/2014 01:16 PM, wei.y...@windriver.com wrote:
> > From: Yang Wei <wei.y...@windriver.com>
> >
> > Commit b02d735bf rearranged the device-tree entries and assumed that
> > these entries are sorted in ascending order. But when I was validating
> > kexec and kdump, the order of the serial nodes still changed. We should
> > compare not only the length of the directory name but also the
> > directory name itself, which ensures that the device nodes really are
> > in ascending order.
> >
> > Signed-off-by: Yang Wei <wei.y...@windriver.com>
> > ---
> >  kexec/fs2dt.c | 13 ++++++++++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > It is validated on Freescale t4240qds.
> >
> > diff --git a/kexec/fs2dt.c b/kexec/fs2dt.c
> > index 1e5f074..0bffaf5 100644
> > --- a/kexec/fs2dt.c
> > +++ b/kexec/fs2dt.c
> > @@ -479,6 +479,9 @@ static int comparefunc(const struct dirent **dentry1,
> >  {
> >  	char *str1 = (*(struct dirent **)dentry1)->d_name;
> >  	char *str2 = (*(struct dirent **)dentry2)->d_name;
> > +	char* ptr1 = strchr(str1, '@');
> > +	char* ptr2 = strchr(str2, '@');
> > +	int len1, len2;
> >  
> >  	/*
> >  	 * strcmp scans from left to right and fails to idetify for some
> > @@ -486,9 +489,13 @@ static int comparefunc(const struct dirent **dentry1,
> >  	 * Therefore, we get the wrong sorted order like memory@10000000 and
> >  	 * memory@f000000.
> >  	 */
> > -	if (strchr(str1, '@') && strchr(str2, '@') &&
> > -		(strlen(str1) > strlen(str2)))
> > -		return 1;
> > +	if (ptr1 && ptr2) {
> > +		len1 = ptr1 - str1;
> > +		len2 = ptr2 - str2;
> > +		if (!strncmp(str1, str2, len1 > len2 ? len1: len2) &&
> > +			(strlen(str1) > strlen(str2)))
> > +			return 1;
> > +	}
> >  
> >  	return strcmp(str1, str2);
> >  }
Re: [PATCH tty-next 14/22] tty: Remove tty_wait_until_sent_from_close()
On Monday 16 June 2014 09:17:11 Peter Hurley wrote:
> tty_wait_until_sent_from_close() drops the tty lock while waiting
> for the tty driver to finish sending previously accepted data (ie.,
> data remaining in its write buffer and transmit fifo).
>
> However, dropping the tty lock is a hold-over from when the tty
> lock was system-wide; ie., one lock for all ttys.
>
> Since commit 89c8d91e31f267703e365593f6bfebb9f6d2ad01,
> 'tty: localise the lock', dropping the tty lock has not been necessary.
>
> CC: Karsten Keil <i...@linux-pingi.de>
> CC: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>

I don't understand the second half of the changelog, it doesn't seem
to fit here: the deadlock that we are trying to avoid here happens
when the *same* tty needs the lock to complete the function that
sends the pending data. I don't think we still do that any more,
but it doesn't seem related to the tty lock being system-wide or not.

	Arnd
Re: [PATCH 4/4] powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest.
On 2014-06-17 16:23:58 Tue, Paul Mackerras wrote:
> On Wed, Jun 11, 2014 at 02:18:21PM +0530, Mahesh J Salgaonkar wrote:
> > From: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>
> >
> > Currently we forward MCEs to guest which have been recovered by guest.
> > And for unhandled errors we do not deliver the MCE to guest. It looks like
> > with no support of FWNMI in qemu, guest just panics whenever we deliver the
> > recovered MCEs to guest. Also, the existing code used to return to host for
> > unhandled errors which was causing guest to hang with soft lockups inside
> > guest and makes it difficult to recover guest instance.
> >
> > This patch now forwards all fatal MCEs to guest causing guest to crash/panic.
> > And, for recovered errors we just go back to normal functioning of guest
> > instead of returning to host.
>
> ... having corrupted possibly live values that the guest had in SRR0/1.
>
> Ideally the guest should have cleared MSR[RI] before putting values in
> SRR0/1, so perhaps you could check that and return to the guest
> without giving it a machine check if MSR[RI] is set. But if MSR[RI]
> is clear, the guest is unfixably corrupted because the machine check
> overwrote SRR0/1, and the only thing we can do, in the absence of
> FWNMI support, is give the guest a machine check interrupt and let it
> crash.

Yes, agreed. I have a patch (below) ready for this; I will test/verify
it and send it out soon.

Thanks,
-Mahesh.

Deliver machine check with MSR(RI=0) to guest as MCE

From: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 868347e..c9c56ee 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2257,7 +2257,6 @@ machine_check_realmode:
 	mr	r3, r9		/* get vcpu pointer */
 	bl	kvmppc_realmode_machine_check
 	nop
-	cmpdi	r3, 0		/* Did we handle MCE ? */
 	ld	r9, HSTATE_KVM_VCPU(r13)
 	li	r12, BOOK3S_INTERRUPT_MACHINE_CHECK
 	/*
@@ -2270,13 +2269,18 @@ machine_check_realmode:
 	 * The old code used to return to host for unhandled errors which
 	 * was causing guest to hang with soft lockups inside guest and
 	 * makes it difficult to recover guest instance.
+	 *
+	 * if we receive machine check with MSR(RI=0) then deliver it to
+	 * guest as machine check causing guest to crash.
 	 */
-	ld	r10, VCPU_PC(r9)
 	ld	r11, VCPU_MSR(r9)
+	andi.	r10, r11, MSR_RI	/* check for unrecoverable exception */
+	beq	1f			/* Deliver a machine check to guest */
+	ld	r10, VCPU_PC(r9)
+	cmpdi	r3, 0		/* Did we handle MCE ? */
 	bne	2f	/* Continue guest execution. */
 	/* If not, deliver a machine check. SRR0/1 are already set */
-	li	r10, BOOK3S_INTERRUPT_MACHINE_CHECK
-	ld	r11, VCPU_MSR(r9)
+1:	li	r10, BOOK3S_INTERRUPT_MACHINE_CHECK
 	bl	kvmppc_msr_interrupt
 2:	b	fast_interrupt_c_return
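The branch structure of the assembly above can be restated as a small decision function. This is only a C model for reading the control flow, not code from the patch: the names model_mc_action(), MC_RESUME_GUEST and MC_DELIVER_TO_GUEST are hypothetical, and MODEL_MSR_RI assumes the Recoverable Interrupt bit has the value 0x2.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MSR_RI 0x2ull	/* assumed value of MSR[RI] */

enum mc_action {
	MC_RESUME_GUEST,	/* branch to label 2: continue the guest */
	MC_DELIVER_TO_GUEST,	/* label 1: queue a machine check */
};

/*
 * guest_msr: the guest MSR at the time of the machine check.
 * handled:   true when the real-mode handler recovered the error.
 */
static enum mc_action model_mc_action(uint64_t guest_msr, bool handled)
{
	/* RI clear: SRR0/1 may have been live, so the guest cannot continue. */
	if (!(guest_msr & MODEL_MSR_RI))
		return MC_DELIVER_TO_GUEST;

	/* RI set: resume only if the error was actually recovered. */
	return handled ? MC_RESUME_GUEST : MC_DELIVER_TO_GUEST;
}
```

The only case in which the guest resumes is a recovered error with MSR[RI] set; every other combination delivers the machine check.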
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On 14.06.14 23:08, Madhavan Srinivasan wrote:
> This patch adds kernel side support for software breakpoint.
> Design is that, by using an illegal instruction, we trap to hypervisor
> via Emulation Assistance interrupt, where we check for the illegal instruction
> and accordingly we return to Host or Guest. Patch mandates use of abs instruction
> (primary opcode 31 and extended opcode 360) as sw breakpoint instruction.
> Based on PowerISA v2.01, ABS instruction has been dropped from the architecture
> and treated an illegal instruction.
>
> Signed-off-by: Madhavan Srinivasan <ma...@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kvm/book3s.c    |  3 ++-
>  arch/powerpc/kvm/book3s_hv.c | 23 +++++++++++++++++++----
>  2 files changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index c254c27..b40fe5d 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
>  int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>  					struct kvm_guest_debug *dbg)
>  {
> -	return -EINVAL;
> +	vcpu->guest_debug = dbg->control;
> +	return 0;
>  }
>  
>  void kvmppc_decrementer_func(unsigned long data)
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 7a12edb..688421d 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -67,6 +67,14 @@
>  /* Used as a null value for timebase values */
>  #define TB_NIL	(~(u64)0)
>  
> +/*
> + * SW_BRK_DBG_INT is debug Instruction for supporting Software Breakpoint.
> + * Instruction mnemonic is ABS, primary opcode is 31 and extended opcode is 360.
> + * Based on PowerISA v2.01, ABS instruction has been dropped from the architecture
> + * and treated an illegal instruction.
> + */
> +#define SW_BRK_DBG_INT 0x7c0002d0

The instruction we use to trap needs to get exposed to user space via a
ONE_REG property.

Also, why don't we use twi always or something else that actually is
defined as illegal instruction? I would like to see this shared with
book3s_32 PR.

> +
>  static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
>  static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
>  
> @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  		break;
>  	/*
>  	 * This occurs if the guest executes an illegal instruction.
> -	 * We just generate a program interrupt to the guest, since
> -	 * we don't emulate any guest instructions at this stage.
> +	 * To support software breakpoint, we check for the sw breakpoint
> +	 * instruction to return back to host, if not we just generate a
> +	 * program interrupt to the guest.
>  	 */
>  	case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
> -		kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
> -		r = RESUME_GUEST;
> +		if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {

Don't access last_inst directly. Instead use the provided helpers.

> +			run->exit_reason = KVM_EXIT_DEBUG;
> +			run->debug.arch.address = vcpu->arch.pc;
> +			r = RESUME_HOST;
> +		} else {
> +			kvmppc_core_queue_program(vcpu, 0x80000);

magic numbers

> +			r = RESUME_GUEST;
> +		}
>  		break;
>  	/*
>  	 * This occurs if the guest (kernel or userspace), does something that

Please enable PR KVM as well while you're at it.

Alex
[PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
On vcpu schedule, the condition checked for tlb pollution is too loose.
The tlb entries of a vcpu become polluted (vs stale) only when a different
vcpu within the same logical partition runs in-between. Optimize the tlb
invalidation condition taking into account the logical partition id.

With the new invalidation condition, a guest shows 4% performance improvement
on P5020DS while running a memory stress application with the cpu oversubscribed,
the other guest running a cpu intensive workload.

Guest - old invalidation condition
  real 3.89
  user 3.87
  sys 0.01

Guest - enhanced invalidation condition
  real 3.75
  user 3.73
  sys 0.01

Host
  real 3.70
  user 1.85
  sys 0.00

The memory stress application accesses 4KB pages backed by 75% of available
TLB0 entries:

char foo[ENTRIES][4096] __attribute__ ((aligned (4096)));

int main()
{
	char bar;
	int i, j;

	for (i = 0; i < ITERATIONS; i++)
		for (j = 0; j < ENTRIES; j++)
			bar = foo[j][0];

	return 0;
}

Signed-off-by: Mihai Caraman <mihai.cara...@freescale.com>
Cc: Scott Wood <scottw...@freescale.com>
---
v2:
 - improve patch name and description
 - add performance results

 arch/powerpc/kvm/e500mc.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 17e4562..d3b814b0 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -111,10 +111,12 @@ void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr)
 }
 
 static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
+static DEFINE_PER_CPU(int, last_lpid_on_cpu);
 
 static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
+	bool update_last = false, inval_tlb = false;
 
 	kvmppc_booke_vcpu_load(vcpu, cpu);
 
@@ -140,12 +142,24 @@ static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 	mtspr(SPRN_GDEAR, vcpu->arch.shared->dar);
 	mtspr(SPRN_GESR, vcpu->arch.shared->esr);
 
-	if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
-	    __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
-		kvmppc_e500_tlbil_all(vcpu_e500);
+	if (vcpu->arch.oldpir != mfspr(SPRN_PIR)) {
+		/* stale tlb entries */
+		inval_tlb = update_last = true;
+	} else if (__get_cpu_var(last_vcpu_on_cpu) != vcpu) {
+		update_last = true;
+		/* polluted tlb entries */
+		inval_tlb = __get_cpu_var(last_lpid_on_cpu) ==
+			    vcpu->kvm->arch.lpid;
+	}
+
+	if (update_last) {
 		__get_cpu_var(last_vcpu_on_cpu) = vcpu;
+		__get_cpu_var(last_lpid_on_cpu) = vcpu->kvm->arch.lpid;
 	}
 
+	if (inval_tlb)
+		kvmppc_e500_tlbil_all(vcpu_e500);
+
 	kvmppc_load_guest_fp(vcpu);
 }
-- 
1.7.11.7
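The new condition is easier to follow as a pure function over the per-cpu state. The sketch below mirrors the logic of the hunk above in user space; the structures model_cpu_state and model_vcpu and the function model_vcpu_load() are hypothetical stand-ins for the per-cpu variables and the kvm_vcpu fields, with vcpus identified by an integer id instead of a pointer.

```c
#include <stdbool.h>

/* Per physical CPU bookkeeping (stand-in for the per-cpu variables). */
struct model_cpu_state {
	int last_vcpu;	/* id of the last vcpu loaded here, -1 if none */
	int last_lpid;	/* lpid of that vcpu, -1 if none */
};

struct model_vcpu {
	int id;
	int lpid;	/* logical partition the vcpu belongs to */
	int oldpir;	/* physical CPU the vcpu last ran on */
};

/* Returns true when the TLB must be invalidated before running the vcpu. */
static bool model_vcpu_load(struct model_cpu_state *cpu, int pir,
			    const struct model_vcpu *vcpu)
{
	bool update_last = false, inval_tlb = false;

	if (vcpu->oldpir != pir) {
		/* Migrated from another CPU: entries here are stale. */
		inval_tlb = update_last = true;
	} else if (cpu->last_vcpu != vcpu->id) {
		update_last = true;
		/* Polluted only if the previous vcpu shares our lpid. */
		inval_tlb = cpu->last_lpid == vcpu->lpid;
	}

	if (update_last) {
		cpu->last_vcpu = vcpu->id;
		cpu->last_lpid = vcpu->lpid;
	}

	return inval_tlb;
}
```

Two vcpus from different partitions alternating on one physical CPU never trigger an invalidation in this model, which is the case the benchmark in the description exercises.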
Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition on vcpu schedule
On 13.06.14 21:42, Scott Wood wrote:
> On Fri, 2014-06-13 at 16:55 +0200, Alexander Graf wrote:
> > On 13.06.14 16:43, mihai.cara...@freescale.com wrote:
> > > > -----Original Message-----
> > > > From: Alexander Graf [mailto:ag...@suse.de]
> > > > Sent: Thursday, June 12, 2014 8:05 PM
> > > > To: Caraman Mihai Claudiu-B02008
> > > > Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-
> > > > d...@lists.ozlabs.org; Wood Scott-B07421
> > > > Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition
> > > > on vcpu schedule
> > > >
> > > > On 06/12/2014 04:00 PM, Mihai Caraman wrote:
> > > > > On vcpu schedule, the condition checked for tlb pollution is too tight.
> > > > > The tlb entries of one vcpu are polluted when a different vcpu from the
> > > > > same partition runs in-between. Relax the current tlb invalidation
> > > > > condition taking into account the lpid.
> > > >
> > > > Can you quantify the performance improvement from this? We've had bugs
> > > > in this area before, so let's make sure it's worth it before making this
> > > > more complicated.
> > > >
> > > > > Signed-off-by: Mihai Caraman <mihai.caraman at freescale.com>
> > > >
> > > > Your mailer is broken? :)
> > > > This really should be an @.
> > > >
> > > > I think this should work. Scott, please ack.
> > >
> > > Alex, you were right. I screwed up the patch description by inverting relax
> > > and tight terms :) It should have been more like this:
> > >
> > > KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> > >
> > > On vcpu schedule, the condition checked for tlb pollution is too loose.
> > > The tlb entries of a vcpu are polluted (vs stale) only when a different vcpu
> > > within the same logical partition runs in-between. Optimize the tlb
> > > invalidation condition taking into account the lpid.
> >
> > Can't we give every vcpu its own lpid? Or don't we trap on global
> > invalidates?
>
> That would significantly increase the odds of exhausting LPIDs,
> especially on large chips like t4240 with similarly large VMs. If we
> were to do that, the LPIDs would need to be dynamically assigned (like
> PIDs), and should probably be a separate numberspace per physical core.

True, I didn't realize we only have so few of them. It would however
save us from most flushing as long as we have spare LPIDs available :).

Alex
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
> Also, why don't we use twi always or something else that actually is
> defined as illegal instruction? I would like to see this shared with
> book3s_32 PR.

twi will be directed to the guest on HV no ? We want a real illegal
because those go to the host (for potential emulation by the HV). I'm
trying to see if I can get the architect to set one in stone in a
future proof way.

Cheers,
Ben.
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
> On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
> > Also, why don't we use twi always or something else that actually is
> > defined as illegal instruction? I would like to see this shared with
> > book3s_32 PR.
>
> twi will be directed to the guest on HV no ? We want a real illegal
> because those go to the host (for potential emulation by the HV).

Ah, good point. I guess we need a different one for PR and HV then, to
ensure compatibility with older ISAs on PR.

Alex

> I'm trying to see if I can get the architect to set one in stone in a
> future proof way.
>
> Cheers,
> Ben.
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On 17.06.14 11:32, Benjamin Herrenschmidt wrote:
> On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
> > On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
> > > On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
> > > > Also, why don't we use twi always or something else that actually is
> > > > defined as illegal instruction? I would like to see this shared with
> > > > book3s_32 PR.
> > >
> > > twi will be directed to the guest on HV no ? We want a real illegal
> > > because those go to the host (for potential emulation by the HV).
> >
> > Ah, good point. I guess we need different one for PR and HV then to
> > ensure compatibility with older ISAs on PR.
>
> Well, we also need to be careful with what happens if a PR guest puts
> that instruction in, does that stop its HV guest/host ?
>
> What if it's done in userspace ? Does that stop the kernel ? :-)

The way SW breakpointing is handled is that when we see one, it gets
deflected into user space. User space then has an array of breakpoints
it configured itself. If the breakpoint is part of that list, it
consumes it. If not, it injects a debug interrupt (program in this case)
into the guest.

That way we can overlay that one instruction with as many layers as we
like :). We only get a performance hit on execution of that instruction.

> Maddy, I haven't checked, does your patch ensure that we only ever stop
> if the instruction is at a recorded bkpt address ? It still means that a
> userspace process can practically DOS its kernel by issuing a lot of
> these causing a crapload of exits.

Only user space knows about its breakpoint addresses, so we have to
deflect. However since time still ticks on, we only increase jitter of
the guest. The process would still get scheduled away after the same
amount of real time, no?

Alex
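The user-space side of that flow can be sketched as a lookup over the breakpoints the debugger registered. This is an illustration only, assuming a flat array of guest addresses; model_handle_debug_exit() and the enum values are hypothetical names and do not come from QEMU or the KVM API.

```c
#include <stddef.h>
#include <stdint.h>

enum bp_action {
	BP_CONSUME,		/* our breakpoint: stop and report to the debugger */
	BP_INJECT_TO_GUEST,	/* not ours: reflect a program interrupt */
};

/*
 * Called when the kernel exits to user space for a breakpoint
 * instruction at guest address 'pc'. 'bps' holds the 'n' addresses
 * that user space itself configured.
 */
static enum bp_action model_handle_debug_exit(const uint64_t *bps, size_t n,
					       uint64_t pc)
{
	for (size_t i = 0; i < n; i++) {
		if (bps[i] == pc)
			return BP_CONSUME;
	}

	/* Someone inside the guest placed it (e.g. a nested debugger). */
	return BP_INJECT_TO_GUEST;
}
```

Because the kernel does not know the breakpoint list, every execution of the instruction costs one exit to user space, which is the jitter discussed above.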
Boot failure in Power7 pSeries
Hi all, I used the newest linux-next (top commit: 5f295cdf5c5dbbb0c40f10f2ddae02ff46bbf773) to boot my Power7 machine in PowerVM mode (HypMode 01), using the default config file in /boot/. It shows the error log below:
OF stdout device is: /vdevice/vty@3000
Preparing to boot Linux version 3.16.0-rc1-next-20140617+ (root@shui) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #5 SMP Tue Jun 17 05:16:21 EDT 2014
Detected machine type: 0101
Max number of cores passed to firmware: 256 (NR_CPUS = 1024)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/vmlinux-3.16.0-rc1-next-20140617+ root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/swap rd.md=0 rd.dm=0 vconsole.keymap=us rd.luks=0 vconsole.font=latarcyrheb-sun16 rd.lvt
memory layout at init:
memory_limit : (16 MB aligned)
alloc_bottom : 0591
alloc_top: 1000
alloc_top_hi : 1000
rmo_top : 1000
ram_top : 1000
instantiating rtas at 0x0ee8... done
Querying for OPAL presence...
DEFAULT CATCH!, exception-handler=fff00700 at %SRR0: 041a1c14 %SRR1: 00081002
Open Firmware exception handler entered from non-OF code
Client's Fix Pt Regs:
00 042c017c 042c2ce8 04ae8d58 042c2f38
04 0369aafc 042c2f38 01adc100 042c2f38
08 04328d58 28002024 1002
0c a001 01a9fd20 041a7df8
10 041a2130 041a1e70 f821ff913d220005 01a9fd20
14 7962 0ee8 0118 0ee8
18 041a2610 0369 042c3070 041a1ce8
1c 041a1ce0 041b89f0 0003 0001
Special Regs: %IV: 0700 %CR: 48002024%XER: %DSISR: 4000 %SRR0: 041a1c14 %SRR1: 00081002 %LR: 0369aafc%CTR: %DAR: f821ff913d220035
Virtual PID = 0
ok
0
Thanks Mike
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On Tuesday 17 June 2014 03:02 PM, Benjamin Herrenschmidt wrote: On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote: On 17.06.14 11:22, Benjamin Herrenschmidt wrote: On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote: Also, why don't we use twi always or something else that actually is defined as illegal instruction? I would like to see this shared with book3s_32 PR. twi will be directed to the guest on HV, no? We want a real illegal because those go to the host (for potential emulation by the HV). Ah, good point. I guess we need a different one for PR and HV then to ensure compatibility with older ISAs on PR. Well, we also need to be careful with what happens if a PR guest puts that instruction in: does that stop its HV guest/host? Damn, my mail client is messed up; I did not see this mail till now. I haven't tried this with a PR guest kernel. I will need to try it before commenting. What if it's done in userspace? Does that stop the kernel? :-) Basically the flow is that when we see this instruction, we return to the host, and the host checks for the address in the SW breakpoint array; if it is not found, it returns to the kernel. Maddy, I haven't checked, does your patch ensure that we only ever stop if the instruction is at a recorded bkpt address? It still means that a userspace process can practically DoS its kernel by issuing a lot of these, causing a crapload of exits. This is valid; userspace can create a mess and we need to handle this. In case we don't find a valid SW breakpoint for this address in the host, we need to route it to the guest and kill it at the app. Regards Maddy Cheers, Ben. Alex I'm trying to see if I can get the architect to set one in stone in a future proof way. Cheers, Ben.
Re: [PATCH tty-next 14/22] tty: Remove tty_wait_until_sent_from_close()
On 06/17/2014 04:00 AM, Arnd Bergmann wrote: On Monday 16 June 2014 09:17:11 Peter Hurley wrote: tty_wait_until_sent_from_close() drops the tty lock while waiting for the tty driver to finish sending previously accepted data (i.e., data remaining in its write buffer and transmit fifo). However, dropping the tty lock is a hold-over from when the tty lock was system-wide; i.e., one lock for all ttys. Since commit 89c8d91e31f267703e365593f6bfebb9f6d2ad01, 'tty: localise the lock', dropping the tty lock has not been necessary. CC: Karsten Keil i...@linux-pingi.de CC: linuxppc-dev@lists.ozlabs.org Signed-off-by: Peter Hurley pe...@hurleysoftware.com I don't understand the second half of the changelog, it doesn't seem to fit here: the deadlock that we are trying to avoid here happens when the *same* tty needs the lock to complete the function that sends the pending data. I don't think we still do that any more, but it doesn't seem related to the tty lock being system-wide or not. The tty lock is not used in the i/o path; its purpose is to mutually exclude state changes in open(), close() and hangup(). The commit that added this [1] comments that _other_ ttys may wait for this tty to complete, and comments in the code note that this function should be removed when the system-wide tty mutex was removed (which happened with the commit noted in the changelog). Regards, Peter Hurley [1] commit a57a7bf3fc7eff00f07eb9c805774d911a3f2472 Author: Jiri Slaby jsl...@suse.cz Date: Thu Aug 25 15:12:06 2011 +0200 TTY: define tty_wait_until_sent_from_close We need this helper to fix system stalls. The issue is that the rest of the system TTYs wait for us to finish waiting. This wasn't an issue with BKL. BKL used to unlock implicitly. This is based on the Arnd suggestion.
Signed-off-by: Jiri Slaby jsl...@suse.cz Acked-by: Arnd Bergmann a...@arndb.de Signed-off-by: Greg Kroah-Hartman gre...@suse.de
RE: [PATCH tty-next 14/22] tty: Remove tty_wait_until_sent_from_close()
From: Peter Hurley ... I don't understand the second half of the changelog, it doesn't seem to fit here: the deadlock that we are trying to avoid here happens when the *same* tty needs the lock to complete the function that sends the pending data. I don't think we still do that any more, but it doesn't seem related to the tty lock being system-wide or not. The tty lock is not used in the i/o path; its purpose is to mutually exclude state changes in open(), close() and hangup(). The commit that added this [1] comments that _other_ ttys may wait for this tty to complete, and comments in the code note that this function should be removed when the system-wide tty mutex was removed (which happened with the commit noted in the changelog). What happens if another process tries to do a non-blocking open while you are sleeping in close waiting for output to drain? Hopefully this returns before that data has drained. David
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote: On 14.06.14 23:08, Madhavan Srinivasan wrote: This patch adds kernel side support for software breakpoints. The design is that, by using an illegal instruction, we trap to the hypervisor via the Emulation Assistance interrupt, where we check for the illegal instruction and accordingly return to Host or Guest. The patch mandates use of the abs instruction (primary opcode 31 and extended opcode 360) as the sw breakpoint instruction. Based on PowerISA v2.01, the ABS instruction has been dropped from the architecture and is treated as an illegal instruction. Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com --- arch/powerpc/kvm/book3s.c | 3 ++- arch/powerpc/kvm/book3s_hv.c | 23 +++ 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index c254c27..b40fe5d 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg) { -return -EINVAL; +vcpu->guest_debug = dbg->control; +return 0; } void kvmppc_decrementer_func(unsigned long data) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 7a12edb..688421d 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -67,6 +67,14 @@ /* Used as a null value for timebase values */ #define TB_NIL (~(u64)0) +/* + * SW_BRK_DBG_INT is debug Instruction for supporting Software Breakpoint. + * Instruction mnemonic is ABS, primary opcode is 31 and extended opcode is 360. + * Based on PowerISA v2.01, ABS instruction has been dropped from the architecture + * and treated an illegal instruction. + */ +#define SW_BRK_DBG_INT 0x7c0002d0 The instruction we use to trap needs to get exposed to user space via a ONE_REG property. Yes. I got to know about that from Bharat (patchset ppc debug: Add debug stub support). I will change it. Also, why don't we use twi always or something else that actually is defined as illegal instruction? I would like to see this shared with book3s_32 PR. + static void kvmppc_end_cede(struct kvm_vcpu *vcpu); static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu); @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, break; /* * This occurs if the guest executes an illegal instruction. - * We just generate a program interrupt to the guest, since - * we don't emulate any guest instructions at this stage. + * To support software breakpoint, we check for the sw breakpoint + * instruction to return back to host, if not we just generate a + * program interrupt to the guest. */ case BOOK3S_INTERRUPT_H_EMUL_ASSIST: -kvmppc_core_queue_program(vcpu, SRR1_PROGILL); -r = RESUME_GUEST; +if (vcpu->arch.last_inst == SW_BRK_DBG_INT) { Don't access last_inst directly. Instead use the provided helpers. Ok. Will look and replace it. +run->exit_reason = KVM_EXIT_DEBUG; +run->debug.arch.address = vcpu->arch.pc; +r = RESUME_HOST; +} else { +kvmppc_core_queue_program(vcpu, 0x8); magic numbers ^ I did not understand this? +r = RESUME_GUEST; +} break; /* * This occurs if the guest (kernel or userspace), does something that Please enable PR KVM as well while you're at it. My bad, I did not try the PR KVM. I will try it out. Alex Thanks for review Regards Maddy
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On 17.06.14 13:07, Madhavan Srinivasan wrote: On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote: On 14.06.14 23:08, Madhavan Srinivasan wrote: This patch adds kernel side support for software breakpoints. The design is that, by using an illegal instruction, we trap to the hypervisor via the Emulation Assistance interrupt, where we check for the illegal instruction and accordingly return to Host or Guest. The patch mandates use of the abs instruction (primary opcode 31 and extended opcode 360) as the sw breakpoint instruction. Based on PowerISA v2.01, the ABS instruction has been dropped from the architecture and is treated as an illegal instruction. Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com --- arch/powerpc/kvm/book3s.c | 3 ++- arch/powerpc/kvm/book3s_hv.c | 23 +++ 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index c254c27..b40fe5d 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg) { -return -EINVAL; +vcpu->guest_debug = dbg->control; +return 0; } void kvmppc_decrementer_func(unsigned long data) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 7a12edb..688421d 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -67,6 +67,14 @@ /* Used as a null value for timebase values */ #define TB_NIL (~(u64)0) +/* + * SW_BRK_DBG_INT is debug Instruction for supporting Software Breakpoint. + * Instruction mnemonic is ABS, primary opcode is 31 and extended opcode is 360. + * Based on PowerISA v2.01, ABS instruction has been dropped from the architecture + * and treated an illegal instruction. + */ +#define SW_BRK_DBG_INT 0x7c0002d0 The instruction we use to trap needs to get exposed to user space via a ONE_REG property. Yes. I got to know about that from Bharat (patchset ppc debug: Add debug stub support). I will change it. Also, why don't we use twi always or something else that actually is defined as illegal instruction? I would like to see this shared with book3s_32 PR. + static void kvmppc_end_cede(struct kvm_vcpu *vcpu); static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu); @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, break; /* * This occurs if the guest executes an illegal instruction. - * We just generate a program interrupt to the guest, since - * we don't emulate any guest instructions at this stage. + * To support software breakpoint, we check for the sw breakpoint + * instruction to return back to host, if not we just generate a + * program interrupt to the guest. */ case BOOK3S_INTERRUPT_H_EMUL_ASSIST: -kvmppc_core_queue_program(vcpu, SRR1_PROGILL); -r = RESUME_GUEST; +if (vcpu->arch.last_inst == SW_BRK_DBG_INT) { Don't access last_inst directly. Instead use the provided helpers. Ok. Will look and replace it. +run->exit_reason = KVM_EXIT_DEBUG; +run->debug.arch.address = vcpu->arch.pc; +r = RESUME_HOST; +} else { +kvmppc_core_queue_program(vcpu, 0x8); magic numbers ^ I did not understand this? You're replacing the readable SRR1_PROGILL with the unreadable 0x8. That's bad. Alex
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On Tuesday 17 June 2014 03:13 PM, Alexander Graf wrote: On 17.06.14 11:32, Benjamin Herrenschmidt wrote: On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote: On 17.06.14 11:22, Benjamin Herrenschmidt wrote: On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote: Also, why don't we use twi always or something else that actually is defined as illegal instruction? I would like to see this shared with book3s_32 PR. twi will be directed to the guest on HV, no? We want a real illegal because those go to the host (for potential emulation by the HV). Ah, good point. I guess we need a different one for PR and HV then to ensure compatibility with older ISAs on PR. Well, we also need to be careful with what happens if a PR guest puts that instruction in: does that stop its HV guest/host? What if it's done in userspace? Does that stop the kernel? :-) The way SW breakpointing is handled is that when we see one, it gets deflected into user space. User space then has an array of breakpoints it configured itself. If the breakpoint is part of that list, it consumes it. If not, it injects a debug interrupt (program in this case) into the guest. That way we can overlay that one instruction with as many layers as we like :). We only get a performance hit on execution of that instruction. Maddy, I haven't checked, does your patch ensure that we only ever stop if the instruction is at a recorded bkpt address? It still means that a userspace process can practically DoS its kernel by issuing a lot of these, causing a crapload of exits. Only user space knows about its breakpoint addresses, so we have to deflect. However, since time still ticks on, we only increase jitter of the guest. The process would still get scheduled away after the same ^^^ Where is this taken care of? I am still trying to understand. Could you kindly explain or point to the code? That will help. amount of real time, no? Alex Thanks for review.
Regards Maddy
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On Tuesday 17 June 2014 04:38 PM, Alexander Graf wrote: On 17.06.14 13:07, Madhavan Srinivasan wrote: On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote: On 14.06.14 23:08, Madhavan Srinivasan wrote: This patch adds kernel side support for software breakpoints. The design is that, by using an illegal instruction, we trap to the hypervisor via the Emulation Assistance interrupt, where we check for the illegal instruction and accordingly return to Host or Guest. The patch mandates use of the abs instruction (primary opcode 31 and extended opcode 360) as the sw breakpoint instruction. Based on PowerISA v2.01, the ABS instruction has been dropped from the architecture and is treated as an illegal instruction. Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com --- arch/powerpc/kvm/book3s.c | 3 ++- arch/powerpc/kvm/book3s_hv.c | 23 +++ 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index c254c27..b40fe5d 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg) { -return -EINVAL; +vcpu->guest_debug = dbg->control; +return 0; } void kvmppc_decrementer_func(unsigned long data) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 7a12edb..688421d 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -67,6 +67,14 @@ /* Used as a null value for timebase values */ #define TB_NIL (~(u64)0) +/* + * SW_BRK_DBG_INT is debug Instruction for supporting Software Breakpoint. + * Instruction mnemonic is ABS, primary opcode is 31 and extended opcode is 360. + * Based on PowerISA v2.01, ABS instruction has been dropped from the architecture + * and treated an illegal instruction. + */ +#define SW_BRK_DBG_INT 0x7c0002d0 The instruction we use to trap needs to get exposed to user space via a ONE_REG property. Yes. I got to know about that from Bharat (patchset ppc debug: Add debug stub support). I will change it. Also, why don't we use twi always or something else that actually is defined as illegal instruction? I would like to see this shared with book3s_32 PR. + static void kvmppc_end_cede(struct kvm_vcpu *vcpu); static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu); @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, break; /* * This occurs if the guest executes an illegal instruction. - * We just generate a program interrupt to the guest, since - * we don't emulate any guest instructions at this stage. + * To support software breakpoint, we check for the sw breakpoint + * instruction to return back to host, if not we just generate a + * program interrupt to the guest. */ case BOOK3S_INTERRUPT_H_EMUL_ASSIST: -kvmppc_core_queue_program(vcpu, SRR1_PROGILL); -r = RESUME_GUEST; +if (vcpu->arch.last_inst == SW_BRK_DBG_INT) { Don't access last_inst directly. Instead use the provided helpers. Ok. Will look and replace it. +run->exit_reason = KVM_EXIT_DEBUG; +run->debug.arch.address = vcpu->arch.pc; +r = RESUME_HOST; +} else { +kvmppc_core_queue_program(vcpu, 0x8); magic numbers ^ I did not understand this? You're replacing the readable SRR1_PROGILL with the unreadable 0x8. That's bad. Oops. My bad. Will undo that. I guess I messed up when I was rebasing it. Alex Thanks for review Regards Maddy
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On 17.06.14 13:20, Madhavan Srinivasan wrote: On Tuesday 17 June 2014 03:13 PM, Alexander Graf wrote: On 17.06.14 11:32, Benjamin Herrenschmidt wrote: On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote: On 17.06.14 11:22, Benjamin Herrenschmidt wrote: On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote: Also, why don't we use twi always or something else that actually is defined as illegal instruction? I would like to see this shared with book3s_32 PR. twi will be directed to the guest on HV, no? We want a real illegal because those go to the host (for potential emulation by the HV). Ah, good point. I guess we need a different one for PR and HV then to ensure compatibility with older ISAs on PR. Well, we also need to be careful with what happens if a PR guest puts that instruction in: does that stop its HV guest/host? What if it's done in userspace? Does that stop the kernel? :-) The way SW breakpointing is handled is that when we see one, it gets deflected into user space. User space then has an array of breakpoints it configured itself. If the breakpoint is part of that list, it consumes it. If not, it injects a debug interrupt (program in this case) into the guest. That way we can overlay that one instruction with as many layers as we like :). We only get a performance hit on execution of that instruction. Maddy, I haven't checked, does your patch ensure that we only ever stop if the instruction is at a recorded bkpt address? It still means that a userspace process can practically DoS its kernel by issuing a lot of these, causing a crapload of exits. Only user space knows about its breakpoint addresses, so we have to deflect. However, since time still ticks on, we only increase jitter of the guest. The process would still get scheduled away after the same ^^^ Where is this taken care of? I am still trying to understand. Could you kindly explain or point to the code? That will help. We tell the guest via VPA about its steal time, which includes QEMU time.
Alex
Re: [PATCH tty-next 14/22] tty: Remove tty_wait_until_sent_from_close()
On Tuesday 17 June 2014 11:03:50 David Laight wrote: From: Peter Hurley ... I don't understand the second half of the changelog, it doesn't seem to fit here: the deadlock that we are trying to avoid here happens when the *same* tty needs the lock to complete the function that sends the pending data. I don't think we still do that any more, but it doesn't seem related to the tty lock being system-wide or not. The tty lock is not used in the i/o path; its purpose is to mutually exclude state changes in open(), close() and hangup(). The commit that added this [1] comments that _other_ ttys may wait for this tty to complete, and comments in the code note that this function should be removed when the system-wide tty mutex was removed (which happened with the commit noted in the changelog). What happens if another process tries to do a non-blocking open while you are sleeping in close waiting for output to drain? Hopefully this returns before that data has drained. Before the patch, I believe tty_reopen() would return -EIO because the TTY_CLOSING flag is set. After the patch, tty_open() blocks on tty_lock() before calling tty_reopen(). AFAICT, this is independent of O_NONBLOCK. Arnd
Re: [PATCH tty-next 14/22] tty: Remove tty_wait_until_sent_from_close()
On 06/17/2014 07:03 AM, David Laight wrote: From: Peter Hurley ... I don't understand the second half of the changelog, it doesn't seem to fit here: the deadlock that we are trying to avoid here happens when the *same* tty needs the lock to complete the function that sends the pending data. I don't think we still do that any more, but it doesn't seem related to the tty lock being system-wide or not. The tty lock is not used in the i/o path; its purpose is to mutually exclude state changes in open(), close() and hangup(). The commit that added this [1] comments that _other_ ttys may wait for this tty to complete, and comments in the code note that this function should be removed when the system-wide tty mutex was removed (which happened with the commit noted in the changelog). What happens if another process tries to do a non-blocking open while you are sleeping in close waiting for output to drain? Hopefully this returns before that data has drained. Good point. tty_open() should be trylocking both mutexes anyway in the O_NONBLOCK case. Regards, Peter Hurley
RE: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition on vcpu schedule
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Tuesday, June 17, 2014 12:09 PM To: Wood Scott-B07421 Cc: Caraman Mihai Claudiu-B02008; kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition on vcpu schedule On 13.06.14 21:42, Scott Wood wrote: On Fri, 2014-06-13 at 16:55 +0200, Alexander Graf wrote: On 13.06.14 16:43, mihai.cara...@freescale.com wrote: -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Thursday, June 12, 2014 8:05 PM To: Caraman Mihai Claudiu-B02008 Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc- d...@lists.ozlabs.org; Wood Scott-B07421 Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition on vcpu schedule On 06/12/2014 04:00 PM, Mihai Caraman wrote: On vcpu schedule, the condition checked for tlb pollution is too tight. The tlb entries of one vcpu are polluted when a different vcpu from the same partition runs in-between. Relax the current tlb invalidation condition taking into account the lpid. Can you quantify the performance improvement from this? We've had bugs in this area before, so let's make sure it's worth it before making this more complicated. Signed-off-by: Mihai Caraman mihai.caraman at freescale.com Your mailer is broken? :) This really should be an @. I think this should work. Scott, please ack. Alex, you were right. I screwed up the patch description by inverting relax and tight terms :) It should have been more like this: KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu are polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition taking into account the lpid. Can't we give every vcpu its own lpid? Or don't we trap on global invalidates? 
That would significantly increase the odds of exhausting LPIDs, especially on large chips like t4240 with similarly large VMs. If we were to do that, the LPIDs would need to be dynamically assigned (like PIDs), and should probably be a separate numberspace per physical core. True, I didn't realize we only have so few of them. It would however save us from most flushing as long as we have spare LPIDs available :). Yes, we had this proposal on the table for the e6500 multithreaded core. This core lacks a tlb write conditional instruction, so an OS needs to use locks to protect itself against concurrent tlb writes executed from sibling threads. When we expose hw threads as single-threaded vcpus (useful when the user opts not to pin vcpus), the guest can no longer protect itself optimally (it can protect tlb writes across all threads but this is not acceptable). So instead, we found a solution at hypervisor level by assigning different logical partition ids to a guest's vcpus running simultaneously on sibling hw threads. Currently in FSL SDK we allocate two lpids to each guest. I am also a proponent of using the whole LPID space (63 values) per (multi-threaded) physical core, which will lead to fewer invalidates on vcpu schedule and will accommodate the solution described above. -Mike
Re: [PATCH tty-next 14/22] tty: Remove tty_wait_until_sent_from_close()
Before the patch, I believe tty_reopen() would return -EIO because the TTY_CLOSING flag is set. After the patch, tty_open() blocks on tty_lock() before calling tty_reopen(). AFAICT, this is independent of O_NONBLOCK. That would be a bug then. Returning -EIO is fine (if unfriendly). An O_NONBLOCK open can't block in this case though, because the port could take a long time to give up trying to dribble its bits (up to 30 seconds or so). Alan
[PATCH 1/1] powerpc/traps/e500: fix misleading error output
The machine_check_e500 exception handler prints the wrong message for MCSR_BUS_WBERR, so print Write instead of Read. Signed-off-by: Wladislav Wiebe wladislav...@gmail.com --- arch/powerpc/kernel/traps.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 239f1cd..cb9cfe4 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -609,7 +609,7 @@ int machine_check_e500(struct pt_regs *regs) if (reason & MCSR_BUS_RBERR) printk("Bus - Read Data Bus Error\n"); if (reason & MCSR_BUS_WBERR) - printk("Bus - Read Data Bus Error\n"); + printk("Bus - Write Data Bus Error\n"); if (reason & MCSR_BUS_IPERR) printk("Bus - Instruction Parity Error\n"); if (reason & MCSR_BUS_RPERR) -- 1.7.1
Re: [PATCH] powerpc/kvm: support to handle sw breakpoint
On 17.06.14 13:13, Madhavan Srinivasan wrote: On Tuesday 17 June 2014 04:38 PM, Alexander Graf wrote: On 17.06.14 13:07, Madhavan Srinivasan wrote: On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote: On 14.06.14 23:08, Madhavan Srinivasan wrote: This patch adds kernel side support for software breakpoints. The design is that, by using an illegal instruction, we trap to the hypervisor via the Emulation Assistance interrupt, where we check for the illegal instruction and accordingly return to Host or Guest. The patch mandates use of the abs instruction (primary opcode 31 and extended opcode 360) as the sw breakpoint instruction. Based on PowerISA v2.01, the ABS instruction has been dropped from the architecture and is treated as an illegal instruction. Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com --- arch/powerpc/kvm/book3s.c | 3 ++- arch/powerpc/kvm/book3s_hv.c | 23 +++ 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index c254c27..b40fe5d 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg) { -return -EINVAL; +vcpu->guest_debug = dbg->control; +return 0; } void kvmppc_decrementer_func(unsigned long data) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 7a12edb..688421d 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -67,6 +67,14 @@ /* Used as a null value for timebase values */ #define TB_NIL (~(u64)0) +/* + * SW_BRK_DBG_INT is debug Instruction for supporting Software Breakpoint. + * Instruction mnemonic is ABS, primary opcode is 31 and extended opcode is 360. + * Based on PowerISA v2.01, ABS instruction has been dropped from the architecture + * and treated an illegal instruction. + */ +#define SW_BRK_DBG_INT 0x7c0002d0 The instruction we use to trap needs to get exposed to user space via a ONE_REG property. Yes. I got to know about that from Bharat (patchset ppc debug: Add debug stub support). I will change it. Also please make sure to pick an instruction that preferably looks identical regardless of guest endianness. Segher suggested 0x0000. Does that trap properly for you? Alex
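The endianness point can be checked mechanically: an instruction word "looks identical regardless of guest endianness" only if byte-swapping it gives the same value. The sketch below is a userspace illustration, not kernel code. `swab32_sketch()` and `same_in_both_byte_orders()` are hypothetical helper names, and since the archive shows the suggested value only as "0x0000", the all-zero 32-bit word used in the usage note is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(CHAR_BIT == 8, "byte swap below assumes 8-bit bytes");

/* Reverse the four bytes of a 32-bit instruction word. */
static uint32_t swab32_sketch(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/*
 * True if the instruction image in memory is the same whether the
 * guest stores it big-endian or little-endian.
 */
static bool same_in_both_byte_orders(uint32_t insn)
{
    return swab32_sketch(insn) == insn;
}
```

The ABS encoding from the patch, 0x7c0002d0, byte-swaps to 0xd002007c, so a debugger would have to write a different byte sequence depending on the guest's endianness. An all-zero word is unchanged by the swap.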
[PATCH 04/24] powerpc: check/return actual error on sysfs functions
From: Jie Liu jeff@oracle.com Cc: Benjamin Herrenschmidt b...@kernel.crashing.org Cc: Paul Mackerras pau...@samba.org Signed-off-by: Jie Liu jeff@oracle.com --- arch/powerpc/platforms/powernv/opal-dump.c | 2 +- arch/powerpc/platforms/powernv/opal-elog.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/powernv/opal-dump.c b/arch/powerpc/platforms/powernv/opal-dump.c index 788a197..e718baf 100644 --- a/arch/powerpc/platforms/powernv/opal-dump.c +++ b/arch/powerpc/platforms/powernv/opal-dump.c @@ -424,7 +424,7 @@ void __init opal_platform_dump_init(void) int rc; dump_kset = kset_create_and_add("dump", NULL, opal_kobj); - if (!dump_kset) { + if (IS_ERR(dump_kset)) { pr_warn("%s: Failed to create dump kset\n", __func__); return; } diff --git a/arch/powerpc/platforms/powernv/opal-elog.c b/arch/powerpc/platforms/powernv/opal-elog.c index 10268c4..09c1f6f 100644 --- a/arch/powerpc/platforms/powernv/opal-elog.c +++ b/arch/powerpc/platforms/powernv/opal-elog.c @@ -296,9 +296,9 @@ int __init opal_elog_init(void) int rc = 0; elog_kset = kset_create_and_add("elog", NULL, opal_kobj); - if (!elog_kset) { + if (IS_ERR(elog_kset)) { pr_warn("%s: failed to create elog kset\n", __func__); - return -1; + return PTR_ERR(elog_kset); } rc = opal_notifier_register(elog_nb); -- 1.8.3.2
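Whether the IS_ERR() checks in this patch are right depends on how kset_create_and_add() reports failure, which is worth confirming in lib/kobject.c. IS_ERR() only recognises pointers in the top 4095 values of the address space and treats NULL as a valid pointer. The userspace sketch below re-implements the helpers in simplified form to show the difference. The lowercase names `err_ptr()`, `ptr_err()` and `is_err()` are stand-ins chosen for this example, and the kernel versions in include/linux/err.h differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO_SKETCH);
    return (void *)error;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *ptr)
{
    return (long)ptr;
}

/* True only for pointers that encode an errno; NULL is NOT an error here. */
static bool is_err(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO_SKETCH;
}

/* Map a callee result to 0 or a negative errno, for ERR_PTR-style callees. */
static int result_to_errno(const void *ptr)
{
    return is_err(ptr) ? (int)ptr_err(ptr) : 0;
}
```

If a callee signals failure by returning NULL, `result_to_errno(NULL)` yields 0 and the failure goes unnoticed. Such a callee needs the original `!ptr` test, or IS_ERR_OR_NULL() where both conventions are possible.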
Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition on vcpu schedule
On Thu, 2014-06-12 at 19:04 +0200, Alexander Graf wrote: On 06/12/2014 04:00 PM, Mihai Caraman wrote: @@ -140,12 +142,24 @@ static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu) mtspr(SPRN_GDEAR, vcpu-arch.shared-dar); mtspr(SPRN_GESR, vcpu-arch.shared-esr); - if (vcpu-arch.oldpir != mfspr(SPRN_PIR) || - __get_cpu_var(last_vcpu_on_cpu) != vcpu) { - kvmppc_e500_tlbil_all(vcpu_e500); + if (vcpu-arch.oldpir != mfspr(SPRN_PIR)) { + /* tlb entries deprecated */ + inval_tlb = update_last = true; + } else if (__get_cpu_var(last_vcpu_on_cpu) != vcpu) { + update_last = true; + /* tlb entries polluted */ + inval_tlb = __get_cpu_var(last_lpid_on_cpu) == + vcpu-kvm-arch.lpid; + } What about the following sequence on one CPU: LPID 1, vcpu A LPID 2, vcpu C LPID 1, vcpu B LPID 2, vcpu C doesn't invalidate LPID 1, vcpu A doesn't invalidate In the last line, vcpu A last ran on this cpu (oldpir matches), but LPID 2 last ran on this cpu (last_lpid_on_cpu does not match) -- but an invalidation has never happened since vcpu B from LPID 1 ran on this cpu. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
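Scott's sequence can be replayed in a small user-space model. This is only a sketch under stated assumptions: `struct vcpu`, `ran_here` (standing in for the `oldpir == PIR` test on a single CPU) and `replay_sequence()` are illustrative names, and the bookkeeping after the quoted hunk (storing the vcpu and its LPID when `update_last` is set) is assumed, since that part of the patch is not quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for struct kvm_vcpu; only what the model needs. */
struct vcpu {
	int lpid;
	bool ran_here;	/* models "oldpir == PIR" on a single CPU */
};

/* Per-CPU state used by the quoted v1 condition. */
static struct vcpu *last_vcpu_on_cpu;
static int last_lpid_on_cpu;

/* Returns true when the v1 condition would invalidate the TLB. */
static bool v1_vcpu_load(struct vcpu *v)
{
	bool inval_tlb = false, update_last = false;

	if (!v->ran_here) {
		/* tlb entries deprecated */
		inval_tlb = update_last = true;
	} else if (last_vcpu_on_cpu != v) {
		update_last = true;
		/* tlb entries polluted */
		inval_tlb = (last_lpid_on_cpu == v->lpid);
	}

	if (update_last) {	/* assumed: not part of the quoted hunk */
		last_vcpu_on_cpu = v;
		last_lpid_on_cpu = v->lpid;
	}
	v->ran_here = true;
	return inval_tlb;
}

/* Replays: A(lpid 1), C(lpid 2), B(lpid 1), C(lpid 2), A(lpid 1). */
static void replay_sequence(bool inval[5])
{
	struct vcpu a = { 1, false }, b = { 1, false }, c = { 2, false };

	last_vcpu_on_cpu = NULL;
	last_lpid_on_cpu = -1;

	inval[0] = v1_vcpu_load(&a);
	inval[1] = v1_vcpu_load(&c);
	inval[2] = v1_vcpu_load(&b);
	inval[3] = v1_vcpu_load(&c);	/* fine: no other LPID 2 vcpu ran */
	inval[4] = v1_vcpu_load(&a);	/* B polluted LPID 1 entries */
}
```

In this model the last step reports no invalidation even though vcpu B of the same LPID ran after vcpu A, which is the gap Scott points out.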
Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
On Tue, 2014-06-17 at 12:02 +0300, Mihai Caraman wrote: On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu become polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition taking into account the logical partition id. With the new invalidation condition, a guest shows 4% performance improvement on P5020DS while running a memory stress application with the cpu oversubscribed, the other guest running a cpu intensive workload. See https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118547.html -Scott
[RFC PATCH 0/4] Update hotplug for pseries systems
In order to support device hotplug (cpu, memory, and pci) in both the PowerVM and the PowerKVM environments the handling of hotplug events will need to be updated. This patch set addresses this by creating a common entry point for handling hotplug events in the kernel that can be used in both PowerVM and PowerKVM environments. To accomplish this several changes need to be made.

For PowerVM systems hotplug (DLPAR) events are requested by users from the HMC, which then communicates the request to the partitions via the RSCT framework. The RSCT framework then invokes the drmgr command to handle the hotplug event. The drmgr command performs some of the work in user space and makes calls into the kernel to handle the remaining work.

For PowerKVM systems hotplug events are communicated to the guest via the ras epow interrupt by qemu. We could have the rtas event sent up to user space through rtas_errd, which would then invoke drmgr. This is not an ideal solution and it would be nicer to have hotplug handled completely in the kernel.

To do this, hotplug events will now be communicated to the kernel in the form of rtas hotplug events. For PowerKVM systems this is done by qemu using the ras epow interrupt. For PowerVM systems the drmgr command will be updated to create a rtas hotplug event and send it to the kernel via a new /proc/powerpc/dlpar interface. Both of these entry points for hotplug rtas events then call a common routine for handling rtas hotplug events.

Additionally we will need to be able to handle all of the work to do resource hotplug in the kernel. This patch set addresses this for cpu and memory hotplug; pci hotplug will be done later. Once all of the work is done to move hotplug handling into the kernel we should also be able to get rid of the /proc/powerpc/ofdt interface.

Of course these updates do depend on updating the drmgr command. If you care to look, the updates for this are here: https://github.com/nfont/powerpc-utils/tree/mem_rtas_hp

Patch 1/4:
 o Create a common rtas hotplug event handling routine
 o Create the /proc/powerpc/dlpar interface for PowerVM systems
 o Implement memory hotplug handling in the kernel.
Patch 2/4:
 o Move the cpu hotplug code from pseries/dlpar.c to pseries/hotplug-cpu.c
Patch 3/4:
 o Update cpu hotplug handling to allow for invocation from rtas event notifications.
Patch 4/4:
 o Update the ras epow interrupt handler to recognize hotplug rtas events

This code (except for patch 4/4, which I cannot test right now) has been tested on PowerVM systems. Some error paths still need to be tested, and of course testing on PowerKVM is needed once cpu and memory hotplug is enabled for Power.

Thoughts?

-Nathan
---
 include/asm/rtas.h                 |  27
 kernel/rtas.c                      |   7
 platforms/pseries/dlpar.c          | 253
 platforms/pseries/hotplug-cpu.c    | 358
 platforms/pseries/hotplug-memory.c | 351
 platforms/pseries/pseries.h        |   6
 platforms/pseries/ras.c            |  12
 platforms/pseries/reconfig.c       |   6
 8 files changed, 780 insertions(+), 240 deletions(-)
[RFC PATCH 1/4] Create interface for rtas hotplug events and move mem hotplug to the kernel
In order to support hotplug of memory, cpu and pci devices in the PowerVM and the PowerKVM environments we will need to provide a single entry point. To do this requires updating the way in which we handle hotplug requests in the PowerVM environment. The idea is to have all of the hotplug in the kernel so that a hotplug rtas event can be used to initiate the hotplug add/remove of a device. The current method for handling a hotplug request in a PowerVM partition is to have the HMC notify the partition of the request through the RSCT framework which then invokes the drmgr command to hotplug add/remove the requested devices. The drmgr command does part of this in user-space and part in the kernel via sysfs and /proc interfaces. This patch creates the entry point for initiating a hotplug request for pseries with a rtas hotplug event. For PowerVM systems the drmgr command will now create and write a hotplug rtas event to /proc/powerpc/dlpar which will then pass the hotplug rtas event to the entry point. For PowerKVM systems QEMU will generate an epow interrupt to the guest, which then calls rtas-check-exception to get the hotplug rtas event and pass it to the entry point. NOTE that the updates to handle hotplug events from epow interrupts are not in this initial patch. This patch also adds functionality so that we can do memory hotplug in the kernel. Using the updates to drmgr found below you can initiate memory hotplug events using the new interface. 
https://github.com/nfont/powerpc-utils/tree/mem_rtas_hp --- arch/powerpc/include/asm/rtas.h | 26 ++ arch/powerpc/kernel/rtas.c | 7 + arch/powerpc/platforms/pseries/dlpar.c | 65 - arch/powerpc/platforms/pseries/hotplug-memory.c | 351 arch/powerpc/platforms/pseries/pseries.h| 4 + arch/powerpc/platforms/pseries/reconfig.c | 6 + 6 files changed, 403 insertions(+), 56 deletions(-) diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h index b390f55..26491ae 100644 --- a/arch/powerpc/include/asm/rtas.h +++ b/arch/powerpc/include/asm/rtas.h @@ -254,6 +254,31 @@ inline uint32_t rtas_ext_event_company_id(struct rtas_ext_event_log_v6 *ext_log) return be32_to_cpu(ext_log-company_id); } +/* RTAS pseries hotplug elog section */ +struct pseries_hp_elog { + uint8_t resource; + uint8_t action:8; +uint8_tid_type:8; +uint8_treserved; +union { + __be32 drc_index; + __be32 drc_count; + chardrc_name[1]; +}_drc_u; +}; + +#define HP_ELOG_RESOURCE_CPU 1 +#define HP_ELOG_RESOURCE_MEM 2 +#define HP_ELOG_RESOURCE_SLOT 3 +#define HP_ELOG_RESOURCE_PHB 4 + +#define HP_ELOG_ACTION_ADD 1 +#define HP_ELOG_ACTION_REMOVE 2 + +#define HP_ELOG_ID_DRC_NAME1 +#define HP_ELOG_ID_DRC_INDEX 2 +#define HP_ELOG_ID_DRC_COUNT 3 + /* pSeries event log format */ /* Two bytes ASCII section IDs */ @@ -273,6 +298,7 @@ inline uint32_t rtas_ext_event_company_id(struct rtas_ext_event_log_v6 *ext_log) #define PSERIES_ELOG_SECT_ID_MANUFACT_INFO (('M' 8) | 'I') #define PSERIES_ELOG_SECT_ID_CALL_HOME (('C' 8) | 'H') #define PSERIES_ELOG_SECT_ID_USER_DEF (('U' 8) | 'D') +#define PSERIES_ELOG_SECT_ID_HP(('H' 8) | 'P') /* Vendor specific Platform Event Log Format, Version 6, section header */ struct pseries_errorlog { diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index 8cd5ed0..b738b1b 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -997,6 +997,13 @@ struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log, uint8_t log_format = 
rtas_ext_event_log_format(ext_log); uint32_t company_id = rtas_ext_event_company_id(ext_log); + printk(KERN_EMERG Validation: %x : %lx\n%x : %x\n%x : %x\n, + log-extended_log_length, sizeof(struct rtas_ext_event_log_v6), + rtas_ext_event_log_format(ext_log), + RTAS_V6EXT_LOG_FORMAT_EVENT_LOG, + rtas_ext_event_company_id(ext_log), + RTAS_V6EXT_COMPANY_ID_IBM); + /* Check that we understand the format */ if (ext_log_length sizeof(struct rtas_ext_event_log_v6) || log_format != RTAS_V6EXT_LOG_FORMAT_EVENT_LOG || diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c index 022b38e..dfca23b 100644 --- a/arch/powerpc/platforms/pseries/dlpar.c +++ b/arch/powerpc/platforms/pseries/dlpar.c @@ -16,9 +16,13 @@ #include linux/cpu.h #include linux/slab.h #include linux/of.h +#include linux/proc_fs.h +#include linux/memory.h +#include linux/memblock.h +#include linux/mutex.h #include offline_states.h +#include pseries.h -#include asm/prom.h #include asm/machdep.h #include asm/uaccess.h #include asm/rtas.h @@
[RFC PATCH 2/4] Migrate cpu hotplug code to pseries/hotplug-cpu.c
This patch moves the cpu hotplug handling code from pseries/dlpar.c to pseries/hotplug-cpu.c. Additionally it factors out the work to add/remove a single cpu into its own routine. --- arch/powerpc/platforms/pseries/dlpar.c | 182 - arch/powerpc/platforms/pseries/hotplug-cpu.c | 194 +++ 2 files changed, 194 insertions(+), 182 deletions(-) diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c index dfca23b..16c85b9 100644 --- a/arch/powerpc/platforms/pseries/dlpar.c +++ b/arch/powerpc/platforms/pseries/dlpar.c @@ -359,182 +359,6 @@ int dlpar_release_drc(u32 drc_index) return 0; } -#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE - -static int dlpar_online_cpu(struct device_node *dn) -{ - int rc = 0; - unsigned int cpu; - int len, nthreads, i; - const u32 *intserv; - - intserv = of_get_property(dn, ibm,ppc-interrupt-server#s, len); - if (!intserv) - return -EINVAL; - - nthreads = len / sizeof(u32); - - cpu_maps_update_begin(); - for (i = 0; i nthreads; i++) { - for_each_present_cpu(cpu) { - if (get_hard_smp_processor_id(cpu) != intserv[i]) - continue; - BUG_ON(get_cpu_current_state(cpu) - != CPU_STATE_OFFLINE); - cpu_maps_update_done(); - rc = cpu_up(cpu); - if (rc) - goto out; - cpu_maps_update_begin(); - - break; - } - if (cpu == num_possible_cpus()) - printk(KERN_WARNING Could not find cpu to online - with physical id 0x%x\n, intserv[i]); - } - cpu_maps_update_done(); - -out: - return rc; - -} - -static ssize_t dlpar_cpu_probe(const char *buf, size_t count) -{ - struct device_node *dn, *parent; - unsigned long drc_index; - int rc; - - rc = strict_strtoul(buf, 0, drc_index); - if (rc) - return -EINVAL; - - parent = of_find_node_by_path(/cpus); - if (!parent) - return -ENODEV; - - dn = dlpar_configure_connector(drc_index, parent); - if (!dn) - return -EINVAL; - - of_node_put(parent); - - rc = dlpar_acquire_drc(drc_index); - if (rc) { - dlpar_free_cc_nodes(dn); - return -EINVAL; - } - - rc = dlpar_attach_node(dn); - if (rc) { - 
dlpar_release_drc(drc_index); - dlpar_free_cc_nodes(dn); - return rc; - } - - rc = dlpar_online_cpu(dn); - if (rc) - return rc; - - return count; -} - -static int dlpar_offline_cpu(struct device_node *dn) -{ - int rc = 0; - unsigned int cpu; - int len, nthreads, i; - const u32 *intserv; - - intserv = of_get_property(dn, ibm,ppc-interrupt-server#s, len); - if (!intserv) - return -EINVAL; - - nthreads = len / sizeof(u32); - - cpu_maps_update_begin(); - for (i = 0; i nthreads; i++) { - for_each_present_cpu(cpu) { - if (get_hard_smp_processor_id(cpu) != intserv[i]) - continue; - - if (get_cpu_current_state(cpu) == CPU_STATE_OFFLINE) - break; - - if (get_cpu_current_state(cpu) == CPU_STATE_ONLINE) { - set_preferred_offline_state(cpu, CPU_STATE_OFFLINE); - cpu_maps_update_done(); - rc = cpu_down(cpu); - if (rc) - goto out; - cpu_maps_update_begin(); - break; - - } - - /* -* The cpu is in CPU_STATE_INACTIVE. -* Upgrade it's state to CPU_STATE_OFFLINE. -*/ - set_preferred_offline_state(cpu, CPU_STATE_OFFLINE); - BUG_ON(plpar_hcall_norets(H_PROD, intserv[i]) - != H_SUCCESS); - __cpu_die(cpu); - break; - } - if (cpu == num_possible_cpus()) - printk(KERN_WARNING Could not find cpu to offline - with physical id 0x%x\n, intserv[i]); - } - cpu_maps_update_done(); - -out: - return rc; - -} - -static ssize_t dlpar_cpu_release(const char *buf, size_t count) -{ - struct device_node *dn; - const u32 *drc_index; - int rc; - - dn =
[RFC PATCH 3/4] Handle cpu hotplug from rtas hotplug events
This patch updates the cpu hotplug handling code so that we can perform cpu hotplug using the new rtas hotplug event interface while still maintaining the ability to use the probe/release sysfs interface for adding and removing cpus. At a later point we could deprecate the use of the probe/release sysfs files and remove those code bits. --- arch/powerpc/platforms/pseries/dlpar.c | 4 + arch/powerpc/platforms/pseries/hotplug-cpu.c | 164 +++ arch/powerpc/platforms/pseries/pseries.h | 1 + 3 files changed, 169 insertions(+) diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c index 16c85b9..53f4fe6 100644 --- a/arch/powerpc/platforms/pseries/dlpar.c +++ b/arch/powerpc/platforms/pseries/dlpar.c @@ -275,6 +275,7 @@ int dlpar_attach_node(struct device_node *dn) if (!dn-parent) return -ENOMEM; + of_node_init(dn); rc = of_attach_node(dn); if (rc) { printk(KERN_ERR Failed to add device node %s\n, @@ -374,6 +375,9 @@ static int handle_dlpar_errorlog(struct rtas_error_log *error_log) case HP_ELOG_RESOURCE_MEM: rc = dlpar_memory(hp_elog); break; + case HP_ELOG_RESOURCE_CPU: + rc = dlpar_cpus(hp_elog); + break; } return rc; diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 6b42fd5..8be88d6 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -24,6 +24,7 @@ #include linux/sched.h /* for idle_task_exit */ #include linux/cpu.h #include linux/of.h +#include linux/slab.h #include asm/prom.h #include asm/rtas.h #include asm/firmware.h @@ -387,6 +388,169 @@ static int dlpar_remove_one_cpu(struct device_node *dn, u32 drc_index) return 0; } +struct cpu_drc_info { + u32 drc_index; + int present; +}; + +static struct cpu_drc_info *get_cpu_drc_info(int *drc_count) +{ + struct device_node *dn, *child = NULL; + struct cpu_drc_info *drcs; + const u32 *indexes; + int i, count; + + dn = of_find_node_by_path(/cpus); + if (!dn) + return 
NULL; + + indexes = of_get_property(dn, ibm,drc-indexes, NULL); + if (!indexes) { + of_node_put(dn); + return NULL; + } + + count = *indexes++; + drcs = kzalloc(count * sizeof(*drcs), GFP_KERNEL); + if (!drcs) { + of_node_put(dn); + return NULL; + } + + for (i = 0; i count; i++) + drcs[i].drc_index = indexes[i]; + + for_each_child_of_node(dn, child) { + const u32 *drc_index; + + drc_index = of_get_property(child, ibm,my-drc-index, NULL); + if (!drc_index) + continue; + + for (i = 0; i count; i++) { + if (drcs[i].drc_index == *drc_index) + drcs[i].present = 1; + break; + } + } + + of_node_put(dn); + *drc_count = count; + return drcs; +} + +static struct device_node *cpu_drc_index_to_device(u32 drc_index) +{ + struct device_node *parent, *child; + const u32 *my_drc_index; + + parent = of_find_node_by_path(/cpus); + if (!parent) + return NULL; + + for_each_child_of_node(parent, child) { + my_drc_index = of_get_property(child, ibm,my-drc-index, NULL); + if (!my_drc_index) + continue; + + if (*my_drc_index == drc_index) + break; + } + + of_node_put(parent); + return child; +} + +static int dlpar_remove_cpus(struct pseries_hp_elog *hp_elog, + struct cpu_drc_info *cpu_drcs, int num_drcs) +{ + struct device_node *dn; + int cpus_to_remove, cpus_removed = 0; + int rc, i; + + if (hp_elog-id_type == HP_ELOG_ID_DRC_COUNT) + cpus_to_remove = hp_elog-_drc_u.drc_count; + else + cpus_to_remove = 1; + + for (i = 0; i num_drcs; i++) { + if (cpus_to_remove == cpus_removed) + break; + + if (!cpu_drcs[i].present) + continue; + + if (hp_elog-id_type == HP_ELOG_ID_DRC_INDEX +hp_elog-_drc_u.drc_index != cpu_drcs[i].drc_index) + continue; + + dn = cpu_drc_index_to_device(cpu_drcs[i].drc_index); + if (!dn) + continue; + + rc = dlpar_remove_one_cpu(dn, cpu_drcs[i].drc_index); + of_node_put(dn); + + if (!rc) + cpus_removed++; + } + + return (cpus_to_remove ==
[RFC PATCH 4/4] Hook into ras epow interrupt handler for hotplug
This patch hooks into the ras EPOW interrupt handler so that we can communicate hotplug rtas events to a PowerKVM guest from qemu. The ras epow interrupt handler will now check for hotplug rtas events and invoke the common handling routine accordingly. --- arch/powerpc/include/asm/rtas.h | 1 + arch/powerpc/platforms/pseries/dlpar.c | 2 +- arch/powerpc/platforms/pseries/pseries.h | 1 + arch/powerpc/platforms/pseries/ras.c | 12 +++- 4 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h index 26491ae..7313e6f 100644 --- a/arch/powerpc/include/asm/rtas.h +++ b/arch/powerpc/include/asm/rtas.h @@ -124,6 +124,7 @@ struct rtas_suspend_me_data { #define RTAS_TYPE_INFO 0xE2 #define RTAS_TYPE_DEALLOC 0xE3 #define RTAS_TYPE_DUMP 0xE4 +#define RTAS_TYPE_HOTPLUG 0xE5 /* I don't add PowerMGM events right now, this is a different topic */ #define RTAS_TYPE_PMGM_POWER_SW_ON 0x60 #define RTAS_TYPE_PMGM_POWER_SW_OFF 0x61 diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c index 53f4fe6..6d3de6e 100644 --- a/arch/powerpc/platforms/pseries/dlpar.c +++ b/arch/powerpc/platforms/pseries/dlpar.c @@ -360,7 +360,7 @@ int dlpar_release_drc(u32 drc_index) return 0; } -static int handle_dlpar_errorlog(struct rtas_error_log *error_log) +int handle_dlpar_errorlog(struct rtas_error_log *error_log) { struct pseries_errorlog *pseries_log; struct pseries_hp_elog *hp_elog; diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h index 1706215..9b9bd82 100644 --- a/arch/powerpc/platforms/pseries/pseries.h +++ b/arch/powerpc/platforms/pseries/pseries.h @@ -64,6 +64,7 @@ extern int dlpar_acquire_drc(u32); extern int dlpar_release_drc(u32); extern int dlpar_memory(struct pseries_hp_elog *); extern int dlpar_cpus(struct pseries_hp_elog *); +extern int handle_dlpar_errorlog(struct rtas_error_log *); /* PCI root bridge prepare function override for pseries */ 
struct pci_host_bridge; diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 9c5778e..3cc6a49 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -183,6 +183,7 @@ void rtas_parse_epow_errlog(struct rtas_error_log *log) /* Handle environmental and power warning (EPOW) interrupts. */ static irqreturn_t ras_epow_interrupt(int irq, void *dev_id) { + struct rtas_error_log *elog; int status; int state; int critical; @@ -205,7 +206,16 @@ static irqreturn_t ras_epow_interrupt(int irq, void *dev_id) log_error(ras_log_buf, ERR_TYPE_RTAS_LOG, 0); - rtas_parse_epow_errlog((struct rtas_error_log *)ras_log_buf); + elog = (struct rtas_error_log *)ras_log_buf; + + switch(rtas_error_type(elog)) { + case RTAS_TYPE_EPOW: + rtas_parse_epow_errlog(elog); + break; + case RTAS_TYPE_HOTPLUG: + handle_dlpar_errorlog(elog); + break; + } spin_unlock(ras_log_buf_lock); return IRQ_HANDLED; -- 2.0.0.rc3.2.g998f840 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
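The overall shape of the series, two sources of hotplug events feeding one common handler, can be sketched in user space as follows. This is a simplified illustration, not the patch code: `struct hp_event` is a reduced stand-in for `struct pseries_hp_elog` without endian handling, the `handle_*` and `hotplug_from_*` functions are hypothetical names, and rejecting unknown resources with -1 is an assumption. Only the `HP_ELOG_*` constants are copied from patch 1/4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants as defined in the rtas.h hunk of patch 1/4. */
#define HP_ELOG_RESOURCE_CPU	1
#define HP_ELOG_RESOURCE_MEM	2
#define HP_ELOG_ACTION_ADD	1
#define HP_ELOG_ACTION_REMOVE	2

/* Reduced stand-in for struct pseries_hp_elog (illustrative only). */
struct hp_event {
	uint8_t resource;
	uint8_t action;
	uint32_t drc_index;
};

/* Counters so the sketch can show which handler ran. */
static int mem_requests, cpu_requests;

static int handle_memory(const struct hp_event *ev)
{
	(void)ev;
	mem_requests++;
	return 0;
}

static int handle_cpu(const struct hp_event *ev)
{
	(void)ev;
	cpu_requests++;
	return 0;
}

/* Common entry point shared by both event sources. */
static int handle_hotplug_event(const struct hp_event *ev)
{
	if (ev->action != HP_ELOG_ACTION_ADD &&
	    ev->action != HP_ELOG_ACTION_REMOVE)
		return -1;

	switch (ev->resource) {
	case HP_ELOG_RESOURCE_MEM:
		return handle_memory(ev);
	case HP_ELOG_RESOURCE_CPU:
		return handle_cpu(ev);
	default:
		return -1;	/* assumption: reject what is not handled yet */
	}
}

/* PowerVM path: drmgr writes an event to /proc/powerpc/dlpar. */
static int hotplug_from_proc_write(const struct hp_event *ev)
{
	return handle_hotplug_event(ev);
}

/* PowerKVM path: the event is picked up in the epow interrupt handler. */
static int hotplug_from_epow(const struct hp_event *ev)
{
	return handle_hotplug_event(ev);
}
```

Keeping the two callers as thin wrappers is what lets pci hotplug be added later in one place.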
[PATCH] powerpc: module: fix TOC symbol CRC
Commit 71ec7c55ed91 introduced the magic symbol .TOC. for the ELFv2 ABI. This symbol is built manually and has no CRC value computed. A zero value is put in the CRC section to avoid modpost complaining about a missing CRC. Unfortunately, this breaks kernel module loading when the kernel is relocated (the kdump case, for instance) because of the relocation applied to the kcrctab values.

This patch computes a CRC value for the TOC symbol that matches the one computed by the kernel when it is relocated, i.e. '0 - reloc_start' as done in maybe_relocated(), called by check_version() (module.c).

Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com
Cc: Anton Blanchard an...@samba.org
---
 arch/powerpc/kernel/module_64.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 077d2ce..a80c933 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -315,8 +315,15 @@ static void dedotify_versions(struct modversion_info *vers,
 	struct modversion_info *end;
 
 	for (end = (void *)vers + size; vers < end; vers++)
-		if (vers->name[0] == '.')
+		if (vers->name[0] == '.') {
 			memmove(vers->name, vers->name+1, strlen(vers->name));
+			/* The TOC symbol has no CRC computed. To avoid CRC
+			 * check failing, we must force it to the expected
+			 * value (see CRC check in module.c).
+			 */
+			if (!strcmp(vers->name, "TOC."))
+				vers->crc = -(unsigned long)reloc_start;
+		}
 }
 
 /* Undefined symbols which refer to .funcname, hack to funcname (or .TOC.) */
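The arithmetic behind the fix can be checked in isolation. The sketch below models only what the description states: the kernel side holds 0 for the TOC symbol and the comparison subtracts `reloc_start`. The two-argument `maybe_relocated()` here is a simplified stand-in (the real helper in module.c has a different signature), and `toc_crc_matches()` is a hypothetical name.

```c
#include <assert.h>

/* Simplified model of the relocation adjustment named in the description. */
static unsigned long maybe_relocated(unsigned long crc,
				     unsigned long reloc_start)
{
	return crc - reloc_start;
}

/* Would a module carrying module_crc for TOC. pass the version check? */
static int toc_crc_matches(unsigned long module_crc,
			   unsigned long reloc_start)
{
	const unsigned long kernel_toc_crc = 0;	/* zero put in the CRC section */

	return module_crc == maybe_relocated(kernel_toc_crc, reloc_start);
}
```

With `reloc_start == 0` a module CRC of 0 matches; with a non-zero `reloc_start` only `-(unsigned long)reloc_start` does, which is the value the patch stores.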
Re: [PATCH 3/4] KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue
On 12.06.14 10:16, Anton Blanchard wrote:
To establish addressability quickly, ABIv2 requires the target address of the function being called to be in r12.

Signed-off-by: Anton Blanchard an...@samba.org

Thanks, applied to kvm-ppc-queue.

Alex

---
Index: b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
===================================================================
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1920,8 +1920,8 @@ hcall_try_real_mode:
 	lwax	r3,r3,r4
 	cmpwi	r3,0
 	beq	guest_exit_cont
-	add	r3,r3,r4
-	mtctr	r3
+	add	r12,r3,r4
+	mtctr	r12
 	mr	r3,r9		/* get vcpu pointer */
 	ld	r4,VCPU_GPR(R4)(r9)
 	bctrl
Re: [PATCH 4/4] KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()
On 12.06.14 10:16, Anton Blanchard wrote: Both kvmppc_hv_entry_trampoline and kvmppc_entry_trampoline are assembly functions that are exported to modules and also require a valid r2. As such we need to use _GLOBAL_TOC so we provide a global entry point that establishes the TOC (r2). Signed-off-by: Anton Blanchard an...@samba.org Thanks, applied to kvm-ppc-queue. I've not applied patches 1 and 2 for now, as they break BE module support. Alex
RE: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
-Original Message- From: Wood Scott-B07421 Sent: Tuesday, June 17, 2014 6:36 PM To: Caraman Mihai Claudiu-B02008 Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc- d...@lists.ozlabs.org Subject: Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule On Tue, 2014-06-17 at 12:02 +0300, Mihai Caraman wrote: On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu become polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition taking into account the logical partition id. With the new invalidation condition, a guest shows 4% performance improvement on P5020DS while running a memory stress application with the cpu oversubscribed, the other guest running a cpu intensive workload. See https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118547.html Thanks. The original code needs just a simple adjustment to benefit from this optimization, please review v3. - Mike
Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
On Tue, 2014-06-17 at 14:04 -0500, Caraman Mihai Claudiu-B02008 wrote: -Original Message- From: Wood Scott-B07421 Sent: Tuesday, June 17, 2014 6:36 PM To: Caraman Mihai Claudiu-B02008 Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc- d...@lists.ozlabs.org Subject: Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule On Tue, 2014-06-17 at 12:02 +0300, Mihai Caraman wrote: On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu become polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition taking into account the logical partition id. With the new invalidation condition, a guest shows 4% performance improvement on P5020DS while running a memory stress application with the cpu oversubscribed, the other guest running a cpu intensive workload. See https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118547.html Thanks. The original code needs just a simple adjustment to benefit from this optimization, please review v3. Where is v3? Or is it forthcoming? -Scott
[PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu become polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition keeping last_vcpu_on_cpu per logical partition id. With the new invalidation condition, a guest shows 4% performance improvement on P5020DS while running a memory stress application with the cpu oversubscribed, the other guest running a cpu intensive workload. Guest - old invalidation condition real 3.89 user 3.87 sys 0.01 Guest - enhanced invalidation condition real 3.75 user 3.73 sys 0.01 Host real 3.70 user 1.85 sys 0.00 The memory stress application accesses 4KB pages backed by 75% of available TLB0 entries: char foo[ENTRIES][4096] __attribute__ ((aligned (4096))); int main() { char bar; int i, j; for (i = 0; i ITERATIONS; i++) for (j = 0; j ENTRIES; j++) bar = foo[j][0]; return 0; } Signed-off-by: Mihai Caraman mihai.cara...@freescale.com Cc: Scott Wood scottw...@freescale.com --- v3: - use existing logic while keeping last_vcpu_per_cpu per lpid v2: - improve patch name and description - add performance results arch/powerpc/kvm/e500mc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c index 17e4562..95e33e3 100644 --- a/arch/powerpc/kvm/e500mc.c +++ b/arch/powerpc/kvm/e500mc.c @@ -110,7 +110,7 @@ void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr) { } -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu); +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu); static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu) { @@ -141,9 +141,9 @@ static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu) mtspr(SPRN_GESR, vcpu-arch.shared-esr); if (vcpu-arch.oldpir != mfspr(SPRN_PIR) || - __get_cpu_var(last_vcpu_on_cpu) != vcpu) { + __get_cpu_var(last_vcpu_on_cpu)[vcpu-kvm-arch.lpid] != 
vcpu) { kvmppc_e500_tlbil_all(vcpu_e500); - __get_cpu_var(last_vcpu_on_cpu) = vcpu; + __get_cpu_var(last_vcpu_on_cpu)[vcpu-kvm-arch.lpid] = vcpu; } kvmppc_load_guest_fp(vcpu); -- 1.7.11.7 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
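The v3 condition can be modelled in user space to see how tracking the last vcpu per LPID behaves. This is a sketch, not the kernel code: `struct vcpu`, `ran_here` (standing in for the `oldpir == PIR` test on a single CPU), `NR_LPIDS` (a placeholder for `KVMPPC_NR_LPIDS`) and `replay_v3()` are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NR_LPIDS 8	/* placeholder value for KVMPPC_NR_LPIDS */

/* Illustrative stand-in for struct kvm_vcpu. */
struct vcpu {
	int lpid;
	bool ran_here;	/* models "oldpir == PIR" on a single CPU */
};

/* One slot per logical partition, as in the v3 per-CPU array. */
static struct vcpu *last_vcpu_on_cpu[NR_LPIDS];

/* Returns true when the v3 condition would invalidate the TLB. */
static bool v3_vcpu_load(struct vcpu *v)
{
	bool inval = !v->ran_here || last_vcpu_on_cpu[v->lpid] != v;

	if (inval)
		last_vcpu_on_cpu[v->lpid] = v;
	v->ran_here = true;
	return inval;
}

/* Replays: A(lpid 1), C(lpid 2), B(lpid 1), C(lpid 2), A(lpid 1). */
static void replay_v3(bool inval[5])
{
	struct vcpu a = { 1, false }, b = { 1, false }, c = { 2, false };

	memset(last_vcpu_on_cpu, 0, sizeof(last_vcpu_on_cpu));

	inval[0] = v3_vcpu_load(&a);
	inval[1] = v3_vcpu_load(&c);
	inval[2] = v3_vcpu_load(&b);
	inval[3] = v3_vcpu_load(&c);	/* skipped: C was last in LPID 2 */
	inval[4] = v3_vcpu_load(&a);	/* done: B ran in LPID 1 since A */
}
```

In this model a vcpu from another partition running in between no longer forces an invalidation, while a sibling vcpu from the same partition still does.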
Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
On Tue, 2014-06-17 at 22:09 +0300, Mihai Caraman wrote: On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu become polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition keeping last_vcpu_on_cpu per logical partition id. With the new invalidation condition, a guest shows 4% performance improvement on P5020DS while running a memory stress application with the cpu oversubscribed, the other guest running a cpu intensive workload. Guest - old invalidation condition real 3.89 user 3.87 sys 0.01 Guest - enhanced invalidation condition real 3.75 user 3.73 sys 0.01 Host real 3.70 user 1.85 sys 0.00 The memory stress application accesses 4KB pages backed by 75% of available TLB0 entries: char foo[ENTRIES][4096] __attribute__ ((aligned (4096))); int main() { char bar; int i, j; for (i = 0; i ITERATIONS; i++) for (j = 0; j ENTRIES; j++) bar = foo[j][0]; return 0; } Signed-off-by: Mihai Caraman mihai.cara...@freescale.com Cc: Scott Wood scottw...@freescale.com --- v3: - use existing logic while keeping last_vcpu_per_cpu per lpid v2: - improve patch name and description - add performance results arch/powerpc/kvm/e500mc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c index 17e4562..95e33e3 100644 --- a/arch/powerpc/kvm/e500mc.c +++ b/arch/powerpc/kvm/e500mc.c @@ -110,7 +110,7 @@ void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr) { } -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu); +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu); Hmm, I didn't know you could express types like that. Is this special syntax that only works for typeof? No space after * Name should be adjusted to match, something like last_vcpu_of_lpid (with the _on_cpu being implied by the fact that it's PER_CPU). 
-Scott
Re: [PATCH 04/24] powerpc: check/return actual error on sysfs functions
On Tue, Jun 17, 2014 at 10:31:09PM +0800, Jeff Liu wrote: From: Jie Liu jeff@oracle.com Cc: Benjamin Herrenschmidt b...@kernel.crashing.org Cc: Paul Mackerras pau...@samba.org Signed-off-by: Jie Liu jeff@oracle.com --- arch/powerpc/platforms/powernv/opal-dump.c | 2 +- arch/powerpc/platforms/powernv/opal-elog.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) Ben and Paul, please do not take this patch, it is incorrect. greg k-h
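Greg does not spell out the reason here. One likely reading, assuming kset_create_and_add() reports failure by returning NULL rather than an ERR_PTR value, is that an IS_ERR() test can never catch that failure. The sketch below models this in user space; `is_err()` is a simplified re-implementation of the kernel macro, and `create_or_null()` is a hypothetical stand-in for the allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space model of the kernel's IS_ERR(). */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical creator that signals failure with NULL. */
static void *create_or_null(bool fail)
{
	static int object;

	return fail ? NULL : &object;
}

/* The check before the patch: "if (!kset)". */
static bool failure_seen_by_null_check(bool fail)
{
	return create_or_null(fail) == NULL;
}

/* The check after the patch: "if (IS_ERR(kset))". */
static bool failure_seen_by_is_err(bool fail)
{
	return is_err(create_or_null(fail));
}
```

Under that assumption the patched check lets a NULL pointer through to the code that follows, whereas the original NULL test catches it.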
RE: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
-static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu); +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu); Hmm, I didn't know you could express types like that. Is this special syntax that only works for typeof? Yes, AFAIK. No space after * Checkpatch complains about the missing space ;) Name should be adjusted to match, something like last_vcpu_of_lpid (with the _on_cpu being implied by the fact that it's PER_CPU). I had considered the longer name but did not find it appealing; I will change it to last_vcpu_of_lpid. -Mike
Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > > -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
> > > +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu);
> >
> > Hmm, I didn't know you could express types like that. Is this special syntax that only works for typeof?
>
> Yes, AFAIK.
>
> > No space after *
>
> Checkpatch complains about the missing space ;)

Checkpatch is wrong, which isn't surprising given that this is unusual syntax. We don't normally put a space after * when used to represent a pointer.

-Scott
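For readers following the syntax question, here is a small sketch of how a type name such as `struct kvm_vcpu *[KVMPPC_NR_LPIDS]` can be passed to a macro. `DEFINE_VAR`, `struct item` and `NR_SLOTS` are hypothetical stand-ins, and the sketch assumes the macro declares its variable through GNU C `__typeof__`. As far as standard C goes, the same type name (a type with an abstract declarator) is also accepted by `sizeof`, so the form is not tied to typeof alone; it simply cannot be written directly in front of a variable name.

```c
#include <stddef.h>

#define NR_SLOTS 4   /* hypothetical array size */

struct item {
	int id;
};

/* Hypothetical stand-in for a DEFINE_PER_CPU(type, name) style macro. */
#define DEFINE_VAR(type, name) __typeof__(type) name

/* "struct item *[NR_SLOTS]" names the type: array of NR_SLOTS pointers. */
static DEFINE_VAR(struct item *[NR_SLOTS], slots);

/* The same type name used with sizeof, without typeof. */
static const size_t slots_size = sizeof(struct item *[NR_SLOTS]);
```

Written without the macro, the equivalent declaration would be `static struct item *slots[NR_SLOTS];`, which is why the bracketed size looks out of place after the `*`.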
RE: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, June 17, 2014 10:48 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-d...@lists.ozlabs.org
> Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
>
> On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > > > -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
> > > > +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu);
> > >
> > > Hmm, I didn't know you could express types like that. Is this special syntax that only works for typeof?
> >
> > Yes, AFAIK.
> >
> > > No space after *
> >
> > Checkpatch complains about the missing space ;)
>
> Checkpatch is wrong, which isn't surprising given that this is unusual syntax. We don't normally put a space after * when used to represent a pointer.

This is not something new. See [PATCH 04/10] percpu: cleanup percpu array definitions:

https://lkml.org/lkml/2009/6/24/26

-Mike
Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
On Tue, 2014-06-17 at 15:02 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Tuesday, June 17, 2014 10:48 PM
> > To: Caraman Mihai Claudiu-B02008
> > Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-d...@lists.ozlabs.org
> > Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> >
> > On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > > > > -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
> > > > > +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu);
> > > >
> > > > Hmm, I didn't know you could express types like that. Is this special syntax that only works for typeof?
> > >
> > > Yes, AFAIK.
> > >
> > > > No space after *
> > >
> > > Checkpatch complains about the missing space ;)
> >
> > Checkpatch is wrong, which isn't surprising given that this is unusual syntax. We don't normally put a space after * when used to represent a pointer.
>
> This is not something new. See [PATCH 04/10] percpu: cleanup percpu array definitions:
>
> https://lkml.org/lkml/2009/6/24/26

I didn't say it was new, just unusual, and checkpatch doesn't recognize it. Checkpatch shouldn't be blindly and silently obeyed when it says something strange.

-Scott
RE: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, June 17, 2014 11:05 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-d...@lists.ozlabs.org
> Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
>
> On Tue, 2014-06-17 at 15:02 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Tuesday, June 17, 2014 10:48 PM
> > > To: Caraman Mihai Claudiu-B02008
> > > Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-d...@lists.ozlabs.org
> > > Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> > >
> > > On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > > > > > -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
> > > > > > +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu);
> > > > >
> > > > > Hmm, I didn't know you could express types like that. Is this special syntax that only works for typeof?
> > > >
> > > > Yes, AFAIK.
> > > >
> > > > > No space after *
> > > >
> > > > Checkpatch complains about the missing space ;)
> > >
> > > Checkpatch is wrong, which isn't surprising given that this is unusual syntax. We don't normally put a space after * when used to represent a pointer.
> >
> > This is not something new. See [PATCH 04/10] percpu: cleanup percpu array definitions:
> >
> > https://lkml.org/lkml/2009/6/24/26
>
> I didn't say it was new, just unusual, and checkpatch doesn't recognize it. Checkpatch shouldn't be blindly and silently obeyed when it says something strange.

I agree with you about the syntax, and I know other cases where checkpatch is a moron. For similar corner cases the checkpatch maintainers did not want (or found it difficult) to make an exception. I would also like to see Alex's opinion on this.

-Mike
Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
On 17.06.14 22:36, mihai.cara...@freescale.com wrote:
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Tuesday, June 17, 2014 11:05 PM
> > To: Caraman Mihai Claudiu-B02008
> > Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-d...@lists.ozlabs.org
> > Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> >
> > On Tue, 2014-06-17 at 15:02 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > > > -----Original Message-----
> > > > From: Wood Scott-B07421
> > > > Sent: Tuesday, June 17, 2014 10:48 PM
> > > > To: Caraman Mihai Claudiu-B02008
> > > > Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-d...@lists.ozlabs.org
> > > > Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
> > > >
> > > > On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008 wrote:
> > > > > > > -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
> > > > > > > +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu);
> > > > > >
> > > > > > Hmm, I didn't know you could express types like that. Is this special syntax that only works for typeof?
> > > > >
> > > > > Yes, AFAIK.
> > > > >
> > > > > > No space after *
> > > > >
> > > > > Checkpatch complains about the missing space ;)
> > > >
> > > > Checkpatch is wrong, which isn't surprising given that this is unusual syntax. We don't normally put a space after * when used to represent a pointer.
> > >
> > > This is not something new. See [PATCH 04/10] percpu: cleanup percpu array definitions:
> > >
> > > https://lkml.org/lkml/2009/6/24/26
> >
> > I didn't say it was new, just unusual, and checkpatch doesn't recognize it. Checkpatch shouldn't be blindly and silently obeyed when it says something strange.
>
> I agree with you about the syntax and I know other cases where checkpatch is a moron. For similar corner cases checkpatch maintainers did not wanted (or found it difficult) to make an exception. I would also like to see Alex opinion on this.

I usually like to apply common sense :).

Alex
Re: Build regressions/improvements in v3.16-rc1
Hi Sam,

On Tue, Jun 17, 2014 at 11:29 PM, Sam Ravnborg <s...@ravnborg.org> wrote:
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_add' [-Werror=implicit-function-declaration]:  => 176:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_add_negative' [-Werror=implicit-function-declaration]:  => 211:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_add_return' [-Werror=implicit-function-declaration]:  => 218:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_dec' [-Werror=implicit-function-declaration]:  => 169:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_dec_and_test' [-Werror=implicit-function-declaration]:  => 197:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_dec_return' [-Werror=implicit-function-declaration]:  => 239:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_inc' [-Werror=implicit-function-declaration]:  => 162:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_inc_and_test' [-Werror=implicit-function-declaration]:  => 204:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_inc_return' [-Werror=implicit-function-declaration]:  => 232:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_set' [-Werror=implicit-function-declaration]:  => 155:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_sub' [-Werror=implicit-function-declaration]:  => 183:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_sub_and_test' [-Werror=implicit-function-declaration]:  => 190:2
> >   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_sub_return' [-Werror=implicit-function-declaration]:  => 225:2
> >   + /scratch/kisskb/src/include/linux/atomic.h: error: implicit declaration of function '__atomic_add_unless' [-Werror=implicit-function-declaration]:  => 53:2
> >   + /scratch/kisskb/src/include/linux/atomic.h: error: implicit declaration of function 'atomic_cmpxchg' [-Werror=implicit-function-declaration]:  => 89:3
> >   + /scratch/kisskb/src/include/linux/atomic.h: error: implicit declaration of function 'atomic_read' [-Werror=implicit-function-declaration]:  => 136:2
> >
> > sparc-allmodconfig
>
> Not reproducible here with linus latest. (-rc1 + two patches).
> Can you help me with a pointer to the original build log?

http://kisskb.ellerman.id.au/kisskb/buildresult/11340509/

They've been there for quite a while (in -next since Apr 22), but lately I haven't had much time to dive into -next build failures.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say programmer or something like that.
    -- Linus Torvalds
Re: Build regressions/improvements in v3.16-rc1
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_add' [-Werror=implicit-function-declaration]:  => 176:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_add_negative' [-Werror=implicit-function-declaration]:  => 211:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_add_return' [-Werror=implicit-function-declaration]:  => 218:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_dec' [-Werror=implicit-function-declaration]:  => 169:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_dec_and_test' [-Werror=implicit-function-declaration]:  => 197:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_dec_return' [-Werror=implicit-function-declaration]:  => 239:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_inc' [-Werror=implicit-function-declaration]:  => 162:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_inc_and_test' [-Werror=implicit-function-declaration]:  => 204:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_inc_return' [-Werror=implicit-function-declaration]:  => 232:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_set' [-Werror=implicit-function-declaration]:  => 155:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_sub' [-Werror=implicit-function-declaration]:  => 183:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_sub_and_test' [-Werror=implicit-function-declaration]:  => 190:2
>   + /scratch/kisskb/src/include/asm-generic/atomic-long.h: error: implicit declaration of function 'atomic_sub_return' [-Werror=implicit-function-declaration]:  => 225:2
>   + /scratch/kisskb/src/include/linux/atomic.h: error: implicit declaration of function '__atomic_add_unless' [-Werror=implicit-function-declaration]:  => 53:2
>   + /scratch/kisskb/src/include/linux/atomic.h: error: implicit declaration of function 'atomic_cmpxchg' [-Werror=implicit-function-declaration]:  => 89:3
>   + /scratch/kisskb/src/include/linux/atomic.h: error: implicit declaration of function 'atomic_read' [-Werror=implicit-function-declaration]:  => 136:2
>
> sparc-allmodconfig

Not reproducible here with linus latest. (-rc1 + two patches).
Can you help me with a pointer to the original build log?

Sam
Re: [RFT PATCH -next v3] [BUGFIX] kprobes: Fix Failed to find blacklist error on ia64 and ppc64
On Thu, Jun 5, 2014 at 11:38 PM, Masami Hiramatsu <masami.hiramatsu...@hitachi.com> wrote:
> Ping?
> I guess this should go to 3.16 branch, shouldn't it?
>
> (2014/05/30 12:18), Masami Hiramatsu wrote:
> > On ia64 and ppc64, the function pointer does not point to the entry address of the function, but to the address of the function descriptor (which contains the entry address and misc data). Since kprobes passes the function pointer stored by NOKPROBE_SYMBOL() to kallsyms_lookup_size_offset() for initializing its blacklist, it fails and reports many errors as below.
> >
> >   Failed to find blacklist 000101316830

Yes please ... just found this problem on ia64 in mainline and was happy to see this fix for it.

Tested-by: Tony Luck <tony.l...@intel.com>

-Tony
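The descriptor indirection described in the quoted changelog can be illustrated with a small portable sketch. It is not the kernel fix itself: `struct func_desc`, its field names and `function_entry()` are hypothetical, and the descriptor is modelled as an ordinary struct so the example can run on any machine.

```c
#include <stdint.h>

/* Simplified model of a function descriptor; field names are illustrative. */
struct func_desc {
	uintptr_t entry;   /* address of the function's first instruction */
	uintptr_t toc;     /* misc data, for example a TOC or gp base */
};

/*
 * On a descriptor-based ABI, what C code holds as a "function pointer" is
 * really the address of such a descriptor. A symbol lookup by address needs
 * the entry address stored inside it, not the address of the descriptor.
 */
static uintptr_t function_entry(const void *func_ptr)
{
	const struct func_desc *desc = func_ptr;

	return desc->entry;
}
```

Passing the descriptor address straight to an address-based lookup would search for a symbol in the wrong place, which matches the "Failed to find blacklist" symptom quoted above.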
Re: [PATCH 23/38] mmc: sdhci: convert sdhci_set_uhs_signaling() into a library function
On Mon, Jun 16, 2014 at 02:17:30PM +0200, Ulf Hansson wrote:
> On 16 June 2014 12:46, Russell King - ARM Linux <li...@arm.linux.org.uk> wrote:
> > On Wed, Apr 23, 2014 at 08:08:07PM +0100, Russell King wrote:
> > > @@ -1507,25 +1529,7 @@ static void sdhci_do_set_ios(struct sdhci_host *host, struct mmc_ios *ios)
> > >  			host->ops->set_clock(host, host->clock);
> > >  		}
> > >
> > > -		if (host->ops->set_uhs_signaling)
> > > -			host->ops->set_uhs_signaling(host, ios->timing);
> > > -		else {
> > > -			ctrl_2 = sdhci_readw(host, SDHCI_HOST_CONTROL2);
> > > -			/* Select Bus Speed Mode for host */
> > > -			ctrl_2 &= ~SDHCI_CTRL_UHS_MASK;
> > > -			if ((ios->timing == MMC_TIMING_MMC_HS200) ||
> > > -			    (ios->timing == MMC_TIMING_UHS_SDR104))
> > > -				ctrl_2 |= SDHCI_CTRL_UHS_SDR104;
> > > -			else if (ios->timing == MMC_TIMING_UHS_SDR12)
> > > -				ctrl_2 |= SDHCI_CTRL_UHS_SDR12;
> > > -			else if (ios->timing == MMC_TIMING_UHS_SDR25)
> > > -				ctrl_2 |= SDHCI_CTRL_UHS_SDR25;
> > > -			else if (ios->timing == MMC_TIMING_UHS_SDR50)
> > > -				ctrl_2 |= SDHCI_CTRL_UHS_SDR50;
> > > -			else if (ios->timing == MMC_TIMING_UHS_DDR50)
> > > -				ctrl_2 |= SDHCI_CTRL_UHS_DDR50;
> > > -			sdhci_writew(host, ctrl_2, SDHCI_HOST_CONTROL2);
> > > -		}
> > > +		host->ops->set_uhs_signaling(host, ios->timing);
> > >
> > >  		if (!(host->quirks2 & SDHCI_QUIRK2_PRESET_VALUE_BROKEN) &&
> > >  				((ios->timing == MMC_TIMING_UHS_SDR12) ||
> >
> > Whoever decided to poorly pick these patches up against my will has slightly messed this patch up - whereas my original patch left the code correctly formatted, when whoever applied this patch did so, they left an additional blank line in the above.
>
> Hi Russell,
>
> We kindly pinged you several times asking for your state and for the PR, but I suppose you were just too busy. Your PR was kind of blocking patches for sdhci, if you remember.

I wasn't too busy. I had walked away from all kernel maintenance in disgust at the way many in the ARM community ignore questions, and ignore patches which need testing - I'm talking there about the L2C patch series which was extremely poorly tested, and still, to this day, has questions outstanding.

Yes, the code now produces warnings. It produces warnings /because/ people were not willing to help. Those warnings serve as a reminder that there are still problems which need solving there, and they're not going to go away until those problems are solved.

While I don't like pushing unfinished code into mainline, in this case, others deemed the patch set too important _not_ to go into mainline even with these problems.

Now, it's been /soo/ long since I worked on that patch set that my knowledge has now diminished... so it's now going to be _much_ harder to resolve those issues than it would have been three months ago.

And I'm also holding a grudge, and I bear grudges for a long time, so expect me to be difficult towards Linux stuff for a while yet.

> The mmc people were also very helpful in sending patches to fix up related regressions, immediately after we merged your patchset. Thus together I think we managed to pull it off.

The formatting problem I refer to above is line 1532/1533 in sdhci.c - there's an additional blank line which somehow got left behind, caused presumably by insufficient attention paid to cleaning up a conflict between my original patches and the state of the tree they were applied to.

--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly improving, and getting towards what was expected from it.
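For context on what the quoted hunk moves out of sdhci_do_set_ios(), the bus speed selection can be sketched as a standalone helper. This is an illustration only: the enum, the `CTRL_UHS_*` names and `uhs_ctrl2()` are hypothetical stand-ins for the kernel's MMC_TIMING_* and SDHCI_CTRL_UHS_* definitions, the numeric field values are assumptions, and the register read and write are replaced by a plain argument and return value.

```c
#include <stdint.h>

/* Illustrative stand-ins for the MMC_TIMING_* values used in the hunk. */
enum timing {
	TIMING_LEGACY,
	TIMING_UHS_SDR12,
	TIMING_UHS_SDR25,
	TIMING_UHS_SDR50,
	TIMING_UHS_SDR104,
	TIMING_UHS_DDR50,
	TIMING_MMC_HS200,
};

/* Bus speed mode field of Host Control 2; values assumed for the sketch. */
#define CTRL_UHS_SDR12   0x0000
#define CTRL_UHS_SDR25   0x0001
#define CTRL_UHS_SDR50   0x0002
#define CTRL_UHS_SDR104  0x0003
#define CTRL_UHS_DDR50   0x0004
#define CTRL_UHS_MASK    0x0007

/* Clear the bus speed field, then select the mode matching the timing. */
static uint16_t uhs_ctrl2(uint16_t ctrl_2, enum timing timing)
{
	ctrl_2 &= (uint16_t)~CTRL_UHS_MASK;

	switch (timing) {
	case TIMING_MMC_HS200:
	case TIMING_UHS_SDR104:
		ctrl_2 |= CTRL_UHS_SDR104;
		break;
	case TIMING_UHS_SDR12:
		ctrl_2 |= CTRL_UHS_SDR12;
		break;
	case TIMING_UHS_SDR25:
		ctrl_2 |= CTRL_UHS_SDR25;
		break;
	case TIMING_UHS_SDR50:
		ctrl_2 |= CTRL_UHS_SDR50;
		break;
	case TIMING_UHS_DDR50:
		ctrl_2 |= CTRL_UHS_DDR50;
		break;
	default:
		break;   /* other timings leave the field cleared */
	}

	return ctrl_2;
}
```

Once this logic lives in one helper, the call site can invoke the set_uhs_signaling operation unconditionally, as the `+` line of the hunk does, with drivers free to supply their own variant.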
Re: Boot failure in Power7 pSeries
Does anyone have an idea on this issue?

Thanks
Mike

On 06/17/2014 05:45 PM, Mike Qiu wrote:
> Hi all,
>
> I used the newest linux-next (top commit: 5f295cdf5c5dbbb0c40f10f2ddae02ff46bbf773) to boot my Power7 machine in PowerVM mode (HypMode 01), with the default config file in /boot/. It shows the error log below:
>
> OF stdout device is: /vdevice/vty@3000
> Preparing to boot Linux version 3.16.0-rc1-next-20140617+ (root@shui) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #5 SMP Tue Jun 17 05:16:21 EDT 2014
> Detected machine type: 0101
> Max number of cores passed to firmware: 256 (NR_CPUS = 1024)
> Calling ibm,client-architecture-support... done
> command line: BOOT_IMAGE=/vmlinux-3.16.0-rc1-next-20140617+ root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/swap rd.md=0 rd.dm=0 vconsole.keymap=us rd.luks=0 vconsole.font=latarcyrheb-sun16 rd.lvt
> memory layout at init:
>   memory_limit : (16 MB aligned)
>   alloc_bottom : 0591
>   alloc_top    : 1000
>   alloc_top_hi : 1000
>   rmo_top      : 1000
>   ram_top      : 1000
> instantiating rtas at 0x0ee8... done
> Querying for OPAL presence...
> DEFAULT CATCH!, exception-handler=fff00700
> at %SRR0: 041a1c14 %SRR1: 00081002
> Open Firmware exception handler entered from non-OF code
>
> Client's Fix Pt Regs:
>  00 042c017c 042c2ce8 04ae8d58 042c2f38
>  04 0369aafc 042c2f38 01adc100 042c2f38
>  08 04328d58 28002024 1002
>  0c a001 01a9fd20 041a7df8
>  10 041a2130 041a1e70 f821ff913d220005 01a9fd20
>  14 7962 0ee8 0118 0ee8
>  18 041a2610 0369 042c3070 041a1ce8
>  1c 041a1ce0 041b89f0 0003 0001
> Special Regs:
>  %IV: 0700 %CR: 48002024 %XER: %DSISR: 4000
>  %SRR0: 041a1c14 %SRR1: 00081002
>  %LR: 0369aafc %CTR: %DAR: f821ff913d220035
> Virtual PID = 0
> ok
> 0
>
> Thanks
> Mike
Re: Build regressions/improvements in v3.16-rc1
On 06/17/2014 08:23 AM, Geert Uytterhoeven wrote:
> On Tue, Jun 17, 2014 at 5:16 PM, Geert Uytterhoeven <ge...@linux-m68k.org> wrote:
> [...]
>
>   + /scratch/kisskb/src/sound/soc/fsl/fsl_dma.c: error: invalid use of undefined type 'struct ccsr_ssi':  => 926:34, 927:34
>
> powerpc/mpc85xx_defconfig

Being fixed: http://patchwork.ozlabs.org/patch/358500/

Guenter