Re: [PATCH 1/3] powerpc: kprobes: add support for KPROBES_ON_FTRACE
Hi Naveen, On Wed, 15 Feb 2017 00:28:34 +0530 "Naveen N. Rao" wrote: > diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c > index e51a045f3d3b..a8f414a0b141 100644 > --- a/arch/powerpc/kernel/optprobes.c > +++ b/arch/powerpc/kernel/optprobes.c > @@ -70,6 +70,9 @@ static unsigned long can_optimize(struct kprobe *p) > struct instruction_op op; > unsigned long nip = 0; > > + if (unlikely(kprobe_ftrace(p))) > + return 0; > + Hmm, this should not be checked in arch-dependent code, since it would duplicate code in each arch. Please try below. Thanks, commit 897bb304974c1b0c19a4274fea220b175b07f353 Author: Masami Hiramatsu Date: Wed Feb 15 15:48:14 2017 +0900 kprobes: Skip preparing optprobe if the probe is ftrace-based Skip preparing optprobe if the probe is ftrace-based, since in any case it must not be optimized (or is already optimized by ftrace). Signed-off-by: Masami Hiramatsu diff --git a/kernel/kprobes.c b/kernel/kprobes.c index ebb4dad..546a8b1 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -746,13 +746,20 @@ static void kill_optimized_kprobe(struct kprobe *p) arch_remove_optimized_kprobe(op); } +static inline +void __prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p) +{ + if (!kprobe_ftrace(p)) + arch_prepare_optimized_kprobe(op, p); +} + /* Try to prepare optimized instructions */ static void prepare_optimized_kprobe(struct kprobe *p) { struct optimized_kprobe *op; op = container_of(p, struct optimized_kprobe, kp); - arch_prepare_optimized_kprobe(op, p); + __prepare_optimized_kprobe(op, p); } /* Allocate new optimized_kprobe and try to prepare optimized instructions */ @@ -766,7 +773,7 @@ static struct kprobe *alloc_aggr_kprobe(struct kprobe *p) INIT_LIST_HEAD(&op->list); op->kp.addr = p->addr; - arch_prepare_optimized_kprobe(op, p); + __prepare_optimized_kprobe(op, p); return &op->kp; } -- Masami Hiramatsu
[PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices
Currently we reassign the alignment by extending resources' size in pci_reassigndev_resource_alignment(). This could potentially break some drivers when a driver uses the size to locate registers whose length is related to the size. Some examples: - misc/hpilo.c: off = pci_resource_len(pdev, bar) - 0x2000; - net/ethernet/chelsio/cxgb4/cxgb4_uld.h: (pci_resource_len((pdev), 2) - roundup_pow_of_two((vres)->ocq.size)) - infiniband/hw/nes/nes_hw.c: num_pds = pci_resource_len(nesdev->pcidev, BAR_1) >> PAGE_SHIFT; This risk was easy to contain before, because we only had one way (the resource_alignment kernel parameter) to reach that code, and some users may even be happy to see the extended size. But now we define a default alignment for all devices, which would also reach that code, and it would be hard to contain the risk in that case. So this patch uses IORESOURCE_STARTALIGN to identify the resource's alignment, without extending the size, when the alignment reassignment is caused by the default alignment. Signed-off-by: Yongji Xie --- drivers/pci/pci.c | 34 ++++++++++++++++++++++++++++---------- 1 files changed, 24 insertions(+), 10 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 2622e9b..0017e5e 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4958,7 +4958,8 @@ void pci_ignore_hotplug(struct pci_dev *dev) * RETURNS: Resource alignment if it is specified. * Zero if it is not specified. 
*/ -static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev) +static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev, + bool *resize) { int seg, bus, slot, func, align_order, count; unsigned short vendor, device, subsystem_vendor, subsystem_device; @@ -5002,6 +5003,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev) (!device || (device == dev->device)) && (!subsystem_vendor || (subsystem_vendor == dev->subsystem_vendor)) && (!subsystem_device || (subsystem_device == dev->subsystem_device))) { + *resize = true; if (align_order == -1) align = PAGE_SIZE; else @@ -5027,6 +5029,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev) bus == dev->bus->number && slot == PCI_SLOT(dev->devfn) && func == PCI_FUNC(dev->devfn)) { + *resize = true; if (align_order == -1) align = PAGE_SIZE; else @@ -5059,6 +5062,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev) struct resource *r; resource_size_t align, size; u16 command; + bool resize = false; /* * VF BARs are read-only zero according to SR-IOV spec r1.1, sec @@ -5070,7 +5074,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev) return; /* check if specified PCI is target device to reassign */ - align = pci_specified_resource_alignment(dev); + align = pci_specified_resource_alignment(dev, &resize); if (!align) return; @@ -5098,15 +5102,25 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev) } size = resource_size(r); - if (size < align) { - size = align; - dev_info(&dev->dev, - "Rounding up size of resource #%d to %#llx.\n", - i, (unsigned long long)size); + if (resize) { + if (size < align) { + size = align; + dev_info(&dev->dev, + "Rounding up size of resource #%d to %#llx.\n", + i, (unsigned long long)size); + } + r->flags |= IORESOURCE_UNSET; + r->end = size - 1; + r->start = 0; + } else { + if (size < align) { + r->flags &= ~IORESOURCE_SIZEALIGN; + r->flags |= IORESOURCE_STARTALIGN | + IORESOURCE_UNSET; + 
r->start = align; + r->end = r->start + size - 1; + } } - r->flags |= IORESOURCE_UNSET; - r->end = size - 1; - r->start = 0; } /*
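As a rough userspace model of the two behaviours contrasted in this patch (helper names are made up, and r->start merely stands in for the IORESOURCE_STARTALIGN convention of carrying the required alignment in the start field): the old resource_alignment= path grows the BAR to the alignment, while the new default-alignment path keeps the real size so drivers that derive register layouts from pci_resource_len() are unaffected.

```c
#include <assert.h>

typedef unsigned long long resource_size_t;

struct toy_resource {
	resource_size_t start, end;
};

/* resource_alignment= path: round the size itself up to the alignment */
static void align_by_resize(struct toy_resource *r, resource_size_t size,
			    resource_size_t align)
{
	if (size < align)
		size = align;
	r->start = 0;
	r->end = size - 1;
}

/*
 * Default-alignment path: keep the size; only the placement is
 * constrained (modelled here by putting the start on 'align').
 */
static void align_start_only(struct toy_resource *r, resource_size_t size,
			     resource_size_t align)
{
	r->start = (size < align) ? align : 0;
	r->end = r->start + size - 1;
}

static resource_size_t toy_resource_size(const struct toy_resource *r)
{
	return r->end - r->start + 1;
}
```

With size 0x2000 and align 0x10000, the resize path reports a 0x10000-byte resource while the start-align path still reports 0x2000.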
[PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
When vfio passthroughs a PCI device whose MMIO BARs are smaller than PAGE_SIZE, the guest will not be able to handle MMIO accesses to those BARs directly, which leads to MMIO emulation in the host. This is because vfio will not allow passthrough of a BAR's MMIO page that may be shared with other BARs; otherwise there would be a backdoor the guest could use to access BARs of another guest. This patch adds a macro to set a default alignment for all PCI devices. Then we can solve this issue on platforms which easily hit it because of their 64K page, such as the PowerNV platform, by defining this macro as PAGE_SIZE. Signed-off-by: Yongji Xie --- arch/powerpc/include/asm/pci.h | 4 ++++ drivers/pci/pci.c | 3 +++ 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h index e9bd6cf..5e31bc2 100644 --- a/arch/powerpc/include/asm/pci.h +++ b/arch/powerpc/include/asm/pci.h @@ -28,6 +28,10 @@ #define PCIBIOS_MIN_IO 0x1000 #define PCIBIOS_MIN_MEM 0x1000 +#ifdef CONFIG_PPC_POWERNV +#define PCIBIOS_DEFAULT_ALIGNMENT PAGE_SIZE +#endif + struct pci_dev; /* Values for the `which' argument to sys_pciconfig_iobase syscall. */ diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a881c0d..2622e9b 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4965,6 +4965,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev) resource_size_t align = 0; char *p; +#ifdef PCIBIOS_DEFAULT_ALIGNMENT + align = PCIBIOS_DEFAULT_ALIGNMENT; +#endif spin_lock(&resource_alignment_lock); p = resource_alignment_param; if (!*p) -- 1.7.1
[PATCH v9 1/3] PCI: A fix for calculating bridge window's size and alignment
If a device's alignment is greater than its size, we may calculate an incorrect size and alignment for its bus's memory window in pbus_size_mem(). This patch fixes that case. Signed-off-by: Yongji Xie --- drivers/pci/setup-bus.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c index f30ca75..b3cabc2 100644 --- a/drivers/pci/setup-bus.c +++ b/drivers/pci/setup-bus.c @@ -1075,10 +1075,10 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask, r->flags = 0; continue; } - size += r_size; + size += max(r_size, align); /* Exclude ranges with size > align from calculation of the alignment. */ - if (r_size == align) + if (r_size <= align) aligns[order] += align; if (order > max_order) max_order = order; -- 1.7.1
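To illustrate the arithmetic behind the "size += max(r_size, align)" change, here is a minimal userspace sketch (illustrative names only, not the kernel's pbus_size_mem()): a BAR whose required alignment exceeds its size still consumes a full alignment-sized slot in the bridge window, so summing raw sizes undercounts the window.

```c
#include <assert.h>

typedef unsigned long long resource_size_t;

static resource_size_t max_rs(resource_size_t a, resource_size_t b)
{
	return a > b ? a : b;
}

/*
 * Toy model of the size accumulation: each child resource contributes
 * at least its alignment, because the allocator must place it on an
 * 'align' boundary.  (The real function also tracks per-order aligns[]
 * and the window's own alignment.)
 */
static resource_size_t window_size(const resource_size_t *r_size,
				   const resource_size_t *align, int n)
{
	resource_size_t size = 0;
	int i;

	for (i = 0; i < n; i++)
		size += max_rs(r_size[i], align[i]);	/* was: size += r_size[i] */
	return size;
}
```

For a 0x1000-byte BAR that must be 0x10000-aligned plus a 0x10000-byte BAR, the fixed formula yields 0x20000; the old formula would have yielded only 0x11000.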
[PATCH v9 0/3] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE
This series introduces a way for the PCI resource allocator to force MMIO BARs not to share a PAGE_SIZE page. This matters to the VFIO driver, because the current VFIO implementation disallows mmapping sub-page (size < PAGE_SIZE) MMIO BARs which may share the same page with other BARs, for security reasons. Thus, we have to handle MMIO access to these BARs in QEMU emulation rather than in the guest, which causes some performance loss. In our solution, we try to make use of the existing code path of the resource_alignment kernel parameter and add a macro to set a default alignment for it. Thus we can define this macro by default on arches which easily hit the performance issue because of their 64K page. In this series, patch 1 fixes a bug related to bridge window size/alignment calculation; patches 2 and 3 add support for setting a default alignment for all MMIO BARs. Changelog v9: - Add a patch to fix calculating bridge window's size and alignment - Remove an unrelated patch - Rework the patch that fixes the bug of reassigning a resource's alignment by changing its size Changelog v8: - Rebased against v4.10-rc4 - Rework patch 2 - Change the commit log of patch 1 Changelog v7: - Rebased against v4.9-rc2 - Drop two merged patches - Rework the patch which fixes a bug where a resource's size is changed when using resource_alignment - Add a patch that fixes a bug for IOV BARs when using resource_alignment Changelog v6: - Remove the option "noresize@" of resource_alignment Changelog v5: - Rebased against v4.8-rc6 - Drop the patch forbidding disabling memory decoding in pci_reassigndev_resource_alignment() Changelog v4: - Rebased against v4.8-rc1 - Drop one irrelevant patch - Drop the patch adding a wildcard to resource_alignment to enforce the alignment of all MMIO BARs to be at least PAGE_SIZE - Change the format of option "noresize" of resource_alignment - Code style improvements Changelog v3: - Ignore enforced alignment to fixed BARs - Fix issue of disabling memory decoding when 
reassigning the alignment - Only enable default alignment on the PowerNV platform Changelog v2: - Ignore enforced alignment for VF BARs in pci_reassigndev_resource_alignment() Yongji Xie (3): PCI: A fix for calculating bridge window's size and alignment PCI: Add a macro to set default alignment for all PCI devices PCI: Don't extend device's size when using default alignment for all devices arch/powerpc/include/asm/pci.h | 4 ++++ drivers/pci/pci.c | 37 +++++++++++++++++++++++++++++++++---------- drivers/pci/setup-bus.c | 4 ++-- 3 files changed, 33 insertions(+), 12 deletions(-)
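The security constraint driving the series can be stated as a small check (a sketch with made-up names; TOY_PAGE_SIZE models the 64K pages mentioned above): VFIO can only hand a BAR straight to the guest via mmap when the BAR covers whole pages by itself, which is exactly what the enforced PAGE_SIZE alignment guarantees.

```c
#include <stdbool.h>

#define TOY_PAGE_SIZE 0x10000ULL	/* 64K pages, as on PowerNV */

/*
 * A BAR can be mmap'd through to the guest only if it covers whole
 * pages by itself: a page-aligned start and a size that is a multiple
 * of the page size.  Otherwise it may share a page with a neighbouring
 * BAR and VFIO must fall back to emulated (trapped) access.
 */
static bool bar_mmapable(unsigned long long start, unsigned long long size)
{
	return size > 0 &&
	       (start % TOY_PAGE_SIZE) == 0 &&
	       (size % TOY_PAGE_SIZE) == 0;
}
```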
Re: [PATCH 2/2] powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler
On Wednesday 15 February 2017 10:38 AM, Stewart Smith wrote: Mukesh Ojha writes: Converts all the explicit return numbers to the more proper IRQ_HANDLED, which is the appropriate return value for an interrupt handler. Signed-off-by: Mukesh Ojha Reviewed-by: Vasant Hegde --- arch/powerpc/platforms/powernv/opal-dump.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) Should also have: Fixes: 8034f715f ("powernv/opal-dump: Convert to irq domain") ? Yeah... I should have mentioned which patch missed these changes. -Mukesh
Re: [v4,1/2] arch/powerpc: Implement Optprobes
Thank you, Michael. :) On Tuesday 14 February 2017 06:10 PM, Michael Ellerman wrote: On Wed, 2017-02-08 at 09:50:51 UTC, Anju T wrote: The current infrastructure of kprobes uses the unconditional trap instruction to probe a running kernel. Optprobe allows kprobe to replace the trap with a branch instruction to a detour buffer. The detour buffer contains instructions to create an in-memory pt_regs. The detour buffer also has a call to optimized_callback(), which in turn calls the pre_handler(). After the execution of the pre-handler, a call is made for instruction emulation. The NIP is determined in advance through dummy instruction emulation and a branch instruction is created to the NIP at the end of the trampoline. To address the range limitation of the branch instruction in the POWER architecture, the detour buffer slot is allocated from a reserved area. For the time being, 64KB is reserved in memory for this purpose. Instructions which can be emulated using analyse_instr() are the candidates for optimization. Before optimization, ensure that the address range between the allocated detour buffer and the instruction being probed is within +/- 32MB. Signed-off-by: Anju T Sudhakar Signed-off-by: Naveen N. Rao Acked-by: Masami Hiramatsu Series applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/51c9c0843993528bffc920c54c2121 cheers
Re: [PATCH 2/2] powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler
Mukesh Ojha writes: > Converts all the explicit return numbers to the more proper IRQ_HANDLED, > which is the appropriate return value for an interrupt handler. > > Signed-off-by: Mukesh Ojha > Reviewed-by: Vasant Hegde > --- > arch/powerpc/platforms/powernv/opal-dump.c | 9 +++-- > 1 file changed, 3 insertions(+), 6 deletions(-) Should also have: Fixes: 8034f715f ("powernv/opal-dump: Convert to irq domain") ? -- Stewart Smith OPAL Architect, IBM.
Re: [PATCH 1/2] powerpc/powernv/opal-dump : Handles opal_dump_info properly
Mukesh Ojha writes: > Moves the return value check of 'opal_dump_info' to a proper place; previously we were unnecessarily filling in all the dump info even on failure. > > Signed-off-by: Mukesh Ojha > --- > arch/powerpc/platforms/powernv/opal-dump.c | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) Acked-by: Stewart Smith -- Stewart Smith OPAL Architect, IBM.
Re: [PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails
Michael Ellerman writes: > Vipin K Parashar writes: > >> OPAL returns OPAL_WRONG_STATE for XSCOM operations >> >> done to read any core FIR which is sleeping or offline. > > OK. > > Do we know why Linux is causing that to happen? > > It's also returned from many of the XIVE routines if we're in the wrong > xive mode, all of which would indicate a fairly bad Linux bug. > > Also the skiboot patch which added WRONG_STATE for XSCOM ops did so > explicitly so we could differentiate from other errors: > > commit 9c2d82394fd2303847cac4a665dee62556ca528a > Author: Russell Currey > AuthorDate: Mon Mar 21 12:00:00 2016 +1100 > > xscom: Return OPAL_WRONG_STATE on XSCOM ops if CPU is asleep > > xscom_read and xscom_write return OPAL_SUCCESS if they worked, and > OPAL_HARDWARE if they didn't. This doesn't provide information about why > the operation failed, such as if the CPU happens to be asleep. > > This is specifically useful in error scanning, so if every CPU is being > scanned for errors, sleeping CPUs likely aren't the cause of failures. > > So, return OPAL_WRONG_STATE in xscom_read and xscom_write if the CPU is > sleeping. > > Signed-off-by: Russell Currey > Reviewed-by: Alistair Popple > Signed-off-by: Stewart Smith > > > > So I'm still not convinced that quietly swallowing this error and > mapping it to -EIO along with several of the other error codes is the > right thing to do. FWIW I agree - pretty limited cases where it should just be converted into -EIO and passed on - probably *just* the debugfs interface to be honest. -- Stewart Smith OPAL Architect, IBM.
Re: [PATCH 1/3] powerpc: kprobes: add support for KPROBES_ON_FTRACE
On Wed, Feb 15, 2017 at 12:28:34AM +0530, Naveen N. Rao wrote: > Allow kprobes to be placed on ftrace _mcount() call sites. This > optimization avoids the use of a trap, by riding on ftrace > infrastructure. > > This depends on HAVE_DYNAMIC_FTRACE_WITH_REGS which depends on > MPROFILE_KERNEL, which is only currently enabled on powerpc64le with > newer toolchains. > > Based on the x86 code by Masami. > > Signed-off-by: Naveen N. Rao > --- > arch/powerpc/Kconfig | 1 + > arch/powerpc/include/asm/kprobes.h | 10 ++++++++++ > arch/powerpc/kernel/Makefile | 3 ++ > arch/powerpc/kernel/kprobes-ftrace.c | 100 > +++ > arch/powerpc/kernel/kprobes.c | 4 +- > arch/powerpc/kernel/optprobes.c | 3 ++ > 6 files changed, 120 insertions(+), 1 deletion(-) > create mode 100644 arch/powerpc/kernel/kprobes-ftrace.c You'll also need to update Documentation/features/debug/kprobes-on-ftrace/arch-support.txt > +/* Ftrace callback handler for kprobes */ > +void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip, > +struct ftrace_ops *ops, struct pt_regs *regs) > +{ > + struct kprobe *p; > + struct kprobe_ctlblk *kcb; > + unsigned long flags; > + > + /* Disable irq for emulating a breakpoint and avoiding preempt */ > + local_irq_save(flags); > + hard_irq_disable(); > + > + p = get_kprobe((kprobe_opcode_t *)nip); > + if (unlikely(!p) || kprobe_disabled(p)) > + goto end; > + > + kcb = get_kprobe_ctlblk(); > + if (kprobe_running()) { > + kprobes_inc_nmissed_count(p); > + } else { > + unsigned long orig_nip = regs->nip; > + /* Kprobe handler expects regs->nip = nip + 1 as breakpoint hit > */ Can you clarify this? On powerpc, the regs->nip at the time of breakpoint hit points to the probed instruction, not the one after. Ananth
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
On 2017/2/15 1:35 AM, Guilherme G. Piccoli wrote: > On 14/02/2017 01:58, Pan Xinhui wrote: >> >> >> On 2017/2/14 10:35, Nicholas Piggin wrote: >>> On Mon, 13 Feb 2017 19:00:42 -0200 >>> >>> xmon state changing after the first sysrq+x violates the principle of least >>> astonishment, so I think that should be fixed. >>> >> hi, Nick >> yes, as long as xmon is disabled during boot, it should still be disabled >> after exiting xmon. >> My patch does not fix that, as it needs people to add one more char 'z' following >> 'x'. >> I will provide a new patch to fix that. >> >>> Then the question is, is it worth making it runtime configurable with an xmon >>> command or debugfs tunables? >>> >> They are options for people to turn xmon features on or off. Maybe people >> don't need this. >> However I am not a fan of debugfs this time as I am used to using xmon >> cmds. :) >> >> Hi, Guilherme >> So in the end, my thought is that: 1) cmd x|X will exit xmon and keep xmon >> in the original state (indicated by var xmon_off). >> 2) Then add options to turn some features on/off. And debugfs may not fit >> for this. But I am also wondering at the same time, do people need this? > Hi Nick and Xinhui, thanks very much for the feedback. > I agree, we should keep xmon in the state it was firstly set, at boot > time - dropping to the debugger using sysrq shouldn't change it. > Yes, and feel free to include my fix patch "powerpc/xmon: Fix an unexpected xmon onoff state change" :) > Now, the use case of the debugfs approach is to allow the user to > enable/disable xmon without needing to drop into the debugger itself, or > reboot the machine. Good, got it. We look forward to your new patch. :) thanks xinhui > Imagine a scenario in which we have a production machine, and: > > > i) For some reason, somebody kept xmon enabled on grub.cfg and now, we > want to let kdump work in case of crash - how to disable xmon at runtime? 
> > ii) The opposite: xmon wasn't enabled at boot time on a production machine, > but we have a super-rare issue and want to drop to xmon next time it > happens, so we need to enable it. But we don't want to drop into the > debugger just to force it to get enabled, so how do we enable it? > > Regarding the place of the xmon state file, I believe debugfs is the > right place - where else could we add it? procfs? configfs? > > Thanks, > > > Guilherme >> >> thanks >> xinhui >> >>> Thanks, >>> Nick >>> >>
[PATCH] KVM: Prevent double-free on HPT resize commit path
resize_hpt_release(), called once the HPT resize of a KVM guest is completed (successfully or unsuccessfully), free()s the state structure for the resize. It is currently not safe to call with a NULL pointer. However, one of the error paths in kvm_vm_ioctl_resize_hpt_commit() can invoke it with a NULL pointer. This will occur if userspace improperly invokes KVM_PPC_RESIZE_HPT_COMMIT without previously calling KVM_PPC_RESIZE_HPT_PREPARE, or if it calls COMMIT twice without an intervening PREPARE. To fix this potential crash bug, and maybe others like it, make it safe (and a no-op) to call resize_hpt_release() with a NULL resize pointer. Found by Dan Carpenter with a static checker. Reported-by: Dan Carpenter Signed-off-by: David Gibson --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 013552f..72ccac2 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -1407,6 +1407,9 @@ static void resize_hpt_release(struct kvm *kvm, struct kvm_resize_hpt *resize) { BUG_ON(kvm->arch.resize_hpt != resize); + if (!resize) + return; + if (resize->hpt.virt) kvmppc_free_hpt(&resize->hpt); -- 2.9.3
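The pattern being applied here, a teardown helper that tolerates NULL the way free() does so that every error path can call it unconditionally, can be sketched in userspace C (names are illustrative, not the kernel's):

```c
#include <stdlib.h>

struct toy_resize {
	int dummy;
};

static int release_count;

/*
 * Like free(): a NULL argument is a no-op, so callers on error paths
 * never have to guard the call.  Without the NULL check, the first
 * dereference of 'r' would crash.
 */
static void toy_release(struct toy_resize *r)
{
	if (!r)
		return;		/* the fix: tolerate NULL */
	release_count++;	/* stands in for freeing the HPT, etc. */
	free(r);
}
```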
Re: [PATCH 2/2] powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler
Hi Mukesh, > Converts all the explicit return numbers to the more proper IRQ_HANDLED, > which is the appropriate return value for an interrupt handler. This looks good to me, but can you describe the effects of those changes to the interrupt handler's return code? ie, what happened in the erroneous case where we returned 0 (== IRQ_NONE) - does this fix a user-visible issue? Cheers, Jeremy
Re: [PATCH 1/2] powerpc/powernv/opal-dump : Handles opal_dump_info properly
Hi Mukesh, > Moves the return value check of 'opal_dump_info' to a proper place; previously we were unnecessarily filling in all the dump info even on failure. Acked-by: Jeremy Kerr Thanks! Jeremy
[PATCH] powerpc/xmon: Fix an unexpected xmon onoff state change
Once xmon is triggered by sysrq-x, it stays enabled afterwards even if it was disabled during boot. This will cause a system reset interrupt to fail to dump. So keep xmon in its original state after exit. Signed-off-by: Pan Xinhui --- arch/powerpc/xmon/xmon.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c index 9c0e17c..721212f 100644 --- a/arch/powerpc/xmon/xmon.c +++ b/arch/powerpc/xmon/xmon.c @@ -76,6 +76,7 @@ static int xmon_gate; #endif /* CONFIG_SMP */ static unsigned long in_xmon __read_mostly = 0; +static int xmon_off = 0; static unsigned long adrs; static int size = 1; @@ -3250,6 +3251,8 @@ static void sysrq_handle_xmon(int key) /* ensure xmon is enabled */ xmon_init(1); debugger(get_irq_regs()); + if (xmon_off) + xmon_init(0); } static struct sysrq_key_op sysrq_xmon_op = { @@ -3266,7 +3269,7 @@ static int __init setup_xmon_sysrq(void) __initcall(setup_xmon_sysrq); #endif /* CONFIG_MAGIC_SYSRQ */ -static int __initdata xmon_early, xmon_off; +static int __initdata xmon_early; static int __init early_parse_xmon(char *p) { -- 2.4.11
linux-next: manual merge of the kvm tree with the powerpc tree
Hi all, Today's linux-next merge of the kvm tree got a conflict in: arch/powerpc/kvm/book3s_hv_rm_xics.c between commit: ab9bad0ead9a ("powerpc/powernv: Remove separate entry for OPAL real mode calls") from the powerpc tree and commit: 21acd0e4df04 ("KVM: PPC: Book 3S: XICS: Don't lock twice when checking for resend") from the kvm tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/powerpc/kvm/book3s_hv_rm_xics.c index 29f43ed6d5eb,0b2e388f4cdf.. --- a/arch/powerpc/kvm/book3s_hv_rm_xics.c +++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c @@@ -35,8 -35,8 +35,8 @@@ int kvm_irq_bypass = 1 EXPORT_SYMBOL(kvm_irq_bypass); static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp *icp, - u32 new_irq); + u32 new_irq, bool check_resend); -static int xics_opal_rm_set_server(unsigned int hw_irq, int server_cpu); +static int xics_opal_set_server(unsigned int hw_irq, int server_cpu); /* -- ICS routines -- */ static void ics_rm_check_resend(struct kvmppc_xics *xics,
Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementation of saved write
"Aneesh Kumar K.V" writes: > Michael Ellerman writes: > >> "Aneesh Kumar K.V" writes: >>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h >>> b/arch/powerpc/include/asm/book3s/64/mmu-hash.h >>> index 0735d5a8049f..8720a406bbbe 100644 >>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h >>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h >>> @@ -16,6 +16,9 @@ >>> #include >>> #include >>> >>> +#ifndef __ASSEMBLY__ >>> +#include <linux/mmdebug.h> >>> +#endif >> >> I assume that's for the VM_BUG_ON() you add below. But if so wouldn't >> the #include be better placed in book3s/64/pgtable.h also? > > mmu-hash.h has got a hack that is explained below > > #ifndef __ASSEMBLY__ > #include <linux/mmdebug.h> > #endif > /* > * This is necessary to get the definition of PGTABLE_RANGE which we > * need for various slices related matters. Note that this isn't the > * complete pgtable.h but only a portion of it. > */ > #include <asm/book3s/64/pgtable.h> > > This is the only place where we do that book3s/64/pgtable.h include this > way. Everybody should include asm/pgtable.h which picks the right version > based on different config options. I don't understand how that is related. If you're adding a VM_BUG_ON() in book3s/64/pgtable.h, why isn't the include of mmdebug.h in that file also? cheers
Re: [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
On Tue, 14 Feb 2017 21:59:23 +1100 Michael Ellerman wrote: > "Aneesh Kumar K.V" writes: > > > On Tuesday 14 February 2017 11:19 AM, Michael Ellerman wrote: > >> "Aneesh Kumar K.V" writes: > >> > >>> Autonuma preserves the write permission across numa fault to avoid taking > >>> a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: > >>> preserve PTE > >>> write permissions across a NUMA hinting fault"). Architectures can > >>> implement > >>> protnone in different ways and some may choose to implement that by > >>> clearing Read/ > >>> Write/Exec bits of the pte. Setting the write bit on such a pte can result in > >>> wrong > >>> behaviour. Fix this up by allowing the arch to override how to save the write > >>> bit > >>> on a protnone pte. > >> This is pretty obviously a nop on arches that don't implement the new > >> hooks, but it'd still be good to get an ack from someone in mm land > >> before I merge it. > > > > > > To get it to apply cleanly you may need > > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch > > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes-fix.patch > > Ah OK, I missed those. > > In that case these two should probably go via Andrew's tree. Done. But mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch is on hold because Aneesh saw a testing issue, so these two are also on hold.
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
"Guilherme G. Piccoli" writes: > On 14/02/2017 01:58, Pan Xinhui wrote: >> On 2017/2/14 10:35, Nicholas Piggin wrote: >>> On Mon, 13 Feb 2017 19:00:42 -0200 >>> >>> xmon state changing after the first sysrq+x violates the principle of least >>> astonishment, so I think that should be fixed. >>> >> hi, Nick >> yes, as long as xmon is disabled during boot, it should still be disabled >> after exiting xmon. >> My patch does not fix that, as it needs people to add one more char 'z' following >> 'x'. >> I will provide a new patch to fix that. >> >>> Then the question is, is it worth making it runtime configurable with an xmon >>> command or debugfs tunables? >>> >> They are options for people to turn xmon features on or off. Maybe people >> don't need this. >> However I am not a fan of debugfs this time as I am used to using xmon >> cmds. :) >> >> Hi, Guilherme >> So in the end, my thought is that: 1) cmd x|X will exit xmon and keep xmon >> in the original state (indicated by var xmon_off). >> 2) Then add options to turn some features on/off. And debugfs may not fit >> for this. But I am also wondering at the same time, do people need this? > > Hi Nick and Xinhui, thanks very much for the feedback. > I agree, we should keep xmon in the state it was firstly set, at boot > time - dropping to the debugger using sysrq shouldn't change it. > > Now, the use case of the debugfs approach is to allow the user to > enable/disable xmon without needing to drop into the debugger itself, or > reboot the machine. > Imagine a scenario in which we have a production machine, and: > > > i) For some reason, somebody kept xmon enabled on grub.cfg and now, we > want to let kdump work in case of crash - how to disable xmon at runtime? > > ii) The opposite: xmon wasn't enabled at boot time on a production machine, > but we have a super-rare issue and want to drop to xmon next time it > happens, so we need to enable it. But we don't want to drop into the > debugger just to force it to get enabled, so how do we enable it? Yep. 
I think both of those are good reasons to add a debugfs file. > Regarding the place of the xmon state file, I believe debugfs is the > right place - where else could we add it? procfs? configfs? debugfs, under powerpc_debugfs_root. cheers
Re: [bug report] powerpc/powernv: Support EEH reset for VF PE
On Tue, 2017-02-14 at 16:39 +0300, Dan Carpenter wrote: > Hello Wei Yang, > > The patch 9312bc5bab59: "powerpc/powernv: Support EEH reset for VF > PE" from Mar 4, 2016, leads to the following static checker warning: > > arch/powerpc/platforms/powernv/eeh-powernv.c:1033 pnv_eeh_reset_vf_pe() > info: return a literal instead of 'ret' Hey Dan, Out of curiosity, what static analysis tool are you using? > > arch/powerpc/platforms/powernv/eeh-powernv.c > 1019 static int pnv_eeh_reset_vf_pe(struct eeh_pe *pe, int option) > 1020 { > 1021 struct eeh_dev *edev; > 1022 struct pci_dn *pdn; > 1023 int ret; > 1024 > 1025 /* The VF PE should have only one child device */ > 1026 edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, > list); > 1027 pdn = eeh_dev_to_pdn(edev); > 1028 if (!pdn) > 1029 return -ENXIO; > 1030 > 1031 ret = pnv_eeh_do_flr(pdn, option); > 1032 if (!ret) > 1033 return ret; > > Presumably if pnv_eeh_do_flr() succeeds then we don't need to continue? > pnv_eeh_do_af_flr() is a fall back option? Doing a: > > return 0; > > Makes this more deliberate looking. Sometimes people get their if > statements reversed is the other option. I can send a patch for this. > > 1034 > 1035 return pnv_eeh_do_af_flr(pdn, option); > 1036 } > > regards, > dan carpenter
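Dan's reading of the code, try the primary reset first and only fall back when it fails, can be illustrated with a userspace sketch (the function names here are made up, not the kernel's):

```c
/* Stand-ins for pnv_eeh_do_flr() / pnv_eeh_do_af_flr() behaviour. */
static int primary_ok;		/* test knob: does the primary reset work? */
static int fallback_calls;

static int do_primary_reset(void)
{
	return primary_ok ? 0 : -1;	/* 0 == success, kernel-style */
}

static int do_fallback_reset(void)
{
	fallback_calls++;
	return 0;
}

/*
 * The fallback pattern from pnv_eeh_reset_vf_pe(): returning a literal
 * 0 on the success path (rather than 'ret') makes the intent explicit,
 * which is what the static checker warning was hinting at.
 */
static int reset_pe(void)
{
	if (!do_primary_reset())
		return 0;	/* primary reset succeeded, stop here */
	return do_fallback_reset();
}
```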
Re: [PATCH 1/5] selftests: Fix selftests build to just build, not run tests
On 15 February 2017 03:14:24 GMT+11:00, Shuah Khan wrote: >On 02/13/2017 07:09 PM, Michael Ellerman wrote: >> Michael Ellerman writes: >> >>> In commit 88baa78d1f31 ("selftests: remove duplicated all and clean >>> target"), the "all" target was removed from individual Makefiles and >>> added to lib.mk. >>> >>> However the "all" target was added to lib.mk *after* the existing >>> "runtests" target. This means "runtests" becomes the first (default) >>> target for most of our Makefiles. >> ... >>> >>> Fix it by moving the "all" target to the start of lib.mk, making it >the >>> default target. >>> >>> Fixes: 88baa78d1f31 ("selftests: remove duplicated all and clean >target") >>> Signed-off-by: Michael Ellerman >> >> Hi Shuah, >> >> Can you please merge this series into linux-next? >> >> The selftests are badly broken otherwise. >> >> cheers >> > >Hi Michael, > >Thanks. All 5 patches are now in linux-kselftest next with Tested-by >tag from Bamvor for 1,2,3. Thanks! -- Sent from my Android phone with K-9 Mail. Please excuse my brevity.
[PATCH 1/3] powerpc: kprobes: add support for KPROBES_ON_FTRACE
Allow kprobes to be placed on ftrace _mcount() call sites. This optimization avoids the use of a trap, by riding on ftrace infrastructure. This depends on HAVE_DYNAMIC_FTRACE_WITH_REGS which depends on MPROFILE_KERNEL, which is only currently enabled on powerpc64le with newer toolchains. Based on the x86 code by Masami. Signed-off-by: Naveen N. Rao--- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/kprobes.h | 10 arch/powerpc/kernel/Makefile | 3 ++ arch/powerpc/kernel/kprobes-ftrace.c | 100 +++ arch/powerpc/kernel/kprobes.c| 4 +- arch/powerpc/kernel/optprobes.c | 3 ++ 6 files changed, 120 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/kernel/kprobes-ftrace.c diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 260dd6a371e0..78419919556d 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -101,6 +101,7 @@ config PPC select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU) select HAVE_KPROBES select HAVE_OPTPROBES if PPC64 + select HAVE_KPROBES_ON_FTRACE select HAVE_ARCH_KGDB select HAVE_KRETPROBES select HAVE_ARCH_TRACEHOOK diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h index e7ada061aa12..3305a12286fa 100644 --- a/arch/powerpc/include/asm/kprobes.h +++ b/arch/powerpc/include/asm/kprobes.h @@ -153,6 +153,16 @@ extern int kprobe_exceptions_notify(struct notifier_block *self, extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr); extern int kprobe_handler(struct pt_regs *regs); extern int kprobe_post_handler(struct pt_regs *regs); +#ifdef CONFIG_KPROBES_ON_FTRACE +extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs, + struct kprobe_ctlblk *kcb); +#else +static inline int skip_singlestep(struct kprobe *p, struct pt_regs *regs, + struct kprobe_ctlblk *kcb) +{ + return 0; +} +#endif #else static inline int kprobe_handler(struct pt_regs *regs) { return 0; } static inline int kprobe_post_handler(struct pt_regs *regs) { return 0; } diff --git 
a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index a048b37b9b27..88b21427ccc7 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -101,6 +101,7 @@ obj-$(CONFIG_BOOTX_TEXT)+= btext.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_KPROBES) += kprobes.o obj-$(CONFIG_OPTPROBES)+= optprobes.o optprobes_head.o +obj-$(CONFIG_KPROBES_ON_FTRACE)+= kprobes-ftrace.o obj-$(CONFIG_UPROBES) += uprobes.o obj-$(CONFIG_PPC_UDBG_16550) += legacy_serial.o udbg_16550.o obj-$(CONFIG_STACKTRACE) += stacktrace.o @@ -154,6 +155,8 @@ GCOV_PROFILE_machine_kexec_32.o := n UBSAN_SANITIZE_machine_kexec_32.o := n GCOV_PROFILE_kprobes.o := n UBSAN_SANITIZE_kprobes.o := n +GCOV_PROFILE_kprobes-ftrace.o := n +UBSAN_SANITIZE_kprobes-ftrace.o := n UBSAN_SANITIZE_vdso.o := n extra-$(CONFIG_PPC_FPU)+= fpu.o diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c new file mode 100644 index ..0377b3013723 --- /dev/null +++ b/arch/powerpc/kernel/kprobes-ftrace.c @@ -0,0 +1,100 @@ +/* + * Dynamic Ftrace based Kprobes Optimization + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) Hitachi Ltd., 2012 + * Copyright 2016 Naveen N. 
Rao + * IBM Corporation + */ +#include +#include +#include +#include +#include + +static nokprobe_inline +int __skip_singlestep(struct kprobe *p, struct pt_regs *regs, + struct kprobe_ctlblk *kcb, unsigned long orig_nip) +{ + /* +* Emulate singlestep (and also recover regs->nip) +* as if there is a nop +*/ + regs->nip = (unsigned long)p->addr + MCOUNT_INSN_SIZE; + if (unlikely(p->post_handler)) { + kcb->kprobe_status = KPROBE_HIT_SSDONE; + p->post_handler(p, regs, 0); + } + __this_cpu_write(current_kprobe, NULL); + if (orig_nip) + regs->nip = orig_nip; + return 1; +} + +int
[PATCH 3/3] powerpc: kprobes: prefer ftrace when probing function entry
KPROBES_ON_FTRACE avoids much of the overhead with regular kprobes as it eliminates the need for a trap, as well as the need to emulate or single-step instructions. Though OPTPROBES provides us with similar performance, we have limited optprobes trampoline slots. As such, when asked to probe at a function entry, default to using the ftrace infrastructure. With: # cd /sys/kernel/debug/tracing # echo 'p _do_fork' > kprobe_events before patch: # cat ../kprobes/list c00daf08 k _do_fork+0x8[DISABLED] c0044fc0 k kretprobe_trampoline+0x0[OPTIMIZED] and after patch: # cat ../kprobes/list c00d074c k _do_fork+0xc[DISABLED][FTRACE] c00412b0 k kretprobe_trampoline+0x0[OPTIMIZED] Signed-off-by: Naveen N. Rao--- arch/powerpc/include/asm/kprobes.h | 20 1 file changed, 20 insertions(+) diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h index 3305a12286fa..09e74edee625 100644 --- a/arch/powerpc/include/asm/kprobes.h +++ b/arch/powerpc/include/asm/kprobes.h @@ -60,12 +60,32 @@ extern kprobe_opcode_t optprobe_template_end[]; #ifdef PPC64_ELF_ABI_v2 /* PPC64 ABIv2 needs local entry point */ +#ifdef CONFIG_KPROBES_ON_FTRACE +/* + * Per livepatch.h, ftrace location is always within the first 16 bytes + * of a function on powerpc with -mprofile-kernel. + */ +#define kprobe_lookup_name(name, addr, offset) \ +{ \ + addr = (kprobe_opcode_t *)kallsyms_lookup_name(name); \ + if (addr && !(offset)) {\ + unsigned long faddr;\ + faddr = ftrace_location_range((unsigned long)addr, \ + (unsigned long)addr + 16);\ + if (faddr) \ + addr = (kprobe_opcode_t *)faddr;\ + else\ + addr = (kprobe_opcode_t *)ppc_function_entry(addr); \ + } \ +} +#else #define kprobe_lookup_name(name, addr, offset) \ { \ addr = (kprobe_opcode_t *)kallsyms_lookup_name(name); \ if (addr && !(offset)) \ addr = (kprobe_opcode_t *)ppc_function_entry(addr); \ } +#endif #elif defined(PPC64_ELF_ABI_v1) /* * 64bit powerpc ABIv1 uses function descriptors: -- 2.11.0
[PATCH 2/3] powerpc: introduce a new helper to obtain function entry points
kprobe_lookup_name() is specific to the kprobe subsystem and may not always return the function entry point (in a subsequent patch). For looking up function entry points, introduce a separate helper and use the same in optprobes.c Signed-off-by: Naveen N. Rao--- arch/powerpc/include/asm/code-patching.h | 37 arch/powerpc/kernel/optprobes.c | 4 ++-- 2 files changed, 39 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h index 8ab937771068..3e994f404434 100644 --- a/arch/powerpc/include/asm/code-patching.h +++ b/arch/powerpc/include/asm/code-patching.h @@ -12,6 +12,8 @@ #include #include +#include +#include /* Flags for create_branch: * "b" == create_branch(addr, target, 0); @@ -99,6 +101,41 @@ static inline unsigned long ppc_global_function_entry(void *func) #endif } +/* + * Wrapper around kallsyms_lookup() to return function entry address: + * - For ABIv1, we lookup the dot variant. + * - For ABIv2, we return the local entry point. 
+ */ +static inline unsigned long ppc_kallsyms_lookup_name(const char *name) +{ + unsigned long addr; +#ifdef PPC64_ELF_ABI_v1 + /* check for dot variant */ + char dot_name[1 + KSYM_NAME_LEN]; + bool dot_appended = false; + if (name[0] != '.') { + dot_name[0] = '.'; + dot_name[1] = '\0'; + strncat(dot_name, name, KSYM_NAME_LEN - 2); + dot_appended = true; + } else { + dot_name[0] = '\0'; + strncat(dot_name, name, KSYM_NAME_LEN - 1); + } + addr = kallsyms_lookup_name(dot_name); + if (!addr && dot_appended) + /* Let's try the original non-dot symbol lookup */ + addr = kallsyms_lookup_name(name); +#elif defined(PPC64_ELF_ABI_v2) + addr = kallsyms_lookup_name(name); + if (addr) + addr = ppc_function_entry((void *)addr); +#else + addr = kallsyms_lookup_name(name); +#endif + return addr; +} + #ifdef CONFIG_PPC64 /* * Some instruction encodings commonly used in dynamic ftracing diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c index a8f414a0b141..a2f6f28e5205 100644 --- a/arch/powerpc/kernel/optprobes.c +++ b/arch/powerpc/kernel/optprobes.c @@ -246,8 +246,8 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p) /* * 2. branch to optimized_callback() and emulate_step() */ - kprobe_lookup_name("optimized_callback", op_callback_addr, 0); - kprobe_lookup_name("emulate_step", emulate_step_addr, 0); + op_callback_addr = (kprobe_opcode_t *)ppc_kallsyms_lookup_name("optimized_callback"); + emulate_step_addr = (kprobe_opcode_t *)ppc_kallsyms_lookup_name("emulate_step"); if (!op_callback_addr || !emulate_step_addr) { WARN(1, "kprobe_lookup_name() failed\n"); goto error; -- 2.11.0
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
On 14/02/2017 09:37, Michael Ellerman wrote: > "Guilherme G. Piccoli"writes: > >> Currently the xmon debugger is set only via kernel boot command-line. >> It's disabled by default, and can be enabled with "xmon=on" on the >> command-line. Also, xmon may be accessed via sysrq mechanism, but once >> we enter xmon via sysrq, it's kept enabled until system is rebooted, >> even if we exit the debugger. A kernel crash will then lead to xmon >> instance, instead of triggering a kdump procedure (if configured), for >> example. >> >> This patch introduces a debugfs entry for xmon, allowing user to query >> its current state and change it if desired. Basically, the "xmon" file >> to read from/write to is under the debugfs mount point, on powerpc >> directory. Reading this file will provide the current state of the >> debugger, one of the following: "on", "off", "early" or "nobt". Writing >> one of these states to the file will take immediate effect on the debugger. > > I like this in general. > > But I think we can simplify it a bit. > > I don't think we need the nobt state anymore. As far as I can see it was > added as a way to reinstate the old behaviour when the auto backtrace > mode was added, but I've never heard of anyone using it. > > If anyone hits a crash where they really need that feature they can > always just hack the code to disable the backtrace. > > So I think step 1 is a patch to drop the xmon_no_auto_backtrace stuff. > > Also I'm not sure EARLY needs to be a separate state. It's just a > one-off invocation at boot, all we really need is just a single bit of > state communicated from early_parse_xmon() to xmon_setup(), ie. a static > bool would do. > > If we agree with that, then there's only two states left, on and off, in > which case it can probably just be an int - and we can use a simple > attribute file rather than custom parsing. > Thanks for the good suggestions Michael, I'll change the code and submit a new version. 
>> * I had this patch partially done for some time, and after a discussion at the kernel slack channel last week, I decided to rebase and fix some remaining bugs. I'd change the 'x' option to always disable the debugger, > > Not quite. > > 'x' should exit and leave xmon in whatever state it was previously in. Agreed! Thanks, Guilherme > > cheers >
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
On 14/02/2017 01:58, Pan Xinhui wrote: > > > On 2017/2/14 10:35, Nicholas Piggin wrote: >> On Mon, 13 Feb 2017 19:00:42 -0200 >> >> xmon state changing after the first sysrq+x violates the principle of least astonishment, so I think that should be fixed. >> > hi, Nick > yes, as long as xmon is disabled during boot, it should still be disabled after exiting xmon. > My patch does not fix that, as it needs people to add one more char 'z' following 'x'. > I will provide a new patch to fix that. > >> Then the question is, is it worth making it runtime configurable with xmon commands or debugfs tunables? >> > They are options for people to turn xmon features on or off. Maybe people don't need this. > However, I am not a fan of debugfs this time as I am used to using xmon cmds. :) > > Hi, Guilherme > So in the end, my thought is that: 1) cmd x|X will exit xmon and keep xmon in the original state (indicated by the var xmon_off). > 2) Then add options to turn some features on/off. And debugfs may not fit for this. But I am also wondering, at the same time, do people need this? Hi Nick and Xinhui, thanks very much for the feedback. I agree, we should keep xmon in the state it was first set at boot time - dropping to the debugger using sysrq shouldn't change it. Now, the use case of the debugfs approach is to allow the user to enable/disable xmon without needing to drop into the debugger itself, or reboot the machine. Imagine a scenario in which we have a production machine, and: i) For some reason, somebody kept xmon enabled in grub.cfg and now we want to let kdump work in case of crash - how do we disable xmon at runtime? ii) The opposite: xmon wasn't enabled at boot time on the production machine, but we have a super-rare issue and want to drop to xmon next time it happens, so we need to enable it. But we don't want to drop into the debugger just to force it to be enabled, so how do we enable it?
Regarding the place of the xmon state file, I believe debugfs is the right place - where else could we add it? procfs? configfs? Thanks, Guilherme > > thanks > xinhui > >> Thanks, >> Nick >> >
[PATCH 3/3] powerpc/mm: move mmap_sem unlocking in do_page_fault()
Since the fault retry is now handled earlier, we can release the mmap_sem lock earlier too and remove later unlocking previously done in mm_fault_error(). Signed-off-by: Laurent Dufour--- arch/powerpc/mm/fault.c | 19 --- 1 file changed, 4 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 2a6bc7e6e69a..21e06cce8984 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -151,13 +151,6 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) * continue the pagefault. */ if (fatal_signal_pending(current)) { - /* -* If we have retry set, the mmap semaphore will have -* alrady been released in __lock_page_or_retry(). Else -* we release it now. -*/ - if (!(fault & VM_FAULT_RETRY)) - up_read(>mm->mmap_sem); /* Coming from kernel, we need to deal with uaccess fixups */ if (user_mode(regs)) return MM_FAULT_RETURN; @@ -170,8 +163,6 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) /* Out of memory */ if (fault & VM_FAULT_OOM) { - up_read(>mm->mmap_sem); - /* * We ran out of memory, or some other thing happened to us that * made us unable to handle the page fault gracefully. 
@@ -182,10 +173,8 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) return MM_FAULT_RETURN; } - if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) { - up_read(>mm->mmap_sem); + if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) return do_sigbus(regs, addr, fault); - } /* We don't understand the fault code, this is fatal */ BUG(); @@ -452,11 +441,12 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, goto retry; } /* We will enter mm_fault_error() below */ - } + } else + up_read(>mm->mmap_sem); if (unlikely(fault & (VM_FAULT_RETRY|VM_FAULT_ERROR))) { if (fault & VM_FAULT_SIGSEGV) - goto bad_area; + goto bad_area_nosemaphore; rc = mm_fault_error(regs, address, fault); if (rc >= MM_FAULT_RETURN) goto bail; @@ -488,7 +478,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, regs, address); } - up_read(>mmap_sem); goto bail; bad_area: -- 2.7.4
[PATCH 2/3] powerpc/mm: handle VM_FAULT_RETRY earlier
In do_page_fault() if handle_mm_fault() returns VM_FAULT_RETRY, retry the page fault handling before anything else. This would simplify the handling of the mmap_sem lock in this part of the code. Signed-off-by: Laurent Dufour--- arch/powerpc/mm/fault.c | 67 - 1 file changed, 38 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index ee09604bbe12..2a6bc7e6e69a 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -434,6 +434,26 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, * the fault. */ fault = handle_mm_fault(vma, address, flags); + + /* +* Handle the retry right now, the mmap_sem has been released in that +* case. +*/ + if (unlikely(fault & VM_FAULT_RETRY)) { + /* We retry only once */ + if (flags & FAULT_FLAG_ALLOW_RETRY) { + /* +* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk +* of starvation. +*/ + flags &= ~FAULT_FLAG_ALLOW_RETRY; + flags |= FAULT_FLAG_TRIED; + if (!fatal_signal_pending(current)) + goto retry; + } + /* We will enter mm_fault_error() below */ + } + if (unlikely(fault & (VM_FAULT_RETRY|VM_FAULT_ERROR))) { if (fault & VM_FAULT_SIGSEGV) goto bad_area; @@ -445,38 +465,27 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, } /* -* Major/minor page fault accounting is only done on the -* initial attempt. If we go through a retry, it is extremely -* likely that the page will be found in page cache at that point. +* Major/minor page fault accounting. 
*/ - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_MAJOR) { - current->maj_flt++; - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, - regs, address); + if (fault & VM_FAULT_MAJOR) { + current->maj_flt++; + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, + regs, address); #ifdef CONFIG_PPC_SMLPAR - if (firmware_has_feature(FW_FEATURE_CMO)) { - u32 page_ins; - - preempt_disable(); - page_ins = be32_to_cpu(get_lppaca()->page_ins); - page_ins += 1 << PAGE_FACTOR; - get_lppaca()->page_ins = cpu_to_be32(page_ins); - preempt_enable(); - } -#endif /* CONFIG_PPC_SMLPAR */ - } else { - current->min_flt++; - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, - regs, address); - } - if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk -* of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; - flags |= FAULT_FLAG_TRIED; - goto retry; + if (firmware_has_feature(FW_FEATURE_CMO)) { + u32 page_ins; + + preempt_disable(); + page_ins = be32_to_cpu(get_lppaca()->page_ins); + page_ins += 1 << PAGE_FACTOR; + get_lppaca()->page_ins = cpu_to_be32(page_ins); + preempt_enable(); } +#endif /* CONFIG_PPC_SMLPAR */ + } else { + current->min_flt++; + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, + regs, address); } up_read(>mmap_sem); -- 2.7.4
[PATCH 1/3] powerpc/mm: move mmap_sem unlock up from do_sigbus
Move the mmap_sem release into do_sigbus()'s only caller, mm_fault_error(). No functional changes. Signed-off-by: Laurent Dufour --- arch/powerpc/mm/fault.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 62a50d6d1053..ee09604bbe12 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -119,8 +119,6 @@ static int do_sigbus(struct pt_regs *regs, unsigned long address, siginfo_t info; unsigned int lsb = 0; - up_read(&current->mm->mmap_sem); - if (!user_mode(regs)) return MM_FAULT_ERR(SIGBUS); @@ -184,8 +182,10 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) return MM_FAULT_RETURN; } - if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) + if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) { + up_read(&current->mm->mmap_sem); return do_sigbus(regs, addr, fault); + } /* We don't understand the fault code, this is fatal */ BUG(); -- 2.7.4
[PATCH 0/3] powerpc/mm: page fault handler cleaning
This series attempts to clean the page fault handler in the way it has been done previously for the x86 architecture [1]. The goal is to manage the mmap_sem earlier and only in do_page_fault(). This done by handling the retry case earlier, before handling the error case. This way the semaphore can be released earlier and the error path processed without holding it. The first patch is just moving a unlock to the caller of the service, which as no functional impact. The second patch is handling the retry case earlier in do_page_fault(). This is where most the change are done, but I was conservative here, not changing the use of mm_fault_error() in the case of the second retry. It may be smarter to handle that case separately but this may create duplicate code. The last patch is moving up semaphore releasing from mm_fault_error() to do_page_fault(). [1] see commits from Linus Torvalds 26178ec11ef3 ("x86: mm: consolidate VM_FAULT_RETRY handling") 7fb08eca4527 ("x86: mm: move mmap_sem unlock from mm_fault_error() to caller") Laurent Dufour (3): powerpc/mm: move mmap_sem unlock up from do_sigbus powerpc/mm: handle VM_FAULT_RETRY earlier powerpc/mm: move mmap_sem unlocking in do_page_fault() arch/powerpc/mm/fault.c | 82 - 1 file changed, 40 insertions(+), 42 deletions(-) -- 2.7.4
Re: [PATCH 1/5] selftests: Fix selftests build to just build, not run tests
On 02/13/2017 07:09 PM, Michael Ellerman wrote: > Michael Ellerman writes: > >> In commit 88baa78d1f31 ("selftests: remove duplicated all and clean target"), the "all" target was removed from individual Makefiles and added to lib.mk. >> >> However the "all" target was added to lib.mk *after* the existing "runtests" target. This means "runtests" becomes the first (default) target for most of our Makefiles. > ... >> >> Fix it by moving the "all" target to the start of lib.mk, making it the default target. >> >> Fixes: 88baa78d1f31 ("selftests: remove duplicated all and clean target") >> Signed-off-by: Michael Ellerman > > Hi Shuah, > > Can you please merge this series into linux-next? > > The selftests are badly broken otherwise. > > cheers > Hi Michael, Thanks. All 5 patches are now in linux-kselftest next with Tested-by tag from Bamvor for 1, 2, 3. thank you both, -- Shuah
[bug report] powernv/opal: Convert opal message events to opal irq domain
Hello Alistair Popple, The patch a295af24d0d2: "powernv/opal: Convert opal message events to opal irq domain" from May 15, 2015, leads to the following static checker warning: arch/powerpc/platforms/powernv/opal.c:297 opal_message_init() info: return a literal instead of 'irq' arch/powerpc/platforms/powernv/opal.c 286 static int __init opal_message_init(void) 287 { 288 int ret, i, irq; 289 290 for (i = 0; i < OPAL_MSG_TYPE_MAX; i++) 291 ATOMIC_INIT_NOTIFIER_HEAD(&opal_msg_notifier_head[i]); 292 293 irq = opal_event_request(ilog2(OPAL_EVENT_MSG_PENDING)); 294 if (!irq) { 295 pr_err("%s: Can't register OPAL event irq (%d)\n", 296 __func__, irq); 297 return irq; This code doesn't really make sense. I'm not really certain what it should be... 298 } 299 300 ret = request_irq(irq, opal_message_notify, 301 IRQ_TYPE_LEVEL_HIGH, "opal-msg", NULL); 302 if (ret) { 303 pr_err("%s: Can't request OPAL event irq (%d)\n", 304 __func__, ret); 305 return ret; 306 } 307 308 return 0; 309 } regards, dan carpenter
[PATCH v4 2/2] powerpc: emulate_step tests for load/store instructions
Add new selftest that test emulate_step for Normal, Floating Point, Vector and Vector Scalar - load/store instructions. Test should run at boot time if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Sample log: [0.762063] emulate_step smoke test: start. [0.762219] emulate_step smoke test: ld : PASS [0.762434] emulate_step smoke test: lwz: PASS [0.762653] emulate_step smoke test: lwzx : PASS [0.762867] emulate_step smoke test: std: PASS [0.763082] emulate_step smoke test: ldarx / stdcx. : PASS [0.763302] emulate_step smoke test: lfsx : PASS [0.763514] emulate_step smoke test: stfsx : PASS [0.763727] emulate_step smoke test: lfdx : PASS [0.763942] emulate_step smoke test: stfdx : PASS [0.764134] emulate_step smoke test: lvx: PASS [0.764349] emulate_step smoke test: stvx : PASS [0.764575] emulate_step smoke test: lxvd2x : PASS [0.764788] emulate_step smoke test: stxvd2x: PASS [0.764997] emulate_step smoke test: complete. Signed-off-by: Ravi Bangoria--- arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/lib/Makefile | 2 + arch/powerpc/lib/test_emulate_step.c | 443 ++ 3 files changed, 452 insertions(+) create mode 100644 arch/powerpc/lib/test_emulate_step.c diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index d99bd44..e7d6d86 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -284,6 +284,13 @@ #define PPC_INST_BRANCH_COND 0x4080 #define PPC_INST_LBZCIX0x7c0006aa #define PPC_INST_STBCIX0x7c0007aa +#define PPC_INST_LWZX 0x7c2e +#define PPC_INST_LFSX 0x7c00042e +#define PPC_INST_STFSX 0x7c00052e +#define PPC_INST_LFDX 0x7c0004ae +#define PPC_INST_STFDX 0x7c0005ae +#define PPC_INST_LVX 0x7cce +#define PPC_INST_STVX 0x7c0001ce /* macros to insert fields into opcodes */ #define ___PPC_RA(a) (((a) & 0x1f) << 16) diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index 0e649d7..c0d6e79 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ 
-32,4 +32,6 @@ obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o obj-$(CONFIG_ALTIVEC) += xor_vmx.o CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) +obj64-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o + obj-$(CONFIG_PPC64) += $(obj64-y) diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c new file mode 100644 index 000..4e3bce9 --- /dev/null +++ b/arch/powerpc/lib/test_emulate_step.c @@ -0,0 +1,443 @@ +/* + * test_emulate_step.c - simple sanity test for emulate_step load/store + * instructions + * + * Copyright IBM Corp. 2016 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + */ + +#define pr_fmt(fmt) "emulate_step smoke test: " fmt + +#include +#include +#include + +#define IMM_L(i) ((uintptr_t)(i) & 0x) + +/* + * Defined with TEST_ prefix so it does not conflict with other + * definitions. 
+ */ +#define TEST_LD(r, base, i)(PPC_INST_LD | ___PPC_RT(r) | \ + ___PPC_RA(base) | IMM_L(i)) +#define TEST_LWZ(r, base, i) (PPC_INST_LWZ | ___PPC_RT(r) | \ + ___PPC_RA(base) | IMM_L(i)) +#define TEST_LWZX(t, a, b) (PPC_INST_LWZX | ___PPC_RT(t) | \ + ___PPC_RA(a) | ___PPC_RB(b)) +#define TEST_STD(r, base, i) (PPC_INST_STD | ___PPC_RS(r) | \ + ___PPC_RA(base) | ((i) & 0xfffc)) +#define TEST_LDARX(t, a, b, eh)(PPC_INST_LDARX | ___PPC_RT(t) | \ + ___PPC_RA(a) | ___PPC_RB(b) | \ + __PPC_EH(eh)) +#define TEST_STDCX(s, a, b)(PPC_INST_STDCX | ___PPC_RS(s) |\ + ___PPC_RA(a) | ___PPC_RB(b)) +#define TEST_LFSX(t, a, b) (PPC_INST_LFSX | ___PPC_RT(t) | \ + ___PPC_RA(a) | ___PPC_RB(b)) +#define TEST_STFSX(s, a, b)
[PATCH v4 1/2] powerpc: Emulation support for load/store instructions on LE
emulate_step() uses a number of underlying kernel functions that were initially not enabled for LE. This has been rectified since. So, fix emulate_step() for LE for the corresponding instructions. Reported-by: Anton BlanchardSigned-off-by: Ravi Bangoria --- arch/powerpc/lib/sstep.c | 20 1 file changed, 20 deletions(-) diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c index 846dba2..9c542ec 100644 --- a/arch/powerpc/lib/sstep.c +++ b/arch/powerpc/lib/sstep.c @@ -1799,8 +1799,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto instr_done; case LARX: - if (regs->msr & MSR_LE) - return 0; if (op.ea & (size - 1)) break; /* can't handle misaligned */ if (!address_ok(regs, op.ea, size)) @@ -1823,8 +1821,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto ldst_done; case STCX: - if (regs->msr & MSR_LE) - return 0; if (op.ea & (size - 1)) break; /* can't handle misaligned */ if (!address_ok(regs, op.ea, size)) @@ -1849,8 +1845,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto ldst_done; case LOAD: - if (regs->msr & MSR_LE) - return 0; err = read_mem(>gpr[op.reg], op.ea, size, regs); if (!err) { if (op.type & SIGNEXT) @@ -1862,8 +1856,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #ifdef CONFIG_PPC_FPU case LOAD_FP: - if (regs->msr & MSR_LE) - return 0; if (size == 4) err = do_fp_load(op.reg, do_lfs, op.ea, size, regs); else @@ -1872,15 +1864,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #endif #ifdef CONFIG_ALTIVEC case LOAD_VMX: - if (regs->msr & MSR_LE) - return 0; err = do_vec_load(op.reg, do_lvx, op.ea & ~0xfUL, regs); goto ldst_done; #endif #ifdef CONFIG_VSX case LOAD_VSX: - if (regs->msr & MSR_LE) - return 0; err = do_vsx_load(op.reg, do_lxvd2x, op.ea, regs); goto ldst_done; #endif @@ -1903,8 +1891,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto instr_done; case STORE: - if (regs->msr & 
MSR_LE) - return 0; if ((op.type & UPDATE) && size == sizeof(long) && op.reg == 1 && op.update_reg == 1 && !(regs->msr & MSR_PR) && @@ -1917,8 +1903,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #ifdef CONFIG_PPC_FPU case STORE_FP: - if (regs->msr & MSR_LE) - return 0; if (size == 4) err = do_fp_store(op.reg, do_stfs, op.ea, size, regs); else @@ -1927,15 +1911,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #endif #ifdef CONFIG_ALTIVEC case STORE_VMX: - if (regs->msr & MSR_LE) - return 0; err = do_vec_store(op.reg, do_stvx, op.ea & ~0xfUL, regs); goto ldst_done; #endif #ifdef CONFIG_VSX case STORE_VSX: - if (regs->msr & MSR_LE) - return 0; err = do_vsx_store(op.reg, do_stxvd2x, op.ea, regs); goto ldst_done; #endif -- 1.8.3.1
[PATCH v4 0/2] powerpc: Emulation support for load/store instructions on LE
emulate_step is the basic infrastructure which is used by number of other kernel infrastructures like kprobe, hw-breakpoint(data breakpoint) etc. In case of kprobe, enabling emulation of load/store instructions will speedup the execution of probed instruction. In case of kernel-space breakpoint, causative instruction is first get emulated before executing user registered handler. If emulation fails, hw-breakpoint is disabled with error. As emulate_step does not support load/store instructions on LE, kernel-space hw-breakpoint infrastructure is broken on LE. emulate_step() uses a number of underlying kernel functions that were initially not enabled for LE. This has been rectified since. So, fix emulate_step() for LE for the corresponding instructions. Also add selftest which will run at boot if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Changes in v4: - Used late_initcall() for selftest - Makefile changes v3 link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1332686.html Ravi Bangoria (2): powerpc: Emulation support for load/store instructions on LE powerpc: emulate_step tests for load/store instructions arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/lib/Makefile | 2 + arch/powerpc/lib/sstep.c | 20 -- arch/powerpc/lib/test_emulate_step.c | 443 ++ 4 files changed, 452 insertions(+), 20 deletions(-) create mode 100644 arch/powerpc/lib/test_emulate_step.c -- 1.8.3.1
[bug report] powerpc/powernv: Support EEH reset for VF PE
Hello Wei Yang, The patch 9312bc5bab59: "powerpc/powernv: Support EEH reset for VF PE" from Mar 4, 2016, leads to the following static checker warning: arch/powerpc/platforms/powernv/eeh-powernv.c:1033 pnv_eeh_reset_vf_pe() info: return a literal instead of 'ret' arch/powerpc/platforms/powernv/eeh-powernv.c 1019 static int pnv_eeh_reset_vf_pe(struct eeh_pe *pe, int option) 1020 { 1021 struct eeh_dev *edev; 1022 struct pci_dn *pdn; 1023 int ret; 1024 1025 /* The VF PE should have only one child device */ 1026 edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, list); 1027 pdn = eeh_dev_to_pdn(edev); 1028 if (!pdn) 1029 return -ENXIO; 1030 1031 ret = pnv_eeh_do_flr(pdn, option); 1032 if (!ret) 1033 return ret; Presumably if pnv_eeh_do_flr() succeeds then we don't need to continue? pnv_eeh_do_af_flr() is a fall back option? Doing a: return 0; Makes this look more deliberate. Sometimes people get their if statements reversed, which is the other option. 1034 1035 return pnv_eeh_do_af_flr(pdn, option); 1036 } regards, dan carpenter
Re: linux-next: manual merge of the kvm tree with the powerpc tree
On 14/02/2017 09:45, Michael Ellerman wrote: >> If possible, please pull only up to "powerpc/64: Allow for relocation-on >> interrupts from guest to host" and cherry-pick the top two patches >> ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts" and >> "powerpc/powernv: Remove separate entry for OPAL real mode calls") into >> your next branch, but leave the rest for my tree only. > > I don't see how that helps anything. > > In fact it guarantees a mess because those two commits would now go to > Linus via my tree (cherry picked) and via Paul's as part of his second > merge of the topic branch. > > So unless you can give me a good reason I'll merge the tip of the topic > branch into my next, as planned. Yes, Paul's second merge did guarantee a mess, so go ahead. However, the reason was that this is simply not how topic branches should work: topic branches should be the base for other work, they shouldn't contain _all_ the work. So the right workflow would have been: - Paul submits topic branch A to you - you merge A - Paul merges topic branch A into his "next" branch - Paul applies KVM-specific patches B1 on top of his "next" branch. - Paul sends pull request to me (with A + kvmppc work). As far as I understand, there was no reason for you to get B1. The last two patches (let's call them B2) also didn't need to go through the kvm-ppc branch at all. You could have applied them directly on top of A. Linus then would get A and B2 from you, and A and B1 from me: base -→ A -→ B1 ↓↓ ppc -→ ▪▪ ←- kvm ↓| B2 | ↓↓ torvalds/linux.git If necessary, things could have been arranged so that Linus got A and B2 from you, and all three of A/B1/B2 from me: - Paul submits topic branch B2 to you, based on topic branch A - you merge B2 - Paul merges B2 and I get it from him The result would have been: base -→ A -→ B1 ↓ ↘ ↓ ppc -→ ▪ B2 → ▪ ↓ ↙ ↓ ▪▪ ←- kvm ↓↓ torvalds/linux.git Paolo
Re: powerpc: Add a prototype for mcount() so it can be versioned
On Fri, 2017-02-10 at 02:40:02 UTC, Michael Ellerman wrote: > Currently we get a warning that _mcount() can't be versioned: > > WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed, > symbol will not be versioned. > > Add a prototype to asm-prototypes.h to fix it. > > The prototype is not really correct, mcount() is not a normal function, > it has a special ABI. But for the purpose of versioning it doesn't > matter. > > Signed-off-by: Michael Ellerman Applied to powerpc next. https://git.kernel.org/powerpc/c/99ad503287daf78e19e64e0e51f1d6 cheers
Re: powerpc: Fix confusing help text for DISABLE_MPROFILE_KERNEL
On Fri, 2017-02-10 at 01:16:59 UTC, Anton Blanchard wrote: > From: Anton Blanchard> > The final paragraph of the help text is reversed - we want to > enable this option by default, and disable it if the toolchain > has a working -mprofile-kernel. > > Signed-off-by: Anton Blanchard Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/496e9cb5b2aa2ba303d2bbd08518f9 cheers
Re: [v4,1/2] arch/powerpc: Implement Optprobes
On Wed, 2017-02-08 at 09:50:51 UTC, Anju T wrote: > Current infrastructure of kprobe uses the unconditional trap instruction > to probe a running kernel. Optprobe allows kprobe to replace the trap with > a branch instruction to a detour buffer. Detour buffer contains instructions > to create an in memory pt_regs. Detour buffer also has a call to > optimized_callback() which in turn call the pre_handler(). > After the execution of the pre-handler, a call is made for instruction > emulation. The NIP is determined in advanced through dummy instruction > emulation and a branch instruction is created to the NIP at the end of > the trampoline. > > To address the limitation of branch instruction in POWER architecture, > detour buffer slot is allocated from a reserved area. For the time being, > 64KB is reserved in memory for this purpose. > > Instructions which can be emulated using analyse_instr() are the candidates > for optimization. Before optimization ensure that the address range > between the detour buffer allocated and the instruction being probed > is within +/- 32MB. > > Signed-off-by: Anju T Sudhakar> Signed-off-by: Naveen N. Rao > Acked-by: Masami Hiramatsu Series applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/51c9c0843993528bffc920c54c2121 cheers
Re: [1/3] powerpc: asm/ppc-opcode.h: introduce __PPC_SH64()
On Wed, 2017-02-08 at 08:57:29 UTC, Anju T wrote: > From: "Naveen N. Rao"> > Introduce __PPC_SH64() as a 64-bit variant to encode shift field in some > of the shift and rotate instructions operating on double-words. Convert > some of the BPF instruction macros to use the same. > > Signed-off-by: Naveen N. Rao Series applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/c233f5979b3dbb39a5b2473b5fcaf5 cheers
Re: [3/3] powerpc: kprobes: remove kprobe_exceptions_notify()
On Tue, 2017-02-07 at 19:54:16 UTC, "Naveen N. Rao" wrote: > ... as the weak variant will do. > > Signed-off-by: Naveen N. Rao Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/0ddde5004d26c483c9e67005b2be5b cheers
Re: [1/3] kprobes: introduce weak variant of kprobe_exceptions_notify
On Tue, 2017-02-07 at 19:54:14 UTC, "Naveen N. Rao" wrote: > kprobe_exceptions_notify() is not used on some of the architectures such > as arm[64] and powerpc anymore. Introduce a weak variant for such > architectures. > > Signed-off-by: Naveen N. Rao> Acked-by: Masami Hiramatsu Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/fc62d0207ae0ebc9d19df68394c0dc cheers
Re: powerpc/powernv: Fix opal_exit tracepoint opcode
On Tue, 2017-02-07 at 10:01:01 UTC, Michael Ellerman wrote: > Currently the opal_exit tracepoint usually shows the opcode as 0: > > <idle>-0 [047] d.h. 635.654292: opal_entry: opcode=63 > <idle>-0 [047] d.h. 635.654296: opal_exit: opcode=0 retval=0 > kopald-1209 [019] d... 636.420943: opal_entry: opcode=10 > kopald-1209 [019] d... 636.420959: opal_exit: opcode=0 retval=0 > > This is because we incorrectly load the opcode into r0 before calling > __trace_opal_exit(), whereas it expects the opcode in r3 (first function > parameter). In fact we are leaving the retval in r3, so opcode and > retval will always show the same value. > > Instead load the opcode into r3, resulting in: > > <idle>-0 [040] d.h. 636.618625: opal_entry: opcode=63 > <idle>-0 [040] d.h. 636.618627: opal_exit: opcode=63 retval=0 > > Fixes: c49f63530bb6 ("powernv: Add OPAL tracepoints") > Signed-off-by: Michael Ellerman Applied to powerpc next. https://git.kernel.org/powerpc/c/a7e0fb6c2029a780444d09560f739e cheers
Re: powerpc: Fix inconsistent of_node_to_nid EXPORT_SYMBOL handling
On Wed, 2017-02-01 at 22:52:42 UTC, Shailendra Singh wrote: > The generic implementation of of_node_to_nid is EXPORT_SYMBOL. > > The powerpc implementation added by following commit is EXPORT_SYMBOL_GPL. > commit 953039c8df7b ("[PATCH] powerpc: Allow devices to register with numa > topology") > > This creates an inconsistency for of_node_to_nid callers across > architectures. > > Update the powerpc implementation to be exported consistently with the > generic implementation. > > Signed-off-by: Shailendra Singh Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/be9ba9ff93cc3e44dc46da9ed25655 cheers
Re: [-next] powerpc/pseries: Fix typo in parameter description
On Thu, 2017-01-12 at 15:09:21 UTC, Wei Yongjun wrote: > From: Wei Yongjun> > Fix typo in parameter description. > > Signed-off-by: Wei Yongjun Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/b0b5a76579ea62a9eeb720e71fdaa9 cheers
Re: [PATCHv2, 1/4] pseries: Add hypercall wrappers for hash page table resizing
On Fri, 2016-12-09 at 00:07:35 UTC, David Gibson wrote: > This adds the hypercall numbers and wrapper functions for the hash page > table resizing hypercalls. > > It also adds a new firmware feature flag to track the presence of the > HPT resizing calls. > > Signed-off-by: David Gibson> Reviewed-by: Paul Mackerras Series applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/64b40ffbc83029f035571cad9727e3 cheers
Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementation of saved write
Michael Ellerman writes: > "Aneesh Kumar K.V" writes: >> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h >> b/arch/powerpc/include/asm/book3s/64/mmu-hash.h >> index 0735d5a8049f..8720a406bbbe 100644 >> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h >> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h >> @@ -16,6 +16,9 @@ >> #include >> #include >> >> +#ifndef __ASSEMBLY__ >> +#include >> +#endif > > I assume that's for the VM_BUG_ON() you add below. But if so wouldn't > the #include be better placed in book3s/64/pgtable.h also? mmu-hash.h has got a hack that is explained below: #ifndef __ASSEMBLY__ #include #endif /* * This is necessary to get the definition of PGTABLE_RANGE which we * need for various slices related matters. Note that this isn't the * complete pgtable.h but only a portion of it. */ #include This is the only place where we include book3s/64/pgtable.h this way. Everybody else should include asm/pgtable.h, which picks the right version based on the config options. > >> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h >> b/arch/powerpc/include/asm/book3s/64/pgtable.h >> index fef738229a68..c684ef6cbd10 100644 >> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h >> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h >> @@ -512,6 +512,32 @@ static inline pte_t pte_mkhuge(pte_t pte) >> return pte; >> } >> >> +#define pte_mk_savedwrite pte_mk_savedwrite >> +static inline pte_t pte_mk_savedwrite(pte_t pte) >> +{ >> +/* >> + * Used by Autonuma subsystem to preserve the write bit >> + * while marking the pte PROT_NONE. Only allow this >> + * on PROT_NONE pte >> + */ >> +VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | >> _PAGE_PRIVILEGED)) != >> + cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED)); >> +return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED); >> +} >> + > > > cheers -aneesh
Re: [PATCH] powerpc/xmon: add turn off xmon option
Pan Xinhui writes: > Once xmon is triggered, there is no interface to turn it off again. > However there exist disable/enable xmon code flows. And more importantly, > a system reset interrupt on PowerVM will fire an oops to make a dump. At > that time, xmon should not be triggered. > > So add a 'z' option after the current 'x|X' exit commands. Turn xmon off if 'z' > follows. Thanks for trying to clear this up. But I like the solution we discussed in the other thread better. ie. entering/leaving xmon via sysrq will always leave the state unchanged, and we'll add a debugfs file to change the state. cheers
[PATCH] powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
Currently the build breaks if CMA=n and SPAPR_TCE_IOMMU=y: arch/powerpc/mm/mmu_context_iommu.c: In function ‘mm_iommu_get’: arch/powerpc/mm/mmu_context_iommu.c:193:42: error: ‘MIGRATE_CMA’ undeclared (first use in this function) if (get_pageblock_migratetype(page) == MIGRATE_CMA) { ^~~ Fix it by using the existing is_migrate_cma_page(), which evaluates to false when CMA=n. Fixes: 2e5bbb5461f1 ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA") Signed-off-by: Michael Ellerman --- arch/powerpc/mm/mmu_context_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c index 104bad029ce9..7de7124ac91b 100644 --- a/arch/powerpc/mm/mmu_context_iommu.c +++ b/arch/powerpc/mm/mmu_context_iommu.c @@ -184,7 +184,7 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries, * of the CMA zone if possible. NOTE: faulting in + migration * can be expensive. Batching can be considered later */ - if (get_pageblock_migratetype(page) == MIGRATE_CMA) { + if (is_migrate_cma_page(page)) { if (mm_iommu_move_page_from_cma(page)) goto populate; if (1 != get_user_pages_fast(ua + (i << PAGE_SHIFT), -- 2.7.4
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
Pan Xinhui writes: > On 2017/2/14 10:35, Nicholas Piggin wrote: >> On Mon, 13 Feb 2017 19:00:42 -0200 >> "Guilherme G. Piccoli" wrote: >>> * I had this patch partially done for some time, and after a discussion >>> at the kernel slack channel last week, I decided to rebase and fix >>> some remaining bugs. I'd change 'x' option to always disable the debugger, >>> since with this patch we can always re-enable xmon, but today I noticed >>> Pan's patch on the mailing list, so perhaps his approach of adding a flag >>> to 'x' option is preferable. I can change this in a V2, if requested. >>> Thanks in advance! >> >> xmon state changing after the first sysrq+x violates the principle of least >> astonishment, so I think that should be fixed. > > Hi, Guilherme > So in the end, my thought is that: > 1) cmd x|X will exit xmon and keep xmon in the original state Agreed. > 2) Then add options to turn some features on/off. And debugfs may > not fit for this. But I am also wondering at the same time, are people > needing this? I think it's useful. Especially seeing as Guilherme has already written the code. And yeah I think debugfs is the right place to do it. It means you can query and/or set the xmon state without crashing the box - which is what entering xmon essentially does, even if you later recover. cheers
Re: [PATCH v3 2/2] powerpc: emulate_step tests for load/store instructions
On Tuesday 14 February 2017 04:16 PM, Michael Ellerman wrote: > Ravi Bangoria writes: > >> diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile >> index 0e649d7..ddc879d 100644 >> --- a/arch/powerpc/lib/Makefile >> +++ b/arch/powerpc/lib/Makefile >> @@ -33,3 +33,7 @@ obj-$(CONFIG_ALTIVEC) += xor_vmx.o >> CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) >> >> obj-$(CONFIG_PPC64) += $(obj64-y) >> + >> +ifeq ($(CONFIG_PPC64), y) >> +obj-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o >> +endif > FYI, the right way to do that is: > > obj64-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o > obj-$(CONFIG_PPC64) += $(obj64-y) > > And in this Makefile you don't need to add the second line because it's > already there. Thanks for this. Will change accordingly and send next version. Ravi > > cheers >
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
"Guilherme G. Piccoli" writes: > Currently the xmon debugger is set only via kernel boot command-line. > It's disabled by default, and can be enabled with "xmon=on" on the > command-line. Also, xmon may be accessed via sysrq mechanism, but once > we enter xmon via sysrq, it's kept enabled until system is rebooted, > even if we exit the debugger. A kernel crash will then lead to xmon > instance, instead of triggering a kdump procedure (if configured), for > example. > > This patch introduces a debugfs entry for xmon, allowing user to query > its current state and change it if desired. Basically, the "xmon" file > to read from/write to is under the debugfs mount point, on powerpc > directory. Reading this file will provide the current state of the > debugger, one of the following: "on", "off", "early" or "nobt". Writing > one of these states to the file will take immediate effect on the debugger. I like this in general. But I think we can simplify it a bit. I don't think we need the nobt state anymore. As far as I can see it was added as a way to reinstate the old behaviour when the auto backtrace mode was added, but I've never heard of anyone using it. If anyone hits a crash where they really need that feature they can always just hack the code to disable the backtrace. So I think step 1 is a patch to drop the xmon_no_auto_backtrace stuff. Also I'm not sure EARLY needs to be a separate state. It's just a one-off invocation at boot, all we really need is just a single bit of state communicated from early_parse_xmon() to xmon_setup(), ie. a static bool would do. If we agree with that, then there are only two states left, on and off, in which case it can probably just be an int - and we can use a simple attribute file rather than custom parsing. > * I had this patch partially done for some time, and after a discussion > at the kernel slack channel last week, I decided to rebase and fix > some remaining bugs.
I'd change 'x' option to always disable the debugger, Not quite. 'x' should exit and leave xmon in whatever state it was previously in. cheers
Re: [PATCH v3 1/2] powerpc: Emulation support for load/store instructions on LE
Thanks Michael, On Tuesday 14 February 2017 03:50 PM, Michael Ellerman wrote: > Ravi Bangoria writes: > >> emulate_step() uses a number of underlying kernel functions that were >> initially not enabled for LE. This has been rectified since. > When exactly? ie. which commit. I found a couple of commits: 6506b4718b ("powerpc: Fix Unaligned Loads and Stores") dbc2fbd7c2 ("powerpc: Fix Unaligned LE Floating Point Loads and Stores") There may be more. Patch 2 is to test emulate_step() for basic load/store instructions and it seems to be working fine on LE. > > Should we backport this? ie. is it actually a bug people are hitting in > the real world much? Yes, we should backport this. The kernel-space hw-breakpoint feature is broken on LE without this. This is on ppc64le: $ sudo cat /proc/kallsyms | grep pid_max c00000000116998c D pid_max $ sudo ./perf record -a --event=mem:0xc00000000116998c sleep 10 Before patch: It does not record any data and throws below warning. $ dmesg [ 817.895573] Unable to handle hardware breakpoint. Breakpoint at 0xc00000000116998c will be disabled. [ 817.895581] ------------[ cut here ]------------ [ 817.895588] WARNING: CPU: 24 PID: 2032 at arch/powerpc/kernel/hw_breakpoint.c:277 hw_breakpoint_handler+0x124/0x230 ... After patch: It records data properly. $ sudo ./perf report --stdio ... # Samples: 36 of event 'mem:0xc00000000116998c' # Event count (approx.): 36 # # Overhead  Command        Shared Object     Symbol # ........  .............  ................  ......... # 63.89% kdumpctl [kernel.vmlinux] [k] alloc_pid 27.78% opal_errd [kernel.vmlinux] [k] alloc_pid 5.56% kworker/u97:4 [kernel.vmlinux] [k] alloc_pid 2.78% systemd [kernel.vmlinux] [k] alloc_pid > > cheers >
Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementation of saved write
"Aneesh Kumar K.V" writes: > diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h > b/arch/powerpc/include/asm/book3s/64/mmu-hash.h > index 0735d5a8049f..8720a406bbbe 100644 > --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h > +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h > @@ -16,6 +16,9 @@ > #include > #include > > +#ifndef __ASSEMBLY__ > +#include > +#endif I assume that's for the VM_BUG_ON() you add below. But if so wouldn't the #include be better placed in book3s/64/pgtable.h also? > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h > b/arch/powerpc/include/asm/book3s/64/pgtable.h > index fef738229a68..c684ef6cbd10 100644 > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h > @@ -512,6 +512,32 @@ static inline pte_t pte_mkhuge(pte_t pte) > return pte; > } > > +#define pte_mk_savedwrite pte_mk_savedwrite > +static inline pte_t pte_mk_savedwrite(pte_t pte) > +{ > + /* > + * Used by Autonuma subsystem to preserve the write bit > + * while marking the pte PROT_NONE. Only allow this > + * on PROT_NONE pte > + */ > + VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | > _PAGE_PRIVILEGED)) != > + cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED)); > + return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED); > +} > + cheers
Re: [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementation of saved write
Michael Neuling writes: > On Thu, 2017-02-09 at 08:30 +0530, Aneesh Kumar K.V wrote: >> With this our protnone becomes a present pte with READ/WRITE/EXEC bit >> cleared. >> By default we also set _PAGE_PRIVILEGED on such pte. This is now used to help >> us identify a protnone pte that has the saved write bit. For such pte, we will >> clear >> the _PAGE_PRIVILEGED bit. The pte still remains non-accessible from both user >> and kernel. >> >> Signed-off-by: Aneesh Kumar K.V > > > FWIW I've tested this, so: > > Acked-By: Michael Neuling In future if you've tested something then "Tested-by:" is the right tag to use. cheers
Re: [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
"Aneesh Kumar K.V" writes: > On Tuesday 14 February 2017 11:19 AM, Michael Ellerman wrote: >> "Aneesh Kumar K.V" writes: >> >>> Autonuma preserves the write permission across numa fault to avoid taking >>> a write fault after a numa fault (Commit: b191f9b106ea "mm: numa: preserve >>> PTE >>> write permissions across a NUMA hinting fault"). Architecture can implement >>> protnone in different ways and some may choose to implement that by >>> clearing Read/ >>> Write/Exec bit of pte. Setting the write bit on such pte can result in wrong >>> behaviour. Fix this up by allowing arch to override how to save the write >>> bit >>> on a protnone pte. >> This is pretty obviously a nop on arches that don't implement the new >> hooks, but it'd still be good to get an ack from someone in mm land >> before I merge it. > > > To get it apply cleanly you may need > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes-fix.patch Ah OK, I missed those. In that case these two should probably go via Andrew's tree. cheers
Re: [PATCH v3 2/2] powerpc: emulate_step tests for load/store instructions
Ravi Bangoria writes: > diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile > index 0e649d7..ddc879d 100644 > --- a/arch/powerpc/lib/Makefile > +++ b/arch/powerpc/lib/Makefile > @@ -33,3 +33,7 @@ obj-$(CONFIG_ALTIVEC) += xor_vmx.o > CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) > > obj-$(CONFIG_PPC64) += $(obj64-y) > + > +ifeq ($(CONFIG_PPC64), y) > +obj-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o > +endif FYI, the right way to do that is: obj64-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o obj-$(CONFIG_PPC64) += $(obj64-y) And in this Makefile you don't need to add the second line because it's already there. cheers
Re: [PATCH v3 2/2] powerpc: emulate_step tests for load/store instructions
Ravi Bangoria writes: > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c > index fce05a3..5c5ae66 100644 > --- a/arch/powerpc/kernel/kprobes.c > +++ b/arch/powerpc/kernel/kprobes.c > @@ -528,6 +528,8 @@ int __kprobes longjmp_break_handler(struct kprobe *p, > struct pt_regs *regs) > > int __init arch_init_kprobes(void) > { > + test_emulate_step(); > + I don't see any good reason why this is called from here. So I'm inclined to just make test_emulate_step() a regular init call. cheers
Re: [PATCH v3 1/2] powerpc: Emulation support for load/store instructions on LE
Ravi Bangoria writes: > emulate_step() uses a number of underlying kernel functions that were > initially not enabled for LE. This has been rectified since. When exactly? ie. which commit. Should we backport this? ie. is it actually a bug people are hitting in the real world much? cheers
Re: [PATCH v2 1/2] powerpc: Emulation support for load/store instructions on LE
On Tuesday 14 February 2017 02:17 PM, Naveen N. Rao wrote: > On 2017/02/14 01:32PM, Ravi Bangoria wrote: >> emulate_step() uses a number of underlying kernel functions that were >> initially not enabled for LE. This has been rectified since. So, fix >> emulate_step() for LE for the corresponding instructions. >> >> Reported-by: Anton Blanchard>> Signed-off-by: Ravi Bangoria > Can you redo this on top of powerpc/next? This doesn't apply cleanly due > to a recent change... Ok. sorry for the noise. I sent a v3 series. Please review it. Ravi > - Naveen >
[PATCH v3 2/2] powerpc: emulate_step tests for load/store instructions
Add new selftest that tests emulate_step for Normal, Floating Point, Vector and Vector Scalar - load/store instructions. Test should run at boot time if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Sample log: [0.762063] emulate_step smoke test: start. [0.762219] emulate_step smoke test: ld : PASS [0.762434] emulate_step smoke test: lwz: PASS [0.762653] emulate_step smoke test: lwzx : PASS [0.762867] emulate_step smoke test: std: PASS [0.763082] emulate_step smoke test: ldarx / stdcx. : PASS [0.763302] emulate_step smoke test: lfsx : PASS [0.763514] emulate_step smoke test: stfsx : PASS [0.763727] emulate_step smoke test: lfdx : PASS [0.763942] emulate_step smoke test: stfdx : PASS [0.764134] emulate_step smoke test: lvx: PASS [0.764349] emulate_step smoke test: stvx : PASS [0.764575] emulate_step smoke test: lxvd2x : PASS [0.764788] emulate_step smoke test: stxvd2x: PASS [0.764997] emulate_step smoke test: complete. Signed-off-by: Ravi Bangoria --- arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/include/asm/sstep.h | 8 + arch/powerpc/kernel/kprobes.c | 2 + arch/powerpc/lib/Makefile | 4 + arch/powerpc/lib/test_emulate_step.c | 439 ++ 5 files changed, 460 insertions(+) create mode 100644 arch/powerpc/lib/test_emulate_step.c diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index d99bd44..e7d6d86 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -284,6 +284,13 @@ #define PPC_INST_BRANCH_COND 0x40800000 #define PPC_INST_LBZCIX 0x7c0006aa #define PPC_INST_STBCIX 0x7c0007aa +#define PPC_INST_LWZX 0x7c00002e +#define PPC_INST_LFSX 0x7c00042e +#define PPC_INST_STFSX 0x7c00052e +#define PPC_INST_LFDX 0x7c0004ae +#define PPC_INST_STFDX 0x7c0005ae +#define PPC_INST_LVX 0x7c0000ce +#define PPC_INST_STVX 0x7c0001ce /* macros to insert fields into opcodes */ #define ___PPC_RA(a) (((a) & 0x1f) << 16) diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h index
d3a42cc..d6d3630 100644 --- a/arch/powerpc/include/asm/sstep.h +++ b/arch/powerpc/include/asm/sstep.h @@ -87,3 +87,11 @@ struct instruction_op { extern int analyse_instr(struct instruction_op *op, struct pt_regs *regs, unsigned int instr); + +#if defined(CONFIG_KPROBES_SANITY_TEST) && defined(CONFIG_PPC64) +void test_emulate_step(void); +#else +static inline void test_emulate_step(void) +{ +} +#endif diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c index fce05a3..5c5ae66 100644 --- a/arch/powerpc/kernel/kprobes.c +++ b/arch/powerpc/kernel/kprobes.c @@ -528,6 +528,8 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs) int __init arch_init_kprobes(void) { + test_emulate_step(); + return register_kprobe(_p); } diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index 0e649d7..ddc879d 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -33,3 +33,7 @@ obj-$(CONFIG_ALTIVEC) += xor_vmx.o CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) obj-$(CONFIG_PPC64) += $(obj64-y) + +ifeq ($(CONFIG_PPC64), y) +obj-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o +endif diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c new file mode 100644 index 000..887d1db --- /dev/null +++ b/arch/powerpc/lib/test_emulate_step.c @@ -0,0 +1,439 @@ +/* + * test_emulate_step.c - simple sanity test for emulate_step load/store + * instructions + * + * Copyright IBM Corp. 2016 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See + * the GNU General Public License for more details. + */ + +#define pr_fmt(fmt) "emulate_step smoke test: " fmt + +#include +#include +#include + +#define IMM_L(i) ((uintptr_t)(i) & 0x) + +/* + * Defined with TEST_ prefix so it does not conflict with other + * definitions. + */ +#define TEST_LD(r, base, i)(PPC_INST_LD | ___PPC_RT(r) | \ + ___PPC_RA(base) |
[PATCH v3 1/2] powerpc: Emulation support for load/store instructions on LE
emulate_step() uses a number of underlying kernel functions that were initially not enabled for LE. This has been rectified since. So, fix emulate_step() for LE for the corresponding instructions. Reported-by: Anton BlanchardSigned-off-by: Ravi Bangoria --- arch/powerpc/lib/sstep.c | 20 1 file changed, 20 deletions(-) diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c index 846dba2..9c542ec 100644 --- a/arch/powerpc/lib/sstep.c +++ b/arch/powerpc/lib/sstep.c @@ -1799,8 +1799,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto instr_done; case LARX: - if (regs->msr & MSR_LE) - return 0; if (op.ea & (size - 1)) break; /* can't handle misaligned */ if (!address_ok(regs, op.ea, size)) @@ -1823,8 +1821,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto ldst_done; case STCX: - if (regs->msr & MSR_LE) - return 0; if (op.ea & (size - 1)) break; /* can't handle misaligned */ if (!address_ok(regs, op.ea, size)) @@ -1849,8 +1845,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto ldst_done; case LOAD: - if (regs->msr & MSR_LE) - return 0; err = read_mem(>gpr[op.reg], op.ea, size, regs); if (!err) { if (op.type & SIGNEXT) @@ -1862,8 +1856,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #ifdef CONFIG_PPC_FPU case LOAD_FP: - if (regs->msr & MSR_LE) - return 0; if (size == 4) err = do_fp_load(op.reg, do_lfs, op.ea, size, regs); else @@ -1872,15 +1864,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #endif #ifdef CONFIG_ALTIVEC case LOAD_VMX: - if (regs->msr & MSR_LE) - return 0; err = do_vec_load(op.reg, do_lvx, op.ea & ~0xfUL, regs); goto ldst_done; #endif #ifdef CONFIG_VSX case LOAD_VSX: - if (regs->msr & MSR_LE) - return 0; err = do_vsx_load(op.reg, do_lxvd2x, op.ea, regs); goto ldst_done; #endif @@ -1903,8 +1891,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto instr_done; case STORE: - if (regs->msr & 
MSR_LE) - return 0; if ((op.type & UPDATE) && size == sizeof(long) && op.reg == 1 && op.update_reg == 1 && !(regs->msr & MSR_PR) && @@ -1917,8 +1903,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #ifdef CONFIG_PPC_FPU case STORE_FP: - if (regs->msr & MSR_LE) - return 0; if (size == 4) err = do_fp_store(op.reg, do_stfs, op.ea, size, regs); else @@ -1927,15 +1911,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #endif #ifdef CONFIG_ALTIVEC case STORE_VMX: - if (regs->msr & MSR_LE) - return 0; err = do_vec_store(op.reg, do_stvx, op.ea & ~0xfUL, regs); goto ldst_done; #endif #ifdef CONFIG_VSX case STORE_VSX: - if (regs->msr & MSR_LE) - return 0; err = do_vsx_store(op.reg, do_stxvd2x, op.ea, regs); goto ldst_done; #endif -- 1.8.3.1
[PATCH v3 0/2] powerpc: Emulation support for load/store instructions on LE
emulate_step is the basic infrastructure used by a number of other kernel subsystems such as kprobes and hw-breakpoint (data breakpoints). In the case of kprobes, enabling emulation of load/store instructions speeds up execution of the probed instruction. In the case of a kernel-space breakpoint, the causative instruction is first emulated before the user-registered handler is executed. If emulation fails, the hw-breakpoint is disabled with an error. As emulate_step does not support load/store instructions on LE, the kernel-space hw-breakpoint infrastructure is broken on LE. emulate_step() uses a number of underlying kernel functions that were initially not enabled for LE. This has since been rectified. So, fix emulate_step() for LE for the corresponding instructions. Also add a selftest which will run at boot if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 are set. Changes in v3: - Rebased to powerpc/next. No functionality changes. v2 link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1332638.html Ravi Bangoria (2): powerpc: Emulation support for load/store instructions on LE powerpc: emulate_step tests for load/store instructions arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/include/asm/sstep.h | 8 + arch/powerpc/kernel/kprobes.c | 2 + arch/powerpc/lib/Makefile | 4 + arch/powerpc/lib/sstep.c | 20 -- arch/powerpc/lib/test_emulate_step.c | 439 ++ 6 files changed, 460 insertions(+), 20 deletions(-) create mode 100644 arch/powerpc/lib/test_emulate_step.c -- 1.8.3.1
Re: [PATCH v2 1/2] powerpc: Emulation support for load/store instructions on LE
On 2017/02/14 01:32PM, Ravi Bangoria wrote:
> emulate_step() uses a number of underlying kernel functions that were
> initially not enabled for LE. This has been rectified since. So, fix
> emulate_step() for LE for the corresponding instructions.
>
> Reported-by: Anton Blanchard
> Signed-off-by: Ravi Bangoria

Can you redo this on top of powerpc/next? This doesn't apply cleanly due
to a recent change...

- Naveen
Re: [PATCH] powerpc/xmon: Fix data-breakpoint
Hi Michael,

Can you please pull this patch.

Thanks,
Ravi

On Tuesday 22 November 2016 02:55 PM, Ravi Bangoria wrote:
> The xmon data-breakpoint feature is broken.
>
> Whenever a watchpoint match occurs, hw_breakpoint_handler will be
> called by do_break via the notifier chain mechanism. If the watchpoint
> was registered by xmon, hw_breakpoint_handler won't find any associated
> perf_event and returns immediately with NOTIFY_STOP. Consequently,
> do_break also returns without notifying xmon.
>
> Solve this by returning NOTIFY_DONE when hw_breakpoint_handler does not
> find any perf_event associated with the matched watchpoint.
>
> Signed-off-by: Ravi Bangoria
> ---
>  arch/powerpc/kernel/hw_breakpoint.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c
> b/arch/powerpc/kernel/hw_breakpoint.c
> index 03d089b..469d86d 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -228,8 +228,10 @@ int hw_breakpoint_handler(struct die_args *args)
>  	rcu_read_lock();
>
>  	bp = __this_cpu_read(bp_per_reg);
> -	if (!bp)
> +	if (!bp) {
> +		rc = NOTIFY_DONE;
>  		goto out;
> +	}
>  	info = counter_arch_bp(bp);
>
>  	/*
Re: linux-next: manual merge of the kvm tree with the powerpc tree
Paolo Bonzini writes:
> On 10/02/2017 04:59, Stephen Rothwell wrote:
>> Hi all,
>>
>> Today's linux-next merge of the kvm tree got a conflict in:
>>
>>   arch/powerpc/include/asm/head-64.h
>>
>> between commit:
>>
>>   852e5da99d15 ("powerpc/64s: Tidy up after exception handler rework")
>>
>> from the powerpc tree and commit:
>>
>>   7ede531773ea ("KVM: PPC: Book3S: Move 64-bit KVM interrupt handler
>>   out from alt section")
>>
>> from the kvm tree.
>
> Michael, please pull the topic branch as soon as possible, so that the
> conflicts don't hit Linus.

They won't hit Linus until I send my pull request.

> That said, the topic branch is a mess. It starts with generic arch
> patches (until "powerpc/64: Allow for relocation-on interrupts from
> guest to host") then it's only KVM, then on the top there's two more
> generic patches that were added _after_ Paul merged it.

It's not a mess, it's a collection of patches which touch either
arch/powerpc or arch/powerpc/kvm, or are otherwise related.

Yeah I could have merged just the start of Paul's series, but that seemed
pointless, it doesn't prevent or add any conflicts, and it means I'm
unable to test his series as a whole.

Paul has also now merged the remaining two commits, and sent you a pull
request including them.

> If possible, please pull only up to "powerpc/64: Allow for relocation-on
> interrupts from guest to host" and cherry-pick the top two patches
> ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts" and
> "powerpc/powernv: Remove separate entry for OPAL real mode calls") into
> your next branch, but leave the rest for my tree only.

I don't see how that helps anything. In fact it guarantees a mess because
those two commits would now go to Linus via my tree (cherry picked) and
via Paul's as part of his second merge of the topic branch.

So unless you can give me a good reason I'll merge the tip of the topic
branch into my next, as planned.

cheers
Re: [PATCH 3/3] powerpc: kprobes: emulate instructions on kprobe handler re-entry
On Tue, Feb 14, 2017 at 02:08:03PM +0530, Naveen N. Rao wrote:
> On kprobe handler re-entry, try to emulate the instruction rather than
> single stepping always.
>
> As a related change, remove the duplicate saving of msr as that is
> already done in set_current_kprobe().
>
> Signed-off-by: Naveen N. Rao

Acked-by: Ananth N Mavinakayanahalli
Re: [PATCH 2/3] powerpc: kprobes: factor out code to emulate instruction into a helper
On Tue, Feb 14, 2017 at 02:08:02PM +0530, Naveen N. Rao wrote:
> This helper will be used in a subsequent patch to emulate instructions
> on re-entering the kprobe handler. No functional change.
>
> Signed-off-by: Naveen N. Rao

Acked-by: Ananth N Mavinakayanahalli
Re: [PATCH 1/3] powerpc: kprobes: fix handling of function offsets on ABIv2
On Tue, Feb 14, 2017 at 02:08:01PM +0530, Naveen N. Rao wrote:
> commit 239aeba76409 ("perf powerpc: Fix kprobe and kretprobe handling
> with kallsyms on ppc64le") changed how we use the offset field in struct
> kprobe on ABIv2. perf now offsets from the GEP (Global entry point) if
> an offset is specified and otherwise chooses the LEP (Local entry
> point).
>
> Fix the same in kernel for kprobe API users. We do this by extending
> kprobe_lookup_name() to accept an additional parameter to indicate the
> offset specified with the kprobe registration. If offset is 0, we return
> the local function entry and return the global entry point otherwise.
>
> With:
>   # cd /sys/kernel/debug/tracing/
>   # echo "p _do_fork" >> kprobe_events
>   # echo "p _do_fork+0x10" >> kprobe_events
>
> before this patch:
>   # cat ../kprobes/list
>   c00d0748  k  _do_fork+0x8    [DISABLED]
>   c00d0758  k  _do_fork+0x18    [DISABLED]
>   c00412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]
>
> and after:
>   # cat ../kprobes/list
>   c00d04c8  k  _do_fork+0x8    [DISABLED]
>   c00d04d0  k  _do_fork+0x10    [DISABLED]
>   c00412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]
>
> Signed-off-by: Naveen N. Rao

Acked-by: Ananth N Mavinakayanahalli
[PATCH 3/3] powerpc: kprobes: emulate instructions on kprobe handler re-entry
On kprobe handler re-entry, try to emulate the instruction rather than
single stepping always.

As a related change, remove the duplicate saving of msr as that is
already done in set_current_kprobe().

Signed-off-by: Naveen N. Rao
---
 arch/powerpc/kernel/kprobes.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 9cdf2de31e9e..c213637b9d25 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -206,10 +206,17 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
 			 */
 			save_previous_kprobe(kcb);
 			set_current_kprobe(p, regs, kcb);
-			kcb->kprobe_saved_msr = regs->msr;
 			kprobes_inc_nmissed_count(p);
 			prepare_singlestep(p, regs);
 			kcb->kprobe_status = KPROBE_REENTER;
+			if (p->ainsn.boostable >= 0) {
+				ret = try_to_emulate(p, regs);
+
+				if (ret > 0) {
+					restore_previous_kprobe(kcb);
+					return 1;
+				}
+			}
 			return 1;
 		} else {
 			if (*addr != BREAKPOINT_INSTRUCTION) {
-- 
2.11.0
[PATCH 2/3] powerpc: kprobes: factor out code to emulate instruction into a helper
This helper will be used in a subsequent patch to emulate instructions
on re-entering the kprobe handler. No functional change.

Signed-off-by: Naveen N. Rao
---
 arch/powerpc/kernel/kprobes.c | 52 +++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index fce05a38851c..9cdf2de31e9e 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -140,6 +140,35 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	regs->link = (unsigned long)kretprobe_trampoline;
 }
 
+int __kprobes try_to_emulate(struct kprobe *p, struct pt_regs *regs)
+{
+	int ret;
+	unsigned int insn = *p->ainsn.insn;
+
+	/* regs->nip is also adjusted if emulate_step returns 1 */
+	ret = emulate_step(regs, insn);
+	if (ret > 0) {
+		/*
+		 * Once this instruction has been boosted
+		 * successfully, set the boostable flag
+		 */
+		if (unlikely(p->ainsn.boostable == 0))
+			p->ainsn.boostable = 1;
+	} else if (ret < 0) {
+		/*
+		 * We don't allow kprobes on mtmsr(d)/rfi(d), etc.
+		 * So, we should never get here... but, its still
+		 * good to catch them, just in case...
+		 */
+		printk("Can't step on instruction %x\n", insn);
+		BUG();
+	} else if (ret == 0)
+		/* This instruction can't be boosted */
+		p->ainsn.boostable = -1;
+
+	return ret;
+}
+
 int __kprobes kprobe_handler(struct pt_regs *regs)
 {
 	struct kprobe *p;
@@ -235,18 +264,9 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
 
 ss_probe:
 	if (p->ainsn.boostable >= 0) {
-		unsigned int insn = *p->ainsn.insn;
+		ret = try_to_emulate(p, regs);
 
-		/* regs->nip is also adjusted if emulate_step returns 1 */
-		ret = emulate_step(regs, insn);
 		if (ret > 0) {
-			/*
-			 * Once this instruction has been boosted
-			 * successfully, set the boostable flag
-			 */
-			if (unlikely(p->ainsn.boostable == 0))
-				p->ainsn.boostable = 1;
-
 			if (p->post_handler)
 				p->post_handler(p, regs, 0);
 
@@ -254,17 +274,7 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
 			reset_current_kprobe();
 			preempt_enable_no_resched();
 			return 1;
-		} else if (ret < 0) {
-			/*
-			 * We don't allow kprobes on mtmsr(d)/rfi(d), etc.
-			 * So, we should never get here... but, its still
-			 * good to catch them, just in case...
-			 */
-			printk("Can't step on instruction %x\n", insn);
-			BUG();
-		} else if (ret == 0)
-			/* This instruction can't be boosted */
-			p->ainsn.boostable = -1;
+		}
 	}
 	prepare_singlestep(p, regs);
 	kcb->kprobe_status = KPROBE_HIT_SS;
-- 
2.11.0
[PATCH 1/3] powerpc: kprobes: fix handling of function offsets on ABIv2
commit 239aeba76409 ("perf powerpc: Fix kprobe and kretprobe handling
with kallsyms on ppc64le") changed how we use the offset field in struct
kprobe on ABIv2. perf now offsets from the GEP (Global entry point) if an
offset is specified and otherwise chooses the LEP (Local entry point).

Fix the same in kernel for kprobe API users. We do this by extending
kprobe_lookup_name() to accept an additional parameter to indicate the
offset specified with the kprobe registration. If offset is 0, we return
the local function entry and return the global entry point otherwise.

With:
  # cd /sys/kernel/debug/tracing/
  # echo "p _do_fork" >> kprobe_events
  # echo "p _do_fork+0x10" >> kprobe_events

before this patch:
  # cat ../kprobes/list
  c00d0748  k  _do_fork+0x8    [DISABLED]
  c00d0758  k  _do_fork+0x18    [DISABLED]
  c00412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]

and after:
  # cat ../kprobes/list
  c00d04c8  k  _do_fork+0x8    [DISABLED]
  c00d04d0  k  _do_fork+0x10    [DISABLED]
  c00412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]

Signed-off-by: Naveen N. Rao
---
 arch/powerpc/include/asm/kprobes.h | 6 +++---
 arch/powerpc/kernel/optprobes.c    | 4 ++--
 kernel/kprobes.c                   | 6 +++---
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index d821835ade86..e7ada061aa12 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -60,10 +60,10 @@ extern kprobe_opcode_t optprobe_template_end[];
 
 #ifdef PPC64_ELF_ABI_v2
 /* PPC64 ABIv2 needs local entry point */
-#define kprobe_lookup_name(name, addr)					\
+#define kprobe_lookup_name(name, addr, offset)				\
 {									\
 	addr = (kprobe_opcode_t *)kallsyms_lookup_name(name);		\
-	if (addr)							\
+	if (addr && !(offset))						\
 		addr = (kprobe_opcode_t *)ppc_function_entry(addr);	\
 }
 #elif defined(PPC64_ELF_ABI_v1)
@@ -75,7 +75,7 @@ extern kprobe_opcode_t optprobe_template_end[];
  * This ensures we always get to the actual symbol and not the descriptor.
  * Also handle format.
  */
-#define kprobe_lookup_name(name, addr)					\
+#define kprobe_lookup_name(name, addr, offset)				\
 {									\
 	char dot_name[MODULE_NAME_LEN + 1 + KSYM_NAME_LEN];		\
 	const char *modsym;						\
diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
index 2282bf4e63cd..e51a045f3d3b 100644
--- a/arch/powerpc/kernel/optprobes.c
+++ b/arch/powerpc/kernel/optprobes.c
@@ -243,8 +243,8 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
 	/*
 	 * 2. branch to optimized_callback() and emulate_step()
 	 */
-	kprobe_lookup_name("optimized_callback", op_callback_addr);
-	kprobe_lookup_name("emulate_step", emulate_step_addr);
+	kprobe_lookup_name("optimized_callback", op_callback_addr, 0);
+	kprobe_lookup_name("emulate_step", emulate_step_addr, 0);
 	if (!op_callback_addr || !emulate_step_addr) {
 		WARN(1, "kprobe_lookup_name() failed\n");
 		goto error;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 83ad7e440417..9bc433575d98 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -63,7 +63,7 @@
  * so this must be overridable.
  */
 #ifndef kprobe_lookup_name
-#define kprobe_lookup_name(name, addr) \
+#define kprobe_lookup_name(name, addr, offset) \
 	addr = ((kprobe_opcode_t *)(kallsyms_lookup_name(name)))
 #endif
 
@@ -1365,7 +1365,7 @@ static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
 		goto invalid;
 
 	if (p->symbol_name) {
-		kprobe_lookup_name(p->symbol_name, addr);
+		kprobe_lookup_name(p->symbol_name, addr, p->offset);
 		if (!addr)
 			return ERR_PTR(-ENOENT);
 	}
@@ -2161,7 +2161,7 @@ static int __init init_kprobes(void)
 	/* lookup the function address from its name */
 	for (i = 0; kretprobe_blacklist[i].name != NULL; i++) {
 		kprobe_lookup_name(kretprobe_blacklist[i].name,
-					kretprobe_blacklist[i].addr);
+					kretprobe_blacklist[i].addr, 0);
 		if (!kretprobe_blacklist[i].addr)
 			printk("kretprobe: lookup failed: %s\n",
[PATCH v2 2/2] powerpc: emulate_step tests for load/store instructions
Add new selftest that test emulate_step for Normal, Floating Point, Vector and Vector Scalar - load/store instructions. Test should run at boot time if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Sample log: [0.762063] emulate_step smoke test: start. [0.762219] emulate_step smoke test: ld : PASS [0.762434] emulate_step smoke test: lwz: PASS [0.762653] emulate_step smoke test: lwzx : PASS [0.762867] emulate_step smoke test: std: PASS [0.763082] emulate_step smoke test: ldarx / stdcx. : PASS [0.763302] emulate_step smoke test: lfsx : PASS [0.763514] emulate_step smoke test: stfsx : PASS [0.763727] emulate_step smoke test: lfdx : PASS [0.763942] emulate_step smoke test: stfdx : PASS [0.764134] emulate_step smoke test: lvx: PASS [0.764349] emulate_step smoke test: stvx : PASS [0.764575] emulate_step smoke test: lxvd2x : PASS [0.764788] emulate_step smoke test: stxvd2x: PASS [0.764997] emulate_step smoke test: complete. Signed-off-by: Ravi Bangoria--- arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/include/asm/sstep.h | 8 + arch/powerpc/kernel/kprobes.c | 2 + arch/powerpc/lib/Makefile | 4 + arch/powerpc/lib/test_emulate_step.c | 439 ++ 5 files changed, 460 insertions(+) create mode 100644 arch/powerpc/lib/test_emulate_step.c diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index c4ced1d..691 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -284,6 +284,13 @@ #define PPC_INST_BRANCH_COND 0x4080 #define PPC_INST_LBZCIX0x7c0006aa #define PPC_INST_STBCIX0x7c0007aa +#define PPC_INST_LWZX 0x7c2e +#define PPC_INST_LFSX 0x7c00042e +#define PPC_INST_STFSX 0x7c00052e +#define PPC_INST_LFDX 0x7c0004ae +#define PPC_INST_STFDX 0x7c0005ae +#define PPC_INST_LVX 0x7cce +#define PPC_INST_STVX 0x7c0001ce /* macros to insert fields into opcodes */ #define ___PPC_RA(a) (((a) & 0x1f) << 16) diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h index 
d3a42cc..d6d3630 100644 --- a/arch/powerpc/include/asm/sstep.h +++ b/arch/powerpc/include/asm/sstep.h @@ -87,3 +87,11 @@ struct instruction_op { extern int analyse_instr(struct instruction_op *op, struct pt_regs *regs, unsigned int instr); + +#if defined(CONFIG_KPROBES_SANITY_TEST) && defined(CONFIG_PPC64) +void test_emulate_step(void); +#else +static inline void test_emulate_step(void) +{ +} +#endif diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c index 735ff3d..c867347 100644 --- a/arch/powerpc/kernel/kprobes.c +++ b/arch/powerpc/kernel/kprobes.c @@ -529,6 +529,8 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs) int __init arch_init_kprobes(void) { + test_emulate_step(); + return register_kprobe(_p); } diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index 309361e8..7d046ca 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -35,3 +35,7 @@ obj-$(CONFIG_ALTIVEC) += xor_vmx.o CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) obj-$(CONFIG_PPC64) += $(obj64-y) + +ifeq ($(CONFIG_PPC64), y) +obj-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o +endif diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c new file mode 100644 index 000..887d1db --- /dev/null +++ b/arch/powerpc/lib/test_emulate_step.c @@ -0,0 +1,439 @@ +/* + * test_emulate_step.c - simple sanity test for emulate_step load/store + * instructions + * + * Copyright IBM Corp. 2016 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See + * the GNU General Public License for more details. + */ + +#define pr_fmt(fmt) "emulate_step smoke test: " fmt + +#include +#include +#include + +#define IMM_L(i) ((uintptr_t)(i) & 0x) + +/* + * Defined with TEST_ prefix so it does not conflict with other + * definitions. + */ +#define TEST_LD(r, base, i)(PPC_INST_LD | ___PPC_RT(r) | \ + ___PPC_RA(base) |
[PATCH v2 1/2] powerpc: Emulation support for load/store instructions on LE
emulate_step() uses a number of underlying kernel functions that were
initially not enabled for LE. This has been rectified since. So, fix
emulate_step() for LE for the corresponding instructions.

Reported-by: Anton Blanchard
Signed-off-by: Ravi Bangoria
---
 arch/powerpc/lib/sstep.c | 20 --------------------
 1 file changed, 20 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 06c7e9b..e14a2fb 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1799,8 +1799,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		goto instr_done;
 
 	case LARX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if (op.ea & (size - 1))
 			break;		/* can't handle misaligned */
 		err = -EFAULT;
@@ -1824,8 +1822,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		goto ldst_done;
 
 	case STCX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if (op.ea & (size - 1))
 			break;		/* can't handle misaligned */
 		err = -EFAULT;
@@ -1851,8 +1847,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		goto ldst_done;
 
 	case LOAD:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = read_mem(&regs->gpr[op.reg], op.ea, size, regs);
 		if (!err) {
 			if (op.type & SIGNEXT)
@@ -1864,8 +1858,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 
 #ifdef CONFIG_PPC_FPU
 	case LOAD_FP:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if (size == 4)
 			err = do_fp_load(op.reg, do_lfs, op.ea, size, regs);
 		else
@@ -1874,15 +1866,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 #endif
 #ifdef CONFIG_ALTIVEC
 	case LOAD_VMX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = do_vec_load(op.reg, do_lvx, op.ea & ~0xfUL, regs);
 		goto ldst_done;
 #endif
 #ifdef CONFIG_VSX
 	case LOAD_VSX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = do_vsx_load(op.reg, do_lxvd2x, op.ea, regs);
 		goto ldst_done;
 #endif
@@ -1905,8 +1893,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		goto instr_done;
 
 	case STORE:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if ((op.type & UPDATE) && size == sizeof(long) &&
 		    op.reg == 1 && op.update_reg == 1 &&
 		    !(regs->msr & MSR_PR) &&
@@ -1919,8 +1905,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 
 #ifdef CONFIG_PPC_FPU
 	case STORE_FP:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if (size == 4)
 			err = do_fp_store(op.reg, do_stfs, op.ea, size, regs);
 		else
@@ -1929,15 +1913,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 #endif
 #ifdef CONFIG_ALTIVEC
 	case STORE_VMX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = do_vec_store(op.reg, do_stvx, op.ea & ~0xfUL, regs);
 		goto ldst_done;
 #endif
 #ifdef CONFIG_VSX
 	case STORE_VSX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = do_vsx_store(op.reg, do_stxvd2x, op.ea, regs);
 		goto ldst_done;
 #endif
-- 
1.8.3.1
[PATCH v2 0/2] powerpc: Emulation support for load/store instructions on LE
emulate_step() is basic infrastructure used by a number of other kernel
subsystems such as kprobes and hw-breakpoints (data breakpoints). For
kprobes, enabling emulation of load/store instructions speeds up execution
of the probed instruction. For kernel-space breakpoints, the causative
instruction is first emulated before the user-registered handler is
executed; if emulation fails, the hw-breakpoint is disabled with an error.
As emulate_step did not support load/store instructions on LE, the
kernel-space hw-breakpoint infrastructure was broken on LE.

emulate_step() uses a number of underlying kernel functions that were
initially not enabled for LE. This has been rectified since. So, fix
emulate_step() for LE for the corresponding instructions.

Also add a selftest which will run at boot if CONFIG_KPROBES_SANITY_TEST
and CONFIG_PPC64 are set.

Changes in v2:
  - Folded the 2nd and 3rd patch of v1 into one patch, as suggested by
    Naveen

v1 link:
https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg110671.html

Ravi Bangoria (2):
  powerpc: Emulation support for load/store instructions on LE
  powerpc: emulate_step tests for load/store instructions

 arch/powerpc/include/asm/ppc-opcode.h |   7 +
 arch/powerpc/include/asm/sstep.h      |   8 +
 arch/powerpc/kernel/kprobes.c         |   2 +
 arch/powerpc/lib/Makefile             |   4 +
 arch/powerpc/lib/sstep.c              |  20 --
 arch/powerpc/lib/test_emulate_step.c  | 439 ++++++++++++++++++++++++++++++++++
 6 files changed, 460 insertions(+), 20 deletions(-)
 create mode 100644 arch/powerpc/lib/test_emulate_step.c

-- 
1.8.3.1