Re: [PATCH 1/3] powerpc: kprobes: add support for KPROBES_ON_FTRACE
Hi Naveen, On Wed, 15 Feb 2017 00:28:34 +0530 "Naveen N. Rao" wrote: > diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c > index e51a045f3d3b..a8f414a0b141 100644 > --- a/arch/powerpc/kernel/optprobes.c > +++ b/arch/powerpc/kernel/optprobes.c > @@ -70,6 +70,9 @@ static unsigned long can_optimize(struct kprobe *p) > struct instruction_op op; > unsigned long nip = 0; > > + if (unlikely(kprobe_ftrace(p))) > + return 0; > + Hmm, this should not be checked in arch-dependent code, since it would duplicate code in each arch. Please try below. Thanks, commit 897bb304974c1b0c19a4274fea220b175b07f353 Author: Masami Hiramatsu Date: Wed Feb 15 15:48:14 2017 +0900 kprobes: Skip preparing optprobe if the probe is ftrace-based Skip preparing optprobe if the probe is ftrace-based, since in any case it must not be optimized (or is already optimized by ftrace). Signed-off-by: Masami Hiramatsu diff --git a/kernel/kprobes.c b/kernel/kprobes.c index ebb4dad..546a8b1 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -746,13 +746,20 @@ static void kill_optimized_kprobe(struct kprobe *p) arch_remove_optimized_kprobe(op); } +static inline +void __prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p) +{ + if (!kprobe_ftrace(p)) + arch_prepare_optimized_kprobe(op, p); +} + /* Try to prepare optimized instructions */ static void prepare_optimized_kprobe(struct kprobe *p) { struct optimized_kprobe *op; op = container_of(p, struct optimized_kprobe, kp); - arch_prepare_optimized_kprobe(op, p); + __prepare_optimized_kprobe(op, p); } /* Allocate new optimized_kprobe and try to prepare optimized instructions */ @@ -766,7 +773,7 @@ static struct kprobe *alloc_aggr_kprobe(struct kprobe *p) INIT_LIST_HEAD(&op->list); op->kp.addr = p->addr; - arch_prepare_optimized_kprobe(op, p); + __prepare_optimized_kprobe(op, p); return &op->kp; } -- Masami Hiramatsu
[PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices
Currently we reassign the alignment by extending resources' size in pci_reassigndev_resource_alignment(). This could potentially break some drivers when a driver uses the size to locate registers whose length is related to the size. Some examples: - misc/hpilo.c: off = pci_resource_len(pdev, bar) - 0x2000; - net/ethernet/chelsio/cxgb4/cxgb4_uld.h: (pci_resource_len((pdev), 2) - roundup_pow_of_two((vres)->ocq.size)) - infiniband/hw/nes/nes_hw.c: num_pds = pci_resource_len(nesdev->pcidev, BAR_1) >> PAGE_SHIFT; This risk was easy to contain before, because we only had one way (the resource_alignment kernel parameter) to reach that code, and some users may even be happy to see the extended size. But now we define a default alignment for all devices, which would also reach that code, and it would be hard to contain the risk in that case. So this patch uses IORESOURCE_STARTALIGN to identify the resource's alignment, without extending the size, when the alignment reassignment is caused by the default alignment. Signed-off-by: Yongji Xie --- drivers/pci/pci.c | 34 ++++++++++++++++++++++++++++---------- 1 files changed, 24 insertions(+), 10 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 2622e9b..0017e5e 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4958,7 +4958,8 @@ void pci_ignore_hotplug(struct pci_dev *dev) * RETURNS: Resource alignment if it is specified. * Zero if it is not specified. 
*/ -static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev) +static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev, + bool *resize) { int seg, bus, slot, func, align_order, count; unsigned short vendor, device, subsystem_vendor, subsystem_device; @@ -5002,6 +5003,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev) (!device || (device == dev->device)) && (!subsystem_vendor || (subsystem_vendor == dev->subsystem_vendor)) && (!subsystem_device || (subsystem_device == dev->subsystem_device))) { + *resize = true; if (align_order == -1) align = PAGE_SIZE; else @@ -5027,6 +5029,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev) bus == dev->bus->number && slot == PCI_SLOT(dev->devfn) && func == PCI_FUNC(dev->devfn)) { + *resize = true; if (align_order == -1) align = PAGE_SIZE; else @@ -5059,6 +5062,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev) struct resource *r; resource_size_t align, size; u16 command; + bool resize = false; /* * VF BARs are read-only zero according to SR-IOV spec r1.1, sec @@ -5070,7 +5074,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev) return; /* check if specified PCI is target device to reassign */ - align = pci_specified_resource_alignment(dev); + align = pci_specified_resource_alignment(dev, &resize); if (!align) return; @@ -5098,15 +5102,25 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev) } size = resource_size(r); - if (size < align) { - size = align; - dev_info(&dev->dev, - "Rounding up size of resource #%d to %#llx.\n", - i, (unsigned long long)size); + if (resize) { + if (size < align) { + size = align; + dev_info(&dev->dev, + "Rounding up size of resource #%d to %#llx.\n", + i, (unsigned long long)size); + } + r->flags |= IORESOURCE_UNSET; + r->end = size - 1; + r->start = 0; + } else { + if (size < align) { + r->flags &= ~IORESOURCE_SIZEALIGN; + r->flags |= IORESOURCE_STARTALIGN | + IORESOURCE_UNSET; + 
r->start = align; + r->end = r->start + size - 1; + } } - r->flags |= IORESOURCE_UNSET; - r->end = size - 1; - r->start = 0; } /*
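As a rough userspace model of the two behaviours contrasted in this patch (helper names are made up, and r->start merely stands in for the IORESOURCE_STARTALIGN convention of carrying the required alignment in the start field): the old resource_alignment= path grows the BAR to the alignment, while the new default-alignment path keeps the real size so drivers that derive register layouts from pci_resource_len() are unaffected.

```c
#include <assert.h>

typedef unsigned long long resource_size_t;

struct toy_resource {
	resource_size_t start, end;
};

/* resource_alignment= path: round the size itself up to the alignment */
static void align_by_resize(struct toy_resource *r, resource_size_t size,
			    resource_size_t align)
{
	if (size < align)
		size = align;
	r->start = 0;
	r->end = size - 1;
}

/*
 * Default-alignment path: keep the size; only the placement is
 * constrained (modelled here by putting the start on 'align').
 */
static void align_start_only(struct toy_resource *r, resource_size_t size,
			     resource_size_t align)
{
	r->start = (size < align) ? align : 0;
	r->end = r->start + size - 1;
}

static resource_size_t toy_resource_size(const struct toy_resource *r)
{
	return r->end - r->start + 1;
}
```

With size 0x2000 and align 0x10000, the resize path reports a 0x10000-byte resource while the start-align path still reports 0x2000.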
[PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
When vfio passthroughs a PCI device whose MMIO BARs are smaller than PAGE_SIZE, the guest will not be able to handle MMIO accesses to those BARs directly, which leads to MMIO emulation in the host. This is because vfio will not allow passthrough of a BAR's MMIO page that may be shared with other BARs; otherwise there would be a backdoor the guest could use to access BARs of another guest. This patch adds a macro to set a default alignment for all PCI devices. Then we can solve this issue on platforms which easily hit it because of their 64K page, such as the PowerNV platform, by defining this macro as PAGE_SIZE. Signed-off-by: Yongji Xie --- arch/powerpc/include/asm/pci.h | 4 ++++ drivers/pci/pci.c | 3 +++ 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h index e9bd6cf..5e31bc2 100644 --- a/arch/powerpc/include/asm/pci.h +++ b/arch/powerpc/include/asm/pci.h @@ -28,6 +28,10 @@ #define PCIBIOS_MIN_IO 0x1000 #define PCIBIOS_MIN_MEM 0x1000 +#ifdef CONFIG_PPC_POWERNV +#define PCIBIOS_DEFAULT_ALIGNMENT PAGE_SIZE +#endif + struct pci_dev; /* Values for the `which' argument to sys_pciconfig_iobase syscall. */ diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a881c0d..2622e9b 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4965,6 +4965,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev) resource_size_t align = 0; char *p; +#ifdef PCIBIOS_DEFAULT_ALIGNMENT + align = PCIBIOS_DEFAULT_ALIGNMENT; +#endif spin_lock(&resource_alignment_lock); p = resource_alignment_param; if (!*p) -- 1.7.1
[PATCH v9 1/3] PCI: A fix for calculating bridge window's size and alignment
If a device's alignment is greater than its size, we may calculate an incorrect size and alignment for its bus's memory window in pbus_size_mem(). This patch fixes that case. Signed-off-by: Yongji Xie --- drivers/pci/setup-bus.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c index f30ca75..b3cabc2 100644 --- a/drivers/pci/setup-bus.c +++ b/drivers/pci/setup-bus.c @@ -1075,10 +1075,10 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask, r->flags = 0; continue; } - size += r_size; + size += max(r_size, align); /* Exclude ranges with size > align from calculation of the alignment. */ - if (r_size == align) + if (r_size <= align) aligns[order] += align; if (order > max_order) max_order = order; -- 1.7.1
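To illustrate the arithmetic behind the "size += max(r_size, align)" change, here is a minimal userspace sketch (illustrative names only, not the kernel's pbus_size_mem()): a BAR whose required alignment exceeds its size still consumes a full alignment-sized slot in the bridge window, so summing raw sizes undercounts the window.

```c
#include <assert.h>

typedef unsigned long long resource_size_t;

static resource_size_t max_rs(resource_size_t a, resource_size_t b)
{
	return a > b ? a : b;
}

/*
 * Toy model of the size accumulation: each child resource contributes
 * at least its alignment, because the allocator must place it on an
 * 'align' boundary.  (The real function also tracks per-order aligns[]
 * and the window's own alignment.)
 */
static resource_size_t window_size(const resource_size_t *r_size,
				   const resource_size_t *align, int n)
{
	resource_size_t size = 0;
	int i;

	for (i = 0; i < n; i++)
		size += max_rs(r_size[i], align[i]);	/* was: size += r_size[i] */
	return size;
}
```

For a 0x1000-byte BAR that must be 0x10000-aligned plus a 0x10000-byte BAR, the fixed formula yields 0x20000; the old formula would have yielded only 0x11000.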
[PATCH v9 0/3] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE
This series introduces a way for the PCI resource allocator to force MMIO BARs not to share a PAGE_SIZE page. This matters to the VFIO driver, because the current VFIO implementation disallows mmapping sub-page (size < PAGE_SIZE) MMIO BARs which may share the same page with other BARs, for security reasons. Thus, we have to handle MMIO access to these BARs in QEMU emulation rather than in the guest, which causes some performance loss. In our solution, we try to make use of the existing code path of the resource_alignment kernel parameter and add a macro to set a default alignment for it. Thus we can define this macro by default on arches which easily hit the performance issue because of their 64K page. In this series, patch 1 fixes a bug related to bridge window size/alignment calculation; patches 2 and 3 add support for setting a default alignment for all MMIO BARs. Changelog v9: - Add a patch to fix calculating bridge window's size and alignment - Remove an unrelated patch - Rework the patch that fixes the bug of reassigning a resource's alignment by changing its size Changelog v8: - Rebased against v4.10-rc4 - Rework patch 2 - Change the commit log of patch 1 Changelog v7: - Rebased against v4.9-rc2 - Drop two merged patches - Rework the patch which fixes a bug where a resource's size is changed when using resource_alignment - Add a patch that fixes a bug for IOV BARs when using resource_alignment Changelog v6: - Remove the option "noresize@" of resource_alignment Changelog v5: - Rebased against v4.8-rc6 - Drop the patch forbidding disabling memory decoding in pci_reassigndev_resource_alignment() Changelog v4: - Rebased against v4.8-rc1 - Drop one irrelevant patch - Drop the patch adding a wildcard to resource_alignment to enforce the alignment of all MMIO BARs to be at least PAGE_SIZE - Change the format of option "noresize" of resource_alignment - Code style improvements Changelog v3: - Ignore enforced alignment to fixed BARs - Fix issue of disabling memory decoding when 
reassigning the alignment - Only enable default alignment on the PowerNV platform Changelog v2: - Ignore enforced alignment for VF BARs in pci_reassigndev_resource_alignment() Yongji Xie (3): PCI: A fix for calculating bridge window's size and alignment PCI: Add a macro to set default alignment for all PCI devices PCI: Don't extend device's size when using default alignment for all devices arch/powerpc/include/asm/pci.h | 4 ++++ drivers/pci/pci.c | 37 +++++++++++++++++++++++++++++++++---------- drivers/pci/setup-bus.c | 4 ++-- 3 files changed, 33 insertions(+), 12 deletions(-)
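The security constraint driving the series can be stated as a small check (a sketch with made-up names; TOY_PAGE_SIZE models the 64K pages mentioned above): VFIO can only hand a BAR straight to the guest via mmap when the BAR covers whole pages by itself, which is exactly what the enforced PAGE_SIZE alignment guarantees.

```c
#include <stdbool.h>

#define TOY_PAGE_SIZE 0x10000ULL	/* 64K pages, as on PowerNV */

/*
 * A BAR can be mmap'd through to the guest only if it covers whole
 * pages by itself: a page-aligned start and a size that is a multiple
 * of the page size.  Otherwise it may share a page with a neighbouring
 * BAR and VFIO must fall back to emulated (trapped) access.
 */
static bool bar_mmapable(unsigned long long start, unsigned long long size)
{
	return size > 0 &&
	       (start % TOY_PAGE_SIZE) == 0 &&
	       (size % TOY_PAGE_SIZE) == 0;
}
```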
Re: [PATCH 2/2] powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler
On Wednesday 15 February 2017 10:38 AM, Stewart Smith wrote: Mukesh Ojha writes: Converts all the explicit return numbers to the more proper IRQ_HANDLED, which is the appropriate return value for an interrupt handler. Signed-off-by: Mukesh Ojha Reviewed-by: Vasant Hegde --- arch/powerpc/platforms/powernv/opal-dump.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) Should also have: Fixes: 8034f715f ("powernv/opal-dump: Convert to irq domain") ? Yeah... I should have mentioned which patch missed these changes. -Mukesh
Re: [v4,1/2] arch/powerpc: Implement Optprobes
Thank you, Michael. :) On Tuesday 14 February 2017 06:10 PM, Michael Ellerman wrote: On Wed, 2017-02-08 at 09:50:51 UTC, Anju T wrote: The current infrastructure of kprobes uses the unconditional trap instruction to probe a running kernel. Optprobe allows kprobe to replace the trap with a branch instruction to a detour buffer. The detour buffer contains instructions to create an in-memory pt_regs. The detour buffer also has a call to optimized_callback(), which in turn calls the pre_handler(). After the execution of the pre-handler, a call is made for instruction emulation. The NIP is determined in advance through dummy instruction emulation and a branch instruction is created to the NIP at the end of the trampoline. To address the range limitation of the branch instruction in the POWER architecture, the detour buffer slot is allocated from a reserved area. For the time being, 64KB is reserved in memory for this purpose. Instructions which can be emulated using analyse_instr() are the candidates for optimization. Before optimization, ensure that the address range between the allocated detour buffer and the instruction being probed is within +/- 32MB. Signed-off-by: Anju T Sudhakar Signed-off-by: Naveen N. Rao Acked-by: Masami Hiramatsu Series applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/51c9c0843993528bffc920c54c2121 cheers
Re: [PATCH 2/2] powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler
Mukesh Ojha writes: > Converts all the explicit return numbers to the more proper IRQ_HANDLED, > which is the appropriate return value for an interrupt handler. > > Signed-off-by: Mukesh Ojha > Reviewed-by: Vasant Hegde > --- > arch/powerpc/platforms/powernv/opal-dump.c | 9 +++-- > 1 file changed, 3 insertions(+), 6 deletions(-) Should also have: Fixes: 8034f715f ("powernv/opal-dump: Convert to irq domain") ? -- Stewart Smith OPAL Architect, IBM.
Re: [PATCH 1/2] powerpc/powernv/opal-dump : Handles opal_dump_info properly
Mukesh Ojha writes: > Moves the return value check of 'opal_dump_info' to a proper place; previously we were unnecessarily filling in all the dump info even on failure. > > Signed-off-by: Mukesh Ojha > --- > arch/powerpc/platforms/powernv/opal-dump.c | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) Acked-by: Stewart Smith -- Stewart Smith OPAL Architect, IBM.
Re: [PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails
Michael Ellerman writes: > Vipin K Parashar writes: > >> OPAL returns OPAL_WRONG_STATE for XSCOM operations >> >> done to read any core FIR which is sleeping or offline. > > OK. > > Do we know why Linux is causing that to happen? > > It's also returned from many of the XIVE routines if we're in the wrong > xive mode, all of which would indicate a fairly bad Linux bug. > > Also the skiboot patch which added WRONG_STATE for XSCOM ops did so > explicitly so we could differentiate from other errors: > > commit 9c2d82394fd2303847cac4a665dee62556ca528a > Author: Russell Currey > AuthorDate: Mon Mar 21 12:00:00 2016 +1100 > > xscom: Return OPAL_WRONG_STATE on XSCOM ops if CPU is asleep > > xscom_read and xscom_write return OPAL_SUCCESS if they worked, and > OPAL_HARDWARE if they didn't. This doesn't provide information about why > the operation failed, such as if the CPU happens to be asleep. > > This is specifically useful in error scanning, so if every CPU is being > scanned for errors, sleeping CPUs likely aren't the cause of failures. > > So, return OPAL_WRONG_STATE in xscom_read and xscom_write if the CPU is > sleeping. > > Signed-off-by: Russell Currey > Reviewed-by: Alistair Popple > Signed-off-by: Stewart Smith > > > > So I'm still not convinced that quietly swallowing this error and > mapping it to -EIO along with several of the other error codes is the > right thing to do. FWIW I agree - pretty limited cases where it should just be converted into -EIO and passed on - probably *just* the debugfs interface to be honest. -- Stewart Smith OPAL Architect, IBM.
Re: [PATCH 1/3] powerpc: kprobes: add support for KPROBES_ON_FTRACE
On Wed, Feb 15, 2017 at 12:28:34AM +0530, Naveen N. Rao wrote: > Allow kprobes to be placed on ftrace _mcount() call sites. This > optimization avoids the use of a trap, by riding on ftrace > infrastructure. > > This depends on HAVE_DYNAMIC_FTRACE_WITH_REGS which depends on > MPROFILE_KERNEL, which is only currently enabled on powerpc64le with > newer toolchains. > > Based on the x86 code by Masami. > > Signed-off-by: Naveen N. Rao > --- > arch/powerpc/Kconfig | 1 + > arch/powerpc/include/asm/kprobes.h | 10 ++++++++++ > arch/powerpc/kernel/Makefile | 3 ++ > arch/powerpc/kernel/kprobes-ftrace.c | 100 > +++ > arch/powerpc/kernel/kprobes.c | 4 +- > arch/powerpc/kernel/optprobes.c | 3 ++ > 6 files changed, 120 insertions(+), 1 deletion(-) > create mode 100644 arch/powerpc/kernel/kprobes-ftrace.c You'll also need to update Documentation/features/debug/kprobes-on-ftrace/arch-support.txt > +/* Ftrace callback handler for kprobes */ > +void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip, > +struct ftrace_ops *ops, struct pt_regs *regs) > +{ > + struct kprobe *p; > + struct kprobe_ctlblk *kcb; > + unsigned long flags; > + > + /* Disable irq for emulating a breakpoint and avoiding preempt */ > + local_irq_save(flags); > + hard_irq_disable(); > + > + p = get_kprobe((kprobe_opcode_t *)nip); > + if (unlikely(!p) || kprobe_disabled(p)) > + goto end; > + > + kcb = get_kprobe_ctlblk(); > + if (kprobe_running()) { > + kprobes_inc_nmissed_count(p); > + } else { > + unsigned long orig_nip = regs->nip; > + /* Kprobe handler expects regs->nip = nip + 1 as breakpoint hit > */ Can you clarify this? On powerpc, the regs->nip at the time of breakpoint hit points to the probed instruction, not the one after. Ananth
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
On 2017/2/15 1:35 AM, Guilherme G. Piccoli wrote: > On 14/02/2017 01:58, Pan Xinhui wrote: >> >> >> On 2017/2/14 10:35, Nicholas Piggin wrote: >>> On Mon, 13 Feb 2017 19:00:42 -0200 >>> >>> xmon state changing after the first sysrq+x violates the principle of least >>> astonishment, so I think that should be fixed. >>> >> hi, Nick >> yes, as long as xmon is disabled during boot, it should still be disabled >> after exiting xmon. >> My patch does not fix that, as it needs people to add one more char 'z' following >> 'x'. >> I will provide a new patch to fix that. >> >>> Then the question is, is it worth making it runtime configurable with an xmon >>> command or debugfs tunables? >>> >> They are options for people to turn xmon features on or off. Maybe people >> don't need this. >> However I am not a fan of debugfs this time as I am used to using xmon >> cmds. :) >> >> Hi, Guilherme >> So in the end, my thought is that: 1) cmd x|X will exit xmon and keep xmon >> in the original state (indicated by var xmon_off). >> 2) Then add options to turn some features on/off. And debugfs may not fit >> for this. But I am also wondering at the same time, do people need this? > Hi Nick and Xinhui, thanks very much for the feedback. > I agree, we should keep xmon in the state it was firstly set, at boot > time - dropping to the debugger using sysrq shouldn't change it. > Yes, and feel free to include my fix patch "powerpc/xmon: Fix an unexpected xmon onoff state change" :) > Now, the use case of the debugfs approach is to allow the user to > enable/disable xmon without needing to drop into the debugger itself, or > reboot the machine. Good, got it. We look forward to your new patch. :) thanks xinhui > Imagine a scenario in which we have a production machine, and: > > > i) For some reason, somebody kept xmon enabled on grub.cfg and now, we > want to let kdump work in case of crash - how to disable xmon at runtime? 
> > ii) The opposite: xmon wasn't enabled at boot time on a production machine, > but we have a super-rare issue and want to drop to xmon next time it > happens, so we need to enable it. But we don't want to drop into the > debugger just to force it to get enabled, so how do we enable it? > > Regarding the place of the xmon state file, I believe debugfs is the > right place - where else could we add it? procfs? configfs? > > Thanks, > > > Guilherme >> >> thanks >> xinhui >> >>> Thanks, >>> Nick >>> >>
[PATCH] KVM: Prevent double-free on HPT resize commit path
resize_hpt_release(), called once the HPT resize of a KVM guest is completed (successfully or unsuccessfully), free()s the state structure for the resize. It is currently not safe to call with a NULL pointer. However, one of the error paths in kvm_vm_ioctl_resize_hpt_commit() can invoke it with a NULL pointer. This will occur if userspace improperly invokes KVM_PPC_RESIZE_HPT_COMMIT without previously calling KVM_PPC_RESIZE_HPT_PREPARE, or if it calls COMMIT twice without an intervening PREPARE. To fix this potential crash bug, and maybe others like it, make it safe (and a no-op) to call resize_hpt_release() with a NULL resize pointer. Found by Dan Carpenter with a static checker. Reported-by: Dan Carpenter Signed-off-by: David Gibson --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 013552f..72ccac2 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -1407,6 +1407,9 @@ static void resize_hpt_release(struct kvm *kvm, struct kvm_resize_hpt *resize) { BUG_ON(kvm->arch.resize_hpt != resize); + if (!resize) + return; + if (resize->hpt.virt) kvmppc_free_hpt(&resize->hpt); -- 2.9.3
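The pattern being applied here, a teardown helper that tolerates NULL the way free() does so that every error path can call it unconditionally, can be sketched in userspace C (names are illustrative, not the kernel's):

```c
#include <stdlib.h>

struct toy_resize {
	int dummy;
};

static int release_count;

/*
 * Like free(): a NULL argument is a no-op, so callers on error paths
 * never have to guard the call.  Without the NULL check, the first
 * dereference of 'r' would crash.
 */
static void toy_release(struct toy_resize *r)
{
	if (!r)
		return;		/* the fix: tolerate NULL */
	release_count++;	/* stands in for freeing the HPT, etc. */
	free(r);
}
```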
Re: [PATCH 2/2] powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler
Hi Mukesh, > Converts all the explicit return numbers to the more proper IRQ_HANDLED, > which is the appropriate return value for an interrupt handler. This looks good to me, but can you describe the effects of those changes to the interrupt handler's return code? ie, what happened in the erroneous case where we returned 0 (== IRQ_NONE) - does this fix a user-visible issue? Cheers, Jeremy
Re: [PATCH 1/2] powerpc/powernv/opal-dump : Handles opal_dump_info properly
Hi Mukesh, > Moves the return value check of 'opal_dump_info' to a proper place; previously we were unnecessarily filling in all the dump info even on failure. Acked-by: Jeremy Kerr Thanks! Jeremy
[PATCH] powerpc/xmon: Fix an unexpected xmon onoff state change
Once xmon is triggered by sysrq-x, it stays enabled afterwards even if it was disabled during boot. This will cause a system reset interrupt to fail to dump. So keep xmon in its original state after exit. Signed-off-by: Pan Xinhui --- arch/powerpc/xmon/xmon.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c index 9c0e17c..721212f 100644 --- a/arch/powerpc/xmon/xmon.c +++ b/arch/powerpc/xmon/xmon.c @@ -76,6 +76,7 @@ static int xmon_gate; #endif /* CONFIG_SMP */ static unsigned long in_xmon __read_mostly = 0; +static int xmon_off = 0; static unsigned long adrs; static int size = 1; @@ -3250,6 +3251,8 @@ static void sysrq_handle_xmon(int key) /* ensure xmon is enabled */ xmon_init(1); debugger(get_irq_regs()); + if (xmon_off) + xmon_init(0); } static struct sysrq_key_op sysrq_xmon_op = { @@ -3266,7 +3269,7 @@ static int __init setup_xmon_sysrq(void) __initcall(setup_xmon_sysrq); #endif /* CONFIG_MAGIC_SYSRQ */ -static int __initdata xmon_early, xmon_off; +static int __initdata xmon_early; static int __init early_parse_xmon(char *p) { -- 2.4.11
linux-next: manual merge of the kvm tree with the powerpc tree
Hi all, Today's linux-next merge of the kvm tree got a conflict in: arch/powerpc/kvm/book3s_hv_rm_xics.c between commit: ab9bad0ead9a ("powerpc/powernv: Remove separate entry for OPAL real mode calls") from the powerpc tree and commit: 21acd0e4df04 ("KVM: PPC: Book 3S: XICS: Don't lock twice when checking for resend") from the kvm tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/powerpc/kvm/book3s_hv_rm_xics.c index 29f43ed6d5eb,0b2e388f4cdf.. --- a/arch/powerpc/kvm/book3s_hv_rm_xics.c +++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c @@@ -35,8 -35,8 +35,8 @@@ int kvm_irq_bypass = 1 EXPORT_SYMBOL(kvm_irq_bypass); static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp *icp, - u32 new_irq); + u32 new_irq, bool check_resend); -static int xics_opal_rm_set_server(unsigned int hw_irq, int server_cpu); +static int xics_opal_set_server(unsigned int hw_irq, int server_cpu); /* -- ICS routines -- */ static void ics_rm_check_resend(struct kvmppc_xics *xics,
Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementation of saved write
"Aneesh Kumar K.V" writes: > Michael Ellerman writes: > >> "Aneesh Kumar K.V" writes: >>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h >>> b/arch/powerpc/include/asm/book3s/64/mmu-hash.h >>> index 0735d5a8049f..8720a406bbbe 100644 >>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h >>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h >>> @@ -16,6 +16,9 @@ >>> #include >>> #include >>> >>> +#ifndef __ASSEMBLY__ >>> +#include <linux/mmdebug.h> >>> +#endif >> >> I assume that's for the VM_BUG_ON() you add below. But if so wouldn't >> the #include be better placed in book3s/64/pgtable.h also? > > mmu-hash.h has got a hack that is explained below > > #ifndef __ASSEMBLY__ > #include <linux/mmdebug.h> > #endif > /* > * This is necessary to get the definition of PGTABLE_RANGE which we > * need for various slices related matters. Note that this isn't the > * complete pgtable.h but only a portion of it. > */ > #include <asm/book3s/64/pgtable.h> > > This is the only place where we do that book3s/64/pgtable.h include this > way. Everybody should include asm/pgtable.h which picks the right version > based on different config options. I don't understand how that is related. If you're adding a VM_BUG_ON() in book3s/64/pgtable.h, why isn't the include of mmdebug.h in that file also? cheers
Re: [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
On Tue, 14 Feb 2017 21:59:23 +1100 Michael Ellerman wrote: > "Aneesh Kumar K.V" writes: > > > On Tuesday 14 February 2017 11:19 AM, Michael Ellerman wrote: > >> "Aneesh Kumar K.V" writes: > >> > >>> Autonuma preserves the write permission across numa fault to avoid taking > >>> a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: > >>> preserve PTE > >>> write permissions across a NUMA hinting fault"). Architectures can > >>> implement > >>> protnone in different ways and some may choose to implement that by > >>> clearing Read/ > >>> Write/Exec bits of the pte. Setting the write bit on such a pte can result in > >>> wrong > >>> behaviour. Fix this up by allowing the arch to override how to save the write > >>> bit > >>> on a protnone pte. > >> This is pretty obviously a nop on arches that don't implement the new > >> hooks, but it'd still be good to get an ack from someone in mm land > >> before I merge it. > > > > > > To get it to apply cleanly you may need > > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch > > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes-fix.patch > > Ah OK, I missed those. > > In that case these two should probably go via Andrew's tree. Done. But mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch is on hold because Aneesh saw a testing issue, so these two are also on hold.
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
"Guilherme G. Piccoli" writes: > On 14/02/2017 01:58, Pan Xinhui wrote: >> On 2017/2/14 10:35, Nicholas Piggin wrote: >>> On Mon, 13 Feb 2017 19:00:42 -0200 >>> >>> xmon state changing after the first sysrq+x violates the principle of least >>> astonishment, so I think that should be fixed. >>> >> hi, Nick >> yes, as long as xmon is disabled during boot, it should still be disabled >> after exiting xmon. >> My patch does not fix that, as it needs people to add one more char 'z' following >> 'x'. >> I will provide a new patch to fix that. >> >>> Then the question is, is it worth making it runtime configurable with an xmon >>> command or debugfs tunables? >>> >> They are options for people to turn xmon features on or off. Maybe people >> don't need this. >> However I am not a fan of debugfs this time as I am used to using xmon >> cmds. :) >> >> Hi, Guilherme >> So in the end, my thought is that: 1) cmd x|X will exit xmon and keep xmon >> in the original state (indicated by var xmon_off). >> 2) Then add options to turn some features on/off. And debugfs may not fit >> for this. But I am also wondering at the same time, do people need this? > > Hi Nick and Xinhui, thanks very much for the feedback. > I agree, we should keep xmon in the state it was firstly set, at boot > time - dropping to the debugger using sysrq shouldn't change it. > > Now, the use case of the debugfs approach is to allow the user to > enable/disable xmon without needing to drop into the debugger itself, or > reboot the machine. > Imagine a scenario in which we have a production machine, and: > > > i) For some reason, somebody kept xmon enabled on grub.cfg and now, we > want to let kdump work in case of crash - how to disable xmon at runtime? > > ii) The opposite: xmon wasn't enabled at boot time on a production machine, > but we have a super-rare issue and want to drop to xmon next time it > happens, so we need to enable it. But we don't want to drop into the > debugger just to force it to get enabled, so how do we enable it? Yep. 
I think both of those are good reasons to add a debugfs file. > Regarding the place of the xmon state file, I believe debugfs is the > right place - where else could we add it? procfs? configfs? debugfs, under powerpc_debugfs_root. cheers
Re: [bug report] powerpc/powernv: Support EEH reset for VF PE
On Tue, 2017-02-14 at 16:39 +0300, Dan Carpenter wrote: > Hello Wei Yang, > > The patch 9312bc5bab59: "powerpc/powernv: Support EEH reset for VF > PE" from Mar 4, 2016, leads to the following static checker warning: > > arch/powerpc/platforms/powernv/eeh-powernv.c:1033 pnv_eeh_reset_vf_pe() > info: return a literal instead of 'ret' Hey Dan, Out of curiosity, what static analysis tool are you using? > > arch/powerpc/platforms/powernv/eeh-powernv.c > 1019 static int pnv_eeh_reset_vf_pe(struct eeh_pe *pe, int option) > 1020 { > 1021 struct eeh_dev *edev; > 1022 struct pci_dn *pdn; > 1023 int ret; > 1024 > 1025 /* The VF PE should have only one child device */ > 1026 edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, > list); > 1027 pdn = eeh_dev_to_pdn(edev); > 1028 if (!pdn) > 1029 return -ENXIO; > 1030 > 1031 ret = pnv_eeh_do_flr(pdn, option); > 1032 if (!ret) > 1033 return ret; > > Presumably if pnv_eeh_do_flr() succeeds then we don't need to continue? > pnv_eeh_do_af_flr() is a fall back option? Doing a: > > return 0; > > Makes this more deliberate looking. Sometimes people get their if > statements reversed is the other option. I can send a patch for this. > > 1034 > 1035 return pnv_eeh_do_af_flr(pdn, option); > 1036 } > > regards, > dan carpenter
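Dan's reading of the code, try the primary reset first and only fall back when it fails, can be illustrated with a userspace sketch (the function names here are made up, not the kernel's):

```c
/* Stand-ins for pnv_eeh_do_flr() / pnv_eeh_do_af_flr() behaviour. */
static int primary_ok;		/* test knob: does the primary reset work? */
static int fallback_calls;

static int do_primary_reset(void)
{
	return primary_ok ? 0 : -1;	/* 0 == success, kernel-style */
}

static int do_fallback_reset(void)
{
	fallback_calls++;
	return 0;
}

/*
 * The fallback pattern from pnv_eeh_reset_vf_pe(): returning a literal
 * 0 on the success path (rather than 'ret') makes the intent explicit,
 * which is what the static checker warning was hinting at.
 */
static int reset_pe(void)
{
	if (!do_primary_reset())
		return 0;	/* primary reset succeeded, stop here */
	return do_fallback_reset();
}
```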
Re: [PATCH 1/5] selftests: Fix selftests build to just build, not run tests
On 15 February 2017 03:14:24 GMT+11:00, Shuah Khan wrote: >On 02/13/2017 07:09 PM, Michael Ellerman wrote: >> Michael Ellerman writes: >> >>> In commit 88baa78d1f31 ("selftests: remove duplicated all and clean >>> target"), the "all" target was removed from individual Makefiles and >>> added to lib.mk. >>> >>> However the "all" target was added to lib.mk *after* the existing >>> "runtests" target. This means "runtests" becomes the first (default) >>> target for most of our Makefiles. >> ... >>> >>> Fix it by moving the "all" target to the start of lib.mk, making it >the >>> default target. >>> >>> Fixes: 88baa78d1f31 ("selftests: remove duplicated all and clean >target") >>> Signed-off-by: Michael Ellerman >> >> Hi Shuah, >> >> Can you please merge this series into linux-next? >> >> The selftests are badly broken otherwise. >> >> cheers >> > >Hi Michael, > >Thanks. All 5 patches are now in linux-kselftest next with Tested-by >tag from Bamvor for 1,2,3. Thanks! -- Sent from my Android phone with K-9 Mail. Please excuse my brevity.
[PATCH 1/3] powerpc: kprobes: add support for KPROBES_ON_FTRACE
Allow kprobes to be placed on ftrace _mcount() call sites. This optimization avoids the use of a trap, by riding on ftrace infrastructure. This depends on HAVE_DYNAMIC_FTRACE_WITH_REGS which depends on MPROFILE_KERNEL, which is only currently enabled on powerpc64le with newer toolchains. Based on the x86 code by Masami. Signed-off-by: Naveen N. Rao--- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/kprobes.h | 10 arch/powerpc/kernel/Makefile | 3 ++ arch/powerpc/kernel/kprobes-ftrace.c | 100 +++ arch/powerpc/kernel/kprobes.c| 4 +- arch/powerpc/kernel/optprobes.c | 3 ++ 6 files changed, 120 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/kernel/kprobes-ftrace.c diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 260dd6a371e0..78419919556d 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -101,6 +101,7 @@ config PPC select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU) select HAVE_KPROBES select HAVE_OPTPROBES if PPC64 + select HAVE_KPROBES_ON_FTRACE select HAVE_ARCH_KGDB select HAVE_KRETPROBES select HAVE_ARCH_TRACEHOOK diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h index e7ada061aa12..3305a12286fa 100644 --- a/arch/powerpc/include/asm/kprobes.h +++ b/arch/powerpc/include/asm/kprobes.h @@ -153,6 +153,16 @@ extern int kprobe_exceptions_notify(struct notifier_block *self, extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr); extern int kprobe_handler(struct pt_regs *regs); extern int kprobe_post_handler(struct pt_regs *regs); +#ifdef CONFIG_KPROBES_ON_FTRACE +extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs, + struct kprobe_ctlblk *kcb); +#else +static inline int skip_singlestep(struct kprobe *p, struct pt_regs *regs, + struct kprobe_ctlblk *kcb) +{ + return 0; +} +#endif #else static inline int kprobe_handler(struct pt_regs *regs) { return 0; } static inline int kprobe_post_handler(struct pt_regs *regs) { return 0; } diff --git 
a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index a048b37b9b27..88b21427ccc7 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -101,6 +101,7 @@ obj-$(CONFIG_BOOTX_TEXT)+= btext.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_KPROBES) += kprobes.o obj-$(CONFIG_OPTPROBES)+= optprobes.o optprobes_head.o +obj-$(CONFIG_KPROBES_ON_FTRACE)+= kprobes-ftrace.o obj-$(CONFIG_UPROBES) += uprobes.o obj-$(CONFIG_PPC_UDBG_16550) += legacy_serial.o udbg_16550.o obj-$(CONFIG_STACKTRACE) += stacktrace.o @@ -154,6 +155,8 @@ GCOV_PROFILE_machine_kexec_32.o := n UBSAN_SANITIZE_machine_kexec_32.o := n GCOV_PROFILE_kprobes.o := n UBSAN_SANITIZE_kprobes.o := n +GCOV_PROFILE_kprobes-ftrace.o := n +UBSAN_SANITIZE_kprobes-ftrace.o := n UBSAN_SANITIZE_vdso.o := n extra-$(CONFIG_PPC_FPU)+= fpu.o diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c new file mode 100644 index ..0377b3013723 --- /dev/null +++ b/arch/powerpc/kernel/kprobes-ftrace.c @@ -0,0 +1,100 @@ +/* + * Dynamic Ftrace based Kprobes Optimization + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) Hitachi Ltd., 2012 + * Copyright 2016 Naveen N. 
Rao + * IBM Corporation + */ +#include +#include +#include +#include +#include + +static nokprobe_inline +int __skip_singlestep(struct kprobe *p, struct pt_regs *regs, + struct kprobe_ctlblk *kcb, unsigned long orig_nip) +{ + /* +* Emulate singlestep (and also recover regs->nip) +* as if there is a nop +*/ + regs->nip = (unsigned long)p->addr + MCOUNT_INSN_SIZE; + if (unlikely(p->post_handler)) { + kcb->kprobe_status = KPROBE_HIT_SSDONE; + p->post_handler(p, regs, 0); + } + __this_cpu_write(current_kprobe, NULL); + if (orig_nip) + regs->nip = orig_nip; + return 1; +} + +int
[PATCH 3/3] powerpc: kprobes: prefer ftrace when probing function entry
KPROBES_ON_FTRACE avoids much of the overhead with regular kprobes as it eliminates the need for a trap, as well as the need to emulate or single-step instructions. Though OPTPROBES provides us with similar performance, we have limited optprobes trampoline slots. As such, when asked to probe at a function entry, default to using the ftrace infrastructure. With: # cd /sys/kernel/debug/tracing # echo 'p _do_fork' > kprobe_events before patch: # cat ../kprobes/list c00daf08 k _do_fork+0x8[DISABLED] c0044fc0 k kretprobe_trampoline+0x0[OPTIMIZED] and after patch: # cat ../kprobes/list c00d074c k _do_fork+0xc[DISABLED][FTRACE] c00412b0 k kretprobe_trampoline+0x0[OPTIMIZED] Signed-off-by: Naveen N. Rao--- arch/powerpc/include/asm/kprobes.h | 20 1 file changed, 20 insertions(+) diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h index 3305a12286fa..09e74edee625 100644 --- a/arch/powerpc/include/asm/kprobes.h +++ b/arch/powerpc/include/asm/kprobes.h @@ -60,12 +60,32 @@ extern kprobe_opcode_t optprobe_template_end[]; #ifdef PPC64_ELF_ABI_v2 /* PPC64 ABIv2 needs local entry point */ +#ifdef CONFIG_KPROBES_ON_FTRACE +/* + * Per livepatch.h, ftrace location is always within the first 16 bytes + * of a function on powerpc with -mprofile-kernel. + */ +#define kprobe_lookup_name(name, addr, offset) \ +{ \ + addr = (kprobe_opcode_t *)kallsyms_lookup_name(name); \ + if (addr && !(offset)) {\ + unsigned long faddr;\ + faddr = ftrace_location_range((unsigned long)addr, \ + (unsigned long)addr + 16);\ + if (faddr) \ + addr = (kprobe_opcode_t *)faddr;\ + else\ + addr = (kprobe_opcode_t *)ppc_function_entry(addr); \ + } \ +} +#else #define kprobe_lookup_name(name, addr, offset) \ { \ addr = (kprobe_opcode_t *)kallsyms_lookup_name(name); \ if (addr && !(offset)) \ addr = (kprobe_opcode_t *)ppc_function_entry(addr); \ } +#endif #elif defined(PPC64_ELF_ABI_v1) /* * 64bit powerpc ABIv1 uses function descriptors: -- 2.11.0
[PATCH 2/3] powerpc: introduce a new helper to obtain function entry points
kprobe_lookup_name() is specific to the kprobe subsystem and may not always return the function entry point (in a subsequent patch). For looking up function entry points, introduce a separate helper and use the same in optprobes.c Signed-off-by: Naveen N. Rao--- arch/powerpc/include/asm/code-patching.h | 37 arch/powerpc/kernel/optprobes.c | 4 ++-- 2 files changed, 39 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h index 8ab937771068..3e994f404434 100644 --- a/arch/powerpc/include/asm/code-patching.h +++ b/arch/powerpc/include/asm/code-patching.h @@ -12,6 +12,8 @@ #include #include +#include +#include /* Flags for create_branch: * "b" == create_branch(addr, target, 0); @@ -99,6 +101,41 @@ static inline unsigned long ppc_global_function_entry(void *func) #endif } +/* + * Wrapper around kallsyms_lookup() to return function entry address: + * - For ABIv1, we lookup the dot variant. + * - For ABIv2, we return the local entry point. 
+ */ +static inline unsigned long ppc_kallsyms_lookup_name(const char *name) +{ + unsigned long addr; +#ifdef PPC64_ELF_ABI_v1 + /* check for dot variant */ + char dot_name[1 + KSYM_NAME_LEN]; + bool dot_appended = false; + if (name[0] != '.') { + dot_name[0] = '.'; + dot_name[1] = '\0'; + strncat(dot_name, name, KSYM_NAME_LEN - 2); + dot_appended = true; + } else { + dot_name[0] = '\0'; + strncat(dot_name, name, KSYM_NAME_LEN - 1); + } + addr = kallsyms_lookup_name(dot_name); + if (!addr && dot_appended) + /* Let's try the original non-dot symbol lookup */ + addr = kallsyms_lookup_name(name); +#elif defined(PPC64_ELF_ABI_v2) + addr = kallsyms_lookup_name(name); + if (addr) + addr = ppc_function_entry((void *)addr); +#else + addr = kallsyms_lookup_name(name); +#endif + return addr; +} + #ifdef CONFIG_PPC64 /* * Some instruction encodings commonly used in dynamic ftracing diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c index a8f414a0b141..a2f6f28e5205 100644 --- a/arch/powerpc/kernel/optprobes.c +++ b/arch/powerpc/kernel/optprobes.c @@ -246,8 +246,8 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p) /* * 2. branch to optimized_callback() and emulate_step() */ - kprobe_lookup_name("optimized_callback", op_callback_addr, 0); - kprobe_lookup_name("emulate_step", emulate_step_addr, 0); + op_callback_addr = (kprobe_opcode_t *)ppc_kallsyms_lookup_name("optimized_callback"); + emulate_step_addr = (kprobe_opcode_t *)ppc_kallsyms_lookup_name("emulate_step"); if (!op_callback_addr || !emulate_step_addr) { WARN(1, "kprobe_lookup_name() failed\n"); goto error; -- 2.11.0
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
On 14/02/2017 09:37, Michael Ellerman wrote: > "Guilherme G. Piccoli"writes: > >> Currently the xmon debugger is set only via kernel boot command-line. >> It's disabled by default, and can be enabled with "xmon=on" on the >> command-line. Also, xmon may be accessed via sysrq mechanism, but once >> we enter xmon via sysrq, it's kept enabled until system is rebooted, >> even if we exit the debugger. A kernel crash will then lead to xmon >> instance, instead of triggering a kdump procedure (if configured), for >> example. >> >> This patch introduces a debugfs entry for xmon, allowing user to query >> its current state and change it if desired. Basically, the "xmon" file >> to read from/write to is under the debugfs mount point, on powerpc >> directory. Reading this file will provide the current state of the >> debugger, one of the following: "on", "off", "early" or "nobt". Writing >> one of these states to the file will take immediate effect on the debugger. > > I like this in general. > > But I think we can simplify it a bit. > > I don't think we need the nobt state anymore. As far as I can see it was > added as a way to reinstate the old behaviour when the auto backtrace > mode was added, but I've never heard of anyone using it. > > If anyone hits a crash where they really need that feature they can > always just hack the code to disable the backtrace. > > So I think step 1 is a patch to drop the xmon_no_auto_backtrace stuff. > > Also I'm not sure EARLY needs to be a separate state. It's just a > one-off invocation at boot, all we really need is just a single bit of > state communicated from early_parse_xmon() to xmon_setup(), ie. a static > bool would do. > > If we agree with that, then there's only two states left, on and off, in > which case it can probably just be an int - and we can use a simple > attribute file rather than custom parsing. > Thanks for the good suggestions Michael, I'll change the code and submit a new version. 
>> * I had this patch partially done for some time, and after a discussion at the kernel slack channel last week, I decided to rebase and fix some remaining bugs. I'd change the 'x' option to always disable the debugger, > > Not quite. > > 'x' should exit and leave xmon in whatever state it was previously in. Agreed! Thanks, Guilherme > > cheers >
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
On 14/02/2017 01:58, Pan Xinhui wrote: > > > On 2017/2/14 10:35, Nicholas Piggin wrote: >> On Mon, 13 Feb 2017 19:00:42 -0200 >> >> xmon state changing after the first sysrq+x violates the principle of least astonishment, so I think that should be fixed. >> > hi, Nick > yes, as long as xmon is disabled during boot, it should still be disabled after exiting xmon. > My patch does not fix that, as it needs people to add one more char 'z' following 'x'. > I will provide a new patch to fix that. > >> Then the question is, is it worth making it runtime configurable with xmon commands or debugfs tunables? >> > They are options for people to turn xmon features on or off. Maybe people don't need this. > However, I am not a fan of debugfs this time as I am used to using xmon cmds. :) > > Hi, Guilherme > So in the end, my thought is that: 1) cmd x|X will exit xmon and keep xmon in the original state (indicated by the var xmon_off). > 2) Then add options to turn some features on/off. And debugfs may not fit for this. But I am also wondering, at the same time, do people need this? Hi Nick and Xinhui, thanks very much for the feedback. I agree, we should keep xmon in the state it was first set at boot time - dropping to the debugger using sysrq shouldn't change it. Now, the use case of the debugfs approach is to allow the user to enable/disable xmon without needing to drop into the debugger itself, or reboot the machine. Imagine a scenario in which we have a production machine, and: i) For some reason, somebody kept xmon enabled in grub.cfg and now we want to let kdump work in case of crash - how do we disable xmon at runtime? ii) The opposite: xmon wasn't enabled at boot time on the production machine, but we have a super-rare issue and want to drop to xmon next time it happens, so we need to enable it. But we don't want to drop into the debugger just to force it to be enabled, so how do we enable it?
Regarding the place of the xmon state file, I believe debugfs is the right place - where else could we add it? procfs? configfs? Thanks, Guilherme > > thanks > xinhui > >> Thanks, >> Nick >> >
[PATCH 3/3] powerpc/mm: move mmap_sem unlocking in do_page_fault()
Since the fault retry is now handled earlier, we can release the mmap_sem lock earlier too and remove later unlocking previously done in mm_fault_error(). Signed-off-by: Laurent Dufour--- arch/powerpc/mm/fault.c | 19 --- 1 file changed, 4 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 2a6bc7e6e69a..21e06cce8984 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -151,13 +151,6 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) * continue the pagefault. */ if (fatal_signal_pending(current)) { - /* -* If we have retry set, the mmap semaphore will have -* alrady been released in __lock_page_or_retry(). Else -* we release it now. -*/ - if (!(fault & VM_FAULT_RETRY)) - up_read(>mm->mmap_sem); /* Coming from kernel, we need to deal with uaccess fixups */ if (user_mode(regs)) return MM_FAULT_RETURN; @@ -170,8 +163,6 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) /* Out of memory */ if (fault & VM_FAULT_OOM) { - up_read(>mm->mmap_sem); - /* * We ran out of memory, or some other thing happened to us that * made us unable to handle the page fault gracefully. 
@@ -182,10 +173,8 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) return MM_FAULT_RETURN; } - if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) { - up_read(>mm->mmap_sem); + if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) return do_sigbus(regs, addr, fault); - } /* We don't understand the fault code, this is fatal */ BUG(); @@ -452,11 +441,12 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, goto retry; } /* We will enter mm_fault_error() below */ - } + } else + up_read(>mm->mmap_sem); if (unlikely(fault & (VM_FAULT_RETRY|VM_FAULT_ERROR))) { if (fault & VM_FAULT_SIGSEGV) - goto bad_area; + goto bad_area_nosemaphore; rc = mm_fault_error(regs, address, fault); if (rc >= MM_FAULT_RETURN) goto bail; @@ -488,7 +478,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, regs, address); } - up_read(>mmap_sem); goto bail; bad_area: -- 2.7.4
[PATCH 2/3] powerpc/mm: handle VM_FAULT_RETRY earlier
In do_page_fault() if handle_mm_fault() returns VM_FAULT_RETRY, retry the page fault handling before anything else. This would simplify the handling of the mmap_sem lock in this part of the code. Signed-off-by: Laurent Dufour--- arch/powerpc/mm/fault.c | 67 - 1 file changed, 38 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index ee09604bbe12..2a6bc7e6e69a 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -434,6 +434,26 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, * the fault. */ fault = handle_mm_fault(vma, address, flags); + + /* +* Handle the retry right now, the mmap_sem has been released in that +* case. +*/ + if (unlikely(fault & VM_FAULT_RETRY)) { + /* We retry only once */ + if (flags & FAULT_FLAG_ALLOW_RETRY) { + /* +* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk +* of starvation. +*/ + flags &= ~FAULT_FLAG_ALLOW_RETRY; + flags |= FAULT_FLAG_TRIED; + if (!fatal_signal_pending(current)) + goto retry; + } + /* We will enter mm_fault_error() below */ + } + if (unlikely(fault & (VM_FAULT_RETRY|VM_FAULT_ERROR))) { if (fault & VM_FAULT_SIGSEGV) goto bad_area; @@ -445,38 +465,27 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, } /* -* Major/minor page fault accounting is only done on the -* initial attempt. If we go through a retry, it is extremely -* likely that the page will be found in page cache at that point. +* Major/minor page fault accounting. 
*/ - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_MAJOR) { - current->maj_flt++; - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, - regs, address); + if (fault & VM_FAULT_MAJOR) { + current->maj_flt++; + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, + regs, address); #ifdef CONFIG_PPC_SMLPAR - if (firmware_has_feature(FW_FEATURE_CMO)) { - u32 page_ins; - - preempt_disable(); - page_ins = be32_to_cpu(get_lppaca()->page_ins); - page_ins += 1 << PAGE_FACTOR; - get_lppaca()->page_ins = cpu_to_be32(page_ins); - preempt_enable(); - } -#endif /* CONFIG_PPC_SMLPAR */ - } else { - current->min_flt++; - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, - regs, address); - } - if (fault & VM_FAULT_RETRY) { - /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk -* of starvation. */ - flags &= ~FAULT_FLAG_ALLOW_RETRY; - flags |= FAULT_FLAG_TRIED; - goto retry; + if (firmware_has_feature(FW_FEATURE_CMO)) { + u32 page_ins; + + preempt_disable(); + page_ins = be32_to_cpu(get_lppaca()->page_ins); + page_ins += 1 << PAGE_FACTOR; + get_lppaca()->page_ins = cpu_to_be32(page_ins); + preempt_enable(); } +#endif /* CONFIG_PPC_SMLPAR */ + } else { + current->min_flt++; + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, + regs, address); } up_read(>mmap_sem); -- 2.7.4
[PATCH 1/3] powerpc/mm: move mmap_sem unlock up from do_sigbus
Move the mmap_sem release into do_sigbus()'s only caller, mm_fault_error(). No functional changes. Signed-off-by: Laurent Dufour --- arch/powerpc/mm/fault.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 62a50d6d1053..ee09604bbe12 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -119,8 +119,6 @@ static int do_sigbus(struct pt_regs *regs, unsigned long address, siginfo_t info; unsigned int lsb = 0; - up_read(&current->mm->mmap_sem); - if (!user_mode(regs)) return MM_FAULT_ERR(SIGBUS); @@ -184,8 +182,10 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) return MM_FAULT_RETURN; } - if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) + if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) { + up_read(&current->mm->mmap_sem); return do_sigbus(regs, addr, fault); + } /* We don't understand the fault code, this is fatal */ BUG(); -- 2.7.4
[PATCH 0/3] powerpc/mm: page fault handler cleaning
This series attempts to clean the page fault handler in the way it has been done previously for the x86 architecture [1]. The goal is to manage the mmap_sem earlier and only in do_page_fault(). This done by handling the retry case earlier, before handling the error case. This way the semaphore can be released earlier and the error path processed without holding it. The first patch is just moving a unlock to the caller of the service, which as no functional impact. The second patch is handling the retry case earlier in do_page_fault(). This is where most the change are done, but I was conservative here, not changing the use of mm_fault_error() in the case of the second retry. It may be smarter to handle that case separately but this may create duplicate code. The last patch is moving up semaphore releasing from mm_fault_error() to do_page_fault(). [1] see commits from Linus Torvalds 26178ec11ef3 ("x86: mm: consolidate VM_FAULT_RETRY handling") 7fb08eca4527 ("x86: mm: move mmap_sem unlock from mm_fault_error() to caller") Laurent Dufour (3): powerpc/mm: move mmap_sem unlock up from do_sigbus powerpc/mm: handle VM_FAULT_RETRY earlier powerpc/mm: move mmap_sem unlocking in do_page_fault() arch/powerpc/mm/fault.c | 82 - 1 file changed, 40 insertions(+), 42 deletions(-) -- 2.7.4
Re: [PATCH 1/5] selftests: Fix selftests build to just build, not run tests
On 02/13/2017 07:09 PM, Michael Ellerman wrote: > Michael Ellerman writes: > >> In commit 88baa78d1f31 ("selftests: remove duplicated all and clean target"), the "all" target was removed from individual Makefiles and added to lib.mk. >> >> However the "all" target was added to lib.mk *after* the existing "runtests" target. This means "runtests" becomes the first (default) target for most of our Makefiles. > ... >> >> Fix it by moving the "all" target to the start of lib.mk, making it the default target. >> >> Fixes: 88baa78d1f31 ("selftests: remove duplicated all and clean target") >> Signed-off-by: Michael Ellerman > > Hi Shuah, > > Can you please merge this series into linux-next? > > The selftests are badly broken otherwise. > > cheers > Hi Michael, Thanks. All 5 patches are now in linux-kselftest next with Tested-by tag from Bamvor for 1, 2, 3. thank you both, -- Shuah
[bug report] powernv/opal: Convert opal message events to opal irq domain
Hello Alistair Popple, The patch a295af24d0d2: "powernv/opal: Convert opal message events to opal irq domain" from May 15, 2015, leads to the following static checker warning: arch/powerpc/platforms/powernv/opal.c:297 opal_message_init() info: return a literal instead of 'irq' arch/powerpc/platforms/powernv/opal.c 286 static int __init opal_message_init(void) 287 { 288 int ret, i, irq; 289 290 for (i = 0; i < OPAL_MSG_TYPE_MAX; i++) 291 ATOMIC_INIT_NOTIFIER_HEAD(&opal_msg_notifier_head[i]); 292 293 irq = opal_event_request(ilog2(OPAL_EVENT_MSG_PENDING)); 294 if (!irq) { 295 pr_err("%s: Can't register OPAL event irq (%d)\n", 296 __func__, irq); 297 return irq; This code doesn't really make sense. I'm not really certain what it should be... 298 } 299 300 ret = request_irq(irq, opal_message_notify, 301 IRQ_TYPE_LEVEL_HIGH, "opal-msg", NULL); 302 if (ret) { 303 pr_err("%s: Can't request OPAL event irq (%d)\n", 304 __func__, ret); 305 return ret; 306 } 307 308 return 0; 309 } regards, dan carpenter
[PATCH v4 2/2] powerpc: emulate_step tests for load/store instructions
Add new selftest that test emulate_step for Normal, Floating Point, Vector and Vector Scalar - load/store instructions. Test should run at boot time if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Sample log: [0.762063] emulate_step smoke test: start. [0.762219] emulate_step smoke test: ld : PASS [0.762434] emulate_step smoke test: lwz: PASS [0.762653] emulate_step smoke test: lwzx : PASS [0.762867] emulate_step smoke test: std: PASS [0.763082] emulate_step smoke test: ldarx / stdcx. : PASS [0.763302] emulate_step smoke test: lfsx : PASS [0.763514] emulate_step smoke test: stfsx : PASS [0.763727] emulate_step smoke test: lfdx : PASS [0.763942] emulate_step smoke test: stfdx : PASS [0.764134] emulate_step smoke test: lvx: PASS [0.764349] emulate_step smoke test: stvx : PASS [0.764575] emulate_step smoke test: lxvd2x : PASS [0.764788] emulate_step smoke test: stxvd2x: PASS [0.764997] emulate_step smoke test: complete. Signed-off-by: Ravi Bangoria--- arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/lib/Makefile | 2 + arch/powerpc/lib/test_emulate_step.c | 443 ++ 3 files changed, 452 insertions(+) create mode 100644 arch/powerpc/lib/test_emulate_step.c diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index d99bd44..e7d6d86 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -284,6 +284,13 @@ #define PPC_INST_BRANCH_COND 0x4080 #define PPC_INST_LBZCIX0x7c0006aa #define PPC_INST_STBCIX0x7c0007aa +#define PPC_INST_LWZX 0x7c2e +#define PPC_INST_LFSX 0x7c00042e +#define PPC_INST_STFSX 0x7c00052e +#define PPC_INST_LFDX 0x7c0004ae +#define PPC_INST_STFDX 0x7c0005ae +#define PPC_INST_LVX 0x7cce +#define PPC_INST_STVX 0x7c0001ce /* macros to insert fields into opcodes */ #define ___PPC_RA(a) (((a) & 0x1f) << 16) diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index 0e649d7..c0d6e79 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ 
-32,4 +32,6 @@ obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o obj-$(CONFIG_ALTIVEC) += xor_vmx.o CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) +obj64-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o + obj-$(CONFIG_PPC64) += $(obj64-y) diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c new file mode 100644 index 000..4e3bce9 --- /dev/null +++ b/arch/powerpc/lib/test_emulate_step.c @@ -0,0 +1,443 @@ +/* + * test_emulate_step.c - simple sanity test for emulate_step load/store + * instructions + * + * Copyright IBM Corp. 2016 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + */ + +#define pr_fmt(fmt) "emulate_step smoke test: " fmt + +#include +#include +#include + +#define IMM_L(i) ((uintptr_t)(i) & 0x) + +/* + * Defined with TEST_ prefix so it does not conflict with other + * definitions. 
+ */ +#define TEST_LD(r, base, i)(PPC_INST_LD | ___PPC_RT(r) | \ + ___PPC_RA(base) | IMM_L(i)) +#define TEST_LWZ(r, base, i) (PPC_INST_LWZ | ___PPC_RT(r) | \ + ___PPC_RA(base) | IMM_L(i)) +#define TEST_LWZX(t, a, b) (PPC_INST_LWZX | ___PPC_RT(t) | \ + ___PPC_RA(a) | ___PPC_RB(b)) +#define TEST_STD(r, base, i) (PPC_INST_STD | ___PPC_RS(r) | \ + ___PPC_RA(base) | ((i) & 0xfffc)) +#define TEST_LDARX(t, a, b, eh)(PPC_INST_LDARX | ___PPC_RT(t) | \ + ___PPC_RA(a) | ___PPC_RB(b) | \ + __PPC_EH(eh)) +#define TEST_STDCX(s, a, b)(PPC_INST_STDCX | ___PPC_RS(s) |\ + ___PPC_RA(a) | ___PPC_RB(b)) +#define TEST_LFSX(t, a, b) (PPC_INST_LFSX | ___PPC_RT(t) | \ + ___PPC_RA(a) | ___PPC_RB(b)) +#define TEST_STFSX(s, a, b)
[PATCH v4 1/2] powerpc: Emulation support for load/store instructions on LE
emulate_step() uses a number of underlying kernel functions that were initially not enabled for LE. This has been rectified since. So, fix emulate_step() for LE for the corresponding instructions. Reported-by: Anton BlanchardSigned-off-by: Ravi Bangoria --- arch/powerpc/lib/sstep.c | 20 1 file changed, 20 deletions(-) diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c index 846dba2..9c542ec 100644 --- a/arch/powerpc/lib/sstep.c +++ b/arch/powerpc/lib/sstep.c @@ -1799,8 +1799,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto instr_done; case LARX: - if (regs->msr & MSR_LE) - return 0; if (op.ea & (size - 1)) break; /* can't handle misaligned */ if (!address_ok(regs, op.ea, size)) @@ -1823,8 +1821,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto ldst_done; case STCX: - if (regs->msr & MSR_LE) - return 0; if (op.ea & (size - 1)) break; /* can't handle misaligned */ if (!address_ok(regs, op.ea, size)) @@ -1849,8 +1845,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto ldst_done; case LOAD: - if (regs->msr & MSR_LE) - return 0; err = read_mem(>gpr[op.reg], op.ea, size, regs); if (!err) { if (op.type & SIGNEXT) @@ -1862,8 +1856,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #ifdef CONFIG_PPC_FPU case LOAD_FP: - if (regs->msr & MSR_LE) - return 0; if (size == 4) err = do_fp_load(op.reg, do_lfs, op.ea, size, regs); else @@ -1872,15 +1864,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #endif #ifdef CONFIG_ALTIVEC case LOAD_VMX: - if (regs->msr & MSR_LE) - return 0; err = do_vec_load(op.reg, do_lvx, op.ea & ~0xfUL, regs); goto ldst_done; #endif #ifdef CONFIG_VSX case LOAD_VSX: - if (regs->msr & MSR_LE) - return 0; err = do_vsx_load(op.reg, do_lxvd2x, op.ea, regs); goto ldst_done; #endif @@ -1903,8 +1891,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto instr_done; case STORE: - if (regs->msr & 
MSR_LE) - return 0; if ((op.type & UPDATE) && size == sizeof(long) && op.reg == 1 && op.update_reg == 1 && !(regs->msr & MSR_PR) && @@ -1917,8 +1903,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #ifdef CONFIG_PPC_FPU case STORE_FP: - if (regs->msr & MSR_LE) - return 0; if (size == 4) err = do_fp_store(op.reg, do_stfs, op.ea, size, regs); else @@ -1927,15 +1911,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #endif #ifdef CONFIG_ALTIVEC case STORE_VMX: - if (regs->msr & MSR_LE) - return 0; err = do_vec_store(op.reg, do_stvx, op.ea & ~0xfUL, regs); goto ldst_done; #endif #ifdef CONFIG_VSX case STORE_VSX: - if (regs->msr & MSR_LE) - return 0; err = do_vsx_store(op.reg, do_stxvd2x, op.ea, regs); goto ldst_done; #endif -- 1.8.3.1
[PATCH v4 0/2] powerpc: Emulation support for load/store instructions on LE
emulate_step is the basic infrastructure which is used by number of other kernel infrastructures like kprobe, hw-breakpoint(data breakpoint) etc. In case of kprobe, enabling emulation of load/store instructions will speedup the execution of probed instruction. In case of kernel-space breakpoint, causative instruction is first get emulated before executing user registered handler. If emulation fails, hw-breakpoint is disabled with error. As emulate_step does not support load/store instructions on LE, kernel-space hw-breakpoint infrastructure is broken on LE. emulate_step() uses a number of underlying kernel functions that were initially not enabled for LE. This has been rectified since. So, fix emulate_step() for LE for the corresponding instructions. Also add selftest which will run at boot if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Changes in v4: - Used late_initcall() for selftest - Makefile changes v3 link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1332686.html Ravi Bangoria (2): powerpc: Emulation support for load/store instructions on LE powerpc: emulate_step tests for load/store instructions arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/lib/Makefile | 2 + arch/powerpc/lib/sstep.c | 20 -- arch/powerpc/lib/test_emulate_step.c | 443 ++ 4 files changed, 452 insertions(+), 20 deletions(-) create mode 100644 arch/powerpc/lib/test_emulate_step.c -- 1.8.3.1
[bug report] powerpc/powernv: Support EEH reset for VF PE
Hello Wei Yang, The patch 9312bc5bab59: "powerpc/powernv: Support EEH reset for VF PE" from Mar 4, 2016, leads to the following static checker warning: arch/powerpc/platforms/powernv/eeh-powernv.c:1033 pnv_eeh_reset_vf_pe() info: return a literal instead of 'ret' arch/powerpc/platforms/powernv/eeh-powernv.c 1019 static int pnv_eeh_reset_vf_pe(struct eeh_pe *pe, int option) 1020 { 1021 struct eeh_dev *edev; 1022 struct pci_dn *pdn; 1023 int ret; 1024 1025 /* The VF PE should have only one child device */ 1026 edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, list); 1027 pdn = eeh_dev_to_pdn(edev); 1028 if (!pdn) 1029 return -ENXIO; 1030 1031 ret = pnv_eeh_do_flr(pdn, option); 1032 if (!ret) 1033 return ret; Presumably if pnv_eeh_do_flr() succeeds then we don't need to continue? pnv_eeh_do_af_flr() is a fall back option? Doing a: return 0; Makes this look more deliberate. Sometimes people get their if statements reversed, which is the other option. 1034 1035 return pnv_eeh_do_af_flr(pdn, option); 1036 } regards, dan carpenter
Re: linux-next: manual merge of the kvm tree with the powerpc tree
On 14/02/2017 09:45, Michael Ellerman wrote: >> If possible, please pull only up to "powerpc/64: Allow for relocation-on >> interrupts from guest to host" and cherry-pick the top two patches >> ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts" and >> "powerpc/powernv: Remove separate entry for OPAL real mode calls") into >> your next branch, but leave the rest for my tree only. > > I don't see how that helps anything. > > In fact it guarantees a mess because those two commits would now go to > Linus via my tree (cherry picked) and via Paul's as part of his second > merge of the topic branch. > > So unless you can give me a good reason I'll merge the tip of the topic > branch into my next, as planned. Yes, Paul's second merge did guarantee a mess, so go ahead. However, the reason was that this is simply not how topic branches should work: topic branches should be the base for other work, they shouldn't contain _all_ the work. So the right workflow would have been: - Paul submits topic branch A to you - you merge A - Paul merges topic branch A into his "next" branch - Paul applies KVM-specific patches B1 on top of his "next" branch. - Paul sends pull request to me (with A + kvmppc work). As far as I understand, there was no reason for you to get B1. The last two patches (let's call them B2) also didn't need to go through the kvm-ppc branch at all. You could have applied them directly on top of A. Linus then would get A and B2 from you, and A and B1 from me: base -→ A -→ B1 ↓↓ ppc -→ ▪▪ ←- kvm ↓| B2 | ↓↓ torvalds/linux.git If necessary, things could have been arranged so that Linus got A and B2 from you, and all three of A/B1/B2 from me: - Paul submits topic branch B2 to you, based on topic branch A - you merge B2 - Paul merges B2 and I get it from him The result would have been: base -→ A -→ B1 ↓ ↘ ↓ ppc -→ ▪ B2 → ▪ ↓ ↙ ↓ ▪▪ ←- kvm ↓↓ torvalds/linux.git Paolo
Re: powerpc: Add a prototype for mcount() so it can be versioned
On Fri, 2017-02-10 at 02:40:02 UTC, Michael Ellerman wrote: > Currently we get a warning that _mcount() can't be versioned: > > WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed, > symbol will not be versioned. > > Add a prototype to asm-prototypes.h to fix it. > > The prototype is not really correct, mcount() is not a normal function, > it has a special ABI. But for the purpose of versioning it doesn't > matter. > > Signed-off-by: Michael Ellerman Applied to powerpc next. https://git.kernel.org/powerpc/c/99ad503287daf78e19e64e0e51f1d6 cheers
Re: powerpc: Fix confusing help text for DISABLE_MPROFILE_KERNEL
On Fri, 2017-02-10 at 01:16:59 UTC, Anton Blanchard wrote: > From: Anton Blanchard> > The final paragraph of the help text is reversed - we want to > enable this option by default, and disable it if the toolchain > has a working -mprofile-kernel. > > Signed-off-by: Anton Blanchard Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/496e9cb5b2aa2ba303d2bbd08518f9 cheers
Re: [v4,1/2] arch/powerpc: Implement Optprobes
On Wed, 2017-02-08 at 09:50:51 UTC, Anju T wrote: > Current infrastructure of kprobe uses the unconditional trap instruction > to probe a running kernel. Optprobe allows kprobe to replace the trap with > a branch instruction to a detour buffer. Detour buffer contains instructions > to create an in memory pt_regs. Detour buffer also has a call to > optimized_callback() which in turn call the pre_handler(). > After the execution of the pre-handler, a call is made for instruction > emulation. The NIP is determined in advanced through dummy instruction > emulation and a branch instruction is created to the NIP at the end of > the trampoline. > > To address the limitation of branch instruction in POWER architecture, > detour buffer slot is allocated from a reserved area. For the time being, > 64KB is reserved in memory for this purpose. > > Instructions which can be emulated using analyse_instr() are the candidates > for optimization. Before optimization ensure that the address range > between the detour buffer allocated and the instruction being probed > is within +/- 32MB. > > Signed-off-by: Anju T Sudhakar> Signed-off-by: Naveen N. Rao > Acked-by: Masami Hiramatsu Series applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/51c9c0843993528bffc920c54c2121 cheers
Re: [1/3] powerpc: asm/ppc-opcode.h: introduce __PPC_SH64()
On Wed, 2017-02-08 at 08:57:29 UTC, Anju T wrote: > From: "Naveen N. Rao"> > Introduce __PPC_SH64() as a 64-bit variant to encode shift field in some > of the shift and rotate instructions operating on double-words. Convert > some of the BPF instruction macros to use the same. > > Signed-off-by: Naveen N. Rao Series applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/c233f5979b3dbb39a5b2473b5fcaf5 cheers
Re: [3/3] powerpc: kprobes: remove kprobe_exceptions_notify()
On Tue, 2017-02-07 at 19:54:16 UTC, "Naveen N. Rao" wrote: > ... as the weak variant will do. > > Signed-off-by: Naveen N. Rao Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/0ddde5004d26c483c9e67005b2be5b cheers
Re: [1/3] kprobes: introduce weak variant of kprobe_exceptions_notify
On Tue, 2017-02-07 at 19:54:14 UTC, "Naveen N. Rao" wrote: > kprobe_exceptions_notify() is not used on some of the architectures such > as arm[64] and powerpc anymore. Introduce a weak variant for such > architectures. > > Signed-off-by: Naveen N. Rao> Acked-by: Masami Hiramatsu Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/fc62d0207ae0ebc9d19df68394c0dc cheers
Re: powerpc/powernv: Fix opal_exit tracepoint opcode
On Tue, 2017-02-07 at 10:01:01 UTC, Michael Ellerman wrote: > Currently the opal_exit tracepoint usually shows the opcode as 0: > > <idle>-0 [047] d.h. 635.654292: opal_entry: opcode=63 > <idle>-0 [047] d.h. 635.654296: opal_exit: opcode=0 retval=0 > kopald-1209 [019] d... 636.420943: opal_entry: opcode=10 > kopald-1209 [019] d... 636.420959: opal_exit: opcode=0 retval=0 > > This is because we incorrectly load the opcode into r0 before calling > __trace_opal_exit(), whereas it expects the opcode in r3 (first function > parameter). In fact we are leaving the retval in r3, so opcode and > retval will always show the same value. > > Instead load the opcode into r3, resulting in: > > <idle>-0 [040] d.h. 636.618625: opal_entry: opcode=63 > <idle>-0 [040] d.h. 636.618627: opal_exit: opcode=63 retval=0 > > Fixes: c49f63530bb6 ("powernv: Add OPAL tracepoints") > Signed-off-by: Michael Ellerman Applied to powerpc next. https://git.kernel.org/powerpc/c/a7e0fb6c2029a780444d09560f739e cheers
Re: powerpc: Fix inconsistent of_node_to_nid EXPORT_SYMBOL handling
On Wed, 2017-02-01 at 22:52:42 UTC, Shailendra Singh wrote: > The generic implementation of of_node_to_nid is EXPORT_SYMBOL. > > The powerpc implementation added by following commit is EXPORT_SYMBOL_GPL. > commit 953039c8df7b ("[PATCH] powerpc: Allow devices to register with numa > topology") > > This creates an inconsistency for of_node_to_nid callers across > architectures. > > Update the powerpc implementation to be exported consistently with the > generic implementation. > > Signed-off-by: Shailendra Singh Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/be9ba9ff93cc3e44dc46da9ed25655 cheers
Re: [-next] powerpc/pseries: Fix typo in parameter description
On Thu, 2017-01-12 at 15:09:21 UTC, Wei Yongjun wrote: > From: Wei Yongjun> > Fix typo in parameter description. > > Signed-off-by: Wei Yongjun Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/b0b5a76579ea62a9eeb720e71fdaa9 cheers
Re: [PATCHv2, 1/4] pseries: Add hypercall wrappers for hash page table resizing
On Fri, 2016-12-09 at 00:07:35 UTC, David Gibson wrote: > This adds the hypercall numbers and wrapper functions for the hash page > table resizing hypercalls. > > It also adds a new firmware feature flag to track the presence of the > HPT resizing calls. > > Signed-off-by: David Gibson> Reviewed-by: Paul Mackerras Series applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/64b40ffbc83029f035571cad9727e3 cheers
Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementation of saved write
Michael Ellerman writes: > "Aneesh Kumar K.V" writes: >> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h >> b/arch/powerpc/include/asm/book3s/64/mmu-hash.h >> index 0735d5a8049f..8720a406bbbe 100644 >> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h >> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h >> @@ -16,6 +16,9 @@ >> #include >> #include >> >> +#ifndef __ASSEMBLY__ >> +#include >> +#endif > > I assume that's for the VM_BUG_ON() you add below. But if so wouldn't > the #include be better placed in book3s/64/pgtable.h also? mmu-hash.h has got a hack that is explained below: #ifndef __ASSEMBLY__ #include #endif /* * This is necessary to get the definition of PGTABLE_RANGE which we * need for various slices related matters. Note that this isn't the * complete pgtable.h but only a portion of it. */ #include This is the only place where we include book3s/64/pgtable.h this way. Everybody else should include asm/pgtable.h, which picks the right version based on the config options. > >> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h >> b/arch/powerpc/include/asm/book3s/64/pgtable.h >> index fef738229a68..c684ef6cbd10 100644 >> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h >> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h >> @@ -512,6 +512,32 @@ static inline pte_t pte_mkhuge(pte_t pte) >> return pte; >> } >> >> +#define pte_mk_savedwrite pte_mk_savedwrite >> +static inline pte_t pte_mk_savedwrite(pte_t pte) >> +{ >> +/* >> + * Used by Autonuma subsystem to preserve the write bit >> + * while marking the pte PROT_NONE. Only allow this >> + * on PROT_NONE pte >> + */ >> +VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | >> _PAGE_PRIVILEGED)) != >> + cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED)); >> +return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED); >> +} >> + > > > cheers -aneesh
Re: [PATCH] powerpc/xmon: add turn off xmon option
Pan Xinhui writes: > Once xmon is triggered, there is no interface to turn it off again. > However there exist disable/enable xmon code flows. And more importantly, > a system reset interrupt on PowerVM will fire an oops to make a dump. At > that time, xmon should not be triggered. > > So add a 'z' option after the current 'x|X' exit commands. Turn xmon off if 'z' > follows. Thanks for trying to clear this up. But I like the solution we discussed in the other thread better. ie. entering/leaving xmon via sysrq will always leave the state unchanged, and we'll add a debugfs file to change the state. cheers
[PATCH] powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
Currently the build breaks if CMA=n and SPAPR_TCE_IOMMU=y: arch/powerpc/mm/mmu_context_iommu.c: In function ‘mm_iommu_get’: arch/powerpc/mm/mmu_context_iommu.c:193:42: error: ‘MIGRATE_CMA’ undeclared (first use in this function) if (get_pageblock_migratetype(page) == MIGRATE_CMA) { ^~~ Fix it by using the existing is_migrate_cma_page(), which evaluates to false when CMA=n. Fixes: 2e5bbb5461f1 ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA") Signed-off-by: Michael Ellerman --- arch/powerpc/mm/mmu_context_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c index 104bad029ce9..7de7124ac91b 100644 --- a/arch/powerpc/mm/mmu_context_iommu.c +++ b/arch/powerpc/mm/mmu_context_iommu.c @@ -184,7 +184,7 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries, * of the CMA zone if possible. NOTE: faulting in + migration * can be expensive. Batching can be considered later */ - if (get_pageblock_migratetype(page) == MIGRATE_CMA) { + if (is_migrate_cma_page(page)) { if (mm_iommu_move_page_from_cma(page)) goto populate; if (1 != get_user_pages_fast(ua + (i << PAGE_SHIFT), -- 2.7.4
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
Pan Xinhui writes: > On 2017/2/14 10:35, Nicholas Piggin wrote: >> On Mon, 13 Feb 2017 19:00:42 -0200 >> "Guilherme G. Piccoli" wrote: >>> * I had this patch partially done for some time, and after a discussion >>> at the kernel slack channel last week, I decided to rebase and fix >>> some remaining bugs. I'd change 'x' option to always disable the debugger, >>> since with this patch we can always re-enable xmon, but today I noticed >>> Pan's patch on the mailing list, so perhaps his approach of adding a flag >>> to 'x' option is preferable. I can change this in a V2, if requested. >>> Thanks in advance! >> >> xmon state changing after the first sysrq+x violates the principle of least >> astonishment, so I think that should be fixed. > > Hi, Guilherme > So in the end, my thought is that: > 1) cmd x|X will exit xmon and keep xmon in the original state Agreed. > 2) Then add options to turn some features on/off. And debugfs may > not fit for this. But I am also wondering at the same time, are people > needing this? I think it's useful. Especially seeing as Guilherme has already written the code. And yeah I think debugfs is the right place to do it. It means you can query and/or set the xmon state without crashing the box - which is what entering xmon essentially does, even if you later recover. cheers
Re: [PATCH v3 2/2] powerpc: emulate_step tests for load/store instructions
On Tuesday 14 February 2017 04:16 PM, Michael Ellerman wrote: > Ravi Bangoria writes: > >> diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile >> index 0e649d7..ddc879d 100644 >> --- a/arch/powerpc/lib/Makefile >> +++ b/arch/powerpc/lib/Makefile >> @@ -33,3 +33,7 @@ obj-$(CONFIG_ALTIVEC) += xor_vmx.o >> CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) >> >> obj-$(CONFIG_PPC64) += $(obj64-y) >> + >> +ifeq ($(CONFIG_PPC64), y) >> +obj-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o >> +endif > FYI, the right way to do that is: > > obj64-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o > obj-$(CONFIG_PPC64) += $(obj64-y) > > And in this Makefile you don't need to add the second line because it's > already there. Thanks for this. Will change accordingly and send next version. Ravi > > cheers >
Re: [PATCH] powerpc/xmon: add debugfs entry for xmon
"Guilherme G. Piccoli" writes: > Currently the xmon debugger is set only via kernel boot command-line. > It's disabled by default, and can be enabled with "xmon=on" on the > command-line. Also, xmon may be accessed via sysrq mechanism, but once > we enter xmon via sysrq, it's kept enabled until system is rebooted, > even if we exit the debugger. A kernel crash will then lead to xmon > instance, instead of triggering a kdump procedure (if configured), for > example. > > This patch introduces a debugfs entry for xmon, allowing user to query > its current state and change it if desired. Basically, the "xmon" file > to read from/write to is under the debugfs mount point, on powerpc > directory. Reading this file will provide the current state of the > debugger, one of the following: "on", "off", "early" or "nobt". Writing > one of these states to the file will take immediate effect on the debugger. I like this in general. But I think we can simplify it a bit. I don't think we need the nobt state anymore. As far as I can see it was added as a way to reinstate the old behaviour when the auto backtrace mode was added, but I've never heard of anyone using it. If anyone hits a crash where they really need that feature they can always just hack the code to disable the backtrace. So I think step 1 is a patch to drop the xmon_no_auto_backtrace stuff. Also I'm not sure EARLY needs to be a separate state. It's just a one-off invocation at boot, all we really need is just a single bit of state communicated from early_parse_xmon() to xmon_setup(), ie. a static bool would do. If we agree with that, then there are only two states left, on and off, in which case it can probably just be an int - and we can use a simple attribute file rather than custom parsing. > * I had this patch partially done for some time, and after a discussion > at the kernel slack channel last week, I decided to rebase and fix > some remaining bugs.
I'd change 'x' option to always disable the debugger, Not quite. 'x' should exit and leave xmon in whatever state it was previously in. cheers
Re: [PATCH v3 1/2] powerpc: Emulation support for load/store instructions on LE
Thanks Michael, On Tuesday 14 February 2017 03:50 PM, Michael Ellerman wrote: > Ravi Bangoria writes: > >> emulate_step() uses a number of underlying kernel functions that were >> initially not enabled for LE. This has been rectified since. > When exactly? ie. which commit. I found a couple of commits: 6506b4718b ("powerpc: Fix Unaligned Loads and Stores") dbc2fbd7c2 ("powerpc: Fix Unaligned LE Floating Point Loads and Stores") There may be more. Patch 2 is to test emulate_step() for basic load/store instructions and it seems to be working fine on LE. > > Should we backport this? ie. is it actually a bug people are hitting in > the real world much? Yes, we should backport this. The kernel-space hw-breakpoint feature is broken on LE without this. This is on ppc64le: $ sudo cat /proc/kallsyms | grep pid_max c00000000116998c D pid_max $ sudo ./perf record -a --event=mem:0xc00000000116998c sleep 10 Before patch: It does not record any data and throws below warning. $ dmesg [ 817.895573] Unable to handle hardware breakpoint. Breakpoint at 0xc00000000116998c will be disabled. [ 817.895581] ------------[ cut here ]------------ [ 817.895588] WARNING: CPU: 24 PID: 2032 at arch/powerpc/kernel/hw_breakpoint.c:277 hw_breakpoint_handler+0x124/0x230 ... After patch: It records data properly. $ sudo ./perf report --stdio ... # Samples: 36 of event 'mem:0xc00000000116998c' # Event count (approx.): 36 # # Overhead  Command        Shared Object     Symbol # ........  .............  ................  ......... # 63.89% kdumpctl [kernel.vmlinux] [k] alloc_pid 27.78% opal_errd [kernel.vmlinux] [k] alloc_pid 5.56% kworker/u97:4 [kernel.vmlinux] [k] alloc_pid 2.78% systemd [kernel.vmlinux] [k] alloc_pid > > cheers >
Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementation of saved write
"Aneesh Kumar K.V" writes: > diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h > b/arch/powerpc/include/asm/book3s/64/mmu-hash.h > index 0735d5a8049f..8720a406bbbe 100644 > --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h > +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h > @@ -16,6 +16,9 @@ > #include > #include > > +#ifndef __ASSEMBLY__ > +#include > +#endif I assume that's for the VM_BUG_ON() you add below. But if so wouldn't the #include be better placed in book3s/64/pgtable.h also? > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h > b/arch/powerpc/include/asm/book3s/64/pgtable.h > index fef738229a68..c684ef6cbd10 100644 > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h > @@ -512,6 +512,32 @@ static inline pte_t pte_mkhuge(pte_t pte) > return pte; > } > > +#define pte_mk_savedwrite pte_mk_savedwrite > +static inline pte_t pte_mk_savedwrite(pte_t pte) > +{ > + /* > + * Used by Autonuma subsystem to preserve the write bit > + * while marking the pte PROT_NONE. Only allow this > + * on PROT_NONE pte > + */ > + VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | > _PAGE_PRIVILEGED)) != > + cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED)); > + return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED); > +} > + cheers
Re: [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementation of saved write
Michael Neuling writes: > On Thu, 2017-02-09 at 08:30 +0530, Aneesh Kumar K.V wrote: >> With this our protnone becomes a present pte with READ/WRITE/EXEC bit >> cleared. >> By default we also set _PAGE_PRIVILEGED on such pte. This is now used to help >> us identify a protnone pte that has the saved write bit. For such pte, we will >> clear >> the _PAGE_PRIVILEGED bit. The pte still remains non-accessible from both user >> and kernel. >> >> Signed-off-by: Aneesh Kumar K.V > > > FWIW I've tested this, so: > > Acked-By: Michael Neuling In future if you've tested something then "Tested-by:" is the right tag to use. cheers
Re: [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
"Aneesh Kumar K.V" writes: > On Tuesday 14 February 2017 11:19 AM, Michael Ellerman wrote: >> "Aneesh Kumar K.V" writes: >> >>> Autonuma preserves the write permission across numa fault to avoid taking >>> a write fault after a numa fault (Commit: b191f9b106ea "mm: numa: preserve >>> PTE >>> write permissions across a NUMA hinting fault"). Architecture can implement >>> protnone in different ways and some may choose to implement that by >>> clearing Read/ >>> Write/Exec bit of pte. Setting the write bit on such pte can result in wrong >>> behaviour. Fix this up by allowing arch to override how to save the write >>> bit >>> on a protnone pte. >> This is pretty obviously a nop on arches that don't implement the new >> hooks, but it'd still be good to get an ack from someone in mm land >> before I merge it. > > > To get it apply cleanly you may need > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes-fix.patch Ah OK, I missed those. In that case these two should probably go via Andrew's tree. cheers
Re: [PATCH v3 2/2] powerpc: emulate_step tests for load/store instructions
Ravi Bangoria writes: > diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile > index 0e649d7..ddc879d 100644 > --- a/arch/powerpc/lib/Makefile > +++ b/arch/powerpc/lib/Makefile > @@ -33,3 +33,7 @@ obj-$(CONFIG_ALTIVEC) += xor_vmx.o > CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) > > obj-$(CONFIG_PPC64) += $(obj64-y) > + > +ifeq ($(CONFIG_PPC64), y) > +obj-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o > +endif FYI, the right way to do that is: obj64-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o obj-$(CONFIG_PPC64) += $(obj64-y) And in this Makefile you don't need to add the second line because it's already there. cheers
Re: [PATCH v3 2/2] powerpc: emulate_step tests for load/store instructions
Ravi Bangoria writes: > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c > index fce05a3..5c5ae66 100644 > --- a/arch/powerpc/kernel/kprobes.c > +++ b/arch/powerpc/kernel/kprobes.c > @@ -528,6 +528,8 @@ int __kprobes longjmp_break_handler(struct kprobe *p, > struct pt_regs *regs) > > int __init arch_init_kprobes(void) > { > + test_emulate_step(); > + I don't see any good reason why this is called from here. So I'm inclined to just make test_emulate_step() a regular init call. cheers
Re: [PATCH v3 1/2] powerpc: Emulation support for load/store instructions on LE
Ravi Bangoria writes: > emulate_step() uses a number of underlying kernel functions that were > initially not enabled for LE. This has been rectified since. When exactly? ie. which commit. Should we backport this? ie. is it actually a bug people are hitting in the real world much? cheers
Re: [PATCH v2 1/2] powerpc: Emulation support for load/store instructions on LE
On Tuesday 14 February 2017 02:17 PM, Naveen N. Rao wrote: > On 2017/02/14 01:32PM, Ravi Bangoria wrote: >> emulate_step() uses a number of underlying kernel functions that were >> initially not enabled for LE. This has been rectified since. So, fix >> emulate_step() for LE for the corresponding instructions. >> >> Reported-by: Anton Blanchard>> Signed-off-by: Ravi Bangoria > Can you redo this on top of powerpc/next? This doesn't apply cleanly due > to a recent change... Ok. sorry for the noise. I sent a v3 series. Please review it. Ravi > - Naveen >
[PATCH v3 2/2] powerpc: emulate_step tests for load/store instructions
Add new selftest that tests emulate_step for Normal, Floating Point, Vector and Vector Scalar - load/store instructions. Test should run at boot time if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Sample log: [0.762063] emulate_step smoke test: start. [0.762219] emulate_step smoke test: ld : PASS [0.762434] emulate_step smoke test: lwz: PASS [0.762653] emulate_step smoke test: lwzx : PASS [0.762867] emulate_step smoke test: std: PASS [0.763082] emulate_step smoke test: ldarx / stdcx. : PASS [0.763302] emulate_step smoke test: lfsx : PASS [0.763514] emulate_step smoke test: stfsx : PASS [0.763727] emulate_step smoke test: lfdx : PASS [0.763942] emulate_step smoke test: stfdx : PASS [0.764134] emulate_step smoke test: lvx: PASS [0.764349] emulate_step smoke test: stvx : PASS [0.764575] emulate_step smoke test: lxvd2x : PASS [0.764788] emulate_step smoke test: stxvd2x: PASS [0.764997] emulate_step smoke test: complete. Signed-off-by: Ravi Bangoria --- arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/include/asm/sstep.h | 8 + arch/powerpc/kernel/kprobes.c | 2 + arch/powerpc/lib/Makefile | 4 + arch/powerpc/lib/test_emulate_step.c | 439 ++ 5 files changed, 460 insertions(+) create mode 100644 arch/powerpc/lib/test_emulate_step.c diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index d99bd44..e7d6d86 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -284,6 +284,13 @@ #define PPC_INST_BRANCH_COND 0x40800000 #define PPC_INST_LBZCIX 0x7c0006aa #define PPC_INST_STBCIX 0x7c0007aa +#define PPC_INST_LWZX 0x7c00002e +#define PPC_INST_LFSX 0x7c00042e +#define PPC_INST_STFSX 0x7c00052e +#define PPC_INST_LFDX 0x7c0004ae +#define PPC_INST_STFDX 0x7c0005ae +#define PPC_INST_LVX 0x7c0000ce +#define PPC_INST_STVX 0x7c0001ce /* macros to insert fields into opcodes */ #define ___PPC_RA(a) (((a) & 0x1f) << 16) diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h index
d3a42cc..d6d3630 100644 --- a/arch/powerpc/include/asm/sstep.h +++ b/arch/powerpc/include/asm/sstep.h @@ -87,3 +87,11 @@ struct instruction_op { extern int analyse_instr(struct instruction_op *op, struct pt_regs *regs, unsigned int instr); + +#if defined(CONFIG_KPROBES_SANITY_TEST) && defined(CONFIG_PPC64) +void test_emulate_step(void); +#else +static inline void test_emulate_step(void) +{ +} +#endif diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c index fce05a3..5c5ae66 100644 --- a/arch/powerpc/kernel/kprobes.c +++ b/arch/powerpc/kernel/kprobes.c @@ -528,6 +528,8 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs) int __init arch_init_kprobes(void) { + test_emulate_step(); + return register_kprobe(_p); } diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index 0e649d7..ddc879d 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -33,3 +33,7 @@ obj-$(CONFIG_ALTIVEC) += xor_vmx.o CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) obj-$(CONFIG_PPC64) += $(obj64-y) + +ifeq ($(CONFIG_PPC64), y) +obj-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o +endif diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c new file mode 100644 index 000..887d1db --- /dev/null +++ b/arch/powerpc/lib/test_emulate_step.c @@ -0,0 +1,439 @@ +/* + * test_emulate_step.c - simple sanity test for emulate_step load/store + * instructions + * + * Copyright IBM Corp. 2016 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See + * the GNU General Public License for more details. + */ + +#define pr_fmt(fmt) "emulate_step smoke test: " fmt + +#include +#include +#include + +#define IMM_L(i) ((uintptr_t)(i) & 0x) + +/* + * Defined with TEST_ prefix so it does not conflict with other + * definitions. + */ +#define TEST_LD(r, base, i)(PPC_INST_LD | ___PPC_RT(r) | \ + ___PPC_RA(base) |
[PATCH v3 1/2] powerpc: Emulation support for load/store instructions on LE
emulate_step() uses a number of underlying kernel functions that were initially not enabled for LE. This has been rectified since. So, fix emulate_step() for LE for the corresponding instructions. Reported-by: Anton BlanchardSigned-off-by: Ravi Bangoria --- arch/powerpc/lib/sstep.c | 20 1 file changed, 20 deletions(-) diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c index 846dba2..9c542ec 100644 --- a/arch/powerpc/lib/sstep.c +++ b/arch/powerpc/lib/sstep.c @@ -1799,8 +1799,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto instr_done; case LARX: - if (regs->msr & MSR_LE) - return 0; if (op.ea & (size - 1)) break; /* can't handle misaligned */ if (!address_ok(regs, op.ea, size)) @@ -1823,8 +1821,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto ldst_done; case STCX: - if (regs->msr & MSR_LE) - return 0; if (op.ea & (size - 1)) break; /* can't handle misaligned */ if (!address_ok(regs, op.ea, size)) @@ -1849,8 +1845,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto ldst_done; case LOAD: - if (regs->msr & MSR_LE) - return 0; err = read_mem(>gpr[op.reg], op.ea, size, regs); if (!err) { if (op.type & SIGNEXT) @@ -1862,8 +1856,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #ifdef CONFIG_PPC_FPU case LOAD_FP: - if (regs->msr & MSR_LE) - return 0; if (size == 4) err = do_fp_load(op.reg, do_lfs, op.ea, size, regs); else @@ -1872,15 +1864,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #endif #ifdef CONFIG_ALTIVEC case LOAD_VMX: - if (regs->msr & MSR_LE) - return 0; err = do_vec_load(op.reg, do_lvx, op.ea & ~0xfUL, regs); goto ldst_done; #endif #ifdef CONFIG_VSX case LOAD_VSX: - if (regs->msr & MSR_LE) - return 0; err = do_vsx_load(op.reg, do_lxvd2x, op.ea, regs); goto ldst_done; #endif @@ -1903,8 +1891,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) goto instr_done; case STORE: - if (regs->msr & 
MSR_LE) - return 0; if ((op.type & UPDATE) && size == sizeof(long) && op.reg == 1 && op.update_reg == 1 && !(regs->msr & MSR_PR) && @@ -1917,8 +1903,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #ifdef CONFIG_PPC_FPU case STORE_FP: - if (regs->msr & MSR_LE) - return 0; if (size == 4) err = do_fp_store(op.reg, do_stfs, op.ea, size, regs); else @@ -1927,15 +1911,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr) #endif #ifdef CONFIG_ALTIVEC case STORE_VMX: - if (regs->msr & MSR_LE) - return 0; err = do_vec_store(op.reg, do_stvx, op.ea & ~0xfUL, regs); goto ldst_done; #endif #ifdef CONFIG_VSX case STORE_VSX: - if (regs->msr & MSR_LE) - return 0; err = do_vsx_store(op.reg, do_stxvd2x, op.ea, regs); goto ldst_done; #endif -- 1.8.3.1
[PATCH v3 0/2] powerpc: Emulation support for load/store instructions on LE
emulate_step is the basic infrastructure used by a number of other kernel subsystems such as kprobes and hw-breakpoint (data breakpoints). In the case of kprobes, enabling emulation of load/store instructions speeds up execution of the probed instruction. In the case of a kernel-space breakpoint, the causative instruction is first emulated before the user-registered handler is executed. If emulation fails, the hw-breakpoint is disabled with an error. As emulate_step does not support load/store instructions on LE, the kernel-space hw-breakpoint infrastructure is broken on LE. emulate_step() uses a number of underlying kernel functions that were initially not enabled for LE. This has since been rectified. So, fix emulate_step() for LE for the corresponding instructions. Also add a selftest which will run at boot if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 are set. Changes in v3: - Rebased to powerpc/next. No functionality changes. v2 link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1332638.html Ravi Bangoria (2): powerpc: Emulation support for load/store instructions on LE powerpc: emulate_step tests for load/store instructions arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/include/asm/sstep.h | 8 + arch/powerpc/kernel/kprobes.c | 2 + arch/powerpc/lib/Makefile | 4 + arch/powerpc/lib/sstep.c | 20 -- arch/powerpc/lib/test_emulate_step.c | 439 ++ 6 files changed, 460 insertions(+), 20 deletions(-) create mode 100644 arch/powerpc/lib/test_emulate_step.c -- 1.8.3.1
Re: [PATCH v2 1/2] powerpc: Emulation support for load/store instructions on LE
On 2017/02/14 01:32PM, Ravi Bangoria wrote:
> emulate_step() uses a number of underlying kernel functions that were
> initially not enabled for LE. This has been rectified since. So, fix
> emulate_step() for LE for the corresponding instructions.
>
> Reported-by: Anton Blanchard
> Signed-off-by: Ravi Bangoria

Can you redo this on top of powerpc/next? This doesn't apply cleanly due
to a recent change...

- Naveen
Re: [PATCH] powerpc/xmon: Fix data-breakpoint
Hi Michael,

Can you please pull this patch.

Thanks,
Ravi

On Tuesday 22 November 2016 02:55 PM, Ravi Bangoria wrote:
> The xmon data-breakpoint feature is broken.
>
> Whenever a watchpoint match occurs, hw_breakpoint_handler will be
> called by do_break via the notifier chain mechanism. If the watchpoint
> was registered by xmon, hw_breakpoint_handler won't find any associated
> perf_event and returns immediately with NOTIFY_STOP. Consequently,
> do_break also returns without notifying xmon.
>
> Solve this by returning NOTIFY_DONE when hw_breakpoint_handler does not
> find any perf_event associated with the matched watchpoint.
>
> Signed-off-by: Ravi Bangoria
> ---
>  arch/powerpc/kernel/hw_breakpoint.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c
> b/arch/powerpc/kernel/hw_breakpoint.c
> index 03d089b..469d86d 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -228,8 +228,10 @@ int hw_breakpoint_handler(struct die_args *args)
>  	rcu_read_lock();
>
>  	bp = __this_cpu_read(bp_per_reg);
> -	if (!bp)
> +	if (!bp) {
> +		rc = NOTIFY_DONE;
>  		goto out;
> +	}
>  	info = counter_arch_bp(bp);
>
>  	/*
Re: linux-next: manual merge of the kvm tree with the powerpc tree
Paolo Bonzini writes:
> On 10/02/2017 04:59, Stephen Rothwell wrote:
>> Hi all,
>>
>> Today's linux-next merge of the kvm tree got a conflict in:
>>
>>   arch/powerpc/include/asm/head-64.h
>>
>> between commit:
>>
>>   852e5da99d15 ("powerpc/64s: Tidy up after exception handler rework")
>>
>> from the powerpc tree and commit:
>>
>>   7ede531773ea ("KVM: PPC: Book3S: Move 64-bit KVM interrupt handler
>>   out from alt section")
>>
>> from the kvm tree.
>
> Michael, please pull the topic branch as soon as possible, so that the
> conflicts don't hit Linus.

They won't hit Linus until I send my pull request.

> That said, the topic branch is a mess. It starts with generic arch
> patches (until "powerpc/64: Allow for relocation-on interrupts from
> guest to host") then it's only KVM, then on the top there's two more
> generic patches that were added _after_ Paul merged it.

It's not a mess, it's a collection of patches which touch either
arch/powerpc or arch/powerpc/kvm, or are otherwise related.

Yeah I could have merged just the start of Paul's series, but that seemed
pointless, it doesn't prevent or add any conflicts, and it means I'm
unable to test his series as a whole.

Paul has also now merged the remaining two commits, and sent you a pull
request including them.

> If possible, please pull only up to "powerpc/64: Allow for relocation-on
> interrupts from guest to host" and cherry-pick the top two patches
> ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts" and
> "powerpc/powernv: Remove separate entry for OPAL real mode calls") into
> your next branch, but leave the rest for my tree only.

I don't see how that helps anything. In fact it guarantees a mess because
those two commits would now go to Linus via my tree (cherry picked) and
via Paul's as part of his second merge of the topic branch.

So unless you can give me a good reason I'll merge the tip of the topic
branch into my next, as planned.

cheers
Re: [PATCH 3/3] powerpc: kprobes: emulate instructions on kprobe handler re-entry
On Tue, Feb 14, 2017 at 02:08:03PM +0530, Naveen N. Rao wrote:
> On kprobe handler re-entry, try to emulate the instruction rather than
> single stepping always.
>
> As a related change, remove the duplicate saving of msr as that is
> already done in set_current_kprobe().
>
> Signed-off-by: Naveen N. Rao

Acked-by: Ananth N Mavinakayanahalli
Re: [PATCH 2/3] powerpc: kprobes: factor out code to emulate instruction into a helper
On Tue, Feb 14, 2017 at 02:08:02PM +0530, Naveen N. Rao wrote:
> This helper will be used in a subsequent patch to emulate instructions
> on re-entering the kprobe handler. No functional change.
>
> Signed-off-by: Naveen N. Rao

Acked-by: Ananth N Mavinakayanahalli
Re: [PATCH 1/3] powerpc: kprobes: fix handling of function offsets on ABIv2
On Tue, Feb 14, 2017 at 02:08:01PM +0530, Naveen N. Rao wrote:
> commit 239aeba76409 ("perf powerpc: Fix kprobe and kretprobe handling
> with kallsyms on ppc64le") changed how we use the offset field in struct
> kprobe on ABIv2. perf now offsets from the GEP (Global entry point) if
> an offset is specified and otherwise chooses the LEP (Local entry
> point).
>
> Fix the same in kernel for kprobe API users. We do this by extending
> kprobe_lookup_name() to accept an additional parameter to indicate the
> offset specified with the kprobe registration. If offset is 0, we return
> the local function entry and return the global entry point otherwise.
>
> With:
>   # cd /sys/kernel/debug/tracing/
>   # echo "p _do_fork" >> kprobe_events
>   # echo "p _do_fork+0x10" >> kprobe_events
>
> before this patch:
>   # cat ../kprobes/list
>   c00d0748  k  _do_fork+0x8    [DISABLED]
>   c00d0758  k  _do_fork+0x18    [DISABLED]
>   c00412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]
>
> and after:
>   # cat ../kprobes/list
>   c00d04c8  k  _do_fork+0x8    [DISABLED]
>   c00d04d0  k  _do_fork+0x10    [DISABLED]
>   c00412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]
>
> Signed-off-by: Naveen N. Rao

Acked-by: Ananth N Mavinakayanahalli
[PATCH 3/3] powerpc: kprobes: emulate instructions on kprobe handler re-entry
On kprobe handler re-entry, try to emulate the instruction rather than
single stepping always.

As a related change, remove the duplicate saving of msr as that is
already done in set_current_kprobe().

Signed-off-by: Naveen N. Rao
---
 arch/powerpc/kernel/kprobes.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 9cdf2de31e9e..c213637b9d25 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -206,10 +206,17 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
 			 */
 			save_previous_kprobe(kcb);
 			set_current_kprobe(p, regs, kcb);
-			kcb->kprobe_saved_msr = regs->msr;
 			kprobes_inc_nmissed_count(p);
 			prepare_singlestep(p, regs);
 			kcb->kprobe_status = KPROBE_REENTER;
+			if (p->ainsn.boostable >= 0) {
+				ret = try_to_emulate(p, regs);
+
+				if (ret > 0) {
+					restore_previous_kprobe(kcb);
+					return 1;
+				}
+			}
 			return 1;
 		} else {
 			if (*addr != BREAKPOINT_INSTRUCTION) {
-- 
2.11.0
[PATCH 2/3] powerpc: kprobes: factor out code to emulate instruction into a helper
This helper will be used in a subsequent patch to emulate instructions
on re-entering the kprobe handler. No functional change.

Signed-off-by: Naveen N. Rao
---
 arch/powerpc/kernel/kprobes.c | 52 +++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index fce05a38851c..9cdf2de31e9e 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -140,6 +140,35 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	regs->link = (unsigned long)kretprobe_trampoline;
 }
 
+int __kprobes try_to_emulate(struct kprobe *p, struct pt_regs *regs)
+{
+	int ret;
+	unsigned int insn = *p->ainsn.insn;
+
+	/* regs->nip is also adjusted if emulate_step returns 1 */
+	ret = emulate_step(regs, insn);
+	if (ret > 0) {
+		/*
+		 * Once this instruction has been boosted
+		 * successfully, set the boostable flag
+		 */
+		if (unlikely(p->ainsn.boostable == 0))
+			p->ainsn.boostable = 1;
+	} else if (ret < 0) {
+		/*
+		 * We don't allow kprobes on mtmsr(d)/rfi(d), etc.
+		 * So, we should never get here... but, its still
+		 * good to catch them, just in case...
+		 */
+		printk("Can't step on instruction %x\n", insn);
+		BUG();
+	} else if (ret == 0)
+		/* This instruction can't be boosted */
+		p->ainsn.boostable = -1;
+
+	return ret;
+}
+
 int __kprobes kprobe_handler(struct pt_regs *regs)
 {
 	struct kprobe *p;
@@ -235,18 +264,9 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
 
 ss_probe:
 	if (p->ainsn.boostable >= 0) {
-		unsigned int insn = *p->ainsn.insn;
+		ret = try_to_emulate(p, regs);
 
-		/* regs->nip is also adjusted if emulate_step returns 1 */
-		ret = emulate_step(regs, insn);
 		if (ret > 0) {
-			/*
-			 * Once this instruction has been boosted
-			 * successfully, set the boostable flag
-			 */
-			if (unlikely(p->ainsn.boostable == 0))
-				p->ainsn.boostable = 1;
-
 			if (p->post_handler)
 				p->post_handler(p, regs, 0);
 
@@ -254,17 +274,7 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
 			reset_current_kprobe();
 			preempt_enable_no_resched();
 			return 1;
-		} else if (ret < 0) {
-			/*
-			 * We don't allow kprobes on mtmsr(d)/rfi(d), etc.
-			 * So, we should never get here... but, its still
-			 * good to catch them, just in case...
-			 */
-			printk("Can't step on instruction %x\n", insn);
-			BUG();
-		} else if (ret == 0)
-			/* This instruction can't be boosted */
-			p->ainsn.boostable = -1;
+		}
 	}
 	prepare_singlestep(p, regs);
 	kcb->kprobe_status = KPROBE_HIT_SS;
-- 
2.11.0
[PATCH 1/3] powerpc: kprobes: fix handling of function offsets on ABIv2
commit 239aeba76409 ("perf powerpc: Fix kprobe and kretprobe handling
with kallsyms on ppc64le") changed how we use the offset field in struct
kprobe on ABIv2. perf now offsets from the GEP (Global entry point) if an
offset is specified and otherwise chooses the LEP (Local entry point).

Fix the same in kernel for kprobe API users. We do this by extending
kprobe_lookup_name() to accept an additional parameter to indicate the
offset specified with the kprobe registration. If offset is 0, we return
the local function entry and return the global entry point otherwise.

With:
  # cd /sys/kernel/debug/tracing/
  # echo "p _do_fork" >> kprobe_events
  # echo "p _do_fork+0x10" >> kprobe_events

before this patch:
  # cat ../kprobes/list
  c00d0748  k  _do_fork+0x8    [DISABLED]
  c00d0758  k  _do_fork+0x18    [DISABLED]
  c00412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]

and after:
  # cat ../kprobes/list
  c00d04c8  k  _do_fork+0x8    [DISABLED]
  c00d04d0  k  _do_fork+0x10    [DISABLED]
  c00412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]

Signed-off-by: Naveen N. Rao
---
 arch/powerpc/include/asm/kprobes.h | 6 +++---
 arch/powerpc/kernel/optprobes.c    | 4 ++--
 kernel/kprobes.c                   | 6 +++---
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index d821835ade86..e7ada061aa12 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -60,10 +60,10 @@ extern kprobe_opcode_t optprobe_template_end[];
 
 #ifdef PPC64_ELF_ABI_v2
 /* PPC64 ABIv2 needs local entry point */
-#define kprobe_lookup_name(name, addr)					\
+#define kprobe_lookup_name(name, addr, offset)				\
 {									\
 	addr = (kprobe_opcode_t *)kallsyms_lookup_name(name);		\
-	if (addr)							\
+	if (addr && !(offset))						\
 		addr = (kprobe_opcode_t *)ppc_function_entry(addr);	\
 }
 #elif defined(PPC64_ELF_ABI_v1)
@@ -75,7 +75,7 @@ extern kprobe_opcode_t optprobe_template_end[];
  * This ensures we always get to the actual symbol and not the descriptor.
  * Also handle format.
  */
-#define kprobe_lookup_name(name, addr)					\
+#define kprobe_lookup_name(name, addr, offset)				\
 {									\
 	char dot_name[MODULE_NAME_LEN + 1 + KSYM_NAME_LEN];		\
 	const char *modsym;						\
diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
index 2282bf4e63cd..e51a045f3d3b 100644
--- a/arch/powerpc/kernel/optprobes.c
+++ b/arch/powerpc/kernel/optprobes.c
@@ -243,8 +243,8 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
 	/*
 	 * 2. branch to optimized_callback() and emulate_step()
 	 */
-	kprobe_lookup_name("optimized_callback", op_callback_addr);
-	kprobe_lookup_name("emulate_step", emulate_step_addr);
+	kprobe_lookup_name("optimized_callback", op_callback_addr, 0);
+	kprobe_lookup_name("emulate_step", emulate_step_addr, 0);
 	if (!op_callback_addr || !emulate_step_addr) {
 		WARN(1, "kprobe_lookup_name() failed\n");
 		goto error;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 83ad7e440417..9bc433575d98 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -63,7 +63,7 @@
  * so this must be overridable.
  */
 #ifndef kprobe_lookup_name
-#define kprobe_lookup_name(name, addr) \
+#define kprobe_lookup_name(name, addr, offset) \
 	addr = ((kprobe_opcode_t *)(kallsyms_lookup_name(name)))
 #endif
 
@@ -1365,7 +1365,7 @@ static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
 		goto invalid;
 
 	if (p->symbol_name) {
-		kprobe_lookup_name(p->symbol_name, addr);
+		kprobe_lookup_name(p->symbol_name, addr, p->offset);
 		if (!addr)
 			return ERR_PTR(-ENOENT);
 	}
@@ -2161,7 +2161,7 @@ static int __init init_kprobes(void)
 	/* lookup the function address from its name */
 	for (i = 0; kretprobe_blacklist[i].name != NULL; i++) {
 		kprobe_lookup_name(kretprobe_blacklist[i].name,
-					kretprobe_blacklist[i].addr);
+					kretprobe_blacklist[i].addr, 0);
 		if (!kretprobe_blacklist[i].addr)
 			printk("kretprobe: lookup failed: %s\n",
[PATCH v2 2/2] powerpc: emulate_step tests for load/store instructions
Add new selftest that test emulate_step for Normal, Floating Point, Vector and Vector Scalar - load/store instructions. Test should run at boot time if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set. Sample log: [0.762063] emulate_step smoke test: start. [0.762219] emulate_step smoke test: ld : PASS [0.762434] emulate_step smoke test: lwz: PASS [0.762653] emulate_step smoke test: lwzx : PASS [0.762867] emulate_step smoke test: std: PASS [0.763082] emulate_step smoke test: ldarx / stdcx. : PASS [0.763302] emulate_step smoke test: lfsx : PASS [0.763514] emulate_step smoke test: stfsx : PASS [0.763727] emulate_step smoke test: lfdx : PASS [0.763942] emulate_step smoke test: stfdx : PASS [0.764134] emulate_step smoke test: lvx: PASS [0.764349] emulate_step smoke test: stvx : PASS [0.764575] emulate_step smoke test: lxvd2x : PASS [0.764788] emulate_step smoke test: stxvd2x: PASS [0.764997] emulate_step smoke test: complete. Signed-off-by: Ravi Bangoria--- arch/powerpc/include/asm/ppc-opcode.h | 7 + arch/powerpc/include/asm/sstep.h | 8 + arch/powerpc/kernel/kprobes.c | 2 + arch/powerpc/lib/Makefile | 4 + arch/powerpc/lib/test_emulate_step.c | 439 ++ 5 files changed, 460 insertions(+) create mode 100644 arch/powerpc/lib/test_emulate_step.c diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index c4ced1d..691 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -284,6 +284,13 @@ #define PPC_INST_BRANCH_COND 0x4080 #define PPC_INST_LBZCIX0x7c0006aa #define PPC_INST_STBCIX0x7c0007aa +#define PPC_INST_LWZX 0x7c2e +#define PPC_INST_LFSX 0x7c00042e +#define PPC_INST_STFSX 0x7c00052e +#define PPC_INST_LFDX 0x7c0004ae +#define PPC_INST_STFDX 0x7c0005ae +#define PPC_INST_LVX 0x7cce +#define PPC_INST_STVX 0x7c0001ce /* macros to insert fields into opcodes */ #define ___PPC_RA(a) (((a) & 0x1f) << 16) diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h index 
d3a42cc..d6d3630 100644 --- a/arch/powerpc/include/asm/sstep.h +++ b/arch/powerpc/include/asm/sstep.h @@ -87,3 +87,11 @@ struct instruction_op { extern int analyse_instr(struct instruction_op *op, struct pt_regs *regs, unsigned int instr); + +#if defined(CONFIG_KPROBES_SANITY_TEST) && defined(CONFIG_PPC64) +void test_emulate_step(void); +#else +static inline void test_emulate_step(void) +{ +} +#endif diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c index 735ff3d..c867347 100644 --- a/arch/powerpc/kernel/kprobes.c +++ b/arch/powerpc/kernel/kprobes.c @@ -529,6 +529,8 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs) int __init arch_init_kprobes(void) { + test_emulate_step(); + return register_kprobe(_p); } diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index 309361e8..7d046ca 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -35,3 +35,7 @@ obj-$(CONFIG_ALTIVEC) += xor_vmx.o CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) obj-$(CONFIG_PPC64) += $(obj64-y) + +ifeq ($(CONFIG_PPC64), y) +obj-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o +endif diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c new file mode 100644 index 000..887d1db --- /dev/null +++ b/arch/powerpc/lib/test_emulate_step.c @@ -0,0 +1,439 @@ +/* + * test_emulate_step.c - simple sanity test for emulate_step load/store + * instructions + * + * Copyright IBM Corp. 2016 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it would be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See + * the GNU General Public License for more details. + */ + +#define pr_fmt(fmt) "emulate_step smoke test: " fmt + +#include +#include +#include + +#define IMM_L(i) ((uintptr_t)(i) & 0x) + +/* + * Defined with TEST_ prefix so it does not conflict with other + * definitions. + */ +#define TEST_LD(r, base, i)(PPC_INST_LD | ___PPC_RT(r) | \ + ___PPC_RA(base) |
[PATCH v2 1/2] powerpc: Emulation support for load/store instructions on LE
emulate_step() uses a number of underlying kernel functions that were
initially not enabled for LE. This has been rectified since. So, fix
emulate_step() for LE for the corresponding instructions.

Reported-by: Anton Blanchard
Signed-off-by: Ravi Bangoria
---
 arch/powerpc/lib/sstep.c | 20 --------------------
 1 file changed, 20 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 06c7e9b..e14a2fb 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1799,8 +1799,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		goto instr_done;
 
 	case LARX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if (op.ea & (size - 1))
 			break;		/* can't handle misaligned */
 		err = -EFAULT;
@@ -1824,8 +1822,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		goto ldst_done;
 
 	case STCX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if (op.ea & (size - 1))
 			break;		/* can't handle misaligned */
 		err = -EFAULT;
@@ -1851,8 +1847,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		goto ldst_done;
 
 	case LOAD:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = read_mem(&regs->gpr[op.reg], op.ea, size, regs);
 		if (!err) {
 			if (op.type & SIGNEXT)
@@ -1864,8 +1858,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 
 #ifdef CONFIG_PPC_FPU
 	case LOAD_FP:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if (size == 4)
 			err = do_fp_load(op.reg, do_lfs, op.ea, size, regs);
 		else
@@ -1874,15 +1866,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 #endif
 #ifdef CONFIG_ALTIVEC
 	case LOAD_VMX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = do_vec_load(op.reg, do_lvx, op.ea & ~0xfUL, regs);
 		goto ldst_done;
 #endif
 #ifdef CONFIG_VSX
 	case LOAD_VSX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = do_vsx_load(op.reg, do_lxvd2x, op.ea, regs);
 		goto ldst_done;
 #endif
@@ -1905,8 +1893,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		goto instr_done;
 
 	case STORE:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if ((op.type & UPDATE) && size == sizeof(long) &&
 		    op.reg == 1 && op.update_reg == 1 &&
 		    !(regs->msr & MSR_PR) &&
@@ -1919,8 +1905,6 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 
 #ifdef CONFIG_PPC_FPU
 	case STORE_FP:
-		if (regs->msr & MSR_LE)
-			return 0;
 		if (size == 4)
 			err = do_fp_store(op.reg, do_stfs, op.ea, size, regs);
 		else
@@ -1929,15 +1913,11 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 #endif
 #ifdef CONFIG_ALTIVEC
 	case STORE_VMX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = do_vec_store(op.reg, do_stvx, op.ea & ~0xfUL, regs);
 		goto ldst_done;
 #endif
 #ifdef CONFIG_VSX
 	case STORE_VSX:
-		if (regs->msr & MSR_LE)
-			return 0;
 		err = do_vsx_store(op.reg, do_stxvd2x, op.ea, regs);
 		goto ldst_done;
 #endif
-- 
1.8.3.1
[PATCH v2 0/2] powerpc: Emulation support for load/store instructions on LE
emulate_step() is basic infrastructure used by a number of other kernel
subsystems such as kprobes and hw-breakpoints (data breakpoints). For
kprobes, enabling emulation of load/store instructions speeds up execution
of the probed instruction. For kernel-space breakpoints, the causative
instruction is first emulated before the user-registered handler is
executed; if emulation fails, the hw-breakpoint is disabled with an error.
As emulate_step did not support load/store instructions on LE, the
kernel-space hw-breakpoint infrastructure was broken on LE.

emulate_step() uses a number of underlying kernel functions that were
initially not enabled for LE. This has been rectified since. So, fix
emulate_step() for LE for the corresponding instructions.

Also add a selftest which will run at boot if CONFIG_KPROBES_SANITY_TEST
and CONFIG_PPC64 are set.

Changes in v2:
  - Folded the 2nd and 3rd patch of v1 into one patch, as suggested by
    Naveen

v1 link:
https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg110671.html

Ravi Bangoria (2):
  powerpc: Emulation support for load/store instructions on LE
  powerpc: emulate_step tests for load/store instructions

 arch/powerpc/include/asm/ppc-opcode.h |   7 +
 arch/powerpc/include/asm/sstep.h      |   8 +
 arch/powerpc/kernel/kprobes.c         |   2 +
 arch/powerpc/lib/Makefile             |   4 +
 arch/powerpc/lib/sstep.c              |  20 --
 arch/powerpc/lib/test_emulate_step.c  | 439 ++++++++++++++++++++++++++++++++++
 6 files changed, 460 insertions(+), 20 deletions(-)
 create mode 100644 arch/powerpc/lib/test_emulate_step.c

-- 
1.8.3.1