On 2017/04/13 01:37PM, Masami Hiramatsu wrote:
> On Wed, 12 Apr 2017 16:28:28 +0530
> "Naveen N. Rao" wrote:
>
> > On kprobe handler re-entry, try to emulate the instruction rather than
> > single stepping always.
> >
>
> > As a related change, remove the
On 2017/04/13 01:34PM, Masami Hiramatsu wrote:
> On Wed, 12 Apr 2017 16:28:27 +0530
> "Naveen N. Rao" wrote:
>
> > This helper will be used in a subsequent patch to emulate instructions
> > on re-entering the kprobe handler. No functional change.
>
> In this
On 2017/04/13 01:32PM, Masami Hiramatsu wrote:
> On Wed, 12 Apr 2017 16:28:26 +0530
> "Naveen N. Rao" wrote:
>
> > kprobe_lookup_name() is specific to the kprobe subsystem and may not
> > always return the function entry point (in a subsequent patch for
> >
On 2017/04/13 12:02PM, Masami Hiramatsu wrote:
> Hi Naveen,
Hi Masami,
>
> BTW, I saw you sent 3 different series, are there any
> conflict each other? or can we pick those independently?
Yes, all these three patch series are based off powerpc/next and they do
depend on each other, as they
Thomas Gleixner writes:
> Init task invokes smp_ops->setup_cpu() from smp_cpus_done(). Init task can
> run on any online CPU at this point, but the setup_cpu() callback requires
> to be invoked on the boot CPU. This is achieved by temporarily setting the
> affinity of the
Oliver O'Halloran writes:
> On Wed, Apr 12, 2017 at 4:52 PM, Michael Ellerman wrote:
>> Rashmica Gupta writes:
>>
>>> On 31/03/17 12:37, Oliver O'Halloran wrote:
On Book3s we have two PTE flags used to mark cache-inhibited
Oliver O'Halloran writes:
> From: "Aneesh Kumar K.V"
>
> Add a _PAGE_DEVMAP bit for PTE and DAX PMD entries. PowerPC doesn't
> currently support PUD faults so we haven't extended it to the PUD
> level.
>
> Cc: Aneesh Kumar K.V
Do the checks that __flush_tlb_pending() does and check if
a local flush will do when batch->active is false inside of
hpte_need_flush(). I've checked the changes with tlbie tracing,
I see local flushes as applicable now and I've also run
some basic ltp testcases on top of these changes on a
On Wed, 12 Apr 2017 16:28:28 +0530
"Naveen N. Rao" wrote:
> On kprobe handler re-entry, try to emulate the instruction rather than
> single stepping always.
>
> As a related change, remove the duplicate saving of msr as that is
> already done in
On Wed, 12 Apr 2017 16:28:27 +0530
"Naveen N. Rao" wrote:
> This helper will be used in a subsequent patch to emulate instructions
> on re-entering the kprobe handler. No functional change.
In this case, please merge this patch into the next patch which
actually
On Wed, 12 Apr 2017 16:28:26 +0530
"Naveen N. Rao" wrote:
> kprobe_lookup_name() is specific to the kprobe subsystem and may not
> always return the function entry point (in a subsequent patch for
> KPROBES_ON_FTRACE).
If so, please move this patch into that
On Wed, 12 Apr 2017 16:28:25 +0530
"Naveen N. Rao" wrote:
> commit 239aeba76409 ("perf powerpc: Fix kprobe and kretprobe handling
> with kallsyms on ppc64le") changed how we use the offset field in struct
> kprobe on ABIv2. perf now offsets from the GEP (Global
On Wed, Apr 12, 2017 at 11:53 AM, Balbir Singh wrote:
> On Wed, 2017-04-12 at 03:42 +1000, Oliver O'Halloran wrote:
>> From: Rashmica Gupta
>>
>> Adds support for removing bolted (i.e. kernel linear mapping) mappings on
>> powernv. This is needed to
On Thu, 2017-04-13 at 09:28 +0530, Aneesh Kumar K.V wrote:
> > #endif
> > mtctr r12
> > bctrl
> > +/*
> > + * cur_cpu_spec->cpu_restore would restore LPCR to a
> > + * sane value that is set at early boot time,
> > + * thereby clearing LPCR_UPRT.
> > + * LPCR_UPRT is required if
"Gautham R. Shenoy" writes:
> From: "Gautham R. Shenoy"
>
> On wakeup from a deep-stop used for CPU-Hotplug, we invoke
> cur_cpu_spec->cpu_restore() which would set sane default values to
> various SPRs including LPCR.
>
> On POWER9, the
On 4/13/2017 10:00 AM, Jin, Yao wrote:
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
3. Use 2 bits in perf_branch_entry for a "cross" metrics checking
for branch cross 4K or 2M area. It's an approximate computing
for checking
On Wed, 12 Apr 2017 16:28:24 +0530
"Naveen N. Rao" wrote:
> The macro is now pretty long and ugly on powerpc. In the light of
> further changes needed here, convert it to a __weak variant to be
> over-ridden with a nicer looking function.
Looks good to me.
Hi Naveen,
BTW, I saw you sent 3 different series, are there any
conflict each other? or can we pick those independently?
Thanks,
On Wed, 12 Apr 2017 16:28:23 +0530
"Naveen N. Rao" wrote:
> v1:
>
On Wed, Apr 12, 2017 at 4:52 PM, Michael Ellerman wrote:
> Rashmica Gupta writes:
>
>> On 31/03/17 12:37, Oliver O'Halloran wrote:
>>> On Book3s we have two PTE flags used to mark cache-inhibited mappings:
>>> _PAGE_TOLERANT and _PAGE_NON_IDEMPOTENT.
On Wednesday 12 April 2017 03:41 PM, Michael Ellerman wrote:
Recently in commit f6eedbba7a26 ("powerpc/mm/hash: Increase VA range to 128TB"),
we increased H_PGD_INDEX_SIZE to 15 when we're building with 64K pages. This
makes it larger than RADIX_PGD_INDEX_SIZE (13), which means the logic to
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
3. Use 2 bits in perf_branch_entry for a "cross" metrics checking
for branch cross 4K or 2M area. It's an approximate computing
for checking if the branch cross 4K page or 2MB page.
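A minimal C sketch of the approximate check described above, assuming it compares the aligned area that holds the branch source with the one that holds the target. The helper name `branch_crosses` and the shift constants are illustrative and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SHIFT_4K 12u /* 4 KiB area = 1 << 12 bytes */
#define SHIFT_2M 21u /* 2 MiB area = 1 << 21 bytes */

/*
 * Hypothetical helper: returns true when the branch source and target
 * fall into different aligned areas of size (1 << shift).
 */
static bool branch_crosses(uint64_t from, uint64_t to, unsigned int shift)
{
	return (from >> shift) != (to >> shift);
}
```

A caller could evaluate both granularities for each branch record and store the two results in the two bits mentioned above.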
On Thu, 13 Apr 2017 07:34:51 +1000
Benjamin Herrenschmidt wrote:
> On Thu, 2017-04-13 at 00:12 +1000, Nicholas Piggin wrote:
> > Yeah sure that sounds good. How's this then?
>
> I suppose so :-) When I was testing all that I had a "b ." at 0x500 and
> 0x4500 and I
Hi Balbir,
> FYI: The version you applied does not have checks for is_write
Yeah, we decided to do that in a follow up patch. I'm ok if someone
gets to it before me :)
Anton
On Thu, 2017-04-06 at 23:06 +1000, Michael Ellerman wrote:
> On Mon, 2017-04-03 at 06:41:02 UTC, Anton Blanchard wrote:
> > From: Anton Blanchard
> >
> Applied to powerpc next, thanks.
>
> https://git.kernel.org/powerpc/c/a7a9dcd882a67b68568868b988289f
>
FYI: The version you
The raid6 Q syndrome check has been optimised using the vpermxor
instruction. This instruction was made available with POWER8, ISA version
2.07. It allows for both vperm and vxor instructions to be done in a single
instruction. This has been tested for correctness on a ppc64le vm with a
basic
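For reference, a scalar C sketch of the P/Q syndrome computation that this kind of optimisation speeds up. It follows the usual RAID-6 formulation over GF(2^8) with generator 2 and reduction constant 0x1d; it is not the vpermxor implementation from the patch, and the function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply by the generator (2) in GF(2^8), reducing with 0x1d on overflow. */
static uint8_t gf_mul2(uint8_t x)
{
	return (uint8_t)((x << 1) ^ ((x & 0x80) ? 0x1d : 0x00));
}

/*
 * Illustrative helper: compute P (plain XOR) and Q (weighted XOR) over
 * ndisks data buffers of len bytes each, using Horner's method from the
 * highest-numbered disk down to disk 0.
 */
static void gen_syndrome(int ndisks, size_t len,
			 const uint8_t *const *data,
			 uint8_t *p, uint8_t *q)
{
	for (size_t i = 0; i < len; i++) {
		uint8_t wp = data[ndisks - 1][i];
		uint8_t wq = wp;

		for (int z = ndisks - 2; z >= 0; z--) {
			wp ^= data[z][i];
			wq = (uint8_t)(gf_mul2(wq) ^ data[z][i]);
		}
		p[i] = wp;
		q[i] = wq;
	}
}
```

A vectorised version performs the same multiply-and-XOR step on many bytes at once, which is where combining a permute and an XOR in one instruction helps.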
Previously the raid6 test Makefile did not correctly build the files for
testing on PowerPC. This patch fixes the bug, so that all appropriate files
for PowerPC are built.
Signed-off-by: Matt Brown
---
Changelog
v2 - v4
- fixup whitespace
- change
On Thu, 2017-04-13 at 00:12 +1000, Nicholas Piggin wrote:
> Yeah sure that sounds good. How's this then?
I suppose so :-) When I was testing all that I had a "b ." at 0x500 and
0x4500 and I didn't hit them :)
On Friday 07 April 2017 07:16 PM, Michael Ellerman wrote:
Hari Bathini writes:
On Friday 07 April 2017 07:24 AM, Michael Ellerman wrote:
My preference would be that the fadump kernel "just works". If it's
using too much memory then the fadump kernel should do
Init task invokes smp_ops->setup_cpu() from smp_cpus_done(). Init task can
run on any online CPU at this point, but the setup_cpu() callback requires
to be invoked on the boot CPU. This is achieved by temporarily setting the
affinity of the calling user space thread to the requested CPU and reset
On 04/11/2017 07:10 PM, Michael Ellerman wrote:
> Tyrel Datwyler writes:
>> On 04/11/2017 02:00 AM, Michael Ellerman wrote:
>>> Tyrel Datwyler writes:
I started looking at it when Bharata submitted a patch trying to fix the
From: "Matthew R. Ochs"
As an enhancement to distribute requests to multiple hardware queues, add
the infrastructure to hash a SCSI command into a particular hardware queue.
Support the following scenarios when deriving which queue to use: single
queue, tagging when
From: "Matthew R. Ochs"
As staging for supporting multiple hardware queues, add an attribute to
show and set the current number of hardware queues for the host. Support
specifying a hard limit or a CPU affinitized value. This will allow the
number of hardware queues to
Introduce multiple hardware queues to improve legacy I/O path performance.
Each hardware queue is comprised of a master context and associated I/O
resources. The hardware queues are initially implemented as a static array
embedded in the AFU. This will be transitioned to a dynamic allocation in a
From: "Matthew R. Ochs"
The method used to decode asynchronous interrupts involves unnecessary
loops to match up bits that are set with corresponding entries in the
asynchronous interrupt information table. This algorithm is wasteful
and does not scale well as new
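The snippet is cut off before it describes the replacement. One common way to avoid scanning a table for every bit, shown here only as an illustration and not as the method the patch uses, is to walk just the set bits of the status word so each one can index a table directly. `__builtin_ctzll` is a GCC/Clang builtin; the function name and output layout are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store the positions of all set bits of 'status'
 * in 'bits' (lowest first) and return how many were found. The loop runs
 * once per set bit rather than once per table entry.
 */
static size_t collect_set_bits(uint64_t status, unsigned int bits[64])
{
	size_t n = 0;

	while (status) {
		bits[n++] = (unsigned int)__builtin_ctzll(status);
		status &= status - 1; /* clear the lowest set bit */
	}
	return n;
}
```

Each returned position could then be used as an index into a per-bit information table.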
From: "Matthew R. Ochs"
As a general cleanup, address all reasonable checkpatch warnings and
errors. These include enforcement of comment styles and including named
identifiers in function prototypes.
Signed-off-by: Matthew R. Ochs
From: "Matthew R. Ochs"
Devices supported by the cxlflash driver are fully coherent and do not
require a bus address mapping. Avoid unnecessary path length by using
the virtual address and length already present in the scatter-gather
entry.
Signed-off-by: Matthew R.
From: "Matthew R. Ochs"
Validation statements to enforce assumptions about specific defines
are not being evaluated by the compiler due to the fact that they
reside in a routine that is not used. To activate them, call the
routine as part of module initialization. As
From: "Matthew R. Ochs"
An EEH during probe can lead to a crash as the recovery thread races
with the probe thread. To avoid this issue, introduce new states to
fence out EEH recovery until probe has completed. Also ensure the reset
wait queue is flushed during device
From: "Matthew R. Ochs"
Update the driver to allow for future cards with 4 ports.
Signed-off-by: Matthew R. Ochs
Signed-off-by: Uma Krishnan
---
drivers/scsi/cxlflash/main.c| 78
From: "Matthew R. Ochs"
Update the SISlite header to support 4 ports as outlined in the
SISlite specification. Address fallout from structure renames and
refreshed organization throughout the driver. Determine the number
of ports supported by a card from the global
From: "Matthew R. Ochs"
As staging to support FC-related updates to the SISlite specification,
introduce helper routines to obtain references to FC resources that exist
within the global map. This will allow changes to the underlying global
map structure without
From: "Matthew R. Ochs"
At present, the cxlflash driver only supports hardware with two FC
ports. The code was initially designed with this assumption and is
dependent on having two FC ports - adding more ports will break logic
within the driver.
To mitigate this
From: "Matthew R. Ochs"
Transition from a static number of FC ports to a value that is derived
during probe. For now, a static value is used but this will later be
based on the type of card being configured.
Signed-off-by: Matthew R. Ochs
From: "Matthew R. Ochs"
As staging for future function, pass the config pointer instead of the
AFU pointer for port-related sysfs helper routines.
Signed-off-by: Matthew R. Ochs
Signed-off-by: Uma Krishnan
---
From: "Matthew R. Ochs"
Currently, RRQ processing takes place on hardware interrupt context. This
can be a heavy burden in some environments due to the overhead encountered
while completing RRQ entries. In an effort to improve system performance,
use the IRQ polling
From: "Matthew R. Ochs"
As further staging to support processing the HRRQ by other means, access
to the HRRQ needs to be serialized by a disabled lock. This will allow
safe access in other non-hardware interrupt contexts. In an effort to
minimize the period where
From: "Matthew R. Ochs"
In order to support processing the HRRQ by other means (e.g. polling),
the processing portion of the current RRQ interrupt handler needs to be
broken out into a separate routine. This will allow RRQ processing from
places other than the RRQ
This patch series contains miscellaneous patches and adds 4 port device
support. This series also includes patches to improve performance of the
driver in the legacy IO path.
This series is intended for 4.12 and is bisectable
Matthew R. Ochs (16):
cxlflash: Separate RRQ processing from the RRQ
prom_init.c: Enable support for new DRC device tree properties
"ibm,dynamic-memory-v2" in initial handshake between the Linux kernel
and the front end processor.
Signed-off-by: Michael Bringmann
---
arch/powerpc/kernel/prom_init.c |2 +-
1 file changed, 1
hotplug_init: Simplify the code needed for runtime memory hotplug and
maintenance with a conversion routine that transforms the compressed
property "ibm,dynamic-memory-v2" to the form of "ibm,dynamic-memory"
within the "ibm,dynamic-reconfiguration-memory" property. Thus only
a single set of
powerpc/memory: Add parallel routines to parse the new property
"ibm,dynamic-memory-v2" property when it is present, and then to
finish initialization of the relevant memory structures with the
operating system. This code is shared between the boot-time
initialization functions and the runtime
powerpc/memory: Add parallel routines to parse the new property
"ibm,dynamic-memory-v2" property when it is present, and then to
register the relevant memory blocks with the operating system.
This property format is intended to provide a more compact
representation of memory when communicating
architecture.vec5 features: The boot-time memory management needs to
know the form of the "ibm,dynamic-memory-v2" property early during
scanning of the flattened device tree. This patch moves execution of
the function pseries_probe_fw_features() early enough to be before
the scanning of the
"ibm,dynamic-memory-v2": This property replaces the "ibm,dynamic-memory"
node representation within the "ibm,dynamic-reconfiguration-memory"
property provided by the BMC. This element format is intended to provide
a more compact representation of memory, especially, for systems with
massive
This is the skiboot patch included here if anyone wants to test or review.
I've put more of the feature documentation into this patch, so I haven't
duplicated it on the Linux side -- firmware will be canonical definition.
---
core/Makefile.inc | 2 +-
core/cpufeatures.c | 888
The /cpus/features dt binding describes architected CPU features along
with some compatibility, privilege, and enablement properties that allow
flexibility with discovering and enabling capabilities.
Presence of this feature implies a base level of functionality, then
additional feature nodes
Introduce primitives for FDT parsing. These will be used for powerpc
cpufeatures node scanning, which has quite complex structure but should
be processed early.
Acked-by: Rob Herring
Signed-off-by: Nicholas Piggin
---
drivers/of/fdt.c | 38
POWER9/ISAv3 has no VRMASD field in LPCR. Don't set reserved bits.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/cpu_setup_power.S | 21 -
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kernel/cpu_setup_power.S
I expect this will require still some more changes, but I think it's
getting close to polished. Intention is to make it default off and
unsupported at the initial merge, to give a few more weeks to test
then freeze the format.
I included the firmware cpufeatures patch here as well.
Thanks,
Nick
The definition of smp_mb__after_unlock_lock() is currently smp_mb()
for CONFIG_PPC and a no-op otherwise. It would be better to instead
provide an architecture-selectable Kconfig option, and select the
strength of smp_mb__after_unlock_lock() based on that option. This
commit therefore creates
Hi Rashmica,
On 17/11/2016 at 13:03, Michael Ellerman wrote:
On Fri, 2016-27-05 at 05:48:59 UTC, Rashmica Gupta wrote:
Useful to be able to dump the kernels page tables to check permissions
and memory types - derived from arm64's implementation.
Add a debugfs file to check the page tables.
Hi Anton,
On 04/04/2017 at 00:00, Anton Blanchard wrote:
Hi Christophe,
- if (user_mode(regs))
+ if (!is_exec && user_mode(regs))
Shouldn't it also check 'is_write' ?
If it is a store, is_write should be set, shouldn't it ?
Thanks, Ben had the same suggestion. I'll add that
On Wed, Apr 12, 2017 at 11:42:44PM +0800, Jin, Yao wrote:
>
>
> On 4/12/2017 10:26 PM, Jiri Olsa wrote:
> > On Wed, Apr 12, 2017 at 08:25:34PM +0800, Jin, Yao wrote:
> >
> > SNIP
> >
> > > > # Overhead Command Source Shared Object Source Symbol
> > > > Target
On 4/12/2017 10:26 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 08:25:34PM +0800, Jin, Yao wrote:
SNIP
# Overhead Command Source Shared Object Source Symbol
Target Symbol  Basic Block Cycles
# ...
Hello Michael and Scott,
I see that the status of the below patch has been changed to 'Not
Applicable' in the linuxppc-dev Patchwork.
About this series, David S. Miller said:
Subject: Re: [PATCH 0/2] get rid of immrbar_virt_to_phys()
Date: Wed, 08 Feb 2017 13:17:32 -0500 (EST)
From: David
Excerpts from PrasannaKumar Muralidharan's message of April 5, 2017 11:21:
On 30 March 2017 at 12:46, Naveen N. Rao
wrote:
Also, with a simple module to memset64() a 1GB vmalloc'ed buffer, here
are the results:
generic:0.245315533 seconds time elapsed
From: Christophe Lombard
The new Coherent Accelerator Interface Architecture, level 2, for the
IBM POWER9 brings new content and features:
- POWER9 Service Layer
- Registers
- Radix mode
- Process element entry
- Dedicated-Shared Process Programming Model
-
On Wed, Apr 12, 2017 at 08:25:34PM +0800, Jin, Yao wrote:
SNIP
> > # Overhead Command Source Shared Object Source Symbol
> > Target Symbol  Basic Block Cycles
> > # ...
> >
On Wed, 12 Apr 2017 23:45:42 +1000
Benjamin Herrenschmidt wrote:
> On Wed, 2017-04-12 at 23:11 +1000, Nicholas Piggin wrote:
> > After setting LPES0 in the host on POWER9, the host external interrupt
> > handler no longer works correctly, because it's set to HV mode
On Wed, 2017-04-12 at 01:42 -0700, IanJiang wrote:
>
> In my test, DMA buffers are allocated with (bus 2, device 1, function
> 0) in module Plx8000_NT, but DMA is issued by (bus 1 device 0 function
> 1) in module Plx8000_DMA. And error of (bus 1 device 0 function 1) is
> reported by EEH.
On Wed, 2017-04-12 at 23:11 +1000, Nicholas Piggin wrote:
> After setting LPES0 in the host on POWER9, the host external interrupt
> handler no longer works correctly, because it's set to HV mode (HSRR)
> for POWER7/8 with LPES0 clear. We don't expect to get any EE in the host
> with XIVE, but it
On Mon, 2017-04-10 at 22:48:26 UTC, Michael Ellerman wrote:
> powerpc_debugfs_root is the dentry representing the root of the
> "powerpc" directory tree in debugfs.
>
> Currently it sits in asm/debug.h, along with some other things that
> have "debug" in the name, but are otherwise unrelated.
>
On Wed, 2017-03-22 at 15:04:14 UTC, "Gautham R. Shenoy" wrote:
> From: "Gautham R. Shenoy"
>
> Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which
> transitions the CPU to the deepest available platform idle state to a
> new function named
After setting LPES0 in the host on POWER9, the host external interrupt
handler no longer works correctly, because it's set to HV mode (HSRR)
for POWER7/8 with LPES0 clear. We don't expect to get any EE in the host
with XIVE, but it seems preferable to catch unexpected interrupts in case
there are
On 12/04/2017 at 09:52, Andrew Donnellan wrote:
On 08/04/17 00:11, Christophe Lombard wrote:
+static u32 get_phb_index(struct device_node *np)
{
u32 phb_index;
if (of_property_read_u32(np, "ibm,phb-index", &phb_index))
-return 0;
+return -ENODEV;
Function is
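The review comment is cut off here. As background on the quoted change, a small C sketch of what happens when a negative errno is returned through a `u32`, and one common alternative that returns the status separately from the value. The helper names and the `found` parameter are hypothetical stand-ins for the device-tree lookup.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t u32;

/* Returning -ENODEV through an unsigned type wraps to a large positive value. */
static u32 index_or_wrapped_error(int found)
{
	if (!found)
		return (u32)-ENODEV;
	return 3; /* arbitrary example index */
}

/* Alternative: status in the return value, index through a pointer. */
static int index_with_status(int found, u32 *index)
{
	if (!found)
		return -ENODEV;
	*index = 3; /* arbitrary example index */
	return 0;
}
```

With the first form a caller cannot tell the error apart from a valid index by testing for a negative value; with the second form the usual `if (rc < 0)` check works.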
christophe lombard writes:
> On 12/04/2017 at 04:11, Michael Ellerman wrote:
> Hi,
>
> Here is a new patch which updates the documentation based
> on the complete PATCH V4 7/7.
> Let me know if it suits you.
Fine by me, I'll wait for Fred's ack before I merge it
From: "Gautham R. Shenoy"
The idle-exit code assumes that if Timebase is not lost, then neither
are the per-core hypervisor resources lost. This was true on POWER8
where fast-sleep lost only TB but not per-core resources, and winkle
lost both.
This assumption is not
From: "Gautham R. Shenoy"
On wakeup from a deep-stop used for CPU-Hotplug, we invoke
cur_cpu_spec->cpu_restore() which would set sane default values to
various SPRs including LPCR.
On POWER9, the cpu_restore_power9() call would restore LPCR to a
sane value that is
From: "Gautham R. Shenoy"
This patch ensures that POWER8 and POWER9 processors use the correct
value of IDLE_THREAD_BITS as POWER8 has 8 threads per core and hence
the IDLE_THREAD_BITS should be 0xFF while POWER9 has only 4 threads
per core and hence the IDLE_THREAD_BITS
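The sentence is cut off before the POWER9 value. Assuming the mask carries one bit per hardware thread, as 0xFF for eight threads suggests, the mask for any thread count can be derived as in this C sketch. The function name is illustrative and not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a mask with one bit set per hardware thread.
 * Valid for thread counts below 32.
 */
static inline uint32_t idle_thread_mask(unsigned int threads_per_core)
{
	return (UINT32_C(1) << threads_per_core) - 1;
}
```

Deriving the mask from the thread count avoids hard-coding a value that is only right for one processor generation.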
From: "Gautham R. Shenoy"
Hi,
This patchset contains three fixes required to get a deep stop state
that can lose the Hypervisor state to work correctly.
The first patch in the series uses the correct value for the
IDLE_THREAD_BITS on POWER8 which has 8 threads per core
On Wed, 2017-04-05 at 23:01:33 UTC, Benjamin Herrenschmidt wrote:
> Signed-off-by: Benjamin Herrenschmidt
Applied to topic/xive, thanks.
https://git.kernel.org/powerpc/c/eeea1a434ddedbb5aaeac1a8661445
cheers
On Wed, 2017-04-05 at 07:54:47 UTC, Benjamin Herrenschmidt wrote:
> Add 32 and 8 bit variants
>
> Signed-off-by: Benjamin Herrenschmidt
Series applied to topic/xive, thanks.
https://git.kernel.org/powerpc/c/22bd64a621cc80beeb009abec3d3df
cheers
Along similar lines as commit 9326638cbee2 ("kprobes, x86: Use
NOKPROBE_SYMBOL() instead of __kprobes annotation"), convert __kprobes
annotation to either NOKPROBE_SYMBOL() or nokprobe_inline. The latter
forces inlining, in which case the caller needs to be added to
NOKPROBE_SYMBOL().
Also:
-
v3:
https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg114669.html
For v4, this has been rebased on top of powerpc/next as well as the
KPROBES_ON_FTRACE series. No other changes.
- Naveen
Naveen N. Rao (2):
powerpc: split ftrace bits into a separate file
powerpc: ftrace_64: split
entry_*.S now includes a lot more than just kernel entry/exit code. As a
first step at cleaning this up, let's split out the ftrace bits into
separate files. Also move all related tracing code into a new trace/
subdirectory.
No functional changes.
Suggested-by: Michael Ellerman
Split ftrace_64.S further retaining the core ftrace 64-bit aspects
in ftrace_64.S and moving ftrace_caller() and ftrace_graph_caller() into
separate files based on -mprofile-kernel. The livepatch routines are all
now contained within the mprofile file.
Signed-off-by: Naveen N. Rao
Allow kprobes to be placed on ftrace _mcount() call sites. This
optimization avoids the use of a trap, by riding on ftrace
infrastructure.
This depends on HAVE_DYNAMIC_FTRACE_WITH_REGS which depends on
MPROFILE_KERNEL, which is only currently enabled on powerpc64le with
newer toolchains.
Based
From: Masami Hiramatsu
Skip preparing optprobe if the probe is ftrace-based, since anyway, it
must not be optimized (or already optimized by ftrace).
Tested-by: Naveen N. Rao
Signed-off-by: Masami Hiramatsu
---
Though
Pass the real LR to the ftrace handler. This is needed for
KPROBES_ON_FTRACE for the pre handlers.
Also, with KPROBES_ON_FTRACE, the link register may be updated by the
pre handlers or by a registered kretprobe. Honor updated LR by restoring
it from pt_regs, rather than from the stack save area.
KPROBES_ON_FTRACE avoids much of the overhead with regular kprobes as it
eliminates the need for a trap, as well as the need to emulate or
single-step instructions.
Though OPTPROBES provides us with similar performance, we have limited
optprobes trampoline slots. As such, when asked to probe at a
v2:
https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg114659.html
For v3, this has only been rebased on top of powerpc/next and carries a
minor change to patch 4/5. No other changes.
Also, though patch 3/5 is generic, it needs to be carried in this
series as we crash on powerpc
Move the stack setup and teardown code to the ftrace_graph_caller().
This way, we don't incur the cost of setting it up unless function graph
is enabled for this function.
Also, remove the extraneous LR restore code after the function graph
stub. LR has previously been restored and neither
kprobe_lookup_name() is specific to the kprobe subsystem and may not
always return the function entry point (in a subsequent patch for
KPROBES_ON_FTRACE). For looking up function entry points, introduce a
separate helper and use the same in optprobes.c
Signed-off-by: Naveen N. Rao
This helper will be used in a subsequent patch to emulate instructions
on re-entering the kprobe handler. No functional change.
Acked-by: Ananth N Mavinakayanahalli
Signed-off-by: Naveen N. Rao
---
arch/powerpc/kernel/kprobes.c | 52
On kprobe handler re-entry, try to emulate the instruction rather than
single stepping always.
As a related change, remove the duplicate saving of msr as that is
already done in set_current_kprobe()
Acked-by: Ananth N Mavinakayanahalli
Signed-off-by: Naveen N. Rao
The macro is now pretty long and ugly on powerpc. In the light of
further changes needed here, convert it to a __weak variant to be
over-ridden with a nicer looking function.
Suggested-by: Masami Hiramatsu
Signed-off-by: Naveen N. Rao
---
commit 239aeba76409 ("perf powerpc: Fix kprobe and kretprobe handling
with kallsyms on ppc64le") changed how we use the offset field in struct
kprobe on ABIv2. perf now offsets from the GEP (Global entry point) if an
offset is specified and otherwise chooses the LEP (Local entry point).
Fix the
v1:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1334843.html
For v2, this series has been re-ordered and rebased on top of
powerpc/next so as to make it easier to resolve conflicts with -tip. No
other changes.
- Naveen
Naveen N. Rao (5):
kprobes: convert kprobe_lookup_name()
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
>
> 3. Use 2 bits in perf_branch_entry for a "cross" metrics checking
>    for branch cross 4K or 2M area. It's an approximate computing
>    for checking if the branch cross 4K page or 2MB page.
>
> For example:
>
> perf record -g