From: Andi Kleen
This is a very complex function, which is called in multiple places.
It is unlikely that inlining or not inlining it makes any difference
for its run time.
This saves around 13k text in my kernel
   text    data     bss     dec     hex filename
9083992 5367600 6544
> > Or define some quirk table just for this purpose?
>
> Nope. It's about identifying the bus.
PCI just has no good way to identify busses.
>
> The bus which contains the uncore devices:
>
> The Uncore devices reside on CPUBUSNO(1), which is the PCI bus assigned
> for the processor socket.
> And the way this function is used is a horrible hack. It's called from
> a random driver at some random point in time.
>
> The proper solution is to identify the bus at the point where the bus is
> discovered and switch it to mmconfig if possible.
But how would you know that it is safe?
AFA
> and ARCH_SPARSEMEM_DEFAULT is enabled on 64b. So I guess whatever was
> the reason to add this code back in 2006 is not true anymore. So I am
> really wondering. Do we absolutely need to assign pages which are not
> onlined yet to the ZONE_NORMAL unconditionally? Why cannot we put them
> out of a
> > Available from
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git
> > perf/builtin-json-29
>
> hi,
> I can't see the branch..
It's there
https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=perf/builtin-json-29
-Andi
>
> [jolsa@krava perf]$ git r
From: Andi Kleen
For performance testing it is useful to be able to disable AVX
and AVX512. User programs check in XGETBV if AVX is supported
by the OS. If we don't initialize the XSAVE state for AVX it will
appear as if the OS is not supporting AVX.
Implement disable options for AVX and A
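The user-space check mentioned above can be sketched roughly as follows.
This is only an illustration: the helper names are hypothetical, and the
XCR0 value is passed in as a parameter instead of being read with XGETBV,
so the logic can be exercised on any machine. The bit positions are the
XCR0 state-component bits from the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits (Intel SDM). */
#define XCR0_SSE        (1ULL << 1)
#define XCR0_AVX        (1ULL << 2)
#define XCR0_OPMASK     (1ULL << 5)
#define XCR0_ZMM_HI256  (1ULL << 6)
#define XCR0_HI16_ZMM   (1ULL << 7)

/* Hypothetical helper: true if the OS enabled SSE and AVX state in XCR0. */
static bool os_supports_avx(uint64_t xcr0)
{
	uint64_t need = XCR0_SSE | XCR0_AVX;

	return (xcr0 & need) == need;
}

/* Hypothetical helper: AVX512 additionally needs opmask and both ZMM parts. */
static bool os_supports_avx512(uint64_t xcr0)
{
	uint64_t need = XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM;

	return os_supports_avx(xcr0) && (xcr0 & need) == need;
}
```

A real program would first check CPUID for OSXSAVE and then obtain the
value with XGETBV (ECX=0). If the kernel leaves the AVX bits clear, both
helpers return false, which is the effect the patch description relies on.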
From: Andi Kleen
Move the XSAVE initialization code to be after parsing early parameters.
I don't see any reason why the FPU code needs to be initialized that
early, nothing else in the initialization phase uses XSAVE.
This is useful to be able to handle command line parameters in the
On Sat, Mar 11, 2017 at 11:46:37AM +0100, Thomas Gleixner wrote:
> > +enum xsave_features {
> > + XSAVE_X87,
> > + XSAVE_SSE,
> > + XSAVE_AVX,
> > + XSAVE_MPX_BOUNDS,
> > + XSAVE_MPX_CSR,
> > + XSAVE_AVX512_OPMASK,
> > + XSAVE_AVX512_HI256,
> > + XSAVE_AVX512_ZMM_HI256,
> > + XSAV
From: Andi Kleen
Since
cd9c57c x86/MCE: Dump MCE to dmesg if no consumers
in 4.9 all MCEs are printed even when mcelog is running. This fixes
the code again to not print when mcelog is running, because it
already takes care of the logging and predicting.
This fixes spamming all xterms for
From: Andi Kleen
For performance testing it is useful to be able to disable AVX
and AVX512. User programs check in XGETBV if AVX is supported
by the OS. If we don't initialize the XSAVE state for AVX it will
appear as if the OS is not supporting AVX. For kernel users we
can also clea
Andi Kleen writes:
> Hi,
>
> I had a large system with lots of cores stop responding to new ssh
> requests. It turned out it crashed in the tty layer. The system
> has a serial console and had some active sshs and screen
Correction. This may have been a linux-next kernel, not 4.
Hi,
I had a large system with lots of cores stop responding to new ssh
requests. It turned out it crashed in the tty layer. The system
has a serial console and had some active sshs and screen
[24922.097093] BUG: unable to handle kernel paging request at 2260
[24922.64] IP: n_tty_
From: Andi Kleen
Add support for parsing the MetricExpr header in the JSON event lists and
storing them in the alias structure.
Used in the next patch.
v2: Change DividedBy to MetricExpr
v3: Really catch all uses of DividedBy
Signed-off-by: Andi Kleen
---
tools/perf/pmu-events/jevents.c
From: Andi Kleen
Add generic infrastructure to perf stat to output ratios for "MetricExpr"
entries in the event lists. Many events are more useful as ratios
than in raw form, typically some count in relation to total ticks.
Transfer the MetricExpr information from the alias to the
From: Andi Kleen
Factor out the PMU name matching in the event parser into a separate function,
to use the same code for other grammar rules later.
Signed-off-by: Andi Kleen
---
tools/perf/util/parse-events.c | 46 ++
tools/perf/util/parse-events.h | 5
From: Andi Kleen
Move the printing of perf expressions and internal events to a
new clearer --details flag, instead of lumping it together
with other debug options in --debug. This makes it clearer
to use.
Before
perf list --debug
...
unc_m_power_critical_throttle_cycles
[Cycles all
From: Andi Kleen
To be used in next patch to support automatic summing of alias
events.
v2: Move check for bad results to next patch
v3: Remove trivial addition.
Signed-off-by: Andi Kleen
---
tools/perf/builtin-stat.c | 103 +++---
1 file changed, 80
From: Andi Kleen
Add support for a new JSON event attribute to name MetricExpr for better output
in perf stat.
If the event has no MetricName it uses the normal event name instead to describe
the metric.
Before
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}'
From: Andi Kleen
Add a simple expression parser good enough to parse JSON relation
expressions. The parser is implemented using bison.
This is just intended as a simple parser for internal usage
in the event lists, not the beginning of a "perf scripting language"
v2: Use expr__ pref
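To give an idea of what such an expression evaluator has to do, here is a
small self-contained sketch. Note that it is NOT the bison grammar from the
patch: it swaps in a hand-written precedence-climbing parser, and all names
(struct var, eval_expr, ...) are hypothetical. It handles numbers,
identifiers looked up in a table, + - * / and parentheses.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* A named input such as an event count. All names here are hypothetical. */
struct var { const char *name; double val; };

struct parser {
	const char *p;           /* current position in the input */
	const struct var *vars;  /* known identifiers */
	int nvars;
	int err;                 /* set on any syntax or lookup error */
};

static double parse_expr(struct parser *ps, int min_prec);

static void skip_ws(struct parser *ps)
{
	while (isspace((unsigned char)*ps->p))
		ps->p++;
}

/* primary := number | identifier | '(' expr ')' */
static double parse_primary(struct parser *ps)
{
	const char *start;
	double v;
	int i;
	skip_ws(ps);
	start = ps->p;
	if (*ps->p == '(') {
		ps->p++;
		v = parse_expr(ps, 1);
		skip_ws(ps);
		if (*ps->p == ')')
			ps->p++;
		else
			ps->err = 1;
		return v;
	}
	if (isdigit((unsigned char)*ps->p)) {
		char *end;
		v = strtod(ps->p, &end);
		ps->p = end;
		return v;
	}
	while (isalnum((unsigned char)*ps->p) || *ps->p == '_' || *ps->p == '.')
		ps->p++;
	for (i = 0; i < ps->nvars && ps->p != start; i++)
		if (strlen(ps->vars[i].name) == (size_t)(ps->p - start) &&
		    memcmp(ps->vars[i].name, start, (size_t)(ps->p - start)) == 0)
			return ps->vars[i].val;
	ps->err = 1;             /* unknown identifier or unexpected character */
	return 0;
}

static int prec(char op)
{
	return (op == '+' || op == '-') ? 1 : (op == '*' || op == '/') ? 2 : 0;
}

/* Precedence climbing: handle binary operators of at least min_prec (>= 1). */
static double parse_expr(struct parser *ps, int min_prec)
{
	double v = parse_primary(ps);
	for (;;) {
		double rhs;
		char op;
		skip_ws(ps);
		op = *ps->p;
		if (prec(op) < min_prec)
			return v;
		ps->p++;
		rhs = parse_expr(ps, prec(op) + 1);
		if (op == '+')
			v += rhs;
		else if (op == '-')
			v -= rhs;
		else if (op == '*')
			v *= rhs;
		else if (rhs != 0)
			v /= rhs;
		else
			ps->err = 1; /* division by zero */
	}
}

/* Returns 0 and stores the value in *res on success, -1 on error. */
static int eval_expr(const char *s, const struct var *vars, int nvars, double *res)
{
	struct parser ps = { s, vars, nvars, 0 };
	*res = parse_expr(&ps, 1);
	skip_ws(&ps);
	if (*ps.p != '\0')
		ps.err = 1;      /* trailing garbage */
	return ps.err ? -1 : 0;
}
```

With a table holding, say, "cycles" and "instructions", a call like
eval_expr("instructions / cycles", vars, 2, &r) yields the ratio. A
generated grammar makes the operator rules easier to read and extend; the
hand-written form is used here only to keep the example in one file.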
From: Andi Kleen
Special case uncore_ prefix in PMU match, to allow for shorter event
uncore specifications.
Before
perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1
After
perf stat -a -e cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1
Signed-off-by: Andi Kleen
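The matching rule described above could look roughly like this. It is a
sketch only, not the code from the patch; pmu_name_matches() is a
hypothetical name. A user-supplied token matches a PMU if it is a prefix
of the PMU name, either directly or after skipping a leading "uncore_".

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true when 'tok' is a prefix of 'pmu', either
 * as-is or after skipping a leading "uncore_" in the PMU name.
 */
static bool pmu_name_matches(const char *pmu, const char *tok)
{
	static const char prefix[] = "uncore_";
	size_t len = strlen(tok);

	/* Direct prefix match, e.g. "uncore_cbox" against "uncore_cbox_0". */
	if (strncmp(pmu, tok, len) == 0)
		return true;

	/* Otherwise only PMUs that carry the uncore_ prefix can match. */
	if (strncmp(pmu, prefix, sizeof(prefix) - 1) != 0)
		return false;

	return strncmp(pmu + sizeof(prefix) - 1, tok, len) == 0;
}
```

With this rule both "uncore_cbox" and "cbox" select uncore_cbox_0,
uncore_cbox_1 and so on, while "cbox" does not select the cpu PMU.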
From: Andi Kleen
When the user specifies a pmu directly, expand it automatically
with a prefix match for all available PMUs, similar to what we do for
the normal aliases now.
This allows specifying attributes for duplicated boxes quickly.
For example uncore_cbox_{0,6}/.../ can be now specified as
From: Andi Kleen
- Add MetricName to describe Metric
- Remove redundant "derived from" in descriptions
- Rename UNC_M_CAS_COUNT to LLC_MISSES.READ
Signed-off-by: Andi Kleen
---
.../arch/x86/broadwellde/uncore-cache.json | 28 ++--
.../arch/x86/broadwellde/uncore-m
From: Andi Kleen
Output the metric expr in perf list when --debug is specified, so that the user
can check the formula.
Before:
% perf list
...
unc_m_power_channel_ppd
[Cycles where DRAM ranks are in power down (CKE) mode. Derived from
unc_m_power_channel_ppd. Unit
This patch kit further improves support for Intel uncore events in
the Linux perf user tool. The basic support has been already
merged earlier, but this makes it nicer to use.
- Collapse counts from duplicated boxes to make the output
easier to read.
- Support specifying events for multiple duplic
From: Andi Kleen
The uncore PMU has a lot of duplicated PMUs for different subsystems.
When expanding an uncore alias we usually end up with a large
number of identically named aliases, which makes perf stat
output difficult to read.
Automatically sum them up in perf stat, unless --no-merge is
From: Andi Kleen
When any result that is being merged is bad, mark them all
bad to give consistent output in interval mode.
No before/after, because the issue was only found in theoretical
review and it is hard to reproduce
Signed-off-by: Andi Kleen
---
tools/perf/builtin-stat.c | 10
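A minimal sketch of the merging behaviour, assuming a simplified data
layout: struct box_count and merge_counts() are hypothetical and much
simpler than what perf stat uses. Counts from duplicated boxes are summed,
and if any single input is bad the whole merged result is reported as bad.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One reading from one duplicated box; field names are hypothetical. */
struct box_count {
	uint64_t val;
	bool ok;        /* false if the counter could not be read */
};

/*
 * Sum the counts of all duplicated boxes into *sum.
 * Returns false (and leaves *sum at 0) if any input is bad, so that
 * the merged value is never a partial sum.
 */
static bool merge_counts(const struct box_count *c, size_t n, uint64_t *sum)
{
	uint64_t total = 0;
	size_t i;

	*sum = 0;
	for (i = 0; i < n; i++) {
		if (!c[i].ok)
			return false;
		total += c[i].val;
	}
	*sum = total;
	return true;
}
```

Treating the merged result as all-or-nothing keeps interval output
consistent: a line is either a full sum over all boxes or marked bad.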
> One example of the problems with extra layers what this patch fixes:
> mmap_pgoff() should never be using SHM_HUGE_* logic. This was
> introduced by:
>
>091d0d55b28 (shm: fix null pointer deref when userspace specifies invalid
> hugepage size)
>
> It is obviously harmless but lets just rip
On Fri, Mar 03, 2017 at 11:33:03AM +0100, Jiri Olsa wrote:
> On Tue, Feb 28, 2017 at 10:49:15PM -0800, Andi Kleen wrote:
>
> SNIP
>
> > +static void collect_data(struct perf_evsel *counter,
> > + void (*cb)(struct perf_ev
From: Andi Kleen
This fills in the pci_bus_force_mmconfig interface that was
added earlier for x86 to allow drivers to optimize config
space accesses. The implementation is straightforward
and uses the existing mmconfig access functions, just forcing
mmconfig access.
Signed-off-by: Andi Kleen
From: Andi Kleen
The Intel uncore driver can do a lot of PCI config accesses to read
performance counters. I had a situation on a 4S system where it
was spending 40+% of CPU time grabbing the pci_cfg_lock due to that.
For 64bit x86 with MMCONFIG there isn't really any reason to take
a lock
From: Andi Kleen
On Intel systems some uncore counters are located in PCI config space.
On 4S systems with many uncore events being sampled at a high frequency
we can see significant overhead from the type 1 accesses: both
from the IO port accesses and also from lock contention on the locks
From: Andi Kleen
x86 traditionally used mmconfig only for extended config space accesses
with offsets larger than 256. For lower offsets it uses the classic
Type 1 IO port access. This is quite slow and also requires taking
a global spin lock to protect the Type 1 IO port mailbox.
IIRC (I added
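For reference, the two addressing schemes being compared can be sketched
as below. The function names are hypothetical and this is not the kernel
code. Type 1 writes an address word to I/O port 0xCF8 and then accesses
the data at port 0xCFC, which is why a lock is needed around the pair.
MMCONFIG (ECAM) gives every register its own memory address, so a single
load or store is enough.

```c
#include <assert.h>
#include <stdint.h>

/* Address word written to port 0xCF8 for a Type 1 access (regs 0..255). */
static uint32_t pci_type1_address(unsigned bus, unsigned dev, unsigned fn,
				  unsigned reg)
{
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 256);
	return 0x80000000u | (bus << 16) | (dev << 11) | (fn << 8) |
	       (reg & 0xfcu);
}

/* Byte offset into the MMCONFIG window of one segment (regs 0..4095). */
static uint32_t pci_mmcfg_offset(unsigned bus, unsigned dev, unsigned fn,
				 unsigned reg)
{
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
	return (bus << 20) | (dev << 15) | (fn << 12) | reg;
}
```

The Type 1 address word only has room for an 8-bit register number, which
is why extended config space is reached through MMCONFIG.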
From: Andi Kleen
Add support for a new JSON event attribute to name MetricExpr for better output
in perf stat.
If the event has no MetricName it uses the normal event name instead to describe
the metric.
So far no names are added.
Signed-off-by: Andi Kleen
---
tools/perf/pmu-events/jevents.c
From: Andi Kleen
To be used in next patch to support automatic summing of alias
events.
v2: Move check for bad results to next patch
Signed-off-by: Andi Kleen
---
tools/perf/builtin-stat.c | 104 --
1 file changed, 81 insertions(+), 23 deletions
From: Andi Kleen
Output the metric expr in perf list when -v is specified, so that the user
can check the formula.
Before:
% perf list -v
...
unc_m_power_channel_ppd
[Cycles where DRAM ranks are in power down (CKE) mode. Derived from
unc_m_power_channel_ppd. Unit
On Fri, Feb 24, 2017 at 09:37:20AM +0100, Jiri Olsa wrote:
> On Thu, Feb 23, 2017 at 04:10:12PM -0800, Andi Kleen wrote:
> > From: Andi Kleen
> >
> > To be used in next patch to support automatic summing of alias
> > events.
> >
> > Signed-off-by: Andi Kl
From: Andi Kleen
Implement printing instruction sequences as hex dump for branch stacks.
This relies on the x86 instruction decoder used by the PT decoder to find
the lengths of instructions to dump them individually.
This is good enough for pattern matching.
This allows to study hot paths for
From: Andi Kleen
Add a simple expression parser good enough to parse JSON relation
expressions. The parser is implemented using bison.
v2: Use expr__ prefix instead of expr_
Support multiple free variables for parser
Signed-off-by: Andi Kleen
---
tools/perf/tests/Build | 1 +
tools
From: Andi Kleen
To be used in next patch to support automatic summing of alias
events.
Signed-off-by: Andi Kleen
---
tools/perf/builtin-stat.c | 114 --
1 file changed, 91 insertions(+), 23 deletions(-)
diff --git a/tools/perf/builtin-stat.c b
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index 0888a879120f..d6c6aa80675f 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -357,7 +357,7 @@ static void amd_e400_idle(void)
> > if (!amd_e400_c1e_detected) {
> > u3
> No. It is related to the counter width. The number of bits we can use should be
> 1 bit less than the total width. Otherwise, there will be a problem.
> For big cores such as haswell, broadwell, skylake, the counter width is 48
> bit.
> So we can only use 47 bits.
> For Silvermont and KNL, the counte
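The arithmetic in the quoted explanation can be written down as a small
sketch. max_period_for_width() is a hypothetical helper, not a kernel
function: with one bit less than the counter width usable, the largest
value is 2^(width-1) - 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest value usable with (width - 1) bits. */
static uint64_t max_period_for_width(unsigned width)
{
	assert(width >= 2 && width <= 64);
	return (1ULL << (width - 1)) - 1;
}
```

For the 48 bit counters mentioned above this gives 0x7fffffffffff, that
is, 47 usable bits.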
Commit-ID: f23610245c1aa0e912476e642bd5107d04122230
Gitweb: http://git.kernel.org/tip/f23610245c1aa0e912476e642bd5107d04122230
Author: Andi Kleen
AuthorDate: Fri, 27 Jan 2017 18:03:40 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 8 Feb 2017 08:55:04 -0300
perf list: Add
Commit-ID: fedb2b518239cbc00abcf0d200e0be8436251c11
Gitweb: http://git.kernel.org/tip/fedb2b518239cbc00abcf0d200e0be8436251c11
Author: Andi Kleen
AuthorDate: Fri, 27 Jan 2017 18:03:37 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 8 Feb 2017 08:55:03 -0300
perf jevents
Commit-ID: dd32cb5d8fd42316bf8c2b9f7e5c51a38625f755
Gitweb: http://git.kernel.org/tip/dd32cb5d8fd42316bf8c2b9f7e5c51a38625f755
Author: Andi Kleen
AuthorDate: Sat, 17 Sep 2016 18:10:03 -0700
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 8 Feb 2017 16:37:35 -0300
perf vendor
Commit-ID: 15b22ed369aa23ef4d083ffb9621650c353d3ddd
Gitweb: http://git.kernel.org/tip/15b22ed369aa23ef4d083ffb9621650c353d3ddd
Author: Andi Kleen
AuthorDate: Fri, 27 Jan 2017 18:03:38 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 8 Feb 2017 08:55:03 -0300
perf pmu
Commit-ID: 231bb2aa32498cbebef1306889a02114e9dfc934
Gitweb: http://git.kernel.org/tip/231bb2aa32498cbebef1306889a02114e9dfc934
Author: Andi Kleen
AuthorDate: Fri, 27 Jan 2017 18:03:39 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 8 Feb 2017 08:55:04 -0300
perf pmu
Commit-ID: d581141970ef3965c1624960fa928d765afd8a3e
Gitweb: http://git.kernel.org/tip/d581141970ef3965c1624960fa928d765afd8a3e
Author: Andi Kleen
AuthorDate: Fri, 27 Jan 2017 18:03:36 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 8 Feb 2017 08:55:02 -0300
perf jevents
> > - no_size = !!size;
>
> Erk! Isn't the logic the wrong way around here? Sorry!
> i.e. should be:
Yes it works with that change too.
>
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index c5a6e0b12452..78bd632f144d 100644
> --- a/tools/perf/util/auxtr
> But what about my question? Do you think the changes are ok? I actually
> made all be fallthrough, i.e. considered that the existing code was ok.
Yes the changes are fine.
-Andi
On Thu, Feb 09, 2017 at 07:37:55PM +0100, Jiri Olsa wrote:
> > The last time I proposed separate files Ingo vetoed it.
> > He wanted everything built in.
>
> sure, he vetoed it for event files.. expressions could be built
> in same way as we have events now
That's exactly what I implemented. The ex
On Thu, Feb 09, 2017 at 01:50:39PM -0300, Arnaldo Carvalho de Melo wrote:
> Hi,
>
> I've updated the container with Fedora Rawhide I use to build
> tools/perf/ and samples/bcc/ and it now comes with gcc 7, where I get
> things like:
FWIW, but it just shows that you should never ship softwar
On Thu, Feb 09, 2017 at 12:39:37PM +0100, Jiri Olsa wrote:
> and this makes me think, that this is not the right approach
>
> adding extra copy of an event when you want to add new expression?
I don't want to add new expressions.
I don't even need arbitrary expressions, just DividedBy
to get per
On Wed, Feb 08, 2017 at 12:31:34PM +0100, Jiri Olsa wrote:
> On Fri, Jan 27, 2017 at 06:03:45PM -0800, Andi Kleen wrote:
> > From: Andi Kleen
> >
> > Add generic infrastructure to perf stat to output ratios for "MetricExpr"
> > entries in the event lists.
On Mon, Feb 06, 2017 at 06:05:29PM +0200, Alexander Shishkin wrote:
> Andi Kleen writes:
>
> > Alexander Shishkin writes:
> >
> >> Now that Intel PT supports more types of trace content than just branch
> >> tracing, it may be useful to allow the user to di
Alexander Shishkin writes:
> Now that Intel PT supports more types of trace content than just branch
> tracing, it may be useful to allow the user to disable branch tracing
> when it is not needed.
>
> The special case is BDW, where not setting BranchEn is not supported.
>
> This is slightly tric
From: Andi Kleen
The address filter code disallows filtering for per-cpu events,
because it would require dynamically changing user address filters
in context switches.
For the special case of filtering on kernel code only,
we can allow it, as the kernel code always stays at the same
addresses
From: Andi Kleen
Currently a filter like
--filter "filter _text / _end"
doesn't work because _end doesn't have a size. The
filter resolution always wants to use the end of the function
as end.
Allow this case by assuming the filter just spans to the
start of the end sym
> > I'm not sure this is a real requirement. It's just an optimization,
> > right? If you can assign policies to threads, you can implicitly set it
> > per CPU through affinity (or the other way around).
>
> That's difficult when distinct users/systems do monitoring and system
> management. What
"Luck, Tony" writes:
> 9)Measure per logical CPU (pick active RMID in same precedence for
> task/cpu as CAT picks CLOSID)
> 10) Put multiple CPUs into a group
I'm not sure this is a real requirement. It's just an optimization,
right? If you can assign policies to threads, you can implicitl
From: Andi Kleen
Add a simple expression parser good enough to parse JSON relation
expressions. The parser is implemented using bison.
Signed-off-by: Andi Kleen
---
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 4 ++
tools/perf/tests/expr.c | 48
From: Andi Kleen
For debugging and testing it is useful to see the converted
alias string. Add support to perf stat/record and perf list to print
the alias conversion. The text string is saved in the alias structure.
For perf stat/record it is folded into the normal -v. For perf list
-v was
From: Andi Kleen
When the user specifies a pmu directly, expand it automatically
with a prefix match, similar to what we do for the normal aliases now.
This allows specifying attributes for duplicated boxes quickly.
For example uncore_cbox_{0,6}/.../ can be now specified as cbox/.../
and it gets
From: Andi Kleen
Add support for parsing the MetricExpr header in the JSON event lists and
storing them in the alias structure.
Used in the next patch.
v2: Change DividedBy to MetricExpr
Signed-off-by: Andi Kleen
---
tools/perf/pmu-events/jevents.c| 18 ++
tools/perf/pmu
From: Andi Kleen
Handle the Unit field, which is needed to find the right PMU for
an event. We call it "pmu" and convert it to the perf pmu name
with an uncore prefix.
Handle the ExtSel field, which just extends the event mask with
an additional bit.
Handle the Filter field
From: Andi Kleen
Add support for registering json aliases per PMU. Any alias
with a unit matching the prefix is registered to the PMU.
Uncore has multiple instances of most units, so all
these aliases get registered for each individual PMU
(this is important later to run the event on every
From: Andi Kleen
The next patch needs to modify event code. Previously eventcode was just
passed through as a string. Now parse it as a number.
v2: Don't special case 0
Acked-by: Jiri Olsa
Signed-off-by: Andi Kleen
---
tools/perf/pmu-events/jevents.c | 10 +-
1 file chang
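Parsing the event code as a number instead of passing the string through
could look roughly like this. It is a sketch under assumptions:
parse_event_code() is a hypothetical name and the real jevents code may
handle errors differently. Base 0 lets strtoul() accept hex, decimal and
octal spellings, and a value of 0 is parsed like any other number.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an event code string into a number.
 * Returns false on empty input, trailing characters or overflow.
 */
static bool parse_event_code(const char *s, unsigned long *code)
{
	char *end;

	errno = 0;
	*code = strtoul(s, &end, 0);    /* base 0: "0x3c", "60" or "074" */
	return errno == 0 && end != s && *end == '\0';
}
```

Once the code is a number, a later step can OR additional bits into it,
which is not possible while it is still an opaque string.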
From: Andi Kleen
The code for handling PMU aliases without specifying
the PMU was hardcoded to support only the cpu PMU.
This patch extends it to work for all PMUs. We always
duplicate the event for all PMUs that have a matching alias.
This allows automatically expanding an alias for all instances
This adds uncore support on top of the recently merged JSON event list
infrastructure for core events. Uncore is everything outside the core,
including memory controllers, PCI, interconnect etc.
Uncore is more complicated to handle than core events because it uses
many duplicated PMUs, which leads
> Do you know if there is any tool comparing the output of objdump -d to what is
> produced by a similar xed based tool?
I'm not aware of such a tool, but could be written using the "xed" tool
in the xed distribution. However I would trust xed over objdump,
it is used widely in Intel tools with li
>
> [jolsa@krava perf]$ git branch -r | grep xed-
> ak/perf/xed-3
> ak/perf/xed-4
Pushed.
-Andi
A native disassembler in perf is very useful, in particular with perf script to
trace instruction streams, but also for other analysis. Previously I attempted
to do this using the udis86 library, but that was rejected because:
- udis86 was not maintained anymore and lacking recent instructions
-
From: Andi Kleen
Implement printing full disassembled sequences for branch stacks in perf
script. This allows to directly print hot paths for individual samples,
together with branch misprediction and cycle count / IPC information if
available (on Skylake systems). This only works when no
From: Andi Kleen
When dumping PT traces with perf script it is very useful to see the
assembler for each sample, so that it is easily possible to follow
the control flow.
As using objdump is difficult and inefficient from perf script this
patch uses the Intel xed library to implement assembler
From: Andi Kleen
Add autoprobing for the xed disassembler library.
Can be downloaded from https://github.com/intelxed/xed
v2: Hide. Require XED=1 to enable. Add XED_DIR
v3: Remove -lxed from probe all. Don't touch FEATURE_DISPLAY.
v4: Move to FEATURE_FLAGS_BASIC
Signed-off-by: Andi
From: Andi Kleen
Add a one liner warning for perf features that need to be enabled
explicitly by the user, so that they know they are missing something.
Currently enabled for XED and BABELTRACE.
Signed-off-by: Andi Kleen
---
tools/perf/Makefile.config | 10 ++
1 file changed, 10
From: Andi Kleen
Add a generic disassembler function for x86 using the XED library,
and a fallback function for architectures that don't implement one.
Other architectures can implement their own disassembler functions.
The previous version of this patch used udis86, but was
rejected be
> > index 8bffe99d8e3f..4bfc98953aba 100644
> > --- a/tools/perf/util/pmu.c
> > +++ b/tools/perf/util/pmu.c
> > @@ -587,14 +587,13 @@ static struct perf_pmu *pmu_lookup(const char *name)
> > if (pmu_format(name, &format))
> > return NULL;
> >
> > - if (pmu_aliases(name, &aliases
> will it always show 'not supported', as I haven't found this in the
> changelog I guess you did not know about this behaviour?
Not guaranteed. Will fix that.
>
> could you also please document it somewhere
Ok.
-Andi
> > % perf stat -a -e unc_c_llc_lookup.any sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > 2,685,120 Bytes unc_c_llc_lookup.any
> >
> >1.002648032 seconds time elapsed
>
>
> if one of them is not supported, we get wrong output:
I would argue the outpu
On Wed, Jan 18, 2017 at 01:16:06PM +0100, Jiri Olsa wrote:
> On Tue, Jan 03, 2017 at 07:08:28AM -0800, Andi Kleen wrote:
>
> SNIP
>
> > diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> > index 4bfc98953aba..e0c43698fd62 100644
> > --- a/tools/perf/util/p
Commit-ID: d02fc6bcd53721cf8588633409157c232f2418e0
Gitweb: http://git.kernel.org/tip/d02fc6bcd53721cf8588633409157c232f2418e0
Author: Andi Kleen
AuthorDate: Tue, 3 Jan 2017 07:08:23 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Mon, 16 Jan 2017 14:59:15 -0300
perf pmu: Factor