Naoya Horiguchi writes:
> When we try to soft-offline a thp tail page, put_page() is called on the
> tail page unthinkingly and VM_BUG_ON is triggered in put_compound_page().
> This patch splits thp before going into the main body of soft-offlining.
Looks good.
>
> The interface of soft-offlini
> > @@ -198,8 +200,9 @@ enum perf_event_read_format {
> > PERF_FORMAT_TOTAL_TIME_RUNNING = 1U << 1,
> > PERF_FORMAT_ID = 1U << 2,
> > PERF_FORMAT_GROUP = 1U << 3,
> > + PERF_FORMAT_WEIGHT = 1U << 4,
>
> what
> > +static void str_append(char **s, int *len, const char *a)
> > +{
> > + int olen = *s ? strlen(*s) : 0;
> > + int nlen = olen + strlen(a) + 1;
> > + if (*len < nlen) {
> > + *len = *len * 2;
> > + if (*len < nlen)
> > + *len = nlen;
> > + *s
> Should we test for >3?
>
> * precise_ip:
> *
> * 0 - SAMPLE_IP can have arbitrary skid
> * 1 - SAMPLE_IP must have constant skid
> * 2 - SAMPLE_IP requested to have 0 skid
> * 3 - SAMPLE_IP must have 0 skid
>
> Maybe it's not implemented in hw yet, but in
Kent Overstreet writes:
> This implements a refcount with similar semantics to
> atomic_get()/atomic_dec_and_test(), that starts out as just an atomic_t
> but dynamically switches to per cpu refcounting when the rate of
> gets/puts becomes too high.
This will only work if you put on the same CPU
On Thu, Nov 29, 2012 at 10:57:20AM -0800, Kent Overstreet wrote:
> On Thu, Nov 29, 2012 at 10:45:04AM -0800, Andi Kleen wrote:
> > Kent Overstreet writes:
> >
> > > This implements a refcount with similar semantics to
> > > atomic_get()/atomic_dec_and_test(), th
Dave Chinner writes:
>
> Comments, thoughts and flames all welcome.
Doing the reclaim per CPU sounds like a big change in the VM balance.
Doesn't this invalidate some zone reclaim mode settings?
How did you validate all this?
-Andi
--
a...@linux.intel.com -- Speaking for myself only
Prasad Koya writes:
> Hi
>
> Before going into crashkernel, nmi_shootdown_cpus() calls
> register_die_notifier(), which calls vmalloc_sync_all(). I'm seeing
> lockup in sync_global_pgds() (init_64.c). From 3.2 and up,
> register_die_notifier() is replaced with register_nmi_handler() (patch
> 9c48
> The trick is that we don't watch for the refcount hitting 0 until we're
> shutting down - so this only works if you keep track of your initial
> refcount. As long as we're not shutting down, we know the refcount can't
> hit 0 because we haven't released the initial refcount.
This seems dangerous
Peter Zijlstra writes:
> +
> + down_write(&mm->mmap_sem);
> + for (vma = mm->mmap; vma; vma = vma->vm_next) {
> + if (!vma_migratable(vma))
> + continue;
> + change_protection(vma, vma->vm_start, vma->vm_e
Jim Kukunas writes:
> +
> + /* ymm0 = x0f[16] */
> + asm volatile("vpbroadcastb %0, %%ymm7" : : "m" (x0f));
> +
> + while (bytes) {
> +#ifdef CONFIG_X86_64
> + asm volatile("vmovdqa %0, %%ymm1" : : "m" (q[0]));
> + asm volatile("vmovdqa %0, %%ymm9" : : "m" (q[32
Joseph Parmelee writes:
> Greetings:
>
> The gas test suite in recent binutils snapshots from
> ftp://sourceware.org/pub/binutils/snapshots/ consistently freezes my i386
> custom-built kernels. This may be a kernel configuration problem but if so
> it has manifested only recently. I have been b
Kent Overstreet writes:
> On Thu, Nov 29, 2012 at 02:34:52PM -0500, Benjamin LaHaise wrote:
>> On Thu, Nov 29, 2012 at 11:29:25AM -0800, Kent Overstreet wrote:
>> > There's some kind of symmetry going on here, and if I'd been awake more
>> > in college I could probably say exactly why it works, b
> > The regular atomic_t is limited in ways that you are not.
> > See my original mail.
>
> I don't follow, can you explain?
For most cases the reference count is tied to some object, and objects are
naturally limited by memory size or other physical resources.
But in the asymmetric CPU case with your
> The code is compiled so that the xmm/ymm registers are not available to
> the compiler. Do you have any known examples of asm volatiles being
> reordered *with respect to each other*? My understanding of gcc is
> that volatile operations are ordered with respect to each other (not
> necessaril
This is based on v7 of the full Haswell PMU support, but
ported to the latest perf/core and stripped down to the
extreme "perf for dummies" edition as requested.
I removed some more patches; these will come later.
I moved parts of a later patch (counter constraints for qualifiers)
into an ear
From: Andi Kleen
Add basic PEBS support for Haswell.
The constraints are similar to SandyBridge with a few new events.
v2: Readd missing pebs_aliases
v3: Readd missing hunk. Fix some constraints.
v4: Fix typo in PEBS event table (Stephane Eranian)
Reviewed-by: Stephane Eranian
Signed-off-by
From: Andi Kleen
Recent Intel CPUs have a new alternative MSR range for perfctrs that allows
writing the full counter width. Enable this range if the hardware reports it
using a new capability bit. This lowers the overhead of perf stat slightly
because it has to do fewer interrupts to accumulate the
From: Andi Kleen
Add support for the v2 PEBS format. It has a superset of the v1 PEBS
fields, but has a longer record, so we need to adjust the code paths.
The main advantage is the new "EventingRip" support, which directly
gives the instruction, not the off-by-one instruction. So with pr
From: Andi Kleen
Add basic Haswell PMU support.
Similar to SandyBridge, but has a few new events and two
new counter bits.
There are some new counter flags that need to be prevented
from being set on fixed counters, and allowed to be set
for generic counters.
Also we add support for the
From: Andi Kleen
This avoids some problems with spurious PMIs on Haswell.
Haswell seems to behave more like P4 in this regard. Do
the same thing as the P4 perf handler by unmasking
the NMI only at the end. Shouldn't make any difference
for earlier non-P4 cores.
Signed-off-by: Andi
On Tue, Feb 05, 2013 at 04:15:26PM +0100, Stephane Eranian wrote:
> > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > @@ -2228,5 +2228,11 @@ __init int intel_pmu_init(void)
> > }
> > }
> >
> > + /* Support full width co
v2: Print the feature at boot
Signed-off-by: Andi Kleen
diff --git a/arch/x86/include/uapi/asm/msr-index.h
b/arch/x86/include/uapi/asm/msr-index.h
index 433a59f..af41a77 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -163,6 +163,9 @
This is based on v7 of the full Haswell PMU support, but
ported to the latest perf/core and stripped down to the
bare bones.
Only for extremely basic usage.
Most interesting new features are not in this patchkit
(full version is
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git
From: Andi Kleen
Recent Intel CPUs like Haswell and IvyBridge have a new alternative MSR
range for perfctrs that allows writing the full counter width. Enable this
range if the hardware reports it using a new capability bit.
This lowers the overhead of perf stat slightly because it has to do
From: Andi Kleen
Newer gcc enables the var-tracking pass with -g to keep track of which
registers contain which variables. This is one of the slower passes in gcc.
With reduced debug info (aimed at objdump -S, but not using a full debugger)
we don't need this fine-grained tracking. But i
On Mon, Feb 11, 2013 at 04:53:26PM -0800, Kent Overstreet wrote:
> I finally started hacking on the dio code, and it's far from done but
> it's turning out better than I expected so I thought I'd show off what
> I've got so far.
The critical metric for some of the high-end workloads
is the number
On Tue, Feb 12, 2013 at 09:43:46AM +0100, Ingo Molnar wrote:
> Was this stress-tested on all affected main CPU types, or only
> on Haswell?
I tested it on Haswell and Ivy Bridge. I can also try
Westmere and a Saltwell (Atom), but for the majority of other family 6
systems I'll need to rely on the
On Tue, Feb 12, 2013 at 03:09:26PM +0100, Stephane Eranian wrote:
> This patch series contains improvements to the aggregation support
> in perf stat.
>
> First, the aggregation code is refactored and an aggr_mode enum
> is defined. There is also an important bug fix for the existing
> per-socket ag
takes in printf are common.
Better to duplicate the sprintf.
The rest looks good to me.
Reviewed-by: Andi Kleen
-Andi
>
> > The idea itself is useful.
> >
> Yes, it is.
BTW it would be even more useful if it could print some of the
statistics turbostat does (in particular frequency and C0 residency).
Often you only care about cycles that are not idle, and the frequency
tells you how fast the cycles happen.
I think Cx cou
From: Andi Kleen
Add support for the Haswell extended (fmt2) PEBS format.
It has a superset of the nhm (fmt1) PEBS fields, but has a longer record so
we need to adjust the code paths.
The main advantage is the new "EventingRip" support which directly
gives the instruction, not
From: Andi Kleen
This avoids some problems with spurious PMIs on Haswell.
Haswell seems to behave more like P4 in this regard. Do
the same thing as the P4 perf handler by unmasking
the NMI only at the end. Shouldn't make any difference
for earlier family 6 cores.
Tested on Haswell, IvyB
From: Andi Kleen
Add basic PEBS support for Haswell.
The constraints are similar to SandyBridge with a few new events.
v2: Readd missing pebs_aliases
v3: Readd missing hunk. Fix some constraints.
v4: Fix typo in PEBS event table (Stephane Eranian)
Reviewed-by: Stephane Eranian
Signed-off-by
> > - if (lbr_format == LBR_FORMAT_EIP_FLAGS) {
> > + if (lbr_format == LBR_FORMAT_EIP_FLAGS ||
> > + lbr_format == LBR_FORMAT_EIP_FLAGS2) {
> > mis = !!(from & LBR_FROM_FLAG_MISPRED);
> > pred = !mis;
> >
> I really don't buy this workaround. You are assuming you're always
> measuring the INTC_CHECKPOINTED event by itself.
There's no such assumption.
> So what if you get into the handler because of an PMI
> due to an overflow
> of another counter which is active at the same time as counter2?
> You'
On Tue, Jan 29, 2013 at 01:30:19AM +0100, Stephane Eranian wrote:
> >> The counter is reinstated to its state before the critical section but
> >> the PMI cannot be
> >> cancelled and there is no state left behind to tell what to do with it.
> >
> > The PMI is effectively spurious, but we use it to
> That's a very low sampling rate, yet I think it would be rejected by your
> code.
You mean allowed?
> But if I come in with frequency 0x7fff+1, then that's a very high
> frequency, thus
> small period, I would pass the test. So I think you need to reinforce the test
> for freq=1.
I'm awa
On Thu, Jan 31, 2013 at 06:19:01PM +0100, Stephane Eranian wrote:
> Andi,
>
> Are you going to post a new version based on my feedback or do you stay
> with what you posted on 1/25?
I'm posting a new version today, already added all changes.
-Andi
From: Andi Kleen
Implement the TSX transaction and checkpointed transaction qualifiers for
Haswell. This allows, e.g., profiling the number of cycles in transactions.
The checkpointed qualifier requires forcing the event to
counter 2; implement this with a custom constraint for Haswell.
Also
This is based on v7 of the full Haswell PMU support, but
ported to the latest perf/core and stripped down to the "basic support"
as requested. I consider all of this basic support for Haswell usage,
although it's a bit more than what you need if you never use -e cpu//
or -b options. I decided to
From: Andi Kleen
Add basic PEBS support for Haswell.
The constraints are similar to SandyBridge with a few new events.
v2: Readd missing pebs_aliases
v3: Readd missing hunk. Fix some constraints.
v4: Fix typo in PEBS event table (Stephane Eranian)
Signed-off-by: Andi Kleen
---
arch/x86/kernel
From: Andi Kleen
Haswell has two additional LBR from flags for TSX: intx and abort, implemented
as a new v4 version of the LBR format.
Handle those and adjust the sign extension code to still extend correctly.
The flags are exported in the LBR record similarly to the existing misprediction
From: Andi Kleen
When the LBR format is unknown disable LBR recording. This prevents
crashes when the LBR address is misdecoded and mis-sign extended.
Signed-off-by: Andi Kleen
---
arch/x86/kernel/cpu/perf_event_intel_lbr.c |3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff
From: Andi Kleen
Make perf record -j aware of the new in_tx,no_tx,abort_tx branch qualifiers.
v2: ABORT -> ABORTTX
v3: Add more _
Signed-off-by: Andi Kleen
---
tools/perf/Documentation/perf-record.txt |3 +++
tools/perf/builtin-record.c |3 +++
2 files changed
From: Andi Kleen
Add LBR filtering for branch in transaction, branch not in transaction
or transaction abort. This is exposed as new sample types.
v2: Rename ABORT to ABORTTX
v3: Use table instead of if
Signed-off-by: Andi Kleen
---
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 58
From: Andi Kleen
Add basic Haswell PMU support.
Similar to SandyBridge, but has a few new events. Further
differences are handled in follow-on patches.
There are some new counter flags that need to be prevented
from being set on fixed counters.
Contains fixes from Stephane Eranian
v2: Folded
From: Andi Kleen
With checkpointed counters there can be a situation where the counter
is overflowing, aborts the transaction, is set back to a non-overflowing
checkpoint, and causes an interrupt. The interrupt doesn't see the overflow
because it has been checkpointed. This is then a spuriou
From: Andi Kleen
Extend the perf branch sorting code to support sorting by intx
or abort qualifiers. Also print out those qualifiers.
This also fixes up some of the existing sort key documentation.
We do not support notx here, because it's simply not showing
the intx flag.
v2: Readd fla
>
> As requested before, please keep those in a completely separate
> series so that minimal support can be merged upstream. This is
Hi Ingo,
The goal is not to merge "minimal support" but full support.
All of these features have users. But I subsetted it to make reviewing easier.
I think the
> And there's a patchset [1] from Jiri to support some kind of formula -
> yeah, now I've written the correct spelling. :) - that might fit to this
> purpose if you provide suitable formula file IMHO. So I guess we don't
> need to have another command and can reuse perf stat, no?
Yes with a prope
This is based on v7 of the full Haswell PMU support, but
ported to the latest perf/core and stripped down to the "basic support"
as requested.
I decided to include LBRs in the basic support. These are 4 patches
self contained at the end, so could be also handled as a separate
unit if that is pre
From: Andi Kleen
Add basic PEBS support for Haswell.
The constraints are similar to SandyBridge with a few new events.
v2: Readd missing pebs_aliases
v3: Readd missing hunk. Fix some constraints.
Signed-off-by: Andi Kleen
---
arch/x86/kernel/cpu/perf_event.h |2 ++
arch/x86
From: Andi Kleen
Add LBR filtering for branch in transaction, branch not in transaction
or transaction abort. This is exposed as new sample types.
v2: Rename ABORT to ABORTTX
Signed-off-by: Andi Kleen
---
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 31 +--
include
From: Andi Kleen
For some events it's useful to weight sample with a hardware
provided number. This expresses how expensive the action the
sample represent was. This allows the profiler to scale
the samples to be more informative to the programmer.
There is already the period which is
From: Andi Kleen
Add infrastructure to generate event aliases in /sys/devices/cpu/events/
and use this to set up user-friendly aliases for the common TSX events.
TSX tuning relies heavily on the PMU, so it's important to be user friendly.
This replaces the generic transaction events
From: Andi Kleen
Add a precise qualifier, like cpu/event=0x3c,precise=1/
This is needed so that the kernel can request enabling PEBS
for TSX events. The parser bails out on any sysfs parse errors,
so this is needed in any case to handle any event on the TSX
perf kernel.
v2: Allow 3 as value
From: Andi Kleen
Add an instructions-p event alias that uses the PDIR randomized instruction
retirement event. This is useful to avoid some systematic sampling shadow
problems. Normally PEBS sampling has a systematic shadow. With PDIR
enabled the hardware adds some randomization that
From: Andi Kleen
Add a way for the CPU initialization code to register additional events,
and merge them into the events attribute directory. Used in the next
patch.
Signed-off-by: Andi Kleen
---
arch/x86/kernel/cpu/perf_event.c | 29 +
arch/x86/kernel/cpu
From: Andi Kleen
When an event fails to parse and it's not in a new-style format,
try to parse it again as a cpu event.
This allows using sysfs-exported events directly without //, so I can use
perf record -e tx-aborts ...
instead of
perf record -e cpu/tx-aborts/
v2: Handle multiple e
From: Andi Kleen
perf record has a new option -W that enables weighted sampling.
Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows comparing both the relative cost per event
and the total cost over the measurement period.
Add the necessary
Make events_sysfs_show unstatic again to fix compilation
Signed-off-by: Stephane Eranian
Signed-off-by: Andi Kleen
---
arch/x86/kernel/cpu/perf_event.c | 28 +---
arch/x86/kernel/cpu/perf_event.h | 26 ++
2 files changed, 39 insertions(+), 15 dele
From: Andi Kleen
Add histogram support for the transaction flags. Each flags instance becomes
a separate histogram. Support sorting and displaying the flags in report
and top.
The patch is fairly large, but it's really mostly just plumbing to pass the
flags around.
v2: Increase column
From: Andi Kleen
List the kernel-supplied pmu event aliases in perf list.
It's better when the users can actually see them.
v2: Fix pattern matching
v3: perf_pmu__alias -> perf_pmu_alias
Signed-off-by: Andi Kleen
---
tools/perf/Documentation/perf-list.txt |4 +-
tools/perf/builti
From: Andi Kleen
In the PEBS handler report the transaction flags using the new
generic transaction flags facility. Most of them come from
the "tsx_tuning" field in PEBSv2, but the abort code is derived
from the RAX register reported in the PEBS record.
Signed-off-by: Andi Kleen
---
From: Andi Kleen
When a weighted sample is requested, first try to report the TSX abort cost
on Haswell. If that is not available report the memory latency. This
allows profiling both by abort cost and by memory latencies.
Memory latency profiling requires enabling a different PEBS mode (LL).
When both
From: Andi Kleen
Add the glue in the user tools to record transaction flags with
--transaction (-T was already taken) and dump them.
Followon patches will use them.
v2: Fix manpage
v3: Move transaction to the end
Signed-off-by: Andi Kleen
---
tools/perf/Documentation/perf-record.txt |4
From: Andi Kleen
Add a generic qualifier for transaction events, as a new sample
type that returns a flag word. This is particularly useful
for qualifying aborts: to distinguish aborts which happen
due to asynchronous events (like conflicts caused by another
CPU) versus instructions that lead to
From: Andi Kleen
Haswell supplies the address for every PEBS memory event, so always fill it in
when the user requested it. It will be 0 when not useful (no memory access).
Signed-off-by: Andi Kleen
---
arch/x86/kernel/cpu/perf_event_intel_ds.c |4
1 files changed, 4 insertions(+), 0
From: Andi Kleen
Add support to perf stat to print the basic transactional execution statistics:
total cycles, cycles in transactions, cycles in aborted transactions,
using the intx and intx_checkpoint qualifiers.
Transaction Starts and Elision Starts, to compute the average transaction
length
This is based on v7 of the earlier combined Haswell PMU patchkit.
The basic functionality has moved into a separate patchkit.
These patches implement more advanced functionality. Most
of the functionality is related to TSX.
This applies on top of the basic hsw/pmu4-basics patchkit posted
separatel
From: Andi Kleen
This is not arch perfmon, but older CPUs will just ignore it. This makes
it possible to do at least some TSX measurements from a KVM guest.
Cc: g...@redhat.com
v2: Various fixes to address review feedback
v3: Ignore the bits when no CPUID. No #GP. Force raw events with TSX bits
On Sat, Jan 26, 2013 at 12:54:02PM +0100, Ingo Molnar wrote:
>
> * Andi Kleen wrote:
>
> > From: Andi Kleen
> >
> > Implement the TSX transaction and checkpointed transaction
> > qualifiers for Haswell. This allows e.g. to profile the number
> > of cyc