[tip: perf/core] perf script: Fix --reltime with --time

2019-10-21 Thread tip-bot2 for Andi Kleen
The following commit has been merged into the perf/core branch of tip: Commit-ID: b3509b6ed7a79ec49f6b64e4f3b780f259a2a468 Gitweb: https://git.kernel.org/tip/b3509b6ed7a79ec49f6b64e4f3b780f259a2a468 Author: Andi Kleen AuthorDate: Fri, 11 Oct 2019 11:21:39 -07:00 Committer

[tip: perf/core] perf evlist: Fix fix for freed id arrays

2019-10-21 Thread tip-bot2 for Andi Kleen
The following commit has been merged into the perf/core branch of tip: Commit-ID: 5a40e1994815ab09c59614c6a13d94eef55d1a7f Gitweb: https://git.kernel.org/tip/5a40e1994815ab09c59614c6a13d94eef55d1a7f Author: Andi Kleen AuthorDate: Fri, 11 Oct 2019 11:21:40 -07:00 Committer

[tip: perf/urgent] perf evlist: Fix fix for freed id arrays

2019-10-21 Thread tip-bot2 for Andi Kleen
The following commit has been merged into the perf/urgent branch of tip: Commit-ID: 98a8b2e60c69927b1f405c3b001a1de3f4e53901 Gitweb: https://git.kernel.org/tip/98a8b2e60c69927b1f405c3b001a1de3f4e53901 Author: Andi Kleen AuthorDate: Fri, 11 Oct 2019 11:21:40 -07:00 Committer

Optimize perf stat for large number of events/cpus v2

2019-10-20 Thread Andi Kleen
[The earlier v1 version had a lot of conflicts against some recent libperf changes in tip/perf/core. Resolve that and also fix some minor issues.] This patch kit optimizes perf stat for a large number of events on systems with many CPUs and PMUs. Some profiling shows that the most overhead is

[PATCH v2 7/9] perf stat: Use affinity for opening events

2019-10-20 Thread Andi Kleen
From: Andi Kleen Restructure the event opening in perf stat to cycle through the events by CPU after setting affinity to that CPU. This eliminates IPI overhead in the perf API. We have to loop through the CPU in the outer builtin-stat code instead of leaving that to low level functions

[PATCH v2 3/9] perf pmu: Use file system cache to optimize sysfs access

2019-10-20 Thread Andi Kleen
From: Andi Kleen pmu.c does a lot of redundant /sys accesses while parsing aliases and probing for PMUs. On large systems with a lot of PMUs this can get expensive (>2s): % time seconds usecs/call calls errors sysc

[PATCH v2 4/9] perf affinity: Add infrastructure to save/restore affinity

2019-10-20 Thread Andi Kleen
From: Andi Kleen The kernel perf subsystem has to IPI to the target CPU for many operations. On systems with many CPUs and when managing many events the overhead can be dominated by lots of IPIs. An alternative is to set up CPU affinity in the perf tool, then set up all the events for that CPU
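
The technique described in this message (pin the tool to one CPU, handle that CPU's events, then move on) can be sketched in user space roughly as follows. This is a minimal illustration and not the code from the patch: struct saved_affinity, affinity_save(), affinity_pin(), affinity_restore() and for_each_allowed_cpu() are hypothetical names chosen for the example. Only sched_getaffinity(), sched_setaffinity() and the CPU_* macros are real glibc interfaces.

```c
#define _GNU_SOURCE		/* needed for cpu_set_t and the CPU_* macros */
#include <assert.h>
#include <sched.h>

/* Hypothetical type: remembers the mask to put back when we are done. */
struct saved_affinity {
	cpu_set_t orig;
};

static int affinity_save(struct saved_affinity *a)
{
	return sched_getaffinity(0, sizeof(a->orig), &a->orig);
}

/* Restrict the calling thread to a single CPU. */
static int affinity_pin(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

static int affinity_restore(const struct saved_affinity *a)
{
	return sched_setaffinity(0, sizeof(a->orig), &a->orig);
}

/*
 * Visit every CPU the thread may run on, pinned to that CPU while the
 * callback runs, so per-CPU work happens locally instead of via an IPI.
 * Returns the number of CPUs visited, or -1 on error.
 */
static int for_each_allowed_cpu(void (*fn)(int cpu))
{
	struct saved_affinity a;
	int visited = 0;

	if (affinity_save(&a) < 0)
		return -1;

	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &a.orig))
			continue;
		if (affinity_pin(cpu) < 0)
			continue;	/* CPU went away; skip it */
		if (fn)
			fn(cpu);	/* e.g. open/read/close this CPU's events */
		visited++;
	}

	if (affinity_restore(&a) < 0)
		return -1;
	return visited;
}
```

The callback is where the per-CPU perf operations would go; the sketch only shows the save, pin and restore sequence around it.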

[PATCH v2 9/9] perf stat: Use affinity for enabling/disabling events

2019-10-20 Thread Andi Kleen
From: Andi Kleen Restructure event enabling/disabling to use affinity, which minimizes the number of IPIs needed. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall -- --- --- - - 54.65

[PATCH v2 1/9] perf evsel: Always preserve errno while cleaning up perf_event_open failures

2019-10-20 Thread Andi Kleen
From: Andi Kleen In some cases when perf_event_open fails, it may do some closes to clean up. In special cases these closes can fail too, which overwrites the errno of the perf_event_open, which is then incorrectly reported. Save/restore errno around closes. Signed-off-by: Andi Kleen
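
The save/restore pattern this message describes can be sketched as below. It is an illustration under assumptions, not the patch itself: close_fds_preserve_errno() is a hypothetical helper name, and the real change lives in the perf evsel cleanup paths.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: close descriptors during error cleanup without
 * losing the errno value left behind by the call that failed earlier.
 */
static void close_fds_preserve_errno(const int *fds, int nr)
{
	int saved_errno = errno;	/* errno of the original failure */

	assert(nr >= 0);
	for (int i = 0; i < nr; i++)
		close(fds[i]);		/* may fail and overwrite errno */

	errno = saved_errno;		/* the caller reports the real cause */
}
```

Without the restore, a failing close() would replace the error from the open with EBADF or similar, which is the misreporting the message refers to.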

[PATCH v2 2/9] perf evsel: Avoid close(-1)

2019-10-20 Thread Andi Kleen
From: Andi Kleen In some weak fallback cases close can be called a lot with -1. Check for this case and avoid calling close then. This is mainly to shut up valgrind which complains about this case. Signed-off-by: Andi Kleen --- tools/perf/lib/evsel.c | 3 ++- tools/perf/util/evsel.c | 3
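
A minimal sketch of the check described here, assuming a descriptor slot holds -1 when it was never opened. close_if_open() is a hypothetical helper name and not the function touched by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: only issue close() for slots that hold a real
 * descriptor, so tools such as valgrind do not see close(-1).
 */
static int close_if_open(int *fd)
{
	int ret = 0;

	assert(fd != NULL);
	if (*fd >= 0) {
		ret = close(*fd);
		*fd = -1;	/* mark the slot as closed */
	}
	return ret;
}
```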

[PATCH v2 8/9] perf stat: Use affinity for reading

2019-10-20 Thread Andi Kleen
From: Andi Kleen Restructure event reading to use affinity to minimize the number of IPIs needed. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall -- --- --- - - 3.16 0.106079

[PATCH v2 5/9] perf evsel: Add iterator to iterate over events ordered by CPU

2019-10-20 Thread Andi Kleen
From: Andi Kleen Add some common code that is needed to iterate over all events in CPU order. Used in follow-on patches. Signed-off-by: Andi Kleen --- tools/perf/util/evlist.c | 33 + tools/perf/util/evlist.h | 4 tools/perf/util/evsel.h | 1 + 3 files

[PATCH v2 6/9] perf stat: Use affinity for closing file descriptors

2019-10-20 Thread Andi Kleen
From: Andi Kleen Closing a perf fd can also trigger an IPI to the target CPU. Use the same affinity technique as for reading/enabling events when closing, to optimize the CPU transitions. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall

[PATCH v1 5/9] perf evsel: Add iterator to iterate over events ordered by CPU

2019-10-20 Thread Andi Kleen
From: Andi Kleen Add some common code that is needed to iterate over all events in CPU order. Used in follow-on patches. Signed-off-by: Andi Kleen --- tools/perf/util/evlist.c | 33 + tools/perf/util/evlist.h | 4 tools/perf/util/evsel.h | 1 + 3 files

Optimize perf stat for large number of events/cpus v1

2019-10-20 Thread Andi Kleen
This patch kit optimizes perf stat for a large number of events on systems with many CPUs and PMUs. Some profiling shows that the most overhead is doing IPIs to all the target CPUs. We can optimize this by using sched_setaffinity to set the affinity to a target CPU once and then doing the perf

[PATCH v1 3/9] perf pmu: Use file system cache to optimize sysfs access

2019-10-20 Thread Andi Kleen
From: Andi Kleen pmu.c does a lot of redundant /sys accesses while parsing aliases and probing for PMUs. On large systems with a lot of PMUs this can get expensive (>2s): % time seconds usecs/call calls errors sysc

[PATCH v1 8/9] perf stat: Use affinity for reading

2019-10-20 Thread Andi Kleen
From: Andi Kleen Restructure event reading to use affinity to minimize the number of IPIs needed. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall -- --- --- - - 3.16 0.106079

[PATCH v1 9/9] perf stat: Use affinity for enabling/disabling events

2019-10-20 Thread Andi Kleen
From: Andi Kleen Restructure event enabling/disabling to use affinity, which minimizes the number of IPIs needed. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall -- --- --- - - 54.65

[PATCH v1 1/9] perf evsel: Always preserve errno while cleaning up perf_event_open failures

2019-10-20 Thread Andi Kleen
From: Andi Kleen In some cases when perf_event_open fails, it may do some closes to clean up. In special cases these closes can fail too, which overwrites the errno of the perf_event_open, which is then incorrectly reported. Save/restore errno around closes. Signed-off-by: Andi Kleen

[PATCH v1 7/9] perf stat: Use affinity for opening events

2019-10-20 Thread Andi Kleen
From: Andi Kleen Restructure the event opening in perf stat to cycle through the events by CPU after setting affinity to that CPU. This eliminates IPI overhead in the perf API. We have to loop through the CPU in the outer builtin-stat code instead of leaving that to low level functions

[PATCH v1 4/9] perf affinity: Add infrastructure to save/restore affinity

2019-10-20 Thread Andi Kleen
From: Andi Kleen The kernel perf subsystem has to IPI to the target CPU for many operations. On systems with many CPUs and when managing many events the overhead can be dominated by lots of IPIs. An alternative is to set up CPU affinity in the perf tool, then set up all the events for that CPU

[PATCH v1 2/9] perf evsel: Avoid close(-1)

2019-10-20 Thread Andi Kleen
From: Andi Kleen In some weak fallback cases close can be called a lot with -1. Check for this case and avoid calling close then. This is mainly to shut up valgrind which complains about this case. Signed-off-by: Andi Kleen --- tools/perf/lib/evsel.c | 3 ++- tools/perf/util/evsel.c | 3

[PATCH v1 6/9] perf stat: Use affinity for closing file descriptors

2019-10-20 Thread Andi Kleen
From: Andi Kleen Closing a perf fd can also trigger an IPI to the target CPU. Use the same affinity technique as for reading/enabling events when closing, to optimize the CPU transitions. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall

Re: [PATCH 1/3] auxdisplay: Make charlcd.[ch] more general

2019-10-18 Thread Andi Kleen
On Fri, Oct 18, 2019 at 08:33:26AM -0700, Joe Perches wrote: > On Fri, 2019-10-18 at 17:08 +0200, Miguel Ojeda wrote: > > On Thu, Oct 17, 2019 at 10:07 AM Lars Poeschel wrote: > [] > > > Oh by the way: Do you know what I can do to make checkpatch happy with > > > its describing of the config

Re: [PATCH v2] perf list: Separate the deprecated events

2019-10-17 Thread Andi Kleen
> v2: > --- > In v1, the deprecated events are hidden by default but they can be > displayed when option "--deprecated" is enabled. In v2, we don't use > the new option "--deprecated". Instead, we just display the deprecated > events under the title "--- Following are deprecated events ---".

Re: [PATCH] perf list: Hide deprecated events by default

2019-10-16 Thread Andi Kleen
On Tue, Oct 15, 2019 at 11:14:01AM +0200, Jiri Olsa wrote: > On Tue, Oct 15, 2019 at 10:53:57AM +0800, Jin Yao wrote: > > There are some deprecated events listed by perf list. But we can't remove > > them from perf list with ease because some old scripts may use them. > > > > Deprecated events

[tip: perf/core] perf script: Allow --time with --reltime

2019-10-14 Thread tip-bot2 for Andi Kleen
The following commit has been merged into the perf/core branch of tip: Commit-ID: 3714437d3fcc7956cabcb0077f2a506b61160a56 Gitweb: https://git.kernel.org/tip/3714437d3fcc7956cabcb0077f2a506b61160a56 Author: Andi Kleen AuthorDate: Wed, 02 Oct 2019 09:46:42 -07:00 Committer

Re: [PATCH 3/3] perf tools: Make 'struct map_shared' truly shared

2019-10-14 Thread Andi Kleen
> > We may need a COW operation for this (hopefully rare) case. > > so the jitted mmaps are inserted into the data file > and processed during report where they can overload > existing maps - thats detected before addition in: > > thread__insert_map > map_groups__fixup_overlappings >

Re: [PATCH] perf data: Fix babeltrace detection

2019-10-14 Thread Andi Kleen
> I'm not being able to reproduce here, without your patch I get things > working: Okay, looks like I had some stale libbabel* stuff in /usr/local/* Probably that caused it. I still think the patch would be a good idea to handle such cases, but it's not needed for the common case. -Andi

Re: [PATCH 3/3] perf tools: Make 'struct map_shared' truly shared

2019-10-13 Thread Andi Kleen
On Sun, Oct 13, 2019 at 05:14:27PM +0200, Jiri Olsa wrote: > Andi reported that maps cloning is eating lot of memory and > it's probably unnecessary, because they keep the same data. > > Changing 'struct map_shared' to be a pointer inside 'struct map', > so it can be shared on fork. Changing the

Re: [PATCH 1/3] perf tools: Allow to build with -ltcmalloc

2019-10-13 Thread Andi Kleen
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf > index a099a8a89447..8f1ba986d3bf 100644 > --- a/tools/perf/Makefile.perf > +++ b/tools/perf/Makefile.perf > @@ -114,6 +114,8 @@ include ../scripts/utilities.mak > # Define NO_LIBZSTD if you do not want support of Zstandard based

[PATCH 1/2] perf script: Fix --reltime with --time

2019-10-11 Thread Andi Kleen
From: Andi Kleen My earlier patch to just enable --reltime with --time was a little too optimistic. The --time parsing would accept absolute time, which is very confusing to the user. Support relative time in --time parsing too. This only works with recent perf record that records the first

[PATCH 2/2] perf evlist: Fix fix for freed id arrays

2019-10-11 Thread Andi Kleen
From: Andi Kleen In the earlier fix for the memory overrun of id arrays I managed to typo the wrong event in the fix. Of course we need to close the current event in the loop, not the original failing event. The same test case as in the original patch still passes. Fixes: 7834fa948beb ("

Re: [PATCH] perf data: Fix babeltrace detection

2019-10-11 Thread Andi Kleen
On Fri, Oct 11, 2019 at 04:05:48PM +0200, Jiri Olsa wrote: > On Tue, Oct 08, 2019 at 07:21:44AM -0700, Andi Kleen wrote: > > On Tue, Oct 08, 2019 at 01:52:40PM +0200, Jiri Olsa wrote: > > > On Mon, Oct 07, 2019 at 10:41:20AM -0700, Andi Kleen wrote: >

Re: [PATCH] perf data: Fix babeltrace detection

2019-10-08 Thread Andi Kleen
On Tue, Oct 08, 2019 at 01:52:40PM +0200, Jiri Olsa wrote: > On Mon, Oct 07, 2019 at 10:41:20AM -0700, Andi Kleen wrote: > > From: Andi Kleen > > > > The symbol the feature file checks for is now actually in -lbabeltrace, > > not -lbabeltrace-ctf, at least as of libbab

[PATCH] perf data: Fix babeltrace detection

2019-10-07 Thread Andi Kleen
From: Andi Kleen The symbol the feature file checks for is now actually in -lbabeltrace, not -lbabeltrace-ctf, at least as of libbabeltrace-1.5.6-2.fc30.x86_64 Always add both libraries to fix the feature detection. Signed-off-by: Andi Kleen --- tools/perf/Makefile.config | 4 ++-- 1 file

[tip: perf/urgent] perf script brstackinsn: Fix recovery from LBR/binary mismatch

2019-10-07 Thread tip-bot2 for Andi Kleen
The following commit has been merged into the perf/urgent branch of tip: Commit-ID: e98df280bc2a499fd41d7f9e2d6733884de69902 Gitweb: https://git.kernel.org/tip/e98df280bc2a499fd41d7f9e2d6733884de69902 Author: Andi Kleen AuthorDate: Fri, 27 Sep 2019 16:35:44 -07:00 Committer

[tip: perf/urgent] perf jevents: Fix period for Intel fixed counters

2019-10-07 Thread tip-bot2 for Andi Kleen
The following commit has been merged into the perf/urgent branch of tip: Commit-ID: 6bdfd9f118bd59cf0f85d3bf4b72b586adea17c1 Gitweb: https://git.kernel.org/tip/6bdfd9f118bd59cf0f85d3bf4b72b586adea17c1 Author: Andi Kleen AuthorDate: Fri, 27 Sep 2019 16:35:45 -07:00 Committer

[PATCH] perf script: Allow --time with --reltime

2019-10-02 Thread Andi Kleen
From: Andi Kleen The original --reltime patch forbade --time with --reltime. But it turns out --time doesn't really care about --reltime, because the relative time is only used at final output, while the time filtering always works earlier on absolute time. So just remove the check and allow

Re: [RFC][PATCH] sysctl: Remove the sysctl system call

2019-10-02 Thread Andi Kleen
There were some really old glibc versions that used the system call, but I believe they have a reasonable fall back, and may not be used much anymore. Acked-by: Andi Kleen -Andi

Re: [PATCH 3/4] perf inject --jit: Remove //anon mmap events

2019-09-30 Thread Andi Kleen
On Mon, Sep 30, 2019 at 08:49:00PM +, Steve MacLean wrote: > SNIP > > > I can't apply this one: > > > patching file builtin-inject.c > > Hunk #1 FAILED at 263. > > 1 out of 1 hunk FAILED -- saving rejects to file builtin-inject.c.rej > > I assume this is because I based my patches on the

Re: [PATCH v1 0/2] perf stat: Support --all-kernel and --all-user

2019-09-30 Thread Andi Kleen
> > I think it's useful. Makes it easy to do kernel/user break downs. > > perf record should support the same. > > Don't we have this already with: > > [root@quaco ~]# perf stat -e cycles:u,instructions:u,cycles:k,instructions:k > -a -- sleep 1 This only works for simple cases. Try it for

Re: [PATCH v1 0/2] perf stat: Support --all-kernel and --all-user

2019-09-30 Thread Andi Kleen
On Sun, Sep 29, 2019 at 05:29:13PM +0200, Jiri Olsa wrote: > On Wed, Sep 25, 2019 at 10:02:16AM +0800, Jin Yao wrote: > > This patch series supports the new options "--all-kernel" and "--all-user" > > in perf-stat. > > > > For example, > > > > root@kbl:~# perf stat -e cycles,instructions

Re: [PATCH V4 07/14] perf/x86/intel: Support hardware TopDown metrics

2019-09-30 Thread Andi Kleen
> Andi, what do you think? Will it be a problem for RDPMC users? Should be fine. RDPMC users only need slots anyways, as they managed RDPMC on their own. We can document it in the documentation. And they will see an error if they only enable a metrics event. -Andi

[PATCH 3/3] perf annotate: Improve handling of corrupted ~/.debug

2019-09-27 Thread Andi Kleen
From: Andi Kleen Sometimes ~/.debug can get corrupted and contain files that still have symbol tables, but which objdump cannot handle. Add a fallback to read the "original" file in such a case. This might be wrong too if it's different, but in many cases when profiling on the

[PATCH 2/3] perf jevents: Fix period for Intel fixed counters

2019-09-27 Thread Andi Kleen
From: Andi Kleen The Intel fixed counters use a special table to override the JSON information. During this override the period information from the JSON file got dropped, which results in inst_retired.any and similar running with frequency mode instead of a period. Just specify the expected

[PATCH 1/3] perf script brstackinsn: Fix recovery from LBR/binary mismatch

2019-09-27 Thread Andi Kleen
From: Andi Kleen When the LBR data and the instructions in a binary do not match, the loop printing instructions could get confused and print a long stream of bogus instructions. The problem was that if the instruction decoder cannot decode an instruction, ilen wasn't initialized, so the loop

Re: [PATCH 1/3] perf, evlist: Fix access of freed id arrays

2019-09-24 Thread Andi Kleen
> id/sample_id arrays are not created when evsel is open but > we free it at close > > for now this fix seems correct to me.. we are moving id/sample_id > arrays under libperf, I'll make a note to check on close and reopen > of evsel and add some tests for that > > Acked-by: Jiri Olsa It looks

Re: [PATCH 3/3] perf, stat: Fix free memory access / memory leaks in metrics

2019-09-24 Thread Andi Kleen
> > expr__ctx_init(); > > + /* Must be first id entry */ > > + expr__add_id(, name, avg); > > hum, shouldn't u instead use strdup(name) instead of name? The cleanup loop later skips freeing the first entry. -Andi

[PATCH 3/3] perf, stat: Fix free memory access / memory leaks in metrics

2019-09-23 Thread Andi Kleen
From: Andi Kleen Make sure to not free the name passed in by the caller, but free all the allocated ids when parsing expressions. The loop at the end knows that the first entry shouldn't be freed, so make sure the caller name is the first entry. Fixes % perf stat -M IpB,IpCall,IpTB,IPC
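
The ownership rule described here (the first entry belongs to the caller, every later entry is an allocated copy) can be sketched as follows. This is only an illustration: struct id_table and the id_table_*() helpers are hypothetical names, not the perf expr API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IDS 8

/* Hypothetical table: ids[0] is borrowed, ids[1..nr-1] are owned copies. */
struct id_table {
	char *ids[MAX_IDS];
	int nr;
};

/* The caller's name must be the first entry so cleanup can skip it. */
static void id_table_init(struct id_table *t, char *caller_name)
{
	t->nr = 0;
	t->ids[t->nr++] = caller_name;
}

/* Add an owned copy of an id found while parsing. */
static int id_table_add_copy(struct id_table *t, const char *name)
{
	size_t len = strlen(name) + 1;
	char *copy;

	if (t->nr >= MAX_IDS)
		return -1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, name, len);
	t->ids[t->nr++] = copy;
	return 0;
}

/* Free everything except entry 0; returns the number of freed entries. */
static int id_table_release(struct id_table *t)
{
	int freed = 0;

	assert(t->nr >= 1);
	for (int i = 1; i < t->nr; i++) {
		free(t->ids[i]);
		freed++;
	}
	t->nr = 0;
	return freed;
}
```

Starting the cleanup loop at index 1 is what makes it safe to pass the caller's string in without duplicating it.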

[PATCH 2/3] perf, expr: Remove assert usage

2019-09-23 Thread Andi Kleen
From: Andi Kleen My "compile perf statically" setup doesn't like this assert for unknown reasons. Replace it with a standard BUG_ON Signed-off-by: Andi Kleen --- tools/perf/util/expr.y | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/expr.y b/

[PATCH 1/3] perf, evlist: Fix access of freed id arrays

2019-09-23 Thread Andi Kleen
From: Andi Kleen I'm not fully sure if this is the correct fix, but without this I get crashes on more complex perf stat metric usages. The problem is that part of the state gets freed when a weak group fails, but then is later still used. Just don't free the ids, we're going to reuse them

Re: [PATCH 3/5] perf vendor events: minor fixes to the README

2019-09-19 Thread Andi Kleen
For all the patches except the last Reviewed-by: Andi Kleen

Re: [PATCH 5/5] perf list: specify metrics are to be used with -M

2019-09-19 Thread Andi Kleen
> This misleads the uninitiated user to try: > > $ perf stat -e C2_Pkg_Residency Actually I guess we could just fix -e to support metrics too. Would probably not be that difficult. -Andi

Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics

2019-08-31 Thread Andi Kleen
On Sat, Aug 31, 2019 at 02:13:05AM -0700, Stephane Eranian wrote: > Andi, > > On Fri, Aug 30, 2019 at 5:31 PM Andi Kleen wrote: > > > > > the same manner. It would greatly simplify the kernel implementation. > > > > I tried that originally. It was actually

Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics

2019-08-30 Thread Andi Kleen
> the same manner. It would greatly simplify the kernel implementation. I tried that originally. It was actually more complicated. You can't really do deltas on raw metrics, and a lot of the perf infrastructure is built around deltas. To do the regular reset and not lose precision over time

Re: [RFC v1 3/9] KVM: x86: Implement MSR_IA32_PEBS_ENABLE read/write emulation

2019-08-29 Thread Andi Kleen
> + case MSR_IA32_PEBS_ENABLE: > + if (pmu->pebs_enable == data) > + return 0; > + if (!(data & pmu->pebs_enable_mask) && > + (data & MSR_IA32_PEBS_OUTPUT_MASK) == > +

Re: [RFC v1 1/9] KVM: x86: Add base address parameter for get_fixed_pmc function

2019-08-29 Thread Andi Kleen
> /* returns fixed PMC with the specified MSR */ > -static inline struct kvm_pmc *get_fixed_pmc(struct kvm_pmu *pmu, u32 msr) > +static inline struct kvm_pmc *get_fixed_pmc(struct kvm_pmu *pmu, u32 msr, > + int base) Better define a

Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics

2019-08-28 Thread Andi Kleen
On Wed, Aug 28, 2019 at 06:28:57PM +0200, Peter Zijlstra wrote: > On Wed, Aug 28, 2019 at 09:17:54AM -0700, Andi Kleen wrote: > > > This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll > > > never trigger the overflow there; this then seems

Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics

2019-08-28 Thread Andi Kleen
On Wed, Aug 28, 2019 at 05:02:38PM +0200, Peter Zijlstra wrote: > > > > To avoid reading the METRICS register multiple times, the metrics and > > slots value can only be updated by the first slots/metrics event in a > > group. All active slots and metrics events will be updated one time. > >

Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics

2019-08-28 Thread Andi Kleen
> This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll > never trigger the overflow there; this then seems to suggest the actual The 48bit counter might overflow in a few hours. -Andi
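
As a rough check of the "few hours" figure, the wrap time of a free-running counter is 2^bits divided by the event rate. The sketch below assumes, purely for illustration, a counter that advances by 4 slots per cycle on a 3 GHz core. Neither number comes from the thread, and seconds_until_wrap() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a free-running counter of the given width wraps around. */
static uint64_t seconds_until_wrap(unsigned int bits, uint64_t events_per_sec)
{
	assert(bits < 64 && events_per_sec > 0);
	return (UINT64_C(1) << bits) / events_per_sec;
}
```

With those assumed numbers, seconds_until_wrap(48, 12000000000) gives 23456 seconds, or about 6.5 hours.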

Re: [tip: perf/core] perf script: Fix memory leaks in list_scripts()

2019-08-27 Thread Andi Kleen
> > Signed-off-by: Gustavo A. R. Silva > > This should be tagged for stable: > > Cc: sta...@vger.kernel.org It's a theoretical problem (which are explicitly ruled out by stable rules) because if you ever see user space malloc() returning NULL the system is likely already randomly killing your

[tip: perf/core] perf report: Use timestamp__scnprintf_nsec() for time sort key

2019-08-27 Thread tip-bot2 for Andi Kleen
The following commit has been merged into the perf/core branch of tip: Commit-ID: 092804ae092fc6097348f5c09b62cde040717aa1 Gitweb: https://git.kernel.org/tip/092804ae092fc6097348f5c09b62cde040717aa1 Author: Andi Kleen AuthorDate: Fri, 23 Aug 2019 14:03:37 -07:00 Committer

[tip: perf/core] perf report: Fix --ns time sort key output

2019-08-27 Thread tip-bot2 for Andi Kleen
The following commit has been merged into the perf/core branch of tip: Commit-ID: 3dab6ac080dcd7f71cb9ceb84ad7dafecd6f7c07 Gitweb: https://git.kernel.org/tip/3dab6ac080dcd7f71cb9ceb84ad7dafecd6f7c07 Author: Andi Kleen AuthorDate: Fri, 23 Aug 2019 14:03:38 -07:00 Committer

Re: BoF on LPC 2019 : Linux Perf advancements for compute intensive and server systems

2019-08-26 Thread Andi Kleen
> > > > > All those are already merged, after long reviewing phases and lots of > > testing, right? > > Right. These changes now constitute parts of the Linux kernel source tree. Might be better to focus on future areas that haven't been merged yet. -Andi

[PATCH 1/2] perf report: Use timestamp__scnprintf_nsec for time sort key

2019-08-23 Thread Andi Kleen
From: Andi Kleen Use timestamp__scnprintf_nsec to print nanoseconds for the time sort key, instead of open coding. Signed-off-by: Andi Kleen --- tools/perf/util/sort.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/tools/perf/util/sort.c b/tools/perf/util

[PATCH 2/2] perf report: Fix --ns time sort key output

2019-08-23 Thread Andi Kleen
From: Andi Kleen If the user specified --ns, the column to print the sort time stamp wasn't wide enough to actually print the full nanoseconds. Widen the time key column width when --ns is specified. Before: % perf record -a sleep 1 % perf report --sort time,overhead,symbol --stdio --ns

Re: [PATCH V2] perf/x86: Consider pinned events for group validation

2019-08-22 Thread Andi Kleen
On Thu, Aug 22, 2019 at 08:29:46PM +0200, Thomas Gleixner wrote: > On Thu, 22 Aug 2019, Andi Kleen wrote: > > > > + /* > > > + * Disable interrupts to prevent the events in this CPU's cpuc > > > + * going away and getting freed. > > > + */ > >

Re: [PATCH V2] perf/x86: Consider pinned events for group validation

2019-08-22 Thread Andi Kleen
> + /* > + * Disable interrupts to prevent the events in this CPU's cpuc > + * going away and getting freed. > + */ > + local_irq_save(flags); I believe it's also needed to disable preemption. Probably should add a comment, or better an explicit preempt_disable() too.

Re: [PATCH] perf pmu-events: Fix the missing "cpu_clk_unhalted.core"

2019-07-29 Thread Andi Kleen
> diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c > index 1a91a197cafb..d413761621b0 100644 > --- a/tools/perf/pmu-events/jevents.c > +++ b/tools/perf/pmu-events/jevents.c > @@ -453,6 +453,7 @@ static struct fixed { > { "inst_retired.any_p", "event=0xc0" }, >

[tip:perf/urgent] perf script: Fix off by one in brstackinsn IPC computation

2019-07-23 Thread tip-bot for Andi Kleen
Commit-ID: dde4e732a5b02fa5599c2c0e6c48a0c11789afc4 Gitweb: https://git.kernel.org/tip/dde4e732a5b02fa5599c2c0e6c48a0c11789afc4 Author: Andi Kleen AuthorDate: Thu, 11 Jul 2019 11:19:21 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 23 Jul 2019 08:59:37 -0300 perf script

[tip:perf/urgent] perf script: Improve man page description of metrics

2019-07-23 Thread tip-bot for Andi Kleen
Commit-ID: 7db7218a7ea577f04c2df92453d47ab5ebfc8863 Gitweb: https://git.kernel.org/tip/7db7218a7ea577f04c2df92453d47ab5ebfc8863 Author: Andi Kleen AuthorDate: Thu, 11 Jul 2019 11:19:22 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 23 Jul 2019 08:58:11 -0300 perf script

[tip:perf/urgent] perf script: Fix --max-blocks man page description

2019-07-23 Thread tip-bot for Andi Kleen
Commit-ID: 5f8eec3225ff7b86763b060164e9ce47b1a71406 Gitweb: https://git.kernel.org/tip/5f8eec3225ff7b86763b060164e9ce47b1a71406 Author: Andi Kleen AuthorDate: Thu, 11 Jul 2019 11:19:20 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 23 Jul 2019 08:57:54 -0300 perf script

Re: [Patch] perf stat: always separate stalled cycles per insn

2019-07-17 Thread Andi Kleen
On Tue, Jul 16, 2019 at 05:43:24PM -0300, Arnaldo Carvalho de Melo wrote: > Em Tue, Jul 16, 2019 at 12:24:41PM -0700, Cong Wang escreveu: > > Hi, Arnaldo > > > > On Tue, May 28, 2019 at 12:11 PM Arnaldo Carvalho de Melo > > wrote: > > > > > > Em Tue, May 28, 2019 at 11:21:38AM -0700, Cong Wang

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-15 Thread Andi Kleen
On Mon, Jul 15, 2019 at 04:10:36PM -0700, Andy Lutomirski wrote: > On Mon, Jul 15, 2019 at 3:53 PM Andi Kleen wrote: > > > > > I haven't tested on a real kernel with i915. Does i915 really hit > > > this code path? Does it happen more than once or twice at boot?

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-15 Thread Andi Kleen
> I haven't tested on a real kernel with i915. Does i915 really hit > this code path? Does it happen more than once or twice at boot? Yes some workloads allocate/free a lot of write combined memory for graphics objects. -Andi

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-15 Thread Andi Kleen
> Right, we don't know where the PAT invocation comes from and whether they > are safe to omit flushing the cache. The module load code would be one > obvious candidate. Module load just changes the writable/executable status, right? That shouldn't need to flush in any case because it doesn't

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-15 Thread Andi Kleen
> > That does not answer the question whether it's worthwhile to do that. It's likely worthwhile for (Intel integrated) graphics. There was also a recent issue with 3dxp/dax, which uses ioremap in some cases. -Andi

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-15 Thread Andi Kleen
Uros Bizjak writes: > Recent patch [1] disabled a self-snoop feature on a list of processor > models with a known errata, so we are confident that the feature > should work on remaining models also for other purposes than to speed > up MTRR programming. MTRR is very different than TLBs. From

Re: [PATCH 0/2] Remove 32-bit Xen PV guest support

2019-07-15 Thread Andi Kleen
Juergen Gross writes: > The long term plan has been to replace Xen PV guests by PVH. The first > victim of that plan are now 32-bit PV guests, as those are used only > rather seldom these days. Xen on x86 requires 64-bit support and with > Grub2 now supporting PVH officially since version 2.04

Re: [PATCH] scatterlist: Allocate a contiguous array instead of chaining

2019-07-12 Thread Andi Kleen
Sultan Alsawaf writes: > > Abusing repeated kmallocs to produce a large allocation puts strain on > the slab allocator, when kvmalloc can be used instead. The single > kvmalloc allocation for all sg lists reduces the burden on the slab and > page allocators, since for large sg list allocations,

Re: [RFC v2 02/26] mm/asi: Abort isolation on interrupt, exception and context switch

2019-07-11 Thread Andi Kleen
Alexandre Chartre writes: > jmp paranoid_exit > @@ -1182,6 +1196,16 @@ ENTRY(paranoid_entry) > xorl%ebx, %ebx > > 1: > +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION > + /* > + * If address space isolation is active then abort it and return > + * the original kernel

[PATCH 2/3] perf script: Fix off by one in brstackinsn IPC computation

2019-07-11 Thread Andi Kleen
From: Andi Kleen When we hit the end of a program block, we need to count the last instruction too for the IPC computation. This caused large errors for small blocks. % perf script -b ls / > /dev/null Before: % perf script -F +brstackinsn --xed ... 7f94c9ac7

[PATCH 3/3] perf script: Improve man page description of metrics

2019-07-11 Thread Andi Kleen
From: Andi Kleen Clarify that a metric is based on events, not referring to itself. Also some improvements with the sentences. Signed-off-by: Andi Kleen --- tools/perf/Documentation/perf-script.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/perf

[PATCH 1/3] perf script: Fix --max-blocks man page description

2019-07-11 Thread Andi Kleen
From: Andi Kleen The --max-blocks description was using the old name brstackasm. Use brstackinsn instead. Signed-off-by: Andi Kleen --- tools/perf/Documentation/perf-script.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/Documentation/perf-script.txt b

Re: [PATCH] perf/x86/intel: Fix spurious NMI on fixed counter

2019-07-10 Thread Andi Kleen
> > > oops, I overlooked this, looks good > > > > > > Acked-by: Jiri Olsa > > Have it now, thanks! Can Kan's patch please be merged asap and also put into stable for 5.2? The regression causes crashes on Icelake when fixed counters are used in PEBS groups, and presumably also on Goldmont Plus.

Re: [PATCH v7 10/12] KVM/x86/lbr: lazy save the guest lbr stack

2019-07-08 Thread Andi Kleen
> I don't understand a word of that. > > Who cares if the LBR MSRs are touched; the guest expects up-to-date > values when it does reads them. This is for only when the LBRs are disabled in the guest. It doesn't make sense to constantly save/restore disabled LBRs, which would be a large

[tip:perf/core] perf tools metric: Don't include duration_time in group

2019-07-03 Thread tip-bot for Andi Kleen
Commit-ID: 488c3bf7ece89e47887607863207021283e37828 Gitweb: https://git.kernel.org/tip/488c3bf7ece89e47887607863207021283e37828 Author: Andi Kleen AuthorDate: Fri, 28 Jun 2019 15:07:37 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 2 Jul 2019 16:08:16 -0300 perf tools

[tip:perf/core] perf list: Avoid extra : for --raw metrics

2019-07-03 Thread tip-bot for Andi Kleen
Commit-ID: 9c344d15f5783260f57c711f3fce72dd744bebe2 Gitweb: https://git.kernel.org/tip/9c344d15f5783260f57c711f3fce72dd744bebe2 Author: Andi Kleen AuthorDate: Fri, 28 Jun 2019 15:07:36 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 2 Jul 2019 16:08:16 -0300 perf list

[tip:perf/core] perf vendor events intel: Metric fixes for SKX/CLX

2019-07-03 Thread tip-bot for Andi Kleen
Commit-ID: 4df79ba3eb1b82e2939fb984b36a0e71bbed611b Gitweb: https://git.kernel.org/tip/4df79ba3eb1b82e2939fb984b36a0e71bbed611b Author: Andi Kleen AuthorDate: Fri, 28 Jun 2019 15:07:35 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 2 Jul 2019 16:08:16 -0300 perf vendor

[tip:perf/core] perf tools: Fix typos / broken sentences

2019-07-03 Thread tip-bot for Andi Kleen
Commit-ID: 734ac47e23aee12e1c16a4dd52d7c1cb893eaf6c Gitweb: https://git.kernel.org/tip/734ac47e23aee12e1c16a4dd52d7c1cb893eaf6c Author: Andi Kleen AuthorDate: Fri, 28 Jun 2019 15:09:00 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 2 Jul 2019 16:08:16 -0300 perf tools: Fix

[tip:perf/core] perf stat: Fix metrics with --no-merge

2019-07-03 Thread tip-bot for Andi Kleen
Commit-ID: e3a9427323a53ceee540276a74af7706f350d052 Gitweb: https://git.kernel.org/tip/e3a9427323a53ceee540276a74af7706f350d052 Author: Andi Kleen AuthorDate: Mon, 24 Jun 2019 12:37:11 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 1 Jul 2019 22:50:41 -0300 perf stat: Fix

[tip:perf/core] perf stat: Fix group lookup for metric group

2019-07-03 Thread tip-bot for Andi Kleen
Commit-ID: 2f87f33f4226523df9c9cc28f9874ea02fcc3d3f Gitweb: https://git.kernel.org/tip/2f87f33f4226523df9c9cc28f9874ea02fcc3d3f Author: Andi Kleen AuthorDate: Mon, 24 Jun 2019 12:37:10 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 1 Jul 2019 22:50:41 -0300 perf stat: Fix

[tip:perf/core] perf stat: Don't merge events in the same PMU

2019-07-03 Thread tip-bot for Andi Kleen
Commit-ID: 6c5f4e5cb35b4694dc035d91092d23f596ecd06a Gitweb: https://git.kernel.org/tip/6c5f4e5cb35b4694dc035d91092d23f596ecd06a Author: Andi Kleen AuthorDate: Mon, 24 Jun 2019 12:37:09 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 1 Jul 2019 22:50:41 -0300 perf stat

[tip:perf/core] perf stat: Make metric event lookup more robust

2019-07-03 Thread tip-bot for Andi Kleen
Commit-ID: 145c407c808352acd625be793396fd4f33c794f8 Gitweb: https://git.kernel.org/tip/145c407c808352acd625be793396fd4f33c794f8 Author: Andi Kleen AuthorDate: Mon, 24 Jun 2019 12:37:08 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 1 Jul 2019 22:50:41 -0300 perf stat: Make

[PATCH] x86/fpu: Fix nofxsr regression

2019-07-02 Thread Andi Kleen
From: Andi Kleen Vegard Nossum reports: The commit for this patch in mainline (ccb18db2ab9d ("x86/fpu: Make XSAVE check ...")) causes the kernel to hang on boot when passing the "nofxsr" option: $ kvm -cpu host -kernel arch/x86/boot/bzImage -append "console=ttyS0

Re: [PATCH v8 4/5] x86/xsave: Make XSAVE check the base CPUID features before enabling

2019-07-01 Thread Andi Kleen
> So if it is unlikely to have XSAVE but no FXSR I would suggest to add > "fpu__xstate_clear_all_cpu_caps()" to nofxsr and behave like "nofxsr > noxsave". Thanks for the analysis Sebastian. Makes sense. This would likely work, but I think I would rather just remove the option. -Andi

Re: [PATCH v8 4/5] x86/xsave: Make XSAVE check the base CPUID features before enabling

2019-07-01 Thread Andi Kleen
> > The commit for this patch in mainline > (ccb18db2ab9d923df07e7495123fe5fb02329713) causes the kernel to hang on > boot when passing the "nofxsr" option: Thanks. Hmm, I'm not sure nofxsr ever worked on 64bit. Certainly SSE cannot be saved/restored in any other way during the context switch.

[PATCH v1] perf/x86: Consider pinned events for group validation

2019-06-28 Thread Andi Kleen
From: Andi Kleen perf stat -M metrics relies on weak groups to reject unschedulable groups and run them as non-groups. This uses the group validation code in the kernel. Unfortunately that code doesn't take pinned events, such as the NMI watchdog, into account. So some groups can pass

[PATCH] perf tools: Fix typos / broken sentences

2019-06-28 Thread Andi Kleen
From: Andi Kleen - Fix a typo in the man page - Fix a tip that doesn't make any sense. Signed-off-by: Andi Kleen --- tools/perf/Documentation/perf-report.txt | 2 +- tools/perf/Documentation/tips.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf
