[PATCH v4 14/15] perf tools script: Add array bound checking to list_scripts

2019-03-05 Thread Andi Kleen
From: Andi Kleen Don't overflow array when the scripts directory is too large, or the script file name is too long. Signed-off-by: Andi Kleen --- tools/perf/builtin-script.c | 8 ++-- tools/perf/builtin.h | 3 ++- tools/perf/ui/browsers/scripts.c | 3 ++- 3 files changed
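
The kind of bound checking this patch describes can be sketched as follows. This is a minimal illustration and not the code from the patch: `fill_names`, `NAME_LEN`, and the skip-on-overflow policy are all hypothetical choices made for the example.

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN 16   /* hypothetical per-entry buffer size */

/*
 * Copy up to `max` names from `src` into fixed-size slots.
 * Names that would not fit in a slot are skipped rather than truncated.
 * Returns the number of slots filled.
 */
static int fill_names(char dst[][NAME_LEN], int max,
                      const char *const *src, int nsrc)
{
    int n = 0;

    for (int i = 0; i < nsrc && n < max; i++) {
        int len = snprintf(dst[n], NAME_LEN, "%s", src[i]);

        if (len < 0 || len >= NAME_LEN)
            continue;   /* name too long: reuse this slot for the next one */
        n++;
    }
    return n;
}
```

Both limits are enforced in one place: the loop condition stops at the array capacity, and the `snprintf` return value detects a name that would have overflowed its slot.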

Re: [PATCH v5 07/10] perf record: implement -z,--compression_level=n option and compression

2019-03-04 Thread Andi Kleen
On Fri, Mar 01, 2019 at 06:58:32PM +0300, Alexey Budankov wrote: Could do this as a follow up patch, but at some point the new records need to be documented in Documentation/perf.data-file-format.txt -Andi

Re: [PATCH v3 02/11] perf tools script: Support insn output for normal samples

2019-03-04 Thread Andi Kleen
> > + uname(&uts); > > + if (!strcmp(uts.machine, session->header.env.arch) || > > + (!strcmp(uts.machine, "x86_64") && > > +!strcmp(session->header.env.arch, "i386"))) > > why is this check and native_arch bool necessary? > i386 data will be covered by arch/x86 This is so
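
The condition quoted in this message can be isolated as a pure function, which makes the two accepted cases easier to see. This is only a sketch: `is_native_arch` is a hypothetical name, and the real code reads the host string from `uname()` and the recorded one from the session header.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true when data recorded on `recorded` can be treated as native
 * on `host`: either the strings match, or an x86_64 host is looking at
 * i386 data.
 */
static bool is_native_arch(const char *host, const char *recorded)
{
    if (!strcmp(host, recorded))
        return true;
    return !strcmp(host, "x86_64") && !strcmp(recorded, "i386");
}
```

Note that the second case is deliberately one-directional: an i386 host reading x86_64 data is not considered native.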

Re: [PATCH v3 10/11] perf tools report: Implement browsing of individual samples

2019-03-04 Thread Andi Kleen
> > +--samples=N:: > > + Save N individual samples for each histogram entry to show context in > > perf > > + report tui browser. > > maybe we could set some default value (50?) 50 wouldn't fit on the screen. I turned it off by default intentionally because it will increase the memory

[PATCH v3 07/11] perf tools report: Support running scripts for current time range

2019-02-28 Thread Andi Kleen
From: Andi Kleen When using the time sort key, add new context menus to run scripts for only the currently selected time range. Compute the correct range for the selection and pass it as the --time option to perf script. Signed-off-by: Andi Kleen --- v2: Use symbol_conf.time_quantum v3: Work

Support sample context in perf report

2019-02-28 Thread Andi Kleen
[Changes: v3: Fix compile problem on Fedora. Rebase on latest tip. Now hopefully no missing patches.] We currently have two ways to look at sample data in perf: either use perf report to aggregate everything, or use perf script to look at all individual samples. Both ways are useful. Of course

[PATCH v3 09/11] perf tools: Add utility function to print ns time stamps

2019-02-28 Thread Andi Kleen
From: Andi Kleen Add a utility function to print nanosecond timestamps. Signed-off-by: Andi Kleen --- tools/perf/util/time-utils.c | 8 tools/perf/util/time-utils.h | 1 + 2 files changed, 9 insertions(+) diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c index
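
A helper of this kind usually splits the value into whole seconds and a zero-padded nanosecond remainder. The sketch below assumes that output format; `format_ns_timestamp` is a hypothetical name, not necessarily the function added to time-utils.c.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC 1000000000ULL

/* Format a nanosecond timestamp as "<seconds>.<9-digit nanoseconds>". */
static int format_ns_timestamp(uint64_t ts, char *buf, size_t sz)
{
    uint64_t sec = ts / NSEC_PER_SEC;
    uint64_t nsec = ts % NSEC_PER_SEC;

    return snprintf(buf, sz, "%" PRIu64 ".%09" PRIu64, sec, nsec);
}
```

The `%09` padding matters: without it, a remainder of 42 ns would print as ".42" and read as 420 ms.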

[PATCH v3 03/11] perf tools report: Support nano seconds

2019-02-28 Thread Andi Kleen
From: Andi Kleen Upcoming changes add timestamp output in perf report. Add a --ns argument similar to perf script to support nanoseconds resolution when needed. Signed-off-by: Andi Kleen --- v2: Move flag into symbol_conf and change all users --- tools/perf/Documentation/perf-report.txt | 3

[PATCH v3 08/11] perf tools report: Support builtin perf script in scripts menu

2019-02-28 Thread Andi Kleen
From: Andi Kleen The scripts menu traditionally only showed custom perf scripts. Allow to run standard perf script with useful default options too. - Normal perf script - perf script with assembler (needs xed installed) - perf script with source code output (needs debuginfo) - perf script

[PATCH v3 04/11] perf tools report: Parse time quantum

2019-02-28 Thread Andi Kleen
From: Andi Kleen Many workloads change over time. perf report currently aggregates the whole time range reported in perf.data. This patch adds an option for a time quantum to quantize the perf.data over time. This just adds the option, will be used in follow on patches for a time sort key
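
Bucketing by a time quantum amounts to rounding each sample timestamp down to the start of its interval, so that all samples in the same interval compare equal. A minimal sketch, assuming timestamps and quantum share one unit; `quantize_time` is a hypothetical helper, not the patch's code.

```c
#include <stdint.h>

/* Round a timestamp down to the start of its quantum; quantum must be nonzero. */
static uint64_t quantize_time(uint64_t ts, uint64_t quantum)
{
    return ts - ts % quantum;
}
```

With a quantum of 100, the timestamps 1200 through 1299 all map to 1200 and would be aggregated together.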

[PATCH v3 11/11] perf tools: Add some new tips describing the new options

2019-02-28 Thread Andi Kleen
From: Andi Kleen And one old option. Signed-off-by: Andi Kleen --- tools/perf/Documentation/tips.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt index 849599f39c5e..4ec8107ed512 100644 --- a/tools/perf

[PATCH v3 06/11] perf tools report: Use less for scripts output

2019-02-28 Thread Andi Kleen
From: Andi Kleen The UI viewer for scripts output has a lot of limitations: limited size, no search or save function, slow, and various other issues. Just use 'less' to display directly on the terminal instead. This won't work in gtk mode, but gtk doesn't support these context menus anyways

[PATCH v3 01/11] perf tools: Add utility function to fetch executable

2019-02-28 Thread Andi Kleen
From: Andi Kleen Add a utility function to fetch executable code. Convert one user over to it. There are more places doing that, but they do significantly different actions, so they are not easy to fit into a single library function. Signed-off-by: Andi Kleen --- tools/perf/util/Build

[PATCH v3 10/11] perf tools report: Implement browsing of individual samples

2019-02-28 Thread Andi Kleen
From: Andi Kleen Now report can show whole time periods with perf script, but the user still has to find individual samples of interest manually. It would be expensive and complicated to search for the right samples in the whole perf file. Typically users only need to look at a small number

[PATCH v3 02/11] perf tools script: Support insn output for normal samples

2019-02-28 Thread Andi Kleen
From: Andi Kleen perf script -F +insn was only working for PT traces because the PT instruction decoder was filling in the insn/insn_len sample attributes. Support it for non PT samples too on x86 using the existing x86 instruction decoder. % perf record -a sleep 1 % perf script -F ip,sym,insn

[PATCH v3 05/11] perf tools report: Support time sort key

2019-02-28 Thread Andi Kleen
From: Andi Kleen Add a time sort key to perf report to display samples for different time quantums separately. This allows easier analysis of workloads that change over time, and also will allow looking at the context of samples. % perf record ... % perf report --sort time,overhead,symbol

[tip:perf/core] perf tools: Add perf_exe() helper to find perf binary

2019-02-28 Thread tip-bot for Andi Kleen
Commit-ID: 94816add0005595ea33fc8456519be582330401e Gitweb: https://git.kernel.org/tip/94816add0005595ea33fc8456519be582330401e Author: Andi Kleen AuthorDate: Sun, 24 Feb 2019 07:37:19 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 25 Feb 2019 10:58:28 -0300 perf tools

[tip:perf/core] perf script: Handle missing fields with -F +..

2019-02-28 Thread tip-bot for Andi Kleen
Commit-ID: 4b6ac811bce46c83811b83cdf87b41251596b9fc Gitweb: https://git.kernel.org/tip/4b6ac811bce46c83811b83cdf87b41251596b9fc Author: Andi Kleen AuthorDate: Sun, 24 Feb 2019 07:37:12 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 25 Feb 2019 10:58:07 -0300 perf script

Re: Support sample context in perf report

2019-02-27 Thread Andi Kleen
On Wed, Feb 27, 2019 at 05:16:59PM +0100, Jiri Olsa wrote: > On Wed, Feb 27, 2019 at 08:01:35AM -0800, Andi Kleen wrote: > > > > Also available in > > > > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git > > > > perf/streams-2 > > >

Re: Support sample context in perf report

2019-02-27 Thread Andi Kleen
> > Also available in > > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git > > perf/streams-2 > > your post is missing this patch, it's only in the branch: > perf tools: Add utility function to fetch executable Because Arnaldo already merged it. But the branch is still based on

Re: Support sample context in perf report

2019-02-26 Thread Andi Kleen
Jiri Olsa writes: > > im still getting compile error the new branch: > > CC ui/browsers/hists.o > ui/browsers/hists.c: In function ‘perf_evsel__hists_browse’: > ui/browsers/hists.c:2567:8: error: ‘%s’ directive output may be truncated > writing up to 63 bytes into a region of size

[PATCH v2 02/11] perf tools report: Support nano seconds

2019-02-25 Thread Andi Kleen
From: Andi Kleen Upcoming changes add timestamp output in perf report. Add a --ns argument similar to perf script to support nanoseconds resolution when needed. Signed-off-by: Andi Kleen --- v2: Move flag into symbol_conf and change all users --- tools/perf/Documentation/perf-report.txt | 3

[PATCH v2 01/11] perf tools script: Support insn output for normal samples

2019-02-25 Thread Andi Kleen
From: Andi Kleen perf script -F +insn was only working for PT traces because the PT instruction decoder was filling in the insn/insn_len sample attributes. Support it for non PT samples too on x86 using the existing x86 instruction decoder. % perf record -a sleep 1 % perf script -F ip,sym,insn

[PATCH v2 07/11] perf tools: Add perf_exe() helper to find perf binary

2019-02-25 Thread Andi Kleen
From: Andi Kleen Also convert one existing user. Signed-off-by: Andi Kleen --- tools/perf/util/header.c | 12 +++- tools/perf/util/util.c | 10 ++ tools/perf/util/util.h | 2 ++ 3 files changed, 15 insertions(+), 9 deletions(-) diff --git a/tools/perf/util/header.c b

[PATCH v2 04/11] perf tools report: Support time sort key

2019-02-25 Thread Andi Kleen
From: Andi Kleen Add a time sort key to perf report to display samples for different time quantums separately. This allows easier analysis of workloads that change over time, and also will allow looking at the context of samples. % perf record ... % perf report --sort time,overhead,symbol

[PATCH v2 03/11] perf tools report: Parse time quantum

2019-02-25 Thread Andi Kleen
From: Andi Kleen Many workloads change over time. perf report currently aggregates the whole time range reported in perf.data. This patch adds an option for a time quantum to quantize the perf.data over time. This just adds the option, will be used in follow on patches for a time sort key

[PATCH v2 06/11] perf tools report: Support running scripts for current time range

2019-02-25 Thread Andi Kleen
From: Andi Kleen When using the time sort key, add new context menus to run scripts for only the currently selected time range. Compute the correct range for the selection and pass it as the --time option to perf script. Signed-off-by: Andi Kleen --- v2: Use symbol_conf.time_quantum

Support sample context in perf report

2019-02-25 Thread Andi Kleen
[Changes: Removed already merged patches. Address review feedback, see individual patches. Now compiles with gcc 8. Some minor bug fixes and improvements.] We currently have two ways to look at sample data in perf: either use perf report to aggregate everything, or use perf script to look at all

[PATCH v2 09/11] perf tools: Add utility function to print ns time stamps

2019-02-25 Thread Andi Kleen
From: Andi Kleen Add a utility function to print nanosecond timestamps. Signed-off-by: Andi Kleen --- tools/perf/util/time-utils.c | 8 tools/perf/util/time-utils.h | 1 + 2 files changed, 9 insertions(+) diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c index

[PATCH v2 10/11] perf tools report: Implement browsing of individual samples

2019-02-25 Thread Andi Kleen
From: Andi Kleen Now report can show whole time periods with perf script, but the user still has to find individual samples of interest manually. It would be expensive and complicated to search for the right samples in the whole perf file. Typically users only need to look at a small number

[PATCH v2 11/11] perf tools: Add some new tips describing the new options

2019-02-25 Thread Andi Kleen
From: Andi Kleen And one old option. Signed-off-by: Andi Kleen --- tools/perf/Documentation/tips.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt index 849599f39c5e..4ec8107ed512 100644 --- a/tools/perf

[PATCH v2 05/11] perf tools report: Use less for scripts output

2019-02-25 Thread Andi Kleen
From: Andi Kleen The UI viewer for scripts output has a lot of limitations: limited size, no search or save function, slow, and various other issues. Just use 'less' to display directly on the terminal instead. This won't work in gtk mode, but gtk doesn't support these context menus anyways

[PATCH v2 08/11] perf tools report: Support builtin perf script in scripts menu

2019-02-25 Thread Andi Kleen
From: Andi Kleen The scripts menu traditionally only showed custom perf scripts. Allow to run standard perf script with useful default options too. - Normal perf script - perf script with assembler (needs xed installed) - perf script with source code output (needs debuginfo) - perf script

Re: [PATCH 03/11] perf tools report: Support nano seconds

2019-02-25 Thread Andi Kleen
On Mon, Feb 25, 2019 at 11:40:45AM -0500, Sebastien Boisvert wrote: > > > On 2019-02-24 10:37 a.m., Andi Kleen wrote: > > From: Andi Kleen > > > > Upcoming changes add timestamp output in perf report. Add a --ns > > argument similar to perf script to suppor

Re: [PATCH 11/11] perf tools report: Implement browsing of individual samples

2019-02-25 Thread Andi Kleen
On Mon, Feb 25, 2019 at 01:56:15PM +0100, Jiri Olsa wrote: > On Sun, Feb 24, 2019 at 07:37:22AM -0800, Andi Kleen wrote: > > SNIP > > > +static void hists__res_sample(struct hist_entry *he, struct perf_sample > > *sample) > > +{ > > + struct res_sample *r;

Re: [PATCH 11/11] perf tools report: Implement browsing of individual samples

2019-02-25 Thread Andi Kleen
> for some reason can't see those items in menu This one needs --samples 10 or similar. It's off by default currently. -Andi

Re: [PATCH 07/11] perf tools report: Support running scripts for current time range

2019-02-25 Thread Andi Kleen
> can't see the time ranges in the menu.. what's the path > to get them listed? You need to use --sort time,overhead,sym Without time sorting there are no available ranges. -Andi

[PATCH 04/11] perf tools report: Parse time quantum

2019-02-24 Thread Andi Kleen
From: Andi Kleen Many workloads change over time. perf report currently aggregates the whole time range reported in perf.data. This patch adds an option for a time quantum to quantize the perf.data over time. This just adds the option, will be used in follow on patches for a time sort key

[PATCH 01/11] perf tools script: Handle missing fields with -F +..

2019-02-24 Thread Andi Kleen
From: Andi Kleen When using -F + syntax to add a field the existing defaults are currently all marked user_set. This can cause errors when some field is missing in the perf.data This patch tracks the actually user set fields separately, so that we don't error out in this case. Before: % perf

Support sample context in perf report

2019-02-24 Thread Andi Kleen
We currently have two ways to look at sample data in perf: either use perf report to aggregate everything, or use perf script to look at all individual samples. Both ways are useful. Of course aggregation is useful to quickly find the most expensive part of the code. But sometimes a single

[PATCH 08/11] perf tools: Add perf_exe() helper to find perf binary

2019-02-24 Thread Andi Kleen
From: Andi Kleen Also convert one existing user. Signed-off-by: Andi Kleen --- tools/perf/util/header.c | 12 +++- tools/perf/util/util.c | 10 ++ tools/perf/util/util.h | 2 ++ 3 files changed, 15 insertions(+), 9 deletions(-) diff --git a/tools/perf/util/header.c b

[PATCH 06/11] perf tools report: Use less for scripts output

2019-02-24 Thread Andi Kleen
From: Andi Kleen The UI viewer for scripts output has a lot of limitations: limited size, no search or save function, slow, and various other issues. Just use 'less' to display directly on the terminal instead. This won't work in gtk mode, but gtk doesn't support these context menus anyways

[PATCH 05/11] perf tools report: Support time sort key

2019-02-24 Thread Andi Kleen
From: Andi Kleen Add a time sort key to perf report to display samples for different time quantums separately. This allows easier analysis of workloads that change over time, and also will allow looking at the context of samples. % perf record ... % perf report --sort time,overhead,symbol

[PATCH 02/11] perf tools script: Support insn output for normal samples

2019-02-24 Thread Andi Kleen
From: Andi Kleen perf script -F +insn was only working for PT traces because the PT instruction decoder was filling in the insn/insn_len sample attributes. Support it for non PT samples too on x86 using the existing x86 instruction decoder. % perf record -a sleep 1 % perf script -F ip,sym,insn

[PATCH 07/11] perf tools report: Support running scripts for current time range

2019-02-24 Thread Andi Kleen
From: Andi Kleen When using the time sort key, add new context menus to run scripts for only the currently selected time range. Compute the correct range for the selection and pass it as the --time option to perf script. Signed-off-by: Andi Kleen --- tools/perf/ui/browsers/hists.c | 82

[PATCH 10/11] perf tools: Add utility function to print ns time stamps

2019-02-24 Thread Andi Kleen
From: Andi Kleen Add a utility function to print nanosecond timestamps. Signed-off-by: Andi Kleen --- tools/perf/util/time-utils.c | 8 tools/perf/util/time-utils.h | 1 + 2 files changed, 9 insertions(+) diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c index

[PATCH 11/11] perf tools report: Implement browsing of individual samples

2019-02-24 Thread Andi Kleen
From: Andi Kleen Now report can show whole time periods with perf script, but the user still has to find individual samples of interest manually. It would be expensive and complicated to search for the right samples in the whole perf file. Typically users only need to look at a small number

[PATCH 09/11] perf tools report: Support builtin perf script in scripts menu

2019-02-24 Thread Andi Kleen
From: Andi Kleen The scripts menu traditionally only showed custom perf scripts. Allow to run standard perf script with useful default options too. - Normal perf script - perf script with assembler (needs xed installed) - perf script with source code output (needs debuginfo) - perf script

[PATCH 03/11] perf tools report: Support nano seconds

2019-02-24 Thread Andi Kleen
From: Andi Kleen Upcoming changes add timestamp output in perf report. Add a --ns argument similar to perf script to support nanoseconds resolution when needed. Signed-off-by: Andi Kleen --- tools/perf/Documentation/perf-report.txt | 3 +++ tools/perf/builtin-report.c | 1

Re: [PATCH 04/17] perf data: Fail check_backup in case of error

2019-02-21 Thread Andi Kleen
On Thu, Feb 21, 2019 at 10:41:32AM +0100, Jiri Olsa wrote: > And display the error message from removing > the old data file: > > $ perf record ls > Can't remove old data: Permission denied (perf.data.old) > Perf session creation failed. > > Not sure how to make fail the rename (after we

Re: [PATCH] x86/fpu: Parse comma separated list passed in clearcpuid

2019-02-21 Thread Andi Kleen
On Thu, Feb 21, 2019 at 02:37:45PM +0100, Peter Zijlstra wrote: > On Thu, Feb 21, 2019 at 08:12:25AM -0500, Prarit Bhargava wrote: > > Users cannot disable multiple CPU features with the kernel parameter > > clearcpuid=. For example, "clearcpuid=154 clearcpuid=227" only disables > > CPUID bit
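
Parsing a comma separated list of feature numbers, as the patch subject describes, can be sketched in plain C as below. This is a userspace illustration only: `parse_bit_list` is a hypothetical name, the kernel would use its own parsing helpers, and values too large for `unsigned int` are not range-checked here.

```c
#include <stdlib.h>

/*
 * Parse a comma separated list of non-negative decimal numbers such as
 * "154,227" into out[]. Returns the count parsed, or -1 on malformed
 * input or when more than `max` entries are present.
 */
static int parse_bit_list(const char *s, unsigned int *out, int max)
{
    int n = 0;

    while (*s) {
        char *end;
        unsigned long v;

        if (*s < '0' || *s > '9')
            return -1;           /* rejects signs, spaces and empty items */
        v = strtoul(s, &end, 10);
        if (n == max)
            return -1;
        out[n++] = (unsigned int)v;
        if (*end == ',') {
            if (end[1] == '\0')
                return -1;       /* trailing comma */
            s = end + 1;
        } else if (*end == '\0') {
            s = end;
        } else {
            return -1;
        }
    }
    return n;
}
```

Each parsed number would then be handed to whatever clears the corresponding feature bit, so a single "154,227" argument disables both.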

Re: [PATCH v5 12/12] KVM/VMX/vPMU: support to report GLOBAL_STATUS_LBRS_FROZEN

2019-02-15 Thread Andi Kleen
On Fri, Feb 15, 2019 at 08:56:02AM +, Wang, Wei W wrote: > On Friday, February 15, 2019 12:32 AM, Andi Kleen wrote: > > > > > +static void intel_pmu_get_global_status(struct kvm_pmu *pmu, > > > + struct msr_data *msr_info) > >

Re: [PATCH v5 12/12] KVM/VMX/vPMU: support to report GLOBAL_STATUS_LBRS_FROZEN

2019-02-14 Thread Andi Kleen
> +static void intel_pmu_get_global_status(struct kvm_pmu *pmu, > + struct msr_data *msr_info) > +{ > + u64 guest_debugctl, freeze_lbr_bits = DEBUGCTLMSR_FREEZE_LBRS_ON_PMI | > + DEBUGCTLMSR_LBR; > + > + if

Re: [PATCH v5 07/12] perf/x86: no counter allocation support

2019-02-14 Thread Andi Kleen
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index 9de8780..ec97a70 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -372,7 +372,8 @@ struct perf_event_attr { > context_switch : 1, /*

Re: [PATCH v5 03/12] KVM/x86: KVM_CAP_X86_GUEST_LBR

2019-02-14 Thread Andi Kleen
> + case KVM_CAP_X86_GUEST_LBR: > + r = -EINVAL; > + if (cap->args[0] && > + x86_perf_get_lbr_stack(>arch.lbr_stack)) { > + pr_err("Failed to enable the guest lbr feature\n"); Remove the pr_err. We don't want unprivileged users

Re: [PATCH 4.9 137/137] perf: Add support for supplementary event registers

2019-02-11 Thread Andi Kleen
On Mon, Feb 11, 2019 at 03:20:18PM +0100, Greg Kroah-Hartman wrote: > 4.9-stable review patch. If anyone has any objections, please let me know. > > -- > > From: Andi Kleen > > commit a7e3ed1e470116c9d12c2f778431a481a6be8ab6 upstream. The patch

[tip:perf/core] perf/x86/kvm: Avoid unnecessary work in guest filtering

2019-02-11 Thread tip-bot for Andi Kleen
Commit-ID: 9b545c04abd4f7246a3bde040efde587abebb23c Gitweb: https://git.kernel.org/tip/9b545c04abd4f7246a3bde040efde587abebb23c Author: Andi Kleen AuthorDate: Mon, 4 Feb 2019 14:23:30 -0800 Committer: Ingo Molnar CommitDate: Mon, 11 Feb 2019 08:00:39 +0100 perf/x86/kvm: Avoid

Re: [PATCH V6 2/5] perf/x86/kvm: Avoid unnecessary work in guest filtering

2019-02-04 Thread Andi Kleen
> As my understanding, the microcode version for each stepping is independent > and irrelevant. The 0x004e should be just coincidence. > If so, I don't think X86_STEPPING_ANY is very useful. > > Andi, if I'm wrong please correct me. Yes that's right. You cannot match microcode without

Re: [PATCH 3.16 045/305] x86/speculation: Apply IBPB more strictly to avoid cross-process data leak

2019-02-03 Thread Andi Kleen
On Sun, Feb 03, 2019 at 08:05:53PM +0100, Jiri Kosina wrote: > On Sun, 3 Feb 2019, Ben Hutchings wrote: > > > 3.16.63-rc1 review patch. If anyone has any objections, please let me know. > > > > -- > > > > From: Jiri Kosina > > > > commit

Re: [PATCH v5 00/13] x86: Enable FSGSBASE instructions

2019-02-01 Thread Andi Kleen
Patches all look good to me. Reviewed-by: Andi Kleen -Andi

Re: System crash with perf_fuzzer (kernel: 5.0.0-rc3)

2019-01-31 Thread Andi Kleen
> Yeah, a loop stuck looks really scary inside an NMI handler. > Should I just go ahead to send a patch to remove this warning? > Or probably turn it into a pr_info()? Not at this point. Would need to fix the PMU reset first to be more selective. -Andi

Re: [PATCH V3 01/13] perf/core, x86: Add PERF_SAMPLE_DATA_PAGE_SIZE

2019-01-31 Thread Andi Kleen
> > Aside from all the missing {}, I'm fairly sure this is broken since this > happens from NMI context. This can interrupt switch_mm() and things like > use_temporary_mm(). So the concern is that the sample is from before the switch, and then looks it up in the wrong page tables if the PMI

Re: System crash with perf_fuzzer (kernel: 5.0.0-rc3)

2019-01-31 Thread Andi Kleen
On Thu, Jan 31, 2019 at 01:28:34PM +0530, Ravi Bangoria wrote: > Hi Andi, > > On 1/25/19 9:30 PM, Andi Kleen wrote: > >> [Fri Jan 25 10:28:53 2019] perf: interrupt took too long (2501 > 2500), > >> lowering kernel.perf_event_max_sample_rate to 79750 > >

Re: System crash with perf_fuzzer (kernel: 5.0.0-rc3)

2019-01-30 Thread Andi Kleen
Jiri Olsa writes: > > the patch adds check_period pmu callback.. I need to check if there's > better way to do this, but so far it fixes the crash for me > > if you guys could check this patch, that'd be great There's already a limit_period callback, perhaps that could be extended. But ok, can do

Re: [PATCH v2 0/4] perf: enable compression of record mode trace to save storage space

2019-01-29 Thread Andi Kleen
On Tue, Jan 29, 2019 at 11:45:43AM +0100, Arnaldo Carvalho de Melo wrote: > Em Mon, Jan 28, 2019 at 09:40:28AM +0300, Alexey Budankov escreveu: > > The patch set implements runtime trace compression for record mode and > > trace file decompression for report mode. Zstandard API [1] is used for >

Re: [PATCH] perf vendor events intel: Fix Load_Miss_Real_Latency on CLX

2019-01-29 Thread Andi Kleen
On Tue, Jan 29, 2019 at 12:05:36PM -0500, William Cohen wrote: > Fix incorrect event names for the Load_Miss_Real_Latency metric for > Cascadelake server in the same manner as commit 91b2b97025 for SKL/SKX. Reviewed-by: Andi Kleen > > Signed-off-by: William Cohen > --- >

Re: [RFC] Don't print sample_type bits in non-group events not set in the group's was Re: [PATCH] perf, script: Fix crash with printing mixed trace point and other events

2019-01-28 Thread Andi Kleen
> > also now it won't make sample for slave events > > with zero value/period read > > > > note the patch needs to be split into more patches, > > sending it all together for discussion over the solution > > any feedback on this one? Looks good to me. Reviewed-by: Andi Kleen -Andi

Re: System crash with perf_fuzzer (kernel: 5.0.0-rc3)

2019-01-25 Thread Andi Kleen
> [Fri Jan 25 10:28:53 2019] perf: interrupt took too long (2501 > 2500), > lowering kernel.perf_event_max_sample_rate to 79750 > [Fri Jan 25 10:29:08 2019] perf: interrupt took too long (3136 > 3126), > lowering kernel.perf_event_max_sample_rate to 63750 > [Fri Jan 25 10:29:11 2019] perf:

Re: [LSF/MM TOPIC] Page flags, can we free up space ?

2019-01-22 Thread Andi Kleen
Jerome Glisse writes: > > Right now this is more tentative, i.e. I do not know if I will succeed, > in any case I can report on failure or success and discuss my finding to > get people opinions on the matter. I would just stop putting node/zone number into the flags. These could be all handled

Re: [PATCH 10/12] perf script: Add support for PERF_SAMPLE_CODE_PAGE_SIZE

2019-01-22 Thread Andi Kleen
> + PERF_OUTPUT_CODE_PAGE_SIZE = 1UL << 32, That won't work on 32-bit. You need 1ULL. Also might want to audit that no one puts these flags into an int. -Andi
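
The point about 32-bit builds is that `unsigned long` is only 32 bits wide on ILP32 targets, so `1UL << 32` shifts by the full width of the type, which is undefined behaviour in C. `unsigned long long` is at least 64 bits everywhere. A small sketch, using the hypothetical names `OUTPUT_BIT_32` and `has_flag`:

```c
#include <stdint.h>

/* Bit 32 as a 64-bit constant; 1ULL is at least 64 bits wide on every ABI. */
#define OUTPUT_BIT_32 (1ULL << 32)

/* Test a flag in a 64-bit mask without narrowing it to int. */
static int has_flag(uint64_t mask, uint64_t flag)
{
    return (mask & flag) != 0;
}
```

The second half of the review comment is visible here as well: storing such a mask in a 32-bit variable silently drops bit 32, so every variable and parameter that carries the flags needs a 64-bit type.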

[tip:perf/urgent] perf script: Fix crash with printing mixed trace point and other events

2019-01-22 Thread tip-bot for Andi Kleen
Commit-ID: 96167167b6e17b25c0e05ecc31119b73baeab094 Gitweb: https://git.kernel.org/tip/96167167b6e17b25c0e05ecc31119b73baeab094 Author: Andi Kleen AuthorDate: Thu, 17 Jan 2019 11:48:34 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 18 Jan 2019 09:53:07 -0300 perf script

Re: [RFC] x86/speculation: add L1 Terminal Fault / Foreshadow demo

2019-01-21 Thread Andi Kleen
> + /* Check the start address: needs to be page-aligned.. */ > +-if (start & ~PAGE_MASK) > ++if (start & ~PAGE_MASK) { > ++ > ++/* > ++ * XXX Hack > ++ * > ++ * We re-use this error case to show case a cache load gadget: > ++

Re: [RFC] Don't print sample_type bits in non-group events not set in the group's was Re: [PATCH] perf, script: Fix crash with printing mixed trace point and other events

2019-01-18 Thread Andi Kleen
> +static bool perf_evsel__should_skip(struct perf_evsel *evsel) > +{ > + struct perf_event_attr *attr = &evsel->attr; > + struct perf_evsel *leader = evsel->leader; > + > + return (leader != evsel) && !attr->freq && !attr->sample_freq; > +} > + > static int process_sample_event(struct

Re: [PATCH] perf/core: fix perf_proc_update_handler() bug

2019-01-17 Thread Andi Kleen
> > > But the value is still modified causing all sorts of inconsistencies: > > > > $ cat /proc/sys/kernel/perf_event_max_sample_rate > > 10001 > > > > This patch fixes the problem by moving the parsing of the value after > > the test. > > > > Signed-off-by: Stephane Eranian > > Ping. Reviewed-by: Andi Kleen -Andi
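
The ordering problem described in the quoted patch, where a rejected write still changes the stored value, can be illustrated in userspace C. This is a sketch of the general pattern only: `struct rate_cfg`, `update_rate`, and the 1..1000000 range are hypothetical and do not mirror the kernel's sysctl handler.

```c
#include <errno.h>
#include <stdlib.h>

struct rate_cfg {
    int max_sample_rate;   /* hypothetical tunable */
    int writes_allowed;    /* hypothetical precondition for accepting writes */
};

/* Reject the write before touching the stored value; returns 0 or -errno. */
static int update_rate(struct rate_cfg *cfg, const char *text)
{
    char *end;
    long val;

    if (!cfg->writes_allowed)
        return -EINVAL;          /* test first: cfg is left untouched */

    errno = 0;
    val = strtol(text, &end, 10);
    if (errno || end == text || *end != '\0' || val < 1 || val > 1000000)
        return -EINVAL;

    cfg->max_sample_rate = (int)val;
    return 0;
}
```

Because the precondition is checked before any parsing or assignment, a failed call cannot leave the configuration in a state that disagrees with the error it returned.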

[PATCH] perf, script: Fix crash with printing mixed trace point and other events

2019-01-17 Thread Andi Kleen
From: Andi Kleen perf script crashes currently when printing mixed trace points and other events because the trace format does not handle events without trace meta data. Add a simple check to avoid that. % cat > test.c main() { printf("Hello world\n"); } ^D % gcc -g -o test

Re: [PATCH v4 04/13] x86/fsgsbase/64: Add intrinsics/macros for FSGSBASE instructions

2019-01-16 Thread Andi Kleen
> +#ifdef CONFIG_X86_64 > + > +#include > + > +.macro RDGSBASE opd The caller can now use the assembler instructions directly, so the macros are not needed anymore. -Andi

Re: [RFC v2 0/6] x86: dynamic indirect branch promotion

2019-01-08 Thread Andi Kleen
> BTW: I am not sure that static-keys are much better. Their change also > affects the control flow, and they do affect the control flow. Static keys have the same problem, but they only change infrequently so usually it's not too big a problem if you dump the kernel close to the tracing

Re: [RFC v2 0/6] x86: dynamic indirect branch promotion

2019-01-08 Thread Andi Kleen
On Tue, Jan 08, 2019 at 11:10:58AM +0100, Peter Zijlstra wrote: > On Tue, Jan 08, 2019 at 12:01:11PM +0200, Adrian Hunter wrote: > > The problem is that the jitted code gets freed from memory, which is why I > > suggested the ability to pin it for a while. > > Then what do you tell the guy that

Re: [PATCH v4 05/10] KVM/x86: expose MSR_IA32_PERF_CAPABILITIES to the guest

2019-01-07 Thread Andi Kleen
On Mon, Jan 07, 2019 at 10:48:38AM -0800, Jim Mattson wrote: > On Mon, Jan 7, 2019 at 10:20 AM Andi Kleen wrote: > > > > > The issue is compatibility. Prior to your change, reading this MSR > > > from a VM would raise #GP. After your change, it won't. That means

Re: [PATCH v4 05/10] KVM/x86: expose MSR_IA32_PERF_CAPABILITIES to the guest

2019-01-07 Thread Andi Kleen
> The issue is compatibility. Prior to your change, reading this MSR > from a VM would raise #GP. After your change, it won't. That means > that if you have a VM migrating between hosts with kernel versions > before and after this change, the results will be inconsistent. In the No it will not

Re: [RFC v2 1/6] x86: introduce kernel restartable sequence

2019-01-03 Thread Andi Kleen
> Thanks for the explanations. I don’t think it would work (e.g., IRQs). I can > avoid generalizing and just detect the "magic sequence” of the code, but let > me give it some more thought. If you ask me I would just use compiler profile feedback or autofdo (if your compiler has a working

Re: [RFC v2 1/6] x86: introduce kernel restartable sequence

2019-01-03 Thread Andi Kleen
> Ok… I’ll try to think about another solution. Just note that this is just > used as a hint to avoid unnecessary lookups. (IOW, nothing will break if the > prefix is used.) Are you sure actually? The empty prefix could mean 8bit register accesses. > > You're doing the equivalent of patching a

Re: [RFC v2 1/6] x86: introduce kernel restartable sequence

2019-01-03 Thread Andi Kleen
Nadav Amit writes: I see another poor man's attempt to reinvent TSX. > It is sometimes beneficial to have a restartable sequence - very few > instructions which if they are preempted jump to a predefined point. > > To provide such functionality on x86-64, we use an empty REX-prefix > (opcode

Re: [RFC v2 0/6] x86: dynamic indirect branch promotion

2019-01-03 Thread Andi Kleen
Nadav Amit writes: > > - Do we use periodic learning or not? Josh suggested to reconfigure the > branches whenever a new target is found. However, I do not know at > this time how to do learning efficiently, without making learning much > more expensive. FWIW frequent patching will likely

Re: [PATCH v4 04/10] KVM/x86: intel_pmu_lbr_enable

2019-01-03 Thread Andi Kleen
> Yes, but then what happens? > > Fast forward to, say, 2021. You're decommissioning all Broadwell > servers in your data center. You have to migrate the running VMs off > of those Broadwell systems onto newer hardware. But, with the current > implementation, the migration cannot happen. So, what

[tip:perf/urgent] perf script: Fix LBR skid dump problems in brstackinsn

2019-01-03 Thread tip-bot for Andi Kleen
Commit-ID: 61f611593f2c90547cb09c0bf6977414454a27e6 Gitweb: https://git.kernel.org/tip/61f611593f2c90547cb09c0bf6977414454a27e6 Author: Andi Kleen AuthorDate: Mon, 19 Nov 2018 21:06:17 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 28 Dec 2018 16:33:02 -0300 perf script

Re: [PATCH v4 10/10] KVM/x86/lbr: lazy save the guest lbr stack

2018-12-28 Thread Andi Kleen
On Fri, Dec 28, 2018 at 11:47:06AM +0800, Wei Wang wrote: > On 12/28/2018 04:51 AM, Andi Kleen wrote: > > Thanks. This looks a lot better than the earlier versions. > > > > Some more comments. > > > > On Wed, Dec 26, 2018 at 05:25:38PM +0800, Wei Wang wrote:

Re: [PATCH v4 10/10] KVM/x86/lbr: lazy save the guest lbr stack II

2018-12-27 Thread Andi Kleen
Actually forgot one case. In Arch Perfmon v4 the LBR freezing is also controlled through a GLOBAL_CTRL bit. I didn't see any code handling that bit? -Andi

Re: [PATCH v4 10/10] KVM/x86/lbr: lazy save the guest lbr stack

2018-12-27 Thread Andi Kleen
Thanks. This looks a lot better than the earlier versions. Some more comments. On Wed, Dec 26, 2018 at 05:25:38PM +0800, Wei Wang wrote: > When the vCPU is scheduled in: > - if the lbr feature was used in the last vCPU time slice, set the lbr > stack to be interceptible, so that the host can

Re: [PATCH v2] x86, kbuild: revert macrofying inline assembly code

2018-12-21 Thread Andi Kleen
Masahiro Yamada writes: > Revert the following 9 commits: FWIW the -Wa addition also broke LTO builds because it doesn't really support -Wa for individual files. So I'm glad they got reverted. -Andi

Re: [PATCH 0/7] ARM: hacks for link-time optimization

2018-12-21 Thread Andi Kleen
> In particular turning an address-dependency into a control-dependency, > which is something allowed by the C language, since it doesn't recognise > these concepts as such. > > The 'optimization' is allowed currently, but LTO will make it much more > likely since it will have a much wider view

Re: [RFC PATCH] x86/speculation: Don't inherit TIF_SSBD on execve()

2018-12-19 Thread Andi Kleen
> You can always force disable SSB. In that case, all the child processes > will have SSBD on. Okay that sounds reasonable, given the below. Thanks. -Andi > > > > > Do you have a real use case where this behavior is a problem? > > > > -Andi > > Yes, we have an enterprise application partner

Re: [PATCH] checkpatch.pl: Improve WARNING on Kconfig help

2018-12-19 Thread Andi Kleen
> "expecting a 'help' section of > $min_conf_desc_length or more lines\n" . $herecurr); > or maybe > "please write a paragraph that describes > the config symbol fully ($min_conf_desc_length or more lines)\n" . $herecurr); >

Re: [RFC PATCH] x86/speculation: Don't inherit TIF_SSBD on execve()

2018-12-19 Thread Andi Kleen
On Wed, Dec 19, 2018 at 02:09:50PM -0500, Waiman Long wrote: > With the default SPEC_STORE_BYPASS_SECCOMP/SPEC_STORE_BYPASS_PRCTL mode, > the TIF_SSBD bit will be inherited when a new task is fork'ed or cloned. > > As only certain class of applications (like Java) requires disabling > speculative

Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

2018-12-18 Thread Andi Kleen
On Tue, Dec 18, 2018 at 05:16:20PM -0500, Steven Rostedt wrote: > On Tue, 18 Dec 2018 14:13:38 -0800 > Andi Kleen wrote: > > > > Again, that's not the ftrace case. It doesn't care about more than one > > > out of line instance. Thus, for this particular use, "u

Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

2018-12-18 Thread Andi Kleen
On Tue, Dec 18, 2018 at 04:57:13PM -0500, Steven Rostedt wrote: > Hmm, how does that work? When does LTO do its linker magic? Because the > fentry/mcounts are added when the object is created. Are they removed > if the compiler sees that it can be inlined? Or does LTO just compile > everything in

Re: [PATCH v6 1/3] x86/fpu: track AVX-512 usage of tasks

2018-12-18 Thread Andi Kleen
On Tue, Dec 18, 2018 at 01:44:41PM -0800, Dave Hansen wrote: > On 12/18/18 1:38 PM, Andi Kleen wrote: > >> I misunderstood, you mean 32bit kernel, not 32bit machine. Theoretically > >> 32bit > >> kernel can use AVX512, but not sure if anyone use it like this. >

Re: [PATCH v6 1/3] x86/fpu: track AVX-512 usage of tasks

2018-12-18 Thread Andi Kleen
> I misunderstood, you mean 32bit kernel, not 32bit machine. Theoretically 32bit > kernel can use AVX512, but not sure if anyone use it like this. > get_jiffies_64() > includes jiffies_lock ops so not good in context switch. So I want to use raw > jiffies_64 here. jiffies is a good candidate but

Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

2018-12-18 Thread Andi Kleen
On Tue, Dec 18, 2018 at 10:19:32AM +0100, Peter Zijlstra wrote: > On Mon, Dec 17, 2018 at 03:59:50PM -0800, Andi Kleen wrote: > > BTW I have a user base for LTO and so far no one has reported any issues > > like this. > > Because ordering issues are immediately obvious and eas

Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

2018-12-18 Thread Andi Kleen
> I whittled it down to a small test case. It turns out the problem is > caused by the "__optimize__("no-tracer")" attribute, which is used by our > __noclone macro: > > > # if __has_attribute(__optimize__) > # define __noclone __attribute__((__noclone__, >
