Re: [perf] perf_fuzzer causes unchecked MSR access error

2021-03-03 Thread Vince Weaver
On Wed, 3 Mar 2021, Liang, Kan wrote:

> We never use bit 58. It should be a new issue.
> Is it repeatable?

yes, it's repeatable.  

(which I'm glad to see because it looks suspiciously like a memory bit 
flip)

Though since it's a WARN_ONCE I have to reboot each time I want to test.

If I get a chance I'll try to come up with a reduced test case but 
probably won't have time for that today.

Vince




[perf] perf_fuzzer causes unchecked MSR access error

2021-03-03 Thread Vince Weaver
Hello

on my Haswell machine the perf_fuzzer managed to trigger this message:

[117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 
0x0400) at rIP: 0x8106e4f4 (native_write_msr+0x4/0x20)
[117248.089957] Call Trace:
[117248.092685]  intel_pmu_pebs_enable_all+0x31/0x40
[117248.097737]  intel_pmu_enable_all+0xa/0x10
[117248.102210]  __perf_event_task_sched_in+0x2df/0x2f0
[117248.107511]  finish_task_switch.isra.0+0x15f/0x280
[117248.112765]  schedule_tail+0xc/0x40
[117248.116562]  ret_from_fork+0x8/0x30

that shouldn't be possible, should it?  MSR 0x3f1 is MSR_IA32_PEBS_ENABLE
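
For reference, a minimal sketch of how the written value can be decoded into bit positions. The helper names are hypothetical and this is plain user-space C, not kernel code. The log line above shows the value truncated, so the full 64-bit value 0x0400000000000000 used in the note after the block is an assumption, chosen because the follow-up in this thread talks about bit 58.

```c
#include <stdint.h>

/* Number of bits set in v. */
static int count_set_bits(uint64_t v)
{
    int n = 0;

    while (v) {
        v &= v - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Index of the lowest set bit in v, or -1 if v is zero. */
static int lowest_set_bit(uint64_t v)
{
    for (int i = 0; i < 64; i++)
        if (v & (UINT64_C(1) << i))
            return i;
    return -1;
}
```

With the assumed value 0x0400000000000000, count_set_bits() reports a single set bit and lowest_set_bit() reports index 58.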

this is on recent-git with the patch causing the pebs-related crash 
reverted.

Vince


Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

2021-03-02 Thread Vince Weaver
On Mon, 1 Mar 2021, Liang, Kan wrote:

> https://lore.kernel.org/lkml/tip-01330d7288e0050c5aaabc558059ff91589e6...@git.kernel.org/
> The patch is a SW workaround for some old CPUs (HSW and earlier), which may
> set the PEBS status to 0. It adds a check in intel_pmu_drain_pebs_nhm().
> It tries to minimize the impact of the defect by not dropping the PEBS
> records which have PEBS status 0.
> But it doesn't correct the PEBS status, which may cause problems,
> especially for large PEBS.
> It's possible that all the PEBS records in a large PEBS have the PEBS status
> 0. If so, the first get_next_pebs_record_by_bit() in
> __intel_pmu_pebs_event() returns NULL, so at = NULL. Since it's a large PEBS,
> the 'count' parameter must be > 1, and the second
> get_next_pebs_record_by_bit() will crash.
> 
> Could you please revert the patch and check whether it fixes your issue?

I've reverted that patch and my test-case no longer triggers the issue.
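
The failure mode described in the quoted text can be sketched in user space roughly as follows. This is a simplified model under stated assumptions, not the kernel code: struct rec, next_by_bit() and drain() are hypothetical stand-ins, and the record holds only a status word. The point is that a lookup which can return NULL has to be checked before the pointer is advanced and searched again.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record: only the status word matters here. */
struct rec {
    uint64_t status;
};

/* First record in [base, top) whose status has `bit` set, else NULL. */
static struct rec *next_by_bit(struct rec *base, struct rec *top, int bit)
{
    if (base == NULL)
        return NULL;
    for (struct rec *at = base; at < top; at++)
        if (at->status & (UINT64_C(1) << bit))
            return at;
    return NULL;
}

/*
 * Handle up to `count` records with `bit` set. The loop stops as soon as
 * a lookup comes back empty instead of advancing a NULL pointer.
 */
static int drain(struct rec *base, struct rec *top, int bit, int count)
{
    int handled = 0;
    struct rec *at = next_by_bit(base, top, bit);

    while (at != NULL && handled < count) {
        handled++;
        at = next_by_bit(at + 1, top, bit);
    }
    return handled;
}
```

If every record has status 0, the first lookup returns NULL and drain() handles nothing, even when the caller expected several records.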

I'll restart a longer fuzzing run to see if any other issues turn up.

Thanks,

Vince


Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

2021-02-11 Thread Vince Weaver
On Thu, 11 Feb 2021, Liang, Kan wrote:

> > On Thu, Jan 28, 2021 at 02:49:47PM -0500, Vince Weaver wrote:
> I'd like to reproduce it on my machine.
> Is this issue only found in a Haswell client machine?
> 
> To reproduce the issue, can I use ./perf_fuzzer under perf_event_tests/fuzzer?
> Do I need to apply any parameters with ./perf_fuzzer?
> 
> Usually how long does it take to reproduce the issue?

On my machine if I run the commands
echo 1 > /proc/sys/kernel/nmi_watchdog
echo 0 > /proc/sys/kernel/perf_event_paranoid
echo 1000 > /proc/sys/kernel/perf_event_max_sample_rate
./perf_fuzzer -s 3 -r 1611784483

it is repeatable within a minute, but because of the nature of the fuzzer 
it probably won't work for you because the random events will diverge 
based on the different configs of the system.

I can try to generate a simple reproducer, I've just been extremely busy 
here at work and haven't had the chance.

If you want to try to reproduce it the hard way, run the
./fast_repro99.sh
script in the perf_fuzzer directory.  It will start fuzzing.  My machine 
turned up the issue within a day or so.

Vince



Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

2021-01-28 Thread Vince Weaver
On Thu, 28 Jan 2021, Vince Weaver wrote:

> the perf_fuzzer has turned up a repeatable crash on my haswell system.
> 
> addr2line is not being very helpful, it points to DECLARE_PER_CPU_FIRST.
> I'll investigate more when I have the chance.

so I poked around some more.

This seems to be caused in

   __intel_pmu_pebs_event()
     get_next_pebs_record_by_bit()   ds.c line 1639
       get_pebs_status(at)           ds.c line 1317
         return ((struct pebs_record_nhm *)n)->status;

where "n" has the value of 0xc0 rather than a proper pointer.
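
A small sketch of the arithmetic, assuming the Haswell PEBS record layout is 24 u64 fields with status as the 19th. The struct below is written from memory of ds.c and renamed to mark it as an illustration, so treat the field order as an assumption. Under that assumption, 0xc0 is exactly one record size, which is what a NULL pointer advanced by one record would look like, and reading status from it touches 0xc0 + 0x90 = 0x150, the fault address in the oops.

```c
#include <stddef.h>
#include <stdint.h>

/* Field order assumed to mirror the Haswell PEBS record in ds.c. */
struct pebs_record_hsw_sketch {
    uint64_t flags, ip;
    uint64_t ax, bx, cx, dx;
    uint64_t si, di, bp, sp;
    uint64_t r8, r9, r10, r11;
    uint64_t r12, r13, r14, r15;
    uint64_t status, dla, dse, lat;
    uint64_t real_ip, tsx_tuning;
};

/* Address touched when status is read from NULL advanced by one record. */
static uintptr_t faulting_address(void)
{
    uintptr_t at = 0;                               /* NULL from the lookup */

    at += sizeof(struct pebs_record_hsw_sketch);    /* at += record size */
    return at + offsetof(struct pebs_record_hsw_sketch, status);
}
```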

this does seem to be repeatable, but it is fairly deep in a fuzzing run so I 
don't have a quick reproducer.

Vince


> [96289.009646] BUG: kernel NULL pointer dereference, address: 0150
> [96289.017094] #PF: supervisor read access in kernel mode
> [96289.022588] #PF: error_code(0x) - not-present page
> [96289.028069] PGD 0 P4D 0 
> [96289.030796] Oops:  [#1] SMP PTI
> [96289.034549] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 
> 5.11.0-rc5+ #151
> [96289.043059] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
> 01/26/2014
> [96289.050946] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
> [96289.056817] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 
> 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 
> 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
> [96289.076876] RSP: :822039e0 EFLAGS: 00010097
> [96289.082468] RAX: 0002 RBX: 0155 RCX: 
> 0008
> [96289.090095] RDX: 88811ac118a0 RSI: 82203980 RDI: 
> 82203980
> [96289.097746] RBP: 00c0 R08:  R09: 
> 
> [96289.105376] R10:  R11:  R12: 
> 0001
> [96289.113008] R13: 82203bc0 R14: 88801c3cf800 R15: 
> 829814a0
> [96289.120671] FS:  () GS:88811ac0() 
> knlGS:
> [96289.129346] CS:  0010 DS:  ES:  CR0: 80050033
> [96289.135526] CR2: 0150 CR3: 0220c003 CR4: 
> 001706f0
> [96289.143159] DR0:  DR1:  DR2: 
> 
> [96289.150803] DR3:  DR6: 0ff0 DR7: 
> 0600
> [96289.158414] Call Trace:
> [96289.161041]  ? update_blocked_averages+0x532/0x620
> [96289.166152]  ? update_group_capacity+0x25/0x1d0
> [96289.171025]  ? cpumask_next_and+0x19/0x20
> [96289.175339]  ? update_sd_lb_stats.constprop.0+0x702/0x820
> [96289.181105]  intel_pmu_drain_pebs_buffer+0x33/0x50
> [96289.186259]  ? x86_pmu_commit_txn+0xbc/0xf0
> [96289.190749]  ? _raw_spin_lock_irqsave+0x1d/0x30
> [96289.195603]  ? timerqueue_add+0x64/0xb0
> [96289.199720]  ? update_load_avg+0x6c/0x5e0
> [96289.204001]  ? enqueue_task_fair+0x98/0x5a0
> [96289.208464]  ? timerqueue_del+0x1e/0x40
> [96289.212556]  ? uncore_msr_read_counter+0x10/0x20
> [96289.217513]  intel_pmu_pebs_disable+0x12a/0x130
> [96289.222324]  x86_pmu_stop+0x48/0xa0
> [96289.226076]  x86_pmu_del+0x40/0x160
> [96289.229813]  event_sched_out.isra.0+0x81/0x1e0
> [96289.234602]  group_sched_out.part.0+0x4f/0xc0
> [96289.239257]  __perf_event_disable+0xef/0x1d0
> [96289.243831]  event_function+0x8c/0xd0
> [96289.247785]  remote_function+0x3e/0x50
> [96289.251797]  flush_smp_call_function_queue+0x11b/0x1a0
> [96289.257268]  flush_smp_call_function_from_idle+0x38/0x60
> [96289.262944]  do_idle+0x15f/0x240
> [96289.266421]  cpu_startup_entry+0x19/0x20
> [96289.270639]  start_kernel+0x7df/0x804
> [96289.274558]  ? apply_microcode_early.cold+0xc/0x27
> [96289.279678]  secondary_startup_64_no_verify+0xb0/0xbb
> [96289.285078] Modules linked in: nf_tables libcrc32c nfnetlink 
> intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic 
> snd_hda_codec_hdmi x86_pkg_temp_thermal ledtrig_audio intel_powerclamp 
> snd_hda_intel coretemp snd_intel_dspcfg snd_hda_codec snd_hda_core kvm_intel 
> kvm snd_hwdep irqbypass at24 snd_pcm tpm_tis crct10dif_pclmul snd_timer 
> crc32_pclmul regmap_i2c wmi_bmof sg tpm_tis_core snd ghash_clmulni_intel tpm 
> iTCO_wdt aesni_intel soundcore rng_core iTCO_vendor_support crypto_simd 
> mei_me mei cryptd pcspkr evdev glue_helper binfmt_misc ip_tables x_tables 
> autofs4 sr_mod sd_mod t10_pi cdrom i915 iosf_mbi ahci i2c_algo_bit libahci 
> drm_kms_helper xhci_pci ehci_pci ehci_hcd libata xhci_hcd lpc_ich usbcore 
> i2c_i801 drm crc32c_intel e1000e mfd_core scsi_mod usb_common i2c_smbus wmi 
> fan thermal video button
> [96289.362498] CR2: 0150
> [96289.366070] ---[ end trace 80c577f99562015f ]---
> [96289.371007] RIP: 0010:

[perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

2021-01-28 Thread Vince Weaver
Hello

the perf_fuzzer has turned up a repeatable crash on my haswell system.

addr2line is not being very helpful, it points to DECLARE_PER_CPU_FIRST.
I'll investigate more when I have the chance.

Vince

[96289.009646] BUG: kernel NULL pointer dereference, address: 0150
[96289.017094] #PF: supervisor read access in kernel mode
[96289.022588] #PF: error_code(0x) - not-present page
[96289.028069] PGD 0 P4D 0 
[96289.030796] Oops:  [#1] SMP PTI
[96289.034549] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 
5.11.0-rc5+ #151
[96289.043059] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
01/26/2014
[96289.050946] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
[96289.056817] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 
01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 
90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
[96289.076876] RSP: :822039e0 EFLAGS: 00010097
[96289.082468] RAX: 0002 RBX: 0155 RCX: 0008
[96289.090095] RDX: 88811ac118a0 RSI: 82203980 RDI: 82203980
[96289.097746] RBP: 00c0 R08:  R09: 
[96289.105376] R10:  R11:  R12: 0001
[96289.113008] R13: 82203bc0 R14: 88801c3cf800 R15: 829814a0
[96289.120671] FS:  () GS:88811ac0() 
knlGS:
[96289.129346] CS:  0010 DS:  ES:  CR0: 80050033
[96289.135526] CR2: 0150 CR3: 0220c003 CR4: 001706f0
[96289.143159] DR0:  DR1:  DR2: 
[96289.150803] DR3:  DR6: 0ff0 DR7: 0600
[96289.158414] Call Trace:
[96289.161041]  ? update_blocked_averages+0x532/0x620
[96289.166152]  ? update_group_capacity+0x25/0x1d0
[96289.171025]  ? cpumask_next_and+0x19/0x20
[96289.175339]  ? update_sd_lb_stats.constprop.0+0x702/0x820
[96289.181105]  intel_pmu_drain_pebs_buffer+0x33/0x50
[96289.186259]  ? x86_pmu_commit_txn+0xbc/0xf0
[96289.190749]  ? _raw_spin_lock_irqsave+0x1d/0x30
[96289.195603]  ? timerqueue_add+0x64/0xb0
[96289.199720]  ? update_load_avg+0x6c/0x5e0
[96289.204001]  ? enqueue_task_fair+0x98/0x5a0
[96289.208464]  ? timerqueue_del+0x1e/0x40
[96289.212556]  ? uncore_msr_read_counter+0x10/0x20
[96289.217513]  intel_pmu_pebs_disable+0x12a/0x130
[96289.222324]  x86_pmu_stop+0x48/0xa0
[96289.226076]  x86_pmu_del+0x40/0x160
[96289.229813]  event_sched_out.isra.0+0x81/0x1e0
[96289.234602]  group_sched_out.part.0+0x4f/0xc0
[96289.239257]  __perf_event_disable+0xef/0x1d0
[96289.243831]  event_function+0x8c/0xd0
[96289.247785]  remote_function+0x3e/0x50
[96289.251797]  flush_smp_call_function_queue+0x11b/0x1a0
[96289.257268]  flush_smp_call_function_from_idle+0x38/0x60
[96289.262944]  do_idle+0x15f/0x240
[96289.266421]  cpu_startup_entry+0x19/0x20
[96289.270639]  start_kernel+0x7df/0x804
[96289.274558]  ? apply_microcode_early.cold+0xc/0x27
[96289.279678]  secondary_startup_64_no_verify+0xb0/0xbb
[96289.285078] Modules linked in: nf_tables libcrc32c nfnetlink intel_rapl_msr 
intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic 
snd_hda_codec_hdmi x86_pkg_temp_thermal ledtrig_audio intel_powerclamp 
snd_hda_intel coretemp snd_intel_dspcfg snd_hda_codec snd_hda_core kvm_intel 
kvm snd_hwdep irqbypass at24 snd_pcm tpm_tis crct10dif_pclmul snd_timer 
crc32_pclmul regmap_i2c wmi_bmof sg tpm_tis_core snd ghash_clmulni_intel tpm 
iTCO_wdt aesni_intel soundcore rng_core iTCO_vendor_support crypto_simd mei_me 
mei cryptd pcspkr evdev glue_helper binfmt_misc ip_tables x_tables autofs4 
sr_mod sd_mod t10_pi cdrom i915 iosf_mbi ahci i2c_algo_bit libahci 
drm_kms_helper xhci_pci ehci_pci ehci_hcd libata xhci_hcd lpc_ich usbcore 
i2c_i801 drm crc32c_intel e1000e mfd_core scsi_mod usb_common i2c_smbus wmi fan 
thermal video button
[96289.362498] CR2: 0150
[96289.366070] ---[ end trace 80c577f99562015f ]---
[96289.371007] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
[96289.376868] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 
01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 
90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
[96289.396981] RSP: :822039e0 EFLAGS: 00010097
[96289.402573] RAX: 0002 RBX: 0155 RCX: 0008
[96289.410226] RDX: 88811ac118a0 RSI: 82203980 RDI: 82203980
[96289.417841] RBP: 00c0 R08:  R09: 
[96289.425461] R10:  R11:  R12: 0001
[96289.433122] R13: 82203bc0 R14: 88801c3cf800 R15: 829814a0
[96289.440774] FS:  () GS:88811ac0() 
knlGS:
[96289.449374] CS:  0010 DS:  ES:  CR0: 80050033
[96289.455507] CR2: 0150 CR3: 

Re: [patch] perf tool buffer overflow in perf_header__read_build_ids

2019-08-23 Thread Vince Weaver
On Fri, 26 Jul 2019, Arnaldo Carvalho de Melo wrote:

> Em Tue, Jul 23, 2019 at 04:42:30PM -0400, Vince Weaver escreveu:
> > my perf_tool_fuzzer has found another issue, this one a buffer overflow
> > in perf_header__read_build_ids.  The build id filename is read in with a 
> > filename length read from the perf.data file, but this can be longer than
> > PATH_MAX which will smash the stack.
> > 
> > This might not be the right fix, not sure if filename should be NUL
> > terminated or not.
> > 
> > Signed-off-by: Vince Weaver 
> > 
> > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> > index c24db7f4909c..9a893a26e678 100644
> > --- a/tools/perf/util/header.c
> > +++ b/tools/perf/util/header.c
> > @@ -2001,6 +2001,9 @@ static int perf_header__read_build_ids(struct perf_header *header,
> > perf_event_header__bswap(&bev.header);
> >  
> > len = bev.header.size - sizeof(bev);
> > +
> > +   if (len>PATH_MAX) len=PATH_MAX;
> > +
> 
> Humm, I wonder if we shouldn't just declare the whole file invalid like
> you did with the previous patch?
> 
> - Arnaldo
> 
> > if (readn(input, filename, len) != len)
> > goto out;
> > /*
 
did we ever decide how to fix this issue?  Or were you waiting on a 
followup patch from me?

This is actually an exploitable security bug if you can convince someone 
to run "perf" on an untrusted perf.data file.
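
As a sketch of the alternative raised in the quoted reply (reject the input rather than clamp the length), something along these lines would work in isolation. read_name_checked() is a hypothetical helper operating on an in-memory buffer, not the perf code, and whether the filename should be NUL terminated there is still the open question from the patch description.

```c
#include <limits.h>   /* PATH_MAX on POSIX systems */
#include <stddef.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback for systems that do not define it */
#endif

/*
 * Copy a `len`-byte name from src (src_len bytes available) into dst,
 * which must hold at least PATH_MAX bytes. Returns 0 on success, or -1
 * if the claimed length cannot be valid; dst is untouched in that case.
 */
static int read_name_checked(const char *src, size_t src_len,
                             size_t len, char *dst)
{
    if (len >= (size_t)PATH_MAX || len > src_len)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';  /* room is guaranteed because len < PATH_MAX */
    return 0;
}
```

Rejecting an oversized length means a corrupt file is reported as invalid instead of being read with a silently shortened name.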

Vince


[tip:perf/core] perf.data documentation: Clarify HEADER_SAMPLE_TOPOLOGY format

2019-08-15 Thread tip-bot for Vince Weaver
Commit-ID:  3143906c2770778d89b730e0342b745d1b4a8303
Gitweb: https://git.kernel.org/tip/3143906c2770778d89b730e0342b745d1b4a8303
Author: Vince Weaver 
AuthorDate: Thu, 1 Aug 2019 14:30:43 -0400
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 14 Aug 2019 10:59:59 -0300

perf.data documentation: Clarify HEADER_SAMPLE_TOPOLOGY format

The perf.data file format documentation for HEADER_SAMPLE_TOPOLOGY
specifies the layout in a confusing manner that doesn't match the rest
of the document.  This patch attempts to describe things consistent with
the rest of the file.

Signed-off-by: Vince Weaver 
Acked-by: Jiri Olsa 
Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Chong Jiang 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Simon Que 
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908011425240.14303@macbook-air
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf.data-file-format.txt | 25 +-
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index d030c87ed9f5..b0152e1095c5 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -298,16 +298,21 @@ Physical memory map and its node assignments.
 
 The format of data in MEM_TOPOLOGY is as follows:
 
-   0 - version  | for future changes
-   8 - block_size_bytes | /sys/devices/system/memory/block_size_bytes
-  16 - count| number of nodes
-
-For each node we store map of physical indexes:
-
-  32 - node id  | node index
-  40 - size | size of bitmap
-  48 - bitmap   | bitmap of memory indexes that belongs to node
-| /sys/devices/system/node/node/memory
+   u64 version;// Currently 1
+   u64 block_size_bytes;   // /sys/devices/system/memory/block_size_bytes
+   u64 count;  // number of nodes
+
+struct memory_node {
+u64 node_id;// node index
+u64 size;   // size of bitmap
+struct bitmap {
+   /* size of bitmap again */
+u64 bitmapsize;
+   /* bitmap of memory indexes that belongs to node */
+   /* /sys/devices/system/node/node/memory */
+u64 entries[(bitmapsize/64)+1];
+}
+}[count];
 
 The MEM_TOPOLOGY can be displayed with following command:
 


[patch] perf.data documentation clarify HEADER_SAMPLE_TOPOLOGY format

2019-08-01 Thread Vince Weaver


The perf.data file format documentation for HEADER_SAMPLE_TOPOLOGY 
specifies the layout in a confusing manner that doesn't match the rest of 
the document.  This patch attempts to describe things consistent with the 
rest of the file.
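
To make the layout concrete, here is a minimal sketch of walking such a payload, following the layout exactly as written in the patch below, including its entries[(bitmapsize/64)+1] sizing. mem_topology_words() is a hypothetical helper, the input is assumed to be already converted to host-endian u64 words, and this is not how perf itself parses the section.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a MEM_TOPOLOGY payload given as u64 words. Returns the number of
 * words consumed, or -1 if the payload is too short for what it claims.
 */
static long mem_topology_words(const uint64_t *w, size_t nwords)
{
    size_t pos = 3;                     /* version, block_size_bytes, count */

    if (nwords < 3)
        return -1;
    for (uint64_t node = 0; node < w[2]; node++) {
        uint64_t entries;

        if (nwords - pos < 3)           /* node_id, size, bitmapsize */
            return -1;
        entries = w[pos + 2] / 64 + 1;  /* sizing as documented */
        pos += 3;
        if (entries > nwords - pos)
            return -1;
        pos += (size_t)entries;
    }
    return (long)pos;
}
```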

Signed-off-by: Vince Weaver 

diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index 5f54feb19977..6a7dceaae709 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -298,16 +298,21 @@ Physical memory map and its node assignments.
 
 The format of data in MEM_TOPOLOGY is as follows:
 
-   0 - version  | for future changes
-   8 - block_size_bytes | /sys/devices/system/memory/block_size_bytes
-  16 - count| number of nodes
-
-For each node we store map of physical indexes:
-
-  32 - node id  | node index
-  40 - size | size of bitmap
-  48 - bitmap   | bitmap of memory indexes that belongs to node
-| /sys/devices/system/node/node/memory
+   u64 version;// Currently 1
+   u64 block_size_bytes;   // /sys/devices/system/memory/block_size_bytes
+   u64 count;  // number of nodes
+
+struct memory_node {
+u64 node_id;// node index
+u64 size;   // size of bitmap
+struct bitmap {
+   /* size of bitmap again */
+u64 bitmapsize; 
+   /* bitmap of memory indexes that belongs to node */
+   /* /sys/devices/system/node/node/memory */
+u64 entries[(bitmapsize/64)+1];
+}
+}[count];
 
 The MEM_TOPOLOGY can be displayed with following command:
 


[tip:perf/urgent] perf tools: Fix perf.data documentation units for memory size

2019-07-29 Thread tip-bot for Vince Weaver
Commit-ID:  2e9a06dda10aea81a17c623f08534dac6735434a
Gitweb: https://git.kernel.org/tip/2e9a06dda10aea81a17c623f08534dac6735434a
Author: Vince Weaver 
AuthorDate: Thu, 25 Jul 2019 11:57:43 -0400
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 29 Jul 2019 09:03:43 -0300

perf tools: Fix perf.data documentation units for memory size

The perf.data-file-format documentation incorrectly says the
HEADER_TOTAL_MEM results are in bytes.  The results are in kilobytes
(perf reads the value from /proc/meminfo)

Signed-off-by: Vince Weaver 
Cc: Alexander Shishkin 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1907251155500.22624@macbook-air
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf.data-file-format.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index 5f54feb19977..d030c87ed9f5 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -126,7 +126,7 @@ vendor,family,model,stepping. For example: GenuineIntel,6,69,1
 
HEADER_TOTAL_MEM = 10,
 
-An uint64_t with the total memory in bytes.
+An uint64_t with the total memory in kilobytes.
 
HEADER_CMDLINE = 11,
 


[tip:perf/urgent] perf header: Fix divide by zero error if f_header.attr_size==0

2019-07-29 Thread tip-bot for Vince Weaver
Commit-ID:  7622236ceb167aa3857395f9bdaf871442aa467e
Gitweb: https://git.kernel.org/tip/7622236ceb167aa3857395f9bdaf871442aa467e
Author: Vince Weaver 
AuthorDate: Tue, 23 Jul 2019 11:06:01 -0400
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 29 Jul 2019 09:03:43 -0300

perf header: Fix divide by zero error if f_header.attr_size==0

So I have been having lots of trouble with hand-crafted perf.data files
causing segfaults and the like, so I have started fuzzing the perf tool.

First issue found:

If f_header.attr_size is 0 in the perf.data file, then perf will crash
with a divide-by-zero error.

Committer note:

Added a pr_err() to tell the user why the command failed.

Signed-off-by: Vince Weaver 
Cc: Alexander Shishkin 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1907231100440.14532@macbook-air
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/header.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 20111f8da5cb..47877f0f6667 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3559,6 +3559,13 @@ int perf_session__read_header(struct perf_session *session)
   data->file.path);
}
 
+   if (f_header.attr_size == 0) {
+   pr_err("ERROR: The %s file's attr size field is 0 which is unexpected.\n"
+  "Was the 'perf record' command properly terminated?\n",
+  data->file.path);
+   return -EINVAL;
+   }
+
nr_attrs = f_header.attrs.size / f_header.attr_size;
lseek(fd, f_header.attrs.offset, SEEK_SET);
 


Re: perf: perf report stuck in an infinite loop

2019-07-29 Thread Vince Weaver
On Fri, 26 Jul 2019, Arnaldo Carvalho de Melo wrote:

> Em Fri, Jul 26, 2019 at 04:46:51PM -0400, Vince Weaver escreveu:
> > 
> > Currently the perf_data_fuzzer causes perf report to get stuck in an 
> > infinite loop.
> > 
> > From what I can tell, the issue happens in reader__process_events()
> > when an event is mapped using mmap(), but when it goes to process the
> > event finds out the internal event header has the size (invalidly) set to 
> > something much larger than the mmap buffer size.  This means 
> > fetch_mmaped_event() fails, which gotos remap: which tries again with
> > the exact same mmap size, and this will loop forever.
> > 
> > I haven't been able to puzzle out how to fix this, but maybe you have a 
> > better feel for what's going on here.
> 
> Perhaps the patch below?

yes, with the patch you provided I can no longer trigger the infinite 
loop.

Tested-by: Vince Weaver 


perf: perf report stuck in an infinite loop

2019-07-26 Thread Vince Weaver


Currently the perf_data_fuzzer causes perf report to get stuck in an 
infinite loop.

From what I can tell, the issue happens in reader__process_events()
when an event is mapped using mmap(), but when it goes to process the
event finds out the internal event header has the size (invalidly) set to 
something much larger than the mmap buffer size.  This means 
fetch_mmaped_event() fails, which gotos remap: which tries again with
the exact same mmap size, and this will loop forever.

I haven't been able to puzzle out how to fix this, but maybe you have a 
better feel for what's going on here.
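
One possible guard, sketched under assumptions and not taken from any patch in this thread: treat an event whose header claims a size larger than the mapping window as invalid and stop, since remapping with the same window size can never make progress. WINDOW_SIZE, the two-byte little-endian size prefix and walk_records() are all hypothetical stand-ins for the real event header and mmap logic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical window size standing in for the mmap buffer size. */
#define WINDOW_SIZE 64u

/*
 * Walk records whose first two bytes hold their total size (little
 * endian). Returns the number of records consumed, or -1 when a record
 * claims a size that can never fit in one window.
 */
static int walk_records(const uint8_t *data, size_t len)
{
    size_t off = 0;
    int n = 0;

    while (len - off >= 2) {
        size_t size = (size_t)data[off] | ((size_t)data[off + 1] << 8);

        if (size < 2 || size > WINDOW_SIZE)
            return -1;      /* invalid header: give up instead of retrying */
        if (size > len - off)
            break;          /* truncated tail: stop cleanly */
        off += size;
        n++;
    }
    return n;
}
```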

Vince


Re: [patch] perf report segfault with 0-sized strings

2019-07-25 Thread Vince Weaver


probably all perf_header_strings are affected by this.  The fuzzer just 
tripped up cmdline now, which needs this fix.

Signed-off-by: Vince Weaver 

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c24db7f4909c..631aa1911f3a 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1427,6 +1430,8 @@ static void print_cmdline(struct feat_fd *ff, FILE *fp)
 
fprintf(fp, "# cmdline : ");
 
+   if (ff->ph->env.cmdline_argv==NULL) return;
+
for (i = 0; i < nr; i++) {
char *argv_i = strdup(ff->ph->env.cmdline_argv[i]);
if (!argv_i) {


[patch] perf report segfault with 0-sized strings

2019-07-25 Thread Vince Weaver
Hello,

the perf_data_fuzzer found an issue when strings have size 0.
malloc() in do_read_string() is happy to allocate a string of 
size 0 but when code (in this case the pmu parser) tries to work with 
those it will segfault.

Signed-off-by: Vince Weaver 

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c24db7f4909c..641129efa987 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -251,6 +252,9 @@ static char *do_read_string(struct feat_fd *ff)
if (do_read_u32(ff, &len))
return NULL;
 
+   if (len==0)
+   return NULL;
+
buf = malloc(len);
if (!buf)
return NULL;
@@ -1781,6 +1785,10 @@ static void print_pmu_mappings(struct feat_fd *ff, FILE *fp)
str = ff->ph->env.pmu_mappings;
 
while (pmu_num) {
+
+   if (str==NULL)
+   goto error;
+
type = strtoul(str, &tmp, 0);
if (*tmp != ':')
goto error;


[patch] perf.data documentation has wrong units for memory size

2019-07-25 Thread Vince Weaver


The perf.data-file-format documentation incorrectly says the 
HEADER_TOTAL_MEM results are in bytes.  The results are in kilobytes
(perf reads the value from /proc/meminfo)

Signed-off-by: Vince Weaver 

diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index 5f54feb19977..d030c87ed9f5 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -126,7 +126,7 @@ vendor,family,model,stepping. For example: GenuineIntel,6,69,1
 
HEADER_TOTAL_MEM = 10,
 
-An uint64_t with the total memory in bytes.
+An uint64_t with the total memory in kilobytes.
 
HEADER_CMDLINE = 11,
 


[patch] perf tool buffer overflow in perf_header__read_build_ids

2019-07-23 Thread Vince Weaver
Hello

my perf_tool_fuzzer has found another issue, this one a buffer overflow
in perf_header__read_build_ids.  The build id filename is read in with a 
filename length read from the perf.data file, but this can be longer than
PATH_MAX which will smash the stack.

This might not be the right fix, not sure if filename should be NUL
terminated or not.

Signed-off-by: Vince Weaver 

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c24db7f4909c..9a893a26e678 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2001,6 +2001,9 @@ static int perf_header__read_build_ids(struct perf_header *header,
perf_event_header__bswap(&bev.header);
 
len = bev.header.size - sizeof(bev);
+
+   if (len>PATH_MAX) len=PATH_MAX;
+
if (readn(input, filename, len) != len)
goto out;
/*


[patch] perf tool divide by zero error if f_header.attr_size==0

2019-07-23 Thread Vince Weaver
Hello

so I have been having lots of trouble with hand-crafted perf.data files 
causing segfaults and the like, so I have started fuzzing the perf tool.

First issue found:

If f_header.attr_size is 0 in the perf.data file, then perf will crash
with a divide-by-zero error.

Signed-off-by: Vince Weaver 

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c24db7f4909c..26df60ee9460 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3559,6 +3559,10 @@ int perf_session__read_header(struct perf_session *session)
   data->file.path);
}
 
+   if (f_header.attr_size == 0) {
+   return -EINVAL;
+   }
+
nr_attrs = f_header.attrs.size / f_header.attr_size;
lseek(fd, f_header.attrs.offset, SEEK_SET);
 


Re: WARNING in perf_reg_value

2019-06-19 Thread Vince Weaver
On Wed, 19 Jun 2019, syzbot wrote:

> syzbot found the following crash on:
> 
> HEAD commit:0011572c Merge branch 'for-5.2-fixes' of git://git.kernel...
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=12c38d66a0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
> dashboard link: https://syzkaller.appspot.com/bug?extid=10189b9b0f8c4664badd
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1434b876a0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10e6c876a0

the perf_fuzzer found this issue about a month ago, and patches were 
posted that fixed the issue (I've been unable to reproduce when running 
with a patched kernel).

Any reason they haven't been applied?

Vince


Re: [PATCH V2 1/3] perf/x86: Disable non generic regs for software/probe events

2019-05-28 Thread Vince Weaver
On Tue, 28 May 2019, Peter Zijlstra wrote:

> On Tue, May 28, 2019 at 09:33:40AM -0400, Liang, Kan wrote:
> > Uncore PMU doesn't support sampling. It will return -EINVAL.
> > There is no regs support for counting. The request will be ignored.
> > 
> > I think current check for uncore is good enough.
> 
> breakpoints then.. There's also no guarantee you covered all software
> events, and the core rewrite will allow other per-task/sampling PMUs
> too.

possibly related, even with the patches applied, the skylake machine 
eventually did still crash while fuzzing:

[133621.333101] BUG: unable to handle page fault for address: 000100c8
[133621.333102] #PF: supervisor read access in kernel mode
[133621.333103] #PF: error_code(0x) - not-present page
[133621.333104] PGD 0 P4D 0 
[133621.333106] Oops:  [#1] SMP PTI
[133621.333108] CPU: 4 PID: 22203 Comm: perf_fuzzer Tainted: GW 
5.2.0-rc1+ #39
[133621.333109] Hardware name: LENOVO 10FY0017US/SKYBAY, BIOS FWKT53A   
06/06/2016
[133621.333109] RIP: 0010:perf_reg_value+0x1e/0x50
[133621.333111] Code: 00 48 b8 00 00 00 00 ff ff ff ff c3 0f 1f 44 00 00 8d 46 
e0 83 f8 1f 77 1d 48 8b 97 a8 00 00 00 31 c0 48 85 d2 74 0e 48 63 f6 <48> 8b 84 
f2 00 ff ff ff c3 31 c0 c3 83 fe 17 77 16 48 63 f6 8b 04
[133621.333112] RSP: :fe0d5a80 EFLAGS: 00010006
[133621.333113] RAX:  RBX: 0039 RCX: 
0039
[133621.333114] RDX: 0001 RSI: 0039 RDI: 
fe0d5c88
[133621.333115] RBP: fe0d5b38 R08:  R09: 

[133621.333116] R10: bff0 R11: 0012 R12: 
fe0d5c88
[133621.333117] R13: 99883253ed10 R14: 0050 R15: 

[133621.333118] FS:  7fb9741d3540() GS:998835b0() 
knlGS:
[133621.333119] CS:  0010 DS:  ES:  CR0: 80050033
[133621.333119] CR2: 000100c8 CR3: 0002c004 CR4: 
003607e0
[133621.333120] DR0:  DR1:  DR2: 

[133621.333121] DR3:  DR6: fffe0ff0 DR7: 
0600
[133621.333122] Call Trace:
[133621.333122]  
[133621.333123]  perf_output_sample_regs+0x43/0xa0
[133621.333124]  perf_output_sample+0x3aa/0x7a0
[133621.333125]  perf_event_output_forward+0x53/0x80
[133621.333125]  __perf_event_overflow+0x52/0xf0
[133621.333126]  handle_pmi_common+0x1b3/0x240
[133621.333127]  ? visit_groups_merge+0xeb/0x180
[133621.333127]  ? native_write_msr+0xb/0x20
[133621.333128]  ? native_write_msr+0x1a/0x20
[133621.333129]  ? native_write_msr+0xc/0x20
[133621.333129]  ? intel_pmu_lbr_read+0x29f/0x3d0
[133621.333130]  ? intel_pmu_lbr_filter+0x7f/0x1f0
[133621.333131]  intel_pmu_handle_irq+0xbf/0x160
[133621.333132]  perf_event_nmi_handler+0x2d/0x50
[133621.333132]  nmi_handle+0x63/0x110
[133621.333133]  default_do_nmi+0x4e/0x100
[133621.333134]  do_nmi+0x14d/0x1b0
[133621.333134]  end_repeat_nmi+0x16/0x50
[133621.333135] RIP: 0010:visit_groups_merge+0xeb/0x180
[133621.333137] Code: c0 75 73 48 8d 7b 30 e8 c3 de 55 00 48 85 c0 0f 84 9a 00 
00 00 48 89 c2 48 83 ea 30 74 10 8b b3 74 02 00 00 39 b0 44 02 00 00 <49> 0f 45 
d7 48 89 55 00 48 8b 04 24 48 8b 5c 24 08 48 85 c0 48 89
[133621.333137] RSP: :b28f4c897e10 EFLAGS: 0046
[133621.333139] RAX: 998833315830 RBX: 99882a17a000 RCX: 
0001
[133621.333140] RDX: 998833315800 RSI: 0004 RDI: 
99882a17a030
[133621.333140] RBP: b28f4c897e18 R08:  R09: 
998835b26a80
[133621.333141] R10: 99882a17a000 R11: 0001 R12: 
b6bb93b0
[133621.333142] R13: b28f4c897e68 R14: b28f4c897e10 R15: 

[133621.333143]  ? __perf_event_disable+0x160/0x160
[133621.333144]  ? visit_groups_merge+0xeb/0x180
[133621.333144]  ? visit_groups_merge+0xeb/0x180
[133621.333145]  
[133621.333145]  ctx_sched_in+0xb7/0x180
[133621.333146]  __perf_event_task_sched_in+0x16e/0x1c0
[133621.333147]  ? __switch_to_asm+0x40/0x70
[133621.333147]  ? __switch_to_asm+0x34/0x70
[133621.333148]  ? __switch_to_asm+0x40/0x70
[133621.333149]  ? __switch_to_asm+0x34/0x70
[133621.333149]  finish_task_switch+0xcd/0x270
[133621.333150]  schedule_tail+0xb/0x50
[133621.333151]  ret_from_fork+0x8/0x40
[133621.333151] Modules linked in: intel_rapl x86_pkg_temp_thermal 
intel_powerclamp coretemp kvm irqbypass snd_hda_codec_hdmi 
snd_hda_codec_realtek snd_hda_codec_generic crct10dif_pclmul crc32_pclmul 
ledtrig_audio ghash_clmulni_intel snd_hda_intel snd_hda_codec aesni_intel 
snd_hda_core aes_x86_64 crypto_simd snd_hwdep cryptd snd_pcm mei_me glue_helper 
snd_timer snd sg mei iTCO_wdt iTCO_vendor_support soundcore wmi_bmof tpm_tis 
evdev tpm_tis_core acpi_pad tpm rng_core pcspkr pcc_cpufreq fuse parport_pc 
sunrpc ppdev lp parport ip_tables x_tables autofs4 ext4 crc32c_generic crc16 
mbcache jbd2 sr_mod sd_mod cdrom i915 i2c_algo_bit ahci libahci crc32c_intel 

Re: [PATCH 1/2] perf/x86: Disable non generic regs for software/probe events

2019-05-24 Thread Vince Weaver


I've run the fuzzer overnight with both patches applied and have not seen 
any issues.

Vince


Re: perf: fuzzer causes crash in new XMM code

2019-05-23 Thread Vince Weaver
On Wed, 22 May 2019, Liang, Kan wrote:

> XMM registers can only be collected by hardware PEBS events. We should disable it
> for all software/probe events.
> 
> Could you please try the patch as below?
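
A minimal sketch of the kind of check described above, assuming the
register layout from the x86 uapi perf_regs.h header, where the XMM
indices occupy 32 (PERF_REG_X86_XMM0) up to but not including 64
(PERF_REG_X86_XMM_MAX).  The helper names and the is_hw_pebs flag are
hypothetical and are not taken from the actual patch; the idea is only
that a sample_regs mask asking for XMM bits is refused for anything that
is not a hardware PEBS event, so perf_reg_value() never takes the
container_of() path with a bare pt_regs.

```c
#include <stdint.h>
#include <errno.h>

/* Assumed to match arch/x86/include/uapi/asm/perf_regs.h */
#define XMM0_IDX     32
#define XMM_MAX_IDX  64

/* Bits of a sample_regs mask that select XMM registers (bits 32..63). */
static uint64_t xmm_regs_mask(void)
{
	return ~0ULL << XMM0_IDX;
}

/*
 * Hypothetical validation helper: returns 0 if the mask is acceptable,
 * -EINVAL if a non-PEBS event asks for XMM registers.
 */
static int validate_sample_regs(uint64_t mask, int is_hw_pebs)
{
	if (!is_hw_pebs && (mask & xmm_regs_mask()))
		return -EINVAL;
	return 0;
}
```

With this shape, a software or probe event that requests only the
general-purpose registers still passes, while one that sets any bit at or
above 32 is rejected at event creation rather than at sample time.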

I tested the patch (it was whitespace damaged for some reason, not 
sure if that was on my end though).

Running a few hours here, it hasn't crashed, but it did produce this 
warning which appears to map to:
	if (WARN_ON_ONCE(idx >= ARRAY_SIZE(pt_regs_offset)))
		return 0;

[ 5552.352046] WARNING: CPU: 1 PID: 11469 at arch/x86/kernel/perf_regs.c:76 
perf_reg_value+0x45/0x50
[ 5552.352083] CPU: 1 PID: 11469 Comm: perf_fuzzer Tainted: GW 
5.2.0-rc1+ #38
[ 5552.352084] Hardware name: LENOVO 10FY0017US/SKYBAY, BIOS FWKT53A   
06/06/2016
[ 5552.352086] RIP: 0010:perf_reg_value+0x45/0x50
[ 5552.352089] Code: 48 63 f6 48 8b 84 f2 00 ff ff ff c3 31 c0 c3 83 fe 17 77 
16 48 63 f6 8b 04 b5 60 a4 a0 8f 3d a0 00 00 00 77 e7 48 8b 04 07 c3 <0f> 0b 31 
c0 c3 66 0f 1f 44 00 00 0f 1f 44 00 00 48 85 ff 74 12 81
[ 5552.352090] RSP: :a13005f23bd8 EFLAGS: 00010206
[ 5552.352092] RAX: fffd RBX: 001d RCX: 001d
[ 5552.352093] RDX: 001d RSI: 001d RDI: a13005f23f58
[ 5552.352094] RBP: a13005f23c90 R08:  R09: 00029340
[ 5552.352096] R10: 114c8f85b92c R11: 0001 R12: a13005f23f58
[ 5552.352097] R13: 8cc1aa3f4030 R14: 0030 R15: 
[ 5552.352098] FS:  7f2154803540() GS:8cc1b5a4() 
knlGS:
[ 5552.352100] CS:  0010 DS:  ES:  CR0: 80050033
[ 5552.352101] CR2: 7ffe0cb894f8 CR3: 00022d87e006 CR4: 003607e0
[ 5552.352102] DR0:  DR1:  DR2: 
[ 5552.352103] DR3:  DR6: fffe0ff0 DR7: 0600
[ 5552.352104] Call Trace:
[ 5552.352109]  perf_output_sample_regs+0x43/0xa0
[ 5552.352115]  perf_output_sample+0x3aa/0x7a0
[ 5552.352119]  perf_event_output_forward+0x53/0x80
[ 5552.352123]  __perf_event_overflow+0x52/0xf0
[ 5552.352126]  perf_swevent_overflow+0x99/0xc0
[ 5552.352128]  ___perf_sw_event+0xe7/0x120
[ 5552.352131]  ? ptep_set_access_flags+0x23/0x30
[ 5552.352134]  ? do_wp_page+0x2c5/0x5c0
[ 5552.352136]  ? __handle_mm_fault+0xba8/0x1220
[ 5552.352140]  ? _cond_resched+0x15/0x30
[ 5552.352143]  ? handle_mm_fault+0xc2/0x1d0
[ 5552.352146]  ? __do_page_fault+0x268/0x4f0
[ 5552.352149]  ? page_fault+0x8/0x30
[ 5552.352151]  __perf_sw_event+0x55/0xa0
[ 5552.352154]  page_fault+0x1e/0x30
[ 5552.352157] RIP: 0033:0x55e5a5ec6494
[ 5552.352159] Code: b8 00 00 00 00 e8 bc 2c ff ff 48 8b 05 c5 a2 22 00 48 83 
c0 01 48 89 05 ba a2 22 00 90 c9 c3 55 48 89 e5 48 81 ec 10 24 00 00  07 30 
ff ff 89 c2 89 d0 c1 f8 1f c1 e8 19 01 c2 83 e2 7f 29 c2
[ 5552.352160] RSP: 002b:7ffe0cb89500 EFLAGS: 00010206
[ 5552.352161] RAX: 0400 RBX: 000c RCX: 0008
[ 5552.352162] RDX: 55e5a5ecd9c0 RSI: 7ffe0cb8b8f4 RDI: 7f21547fc740
[ 5552.352163] RBP: 7ffe0cb8b910 R08: 7f21547fc1c4 R09: 7f21547fc240
[ 5552.352164] R10: 0001 R11: 0202 R12: 55e5a5eb94c0
[ 5552.352165] R13: 7ffe0cb8dd00 R14:  R15: 
[ 5552.352168] ---[ end trace 199d9be3b0c594ae ]---



perf: fuzzer causes crash in new XMM code

2019-05-22 Thread Vince Weaver


The perf fuzzer caused my skylake machine to crash hard with the trace at 
the end here.  (this is with current git)

It appears to be happening in new code introduced by:

commit 878068ea270ea82767ff1d26c91583263c81fba0
Author: Kan Liang 
Date:   Tue Apr 2 12:44:59 2019 -0700

perf/x86: Support outputting XMM registers


u64 perf_reg_value(struct pt_regs *regs, int idx)
{
	struct x86_perf_regs *perf_regs;

	if (idx >= PERF_REG_X86_XMM0 && idx < PERF_REG_X86_XMM_MAX) {
		perf_regs = container_of(regs, struct x86_perf_regs, regs);
===>		if (!perf_regs->xmm_regs)
			return 0;
		return perf_regs->xmm_regs[idx - PERF_REG_X86_XMM0];
	}


[ 9679.952236] BUG: stack guard page was hit at a58f0e2f (stack is 
7d0772c9..938c7501)
[ 9679.962289] kernel stack overflow (page fault):  [#1] SMP PTI
[ 9679.968575] CPU: 1 PID: 18831 Comm: perf_fuzzer Tainted: GW 
5.2.0-rc1 #37
[ 9679.976966] Hardware name: LENOVO 10FY0017US/SKYBAY, BIOS FWKT53A   
06/06/2016
[ 9679.984325] RIP: 0010:perf_reg_value+0xd/0x50
[ 9679.988799] Code: 45 14 48 83 c3 20 4c 39 e3 75 c3 5b 5d 41 5c 41 5d 41 5e 
c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8d 46 e0 83 f8 1f 77 1d <48> 8b 97 
a8 00 00 00 31 c0 48 85 d2 74 0e 48 63 f6 48 8b 84 f2 00
[ 9680.008003] RSP: :ba6000dd0bc0 EFLAGS: 00010097
[ 9680.013339] RAX: 0001 RBX: 0021 RCX: 0021
[ 9680.020658] RDX: 0021 RSI: 0021 RDI: ba6008d2ff58
[ 9680.027952] RBP: ba6000dd0c78 R08:  R09: 
[ 9680.035262] R10: bff0 R11: 0005 R12: ba6008d2ff58
[ 9680.042564] R13: 94a5ebde48b0 R14: 0030 R15: 
[ 9680.049830] FS:  7fccbb62e540() GS:94a5f5a4() 
knlGS:
[ 9680.058069] CS:  0010 DS:  ES:  CR0: 80050033
[ 9680.063934] CR2: ba6008d3 CR3: 00022b7a8006 CR4: 003606e0
[ 9680.071227] DR0:  DR1: 81007f80 DR2: 
[ 9680.078521] DR3:  DR6: fffe0ff0 DR7: 0600
[ 9680.085831] Call Trace:
[ 9680.088301]  
[ 9680.090363]  perf_output_sample_regs+0x43/0xa0
[ 9680.094928]  perf_output_sample+0x3aa/0x7a0
[ 9680.099181]  perf_event_output_forward+0x53/0x80
[ 9680.103917]  __perf_event_overflow+0x52/0xf0
[ 9680.108266]  ? perf_trace_run_bpf_submit+0xc0/0xc0
[ 9680.113108]  perf_swevent_hrtimer+0xe2/0x150
[ 9680.117475]  ? check_preempt_wakeup+0x181/0x230
[ 9680.122091]  ? check_preempt_curr+0x62/0x90
[ 9680.126361]  ? ttwu_do_wakeup+0x19/0x140
[ 9680.130355]  ? try_to_wake_up+0x54/0x460
[ 9680.134366]  ? reweight_entity+0x15b/0x1a0
[ 9680.138559]  ? __queue_work+0x103/0x3f0
[ 9680.142472]  ? update_dl_rq_load_avg+0x1cd/0x270
[ 9680.147194]  ? timerqueue_del+0x1e/0x40
[ 9680.151092]  ? __remove_hrtimer+0x35/0x70
[ 9680.155191]  __hrtimer_run_queues+0x100/0x280
[ 9680.159658]  hrtimer_interrupt+0x100/0x220
[ 9680.163835]  smp_apic_timer_interrupt+0x6a/0x140
[ 9680.168555]  apic_timer_interrupt+0xf/0x20
[ 9680.172756]  
[ 9680.174905] RIP: 0033:0x55dad77a9927
[ 9680.178575] Code: 00 00 00 48 89 d1 31 c0 48 89 f2 89 fe bf 41 01 00 00 e9 
4c 09 ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 31 c9 b9 1f a1 07 00  c9 75 
fc 31 c0 c3 66 90 48 8b 05 c9 96 00 00 48 89 44 24 f8 b9
[ 9680.197779] RSP: 002b:7fff595603a8 EFLAGS: 0206 ORIG_RAX: 
ff13
[ 9680.205489] RAX: 4985 RBX: 000c RCX: 000365ca
[ 9680.212748] RDX: 1e15d36cec84 RSI:  RDI: 0001
[ 9680.220059] RBP: 7fff595603c0 R08:  R09: 7fccbb62e540
[ 9680.227362] R10: fd4e R11: 0246 R12: 55dad779a4c0
[ 9680.234630] R13: 7fff595627b0 R14:  R15: 
[ 9680.310017] ---[ end trace 511b9368cf14c65a ]---



Re: [tip:perf/core] perf/x86/intel: Force resched when TFA sysctl is modified

2019-04-16 Thread Vince Weaver
On Tue, 16 Apr 2019, tip-bot for Stephane Eranian wrote:

> Commit-ID:  f447e4eb3ad1e60d173ca997fcb2ef2a66f12574
> Gitweb: 
> https://git.kernel.org/tip/f447e4eb3ad1e60d173ca997fcb2ef2a66f12574
> Author: Stephane Eranian 
> AuthorDate: Mon, 8 Apr 2019 10:32:52 -0700
> Committer:  Ingo Molnar 
> CommitDate: Tue, 16 Apr 2019 12:19:35 +0200
> 
> perf/x86/intel: Force resched when TFA sysctl is modified

What's TFA?  Tuna-fish-alarm?  Nowhere in the commit or in the code does 
it ever say what a TFA is or why we'd want to resched when it is modified.

Vince


Re: perf: perf_fuzzer crashes on Pentium 4 systems

2019-04-09 Thread Vince Weaver
On Sun, 7 Apr 2019, Cyrill Gorcunov wrote:

> Vince, could you please disable alias events and see if it change
> anything, once you have time? Note once we've aliases disabled the
> counter for cpu cycles get used for NMI watchdog so PERF_COUNT_HW_CPU_CYCLES
> won't be available in "perf" tool itself, but I guess perf_fuzzer uses
> direct kernel syscall.
> ---
>  arch/x86/events/intel/p4.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> Index: linux-tip.git/arch/x86/events/intel/p4.c
> ===
> --- linux-tip.git.orig/arch/x86/events/intel/p4.c
> +++ linux-tip.git/arch/x86/events/intel/p4.c
> @@ -622,6 +622,8 @@ static u64 p4_get_alias_event(u64 config
>   u64 config_match;
>   int i;
>  
> + return 0;
> +
>   /*
>* Only event with special mark is allowed,
>* we're to be sure it didn't come as malformed
> 

It still crashes at the same place with this patch and my reproducible 
test case.

Vince




Re: perf: perf_fuzzer crashes on Pentium 4 systems

2019-04-04 Thread Vince Weaver
On Thu, 4 Apr 2019, Cyrill Gorcunov wrote:

> On Thu, Apr 04, 2019 at 12:37:18PM -0400, Vince Weaver wrote:
> 
> Oh, Vince, I suspect such kind of bisection might consume a lot of your
> time :( Maybe we could update perf fuzzer so that it would send events
> to some net-storage first then write them to the counters, iow to automatize
> this all stuff somehow?

I do have a lot of this automated already from tracking down past bugs, 
but it turns out that most of the fuzzer-found bugs aren't deterministic 
so it doesn't always work.

For example this bug, while I can easily repeat it, doesn't happen at 
the same time each time.  I suspect something corrupts things, but the
crash doesn't trigger until a context switch happens.

For what it's worth I've put code in p4_pmu_enable_all() to see what's 
going on when the NULL dereference happens, and sure enough the printk is 
triggered where I'd expect.

[  138.132889] VMW: p4_pmu_enable_all: idx 4 is NULL
[  138.171380] VMW: p4_pmu_enable_all: idx 4 is NULL
[  138.212588] VMW: p4_pmu_enable_all: idx 4 is NULL
[  138.263761] VMW: p4_pmu_enable_all: idx 4 is NULL
[  138.279944] VMW: p4_pmu_enable_all: idx 4 is NULL

static void p4_pmu_enable_all(int added)
{
	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
	int idx;

	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
		struct perf_event *event = cpuc->events[idx];
		if (!test_bit(idx, cpuc->active_mask))
			continue;
		if (event==NULL) {
			printk("VMW: p4_pmu_enable_all: idx %d is NULL\n",idx);
		} else {
			p4_pmu_enable_event(event);
		}
	}
}


the machine still crashes after this, but not right away.

Vince


Re: perf: perf_fuzzer crashes on Pentium 4 systems

2019-04-04 Thread Vince Weaver
On Thu, 4 Apr 2019, Cyrill Gorcunov wrote:

> On Thu, Apr 04, 2019 at 09:25:47AM -0400, Vince Weaver wrote:
> > 
> > It looks like there are at least two bugs here, one that's a full 
> > hardlockup with nothing on serial console.  The other is the NULL 
> > dereference.

OK, it turns out the hard-lock and the null pointer dereference might be 
the same, I have a random seed for the fuzzer from a hard-lock crash that 
reproduces and it generated the null pointer crash.  (This is with your 
patch applied).

I can try to see if I can bisect down to a specific event sequence that 
triggers this, but that can be tricky sometimes if things lock up so fast 
that the event log doesn't get written out before the crash.

Vince




Re: perf: perf_fuzzer crashes on Pentium 4 systems

2019-04-04 Thread Vince Weaver
On Wed, 3 Apr 2019, Cyrill Gorcunov wrote:

> On Wed, Apr 03, 2019 at 10:19:44PM +0300, Cyrill Gorcunov wrote:
> > 
> > You know, seems I got what happened -- p4_general_events do
> > not cover all general events, they stop at PERF_COUNT_HW_BUS_CYCLES,
> > while more 3 general event left. This is 'cause I've not been following
> > pmu evolution in code. I will try to cover this events hopefully more
> > less soon and send you a patch to test (if you don't mind).
> 
> Still this should not cause nil deref, continue investigating. Vince
> could you please apply the patch below, I doubt if it help with nil
> issue but worth having anyway
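
To illustrate the gap described in the quoted text: a sketch with a
stand-in event table, assuming the generic hardware event IDs from the
perf uapi header (0 through 9, with BUS_CYCLES at 6).  The table name and
its values are placeholders, not the real p4_general_events; the point is
only that a lookup has to be bounded by the size of the table rather than
by the number of generic events the core knows about.

```c
#include <stdint.h>
#include <stddef.h>

/* Generic hardware event IDs, mirroring enum perf_hw_id in the uapi header. */
enum hw_id {
	HW_CPU_CYCLES, HW_INSTRUCTIONS, HW_CACHE_REFERENCES, HW_CACHE_MISSES,
	HW_BRANCH_INSTRUCTIONS, HW_BRANCH_MISSES, HW_BUS_CYCLES,
	HW_STALLED_CYCLES_FRONTEND, HW_STALLED_CYCLES_BACKEND,
	HW_REF_CPU_CYCLES, HW_MAX
};

/* Stand-in table that stops at BUS_CYCLES; the values are placeholders. */
static const uint64_t general_events[] = {
	[HW_CPU_CYCLES]          = 0x1,
	[HW_INSTRUCTIONS]        = 0x2,
	[HW_CACHE_REFERENCES]    = 0x3,
	[HW_CACHE_MISSES]        = 0x4,
	[HW_BRANCH_INSTRUCTIONS] = 0x5,
	[HW_BRANCH_MISSES]       = 0x6,
	[HW_BUS_CYCLES]          = 0x7,
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Returns the event config, or 0 if the table does not cover this ID. */
static uint64_t event_map(unsigned int hw_event)
{
	if (hw_event >= ARRAY_SIZE(general_events))
		return 0;
	return general_events[hw_event];
}
```

The three IDs after BUS_CYCLES fall outside the table, so the bounded
lookup reports them as unsupported instead of reading past the end.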


It looks like there are at least two bugs here, one that's a full 
hardlockup with nothing on serial console.  The other is the NULL 
dereference.

Just ran with your patch applied and it hit the hard lockup case.

I'll have to see if things are reproducible and I can try to see if I can 
get a reproducible value for what event caused the issue.  perf_fuzzer has 
some infrastructure for determining that but it's hit or miss if you can 
get anything useful from it.

I'll keep running things, but I'm a bit busy at work here the next few 
days so there might be some delay in the results.

Vince


perf: perf_fuzzer crashes on Pentium 4 systems

2019-04-03 Thread Vince Weaver


so moving this to its own thread.

There was a two-part question asked.
1. Can the perf-fuzzer crash a Pentium 4 system?
2. Does anyone care anymore?

The answer to #1 turns out to be "yes"
I'm not sure about #2 (but it's telling my p4 test system hadn't been 
turned on in over 3 years).

In any case the perf_fuzzer can crash my p4 system within an hour or so.  
The debugging from this isn't great, I forget what the preferred debug 
things to enable in the kernel hacking menu are.

Here is one crash that just happened:

The instruction at RIP is unhelpfully
./arch/x86/include/asm/processor.h:400
which is
DECLARE_PER_CPU_FIRST(union irq_stack_union, irq_stack_union) __visible;

Though looking at the assembly it looks like
p4_pmu_enable_event() is called with NULL as the parameter.

[ 1930.122902] BUG: unable to handle kernel NULL pointer dereference at 
0158
[ 1930.130715] #PF error: [normal kernel read fault]
[ 1930.135402] PGD 0 P4D 0 
[ 1930.137928] Oops:  [#1] SMP PTI
[ 1930.141405] CPU: 0 PID: 30179 Comm: perf_fuzzer Tainted: GW 
5.1.0-rc3+ #6
[ 1930.149555] Hardware name: LENOVO 88088NU/LENOVO, BIOS 2JKT37AUS 07/12/2007
[ 1930.156497] RIP: 0010:p4_pmu_enable_event+0x10/0x160
[ 1930.161443] Code: 89 f0 0f 30 31 c0 8b 15 e6 2e 0f 01 85 d2 7f 01 c3 89 c2 
89 cf e9 70 65 3b 00 0f 1f 44 00 00 41 56 41 55 41 54 49 89 fc 55 53 <48> 8b 9f 
58 01 00 00 48 89 dd 48 89 da 48 c1 ed 20 48 c1 ea 3f 89
[ 1930.180155] RSP: 0018:c90001f57d50 EFLAGS: 00010017
[ 1930.185361] RAX:  RBX: 000c RCX: 0360
[ 1930.192472] RDX:  RSI: 0400 RDI: 
[ 1930.199582] RBP: 88803e40f620 R08:  R09: 000c
[ 1930.206691] R10: 4801fefc R11: 800fce030200 R12: 
[ 1930.213802] R13: 888035c4a0c0 R14: 88803e429300 R15: 0402
[ 1930.220913] FS:  7ff3b934a540() GS:88803e40() 
knlGS:
[ 1930.228976] CS:  0010 DS:  ES:  CR0: 80050033
[ 1930.234700] CR2: 0158 CR3: 3a72e000 CR4: 07f0
[ 1930.241811] DR0:  DR1:  DR2: 
[ 1930.248921] DR3:  DR6: 0ff0 DR7: 0600
[ 1930.256030] Call Trace:
[ 1930.258472]  p4_pmu_enable_all+0x3c/0x50
[ 1930.262384]  __perf_event_task_sched_in+0x174/0x1a0
[ 1930.267247]  ? __switch_to_asm+0x34/0x70
[ 1930.271155]  ? __switch_to_asm+0x40/0x70
[ 1930.275064]  ? __switch_to_asm+0x34/0x70
[ 1930.278971]  ? __switch_to_asm+0x40/0x70
[ 1930.282882]  finish_task_switch+0x10a/0x290
[ 1930.287053]  __schedule+0x207/0x800
[ 1930.290530]  ? event_function_call+0x85/0x100
[ 1930.294873]  ? ctx_resched+0xc0/0xc0
[ 1930.298437]  preempt_schedule_common+0xa/0x20
[ 1930.302777]  _cond_resched+0x1d/0x30
[ 1930.306340]  mutex_lock+0xe/0x30
[ 1930.309558]  perf_event_ctx_lock_nested.isra.89+0x46/0x90
[ 1930.314939]  ? _perf_event_disable+0x40/0x40
[ 1930.319193]  perf_event_task_enable+0x3f/0xa0
[ 1930.323537]  __x64_sys_prctl+0x1b2/0x560
[ 1930.327448]  do_syscall_64+0x4f/0xf0
[ 1930.331011]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1930.336045] RIP: 0033:0x7ff3b928240a



Re: [RFC PATCH v3 0/3] x86/perf/amd: AMD PMC counters and NMI latency

2019-04-03 Thread Vince Weaver
On Wed, 3 Apr 2019, Cyrill Gorcunov wrote:

> > Shame on Intel though for not providing perf JSON files for the 
> > Pentium 4 event names.
> 
> Mind to point me where json events should lay, I could try to convert
> names.

I was mostly joking about that.  But the event lists are in the kernel 
tree in
tools/perf/pmu-events/arch/x86/
I don't think anything older than Nehalem is included.

After letting the fuzzer run a bit longer I did manage to get it 
to hard-lock with no messages in the log, though eventually while I was 
fiddling with alt-sysrq over serial port did get this to trigger.
Though if I'm going to start reporting p4 crashes I should start a 
separate thread.

[ 2352.198361] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! 
[perf_fuzzer:27005]
[ 2352.257304] CPU: 1 PID: 27005 Comm: perf_fuzzer Tainted: GW 
5.1.0-rc3+ #6
[ 2352.265458] Hardware name: LENOVO 88088NU/LENOVO, BIOS 2JKT37AUS 07/12/2007
[ 2352.272407] RIP: 0010:smp_call_function_single+0xc9/0xf0
[ 2352.277700] Code: 8b 4c 24 38 65 48 33 0c 25 28 00 00 00 75 34 c9 c3 48 89 
d1 48 89 f2 48 89 e6 e8 a2 fe ff ff 8b 54 24 18 83 e2 01 74 0b f3 90 <8b> 54 24 
18 83 e2 01 75 f5 eb ca 8b 05 06 c4 45 01 85 c0 75 88 0f
[ 2352.296415] RSP: 0018:c90004d0fb80 EFLAGS: 0202 ORIG_RAX: 
ff13
[ 2352.303961] RAX:  RBX: 888039962500 RCX: 88803e520d80
[ 2352.311071] RDX: 0001 RSI: c90004d0fb80 RDI: c90004d0fb80
[ 2352.318180] RBP: c90004d0fbc0 R08:  R09: 
[ 2352.325291] R10: 888036cc8010 R11: 888039978e98 R12: 8115ec70
[ 2352.332412] R13: 0001 R14: 88803aa15108 R15: 88803d5fdb70
[ 2352.339524] FS:  () GS:88803e50() 
knlGS:
[ 2352.347587] CS:  0010 DS:  ES:  CR0: 80050033
[ 2352.353313] CR2: 7f9d2e56d3b4 CR3: 0200e000 CR4: 06e0
[ 2352.360428] DR0:  DR1:  DR2: 
[ 2352.367540] DR3:  DR6: 0ff0 DR7: 0600
[ 2352.374649] Call Trace:
[ 2352.377104]  ? perf_cgroup_attach+0x70/0x70
[ 2352.381276]  ? slab_destroy+0xa5/0x120
[ 2352.385016]  ? perf_cgroup_attach+0x70/0x70
[ 2352.389186]  task_function_call+0x49/0x80
[ 2352.393186]  ? bpf_jit_compile+0x30/0x30
[ 2352.397095]  event_function_call+0x85/0x100
[ 2352.401265]  ? perf_swevent_hrtimer+0x150/0x150
[ 2352.405781]  perf_remove_from_context+0x20/0x60
[ 2352.410295]  perf_event_release_kernel+0x75/0x2e0
[ 2352.414983]  perf_release+0xc/0x10
[ 2352.418373]  __fput+0xaf/0x1f0
[ 2352.421425]  task_work_run+0x7e/0xa0
[ 2352.424990]  do_exit+0x2c6/0xb40
[ 2352.428213]  ? event_function_local.constprop.132+0xe0/0xe0
[ 2352.433767]  ? visit_groups_merge+0xcd/0x180
[ 2352.438027]  do_group_exit+0x3a/0xa0
[ 2352.441598]  get_signal+0x123/0x6c0
[ 2352.445080]  ? __perf_event_task_sched_in+0xed/0x1a0
[ 2352.450030]  do_signal+0x30/0x6a0
[ 2352.453334]  ? finish_task_switch+0x10a/0x290
[ 2352.457685]  ? __schedule+0x207/0x800
[ 2352.461336]  exit_to_usermode_loop+0x5d/0xc0
[ 2352.465593]  prepare_exit_to_usermode+0x53/0x80
[ 2352.470110]  retint_user+0x8/0x8



Re: [tip:perf/urgent] perf/x86/intel: Initialize TFA MSR

2019-04-03 Thread Vince Weaver
On Wed, 3 Apr 2019, Thomas Gleixner wrote:

> On Wed, 3 Apr 2019, tip-bot for Peter Zijlstra wrote:
> 
> > Commit-ID:  d7262457e35dbe239659e62654e56f8ddb814bed
> > Gitweb: 
> > https://git.kernel.org/tip/d7262457e35dbe239659e62654e56f8ddb814bed
> > Author: Peter Zijlstra 
> > AuthorDate: Thu, 21 Mar 2019 13:38:49 +0100
> > Committer:  Ingo Molnar 
> > CommitDate: Wed, 3 Apr 2019 11:40:32 +0200
> > 
> > perf/x86/intel: Initialize TFA MSR
> > 
> > Stephane reported that the TFA MSR is not initialized by the kernel,
> > but the TFA bit could set by firmware or as a leftover from a kexec,
> > which makes the state inconsistent.
> > 
> > Reported-by: Stephane Eranian 
> > Tested-by: Nelson DSouza 
> > Signed-off-by: Peter Zijlstra (Intel) 
> > Cc: Alexander Shishkin 
> > Cc: Arnaldo Carvalho de Melo 
> > Cc: Jiri Olsa 
> > Cc: Linus Torvalds 
> > Cc: Peter Zijlstra 
> > Cc: Thomas Gleixner 
> > Cc: Vince Weaver 
> > Cc: to...@suse.com
> > Link: 
> > https://lkml.kernel.org/r/20190321123849.gn6...@hirez.programming.kicks-ass.net
> > Signed-off-by: Ingo Molnar 
> 
> This lacks:
> 
>  1) Fixes tag
> 
>  2) Cc: stable 
> 
> Sigh.

It would also be nice to know what a "TFA" bit is without having to go 
find a copy of the Intel documentation.

Vince


Re: [RFC PATCH v3 0/3] x86/perf/amd: AMD PMC counters and NMI latency

2019-04-02 Thread Vince Weaver
On Tue, 2 Apr 2019, Cyrill Gorcunov wrote:

> You know, running fuzzer on p4 might worth in anycase. As to potential
> problems to fix -- i could try find some time slot for, still quite
> limited too 'cause of many other duties :(


Well I fired up the Pentium 4
/dev/sda1 has gone 1457 days without being checked

and eventually got it up to date (running "git pull" in a 5-year old Linux 
tree plus running "apt-get dist-upgrade" at the same time was maybe a 
mistake on a system with 1GB of RAM).

Anyway I have it fuzzing current git and surprisingly while it's hit a few 
WARNINGs and some NMI dazed+confused messages it hasn't actually crashed 
yet.  Not sure if I want to let it fuzz overnight if I'm not here though.

Shame on Intel though for not providing perf JSON files for the 
Pentium 4 event names.

Vince


Re: [RFC PATCH v3 0/3] x86/perf/amd: AMD PMC counters and NMI latency

2019-04-02 Thread Vince Weaver
On Tue, 2 Apr 2019, Cyrill Gorcunov wrote:
> On Tue, Apr 02, 2019 at 03:03:02PM +0200, Peter Zijlstra wrote:
> > I have vague memories of the P4 thing crashing with Vince's perf_fuzzer,
> > but maybe I'm wrong.
> 
> No, you're correct. p4 was crashing many times before we manage to make
> it more-less stable. The main problem though that to find working p4 box
> is really a problem.

I do have a functioning p4 system I can test on.
I can easily run the fuzzer and report crashes, but I only have limited 
time/skills to actually fix the problems it turns up.

One nice thing is that as of Linux 5.0 *finally* the fuzzer can run 
indefinitely on most modern Intel chips without crashing (still triggers a 
few warnings).  So finally we have the ability to tell when a new crash is 
a regression and potentially can bisect it.  Although obviously this 
doesn't necessarily apply to the p4.

I do think the number of people trying to run perf on a p4 is probably 
pretty small these days.

Vince


Re: System crash with perf_fuzzer (kernel: 5.0.0-rc3)

2019-02-02 Thread Vince Weaver
On Fri, 1 Feb 2019, Jiri Olsa wrote:

> > 
> > I've just started fuzzing with the patch applied.  Often it takes a few 
> > hours to trigger the bug.
> 
> cool, thanks

I let it run overnight and no crash.

> > Added question about this bug.  It appeared that the crash was triggered 
> > by the BTS driver over-writing kernel memory.  The data being written, was 
> > this user controllable?  Meaning, is this a security issue being fixed, or 
> > just a crashing issue?
> 
> yea, I have an example that can trigger it immediately

I mean: the crash is happening because data structures are getting 
over-written by the BTS driver.  Depending who and what is doing this, 
this could be a security issue (i.e. if it was raw BTS data that was 
partially userspace controlled values).  Though even if this were the case 
it would probably be hard to exploit.

Vince


Re: System crash with perf_fuzzer (kernel: 5.0.0-rc3)

2019-02-01 Thread Vince Weaver
On Fri, 1 Feb 2019, Jiri Olsa wrote:

> with attached patch I did not trigger the fuzzer crash
> for over a day now, could you guys try?

I've just started fuzzing with the patch applied.  Often it takes a few 
hours to trigger the bug.

Added question about this bug.  It appeared that the crash was triggered 
by the BTS driver over-writing kernel memory.  The data being written, was 
this user controllable?  Meaning, is this a security issue being fixed, or 
just a crashing issue?

Vince Weaver
vincent.wea...@maine.edu






Re: System crash with perf_fuzzer (kernel: 5.0.0-rc3)

2019-01-25 Thread Vince Weaver
On Fri, 25 Jan 2019, Ravi Bangoria wrote:

> I'm seeing a system crash while running perf_fuzzer with upstream kernel
> on an Intel machine. I hit the crash twice (out of which I don't have log
> of first crash so don't know if the reason is same for both the crashes).
> I've attached my .config with the mail.
>   type = PERF_TYPE_HARDWARE;



>   
> 
> And, I'm running fuzzer in a loop with *root*. (Let me know if running
> as root is harmful ;-) ).


There's a known issue related to Intel BTS events that you can trigger 
with the perf_fuzzer, even as a normal user.  I reported it a few months 
ago but I don't think it ever got resolved.  The traces you get look 
similar to some that you posted.

It's hard to track down as it doesn't seem to be a simple issue, but 
rather it looks like the BTS event handling is stomping over memory it 
shouldn't somehow.

Vince


Re: perf: rdpmc bug when viewing all procs on remote cpu

2019-01-18 Thread Vince Weaver
On Fri, 18 Jan 2019, Peter Zijlstra wrote:
> 
> You can actually use rdpmc when you attach to a CPU, but you have to
> ensure that the userspace component is guaranteed to run on that very
> CPU (sched_setaffinity(2) comes to mind).

unfortunately the HPC people using PAPI would probably be annoyed if we 
started binding their threads to cores out from under them.

> The best we could possibly do is put the (target, not current) cpu
> number in the mmap page; but userspace should already know this, for it
> created the event and therefore knows this already.

one other thing the kernel could do is just disable rdpmc (setting index 
to 0) in the case where the original perf_event_open() cpu parameter!=0

though that would stop the case where we were on the same CPU from 
working.

The issue is currently if you're not careful the rdpmc() interface will 
sometimes return plausible (but wrong) results for a cross-CPU rdpmc() 
call, even if you are properly falling back to read() on ->index being 0.
It's a bit surprising and it looks like it will take a decent amount of 
userspace code to work around the issue, which cuts into the low-overhead 
nature of rdpmc.
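
The kind of userspace guard this implies might look like the sketch
below.  It is not PAPI's code; the function and parameter names are
hypothetical, and the caller is assumed to supply the CPU it is currently
running on (for example from sched_getcpu()).  rdpmc is chosen only when
the mmap page advertises a nonzero index and, for an event attached to a
specific CPU, only when the caller is on that CPU.  The check is racy
unless the thread is also pinned, since it can migrate between the check
and the rdpmc.

```c
#include <stdint.h>

enum read_path { USE_READ_SYSCALL = 0, USE_RDPMC = 1 };

/*
 * index:       'index' field from the event's mmap page
 *              (0 = rdpmc not available)
 * event_cpu:   cpu argument originally passed to perf_event_open(),
 *              -1 if the event is not tied to a CPU
 * current_cpu: CPU the caller is running on right now
 */
static enum read_path choose_read_path(uint32_t index, int event_cpu,
				       int current_cpu)
{
	if (index == 0)
		return USE_READ_SYSCALL;
	if (event_cpu >= 0 && event_cpu != current_cpu)
		return USE_READ_SYSCALL;
	return USE_RDPMC;
}
```

The extra comparison is cheap, but obtaining current_cpu on every read is
exactly the sort of added cost that eats into the low-overhead path.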

If the answer is simply this is the way the kernel is going to do it, 
that's fine, I just have to add workarounds to PAPI and then get the 
perf_event_open() manpage updated to make sure people are aware of the 
issue.

Vince




Re: perf: rdpmc bug when viewing all procs on remote cpu

2019-01-18 Thread Vince Weaver
On Fri, 18 Jan 2019, Peter Zijlstra wrote:

> On Fri, Jan 11, 2019 at 04:52:22PM -0500, Vince Weaver wrote:
> > On Thu, 10 Jan 2019, Vince Weaver wrote:
> > 
> > > On Thu, 10 Jan 2019, Vince Weaver wrote:
> > > 
> > > > On Thu, 10 Jan 2019, Vince Weaver wrote:
> > > > 
> > > > > However if you create an all-process attached to CPU event:
> > > > >   perf_event_open(attr, -1, X, -1, 0);
> > > > > the mmap event index is set as if this were a valid event and so the 
> > > > > rdpmc
> > > > > succeeds even though it shouldn't (we're trying to read an event value
> > > > > on a remote cpu with a local rdpmc).
> > 
> > so on further looking at the code, it doesn't appear that rdpmc events are 
> > explicitly marked as unavailable in the attach-cpu or attach-pid case, 
> > it's just by luck the check for PERF_EVENT_STATE_ACTIVE catches most of 
> > the cases?
> > 
> > should an explicit check be added to zero out userpg->index in cases where 
> > the event being measured is running on a different core?
> 
> > And how would we know? We don't know what CPU will be observing the
> mmap().
> 
> RDPMC fundamentally only makes sense on 'self' (either task or CPU).

so is this a "don't do that then" thing and I should have PAPI 
userspace avoid using rdpmc() whenever a proc/cpu was attached to?

Or is there a way to have the kernel indicate this?  Does the kernel track 
current CPU and original CPU of the mmap and could zero out the index 
field in this case?  Or would that add too much overhead?

Vince


Re: perf: rdpmc bug when viewing all procs on remote cpu

2019-01-11 Thread Vince Weaver
On Thu, 10 Jan 2019, Vince Weaver wrote:

> On Thu, 10 Jan 2019, Vince Weaver wrote:
> 
> > On Thu, 10 Jan 2019, Vince Weaver wrote:
> > 
> > > However if you create an all-process attached to CPU event:
> > >   perf_event_open(attr, -1, X, -1, 0);
> > > the mmap event index is set as if this were a valid event and so the rdpmc
> > > succeeds even though it shouldn't (we're trying to read an event value
> > > on a remote cpu with a local rdpmc).

so on further looking at the code, it doesn't appear that rdpmc events are 
explicitly marked as unavailable in the attach-cpu or attach-pid case, 
it's just by luck the check for PERF_EVENT_STATE_ACTIVE catches most of 
the cases?

should an explicit check be added to zero out userpg->index in cases where 
the event being measured is running on a different core?

Vince


Re: perf: rdpmc bug when viewing all procs on remote cpu

2019-01-10 Thread Vince Weaver
On Thu, 10 Jan 2019, Vince Weaver wrote:

> On Thu, 10 Jan 2019, Vince Weaver wrote:
> 
> > However if you create an all-process attached to CPU event:
> > perf_event_open(attr, -1, X, -1, 0);
> > the mmap event index is set as if this were a valid event and so the rdpmc
> > succeeds even though it shouldn't (we're trying to read an event value
> > on a remote cpu with a local rdpmc).
> 
> For a test case, try the
>   tests/rdpmc/rdpmc_attach_other_cpu
> test found in my perf_event_tests suite
>   git clone https://github.com/deater/perf_event_tests

and that was a cut-and-paste error, I meant
tests/rdpmc/rdpmc_attach_global_cpu
and I was wrong, it does affect AMD machines too.

Vince


Re: perf: rdpmc bug when viewing all procs on remote cpu

2019-01-10 Thread Vince Weaver
On Thu, 10 Jan 2019, Vince Weaver wrote:

> However if you create an all-process attached to CPU event:
>   perf_event_open(attr, -1, X, -1, 0);
> the mmap event index is set as if this were a valid event and so the rdpmc
> succeeds even though it shouldn't (we're trying to read an event value
> on a remote cpu with a local rdpmc).

For a test case, try the
tests/rdpmc/rdpmc_attach_other_cpu
test found in my perf_event_tests suite
git clone https://github.com/deater/perf_event_tests

I can trigger it with current git on an intel machine, but not on an AMD 
machine.  Possibly because it is defaulting to one of the fixed counter 
slots?

Vince


perf: rdpmc bug when viewing all procs on remote cpu

2019-01-10 Thread Vince Weaver
Hello

I think this is a bug turned up by PAPI.  I've been trying to track down 
where this happens in the perf_event code myself, but it might be faster 
to just report it.

If you create a per-process attached to CPU event:
perf_event_open(attr, 0, X, -1, 0);
the mmap event index is set to "0" (not available) on all cores but the
current one so the rdpmc read code can properly fall back to read().

However if you create an all-process attached to CPU event:
perf_event_open(attr, -1, X, -1, 0);
the mmap event index is set as if this were a valid event and so the rdpmc
succeeds even though it shouldn't (we're trying to read an event value
on a remote cpu with a local rdpmc).

so I think somehow in the perf_event_open pid=-1 case rdpmc is not getting 
blocked properly...
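
For reference, the userspace read path being discussed looks roughly like
the sketch below.  It uses a cut-down stand-in for struct
perf_event_mmap_page (only the fields needed here) and takes the
counter-read primitive as a function pointer, with fake_pmc() standing in
for the rdpmc instruction so the logic can run anywhere; both are
simplifications for illustration.  An index of 0 means "not available" and
the caller has to fall back to read(); a nonzero index is the hardware
counter number plus one.

```c
#include <stdint.h>

/* Simplified stand-in for the fields of struct perf_event_mmap_page used below. */
struct mmap_page_view {
	volatile uint32_t lock;	/* seqlock, changes while the kernel updates the page */
	uint32_t index;		/* 0 = rdpmc unavailable, else counter number + 1 */
	int64_t  offset;	/* value to add to the raw counter */
	uint16_t pmc_width;	/* number of valid bits in the raw counter (1..64) */
};

typedef uint64_t (*pmc_read_fn)(uint32_t counter);

/* Stand-in for the rdpmc instruction, for illustration only. */
static uint64_t fake_pmc(uint32_t counter)
{
	return 1000 + counter;
}

/*
 * Returns 1 and stores the count in *out if the fast path was usable,
 * 0 if the caller has to fall back to read() on the event fd.
 * Real code should also check the cap_user_rdpmc bit of the mmap page.
 */
static int fast_read(const struct mmap_page_view *pc, pmc_read_fn read_pmc,
		     int64_t *out)
{
	uint32_t seq, idx;
	int64_t count, pmc;

	do {
		seq = pc->lock;
		__sync_synchronize();
		idx = pc->index;
		if (idx == 0)
			return 0;
		count = pc->offset;
		/* sign-extend the raw value from pmc_width bits */
		pmc = (int64_t)(read_pmc(idx - 1) << (64 - pc->pmc_width));
		pmc >>= 64 - pc->pmc_width;
		count += pmc;
		__sync_synchronize();
	} while (pc->lock != seq);

	*out = count;
	return 1;
}
```

Nothing in this loop can tell whether the counter named by index lives on
the CPU executing the rdpmc, which is why a nonzero index for a remote-CPU
event yields a plausible but wrong value.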

Vince



Re: perf: perf_fuzzer triggers GPF in perf_prepare_sample

2018-12-08 Thread Vince Weaver
On Thu, 6 Dec 2018, Jiri Olsa wrote:

> On Thu, Dec 06, 2018 at 10:35:28AM -0500, Vince Weaver wrote:
> > On Wed, 5 Dec 2018, Jiri Olsa wrote:
> > Maybe it is a corruption issue.  I had applied my own debug patch that 
> > would dump some info if data->callchain was NULL.
> > 
> > But my debug code didn't trigger this time because it looks like 
> > data->callchain was "1" rather than "0".
> > 
> > [27764.840179] BUG: unable to handle kernel NULL pointer dereference at 
> > 0001
> > [27764.840179] PGD 0 P4D 0 
> > [27764.840180] Oops:  [#1] SMP PTI
> > [27764.840180] CPU: 1 PID: 18687 Comm: perf_fuzzer Tainted: GW  
> >4.20.0-rc5+ #125
> > [27764.840180] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
> > 01/26/2014
> 
> actually, you could try that patch from my previous email?
> 
still crashes with your patch (see below)

I've also been able to replicate this crash on a skylake machine in 
addition to the haswell machine.

Vince

[28269.147232] BUG: unable to handle kernel NULL pointer dereference at 

[28269.155628] PGD 0 P4D 0 
[28269.158360] Oops:  [#1] SMP PTI
[28269.162087] CPU: 0 PID: 1189 Comm: perf_fuzzer Tainted: GW 
4.20.0-rc5+ #128
[28269.171011] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
01/26/2014
[28269.178935] RIP: 0010:perf_prepare_sample+0x82/0x4a0
[28269.184239] Code: 06 4c 89 ea 4c 89 e6 e8 3c 54 ff ff 40 f6 c5 01 0f 85 28 
01 00 00 40 f6 c5 20 74 1c 48 85 ed 0f 89 04 01 00 00 49 8b 44 24 70 <48> 8b 00 
8d 04 c5 08 00 00 00 66 01 43 06 f7 c5 00 04 00 00 74 41
[28269.204249] RSP: :c9000aca7a40 EFLAGS: 00010082
[28269.209832] RAX:  RBX: c9000aca7a98 RCX: c9000aca7ad8
[28269.217484] RDX:  RSI: c9000aca7b80 RDI: c9000aca7a9e
[28269.225129] RBP: 800bb068 R08: 0002 R09: 000215c0
[28269.232760] R10: 8880ce552000 R11:  R12: c9000aca7b80
[28269.240380] R13: 88803696c800 R14: c9000aca7ad8 R15: e8c06300
[28269.248014] FS:  7f5927fe7500() GS:88811aa0() 
knlGS:
[28269.256606] CS:  0010 DS:  ES:  CR0: 80050033
[28269.262739] CR2:  CR3: 000116d98001 CR4: 001607f0
[28269.270349] DR0:  DR1:  DR2: 
[28269.277968] DR3:  DR6: fffe0ff0 DR7: 0600
[28269.285639] Call Trace:
[28269.288266]  intel_pmu_drain_bts_buffer+0x151/0x220
[28269.293476]  ? radix_tree_delete_item+0x69/0xc0
[28269.298378]  x86_pmu_stop+0x3b/0x90
[28269.302113]  x86_pmu_del+0x57/0x160
[28269.305840]  event_sched_out.isra.106+0x81/0x170
[28269.310780]  group_sched_out.part.108+0x51/0xc0
[28269.315634]  ctx_sched_out+0xf8/0x220
[28269.319551]  __perf_event_task_sched_out+0x18d/0x3f0
[28269.324866]  ? pick_next_task_fair+0x60a/0x660
[28269.329639]  __schedule+0x4b9/0x820
[28269.67]  ? kill_pid_info+0x34/0x50
[28269.337360]  schedule+0x28/0x80
[28269.340725]  exit_to_usermode_loop+0x4e/0xc0
[28269.345272]  prepare_exit_to_usermode+0x53/0x80
[28269.350109]  retint_user+0x8/0x8
[28269.353541] RIP: 0033:0x56154980b6c3
[28269.357346] Code: 01 d0 48 c1 e0 06 48 89 c2 48 8d 05 cf 93 23 00 48 8b 04 
02 48 85 c0 74 11 8b 45 f8 3b 45 f4 75 05 8b 45 fc eb 16 83 45 f8 01 <83> 45 fc 
01 81 7d fc 9f 86 01 00 7e 96 b8 ff ff ff ff c9 c3 55 48
[28269.377462] RSP: 002b:7ffc6a1540a0 EFLAGS: 0246 ORIG_RAX: 
ff13
[28269.385562] RAX:  RBX: 000c RCX: 003c
[28269.393182] RDX: 00b895c0 RSI: 7ffc6a154074 RDI: 7f5927fe0740
[28269.400835] RBP: 7ffc6a1540b0 R08: 7f5927fe01f0 R09: 7f5927fe0240
[28269.408452] R10:  R11: 0246 R12: 56154980b4c0
[28269.416080] R13: 7ffc6a156510 R14:  R15: 
[28269.423723] Modules linked in: snd_hda_codec_hdmi intel_rapl 
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm i915 irqbypass 
crct10dif_pclmul crc32_pclmul iosf_mbi ghash_clmulni_intel drm_kms_helper 
aesni_intel snd_hda_codec_realtek aes_x86_64 crypto_simd drm cryptd 
snd_hda_codec_generic i2c_algo_bit snd_hda_intel evdev glue_helper 
snd_hda_codec snd_hda_core iTCO_wdt mei_me mei wmi_bmof tpm_tis snd_hwdep 
tpm_tis_core pcc_cpufreq pcspkr iTCO_vendor_support snd_pcm tpm sg rng_core 
button snd_timer video snd soundcore wmi binfmt_misc ip_tables x_tables autofs4 
sr_mod sd_mod cdrom ahci xhci_pci ehci_pci libahci xhci_hcd ehci_hcd libata 
usbcore lpc_ich mfd_core e1000e scsi_mod i2c_i801 crc32c_intel usb_common fan 
thermal
[28269.492702] CR2: 
[28269.496246] ---[ end trace 6775846bfda0f18b ]---
[28269.501186] RIP: 0010:perf_prepare_sample+0x82/0x4a0
[28269.506482] Code: 06 4c 89 ea 4

Re: perf: perf_fuzzer triggers GPF in perf_prepare_sample

2018-12-06 Thread Vince Weaver
On Wed, 5 Dec 2018, Jiri Olsa wrote:

> On Wed, Dec 05, 2018 at 12:11:19PM -0500, Vince Weaver wrote:
> > On Wed, 5 Dec 2018, Jiri Olsa wrote:
> > 
> > > On Wed, Dec 05, 2018 at 01:45:38PM +0100, Jiri Olsa wrote:
> > > > On Tue, Dec 04, 2018 at 10:54:55AM -0500, Vince Weaver wrote:
> > > > > Hello,
> > > > > 
> > > > > I was able to trigger another oops with the perf_fuzzer with current 
> > > > > git.
> > > > > 
> > > > > This is 4.20-rc5 after the fix for the very similar oops I previously 
> > > > > reported got committed.
> > > > > 
> > > > > It seems to be pointing to the same location in the source as 
> > > > > before, I guess maybe triggered a different way?
> > > > 
> > > > nice.. yep, looks the same
> > > > 
> > > > > 
> > > > > Unfortunately this crash is not easily reproducible like the last one 
> > > > > was.
> > > > 
> > > > will check
> > > 
> > > what model are you hitting this on?
> > 
> > Haswell.  6/60/3.
> > 
> > While I can't deterministically trigger this, the fuzzer usually hits it
> > within an hour or two.  Is there any debug or printk messages I can
> > add that would help figure out what's going on?
> 
> I can't see how we could end up with that config other than
> some corruption.. the only way I see could be that we touch
> cpu->events array without checking its active_mask bit
> 
> but that does not explain why the crash happened in the same
> place as before

Maybe it is a corruption issue.  I had applied my own debug patch that 
would dump some info if data->callchain was NULL.

But my debug code didn't trigger this time because it looks like 
data->callchain was "1" rather than "0".

[27764.840179] BUG: unable to handle kernel NULL pointer dereference at 
0001
[27764.840179] PGD 0 P4D 0 
[27764.840180] Oops:  [#1] SMP PTI
[27764.840180] CPU: 1 PID: 18687 Comm: perf_fuzzer Tainted: GW 
4.20.0-rc5+ #125
[27764.840180] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
01/26/2014

Vince
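
A small illustration of why the debug check mentioned above stayed silent:
a test for NULL does not fire when the pointer is "1".  The helper names and
the 4096-byte threshold are assumptions made for this sketch, not the debug
patch that was actually applied.

```c
#include <assert.h>  /* for callers that want to assert on the result */
#include <stdint.h>

#define FIRST_PAGE_BYTES 4096u  /* assumption: 4 KiB pages */

/* The kind of check that only catches an exact NULL. */
static int callchain_ptr_is_null(const void *p)
{
    return p == 0;
}

/*
 * A wider debug check: treat any address inside the first page as bogus.
 * This catches both 0 and 1, but would still miss a large garbage value.
 */
static int callchain_ptr_is_suspicious(const void *p)
{
    return (uintptr_t)p < FIRST_PAGE_BYTES;
}
```

With these, callchain_ptr_is_null((void *)1) reports nothing wrong while
callchain_ptr_is_suspicious((void *)1) does, which matches the behaviour
described in the message.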


Re: perf: perf_fuzzer triggers GPF in perf_prepare_sample

2018-12-05 Thread Vince Weaver
On Wed, 5 Dec 2018, Jiri Olsa wrote:

> On Wed, Dec 05, 2018 at 01:45:38PM +0100, Jiri Olsa wrote:
> > On Tue, Dec 04, 2018 at 10:54:55AM -0500, Vince Weaver wrote:
> > > Hello,
> > > 
> > > I was able to trigger another oops with the perf_fuzzer with current git.
> > > 
> > > This is 4.20-rc5 after the fix for the very similar oops I previously 
> > > reported got committed.
> > > 
> > > It seems to be pointing to the same location in the source as 
> > > before, I guess maybe triggered a different way?
> > 
> > nice.. yep, looks the same
> > 
> > > 
> > > Unfortunately this crash is not easily reproducible like the last one was.
> > 
> > will check
> 
> what model are you hitting this on?

Haswell.  6/60/3.

While I can't deterministically trigger this, the fuzzer usually hits it
within an hour or two.  Is there any debug or printk messages I can
add that would help figure out what's going on?

Vince




perf: perf_fuzzer triggers GPF in perf_prepare_sample

2018-12-04 Thread Vince Weaver
Hello,

I was able to trigger another oops with the perf_fuzzer with current git.

This is 4.20-rc5 after the fix for the very similar oops I previously 
reported got committed.

It seems to be pointing to the same location in the source as 
before, I guess maybe triggered a different way?

Unfortunately this crash is not easily reproducible like the last one was.

kernel/events/core.c:6393

        if (sample_type & PERF_SAMPLE_CALLCHAIN) {
                int size = 1;

                if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
                        data->callchain = perf_callchain(event, regs);

>               size += data->callchain->nr;

                header->size += size * sizeof(u64);
        }


Vince
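
For reference, the arithmetic in the quoted snippet can be modelled in user
space as below.  This is only a sketch of the size calculation, assuming the
usual callchain layout of one u64 count followed by that many u64 entries;
callchain_sample_bytes is a made-up name, and nothing here touches the
pointer that faults in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes a callchain adds to a sample: one u64 for the nr field itself,
 * plus one u64 per recorded entry.
 */
static size_t callchain_sample_bytes(uint64_t nr)
{
    size_t size = 1;                    /* the nr field */

    /* guard against overflow in the multiplication below */
    assert(nr <= (SIZE_MAX / sizeof(uint64_t)) - 1);

    size += (size_t)nr;                 /* one slot per entry */
    return size * sizeof(uint64_t);
}
```

So an empty callchain still costs 8 bytes, and the marked line is the first
place where the callchain pointer is dereferenced to read nr.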

[45050.698745] general protection fault:  [#1] SMP PTI
[45050.698745] CPU: 5 PID: 13475 Comm: perf_fuzzer Tainted: GW 
4.20.0-rc5 #124
[45050.698746] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
01/26/2014
[45050.698746] RIP: 0010:perf_prepare_sample+0x82/0x4a0
[45050.698746] Code: 06 4c 89 ea 4c 89 e6 e8 3c 54 ff ff 40 f6 c5 01 0f 85 28 
01 00 00 40 f6 c5 20 74 1c 48 85 ed 0f 89 04 01 00 00 49 8b 44 24 70 <48> 8b 00 
8d 04 c5 08 00 00 00 66 01 43 06 f7 c5 00 04 00 00 74 41
[45050.698747] RSP: :c900206bfb00 EFLAGS: 00010082
[45050.698747] RAX: dead0200 RBX: c900206bfb58 RCX: 001f
[45050.698747] RDX:  RSI: 25bbf56f RDI: 
[45050.698748] RBP: 8275 R08: 0002 R09: 000215c0
[45050.698748] R10: 8b25b2e2f5c8 R11:  R12: c900206bfc40
[45050.698748] R13: 8880cf6d7800 R14: c900206bfb98 R15: 88811ab4f420
[45050.698748] FS:  7fab66133500() GS:88811ab4() 
knlGS:
[45050.698749] CS:  0010 DS:  ES:  CR0: 80050033
[45050.698749] CR2: 7fab66133480 CR3: 811aa004 CR4: 001607e0
[45050.698749] DR0:  DR1: 8e8e8000 DR2: 
[45050.698749] DR3:  DR6: fffe0ff0 DR7: 0600
[45050.698750] Call Trace:
[45050.698750]  intel_pmu_drain_bts_buffer+0x151/0x220
[45050.698750]  ? mem_cgroup_commit_charge+0x7a/0x510
[45050.698750]  ? wp_page_copy+0x39e/0x650
[45050.698750]  ? reuse_swap_page+0x129/0x340
[45050.698751]  ? _raw_spin_unlock+0xa/0x10
[45050.698751]  ? do_wp_page+0x30f/0x4d0
[45050.698751]  ? finish_mkwrite_fault+0x140/0x140
[45050.698751]  ? __handle_mm_fault+0xb22/0x12c0
[45050.698751]  intel_pmu_handle_irq+0x6d/0x160
[45050.698752]  perf_event_nmi_handler+0x2d/0x50
[45050.698752]  nmi_handle+0x63/0x110
[45050.698752]  default_do_nmi+0x4e/0x100
[45050.698752]  do_nmi+0x112/0x170
[45050.698752]  nmi+0x8b/0xd4
[45050.698753] RIP: 0033:0x558a6a6366c3
[45050.698753] Code: 01 d0 48 c1 e0 06 48 89 c2 48 8d 05 cf 93 23 00 48 8b 04 
02 48 85 c0 74 11 8b 45 f8 3b 45 f4 75 05 8b 45 fc eb 16 83 45 f8 01 <83> 45 fc 
01 81 7d fc 9f 86 01 00 7e 96 b8 ff ff ff ff c9 c3 55 48
[45050.698753] RSP: 002b:7ffc9f521660 EFLAGS: 0246
[45050.698754] RAX:  RBX:  RCX: 0030
[45050.698754] RDX: e740 RSI: 7ffc9f521634 RDI: 7fab6612c740
[45050.698754] RBP: 7ffc9f521670 R08: 7fab6612c1f0 R09: 7fab6612c240
[45050.698754] R10: 7fab661337d0 R11: 0246 R12: 558a6a6364c0
[45050.698755] R13: 7ffc9f523ad0 R14:  R15: 
[45050.698755] Modules linked in: intel_rapl x86_pkg_temp_thermal 
intel_powerclamp snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi 
snd_hda_intel coretemp tpm_tis snd_hda_codec snd_hda_core kvm_intel 
tpm_tis_core i915 snd_hwdep kvm tpm snd_pcm rng_core wmi_bmof mei_me sg 
iosf_mbi irqbypass drm_kms_helper evdev crct10dif_pclmul drm mei iTCO_wdt 
i2c_algo_bit iTCO_vendor_support snd_timer pcc_cpufreq crc32_pclmul 
ghash_clmulni_intel aesni_intel snd video aes_x86_64 crypto_simd cryptd 
glue_helper soundcore pcspkr wmi button binfmt_misc ip_tables x_tables autofs4 
sr_mod sd_mod cdrom ahci libahci ehci_pci xhci_pci libata xhci_hcd ehci_hcd 
lpc_ich mfd_core crc32c_intel scsi_mod e1000e i2c_i801 usbcore usb_common fan 
thermal
[45051.027024] ---[ end trace 9565944010fbdf23 ]---
[45051.027024] RIP: 0010:perf_prepare_sample+0x82/0x4a0
[45051.027025] Code: 06 4c 89 ea 4c 89 e6 e8 3c 54 ff ff 40 f6 c5 01 0f 85 28 
01 00 00 40 f6 c5 20 74 1c 48 85 ed 0f 89 04 01 00 00 49 8b 44 24 70 <48> 8b 00 
8d 04 c5 08 00 00 00 66 01 43 06 f7 c5 00 04 00 00 74 41
[45051.027025] RSP: :c900206bfb00 EFLAGS: 00010082
[45051.027025] RAX: dead0200 RBX: c900206bfb58 RCX: 001f
[45051.027025] RDX:  RSI: 25bbf56f RDI: 
[45051.027026] RBP: 8275 R08: 0002 R09: 000215c0
[45051.027026] R10: 8b25b2e2f5c8 R11:  R12: c900206bfc40
[45051.027026] R13: 8880cf6d7800 R14: c900206bfb98 R15: 

Re: perf: perf_fuzzer triggers NULL pointer dereference

2018-11-08 Thread Vince Weaver
On Thu, 8 Nov 2018, Alexander Shishkin wrote:

> Vince Weaver  writes:
> 
> > On Thu, 8 Nov 2018, Vince Weaver wrote:
> >
> >> [91760.326510] BUG: unable to handle kernel NULL pointer dereference at 
> >> 
> >> [91760.334876] PGD 0 P4D 0 
> >> [91760.337596] Oops:  [#1] SMP PTI
> >> [91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW 
> >> 4.20.0-rc1+ #119
> >> [91760.349816] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
> >> 01/26/2014
> >> [91760.357723] RIP: 0010:perf_prepare_sample+0x82/0x4a0
> >
> > so what's the best way to do the equivalent of addr2line on something like 
> > this, now that we aren't allowed to know the RIP anymore?
> 
> scripts/decode_stacktrace.sh works most of the time.
> 
> Sounds like BTS needs fixing up again. Thanks for looking at it though!

In case it matters, it looks like the address of the oops comes down to

linux.git/kernel/events/core.c:6393

size += data->callchain->nr;

Vince
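
As a reading aid for the "perf_prepare_sample+0x82/0x4a0" form that the
decoding discussion above starts from, here is a small sketch that splits
such a token into symbol, offset and function size.  The function name
parse_rip_token is made up, and the sketch does no symbol resolution at all;
mapping the result to a file and line still needs a vmlinux with debug info
and a tool such as scripts/decode_stacktrace.sh.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Split "symbol+0xOFF/0xSIZE" into its three parts.
 * sym must point to at least 64 bytes.
 * Returns 1 if all three parts were parsed, 0 otherwise.
 */
static int parse_rip_token(const char *tok, char *sym,
                           unsigned long *off, unsigned long *size)
{
    assert(tok && sym && off && size);

    /* %63[^+] reads the symbol name up to the '+'; %lx accepts a 0x prefix */
    return sscanf(tok, "%63[^+]+%lx/%lx", sym, off, size) == 3;
}
```

For the token in this thread that gives offset 0x82 into a function that is
0x4a0 bytes long.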


Re: perf: perf_fuzzer triggers NULL pointer dereference

2018-11-08 Thread Vince Weaver
On Thu, 8 Nov 2018, Vince Weaver wrote:

> [91760.326510] BUG: unable to handle kernel NULL pointer dereference at 
> 
> [91760.334876] PGD 0 P4D 0 
> [91760.337596] Oops:  [#1] SMP PTI
> [91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW 
> 4.20.0-rc1+ #119
> [91760.349816] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
> 01/26/2014
> [91760.357723] RIP: 0010:perf_prepare_sample+0x82/0x4a0

so what's the best way to do the equivalent of addr2line on something like 
this, now that we aren't allowed to know the RIP anymore?

I probably knew at one point but I've spent the last 3 months doing 6502 
assembly language Demoscene coding and so apparently I've forgotten 
everything I once knew about x86_64 kernel interfaces.  (As can be 
imagined, Demoscene coding work is a lot more mentally rewarding than 
perf_fuzzer work).

Vince




perf: perf_fuzzer triggers NULL pointer dereference

2018-11-08 Thread Vince Weaver


I was able to trigger this oops with the perf_fuzzer with current git.

I can reliably trigger this on my Haswell machine.

I haven't done any analysis yet, I might not have time to today, but I 
wanted to report it in case the cause was obvious to someone else.

Vince


*** perf_fuzzer 0.32-rc0 *** by Vince Weaver

Linux version 4.20.0-rc1+ x86_64
Processor: Intel 6/60/3

Stopping after 3
Watchdog enabled with timeout 60s
Will auto-exit if signal storm detected
Seeding RNG from time 1541627285

To reproduce, try:
echo 1 > /proc/sys/kernel/nmi_watchdog
echo 0 > /proc/sys/kernel/perf_event_paranoid
echo 1250 > /proc/sys/kernel/perf_event_max_sample_rate
./perf_fuzzer -s 3 -r 1541627285

Fuzzing the following syscalls: mmap perf_event_open close read write 
ioctl fork prctl poll 
Also attempting the following: signal-handler-on-overflow 
busy-instruction-loop accessing-perf-proc-and-sys-files trashing-the-mmap-page 

Pid=14868, sleeping 1s

==
Starting fuzzing at 2018-11-07 16:48:06
==
Cannot open /sys/kernel/tracing/kprobe_events
Iteration 1, 125098 syscalls in 4.90 s (25.525 k syscalls/s)
Open attempts: 117090  Successful: 951  Currently open: 47
EPERM : 11
ENOENT : 598
E2BIG : 10074
EBADF : 7879
EACCES : 4691
UNKNOWN 19 : 1
EINVAL : 92824
EOPNOTSUPP : 61
Trinity Type (Normal 163/29305)(Sampling 17/29139)(Global 
719/29405)(Random 52/29241)
Type (Hardware 224/16272)(software 346/15851)(tracepoint 
63/15585)(Cache 58/14732)(cpu 230/15625)(breakpoint 9/15556)(kprobe 0/948)(msr 
7/940)(power 0/1021)(uncore_imc 0/924)(uncore_cbox_0 3/911)(uncore_cbox_1 
3/957)(uncore_cbox_2 2/914)(uncore_cbox_3 2/860)(uncore_arb 3/873)(cstate_core 
1/902)(cstate_pkg 0/1016)(i915 0/942)(#18 0/16)(>19 0/12245)
Close:  904/904 Successful
Read:   795/881 Successful
Write:  0/934 Successful
Ioctl:  328/952 Successful: (ENABLE 84/84)(DISABLE 76/76)(REFRESH 
4/74)(RESET 68/68)(PERIOD 9/69)(SET_OUTPUT 14/66)(SET_FILTER 0/78)(ID 
69/69)(SET_BPF 0/70)(PAUSE_OUTPUT 4/60)(QUERY_BPF 0/67)(MOD_ATTR 0/55)(#12 
0/0)(#13 0/0)(#14 0/0)(>14 0/116)
Mmap:   442/1113 Successful: (MMAP 442/1113)(TRASH 111/160)(READ 
98/100)(UNMAP 438/1010)(AUX 0/119)(AUX_READ 0/0)
Prctl:  952/952 Successful
Fork:   421/421 Successful
Poll:   889/905 Successful
Access: 113/876 Successful
Overflows: 0  Recursive: 0
SIGIOs due to RT signal queue full: 0
[91760.326510] BUG: unable to handle kernel NULL pointer dereference at 

[91760.334876] PGD 0 P4D 0 
[91760.337596] Oops:  [#1] SMP PTI
[91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW 
4.20.0-rc1+ #119
[91760.349816] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
01/26/2014
[91760.357723] RIP: 0010:perf_prepare_sample+0x82/0x4a0
[91760.363065] Code: 06 4c 89 ea 4c 89 e6 e8 3c 54 ff ff 40 f6 c5 01 0f 85 28 
01 00 00 40 f6 c5 20 74 1c 48 85 ed 0f 89 04 01 00 00 49 8b 44 24 70 <48> 8b 00 
8d 04 c5 08 00 00 00 66 01 43 06 f7 c5 00 04 00 00 74 41
[91760.383164] RSP: :88011ab83b80 EFLAGS: 00010086
[91760.388753] RAX:  RBX: 88011ab83bd8 RCX: 001f
[91760.396373] RDX:  RSI: 25bbfcb9 RDI: 
[91760.404062] RBP: 800b8165 R08: 0002 R09: 000215c0
[91760.411678] R10: 00011b422ed4649b R11:  R12: 88011ab83cc0
[91760.419287] R13: 8800a8c8c800 R14: 88011ab83c18 R15: e8d86300
[91760.426933] FS:  () GS:88011ab8() 
knlGS:
[91760.435616] CS:  0010 DS:  ES:  CR0: 80050033
[91760.441735] CR2:  CR3: 0200c002 CR4: 001606e0
[91760.449369] DR0: 00a4a7ffb768 DR1:  DR2: 
[91760.457005] DR3:  DR6: fffe0ff0 DR7: 0600
[91760.464641] Call Trace:
[91760.467265]  
[91760.469427]  intel_pmu_drain_bts_buffer+0x151/0x220
[91760.474650]  ? intel_get_event_constraints+0x219/0x360
[91760.480145]  ? perf_assign_events+0xe2/0x2a0
[91760.484732]  ? select_idle_sibling+0x22/0x3a0
[91760.489403]  ? __update_load_avg_se+0x1ec/0x270
[91760.494244]  ? enqueue_task_fair+0x377/0xdd0
[91760.498832]  ? cpumask_next_and+0x19/0x20
[91760.503105]  ? load_balance+0x134/0x950
[91760.507239]  ? check_preempt_curr+0x7a/0x90
[91760.511683]  ? ttwu_do_wakeup+0x19/0x140
[91760.515877]  x86_pmu_stop+0x3b/0x90
[91760.519606]  x86_pmu_del+0x57/0x160
[91760.523343]  event_sched_out.isra.106+0x8

Re: [perf] perf_event.h ABI visibility question

2018-08-28 Thread Vince Weaver
On Mon, 27 Aug 2018, Peter Zijlstra wrote:

> Something like so then?
> 
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index eeb787b1c53c..f35eb72739c0 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -144,7 +144,7 @@ enum perf_event_sample_format {
>  
>   PERF_SAMPLE_MAX = 1U << 20, /* non-ABI */
>  
> - __PERF_SAMPLE_CALLCHAIN_EARLY   = 1ULL << 63,
> + __PERF_SAMPLE_CALLCHAIN_EARLY   = 1ULL << 63, /* non-ABI; 
> internal use */
>  };

yes, something like that would be fine.

I am being difficult about this, but from experience when trying to keep 
the manpage updated, what seems obvious now will not be so obvious 6 
months from now and trying to dig through the git logs / mailing list 
archives to verify the purpose of an addition can be a lot of work 
sometimes.

Vince


Re: [perf] perf_event.h ABI visibility question

2018-08-24 Thread Vince Weaver
On Fri, 24 Aug 2018, Peter Zijlstra wrote:

> > +++ b/include/uapi/linux/perf_event.h
> > @@ -143,6 +143,8 @@ enum perf_event_sample_format {
> > PERF_SAMPLE_PHYS_ADDR   = 1U << 19,
> >  
> > PERF_SAMPLE_MAX = 1U << 20, /* non-ABI */
> > +
> > +   __PERF_SAMPLE_CALLCHAIN_EARLY   = 1ULL << 63,
> >  };
> > 
> 
> Hurphm.. visible yes, but as you say, also quite useless. Does it really
> make sense to document that?

Well, it should probably be documented either in the manpage or else in 
perf_event.h  (even if it's just "internal use, don't use") as we can't 
really expect people to download a git tree and do a git-blame to try to 
figure out what this mysterious field is all about.

Also, this change increased the size of the enum from 32 to 64 bits on 
32-bit machines, though that only really matters if the user is doing 
something really weird with enum variables.

Vince
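
The enum-size remark above can be checked with a small sketch. The two enums below are hypothetical stand-ins for the real one (names invented here, only the quoted values reused), and the result assumes GCC or Clang on a common target, where an enumerator that does not fit in 32 bits widens the whole enum type.

```c
#include <stddef.h>

/* Hypothetical stand-in: the enum before the change, values fit in 32 bits. */
enum sample_format_before {
	BEFORE_PHYS_ADDR = 1U << 19,
	BEFORE_MAX       = 1U << 20,
};

/* Hypothetical stand-in: the same enum with the new 64-bit enumerator. */
enum sample_format_after {
	AFTER_PHYS_ADDR       = 1U << 19,
	AFTER_MAX             = 1U << 20,
	AFTER_CALLCHAIN_EARLY = 1ULL << 63,	/* forces a 64-bit underlying type */
};

/* Storage size of each enum type, in bytes. */
static size_t size_before(void) { return sizeof(enum sample_format_before); }
static size_t size_after(void)  { return sizeof(enum sample_format_after); }
```

Code that only passes these values around as a u64 sample_type is unaffected; the size only matters where a variable or struct member is declared with the enum type itself.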


[perf] perf_event.h ABI visibility question

2018-08-23 Thread Vince Weaver


I notice that Linux 4.18 has the following changeset which changes the
user visible perf_event.h file

commit 6cbc304f2f360f25cc8607817239d6f4a2fd3dc5
Author: Peter Zijlstra 
Date:   Thu May 10 15:48:41 2018 +0200

perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)

which contains

--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -143,6 +143,8 @@ enum perf_event_sample_format {
PERF_SAMPLE_PHYS_ADDR   = 1U << 19,
 
PERF_SAMPLE_MAX = 1U << 20, /* non-ABI */
+
+   __PERF_SAMPLE_CALLCHAIN_EARLY   = 1ULL << 63,
 };


Is this supposed to be a user-visible interface?

I realize that if the user tries to set anything above PERF_SAMPLE_MAX
it will be caught and flagged as EINVAL.

However even with the double-underscore hint in 
__PERF_SAMPLE_CALLCHAIN_EARLY the value is still in the user-visible 
header so it's now part of the ABI and I guess the manpage has to document it.

Vince
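
A minimal sketch of the EINVAL behaviour described above, assuming the kernel-side check is a simple mask against PERF_SAMPLE_MAX - 1. The macro and function names are illustrative and not taken from the kernel source.

```c
#include <errno.h>
#include <stdint.h>

/* Local copies of the values quoted above; the names are illustrative. */
#define SKETCH_SAMPLE_MAX       (1ULL << 20)
#define SKETCH_CALLCHAIN_EARLY  (1ULL << 63)

/*
 * Reject any sample_type bit at or above the MAX marker.
 * Returns 0 if the value is acceptable, -EINVAL otherwise.
 */
static int check_sample_type(uint64_t sample_type)
{
	if (sample_type & ~(SKETCH_SAMPLE_MAX - 1))
		return -EINVAL;
	return 0;
}
```

Under that assumption a user setting the bit-63 value gets -EINVAL back, which is why the flag is visible in the header but not usable from user space.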



Re: [PATCH v3 0/5]

2018-05-18 Thread Vince Weaver
On Fri, 18 May 2018, Marc Zyngier wrote:

> There is also the case of people natively running 32bit kernels on
> 64bit HW and trying to upstream unspeakable hacks, hoping that the
> stars will align and that they'll win the lottery (see [1]).

I've tested these patches on a Raspberry Pi 3B running a 32-bit upstream 
(4.17-rc5-git) kernel and they work.

[0.472906] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available

I only needed to add this to the devicetree

arm-pmu {
	compatible = "arm,cortex-a53-pmu";
	interrupt-parent = <&local_intc>;
	interrupts = <9 IRQ_TYPE_LEVEL_HIGH>;
};


Tested-by: Vince Weaver <vincent.wea...@maine.edu>

Vince


Re: [PATCH] arm: bcm2835: Add the PMU to the devicetree.

2018-05-17 Thread Vince Weaver
On Thu, 17 May 2018, Vince Weaver wrote:

> On Thu, 17 May 2018, Peter Zijlstra wrote:
> with cortex-a7 now, would it be possible to later drop that if proper 
> cortex-a53 support is added to the armv7 pmu driver?  Or would that lead 
> to all kinds of back-compatibility mess?

For what it's worth, the pi-foundation kernel bcm2710 device tree file 
does:

arm-pmu {
#ifdef RPI364
	compatible = "arm,armv8-pmuv3", "arm,cortex-a7-pmu";
#else
	compatible = "arm,cortex-a7-pmu";
#endif
	interrupt-parent = <&local_intc>;
	interrupts = <9>;
};


Which is probably where I was getting the arm,armv8-pmuv3 from in my 
original patch.

Vince


Re: [PATCH] arm: bcm2835: Add the PMU to the devicetree.

2018-05-17 Thread Vince Weaver
On Thu, 17 May 2018, Peter Zijlstra wrote:

> On Thu, May 17, 2018 at 06:55:26PM +0200, Stefan Wahren wrote:
> > > Vince Weaver <vincent.wea...@maine.edu> wrote on 17 May 2018 at 18:34:
> > > On Thu, 17 May 2018, Stefan Wahren wrote:
> > > > > Eric Anholt <e...@anholt.net> wrote on 17 May 2018 at 15:17:
> 
> > > > > The a53 and a7 counters seem to match up, so we advertise a7 so that
> > > > > arm32 can probe.
> > > 
> > > so how closely did you look at the a53/a7 differences?  I see some major 
> > > differences, especially with the CPU_CYCLES event (0xff vs 0x11).
> > > 
> > > The proper fix here might be to add a cortex-a53 PMU entry to the armv7 
> > > code rather than trying to treat it as a cortex-a7.
> > 
> > we would like to use the PMU of the BCM2837 SoC (4x A53 cores) under arm32 and arm64.
> > 
> > What is the right way (tm) to define the DT compatibles?
> > Does the arm32 PMU driver need patching for proper A53 support?
> 
> I'm completely clueless on all of this; Mark might have ideas.

After spending more time looking at it, the only obvious differences are the 
previously mentioned CYCLES difference, and that the cortex-a7 has
18 events in the perf_cache_map but the cortex-a53 only has 3.  Plus probably 
support for the various other features of the armv8 pmuv3 that the a7 
knows nothing about.

Is it hard to get lines in the DT changed once they are there?  If we go 
with cortex-a7 now, would it be possible to later drop that if proper 
cortex-a53 support is added to the armv7 pmu driver?  Or would that lead 
to all kinds of back-compatibility mess?

Vince



Re: [PATCH] arm: bcm2835: Add the PMU to the devicetree.

2018-05-17 Thread Vince Weaver
On Thu, 17 May 2018, Stefan Wahren wrote:

> 
> > Eric Anholt wrote on 17 May 2018 at 15:17:
> > 
> > 
> > The a53 and a7 counters seem to match up, so we advertise a7 so that
> > arm32 can probe.

so how closely did you look at the a53/a7 differences?  I see some major 
differences, especially with the CPU_CYCLES event (0xff vs 0x11).

The proper fix here might be to add a cortex-a53 PMU entry to the armv7 
code rather than trying to treat it as a cortex-a7.

Vince
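
To make the CPU_CYCLES point concrete, here is a small sketch of why a driver that assumes one encoding cannot simply be pointed at the other PMU. The enum and function names are hypothetical. The assignment of 0xff to the ARMv7-style table and 0x11 to the ARMv8 PMUv3-style table is my reading of the kernel's event tables, not something stated in the thread, so treat it as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the two event-numbering schemes being compared. */
enum pmu_flavour { PMU_ARMV7_STYLE, PMU_ARMV8_PMUV3_STYLE };

/* Raw event number used for the generic "cpu cycles" event. */
static uint8_t cpu_cycles_event(enum pmu_flavour f)
{
	switch (f) {
	case PMU_ARMV7_STYLE:       return 0xff;	/* assumed ARMv7 encoding */
	case PMU_ARMV8_PMUV3_STYLE: return 0x11;	/* assumed PMUv3 encoding */
	}
	return 0;
}

/* True only if both flavours would program the same raw event. */
static bool cycles_encoding_matches(enum pmu_flavour a, enum pmu_flavour b)
{
	return cpu_cycles_event(a) == cpu_cycles_event(b);
}
```

If the encodings differ like this, advertising the a53 as an a7 would need either a dedicated map entry or special-casing of the cycle counter.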


Re: [PATCH] arm: bcm2835: Add the PMU to the devicetree.

2018-05-17 Thread Vince Weaver
On Thu, 17 May 2018, Eric Anholt wrote:
> 
> Is that better than a53?  I'm happy to switch to that.  The important
> part to me is the a7, since basically everyone with this hw is running
> arm32.

no, on further investigation it looks like a53 is more proper to use than 
the generic armv8.

Is the armv8 pmu on the cortex-a53 backwards compatible with armv7?  I'm 
dreading having to pull up the various ARM ARMs to look for myself so if 
it works for you I'm fine with that part too.

The biggest pushback I had with my original patch was no one believing irq 
9 was the right one to use.

Vince



Re: [PATCH] arm: bcm2835: Add the PMU to the devicetree.

2018-05-17 Thread Vince Weaver
On Thu, 17 May 2018, Eric Anholt wrote:

> diff --git a/arch/arm/boot/dts/bcm2837.dtsi b/arch/arm/boot/dts/bcm2837.dtsi
> index 7704bb029605..1f5e5c782835 100644
> --- a/arch/arm/boot/dts/bcm2837.dtsi
> +++ b/arch/arm/boot/dts/bcm2837.dtsi
> @@ -17,6 +17,12 @@
>   };
>   };
>  
> + arm-pmu {
> + compatible = "arm,cortex-a53-pmu", "arm,cortex-a7-pmu";
> + interrupt-parent = <&local_intc>;
> + interrupts = <9 IRQ_TYPE_LEVEL_HIGH>;
> + };
> +

why this and not

arm-pmu {
	compatible = "arm,armv8-pmuv3";
	interrupt-parent = <&local_intc>;
	interrupts = <9>;
};

which works, though I didn't get very far when I submitted the patch 
to add this last August.

Vince


Re: perf: fuzzer causes stack going in wrong direction warnings

2018-05-05 Thread Vince Weaver
On Fri, 4 May 2018, Josh Poimboeuf wrote:
> 
> The 'nmi_restore' warning points to a bug in my patch, but the others
> are head scratchers.  Here's a patch which combines the first two
> patches, plus improves the existing warnings a bit.  Can you try it?

with that updated patch I hit

May  4 21:51:20 haswell kernel: [19245.450607] WARNING: stack recursion on stack type 2
May  4 22:21:29 haswell kernel: [21055.268717] WARNING: can't dereference registers at 6546ba71 for ip ret_from_intr+0x6/0x1d
May  4 22:36:22 haswell kernel: [21948.106762] WARNING: stack going in the wrong direction? ip=native_sched_clock+0xe/0x90
May  4 22:36:22 haswell kernel: [21948.115377] WARNING: stack going in the wrong direction? ip=native_sched_clock+0xe/0x90
May  4 22:36:22 haswell kernel: [21948.124086] WARNING: stack going in the wrong direction? ip=native_sched_clock+0xd/0x90
May  4 22:36:22 haswell kernel: [21948.124088] WARNING: stack going in the wrong direction? ip=intel_pmu_handle_irq+0x12/0x4a0
May  4 22:36:22 haswell kernel: [21948.124097] WARNING: stack going in the wrong direction? ip=native_sched_clock+0xe/0x90
May  4 22:36:22 haswell kernel: [21948.150189] WARNING: stack going in the wrong direction? ip=native_sched_clock+0xe/0x90
May  4 22:36:22 haswell kernel: [21948.150199] WARNING: stack going in the wrong direction? ip=intel_pmu_handle_irq+0xe/0x4a0

the last bit repeated for a few minutes (flooding the log with a few 
thousand entries that look mostly similar)

Vince


Re: perf: fuzzer causes stack going in wrong direction warnings

2018-05-04 Thread Vince Weaver
On Fri, 4 May 2018, Josh Poimboeuf wrote:

> Also, any tips for reproducing this locally?  I cloned the perf fuzzer
> github.  Is it as simple as just "make" and "./run_tests.sh"?

run_tests only runs the perf_event regression tests.

To run the fuzzer, enter the "fuzzer" directory and either run
"./fast_repro98.sh"
which will run the fuzzer repeatedly, or you can instead just run
./perf_fuzzer
but that sometimes eventually errors out if it manages to get stuck in a 
signal storm.

While running on haswell and skylake machines the WARNINGs trigger fairly 
quickly (within minutes usually).

You might also encounter a handful of other known kernel warnings which 
I've reported in the past and usually just ignore.

Vince


Re: perf: fuzzer causes stack going in wrong direction warnings

2018-05-04 Thread Vince Weaver
On Wed, 2 May 2018, Josh Poimboeuf wrote:

> After looking closer, I realized that at least some of these warnings
> are due to bad unwind hints in the entry code.  Can you try this patch
> instead of the last one?

with just this new patch applied I still get warnings such as this:

[  469.436218] WARNING: can't dereference registers at 886d9235 for ip apic_timer_interrupt+0xa/0x20
[  790.499655] WARNING: stack recursion on stack type 2
[  790.907092] WARNING: stack going in the wrong direction? ip=native_sched_clock+0x9/0x90
[ 3632.876656] WARNING: can't dereference iret registers at 1754e5aa for ip nmi_restore+0x16/0x2b
[ 3650.161250] WARNING: missing regs for base reg R10 at ip native_sched_clock+0xd/0x90


Vince


Re: perf: fuzzer causes stack going in wrong direction warnings

2018-05-01 Thread Vince Weaver
On Tue, 1 May 2018, Josh Poimboeuf wrote:

> Can you try the following patch?

I applied the patch, but the warnings don't really look that different.

[   62.220322] WARNING: stack recursion on stack type 4
[   62.220326] WARNING: can't dereference registers at 9ca2e86d for ip swapgs_restore_regs_and_return_to_usermode+0x79/0x87
[  367.597013] WARNING: stack going in the wrong direction? ip=native_sched_clock+0x9/0x90

Vince


perf: fuzzer causes stack going in wrong direction warnings

2018-05-01 Thread Vince Weaver
Hello

I reported this back in January, but I think it got lost since everyone 
was busy with other more pressing matters.

But in any case, the perf_fuzzer can still trigger these types of messages 
and I just wanted to see if they were a cause for concern, or just noise.

[66620.496076] WARNING: can't dereference registers at 51f78a40 for ip interrupt_entry+0xba/0xc0
[66620.506117] WARNING: stack recursion on stack type 4
[67126.898984] WARNING: stack going in the wrong direction? ip=native_sched_clock+0xd/0x90
[67148.214712] WARNING: can't dereference iret registers at c8f3c864 for ip error_exit+0x20/0x20


Vince
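
For readers unfamiliar with the "stack going in the wrong direction?" message, the sketch below shows the kind of sanity check that could produce it. It is a simplification written for illustration, with a hypothetical function name, and is not the kernel unwinder's actual code. The assumption is that on x86 the stack grows down, so each unwind step on the same stack should reach a higher address than the previous one.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return false when an unwind step looks bogus: still on the same stack,
 * but the new stack pointer is not above the previous one.
 */
static bool unwind_step_plausible(uintptr_t prev_sp, uintptr_t next_sp,
				  bool same_stack)
{
	if (same_stack && next_sp <= prev_sp)
		return false;	/* would be reported as "wrong direction" */
	return true;
}
```

A check of this shape mainly guards against looping forever on bad unwind data, which would fit the warnings appearing when a sample interrupts code whose unwind hints are incomplete.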



Re: [RFC] perf/core: what is exclude_idle supposed to do

2018-04-20 Thread Vince Weaver
On Fri, 20 Apr 2018, Vince Weaver wrote:

> > AFAICT it works on Power and possibly ARM.
> 
> at least some ARMs are a bit more honest about it than x86
> 
> ivybridge:
>   Performance counter stats for '/bin/ls':
>   1,368,162  instructions
>   1,368,162  instructions:I
> 
> pi2/ARM cortex-A7
>   Performance counter stats for '/bin/ls':
>   1,910,083  instructions
> instructions:I
> 
> I'd fire up my Power8 machine to see but not sure it's worth the hassle 
> and/or having to get out the ear protection.

I did power up the Power8 machine in the end:

power8:
perf stat -e cycles,cycles:I sleep 5
Performance counter stats for 'sleep 5':
14,271,273  cycles
14,271,273  cycles:I

???

But then if I try again on power8

perf stat -a -e cycles,cycles:I sleep 5
 Performance counter stats for 'system wide':
1,238,772,322,327  cycles
1,238,674,771,713  cycles:I   

there is a difference.

But then on ivybridge

perf stat -a -e cycles,cycles:I sleep 5
Performance counter stats for 'system wide':
589,598,104  cycles
589,537,190  cycles:I

there is also a difference in system wide mode.

So maybe exclude_idle does do something on x86?  Or am I completely 
misunderstanding what the flag is supposed to be indicating?

Vince
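
The size of the differences quoted above is easier to judge as a number. The helpers below are a hypothetical sketch (names invented here) that subtract the ":I" count from the plain count and express the gap in parts per million. Whether that gap really represents idle time is exactly the open question in this message, so the naming is an assumption.

```c
#include <stdint.h>

/* Plain count minus the ":I" count, clamped at zero. */
static uint64_t idle_cycles(uint64_t all, uint64_t non_idle)
{
	return all >= non_idle ? all - non_idle : 0;
}

/* The same difference in parts per million of the plain count. */
static double idle_ppm(uint64_t all, uint64_t non_idle)
{
	if (all == 0)
		return 0.0;
	return 1e6 * (double)idle_cycles(all, non_idle) / (double)all;
}
```

With the system-wide numbers above, the ivybridge gap is 60,914 cycles and the power8 gap is 97,550,614 cycles. Both are on the order of 0.01% of the total, which is small enough that it may be measurement skew between the two events rather than idle time being excluded.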


Re: [RFC] perf/core: what is exclude_idle supposed to do

2018-04-20 Thread Vince Weaver
On Fri, 20 Apr 2018, Peter Zijlstra wrote:

> On Wed, Apr 18, 2018 at 11:10:20AM -0400, Vince Weaver wrote:
> > On Tue, 17 Apr 2018, Jiri Olsa wrote:
> > 
> > > On Mon, Apr 16, 2018 at 10:04:53PM +, Stephane Eranian wrote:
> > > > Hi,
> > > > 
> > > > I am trying to understand what the exclude_idle event attribute is supposed
> > > > to accomplish.
> > > > As per the definition in the header file:
> > > > 
> > > > exclude_idle   :  1, /* don't count when idle */
> > > 
> > > AFAICS it's not implemented
> > 
> > so just to be completely clear here, we're saying that the "exclude_idle" 
> > modifier has never done anything useful and still doesn't?
> 
> AFAICT it works on Power and possibly ARM.

at least some ARMs are a bit more honest about it than x86

ivybridge:
Performance counter stats for '/bin/ls':
1,368,162  instructions
1,368,162  instructions:I

pi2/ARM cortex-A7
Performance counter stats for '/bin/ls':
1,910,083  instructions
  instructions:I

I'd fire up my Power8 machine to see but not sure it's worth the hassle 
and/or having to get out the ear protection.

Vince


Re: [RFC] perf/core: what is exclude_idle supposed to do

2018-04-18 Thread Vince Weaver
On Tue, 17 Apr 2018, Jiri Olsa wrote:

> On Mon, Apr 16, 2018 at 10:04:53PM +, Stephane Eranian wrote:
> > Hi,
> > 
> > I am trying to understand what the exclude_idle event attribute is supposed
> > to accomplish.
> > As per the definition in the header file:
> > 
> > exclude_idle   :  1, /* don't count when idle */
> 
> AFAICS it's not implemented

so just to be completely clear here, we're saying that the "exclude_idle" 
modifier has never done anything useful and still doesn't?

If so I should update the perf_event_open manpage to spell this out.

Vince
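
For reference, this is roughly how a user-space program would request the flag under discussion. It is a minimal sketch: the helper names are made up for illustration, and it only shows how the attribute is filled in, not what any given kernel or PMU driver does with it.

```c
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Build an attr for a hardware cycle counter, optionally excluding idle. */
static struct perf_event_attr cycles_attr(int exclude_idle)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.disabled = 1;			/* start stopped, enable later */
	attr.exclude_idle = exclude_idle ? 1 : 0;
	return attr;
}

/* Thin wrapper around the raw syscall; glibc provides no wrapper. */
static long open_counter(struct perf_event_attr *attr, pid_t pid, int cpu)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, -1, 0UL);
}
```

Opening the same event twice, once with exclude_idle set and once without, and comparing the counts is the programmatic equivalent of the cycles versus cycles:I comparison with perf stat.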


perf: fuzzer leads to trace_kprobe: Could not insert message flood

2018-04-10 Thread Vince Weaver

When running the perf_fuzzer on a current git checkout my logs are flooded 
with messages such as this:
[71487.869077] trace_kprobe: Could not insert probe at unknown+0: -22
[71488.174479] trace_kprobe: Could not insert probe at unknown+0: -22

Presumably this is due to the introduction of the perf_kprobe PMU in
commit e12f03d7031a977356e3d7b75a68c2185ff8d155
Author: Song Liu 
Date:   Wed Dec 6 14:45:15 2017 -0800

Is there a way to get this error disabled, or else rate-limited?

Vince
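
As an illustration of what "rate-limited" could mean here, the sketch below allows at most a fixed burst of messages per time window and drops the rest. It is a user-space model with hypothetical names, written only to show the idea. It is not the kernel's ratelimit implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* State for a simple fixed-window rate limiter. */
struct ratelimit {
	uint64_t interval;	/* window length, in seconds */
	unsigned int burst;	/* messages allowed per window */
	uint64_t window_start;	/* start time of the current window */
	unsigned int emitted;	/* messages already let through */
};

/* Return true if a message at time 'now' should be printed. */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now)
{
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;	/* open a fresh window */
		rl->emitted = 0;
	}
	if (rl->emitted >= rl->burst)
		return false;		/* over budget: drop the message */
	rl->emitted++;
	return true;
}
```

With a window of a few seconds and a small burst, a fuzzer hammering the failing path would still leave a trace in the log without flooding it.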


Re: [tip:perf/core] perf/x86/intel: Disable userspace RDPMC usage for large PEBS

2018-03-09 Thread Vince Weaver
On Fri, 9 Mar 2018, Peter Zijlstra wrote:

> On Fri, Mar 09, 2018 at 09:31:11AM -0500, Vince Weaver wrote:
> > On Fri, 9 Mar 2018, tip-bot for Kan Liang wrote:
> > 
> > > Commit-ID:  1af22eba248efe2de25658041a80a3d40fb3e92e
> > > Gitweb:     https://git.kernel.org/tip/1af22eba248efe2de25658041a80a3d40fb3e92e
> > > Author: Kan Liang <kan.li...@linux.intel.com>
> > > AuthorDate: Mon, 12 Feb 2018 14:20:35 -0800
> > > Committer:  Ingo Molnar <mi...@kernel.org>
> > > CommitDate: Fri, 9 Mar 2018 08:22:23 +0100
> > > 
> > > perf/x86/intel: Disable userspace RDPMC usage for large PEBS
> > > 
> > 
> > 
> > So this whole commit log is about disabling RDPMC usage for "large PEBS"
> > but the actual change disables RDPMC if "PERF_X86_EVENT_FREERUNNING"
> > 
> > Either the commit log is really misleading, or else a poor name was chosen 
> > for this feature.
> 
> It's the same thing, and yes that might want renaming I suppose.

I apologize for noticing these things so late in the game, but I haven't 
had time to keep up with a full lkml feed recently so I only see these 
things once I'm CC'd on them.

So to summarize this: rdpmc is only disabled on a per-event basis, and 
only if that event is doing multi-pebs sampling?

If that's true, then I don't think I have an issue with this.

We finally got rdpmc support in a released PAPI, and it is a massive
improvement when self-monitoring (even more so if KPTI is enabled), so I was 
just trying to make sure this wouldn't suddenly disable rdpmc out from 
under us.

Vince


Re: [tip:perf/core] perf/x86/intel: Disable userspace RDPMC usage for large PEBS

2018-03-09 Thread Vince Weaver
On Fri, 9 Mar 2018, tip-bot for Kan Liang wrote:

> Commit-ID:  1af22eba248efe2de25658041a80a3d40fb3e92e
> Gitweb: 
> https://git.kernel.org/tip/1af22eba248efe2de25658041a80a3d40fb3e92e
> Author: Kan Liang <kan.li...@linux.intel.com>
> AuthorDate: Mon, 12 Feb 2018 14:20:35 -0800
> Committer:  Ingo Molnar <mi...@kernel.org>
> CommitDate: Fri, 9 Mar 2018 08:22:23 +0100
> 
> perf/x86/intel: Disable userspace RDPMC usage for large PEBS
> 


So this whole commit log is about disabling RDPMC usage for "large PEBS",
but the actual change disables RDPMC if "PERF_X86_EVENT_FREERUNNING" is set.

Either the commit log is really misleading, or else a poor name was chosen 
for this feature.

Vince



> Userspace RDPMC cannot possibly work for large PEBS, which was introduced in:
> 
>   b8241d20699e ("perf/x86/intel: Implement batched PEBS interrupt handling 
> (large PEBS interrupt threshold)")
> 
> When the PEBS interrupt threshold is larger than one, there is no way
> to get exact auto-reload times and value for userspace RDPMC.  Disable
> the userspace RDPMC usage when large PEBS is enabled.
> 
> The only exception is when the PEBS interrupt threshold is 1, in which
> case user-space RDPMC works well even with auto-reload events.
> 
> Signed-off-by: Kan Liang <kan.li...@linux.intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
> Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Jiri Olsa <jo...@redhat.com>
> Cc: Linus Torvalds <torva...@linux-foundation.org>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Stephane Eranian <eran...@google.com>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Vince Weaver <vincent.wea...@maine.edu>
> Cc: a...@kernel.org
> Fixes: b8241d20699e ("perf/x86/intel: Implement batched PEBS interrupt 
> handling (large PEBS interrupt threshold)")
> Link: 
> http://lkml.kernel.org/r/1518474035-21006-6-git-send-email-kan.li...@linux.intel.com
> Signed-off-by: Ingo Molnar <mi...@kernel.org>
> ---
>  arch/x86/events/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 00a6251981d2..9c86e10f1196 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2117,7 +2117,8 @@ static int x86_pmu_event_init(struct perf_event *event)
>   event->destroy(event);
>   }
>  
> - if (READ_ONCE(x86_pmu.attr_rdpmc))
> + if (READ_ONCE(x86_pmu.attr_rdpmc) &&
> + !(event->hw.flags & PERF_X86_EVENT_FREERUNNING))
>   event->hw.flags |= PERF_X86_EVENT_RDPMC_ALLOWED;
>  
>   return err;
> 
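
The effect of the hunk above can be modelled outside the kernel. The sketch 
below is illustrative only: the bit values and the function name are 
placeholders chosen for this example, not the kernel's definitions. It shows 
that RDPMC is granted per event, and only when the global attribute is on and 
the event is not free-running.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define EV_FREERUNNING   (1u << 0) /* stands in for PERF_X86_EVENT_FREERUNNING */
#define EV_RDPMC_ALLOWED (1u << 1) /* stands in for PERF_X86_EVENT_RDPMC_ALLOWED */

/*
 * Model of the patched check: grant RDPMC only when the global
 * attribute is enabled and this event is not free-running.
 */
static unsigned int apply_rdpmc_policy(unsigned int flags, bool attr_rdpmc)
{
    if (attr_rdpmc && !(flags & EV_FREERUNNING))
        flags |= EV_RDPMC_ALLOWED;
    return flags;
}
```

Under this model an ordinary event still gets the RDPMC-allowed flag, while a 
free-running event is left without it, which matches the per-event reading 
discussed elsewhere in the thread.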



Re: perf: perf_fuzzer quickly locks up on 4.15-rc7

2018-01-12 Thread Vince Weaver
On Thu, 11 Jan 2018, Peter Zijlstra wrote:

> It makes my IVB very ill, starts spewing RCU stall warnings, but is
> otherwise very unresponsive.
> 
> Awesome... I'll prod at it when my brain works again.
> 

Not sure if it's related, but I hit this on the core2 machine fuzzing 
overnight with "pti=off"

Jan 11 19:03:03 core2 kernel: [12816.125397] WARNING: CPU: 0 PID: 3144 at 
kernel/events/core.c:5097 perf_mmap_close+0x129/0x216
Jan 11 19:03:03 core2 kernel: [12816.126204] WARNING: CPU: 1 PID: 3197 at 
kernel/events/ring_buffer.c:569 __rb_free_aux+0x1a/0xb6
Jan 11 19:03:03 core2 kernel: [12816.126219] CPU: 1 PID: 3197 Comm: perf_fuzzer 
Not tainted 4.15.0-rc7+ #211
Jan 11 19:03:03 core2 kernel: [12816.126220] Hardware name: AOpen   
DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015  10/19/2012
Jan 11 19:03:03 core2 kernel: [12816.126222] RIP: 0010:__rb_free_aux+0x1a/0xb6
Jan 11 19:03:03 core2 kernel: [12816.126223] RSP: :c90007417c08 EFLAGS: 
00010006
Jan 11 19:03:03 core2 kernel: [12816.126224] RAX: 8011 RBX: 
8801197aee00 RCX: 
Jan 11 19:03:03 core2 kernel: [12816.126225] RDX: 00040400 RSI: 
000400f6 RDI: 8801197aee00
Jan 11 19:03:03 core2 kernel: [12816.126225] RBP: 88011fc91000 R08: 
0020 R09: 0030
Jan 11 19:03:03 core2 kernel: [12816.126226] R10: c90007417c28 R11: 
0246 R12: 
Jan 11 19:03:03 core2 kernel: [12816.126227] R13: 0001 R14: 
88011901a800 R15: 
Jan 11 19:03:03 core2 kernel: [12816.126228] FS:  7f1957682700() 
GS:88011fc8() knlGS:
Jan 11 19:03:03 core2 kernel: [12816.126229] CS:  0010 DS:  ES:  CR0: 
80050033
Jan 11 19:03:03 core2 kernel: [12816.126230] CR2: 026be684 CR3: 
00011a79a000 CR4: 000407e0
Jan 11 19:03:03 core2 kernel: [12816.126231] DR0:  DR1: 
 DR2: 
Jan 11 19:03:03 core2 kernel: [12816.126231] DR3:  DR6: 
0ff0 DR7: 0600
Jan 11 19:03:03 core2 kernel: [12816.126232] Call Trace:
Jan 11 19:03:03 core2 kernel: [12816.126236]  perf_aux_output_end+0xf4/0x100
Jan 11 19:03:03 core2 kernel: [12816.126239]  intel_bts_interrupt+0x9e/0xf5
Jan 11 19:03:03 core2 kernel: [12816.126241]  intel_pmu_handle_irq+0x72/0x3dd
Jan 11 19:03:03 core2 kernel: [12816.126245]  ? flush_tlb_mm_range+0xb0/0xca
Jan 11 19:03:03 core2 kernel: [12816.126248]  ? radix_tree_next_chunk+0x73/0x26b
Jan 11 19:03:03 core2 kernel: [12816.126249]  ? get_page+0x5/0xa
Jan 11 19:03:03 core2 kernel: [12816.126251]  ? mm_counter_file+0x5/0x14
Jan 11 19:03:03 core2 kernel: [12816.126254]  ? alloc_set_pte+0x1b9/0x1cf
Jan 11 19:03:03 core2 kernel: [12816.126255]  ? unlock_page+0xa/0x20
Jan 11 19:03:03 core2 kernel: [12816.126256]  ? filemap_map_pages+0x182/0x1f4
Jan 11 19:03:03 core2 kernel: [12816.126258]  ? reuse_swap_page+0x7a/0x115
Jan 11 19:03:03 core2 kernel: [12816.126259]  ? wp_page_reuse+0x31/0x3a
Jan 11 19:03:03 core2 kernel: [12816.126260]  ? do_wp_page+0x16d/0x242
Jan 11 19:03:03 core2 kernel: [12816.126262]  ? __handle_mm_fault+0x67c/0x6f1
Jan 11 19:03:03 core2 kernel: [12816.126264]  ? perf_event_nmi_handler+0x27/0x3e
Jan 11 19:03:03 core2 kernel: [12816.126266]  ? perf_event_nmi_handler+0x1b/0x3e
Jan 11 19:03:03 core2 kernel: [12816.126267]  perf_event_nmi_handler+0x27/0x3e
Jan 11 19:03:03 core2 kernel: [12816.126269]  nmi_handle+0x52/0xf5
Jan 11 19:03:03 core2 kernel: [12816.126271]  default_do_nmi+0x41/0xda
Jan 11 19:03:03 core2 kernel: [12816.126273]  do_nmi+0x92/0x102
Jan 11 19:03:03 core2 kernel: [12816.126275]  nmi+0x67/0xb0
Jan 11 19:03:03 core2 kernel: [12816.126277] RIP: 0033:0x40fa77
Jan 11 19:03:03 core2 kernel: [12816.126277] RSP: 002b:7fff93ea1d48 EFLAGS: 
0202
Jan 11 19:03:03 core2 kernel: [12816.126278] RAX:  RBX: 
000c RCX: 0007887c
Jan 11 19:03:03 core2 kernel: [12816.126279] RDX:  RSI: 
7f1957470620 RDI: 7f19574714e0
Jan 11 19:03:03 core2 kernel: [12816.126280] RBP: 7fff93ea1d60 R08: 
 R09: 7f1957682700
Jan 11 19:03:03 core2 kernel: [12816.126281] R10: 7f19576829d0 R11: 
0246 R12: 00401950
Jan 11 19:03:03 core2 kernel: [12816.126281] R13: 7fff93ea4150 R14: 
 R15: 
Jan 11 19:03:03 core2 kernel: [12816.126282] Code: 38 48 01 c7 48 c7 47 08 00 
00 00 00 e9 4c 98 00 00 66 66 66 66 90 65 8b 05 57 59 f1 7e a9 ff ff ff 7f 41 
54 55 53 48 89 fb 74 02 <0f> ff 48 8b bb e0 00 00 00 48 85 ff 74 1c ff 93 c8 00 
00 00 48 
Jan 11 19:03:03 core2 kernel: [12816.126299] ---[ end trace d8df98463050a325 
]---
Jan 11 19:03:03 core2 kernel: [12816.445380] CPU: 0 PID: 3144 Comm: perf_fuzzer 
Tainted: G        W        4.15.0-rc7+ #211
Jan 11 19:03:03 core2 kernel: [12816.453689] Hardware name: AOpen   
DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015  10/19/2012

Re: perf: perf_fuzzer quickly locks up on 4.15-rc7

2018-01-11 Thread Vince Weaver
On Thu, 11 Jan 2018, Vince Weaver wrote:

> On Thu, 11 Jan 2018, Peter Zijlstra wrote:
> 
> > On Thu, Jan 11, 2018 at 01:21:12PM -0600, Josh Poimboeuf wrote:
> > > Yuck.  This time it was stack recursion on the entry stack.  In the
> > > previous error, recursion was detected on the IRQ stack.  Otherwise they
> > > look quite similar.
> > > 
> > > Was that also with nopti?
> > 
> > Both with pti enabled, nopti makes things work again.
> 
> I think I have hit those errors even with pti disabled but now I'll have 
> to double check.

I can confirm this: I am able to trigger the stack recursion warning even 
when "pti=off" is set.

Jan 11 15:34:47 core2 kernel: [  320.668900] WARNING: stack recursion on stack 
type 4
Jan 11 15:34:47 core2 kernel: [  320.668909] WARNING: can't dereference 
registers at d5ae0491 for ip 
swapgs_restore_regs_and_return_to_usermode+0x28/0x7c

Vince


Re: perf: perf_fuzzer quickly locks up on 4.15-rc7

2018-01-11 Thread Vince Weaver
On Thu, 11 Jan 2018, Vince Weaver wrote:

> Not sure if this info helps, but if I make perf_fuzzer *not* create AUX 
> mmap() buffers, I'm unable to reproduce the hangs both on core2 and 
> haswell.

Confirmed, I can crash the system without the fuzzer, just by doing

perf record --per-thread -e intel_bts// /bin/ls

on my Haswell system.

Vince

