On Mon, 24 Feb 2014, Peter Zijlstra wrote:
> On Fri, Feb 21, 2014 at 03:18:38PM -0500, Vince Weaver wrote:
> > I've applied the patch and have been unable to trigger the warning with
> > either my testcase or a few hours of fuzzing.
>
> Yay.
>
> > My only comment on the patch is it could always
On Fri, Feb 21, 2014 at 03:18:38PM -0500, Vince Weaver wrote:
> I've applied the patch and have been unable to trigger the warning with
> either my testcase or a few hours of fuzzing.
Yay.
> My only comment on the patch is it could always use some comments.
>
> The perf_event code is really har
On Fri, 21 Feb 2014, Peter Zijlstra wrote:
> group_sched_in() that fails (for whatever reason), and without x86_pmu
> TXN support (because the leader is !x86_pmu), will corrupt the n_added
> state.
>
> If this all is correct; the below ought to cure things.
I've applied the patch and have been u
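The patch itself is not shown in this listing, so the following is only a toy model of how an n_added-style counter could go stale when a failed group add is rolled back without transaction support. The struct and the toy_* helpers are hypothetical and the rollback logic is an assumption, not the kernel's code.

```c
#include <assert.h>

/* Hypothetical toy per-CPU state; the field names echo the thread,
 * the logic around them is assumed for illustration only. */
struct toy_cpuc {
    int n_events;  /* events currently collected */
    int n_added;   /* events added since the last enable pass */
};

/* Adding an event bumps both counters. */
static void toy_add(struct toy_cpuc *c)
{
    c->n_events++;
    c->n_added++;
}

/* Rollback that forgets the pending count: n_added is left stale. */
static void toy_del_buggy(struct toy_cpuc *c)
{
    c->n_events--;
}

/* Rollback that also drops the pending count when one is outstanding. */
static void toy_del_fixed(struct toy_cpuc *c)
{
    c->n_events--;
    if (c->n_added > 0)
        c->n_added--;
}
```

In this model, one toy_add() followed by toy_del_buggy() leaves n_events at 0 but n_added at 1, so a later enable pass would believe an event is pending that no longer exists. With toy_del_fixed() both counters return to 0.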
On Thu, Feb 20, 2014 at 07:23:00PM +0100, Peter Zijlstra wrote:
> This is I think the relevant bit:
>
> pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_disable: x86_pmu_disable
> pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_state: Events: {
> pec_1076_warn-2804 [000] d...
and the perf_fuzzer overnight triggered this possibly related warning
in x86_pmu_stop()
I assume it's this code (the line numbers don't match up for some reason).
if (__test_and_clear_bit(hwc->idx, cpuc->active_mask)) {
x86_pmu.disable(event);
cpuc->events
On Thu, 20 Feb 2014, Vince Weaver wrote:
> On Thu, 20 Feb 2014, Vince Weaver wrote:
>
> > Might be relevant: check the last_cpu values. Right before the above
> > it looks like the thread gets moved from CPU 1 to CPU 0
> > (possibly as a result of the long chain started with the
> > close() of t
On Thu, 20 Feb 2014, Vince Weaver wrote:
> Might be relevant: check the last_cpu values. Right before the above
> it looks like the thread gets moved from CPU 1 to CPU 0
> (possibly as a result of the long chain started with the
> close() of the tracepoint event),
> so the problem NMI watchdog ev
On Thu, 20 Feb 2014 19:15:38 +0100
Peter Zijlstra wrote:
> I think by using the /debug/tracing/events/ftrace/function event, but
> I'm not actually sure, I've never used it nor did I write the code to do
> it. Jolsa did all that IIRC.
>
> All I know is that we had some 'fun' bugs around there s
On Thu, 20 Feb 2014, Peter Zijlstra wrote:
> On Thu, Feb 20, 2014 at 01:03:16PM -0500, Vince Weaver wrote:
> > attached, it's not very big.
>
> This is I think the relevant bit:
>
> pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_disable: x86_pmu_disable
> pec_1076_warn-2804 [000]
On Thu, Feb 20, 2014 at 07:15:38PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 20, 2014 at 09:31:19AM -0800, Andi Kleen wrote:
> > Peter Zijlstra writes:
> > >
> > > It will; trace_printk() works without -pg, I think you didn't read the
> > > instructions very well.
> >
> > Ok, you enable and disa
On Thu, Feb 20, 2014 at 01:03:16PM -0500, Vince Weaver wrote:
> attached, it's not very big.
This is I think the relevant bit:
pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_disable: x86_pmu_disable
pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_state: Events: {
pec_1076_warn
On Thu, Feb 20, 2014 at 12:46:12PM -0500, Steven Rostedt wrote:
> On Thu, 20 Feb 2014 12:43:51 -0500
> Steven Rostedt wrote:
>
> > As a disable_trace_on_warning is more of a modification to the kernel,
> > I'm leaning to adding a /proc/sys/kernel/ftrace_disable_on_warning
> > file. This keeps it
On Thu, Feb 20, 2014 at 09:31:19AM -0800, Andi Kleen wrote:
> Peter Zijlstra writes:
> >
> > It will; trace_printk() works without -pg, I think you didn't read the
> > instructions very well.
>
> Ok, you enable and disable it again. I won't guess why you do that.
To grow the trace buffers; it s
On Thu, 20 Feb 2014, Peter Zijlstra wrote:
> On Wed, Feb 19, 2014 at 05:34:49PM -0500, Vince Weaver wrote:
> > So where would the NMI counter event get disabled? Would it never get
> > disabled, just because it's always running and always gets the same fixed
> > slot? Why isn't this a problem
On Thu, 20 Feb 2014 12:43:51 -0500
Steven Rostedt wrote:
> As a disable_trace_on_warning is more of a modification to the kernel,
> I'm leaning to adding a /proc/sys/kernel/ftrace_disable_on_warning
> file. This keeps it in line with ftrace_dump_on_oops, which is the most
> similar feature.
Neve
On Thu, 20 Feb 2014 18:00:18 +0100
Peter Zijlstra wrote:
> On Thu, Feb 20, 2014 at 11:26:00AM -0500, Steven Rostedt wrote:
> > On Thu, 20 Feb 2014 11:08:30 +0100
> > Peter Zijlstra wrote:
> >
> > > @rostedt: WTF is disable_trace_on_warning a boot option only?
> >
> > Laziness.
> >
> >
> > I'
Peter Zijlstra writes:
>
> It will; trace_printk() works without -pg, I think you didn't read the
> instructions very well.
Ok, you enable and disable it again. I won't guess why you do that.
>
> And there's a very good reason not to apply your patch; you can route
> the function tracer into pe
On Thu, Feb 20, 2014 at 11:26:00AM -0500, Steven Rostedt wrote:
> On Thu, 20 Feb 2014 11:08:30 +0100
> Peter Zijlstra wrote:
>
> > @rostedt: WTF is disable_trace_on_warning a boot option only?
>
> Laziness.
>
>
> I'll add a sysctl for it in 3.15.
/debug/tracing/options/ was where I was lookin
On Thu, 20 Feb 2014 11:08:30 +0100
Peter Zijlstra wrote:
> @rostedt: WTF is disable_trace_on_warning a boot option only?
Laziness.
I'll add a sysctl for it in 3.15.
-- Steve
Peter Zijlstra writes:
> On Wed, Feb 19, 2014 at 05:34:49PM -0500, Vince Weaver wrote:
>> So where would the NMI counter event get disabled? Would it never get
>> disabled, just because it's always running and always gets the same fixed
>> slot? Why isn't this a problem all the time, not just
On Thu, Feb 20, 2014 at 07:47:23AM -0800, Andi Kleen wrote:
> Peter Zijlstra writes:
>
> > On Wed, Feb 19, 2014 at 05:34:49PM -0500, Vince Weaver wrote:
> >> So where would the NMI counter event get disabled? Would it never get
> >> disabled, just because it's always running and always gets the
On Wed, Feb 19, 2014 at 05:34:49PM -0500, Vince Weaver wrote:
> So where would the NMI counter event get disabled? Would it never get
> disabled, just because it's always running and always gets the same fixed
> slot? Why isn't this a problem all the time, not just with corner cases?
Well it c
On Wed, 19 Feb 2014, Peter Zijlstra wrote:
> So when we add a new event (or more) we compute a mapping from event to
> counter. Then we disable all (pre existing) events that moved to a new
> location, then we enable all events (insert HES_ARCH) that were running
> but got relocated and the new ev
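The two-step rescheduling described above can be sketched as a small userspace model. This is a simplified illustration, not the kernel's x86_pmu_enable(): struct toy_event, toy_reschedule() and NO_COUNTER are hypothetical names, and the model assumes every event that is not running is a newly added one.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_COUNTER (-1)

/* Hypothetical toy event; not the kernel's struct perf_event. */
struct toy_event {
    int  counter;  /* counter currently occupied, or NO_COUNTER */
    bool running;  /* true while the event is counting */
};

/*
 * Apply a new event-to-counter mapping in two passes.
 * assign[i] is the counter chosen for ev[i].
 * Returns the number of events that were started or restarted.
 */
static int toy_reschedule(struct toy_event *ev, const int *assign, int n)
{
    int started = 0;

    /* Pass 1: stop pre-existing events whose counter changed. */
    for (int i = 0; i < n; i++) {
        if (ev[i].running && ev[i].counter != assign[i])
            ev[i].running = false;
    }

    /* Pass 2: start the relocated events and the newly added ones. */
    for (int i = 0; i < n; i++) {
        if (ev[i].running)
            continue;  /* kept its counter, was never stopped */
        ev[i].counter = assign[i];
        ev[i].running = true;
        started++;
    }
    return started;
}
```

Splitting the work into two passes means that no event is started on a counter before that counter's previous occupant has been stopped. An event that keeps its counter, such as one that always lands in the same fixed slot, is never touched by either pass in this model.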
On Tue, Feb 18, 2014 at 05:20:57PM -0500, Vince Weaver wrote:
> On Tue, 18 Feb 2014, Vince Weaver wrote:
>
> > On Mon, 17 Feb 2014, Peter Zijlstra wrote:
> >
> > > Enable CONFIG_FRAME_POINTER for better stack traces; I suspect the
> > > list_del_event() is just random stack garbage. The path that
On Tue, 18 Feb 2014, Vince Weaver wrote:
> On Mon, 17 Feb 2014, Peter Zijlstra wrote:
>
> > Enable CONFIG_FRAME_POINTER for better stack traces; I suspect the
> > list_del_event() is just random stack garbage. The path that makes sense
> > is:
> > wait_rcu()->__wait_for_common()->schedule_timeo
On Mon, 17 Feb 2014, Peter Zijlstra wrote:
> Enable CONFIG_FRAME_POINTER for better stack traces; I suspect the
> list_del_event() is just random stack garbage. The path that makes sense
> is:
> wait_rcu()->__wait_for_common()->schedule_timeout()
Here's an updated stack trace on 3.14-rc3 with C
On Thu, Feb 13, 2014 at 05:13:20PM -0500, Vince Weaver wrote:
> On Thu, 13 Feb 2014, Vince Weaver wrote:
>
> > The plot thickens. The WARN_ON is not caused by the cycles event that we
> > open, but it's caused by the NMI Watchdog cycles event.
>
> The WARN_ON_ONCE at line 1076 in perf_event.c i
On Thu, 13 Feb 2014, Vince Weaver wrote:
> The plot thickens. The WARN_ON is not caused by the cycles event that we
> open, but it's caused by the NMI Watchdog cycles event.
The WARN_ON_ONCE at line 1076 in perf_event.c is triggering because
x86_pmu_enable() is calling x86_pmu_start() for al
On Thu, 13 Feb 2014, Vince Weaver wrote:
> On Wed, 12 Feb 2014, Vince Weaver wrote:
> >
> > It is triggered in this case when you have:
> >
> > An event group of breakpoint, cycles, branches
> > An event of instructions with precise=1
> > A tracepoint
> >
> > and then you close the tracepo
On Wed, 12 Feb 2014, Vince Weaver wrote:
> On Tue, 11 Feb 2014, Peter Zijlstra wrote:
> >
> > I'll see if I can run through the reproduction case by hand.
>
> I've come up with an even simpler test case with all of the extraneous
> settings removed. Included below.
>
> It is triggered in this
On Tue, 11 Feb 2014, Peter Zijlstra wrote:
>
> I'll see if I can run through the reproduction case by hand.
I've come up with an even simpler test case with all of the extraneous
settings removed. Included below.
It is triggered in this case when you have:
An event group of breakpoint, cycl
On Mon, Feb 10, 2014 at 04:26:29PM -0500, Vince Weaver wrote:
> On Thu, 30 Jan 2014, Dave Jones wrote:
>
> > I gave Vince's perf_fuzzer a run, hoping to trigger a different perf bug
> > that I've been seeing. Instead I hit a different bug.
>
> I've been seeing that WARN_ON for months but it was h
On Thu, 30 Jan 2014, Dave Jones wrote:
> I gave Vince's perf_fuzzer a run, hoping to trigger a different perf bug
> that I've been seeing. Instead I hit a different bug.
I've been seeing that WARN_ON for months but it was hard to reproduce.
After a lot of hassle (and scores of reboots) I managed
I gave Vince's perf_fuzzer a run, hoping to trigger a different perf bug
that I've been seeing. Instead I hit a different bug.
WARNING: CPU: 1 PID: 9277 at arch/x86/kernel/cpu/perf_event.c:1076
x86_pmu_start+0xd1/0x110()
CPU: 1 PID: 9277 Comm: perf_fuzzer Not tainted 3.13.0+ #101
000