Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Peter Zijlstra
On Tue, Mar 10, 2015 at 03:36:08PM +, Mark Rutland wrote:
> On the assumption that the patch is otherwise OK, how about the commit
> message below?

Yeah, that'll do fine. Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Mark Rutland
On Tue, Mar 10, 2015 at 12:53:51PM +, Peter Zijlstra wrote:
> On Tue, Mar 10, 2015 at 12:05:21PM +, Mark Rutland wrote:
> > On Tue, Mar 10, 2015 at 11:27:23AM +, Peter Zijlstra wrote:
> > > On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:
> > > > From: "Suzuki K. Poulose" 
> > > > 
> > > > Don't allow grouping hardware events from different PMUs
> > > >  (eg. CCI + CPU).
> > > 
> > > Uhm, how does this work? If we have multiple hardware PMUs we'll stop
> > > scheduling events after the first failed event schedule. This can leave
> > > one of the PMUs severely under utilized.
> > 
> > The problem is here group validation at pmu::event_init() time, not
> > scheduling.
> 
> Maybe make that a little more explicit.

On the assumption that the patch is otherwise OK, how about the commit
message below?

>8
arm/pmu: Reject groups spanning multiple hardware PMUs

The perf core implicitly rejects events spanning multiple HW PMUs, as in
these cases the event->ctx will differ. However this validation is
performed after pmu::event_init() is called in perf_init_event(), and
thus pmu::event_init() may be called with a group leader from a
different HW PMU.

The ARM PMU driver does not take this fact into account, and when
validating groups assumes that it can call to_arm_pmu(event->pmu) for
any HW event. When the event in question is from another HW PMU this is
wrong, and results in dereferencing garbage.

This patch updates the ARM PMU driver to first test for and reject
events from other PMUs, moving the to_arm_pmu and related logic after
this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with
a CCI PMU present:

CPU: 0 PID: 1527 Comm: perf_fuzzer Not tainted 4.0.0-rc2 #57
Hardware name: ARM-Versatile Express
task: bd8484c0 ti: be676000 task.ti: be676000
PC is at 0xbf1bbc90
LR is at validate_event+0x34/0x5c
pc : []lr : [<80016060>]psr: 0013
...
[<80016060>] (validate_event) from [<80016198>] (validate_group+0x28/0x90)
[<80016198>] (validate_group) from [<80016398>] (armpmu_event_init+0x150/0x218)
[<80016398>] (armpmu_event_init) from [<800882e4>] 
(perf_try_init_event+0x30/0x48)
[<800882e4>] (perf_try_init_event) from [<8008f544>] (perf_init_event+0x5c/0xf4)
[<8008f544>] (perf_init_event) from [<8008f8a8>] (perf_event_alloc+0x2cc/0x35c)
[<8008f8a8>] (perf_event_alloc) from [<8009015c>] 
(SyS_perf_event_open+0x498/0xa70)
[<8009015c>] (SyS_perf_event_open) from [<8000e420>] (ret_fast_syscall+0x0/0x34)
Code: bf1be000 bf1bb380 802a2664  (0002)
---[ end trace 01aff0ff00926a0a ]---
>8

Mark.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Mark Rutland
> I think we could still solve this problem by deferring the 'context'
> validation to the core. The PMUs could validate the group, within its
> context. i.e, if it can accommodate its events as a group, during 
> event_init.  The problem we face now, is encountering an event from a 
> different PMU, which we could leave it to the core as we do already.

Good point: we're not reliant on other drivers because the core will
still check the context. We only hope that those other drivers don't
make similar mistakes and corrupt things.

[...]

>   static int
> -validate_event(struct pmu_hw_events *hw_events,
> -struct perf_event *event)
> +validate_event(struct pmu *pmu, struct pmu_hw_events *hw_events,
> +struct perf_event *event)
>   {
> - struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
> + struct arm_pmu *armpmu;
> 
>   if (is_software_event(event))
>   return 1;
> 
> + /*
> +  * We are only worried if we can accommodate the events
> +  * from this pmu in this group.
> +  */
> + if (event->pmu != pmu)
> + return 1;

It's better to explicitly reject this case. We know it's non-sensical
and there's no point wasting any time on it. That will also make
big.LITTLE support a bit nicer, whenever I get that under control -- big
and LITTLE events could live in the same task context (so the core won't
reject grouping them) but mustn't be in the same group (so we have to
reject grouping in the backend).

I'd still prefer the group validation being triggered explicitly by the
core code, so that it's logically separate from initialising the event
in isolation, but that's looking like a much bigger job, and I don't
trust myself to correctly update every PMU driver for v4.0.

For the moment let's clean up the commit message for the original patch.
I'll add splitting group validation to my TODO list; there seems to be a
slot free around 2035...

Thanks,
Mark.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Suzuki K. Poulose

On 10/03/15 13:00, Peter Zijlstra wrote:

On Tue, Mar 10, 2015 at 01:53:51PM +0100, Peter Zijlstra wrote:

It would be nicer if we could prevent this in the core so we're not
reliant on every PMU driver doing the same verification. My initial
thought was that seemed like unnecessary duplication of the ctx checking
above, but if we're going to end up shoving it into several drivers
anyway perhaps it's the lesser evil.


Again, agreed, that would be better and less error prone. But I'm not
entirely sure how to go about doing it :/ I'll have to go think about
that; and conferences are not the best place for that.

Suggestions on that are welcome of course ;)


So the problem is that event_init() is what will return the pmu, so we
cannot make decisions on it until after that returns.

Maybe we can pull out the validate step into its own funciton;
pmu->validate() or whatnot, to be called slightly later.


I think we could still solve this problem by deferring the 'context'
validation to the core. The PMUs could validate the group, within its
context. i.e, if it can accommodate its events as a group, during 
event_init.  The problem we face now, is encountering an event from a 
different PMU, which we could leave it to the core as we do already.


i.e the fix could look like (and similarly for other cases):

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 557e128..b3af19b 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -259,20 +259,28 @@ out:
 }

 static int
-validate_event(struct pmu_hw_events *hw_events,
-  struct perf_event *event)
+validate_event(struct pmu *pmu, struct pmu_hw_events *hw_events,
+  struct perf_event *event)
 {
-   struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+   struct arm_pmu *armpmu;

if (is_software_event(event))
return 1;

+   /*
+* We are only worried if we can accommodate the events
+* from this pmu in this group.
+*/
+   if (event->pmu != pmu)
+   return 1;
+
if (event->state < PERF_EVENT_STATE_OFF)
return 1;

if (event->state == PERF_EVENT_STATE_OFF && !event->attr.enable_on_exec)
return 1;

+   armpmu = to_arm_pmu(event->pmu);
return armpmu->get_event_idx(hw_events, event) >= 0;
 }

@@ -288,15 +296,15 @@ validate_group(struct perf_event *event)
 */
memset(_pmu.used_mask, 0, sizeof(fake_pmu.used_mask));

-   if (!validate_event(_pmu, leader))
+   if (!validate_event(event->pmu, _pmu, leader))
return -EINVAL;

list_for_each_entry(sibling, >sibling_list, group_entry) {
-   if (!validate_event(_pmu, sibling))
+   if (!validate_event(event->pmu, _pmu, sibling))
return -EINVAL;
}

-   if (!validate_event(_pmu, event))
+   if (!validate_event(event->pmu, _pmu, event))
return -EINVAL;

return 0;

Thoughts ?

Suzuki

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Mark Rutland
> So the problem is that event_init() is what will return the pmu, so we
> cannot make decisions on it until after that returns.

I took a look into hacking something into perf_try_init_event, but it
ends up duplicating all of the existing tests and looks really out of
place.

> Maybe we can pull out the validate step into its own funciton;
> pmu->validate() or whatnot, to be called slightly later.

Perhaps the other way around: move the call to pmu::event_init() later
(after the other checks in perf_event_open), and in its original spot
add a new pmu::can_handle() that only tells us whether an event in
isolation is for this PMU, without validation of the specifics of the
event.

That would keep the split between the two callbacks more clearly
defined, though it would require updating most PMU drivers, so perhaps
that's too invasive.

I'll dig into this a bit.

Mark.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Peter Zijlstra
On Tue, Mar 10, 2015 at 01:53:51PM +0100, Peter Zijlstra wrote:
> > It would be nicer if we could prevent this in the core so we're not
> > reliant on every PMU driver doing the same verification. My initial
> > thought was that seemed like unnecessary duplication of the ctx checking
> > above, but if we're going to end up shoving it into several drivers
> > anyway perhaps it's the lesser evil.
> 
> Again, agreed, that would be better and less error prone. But I'm not
> entirely sure how to go about doing it :/ I'll have to go think about
> that; and conferences are not the best place for that.
> 
> Suggestions on that are welcome of course ;)

So the problem is that event_init() is what will return the pmu, so we
cannot make decisions on it until after that returns.

Maybe we can pull out the validate step into its own funciton;
pmu->validate() or whatnot, to be called slightly later.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Peter Zijlstra
On Tue, Mar 10, 2015 at 12:05:21PM +, Mark Rutland wrote:
> On Tue, Mar 10, 2015 at 11:27:23AM +, Peter Zijlstra wrote:
> > On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:
> > > From: "Suzuki K. Poulose" 
> > > 
> > > Don't allow grouping hardware events from different PMUs
> > >  (eg. CCI + CPU).
> > 
> > Uhm, how does this work? If we have multiple hardware PMUs we'll stop
> > scheduling events after the first failed event schedule. This can leave
> > one of the PMUs severely under utilized.
> 
> The problem is here group validation at pmu::event_init() time, not
> scheduling.

Maybe make that a little more explicit.

> We don't allow grouping across disparate HW PMUs because we can't
> provide group semantics anyway. Scheduling is not a problem in this case
> (unlike the big.LITTLE case I have a patch for [1]).

Right, I remember that; I was wondering if this was related.

> We have a CPU PMU and an "uncore" CCI PMU. You can't create task-bound
> events for the CCI, but you can create CPU-bound events for the CCI on
> the nominal CPU the CCI is monitored from.

Indeed, ok.

> The context check you added in c3c87e770458aa00 "perf: Tighten (and fix)
> the grouping condition" implicitly rejects groups that have CPU and CCI
> events (each event::ctx will be the relevant pmu::pmu_cpu_context and
> will differ), and this is sane -- you can't provide group semantics
> across disparate HW PMUs.

Agreed.

> Unfortunately that happens after we've done the
> event->pmu->event_init(event) dance on each event, and in our event_init
> function we try to verify the group is sane. In our verification we
> ignore SW events, but assume that all !SW events are for the CPU PMU.
> If you add a CPU event to a CCI group, that's not the case, and we use
> container_of on an unsuitable object, derefence garbage, invoke the
> eschaton and so on.

Indeed, on x86 we explicitly ignore everything not an x86_pmu event.

> It would be nicer if we could prevent this in the core so we're not
> reliant on every PMU driver doing the same verification. My initial
> thought was that seemed like unnecessary duplication of the ctx checking
> above, but if we're going to end up shoving it into several drivers
> anyway perhaps it's the lesser evil.

Again, agreed, that would be better and less error prone. But I'm not
entirely sure how to go about doing it :/ I'll have to go think about
that; and conferences are not the best place for that.

Suggestions on that are welcome of course ;)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Mark Rutland
On Tue, Mar 10, 2015 at 11:27:23AM +, Peter Zijlstra wrote:
> On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:
> > From: "Suzuki K. Poulose" 
> > 
> > Don't allow grouping hardware events from different PMUs
> >  (eg. CCI + CPU).
> 
> Uhm, how does this work? If we have multiple hardware PMUs we'll stop
> scheduling events after the first failed event schedule. This can leave
> one of the PMUs severely under utilized.

The problem is here group validation at pmu::event_init() time, not
scheduling.

We don't allow grouping across disparate HW PMUs because we can't
provide group semantics anyway. Scheduling is not a problem in this case
(unlike the big.LITTLE case I have a patch for [1]).

We have a CPU PMU and an "uncore" CCI PMU. You can't create task-bound
events for the CCI, but you can create CPU-bound events for the CCI on
the nominal CPU the CCI is monitored from.

The context check you added in c3c87e770458aa00 "perf: Tighten (and fix)
the grouping condition" implicitly rejects groups that have CPU and CCI
events (each event::ctx will be the relevant pmu::pmu_cpu_context and
will differ), and this is sane -- you can't provide group semantics
across disparate HW PMUs.

Unfortunately that happens after we've done the
event->pmu->event_init(event) dance on each event, and in our event_init
function we try to verify the group is sane. In our verification we
ignore SW events, but assume that all !SW events are for the CPU PMU.
If you add a CPU event to a CCI group, that's not the case, and we use
container_of on an unsuitable object, derefence garbage, invoke the
eschaton and so on.

It would be nicer if we could prevent this in the core so we're not
reliant on every PMU driver doing the same verification. My initial
thought was that seemed like unnecessary duplication of the ctx checking
above, but if we're going to end up shoving it into several drivers
anyway perhaps it's the lesser evil.

Mark.

[1] https://lkml.org/lkml/2015/2/5/520
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Suzuki K. Poulose

On 10/03/15 11:27, Peter Zijlstra wrote:

On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:

From: "Suzuki K. Poulose" 

Don't allow grouping hardware events from different PMUs
  (eg. CCI + CPU).


Uhm, how does this work? If we have multiple hardware PMUs we'll stop
scheduling events after the first failed event schedule. This can leave
one of the PMUs severely under utilized.

This is done from pmu->event_init(), where we haven't scheduled an
event yet. Do you think we need to solve it using a different approach
? What is the best way to handle this situation ? Is it OK
to allow different PMUs in the group ?

Suzuki





--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Peter Zijlstra
On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:
> From: "Suzuki K. Poulose" 
> 
> Don't allow grouping hardware events from different PMUs
>  (eg. CCI + CPU).

Uhm, how does this work? If we have multiple hardware PMUs we'll stop
scheduling events after the first failed event schedule. This can leave
one of the PMUs severely under utilized.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Mark Rutland
On Tue, Mar 10, 2015 at 11:27:23AM +, Peter Zijlstra wrote:
 On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:
  From: Suzuki K. Poulose suzuki.poul...@arm.com
  
  Don't allow grouping hardware events from different PMUs
   (eg. CCI + CPU).
 
 Uhm, how does this work? If we have multiple hardware PMUs we'll stop
 scheduling events after the first failed event schedule. This can leave
 one of the PMUs severely under utilized.

The problem is here group validation at pmu::event_init() time, not
scheduling.

We don't allow grouping across disparate HW PMUs because we can't
provide group semantics anyway. Scheduling is not a problem in this case
(unlike the big.LITTLE case I have a patch for [1]).

We have a CPU PMU and an uncore CCI PMU. You can't create task-bound
events for the CCI, but you can create CPU-bound events for the CCI on
the nominal CPU the CCI is monitored from.

The context check you added in c3c87e770458aa00 perf: Tighten (and fix)
the grouping condition implicitly rejects groups that have CPU and CCI
events (each event::ctx will be the relevant pmu::pmu_cpu_context and
will differ), and this is sane -- you can't provide group semantics
across disparate HW PMUs.

Unfortunately that happens after we've done the
event-pmu-event_init(event) dance on each event, and in our event_init
function we try to verify the group is sane. In our verification we
ignore SW events, but assume that all !SW events are for the CPU PMU.
If you add a CPU event to a CCI group, that's not the case, and we use
container_of on an unsuitable object, derefence garbage, invoke the
eschaton and so on.

It would be nicer if we could prevent this in the core so we're not
reliant on every PMU driver doing the same verification. My initial
thought was that seemed like unnecessary duplication of the ctx checking
above, but if we're going to end up shoving it into several drivers
anyway perhaps it's the lesser evil.

Mark.

[1] https://lkml.org/lkml/2015/2/5/520
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Suzuki K. Poulose

On 10/03/15 11:27, Peter Zijlstra wrote:

On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:

From: Suzuki K. Poulose suzuki.poul...@arm.com

Don't allow grouping hardware events from different PMUs
  (eg. CCI + CPU).


Uhm, how does this work? If we have multiple hardware PMUs we'll stop
scheduling events after the first failed event schedule. This can leave
one of the PMUs severely under utilized.

This is done from pmu-event_init(), where we haven't scheduled an
event yet. Do you think we need to solve it using a different approach
? What is the best way to handle this situation ? Is it OK
to allow different PMUs in the group ?

Suzuki





--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Peter Zijlstra
On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:
 From: Suzuki K. Poulose suzuki.poul...@arm.com
 
 Don't allow grouping hardware events from different PMUs
  (eg. CCI + CPU).

Uhm, how does this work? If we have multiple hardware PMUs we'll stop
scheduling events after the first failed event schedule. This can leave
one of the PMUs severely under utilized.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Suzuki K. Poulose

On 10/03/15 13:00, Peter Zijlstra wrote:

On Tue, Mar 10, 2015 at 01:53:51PM +0100, Peter Zijlstra wrote:

It would be nicer if we could prevent this in the core so we're not
reliant on every PMU driver doing the same verification. My initial
thought was that seemed like unnecessary duplication of the ctx checking
above, but if we're going to end up shoving it into several drivers
anyway perhaps it's the lesser evil.


Again, agreed, that would be better and less error prone. But I'm not
entirely sure how to go about doing it :/ I'll have to go think about
that; and conferences are not the best place for that.

Suggestions on that are welcome of course ;)


So the problem is that event_init() is what will return the pmu, so we
cannot make decisions on it until after that returns.

Maybe we can pull out the validate step into its own funciton;
pmu-validate() or whatnot, to be called slightly later.


I think we could still solve this problem by deferring the 'context'
validation to the core. The PMUs could validate the group, within its
context. i.e, if it can accommodate its events as a group, during 
event_init.  The problem we face now, is encountering an event from a 
different PMU, which we could leave it to the core as we do already.


i.e the fix could look like (and similarly for other cases):

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 557e128..b3af19b 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -259,20 +259,28 @@ out:
 }

 static int
-validate_event(struct pmu_hw_events *hw_events,
-  struct perf_event *event)
+validate_event(struct pmu *pmu, struct pmu_hw_events *hw_events,
+  struct perf_event *event)
 {
-   struct arm_pmu *armpmu = to_arm_pmu(event-pmu);
+   struct arm_pmu *armpmu;

if (is_software_event(event))
return 1;

+   /*
+* We are only worried if we can accommodate the events
+* from this pmu in this group.
+*/
+   if (event-pmu != pmu)
+   return 1;
+
if (event-state  PERF_EVENT_STATE_OFF)
return 1;

if (event-state == PERF_EVENT_STATE_OFF  !event-attr.enable_on_exec)
return 1;

+   armpmu = to_arm_pmu(event-pmu);
return armpmu-get_event_idx(hw_events, event) = 0;
 }

@@ -288,15 +296,15 @@ validate_group(struct perf_event *event)
 */
memset(fake_pmu.used_mask, 0, sizeof(fake_pmu.used_mask));

-   if (!validate_event(fake_pmu, leader))
+   if (!validate_event(event-pmu, fake_pmu, leader))
return -EINVAL;

list_for_each_entry(sibling, leader-sibling_list, group_entry) {
-   if (!validate_event(fake_pmu, sibling))
+   if (!validate_event(event-pmu, fake_pmu, sibling))
return -EINVAL;
}

-   if (!validate_event(fake_pmu, event))
+   if (!validate_event(event-pmu, fake_pmu, event))
return -EINVAL;

return 0;

Thoughts ?

Suzuki

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Peter Zijlstra
On Tue, Mar 10, 2015 at 12:05:21PM +, Mark Rutland wrote:
 On Tue, Mar 10, 2015 at 11:27:23AM +, Peter Zijlstra wrote:
  On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:
   From: Suzuki K. Poulose suzuki.poul...@arm.com
   
   Don't allow grouping hardware events from different PMUs
(eg. CCI + CPU).
  
  Uhm, how does this work? If we have multiple hardware PMUs we'll stop
  scheduling events after the first failed event schedule. This can leave
  one of the PMUs severely under utilized.
 
 The problem is here group validation at pmu::event_init() time, not
 scheduling.

Maybe make that a little more explicit.

 We don't allow grouping across disparate HW PMUs because we can't
 provide group semantics anyway. Scheduling is not a problem in this case
 (unlike the big.LITTLE case I have a patch for [1]).

Right, I remember that; I was wondering if this was related.

 We have a CPU PMU and an uncore CCI PMU. You can't create task-bound
 events for the CCI, but you can create CPU-bound events for the CCI on
 the nominal CPU the CCI is monitored from.

Indeed, ok.

 The context check you added in c3c87e770458aa00 perf: Tighten (and fix)
 the grouping condition implicitly rejects groups that have CPU and CCI
 events (each event::ctx will be the relevant pmu::pmu_cpu_context and
 will differ), and this is sane -- you can't provide group semantics
 across disparate HW PMUs.

Agreed.

 Unfortunately that happens after we've done the
 event-pmu-event_init(event) dance on each event, and in our event_init
 function we try to verify the group is sane. In our verification we
 ignore SW events, but assume that all !SW events are for the CPU PMU.
 If you add a CPU event to a CCI group, that's not the case, and we use
 container_of on an unsuitable object, derefence garbage, invoke the
 eschaton and so on.

Indeed, on x86 we explicitly ignore everything not an x86_pmu event.

 It would be nicer if we could prevent this in the core so we're not
 reliant on every PMU driver doing the same verification. My initial
 thought was that seemed like unnecessary duplication of the ctx checking
 above, but if we're going to end up shoving it into several drivers
 anyway perhaps it's the lesser evil.

Again, agreed, that would be better and less error prone. But I'm not
entirely sure how to go about doing it :/ I'll have to go think about
that; and conferences are not the best place for that.

Suggestions on that are welcome of course ;)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Peter Zijlstra
On Tue, Mar 10, 2015 at 01:53:51PM +0100, Peter Zijlstra wrote:
  It would be nicer if we could prevent this in the core so we're not
  reliant on every PMU driver doing the same verification. My initial
  thought was that seemed like unnecessary duplication of the ctx checking
  above, but if we're going to end up shoving it into several drivers
  anyway perhaps it's the lesser evil.
 
 Again, agreed, that would be better and less error prone. But I'm not
 entirely sure how to go about doing it :/ I'll have to go think about
 that; and conferences are not the best place for that.
 
 Suggestions on that are welcome of course ;)

So the problem is that event_init() is what will return the pmu, so we
cannot make decisions on it until after that returns.

Maybe we can pull out the validate step into its own funciton;
pmu-validate() or whatnot, to be called slightly later.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Mark Rutland
 So the problem is that event_init() is what will return the pmu, so we
 cannot make decisions on it until after that returns.

I took a look into hacking something into perf_try_init_event, but it
ends up duplicating all of the existing tests and looks really out of
place.

 Maybe we can pull out the validate step into its own funciton;
 pmu-validate() or whatnot, to be called slightly later.

Perhaps the other way around: move the call to pmu::event_init() later
(after the other checks in perf_event_open), and in its original spot
add a new pmu::can_handle() that only tells us whether an event in
isolation is for this PMU, without validation of the specifics of the
event.

That would keep the split between the two callbacks more clearly
defined, though it would require updating most PMU drivers, so perhaps
that's too invasive.

I'll dig into this a bit.

Mark.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Mark Rutland
 I think we could still solve this problem by deferring the 'context'
 validation to the core. The PMUs could validate the group, within its
 context. i.e, if it can accommodate its events as a group, during 
 event_init.  The problem we face now, is encountering an event from a 
 different PMU, which we could leave it to the core as we do already.

Good point: we're not reliant on other drivers because the core will
still check the context. We only hope that those other drivers don't
make similar mistakes and corrupt things.

[...]

   static int
 -validate_event(struct pmu_hw_events *hw_events,
 -struct perf_event *event)
 +validate_event(struct pmu *pmu, struct pmu_hw_events *hw_events,
 +struct perf_event *event)
   {
 - struct arm_pmu *armpmu = to_arm_pmu(event-pmu);
 + struct arm_pmu *armpmu;
 
   if (is_software_event(event))
   return 1;
 
 + /*
 +  * We are only worried if we can accommodate the events
 +  * from this pmu in this group.
 +  */
 + if (event-pmu != pmu)
 + return 1;

It's better to explicitly reject this case. We know it's non-sensical
and there's no point wasting any time on it. That will also make
big.LITTLE support a bit nicer, whenever I get that under control -- big
and LITTLE events could live in the same task context (so the core won't
reject grouping them) but mustn't be in the same group (so we have to
reject grouping in the backend).

I'd still prefer the group validation being triggered explicitly by the
core code, so that it's logically separate from initialising the event
in isolation, but that's looking like a much bigger job, and I don't
trust myself to correctly update every PMU driver for v4.0.

For the moment let's clean up the commit message for the original patch.
I'll add splitting group validation to my TODO list; there seems to be a
slot free around 2035...

Thanks,
Mark.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Peter Zijlstra
On Tue, Mar 10, 2015 at 03:36:08PM +, Mark Rutland wrote:
 On the assumption that the patch is otherwise OK, how about the commit
 message below?

Yeah, that'll do fine. Thanks!
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] arm/pmu: Reject groups spanning multiple hardware PMUs

2015-03-10 Thread Mark Rutland
On Tue, Mar 10, 2015 at 12:53:51PM +, Peter Zijlstra wrote:
 On Tue, Mar 10, 2015 at 12:05:21PM +, Mark Rutland wrote:
  On Tue, Mar 10, 2015 at 11:27:23AM +, Peter Zijlstra wrote:
   On Mon, Mar 09, 2015 at 12:46:30PM +, Suzuki K. Poulose wrote:
From: Suzuki K. Poulose suzuki.poul...@arm.com

Don't allow grouping hardware events from different PMUs
 (eg. CCI + CPU).
   
   Uhm, how does this work? If we have multiple hardware PMUs we'll stop
   scheduling events after the first failed event schedule. This can leave
   one of the PMUs severely under utilized.
  
  The problem is here group validation at pmu::event_init() time, not
  scheduling.
 
 Maybe make that a little more explicit.

On the assumption that the patch is otherwise OK, how about the commit
message below?

8
arm/pmu: Reject groups spanning multiple hardware PMUs

The perf core implicitly rejects events spanning multiple HW PMUs, as in
these cases the event-ctx will differ. However this validation is
performed after pmu::event_init() is called in perf_init_event(), and
thus pmu::event_init() may be called with a group leader from a
different HW PMU.

The ARM PMU driver does not take this fact into account, and when
validating groups assumes that it can call to_arm_pmu(event-pmu) for
any HW event. When the event in question is from another HW PMU this is
wrong, and results in dereferencing garbage.

This patch updates the ARM PMU driver to first test for and reject
events from other PMUs, moving the to_arm_pmu and related logic after
this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with
a CCI PMU present:

CPU: 0 PID: 1527 Comm: perf_fuzzer Not tainted 4.0.0-rc2 #57
Hardware name: ARM-Versatile Express
task: bd8484c0 ti: be676000 task.ti: be676000
PC is at 0xbf1bbc90
LR is at validate_event+0x34/0x5c
pc : [bf1bbc90]lr : [80016060]psr: 0013
...
[80016060] (validate_event) from [80016198] (validate_group+0x28/0x90)
[80016198] (validate_group) from [80016398] (armpmu_event_init+0x150/0x218)
[80016398] (armpmu_event_init) from [800882e4] 
(perf_try_init_event+0x30/0x48)
[800882e4] (perf_try_init_event) from [8008f544] (perf_init_event+0x5c/0xf4)
[8008f544] (perf_init_event) from [8008f8a8] (perf_event_alloc+0x2cc/0x35c)
[8008f8a8] (perf_event_alloc) from [8009015c] 
(SyS_perf_event_open+0x498/0xa70)
[8009015c] (SyS_perf_event_open) from [8000e420] (ret_fast_syscall+0x0/0x34)
Code: bf1be000 bf1bb380 802a2664  (0002)
---[ end trace 01aff0ff00926a0a ]---
8

Mark.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/