Re: [PATCH V2 3/5] perf/x86/intel: Filter unsupported Topdown metrics event

2021-01-27 Thread Liang, Kan




On 1/27/2021 2:13 PM, Peter Zijlstra wrote:

> On Wed, Jan 27, 2021 at 07:38:43AM -0800, kan.li...@linux.intel.com wrote:
>> From: Kan Liang 
>>
>> Currently, perf doesn't check the index of a Topdown metrics event
>> before updating the event. A perf tool user may get a value from an
>> unsupported Topdown metrics event.
>>
>> For example, the L2 topdown event, cpu/event=0x00,umask=0x84/, is not
>> supported on Ice Lake. A perf tool user may mistakenly use the
>> unsupported events via the RAW format. In this case, the scheduler
>> follows the unknown-event handling and assigns a GP counter to the
>> event. icl_get_topdown_value() then returns the value of the slots to
>> the event, so the perf tool user will get a value for the unsupported
>> Topdown metrics event.
>>
>> Add a check in __icl_update_topdown_event() and avoid updating
>> unsupported events.
>
> I was struggling to understand how we end up here, because userspace
> can add whatever crap it wants; why is only this thing a problem?
>
> But the actual problem is the next patch changing INTEL_TD_METRIC_NUM,
> which then means is_metric_idx() will change, and that means that for
> ICL we'll accept these raw events as metric events on creation and then
> at runtime we get into trouble.
>
> This isn't spelled out.
>
> I do think this is entirely the wrong fix for that, though. You're now
> adding cycles to the relatively hot path, instead of fixing the problem
> at event creation, which is the slow path.
>
> Why can't we either refuse the event on ICL or otherwise wreck it on
> construction to avoid getting into trouble here?



To reject the unsupported topdown events, I think perf needs to know the
number of supported topdown events. Maybe we can add a variable,
num_topdown_events, to x86_pmu as below. Is it OK?


diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 3d6fdcf..c7f2602 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1061,6 +1061,18 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 
 	return unsched ? -EINVAL : 0;
 }

+#define INTEL_TD_METRIC_AVAILABLE_MAX	(INTEL_TD_METRIC_RETIRING + \
+					 ((x86_pmu.num_topdown_events - 1) << 8))
+
+inline bool is_metric_event(struct perf_event *event)
+{
+	u64 config = event->attr.config;
+
+	return ((config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
+		((config & INTEL_ARCH_EVENT_MASK) >= INTEL_TD_METRIC_RETIRING) &&
+		((config & INTEL_ARCH_EVENT_MASK) <= INTEL_TD_METRIC_AVAILABLE_MAX);
+}
+
 static int add_nr_metric_event(struct cpu_hw_events *cpuc,
 			       struct perf_event *event)
 {
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cd4d542..eab1eba 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5769,6 +5769,7 @@ __init int intel_pmu_init(void)
 		x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xc9, .umask=0x04);
 		x86_pmu.lbr_pt_coexist = true;
 		intel_pmu_pebs_data_source_skl(pmem);
+		x86_pmu.num_topdown_events = 4;
 		x86_pmu.update_topdown_event = icl_update_topdown_event;
 		x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
 		pr_cont("Icelake events, ");
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 15977d0..8b05893 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -87,14 +87,7 @@ static inline bool is_topdown_count(struct perf_event *event)
 
 	return event->hw.flags & PERF_X86_EVENT_TOPDOWN;
 }

-static inline bool is_metric_event(struct perf_event *event)
-{
-	u64 config = event->attr.config;
-
-	return ((config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
-		((config & INTEL_ARCH_EVENT_MASK) >= INTEL_TD_METRIC_RETIRING) &&
-		((config & INTEL_ARCH_EVENT_MASK) <= INTEL_TD_METRIC_MAX);
-}
+inline bool is_metric_event(struct perf_event *event);

 static inline bool is_slots_event(struct perf_event *event)
 {
@@ -782,6 +775,7 @@ struct x86_pmu {
 	/*
 	 * Intel perf metrics
 	 */
+	int		num_topdown_events;
 	u64		(*update_topdown_event)(struct perf_event *event);
 	int		(*set_topdown_event_period)(struct perf_event *event);


Thanks,
Kan
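
For illustration, here is a minimal userspace sketch (not kernel code) of how
the proposed bounded check would behave. The metric encodings and masks below
are taken from arch/x86/include/asm/perf_event.h; is_metric_config(), main(),
and the two num_topdown_events values (4 for Ice Lake's L1 metrics, 8 for a
part that also has the L2 metrics) are only stand-ins for the demo:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* As defined in arch/x86/include/asm/perf_event.h */
#define ARCH_PERFMON_EVENTSEL_EVENT	0x000000FFULL
#define ARCH_PERFMON_EVENTSEL_UMASK	0x0000FF00ULL
#define INTEL_ARCH_EVENT_MASK	\
	(ARCH_PERFMON_EVENTSEL_UMASK | ARCH_PERFMON_EVENTSEL_EVENT)
#define INTEL_TD_METRIC_RETIRING	0x8000	/* first metric: umask 0x80 */

/* Stand-in for x86_pmu.num_topdown_events from the diff above */
static int num_topdown_events;

#define INTEL_TD_METRIC_AVAILABLE_MAX	(INTEL_TD_METRIC_RETIRING + \
					 ((num_topdown_events - 1) << 8))

static bool is_metric_config(uint64_t config)
{
	return ((config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
		((config & INTEL_ARCH_EVENT_MASK) >= INTEL_TD_METRIC_RETIRING) &&
		((config & INTEL_ARCH_EVENT_MASK) <= INTEL_TD_METRIC_AVAILABLE_MAX);
}

int main(void)
{
	/* cpu/event=0x00,umask=0x84/, the unsupported L2 event from above */
	uint64_t l2_config = 0x8400;

	num_topdown_events = 4;		/* Ice Lake: 4 L1 metrics */
	printf("4 metrics: %s\n", is_metric_config(l2_config) ?
	       "metric event" : "falls through to unknown-event handling");

	num_topdown_events = 8;		/* a CPU with the 4 extra L2 metrics */
	printf("8 metrics: %s\n", is_metric_config(l2_config) ?
	       "metric event" : "falls through to unknown-event handling");
	return 0;
}

With num_topdown_events = 4, INTEL_TD_METRIC_AVAILABLE_MAX works out to
0x8300 (the BE-bound metric), so config 0x8400 fails the range check; with 8
it is 0x8700 and 0x8400 passes. The unsupported raw event is thus never
classified as a Topdown metric event at creation time (the slow path), and
the hot-path update never sees it.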


Re: [PATCH V2 3/5] perf/x86/intel: Filter unsupported Topdown metrics event

2021-01-27 Thread Peter Zijlstra
On Wed, Jan 27, 2021 at 07:38:43AM -0800, kan.li...@linux.intel.com wrote:
> From: Kan Liang 
> 
> Currently, perf doesn't check the index of a Topdown metrics event
> before updating the event. A perf tool user may get a value from an
> unsupported Topdown metrics event.
> 
> For example, the L2 topdown event, cpu/event=0x00,umask=0x84/, is not
> supported on Ice Lake. A perf tool user may mistakenly use the
> unsupported events via the RAW format. In this case, the scheduler
> follows the unknown-event handling and assigns a GP counter to the
> event. icl_get_topdown_value() then returns the value of the slots to
> the event, so the perf tool user will get a value for the unsupported
> Topdown metrics event.
> 
> Add a check in __icl_update_topdown_event() and avoid updating
> unsupported events.

I was struggling to understand how we end up here, because userspace
can add whatever crap it wants; why is only this thing a problem?

But the actual problem is the next patch changing INTEL_TD_METRIC_NUM,
which then means is_metric_idx() will change, and that means that for
ICL we'll accept these raw events as metric events on creation and then
at runtime we get into trouble.

This isn't spelled out.

I do think this is entirely the wrong fix for that, though. You're now
adding cycles to the relatively hot path, instead of fixing the problem
at event creation, which is the slow path.

Why can't we either refuse the event on ICL or otherwise wreck it on
construction to avoid getting into trouble here?

> Signed-off-by: Kan Liang 
> ---
>  arch/x86/events/intel/core.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 8eba41b..b02d482 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2319,6 +2319,17 @@ static void __icl_update_topdown_event(struct perf_event *event,
>  {
>   u64 delta, last = 0;
>  
> +	/*
> +	 * Although the unsupported topdown events are not exposed to users,
> +	 * users may mistakenly use the unsupported events via the RAW format.
> +	 * For example, using the L2 topdown event, cpu/event=0x00,umask=0x84/,
> +	 * on Ice Lake. In this case, the scheduler follows the unknown-event
> +	 * handling and assigns a GP counter to the event.
> +	 * Check for this case and avoid updating unsupported events.
> +	 */
> +	if (event->hw.idx < INTEL_PMC_IDX_FIXED)
> +		return;
> +
>   delta = icl_get_topdown_value(event, slots, metrics);
>   if (last_slots)
>   last = icl_get_topdown_value(event, last_slots, last_metrics);
> -- 
> 2.7.4
> 
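
For comparison, a creation-time refusal along the lines Peter suggests could
look roughly like the sketch below. This is a hypothetical fragment, not the
code that was eventually merged: reject_unsupported_metric() is an invented
helper name, and the check is shown as it might sit in the hw_config (event
creation) slow path.

/*
 * Hypothetical sketch: refuse configs that use the metric encoding
 * (event 0x00, umask 0x80 and up) beyond what this CPU supports, at
 * event creation instead of in the counter-update hot path.
 */
static bool reject_unsupported_metric(u64 config)
{
	u64 masked = config & INTEL_ARCH_EVENT_MASK;

	return ((config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
		masked >= INTEL_TD_METRIC_RETIRING &&
		masked > INTEL_TD_METRIC_AVAILABLE_MAX;
}

	/* in the hw_config slow path */
	if (x86_pmu.num_topdown_events &&
	    reject_unsupported_metric(event->attr.config))
		return -EINVAL;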


[PATCH V2 3/5] perf/x86/intel: Filter unsupported Topdown metrics event

2021-01-27 Thread kan . liang
From: Kan Liang 

Currently, perf doesn't check the index of a Topdown metrics event
before updating the event. A perf tool user may get a value from an
unsupported Topdown metrics event.

For example, the L2 topdown event, cpu/event=0x00,umask=0x84/, is not
supported on Ice Lake. A perf tool user may mistakenly use the
unsupported events via the RAW format. In this case, the scheduler
follows the unknown-event handling and assigns a GP counter to the
event. icl_get_topdown_value() then returns the value of the slots to
the event, so the perf tool user will get a value for the unsupported
Topdown metrics event.

Add a check in __icl_update_topdown_event() and avoid updating
unsupported events.

Signed-off-by: Kan Liang 
---
 arch/x86/events/intel/core.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 8eba41b..b02d482 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2319,6 +2319,17 @@ static void __icl_update_topdown_event(struct perf_event *event,
 {
u64 delta, last = 0;
 
+	/*
+	 * Although the unsupported topdown events are not exposed to users,
+	 * users may mistakenly use the unsupported events via the RAW format.
+	 * For example, using the L2 topdown event, cpu/event=0x00,umask=0x84/,
+	 * on Ice Lake. In this case, the scheduler follows the unknown-event
+	 * handling and assigns a GP counter to the event.
+	 * Check for this case and avoid updating unsupported events.
+	 */
+	if (event->hw.idx < INTEL_PMC_IDX_FIXED)
+		return;
+
delta = icl_get_topdown_value(event, slots, metrics);
if (last_slots)
last = icl_get_topdown_value(event, last_slots, last_metrics);
-- 
2.7.4
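
For context on the check itself: on these PMUs, GP counters occupy the
indices below INTEL_PMC_IDX_FIXED, the fixed counters start there (SLOTS is
fixed counter 3), and the Topdown metric pseudo-counters sit above the fixed
range. A correctly scheduled Topdown event therefore always has
event->hw.idx >= INTEL_PMC_IDX_FIXED, which is what the new test keys on.
The relevant constants, as defined in arch/x86/include/asm/perf_event.h:

/* Counter index layout (arch/x86/include/asm/perf_event.h) */
#define INTEL_PMC_IDX_FIXED		32	/* GP counters are 0..31 */
#define INTEL_PMC_IDX_FIXED_SLOTS	(INTEL_PMC_IDX_FIXED + 3)	/* 35 */
#define INTEL_PMC_IDX_METRIC_BASE	(INTEL_PMC_IDX_FIXED + 16)	/* 48 */

An event that fell through to the unknown-event path and was assigned a GP
counter has hw.idx < 32, so it is filtered out before any delta is computed.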