Re: [PATCH 3/3] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-14 Thread Thomas Gleixner
On Tue, 13 Sep 2016, Pan, Harry wrote:
> This things is because of the Baytrail/Braswell quirk breaks original
> assumption of perf RAPL polling timer rate calculation regarding of
> counter overflow case based on 200W; 

ESU are the 'Energy Status Units' bits in the MSR_RAPL_POWER_UNIT msr.

ESU = (rdmsr(MSR_RAPL_POWER_UNIT) >> 8) & 0x1f;

So we have 5 bits of information and therefore:

0 <= ESU <= 31
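
As a small illustration of the extraction above, here is a sketch in plain
userspace C. The function name rapl_esu_from_msr() is made up for this
example, and it takes the raw 64-bit MSR value as an argument instead of
reading the MSR itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pull the 5-bit Energy Status Units field out of a
 * raw MSR_RAPL_POWER_UNIT value that the caller has already read.
 */
unsigned int rapl_esu_from_msr(uint64_t msr_val)
{
	unsigned int esu = (unsigned int)((msr_val >> 8) & 0x1f);

	assert(esu <= 31);	/* five bits, so this always holds */
	return esu;
}
```

A raw value of 0xA0E03, for example, has 0x0E in that field, so the helper
returns 14.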

The standard readout is:

joules = counter_value * mult;

mult = 1 / (2 ^ ESU)

The resulting multiplier is:

31   >= ESU  >= 0
4.65661e-10J <= mult <= 1J

The scale function does:

val =  counter << (32 - ESU);

which is converting the readout into units of

  4.65661e-10J / 2 == 2.32830e-10J

because the shift is actually: (1 + (31 - ESU)).
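
To make the two unit statements above concrete, here is a sketch of the
arithmetic only. Both function names are hypothetical and this is not the
kernel's code; it assumes the raw counter value fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: scale a raw counter value to units of 2^-32 J
 * (2.32830e-10J), using the shift described above.
 */
uint64_t rapl_scale_std(uint64_t counter, unsigned int esu)
{
	assert(esu <= 31);
	return counter << (32 - esu);
}

/* Hypothetical helper: convert a scaled value (2^-32 J units) to joules. */
double rapl_scaled_to_joules(uint64_t val)
{
	return (double)val / 4294967296.0;	/* divide by 2^32 */
}
```

With ESU = 31 a single count scales to 2, i.e. 2 * 2.32830e-10J =
4.65661e-10J, which matches the multiplier listed above.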


The math for Baytrail/Braswell is:

microjoules = counter_value * mult

mult = 2 ^ ESU
   
The resulting multiplier is:

0     <= ESU  <= 31
1 uJ  <= mult <= 2.14748e+09 uJ
1e-6J <= mult <= 2147.48J

So now your baytrail/braswell quirk does:

ESU = 32 - ESU

so the scale function becomes:

val = counter << (32 - (32 - ESU))
==> val = counter << ESU

which is converting the readout to units of

 1e-6J
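
The effect of the quirk on the scale function can be sketched like this.
The function name is invented for the example and hw_unit stands in for the
quirked rapl_hw_unit[] entry; it only reproduces the arithmetic above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: apply the (32 - ESU) quirk and then the normal
 * scale shift. The result is in microjoules.
 */
uint64_t rapl_scale_byt(uint64_t counter, unsigned int esu)
{
	unsigned int hw_unit;

	assert(esu <= 31);
	hw_unit = 32 - esu;			/* value the quirk stores */
	return counter << (32 - hw_unit);	/* same as counter << esu */
}
```

For ESU = 5 one count becomes 32, i.e. 32 uJ per increment.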

So now you are concerned about the rapl_timer interval which is calculated
so that the counter does not overflow for a total dissipation of 200W,
which is equivalent to 200J/s. The maximum counter width is 32 bit.

So depending on ESU the code scales the timeout to:

   t[ESU] = (1 << (31 - ESU)) / 200

So for the normal case we get:

   t[0] = 10.737e6 s
   ...
   t[30] = 0.010 s
   t[31] = 0.005 s

The counter capacity for ESU=31 is 

cap = (1 << 32) * 4.65661e-10J = 2J

So:

toverfl = 2J / 200W = 0.01s

which we cut in half to avoid running the timer and the counter in lockstep
which can cause overflows to go undetected. So this looks correct.
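
The standard-case timeout can be written down in integer math as follows.
This is a sketch with a hypothetical function name; it returns milliseconds
instead of seconds so that the small values stay exact.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: polling interval in ms for the standard encoding.
 * Half of the 32-bit counter range is 2^(31 - ESU) J, and at 200 W each
 * joule lasts 1000 / 200 = 5 ms.
 */
uint64_t rapl_timer_ms_std(unsigned int esu)
{
	assert(esu <= 31);
	return (1000 / 200) * (1ULL << (31 - esu));
}
```

This gives 5 ms for ESU = 31, 10 ms for ESU = 30 and 10737418240 ms
(10.737e6 s) for ESU = 0, the same values as in the table above.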
   
 
But for your Baytrail/Braswell that results in:

   t[ESU] = (1 << (31 - (32 - ESU))) / 200

   t[0] = TOTAL CRAP because the shift value becomes -1

But what saves you here is the check for

if (hwunit < 32)

which catches the hwunit = 32 - ESU[0] case and sets the timer to 2ms. So
for the remaining ones we have:

   t[1]  = 0.005s
   ...
   t[31] = 5.3687e+06s 
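
The same calculation with the quirk applied, including the hwunit check
quoted above, can be sketched as follows. Function names are hypothetical
and the 2 ms fallback is taken from the description above, not from a
reading of the driver source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: polling interval in ms from a (possibly quirked)
 * hw unit value, with the 2 ms fallback when the check fails.
 */
uint64_t rapl_timer_ms_from_hw_unit(unsigned int hw_unit)
{
	uint64_t ms = 2;	/* fallback when hw_unit >= 32 */

	if (hw_unit < 32)
		ms = (1000 / 200) * (1ULL << (31 - hw_unit));
	return ms;
}

/* Hypothetical helper: what the timer becomes after hw_unit = 32 - ESU. */
uint64_t rapl_timer_ms_byt_quirk(unsigned int esu)
{
	assert(esu <= 31);
	return rapl_timer_ms_from_hw_unit(32 - esu);
}
```

ESU = 0 falls through to the 2 ms fallback, ESU = 1 gives 5 ms and
ESU = 31 gives 5368709120 ms (5.3687e+06 s).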

So let's look at the counter capacity for ESU=1:

   cap = (1 << 32) * 2 uJ == 8589.93J

The resulting overflow is:

toverfl = 8589.93J / 200W = 42.9497 s

So if we divide this by two then we end up with: 21.4748 s

So your timeout is actually off by a factor of ~4k, which is not surprising due
to the fact that the capacity has a ratio of 1 : 2147.48 and you have an
additional off by one due to the (32 - ESU) quirk.

So the overflow prevention timer fires 4k times for no good reason. Indeed
a very power friendly design.

The timer calculation magically works for the original standard conversion,
but in this case it is utter crap. You really want to have a proper scale
factor for the timer calculation so we end up with:

   toverfl = capacity / 200

i.e. you need a way to calculate capacity from the hw_units[] mess and some
factor which is dependent on the base unit. That all can be done with plain
integer math.
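
One way such a capacity based calculation could look in plain integer math
is sketched below. The struct and all function names are hypothetical and
only illustrate the idea of a base-unit dependent factor; this is not a
proposal for the actual driver interface.

```c
#include <assert.h>
#include <stdint.h>

/* Energy per counter increment as a fraction: num_uj / den microjoules. */
struct rapl_unit {
	uint64_t num_uj;
	uint64_t den;
};

/* Standard encoding: one count is 1 / 2^ESU joules. */
struct rapl_unit rapl_unit_std(unsigned int esu)
{
	struct rapl_unit u;

	assert(esu <= 31);
	u.num_uj = 1000000ULL;
	u.den = 1ULL << esu;
	return u;
}

/* Baytrail/Braswell encoding: one count is 2^ESU microjoules. */
struct rapl_unit rapl_unit_byt(unsigned int esu)
{
	struct rapl_unit u;

	assert(esu <= 31);
	u.num_uj = 1ULL << esu;
	u.den = 1;
	return u;
}

/* Polling interval in ms: half the time a 32-bit counter lasts at 200 W. */
uint64_t rapl_poll_ms(struct rapl_unit u)
{
	/* capacity of the 32-bit counter in microjoules */
	uint64_t cap_uj = ((1ULL << 32) * u.num_uj) / u.den;

	/* 200 W is 200000 uJ per ms; the factor 2 halves the interval */
	return cap_uj / (2 * 200000ULL);
}
```

For the standard encoding this reproduces 5 ms at ESU = 31, and for
Baytrail/Braswell at ESU = 1 it yields 21474 ms, in line with the 21.4748 s
computed above.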

> in short, it leads every 80ms system triggers an event to read counters,

I have no idea where these 80ms come from and I can't make any sense from
the rest of your response either.

Fact is, that you did not do the math and just tinkered the
Baytrail/Braswell support into the existing code and declared it done when
it did not blow up in your face.

Really excellent engineering work - NOT!

Thanks,

tglx



Re: [PATCH 3/3] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-13 Thread Pan, Harry
On Tue, 2016-09-13 at 15:41 +0200, Thomas Gleixner wrote:
> On Sun, 11 Sep 2016, Harry Pan wrote:
> > This patch also enables multiple quirks.
> 
> This patch adds a single quirk for Baytrail. 
> 
> Please stop sending out patches 5 seconds after a review. Take your time
Definitely I take this seriously because I felt awkward as well.

> > +   /*
> > +* Some Atom processors (BYT/BSW) have 2^ESU microjoules
> > +* increment, refer to Software Developers' Manual, Vol. 3C,
> > +* Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
> > +*
> > +* TODO: In order to fit BYT/BSW quirk model, here remind
> > +*   this generates timer rate in 80ms; by default
> > +*   ESU of BYT/BSW is 5, so it leads (1000/200)*2^4.
> 
> This sentence is not a sentence and I can't make any sense of it at
> all.
> 
> What's the TODO here? And why is that TODO not addressed in this patch?
> 
I reviewed my sentence and agree with your comment; yes, a "TODO" tag is
incorrect here since I have no decent suggestion or option to offer.

This is because the Baytrail/Braswell quirk breaks the original
assumption of the perf RAPL polling timer rate calculation, which is based
on counter overflow at 200W; in short, it makes the system trigger an event
to read the counters every 80ms. This is the concern I wanted to comment on
(wrong tag?) because I could not assess any side effect.
Perhaps I should revise it to a "remark" or "caveat", because I do not have
a decent suggestion (to fulfill the "TODO" tag) so far.

That said, it should not affect functionality, since I compared the results
with the powercap driver through its sysfs nodes during my experiment; I am
open to any advice to make this patch better.

Sincerely,
Harry



Re: [PATCH 3/3] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-13 Thread Thomas Gleixner
On Sun, 11 Sep 2016, Harry Pan wrote:
> This patch also enables multiple quirks.

This patch adds a single quirk for Baytrail. 

Please stop sending out patches 5 seconds after a review. Take your time
and fixup stuff proper.

> +static void rapl_byt_quirk(void)
> +{
> + int i;
> +
> + /*
> +  * Some Atom processors (BYT/BSW) have 2^ESU microjoules
> +  * increment, refer to Software Developers' Manual, Vol. 3C,
> +  * Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
> +  *
> +  * TODO: In order to fit BYT/BSW quirk model, here remind
> +  *   this generates timer rate in 80ms; by default
> +  *   ESU of BYT/BSW is 5, so it leads (1000/200)*2^4.

This sentence is not a sentence and I can't make any sense of it at
all.

What's the TODO here? And why is that TODO not addressed in this patch?

Thanks,

tglx


[PATCH 3/3] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-10 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Baytrail and Braswell processors (Model 55 and 76):

The Silvermont/Airmont microarchitecture actually uses fixed
energy status unit (ESU) in smallest unit of microjoule,
this patch adds quirk for these Atom processors (BYT/BSW)
to calculate energy increment in 2^ESU microjoules.

ESU and power domains refer to Intel Software Developers' Manual,
Vol. 3C, Order No. 325384, Table 35-8.

v2: simplify setting rapl_hw_unit[] to reduce runtime overhead.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.

Signed-off-by: Harry Pan 
---
 arch/x86/events/intel/rapl.c | 49 
 1 file changed, 49 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 94abfdb..2af6c18 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<config
  * any other bit is reserved
@@ -458,6 +462,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, "2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, "2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, "2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
    EVENT_PTR(rapl_cores),
    EVENT_PTR(rapl_pkg),
@@ -539,6 +551,18 @@ static struct attribute *rapl_events_knl_attr[] = {
    NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
    .name = "events",
    .attrs = NULL, /* patched at runtime */
@@ -634,6 +658,23 @@ static void rapl_hsx_quirk(void)
    rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 }
 
+static void rapl_byt_quirk(void)
+{
+   int i;
+
+   /*
+    * Some Atom processors (BYT/BSW) have 2^ESU microjoules
+    * increment, refer to Software Developers' Manual, Vol. 3C,
+    * Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
+    *
+    * TODO: In order to fit BYT/BSW quirk model, here remind
+    *   this generates timer rate in 80ms; by default
+    *   ESU of BYT/BSW is 5, so it leads (1000/200)*2^4.
+    */
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+       rapl_hw_unit[i] = 32 - rapl_hw_unit[i];
+}
+
 static int rapl_check_hw_unit(const struct intel_rapl_model_desc *model)
 {
    u64 msr_rapl_power_unit_bits;
@@ -745,6 +786,12 @@ static const struct intel_rapl_model_desc skl_rapl_init __initconst = {
    .attrs  = rapl_events_skl_attr,
 };
 
+static const struct intel_rapl_model_desc byt_rapl_init __initconst = {
+   .quirk  = rapl_byt_quirk,
+   .cntr_mask  = RAPL_IDX_BYT,
+   .attrs  = rapl_events_byt_attr,
+};
+
 static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE,   snb_rapl_init),
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE_X, snbep_rapl_init),
@@ -768,6 +815,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
 
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_SILVERMONT1, byt_rapl_init),
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_AIRMONT, byt_rapl_init),
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
    {},
 };
-- 
2.6.6



[PATCH 3/3] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-10 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Baytrail and Braswell processors (Model 55 and 76):

The Silvermont/Airmont microarchitecture actually uses fixed
energy status unit (ESU) in smallest unit of microjoule,
this patch adds quirk for these Atom processors (BYT/BSW)
to calculate energy increment in 2^ESU microjoules.

ESU and power domains refer to Intel Software Developers' Manual,
Vol. 3C, Order No. 325384, Table 35-8.

v2: simplify setting rapl_hw_unit[] to reduce runtime overhead.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.

Signed-off-by: Harry Pan 
---
 arch/x86/events/intel/rapl.c | 55 
 1 file changed, 55 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 94abfdb..a434087 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<config
  * any other bit is reserved
@@ -136,6 +140,12 @@ static struct perf_pmu_events_attr event_attr_##v = {  \
    .event_str  = str,  \
 };
 
+enum rapl_quirk {
+   RAPL_NO_QUIRK = 0,
+   RAPL_HSX_QUIRK,
+   RAPL_BYT_QUIRK,
+};
+
 struct rapl_pmu {
    raw_spinlock_t  lock;
    int n_active;
@@ -458,6 +468,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, "2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, "2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, "2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
    EVENT_PTR(rapl_cores),
    EVENT_PTR(rapl_pkg),
@@ -539,6 +557,18 @@ static struct attribute *rapl_events_knl_attr[] = {
    NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
    .name = "events",
    .attrs = NULL, /* patched at runtime */
@@ -634,6 +664,23 @@ static void rapl_hsx_quirk(void)
    rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 }
 
+static void rapl_byt_quirk(void)
+{
+   int i;
+
+   /*
+    * Some Atom processors (BYT/BSW) have 2^ESU microjoules
+    * increment, refer to Software Developers' Manual, Vol. 3C,
+    * Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
+    *
+    * TODO: In order to fit BYT/BSW quirk model, here remind
+    *   this generates timer rate in 80ms; by default
+    *   ESU of BYT/BSW is 5, so it leads (1000/200)*2^4.
+    */
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+       rapl_hw_unit[i] = 32 - rapl_hw_unit[i];
+}
+
 static int rapl_check_hw_unit(const struct intel_rapl_model_desc *model)
 {
    u64 msr_rapl_power_unit_bits;
@@ -745,6 +792,12 @@ static const struct intel_rapl_model_desc skl_rapl_init __initconst = {
    .attrs  = rapl_events_skl_attr,
 };
 
+static const struct intel_rapl_model_desc byt_rapl_init __initconst = {
+   .quirk  = rapl_byt_quirk,
+   .cntr_mask  = RAPL_IDX_BYT,
+   .attrs  = rapl_events_byt_attr,
+};
+
 static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE,   snb_rapl_init),
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE_X, snbep_rapl_init),
@@ -768,6 +821,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
 
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_SILVERMONT1, byt_rapl_init),
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_AIRMONT, byt_rapl_init),
    X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
    {},
 };
-- 
2.6.6


