[PATCH v2] perf/x86/intel: Update ICL Core and Package C-state event counters

2019-09-01 Thread Harry Pan
The Ice Lake microarchitecture inherits its C-state residency counters
from Cannon Lake, so it has the CC1/PC8/PC9/PC10 residency counters.

Update the list of Ice Lake PMU event counters from the snb_cstates[] list
of events to the cnl_cstates[] list of events, which keeps all previously
supported events and also adds the CORE_C1, PKG_C8, PKG_C9, and PKG_C10
residency counters.

This lets users profile them through the perf interface.
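
The switch from one event list to the other can be pictured with a small userspace sketch. It is not the kernel code: the `demo_*` types, the table, and the string model names are hypothetical, and only the event names and "perf code" values are taken from the cstate.c comment quoted in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BIT(n) (1u << (n))

/* Bit positions reuse the "perf code" values from the cstate.c comment. */
enum { CORE_C1 = 0x00, CORE_C3 = 0x01, CORE_C6 = 0x02, CORE_C7 = 0x03 };
enum { PKG_C2 = 0x00, PKG_C3 = 0x01, PKG_C6 = 0x02, PKG_C7 = 0x03,
       PKG_C8 = 0x04, PKG_C9 = 0x05, PKG_C10 = 0x06 };

/* Hypothetical, simplified stand-in for a per-model event list. */
struct demo_cstates {
	uint32_t core_events;
	uint32_t pkg_events;
};

/* Events the Ice Lake entries exposed before the patch. */
static const struct demo_cstates demo_snb = {
	.core_events = BIT(CORE_C3) | BIT(CORE_C6) | BIT(CORE_C7),
	.pkg_events  = BIT(PKG_C2) | BIT(PKG_C3) | BIT(PKG_C6) | BIT(PKG_C7),
};

/* Same events plus CORE_C1, PKG_C8, PKG_C9 and PKG_C10. */
static const struct demo_cstates demo_cnl = {
	.core_events = BIT(CORE_C1) | BIT(CORE_C3) | BIT(CORE_C6) | BIT(CORE_C7),
	.pkg_events  = BIT(PKG_C2) | BIT(PKG_C3) | BIT(PKG_C6) | BIT(PKG_C7) |
		       BIT(PKG_C8) | BIT(PKG_C9) | BIT(PKG_C10),
};

/* Hypothetical match entry: model name mapped to its event list. */
struct demo_match {
	const char *model;
	const struct demo_cstates *list;
};

static const struct demo_match demo_table[] = {
	{ "ICELAKE_MOBILE",  &demo_cnl },
	{ "ICELAKE_DESKTOP", &demo_cnl },
};

/* Return the event list for a model name, or NULL if it is not listed. */
static const struct demo_cstates *demo_lookup(const char *model)
{
	assert(model != NULL);
	for (size_t i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++)
		if (strcmp(demo_table[i].model, model) == 0)
			return demo_table[i].list;
	return NULL;
}
```

In this sketch the larger list is a strict superset of the smaller one, which mirrors the claim that all previously supported events are kept.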

Signed-off-by: Harry Pan 

---

 arch/x86/events/intel/cstate.c | 32 +---
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 688592b34564..82fbc4c6e5e6 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -35,56 +35,58 @@
  *The counters include PKG_C*_RESIDENCY.
  *
  * All of these counters are specified in the Intel® 64 and IA-32
- * Architectures Software Developer.s Manual Vol3b.
+ * Architectures Software Developer's Manual Vol4.
  *
  * Model specific counters:
  * MSR_CORE_C1_RES: CORE C1 Residency Counter
  *  perf code: 0x00
- *  Available model: SLM,AMT,GLM,CNL
+ *  Available model: SLM,AMT,GLM,CNL,ICL
  *  Scope: Core (each processor core has a MSR)
  * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
  *perf code: 0x01
  *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM,
-   CNL
+ * CNL,ICL
  *Scope: Core
  * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
- * SKL,KNL,GLM,CNL
+ * SKL,KNL,GLM,CNL,ICL
  *Scope: Core
  * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
  *perf code: 0x03
- *Available model: SNB,IVB,HSW,BDW,SKL,CNL
+ *Available model: SNB,IVB,HSW,BDW,SKL,CNL,ICL
  *Scope: Core
  * MSR_PKG_C2_RESIDENCY:  Package C2 Residency Counter.
  *perf code: 0x00
- *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL
+ *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL,
+ * ICL
  *Scope: Package (physical package)
  * MSR_PKG_C3_RESIDENCY:  Package C3 Residency Counter.
  *perf code: 0x01
  *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL,
- * GLM,CNL
+ * GLM,CNL,ICL
  *Scope: Package (physical package)
  * MSR_PKG_C6_RESIDENCY:  Package C6 Residency Counter.
  *perf code: 0x02
- *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL,GLM,CNL
+ *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
+ * SKL,KNL,GLM,CNL,ICL
  *Scope: Package (physical package)
  * MSR_PKG_C7_RESIDENCY:  Package C7 Residency Counter.
  *perf code: 0x03
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,CNL
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,CNL,
+ * ICL
  *Scope: Package (physical package)
  * MSR_PKG_C8_RESIDENCY:  Package C8 Residency Counter.
  *perf code: 0x04
- *Available model: HSW ULT,KBL,CNL
+ *Available model: HSW ULT,KBL,CNL,ICL
  *Scope: Package (physical package)
  * MSR_PKG_C9_RESIDENCY:  Package C9 Residency Counter.
  *perf code: 0x05
- *Available model: HSW ULT,KBL,CNL
+ *Available model: HSW ULT,KBL,CNL,ICL
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT,KBL,GLM,CNL
+ *Available model: HSW ULT,KBL,GLM,CNL,ICL
  *Scope: Package (physical package)
  *
  */
@@ -625,8 +627,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
 
X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_PLUS, glm_cstates

[PATCH] perf/x86/intel: Update ICL Core and Package C-state event counters

2019-07-26 Thread Harry Pan
The Ice Lake microarchitecture inherits its C-state residency counters
from Cannon Lake, so it has the CC1/PC8/PC9/PC10 residency counters.

Update the list of Ice Lake PMU event counters from the snb_cstates[] list
of events to the cnl_cstates[] list of events, which keeps all previously
supported events and also adds the CORE_C1, PKG_C8, PKG_C9, and PKG_C10
residency counters.

This lets users profile them through the perf interface.

Signed-off-by: Harry Pan 

---

 arch/x86/events/intel/cstate.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 688592b34564..08291233f5c9 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,51 +40,53 @@
  * Model specific counters:
  * MSR_CORE_C1_RES: CORE C1 Residency Counter
  *  perf code: 0x00
- *  Available model: SLM,AMT,GLM,CNL
+ *  Available model: SLM,AMT,GLM,CNL,ICL
  *  Scope: Core (each processor core has a MSR)
  * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
  *perf code: 0x01
  *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM,
-   CNL
+   CNL,ICL
  *Scope: Core
  * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
- * SKL,KNL,GLM,CNL
+ * SKL,KNL,GLM,CNL,ICL
  *Scope: Core
  * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
  *perf code: 0x03
- *Available model: SNB,IVB,HSW,BDW,SKL,CNL
+ *Available model: SNB,IVB,HSW,BDW,SKL,CNL,ICL
  *Scope: Core
  * MSR_PKG_C2_RESIDENCY:  Package C2 Residency Counter.
  *perf code: 0x00
- *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL
+ *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL,
+   ICL
  *Scope: Package (physical package)
  * MSR_PKG_C3_RESIDENCY:  Package C3 Residency Counter.
  *perf code: 0x01
  *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL,
- * GLM,CNL
+ * GLM,CNL,ICL
  *Scope: Package (physical package)
  * MSR_PKG_C6_RESIDENCY:  Package C6 Residency Counter.
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL,GLM,CNL
+ * SKL,KNL,GLM,CNL,ICL
  *Scope: Package (physical package)
  * MSR_PKG_C7_RESIDENCY:  Package C7 Residency Counter.
  *perf code: 0x03
  *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,CNL
+   ICL
  *Scope: Package (physical package)
  * MSR_PKG_C8_RESIDENCY:  Package C8 Residency Counter.
  *perf code: 0x04
- *Available model: HSW ULT,KBL,CNL
+ *Available model: HSW ULT,KBL,CNL,ICL
  *Scope: Package (physical package)
  * MSR_PKG_C9_RESIDENCY:  Package C9 Residency Counter.
  *perf code: 0x05
- *Available model: HSW ULT,KBL,CNL
+ *Available model: HSW ULT,KBL,CNL,ICL
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT,KBL,GLM,CNL
+ *Available model: HSW ULT,KBL,GLM,CNL,ICL
  *Scope: Package (physical package)
  *
  */
@@ -625,8 +627,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
 
X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_PLUS, glm_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_MOBILE, snb_cstates),
-   X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_DESKTOP, snb_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_MOBILE, cnl_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_DESKTOP, cnl_cstates),
{ },
 };
 MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match);
-- 
2.20.1



[PATCH] USB: core: correct a spelling mistake in the comment

2019-06-19 Thread Harry Pan
Fix a spelling typo in the function comment.

Signed-off-by: Harry Pan 

---

 drivers/usb/core/hub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 8d4631c81b9f..1988f8f88f75 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -2719,7 +2719,7 @@ static bool use_new_scheme(struct usb_device *udev, int retry,
 }
 
 /* Is a USB 3.0 port in the Inactive or Compliance Mode state?
- * Port worm reset is required to recover
+ * Port warm reset is required to recover
  */
 static bool hub_port_warm_reset_required(struct usb_hub *hub, int port1,
u16 portstatus)
-- 
2.20.1



[PATCH v3] platform/x86: intel_pmc_core: transform Pkg C-state residency from TSC ticks into microseconds

2019-06-19 Thread Harry Pan
According to the Intel SDM Vol. 4, the package C-state residency counters
of modern IA microarchitectures all tick at the TSC frequency, so simple
arithmetic converts the ticks into microseconds, i.e.,
residency (ms) = count / tsc_khz
residency (us) = count / tsc_khz * 1000

This also aligns the output with the other sysfs debug entries for
residency counters, which already report microseconds, and makes it
easier to read and script.
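
With integer arithmetic the order of operations in the formula matters. The following userspace sketch, with hypothetical function names, contrasts multiplying before dividing against a literal left-to-right reading of the formula; it assumes `tsc_khz` fits in 32 bits and is non-zero.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Multiply first so that residencies shorter than one millisecond are
 * not truncated to zero by the integer division.
 */
static uint64_t pkgc_ticks_to_us(uint64_t ticks, uint32_t tsc_khz)
{
	assert(tsc_khz != 0);
	return ticks * 1000 / tsc_khz;
}

/* Literal left-to-right reading of "count / tsc_khz * 1000". */
static uint64_t pkgc_ticks_to_us_divide_first(uint64_t ticks, uint32_t tsc_khz)
{
	assert(tsc_khz != 0);
	return ticks / tsc_khz * 1000;
}
```

For example, 1,200,000 ticks at a tsc_khz of 2,400,000 (a 2.4 GHz TSC) is 500 us with the first helper, while the second one truncates to 0. The multiply-first form can wrap once the tick count exceeds 2^64 / 1000, which is a trade-off this sketch does not handle.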

v2: restore the accidentally deleted newline, no function change.
v3: apply kernel do_div() macro to calculate division

Signed-off-by: Harry Pan 

---

 drivers/platform/x86/intel_pmc_core.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/intel_pmc_core.c b/drivers/platform/x86/intel_pmc_core.c
index f2c621b55f49..ab798efacc85 100644
--- a/drivers/platform/x86/intel_pmc_core.c
+++ b/drivers/platform/x86/intel_pmc_core.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "intel_pmc_core.h"
 
@@ -738,7 +739,9 @@ static int pmc_core_pkgc_show(struct seq_file *s, void *unused)
if (rdmsrl_safe(map[index].bit_mask, &pcstate_count))
continue;
 
-   seq_printf(s, "%-8s : 0x%llx\n", map[index].name,
+   pcstate_count *= 1000;
+   do_div(pcstate_count, tsc_khz);
+   seq_printf(s, "%-8s : %llu\n", map[index].name,
   pcstate_count);
}
 
-- 
2.20.1



[PATCH v2] platform/x86: intel_pmc_core: transform Pkg C-state residency from TSC ticks into microseconds

2019-05-27 Thread Harry Pan
According to the Intel SDM Vol. 4, the package C-state residency counters
of modern IA microarchitectures all tick at the TSC frequency, so simple
arithmetic converts the ticks into microseconds, i.e.,
residency (ms) = count / tsc_khz
residency (us) = count / tsc_khz * 1000

This also aligns the output with the other sysfs debug entries for
residency counters, which already report microseconds, and makes it
easier to read and script.

v2: restore the accidentally deleted newline, no function change.

Signed-off-by: Harry Pan 

---

 drivers/platform/x86/intel_pmc_core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/intel_pmc_core.c b/drivers/platform/x86/intel_pmc_core.c
index f2c621b55f49..c78918a7e731 100644
--- a/drivers/platform/x86/intel_pmc_core.c
+++ b/drivers/platform/x86/intel_pmc_core.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "intel_pmc_core.h"
 
@@ -738,8 +739,8 @@ static int pmc_core_pkgc_show(struct seq_file *s, void *unused)
if (rdmsrl_safe(map[index].bit_mask, &pcstate_count))
continue;
 
-   seq_printf(s, "%-8s : 0x%llx\n", map[index].name,
-  pcstate_count);
+   seq_printf(s, "%-8s : %llu\n", map[index].name,
+  pcstate_count * 1000 / tsc_khz);
}
 
return 0;
-- 
2.20.1



[PATCH] platform/x86: intel_pmc_core: transform Pkg C-state residency from TSC ticks into microseconds

2019-05-27 Thread Harry Pan
According to the Intel SDM Vol. 4, the package C-state residency counters
of modern IA microarchitectures all tick at the TSC frequency, so simple
arithmetic converts the ticks into microseconds, i.e.,
residency (ms) = count / tsc_khz
residency (us) = count / tsc_khz * 1000

This also aligns the output with the other sysfs debug entries for
residency counters, which already report microseconds, and makes it
easier to read and script.

v2: restore the accidentally deleted newline, no function change.

Signed-off-by: Harry Pan 

---

 drivers/platform/x86/intel_pmc_core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/intel_pmc_core.c b/drivers/platform/x86/intel_pmc_core.c
index f2c621b55f49..c78918a7e731 100644
--- a/drivers/platform/x86/intel_pmc_core.c
+++ b/drivers/platform/x86/intel_pmc_core.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "intel_pmc_core.h"
 
@@ -738,8 +739,8 @@ static int pmc_core_pkgc_show(struct seq_file *s, void *unused)
if (rdmsrl_safe(map[index].bit_mask, &pcstate_count))
continue;
 
-   seq_printf(s, "%-8s : 0x%llx\n", map[index].name,
-  pcstate_count);
+   seq_printf(s, "%-8s : %llu\n", map[index].name,
+  pcstate_count * 1000 / tsc_khz);
}
 
return 0;
-- 
2.20.1



[PATCH] platform/x86: intel_pmc_core: transform Pkg C-state residency from TSC ticks into microseconds

2019-05-27 Thread Harry Pan
According to the Intel SDM Vol. 4, the package C-state residency counters
of modern IA microarchitectures all tick at the TSC frequency, so simple
arithmetic converts the ticks into microseconds, i.e.,
residency (ms) = count / tsc_khz
residency (us) = count / tsc_khz * 1000

This also aligns the output with the other sysfs debug entries for
residency counters, which already report microseconds, and makes it
easier to read and script.

Signed-off-by: Harry Pan 

---

 drivers/platform/x86/intel_pmc_core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/platform/x86/intel_pmc_core.c b/drivers/platform/x86/intel_pmc_core.c
index f2c621b55f49..20e0843ebfb4 100644
--- a/drivers/platform/x86/intel_pmc_core.c
+++ b/drivers/platform/x86/intel_pmc_core.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "intel_pmc_core.h"
 
@@ -726,7 +727,6 @@ static int pmc_core_ltr_show(struct seq_file *s, void *unused)
return 0;
 }
 DEFINE_SHOW_ATTRIBUTE(pmc_core_ltr);
-
 static int pmc_core_pkgc_show(struct seq_file *s, void *unused)
 {
struct pmc_dev *pmcdev = s->private;
@@ -738,8 +738,8 @@ static int pmc_core_pkgc_show(struct seq_file *s, void *unused)
if (rdmsrl_safe(map[index].bit_mask, &pcstate_count))
continue;
 
-   seq_printf(s, "%-8s : 0x%llx\n", map[index].name,
-  pcstate_count);
+   seq_printf(s, "%-8s : %llu\n", map[index].name,
+  pcstate_count * 1000 / tsc_khz);
}
 
return 0;
-- 
2.20.1



[PATCH v3] clocksource: Untrust the watchdog if its interval is too small

2019-05-18 Thread Harry Pan
Perform a sanity check on the watchdog to validate its interval, and
avoid raising a false alarm that incorrectly marks the main clocksource
as unstable when a discrepancy appears.

If there is a discrepancy between the current clocksource and the watchdog,
validate the watchdog deviation first. If the watchdog interval is too small
compared with the expected timer interval, trust the current clocksource,
because wrongly kicking out the main clocksource could mess up the wall clock.

This was identified on some Coffee Lake platforms with PC10 allowed, which
have a problematic HPET timer in the platform integration. When the CPU exits
the PC10 low-power mode, the HPET generates a timestamp delay. The resulting
discrepancy makes the kernel incorrectly distrust the current clocksource
(TSC in this case) and re-select the next clocksource, which is the
problematic HPET. This eventually causes a user-perceptible wall clock delay.
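
The decision described above can be sketched in userspace C as follows. This is an illustration only: the function, the enum, and the two constants are assumptions made for the example (a half-second watchdog interval and a 62.5 ms threshold, both in nanoseconds), not values taken from this patch.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_NSEC_PER_SEC    1000000000LL
#define DEMO_WD_INTERVAL_NS  (DEMO_NSEC_PER_SEC / 2)   /* assumed interval */
#define DEMO_WD_THRESHOLD_NS (DEMO_NSEC_PER_SEC / 16)  /* assumed threshold */

enum demo_verdict {
	DEMO_OK,                    /* both clocks agree within the threshold */
	DEMO_WATCHDOG_SUSPECT,      /* watchdog saw too short an interval */
	DEMO_CLOCKSOURCE_UNSTABLE,  /* watchdog looks sane, clocksource skewed */
};

/*
 * cs_nsec and wd_nsec are the durations measured by the clocksource and
 * by the watchdog over the same watchdog period.
 */
static enum demo_verdict demo_wd_check(long long cs_nsec, long long wd_nsec)
{
	assert(cs_nsec >= 0 && wd_nsec >= 0); /* both are durations */

	if (llabs(cs_nsec - wd_nsec) <= DEMO_WD_THRESHOLD_NS)
		return DEMO_OK;
	/* New check: distrust the watchdog if it ran noticeably short. */
	if (wd_nsec < DEMO_WD_INTERVAL_NS - DEMO_WD_THRESHOLD_NS)
		return DEMO_WATCHDOG_SUSPECT;
	return DEMO_CLOCKSOURCE_UNSTABLE;
}
```

With these assumed constants, a watchdog reading of 300 ms against a clocksource reading of 500 ms blames the watchdog, while 500 ms against 700 ms still marks the clocksource as unstable.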

v2: fix resource leak: the locked watchdog_lock
v3: revise the communication: focus on the timer self validation

Link: https://bugzilla.kernel.org/show_bug.cgi?id=203183
Signed-off-by: Harry Pan 

---

 kernel/time/clocksource.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 3bcc19ceb073..090d937d5ec4 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -96,6 +96,7 @@ static u64 suspend_start;
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
 static void clocksource_watchdog_work(struct work_struct *work);
 static void clocksource_select(void);
+static void clocksource_dequeue_watchdog(struct clocksource *cs);
 
 static LIST_HEAD(watchdog_list);
 static struct clocksource *watchdog;
@@ -236,6 +237,12 @@ static void clocksource_watchdog(struct timer_list *unused)
 
/* Check the deviation from the watchdog clocksource. */
if (abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD) {
+   if (wd_nsec < jiffies_to_nsecs(WATCHDOG_INTERVAL) - WATCHDOG_THRESHOLD) {
+   pr_err("Stop timekeeping watchdog '%s' because expected interval is too small in %lld ns only\n",
+   watchdog->name, wd_nsec);
+   clocksource_dequeue_watchdog(cs);
+   goto out;
+   }
pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
smp_processor_id(), cs->name);
pr_warn("  '%s' wd_now: %llx wd_last: %llx mask: %llx\n",
-- 
2.20.1



[PATCH v2] clocksource: Untrust the clocksource watchdog when its interval is too small

2019-05-18 Thread Harry Pan
This patch performs a sanity check on the deviation of the clocksource
watchdog. It aims to reduce false alarms that incorrectly mark the current
clocksource as unstable when a discrepancy appears.

If there is a discrepancy between the current clocksource and the watchdog,
validate the watchdog deviation first. If the watchdog interval is too small
compared with the expected timer interval, trust the current clocksource.

This was identified on some Coffee Lake platforms with PC10 allowed. When the
CPU enters and exits PC10 (the residency counter increases), the HPET
generates a timestamp delay. The resulting discrepancy makes the kernel
incorrectly distrust the current clocksource (TSC in this case) and re-select
the next clocksource, which is the problematic HPET. This eventually causes a
user-perceptible wall clock delay.

The HPET timestamp delay should be tackled in the firmware domain so that the
timer offload between XTAL and RTC is handled properly on entry to PC10. This
patch is a mitigation that reduces false clocksource-unstable alarms
regardless of which clocksources are paired.

v2: fix resource leak: the locked watchdog_lock

Link: https://bugzilla.kernel.org/show_bug.cgi?id=203183
Signed-off-by: Harry Pan 

---

 kernel/time/clocksource.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 3bcc19ceb073..090d937d5ec4 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -96,6 +96,7 @@ static u64 suspend_start;
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
 static void clocksource_watchdog_work(struct work_struct *work);
 static void clocksource_select(void);
+static void clocksource_dequeue_watchdog(struct clocksource *cs);
 
 static LIST_HEAD(watchdog_list);
 static struct clocksource *watchdog;
@@ -236,6 +237,12 @@ static void clocksource_watchdog(struct timer_list *unused)
 
/* Check the deviation from the watchdog clocksource. */
if (abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD) {
+   if (wd_nsec < jiffies_to_nsecs(WATCHDOG_INTERVAL) - WATCHDOG_THRESHOLD) {
+   pr_err("Stop timekeeping watchdog '%s' because expected interval is too small in %lld ns only\n",
+   watchdog->name, wd_nsec);
+   clocksource_dequeue_watchdog(cs);
+   goto out;
+   }
pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
smp_processor_id(), cs->name);
pr_warn("  '%s' wd_now: %llx wd_last: %llx mask: %llx\n",
-- 
2.20.1



[PATCH] clocksource: Untrust the clocksource watchdog when its interval is too small

2019-05-16 Thread Harry Pan
This patch performs a sanity check on the deviation of the clocksource
watchdog. It aims to reduce false alarms that incorrectly mark the current
clocksource as unstable when a discrepancy appears.

If there is a discrepancy between the current clocksource and the watchdog,
validate the watchdog deviation first. If the watchdog interval is too small
compared with the expected timer interval, trust the current clocksource.

This was identified on some Coffee Lake platforms with PC10 allowed. When the
CPU enters and exits PC10 (the residency counter increases), the HPET
generates a timestamp delay. The resulting discrepancy makes the kernel
incorrectly distrust the current clocksource (TSC in this case) and re-select
the next clocksource, which is the problematic HPET. This eventually causes a
user-perceptible wall clock delay.

The HPET timestamp delay should be tackled in the firmware domain so that the
timer offload between XTAL and RTC is handled properly on entry to PC10. This
patch is a mitigation that reduces false clocksource-unstable alarms
regardless of which clocksources are paired.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=203183
Signed-off-by: Harry Pan 

---

 kernel/time/clocksource.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 3bcc19ceb073..fb0a67827346 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -96,6 +96,7 @@ static u64 suspend_start;
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
 static void clocksource_watchdog_work(struct work_struct *work);
 static void clocksource_select(void);
+static void clocksource_dequeue_watchdog(struct clocksource *cs);
 
 static LIST_HEAD(watchdog_list);
 static struct clocksource *watchdog;
@@ -236,6 +237,12 @@ static void clocksource_watchdog(struct timer_list *unused)
 
/* Check the deviation from the watchdog clocksource. */
if (abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD) {
+   if (wd_nsec < jiffies_to_nsecs(WATCHDOG_INTERVAL) - WATCHDOG_THRESHOLD) {
+   pr_err("Stop timekeeping watchdog '%s' because expected interval is too small in %lld ns only\n",
+   watchdog->name, wd_nsec);
+   clocksource_dequeue_watchdog(cs);
+   return;
+   }
pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
smp_processor_id(), cs->name);
pr_warn("  '%s' wd_now: %llx wd_last: %llx mask: %llx\n",
-- 
2.20.1



[tip:perf/urgent] perf/x86/intel: Update KBL Package C-state events to also include PC8/PC9/PC10 counters

2019-04-25 Thread tip-bot for Harry Pan
Commit-ID:  82c99f7a81f28f8c1be5f701c8377d14c4075b10
Gitweb: https://git.kernel.org/tip/82c99f7a81f28f8c1be5f701c8377d14c4075b10
Author: Harry Pan 
AuthorDate: Wed, 24 Apr 2019 22:50:33 +0800
Committer:  Ingo Molnar 
CommitDate: Thu, 25 Apr 2019 08:59:31 +0200

perf/x86/intel: Update KBL Package C-state events to also include PC8/PC9/PC10 counters

Kaby Lake (and Coffee Lake) has PC8/PC9/PC10 residency counters.

This patch updates the list of Kaby/Coffee Lake PMU event counters
from the snb_cstates[] list of events to the hswult_cstates[]
list of events, which keeps all previously supported events and
also adds the PKG_C8, PKG_C9 and PKG_C10 residency counters.

This allows user space tools to profile them through the perf interface.

Signed-off-by: Harry Pan 
Cc: 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Borislav Petkov 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Cc: gs0...@gmail.com
Link: http://lkml.kernel.org/r/20190424145033.1924-1-harry@intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/events/intel/cstate.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 94a4b7fc75d0..d41de9af7a39 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -76,15 +76,15 @@
  *Scope: Package (physical package)
  * MSR_PKG_C8_RESIDENCY:  Package C8 Residency Counter.
  *perf code: 0x04
- *Available model: HSW ULT,CNL
+ *Available model: HSW ULT,KBL,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C9_RESIDENCY:  Package C9 Residency Counter.
  *perf code: 0x05
- *Available model: HSW ULT,CNL
+ *Available model: HSW ULT,KBL,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT,GLM,CNL
+ *Available model: HSW ULT,KBL,GLM,CNL
  *Scope: Package (physical package)
  *
  */
@@ -566,8 +566,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_DESKTOP, snb_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_X, snb_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_MOBILE,  snb_cstates),
-   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_DESKTOP, snb_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_MOBILE,  hswult_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_DESKTOP, hswult_cstates),
 
X86_CSTATES_MODEL(INTEL_FAM6_CANNONLAKE_MOBILE, cnl_cstates),
 


[PATCH] perf/x86/intel: Update KBL Package C-state events

2019-04-24 Thread Harry Pan
Kaby Lake (and Coffee Lake) has PC8/PC9/PC10 residency counters.

This patch updates the list of PMU event counters so that
user space tools can profile them through the perf interface.

Signed-off-by: Harry Pan 
---

 arch/x86/events/intel/cstate.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index d2e780705c5a..56194c571299 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -76,15 +76,15 @@
  *Scope: Package (physical package)
  * MSR_PKG_C8_RESIDENCY:  Package C8 Residency Counter.
  *perf code: 0x04
- *Available model: HSW ULT,CNL
+ *Available model: HSW ULT,KBL,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C9_RESIDENCY:  Package C9 Residency Counter.
  *perf code: 0x05
- *Available model: HSW ULT,CNL
+ *Available model: HSW ULT,KBL,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT,GLM,CNL
+ *Available model: HSW ULT,KBL,GLM,CNL
  *Scope: Package (physical package)
  *
  */
@@ -572,8 +572,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_DESKTOP, snb_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_X, snb_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_MOBILE,  snb_cstates),
-   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_DESKTOP, snb_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_MOBILE,  hswult_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_DESKTOP, hswult_cstates),
 
X86_CSTATES_MODEL(INTEL_FAM6_CANNONLAKE_MOBILE, cnl_cstates),
 
-- 
2.20.1



[PATCH v7 1/2] PM / sleep: refactor the filesystems sync to reduce duplication

2019-02-25 Thread Harry Pan
This patch creates a common helper to sync filesystems and shares
it among the suspend, hibernate, and snapshot paths.

Signed-off-by: Harry Pan 
---
 include/linux/suspend.h  |  3 +++
 kernel/power/hibernate.c |  5 +
 kernel/power/main.c  |  9 +
 kernel/power/suspend.c   | 13 +
 kernel/power/user.c  |  5 +
 5 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 3f529ad9a9d2..6b3ea9ea6a9e 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -425,6 +425,7 @@ void restore_processor_state(void);
 /* kernel/power/main.c */
 extern int register_pm_notifier(struct notifier_block *nb);
 extern int unregister_pm_notifier(struct notifier_block *nb);
+extern void ksys_sync_helper(void);
 
 #define pm_notifier(fn, pri) { \
static struct notifier_block fn##_nb =  \
@@ -462,6 +463,8 @@ static inline int unregister_pm_notifier(struct notifier_block *nb)
return 0;
 }
 
+static inline void ksys_sync_helper(void) {}
+
 #define pm_notifier(fn, pri)   do { (void)(fn); } while (0)
 
 static inline bool pm_wakeup_pending(void) { return false; }
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index abef759de7c8..cc105ecd9c07 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -14,7 +14,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -709,9 +708,7 @@ int hibernate(void)
goto Exit;
}
 
-   pr_info("Syncing filesystems ... \n");
-   ksys_sync();
-   pr_info("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 35b50823d83b..a8a8e6ec57e6 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "power.h"
 
@@ -51,6 +52,14 @@ void unlock_system_sleep(void)
 }
 EXPORT_SYMBOL_GPL(unlock_system_sleep);
 
+void ksys_sync_helper(void)
+{
+   pr_info("Syncing filesystems ... ");
+   ksys_sync();
+   pr_cont("done.\n");
+}
+EXPORT_SYMBOL_GPL(ksys_sync_helper);
+
 /* Routines for PM-transition notifications */
 
 static BLOCKING_NOTIFIER_HEAD(pm_chain_head);
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..e39059dea38b 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -568,13 +567,11 @@ static int enter_state(suspend_state_t state)
if (state == PM_SUSPEND_TO_IDLE)
s2idle_begin();
 
-#ifndef CONFIG_SUSPEND_SKIP_SYNC
-   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
-   pr_info("Syncing filesystems ... ");
-   ksys_sync();
-   pr_cont("done.\n");
-   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
-#endif
+   if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) {
+   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   ksys_sync_helper();
+   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
+   }
 
pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);
pm_suspend_clear_flags();
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 2d8b60a3c86b..cb24e840a3e6 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -10,7 +10,6 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -228,9 +227,7 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
if (data->frozen)
break;
 
-   printk("Syncing filesystems ... ");
-   ksys_sync();
-   printk("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
-- 
2.18.1



[PATCH v7 2/2] PM / sleep: measure the time of filesystems syncing

2019-02-25 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during system sleep. Although a developer can
guess it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, this form is easier to read and
benefits automation scripts.

v1 to v5: context discussion
v6: split patches logically in code refactor and sync profile
v7: improve 32/64 bit machine compatibility
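
The seconds-and-milliseconds formatting used for the log line can be tried out in userspace. The sketch below is an assumption-laden illustration: it uses `struct timespec` values in place of the kernel's ktime API, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define DEMO_MSEC_PER_SEC 1000L

/* Whole milliseconds elapsed between two timestamps (end must not precede start). */
static long demo_elapsed_msecs(const struct timespec *start,
			       const struct timespec *end)
{
	long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL +
		       (end->tv_nsec - start->tv_nsec);

	return (long)(ns / 1000000LL);
}

/* Render the elapsed time as "seconds.milliseconds", zero-padded to 3 digits. */
static void demo_format_sync_time(long msecs, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "Filesystems sync: %ld.%03ld seconds",
		 msecs / DEMO_MSEC_PER_SEC, msecs % DEMO_MSEC_PER_SEC);
}
```

Splitting the value with `/` and `%` keeps everything in plain `long` arithmetic, so the same format string works whether `long` is 32 or 64 bits wide.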

Signed-off-by: Harry Pan 
---
 kernel/power/main.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/power/main.c b/kernel/power/main.c
index a8a8e6ec57e6..eea3d65eb960 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -54,9 +54,15 @@ EXPORT_SYMBOL_GPL(unlock_system_sleep);
 
 void ksys_sync_helper(void)
 {
-   pr_info("Syncing filesystems ... ");
+   ktime_t start;
+   long elapsed_msecs;
+
+   start = ktime_get();
ksys_sync();
-   pr_cont("done.\n");
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_info("Filesystems sync: %ld.%03ld seconds\n",
+   elapsed_msecs / MSEC_PER_SEC,
+   elapsed_msecs % MSEC_PER_SEC);
 }
 EXPORT_SYMBOL_GPL(ksys_sync_helper);
 
-- 
2.18.1




[PATCH v7 1/2] PM / sleep: refactor the filesystems sync to reduce duplication

2019-02-25 Thread Harry Pan
This patch creates a common helper to sync filesystems and shares
to the suspend, hibernate, and snapshot.

Signed-off-by: Harry Pan 
---
 include/linux/suspend.h  |  3 +++
 kernel/power/hibernate.c |  5 +
 kernel/power/main.c  |  9 +
 kernel/power/suspend.c   | 13 +
 kernel/power/user.c  |  5 +
 5 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 3f529ad9a9d2..6b3ea9ea6a9e 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -425,6 +425,7 @@ void restore_processor_state(void);
 /* kernel/power/main.c */
 extern int register_pm_notifier(struct notifier_block *nb);
 extern int unregister_pm_notifier(struct notifier_block *nb);
+extern void ksys_sync_helper(void);
 
 #define pm_notifier(fn, pri) { \
static struct notifier_block fn##_nb =  \
@@ -462,6 +463,8 @@ static inline int unregister_pm_notifier(struct notifier_block *nb)
return 0;
 }
 
+static inline void ksys_sync_helper(void) {}
+
 #define pm_notifier(fn, pri)   do { (void)(fn); } while (0)
 
 static inline bool pm_wakeup_pending(void) { return false; }
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index abef759de7c8..cc105ecd9c07 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -14,7 +14,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -709,9 +708,7 @@ int hibernate(void)
goto Exit;
}
 
-   pr_info("Syncing filesystems ... \n");
-   ksys_sync();
-   pr_info("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 35b50823d83b..a8a8e6ec57e6 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "power.h"
 
@@ -51,6 +52,14 @@ void unlock_system_sleep(void)
 }
 EXPORT_SYMBOL_GPL(unlock_system_sleep);
 
+void ksys_sync_helper(void)
+{
+   pr_info("Syncing filesystems ... ");
+   ksys_sync();
+   pr_cont("done.\n");
+}
+EXPORT_SYMBOL_GPL(ksys_sync_helper);
+
 /* Routines for PM-transition notifications */
 
 static BLOCKING_NOTIFIER_HEAD(pm_chain_head);
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..e39059dea38b 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -568,13 +567,11 @@ static int enter_state(suspend_state_t state)
if (state == PM_SUSPEND_TO_IDLE)
s2idle_begin();
 
-#ifndef CONFIG_SUSPEND_SKIP_SYNC
-   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
-   pr_info("Syncing filesystems ... ");
-   ksys_sync();
-   pr_cont("done.\n");
-   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
-#endif
+   if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) {
+   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   ksys_sync_helper();
+   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
+   }
 
pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);
pm_suspend_clear_flags();
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 2d8b60a3c86b..cb24e840a3e6 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -10,7 +10,6 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -228,9 +227,7 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
if (data->frozen)
break;
 
-   printk("Syncing filesystems ... ");
-   ksys_sync();
-   printk("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
-- 
2.18.1



[PATCH v6 2/2] PM / sleep: measure the time of filesystems syncing

2019-02-23 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during system sleep. Although a developer can
estimate it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, printing the elapsed time directly
is easier to read and benefits automation scripts.

Signed-off-by: Harry Pan 
---
 kernel/power/main.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/power/main.c b/kernel/power/main.c
index a8a8e6ec57e6..a08dcc743f31 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -54,9 +54,15 @@ EXPORT_SYMBOL_GPL(unlock_system_sleep);
 
 void ksys_sync_helper(void)
 {
-   pr_info("Syncing filesystems ... ");
+   ktime_t start;
+   s64 elapsed_msecs;
+
+   start = ktime_get();
ksys_sync();
-   pr_cont("done.\n");
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_info("Filesystems sync: %lld.%03lld seconds\n",
+   elapsed_msecs / MSEC_PER_SEC,
+   elapsed_msecs % MSEC_PER_SEC);
 }
 EXPORT_SYMBOL_GPL(ksys_sync_helper);
 
-- 
2.18.1



[PATCH v6 1/2] PM / sleep: refactor the filesystems sync to reduce duplication

2019-02-23 Thread Harry Pan
This patch creates a common helper to sync filesystems and shares it
among the suspend, hibernate, and snapshot paths.

Signed-off-by: Harry Pan 
---
 include/linux/suspend.h  |  3 +++
 kernel/power/hibernate.c |  5 +
 kernel/power/main.c  |  9 +
 kernel/power/suspend.c   | 13 +
 kernel/power/user.c  |  5 +
 5 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 3f529ad9a9d2..6b3ea9ea6a9e 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -425,6 +425,7 @@ void restore_processor_state(void);
 /* kernel/power/main.c */
 extern int register_pm_notifier(struct notifier_block *nb);
 extern int unregister_pm_notifier(struct notifier_block *nb);
+extern void ksys_sync_helper(void);
 
 #define pm_notifier(fn, pri) { \
static struct notifier_block fn##_nb =  \
@@ -462,6 +463,8 @@ static inline int unregister_pm_notifier(struct notifier_block *nb)
return 0;
 }
 
+static inline void ksys_sync_helper(void) {}
+
 #define pm_notifier(fn, pri)   do { (void)(fn); } while (0)
 
 static inline bool pm_wakeup_pending(void) { return false; }
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index abef759de7c8..cc105ecd9c07 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -14,7 +14,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -709,9 +708,7 @@ int hibernate(void)
goto Exit;
}
 
-   pr_info("Syncing filesystems ... \n");
-   ksys_sync();
-   pr_info("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 35b50823d83b..a8a8e6ec57e6 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "power.h"
 
@@ -51,6 +52,14 @@ void unlock_system_sleep(void)
 }
 EXPORT_SYMBOL_GPL(unlock_system_sleep);
 
+void ksys_sync_helper(void)
+{
+   pr_info("Syncing filesystems ... ");
+   ksys_sync();
+   pr_cont("done.\n");
+}
+EXPORT_SYMBOL_GPL(ksys_sync_helper);
+
 /* Routines for PM-transition notifications */
 
 static BLOCKING_NOTIFIER_HEAD(pm_chain_head);
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..e39059dea38b 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -568,13 +567,11 @@ static int enter_state(suspend_state_t state)
if (state == PM_SUSPEND_TO_IDLE)
s2idle_begin();
 
-#ifndef CONFIG_SUSPEND_SKIP_SYNC
-   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
-   pr_info("Syncing filesystems ... ");
-   ksys_sync();
-   pr_cont("done.\n");
-   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
-#endif
+   if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) {
+   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   ksys_sync_helper();
+   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
+   }
 
pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);
pm_suspend_clear_flags();
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 2d8b60a3c86b..cb24e840a3e6 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -10,7 +10,6 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -228,9 +227,7 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
if (data->frozen)
break;
 
-   printk("Syncing filesystems ... ");
-   ksys_sync();
-   printk("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
-- 
2.18.1



[PATCH v5] PM / sleep: measure the time of filesystem syncing

2019-02-22 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during system sleep. Although a developer can
estimate it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, printing the elapsed time directly
is easier to read and benefits automated scripts.

v2: simplify the variables, apply the simplest form of ktime API.
v3: reduce conditional compilation, rectify profiling in better syntax
v4: avoid interposition, profile on hibernation, rectify printk format
v5: introduce sync helper shared by suspend, hibernate, and snapshot

Signed-off-by: Harry Pan 
---
 include/linux/suspend.h  |  3 +++
 kernel/power/hibernate.c |  5 +
 kernel/power/main.c  | 15 +++
 kernel/power/suspend.c   | 13 +
 kernel/power/user.c  |  5 +
 5 files changed, 25 insertions(+), 16 deletions(-)

diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 3f529ad9a9d2..6b3ea9ea6a9e 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -425,6 +425,7 @@ void restore_processor_state(void);
 /* kernel/power/main.c */
 extern int register_pm_notifier(struct notifier_block *nb);
 extern int unregister_pm_notifier(struct notifier_block *nb);
+extern void ksys_sync_helper(void);
 
 #define pm_notifier(fn, pri) { \
static struct notifier_block fn##_nb =  \
@@ -462,6 +463,8 @@ static inline int unregister_pm_notifier(struct notifier_block *nb)
return 0;
 }
 
+static inline void ksys_sync_helper(void) {}
+
 #define pm_notifier(fn, pri)   do { (void)(fn); } while (0)
 
 static inline bool pm_wakeup_pending(void) { return false; }
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index abef759de7c8..cc105ecd9c07 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -14,7 +14,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -709,9 +708,7 @@ int hibernate(void)
goto Exit;
}
 
-   pr_info("Syncing filesystems ... \n");
-   ksys_sync();
-   pr_info("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 35b50823d83b..a08dcc743f31 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "power.h"
 
@@ -51,6 +52,20 @@ void unlock_system_sleep(void)
 }
 EXPORT_SYMBOL_GPL(unlock_system_sleep);
 
+void ksys_sync_helper(void)
+{
+   ktime_t start;
+   s64 elapsed_msecs;
+
+   start = ktime_get();
+   ksys_sync();
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_info("Filesystems sync: %lld.%03lld seconds\n",
+   elapsed_msecs / MSEC_PER_SEC,
+   elapsed_msecs % MSEC_PER_SEC);
+}
+EXPORT_SYMBOL_GPL(ksys_sync_helper);
+
 /* Routines for PM-transition notifications */
 
 static BLOCKING_NOTIFIER_HEAD(pm_chain_head);
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..e39059dea38b 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -568,13 +567,11 @@ static int enter_state(suspend_state_t state)
if (state == PM_SUSPEND_TO_IDLE)
s2idle_begin();
 
-#ifndef CONFIG_SUSPEND_SKIP_SYNC
-   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
-   pr_info("Syncing filesystems ... ");
-   ksys_sync();
-   pr_cont("done.\n");
-   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
-#endif
+   if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) {
+   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   ksys_sync_helper();
+   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
+   }
 
pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);
pm_suspend_clear_flags();
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 2d8b60a3c86b..cb24e840a3e6 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -10,7 +10,6 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -228,9 +227,7 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
if (data->frozen)
break;
 
-   printk("Syncing filesystems ... ");
-   ksys_sync();
-   printk("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
-- 
2.18.1



[PATCH v5] PM / sleep: measure the time of filesystem syncing

2019-02-22 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during system sleep. Although a developer can
estimate it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, printing the elapsed time directly
is easier to read and benefits automated scripts.

v2: simplify the variables, apply the simplest form of ktime API.
v3: reduce conditional compilation, rectify profiling in better syntax
v4: avoid interposition, profile on hibernation, rectify printk format
v5: introduce sync helper shared by suspend, hibernate, and snapshot

Signed-off-by: Harry Pan 
---
 include/linux/suspend.h  |  3 +++
 kernel/power/hibernate.c |  4 +---
 kernel/power/main.c  | 15 +++
 kernel/power/suspend.c   | 13 +
 kernel/power/user.c  |  4 +---
 5 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 3f529ad9a9d2..6b3ea9ea6a9e 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -425,6 +425,7 @@ void restore_processor_state(void);
 /* kernel/power/main.c */
 extern int register_pm_notifier(struct notifier_block *nb);
 extern int unregister_pm_notifier(struct notifier_block *nb);
+extern void ksys_sync_helper(void);
 
 #define pm_notifier(fn, pri) { \
static struct notifier_block fn##_nb =  \
@@ -462,6 +463,8 @@ static inline int unregister_pm_notifier(struct notifier_block *nb)
return 0;
 }
 
+static inline void ksys_sync_helper(void) {}
+
 #define pm_notifier(fn, pri)   do { (void)(fn); } while (0)
 
 static inline bool pm_wakeup_pending(void) { return false; }
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index abef759de7c8..895f43a5f10c 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -709,9 +709,7 @@ int hibernate(void)
goto Exit;
}
 
-   pr_info("Syncing filesystems ... \n");
-   ksys_sync();
-   pr_info("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 35b50823d83b..a08dcc743f31 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "power.h"
 
@@ -51,6 +52,20 @@ void unlock_system_sleep(void)
 }
 EXPORT_SYMBOL_GPL(unlock_system_sleep);
 
+void ksys_sync_helper(void)
+{
+   ktime_t start;
+   s64 elapsed_msecs;
+
+   start = ktime_get();
+   ksys_sync();
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_info("Filesystems sync: %lld.%03lld seconds\n",
+   elapsed_msecs / MSEC_PER_SEC,
+   elapsed_msecs % MSEC_PER_SEC);
+}
+EXPORT_SYMBOL_GPL(ksys_sync_helper);
+
 /* Routines for PM-transition notifications */
 
 static BLOCKING_NOTIFIER_HEAD(pm_chain_head);
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..e39059dea38b 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -568,13 +567,11 @@ static int enter_state(suspend_state_t state)
if (state == PM_SUSPEND_TO_IDLE)
s2idle_begin();
 
-#ifndef CONFIG_SUSPEND_SKIP_SYNC
-   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
-   pr_info("Syncing filesystems ... ");
-   ksys_sync();
-   pr_cont("done.\n");
-   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
-#endif
+   if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) {
+   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   ksys_sync_helper();
+   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
+   }
 
pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);
pm_suspend_clear_flags();
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 2d8b60a3c86b..68dbd9eac8e1 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -228,9 +228,7 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
if (data->frozen)
break;
 
-   printk("Syncing filesystems ... ");
-   ksys_sync();
-   printk("done.\n");
+   ksys_sync_helper();
 
error = freeze_processes();
if (error)
-- 
2.18.1



[PATCH v4] PM / suspend: measure the time of filesystem syncing

2019-02-20 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during suspend. Although a developer can
estimate it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, printing the elapsed time directly
is easier to read and benefits automated scripts.

v2: simplify the variables, apply the simplest form of ktime API.
v3: reduce conditional compilation, rectify profiling in better syntax
v4: avoid interposition, profile on hibernation, rectify printk format

Signed-off-by: Harry Pan 
---
 kernel/power/hibernate.c |  9 +++--
 kernel/power/suspend.c   | 20 +---
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index abef759de7c8..387703907827 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -688,6 +688,8 @@ int hibernate(void)
 {
int error, nr_calls = 0;
bool snapshot_test = false;
+   ktime_t start;
+   s64 elapsed_msecs;
 
if (!hibernation_available()) {
pm_pr_dbg("Hibernation not available.\n");
@@ -709,9 +711,12 @@ int hibernate(void)
goto Exit;
}
 
-   pr_info("Syncing filesystems ... \n");
+   start = ktime_get();
ksys_sync();
-   pr_info("done.\n");
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_info("Filesystems sync: %lld.%03lld seconds\n",
+   elapsed_msecs / MSEC_PER_SEC,
+   elapsed_msecs % MSEC_PER_SEC);
 
error = freeze_processes();
if (error)
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..7fb5ba1314d3 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -568,13 +568,19 @@ static int enter_state(suspend_state_t state)
if (state == PM_SUSPEND_TO_IDLE)
s2idle_begin();
 
-#ifndef CONFIG_SUSPEND_SKIP_SYNC
-   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
-   pr_info("Syncing filesystems ... ");
-   ksys_sync();
-   pr_cont("done.\n");
-   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
-#endif
+   if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) {
+   ktime_t start;
+   s64 elapsed_msecs;
+
+   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   start = ktime_get();
+   ksys_sync();
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_info("Filesystems sync: %lld.%03lld seconds\n",
+   elapsed_msecs / MSEC_PER_SEC,
+   elapsed_msecs % MSEC_PER_SEC);
+   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
+   }
 
pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);
pm_suspend_clear_flags();
-- 
2.18.1



[PATCH v3] PM / suspend: measure the time of filesystem syncing

2019-02-14 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during suspend. Although a developer can
estimate it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, printing the elapsed time directly
is easier to read and benefits automated scripts.

v2: simplify the variables, apply the simplest form of ktime API.
v3: reduce conditional compilation, rectify profiling in better syntax
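
The "reduce conditional compilation" change in v3 replaces an #ifndef
block with an ordinary if statement on a compile-time constant. The sketch
below shows the general idea in user space. SKIP_SYNC is a hypothetical
0/1 macro standing in for the Kconfig option, and do_sync() and
enter_state_sketch() are made-up names; the kernel's IS_ENABLED() macro is
more elaborate because it must also handle options that are not defined
at all.

```c
/* Hypothetical stand-in for a Kconfig option: 1 skips the sync. */
#ifndef SKIP_SYNC
#define SKIP_SYNC 0
#endif

int sync_calls;	/* counts calls so the effect can be observed */

/* Hypothetical stand-in for the real filesystem sync. */
void do_sync(void)
{
	sync_calls++;
}

void enter_state_sketch(void)
{
	/*
	 * Unlike an #ifndef block, this branch is always parsed and
	 * type-checked; the compiler can drop it when SKIP_SYNC is 1.
	 */
	if (!SKIP_SYNC)
		do_sync();
}
```

With this form, locals used only by the optional code can be declared
inside the braces, as the patch does for start and elapsed_msecs.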

Signed-off-by: Harry Pan 
---
 kernel/power/suspend.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..4844fc6a796d 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -568,13 +568,20 @@ static int enter_state(suspend_state_t state)
if (state == PM_SUSPEND_TO_IDLE)
s2idle_begin();
 
-#ifndef CONFIG_SUSPEND_SKIP_SYNC
-   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
-   pr_info("Syncing filesystems ... ");
-   ksys_sync();
-   pr_cont("done.\n");
-   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
-#endif
+   if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) {
+   ktime_t start;
+   unsigned int elapsed_msecs;
+
+   trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   pr_info("Syncing filesystems ... ");
+   start = ktime_get();
+   ksys_sync();
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_cont("(elapsed %d.%03d seconds) done.\n",
+   elapsed_msecs / MSEC_PER_SEC,
+   elapsed_msecs % MSEC_PER_SEC);
+   trace_suspend_resume(TPS("sync_filesystems"), 0, false);
+   }
 
pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);
pm_suspend_clear_flags();
-- 
2.18.1



[PATCH] PM / suspend: measure the time of filesystem syncing

2019-02-06 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during suspend. Although a developer can
estimate it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, printing the elapsed time directly
is easier to read and benefits automated scripts.

v2: simplify the variables, apply the simplest form of ktime API.

Signed-off-by: Harry Pan 
---
 kernel/power/suspend.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..87c0073f0c9d 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -550,6 +550,8 @@ static void suspend_finish(void)
 static int enter_state(suspend_state_t state)
 {
int error;
+   ktime_t start;
+   unsigned int elapsed_msecs;
 
trace_suspend_resume(TPS("suspend_enter"), state, true);
if (state == PM_SUSPEND_TO_IDLE) {
@@ -570,9 +572,12 @@ static int enter_state(suspend_state_t state)
 
 #ifndef CONFIG_SUSPEND_SKIP_SYNC
trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   start = ktime_get();
pr_info("Syncing filesystems ... ");
ksys_sync();
-   pr_cont("done.\n");
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_cont("(elapsed %d.%03d seconds) done.\n", elapsed_msecs / 1000,
+   elapsed_msecs % 1000);
trace_suspend_resume(TPS("sync_filesystems"), 0, false);
 #endif
 
-- 
2.18.1



[PATCH v2] PM / suspend: measure the time of filesystem syncing

2019-02-06 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during suspend. Although a developer can
estimate it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, printing the elapsed time directly
is easier to read and benefits automated scripts.

v2: simplify the variables, apply the simplest form of ktime API.

Signed-off-by: Harry Pan 
---
 kernel/power/suspend.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..87c0073f0c9d 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -550,6 +550,8 @@ static void suspend_finish(void)
 static int enter_state(suspend_state_t state)
 {
int error;
+   ktime_t start;
+   unsigned int elapsed_msecs;
 
trace_suspend_resume(TPS("suspend_enter"), state, true);
if (state == PM_SUSPEND_TO_IDLE) {
@@ -570,9 +572,12 @@ static int enter_state(suspend_state_t state)
 
 #ifndef CONFIG_SUSPEND_SKIP_SYNC
trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   start = ktime_get();
pr_info("Syncing filesystems ... ");
ksys_sync();
-   pr_cont("done.\n");
+   elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
+   pr_cont("(elapsed %d.%03d seconds) done.\n", elapsed_msecs / 1000,
+   elapsed_msecs % 1000);
trace_suspend_resume(TPS("sync_filesystems"), 0, false);
 #endif
 
-- 
2.18.1



[PATCH] PM / suspend: measure the time of filesystem syncing

2019-02-02 Thread Harry Pan
This patch gives the reader an intuitive metric of the time the kernel
spends syncing filesystems during suspend. Although a developer can
estimate it from the timestamp of the next log line, or enable the ftrace
power event and calculate it manually, printing the elapsed time directly
is easier to read and benefits automated scripts.

Signed-off-by: Harry Pan 
---
 kernel/power/suspend.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0bd595a0b610..f3b7c64f2242 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -550,6 +550,8 @@ static void suspend_finish(void)
 static int enter_state(suspend_state_t state)
 {
int error;
+   ktime_t start, end, elapsed;
+   unsigned int elapsed_msecs;
 
trace_suspend_resume(TPS("suspend_enter"), state, true);
if (state == PM_SUSPEND_TO_IDLE) {
@@ -570,9 +572,14 @@ static int enter_state(suspend_state_t state)
 
 #ifndef CONFIG_SUSPEND_SKIP_SYNC
trace_suspend_resume(TPS("sync_filesystems"), 0, true);
+   start = ktime_get_boottime();
pr_info("Syncing filesystems ... ");
ksys_sync();
-   pr_cont("done.\n");
+   end = ktime_get_boottime();
+   elapsed = ktime_sub(end, start);
+   elapsed_msecs = ktime_to_ms(elapsed);
+   pr_cont("(elapsed %d.%03d seconds) done.\n", elapsed_msecs / 1000,
+   elapsed_msecs % 1000);
trace_suspend_resume(TPS("sync_filesystems"), 0, false);
 #endif
 
-- 
2.18.1



[PATCH] usb: xhci: pci: Enable Intel USB role mux on GLK platforms

2019-01-31 Thread Harry Pan
Like Apollo Lake, Gemini Lake supports DRD on port 0. This patch enables
DRD support for GLK, based on section 3.8 (USB Controller) of the EDS
rev 2.2, vol #1.
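
The quirk selection can be pictured with the simplified, stand-alone
sketch below. Only the APL and GLK device IDs are taken from this patch;
the macro names, the quirk bit value, and the function itself are
assumptions for illustration, and the Cherryview ID that the real
condition also checks is left out because its value is not shown here.

```c
#include <stdint.h>

#define VENDOR_ID_INTEL	0x8086	/* PCI vendor ID assigned to Intel */
#define DEVICE_ID_APL	0x5aa8	/* Apollo Lake xHCI, from the diff */
#define DEVICE_ID_GLK	0x31a8	/* Gemini Lake xHCI, added by the diff */

#define QUIRK_ROLE_SW	(1u << 0)	/* hypothetical bit position */

/* Return the role-switch quirk bit for controllers that need it. */
unsigned int role_switch_quirk(uint16_t vendor, uint16_t device)
{
	if (vendor == VENDOR_ID_INTEL &&
	    (device == DEVICE_ID_APL || device == DEVICE_ID_GLK))
		return QUIRK_ROLE_SW;

	return 0;
}
```

Adding a platform then amounts to one new ID definition and one extra
comparison, which matches the shape of the diff.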

Signed-off-by: Harry Pan 
---
 drivers/usb/host/xhci-pci.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index a9ec7051f286..49b438a3ccd4 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -41,6 +41,7 @@
 #define PCI_DEVICE_ID_INTEL_BROXTON_B_XHCI 0x1aa8
 #define PCI_DEVICE_ID_INTEL_APL_XHCI   0x5aa8
 #define PCI_DEVICE_ID_INTEL_DNV_XHCI   0x19d0
+#define PCI_DEVICE_ID_INTEL_GLK_XHCI   0x31a8
 #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_2C_XHCI   0x15b5
 #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_XHCI   0x15b6
 #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_XHCI 0x15db
@@ -194,7 +195,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_SSIC_PORT_UNUSED;
if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
(pdev->device == PCI_DEVICE_ID_INTEL_CHERRYVIEW_XHCI ||
-pdev->device == PCI_DEVICE_ID_INTEL_APL_XHCI))
+pdev->device == PCI_DEVICE_ID_INTEL_APL_XHCI ||
+pdev->device == PCI_DEVICE_ID_INTEL_GLK_XHCI))
xhci->quirks |= XHCI_INTEL_USB_ROLE_SW;
if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
(pdev->device == PCI_DEVICE_ID_INTEL_CHERRYVIEW_XHCI ||
-- 
2.18.1



[PATCH] mmc: sdhci-pci: mitigate Intel BXT/APL card detection race in early boot

2018-04-12 Thread Harry Pan
When the APL platform is configured to use a card detection GPIO, a race
is observed in early boot that makes an inserted card temporarily
malfunction: the driver does not detect the card and powers off the mmc
until the user manually removes and reinserts the card, or until a
runtime suspend/resume cycle revives it automatically after a while. The
trace indicates that the GPIO status and the present state of the host
controller are inconsistent from the driver's point of view.

i.e.
kworker/0:3-88[000]  0.812523: mmc_rescan <-process_one_work
...
kworker/0:3-88[000]  0.812527: bxt_get_cd <-mmc_rescan
kworker/0:3-88[000]  0.812546: mmc_power_off <-mmc_rescan
kworker/0:3-88[000]  0.812546: mmc_pwrseq_power_off <-mmc_power_off

This was confirmed by adding a temporary dump of the GPIO status and the
present state register: the GPIO reports a card while the controller
reports none.

Further investigation showed that when the driver successfully configures
and routes the card detection signal to the GPIO domain, the host
controller's card detection decays immediately: the pin level and the
inserted status disappear for a while and then propagate back
automatically. This opens a small, one-time window against the card
detection routine; when detection falls into this window, the driver
behaves as in the trace above. When the platform software is not required
to switch card detection to GPIO, the problem does not occur.

To work around this, after experiments to observe the length of the race
window, enqueue an MMC detection 200 ms after card detection is
successfully routed to GPIO. This leaves enough margin for the GPIO and
the host controller to reach a consistent state.

Signed-off-by: Harry Pan <harry@intel.com>
---
 drivers/mmc/host/sdhci-pci-core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
index 82c4f05f91d8..519a3a5a2be0 100644
--- a/drivers/mmc/host/sdhci-pci-core.c
+++ b/drivers/mmc/host/sdhci-pci-core.c
@@ -1719,6 +1719,9 @@ static struct sdhci_pci_slot *sdhci_pci_probe_slot(
if (slot->cd_idx >= 0) {
ret = mmc_gpiod_request_cd(host->mmc, NULL, slot->cd_idx,
   slot->cd_override_level, 0, NULL);
+   if (ret == 0)
+   mmc_detect_change(host->mmc, msecs_to_jiffies(200));
+
if (ret == -EPROBE_DEFER)
goto remove;
 
-- 
2.13.5
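
The race described in this patch can be pictured with a toy model. This is only a sketch under stated assumptions: `struct cd_model`, its fields and both helper functions are hypothetical, and the window length is a free parameter, since the message gives only the 200 ms delay and not the measured window.

```c
#include <stdbool.h>

/* Hypothetical model: all names and the window length are assumptions. */
struct cd_model {
	bool card_inserted;      /* physical state, as reported by the GPIO */
	unsigned int window_ms;  /* how long the controller state stays stale */
};

/* Controller "present state" t_ms after card detect was routed to GPIO. */
static bool controller_present(const struct cd_model *m, unsigned int t_ms)
{
	if (t_ms < m->window_ms)
		return false;        /* signal has decayed and is not back yet */
	return m->card_inserted;
}

/* A rescan at t_ms is consistent only if the controller agrees with the GPIO. */
static bool rescan_consistent(const struct cd_model *m, unsigned int t_ms)
{
	return controller_present(m, t_ms) == m->card_inserted;
}
```

In this model, an immediate rescan with a card inserted sees the controller disagree with the GPIO, which corresponds to the `mmc_power_off` path in the trace. A rescan deferred past the window, as the 200 ms `mmc_detect_change()` call intends, sees both sources agree.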



[tip:perf/core] perf/x86/intel: Add Cannon Lake support for RAPL profiling

2018-03-31 Thread tip-bot for Harry Pan
Commit-ID:  490d03e83da2a5e9d7db84b1ec30a9c95415787e
Gitweb: https://git.kernel.org/tip/490d03e83da2a5e9d7db84b1ec30a9c95415787e
Author: Harry Pan <harry@intel.com>
AuthorDate: Fri, 9 Mar 2018 20:15:47 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Sat, 31 Mar 2018 11:28:36 +0200

perf/x86/intel: Add Cannon Lake support for RAPL profiling

This patch enables RAPL counters (energy consumption counters)
support for Cannon Lake processors.

( ESU and power domains refer to Intel Software Developers' Manual,
  Vol. 4, Order No. 335592. )

Usage example:

  $ perf list
  $ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Tested-by: Puthikorn Voravootivat <put...@chromium.org>
Signed-off-by: Harry Pan <harry@intel.com>
Reviewed-by: Benson Leung <ble...@chromium.org>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <eran...@google.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vince Weaver <vincent.wea...@maine.edu>
Cc: colin.k...@canonical.com
Cc: gs0...@gmail.com
Cc: kan.li...@linux.intel.com
Link: http://lkml.kernel.org/r/20180309121549.630-2-harry@intel.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/events/intel/rapl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index a2efb490f743..32f3e9423e99 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -774,6 +774,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_KABYLAKE_MOBILE,  skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_KABYLAKE_DESKTOP, skl_rapl_init),
 
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_CANNONLAKE_MOBILE,  skl_rapl_init),
+
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_DENVERTON, hsw_rapl_init),
 


[tip:perf/core] perf/x86/intel: Enable C-state residency events for Cannon Lake

2018-03-31 Thread tip-bot for Harry Pan
Commit-ID:  1159e09476536250c2a0173d4298d15114df7a89
Gitweb: https://git.kernel.org/tip/1159e09476536250c2a0173d4298d15114df7a89
Author: Harry Pan <harry@intel.com>
AuthorDate: Fri, 9 Mar 2018 20:15:48 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Sat, 31 Mar 2018 11:28:36 +0200

perf/x86/intel: Enable C-state residency events for Cannon Lake

Cannon Lake supports the C1/C3/C6/C7 core and PC2/PC3/PC6/PC7/PC8/PC9/PC10
package state residency counters; this patch enables those counters.

( The MSR information is based on Intel Software Developers' Manual,
  Vol. 4, Order No. 335592. )

Tested-by: Puthikorn Voravootivat <put...@chromium.org>
Signed-off-by: Harry Pan <harry@intel.com>
Reviewed-by: Benson Leung <ble...@chromium.org>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: kan.li...@intel.com
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <eran...@google.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vince Weaver <vincent.wea...@maine.edu>
Cc: gs0...@gmail.com
Link: http://lkml.kernel.org/r/20180309121549.630-3-harry@intel.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/events/intel/cstate.c | 44 +-
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 72db0664a53d..9aca448bb8e6 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,50 +40,51 @@
  * Model specific counters:
  * MSR_CORE_C1_RES: CORE C1 Residency Counter
  *  perf code: 0x00
- *  Available model: SLM,AMT,GLM
+ *  Available model: SLM,AMT,GLM,CNL
  *  Scope: Core (each processor core has a MSR)
  * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
  *perf code: 0x01
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM,
+   CNL
  *Scope: Core
  * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter
  *perf code: 0x02
- *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL,GLM
+ *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
+ * SKL,KNL,GLM,CNL
  *Scope: Core
  * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
  *perf code: 0x03
- *Available model: SNB,IVB,HSW,BDW,SKL
+ *Available model: SNB,IVB,HSW,BDW,SKL,CNL
  *Scope: Core
  * MSR_PKG_C2_RESIDENCY:  Package C2 Residency Counter.
  *perf code: 0x00
- *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM
+ *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C3_RESIDENCY:  Package C3 Residency Counter.
  *perf code: 0x01
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL
- * GLM
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL,
+ * GLM,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C6_RESIDENCY:  Package C6 Residency Counter.
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL,GLM
+ * SKL,KNL,GLM,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C7_RESIDENCY:  Package C7 Residency Counter.
  *perf code: 0x03
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C8_RESIDENCY:  Package C8 Residency Counter.
  *perf code: 0x04
- *Available model: HSW ULT only
+ *Available model: HSW ULT,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C9_RESIDENCY:  Package C9 Residency Counter.
  *perf code: 0x05
- *A

[PATCH 1/3] powercap: intel_rapl: Add support for Cannon Lake

2018-03-09 Thread Harry Pan
In terms of RAPL, the Cannon Lake microarchitecture is similar to
Kaby Lake; this patch enables CNL RAPL support.

Signed-off-by: Harry Pan <harry@intel.com>
---
 drivers/powercap/intel_rapl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 35636e1d8a3d..295d8dcba48c 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -1162,6 +1162,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
RAPL_CPU(INTEL_FAM6_SKYLAKE_X,  rapl_defaults_hsw_server),
RAPL_CPU(INTEL_FAM6_KABYLAKE_MOBILE,rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_KABYLAKE_DESKTOP,   rapl_defaults_core),
+   RAPL_CPU(INTEL_FAM6_CANNONLAKE_MOBILE,  rapl_defaults_core),
 
RAPL_CPU(INTEL_FAM6_ATOM_SILVERMONT1,   rapl_defaults_byt),
RAPL_CPU(INTEL_FAM6_ATOM_AIRMONT,   rapl_defaults_cht),
-- 
2.13.5



[PATCH 3/3] perf/x86/intel: Enable C-state residency events for Cannon Lake

2018-03-09 Thread Harry Pan
Cannon Lake supports the C1/C3/C6/C7 core and PC2/PC3/PC6/PC7/PC8/PC9/PC10
package state residency counters; this patch enables those counters.

The MSR information is based on Intel Software Developers' Manual,
Vol. 4, Order No. 335592.

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/cstate.c | 44 +-
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 72db0664a53d..9aca448bb8e6 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,50 +40,51 @@
  * Model specific counters:
  * MSR_CORE_C1_RES: CORE C1 Residency Counter
  *  perf code: 0x00
- *  Available model: SLM,AMT,GLM
+ *  Available model: SLM,AMT,GLM,CNL
  *  Scope: Core (each processor core has a MSR)
  * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
  *perf code: 0x01
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM,
+   CNL
  *Scope: Core
  * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter
  *perf code: 0x02
- *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL,GLM
+ *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
+ * SKL,KNL,GLM,CNL
  *Scope: Core
  * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
  *perf code: 0x03
- *Available model: SNB,IVB,HSW,BDW,SKL
+ *Available model: SNB,IVB,HSW,BDW,SKL,CNL
  *Scope: Core
  * MSR_PKG_C2_RESIDENCY:  Package C2 Residency Counter.
  *perf code: 0x00
- *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM
+ *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C3_RESIDENCY:  Package C3 Residency Counter.
  *perf code: 0x01
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL
- * GLM
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL,
+ * GLM,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C6_RESIDENCY:  Package C6 Residency Counter.
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL,GLM
+ * SKL,KNL,GLM,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C7_RESIDENCY:  Package C7 Residency Counter.
  *perf code: 0x03
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C8_RESIDENCY:  Package C8 Residency Counter.
  *perf code: 0x04
- *Available model: HSW ULT only
+ *Available model: HSW ULT,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C9_RESIDENCY:  Package C9 Residency Counter.
  *perf code: 0x05
- *Available model: HSW ULT only
+ *Available model: HSW ULT,CNL
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT, GLM
+ *Available model: HSW ULT,GLM,CNL
  *Scope: Package (physical package)
  *
  */
@@ -486,6 +487,21 @@ static const struct cstate_model hswult_cstates __initconst = {
  BIT(PERF_CSTATE_PKG_C10_RES),
 };
 
+static const struct cstate_model cnl_cstates __initconst = {
+   .core_events= BIT(PERF_CSTATE_CORE_C1_RES) |
+ BIT(PERF_CSTATE_CORE_C3_RES) |
+ BIT(PERF_CSTATE_CORE_C6_RES) |
+ BIT(PERF_CSTATE_CORE_C7_RES),
+
+   .pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) |
+ BIT(PERF_CSTATE_PKG_
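
The hunk above is cut off in this archive, but the pattern it uses is a per-model bitmask of supported events. A minimal standalone sketch of that pattern follows. The `_sketch` and `_like` names and the enum values are hypothetical and need not match the kernel's definitions; the two masks are derived from the "Available model" comment table earlier in this diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1ULL << (n))

/* Illustrative indices; the kernel's enum values may differ. */
enum core_event { CORE_C1_RES, CORE_C3_RES, CORE_C6_RES, CORE_C7_RES };

/* Simplified stand-in for struct cstate_model (core events only). */
struct cstate_model_sketch {
	uint64_t core_events;
};

/* CNL is listed for C1, C3, C6 and C7 in the comment table. */
static const struct cstate_model_sketch cnl_like = {
	.core_events = BIT(CORE_C1_RES) | BIT(CORE_C3_RES) |
		       BIT(CORE_C6_RES) | BIT(CORE_C7_RES),
};

/* SNB is listed for C3, C6 and C7, but not for C1. */
static const struct cstate_model_sketch snb_like = {
	.core_events = BIT(CORE_C3_RES) | BIT(CORE_C6_RES) |
		       BIT(CORE_C7_RES),
};

/* Test whether a model advertises a given core residency event. */
static bool model_has_core_event(const struct cstate_model_sketch *m,
				 enum core_event ev)
{
	return (m->core_events & BIT(ev)) != 0;
}
```

Adding a counter to a model then amounts to OR-ing one more `BIT()` into its mask, which is why the patch only needs a new table instead of new code paths.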

[PATCH 2/3] perf/x86/intel: Add Cannon Lake support of RAPL profiling

2018-03-09 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Cannon Lake processors.

For the energy status units (ESU) and power domains, refer to the Intel
Software Developers' Manual, Vol. 4, Order No. 335592.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index a2efb490f743..32f3e9423e99 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -774,6 +774,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_KABYLAKE_MOBILE,  skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_KABYLAKE_DESKTOP, skl_rapl_init),
 
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_CANNONLAKE_MOBILE,  skl_rapl_init),
+
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_DENVERTON, hsw_rapl_init),
 
-- 
2.13.5
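
The ESU mentioned in this patch determines how a raw RAPL energy counter maps to joules. A minimal sketch of that conversion follows, assuming the usual SDM layout in which bits 12:8 of the RAPL power unit MSR hold the ESU exponent and one count equals 1/2^ESU joules. Both function names are hypothetical, and whether this layout applies unchanged to a given model should be checked against the manual.

```c
#include <stdint.h>

/*
 * Extract the energy status unit (ESU) exponent, assumed to sit in
 * bits 12:8 of the RAPL power unit MSR value.
 */
static unsigned int rapl_esu_exponent(uint64_t power_unit_msr)
{
	return (unsigned int)((power_unit_msr >> 8) & 0x1f);
}

/* Convert a raw energy counter delta to joules: raw / 2^ESU. */
static double rapl_raw_to_joules(uint64_t raw_delta, unsigned int esu)
{
	return (double)raw_delta / (double)(1ULL << esu);
}
```

For example, with an ESU exponent of 14, a counter delta of 16384 corresponds to exactly one joule. The `perf stat` events in the usage example already report scaled values, so this arithmetic is only needed when reading the MSRs directly.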



[PATCH] ASoC: Intel: bxt: Refine the HW constraint of Ref capture.

2017-12-05 Thread Harry Pan
The patch restricts the HW constraint of the refcap of the WoV stream
to a single channel (mono) at 16 kHz, based on the platform implementation.

This way, userspace programs can rely on correct HW parameters when
using ALSA library calls to manipulate the device.

Signed-off-by: Harry Pan <harry@intel.com>
---
 sound/soc/intel/boards/bxt_da7219_max98357a.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/sound/soc/intel/boards/bxt_da7219_max98357a.c b/sound/soc/intel/boards/bxt_da7219_max98357a.c
index ce35ec7884d1..382f71228807 100644
--- a/sound/soc/intel/boards/bxt_da7219_max98357a.c
+++ b/sound/soc/intel/boards/bxt_da7219_max98357a.c
@@ -338,8 +338,23 @@ static const struct snd_pcm_hw_constraint_list constraints_16000 = {
.list  = rates_16000,
 };
 
+static const unsigned int ch_mono[] = {
+   1,
+};
+
+static const struct snd_pcm_hw_constraint_list constraints_refcap = {
+   .count = ARRAY_SIZE(ch_mono),
+   .list  = ch_mono,
+};
+
 static int broxton_refcap_startup(struct snd_pcm_substream *substream)
 {
+   substream->runtime->hw.channels_max = 1;
+
+   snd_pcm_hw_constraint_list(substream->runtime, 0,
+   SNDRV_PCM_HW_PARAM_CHANNELS,
+   &constraints_refcap);
+
return snd_pcm_hw_constraint_list(substream->runtime, 0,
SNDRV_PCM_HW_PARAM_RATE,
&constraints_16000);
-- 
2.13.5
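
The effect of the two constraint lists in this patch can be sketched in plain C as a membership test. This is not the ALSA implementation: `struct constraint_list_sketch`, `value_allowed()` and `refcap_params_ok()` are hypothetical, and the contents of `rates_16000` are assumed from its name and the commit message, since the array itself is not shown in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for an ALSA constraint list; not the kernel struct. */
struct constraint_list_sketch {
	const unsigned int *list;
	size_t count;
};

static const unsigned int ch_mono[] = { 1 };
static const unsigned int rates_16000[] = { 16000 };  /* assumed contents */

/* Return true if v appears in the constraint list. */
static bool value_allowed(const struct constraint_list_sketch *c,
			  unsigned int v)
{
	for (size_t i = 0; i < c->count; i++)
		if (c->list[i] == v)
			return true;
	return false;
}

/* Accept a refcap configuration only if both parameters are listed. */
static bool refcap_params_ok(unsigned int channels, unsigned int rate)
{
	const struct constraint_list_sketch ch = {
		ch_mono, sizeof(ch_mono) / sizeof(ch_mono[0])
	};
	const struct constraint_list_sketch rt = {
		rates_16000, sizeof(rates_16000) / sizeof(rates_16000[0])
	};

	return value_allowed(&ch, channels) && value_allowed(&rt, rate);
}
```

Under these assumptions only mono at 16000 Hz passes; stereo or 48000 Hz requests are rejected, which is the behaviour userspace should observe when negotiating hardware parameters for the refcap stream.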



[PATCH] perf/x86/intel/cstate: revise package C-state residency of SKL/KBL

2017-09-01 Thread Harry Pan
Although the Intel SDM vol. 4 does not list these MSRs, the EDS of
SKL/KBL does document the PC8/PC9/PC10 states, and these residency
MSRs are visible through rdmsr.

This patch allows developers to calculate the correct share of each
package C-state against total TSC ticks, and it also aligns the
events with Len Brown's turbostat metrics on SKL/KBL platforms.

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/cstate.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 4cf100ff2a37..8f339635d22b 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -75,15 +75,15 @@
  *Scope: Package (physical package)
  * MSR_PKG_C8_RESIDENCY:  Package C8 Residency Counter.
  *perf code: 0x04
- *Available model: HSW ULT only
+ *Available model: HSW ULT,SKL,KBL
  *Scope: Package (physical package)
  * MSR_PKG_C9_RESIDENCY:  Package C9 Residency Counter.
  *perf code: 0x05
- *Available model: HSW ULT only
+ *Available model: HSW ULT,SKL,KBL
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT, GLM
+ *Available model: HSW ULT,GLM,SKL,KBL
  *Scope: Package (physical package)
  *
  */
@@ -550,11 +550,11 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_CSTATES_MODEL(INTEL_FAM6_BROADWELL_GT3E,   snb_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_BROADWELL_X,  snb_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_MOBILE,  snb_cstates),
-   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_DESKTOP, snb_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_MOBILE,  hswult_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_DESKTOP, hswult_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_MOBILE,  snb_cstates),
-   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_DESKTOP, snb_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_MOBILE,  hswult_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_DESKTOP, hswult_cstates),
 
X86_CSTATES_MODEL(INTEL_FAM6_XEON_PHI_KNL, knl_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_XEON_PHI_KNM, knl_cstates),
-- 
2.12.2
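
The share of each package C-state against total TSC ticks, as mentioned in this patch, is a ratio of two counter deltas. A minimal sketch follows, assuming the residency counter advances at the TSC rate; `residency_percent()` is a hypothetical helper and not part of perf or turbostat.

```c
#include <stdint.h>

/*
 * Share of an interval spent in one package C-state, in percent.
 * Assumes the residency counter ticks at the same rate as the TSC.
 * Unsigned subtraction keeps the deltas correct across a single wrap.
 */
static double residency_percent(uint64_t res_start, uint64_t res_end,
				uint64_t tsc_start, uint64_t tsc_end)
{
	uint64_t tsc_delta = tsc_end - tsc_start;

	if (tsc_delta == 0)
		return 0.0;  /* empty interval: avoid dividing by zero */

	return 100.0 * (double)(res_end - res_start) / (double)tsc_delta;
}
```

If a state such as PC8 is missing from the event list, its ticks are simply absent from the sum, so the per-state shares cannot add up to the full interval. That is the accounting gap this patch closes for SKL/KBL.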



[tip:perf/urgent] perf/x86/intel: Enable C-state residency events for Apollo Lake

2017-07-18 Thread tip-bot for Harry Pan
Commit-ID:  5c10b048c37cc08a21fa97a0575eccf4948948ca
Gitweb: http://git.kernel.org/tip/5c10b048c37cc08a21fa97a0575eccf4948948ca
Author: Harry Pan <harry@intel.com>
AuthorDate: Mon, 17 Jul 2017 18:37:49 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Tue, 18 Jul 2017 14:13:40 +0200

perf/x86/intel: Enable C-state residency events for Apollo Lake

The Goldmont microarchitecture supports the C1/C3/C6 and PC2/PC3/PC6/PC10
state residency counters; this patch enables them for the Apollo Lake platform.

The MSR information is based on the Intel Software Developer's Manual,
Vol. 4, Order No. 335592, Tables 2-6 and 2-12.

Signed-off-by: Harry Pan <harry@intel.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <eran...@google.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vince Weaver <vincent.wea...@maine.edu>
Cc: b...@suse.de
Cc: davi...@google.com
Cc: gs0...@gmail.com
Cc: lukasz.odzi...@intel.com
Cc: piotr@intel.com
Cc: srinivas.pandruv...@linux.intel.com
Link: http://lkml.kernel.org/r/20170717103749.24337-1-harry@intel.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/events/intel/cstate.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 238ae32..4cf100f 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,16 +40,16 @@
  * Model specific counters:
  * MSR_CORE_C1_RES: CORE C1 Residency Counter
  *  perf code: 0x00
- *  Available model: SLM,AMT
+ *  Available model: SLM,AMT,GLM
  *  Scope: Core (each processor core has a MSR)
  * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
  *perf code: 0x01
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM
  *Scope: Core
  * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL
+ * SKL,KNL,GLM
  *Scope: Core
  * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
  *perf code: 0x03
@@ -57,16 +57,17 @@
  *Scope: Core
  * MSR_PKG_C2_RESIDENCY:  Package C2 Residency Counter.
  *perf code: 0x00
- *Available model: SNB,IVB,HSW,BDW,SKL,KNL
+ *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM
  *Scope: Package (physical package)
  * MSR_PKG_C3_RESIDENCY:  Package C3 Residency Counter.
  *perf code: 0x01
  *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL
+ * GLM
  *Scope: Package (physical package)
  * MSR_PKG_C6_RESIDENCY:  Package C6 Residency Counter.
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL
+ * SKL,KNL,GLM
  *Scope: Package (physical package)
  * MSR_PKG_C7_RESIDENCY:  Package C7 Residency Counter.
  *perf code: 0x03
@@ -82,7 +83,7 @@
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT only
+ *Available model: HSW ULT, GLM
  *Scope: Package (physical package)
  *
  */
@@ -504,6 +505,17 @@ static const struct cstate_model knl_cstates __initconst = {
 };
 
 
+static const struct cstate_model glm_cstates __initconst = {
+   .core_events= BIT(PERF_CSTATE_CORE_C1_RES) |
+ BIT(PERF_CSTATE_CORE_C3_RES) |
+ BIT(PERF_CSTATE_CORE_C6_RES),
+
+   .pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) |
+ BIT(PERF_CSTATE_PKG_C3_RES) |
+ BIT(PERF_CSTATE_PKG_C6_RES) |
+ BIT(PERF_CSTATE_PKG_C10_RES),
+};
+
 
 #define X86_CSTATES_MODEL(model, states)   \

[tip:perf/urgent] perf/x86/intel: Enable C-state residency events for Apollo Lake

2017-07-18 Thread tip-bot for Harry Pan
Commit-ID:  80ef64ad3c2babe8dcde05ec518795cfd86bb6e1
Gitweb: http://git.kernel.org/tip/80ef64ad3c2babe8dcde05ec518795cfd86bb6e1
Author: Harry Pan <harry@intel.com>
AuthorDate: Mon, 17 Jul 2017 18:37:49 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Tue, 18 Jul 2017 10:50:23 +0200

perf/x86/intel: Enable C-state residency events for Apollo Lake

The Goldmont microarchitecture supports the C1/C3/C6 and PC2/PC3/PC6/PC10
state residency counters; this patch enables them for the Apollo Lake platform.

The MSR information is based on the Intel Software Developer's Manual,
Vol. 4, Order No. 335592, Tables 2-6 and 2-12.

Signed-off-by: Harry Pan <harry@intel.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <eran...@google.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vince Weaver <vincent.wea...@maine.edu>
Cc: b...@suse.de
Cc: davi...@google.com
Cc: gs0...@gmail.com
Cc: lukasz.odzi...@intel.com
Cc: piotr@intel.com
Cc: srinivas.pandruv...@linux.intel.com
Link: http://lkml.kernel.org/r/20170717103749.24337-1-harry@intel.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/events/intel/cstate.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 238ae32..4cf100f 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,16 +40,16 @@
  * Model specific counters:
  * MSR_CORE_C1_RES: CORE C1 Residency Counter
  *  perf code: 0x00
- *  Available model: SLM,AMT
+ *  Available model: SLM,AMT,GLM
  *  Scope: Core (each processor core has a MSR)
  * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
  *perf code: 0x01
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM
  *Scope: Core
  * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL
+ * SKL,KNL,GLM
  *Scope: Core
  * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
  *perf code: 0x03
@@ -57,16 +57,17 @@
  *Scope: Core
  * MSR_PKG_C2_RESIDENCY:  Package C2 Residency Counter.
  *perf code: 0x00
- *Available model: SNB,IVB,HSW,BDW,SKL,KNL
+ *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM
  *Scope: Package (physical package)
  * MSR_PKG_C3_RESIDENCY:  Package C3 Residency Counter.
  *perf code: 0x01
  *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL
+ * GLM
  *Scope: Package (physical package)
  * MSR_PKG_C6_RESIDENCY:  Package C6 Residency Counter.
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL
+ * SKL,KNL,GLM
  *Scope: Package (physical package)
  * MSR_PKG_C7_RESIDENCY:  Package C7 Residency Counter.
  *perf code: 0x03
@@ -82,7 +83,7 @@
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT only
+ *Available model: HSW ULT, GLM
  *Scope: Package (physical package)
  *
  */
@@ -504,6 +505,17 @@ static const struct cstate_model knl_cstates __initconst = {
 };
 
 
+static const struct cstate_model glm_cstates __initconst = {
+   .core_events= BIT(PERF_CSTATE_CORE_C1_RES) |
+ BIT(PERF_CSTATE_CORE_C3_RES) |
+ BIT(PERF_CSTATE_CORE_C6_RES),
+
+   .pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) |
+ BIT(PERF_CSTATE_PKG_C3_RES) |
+ BIT(PERF_CSTATE_PKG_C6_RES) |
+ BIT(PERF_CSTATE_PKG_C10_RES),
+};
+
 
 #define X86_CSTATES_MODEL(model, states)   \

[PATCH] perf/x86/intel/cstate: Enable C-state residency events for Apollo Lake

2017-07-17 Thread Harry Pan
The Goldmont microarchitecture supports the C1/C3/C6 and PC2/PC3/PC6/PC10
state residency counters; this patch enables them for the Apollo Lake platform.

The MSR information is based on the Intel Software Developer's Manual,
Vol. 4, Order No. 335592, Tables 2-6 and 2-12.

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/cstate.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 238ae3248ba5..4cf100ff2a37 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,16 +40,16 @@
  * Model specific counters:
  * MSR_CORE_C1_RES: CORE C1 Residency Counter
  *  perf code: 0x00
- *  Available model: SLM,AMT
+ *  Available model: SLM,AMT,GLM
  *  Scope: Core (each processor core has a MSR)
  * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
  *perf code: 0x01
- *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL
+ *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM
  *Scope: Core
  * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL
+ * SKL,KNL,GLM
  *Scope: Core
  * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
  *perf code: 0x03
@@ -57,16 +57,17 @@
  *Scope: Core
  * MSR_PKG_C2_RESIDENCY:  Package C2 Residency Counter.
  *perf code: 0x00
- *Available model: SNB,IVB,HSW,BDW,SKL,KNL
+ *Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM
  *Scope: Package (physical package)
  * MSR_PKG_C3_RESIDENCY:  Package C3 Residency Counter.
  *perf code: 0x01
  *Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL
+ * GLM
  *Scope: Package (physical package)
  * MSR_PKG_C6_RESIDENCY:  Package C6 Residency Counter.
  *perf code: 0x02
  *Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL
+ * SKL,KNL,GLM
  *Scope: Package (physical package)
  * MSR_PKG_C7_RESIDENCY:  Package C7 Residency Counter.
  *perf code: 0x03
@@ -82,7 +83,7 @@
  *Scope: Package (physical package)
  * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *perf code: 0x06
- *Available model: HSW ULT only
+ *Available model: HSW ULT, GLM
  *Scope: Package (physical package)
  *
  */
@@ -504,6 +505,17 @@ static const struct cstate_model knl_cstates __initconst = {
 };
 
 
+static const struct cstate_model glm_cstates __initconst = {
+   .core_events= BIT(PERF_CSTATE_CORE_C1_RES) |
+ BIT(PERF_CSTATE_CORE_C3_RES) |
+ BIT(PERF_CSTATE_CORE_C6_RES),
+
+   .pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) |
+ BIT(PERF_CSTATE_PKG_C3_RES) |
+ BIT(PERF_CSTATE_PKG_C6_RES) |
+ BIT(PERF_CSTATE_PKG_C10_RES),
+};
+
 
 #define X86_CSTATES_MODEL(model, states)   \
    { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long) &(states) }
@@ -546,6 +558,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
 
X86_CSTATES_MODEL(INTEL_FAM6_XEON_PHI_KNL, knl_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_XEON_PHI_KNM, knl_cstates),
+
+   X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT, glm_cstates),
{ },
 };
 MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match);
-- 
2.12.2
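
The glm_cstates table above is a pair of bitmasks, one bit per residency event. A minimal, self-contained sketch of how such a mask can be built and queried follows. The names cstate_model_sketch, glm_sketch, model_has_core_event() and model_has_pkg_event() are hypothetical, and the enum values are assumed to equal the "perf code" fields listed in the header comment rather than taken from the kernel's PERF_CSTATE_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1ULL << (n))

/* Assumed indices, mirroring the "perf code" values in the header comment. */
enum core_event { CORE_C1_RES = 0, CORE_C3_RES = 1, CORE_C6_RES = 2, CORE_C7_RES = 3 };
enum pkg_event {
    PKG_C2_RES = 0, PKG_C3_RES = 1, PKG_C6_RES = 2, PKG_C7_RES = 3,
    PKG_C8_RES = 4, PKG_C9_RES = 5, PKG_C10_RES = 6
};

/* Hypothetical stand-in for struct cstate_model: one bit per supported event. */
struct cstate_model_sketch {
    uint64_t core_events;
    uint64_t pkg_events;
};

/* Same event selection as glm_cstates in the patch. */
static const struct cstate_model_sketch glm_sketch = {
    .core_events = BIT(CORE_C1_RES) | BIT(CORE_C3_RES) | BIT(CORE_C6_RES),
    .pkg_events  = BIT(PKG_C2_RES) | BIT(PKG_C3_RES) | BIT(PKG_C6_RES) |
                   BIT(PKG_C10_RES),
};

/* True if the model advertises the given core residency event. */
static bool model_has_core_event(const struct cstate_model_sketch *m, enum core_event ev)
{
    return (m->core_events & BIT(ev)) != 0;
}

/* True if the model advertises the given package residency event. */
static bool model_has_pkg_event(const struct cstate_model_sketch *m, enum pkg_event ev)
{
    return (m->pkg_events & BIT(ev)) != 0;
}
```

With this table, model_has_pkg_event(&glm_sketch, PKG_C10_RES) is true while PKG_C8_RES and CORE_C7_RES are reported as absent, which matches the counter list in the commit message.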



[PATCH] tools/power turbostat: enable BXT DRAM RAPL domain

2017-03-12 Thread Harry Pan
The Goldmont microarchitecture supports the DRAM RAPL domain.

This patch adds the relevant flags during probing so that turbostat can
report the DRAM energy profile.

Signed-off-by: Harry Pan <harry@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 828dccd..d726860 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -3343,11 +3343,14 @@ void rapl_probe(unsigned int family, unsigned int model)
break;
case INTEL_FAM6_ATOM_GOLDMONT:  /* BXT */
case INTEL_FAM6_ATOM_GEMINI_LAKE:
-   do_rapl = RAPL_PKG | RAPL_PKG_POWER_INFO;
-   if (rapl_joules)
+   do_rapl = RAPL_PKG | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS | RAPL_PKG_POWER_INFO;
+   if (rapl_joules) {
BIC_PRESENT(BIC_Pkg_J);
-   else
+   BIC_PRESENT(BIC_RAM_J);
+   } else {
BIC_PRESENT(BIC_PkgWatt);
+   BIC_PRESENT(BIC_RAMWatt);
+   }
break;
case INTEL_FAM6_SKYLAKE_MOBILE: /* SKL */
case INTEL_FAM6_SKYLAKE_DESKTOP:/* SKL */
-- 
2.6.6
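
RAPL domains such as the DRAM domain enabled above expose energy as a raw counter. The sketch below shows one way to convert two raw snapshots into joules, assuming the usual encoding in which one count equals 1/2^ESU joule and ESU sits in bits 12:8 of the power unit MSR value. Both function names are hypothetical. Whether the DRAM domain on this platform uses the same unit as the package domain is not stated in the patch, so treat the exponent passed in as an assumption to verify.

```c
#include <stdint.h>

/*
 * Hypothetical helper: extract the energy status unit exponent (ESU),
 * assumed to live in bits 12:8 of the power unit MSR value.
 */
static unsigned int rapl_energy_unit_exp(uint64_t power_unit_msr)
{
    return (unsigned int)((power_unit_msr >> 8) & 0x1F);
}

/*
 * Hypothetical helper: energy in joules between two raw snapshots of a
 * 32-bit energy counter, where one count equals 1/2^exp joule.
 */
static double rapl_delta_joules(uint32_t before, uint32_t after, unsigned int exp)
{
    uint32_t delta = after - before; /* wraps correctly modulo 2^32 */

    return (double)delta / (double)(1ULL << exp);
}
```

With an exponent of 14, a counter advance of 16384 corresponds to 1.0 J. Dividing the result by the sampling interval gives average power in watts.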



[PATCH] drm/i915: minor corner case fix to respect user's backlight setting

2017-01-28 Thread Harry Pan
When enabling the panel backlight, if the current backlight level
equals the panel's minimum, the default policy overrides the current
level with the panel's maximum until the next request to update the
brightness. This confuses users, who briefly see the backlight at
full power.

The problem can be reproduced with commands like these:
 $ xbacklight -set 0
 $ sudo sh -c 'echo mem > /sys/power/state'
 (resume)

To fix this, change the backlight level comparison from
'less-than-or-equal-to' to 'less-than'.

Before: (dmesg | grep backlight # with drm.debug=0xe)
[   82.249265] [drm:intel_backlight_device_update_status [i915]] updating intel_backlight, brightness=0/5273
[   82.249282] [drm:intel_panel_actually_set_backlight [i915]] set backlight PWM = 207
[   82.249306] [drm:intel_edp_backlight_power [i915]] panel power control backlight disable
[   92.066041] [drm:intel_edp_backlight_off [i915]]
[   92.270489] [drm:intel_panel_actually_set_backlight [i915]] set backlight PWM = 0
[   94.080434] [drm:intel_edp_backlight_on.part.25 [i915]]
[   94.080476] [drm:intel_panel_enable_backlight [i915]] pipe A
[   94.080539] [drm:intel_panel_actually_set_backlight [i915]] set backlight PWM = 5273

After:
[   72.874465] [drm:intel_backlight_device_update_status [i915]] updating intel_backlight, brightness=0/5273
[   72.874499] [drm:intel_panel_actually_set_backlight [i915]] set backlight PWM = 207
[   72.874540] [drm:intel_edp_backlight_power [i915]] panel power control backlight disable
[   86.807928] [drm:intel_edp_backlight_off [i915]]
[   87.013227] [drm:intel_panel_actually_set_backlight [i915]] set backlight PWM = 0
[   89.001829] [drm:intel_edp_backlight_on.part.25 [i915]]
[   89.001859] [drm:intel_panel_enable_backlight [i915]] pipe A
[   89.001926] [drm:intel_panel_actually_set_backlight [i915]] set backlight PWM = 207

Fixes: 13f3fbe827d0 ("fix inconsistent brightness after resume")

Signed-off-by: Harry Pan <harry@intel.com>
---
 drivers/gpu/drm/i915/intel_panel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
index 08ab6d7..e882139 100644
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -1104,7 +1104,7 @@ void intel_panel_enable_backlight(struct intel_connector *connector)
 
WARN_ON(panel->backlight.max == 0);
 
-   if (panel->backlight.level <= panel->backlight.min) {
+   if (panel->backlight.level < panel->backlight.min) {
panel->backlight.level = panel->backlight.max;
if (panel->backlight.device)
panel->backlight.device->props.brightness =
-- 
2.6.6
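
The effect of the one-character change can be checked in isolation. The sketch below models only the comparison touched by the patch, using the values from the logs above (minimum 207, maximum 5273). The function names are hypothetical and the rest of intel_panel_enable_backlight() is deliberately left out.

```c
#include <stdint.h>

/* Level applied on enable with the old '<=' comparison (illustrative only). */
static uint32_t level_on_enable_old(uint32_t level, uint32_t min, uint32_t max)
{
    if (level <= min)
        level = max; /* a level equal to the minimum is overridden */
    return level;
}

/* Level applied on enable with the patched '<' comparison (illustrative only). */
static uint32_t level_on_enable_new(uint32_t level, uint32_t min, uint32_t max)
{
    if (level < min)
        level = max; /* only a level below the minimum is overridden */
    return level;
}
```

With level 207, the old comparison returns 5273 and the new one returns 207, matching the last PWM value in the "Before" and "After" logs. A level below the minimum is still replaced by the maximum in both versions.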



[PATCH 2/3] x86/perf/rapl: Make quirk a function pointer

2016-09-10 Thread Harry Pan
From: Thomas Gleixner <t...@linutronix.de>

More model-specific quirks are required, so the single-purpose boolean
quirk flag needs to become an easily extensible mechanism.

Make the quirk a function pointer and move the existing quirk into its own
function.

While at it, make the init struct initializers readable and rename the
misnamed intel_rapl_init_fun struct to intel_rapl_model_desc, because
that is what it is: a CPU model descriptor for the RAPL features specific
to a particular model.

Signed-off-by: Thomas Gleixner <t...@linutronix.de>
Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 92 ++--
 1 file changed, 46 insertions(+), 46 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index f7924640..94abfdb 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -152,6 +152,12 @@ struct rapl_pmus {
struct rapl_pmu *pmus[];
 };
 
+struct intel_rapl_model_desc {
+   void                (*quirk)(void);
+   int                 cntr_mask;
+   struct attribute    **attrs;
+};
+
  /* 1/2^hw_unit Joule */
 static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;
 static struct rapl_pmus *rapl_pmus;
@@ -617,7 +623,18 @@ static int rapl_cpu_prepare(unsigned int cpu)
return 0;
 }
 
-static int rapl_check_hw_unit(bool apply_quirk)
+static void rapl_hsx_quirk(void)
+{
+   /*
+    * DRAM domain on HSW server and KNL has fixed energy unit which can be
+    * different than the unit from power unit MSR. See
+    * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
+    * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
+    */
+   rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
+}
+
+static int rapl_check_hw_unit(const struct intel_rapl_model_desc *model)
 {
    u64 msr_rapl_power_unit_bits;
    int i;
@@ -628,14 +645,9 @@ static int rapl_check_hw_unit(bool apply_quirk)
    for (i = 0; i < NR_RAPL_DOMAINS; i++)
        rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
-   /*
-    * DRAM domain on HSW server and KNL has fixed energy unit which can be
-    * different than the unit from power unit MSR. See
-    * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-    * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-    */
-   if (apply_quirk)
-       rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
+   /* Apply quirk before initializing the timer rate */
+   if (model->quirk)
+       model->quirk();
 
    /*
     * Calculate the timer rate:
@@ -701,46 +713,36 @@ static int __init init_rapl_pmus(void)
 #define X86_RAPL_MODEL_MATCH(model, init)  \
    { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&init }
 
-struct intel_rapl_init_fun {
-   bool apply_quirk;
-   int cntr_mask;
-   struct attribute **attrs;
-};
-
-static const struct intel_rapl_init_fun snb_rapl_init __initconst = {
-   .apply_quirk = false,
-   .cntr_mask = RAPL_IDX_CLN,
-   .attrs = rapl_events_cln_attr,
+static const struct intel_rapl_model_desc snb_rapl_init __initconst = {
+   .cntr_mask  = RAPL_IDX_CLN,
+   .attrs  = rapl_events_cln_attr,
 };
 
-static const struct intel_rapl_init_fun hsx_rapl_init __initconst = {
-   .apply_quirk = true,
-   .cntr_mask = RAPL_IDX_SRV,
-   .attrs = rapl_events_srv_attr,
+static const struct intel_rapl_model_desc hsx_rapl_init __initconst = {
+   .quirk  = rapl_hsx_quirk,
+   .cntr_mask  = RAPL_IDX_SRV,
+   .attrs  = rapl_events_srv_attr,
 };
 
-static const struct intel_rapl_init_fun hsw_rapl_init __initconst = {
-   .apply_quirk = false,
-   .cntr_mask = RAPL_IDX_HSW,
-   .attrs = rapl_events_hsw_attr,
+static const struct intel_rapl_model_desc hsw_rapl_init __initconst = {
+   .cntr_mask  = RAPL_IDX_HSW,
+   .attrs  = rapl_events_hsw_attr,
 };
 
-static const struct intel_rapl_init_fun snbep_rapl_init __initconst = {
-   .apply_quirk = false,
-   .cntr_mask = RAPL_IDX_SRV,
-   .attrs = rapl_events_srv_attr,
+static const struct intel_rapl_model_desc snbep_rapl_init __initconst = {
+   .cntr_mask  = RAPL_IDX_SRV,
+   .attrs  = rapl_events_srv_attr,
 };
 
-static const struct intel_rapl_init_fun knl_rapl_init __initconst = {
-   .apply_quirk = true,
-   .cntr_mask = RAPL_IDX_KNL,
-   .attrs = rapl_events_knl_attr,
+static const struct intel_rapl_model_desc knl_rapl_init __initconst = {
+   .quirk  = rapl_hsx_quirk,
+   .cntr_mask  = RAPL_IDX_KNL,
+   .attrs  = rapl_events_knl_attr,
 };
 
-static const struct intel_rapl_init_fun skl_rapl_init __initconst = {
-   .apply_quirk = false,
-   .cntr_mask = RAPL_IDX_SKL_CLN,
-   .attrs = rapl_events_skl_attr,

[PATCH 3/3] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-10 Thread Harry Pan
This patch enables RAPL (energy consumption) counter support for Intel
Baytrail and Braswell processors (Model 55 and 76).

The Silvermont/Airmont microarchitecture uses a fixed energy status unit
(ESU) whose smallest unit is the microjoule. This patch adds a quirk for
these Atom processors (BYT/BSW) so that the energy increment is calculated
in units of 2^ESU microjoules.
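
To make the unit handling concrete, here is a minimal sketch of the
conversion. It assumes that the driver's common scaling path shifts a raw
count left by (32 - rapl_hw_unit[]); that path is not part of this patch, so
treat it as an assumption. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Direct form: one raw count is worth 2^ESU microjoules on BYT/BSW. */
static uint64_t byt_raw_to_uj(uint64_t raw, unsigned int esu)
{
	assert(esu < 32);	/* ESU is read through a 5-bit mask (0x1F) */
	return raw << esu;
}

/* What the quirk stores per domain: 32 - ESU. */
static int byt_quirk_unit(int esu)
{
	return 32 - esu;
}

/* Assumed common path (not shown in this patch): shift by 32 - hw_unit. */
static uint64_t scale_common(uint64_t raw, int hw_unit)
{
	assert(hw_unit >= 1 && hw_unit <= 32);
	return raw << (32 - hw_unit);
}
```

With the default ESU of 5 mentioned in the quirk's comment, 1000 raw counts
correspond to 32000 microjoules through either route, and the "1.0e-6" scale
string then presents that value as 0.032 J.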

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-8.

v2: simplify setting rapl_hw_unit[] to reduce runtime overhead.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 49 
 1 file changed, 49 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 94abfdb..2af6c18 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<<RAPL_IDX_PKG_NRG_STAT|\
 1<<RAPL_IDX_RAM_NRG_STAT)
 
+/* Baytrail/Braswell clients have PP0, PKG */
+#define RAPL_IDX_BYT   (1<<RAPL_IDX_PP0_NRG_STAT|\
+1<<RAPL_IDX_PKG_NRG_STAT)
+
 /*
  * event code: LSB 8 bits, passed in attr->config
  * any other bit is reserved
@@ -458,6 +462,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, "2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, "2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, "2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
@@ -539,6 +551,18 @@ static struct attribute *rapl_events_knl_attr[] = {
NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
@@ -634,6 +658,23 @@ static void rapl_hsx_quirk(void)
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 }
 
+static void rapl_byt_quirk(void)
+{
+   int i;
+
+   /*
+    * Some Atom processors (BYT/BSW) have 2^ESU microjoules
+    * increment, refer to Software Developers' Manual, Vol. 3C,
+    * Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
+    *
+    * TODO: Note that fitting BYT/BSW into the quirk model yields
+    *       a timer rate of 80 ms: the default ESU of BYT/BSW is 5,
+    *       which leads to (1000/200)*2^4.
+    */
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+       rapl_hw_unit[i] = 32 - rapl_hw_unit[i];
+}
+
 static int rapl_check_hw_unit(const struct intel_rapl_model_desc *model)
 {
u64 msr_rapl_power_unit_bits;
@@ -745,6 +786,12 @@ static const struct intel_rapl_model_desc skl_rapl_init __initconst = {
    .attrs  = rapl_events_skl_attr,
 };
 
+static const struct intel_rapl_model_desc byt_rapl_init __initconst = {
+   .quirk  = rapl_byt_quirk,
+   .cntr_mask  = RAPL_IDX_BYT,
+   .attrs  = rapl_events_byt_attr,
+};
+
 static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE,   snb_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE_X, snbep_rapl_init),
@@ -768,6 +815,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
 
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_SILVERMONT1, byt_rapl_init),
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_AIRMONT, byt_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
-- 
2.6.6



[PATCH 1/3] perf/x86/rapl: Enable Apollo Lake RAPL support

2016-09-10 Thread Harry Pan
This patch enables RAPL (energy consumption) counter support for Intel
Apollo Lake (Goldmont) processors (Model 92).

Unlike the ESU increment of Silvermont/Airmont, Goldmont RAPL follows the
Haswell microarchitecture: the energy unit is 1/2^ESU joules, and the
supported power domains are PP0/PP1/PKG/RAM.
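
As a small worked example of the 1/2^ESU joule interpretation, the sketch
below converts a raw count to microjoules. The helper name is hypothetical,
the 32-bit counter width is an assumption not stated in this patch, and the
ESU value of 14 used in the note afterwards is picked purely for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a raw energy count to microjoules when one count is worth
 * 1/2^ESU joules. The result is truncated toward zero.
 */
static uint64_t hsw_raw_to_uj(uint32_t raw, unsigned int esu)
{
	assert(esu < 32);	/* ESU is a 5-bit field */
	/* Widen first so that raw * 10^6 cannot overflow 64 bits. */
	return ((uint64_t)raw * 1000000ULL) >> esu;
}
```

With ESU = 14, 16384 raw counts come out as exactly 1000000 microjoules
(1 J), and a single count is worth about 61 microjoules.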

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-12.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 2886593..f7924640 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -765,6 +765,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE,  skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
+
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
 
-- 
2.6.6



[PATCH] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-10 Thread Harry Pan
This patch enables RAPL (energy consumption) counter support for Intel
Baytrail and Braswell processors (Model 55 and 76).

The Silvermont/Airmont microarchitecture uses a fixed energy status unit
(ESU) whose smallest unit is the microjoule. This patch adds a quirk for
these Atom processors (BYT/BSW) so that the energy increment is calculated
in units of 2^ESU microjoules.

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-8.

v2: simplify setting rapl_hw_unit[] to reduce runtime overhead.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.
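
The TODO comment in rapl_byt_quirk() further down quotes a timer period of
80 ms as (1000/200)*2^4. The sketch below reproduces that arithmetic under
the assumption that the driver derives its polling period as
(1000 / 200) * 2^(32 - hw_unit - 1) milliseconds, with a 2 ms floor when
hw_unit is 32. That computation is not part of this diff, and timer_ms is a
hypothetical name.

```c
#include <assert.h>

/*
 * Polling period in milliseconds for a given stored hw_unit value
 * (the value left in rapl_hw_unit[] after the quirk has run).
 */
static unsigned long long timer_ms(int hw_unit)
{
	unsigned long long ms = 2;	/* assumed floor when hw_unit is 32 */

	assert(hw_unit >= 0 && hw_unit <= 32);
	if (hw_unit < 32) {
		ms = 1000 / (2 * 100);			/* 5 ms base */
		ms *= 1ULL << (32 - hw_unit - 1);	/* scale by 2^(31 - hw_unit) */
	}
	return ms;
}
```

For BYT/BSW the quirk stores 32 - ESU, so the default ESU of 5 gives a stored
value of 27 and a period of 5 * 2^4 = 80 ms, matching the figure in the
comment.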

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 49 
 1 file changed, 49 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 94abfdb..2af6c18 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<<RAPL_IDX_PKG_NRG_STAT|\
 1<<RAPL_IDX_RAM_NRG_STAT)
 
+/* Baytrail/Braswell clients have PP0, PKG */
+#define RAPL_IDX_BYT   (1<<RAPL_IDX_PP0_NRG_STAT|\
+1<<RAPL_IDX_PKG_NRG_STAT)
+
 /*
  * event code: LSB 8 bits, passed in attr->config
  * any other bit is reserved
@@ -458,6 +462,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, "2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, "2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, "2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
@@ -539,6 +551,18 @@ static struct attribute *rapl_events_knl_attr[] = {
NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
@@ -634,6 +658,23 @@ static void rapl_hsx_quirk(void)
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 }
 
+static void rapl_byt_quirk(void)
+{
+   int i;
+
+   /*
+    * Some Atom processors (BYT/BSW) have 2^ESU microjoules
+    * increment, refer to Software Developers' Manual, Vol. 3C,
+    * Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
+    *
+    * TODO: Note that fitting BYT/BSW into the quirk model yields
+    *       a timer rate of 80 ms: the default ESU of BYT/BSW is 5,
+    *       which leads to (1000/200)*2^4.
+    */
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+       rapl_hw_unit[i] = 32 - rapl_hw_unit[i];
+}
+
 static int rapl_check_hw_unit(const struct intel_rapl_model_desc *model)
 {
u64 msr_rapl_power_unit_bits;
@@ -745,6 +786,12 @@ static const struct intel_rapl_model_desc skl_rapl_init __initconst = {
    .attrs  = rapl_events_skl_attr,
 };
 
+static const struct intel_rapl_model_desc byt_rapl_init __initconst = {
+   .quirk  = rapl_byt_quirk,
+   .cntr_mask  = RAPL_IDX_BYT,
+   .attrs  = rapl_events_byt_attr,
+};
+
 static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE,   snb_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE_X, snbep_rapl_init),
@@ -768,6 +815,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
 
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_SILVERMONT1, byt_rapl_init),
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_AIRMONT, byt_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
-- 
2.6.6



[PATCH 2/3] x86/perf/rapl: Make quirk a function pointer

2016-09-10 Thread Harry Pan
From: Thomas Gleixner 

More model-specific quirks are required, so the single-purpose boolean
quirk flag needs to become an easily extensible mechanism.

Make the quirk a function pointer and move the existing quirk into its own
function.

While at it, make the init struct initializers readable and rename the
misnamed intel_rapl_init_fun struct to intel_rapl_model_desc, because
that is what it is: a CPU model descriptor for the RAPL features specific
to a particular model.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Harry Pan 
---
 arch/x86/events/intel/rapl.c | 92 ++--
 1 file changed, 46 insertions(+), 46 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index f7924640..94abfdb 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -152,6 +152,12 @@ struct rapl_pmus {
struct rapl_pmu *pmus[];
 };
 
+struct intel_rapl_model_desc {
+   void (*quirk)(void);
+   int cntr_mask;
+   struct attribute **attrs;
+};
+
  /* 1/2^hw_unit Joule */
 static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;
 static struct rapl_pmus *rapl_pmus;
@@ -617,7 +623,18 @@ static int rapl_cpu_prepare(unsigned int cpu)
return 0;
 }
 
-static int rapl_check_hw_unit(bool apply_quirk)
+static void rapl_hsx_quirk(void)
+{
+   /*
+* DRAM domain on HSW server and KNL has fixed energy unit which can be
+* different than the unit from power unit MSR. See
+* "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
+* of 2. Datasheet, September 2014, Reference Number: 330784-001 "
+*/
+   rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
+}
+
+static int rapl_check_hw_unit(const struct intel_rapl_model_desc *model)
 {
u64 msr_rapl_power_unit_bits;
int i;
@@ -628,14 +645,9 @@ static int rapl_check_hw_unit(bool apply_quirk)
for (i = 0; i < NR_RAPL_DOMAINS; i++)
rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
-   /*
-* DRAM domain on HSW server and KNL has fixed energy unit which can be
-* different than the unit from power unit MSR. See
-* "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-* of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-*/
-   if (apply_quirk)
-   rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
+   /* Apply quirk before initializing the timer rate */
+   if (model->quirk)
+   model->quirk();
 
/*
 * Calculate the timer rate:
@@ -701,46 +713,36 @@ static int __init init_rapl_pmus(void)
 #define X86_RAPL_MODEL_MATCH(model, init)  \
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&init }
 
-struct intel_rapl_init_fun {
-   bool apply_quirk;
-   int cntr_mask;
-   struct attribute **attrs;
-};
-
-static const struct intel_rapl_init_fun snb_rapl_init __initconst = {
-   .apply_quirk = false,
-   .cntr_mask = RAPL_IDX_CLN,
-   .attrs = rapl_events_cln_attr,
+static const struct intel_rapl_model_desc snb_rapl_init __initconst = {
+   .cntr_mask  = RAPL_IDX_CLN,
+   .attrs  = rapl_events_cln_attr,
 };
 
-static const struct intel_rapl_init_fun hsx_rapl_init __initconst = {
-   .apply_quirk = true,
-   .cntr_mask = RAPL_IDX_SRV,
-   .attrs = rapl_events_srv_attr,
+static const struct intel_rapl_model_desc hsx_rapl_init __initconst = {
+   .quirk  = rapl_hsx_quirk,
+   .cntr_mask  = RAPL_IDX_SRV,
+   .attrs  = rapl_events_srv_attr,
 };
 
-static const struct intel_rapl_init_fun hsw_rapl_init __initconst = {
-   .apply_quirk = false,
-   .cntr_mask = RAPL_IDX_HSW,
-   .attrs = rapl_events_hsw_attr,
+static const struct intel_rapl_model_desc hsw_rapl_init __initconst = {
+   .cntr_mask  = RAPL_IDX_HSW,
+   .attrs  = rapl_events_hsw_attr,
 };
 
-static const struct intel_rapl_init_fun snbep_rapl_init __initconst = {
-   .apply_quirk = false,
-   .cntr_mask = RAPL_IDX_SRV,
-   .attrs = rapl_events_srv_attr,
+static const struct intel_rapl_model_desc snbep_rapl_init __initconst = {
+   .cntr_mask  = RAPL_IDX_SRV,
+   .attrs  = rapl_events_srv_attr,
 };
 
-static const struct intel_rapl_init_fun knl_rapl_init __initconst = {
-   .apply_quirk = true,
-   .cntr_mask = RAPL_IDX_KNL,
-   .attrs = rapl_events_knl_attr,
+static const struct intel_rapl_model_desc knl_rapl_init __initconst = {
+   .quirk  = rapl_hsx_quirk,
+   .cntr_mask  = RAPL_IDX_KNL,
+   .attrs  = rapl_events_knl_attr,
 };
 
-static const struct intel_rapl_init_fun skl_rapl_init __initconst = {
-   .apply_quirk = false,
-   .cntr_mask = RAPL_IDX_SKL_CLN,
-   .attrs = rapl_events_skl_attr,
+st

[PATCH 1/3] perf/x86/rapl: Enable Apollo Lake RAPL support

2016-09-10 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Apollo Lake (Goldmont) processors (Model 92):

Unlike Silvermont/Airmont, Goldmont reports energy in increments of
1/2^ESU joules, where ESU is the energy status unit, like the Haswell
microarchitecture, and it supports the PP0/PP1/PKG/RAM power domains.

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-12.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 2886593..f7924640 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -765,6 +765,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst 
= {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE,  skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
+
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
 
-- 
2.6.6



[PATCH 3/3] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-10 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Baytrail and Braswell processors (Model 55 and 76):

The Silvermont/Airmont microarchitecture uses a fixed energy status
unit (ESU) whose smallest unit is one microjoule. This patch adds a
quirk for these Atom processors (BYT/BSW) so that the energy increment
is calculated as 2^ESU microjoules.

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-8.

v2: simplify setting rapl_hw_unit[] to reduce runtime overhead.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 55 
 1 file changed, 55 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 94abfdb..a434087 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const 
rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<<RAPL_IDX_PKG_NRG_STAT|\
 1<<RAPL_IDX_RAM_NRG_STAT)
 
+/* Baytrail/Braswell clients have PP0, PKG */
+#define RAPL_IDX_BYT   (1<<RAPL_IDX_PP0_NRG_STAT|\
+1<<RAPL_IDX_PKG_NRG_STAT)
+
 /*
  * event code: LSB 8 bits, passed in attr->config
  * any other bit is reserved
@@ -136,6 +140,12 @@ static struct perf_pmu_events_attr event_attr_##v = {  
\
.event_str  = str,  
\
 };
 
+enum rapl_quirk {
+   RAPL_NO_QUIRK = 0,
+   RAPL_HSX_QUIRK,
+   RAPL_BYT_QUIRK,
+};
+
 struct rapl_pmu {
raw_spinlock_t  lock;
int n_active;
@@ -458,6 +468,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, 
"2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, 
"2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, 
"2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
@@ -539,6 +557,18 @@ static struct attribute *rapl_events_knl_attr[] = {
NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
@@ -634,6 +664,23 @@ static void rapl_hsx_quirk(void)
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 }
 
+static void rapl_byt_quirk(void)
+{
+   int i;
+
+   /*
+* Some Atom processors (BYT/BSW) have 2^ESU microjoules
+* increment, refer to Software Developers' Manual, Vol. 3C,
+* Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
+*
+* TODO: Note that with this quirk the timer rate works out to
+*   80 ms: the default ESU of BYT/BSW is 5, which gives
+*   (1000/200)*2^4.
+*/
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+   rapl_hw_unit[i] = 32 - rapl_hw_unit[i];
+}
+
 static int rapl_check_hw_unit(const struct intel_rapl_model_desc *model)
 {
u64 msr_rapl_power_unit_bits;
@@ -745,6 +792,12 @@ static const struct intel_rapl_model_desc skl_rapl_init 
__initconst = {
.attrs  = rapl_events_skl_attr,
 };
 
+static const struct intel_rapl_model_desc byt_rapl_init __initconst = {
+   .quirk  = rapl_byt_quirk,
+   .cntr_mask  = RAPL_IDX_BYT,
+   .attrs  = rapl_events_byt_attr,
+};
+
 static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE,   snb_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE_X, snbep_rapl_init),
@@ -768,6 +821,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst 
= {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
 
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_SILVERMONT1, byt_rapl_init),
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_AIRMONT, byt_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
-- 
2.6.6



[tip:perf/core] perf/x86/rapl: Enable Apollo Lake RAPL support

2016-09-10 Thread tip-bot for Harry Pan
Commit-ID:  2668c6195685f4b6f281767d10b4f4f2e32c2305
Gitweb: http://git.kernel.org/tip/2668c6195685f4b6f281767d10b4f4f2e32c2305
Author: Harry Pan <harry@intel.com>
AuthorDate: Thu, 8 Sep 2016 17:08:57 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Sat, 10 Sep 2016 11:18:52 +0200

perf/x86/rapl: Enable Apollo Lake RAPL support

This patch enables RAPL counters (energy consumption counters)
support for Intel Apollo Lake (Goldmont) processors (Model 92):

Unlike Silvermont/Airmont, Goldmont reports energy in increments of
1/2^ESU joules, where ESU is the energy status unit, like the Haswell
microarchitecture, and it supports the PP0/PP1/PKG/RAM power domains.

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-12.

Usage example:

  $ perf list
  $ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <eran...@google.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vince Weaver <vincent.wea...@maine.edu>
Cc: b...@alien8.de
Cc: gs0...@gmail.com
Cc: h...@zytor.com
Cc: srinivas.pandruv...@linux.intel.com
Link: 
http://lkml.kernel.org/r/1473325738-730-1-git-send-email-harry@intel.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/events/intel/rapl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 62bebcc..b0f0e83 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -767,6 +767,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst 
= {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE,  skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
+
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
 


[PATCH 1/2] perf/x86/rapl: Enable Apollo Lake RAPL support

2016-09-09 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Apollo Lake (Goldmont) processors (Model 92):

Unlike Silvermont/Airmont, Goldmont reports energy in increments of
1/2^ESU joules, where ESU is the energy status unit, like the Haswell
microarchitecture, and it supports the PP0/PP1/PKG/RAM power domains.

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-12.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 2886593..f7924640 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -765,6 +765,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst 
= {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE,  skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
+
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
 
-- 
2.6.6



[PATCH 2/2] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-09 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Baytrail and Braswell processors (Model 55 and 76):

The Silvermont/Airmont microarchitecture uses a fixed energy status
unit (ESU) whose smallest unit is one microjoule. This patch adds a
quirk for these Atom processors (BYT/BSW) so that the energy increment
is calculated as 2^ESU microjoules.

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-8.

v2: simplify setting rapl_hw_unit[] to reduce runtime overhead.
v3: refine multiple quirks in rapl_check_hw_unit().

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 83 +++-
 1 file changed, 67 insertions(+), 16 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index f7924640..3786574 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const 
rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<<RAPL_IDX_PKG_NRG_STAT|\
 1<<RAPL_IDX_RAM_NRG_STAT)
 
+/* Baytrail/Braswell clients have PP0, PKG */
+#define RAPL_IDX_BYT   (1<<RAPL_IDX_PP0_NRG_STAT|\
+1<<RAPL_IDX_PKG_NRG_STAT)
+
 /*
  * event code: LSB 8 bits, passed in attr->config
  * any other bit is reserved
@@ -136,6 +140,12 @@ static struct perf_pmu_events_attr event_attr_##v = {  
\
.event_str  = str,  
\
 };
 
+enum rapl_quirk {
+   RAPL_NO_QUIRK = 0,
+   RAPL_HSX_QUIRK,
+   RAPL_BYT_QUIRK,
+};
+
 struct rapl_pmu {
raw_spinlock_t  lock;
int n_active;
@@ -452,6 +462,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, 
"2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, 
"2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, 
"2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
@@ -533,6 +551,18 @@ static struct attribute *rapl_events_knl_attr[] = {
NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
@@ -617,7 +647,7 @@ static int rapl_cpu_prepare(unsigned int cpu)
return 0;
 }
 
-static int rapl_check_hw_unit(bool apply_quirk)
+static int rapl_check_hw_unit(enum rapl_quirk apply_quirk)
 {
u64 msr_rapl_power_unit_bits;
int i;
@@ -628,14 +658,27 @@ static int rapl_check_hw_unit(bool apply_quirk)
for (i = 0; i < NR_RAPL_DOMAINS; i++)
rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
-   /*
-* DRAM domain on HSW server and KNL has fixed energy unit which can be
-* different than the unit from power unit MSR. See
-* "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-* of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-*/
-   if (apply_quirk)
+   switch (apply_quirk) {
+   case RAPL_HSX_QUIRK:
+   /*
+* DRAM domain on HSW server and KNL has fixed energy unit
+* which can be different than the unit from power unit MSR.
+* See "Intel Xeon Processor E5-1600 and E5-2600 v3 Product
+* Families, V2 of 2. Datasheet, September 2014,
+* Reference Number: 330784-001"
+*/
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
+   break;
+   case RAPL_BYT_QUIRK:
+   /*
+* Some Atom processors (BYT/BSW) have 2^ESU microjoules
+* increment, refer to Software Developers' Manual, Vol. 3C,
+* Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
+*/
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+   rapl_hw_unit[i] = 32 - rapl_hw_unit[i]

[PATCH 2/2] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-09 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Baytrail and Braswell processors (Model 55 and 76):

The Silvermont/Airmont microarchitecture uses a fixed energy status
unit (ESU) whose smallest unit is one microjoule. This patch adds a
quirk for these Atom processors (BYT/BSW) so that the energy increment
is calculated as 2^ESU microjoules.

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-8.

v2: simplify setting rapl_hw_unit[] to reduce runtime overhead.
v3: refine multiple quirks in rapl_check_hw_unit().

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.

Signed-off-by: Harry Pan 
---
 arch/x86/events/intel/rapl.c | 83 +++-
 1 file changed, 67 insertions(+), 16 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index f7924640..3786574 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const 
rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
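 #define RAPL_IDX_KNL   (1<<RAPL_IDX_PKG_NRG_STAT|\
  1<<RAPL_IDX_RAM_NRG_STAT)
 
+/* Baytrail/Braswell clients have PP0, PKG */
+#define RAPL_IDX_BYT   (1<<RAPL_IDX_PP0_NRG_STAT|\
+1<<RAPL_IDX_PKG_NRG_STAT)
+
 /*
  * event code: LSB 8 bits, passed in attr->config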
  * any other bit is reserved
@@ -136,6 +140,12 @@ static struct perf_pmu_events_attr event_attr_##v = {  
\
.event_str  = str,  
\
 };
 
+enum rapl_quirk {
+   RAPL_NO_QUIRK = 0,
+   RAPL_HSX_QUIRK,
+   RAPL_BYT_QUIRK,
+};
+
 struct rapl_pmu {
raw_spinlock_t  lock;
int n_active;
@@ -452,6 +462,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, 
"2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, 
"2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, 
"2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
@@ -533,6 +551,18 @@ static struct attribute *rapl_events_knl_attr[] = {
NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
@@ -617,7 +647,7 @@ static int rapl_cpu_prepare(unsigned int cpu)
return 0;
 }
 
-static int rapl_check_hw_unit(bool apply_quirk)
+static int rapl_check_hw_unit(enum rapl_quirk apply_quirk)
 {
u64 msr_rapl_power_unit_bits;
int i;
@@ -628,14 +658,27 @@ static int rapl_check_hw_unit(bool apply_quirk)
for (i = 0; i < NR_RAPL_DOMAINS; i++)
rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
-   /*
-* DRAM domain on HSW server and KNL has fixed energy unit which can be
-* different than the unit from power unit MSR. See
-* "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-* of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-*/
-   if (apply_quirk)
+   switch (apply_quirk) {
+   case RAPL_HSX_QUIRK:
+   /*
+* DRAM domain on HSW server and KNL has fixed energy unit
+* which can be different than the unit from power unit MSR.
+* See "Intel Xeon Processor E5-1600 and E5-2600 v3 Product
+* Families, V2 of 2. Datasheet, September 2014,
+* Reference Number: 330784-001"
+*/
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
+   break;
+   case RAPL_BYT_QUIRK:
+   /*
+* Some Atom processors (BYT/BSW) have 2^ESU microjoules
+* increment, refer to Software Developers' Manual, Vol. 3C,
+* Order No. 325384, Table 35-8 of MSR_RAPL_POWER_UNIT.
+*/
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+   rapl_hw_unit[i] = 32 - rapl_hw_unit[i];
+   break;
+   }
 
/*
 * Calculate the timer rate:
@@ -702,47 +745,53 @@ static int __init init_rapl_pmus(void)
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&init }
 
 struct intel_rapl_init_fun {
-   bool apply_quirk;
+   enum rapl_quirk apply_quirk;
int cntr_mask;

[PATCH 2/2] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-09 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Baytrail and Braswell processors (Model 55 and 76):

The Silvermont/Airmont microarchitecture uses a fixed energy status
unit (ESU) whose smallest unit is one microjoule. This patch adds a
quirk for these Atom processors (BYT/BSW) so that the energy increment
is calculated as 2^ESU microjoules.

For the ESU and power domains, refer to the Intel Software Developer's
Manual, Vol. 3C, Order No. 325384, Table 35-8.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 68 +---
 1 file changed, 58 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index f7924640..328fea4 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const 
rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<<RAPL_IDX_PKG_NRG_STAT|\
 1<<RAPL_IDX_RAM_NRG_STAT)
 
+/* Baytrail/Braswell clients have PP0, PKG */
+#define RAPL_IDX_BYT   (1<<RAPL_IDX_PP0_NRG_STAT|\
+1<<RAPL_IDX_PKG_NRG_STAT)
+
 /*
  * event code: LSB 8 bits, passed in attr->config
  * any other bit is reserved
@@ -136,6 +140,12 @@ static struct perf_pmu_events_attr event_attr_##v = {  
\
.event_str  = str,  
\
 };
 
+enum rapl_quirk {
+   RAPL_NO_QUIRK = 0,
+   RAPL_HSX_QUIRK,
+   RAPL_BYT_QUIRK,
+};
+
 struct rapl_pmu {
raw_spinlock_t  lock;
int n_active;
@@ -452,6 +462,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, "2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, "2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, "2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
@@ -533,6 +551,18 @@ static struct attribute *rapl_events_knl_attr[] = {
NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
@@ -617,7 +647,7 @@ static int rapl_cpu_prepare(unsigned int cpu)
return 0;
 }
 
-static int rapl_check_hw_unit(bool apply_quirk)
+static int rapl_check_hw_unit(enum rapl_quirk apply_quirk)
 {
u64 msr_rapl_power_unit_bits;
int i;
@@ -634,10 +664,20 @@ static int rapl_check_hw_unit(bool apply_quirk)
 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
 */
-   if (apply_quirk)
+   if (apply_quirk == RAPL_HSX_QUIRK)
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 
/*
+* Some Atom processors (BYT/BSW) have 2^ESU microjoules increment,
+* refer to Software Developers' Manual, Vol. 3C, Order No. 325384,
+* Table 35-8 of MSR_RAPL_POWER_UNIT
+*/
+   if (apply_quirk == RAPL_BYT_QUIRK) {
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+   rapl_hw_unit[i] = 32 - rapl_hw_unit[i];
+   }
+
+   /*
 * Calculate the timer rate:
 * Use reference of 200W for scaling the timeout to avoid counter
 * overflows. 200W = 200 Joules/sec
@@ -702,47 +742,53 @@ static int __init init_rapl_pmus(void)
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long) }
 
 struct intel_rapl_init_fun {
-   bool apply_quirk;
+   enum rapl_quirk apply_quirk;
int cntr_mask;
struct attribute **attrs;
 };
 
 static const struct intel_rapl_init_fun snb_rapl_init __initconst = {
-   .apply_quirk = false,
+   .apply_quirk = RAPL_NO_QUIRK,
.cntr_mask = RAPL_IDX_CLN,
.attrs = rapl_events_cln_attr,
 };
 
 static const struct intel_rapl_init_fun hsx_rapl_init __initconst = {
-   .apply_quirk = true,
+   .apply_quirk = RAPL_HSX_QUIRK,
.cntr_mask = RAPL_IDX_SRV

[PATCH 2/2] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-09 Thread Harry Pan
This patch enables RAPL counters (energy consumption counters)
support for Intel Baytrail and Braswell processors (Model 55 and 76):

The Silvermont/Airmont microarchitecture actually uses fixed
energy status unit (ESU) in smallest unit of microjoule,
this patch adds quirk for these Atom processors (BYT/BSW)
to calculate energy increment in 2^ESU microjoules.

ESU and power domains refer to Intel Software Developers' Manual,
Vol. 3C, Order No. 325384, Table 35-8.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also enables multiple quirks.

Signed-off-by: Harry Pan 
---
 arch/x86/events/intel/rapl.c | 68 +---
 1 file changed, 58 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index f7924640..328fea4 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const 
rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<config
  * any other bit is reserved
@@ -136,6 +140,12 @@ static struct perf_pmu_events_attr event_attr_##v = {  
\
.event_str  = str,  
\
 };
 
+enum rapl_quirk {
+   RAPL_NO_QUIRK = 0,
+   RAPL_HSX_QUIRK,
+   RAPL_BYT_QUIRK,
+};
+
 struct rapl_pmu {
raw_spinlock_t  lock;
int n_active;
@@ -452,6 +462,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, 
"2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, 
"2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, 
"2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
@@ -533,6 +551,18 @@ static struct attribute *rapl_events_knl_attr[] = {
NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
@@ -617,7 +647,7 @@ static int rapl_cpu_prepare(unsigned int cpu)
return 0;
 }
 
-static int rapl_check_hw_unit(bool apply_quirk)
+static int rapl_check_hw_unit(enum rapl_quirk apply_quirk)
 {
u64 msr_rapl_power_unit_bits;
int i;
@@ -634,10 +664,20 @@ static int rapl_check_hw_unit(bool apply_quirk)
 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
 */
-   if (apply_quirk)
+   if (apply_quirk == RAPL_HSX_QUIRK)
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 
/*
+* Some Atom processors (BYT/BSW) have 2^ESU microjoules increment,
+* refer to Software Developers' Manual, Vol. 3C, Order No. 325384,
+* Table 35-8 of MSR_RAPL_POWER_UNIT
+*/
+   if (apply_quirk == RAPL_BYT_QUIRK) {
+   for (i = 0; i < NR_RAPL_DOMAINS; i++)
+   rapl_hw_unit[i] = 32 - rapl_hw_unit[i];
+   }
+
+   /*
 * Calculate the timer rate:
 * Use reference of 200W for scaling the timeout to avoid counter
 * overflows. 200W = 200 Joules/sec
@@ -702,47 +742,53 @@ static int __init init_rapl_pmus(void)
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long) }
 
 struct intel_rapl_init_fun {
-   bool apply_quirk;
+   enum rapl_quirk apply_quirk;
int cntr_mask;
struct attribute **attrs;
 };
 
 static const struct intel_rapl_init_fun snb_rapl_init __initconst = {
-   .apply_quirk = false,
+   .apply_quirk = RAPL_NO_QUIRK,
.cntr_mask = RAPL_IDX_CLN,
.attrs = rapl_events_cln_attr,
 };
 
 static const struct intel_rapl_init_fun hsx_rapl_init __initconst = {
-   .apply_quirk = true,
+   .apply_quirk = RAPL_HSX_QUIRK,
.cntr_mask = RAPL_IDX_SRV,
.attrs = rapl_events_srv_attr,
 };
 
 static const struct intel_rapl_init_fun hsw_rapl_init __initconst = {
-   .apply_quirk = false,
+   .apply_quirk = RAPL_NO_QUIRK,
.cntr_mask = RAPL_IDX_HSW,
.attrs = rapl_events_hsw_attr,
 };
 
 static const struct intel_rapl_init_fun snbep_rapl_init __initconst

[PATCH 1/2] perf/x86/rapl: Enable Apollo Lake RAPL support

2016-09-09 Thread Harry Pan
This patch enables support for RAPL counters (energy consumption
counters) on Intel Apollo Lake (Goldmont) processors (model 92).

Unlike Silvermont/Airmont, Goldmont RAPL follows the Haswell
microarchitecture: the energy increment is in units of 1/2^ESU joules,
and the PP0/PP1/PKG/RAM power domains are supported.

For the ESU and the power domains, refer to the Intel Software
Developer's Manual, Vol. 3C, Order No. 325384, Table 35-12.
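
The way a CPU model is mapped to an init descriptor can be sketched as a
sentinel-terminated table lookup. This is a simplified user-space model,
not the kernel's x86_cpu_id matching code: the struct and function names
are hypothetical, and the Skylake model numbers (0x4E, 0x5E, 0x55) are
filled in as assumptions rather than taken from this patch. Goldmont is
model 92 (0x5C), as stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct intel_rapl_init_fun; only a label is kept. */
struct init_fun {
	const char *name;
};

/* Simplified stand-in for struct x86_cpu_id. */
struct cpu_match {
	int family;
	int model;
	const struct init_fun *data;
};

static const struct init_fun skl_init = { "skl" };
static const struct init_fun hsx_init = { "hsx" };
static const struct init_fun hsw_init = { "hsw" };

static const struct cpu_match match_table[] = {
	{ 6, 0x4E, &skl_init },	/* Skylake mobile (assumed model number) */
	{ 6, 0x5E, &skl_init },	/* Skylake desktop (assumed model number) */
	{ 6, 0x55, &hsx_init },	/* Skylake X (assumed model number) */
	{ 6, 0x5C, &hsw_init },	/* Goldmont, model 92, reuses the HSW setup */
	{ 0, 0, NULL },		/* sentinel, like the trailing {} entry */
};

/* Walk the table until the sentinel; NULL means "model not supported". */
static const struct init_fun *find_init(int family, int model)
{
	const struct cpu_match *m;

	for (m = match_table; m->data != NULL; m++)
		if (m->family == family && m->model == model)
			return m->data;
	return NULL;
}
```

In this model, adding a new processor is a one-line table entry that
points at an existing descriptor, which mirrors the shape of the
two-line diff below.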

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 2886593..f7924640 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -765,6 +765,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE,  skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
+
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
 
-- 
2.6.6



[PATCH 1/2] perf/x86/rapl: Enable Apollo Lake RAPL support

2016-09-08 Thread Harry Pan
This patch enables support for RAPL counters (energy consumption
counters) on Intel Apollo Lake (Goldmont) processors (model 92).

Unlike Silvermont/Airmont, Goldmont RAPL follows the Haswell
microarchitecture: the energy increment is in units of 1/2^ESU joules,
and the PP0/PP1/PKG/RAM power domains are supported.

For the ESU and the power domains, refer to the Intel Software
Developer's Manual, Vol. 3C, Order No. 325384, Table 35-12.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 2886593..f7924640 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -765,6 +765,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE,  skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X,   hsx_rapl_init),
+
+   X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
 };
 
-- 
2.6.6



[PATCH 2/2] perf/x86/rapl: Enable Baytrail/Braswell RAPL support

2016-09-08 Thread Harry Pan
This patch enables support for RAPL counters (energy consumption
counters) on Intel Baytrail and Braswell processors (models 55 and 76).

The Silvermont/Airmont microarchitecture uses a fixed energy status
unit (ESU) whose smallest unit is the microjoule. This patch adds a
quirk for these Atom processors (BYT/BSW) so that the energy increment
is calculated in units of 2^ESU microjoules.

For the ESU and the power domains, refer to the Intel Software
Developer's Manual, Vol. 3C, Order No. 325384, Table 35-8.

Usage example:

$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

This patch also turns the quirk flag into an enum so that multiple
quirks can be supported.

Signed-off-by: Harry Pan <harry@intel.com>
---
 arch/x86/events/intel/rapl.c | 78 ++--
 1 file changed, 68 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index f7924640..fdd4d86 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,6 +110,10 @@ static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
 #define RAPL_IDX_KNL   (1<<RAPL_IDX_PKG_NRG_STAT|\
 1<<RAPL_IDX_RAM_NRG_STAT)
 
+/* Baytrail/Braswell clients have PP0, PKG */
+#define RAPL_IDX_BYT   (1<<RAPL_IDX_PP0_NRG_STAT|\
+1<<RAPL_IDX_PKG_NRG_STAT)
+
 /*
  * event code: LSB 8 bits, passed in attr->config
  * any other bit is reserved
@@ -136,6 +140,11 @@ static struct perf_pmu_events_attr event_attr_##v = {  \
.event_str  = str,  \
 };
 
+enum rapl_quirk {
+   RAPL_HSX_QUIRK = 1,
+   RAPL_BYT_QUIRK,
+};
+
 struct rapl_pmu {
raw_spinlock_t  lock;
int n_active;
@@ -158,6 +167,7 @@ static struct rapl_pmus *rapl_pmus;
 static cpumask_t rapl_cpu_mask;
 static unsigned int rapl_cntr_mask;
 static u64 rapl_timer_ms;
+static bool is_baytrail;
 
 static inline struct rapl_pmu *cpu_to_rapl_pmu(unsigned int cpu)
 {
@@ -177,6 +187,16 @@ static inline u64 rapl_scale(u64 v, int cfg)
pr_warn("Invalid domain %d, failed to scale data\n", cfg);
return v;
}
+
+   /*
+* Some Atom series processors (BYT/BSW) use 2^ESU microjoules.
+*
+* TODO: this looks hacky, it's better to refactor scale-up mechanism
+* to compromise the main stream processors and Atom ones.
+*/
+   if (is_baytrail)
+   return v << rapl_hw_unit[cfg - 1];
+
/*
 * scale delta to smallest unit (1/2^32)
 * users must then scale back: count * 1/(1e9*2^32) to get Joules
@@ -452,6 +472,14 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, "2.3283064365386962890
 RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, "2.3283064365386962890625e-10");
 RAPL_EVENT_ATTR_STR(energy-psys.scale,   rapl_psys_scale, "2.3283064365386962890625e-10");
 
+/*
+ * Some Atom series processors (BYT/BSW) have fixed
+ * energy status unit (ESU) in smallest unit of microjoule,
+ * and its increment is in 2^ESU microjoules.
+ */
+RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_byt_cores_scale, "1.0e-6");
+RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_byt_pkg_scale, "1.0e-6");
+
 static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
@@ -533,6 +561,18 @@ static struct attribute *rapl_events_knl_attr[] = {
NULL,
 };
 
+static struct attribute *rapl_events_byt_attr[] = {
+   EVENT_PTR(rapl_cores),
+   EVENT_PTR(rapl_pkg),
+
+   EVENT_PTR(rapl_cores_unit),
+   EVENT_PTR(rapl_pkg_unit),
+
+   EVENT_PTR(rapl_byt_cores_scale),
+   EVENT_PTR(rapl_byt_pkg_scale),
+   NULL,
+};
+
 static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
@@ -617,7 +657,7 @@ static int rapl_cpu_prepare(unsigned int cpu)
return 0;
 }
 
-static int rapl_check_hw_unit(bool apply_quirk)
+static int rapl_check_hw_unit(enum rapl_quirk apply_quirk)
 {
u64 msr_rapl_power_unit_bits;
int i;
@@ -634,10 +674,20 @@ static int rapl_check_hw_unit(bool apply_quirk)
 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
 */
-   if (apply_quirk)
+   if (apply_quirk == RAPL_HSX_QUIRK)
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 
/*
+* Some Atom processors (BYT/BSW) have 2^ESU microjoules increment,
+* refer to Software Developers' Manual, Vol. 3C, Order No. 325384,
+* Table 35-8 of MSR_RAPL_POWER_UNIT
+*/
+   if (apply_quirk == RAPL_BYT_QUIRK)
+   is_baytrail = true;
+   else
+   is_baytrail = false;
+
+   /*
 * Calculate the timer rate:
 * Use reference of 200W for scaling the timeout to avoid counter
 * overflows. 200W = 200 Joules/sec
@@ -702,47 +752,53 @@ static int __init init_rapl_pmus(void)
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long) }
 
 struct intel_rapl_i

[PATCH] ASoC: dapm: Do not traverse widget hooks to snd-soc-dummy

2016-03-19 Thread Harry Pan
All components are initially given an empty card when the platform is
registered. Since commit 6e78108bda78
("ASoC: core: Don't probe the component which is dummy"),
snd-soc-dummy is no longer probed, so it keeps that empty card.

This patch skips iterating over the widget hooks of the 'snd-soc-dummy'
component; otherwise reading them triggers a memory fault, because
card->widgets is dereferenced at an invalid address.
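
The guard described above can be sketched outside the kernel. This is a
simplified model, not the ASoC code: the struct layouts, the singly
linked widget list, and count_widgets() are hypothetical, and a NULL
card pointer stands in for the never-initialized card of the dummy
component.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the ASoC structures. */
struct widget {
	const char *name;
	struct widget *next;
};

struct card {
	struct widget *widgets;
};

struct component {
	const char *name;
	struct card *card;	/* NULL models the dummy's unusable card */
};

/* Count the widgets of a component, skipping the dummy component. */
static int count_widgets(const struct component *cmpnt)
{
	const struct widget *w;
	int count = 0;

	/* Bail out before cmpnt->card is ever dereferenced. */
	if (!strcmp(cmpnt->name, "snd-soc-dummy"))
		return 0;

	for (w = cmpnt->card->widgets; w != NULL; w = w->next)
		count++;
	return count;
}
```

The point of the early return is its position: the name check has to
come before the first access through cmpnt->card, otherwise the walk
starts from an invalid list head.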

Tested with: grep -rsI "hello" /sys/devices/platform/skl_nau88l25_ssm4567_i2s/

Conflicts:
sound/soc/soc-dapm.c

Signed-off-by: Harry Pan <harry@intel.com>
---
 sound/soc/soc-dapm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c
index 581175a..0bc15c9 100644
--- a/sound/soc/soc-dapm.c
+++ b/sound/soc/soc-dapm.c
@@ -2188,6 +2188,9 @@ static ssize_t dapm_widget_show_component(struct snd_soc_component *cmpnt,
int count = 0;
char *state = "not set";
 
+   if (!strcmp(cmpnt->name, "snd-soc-dummy"))
+   return 0;
+
list_for_each_entry(w, &cmpnt->card->widgets, list) {
if (w->dapm != dapm)
continue;
-- 
2.1.2


