Re: [PATCH v6 08/11] drivers: perf: hisi: use poll method to avoid L3C counter overflow

2017-03-28 Thread Mark Rutland
On Mon, Mar 27, 2017 at 12:03:24PM +0530, Anurup M wrote:
> On Friday 24 March 2017 05:13 PM, Mark Rutland wrote:
> >>>How do we ensure that we don't take the interrupt in the middle of a
> >>>sequence of accesses to the HW?
> >>>
> >>>The L3 cache and MN PMUs do not use the overflow IRQ, and it does not
> >>>occur here, as the interrupt mask register is masked by default in
> >>>hardware.
> >I was referring to the timer interrupt which backs the hrtimer.
> >
> >i.e. how do we guarantee that hisi_hrtimer_callback() is not called
> >while we are in the middle of a RMW sequence? Are interrupts disabled
> >for all of those sequences?
> 
> HW accesses via djtag read and write are protected by spin_lock_irqsave().

Thanks for the explanation.

I mistakenly thought that there were sequences that would need to make
several hisi_djtag_{read,writel}() calls that might conflict with the
overflow handler, but that is not the case, so the spin_lock_irqsave()
does appear to be sufficient.

Thanks,
Mark.


Re: [PATCH v6 08/11] drivers: perf: hisi: use poll method to avoid L3C counter overflow

2017-03-27 Thread Anurup M



On Friday 24 March 2017 05:13 PM, Mark Rutland wrote:
> > > How do we ensure that we don't take the interrupt in the middle of a
> > > sequence of accesses to the HW?
> >
> > The L3 cache and MN PMUs do not use the overflow IRQ, and it does not
> > occur here, as the interrupt mask register is masked by default in
> > hardware.
>
> I was referring to the timer interrupt which backs the hrtimer.
>
> i.e. how do we guarantee that hisi_hrtimer_callback() is not called
> while we are in the middle of a RMW sequence? Are interrupts disabled
> for all of those sequences?

HW accesses via djtag read and write are protected by spin_lock_irqsave().

Thanks,
Anurup

> Thanks,
> Mark.




Re: [PATCH v6 08/11] drivers: perf: hisi: use poll method to avoid L3C counter overflow

2017-03-24 Thread Mark Rutland
On Fri, Mar 24, 2017 at 12:13:18PM +0530, Anurup M wrote:
> On Tuesday 21 March 2017 10:46 PM, Mark Rutland wrote:
> >On Fri, Mar 10, 2017 at 01:28:45AM -0500, Anurup M wrote:

> >>+/* The counter overflow IRQ is not supported for some PMUs
> >>+ * use hrtimer to periodically poll and avoid overflow
> >>+ */
> >>+static enum hrtimer_restart hisi_hrtimer_callback(struct hrtimer *hrtimer)
> >>+{
> >>+	struct hisi_pmu *hisi_pmu = container_of(hrtimer,
> >>+						 struct hisi_pmu, hrtimer);
> >>+	struct perf_event *event;
> >>+	struct hw_perf_event *hwc;
> >>+	unsigned long flags;
> >>+
> >>+	/* Return if no active events */
> >>+	if (!hisi_pmu->num_active)
> >>+		return HRTIMER_NORESTART;
> >>+
> >>+	local_irq_save(flags);
> >>+
> >>+	/* Update event count for each active event */
> >>+	list_for_each_entry(event, &hisi_pmu->active_list, active_entry) {
> >>+		hwc = &event->hw;
> >>+		/* Read hardware counter and update the Perf event counter */
> >>+		hisi_pmu->ops->event_update(event, hwc, GET_CNTR_IDX(hwc));
> >>+	}
> >How do we ensure that we don't take the interrupt in the middle of a
> >sequence of accesses to the HW?
> 
> The L3 cache and MN PMUs do not use the overflow IRQ, and it does not
> occur here, as the interrupt mask register is masked by default in
> hardware.

I was referring to the timer interrupt which backs the hrtimer.

i.e. how do we guarantee that hisi_hrtimer_callback() is not called
while we are in the middle of a RMW sequence? Are interrupts disabled
for all of those sequences?

Thanks,
Mark.


Re: [PATCH v6 08/11] drivers: perf: hisi: use poll method to avoid L3C counter overflow

2017-03-24 Thread Anurup M



On Tuesday 21 March 2017 10:46 PM, Mark Rutland wrote:
> On Fri, Mar 10, 2017 at 01:28:45AM -0500, Anurup M wrote:
> > Add hrtimer support, which uses a poll method to avoid counter overflow
> > when the overflow IRQ is not supported in hardware.
> > The L3 cache PMU uses an N-N SPI interrupt, which has no support in the
> > mainline kernel. So use an hrtimer to poll and update the event counters
> > to avoid an overflow condition for the L3 cache PMU.
> > An interval of 10 seconds is used for the hrtimer.
>
> This should be folded with the previous patch, given that it is
> necessary for counters to work correctly.

I had separated it for ease of review. I shall fold it into the main L3C
patch.

> [...]
>
> > +/*
> > + * Default timer frequency to poll and avoid counter overflow.
> > + * CPU speed = 2.4Ghz, Therefore Access time = 0.4ns
> > + * L1 cache - 2 way set associative
> > + * L2  - 16 way set associative
> > + * L3  - 16 way set associative. L3 cache has 4 banks.
> > + *
> > + * Overflow time = 2^31 * (access time L1 + access time L2 + access time L3)
> > + * = 2^31 * ((2 * 0.4ns) + (16 * 0.4ns) + (4 * 16 * 0.4ns)) = 70 seconds
> > + *
> > + * L3 cache is also used by devices like PCIe, SAS etc. at
> > + * the same time. So the overflow time could be even smaller.
> > + * So on a safe side we use a timer interval of 10sec
> > + */
> > +#define L3C_HRTIMER_INTERVAL (10LL * MSEC_PER_SEC)
>
> This sounds fine.
>
> [...]
>
> > +/*
> > + * sysfs hrtimer_interval attributes
> > + */
> > +ssize_t hisi_hrtimer_interval_sysfs_show(struct device *dev,
> > +					 struct device_attribute *attr,
> > +					 char *buf)
> > +{
> > +	struct pmu *pmu = dev_get_drvdata(dev);
> > +	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
> > +
> > +	if (hisi_pmu->hrt_duration)
> > +		return sprintf(buf, "%llu\n", hisi_pmu->hrt_duration);
> > +	return 0;
> > +}
>
> I don't think that we need a sysfs property for this.

Agreed. I shall remove it.

> [...]
>
> > +/* The counter overflow IRQ is not supported for some PMUs
> > + * use hrtimer to periodically poll and avoid overflow
> > + */
> > +static enum hrtimer_restart hisi_hrtimer_callback(struct hrtimer *hrtimer)
> > +{
> > +	struct hisi_pmu *hisi_pmu = container_of(hrtimer,
> > +						 struct hisi_pmu, hrtimer);
> > +	struct perf_event *event;
> > +	struct hw_perf_event *hwc;
> > +	unsigned long flags;
> > +
> > +	/* Return if no active events */
> > +	if (!hisi_pmu->num_active)
> > +		return HRTIMER_NORESTART;
> > +
> > +	local_irq_save(flags);
> > +
> > +	/* Update event count for each active event */
> > +	list_for_each_entry(event, &hisi_pmu->active_list, active_entry) {
> > +		hwc = &event->hw;
> > +		/* Read hardware counter and update the Perf event counter */
> > +		hisi_pmu->ops->event_update(event, hwc, GET_CNTR_IDX(hwc));
> > +	}
>
> How do we ensure that we don't take the interrupt in the middle of a
> sequence of accesses to the HW?

The L3 cache and MN PMUs do not use the overflow IRQ, and it does not
occur here, as the interrupt mask register is masked by default in
hardware. But yes, I will change it to call
hisi_pmu->ops->overflow_handler, which can mask the IRQ (if required) and
call event_update.

Thanks,
Anurup

> Thanks,
> Mark.




Re: [PATCH v6 08/11] drivers: perf: hisi: use poll method to avoid L3C counter overflow

2017-03-21 Thread Mark Rutland
On Fri, Mar 10, 2017 at 01:28:45AM -0500, Anurup M wrote:
> Add hrtimer support, which uses a poll method to avoid counter overflow
> when the overflow IRQ is not supported in hardware.
> The L3 cache PMU uses an N-N SPI interrupt, which has no support in the
> mainline kernel. So use an hrtimer to poll and update the event counters
> to avoid an overflow condition for the L3 cache PMU.
> An interval of 10 seconds is used for the hrtimer.

This should be folded with the previous patch, given that it is
necessary for counters to work correctly.
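
To illustrate why the polling is needed for correct counts, here is a small
user-space simulation. It is a sketch under assumptions, not the driver's
event_update(): it assumes a 32-bit wrapping hardware counter (the real
counter width is not shown in this thread), and struct sim_event,
sim_event_update() and sim_run() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one event; not the driver's data structures. */
struct sim_event {
	uint32_t prev_raw;   /* last raw value read from the hardware counter */
	uint64_t count;      /* 64-bit running total kept in software */
};

/*
 * Fold the current raw counter value into the running total.
 * Unsigned subtraction is modulo 2^32, so the delta is still correct
 * if the counter wrapped at most once since the previous poll.
 */
static void sim_event_update(struct sim_event *ev, uint32_t raw_now)
{
	uint32_t delta = raw_now - ev->prev_raw;

	ev->prev_raw = raw_now;
	ev->count += delta;
}

/*
 * Advance a simulated 32-bit counter by `total` increments, polling it
 * every `poll_every` increments. Returns the total seen by software.
 */
static uint64_t sim_run(uint64_t total, uint64_t poll_every)
{
	struct sim_event ev = { 0, 0 };
	uint64_t done = 0;

	assert(poll_every > 0);
	while (done < total) {
		uint64_t left = total - done;
		uint64_t step = left < poll_every ? left : poll_every;

		done += step;
		/* The simulated hardware only keeps the low 32 bits. */
		sim_event_update(&ev, (uint32_t)done);
	}
	return ev.count;
}
```

Polling more often than the counter can wrap, as in sim_run(10000000000ULL,
1000000000ULL), recovers the full count. Polling less often, as in
sim_run(10000000000ULL, 6000000000ULL), silently loses whole wraps, which is
the failure the hrtimer is meant to prevent.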

[...]

> +/*
> + * Default timer frequency to poll and avoid counter overflow.
> + * CPU speed = 2.4Ghz, Therefore Access time = 0.4ns
> + * L1 cache - 2 way set associative
> + * L2  - 16 way set associative
> + * L3  - 16 way set associative. L3 cache has 4 banks.
> + *
> + * Overflow time = 2^31 * (access time L1 + access time L2 + access time L3)
> + * = 2^31 * ((2 * 0.4ns) + (16 * 0.4ns) + (4 * 16 * 0.4ns)) = 70 seconds
> + *
> + * L3 cache is also used by devices like PCIe, SAS etc. at
> + * the same time. So the overflow time could be even smaller.
> + * So on a safe side we use a timer interval of 10sec
> + */
> +#define L3C_HRTIMER_INTERVAL (10LL * MSEC_PER_SEC)

This sounds fine.
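
The arithmetic in the quoted comment can be checked with a few lines of
stand-alone C. This is only a sketch that reuses the figures from the comment
(0.4 ns access time, associativity and bank counts, 2^31 counts); the function
names are hypothetical, and MSEC_PER_SEC is redefined locally with the same
value as the kernel constant.

```c
#include <assert.h>

/* Figures taken from the quoted comment. */
#define ACCESS_TIME_NS 0.4   /* comment's rounding of 1 / 2.4 GHz */
#define L1_WAYS        2
#define L2_WAYS        16
#define L3_WAYS        16
#define L3_BANKS       4

/* Local copy of the kernel constant, for a user-space build. */
#define MSEC_PER_SEC   1000LL
#define L3C_HRTIMER_INTERVAL (10LL * MSEC_PER_SEC)

/* Estimated time, in seconds, for 2^31 counts to accumulate. */
static double overflow_seconds(void)
{
	double per_count_ns =
		(L1_WAYS + L2_WAYS + L3_BANKS * L3_WAYS) * ACCESS_TIME_NS;
	double counts = (double)(1ULL << 31);

	return counts * per_count_ns * 1e-9;
}

/* How many polls fit into one estimated overflow period. */
static double polls_per_overflow(void)
{
	double interval_s = (double)L3C_HRTIMER_INTERVAL / MSEC_PER_SEC;

	assert(interval_s > 0.0);
	return overflow_seconds() / interval_s;
}
```

With these inputs the estimate is a little over 70 seconds, so a 10 second
interval leaves roughly a sevenfold margin for the extra traffic from devices
that the comment mentions. Note that the macro value itself is in
milliseconds (10 * 1000).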

[...]

> +/*
> + * sysfs hrtimer_interval attributes
> + */
> +ssize_t hisi_hrtimer_interval_sysfs_show(struct device *dev,
> +					 struct device_attribute *attr,
> +					 char *buf)
> +{
> +	struct pmu *pmu = dev_get_drvdata(dev);
> +	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
> +
> +	if (hisi_pmu->hrt_duration)
> +		return sprintf(buf, "%llu\n", hisi_pmu->hrt_duration);
> +	return 0;
> +}

I don't think that we need a sysfs property for this.

[...]

> +/* The counter overflow IRQ is not supported for some PMUs
> + * use hrtimer to periodically poll and avoid overflow
> + */
> +static enum hrtimer_restart hisi_hrtimer_callback(struct hrtimer *hrtimer)
> +{
> +	struct hisi_pmu *hisi_pmu = container_of(hrtimer,
> +						 struct hisi_pmu, hrtimer);
> +	struct perf_event *event;
> +	struct hw_perf_event *hwc;
> +	unsigned long flags;
> +
> +	/* Return if no active events */
> +	if (!hisi_pmu->num_active)
> +		return HRTIMER_NORESTART;
> +
> +	local_irq_save(flags);
> +
> +	/* Update event count for each active event */
> +	list_for_each_entry(event, &hisi_pmu->active_list, active_entry) {
> +		hwc = &event->hw;
> +		/* Read hardware counter and update the Perf event counter */
> +		hisi_pmu->ops->event_update(event, hwc, GET_CNTR_IDX(hwc));
> +	}

How do we ensure that we don't take the interrupt in the middle of a
sequence of accesses to the HW?

Thanks,
Mark.


[PATCH v6 08/11] drivers: perf: hisi: use poll method to avoid L3C counter overflow

2017-03-09 Thread Anurup M
Add hrtimer support, which uses a poll method to avoid counter overflow
when the overflow IRQ is not supported in hardware.
The L3 cache PMU uses an N-N SPI interrupt, which has no support in the
mainline kernel. So use an hrtimer to poll and update the event counters
to avoid an overflow condition for the L3 cache PMU.
An interval of 10 seconds is used for the hrtimer.

Signed-off-by: Dikshit N 
Signed-off-by: Anurup M 
---
 drivers/perf/hisilicon/hisi_uncore_l3c.c | 47 ++
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 82 
 drivers/perf/hisilicon/hisi_uncore_pmu.h | 17 +++
 3 files changed, 146 insertions(+)

diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c.c b/drivers/perf/hisilicon/hisi_uncore_l3c.c
index 7f80d07..f23fbc2 100644
--- a/drivers/perf/hisilicon/hisi_uncore_l3c.c
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c.c
@@ -20,6 +20,8 @@
  * along with this program.  If not, see .
  */
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -53,6 +55,22 @@ enum armv8_hisi_l3c_counters {
 #define L3C_CNT0_REG_OFF 0x170
 #define L3C_EVENT_EN 0x100
 
+/*
+ * Default timer frequency to poll and avoid counter overflow.
+ * CPU speed = 2.4Ghz, Therefore Access time = 0.4ns
+ * L1 cache - 2 way set associative
+ * L2  - 16 way set associative
+ * L3  - 16 way set associative. L3 cache has 4 banks.
+ *
+ * Overflow time = 2^31 * (access time L1 + access time L2 + access time L3)
+ * = 2^31 * ((2 * 0.4ns) + (16 * 0.4ns) + (4 * 16 * 0.4ns)) = 70 seconds
+ *
+ * L3 cache is also used by devices like PCIe, SAS etc. at
+ * the same time. So the overflow time could be even smaller.
+ * So on a safe side we use a timer interval of 10sec
+ */
+#define L3C_HRTIMER_INTERVAL (10LL * MSEC_PER_SEC)
+
 #define GET_MODULE_ID(hwmod_data) hwmod_data->l3c_hwcfg.module_id
 #define GET_BANK_SEL(hwmod_data) hwmod_data->l3c_hwcfg.bank_select
 
@@ -474,11 +492,24 @@ static const struct attribute_group hisi_l3c_cpumask_attr_group = {
 	.attrs = hisi_l3c_cpumask_attrs,
 };
 
+static DEVICE_ATTR(hrtimer_interval, 0444, hisi_hrtimer_interval_sysfs_show,
+		   NULL);
+
+static struct attribute *hisi_l3c_hrtimer_interval_attrs[] = {
+	&dev_attr_hrtimer_interval.attr,
+	NULL,
+};
+
+static const struct attribute_group hisi_l3c_hrtimer_interval_attr_group = {
+	.attrs = hisi_l3c_hrtimer_interval_attrs,
+};
+
 static const struct attribute_group *hisi_l3c_pmu_attr_groups[] = {
 	&hisi_l3c_attr_group,
 	&hisi_l3c_format_group,
 	&hisi_l3c_events_group,
 	&hisi_l3c_cpumask_attr_group,
+	&hisi_l3c_hrtimer_interval_attr_group,
 	NULL,
 };
 
@@ -494,6 +525,15 @@ static struct hisi_uncore_ops hisi_uncore_l3c_ops = {
.write_counter = hisi_l3c_write_counter,
 };
 
+/* Initialize hrtimer to poll for avoiding counter overflow */
+static void hisi_l3c_hrtimer_init(struct hisi_pmu *l3c_pmu)
+{
+	INIT_LIST_HEAD(&l3c_pmu->active_list);
+   l3c_pmu->ops->start_hrtimer = hisi_hrtimer_start;
+   l3c_pmu->ops->stop_hrtimer = hisi_hrtimer_stop;
+   hisi_hrtimer_init(l3c_pmu, L3C_HRTIMER_INTERVAL);
+}
+
 static int hisi_l3c_pmu_init(struct hisi_pmu *l3c_pmu,
 struct hisi_djtag_client *client)
 {
@@ -503,6 +543,7 @@ static int hisi_l3c_pmu_init(struct hisi_pmu *l3c_pmu,
 
 	l3c_pmu->num_events = HISI_HWEVENT_L3C_EVENT_MAX;
 	l3c_pmu->num_counters = HISI_IDX_L3C_COUNTER_MAX;
+	l3c_pmu->num_active = 0;
 	l3c_pmu->scl_id = hisi_djtag_get_sclid(client);
 
 	l3c_pmu->name = kasprintf(GFP_KERNEL, "hisi_l3c%u_%u",
@@ -513,6 +554,12 @@ static int hisi_l3c_pmu_init(struct hisi_pmu *l3c_pmu,
 	/* Pick one core to use for cpumask attributes */
 	cpumask_set_cpu(smp_processor_id(), &l3c_pmu->cpu);
 
+	/*
+	 * Use poll method to avoid counter overflow as overflow IRQ
+	 * is not supported in v1,v2 hardware.
+	 */
+	hisi_l3c_hrtimer_init(l3c_pmu);
+
 	return 0;
 }
 
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index 0e7b5f1..787602b 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -65,6 +65,70 @@ ssize_t hisi_cpumask_sysfs_show(struct device *dev,
 	return cpumap_print_to_pagebuf(true, buf, &hisi_pmu->cpu);
 }
 
+/*
+ * sysfs hrtimer_interval attributes
+ */
+ssize_t hisi_hrtimer_interval_sysfs_show(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   struct pmu *pmu = dev_get_drvdata(dev);
+   struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
+
+   if (hisi_pmu->hrt_duration)
+   return sprintf(buf, "%llu\n", hisi_pmu->hrt_duration);
+   return 0;
+}
+
+/* The counter overflow IRQ is not supported for some PMUs
+ * use hrtimer to periodically poll and avoid overflow
+ */
+static enum hrtimer_restart