Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-16 Thread Dixit, Ashutosh
On Tue, 16 May 2023 00:12:45 -0700, Tvrtko Ursulin wrote:
>
> On 15/05/2023 22:24, Dixit, Ashutosh wrote:
> > On Mon, 15 May 2023 02:52:35 -0700, Tvrtko Ursulin wrote:
> >>
> >> On 13/05/2023 00:44, Umesh Nerlige Ramappa wrote:
> >>> On Fri, May 12, 2023 at 04:20:19PM -0700, Dixit, Ashutosh wrote:
>  On Fri, 12 May 2023 15:44:00 -0700, Umesh Nerlige Ramappa wrote:
> >
> > On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:
> >> On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
> >>>
> >>
> >> Hi Umesh/Tvrtko,
> >>
> >>> From: Tvrtko Ursulin 
> >>>
> >>> We do not want to have timers per tile and waste CPU cycles and
> > energy via
> >>> multiple wake-up sources, for a relatively un-important task of PMU
> >>> sampling, so keeping a single timer works well. But we also do not
> > want
> >>> the first GT which goes idle to turn off the timer.
> >>>
> >>> Add some reference counting, via a mask of unparked GTs, to solve
> > this.
> >>>
> >>> Signed-off-by: Tvrtko Ursulin 
> >>> Signed-off-by: Umesh Nerlige Ramappa
> > 
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
> >>>    drivers/gpu/drm/i915/i915_pmu.h |  4 
> >>>    2 files changed, 14 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c
> > b/drivers/gpu/drm/i915/i915_pmu.c
> >>> index 2b63ee31e1b3..669a42e44082 100644
> >>> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >>> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
> >>>   * Signal sampling timer to stop if only engine events are
> > enabled and
> >>>   * GPU went idle.
> >>>   */
> >>> -    pmu->timer_enabled = pmu_needs_timer(pmu, false);
> >>> +    pmu->unparked &= ~BIT(gt->info.id);
> >>> +    if (pmu->unparked == 0)
> >>> +    pmu->timer_enabled = pmu_needs_timer(pmu, false);
> >>>
> >>>      spin_unlock_irq(>lock);
> >>>    }
> >>> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
> >>>      /*
> >>>   * Re-enable sampling timer when GPU goes active.
> >>>   */
> >>> -    __i915_pmu_maybe_start_timer(pmu);
> >>> +    if (pmu->unparked == 0)
> >>> +    __i915_pmu_maybe_start_timer(pmu);
> >>> +
> >>> +    pmu->unparked |= BIT(gt->info.id);
> >>>
> >>>      spin_unlock_irq(>lock);
> >>>    }
> >>> @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct
> > hrtimer *hrtimer)
> >>>   */
> >>>
> >>>      for_each_gt(gt, i915, i) {
> >>> +    if (!(pmu->unparked & BIT(i)))
> >>> +    continue;
> >>> +
> >>
> >> This is not correct. In this series we are at least sampling
> > frequencies
> >> (calling frequency_sample) even when GT is parked. So these 3 lines
> > should be
> >> deleted. engines_sample will get called and will return without doing
> >> anything if engine events are disabled.
> >
> > Not sure I understand. This is checking pmu->'un'parked bits.
> 
>  Sorry, my bad. Not "engines_sample will get called and will return
>  without
>  doing anything if engine events are disabled" but "engines_sample will
>  get
>  called and will return without doing anything if GT is not awake". This
>  is
>  the same as the previous behavior before this series.
> 
>  Umesh and I discussed this but writing this out in case Tvrtko takes a
>  look.
> >>>
> >>> Sounds good, Dropping the check here in the new revision.
> >
> > Hi Tvrtko,
> >
> >> I think it is safe to not have the check, but I didn't quite understand the
> >> "this is not correct" part. I can only see the argument that it could be
> >> redundant, not that it is incorrect.
> >
> > I said that it looks incorrect to me because in this series we are still
> > sampling freq when gt is parked and we would be skipping that if we
> > included:
> > if (!(pmu->unparked & BIT(i)))
> > continue;
>
> Ah okay. We concluded in your upstream patch that looks like an omission.
>
> >> In which case I think it should better stay since it is way more efficient,
> >> given this gets called at 200Hz, than the *atomic* atomic_inc_not_zero
> >> (from intel_wakeref_get_if_active).
> >
> > Where efficiency goes, when we merge the patch below (I have a v2 based on
> > your suggestion but I am waiting till Umesh's series gets merged):
> >
> > https://patchwork.freedesktop.org/series/117658/
> >
> > this will turn off the timer itself which will be even more
> > efficient. Rather than use the above code where the timer is running and
> > then we skip. So after the link above is merged the above code will be
> > truly redundant. That was a second reason why I said delete it.
>
> On multi-tile 

Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-16 Thread Tvrtko Ursulin



On 15/05/2023 22:24, Dixit, Ashutosh wrote:

On Mon, 15 May 2023 02:52:35 -0700, Tvrtko Ursulin wrote:


On 13/05/2023 00:44, Umesh Nerlige Ramappa wrote:

On Fri, May 12, 2023 at 04:20:19PM -0700, Dixit, Ashutosh wrote:

On Fri, 12 May 2023 15:44:00 -0700, Umesh Nerlige Ramappa wrote:


On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:

On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:




Hi Umesh/Tvrtko,


From: Tvrtko Ursulin 

We do not want to have timers per tile and waste CPU cycles and

energy via

multiple wake-up sources, for a relatively un-important task of PMU
sampling, so keeping a single timer works well. But we also do not

want

the first GT which goes idle to turn off the timer.

Add some reference counting, via a mask of unparked GTs, to solve

this.


Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Umesh Nerlige Ramappa



---
   drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
   drivers/gpu/drm/i915/i915_pmu.h |  4 
   2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c

b/drivers/gpu/drm/i915/i915_pmu.c

index 2b63ee31e1b3..669a42e44082 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
  * Signal sampling timer to stop if only engine events are

enabled and

  * GPU went idle.
  */
-    pmu->timer_enabled = pmu_needs_timer(pmu, false);
+    pmu->unparked &= ~BIT(gt->info.id);
+    if (pmu->unparked == 0)
+    pmu->timer_enabled = pmu_needs_timer(pmu, false);

     spin_unlock_irq(>lock);
   }
@@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
     /*
  * Re-enable sampling timer when GPU goes active.
  */
-    __i915_pmu_maybe_start_timer(pmu);
+    if (pmu->unparked == 0)
+    __i915_pmu_maybe_start_timer(pmu);
+
+    pmu->unparked |= BIT(gt->info.id);

     spin_unlock_irq(>lock);
   }
@@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct

hrtimer *hrtimer)

  */

     for_each_gt(gt, i915, i) {
+    if (!(pmu->unparked & BIT(i)))
+    continue;
+


This is not correct. In this series we are at least sampling

frequencies

(calling frequency_sample) even when GT is parked. So these 3 lines

should be

deleted. engines_sample will get called and will return without doing
anything if engine events are disabled.


Not sure I understand. This is checking pmu->'un'parked bits.


Sorry, my bad. Not "engines_sample will get called and will return
without
doing anything if engine events are disabled" but "engines_sample will
get
called and will return without doing anything if GT is not awake". This
is
the same as the previous behavior before this series.

Umesh and I discussed this but writing this out in case Tvrtko takes a
look.


Sounds good, Dropping the check here in the new revision.


Hi Tvrtko,


I think it is safe to not have the check, but I didn't quite understand the
"this is not correct" part. I can only see the argument that it could be
redundant, not that it is incorrect.


I said that it looks incorrect to me because in this series we are still
sampling freq when gt is parked and we would be skipping that if we
included:
if (!(pmu->unparked & BIT(i)))
continue;


Ah okay. We concluded in your upstream patch that looks like an omission.


In which case I think it should better stay since it is way more efficient,
given this gets called at 200Hz, than the *atomic* atomic_inc_not_zero
(from intel_wakeref_get_if_active).


Where efficiency goes, when we merge the patch below (I have a v2 based on
your suggestion but I am waiting till Umesh's series gets merged):

https://patchwork.freedesktop.org/series/117658/

this will turn off the timer itself which will be even more
efficient. Rather than use the above code where the timer is running and
then we skip. So after the link above is merged the above code will be
truly redundant. That was a second reason why I said delete it.


On multi-tile where not all tiles are being looked at it still pays to 
avoid the atomic check. It doesn't apply to tools like intel_gpu_top, 
which monitor all tiles, but still I think there isn't any harm in 
having the fast check.


Regards,

Tvrtko


Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-15 Thread Dixit, Ashutosh
On Mon, 15 May 2023 02:52:35 -0700, Tvrtko Ursulin wrote:
>
> On 13/05/2023 00:44, Umesh Nerlige Ramappa wrote:
> > On Fri, May 12, 2023 at 04:20:19PM -0700, Dixit, Ashutosh wrote:
> >> On Fri, 12 May 2023 15:44:00 -0700, Umesh Nerlige Ramappa wrote:
> >>>
> >>> On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:
> >>> > On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
> >>> >>
> >>> >
> >>> > Hi Umesh/Tvrtko,
> >>> >
> >>> >> From: Tvrtko Ursulin 
> >>> >>
> >>> >> We do not want to have timers per tile and waste CPU cycles and
> >>> energy via
> >>> >> multiple wake-up sources, for a relatively un-important task of PMU
> >>> >> sampling, so keeping a single timer works well. But we also do not
> >>> want
> >>> >> the first GT which goes idle to turn off the timer.
> >>> >>
> >>> >> Add some reference counting, via a mask of unparked GTs, to solve
> >>> this.
> >>> >>
> >>> >> Signed-off-by: Tvrtko Ursulin 
> >>> >> Signed-off-by: Umesh Nerlige Ramappa
> >>> 
> >>> >> ---
> >>> >>  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
> >>> >>  drivers/gpu/drm/i915/i915_pmu.h |  4 
> >>> >>  2 files changed, 14 insertions(+), 2 deletions(-)
> >>> >>
> >>> >> diff --git a/drivers/gpu/drm/i915/i915_pmu.c
> >>> b/drivers/gpu/drm/i915/i915_pmu.c
> >>> >> index 2b63ee31e1b3..669a42e44082 100644
> >>> >> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >>> >> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >>> >> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
> >>> >> * Signal sampling timer to stop if only engine events are
> >>> enabled and
> >>> >> * GPU went idle.
> >>> >> */
> >>> >> -    pmu->timer_enabled = pmu_needs_timer(pmu, false);
> >>> >> +    pmu->unparked &= ~BIT(gt->info.id);
> >>> >> +    if (pmu->unparked == 0)
> >>> >> +    pmu->timer_enabled = pmu_needs_timer(pmu, false);
> >>> >>
> >>> >>    spin_unlock_irq(>lock);
> >>> >>  }
> >>> >> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
> >>> >>    /*
> >>> >> * Re-enable sampling timer when GPU goes active.
> >>> >> */
> >>> >> -    __i915_pmu_maybe_start_timer(pmu);
> >>> >> +    if (pmu->unparked == 0)
> >>> >> +    __i915_pmu_maybe_start_timer(pmu);
> >>> >> +
> >>> >> +    pmu->unparked |= BIT(gt->info.id);
> >>> >>
> >>> >>    spin_unlock_irq(>lock);
> >>> >>  }
> >>> >> @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct
> >>> hrtimer *hrtimer)
> >>> >> */
> >>> >>
> >>> >>    for_each_gt(gt, i915, i) {
> >>> >> +    if (!(pmu->unparked & BIT(i)))
> >>> >> +    continue;
> >>> >> +
> >>> >
> >>> > This is not correct. In this series we are at least sampling
> >>> frequencies
> >>> > (calling frequency_sample) even when GT is parked. So these 3 lines
> >>> should be
> >>> > deleted. engines_sample will get called and will return without doing
> >>> > anything if engine events are disabled.
> >>>
> >>> Not sure I understand. This is checking pmu->'un'parked bits.
> >>
> >> Sorry, my bad. Not "engines_sample will get called and will return
> >> without
> >> doing anything if engine events are disabled" but "engines_sample will
> >> get
> >> called and will return without doing anything if GT is not awake". This
> >> is
> >> the same as the previous behavior before this series.
> >>
> >> Umesh and I discussed this but writing this out in case Tvrtko takes a
> >> look.
> >
> > Sounds good, Dropping the check here in the new revision.

Hi Tvrtko,

> I think it is safe to not have the check, but I didn't quite understand the
> "this is not correct" part. I can only see the argument that it could be
> redundant, not that it is incorrect.

I said that it looks incorrect to me because in this series we are still
sampling freq when gt is parked and we would be skipping that if we
included:
if (!(pmu->unparked & BIT(i)))
continue;

> In which case I think it should better stay since it is way more efficient,
> given this gets called at 200Hz, than the *atomic* atomic_inc_not_zero
> (from intel_wakeref_get_if_active).

Where efficiency goes, when we merge the patch below (I have a v2 based on
your suggestion but I am waiting till Umesh's series gets merged):

https://patchwork.freedesktop.org/series/117658/

this will turn off the timer itself which will be even more
efficient. Rather than use the above code where the timer is running and
then we skip. So after the link above is merged the above code will be
truly redundant. That was a second reason why I said delete it.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-15 Thread Tvrtko Ursulin



On 13/05/2023 00:44, Umesh Nerlige Ramappa wrote:

On Fri, May 12, 2023 at 04:20:19PM -0700, Dixit, Ashutosh wrote:

On Fri, 12 May 2023 15:44:00 -0700, Umesh Nerlige Ramappa wrote:


On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:
> On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
>>
>
> Hi Umesh/Tvrtko,
>
>> From: Tvrtko Ursulin 
>>
>> We do not want to have timers per tile and waste CPU cycles and 
energy via

>> multiple wake-up sources, for a relatively un-important task of PMU
>> sampling, so keeping a single timer works well. But we also do not 
want

>> the first GT which goes idle to turn off the timer.
>>
>> Add some reference counting, via a mask of unparked GTs, to solve 
this.

>>
>> Signed-off-by: Tvrtko Ursulin 
>> Signed-off-by: Umesh Nerlige Ramappa 


>> ---
>>  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
>>  drivers/gpu/drm/i915/i915_pmu.h |  4 
>>  2 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
b/drivers/gpu/drm/i915/i915_pmu.c

>> index 2b63ee31e1b3..669a42e44082 100644
>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
>> * Signal sampling timer to stop if only engine events are 
enabled and

>> * GPU went idle.
>> */
>> -    pmu->timer_enabled = pmu_needs_timer(pmu, false);
>> +    pmu->unparked &= ~BIT(gt->info.id);
>> +    if (pmu->unparked == 0)
>> +    pmu->timer_enabled = pmu_needs_timer(pmu, false);
>>
>>    spin_unlock_irq(>lock);
>>  }
>> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
>>    /*
>> * Re-enable sampling timer when GPU goes active.
>> */
>> -    __i915_pmu_maybe_start_timer(pmu);
>> +    if (pmu->unparked == 0)
>> +    __i915_pmu_maybe_start_timer(pmu);
>> +
>> +    pmu->unparked |= BIT(gt->info.id);
>>
>>    spin_unlock_irq(>lock);
>>  }
>> @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct 
hrtimer *hrtimer)

>> */
>>
>>    for_each_gt(gt, i915, i) {
>> +    if (!(pmu->unparked & BIT(i)))
>> +    continue;
>> +
>
> This is not correct. In this series we are at least sampling 
frequencies
> (calling frequency_sample) even when GT is parked. So these 3 lines 
should be

> deleted. engines_sample will get called and will return without doing
> anything if engine events are disabled.

Not sure I understand. This is checking pmu->'un'parked bits.


Sorry, my bad. Not "engines_sample will get called and will return 
without
doing anything if engine events are disabled" but "engines_sample will 
get
called and will return without doing anything if GT is not awake". 
This is

the same as the previous behavior before this series.

Umesh and I discussed this but writing this out in case Tvrtko takes a
look.


Sounds good, Dropping the check here in the new revision.


I think it is safe to not have the check, but I didn't quite understand 
the "this is not correct" part. I can only see the argument that it 
could be redundant, not that it is incorrect.


In which case I think it should better stay since it is way more 
efficient, given this gets called at 200Hz, than the *atomic* 
atomic_inc_not_zero (from intel_wakeref_get_if_active).


Regards,

Tvrtko


[Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-15 Thread Umesh Nerlige Ramappa
From: Tvrtko Ursulin 

We do not want to have timers per tile and waste CPU cycles and energy via
multiple wake-up sources, for a relatively un-important task of PMU
sampling, so keeping a single timer works well. But we also do not want
the first GT which goes idle to turn off the timer.

Add some reference counting, via a mask of unparked GTs, to solve this.

v2: Drop the check for unparked in i915_sample (Ashutosh)

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Umesh Nerlige Ramappa 
Reviewed-by: Ashutosh Dixit 
---
 drivers/gpu/drm/i915/i915_pmu.c | 9 +++--
 drivers/gpu/drm/i915/i915_pmu.h | 4 
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 2b63ee31e1b3..725b01b00775 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
 * Signal sampling timer to stop if only engine events are enabled and
 * GPU went idle.
 */
-   pmu->timer_enabled = pmu_needs_timer(pmu, false);
+   pmu->unparked &= ~BIT(gt->info.id);
+   if (pmu->unparked == 0)
+   pmu->timer_enabled = pmu_needs_timer(pmu, false);
 
spin_unlock_irq(>lock);
 }
@@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
/*
 * Re-enable sampling timer when GPU goes active.
 */
-   __i915_pmu_maybe_start_timer(pmu);
+   if (pmu->unparked == 0)
+   __i915_pmu_maybe_start_timer(pmu);
+
+   pmu->unparked |= BIT(gt->info.id);
 
spin_unlock_irq(>lock);
 }
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
index a686fd7ccedf..3a811266ac6a 100644
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -76,6 +76,10 @@ struct i915_pmu {
 * @lock: Lock protecting enable mask and ref count handling.
 */
spinlock_t lock;
+   /**
+* @unparked: GT unparked mask.
+*/
+   unsigned int unparked;
/**
 * @timer: Timer for internal i915 PMU sampling.
 */
-- 
2.36.1



Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-12 Thread Dixit, Ashutosh
On Fri, 12 May 2023 18:55:43 -0700, Umesh Nerlige Ramappa wrote:
>
> From: Tvrtko Ursulin 
>
> We do not want to have timers per tile and waste CPU cycles and energy via
> multiple wake-up sources, for a relatively un-important task of PMU
> sampling, so keeping a single timer works well. But we also do not want
> the first GT which goes idle to turn off the timer.
>
> Add some reference counting, via a mask of unparked GTs, to solve this.
>
> v2: Drop the check for unparked in i915_sample (Ashutosh)

Reviewed-by: Ashutosh Dixit 

>
> Signed-off-by: Tvrtko Ursulin 
> Reviewed-by: Umesh Nerlige Ramappa 
> Signed-off-by: Umesh Nerlige Ramappa 
> ---
>  drivers/gpu/drm/i915/i915_pmu.c | 9 +++--
>  drivers/gpu/drm/i915/i915_pmu.h | 4 
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 2b63ee31e1b3..725b01b00775 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
>* Signal sampling timer to stop if only engine events are enabled and
>* GPU went idle.
>*/
> - pmu->timer_enabled = pmu_needs_timer(pmu, false);
> + pmu->unparked &= ~BIT(gt->info.id);
> + if (pmu->unparked == 0)
> + pmu->timer_enabled = pmu_needs_timer(pmu, false);
>
>   spin_unlock_irq(>lock);
>  }
> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
>   /*
>* Re-enable sampling timer when GPU goes active.
>*/
> - __i915_pmu_maybe_start_timer(pmu);
> + if (pmu->unparked == 0)
> + __i915_pmu_maybe_start_timer(pmu);
> +
> + pmu->unparked |= BIT(gt->info.id);
>
>   spin_unlock_irq(>lock);
>  }
> diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
> index a686fd7ccedf..3a811266ac6a 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.h
> +++ b/drivers/gpu/drm/i915/i915_pmu.h
> @@ -76,6 +76,10 @@ struct i915_pmu {
>* @lock: Lock protecting enable mask and ref count handling.
>*/
>   spinlock_t lock;
> + /**
> +  * @unparked: GT unparked mask.
> +  */
> + unsigned int unparked;
>   /**
>* @timer: Timer for internal i915 PMU sampling.
>*/
> --
> 2.36.1
>


[Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-12 Thread Umesh Nerlige Ramappa
From: Tvrtko Ursulin 

We do not want to have timers per tile and waste CPU cycles and energy via
multiple wake-up sources, for a relatively un-important task of PMU
sampling, so keeping a single timer works well. But we also do not want
the first GT which goes idle to turn off the timer.

Add some reference counting, via a mask of unparked GTs, to solve this.

v2: Drop the check for unparked in i915_sample (Ashutosh)

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Umesh Nerlige Ramappa 
---
 drivers/gpu/drm/i915/i915_pmu.c | 9 +++--
 drivers/gpu/drm/i915/i915_pmu.h | 4 
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 2b63ee31e1b3..725b01b00775 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
 * Signal sampling timer to stop if only engine events are enabled and
 * GPU went idle.
 */
-   pmu->timer_enabled = pmu_needs_timer(pmu, false);
+   pmu->unparked &= ~BIT(gt->info.id);
+   if (pmu->unparked == 0)
+   pmu->timer_enabled = pmu_needs_timer(pmu, false);
 
spin_unlock_irq(>lock);
 }
@@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
/*
 * Re-enable sampling timer when GPU goes active.
 */
-   __i915_pmu_maybe_start_timer(pmu);
+   if (pmu->unparked == 0)
+   __i915_pmu_maybe_start_timer(pmu);
+
+   pmu->unparked |= BIT(gt->info.id);
 
spin_unlock_irq(>lock);
 }
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
index a686fd7ccedf..3a811266ac6a 100644
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -76,6 +76,10 @@ struct i915_pmu {
 * @lock: Lock protecting enable mask and ref count handling.
 */
spinlock_t lock;
+   /**
+* @unparked: GT unparked mask.
+*/
+   unsigned int unparked;
/**
 * @timer: Timer for internal i915 PMU sampling.
 */
-- 
2.36.1



Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-12 Thread Umesh Nerlige Ramappa

On Fri, May 12, 2023 at 04:20:19PM -0700, Dixit, Ashutosh wrote:

On Fri, 12 May 2023 15:44:00 -0700, Umesh Nerlige Ramappa wrote:


On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:
> On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
>>
>
> Hi Umesh/Tvrtko,
>
>> From: Tvrtko Ursulin 
>>
>> We do not want to have timers per tile and waste CPU cycles and energy via
>> multiple wake-up sources, for a relatively un-important task of PMU
>> sampling, so keeping a single timer works well. But we also do not want
>> the first GT which goes idle to turn off the timer.
>>
>> Add some reference counting, via a mask of unparked GTs, to solve this.
>>
>> Signed-off-by: Tvrtko Ursulin 
>> Signed-off-by: Umesh Nerlige Ramappa 
>> ---
>>  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
>>  drivers/gpu/drm/i915/i915_pmu.h |  4 
>>  2 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
b/drivers/gpu/drm/i915/i915_pmu.c
>> index 2b63ee31e1b3..669a42e44082 100644
>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
>> * Signal sampling timer to stop if only engine events are enabled and
>> * GPU went idle.
>> */
>> -  pmu->timer_enabled = pmu_needs_timer(pmu, false);
>> +  pmu->unparked &= ~BIT(gt->info.id);
>> +  if (pmu->unparked == 0)
>> +  pmu->timer_enabled = pmu_needs_timer(pmu, false);
>>
>>spin_unlock_irq(>lock);
>>  }
>> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
>>/*
>> * Re-enable sampling timer when GPU goes active.
>> */
>> -  __i915_pmu_maybe_start_timer(pmu);
>> +  if (pmu->unparked == 0)
>> +  __i915_pmu_maybe_start_timer(pmu);
>> +
>> +  pmu->unparked |= BIT(gt->info.id);
>>
>>spin_unlock_irq(>lock);
>>  }
>> @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct hrtimer 
*hrtimer)
>> */
>>
>>for_each_gt(gt, i915, i) {
>> +  if (!(pmu->unparked & BIT(i)))
>> +  continue;
>> +
>
> This is not correct. In this series we are at least sampling frequencies
> (calling frequency_sample) even when GT is parked. So these 3 lines should be
> deleted. engines_sample will get called and will return without doing
> anything if engine events are disabled.

Not sure I understand. This is checking pmu->'un'parked bits.


Sorry, my bad. Not "engines_sample will get called and will return without
doing anything if engine events are disabled" but "engines_sample will get
called and will return without doing anything if GT is not awake". This is
the same as the previous behavior before this series.

Umesh and I discussed this but writing this out in case Tvrtko takes a
look.


Sounds good, Dropping the check here in the new revision.

Thanks,
Umesh


Thanks.
--
Ashutosh




>
>
>>engines_sample(gt, period_ns);
>>
>>if (i == 0) /* FIXME */
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.h 
b/drivers/gpu/drm/i915/i915_pmu.h
>> index a686fd7ccedf..3a811266ac6a 100644
>> --- a/drivers/gpu/drm/i915/i915_pmu.h
>> +++ b/drivers/gpu/drm/i915/i915_pmu.h
>> @@ -76,6 +76,10 @@ struct i915_pmu {
>> * @lock: Lock protecting enable mask and ref count handling.
>> */
>>spinlock_t lock;
>> +  /**
>> +   * @unparked: GT unparked mask.
>> +   */
>> +  unsigned int unparked;
>>/**
>> * @timer: Timer for internal i915 PMU sampling.
>> */
>> --
>> 2.36.1
>>


Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-12 Thread Dixit, Ashutosh
On Fri, 12 May 2023 15:44:00 -0700, Umesh Nerlige Ramappa wrote:
>
> On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:
> > On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
> >>
> >
> > Hi Umesh/Tvrtko,
> >
> >> From: Tvrtko Ursulin 
> >>
> >> We do not want to have timers per tile and waste CPU cycles and energy via
> >> multiple wake-up sources, for a relatively un-important task of PMU
> >> sampling, so keeping a single timer works well. But we also do not want
> >> the first GT which goes idle to turn off the timer.
> >>
> >> Add some reference counting, via a mask of unparked GTs, to solve this.
> >>
> >> Signed-off-by: Tvrtko Ursulin 
> >> Signed-off-by: Umesh Nerlige Ramappa 
> >> ---
> >>  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
> >>  drivers/gpu/drm/i915/i915_pmu.h |  4 
> >>  2 files changed, 14 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> >> b/drivers/gpu/drm/i915/i915_pmu.c
> >> index 2b63ee31e1b3..669a42e44082 100644
> >> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
> >> * Signal sampling timer to stop if only engine events are enabled and
> >> * GPU went idle.
> >> */
> >> -  pmu->timer_enabled = pmu_needs_timer(pmu, false);
> >> +  pmu->unparked &= ~BIT(gt->info.id);
> >> +  if (pmu->unparked == 0)
> >> +  pmu->timer_enabled = pmu_needs_timer(pmu, false);
> >>
> >>spin_unlock_irq(>lock);
> >>  }
> >> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
> >>/*
> >> * Re-enable sampling timer when GPU goes active.
> >> */
> >> -  __i915_pmu_maybe_start_timer(pmu);
> >> +  if (pmu->unparked == 0)
> >> +  __i915_pmu_maybe_start_timer(pmu);
> >> +
> >> +  pmu->unparked |= BIT(gt->info.id);
> >>
> >>spin_unlock_irq(>lock);
> >>  }
> >> @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct hrtimer 
> >> *hrtimer)
> >> */
> >>
> >>for_each_gt(gt, i915, i) {
> >> +  if (!(pmu->unparked & BIT(i)))
> >> +  continue;
> >> +
> >
> > This is not correct. In this series we are at least sampling frequencies
> > (calling frequency_sample) even when GT is parked. So these 3 lines should 
> > be
> > deleted. engines_sample will get called and will return without doing
> > anything if engine events are disabled.
>
> Not sure I understand. This is checking pmu->'un'parked bits.

Sorry, my bad. Not "engines_sample will get called and will return without
doing anything if engine events are disabled" but "engines_sample will get
called and will return without doing anything if GT is not awake". This is
the same as the previous behavior before this series.

Umesh and I discussed this but writing this out in case Tvrtko takes a
look.

Thanks.
--
Ashutosh



> >
> >
> >>engines_sample(gt, period_ns);
> >>
> >>if (i == 0) /* FIXME */
> >> diff --git a/drivers/gpu/drm/i915/i915_pmu.h 
> >> b/drivers/gpu/drm/i915/i915_pmu.h
> >> index a686fd7ccedf..3a811266ac6a 100644
> >> --- a/drivers/gpu/drm/i915/i915_pmu.h
> >> +++ b/drivers/gpu/drm/i915/i915_pmu.h
> >> @@ -76,6 +76,10 @@ struct i915_pmu {
> >> * @lock: Lock protecting enable mask and ref count handling.
> >> */
> >>spinlock_t lock;
> >> +  /**
> >> +   * @unparked: GT unparked mask.
> >> +   */
> >> +  unsigned int unparked;
> >>/**
> >> * @timer: Timer for internal i915 PMU sampling.
> >> */
> >> --
> >> 2.36.1
> >>


Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-12 Thread Umesh Nerlige Ramappa

On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:

On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:




Hi Umesh/Tvrtko,


From: Tvrtko Ursulin 

We do not want to have timers per tile and waste CPU cycles and energy via
multiple wake-up sources, for a relatively un-important task of PMU
sampling, so keeping a single timer works well. But we also do not want
the first GT which goes idle to turn off the timer.

Add some reference counting, via a mask of unparked GTs, to solve this.

Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Umesh Nerlige Ramappa 
---
 drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
 drivers/gpu/drm/i915/i915_pmu.h |  4 
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 2b63ee31e1b3..669a42e44082 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
 * Signal sampling timer to stop if only engine events are enabled and
 * GPU went idle.
 */
-   pmu->timer_enabled = pmu_needs_timer(pmu, false);
+   pmu->unparked &= ~BIT(gt->info.id);
+   if (pmu->unparked == 0)
+   pmu->timer_enabled = pmu_needs_timer(pmu, false);

spin_unlock_irq(>lock);
 }
@@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
/*
 * Re-enable sampling timer when GPU goes active.
 */
-   __i915_pmu_maybe_start_timer(pmu);
+   if (pmu->unparked == 0)
+   __i915_pmu_maybe_start_timer(pmu);
+
+   pmu->unparked |= BIT(gt->info.id);

spin_unlock_irq(>lock);
 }
@@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct hrtimer 
*hrtimer)
 */

for_each_gt(gt, i915, i) {
+   if (!(pmu->unparked & BIT(i)))
+   continue;
+


This is not correct. In this series we are at least sampling frequencies
(calling frequency_sample) even when GT is parked. So these 3 lines should be
deleted. engines_sample will get called and will return without doing
anything if engine events are disabled.


Not sure I understand. This is checking pmu->'un'parked bits.

Thanks,
Umesh


Thanks.
--
Ashutosh



engines_sample(gt, period_ns);

if (i == 0) /* FIXME */
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
index a686fd7ccedf..3a811266ac6a 100644
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -76,6 +76,10 @@ struct i915_pmu {
 * @lock: Lock protecting enable mask and ref count handling.
 */
spinlock_t lock;
+   /**
+* @unparked: GT unparked mask.
+*/
+   unsigned int unparked;
/**
 * @timer: Timer for internal i915 PMU sampling.
 */
--
2.36.1



Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-12 Thread Dixit, Ashutosh
On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
>

Hi Umesh/Tvrtko,

> From: Tvrtko Ursulin 
>
> We do not want to have timers per tile and waste CPU cycles and energy via
> multiple wake-up sources, for a relatively un-important task of PMU
> sampling, so keeping a single timer works well. But we also do not want
> the first GT which goes idle to turn off the timer.
>
> Add some reference counting, via a mask of unparked GTs, to solve this.
>
> Signed-off-by: Tvrtko Ursulin 
> Signed-off-by: Umesh Nerlige Ramappa 
> ---
>  drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
>  drivers/gpu/drm/i915/i915_pmu.h |  4 
>  2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 2b63ee31e1b3..669a42e44082 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
>* Signal sampling timer to stop if only engine events are enabled and
>* GPU went idle.
>*/
> - pmu->timer_enabled = pmu_needs_timer(pmu, false);
> + pmu->unparked &= ~BIT(gt->info.id);
> + if (pmu->unparked == 0)
> + pmu->timer_enabled = pmu_needs_timer(pmu, false);
>
>   spin_unlock_irq(>lock);
>  }
> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
>   /*
>* Re-enable sampling timer when GPU goes active.
>*/
> - __i915_pmu_maybe_start_timer(pmu);
> + if (pmu->unparked == 0)
> + __i915_pmu_maybe_start_timer(pmu);
> +
> + pmu->unparked |= BIT(gt->info.id);
>
>   spin_unlock_irq(>lock);
>  }
> @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct hrtimer 
> *hrtimer)
>*/
>
>   for_each_gt(gt, i915, i) {
> + if (!(pmu->unparked & BIT(i)))
> + continue;
> +

This is not correct. In this series we are at least sampling frequencies
(calling frequency_sample) even when GT is parked. So these 3 lines should be
deleted. engines_sample will get called and will return without doing
anything if engine events are disabled.

Thanks.
--
Ashutosh


>   engines_sample(gt, period_ns);
>
>   if (i == 0) /* FIXME */
> diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
> index a686fd7ccedf..3a811266ac6a 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.h
> +++ b/drivers/gpu/drm/i915/i915_pmu.h
> @@ -76,6 +76,10 @@ struct i915_pmu {
>* @lock: Lock protecting enable mask and ref count handling.
>*/
>   spinlock_t lock;
> + /**
> +  * @unparked: GT unparked mask.
> +  */
> + unsigned int unparked;
>   /**
>* @timer: Timer for internal i915 PMU sampling.
>*/
> --
> 2.36.1
>


Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-10 Thread Dixit, Ashutosh
On Tue, 09 May 2023 10:25:16 -0700, Dixit, Ashutosh wrote:
>
> On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
> >
> > From: Tvrtko Ursulin 
> >
> > We do not want to have timers per tile and waste CPU cycles and energy via
> > multiple wake-up sources, for a relatively un-important task of PMU
> > sampling, so keeping a single timer works well. But we also do not want
> > the first GT which goes idle to turn off the timer.
>
> Apart from this efficiency, what is the reason for having a device level
> PMU (which monitors gt level events), rather than independent gt level
> PMU's (each of which monitor events from that gt)?
>
> Wouldn't independent gt level PMU's be simpler? And user space tools (say
> intel-gpu-top) would hook into events from a gt and treat each gt
> independently?
>
> So my question really is what is the reason for keeping the PMU device
> level rather than per gt?

Maybe ignore this for now, the way it is expressed it is too open
ended. Let me get a better handle on the code and the patches and I'll see
if I have anything to say.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-09 Thread Dixit, Ashutosh
On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
>
> From: Tvrtko Ursulin 
>
> We do not want to have timers per tile and waste CPU cycles and energy via
> multiple wake-up sources, for a relatively un-important task of PMU
> sampling, so keeping a single timer works well. But we also do not want
> the first GT which goes idle to turn off the timer.

Apart from this efficiency, what is the reason for having a device level
PMU (which monitors gt level events), rather than independent gt level
PMU's (each of which monitor events from that gt)?

Wouldn't independent gt level PMU's be simpler? And user space tools (say
intel-gpu-top) would hook into events from a gt and treat each gt
independently?

So my question really is what is the reason for keeping the PMU device
level rather than per gt?

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-08 Thread Umesh Nerlige Ramappa

On Fri, May 05, 2023 at 05:58:14PM -0700, Umesh Nerlige Ramappa wrote:

From: Tvrtko Ursulin 

We do not want to have timers per tile and waste CPU cycles and energy via
multiple wake-up sources, for a relatively un-important task of PMU
sampling, so keeping a single timer works well. But we also do not want
the first GT which goes idle to turn off the timer.

Add some reference counting, via a mask of unparked GTs, to solve this.


Looks like the previous patch is a prep work for this one. I would 
mention something about this patch in the previous patch, but then I am 
not sure what's the norm in these scenarios. Recently I created some IGT 
patches that are prep work and refer to future patches in the series.


As is, this patch is 


Reviewed-by: Umesh Nerlige Ramappa 

Thanks,
Umesh



Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Umesh Nerlige Ramappa 
---
drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
drivers/gpu/drm/i915/i915_pmu.h |  4 
2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 2b63ee31e1b3..669a42e44082 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
 * Signal sampling timer to stop if only engine events are enabled and
 * GPU went idle.
 */
-   pmu->timer_enabled = pmu_needs_timer(pmu, false);
+   pmu->unparked &= ~BIT(gt->info.id);
+   if (pmu->unparked == 0)
+   pmu->timer_enabled = pmu_needs_timer(pmu, false);

spin_unlock_irq(>lock);
}
@@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
/*
 * Re-enable sampling timer when GPU goes active.
 */
-   __i915_pmu_maybe_start_timer(pmu);
+   if (pmu->unparked == 0)
+   __i915_pmu_maybe_start_timer(pmu);
+
+   pmu->unparked |= BIT(gt->info.id);

spin_unlock_irq(>lock);
}
@@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct hrtimer 
*hrtimer)
 */

for_each_gt(gt, i915, i) {
+   if (!(pmu->unparked & BIT(i)))
+   continue;
+
engines_sample(gt, period_ns);

if (i == 0) /* FIXME */
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
index a686fd7ccedf..3a811266ac6a 100644
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -76,6 +76,10 @@ struct i915_pmu {
 * @lock: Lock protecting enable mask and ref count handling.
 */
spinlock_t lock;
+   /**
+* @unparked: GT unparked mask.
+*/
+   unsigned int unparked;
/**
 * @timer: Timer for internal i915 PMU sampling.
 */
--
2.36.1



[Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

2023-05-05 Thread Umesh Nerlige Ramappa
From: Tvrtko Ursulin 

We do not want to have timers per tile and waste CPU cycles and energy via
multiple wake-up sources, for a relatively un-important task of PMU
sampling, so keeping a single timer works well. But we also do not want
the first GT which goes idle to turn off the timer.

Add some reference counting, via a mask of unparked GTs, to solve this.

Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Umesh Nerlige Ramappa 
---
 drivers/gpu/drm/i915/i915_pmu.c | 12 ++--
 drivers/gpu/drm/i915/i915_pmu.h |  4 
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 2b63ee31e1b3..669a42e44082 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
 * Signal sampling timer to stop if only engine events are enabled and
 * GPU went idle.
 */
-   pmu->timer_enabled = pmu_needs_timer(pmu, false);
+   pmu->unparked &= ~BIT(gt->info.id);
+   if (pmu->unparked == 0)
+   pmu->timer_enabled = pmu_needs_timer(pmu, false);
 
spin_unlock_irq(>lock);
 }
@@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
/*
 * Re-enable sampling timer when GPU goes active.
 */
-   __i915_pmu_maybe_start_timer(pmu);
+   if (pmu->unparked == 0)
+   __i915_pmu_maybe_start_timer(pmu);
+
+   pmu->unparked |= BIT(gt->info.id);
 
spin_unlock_irq(>lock);
 }
@@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct hrtimer 
*hrtimer)
 */
 
for_each_gt(gt, i915, i) {
+   if (!(pmu->unparked & BIT(i)))
+   continue;
+
engines_sample(gt, period_ns);
 
if (i == 0) /* FIXME */
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
index a686fd7ccedf..3a811266ac6a 100644
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -76,6 +76,10 @@ struct i915_pmu {
 * @lock: Lock protecting enable mask and ref count handling.
 */
spinlock_t lock;
+   /**
+* @unparked: GT unparked mask.
+*/
+   unsigned int unparked;
/**
 * @timer: Timer for internal i915 PMU sampling.
 */
-- 
2.36.1