Re: [PATCH] drm/i915/hwmon: Get rid of devm

2024-04-18 Thread Dixit, Ashutosh
On Thu, 18 Apr 2024 14:56:58 -0700, Andi Shyti wrote:
>
> > v2: Change commit message and other minor code changes
> > v3: Cleanup from i915_hwmon_register on error (Armin Wolf)
> > v4: Eliminate potential static analyzer warning (Rodrigo)
> > Eliminate fetch_and_zero (Jani)
> > v5: Restore previous logic for ddat_gt->hwmon_dev error return (Andi)
>
> Thanks!
>
> Reviewed-by: Andi Shyti 

Thanks a lot Andi, merged!

Ashutosh


Re: [PATCH v4] drm/i915/hwmon: Get rid of devm

2024-04-17 Thread Dixit, Ashutosh
On Wed, 17 Apr 2024 01:28:48 -0700, Andi Shyti wrote:
>

Hi Andi,

> > @@ -839,16 +837,38 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > if (!hwm_gt_is_visible(ddat_gt, hwmon_energy, hwmon_energy_input, 0))
> > continue;
> >
> > -   hwmon_dev = devm_hwmon_device_register_with_info(dev, ddat_gt->name,
> > -   ddat_gt,
> > -   &hwm_gt_chip_info,
> > -   NULL);
> > -   if (!IS_ERR(hwmon_dev))
> > -   ddat_gt->hwmon_dev = hwmon_dev;
> > +   hwmon_dev = hwmon_device_register_with_info(dev, ddat_gt->name,
> > +   ddat_gt,
> > +   &hwm_gt_chip_info,
> > +   NULL);
> > +   if (IS_ERR(hwmon_dev))
> > +   goto err;
>
> here the logic is changing, though. Before we were not leaving if
> hwmon_device_register_with_info() was returning error.
>
> Is this wanted? And why isn't it described in the log?

Not sure if the previous logic was intentional or not, anyway I have
restored it in v5 (where I once again forgot to add "PATCH v5" to the
Subject but v5 is there in the version log :/).

Thanks.
--
Ashutosh


Re: [PATCH v2] drm/i915/hwmon: Get rid of devm

2024-04-16 Thread Dixit, Ashutosh
On Tue, 16 Apr 2024 11:55:20 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> > @@ -849,5 +849,26 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> >
> >  void i915_hwmon_unregister(struct drm_i915_private *i915)
> >  {
> > -   fetch_and_zero(&i915->hwmon);
> > +   struct i915_hwmon *hwmon = fetch_and_zero(&i915->hwmon);
> > +   struct hwm_drvdata *ddat = &hwmon->ddat;
> > +   struct intel_gt *gt;
> > +   int i;
> > +
> > +   if (!hwmon)
> > +   return;
>
> "that's too late", we are going to hear from static analyzer tools.
>
> better to move ddat = &hwmon->ddat; after this return.

Yeah, I worried a lot about it :/ But then finally decided (and verified)
that we are never actually dereferencing the (possibly NULL) pointer.

But not sure about static analyzer tools, maybe you are right, I'll move
it.

> with that,
>
> Reviewed-by: Rodrigo Vivi 

Thanks a lot :)

Ashutosh

>
> > +
> > +   for_each_gt(gt, i915, i) {
> > +   struct hwm_drvdata *ddat_gt = hwmon->ddat_gt + i;
> > +
> > +   if (ddat_gt->hwmon_dev) {
> > +   hwmon_device_unregister(ddat_gt->hwmon_dev);
> > +   ddat_gt->hwmon_dev = NULL;
> > +   }
> > +   }
> > +
> > +   if (ddat->hwmon_dev)
> > +   hwmon_device_unregister(ddat->hwmon_dev);
> > +
> > +   mutex_destroy(&hwmon->hwmon_lock);
> > +   kfree(hwmon);
> >  }
> > --
> > 2.41.0
> >


Re: [PATCH v2] drm/i915/hwmon: Get rid of devm

2024-04-15 Thread Dixit, Ashutosh
On Mon, 15 Apr 2024 16:35:02 -0700, Armin Wolf wrote:
>

Hi Armin,

> Am 16.04.24 um 00:36 schrieb Ashutosh Dixit:
> > @@ -818,10 +818,10 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > hwm_get_preregistration_info(i915);
> >
> > /*  hwmon_dev points to device hwmon */
> > -   hwmon_dev = devm_hwmon_device_register_with_info(dev, ddat->name,
> > -ddat,
> > -&hwm_chip_info,
> > -hwm_groups);
> > +   hwmon_dev = hwmon_device_register_with_info(dev, ddat->name,
> > +   ddat,
> > +   &hwm_chip_info,
> > +   hwm_groups);
> > if (IS_ERR(hwmon_dev)) {
> > i915->hwmon = NULL;
>
> you need to free hwmon here, since it is not managed by devres anymore.

Thanks a lot for catching this, I had missed it in v2, it's fixed in v3. I
am actually reusing i915_hwmon_unregister() for error unwinding in v3.

>
> > return;
> > @@ -838,10 +838,10 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > if (!hwm_gt_is_visible(ddat_gt, hwmon_energy, hwmon_energy_input, 0))
> > continue;
> >
> > -   hwmon_dev = devm_hwmon_device_register_with_info(dev, ddat_gt->name,
> > -   ddat_gt,
> > -   &hwm_gt_chip_info,
> > -   NULL);
> > +   hwmon_dev = hwmon_device_register_with_info(dev, ddat_gt->name,
> > +   ddat_gt,
> > +   &hwm_gt_chip_info,
> > +   NULL);
> > if (!IS_ERR(hwmon_dev))
> > ddat_gt->hwmon_dev = hwmon_dev;
> > }
> > @@ -849,5 +849,26 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> >
> >   void i915_hwmon_unregister(struct drm_i915_private *i915)
> >   {
> > -   fetch_and_zero(&i915->hwmon);
> > +   struct i915_hwmon *hwmon = fetch_and_zero(&i915->hwmon);
>
> Why is fetch_and_zero() necessary here?

As mentioned, in v3 i915_hwmon_unregister() itself is used for error
unwinding so we need to prevent multiple device_unregister's etc. That is
the purpose of setting i915->hwmon to NULL. But even earlier, though it is
not obvious, i915_hwmon_unregister() is called multiple times. So e.g. it
will be called at device unbind as well as module unload. So once again we
prevent multiple device_unregister's by setting and checking for NULL
i915->hwmon.
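
To illustrate the point about repeated calls, here is a minimal userspace
sketch. It is not the driver code: every demo_* name is hypothetical, and
demo_fetch_and_zero() is a plain function standing in for the i915 helper,
whose real definition is not reproduced here. The only thing it tries to show
is that clearing the pointer while fetching it makes teardown safe to call
more than once.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures; not the i915 code. */
struct demo_hwmon {
	int registered;		/* 1 while the (pretend) hwmon device exists */
};

struct demo_device {
	struct demo_hwmon *hwmon;
	int unregister_calls;	/* counts how often real teardown work ran */
};

/* Return the old pointer value and leave NULL behind in the slot. */
static struct demo_hwmon *demo_fetch_and_zero(struct demo_hwmon **slot)
{
	struct demo_hwmon *old = *slot;

	*slot = NULL;
	return old;
}

static int demo_register(struct demo_device *dev)
{
	dev->hwmon = calloc(1, sizeof(*dev->hwmon));
	if (!dev->hwmon)
		return -1;

	dev->hwmon->registered = 1;
	return 0;
}

/* Safe to call any number of times: only the first call does the work. */
static void demo_unregister(struct demo_device *dev)
{
	struct demo_hwmon *hwmon = demo_fetch_and_zero(&dev->hwmon);

	if (!hwmon)
		return;

	hwmon->registered = 0;
	dev->unregister_calls++;
	free(hwmon);
}
```

Calling demo_unregister() from both an error path and a later unbind path
would then free the object only once, which is the property relied on above.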

>
> > +   struct hwm_drvdata *ddat = &hwmon->ddat;
> > +   struct intel_gt *gt;
> > +   int i;
> > +
> > +   if (!hwmon)
> > +   return;
> > +
> > +   for_each_gt(gt, i915, i) {
> > +   struct hwm_drvdata *ddat_gt = hwmon->ddat_gt + i;
> > +
> > +   if (ddat_gt->hwmon_dev) {
> > +   hwmon_device_unregister(ddat_gt->hwmon_dev);
> > +   ddat_gt->hwmon_dev = NULL;
> > +   }
> > +   }
> > +
> > +   if (ddat->hwmon_dev)
> > +   hwmon_device_unregister(ddat->hwmon_dev);
> > +
> > +   mutex_destroy(&hwmon->hwmon_lock);
> > +   kfree(hwmon);
> >   }

Thanks.
--
Ashutosh


Re: [PATCH v2] drm/i915/hwmon: Fix locking inversion in sysfs getter

2024-03-12 Thread Dixit, Ashutosh
On Tue, 12 Mar 2024 13:34:25 -0700, Janusz Krzysztofik wrote:
>

Hi Janusz,

> On Tuesday, 12 March 2024 17:25:14 CET Dixit, Ashutosh wrote:
> > On Mon, 11 Mar 2024 13:34:58 -0700, Janusz Krzysztofik wrote:
> > >
> > > In i915 hwmon sysfs getter path we now take a hwmon_lock, then acquire an
> > > rpm wakeref.  That results in lock inversion:
> > >
> > > <4> [197.079335] ==
> > > <4> [197.085473] WARNING: possible circular locking dependency detected
> > > <4> [197.091611] 6.8.0-rc7-Patchwork_129026v7-gc4dc92fb1152+ #1 Not 
> > > tainted
> > > <4> [197.098096] --
> > > <4> [197.104231] prometheus-node/839 is trying to acquire lock:
> > > <4> [197.109680] 82764d80 (fs_reclaim){+.+.}-{0:0}, at: 
> > > __kmalloc+0x9a/0x350
> > > <4> [197.116939]
> > > but task is already holding lock:
> > > <4> [197.122730] 88811b772a40 (&hwmon->hwmon_lock){+.+.}-{3:3}, at: hwm_energy+0x4b/0x100 [i915]
> > > <4> [197.131543]
> > > which lock already depends on the new lock.
> > > ...
> > > <4> [197.507922] Chain exists of:
> > >   fs_reclaim --> &gt->reset.mutex --> &hwmon->hwmon_lock
> > > <4> [197.518528]  Possible unsafe locking scenario:
> > > <4> [197.524411]        CPU0                    CPU1
> > > <4> [197.528916]        ----                    ----
> > > <4> [197.533418]   lock(&hwmon->hwmon_lock);
> > > <4> [197.537237]                                lock(&gt->reset.mutex);
> > > <4> [197.543376]                                lock(&hwmon->hwmon_lock);
> > > <4> [197.549682]   lock(fs_reclaim);
> > > ...
> > > <4> [197.632548] Call Trace:
> > > <4> [197.634990]  <TASK>
> > > <4> [197.637088]  dump_stack_lvl+0x64/0xb0
> > > <4> [197.640738]  check_noncircular+0x15e/0x180
> > > <4> [197.652968]  check_prev_add+0xe9/0xce0
> > > <4> [197.656705]  __lock_acquire+0x179f/0x2300
> > > <4> [197.660694]  lock_acquire+0xd8/0x2d0
> > > <4> [197.673009]  fs_reclaim_acquire+0xa1/0xd0
> > > <4> [197.680478]  __kmalloc+0x9a/0x350
> > > <4> [197.689063]  acpi_ns_internalize_name.part.0+0x4a/0xb0
> > > <4> [197.694170]  acpi_ns_get_node_unlocked+0x60/0xf0
> > > <4> [197.720608]  acpi_ns_get_node+0x3b/0x60
> > > <4> [197.724428]  acpi_get_handle+0x57/0xb0
> > > <4> [197.728164]  acpi_has_method+0x20/0x50
> > > <4> [197.731896]  acpi_pci_set_power_state+0x43/0x120
> > > <4> [197.736485]  pci_power_up+0x24/0x1c0
> > > <4> [197.740047]  pci_pm_default_resume_early+0x9/0x30
> > > <4> [197.744725]  pci_pm_runtime_resume+0x2d/0x90
> > > <4> [197.753911]  __rpm_callback+0x3c/0x110
> > > <4> [197.762586]  rpm_callback+0x58/0x70
> > > <4> [197.766064]  rpm_resume+0x51e/0x730
> > > <4> [197.769542]  rpm_resume+0x267/0x730
> > > <4> [197.773020]  rpm_resume+0x267/0x730
> > > <4> [197.776498]  rpm_resume+0x267/0x730
> > > <4> [197.779974]  __pm_runtime_resume+0x49/0x90
> > > <4> [197.784055]  __intel_runtime_pm_get+0x19/0xa0 [i915]
> > > <4> [197.789070]  hwm_energy+0x55/0x100 [i915]
> > > <4> [197.793183]  hwm_read+0x9a/0x310 [i915]
> > > <4> [197.797124]  hwmon_attr_show+0x36/0x120
> > > <4> [197.800946]  dev_attr_show+0x15/0x60
> > > <4> [197.804509]  sysfs_kf_seq_show+0xb5/0x100
> > >
> > > Acquire the wakeref before the lock and hold it as long as the lock is
> > > also held.  Follow that pattern across the whole source file where similar
> > > lock inversion can happen.
> > >
> > > v2: Keep hardware read under the lock so the whole operation of updating
> > > energy from hardware is still atomic (Guenter),
> > >   - instead, acquire the rpm wakeref before the lock and hold it as long
> > > as the lock is held,
> > >   - use the same approach for other similar places across the i915_hwmon.c
> > > source file (Rodrigo).
> > >
> > > Fixes: c41b8bdcc297 ("drm/i915/hwmon: Show device level energy usage")
> >
> > I would think that the lock inversion issue was introduced here:
> >
> > 1b44019a93e2 ("drm/i915/guc: Disable PL1 power limit when loading GuC 
> > firmware")
> >
> > This is the commit which introduced this sequence:
> > lock(&gt->reset.mutex);
> > lock(&hwmon->hwmon_lock);
> >
> > Before this, everything was fine. So perhaps the Fixes tag should reference
> > this commit?
>
> OK, thanks for pointing that out.
>
> > Otherwise the patch LGTM:
> >
> > Reviewed-by: Ashutosh Dixit 

Thanks for fixing this. Somehow I didn't see it when I did
1b44019a93e2. Maybe just didn't have lockdep enabled in the kernel.

Thanks.
--
Ashutosh


Re: [PATCH v2] drm/i915/hwmon: Fix locking inversion in sysfs getter

2024-03-12 Thread Dixit, Ashutosh
On Mon, 11 Mar 2024 13:34:58 -0700, Janusz Krzysztofik wrote:
>
> In i915 hwmon sysfs getter path we now take a hwmon_lock, then acquire an
> rpm wakeref.  That results in lock inversion:
>
> <4> [197.079335] ==
> <4> [197.085473] WARNING: possible circular locking dependency detected
> <4> [197.091611] 6.8.0-rc7-Patchwork_129026v7-gc4dc92fb1152+ #1 Not tainted
> <4> [197.098096] --
> <4> [197.104231] prometheus-node/839 is trying to acquire lock:
> <4> [197.109680] 82764d80 (fs_reclaim){+.+.}-{0:0}, at: 
> __kmalloc+0x9a/0x350
> <4> [197.116939]
> but task is already holding lock:
> <4> [197.122730] 88811b772a40 (&hwmon->hwmon_lock){+.+.}-{3:3}, at: hwm_energy+0x4b/0x100 [i915]
> <4> [197.131543]
> which lock already depends on the new lock.
> ...
> <4> [197.507922] Chain exists of:
>   fs_reclaim --> &gt->reset.mutex --> &hwmon->hwmon_lock
> <4> [197.518528]  Possible unsafe locking scenario:
> <4> [197.524411]        CPU0                    CPU1
> <4> [197.528916]        ----                    ----
> <4> [197.533418]   lock(&hwmon->hwmon_lock);
> <4> [197.537237]                                lock(&gt->reset.mutex);
> <4> [197.543376]                                lock(&hwmon->hwmon_lock);
> <4> [197.549682]   lock(fs_reclaim);
> ...
> <4> [197.632548] Call Trace:
> <4> [197.634990]  <TASK>
> <4> [197.637088]  dump_stack_lvl+0x64/0xb0
> <4> [197.640738]  check_noncircular+0x15e/0x180
> <4> [197.652968]  check_prev_add+0xe9/0xce0
> <4> [197.656705]  __lock_acquire+0x179f/0x2300
> <4> [197.660694]  lock_acquire+0xd8/0x2d0
> <4> [197.673009]  fs_reclaim_acquire+0xa1/0xd0
> <4> [197.680478]  __kmalloc+0x9a/0x350
> <4> [197.689063]  acpi_ns_internalize_name.part.0+0x4a/0xb0
> <4> [197.694170]  acpi_ns_get_node_unlocked+0x60/0xf0
> <4> [197.720608]  acpi_ns_get_node+0x3b/0x60
> <4> [197.724428]  acpi_get_handle+0x57/0xb0
> <4> [197.728164]  acpi_has_method+0x20/0x50
> <4> [197.731896]  acpi_pci_set_power_state+0x43/0x120
> <4> [197.736485]  pci_power_up+0x24/0x1c0
> <4> [197.740047]  pci_pm_default_resume_early+0x9/0x30
> <4> [197.744725]  pci_pm_runtime_resume+0x2d/0x90
> <4> [197.753911]  __rpm_callback+0x3c/0x110
> <4> [197.762586]  rpm_callback+0x58/0x70
> <4> [197.766064]  rpm_resume+0x51e/0x730
> <4> [197.769542]  rpm_resume+0x267/0x730
> <4> [197.773020]  rpm_resume+0x267/0x730
> <4> [197.776498]  rpm_resume+0x267/0x730
> <4> [197.779974]  __pm_runtime_resume+0x49/0x90
> <4> [197.784055]  __intel_runtime_pm_get+0x19/0xa0 [i915]
> <4> [197.789070]  hwm_energy+0x55/0x100 [i915]
> <4> [197.793183]  hwm_read+0x9a/0x310 [i915]
> <4> [197.797124]  hwmon_attr_show+0x36/0x120
> <4> [197.800946]  dev_attr_show+0x15/0x60
> <4> [197.804509]  sysfs_kf_seq_show+0xb5/0x100
>
> Acquire the wakeref before the lock and hold it as long as the lock is
> also held.  Follow that pattern across the whole source file where similar
> lock inversion can happen.
>
> v2: Keep hardware read under the lock so the whole operation of updating
> energy from hardware is still atomic (Guenter),
>   - instead, acquire the rpm wakeref before the lock and hold it as long
> as the lock is held,
>   - use the same approach for other similar places across the i915_hwmon.c
> source file (Rodrigo).
>
> Fixes: c41b8bdcc297 ("drm/i915/hwmon: Show device level energy usage")

I would think that the lock inversion issue was introduced here:

1b44019a93e2 ("drm/i915/guc: Disable PL1 power limit when loading GuC firmware")

This is the commit which introduced this sequence:
lock(&gt->reset.mutex);
lock(&hwmon->hwmon_lock);

Before this, everything was fine. So perhaps the Fixes tag should reference
this commit?

Otherwise the patch LGTM:

Reviewed-by: Ashutosh Dixit 
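
For readers following the ordering argument, the change described in the
commit message (take the rpm wakeref first, then the lock, and keep the
wakeref for as long as the lock is held) can be sketched in plain C. This is
only a model under stated assumptions: the demo_* names are hypothetical,
simple flags stand in for the wakeref and for hwmon_lock, and no real locking
or runtime PM call is made.

```c
#include <stdbool.h>

/* Hypothetical model: flags stand in for the rpm wakeref and hwmon_lock. */
struct demo_state {
	int wakeref_count;		/* outstanding runtime-PM references */
	bool lock_held;			/* models hwmon_lock */
	bool resumed_under_lock;	/* set if a wakeref was taken while locked */
	unsigned long long energy;	/* accumulated energy value */
};

static void demo_rpm_get(struct demo_state *s)
{
	/* A resume can end up allocating memory; under the lock that is the bug. */
	if (s->lock_held)
		s->resumed_under_lock = true;
	s->wakeref_count++;
}

static void demo_rpm_put(struct demo_state *s) { s->wakeref_count--; }
static void demo_lock(struct demo_state *s)    { s->lock_held = true; }
static void demo_unlock(struct demo_state *s)  { s->lock_held = false; }

/* Old ordering: lock first, then wakeref. */
static unsigned long long demo_energy_old(struct demo_state *s,
					  unsigned long long hw_delta)
{
	unsigned long long val;

	demo_lock(s);
	demo_rpm_get(s);
	s->energy += hw_delta;	/* hardware read and update stay under the lock */
	val = s->energy;
	demo_rpm_put(s);
	demo_unlock(s);

	return val;
}

/* New ordering: wakeref first, held for as long as the lock is held. */
static unsigned long long demo_energy_new(struct demo_state *s,
					  unsigned long long hw_delta)
{
	unsigned long long val;

	demo_rpm_get(s);
	demo_lock(s);
	s->energy += hw_delta;	/* still atomic with respect to the lock */
	val = s->energy;
	demo_unlock(s);
	demo_rpm_put(s);

	return val;
}
```

Both variants return the same value; only the old one ever takes the wakeref
while the lock is held, which is the dependency lockdep complains about.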


Re: [Intel-gfx] [PATCH v2] drm/i915/guc: Dump perf_limit_reasons for debug

2023-06-27 Thread Dixit, Ashutosh
On Tue, 27 Jun 2023 12:13:36 -0700, Vinay Belgaumkar wrote:
>
> GuC load takes longer sometimes due to GT frequency not ramping up.
> Add perf_limit_reasons to the existing warn print to see if frequency
> is being throttled.
>
> v2: Review comments (Ashutosh)

Reviewed-by: Ashutosh Dixit 

>
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> index 364d0d546ec8..0f79cb658518 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> @@ -251,9 +251,11 @@ static int guc_wait_ucode(struct intel_guc *guc)
>   if (ret == 0)
>   ret = -ENXIO;
>   } else if (delta_ms > 200) {
> - guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n",
> -  delta_ms, intel_rps_read_actual_frequency(&uncore->gt->rps),
> -  before_freq, status, count, ret);
> + guc_warn(guc, "excessive init time: %lldms! [status = 0x%08X, count = %d, ret = %d]\n",
> +  delta_ms, status, count, ret);
> + guc_warn(guc, "excessive init time: [freq = %dMHz, before = %dMHz, perf_limit_reasons = 0x%08X]\n",
> +  intel_rps_read_actual_frequency(&uncore->gt->rps), before_freq,
> +  intel_uncore_read(uncore, intel_gt_perf_limit_reasons_reg(gt)));
>   } else {
>   guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, status = 0x%08X, count = %d, ret = %d\n",
>   delta_ms, intel_rps_read_actual_frequency(&uncore->gt->rps),
> --
> 2.38.1
>


Re: [Intel-gfx] [PATCH] drm/i915/guc: Dump perf_limit_reasons for debug

2023-06-27 Thread Dixit, Ashutosh
On Mon, 26 Jun 2023 21:02:14 -0700, Belgaumkar, Vinay wrote:
>
>
> On 6/26/2023 8:17 PM, Dixit, Ashutosh wrote:
> > On Mon, 26 Jun 2023 19:12:18 -0700, Vinay Belgaumkar wrote:
> >> GuC load takes longer sometimes due to GT frequency not ramping up.
> >> Add perf_limit_reasons to the existing warn print to see if frequency
> >> is being throttled.
> >>
> >> Signed-off-by: Vinay Belgaumkar 
> >> ---
> >>   drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++
> >>   1 file changed, 2 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
> >> b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> >> index 364d0d546ec8..73911536a8e7 100644
> >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> >> @@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc)
> >>guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, 
> >> before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n",
> >> delta_ms, 
> >> intel_rps_read_actual_frequency(&uncore->gt->rps),
> >> before_freq, status, count, ret);
> >> +  guc_warn(guc, "perf limit reasons = 0x%08X\n",
> >> +   intel_uncore_read(uncore, 
> >> intel_gt_perf_limit_reasons_reg(gt)));
> > Maybe just add at the end of the previous guc_warn?
>
> It's already too long a line. If I try adding on the next line checkpatch
> complains about splitting double quotes.

In these cases of long quoted lines we generally ignore checkpatch. Because
perf limit reasons is part of the "excessive init time" message, it should
be on the same line within the square brackets. So we should not be
splitting double quotes.

Another idea would be something like this:

guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = %dMHz, status = 0x%08X]\n",
         delta_ms, intel_rps_read_actual_frequency(&uncore->gt->rps),
         before_freq, status);
guc_warn(guc, "excessive init time: [count = %d, ret = %d, perf limit reasons = 0x%08X]\n",
         count, ret, intel_uncore_read(uncore, intel_gt_perf_limit_reasons_reg(gt)));

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH] drm/i915/guc: Dump perf_limit_reasons for debug

2023-06-26 Thread Dixit, Ashutosh
On Mon, 26 Jun 2023 19:12:18 -0700, Vinay Belgaumkar wrote:
> 
> GuC load takes longer sometimes due to GT frequency not ramping up.
> Add perf_limit_reasons to the existing warn print to see if frequency
> is being throttled.
> 
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> index 364d0d546ec8..73911536a8e7 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> @@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc)
>   guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, 
> before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n",
>delta_ms, 
> intel_rps_read_actual_frequency(&uncore->gt->rps),
>before_freq, status, count, ret);
> + guc_warn(guc, "perf limit reasons = 0x%08X\n",
> +  intel_uncore_read(uncore, 
> intel_gt_perf_limit_reasons_reg(gt)));

Maybe just add at the end of the previous guc_warn?

>   } else {
>   guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, 
> status = 0x%08X, count = %d, ret = %d\n",
>   delta_ms, 
> intel_rps_read_actual_frequency(&uncore->gt->rps),
> -- 
> 2.38.1
> 


Re: [PATCH] drm/i915/guc/slpc: Apply min softlimit correctly

2023-06-15 Thread Dixit, Ashutosh
On Fri, 09 Jun 2023 15:02:52 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> We were skipping when min_softlimit was equal to RPn. We need to apply
> it regardless as efficient frequency will push the SLPC min to RPe.
> This will break scenarios where user sets a min softlimit < RPe before
> reset and then performs a GT reset.
>
> Fixes: 95ccf312a1e4 ("drm/i915/guc/slpc: Allow SLPC to use efficient 
> frequency")
>
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index 01b75529311c..ee9f83af7cf6 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -606,7 +606,7 @@ static int slpc_set_softlimits(struct intel_guc_slpc 
> *slpc)
>   if (unlikely(ret))
>   return ret;
>   slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
> - } else if (slpc->min_freq_softlimit != slpc->min_freq) {
> + } else {
>   return intel_guc_slpc_set_min_freq(slpc,
>  slpc->min_freq_softlimit);

IMO the current code is unnecessarily complicated and confusing and similar
changes (with a little tweaking) should be made for max_freq too. But at
least this is a step in the right direction so:

Reviewed-by: Ashutosh Dixit 



>   }
> --
> 2.38.1
>


Re: [PATCH] drm/i915/guc/slpc: Apply min softlimit correctly

2023-06-13 Thread Dixit, Ashutosh
On Fri, 09 Jun 2023 15:02:52 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> We were skipping when min_softlimit was equal to RPn. We need to apply
> it rergardless as efficient frequency will push the SLPC min to RPe.

regardless

> This will break scenarios where user sets a min softlimit < RPe before
> reset and then performs a GT reset.

Can you explain the reason for the patch clearly in terms of variables in
the code: what variable has what value, and what is the bug? I am not
following from the above description.

Thanks.
--
Ashutosh


>
> Fixes: 95ccf312a1e4 ("drm/i915/guc/slpc: Allow SLPC to use efficient 
> frequency")
>
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index 01b75529311c..ee9f83af7cf6 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -606,7 +606,7 @@ static int slpc_set_softlimits(struct intel_guc_slpc 
> *slpc)
>   if (unlikely(ret))
>   return ret;
>   slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
> - } else if (slpc->min_freq_softlimit != slpc->min_freq) {
> + } else {
>   return intel_guc_slpc_set_min_freq(slpc,
>  slpc->min_freq_softlimit);
>   }
> --
> 2.38.1
>


Re: [PATCH] dim: Disallow remote branch deletions with 'dim push'

2023-06-02 Thread Dixit, Ashutosh
On Fri, 02 Jun 2023 03:16:20 -0700, Jani Nikula wrote:
>
> On Thu, 01 Jun 2023, Ashutosh Dixit  wrote:
> > An inadvertent 'dim push -d' can delete remote branches. Disallow such
> > remote branch deletions.
>
> Please see 
> https://drm.pages.freedesktop.org/maintainer-tools/CONTRIBUTING.html
>
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >  dim | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/dim b/dim
> > index 126568e..e5899e6 100755
> > --- a/dim
> > +++ b/dim
> > @@ -1029,6 +1029,12 @@ function dim_push_branch
> > fi
> > fi
> >
> > +   # Disallow remote branch deletions, say with 'dim push -d'
> > +   if [[ "$@" == *"-d"* ]]; then
> > +   echoerr "Attempt to delete remote branch, aborting."
> > +   return 1
> > +   fi
>
> I'm working on adding a server side git pre-receive hook to tackle this
> too, but I guess there's no harm in adding this. The choice of -d for
> dry run was unfortunate, and this helps with the 'dim -d foo' vs 'dim
> foo -d' mistake.

Yup, understood. I thought I'd just send out the patch anyway in case it
was useful.

I have created a merge request for the patch here:

https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/21

Thanks.
--
Ashutosh

> > +
> > git_push $remote $branch "$@"
> >
> > update_linux_next $branch drm-intel-next drm-intel-next-fixes 
> > drm-intel-fixes
>
> --
> Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH 2/2] drm/i915/pmu: Make PMU sample array two-dimensional

2023-05-24 Thread Dixit, Ashutosh
On Wed, 24 May 2023 10:53:20 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 24/05/2023 18:38, Dixit, Ashutosh wrote:
> > On Wed, 24 May 2023 04:38:18 -0700, Tvrtko Ursulin wrote:
> >> On 23/05/2023 16:19, Ashutosh Dixit wrote:
> >>> No functional changes but we can remove some unsightly index computation
> >>> and read/write functions if we convert the PMU sample array from a
> >>> one-dimensional to a two-dimensional array.
> >>>
> >>> Suggested-by: Tvrtko Ursulin 
> >>> Signed-off-by: Ashutosh Dixit 
> >>> ---
> >>>drivers/gpu/drm/i915/i915_pmu.c | 60 ++---
> >>>drivers/gpu/drm/i915/i915_pmu.h |  2 +-
> >>>2 files changed, 19 insertions(+), 43 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> >>> b/drivers/gpu/drm/i915/i915_pmu.c
> >>> index b47d890d4ada1..137e0df9573ee 100644
> >>> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >>> @@ -195,33 +195,6 @@ static inline s64 ktime_since_raw(const ktime_t kt)
> >>>   return ktime_to_ns(ktime_sub(ktime_get_raw(), kt));
> >>>}
> >>>-static unsigned int
> >>> -__sample_idx(struct i915_pmu *pmu, unsigned int gt_id, int sample)
> >>> -{
> >>> - unsigned int idx = gt_id * __I915_NUM_PMU_SAMPLERS + sample;
> >>> -
> >>> - GEM_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> >>> -
> >>> - return idx;
> >>> -}
> >>> -
> >>> -static u64 read_sample(struct i915_pmu *pmu, unsigned int gt_id, int 
> >>> sample)
> >>> -{
> >>> - return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
> >>> -}
> >>> -
> >>> -static void
> >>> -store_sample(struct i915_pmu *pmu, unsigned int gt_id, int sample, u64 
> >>> val)
> >>> -{
> >>> - pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
> >>> -}
> >>> -
> >>> -static void
> >>> -add_sample_mult(struct i915_pmu *pmu, unsigned int gt_id, int sample, 
> >>> u32 val, u32 mul)
> >>> -{
> >>> - pmu->sample[__sample_idx(pmu, gt_id, sample)].cur += mul_u32_u32(val, 
> >>> mul);
> >>> -}
> >>
> >> IMO read and store helpers could have stayed and just changed the
> >> implementation. Like add_sample_mult which you just moved. It would have
> >> been a smaller patch. So dunno.. a bit of a reluctant r-b.
> >
> > Are you referring just to add_sample_mult or to all the other functions
> > too? add_sample_mult I moved it to where it was before bc4be0a38b63
>
> Read and store helpers.
>
> > ("drm/i915/pmu: Prepare for multi-tile non-engine counters"), could have
> > left it here I guess.
> >
> > The other read and store helpers are not needed with the 2-d array at all
> > since the compiler itself will do that, so I thought it was better to get
> > rid of them completely.
>
> Yes I get it, just that I didn't see the benefit of removing them.
>
> For example:
>
>  -store_sample(pmu, gt_id, __I915_SAMPLE_RC6, val);
>  +pmu->sample[gt_id][__I915_SAMPLE_RC6].cur = val;
>
> It's a meh for me. Either flavour looks fine to me so I would have erred on
> the side of keeping the patch small. If anything I probably slightly prefer
> that the struct pmu_sample implementation was able to be changed with less
> churn before. For example. But a very minor argument really.

OK, I finally understood and have made this change in Patch v2. Please take
a look.

>
> Or maybe next step is get rid of the struct i915_pmu_sample. It is a struct
> because originally previous value was tracked too. Then I removed that and
> it was easier to keep the struct. I guess it can go now and then the
> removal of helpers here will look somewhat nicer without the trailing .cur
> on every affected line.

I have left this as is for now in case struct i915_pmu_sample needs to be
expanded again. Should be ok with the read/store helpers.

>
> > Let me know if you want any changes, otherwise I will leave as is.
>
> You can leave it as is, I don't mind much.

I went ahead and changed it anyway since you seemed to want it.

Thanks.
--
Ashutosh

>
> >> Reviewed-by: Tvrtko Ursulin 
> >
> > Thanks for the review. Thanks Andrzej too :)
> > --
> > Ashutosh
> >
> >>> -
> >>>static u64 get_rc6(struct intel_gt *gt)

Re: [Intel-gfx] [PATCH 1/2] drm/i915/pmu: Turn off the timer to sample frequencies when GT is parked

2023-05-24 Thread Dixit, Ashutosh
On Wed, 24 May 2023 02:12:31 -0700, Andrzej Hajda wrote:
>

Hi Andrzej,

> On 23.05.2023 17:19, Ashutosh Dixit wrote:
> > pmu_needs_timer() keeps the timer running even when GT is parked,
> > ostensibly to sample requested/actual frequencies. However
> > frequency_sample() has the following:
> >
> > /* Report 0/0 (actual/requested) frequency while parked. */
> > if (!intel_gt_pm_get_if_awake(gt))
> > return;
> >
> > The above code prevents frequencies from being sampled while the GT is
> > parked. So we might as well turn off the sampling timer itself in this
> > case and save CPU cycles/power.
> >
> > v2: Instead of turning freq bits off, return false, since no counters will
> >  run after this change when GT is parked (Tvrtko)
> >
> > Signed-off-by: Ashutosh Dixit 
> > Reviewed-by: Tvrtko Ursulin 
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 12 +---
> >   1 file changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > b/drivers/gpu/drm/i915/i915_pmu.c
> > index a814583e19fd7..b47d890d4ada1 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -144,6 +144,10 @@ static bool pmu_needs_timer(struct i915_pmu *pmu, bool 
> > gpu_active)
> > struct drm_i915_private *i915 = container_of(pmu, typeof(*i915), pmu);
> > u32 enable;
> >   + /* When GPU is idle, at present no counters need to run */
> > +   if (!gpu_active)
> > +   return false;
> > +
>
> What is then the purpose of calling pmu_needs_timer with 2nd arg false?
> Why not just replace all occurrences of pmu_needs_timer(.., false) with
> false? And remove the 2nd argument.

OK, this didn't seem unreasonable so I went ahead and made this change in
Patch v3. Copying Tvrtko too in case he prefers v2 for any reason. Please
review.
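
As a rough illustration of the two shapes being compared (not the actual v2
or v3 code), here is a small standalone sketch. The demo_* functions and the
DEMO_* masks are hypothetical; the real enable mask handling in
pmu_needs_timer() has more cases than shown.

```c
#include <stdbool.h>

/* Hypothetical bit masks; the real driver masks are not reproduced here. */
#define DEMO_FREQ_MASK		0x3u
#define DEMO_ENGINE_MASK	0xcu

/* Shape with the second argument: an idle GPU returns false early. */
static bool demo_needs_timer_with_arg(unsigned int enable, bool gpu_active)
{
	if (!gpu_active)
		return false;

	return (enable & (DEMO_FREQ_MASK | DEMO_ENGINE_MASK)) != 0;
}

/* Shape suggested in review: no second argument at all. */
static bool demo_needs_timer(unsigned int enable)
{
	return (enable & (DEMO_FREQ_MASK | DEMO_ENGINE_MASK)) != 0;
}

/* A caller that knows the GT is parked no longer needs to ask. */
static bool demo_timer_wanted_when_parked(unsigned int enable)
{
	(void)enable;
	return false;
}
```

For every enable mask the one-argument form matches the two-argument form
called with true, and the parked case is a constant false, so the second
argument carries no extra information.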

Thanks.
--
Ashutosh


>
>
>
> > /*
> >  * Only some counters need the sampling timer.
> >  *
> > @@ -157,17 +161,11 @@ static bool pmu_needs_timer(struct i915_pmu *pmu, 
> > bool gpu_active)
> >  */
> > enable &= frequency_enabled_mask() | ENGINE_SAMPLE_MASK;
> >   - /*
> > -* When the GPU is idle per-engine counters do not need to be
> > -* running so clear those bits out.
> > -*/
> > -   if (!gpu_active)
> > -   enable &= ~ENGINE_SAMPLE_MASK;
> > /*
> >  * Also there is software busyness tracking available we do not
> >  * need the timer for I915_SAMPLE_BUSY counter.
> >  */
> > -   else if (i915->caps.scheduler & I915_SCHEDULER_CAP_ENGINE_BUSY_STATS)
> > +   if (i915->caps.scheduler & I915_SCHEDULER_CAP_ENGINE_BUSY_STATS)
> > enable &= ~BIT(I915_SAMPLE_BUSY);
> > /*
>


Re: [PATCH 2/2] drm/i915/pmu: Make PMU sample array two-dimensional

2023-05-24 Thread Dixit, Ashutosh
On Wed, 24 May 2023 04:38:18 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 23/05/2023 16:19, Ashutosh Dixit wrote:
> > No functional changes but we can remove some unsightly index computation
> > and read/write functions if we convert the PMU sample array from a
> > one-dimensional to a two-dimensional array.
> >
> > Suggested-by: Tvrtko Ursulin 
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 60 ++---
> >   drivers/gpu/drm/i915/i915_pmu.h |  2 +-
> >   2 files changed, 19 insertions(+), 43 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > b/drivers/gpu/drm/i915/i915_pmu.c
> > index b47d890d4ada1..137e0df9573ee 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -195,33 +195,6 @@ static inline s64 ktime_since_raw(const ktime_t kt)
> > return ktime_to_ns(ktime_sub(ktime_get_raw(), kt));
> >   }
> >   -static unsigned int
> > -__sample_idx(struct i915_pmu *pmu, unsigned int gt_id, int sample)
> > -{
> > -   unsigned int idx = gt_id * __I915_NUM_PMU_SAMPLERS + sample;
> > -
> > -   GEM_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> > -
> > -   return idx;
> > -}
> > -
> > -static u64 read_sample(struct i915_pmu *pmu, unsigned int gt_id, int 
> > sample)
> > -{
> > -   return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
> > -}
> > -
> > -static void
> > -store_sample(struct i915_pmu *pmu, unsigned int gt_id, int sample, u64 val)
> > -{
> > -   pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
> > -}
> > -
> > -static void
> > -add_sample_mult(struct i915_pmu *pmu, unsigned int gt_id, int sample, u32 
> > val, u32 mul)
> > -{
> > -   pmu->sample[__sample_idx(pmu, gt_id, sample)].cur += mul_u32_u32(val, 
> > mul);
> > -}
>
> IMO read and store helpers could have stayed and just changed the
> implementation. Like add_sample_mult which you just moved. It would have
> been a smaller patch. So dunno.. a bit of a reluctant r-b.

Are you referring just to add_sample_mult or to all the other functions
too? I moved add_sample_mult back to where it was before bc4be0a38b63
("drm/i915/pmu: Prepare for multi-tile non-engine counters"); I could have
left it here I guess.

The other read and store helpers are not needed with the 2-d array at all
since the compiler itself does the index computation, so I thought it was
better to get rid of them completely.
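
To illustrate the point (this is not code from the patch): a minimal
userspace sketch, assuming made-up sizes NUM_GT and NUM_SAMPLERS and
hypothetical names flat, grid and same_offset, showing that the hand-written
index and the two subscripts address the same element.

```c
#include <stdint.h>

/* Hypothetical sizes, chosen only for this illustration. */
#define NUM_GT       2
#define NUM_SAMPLERS 5

struct sample { uint64_t cur; };

/* Old layout: flat array, index computed by hand. */
static struct sample flat[NUM_GT * NUM_SAMPLERS];

static unsigned int sample_idx(unsigned int gt_id, int sample)
{
	return gt_id * NUM_SAMPLERS + sample;
}

/* New layout: the compiler derives the same offset from the two subscripts. */
static struct sample grid[NUM_GT][NUM_SAMPLERS];

/* Returns 1 when both layouts place the element at the same byte offset. */
static int same_offset(unsigned int gt_id, int sample)
{
	return (char *)&flat[sample_idx(gt_id, sample)] - (char *)flat ==
	       (char *)&grid[gt_id][sample] - (char *)grid;
}
```

The only thing lost with the helpers is the GEM_BUG_ON bounds check, which
the sketch does not model.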

Let me know if you want any changes, otherwise I will leave as is.

> Reviewed-by: Tvrtko Ursulin 

Thanks for the review. Thanks Andrzej too :)
--
Ashutosh

> > -
> >   static u64 get_rc6(struct intel_gt *gt)
> >   {
> > struct drm_i915_private *i915 = gt->i915;
> > @@ -240,7 +213,7 @@ static u64 get_rc6(struct intel_gt *gt)
> > spin_lock_irqsave(&pmu->lock, flags);
> > if (awake) {
> > -   store_sample(pmu, gt_id, __I915_SAMPLE_RC6, val);
> > +   pmu->sample[gt_id][__I915_SAMPLE_RC6].cur = val;
> > } else {
> > /*
> >  * We think we are runtime suspended.
> > @@ -250,13 +223,13 @@ static u64 get_rc6(struct intel_gt *gt)
> >  * counter value.
> >  */
> > val = ktime_since_raw(pmu->sleep_last[gt_id]);
> > -   val += read_sample(pmu, gt_id, __I915_SAMPLE_RC6);
> > +   val += pmu->sample[gt_id][__I915_SAMPLE_RC6].cur;
> > }
> >   - if (val < read_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED))
> > -   val = read_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED);
> > +   if (val < pmu->sample[gt_id][__I915_SAMPLE_RC6_LAST_REPORTED].cur)
> > +   val = pmu->sample[gt_id][__I915_SAMPLE_RC6_LAST_REPORTED].cur;
> > else
> > -   store_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED, val);
> > +   pmu->sample[gt_id][__I915_SAMPLE_RC6_LAST_REPORTED].cur = val;
> > spin_unlock_irqrestore(&pmu->lock, flags);
> >   @@ -275,9 +248,8 @@ static void init_rc6(struct i915_pmu *pmu)
> > with_intel_runtime_pm(gt->uncore->rpm, wakeref) {
> > u64 val = __get_rc6(gt);
> >   - store_sample(pmu, i, __I915_SAMPLE_RC6, val);
> > -   store_sample(pmu, i, __I915_SAMPLE_RC6_LAST_REPORTED,
> > -val);
> > +   pmu->sample[i][__I915_SAMPLE_RC6].cur = val;
> > +   pmu->sample[i][__I915_SAMPLE_RC6_LAST_REPORTED].cur = 
> > val;
> > pmu->sleep_last[i] = ktime_get_raw();
> > }
> > }
> > @@ -287,7 +259,7 @@ static void park_rc6(struct intel_gt *gt)
> >   {
> > struct i915_pmu *pmu = &gt->i915->pmu;
> >   - store_sample(pmu, gt->info.id, __I915_SAMPLE_RC6, __get_rc6(gt));
> > +   pmu->sample[gt->info.id][__I915_SAMPLE_RC6].cur = __get_rc6(gt);
> > pmu->sleep_last[gt->info.id] = ktime_get_raw();
> >   }
> >   @@ -428,6 +400,12 @@ engines_sample(struct intel_gt *gt, unsigned int
> > period_ns)
> > }
> >   }
> >   +static void
> > +add_sample_mult(struct 

Re: [PATCH] drm/i915/pmu: Turn off the timer to sample frequencies when GT is parked

2023-05-23 Thread Dixit, Ashutosh
On Fri, 12 May 2023 01:59:08 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 12/05/2023 02:53, Ashutosh Dixit wrote:
> > pmu_needs_timer() keeps the timer running even when GT is parked,
> > ostensibly to sample requested/actual frequencies. However
> > frequency_sample() has the following:
> >
> > /* Report 0/0 (actual/requested) frequency while parked. */
> > if (!intel_gt_pm_get_if_awake(gt))
> > return;
> >
> > The above code prevents frequencies from being sampled while the GT is
> > parked. So we might as well turn off the sampling timer itself in this
> > case and save CPU cycles/power.
>
> The confusing situation seems to be the consequence of b66ecd0438bf
> ("drm/i915/pmu: Report frequency as zero while GPU is sleeping").
>
> Before that commit we were deliberately sampling the frequencies as GPU
> minimum during the parked periods and to do so leaving the timer running.
>
> But then some RPS changes exposed that approach as questionable (AFAIR
> software tracked state stopped being reset to min freq and so created
> wild PMU readings) and we went the route of reporting zero when parked.
>
> At which point running the timer stopped making sense, so really that
> commit should/could have made the change you now propose.
>
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 11 +++
> >   1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > b/drivers/gpu/drm/i915/i915_pmu.c
> > index 7ece883a7d956..8db1d681cf4ab 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -124,11 +124,14 @@ static bool pmu_needs_timer(struct i915_pmu *pmu, 
> > bool gpu_active)
> >   ENGINE_SAMPLE_MASK;
> > /*
> > -* When the GPU is idle per-engine counters do not need to be
> > -* running so clear those bits out.
> > +* When GPU is idle, frequency or per-engine counters do not need
> > +* to be running so clear those bits out.
> >  */
> > -   if (!gpu_active)
> > -   enable &= ~ENGINE_SAMPLE_MASK;
> > +   if (!gpu_active) {
> > +   enable &= ~(config_mask(I915_PMU_ACTUAL_FREQUENCY) |
> > +   config_mask(I915_PMU_REQUESTED_FREQUENCY) |
> > +   ENGINE_SAMPLE_MASK);
> > +   }
> > /*
> >  * Also there is software busyness tracking available we do not
> >  * need the timer for I915_SAMPLE_BUSY counter.
>
> LGTM.
>
> Reviewed-by: Tvrtko Ursulin 
>
> Or maybe it is possible to simplify since it looks like there is no way to return
> true if gt is parked. So that could be:
>
> pmu_needs_timer(..)
> {
>   ...
>
>   if (!gpu_active)
>   return false;
>
>   ...
>   enable = pmu->enable;
>
>   ...
>   enable &= config_mask(I915_PMU_ACTUAL_FREQUENCY) |
> config_mask(I915_PMU_REQUESTED_FREQUENCY) |
> ENGINE_SAMPLE_MASK;
>
>   ...
>   if (i915->caps.scheduler & I915_SCHEDULER_CAP_ENGINE_BUSY_STATS)
>   enable &= ~BIT(I915_SAMPLE_BUSY);
>
>   return enable;
> }
>
> Not sure it is any better, your call.

I have made this change in v2 of the patch submitted here:

https://patchwork.freedesktop.org/series/118225/

And I have retained your R-b since the patch is essentially a copy of your
code above.
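
For readers skimming the thread, here is a self-contained userspace sketch
of the simplified logic. The bit values and the names needs_timer, RC6_BIT
and has_busy_stats are assumptions made for illustration; the driver derives
its bits via config_mask() and checks i915->caps.scheduler instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments; the real ones come from config_mask(). */
#define ENGINE_SAMPLE_MASK   0x7u        /* per-engine samplers */
#define SAMPLE_BUSY_BIT      (1u << 0)
#define ACTUAL_FREQ_BIT      (1u << 3)
#define REQUESTED_FREQ_BIT   (1u << 4)
#define RC6_BIT              (1u << 5)   /* an event that is not timer driven */

static bool needs_timer(uint32_t enabled, bool gpu_active, bool has_busy_stats)
{
	uint32_t enable;

	/* Nothing is sampled while the GT is parked. */
	if (!gpu_active)
		return false;

	/* Only frequency and per-engine events are timer driven. */
	enable = enabled & (ACTUAL_FREQ_BIT | REQUESTED_FREQ_BIT |
			    ENGINE_SAMPLE_MASK);

	/* Software busyness tracking makes the timer unnecessary for BUSY. */
	if (has_busy_stats)
		enable &= ~SAMPLE_BUSY_BIT;

	return enable != 0;
}
```

With the early return, a parked GT never keeps the timer alive, whichever
events are enabled.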

The original version of the patch was here:

https://patchwork.freedesktop.org/series/117658/

Thanks.
--
Ashutosh


Re: [PATCH] drm/i915/perf: Clear out entire reports after reading if not power of 2 size

2023-05-22 Thread Dixit, Ashutosh
On Mon, 22 May 2023 14:34:18 -0700, Umesh Nerlige Ramappa wrote:
>
> On Mon, May 22, 2023 at 01:17:49PM -0700, Ashutosh Dixit wrote:
> > Clearing out report id and timestamp as means to detect unlanded reports
> > only works if report size is power of 2. That is, only when report size is
> > a sub-multiple of the OA buffer size can we be certain that reports will
> > land at the same place each time in the OA buffer (after rewind). If report
> > size is not a power of 2, we need to zero out the entire report to be able
> > to detect unlanded reports reliably.
> >
> > Cc: Umesh Nerlige Ramappa 
> > Signed-off-by: Ashutosh Dixit 
> > ---
> > drivers/gpu/drm/i915/i915_perf.c | 17 +++--
> > 1 file changed, 11 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_perf.c 
> > b/drivers/gpu/drm/i915/i915_perf.c
> > index 19d5652300eeb..58284156428dc 100644
> > --- a/drivers/gpu/drm/i915/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > @@ -877,12 +877,17 @@ static int gen8_append_oa_reports(struct 
> > i915_perf_stream *stream,
> > stream->oa_buffer.last_ctx_id = ctx_id;
> > }
> >
> > -   /*
> > -* Clear out the report id and timestamp as a means to detect 
> > unlanded
> > -* reports.
> > -*/
> > -   oa_report_id_clear(stream, report32);
> > -   oa_timestamp_clear(stream, report32);
> > +   if (is_power_of_2(report_size)) {
> > +   /*
> > +* Clear out the report id and timestamp as a means
> > +* to detect unlanded reports.
> > +*/
> > +   oa_report_id_clear(stream, report32);
> > +   oa_timestamp_clear(stream, report32);
> > +   } else {
> > +   /* Zero out the entire report */
> > +   memset(report32, 0, report_size);
>
> Indeed, this was a bug. For a minute, I started wondering if this is the
> issue I am running into with the other patch posted for DG2, but then I see
> the issue within the first fill of the OA buffer where chunks of the
> reports are zeroed out, so this is a new issue.

Yes, I saw this while reviewing your patch. Also, I thought your issue
was happening on DG2 with a power-of-2 report size; it is only on MTL OAM
that we introduce a non-power-of-2 report size.
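
A small sketch of why the report size matters, modelling the OA buffer as a
plain circular byte buffer. This is a simplification with hypothetical sizes
and names (report_offset, offsets_repeat), not the driver code: it only
checks whether reports written after a wrap start at offsets that were also
report starts on the first pass.

```c
#include <stdbool.h>
#include <stddef.h>

/* Start offset of the n-th report in a circular buffer of buf_size bytes. */
static size_t report_offset(size_t n, size_t report_size, size_t buf_size)
{
	return (n * report_size) % buf_size;
}

/*
 * True when every report written after the first wrap starts at an offset
 * that was also a report start during the first pass, i.e. at a multiple
 * of report_size.
 */
static bool offsets_repeat(size_t report_size, size_t buf_size, size_t passes)
{
	size_t per_pass = (buf_size + report_size - 1) / report_size;
	size_t n;

	for (n = per_pass; n < per_pass * passes; n++) {
		if (report_offset(n, report_size, buf_size) % report_size)
			return false;
	}
	return true;
}
```

With a 1024 byte buffer, 256 byte reports always land on the same offsets,
while 192 byte reports drift after the first wrap, so a cleared report id or
timestamp is no longer where the next report's fields will be.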

> lgtm,
>
> Reviewed-by: Umesh Nerlige Ramappa 

Thanks.
--
Ashutosh

>
> > +   }
> > }
> >
> > if (start_offset != *offset) {
> > --
> > 2.38.0
> >


Re: [PATCH] drm/i915/pmu: Change bitmask of enabled events to u32

2023-05-16 Thread Dixit, Ashutosh
On Tue, 16 May 2023 02:24:45 -0700, Tvrtko Ursulin wrote:
>
> From: Tvrtko Ursulin 
>
> Having it as u64 was a confusing (but harmless) mistake.
>
> Also add some asserts to make sure the internal field does not overflow
> in the future.
>
> Signed-off-by: Tvrtko Ursulin 
> Cc: Ashutosh Dixit 
> Cc: Umesh Nerlige Ramappa 
> ---
> I am not entirely sure the __builtin_constant_p->BUILD_BUG_ON branch will
> work with all compilers. Let's see...
>
> Compile tested only.
> ---
>  drivers/gpu/drm/i915/i915_pmu.c | 32 ++--
>  1 file changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 7ece883a7d95..8736b3418f88 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -50,7 +50,7 @@ static u8 engine_event_instance(struct perf_event *event)
>   return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
>  }
>
> -static bool is_engine_config(u64 config)
> +static bool is_engine_config(const u64 config)
>  {
>   return config < __I915_PMU_OTHER(0);
>  }
> @@ -82,15 +82,28 @@ static unsigned int other_bit(const u64 config)
>
>  static unsigned int config_bit(const u64 config)
>  {
> + unsigned int bit;
> +
>   if (is_engine_config(config))
> - return engine_config_sample(config);
> + bit = engine_config_sample(config);
>   else
> - return other_bit(config);
> + bit = other_bit(config);
> +
> + if (__builtin_constant_p(config))
> + BUILD_BUG_ON(bit >
> +  BITS_PER_TYPE(typeof_member(struct i915_pmu,
> +  enable)) - 1);

Given that config comes from the event (it is event->attr.config), can this
ever be a builtin constant?

> + else
> + WARN_ON_ONCE(bit >
> +  BITS_PER_TYPE(typeof_member(struct i915_pmu,
> +  enable)) - 1);

There is really an even stricter limit on what the bit can be, which is the
total number of possible events but anyway this is good enough. So this
patch is:

Reviewed-by: Ashutosh Dixit 
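
For reference, the runtime half of that check boils down to the following
userspace sketch. struct pmu_sketch, bit_fits_enable and enable_bit are
hypothetical stand-ins, and BITS_PER_TYPE is redefined locally on the
assumption that it means sizeof times CHAR_BIT:

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_TYPE(type) (sizeof(type) * CHAR_BIT)

/* Hypothetical stand-in for the driver's enable field. */
struct pmu_sketch {
	uint32_t enable;
};

/* True when 1 << bit can be represented in the enable field. */
static bool bit_fits_enable(unsigned int bit)
{
	return bit <= BITS_PER_TYPE(((struct pmu_sketch *)0)->enable) - 1;
}

/* Sets the bit only when it fits; reports whether it did. */
static bool enable_bit(struct pmu_sketch *pmu, unsigned int bit)
{
	if (!bit_fits_enable(bit))
		return false;

	pmu->enable |= UINT32_C(1) << bit;
	return true;
}
```

The sketch rejects an out-of-range bit instead of warning, which is a choice
made only to keep the example testable.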

> +
> + return bit;
>  }
>
> -static u64 config_mask(u64 config)
> +static u32 config_mask(const u64 config)
>  {
> - return BIT_ULL(config_bit(config));
> + return BIT(config_bit(config));
>  }
>
>  static bool is_engine_event(struct perf_event *event)
> @@ -633,11 +646,10 @@ static void i915_pmu_enable(struct perf_event *event)
>  {
>   struct drm_i915_private *i915 =
>   container_of(event->pmu, typeof(*i915), pmu.base);
> + const unsigned int bit = event_bit(event);
>   struct i915_pmu *pmu = &i915->pmu;
>   unsigned long flags;
> - unsigned int bit;
>
> - bit = event_bit(event);
>   if (bit == -1)
>   goto update;
>
> @@ -651,7 +663,7 @@ static void i915_pmu_enable(struct perf_event *event)
>   GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
>   GEM_BUG_ON(pmu->enable_count[bit] == ~0);
>
> - pmu->enable |= BIT_ULL(bit);
> + pmu->enable |= BIT(bit);
>   pmu->enable_count[bit]++;
>
>   /*
> @@ -698,7 +710,7 @@ static void i915_pmu_disable(struct perf_event *event)
>  {
>   struct drm_i915_private *i915 =
>   container_of(event->pmu, typeof(*i915), pmu.base);
> - unsigned int bit = event_bit(event);
> + const unsigned int bit = event_bit(event);
>   struct i915_pmu *pmu = &i915->pmu;
>   unsigned long flags;
>
> @@ -734,7 +746,7 @@ static void i915_pmu_disable(struct perf_event *event)
>* bitmask when the last listener on an event goes away.
>*/
>   if (--pmu->enable_count[bit] == 0) {
> - pmu->enable &= ~BIT_ULL(bit);
> + pmu->enable &= ~BIT(bit);
>   pmu->timer_enabled &= pmu_needs_timer(pmu, true);
>   }
>
> --
> 2.39.2
>


Re: [PATCH] drm/i915/guc/slpc: Disable rps_boost debugfs

2023-05-15 Thread Dixit, Ashutosh
On Mon, 15 May 2023 15:58:26 -0700, Dixit, Ashutosh wrote:
>
> On Mon, 15 May 2023 15:23:58 -0700, Belgaumkar, Vinay wrote:
> >
> >
> > On 5/12/2023 5:39 PM, Dixit, Ashutosh wrote:
> > > On Fri, 12 May 2023 16:56:03 -0700, Vinay Belgaumkar wrote:
> > > Hi Vinay,
> > >
> > >> rps_boost debugfs shows host turbo related info. This is not valid
> > >> when SLPC is enabled.
> > > > A couple of thoughts about this. It appears people know only about
> > > > rps_boost_info and don't know about guc_slpc_info? So:
> > >
> > > a. Instead of hiding the rps_boost_info file do we need to print there
> > > saying "SLPC is enabled, go look at guc_slpc_info"?
> > rps_boost_info has an eval() function which disables the interface when RPS
> > is OFF. This is indeed the case here, so shouldn't we just follow that
> > instead of trying to link the two?
> > >
> > > b. Or, even just call guc_slpc_info_show from rps_boost_show (so the two
> > > files will show the same SLPC information)?
> >
> > slpc_info has a lot of other info like the SLPC state, not sure that
> > matches up with the rps_boost_info name.
>
> OK, I have asked in https://gitlab.freedesktop.org/drm/intel/-/issues/7632:
>
> @mattst88: is it acceptable to hide the
> /sys/kernel/debug/dri/0/i915_rps_boost_info file so that it doesn't cause
> confusion. And then user would have to go look at
> /sys/kernel/debug/dri/0/i915_guc_slpc_info or some such file when SLPC is
> being used? That's what the patch above is doing.
>
> Let's see what we hear from @mattst88.

@mattst88 agreed on #intel-gfx IRC, so ok by me:

Reviewed-by: Ashutosh Dixit 

>
> > >
> > >> guc_slpc_info already shows the number of boosts.  Add num_waiters there
> > >> as well and disable rps_boost when SLPC is enabled.
> > >>
> > >> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7632
> > >> Signed-off-by: Vinay Belgaumkar 
>
> Thanks.
> --
> Ashutosh


Re: [PATCH] drm/i915/guc/slpc: Disable rps_boost debugfs

2023-05-15 Thread Dixit, Ashutosh
On Mon, 15 May 2023 15:23:58 -0700, Belgaumkar, Vinay wrote:
>
>
> On 5/12/2023 5:39 PM, Dixit, Ashutosh wrote:
> > On Fri, 12 May 2023 16:56:03 -0700, Vinay Belgaumkar wrote:
> > Hi Vinay,
> >
> >> rps_boost debugfs shows host turbo related info. This is not valid
> >> when SLPC is enabled.
> > > A couple of thoughts about this. It appears people know only about
> > > rps_boost_info and don't know about guc_slpc_info? So:
> >
> > a. Instead of hiding the rps_boost_info file do we need to print there
> > saying "SLPC is enabled, go look at guc_slpc_info"?
> rps_boost_info has an eval() function which disables the interface when RPS
> is OFF. This is indeed the case here, so shouldn't we just follow that
> instead of trying to link the two?
> >
> > b. Or, even just call guc_slpc_info_show from rps_boost_show (so the two
> > files will show the same SLPC information)?
>
> slpc_info has a lot of other info like the SLPC state, not sure that
> matches up with the rps_boost_info name.

OK, I have asked in https://gitlab.freedesktop.org/drm/intel/-/issues/7632:

@mattst88: is it acceptable to hide the
/sys/kernel/debug/dri/0/i915_rps_boost_info file so that it doesn't cause
confusion. And then user would have to go look at
/sys/kernel/debug/dri/0/i915_guc_slpc_info or some such file when SLPC is
being used? That's what the patch above is doing.

Let's see what we hear from @mattst88.

> >
> >> guc_slpc_info already shows the number of boosts.  Add num_waiters there
> >> as well and disable rps_boost when SLPC is enabled.
> >>
> >> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7632
> >> Signed-off-by: Vinay Belgaumkar 

Thanks.
--
Ashutosh


Re: [PATCH] drm/i915/guc/slpc: Disable rps_boost debugfs

2023-05-12 Thread Dixit, Ashutosh
On Fri, 12 May 2023 16:56:03 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> rps_boost debugfs shows host turbo related info. This is not valid
> when SLPC is enabled.

A couple of thoughts about this. It appears people know only about
rps_boost_info and don't know about guc_slpc_info? So:

a. Instead of hiding the rps_boost_info file do we need to print there
   saying "SLPC is enabled, go look at guc_slpc_info"?

b. Or, even just call guc_slpc_info_show from rps_boost_show (so the two
   files will show the same SLPC information)?

Ashutosh


> guc_slpc_info already shows the number of boosts.  Add num_waiters there
> as well and disable rps_boost when SLPC is enabled.
>
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7632
> Signed-off-by: Vinay Belgaumkar 


Re: [Intel-gfx] [PATCH] drm/i915/hwmon: Silence UBSAN uninitialized bool variable warning

2023-05-12 Thread Dixit, Ashutosh
On Fri, 12 May 2023 02:33:33 -0700, Andi Shyti wrote:
>
Hi Andi,
>
> On Thu, May 11, 2023 at 10:43:30AM -0700, Dixit, Ashutosh wrote:
> > On Wed, 10 May 2023 11:36:06 -0700, Ashutosh Dixit wrote:
> > >
> > > Loading i915 on UBSAN enabled kernels (CONFIG_UBSAN/CONFIG_UBSAN_BOOL)
> > > causes the following warning:
> > >
> > >   UBSAN: invalid-load in drivers/gpu/drm/i915/gt/uc/intel_uc.c:558:2
> > >   load of value 255 is not a valid value for type '_Bool'
> > >   Call Trace:
> > >dump_stack_lvl+0x57/0x7d
> > >ubsan_epilogue+0x5/0x40
> > >__ubsan_handle_load_invalid_value.cold+0x43/0x48
> > >__uc_init_hw+0x76a/0x903 [i915]
> > >...
> > >i915_driver_probe+0xfb1/0x1eb0 [i915]
> > >i915_pci_probe+0xbe/0x2d0 [i915]
> > >
> > > The warning happens because during probe i915_hwmon is still not available
> > > which results in the output boolean variable *old remaining
> > > uninitialized.
> >
> > Note that the variable was uninitialized in this case but it was never used
> > uninitialized (the variable was not needed when it was uninitialized). So
> > there was no bug in the code. UBSAN warning is just complaining about the
> > uninitialized variable being passed into a function (where it is not used).
> >
> > Also the variable can be initialized in the caller (__uc_init_hw) too and
> > it will fix this issue. However in __uc_init_hw the assumption is that the
> > variable will be initialized in the callee (i915_hwmon_power_max_disable),
> > so that is how I have done it in this patch.
> >
> > I thought these clarifications would help with the review.
>
> I think we should not just consider what's now but also what can
> come later. The use of pl1en is not 100% future proof and
> therefore your patch, even though now is not fixing anything,
> might avoid wrong uses in the future.
>
> I'm just wondering, though, why not initialize the variable at
> its declaration. As you wish.

OK, in v2 I went ahead and did just that (initializing the variable at the
declaration). I was splitting hairs too much :/
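
A minimal userspace sketch of the pattern, with hypothetical names
(hwmon_sketch, reset_cycle) standing in for the driver structures. The
callee leaves *old untouched on its early-return path, so the caller
initializes it at the declaration and every later load of the bool is valid:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hwmon state; NULL models "not registered yet". */
struct hwmon_sketch {
	bool pl1_enabled;
};

/* Callee only writes *old when it has something to report. */
static void power_max_disable(struct hwmon_sketch *hwmon, bool *old)
{
	if (!hwmon)
		return;	/* *old is left untouched on this path */

	*old = hwmon->pl1_enabled;
	hwmon->pl1_enabled = false;
}

static void power_max_restore(struct hwmon_sketch *hwmon, bool old)
{
	if (!hwmon)
		return;

	hwmon->pl1_enabled = old;
}

/* Caller initializes at the declaration, so passing 'old' by value is safe. */
static bool reset_cycle(struct hwmon_sketch *hwmon)
{
	bool old = false;

	power_max_disable(hwmon, &old);
	/* ... the reset itself would happen here ... */
	power_max_restore(hwmon, old);

	return old;
}
```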

> Reviewed-by: Andi Shyti 

Thanks.
--
Ashutosh


> >
> > > Silence the warning by initializing the variable to an arbitrary value.
> > >
> > > Signed-off-by: Ashutosh Dixit 
> > > ---
> > >  drivers/gpu/drm/i915/i915_hwmon.c | 5 -
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > > b/drivers/gpu/drm/i915/i915_hwmon.c
> > > index a3bdd9f68a458..685663861bc0b 100644
> > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > @@ -502,8 +502,11 @@ void i915_hwmon_power_max_disable(struct 
> > > drm_i915_private *i915, bool *old)
> > >   struct i915_hwmon *hwmon = i915->hwmon;
> > >   u32 r;
> > >
> > > - if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > > + if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit)) {
> > > + /* Fix uninitialized bool variable warning */
> > > + *old = false;
> > >   return;
> > > + }
> > >
> > >   mutex_lock(>hwmon_lock);
> > >
> > > --
> > > 2.38.0
> > >


Re: [Intel-gfx] [PATCH] drm/i915/hwmon: Silence UBSAN uninitialized bool variable warning

2023-05-11 Thread Dixit, Ashutosh
On Wed, 10 May 2023 11:36:06 -0700, Ashutosh Dixit wrote:
>
> Loading i915 on UBSAN enabled kernels (CONFIG_UBSAN/CONFIG_UBSAN_BOOL)
> causes the following warning:
>
>   UBSAN: invalid-load in drivers/gpu/drm/i915/gt/uc/intel_uc.c:558:2
>   load of value 255 is not a valid value for type '_Bool'
>   Call Trace:
>dump_stack_lvl+0x57/0x7d
>ubsan_epilogue+0x5/0x40
>__ubsan_handle_load_invalid_value.cold+0x43/0x48
>__uc_init_hw+0x76a/0x903 [i915]
>...
>i915_driver_probe+0xfb1/0x1eb0 [i915]
>i915_pci_probe+0xbe/0x2d0 [i915]
>
> The warning happens because during probe i915_hwmon is still not available
> which results in the output boolean variable *old remaining
> uninitialized.

Note that the variable was uninitialized in this case but it was never used
uninitialized (the variable was not needed when it was uninitialized). So
there was no bug in the code. UBSAN warning is just complaining about the
uninitialized variable being passed into a function (where it is not used).

Also the variable can be initialized in the caller (__uc_init_hw) too and
it will fix this issue. However in __uc_init_hw the assumption is that the
variable will be initialized in the callee (i915_hwmon_power_max_disable),
so that is how I have done it in this patch.

I thought these clarifications would help with the review.

Thanks.
--
Ashutosh

> Silence the warning by initializing the variable to an arbitrary value.
>
> Signed-off-by: Ashutosh Dixit 
> ---
>  drivers/gpu/drm/i915/i915_hwmon.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> b/drivers/gpu/drm/i915/i915_hwmon.c
> index a3bdd9f68a458..685663861bc0b 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.c
> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> @@ -502,8 +502,11 @@ void i915_hwmon_power_max_disable(struct 
> drm_i915_private *i915, bool *old)
>   struct i915_hwmon *hwmon = i915->hwmon;
>   u32 r;
>
> - if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> + if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit)) {
> + /* Fix uninitialized bool variable warning */
> + *old = false;
>   return;
> + }
>
>   mutex_lock(&hwmon->hwmon_lock);
>
> --
> 2.38.0
>


Re: [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-20 Thread Dixit, Ashutosh
On Thu, 20 Apr 2023 08:43:52 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Thu, Apr 20, 2023 at 08:57:24AM +0100, Tvrtko Ursulin wrote:
> >
> > On 19/04/2023 23:10, Dixit, Ashutosh wrote:
> > > On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote:
> > > >
> > >
> > > Hi Tvrtko,
> > >
> > > > On 10/04/2023 23:35, Ashutosh Dixit wrote:
> > > > > Instead of erroring out when GuC reset is in progress, block waiting 
> > > > > for
> > > > > GuC reset to complete which is a more reasonable uapi behavior.
> > > > >
> > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > > > >
> > > > > Signed-off-by: Ashutosh Dixit 
> > > > > ---
> > > > >drivers/gpu/drm/i915/i915_hwmon.c | 38 
> > > > > +++
> > > > >1 file changed, 33 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > > > > b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > index 9ab8971679fe3..8471a667dfc71 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > > > >   char name[12];
> > > > >   int gt_n;
> > > > >   bool reset_in_progress;
> > > > > + wait_queue_head_t waitq;
> > > > >};
> > > > >  struct i915_hwmon {
> > > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, 
> > > > > long *val)
> > > > >static int
> > > > >hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > > > >{
> > > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > > > +
> > > > > + int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > > >
> > > > Patch looks good to me
> > >
> > > Great, thanks :)
> > >
> > > > apart that I am not sure what is the purpose of the timeout? This is 
> > > > just
> > > > the sysfs write path or has more callers?
> > >
> > > It is just the sysfs path, but the sysfs is accessed also by the oneAPI
> > > stack (Level 0). In the initial version I also didn't have the timeout
> > > thinking that the app can send a signal to the blocked thread to unblock
> > > it. I introduced the timeout after Rodrigo brought it up and I am now
> > > thinking maybe it's better to have the timeout in the driver since the app
> > > has no knowledge of how long GuC resets can take. But I can remove it if
> > > you think it's not needed.
> >
> > Maybe I am missing something but I don't get why we would need to provide a
> > timeout facility in sysfs? If the library writes here to configure something
> > it already has to expect a blocking write by the nature of a write(2) and
> > sysfs contract. It can take long for any reason so I hope we are not
> > guaranteeing some latency number to someone? Or the concern is just about
> > things getting stuck? In which case I think Ctrl-C is the answer because
> > ETIME is not even listed as an errno for write(2).

Hmm, good point.

> I suggested the timeout on the other version because of that race,
> which is fixed now with this approach. It is probably better to remove
> it now to avoid confusions. I'm sorry about that.

No problem, I've removed the timeout in the latest version.

Thanks for the R-b.

Ashutosh


Re: [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-19 Thread Dixit, Ashutosh
On Wed, 19 Apr 2023 12:40:44 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote:
> > On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote:
> > >
> >
> > Hi Rodrigo,
> >
> > > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> > > > Instead of erroring out when GuC reset is in progress, block waiting for
> > > > GuC reset to complete which is a more reasonable uapi behavior.
> > > >
> > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > > >
> > > > Signed-off-by: Ashutosh Dixit 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++
> > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > > > b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > index 9ab8971679fe3..8471a667dfc71 100644
> > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > > > char name[12];
> > > > int gt_n;
> > > > bool reset_in_progress;
> > > > +   wait_queue_head_t waitq;
> > > >  };
> > > >
> > > >  struct i915_hwmon {
> > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long 
> > > > *val)
> > > >  static int
> > > >  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > > >  {
> > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > > +
> > > > +   int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > > > struct i915_hwmon *hwmon = ddat->hwmon;
> > > > intel_wakeref_t wakeref;
> > > > -   int ret = 0;
> > > > +   DEFINE_WAIT(wait);
> > > > u32 nval;
> > > >
> > > > -   mutex_lock(>hwmon_lock);
> > > > -   if (hwmon->ddat.reset_in_progress) {
> > > > -   ret = -EAGAIN;
> > > > -   goto unlock;
> > > > +   /* Block waiting for GuC reset to complete when needed */
> > > > +   for (;;) {
> > > > +   mutex_lock(>hwmon_lock);
> > >
> > > I'm really afraid of how this mutex is handled with the wait queue.
> > > some initial thought it looks like it is trying to reimplement ww_mutex?
> >
> > Sorry, but I am missing the relation with ww_mutex. No such relation is
> > intended.
> >
> > > All other examples of wait_queue usage like this either didn't use
> > > locks or had them in a totally different flow that I could not correlate.
> >
> > Actually there are several examples of prepare_to_wait/finish_wait
> > sequences with both spinlock and mutex in the kernel. See
> > e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read().
> >
> > Also, as I mentioned, except for the lock, the sequence here is identical
> > to intel_guc_wait_for_pending_msg().
> >
> > >
> > > > +
> > > > +   prepare_to_wait(>waitq, , 
> > > > TASK_INTERRUPTIBLE);
> > > > +
> > > > +   if (!hwmon->ddat.reset_in_progress)
> > > > +   break;
> > >
> > > If this breaks we never unlock it?
> >
> > Correct, this is the original case in Patch 2 where the mutex is acquired
> > in the beginning of the function and released just before the final exit
> > from the function (so the mutex is held for the entire duration of the
> > function).
>
> I got really confused here...

Sorry, the patch is a little confusing/tricky but I thought I'd better
stick to the standard 'for (;;)' loop pattern otherwise it will also be
hard to review.

> I looked at the patch 2 again and I don't see any place where the lock
> remains outside of the function. That was what I asked to remove in the
> initial versions.

So it was in Patch 1 where we changed the code to take the lock in the
beginning of the function and release it at the end of the function (you
can see it in Patch 1).

In Patch 2 the 'unlock' label and 'goto unlock' are introduced and the lock
is released at the 'unlock' label (it is visible in Patch 2).

> But now with this one I'm even more confused because I couldn't follow
> to understand who will remove the lock and when.

In Patch 3 again the lock is released at the 'unlock' label (i.e. the
destination of 'got

Re: [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-19 Thread Dixit, Ashutosh
On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 10/04/2023 23:35, Ashutosh Dixit wrote:
> > Instead of erroring out when GuC reset is in progress, block waiting for
> > GuC reset to complete which is a more reasonable uapi behavior.
> >
> > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/i915_hwmon.c | 38 +++
> >   1 file changed, 33 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9ab8971679fe3..8471a667dfc71 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > char name[12];
> > int gt_n;
> > bool reset_in_progress;
> > +   wait_queue_head_t waitq;
> >   };
> > struct i915_hwmon {
> > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long 
> > *val)
> >   static int
> >   hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >   {
> > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > +
> > +   int ret = 0, timeout = GUC_RESET_TIMEOUT;
>
> Patch looks good to me

Great, thanks :)

> apart that I am not sure what is the purpose of the timeout? This is just
> the sysfs write path or has more callers?

It is just the sysfs path, but the sysfs is accessed also by the oneAPI
stack (Level 0). In the initial version I also didn't have the timeout
thinking that the app can send a signal to the blocked thread to unblock
it. I introduced the timeout after Rodrigo brought it up and I am now
thinking maybe it's better to have the timeout in the driver since the app
has no knowledge of how long GuC resets can take. But I can remove it if
you think it's not needed.

> If the
> former perhaps it would be better to just use interruptible everything
> (mutex and sleep) and wait for as long as it takes or until user presses
> Ctrl-C?

Now we are not holding the mutex for long, just long enough to do register
rmw's. So we are not holding the mutex across GuC reset as we were
originally. Therefore I am thinking mutex_lock_interruptible is not needed?
The sleep is already interruptible (TASK_INTERRUPTIBLE).

Anyway please let me know if you think we need to change anything.

Thanks.
--
Ashutosh

> > struct i915_hwmon *hwmon = ddat->hwmon;
> > intel_wakeref_t wakeref;
> > -   int ret = 0;
> > +   DEFINE_WAIT(wait);
> > u32 nval;
> > -   mutex_lock(&hwmon->hwmon_lock);
> > -   if (hwmon->ddat.reset_in_progress) {
> > -   ret = -EAGAIN;
> > -   goto unlock;
> > +   /* Block waiting for GuC reset to complete when needed */
> > +   for (;;) {
> > +   mutex_lock(&hwmon->hwmon_lock);
> > +
> > +   prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > +
> > +   if (!hwmon->ddat.reset_in_progress)
> > +   break;
> > +
> > +   if (signal_pending(current)) {
> > +   ret = -EINTR;
> > +   break;
> > +   }
> > +
> > +   if (!timeout) {
> > +   ret = -ETIME;
> > +   break;
> > +   }
> > +
> > +   mutex_unlock(&hwmon->hwmon_lock);
> > +
> > +   timeout = schedule_timeout(timeout);
> > }
> > +   finish_wait(&ddat->waitq, &wait);
> > +   if (ret)
> > +   goto unlock;
> > +
> > wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> > /* Disable PL1 limit and verify, because the limit cannot be
> > disabled on all platforms */
> > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct 
> > drm_i915_private *i915, bool old)
> > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> >  PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > hwmon->ddat.reset_in_progress = false;
> > +   wake_up_all(&hwmon->ddat.waitq);
> > mutex_unlock(&hwmon->hwmon_lock);
> >   }
> > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > ddat->uncore = &i915->uncore;
> > snprintf(ddat->name, sizeof(ddat->name), "i915");
> > ddat->gt_n = -1;
> > +   init_waitqueue_head(&ddat->waitq);
> > for_each_gt(gt, i915, i) {
> > ddat_gt = hwmon->ddat_gt + i;


Re: [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-18 Thread Dixit, Ashutosh
On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> > Instead of erroring out when GuC reset is in progress, block waiting for
> > GuC reset to complete which is a more reasonable uapi behavior.
> >
> > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++
> >  1 file changed, 33 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9ab8971679fe3..8471a667dfc71 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > char name[12];
> > int gt_n;
> > bool reset_in_progress;
> > +   wait_queue_head_t waitq;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long 
> > *val)
> >  static int
> >  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >  {
> > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > +
> > +   int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > struct i915_hwmon *hwmon = ddat->hwmon;
> > intel_wakeref_t wakeref;
> > -   int ret = 0;
> > +   DEFINE_WAIT(wait);
> > u32 nval;
> >
> > -   mutex_lock(&hwmon->hwmon_lock);
> > -   if (hwmon->ddat.reset_in_progress) {
> > -   ret = -EAGAIN;
> > -   goto unlock;
> > +   /* Block waiting for GuC reset to complete when needed */
> > +   for (;;) {
> > +   mutex_lock(&hwmon->hwmon_lock);
>
> I'm really afraid of how this mutex is handled with the wait queue.
> some initial thought it looks like it is trying to reimplement ww_mutex?

Sorry, but I am missing the relation with ww_mutex. No such relation is
intended.

> all other examples of the wait_queue usages like this or didn't use
> locks or had it in a total different flow that I could not correlate.

Actually there are several examples of prepare_to_wait/finish_wait
sequences with both spinlock and mutex in the kernel. See
e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read().

Also, as I mentioned, except for the lock, the sequence here is identical
to intel_guc_wait_for_pending_msg().

>
> > +
> > +   prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > +
> > +   if (!hwmon->ddat.reset_in_progress)
> > +   break;
>
> If this breaks we never unlock it?

Correct, this is the original case in Patch 2 where the mutex is acquired
in the beginning of the function and released just before the final exit
from the function (so the mutex is held for the entire duration of the
function).

>
> > +
> > +   if (signal_pending(current)) {
> > +   ret = -EINTR;
> > +   break;
> > +   }
> > +
> > +   if (!timeout) {
> > +   ret = -ETIME;
> > +   break;
> > +   }
> > +
> > +   mutex_unlock(&hwmon->hwmon_lock);
>
> do we need to lock the signal pending and timeout as well?
> or only wrapping it around the hwmon->ddat access would be
> enough?

Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress
flag. But because this is not a performance path, implementing it as done
in the patch simplifies the code flow (since there are several if/else,
goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider).

So if possible I *really* want to not try to over-optimize here (I did try
a few other things when writing the patch but it was getting ugly). The
only real requirement is to drop the lock before calling schedule_timeout()
below (and we are reacquiring the lock as soon as we are scheduled back in,
as you can see in the loop above).
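
To make the control flow easier to follow, here is a small user-space toy
model of the loop (not the driver code). Everything named toy_* is
hypothetical and the "ticks" stand in for jiffies. It only shows the order of
the exit checks and that every exit path leaves the loop with the lock held,
while the lock is always dropped before sleeping:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; none of these names exist in i915. */
struct toy_wait {
	int reset_clears_after;	/* tick at which the "reset" finishes */
	int signal_after;	/* tick at which a signal arrives, < 0 means never */
	int now;		/* elapsed ticks */
	int lock_depth;		/* 1 while the toy mutex is held, 0 otherwise */
};

/* Sleep for one tick and return the remaining timeout, like schedule_timeout(). */
static int toy_schedule_timeout(struct toy_wait *w, int timeout)
{
	w->now++;
	return timeout - 1;
}

static int toy_wait_for_reset(struct toy_wait *w, int timeout)
{
	int ret = 0;

	for (;;) {
		w->lock_depth++;			/* mutex_lock() */

		if (w->now >= w->reset_clears_after)	/* !reset_in_progress */
			break;

		if (w->signal_after >= 0 && w->now >= w->signal_after) {
			ret = -EINTR;			/* signal_pending() */
			break;
		}

		if (!timeout) {
			ret = -ETIME;
			break;
		}

		w->lock_depth--;			/* mutex_unlock() before sleeping */
		timeout = toy_schedule_timeout(w, timeout);
	}

	/* All three exits arrive here with the lock held exactly once. */
	return ret;
}
```

The caller is then responsible for the single unlock on the way out, which
is what the existing unlock label in hwm_power_max_write() does.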

>
> > +
> > +   timeout = schedule_timeout(timeout);
> > }
> > +   finish_wait(&ddat->waitq, &wait);
> > +   if (ret)
> > +   goto unlock;
> > +
> > wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> > /* Disable PL1 limit and verify, because the limit cannot be disabled 
> > on all platforms */
> > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct 
> > drm_i915_private *i915, bool old)
> > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> >  PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > hwmon->ddat.reset_in_progress = false;
> > +   wake_up_all(&hwmon->ddat.waitq);
> >
> > mutex_unlock(&hwmon->hwmon_lock);
> >  }
> > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > ddat->uncore = &i915->uncore;
> > snprintf(ddat->name, sizeof(ddat->name), "i915");
> > ddat->gt_n = -1;
> > +   init_waitqueue_head(&ddat->waitq);
> >
> > for_each_gt(gt, i915, i) {
> > ddat_gt = hwmon->ddat_gt + i;
> > --
> > 2.38.0
> >

From what I understand, the locking above is fine and is not the
point. The real race is between schedule_timeout() (which 

Re: [Intel-gfx] [PATCH v3] drm/i915/guc/slpc: Provide sysfs for efficient freq

2023-04-14 Thread Dixit, Ashutosh
On Fri, 14 Apr 2023 15:34:15 -0700, Vinay Belgaumkar wrote:
>
> @@ -457,6 +458,34 @@ int intel_guc_slpc_get_max_freq(struct intel_guc_slpc 
> *slpc, u32 *val)
>   return ret;
>  }
>
> +int intel_guc_slpc_set_ignore_eff_freq(struct intel_guc_slpc *slpc, bool val)
> +{
> + struct drm_i915_private *i915 = slpc_to_i915(slpc);
> + intel_wakeref_t wakeref;
> + int ret = 0;
> +
> + /* Need a lock now since waitboost can be modifying min as well */

Delete comment.

> + mutex_lock(&slpc->lock);

Actually, don't need the lock itself now so delete the lock.

Or, maybe the lock prevents the race if userspace writes to the sysfs when
GuC reset is going on so let's retain the lock. But the comment is wrong.

> + wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> +
> + /* Ignore efficient freq if lower min freq is requested */

Delete comment, it's wrong.

> + ret = slpc_set_param(slpc,
> +  SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
> +  val);
> + if (ret) {
> + guc_probe_error(slpc_to_guc(slpc), "Failed to set efficient 
> freq(%d): %pe\n",
> + val, ERR_PTR(ret));
> + goto out;
> + }
> +
> + slpc->ignore_eff_freq = val;
> +

This extra line can also be deleted.

> +out:
> + intel_runtime_pm_put(&i915->runtime_pm, wakeref);
> + mutex_unlock(&slpc->lock);
> + return ret;
> +}
> +
>  /**
>   * intel_guc_slpc_set_min_freq() - Set min frequency limit for SLPC.
>   * @slpc: pointer to intel_guc_slpc.
> @@ -482,16 +511,6 @@ int intel_guc_slpc_set_min_freq(struct intel_guc_slpc 
> *slpc, u32 val)
>   mutex_lock(&slpc->lock);
>   wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>
> - /* Ignore efficient freq if lower min freq is requested */
> - ret = slpc_set_param(slpc,
> -  SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
> -  val < slpc->rp1_freq);
> - if (ret) {
> - guc_probe_error(slpc_to_guc(slpc), "Failed to toggle efficient 
> freq: %pe\n",
> - ERR_PTR(ret));
> - goto out;
> - }
> -

Great, thanks!

After taking care of the above (and it seems there are also a couple of
checkpatch errors), this is:

Reviewed-by: Ashutosh Dixit 


Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-10 Thread Dixit, Ashutosh
On Fri, 07 Apr 2023 04:04:06 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Wed, Apr 05, 2023 at 09:45:22PM -0700, Ashutosh Dixit wrote:
> > Instead of erroring out when GuC reset is in progress, block waiting for
> > GuC reset to complete which is a more reasonable uapi behavior.
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >  drivers/gpu/drm/i915/i915_hwmon.c | 13 ++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9ab8971679fe3..4343efb48e61b 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > char name[12];
> > int gt_n;
> > bool reset_in_progress;
> > +   wait_queue_head_t wqh;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -400,10 +401,15 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long 
> > val)
> > int ret = 0;
> > u32 nval;
> >
> > +retry:
> > mutex_lock(&hwmon->hwmon_lock);
> > if (hwmon->ddat.reset_in_progress) {
> > -   ret = -EAGAIN;
> > -   goto unlock;
> > +   mutex_unlock(&hwmon->hwmon_lock);
> > +   ret = wait_event_interruptible(ddat->wqh,
> > +  !hwmon->ddat.reset_in_progress);
>
> this is indeed very clever!

Not clever, see below :/

> my fear is probably due to the lack of knowledge on this wait queue, but
> I'm wondering what could go wrong if due to some funny race you enter this
> check right after wake_up_all below has passed and then you will be here
> indefinitely waiting...

You are absolutely right, there is indeed a race in the patch because in
the above code when we drop the mutex (mutex_unlock) the wake_up_all can
happen before we have queued ourselves for the wake up.

Solving this race needs a more complicated prepare_to_wait/finish_wait
sequence which I have gone ahead and implemented in patch v2. The v2 code
is also a standard code pattern and the pattern I have implemented is
basically the same as that in intel_guc_wait_for_pending_msg() in i915
which I liked.

I have read in several places (e.g. in the Advanced Sleeping section in
https://static.lwn.net/images/pdf/LDD3/ch06.pdf and in kernel documentation
for try_to_wake_up()) that this sequence will avoid the race (between
schedule() and wake_up()). The crucial difference from the v1 patch is that
in v2 the mutex is dropped after we queue ourselves in prepare_to_wait()
just before calling schedule_timeout().
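
As a generic illustration of why the ordering matters (a toy model only, not
the i915 code and not a claim about what wait_event_interruptible() does
internally), the function below replays a waiter and a waker step by step.
The names and the three "slots" are made up for the example. The waiter
either queues itself before checking the flag, or checks the flag before
queuing itself, and the waker clears the flag and wakes whoever is queued at
one of three points in time:

```c
#include <stdbool.h>

/*
 * Returns true if the waiter ends up asleep with nobody left to wake it.
 * waker_slot: 0 = waker runs before the waiter's first step,
 *             1 = between the waiter's two steps,
 *             2 = after both waiter steps.
 */
static bool toy_waiter_gets_stuck(bool queue_first, int waker_slot)
{
	bool reset_in_progress = true;
	bool queued = false, woken = false, must_sleep = false;

	for (int slot = 0; slot <= 2; slot++) {
		if (slot == waker_slot) {
			reset_in_progress = false;
			if (queued)		/* wake_up_all() only sees queued waiters */
				woken = true;
		}

		if (slot == 0) {
			if (queue_first)
				queued = true;
			else
				must_sleep = reset_in_progress;
		} else if (slot == 1) {
			if (queue_first)
				must_sleep = reset_in_progress;
			else
				queued = true;
		}
	}

	return must_sleep && !woken;
}
```

With queue_first set, none of the three interleavings leaves the waiter
stuck. With check-then-queue, the waker landing in the gap (slot 1) is the
lost wakeup.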

> maybe just use the timeout version to be on the safeside and then return the
> -EAGAIN on timeout?

Also incorporated timeout in the new version. All code paths in the new
patch have been tested.

Thanks.
--
Ashutosh

> > +   if (ret)
> > +   return ret;
> > +   goto retry;
> > }
> > wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> > @@ -426,7 +432,6 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >  PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> >  exit:
> > intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> > -unlock:
> > mutex_unlock(&hwmon->hwmon_lock);
> > return ret;
> >  }
> > @@ -508,6 +513,7 @@ void i915_hwmon_power_max_restore(struct 
> > drm_i915_private *i915, bool old)
> > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> >  PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > hwmon->ddat.reset_in_progress = false;
> > +   wake_up_all(&hwmon->ddat.wqh);
> >
> > mutex_unlock(&hwmon->hwmon_lock);
> >  }
> > @@ -784,6 +790,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > ddat->uncore = &i915->uncore;
> > snprintf(ddat->name, sizeof(ddat->name), "i915");
> > ddat->gt_n = -1;
> > +   init_waitqueue_head(&ddat->wqh);
> >
> > for_each_gt(gt, i915, i) {
> > ddat_gt = hwmon->ddat_gt + i;
> > --
> > 2.38.0
> >


Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-04-10 Thread Dixit, Ashutosh
On Fri, 07 Apr 2023 04:08:31 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Wed, Apr 05, 2023 at 09:45:21PM -0700, Ashutosh Dixit wrote:
> > On dGfx, the PL1 power limit being enabled and set to a low value results
> > in a low GPU operating freq. It also negates the freq raise operation which
> > is done before GuC firmware load. As a result GuC firmware load can time
> > out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
> > limit was enabled and set to a low value). Therefore disable the PL1 power
> > limit when allowed by HW when loading GuC firmware.
> >
> > v2:
> >  - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
> >  - Add hwm_power_max_restore to error return code path
> >
> > v3 (Jani N):
> >  - Add/remove explanatory comments
> >  - Function renames
> >  - Type corrections
> >  - Locking annotation
> >
> > v4:
> >  - Don't hold the lock across GuC reset (Rodrigo)
> >  - New locking scheme (suggested by Rodrigo)
> >  - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko)
> >
> > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 ++
> >  drivers/gpu/drm/i915/i915_hwmon.c | 40 +++
> >  drivers/gpu/drm/i915/i915_hwmon.h |  7 +
> >  3 files changed, 56 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > index 4ccb4be4c9cba..aa8e35a5636a0 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > @@ -18,6 +18,7 @@
> >  #include "intel_uc.h"
> >
> >  #include "i915_drv.h"
> > +#include "i915_hwmon.h"
> >
> >  static const struct intel_uc_ops uc_ops_off;
> >  static const struct intel_uc_ops uc_ops_on;
> > @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> > struct intel_guc *guc = &uc->guc;
> > struct intel_huc *huc = &uc->huc;
> > int ret, attempts;
> > +   bool pl1en;
>
> we need to initialize this to make warn free builds happy...
> what's our default btw? false? true? we need to read it back?

Yes this was a real bug caught by the kernel build robot. We don't know the
default till we read it back, which would mean exposing a new function. I
have avoided exposing the new function, i.e. I have fixed this by creating a
new (err_rps) label which will make sure that the variable is not used
unless it is initialized. I am not expecting to see warnings from the build
robot with this fix now.
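
Roughly, the idea is the usual ordering of goto labels so that an early
failure skips the restore. The sketch below is a stand-alone toy, not the
actual __uc_init_hw() changes: the toy_* helpers, the err_out label and the
two failure flags are made up for illustration, only err_rps is the label
name mentioned above.

```c
#include <stdbool.h>

static int restore_calls;	/* counts calls to the toy restore helper */

/* Hypothetical stand-ins for i915_hwmon_power_max_disable()/restore(). */
static void toy_power_max_disable(bool *old)
{
	*old = true;
}

static void toy_power_max_restore(bool old)
{
	(void)old;
	restore_calls++;
}

static int toy_init_hw(bool fail_before_disable, bool fail_after_disable)
{
	bool pl1en;	/* deliberately not initialized at declaration */
	int ret = 0;

	if (fail_before_disable) {
		ret = -1;
		goto err_out;	/* pl1en was never written, so skip the restore */
	}

	toy_power_max_disable(&pl1en);

	if (fail_after_disable) {
		ret = -2;
		goto err_rps;	/* pl1en is valid from here on */
	}

	toy_power_max_restore(pl1en);
	return 0;

err_rps:
	toy_power_max_restore(pl1en);
err_out:
	return ret;
}
```

Every path that reaches a restore call has gone through the disable call
first, so the variable is never read before it is written.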

> >
> > GEM_BUG_ON(!intel_uc_supports_guc(uc));
> > GEM_BUG_ON(!intel_uc_wants_guc(uc));
> > @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> > else
> > attempts = 1;
> >
> > +   /* Disable a potentially low PL1 power limit to allow freq to be raised 
> > */
> > +   i915_hwmon_power_max_disable(gt->i915, &pl1en);
> > +
> > intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> >
> > while (attempts--) {
> > @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> > }
> >
> > +   i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> > guc_info(guc, "submission %s\n", 
> > str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
> > guc_info(guc, "SLPC %s\n", 
> > str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
> >
> > @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > /* Return GT back to RPn */
> > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >
> > +   i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> > __uc_sanitize(uc);
> >
> > if (!ret) {
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 7f44e809ca155..9ab8971679fe3 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -50,6 +50,7 @@ struct hwm_drvdata {
> > struct hwm_energy_info ei;  /*  Energy info for 
> > energy1_input */
> > char name[12];
> > int gt_n;
> > +   bool reset_in_progress;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -400,6 +401,10 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > u32 nval;
> >
> > mutex_lock(&hwmon->hwmon_lock);
> > +   if (hwmon->ddat.reset_in_progress) {
> > +   ret = -EAGAIN;
> > +   goto unlock;
> > +   }
> > wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> > /* Disable PL1 limit and verify, because the limit cannot be disabled 
> > on all platforms */
> > @@ -421,6 +426,7 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >  PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> >  exit:
> > intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> > +unlock:
> > mutex_unlock(&hwmon->hwmon_lock);
> > return ret;
> >  }
> > @@ -472,6 +478,40 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, 
> > int chan, long val)
> > }
> >  }
> >
> > +void 

Re: [Intel-gfx] [PATCH] i915/guc/slpc: Provide sysfs for efficient freq

2023-04-05 Thread Dixit, Ashutosh
On Wed, 05 Apr 2023 13:12:29 -0700, Rodrigo Vivi wrote:
>
> On Wed, Apr 05, 2023 at 12:42:30PM -0700, Dixit, Ashutosh wrote:
> > On Wed, 05 Apr 2023 06:57:42 -0700, Rodrigo Vivi wrote:
> > >

Hi Rodrigo,

> >
> > > On Fri, Mar 31, 2023 at 08:11:29PM -0700, Dixit, Ashutosh wrote:
> > > > On Fri, 31 Mar 2023 19:00:49 -0700, Vinay Belgaumkar wrote:
> > > > >
> > > >
> > > > Hi Vinay,
> > > >
> > > > > @@ -478,20 +507,15 @@ int intel_guc_slpc_set_min_freq(struct 
> > > > > intel_guc_slpc *slpc, u32 val)
> > > > >   val > slpc->max_freq_softlimit)
> > > > >   return -EINVAL;
> > > > >
> > > > > + /* Ignore efficient freq if lower min freq is requested */
> > > > > + ret = intel_guc_slpc_set_ignore_eff_freq(slpc, val < 
> > > > > slpc->rp1_freq);
> > > > > + if (ret)
> > > > > + goto out;
> > > > > +
> > > >
> > > > I don't agree with this. If we are now providing an interface 
> > > > explicitly to
> > > > ignore RPe, that should be /only/ way to ignore RPe. There should be no
> > > > other "under the hood" ignoring of RPe. In other words, ignoring RPe 
> > > > should
> > > > be minimized unless explicitly requested.
> > > >
> > > > I don't clearly understand why this was done previously but it makes 
> > > > even
> > > > less sense to me now after this patch.
> > >
> > > well, I had suggested this previously. And just because without this we 
> > > would
> > > be breaking API expectations.
> > >
> > > When user selects a minimal frequency it expect that to stick. But with 
> > > the
> > > efficient freq enabled in guc if minimal is less than the efficient one,
> > > this request is likely ignored.
> > >
> > > Well, even worse is that we are actually caching the request in the soft 
> > > values.
> > > So we show a minimal, but the hardware without any workload is operating 
> > > at
> > > efficient.
> > >
> > > So, the thought process was: 'if user requested a very low minimal, we 
> > > give them
> > > the minimal requested, even if that means to disable the efficient freq.'
> >
> > Hmm, I understand this even less now :)
> >
> > * Why is RPe ignored when min < RPe? Since the freq can be between min and
> >   max? Shouldn't the condition be min > RPe, that is turn RPe off if min
> >   higher that RPe is requested?
>
> that is not how guc efficient freq selection works. (unless my memory is
> tricking me right now.)
>
> So, if we select a min that is between RPe and RP0, guc will respect and
> use the selected min. So we don't need to disable guc selection of the
> efficient.
>
> This is not true when we select a very low min like RPn. If we select RPn
> as min and guc efficient freq selection is enabled guc will simply ignore
> our request. So the only way to give the user what is asked, is to also
> disable guc's efficient freq selection. (I probably confused you in the
> previous email because I used 'RP0' when I meant 'RPn'. I hope it gets
> clear now).
>
> >
> > * Also isn't RPe dynamic, so we can't say RPe == rp1 when using in KMD?
>
> Oh... yeap, this is an issue indeed. Specially with i915 where we have
> the soft values cached instead of asking guc everytime.
>
> That's a good point. The variance is not big, but we will hit corner cases.
> One way is to keep checking and updating everytime a sysfs is touched.

This I believe is not possible in all cases. Say the freq's are set through
sysfs first and the workload starts later. In this case RPe will probably
start changing after the workload starts, not when freq's are set in sysfs.
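
A toy model of what I mean (all names and MHz values are hypothetical, and
the "honoured" rule is just the behavior as described in this thread, not
something taken from GuC documentation): the implicit toggle is decided once
at sysfs write time, so it goes stale when RPe moves later.

```c
#include <stdbool.h>

/* Hypothetical toy state; not the real struct intel_guc_slpc. */
struct toy_slpc {
	unsigned int rpe_mhz;		/* current efficient freq, may change at runtime */
	unsigned int min_softlimit;	/* min requested through sysfs */
	bool ignore_eff;		/* decision cached at sysfs write time */
};

/* Models the implicit toggle: decided once, against the RPe seen at write time. */
static void toy_set_min(struct toy_slpc *s, unsigned int val)
{
	s->ignore_eff = val < s->rpe_mhz;
	s->min_softlimit = val;
}

/* As described in this thread: min sticks if it is >= RPe or RPe is ignored. */
static bool toy_min_is_honoured(const struct toy_slpc *s)
{
	return s->ignore_eff || s->min_softlimit >= s->rpe_mhz;
}
```

E.g. with RPe at 300, writing min = 400 leaves efficient freq enabled and
the min is honoured. If RPe later rises to 600 under load, the same cached
decision no longer gives the user the min they asked for.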

> Other way is do what you are suggesting and let's just accept and deal
> with the reality that is: "we cannot guarantee a min freq selection if user
> doesn't disable the efficient freq selection".
>
> >
> > * Finally, we know that enabling RPe broke the kernel freq API because RPe
> >   could go over max_freq. So it is actually the max freq which is not
> >   obeyed after RPe is enabled.
>
> Oh! so it was my bad memory indeed and everything is the other way around?
> But I just looked to Xe code, my most recent memory, and I just needed
> to toggle the efficient freq off on the case that I mentioned, when min
> selection is below the efficient one. With that all the API

Re: [Intel-gfx] [PATCH] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-04-05 Thread Dixit, Ashutosh
On Tue, 28 Mar 2023 02:14:42 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 27/03/2023 18:47, Rodrigo Vivi wrote:
> >
> > +Daniel
> >
> > On Mon, Mar 27, 2023 at 09:58:52AM -0700, Dixit, Ashutosh wrote:
> >> On Sun, 26 Mar 2023 04:52:59 -0700, Rodrigo Vivi wrote:
> >>>
> >>
> >> Hi Rodrigo,
> >>
> >>> On Fri, Mar 24, 2023 at 04:31:22PM -0700, Dixit, Ashutosh wrote:
> >>>> On Fri, 24 Mar 2023 11:15:02 -0700, Belgaumkar, Vinay wrote:
> >>>>>
> >>>>
> >>>> Hi Vinay,
> >>>>
> >>>> Thanks for the review. Comments inline below.
> >>>>
> >>>>> On 3/15/2023 8:59 PM, Ashutosh Dixit wrote:
> >>>>>> On dGfx, the PL1 power limit being enabled and set to a low value 
> >>>>>> results
> >>>>>> in a low GPU operating freq. It also negates the freq raise operation 
> >>>>>> which
> >>>>>> is done before GuC firmware load. As a result GuC firmware load can 
> >>>>>> time
> >>>>>> out. Such timeouts were seen in the GL #8062 bug below (where the PL1 
> >>>>>> power
> >>>>>> limit was enabled and set to a low value). Therefore disable the PL1 
> >>>>>> power
> >>>>>> limit when allowed by HW when loading GuC firmware.
> >>>>> v3 label missing in subject.
> >>>>>>
> >>>>>> v2:
> >>>>>>- Take mutex (to disallow writes to power1_max) across GuC reset/fw 
> >>>>>> load
> >>>>>>- Add hwm_power_max_restore to error return code path
> >>>>>>
> >>>>>> v3 (Jani N):
> >>>>>>- Add/remove explanatory comments
> >>>>>>- Function renames
> >>>>>>- Type corrections
> >>>>>>- Locking annotation
> >>>>>>
> >>>>>> Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> >>>>>> Signed-off-by: Ashutosh Dixit 
> >>>>>> ---
> >>>>>>drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 +++
> >>>>>>drivers/gpu/drm/i915/i915_hwmon.c | 39 
> >>>>>> +++
> >>>>>>drivers/gpu/drm/i915/i915_hwmon.h |  7 +
> >>>>>>3 files changed, 55 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
> >>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>>>>> index 4ccb4be4c9cba..aa8e35a5636a0 100644
> >>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>>>>> @@ -18,6 +18,7 @@
> >>>>>>#include "intel_uc.h"
> >>>>>>  #include "i915_drv.h"
> >>>>>> +#include "i915_hwmon.h"
> >>>>>>  static const struct intel_uc_ops uc_ops_off;
> >>>>>>static const struct intel_uc_ops uc_ops_on;
> >>>>>> @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> >>>>>>struct intel_guc *guc = &uc->guc;
> >>>>>>struct intel_huc *huc = &uc->huc;
> >>>>>>int ret, attempts;
> >>>>>> +  bool pl1en;
> >>>>>
> >>>>> Init to 'false' here
> >>>>
> >>>> See next comment.
> >>>>
> >>>>>
> >>>>>
> >>>>>>GEM_BUG_ON(!intel_uc_supports_guc(uc));
> >>>>>>GEM_BUG_ON(!intel_uc_wants_guc(uc));
> >>>>>> @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> >>>>>>else
> >>>>>>attempts = 1;
> >>>>>>+   /* Disable a potentially low PL1 power limit to allow freq to be
> >>>>>> raised */
> >>>>>> +  i915_hwmon_power_max_disable(gt->i915, &pl1en);
> >>>>>> +
> >>>>>>intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> >>>>>>while (attempts--) {
> >>>>>> @@ -547,6 +552,8 @@ static int __uc_init_hw

Re: [Intel-gfx] [PATCH] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-04-05 Thread Dixit, Ashutosh
On Mon, 27 Mar 2023 10:47:25 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

Sorry for the delay, I got pulled away into a couple of other things and
could only now get back to this.

>
> +Daniel
>
> On Mon, Mar 27, 2023 at 09:58:52AM -0700, Dixit, Ashutosh wrote:
> > On Sun, 26 Mar 2023 04:52:59 -0700, Rodrigo Vivi wrote:
> > >
> >
> > Hi Rodrigo,
> >
> > > On Fri, Mar 24, 2023 at 04:31:22PM -0700, Dixit, Ashutosh wrote:
> > > > On Fri, 24 Mar 2023 11:15:02 -0700, Belgaumkar, Vinay wrote:
> > > > >
> > > >
> > > > Hi Vinay,
> > > >
> > > > Thanks for the review. Comments inline below.
> > > >
> > > > > On 3/15/2023 8:59 PM, Ashutosh Dixit wrote:
> > > > > > On dGfx, the PL1 power limit being enabled and set to a low value 
> > > > > > results
> > > > > > in a low GPU operating freq. It also negates the freq raise 
> > > > > > operation which
> > > > > > is done before GuC firmware load. As a result GuC firmware load can 
> > > > > > time
> > > > > > out. Such timeouts were seen in the GL #8062 bug below (where the 
> > > > > > PL1 power
> > > > > > limit was enabled and set to a low value). Therefore disable the 
> > > > > > PL1 power
> > > > > > limit when allowed by HW when loading GuC firmware.
> > > > > v3 label missing in subject.
> > > > > >
> > > > > > v2:
> > > > > >   - Take mutex (to disallow writes to power1_max) across GuC 
> > > > > > reset/fw load
> > > > > >   - Add hwm_power_max_restore to error return code path
> > > > > >
> > > > > > v3 (Jani N):
> > > > > >   - Add/remove explanatory comments
> > > > > >   - Function renames
> > > > > >   - Type corrections
> > > > > >   - Locking annotation
> > > > > >
> > > > > > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > > > > > Signed-off-by: Ashutosh Dixit 
> > > > > > ---
> > > > > >   drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 +++
> > > > > >   drivers/gpu/drm/i915/i915_hwmon.c | 39 
> > > > > > +++
> > > > > >   drivers/gpu/drm/i915/i915_hwmon.h |  7 +
> > > > > >   3 files changed, 55 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
> > > > > > b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > > > > > index 4ccb4be4c9cba..aa8e35a5636a0 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > > > > > @@ -18,6 +18,7 @@
> > > > > >   #include "intel_uc.h"
> > > > > > #include "i915_drv.h"
> > > > > > +#include "i915_hwmon.h"
> > > > > > static const struct intel_uc_ops uc_ops_off;
> > > > > >   static const struct intel_uc_ops uc_ops_on;
> > > > > > @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> > > > > > > struct intel_guc *guc = &uc->guc;
> > > > > > > struct intel_huc *huc = &uc->huc;
> > > > > > int ret, attempts;
> > > > > > +   bool pl1en;
> > > > >
> > > > > Init to 'false' here
> > > >
> > > > See next comment.
> > > >
> > > > >
> > > > >
> > > > > > GEM_BUG_ON(!intel_uc_supports_guc(uc));
> > > > > > GEM_BUG_ON(!intel_uc_wants_guc(uc));
> > > > > > @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> > > > > > else
> > > > > > attempts = 1;
> > > > > >   + /* Disable a potentially low PL1 power limit to allow freq to be
> > > > > > raised */
> > > > > > > +   i915_hwmon_power_max_disable(gt->i915, &pl1en);
> > > > > > > +
> > > > > > > intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> > > > > > while (attempts--) {
> > > > > > @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > > > > > >

Re: [Intel-gfx] [PATCH] i915/guc/slpc: Provide sysfs for efficient freq

2023-04-05 Thread Dixit, Ashutosh
On Wed, 05 Apr 2023 06:57:42 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Fri, Mar 31, 2023 at 08:11:29PM -0700, Dixit, Ashutosh wrote:
> > On Fri, 31 Mar 2023 19:00:49 -0700, Vinay Belgaumkar wrote:
> > >
> >
> > Hi Vinay,
> >
> > > @@ -478,20 +507,15 @@ int intel_guc_slpc_set_min_freq(struct 
> > > intel_guc_slpc *slpc, u32 val)
> > >   val > slpc->max_freq_softlimit)
> > >   return -EINVAL;
> > >
> > > + /* Ignore efficient freq if lower min freq is requested */
> > > + ret = intel_guc_slpc_set_ignore_eff_freq(slpc, val < slpc->rp1_freq);
> > > + if (ret)
> > > + goto out;
> > > +
> >
> > I don't agree with this. If we are now providing an interface explicitly to
> > ignore RPe, that should be /only/ way to ignore RPe. There should be no
> > other "under the hood" ignoring of RPe. In other words, ignoring RPe should
> > be minimized unless explicitly requested.
> >
> > I don't clearly understand why this was done previously but it makes even
> > less sense to me now after this patch.
>
> well, I had suggested this previously. And just because without this we would
> be breaking API expectations.
>
> When user selects a minimal frequency it expect that to stick. But with the
> efficient freq enabled in guc if minimal is less than the efficient one,
> this request is likely ignored.
>
> Well, even worse is that we are actually caching the request in the soft 
> values.
> So we show a minimal, but the hardware without any workload is operating at
> efficient.
>
> So, the thought process was: 'if user requested a very low minimal, we give 
> them
> the minimal requested, even if that means to disable the efficient freq.'

Hmm, I understand this even less now :)

* Why is RPe ignored when min < RPe? Since the freq can be between min and
  max? Shouldn't the condition be min > RPe, that is, turn RPe off if a min
  higher than RPe is requested?

* Also isn't RPe dynamic, so we can't say RPe == rp1 when using in KMD?

* Finally, we know that enabling RPe broke the kernel freq API because RPe
  could go over max_freq. So it is actually the max freq which is not
  obeyed after RPe is enabled.

So we ignore RPe in some select cases (which also I don't understand as
mentioned above but maybe I am missing something) to claim that we are
obeying the freq API, but let the freq API stay broken in other cases?

> So, that was introduced to avoid API breakage. Removing it now would mean
> breaking API. (Not sure if the IGT tests for the API got merged already,
> but think that as the API contract).

I think we should take this patch as an opportunity to fix this and give
the user a clean interface to ignore RPe and remove this other implicit way
to ignore RPe. All IGT changes are unmerged at present.

Thanks.
--
Ashutosh



>
> But I do agree with you that having something selected from multiple places
> also has the potential to cause some miss-expectations. So I was thinking
> about multiple even orders where the user select the RP0 as minimal, then
> enable the efficient or vice versa, but I couldn't think of a bad case.
> Or at least not as bad as the user asking to get RP0 as minimal and only
> getting RPe back.
>
> With this in mind, and having checked the code:
>
> Reviewed-by: Rodrigo Vivi 
>
> But I won't push this immediately because I'm still open to hear another
> side/angle.
>
> >
> > Thanks.
> > --
> > Ashutosh
> >
> >
> > >   /* Need a lock now since waitboost can be modifying min as well */
> > > >   mutex_lock(&slpc->lock);
> > > >   wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> > >
> > > - /* Ignore efficient freq if lower min freq is requested */
> > > - ret = slpc_set_param(slpc,
> > > -  SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
> > > -  val < slpc->rp1_freq);
> > > - if (ret) {
> > > > - guc_probe_error(slpc_to_guc(slpc), "Failed to toggle efficient freq: %pe\n",
> > > - ERR_PTR(ret));
> > > - goto out;
> > > - }
> > > -
> > >   ret = slpc_set_param(slpc,
> > >SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
> > >val);


Re: [Intel-gfx] [PATCH] i915/guc/slpc: Provide sysfs for efficient freq

2023-03-31 Thread Dixit, Ashutosh
On Fri, 31 Mar 2023 19:00:49 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> @@ -478,20 +507,15 @@ int intel_guc_slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 val)
>   val > slpc->max_freq_softlimit)
>   return -EINVAL;
>
> + /* Ignore efficient freq if lower min freq is requested */
> + ret = intel_guc_slpc_set_ignore_eff_freq(slpc, val < slpc->rp1_freq);
> + if (ret)
> + goto out;
> +

I don't agree with this. If we are now providing an interface explicitly to
ignore RPe, that should be the /only/ way to ignore RPe. There should be no
other "under the hood" ignoring of RPe. In other words, RPe should not be
ignored unless explicitly requested.

I don't clearly understand why this was done previously but it makes even
less sense to me now after this patch.

Thanks.
--
Ashutosh


>   /* Need a lock now since waitboost can be modifying min as well */
>   mutex_lock(&slpc->lock);
>   wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>
> - /* Ignore efficient freq if lower min freq is requested */
> - ret = slpc_set_param(slpc,
> -  SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
> -  val < slpc->rp1_freq);
> - if (ret) {
> - guc_probe_error(slpc_to_guc(slpc), "Failed to toggle efficient freq: %pe\n",
> - ERR_PTR(ret));
> - goto out;
> - }
> -
>   ret = slpc_set_param(slpc,
>SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
>val);


Re: [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit

2023-03-31 Thread Dixit, Ashutosh
On Fri, 31 Mar 2023 03:23:33 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> > @@ -385,8 +395,22 @@ static int
> >   hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >   {
> > struct i915_hwmon *hwmon = ddat->hwmon;
> > +   intel_wakeref_t wakeref;
> > u32 nval;
> >   + if (val == PL1_DISABLE) {
> > +   /* Disable PL1 limit */
> > +   hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
> > +   PKG_PWR_LIM_1_EN, 0);
> > +
> > +   /* Verify, because PL1 limit cannot be disabled on all platforms */
>
> I think there is a race right here, since above grabbed and released the
> hwmon_lock, anyone can modify it at this point before the verification
> below. Not sure if any consequences worse than a wrong -EPERM are possible
> though.
>
> Also, is EPERM correct for something hardware does not support? We usually
> say ENODEV for such things, IIRC at least.

Changed to -ENODEV in v3.

> Anyway, race looks easily solvable by holding the existing mutex and a
> single rpm ref for the whole rmw-r cycle.

Fixed in v3, thanks for catching these.
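For reference, the locking pattern being asked for (one lock held across
the write and the read-back) can be sketched in userspace as below. This is
an illustration under assumptions rather than the v3 patch: a pthread mutex
stands in for hwmon_lock, a plain integer stands in for the register, and
MODEL_PL1_EN, struct hwmon_model and the model_* helpers are made-up names.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PL1_EN (1u << 15) /* hypothetical enable bit */

struct hwmon_model {
	pthread_mutex_t lock; /* stands in for hwmon_lock */
	uint32_t rapl_limit;  /* stands in for the pkg_rapl_limit register */
	bool en_sticky;       /* models HW that refuses to clear the enable bit */
};

static void model_init(struct hwmon_model *h, uint32_t reg, bool en_sticky)
{
	assert(h != NULL);
	pthread_mutex_init(&h->lock, NULL);
	h->rapl_limit = reg;
	h->en_sticky = en_sticky;
}

/* Unlocked read-modify-write; the caller must hold h->lock. */
static void model_rmw(struct hwmon_model *h, uint32_t clear, uint32_t set)
{
	uint32_t v = (h->rapl_limit & ~clear) | set;

	/* On "sticky" hardware an already set enable bit survives the write. */
	if (h->en_sticky)
		v |= h->rapl_limit & MODEL_PL1_EN;
	h->rapl_limit = v;
}

/*
 * Disable and verify inside one critical section, so no other writer can
 * change the register between the rmw and the read-back.
 */
static int model_pl1_disable(struct hwmon_model *h)
{
	int ret = 0;

	pthread_mutex_lock(&h->lock);
	model_rmw(h, MODEL_PL1_EN, 0);
	if (h->rapl_limit & MODEL_PL1_EN)
		ret = -ENODEV; /* the limit cannot be disabled here */
	pthread_mutex_unlock(&h->lock);

	return ret;
}
```

The read-back result is then always the result of this caller's own write,
so a spurious error caused by a concurrent writer is no longer possible.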

Ashutosh

> > +   with_intel_runtime_pm(ddat->uncore->rpm, wakeref)
> > +   nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
> > +   if (nval & PKG_PWR_LIM_1_EN)
> > +   return -EPERM;
> > +   return 0;
> > +   }
> > +
> > /* Computation in 64-bits to avoid overflow. Round to nearest. */
> > nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
> > nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);


Re: [PATCH] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit

2023-03-30 Thread Dixit, Ashutosh
On Thu, 30 Mar 2023 08:44:34 -0700, Rodrigo Vivi wrote:
>
> On Wed, Mar 29, 2023 at 10:50:09PM -0700, Dixit, Ashutosh wrote:
> > On Tue, 28 Mar 2023 16:35:43 -0700, Ashutosh Dixit wrote:
> > >
> > > > On ATSM the PL1 limit is disabled at power up. The previous uapi assumed
> > > > that the PL1 limit is always enabled and therefore did not have a notion
> > > > of a disabled PL1 limit. This results in erroneous PL1 limit values when
> > > > the PL1 limit is disabled. For example at power up, the disabled ATSM PL1
> > > > limit was previously shown as 0 which means a low PL1 limit whereas the
> > > > limit being disabled actually implies a high effective PL1 limit value.
> > >
> > > To get round this problem, the PL1 limit uapi is expanded to include a
> > > special value 0 to designate a disabled PL1 limit.
> >
> > This patch is another attempt to show when the PL1 power limit is disabled
> > and to disable it when it needs to. Previous abandoned attempts to do this
> > are [1] and [2].
> >
> > The preferred way to do this was [2] but that was NAK'd by hwmon folks (see
> > [2]). That is why here we fall back on the approach in [1].
>
> I still don't get it, but let's move on...
>
> >
> > This patch is identical to [1] except that the value used to disable the
> > PL1 limit has been changed to 0 (from -1 in [1]) as was suggested in [2]
> > (both -1 and 0 seem ok for the purpose).
> >
> > > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
> >
> > The link between this patch and these pretty serious bugs might not be
> > immediately clear so here's an explanation:
> >
> > * Because on ATSM the PL1 power limit is disabled on power up and there
> >   were no means to enable it, in 6fd3d8bf89fc we implemented the means to
> >   enable the limit when the PL1 hwmon entry (power1_max) was written to.
> >
> > * Now there is an IGT igt@i915_hwmon@hwmon_write which (a) reads orig value
> >   from all hwmon sysfs  (b) does a bunch of random writes and finally (c)
> >   restores the orig value read. On ATSM since the orig value was 0, when
> >   the IGT restores the 0 value, the PL1 limit is now enabled with a value
> >   of 0.
> >
> > > * PL1 limit of 0 implies a low PL1 limit which causes GPU freq to fall to
> > >   100 MHz. This causes GuC FW load and several IGTs to start timing out
> > >   and gives rise to the above (and even more) bugs about GuC FW load
> > >   timing out.
>
> I believe these 3 bullets are key information that deserves to be in
> the commit message itself.

Done in v2.

>
> With that there,
>
> Reviewed-by: Rodrigo Vivi 

Thanks.
--
Ashutosh


>
>
> >
> > * After this patch, writing 0 would disable the PL1 limit instead of
> >   enabling it, avoiding the freq drop issue above, and resolving this Intel
> >   CI issue.
> >
> > Thanks.
> > --
> > Ashutosh
> >
> > [1] https://patchwork.freedesktop.org/patch/522612/?series=113972&rev=1
> > [2] https://patchwork.freedesktop.org/patch/522652/?series=113984&rev=1


Re: [PATCH] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit

2023-03-30 Thread Dixit, Ashutosh
On Tue, 28 Mar 2023 16:35:43 -0700, Ashutosh Dixit wrote:
>
> On ATSM the PL1 limit is disabled at power up. The previous uapi assumed
> that the PL1 limit is always enabled and therefore did not have a notion of
> a disabled PL1 limit. This results in erroneous PL1 limit values when the
> PL1 limit is disabled. For example at power up, the disabled ATSM PL1 limit
> was previously shown as 0 which means a low PL1 limit whereas the limit
> being disabled actually implies a high effective PL1 limit value.
>
> To get round this problem, the PL1 limit uapi is expanded to include a
> special value 0 to designate a disabled PL1 limit.

This patch is another attempt to show when the PL1 power limit is disabled
and to disable it when it needs to. Previous abandoned attempts to do this
are [1] and [2].

The preferred way to do this was [2] but that was NAK'd by hwmon folks (see
[2]). That is why here we fall back on the approach in [1].

This patch is identical to [1] except that the value used to disable the
PL1 limit has been changed to 0 (from -1 in [1]) as was suggested in [2]
(both -1 and 0 seem ok for the purpose).

> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8060

The link between this patch and these pretty serious bugs might not be
immediately clear so here's an explanation:

* Because on ATSM the PL1 power limit is disabled on power up and there
  were no means to enable it, in 6fd3d8bf89fc we implemented the means to
  enable the limit when the PL1 hwmon entry (power1_max) was written to.

* Now there is an IGT igt@i915_hwmon@hwmon_write which (a) reads orig value
  from all hwmon sysfs  (b) does a bunch of random writes and finally (c)
  restores the orig value read. On ATSM since the orig value was 0, when
  the IGT restores the 0 value, the PL1 limit is now enabled with a value
  of 0.

* PL1 limit of 0 implies a low PL1 limit which causes GPU freq to fall to
  100 MHz. This causes GuC FW load and several IGTs to start timing out
  and gives rise to the above (and even more) bugs about GuC FW load timing
  out.

* After this patch, writing 0 would disable the PL1 limit instead of
  enabling it, avoiding the freq drop issue above, and resolving this Intel
  CI issue.
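The save/restore sequence in the bullets above can be replayed with a small
userspace model. This is a sketch under assumptions and not the driver
code: the register layout (MODEL_PL1_EN, MODEL_PL1_MASK) and the model_*
helpers are invented for illustration, and the real unit scaling is left
out. Only the "0 means disabled" convention is taken from the patch
description.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PL1_EN   (1u << 15) /* hypothetical enable bit */
#define MODEL_PL1_MASK 0x7fffu    /* hypothetical limit field */
#define PL1_DISABLE    0          /* special uapi value: limit is disabled */

/* Read side: a disabled limit is reported as 0, whatever the field holds. */
static long model_power_max_read(uint32_t reg)
{
	if (!(reg & MODEL_PL1_EN))
		return PL1_DISABLE;

	return (long)(reg & MODEL_PL1_MASK);
}

/* Write side: 0 clears the enable bit, any other value enables the limit. */
static uint32_t model_power_max_write(uint32_t reg, long val)
{
	assert(val >= 0);
	if (val == PL1_DISABLE)
		return reg & ~MODEL_PL1_EN;

	return MODEL_PL1_EN | ((uint32_t)val & MODEL_PL1_MASK);
}
```

A test that reads 0 at power up, writes some other value and then writes
the 0 back ends with the limit disabled again, instead of enabled with a
value of 0.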

Thanks.
--
Ashutosh

[1] https://patchwork.freedesktop.org/patch/522612/?series=113972&rev=1
[2] https://patchwork.freedesktop.org/patch/522652/?series=113984&rev=1


Re: [PATCH] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-03-27 Thread Dixit, Ashutosh
On Sun, 26 Mar 2023 04:52:59 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Fri, Mar 24, 2023 at 04:31:22PM -0700, Dixit, Ashutosh wrote:
> > On Fri, 24 Mar 2023 11:15:02 -0700, Belgaumkar, Vinay wrote:
> > >
> >
> > Hi Vinay,
> >
> > Thanks for the review. Comments inline below.
> >
> > > On 3/15/2023 8:59 PM, Ashutosh Dixit wrote:
> > > > > On dGfx, the PL1 power limit being enabled and set to a low value
> > > > > results in a low GPU operating freq. It also negates the freq raise
> > > > > operation which is done before GuC firmware load. As a result GuC
> > > > > firmware load can time out. Such timeouts were seen in the GL #8062 bug
> > > > > below (where the PL1 power limit was enabled and set to a low value).
> > > > > Therefore disable the PL1 power limit when allowed by HW when loading
> > > > > GuC firmware.
> > > > v3 label missing in subject.
> > > > >
> > > > > v2:
> > > > >   - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
> > > >   - Add hwm_power_max_restore to error return code path
> > > >
> > > > v3 (Jani N):
> > > >   - Add/remove explanatory comments
> > > >   - Function renames
> > > >   - Type corrections
> > > >   - Locking annotation
> > > >
> > > > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > > > Signed-off-by: Ashutosh Dixit 
> > > > ---
> > > >   drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 +++
> > > >   drivers/gpu/drm/i915/i915_hwmon.c | 39 +++
> > > >   drivers/gpu/drm/i915/i915_hwmon.h |  7 +
> > > >   3 files changed, 55 insertions(+)
> > > >
> > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > > > index 4ccb4be4c9cba..aa8e35a5636a0 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > > > @@ -18,6 +18,7 @@
> > > >   #include "intel_uc.h"
> > > > #include "i915_drv.h"
> > > > +#include "i915_hwmon.h"
> > > > static const struct intel_uc_ops uc_ops_off;
> > > >   static const struct intel_uc_ops uc_ops_on;
> > > > @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> > > > > struct intel_guc *guc = &uc->guc;
> > > > > struct intel_huc *huc = &uc->huc;
> > > > int ret, attempts;
> > > > +   bool pl1en;
> > >
> > > Init to 'false' here
> >
> > See next comment.
> >
> > >
> > >
> > > > GEM_BUG_ON(!intel_uc_supports_guc(uc));
> > > > GEM_BUG_ON(!intel_uc_wants_guc(uc));
> > > > @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> > > > else
> > > > attempts = 1;
> > > > > +   /* Disable a potentially low PL1 power limit to allow freq to be raised */
> > > > > +   i915_hwmon_power_max_disable(gt->i915, &pl1en);
> > > > > +
> > > > > intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> > > > > while (attempts--) {
> > > > > @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > > > > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> > > > > }
> > > > > +   i915_hwmon_power_max_restore(gt->i915, pl1en);
> > > > > +
> > > > > guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
> > > > > guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
> > > > > @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > > > > /* Return GT back to RPn */
> > > > > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> > > > > +   i915_hwmon_power_max_restore(gt->i915, pl1en);
> > >
> > > if (pl1en)
> > >
> > >     i915_hwmon_power_max_enable().
> >
> > IMO it's better not to have checks in the main __uc_init_hw() function (if
> > we do this we'll need to add 2 checks in __uc_init_hw()). If you really
> > want we could do something like this inside
> > i915_hwmon_power_max_disable/i915_hwmon_power_max_restore. But for now I
> > am not making any changes.

Re: [PATCH] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-03-27 Thread Dixit, Ashutosh
On Fri, 24 Mar 2023 17:06:33 -0700, Belgaumkar, Vinay wrote:
>

Hi Vinay,

> On 3/24/2023 4:31 PM, Dixit, Ashutosh wrote:
> > On Fri, 24 Mar 2023 11:15:02 -0700, Belgaumkar, Vinay wrote:
> > Hi Vinay,
> >
> > Thanks for the review. Comments inline below.
> Sorry about asking the same questions all over again :) Didn't look at
> previous versions.

Np, the previous versions were buried somewhere anyway that's why I
provided the link.

> >
> >> On 3/15/2023 8:59 PM, Ashutosh Dixit wrote:
> >>> On dGfx, the PL1 power limit being enabled and set to a low value results
> >>> in a low GPU operating freq. It also negates the freq raise operation
> >>> which is done before GuC firmware load. As a result GuC firmware load can
> >>> time out. Such timeouts were seen in the GL #8062 bug below (where the PL1
> >>> power limit was enabled and set to a low value). Therefore disable the PL1
> >>> power limit when allowed by HW when loading GuC firmware.
> >> v3 label missing in subject.
> >>> v2:
> >>>- Take mutex (to disallow writes to power1_max) across GuC reset/fw load
> >>>- Add hwm_power_max_restore to error return code path
> >>>
> >>> v3 (Jani N):
> >>>- Add/remove explanatory comments
> >>>- Function renames
> >>>- Type corrections
> >>>- Locking annotation
> >>>
> >>> Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> >>> Signed-off-by: Ashutosh Dixit 
> >>> ---
> >>>drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 +++
> >>>drivers/gpu/drm/i915/i915_hwmon.c | 39 +++
> >>>drivers/gpu/drm/i915/i915_hwmon.h |  7 +
> >>>3 files changed, 55 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>> index 4ccb4be4c9cba..aa8e35a5636a0 100644
> >>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>> @@ -18,6 +18,7 @@
> >>>#include "intel_uc.h"
> >>>  #include "i915_drv.h"
> >>> +#include "i915_hwmon.h"
> >>>  static const struct intel_uc_ops uc_ops_off;
> >>>static const struct intel_uc_ops uc_ops_on;
> >>> @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> >>>   struct intel_guc *guc = &uc->guc;
> >>>   struct intel_huc *huc = &uc->huc;
> >>>   int ret, attempts;
> >>> + bool pl1en;
> >> Init to 'false' here
> > See next comment.
> >
> >>
> >>>   GEM_BUG_ON(!intel_uc_supports_guc(uc));
> >>>   GEM_BUG_ON(!intel_uc_wants_guc(uc));
> >>> @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> >>>   else
> >>>   attempts = 1;
> >>> + /* Disable a potentially low PL1 power limit to allow freq to be raised */
> >>> + i915_hwmon_power_max_disable(gt->i915, &pl1en);
> >>> +
> >>>   intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> >>>   while (attempts--) {
> >>> @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> >>>   intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >>>   }
> >>> + i915_hwmon_power_max_restore(gt->i915, pl1en);
> >>> +
> >>>   guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
> >>>   guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
> >>> @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> >>>   /* Return GT back to RPn */
> >>>   intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >>> + i915_hwmon_power_max_restore(gt->i915, pl1en);
> >> if (pl1en)
> >>
> >>      i915_hwmon_power_max_enable().
> > IMO it's better not to have checks in the main __uc_init_hw() function (if
> > we do this we'll need to add 2 checks in __uc_init_hw()). If you really
> > want we could do something like this inside
> > i915_hwmon_power_max_disable/i915_hwmon_power_max_restore. But for now I
> > am not making any changes.
> ok.
> >
> > (I can send a patch with the changes if you want to take a look b

Re: [PATCH] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-03-24 Thread Dixit, Ashutosh
On Fri, 24 Mar 2023 11:15:02 -0700, Belgaumkar, Vinay wrote:
>

Hi Vinay,

Thanks for the review. Comments inline below.

> On 3/15/2023 8:59 PM, Ashutosh Dixit wrote:
> > On dGfx, the PL1 power limit being enabled and set to a low value results
> > in a low GPU operating freq. It also negates the freq raise operation which
> > is done before GuC firmware load. As a result GuC firmware load can time
> > out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
> > limit was enabled and set to a low value). Therefore disable the PL1 power
> > limit when allowed by HW when loading GuC firmware.
> v3 label missing in subject.
> >
> > v2:
> >   - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
> >   - Add hwm_power_max_restore to error return code path
> >
> > v3 (Jani N):
> >   - Add/remove explanatory comments
> >   - Function renames
> >   - Type corrections
> >   - Locking annotation
> >
> > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 +++
> >   drivers/gpu/drm/i915/i915_hwmon.c | 39 +++
> >   drivers/gpu/drm/i915/i915_hwmon.h |  7 +
> >   3 files changed, 55 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > index 4ccb4be4c9cba..aa8e35a5636a0 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > @@ -18,6 +18,7 @@
> >   #include "intel_uc.h"
> > #include "i915_drv.h"
> > +#include "i915_hwmon.h"
> > static const struct intel_uc_ops uc_ops_off;
> >   static const struct intel_uc_ops uc_ops_on;
> > @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> > struct intel_guc *guc = &uc->guc;
> > struct intel_huc *huc = &uc->huc;
> > int ret, attempts;
> > +   bool pl1en;
>
> Init to 'false' here

See next comment.

>
>
> > GEM_BUG_ON(!intel_uc_supports_guc(uc));
> > GEM_BUG_ON(!intel_uc_wants_guc(uc));
> > @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> > else
> > attempts = 1;
> > +   /* Disable a potentially low PL1 power limit to allow freq to be raised */
> > +   i915_hwmon_power_max_disable(gt->i915, &pl1en);
> > +
> > intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> > while (attempts--) {
> > @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> > }
> > +   i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> > guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
> > guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
> > @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > /* Return GT back to RPn */
> > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> > +   i915_hwmon_power_max_restore(gt->i915, pl1en);
>
> if (pl1en)
>
>     i915_hwmon_power_max_enable().

IMO it's better not to have checks in the main __uc_init_hw() function (if
we do this we'll need to add 2 checks in __uc_init_hw()). If you really
want we could do something like this inside
i915_hwmon_power_max_disable/i915_hwmon_power_max_restore. But for now I
am not making any changes.

(I can send a patch with the changes if you want to take a look but IMO it
will add more logic/code but without real benefits (it will save a rmw if
the limit was already disabled, but IMO this code is called so infrequently
(only during GuC resets) as to not have any significant impact)).
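As an illustration of the point about keeping checks out of the caller, here
is a userspace sketch of a disable/restore pair. It is not the patch itself:
a pthread mutex stands in for hwmon_lock, a plain integer for the register,
and MODEL_PL1_EN, struct hwmon_model and the model_* names are hypothetical.
The helpers bail out on their own when hwmon is absent, and the lock is
held from disable to restore so the limit cannot be rewritten in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PL1_EN (1u << 15) /* hypothetical enable bit */

struct hwmon_model {
	pthread_mutex_t lock; /* stands in for hwmon_lock */
	uint32_t rapl_limit;  /* stands in for the pkg_rapl_limit register */
};

static void model_init(struct hwmon_model *h, uint32_t reg)
{
	assert(h != NULL);
	pthread_mutex_init(&h->lock, NULL);
	h->rapl_limit = reg;
}

/* Returns with h->lock held; *old records whether the limit was enabled. */
static void model_power_max_disable(struct hwmon_model *h, bool *old)
{
	assert(old != NULL);
	if (!h) /* no hwmon: nothing to do and no lock to take */
		return;

	pthread_mutex_lock(&h->lock);
	*old = (h->rapl_limit & MODEL_PL1_EN) != 0;
	h->rapl_limit &= ~MODEL_PL1_EN;
}

/* Re-enables the limit only if it was enabled before, then drops the lock. */
static void model_power_max_restore(struct hwmon_model *h, bool old)
{
	if (!h)
		return;

	if (old)
		h->rapl_limit |= MODEL_PL1_EN;
	pthread_mutex_unlock(&h->lock);
}
```

The caller then stays a plain disable ... restore sequence on every path,
with no conditionals of its own.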

>
> > +
> > __uc_sanitize(uc);
> > if (!ret) {
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > index ee63a8fd88fc1..769b5bda4d53f 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -444,6 +444,45 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int chan, long val)
> > }
> >   }
> > +void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old)
> Shouldn't we call this i915_hwmon_package_pl1_disable()?

I did think of using "pl1" in the function name but then decided to retain
"power_max" because other hwmon functions for PL1 limit also use
"power_max" (hwm_power_max_read/hwm_power_max_write) and currently
"hwmon_power_max" is mapped to the PL1 limit. So "power_max" is used to
show that all these functions deal with the PL1 power limit.

There is a comment in __uc_init_hw() explaining "power_max" means the PL1
power limit.

> > +   __acquires(i915->hwmon->hwmon_lock)
> > +{
> > +   struct i915_hwmon *hwmon = i915->hwmon;
> > +   intel_wakeref_t wakeref;
> > +   u32 r;
> > +
> > +   if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > +   return;
> > +
> > +   /* Take mutex to prevent concurrent hwm_power_max_write */
> > +   

Re: Reverted patch reappears!

2023-03-18 Thread Dixit, Ashutosh
On Fri, 17 Mar 2023 20:28:58 -0700, Dixit, Ashutosh wrote:
>
> Jani/Rodrigo,
>
> Original Subject: Re: [Intel-gfx] [PATCH] Revert "drm/i915/hwmon: Enable PL1 power limit"
>
> On Wed, 15 Feb 2023 09:19:07 -0800, Rodrigo Vivi wrote:
> >
> > On Wed, Feb 15, 2023 at 08:24:51AM -0800, Dixit, Ashutosh wrote:
> > > On Wed, 15 Feb 2023 07:37:30 -0800, Jani Nikula wrote:
> > > >
> > > > On Wed, 08 Feb 2023, Rodrigo Vivi  wrote:
> > > > > On Wed, Feb 08, 2023 at 11:03:12AM -0800, Ashutosh Dixit wrote:
> > > > >> This reverts commit 0349c41b05968befaffa5fbb7e73d0ee6004f610.
> > > > >>
> > > > > >> 0349c41b0596 ("drm/i915/hwmon: Enable PL1 power limit") is incorrect
> > > > > >> and caused a major regression on ATSM. The change enabled the PL1
> > > > > >> power limit but FW sets the default value of the PL1 limit to 0 which
> > > > > >> implies HW now works at minimum power and therefore the lowest
> > > > > >> effective frequency. This means all workloads now run slower resulting
> > > > > >> in even GuC FW load operations timing out, rendering ATSM unusable.
> > > > > >>
> > > > > >> A different solution to the original issue of the PL1 limit being
> > > > > >> disabled on ATSM is needed but till that is developed, revert
> > > > > >> 0349c41b0596.
> > > > >>
> > > > >> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > > > >
> > > > > pushed to drm-intel-next and removed from drm-intel-fixes.
> > > > >
> > > > > Thanks for the quick reaction.
> > > >
> > > > Please always add Fixes: tags also to reverts.
> > > >
> > > > I suppose we should fix dim to also detect reverts, but I ended up
> > > > cherry-picking and pushing the original commit out to
> > > > drm-intel-next-fixes before realizing it's been reverted.
> > >
> > > Oops, sorry!
> >
> > > That's my mistake. I should have thought about this when pushing
> > > and removing from the fixes. I just realized it when this patch
> > > showed up in my -fixes cherry-pick again, but without the revert.
> >
> > I'm sorry.
>
> Not sure if it's related to this, but the reverted patch below has
> reappeared on drm-tip. Newest on top:
>
> ee892ea83d996 drm/i915/hwmon: Enable PL1 power limit
> 05d5562e401eb Revert "drm/i915/hwmon: Enable PL1 power limit"
> 0349c41b05968 drm/i915/hwmon: Enable PL1 power limit
>
> The new patch is:
>
> commit ee892ea83d99610fa33bea612de058e0955eec3a
> Author: Ashutosh Dixit 
> AuthorDate: Fri Feb 3 07:53:09 2023 -0800
> Commit: Jani Nikula 
> CommitDate: Mon Mar 13 11:38:05 2023 +0200
>
> drm/i915/hwmon: Enable PL1 power limit
>
> Sorry I couldn't track which branch this new patch came from (looks
> like drm-tip itself?).
>
> This is breaking ATSM again:
>
> https://intel-gfx-ci.01.org/tree/drm-tip/bat-atsm-1.html
>
> so needs to be reverted again and stay reverted. I could send a revert or
> any of you can also do it.

I have sent out the revert of ee892ea83d996:

https://patchwork.freedesktop.org/series/113793/

ee892ea83d996 is also present in Linus' tree (in v6.3-rc2) so will need to
be reverted there too. The previous two commits (the original commit and
its revert) are not present in Linus' tree, at least yet.

Thanks.
--
Ashutosh


Reverted patch reappears!

2023-03-17 Thread Dixit, Ashutosh
Jani/Rodrigo,

Original Subject: Re: [Intel-gfx] [PATCH] Revert "drm/i915/hwmon: Enable PL1 power limit"

On Wed, 15 Feb 2023 09:19:07 -0800, Rodrigo Vivi wrote:
>
> On Wed, Feb 15, 2023 at 08:24:51AM -0800, Dixit, Ashutosh wrote:
> > On Wed, 15 Feb 2023 07:37:30 -0800, Jani Nikula wrote:
> > >
> > > On Wed, 08 Feb 2023, Rodrigo Vivi  wrote:
> > > > On Wed, Feb 08, 2023 at 11:03:12AM -0800, Ashutosh Dixit wrote:
> > > >> This reverts commit 0349c41b05968befaffa5fbb7e73d0ee6004f610.
> > > >>
> > > > >> 0349c41b0596 ("drm/i915/hwmon: Enable PL1 power limit") is incorrect
> > > > >> and caused a major regression on ATSM. The change enabled the PL1 power
> > > > >> limit but FW sets the default value of the PL1 limit to 0 which implies
> > > > >> HW now works at minimum power and therefore the lowest effective
> > > > >> frequency. This means all workloads now run slower resulting in even
> > > > >> GuC FW load operations timing out, rendering ATSM unusable.
> > > > >>
> > > > >> A different solution to the original issue of the PL1 limit being
> > > > >> disabled on ATSM is needed but till that is developed, revert
> > > > >> 0349c41b0596.
> > > >>
> > > >> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > > >
> > > > pushed to drm-intel-next and removed from drm-intel-fixes.
> > > >
> > > > Thanks for the quick reaction.
> > >
> > > Please always add Fixes: tags also to reverts.
> > >
> > > I suppose we should fix dim to also detect reverts, but I ended up
> > > cherry-picking and pushing the original commit out to
> > > drm-intel-next-fixes before realizing it's been reverted.
> >
> > Oops, sorry!
>
> That's my mistake. I should have thought about this when pushing
> and removing from the fixes. I just realized it when this patch
> showed up in my -fixes cherry-pick again, but without the revert.
>
> I'm sorry.

Not sure if it's related to this, but the reverted patch below has
reappeared on drm-tip. Newest on top:

ee892ea83d996 drm/i915/hwmon: Enable PL1 power limit
05d5562e401eb Revert "drm/i915/hwmon: Enable PL1 power limit"
0349c41b05968 drm/i915/hwmon: Enable PL1 power limit

The new patch is:

commit ee892ea83d99610fa33bea612de058e0955eec3a
Author: Ashutosh Dixit 
AuthorDate: Fri Feb 3 07:53:09 2023 -0800
Commit: Jani Nikula 
CommitDate: Mon Mar 13 11:38:05 2023 +0200

drm/i915/hwmon: Enable PL1 power limit

Sorry I couldn't track which branch this new patch came from (looks
like drm-tip itself?).

This is breaking ATSM again:

https://intel-gfx-ci.01.org/tree/drm-tip/bat-atsm-1.html

so needs to be reverted again and stay reverted. I could send a revert or
any of you can also do it.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH v2 2/2] drm/i915/guc: Allow for very slow GuC loading

2023-03-17 Thread Dixit, Ashutosh
On Thu, 16 Mar 2023 15:06:32 -0700, john.c.harri...@intel.com wrote:
>
> From: John Harrison 
>
> A failure to load the GuC is occasionally observed where the GuC log
> actually showed that the GuC had loaded just fine. The implication
> being that the load just took ever so slightly longer than the 200ms
> timeout. Given that the actual time should be tens of milliseconds at
> the slowest, this should never happen. So far the issue has generally
> been caused by a bad IFWI resulting in low frequencies during boot
> (despite the KMD requesting max frequency). However, the issue seems
> to happen more often than one would like.
>
> So a) increase the timeout so that the user still gets a working
> system even in the case of slow load. And b) report the frequency
> during the load to see if that is the case of the slow down.
>
> v2: Reduce timeout in non-debug builds, add references (Daniele)
>
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/7931
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8083
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8136
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8137
> Signed-off-by: John Harrison 

Tested this on ATSM and saw the intermittent GuC FW load timeouts
disappear:

Tested-by: Ashutosh Dixit 


Re: [PATCH v2] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-03-15 Thread Dixit, Ashutosh
On Tue, 14 Mar 2023 02:35:07 -0700, Jani Nikula wrote:
>

Hi Jani,

Thanks for the review. I have posted v3, comments inline below.

> On Mon, 13 Mar 2023, Ashutosh Dixit  wrote:
> > On dGfx, the PL1 power limit being enabled and set to a low value results
> > in a low GPU operating freq. It also negates the freq raise operation which
> > is done before GuC firmware load. As a result GuC firmware load can time
> > out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
> > limit was enabled and set to a low value). Therefore disable the PL1 power
> > limit when allowed by HW when loading GuC firmware.
> >
> > v2:
> >  - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
> >  - Add hwm_power_max_restore to error return code path
> >
> > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 10 ++-
> >  drivers/gpu/drm/i915/i915_hwmon.c | 39 +++
> >  drivers/gpu/drm/i915/i915_hwmon.h |  7 +
> >  3 files changed, 55 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > index 4ccb4be4c9cb..15f8e94edc61 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > @@ -18,6 +18,7 @@
> >  #include "intel_uc.h"
> >
> >  #include "i915_drv.h"
> > +#include "i915_hwmon.h"
> >
> >  static const struct intel_uc_ops uc_ops_off;
> >  static const struct intel_uc_ops uc_ops_on;
> > @@ -460,7 +461,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> > struct drm_i915_private *i915 = gt->i915;
> > struct intel_guc *guc = &uc->guc;
> > struct intel_huc *huc = &uc->huc;
> > -   int ret, attempts;
> > +   int ret, attempts, pl1en;
> >
> > GEM_BUG_ON(!intel_uc_supports_guc(uc));
> > GEM_BUG_ON(!intel_uc_wants_guc(uc));
> > @@ -491,6 +492,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> > else
> > attempts = 1;
> >
> > +   /* Disable PL1 limit before raising freq */
>
> That's just duplicating what the code says; a few words on the why might
> be helpful.

I've added a hint. The comment is not a duplication because what is known
as the PL1 power limit is known as "hwm_power_max" inside hwmon. I have
retained "hwm_power_max" in the function name (rather than say
"i915_hwmon_pl1_limit_disable") because other hwmon functions are similarly
named. So there is a need to tell someone reading the code that this is the
PL1 power limit we are referring to.

>
> > > +   hwm_power_max_disable(gt, &pl1en);
> > > +
> > > intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> >
> > while (attempts--) {
> > @@ -547,6 +551,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> > }
> >
> > +   hwm_power_max_restore(gt, pl1en); /* Restore PL1 limit */
> > +
> > guc_info(guc, "submission %s\n", 
> > str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
> > guc_info(guc, "SLPC %s\n", 
> > str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
> >
> > @@ -563,6 +569,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> > /* Return GT back to RPn */
> > > intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >
> > +   hwm_power_max_restore(gt, pl1en); /* Restore PL1 limit */
>
> Ditto about code and comment duplicating the same thing.
>
> Also, we don't use end of the line comments very much.

Removed the comment here, comment seems pointless after the comment above.

>
> > +
> > __uc_sanitize(uc);
> >
> > if (!ret) {
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index ee63a8fd88fc..2bbca75ac477 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -444,6 +444,45 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, 
> > int chan, long val)
> > }
> >  }
> >
> > +void hwm_power_max_disable(struct intel_gt *gt, u32 *old)
> > +{
> > +   struct i915_hwmon *hwmon = gt->i915->hwmon;
> > +   intel_wakeref_t wakeref;
> > +   u32 r;
> > +
> > +   if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > +   return;
> > +
> > +   /* Take mutex to prevent concurrent hwm_power_max_write */
> > > +   mutex_lock(&hwmon->hwmon_lock);
> > +
> > +   with_intel_runtime_pm(hwmon->ddat.uncore->rpm, wakeref)
> > +   r = intel_uncore_rmw(hwmon->ddat.uncore,
> > +hwmon->rg.pkg_rapl_limit,
> > +PKG_PWR_LIM_1_EN, 0);
> > +
> > +   *old = !!(r & PKG_PWR_LIM_1_EN);
>
> If you only need a bool, why do you use a u32?

Changed.

>
> > +
> > +   /* hwmon_lock mutex is unlocked in hwm_power_max_restore */
>
> Not too happy about that... any better ideas?

Afais, taking the mutex is the only fully correct solution (when we disable
the power limit, userspace can go re-enable it). Examples of partly
incorrect solutions (which 

Re: [PATCH 1/2] drm/i915/pmu: Use functions common with sysfs to read actual freq

2023-03-15 Thread Dixit, Ashutosh
On Wed, 15 Mar 2023 02:43:30 -0700, Tvrtko Ursulin wrote:
>
> On 10/03/2023 00:59, Ashutosh Dixit wrote:
> > Expose intel_rps_read_actual_frequency_fw to read the actual freq without
> > taking forcewake for use by PMU. The code is refactored to use a common set
> > of functions across sysfs and PMU. Using common functions with sysfs in PMU
> > solves the issues of missing support for MTL and missing support for older
> > generations (prior to Gen6). It also future proofs the PMU where sometimes
> > code has been updated for sysfs and PMU has been missed.
> >
> > v2: Remove runtime_pm_if_in_use from read_actual_frequency_fw (Tvrtko)
> >
> > Fixes: 22009b6dad66 ("drm/i915/mtl: Modify CAGF functions for MTL")
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8280
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_rps.c | 34 -
> >   drivers/gpu/drm/i915/gt/intel_rps.h |  2 +-
> >   drivers/gpu/drm/i915/i915_pmu.c | 10 -
> >   3 files changed, 24 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> > b/drivers/gpu/drm/i915/gt/intel_rps.c
> > index 4d0dc9de23f9..9d9ac35691fc 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> > @@ -2046,16 +2046,6 @@ void intel_rps_sanitize(struct intel_rps *rps)
> > rps_disable_interrupts(rps);
> >   }
> >   -u32 intel_rps_read_rpstat_fw(struct intel_rps *rps)
> > -{
> > -   struct drm_i915_private *i915 = rps_to_i915(rps);
> > -   i915_reg_t rpstat;
> > -
> > -   rpstat = (GRAPHICS_VER(i915) >= 12) ? GEN12_RPSTAT1 : GEN6_RPSTAT1;
> > -
> > -   return intel_uncore_read_fw(rps_to_gt(rps)->uncore, rpstat);
> > -}
> > -
> >   u32 intel_rps_read_rpstat(struct intel_rps *rps)
> >   {
> > struct drm_i915_private *i915 = rps_to_i915(rps);
> > @@ -2089,10 +2079,11 @@ u32 intel_rps_get_cagf(struct intel_rps *rps, u32 
> > rpstat)
> > return cagf;
> >   }
> >   -static u32 read_cagf(struct intel_rps *rps)
> > +static u32 __read_cagf(struct intel_rps *rps, bool take_fw)
> >   {
> > struct drm_i915_private *i915 = rps_to_i915(rps);
> > struct intel_uncore *uncore = rps_to_uncore(rps);
> > +   i915_reg_t r = INVALID_MMIO_REG;
> > u32 freq;
> > /*
> > @@ -2100,22 +2091,30 @@ static u32 read_cagf(struct intel_rps *rps)
> >  * registers will return 0 freq when GT is in RC6
> >  */
> > if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) {
> > -   freq = intel_uncore_read(uncore, MTL_MIRROR_TARGET_WP1);
> > +   r = MTL_MIRROR_TARGET_WP1;
> > } else if (GRAPHICS_VER(i915) >= 12) {
> > -   freq = intel_uncore_read(uncore, GEN12_RPSTAT1);
> > +   r = GEN12_RPSTAT1;
> > } else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) {
> > vlv_punit_get(i915);
> > freq = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS);
> > vlv_punit_put(i915);
> > +   goto exit;
>
> Alternatively you could avoid the goto by making the read below conditional
> on r being set. One more conditional though for avoiding gotos.. up to you.

Done.
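
The goto-free variant suggested above can be sketched in plain userspace C.
This is only an illustration under stated assumptions: reg_t, INVALID_REG,
the PLAT_* enum, the register offsets and the fake_* accessors are
hypothetical stand-ins, not the i915 definitions. The point is that the MMIO
read happens only when a register was actually selected, so the punit branch
needs no jump.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for i915_reg_t and INVALID_MMIO_REG. */
typedef uint32_t reg_t;
#define INVALID_REG 0u

/* Hypothetical platform classes mirroring the branches in the patch. */
enum platform { PLAT_MTL, PLAT_GEN12, PLAT_VLV_CHV, PLAT_GEN6, PLAT_ILK };

/* Fake accessors: tag the result so a caller can tell which path ran. */
static uint32_t fake_read(reg_t r)    { return 0x10000u | r; }
static uint32_t fake_read_fw(reg_t r) { return 0x20000u | r; }
static uint32_t fake_punit_read(void) { return 0x30000u; }

static uint32_t read_raw_freq(enum platform p, bool take_fw)
{
	reg_t r = INVALID_REG;
	uint32_t freq = 0;

	if (p == PLAT_MTL)
		r = 0x1;		/* made-up offset */
	else if (p == PLAT_GEN12)
		r = 0x2;		/* made-up offset */
	else if (p == PLAT_VLV_CHV)
		freq = fake_punit_read();	/* no MMIO register selected */
	else if (p == PLAT_GEN6)
		r = 0x3;		/* made-up offset */
	else
		r = 0x4;		/* made-up offset */

	/* Only read MMIO when a register was chosen; replaces the goto. */
	if (r != INVALID_REG)
		freq = take_fw ? fake_read(r) : fake_read_fw(r);

	return freq;
}
```

The cost is the one extra conditional mentioned in the review comment; the
benefit is a single exit path.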

>
> > } else if (GRAPHICS_VER(i915) >= 6) {
> > -   freq = intel_uncore_read(uncore, GEN6_RPSTAT1);
> > +   r = GEN6_RPSTAT1;
> > } else {
> > -   freq = intel_uncore_read(uncore, MEMSTAT_ILK);
> > +   r = MEMSTAT_ILK;
> > }
> >   + freq = take_fw ? intel_uncore_read(uncore, r) :
> > intel_uncore_read_fw(uncore, r);
> > +exit:
> > return intel_rps_get_cagf(rps, freq);
> >   }
> >   +static u32 read_cagf(struct intel_rps *rps)
> > +{
> > +   return __read_cagf(rps, true);
> > +}
>
> There is only one caller so up to you if you think a helper is needed or
> not.

There are other callers too in i915/gt/selftest_rps.c so need to retain it.

>
> > +
> >   u32 intel_rps_read_actual_frequency(struct intel_rps *rps)
> >   {
> > struct intel_runtime_pm *rpm = rps_to_uncore(rps)->rpm;
> > @@ -2128,6 +2127,11 @@ u32 intel_rps_read_actual_frequency(struct intel_rps 
> > *rps)
> > return freq;
> >   }
> >   +u32 intel_rps_read_actual_frequency_fw(struct intel_rps *rps)
> > +{
> > +   return intel_gpu_freq(rps, __read_cagf(rps, false));
> > +}
> > +
> >   u32 intel_rps_read_punit_req(struct intel_rps *rps)
> >   {
> > struct intel_uncore *uncore = rps_to_uncore(rps);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h 
> > b/drivers/gpu/drm/i915/gt/intel_rps.h
> > index c622962c6bef..2d5b3ef58606 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_rps.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_rps.h
> > @@ -39,6 +39,7 @@ int intel_gpu_freq(struct intel_rps *rps, int val);
> >   int intel_freq_opcode(struct intel_rps *rps, int val);
> >   u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1);
> >   u32 intel_rps_read_actual_frequency(struct intel_rps *rps);
> > +u32 intel_rps_read_actual_frequency_fw(struct intel_rps *rps);
> >   u32 

Re: [PATCH 2/2] drm/i915/pmu: Remove fallback to requested freq for SLPC

2023-03-15 Thread Dixit, Ashutosh
On Wed, 15 Mar 2023 02:50:17 -0700, Tvrtko Ursulin wrote:
>
> On 10/03/2023 00:59, Ashutosh Dixit wrote:
> > The fallback to requested freq does not work for SLPC because SLPC does not
> > use 'struct intel_rps'. Also for SLPC requested freq can only be obtained
> > from a hw register after acquiring forcewake which we don't want to do for
> > PMU. Therefore remove fallback to requested freq for SLPC. The actual freq
> > will be 0 when gt is in RC6 which is correct. Also this is rare since PMU
> > freq sampling happens only when gt is unparked.
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 9 -
> >   1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > b/drivers/gpu/drm/i915/i915_pmu.c
> > index 7ece883a7d95..f697fabed64a 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -393,7 +393,14 @@ frequency_sample(struct intel_gt *gt, unsigned int 
> > period_ns)
> >  * frequency. Fortunately, the read should rarely fail!
> >  */
> > val = intel_rps_read_actual_frequency_fw(rps);
> > -   if (!val)
> > +
> > +   /*
> > +* SLPC does not use 'struct intel_rps'. Also for SLPC
> > +* requested freq can only be obtained after acquiring
> > +* forcewake and reading a hw register. For SLPC just
> > +* let val be 0
> > +*/
> > > +   if (!val && !intel_uc_uses_guc_slpc(&gt->uc))
> > val = intel_gpu_freq(rps, rps->cur_freq);
>
> I really dislike sprinkling of "uses slpc" since I think the thing hasn't
> really been integrated nicely. Case in point is probably the flow duality
> in intel_rps_boost. Data structures as well, even though some fields and
> concepts are shared.
>
> For instance why we can't have the notion of software tracked cur_freq in
> rps, and/or have it zero if with SLPC we can't have it otherwise?

For SLPC:

* We can't have the notion of software tracked cur_freq in rps because FW is
  managing the freq.
* rps->cur_freq /is/ actually 0 since SLPC does not use 'struct
  intel_rps'. So this patch doesn't really make any practical difference,
  PMU values will be exactly the same with or without this patch. It was
  just clarifying things.
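
The second point can be illustrated with a small userspace model. It is a
sketch only: sample_old() and sample_new() are hypothetical names that mirror
the shape of the fallback before and after the patch, and the unit conversion
done by intel_gpu_freq() is left out. With SLPC the software-tracked
cur_freq is 0, so both variants report the same value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Before the patch: always fall back to cur_freq when actual is 0. */
static uint32_t sample_old(uint32_t actual, uint32_t cur_freq)
{
	if (!actual)
		return cur_freq;
	return actual;
}

/* After the patch: skip the fallback when SLPC is in use. */
static uint32_t sample_new(uint32_t actual, uint32_t cur_freq, bool uses_slpc)
{
	if (!actual && !uses_slpc)
		return cur_freq;
	return actual;
}
```

For an SLPC system (cur_freq == 0) the two functions agree for every value of
actual, which is why dropping the patch changes nothing in the PMU output.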

> I will abstain, sorry.

I will drop this patch, there doesn't seem much point in it.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH] drm/i915: Fix format for perf_limit_reasons

2023-03-14 Thread Dixit, Ashutosh
On Tue, 14 Mar 2023 19:29:06 -0700, Vinay Belgaumkar wrote:
>
> Use hex format so that it is easier to decode.
>
> Fixes: fe5979665f64 ('Add perf_limit_reasons in debugfs')
>
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c 
> b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> index 83df4cd5e06c..80dbbef86b1d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> @@ -580,7 +580,7 @@ static bool perf_limit_reasons_eval(void *data)
>  }
>
>  DEFINE_SIMPLE_ATTRIBUTE(perf_limit_reasons_fops, perf_limit_reasons_get,
> - perf_limit_reasons_clear, "%llu\n");
> + perf_limit_reasons_clear, "0x%llx\n");

Duh.

Reviewed-by: Ashutosh Dixit 
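
A quick userspace sketch of why hex is easier to decode for a bitmask value.
The helper format_reasons() and the bit positions in the usage note are
hypothetical, chosen only for illustration; they are not the real
perf_limit_reasons bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Format a bitmask either the old way ("%llu") or the new way ("0x%llx"). */
static int format_reasons(char *buf, size_t len, unsigned long long val,
			  bool hex)
{
	if (hex)
		return snprintf(buf, len, "0x%llx\n", val);
	return snprintf(buf, len, "%llu\n", val);
}
```

With bits 0, 16 and 25 set, the decimal form is 33619969 while the hex form
is 0x2010001, where each set bit can be read off directly.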


Re: [Intel-gfx] [PATCH 2/2] drm/i915/guc: Allow for very slow GuC loading

2023-03-13 Thread Dixit, Ashutosh
On Fri, 10 Mar 2023 17:01:42 -0800, John Harrison wrote:
>
> >> +  for (count = 0; count < 20; count++) {
> >> +  ret = wait_for(guc_load_done(uncore, &status, &success), 1000);
> >
> > Isn't 20 secs a bit too long for an in-place wait? I get that if the GuC
> > doesn't load (or fail to) within a few secs the HW is likely toast, but
> > still that seems a bit too long to me. What's the worst case load time
> > ever observed? I suggest reducing the wait to 3 secs as a compromise, if
> > that's bigger than the worst case.
>
> I can drop it to 3 for normal builds and keep 20 for
> CONFIG_DRM_I915_DEBUG_GEM builds. However, that won't actually be long
> enough for all slow situations. We have seen times of at least 11s when the
> GPU is running at minimum frequency. So, for CI runs we definitely want to
> keep the 20s limit. For end users? Is it better to wait for up to 20s or to
> boot in display only fallback mode? And note that this is a timeout only. A
> functional system will still complete in tens of milliseconds.

Just FYI, in this related patch:

https://patchwork.freedesktop.org/series/115003/#rev2

I am holding a mutex across GuC FW load, so very unlikely, but worst case a
thread can get blocked for the duration of the GuC reset/FW load.

Ashutosh


Re: [PATCH] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-03-11 Thread Dixit, Ashutosh
On Fri, 10 Mar 2023 16:33:58 -0800, Ashutosh Dixit wrote:
>
> On dGfx, the PL1 power limit being enabled and set to a low value results
> in a low GPU operating freq. It also negates the freq raise operation which
> is done before GuC firmware load. As a result GuC firmware load can time
> out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
> limit was enabled and set to a low value). Therefore disable the PL1 power
> limit when possible when loading GuC firmware.

There are a couple of bugs in the patch. Please don't review yet, will post
a v2. Thanks.


Re: [PATCH 1/2] drm/i915/pmu: Use functions common with sysfs to read actual freq

2023-03-09 Thread Dixit, Ashutosh
On Thu, 09 Mar 2023 01:20:09 -0800, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 09/03/2023 03:46, Ashutosh Dixit wrote:
> > Expose intel_rps_read_actual_frequency_fw to read the actual freq without
> > taking forcewake for use by PMU. The code is refactored to use a common set
> > of functions across sysfs and PMU. Using common functions with sysfs in PMU
> > solves the issues of missing support for MTL and missing support for older
> > generations (prior to Gen6). It also future proofs the PMU where sometimes
> > code has been updated for sysfs and PMU has been missed.
> >
> > Fixes: 22009b6dad66 ("drm/i915/mtl: Modify CAGF functions for MTL")
>
> So not DG1 and above?

The issue for DG1+ happens if non-freq bits are non-zero but freq bits are
zero. But we've already seen that during PMU freq sampling gt is unparked
so freq bits being 0 is rare. Therefore IMO there is 0 practical impact of
that bug, I don't think it's worth fixing it and Cc'ing stable etc. (also
those platforms are under force probe).
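
The case described above can be sketched in userspace C. The field layout
used here (frequency field in bits 19:11, other bits in 10:0) is an
assumption for illustration, and get_cagf(), sample_raw_check() and
sample_field_check() are hypothetical names. Testing the whole register skips
the fallback whenever any non-freq bit is set; testing only the extracted
freq field does not.

```c
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define CAGF_SHIFT	11
#define CAGF_MASK	(0x1ffu << CAGF_SHIFT)

static uint32_t get_cagf(uint32_t rpstat)
{
	return (rpstat & CAGF_MASK) >> CAGF_SHIFT;
}

/* Fallback decided on the raw register value. */
static uint32_t sample_raw_check(uint32_t rpstat, uint32_t requested)
{
	return rpstat ? get_cagf(rpstat) : requested;
}

/* Fallback decided on the freq bits only. */
static uint32_t sample_field_check(uint32_t rpstat, uint32_t requested)
{
	uint32_t val = get_cagf(rpstat);

	return val ? val : requested;
}
```

With rpstat = 0x123 (only non-freq bits set) the raw check reports 0, while
the field check falls back to the requested value.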

> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8280
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_rps.c | 46 +++--
> >   drivers/gpu/drm/i915/gt/intel_rps.h |  2 +-
> >   drivers/gpu/drm/i915/i915_pmu.c | 10 +++
> >   3 files changed, 36 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> > b/drivers/gpu/drm/i915/gt/intel_rps.c
> > index 4d0dc9de23f9..3957c5ee5cba 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> > @@ -2046,16 +2046,6 @@ void intel_rps_sanitize(struct intel_rps *rps)
> > rps_disable_interrupts(rps);
> >   }
> >   -u32 intel_rps_read_rpstat_fw(struct intel_rps *rps)
> > -{
> > -   struct drm_i915_private *i915 = rps_to_i915(rps);
> > -   i915_reg_t rpstat;
> > -
> > -   rpstat = (GRAPHICS_VER(i915) >= 12) ? GEN12_RPSTAT1 : GEN6_RPSTAT1;
> > -
> > -   return intel_uncore_read_fw(rps_to_gt(rps)->uncore, rpstat);
> > -}
> > -
> >   u32 intel_rps_read_rpstat(struct intel_rps *rps)
> >   {
> > struct drm_i915_private *i915 = rps_to_i915(rps);
> > @@ -2089,10 +2079,11 @@ u32 intel_rps_get_cagf(struct intel_rps *rps, u32 
> > rpstat)
> > return cagf;
> >   }
> >   -static u32 read_cagf(struct intel_rps *rps)
> > +static u32 __read_cagf(struct intel_rps *rps, bool take_fw)
> >   {
> > struct drm_i915_private *i915 = rps_to_i915(rps);
> > struct intel_uncore *uncore = rps_to_uncore(rps);
> > +   i915_reg_t r = INVALID_MMIO_REG;
> > u32 freq;
> > /*
> > @@ -2100,22 +2091,30 @@ static u32 read_cagf(struct intel_rps *rps)
> >  * registers will return 0 freq when GT is in RC6
> >  */
> > if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) {
> > -   freq = intel_uncore_read(uncore, MTL_MIRROR_TARGET_WP1);
> > +   r = MTL_MIRROR_TARGET_WP1;
> > } else if (GRAPHICS_VER(i915) >= 12) {
> > -   freq = intel_uncore_read(uncore, GEN12_RPSTAT1);
> > +   r = GEN12_RPSTAT1;
> > } else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) {
> > vlv_punit_get(i915);
> > freq = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS);
> > vlv_punit_put(i915);
> > +   goto exit;
> > } else if (GRAPHICS_VER(i915) >= 6) {
> > -   freq = intel_uncore_read(uncore, GEN6_RPSTAT1);
> > +   r = GEN6_RPSTAT1;
> > } else {
> > -   freq = intel_uncore_read(uncore, MEMSTAT_ILK);
> > +   r = MEMSTAT_ILK;
> > }
> >   + freq = take_fw ? intel_uncore_read(uncore, r) :
> > intel_uncore_read_fw(uncore, r);
> > +exit:
> > return intel_rps_get_cagf(rps, freq);
> >   }
> >   +static u32 read_cagf(struct intel_rps *rps)
> > +{
> > +   return __read_cagf(rps, true);
> > +}
> > +
> >   u32 intel_rps_read_actual_frequency(struct intel_rps *rps)
> >   {
> > struct intel_runtime_pm *rpm = rps_to_uncore(rps)->rpm;
> > @@ -2128,6 +2127,23 @@ u32 intel_rps_read_actual_frequency(struct intel_rps 
> > *rps)
> > return freq;
> >   }
> >   +static u32 read_cagf_fw(struct intel_rps *rps)
> > +{
> > +   return __read_cagf(rps, false);
> > +}
> > +
> > +u32 intel_rps_read_actual_frequency_fw(struct intel_rps *rps)
> > +{
> > +   struct intel_runtime_pm *rpm = rps_to_uncore(rps)->rpm;
> > +   intel_wakeref_t wakeref;
> > +   u32 freq = 0;
> > +
> > +   with_intel_runtime_pm_if_in_use(rpm, wakeref)
>
> > When called from i915_pmu.c::frequency_sample() above seems redundant since
> there we already are under intel_gt_pm_get_if_awake. Perhaps it is not a
> huge deal but it is nevertheless wasteful.
>
> Also, maybe I am a bit rusty, but more fundamentally, wouldn't this be
> adding a _very_ atypical pattern of a _fw function which grabs rpm? I'd
> expect they all assume it's already held since the forcewake is already
> held.
>
> Am I missing the reason why it is needed?

Thanks for catching this, you are right, it was just 

Re: [PATCH 3/3] drm/i915/pmu: Use common freq functions with sysfs

2023-03-08 Thread Dixit, Ashutosh
On Tue, 07 Mar 2023 22:12:49 -0800, Belgaumkar, Vinay wrote:
>

Hi Vinay,

> On 3/7/2023 9:33 PM, Ashutosh Dixit wrote:
> > Using common freq functions with sysfs in PMU (but without taking
> > forcewake) solves the following issues (a) missing support for MTL (b)
>
> For the requested_freq, we read it only if actual_freq is zero below
> (meaning, GT is in C6). So then what is the point of reading it without a
> force wake? It will also be zero, correct?

Yes agreed. I had tested this and you do see values for requested freq
which look correct even when actual freq is 0 even without taking
forcewake. That is why I ended up writing Patch 2/3.

However, what I missed is what you pointed out: 0xa008 is a shadowed
register which cannot be read without taking forcewake. It is probably
returning the last value which was written to the shadowed write register.

As a result I have dropped the "drm/i915/rps: Expose
get_requested_frequency_fw for PMU" patch in v2 of this series.

Thanks.
--
Ashutosh


Re: [PATCH 2/2] drm/i915/pmu: Use correct requested freq for SLPC

2023-03-07 Thread Dixit, Ashutosh
On Mon, 06 Mar 2023 03:10:24 -0800, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 04/03/2023 01:27, Ashutosh Dixit wrote:
> > SLPC does not use 'struct intel_rps'. Use UNSLICE_RATIO bits from
>
> Would it be more accurate to say 'SLPC does not use rps->cur_freq' rather
> than it not using struct intel_rps?

No, actually SLPC maintains a separate 'struct intel_guc_slpc' and does not
use 'struct intel_rps' at all, so all of 'struct intel_rps' is 0.

> Fixes: / stable ? CI chances of catching this?

Same issue as Patch 1, I have answered this there.

> > GEN6_RPNSWREQ for SLPC. See intel_rps_get_requested_frequency.
> >
> > Bspec: 52745
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 9 +++--
> >   1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > b/drivers/gpu/drm/i915/i915_pmu.c
> > index f0a1e36915b8..5ee836610801 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -394,8 +394,13 @@ frequency_sample(struct intel_gt *gt, unsigned int 
> > period_ns)
> >  * frequency. Fortunately, the read should rarely fail!
> >  */
> > val = intel_rps_get_cagf(rps, intel_rps_read_rpstat_fw(rps));
> > -   if (!val)
> > -   val = rps->cur_freq;
> > +   if (!val) {
> > > +   if (intel_uc_uses_guc_slpc(&gt->uc))
> > +   val = intel_rps_read_punit_req(rps) >>
> > +   GEN9_SW_REQ_UNSLICE_RATIO_SHIFT;
> > +   else
> > +   val = rps->cur_freq;
> > +   }
>
> That's a bunch of duplication from intel_rps.c so perhaps the appropriate
> helpers should be exported (some way) from there.

This is also addressed in the new series:

https://patchwork.freedesktop.org/series/114814/

> > > add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],
> > intel_gpu_freq(rps, val), period_ns / 1000);

Thanks.
--
Ashutosh


Re: [PATCH 1/2] drm/i915/pmu: Use only freq bits for falling back to requested freq

2023-03-07 Thread Dixit, Ashutosh
On Mon, 06 Mar 2023 03:04:40 -0800, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 04/03/2023 01:27, Ashutosh Dixit wrote:
> > On newer generations, the GEN12_RPSTAT1 register contains more than freq
> > information, e.g. see GEN12_VOLTAGE_MASK. Therefore use only the freq bits
> > to decide whether to fall back to requested freq.
>

> CI is not catching the problem?

This is because as we know PMU freq sampling happens only when gt is
unparked (actively processing requests) so it is highly unlikely that gt
will be in rc6 when it might have to fall back to requested freq (I checked
this and it seems it is only at the end of the workload that we see it
entering the fallback code path). Deleting the fallback path completely
will not make much difference to the output and is an option too. Anyway I
have retained it for now.

> Could you find an appropriate Fixes: tag please? If it can affect a
> platform out of force probe then cc: stable too.

Cc stable is not needed anyway because the affected platforms (DG1 onwards)
are under force probe. Also, because the issue does not affect real metrics
(as mentioned above) and because it is really a missing patch rather than a
broken previous patch, I am skipping the Fixes tag.

> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 6 ++
> >   1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
> > b/drivers/gpu/drm/i915/i915_pmu.c
> > index 52531ab28c5f..f0a1e36915b8 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -393,10 +393,8 @@ frequency_sample(struct intel_gt *gt, unsigned int 
> > period_ns)
> >  * case we assume the system is running at the intended
> >  * frequency. Fortunately, the read should rarely fail!
> >  */
> > -   val = intel_rps_read_rpstat_fw(rps);
> > -   if (val)
> > -   val = intel_rps_get_cagf(rps, val);
> > -   else
> > +   val = intel_rps_get_cagf(rps, intel_rps_read_rpstat_fw(rps));
>
> Will this work with gen5_invert_freq as called by intel_rps_get_cagf?

PMU has only ever supported Gen6+. See intel_rps_read_rpstat_fw (Gen5 does
not have a GEN6_RPSTAT1 register) as well as 01b8c2e60e96.

More importantly, PMU was missing support for MTL. To avoid these kinds of
issues, I have submitted a new series with a different approach which should
now take care of both MTL+ and Gen5-:

https://patchwork.freedesktop.org/series/114814/

> > +   if (!val)
> > val = rps->cur_freq;
> > > add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],

Thanks.
--
Ashutosh


Re: [PATCH RESEND] drm/tegra: sor: Remove redundant error logging

2023-03-01 Thread Dixit, Ashutosh
On Wed, 01 Mar 2023 11:48:06 -0800, Deepak R Varma wrote:
>
> A call to platform_get_irq() already prints an error on failure within
> its own implementation. So printing another error based on its return
> value in the caller is redundant and should be removed. The clean up
> also makes if condition block braces unnecessary. Remove that as well.
>
> Issue identified using platform_get_irq.cocci coccicheck script.

Reviewed-by: Ashutosh Dixit 

>
> Signed-off-by: Deepak R Varma 
> ---
> Note:
>Resending the patch for review and feedback. Originally sent on Dec 12 
> 2022.
>
>  drivers/gpu/drm/tegra/sor.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/sor.c b/drivers/gpu/drm/tegra/sor.c
> index 8af632740673..ceaebd33408d 100644
> --- a/drivers/gpu/drm/tegra/sor.c
> +++ b/drivers/gpu/drm/tegra/sor.c
> @@ -3799,10 +3799,8 @@ static int tegra_sor_probe(struct platform_device 
> *pdev)
>   }
>
>   err = platform_get_irq(pdev, 0);
> - if (err < 0) {
> - dev_err(&pdev->dev, "failed to get IRQ: %d\n", err);
> + if (err < 0)
>   goto remove;
> - }
>
>   sor->irq = err;
>
> --
> 2.34.1
>
>
>


Re: [PATCH RESEND] drm/nouveau/hwmon: Use sysfs_emit in show function callsbacks

2023-03-01 Thread Dixit, Ashutosh
On Wed, 01 Mar 2023 11:35:41 -0800, Deepak R Varma wrote:
>
> According to Documentation/filesystems/sysfs.rst, the show() callback
> function of kobject attributes should strictly use sysfs_emit() instead
> of sprintf() family functions. So, make this change.
> Issue identified using the coccinelle device_attr_show.cocci script.

Reviewed-by: Ashutosh Dixit 

>
> Signed-off-by: Deepak R Varma 
> ---
> Note:
>Resending the patch for review and feedback. No functional changes.
>
>
>  drivers/gpu/drm/nouveau/nouveau_hwmon.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c 
> b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> index a7db7c31064b..e844be49e11e 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> @@ -41,7 +41,7 @@ static ssize_t
>  nouveau_hwmon_show_temp1_auto_point1_pwm(struct device *d,
>struct device_attribute *a, char *buf)
>  {
> - return snprintf(buf, PAGE_SIZE, "%d\n", 100);
> + return sysfs_emit(buf, "%d\n", 100);
>  }
>  static SENSOR_DEVICE_ATTR(temp1_auto_point1_pwm, 0444,
> nouveau_hwmon_show_temp1_auto_point1_pwm, NULL, 0);
> @@ -54,8 +54,8 @@ nouveau_hwmon_temp1_auto_point1_temp(struct device *d,
>   struct nouveau_drm *drm = nouveau_drm(dev);
>   struct nvkm_therm *therm = nvxx_therm(&drm->client.device);
>
> - return snprintf(buf, PAGE_SIZE, "%d\n",
> -   therm->attr_get(therm, NVKM_THERM_ATTR_THRS_FAN_BOOST) * 1000);
> + return sysfs_emit(buf, "%d\n",
> +   therm->attr_get(therm, 
> NVKM_THERM_ATTR_THRS_FAN_BOOST) * 1000);
>  }
>  static ssize_t
>  nouveau_hwmon_set_temp1_auto_point1_temp(struct device *d,
> @@ -87,8 +87,8 @@ nouveau_hwmon_temp1_auto_point1_temp_hyst(struct device *d,
>   struct nouveau_drm *drm = nouveau_drm(dev);
>   struct nvkm_therm *therm = nvxx_therm(&drm->client.device);
>
> - return snprintf(buf, PAGE_SIZE, "%d\n",
> -  therm->attr_get(therm, NVKM_THERM_ATTR_THRS_FAN_BOOST_HYST) * 1000);
> + return sysfs_emit(buf, "%d\n",
> +   therm->attr_get(therm, 
> NVKM_THERM_ATTR_THRS_FAN_BOOST_HYST) * 1000);
>  }
>  static ssize_t
>  nouveau_hwmon_set_temp1_auto_point1_temp_hyst(struct device *d,
> --
> 2.34.1
>
>
>


Re: [PATCH 3/7] drm/i915/hwmon: Power PL1 limit and TDP setting

2023-02-28 Thread Dixit, Ashutosh
On Fri, 12 Aug 2022 11:06:58 -0700, Guenter Roeck wrote:
>

Hi Guenter/linux-hwmon,


> On 8/12/22 10:37, Badal Nilawar wrote:
> > From: Dale B Stimson 
> >
> > Use i915 HWMON to display/modify dGfx power PL1 limit and TDP setting.
> >

/snip/

>
> Acked-by: Guenter Roeck 
>
> > ---
> >   .../ABI/testing/sysfs-driver-intel-i915-hwmon |  20 ++
> >   drivers/gpu/drm/i915/i915_hwmon.c | 176 +-
> >   drivers/gpu/drm/i915/i915_reg.h   |  16 ++
> >   drivers/gpu/drm/i915/intel_mchbar_regs.h  |   7 +
> >   4 files changed, 217 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
> > b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > index 24c4b7477d51..9a2d10edfce8 100644
> > --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > @@ -5,3 +5,23 @@ Contact:   dri-devel@lists.freedesktop.org
> >   Description:  RO. Current Voltage in millivolt.
> > Only supported for particular Intel i915 graphics
> > platforms.
> > +
> > +What:  /sys/devices/.../hwmon/hwmon<i>/power1_max
> > +Date:  June 2022
> > +KernelVersion: 5.19
> > +Contact:   dri-devel@lists.freedesktop.org
> > +Description:   RW. Card reactive sustained  (PL1/Tau) power limit in 
> > microwatts.
> > +
> > +   The power controller will throttle the operating frequency
> > +   if the power averaged over a window (typically seconds)
> > +   exceeds this limit.

We exposed this as 'power1_max' previously. However this is a "power
limit".

https://github.com/torvalds/linux/blob/master/Documentation/hwmon/sysfs-interface.rst

says power1_max is "Maximum power". On the other hand, power1_cap is "If
power use rises above this limit, the system should take action to reduce
power use". So it would seem we should have chosen power1_cap for this
power limit instead of power1_max? So do you think we should change this to
power1_cap instead? Though even power1_max has an associated alarm so it
also seems to be a sort of limit.

Is there any guidance as to how these different power limits should be
used? Generally speaking is: power1_max <= power1_cap <= power1_crit, or is
it arbitrary or something else?

Also, only power1_cap seems to have power1_cap_min and power1_cap_max (in
case we wanted to use min/max values for the limits), not the others.

Separately, we have already used up power1_crit (which is the other limit
in official hwmon power limits) so we can't use that.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH] Revert "drm/i915/hwmon: Enable PL1 power limit"

2023-02-15 Thread Dixit, Ashutosh
On Wed, 15 Feb 2023 07:37:30 -0800, Jani Nikula wrote:
>
> On Wed, 08 Feb 2023, Rodrigo Vivi  wrote:
> > On Wed, Feb 08, 2023 at 11:03:12AM -0800, Ashutosh Dixit wrote:
> >> This reverts commit 0349c41b05968befaffa5fbb7e73d0ee6004f610.
> >>
> >> 0349c41b0596 ("drm/i915/hwmon: Enable PL1 power limit") is incorrect and
> >> caused a major regression on ATSM. The change enabled the PL1 power limit
> >> but FW sets the default value of the PL1 limit to 0 which implies HW now
> >> works at minimum power and therefore the lowest effective frequency. This
> >> means all workloads now run slower resulting in even GuC FW load operations
> >> timing out, rendering ATSM unusable.
> >>
> >> A different solution to the original issue of the PL1 limit being disabled
> >> on ATSM is needed but till that is developed, revert 0349c41b0596.
> >>
> >> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> >
> > pushed to drm-intel-next and removed from drm-intel-fixes.
> >
> > Thanks for the quick reaction.
>
> Please always add Fixes: tags also to reverts.
>
> I suppose we should fix dim to also detect reverts, but I ended up
> cherry-picking and pushing the original commit out to
> drm-intel-next-fixes before realizing it's been reverted.

Oops, sorry!


Re: [PATCH 3/3] drm/i915/hwmon: Expose power1_max_enable

2023-02-14 Thread Dixit, Ashutosh
On Mon, 13 Feb 2023 22:16:44 -0800, Guenter Roeck wrote:
>

Hi Guenter,

> On 2/13/23 21:33, Ashutosh Dixit wrote:
> > On ATSM the PL1 power limit is disabled at power up. The previous uapi
> > assumed that the PL1 limit is always enabled and therefore did not have a
> > notion of a disabled PL1 limit. This results in erroneous PL1 limit values
> > when PL1 limit is disabled. For example at power up, the disabled ATSM PL1
> > limit is shown as 0 which means a low PL1 limit whereas the limit being
> > disabled actually implies a high effective PL1 limit value.
> >
> > To get round this problem, expose power1_max_enable as a custom hwmon
> > attribute. power1_max_enable can be used in conjunction with power1_max to
> > interpret power1_max (PL1 limit) values correctly. It can also be used to
> > enable/disable the PL1 power limit.
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   .../ABI/testing/sysfs-driver-intel-i915-hwmon |  7 +++
> >   drivers/gpu/drm/i915/i915_hwmon.c | 48 +--
> >   2 files changed, 51 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
> > b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > index 2d6a472eef885..edd94a44b4570 100644
> > --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > @@ -18,6 +18,13 @@ Description: RW. Card reactive sustained  (PL1/Tau) 
> > power limit in microwatts.
> > Only supported for particular Intel i915 graphics
> > platforms.
> >   +What:/sys/devices/.../hwmon/hwmon/power1_max_enable
>
> This is not a standard hwmon attribute. The standard attribute would be
> power1_enable.
>
> So from hwmon perspective this is a NACK.

Thanks for the feedback. I did consider power1_enable but decided to go
with the power1_max_enable custom attribute. Documentation for
power1_enable says it is to "Enable or disable the sensors" but in our case
we are not enabling/disabling sensors (which we don't have any ability to,
neither do we expose any power measurements, only energy from which power
can be derived) but enabling/disabling a "power limit" (a limit beyond
which HW takes steps to limit power).

As mentioned in the commit message, power1_max_enable is exposed to avoid
possible misinterpretations in measured energy in response to the set power
limit (something specific to our HW). We may have multiple such limits in
the future with similar issues and multiplexing enabling/disabling these
power limits via a single power1_enable file will not provide sufficient
granularity for our purposes.

Also, I had previously posted this patch:

https://patchwork.freedesktop.org/patch/522612/?series=113972&rev=1

which avoids the power1_max_enable file and instead uses a power1_max value
of -1 to indicate that the power1_max limit is disabled.

So in summary we have the following options:

1. Go with power1_max_enable (preferred, works well for us)
2. Go with -1 to indicate that the power1_max limit is disabled
   (non-intuitive and also a little ugly)
3. Go with power1_enable (possible but confusing because multiple power
   limits/entities are multiplexed via a single file)
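
For illustration, the difference between options 1 and 2 can be sketched
from the point of view of a userspace reader. This is a minimal sketch, not
i915 code: the function names and the PL1_UNLIMITED marker are hypothetical,
and the only assumption taken from the discussion above is that a disabled
limit means "no effective limit" rather than "0 microwatts".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker for "no effective limit"; not part of any uapi. */
#define PL1_UNLIMITED INT64_MAX

/*
 * Option 1: a separate enable flag (power1_max_enable) qualifies the
 * value read from power1_max. A disabled limit is treated as "no limit",
 * whatever number power1_max happens to show.
 */
static int64_t pl1_effective_with_enable(int64_t power1_max_uw, bool enabled)
{
	return enabled ? power1_max_uw : PL1_UNLIMITED;
}

/*
 * Option 2: power1_max itself reads back as -1 when the limit is
 * disabled, so every reader has to special-case that value.
 */
static int64_t pl1_effective_with_sentinel(int64_t power1_max_uw)
{
	return power1_max_uw == -1 ? PL1_UNLIMITED : power1_max_uw;
}
```

With option 1, a reader that sees power1_max == 0 and power1_max_enable == 0
does not mistake the value for a very low limit. With option 2 the same
information is carried in-band by the -1 value.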

If you still think we should not use power1_max_enable I think I might drop
this patch for now (I am trying to preempt future issues but in this case
it's better to wait till people actually complain rather than expose a
non-ideal uapi).

Even if we drop this patch now, it would be good to know your preference in
case we need to revisit this issue later.

Thanks and also sorry for the rather long-winded email.

Ashutosh

> Guenter
>
> > +Date:  May 2023
> > +KernelVersion: 6.3
> > +Contact:   intel-...@lists.freedesktop.org
> > +Description:   RW. Enable/disable the PL1 power limit (power1_max).
> > +
> > +   Only supported for particular Intel i915 graphics platforms.
> >   What: /sys/devices/.../hwmon/hwmon/power1_rated_max
> >   Date: February 2023
> >   KernelVersion:6.2
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 7c20a6f47b92e..5665869d8602b 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -230,13 +230,52 @@ hwm_power1_max_interval_store(struct device *dev,
> > PKG_PWR_LIM_1_TIME, rxy);
> > return count;
> >   }
> > +static SENSOR_DEVICE_ATTR_RW(power1_max_interval, hwm_power1_max_interval, 
> > 0);
> >   -static SENSOR_DEVICE_ATTR(power1_max_interval, 0664,
> > - hwm_power1_max_interval_show,
> > - hwm_power1_max_interval_store, 0);
> > +static ssize_t
> > +hwm_power1_max_enable_show(struct device *dev, struct device_attribute 
> > *attr, char *buf)
> > +{
> > +   struct hwm_drvdata *ddat = dev_get_drvdata(dev);
> > +   intel_wakeref_t wakeref;
> > +   

Re: [PATCH] drm/i915/hwmon: Enable PL1 power limit

2023-02-07 Thread Dixit, Ashutosh
On Tue, 07 Feb 2023 08:12:25 -0800, Dixit, Ashutosh wrote:
>
> On Tue, 07 Feb 2023 01:32:44 -0800, Matthew Auld wrote:
> >
> > On Fri, 3 Feb 2023 at 15:54, Ashutosh Dixit  
> > wrote:
> > >
> > > Previous documentation suggested that PL1 power limit is always
> > > enabled. However we now find this not to be the case on some
> > > platforms (such as ATSM). Therefore enable PL1 power limit during hwmon
> > > initialization.
> >
> > For some reason it looks like this change is impacting the atsm in CI:
> > https://intel-gfx-ci.01.org/tree/drm-tip/bat-atsm-1.html
>
> Hmm, the change was meant for ATSM. Anyway let me try to get hold of an
> ATSM and see if I can figure out what might be going on with these
> seemingly unrelated failures and if I can repro them locally. Thanks!

Rodrigo/Matt,

I am proposing we revert this now and remerge after investigating. Even
getting hold of ATSM systems is not easy, so the investigation might take a
few days. What do you guys think?

Thanks.
--
Ashutosh


>
> >
> > >
> > > Bspec: 51864
> > >
> > > v2: Add Bspec reference (Gwan-gyeong)
> > > v3: Add Fixes tag
> > >
> > > Fixes: 99f55efb79114 ("drm/i915/hwmon: Power PL1 limit and TDP setting")
> > > Signed-off-by: Ashutosh Dixit 
> > > Reviewed-by: Gwan-gyeong Mun 
> > > ---
> > >  drivers/gpu/drm/i915/i915_hwmon.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > > b/drivers/gpu/drm/i915/i915_hwmon.c
> > > index 1225bc432f0d5..4683a5b96eff1 100644
> > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > @@ -687,6 +687,11 @@ hwm_get_preregistration_info(struct drm_i915_private 
> > > *i915)
> > > for_each_gt(gt, i915, i)
> > > > hwm_energy(&hwmon->ddat_gt[i], &energy);
> > > }
> > > +
> > > +   /* Enable PL1 power limit */
> > > +   if (i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > > +   hwm_locked_with_pm_intel_uncore_rmw(ddat, 
> > > hwmon->rg.pkg_rapl_limit,
> > > +   PKG_PWR_LIM_1_EN, 
> > > PKG_PWR_LIM_1_EN);
> > >  }
> > >
> > >  void i915_hwmon_register(struct drm_i915_private *i915)
> > > --
> > > 2.38.0
> > >


Re: [PATCH] drm/i915/hwmon: Enable PL1 power limit

2023-02-07 Thread Dixit, Ashutosh
On Tue, 07 Feb 2023 01:32:44 -0800, Matthew Auld wrote:
>
> On Fri, 3 Feb 2023 at 15:54, Ashutosh Dixit  wrote:
> >
> > Previous documentation suggested that PL1 power limit is always
> > enabled. However we now find this not to be the case on some
> > platforms (such as ATSM). Therefore enable PL1 power limit during hwmon
> > initialization.
>
> For some reason it looks like this change is impacting the atsm in CI:
> https://intel-gfx-ci.01.org/tree/drm-tip/bat-atsm-1.html

Hmm, the change was meant for ATSM. Anyway let me try to get hold of an
ATSM and see if I can figure out what might be going on with these
seemingly unrelated failures and if I can repro them locally. Thanks!

>
> >
> > Bspec: 51864
> >
> > v2: Add Bspec reference (Gwan-gyeong)
> > v3: Add Fixes tag
> >
> > Fixes: 99f55efb79114 ("drm/i915/hwmon: Power PL1 limit and TDP setting")
> > Signed-off-by: Ashutosh Dixit 
> > Reviewed-by: Gwan-gyeong Mun 
> > ---
> >  drivers/gpu/drm/i915/i915_hwmon.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 1225bc432f0d5..4683a5b96eff1 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -687,6 +687,11 @@ hwm_get_preregistration_info(struct drm_i915_private 
> > *i915)
> > for_each_gt(gt, i915, i)
> > > hwm_energy(&hwmon->ddat_gt[i], &energy);
> > }
> > +
> > +   /* Enable PL1 power limit */
> > +   if (i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > +   hwm_locked_with_pm_intel_uncore_rmw(ddat, 
> > hwmon->rg.pkg_rapl_limit,
> > +   PKG_PWR_LIM_1_EN, 
> > PKG_PWR_LIM_1_EN);
> >  }
> >
> >  void i915_hwmon_register(struct drm_i915_private *i915)
> > --
> > 2.38.0
> >


Re: [Intel-gfx] [PATCH] drm/i915/mtl: Connect root sysfs entries to GT0

2023-01-12 Thread Dixit, Ashutosh
On Thu, 12 Jan 2023 20:26:34 -0800, Belgaumkar, Vinay wrote:
>
> I think the ABI was changed by the patch mentioned in the commit
> (a8a4f0467d70).

The ABI was originally changed in 80cf8af17af04 and 56a709cf77468.


Re: [Intel-gfx] [PATCH] drm/i915/mtl: Connect root sysfs entries to GT0

2023-01-12 Thread Dixit, Ashutosh
On Thu, 12 Jan 2023 18:27:52 -0800, Vinay Belgaumkar wrote:
>
> Reading current root sysfs entries gives a min/max of all
> GTs. Updating this so we return default (GT0) values when root
> level sysfs entries are accessed, instead of min/max for the card.
> Tests that are not multi GT capable will read incorrect sysfs
> values without this change on multi-GT platforms like MTL.
>
> Fixes: a8a4f0467d70 ("drm/i915: Fix CFI violations in gt_sysfs")

We seem to be proposing to change the previous sysfs ABI with this patch?
But even then it doesn't seem correct to use gt0 values for device level
sysfs. Actually I received the following comment about using max freq
across gt's for device level freq's (gt_act_freq_mhz etc.) from one of our
users:

-
On Sun, 06 Nov 2022 08:54:04 -0800, Lawson, Lowren H wrote:

Why show maximum? Wouldn’t average be more accurate to the user experience?

As a user, I expect the ‘card’ frequency to be relatively accurate to the
entire card. If I see 1.6GHz, but the card is behaving as if it’s running a
1.0 & 1.6GHz on the different compute tiles, I’m going to see a massive
decrease in compute workload performance while at ‘maximum’ frequency.
-

So I am not sure why max/min were previously chosen. Why not the average?
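
To make the question concrete, here is a minimal standalone sketch of the
three candidate aggregations over per-GT frequencies. It is not the i915
sysfs code; the struct and function names are hypothetical, and the 1.0 and
1.6 GHz inputs are simply the figures from the comment quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of per-GT frequencies, all in MHz. */
struct freq_summary {
	uint32_t min_mhz;
	uint32_t max_mhz;
	uint32_t avg_mhz;
};

/* Summarise n per-GT frequencies; n must be at least 1. */
static struct freq_summary summarize_gt_freqs(const uint32_t *mhz, size_t n)
{
	struct freq_summary s = { mhz[0], mhz[0], 0 };
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (mhz[i] < s.min_mhz)
			s.min_mhz = mhz[i];
		if (mhz[i] > s.max_mhz)
			s.max_mhz = mhz[i];
		sum += mhz[i];
	}
	s.avg_mhz = (uint32_t)(sum / n);

	return s;
}
```

For two tiles at 1000 and 1600 MHz, the max reports 1600, the min 1000 and
the average 1300, which shows how far a single device level number can be
from what either tile is doing.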

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Allow SLPC to use efficient frequency

2022-12-09 Thread Dixit, Ashutosh
On Sun, 14 Aug 2022 16:46:54 -0700, Vinay Belgaumkar wrote:
>
> Host Turbo operates at efficient frequency when GT is not idle unless
> the user or workload has forced it to a higher level. Replicate the same
> behavior in SLPC by allowing the algorithm to use efficient frequency.
> We had disabled it during boot due to concerns that it might break
> kernel ABI for min frequency. However, this is not the case since
> SLPC will still abide by the (min,max) range limits.

This change seems to have broken the i915 kernel ABI for min frequency for
DG2. Tvrtko pointed this out here:

https://patchwork.freedesktop.org/patch/512274/?series=110574&rev=3

These bugs are the result of that ABI break:

Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6806
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6786

On DG2 when we set min == max freq, we see the GPU running not at the set
min == max freq but at efficient freq (different from the set freq).
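
The expectation being violated can be stated as a simple invariant. The
sketch below is a standalone illustration with hypothetical names and
values, not SLPC or GuC code: with min == max the allowed range contains
exactly one value, so any actual frequency that differs from it (such as an
efficient frequency) falls outside the range.

```c
#include <stdint.h>

/* Clamp a requested frequency to the user-set [min, max] range. */
static uint32_t clamp_freq(uint32_t req, uint32_t min, uint32_t max)
{
	if (req < min)
		return min;
	if (req > max)
		return max;
	return req;
}

/* Nonzero if the observed frequency honours the user-set range. */
static int freq_within_user_range(uint32_t actual, uint32_t min, uint32_t max)
{
	return actual >= min && actual <= max;
}
```

With min == max == 2050 (an arbitrary example value), clamp_freq() can only
ever return 2050, and freq_within_user_range() fails for anything else.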

We are still trying to see if the ABI can be salvaged but something is
definitely wrong at present.

Thanks.
--
Ashutosh


Re: [PATCH v3] drm/i915/mtl: Enable Idle Messaging for GSC CS

2022-11-18 Thread Dixit, Ashutosh
On Fri, 18 Nov 2022 10:37:37 -0800, Vivi, Rodrigo wrote:
>
> On Sat, 2022-11-19 at 00:03 +0530, Badal Nilawar wrote:
> > From: Vinay Belgaumkar 
> >
> > By default idle messaging is disabled for GSC CS, so to unblock RC6
> > entry on media tile idle messaging needs to be enabled.
> >
> > v2:
> >  - Fix review comments (Vinay)
> >  - Set GSC idle hysteresis as per spec (Badal)
> > v3:
> >  - Fix review comments (Rodrigo)
> >
> > Bspec: 71496
> >
> > Cc: Daniele Ceraolo Spurio 
> > Signed-off-by: Vinay Belgaumkar 
> > Signed-off-by: Badal Nilawar 
> > Reviewed-by: Vinay Belgaumkar 
>
> He is the author of the patch, no?!
> or you can remove this or change the author to be you and keep his
> reviewed-by...
>
> or I can just remove his rv-b while merging.. just let me know..

Not sure if that is the case here, but when multiple people contribute to a
patch, the original author can review changes by others and add his
Reviewed-by, no? Or are we saying it is redundant for the author to add his
R-b?

Similarly, are S-o-b and R-b by the same person ok? I add changes to
someone's patch so add my S-o-b but also review other's changes so add my
R-b? Sometimes finding a 3rd person to add a R-b is hard. But two people
can contribute to a patch and review each other's changes so add both their
S-o-b's and R-b's or no?

:)

Ashutosh



Re: [Intel-gfx] [PATCH v5 5/7] drm/i915/gt: Create per-tile RC6 sysfs interface

2022-11-06 Thread Dixit, Ashutosh
On Tue, 22 Feb 2022 00:57:02 -0800, Andi Shyti wrote:
>

Old thread, new comment below at the bottom. Please take a look. Thanks.

> Hi Tvrtko and Joonas,
>
> > > > > > Now tiles have their own sysfs interfaces under the gt/
> > > > > > directory. Because RC6 is a property that can be configured on a
> > > > > > tile basis, then each tile should have its own interface
> > > > > >
> > > > > > The new sysfs structure will have a similar layout for the 4 tile
> > > > > > case:
> > > > > >
> > > > > > /sys/.../card0
> > > > > >    ├── gt
> > > > > >    │   ├── gt0
> > > > > >    │   │   ├── id
> > > > > >    │   │   ├── rc6_enable
> > > > > >    │   │   ├── rc6_residency_ms
> > > > > >    .   .   .
> > > > > >    .   .   .
> > > > > >    .   .
> > > > > >    │   └── gtN
> > > > > >    │       ├── id
> > > > > >    │       ├── rc6_enable
> > > > > >    │       ├── rc6_residency_ms
> > > > > >    │       .
> > > > > >    │       .
> > > > > >    │
> > > > > >    └── power/                -+
> > > > > >         ├── rc6_enable        |    Original interface
> > > > > >         ├── rc6_residency_ms  +->  kept as existing ABI;
> > > > > >         .                     |    it multiplexes over
> > > > > >         .                     |    the GTs
> > > > > >                              -+
> > > > > >
> > > > > > The existing interfaces have been kept in their original location
> > > > > > to preserve the existing ABI. They act on all the GTs: when
> > > > > > reading they provide the average value from all the GTs.
> > > > >
> > > > > Average feels very odd to me. I'd ask if we can get away providing an 
> > > > > errno
> > > > > instead? Or tile zero data?
> > >
> > > Tile zero data is always wrong, in my opinion. If we have round-robin
> > > scaling workloads like some media cases, part of the system load might
> > > just disappear when it goes to tile 1.
> >
> > I was thinking that in conjunction with deprecated log message it wouldn't
> > be wrong - I mean if the route taken was to eventually retire the legacy
> > files altogether.
>
> that's a good point... do we want to treat the legacy interfaces
> as an error or do we want to make them a feature? As the
> discussion is turning those interfaces are becoming a feature.
> But what are we going to do with the coming interfaces?
>
> E.g. in the future we will have the rc6_enable/disable that can
> be a command, so that we will add the "_store" interface per
> tile. What are we going to do with the above interfaces? Are we
> going to add a multiplexed command as well?
>
> > > When we have frequency readbacks without control, returning MAX() across
> > > tiles would be the logical thing. The fact that parts of the hardware can
> > > be clocked lower when one part is fully utilized is the "new feature".
> > >
> > > After that we're only really left with the rc6_residency_ms. And that is
> > > the tough one. I'm inclined that MIN() across tiles would be the right
> > > answer. If you are fully utilizing a single tile, you should be able to
> > > see it.
>
> >  So we have MIN, AVG or SUM, or errno, or remove the file (which is
> > just a different kind of errno?) to choose from. :)
>
> in this case it would just be MIN and MAX. At the end we have
> here only two types of interface: frequencies and residency_ms.
> For the first type we would use 'max', for the second 'min'.

We have the comment below from Lowren about showing MAX for freq. Could
someone please reply? Thanks.

On Sun, 06 Nov 2022 08:54:04 -0800, Lawson, Lowren H wrote:

Why show maximum?  Wouldn’t average be more accurate to the user
experience?

As a user, I expect the ‘card’ frequency to be relatively accurate to the
entire card.  If I see 1.6GHz, but the card is behaving as if it’s
running a 1.0 & 1.6GHz on the different compute tiles, I’m going to see a
massive decrease in compute workload performance while at ‘maximum’
frequency.


Re: [PATCH] drm/i915/hwmon: Don't use FIELD_PREP

2022-11-02 Thread Dixit, Ashutosh
On Tue, 01 Nov 2022 03:58:13 -0700, Jani Nikula wrote:
>
> On Mon, 31 Oct 2022, Ashutosh Dixit  wrote:
> > FIELD_PREP and REG_FIELD_PREP have checks requiring a compile time constant
> > mask. When the mask comes in as the argument of a function these checks can
> > fail depending on the compiler (gcc vs clang), optimization level,
> > etc. Use a simpler version of FIELD_PREP which skips these checks. The
> > checks are not needed because the mask is formed using REG_GENMASK (so is
> > actually a compile time constant).
> >
> > v2: Split REG_FIELD_PREP into a macro with checks and one without and use
> > the one without checks in i915_hwmon.c (Gwan-gyeong Mun)
>
> I frankly think you're solving the wrong problem here. See [1].

We can consider the sort of refactoring suggested in [1] in the future.
For now I thought I would offer what in my opinion is the correct way to fix
the clang compile break incrementally with the current code. But otherwise
feel free to go with whatever you think is the correct course of action for
this issue. Even if we don't fix the issue, the clang guys will (as they
have in the past).
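
As a standalone illustration of what a "FIELD_PREP without the compile time
checks" amounts to, here is a minimal userspace sketch. It is not the kernel
macro: the function names are hypothetical, and it assumes a gcc or clang
compiler for __builtin_ctz. The mask is an ordinary runtime argument, so
nothing here depends on the compiler proving it constant.

```c
#include <stdint.h>

/* Bit position of the lowest set bit in mask; mask must be non-zero. */
static inline unsigned int field_shift(uint32_t mask)
{
	return (unsigned int)__builtin_ctz(mask);
}

/* Shift val into the field described by mask and drop overflowing bits. */
static inline uint32_t field_prep_rt(uint32_t mask, uint32_t val)
{
	return (val << field_shift(mask)) & mask;
}

/* Extract the field described by mask from a register value. */
static inline uint32_t field_get_rt(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> field_shift(mask);
}
```

The trade-off is the one discussed above: a bad mask (zero, or not a
contiguous run of bits) is no longer rejected at build time, so the caller
has to guarantee it by construction.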

Thanks.
--
Ashutosh

> [1] https://lore.kernel.org/r/87leov7yix@intel.com
>
> >
> > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7354
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >  drivers/gpu/drm/i915/i915_hwmon.c|  2 +-
> >  drivers/gpu/drm/i915/i915_reg_defs.h | 17 +++--
> >  2 files changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9e97814930254..ae435b035229a 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -112,7 +112,7 @@ hwm_field_scale_and_write(struct hwm_drvdata *ddat, 
> > i915_reg_t rgadr,
> > nval = DIV_ROUND_CLOSEST_ULL((u64)lval << nshift, scale_factor);
> >
> > bits_to_clear = field_msk;
> > -   bits_to_set = FIELD_PREP(field_msk, nval);
> > +   bits_to_set = __REG_FIELD_PREP(field_msk, nval);
> >
> > hwm_locked_with_pm_intel_uncore_rmw(ddat, rgadr,
> > bits_to_clear, bits_to_set);
> > diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h 
> > b/drivers/gpu/drm/i915/i915_reg_defs.h
> > index f1859046a9c48..dddacc8d48928 100644
> > --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> > +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> > @@ -67,12 +67,17 @@
> >   *
> >   * @return: @__val masked and shifted into the field defined by @__mask.
> >   */
> > > -#define REG_FIELD_PREP(__mask, __val) \
> > > -   ((u32)((((typeof(__mask))(__val) << __bf_shf(__mask)) & (__mask)) + \
> > > -  BUILD_BUG_ON_ZERO(!__is_constexpr(__mask)) + \
> > > -  BUILD_BUG_ON_ZERO((__mask) == 0 || (__mask) > U32_MAX) + \
> > > -  BUILD_BUG_ON_ZERO(!IS_POWER_OF_2((__mask) + (1ULL << __bf_shf(__mask)))) + \
> > > -  BUILD_BUG_ON_ZERO(__builtin_choose_expr(__is_constexpr(__val), (~((__mask) >> __bf_shf(__mask)) & (__val)), 0))))
> > > +#define __REG_FIELD_PREP_CHK(__mask, __val) \
> > > +   (BUILD_BUG_ON_ZERO(!__is_constexpr(__mask)) + \
> > > +BUILD_BUG_ON_ZERO((__mask) == 0 || (__mask) > U32_MAX) + \
> > > +BUILD_BUG_ON_ZERO(!IS_POWER_OF_2((__mask) + (1ULL << __bf_shf(__mask)))) + \
> > > +BUILD_BUG_ON_ZERO(__builtin_choose_expr(__is_constexpr(__val), (~((__mask) >> __bf_shf(__mask)) & (__val)), 0)))
> > > +
> > > +#define __REG_FIELD_PREP(__mask, __val) \
> > > +   ((u32)((((typeof(__mask))(__val) << __bf_shf(__mask)) & (__mask))))
> > > +
> > > +#define REG_FIELD_PREP(__mask, __val) \
> > > +   (__REG_FIELD_PREP(__mask, __val) + __REG_FIELD_PREP_CHK(__mask, __val))
> >  /**
> >   * REG_FIELD_GET() - Extract a u32 bitfield value


Re: [Intel-gfx] [PATCH v4] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-25 Thread Dixit, Ashutosh
On Mon, 24 Oct 2022 15:54:53 -0700, Vinay Belgaumkar wrote:
>
> GuC will set the min/max frequencies to theoretical max on
> ATS-M. This will break kernel ABI, so limit min/max frequency
> to RP0(platform max) instead.
>
> Also modify the SLPC selftest to update the min frequency
> when we have a server part so that we can iterate between
> platform min and max.
>
> v2: Check softlimits instead of platform limits (Riana)
> v3: More review comments (Ashutosh)
> v4: No need to use saved_min_freq and other comments (Ashutosh)

OK, much better now overall.

> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7030
>
> +static void update_server_min_softlimit(struct intel_guc_slpc *slpc)
> +{
> + /* For server parts, SLPC min will be at RPMax.
> +  * Use min softlimit to clamp it to RP0 instead.
> +  */
> + if (!slpc->min_freq_softlimit &&
> + is_slpc_min_freq_rpmax(slpc)) {
> + slpc->min_is_rpmax = true;

The only remaining issue is slpc->min_is_rpmax is now set but never used so
it can possibly be removed, or retained for debuggability (I think it's a
fair reason to retain it). Though I am not sure if we will hit a "variable
set but never used" error from these clever compilers.

> + slpc->min_freq_softlimit = slpc->rp0_freq;
> + (slpc_to_gt(slpc))->defaults.min_freq = 
> slpc->min_freq_softlimit;
> + }
> +}

In any case, this is now:

Reviewed-by: Ashutosh Dixit 


Re: [PATCH 5/5] drm/i915/mtl: C6 residency and C state type for MTL SAMedia

2022-10-24 Thread Dixit, Ashutosh
On Fri, 21 Oct 2022 09:35:32 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Wed, Oct 19, 2022 at 04:37:21PM -0700, Ashutosh Dixit wrote:
> > From: Badal Nilawar 
> >
> > Add support for C6 residency and C state type for MTL SAMedia. Also add
> > mtl_drpc.
>
> I believe this patch deserves a slip between the actual support and
> the debugfs, but I'm late to the review, so feel free to ignore this
> comment...

Sorry didn't understand what you mean by "slip", you mean the patch should
be split in two?

> but I do have more dummy doubts below:
>
> >
> > v2: Fixed review comments (Ashutosh)
> > v3: Sort registers and fix whitespace errors in intel_gt_regs.h (Matt R)
> > Remove MTL_CC_SHIFT (Ashutosh)
> > Adapt to RC6 residency register code refactor (Jani N)
> > v4: Move MTL branch to top in drpc_show
> > v5: Use FORCEWAKE_MT identical to gen6_drpc (Ashutosh)
> >
> > Signed-off-by: Ashutosh Dixit 
> > Signed-off-by: Badal Nilawar 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c | 58 ++-
> >  drivers/gpu/drm/i915/gt/intel_gt_regs.h   |  5 ++
> >  drivers/gpu/drm/i915/gt/intel_rc6.c   | 17 --
> >  3 files changed, 75 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> > index 5d6b346831393..f15a7486a9866 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> > @@ -256,6 +256,60 @@ static int ilk_drpc(struct seq_file *m)
> > return 0;
> >  }
> >
> > +static int mtl_drpc(struct seq_file *m)
> > +{
> > +   struct intel_gt *gt = m->private;
> > +   struct intel_uncore *uncore = gt->uncore;
> > +   u32 gt_core_status, rcctl1, mt_fwake_req;
> > +   u32 mtl_powergate_enable = 0, mtl_powergate_status = 0;
> > +
> > +   mt_fwake_req = intel_uncore_read_fw(uncore, FORCEWAKE_MT);
> > +   gt_core_status = intel_uncore_read(uncore, MTL_MIRROR_TARGET_WP1);
> > +
> > +   rcctl1 = intel_uncore_read(uncore, GEN6_RC_CONTROL);
> > +   mtl_powergate_enable = intel_uncore_read(uncore, GEN9_PG_ENABLE);
> > +   mtl_powergate_status = intel_uncore_read(uncore,
> > +GEN9_PWRGT_DOMAIN_STATUS);
> > +
> > +   seq_printf(m, "RC6 Enabled: %s\n",
> > +  str_yes_no(rcctl1 & GEN6_RC_CTL_RC6_ENABLE));
> > +   if (gt->type == GT_MEDIA) {
> > +   seq_printf(m, "Media Well Gating Enabled: %s\n",
> > +  str_yes_no(mtl_powergate_enable & 
> > GEN9_MEDIA_PG_ENABLE));
> > +   } else {
> > +   seq_printf(m, "Render Well Gating Enabled: %s\n",
> > +  str_yes_no(mtl_powergate_enable & 
> > GEN9_RENDER_PG_ENABLE));
> > +   }
> > +
> > +   seq_puts(m, "Current RC state: ");
>
> (Just a "loud" thought here in this chunk, but no actual action requested)
>
> should we really use "R" (Render) for this Media C state?

This function is called for both render and media gt's. But let's think
about this. We can easily call them e.g. RC6 for render and MC6 for
media too if that is more accurate and descriptive. On the other hand, do
we really need to introduce a new term like MC6? Maybe we just stick to
RC/RC6 terminology for anything on the GPU?

> But well, MC6 seems to be a totally different thing

MC6 is not the same as RC6 for the media tile?

> and CC6 is really strange because the C stands for Core and this can get
> very confusing with the SoC or CPU C states...  :(

Yes Bspec 66300 refers to these as core C states but refers to GT and
IA. So it's confusing.

> At least with the Render we know which level of the IP we
> are looking at when looking at media...

Yup that's why I've left this as RC/RC6 in Patch v6.

>
> > +   switch (REG_FIELD_GET(MTL_CC_MASK, gt_core_status)) {
> > +   case MTL_CC0:
> > +   seq_puts(m, "on\n");
>
> maybe "*C0" instead of "on"?

Done in v6. Though this string is "on" also in the previous function
gen6_drpc. Also, if we are calling this C0 we could call the C6 state as
just C6 (which would mean RC6 for render and MC6 for media). But I thought
RC6 is better for both render and media.

>
> > +   break;
> > +   case MTL_CC6:
> > +   seq_puts(m, "RC6\n");
> > +   break;
> > +   default:
> > +   seq_puts(m, "Unknown\n");
>
> maybe use a MISSING_CASE() here?
> or raise a WARN?

Done in v6.

>
> > +   break;
> > +   }
> > +
> > +   seq_printf(m, "Multi-threaded Forcewake Request: 0x%x\n", mt_fwake_req);
> > +   if (gt->type == GT_MEDIA)
> > +   seq_printf(m, "Media Power Well: %s\n",
> > +  (mtl_powergate_status &
> > +   GEN9_PWRGT_MEDIA_STATUS_MASK) ? "Up" : "Down");
>
> gate is up and power is down or gate is down and power is up?

Yes name is confusing but is the same as Bspec and also gen6_drpc. So the
prints "Media Power Well: Up" or "Media Power Well: Down" are correct (0 is
down, 1 is up). 

Re: [Intel-gfx] [PATCH v4] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-22 Thread Dixit, Ashutosh
On Sat, 22 Oct 2022 10:56:03 -0700, Belgaumkar, Vinay wrote:
>

Hi Vinay,

> >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> >> b/drivers/gpu/drm/i915/gt/intel_rps.c
> >> index fc23c562d9b2..32e1f5dde5bb 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> >> @@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq)
> >>if (rps_uses_slpc(rps)) {
> >>slpc = rps_to_slpc(rps);
> >>
> >> +  if (slpc->min_freq_softlimit == slpc->boost_freq)
> >> +  return;
> > nit but is it possible that 'slpc->min_freq_softlimit > slpc->boost_freq'
> > (looks possible to me from the code though we might not have intended it)?
> > Then we can change this to:
> >
> > if (slpc->min_freq_softlimit >= slpc->boost_freq)
> > return;

Any comment about this? It looks clearly possible to me from the code.

So with the above change this is:

Reviewed-by: Ashutosh Dixit 


Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-21 Thread Dixit, Ashutosh
On Fri, 21 Oct 2022 18:38:57 -0700, Belgaumkar, Vinay wrote:
> On 10/20/2022 3:57 PM, Dixit, Ashutosh wrote:
> > On Tue, 18 Oct 2022 11:30:31 -0700, Vinay Belgaumkar wrote:
> > Hi Vinay,
> >
> >> diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
> >> b/drivers/gpu/drm/i915/gt/selftest_slpc.c
> >> index 4c6e9257e593..e42bc215e54d 100644
> >> --- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
> >> +++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
> >> @@ -234,6 +234,7 @@ static int run_test(struct intel_gt *gt, int test_type)
> >>enum intel_engine_id id;
> >>struct igt_spinner spin;
> >>u32 slpc_min_freq, slpc_max_freq;
> >> +  u32 saved_min_freq;
> >>int err = 0;
> >>
> >>if (!intel_uc_uses_guc_slpc(&gt->uc))
> >> @@ -252,20 +253,35 @@ static int run_test(struct intel_gt *gt, int 
> >> test_type)
> >>return -EIO;
> >>}
> >>
> >> -  /*
> >> -   * FIXME: With efficient frequency enabled, GuC can request
> >> -   * frequencies higher than the SLPC max. While this is fixed
> >> -   * in GuC, we level set these tests with RPn as min.
> >> -   */
> >> -  err = slpc_set_min_freq(slpc, slpc->min_freq);
> >> -  if (err)
> >> -  return err;
> >> +  if (slpc_min_freq == slpc_max_freq) {
> >> +  /* Server parts will have min/max clamped to RP0 */
> >> +  if (slpc->min_is_rpmax) {
> >> +  err = slpc_set_min_freq(slpc, slpc->min_freq);
> >> +  if (err) {
> >> +  pr_err("Unable to update min freq on server 
> >> part");
> >> +  return err;
> >> +  }
> >>
> >> -  if (slpc->min_freq == slpc->rp0_freq) {
> >> -  pr_err("Min/Max are fused to the same value\n");
> >> -  return -EINVAL;
> >> +  } else {
> >> +  pr_err("Min/Max are fused to the same value\n");
> >> +  return -EINVAL;
> > Sorry but I am not following this else case here. Why are we saying min/max
> > are fused to the same value? In this case we can't do
> > "slpc_set_min_freq(slpc, slpc->min_freq)" ? That is, we can't change SLPC
> > min freq?
>
> This would be an error case due to a faulty part. We may come across a part
> where min/max is fused to the same value.

But even then the original check is much clearer since it is actually
comparing the fused freq's:

if (slpc->min_freq == slpc->rp0_freq)

Because if min/max have been changed slpc_min_freq and slpc_max_freq are no
longer fused freq.

Also, this check should be right at the top of run_test, right after the
if (!intel_uc_uses_guc_slpc) check, rather than in the middle here. We are
basically not doing any error rewinding, so returning from the middle causes
memory leaks if any of the functions returns an error.
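
The ordering argument can be sketched in isolation. This is a hypothetical
standalone example, not the selftest: struct fake_slpc, run_test_sketch()
and the live_allocs counter are made up for illustration, and malloc()
stands in for whatever resources the real test sets up. Checks that need no
cleanup go first; everything after the first acquisition leaves through a
single unwind path.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fused limits; not the i915 structures. */
struct fake_slpc {
	uint32_t min_freq;	/* fused minimum (RPn) */
	uint32_t rp0_freq;	/* fused maximum (RP0) */
};

/* Counts live allocations so that a caller can check for leaks. */
static int live_allocs;

static int run_test_sketch(const struct fake_slpc *slpc, int fail_midway)
{
	void *spinner;
	int err = 0;

	/* Cheap validity check first: nothing to unwind if it fails. */
	if (slpc->min_freq == slpc->rp0_freq)
		return -EINVAL;

	spinner = malloc(64);
	if (!spinner)
		return -ENOMEM;
	live_allocs++;

	if (fail_midway) {
		err = -EIO;
		goto out;	/* later failures share one exit path */
	}

out:
	free(spinner);
	live_allocs--;
	return err;
}
```

A bare "return err" after the allocation would skip the cleanup, which is
the kind of leak the comment above is concerned about.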

> >>+   }
> >>+   } else {
> >>+   /*
> >>+* FIXME: With efficient frequency enabled, GuC can request
> >>+* frequencies higher than the SLPC max. While this is fixed
> >>+* in GuC, we level set these tests with RPn as min.
> >>+*/
> >>+   err = slpc_set_min_freq(slpc, slpc->min_freq);
> >>+   if (err)
> >>+   return err;
> >>}

So let's do what is suggested above and then see what remains here and if
we need all these code changes. Most likely we can just do unconditionally
what we were doing before, i.e.:

err = slpc_set_min_freq(slpc, slpc->min_freq);
if (err)
return err;

> >>
> >>+   saved_min_freq = slpc_min_freq;
> >>+
> >>+   /* New temp min freq = RPn */
> >>+   slpc_min_freq = slpc->min_freq;

Why do we need saved_min_freq? We can retain slpc_min_freq and in the check 
below:

if (max_act_freq <= slpc_min_freq)

We can just change the check to:

if (max_act_freq <= slpc->min_freq)

Looks like this may have been a bug in the original code?

> >>+
> >>intel_gt_pm_wait_for_idle(gt);
> >>intel_gt_pm_get(gt);
> >>for_each_engine(engine, gt, id) {
> >>@@ -347,7 +363,7 @@ static int run_test(struct intel_gt *gt, int test_type)
> >>
> >>/* Restore min/max frequencies */
> >>slpc_set_max_freq(slpc, slpc_max_freq);
> >>-   slpc_set_min_freq(slpc, slpc_min_freq);
> >>+   slpc_set_min_freq(slpc, saved_min_fre

Re: [Intel-gfx] [PATCH v4] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-21 Thread Dixit, Ashutosh
On Fri, 21 Oct 2022 17:24:52 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> Waitboost (when SLPC is enabled) results in a H2G message. This can result
> in thousands of messages during a stress test and fill up an already full
> CTB. There is no need to request for RP0 if boost_freq and the min softlimit
> are the same.
>
> v2: Add the tracing back, and check requested freq
> in the worker thread (Tvrtko)
> v3: Check requested freq in dec_waiters as well
> v4: Only check min_softlimit against boost_freq. Limit this
> optimization for server parts for now.

Sorry I didn't follow. Why are we saying limit this only to server? This:

if (slpc->min_freq_softlimit == slpc->boost_freq)
return;

The condition above should work for client too if it is true? But yes it is
typically true automatically for server but not for client. Is that what
you mean?

>
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/intel_rps.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> b/drivers/gpu/drm/i915/gt/intel_rps.c
> index fc23c562d9b2..32e1f5dde5bb 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq)
>   if (rps_uses_slpc(rps)) {
>   slpc = rps_to_slpc(rps);
>
> + if (slpc->min_freq_softlimit == slpc->boost_freq)
> + return;

nit but is it possible that 'slpc->min_freq_softlimit > slpc->boost_freq'
(looks possible to me from the code though we might not have intended it)?
Then we can change this to:

if (slpc->min_freq_softlimit >= slpc->boost_freq)
return;
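
To make the difference between the two checks concrete, here is a small
standalone sketch. It is not driver code: struct slpc_model and
boost_is_redundant() are made-up names for illustration, only the two field
names come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the two intel_guc_slpc fields used by the check */
struct slpc_model {
	uint32_t min_freq_softlimit;	/* MHz */
	uint32_t boost_freq;		/* MHz */
};

/*
 * Returns true when a waitboost request would not raise the frequency
 * floor, i.e. when the min softlimit is already at or above boost freq.
 * Using >= rather than == also covers min_freq_softlimit > boost_freq.
 */
static bool boost_is_redundant(const struct slpc_model *slpc)
{
	return slpc->min_freq_softlimit >= slpc->boost_freq;
}
```

With '==' the third case (softlimit above boost freq) would still send the
H2G message even though it cannot raise the floor.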


> +
>   /* Return if old value is non zero */
> - if (!atomic_fetch_inc(&slpc->num_waiters))
> + if (!atomic_fetch_inc(&slpc->num_waiters)) {
> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
> +  rq->fence.context, rq->fence.seqno);

Another possibility would have been to add the trace to slpc_boost_work but
this matches host turbo so I think it is fine here.

>   schedule_work(&slpc->boost_work);
> + }
>
>   return;
>   }

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-21 Thread Dixit, Ashutosh
On Fri, 21 Oct 2022 11:24:42 -0700, Belgaumkar, Vinay wrote:
>
>
> On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote:
> > On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote:
> >> On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote:
> >>> On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote:
> >>> Hi Vinay,
> >>>
> >>>> Waitboost (when SLPC is enabled) results in a H2G message. This can 
> >>>> result
> >>>> in thousands of messages during a stress test and fill up an already full
> >>>> CTB. There is no need to request for RP0 if GuC is already requesting the
> >>>> same.
> >>> But how are we sure that the freq will remain at RP0 in the future (when
> >>> the waiting request or any requests which are ahead execute)?
> >>>
> >>> In the current waitboost implementation, set_param is sent to GuC ahead of
> >>> the waiting request to ensure that the freq would be max when this waiting
> >>> request executed on the GPU and the freq is kept at max till this request
> >>> retires (considering just one waiting request). How can we ensure this if
> >>> we don't send the waitboost set_param to GuC?
> >> There is no way to guarantee the frequency will remain at RP0 till the
> >> request retires. As a theoretical example, lets say the request boosted
> >> freq to RP0, but a user changed min freq using sysfs immediately after.
> > That would be a bug. If waitboost is in progress and in the middle user
> > changed min freq, I would expect the freq to revert to the new min only
> > after the waitboost phase was over.
>
> The problem here is that GuC is unaware of this "boosting"
> phenomenon. Setting the min_freq_softlimit as well to boost when we send a
> boost request might help with this issue.
>
> >
> > In any case, I am not referring to this case. Since FW controls the freq
> > there is nothing preventing FW to change the freq unless we raise min to
> > max which is what waitboost does.
> Ok, so maybe the solution here is to check if min_softlimit is already at
> boost freq, as it tracks the min freq changes. That should take care of
> server parts automatically as well.

Correct, yes that would be the right way to do it.

Thanks.
--
Ashutosh

> >> Waitboost is done by a pending request to "hurry" the current requests. If
> >> GT is already at boost frequency, that purpose is served.
> > FW can bring the freq down later before the waiting request is scheduled.
> >> Also, host algorithm already has this optimization as well.
> > Host turbo is different from SLPC. Host turbo controls the freq algorithm
> > so it knows freq will not come down till it itself brings the freq
> > down. Unlike SLPC where FW is controlling the freq. Therefore host turbo
> > doesn't ever need to do a MMIO read but only needs to refer to its own
> > state (rps->cur_freq etc.).
> True. Host algorithm has a periodic timer where it updates frequency. Here,
> it checks num_waiters and sets client_boost every time that is non-zero.
> >>> I had assumed we'll do this optimization for server parts where min is
> >>> already RP0 in which case we can completely disable waitboost. But this
> >>> patch is something else.
>
> Hopefully the softlimit changes above will help with client and server.
>
> Thanks,
>
> Vinay.
>
> > Thanks.
> > --
> > Ashutosh
> >
> >>>> v2: Add the tracing back, and check requested freq
> >>>> in the worker thread (Tvrtko)
> >>>> v3: Check requested freq in dec_waiters as well
> >>>>
> >>>> Signed-off-by: Vinay Belgaumkar 
> >>>> ---
> >>>>drivers/gpu/drm/i915/gt/intel_rps.c |  3 +++
> >>>>drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++---
> >>>>2 files changed, 14 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> >>>> b/drivers/gpu/drm/i915/gt/intel_rps.c
> >>>> index fc23c562d9b2..18b75cf08d1b 100644
> >>>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> >>>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> >>>> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
> >>>>  if (rps_uses_slpc(rps)) {
> >>>>  slpc = rps_to_slpc(rps);
> >>>>
> >>>> +GT_TRACE(rps_to_gt(rps), "boost 
> >>>> fenc

Re: [Intel-gfx] [PATCH 3/5] drm/i915/mtl: Modify CAGF functions for MTL

2022-10-21 Thread Dixit, Ashutosh
On Wed, 19 Oct 2022 16:37:19 -0700, Ashutosh Dixit wrote:
>
> From: Badal Nilawar 
>
> Update CAGF functions for MTL to get actual resolved frequency of 3D and
> SAMedia.
>
> v2: Update MTL_MIRROR_TARGET_WP1 position/formatting (MattR)
> Move MTL branches in cagf functions to top (MattR)
> Fix commit message (Andi)
> v3: Added comment about registers not needing forcewake for Gen12+ and
> returning 0 freq in RC6
> v4: Use REG_FIELD_GET and uncore (Rodrigo)
>
> Bspec: 66300

Reviewed-by: Ashutosh Dixit 

>
> Signed-off-by: Ashutosh Dixit 
> Signed-off-by: Badal Nilawar 
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h |  4 
>  drivers/gpu/drm/i915/gt/intel_rps.c | 12 ++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
> b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index f8c4f758ac0b1..d8dbd0ac3b064 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -21,6 +21,10 @@
>   */
>  #define PERF_REG(offset) _MMIO(offset)
>
> +/* MTL workpoint reg to get core C state and actual freq of 3D, SAMedia */
> +#define MTL_MIRROR_TARGET_WP1 _MMIO(0xc60)
> +#define   MTL_CAGF_MASK  REG_GENMASK(8, 0)
> +
>  /* RPM unit config (Gen8+) */
>  #define RPM_CONFIG0  _MMIO(0xd00)
>  #define   GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT  3
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> b/drivers/gpu/drm/i915/gt/intel_rps.c
> index da6b969f554b6..63cc7c538401e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -2093,7 +2093,9 @@ u32 intel_rps_get_cagf(struct intel_rps *rps, u32 
> rpstat)
>   struct drm_i915_private *i915 = rps_to_i915(rps);
>   u32 cagf;
>
> - if (GRAPHICS_VER(i915) >= 12)
> + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> + cagf = REG_FIELD_GET(MTL_CAGF_MASK, rpstat);
> + else if (GRAPHICS_VER(i915) >= 12)
>   cagf = REG_FIELD_GET(GEN12_CAGF_MASK, rpstat);
>   else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
>   cagf = REG_FIELD_GET(RPE_MASK, rpstat);
> @@ -2115,7 +2117,13 @@ static u32 read_cagf(struct intel_rps *rps)
>   struct intel_uncore *uncore = rps_to_uncore(rps);
>   u32 freq;
>
> - if (GRAPHICS_VER(i915) >= 12) {
> + /*
> +  * For Gen12+ reading freq from HW does not need a forcewake and
> +  * registers will return 0 freq when GT is in RC6
> +  */
> + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) {
> + freq = intel_uncore_read(uncore, MTL_MIRROR_TARGET_WP1);
> + } else if (GRAPHICS_VER(i915) >= 12) {
>   freq = intel_uncore_read(uncore, GEN12_RPSTAT1);
>   } else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) {
>   vlv_punit_get(i915);
> --
> 2.38.0
>


Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-20 Thread Dixit, Ashutosh
On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote:
>
> On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote:
> > On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote:
> > Hi Vinay,
> >
> >> Waitboost (when SLPC is enabled) results in a H2G message. This can result
> >> in thousands of messages during a stress test and fill up an already full
> >> CTB. There is no need to request for RP0 if GuC is already requesting the
> >> same.
> > But how are we sure that the freq will remain at RP0 in the future (when
> > the waiting request or any requests which are ahead execute)?
> >
> > In the current waitboost implementation, set_param is sent to GuC ahead of
> > the waiting request to ensure that the freq would be max when this waiting
> > request executed on the GPU and the freq is kept at max till this request
> > retires (considering just one waiting request). How can we ensure this if
> > we don't send the waitboost set_param to GuC?
>
> There is no way to guarantee the frequency will remain at RP0 till the
> request retires. As a theoretical example, lets say the request boosted
> freq to RP0, but a user changed min freq using sysfs immediately after.

That would be a bug. If waitboost is in progress and in the middle user
changed min freq, I would expect the freq to revert to the new min only
after the waitboost phase was over.

In any case, I am not referring to this case. Since FW controls the freq
there is nothing preventing FW from changing the freq unless we raise min to
max, which is what waitboost does.

> Waitboost is done by a pending request to "hurry" the current requests. If
> GT is already at boost frequency, that purpose is served.

FW can bring the freq down later before the waiting request is scheduled.

> Also, host algorithm already has this optimization as well.

Host turbo is different from SLPC. Host turbo controls the freq algorithm
so it knows freq will not come down till it itself brings the freq
down. Unlike SLPC where FW is controlling the freq. Therefore host turbo
doesn't ever need to do a MMIO read but only needs to refer to its own
state (rps->cur_freq etc.).

> >
> > I had assumed we'll do this optimization for server parts where min is
> > already RP0 in which case we can completely disable waitboost. But this
> > patch is something else.

Thanks.
--
Ashutosh

> >> v2: Add the tracing back, and check requested freq
> >> in the worker thread (Tvrtko)
> >> v3: Check requested freq in dec_waiters as well
> >>
> >> Signed-off-by: Vinay Belgaumkar 
> >> ---
> >>   drivers/gpu/drm/i915/gt/intel_rps.c |  3 +++
> >>   drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++---
> >>   2 files changed, 14 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> >> b/drivers/gpu/drm/i915/gt/intel_rps.c
> >> index fc23c562d9b2..18b75cf08d1b 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> >> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
> >>if (rps_uses_slpc(rps)) {
> >>slpc = rps_to_slpc(rps);
> >>
> >> +  GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
> >> +   rq->fence.context, rq->fence.seqno);
> >> +
> >>/* Return if old value is non zero */
> >>if (!atomic_fetch_inc(&slpc->num_waiters))
> >>schedule_work(&slpc->boost_work);
> >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
> >> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> >> index b7cdeec44bd3..9dbdbab1515a 100644
> >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> >> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc 
> >> *slpc, u32 freq)
> >>   static void slpc_boost_work(struct work_struct *work)
> >>   {
> >>struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), 
> >> boost_work);
> >> +  struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
> >>int err;
> >>
> >>/*
> >> * Raise min freq to boost. It's possible that
> >> * this is greater than current max. But it will
> >> * certainly be limited by RP0. An error setting
> >> -   * the min param is not fatal.
> >> +   * the min param is not fatal. No need to boost
> >> +   * if we are already requesting it.

Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-20 Thread Dixit, Ashutosh
On Tue, 18 Oct 2022 11:30:31 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
> b/drivers/gpu/drm/i915/gt/selftest_slpc.c
> index 4c6e9257e593..e42bc215e54d 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
> @@ -234,6 +234,7 @@ static int run_test(struct intel_gt *gt, int test_type)
>   enum intel_engine_id id;
>   struct igt_spinner spin;
>   u32 slpc_min_freq, slpc_max_freq;
> + u32 saved_min_freq;
>   int err = 0;
>
>   if (!intel_uc_uses_guc_slpc(&gt->uc))
> @@ -252,20 +253,35 @@ static int run_test(struct intel_gt *gt, int test_type)
>   return -EIO;
>   }
>
> - /*
> -  * FIXME: With efficient frequency enabled, GuC can request
> -  * frequencies higher than the SLPC max. While this is fixed
> -  * in GuC, we level set these tests with RPn as min.
> -  */
> - err = slpc_set_min_freq(slpc, slpc->min_freq);
> - if (err)
> - return err;
> + if (slpc_min_freq == slpc_max_freq) {
> + /* Server parts will have min/max clamped to RP0 */
> + if (slpc->min_is_rpmax) {
> + err = slpc_set_min_freq(slpc, slpc->min_freq);
> + if (err) {
> + pr_err("Unable to update min freq on server 
> part");
> + return err;
> + }
>
> - if (slpc->min_freq == slpc->rp0_freq) {
> - pr_err("Min/Max are fused to the same value\n");
> - return -EINVAL;
> + } else {
> + pr_err("Min/Max are fused to the same value\n");
> + return -EINVAL;

Sorry but I am not following this else case here. Why are we saying min/max
are fused to the same value? In this case we can't do
"slpc_set_min_freq(slpc, slpc->min_freq)" ? That is, we can't change SLPC
min freq?

> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index fdd895f73f9f..b7cdeec44bd3 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -263,6 +263,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
>
>   slpc->max_freq_softlimit = 0;
>   slpc->min_freq_softlimit = 0;
> + slpc->min_is_rpmax = false;
>
>   slpc->boost_freq = 0;
>   atomic_set(&slpc->num_waiters, 0);
> @@ -588,6 +589,32 @@ static int slpc_set_softlimits(struct intel_guc_slpc 
> *slpc)
>   return 0;
>  }
>
> +static bool is_slpc_min_freq_rpmax(struct intel_guc_slpc *slpc)
> +{
> + int slpc_min_freq;
> +
> + if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq))
> + return false;

I am wondering what happens if the above fails on server? Should we return
true or false on server and what are the consequences of returning false on
server?

In any case I think we should at least put a drm_err or something here just
in case this ever fails so we'll know something weird happened.

> +
> + if (slpc_min_freq == SLPC_MAX_FREQ_MHZ)
> + return true;
> + else
> + return false;
> +}
> +
> +static void update_server_min_softlimit(struct intel_guc_slpc *slpc)
> +{
> + /* For server parts, SLPC min will be at RPMax.
> +  * Use min softlimit to clamp it to RP0 instead.
> +  */
> + if (is_slpc_min_freq_rpmax(slpc) &&
> + !slpc->min_freq_softlimit) {
> + slpc->min_is_rpmax = true;
> + slpc->min_freq_softlimit = slpc->rp0_freq;
> + (slpc_to_gt(slpc))->defaults.min_freq = 
> slpc->min_freq_softlimit;
> + }
> +}
> +
>  static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
>  {
>   /* Force SLPC to used platform rp0 */
> @@ -647,6 +674,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
>
>   slpc_get_rp_values(slpc);
>
> + /* Handle the case where min=max=RPmax */
> + update_server_min_softlimit(slpc);
> +
>   /* Set SLPC max limit to RP0 */
>   ret = slpc_use_fused_rp0(slpc);
>   if (unlikely(ret)) {
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
> index 82a98f78f96c..11975a31c9d0 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
> @@ -9,6 +9,8 @@
>  #include "intel_guc_submission.h"
>  #include "intel_guc_slpc_types.h"
>
> +#define SLPC_MAX_FREQ_MHZ 4250

This really seems to be a value (255 converted to freq) so it seems ok to
interpret it in MHz.
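
For reference, the arithmetic behind 4250 can be sketched as below. This is
a standalone illustration, assuming the usual Gen9+ encoding where one
frequency unit is 50/3 MHz; the macro and function names are made up for the
example and are not the driver's.

```c
#include <stdint.h>

/* Assumed Gen9+ frequency encoding: one raw unit is 50/3 MHz */
#define FREQ_MULTIPLIER_MHZ	50
#define FREQ_SCALER		3

/* Convert a raw frequency ratio to MHz, rounding to the nearest integer */
static uint32_t ratio_to_mhz(uint32_t ratio)
{
	return (ratio * FREQ_MULTIPLIER_MHZ + FREQ_SCALER / 2) / FREQ_SCALER;
}
```

So the maximum 8-bit ratio, 255, works out to 255 * 50 / 3 = 4250 MHz.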

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-20 Thread Dixit, Ashutosh
On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> Waitboost (when SLPC is enabled) results in a H2G message. This can result
> in thousands of messages during a stress test and fill up an already full
> CTB. There is no need to request for RP0 if GuC is already requesting the
> same.

But how are we sure that the freq will remain at RP0 in the future (when
the waiting request or any requests which are ahead execute)?

In the current waitboost implementation, set_param is sent to GuC ahead of
the waiting request to ensure that the freq would be max when this waiting
request executed on the GPU and the freq is kept at max till this request
retires (considering just one waiting request). How can we ensure this if
we don't send the waitboost set_param to GuC?

I had assumed we'll do this optimization for server parts where min is
already RP0 in which case we can completely disable waitboost. But this
patch is something else.
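
To spell out the behavior I am describing, here is a toy model of the
waitboost bookkeeping. It is a simplified sketch, not the driver code: the
real implementation defers the set_param to a worker and takes a lock, and
all names below are made up for illustration.

```c
#include <stdint.h>

/* Toy model of SLPC waitboost bookkeeping; not driver code */
struct boost_model {
	int num_waiters;	/* requests currently being waited on */
	uint32_t min_freq;	/* min currently requested from firmware */
	uint32_t softlimit;	/* user-visible min softlimit */
	uint32_t boost_freq;	/* waitboost target */
	unsigned int h2g_sent;	/* set_param messages sent to firmware */
};

/* Stand-in for the H2G set_param that changes the firmware min freq */
static void model_set_min(struct boost_model *m, uint32_t freq)
{
	m->min_freq = freq;
	m->h2g_sent++;
}

/* First waiter raises min to boost; later waiters only bump the count */
static void model_boost(struct boost_model *m)
{
	if (m->num_waiters++ == 0)
		model_set_min(m, m->boost_freq);
}

/* Last waiter to retire restores min to the softlimit */
static void model_dec_waiters(struct boost_model *m)
{
	if (--m->num_waiters == 0)
		model_set_min(m, m->softlimit);
}
```

The point is that min stays pinned at boost_freq for as long as any waiter
is outstanding, which is what keeps FW from lowering the freq in between.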

Thanks.
--
Ashutosh


>
> v2: Add the tracing back, and check requested freq
> in the worker thread (Tvrtko)
> v3: Check requested freq in dec_waiters as well
>
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/intel_rps.c |  3 +++
>  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++---
>  2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> b/drivers/gpu/drm/i915/gt/intel_rps.c
> index fc23c562d9b2..18b75cf08d1b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
>   if (rps_uses_slpc(rps)) {
>   slpc = rps_to_slpc(rps);
>
> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
> +  rq->fence.context, rq->fence.seqno);
> +
>   /* Return if old value is non zero */
>   if (!atomic_fetch_inc(&slpc->num_waiters))
>   schedule_work(&slpc->boost_work);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index b7cdeec44bd3..9dbdbab1515a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc 
> *slpc, u32 freq)
>  static void slpc_boost_work(struct work_struct *work)
>  {
>   struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), 
> boost_work);
> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
>   int err;
>
>   /*
>* Raise min freq to boost. It's possible that
>* this is greater than current max. But it will
>* certainly be limited by RP0. An error setting
> -  * the min param is not fatal.
> +  * the min param is not fatal. No need to boost
> +  * if we are already requesting it.
>*/
> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq)
> + return;
> +
>   mutex_lock(&slpc->lock);
>   if (atomic_read(&slpc->num_waiters)) {
>   err = slpc_force_min_freq(slpc, slpc->boost_freq);
> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc 
> *slpc, u32 val)
>
>  void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
>  {
> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
>   /*
>* Return min back to the softlimit.
>* This is called during request retire,
> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc 
> *slpc)
>* set_param fails.
>*/
>   mutex_lock(&slpc->lock);
> - if (atomic_dec_and_test(&slpc->num_waiters))
> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
> + if (atomic_dec_and_test(&slpc->num_waiters)) {
> + if (intel_rps_get_requested_frequency(rps) != 
> slpc->min_freq_softlimit)
> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
> + }
>   mutex_unlock(&slpc->lock);
>  }
>
> --
> 2.35.1
>


Random submitter change in Freedesktop Patchwork

2022-10-20 Thread Dixit, Ashutosh
The freedesktop Patchwork seems to have a "feature" where in some cases the
submitter for a series changes randomly to a person who did not actually
submit a version of the series.

Not sure but this changed submitter seems to be a maintainer:


https://patchwork.freedesktop.org/series/108156/

Original submission by badal.nila...@intel.com and subsequent submissions
by me (ashutosh.di...@intel.com) but current submitter is
jani.nik...@linux.intel.com.

For the above series I believe the submitter changed at v7 where perhaps a
rebuild or a retest was scheduled (not sure if Jani did it and that changed
something) but the build failed at v7. Also note root msg-id's for v6 and
v7 are the same.

https://patchwork.freedesktop.org/series/108091/

Original submission by me (ashutosh.di...@intel.com) but current submitter
is rodrigo.v...@intel.com.

Similarly here submitter seems to have changed at v3 where again the build
failed. Also note root msg-id's for v2 and v3 are the same.


The problem this change of submitter causes is that if the actual original
submitter wants to schedule a retest they cannot do it using the retest
button.

Thanks.
--
Ashutosh


Re: [PATCH 2/4] drm/i915/mtl: Modify CAGF functions for MTL

2022-10-19 Thread Dixit, Ashutosh
On Wed, 19 Oct 2022 07:58:13 -0700, Rodrigo Vivi wrote:
>
> On Tue, Oct 18, 2022 at 10:20:41PM -0700, Ashutosh Dixit wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> > b/drivers/gpu/drm/i915/gt/intel_rps.c
> > index df21258976d86..5a743ae4dd11e 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> > @@ -2093,7 +2093,9 @@ u32 intel_rps_get_cagf(struct intel_rps *rps, u32 
> > rpstat)
> > struct drm_i915_private *i915 = rps_to_i915(rps);
> > u32 cagf;
> >
> > -   if (GRAPHICS_VER(i915) >= 12)
> > +   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> > +   cagf = rpstat & MTL_CAGF_MASK;
>
> I believe we should advocate more the use of the REG_FIELD_GET
>
>   cagf = REG_FIELD_GET(MTL_CAGF_MASK, rpstat);
>
> > +   else if (GRAPHICS_VER(i915) >= 12)
> > cagf = (rpstat & GEN12_CAGF_MASK) >> GEN12_CAGF_SHIFT;
>
> cagf = REG_FIELD_GET(GEN12_CAGF_MASK, rpstat);
> // with the proper REG_GENMASK usage on the gen12_cagf_mask...
>
> > else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> > cagf = (rpstat >> 8) & 0xff;
>
>  #define RPE_MASK REG_GENMASK(15, 8)
>  cagf = REG_FIELD_GET(RPE_MASK, rpstat)

All these are now converted to REG_FIELD_GET in series version v8.

> > @@ -2116,7 +2118,13 @@ static u32 read_cagf(struct intel_rps *rps)
> > struct intel_uncore *uncore = rps_to_uncore(rps);
> ^
>
> > u32 freq;
> >
> > -   if (GRAPHICS_VER(i915) >= 12) {
> > +   /*
> > +* For Gen12+ reading freq from HW does not need a forcewake and
> > +* registers will return 0 freq when GT is in RC6
> > +*/
> > +   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) {
> > +   freq = intel_uncore_read(rps_to_gt(rps)->uncore, 
> > MTL_MIRROR_TARGET_WP1);
>
> here we should use directly the local uncore already declared above with
> the same helper...  and consistent with the following elses...

Fixed.

>
> > +   } else if (GRAPHICS_VER(i915) >= 12) {
> > freq = intel_uncore_read(uncore, GEN12_RPSTAT1);
> > } else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) {
> > vlv_punit_get(i915);
> > --
> > 2.38.0
> >

Thanks.
--
Ashutosh


Re: [PATCH 1/4] drm/i915: Use GEN12_RPSTAT register for GT freq

2022-10-19 Thread Dixit, Ashutosh
On Wed, 19 Oct 2022 08:06:26 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Tue, Oct 18, 2022 at 10:20:40PM -0700, Ashutosh Dixit wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
> > b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > index 36d95b79022c0..a7a0129d0e3fc 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > @@ -1543,6 +1543,8 @@
> >
> >  #define GEN12_RPSTAT1  _MMIO(0x1381b4)
> >  #define   GEN12_VOLTAGE_MASK   REG_GENMASK(10, 0)
> > +#define   GEN12_CAGF_SHIFT 11
>
> we don't need to define the shift if we use the REG_FIELD_GET

Yes I was also suggesting this but then went ahead with the mask/shift
based code to match previous style in the function.

In any case based on your suggestions I have added a new patch in series
version v8 which converts all previous branches in intel_rps_get_cagf to
REG_FIELD_GET so that the new code can also consistently use REG_FIELD_GET.
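
As a sanity check that the two styles are equivalent for this field, here is
a standalone userspace sketch. MODEL_GENMASK and MODEL_FIELD_GET are
simplified stand-ins written for this example, not the kernel's
REG_GENMASK/REG_FIELD_GET macros; only the bit positions (19:11) come from
the patch.

```c
#include <stdint.h>

/* Simplified 32-bit stand-ins for the kernel's GENMASK/FIELD_GET helpers */
#define MODEL_GENMASK(h, l)	((~0u >> (31 - (h))) & (~0u << (l)))
/* Dividing by the lowest set bit of the mask is the same as shifting right */
#define MODEL_FIELD_GET(mask, val) \
	(((val) & (mask)) / ((mask) & (~(mask) + 1u)))

#define CAGF_SHIFT	11
#define CAGF_MASK	MODEL_GENMASK(19, 11)

/* Old style: explicit mask and shift, needs a separate SHIFT define */
static uint32_t cagf_mask_shift(uint32_t rpstat)
{
	return (rpstat & CAGF_MASK) >> CAGF_SHIFT;
}

/* New style: the shift is derived from the mask itself */
static uint32_t cagf_field_get(uint32_t rpstat)
{
	return MODEL_FIELD_GET(CAGF_MASK, rpstat);
}
```

Both helpers return the same value for any rpstat, which is why the separate
shift define becomes unnecessary.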

>
> > +#define   GEN12_CAGF_MASK  REG_GENMASK(19, 11)
>
> ah, cool, this is already right and in place
> (ignore my comment about this in the other patch)

> >  u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat)
> >  {
> > struct drm_i915_private *i915 = rps_to_i915(rps);
> > u32 cagf;
> >
> > -   if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> > +   if (GRAPHICS_VER(i915) >= 12)
> > +   cagf = (rpstat & GEN12_CAGF_MASK) >> GEN12_CAGF_SHIFT;
>
>   cagf = REG_FIELD_GET(GEN12_CAGF_MASK, rpstat);
>
> > +   else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> > cagf = (rpstat >> 8) & 0xff;
> > else if (GRAPHICS_VER(i915) >= 9)
> > cagf = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;

Thanks.
--
Ashutosh


Re: [PATCH 3/3] drm/i915/mtl: C6 residency and C state type for MTL SAMedia

2022-10-19 Thread Dixit, Ashutosh
On Mon, 17 Oct 2022 13:12:33 -0700, Dixit, Ashutosh wrote:
>
> On Fri, 14 Oct 2022 20:26:18 -0700, Ashutosh Dixit wrote:
> >
> > From: Badal Nilawar 
>
> Hi Badal,
>
> One question below.
>
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> > index 1fb053cbf52db..3a9bb4387248e 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> > @@ -256,6 +256,61 @@ static int ilk_drpc(struct seq_file *m)
> > return 0;
> >  }
> >
> > +static int mtl_drpc(struct seq_file *m)
> > +{
>
> Here we have:
>
> > +   global_forcewake = intel_uncore_read(uncore, FORCEWAKE_GT_GEN9);
> and
> > +   seq_printf(m, "Global Forcewake Requests: 0x%x\n", global_forcewake);
>
> In gen6_drpc we have:
>
>   mt_fwake_req = intel_uncore_read_fw(uncore, FORCEWAKE_MT);
> and
>   seq_printf(m, "Multi-threaded Forcewake Request: 0x%x\n", mt_fwake_req);
>
> Also:
>   #define FORCEWAKE_MT _MMIO(0xa188)
>   #define FORCEWAKE_GT_GEN9   _MMIO(0xa188)
>
> So they are both the same register. So what is the reason for this
> difference, which one should we use?
>
> Also let's have the prints in the same order as gen6_drpc (move fw request
> before rc6 residency).

This has been made identical to gen6_drpc in series v8.

Thanks.
--
Ashutosh


Re: [PATCH 3/4] drm/i915/gt: Use RC6 residency types as arguments to residency functions

2022-10-19 Thread Dixit, Ashutosh
On Wed, 19 Oct 2022 00:51:45 -0700, Jani Nikula wrote:
>
> On Tue, 18 Oct 2022, Ashutosh Dixit  wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.h 
> > b/drivers/gpu/drm/i915/gt/intel_rc6.h
> > index b6fea71afc223..3105bc72c096b 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_rc6.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_rc6.h
> > @@ -6,7 +6,7 @@
> >  #ifndef INTEL_RC6_H
> >  #define INTEL_RC6_H
> >
> > -#include "i915_reg_defs.h"
> > +#include "intel_rc6_types.h"
> >
> >  struct intel_engine_cs;
> >  struct intel_rc6;
> > @@ -21,7 +21,9 @@ void intel_rc6_sanitize(struct intel_rc6 *rc6);
> >  void intel_rc6_enable(struct intel_rc6 *rc6);
> >  void intel_rc6_disable(struct intel_rc6 *rc6);
> >
> > -u64 intel_rc6_residency_ns(struct intel_rc6 *rc6, i915_reg_t reg);
> > -u64 intel_rc6_residency_us(struct intel_rc6 *rc6, i915_reg_t reg);
> > +u64 intel_rc6_residency_ns(struct intel_rc6 *rc6, enum intel_rc6_res_type 
> > id);
> > +u64 intel_rc6_residency_us(struct intel_rc6 *rc6, enum intel_rc6_res_type 
> > id);
> > +void intel_rc6_print_residency(struct seq_file *m, const char *title,
> > +  enum intel_rc6_res_type id);
> >
> >  #endif /* INTEL_RC6_H */
>
> Please apply this on top to avoid includes from includes.
>
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.h 
> b/drivers/gpu/drm/i915/gt/intel_rc6.h
> index 3105bc72c096..456fa668a276 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rc6.h
> +++ b/drivers/gpu/drm/i915/gt/intel_rc6.h
> @@ -6,10 +6,11 @@
>  #ifndef INTEL_RC6_H
>  #define INTEL_RC6_H
>
> -#include "intel_rc6_types.h"
> +#include <linux/types.h>
>
> -struct intel_engine_cs;
> +enum intel_rc6_res_type;
>  struct intel_rc6;
> +struct seq_file;
>
>  void intel_rc6_init(struct intel_rc6 *rc6);
>  void intel_rc6_fini(struct intel_rc6 *rc6);

Thanks, done in series version v8.

Ashutosh


Re: [PATCH 1/3] drm/i915/gt: Change RC6 residency functions to accept register ID's

2022-10-18 Thread Dixit, Ashutosh
On Mon, 17 Oct 2022 01:27:35 -0700, Jani Nikula wrote:

Hi Jani,

Thanks for reviewing, great suggestions overall. I have taken care of most
of them in series version v6. Please see below.

> On Fri, 14 Oct 2022, Ashutosh Dixit  wrote:
> > @@ -811,9 +809,23 @@ u64 intel_rc6_residency_ns(struct intel_rc6 *rc6, 
> > const i915_reg_t reg)
> > return mul_u64_u32_div(time_hw, mul, div);
> >  }
> >
> > -u64 intel_rc6_residency_us(struct intel_rc6 *rc6, i915_reg_t reg)
> > +u64 intel_rc6_residency_us(struct intel_rc6 *rc6, const enum rc6_res_reg 
> > id)
> > +{
> > +   return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(rc6, id), 1000);
> > +}
> > +
> > +void intel_rc6_print_rc6_res(struct seq_file *m,
> > +const char *title,
> > +const enum rc6_res_reg id)
>
> intel_rc6_print_rc6_res is unnecessary duplication.
>
> intel_rc6_print_residency() maybe?

Done.

>
> >  {
> > -   return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(rc6, reg), 1000);
> > +   struct intel_gt *gt = m->private;
> > +   i915_reg_t reg = gt->rc6.res_reg[id];
> > +   intel_wakeref_t wakeref;
> > +
> > +   with_intel_runtime_pm(gt->uncore->rpm, wakeref)
> > +   seq_printf(m, "%s %u (%llu us)\n", title,
> > +  intel_uncore_read(gt->uncore, reg),
> > +  intel_rc6_residency_us(&gt->rc6, id));
> >  }
> >
> >  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> > diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.h 
> > b/drivers/gpu/drm/i915/gt/intel_rc6.h
> > index b6fea71afc223..584d2d3b2ec3f 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_rc6.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_rc6.h
> > @@ -6,7 +6,7 @@
> >  #ifndef INTEL_RC6_H
> >  #define INTEL_RC6_H
> >
> > -#include "i915_reg_defs.h"
> > +#include "intel_rc6_types.h"
>
> You can forward declare enums as a gcc extension.
>
> enum rc6_res_reg;

Tried but was seeing compile errors so left as is.

> >  struct intel_engine_cs;
> >  struct intel_rc6;
> > @@ -21,7 +21,10 @@ void intel_rc6_sanitize(struct intel_rc6 *rc6);
> >  void intel_rc6_enable(struct intel_rc6 *rc6);
> >  void intel_rc6_disable(struct intel_rc6 *rc6);
> >
> > -u64 intel_rc6_residency_ns(struct intel_rc6 *rc6, i915_reg_t reg);
> > -u64 intel_rc6_residency_us(struct intel_rc6 *rc6, i915_reg_t reg);
> > +u64 intel_rc6_residency_ns(struct intel_rc6 *rc6, const enum rc6_res_reg 
> > id);
> > +u64 intel_rc6_residency_us(struct intel_rc6 *rc6, const enum rc6_res_reg 
> > id);
> > +void intel_rc6_print_rc6_res(struct seq_file *m,
> > +const char *title,
> > +const enum rc6_res_reg id);
>
> "const enum" makes no sense.

Removed. Probably const for pass-by-value function arguments never makes
sense, I had left the const thinking it would indicate that the function
won't modify that argument, but is probably not worth it so removed all
"const enum"s.
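For illustration only (a standalone sketch, not code from the patch;
residency_us is a hypothetical stand-in): in C, a top-level const on a
by-value parameter is not part of the function's type, so a prototype
without const and a definition with const declare the same function. The
qualifier only stops the body from reassigning its local copy, which is
why it adds little in a header.

```c
#include <assert.h>
#include <stdint.h>

/* Prototype as it would appear in a header: no const on by-value args. */
uint64_t residency_us(uint64_t ns, int id);

/*
 * Definition with const on the same parameters. This is still the same
 * function type; the const only prevents the body from modifying its
 * own copies of ns and id.
 */
uint64_t residency_us(const uint64_t ns, const int id)
{
	assert(id >= 0);		/* ids are small non-negative indices here */
	return (ns + 999) / 1000;	/* round up, like DIV_ROUND_UP_ULL(ns, 1000) */
}
```

The caller sees no difference between the two spellings, so the const can
be dropped from the prototype without any change in behavior.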

>
> >
> >  #endif /* INTEL_RC6_H */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_rc6_types.h 
> > b/drivers/gpu/drm/i915/gt/intel_rc6_types.h
> > index e747492b2f46e..0386a3f6e4dc6 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_rc6_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_rc6_types.h
> > @@ -13,7 +13,17 @@
> >
> >  struct drm_i915_gem_object;
> >
> > +enum rc6_res_reg {
> > +   RC6_RES_REG_RC6_LOCKED,
> > +   RC6_RES_REG_RC6,
> > +   RC6_RES_REG_RC6p,
> > +   RC6_RES_REG_RC6pp
> > +};
>
> Naming: intel_rc6_* for all.

Done.

> I think you need to take the abstraction further away from
> registers. You don't need the *register* part here for anything. Stop
> thinking in terms of registers in the interface.
>
> The callers care about things like "RC6+ residency since boot", and the
> callers don't care about where or how this information originates
> from. They just want the info, and the register is an implementation
> detail hidden behind the interface.
>
> I.e. use the enum to identify the data you want, not which register it
> comes from.

Done, please take a look at the new patch.

>
> > +
> > +#define VLV_RC6_RES_REG_MEDIA_RC6 RC6_RES_REG_RC6p
>
> Please handle this in the enum.

Done.

>
> > +
> >  struct intel_rc6 {
> > +   i915_reg_t res_reg[4];
>
> Maybe the id enum should have _MAX as last value, used for size here.

Done.
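A minimal sketch of what the review comments above ask for, assuming
hypothetical names (the identifiers below are illustrative and may differ
from the new patch): the enum names the data being requested, the VLV
alias lives inside the enum, and a trailing _MAX entry sizes the arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: identify what is measured, not which register holds it. */
enum intel_rc6_res_type {
	INTEL_RC6_RES_RC6_LOCKED,
	INTEL_RC6_RES_RC6,
	INTEL_RC6_RES_RC6p,
	INTEL_RC6_RES_RC6pp,
	INTEL_RC6_RES_MAX,				/* entry count, used for array sizes */
	INTEL_RC6_RES_VLV_MEDIA = INTEL_RC6_RES_RC6p,	/* alias handled in the enum */
};

/* Stand-in for struct intel_rc6; uint32_t replaces i915_reg_t in this sketch. */
struct rc6_sketch {
	uint32_t res_reg[INTEL_RC6_RES_MAX];	/* register lookup stays an implementation detail */
	uint64_t prev_hw_residency[INTEL_RC6_RES_MAX];
	uint64_t cur_residency[INTEL_RC6_RES_MAX];
};

/* The alias must not grow the arrays. */
static_assert(INTEL_RC6_RES_MAX == 4, "four residency counters");
```

Placing the alias after _MAX keeps the automatic numbering of the real
entries intact.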

Thanks.
--
Ashutosh


>
> > u64 prev_hw_residency[4];
> > u64 cur_residency[4];
> >
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c 
> > b/drivers/gpu/drm/i915/gt/selftest_rc6.c
> > index 8c70b7e120749..a236e3f8f3183 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
> > @@ -19,11 +19,11 @@ static u64 rc6_residency(struct intel_rc6 *rc6)
> >
> > /* XXX VLV_GT_MEDIA_RC6? */
> >
> > -   result = intel_rc6_residency_ns(rc6, GEN6_GT_GFX_RC6);
> > +   result = intel_rc6_residency_ns(rc6, RC6_RES_REG_RC6);
> > if (HAS_RC6p(rc6_to_i915(rc6)))
> > -   result += intel_rc6_residency_ns(rc6, GEN6_GT_GFX_RC6p);
> > +   result += 

Re: [PATCH 3/3] drm/i915/mtl: C6 residency and C state type for MTL SAMedia

2022-10-17 Thread Dixit, Ashutosh
On Fri, 14 Oct 2022 20:26:18 -0700, Ashutosh Dixit wrote:
>
> From: Badal Nilawar 

Hi Badal,

One question below.

> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c 
> b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> index 1fb053cbf52db..3a9bb4387248e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> @@ -256,6 +256,61 @@ static int ilk_drpc(struct seq_file *m)
>   return 0;
>  }
>
> +static int mtl_drpc(struct seq_file *m)
> +{

Here we have:

> + global_forcewake = intel_uncore_read(uncore, FORCEWAKE_GT_GEN9);
and
> + seq_printf(m, "Global Forcewake Requests: 0x%x\n", global_forcewake);

In gen6_drpc we have:

mt_fwake_req = intel_uncore_read_fw(uncore, FORCEWAKE_MT);
and
seq_printf(m, "Multi-threaded Forcewake Request: 0x%x\n", mt_fwake_req);

Also:
#define FORCEWAKE_MT        _MMIO(0xa188)
#define FORCEWAKE_GT_GEN9   _MMIO(0xa188)

So they are both the same register. So what is the reason for this
difference, which one should we use?

Also let's have the prints in the same order as gen6_drpc (move fw request
before rc6 residency).

Thanks.
--
Ashutosh


Re: [PATCH 2/2] drm/i915/mtl: Add C6 residency support for MTL SAMedia

2022-10-14 Thread Dixit, Ashutosh
On Tue, 20 Sep 2022 01:06:52 -0700, Jani Nikula wrote:
>
> On Mon, 19 Sep 2022, "Dixit, Ashutosh"  wrote:
> > On Mon, 19 Sep 2022 05:13:18 -0700, Jani Nikula wrote:
> >>
> >> On Mon, 19 Sep 2022, Badal Nilawar  wrote:
> >> > For MTL SAMedia updated relevant functions and places in the code to get
> >> > Media C6 residency.
> >> >
> >> > v2: Fixed review comments (Ashutosh)
> >> >
> >> > Cc: Vinay Belgaumkar 
> >> > Cc: Ashutosh Dixit 
> >> > Cc: Chris Wilson 
> >> > Signed-off-by: Badal Nilawar 
> >> > ---
> >> >  drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c | 60 +++
> >> >  drivers/gpu/drm/i915/gt/intel_gt_regs.h   | 10 
> >> >  drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c   |  9 ++-
> >> >  drivers/gpu/drm/i915/gt/intel_rc6.c   |  5 +-
> >> >  drivers/gpu/drm/i915/gt/selftest_rc6.c|  9 ++-
> >> >  drivers/gpu/drm/i915/i915_pmu.c   |  8 ++-
> >> >  6 files changed, 97 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c 
> >> > b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> >> > index 68310881a793..053167b506a9 100644
> >> > --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> >> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> >> > @@ -269,6 +269,64 @@ static int ilk_drpc(struct seq_file *m)
> >> >  return 0;
> >> >  }
> >> >
> >> > +static int mtl_drpc(struct seq_file *m)
> >> > +{
> >> > +struct intel_gt *gt = m->private;
> >> > +struct intel_uncore *uncore = gt->uncore;
> >> > +u32 gt_core_status, rcctl1, global_forcewake;
> >> > +u32 mtl_powergate_enable = 0, mtl_powergate_status = 0;
> >> > +i915_reg_t reg;
> >> > +
> >> > +gt_core_status = intel_uncore_read(uncore, 
> >> > MTL_MIRROR_TARGET_WP1);
> >> > +
> >> > +global_forcewake = intel_uncore_read(uncore, FORCEWAKE_GT_GEN9);
> >> > +
> >> > +rcctl1 = intel_uncore_read(uncore, GEN6_RC_CONTROL);
> >> > +mtl_powergate_enable = intel_uncore_read(uncore, 
> >> > GEN9_PG_ENABLE);
> >> > +mtl_powergate_status = intel_uncore_read(uncore,
> >> > + 
> >> > GEN9_PWRGT_DOMAIN_STATUS);
> >> > +
> >> > +seq_printf(m, "RC6 Enabled: %s\n",
> >> > +   str_yes_no(rcctl1 & GEN6_RC_CTL_RC6_ENABLE));
> >> > +if (gt->type == GT_MEDIA) {
> >> > +seq_printf(m, "Media Well Gating Enabled: %s\n",
> >> > +   str_yes_no(mtl_powergate_enable & 
> >> > GEN9_MEDIA_PG_ENABLE));
> >> > +} else {
> >> > +seq_printf(m, "Render Well Gating Enabled: %s\n",
> >> > +   str_yes_no(mtl_powergate_enable & 
> >> > GEN9_RENDER_PG_ENABLE));
> >> > +}
> >> > +
> >> > +seq_puts(m, "Current RC state: ");
> >> > +
> >> > +switch ((gt_core_status & MTL_CC_MASK) >> MTL_CC_SHIFT) {
> >> > +case MTL_CC0:
> >> > +seq_puts(m, "on\n");
> >> > +break;
> >> > +case MTL_CC6:
> >> > +seq_puts(m, "RC6\n");
> >> > +break;
> >> > +default:
> >> > +seq_puts(m, "Unknown\n");
> >> > +break;
> >> > +}
> >> > +
> >> > +if (gt->type == GT_MEDIA)
> >> > +seq_printf(m, "Media Power Well: %s\n",
> >> > +   (mtl_powergate_status &
> >> > +GEN9_PWRGT_MEDIA_STATUS_MASK) ? "Up" : 
> >> > "Down");
> >> > +else
> >> > +seq_printf(m, "Render Power Well: %s\n",
> >> > +   (mtl_powergate_status &
> >> > +GEN9_PWRGT_RENDER_STATUS_MASK) ? "Up" : 
> >> > "Down");
> >> > +

Re: [Intel-gfx] [PATCH 1/2] drm/i915/mtl: Modify CAGF functions for MTL

2022-10-14 Thread Dixit, Ashutosh
On Mon, 19 Sep 2022 09:49:07 -0700, Andi Shyti wrote:
>
> Hi Badal,

Hi Andi,

Badal is out for a bit so I am sending out this version.

> On Mon, Sep 19, 2022 at 05:29:05PM +0530, Badal Nilawar wrote:
> > Updated the CAGF functions to get actual resolved frequency of
> > 3D and SAMedia
>
> can you please use the imperative form? "Update" and not
> "Updated".

> Besides I don't really understand what you did from the
> commit, can you please be a bit more descriptive?

Done in series version v5. Please take a look.

> > Bspec: 66300
> >
> > Cc: Vinay Belgaumkar 
> > Cc: Ashutosh Dixit 
> > Signed-off-by: Badal Nilawar 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gt_regs.h | 8 
> >  drivers/gpu/drm/i915/gt/intel_rps.c | 6 +-
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
> > b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > index 2275ee47da95..7819d32db956 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > @@ -1510,6 +1510,14 @@
> >  #define VLV_RENDER_C0_COUNT _MMIO(0x138118)
> >  #define VLV_MEDIA_C0_COUNT _MMIO(0x13811c)
> >
> > +/*
> > + * MTL: Workpoint reg to get Core C state and act freq of 3D, SAMedia/
> > + * 3D - 0x0C60 , SAMedia - 0x380C60
> > + * Intel uncore handler redirects transactions for SAMedia to 
> > MTL_MEDIA_GSI_BASE
> > + */
>
> This comment is not understandable... we don't have limits in
> space, you can be a bit more explicit :)

Based on Matt R's comment the comment has been deleted (except for the
first line). There is an explanation at the bottom of gt/intel_gt_regs.h.

Thanks.
--
Ashutosh


Re: [PATCH 1/2] drm/i915/mtl: Modify CAGF functions for MTL

2022-10-14 Thread Dixit, Ashutosh
On Mon, 19 Sep 2022 15:49:17 -0700, Matt Roper wrote:
>
> On Mon, Sep 19, 2022 at 03:46:47PM -0700, Matt Roper wrote:
> > On Mon, Sep 19, 2022 at 05:29:05PM +0530, Badal Nilawar wrote:
> > > Updated the CAGF functions to get actual resolved frequency of
> > > 3D and SAMedia
> > >
> > > Bspec: 66300
> > >
> > > Cc: Vinay Belgaumkar 
> > > Cc: Ashutosh Dixit 
> > > Signed-off-by: Badal Nilawar 
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_gt_regs.h | 8 
> > >  drivers/gpu/drm/i915/gt/intel_rps.c | 6 +-
> > >  2 files changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
> > > b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > > index 2275ee47da95..7819d32db956 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > > @@ -1510,6 +1510,14 @@
> > >  #define VLV_RENDER_C0_COUNT  _MMIO(0x138118)
> > >  #define VLV_MEDIA_C0_COUNT   _MMIO(0x13811c)
> > >
> > > +/*
> > > + * MTL: Workpoint reg to get Core C state and act freq of 3D, SAMedia/
> > > + * 3D - 0x0C60 , SAMedia - 0x380C60
> > > + * Intel uncore handler redirects transactions for SAMedia to 
> > > MTL_MEDIA_GSI_BASE
> > > + */
>
> Also, this comment is unnecessary.  This is already how all GT registers
> work so there's no reason to state this again on one random
> register.
>
> > > +#define MTL_MIRROR_TARGET_WP1  _MMIO(0x0C60)
> > > +#define   MTL_CAGF_MASK        REG_GENMASK(8, 0)
> > > +
> >
> > This register is at the wrong place in the file (and is misformatted).
> >  - Keep it sorted with respect to the other registers in the file.
> >  - Write it as "0xc60" for consistency with all the other registers
> >(i.e., lower-case hex, no unnecessary 0 prefix).
> >  - The whitespace between the name and the REG_GENMASK should be tabs,
> >not spaces, ensuring it's lined up with the other definitions.
> >
> > i915_reg.h turned into a huge mess over time because it wasn't
> > consistently organized or formatted so nobody knew what to do when
> > adding new registers.  We're trying to do a better job of following
> > consistent rules with the new register headers so that we don't wind up
> > with the same confusion again.

Fixed in series version v5 (Patch version v2). Same for the comments below
too.

> >
> > >  #define GEN11_GT_INTR_DW(x)  _MMIO(0x190018 + ((x) * 
> > > 4))
> > >  #define   GEN11_CSME (31)
> > >  #define   GEN11_GUNIT(28)
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> > > b/drivers/gpu/drm/i915/gt/intel_rps.c
> > > index 17b40b625e31..c2349949ebae 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> > > @@ -2075,6 +2075,8 @@ u32 intel_rps_get_cagf(struct intel_rps *rps, u32 
> > > rpstat)
> > >
> > >   if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> > >   cagf = (rpstat >> 8) & 0xff;
> > > + else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> > > + cagf = rpstat & MTL_CAGF_MASK;
> >
> > Generally we try to put the newer platform at the top of if/else
> > ladders.  So this new MTL code should come before the VLV/CHV branch.

Done.
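For illustration only, a standalone sketch of the ladder ordering Matt asks
for, with the newest platform tested first. MTL_CAGF_MASK follows the
REG_GENMASK(8, 0) definition quoted above; the platform enum and the GEN9
mask and shift values are assumptions made so the example compiles on its
own.

```c
#include <assert.h>
#include <stdint.h>

#define MTL_CAGF_MASK	0x1ffu				/* bits 8:0, as in the quoted patch */
#define GEN9_CAGF_SHIFT	23				/* assumed for this sketch */
#define GEN9_CAGF_MASK	(0x1ffu << GEN9_CAGF_SHIFT)	/* assumed for this sketch */

static_assert((GEN9_CAGF_MASK >> GEN9_CAGF_SHIFT) == 0x1ff, "9-bit field");

/* Hypothetical stand-in for the IS_*() and GRAPHICS_VER*() checks. */
enum platform { PLAT_VLV, PLAT_GEN9, PLAT_MTL };

static uint32_t get_cagf(enum platform p, uint32_t rpstat)
{
	if (p == PLAT_MTL)		/* newest platform at the top of the ladder */
		return rpstat & MTL_CAGF_MASK;
	else if (p == PLAT_VLV)
		return (rpstat >> 8) & 0xff;
	else
		return (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
}
```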

> >
> > >   else if (GRAPHICS_VER(i915) >= 9)
> > >   cagf = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
> > >   else if (IS_HASWELL(i915) || IS_BROADWELL(i915))
> > > @@ -2098,7 +2100,9 @@ static u32 read_cagf(struct intel_rps *rps)
> > >   vlv_punit_get(i915);
> > >   freq = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS);
> > >   vlv_punit_put(i915);
> > > - } else if (GRAPHICS_VER(i915) >= 6) {
> > > + } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> > > + freq = intel_uncore_read(rps_to_gt(rps)->uncore, 
> > > MTL_MIRROR_TARGET_WP1);
> >
> > Same here.

Done.

Thanks.
--
Ashutosh

> > > + else if (GRAPHICS_VER(i915) >= 6) {
> > >   freq = intel_uncore_read(uncore, GEN6_RPSTAT1);
> > >   } else {
> > >   freq = intel_uncore_read(uncore, MEMSTAT_ILK);
> > > --
> > > 2.25.1
> > >
> >
> > --
> > Matt Roper
> > Graphics Software Engineer
> > VTT-OSGC Platform Enablement
> > Intel Corporation
>
> --
> Matt Roper
> Graphics Software Engineer
> VTT-OSGC Platform Enablement
> Intel Corporation


Re: [PATCH v2] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-13 Thread Dixit, Ashutosh
On Thu, 13 Oct 2022 08:55:24 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> GuC will set the min/max frequencies to theoretical max on
> ATS-M. This will break kernel ABI, so limit min/max frequency
> to RP0(platform max) instead.

Isn't what we are calling "theoretical max" or "RPmax" really just -1U
(0xffffffff)? Though I have heard this is not a max value but -1U indicates
FW default values unmodified by host SW, which would mean frequencies are
fully controlled by FW (min == max == -1U). But if this were the case I
don't know why this would be the case only for server, why doesn't FW set
these for clients too to indicate it is fully in control?

So the question is: what does -1U actually represent? Is it the RPmax value or
does -1U represent "FW defaults"?

Also this concept of using -1U as "FW defaults" is present in Level0/OneAPI
(and likely in firmware) but we seem to have blocked it in the i915 ABI.

I understand we may not be able to make such changes at present but this
provides some context for the review comments below.

> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index fdd895f73f9f..11613d373a49 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -263,6 +263,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
>
>   slpc->max_freq_softlimit = 0;
>   slpc->min_freq_softlimit = 0;
> + slpc->min_is_rpmax = false;
>
>   slpc->boost_freq = 0;
>   atomic_set(&slpc->num_waiters, 0);
> @@ -588,6 +589,31 @@ static int slpc_set_softlimits(struct intel_guc_slpc 
> *slpc)
>   return 0;
>  }
>
> +static bool is_slpc_min_freq_rpmax(struct intel_guc_slpc *slpc)
> +{
> + int slpc_min_freq;
> +
> + if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq))
> + return false;
> +
> + if (slpc_min_freq > slpc->rp0_freq)

Should this be > or >=?

If what we are calling "rpmax" is really -1U then why don't we just check for
-1U here?

u32 slpc_min_freq;

if (slpc_min_freq == -1U)

> + return true;
> + else
> + return false;
> +}
> +
> +static void update_server_min_softlimit(struct intel_guc_slpc *slpc)
> +{
> + /* For server parts, SLPC min will be at RPMax.
> +  * Use min softlimit to clamp it to RP0 instead.
> +  */
> + if (is_slpc_min_freq_rpmax(slpc) &&
> + !slpc->min_freq_softlimit) {
> + slpc->min_is_rpmax = true;
> + slpc->min_freq_softlimit = slpc->rp0_freq;

Isn't it safer to use a platform check such as IS_ATSM or IS_XEHPSDV (or
even #define IS_SERVER()) to set min freq to RP0 rather than this -1U value
from FW? What if -1U means "FW defaults" and FW starts setting this on
client products tomorrow?

Also, we need to set gt->defaults.min_freq here.
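To make the two options in the comments above concrete, here is a
standalone sketch (not driver code). FW_FREQ_UNSET, struct slpc_sketch and
the is_server flag are hypothetical names introduced only for this example;
is_server stands in for a platform check such as IS_ATSM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FW_FREQ_UNSET	((uint32_t)-1)	/* hypothetical name for -1U (0xffffffff) */

/* Hypothetical, trimmed-down stand-in for struct intel_guc_slpc. */
struct slpc_sketch {
	uint32_t rp0_freq;
	uint32_t min_freq_softlimit;
	bool is_server;			/* stands in for a platform check */
};

/* Option A: compare against the sentinel directly, using an unsigned type. */
static bool min_is_fw_default(uint32_t slpc_min_freq)
{
	return slpc_min_freq == FW_FREQ_UNSET;
}

/* Option B: key the clamp off the platform instead of the value FW reports. */
static void clamp_min_softlimit(struct slpc_sketch *slpc)
{
	assert(slpc);
	if (slpc->is_server && !slpc->min_freq_softlimit)
		slpc->min_freq_softlimit = slpc->rp0_freq;
}
```

Option B keeps working even if firmware later starts reporting -1U on
client parts, which is the concern raised above.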

Thanks.
--
Ashutosh


> + }
> +}
> +
>  static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
>  {
>   /* Force SLPC to used platform rp0 */
> @@ -647,6 +673,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
>
>   slpc_get_rp_values(slpc);
>
> + /* Handle the case where min=max=RPmax */
> + update_server_min_softlimit(slpc);
> +
>   /* Set SLPC max limit to RP0 */
>   ret = slpc_use_fused_rp0(slpc);
>   if (unlikely(ret)) {
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
> index 73d208123528..a6ef53b04e04 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
> @@ -19,6 +19,9 @@ struct intel_guc_slpc {
>   bool supported;
>   bool selected;
>
> + /* Indicates this is a server part */
> + bool min_is_rpmax;
> +
>   /* platform frequency limits */
>   u32 min_freq;
>   u32 rp0_freq;
> --
> 2.35.1
>


Re: [Intel-gfx] [PATCH 6/7] drm/i915/hwmon: Expose power1_max_interval

2022-10-13 Thread Dixit, Ashutosh
On Mon, 03 Oct 2022 14:32:36 -0700, Andi Shyti wrote:
>

Hi Andi,

> > diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
> > b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > index f9d6d3b08bba..19b9fe3ef237 100644
> > --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > @@ -26,6 +26,15 @@ Description: RO. Card default power limit (default 
> > TDP setting).
> >
> > Only supported for particular Intel i915 graphics platforms.
> >
> > +What:  /sys/devices/.../hwmon/hwmon<i>/power1_max_interval
> > +Date:  February 2023
> > +KernelVersion: 6.2
> > +Contact:   dri-devel@lists.freedesktop.org
>
> same question here.
>
> > +Description:   RW. Sustained power limit interval (Tau in PL1/Tau) in
> > +   milliseconds over which sustained power is averaged.
> > +
> > +   Only supported for particular Intel i915 graphics platforms.
> > +
> >  What:  /sys/devices/.../hwmon/hwmon<i>/power1_crit
> >  Date:  February 2023
> >  KernelVersion: 6.2
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 2394fa789793..641143956c45 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -20,11 +20,13 @@
> >   * - power  - microwatts
> >   * - curr   - milliamperes
> >   * - energy - microjoules
> > + * - time   - milliseconds
> >   */
> >  #define SF_VOLTAGE 1000
> >  #define SF_POWER   1000000
> >  #define SF_CURR    1000
> >  #define SF_ENERGY  1000000
> > +#define SF_TIME    1000
> >
> >  struct hwm_reg {
> > i915_reg_t gt_perf_status;
> > @@ -53,6 +55,7 @@ struct i915_hwmon {
> > struct hwm_reg rg;
> > int scl_shift_power;
> > int scl_shift_energy;
> > +   int scl_shift_time;
> >  };
> >
> >  static void
> > @@ -161,6 +164,115 @@ hwm_energy(struct hwm_drvdata *ddat, long *energy)
> > return 0;
> >  }
> >
> > +static ssize_t
> > +hwm_power1_max_interval_show(struct device *dev, struct device_attribute 
> > *attr,
> > +char *buf)
> > +{
> > +   struct hwm_drvdata *ddat = dev_get_drvdata(dev);
> > +   struct i915_hwmon *hwmon = ddat->hwmon;
> > +   intel_wakeref_t wakeref;
> > +   u32 r, x, y, x_w = 2; /* 2 bits */
> > +   u64 tau4, out;
> > +
> > +   with_intel_runtime_pm(ddat->uncore->rpm, wakeref)
> > +   r = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
> > +
> > +   x = REG_FIELD_GET(PKG_PWR_LIM_1_TIME_X, r);
> > +   y = REG_FIELD_GET(PKG_PWR_LIM_1_TIME_Y, r);
> > +   /*
> > +* tau = 1.x * power(2,y), x = bits(23:22), y = bits(21:17)
> > +* = (4 | x) << (y - 2)
> > +* where (y - 2) ensures a 1.x fixed point representation of 1.x
> > +* However because y can be < 2, we compute
> > +* tau4 = (4 | x) << y
> > +* but add 2 when doing the final right shift to account for units
> > +*/
> > +   tau4 = ((1 << x_w) | x) << y;
> > +   /* val in hwmon interface units (millisec) */
> > +   out = mul_u64_u32_shr(tau4, SF_TIME, hwmon->scl_shift_time + x_w);
> > +
> > +   return sysfs_emit(buf, "%llu\n", out);
> > +}
> > +
> > +static ssize_t
> > +hwm_power1_max_interval_store(struct device *dev,
> > + struct device_attribute *attr,
> > + const char *buf, size_t count)
> > +{
> > +   struct hwm_drvdata *ddat = dev_get_drvdata(dev);
> > +   struct i915_hwmon *hwmon = ddat->hwmon;
> > +   long val, max_win, ret;
>
> you have some type mismatch here:
>
>  - val should be unsigned long
>  - max_win should be u64
>  - ret should be int

Thanks, fixed in v9.

>
> > +   u32 x, y, rxy, x_w = 2; /* 2 bits */
> > +   u64 tau4, r;
> > +
> > +#define PKG_MAX_WIN_DEFAULT 0x12ull
>
> could you please add a comment here?

Done.

> > +
> > +   ret = kstrtoul(buf, 0, &val);
> > +   if (ret)
> > +   return ret;
> > +
> > +   /*
> > +* val must be < max in hwmon interface units. The steps below are
> > +* explained in i915_power1_max_interval_show()
> > +*/
> > +   r = FIELD_PREP(PKG_MAX_WIN, PKG_MAX_WIN_DEFAULT);
> > +
> > +   x = REG_FIELD_GET(PKG_MAX_WIN_X, r);
> > +   y = REG_FIELD_GET(PKG_MAX_WIN_Y, r);
> > +   tau4 = ((1 << x_w) | x) << y;
> > +   max_win = mul_u64_u32_shr(tau4, SF_TIME, hwmon->scl_shift_time + x_w);
> > +
> > +   if (val > max_win)
> > +   return -EINVAL;
> > +
> > +   /* val in hw units */
> > +   val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_time, SF_TIME);
> > +   /* Convert to 1.x * power(2,y) */
> > +   if (!val)
> > +   return -EINVAL;
> > +   y = ilog2(val);
> > +   /* x = (val - (1 << y)) >> (y - 2); */
>
> some leftover

No, it's a comment describing what's happening in the line below. I've left
it as is for now. Can remove it if you think it's unnecessary.
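For reference, the 1.x * power(2,y) conversion in the quoted show/store
functions can be exercised outside the kernel. The sketch below mirrors the
arithmetic of the quoted code; the helper names, the plain C replacements
for mul_u64_u32_shr(), DIV_ROUND_CLOSEST_ULL() and ilog2(), and the shift
value used in the example are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SF_TIME	1000u	/* hwmon interface units: milliseconds */
#define X_W	2u	/* x is a 2-bit field */

/* Portable stand-in for the kernel's ilog2(); v must be non-zero. */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Decode: tau = 1.x * 2^y hardware units, returned in milliseconds. */
static uint64_t tau_decode_ms(uint32_t x, uint32_t y, unsigned int scl_shift_time)
{
	/* tau4 = (4 | x) << y, i.e. four times tau, so y < 2 never shifts negatively */
	uint64_t tau4 = (uint64_t)((1u << X_W) | x) << y;

	/* the extra X_W in the right shift removes the factor of four again */
	return (tau4 * SF_TIME) >> (scl_shift_time + X_W);
}

/* Encode: milliseconds to the (x, y) register fields. */
static void tau_encode(uint64_t ms, unsigned int scl_shift_time,
		       uint32_t *x, uint32_t *y)
{
	/* value in hardware units, rounded to nearest */
	uint64_t hw = ((ms << scl_shift_time) + SF_TIME / 2) / SF_TIME;

	assert(hw != 0);	/* the driver returns -EINVAL for this case */
	*y = ilog2_u64(hw);
	/* x = (hw - (1 << y)) >> (y - 2), rearranged so that y < 2 is safe */
	*x = (uint32_t)(((hw - (1ull << *y)) << X_W) >> *y);
}
```

With an assumed scl_shift_time of 10, 1500 ms encodes to x = 2, y = 10
(1.5 * 2^10 hardware units) and decodes back to 1500 ms.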

>
> > +   x = (val - (1ul << y)) << x_w >> y;
> > +
> > +   rxy = 

Re: [Intel-gfx] [PATCH 4/7] drm/i915/hwmon: Show device level energy usage

2022-10-13 Thread Dixit, Ashutosh
On Mon, 03 Oct 2022 14:13:10 -0700, Andi Shyti wrote:
>

Hi Andi,

> [...]
>
> > > diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
> > > b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > > index 16e697b1db3d..7525db243d74 100644
> > > --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > > +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > > @@ -25,3 +25,11 @@ Contact:   dri-devel@lists.freedesktop.org
> > >  Description: RO. Card default power limit (default TDP setting).
> > >
> > >   Only supported for particular Intel i915 graphics platforms.
> > > +
> > > > +What:  /sys/devices/.../hwmon/hwmon<i>/energy1_input
> > > +Date:February 2023
> > > +KernelVersion:   6.2
> > > +Contact: dri-devel@lists.freedesktop.org
> >
> > I'm sorry for being late on the review here, and I know that others
> > already looked at the date and other details here in this doc.
> > So I'm curious why we have decided for the dri-devel mailing list
> > and not for the intel-gfx since intel-gfx is the only one we have
> > listed for i915 dir in the MAINTAINERS file:
> > L:  intel-...@lists.freedesktop.org
>
> same question here.

These have all been changed to intel-gfx.

>
> > > +Description: RO. Energy input of device in microjoules.
> > > +
> > > + Only supported for particular Intel i915 graphics platforms.
>
> [...]
>
> > > +/*
> > > + * hwm_energy - Obtain energy value
> > > + *
> > > + * The underlying energy hardware register is 32-bits and is subject to
> > > + * overflow. How long before overflow? For example, with an example
> > > + * scaling bit shift of 14 bits (see register *PACKAGE_POWER_SKU_UNIT) 
> > > and
> > > + * a power draw of 1000 watts, the 32-bit counter will overflow in
> > > + * approximately 4.36 minutes.
> > > + *
> > > + * Examples:
> > > + *1 watt:  (2^32 >> 14) /1 W / (60 * 60 * 24) secs/day -> 3 days
> > > + * 1000 watts: (2^32 >> 14) / 1000 W / 60 secs/min -> 4.36 
> > > minutes
> > > + *
> > > + * The function significantly increases overflow duration (from 4.36
> > > + * minutes) by accumulating the energy register into a 'long' as allowed 
> > > by
> > > + * the hwmon API. Using x86_64 128 bit arithmetic (see 
> > > mul_u64_u32_shr()),
> > > + * a 'long' of 63 bits, SF_ENERGY of 1e6 (~20 bits) and
> > > + * hwmon->scl_shift_energy of 14 bits we have 57 (63 - 20 + 14) bits 
> > > before
> > > + * energy1_input overflows. This at 1000 W is an overflow duration of 
> > > 278 years.
> > > + */
> > > +static int
> > > +hwm_energy(struct hwm_drvdata *ddat, long *energy)
>
> This function can just be void.

Done.
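As a side note, the overflow estimates in the quoted hwm_energy comment can
be reproduced with a few lines of integer arithmetic. This is a standalone
sketch; overflow_seconds is a hypothetical helper, and the 14-bit shift is
the example value from the comment, not a value read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds until a counter of 'bits' bits overflows, when one joule is
 * (1 << shift) counter units and the device draws 'watts' continuously.
 */
static uint64_t overflow_seconds(unsigned int bits, unsigned int shift,
				 uint64_t watts)
{
	uint64_t joules;

	assert(bits < 64 && bits >= shift && watts > 0);
	joules = (1ull << bits) >> shift;	/* counter capacity in joules */
	return joules / watts;
}
```

With a shift of 14, the 32-bit register holds 262144 J, so it overflows
after about 262 s (4.36 minutes) at 1000 W and after about 3 days at 1 W.
The 57 bits of headroom in the accumulated value give roughly 278 years at
1000 W, matching the comment.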

Thanks.
--
Ashutosh


Re: [PATCH 4/7] drm/i915/hwmon: Show device level energy usage

2022-10-13 Thread Dixit, Ashutosh
On Fri, 30 Sep 2022 09:52:28 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Tue, Sep 27, 2022 at 11:20:17AM +0530, Badal Nilawar wrote:
> > From: Dale B Stimson 
> >
> > Use i915 HWMON to display device level energy input.
> >
> > v2: Updated the date and kernel version in feature description
> > v3:
> >   - Cleaned up hwm_energy function and removed unused function
> > i915_hwmon_energy_status_get (Ashutosh)
> > v4: KernelVersion: 6.2, Date: February 2023 in doc (Tvrtko)
> >
> > Signed-off-by: Dale B Stimson 
> > Signed-off-by: Ashutosh Dixit 
> > Signed-off-by: Riana Tauro 
> > Signed-off-by: Badal Nilawar 
> > Acked-by: Guenter Roeck 
> > Reviewed-by: Ashutosh Dixit 
> > Reviewed-by: Anshuman Gupta 
> > ---
> >  .../ABI/testing/sysfs-driver-intel-i915-hwmon |   8 ++
> >  drivers/gpu/drm/i915/i915_hwmon.c | 107 +-
> >  drivers/gpu/drm/i915/intel_mchbar_regs.h  |   2 +
> >  3 files changed, 115 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
> > b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > index 16e697b1db3d..7525db243d74 100644
> > --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> > @@ -25,3 +25,11 @@ Contact: dri-devel@lists.freedesktop.org
> >  Description:   RO. Card default power limit (default TDP setting).
> >
> > Only supported for particular Intel i915 graphics platforms.
> > +
> > +What:  /sys/devices/.../hwmon/hwmon<i>/energy1_input
> > +Date:  February 2023
> > +KernelVersion: 6.2
> > +Contact:   dri-devel@lists.freedesktop.org
>
> I'm sorry for being late on the review here, and I know that others
> already looked at the date and other details here in this doc.
> So I'm curious why we have decided for the dri-devel mailing list
> and not for the intel-gfx since intel-gfx is the only one we have
> listed for i915 dir in the MAINTAINERS file:
> L:  intel-...@lists.freedesktop.org

I have changed the contact to intel-...@lists.freedesktop.org in v9 for all
patches.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH 3/7] drm/i915/hwmon: Power PL1 limit and TDP setting

2022-10-13 Thread Dixit, Ashutosh
On Mon, 03 Oct 2022 14:05:14 -0700, Andi Shyti wrote:
>
> Hi Badal,
>
> [...]
>
> >  hwm_get_preregistration_info(struct drm_i915_private *i915)
> >  {
> > struct i915_hwmon *hwmon = i915->hwmon;
> > +   struct intel_uncore *uncore = &i915->uncore;
> > +   intel_wakeref_t wakeref;
> > +   u32 val_sku_unit;
> >
> > -   if (IS_DG1(i915) || IS_DG2(i915))
> > +   if (IS_DG1(i915) || IS_DG2(i915)) {
> > hwmon->rg.gt_perf_status = GEN12_RPSTAT1;
> > -   else
> > +   hwmon->rg.pkg_power_sku_unit = PCU_PACKAGE_POWER_SKU_UNIT;
> > +   hwmon->rg.pkg_power_sku = PCU_PACKAGE_POWER_SKU;
> > +   hwmon->rg.pkg_rapl_limit = PCU_PACKAGE_RAPL_LIMIT;
> > +   } else {
> > hwmon->rg.gt_perf_status = INVALID_MMIO_REG;
> > +   hwmon->rg.pkg_power_sku_unit = INVALID_MMIO_REG;
> > +   hwmon->rg.pkg_power_sku = INVALID_MMIO_REG;
> > +   hwmon->rg.pkg_rapl_limit = INVALID_MMIO_REG;
> > +   }
> > +
> > +   with_intel_runtime_pm(uncore->rpm, wakeref) {
> > +   /*
> > +* The contents of register hwmon->rg.pkg_power_sku_unit do not 
> > change,
> > +* so read it once and store the shift values.
> > +*/
> > +   if (i915_mmio_reg_valid(hwmon->rg.pkg_power_sku_unit)) {
> > +   val_sku_unit = intel_uncore_read(uncore,
> > +
> > hwmon->rg.pkg_power_sku_unit);
> > +   } else {
> > +   val_sku_unit = 0;
> > +   }
>
> please remove the brackets here and, just a small nitpick:
>
> move val_sku_unit inside the "with_intel_runtime_pm()" and
> initialize it to '0', you will save the else statement.

Hi Andi, fixed in v9 of the series.

>
> Other than that:
>
> Reviewed-by: Andi Shyti 

Thanks.
--
Ashutosh


Re: [PATCH 4/7] drm/i915/hwmon: Show device level energy usage

2022-10-13 Thread Dixit, Ashutosh
On Wed, 21 Sep 2022 05:02:48 -0700, Gupta, Anshuman wrote:
>
> > diff --git a/drivers/gpu/drm/i915/intel_mchbar_regs.h 
> > b/drivers/gpu/drm/i915/intel_mchbar_regs.h
> > index b74df11977c6..1014d0b7cc16 100644
> > --- a/drivers/gpu/drm/i915/intel_mchbar_regs.h
> > +++ b/drivers/gpu/drm/i915/intel_mchbar_regs.h
> > @@ -191,7 +191,9 @@
> > #define PCU_PACKAGE_POWER_SKU_UNIT
> > _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5938)
> >   #define   PKG_PWR_UNIT     REG_GENMASK(3, 0)
> > +#define   PKG_ENERGY_UNIT  REG_GENMASK(12, 8)
> Please use tab here instead of space to line up with above macros.

Fixed in v9.

> With that,
> Reviewed-by: Anshuman Gupta 

Thanks.


Re: [Intel-gfx] [PATCH 2/7] drm/i915/hwmon: Add HWMON current voltage support

2022-10-13 Thread Dixit, Ashutosh
On Mon, 03 Oct 2022 13:56:05 -0700, Andi Shyti wrote:

Hi Andi,

Badal is out for a bit so I am posting this version of the patches.

>
> Hi Badal,
>
> [...]
>
> >  static void
> >  hwm_get_preregistration_info(struct drm_i915_private *i915)
> >  {
> > +   struct i915_hwmon *hwmon = i915->hwmon;
> > +
> > +   if (IS_DG1(i915) || IS_DG2(i915))
>
> why not GRAPHICS_VER(i915) >= 12 here?

Thanks for catching this, because GEN12_RPSTAT1 is indeed available for all
Gen12+. It was done this way because the voltage bits of GEN12_RPSTAT1 are
only available for DG1/DG2. Anyway in v9 I have changed this to just:

/* Available for all Gen12+/dGfx */
hwmon->rg.gt_perf_status = GEN12_RPSTAT1;

That is because hwmon is only available for dGfx (there's a check in Patch
1). Also, because of this change the 'IS_DG1(i915) || IS_DG2(i915)' check
has been moved to hwm_in_is_visible.

Thanks.
--
Ashutosh

> > +   hwmon->rg.gt_perf_status = GEN12_RPSTAT1;
> > +   else
> > +   hwmon->rg.gt_perf_status = INVALID_MMIO_REG;
> >  }
> >
> >  void i915_hwmon_register(struct drm_i915_private *i915)
> > --
> > 2.25.1


Re: [PATCH] drm/i915/pmu: Match frequencies reported by PMU and sysfs

2022-10-05 Thread Dixit, Ashutosh
On Tue, 04 Oct 2022 06:00:22 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

>
> On 04/10/2022 10:29, Tvrtko Ursulin wrote:
> >
> > On 03/10/2022 20:24, Ashutosh Dixit wrote:
> >> PMU and sysfs use different wakeref's to "interpret" zero freq. Sysfs
> >> uses
> >> runtime PM wakeref (see intel_rps_read_punit_req and
> >> intel_rps_read_actual_frequency). PMU uses the GT parked/unparked
> >> wakeref. In general the GT wakeref is held for less time than the runtime
> >> PM wakeref which causes PMU to report a lower average freq than the
> >> average
> >> freq obtained from sampling sysfs.
> >>
> >> To resolve this, use the same freq functions (and wakeref's) in PMU as
> >> those used in sysfs.
> >>
> >> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7025
> >> Reported-by: Ashwin Kumar Kulkarni 
> >> Cc: Tvrtko Ursulin 
> >> Signed-off-by: Ashutosh Dixit 
> >> ---
> >>   drivers/gpu/drm/i915/i915_pmu.c | 27 ++-
> >>   1 file changed, 2 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_pmu.c
> >> b/drivers/gpu/drm/i915/i915_pmu.c
> >> index 958b37123bf1..eda03f264792 100644
> >> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >> @@ -371,37 +371,16 @@ static void
> >>   frequency_sample(struct intel_gt *gt, unsigned int period_ns)
> >>   {
> >>   struct drm_i915_private *i915 = gt->i915;
> >> -    struct intel_uncore *uncore = gt->uncore;
> >>   struct i915_pmu *pmu = &i915->pmu;
> >>   struct intel_rps *rps = &gt->rps;
> >>   if (!frequency_sampling_enabled(pmu))
> >>   return;
> >> -    /* Report 0/0 (actual/requested) frequency while parked. */
> >> -    if (!intel_gt_pm_get_if_awake(gt))
> >> -    return;
> >> -
> >>   if (pmu->enable & config_mask(I915_PMU_ACTUAL_FREQUENCY)) {
> >> -    u32 val;
> >> -
> >> -    /*
> >> - * We take a quick peek here without using forcewake
> >> - * so that we don't perturb the system under observation
> >> - * (forcewake => !rc6 => increased power use). We expect
> >> - * that if the read fails because it is outside of the
> >> - * mmio power well, then it will return 0 -- in which
> >> - * case we assume the system is running at the intended
> >> - * frequency. Fortunately, the read should rarely fail!
> >> - */
> >> -    val = intel_uncore_read_fw(uncore, GEN6_RPSTAT1);
> >> -    if (val)
> >> -    val = intel_rps_get_cagf(rps, val);
> >> -    else
> >> -    val = rps->cur_freq;
> >> -
> >>   add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],
> >> -    intel_gpu_freq(rps, val), period_ns / 1000);
> >> +    intel_rps_read_actual_frequency(rps),
> >> +    period_ns / 1000);
> >>   }
> >>   if (pmu->enable & config_mask(I915_PMU_REQUESTED_FREQUENCY)) {
> >
> > What is software tracking of requested frequency showing when GT is
> > parked or runtime suspended? With this change sampling would be outside
> > any such checks so we need to be sure reported value makes sense.
> >
> > Although the more important open question is what is actually correct.
> >
> > For instance how does the patch affect RC6 and power? I don't know how
> > power management of different blocks is wired up, so personally I would
> > only be able to look at it empirically. In other words what I am asking
> > is this - if we changed from skipping obtaining forcewake even when
> > unparked, to obtaining forcewake if not runtime suspended - what hardware
> > blocks does that power up and how it affects RC6 and power? Can it affect
> > actual frequency or not? (Will "something" power up the clocks just
> > because we will be getting forcewake?)
> >
> > Or, to simplify the question - does 200Hz polling on the existing sysfs
> > actual frequency field disturb the system under some circumstances?
> > (Increases power and decreases RC6.) If it does then that would be a
> > problem. We want a solution which shows the real data, but where the act
> > of monitoring itself does not change it too much. If it doesn't then it's
> > okay.
> >
> > Could you somehow investigate on these topics? Maybe log RAPL GPU power
> > while polling on sysfs, versus getting the actual frequency from the
> > existing PMU implementation and see if that shows anything? Or actually
> > simpler - RAPL GPU power for current PMU intel_gpu_top versus this patch?
> > On idle(-ish) desktop workloads perhaps? Power and frequency graphed for
> > both.
>
> Another thought - considering that bspec says for 0xa01c "This register
> reflects real-time values and thus does not have a pre-determined default
> value out of reset" - could it be that it also does not reflect a real
> value when GPU is not executing anything (so zero), just happens to be not
> runtime suspended? That would mean sysfs reads could maybe show last known
> value? Just a thought to check.

Thanks for the suggestion, I'll try 
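
To make the averaging effect described in the commit message concrete, here
is a toy model in C. It is only a sketch under stated assumptions: the struct,
the enum and the helper names are hypothetical and are not i915 code, and the
sample values are invented. Each timer tick is counted only when the chosen
wakeref is held; otherwise the tick contributes 0 to the sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one sampling tick; not an i915 structure. */
struct freq_sample {
	int gt_awake;      /* GT unparked wakeref held at sample time */
	int rpm_awake;     /* runtime PM wakeref held at sample time */
	uint32_t freq_mhz; /* frequency that a register read would return */
};

/* Which wakeref decides whether a tick is counted or reported as 0. */
enum sample_gate {
	GATE_GT_WAKEREF, /* models the PMU behaviour before the patch */
	GATE_RUNTIME_PM  /* models the sysfs behaviour */
};

/* Invented data: the GT is unparked for 2 of 4 ticks, runtime PM for all 4. */
const struct freq_sample demo_samples[4] = {
	{ 1, 1, 300 },
	{ 0, 1, 300 },
	{ 0, 1, 300 },
	{ 1, 1, 300 },
};

/* Average over all ticks; ticks not covered by the gate contribute 0. */
uint32_t average_freq(const struct freq_sample *s, size_t n,
		      enum sample_gate gate)
{
	uint64_t sum = 0;
	size_t i;

	assert(s != NULL || n == 0);
	if (n == 0)
		return 0;

	for (i = 0; i < n; i++) {
		int counted = (gate == GATE_GT_WAKEREF) ? s[i].gt_awake
							: s[i].rpm_awake;
		if (counted)
			sum += s[i].freq_mhz;
	}

	return (uint32_t)(sum / n);
}
```

With the invented data above, gating on the GT wakeref averages two ticks of
300 and two ticks of 0, while gating on runtime PM averages four ticks of 300.
Whether the register really holds a meaningful value while the GT is parked is
exactly the open question raised above; the model simply assumes it does.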

Re: [Intel-gfx] [PATCH] drm/i915: Perf_limit_reasons are only available for Gen11+

2022-09-28 Thread Dixit, Ashutosh
On Wed, 28 Sep 2022 11:35:18 -0700, Rodrigo Vivi wrote:
>
> On Wed, Sep 28, 2022 at 11:17:06AM -0700, Dixit, Ashutosh wrote:
> > On Wed, 28 Sep 2022 04:38:46 -0700, Jani Nikula wrote:
> > >
> > > On Mon, 19 Sep 2022, Ashutosh Dixit  wrote:
> > > > Register GT0_PERF_LIMIT_REASONS (0x1381a8) is available only for
> > > > Gen11+. Therefore ensure perf_limit_reasons sysfs/debugfs files are
> > > > created only for Gen11+. Otherwise on Gen < 5 accessing these files
> > > > results in the following oops:
> > > >
> > > > <1> [88.829420] BUG: unable to handle page fault for address: c9bb81a8
> > > > <1> [88.829438] #PF: supervisor read access in kernel mode
> > > > <1> [88.829447] #PF: error_code(0x) - not-present page
> > > >
> > > > Bspec: 20008
> > > > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6863
> > > > Fixes: fe5979665f64 ("drm/i915/debugfs: Add perf_limit_reasons in debugfs")
> > > > Fixes: fa68bff7cf27 ("drm/i915/gt: Add sysfs throttle frequency interfaces")
> > > > Signed-off-by: Ashutosh Dixit 
> > >
> >
> > Hi Jani,
> >
> > > Ashutosh, can you provide a backport of this i.e. commit 0d2d201095e9
> > > ("drm/i915: Perf_limit_reasons are only available for Gen11+") that
> > > applies cleanly on drm-intel-fixes, please?
> >
> > I've sent the patch:
> >
> > https://patchwork.freedesktop.org/series/109196/
> >
> > Not sure though if it is worth applying on drm-intel-fixes because of one
> > conflict with drm-tip which will need to be resolved manually. On

Hi Rodrigo,

> The conflict shouldn't be that bad to resolve, but since the patch deviates
> from the original, the new commit message needs to highlight and explain
> that this is a backport, give the reasons for the difference, and include the
> sha of the already merged patch. Similar to option 3 of the stable rules [1].

I have improved the commit message and sent out a v2. Please take a look.

Thanks.
--
Ashutosh

> Well, another option is to wait until this patch gets propagated to Linus'
> master and then send the backported version to the stable mailing list. But
> again, following the rules of option 3 [1].
>
> [1] - https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
>
> > drm-intel-fixes the crash mentioned above will be seen only on Gen < 5 if
> > someone manually cat's the sysfs. We had to fix on drm-tip because there
> > was a CI failure with Gen3 debugfs but that code is not in drm-intel-fixes.
>
> since it is sysfs it is probably a good protection to have anyway.
>
> >
> > Thanks.
> > --
> > Ashutosh


Re: [Intel-gfx] [PATCH] drm/i915: Perf_limit_reasons are only available for Gen11+

2022-09-28 Thread Dixit, Ashutosh
On Wed, 28 Sep 2022 04:38:46 -0700, Jani Nikula wrote:
>
> On Mon, 19 Sep 2022, Ashutosh Dixit  wrote:
> > Register GT0_PERF_LIMIT_REASONS (0x1381a8) is available only for
> > Gen11+. Therefore ensure perf_limit_reasons sysfs/debugfs files are created
> > only for Gen11+. Otherwise on Gen < 5 accessing these files results in the
> > following oops:
> >
> > <1> [88.829420] BUG: unable to handle page fault for address: c9bb81a8
> > <1> [88.829438] #PF: supervisor read access in kernel mode
> > <1> [88.829447] #PF: error_code(0x) - not-present page
> >
> > Bspec: 20008
> > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6863
> > Fixes: fe5979665f64 ("drm/i915/debugfs: Add perf_limit_reasons in debugfs")
> > Fixes: fa68bff7cf27 ("drm/i915/gt: Add sysfs throttle frequency interfaces")
> > Signed-off-by: Ashutosh Dixit 
>

Hi Jani,

> Ashutosh, can you provide a backport of this i.e. commit 0d2d201095e9
> ("drm/i915: Perf_limit_reasons are only available for Gen11+") that
> applies cleanly on drm-intel-fixes, please?

I've sent the patch:

https://patchwork.freedesktop.org/series/109196/

Not sure though if it is worth applying on drm-intel-fixes because of one
conflict with drm-tip which will need to be resolved manually. On
drm-intel-fixes the crash mentioned above will be seen only on Gen < 5 if
someone manually cat's the sysfs. We had to fix on drm-tip because there
was a CI failure with Gen3 debugfs but that code is not in drm-intel-fixes.
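
The gating described in the commit message can be sketched in C as follows.
This is a minimal illustration, assuming the only condition is the graphics
version; the function name and its plain integer parameter are hypothetical
stand-ins and are not the i915 helpers used in the actual patch.

```c
#include <stdbool.h>

/* Minimum graphics version taken from the commit message ("Gen11+"). */
#define PERF_LIMIT_REASONS_MIN_VER 11

/*
 * Hypothetical helper: report whether the perf_limit_reasons files should
 * be created at all. On older hardware the register does not exist, so the
 * files are never exposed and cannot be read by accident.
 */
bool perf_limit_reasons_supported(unsigned int graphics_ver)
{
	return graphics_ver >= PERF_LIMIT_REASONS_MIN_VER;
}
```

Checking once at file creation time, rather than in every read handler, keeps
the sysfs and debugfs paths consistent: a file that does not exist cannot
trigger the page fault shown in the oops.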

Thanks.
--
Ashutosh


Re: [PATCH 1/1] drm/i915: Use GEN12 RPSTAT register

2022-09-27 Thread Dixit, Ashutosh
On Tue, 27 Sep 2022 04:35:29 -0700, Badal Nilawar wrote:
>
> From: Don Hiatt 
>
> On GEN12 and above use GEN12_RPSTAT register to get Current
> Actual Graphics Frequency of GT

I think even for the purposes of reviewing this it would be good to mention
in the commit message that:

a. The GEN12_RPSTAT register doesn't require a forcewake to be read (it
   doesn't belong to a forcewake domain)
b. Reading it will result in a 0 frequency if the GT is in RC6
Thanks.
--
Ashutosh

> v2:
>   - Fixed review comments(Ashutosh)
>   - Added function intel_rps_read_rpstat_fw to read RPSTAT without
> forcewake, required especially for GEN6_RPSTAT1 (Ashutosh, Tvrtko)
>
> Cc: Don Hiatt 
> Cc: Andi Shyti 
> Signed-off-by: Don Hiatt 
> Signed-off-by: Badal Nilawar 
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |  2 +-
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h   |  4 +++
>  drivers/gpu/drm/i915/gt/intel_rps.c   | 32 +--
>  drivers/gpu/drm/i915/gt/intel_rps.h   |  2 ++
>  drivers/gpu/drm/i915/i915_pmu.c   |  3 +-
>  5 files changed, 38 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> index 10f680dbd7b6..b9b47052b26d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
> @@ -380,7 +380,7 @@ void intel_gt_pm_frequency_dump(struct intel_gt *gt, struct drm_printer *p)
>   rpinclimit = intel_uncore_read(uncore, GEN6_RP_UP_THRESHOLD);
>   rpdeclimit = intel_uncore_read(uncore, GEN6_RP_DOWN_THRESHOLD);
>
> - rpstat = intel_uncore_read(uncore, GEN6_RPSTAT1);
> + rpstat = intel_rps_read_rpstat(rps);
>   rpcurupei = intel_uncore_read(uncore, GEN6_RP_CUR_UP_EI) & GEN6_CURICONT_MASK;
>   rpcurup = intel_uncore_read(uncore, GEN6_RP_CUR_UP) & GEN6_CURBSYTAVG_MASK;
>   rpprevup = intel_uncore_read(uncore, GEN6_RP_PREV_UP) & GEN6_CURBSYTAVG_MASK;
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index 7f79bbf97828..1f1e90acc1ab 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -1519,6 +1519,10 @@
>  #define VLV_RENDER_C0_COUNT  _MMIO(0x138118)
>  #define VLV_MEDIA_C0_COUNT   _MMIO(0x13811c)
>
> +#define GEN12_RPSTAT1        _MMIO(0x1381b4)
> +#define   GEN12_CAGF_SHIFT   11
> +#define   GEN12_CAGF_MASK    REG_GENMASK(19, 11)
> +
>  #define GEN11_GT_INTR_DW(x)  _MMIO(0x190018 + ((x) * 4))
>  #define   GEN11_CSME (31)
>  #define   GEN11_GUNIT(28)
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
> index 17b40b625e31..5a15a630b1c6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -2068,12 +2068,40 @@ void intel_rps_sanitize(struct intel_rps *rps)
>   rps_disable_interrupts(rps);
>  }
>
> +u32 intel_rps_read_rpstat_fw(struct intel_rps *rps)
> +{
> + struct drm_i915_private *i915 = rps_to_i915(rps);
> + i915_reg_t rpstat;
> +
> + if (GRAPHICS_VER(i915) >= 12)
> + rpstat = GEN12_RPSTAT1;
> + else
> + rpstat = GEN6_RPSTAT1;
> +
> + return intel_uncore_read_fw(rps_to_gt(rps)->uncore, rpstat);
> +}
> +
> +u32 intel_rps_read_rpstat(struct intel_rps *rps)
> +{
> + struct drm_i915_private *i915 = rps_to_i915(rps);
> + i915_reg_t rpstat;
> +
> + if (GRAPHICS_VER(i915) >= 12)
> + rpstat = GEN12_RPSTAT1;
> + else
> + rpstat = GEN6_RPSTAT1;
> +
> + return intel_uncore_read(rps_to_gt(rps)->uncore, rpstat);
> +}
> +
>  u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat)
>  {
>   struct drm_i915_private *i915 = rps_to_i915(rps);
>   u32 cagf;
>
> - if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> + if (GRAPHICS_VER(i915) >= 12)
> + cagf = (rpstat & GEN12_CAGF_MASK) >> GEN12_CAGF_SHIFT;
> + else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
>   cagf = (rpstat >> 8) & 0xff;
>   else if (GRAPHICS_VER(i915) >= 9)
>   cagf = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
> @@ -2099,7 +2127,7 @@ static u32 read_cagf(struct intel_rps *rps)
>   freq = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS);
>   vlv_punit_put(i915);
>   } else if (GRAPHICS_VER(i915) >= 6) {
> - freq = intel_uncore_read(uncore, GEN6_RPSTAT1);
> + freq = intel_rps_read_rpstat(rps);
>   } else {
>   freq = intel_uncore_read(uncore, MEMSTAT_ILK);
>   }
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h
> index 4509dfdc52e0..76c8404d8416 100644
> --- 

Re: [PATCH 6/7] drm/i915/hwmon: Expose power1_max_interval

2022-09-23 Thread Dixit, Ashutosh
On Fri, 23 Sep 2022 12:56:42 -0700, Badal Nilawar wrote:
>
> From: Ashutosh Dixit 
>
> Expose power1_max_interval, that is the tau corresponding to PL1.

I think let's change the above sentence to: "Expose power1_max_interval,
that is the tau corresponding to PL1, as a custom hwmon attribute".

This is the only custom attribute we are exposing so better to mention this
in the commit message I think.

Thanks.
--
Ashutosh


Re: [PATCH 4/7] drm/i915/hwmon: Show device level energy usage

2022-09-23 Thread Dixit, Ashutosh
On Fri, 23 Sep 2022 12:56:40 -0700, Badal Nilawar wrote:
>
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.h b/drivers/gpu/drm/i915/i915_hwmon.h
> index 7ca9cf2c34c9..4e5b6c149f3a 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.h
> +++ b/drivers/gpu/drm/i915/i915_hwmon.h
> @@ -17,4 +17,5 @@ static inline void i915_hwmon_register(struct drm_i915_private *i915) { };
>  static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { };
>  #endif
>
> +int i915_hwmon_energy_status_get(struct drm_i915_private *i915, long *energy);

We deleted this function's definition; this declaration is just a leftover,
so please delete it too.

