Re: [PATCH v2] drm/i915/guc: Use context hints for GT freq

2024-02-28 Thread Belgaumkar, Vinay



On 2/28/2024 4:54 AM, Tvrtko Ursulin wrote:


On 27/02/2024 23:51, Vinay Belgaumkar wrote:

Allow user to provide a low latency context hint. When set, KMD
sends a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more slowly.
We also disable waitboost for this context as that will interfere with
the strategy.

We need to enable the use of SLPC Compute strategy during init, but
it will apply only to contexts that set this bit during context
creation.

Userland can check whether this feature is supported using a new param -
I915_PARAM_HAS_CONTEXT_FREQ_HINTS. This flag is true for all guc submission
enabled platforms as they use SLPC for frequency management.

The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint
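For illustration only, a rough sketch of how userspace could probe and use the
new uapi (assumes the I915_PARAM_HAS_CONTEXT_FREQ_HINTS and
I915_CONTEXT_PARAM_LOW_LATENCY values added by this patch and a libdrm fd;
error handling trimmed; not taken from the Mesa branch above):

#include <stdint.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

static int has_context_freq_hints(int fd)
{
	int value = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_CONTEXT_FREQ_HINTS,
		.value = &value,
	};

	/* False on non-GuC/SLPC platforms, true otherwise. */
	if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return 0;
	return value;
}

static uint32_t create_low_latency_context(int fd)
{
	struct drm_i915_gem_context_create_ext_setparam p = {
		.base = { .name = I915_CONTEXT_CREATE_EXT_SETPARAM },
		.param = {
			.param = I915_CONTEXT_PARAM_LOW_LATENCY,
			.value = 1,
		},
	};
	struct drm_i915_gem_context_create_ext create = {
		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
		.extensions = (uintptr_t)&p,
	};

	/* The hint is set at context creation time via the proto context. */
	drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create);
	return create.ctx_id;
}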

v2: Rename flags as per review suggestions (Rodrigo, Tvrtko).
Also, use flag bits in intel_context as it allows finer control for
toggling per engine if needed (Tvrtko).

Cc: Rodrigo Vivi 
Cc: Tvrtko Ursulin 
Cc: Sushma Venkatesh Reddy 
Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 15 +++--
  .../gpu/drm/i915/gem/i915_gem_context_types.h |  1 +
  drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
  drivers/gpu/drm/i915/gt/intel_rps.c   |  5 +
  .../drm/i915/gt/uc/abi/guc_actions_slpc_abi.h | 21 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   | 17 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h   |  1 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  6 ++
  drivers/gpu/drm/i915/i915_getparam.c  | 12 +++
  include/uapi/drm/i915_drm.h   | 15 +
  10 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c

index dcbfe32fd30c..0799cb0b2803 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -879,6 +879,7 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,

 struct i915_gem_proto_context *pc,
 struct drm_i915_gem_context_param *args)
  {
+    struct drm_i915_private *i915 = fpriv->i915;
  int ret = 0;
    switch (args->param) {
@@ -904,6 +905,13 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,

  pc->user_flags &= ~BIT(UCONTEXT_BANNABLE);
  break;
  +    case I915_CONTEXT_PARAM_LOW_LATENCY:
+    if (intel_uc_uses_guc_submission(&to_gt(i915)->uc))
+    pc->user_flags |= BIT(UCONTEXT_LOW_LATENCY);
+    else
+    ret = -EINVAL;
+    break;
+
  case I915_CONTEXT_PARAM_RECOVERABLE:
  if (args->size)
  ret = -EINVAL;
@@ -992,6 +1000,9 @@ static int intel_context_set_gem(struct 
intel_context *ce,
  if (sseu.slice_mask && !WARN_ON(ce->engine->class != 
RENDER_CLASS))

  ret = intel_context_reconfigure_sseu(ce, sseu);
  +    if (test_bit(UCONTEXT_LOW_LATENCY, &ctx->user_flags))
+    set_bit(CONTEXT_LOW_LATENCY, &ce->flags);


Does not need to be atomic so can use __set_bit as higher up in the 
function.

ok.
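(For reference, a minimal sketch of the non-atomic variant in
intel_context_set_gem(), matching the hunk above; illustrative only:)

	/* Context creation is single-threaded here, so the plain
	 * (non-atomic) helper is enough, like the flags set earlier. */
	if (test_bit(UCONTEXT_LOW_LATENCY, &ctx->user_flags))
		__set_bit(CONTEXT_LOW_LATENCY, &ce->flags);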



+
  return ret;
  }
  @@ -1630,6 +1641,8 @@ i915_gem_create_context(struct 
drm_i915_private *i915,

  if (vm)
  ctx->vm = vm;
  +    ctx->user_flags = pc->user_flags;
+


Given how most ctx->something assignments are at the bottom of the 
function I would stick a comment here saying along the lines of 
"assign early for intel_context_set_gem called when creating engines".

ok.
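(Something along these lines would capture the suggestion; the wording is
illustrative only:)

	/*
	 * Assign user_flags early: intel_context_set_gem() reads it when
	 * the engines are created further down in this function.
	 */
	ctx->user_flags = pc->user_flags;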



mutex_init(&ctx->engines_mutex);
  if (pc->num_user_engines >= 0) {
  i915_gem_context_set_user_engines(ctx);
@@ -1652,8 +1665,6 @@ i915_gem_create_context(struct drm_i915_private 
*i915,

   * is no remap info, it will be a NOP. */
  ctx->remap_slice = ALL_L3_SLICES(i915);
  -    ctx->user_flags = pc->user_flags;
-
  for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
  ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
  diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h

index 03bc7f9d191b..b6d97da63d1f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -338,6 +338,7 @@ struct i915_gem_context {
  #define UCONTEXT_BANNABLE    2
  #define UCONTEXT_RECOVERABLE    3
  #define UCONTEXT_PERSISTENCE    4
+#define UCONTEXT_LOW_LATENCY    5
    /**
   * @flags: small set of booleans
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h

index 7eccbd70d89f..ed95a7b57cbb 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ 

Re: [PATCH] drm/i915/guc: Add Compute context hint

2024-02-23 Thread Belgaumkar, Vinay



On 2/23/2024 12:51 AM, Tvrtko Ursulin wrote:


On 22/02/2024 23:31, Belgaumkar, Vinay wrote:


On 2/22/2024 7:32 AM, Tvrtko Ursulin wrote:


On 21/02/2024 21:28, Rodrigo Vivi wrote:

On Wed, Feb 21, 2024 at 09:42:34AM +, Tvrtko Ursulin wrote:


On 21/02/2024 00:14, Vinay Belgaumkar wrote:

Allow user to provide a context hint. When this is set, KMD will
send a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more 
slowly.
We also disable waitboost for this context as that will interfere 
with

the strategy.

We need to enable the use of Compute strategy during SLPC init, but
it will apply only to contexts that set this bit during context
creation.

Userland can check whether this feature is supported using a new param -
I915_PARAM_HAS_COMPUTE_CONTEXT. This flag is true for all guc submission
enabled platforms since they use SLPC for freq management.

The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint


This allows for setting it for the whole application, correct? 
Upsides,

downsides? Are there any plans for per context?


Currently there's no extension on a high level API 
(Vulkan/OpenGL/OpenCL/etc)
that would allow the application to hint for power/freq/latency. So 
Mesa cannot
decide when to hint. So their solution was to use .drirc and make 
per-application

decision.

I would prefer a high level extension for a more granular and 
informative

decision. We need to work with that goal, but for now I don't see any
cons on this approach.


In principle, yeah, it doesn't harm to have the option. I am just not
sure how useful this intermediate step is with its lack of
intra-process granularity.



Cc: Rodrigo Vivi 
Signed-off-by: Vinay Belgaumkar 
---
   drivers/gpu/drm/i915/gem/i915_gem_context.c   |  8 +++
   .../gpu/drm/i915/gem/i915_gem_context_types.h |  1 +
   drivers/gpu/drm/i915/gt/intel_rps.c   |  8 +++
   .../drm/i915/gt/uc/abi/guc_actions_slpc_abi.h | 21 
+++
   drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   | 17 
+++

   drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h   |  1 +
   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  7 +++
   drivers/gpu/drm/i915/i915_getparam.c  | 11 ++
   include/uapi/drm/i915_drm.h   | 15 +
   9 files changed, 89 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c

index dcbfe32fd30c..ceab7dbe9b47 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -879,6 +879,7 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,

  struct i915_gem_proto_context *pc,
  struct drm_i915_gem_context_param *args)
   {
+    struct drm_i915_private *i915 = fpriv->i915;
   int ret = 0;
   switch (args->param) {
@@ -904,6 +905,13 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,

   pc->user_flags &= ~BIT(UCONTEXT_BANNABLE);
   break;
+    case I915_CONTEXT_PARAM_IS_COMPUTE:
+    if (!intel_uc_uses_guc_submission(&to_gt(i915)->uc))
+    ret = -EINVAL;
+    else
+    pc->user_flags |= BIT(UCONTEXT_COMPUTE);
+    break;
+
   case I915_CONTEXT_PARAM_RECOVERABLE:
   if (args->size)
   ret = -EINVAL;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h

index 03bc7f9d191b..db86d6f6245f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -338,6 +338,7 @@ struct i915_gem_context {
   #define UCONTEXT_BANNABLE    2
   #define UCONTEXT_RECOVERABLE    3
   #define UCONTEXT_PERSISTENCE    4
+#define UCONTEXT_COMPUTE    5


What is the GuC behaviour when SLPC_CTX_FREQ_REQ_IS_COMPUTE is set 
for
non-compute engines? Wondering if per intel_context is what we 
want instead.

(Which could then be the i915_context_param_engines extension to mark
individual contexts as compute strategy.)


Perhaps we should rename this? This is a freq-decision strategy inside
GuC that mostly targets compute workloads that need lower latency with
short burst execution. But the engine itself doesn't matter.

It can be applied to any engine.


I have no idea if it makes sense for other engines, such as video, 
and what would be pros and cons in terms of PnP. But in the case we 
end up allowing it on any engine, then at least userspace name 
shouldn't be compute. :)
Yes, one of the suggestions from Daniele was to have something along
the lines of UCONTEXT_HIFREQ so we don't confuse it with the Compute
Engine.

Re: [PATCH] drm/i915/guc: Add Compute context hint

2024-02-22 Thread Belgaumkar, Vinay



On 2/22/2024 7:32 AM, Tvrtko Ursulin wrote:


On 21/02/2024 21:28, Rodrigo Vivi wrote:

On Wed, Feb 21, 2024 at 09:42:34AM +, Tvrtko Ursulin wrote:


On 21/02/2024 00:14, Vinay Belgaumkar wrote:

Allow user to provide a context hint. When this is set, KMD will
send a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more slowly.
We also disable waitboost for this context as that will interfere with
the strategy.

We need to enable the use of Compute strategy during SLPC init, but
it will apply only to contexts that set this bit during context
creation.

Userland can check whether this feature is supported using a new param -
I915_PARAM_HAS_COMPUTE_CONTEXT. This flag is true for all guc submission
enabled platforms since they use SLPC for freq management.

The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint


This allows for setting it for the whole application, correct? Upsides,
downsides? Are there any plans for per context?


Currently there's no extension on a high level API 
(Vulkan/OpenGL/OpenCL/etc)
that would allow the application to hint for power/freq/latency. So 
Mesa cannot
decide when to hint. So their solution was to use .drirc and make 
per-application

decision.

I would prefer a high level extension for a more granular and 
informative

decision. We need to work with that goal, but for now I don't see any
cons on this approach.


In principle, yeah, it doesn't harm to have the option. I am just not
sure how useful this intermediate step is with its lack of
intra-process granularity.



Cc: Rodrigo Vivi 
Signed-off-by: Vinay Belgaumkar 
---
   drivers/gpu/drm/i915/gem/i915_gem_context.c   |  8 +++
   .../gpu/drm/i915/gem/i915_gem_context_types.h |  1 +
   drivers/gpu/drm/i915/gt/intel_rps.c   |  8 +++
   .../drm/i915/gt/uc/abi/guc_actions_slpc_abi.h | 21 
+++

   drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   | 17 +++
   drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h   |  1 +
   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  7 +++
   drivers/gpu/drm/i915/i915_getparam.c  | 11 ++
   include/uapi/drm/i915_drm.h   | 15 +
   9 files changed, 89 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c

index dcbfe32fd30c..ceab7dbe9b47 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -879,6 +879,7 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,

  struct i915_gem_proto_context *pc,
  struct drm_i915_gem_context_param *args)
   {
+    struct drm_i915_private *i915 = fpriv->i915;
   int ret = 0;
   switch (args->param) {
@@ -904,6 +905,13 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,

   pc->user_flags &= ~BIT(UCONTEXT_BANNABLE);
   break;
+    case I915_CONTEXT_PARAM_IS_COMPUTE:
+    if (!intel_uc_uses_guc_submission(&to_gt(i915)->uc))
+    ret = -EINVAL;
+    else
+    pc->user_flags |= BIT(UCONTEXT_COMPUTE);
+    break;
+
   case I915_CONTEXT_PARAM_RECOVERABLE:
   if (args->size)
   ret = -EINVAL;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h

index 03bc7f9d191b..db86d6f6245f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -338,6 +338,7 @@ struct i915_gem_context {
   #define UCONTEXT_BANNABLE    2
   #define UCONTEXT_RECOVERABLE    3
   #define UCONTEXT_PERSISTENCE    4
+#define UCONTEXT_COMPUTE    5


What is the GuC behaviour when SLPC_CTX_FREQ_REQ_IS_COMPUTE is set for
non-compute engines? Wondering if per intel_context is what we want 
instead.

(Which could then be the i915_context_param_engines extension to mark
individual contexts as compute strategy.)


Perhaps we should rename this? This is a freq-decision strategy inside
GuC that mostly targets compute workloads that need lower latency with
short burst execution. But the engine itself doesn't matter.

It can be applied to any engine.


I have no idea if it makes sense for other engines, such as video, and 
what would be pros and cons in terms of PnP. But in the case we end up 
allowing it on any engine, then at least userspace name shouldn't be 
compute. :)
Yes, one of the suggestions from Daniele was to have something along the
lines of UCONTEXT_HIFREQ so we don't confuse it with the Compute Engine.


Or if we decide to call it compute and only apply to compute engines, 
then I would strongly 

Re: [PATCH] drm/i915/mtl: Wake GT before sending H2G message

2024-01-18 Thread Belgaumkar, Vinay



On 1/18/2024 3:50 PM, Matt Roper wrote:

On Thu, Jan 18, 2024 at 03:17:28PM -0800, Vinay Belgaumkar wrote:

Instead of waiting until the interrupt reaches GuC, we can grab a
forcewake while triggering the H2G interrupt. GEN11_GUC_HOST_INTERRUPT
is inside an "always on" domain with respect to RC6. However, there

A bit of a nitpick, but technically "always on" is a description of GT
register ranges that never get powered down.  GEN11_GUC_HOST_INTERRUPT
isn't inside the GT at all, but rather is an sgunit register and thus
isn't affected by forcewake.  This is just a special case where the
sgunit register forwards a message back to the GT's GuC, and the
workaround wants us to make sure the GT is awake before that message
gets there.

True, can modify the description to reflect this.



could be some delays when platform is entering/exiting some higher
level platform sleep states and a H2G is triggered. A forcewake
ensures those sleep states have been fully exited and further
processing occurs as expected.

Based on this description, is adding implicit forcewake to this register
really enough?  Implicit forcewake powers up before a read/write, but
also allows it to power back down as soon as the MMIO operation is
complete.  If the GuC is a bit slow to notice the interrupt, then we
could wind up with a sequence like

  - Driver grabs forcewake and GT powers up
  - Driver writes 0x1901f0 to trigger GuC interrupt
  - Driver releases forcewake and GT powers down
  - GuC notices interrupt (or maybe fails to notice it because the GT
powered down before it had a chance to process it?)

which I'm guessing isn't actually going to satisfy this workaround.  Do
we actually need to keep the GT awake not just through the register
operation, but also through the GuC's processing of the interrupt?  If
so, then we probably want to do an explicit forcewake get/put to ensure
the hardware stays powered up long enough.


The issue being addressed here is not the GT entering C6, but the higher
platform sleep states. Taking a forcewake on the GT while writing to the
H2G register should bring us out of those sleep states. After the
forcewake is released (which happens after the write to 0x1901f0 goes
through), we still have the C6 hysteresis and the hysteresis counters for
the higher platform sleep states, which should give GuC enough time to
process the interrupt before we enter C6 and subsequently those higher
sleep states.
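(For comparison, the explicit forcewake get/put Matt describes would look
roughly like the sketch below; it is illustrative only, reusing existing
i915 helpers, and is not the eventual workaround implementation:)

/* Sketch: hold GT forcewake across the H2G doorbell write so the GT and
 * the higher-level platform sleep states stay awake while GuC picks up
 * the interrupt. */
static void guc_notify_with_forcewake(struct intel_guc *guc)
{
	struct intel_uncore *uncore = guc_to_gt(guc)->uncore;

	intel_uncore_forcewake_get(uncore, FORCEWAKE_GT);
	intel_uncore_write(uncore, GEN11_GUC_HOST_INTERRUPT, 0);
	intel_uncore_forcewake_put(uncore, FORCEWAKE_GT);
}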


Thanks,

Vinay.




Matt


This will have an official WA soon so adding a FIXME in the comments.

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/intel_uncore.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index dfefad5a5fec..121458a31886 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1800,7 +1800,10 @@ static const struct intel_forcewake_range 
__mtl_fw_ranges[] = {
	GEN_FW_RANGE(0x24000, 0x2ffff, 0), /*
		0x24000 - 0x2407f: always on
		0x24080 - 0x2ffff: reserved */
-	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT)
+	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT),
+	GEN_FW_RANGE(0x40000, 0x1901ec, 0),
+	GEN_FW_RANGE(0x1901f0, 0x1901f0, FORCEWAKE_GT)
+	/* FIXME: WA to wake GT while triggering H2G */
  };
  
  /*

--
2.38.1



Re: [PATCH i-g-t] tests/perf_pmu: Restore sysfs freq in exit handler

2024-01-08 Thread Belgaumkar, Vinay



On 1/5/2024 3:33 AM, Kamil Konieczny wrote:

Hi Vinay,
On 2024-01-04 at 17:10:00 -0800, Vinay Belgaumkar wrote:

looks good, there are some nits, first about subject:

[PATCH i-g-t] tests/perf_pmu: Restore sysfs freq in exit handler

s!tests/perf_pmu:!tests/intel/perf_pmu:!
Also you can drop "sysfs", so it will look:

[PATCH i-g-t] tests/intel/perf_pmu: Restore freq in exit handler


Seeing random issues where this test starts with invalid values.

Btw if the issue is that it starts with invalid values, maybe the culprit
is in some previous test, not this one? What about setting the freq values
to defaults first? This can be done in a separate patch.

I looked into log from test here:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1438/bat-dg2-11/igt_runner10.txt
and here:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1438/bat-dg2-11/igt@perf_pmu@freque...@gt0.html

One more thing, why is boost < max? Is it allowed? What about
just restoring it to max (or some other value?) before testing and
skipping only when min == max? But even then it seems like
restoring defaults should be the first step before the freq checks.
The only freq-related test in that log is gem_ctx_freq, which never
modifies the boost freq. AFAICS, this is the only test that sets the boost
freq below RP0, so I am thinking a previous iteration of this test
left it in that state - not impossible, I guess. Boost freq can be < max,
it is allowed. We could "restore" to defaults, but if we have exit
handlers in place, that should never be needed.


For more nits see below.


Ensure that we restore the frequencies in case test exits early
due to some system issues.

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/9432
Signed-off-by: Vinay Belgaumkar 
---
  tests/intel/perf_pmu.c | 53 +-
  1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/tests/intel/perf_pmu.c b/tests/intel/perf_pmu.c
index c6e6a8b77..ceacc1d3d 100644
--- a/tests/intel/perf_pmu.c
+++ b/tests/intel/perf_pmu.c
@@ -2454,12 +2454,59 @@ static void pmu_read(int i915)
for_each_if((e)->class == I915_ENGINE_CLASS_RENDER) \
igt_dynamic_f("%s", e->name)
  
+int fd = -1;

+uint32_t *stash_min, *stash_max, *stash_boost;
+
+static void save_sysfs_freq(int i915)
+{
+   int gt, num_gts, sysfs, tmp;
+
+   num_gts = igt_sysfs_get_num_gt(i915);
+
+   stash_min = (uint32_t *)malloc(sizeof(uint32_t) * num_gts);
+   stash_max = (uint32_t *)malloc(sizeof(uint32_t) * num_gts);
+   stash_boost = (uint32_t *)malloc(sizeof(uint32_t) * num_gts);
+
+   /* Save boost, min and max across GTs */
+   i915_for_each_gt(i915, tmp, gt) {
+   sysfs = igt_sysfs_gt_open(i915, gt);
+   igt_require(sysfs >= 0);
+
+   stash_min[gt] = igt_sysfs_get_u32(sysfs, "rps_min_freq_mhz");
+   stash_max[gt] = igt_sysfs_get_u32(sysfs, "rps_max_freq_mhz");
+   stash_boost[gt] = igt_sysfs_get_u32(sysfs, 
"rps_boost_freq_mhz");
+   igt_debug("GT: %d, min: %d, max: %d, boost:%d\n",
+ gt, stash_min[gt], stash_max[gt], stash_boost[gt]);
+
+   close(sysfs);
+   }
+}
+
+static void restore_sysfs_freq(int sig)
+{
+   int sysfs, gt, tmp;
+
+   /* Restore frequencies */
+   i915_for_each_gt(fd, tmp, gt) {
+   sysfs = igt_sysfs_gt_open(fd, gt);
+   igt_require(sysfs >= 0);

^
Don't use require at exit handler, better use continue.
Not sure about this. If we cannot restore, doesn't it mean there is an 
issue writing to sysfs and we should fail?



+
+   igt_require(__igt_sysfs_set_u32(sysfs, "rps_max_freq_mhz", 
stash_max[gt]));

^
Same here.


+   igt_require(__igt_sysfs_set_u32(sysfs, "rps_min_freq_mhz", 
stash_min[gt]));

^
Same.


+   igt_require(__igt_sysfs_set_u32(sysfs, "rps_boost_freq_mhz", 
stash_boost[gt]));

^
Same.


+
+   close(sysfs);
+   }
+   free(stash_min);
+   free(stash_max);

Free also stash_boost.

ok.



+}
+
  igt_main
  {
const struct intel_execution_engine2 *e;
unsigned int num_engines = 0;
const intel_ctx_t *ctx = NULL;
-   int gt, tmp, fd = -1;
+   int gt, tmp;
int num_gt = 0;
  
  	/**

@@ -2482,6 +2529,7 @@ igt_main
  
  		i915_for_each_gt(fd, tmp, gt)

num_gt++;
+

Remove this empty line.


ok, thanks,

Vinay,



Regards,
Kamil


}
  
  	igt_describe("Verify i915 pmu dir exists and read all events");

@@ -2664,6 +2712,9 @@ igt_main
 * Test GPU frequency.
 */
igt_subtest_with_dynamic("frequency") {
+   save_sysfs_freq(fd);
+   igt_install_exit_handler(restore_sysfs_freq);
+
i915_for_each_gt(fd, tmp, gt) {
igt_dynamic_f("gt%u", gt)
test_frequency(fd, gt);
--
2.38.1



Re: [Intel-gfx] [PATCH v2 1/4] drm/i915: Enable Wa_16019325821

2023-12-13 Thread Belgaumkar, Vinay


On 10/27/2023 2:18 PM, john.c.harri...@intel.com wrote:

From: John Harrison

Some platforms require holding RCS context switches until CCS is idle
(the reverse w/a of Wa_14014475959). Some platforms require both
versions.

Signed-off-by: John Harrison
---
  drivers/gpu/drm/i915/gt/gen8_engine_cs.c  | 19 +++
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 ---
  drivers/gpu/drm/i915/gt/uc/intel_guc.c|  4 
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  3 ++-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  7 ++-
  5 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 86a04afff64b3..9cccd60a5c41d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -743,21 +743,23 @@ static u32 *gen12_emit_preempt_busywait(struct 
i915_request *rq, u32 *cs)
  }
  
  /* Wa_14014475959:dg2 */

-#define CCS_SEMAPHORE_PPHWSP_OFFSET0x540
-static u32 ccs_semaphore_offset(struct i915_request *rq)
+/* Wa_16019325821 */
+#define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET 0x540
+static u32 hold_switchout_semaphore_offset(struct i915_request *rq)
  {
return i915_ggtt_offset(rq->context->state) +
-   (LRC_PPHWSP_PN * PAGE_SIZE) + CCS_SEMAPHORE_PPHWSP_OFFSET;
+   (LRC_PPHWSP_PN * PAGE_SIZE) + 
HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET;
  }
  
  /* Wa_14014475959:dg2 */

-static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs)
+/* Wa_16019325821 */
+static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, u32 *cs)
  {
int i;
  
  	*cs++ = MI_ATOMIC_INLINE | MI_ATOMIC_GLOBAL_GTT | MI_ATOMIC_CS_STALL |

MI_ATOMIC_MOVE;
-   *cs++ = ccs_semaphore_offset(rq);
+   *cs++ = hold_switchout_semaphore_offset(rq);
*cs++ = 0;
*cs++ = 1;
  
@@ -773,7 +775,7 @@ static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs)

MI_SEMAPHORE_POLL |
MI_SEMAPHORE_SAD_EQ_SDD;
*cs++ = 0;
-   *cs++ = ccs_semaphore_offset(rq);
+   *cs++ = hold_switchout_semaphore_offset(rq);
*cs++ = 0;
  
  	return cs;

@@ -790,8 +792,9 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, 
u32 *cs)
cs = gen12_emit_preempt_busywait(rq, cs);
  
  	/* Wa_14014475959:dg2 */

-   if (intel_engine_uses_wa_hold_ccs_switchout(rq->engine))
-   cs = ccs_emit_wa_busywait(rq, cs);
+   /* Wa_16019325821 */
+   if (intel_engine_uses_wa_hold_switchout(rq->engine))
+   cs = hold_switchout_emit_wa_busywait(rq, cs);
  
  	rq->tail = intel_ring_offset(rq, cs);

assert_ring_tail_valid(rq->ring, rq->tail);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 8769760257fd9..f08739d020332 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -584,7 +584,7 @@ struct intel_engine_cs {
  #define I915_ENGINE_HAS_RCS_REG_STATE  BIT(9)
  #define I915_ENGINE_HAS_EU_PRIORITYBIT(10)
  #define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11)
-#define I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT BIT(12)
+#define I915_ENGINE_USES_WA_HOLD_SWITCHOUT BIT(12)
unsigned int flags;
  
  	/*

@@ -694,10 +694,11 @@ intel_engine_has_relative_mmio(const struct 
intel_engine_cs * const engine)
  }
  
  /* Wa_14014475959:dg2 */

+/* Wa_16019325821 */
  static inline bool
-intel_engine_uses_wa_hold_ccs_switchout(struct intel_engine_cs *engine)
+intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine)
  {
-   return engine->flags & I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT;
+   return engine->flags & I915_ENGINE_USES_WA_HOLD_SWITCHOUT;
  }
  
  #endif /* __INTEL_ENGINE_TYPES_H__ */

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 3f3df1166b860..0e6c160de3315 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -294,6 +294,10 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
IS_DG2(gt->i915))
flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
  
+	/* Wa_16019325821 */

+   if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)))
+   flags |= GUC_WA_RCS_CCS_SWITCHOUT;
+
/*
 * Wa_14012197797
 * Wa_22011391025
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 8ae1846431da7..48863188a130e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -96,8 +96,9 @@
  #define   GUC_WA_GAM_CREDITS  BIT(10)
  #define   GUC_WA_DUAL_QUEUE   BIT(11)
  #define   GUC_WA_RCS_RESET_BEFORE_RC6 BIT(13)
-#define   GUC_WA_CONTEXT_ISOLATION BIT(15)
  #define   GUC_WA_PRE_PARSER   BIT(14)
+#define   

Re: [Intel-gfx] [PATCH v2 3/4] drm/i915/guc: Enable Wa_14019159160

2023-12-13 Thread Belgaumkar, Vinay



On 10/27/2023 2:18 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

Use the new w/a KLV support to enable an MTL w/a. Note, this w/a is a
super-set of Wa_16019325821, so it requires turning that one on as well as
setting the new flag for Wa_14019159160 itself.

Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/gt/gen8_engine_cs.c  |  3 ++
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  1 +
  drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h |  7 
  drivers/gpu/drm/i915/gt/uc/intel_guc.c|  1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 34 ++-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  1 +
  6 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 9cccd60a5c41d..359b21fb02ab2 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -744,6 +744,7 @@ static u32 *gen12_emit_preempt_busywait(struct i915_request 
*rq, u32 *cs)
  
  /* Wa_14014475959:dg2 */

  /* Wa_16019325821 */
+/* Wa_14019159160 */
  #define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET0x540
  static u32 hold_switchout_semaphore_offset(struct i915_request *rq)
  {
@@ -753,6 +754,7 @@ static u32 hold_switchout_semaphore_offset(struct 
i915_request *rq)
  
  /* Wa_14014475959:dg2 */

  /* Wa_16019325821 */
+/* Wa_14019159160 */
  static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, u32 *cs)
  {
int i;
@@ -793,6 +795,7 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, 
u32 *cs)
  
  	/* Wa_14014475959:dg2 */

/* Wa_16019325821 */
+   /* Wa_14019159160 */
if (intel_engine_uses_wa_hold_switchout(rq->engine))
cs = hold_switchout_emit_wa_busywait(rq, cs);
  
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h

index f08739d020332..3b4993955a4b6 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -695,6 +695,7 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs 
* const engine)
  
  /* Wa_14014475959:dg2 */

  /* Wa_16019325821 */
+/* Wa_14019159160 */
  static inline bool
  intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine)
  {
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
index 58012edd4eb0e..bebf28e3c4794 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
@@ -101,4 +101,11 @@ enum {
GUC_CONTEXT_POLICIES_KLV_NUM_IDS = 5,
  };
  
+/*

+ * Workaround keys:
+ */
+enum {
+   GUC_WORKAROUND_KLV_SERIALIZED_RA_MODE   = 
0x9001,
+};
+
  #endif /* _ABI_GUC_KLVS_ABI_H */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 0e6c160de3315..6252f32d67011 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -295,6 +295,7 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
  
  	/* Wa_16019325821 */

+   /* Wa_14019159160 */
if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)))
flags |= GUC_WA_RCS_CCS_SWITCHOUT;
  
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c

index 251e7a7a05cb8..8f7298cbbc322 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -810,6 +810,25 @@ guc_capture_prep_lists(struct intel_guc *guc)
return PAGE_ALIGN(total_size);
  }
  
+/* Wa_14019159160 */

+static u32 guc_waklv_ra_mode(struct intel_guc *guc, u32 offset, u32 remain)
+{
+   u32 size;
+   u32 klv_entry[] = {
+   /* 16:16 key/length */
+   FIELD_PREP(GUC_KLV_0_KEY, 
GUC_WORKAROUND_KLV_SERIALIZED_RA_MODE) |
+   FIELD_PREP(GUC_KLV_0_LEN, 0),
+   /* 0 dwords data */
+   };
+
+   size = sizeof(klv_entry);
+   GEM_BUG_ON(remain < size);
+
+   iosys_map_memcpy_to(&guc->ads_map, offset, klv_entry, size);
+
+   return size;
+}
+
  static void guc_waklv_init(struct intel_guc *guc)
  {
struct intel_gt *gt = guc_to_gt(guc);
@@ -825,15 +844,12 @@ static void guc_waklv_init(struct intel_guc *guc)
offset = guc_ads_waklv_offset(guc);
remain = guc_ads_waklv_size(guc);
  
-	/*

-* Add workarounds here:
-*
-* if (want_wa_) {
-*  size = guc_waklv_(guc, offset, remain);
-*  offset += size;
-*  remain -= size;
-* }
-*/
+   /* Wa_14019159160 */
+   if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) {
+   size = guc_waklv_ra_mode(guc, offset, remain);
+   offset += size;
+   remain -= size;
+   }
  
  	size = 

Re: [Intel-gfx] [PATCH v2 2/4] drm/i915/guc: Add support for w/a KLVs

2023-12-13 Thread Belgaumkar, Vinay



On 10/27/2023 2:18 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

To prevent running out of bits, new w/a enable flags are being added
via a KLV system instead of a 32 bit flags word.

Signed-off-by: John Harrison 
---
  .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   |  1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  2 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 73 ++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c |  6 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  5 +-
  5 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
index dabeaf4f245f3..00d6402333f8e 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
@@ -36,6 +36,7 @@ enum intel_guc_load_status {
INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
INTEL_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
INTEL_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID   = 0x74,
+   INTEL_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR= 0x75,
INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
  
  	INTEL_GUC_LOAD_STATUS_READY= 0xF0,

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 2b6dfe62c8f2a..4113776ff3e19 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -198,6 +198,8 @@ struct intel_guc {
struct guc_mmio_reg *ads_regset;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
u32 ads_golden_ctxt_size;
+   /** @ads_waklv_size: size of workaround KLVs */
+   u32 ads_waklv_size;
/** @ads_capture_size: size of register lists in the ADS used for error 
capture */
u32 ads_capture_size;
/** @ads_engine_usage_size: size of engine usage in the ADS */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 63724e17829a7..251e7a7a05cb8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -46,6 +46,10 @@
   *  +---+
   *  | padding   |
   *  +---+ <== 4K aligned
+ *  | w/a KLVs  |
+ *  +---+
+ *  | padding   |
+ *  +---+ <== 4K aligned
   *  | capture lists |
   *  +---+
   *  | padding   |
@@ -88,6 +92,11 @@ static u32 guc_ads_golden_ctxt_size(struct intel_guc *guc)
return PAGE_ALIGN(guc->ads_golden_ctxt_size);
  }
  
+static u32 guc_ads_waklv_size(struct intel_guc *guc)

+{
+   return PAGE_ALIGN(guc->ads_waklv_size);
+}
+
  static u32 guc_ads_capture_size(struct intel_guc *guc)
  {
return PAGE_ALIGN(guc->ads_capture_size);
@@ -113,7 +122,7 @@ static u32 guc_ads_golden_ctxt_offset(struct intel_guc *guc)
return PAGE_ALIGN(offset);
  }
  
-static u32 guc_ads_capture_offset(struct intel_guc *guc)

+static u32 guc_ads_waklv_offset(struct intel_guc *guc)
  {
u32 offset;
  
@@ -123,6 +132,16 @@ static u32 guc_ads_capture_offset(struct intel_guc *guc)

return PAGE_ALIGN(offset);
  }
  
+static u32 guc_ads_capture_offset(struct intel_guc *guc)

+{
+   u32 offset;
+
+   offset = guc_ads_waklv_offset(guc) +
+guc_ads_waklv_size(guc);
+
+   return PAGE_ALIGN(offset);
+}
+
  static u32 guc_ads_private_data_offset(struct intel_guc *guc)
  {
u32 offset;
@@ -791,6 +810,49 @@ guc_capture_prep_lists(struct intel_guc *guc)
return PAGE_ALIGN(total_size);
  }
  
+static void guc_waklv_init(struct intel_guc *guc)

+{
+   struct intel_gt *gt = guc_to_gt(guc);
+   u32 offset, addr_ggtt, remain, size;
+
+   if (!intel_uc_uses_guc_submission(&gt->uc))
+   return;
+
+   if (GUC_FIRMWARE_VER(guc) < MAKE_GUC_VER(70, 10, 0))
+   return;
+
+   GEM_BUG_ON(iosys_map_is_null(&guc->ads_map));
+   offset = guc_ads_waklv_offset(guc);
+   remain = guc_ads_waklv_size(guc);
+
+   /*
+* Add workarounds here:
+*
+* if (want_wa_) {
+*  size = guc_waklv_(guc, offset, remain);
+*  offset += size;
+*  remain -= size;
+* }
+*/
+
+   size = guc_ads_waklv_size(guc) - remain;
+   if (!size)
+   return;
+
+   offset = guc_ads_waklv_offset(guc);
+   addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
+
+   ads_blob_write(guc, ads.wa_klv_addr_lo, addr_ggtt);
+   ads_blob_write(guc, ads.wa_klv_addr_hi, 0);
+   ads_blob_write(guc, 

Re: [Intel-gfx] [PATCH] drm/i915: Read a shadowed mmio register for ggtt flush

2023-11-09 Thread Belgaumkar, Vinay



On 11/9/2023 12:35 PM, Ville Syrjälä wrote:

On Thu, Nov 09, 2023 at 12:01:26PM -0800, Belgaumkar, Vinay wrote:

On 11/9/2023 11:30 AM, Ville Syrjälä wrote:

On Thu, Nov 09, 2023 at 11:21:48AM -0800, Vinay Belgaumkar wrote:

We read RENDER_HEAD as a part of the flush. If GT is in
deeper sleep states, this could lead to read errors since we are
not using a forcewake. Safer to read a shadowed register instead.

IIRC shadowing is only thing for writes, not reads.

Sure, but reading from a shadowed register does return the cached value

Does it? I suppose that would make some sense, but I don't recall that
ever being stated anywhere. At least before the shadow registers
existed reads would just give you zeroes when not awake.


(even though we don't care about the value here). When the GT is in deeper 
sleep states, it is better to read a shadowed (cached) value instead of 
attempting an mmio register read without a forcewake anyway.

So you're saying reads from non-shadowed registers fails somehow
when not awake? How exactly do they fail? And when reading from
a shadowed register that failure never happens?


We could hit problems like the one being addressed here - 
https://patchwork.freedesktop.org/series/125356/.  Reading from a 
shadowed register will avoid any needless references (without a wake) to 
the MMIO space. Shouldn't hurt to make this change for all gens IMO.


Thanks,

Vinay.




Thanks,

Vinay.


Cc: John Harrison 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Vinay Belgaumkar 
---
   drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index ed32bf5b1546..ea814ea5f700 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -451,7 +451,7 @@ void intel_gt_flush_ggtt_writes(struct intel_gt *gt)
   
		spin_lock_irqsave(&uncore->lock, flags);

intel_uncore_posting_read_fw(uncore,
-RING_HEAD(RENDER_RING_BASE));
+RING_TAIL(RENDER_RING_BASE));
		spin_unlock_irqrestore(&uncore->lock, flags);
}
   }
--
2.38.1


Re: [Intel-gfx] [PATCH] drm/i915: Read a shadowed mmio register for ggtt flush

2023-11-09 Thread Belgaumkar, Vinay



On 11/9/2023 11:30 AM, Ville Syrjälä wrote:

On Thu, Nov 09, 2023 at 11:21:48AM -0800, Vinay Belgaumkar wrote:

We read RENDER_HEAD as a part of the flush. If GT is in
deeper sleep states, this could lead to read errors since we are
not using a forcewake. Safer to read a shadowed register instead.

IIRC shadowing is only thing for writes, not reads.


Sure, but reading from a shadowed register does return the cached value 
(even though we don't care about the value here). When the GT is in deeper 
sleep states, it is better to read a shadowed (cached) value instead of 
attempting an mmio register read without a forcewake anyway.


Thanks,

Vinay.




Cc: John Harrison 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index ed32bf5b1546..ea814ea5f700 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -451,7 +451,7 @@ void intel_gt_flush_ggtt_writes(struct intel_gt *gt)
  
		spin_lock_irqsave(&uncore->lock, flags);

intel_uncore_posting_read_fw(uncore,
-RING_HEAD(RENDER_RING_BASE));
+RING_TAIL(RENDER_RING_BASE));
		spin_unlock_irqrestore(&uncore->lock, flags);
}
  }
--
2.38.1


Re: [Intel-gfx] [PATCH] drm/i915/mtl: Don't set PIPE_CONTROL_FLUSH_L3

2023-10-16 Thread Belgaumkar, Vinay



On 10/16/2023 4:24 PM, John Harrison wrote:

On 10/16/2023 15:55, Vinay Belgaumkar wrote:

This bit does not cause an explicit L3 flush. We already use
At all? Or only on newer hardware? And as a genuine spec change or as 
a bug / workaround?


If the hardware has re-purposed the bit then it is probably worth at 
least adding a comment to the bit definition to say that it is only 
valid up to IP version 12.70.
At this point, this is a bug on MTL since this bit is not related to L3 
flushes as per the spec. Regarding older platforms, we are still checking 
the reason why this was added (i.e. whether it fixed something and will 
regress if removed). If not, we can extend the change to the others as 
well in a separate patch. On older platforms, this bit seems to cause an 
implicit flush at best.



PIPE_CONTROL_DC_FLUSH_ENABLE for that purpose.

Cc: Nirmoy Das 
Cc: Mikka Kuoppala 
Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c

index ba4c2422b340..abbc02f3e66e 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -247,6 +247,7 @@ static int mtl_dummy_pipe_control(struct 
i915_request *rq)

  int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
  {
  struct intel_engine_cs *engine = rq->engine;
+    struct intel_gt *gt = rq->engine->gt;
    /*
   * On Aux CCS platforms the invalidation of the Aux
@@ -278,7 +279,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, 
u32 mode)

   * deals with Protected Memory which is not needed for
   * AUX CCS invalidation and lead to unwanted side effects.
   */
-    if (mode & EMIT_FLUSH)
+    if ((mode & EMIT_FLUSH) &&
+    !(IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))))
Why stop at 12.71? Is the meaning only changed for 12.70 and the 
old/correct version will be restored in later hardware?


Was trying to keep this limited to MTL for now until the above 
statements are verified.


Thanks,

Vinay.



John.



  bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
    bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
@@ -812,12 +814,14 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct 
i915_request *rq, u32 *cs)

  u32 flags = (PIPE_CONTROL_CS_STALL |
   PIPE_CONTROL_TLB_INVALIDATE |
   PIPE_CONTROL_TILE_CACHE_FLUSH |
- PIPE_CONTROL_FLUSH_L3 |
   PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
   PIPE_CONTROL_DEPTH_CACHE_FLUSH |
   PIPE_CONTROL_DC_FLUSH_ENABLE |
   PIPE_CONTROL_FLUSH_ENABLE);
  +    if (!(IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))))
+    flags |= PIPE_CONTROL_FLUSH_L3;
+
  /* Wa_14016712196 */
  if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)) || 
IS_DG2(i915))

  /* dummy PIPE_CONTROL + depth flush */




Re: [Intel-gfx] [PATCH 3/4] drm/i915/guc: Add support for w/a KLVs

2023-10-06 Thread Belgaumkar, Vinay



On 9/15/2023 2:55 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

To prevent running out of bits, new w/a enable flags are being added
via a KLV system instead of a 32 bit flags word.

Signed-off-by: John Harrison 
---
  .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   |  1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  3 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 64 ++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c |  6 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  5 +-
  5 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
index dabeaf4f245f3..00d6402333f8e 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
@@ -36,6 +36,7 @@ enum intel_guc_load_status {
INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
INTEL_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
INTEL_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID   = 0x74,
+   INTEL_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR= 0x75,
INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
  
  	INTEL_GUC_LOAD_STATUS_READY= 0xF0,

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 6c392bad29c19..3b1fc5f96306b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -186,6 +186,8 @@ struct intel_guc {
struct guc_mmio_reg *ads_regset;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
u32 ads_golden_ctxt_size;
+   /** @ads_waklv_size: size of workaround KLVs */
+   u32 ads_waklv_size;
/** @ads_capture_size: size of register lists in the ADS used for error 
capture */
u32 ads_capture_size;
/** @ads_engine_usage_size: size of engine usage in the ADS */
@@ -295,6 +297,7 @@ struct intel_guc {
  #define MAKE_GUC_VER(maj, min, pat)   (((maj) << 16) | ((min) << 8) | (pat))
  #define MAKE_GUC_VER_STRUCT(ver)  MAKE_GUC_VER((ver).major, (ver).minor, 
(ver).patch)
  #define GUC_SUBMIT_VER(guc)   
MAKE_GUC_VER_STRUCT((guc)->submission_version)
+#define GUC_FIRMWARE_VER(guc)  
MAKE_GUC_VER_STRUCT((guc)->fw.file_selected.ver)
  
  static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)

  {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 63724e17829a7..792910af3a481 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -46,6 +46,10 @@
   *  +---+
   *  | padding   |
   *  +---+ <== 4K aligned
+ *  | w/a KLVs  |
+ *  +---+
+ *  | padding   |
+ *  +---+ <== 4K aligned
   *  | capture lists |
   *  +---+
   *  | padding   |
@@ -88,6 +92,11 @@ static u32 guc_ads_golden_ctxt_size(struct intel_guc *guc)
return PAGE_ALIGN(guc->ads_golden_ctxt_size);
  }
  
+static u32 guc_ads_waklv_size(struct intel_guc *guc)

+{
+   return PAGE_ALIGN(guc->ads_waklv_size);
+}
+
  static u32 guc_ads_capture_size(struct intel_guc *guc)
  {
return PAGE_ALIGN(guc->ads_capture_size);
@@ -113,7 +122,7 @@ static u32 guc_ads_golden_ctxt_offset(struct intel_guc *guc)
return PAGE_ALIGN(offset);
  }
  
-static u32 guc_ads_capture_offset(struct intel_guc *guc)

+static u32 guc_ads_waklv_offset(struct intel_guc *guc)
  {
u32 offset;
  
@@ -123,6 +132,16 @@ static u32 guc_ads_capture_offset(struct intel_guc *guc)

return PAGE_ALIGN(offset);
  }
  
+static u32 guc_ads_capture_offset(struct intel_guc *guc)

+{
+   u32 offset;
+
+   offset = guc_ads_waklv_offset(guc) +
+guc_ads_waklv_size(guc);
+
+   return PAGE_ALIGN(offset);
+}
+
  static u32 guc_ads_private_data_offset(struct intel_guc *guc)
  {
u32 offset;
@@ -791,6 +810,40 @@ guc_capture_prep_lists(struct intel_guc *guc)
return PAGE_ALIGN(total_size);
  }
  
+static void guc_waklv_init(struct intel_guc *guc)

+{
+   struct intel_gt *gt = guc_to_gt(guc);
+   u32 offset, addr_ggtt, remain, size;
+
+   if (!intel_uc_uses_guc_submission(&gt->uc))
+   return;
+
+   if (GUC_FIRMWARE_VER(guc) < MAKE_GUC_VER(70, 10, 0))
+   return;

should this be <= ?

+
+   GEM_BUG_ON(iosys_map_is_null(&guc->ads_map));
+   offset = guc_ads_waklv_offset(guc);
+   remain = guc_ads_waklv_size(guc);
+
+   /* Add workarounds here */
+

extra blank line?

+   size = guc_ads_waklv_size(guc) - remain;

Re: [Intel-gfx] [PATCH 2/4] drm/i915: Enable Wa_16019325821

2023-10-06 Thread Belgaumkar, Vinay



On 9/15/2023 2:55 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

Some platforms require holding RCS context switches until CCS is idle
(the reverse w/a of Wa_14014475959). Some platforms require both
versions.

Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/gt/gen8_engine_cs.c  | 19 +++
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 ---
  drivers/gpu/drm/i915/gt/uc/intel_guc.c|  4 
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  3 ++-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  8 +++-
  5 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 0143445dba830..8b494825c55f2 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -733,21 +733,23 @@ static u32 *gen12_emit_preempt_busywait(struct 
i915_request *rq, u32 *cs)
  }
  
  /* Wa_14014475959:dg2 */

-#define CCS_SEMAPHORE_PPHWSP_OFFSET0x540
-static u32 ccs_semaphore_offset(struct i915_request *rq)
+/* Wa_16019325821 */
+#define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET 0x540
+static u32 hold_switchout_semaphore_offset(struct i915_request *rq)
  {
return i915_ggtt_offset(rq->context->state) +
-   (LRC_PPHWSP_PN * PAGE_SIZE) + CCS_SEMAPHORE_PPHWSP_OFFSET;
+   (LRC_PPHWSP_PN * PAGE_SIZE) + 
HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET;
  }
  
  /* Wa_14014475959:dg2 */

-static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs)
+/* Wa_16019325821 */
+static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, u32 *cs)
  {
int i;
  
  	*cs++ = MI_ATOMIC_INLINE | MI_ATOMIC_GLOBAL_GTT | MI_ATOMIC_CS_STALL |

MI_ATOMIC_MOVE;
-   *cs++ = ccs_semaphore_offset(rq);
+   *cs++ = hold_switchout_semaphore_offset(rq);
*cs++ = 0;
*cs++ = 1;
  
@@ -763,7 +765,7 @@ static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs)

MI_SEMAPHORE_POLL |
MI_SEMAPHORE_SAD_EQ_SDD;
*cs++ = 0;
-   *cs++ = ccs_semaphore_offset(rq);
+   *cs++ = hold_switchout_semaphore_offset(rq);
*cs++ = 0;
  
  	return cs;

@@ -780,8 +782,9 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, 
u32 *cs)
cs = gen12_emit_preempt_busywait(rq, cs);
  
  	/* Wa_14014475959:dg2 */

-   if (intel_engine_uses_wa_hold_ccs_switchout(rq->engine))
-   cs = ccs_emit_wa_busywait(rq, cs);
+   /* Wa_16019325821 */
+   if (intel_engine_uses_wa_hold_switchout(rq->engine))
+   cs = hold_switchout_emit_wa_busywait(rq, cs);
  
  	rq->tail = intel_ring_offset(rq, cs);

assert_ring_tail_valid(rq->ring, rq->tail);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index a7e6775980043..68fe1cef9cd94 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -573,7 +573,7 @@ struct intel_engine_cs {
  #define I915_ENGINE_HAS_RCS_REG_STATE  BIT(9)
  #define I915_ENGINE_HAS_EU_PRIORITYBIT(10)
  #define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11)
-#define I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT BIT(12)
+#define I915_ENGINE_USES_WA_HOLD_SWITCHOUT BIT(12)
unsigned int flags;
  
  	/*

@@ -683,10 +683,11 @@ intel_engine_has_relative_mmio(const struct 
intel_engine_cs * const engine)
  }
  
  /* Wa_14014475959:dg2 */

+/* Wa_16019325821 */
  static inline bool
-intel_engine_uses_wa_hold_ccs_switchout(struct intel_engine_cs *engine)
+intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine)
  {
-   return engine->flags & I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT;
+   return engine->flags & I915_ENGINE_USES_WA_HOLD_SWITCHOUT;
  }
  
  #endif /* __INTEL_ENGINE_TYPES_H__ */

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 27df41c53b890..4001679ba0793 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -294,6 +294,10 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
IS_DG2(gt->i915))
flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
  
+	/* Wa_16019325821 */

+   if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)))
+   flags |= GUC_WA_RCS_CCS_SWITCHOUT;
+
/*
 * Wa_14012197797
 * Wa_22011391025
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index b4d56eccfb1f0..f97af0168a66b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -95,8 +95,9 @@
  #define   GUC_WA_GAM_CREDITS  BIT(10)
  #define   GUC_WA_DUAL_QUEUE   BIT(11)
  #define   GUC_WA_RCS_RESET_BEFORE_RC6 BIT(13)
-#define   GUC_WA_CONTEXT_ISOLATION BIT(15)
  #define   GUC_WA_PRE_PARSER   BIT(14)
+#define   

Re: [Intel-gfx] [PATCH 2/2] drm/i915/guc: Enable WA 14018913170

2023-10-05 Thread Belgaumkar, Vinay



On 9/14/2023 3:28 PM, john.c.harri...@intel.com wrote:

From: Daniele Ceraolo Spurio 

The GuC handles the WA, the KMD just needs to set the flag to enable
it on the appropriate platforms.

Signed-off-by: John Harrison 
Signed-off-by: Daniele Ceraolo Spurio 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc.c  | 6 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc.h  | 1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 1 +
  3 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 27df41c53b890..3f3df1166b860 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -319,6 +319,12 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
if (!RCS_MASK(gt))
flags |= GUC_WA_RCS_REGS_IN_CCS_REGS_LIST;
  
+	/* Wa_14018913170 */

+   if (GUC_FIRMWARE_VER(guc) >= MAKE_GUC_VER(70, 7, 0)) {
+   if (IS_DG2(gt->i915) || IS_METEORLAKE(gt->i915) || 
IS_PONTEVECCHIO(gt->i915))
+   flags |= GUC_WA_ENABLE_TSC_CHECK_ON_RC6;
+   }
+
return flags;


LGTM,

Reviewed-by: Vinay Belgaumkar 


  }
  
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h

index 6c392bad29c19..818c8c146fd47 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -295,6 +295,7 @@ struct intel_guc {
  #define MAKE_GUC_VER(maj, min, pat)   (((maj) << 16) | ((min) << 8) | (pat))
  #define MAKE_GUC_VER_STRUCT(ver)  MAKE_GUC_VER((ver).major, (ver).minor, 
(ver).patch)
  #define GUC_SUBMIT_VER(guc)   
MAKE_GUC_VER_STRUCT((guc)->submission_version)
+#define GUC_FIRMWARE_VER(guc)  
MAKE_GUC_VER_STRUCT((guc)->fw.file_selected.ver)
  
  static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)

  {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index b4d56eccfb1f0..123ad75d2eb28 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -100,6 +100,7 @@
  #define   GUC_WA_HOLD_CCS_SWITCHOUT   BIT(17)
  #define   GUC_WA_POLLCS   BIT(18)
  #define   GUC_WA_RCS_REGS_IN_CCS_REGS_LISTBIT(21)
+#define   GUC_WA_ENABLE_TSC_CHECK_ON_RC6   BIT(22)
  
  #define GUC_CTL_FEATURE			2

  #define   GUC_CTL_ENABLE_SLPC BIT(2)


Re: [Intel-gfx] [PATCH] drm/i915/gem: Allow users to disable waitboost

2023-09-27 Thread Belgaumkar, Vinay



On 9/21/2023 3:41 AM, Tvrtko Ursulin wrote:


On 20/09/2023 22:56, Vinay Belgaumkar wrote:

Provide a bit to disable waitboost while waiting on a gem object.
Waitboost results in increased power consumption by requesting RP0
while waiting for the request to complete. Add a bit in the gem_wait()
IOCTL where this can be disabled.

This is related to the libva API change here -
Link: 
https://github.com/XinfengZhang/libva/commit/3d90d18c67609a73121bb71b20ee4776b54b61a7
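For illustration, a caller such as the media driver might use the new bit
roughly as follows (a sketch assuming the I915_GEM_WAITBOOST_DISABLE value
from the uapi hunk below and an already-open DRM fd and BO handle; the final
flag name is still under discussion later in this thread):

	struct drm_i915_gem_wait wait = {
		.bo_handle = handle,
		.flags = I915_GEM_WAITBOOST_DISABLE,	/* opt out of RP0 boost */
		.timeout_ns = -1,			/* wait indefinitely */
	};

	drmIoctl(fd, DRM_IOCTL_I915_GEM_WAIT, &wait);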


This link does not appear to lead to userspace code using this uapi?

We have asked Carl (cc'd) to post a patch for the same.




Cc: Rodrigo Vivi 
Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gem/i915_gem_wait.c | 9 ++---
  drivers/gpu/drm/i915/i915_request.c  | 3 ++-
  drivers/gpu/drm/i915/i915_request.h  | 1 +
  include/uapi/drm/i915_drm.h  | 1 +
  4 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c 
b/drivers/gpu/drm/i915/gem/i915_gem_wait.c

index d4b918fb11ce..955885ec859d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -72,7 +72,8 @@ i915_gem_object_wait_reservation(struct dma_resv 
*resv,

  struct dma_fence *fence;
  long ret = timeout ?: 1;
  -    i915_gem_object_boost(resv, flags);
+    if (!(flags & I915_WAITBOOST_DISABLE))
+    i915_gem_object_boost(resv, flags);
    dma_resv_iter_begin(&cursor, resv,
  dma_resv_usage_rw(flags & I915_WAIT_ALL));
@@ -236,7 +237,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void 
*data, struct drm_file *file)

  ktime_t start;
  long ret;
  -    if (args->flags != 0)
+    if (args->flags != 0 || args->flags != I915_GEM_WAITBOOST_DISABLE)
  return -EINVAL;
    obj = i915_gem_object_lookup(file, args->bo_handle);
@@ -248,7 +249,9 @@ i915_gem_wait_ioctl(struct drm_device *dev, void 
*data, struct drm_file *file)

  ret = i915_gem_object_wait(obj,
 I915_WAIT_INTERRUPTIBLE |
 I915_WAIT_PRIORITY |
-   I915_WAIT_ALL,
+   I915_WAIT_ALL |
+   (args->flags & I915_GEM_WAITBOOST_DISABLE ?
+    I915_WAITBOOST_DISABLE : 0),
 to_wait_timeout(args->timeout_ns));
    if (args->timeout_ns > 0) {
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c

index f59081066a19..2957409b4b2a 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -2044,7 +2044,8 @@ long i915_request_wait_timeout(struct 
i915_request *rq,

   * but at a cost of spending more power processing the workload
   * (bad for battery).
   */
-    if (flags & I915_WAIT_PRIORITY && !i915_request_started(rq))
+    if (!(flags & I915_WAITBOOST_DISABLE) && (flags & I915_WAIT_PRIORITY) &&
+    !i915_request_started(rq))
  intel_rps_boost(rq);
    wait.tsk = current;
diff --git a/drivers/gpu/drm/i915/i915_request.h 
b/drivers/gpu/drm/i915/i915_request.h

index 0ac55b2e4223..3cc00e8254dc 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -445,6 +445,7 @@ long i915_request_wait(struct i915_request *rq,
  #define I915_WAIT_INTERRUPTIBLE    BIT(0)
  #define I915_WAIT_PRIORITY    BIT(1) /* small priority bump for the request */
  #define I915_WAIT_ALL    BIT(2) /* used by i915_gem_object_wait() */
+#define I915_WAITBOOST_DISABLE    BIT(3) /* used by i915_gem_object_wait() */

    void i915_request_show(struct drm_printer *m,
 const struct i915_request *rq,
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7000e5910a1d..4adee70e39cf 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1928,6 +1928,7 @@ struct drm_i915_gem_wait {
  /** Handle of BO we shall wait on */
  __u32 bo_handle;
  __u32 flags;
+#define I915_GEM_WAITBOOST_DISABLE  (1u<<0)


Probably would be good to avoid mentioning waitboost in the uapi since 
so far it wasn't an explicit feature/contract. Something like 
I915_GEM_WAIT_BACKGROUND_PRIORITY? Low priority?

sure.


I also wonder if there could be a possible angle to help Rob (+cc) 
upstream the syncobj/fence deadline code if our media driver might 
make use of that somehow.


Like if either we could wire up the deadline into GEM_WAIT (in a 
backward compatible manner), or if media could use sync fd wait 
instead. Assuming they have an out fence already, which may not be true.


Makes sense. We could add a SET_DEADLINE flag or something similar and 
pass in the deadline when appropriate.
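
To make that concrete, here is a purely hypothetical sketch of such an extension. Neither the flag nor the extra field exists in i915_drm.h; the names are made up only to illustrate how a deadline could be wired into GEM_WAIT in a backward-compatible way (drm_ioctl zero-fills fields that older userspace does not pass):

/* Hypothetical sketch only -- not real i915 uAPI. */
struct drm_i915_gem_wait {
	/** Handle of BO we shall wait on */
	__u32 bo_handle;
	__u32 flags;
#define I915_GEM_WAIT_SET_DEADLINE	(1u << 1)	/* hypothetical flag */
	/** Number of nanoseconds to wait, returns time remaining. */
	__s64 timeout_ns;
	/** Hypothetical: absolute deadline hint, consulted only when
	 * I915_GEM_WAIT_SET_DEADLINE is set in flags. */
	__u64 deadline_ns;
};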


Thanks,

Vinay.



Regards,

Tvrtko


  /** Number of nanoseconds to wait, Returns time remaining. */
  __s64 timeout_ns;
  };
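
As an aside, userspace consumption of the flag proposed above would look roughly like the sketch below. Only bo_handle/flags/timeout_ns are existing uAPI; the flag name is the one proposed in this patch and may well end up renamed as discussed:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Sketch: wait on a BO without triggering waitboost (proposed flag). */
static int gem_wait_no_boost(int drm_fd, uint32_t handle, int64_t timeout_ns)
{
	struct drm_i915_gem_wait wait = {
		.bo_handle = handle,
		.flags = I915_GEM_WAITBOOST_DISABLE,	/* proposed, not in i915_drm.h yet */
		.timeout_ns = timeout_ns,
	};

	/* returns 0 on success, -1 with errno (e.g. ETIME on timeout) otherwise */
	return ioctl(drm_fd, DRM_IOCTL_I915_GEM_WAIT, &wait);
}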


Re: [Intel-gfx] [PATCH i-g-t] tests/i915_pm_freq_api: Set min/max to expected values

2023-09-20 Thread Belgaumkar, Vinay



On 9/20/2023 7:07 AM, Rodrigo Vivi wrote:

On Mon, Sep 18, 2023 at 12:02:59PM -0700, Vinay Belgaumkar wrote:

A prior (rps) test leaves the system in a bad state, causing failures
in the basic test.

Why?

What was the freq immediately before the failure that made the
machine to be busted and not accept the new freq request?

Maybe we should use this information to limit the freq requests
that we accept instead of workaround the test case. Otherwise
we are at risk of users selecting the bad freq that let " the
system in a bad state"...


The i915_pm_rps (waitboost) test sets soft max_freq to some value less than
RP0 and then fails. The restore on failure does not work properly because the
test is not multi-tile capable (it sets the root sysfs entry instead of
using the per-tile entry). Then, the current test (i915_pm_freq_api --r
basic-api) tries to set min_freq to RP0 as part of normal testing. This
fails because soft_max is < RP0.


There is some non-trivial effort needed to make i915_pm_rps multi-tile
capable, and this is a BAT failure, hence the quick fix to ensure the test
starts from a known-good state.
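
For reference, a minimal sketch of the per-GT restore that i915_pm_rps would eventually need, assuming the per-GT rps_min/max_freq_mhz attributes and the igt_sysfs helpers (this is not the actual i915_pm_rps code):

/* Sketch: restore stashed frequencies through the per-GT sysfs entries
 * instead of the legacy root gt_min/max_freq_mhz files.
 */
static void restore_freq_all_gts(int i915, uint32_t *stash_min, uint32_t *stash_max)
{
	int dirfd, gt;

	for_each_sysfs_gt_dirfd(i915, dirfd, gt) {
		igt_sysfs_set_u32(dirfd, "rps_max_freq_mhz", stash_max[gt]);
		igt_sysfs_set_u32(dirfd, "rps_min_freq_mhz", stash_min[gt]);
	}
}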


Thanks,

Vinay.




Set min/max to expected values before running it.
Test will restore values at the end.

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8670

Signed-off-by: Vinay Belgaumkar 
---
  tests/intel/i915_pm_freq_api.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tests/intel/i915_pm_freq_api.c b/tests/intel/i915_pm_freq_api.c
index 03bd0d05b..6018692a2 100644
--- a/tests/intel/i915_pm_freq_api.c
+++ b/tests/intel/i915_pm_freq_api.c
@@ -55,7 +55,11 @@ static void test_freq_basic_api(int dirfd, int gt)
rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
rp0 = get_freq(dirfd, RPS_RP0_FREQ_MHZ);
rpe = get_freq(dirfd, RPS_RP1_FREQ_MHZ);
-   igt_debug("GT: %d, RPn: %d, RPe: %d, RP0: %d", gt, rpn, rpe, rp0);
+   igt_debug("GT: %d, RPn: %d, RPe: %d, RP0: %d\n", gt, rpn, rpe, rp0);
+
+   /* Set min/max to RPn, RP0 for baseline behavior */
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rp0) > 0);
  
  	/*

 * Negative bound tests
@@ -170,7 +174,7 @@ igt_main
for_each_sysfs_gt_dirfd(i915, dirfd, gt) {
stash_min[gt] = get_freq(dirfd, RPS_MIN_FREQ_MHZ);
stash_max[gt] = get_freq(dirfd, RPS_MAX_FREQ_MHZ);
-   igt_debug("GT: %d, min: %d, max: %d", gt, 
stash_min[gt], stash_max[gt]);
+   igt_debug("GT: %d, min: %d, max: %d\n", gt, 
stash_min[gt], stash_max[gt]);
igt_pm_ignore_slpc_efficient_freq(i915, dirfd, true);
}
igt_install_exit_handler(restore_sysfs_freq);
--
2.38.1



Re: [Intel-gfx] [PATCH i-g-t] tests/i915_pm_freq_api: Ignore zero register value

2023-08-14 Thread Belgaumkar, Vinay



On 8/14/2023 12:24 AM, Riana Tauro wrote:

Hi Vinay

On 8/9/2023 6:20 AM, Vinay Belgaumkar wrote:

Register read for requested_freq can return 0 when system is
in runtime_pm. Make allowance for this case.

Link: https://gitlab.freedesktop.org/drm/intel/issues/8736
Link: https://gitlab.freedesktop.org/drm/intel/issues/8989

Signed-off-by: Vinay Belgaumkar 
---
  tests/i915/i915_pm_freq_api.c | 18 ++
  1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tests/i915/i915_pm_freq_api.c 
b/tests/i915/i915_pm_freq_api.c

index cf21cc936..9c71411ee 100644
--- a/tests/i915/i915_pm_freq_api.c
+++ b/tests/i915/i915_pm_freq_api.c
@@ -88,6 +88,7 @@ static void test_freq_basic_api(int dirfd, int gt)
  static void test_reset(int i915, int dirfd, int gt, int count)
  {
  uint32_t rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+    uint32_t req_freq;
  int fd;
    for (int i = 0; i < count; i++) {
@@ -95,14 +96,18 @@ static void test_reset(int i915, int dirfd, int 
gt, int count)

  igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
  igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
  usleep(ACT_FREQ_LATENCY_US);
-    igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
+    req_freq = get_freq(dirfd, RPS_CUR_FREQ_MHZ);
+    if (req_freq)
+    igt_assert_eq(req_freq, rpn);


Is there anything else that can cause req_freq to be zero?

To differentiate can we assert only when runtime_status is active 
(igt_get_runtime_pm_status() == IGT_RUNTIME_PM_STATUS_ACTIVE) ?


Makes sense, re-sending.
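
For reference, the re-send could guard the check roughly like this (a sketch reusing this test's get_freq() helper and the IGT runtime-PM status helper mentioned above):

/* Sketch: only trust the requested-freq register while the device is
 * runtime-active; when runtime suspended the read can legitimately be 0.
 */
static void assert_req_freq_if_active(int dirfd, uint32_t expected)
{
	uint32_t req_freq = get_freq(dirfd, RPS_CUR_FREQ_MHZ);

	if (igt_get_runtime_pm_status() == IGT_RUNTIME_PM_STATUS_ACTIVE)
		igt_assert_eq(req_freq, expected);
}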

Thanks,

Vinay.




Thanks
Riana Tauro

    /* Manually trigger a GT reset */
  fd = igt_debugfs_gt_open(i915, gt, "reset", O_WRONLY);
  igt_require(fd >= 0);
  igt_ignore_warn(write(fd, "1\n", 2));
  -    igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
+    req_freq = get_freq(dirfd, RPS_CUR_FREQ_MHZ);
+    if (req_freq)
+    igt_assert_eq(req_freq, rpn);
  }
  close(fd);
  }
@@ -110,17 +115,22 @@ static void test_reset(int i915, int dirfd, int 
gt, int count)

  static void test_suspend(int i915, int dirfd, int gt)
  {
  uint32_t rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+    uint32_t req_freq;
    igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
  igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
  usleep(ACT_FREQ_LATENCY_US);
-    igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
+    req_freq = get_freq(dirfd, RPS_CUR_FREQ_MHZ);
+    if (req_freq)
+    igt_assert_eq(req_freq, rpn);
    /* Manually trigger a suspend */
  igt_system_suspend_autoresume(SUSPEND_STATE_S3,
    SUSPEND_TEST_NONE);
  -    igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
+    req_freq = get_freq(dirfd, RPS_CUR_FREQ_MHZ);
+    if (req_freq)
+    igt_assert_eq(req_freq, rpn);
  }
    int i915 = -1;


Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/guc/slpc: Restore efficient freq earlier (rev3)

2023-08-11 Thread Belgaumkar, Vinay



On 8/10/2023 5:22 PM, Rodrigo Vivi wrote:

On Wed, Aug 02, 2023 at 12:41:09AM +, Belgaumkar, Vinay wrote:
 
 
 
 
From: Patchwork 

Sent: Thursday, July 27, 2023 6:59 PM
To: Belgaumkar, Vinay 
Cc: intel-gfx@lists.freedesktop.org
Subject: ✗ Fi.CI.IGT: failure for drm/i915/guc/slpc: Restore efficient
freq earlier (rev3)
 
 
 
Patch Details
 
Series:  drm/i915/guc/slpc: Restore efficient freq earlier (rev3)

URL: https://patchwork.freedesktop.org/series/121150/
State:   failure
Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_121150v3/index.html
 
  CI Bug Log - changes from CI_DRM_13432_full -> Patchwork_121150v3_full
 
Summary
 
FAILURE
 
Serious unknown changes coming with Patchwork_121150v3_full absolutely

need to be
verified manually.
 
If you think the reported changes have nothing to do with the changes

introduced in Patchwork_121150v3_full, please notify your bug team to
allow them
to document this new failure mode, which will reduce false positives in
CI.
 
Participating hosts (10 -> 10)
 
No changes in participating hosts
 
Possible new issues
 
Here are the unknown changes that may have been introduced in

Patchwork_121150v3_full:
 
   IGT changes
 
 Possible regressions
 
  • igt@sysfs_timeslice_duration@invalid:
 
 ◦ shard-mtlp: NOTRUN -> TIMEOUT
 
Does not seem related to this patch.

But i915_selftests@live@workarounds seems to fail on every platform after I 
cherry
picked this patch to drm-intel-fixes:

http://gfx-ci.igk.intel.com/tree/drm-intel-fixes/combined-alt.html?

<5> [314.508910] i915 :00:02.0: [drm] Resetting chip for live_workarounds
<6> [314.511971] i915 :00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin 
version 70.8.0
<7> [314.523625] i915 :00:02.0: [drm:intel_guc_fw_upload [i915]] GT0: GUC: 
init took 8ms, freq = 2250MHz, before = 2250MHz, status = 0x8002F034, count = 0, ret 
= 0
<7> [314.526596] i915 :00:02.0: [drm:guc_enable_communication [i915]] GT0: 
GUC: communication enabled
<6> [314.531291] i915 :00:02.0: [drm] GT0: GUC: submission enabled
<6> [314.531324] i915 :00:02.0: [drm] GT0: GUC: SLPC enabled
<6> [314.576597] i915: Running 
intel_workarounds_live_selftests/live_engine_reset_workarounds
<7> [314.576715] MCR Steering: L3BANK steering: group=0x0, instance=0x0
<7> [314.576736] MCR Steering: DSS steering: group=0x0, instance=0x0
<7> [314.576751] MCR Steering: INSTANCE 0 steering: group=0x0, instance=0x0
<7> [314.576818] i915 :00:02.0: [drm:wa_init_finish [i915]] Initialized 5 
GT_REF workarounds on global
<7> [314.578192] i915 :00:02.0: [drm:wa_init_finish [i915]] Initialized 5 
REF workarounds on rcs0
<7> [314.579454] i915 :00:02.0: [drm:wa_init_finish [i915]] Initialized 6 
CTX_REF workarounds on rcs0
<7> [314.580487] i915 :00:02.0: [drm:wa_init_finish [i915]] Initialized 1 
REF workarounds on bcs0
<7> [314.581449] i915 :00:02.0: [drm:wa_init_finish [i915]] Initialized 2 
CTX_REF workarounds on bcs0
<7> [314.582206] i915 :00:02.0: [drm:wa_init_finish [i9

Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Restore efficient freq earlier

2023-07-21 Thread Belgaumkar, Vinay



On 7/21/2023 3:08 PM, Belgaumkar, Vinay wrote:


On 7/21/2023 2:23 PM, Rodrigo Vivi wrote:

On Fri, Jul 21, 2023 at 01:44:34PM -0700, Belgaumkar, Vinay wrote:

On 7/21/2023 1:41 PM, Rodrigo Vivi wrote:

On Fri, Jul 21, 2023 at 11:03:49AM -0700, Vinay Belgaumkar wrote:

This should be done before the soft min/max frequencies are restored.
When we disable the "Ignore efficient frequency" flag, GuC does not
actually bring the requested freq down to RPn.

Specifically, this scenario-

- ignore efficient freq set to true
- reduce min to RPn (from efficient)
- suspend
- resume (includes GuC load, restore soft min/max, restore efficient freq)
- validate min freq has been restored to RPn

This will fail if we didn't first restore (disable, in this case) the efficient
freq flag before setting the soft min frequency.

that's strange. so guc is returning the rpe when we request the min freq
during the soft config?

we could alternatively change the soft config to actually get the min
and not be tricked by this.

But also the patch below doesn't hurt.

Reviewed-by: Rodrigo Vivi 
(Although I'm still curious and want to understand exactly why
the soft min gets messed up when we don't tell guc to ignore the
efficient freq beforehand. Please help me to understand.)
The soft min does not get messed up, but GuC keeps requesting RPe even after
disabling efficient freq (unless we manually set min freq to RPn AFTER
disabling efficient).

so it looks to me that the right solution would be to ensure that every time
we disable the efficient freq we make sure to also set the min freq to RPn,
no?!


Hmm, may not be applicable every time. What if someone disables 
efficient frequency while running a workload or with frequency fixed 
to 800, for example?


I'll take that back, it should not matter. GuC will not change its
request just because we switched min lower. I will resend the patch with
the min setting as well.
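
A rough sketch of what that could look like at the caller level (an assumption about the re-send, not the posted patch), reusing the existing SLPC helpers from intel_guc_slpc.h inside a caller that already holds the slpc pointer:

/* Sketch: whenever efficient freq is disabled, immediately re-apply the
 * cached soft min so the GuC request is allowed to drop back to it.
 */
ret = intel_guc_slpc_set_ignore_eff_freq(slpc, true);
if (!ret)
	ret = intel_guc_slpc_set_min_freq(slpc, slpc->min_freq_softlimit);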


Thanks,

Vinay.



Thanks,

Vinay.




Thanks,

Vinay.




Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8736
Fixes: 55f9720dbf23 ("drm/i915/guc/slpc: Provide sysfs for 
efficient freq")

Signed-off-by: Vinay Belgaumkar 
---
   drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 6 +++---
   1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c

index ee9f83af7cf6..f16dff7c3185 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -743,6 +743,9 @@ int intel_guc_slpc_enable(struct 
intel_guc_slpc *slpc)

   intel_guc_pm_intrmsk_enable(slpc_to_gt(slpc));
+    /* Set cached value of ignore efficient freq */
+    intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
+
   slpc_get_rp_values(slpc);
   /* Handle the case where min=max=RPmax */
@@ -765,9 +768,6 @@ int intel_guc_slpc_enable(struct 
intel_guc_slpc *slpc)

   /* Set cached media freq ratio mode */
   intel_guc_slpc_set_media_ratio_mode(slpc, 
slpc->media_ratio_mode);

-    /* Set cached value of ignore efficient freq */
-    intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
-
   return 0;
   }
--
2.38.1



Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Restore efficient freq earlier

2023-07-21 Thread Belgaumkar, Vinay



On 7/21/2023 2:23 PM, Rodrigo Vivi wrote:

On Fri, Jul 21, 2023 at 01:44:34PM -0700, Belgaumkar, Vinay wrote:

On 7/21/2023 1:41 PM, Rodrigo Vivi wrote:

On Fri, Jul 21, 2023 at 11:03:49AM -0700, Vinay Belgaumkar wrote:

This should be done before the soft min/max frequencies are restored.
When we disable the "Ignore efficient frequency" flag, GuC does not
actually bring the requested freq down to RPn.

Specifically, this scenario-

- ignore efficient freq set to true
- reduce min to RPn (from efficient)
- suspend
- resume (includes GuC load, restore soft min/max, restore efficient freq)
- validate min freq has been restored to RPn

This will fail if we didn't first restore (disable, in this case) the efficient
freq flag before setting the soft min frequency.

that's strange. so guc is returning the rpe when we request the min freq
during the soft config?

we could alternatively change the soft config to actually get the min
and not be tricked by this.

But also the patch below doesn't hurt.

Reviewed-by: Rodrigo Vivi 
(Although I'm still curious and want to understand exactly why
the soft min gets messed up when we don't tell guc to ignore the
efficient freq beforehand. Please help me to understand.)

The soft min does not get messed up, but GuC keeps requesting RPe even after
disabling efficient freq. (unless we manually set min freq to RPn AFTER
disabling efficient).

so it looks to me that the right solution would be to ensure that every time
we disable the efficient freq we make sure to also set the min freq to RPn,
no?!


Hmm, may not be applicable every time. What if someone disables 
efficient frequency while running a workload or with frequency fixed to 
800, for example?


Thanks,

Vinay.




Thanks,

Vinay.




Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8736
Fixes: 55f9720dbf23 ("drm/i915/guc/slpc: Provide sysfs for efficient freq")
Signed-off-by: Vinay Belgaumkar 
---
   drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 6 +++---
   1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index ee9f83af7cf6..f16dff7c3185 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -743,6 +743,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
intel_guc_pm_intrmsk_enable(slpc_to_gt(slpc));
+   /* Set cached value of ignore efficient freq */
+   intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
+
slpc_get_rp_values(slpc);
/* Handle the case where min=max=RPmax */
@@ -765,9 +768,6 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
/* Set cached media freq ratio mode */
intel_guc_slpc_set_media_ratio_mode(slpc, slpc->media_ratio_mode);
-   /* Set cached value of ignore efficient freq */
-   intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
-
return 0;
   }
--
2.38.1



Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Restore efficient freq earlier

2023-07-21 Thread Belgaumkar, Vinay



On 7/21/2023 1:41 PM, Rodrigo Vivi wrote:

On Fri, Jul 21, 2023 at 11:03:49AM -0700, Vinay Belgaumkar wrote:

This should be done before the soft min/max frequencies are restored.
When we disable the "Ignore efficient frequency" flag, GuC does not
actually bring the requested freq down to RPn.

Specifically, this scenario-

- ignore efficient freq set to true
- reduce min to RPn (from efficient)
- suspend
- resume (includes GuC load, restore soft min/max, restore efficient freq)
- validate min freq has been restored to RPn

This will fail if we didn't first restore (disable, in this case) the efficient
freq flag before setting the soft min frequency.

that's strange. so guc is returning the rpe when we request the min freq
during the soft config?

we could alternatively change the soft config to actually get the min
and not be tricked by this.

But also the patch below doesn't hurt.

Reviewed-by: Rodrigo Vivi 
(Although I'm still curious and want to understand exactly why
the soft min gets messed up when we don't tell guc to ignore the
efficient freq beforehand. Please help me to understand.)


The soft min does not get messed up, but GuC keeps requesting RPe even 
after disabling efficient freq. (unless we manually set min freq to RPn 
AFTER disabling efficient).


Thanks,

Vinay.





Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8736
Fixes: 55f9720dbf23 ("drm/i915/guc/slpc: Provide sysfs for efficient freq")
Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index ee9f83af7cf6..f16dff7c3185 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -743,6 +743,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
  
  	intel_guc_pm_intrmsk_enable(slpc_to_gt(slpc));
  
+	/* Set cached value of ignore efficient freq */

+   intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
+
slpc_get_rp_values(slpc);
  
  	/* Handle the case where min=max=RPmax */

@@ -765,9 +768,6 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
/* Set cached media freq ratio mode */
intel_guc_slpc_set_media_ratio_mode(slpc, slpc->media_ratio_mode);
  
-	/* Set cached value of ignore efficient freq */

-   intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
-
return 0;
  }
  
--

2.38.1



Re: [Intel-gfx] [igt-dev] [PATCH v2 i-g-t] i915_pm_freq_api: Add some debug to tests

2023-07-18 Thread Belgaumkar, Vinay



On 7/17/2023 9:26 PM, Dixit, Ashutosh wrote:

On Mon, 17 Jul 2023 21:19:13 -0700, Belgaumkar, Vinay wrote:


On 7/17/2023 6:50 PM, Dixit, Ashutosh wrote:

On Mon, 17 Jul 2023 11:42:13 -0700, Vinay Belgaumkar wrote:

Some subtests seem to be failing in CI, use igt_assert_(lt/eq) which
print the values being compared and some additional debug as well.

v2: Print GT as well (Ashutosh)

Signed-off-by: Vinay Belgaumkar 
---
   tests/i915/i915_pm_freq_api.c | 18 --
   1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/tests/i915/i915_pm_freq_api.c b/tests/i915/i915_pm_freq_api.c
index 522abee35..a7bbd4896 100644
--- a/tests/i915/i915_pm_freq_api.c
+++ b/tests/i915/i915_pm_freq_api.c
@@ -55,6 +55,7 @@ static void test_freq_basic_api(int dirfd, int gt)
rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
rp0 = get_freq(dirfd, RPS_RP0_FREQ_MHZ);
rpe = get_freq(dirfd, RPS_RP1_FREQ_MHZ);
+   igt_debug("GT: %d, RPn: %d, RPe: %d, RP0: %d", gt, rpn, rpe, rp0);

/*
 * Negative bound tests
@@ -90,21 +91,18 @@ static void test_reset(int i915, int dirfd, int gt, int 
count)
int fd;

for (int i = 0; i < count; i++) {
-   igt_assert_f(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0,
-"Failed after %d good cycles\n", i);
-   igt_assert_f(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0,
-"Failed after %d good cycles\n", i);
+   igt_debug("Running cycle: %d", i);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);

I am R-b'ing this but stuff like this should be using igt_assert_lt()
according to the commit message?

This _lt stuff has to be fixed all over the file, not just this patch, if
it brings any value (again according to the commit message).

Let me know if you want to fix this now or in a later patch. I'll wait
before merging.

Yup, I will send out another version with the corrected commit message.

Hmm, I thought the code needs to be fixed not the commit message :)


Ok, I meant this specific patch will address just the area where we
check for the requested frequency. I will convert the remaining asserts in a
separate patch.
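
For reference, the remaining conversions in that separate patch would look roughly like the following (dirfd/rpn come from the surrounding test code), so a failed set_freq() prints its return value:

igt_assert_lt(0, set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn));
igt_assert_lt(0, set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn));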


Thanks,

Vinay.




Thanks,

Vinay.


Reviewed-by: Ashutosh Dixit 


usleep(ACT_FREQ_LATENCY_US);
-   igt_assert_f(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn,
-"Failed after %d good cycles\n", i);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);

/* Manually trigger a GT reset */
fd = igt_debugfs_gt_open(i915, gt, "reset", O_WRONLY);
igt_require(fd >= 0);
igt_ignore_warn(write(fd, "1\n", 2));

-   igt_assert_f(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn,
-"Failed after %d good cycles\n", i);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
}
close(fd);
   }
@@ -116,13 +114,13 @@ static void test_suspend(int i915, int dirfd, int gt)
igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
usleep(ACT_FREQ_LATENCY_US);
-   igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);

/* Manually trigger a suspend */
igt_system_suspend_autoresume(SUSPEND_STATE_S3,
  SUSPEND_TEST_NONE);

-   igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
   }

   int i915 = -1;
--
2.38.1



Re: [Intel-gfx] [igt-dev] [PATCH v2 i-g-t] i915_pm_freq_api: Add some debug to tests

2023-07-17 Thread Belgaumkar, Vinay



On 7/17/2023 6:50 PM, Dixit, Ashutosh wrote:

On Mon, 17 Jul 2023 11:42:13 -0700, Vinay Belgaumkar wrote:

Some subtests seem to be failing in CI, use igt_assert_(lt/eq) which
print the values being compared and some additional debug as well.

v2: Print GT as well (Ashutosh)

Signed-off-by: Vinay Belgaumkar 
---
  tests/i915/i915_pm_freq_api.c | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/tests/i915/i915_pm_freq_api.c b/tests/i915/i915_pm_freq_api.c
index 522abee35..a7bbd4896 100644
--- a/tests/i915/i915_pm_freq_api.c
+++ b/tests/i915/i915_pm_freq_api.c
@@ -55,6 +55,7 @@ static void test_freq_basic_api(int dirfd, int gt)
rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
rp0 = get_freq(dirfd, RPS_RP0_FREQ_MHZ);
rpe = get_freq(dirfd, RPS_RP1_FREQ_MHZ);
+   igt_debug("GT: %d, RPn: %d, RPe: %d, RP0: %d", gt, rpn, rpe, rp0);

/*
 * Negative bound tests
@@ -90,21 +91,18 @@ static void test_reset(int i915, int dirfd, int gt, int 
count)
int fd;

for (int i = 0; i < count; i++) {
-   igt_assert_f(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0,
-"Failed after %d good cycles\n", i);
-   igt_assert_f(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0,
-"Failed after %d good cycles\n", i);
+   igt_debug("Running cycle: %d", i);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);

I am R-b'ing this but stuff like this should be using igt_assert_lt()
according to the commit message?

This _lt stuff has to be fixed all over the file, not just this patch, if
it brings any value (again according to the commit message).

Let me know if you want to fix this now or in a later patch. I'll wait
before merging.


Yup, I will send out another version with the corrected commit message.

Thanks,

Vinay.



Reviewed-by: Ashutosh Dixit 


usleep(ACT_FREQ_LATENCY_US);
-   igt_assert_f(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn,
-"Failed after %d good cycles\n", i);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);

/* Manually trigger a GT reset */
fd = igt_debugfs_gt_open(i915, gt, "reset", O_WRONLY);
igt_require(fd >= 0);
igt_ignore_warn(write(fd, "1\n", 2));

-   igt_assert_f(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn,
-"Failed after %d good cycles\n", i);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
}
close(fd);
  }
@@ -116,13 +114,13 @@ static void test_suspend(int i915, int dirfd, int gt)
igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
usleep(ACT_FREQ_LATENCY_US);
-   igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);

/* Manually trigger a suspend */
igt_system_suspend_autoresume(SUSPEND_STATE_S3,
  SUSPEND_TEST_NONE);

-   igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
  }

  int i915 = -1;
--
2.38.1



Re: [Intel-gfx] [PATCH i-g-t] i915_pm_freq_api: Add some debug to tests

2023-07-17 Thread Belgaumkar, Vinay



On 7/8/2023 12:36 PM, Dixit, Ashutosh wrote:

On Fri, 07 Jul 2023 16:23:59 -0700, Vinay Belgaumkar wrote:

Some subtests seem to be failing in CI, use igt_assert_(lt/eq) which
print the values being compared and some additional debug as well.

Signed-off-by: Vinay Belgaumkar 
---
  tests/i915/i915_pm_freq_api.c | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/tests/i915/i915_pm_freq_api.c b/tests/i915/i915_pm_freq_api.c
index 522abee35..cdb2e70ca 100644
--- a/tests/i915/i915_pm_freq_api.c
+++ b/tests/i915/i915_pm_freq_api.c
@@ -55,6 +55,7 @@ static void test_freq_basic_api(int dirfd, int gt)
rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
rp0 = get_freq(dirfd, RPS_RP0_FREQ_MHZ);
rpe = get_freq(dirfd, RPS_RP1_FREQ_MHZ);
+   igt_debug("RPn: %d, RPe: %d, RP0: %d", rpn, rpe, rp0);

Print gt here too.

ok.



/*
 * Negative bound tests
@@ -90,21 +91,18 @@ static void test_reset(int i915, int dirfd, int gt, int 
count)
int fd;

for (int i = 0; i < count; i++) {
-   igt_assert_f(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0,
-"Failed after %d good cycles\n", i);
-   igt_assert_f(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0,
-"Failed after %d good cycles\n", i);
+   igt_debug("Running cycle: %d", i);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
usleep(ACT_FREQ_LATENCY_US);
-   igt_assert_f(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn,
-"Failed after %d good cycles\n", i);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);

/* Manually trigger a GT reset */
fd = igt_debugfs_gt_open(i915, gt, "reset", O_WRONLY);
igt_require(fd >= 0);
igt_ignore_warn(write(fd, "1\n", 2));

-   igt_assert_f(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn,
-"Failed after %d good cycles\n", i);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);

Probably ok but why the changes in this loop?


There are a couple of open bug reports about failures in this area.

Thanks,

Vinay.




}
close(fd);
  }
@@ -116,13 +114,13 @@ static void test_suspend(int i915, int dirfd, int gt)
igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
usleep(ACT_FREQ_LATENCY_US);
-   igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);

/* Manually trigger a suspend */
igt_system_suspend_autoresume(SUSPEND_STATE_S3,
  SUSPEND_TEST_NONE);

-   igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);
+   igt_assert_eq(get_freq(dirfd, RPS_CUR_FREQ_MHZ), rpn);
  }

  int i915 = -1;
--
2.38.1



Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] tests/i915_pm_rps: Exercise sysfs thresholds

2023-06-30 Thread Belgaumkar, Vinay



On 5/23/2023 3:51 AM, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Exercise a bunch of up and down rps thresholds to verify hardware
is happy with them all.

To limit the overall runtime relies on probability and number of runs
to approach complete coverage.

Signed-off-by: Tvrtko Ursulin 
Cc: Rodrigo Vivi 
---
  tests/i915/i915_pm_rps.c | 232 +++
  1 file changed, 232 insertions(+)

diff --git a/tests/i915/i915_pm_rps.c b/tests/i915/i915_pm_rps.c
index 050d68a16559..acff59207311 100644
--- a/tests/i915/i915_pm_rps.c
+++ b/tests/i915/i915_pm_rps.c
@@ -39,8 +39,10 @@
  #include "i915/gem.h"
  #include "i915/gem_create.h"
  #include "igt.h"
+#include "igt_aux.h"
  #include "igt_dummyload.h"
  #include "igt_perf.h"
+#include "igt_rand.h"
  #include "igt_sysfs.h"
  /**
   * TEST: i915 pm rps
@@ -914,6 +916,200 @@ static void pm_rps_exit_handler(int sig)
close(drm_fd);
  }
  
+static igt_spin_t *__spin_poll(int fd, uint64_t ahnd, const intel_ctx_t *ctx,

+  const struct intel_execution_engine2 *e)
+{
+   struct igt_spin_factory opts = {
+   .ahnd = ahnd,
+   .ctx = ctx,
+   .engine = e->flags,
+   };
+
+   if (gem_class_can_store_dword(fd, e->class))
+   opts.flags |= IGT_SPIN_POLL_RUN;
+
+   return __igt_spin_factory(fd, );
+}
+
+static unsigned long __spin_wait(int fd, igt_spin_t *spin)
+{
+   struct timespec start = { };
+
+   igt_nsec_elapsed();
+
+   if (igt_spin_has_poll(spin)) {
+   unsigned long timeout = 0;
+
+   while (!igt_spin_has_started(spin)) {
+   unsigned long t = igt_nsec_elapsed();
+
+   igt_assert(gem_bo_busy(fd, spin->handle));
+   if ((t - timeout) > 250e6) {
+   timeout = t;
+   igt_warn("Spinner not running after %.2fms\n",
+(double)t / 1e6);
+   igt_assert(t < 2e9);
+   }
+   }
+   } else {
+   igt_debug("__spin_wait - usleep mode\n");
+   usleep(500e3); /* Better than nothing! */
+   }
+
+   igt_assert(gem_bo_busy(fd, spin->handle));
+   return igt_nsec_elapsed();
+}
+
+static igt_spin_t *__spin_sync(int fd, uint64_t ahnd, const intel_ctx_t *ctx,
+  const struct intel_execution_engine2 *e)
+{
+   igt_spin_t *spin = __spin_poll(fd, ahnd, ctx, e);
+
+   __spin_wait(fd, spin);
+
+   return spin;
+}
All the above spin functions have been duplicated across 2-3 tests, time 
to create a lib for them?

+
+static struct i915_engine_class_instance
+find_dword_engine(int i915, const unsigned int gt)
+{
+   struct i915_engine_class_instance *engines, ci = { -1, -1 };
+   unsigned int i, count;
+
+   engines = gem_list_engines(i915, 1u << gt, ~0u, );
+   igt_assert(engines);
+
+   for (i = 0; i < count; i++) {
+   if (!gem_class_can_store_dword(i915, engines[i].engine_class))
+   continue;
+
+   ci = engines[i];
+   break;
+   }
+
+   free(engines);
+
+   return ci;
+}
+
+static igt_spin_t *spin_sync_gt(int i915, uint64_t ahnd, unsigned int gt,
+   const intel_ctx_t **ctx)
+{
+   struct i915_engine_class_instance ci = { -1, -1 };
+   struct intel_execution_engine2 e = { };
+
+   ci = find_dword_engine(i915, gt);
+
+   igt_require(ci.engine_class != (uint16_t)I915_ENGINE_CLASS_INVALID);
+
+   if (gem_has_contexts(i915)) {
+   e.class = ci.engine_class;
+   e.instance = ci.engine_instance;
+   e.flags = 0;
+   *ctx = intel_ctx_create_for_engine(i915, e.class, e.instance);
+   } else {
+   igt_require(gt == 0); /* Impossible anyway. */
+   e.class = gem_execbuf_flags_to_engine_class(I915_EXEC_DEFAULT);
+   e.instance = 0;
+   e.flags = I915_EXEC_DEFAULT;
+   *ctx = intel_ctx_0(i915);
+   }
+
+   igt_debug("Using engine %u:%u\n", e.class, e.instance);
+
+   return __spin_sync(i915, ahnd, *ctx, );
+}
+
+#define TEST_IDLE 0x1
+#define TEST_PARK 0x2
+static void test_thresholds(int i915, unsigned int gt, unsigned int flags)
+{
+   uint64_t ahnd = get_reloc_ahnd(i915, 0);
+   const unsigned int points = 10;
+   unsigned int def_up, def_down;
+   igt_spin_t *spin = NULL;
+   const intel_ctx_t *ctx;
+   unsigned int *ta, *tb;
+   unsigned int i;
+   int sysfs;
+
+   sysfs = igt_sysfs_gt_open(i915, gt);
+   igt_require(sysfs >= 0);
+
+   /* Feature test */
+   def_up = igt_sysfs_get_u32(sysfs, "rps_up_threshold_pct");
+   def_down = igt_sysfs_get_u32(sysfs, "rps_down_threshold_pct");
+   igt_require(def_up && def_down);
+
+   /* 

Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/guc: Dump perf_limit_reasons for debug (rev2)

2023-06-29 Thread Belgaumkar, Vinay


On 6/28/2023 10:41 AM, Patchwork wrote:

*Patch Details*
*Series:*   drm/i915/guc: Dump perf_limit_reasons for debug (rev2)
*URL:*  https://patchwork.freedesktop.org/series/119893/
*State:*    failure
*Details:*  https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_119893v2/index.html



  CI Bug Log - changes from CI_DRM_13328_full -> Patchwork_119893v2_full


Summary

*FAILURE*

Serious unknown changes coming with Patchwork_119893v2_full absolutely 
need to be

verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_119893v2_full, please notify your bug team to 
allow them
to document this new failure mode, which will reduce false positives 
in CI.



Participating hosts (9 -> 9)

No changes in participating hosts


Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_119893v2_full:



  IGT changes


Possible regressions

 * igt@i915_pm_rpm@cursor:
   o shard-dg2: PASS -> FAIL
 * igt@kms_content_protection@atomic-dpms@pipe-a-dp-2:
   o shard-dg2: NOTRUN -> TIMEOUT (+1 similar issue)
 * igt@kms_flip@flip-vs-suspend-interruptible@b-hdmi-a3:
   o shard-dg2: NOTRUN -> INCOMPLETE

None of these failures are related to the patch.

Thanks,

Vinay.


New tests

New tests have been introduced between CI_DRM_13328_full and 
Patchwork_119893v2_full:



  New IGT tests (1)

  * igt@kms_pipe_crc_basic@suspend-read-crc@pipe-d-hdmi-a-1:
  o Statuses : 1 pass(s)
  o Exec time: [0.0] s


Known issues

Here are the changes found in Patchwork_119893v2_full that come from 
known issues:



  IGT changes


Issues hit

 * igt@drm_fdinfo@most-busy-idle-check-all@rcs0:
   o shard-rkl: PASS -> FAIL (i915#7742)
 * igt@gem_create@create-ext-set-pat:
   o shard-glk: NOTRUN -> FAIL (i915#8621)
 * igt@gem_exec_balancer@bonded-pair:
   o shard-mtlp: NOTRUN -> SKIP (i915#4771)
 * igt@gem_exec_balancer@invalid-bonds:
   o shard-mtlp: NOTRUN -> SKIP (i915#4036)
 * igt@gem_exec_fair@basic-none@vecs0:
   o shard-rkl: PASS -> FAIL (i915#2842)
 * igt@gem_exec_reloc@basic-cpu-gtt:
   o shard-mtlp: NOTRUN -> SKIP (i915#3281) +1 similar issue
 * igt@gem_exec_schedule@deep@vecs0:
   o shard-mtlp: PASS -> FAIL (i915#8606)
 * igt@gem_exec_schedule@semaphore-power:
   o shard-mtlp: NOTRUN -> SKIP (i915#4812)
 * igt@gem_exec_whisper@basic-contexts-forked-all:
   o shard-mtlp: PASS

Re: [Intel-gfx] [PATCH] drm/i915/guc: Dump perf_limit_reasons for debug

2023-06-27 Thread Belgaumkar, Vinay



On 6/26/2023 11:43 PM, Dixit, Ashutosh wrote:

On Mon, 26 Jun 2023 21:02:14 -0700, Belgaumkar, Vinay wrote:


On 6/26/2023 8:17 PM, Dixit, Ashutosh wrote:

On Mon, 26 Jun 2023 19:12:18 -0700, Vinay Belgaumkar wrote:

GuC load takes longer sometimes due to GT frequency not ramping up.
Add perf_limit_reasons to the existing warn print to see if frequency
is being throttled.

Signed-off-by: Vinay Belgaumkar 
---
   drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++
   1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index 364d0d546ec8..73911536a8e7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc)
guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = 
%dMHz, status = 0x%08X, count = %d, ret = %d]\n",
 delta_ms, 
intel_rps_read_actual_frequency(>gt->rps),
 before_freq, status, count, ret);
+   guc_warn(guc, "perf limit reasons = 0x%08X\n",
+intel_uncore_read(uncore, 
intel_gt_perf_limit_reasons_reg(gt)));

Maybe just add at the end of the previous guc_warn?

It's already too long a line. If I try adding on the next line checkpatch
complains about splitting double quotes.

In these cases of long quoted lines we generally ignore checkpatch. Because
perf limit reasons is part of the "excessive init time" message it should
be on the same line within the square brackets. So should not be
splitting double quotes.

Another idea would be something like this:

guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = 
%dMHz, status = 0x%08X]\n",
 delta_ms, 
intel_rps_read_actual_frequency(>gt->rps),
 before_freq, status);
guc_warn(guc, "excessive init time: [count = %d, ret = %d, perf 
limit reasons = 0x%08X]\n",
 count, ret, intel_uncore_read(uncore, 
intel_gt_perf_limit_reasons_reg(gt)));


ok, I will split it into freq and non-freq based debug.

Thanks,

Vinay.



Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH] drm/i915/guc: Dump perf_limit_reasons for debug

2023-06-26 Thread Belgaumkar, Vinay



On 6/26/2023 8:17 PM, Dixit, Ashutosh wrote:

On Mon, 26 Jun 2023 19:12:18 -0700, Vinay Belgaumkar wrote:

GuC load takes longer sometimes due to GT frequency not ramping up.
Add perf_limit_reasons to the existing warn print to see if frequency
is being throttled.

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index 364d0d546ec8..73911536a8e7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc)
guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = 
%dMHz, status = 0x%08X, count = %d, ret = %d]\n",
 delta_ms, 
intel_rps_read_actual_frequency(>gt->rps),
 before_freq, status, count, ret);
+   guc_warn(guc, "perf limit reasons = 0x%08X\n",
+intel_uncore_read(uncore, 
intel_gt_perf_limit_reasons_reg(gt)));

Maybe just add at the end of the previous guc_warn?


It's already too long a line. If I try adding on the next line checkpatch
complains about splitting double quotes.


Thanks,

Vinay.




} else {
guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, status 
= 0x%08X, count = %d, ret = %d\n",
delta_ms, 
intel_rps_read_actual_frequency(>gt->rps),
--
2.38.1



Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Apply min softlimit correctly

2023-06-14 Thread Belgaumkar, Vinay



On 6/13/2023 7:25 PM, Dixit, Ashutosh wrote:

On Fri, 09 Jun 2023 15:02:52 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


We were skipping when min_softlimit was equal to RPn. We need to apply
it rergardless as efficient frequency will push the SLPC min to RPe.

regardless


This will break scenarios where user sets a min softlimit < RPe before
reset and then performs a GT reset.

Can you explain the reason for the patch clearly in terms of variables in
the code, what variable has what value and what is the bug. I am not
following from the above description.


Hi Ashutosh,

Scenario being fixed here is exactly the one in i915_pm_freq_api 
reset/suspend subtests (currently in review). Test sets min freq to RPn 
and then performs a reset. It then checks if cur_freq is RPn.


Here's the sequence that shows the problem-

RPLS:/home/gta# modprobe i915
RPLS:/home/gta# echo 1 > /sys/class/drm/card0/gt/gt0/slpc_ignore_eff_freq
RPLS:/home/gta# echo 300 > /sys/class/drm/card0/gt_min_freq_mhz (RPn)
RPLS:/home/gta# cat /sys/class/drm/card0/gt_cur_freq_mhz --> cur == RPn 
as expected

300
RPLS:/home/gta# echo 1 > /sys/kernel/debug/dri/0/gt0/reset --> reset
RPLS:/home/gta# cat /sys/class/drm/card0/gt_min_freq_mhz --> shows the 
internal cached variable correctly

300
RPLS:/home/gta# cat /sys/class/drm/card0/gt_cur_freq_mhz --> actual freq 
being requested by SLPC (it's not RPn!!)

700

We need to sync up driver min freq value and SLPC min after a 
reset/suspend. Currently, we skip if the user had manually set min to 
RPn (this was an optimization we had before we enabled efficient freq 
usage).


Thanks,

Vinay.



Thanks.
--
Ashutosh



Fixes: 95ccf312a1e4 ("drm/i915/guc/slpc: Allow SLPC to use efficient frequency")

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index 01b75529311c..ee9f83af7cf6 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -606,7 +606,7 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
if (unlikely(ret))
return ret;
slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
-   } else if (slpc->min_freq_softlimit != slpc->min_freq) {
+   } else {
return intel_guc_slpc_set_min_freq(slpc,
   slpc->min_freq_softlimit);
}
--
2.38.1



Re: [Intel-gfx] [PATCH v2 i-g-t] tests/i915_pm_freq_api: Add a suspend subtest

2023-06-13 Thread Belgaumkar, Vinay



On 6/13/2023 2:25 PM, Dixit, Ashutosh wrote:

On Mon, 12 Jun 2023 12:42:13 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Verify that SLPC API works as expected after a suspend. Added
another subtest that does multiple GT resets and checks freq api
works as expected after each one.

We now check requested frequency instead of soft min/max after a
reset or suspend. That ensures the soft limits got applied
correctly at init. Also, disable efficient freq before starting the
test which allows current freq to be consistent with SLPC min freq.

v2: Restore freq in exit handler (Ashutosh)

Signed-off-by: Vinay Belgaumkar 
---
  tests/i915/i915_pm_freq_api.c | 89 +++
  1 file changed, 69 insertions(+), 20 deletions(-)

diff --git a/tests/i915/i915_pm_freq_api.c b/tests/i915/i915_pm_freq_api.c
index 9005cd220..4e1d4edca 100644
--- a/tests/i915/i915_pm_freq_api.c
+++ b/tests/i915/i915_pm_freq_api.c
@@ -18,6 +18,12 @@
   *
   * SUBTEST: freq-reset
   * Description: Test basic freq API works after a reset
+ *
+ * SUBTEST: freq-reset-multiple
+ * Description: Test basic freq API works after multiple resets
+ *
+ * SUBTEST: freq-suspend
+ * Description: Test basic freq API works after a runtime suspend
   */

  IGT_TEST_DESCRIPTION("Test SLPC freq API");
@@ -79,31 +85,64 @@ static void test_freq_basic_api(int dirfd, int gt)

  }

-static void test_reset(int i915, int dirfd, int gt)
+static void test_reset(int i915, int dirfd, int gt, int count)
  {
uint32_t rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
int fd;

+   for (int i = 0; i < count; i++) {
+   igt_assert_f(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0,
+"Failed after %d good cycles\n", i);
+   igt_assert_f(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0,
+"Failed after %d good cycles\n", i);
+   usleep(ACT_FREQ_LATENCY_US);
+   igt_assert_f(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn,
+"Failed after %d good cycles\n", i);
+
+   /* Manually trigger a GT reset */
+   fd = igt_debugfs_gt_open(i915, gt, "reset", O_WRONLY);
+   igt_require(fd >= 0);
+   igt_ignore_warn(write(fd, "1\n", 2));

No need for 'usleep(ACT_FREQ_LATENCY_US)' here?
Don't think we need it. The delay is specifically for H2G calls. I 
haven't seen the need for a delay here in the limited testing I have done.



+
+   igt_assert_f(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn,
+"Failed after %d good cycles\n", i);
+   }
+   close(fd);
+}
+
+static void test_suspend(int i915, int dirfd, int gt)
+{
+   uint32_t rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+
igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
usleep(ACT_FREQ_LATENCY_US);
-   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+   igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);

-   /* Manually trigger a GT reset */
-   fd = igt_debugfs_gt_open(i915, gt, "reset", O_WRONLY);
-   igt_require(fd >= 0);
-   igt_ignore_warn(write(fd, "1\n", 2));
-   close(fd);
+   /* Manually trigger a suspend */
+   igt_system_suspend_autoresume(SUSPEND_STATE_S3,
+ SUSPEND_TEST_NONE);

No need for 'usleep(ACT_FREQ_LATENCY_US)' here?
I believe this is a blocking call and will only return after resume 
completes (when console comes back), so delay is not needed.

-   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
-   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
+   igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);
  }

-igt_main
+int i915 = -1;
+uint32_t *stash_min, *stash_max;

nit: could we maybe make these fixed size array's (2 or 4 entries) and drop
the malloc's for these, malloc's seem excessive in this case.
What if this is a multi-card device? Though, one thing missing here is 
the 'free' for the allocations. Will add that.



+
+static void restore_sysfs_freq(int sig)
  {
-   int i915 = -1;
-   uint32_t *stash_min, *stash_max;
+   int dirfd, gt;
+   /* Restore frequencies */
+   for_each_sysfs_gt_dirfd(i915, dirfd, gt) {
+   igt_pm_ignore_slpc_efficient_freq(i915, dirfd, false);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, stash_max[gt]) > 
0);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, stash_min[gt]) > 
0);

nit: I would remove the igt_assert's from here, it's basically a best
effort restore so we try to restore everything even if we fail.

If we fail, it means the api is not working, so we should flag an error.



+   }
+   close(i915);
+}

+igt_main
+{
igt_fixture {
int num_gts, dirfd, gt;

@@ -122,7 +161,9 @@ igt_main
for_each_sysfs_gt_dirfd(i915, dirfd, gt) {

Re: [Intel-gfx] [PATCH] drm/i915/pxp/mtl: intel_pxp_init_hw needs runtime-pm inside pm-complete

2023-06-13 Thread Belgaumkar, Vinay



On 6/1/2023 8:59 AM, Alan Previn wrote:

In the case of failed suspend flow or cases where the kernel does not go
into full suspend but goes from suspend_prepare back to resume_complete,
we get called for a pm_complete but without runtime_pm guaranteed.

Thus, ensure we take the runtime_pm when calling intel_pxp_init_hw
from within intel_pxp_resume_complete.


LGTM,

Reviewed-by: Vinay Belgaumkar 



Signed-off-by: Alan Previn 
---
  drivers/gpu/drm/i915/pxp/intel_pxp_pm.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
index 1a04067f61fc..1d184dcd63c7 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
@@ -36,6 +36,8 @@ void intel_pxp_suspend(struct intel_pxp *pxp)
  
  void intel_pxp_resume_complete(struct intel_pxp *pxp)

  {
+   intel_wakeref_t wakeref;
+
if (!intel_pxp_is_enabled(pxp))
return;
  
@@ -48,7 +50,8 @@ void intel_pxp_resume_complete(struct intel_pxp *pxp)

if (!HAS_ENGINE(pxp->ctrl_gt, GSC0) && !pxp->pxp_component)
return;
  
-	intel_pxp_init_hw(pxp);

+   with_intel_runtime_pm(>ctrl_gt->i915->runtime_pm, wakeref)
+   intel_pxp_init_hw(pxp);
  }
  
  void intel_pxp_runtime_suspend(struct intel_pxp *pxp)


base-commit: a66da4c33d8ede541aea9ba6d0d73b556a072d54


Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] tests/i915_pm_freq_api: Add a suspend subtest

2023-06-07 Thread Belgaumkar, Vinay



On 6/7/2023 4:11 PM, Belgaumkar, Vinay wrote:


On 6/7/2023 3:56 PM, Dixit, Ashutosh wrote:

On Wed, 07 Jun 2023 15:31:33 -0700, Belgaumkar, Vinay wrote:

On 6/7/2023 2:12 PM, Dixit, Ashutosh wrote:

On Tue, 06 Jun 2023 13:35:35 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Verify that SLPC API works as expected after a suspend.

Signed-off-by: Vinay Belgaumkar 
---
   tests/i915/i915_pm_freq_api.c | 30 ++
   1 file changed, 30 insertions(+)

diff --git a/tests/i915/i915_pm_freq_api.c 
b/tests/i915/i915_pm_freq_api.c

index 9005cd220..f35f1f8e0 100644
--- a/tests/i915/i915_pm_freq_api.c
+++ b/tests/i915/i915_pm_freq_api.c
@@ -18,6 +18,9 @@
    *
    * SUBTEST: freq-reset
    * Description: Test basic freq API works after a reset
+ *
+ * SUBTEST: freq-suspend
+ * Description: Test basic freq API works after a runtime suspend
    */

   IGT_TEST_DESCRIPTION("Test SLPC freq API");
@@ -99,6 +102,24 @@ static void test_reset(int i915, int dirfd, 
int gt)

igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
   }

+static void test_suspend(int i915, int dirfd, int gt)
+{
+    uint32_t rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+
+    igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+    igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
+    usleep(ACT_FREQ_LATENCY_US);
+    igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+    igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
+
+    /* Manually trigger a suspend */
+    igt_system_suspend_autoresume(SUSPEND_STATE_S3,
+  SUSPEND_TEST_NONE);
+
+    igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+    igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
I am wondering what the purpose/value of this test (and also "freq-reset")
is?  How can the "set" min/max set freq (which are just input settings)
change whether or not there is a suspend/resume or a reset? Especially when
we just return cached min/max values from i915?

It is mainly checking that we don't smother the softlimit during a reset or
suspend flow.

How can softlimit, which is an ordinary variable in memory, get clobbered by
suspend resume?


It shouldn't, but funnier things have happened. Anyways, I can add a 
check for cur_freq and ensure that is at min. That will prove we applied 
the soft limit after suspend.
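
Concretely, the extra check would be roughly the following, using this test's helpers (dirfd/rpn come from the surrounding subtest and RPS_CUR_FREQ_MHZ reads the requested frequency):

/* Manually trigger a suspend */
igt_system_suspend_autoresume(SUSPEND_STATE_S3, SUSPEND_TEST_NONE);

/* The requested freq proves the soft min limit was re-applied on resume */
igt_assert(get_freq(dirfd, RPS_CUR_FREQ_MHZ) == rpn);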


Thanks,

Vinay.




In addition, it also tests the read/write interface works as expected
after those events.
There's no write. Sorry, but I'm not convinced. There should be some more
meat to the test.

There are writes in the IGT fixture after the test completes.

Maybe we can write a test which will check /all/ sysfs values are the same
after a suspend resume cycle? Why do only these specific ones have to be
checked?


This test is specific to the freq api, hence just min/max entries.

Thanks,

Vinay.



Thanks.
--
Ashutosh



Thanks,

Vinay.


Thanks.
--
Ashutosh



+}
+
   igt_main
   {
int i915 = -1;
@@ -143,6 +164,15 @@ igt_main
    test_reset(i915, dirfd, gt);
}

+    igt_describe("Test basic freq API works after suspend");
+    igt_subtest_with_dynamic_f("freq-suspend") {
+    int dirfd, gt;
+
+    for_each_sysfs_gt_dirfd(i915, dirfd, gt)
+    igt_dynamic_f("gt%u", gt)
+    test_suspend(i915, dirfd, gt);
+    }
+
igt_fixture {
    int dirfd, gt;
    /* Restore frequencies */
--
2.38.1



Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] tests/i915_pm_freq_api: Add a suspend subtest

2023-06-07 Thread Belgaumkar, Vinay



On 6/7/2023 3:56 PM, Dixit, Ashutosh wrote:

On Wed, 07 Jun 2023 15:31:33 -0700, Belgaumkar, Vinay wrote:

On 6/7/2023 2:12 PM, Dixit, Ashutosh wrote:

On Tue, 06 Jun 2023 13:35:35 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Verify that SLPC API works as expected after a suspend.

Signed-off-by: Vinay Belgaumkar 
---
   tests/i915/i915_pm_freq_api.c | 30 ++
   1 file changed, 30 insertions(+)

diff --git a/tests/i915/i915_pm_freq_api.c b/tests/i915/i915_pm_freq_api.c
index 9005cd220..f35f1f8e0 100644
--- a/tests/i915/i915_pm_freq_api.c
+++ b/tests/i915/i915_pm_freq_api.c
@@ -18,6 +18,9 @@
*
* SUBTEST: freq-reset
* Description: Test basic freq API works after a reset
+ *
+ * SUBTEST: freq-suspend
+ * Description: Test basic freq API works after a runtime suspend
*/

   IGT_TEST_DESCRIPTION("Test SLPC freq API");
@@ -99,6 +102,24 @@ static void test_reset(int i915, int dirfd, int gt)
igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
   }

+static void test_suspend(int i915, int dirfd, int gt)
+{
+   uint32_t rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
+   usleep(ACT_FREQ_LATENCY_US);
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
+
+   /* Manually trigger a suspend */
+   igt_system_suspend_autoresume(SUSPEND_STATE_S3,
+ SUSPEND_TEST_NONE);
+
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);

I am wondering what the purpose/value of this test (and also "freq-reset")
is?  How can the "set" min/max set freq (which are just input settings)
change whether or not there is a suspend/resume or a reset? Especially when
we just return cached min/max values from i915?

It is mainly checking that we don't smother the softlimit during a reset or
suspend flow.

How can softlimit which is a ordinary variable in memory get clobbered by
suspend resume?


In addition, it also tests the read/write interface works as expected
after those events.

There's no write. Sorry, but I'm not convinced. There should be some more
meat to the test.

There are writes in the IGT fixture after the test completes.


Maybe we can write a test which will check /all/ sysfs values are the same
after a suspend resume cycle? Why do only these specific ones have to be
checked?


This test is specific to the freq api, hence just min/max entries.

Thanks,

Vinay.



Thanks.
--
Ashutosh



Thanks,

Vinay.


Thanks.
--
Ashutosh



+}
+
   igt_main
   {
int i915 = -1;
@@ -143,6 +164,15 @@ igt_main
test_reset(i915, dirfd, gt);
}

+   igt_describe("Test basic freq API works after suspend");
+   igt_subtest_with_dynamic_f("freq-suspend") {
+   int dirfd, gt;
+
+   for_each_sysfs_gt_dirfd(i915, dirfd, gt)
+   igt_dynamic_f("gt%u", gt)
+   test_suspend(i915, dirfd, gt);
+   }
+
igt_fixture {
int dirfd, gt;
/* Restore frequencies */
--
2.38.1



Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] tests/i915_pm_freq_api: Add a suspend subtest

2023-06-07 Thread Belgaumkar, Vinay



On 6/7/2023 2:12 PM, Dixit, Ashutosh wrote:

On Tue, 06 Jun 2023 13:35:35 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Verify that SLPC API works as expected after a suspend.

Signed-off-by: Vinay Belgaumkar 
---
  tests/i915/i915_pm_freq_api.c | 30 ++
  1 file changed, 30 insertions(+)

diff --git a/tests/i915/i915_pm_freq_api.c b/tests/i915/i915_pm_freq_api.c
index 9005cd220..f35f1f8e0 100644
--- a/tests/i915/i915_pm_freq_api.c
+++ b/tests/i915/i915_pm_freq_api.c
@@ -18,6 +18,9 @@
   *
   * SUBTEST: freq-reset
   * Description: Test basic freq API works after a reset
+ *
+ * SUBTEST: freq-suspend
+ * Description: Test basic freq API works after a runtime suspend
   */

  IGT_TEST_DESCRIPTION("Test SLPC freq API");
@@ -99,6 +102,24 @@ static void test_reset(int i915, int dirfd, int gt)
igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
  }

+static void test_suspend(int i915, int dirfd, int gt)
+{
+   uint32_t rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
+   usleep(ACT_FREQ_LATENCY_US);
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
+
+   /* Manually trigger a suspend */
+   igt_system_suspend_autoresume(SUSPEND_STATE_S3,
+ SUSPEND_TEST_NONE);
+
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);

I am wondering what the purpose/value of this test (and also "freq-reset")
is?  How can the "set" min/max set freq (which are just input settings)
change whether or not there is a suspend/resume or a reset? Especially when
we just return cached min/max values from i915?


It is mainly checking that we don't smother the softlimit during a reset 
or suspend flow. In addition, it also tests the read/write interface 
works as expected after those events.


Thanks,

Vinay.



Thanks.
--
Ashutosh



+}
+
  igt_main
  {
int i915 = -1;
@@ -143,6 +164,15 @@ igt_main
test_reset(i915, dirfd, gt);
}

+   igt_describe("Test basic freq API works after suspend");
+   igt_subtest_with_dynamic_f("freq-suspend") {
+   int dirfd, gt;
+
+   for_each_sysfs_gt_dirfd(i915, dirfd, gt)
+   igt_dynamic_f("gt%u", gt)
+   test_suspend(i915, dirfd, gt);
+   }
+
igt_fixture {
int dirfd, gt;
/* Restore frequencies */
--
2.38.1



Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] tests/i915/gem_ctx_persistence: Skip some subtests

2023-06-06 Thread Belgaumkar, Vinay



On 6/1/2023 12:55 PM, Andrzej Hajda wrote:



On 24.05.2023 21:19, Vinay Belgaumkar wrote:

Hang and heartbeat subtests are not supported with GuC submission
enabled.

Signed-off-by: Vinay Belgaumkar 
---
  tests/i915/gem_ctx_persistence.c | 32 +++-
  1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/tests/i915/gem_ctx_persistence.c 
b/tests/i915/gem_ctx_persistence.c

index 42cf96329..1e122535e 100644
--- a/tests/i915/gem_ctx_persistence.c
+++ b/tests/i915/gem_ctx_persistence.c
@@ -1366,19 +1366,25 @@ igt_main
    igt_subtest("hostile")
  test_nohangcheck_hostile(i915, _cfg);
-    igt_subtest("hang")
-    test_nohangcheck_hang(i915, _cfg);
-
-    igt_subtest("heartbeat-stop")
-    test_noheartbeat_many(i915, 1, 0);
-    igt_subtest("heartbeat-hang")
-    test_noheartbeat_many(i915, 1, IGT_SPIN_NO_PREEMPTION);
-    igt_subtest("heartbeat-many")
-    test_noheartbeat_many(i915, 16, 0);
-    igt_subtest("heartbeat-close")
-    test_noheartbeat_close(i915, 0);
-    igt_subtest("heartbeat-hostile")
-    test_noheartbeat_close(i915, IGT_SPIN_NO_PREEMPTION);
+
+    igt_subtest_group {
+    igt_fixture
+    igt_skip_on(gem_using_guc_submission(i915));


As Kamil said, this should be put into the test function.
Otherwise you will have misleading errors in other tests - the fixture
will always be called, regardless of which test is running.



+
+    igt_subtest("hang")
+    test_nohangcheck_hang(i915, _cfg);


What is 'missing' in GuC in case of this test? CI is happy :)


For now. I have seen this fail before, so it is better to skip. I have sent
out a patch with a skip for just this one, since all the others have been
taken care of.


https://patchwork.freedesktop.org/patch/541407/

Thanks,

Vinay.





+
+    igt_subtest("heartbeat-stop")
+    test_noheartbeat_many(i915, 1, 0);
+    igt_subtest("heartbeat-hang")
+    test_noheartbeat_many(i915, 1, IGT_SPIN_NO_PREEMPTION);
+    igt_subtest("heartbeat-many")
+    test_noheartbeat_many(i915, 16, 0);
+    igt_subtest("heartbeat-close")
+    test_noheartbeat_close(i915, 0);
+    igt_subtest("heartbeat-hostile")
+    test_noheartbeat_close(i915, IGT_SPIN_NO_PREEMPTION);


These tests are handled already by recently merged:
https://patchwork.freedesktop.org/patch/539647/?series=118423=3

Regards
Andrzej



+    }
    igt_subtest_group {
  igt_fixture




Re: [Intel-gfx] [PATCH i-g-t] tests/i915/gem_ctx_persistence: Skip some subtests

2023-06-01 Thread Belgaumkar, Vinay



On 5/25/2023 11:25 AM, Kamil Konieczny wrote:

Hi Vinay,

On 2023-05-24 at 12:19:06 -0700, Vinay Belgaumkar wrote:

Hang and heartbeat subtests are not supported with GuC submission
enabled.

Signed-off-by: Vinay Belgaumkar 
---
  tests/i915/gem_ctx_persistence.c | 32 +++-
  1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/tests/i915/gem_ctx_persistence.c b/tests/i915/gem_ctx_persistence.c
index 42cf96329..1e122535e 100644
--- a/tests/i915/gem_ctx_persistence.c
+++ b/tests/i915/gem_ctx_persistence.c
@@ -1366,19 +1366,25 @@ igt_main
  
  	igt_subtest("hostile")

test_nohangcheck_hostile(i915, _cfg);
-   igt_subtest("hang")
-   test_nohangcheck_hang(i915, _cfg);
-
-   igt_subtest("heartbeat-stop")
-   test_noheartbeat_many(i915, 1, 0);
-   igt_subtest("heartbeat-hang")
-   test_noheartbeat_many(i915, 1, IGT_SPIN_NO_PREEMPTION);
-   igt_subtest("heartbeat-many")
-   test_noheartbeat_many(i915, 16, 0);
-   igt_subtest("heartbeat-close")
-   test_noheartbeat_close(i915, 0);
-   igt_subtest("heartbeat-hostile")
-   test_noheartbeat_close(i915, IGT_SPIN_NO_PREEMPTION);
+
+   igt_subtest_group {
+   igt_fixture
+   igt_skip_on(gem_using_guc_submission(i915));

--- ^^^
You cannot put this in a fixture as there is no test defined in it.
Place skips at the beginning of the test functions that need them.
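A minimal sketch of that placement (hypothetical; the real test functions take
more parameters, and the context-config type is assumed here):

	static void test_nohangcheck_hang(int i915, const intel_ctx_cfg_t *cfg)
	{
		/* Hang recovery behaves differently with GuC submission, skip up front */
		igt_skip_on(gem_using_guc_submission(i915));

		/* ... unchanged test body ... */
	}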


Hi Kamil,

   That's why I created a subtest_group. Is that not sufficient?

Thanks,

Vinay.



Regards,
Kamil


+
+   igt_subtest("hang")
+   test_nohangcheck_hang(i915, _cfg);
+
+   igt_subtest("heartbeat-stop")
+   test_noheartbeat_many(i915, 1, 0);
+   igt_subtest("heartbeat-hang")
+   test_noheartbeat_many(i915, 1, IGT_SPIN_NO_PREEMPTION);
+   igt_subtest("heartbeat-many")
+   test_noheartbeat_many(i915, 16, 0);
+   igt_subtest("heartbeat-close")
+   test_noheartbeat_close(i915, 0);
+   igt_subtest("heartbeat-hostile")
+   test_noheartbeat_close(i915, IGT_SPIN_NO_PREEMPTION);
+   }
  
  	igt_subtest_group {

igt_fixture
--
2.38.1



Re: [Intel-gfx] [PATCH v2 2/2] drm/i915/guc: Dump error capture to dmesg on CTB error

2023-05-16 Thread Belgaumkar, Vinay



On 4/18/2023 11:17 AM, john.c.harri...@intel.com wrote:

From: John Harrison 

In the past, there have been sporadic CTB failures which proved hard
to reproduce manually. The most effective solution was to dump the GuC
log at the point of failure and let the CI system do the repro. It is
preferable not to dump the GuC log via dmesg for all issues as it is
not always necessary and is not helpful for end users. But rather than
trying to re-invent the code to do this each time it is wanted, commit
the code but for DEBUG_GUC builds only.

v2: Use IS_ENABLED for testing config options.


LGTM,

Reviewed-by: Vinay Belgaumkar 



Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 53 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 +++
  2 files changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 1803a633ed648..dc5cd712f1ff5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -13,6 +13,30 @@
  #include "intel_guc_ct.h"
  #include "intel_guc_print.h"
  
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)

+enum {
+   CT_DEAD_ALIVE = 0,
+   CT_DEAD_SETUP,
+   CT_DEAD_WRITE,
+   CT_DEAD_DEADLOCK,
+   CT_DEAD_H2G_HAS_ROOM,
+   CT_DEAD_READ,
+   CT_DEAD_PROCESS_FAILED,
+};
+
+static void ct_dead_ct_worker_func(struct work_struct *w);
+
+#define CT_DEAD(ct, reason)\
+   do { \
+   if (!(ct)->dead_ct_reported) { \
+   (ct)->dead_ct_reason |= 1 << CT_DEAD_##reason; \
+   queue_work(system_unbound_wq, &(ct)->dead_ct_worker); \
+   } \
+   } while (0)
+#else
+#define CT_DEAD(ct, reason)do { } while (0)
+#endif
+
  static inline struct intel_guc *ct_to_guc(struct intel_guc_ct *ct)
  {
return container_of(ct, struct intel_guc, ct);
@@ -93,6 +117,9 @@ void intel_guc_ct_init_early(struct intel_guc_ct *ct)
spin_lock_init(>requests.lock);
INIT_LIST_HEAD(>requests.pending);
INIT_LIST_HEAD(>requests.incoming);
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
+   INIT_WORK(>dead_ct_worker, ct_dead_ct_worker_func);
+#endif
INIT_WORK(>requests.worker, ct_incoming_request_worker_func);
tasklet_setup(>receive_tasklet, ct_receive_tasklet_func);
init_waitqueue_head(>wq);
@@ -319,11 +346,16 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
  
  	ct->enabled = true;

ct->stall_time = KTIME_MAX;
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
+   ct->dead_ct_reported = false;
+   ct->dead_ct_reason = CT_DEAD_ALIVE;
+#endif
  
  	return 0;
  
  err_out:

CT_PROBE_ERROR(ct, "Failed to enable CTB (%pe)\n", ERR_PTR(err));
+   CT_DEAD(ct, SETUP);
return err;
  }
  
@@ -434,6 +466,7 @@ static int ct_write(struct intel_guc_ct *ct,

  corrupted:
CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u status=%#x\n",
 desc->head, desc->tail, desc->status);
+   CT_DEAD(ct, WRITE);
ctb->broken = true;
return -EPIPE;
  }
@@ -504,6 +537,7 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
CT_ERROR(ct, "Head: %u\n (Dwords)", ct->ctbs.recv.desc->head);
CT_ERROR(ct, "Tail: %u\n (Dwords)", ct->ctbs.recv.desc->tail);
  
+		CT_DEAD(ct, DEADLOCK);

ct->ctbs.send.broken = true;
}
  
@@ -552,6 +586,7 @@ static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)

 head, ctb->size);
desc->status |= GUC_CTB_STATUS_OVERFLOW;
ctb->broken = true;
+   CT_DEAD(ct, H2G_HAS_ROOM);
return false;
}
  
@@ -908,6 +943,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)

CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u status=%#x\n",
 desc->head, desc->tail, desc->status);
ctb->broken = true;
+   CT_DEAD(ct, READ);
return -EPIPE;
  }
  
@@ -1057,6 +1093,7 @@ static bool ct_process_incoming_requests(struct intel_guc_ct *ct)

if (unlikely(err)) {
CT_ERROR(ct, "Failed to process CT message (%pe) %*ph\n",
 ERR_PTR(err), 4 * request->size, request->msg);
+   CT_DEAD(ct, PROCESS_FAILED);
ct_free_msg(request);
}
  
@@ -1233,3 +1270,19 @@ void intel_guc_ct_print_info(struct intel_guc_ct *ct,

drm_printf(p, "Tail: %u\n",
   ct->ctbs.recv.desc->tail);
  }
+
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
+static void ct_dead_ct_worker_func(struct work_struct *w)
+{
+   struct intel_guc_ct *ct = container_of(w, struct intel_guc_ct, 
dead_ct_worker);
+   struct intel_guc *guc = ct_to_guc(ct);
+
+   if (ct->dead_ct_reported)
+   return;
+
+   ct->dead_ct_reported = true;
+
+   guc_info(guc, "CTB is dead - 

Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Dump error capture to kernel log

2023-05-16 Thread Belgaumkar, Vinay



On 4/18/2023 11:17 AM, john.c.harri...@intel.com wrote:

From: John Harrison 

This is useful for getting debug information out in certain
situations, such as failing kernel selftests and CI runs that don't
log error captures. It is especially useful for things like retrieving
GuC logs as GuC operation can't be tracked by adding printk or ftrace
entries.

v2: Add CONFIG_DRM_I915_DEBUG_GEM wrapper (review feedback by Rodrigo).


Do the CI sparse warnings hold water? With that looked at,

LGTM,

Reviewed-by: Vinay Belgaumkar 



Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/i915_gpu_error.c | 132 ++
  drivers/gpu/drm/i915/i915_gpu_error.h |  10 ++
  2 files changed, 142 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index f020c0086fbcd..03d62c250c465 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -2219,3 +2219,135 @@ void i915_disable_error_state(struct drm_i915_private 
*i915, int err)
i915->gpu_error.first_error = ERR_PTR(err);
spin_unlock_irq(>gpu_error.lock);
  }
+
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+void intel_klog_error_capture(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask)
+{
+   static int g_count;
+   struct drm_i915_private *i915 = gt->i915;
+   struct i915_gpu_coredump *error;
+   intel_wakeref_t wakeref;
+   size_t buf_size = PAGE_SIZE * 128;
+   size_t pos_err;
+   char *buf, *ptr, *next;
+   int l_count = g_count++;
+   int line = 0;
+
+   /* Can't allocate memory during a reset */
+   if (test_bit(I915_RESET_BACKOFF, >reset.flags)) {
+   drm_err(>i915->drm, "[Capture/%d.%d] Inside GT reset, skipping 
error capture :(\n",
+   l_count, line++);
+   return;
+   }
+
+   error = READ_ONCE(i915->gpu_error.first_error);
+   if (error) {
+   drm_err(>drm, "[Capture/%d.%d] Clearing existing error capture 
first...\n",
+   l_count, line++);
+   i915_reset_error_state(i915);
+   }
+
+   with_intel_runtime_pm(>runtime_pm, wakeref)
+   error = i915_gpu_coredump(gt, engine_mask, CORE_DUMP_FLAG_NONE);
+
+   if (IS_ERR(error)) {
+   drm_err(>drm, "[Capture/%d.%d] Failed to capture error 
capture: %ld!\n",
+   l_count, line++, PTR_ERR(error));
+   return;
+   }
+
+   buf = kvmalloc(buf_size, GFP_KERNEL);
+   if (!buf) {
+   drm_err(>drm, "[Capture/%d.%d] Failed to allocate buffer for 
error capture!\n",
+   l_count, line++);
+   i915_gpu_coredump_put(error);
+   return;
+   }
+
+   drm_info(>drm, "[Capture/%d.%d] Dumping i915 error capture for 
%ps...\n",
+l_count, line++, __builtin_return_address(0));
+
+   /* Largest string length safe to print via dmesg */
+#  define MAX_CHUNK800
+
+   pos_err = 0;
+   while (1) {
+   ssize_t got = i915_gpu_coredump_copy_to_buffer(error, buf, 
pos_err, buf_size - 1);
+
+   if (got <= 0)
+   break;
+
+   buf[got] = 0;
+   pos_err += got;
+
+   ptr = buf;
+   while (got > 0) {
+   size_t count;
+   char tag[2];
+
+   next = strnchr(ptr, got, '\n');
+   if (next) {
+   count = next - ptr;
+   *next = 0;
+   tag[0] = '>';
+   tag[1] = '<';
+   } else {
+   count = got;
+   tag[0] = '}';
+   tag[1] = '{';
+   }
+
+   if (count > MAX_CHUNK) {
+   size_t pos;
+   char *ptr2 = ptr;
+
+   for (pos = MAX_CHUNK; pos < count; pos += 
MAX_CHUNK) {
+   char chr = ptr[pos];
+
+   ptr[pos] = 0;
+   drm_info(>drm, "[Capture/%d.%d] 
}%s{\n",
+l_count, line++, ptr2);
+   ptr[pos] = chr;
+   ptr2 = ptr + pos;
+
+   /*
+* If spewing large amounts of data via 
a serial console,
+* this can be a very slow process. So 
be friendly and try
+* not to cause 'softlockup on CPU' 
problems.
+*/
+   cond_resched();

Re: [Intel-gfx] [PATCH v2 0/2] Add support for dumping error captures via kernel logging

2023-05-16 Thread Belgaumkar, Vinay



On 4/18/2023 11:17 AM, john.c.harri...@intel.com wrote:

From: John Harrison 

Sometimes, the only effective way to debug an issue is to dump all the
interesting information at the point of failure. So add support for
doing that.

v2: Extra CONFIG wrapping (review feedback from Rodrigo)

Signed-off-by: John Harrison 


series LGTM,

Reviewed-by: Vinay Belgaumkar 




John Harrison (2):
   drm/i915: Dump error capture to kernel log
   drm/i915/guc: Dump error capture to dmesg on CTB error

  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  53 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |   6 +
  drivers/gpu/drm/i915/i915_gpu_error.c | 132 ++
  drivers/gpu/drm/i915/i915_gpu_error.h |  10 ++
  4 files changed, 201 insertions(+)



Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Disable rps_boost debugfs

2023-05-15 Thread Belgaumkar, Vinay



On 5/12/2023 5:39 PM, Dixit, Ashutosh wrote:

On Fri, 12 May 2023 16:56:03 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


rps_boost debugfs shows host turbo related info. This is not valid
when SLPC is enabled.

A couple of thoughts about this. It appears people only know about
rps_boost_info and don't know about guc_slpc_info? So:

a. Instead of hiding the rps_boost_info file do we need to print there
saying "SLPC is enabled, go look at guc_slpc_info"?
rps_boost_info has an eval() function which disables the interface when 
RPS is OFF. This is indeed the case here, so shouldn't we just follow 
that instead of trying to link the two?


b. Or, even just call guc_slpc_info_show from rps_boost_show (so the two
files will show the same SLPC information)?


slpc_info has a lot of other info like the SLPC state, not sure that 
matches up with the rps_boost_info name.
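For reference, a rough sketch of the eval()-based gating Vinay mentions above,
modelled on the intel_gt_debugfs_file convention; the fops name and the exact
condition are assumptions here, not the final patch:

	static bool rps_eval(void *data)
	{
		struct intel_gt *gt = data;

		/* Hide rps_boost_info when SLPC owns GT frequency management */
		return HAS_RPS(gt->i915) && !intel_uc_uses_guc_slpc(&gt->uc);
	}

	static const struct intel_gt_debugfs_file files[] = {
		{ "rps_boost_info", &rps_boost_fops, rps_eval },
	};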


Thanks,

Vinay.



Ashutosh



guc_slpc_info already shows the number of boosts.  Add num_waiters there
as well and disable rps_boost when SLPC is enabled.

Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7632
Signed-off-by: Vinay Belgaumkar 


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [v6,1/2] drm/i915/guc/slpc: Provide sysfs for efficient freq (rev2)

2023-04-26 Thread Belgaumkar, Vinay


On 4/26/2023 6:13 PM, Patchwork wrote:

Series: series starting with [v6,1/2] drm/i915/guc/slpc: Provide sysfs for efficient freq (rev2)
URL: https://patchwork.freedesktop.org/series/116957/
State: failure
Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116957v2/index.html



  CI Bug Log - changes from CI_DRM_13066 -> Patchwork_116957v2


Summary

*FAILURE*

Serious unknown changes coming with Patchwork_116957v2 absolutely need 
to be

verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_116957v2, please notify your bug team to allow 
them
to document this new failure mode, which will reduce false positives 
in CI.


External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116957v2/index.html



Participating hosts (40 -> 38)

Missing (2): bat-rpls-2 fi-snb-2520m


Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_116957v2:



  IGT changes


Possible regressions

  * igt@gem_exec_suspend@basic-s3@smem:
  o bat-rpls-1: PASS -> ABORT

Failure has nothing to do with this patch series.


Suppressed

The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.

  * igt@i915_selftest@live@gt_heartbeat:
  o {bat-kbl-2}: PASS -> FAIL

This as well.

Thanks,

Vinay.


Known issues

Here are the changes found in Patchwork_116957v2 that come from known 
issues:



  IGT changes


Issues hit

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence:
  o bat-dg2-11: NOTRUN -> SKIP (i915#1845 / i915#5354)

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1:
  o bat-dg2-8: PASS -> FAIL (i915#7932)


Possible fixes

  * igt@i915_selftest@live@slpc:
  o bat-adln-1: FAIL (i915#6997) -> PASS




Warnings

  * igt@i915_suspend@basic-s3-without-i915:
  o fi-tgl-1115g4: INCOMPLETE (i915#7443 / i915#8102) -> INCOMPLETE (i915#8102)

{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).


Build changes

  * Linux: CI_DRM_13066 -> Patchwork_116957v2

CI-20190529: 20190529
CI_DRM_13066: bdd3e1a1625175c5a56bb850b986d478ea9fbf60 @ 
git://anongit.freedesktop.org/gfx-ci/linux
IGT_7272: b2786c0c504bb4fa9f2dc6082fe9332223198b24 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
Patchwork_116957v2: bdd3e1a1625175c5a56bb850b986d478ea9fbf60 @ 
git://anongit.freedesktop.org/gfx-ci/linux



  Linux commits

37444baa0ad8 drm/i915/selftest: Update the SLPC selftest
cc3e89a3db27 drm/i915/guc/slpc: Provide sysfs for efficient freq


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [v6,1/2] drm/i915/guc/slpc: Provide sysfs for efficient freq

2023-04-25 Thread Belgaumkar, Vinay


On 4/25/2023 6:40 PM, Patchwork wrote:

Series: series starting with [v6,1/2] drm/i915/guc/slpc: Provide sysfs for efficient freq
URL: https://patchwork.freedesktop.org/series/116957/
State: failure
Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116957v1/index.html



  CI Bug Log - changes from CI_DRM_13062 -> Patchwork_116957v1


Summary

*FAILURE*

Serious unknown changes coming with Patchwork_116957v1 absolutely need 
to be

verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_116957v1, please notify your bug team to allow 
them
to document this new failure mode, which will reduce false positives 
in CI.


External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116957v1/index.html



Participating hosts (39 -> 36)

Missing (3): fi-kbl-soraka bat-mtlp-8 fi-snb-2520m


Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_116957v1:



  IGT changes


Possible regressions

  * igt@i915_pm_rps@basic-api:
  o bat-adlp-9: PASS -> FAIL
  o bat-adlp-6: PASS -> FAIL
  o bat-atsm-1: PASS -> FAIL
  o bat-adlm-1: PASS -> FAIL

i915_pm_rps@basic-api is supposed to skip on GuC enabled platforms. 
Another series - https://patchwork.freedesktop.org/series/115698/ will 
fix this and ensure the skip actually happens.


Thanks,

Vinay.




Known issues

Here are the changes found in Patchwork_116957v1 that come from known 
issues:



  IGT changes


Issues hit

  * igt@i915_pm_rps@basic-api:
  o bat-dg1-7: PASS -> FAIL (i915#8308)
  o bat-rplp-1: PASS -> FAIL (i915#8308)
  o bat-dg1-5: PASS -> FAIL (i915#8308)
  o bat-dg2-9: PASS -> FAIL (i915#8308)
  o bat-adln-1: PASS -> FAIL (i915#8308)
  o bat-dg2-8: PASS -> FAIL (i915#8308)
  o bat-rpls-1: PASS -> FAIL (i915#8308)

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/4] lib: Make SLPC helper function per GT

2023-04-23 Thread Belgaumkar, Vinay



On 4/14/2023 1:25 PM, Dixit, Ashutosh wrote:

On Fri, 14 Apr 2023 12:16:37 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Use default of 0 where GT id is not being used.

v2: Add a helper for GT 0 (Ashutosh)

Signed-off-by: Vinay Belgaumkar 
---
  lib/igt_pm.c | 36 ++--
  lib/igt_pm.h |  3 ++-
  2 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/lib/igt_pm.c b/lib/igt_pm.c
index 704acf7d..8a30bb3b 100644
--- a/lib/igt_pm.c
+++ b/lib/igt_pm.c
@@ -1329,21 +1329,37 @@ void igt_pm_print_pci_card_runtime_status(void)
}
  }

-bool i915_is_slpc_enabled(int fd)
+/**
+ * i915_is_slpc_enabled_gt:
+ * @drm_fd: DRM file descriptor
+ * @gt: GT id
+ * Check if SLPC is enabled on a GT
+ */
+bool i915_is_slpc_enabled_gt(int drm_fd, int gt)
  {
-   int debugfs_fd = igt_debugfs_dir(fd);
-   char buf[4096] = {};
-   int len;
+   int debugfs_fd;
+   char buf[256] = {};

Shouldn't this be 4096 as before?


-   igt_require(debugfs_fd != -1);
+   debugfs_fd = igt_debugfs_gt_open(drm_fd, gt, "uc/guc_slpc_info", 
O_RDONLY);
+
+   /* if guc_slpc_info not present then return false */
+   if (debugfs_fd < 0)
+   return false;

I think this should just be:

igt_require_fd(debugfs_fd);

Basically we cannot determine if SLPC is enabled or not if say debugfs is
not mounted, so it's not correct return false from here.


Actually, rethinking this, we should keep it returning false. The igt_require
approach makes tests skip on platforms where they shouldn't. Debugfs will only
be missing when driver load fails, and that would already cause the test to
fail when we try to create the drm fd before this point. Case in point -
https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8839/fi-tgl-1115g4/igt@i915_pm_...@basic-api.html
- here, the test should have run (GuC-disabled platform) but it skipped.


Thanks,

Vinay.




+   read(debugfs_fd, buf, sizeof(buf)-1);

-   len = igt_debugfs_simple_read(debugfs_fd, "gt/uc/guc_slpc_info", buf, 
sizeof(buf));
close(debugfs_fd);

-   if (len < 0)
-   return false;
-   else
-   return strstr(buf, "SLPC state: running");
+   return strstr(buf, "SLPC state: running");
+}
+
+/**
+ * i915_is_slpc_enabled:
+ * @drm_fd: DRM file descriptor
+ * Check if SLPC is enabled on GT 0

Hmm, not sure why we are not using the i915_for_each_gt() loop here since
that is the correct way of doing it.

At the min let's remove the GT 0 in the comment above. This function
doesn't check for GT0, it checks if "slpc is enabled for the device". We
can check only on GT0 if we are certain that checking on GT0 is sufficient,
that is if SLPC is disabled on GT0 it's disabled for the device. But then
someone can ask the question in that case why are we exposing slpc_enabled
for each gt from the kernel rather than at the device level.

In any case for now let's change the above comment to:

"Check if SLPC is enabled" or ""Check if SLPC is enabled for the i915
device".

With the above comments addressed this is:

Reviewed-by: Ashutosh Dixit 

Also, why is igt@i915_pm_rps@basic-api still skipping on DG2/ATSM in
pre-merge CI even after this series?

Thanks.
--
Ashutosh



+ */
+bool i915_is_slpc_enabled(int drm_fd)
+{
+   return i915_is_slpc_enabled_gt(drm_fd, 0);
  }
  int igt_pm_get_runtime_suspended_time(struct pci_device *pci_dev)
diff --git a/lib/igt_pm.h b/lib/igt_pm.h
index d0d6d673..448cf42d 100644
--- a/lib/igt_pm.h
+++ b/lib/igt_pm.h
@@ -84,7 +84,8 @@ void igt_pm_set_d3cold_allowed(struct igt_device_card *card, 
const char *val);
  void igt_pm_setup_pci_card_runtime_pm(struct pci_device *pci_dev);
  void igt_pm_restore_pci_card_runtime_pm(void);
  void igt_pm_print_pci_card_runtime_status(void);
-bool i915_is_slpc_enabled(int fd);
+bool i915_is_slpc_enabled_gt(int drm_fd, int gt);
+bool i915_is_slpc_enabled(int drm_fd);
  int igt_pm_get_runtime_suspended_time(struct pci_device *pci_dev);
  int igt_pm_get_runtime_usage(struct pci_device *pci_dev);

--
2.38.1



Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Provide sysfs for efficient freq

2023-04-18 Thread Belgaumkar, Vinay



On 4/17/2023 6:39 PM, Andi Shyti wrote:

Hi Vinay,

Looks good, just few minor comments below,

[...]


@@ -267,13 +267,11 @@ static int run_test(struct intel_gt *gt, int test_type)
}
  
  	/*

-* Set min frequency to RPn so that we can test the whole
-* range of RPn-RP0. This also turns off efficient freq
-* usage and makes results more predictable.
+* Turn off efficient freq so RPn/RP0 ranges are obeyed
 */
-   err = slpc_set_min_freq(slpc, slpc->min_freq);
+   err = intel_guc_slpc_set_ignore_eff_freq(slpc, true);
if (err) {
-   pr_err("Unable to update min freq!");
+   pr_err("Unable to turn off efficient freq!");

drm_err()? or gt_err()? As we are here we can use a proper
printing.

How is this change related to the scope of this patch?
The selftest was relying on setting min freq < RP1 to disable efficient 
freq, now that we have an interface, the test should use that (former 
method will not work). Should this be a separate patch?
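For reference, the gt_err() form being suggested would look roughly like this
(assuming the selftest pulls in the intel_gt_print.h helpers):

	gt_err(gt, "Unable to turn off efficient freq: %pe\n", ERR_PTR(err));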



return err;
}
  
@@ -358,9 +356,10 @@ static int run_test(struct intel_gt *gt, int test_type)

break;
}
  
-	/* Restore min/max frequencies */

-   slpc_set_max_freq(slpc, slpc_max_freq);
+   /* Restore min/max frequencies and efficient flag */
slpc_set_min_freq(slpc, slpc_min_freq);
+   slpc_set_max_freq(slpc, slpc_max_freq);
+   intel_guc_slpc_set_ignore_eff_freq(slpc, false);

mmhhh... do we care here about the return value?

I guess we should, will add.


  
  	if (igt_flush_test(gt->i915))

err = -EIO;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index 026d73855f36..b1b70ee3001b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -277,6 +277,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
  
  	slpc->max_freq_softlimit = 0;

slpc->min_freq_softlimit = 0;
+   slpc->ignore_eff_freq = false;
slpc->min_is_rpmax = false;
  
  	slpc->boost_freq = 0;

@@ -457,6 +458,31 @@ int intel_guc_slpc_get_max_freq(struct intel_guc_slpc 
*slpc, u32 *val)
return ret;
  }
  
+int intel_guc_slpc_set_ignore_eff_freq(struct intel_guc_slpc *slpc, bool val)

+{
+   struct drm_i915_private *i915 = slpc_to_i915(slpc);
+   intel_wakeref_t wakeref;
+   int ret = 0;

no need to initialize ret here.

ok.



+
+   mutex_lock(>lock);
+   wakeref = intel_runtime_pm_get(>runtime_pm);
+
+   ret = slpc_set_param(slpc,
+SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
+val);
+   if (ret) {
+   guc_probe_error(slpc_to_guc(slpc), "Failed to set efficient 
freq(%d): %pe\n",
+   val, ERR_PTR(ret));
+   goto out;
+   }
+
+   slpc->ignore_eff_freq = val;

nit that you can ignore: if you put this under an else, you save
the brackets and a goto.


ok.

Thanks,

Vinay.



Andi


Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/4] lib: Make SLPC helper function per GT

2023-04-17 Thread Belgaumkar, Vinay



On 4/14/2023 1:25 PM, Dixit, Ashutosh wrote:

On Fri, 14 Apr 2023 12:16:37 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Use default of 0 where GT id is not being used.

v2: Add a helper for GT 0 (Ashutosh)

Signed-off-by: Vinay Belgaumkar 
---
  lib/igt_pm.c | 36 ++--
  lib/igt_pm.h |  3 ++-
  2 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/lib/igt_pm.c b/lib/igt_pm.c
index 704acf7d..8a30bb3b 100644
--- a/lib/igt_pm.c
+++ b/lib/igt_pm.c
@@ -1329,21 +1329,37 @@ void igt_pm_print_pci_card_runtime_status(void)
}
  }

-bool i915_is_slpc_enabled(int fd)
+/**
+ * i915_is_slpc_enabled_gt:
+ * @drm_fd: DRM file descriptor
+ * @gt: GT id
+ * Check if SLPC is enabled on a GT
+ */
+bool i915_is_slpc_enabled_gt(int drm_fd, int gt)
  {
-   int debugfs_fd = igt_debugfs_dir(fd);
-   char buf[4096] = {};
-   int len;
+   int debugfs_fd;
+   char buf[256] = {};

Shouldn't this be 4096 as before?

ok.



-   igt_require(debugfs_fd != -1);
+   debugfs_fd = igt_debugfs_gt_open(drm_fd, gt, "uc/guc_slpc_info", 
O_RDONLY);
+
+   /* if guc_slpc_info not present then return false */
+   if (debugfs_fd < 0)
+   return false;

I think this should just be:

igt_require_fd(debugfs_fd);

Basically we cannot determine if SLPC is enabled or not if say debugfs is
not mounted, so it's not correct return false from here.

yup, makes sense.



+   read(debugfs_fd, buf, sizeof(buf)-1);

-   len = igt_debugfs_simple_read(debugfs_fd, "gt/uc/guc_slpc_info", buf, 
sizeof(buf));
close(debugfs_fd);

-   if (len < 0)
-   return false;
-   else
-   return strstr(buf, "SLPC state: running");
+   return strstr(buf, "SLPC state: running");
+}
+
+/**
+ * i915_is_slpc_enabled:
+ * @drm_fd: DRM file descriptor
+ * Check if SLPC is enabled on GT 0

Hmm, not sure why we are not using the i915_for_each_gt() loop here since
that is the correct way of doing it.
Didn't want to introduce another aggregation here. If SLPC is enabled on 
GT0, it is obviously enabled on all other tiles on that device. There is 
no per tile SLPC/GuC control.


At the min let's remove the GT 0 in the comment above. This function
doesn't check for GT0, it checks if "slpc is enabled for the device". We
can check only on GT0 if we are certain that checking on GT0 is sufficient,
that is if SLPC is disabled on GT0 it's disabled for the device. But then
someone can ask the question in that case why are we exposing slpc_enabled
for each gt from the kernel rather than at the device level.

In any case for now let's change the above comment to:

"Check if SLPC is enabled" or ""Check if SLPC is enabled for the i915
device".

ok.


With the above comments addressed this is:

Reviewed-by: Ashutosh Dixit 

Also, why is igt@i915_pm_rps@basic-api still skipping on DG2/ATSM in
pre-merge CI even after this series?


basic-api is supposed to skip on GuC platforms. It wasn't skipping, due to the
test incorrectly reading the SLPC enabled status from debugfs (which is
being fixed here).


Thanks for the review,

Vinay.



Thanks.
--
Ashutosh



+ */
+bool i915_is_slpc_enabled(int drm_fd)
+{
+   return i915_is_slpc_enabled_gt(drm_fd, 0);
  }
  int igt_pm_get_runtime_suspended_time(struct pci_device *pci_dev)
diff --git a/lib/igt_pm.h b/lib/igt_pm.h
index d0d6d673..448cf42d 100644
--- a/lib/igt_pm.h
+++ b/lib/igt_pm.h
@@ -84,7 +84,8 @@ void igt_pm_set_d3cold_allowed(struct igt_device_card *card, 
const char *val);
  void igt_pm_setup_pci_card_runtime_pm(struct pci_device *pci_dev);
  void igt_pm_restore_pci_card_runtime_pm(void);
  void igt_pm_print_pci_card_runtime_status(void);
-bool i915_is_slpc_enabled(int fd);
+bool i915_is_slpc_enabled_gt(int drm_fd, int gt);
+bool i915_is_slpc_enabled(int drm_fd);
  int igt_pm_get_runtime_suspended_time(struct pci_device *pci_dev);
  int igt_pm_get_runtime_usage(struct pci_device *pci_dev);

--
2.38.1



Re: [Intel-gfx] [PATCH v3] drm/i915/guc/slpc: Provide sysfs for efficient freq

2023-04-17 Thread Belgaumkar, Vinay



On 4/14/2023 4:49 PM, Dixit, Ashutosh wrote:

On Fri, 14 Apr 2023 15:34:15 -0700, Vinay Belgaumkar wrote:

@@ -457,6 +458,34 @@ int intel_guc_slpc_get_max_freq(struct intel_guc_slpc 
*slpc, u32 *val)
return ret;
  }

+int intel_guc_slpc_set_ignore_eff_freq(struct intel_guc_slpc *slpc, bool val)
+{
+   struct drm_i915_private *i915 = slpc_to_i915(slpc);
+   intel_wakeref_t wakeref;
+   int ret = 0;
+
+   /* Need a lock now since waitboost can be modifying min as well */

Delete comment.

ok.

+   mutex_lock(>lock);

Actually, don't need the lock itself now so delete the lock.

Or, maybe the lock prevents the race if userspace writes to the sysfs when
GuC reset is going on so let's retain the lock. But the comment is wrong.

yup, ok.



+   wakeref = intel_runtime_pm_get(>runtime_pm);
+
+   /* Ignore efficient freq if lower min freq is requested */

Delete comment, it's wrong.

ok.



+   ret = slpc_set_param(slpc,
+SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
+val);
+   if (ret) {
+   guc_probe_error(slpc_to_guc(slpc), "Failed to set efficient 
freq(%d): %pe\n",
+   val, ERR_PTR(ret));
+   goto out;
+   }
+
+   slpc->ignore_eff_freq = val;
+

This extra line can also be deleted.

ok.



+out:
+   intel_runtime_pm_put(>runtime_pm, wakeref);
+   mutex_unlock(>lock);
+   return ret;
+}
+
  /**
   * intel_guc_slpc_set_min_freq() - Set min frequency limit for SLPC.
   * @slpc: pointer to intel_guc_slpc.
@@ -482,16 +511,6 @@ int intel_guc_slpc_set_min_freq(struct intel_guc_slpc 
*slpc, u32 val)
mutex_lock(>lock);
wakeref = intel_runtime_pm_get(>runtime_pm);

-   /* Ignore efficient freq if lower min freq is requested */
-   ret = slpc_set_param(slpc,
-SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
-val < slpc->rp1_freq);
-   if (ret) {
-   guc_probe_error(slpc_to_guc(slpc), "Failed to toggle efficient freq: 
%pe\n",
-   ERR_PTR(ret));
-   goto out;
-   }
-

Great, thanks!

After taking care of the above, and seems there are also a couple of
checkpatch errors, this is:

Reviewed-by: Ashutosh Dixit 


Thanks,

Vinay.



Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/4] lib: Make SLPC helper function per GT

2023-04-14 Thread Belgaumkar, Vinay



On 4/14/2023 11:10 AM, Dixit, Ashutosh wrote:

On Thu, 13 Apr 2023 15:44:12 -0700, Vinay Belgaumkar wrote:

Use default of 0 where GT id is not being used.

Signed-off-by: Vinay Belgaumkar 
---
  lib/igt_pm.c | 20 ++--
  lib/igt_pm.h |  2 +-
  tests/i915/i915_pm_rps.c |  6 +++---
  3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/lib/igt_pm.c b/lib/igt_pm.c
index 704acf7d..8ca7c181 100644
--- a/lib/igt_pm.c
+++ b/lib/igt_pm.c
@@ -1329,21 +1329,21 @@ void igt_pm_print_pci_card_runtime_status(void)
}
  }

-bool i915_is_slpc_enabled(int fd)
+bool i915_is_slpc_enabled(int drm_fd, int gt)

OK, we understand that the debugfs dir path is per gt, but I am wondering
if we need to expose this as a function argument? Since, in all instances,
we are always passing gt as 0.

Maybe the caller is only interested in knowing if slpc is enabled. Can SLPC
be enabled for gt 0 and disabled for gt 1? In that case the caller should
really call something like:

for_each_gt()
i915_is_slpc_enabled(fd, gt)

and return false if slpc is disabled for any gt.

I think what we should do is write two functions:

1. Rename the function above with the gt argument to something like:

i915_is_slpc_enabled_gt()

2. Have another function without the gt argument:

i915_is_slpc_enabled() which will do:

for_each_gt()
i915_is_slpc_enabled_gt(fd, gt)

and return false if slpc is disabled for any gt.

And then have the tests call this second function without the gt argument.

I think this will be cleaner than passing 0 as the gt from the tests.
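A hedged sketch of that split, reusing the igt_sysfs_get_num_gt() helper already
used by the tests in this series (the posted v2 ends up hard-coding GT 0 instead):

	bool i915_is_slpc_enabled(int drm_fd)
	{
		int num_gts = igt_sysfs_get_num_gt(drm_fd);
		int gt;

		/* SLPC must report "running" on every GT of the device */
		for (gt = 0; gt < num_gts; gt++)
			if (!i915_is_slpc_enabled_gt(drm_fd, gt))
				return false;

		return true;
	}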


ok, created a helper for the helper :) This will hard code GT 0 instead 
of the tests doing it, when necessary.


Thanks,

Vinay.



Thanks.
--
Ashutosh



  {
-   int debugfs_fd = igt_debugfs_dir(fd);
-   char buf[4096] = {};
-   int len;
+   int debugfs_fd;
+   char buf[256] = {};
+
+   debugfs_fd = igt_debugfs_gt_open(drm_fd, gt, "uc/guc_slpc_info", 
O_RDONLY);

-   igt_require(debugfs_fd != -1);
+   /* if guc_slpc_info not present then return false */
+   if (debugfs_fd < 0)
+   return false;
+   read(debugfs_fd, buf, sizeof(buf)-1);

-   len = igt_debugfs_simple_read(debugfs_fd, "gt/uc/guc_slpc_info", buf, 
sizeof(buf));
close(debugfs_fd);

-   if (len < 0)
-   return false;
-   else
-   return strstr(buf, "SLPC state: running");
+   return strstr(buf, "SLPC state: running");
  }

  int igt_pm_get_runtime_suspended_time(struct pci_device *pci_dev)
diff --git a/lib/igt_pm.h b/lib/igt_pm.h
index d0d6d673..1b054dce 100644
--- a/lib/igt_pm.h
+++ b/lib/igt_pm.h
@@ -84,7 +84,7 @@ void igt_pm_set_d3cold_allowed(struct igt_device_card *card, 
const char *val);
  void igt_pm_setup_pci_card_runtime_pm(struct pci_device *pci_dev);
  void igt_pm_restore_pci_card_runtime_pm(void);
  void igt_pm_print_pci_card_runtime_status(void);
-bool i915_is_slpc_enabled(int fd);
+bool i915_is_slpc_enabled(int fd, int gt);
  int igt_pm_get_runtime_suspended_time(struct pci_device *pci_dev);
  int igt_pm_get_runtime_usage(struct pci_device *pci_dev);

diff --git a/tests/i915/i915_pm_rps.c b/tests/i915/i915_pm_rps.c
index d4ee2d58..85dae449 100644
--- a/tests/i915/i915_pm_rps.c
+++ b/tests/i915/i915_pm_rps.c
@@ -916,21 +916,21 @@ igt_main
}

igt_subtest("basic-api") {
-   igt_skip_on_f(i915_is_slpc_enabled(drm_fd),
+   igt_skip_on_f(i915_is_slpc_enabled(drm_fd, 0),
  "This subtest is not supported when SLPC is 
enabled\n");
min_max_config(basic_check, false);
}

/* Verify the constraints, check if we can reach idle */
igt_subtest("min-max-config-idle") {
-   igt_skip_on_f(i915_is_slpc_enabled(drm_fd),
+   igt_skip_on_f(i915_is_slpc_enabled(drm_fd, 0),
  "This subtest is not supported when SLPC is 
enabled\n");
min_max_config(idle_check, true);
}

/* Verify the constraints with high load, check if we can reach max */
igt_subtest("min-max-config-loaded") {
-   igt_skip_on_f(i915_is_slpc_enabled(drm_fd),
+   igt_skip_on_f(i915_is_slpc_enabled(drm_fd, 0),
  "This subtest is not supported when SLPC is 
enabled\n");
load_helper_run(HIGH);
min_max_config(loaded_check, false);
--
2.38.1



Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/2] i915_pm_freq_api: Add some basic SLPC igt tests

2023-04-03 Thread Belgaumkar, Vinay



On 4/3/2023 8:36 AM, Dixit, Ashutosh wrote:

On Mon, 03 Apr 2023 08:23:45 -0700, Belgaumkar, Vinay wrote:


On 3/31/2023 4:56 PM, Dixit, Ashutosh wrote:

On Mon, 27 Mar 2023 19:00:28 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


+/*
+ * Too many intermediate components and steps before freq is adjusted.
+ * Especially if workload is under execution, so let's wait 100 ms.
+ */
+#define ACT_FREQ_LATENCY_US 100000
+
+static uint32_t get_freq(int dirfd, uint8_t id)
+{
+   uint32_t val;
+
+   igt_require(igt_sysfs_rps_scanf(dirfd, id, "%u", ) == 1);

igt_assert?

ok.

+static void test_freq_basic_api(int dirfd, int gt)
+{
+   uint32_t rpn, rp0, rpe;
+
+   /* Save frequencies */
+   rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+   rp0 = get_freq(dirfd, RPS_RP0_FREQ_MHZ);
+   rpe = get_freq(dirfd, RPS_RP1_FREQ_MHZ);
+   igt_info("System min freq: %dMHz; max freq: %dMHz\n", rpn, rp0);
+
+   /*
+* Negative bound tests
+* RPn is the floor
+* RP0 is the ceiling
+*/
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn - 1) < 0);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rp0 + 1) < 0);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn - 1) < 0);

Is this supposed to be RPS_MAX_FREQ_MHZ?

We could do this check for max as well. But this is trying to see if min
can be set to below rpn.

In that case this statement is the same as the first one (2 lines
above). Is that needed?


ah, yes. Need more coffee. That should be RPS_MAX_FREQ_MHZ.
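So the corrected negative-bound block would read (same asserts, with the
duplicated MIN check replaced by the intended MAX one):

	igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn - 1) < 0);
	igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rp0 + 1) < 0);
	igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn - 1) < 0);
	igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rp0 + 1) < 0);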

Thanks,

Vinay.






+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rp0 + 1) < 0);
+

After addressing the above, this is:

Reviewed-by: Ashutosh Dixit 

Also, before merging it would be good to see the results of the new
tests. So could you add a HAX patch adding the new tests to
fast-feedback.testlist and resend the series?

Sure, will do. Thanks for the review.

Vinay.


Thanks.
--
Ashutosh


Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/2] i915_pm_freq_api: Add some basic SLPC igt tests

2023-04-03 Thread Belgaumkar, Vinay



On 3/31/2023 4:56 PM, Dixit, Ashutosh wrote:

On Mon, 27 Mar 2023 19:00:28 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


+/*
+ * Too many intermediate components and steps before freq is adjusted.
+ * Especially if workload is under execution, so let's wait 100 ms.
+ */
+#define ACT_FREQ_LATENCY_US 100000
+
+static uint32_t get_freq(int dirfd, uint8_t id)
+{
+   uint32_t val;
+
+   igt_require(igt_sysfs_rps_scanf(dirfd, id, "%u", ) == 1);

igt_assert?

ok.



+static void test_freq_basic_api(int dirfd, int gt)
+{
+   uint32_t rpn, rp0, rpe;
+
+   /* Save frequencies */
+   rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+   rp0 = get_freq(dirfd, RPS_RP0_FREQ_MHZ);
+   rpe = get_freq(dirfd, RPS_RP1_FREQ_MHZ);
+   igt_info("System min freq: %dMHz; max freq: %dMHz\n", rpn, rp0);
+
+   /*
+* Negative bound tests
+* RPn is the floor
+* RP0 is the ceiling
+*/
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn - 1) < 0);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rp0 + 1) < 0);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn - 1) < 0);

Is this supposed to be RPS_MAX_FREQ_MHZ?
We could do this check for max as well. But this is trying to see if min 
can be set to below rpn.



+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rp0 + 1) < 0);
+

After addressing the above, this is:

Reviewed-by: Ashutosh Dixit 

Also, before merging it would be good to see the results of the new
tests. So could you add a HAX patch adding the new tests to
fast-feedback.testlist and resend the series?


Sure, will do. Thanks for the review.

Vinay.



Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH i-g-t 2/2] i915_guc_pc: Add some basic SLPC igt tests

2023-03-30 Thread Belgaumkar, Vinay



On 3/28/2023 11:53 AM, Rodrigo Vivi wrote:

On Mon, Mar 27, 2023 at 04:29:55PM -0700, Belgaumkar, Vinay wrote:

On 3/26/2023 4:04 AM, Rodrigo Vivi wrote:

On Fri, Mar 24, 2023 at 03:49:59PM -0700, Vinay Belgaumkar wrote:

Use the xe_guc_pc test for i915 as well. Validate basic
api for GT freq control. Also test interaction with GT
reset. We skip rps tests with SLPC enabled, this will
re-introduce some coverage. SLPC selftests are already
covering some other workload related scenarios.

Signed-off-by: Rodrigo Vivi 

you probably meant 'Cc:'

Added you as Signed-off-by since you are the original author in xe igt.

I do understand you did this with the best of intentions here. But since we
are going to hit many cases like this with the new Xe driver, please allow
me to use this case here to bring up some thoughts.

First of all, there's a very common misunderstanding of the meaning of the
'Signed-off-by:' (sob).

**hint**: It does *not* mean 'authorship'!

Although we are in an IGT patch, let's use the kernel definition so we
are aligned on a well-documented rule:

https://www.kernel.org/doc/html/latest/process/submitting-patches.html?highlight=signed%20off#developer-s-certificate-of-origin-1-1

So, as defined in the official rules above, in this very specific case,
when you created the patch, your 'sob' certified ('b') that:
"The contribution is based upon previous work that, to the best of my knowledge,
  is covered under an appropriate open source license and I have the right under
that license to submit that work with modifications"

Any extra SoB would be added as the patch goes through its transportation.

"Any further SoBs (Signed-off-by:’s) following the author’s SoB are from people
handling and transporting the patch, but were not involved in its development.
SoB chains should reflect the real route a patch took as it was propagated to
the maintainers and ultimately to Linus, with the first SoB entry signalling
primary authorship of a single author."

Same as 'c' of the certificate of origin: "The contribution was provided
directly to me by some other person who certified (a), (b) or (c) and I have
not modified it."

It is very important to highlight these transportation rules because there
are many new devs who think that we maintainers are stealing ownership.
As you can see, this is not the case, but the rule.

Back to your case: since I had never seen this patch in my life before it hit
the mailing list, I couldn't have certified it in any possible way, so the
forged SoB is not the proper approach.

It is very hard to define a written rule on what to do with the code copied
from one driver to the other. In many cases the recognition is important,
but in other cases it is even hard to find who was actually the true author of
that code.

There are many options available. A simple one could be to 'Cc' the person and
write in the commit message that the code was based on that person's work in
the other driver, but maybe there are better options available. So let's say that
when in doubt: ask. Contact the original author and ask what he/she has
to suggest. Maybe just this mention and cc would be enough, maybe even with
an acked-by or with the explicit sob, or maybe with some other tag like
'co-developed-by'.


Ok, makes sense. I have sent out another patch with you Cc'd.

Thanks,

Vinay.



Thanks,
Rodrigo.


Signed-off-by: Vinay Belgaumkar 
---
   tests/i915/i915_guc_pc.c | 151 +++
   tests/meson.build|   1 +
   2 files changed, 152 insertions(+)
   create mode 100644 tests/i915/i915_guc_pc.c

diff --git a/tests/i915/i915_guc_pc.c b/tests/i915/i915_guc_pc.c
new file mode 100644
index ..f9a0ed83
--- /dev/null
+++ b/tests/i915/i915_guc_pc.c

since 'guc_pc' is not a thing in i915 I'm afraid this will cause
confusion later.

I know, guc_slpc also doesn't make a lot of sense here...

Should we then try to move this code to the 'tests/i915/i915_pm_rps.c'
or maybe name it i915_pm_freq_api or something like that?

Sure. I was trying to make these guc/slpc specific since host turbo/RPS
already has coverage in IGT.

Thanks,

Vinay.


@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "drmtest.h"
+#include "i915/gem.h"
+#include "igt_sysfs.h"
+#include "igt.h"
+
+IGT_TEST_DESCRIPTION("Test GuC PM features like SLPC and its interactions");
+/*
+ * Too many intermediate components and steps before freq is adjusted.
+ * Especially if workload is under execution, so let's wait 100 ms.
+ */
+#define ACT_FREQ_LATENCY_US 100000
+
+static uint32_t get_freq(int dirfd, uint8_t id)
+{
+   uint32_t val;
+
+   igt_require(igt_sysfs_rps_scanf(dirfd, id, "%u", ) == 1);
+
+   return val;
+}
+
+static int set_freq(int dirfd, 

Re: [Intel-gfx] [PATCH i-g-t 2/2] i915_guc_pc: Add some basic SLPC igt tests

2023-03-27 Thread Belgaumkar, Vinay



On 3/26/2023 4:04 AM, Rodrigo Vivi wrote:

On Fri, Mar 24, 2023 at 03:49:59PM -0700, Vinay Belgaumkar wrote:

Use the xe_guc_pc test for i915 as well. Validate basic
api for GT freq control. Also test interaction with GT
reset. We skip rps tests with SLPC enabled, this will
re-introduce some coverage. SLPC selftests are already
covering some other workload related scenarios.

Signed-off-by: Rodrigo Vivi 

you probably meant 'Cc:'

Added you as Signed-off-by since you are the original author in xe igt.



Signed-off-by: Vinay Belgaumkar 
---
  tests/i915/i915_guc_pc.c | 151 +++
  tests/meson.build|   1 +
  2 files changed, 152 insertions(+)
  create mode 100644 tests/i915/i915_guc_pc.c

diff --git a/tests/i915/i915_guc_pc.c b/tests/i915/i915_guc_pc.c
new file mode 100644
index ..f9a0ed83
--- /dev/null
+++ b/tests/i915/i915_guc_pc.c

since 'guc_pc' is not a thing in i915 I'm afraid this will cause
confusion later.

I know, guc_slpc also doesn't make a lot of sense here...

Should we then try to move this code to the 'tests/i915/i915_pm_rps.c'
or maybe name it i915_pm_freq_api or something like that?


Sure. I was trying to make these guc/slpc specific since host turbo/RPS
already has coverage in IGT.


Thanks,

Vinay.




@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "drmtest.h"
+#include "i915/gem.h"
+#include "igt_sysfs.h"
+#include "igt.h"
+
+IGT_TEST_DESCRIPTION("Test GuC PM features like SLPC and its interactions");
+/*
+ * Too many intermediate components and steps before freq is adjusted.
+ * Especially if workload is under execution, so let's wait 100 ms.
+ */
+#define ACT_FREQ_LATENCY_US 100000
+
+static uint32_t get_freq(int dirfd, uint8_t id)
+{
+   uint32_t val;
+
+   igt_require(igt_sysfs_rps_scanf(dirfd, id, "%u", ) == 1);
+
+   return val;
+}
+
+static int set_freq(int dirfd, uint8_t id, uint32_t val)
+{
+   return igt_sysfs_rps_printf(dirfd, id, "%u", val);
+}
+
+static void test_freq_basic_api(int dirfd, int gt)
+{
+   uint32_t rpn, rp0, rpe;
+
+   /* Save frequencies */
+   rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+   rp0 = get_freq(dirfd, RPS_RP0_FREQ_MHZ);
+   rpe = get_freq(dirfd, RPS_RP1_FREQ_MHZ);
+   igt_info("System min freq: %dMHz; max freq: %dMHz\n", rpn, rp0);
+
+   /*
+* Negative bound tests
+* RPn is the floor
+* RP0 is the ceiling
+*/
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn - 1) < 0);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rp0 + 1) < 0);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn - 1) < 0);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rp0 + 1) < 0);
+
+   /* Assert min requests are respected from rp0 to rpn */
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rp0) > 0);
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rp0);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpe) > 0);
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpe);
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+
+   /* Assert max requests are respected from rpn to rp0 */
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
+   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpe) > 0);
+   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpe);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rp0) > 0);
+   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rp0);
+
+}
+
+static void test_reset(int i915, int dirfd, int gt)
+{
+   uint32_t rpn = get_freq(dirfd, RPS_RPn_FREQ_MHZ);
+   int fd;
+
+   igt_assert(set_freq(dirfd, RPS_MIN_FREQ_MHZ, rpn) > 0);
+   igt_assert(set_freq(dirfd, RPS_MAX_FREQ_MHZ, rpn) > 0);
+   usleep(ACT_FREQ_LATENCY_US);
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+
+   /* Manually trigger a GT reset */
+   fd = igt_debugfs_gt_open(i915, gt, "reset", O_WRONLY);
+   igt_require(fd >= 0);
+   igt_ignore_warn(write(fd, "1\n", 2));
+   close(fd);
+
+   igt_assert(get_freq(dirfd, RPS_MIN_FREQ_MHZ) == rpn);
+   igt_assert(get_freq(dirfd, RPS_MAX_FREQ_MHZ) == rpn);
+}
+
+igt_main
+{
+   int i915 = -1;
+   uint32_t *stash_min, *stash_max;
+
+   igt_fixture {
+   int num_gts, dirfd, gt;
+
+   i915 = drm_open_driver(DRIVER_INTEL);
+   igt_require_gem(i915);
+   /* i915_pm_rps already covers execlist path */
+   igt_require(gem_using_guc_submission(i915));
+
+   num_gts = igt_sysfs_get_num_gt(i915);
+   stash_min = (uint32_t*)malloc(sizeof(uint32_t) * num_gts);
+   stash_max = 

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] tests/xe_guc_pc: Restore max freq first

2023-03-27 Thread Belgaumkar, Vinay



On 3/26/2023 3:51 AM, Rodrigo Vivi wrote:

On Fri, Mar 24, 2023 at 05:34:42PM -0700, Vinay Belgaumkar wrote:

When min/max are both at RPn, restoring min back to 300
will not work. Max needs to be increased first.

Why does max need to come first in this case? We should probably at
least document it so we don't forget it again...
I was assuming we use soft limits like in i915, but looks like we don't. 
So, this is not an issue.



Also, add
igt_assert() here, which would have caught the issue.

I was going to ask if we should really add asserts inside the fixture
or maybe use igt_require instead, but then I noticed more cases
doing the assert...


Do we still need to add the assert in this case?

Thanks,

Vinay.




Cc: Rodrigo Vivi 
Signed-off-by: Vinay Belgaumkar 
---
  tests/xe/xe_guc_pc.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/xe/xe_guc_pc.c b/tests/xe/xe_guc_pc.c
index 60c93288..43bf6f48 100644
--- a/tests/xe/xe_guc_pc.c
+++ b/tests/xe/xe_guc_pc.c
@@ -489,8 +489,8 @@ igt_main
  
  	igt_fixture {

xe_for_each_gt(fd, gt) {
-   set_freq(sysfs, gt, "min", stash_min);
-   set_freq(sysfs, gt, "max", stash_max);
+   igt_assert(set_freq(sysfs, gt, "max", stash_max) > 0);
+   igt_assert(set_freq(sysfs, gt, "min", stash_min) > 0);
}
close(sysfs);
xe_device_put(fd);
--
2.38.1



Re: [Intel-gfx] [PATCH] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-03-24 Thread Belgaumkar, Vinay



On 3/24/2023 4:31 PM, Dixit, Ashutosh wrote:

On Fri, 24 Mar 2023 11:15:02 -0700, Belgaumkar, Vinay wrote:
Hi Vinay,

Thanks for the review. Comments inline below.
Sorry about asking the same questions all over again :) Didn't look at 
previous versions.



On 3/15/2023 8:59 PM, Ashutosh Dixit wrote:

On dGfx, the PL1 power limit being enabled and set to a low value results
in a low GPU operating freq. It also negates the freq raise operation which
is done before GuC firmware load. As a result GuC firmware load can time
out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
limit was enabled and set to a low value). Therefore disable the PL1 power
limit when allowed by HW when loading GuC firmware.

v3 label missing in subject.

v2:
   - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
   - Add hwm_power_max_restore to error return code path

v3 (Jani N):
   - Add/remove explanatory comments
   - Function renames
   - Type corrections
   - Locking annotation

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
Signed-off-by: Ashutosh Dixit 
---
   drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 +++
   drivers/gpu/drm/i915/i915_hwmon.c | 39 +++
   drivers/gpu/drm/i915/i915_hwmon.h |  7 +
   3 files changed, 55 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 4ccb4be4c9cba..aa8e35a5636a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -18,6 +18,7 @@
   #include "intel_uc.h"
 #include "i915_drv.h"
+#include "i915_hwmon.h"
 static const struct intel_uc_ops uc_ops_off;
   static const struct intel_uc_ops uc_ops_on;
@@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
struct intel_guc *guc = >guc;
struct intel_huc *huc = >huc;
int ret, attempts;
+   bool pl1en;

Init to 'false' here

See next comment.




GEM_BUG_ON(!intel_uc_supports_guc(uc));
GEM_BUG_ON(!intel_uc_wants_guc(uc));
@@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
else
attempts = 1;
   +/* Disable a potentially low PL1 power limit to allow freq to be
raised */
+   i915_hwmon_power_max_disable(gt->i915, );
+
intel_rps_raise_unslice(_to_gt(uc)->rps);
while (attempts--) {
@@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
intel_rps_lower_unslice(_to_gt(uc)->rps);
}
   +i915_hwmon_power_max_restore(gt->i915, pl1en);
+
guc_info(guc, "submission %s\n", 
str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
guc_info(guc, "SLPC %s\n", 
str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
   @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
/* Return GT back to RPn */
intel_rps_lower_unslice(_to_gt(uc)->rps);
   +i915_hwmon_power_max_restore(gt->i915, pl1en);

if (pl1en)

     i915_hwmon_power_max_enable().

IMO it's better not to have checks in the main __uc_init_hw() function (if
we do this we'll need to add 2 checks in __uc_init_hw()). If you really
want we could do something like this inside
i915_hwmon_power_max_disable/i915_hwmon_power_max_restore. But for now I
am not making any changes.

ok.


(I can send a patch with the changes if you want to take a look but IMO it
will add more logic/code but without real benefits (it will save a rmw if
the limit was already disabled, but IMO this code is called so infrequently
(only during GuC resets) as to not have any significant impact)).


+
__uc_sanitize(uc);
if (!ret) {
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index ee63a8fd88fc1..769b5bda4d53f 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -444,6 +444,45 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int 
chan, long val)
}
   }
   +void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool
*old)

Shouldn't we call this i915_hwmon_package_pl1_disable()?

I did think of using "pl1" in the function name but then decided to retain
"power_max" because other hwmon functions for PL1 limit also use
"power_max" (hwm_power_max_read/hwm_power_max_write) and currently
"hwmon_power_max" is mapped to the PL1 limit. So "power_max" is used to
show that all these functions deal with the PL1 power limit.

There is a comment in __uc_init_hw() explaining "power_max" means the PL1
power limit.

ok.



+   __acquires(i915->hwmon->hwmon_lock)
+{
+   struct i915_hwmon *hwmon = i915->hwmon;
+   intel_wakeref_t wakeref;
+   u32 r;
+
+   if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+   return;
+
+   /* Take mu

Re: [Intel-gfx] [PATCH] drm/i915/mtl: Disable C6 on MTL A0 for media

2023-03-24 Thread Belgaumkar, Vinay



On 3/24/2023 11:02 AM, Umesh Nerlige Ramappa wrote:

Earlier merge dropped an if block when applying the patch -
"drm/i915/mtl: Synchronize i915/BIOS on C6 enabling". Bring back the
if block as the check is required by - "drm/i915/mtl: Disable MC6 for MTL
A step" to disable C6 on media for A0 stepping.


LGTM,

Reviewed-by: Vinay Belgaumkar 



Fixes: 3735040978a4 ("drm/i915/mtl: Synchronize i915/BIOS on C6 enabling")
Signed-off-by: Umesh Nerlige Ramappa 
---
  drivers/gpu/drm/i915/gt/intel_rc6.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.c 
b/drivers/gpu/drm/i915/gt/intel_rc6.c
index f760586f9f46..8f3cd68d14f8 100644
--- a/drivers/gpu/drm/i915/gt/intel_rc6.c
+++ b/drivers/gpu/drm/i915/gt/intel_rc6.c
@@ -525,6 +525,13 @@ static bool rc6_supported(struct intel_rc6 *rc6)
return false;
}
  
+	if (IS_MTL_MEDIA_STEP(gt->i915, STEP_A0, STEP_B0) &&

+   gt->type == GT_MEDIA) {
+   drm_notice(&i915->drm,
+  "Media RC6 disabled on A step\n");
+   return false;
+   }
+
return true;
  }
  


Re: [Intel-gfx] [PATCH] drm/i915/guc: Disable PL1 power limit when loading GuC firmware

2023-03-24 Thread Belgaumkar, Vinay



On 3/15/2023 8:59 PM, Ashutosh Dixit wrote:

On dGfx, the PL1 power limit being enabled and set to a low value results
in a low GPU operating freq. It also negates the freq raise operation which
is done before GuC firmware load. As a result GuC firmware load can time
out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
limit was enabled and set to a low value). Therefore disable the PL1 power
limit when allowed by HW when loading GuC firmware.

v3 label missing in subject.


v2:
  - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
  - Add hwm_power_max_restore to error return code path

v3 (Jani N):
  - Add/remove explanatory comments
  - Function renames
  - Type corrections
  - Locking annotation

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
Signed-off-by: Ashutosh Dixit 
---
  drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 +++
  drivers/gpu/drm/i915/i915_hwmon.c | 39 +++
  drivers/gpu/drm/i915/i915_hwmon.h |  7 +
  3 files changed, 55 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 4ccb4be4c9cba..aa8e35a5636a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -18,6 +18,7 @@
  #include "intel_uc.h"
  
  #include "i915_drv.h"

+#include "i915_hwmon.h"
  
  static const struct intel_uc_ops uc_ops_off;

  static const struct intel_uc_ops uc_ops_on;
@@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
	struct intel_guc *guc = &uc->guc;
	struct intel_huc *huc = &uc->huc;
int ret, attempts;
+   bool pl1en;


Init to 'false' here


  
  	GEM_BUG_ON(!intel_uc_supports_guc(uc));

GEM_BUG_ON(!intel_uc_wants_guc(uc));
@@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
else
attempts = 1;
  
+	/* Disable a potentially low PL1 power limit to allow freq to be raised */

+   i915_hwmon_power_max_disable(gt->i915, &pl1en);
+
	intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
  
  	while (attempts--) {

@@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
}
  
+	i915_hwmon_power_max_restore(gt->i915, pl1en);

+
guc_info(guc, "submission %s\n", 
str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
guc_info(guc, "SLPC %s\n", 
str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
  
@@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)

/* Return GT back to RPn */
	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
  
+	i915_hwmon_power_max_restore(gt->i915, pl1en);


if (pl1en)

    i915_hwmon_power_max_enable().


+
__uc_sanitize(uc);
  
  	if (!ret) {

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index ee63a8fd88fc1..769b5bda4d53f 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -444,6 +444,45 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int 
chan, long val)
}
  }
  
+void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old)

Shouldn't we call this i915_hwmon_package_pl1_disable()?

+   __acquires(i915->hwmon->hwmon_lock)
+{
+   struct i915_hwmon *hwmon = i915->hwmon;
+   intel_wakeref_t wakeref;
+   u32 r;
+
+   if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+   return;
+
+   /* Take mutex to prevent concurrent hwm_power_max_write */
+   mutex_lock(&hwmon->hwmon_lock);
+
+   with_intel_runtime_pm(hwmon->ddat.uncore->rpm, wakeref)
+   r = intel_uncore_rmw(hwmon->ddat.uncore,
+hwmon->rg.pkg_rapl_limit,
+PKG_PWR_LIM_1_EN, 0);

Most of this code (lock and rmw parts) is already inside static void
hwm_locked_with_pm_intel_uncore_rmw() , can we reuse that here?

+
+   *old = !!(r & PKG_PWR_LIM_1_EN);
+}
+
+void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
+   __releases(i915->hwmon->hwmon_lock)
We can just call this i915_hwmon_power_max_enable() and call it whenever 
the old value was actually enabled. That way, we have proper mirror 
functions.

+{
+   struct i915_hwmon *hwmon = i915->hwmon;
+   intel_wakeref_t wakeref;
+
+   if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+   return;
+
+   with_intel_runtime_pm(hwmon->ddat.uncore->rpm, wakeref)
+   intel_uncore_rmw(hwmon->ddat.uncore,
+hwmon->rg.pkg_rapl_limit,
+PKG_PWR_LIM_1_EN,
+old ? PKG_PWR_LIM_1_EN : 0);


3rd param should be 0 here, else we will end up clearing other bits.

Thanks,

Vinay.


+
+   mutex_unlock(&hwmon->hwmon_lock);
+}
+
  static umode_t
  hwm_energy_is_visible(const struct hwm_drvdata *ddat, u32 attr)
  {
diff --git 

Re: [Intel-gfx] [PATCH v6 02/12] drm/i915/mtl: Synchronize i915/BIOS on C6 enabling

2023-03-17 Thread Belgaumkar, Vinay



On 3/16/2023 8:43 PM, Dixit, Ashutosh wrote:

On Wed, 15 Mar 2023 18:00:51 -0700, Umesh Nerlige Ramappa wrote:

From: Vinay Belgaumkar 

Hi Vinay,


If BIOS enables/disables C6, i915 should do the same.

So MTL bios has a control for enabling/disabling C6? Both RC6 and MC6
individually or collectively?

Yes, we can toggle both independently in BIOS.


What happens if bios has disabled RC6 and i915 enables it: just that it
will bust OA?


Yes, since OA init will rely on this information.

Thanks,

Vinay.



The patch itself LGTM if the above is true, I can R-b it after I hear about
the above.

Thanks.
--
Ashutosh


Also, retain this value across driver reloads. This is needed only for
MTL as of now due to an existing bug in OA which needs C6 disabled for it
to function. BIOS behavior is also different across platforms in terms of
how C6 is enabled.

Signed-off-by: Vinay Belgaumkar 


Re: [Intel-gfx] [PATCH 3/3] drm/i915/pmu: Use common freq functions with sysfs

2023-03-07 Thread Belgaumkar, Vinay



On 3/7/2023 9:33 PM, Ashutosh Dixit wrote:

Using common freq functions with sysfs in PMU (but without taking
forcewake) solves the following issues (a) missing support for MTL (b)


For the requested_freq, we read it only if actual_freq is zero below 
(meaning, GT is in C6). So then what is the point of reading it without 
a force wake? It will also be zero, correct?


Thanks,

Vinay.


missing support for older generation (prior to Gen6) (c) missing support
for slpc when freq sampling has to fall back to requested freq. It also
makes the PMU code future proof where sometimes code has been updated for
sysfs and PMU has been missed.

Signed-off-by: Ashutosh Dixit 
---
  drivers/gpu/drm/i915/gt/intel_rps.c | 10 --
  drivers/gpu/drm/i915/gt/intel_rps.h |  1 -
  drivers/gpu/drm/i915/i915_pmu.c | 10 --
  3 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index 49df31927c0e..b03bfbe7ee23 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -2046,16 +2046,6 @@ void intel_rps_sanitize(struct intel_rps *rps)
rps_disable_interrupts(rps);
  }
  
-u32 intel_rps_read_rpstat_fw(struct intel_rps *rps)

-{
-   struct drm_i915_private *i915 = rps_to_i915(rps);
-   i915_reg_t rpstat;
-
-   rpstat = (GRAPHICS_VER(i915) >= 12) ? GEN12_RPSTAT1 : GEN6_RPSTAT1;
-
-   return intel_uncore_read_fw(rps_to_gt(rps)->uncore, rpstat);
-}
-
  u32 intel_rps_read_rpstat(struct intel_rps *rps)
  {
struct drm_i915_private *i915 = rps_to_i915(rps);
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h 
b/drivers/gpu/drm/i915/gt/intel_rps.h
index a990f985ab23..60ae27679011 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -53,7 +53,6 @@ u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
  u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
  u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
  u32 intel_rps_read_rpstat(struct intel_rps *rps);
-u32 intel_rps_read_rpstat_fw(struct intel_rps *rps);
  void gen6_rps_get_freq_caps(struct intel_rps *rps, struct intel_rps_freq_caps 
*caps);
  void intel_rps_raise_unslice(struct intel_rps *rps);
  void intel_rps_lower_unslice(struct intel_rps *rps);
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index a76c5ce9513d..1a4c9fed257c 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -392,14 +392,12 @@ frequency_sample(struct intel_gt *gt, unsigned int 
period_ns)
 * case we assume the system is running at the intended
 * frequency. Fortunately, the read should rarely fail!
 */
-   val = intel_rps_read_rpstat_fw(rps);
-   if (val)
-   val = intel_rps_get_cagf(rps, val);
-   else
-   val = rps->cur_freq;
+   val = intel_rps_read_actual_frequency_fw(rps);
+   if (!val)
+   val = intel_rps_get_requested_frequency_fw(rps),
  
  		add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],

-   intel_gpu_freq(rps, val), period_ns / 1000);
+   val, period_ns / 1000);
}
  
  	if (pmu->enable & config_mask(I915_PMU_REQUESTED_FREQUENCY)) {


Re: [Intel-gfx] [PATCH] drm/i915/gsc: Fix the Driver-FLR completion

2023-02-22 Thread Belgaumkar, Vinay



On 2/22/2023 1:01 PM, Alan Previn wrote:

The Driver-FLR flow may inadvertently exit early before the full
completion of the re-init of the internal HW state if we only poll
GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
we need a two-step completion wait-for-completion flow that also
involves GU_CNTL. See the patch and new code comments for detail.
This is new direction from HW architecture folks.

v2: - Add error message for the teardown timeout (Anshuman)
- Don't duplicate code in comments (Jani)


LGTM,

Tested-by: Vinay Belgaumkar 



Signed-off-by: Alan Previn 
Fixes: 5a44fcd73498 ("drm/i915/gsc: Do a driver-FLR on unload if GSC was 
loaded")
---
  drivers/gpu/drm/i915/intel_uncore.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index f018da7ebaac..f3c46352db89 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -2749,14 +2749,25 @@ static void driver_initiated_flr(struct intel_uncore 
*uncore)
/* Trigger the actual Driver-FLR */
intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
  
+	/* Wait for hardware teardown to complete */

+   ret = intel_wait_for_register_fw(uncore, GU_CNTL,
+DRIVERFLR_STATUS, 0,
+flr_timeout_ms);
+   if (ret) {
+   drm_err(&i915->drm, "Driver-FLR-teardown wait completion failed! %d\n", ret);
+   return;
+   }
+
+   /* Wait for hardware/firmware re-init to complete */
ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
 flr_timeout_ms);
if (ret) {
-   drm_err(&i915->drm, "wait for Driver-FLR completion failed! %d\n", ret);
+   drm_err(&i915->drm, "Driver-FLR-reinit wait completion failed! %d\n", ret);
return;
}
  
+	/* Clear sticky completion status */

intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
  }
  


Re: [Intel-gfx] [PATCH] drm/i915/mtl: Connect root sysfs entries to GT0

2023-01-16 Thread Belgaumkar, Vinay



On 1/16/2023 10:58 AM, Andi Shyti wrote:

Hi,

On Thu, Jan 12, 2023 at 08:48:11PM -0800, Belgaumkar, Vinay wrote:

On 1/12/2023 8:37 PM, Dixit, Ashutosh wrote:

On Thu, 12 Jan 2023 20:26:34 -0800, Belgaumkar, Vinay wrote:

I think the ABI was changed by the patch mentioned in the commit
(a8a4f0467d70).

The ABI was originally changed in 80cf8af17af04 and 56a709cf77468.

In theory the ABI has never changed, we just needed to agree once
and for all what to do when reading the upper level interface.
There has never been a previous multitile specification before
this change.

There have been long and exhaustive discussions on what to do and
the decision is that in some cases we show the average, in others
the maximum. Never the GT0, though.


Yes, you are right. @Andi, did we have a plan to update the IGT tests that
use these interfaces to properly refer to the per GT entries as well? They
now receive average values instead of absolute, hence will fail on a
multi-GT device.

I don't know what's the plan for igt's.

Which tests are failing? I think we shouldn't be using the upper
level interfaces at all in IGT's. Previously there has been an
error printed on dmesg when this was happening. The error has
been removed in order to set the ABI as agreed above.


Tests like perf_pmu and gem_ctx_freq will fail as they read upper level 
sysfs entries and expect them to change as per the test. I think this 
includes all of the tests that read RC6 or Turbo related sysfs entries, 
for that matter.


Thanks,

Vinay.



Andi


Re: [Intel-gfx] [PATCH] drm/i915/mtl: Connect root sysfs entries to GT0

2023-01-12 Thread Belgaumkar, Vinay



On 1/12/2023 8:37 PM, Dixit, Ashutosh wrote:

On Thu, 12 Jan 2023 20:26:34 -0800, Belgaumkar, Vinay wrote:

I think the ABI was changed by the patch mentioned in the commit
(a8a4f0467d70).

The ABI was originally changed in 80cf8af17af04 and 56a709cf77468.


Yes, you are right. @Andi, did we have a plan to update the IGT tests 
that use these interfaces to properly refer to the per GT entries as 
well? They now receive average values instead of absolute, hence will 
fail on a multi-GT device.


Thanks,

Vinay.



Re: [Intel-gfx] [PATCH] drm/i915/mtl: Connect root sysfs entries to GT0

2023-01-12 Thread Belgaumkar, Vinay



On 1/12/2023 7:15 PM, Dixit, Ashutosh wrote:

On Thu, 12 Jan 2023 18:27:52 -0800, Vinay Belgaumkar wrote:

Reading current root sysfs entries gives a min/max of all
GTs. Updating this so we return default (GT0) values when root
level sysfs entries are accessed, instead of min/max for the card.
Tests that are not multi GT capable will read incorrect sysfs
values without this change on multi-GT platforms like MTL.

Fixes: a8a4f0467d70 ("drm/i915: Fix CFI violations in gt_sysfs")

We seem to be proposing to change the previous sysfs ABI with this patch?
But even then it doesn't seem correct to use gt0 values for device level
sysfs. Actually I received the following comment about using max freq
across gt's for device level freq's (gt_act_freq_mhz etc.) from one of our
users:


I think the ABI was changed by the patch mentioned in the commit 
(a8a4f0467d70). If I am not mistaken, original behavior was to return 
the GT0 values (I will double check this).


IMO, if that patch changed the behavior, it should have been accompanied 
with patches that update all the tests to use the proper per GT sysfs as 
well.


Thanks,

Vinay.



-
On Sun, 06 Nov 2022 08:54:04 -0800, Lawson, Lowren H wrote:

Why show maximum? Wouldn’t average be more accurate to the user experience?

As a user, I expect the ‘card’ frequency to be relatively accurate to the
entire card. If I see 1.6GHz, but the card is behaving as if it’s running a
1.0 & 1.6GHz on the different compute tiles, I’m going to see a massive
decrease in compute workload performance while at ‘maximum’ frequency.
-

So I am not sure why max/min were previously chosen. Why not the average?

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH v3 1/1] drm/i915/pxp: Use drm_dbg if arb session failed due to fw version

2023-01-11 Thread Belgaumkar, Vinay



On 12/21/2022 9:49 AM, Alan Previn wrote:

If PXP arb-session is being attempted on older hardware SKUs or
on hardware with older, unsupported, firmware versions, then don't
report the failure with a drm_error. Instead, look specifically for
the API-version error reply and drm_dbg that reply. In this case, the
user-space will eventually get a -ENODEV for the protected context
creation which is the correct behavior and we don't create unnecessary
drm_error's in our dmesg (for what is unsupported platforms).


LGTM. Is there a link to where these pxp status codes are documented?

Reviewed-by: Vinay Belgaumkar 



Changes from prior revs:
v2 : - remove unnecessary newline. (Jani)
v1 : - print incorrect version from input packet, not output.

Signed-off-by: Alan Previn 
---
  drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h | 1 +
  drivers/gpu/drm/i915/pxp/intel_pxp_tee.c   | 4 
  2 files changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h
index c2f23394f9b8..aaa8187a0afb 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h
@@ -17,6 +17,7 @@
   */
  enum pxp_status {
PXP_STATUS_SUCCESS = 0x0,
+   PXP_STATUS_ERROR_API_VERSION = 0x1002,
PXP_STATUS_OP_NOT_PERMITTED = 0x4013
  };
  
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c

index d50354bfb993..73aa8015f828 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
@@ -298,6 +298,10 @@ int intel_pxp_tee_cmd_create_arb_session(struct intel_pxp 
*pxp,
  
  	if (ret)

drm_err(>drm, "Failed to send tee msg ret=[%d]\n", ret);
+   else if (msg_out.header.status == PXP_STATUS_ERROR_API_VERSION)
+   drm_dbg(>drm, "PXP firmware version unsupported, requested: 
"
+   "CMD-ID-[0x%08x] on API-Ver-[0x%08x]\n",
+   msg_in.header.command_id, msg_in.header.api_version);
else if (msg_out.header.status != 0x0)
drm_warn(>drm, "PXP firmware failed arb session init request 
ret=[0x%08x]\n",
 msg_out.header.status);

base-commit: cc44a1e87ea6b788868878295119398966f98a81


Re: [Intel-gfx] [PATCH 1/1] drm/i915/mtl: Enable Idle Messaging for GSC CS

2022-11-16 Thread Belgaumkar, Vinay



On 11/15/2022 5:44 AM, Badal Nilawar wrote:

From: Vinay Belgaumkar 

By defaut idle mesaging is disabled for GSC CS so to unblock RC6
entry on media tile idle messaging need to be enabled.

v2:
  - Fix review comments (Vinay)
  - Set GSC idle hysterisis to 5 us (Badal)

Bspec: 71496

Cc: Daniele Ceraolo Spurio 
Signed-off-by: Vinay Belgaumkar 
Signed-off-by: Badal Nilawar 
---
  drivers/gpu/drm/i915/gt/intel_engine_pm.c | 18 ++
  drivers/gpu/drm/i915/gt/intel_gt_regs.h   |  4 
  2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index b0a4a2dbe3ee..5522885b2db0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -15,6 +15,22 @@
  #include "intel_rc6.h"
  #include "intel_ring.h"
  #include "shmem_utils.h"
+#include "intel_gt_regs.h"
+
+static void intel_gsc_idle_msg_enable(struct intel_engine_cs *engine)
+{
+   struct drm_i915_private *i915 = engine->i915;
+
+   if (IS_METEORLAKE(i915) && engine->id == GSC0) {
+   intel_uncore_write(engine->gt->uncore,
+  RC_PSMI_CTRL_GSCCS,
+  _MASKED_BIT_DISABLE(IDLE_MSG_DISABLE));
+   /* 5 us hysteresis */
+   intel_uncore_write(engine->gt->uncore,
+  PWRCTX_MAXCNT_GSCCS,
+  0xA);
+   }
+}
  
  static void dbg_poison_ce(struct intel_context *ce)

  {
@@ -275,6 +291,8 @@ void intel_engine_init__pm(struct intel_engine_cs *engine)
  
	intel_wakeref_init(&engine->wakeref, rpm, &wf_ops);

intel_engine_init_heartbeat(engine);
+
+   intel_gsc_idle_msg_enable(engine);
  }
  
  /**

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 07031e03f80c..20472eb15364 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -913,6 +913,10 @@
  #define  MSG_IDLE_FW_MASK REG_GENMASK(13, 9)
  #define  MSG_IDLE_FW_SHIFT9
  
+#define	RC_PSMI_CTRL_GSCCS	_MMIO(0x11a050)


Alignment still seems off? Other than that,

Reviewed-by: Vinay Belgaumkar 


+#define  IDLE_MSG_DISABLE  BIT(0)
+#define PWRCTX_MAXCNT_GSCCS_MMIO(0x11a054)
+
  #define FORCEWAKE_MEDIA_GEN9  _MMIO(0xa270)
  #define FORCEWAKE_RENDER_GEN9 _MMIO(0xa278)
  


Re: [Intel-gfx] [PATCH v4 1/1] drm/i915/guc/slpc: Add selftest for slpc tile-tile interaction

2022-11-10 Thread Belgaumkar, Vinay



On 11/9/2022 3:25 AM, Riana Tauro wrote:

Run a workload on tiles simultaneously by requesting for RP0 frequency.
Pcode can however limit the frequency being granted due to throttling
reasons. This test checks if there is any throttling but does not fail
if RP0 is not granted due to throttle reasons

v2: Fix build error
v3: Use IS_ERR_OR_NULL to check worker
 Addressed cosmetic review comments (Tvrtko)
v4: do not skip test on media engines if gt type is GT_MEDIA.
 Use correct PERF_LIMIT_REASONS register for MTL (Vinay)


LGTM.

Reviewed-by: Vinay Belgaumkar 



Signed-off-by: Riana Tauro 
---
  drivers/gpu/drm/i915/gt/selftest_slpc.c | 70 +++--
  1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index 82ec95a299f6..bd44ce73a504 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -13,6 +13,14 @@ enum test_type {
VARY_MAX,
MAX_GRANTED,
SLPC_POWER,
+   TILE_INTERACTION,
+};
+
+struct slpc_thread {
+   struct kthread_worker *worker;
+   struct kthread_work work;
+   struct intel_gt *gt;
+   int result;
  };
  
  static int slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 freq)

@@ -212,7 +220,8 @@ static int max_granted_freq(struct intel_guc_slpc *slpc, 
struct intel_rps *rps,
*max_act_freq =  intel_rps_read_actual_frequency(rps);
if (*max_act_freq != slpc->rp0_freq) {
/* Check if there was some throttling by pcode */
-   perf_limit_reasons = intel_uncore_read(gt->uncore, 
GT0_PERF_LIMIT_REASONS);
+   perf_limit_reasons = intel_uncore_read(gt->uncore,
+  
intel_gt_perf_limit_reasons_reg(gt));
  
  		/* If not, this is an error */

if (!(perf_limit_reasons & GT0_PERF_LIMIT_REASONS_MASK)) {
@@ -310,9 +319,10 @@ static int run_test(struct intel_gt *gt, int test_type)
break;
  
  		case MAX_GRANTED:

+   case TILE_INTERACTION:
/* Media engines have a different RP0 */
-   if (engine->class == VIDEO_DECODE_CLASS ||
-   engine->class == VIDEO_ENHANCEMENT_CLASS) {
+   if (gt->type != GT_MEDIA && (engine->class == VIDEO_DECODE_CLASS ||
+       engine->class == VIDEO_ENHANCEMENT_CLASS)) {
igt_spinner_end(&spin);
st_engine_heartbeat_enable(engine);
err = 0;
@@ -335,7 +345,8 @@ static int run_test(struct intel_gt *gt, int test_type)
if (max_act_freq <= slpc->min_freq) {
pr_err("Actual freq did not rise above min\n");
pr_err("Perf Limit Reasons: 0x%x\n",
-  intel_uncore_read(gt->uncore, 
GT0_PERF_LIMIT_REASONS));
+  intel_uncore_read(gt->uncore,
+
intel_gt_perf_limit_reasons_reg(gt)));
err = -EINVAL;
}
}
@@ -426,6 +437,56 @@ static int live_slpc_power(void *arg)
return ret;
  }
  
+static void slpc_spinner_thread(struct kthread_work *work)

+{
+   struct slpc_thread *thread = container_of(work, typeof(*thread), work);
+
+   thread->result = run_test(thread->gt, TILE_INTERACTION);
+}
+
+static int live_slpc_tile_interaction(void *arg)
+{
+   struct drm_i915_private *i915 = arg;
+   struct intel_gt *gt;
+   struct slpc_thread *threads;
+   int i = 0, ret = 0;
+
+   threads = kcalloc(I915_MAX_GT, sizeof(*threads), GFP_KERNEL);
+   if (!threads)
+   return -ENOMEM;
+
+   for_each_gt(gt, i915, i) {
+   threads[i].worker = kthread_create_worker(0, 
"igt/slpc_parallel:%d", gt->info.id);
+
+   if (IS_ERR(threads[i].worker)) {
+   ret = PTR_ERR(threads[i].worker);
+   break;
+   }
+
+   threads[i].gt = gt;
+   kthread_init_work(&threads[i].work, slpc_spinner_thread);
+   kthread_queue_work(threads[i].worker, &threads[i].work);
+   }
+
+   for_each_gt(gt, i915, i) {
+   int status;
+
+   if (IS_ERR_OR_NULL(threads[i].worker))
+   continue;
+
+   kthread_flush_work(&threads[i].work);
+   status = READ_ONCE(threads[i].result);
+   if (status && !ret) {
+   pr_err("%s GT %d failed ", __func__, gt->info.id);
+   ret = status;
+   }
+   kthread_destroy_worker(threads[i].worker);
+   }
+
+   kfree(threads);
+   return ret;
+}
+
  int 

Re: [Intel-gfx] [PATCH 2/2] drm/i915/mtl: Enable Idle Messaging for GSC CS

2022-11-04 Thread Belgaumkar, Vinay



On 10/31/2022 8:36 PM, Badal Nilawar wrote:

From: Vinay Belgaumkar 

By defaut idle mesaging is disabled for GSC CS so to unblock RC6
entry on media tile idle messaging need to be enabled.

C6 entry instead of RC6. Also *needs*.


Bspec: 71496

Cc: Daniele Ceraolo Spurio 
Signed-off-by: Vinay Belgaumkar 
Signed-off-by: Badal Nilawar 
---
  drivers/gpu/drm/i915/gt/intel_engine_pm.c | 12 
  drivers/gpu/drm/i915/gt/intel_gt_regs.h   |  3 +++
  2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index b0a4a2dbe3ee..8d391f8fd861 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -15,6 +15,7 @@
  #include "intel_rc6.h"
  #include "intel_ring.h"
  #include "shmem_utils.h"
+#include "intel_gt_regs.h"
  
  static void dbg_poison_ce(struct intel_context *ce)

  {
@@ -271,10 +272,21 @@ static const struct intel_wakeref_ops wf_ops = {
  
  void intel_engine_init__pm(struct intel_engine_cs *engine)

  {
+   struct drm_i915_private *i915 = engine->i915;
struct intel_runtime_pm *rpm = engine->uncore->rpm;
  
	intel_wakeref_init(&engine->wakeref, rpm, &wf_ops);

intel_engine_init_heartbeat(engine);
+
+   if (IS_METEORLAKE(i915) && engine->id == GSC0) {
+   intel_uncore_write(engine->gt->uncore,
+  RC_PSMI_CTRL_GSCCS,
+  _MASKED_BIT_DISABLE(IDLE_MSG_DISABLE));
+   drm_dbg(&i915->drm,
+   "Set GSC CS Idle Reg to: 0x%x",
+   intel_uncore_read(engine->gt->uncore, 
RC_PSMI_CTRL_GSCCS));

Do we need the debug print here?

+   }
+
  }
  
  /**

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index f4624262dc81..176902a9f2a2 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -908,6 +908,9 @@
  #define  MSG_IDLE_FW_MASK REG_GENMASK(13, 9)
  #define  MSG_IDLE_FW_SHIFT9
  
+#define	RC_PSMI_CTRL_GSCCS	_MMIO(0x11a050)

+#define IDLE_MSG_DISABLE   BIT(0)


Is the alignment off?

Thanks,

Vinay.


+
  #define FORCEWAKE_MEDIA_GEN9  _MMIO(0xa270)
  #define FORCEWAKE_RENDER_GEN9 _MMIO(0xa278)
  


Re: [Intel-gfx] [PATCH v3 1/1] drm/i915/guc/slpc: Add selftest for slpc tile-tile interaction

2022-11-03 Thread Belgaumkar, Vinay



On 10/30/2022 10:14 PM, Riana Tauro wrote:

Run a workload on tiles simultaneously by requesting for RP0 frequency.
Pcode can however limit the frequency being granted due to throttling
reasons. This test fails if there is any throttling
It actually passes if there was throttling. It only fails if the actual 
frequency does not reach RP0 AND there were no throttle reasons.
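
In other words, the check in max_granted_freq() boils down to something like
the sketch below (register helper as in the v4 revision quoted earlier in
this archive; shown only to spell out the pass/fail condition):

	/* Throttling is tolerated: only fail when RP0 was not reached
	 * AND pcode reported no throttle reason that explains it.
	 */
	if (*max_act_freq != slpc->rp0_freq) {
		perf_limit_reasons = intel_uncore_read(gt->uncore,
						       intel_gt_perf_limit_reasons_reg(gt));
		if (!(perf_limit_reasons & GT0_PERF_LIMIT_REASONS_MASK))
			err = -EINVAL;
	}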


v2: Fix build error
v3: Use IS_ERR_OR_NULL to check worker
 Addressed cosmetic review comments (Tvrtko)

Signed-off-by: Riana Tauro 
---
  drivers/gpu/drm/i915/gt/selftest_slpc.c | 60 +
  1 file changed, 60 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index 82ec95a299f6..427e714b432b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -13,6 +13,14 @@ enum test_type {
VARY_MAX,
MAX_GRANTED,
SLPC_POWER,
+   TILE_INTERACTION,
+};
+
+struct slpc_thread {
+   struct kthread_worker *worker;
+   struct kthread_work work;
+   struct intel_gt *gt;
+   int result;
  };
  
  static int slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 freq)

@@ -310,6 +318,7 @@ static int run_test(struct intel_gt *gt, int test_type)
break;
  
  		case MAX_GRANTED:

+   case TILE_INTERACTION:
/* Media engines have a different RP0 */
if (engine->class == VIDEO_DECODE_CLASS ||
engine->class == VIDEO_ENHANCEMENT_CLASS) {


So, for MTL, this will just stop the spinner from running on the media 
tile and return 0, right? Not sure we are testing anything in that case 
(just one spinner running on the Render tile). Should we try and find 
out what media RP0 is and expect that here?


Thanks,

Vinay.


@@ -426,6 +435,56 @@ static int live_slpc_power(void *arg)
return ret;
  }
  
+static void slpc_spinner_thread(struct kthread_work *work)

+{
+   struct slpc_thread *thread = container_of(work, typeof(*thread), work);
+
+   thread->result = run_test(thread->gt, TILE_INTERACTION);
+}
+
+static int live_slpc_tile_interaction(void *arg)
+{
+   struct drm_i915_private *i915 = arg;
+   struct intel_gt *gt;
+   struct slpc_thread *threads;
+   int i = 0, ret = 0;
+
+   threads = kcalloc(I915_MAX_GT, sizeof(*threads), GFP_KERNEL);
+   if (!threads)
+   return -ENOMEM;
+
+   for_each_gt(gt, i915, i) {
+   threads[i].worker = kthread_create_worker(0, 
"igt/slpc_parallel:%d", gt->info.id);
+
+   if (IS_ERR(threads[i].worker)) {
+   ret = PTR_ERR(threads[i].worker);
+   break;
+   }
+
+   threads[i].gt = gt;
+   kthread_init_work([i].work, slpc_spinner_thread);
+   kthread_queue_work(threads[i].worker, [i].work);
+   }
+
+   for_each_gt(gt, i915, i) {
+   int status;
+
+   if (IS_ERR_OR_NULL(threads[i].worker))
+   continue;
+
+   kthread_flush_work(&threads[i].work);
+   status = READ_ONCE(threads[i].result);
+   if (status && !ret) {
+   pr_err("%s GT %d failed ", __func__, gt->info.id);
+   ret = status;
+   }
+   kthread_destroy_worker(threads[i].worker);
+   }
+
+   kfree(threads);
+   return ret;
+}
+
  int intel_slpc_live_selftests(struct drm_i915_private *i915)
  {
static const struct i915_subtest tests[] = {
@@ -433,6 +492,7 @@ int intel_slpc_live_selftests(struct drm_i915_private *i915)
SUBTEST(live_slpc_vary_min),
SUBTEST(live_slpc_max_granted),
SUBTEST(live_slpc_power),
+   SUBTEST(live_slpc_tile_interaction),
};
  
  	struct intel_gt *gt;


Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/slpc: Use platform limits for min/max frequency (rev5)

2022-10-26 Thread Belgaumkar, Vinay


On 10/26/2022 12:13 PM, Belgaumkar, Vinay wrote:

Project List - Patchwork

*From:* Patchwork 
*Sent:* Tuesday, October 25, 2022 7:39 PM
*To:* Belgaumkar, Vinay 
*Cc:* intel-gfx@lists.freedesktop.org
*Subject:* ✗ Fi.CI.IGT: failure for drm/i915/slpc: Use platform limits 
for min/max frequency (rev5)


*Patch Details*

*Series:* drm/i915/slpc: Use platform limits for min/max frequency (rev5)
*URL:* https://patchwork.freedesktop.org/series/109632/
*State:* failure
*Details:* https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/index.html


  CI Bug Log - changes from CI_DRM_12293_full -> Patchwork_109632v5_full


Summary

*FAILURE*

Serious unknown changes coming with Patchwork_109632v5_full absolutely 
need to be

verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_109632v5_full, please notify your bug team to 
allow them
to document this new failure mode, which will reduce false positives 
in CI.



Participating hosts (9 -> 11)

Additional (2): shard-rkl shard-dg1


Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_109632v5_full:



  IGT changes


Possible regressions

  * igt@gem_exec_capture@pi@vecs0:
  o shard-iclb: PASS

<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-iclb2/igt@gem_exec_capture@p...@vecs0.html>
-> INCOMPLETE

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-iclb8/igt@gem_exec_capture@p...@vecs0.html>

Not related to this change as it is not a server part.

To clarify, this patch affects the guc path, ICL does not use that. So 
failure is not related to this patch.


Thanks,

Vinay.


Thanks,

Vinay.


Suppressed

The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.

  * igt@gem_create@create-clear@smem0:
  o {shard-rkl}: NOTRUN -> INCOMPLETE

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-rkl-5/igt@gem_create@create-cl...@smem0.html>
  * igt@sysfs_preempt_timeout@idempotent@rcs0:
  o {shard-dg1}: NOTRUN -> FAIL

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-dg1-12/igt@sysfs_preempt_timeout@idempot...@rcs0.html>
+4 similar issues


Known issues

Here are the changes found in Patchwork_109632v5_full that come from 
known issues:



  IGT changes


Issues hit

  * igt@gem_ctx_exec@basic-nohangcheck:
  o shard-tglb: PASS

<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-tglb2/igt@gem_ctx_e...@basic-nohangcheck.html>
-> FAIL

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-tglb7/igt@gem_ctx_e...@basic-nohangcheck.html>
(i915#6268 <https://gitlab.freedesktop.org/drm/intel/issues/6268>)
  * igt@gem_exec_balancer@parallel:
  o shard-iclb: PASS

<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-iclb1/igt@gem_exec_balan...@parallel.html>
-> SKIP

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-iclb7/igt@gem_exec_balan...@parallel.html>
(i915#4525 <https://gitlab.freedesktop.org/drm/intel/issues/4525>)
  * igt@gem_exec_fair@basic-pace-share@rcs0:
  o shard-glk: PASS

<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-glk5/igt@gem_exec_fair@basic-pace-sh...@rcs0.html>
-> FAIL

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-glk9/igt@gem_exec_fair@basic-pace-sh...@rcs0.html>
(i915#2842 <https://gitlab.freedesktop.org/drm/intel/issues/2842>)
  * igt@gem_huc_copy@huc-copy:
  o shard-tglb: PASS

<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-tglb2/igt@gem_huc_c...@huc-copy.html>
-> SKIP

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-tglb7/igt@gem_huc_c...@huc-copy.html>
(i915#2190 <https://gitlab.freedesktop.org/drm/intel/issues/2190>)
  * igt@gem_lmem_swapping@parallel-random:
  o shard-skl: NOTRUN -> SKIP

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-skl10/igt@gem_lmem_swapp...@parallel-random.html>
(fdo#109271
<https://bugs.freedesktop.org/show_bug.cgi?id=109271> /
i915#4613 <https://gitlab.freedesktop.org/drm/intel/issues/4613>)
  * igt@kms_async_flips@alternate-sync-async-flip@pipe-a-edp-1:
  o shard-skl: PASS

<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-skl10/igt@kms_async_flips@alternate-sync-async-f...@pipe-a-edp-1.html>
-> FAIL

<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-skl4/igt@kms_async_flips@alternate-sync-async-f...@pipe-a-edp-1.html>
(

Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/slpc: Use platform limits for min/max frequency (rev5)

2022-10-26 Thread Belgaumkar, Vinay


From: Patchwork 
Sent: Tuesday, October 25, 2022 7:39 PM
To: Belgaumkar, Vinay 
Cc: intel-gfx@lists.freedesktop.org
Subject: ✗ Fi.CI.IGT: failure for drm/i915/slpc: Use platform limits for 
min/max frequency (rev5)

Patch Details
Series:
drm/i915/slpc: Use platform limits for min/max frequency (rev5)
URL:
https://patchwork.freedesktop.org/series/109632/
State:
failure
Details:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/index.html
CI Bug Log - changes from CI_DRM_12293_full -> Patchwork_109632v5_full
Summary

FAILURE

Serious unknown changes coming with Patchwork_109632v5_full absolutely need to 
be
verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_109632v5_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.

Participating hosts (9 -> 11)

Additional (2): shard-rkl shard-dg1

Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_109632v5_full:

IGT changes
Possible regressions

  *   igt@gem_exec_capture@pi@vecs0:
 *   shard-iclb: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-iclb2/igt@gem_exec_capture@p...@vecs0.html>
 -> 
INCOMPLETE<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-iclb8/igt@gem_exec_capture@p...@vecs0.html>
Not related to this change as it is not a server part.
Thanks,
Vinay.
Suppressed

The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.

  *   igt@gem_create@create-clear@smem0:
 *   {shard-rkl}: NOTRUN -> 
INCOMPLETE<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-rkl-5/igt@gem_create@create-cl...@smem0.html>
  *   igt@sysfs_preempt_timeout@idempotent@rcs0:
 *   {shard-dg1}: NOTRUN -> 
FAIL<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-dg1-12/igt@sysfs_preempt_timeout@idempot...@rcs0.html>
 +4 similar issues

Known issues

Here are the changes found in Patchwork_109632v5_full that come from known 
issues:

IGT changes
Issues hit

  *   igt@gem_ctx_exec@basic-nohangcheck:
 *   shard-tglb: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-tglb2/igt@gem_ctx_e...@basic-nohangcheck.html>
 -> 
FAIL<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-tglb7/igt@gem_ctx_e...@basic-nohangcheck.html>
 (i915#6268<https://gitlab.freedesktop.org/drm/intel/issues/6268>)
  *   igt@gem_exec_balancer@parallel:
 *   shard-iclb: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-iclb1/igt@gem_exec_balan...@parallel.html>
 -> 
SKIP<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-iclb7/igt@gem_exec_balan...@parallel.html>
 (i915#4525<https://gitlab.freedesktop.org/drm/intel/issues/4525>)
  *   igt@gem_exec_fair@basic-pace-share@rcs0:
 *   shard-glk: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-glk5/igt@gem_exec_fair@basic-pace-sh...@rcs0.html>
 -> 
FAIL<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-glk9/igt@gem_exec_fair@basic-pace-sh...@rcs0.html>
 (i915#2842<https://gitlab.freedesktop.org/drm/intel/issues/2842>)
  *   igt@gem_huc_copy@huc-copy:
 *   shard-tglb: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-tglb2/igt@gem_huc_c...@huc-copy.html>
 -> 
SKIP<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-tglb7/igt@gem_huc_c...@huc-copy.html>
 (i915#2190<https://gitlab.freedesktop.org/drm/intel/issues/2190>)
  *   igt@gem_lmem_swapping@parallel-random:
 *   shard-skl: NOTRUN -> 
SKIP<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-skl10/igt@gem_lmem_swapp...@parallel-random.html>
 (fdo#109271<https://bugs.freedesktop.org/show_bug.cgi?id=109271> / 
i915#4613<https://gitlab.freedesktop.org/drm/intel/issues/4613>)
  *   igt@kms_async_flips@alternate-sync-async-flip@pipe-a-edp-1:
 *   shard-skl: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-skl10/igt@kms_async_flips@alternate-sync-async-f...@pipe-a-edp-1.html>
 -> 
FAIL<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-skl4/igt@kms_async_flips@alternate-sync-async-f...@pipe-a-edp-1.html>
 (i915#2521<https://gitlab.freedesktop.org/drm/intel/issues/2521>)
  *   igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0-async-flip:
 *   shard-skl: NOTRUN -> 
SKIP<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109632v5/shard-skl10/igt@kms_big...@4-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html>
 (fdo#109271<https://bugs.freedesktop.org/show_bug.cgi?id=109271>) +44 similar 
issues
  *   igt@kms_big_fb@y-tiled-32bpp-rotate-180:
 *   shard-glk: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12293/shard-glk1/igt@kms_big.

Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-24 Thread Belgaumkar, Vinay



On 10/21/2022 10:26 PM, Dixit, Ashutosh wrote:

On Fri, 21 Oct 2022 18:38:57 -0700, Belgaumkar, Vinay wrote:

On 10/20/2022 3:57 PM, Dixit, Ashutosh wrote:

On Tue, 18 Oct 2022 11:30:31 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index 4c6e9257e593..e42bc215e54d 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -234,6 +234,7 @@ static int run_test(struct intel_gt *gt, int test_type)
enum intel_engine_id id;
struct igt_spinner spin;
u32 slpc_min_freq, slpc_max_freq;
+   u32 saved_min_freq;
int err = 0;

if (!intel_uc_uses_guc_slpc(&gt->uc))
@@ -252,20 +253,35 @@ static int run_test(struct intel_gt *gt, int test_type)
return -EIO;
}

-   /*
-* FIXME: With efficient frequency enabled, GuC can request
-* frequencies higher than the SLPC max. While this is fixed
-* in GuC, we level set these tests with RPn as min.
-*/
-   err = slpc_set_min_freq(slpc, slpc->min_freq);
-   if (err)
-   return err;
+   if (slpc_min_freq == slpc_max_freq) {
+   /* Server parts will have min/max clamped to RP0 */
+   if (slpc->min_is_rpmax) {
+   err = slpc_set_min_freq(slpc, slpc->min_freq);
+   if (err) {
+   pr_err("Unable to update min freq on server 
part");
+   return err;
+   }

-   if (slpc->min_freq == slpc->rp0_freq) {
-   pr_err("Min/Max are fused to the same value\n");
-   return -EINVAL;
+   } else {
+   pr_err("Min/Max are fused to the same value\n");
+   return -EINVAL;

Sorry but I am not following this else case here. Why are we saying min/max
are fused to the same value? In this case we can't do
"slpc_set_min_freq(slpc, slpc->min_freq)" ? That is, we can't change SLPC
min freq?

This would be an error case due to a faulty part. We may come across a part
where min/max is fused to the same value.

But even then the original check is much clearer since it is actually
comparing the fused freq's:

if (slpc->min_freq == slpc->rp0_freq)

Because if min/max have been changed slpc_min_freq and slpc_max_freq are no
longer fused freq.

And also this check should be right at the top of run_test, right after if
(!intel_uc_uses_guc_slpc), rather than in the middle here (otherwise
because we are basically not doing any error rewinding so causing memory
leaks if any of the functions return error).

ok.
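
A rough sketch of that reordering, using the locals already present in
run_test() (just the shape being suggested, not the final revision):

	/* Right after the intel_uc_uses_guc_slpc() check, before any setup */
	if (slpc->min_freq == slpc->rp0_freq) {
		pr_err("Min/Max are fused to the same value\n");
		return -EINVAL;
	}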



+   }
+   } else {
+   /*
+* FIXME: With efficient frequency enabled, GuC can request
+* frequencies higher than the SLPC max. While this is fixed
+* in GuC, we level set these tests with RPn as min.
+*/
+   err = slpc_set_min_freq(slpc, slpc->min_freq);
+   if (err)
+   return err;
}

So let's do what is suggested above and then see what remains here and if
we need all these code changes. Most likely we can just do unconditionally
what we were doing before, i.e.:

err = slpc_set_min_freq(slpc, slpc->min_freq);
if (err)
return err;


+   saved_min_freq = slpc_min_freq;
+
+   /* New temp min freq = RPn */
+   slpc_min_freq = slpc->min_freq;

Why do we need saved_min_freq? We can retain slpc_min_freq and in the check 
below:

if (max_act_freq <= slpc_min_freq)

We can just change the check to:

if (max_act_freq <= slpc->min_freq)

Looks like to have been a bug in the original code?
Not a bug, it wasn't needed before we had server parts 
(slpc_min_freq would typically be slpc->min_freq on non-server parts).

+
intel_gt_pm_wait_for_idle(gt);
intel_gt_pm_get(gt);
for_each_engine(engine, gt, id) {
@@ -347,7 +363,7 @@ static int run_test(struct intel_gt *gt, int test_type)

/* Restore min/max frequencies */
slpc_set_max_freq(slpc, slpc_max_freq);
-   slpc_set_min_freq(slpc, slpc_min_freq);
+   slpc_set_min_freq(slpc, saved_min_freq);

if (igt_flush_test(gt->i915))
err = -EIO;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index fdd895f73f9f..b7cdeec44bd3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -263,6 +263,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)

slpc->max_freq_softlimit = 0;
slpc->min_freq_softlimit = 0;
+   slpc->min_is_rpmax = false;

slpc->boost_freq = 0;
atomic_set(&slpc->num_waiters, 0);

Re: [Intel-gfx] [PATCH v4] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-24 Thread Belgaumkar, Vinay



On 10/22/2022 12:22 PM, Dixit, Ashutosh wrote:

On Sat, 22 Oct 2022 10:56:03 -0700, Belgaumkar, Vinay wrote:
Hi Vinay,


diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index fc23c562d9b2..32e1f5dde5bb 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq)
if (rps_uses_slpc(rps)) {
slpc = rps_to_slpc(rps);

+   if (slpc->min_freq_softlimit == slpc->boost_freq)
+   return;

nit but is it possible that 'slpc->min_freq_softlimit > slpc->boost_freq'
(looks possible to me from the code though we might not have intended it)?
Then we can change this to:

if (slpc->min_freq_softlimit >= slpc->boost_freq)
return;

Any comment about this? It looks clearly possible to me from the code.

So with the above change this is:

Reviewed-by: Ashutosh Dixit 


Agree.

Thanks,

Vinay.
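
Putting the agreed change together with the v4 hunk quoted below, the SLPC
branch of intel_rps_boost() would read roughly like this (a sketch, not the
merged code):

	if (rps_uses_slpc(rps)) {
		slpc = rps_to_slpc(rps);

		/* No need to boost if the min softlimit already covers it */
		if (slpc->min_freq_softlimit >= slpc->boost_freq)
			return;

		/* Return if old value is non zero */
		if (!atomic_fetch_inc(&slpc->num_waiters)) {
			GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
				 rq->fence.context, rq->fence.seqno);
			schedule_work(&slpc->boost_work);
		}

		return;
	}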



Re: [Intel-gfx] [PATCH v4] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-22 Thread Belgaumkar, Vinay



On 10/21/2022 7:11 PM, Dixit, Ashutosh wrote:

On Fri, 21 Oct 2022 17:24:52 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Waitboost (when SLPC is enabled) results in a H2G message. This can result
in thousands of messages during a stress test and fill up an already full
CTB. There is no need to request for RP0 if boost_freq and the min softlimit
are the same.

v2: Add the tracing back, and check requested freq
in the worker thread (Tvrtko)
v3: Check requested freq in dec_waiters as well
v4: Only check min_softlimit against boost_freq. Limit this
optimization for server parts for now.

Sorry I didn't follow. Why are we saying limit this only to server? This:

if (slpc->min_freq_softlimit == slpc->boost_freq)
return;

The condition above should work for client too if it is true? But yes it is
typically true automatically for server but not for client. Is that what
you mean?

yes. For client, min_freq_softlimit would typically be RPn.



Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index fc23c562d9b2..32e1f5dde5bb 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq)
if (rps_uses_slpc(rps)) {
slpc = rps_to_slpc(rps);

+   if (slpc->min_freq_softlimit == slpc->boost_freq)
+   return;

nit but is it possible that 'slpc->min_freq_softlimit > slpc->boost_freq'
(looks possible to me from the code though we might not have intended it)?
Then we can change this to:

if (slpc->min_freq_softlimit >= slpc->boost_freq)
return;



+
/* Return if old value is non zero */
-   if (!atomic_fetch_inc(&slpc->num_waiters))
+   if (!atomic_fetch_inc(&slpc->num_waiters)) {
+   GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
+   rq->fence.context, rq->fence.seqno);

Another possibility would have been to add the trace to slpc_boost_work but
this matches host turbo so I think it is fine here.


schedule_work(&slpc->boost_work);
+   }

return;
}

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-21 Thread Belgaumkar, Vinay



On 10/20/2022 3:57 PM, Dixit, Ashutosh wrote:

On Tue, 18 Oct 2022 11:30:31 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index 4c6e9257e593..e42bc215e54d 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -234,6 +234,7 @@ static int run_test(struct intel_gt *gt, int test_type)
enum intel_engine_id id;
struct igt_spinner spin;
u32 slpc_min_freq, slpc_max_freq;
+   u32 saved_min_freq;
int err = 0;

if (!intel_uc_uses_guc_slpc(&gt->uc))
@@ -252,20 +253,35 @@ static int run_test(struct intel_gt *gt, int test_type)
return -EIO;
}

-   /*
-* FIXME: With efficient frequency enabled, GuC can request
-* frequencies higher than the SLPC max. While this is fixed
-* in GuC, we level set these tests with RPn as min.
-*/
-   err = slpc_set_min_freq(slpc, slpc->min_freq);
-   if (err)
-   return err;
+   if (slpc_min_freq == slpc_max_freq) {
+   /* Server parts will have min/max clamped to RP0 */
+   if (slpc->min_is_rpmax) {
+   err = slpc_set_min_freq(slpc, slpc->min_freq);
+   if (err) {
+   pr_err("Unable to update min freq on server 
part");
+   return err;
+   }

-   if (slpc->min_freq == slpc->rp0_freq) {
-   pr_err("Min/Max are fused to the same value\n");
-   return -EINVAL;
+   } else {
+   pr_err("Min/Max are fused to the same value\n");
+   return -EINVAL;

Sorry but I am not following this else case here. Why are we saying min/max
are fused to the same value? In this case we can't do
"slpc_set_min_freq(slpc, slpc->min_freq)" ? That is, we can't change SLPC
min freq?
This would be an error case due to a faulty part. We may come across a 
part where min/max is fused to the same value.



diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index fdd895f73f9f..b7cdeec44bd3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -263,6 +263,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)

slpc->max_freq_softlimit = 0;
slpc->min_freq_softlimit = 0;
+   slpc->min_is_rpmax = false;

slpc->boost_freq = 0;
atomic_set(&slpc->num_waiters, 0);
@@ -588,6 +589,32 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
return 0;
  }

+static bool is_slpc_min_freq_rpmax(struct intel_guc_slpc *slpc)
+{
+   int slpc_min_freq;
+
+   if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq))
+   return false;

I am wondering what happens if the above fails on server? Should we return
true or false on server and what are the consequences of returning false on
server?

Any case I think we should at least put a drm_err or something here just in
case this ever fails so we'll know something weird happened.


Makes sense.

Thanks,

Vinay.
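
Something along these lines, as a sketch (slpc_to_gt() as used elsewhere in
the patch; the message text is made up):

	if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq)) {
		drm_err(&slpc_to_gt(slpc)->i915->drm,
			"Failed to read SLPC min freq\n");
		return false;
	}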




+
+   if (slpc_min_freq == SLPC_MAX_FREQ_MHZ)
+   return true;
+   else
+   return false;
+}
+
+static void update_server_min_softlimit(struct intel_guc_slpc *slpc)
+{
+   /* For server parts, SLPC min will be at RPMax.
+* Use min softlimit to clamp it to RP0 instead.
+*/
+   if (is_slpc_min_freq_rpmax(slpc) &&
+   !slpc->min_freq_softlimit) {
+   slpc->min_is_rpmax = true;
+   slpc->min_freq_softlimit = slpc->rp0_freq;
+   (slpc_to_gt(slpc))->defaults.min_freq = 
slpc->min_freq_softlimit;
+   }
+}
+
  static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
  {
/* Force SLPC to used platform rp0 */
@@ -647,6 +674,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)

slpc_get_rp_values(slpc);

+   /* Handle the case where min=max=RPmax */
+   update_server_min_softlimit(slpc);
+
/* Set SLPC max limit to RP0 */
ret = slpc_use_fused_rp0(slpc);
if (unlikely(ret)) {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
index 82a98f78f96c..11975a31c9d0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
@@ -9,6 +9,8 @@
  #include "intel_guc_submission.h"
  #include "intel_guc_slpc_types.h"

+#define SLPC_MAX_FREQ_MHZ 4250

This seems to really be the value 255 converted to a frequency, so it seems
ok to interpret it in MHz.
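
For reference, that number works out from the 8-bit frequency field and the
Gen9+ GT frequency unit of 50/3 MHz (the unit is an assumption here, it is
not spelled out in the patch):

	255 * 50 MHz / 3 = 4250 MHz

i.e. SLPC_MAX_FREQ_MHZ is simply the largest encodable frequency.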

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-21 Thread Belgaumkar, Vinay



On 10/21/2022 11:40 AM, Dixit, Ashutosh wrote:

On Fri, 21 Oct 2022 11:24:42 -0700, Belgaumkar, Vinay wrote:


On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote:

On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote:

On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote:

On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Waitboost (when SLPC is enabled) results in a H2G message. This can result
in thousands of messages during a stress test and fill up an already full
CTB. There is no need to request for RP0 if GuC is already requesting the
same.

But how are we sure that the freq will remain at RP0 in the future (when
the waiting request or any requests which are ahead execute)?

In the current waitboost implementation, set_param is sent to GuC ahead of
the waiting request to ensure that the freq would be max when this waiting
request executed on the GPU and the freq is kept at max till this request
retires (considering just one waiting request). How can we ensure this if
we don't send the waitboost set_param to GuC?

There is no way to guarantee the frequency will remain at RP0 till the
request retires. As a theoretical example, lets say the request boosted
freq to RP0, but a user changed min freq using sysfs immediately after.

That would be a bug. If waitboost is in progress and in the middle user
changed min freq, I would expect the freq to revert to the new min only
after the waitboost phase was over.

The problem here is that GuC is unaware of this "boosting"
phenomenon. Setting the min_freq_softlimit as well to boost when we send a
boost request might help with this issue.


In any case, I am not referring to this case. Since FW controls the freq
there is nothing preventing FW to change the freq unless we raise min to
max which is what waitboost does.

Ok, so maybe the solution here is to check if min_softlimit is already at
boost freq, as it tracks the min freq changes. That should take care of
server parts automatically as well.

Correct, yes that would be the right way to do it.


Actually, rethinking, it's not going to work for client GPUs. We cannot 
clobber the min_softlimit as the user may have set it. So, I'll just 
make this change to optimize it for server parts for now.


Thanks,

Vinay.



Thanks.
--
Ashutosh


Waitboost is done by a pending request to "hurry" the current requests. If
GT is already at boost frequency, that purpose is served.

FW can bring the freq down later before the waiting request is scheduled.

Also, host algorithm already has this optimization as well.

Host turbo is different from SLPC. Host turbo controls the freq algorithm
so it knows freq will not come down till it itself brings the freq
down. Unlike SLPC where FW is controling the freq. Therefore host turbo
doesn't ever need to do a MMIO read but only needs to refer to its own
state (rps->cur_freq etc.).

True. Host algorithm has a periodic timer where it updates frequency. Here,
it checks num_waiters and sets client_boost every time that is non-zero.

I had assumed we'll do this optimization for server parts where min is
already RP0 in which case we can completely disable waitboost. But this
patch is something else.

Hopefully the softlimit changes above will help with client and server.

Thanks,

Vinay.


Thanks.
--
Ashutosh


v2: Add the tracing back, and check requested freq
in the worker thread (Tvrtko)
v3: Check requested freq in dec_waiters as well

Signed-off-by: Vinay Belgaumkar 
---
drivers/gpu/drm/i915/gt/intel_rps.c |  3 +++
drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++---
2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index fc23c562d9b2..18b75cf08d1b 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
if (rps_uses_slpc(rps)) {
slpc = rps_to_slpc(rps);

+   GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
+rq->fence.context, rq->fence.seqno);
+
/* Return if old value is non zero */
if (!atomic_fetch_inc(&slpc->num_waiters))
schedule_work(&slpc->boost_work);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index b7cdeec44bd3..9dbdbab1515a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc 
*slpc, u32 freq)
static void slpc_boost_work(struct work_struct *work)
{
struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), 
boost_work);
+   struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
int err;

/*
 

Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC

2022-10-21 Thread Belgaumkar, Vinay



On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote:

On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote:

On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote:

On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Waitboost (when SLPC is enabled) results in a H2G message. This can result
in thousands of messages during a stress test and fill up an already full
CTB. There is no need to request for RP0 if GuC is already requesting the
same.

But how are we sure that the freq will remain at RP0 in the future (when
the waiting request or any requests which are ahead execute)?

In the current waitboost implementation, set_param is sent to GuC ahead of
the waiting request to ensure that the freq would be max when this waiting
request executed on the GPU and the freq is kept at max till this request
retires (considering just one waiting request). How can we ensure this if
we don't send the waitboost set_param to GuC?

There is no way to guarantee the frequency will remain at RP0 till the
request retires. As a theoretical example, let's say the request boosted
freq to RP0, but a user changed min freq using sysfs immediately after.

That would be a bug. If waitboost is in progress and in the middle user
changed min freq, I would expect the freq to revert to the new min only
after the waitboost phase was over.


The problem here is that GuC is unaware of this "boosting" phenomenon. 
Setting min_freq_softlimit to the boost value as well when we send a boost 
request might help with this issue.




In any case, I am not referring to this case. Since FW controls the freq
there is nothing preventing FW from changing the freq unless we raise min to
max, which is what waitboost does.
Ok, so maybe the solution here is to check if min_softlimit is already 
at boost freq, as it tracks the min freq changes. That should take care 
of server parts automatically as well.



Waitboost is done by a pending request to "hurry" the current requests. If
GT is already at boost frequency, that purpose is served.

FW can bring the freq down later before the waiting request is scheduled.

Also, host algorithm already has this optimization as well.

Host turbo is different from SLPC. Host turbo controls the freq algorithm,
so it knows the freq will not come down until it itself brings the freq
down, unlike SLPC, where FW is controlling the freq. Therefore host turbo
never needs to do an MMIO read; it only needs to refer to its own
state (rps->cur_freq etc.).
True. Host algorithm has a periodic timer where it updates frequency. 
Here, it checks num_waiters and sets client_boost every time that is 
non-zero.

I had assumed we'll do this optimization for server parts where min is
already RP0 in which case we can completely disable waitboost. But this
patch is something else.


Hopefully the softlimit changes above will help with client and server.

Thanks,

Vinay.


Thanks.
--
Ashutosh


v2: Add the tracing back, and check requested freq
in the worker thread (Tvrtko)
v3: Check requested freq in dec_waiters as well

Signed-off-by: Vinay Belgaumkar 
---
   drivers/gpu/drm/i915/gt/intel_rps.c |  3 +++
   drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++---
   2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index fc23c562d9b2..18b75cf08d1b 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
if (rps_uses_slpc(rps)) {
slpc = rps_to_slpc(rps);

+   GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
+rq->fence.context, rq->fence.seqno);
+
/* Return if old value is non zero */
if (!atomic_fetch_inc(&slpc->num_waiters))
schedule_work(&slpc->boost_work);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index b7cdeec44bd3..9dbdbab1515a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc 
*slpc, u32 freq)
   static void slpc_boost_work(struct work_struct *work)
   {
struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), 
boost_work);
+   struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
int err;

/*
 * Raise min freq to boost. It's possible that
 * this is greater than current max. But it will
 * certainly be limited by RP0. An error setting
-* the min param is not fatal.
+* the min param is not fatal. No need to boost
+* if we are already requesting it.
 */
+   if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq)
+   return;
+

Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optimize waitboost for SLPC

2022-10-20 Thread Belgaumkar, Vinay



On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote:

On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


Waitboost (when SLPC is enabled) results in a H2G message. This can result
in thousands of messages during a stress test and fill up an already full
CTB. There is no need to request for RP0 if GuC is already requesting the
same.

But how are we sure that the freq will remain at RP0 in the future (when
the waiting request or any requests which are ahead execute)?

In the current waitboost implementation, set_param is sent to GuC ahead of
the waiting request to ensure that the freq would be max when this waiting
request executed on the GPU and the freq is kept at max till this request
retires (considering just one waiting request). How can we ensure this if
we don't send the waitboost set_param to GuC?


There is no way to guarantee the frequency will remain at RP0 till the 
request retires. As a theoretical example, let's say the request boosted 
freq to RP0, but a user changed min freq using sysfs immediately after.


Waitboost is done by a pending request to "hurry" the current requests. 
If GT is already at boost frequency, that purpose is served. Also, host 
algorithm already has this optimization as well.


Thanks,

Vinay.



I had assumed we'll do this optimization for server parts where min is
already RP0 in which case we can completely disable waitboost. But this
patch is something else.

Thanks.
--
Ashutosh



v2: Add the tracing back, and check requested freq
in the worker thread (Tvrtko)
v3: Check requested freq in dec_waiters as well

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c |  3 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++---
  2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index fc23c562d9b2..18b75cf08d1b 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
if (rps_uses_slpc(rps)) {
slpc = rps_to_slpc(rps);

+   GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
+rq->fence.context, rq->fence.seqno);
+
/* Return if old value is non zero */
if (!atomic_fetch_inc(&slpc->num_waiters))
schedule_work(&slpc->boost_work);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index b7cdeec44bd3..9dbdbab1515a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc 
*slpc, u32 freq)
  static void slpc_boost_work(struct work_struct *work)
  {
struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), 
boost_work);
+   struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
int err;

/*
 * Raise min freq to boost. It's possible that
 * this is greater than current max. But it will
 * certainly be limited by RP0. An error setting
-* the min param is not fatal.
+* the min param is not fatal. No need to boost
+* if we are already requesting it.
 */
+   if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq)
+   return;
+
mutex_lock(&slpc->lock);
if (atomic_read(&slpc->num_waiters)) {
err = slpc_force_min_freq(slpc, slpc->boost_freq);
@@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc 
*slpc, u32 val)

  void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
  {
+   struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
/*
 * Return min back to the softlimit.
 * This is called during request retire,
@@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc 
*slpc)
 * set_param fails.
 */
	mutex_lock(&slpc->lock);
-	if (atomic_dec_and_test(&slpc->num_waiters))
-		slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
+	if (atomic_dec_and_test(&slpc->num_waiters)) {
+   if (intel_rps_get_requested_frequency(rps) != 
slpc->min_freq_softlimit)
+   slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
+   }
	mutex_unlock(&slpc->lock);
  }

--
2.35.1



Re: [Intel-gfx] [PATCH v2] drm/i915/slpc: Optimize waitboost for SLPC

2022-10-19 Thread Belgaumkar, Vinay



On 10/19/2022 4:05 PM, Vinay Belgaumkar wrote:

Waitboost (when SLPC is enabled) results in a H2G message. This can result
in thousands of messages during a stress test and fill up an already full
CTB. There is no need to request for RP0 if GuC is already requesting the
same.

v2: Add the tracing back, and check requested freq
in the worker thread (Tvrtko)

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 7 ++-
  2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index fc23c562d9b2..18b75cf08d1b 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
if (rps_uses_slpc(rps)) {
slpc = rps_to_slpc(rps);
  
+			GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",

+rq->fence.context, rq->fence.seqno);
+
/* Return if old value is non zero */
if (!atomic_fetch_inc(&slpc->num_waiters))


The issue when we move the requested-freq check into the SLPC worker is that 
we still increment num_waiters. That will trigger a de-boost and result in 
an H2G. We need to check the requested frequency there as well.


Thanks,

Vinay.
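
(In other words, the de-boost side needs the same guard. A sketch of what
that looks like - essentially the dec_waiters hunk that the v3 revision of
this patch carries:)

	void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
	{
		struct intel_rps *rps = &slpc_to_gt(slpc)->rps;

		mutex_lock(&slpc->lock);
		if (atomic_dec_and_test(&slpc->num_waiters)) {
			/* Skip the H2G if we already request the softlimit */
			if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit)
				slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
		}
		mutex_unlock(&slpc->lock);
	}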


schedule_work(&slpc->boost_work);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index b7cdeec44bd3..7ab96221be7e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc 
*slpc, u32 freq)
  static void slpc_boost_work(struct work_struct *work)
  {
struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), 
boost_work);
+   struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
int err;
  
  	/*

 * Raise min freq to boost. It's possible that
 * this is greater than current max. But it will
 * certainly be limited by RP0. An error setting
-* the min param is not fatal.
+* the min param is not fatal. No need to boost
+* if we are already requesting it.
 */
+   if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq)
+   return;
+
mutex_lock(&slpc->lock);
if (atomic_read(&slpc->num_waiters)) {
err = slpc_force_min_freq(slpc, slpc->boost_freq);


Re: [Intel-gfx] [PATCH] drm/i915/slpc: Optimize waitboost for SLPC

2022-10-19 Thread Belgaumkar, Vinay



On 10/19/2022 2:12 PM, Belgaumkar, Vinay wrote:


On 10/19/2022 12:40 AM, Tvrtko Ursulin wrote:


On 18/10/2022 23:15, Vinay Belgaumkar wrote:
Waitboost (when SLPC is enabled) results in a H2G message. This can 
result
in thousands of messages during a stress test and fill up an already 
full
CTB. There is no need to request for RP0 if GuC is already 
requesting the

same.

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c

index fc23c562d9b2..a20ae4fceac8 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1005,13 +1005,20 @@ void intel_rps_dec_waiters(struct intel_rps 
*rps)

  void intel_rps_boost(struct i915_request *rq)
  {
  struct intel_guc_slpc *slpc;
+    struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;
    if (i915_request_signaled(rq) || 
i915_request_has_waitboost(rq))

  return;
  +    /* If GuC is already requesting RP0, skip */
+    if (rps_uses_slpc(rps)) {
+    slpc = rps_to_slpc(rps);
+    if (intel_rps_get_requested_frequency(rps) == slpc->rp0_freq)

One correction here is this should be slpc->boost_freq.

+    return;
+    }
+


Feels a little bit like a layering violation. Wait boost reference 
counts and request markings will changed based on asynchronous state 
- a mmio read.


Also, a little below we have this:

"""
/* Serializes with i915_request_retire() */
if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, >fence.flags)) {
    struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;

    if (rps_uses_slpc(rps)) {
    slpc = rps_to_slpc(rps);

    /* Return if old value is non zero */
    if (!atomic_fetch_inc(&slpc->num_waiters))

***>>>> Wouldn't it skip doing anything here already? <<<<***
It will skip only if boost is already happening. This patch is trying 
to prevent even that first one if possible.


    schedule_work(&slpc->boost_work);

    return;
    }

    if (atomic_fetch_inc(&rps->num_waiters))
    return;
"""

But I wonder if this is not a layering violation already. Looks like 
one for me at the moment. And as it happens there is an ongoing debug 
of clvk slowness where I was a bit puzzled by the lack of "boost 
fence" in trace_printk logs - but now I see how that happens. Does 
not feel right to me that we lose that tracing with SLPC.
Agreed. Will add the trace to the SLPC case as well. However, the 
question is what does that trace indicate? Even in the host case, we 
log the trace but may skip the actual boost because the requested freq 
already matches the boost freq. IMO, we should log the trace only when we 
actually decide to boost.

On second thought, that trace only tracks the boost fence, which is set 
in this case. So it might be OK to have it regardless. We count 
num_boosts anyway if we ever want to know how many of those actually 
went on to boost the freq.


So in general - why wouldn't the correct approach be to solve this in 
the worker - which perhaps should fork to an SLPC-specific branch and do 
the consolidation/skips based on MMIO reads in there?


sure, I can move the mmio read to the SLPC worker thread.

Thanks,

Vinay.



Regards,

Tvrtko


  /* Serializes with i915_request_retire() */
  if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, >fence.flags)) {
-    struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;
    if (rps_uses_slpc(rps)) {
  slpc = rps_to_slpc(rps);


Re: [Intel-gfx] [PATCH] drm/i915/slpc: Optimize waitboost for SLPC

2022-10-19 Thread Belgaumkar, Vinay



On 10/19/2022 12:40 AM, Tvrtko Ursulin wrote:


On 18/10/2022 23:15, Vinay Belgaumkar wrote:
Waitboost (when SLPC is enabled) results in a H2G message. This can 
result
in thousands of messages during a stress test and fill up an already 
full
CTB. There is no need to request for RP0 if GuC is already requesting 
the

same.

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c

index fc23c562d9b2..a20ae4fceac8 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1005,13 +1005,20 @@ void intel_rps_dec_waiters(struct intel_rps 
*rps)

  void intel_rps_boost(struct i915_request *rq)
  {
  struct intel_guc_slpc *slpc;
+    struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;
    if (i915_request_signaled(rq) || i915_request_has_waitboost(rq))
  return;
  +    /* If GuC is already requesting RP0, skip */
+    if (rps_uses_slpc(rps)) {
+    slpc = rps_to_slpc(rps);
+    if (intel_rps_get_requested_frequency(rps) == slpc->rp0_freq)

One correction here is this should be slpc->boost_freq.

+    return;
+    }
+


Feels a little bit like a layering violation. Wait boost reference 
counts and request markings will changed based on asynchronous state - 
a mmio read.


Also, a little below we have this:

"""
/* Serializes with i915_request_retire() */
if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, >fence.flags)) {
    struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;

    if (rps_uses_slpc(rps)) {
    slpc = rps_to_slpc(rps);

    /* Return if old value is non zero */
    if (!atomic_fetch_inc(&slpc->num_waiters))

*** Wouldn't it skip doing anything here already? ***
It will skip only if boost is already happening. This patch is trying to 
prevent even that first one if possible.


    schedule_work(&slpc->boost_work);

    return;
    }

    if (atomic_fetch_inc(&rps->num_waiters))
    return;
"""

But I wonder if this is not a layering violation already. Looks like 
one for me at the moment. And as it happens there is an ongoing debug 
of clvk slowness where I was a bit puzzled by the lack of "boost 
fence" in trace_printk logs - but now I see how that happens. Does not 
feel right to me that we lose that tracing with SLPC.
Agreed. Will add the trace to the SLPC case as well. However, the 
question is what does that trace indicate? Even in the host case, we log 
the trace but may skip the actual boost because the requested freq already 
matches the boost freq. IMO, we should log the trace only when we actually 
decide to boost.


So in general - why wouldn't the correct approach be to solve this in 
the worker - which perhaps should fork to an SLPC-specific branch and do 
the consolidation/skips based on MMIO reads in there?


sure, I can move the mmio read to the SLPC worker thread.

Thanks,

Vinay.



Regards,

Tvrtko


  /* Serializes with i915_request_retire() */
  if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, >fence.flags)) {
-    struct intel_rps *rps = _ONCE(rq->engine)->gt->rps;
    if (rps_uses_slpc(rps)) {
  slpc = rps_to_slpc(rps);


Re: [Intel-gfx] [PATCH v2] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-13 Thread Belgaumkar, Vinay



On 10/13/2022 3:28 PM, Dixit, Ashutosh wrote:

On Thu, 13 Oct 2022 08:55:24 -0700, Vinay Belgaumkar wrote:
Hi Vinay,


GuC will set the min/max frequencies to theoretical max on
ATS-M. This will break kernel ABI, so limit min/max frequency
to RP0(platform max) instead.

Isn't what we are calling "theoretical max" or "RPmax" really just -1U
(0xffffffff)? Though I have heard this is not a max value but -1U indicates
FW default values unmodified by host SW, which would mean frequencies are
fully controlled by FW (min == max == -1U). But if this were the case I
don't know why this would be the case only for server, why doesn't FW set
these for clients too to indicate it is fully in control?
FW sets max to -1U for client products (we already pull it down to RP0). 
It additionally makes min=max for server parts.


So the question what does -1U actually represent? Is it the RPmax value or
does -1U represent "FW defaults"?

Also this concept of using -1U as "FW defaults" is present in Level0/OneAPI
(and likely in firmware) but we seem to have blocked in the i915 ABI.

I understand we may not be able to make such changes at present but this
provides some context for the review comments below.


diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index fdd895f73f9f..11613d373a49 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -263,6 +263,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)

slpc->max_freq_softlimit = 0;
slpc->min_freq_softlimit = 0;
+   slpc->min_is_rpmax = false;

slpc->boost_freq = 0;
	atomic_set(&slpc->num_waiters, 0);
@@ -588,6 +589,31 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
return 0;
  }

+static bool is_slpc_min_freq_rpmax(struct intel_guc_slpc *slpc)
+{
+   int slpc_min_freq;
+
+   if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq))
+   return false;
+
+   if (slpc_min_freq > slpc->rp0_freq)
or >=.

If what we are calling "rpmax" really -1U then why don't we just check for
-1U here?

u32 slpc_min_freq;

if (slpc_min_freq == -1U)
That'll work similarly too. The only time slpc_min_freq is greater than RP0 
is for a server part.
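
(A sketch of the sentinel-based variant being discussed - assuming -1U really
is the firmware's "defaults" indicator, which is the open question here:)

	static bool is_slpc_min_freq_rpmax(struct intel_guc_slpc *slpc)
	{
		u32 slpc_min_freq;

		if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq))
			return false;

		/* Assumption: -1U marks "FW defaults" / server behaviour */
		return slpc_min_freq == -1U;
	}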



+   return true;
+   else
+   return false;
+}
+
+static void update_server_min_softlimit(struct intel_guc_slpc *slpc)
+{
+   /* For server parts, SLPC min will be at RPMax.
+* Use min softlimit to clamp it to RP0 instead.
+*/
+   if (is_slpc_min_freq_rpmax(slpc) &&
+   !slpc->min_freq_softlimit) {
+   slpc->min_is_rpmax = true;
+   slpc->min_freq_softlimit = slpc->rp0_freq;

Isn't it safer to use a platform check such as IS_ATSM or IS_XEHPSDV (or
even #define IS_SERVER()) to set min freq to RP0 rather than this -1U value
from FW? What if -1U means "FW defaults" and FW starts setting this on
client products tomorrow?


We are not checking for -1 specifically, but only if FW has set min > 
RP0 as an indicator. Also, might be worth having IS_SERVER at some point 
if there are other places we need this info as well.




Also, we need to set gt->defaults.min_freq here.


yes, need to add that.

Thanks,

Vinay.
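
(For completeness, a sketch of the extra assignment being asked for, folded
into the clamp; gt->defaults.min_freq is the field named in the review, and
the exact placement is an assumption:)

	if (is_slpc_min_freq_rpmax(slpc) && !slpc->min_freq_softlimit) {
		slpc->min_is_rpmax = true;
		slpc->min_freq_softlimit = slpc->rp0_freq;
		/* Also record the clamped value as the default min */
		slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
	}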



Thanks.
--
Ashutosh



+   }
+}
+
  static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
  {
/* Force SLPC to used platform rp0 */
@@ -647,6 +673,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)

slpc_get_rp_values(slpc);

+   /* Handle the case where min=max=RPmax */
+   update_server_min_softlimit(slpc);
+
/* Set SLPC max limit to RP0 */
ret = slpc_use_fused_rp0(slpc);
if (unlikely(ret)) {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
index 73d208123528..a6ef53b04e04 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
@@ -19,6 +19,9 @@ struct intel_guc_slpc {
bool supported;
bool selected;

+   /* Indicates this is a server part */
+   bool min_is_rpmax;
+
/* platform frequency limits */
u32 min_freq;
u32 rp0_freq;
--
2.35.1



Re: [Intel-gfx] [PATCH] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-13 Thread Belgaumkar, Vinay



On 10/13/2022 8:14 AM, Das, Nirmoy wrote:


On 10/12/2022 8:26 PM, Vinay Belgaumkar wrote:

GuC will set the min/max frequencies to theoretical max on
ATS-M. This will break kernel ABI, so limit min/max frequency
to RP0(platform max) instead.

Also modify the SLPC selftest to update the min frequency
when we have a server part so that we can iterate between
platform min and max.

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/selftest_slpc.c   | 40 +--
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   | 29 ++
  .../gpu/drm/i915/gt/uc/intel_guc_slpc_types.h |  3 ++
  3 files changed, 60 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c

index 4c6e9257e593..1f84362af737 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -234,6 +234,7 @@ static int run_test(struct intel_gt *gt, int 
test_type)

  enum intel_engine_id id;
  struct igt_spinner spin;
  u32 slpc_min_freq, slpc_max_freq;
+    u32 saved_min_freq;
  int err = 0;
  if (!intel_uc_uses_guc_slpc(&gt->uc))
@@ -252,20 +253,35 @@ static int run_test(struct intel_gt *gt, int 
test_type)

  return -EIO;
  }
  -    /*
- * FIXME: With efficient frequency enabled, GuC can request
- * frequencies higher than the SLPC max. While this is fixed
- * in GuC, we level set these tests with RPn as min.
- */
-    err = slpc_set_min_freq(slpc, slpc->min_freq);
-    if (err)
-    return err;
-
  if (slpc->min_freq == slpc->rp0_freq) {
-    pr_err("Min/Max are fused to the same value\n");
-    return -EINVAL;
+    /* Servers will have min/max clamped to RP0 */



This should be "server parts". Tested the patch with Riana's suggested 
changes.


Acked-by: Nirmoy Das  with above changes.


Thanks, v2 sent with corrections.

Vinay.




Nirmoy


+    if (slpc->min_is_rpmax) {
+    err = slpc_set_min_freq(slpc, slpc->min_freq);
+    if (err) {
+    pr_err("Unable to update min freq on server part");
+    return err;
+    }
+
+    } else {
+    pr_err("Min/Max are fused to the same value\n");
+    return -EINVAL;
+    }
+    } else {
+    /*
+ * FIXME: With efficient frequency enabled, GuC can request
+ * frequencies higher than the SLPC max. While this is fixed
+ * in GuC, we level set these tests with RPn as min.
+ */
+    err = slpc_set_min_freq(slpc, slpc->min_freq);
+    if (err)
+    return err;
  }
  +    saved_min_freq = slpc_min_freq;
+
+    /* New temp min freq = RPn */
+    slpc_min_freq = slpc->min_freq;
+
  intel_gt_pm_wait_for_idle(gt);
  intel_gt_pm_get(gt);
  for_each_engine(engine, gt, id) {
@@ -347,7 +363,7 @@ static int run_test(struct intel_gt *gt, int 
test_type)

    /* Restore min/max frequencies */
  slpc_set_max_freq(slpc, slpc_max_freq);
-    slpc_set_min_freq(slpc, slpc_min_freq);
+    slpc_set_min_freq(slpc, saved_min_freq);
    if (igt_flush_test(gt->i915))
  err = -EIO;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c

index fdd895f73f9f..11613d373a49 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -263,6 +263,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
    slpc->max_freq_softlimit = 0;
  slpc->min_freq_softlimit = 0;
+    slpc->min_is_rpmax = false;
    slpc->boost_freq = 0;
  atomic_set(&slpc->num_waiters, 0);
@@ -588,6 +589,31 @@ static int slpc_set_softlimits(struct 
intel_guc_slpc *slpc)

  return 0;
  }
  +static bool is_slpc_min_freq_rpmax(struct intel_guc_slpc *slpc)
+{
+    int slpc_min_freq;
+
+    if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq))
+    return false;
+
+    if (slpc_min_freq > slpc->rp0_freq)
+    return true;
+    else
+    return false;
+}
+
+static void update_server_min_softlimit(struct intel_guc_slpc *slpc)
+{
+    /* For server parts, SLPC min will be at RPMax.
+ * Use min softlimit to clamp it to RP0 instead.
+ */
+    if (is_slpc_min_freq_rpmax(slpc) &&
+    !slpc->min_freq_softlimit) {
+    slpc->min_is_rpmax = true;
+    slpc->min_freq_softlimit = slpc->rp0_freq;
+    }
+}
+
  static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
  {
  /* Force SLPC to used platform rp0 */
@@ -647,6 +673,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc 
*slpc)

    slpc_get_rp_values(slpc);
  +    /* Handle the case where min=max=RPmax */
+    update_server_min_softlimit(slpc);
+
  /* Set SLPC max limit to RP0 */
  ret = slpc_use_fused_rp0(slpc);
  if (unlikely(ret)) {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h

index 73d208123528..a6ef53b04e04 100644
--- 

Re: [Intel-gfx] [PATCH] drm/i915/slpc: Use platform limits for min/max frequency

2022-10-13 Thread Belgaumkar, Vinay



On 10/13/2022 4:34 AM, Tauro, Riana wrote:



On 10/12/2022 11:56 PM, Vinay Belgaumkar wrote:

GuC will set the min/max frequencies to theoretical max on
ATS-M. This will break kernel ABI, so limit min/max frequency
to RP0(platform max) instead.

Also modify the SLPC selftest to update the min frequency
when we have a server part so that we can iterate between
platform min and max.

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/selftest_slpc.c   | 40 +--
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c   | 29 ++
  .../gpu/drm/i915/gt/uc/intel_guc_slpc_types.h |  3 ++
  3 files changed, 60 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c

index 4c6e9257e593..1f84362af737 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -234,6 +234,7 @@ static int run_test(struct intel_gt *gt, int 
test_type)

  enum intel_engine_id id;
  struct igt_spinner spin;
  u32 slpc_min_freq, slpc_max_freq;
+    u32 saved_min_freq;
  int err = 0;
  if (!intel_uc_uses_guc_slpc(&gt->uc))
@@ -252,20 +253,35 @@ static int run_test(struct intel_gt *gt, int 
test_type)

  return -EIO;
  }
  -    /*
- * FIXME: With efficient frequency enabled, GuC can request
- * frequencies higher than the SLPC max. While this is fixed
- * in GuC, we level set these tests with RPn as min.
- */
-    err = slpc_set_min_freq(slpc, slpc->min_freq);
-    if (err)
-    return err;
-
  if (slpc->min_freq == slpc->rp0_freq) {

This has to be (slpc_min_freq == slpc_max_freq) instead of
(slpc->min_freq == slpc->rp0_freq).

Servers will have min/max softlimits clamped to RP0


Agree. will send out v2.

Thanks,

Vinay.
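
(A sketch of the corrected check for the selftest, per the comment above;
illustrative only:)

	/* Compare the effective soft limits rather than the fused values */
	if (slpc_min_freq == slpc_max_freq) {
		if (!slpc->min_is_rpmax) {
			pr_err("Min/Max are fused to the same value\n");
			return -EINVAL;
		}

		/* Server part: temporarily drop min back to RPn for the test */
		err = slpc_set_min_freq(slpc, slpc->min_freq);
		if (err) {
			pr_err("Unable to update min freq on server part");
			return err;
		}
	}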



Thanks
Riana

-    pr_err("Min/Max are fused to the same value\n");
-    return -EINVAL;
+    /* Servers will have min/max clamped to RP0 */
+    if (slpc->min_is_rpmax) {
+    err = slpc_set_min_freq(slpc, slpc->min_freq);
+    if (err) {
+    pr_err("Unable to update min freq on server part");
+    return err;
+    }
+
+    } else {
+    pr_err("Min/Max are fused to the same value\n");
+    return -EINVAL;
+    }
+    } else {
+    /*
+ * FIXME: With efficient frequency enabled, GuC can request
+ * frequencies higher than the SLPC max. While this is fixed
+ * in GuC, we level set these tests with RPn as min.
+ */
+    err = slpc_set_min_freq(slpc, slpc->min_freq);
+    if (err)
+    return err;
  }
  +    saved_min_freq = slpc_min_freq;
+
+    /* New temp min freq = RPn */
+    slpc_min_freq = slpc->min_freq;
+
  intel_gt_pm_wait_for_idle(gt);
  intel_gt_pm_get(gt);
  for_each_engine(engine, gt, id) {
@@ -347,7 +363,7 @@ static int run_test(struct intel_gt *gt, int 
test_type)

    /* Restore min/max frequencies */
  slpc_set_max_freq(slpc, slpc_max_freq);
-    slpc_set_min_freq(slpc, slpc_min_freq);
+    slpc_set_min_freq(slpc, saved_min_freq);
    if (igt_flush_test(gt->i915))
  err = -EIO;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c

index fdd895f73f9f..11613d373a49 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -263,6 +263,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
    slpc->max_freq_softlimit = 0;
  slpc->min_freq_softlimit = 0;
+    slpc->min_is_rpmax = false;
    slpc->boost_freq = 0;
  atomic_set(&slpc->num_waiters, 0);
@@ -588,6 +589,31 @@ static int slpc_set_softlimits(struct 
intel_guc_slpc *slpc)

  return 0;
  }
  +static bool is_slpc_min_freq_rpmax(struct intel_guc_slpc *slpc)
+{
+    int slpc_min_freq;
+
+    if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq))
+    return false;
+
+    if (slpc_min_freq > slpc->rp0_freq)
+    return true;
+    else
+    return false;
+}
+
+static void update_server_min_softlimit(struct intel_guc_slpc *slpc)
+{
+    /* For server parts, SLPC min will be at RPMax.
+ * Use min softlimit to clamp it to RP0 instead.
+ */
+    if (is_slpc_min_freq_rpmax(slpc) &&
+    !slpc->min_freq_softlimit) {
+    slpc->min_is_rpmax = true;
+    slpc->min_freq_softlimit = slpc->rp0_freq;
+    }
+}
+
  static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
  {
  /* Force SLPC to used platform rp0 */
@@ -647,6 +673,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc 
*slpc)

    slpc_get_rp_values(slpc);
  +    /* Handle the case where min=max=RPmax */
+    update_server_min_softlimit(slpc);
+
  /* Set SLPC max limit to RP0 */
  ret = slpc_use_fused_rp0(slpc);
  if (unlikely(ret)) {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h

index 

Re: [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915/slpc: Update frequency debugfs for SLPC (rev3)

2022-10-04 Thread Belgaumkar, Vinay



On 10/4/2022 4:33 PM, Patchwork wrote:

== Series Details ==

Series: drm/i915/slpc: Update frequency debugfs for SLPC (rev3)
URL   : https://patchwork.freedesktop.org/series/109328/
State : failure

== Summary ==

Error: make failed
   CALL    scripts/checksyscalls.sh
   CALL    scripts/atomic/check-atomics.sh
   DESCEND objtool
   CHK include/generated/compile.h
   CC [M]  drivers/gpu/drm/i915/gt/intel_rps.o
drivers/gpu/drm/i915/gt/intel_rps.c::6: error: no previous prototype for 
‘rps_frequency_dump’ [-Werror=missing-prototypes]
  void rps_frequency_dump(struct intel_rps *rps, struct drm_printer *p)


Forgot to use static for this function definition, will send out another 
version.


Thanks,

Vinay.
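
(The follow-up is just the missing storage-class specifier on the helper, i.e.:)

	/* File-local helper, so no prototype is needed */
	static void rps_frequency_dump(struct intel_rps *rps, struct drm_printer *p)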


   ^~
cc1: all warnings being treated as errors
scripts/Makefile.build:249: recipe for target 
'drivers/gpu/drm/i915/gt/intel_rps.o' failed
make[4]: *** [drivers/gpu/drm/i915/gt/intel_rps.o] Error 1
scripts/Makefile.build:465: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:465: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:465: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1852: recipe for target 'drivers' failed
make: *** [drivers] Error 2




Re: [Intel-gfx] [PATCH 1/2] drm/i915: Add a wrapper for frequency debugfs

2022-10-04 Thread Belgaumkar, Vinay



On 10/4/2022 12:36 AM, Jani Nikula wrote:

On Mon, 03 Oct 2022, Vinay Belgaumkar  wrote:

Move it to the RPS source file.

The idea was that the 1st patch would be non-functional code
movement. This is still a functional change.

Or you can do the functional changes first, and then move code, as long
as you don't combine code movement with functional changes.

Yup, will move the SLPC check to the second patch as well.


Please also mark your patch revisions and note the changes. There's no
indication this series is v2.


ok.

Thanks,

Vinay.



BR,
Jani.


Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c | 157 +---
  drivers/gpu/drm/i915/gt/intel_rps.c   | 169 ++
  drivers/gpu/drm/i915/gt/intel_rps.h   |   3 +
  3 files changed, 173 insertions(+), 156 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c 
b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
index 9fd4d9255a97..4319d6cdafe2 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
@@ -344,162 +344,7 @@ void intel_gt_pm_frequency_dump(struct intel_gt *gt, 
struct drm_printer *p)
drm_printf(p, "efficient (RPe) frequency: %d MHz\n",
   intel_gpu_freq(rps, rps->efficient_freq));
} else if (GRAPHICS_VER(i915) >= 6) {
-   u32 rp_state_limits;
-   u32 gt_perf_status;
-   struct intel_rps_freq_caps caps;
-   u32 rpmodectl, rpinclimit, rpdeclimit;
-   u32 rpstat, cagf, reqf;
-   u32 rpcurupei, rpcurup, rpprevup;
-   u32 rpcurdownei, rpcurdown, rpprevdown;
-   u32 rpupei, rpupt, rpdownei, rpdownt;
-   u32 pm_ier, pm_imr, pm_isr, pm_iir, pm_mask;
-
-   rp_state_limits = intel_uncore_read(uncore, 
GEN6_RP_STATE_LIMITS);
-   gen6_rps_get_freq_caps(rps, &caps);
-   if (IS_GEN9_LP(i915))
-   gt_perf_status = intel_uncore_read(uncore, 
BXT_GT_PERF_STATUS);
-   else
-   gt_perf_status = intel_uncore_read(uncore, 
GEN6_GT_PERF_STATUS);
-
-   /* RPSTAT1 is in the GT power well */
-   intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
-
-   reqf = intel_uncore_read(uncore, GEN6_RPNSWREQ);
-   if (GRAPHICS_VER(i915) >= 9) {
-   reqf >>= 23;
-   } else {
-   reqf &= ~GEN6_TURBO_DISABLE;
-   if (IS_HASWELL(i915) || IS_BROADWELL(i915))
-   reqf >>= 24;
-   else
-   reqf >>= 25;
-   }
-   reqf = intel_gpu_freq(rps, reqf);
-
-   rpmodectl = intel_uncore_read(uncore, GEN6_RP_CONTROL);
-   rpinclimit = intel_uncore_read(uncore, GEN6_RP_UP_THRESHOLD);
-   rpdeclimit = intel_uncore_read(uncore, GEN6_RP_DOWN_THRESHOLD);
-
-   rpstat = intel_uncore_read(uncore, GEN6_RPSTAT1);
-   rpcurupei = intel_uncore_read(uncore, GEN6_RP_CUR_UP_EI) & 
GEN6_CURICONT_MASK;
-   rpcurup = intel_uncore_read(uncore, GEN6_RP_CUR_UP) & 
GEN6_CURBSYTAVG_MASK;
-   rpprevup = intel_uncore_read(uncore, GEN6_RP_PREV_UP) & 
GEN6_CURBSYTAVG_MASK;
-   rpcurdownei = intel_uncore_read(uncore, GEN6_RP_CUR_DOWN_EI) & 
GEN6_CURIAVG_MASK;
-   rpcurdown = intel_uncore_read(uncore, GEN6_RP_CUR_DOWN) & 
GEN6_CURBSYTAVG_MASK;
-   rpprevdown = intel_uncore_read(uncore, GEN6_RP_PREV_DOWN) & 
GEN6_CURBSYTAVG_MASK;
-
-   rpupei = intel_uncore_read(uncore, GEN6_RP_UP_EI);
-   rpupt = intel_uncore_read(uncore, GEN6_RP_UP_THRESHOLD);
-
-   rpdownei = intel_uncore_read(uncore, GEN6_RP_DOWN_EI);
-   rpdownt = intel_uncore_read(uncore, GEN6_RP_DOWN_THRESHOLD);
-
-   cagf = intel_rps_read_actual_frequency(rps);
-
-   intel_uncore_forcewake_put(uncore, FORCEWAKE_ALL);
-
-   if (GRAPHICS_VER(i915) >= 11) {
-   pm_ier = intel_uncore_read(uncore, 
GEN11_GPM_WGBOXPERF_INTR_ENABLE);
-   pm_imr = intel_uncore_read(uncore, 
GEN11_GPM_WGBOXPERF_INTR_MASK);
-   /*
-* The equivalent to the PM ISR & IIR cannot be read
-* without affecting the current state of the system
-*/
-   pm_isr = 0;
-   pm_iir = 0;
-   } else if (GRAPHICS_VER(i915) >= 8) {
-   pm_ier = intel_uncore_read(uncore, GEN8_GT_IER(2));
-   pm_imr = intel_uncore_read(uncore, GEN8_GT_IMR(2));
-   pm_isr = intel_uncore_read(uncore, GEN8_GT_ISR(2));
-   pm_iir = intel_uncore_read(uncore, GEN8_GT_IIR(2));
- 

Re: [Intel-gfx] [PATCH v2 14/15] drm/i915/guc: Support OA when Wa_16011777198 is enabled

2022-09-26 Thread Belgaumkar, Vinay



On 9/26/2022 11:19 AM, Umesh Nerlige Ramappa wrote:

On Mon, Sep 26, 2022 at 08:56:01AM -0700, Dixit, Ashutosh wrote:

On Fri, 23 Sep 2022 13:11:53 -0700, Umesh Nerlige Ramappa wrote:


From: Vinay Belgaumkar 


Hi Umesh/Vinay,

@@ -3254,6 +3265,24 @@ static int i915_oa_stream_init(struct 
i915_perf_stream *stream,

intel_engine_pm_get(stream->engine);
intel_uncore_forcewake_get(stream->uncore, FORCEWAKE_ALL);

+    /*
+ * Wa_16011777198:dg2: GuC resets render as part of the Wa. 
This causes
+ * OA to lose the configuration state. Prevent this by 
overriding GUCRC

+ * mode.
+ */
+    if (intel_guc_slpc_is_used(&gt->uc.guc) &&
+    intel_uc_uses_guc_rc(&gt->uc) &&


Is this condition above correct? E.g. what happens when:

a. GuCRC is used but SLPC is not used?


Currently, we have no way to separate out GuC RC and SLPC. Both are on 
when GuC submission is enabled/supported. So, the above check is somewhat 
redundant anyway.


Thanks,

Vinay.
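
(Given that, the condition could arguably collapse to a single uses_guc_rc()
check - a sketch only, not what the posted patch does:)

	/* Sketch: GuC RC implies SLPC today, so one check suffices */
	if (intel_uc_uses_guc_rc(&gt->uc) &&
	    (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) ||
	     IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))) {
		ret = intel_guc_slpc_override_gucrc_mode(&gt->uc.guc.slpc,
							 SLPC_GUCRC_MODE_GUCRC_NO_RC6);
		if (ret) {
			drm_dbg(&stream->perf->i915->drm,
				"Unable to override gucrc mode\n");
			goto err_config;
		}
	}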


b. GuCRC is not used. Don't we need to disable RC6 in host based RC6
  control?


When using host-based RC6, the existing OA code uses forcewake and a 
reference to engine_pm to prevent RC6. Directing the other questions to 
@Vinay.


Thanks,
Umesh



Do we need to worry about these cases?

Or if we always expect both GuCRC and SLPC to be used on DG2 then I 
think
let's get rid of these from the if condition and add a drm_err() if 
we see

these not being used and OA is being enabled on DG2?

Thanks.
--
Ashutosh


+ (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) ||
+ IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))) {
+    ret = intel_guc_slpc_override_gucrc_mode(&gt->uc.guc.slpc,
+ SLPC_GUCRC_MODE_GUCRC_NO_RC6);
+    if (ret) {
+    drm_dbg(&stream->perf->i915->drm,
+    "Unable to override gucrc mode\n");
+    goto err_config;
+    }
+    }
+
ret = alloc_oa_buffer(stream);
if (ret)
    goto err_oa_buf_alloc;
--
2.25.1



Re: [Intel-gfx] [PATCH 3/3] drm/i915/guc/slpc: Add SLPC selftest live_slpc_power

2022-09-26 Thread Belgaumkar, Vinay



On 9/23/2022 4:00 AM, Riana Tauro wrote:

A fundamental assumption is that at lower frequencies,
not only do we run slower, but we save power compared to
higher frequencies.
live_slpc_power checks if running at low frequency saves power

v2: re-use code to measure power
 fixed cosmetic review comments (Vinay)

Signed-off-by: Riana Tauro 


LGTM,

Reviewed-by: Vinay Belgaumkar 


---
  drivers/gpu/drm/i915/gt/selftest_slpc.c | 127 ++--
  1 file changed, 118 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index 928f74718881..4c6e9257e593 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -11,7 +11,8 @@
  enum test_type {
VARY_MIN,
VARY_MAX,
-   MAX_GRANTED
+   MAX_GRANTED,
+   SLPC_POWER,
  };
  
  static int slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 freq)

@@ -41,6 +42,39 @@ static int slpc_set_max_freq(struct intel_guc_slpc *slpc, 
u32 freq)
return ret;
  }
  
+static int slpc_set_freq(struct intel_gt *gt, u32 freq)

+{
+   int err;
+   struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
+
+   err = slpc_set_max_freq(slpc, freq);
+   if (err) {
+   pr_err("Unable to update max freq");
+   return err;
+   }
+
+   err = slpc_set_min_freq(slpc, freq);
+   if (err) {
+   pr_err("Unable to update min freq");
+   return err;
+   }
+
+   return err;
+}
+
+static u64 measure_power_at_freq(struct intel_gt *gt, int *freq, u64 *power)
+{
+   int err = 0;
+
+   err = slpc_set_freq(gt, *freq);
+   if (err)
+   return err;
+   *freq = intel_rps_read_actual_frequency(&gt->rps);
+   *power = measure_power(&gt->rps, freq);
+
+   return err;
+}
+
  static int vary_max_freq(struct intel_guc_slpc *slpc, struct intel_rps *rps,
 u32 *max_act_freq)
  {
@@ -113,6 +147,58 @@ static int vary_min_freq(struct intel_guc_slpc *slpc, 
struct intel_rps *rps,
return err;
  }
  
+static int slpc_power(struct intel_gt *gt, struct intel_engine_cs *engine)

+{
+   struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
+   struct {
+   u64 power;
+   int freq;
+   } min, max;
+   int err = 0;
+
+   /*
+* Our fundamental assumption is that running at lower frequency
+* actually saves power. Let's see if our RAPL measurement supports
+* that theory.
+*/
+   if (!librapl_supported(gt->i915))
+   return 0;
+
+   min.freq = slpc->min_freq;
+   err = measure_power_at_freq(gt, &min.freq, &min.power);
+
+   if (err)
+   return err;
+
+   max.freq = slpc->rp0_freq;
+   err = measure_power_at_freq(gt, &max.freq, &max.power);
+
+   if (err)
+   return err;
+
+   pr_info("%s: min:%llumW @ %uMHz, max:%llumW @ %uMHz\n",
+   engine->name,
+   min.power, min.freq,
+   max.power, max.freq);
+
+   if (10 * min.freq >= 9 * max.freq) {
+   pr_notice("Could not control frequency, ran at [%uMHz, 
%uMhz]\n",
+ min.freq, max.freq);
+   }
+
+   if (11 * min.power > 10 * max.power) {
+   pr_err("%s: did not conserve power when setting lower 
frequency!\n",
+  engine->name);
+   err = -EINVAL;
+   }
+
+   /* Restore min/max frequencies */
+   slpc_set_max_freq(slpc, slpc->rp0_freq);
+   slpc_set_min_freq(slpc, slpc->min_freq);
+
+   return err;
+}
+
  static int max_granted_freq(struct intel_guc_slpc *slpc, struct intel_rps 
*rps, u32 *max_act_freq)
  {
struct intel_gt *gt = rps_to_gt(rps);
@@ -233,17 +319,23 @@ static int run_test(struct intel_gt *gt, int test_type)
  
			err = max_granted_freq(slpc, rps, &max_act_freq);

break;
+
+   case SLPC_POWER:
+   err = slpc_power(gt, engine);
+   break;
}
  
-		pr_info("Max actual frequency for %s was %d\n",

-   engine->name, max_act_freq);
+   if (test_type != SLPC_POWER) {
+   pr_info("Max actual frequency for %s was %d\n",
+   engine->name, max_act_freq);
  
-		/* Actual frequency should rise above min */

-   if (max_act_freq <= slpc_min_freq) {
-   pr_err("Actual freq did not rise above min\n");
-   pr_err("Perf Limit Reasons: 0x%x\n",
-  intel_uncore_read(gt->uncore, 
GT0_PERF_LIMIT_REASONS));
-   err = -EINVAL;
+   /* Actual frequency should rise above min */
+   if (max_act_freq <= slpc_min_freq) {
+   pr_err("Actual freq did not rise above min\n");
+   pr_err("Perf Limit Reasons: 

Re: [Intel-gfx] [PATCH 1/3] drm/i915/guc/slpc: Run SLPC selftests on all tiles

2022-09-26 Thread Belgaumkar, Vinay



On 9/23/2022 4:00 AM, Riana Tauro wrote:

Run slpc selftests on all tiles

Signed-off-by: Riana Tauro 


LGTM,

Reviewed-by: Vinay Belgaumkar 


---
  drivers/gpu/drm/i915/gt/selftest_slpc.c | 45 -
  1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index f8a1d27df272..928f74718881 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -270,26 +270,50 @@ static int run_test(struct intel_gt *gt, int test_type)
  static int live_slpc_vary_min(void *arg)
  {
struct drm_i915_private *i915 = arg;
-   struct intel_gt *gt = to_gt(i915);
+   struct intel_gt *gt;
+   unsigned int i;
+   int ret;
+
+   for_each_gt(gt, i915, i) {
+   ret = run_test(gt, VARY_MIN);
+   if (ret)
+   return ret;
+   }
  
-	return run_test(gt, VARY_MIN);

+   return ret;
  }
  
  static int live_slpc_vary_max(void *arg)

  {
struct drm_i915_private *i915 = arg;
-   struct intel_gt *gt = to_gt(i915);
+   struct intel_gt *gt;
+   unsigned int i;
+   int ret;
+
+   for_each_gt(gt, i915, i) {
+   ret = run_test(gt, VARY_MAX);
+   if (ret)
+   return ret;
+   }
  
-	return run_test(gt, VARY_MAX);

+   return ret;
  }
  
  /* check if pcode can grant RP0 */

  static int live_slpc_max_granted(void *arg)
  {
struct drm_i915_private *i915 = arg;
-   struct intel_gt *gt = to_gt(i915);
+   struct intel_gt *gt;
+   unsigned int i;
+   int ret;
+
+   for_each_gt(gt, i915, i) {
+   ret = run_test(gt, MAX_GRANTED);
+   if (ret)
+   return ret;
+   }
  
-	return run_test(gt, MAX_GRANTED);

+   return ret;
  }
  
  int intel_slpc_live_selftests(struct drm_i915_private *i915)

@@ -300,8 +324,13 @@ int intel_slpc_live_selftests(struct drm_i915_private 
*i915)
SUBTEST(live_slpc_max_granted),
};
  
-	if (intel_gt_is_wedged(to_gt(i915)))

-   return 0;
+   struct intel_gt *gt;
+   unsigned int i;
+
+   for_each_gt(gt, i915, i) {
+   if (intel_gt_is_wedged(gt))
+   return 0;
+   }
  
  	return i915_live_subtests(tests, i915);

  }


Re: [Intel-gfx] [PATCH 2/3] drm/i915/selftests: Add helper function measure_power

2022-09-26 Thread Belgaumkar, Vinay



On 9/23/2022 4:00 AM, Riana Tauro wrote:

move the power measurement and the triangle filter
to a different function. No functional changes.

Signed-off-by: Riana Tauro 


LGTM,

Reviewed-by: Vinay Belgaumkar 


---
  drivers/gpu/drm/i915/gt/selftest_rps.c | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c 
b/drivers/gpu/drm/i915/gt/selftest_rps.c
index cfb4708dd62e..99a372486fb7 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rps.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
@@ -1107,21 +1107,27 @@ static u64 __measure_power(int duration_ms)
return div64_u64(1000 * 1000 * dE, dt);
  }
  
-static u64 measure_power_at(struct intel_rps *rps, int *freq)

+static u64 measure_power(struct intel_rps *rps, int *freq)
  {
u64 x[5];
int i;
  
-	*freq = rps_set_check(rps, *freq);

for (i = 0; i < 5; i++)
x[i] = __measure_power(5);
-   *freq = (*freq + read_cagf(rps)) / 2;
+
+   *freq = (*freq + intel_rps_read_actual_frequency(rps)) / 2;
  
  	/* A simple triangle filter for better result stability */

sort(x, 5, sizeof(*x), cmp_u64, NULL);
return div_u64(x[1] + 2 * x[2] + x[3], 4);
  }
  
+static u64 measure_power_at(struct intel_rps *rps, int *freq)

+{
+   *freq = rps_set_check(rps, *freq);
+   return measure_power(rps, freq);
+}
+
  int live_rps_power(void *arg)
  {
struct intel_gt *gt = arg;


Re: [Intel-gfx] [PATCH 1/1] drm/i915/guc/slpc: Add SLPC selftest live_slpc_power

2022-09-22 Thread Belgaumkar, Vinay



On 9/22/2022 7:32 AM, Riana Tauro wrote:

A fundamental assumption is that at lower frequencies,
not only do we run slower, but we save power compared to
higher frequencies.
live_slpc_power checks if running at low frequency saves power

Signed-off-by: Riana Tauro 
---
  drivers/gpu/drm/i915/gt/selftest_slpc.c | 116 ++--
  1 file changed, 107 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index f8a1d27df272..f22f091d2844 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -11,7 +11,8 @@
  enum test_type {
VARY_MIN,
VARY_MAX,
-   MAX_GRANTED
+   MAX_GRANTED,
+   SLPC_POWER,
  };
  
  static int slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 freq)

@@ -41,6 +42,42 @@ static int slpc_set_max_freq(struct intel_guc_slpc *slpc, 
u32 freq)
return ret;
  }
  
+static int slpc_set_freq(struct intel_gt *gt, u32 freq)

+{
+   int err;
+   struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
+
+   err = slpc_set_max_freq(slpc, freq);
+   if (err) {
+   pr_err("Unable to update max freq");
+   return err;
+   }
+
+   err = slpc_set_min_freq(slpc, freq);
+   if (err) {
+   pr_err("Unable to update min freq");
+   return err;
+   }
+
+   return intel_rps_read_actual_frequency(&gt->rps);
The return value here is overloaded (either -ERR or frequency). Can we 
just return the error status here and query the act_freq in the caller 
instead?

+}
+
+static u64 measure_slpc_power_at(struct intel_gt *gt, int *freq)

Name is a little misleading, maybe slpc_measure_power_at() ?

+{
+   u64 x[5];
+   int i;
+
+   *freq = slpc_set_freq(gt, *freq);

Here, we can check for return code and then query for act_freq.

+   for (i = 0; i < 5; i++)
+   x[i] = __measure_power(5);
+   *freq = (*freq + intel_rps_read_actual_frequency(&gt->rps)) / 2;
+
+   /* A simple triangle filter for better result stability */
+   sort(x, 5, sizeof(*x), cmp_u64, NULL);
+
+   return div_u64(x[1] + 2 * x[2] + x[3], 4);
We are duplicating code from selftest_rps here; is it possible to add a 
helper instead (like __measure_power())?

+}
+
  static int vary_max_freq(struct intel_guc_slpc *slpc, struct intel_rps *rps,
 u32 *max_act_freq)
  {
@@ -113,6 +150,52 @@ static int vary_min_freq(struct intel_guc_slpc *slpc, 
struct intel_rps *rps,
return err;
  }
  
+static int slpc_power(struct intel_gt *gt, struct intel_engine_cs *engine)

+{
+   struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
+   struct {
+   u64 power;
+   int freq;
+   } min, max;
+   int err = 0;
+
+   /*
+* Our fundamental assumption is that running at lower frequency
+* actually saves power. Let's see if our RAPL measurement support

supports*

+* that theory.
+*/
+   if (!librapl_supported(gt->i915))
+   return 0;
+
+   min.freq = slpc->min_freq;
+   min.power = measure_slpc_power_at(gt, &min.freq);
+
+   max.freq = slpc->rp0_freq;
+   max.power = measure_slpc_power_at(gt, &max.freq);
+
+   pr_info("%s: min:%llumW @ %uMHz, max:%llumW @ %uMHz\n",
+   engine->name,
+   min.power, min.freq,
+   max.power, max.freq);
+
+   if (10 * min.freq >= 9 * max.freq) {
+   pr_notice("Could not control frequency, ran at [%uMHz, 
%uMhz]\n",
+ min.freq, max.freq);
+   }
+
+   if (11 * min.power > 10 * max.power) {
+   pr_err("%s: did not conserve power when setting lower 
frequency!\n",
+  engine->name);
+   err = -EINVAL;
+   }
+
+   /* Restore min/max frequencies */
+   slpc_set_max_freq(slpc, slpc->rp0_freq);
+   slpc_set_min_freq(slpc, slpc->min_freq);
+
+   return err;
+}
+
  static int max_granted_freq(struct intel_guc_slpc *slpc, struct intel_rps 
*rps, u32 *max_act_freq)
  {
struct intel_gt *gt = rps_to_gt(rps);
@@ -233,17 +316,23 @@ static int run_test(struct intel_gt *gt, int test_type)
  
			err = max_granted_freq(slpc, rps, &max_act_freq);

break;
+
+   case SLPC_POWER:
+   err = slpc_power(gt, engine);
+   break;
}
  
-		pr_info("Max actual frequency for %s was %d\n",

-   engine->name, max_act_freq);
+   if (test_type != SLPC_POWER) {
+   pr_info("Max actual frequency for %s was %d\n",
+   engine->name, max_act_freq);
  
-		/* Actual frequency should rise above min */

-   if (max_act_freq <= slpc_min_freq) {
-   pr_err("Actual freq did not rise above min\n");
-   pr_err("Perf Limit Reasons: 0x%x\n",
- 

Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Allow SLPC to use efficient frequency

2022-08-15 Thread Belgaumkar, Vinay



On 8/15/2022 10:32 AM, Rodrigo Vivi wrote:

On Sun, Aug 14, 2022 at 04:46:54PM -0700, Vinay Belgaumkar wrote:

Host Turbo operates at efficient frequency when GT is not idle unless
the user or workload has forced it to a higher level. Replicate the same
behavior in SLPC by allowing the algorithm to use efficient frequency.
We had disabled it during boot due to concerns that it might break
kernel ABI for min frequency. However, this is not the case since
SLPC will still abide by the (min,max) range limits.

With this change, min freq will be at efficient frequency level at init
instead of fused min (RPn). If user chooses to reduce min freq below the
efficient freq, we will turn off usage of efficient frequency and honor
the user request. When a higher value is written, it will get toggled
back again.

The patch also corrects the register which needs to be read for obtaining
the correct efficient frequency for Gen9+.

We see much better perf numbers with benchmarks like glmark2 with
efficient frequency usage enabled as expected.

BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/5468

Cc: Rodrigo Vivi 

First of all sorry for looking to the old patch first... I was delayed in my 
inbox flow.


Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c |  3 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 66 +++--
  drivers/gpu/drm/i915/intel_mchbar_regs.h|  3 +
  3 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index c7d381ad90cf..281a086fc265 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1108,6 +1108,9 @@ void gen6_rps_get_freq_caps(struct intel_rps *rps, struct 
intel_rps_freq_caps *c
} else {
caps->rp0_freq = (rp_state_cap >>  0) & 0xff;
caps->rp1_freq = (rp_state_cap >>  8) & 0xff;
+   caps->rp1_freq = REG_FIELD_GET(RPE_MASK,
+  
intel_uncore_read(to_gt(i915)->uncore,
+  GEN10_FREQ_INFO_REC));

This register is only gen10+ while the func is gen6+.
Either we handle the platform properly, or we create a new rpe_freq tracker
somewhere: if that's available we use this RPe, otherwise we use the hw fused
RP1, which is good enough but is not the actual value resolved by pcode like
this new RPe one.

sure.
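
(e.g. gating the RPe read on graphics version - a sketch; the exact cut-off
and placement are assumptions:)

	caps->rp1_freq = (rp_state_cap >> 8) & 0xff;
	/* Only newer parts expose the pcode-resolved RPe */
	if (GRAPHICS_VER(i915) >= 10)
		caps->rp1_freq = REG_FIELD_GET(RPE_MASK,
					       intel_uncore_read(to_gt(i915)->uncore,
								 GEN10_FREQ_INFO_REC));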



caps->min_freq = (rp_state_cap >> 16) & 0xff;
}
  
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c

index e1fa1f32f29e..70a2af5f518d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -465,6 +465,29 @@ int intel_guc_slpc_get_max_freq(struct intel_guc_slpc 
*slpc, u32 *val)
return ret;
  }
  
+static int slpc_ignore_eff_freq(struct intel_guc_slpc *slpc, bool ignore)

I know this code was already there, but I do have some questions around this
and maybe we can simplify now that we are touching this function.


+{
+   int ret = 0;
+
+   if (ignore) {
+   ret = slpc_set_param(slpc,
+SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
+ignore);
+   if (!ret)
+   return slpc_set_param(slpc,
+ 
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+ slpc->min_freq);

why do we need to touch this min request here?

true, not needed anymore.



+   } else {
+   ret = slpc_unset_param(slpc,
+  SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY);

do we really need the unset param?

for me using set_param(SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY, freq < rpe_freq)
was enough...


Yup, removed this helper function as discussed.

Thanks,

Vinay.
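
(So the toggle in the min-freq path reduces to a single set_param keyed off
the requested value - a sketch of the agreed direction:)

	/* Ignore efficient freq only while the requested min sits below it */
	ret = slpc_set_param(slpc,
			     SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
			     val < slpc->rp1_freq);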




+   if (!ret)
+   return slpc_unset_param(slpc,
+   
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ);
+   }
+
+   return ret;
+}
+
  /**
   * intel_guc_slpc_set_min_freq() - Set min frequency limit for SLPC.
   * @slpc: pointer to intel_guc_slpc.
@@ -491,6 +514,14 @@ int intel_guc_slpc_set_min_freq(struct intel_guc_slpc 
*slpc, u32 val)
  
  	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
  
+		/* Ignore efficient freq if lower min freq is requested */

+   ret = slpc_ignore_eff_freq(slpc, val < slpc->rp1_freq);
+   if (unlikely(ret)) {
+   i915_probe_error(i915, "Failed to toggle efficient freq 
(%pe)\n",
+ERR_PTR(ret));
+   return ret;
+   }
+
ret = slpc_set_param(slpc,
 SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
 

Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Allow SLPC to use efficient frequency

2022-08-15 Thread Belgaumkar, Vinay



On 8/15/2022 9:51 AM, Rodrigo Vivi wrote:

On Tue, Aug 09, 2022 at 05:03:06PM -0700, Vinay Belgaumkar wrote:

Host Turbo operates at efficient frequency when GT is not idle unless
the user or workload has forced it to a higher level. Replicate the same
behavior in SLPC by allowing the algorithm to use efficient frequency.
We had disabled it during boot due to concerns that it might break
kernel ABI for min frequency. However, this is not the case, since
SLPC will still abide by the (min,max) range limits and pcode forces
frequency to 0 anyways when GT is in C6.

We also see much better perf numbers with benchmarks like glmark2 with
efficient frequency usage enabled.

Fixes: 025cb07bebfa ("drm/i915/guc/slpc: Cache platform frequency limits")

Signed-off-by: Vinay Belgaumkar 

I'm honestly surprised that our CI passed cleanly. What happens when a user
requests both min and max < RPe?

I'm sure that in this case GuC SLPC will put us at RPe, ignoring our requests.
Or is this good enough for users' expectations because the soft limits show
the requested freq and we don't ask GuC what it currently has as the minimum?

I just want to be sure that we are not causing any confusion for end users
out there in the case they request some min/max below RPe and start seeing
mismatches in expectations because GuC is forcing the real min request
to RPe.

My suggestion is to ignore the RPe whenever we have a min request below it.
So GuC respects our (and users') chosen min. And restore it whenever the min
request is above RPe.


Yup, I have already sent a patch yesterday with that change, doesn't 
look like CI has run on it yet. This was the old version.


Thanks,

Vinay.



Thanks,
Rodrigo.


---
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 52 -
  1 file changed, 52 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index e1fa1f32f29e..4b824da3048a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -137,17 +137,6 @@ static int guc_action_slpc_set_param(struct intel_guc 
*guc, u8 id, u32 value)
return ret > 0 ? -EPROTO : ret;
  }
  
-static int guc_action_slpc_unset_param(struct intel_guc *guc, u8 id)

-{
-   u32 request[] = {
-   GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
-   SLPC_EVENT(SLPC_EVENT_PARAMETER_UNSET, 1),
-   id,
-   };
-
-   return intel_guc_send(guc, request, ARRAY_SIZE(request));
-}
-
  static bool slpc_is_running(struct intel_guc_slpc *slpc)
  {
return slpc_get_state(slpc) == SLPC_GLOBAL_STATE_RUNNING;
@@ -201,16 +190,6 @@ static int slpc_set_param(struct intel_guc_slpc *slpc, u8 
id, u32 value)
return ret;
  }
  
-static int slpc_unset_param(struct intel_guc_slpc *slpc,

-   u8 id)
-{
-   struct intel_guc *guc = slpc_to_guc(slpc);
-
-   GEM_BUG_ON(id >= SLPC_MAX_PARAM);
-
-   return guc_action_slpc_unset_param(guc, id);
-}
-
  static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
  {
struct drm_i915_private *i915 = slpc_to_i915(slpc);
@@ -597,29 +576,6 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
return 0;
  }
  
-static int slpc_ignore_eff_freq(struct intel_guc_slpc *slpc, bool ignore)

-{
-   int ret = 0;
-
-   if (ignore) {
-   ret = slpc_set_param(slpc,
-SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
-ignore);
-   if (!ret)
-   return slpc_set_param(slpc,
- 
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
- slpc->min_freq);
-   } else {
-   ret = slpc_unset_param(slpc,
-  SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY);
-   if (!ret)
-   return slpc_unset_param(slpc,
-   
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ);
-   }
-
-   return ret;
-}
-
  static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
  {
/* Force SLPC to used platform rp0 */
@@ -679,14 +635,6 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
  
  	slpc_get_rp_values(slpc);
  
-	/* Ignore efficient freq and set min to platform min */

-   ret = slpc_ignore_eff_freq(slpc, true);
-   if (unlikely(ret)) {
-   i915_probe_error(i915, "Failed to set SLPC min to RPn (%pe)\n",
-ERR_PTR(ret));
-   return ret;
-   }
-
/* Set SLPC max limit to RP0 */
ret = slpc_use_fused_rp0(slpc);
if (unlikely(ret)) {
--
2.35.1
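
The direction Rodrigo suggests above (ignore RPe only when the requested min
falls below it) reduces to something like the following sketch;
slpc_apply_min_freq() is a hypothetical wrapper name, while the SLPC
parameter ids and slpc_set_param() are the ones already used in this code:

/*
 * Hypothetical helper illustrating the suggestion: gate the "ignore
 * efficient frequency" SLPC parameter on whether the requested min
 * is below RPe, then apply the requested min itself.
 */
static int slpc_apply_min_freq(struct intel_guc_slpc *slpc, u32 val)
{
	int ret;

	ret = slpc_set_param(slpc,
			     SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
			     val < slpc->rp1_freq);
	if (ret)
		return ret;

	return slpc_set_param(slpc,
			      SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
			      val);
}
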



Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Allow SLPC to use efficient frequency

2022-08-14 Thread Belgaumkar, Vinay



On 8/14/2022 4:46 PM, Vinay Belgaumkar wrote:

Host Turbo operates at efficient frequency when GT is not idle unless
the user or workload has forced it to a higher level. Replicate the same
behavior in SLPC by allowing the algorithm to use efficient frequency.
We had disabled it during boot due to concerns that it might break
kernel ABI for min frequency. However, this is not the case since
SLPC will still abide by the (min,max) range limits.

With this change, min freq will be at efficient frequency level at init
instead of fused min (RPn). If user chooses to reduce min freq below the
efficient freq, we will turn off usage of efficient frequency and honor
the user request. When a higher value is written, it will get toggled
back again.

The patch also corrects the register which needs to be read for obtaining
the correct efficient frequency for Gen9+.

We see much better perf numbers with benchmarks like glmark2 with
efficient frequency usage enabled as expected.

BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/5468

Cc: Rodrigo Vivi 
Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c |  3 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 66 +++--
  drivers/gpu/drm/i915/intel_mchbar_regs.h|  3 +
  3 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index c7d381ad90cf..281a086fc265 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1108,6 +1108,9 @@ void gen6_rps_get_freq_caps(struct intel_rps *rps, struct 
intel_rps_freq_caps *c
} else {
caps->rp0_freq = (rp_state_cap >>  0) & 0xff;
caps->rp1_freq = (rp_state_cap >>  8) & 0xff;


Forgot to remove old code here. Will do so for the next revision as it 
does not affect the patch.


Thanks,

Vinay.


+		caps->rp1_freq = REG_FIELD_GET(RPE_MASK,
+					       intel_uncore_read(to_gt(i915)->uncore,
+								 GEN10_FREQ_INFO_REC));
caps->min_freq = (rp_state_cap >> 16) & 0xff;
}
  
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c

index e1fa1f32f29e..70a2af5f518d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -465,6 +465,29 @@ int intel_guc_slpc_get_max_freq(struct intel_guc_slpc 
*slpc, u32 *val)
return ret;
  }
  
+static int slpc_ignore_eff_freq(struct intel_guc_slpc *slpc, bool ignore)

+{
+   int ret = 0;
+
+   if (ignore) {
+   ret = slpc_set_param(slpc,
+SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
+ignore);
+   if (!ret)
+   return slpc_set_param(slpc,
+ 
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+ slpc->min_freq);
+   } else {
+   ret = slpc_unset_param(slpc,
+  SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY);
+   if (!ret)
+   return slpc_unset_param(slpc,
+   
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ);
+   }
+
+   return ret;
+}
+
  /**
   * intel_guc_slpc_set_min_freq() - Set min frequency limit for SLPC.
   * @slpc: pointer to intel_guc_slpc.
@@ -491,6 +514,14 @@ int intel_guc_slpc_set_min_freq(struct intel_guc_slpc 
*slpc, u32 val)
  
	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {

+		/* Ignore efficient freq if lower min freq is requested */
+		ret = slpc_ignore_eff_freq(slpc, val < slpc->rp1_freq);
+		if (unlikely(ret)) {
+			i915_probe_error(i915, "Failed to toggle efficient freq (%pe)\n",
+					 ERR_PTR(ret));
+   return ret;
+   }
+
ret = slpc_set_param(slpc,
 SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
 val);
@@ -587,7 +618,9 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
return ret;
  
  	if (!slpc->min_freq_softlimit) {

-   slpc->min_freq_softlimit = slpc->min_freq;
+   ret = intel_guc_slpc_get_min_freq(slpc, 
>min_freq_softlimit);
+   if (unlikely(ret))
+   return ret;
slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
} else if (slpc->min_freq_softlimit != slpc->min_freq) {
return intel_guc_slpc_set_min_freq(slpc,
@@ -597,29 +630,6 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
return 0;
  }
  
-static int slpc_ignore_eff_freq(struct intel_guc_slpc *slpc, bool ignore)

-{
-   int 

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/guc/slpc: Add a new SLPC selftest (rev4)

2022-06-28 Thread Belgaumkar, Vinay


From: Patchwork 
Sent: Monday, June 27, 2022 10:00 PM
To: Belgaumkar, Vinay 
Cc: intel-gfx@lists.freedesktop.org
Subject: ✗ Fi.CI.BAT: failure for drm/i915/guc/slpc: Add a new SLPC selftest 
(rev4)

Patch Details
Series:
drm/i915/guc/slpc: Add a new SLPC selftest (rev4)
URL:
https://patchwork.freedesktop.org/series/105005/
State:
failure
Details:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/index.html
CI Bug Log - changes from CI_DRM_11816 -> Patchwork_105005v4
Summary

FAILURE

Serious unknown changes coming with Patchwork_105005v4 absolutely need to be
verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_105005v4, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.

External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/index.html

Participating hosts (40 -> 39)

Additional (3): fi-hsw-4770 bat-adlm-1 fi-icl-u2
Missing (4): bat-dg2-8 fi-rkl-11600 fi-bdw-samus bat-dg1-5

Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_105005v4:

IGT changes
Possible regressions

  *   igt@i915_selftest@live@workarounds:
 *   fi-bdw-5557u: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11816/fi-bdw-5557u/igt@i915_selftest@l...@workarounds.html>
 -> 
INCOMPLETE<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/fi-bdw-5557u/igt@i915_selftest@l...@workarounds.html>
This failure is not related to the patch; BDW does not have SLPC.
Thanks,
Vinay.
Suppressed

The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.

  *   igt@i915_selftest@live@perf:
 *   {bat-adln-1}: NOTRUN -> 
DMESG-FAIL<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/bat-adln-1/igt@i915_selftest@l...@perf.html>

Known issues

Here are the changes found in Patchwork_105005v4 that come from known issues:

IGT changes
Issues hit

  *   igt@gem_huc_copy@huc-copy:
 *   fi-icl-u2: NOTRUN -> 
SKIP<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/fi-icl-u2/igt@gem_huc_c...@huc-copy.html>
 (i915#2190<https://gitlab.freedesktop.org/drm/intel/issues/2190>)
  *   igt@gem_lmem_swapping@random-engines:
 *   fi-icl-u2: NOTRUN -> 
SKIP<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/fi-icl-u2/igt@gem_lmem_swapp...@random-engines.html>
 (i915#4613<https://gitlab.freedesktop.org/drm/intel/issues/4613>) +3 similar 
issues
  *   igt@i915_pm_backlight@basic-brightness:
 *   fi-hsw-4770: NOTRUN -> 
SKIP<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/fi-hsw-4770/igt@i915_pm_backli...@basic-brightness.html>
 (fdo#109271<https://bugs.freedesktop.org/show_bug.cgi?id=109271> / 
i915#3012<https://gitlab.freedesktop.org/drm/intel/issues/3012>)
  *   igt@i915_pm_rpm@module-reload:
 *   fi-cfl-8109u: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11816/fi-cfl-8109u/igt@i915_pm_...@module-reload.html>
 -> 
DMESG-FAIL<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/fi-cfl-8109u/igt@i915_pm_...@module-reload.html>
 (i915#62<https://gitlab.freedesktop.org/drm/intel/issues/62>)
 *   bat-adlp-4: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11816/bat-adlp-4/igt@i915_pm_...@module-reload.html>
 -> 
DMESG-WARN<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/bat-adlp-4/igt@i915_pm_...@module-reload.html>
 (i915#3576<https://gitlab.freedesktop.org/drm/intel/issues/3576>) +1 similar 
issue
  *   igt@i915_selftest@live@execlists:
 *   fi-bsw-nick: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11816/fi-bsw-nick/igt@i915_selftest@l...@execlists.html>
 -> 
INCOMPLETE<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/fi-bsw-nick/igt@i915_selftest@l...@execlists.html>
 (i915#5847<https://gitlab.freedesktop.org/drm/intel/issues/5847>)
  *   igt@i915_selftest@live@hangcheck:
 *   fi-hsw-g3258: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11816/fi-hsw-g3258/igt@i915_selftest@l...@hangcheck.html>
 -> 
INCOMPLETE<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/fi-hsw-g3258/igt@i915_selftest@l...@hangcheck.html>
 (i915#4785<https://gitlab.freedesktop.org/drm/intel/issues/4785>)
  *   igt@i915_selftest@live@ring_submission:
 *   fi-cfl-8109u: 
PASS<https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11816/fi-cfl-8109u/igt@i915_selftest@live@ring_submission.html>
 -> 
DMESG-WARN<https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105005v4/fi-cfl-8109u/igt@i915_selftest@live@ring_submission.html>
 (i915#5904<https://gitlab.freedesktop.org/drm/intel/issues/5904>) +11 similar 
issues
  *   igt@i915_suspend@basic-s2idle-without-i915:
 *   fi-cfl-8109u: 
PASS<https://intel-gfx-ci.01.org/t

Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Add a new SLPC selftest

2022-06-27 Thread Belgaumkar, Vinay



On 6/24/2022 8:59 PM, Dixit, Ashutosh wrote:

On Thu, 23 Jun 2022 16:33:20 -0700, Vinay Belgaumkar wrote:

+static int max_granted_freq(struct intel_guc_slpc *slpc, struct intel_rps 
*rps, u32 *max_act_freq)
+{
+   struct intel_gt *gt = rps_to_gt(rps);
+   u32 perf_limit_reasons;
+   int err = 0;

-   igt_spinner_end();
-   st_engine_heartbeat_enable(engine);
-   }
+   err = slpc_set_min_freq(slpc, slpc->rp0_freq);
+   if (err)
+   return err;

-   pr_info("Max actual frequency for %s was %d\n",
-   engine->name, max_act_freq);
+   *max_act_freq =  intel_rps_read_actual_frequency(rps);
+   if (!(*max_act_freq == slpc->rp0_freq)) {

nit but '*max_act_freq != slpc->rp0_freq'



+   /* Check if there was some throttling by pcode */
+   perf_limit_reasons = intel_uncore_read(gt->uncore, 
GT0_PERF_LIMIT_REASONS);

-   /* Actual frequency should rise above min */
-   if (max_act_freq == slpc_min_freq) {
-   pr_err("Actual freq did not rise above min\n");
+   /* If not, this is an error */
+   if (!(perf_limit_reasons && GT0_PERF_LIMIT_REASONS_MASK)) {

Still wrong, should be & not &&


+   pr_err("Pcode did not grant max freq\n");
err = -EINVAL;
-   }
+   } else {
+   pr_info("Pcode throttled frequency 0x%x\n", 
perf_limit_reasons);

Another question, why are we using pr_err/info here rather than
drm_err/info? pr_err/info is ok for mock selftests since there is no drm
device but that is not the case here, I think this is done in other
selftests too but maybe fix this as well if we are making so many changes
here? Anyway can do later too.


Yup, will send a separate patch to change them to drm_err/info.

Thanks,

Vinay.



So let's settle issues in v2 thread first.

Thanks.
--
Ashutosh
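
For clarity, the fix Ashutosh keeps flagging is a bitwise test of the
throttle-reason bits rather than a logical AND, i.e. roughly:

	/* Bitwise AND, not logical: test the actual throttle-reason bits */
	if (!(perf_limit_reasons & GT0_PERF_LIMIT_REASONS_MASK)) {
		pr_err("Pcode did not grant max freq\n");
		err = -EINVAL;
	} else {
		pr_info("Pcode throttled frequency 0x%x\n", perf_limit_reasons);
	}
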


Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Add a new SLPC selftest

2022-06-27 Thread Belgaumkar, Vinay



On 6/24/2022 8:59 PM, Dixit, Ashutosh wrote:

On Thu, 23 Jun 2022 16:21:46 -0700, Belgaumkar, Vinay wrote:

On 6/22/2022 1:32 PM, Dixit, Ashutosh wrote:

On Fri, 10 Jun 2022 16:47:12 -0700, Vinay Belgaumkar wrote:

This test will validate we can achieve actual frequency of RP0. Pcode
grants frequencies based on what GuC is requesting. However, thermal
throttling can limit what is being granted. Add a test to request for
max, but don't fail the test if RP0 is not granted due to throttle
reasons.

Also optimize the selftest by using a common run_test function to avoid
code duplication.

The refactoring does change the order of operations (changing the freq vs
spawning the spinner) but should be fine I think.

Yes, we now start the spinner outside the for loop, so that freq changes
occur quickly. This ensures we don't mess with SLPC algorithm's history by
frequently restarting the WL in the for loop.
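
Concretely, the structure being described is one spinner per engine with the
frequency swept in an inner loop; a rough sketch, with spinner setup, teardown
and declarations elided, reusing the helpers already in selftest_slpc.c:

	for_each_engine(engine, gt, id) {
		/* start a spinner request once on this engine (elided) */

		/* sweep the min freq without restarting the workload */
		for (step = slpc->min_freq; step <= slpc->rp0_freq;
		     step += FREQUENCY_REQ_UNIT) {
			err = slpc_set_min_freq(slpc, step);
			if (err)
				break;

			delay_for_h2g();
			act_freq = intel_rps_read_actual_frequency(rps);
			max_act_freq = max(act_freq, max_act_freq);
		}

		/* end the spinner and restore the soft limits (elided) */
	}
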

Rename the "clamp" tests to vary_max_freq and vary_min_freq.

Either is ok, but maybe "clamp" names were ok I think since they verify req
freq is clamped at min/max.

True, though clamp usually is associated with limiting, whereas we actually
increase the min.

v2: Fix compile warning

Fixes 8ee2c227822e ("drm/i915/guc/slpc: Add SLPC selftest")
Signed-off-by: Vinay Belgaumkar 
---
   drivers/gpu/drm/i915/gt/selftest_slpc.c | 323 
   1 file changed, 158 insertions(+), 165 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index b768cea5943d..099129aae9a5 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -8,6 +8,11 @@
   #define delay_for_h2g() usleep_range(H2G_DELAY, H2G_DELAY + 1)
   #define FREQUENCY_REQ_UNIT   DIV_ROUND_CLOSEST(GT_FREQUENCY_MULTIPLIER, \
  GEN9_FREQ_SCALER)
+enum test_type {
+   VARY_MIN,
+   VARY_MAX,
+   MAX_GRANTED
+};

   static int slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 freq)
   {
@@ -36,147 +41,120 @@ static int slpc_set_max_freq(struct intel_guc_slpc *slpc, 
u32 freq)
return ret;
   }

-static int live_slpc_clamp_min(void *arg)
+static int vary_max_freq(struct intel_guc_slpc *slpc, struct intel_rps *rps,
+ u32 *max_act_freq)

Please run checkpatch, indentation seems off.

I had run it. Not sure why this wasn't caught.

Need to use 'checkpatch --strict'.

ok.



   {
-   struct drm_i915_private *i915 = arg;
-   struct intel_gt *gt = to_gt(i915);
-	struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
-	struct intel_rps *rps = &gt->rps;
-   struct intel_engine_cs *engine;
-   enum intel_engine_id id;
-   struct igt_spinner spin;
+   u32 step, max_freq, req_freq;
+   u32 act_freq;
u32 slpc_min_freq, slpc_max_freq;
int err = 0;

-	if (!intel_uc_uses_guc_slpc(&gt->uc))
-		return 0;
-
-	if (igt_spinner_init(&spin, gt))
-   return -ENOMEM;
+   slpc_min_freq = slpc->min_freq;
+   slpc_max_freq = slpc->rp0_freq;

nit but we don't really need such variables since we don't change their
values, we should just use slpc->min_freq, slpc->rp0_freq directly. I'd
change this in all places in this patch.

I will remove it from the sub-functions, but will need to keep the one in
the main run_test(). We should query SLPC's min and max and then restore
that at the end of the test. It is possible that SLPC's min is different
from platform min for certain skus.

Sorry, I am not following. The tests are varying freq between platform min
to platform max correct? And platform min can be different from slpc min?
So why don't the tests start at slpc min rather than platform min? Can't
this return error?
Will start the tests from platform min -> platform max, that way we 
remain consistent.


And shouldn't slpc->min be set to the real slpc min rather than to the
platform min when slpc initializes (in intel_guc_slpc_enable() or
slpc_get_rp_values())? (I am assuming the issue is only for the min and not
the max, but I am not sure.)
Certain conditions may result in SLPC setting the min to a different 
value. We can worry about that in a different patch.


So I'd expect everywhere a consistent set of freq's be used, in run_test()
and the actual vary_min/max_freq tests and also in the main driver.

Agree.



-	if (intel_guc_slpc_get_max_freq(slpc, &slpc_max_freq)) {
-   pr_err("Could not get SLPC max freq\n");
-   return -EIO;
-   }
-
-	if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq)) {
-   pr_err("Could not get SLPC min freq\n");
-   return -EIO;

Why do we need these two function calls? Can't we just use slpc->rp0_freq
and slpc->min_freq as we are doing in the vary_min/max_freq() functions
above?

Same as above.

Also, as mentioned below I think here we should just do:

  

Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Add a new SLPC selftest

2022-06-27 Thread Belgaumkar, Vinay



On 6/24/2022 8:59 PM, Dixit, Ashutosh wrote:

On Thu, 23 Jun 2022 16:33:20 -0700, Vinay Belgaumkar wrote:

+static int max_granted_freq(struct intel_guc_slpc *slpc, struct intel_rps 
*rps, u32 *max_act_freq)
+{
+   struct intel_gt *gt = rps_to_gt(rps);
+   u32 perf_limit_reasons;
+   int err = 0;

-   igt_spinner_end();
-   st_engine_heartbeat_enable(engine);
-   }
+   err = slpc_set_min_freq(slpc, slpc->rp0_freq);
+   if (err)
+   return err;

-   pr_info("Max actual frequency for %s was %d\n",
-   engine->name, max_act_freq);
+   *max_act_freq =  intel_rps_read_actual_frequency(rps);
+   if (!(*max_act_freq == slpc->rp0_freq)) {

nit but '*max_act_freq != slpc->rp0_freq'

ok.




+   /* Check if there was some throttling by pcode */
+   perf_limit_reasons = intel_uncore_read(gt->uncore, 
GT0_PERF_LIMIT_REASONS);

-   /* Actual frequency should rise above min */
-   if (max_act_freq == slpc_min_freq) {
-   pr_err("Actual freq did not rise above min\n");
+   /* If not, this is an error */
+   if (!(perf_limit_reasons && GT0_PERF_LIMIT_REASONS_MASK)) {

Still wrong, should be & not &&

yup, third time's the charm.



+   pr_err("Pcode did not grant max freq\n");
err = -EINVAL;
-   }
+   } else {
+   pr_info("Pcode throttled frequency 0x%x\n", 
perf_limit_reasons);

Another question, why are we using pr_err/info here rather than
drm_err/info? pr_err/info is ok for mock selftests since there is no drm
device but that is not the case here, I think this is done in other
selftests too but maybe fix this as well if we are making so many changes
here? Anyway can do later too.

So let's settle issues in v2 thread first.


Thanks,

Vinay.



Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Add a new SLPC selftest

2022-06-23 Thread Belgaumkar, Vinay



On 6/22/2022 1:32 PM, Dixit, Ashutosh wrote:

On Fri, 10 Jun 2022 16:47:12 -0700, Vinay Belgaumkar wrote:

This test will validate we can achieve actual frequency of RP0. Pcode
grants frequencies based on what GuC is requesting. However, thermal
throttling can limit what is being granted. Add a test to request for
max, but don't fail the test if RP0 is not granted due to throttle
reasons.

Also optimize the selftest by using a common run_test function to avoid
code duplication.

The refactoring does change the order of operations (changing the freq vs
spawning the spinner) but should be fine I think.
Yes, we now start the spinner outside the for loop, so that freq changes 
occur quickly. This ensures we don't mess with SLPC algorithm's history 
by frequently restarting the WL in the for loop.



Rename the "clamp" tests to vary_max_freq and vary_min_freq.

Either is ok, but maybe "clamp" names were ok I think since they verify req
freq is clamped at min/max.
True, though clamp usually is associated with limiting, whereas we 
actually increase the min.



v2: Fix compile warning

Fixes 8ee2c227822e ("drm/i915/guc/slpc: Add SLPC selftest")
Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/selftest_slpc.c | 323 
  1 file changed, 158 insertions(+), 165 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c 
b/drivers/gpu/drm/i915/gt/selftest_slpc.c
index b768cea5943d..099129aae9a5 100644
--- a/drivers/gpu/drm/i915/gt/selftest_slpc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -8,6 +8,11 @@
  #define delay_for_h2g() usleep_range(H2G_DELAY, H2G_DELAY + 1)
  #define FREQUENCY_REQ_UNITDIV_ROUND_CLOSEST(GT_FREQUENCY_MULTIPLIER, \
  GEN9_FREQ_SCALER)
+enum test_type {
+   VARY_MIN,
+   VARY_MAX,
+   MAX_GRANTED
+};

  static int slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 freq)
  {
@@ -36,147 +41,120 @@ static int slpc_set_max_freq(struct intel_guc_slpc *slpc, 
u32 freq)
return ret;
  }

-static int live_slpc_clamp_min(void *arg)
+static int vary_max_freq(struct intel_guc_slpc *slpc, struct intel_rps *rps,
+ u32 *max_act_freq)

Please run checkpatch, indentation seems off.

I had run it. Not sure why this wasn't caught.



  {
-   struct drm_i915_private *i915 = arg;
-   struct intel_gt *gt = to_gt(i915);
-	struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
-	struct intel_rps *rps = &gt->rps;
-   struct intel_engine_cs *engine;
-   enum intel_engine_id id;
-   struct igt_spinner spin;
+   u32 step, max_freq, req_freq;
+   u32 act_freq;
u32 slpc_min_freq, slpc_max_freq;
int err = 0;

-	if (!intel_uc_uses_guc_slpc(&gt->uc))
-		return 0;
-
-	if (igt_spinner_init(&spin, gt))
-   return -ENOMEM;
+   slpc_min_freq = slpc->min_freq;
+   slpc_max_freq = slpc->rp0_freq;

nit but we don't really need such variables since we don't change their
values, we should just use slpc->min_freq, slpc->rp0_freq directly. I'd
change this in all places in this patch.


I will remove it from the sub-functions, but will need to keep the one 
in the main run_test(). We should query SLPC's min and max and then 
restore that at the end of the test. It is possible that SLPC's min is 
different from platform min for certain skus.





-	if (intel_guc_slpc_get_max_freq(slpc, &slpc_max_freq)) {
-   pr_err("Could not get SLPC max freq\n");
-   return -EIO;
-   }
-
-	if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq)) {
-   pr_err("Could not get SLPC min freq\n");
-   return -EIO;

Why do we need these two function calls? Can't we just use slpc->rp0_freq
and slpc->min_freq as we are doing in the vary_min/max_freq() functions
above?

Same as above.


Also, as mentioned below I think here we should just do:

 slpc_set_max_freq(slpc, slpc->rp0_freq);
 slpc_set_min_freq(slpc, slpc->min_freq);

to restore freq to a known state before starting the test (just in case a
previous test changed the values).

Any test that changes the frequencies should restore them as well.



-   }
-
-   if (slpc_min_freq == slpc_max_freq) {
-   pr_err("Min/Max are fused to the same value\n");
-   return -EINVAL;

What if they are actually equal? I think basically the max/min freq test
loops will just not be entered (so effectively the tests will just
skip). The granted freq test will be fine. So I think we can just delete
this if statement?

(It is showing deleted above in the patch but is in the new code somewhere
too).
Actually, we should set it to min/rp0 if this is the case. That change 
will be in a separate patch. This is needed for certain cases.



-   }
-
-   intel_gt_pm_wait_for_idle(gt);
-   intel_gt_pm_get(gt);
-   for_each_engine(engine, gt, id) {
-   struct i915_request *rq;
-   u32 

Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Use non-blocking H2G for waitboost

2022-06-22 Thread Belgaumkar, Vinay



On 6/21/2022 5:26 PM, Dixit, Ashutosh wrote:

On Sat, 14 May 2022 23:05:06 -0700, Vinay Belgaumkar wrote:

SLPC min/max frequency updates require H2G calls. We are seeing
timeouts when GuC channel is backed up and it is unable to respond
in a timely fashion causing warnings and affecting CI.

This is seen when waitboosting happens during a stress test.
This patch updates the waitboost path to use a non-blocking
H2G call instead, which returns as soon as the message is
successfully transmitted.

Overall I am ok moving waitboost to use the non-blocking H2G. We can
consider increasing the timeout in wait_for_ct_request_update() to be a
separate issue for blocking cases and we can handle that separately.

Still there a couple of issues with this patch mentioned below.


v2: Use drm_notice to report any errors that might occur while
sending the waitboost H2G request (Tvrtko)

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 44 +
  1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index 1db833da42df..e5e869c96262 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -98,6 +98,30 @@ static u32 slpc_get_state(struct intel_guc_slpc *slpc)
return data->header.global_state;
  }

+static int guc_action_slpc_set_param_nb(struct intel_guc *guc, u8 id, u32 
value)
+{
+   u32 request[] = {
+   GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
+   SLPC_EVENT(SLPC_EVENT_PARAMETER_SET, 2),
+   id,
+   value,
+   };
+   int ret;
+
+   ret = intel_guc_send_nb(guc, request, ARRAY_SIZE(request), 0);
+
+   return ret > 0 ? -EPROTO : ret;
+}
+
+static int slpc_set_param_nb(struct intel_guc_slpc *slpc, u8 id, u32 value)
+{
+   struct intel_guc *guc = slpc_to_guc(slpc);
+
+   GEM_BUG_ON(id >= SLPC_MAX_PARAM);
+
+   return guc_action_slpc_set_param_nb(guc, id, value);
+}
+
  static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value)
  {
u32 request[] = {
@@ -208,12 +232,10 @@ static int slpc_force_min_freq(struct intel_guc_slpc 
*slpc, u32 freq)
 */

	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
-		ret = slpc_set_param(slpc,
-				     SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
-				     freq);
-		if (ret)
-			i915_probe_error(i915, "Unable to force min freq to %u: %d",
-					 freq, ret);
+		/* Non-blocking request will avoid stalls */
+		ret = slpc_set_param_nb(slpc,
+					SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+					freq);
}

return ret;
@@ -222,6 +244,8 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, 
u32 freq)
  static void slpc_boost_work(struct work_struct *work)
  {
struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), 
boost_work);
+   struct drm_i915_private *i915 = slpc_to_i915(slpc);
+   int err;

/*
 * Raise min freq to boost. It's possible that
@@ -231,8 +255,12 @@ static void slpc_boost_work(struct work_struct *work)
 */
	mutex_lock(&slpc->lock);
	if (atomic_read(&slpc->num_waiters)) {
-		slpc_force_min_freq(slpc, slpc->boost_freq);
-		slpc->num_boosts++;
+		err = slpc_force_min_freq(slpc, slpc->boost_freq);
+		if (!err)
+			slpc->num_boosts++;
+		else
+			drm_notice(&i915->drm, "Failed to send waitboost request (%d)\n",
+				   err);

The issue I have is what happens when we de-boost (restore min freq to its
previous value in intel_guc_slpc_dec_waiters()). It would seem that that
call is fairly important to get the min freq down when there are no pending
requests. Therefore what do we do in that case?

This is the function:

void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
{
	mutex_lock(&slpc->lock);
	if (atomic_dec_and_test(&slpc->num_waiters))
		slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
	mutex_unlock(&slpc->lock);
}


1. First it would seem that at the minimum we need a similar drm_notice()
in intel_guc_slpc_dec_waiters(). That would mean we need to put the
drm_notice() back in slpc_force_min_freq() (replacing
i915_probe_error()) rather than in slpc_boost_work() above?

Sure.


2. Further, if de-boosting is important then maybe as was being discussed
in v1 of this patch (see the bottom of
https://patchwork.freedesktop.org/patch/485004/?series=103598=1) do
we need to use intel_guc_send_busy_loop() in the
intel_guc_slpc_dec_waiters() code path?


Using a busy_loop here would 
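
To make item (1) above concrete, moving the notice into slpc_force_min_freq()
so that both the boost and de-boost paths report a failed send might look
roughly like this sketch; the exact message text is an assumption:

static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
{
	struct drm_i915_private *i915 = slpc_to_i915(slpc);
	intel_wakeref_t wakeref;
	int ret = 0;

	/* Non-blocking request will avoid stalls */
	with_intel_runtime_pm(&i915->runtime_pm, wakeref)
		ret = slpc_set_param_nb(slpc,
					SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
					freq);
	if (ret)
		/* Message text is illustrative only */
		drm_notice(&i915->drm,
			   "Failed to send set param for min freq(%d): (%d)\n",
			   freq, ret);

	return ret;
}
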

Re: [Intel-gfx] [PATCH] drm/i915: Add global forcewake status to drpc

2022-06-17 Thread Belgaumkar, Vinay



On 6/17/2022 1:53 PM, Dixit, Ashutosh wrote:

On Fri, 17 Jun 2022 13:25:34 -0700, Vinay Belgaumkar wrote:

We have seen multiple RC6 issues where it is useful to know
which global forcewake bits are set. Add this to the 'drpc'
debugfs output.

A couple of optional nits below to look at but otherwise this is:

Reviewed-by: Ashutosh Dixit 


+static u32 mt_fwake_status(struct intel_uncore *uncore)
+{
+   return intel_uncore_read_fw(uncore, FORCEWAKE_MT);
+}
+
  static int vlv_drpc(struct seq_file *m)
  {
struct intel_gt *gt = m->private;
struct intel_uncore *uncore = gt->uncore;
-   u32 rcctl1, pw_status;
+   u32 rcctl1, pw_status, mt_fwake;

+   mt_fwake = mt_fwake_status(uncore);

I would get rid of the function and just duplicate the intel_uncore_read_fw().
Made it a function in case we can find the equivalent register for ILK. 
Though, I am not sure if ILK even had the concept of MT fwake.



pw_status = intel_uncore_read(uncore, VLV_GTLC_PW_STATUS);
rcctl1 = intel_uncore_read(uncore, GEN6_RC_CONTROL);

seq_printf(m, "RC6 Enabled: %s\n",
		   str_yes_no(rcctl1 & (GEN7_RC_CTL_TO_MODE |
					GEN6_RC_CTL_EI_MODE(1))));
+   seq_printf(m, "Multi-threaded Forcewake: 0x%x\n", mt_fwake);

Is "Multi-threaded Forcewake Request" (the Bspec register name) a more
descriptive print?

Same for gen6_drpc() below. Thanks!


Sure.

Thanks,

Vinay.
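
For reference, the more descriptive label Ashutosh suggests would simply
change the print to something like:

	seq_printf(m, "Multi-threaded Forcewake Request: 0x%x\n", mt_fwake);
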



Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Use non-blocking H2G for waitboost

2022-05-14 Thread Belgaumkar, Vinay



On 5/6/2022 9:43 AM, John Harrison wrote:

On 5/6/2022 00:18, Tvrtko Ursulin wrote:

On 05/05/2022 19:36, John Harrison wrote:

On 5/5/2022 10:21, Belgaumkar, Vinay wrote:

On 5/5/2022 5:13 AM, Tvrtko Ursulin wrote:

On 05/05/2022 06:40, Vinay Belgaumkar wrote:

SLPC min/max frequency updates require H2G calls. We are seeing
timeouts when GuC channel is backed up and it is unable to respond
in a timely fashion causing warnings and affecting CI.


Is it the "Unable to force min freq" error? Do you have a link to 
the GitLab issue to add to commit message?
We don't have a specific error for this one, but have seen similar 
issues with other H2G which are blocking.



This is seen when waitboosting happens during a stress test.
This patch updates the waitboost path to use a non-blocking
H2G call instead, which returns as soon as the message is
successfully transmitted.


AFAIU with this approach, when CT channel is congested, you 
instead achieve silent dropping of the waitboost request, right?

We are hoping it makes it, but just not waiting for it to complete.
We are not 'hoping it makes it'. We know for a fact that it will 
make it. We just don't know when. The issue is not about whether the 
waitboost request itself gets dropped/lost; it is about the ack that 
comes back. The GuC will process the message and it will send an 
ack. It's just a question of whether the i915 driver has given up 
waiting for it yet. And if it has, then you get the initial 'timed 
out waiting for ack' followed by a later 'got unexpected ack' message.


Whereas, if we make the call asynchronous, there is no ack. i915 
doesn't bother waiting and it won't get surprised later.


Also, note that this is only an issue when GuC itself is backed up. 
Normally that requires the creation/destruction of large numbers of 
contexts in rapid succession (context management is about the 
slowest thing we do with GuC). Some of the IGTs and selftests do 
that with thousands of contexts all at once. Those are generally 
where we see this kind of problem. It would be highly unlikely (but 
not impossible) to hit it in real world usage.


Goto ->

The general design philosophy of H2G messages is that asynchronous 
mode should be used for everything if at all possible. It is fire 
and forget and will all get processed in the order sent (same as 
batch buffer execution, really). Synchronous messages should only be 
used when an ack/status is absolutely required. E.g. start of day 
initialisation or things like TLB invalidation where we need to know 
that a cache has been cleared/flushed before updating memory from 
the CPU.


John.




It sounds like a potentially important feedback from the field to 
lose so easily. How about you added drm_notice to the worker when 
it fails?


Or simply a "one line patch" to replace i915_probe_error (!?) with 
drm_notice and keep the blocking behavior. (I have no idea what is 
the typical time to drain the CT buffer, and so to decide whether 
waiting or dropping makes more sense for effectiveness of 
waitboosting.)


Or since the congestion /should not/ happen in production, then 
the argument is why complicate with more code, in which case going 
with one line patch is an easy way forward?


Here. Where I did hint I understood the "should not happen in 
production angle".


So statement is GuC is congested in processing requests, but the h2g 
buffer is not congested so no chance intel_guc_send_nb() will fail 
with no space in that buffer? Sounds a bit un-intuitive.
That's two different things. The problem of no space in the H2G buffer 
is the same whether the call is sent blocking or non-blocking. The 
wait-for-space version is intel_guc_send_busy_loop() rather than 
intel_guc_send_nb(). NB: _busy_loop is a wrapper around _nb, so the 
wait-for-space version is also non-blocking ;). If a non-looping 
version is used (blocking or otherwise) it will return -EBUSY if there 
is no space. So both the original SLPC call and this non-blocking 
version will still get an immediate EBUSY return code if the H2G 
channel is backed up completely.


Whether the code should be handling EBUSY or not is another matter. 
Vinay, does anything higher up do a loop on EBUSY? If not, maybe it 
should be using the _busy_loop() call instead?


The blocking vs non-blocking is about waiting for a response if the 
command is successfully sent. The blocking case will sit and spin for 
a reply, the non-blocking assumes success and expects an asynchronous 
error report on failure. The assumption being that the call can't fail 
unless something is already broken - i915 sending invalid data to GuC 
for example. And thus any failure is in the BUG_ON category rather 
than the try again with a different approach and/or try again later 
category.


This is the point of the change. We are currently getting timeout 
errors when the H2G channel has space so the command can be sent, but 
the channel already contains a lo
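
To make the -EBUSY question above concrete: if nothing higher up retries, the
caller-side handling being debated would look something like this sketch,
reusing the names from the patch; the backoff values are arbitrary and purely
illustrative, and whether to do this here or switch to the _busy_loop()
helper is exactly the open question:

	/*
	 * Illustrative only: retry the non-blocking send while the CT
	 * buffer has no space; the sleep range is an arbitrary choice.
	 */
	do {
		ret = slpc_set_param_nb(slpc,
					SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
					freq);
		if (ret == -EBUSY)
			usleep_range(50, 100);
	} while (ret == -EBUSY);
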

Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Use non-blocking H2G for waitboost

2022-05-06 Thread Belgaumkar, Vinay



On 5/6/2022 12:18 AM, Tvrtko Ursulin wrote:


On 05/05/2022 19:36, John Harrison wrote:

On 5/5/2022 10:21, Belgaumkar, Vinay wrote:

On 5/5/2022 5:13 AM, Tvrtko Ursulin wrote:

On 05/05/2022 06:40, Vinay Belgaumkar wrote:

SLPC min/max frequency updates require H2G calls. We are seeing
timeouts when GuC channel is backed up and it is unable to respond
in a timely fashion causing warnings and affecting CI.


Is it the "Unable to force min freq" error? Do you have a link to 
the GitLab issue to add to commit message?
We don't have a specific error for this one, but have seen similar 
issues with other H2G which are blocking.



This is seen when waitboosting happens during a stress test.
This patch updates the waitboost path to use a non-blocking
H2G call instead, which returns as soon as the message is
successfully transmitted.


AFAIU with this approach, when CT channel is congested, you instead 
achieve silent dropping of the waitboost request, right?

We are hoping it makes it, but just not waiting for it to complete.
We are not 'hoping it makes it'. We know for a fact that it will make 
it. We just don't know when. The issue is not about whether the 
waitboost request itself gets dropped/lost; it is about the ack that 
comes back. The GuC will process the message and it will send an ack. 
It's just a question of whether the i915 driver has given up waiting 
for it yet. And if it has, then you get the initial 'timed out 
waiting for ack' followed by a later 'got unexpected ack' message.


Whereas, if we make the call asynchronous, there is no ack. i915 
doesn't bother waiting and it won't get surprised later.


Also, note that this is only an issue when GuC itself is backed up. 
Normally that requires the creation/destruction of large numbers of 
contexts in rapid succession (context management is about the slowest 
thing we do with GuC). Some of the IGTs and selftests do that with 
thousands of contexts all at once. Those are generally where we see 
this kind of problem. It would be highly unlikely (but not 
impossible) to hit it in real world usage.


Goto ->

The general design philosophy of H2G messages is that asynchronous 
mode should be used for everything if at all possible. It is fire and 
forget and will all get processed in the order sent (same as batch 
buffer execution, really). Synchronous messages should only be used 
when an ack/status is absolutely required. E.g. start of day 
initialisation or things like TLB invalidation where we need to know 
that a cache has been cleared/flushed before updating memory from the 
CPU.


John.




It sounds like a potentially important feedback from the field to 
lose so easily. How about you added drm_notice to the worker when 
it fails?


Or simply a "one line patch" to replace i915_probe_error (!?) with 
drm_notice and keep the blocking behavior. (I have no idea what is 
the typical time to drain the CT buffer, and so to decide whether 
waiting or dropping makes more sense for effectiveness of 
waitboosting.)


Or since the congestion /should not/ happen in production, then the 
argument is why complicate with more code, in which case going with 
one line patch is an easy way forward?


Here. Where I did hint I understood the "should not happen in 
production angle".


So statement is GuC is congested in processing requests, but the h2g 
buffer is not congested so no chance intel_guc_send_nb() will fail 
with no space in that buffer? Sounds a bit un-intuitive.


Anyway, it sounds okay to me to use the non-blocking, but I would like 
to see some logging if the unexpected does happen. Hence I was 
suggesting the option of adding drm_notice logging if the send fails 
from the worker. (Because I think other callers would already 
propagate the error, like sysfs.)


  err = slpc_force_min_freq(slpc, slpc->boost_freq);
  if (!err)
   slpc->num_boosts++;
  else
   drm_notice(... "Failed to send waitboost request (%d)", err);


Ok, makes sense. Will send out another rev with this change.

Thanks,

Vinay.




Something like that.

Regards,

Tvrtko


Even if we soften the blow here, the actual timeout error occurs in 
the intel_guc_ct.c code, so we cannot hide that error anyway. 
Making this call non-blocking will achieve both things.


Thanks,

Vinay.



Regards,

Tvrtko


Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 38 
-

  1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c

index 1db833da42df..c852f73cf521 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -98,6 +98,30 @@ static u32 slpc_get_state(struct intel_guc_slpc 
*slpc)

  return data->header.global_state;
  }
  +static int guc_action_slpc_set_param_nb(struct intel_guc *guc, 
u8 id, u32 value)

+{
+    u32 reques
