Re: [Intel-gfx] [PATCH 6/7] drm/i915/guc: Make GuC log sizes runtime configurable

2022-08-14 Thread Teres Alexis, Alan Previn
Looks good to me. 

Reviewed-by: Alan Previn 


On Wed, 2022-07-27 at 19:20 -0700, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> The GuC log buffer sizes had to be configured statically at compile
> time. This can be quite troublesome when needing to get larger logs
> out of a released driver. So re-organise the code to allow a boot time
> module parameter override.
> 
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc.c|  53 ++
>  .../gpu/drm/i915/gt/uc/intel_guc_capture.c|  14 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c| 176 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.h|  42 +++--
>  drivers/gpu/drm/i915/i915_params.c|  12 ++
>  drivers/gpu/drm/i915/i915_params.h|   3 +
>  6 files changed, 226 insertions(+), 74 deletions(-)
> 
...
> +static s32 scale_log_param(struct intel_guc_log *log, const struct 
> guc_log_section *section,
> +s32 param)
> +{
> + /* -1 means default */
> + if (param < 0)
> + return section->default_val;
> +
> + /* Check for 32-bit overflow */
> + if (param >= SZ_4K) {
> + drm_err(&guc_to_gt(log_to_guc(log))->i915->drm, "Size too large 
> for GuC %s log: %dMB!",
> + section->name, param);
> + return section->default_val;
> + }
> +
> + /* Param units are 1MB */
> + return param * SZ_1M;
> +}
> +
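
For readers following the units, here is a minimal user-space sketch of the scaling logic quoted above. It is not the driver code: `scale_log_param_sketch`, `SKETCH_SZ_1M` and `SKETCH_SZ_4K` are hypothetical stand-ins for the kernel names, and the result is widened to 64 bits purely so the arithmetic is easy to check outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's SZ_1M and SZ_4K constants. */
#define SKETCH_SZ_1M (1024 * 1024)
#define SKETCH_SZ_4K 4096

/*
 * Convert a module parameter given in units of 1MB into a byte count.
 * A negative parameter selects the default; a value of 4096MB or more
 * is rejected and also falls back to the default.
 */
static int64_t scale_log_param_sketch(int32_t param, int64_t default_bytes)
{
	assert(default_bytes >= 0);

	/* -1 (or any negative value) means "use the default" */
	if (param < 0)
		return default_bytes;

	/* 4096 * 1MB would no longer fit in 32 bits */
	if (param >= SKETCH_SZ_4K)
		return default_bytes;

	return (int64_t)param * SKETCH_SZ_1M;
}
```

With a 2MB default, a parameter of 8 yields 8388608 bytes, while -1 and 4096 both fall back to the default.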





[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [v4,1/2] drm/i915/selftests: Use correct selfest calls for live tests

2022-08-14 Thread Patchwork
== Series Details ==

Series: series starting with [v4,1/2] drm/i915/selftests: Use correct selfest 
calls for live tests
URL   : https://patchwork.freedesktop.org/series/107259/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11986 -> Patchwork_107259v1


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_107259v1 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_107259v1, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/index.html

Participating hosts (38 -> 28)
--

  Missing(10): bat-dg1-5 bat-dg2-8 bat-adlm-1 bat-dg2-9 bat-adlp-6 
fi-hsw-4770 bat-rplp-1 bat-rpls-1 bat-dg2-10 bat-jsl-1 

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_107259v1:

### IGT changes ###

 Possible regressions 

  * igt@core_auth@basic-auth:
- fi-rkl-guc: [PASS][1] -> [TIMEOUT][2] +4 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11986/fi-rkl-guc/igt@core_a...@basic-auth.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/fi-rkl-guc/igt@core_a...@basic-auth.html
- bat-dg1-6:  [PASS][3] -> [TIMEOUT][4] +1 similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11986/bat-dg1-6/igt@core_a...@basic-auth.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/bat-dg1-6/igt@core_a...@basic-auth.html
- bat-adlp-4: [PASS][5] -> [TIMEOUT][6] +1 similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11986/bat-adlp-4/igt@core_a...@basic-auth.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/bat-adlp-4/igt@core_a...@basic-auth.html

  * igt@debugfs_test@read_all_entries:
- bat-adlp-4: [PASS][7] -> [INCOMPLETE][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11986/bat-adlp-4/igt@debugfs_test@read_all_entries.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/bat-adlp-4/igt@debugfs_test@read_all_entries.html
- bat-dg1-6:  [PASS][9] -> [INCOMPLETE][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11986/bat-dg1-6/igt@debugfs_test@read_all_entries.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/bat-dg1-6/igt@debugfs_test@read_all_entries.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@debugfs_test@read_all_entries:
- {bat-rpls-2}:   [PASS][11] -> [INCOMPLETE][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11986/bat-rpls-2/igt@debugfs_test@read_all_entries.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/bat-rpls-2/igt@debugfs_test@read_all_entries.html

  * igt@i915_module_load@load:
- {bat-rpls-2}:   [PASS][13] -> [TIMEOUT][14] +1 similar issue
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11986/bat-rpls-2/igt@i915_module_l...@load.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/bat-rpls-2/igt@i915_module_l...@load.html

  
Known issues


  Here are the changes found in Patchwork_107259v1 that come from known issues:

### IGT changes ###

  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#5153]: https://gitlab.freedesktop.org/drm/intel/issues/5153


Build changes
-

  * Linux: CI_DRM_11986 -> Patchwork_107259v1

  CI-20190529: 20190529
  CI_DRM_11986: 1cb5379e17f93685065d8ec5f1baf9386ffe @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6623: c8edfca649da71b296d882bb0319181d94e619eb @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_107259v1: 1cb5379e17f93685065d8ec5f1baf9386ffe @ 
git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

8563a7912191 drm/i915/guc: Add delay to disable scheduling after pin count goes 
to zero
53f012bc3f6f drm/i915/selftests: Use correct selfest calls for live tests

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107259v1/index.html


Re: [Intel-gfx] [PATCH 3/7] drm/i915/guc: Add GuC <-> kernel time stamp translation information

2022-08-14 Thread Teres Alexis, Alan Previn
Sounds good - thanks. (ignore the "doing the opposite" comment).

Reviewed-by: Alan Previn 

On Mon, 2022-08-08 at 11:43 -0700, Harrison, John C wrote:
> On 8/4/2022 17:40, Teres Alexis, Alan Previn wrote:
> > I have a question on below code. Everything else looked good.
> > Will r-b as soon as we can close on below question
> > ...alan
> > 
> > 
> > On Wed, 2022-07-27 at 19:20 -0700, john.c.harri...@intel.com wrote:
> > > From: John Harrison 
> > > 
> > > It is useful to be able to match GuC events to kernel events when
> > > looking at the GuC log. That requires being able to convert GuC
> > > timestamps to kernel time. So, when dumping error captures and/or GuC
> > > logs, include a stamp in both time zones plus the clock frequency.
> > > 
> > > Signed-off-by: John Harrison 
> > > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > @@ -1675,6 +1678,13 @@ gt_record_uc(struct intel_gt_coredump *gt,
> > >*/
> > >   error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL);
> > >   error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL);
> > > +
> > > + /*
> > > +  * Save the GuC log and include a timestamp reference for converting the
> > > +  * log times to system times (in conjunction with the error->boottime 
> > > and
> > > +  * gt->clock_frequency fields saved elsewhere).
> > > +  */
> > > + error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, 
> > > GUCPMTIMESTAMP);
> > Alan: this register is in the GUC-SHIM domain, so unless I am mistaken you 
> > might need to ensure we hold a wakeref so
> > that we are getting a live value of the real timestamp register that this 
> > register is mirroring and not a stale snapshot.
> > Or was this already done farther up the stack? Or are we doing the opposite, 
> > in which case we should ensure we drop all
> > wakerefs prior to this point (which I am not sure is a reliable method, 
> > since we wouldn't know what the GuC ref was at).
> The intel_uncore_read() does a forcewake acquire implicitly.
> 
> Not sure what you mean about dropping all wakerefs prior to this point?
> 
> John.
> 
> > >   error_uc->guc_log = create_vma_coredump(gt->_gt, 
> > > uc->guc.log.vma,
> > >   "GuC log buffer", 
> > > compress);
> > >   
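
To make the conversion described in the commit message concrete, here is a minimal sketch assuming the dump provides one reference pair (a GuC timestamp and the kernel time captured at the same moment) plus the clock frequency in Hz. The struct and function names are hypothetical, counter wraparound is ignored, very large deltas could overflow the intermediate product, and none of this is code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference data, as might be read from an error capture. */
struct ts_reference {
	uint64_t guc_ticks;	/* GuC timestamp at capture time */
	int64_t kernel_ns;	/* kernel time at the same moment, in ns */
	uint64_t clock_hz;	/* GuC timestamp clock frequency */
};

/* Translate a GuC log timestamp into kernel time (ns). */
static int64_t guc_ticks_to_kernel_ns(const struct ts_reference *ref,
				      uint64_t event_ticks)
{
	int64_t delta_ticks;

	assert(ref->clock_hz > 0);

	/* Signed delta: log events normally precede the capture. */
	delta_ticks = (int64_t)(event_ticks - ref->guc_ticks);

	return ref->kernel_ns +
	       delta_ticks * 1000000000LL / (int64_t)ref->clock_hz;
}
```

An event logged 192 ticks before the reference on a 19.2MHz clock, for instance, maps to 10000ns before the reference kernel time.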



[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [v4,1/2] drm/i915/selftests: Use correct selfest calls for live tests

2022-08-14 Thread Patchwork
== Series Details ==

Series: series starting with [v4,1/2] drm/i915/selftests: Use correct selfest 
calls for live tests
URL   : https://patchwork.freedesktop.org/series/107259/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.




[Intel-gfx] [Intel-gfx v4 2/2] drm/i915/guc: Add delay to disable scheduling after pin count goes to zero

2022-08-14 Thread Alan Previn
From: Matthew Brost 

Add a delay, configurable via debugfs (default 34ms), to disable
scheduling of a context after the pin count goes to zero. Disable
scheduling is a somewhat costly operation so the idea is that a delay
allows the user to resubmit something before doing this operation.
This delay is only done if the context isn't closed and less than 3/4
of total guc_ids are in use.
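
The two conditions in the last sentence can be read as a simple predicate. The sketch below only illustrates that rule; the function and parameter names are hypothetical, and the real driver logic in intel_guc_submission.c is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether the disable-scheduling request should be delayed.
 * The delay applies only when the context is still open and fewer than
 * 3/4 of the total guc_ids are in use.
 */
static bool should_delay_sched_disable(bool context_closed,
				       unsigned int guc_ids_in_use,
				       unsigned int guc_ids_total)
{
	assert(guc_ids_in_use <= guc_ids_total);

	if (context_closed)
		return false;

	/* in_use / total < 3/4, written without division. */
	return 4ULL * guc_ids_in_use < 3ULL * guc_ids_total;
}
```

With 1000 guc_ids in total, an open context is delayed at 749 ids in use but not at 750.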

As a temporary WA, disable this feature for the selftests. Selftests are
very timing sensitive and any change in timing can cause failure. A
follow up patch will fixup the selftests to understand this delay.

Alan Previn: Matt Brost first introduced this series back in Oct 2021.
However no real world workload with measured performance impact was
available to prove the intended results. Today, this series is being
republished in response to a real world workload that benefited greatly
from it along with measured performance improvement.

Workload description: 36 containers were created on a DG2 device where
each container was performing a combination of 720p 3d game rendering
and 30fps video encoding. The workload density was configured in a way
that guaranteed each container to ALWAYS be able to render and
encode no less than 30fps with a predefined maximum render + encode
latency time. That means that the totality of all 36 containers and their
workloads were not saturating the utilized hw engines to their max
(in order to maintain just enough headroom to meet the min fps and
max latencies of incoming container submissions).

Problem statement: It was observed that the CPU utilization of the CPU
core that was pinned to i915 soft IRQ work was experiencing severe load.
Using tracelogs and an instrumentation patch to count specific i915 IRQ
events, it was confirmed that the majority of the CPU cycles were caused
by the gen11_other_irq_handler() -> guc_irq_handler() code path. The vast
majority of the cycles was determined to be processing a specific G2H
IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent
by GuC in response to i915 KMD sending H2G requests:
INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent
whenever a context is idle so that we can unpin the context from GuC.
The high CPU utilization % symptom was limiting density scaling.

Root Cause Analysis: Because the incoming execution buffers were spread
across 36 different containers (each with multiple contexts) but the
system in totality was NOT saturated to the max, it was assumed that each
context was constantly idling between submissions. This was causing
a thrashing of unpinning contexts from GuC at one moment, followed quickly
by repinning them due to incoming workload the very next moment. These
event-pairs were being triggered across multiple contexts per container,
across all containers at the rate of > 30 times per sec per context.

Metrics: When running this workload without this patch, we measured an
average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10
seconds or ~10 million times over ~25+ mins. With this patch, the count
reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The
improvement observed is ~99% for the average counts per 10 seconds.
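
As a quick check of the figures above, the following sketch recomputes the reduction and the totals from the quoted per-10-second averages. The inputs are the approximate values from the commit message; the helper names are hypothetical.

```c
#include <assert.h>

/* Percentage reduction going from `before` to `after` events per interval. */
static double reduction_percent(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Total events over `minutes`, given an average count per 10 seconds. */
static double total_events(double per_10s, double minutes)
{
	/* number of 10-second intervals in the given number of minutes */
	return per_10s * (minutes * 60.0 / 10.0);
}
```

Calling `reduction_percent(69000, 480)` gives roughly 99.3, `total_events(69000, 25)` gives 10350000 and `total_events(480, 10)` gives 28800, which agrees with the "~99%", "~10 million" and "~28K" figures quoted.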

Signed-off-by: Matthew Brost 
Signed-off-by: Alan Previn 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_context.h   |   9 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |   7 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  18 +++
 .../gpu/drm/i915/gt/uc/intel_guc_debugfs.c|  57 +++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 150 +++---
 drivers/gpu/drm/i915/i915_selftest.h  |   2 +
 drivers/gpu/drm/i915/i915_trace.h |  10 ++
 8 files changed, 229 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index dabdfe09f5e5..df7fd1b019ec 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1454,7 +1454,7 @@ static void engines_idle_release(struct i915_gem_context 
*ctx,
int err;
 
/* serialises with execbuf */
-   set_bit(CONTEXT_CLOSED_BIT, &ce->flags);
+   intel_context_close(ce);
if (!intel_context_pin_if_active(ce))
continue;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 8e2d70630c49..7cc4bb9ad042 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -276,6 +276,15 @@ static inline bool intel_context_is_barrier(const struct 
intel_context *ce)
return test_bit(CONTEXT_BARRIER_BIT, &ce->flags);
 }
 
+static inline void intel_context_close(struct intel_context *ce)
+{
+   set_bit(CONTEXT_CLOSED_BIT, &ce->flags);
+
+   trace_intel_context_close(ce);
+   if (ce->ops->close)
+   ce->ops->close(ce);
+}
+
 static 

[Intel-gfx] [Intel-gfx v4 1/2] drm/i915/selftests: Use correct selfest calls for live tests

2022-08-14 Thread Alan Previn
From: Matthew Brost 

This will help in an upcoming patch where the live selftest wrappers
are extended to do more.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c | 2 +-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c| 2 +-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c  | 2 +-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c| 2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c   | 2 +-
 drivers/gpu/drm/i915/selftests/i915_perf.c  | 2 +-
 drivers/gpu/drm/i915/selftests/i915_request.c   | 2 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c   | 2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 13b088cc787e..a666d7e610f5 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -434,5 +434,5 @@ int i915_gem_coherency_live_selftests(struct 
drm_i915_private *i915)
SUBTEST(igt_gem_coherency),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
index 62c61af77a42..51ed824b020c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
@@ -476,5 +476,5 @@ int i915_gem_dmabuf_live_selftests(struct drm_i915_private 
*i915)
SUBTEST(igt_dmabuf_import_same_driver_lmem_smem),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 3ced9948a331..3cff08f04f6c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -1844,5 +1844,5 @@ int i915_gem_mman_live_selftests(struct drm_i915_private 
*i915)
SUBTEST(igt_mmap_gpu),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
index fe0a890775e2..bdf5bb40ccf1 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
@@ -95,5 +95,5 @@ int i915_gem_object_live_selftests(struct drm_i915_private 
*i915)
SUBTEST(igt_gem_huge),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index ab9f17fc85bc..fb5e61963479 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -2324,5 +2324,5 @@ int i915_gem_gtt_live_selftests(struct drm_i915_private 
*i915)
 
GEM_BUG_ON(offset_in_page(to_gt(i915)->ggtt->vm.total));
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c 
b/drivers/gpu/drm/i915/selftests/i915_perf.c
index 88db2e3d81d0..429c6d73b159 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf.c
+++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
@@ -431,7 +431,7 @@ int i915_perf_live_selftests(struct drm_i915_private *i915)
if (err)
return err;
 
-   err = i915_subtests(tests, i915);
+   err = i915_live_subtests(tests, i915);
 
destroy_empty_config(&i915->perf);
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c 
b/drivers/gpu/drm/i915/selftests/i915_request.c
index ec05f578a698..818a4909c1f3 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -1821,7 +1821,7 @@ int i915_request_live_selftests(struct drm_i915_private 
*i915)
if (intel_gt_is_wedged(to_gt(i915)))
return 0;
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
 
 static int switch_to_kernel_sync(struct intel_context *ce, int err)
diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c 
b/drivers/gpu/drm/i915/selftests/i915_vma.c
index 6921ba128015..e3821398a5b0 100644
--- a/drivers/gpu/drm/i915/selftests/i915_vma.c
+++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
@@ -1103,5 +1103,5 @@ int i915_vma_live_selftests(struct drm_i915_private *i915)
SUBTEST(igt_vma_remapped_gtt),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
-- 
2.25.1



[Intel-gfx] [PULL] gvt-fixes

2022-08-14 Thread Zhenyu Wang

Hi,

Here's one gvt-fixes pull for 6.0-rc. The major one is the Comet Lake regression
fix for the mmio table rework; the others are leftover kernel-doc fixes that
had not been pushed yet.

Thanks
--
The following changes since commit a7a47a5dfa9a9692a41764ee9ab4054f12924a42:

  drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend 
(2022-07-25 15:57:54 +0100)

are available in the Git repository at:

  https://github.com/intel/gvt-linux.git tags/gvt-fixes-2022-08-15

for you to fetch changes up to 394f0560a76298842defd1d95bd64b203a5fdcc4:

  drm/i915/gvt: Fix Comet Lake (2022-08-15 10:54:03 +0800)


gvt-fixes-2022-08-15

- CometLake regression fix in mmio table rework (Alex)
- misc kernel doc and typo fixes


Alex Williamson (1):
  drm/i915/gvt: Fix Comet Lake

Colin Ian King (1):
  drm/i915/reg: Fix spelling mistake "Unsupport" -> "Unsupported"

Jiapeng Chong (3):
  drm/i915/gvt: Fix kernel-doc
  drm/i915/gvt: Fix kernel-doc
  drm/i915/gvt: Fix kernel-doc

Julia Lawall (1):
  drm/i915/gvt: fix typo in comment

 drivers/gpu/drm/i915/gvt/aperture_gm.c  | 4 ++--
 drivers/gpu/drm/i915/gvt/gtt.c  | 2 +-
 drivers/gpu/drm/i915/gvt/handlers.c | 4 ++--
 drivers/gpu/drm/i915/gvt/mmio_context.c | 2 +-
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 3 ++-
 5 files changed, 8 insertions(+), 7 deletions(-)




Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Allow SLPC to use efficient frequency

2022-08-14 Thread Belgaumkar, Vinay



On 8/14/2022 4:46 PM, Vinay Belgaumkar wrote:

Host Turbo operates at efficient frequency when GT is not idle unless
the user or workload has forced it to a higher level. Replicate the same
behavior in SLPC by allowing the algorithm to use efficient frequency.
We had disabled it during boot due to concerns that it might break
kernel ABI for min frequency. However, this is not the case since
SLPC will still abide by the (min,max) range limits.

With this change, min freq will be at efficient frequency level at init
instead of fused min (RPn). If user chooses to reduce min freq below the
efficient freq, we will turn off usage of efficient frequency and honor
the user request. When a higher value is written, it will get toggled
back again.
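
The min-frequency behaviour described above amounts to a small decision rule. The sketch below models only that rule in plain C; the struct, field and function names are hypothetical, and the actual patch talks to GuC through slpc_set_param()/slpc_unset_param(), which are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state the rule depends on. */
struct slpc_model {
	uint32_t efficient_freq_mhz;	/* efficient frequency (RPe) */
	bool ignore_efficient_freq;	/* true: efficient freq is not used */
	uint32_t min_freq_mhz;		/* last requested min frequency */
};

/*
 * Apply a user request for a new min frequency. Requests below the
 * efficient frequency turn its usage off; requests at or above it
 * turn it back on.
 */
static void model_set_min_freq(struct slpc_model *m, uint32_t val_mhz)
{
	assert(m);

	m->ignore_efficient_freq = val_mhz < m->efficient_freq_mhz;
	m->min_freq_mhz = val_mhz;
}
```

Starting from an efficient frequency of 700MHz, a request for 300MHz sets the ignore flag and a later request for 900MHz clears it again.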

The patch also corrects the register which needs to be read for obtaining
the correct efficient frequency for Gen9+.

We see much better perf numbers with benchmarks like glmark2 with
efficient frequency usage enabled as expected.

BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/5468

Cc: Rodrigo Vivi 
Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c |  3 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 66 +++--
  drivers/gpu/drm/i915/intel_mchbar_regs.h|  3 +
  3 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index c7d381ad90cf..281a086fc265 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1108,6 +1108,9 @@ void gen6_rps_get_freq_caps(struct intel_rps *rps, struct 
intel_rps_freq_caps *c
} else {
caps->rp0_freq = (rp_state_cap >>  0) & 0xff;
caps->rp1_freq = (rp_state_cap >>  8) & 0xff;


Forgot to remove old code here. Will do so for the next revision as it 
does not affect the patch.


Thanks,

Vinay.


+   caps->rp1_freq = REG_FIELD_GET(RPE_MASK,
+  
intel_uncore_read(to_gt(i915)->uncore,
+  GEN10_FREQ_INFO_REC));
caps->min_freq = (rp_state_cap >> 16) & 0xff;
}
  
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c

index e1fa1f32f29e..70a2af5f518d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -465,6 +465,29 @@ int intel_guc_slpc_get_max_freq(struct intel_guc_slpc 
*slpc, u32 *val)
return ret;
  }
  
+static int slpc_ignore_eff_freq(struct intel_guc_slpc *slpc, bool ignore)

+{
+   int ret = 0;
+
+   if (ignore) {
+   ret = slpc_set_param(slpc,
+SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
+ignore);
+   if (!ret)
+   return slpc_set_param(slpc,
+ 
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+ slpc->min_freq);
+   } else {
+   ret = slpc_unset_param(slpc,
+  SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY);
+   if (!ret)
+   return slpc_unset_param(slpc,
+   
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ);
+   }
+
+   return ret;
+}
+
  /**
   * intel_guc_slpc_set_min_freq() - Set min frequency limit for SLPC.
   * @slpc: pointer to intel_guc_slpc.
@@ -491,6 +514,14 @@ int intel_guc_slpc_set_min_freq(struct intel_guc_slpc 
*slpc, u32 val)
  
  	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
  
+		/* Ignore efficient freq if lower min freq is requested */

+   ret = slpc_ignore_eff_freq(slpc, val < slpc->rp1_freq);
+   if (unlikely(ret)) {
+   i915_probe_error(i915, "Failed to toggle efficient freq 
(%pe)\n",
+ERR_PTR(ret));
+   return ret;
+   }
+
ret = slpc_set_param(slpc,
 SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
 val);
@@ -587,7 +618,9 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
return ret;
  
  	if (!slpc->min_freq_softlimit) {

-   slpc->min_freq_softlimit = slpc->min_freq;
+   ret = intel_guc_slpc_get_min_freq(slpc, &slpc->min_freq_softlimit);
+   if (unlikely(ret))
+   return ret;
slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
} else if (slpc->min_freq_softlimit != slpc->min_freq) {
return intel_guc_slpc_set_min_freq(slpc,
@@ -597,29 +630,6 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
return 0;
  }
  
-static int slpc_ignore_eff_freq(struct intel_guc_slpc *slpc, bool ignore)

-{
-   int 

[Intel-gfx] ✗ Fi.CI.BUILD: failure for Delay disabling scheduling on a context (rev3)

2022-08-14 Thread Patchwork
== Series Details ==

Series: Delay disabling scheduling on a context (rev3)
URL   : https://patchwork.freedesktop.org/series/96167/
State : failure

== Summary ==

Error: make failed
  CALLscripts/checksyscalls.sh
  CALLscripts/atomic/check-atomics.sh
  DESCEND objtool
  CHK include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.o
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:4292:5: error: no previous 
prototype for ‘__guc_get_non_lrc_num_guc_ids’ [-Werror=missing-prototypes]
 int __guc_get_non_lrc_num_guc_ids(struct intel_guc *guc)
 ^
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:4297:5: error: no previous 
prototype for ‘__guc_get_sched_disable_gucid_threshold_default’ 
[-Werror=missing-prototypes]
 int __guc_get_sched_disable_gucid_threshold_default(struct intel_guc *guc)
 ^~~
cc1: all warnings being treated as errors
scripts/Makefile.build:249: recipe for target 
'drivers/gpu/drm/i915/gt/uc/intel_guc_submission.o' failed
make[4]: *** [drivers/gpu/drm/i915/gt/uc/intel_guc_submission.o] Error 1
scripts/Makefile.build:466: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:466: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:466: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1843: recipe for target 'drivers' failed
make: *** [drivers] Error 2
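
For context, -Werror=missing-prototypes fires when a non-static function is defined without a prior declaration. The sketch below shows the two usual remedies on made-up functions; the names are hypothetical and this is not necessarily the fix that was applied to the series.

```c
#include <assert.h>

/* Remedy 1: provide a prototype, normally placed in a shared header. */
int example_get_threshold(int total_ids);

int example_get_threshold(int total_ids)
{
	assert(total_ids >= 0);
	return total_ids / 4 * 3;
}

/* Remedy 2: mark file-local helpers static, so no prototype is needed. */
static int example_local_helper(int total_ids)
{
	return example_get_threshold(total_ids) / 2;
}
```

Which remedy fits depends on whether the function is meant to be called from other files (prototype in a header) or only within its own file (static).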




[Intel-gfx] [Intel-gfx 2/2] drm/i915/guc: Add delay to disable scheduling after pin count goes to zero

2022-08-14 Thread Alan Previn
From: Matthew Brost 

Add a delay, configurable via debugfs (default 34ms), to disable
scheduling of a context after the pin count goes to zero. Disable
scheduling is a somewhat costly operation so the idea is that a delay
allows the user to resubmit something before doing this operation.
This delay is only done if the context isn't closed and less than 3/4
of total guc_ids are in use.

As a temporary WA, disable this feature for the selftests. Selftests are
very timing sensitive and any change in timing can cause failure. A
follow up patch will fixup the selftests to understand this delay.

Alan Previn: Matt Brost first introduced this series back in Oct 2021.
However no real world workload with measured performance impact was
available to prove the intended results. Today, this series is being
republished in response to a real world workload that benefited greatly
from it along with measured performance improvement.

Workload description: 36 containers were created on a DG2 device where
each container was performing a combination of 720p 3d game rendering
and 30fps video encoding. The workload density was configured in a way
that guaranteed each container to ALWAYS be able to render and
encode no less than 30fps with a predefined maximum render + encode
latency time. That means that the totality of all 36 containers and their
workloads were not saturating the utilized hw engines to their max
(in order to maintain just enough headroom to meet the min fps and
max latencies of incoming container submissions).

Problem statement: It was observed that the CPU utilization of the CPU
core that was pinned to i915 soft IRQ work was experiencing severe load.
Using tracelogs and an instrumentation patch to count specific i915 IRQ
events, it was confirmed that the majority of the CPU cycles were caused
by the gen11_other_irq_handler() -> guc_irq_handler() code path. The vast
majority of the cycles was determined to be processing a specific G2H
IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent
by GuC in response to i915 KMD sending H2G requests:
INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent
whenever a context is idle so that we can unpin the context from GuC.
The high CPU utilization % symptom was limiting density scaling.

Root Cause Analysis: Because the incoming execution buffers were spread
across 36 different containers (each with multiple contexts) but the
system in totality was NOT saturated to the max, it was assumed that each
context was constantly idling between submissions. This was causing
a thrashing of unpinning contexts from GuC at one moment, followed quickly
by repinning them due to incoming workload the very next moment. These
event-pairs were being triggered across multiple contexts per container,
across all containers at the rate of > 30 times per sec per context.

Metrics: When running this workload without this patch, we measured an
average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10
seconds or ~10 million times over ~25+ mins. With this patch, the count
reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The
improvement observed is ~99% for the average counts per 10 seconds.

Signed-off-by: Matthew Brost 
Signed-off-by: Alan Previn 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_context.h   |   9 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |   7 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  18 +++
 .../gpu/drm/i915/gt/uc/intel_guc_debugfs.c|  57 +++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 150 +++---
 drivers/gpu/drm/i915/i915_selftest.h  |   2 +
 drivers/gpu/drm/i915/i915_trace.h |  10 ++
 8 files changed, 229 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index dabdfe09f5e5..df7fd1b019ec 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1454,7 +1454,7 @@ static void engines_idle_release(struct i915_gem_context 
*ctx,
int err;
 
/* serialises with execbuf */
-   set_bit(CONTEXT_CLOSED_BIT, &ce->flags);
+   intel_context_close(ce);
if (!intel_context_pin_if_active(ce))
continue;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 8e2d70630c49..7cc4bb9ad042 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -276,6 +276,15 @@ static inline bool intel_context_is_barrier(const struct 
intel_context *ce)
return test_bit(CONTEXT_BARRIER_BIT, &ce->flags);
 }
 
+static inline void intel_context_close(struct intel_context *ce)
+{
+   set_bit(CONTEXT_CLOSED_BIT, &ce->flags);
+
+   trace_intel_context_close(ce);
+   if (ce->ops->close)
+   ce->ops->close(ce);
+}
+
 static 

[Intel-gfx] [Intel-gfx 0/2] Delay disabling scheduling on a context

2022-08-14 Thread Alan Previn
This is a revival of the same series posted by Matthew Brost
back in October 2021 (https://patchwork.freedesktop.org/series/96167/).
Additional real world measured metrics are included this time around
that have proven the effectiveness of this series.

This series adds a delay before disabling scheduling of the guc-context
when a context has become idle. The 2nd patch should explain it quite well.

This is the 3rd rev of this series (counting from the first
version by Matt). Changes from prior revs:

  v3: Differentiate and appropriately name helper functions for getting
  the 'default threshold of num-guc-ids' vs the 'max threshold of
  num-guc-ids' for bypassing sched-disable and use the correct one
  for the debugfs validation (John Harrison).
  v2: Changed the default of the schedule-disable delay to 34 milliseconds
  and added debugfs to control this timing knob. Also added a debugfs
  to control the bypass for not delaying the schedule-disable if
  we are under pressure with a very low balance of remaining
  guc-ids. (John Harrison).

  v1: Drop the trace log for intel_context_close (Chris Wilson).
  Add "Tested-by" into patch-2 (Chris Wilson)
  Add JIRA number into patch-0 (Chris Wilson).
  Summarise patch-2's problem and metrics into
  cover letter (Chris Wilson).

Matthew Brost (2):
  drm/i915/selftests: Use correct selfest calls for live tests
  drm/i915/guc: Add delay to disable scheduling after pin count goes to
zero

 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
 .../i915/gem/selftests/i915_gem_coherency.c   |   2 +-
 .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |   2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|   2 +-
 .../drm/i915/gem/selftests/i915_gem_object.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_context.h   |   9 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |   7 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  18 +++
 .../gpu/drm/i915/gt/uc/intel_guc_debugfs.c|  57 +++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 150 +++---
 drivers/gpu/drm/i915/i915_selftest.h  |   2 +
 drivers/gpu/drm/i915/i915_trace.h |  10 ++
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   2 +-
 drivers/gpu/drm/i915/selftests/i915_perf.c|   2 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |   2 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c |   2 +-
 16 files changed, 237 insertions(+), 34 deletions(-)


base-commit: 1cb5379e17f93685065d8ec5f1baf9386ffe
-- 
2.25.1



[Intel-gfx] [Intel-gfx 1/2] drm/i915/selftests: Use correct selfest calls for live tests

2022-08-14 Thread Alan Previn
From: Matthew Brost 

This will help in an upcoming patch where the live selftest wrappers
are extended to do more.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c | 2 +-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c| 2 +-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c  | 2 +-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c| 2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c   | 2 +-
 drivers/gpu/drm/i915/selftests/i915_perf.c  | 2 +-
 drivers/gpu/drm/i915/selftests/i915_request.c   | 2 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c   | 2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 13b088cc787e..a666d7e610f5 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -434,5 +434,5 @@ int i915_gem_coherency_live_selftests(struct 
drm_i915_private *i915)
SUBTEST(igt_gem_coherency),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
index 62c61af77a42..51ed824b020c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
@@ -476,5 +476,5 @@ int i915_gem_dmabuf_live_selftests(struct drm_i915_private 
*i915)
SUBTEST(igt_dmabuf_import_same_driver_lmem_smem),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 3ced9948a331..3cff08f04f6c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -1844,5 +1844,5 @@ int i915_gem_mman_live_selftests(struct drm_i915_private 
*i915)
SUBTEST(igt_mmap_gpu),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
index fe0a890775e2..bdf5bb40ccf1 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
@@ -95,5 +95,5 @@ int i915_gem_object_live_selftests(struct drm_i915_private 
*i915)
SUBTEST(igt_gem_huge),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index ab9f17fc85bc..fb5e61963479 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -2324,5 +2324,5 @@ int i915_gem_gtt_live_selftests(struct drm_i915_private 
*i915)
 
GEM_BUG_ON(offset_in_page(to_gt(i915)->ggtt->vm.total));
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c 
b/drivers/gpu/drm/i915/selftests/i915_perf.c
index 88db2e3d81d0..429c6d73b159 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf.c
+++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
@@ -431,7 +431,7 @@ int i915_perf_live_selftests(struct drm_i915_private *i915)
if (err)
return err;
 
-   err = i915_subtests(tests, i915);
+   err = i915_live_subtests(tests, i915);
 
destroy_empty_config(&i915->perf);
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c 
b/drivers/gpu/drm/i915/selftests/i915_request.c
index ec05f578a698..818a4909c1f3 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -1821,7 +1821,7 @@ int i915_request_live_selftests(struct drm_i915_private 
*i915)
if (intel_gt_is_wedged(to_gt(i915)))
return 0;
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
 
 static int switch_to_kernel_sync(struct intel_context *ce, int err)
diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c 
b/drivers/gpu/drm/i915/selftests/i915_vma.c
index 6921ba128015..e3821398a5b0 100644
--- a/drivers/gpu/drm/i915/selftests/i915_vma.c
+++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
@@ -1103,5 +1103,5 @@ int i915_vma_live_selftests(struct drm_i915_private *i915)
SUBTEST(igt_vma_remapped_gtt),
};
 
-   return i915_subtests(tests, i915);
+   return i915_live_subtests(tests, i915);
 }
-- 
2.25.1