[Intel-gfx] ✓ Fi.CI.BAT: success for Introduce drm scaling filter property (rev7)

2020-08-02 Thread Patchwork
== Series Details ==

Series: Introduce drm scaling filter property (rev7)
URL   : https://patchwork.freedesktop.org/series/73883/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8832 -> Patchwork_18296


Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/index.html

Known issues
------------

  Here are the changes found in Patchwork_18296 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_cursor_legacy@basic-flip-after-cursor-atomic:
- fi-icl-u2:  [PASS][1] -> [DMESG-WARN][2] ([i915#1982])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-u2/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-icl-u2/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html

  
#### Possible fixes ####

  * igt@i915_module_load@reload:
- fi-bxt-dsi: [DMESG-WARN][3] ([i915#1635] / [i915#1982]) -> 
[PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bxt-dsi/igt@i915_module_l...@reload.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-bxt-dsi/igt@i915_module_l...@reload.html

  * igt@i915_selftest@live@active:
- {fi-ehl-1}: [DMESG-FAIL][5] ([i915#541]) -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-ehl-1/igt@i915_selftest@l...@active.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-ehl-1/igt@i915_selftest@l...@active.html

  * igt@i915_selftest@live@execlists:
- fi-icl-y:   [INCOMPLETE][7] -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-y/igt@i915_selftest@l...@execlists.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-icl-y/igt@i915_selftest@l...@execlists.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
- fi-bsw-n3050:   [DMESG-WARN][9] ([i915#1982]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bsw-n3050/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-bsw-n3050/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
- fi-bsw-kefka:   [DMESG-WARN][11] ([i915#1982]) -> [PASS][12] +1 
similar issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1:
- fi-icl-u2:  [DMESG-WARN][13] ([i915#1982]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html

  
#### Warnings ####

  * igt@gem_exec_suspend@basic-s0:
- fi-kbl-x1275:   [DMESG-WARN][15] ([i915#62] / [i915#92]) -> 
[DMESG-WARN][16] ([i915#1982] / [i915#62] / [i915#92])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@gem_exec_susp...@basic-s0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-kbl-x1275/igt@gem_exec_susp...@basic-s0.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-legacy:
- fi-kbl-x1275:   [DMESG-WARN][17] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][18] ([i915#62] / [i915#92]) +2 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@kms_cursor_leg...@basic-flip-after-cursor-legacy.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-kbl-x1275/igt@kms_cursor_leg...@basic-flip-after-cursor-legacy.html

  * igt@kms_force_connector_basic@force-edid:
- fi-kbl-x1275:   [DMESG-WARN][19] ([i915#62] / [i915#92]) -> 
[DMESG-WARN][20] ([i915#62] / [i915#92] / [i915#95]) +3 similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@kms_force_connector_ba...@force-edid.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-kbl-x1275/igt@kms_force_connector_ba...@force-edid.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#665]: https://gitlab.freedesktop.org/drm/intel/issues/665
  

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Introduce drm scaling filter property (rev7)

2020-08-02 Thread Patchwork
== Series Details ==

Series: Introduce drm scaling filter property (rev7)
URL   : https://patchwork.freedesktop.org/series/73883/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Introduce drm scaling filter property (rev7)

2020-08-02 Thread Patchwork
== Series Details ==

Series: Introduce drm scaling filter property (rev7)
URL   : https://patchwork.freedesktop.org/series/73883/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
e4e6d491d898 drm: Introduce plane and CRTC scaling filter properties
aef69ee0f157 drm/drm-kms.rst: Add plane and CRTC scaling filter property documentation
1e38228e0f26 drm/i915: Introduce scaling filter related registers and bit fields.
-:70: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'id' - possible side-effects?
#70: FILE: drivers/gpu/drm/i915/i915_reg.h:7514:
+#define CNL_PS_COEF_INDEX_SET(pipe, id, set)  _MMIO_PIPE(pipe,\
+   _ID(id, _PS_COEF_SET0_INDEX_1A, _PS_COEF_SET0_INDEX_2A) + (set) * 8, \
+   _ID(id, _PS_COEF_SET0_INDEX_1B, _PS_COEF_SET0_INDEX_2B) + (set) * 8)

-:70: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'set' - possible side-effects?
#70: FILE: drivers/gpu/drm/i915/i915_reg.h:7514:
+#define CNL_PS_COEF_INDEX_SET(pipe, id, set)  _MMIO_PIPE(pipe,\
+   _ID(id, _PS_COEF_SET0_INDEX_1A, _PS_COEF_SET0_INDEX_2A) + (set) * 8, \
+   _ID(id, _PS_COEF_SET0_INDEX_1B, _PS_COEF_SET0_INDEX_2B) + (set) * 8)

-:74: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'id' - possible side-effects?
#74: FILE: drivers/gpu/drm/i915/i915_reg.h:7518:
+#define CNL_PS_COEF_DATA_SET(pipe, id, set)  _MMIO_PIPE(pipe, \
+   _ID(id, _PS_COEF_SET0_DATA_1A, _PS_COEF_SET0_DATA_2A) + (set) * 8, \
+   _ID(id, _PS_COEF_SET0_DATA_1B, _PS_COEF_SET0_DATA_2B) + (set) * 8)

-:74: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'set' - possible side-effects?
#74: FILE: drivers/gpu/drm/i915/i915_reg.h:7518:
+#define CNL_PS_COEF_DATA_SET(pipe, id, set)  _MMIO_PIPE(pipe, \
+   _ID(id, _PS_COEF_SET0_DATA_1A, _PS_COEF_SET0_DATA_2A) + (set) * 8, \
+   _ID(id, _PS_COEF_SET0_DATA_1B, _PS_COEF_SET0_DATA_2B) + (set) * 8)

total: 0 errors, 0 warnings, 4 checks, 47 lines checked
8eff2169f553 drm/i915/display: Add Nearest-neighbor based integer scaling support
c7a29cfbc920 drm/i915: Enable scaling filter for plane and CRTC
-:95: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#95: FILE: drivers/gpu/drm/i915/display/intel_display.c:16824:
+   drm_crtc_create_scaling_filter_property(&crtc->base,
+   BIT(DRM_SCALING_FILTER_DEFAULT) |

-:155: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#155: FILE: drivers/gpu/drm/i915/display/intel_sprite.c:3172:
+   drm_plane_create_scaling_filter_property(&plane->base,
+   BIT(DRM_SCALING_FILTER_DEFAULT) |

total: 0 errors, 0 warnings, 2 checks, 104 lines checked




[Intel-gfx] [PATCH v5 5/5] drm/i915: Enable scaling filter for plane and CRTC

2020-08-02 Thread Pankaj Bharadiya
GEN >= 10 hardware supports the programmable scaler filter.

Attach the scaling filter property to CRTC and plane on GEN >= 10
hardware and program the scaler filter based on the selected filter
type.

changes since v3:
* None
changes since v2:
* Use updated functions
* Add ps_ctrl var to contain the full PS_CTRL register value (Ville)
* Duplicate the scaling filter in crtc and plane hw state (Ville)
changes since v1:
* None
Changes since RFC:
* Enable properties for GEN >= 10 platforms (Ville)
* Do not round off the crtc co-ordinate (Daniel Stone, Ville)
* Add new functions to handle scaling filter setup (Ville)
* Remove coefficient set 0 hardcoding.

Signed-off-by: Shashank Sharma 
Signed-off-by: Ankit Nautiyal 
Signed-off-by: Pankaj Bharadiya 
---
 .../gpu/drm/i915/display/intel_atomic_plane.c  |  1 +
 drivers/gpu/drm/i915/display/intel_display.c   | 18 --
 .../gpu/drm/i915/display/intel_display_types.h |  2 ++
 drivers/gpu/drm/i915/display/intel_sprite.c| 15 +--
 4 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_atomic_plane.c 
b/drivers/gpu/drm/i915/display/intel_atomic_plane.c
index 79032701873a..415d41b21915 100644
--- a/drivers/gpu/drm/i915/display/intel_atomic_plane.c
+++ b/drivers/gpu/drm/i915/display/intel_atomic_plane.c
@@ -262,6 +262,7 @@ void intel_plane_copy_uapi_to_hw_state(struct 
intel_plane_state *plane_state,
plane_state->hw.rotation = from_plane_state->uapi.rotation;
plane_state->hw.color_encoding = from_plane_state->uapi.color_encoding;
plane_state->hw.color_range = from_plane_state->uapi.color_range;
+   plane_state->hw.scaling_filter = from_plane_state->uapi.scaling_filter;
 }
 
 void intel_plane_set_invisible(struct intel_crtc_state *crtc_state,
diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 388999404e05..507932099b8d 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -6352,6 +6352,7 @@ static void skl_pfit_enable(const struct intel_crtc_state 
*crtc_state)
int hscale, vscale;
unsigned long irqflags;
int id;
+   u32 ps_ctrl;
 
if (!crtc_state->pch_pfit.enabled)
return;
@@ -6368,10 +6369,16 @@ static void skl_pfit_enable(const struct 
intel_crtc_state *crtc_state)
 
id = scaler_state->scaler_id;
 
+   ps_ctrl = skl_scaler_get_filter_select(crtc_state->hw.scaling_filter, 0);
+   ps_ctrl |=  PS_SCALER_EN | scaler_state->scalers[id].mode;
+
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
-   intel_de_write_fw(dev_priv, SKL_PS_CTRL(pipe, id), PS_SCALER_EN |
- PS_FILTER_MEDIUM | scaler_state->scalers[id].mode);
+   skl_scaler_setup_filter(dev_priv, pipe, id, 0,
+   crtc_state->hw.scaling_filter);
+
+   intel_de_write_fw(dev_priv, SKL_PS_CTRL(pipe, id), ps_ctrl);
+
intel_de_write_fw(dev_priv, SKL_PS_VPHASE(pipe, id),
  PS_Y_PHASE(0) | PS_UV_RGB_PHASE(uv_rgb_vphase));
intel_de_write_fw(dev_priv, SKL_PS_HPHASE(pipe, id),
@@ -13332,6 +13339,7 @@ intel_crtc_copy_uapi_to_hw_state(struct 
intel_crtc_state *crtc_state)
crtc_state->hw.active = crtc_state->uapi.active;
crtc_state->hw.mode = crtc_state->uapi.mode;
crtc_state->hw.adjusted_mode = crtc_state->uapi.adjusted_mode;
+   crtc_state->hw.scaling_filter = crtc_state->uapi.scaling_filter;
intel_crtc_copy_uapi_to_hw_state_nomodeset(crtc_state);
 }
 
@@ -13343,6 +13351,7 @@ static void intel_crtc_copy_hw_to_uapi_state(struct 
intel_crtc_state *crtc_state
drm_atomic_set_mode_for_crtc(&crtc_state->uapi, &crtc_state->hw.mode) < 0);
 
crtc_state->uapi.adjusted_mode = crtc_state->hw.adjusted_mode;
+   crtc_state->uapi.scaling_filter = crtc_state->hw.scaling_filter;
 
/* copy color blobs to uapi */
drm_property_replace_blob(&crtc_state->uapi.degamma_lut,
@@ -16810,6 +16819,11 @@ static int intel_crtc_init(struct drm_i915_private 
*dev_priv, enum pipe pipe)
dev_priv->plane_to_crtc_mapping[i9xx_plane] = crtc;
}
 
+   if (INTEL_GEN(dev_priv) >= 10)
+   drm_crtc_create_scaling_filter_property(&crtc->base,
+   BIT(DRM_SCALING_FILTER_DEFAULT) |
+   BIT(DRM_SCALING_FILTER_NEAREST_NEIGHBOR));
+
intel_color_init(crtc);
 
intel_crtc_crc_init(crtc);
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index f581260e8dbf..670ab317134b 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -518,6 +518,7 @@ struct intel_plane_state {
unsigned int rotation;
enum drm_color_encoding 

[Intel-gfx] [PATCH v5 4/5] drm/i915/display: Add Nearest-neighbor based integer scaling support

2020-08-02 Thread Pankaj Bharadiya
Integer scaling (IS) is a nearest-neighbor upscaling technique that
simply scales up the existing pixels by an integer
(i.e., whole number) multiplier. Nearest-neighbor (NN) interpolation
works by filling in the missing color values in the upscaled image
with that of the coordinate-mapped nearest source pixel value.

Both IS and NN preserve the clarity of the original image. Integer
scaling is particularly useful for pixel art games that rely on
sharp, blocky images to deliver their distinctive look.

Introduce functions to configure the scaler filter coefficients to
enable nearest-neighbor filtering.

Bspec: 49247

changes since v3:
* None
changes since v2:
* Move APIs from 5/5 into this patch.
* Change filter programming related function names to cnl_*, move
  filter select bits related code into inline function (Ville)
changes since v1:
* Rearrange skl_scaler_setup_nearest_neighbor_filter() to iterate the
  registers directly instead of the phases and taps (Ville)

changes since RFC:
* Refine the skl_scaler_setup_nearest_neighbor_filter() logic (Ville)

Signed-off-by: Shashank Sharma 
Signed-off-by: Ankit Nautiyal 
Signed-off-by: Pankaj Bharadiya 
---
 drivers/gpu/drm/i915/display/intel_display.c | 99 
 drivers/gpu/drm/i915/display/intel_display.h |  4 +
 2 files changed, 103 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index db2a5a1a9b35..388999404e05 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -6233,6 +6233,105 @@ void skl_scaler_disable(const struct intel_crtc_state 
*old_crtc_state)
skl_detach_scaler(crtc, i);
 }
 
+static int cnl_coef_tap(int i)
+{
+   return i % 7;
+}
+
+static u16 cnl_nearest_filter_coef(int t)
+{
+   return t == 3 ? 0x0800 : 0x3000;
+}
+
+/**
+ *  Theory behind setting nearest-neighbor integer scaling:
+ *
+ *  17 phases of 7 taps require 119 coefficients in 60 dwords per set.
+ *  The letter represents the filter tap (D is the center tap) and the number
+ *  represents the coefficient set for a phase (0-16).
+ *
+ * +-------------+--------------------------+--------------------------+
+ * | Index value | Data value coefficient 1 | Data value coefficient 2 |
+ * +-------------+--------------------------+--------------------------+
+ * |     00h     |            B0            |            A0            |
+ * +-------------+--------------------------+--------------------------+
+ * |     01h     |            D0            |            C0            |
+ * +-------------+--------------------------+--------------------------+
+ * |     02h     |            F0            |            E0            |
+ * +-------------+--------------------------+--------------------------+
+ * |     03h     |            A1            |            G0            |
+ * +-------------+--------------------------+--------------------------+
+ * |     04h     |            C1            |            B1            |
+ * +-------------+--------------------------+--------------------------+
+ * |     ...     |            ...           |            ...           |
+ * +-------------+--------------------------+--------------------------+
+ * |     38h     |            B16           |            A16           |
+ * +-------------+--------------------------+--------------------------+
+ * |     39h     |            D16           |            C16           |
+ * +-------------+--------------------------+--------------------------+
+ * |     3Ah     |            F16           |            E16           |
+ * +-------------+--------------------------+--------------------------+
+ * |     3Bh     |         Reserved         |            G16           |
+ * +-------------+--------------------------+--------------------------+
+ *
+ *  To enable nearest-neighbor scaling: program scaler coefficients with
+ *  the center tap (Dxx) values set to 1 and all other values set to 0 as per
+ *  SCALER_COEFFICIENT_FORMAT
+ *
+ */
+
+static void cnl_program_nearest_filter_coefs(struct drm_i915_private *dev_priv,
+enum pipe pipe, int id, int set)
+{
+   int i;
+
+   intel_de_write_fw(dev_priv, CNL_PS_COEF_INDEX_SET(pipe, id, set),
+ PS_COEE_INDEX_AUTO_INC);
+
+   for (i = 0; i < 17 * 7; i += 2) {
+   u32 tmp;
+   int t;
+
+   t = cnl_coef_tap(i);
+   tmp = cnl_nearest_filter_coef(t);
+
+   t = cnl_coef_tap(i + 1);
+   tmp |= cnl_nearest_filter_coef(t) << 16;
+
+   intel_de_write_fw(dev_priv, CNL_PS_COEF_DATA_SET(pipe, id, set),
+ tmp);
+   }
+
+   intel_de_write_fw(dev_priv, CNL_PS_COEF_INDEX_SET(pipe, id, set), 0);
+}
+
+inline u32 skl_scaler_get_filter_select(enum 

[Intel-gfx] [PATCH v5 2/5] drm/drm-kms.rst: Add plane and CRTC scaling filter property documentation

2020-08-02 Thread Pankaj Bharadiya
Add documentation for newly introduced KMS plane and CRTC scaling
filter properties.

changes since v3:
* None
changes since v1:
* None
changes since RFC:
* Add separate documentation for plane and CRTC.

Signed-off-by: Pankaj Bharadiya 
---
 Documentation/gpu/drm-kms.rst | 12 
 1 file changed, 12 insertions(+)

diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
index 3c5ae4f6dfd2..8e4031afbb1b 100644
--- a/Documentation/gpu/drm-kms.rst
+++ b/Documentation/gpu/drm-kms.rst
@@ -518,6 +518,18 @@ Variable Refresh Properties
 .. kernel-doc:: drivers/gpu/drm/drm_connector.c
:doc: Variable refresh properties
 
+Plane Scaling Filter Property
+-----------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_plane.c
+   :doc: Plane scaling filter property
+
+CRTC Scaling Filter Property
+----------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_crtc.c
+   :doc: CRTC scaling filter property
+
 Existing KMS Properties
 -----------------------
 
-- 
2.23.0



[Intel-gfx] [PATCH v5 1/5] drm: Introduce plane and CRTC scaling filter properties

2020-08-02 Thread Pankaj Bharadiya
Introduce per-plane and per-CRTC scaling filter properties to allow
userspace to select the driver's default scaling filter or
Nearest-neighbor (NN) filter for upscaling operations on CRTC and
plane.

Drivers can set up this property for a plane by calling
drm_plane_create_scaling_filter() and for a CRTC by calling
drm_crtc_create_scaling_filter().

NN filter works by filling in the missing color values in the upscaled
image with that of the coordinate-mapped nearest source pixel value.

NN filter for integer multiple scaling can be particularly useful
for pixel art games that rely on sharp, blocky images to deliver their
distinctive look.

changes since v3:
* Refactor code, add new function for common code (Ville)
changes since v2:
* Create per-plane and per-CRTC scaling filter property (Ville)
changes since v1:
* None
changes since RFC:
* Add separate properties for plane and CRTC (Ville)

Signed-off-by: Pankaj Bharadiya 
---
 drivers/gpu/drm/drm_atomic_uapi.c   |  8 +++
 drivers/gpu/drm/drm_crtc.c  | 48 +++
 drivers/gpu/drm/drm_crtc_internal.h |  3 +
 drivers/gpu/drm/drm_plane.c | 90 +
 include/drm/drm_crtc.h  | 16 +
 include/drm/drm_plane.h | 21 +++
 6 files changed, 186 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index 25c269bc4681..ef82009035e6 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -469,6 +469,8 @@ static int drm_atomic_crtc_set_property(struct drm_crtc 
*crtc,
return -EFAULT;
 
set_out_fence_for_crtc(state->state, crtc, fence_ptr);
+   } else if (property == crtc->scaling_filter_property) {
+   state->scaling_filter = val;
} else if (crtc->funcs->atomic_set_property) {
return crtc->funcs->atomic_set_property(crtc, state, property, 
val);
} else {
@@ -503,6 +505,8 @@ drm_atomic_crtc_get_property(struct drm_crtc *crtc,
*val = (state->gamma_lut) ? state->gamma_lut->base.id : 0;
else if (property == config->prop_out_fence_ptr)
*val = 0;
+   else if (property == crtc->scaling_filter_property)
+   *val = state->scaling_filter;
else if (crtc->funcs->atomic_get_property)
return crtc->funcs->atomic_get_property(crtc, state, property, 
val);
else
@@ -585,6 +589,8 @@ static int drm_atomic_plane_set_property(struct drm_plane 
*plane,
sizeof(struct drm_rect),
);
return ret;
+   } else if (property == plane->scaling_filter_property) {
+   state->scaling_filter = val;
} else if (plane->funcs->atomic_set_property) {
return plane->funcs->atomic_set_property(plane, state,
property, val);
@@ -643,6 +649,8 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
} else if (property == config->prop_fb_damage_clips) {
*val = (state->fb_damage_clips) ?
state->fb_damage_clips->base.id : 0;
+   } else if (property == plane->scaling_filter_property) {
+   *val = state->scaling_filter;
} else if (plane->funcs->atomic_get_property) {
return plane->funcs->atomic_get_property(plane, state, 
property, val);
} else {
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index 283bcc4362ca..70f5cd9704ba 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -776,3 +776,51 @@ int drm_mode_crtc_set_obj_prop(struct drm_mode_object *obj,
 
return ret;
 }
+
+/**
+ * DOC: CRTC scaling filter property
+ *
+ * SCALING_FILTER:
+ *
+ * Indicates scaling filter to be used for CRTC scaler
+ *
+ * The value of this property can be one of the following:
+ * Default:
+ * Driver's default scaling filter
+ * Nearest Neighbor:
+ * Nearest Neighbor scaling filter
+ *
+ * Drivers can set up this property for a CRTC by calling
+ * drm_crtc_create_scaling_filter_property
+ */
+
+/**
+ * drm_crtc_create_scaling_filter_property - create a new scaling filter
+ * property
+ *
+ * @crtc: drm CRTC
+ * @supported_filters: bitmask of supported scaling filters, must include
+ *BIT(DRM_SCALING_FILTER_DEFAULT).
+ *
+ * This function lets drivers enable the scaling filter property on a given
+ * CRTC.
+ *
+ * RETURNS:
+ * Zero for success or -errno
+ */
+int drm_crtc_create_scaling_filter_property(struct drm_crtc *crtc,
+   unsigned int supported_filters)
+{
+   struct drm_property *prop =
+   drm_create_scaling_filter_prop(crtc->dev, supported_filters);
+
+   if (IS_ERR(prop))
+   return PTR_ERR(prop);
+
+   drm_object_attach_property(&crtc->base, prop,
+  

[Intel-gfx] [PATCH v5 3/5] drm/i915: Introduce scaling filter related registers and bit fields.

2020-08-02 Thread Pankaj Bharadiya
Introduce scaler registers and bit fields needed to configure the
scaling filter in programmed mode and configure scaling filter
coefficients.

changes since v3:
* None
changes since v2:
* Change macro names to CNL_* and  use +(set)*8 instead of adding
  another trip through _PICK_EVEN (Ville).
changes since v1:
* None
changes since RFC:
* Parametrize scaler coefficient macros by 'set' (Ville)

Signed-off-by: Shashank Sharma 
Signed-off-by: Ankit Nautiyal 
Signed-off-by: Pankaj Bharadiya 
---
 drivers/gpu/drm/i915/i915_reg.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5eae593ee784..e582021cc208 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7391,6 +7391,7 @@ enum {
 #define PS_PLANE_SEL(plane) (((plane) + 1) << 25)
 #define PS_FILTER_MASK (3 << 23)
 #define PS_FILTER_MEDIUM   (0 << 23)
+#define PS_FILTER_PROGRAMMED   (1 << 23)
 #define PS_FILTER_EDGE_ENHANCE (2 << 23)
 #define PS_FILTER_BILINEAR (3 << 23)
 #define PS_VERT3TAP(1 << 21)
@@ -7405,6 +7406,10 @@ enum {
 #define PS_VADAPT_MODE_MOST_ADAPT  (3 << 5)
 #define PS_PLANE_Y_SEL_MASK  (7 << 5)
 #define PS_PLANE_Y_SEL(plane) (((plane) + 1) << 5)
+#define PS_Y_VERT_FILTER_SELECT(set)   ((set) << 4)
+#define PS_Y_HORZ_FILTER_SELECT(set)   ((set) << 3)
+#define PS_UV_VERT_FILTER_SELECT(set)  ((set) << 2)
+#define PS_UV_HORZ_FILTER_SELECT(set)  ((set) << 1)
 
 #define _PS_PWR_GATE_1A 0x68160
 #define _PS_PWR_GATE_2A 0x68260
@@ -7467,6 +7472,17 @@ enum {
 #define _PS_ECC_STAT_2B 0x68AD0
 #define _PS_ECC_STAT_1C 0x691D0
 
+#define _PS_COEF_SET0_INDEX_1A0x68198
+#define _PS_COEF_SET0_INDEX_2A0x68298
+#define _PS_COEF_SET0_INDEX_1B0x68998
+#define _PS_COEF_SET0_INDEX_2B0x68A98
+#define PS_COEE_INDEX_AUTO_INC(1 << 10)
+
+#define _PS_COEF_SET0_DATA_1A 0x6819C
+#define _PS_COEF_SET0_DATA_2A 0x6829C
+#define _PS_COEF_SET0_DATA_1B 0x6899C
+#define _PS_COEF_SET0_DATA_2B 0x68A9C
+
 #define _ID(id, a, b) _PICK_EVEN(id, a, b)
 #define SKL_PS_CTRL(pipe, id) _MMIO_PIPE(pipe,\
_ID(id, _PS_1A_CTRL, _PS_2A_CTRL),   \
@@ -7495,7 +7511,13 @@ enum {
 #define SKL_PS_ECC_STAT(pipe, id)  _MMIO_PIPE(pipe, \
_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
+#define CNL_PS_COEF_INDEX_SET(pipe, id, set)  _MMIO_PIPE(pipe,\
+   _ID(id, _PS_COEF_SET0_INDEX_1A, _PS_COEF_SET0_INDEX_2A) + (set) * 8, \
+   _ID(id, _PS_COEF_SET0_INDEX_1B, _PS_COEF_SET0_INDEX_2B) + (set) * 8)
 
+#define CNL_PS_COEF_DATA_SET(pipe, id, set)  _MMIO_PIPE(pipe, \
+   _ID(id, _PS_COEF_SET0_DATA_1A, _PS_COEF_SET0_DATA_2A) + (set) * 8, \
+   _ID(id, _PS_COEF_SET0_DATA_1B, _PS_COEF_SET0_DATA_2B) + (set) * 8)
 /* legacy palette */
 #define _LGC_PALETTE_A   0x4a000
 #define _LGC_PALETTE_B   0x4a800
-- 
2.23.0



[Intel-gfx] [PATCH v5 0/5] Introduce drm scaling filter property

2020-08-02 Thread Pankaj Bharadiya
Earlier, I kept this series on hold since we wanted to have a 
reference userspace implementation in place.

Now, Sameer has implemented Integer scaling in the Kodi retro gaming
framework, which demonstrates how Integer scaling gives a distinctive
look to pixel art games when played on higher-resolution monitors.

Kodi patches are almost reviewed and closer to merge now.
Here is the userspace patch series link:
https://github.com/xbmc/xbmc/pull/18194

Background on Integer scaling:

Integer scaling (IS) is a nearest-neighbor upscaling technique that
simply scales up the existing pixels by an integer (i.e., whole
number) multiplier. Nearest-neighbor (NN) interpolation works by
filling in the missing color values in the upscaled image with that of
the coordinate-mapped nearest source pixel value.

Both IS and NN preserve the clarity of the original image. In
contrast, traditional upscaling algorithms, such as bilinear or
bicubic interpolation, result in blurry upscaled images because they
employ interpolation techniques that smooth out the transition from
one pixel to another.  Therefore, integer scaling is particularly
useful for pixel art games that rely on sharp, blocky images to
deliver their distinctive look.

Many gaming communities have been asking for integer-mode scaling
support, some links and background:

https://software.intel.com/en-us/articles/integer-scaling-support-on-intel-graphics
http://tanalin.com/en/articles/lossless-scaling/
https://community.amd.com/thread/209107
https://www.nvidia.com/en-us/geforce/forums/game-ready-drivers/13/1002/feature-request-nonblurry-upscaling-at-integer-rat/

* Changes in v5:
 - Rebase to latest drm-tip.

Pankaj Bharadiya (5):
  drm: Introduce plane and CRTC scaling filter properties
  drm/drm-kms.rst: Add plane and CRTC scaling filter property
documentation
  drm/i915: Introduce scaling filter related registers and bit fields.
  drm/i915/display: Add Nearest-neighbor based integer scaling support
  drm/i915: Enable scaling filter for plane and CRTC

 Documentation/gpu/drm-kms.rst |  12 ++
 drivers/gpu/drm/drm_atomic_uapi.c |   8 ++
 drivers/gpu/drm/drm_crtc.c|  48 +++
 drivers/gpu/drm/drm_crtc_internal.h   |   3 +
 drivers/gpu/drm/drm_plane.c   |  90 ++
 .../gpu/drm/i915/display/intel_atomic_plane.c |   1 +
 drivers/gpu/drm/i915/display/intel_display.c  | 117 +-
 drivers/gpu/drm/i915/display/intel_display.h  |   4 +
 .../drm/i915/display/intel_display_types.h|   2 +
 drivers/gpu/drm/i915/display/intel_sprite.c   |  15 ++-
 drivers/gpu/drm/i915/i915_reg.h   |  22 
 include/drm/drm_crtc.h|  16 +++
 include/drm/drm_plane.h   |  21 
 13 files changed, 355 insertions(+), 4 deletions(-)

-- 
2.23.0



[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [01/42] drm/i915: Fix wrong return value

2020-08-02 Thread Patchwork
== Series Details ==

Series: series starting with [01/42] drm/i915: Fix wrong return value
URL   : https://patchwork.freedesktop.org/series/80179/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8832_full -> Patchwork_18295_full


Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_18295_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_18295_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in 
Patchwork_18295_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_ctx_exec@basic-nohangcheck:
- shard-tglb: [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-tglb7/igt@gem_ctx_e...@basic-nohangcheck.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-tglb2/igt@gem_ctx_e...@basic-nohangcheck.html

  * igt@gem_ctx_ringsize@active@bcs0:
- shard-skl:  [PASS][3] -> [INCOMPLETE][4] +2 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl6/igt@gem_ctx_ringsize@act...@bcs0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-skl9/igt@gem_ctx_ringsize@act...@bcs0.html

  * igt@gem_exec_parallel@fds:
- shard-tglb: NOTRUN -> [INCOMPLETE][5]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-tglb1/igt@gem_exec_paral...@fds.html

  * igt@gem_exec_whisper@basic-contexts-forked-all:
- shard-glk:  [PASS][6] -> [INCOMPLETE][7]
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-glk9/igt@gem_exec_whis...@basic-contexts-forked-all.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-glk2/igt@gem_exec_whis...@basic-contexts-forked-all.html

  * igt@kms_mmap_write_crc@main:
- shard-glk:  [PASS][8] -> [FAIL][9]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-glk2/igt@kms_mmap_write_...@main.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-glk5/igt@kms_mmap_write_...@main.html

  
New tests
---------

  New tests have been introduced between CI_DRM_8832_full and 
Patchwork_18295_full:

### New IGT tests (3) ###

  * igt@i915_selftest@live@scheduler:
- Statuses : 7 pass(s)
- Exec time: [0.45, 3.40] s

  * igt@i915_selftest@mock@scheduler:
- Statuses : 7 pass(s)
- Exec time: [0.11, 1.00] s

  * igt@i915_selftest@perf@scheduler:
- Statuses : 7 pass(s)
- Exec time: [0.44, 16.03] s

  

Known issues
------------

  Here are the changes found in Patchwork_18295_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@nonpriv-switch@rcs0:
- shard-apl:  [PASS][10] -> [INCOMPLETE][11] ([i915#1635]) +1 
similar issue
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-apl3/igt@gem_ctx_isolation@nonpriv-swi...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-apl6/igt@gem_ctx_isolation@nonpriv-swi...@rcs0.html

  * igt@gem_ctx_persistence@legacy-engines-mixed-process@render:
- shard-skl:  [PASS][12] -> [FAIL][13] ([i915#1528])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl9/igt@gem_ctx_persistence@legacy-engines-mixed-proc...@render.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-skl7/igt@gem_ctx_persistence@legacy-engines-mixed-proc...@render.html

  * igt@gem_exec_whisper@basic-contexts-forked-all:
- shard-kbl:  [PASS][14] -> [INCOMPLETE][15] ([i915#794])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-kbl2/igt@gem_exec_whis...@basic-contexts-forked-all.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-kbl6/igt@gem_exec_whis...@basic-contexts-forked-all.html

  * igt@i915_selftest@mock@contexts:
- shard-skl:  [PASS][16] -> [INCOMPLETE][17] ([i915#198])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl2/igt@i915_selftest@m...@contexts.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-skl10/igt@i915_selftest@m...@contexts.html

  * igt@kms_big_fb@y-tiled-64bpp-rotate-0:
- shard-iclb: [PASS][18] -> [DMESG-WARN][19] ([i915#1982])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-iclb5/igt@kms_big...@y-tiled-64bpp-rotate-0.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-iclb3/igt@kms_big...@y-tiled-64bpp-rotate-0.html

  * igt@kms_cursor_crc@pipe-c-cursor-suspend:
- shard-glk:  [PASS][20] -> [FAIL][21] ([i915#54])
   [20]: 

Re: [Intel-gfx] [PATCH v5 06/16] pwm: lpss: Use pwm_lpss_apply() when restoring state on resume

2020-08-02 Thread Hans de Goede

Hi,

On 7/29/20 10:12 AM, Andy Shevchenko wrote:

On Tue, Jul 28, 2020 at 09:55:22PM +0200, Hans de Goede wrote:

On 7/28/20 8:57 PM, Andy Shevchenko wrote:

On Fri, Jul 17, 2020 at 03:37:43PM +0200, Hans de Goede wrote:


...


Maybe I'm too picky, but I would go even further and split apply into two
versions:
static int pwm_lpss_apply_on_resume(struct pwm_chip *chip, struct pwm_device *pwm,
				    const struct pwm_state *state)
{
	struct pwm_lpss_chip *lpwm = to_lpwm(chip);

	if (state->enabled)
		return pwm_lpss_prepare_enable(lpwm, pwm, state, !pwm_is_enabled(pwm));

	if (pwm_is_enabled(pwm)) {
		pwm_lpss_write(pwm, pwm_lpss_read(pwm) & ~PWM_ENABLE);
		return 0;
	}


and another one for !from_resume.


It is a bit picky :) But that is actually not a bad idea, although I would write
it like this for more symmetry with the normal (non-resume) apply version;
while at it I also renamed the function:

/*
 * This is a mirror of pwm_lpss_apply() without pm_runtime reference handling
 * for restoring the PWM state on resume.
 */
static int pwm_lpss_restore_state(struct pwm_chip *chip, struct pwm_device *pwm,
				  const struct pwm_state *state)
{
	struct pwm_lpss_chip *lpwm = to_lpwm(chip);
	int ret = 0;

	if (state->enabled)
		ret = pwm_lpss_prepare_enable(lpwm, pwm, state, !pwm_is_enabled(pwm));
	else if (pwm_is_enabled(pwm))
		pwm_lpss_write(pwm, pwm_lpss_read(pwm) & ~PWM_ENABLE);

	return ret;
}

Would that work for you?


Yes.


Ok, I've added the suggested/discussed helper in my personal tree. Is it ok
if I add your Reviewed-by with that change in place? This is the last unreviewed
bit, so I would rather not respin the series just for this (there will be one
more respin when I rebase it on 5.9-rc1).

If you want to check out what the patch looks like now, the new version from
my personal tree is here:

https://github.com/jwrdegoede/linux-sunxi/commit/e4869830d88bb8cb8251718e0086ac189abc0f56

Regards,

Hans

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value

2020-08-02 Thread Borislav Petkov
On Sun, Aug 02, 2020 at 02:14:06PM -0500, Bjorn Helgaas wrote:
> Wait, I'm not convinced yet.  I know that if a PCI read fails, you
> normally get ~0 data because the host bridge fabricates it to complete
> the CPU load.
> 
> But what guarantees that a PCI config register cannot contain ~0?

Well, I don't think you can differentiate that case, right?

I guess this is where the driver knowledge comes into play: if the read
returns ~0, pci_read_config*() should probably return something like:

PCIBIOS_READ_MAYBE_FAILED

to denote it is all 1s and then the caller should be able to determine,
based on any of domain:bus:slot.func and whatever else the driver knows
about its hardware, whether the 1s are a valid value or an error.
Hopefully.

Or something better that I cannot think of right now...
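A minimal sketch of the idea follows. The status code and helper are hypothetical, illustrative names only, not existing kernel API; a real implementation would sit inside the config accessors themselves:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status codes mirroring the suggestion above. */
#define PCIBIOS_SUCCESSFUL        0x00
#define PCIBIOS_READ_MAYBE_FAILED 0x89

/*
 * Sketch: flag an all-1s result instead of silently passing it through,
 * so the caller can decide, from knowledge of its own hardware, whether
 * ~0 is a legitimate register value or a fabricated failed read.
 */
static int pci_read_config_dword_checked(uint32_t raw, uint32_t *value)
{
	*value = raw;
	if (raw == UINT32_MAX)
		return PCIBIOS_READ_MAYBE_FAILED;
	return PCIBIOS_SUCCESSFUL;
}
```

The caller then keys off the return value rather than comparing the data against ~0 in every driver.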

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [Intel-gfx] [PATCH v5 00/16] acpi/pwm/i915: Convert pwm-crc and i915 driver's PWM code to use the atomic PWM API

2020-08-02 Thread Hans de Goede

Hi,

On 8/2/20 1:25 PM, Andy Shevchenko wrote:

On Sat, Aug 01, 2020 at 04:38:16PM +0200, Hans de Goede wrote:

On 7/29/20 12:54 PM, Andy Shevchenko wrote:

On Fri, Jul 17, 2020 at 03:37:37PM +0200, Hans de Goede wrote:


...


One comment to consider, though. There are three channels in that PWM AFAIU.
One of them is backlight control, another one can be attached to haptics. The
concern is how this series may (or may not?) affect haptics behaviour.


When you say "in that PWM", do you mean the LPSS one or the CRC one?


CRC one. I have read it from PMIC spec, that's why the question.


Ah I see. Well, the kernel driver only implements support for one PWM output,
the one which is used for the backlight brightness control.

So this series should not affect haptics behavior, since it looks like
the haptic functionality is not supported in the mainline kernel at all.

And I'm also not aware of any tablets with a CRC PMIC which have
(non-working) haptic support.

Regards,

Hans



[Intel-gfx] ✗ Fi.CI.IGT: failure for Add generic i915_ggtt ballooning support

2020-08-02 Thread Patchwork
== Series Details ==

Series: Add generic i915_ggtt ballooning support
URL   : https://patchwork.freedesktop.org/series/80177/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8832_full -> Patchwork_18294_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_18294_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_18294_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_18294_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_mmap_gtt@fault-concurrent:
- shard-snb:  [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-snb6/igt@gem_mmap_...@fault-concurrent.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-snb4/igt@gem_mmap_...@fault-concurrent.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes:
- shard-skl:  [PASS][3] -> [INCOMPLETE][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl6/igt@kms_pl...@plane-panning-bottom-right-suspend-pipe-b-planes.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl5/igt@kms_pl...@plane-panning-bottom-right-suspend-pipe-b-planes.html

  * igt@runner@aborted:
- shard-snb:  NOTRUN -> [FAIL][5]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-snb4/igt@run...@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_18294_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_huc_copy@huc-copy:
- shard-tglb: [PASS][6] -> [SKIP][7] ([i915#2190])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-tglb1/igt@gem_huc_c...@huc-copy.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-tglb6/igt@gem_huc_c...@huc-copy.html

  * igt@i915_selftest@mock@contexts:
- shard-skl:  [PASS][8] -> [INCOMPLETE][9] ([i915#198])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl2/igt@i915_selftest@m...@contexts.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl5/igt@i915_selftest@m...@contexts.html

  * igt@kms_cursor_edge_walk@pipe-c-256x256-bottom-edge:
- shard-skl:  [PASS][10] -> [DMESG-WARN][11] ([i915#1982]) +7 
similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl6/igt@kms_cursor_edge_w...@pipe-c-256x256-bottom-edge.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl5/igt@kms_cursor_edge_w...@pipe-c-256x256-bottom-edge.html

  * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-wc-ytiled:
- shard-apl:  [PASS][12] -> [DMESG-WARN][13] ([i915#1635] / 
[i915#1982])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-apl3/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-ytiled.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-apl6/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-ytiled.html

  * igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-cpu:
- shard-tglb: [PASS][14] -> [DMESG-WARN][15] ([i915#1982])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-tglb7/igt@kms_frontbuffer_track...@psr-1p-offscren-pri-shrfb-draw-mmap-cpu.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-tglb5/igt@kms_frontbuffer_track...@psr-1p-offscren-pri-shrfb-draw-mmap-cpu.html

  * igt@kms_hdr@bpc-switch-suspend:
- shard-kbl:  [PASS][16] -> [DMESG-WARN][17] ([i915#180]) +8 
similar issues
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-kbl2/igt@kms_...@bpc-switch-suspend.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-kbl2/igt@kms_...@bpc-switch-suspend.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
- shard-skl:  [PASS][18] -> [INCOMPLETE][19] ([CI#80])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl1/igt@kms_pipe_crc_ba...@suspend-read-crc-pipe-b.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl10/igt@kms_pipe_crc_ba...@suspend-read-crc-pipe-b.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
- shard-skl:  [PASS][20] -> [FAIL][21] ([fdo#108145] / [i915#265])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl7/igt@kms_plane_alpha_bl...@pipe-b-coverage-7efc.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl3/igt@kms_plane_alpha_bl...@pipe-b-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_mmap_cpu:
- shard-iclb: 

Re: [Intel-gfx] Time, where did it go?

2020-08-02 Thread Chris Wilson
Quoting Dave Airlie (2020-08-02 18:56:44)
> On Mon, 3 Aug 2020 at 02:44, Chris Wilson  wrote:
> >
> > Lots of small incremental improvements to reduce execution latency
> > which basically offsets the small regressions incurred when compared to
> > 5.7. And then there are some major fixes found while staring agape at
> > lockstat.
> 
> What introduced the 5.7 regressions? Are they documented somewhere?

No. There's a 5.8-rc1 bisect (to the merge but not into rc1) for
something in the core causing perf fluctuations, but I have not yet
reproduced that one to bisect into the rc1 merge. [The system that showed
the issue has historically seen strong swings from p-state setup, might
be that again?] This is from measuring simulated transcode workloads that
we've built up to track KPIs, which we can then compare against the real
workloads run by other groups.
 
> What is the goal here? Is there a benchmark or application that this
> benefits, and can you quantify the benefits?

Entirely motivated by not wanting to have to explain why there's even a
1% regression in their client metrics. They wouldn't even notice for a
few releases, by which point the problem is likely compounded and we
suddenly have crisis meetings.
 
> Is the lack of userspace command submission a problem vs other vendors here?

If you mean HW scheduling (which is the bit that we are most in dire need
of for replacing this series), not really: our closest equivalent has not
yet proven itself, at least in previous incarnations, adequate to their
requirements.
-Chris


Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value

2020-08-02 Thread Bjorn Helgaas
On Sun, Aug 02, 2020 at 08:46:48PM +0200, Borislav Petkov wrote:
> On Sun, Aug 02, 2020 at 07:28:00PM +0200, Saheed Bolarinwa wrote:
> > Because the value ~0 has a meaning to some drivers and only
> 
> No, ~0 means that the PCI read failed. For *every* PCI device I know.

Wait, I'm not convinced yet.  I know that if a PCI read fails, you
normally get ~0 data because the host bridge fabricates it to complete
the CPU load.

But what guarantees that a PCI config register cannot contain ~0?
If there's something about that in the spec I'd love to know where it
is because it would simplify a lot of things.

I don't think we should merge any of these patches as-is.  If we *do*
want to go this direction, we at least need some kind of macro or
function that tests for ~0 so we have a clue about what's happening
and can grep for it.
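A sketch of that kind of helper (the names here are illustrative, not a concrete API proposal; mainline did later grow a similar PCI_POSSIBLE_ERROR() macro):

```c
#include <assert.h>
#include <stdint.h>

/*
 * One central, greppable test for the fabricated all-1s value, instead
 * of scattering open-coded `== ~0` comparisons across drivers.
 */
#define PCI_ERROR_RESPONSE       ((uint32_t)~0)
#define PCI_POSSIBLE_ERROR(val)  ((val) == PCI_ERROR_RESPONSE)
```

Drivers would then write `if (PCI_POSSIBLE_ERROR(val))`, which documents intent and makes every such check discoverable with a single grep.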

Bjorn


Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value

2020-08-02 Thread Borislav Petkov
On Sun, Aug 02, 2020 at 07:28:00PM +0200, Saheed Bolarinwa wrote:
> Because the value ~0 has a meaning to some drivers and only

No, ~0 means that the PCI read failed. For *every* PCI device I know.

Here's me reading from 0xf0 offset of my hostbridge:

# setpci -s 00:00.0 0xf0.l
0100

That device doesn't have extended config space, so the last valid byte
is 0xff. Let's read beyond that:

# setpci -s 00:00.0 0x100.l
ffffffff

> Again, only the drivers can determine if ~0 is a valid value. This
> information is not available inside pci_config_read*().

Of course it is.

*every* change you've done in 6/17 - this is the only patch I have
received - checks for == ~0. So that check can just as well be moved
inside pci_config_read_*().

Here's how one could do it:

#define PCI_OP_READ(size, type, len) \
int noinline pci_bus_read_config_##size \
	(struct pci_bus *bus, unsigned int devfn, int pos, type *value)	\
{									\
	int res;							\
	unsigned long flags;						\
	u32 data = 0;							\
	if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER;	\
	pci_lock_config(flags);						\
	res = bus->ops->read(bus, devfn, pos, len, &data);		\
									\
	/* Check we actually read something which is not all 1s. */	\
	if (data == ~0) {						\
		pci_unlock_config(flags);				\
		return PCIBIOS_READ_FAILED;				\
	}								\
									\
	*value = (type)data;						\
	pci_unlock_config(flags);					\
	return res;							\
}

Also, I'd prefer a function to *not* return void but return either
an error or success. In the success case, the @value argument can be
consumed by the caller and otherwise not.

In any case, that change is a step in the wrong direction and I don't
like it, sorry.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [Intel-gfx] Time, where did it go?

2020-08-02 Thread Dave Airlie
On Mon, 3 Aug 2020 at 02:44, Chris Wilson  wrote:
>
> Lots of small incremental improvements to reduce execution latency
> which basically offsets the small regressions incurred when compared to
> 5.7. And then there are some major fixes found while staring agape at
> lockstat.

What introduced the 5.7 regressions? Are they documented somewhere?

What is the goal here? Is there a benchmark or application that this
benefits, and can you quantify the benefits?

Is the lack of userspace command submission a problem vs other vendors here?

Dave.


[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [01/42] drm/i915: Fix wrong return value

2020-08-02 Thread Patchwork
== Series Details ==

Series: series starting with [01/42] drm/i915: Fix wrong return value
URL   : https://patchwork.freedesktop.org/series/80179/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8832 -> Patchwork_18295


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/index.html

New tests
---------

  New tests have been introduced between CI_DRM_8832 and Patchwork_18295:

### New IGT tests (1) ###

  * igt@i915_selftest@live@scheduler:
- Statuses : 33 pass(s)
- Exec time: [0.47, 2.95] s

  

Known issues
------------

  Here are the changes found in Patchwork_18295 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_pm_rpm@module-reload:
- fi-byt-j1900:   [PASS][1] -> [DMESG-WARN][2] ([i915#1982])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-byt-j1900/igt@i915_pm_...@module-reload.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-byt-j1900/igt@i915_pm_...@module-reload.html

  
 Possible fixes 

  * igt@i915_module_load@reload:
- fi-bxt-dsi: [DMESG-WARN][3] ([i915#1635] / [i915#1982]) -> 
[PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bxt-dsi/igt@i915_module_l...@reload.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-bxt-dsi/igt@i915_module_l...@reload.html

  * igt@i915_pm_rpm@basic-pci-d3-state:
- fi-bsw-kefka:   [DMESG-WARN][5] ([i915#1982]) -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html

  * igt@i915_selftest@live@active:
- {fi-ehl-1}: [DMESG-FAIL][7] ([i915#541]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-ehl-1/igt@i915_selftest@l...@active.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-ehl-1/igt@i915_selftest@l...@active.html

  * igt@i915_selftest@live@gem_contexts:
- fi-tgl-u2:  [INCOMPLETE][9] ([i915#2045]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1:
- fi-icl-u2:  [DMESG-WARN][11] ([i915#1982]) -> [PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html

  
 Warnings 

  * igt@i915_module_load@reload:
- fi-kbl-x1275:   [DMESG-WARN][13] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][14] ([i915#62] / [i915#92]) +1 similar issue
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@i915_module_l...@reload.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-kbl-x1275/igt@i915_module_l...@reload.html

  * igt@i915_pm_rpm@basic-rte:
- fi-kbl-guc: [SKIP][15] ([fdo#109271]) -> [DMESG-FAIL][16] 
([i915#2203])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-guc/igt@i915_pm_...@basic-rte.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-kbl-guc/igt@i915_pm_...@basic-rte.html

  * igt@i915_selftest@live@execlists:
- fi-icl-y:   [INCOMPLETE][17] -> [INCOMPLETE][18] ([i915#1684])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-y/igt@i915_selftest@l...@execlists.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-icl-y/igt@i915_selftest@l...@execlists.html

  * igt@kms_force_connector_basic@prune-stale-modes:
- fi-kbl-x1275:   [DMESG-WARN][19] ([i915#62] / [i915#92]) -> 
[DMESG-WARN][20] ([i915#62] / [i915#92] / [i915#95]) +7 similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@kms_force_connector_ba...@prune-stale-modes.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-kbl-x1275/igt@kms_force_connector_ba...@prune-stale-modes.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
  [i915#1684]: https://gitlab.freedesktop.org/drm/intel/issues/1684
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2045]: https://gitlab.freedesktop.org/drm/intel/issues/2045
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/42] drm/i915: Fix wrong return value

2020-08-02 Thread Patchwork
== Series Details ==

Series: series starting with [01/42] drm/i915: Fix wrong return value
URL   : https://patchwork.freedesktop.org/series/80179/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.




[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/42] drm/i915: Fix wrong return value

2020-08-02 Thread Patchwork
== Series Details ==

Series: series starting with [01/42] drm/i915: Fix wrong return value
URL   : https://patchwork.freedesktop.org/series/80179/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
641fe3760e7f drm/i915/gem: Don't drop the timeline lock during execbuf
1eace725c392 drm/i915/gem: Reduce context termination list iteration guard to 
RCU
-:20: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#20: 
References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from 
early waits") # rcu protection of timeline->requests

-:20: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit d22d2d073ef8 ("drm/i915: Protect 
i915_request_await_start from early waits")'
#20: 
References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from 
early waits") # rcu protection of timeline->requests

total: 1 errors, 1 warnings, 0 checks, 65 lines checked
f418a1a9323c drm/i915/gt: Protect context lifetime with RCU
ed19011a3244 drm/i915/gt: Free stale request on destroying the virtual engine
b7deae3dd990 drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb 
spinlock
9dfbca235567 drm/i915/gt: Split the breadcrumb spinlock between global and 
contexts
-:299: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#299: FILE: drivers/gpu/drm/i915/gt/intel_context_types.h:54:
+   spinlock_t signal_lock;

total: 0 errors, 0 warnings, 1 checks, 260 lines checked
d64649d5b06d drm/i915: Drop i915_request.lock serialisation around await_start
5b3023be5688 drm/i915: Drop i915_request.lock requirement for intel_rps_boost()
4cc1911d1e27 drm/i915/gem: Reduce ctx->engine_mutex for reading the clone source
a5c7155455bf drm/i915/gem: Reduce ctx->engines_mutex for get_engines()
84b6ff2b8884 drm/i915: Reduce test_and_set_bit to set_bit in 
i915_request_submit()
0fcdf7a5cdb8 drm/i915/gt: Decouple completed requests on unwind
fa84ed47c393 drm/i915/gt: Check for a completed last request once
61ede9a148ba drm/i915/gt: Refactor heartbeat request construction and submission
570c67a58ab7 drm/i915/gt: Replace direct submit with direct call to tasklet
769442507dcd drm/i915/gt: Use virtual_engine during execlists_dequeue
122fabf291d6 drm/i915/gt: Decouple inflight virtual engines
515575064cb7 drm/i915/gt: Defer schedule_out until after the next dequeue
4adeb0f9dbfe drm/i915/gt: Resubmit the virtual engine on schedule-out
4f73843fd816 drm/i915/gt: Simplify virtual engine handling for execlists_hold()
3d5f28c98ebe drm/i915/gt: ce->inflight updates are now serialised
6cb4a83d25f2 drm/i915/gt: Drop atomic for engine->fw_active tracking
a28b1bd35d4e drm/i915/gt: Extract busy-stats for ring-scheduler
-:12: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#12: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 95 lines checked
323ff915d375 drm/i915/gt: Convert stats.active to plain unsigned int
9bdc1f8119ab drm/i915: Lift waiter/signaler iterators
b1e01ae8f345 drm/i915: Strip out internal priorities
fc2d75819d99 drm/i915: Remove I915_USER_PRIORITY_SHIFT
51c3bc9b7f8f drm/i915/gt: Defer the kmem_cache_free() until after the HW submit
8b5974e86d02 drm/i915: Prune empty priolists
508fbfb2fa86 drm/i915: Replace engine->schedule() with a known request operation
acef80fd74c7 drm/i915/gt: Do not suspend bonded requests if one hangs
c472b142c7b9 drm/i915: Teach the i915_dependency to use a double-lock
cdad85c1538f drm/i915: Restructure priority inheritance
fba4ba812a76 drm/i915/selftests: Measure set-priority duration
-:52: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#52: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 553 lines checked
521577d83d5b drm/i915: Improve DFS for priority inheritance
d61ada4e881a drm/i915/gt: Remove timeslice suppression
713774df6186 drm/i915: Fair low-latency scheduling
341457ff92bf drm/i915/gt: Specify a deadline for the heartbeat
a32ab75c19ba drm/i915: Replace the priority boosting for the display with a 
deadline
a445af000bcb drm/i915: Move saturated workload detection back to the context
-:29: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#29: 
References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")

-:29: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit 44d89409a12e ("drm/i915: Make 
the semaphore saturation mask global")'
#29: 
References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")

total: 1 errors, 1 warnings, 0 checks, 69 lines checked
87199b330ee5 drm/i915/gt: Another tweak for flushing the tasklets




[Intel-gfx] [PATCH 33/42] drm/i915: Teach the i915_dependency to use a double-lock

2020-08-02 Thread Chris Wilson
Currently, we construct and tear down the i915_dependency chains using a
global spinlock. As the lists are entirely local, it should be possible
to use a double-lock with an explicit nesting [signaler -> waiter,
always] and so avoid the costly convenience of a global spinlock.
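The fixed signaler-then-waiter lock ordering can be sketched generically. This is a pthread-based illustration of the locking pattern only, not the i915 code:

```c
#include <assert.h>
#include <pthread.h>

/*
 * Double-lock sketch: the signaler's lock is always taken first (outer),
 * the waiter's second (inner). Because every linker follows the same
 * order, two concurrent link operations can never invert the locks and
 * deadlock against each other, and no global lock is needed.
 */
struct node {
	pthread_mutex_t lock;
	int nlinks;
};

static void link_dependency(struct node *signaler, struct node *waiter)
{
	pthread_mutex_lock(&signaler->lock);	/* outer: signaler first */
	pthread_mutex_lock(&waiter->lock);	/* inner: explicit nesting */
	signaler->nlinks++;			/* publish on both sides */
	waiter->nlinks++;
	pthread_mutex_unlock(&waiter->lock);
	pthread_mutex_unlock(&signaler->lock);
}
```

In the kernel, the inner acquisition would additionally carry a lockdep nesting annotation (as the patch below does with spin_lock_nested() and SINGLE_DEPTH_NESTING) so lockdep does not flag the two same-class locks as a self-deadlock.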

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c |  6 +--
 drivers/gpu/drm/i915/i915_request.c |  2 +-
 drivers/gpu/drm/i915/i915_scheduler.c   | 44 +
 drivers/gpu/drm/i915/i915_scheduler.h   |  2 +-
 drivers/gpu/drm/i915/i915_scheduler_types.h |  1 +
 5 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3999bf6c6aee..cb53f350848f 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1847,7 +1847,7 @@ static void defer_request(struct i915_request *rq, struct 
list_head * const pl)
struct i915_request *w =
container_of(p->waiter, typeof(*w), sched);
 
-   if (p->flags & I915_DEPENDENCY_WEAK)
+   if (!p->waiter || p->flags & I915_DEPENDENCY_WEAK)
continue;
 
/* Leave semaphores spinning on the other engines */
@@ -2700,7 +2700,7 @@ static void __execlists_hold(struct i915_request *rq)
struct i915_request *w =
container_of(p->waiter, typeof(*w), sched);
 
-   if (p->flags & I915_DEPENDENCY_WEAK)
+   if (!p->waiter || p->flags & I915_DEPENDENCY_WEAK)
continue;
 
/* Leave semaphores spinning on the other engines */
@@ -2798,7 +2798,7 @@ static void __execlists_unhold(struct i915_request *rq)
struct i915_request *w =
container_of(p->waiter, typeof(*w), sched);
 
-   if (p->flags & I915_DEPENDENCY_WEAK)
+   if (!p->waiter || p->flags & I915_DEPENDENCY_WEAK)
continue;
 
/* Propagate any change in error status */
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 8a263556997c..3f9e3e6a34ed 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -329,7 +329,7 @@ bool i915_request_retire(struct i915_request *rq)
intel_context_unpin(rq->context);
 
free_capture_list(rq);
-   i915_sched_node_fini(&rq->sched);
+   i915_sched_node_retire(&rq->sched);
i915_request_put(rq);
 
return true;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index 4760d7c9b7cd..247d2e2bb3c8 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -389,6 +389,8 @@ void i915_request_set_priority(struct i915_request *rq, int 
prio)
 
 void i915_sched_node_init(struct i915_sched_node *node)
 {
+   spin_lock_init(&node->lock);
+
INIT_LIST_HEAD(&node->signalers_list);
INIT_LIST_HEAD(&node->waiters_list);
INIT_LIST_HEAD(&node->link);
@@ -426,7 +428,8 @@ bool __i915_sched_node_add_dependency(struct 
i915_sched_node *node,
 {
bool ret = false;
 
-   spin_lock_irq(&schedule_lock);
+   /* The signal->lock is always the outer lock in this double-lock. */
+   spin_lock_irq(&signal->lock);
 
if (!node_signaled(signal)) {
INIT_LIST_HEAD(&dep->dfs_link);
@@ -435,15 +438,17 @@ bool __i915_sched_node_add_dependency(struct 
i915_sched_node *node,
dep->flags = flags;
 
/* All set, now publish. Beware the lockless walkers. */
+   spin_lock_nested(&node->lock, SINGLE_DEPTH_NESTING);
list_add_rcu(&dep->signal_link, &node->signalers_list);
list_add_rcu(&dep->wait_link, &signal->waiters_list);
+   spin_unlock(&node->lock);
 
/* Propagate the chains */
node->flags |= signal->flags;
ret = true;
}
 
-   spin_unlock_irq(&schedule_lock);
+   spin_unlock_irq(&signal->lock);
 
return ret;
 }
@@ -469,39 +474,46 @@ int i915_sched_node_add_dependency(struct i915_sched_node 
*node,
return 0;
 }
 
-void i915_sched_node_fini(struct i915_sched_node *node)
+void i915_sched_node_retire(struct i915_sched_node *node)
 {
struct i915_dependency *dep, *tmp;
 
-   spin_lock_irq(&schedule_lock);
+   spin_lock_irq(&node->lock);
 
/*
 * Everyone we depended upon (the fences we wait to be signaled)
 * should retire before us and remove themselves from our list.
 * However, retirement is run independently on each timeline and
-* so we may be called out-of-order.
+* so we may be called out-of-order. As we need to avoid taking
+* the signaler's lock, just mark up our completion and be wary
+   

[Intel-gfx] [PATCH 35/42] drm/i915/selftests: Measure set-priority duration

2020-08-02 Thread Chris Wilson
As a topological sort, we expect it to run in linear graph time,
O(V+E). In removing the recursion, it is no longer a DFS but rather a
BFS, and performs as O(VE). Let's demonstrate how bad this is with a few
examples, and build a few test cases to verify a potential fix.
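For reference, the linear-time behaviour being restored is that of a post-order DFS, as in this toy topological sort (a generic sketch to illustrate the O(V+E) bound, not the i915 scheduler code):

```c
#include <assert.h>

#define MAXV 8

/* Toy DAG as an adjacency matrix; edge u->v means u precedes v. */
static int adj[MAXV][MAXV];
static int nv;
static int seen[MAXV], order[MAXV], head;

/*
 * Post-order DFS: each vertex is entered once and each edge examined
 * once, so the whole sort runs in O(V+E) -- the linear graph time
 * quoted above. A BFS-style repeated relaxation, by contrast, can
 * re-scan the edge list once per vertex, degrading to O(VE).
 */
static void dfs(int v)
{
	int w;

	seen[v] = 1;
	for (w = 0; w < nv; w++)
		if (adj[v][w] && !seen[w])
			dfs(w);
	order[head++] = v;	/* all successors already emitted */
}

static void toposort(void)
{
	int v;

	head = 0;
	for (v = 0; v < nv; v++)
		seen[v] = 0;
	for (v = 0; v < nv; v++)
		if (!seen[v])
			dfs(v);
}
```

Post-order emission places sinks first, so reading `order[]` backwards gives a valid topological order of the DAG.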

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_scheduler.c |   4 +
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 .../drm/i915/selftests/i915_perf_selftests.h  |   1 +
 .../gpu/drm/i915/selftests/i915_scheduler.c   | 530 ++
 4 files changed, 536 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_scheduler.c

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index 45e000c257db..3c1a0b001746 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -565,6 +565,10 @@ void i915_sched_node_retire(struct i915_sched_node *node)
spin_unlock_irq(&node->lock);
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftests/i915_scheduler.c"
+#endif
+
 static void i915_global_scheduler_shrink(void)
 {
kmem_cache_shrink(global.slab_dependencies);
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h 
b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index a92c0e9b7e6b..2200a5baa68e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -26,6 +26,7 @@ selftest(gt_mocs, intel_mocs_live_selftests)
 selftest(gt_pm, intel_gt_pm_live_selftests)
 selftest(gt_heartbeat, intel_heartbeat_live_selftests)
 selftest(requests, i915_request_live_selftests)
+selftest(scheduler, i915_scheduler_live_selftests)
 selftest(active, i915_active_live_selftests)
 selftest(objects, i915_gem_object_live_selftests)
 selftest(mman, i915_gem_mman_live_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h 
b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
index c2389f8a257d..137e35283fee 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
@@ -17,5 +17,6 @@
  */
 selftest(engine_cs, intel_engine_cs_perf_selftests)
 selftest(request, i915_request_perf_selftests)
+selftest(scheduler, i915_scheduler_perf_selftests)
 selftest(blt, i915_gem_object_blt_perf_selftests)
 selftest(region, intel_memory_region_perf_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c 
b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
new file mode 100644
index ..98d29fa8f5f8
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c
@@ -0,0 +1,530 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include "i915_selftest.h"
+
+#include "gt/intel_context.h"
+#include "gt/selftest_engine_heartbeat.h"
+#include "selftests/igt_spinner.h"
+
+static void scheduling_disable(struct intel_engine_cs *engine)
+{
+   engine->props.preempt_timeout_ms = 0;
+   engine->props.timeslice_duration_ms = 0;
+
+   st_engine_heartbeat_disable(engine);
+}
+
+static void scheduling_enable(struct intel_engine_cs *engine)
+{
+   st_engine_heartbeat_enable(engine);
+
+   engine->props.preempt_timeout_ms =
+   engine->defaults.preempt_timeout_ms;
+   engine->props.timeslice_duration_ms =
+   engine->defaults.timeslice_duration_ms;
+}
+
+static int first_engine(struct drm_i915_private *i915,
+   int (*chain)(struct intel_engine_cs *engine,
+unsigned long param,
+void (*fn)(struct i915_request *rq,
+   unsigned long v,
+   unsigned long e)),
+   unsigned long param,
+   void (*fn)(struct i915_request *rq,
+  unsigned long v, unsigned long e))
+{
+   struct intel_engine_cs *engine;
+
+   for_each_uabi_engine(engine, i915) {
+   if (!intel_engine_has_scheduler(engine))
+   continue;
+
+   return chain(engine, param, fn);
+   }
+
+   return 0;
+}
+
+static int all_engines(struct drm_i915_private *i915,
+  int (*chain)(struct intel_engine_cs *engine,
+   unsigned long param,
+   void (*fn)(struct i915_request *rq,
+  unsigned long v,
+  unsigned long e)),
+  unsigned long param,
+  void (*fn)(struct i915_request *rq,
+ unsigned long v, unsigned long e))
+{
+   struct intel_engine_cs *engine;
+   int err;
+
+   for_each_uabi_engine(engine, i915) {
+   if (!intel_engine_has_scheduler(engine))
+

[Intel-gfx] [PATCH 10/42] drm/i915/gem: Reduce ctx->engine_mutex for reading the clone source

2020-08-02 Thread Chris Wilson
When cloning the engines from the source context, we need to ensure that
the engines are not freed as we copy them, and that the flags we clone
from the source correspond with the engines we copy across. To do this
we need only take a reference to the src->engines, rather than hold the
src->engine_mutex, so long as we verify that nothing changed under the
read.
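The lock-free pattern relied on here — snapshot an RCU-published pointer, try to take a reference, and retry if the object died underneath us — can be sketched in userspace C11. All names (`engines_await`, `get_unless_zero`, `current_engines`) are illustrative stand-ins, not the driver's API:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct engines {
	_Atomic int ref; /* 0 means the object is being torn down */
};

static _Atomic(struct engines *) current_engines;

/* Mirror of kref_get_unless_zero(): only take a reference while the
 * object is still alive; fail if it already dropped to zero. */
static bool get_unless_zero(struct engines *e)
{
	int r = atomic_load_explicit(&e->ref, memory_order_relaxed);

	while (r > 0)
		if (atomic_compare_exchange_weak_explicit(&e->ref, &r, r + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	return false;
}

/* Snapshot the current engines without holding a mutex: loop until we
 * manage to pin a copy that was still live when we looked. */
static struct engines *engines_await(void)
{
	struct engines *e;

	do {
		e = atomic_load_explicit(&current_engines,
					 memory_order_acquire);
	} while (e && !get_unless_zero(e));
	return e;
}
```

The reader pays only an atomic increment on the fast path; writers are never blocked, which is the point of dropping the engine_mutex for this read.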

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 24 +
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index db893f6c516b..6530bd2e634e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -755,7 +755,8 @@ __create_context(struct drm_i915_private *i915)
 }
 
 static inline struct i915_gem_engines *
-__context_engines_await(const struct i915_gem_context *ctx)
+__context_engines_await(const struct i915_gem_context *ctx,
+   bool *user_engines)
 {
struct i915_gem_engines *engines;
 
@@ -764,6 +765,10 @@ __context_engines_await(const struct i915_gem_context *ctx)
engines = rcu_dereference(ctx->engines);
GEM_BUG_ON(!engines);
 
+   if (user_engines)
+   *user_engines = i915_gem_context_user_engines(ctx);
+
+   /* successful await => strong mb */
if (unlikely(!i915_sw_fence_await(&engines->fence)))
continue;
 
@@ -787,7 +792,7 @@ context_apply_all(struct i915_gem_context *ctx,
struct intel_context *ce;
int err = 0;
 
-   e = __context_engines_await(ctx);
+   e = __context_engines_await(ctx, NULL);
for_each_gem_engine(ce, e, it) {
err = fn(ce, data);
if (err)
@@ -1129,7 +1134,7 @@ static int context_barrier_task(struct i915_gem_context 
*ctx,
return err;
}
 
-   e = __context_engines_await(ctx);
+   e = __context_engines_await(ctx, NULL);
if (!e) {
i915_active_release(&cb->base);
return -ENOENT;
@@ -2126,11 +2131,14 @@ static int copy_ring_size(struct intel_context *dst,
 static int clone_engines(struct i915_gem_context *dst,
 struct i915_gem_context *src)
 {
-   struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
-   struct i915_gem_engines *clone;
+   struct i915_gem_engines *clone, *e;
bool user_engines;
unsigned long n;
 
+   e = __context_engines_await(src, &user_engines);
+   if (!e)
+   return -ENOENT;
+
clone = alloc_engines(e->num_engines);
if (!clone)
goto err_unlock;
@@ -2172,9 +2180,7 @@ static int clone_engines(struct i915_gem_context *dst,
}
}
clone->num_engines = n;
-
-   user_engines = i915_gem_context_user_engines(src);
-   i915_gem_context_unlock_engines(src);
+   i915_sw_fence_complete(&e->fence);
 
/* Serialised by constructor */
engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1));
@@ -2185,7 +2191,7 @@ static int clone_engines(struct i915_gem_context *dst,
return 0;
 
 err_unlock:
-   i915_gem_context_unlock_engines(src);
+   i915_sw_fence_complete(&e->fence);
return -ENOMEM;
 }
 
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 06/42] drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock

2020-08-02 Thread Chris Wilson
Make b->signaled_requests a lockless-list so that we can manipulate it
outside of the b->irq_lock.
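For reference, the core of an llist-style lockless list — producers push with a compare-and-swap, a consumer detaches the whole chain in one atomic exchange — can be modelled in userspace C11. This is a sketch of the idea, not the kernel's implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct lnode {
	struct lnode *next;
};

struct lhead {
	_Atomic(struct lnode *) first;
};

/* Push one node; returns true if the list was previously empty,
 * mirroring llist_add()'s return value. */
static bool lpush(struct lnode *n, struct lhead *h)
{
	struct lnode *first =
		atomic_load_explicit(&h->first, memory_order_relaxed);

	do {
		n->next = first;
	} while (!atomic_compare_exchange_weak_explicit(&h->first, &first, n,
							memory_order_release,
							memory_order_relaxed));
	return first == NULL;
}

/* Detach the entire chain in one atomic exchange, like llist_del_all();
 * the consumer then walks its private copy without any lock held. */
static struct lnode *ltake_all(struct lhead *h)
{
	return atomic_exchange_explicit(&h->first, NULL, memory_order_acquire);
}

/* Single-threaded demo: push three nodes and count what we detach. */
static int demo_count(void)
{
	struct lhead h = { NULL };
	struct lnode n[3];
	int count = 0;

	for (int i = 0; i < 3; i++)
		lpush(&n[i], &h);
	for (struct lnode *p = ltake_all(&h); p; p = p->next)
		count++;
	return count;
}
```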

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 53 ++-
 .../gpu/drm/i915/gt/intel_breadcrumbs_types.h |  2 +-
 drivers/gpu/drm/i915/i915_request.h   |  6 ++-
 3 files changed, 46 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index d8b206e53660..9e7ac612fabb 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -174,16 +174,13 @@ static void add_retire(struct intel_breadcrumbs *b, 
struct intel_timeline *tl)
intel_engine_add_retire(b->irq_engine, tl);
 }
 
-static bool __signal_request(struct i915_request *rq, struct list_head 
*signals)
+static bool __signal_request(struct i915_request *rq)
 {
-   clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
-
if (!__dma_fence_signal(&rq->fence)) {
i915_request_put(rq);
return false;
}
 
-   list_add_tail(&rq->signal_link, signals);
return true;
 }
 
@@ -191,17 +188,42 @@ static void signal_irq_work(struct irq_work *work)
 {
struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work);
const ktime_t timestamp = ktime_get();
+   struct llist_node *signal, *sn;
struct intel_context *ce, *cn;
struct list_head *pos, *next;
-   LIST_HEAD(signal);
+
+   signal = NULL;
+   if (unlikely(!llist_empty(&b->signaled_requests)))
+   signal = llist_del_all(&b->signaled_requests);
 
spin_lock(&b->irq_lock);
 
-   if (list_empty(&b->signalers))
+   /*
+* Keep the irq armed until the interrupt after all listeners are gone.
+*
+* Enabling/disabling the interrupt is rather costly, roughly a couple
+* of hundred microseconds. If we are proactive and enable/disable
+* the interrupt around every request that wants a breadcrumb, we
+* quickly drown in the extra orders of magnitude of latency imposed
+* on request submission.
+*
+* So we try to be lazy, and keep the interrupts enabled until no
+* more listeners appear within a breadcrumb interrupt interval (that
+* is until a request completes that no one cares about). The
+* observation is that listeners come in batches, and will often
+* listen to a bunch of requests in succession.
+*
+* We also try to avoid raising too many interrupts, as they may
+* be generated by userspace batches and it is unfortunately rather
+* too easy to drown the CPU under a flood of GPU interrupts. Thus
+* whenever no one appears to be listening, we turn off the interrupts.
+* Fewer interrupts should conserve power -- at the very least, fewer
+* interrupts draw less ire from other users of the system and tools
+* like powertop.
+*/
+   if (!signal && list_empty(&b->signalers))
__intel_breadcrumbs_disarm_irq(b);
 
-   list_splice_init(&b->signaled_requests, &signal);
-
list_for_each_entry_safe(ce, cn, &b->signalers, signal_link) {
GEM_BUG_ON(list_empty(&ce->signals));
 
@@ -218,7 +240,11 @@ static void signal_irq_work(struct irq_work *work)
 * spinlock as the callback chain may end up adding
 * more signalers to the same context or engine.
 */
-   __signal_request(rq, );
+   clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
+   if (__signal_request(rq)) {
+   rq->signal_node.next = signal;
+   signal = &rq->signal_node;
+   }
}
 
/*
@@ -238,9 +264,9 @@ static void signal_irq_work(struct irq_work *work)
 
spin_unlock(&b->irq_lock);
 
-   list_for_each_safe(pos, next, &signal) {
+   llist_for_each_safe(signal, sn, signal) {
struct i915_request *rq =
-   list_entry(pos, typeof(*rq), signal_link);
+   llist_entry(signal, typeof(*rq), signal_node);
struct list_head cb_list;
 
spin_lock(&rq->lock);
@@ -264,7 +290,7 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine)
 
spin_lock_init(&b->irq_lock);
INIT_LIST_HEAD(&b->signalers);
-   INIT_LIST_HEAD(&b->signaled_requests);
+   init_llist_head(&b->signaled_requests);
 
init_irq_work(&b->irq_work, signal_irq_work);
 
@@ -327,7 +353,8 @@ static void insert_breadcrumb(struct i915_request *rq,
 * its signal completion.
 */
if (__request_completed(rq)) {
-   if (__signal_request(rq, &b->signaled_requests))
+   if (__signal_request(rq) &&
+   llist_add(&rq->signal_node, 

[Intel-gfx] [PATCH 03/42] drm/i915/gem: Reduce context termination list iteration guard to RCU

2020-08-02 Thread Chris Wilson
As we now protect the timeline list using RCU, we can drop the
timeline->mutex for guarding the list iteration during context close, as
we are searching for an inflight request. Any new request will see the
context is banned and not be submitted. In doing so, pull the checks for
a concurrent submission of the request (notably the
i915_request_completed()) under the engine spinlock, to fully serialise
with __i915_request_submit(). That is, in the case of preempt-to-busy
where the request may be completed during __i915_request_submit(), we
need to be careful that we sample the request status after serialising
so that we don't miss the request the engine is actually submitting.

Fixes: 4a3174152147 ("drm/i915/gem: Refine occupancy test in kill_context()")
References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from 
early waits") # rcu protection of timeline->requests
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 32 -
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d8cccbab7a51..db893f6c516b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -439,29 +439,36 @@ static bool __cancel_engine(struct intel_engine_cs 
*engine)
return __reset_engine(engine);
 }
 
-static struct intel_engine_cs *__active_engine(struct i915_request *rq)
+static bool
+__active_engine(struct i915_request *rq, struct intel_engine_cs **active)
 {
struct intel_engine_cs *engine, *locked;
+   bool ret = false;
 
/*
 * Serialise with __i915_request_submit() so that it sees
 * is-banned?, or we know the request is already inflight.
+*
+* Note that rq->engine is unstable, and so we double
+* check that we have acquired the lock on the final engine.
 */
locked = READ_ONCE(rq->engine);
spin_lock_irq(&locked->active.lock);
while (unlikely(locked != (engine = READ_ONCE(rq->engine {
spin_unlock(&locked->active.lock);
-   spin_lock(&engine->active.lock);
locked = engine;
+   spin_lock(&locked->active.lock);
}
 
-   engine = NULL;
-   if (i915_request_is_active(rq) && rq->fence.error != -EIO)
-   engine = rq->engine;
+   if (!i915_request_completed(rq)) {
+   if (i915_request_is_active(rq) && rq->fence.error != -EIO)
+   *active = locked;
+   ret = true;
+   }
 
spin_unlock_irq(&locked->active.lock);
 
-   return engine;
+   return ret;
 }
 
 static struct intel_engine_cs *active_engine(struct intel_context *ce)
@@ -472,17 +479,16 @@ static struct intel_engine_cs *active_engine(struct 
intel_context *ce)
if (!ce->timeline)
return NULL;
 
-   mutex_lock(&ce->timeline->mutex);
-   list_for_each_entry_reverse(rq, &ce->timeline->requests, link) {
-   if (i915_request_completed(rq))
-   break;
+   rcu_read_lock();
+   list_for_each_entry_rcu(rq, &ce->timeline->requests, link) {
+   if (i915_request_is_active(rq) && i915_request_completed(rq))
+   continue;
 
/* Check with the backend if the request is inflight */
-   engine = __active_engine(rq);
-   if (engine)
+   if (__active_engine(rq, &engine))
break;
}
-   mutex_unlock(&ce->timeline->mutex);
+   rcu_read_unlock();
 
return engine;
 }
-- 
2.20.1



[Intel-gfx] Time, where did it go?

2020-08-02 Thread Chris Wilson
Lots of small incremental improvements to reduce execution latency
which basically offsets the small regressions incurred when compared to
5.7. And then there are some major fixes found while staring agape at
lockstat.
-Chris




[Intel-gfx] [PATCH 20/42] drm/i915/gt: Resubmit the virtual engine on schedule-out

2020-08-02 Thread Chris Wilson
Having recognised that we do not change the sibling until we schedule
out, we can then defer the decision to resubmit the virtual engine from
the unwind of the active queue to scheduling out of the virtual context.

By keeping the unwind order intact on the local engine, we can preserve
data dependency ordering while doing a preempt-to-busy pass until we
have determined the new ELSP. This means that if we try to timeslice
between a virtual engine and a data-dependent ordinary request, the pair
will maintain their relative ordering and we will avoid the
resubmission, cancelling the timeslicing until further change.

The dilemma though is that we then may end up in a situation where the
'demotion' of the virtual request to an ordinary request in the engine
queue results in filling the ELSP[] with virtual requests instead of
spreading the load across the engines. To compensate for this, we mark
each virtual request and refuse to resubmit a virtual request in the
secondary ELSP slots, thus forcing subsequent virtual requests to be
scheduled out after timeslicing. By delaying the decision until we
schedule out, we will avoid unnecessary resubmission.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c| 124 +++--
 drivers/gpu/drm/i915/gt/selftest_lrc.c |   2 +-
 2 files changed, 78 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index a9ae0638e117..d1a45d5e4225 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -,39 +,23 @@ __unwind_incomplete_requests(struct intel_engine_cs 
*engine)
 
__i915_request_unsubmit(rq);
 
-   /*
-* Push the request back into the queue for later resubmission.
-* If this request is not native to this physical engine (i.e.
-* it came from a virtual source), push it back onto the virtual
-* engine so that it can be moved across onto another physical
-* engine as load dictates.
-*/
-   if (likely(rq->execution_mask == engine->mask)) {
-   GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-   if (rq_prio(rq) != prio) {
-   prio = rq_prio(rq);
-   pl = i915_sched_lookup_priolist(engine, prio);
-   }
-   
GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-
-   list_move(&rq->sched.link, pl);
-   set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+   GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+   if (rq_prio(rq) != prio) {
+   prio = rq_prio(rq);
+   pl = i915_sched_lookup_priolist(engine, prio);
+   }
+   GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
-   /* Check in case we rollback so far we wrap [size/2] */
-   if (intel_ring_direction(rq->ring,
-intel_ring_wrap(rq->ring,
-rq->tail),
-rq->ring->tail) > 0)
-   rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE;
+   list_move(&rq->sched.link, pl);
+   set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 
-   active = rq;
-   } else {
-   struct intel_engine_cs *owner = rq->context->engine;
+   /* Check in case we rollback so far we wrap [size/2] */
+   if (intel_ring_direction(rq->ring,
+intel_ring_wrap(rq->ring, rq->tail),
+rq->ring->tail) > 0)
+   rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE;
 
-   WRITE_ONCE(rq->engine, owner);
-   owner->submit_request(rq);
-   active = NULL;
-   }
+   active = rq;
}
 
return active;
@@ -1387,12 +1371,37 @@ static inline void execlists_schedule_in(struct 
i915_request *rq, int idx)
GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
 }
 
+static void
+resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve)
+{
+   struct intel_engine_cs *engine = rq->engine;
+   unsigned long flags;
+
+   spin_lock_irqsave(&engine->active.lock, flags);
+
+   clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+   WRITE_ONCE(rq->engine, &ve->base);
+   ve->base.submit_request(rq);
+
+   spin_unlock_irqrestore(&engine->active.lock, flags);
+}
+
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
struct virtual_engine *ve = container_of(ce, typeof(*ve), 

[Intel-gfx] [PATCH 29/42] drm/i915/gt: Defer the kmem_cache_free() until after the HW submit

2020-08-02 Thread Chris Wilson
Watching lock_stat, we noticed that the kmem_cache_free() would cause
the occasional multi-millisecond spike (directly affecting max-holdtime
and so the max-waittime). Delaying our submission of the next ELSP by a
millisecond will leave the GPU idle, so defer the kmem_cache_free()
until afterwards.
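The trick is to reuse a field of the dead object to chain it onto a caller-local list and batch the frees after the latency-sensitive work. A minimal userspace model (names are illustrative; the driver chains through p->requests.next):

```c
#include <stdlib.h>
#include <stddef.h>

struct priolist {
	int priority;
	struct priolist *next; /* stands in for reusing p->requests.next */
};

/* Instead of calling free() on the hot path, link the node onto a
 * caller-local chain and return the new head. */
static struct priolist *defer_free(struct priolist *p, struct priolist *head)
{
	p->next = head;
	return p;
}

/* After the time-critical work (the ELSP write), release the batch;
 * returns how many nodes were freed. */
static int free_many(struct priolist *head)
{
	int n = 0;

	while (head) {
		struct priolist *p = head;

		head = p->next;
		free(p);
		n++;
	}
	return n;
}

/* Demo: "dequeue" three nodes, deferring each free, then batch-release. */
static int demo(void)
{
	struct priolist *head = NULL;

	for (int i = 0; i < 3; i++) {
		struct priolist *p = malloc(sizeof(*p));

		p->priority = i;
		head = defer_free(p, head);
	}
	return free_many(head);
}
```

The deferral costs one pointer store per node on the hot path, moving the allocator's unpredictable latency entirely off the submission critical section.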

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 10 +-
 drivers/gpu/drm/i915/i915_scheduler.c | 13 +
 drivers/gpu/drm/i915/i915_scheduler.h | 12 
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e8f6d0a80c8e..7ac864cd57e3 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2029,6 +2029,7 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
struct i915_request **port = execlists->pending;
struct i915_request ** const last_port = port + execlists->port_mask;
struct i915_request *last = *execlists->active;
+   struct list_head *free = NULL;
struct virtual_engine *ve;
struct rb_node *rb;
bool submit = false;
@@ -2316,8 +2317,9 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
}
}
 
+   /* Remove the node, but defer the free for later */
rb_erase_cached(&p->node, &execlists->queue);
-   i915_priolist_free(p);
+   free = i915_priolist_free_defer(p, free);
}
 done:
*port++ = i915_request_get(last);
@@ -2369,6 +2371,12 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
i915_request_put(*port);
*execlists->pending = NULL;
}
+
+   /*
+* We noticed that kmem_cache_free() may cause 1ms+ latencies, so
+* we defer the frees until after we have submitted the ELSP.
+*/
+   i915_priolist_free_many(free);
 }
 
 static inline void clear_ports(struct i915_request **ports, int count)
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index a9973d7a724c..bfbbd94dfcbc 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -126,6 +126,19 @@ void __i915_priolist_free(struct i915_priolist *p)
kmem_cache_free(global.slab_priorities, p);
 }
 
+void i915_priolist_free_many(struct list_head *list)
+{
+   while (list) {
+   struct i915_priolist *p;
+
+   p = container_of(list, typeof(*p), requests);
+   list = p->requests.next;
+
+   GEM_BUG_ON(p->priority == I915_PRIORITY_NORMAL);
+   kmem_cache_free(global.slab_priorities, p);
+   }
+}
+
 struct sched_cache {
struct list_head *priolist;
 };
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h 
b/drivers/gpu/drm/i915/i915_scheduler.h
index b089d5cace1d..d8bf335c5e96 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -46,4 +46,16 @@ static inline void i915_priolist_free(struct i915_priolist 
*p)
__i915_priolist_free(p);
 }
 
+void i915_priolist_free_many(struct list_head *list);
+
+static inline struct list_head *
+i915_priolist_free_defer(struct i915_priolist *p, struct list_head *free)
+{
+   if (p->priority != I915_PRIORITY_NORMAL) {
+   p->requests.next = free;
+   free = &p->requests;
+   }
+   return free;
+}
+
 #endif /* _I915_SCHEDULER_H_ */
-- 
2.20.1



[Intel-gfx] [PATCH 11/42] drm/i915/gem: Reduce ctx->engines_mutex for get_engines()

2020-08-02 Thread Chris Wilson
Take a snapshot of the ctx->engines, so we can avoid taking the
ctx->engines_mutex for a mere read in get_engines().

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 39 +
 1 file changed, 8 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 6530bd2e634e..56700ddbed47 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1874,27 +1874,6 @@ set_engines(struct i915_gem_context *ctx,
return 0;
 }
 
-static struct i915_gem_engines *
-__copy_engines(struct i915_gem_engines *e)
-{
-   struct i915_gem_engines *copy;
-   unsigned int n;
-
-   copy = alloc_engines(e->num_engines);
-   if (!copy)
-   return ERR_PTR(-ENOMEM);
-
-   for (n = 0; n < e->num_engines; n++) {
-   if (e->engines[n])
-   copy->engines[n] = intel_context_get(e->engines[n]);
-   else
-   copy->engines[n] = NULL;
-   }
-   copy->num_engines = n;
-
-   return copy;
-}
-
 static int
 get_engines(struct i915_gem_context *ctx,
struct drm_i915_gem_context_param *args)
@@ -1902,19 +1881,17 @@ get_engines(struct i915_gem_context *ctx,
struct i915_context_param_engines __user *user;
struct i915_gem_engines *e;
size_t n, count, size;
+   bool user_engines;
int err = 0;
 
-   err = mutex_lock_interruptible(&ctx->engines_mutex);
-   if (err)
-   return err;
+   e = __context_engines_await(ctx, &user_engines);
+   if (!e)
+   return -ENOENT;
 
-   e = NULL;
-   if (i915_gem_context_user_engines(ctx))
-   e = __copy_engines(i915_gem_context_engines(ctx));
-   mutex_unlock(&ctx->engines_mutex);
-   if (IS_ERR_OR_NULL(e)) {
+   if (!user_engines) {
+   i915_sw_fence_complete(&e->fence);
args->size = 0;
-   return PTR_ERR_OR_ZERO(e);
+   return 0;
}
 
count = e->num_engines;
@@ -1965,7 +1942,7 @@ get_engines(struct i915_gem_context *ctx,
args->size = size;
 
 err_free:
-   free_engines(e);
+   i915_sw_fence_complete(&e->fence);
return err;
 }
 
-- 
2.20.1



[Intel-gfx] [PATCH 24/42] drm/i915/gt: Extract busy-stats for ring-scheduler

2020-08-02 Thread Chris Wilson
Lift the busy-stats context-in/out implementation out of intel_lrc, so
that we can reuse it for other scheduler implementations.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_stats.h | 49 
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 34 +-
 2 files changed, 50 insertions(+), 33 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_stats.h

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_stats.h 
b/drivers/gpu/drm/i915/gt/intel_engine_stats.h
new file mode 100644
index ..58491eae3482
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_engine_stats.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __INTEL_ENGINE_STATS_H__
+#define __INTEL_ENGINE_STATS_H__
+
+#include 
+#include 
+#include 
+
+#include "i915_gem.h" /* GEM_BUG_ON */
+#include "intel_engine.h"
+
+static inline void intel_engine_context_in(struct intel_engine_cs *engine)
+{
+   unsigned long flags;
+
+   if (atomic_add_unless(&engine->stats.active, 1, 0))
+   return;
+
+   write_seqlock_irqsave(&engine->stats.lock, flags);
+   if (!atomic_add_unless(&engine->stats.active, 1, 0)) {
+   engine->stats.start = ktime_get();
+   atomic_inc(&engine->stats.active);
+   }
+   write_sequnlock_irqrestore(&engine->stats.lock, flags);
+}
+
+static inline void intel_engine_context_out(struct intel_engine_cs *engine)
+{
+   unsigned long flags;
+
+   GEM_BUG_ON(!atomic_read(&engine->stats.active));
+
+   if (atomic_add_unless(&engine->stats.active, -1, 1))
+   return;
+
+   write_seqlock_irqsave(&engine->stats.lock, flags);
+   if (atomic_dec_and_test(&engine->stats.active)) {
+   engine->stats.total =
+   ktime_add(engine->stats.total,
+ ktime_sub(ktime_get(), engine->stats.start));
+   }
+   write_sequnlock_irqrestore(&engine->stats.lock, flags);
+}
+
+#endif /* __INTEL_ENGINE_STATS_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 61dfc932201e..ccf0b43c02cf 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -140,6 +140,7 @@
 #include "intel_breadcrumbs.h"
 #include "intel_context.h"
 #include "intel_engine_pm.h"
+#include "intel_engine_stats.h"
 #include "intel_gt.h"
 #include "intel_gt_pm.h"
 #include "intel_gt_requests.h"
@@ -1156,39 +1157,6 @@ execlists_context_status_change(struct i915_request *rq, 
unsigned long status)
   status, rq);
 }
 
-static void intel_engine_context_in(struct intel_engine_cs *engine)
-{
-   unsigned long flags;
-
-   if (atomic_add_unless(&engine->stats.active, 1, 0))
-   return;
-
-   write_seqlock_irqsave(&engine->stats.lock, flags);
-   if (!atomic_add_unless(&engine->stats.active, 1, 0)) {
-   engine->stats.start = ktime_get();
-   atomic_inc(&engine->stats.active);
-   }
-   write_sequnlock_irqrestore(&engine->stats.lock, flags);
-}
-
-static void intel_engine_context_out(struct intel_engine_cs *engine)
-{
-   unsigned long flags;
-
-   GEM_BUG_ON(!atomic_read(&engine->stats.active));
-
-   if (atomic_add_unless(&engine->stats.active, -1, 1))
-   return;
-
-   write_seqlock_irqsave(&engine->stats.lock, flags);
-   if (atomic_dec_and_test(&engine->stats.active)) {
-   engine->stats.total =
-   ktime_add(engine->stats.total,
- ktime_sub(ktime_get(), engine->stats.start));
-   }
-   write_sequnlock_irqrestore(&engine->stats.lock, flags);
-}
-
 static void
 execlists_check_context(const struct intel_context *ce,
const struct intel_engine_cs *engine)
-- 
2.20.1



[Intel-gfx] [PATCH 04/42] drm/i915/gt: Protect context lifetime with RCU

2020-08-02 Thread Chris Wilson
Allow a brief period for continued access to a dead intel_context by
deferring the release of the struct until after an RCU grace period.
As we are using a dedicated slab cache for the contexts, we can defer
the release of the slab pages via RCU, with the caveat that individual
structs may be reused from the freelist within an RCU grace period. To
handle that, we have to avoid clearing members of the zombie struct.
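The SLAB_TYPESAFE_BY_RCU-style discipline described above — one-time constructor initialisation for members that racing readers may touch, per-allocation reinit only for the rest, and no clearing on free — can be modelled with a trivial userspace freelist (illustrative names, not the kernel API):

```c
#include <stdlib.h>
#include <stddef.h>

struct ctx {
	int lock_ready;   /* ctor-initialised; must survive reuse intact */
	int refcount;     /* per-use state; reinitialised on every alloc */
	struct ctx *next; /* freelist linkage */
};

static struct ctx *freelist;

/* Runs once per object, not once per allocation: readers racing with a
 * free may still dereference a recycled object, so anything they touch
 * is set up here and never cleared. */
static void ctx_ctor(struct ctx *ce)
{
	ce->lock_ready = 1;
}

static struct ctx *ctx_alloc(void)
{
	struct ctx *ce = freelist;

	if (ce) {
		freelist = ce->next; /* reuse: the ctor does NOT run again */
	} else {
		ce = malloc(sizeof(*ce));
		ctx_ctor(ce);
	}
	ce->refcount = 1; /* only the unstable members are reset */
	return ce;
}

static void ctx_free(struct ctx *ce)
{
	/* No memset: clearing the zombie would break readers that still
	 * rely on the ctor-initialised members. */
	ce->next = freelist;
	freelist = ce;
}
```

This is why the patch below replaces kmem_cache_zalloc() with kmem_cache_alloc() plus an explicit constructor and reinit path: zeroing on allocation would wipe exactly the state a concurrent reader depends on.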

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_context.c | 330 +---
 drivers/gpu/drm/i915/i915_active.c  |  10 +
 drivers/gpu/drm/i915/i915_active.h  |   2 +
 drivers/gpu/drm/i915/i915_utils.h   |   7 +
 4 files changed, 202 insertions(+), 147 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 52db2bde44a3..4e7924640ffa 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -22,7 +22,7 @@ static struct i915_global_context {
 
 static struct intel_context *intel_context_alloc(void)
 {
-   return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
+   return kmem_cache_alloc(global.slab_ce, GFP_KERNEL);
 }
 
 void intel_context_free(struct intel_context *ce)
@@ -30,6 +30,177 @@ void intel_context_free(struct intel_context *ce)
kmem_cache_free(global.slab_ce, ce);
 }
 
+static int __context_pin_state(struct i915_vma *vma)
+{
+   unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS;
+   int err;
+
+   err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH);
+   if (err)
+   return err;
+
+   err = i915_active_acquire(&vma->active);
+   if (err)
+   goto err_unpin;
+
+   /*
+* And mark it as a globally pinned object to let the shrinker know
+* it cannot reclaim the object until we release it.
+*/
+   i915_vma_make_unshrinkable(vma);
+   vma->obj->mm.dirty = true;
+
+   return 0;
+
+err_unpin:
+   i915_vma_unpin(vma);
+   return err;
+}
+
+static void __context_unpin_state(struct i915_vma *vma)
+{
+   i915_vma_make_shrinkable(vma);
+   i915_active_release(&vma->active);
+   __i915_vma_unpin(vma);
+}
+
+static int __ring_active(struct intel_ring *ring)
+{
+   int err;
+
+   err = intel_ring_pin(ring);
+   if (err)
+   return err;
+
+   err = i915_active_acquire(&ring->vma->active);
+   if (err)
+   goto err_pin;
+
+   return 0;
+
+err_pin:
+   intel_ring_unpin(ring);
+   return err;
+}
+
+static void __ring_retire(struct intel_ring *ring)
+{
+   i915_active_release(&ring->vma->active);
+   intel_ring_unpin(ring);
+}
+
+__i915_active_call
+static void __intel_context_retire(struct i915_active *active)
+{
+   struct intel_context *ce = container_of(active, typeof(*ce), active);
+
+   CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n",
+intel_context_get_total_runtime_ns(ce),
+intel_context_get_avg_runtime_ns(ce));
+
+   set_bit(CONTEXT_VALID_BIT, &ce->flags);
+   if (ce->state)
+   __context_unpin_state(ce->state);
+
+   intel_timeline_unpin(ce->timeline);
+   __ring_retire(ce->ring);
+
+   intel_context_put(ce);
+}
+
+static int __intel_context_active(struct i915_active *active)
+{
+   struct intel_context *ce = container_of(active, typeof(*ce), active);
+   int err;
+
+   CE_TRACE(ce, "active\n");
+
+   intel_context_get(ce);
+
+   err = __ring_active(ce->ring);
+   if (err)
+   goto err_put;
+
+   err = intel_timeline_pin(ce->timeline);
+   if (err)
+   goto err_ring;
+
+   if (!ce->state)
+   return 0;
+
+   err = __context_pin_state(ce->state);
+   if (err)
+   goto err_timeline;
+
+   return 0;
+
+err_timeline:
+   intel_timeline_unpin(ce->timeline);
+err_ring:
+   __ring_retire(ce->ring);
+err_put:
+   intel_context_put(ce);
+   return err;
+}
+
+static void __intel_context_ctor(void *arg)
+{
+   struct intel_context *ce = arg;
+
+   INIT_LIST_HEAD(&ce->signal_link);
+   INIT_LIST_HEAD(&ce->signals);
+
+   atomic_set(&ce->pin_count, 0);
+   mutex_init(&ce->pin_mutex);
+
+   ce->active_count = 0;
+   i915_active_init(&ce->active,
+__intel_context_active, __intel_context_retire);
+
+   ce->inflight = NULL;
+   ce->lrc_reg_state = NULL;
+   ce->lrc.desc = 0;
+}
+
+static void
+__intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
+{
+   GEM_BUG_ON(!engine->cops);
+   GEM_BUG_ON(!engine->gt->vm);
+
+   kref_init(&ce->ref);
+   i915_active_reinit(&ce->active);
+   mutex_reinit(&ce->pin_mutex);
+
+   ce->engine = engine;
+   ce->ops = engine->cops;
+   ce->sseu = engine->sseu;
+
+   ce->wa_bb_page = 0;
+   ce->flags = 0;
+   ce->tag = 0;
+
+   memset(&ce->runtime, 0, sizeof(ce->runtime));
+
+   

[Intel-gfx] [PATCH 42/42] drm/i915/gt: Another tweak for flushing the tasklets

2020-08-02 Thread Chris Wilson
tasklet_kill() ensures that we _yield_ the processor until a remote
tasklet is completed. However, this leads to a starvation condition as
being at the bottom of the scheduler's runqueue means that anything else
is able to run, including all hogs keeping the tasklet occupied.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index e3bda3509cd0..5c08c29e58af 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1197,10 +1197,6 @@ void intel_engine_flush_submission(struct 
intel_engine_cs *engine)
if (!t->func)
return;
 
-   /* Synchronise and wait for the tasklet on another CPU */
-   tasklet_kill(t);
-
-   /* Having cancelled the tasklet, ensure that is run */
local_bh_disable();
if (tasklet_trylock(t)) {
/* Must wait for any GPU reset in progress. */
@@ -1209,6 +1205,9 @@ void intel_engine_flush_submission(struct intel_engine_cs 
*engine)
tasklet_unlock(t);
}
local_bh_enable();
+
+   /* Synchronise and wait for the tasklet on another CPU */
+   tasklet_unlock_wait(t);
 }
 
 /**
-- 
2.20.1
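The new flush relies on tasklet_trylock()/tasklet_unlock_wait() rather than
tasklet_kill(): run the work inline if we can take the run lock, then spin out
any concurrent runner instead of yielding to the scheduler. A rough userspace
model of that pattern (an atomic flag standing in for TASKLET_STATE_RUN; a
sketch, not the kernel implementation):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal model of a tasklet: a run flag plus the work it performs. */
struct tasklet {
	atomic_bool running;   /* stands in for TASKLET_STATE_RUN */
	int runs;
};

static bool tasklet_trylock(struct tasklet *t)
{
	bool expected = false;
	return atomic_compare_exchange_strong(&t->running, &expected, true);
}

static void tasklet_unlock(struct tasklet *t)
{
	atomic_store(&t->running, false);
}

static void tasklet_unlock_wait(struct tasklet *t)
{
	while (atomic_load(&t->running))
		; /* spin until a remote runner drops the run flag */
}

/* The flush pattern from the patch: run the work inline if we can take
 * the lock, then wait out any concurrent runner instead of yielding. */
static void flush_submission(struct tasklet *t)
{
	if (tasklet_trylock(t)) {
		t->runs++;          /* process_csb()/work goes here */
		tasklet_unlock(t);
	}
	tasklet_unlock_wait(t);
}
```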

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 22/42] drm/i915/gt: ce->inflight updates are now serialised

2020-08-02 Thread Chris Wilson
Since schedule-in and schedule-out are now both always under the tasklet
bitlock, we can reduce the individual atomic operations to simple
instructions and worry less.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 44 +
 1 file changed, 19 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 0d072356cf21..e3dccdc53faf 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1333,7 +1333,7 @@ __execlists_schedule_in(struct i915_request *rq)
unsigned int tag = ffs(READ_ONCE(engine->context_tag));
 
GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG);
-   clear_bit(tag - 1, &engine->context_tag);
+   __clear_bit(tag - 1, &engine->context_tag);
ce->lrc.ccid = tag << (GEN11_SW_CTX_ID_SHIFT - 32);
 
BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID);
@@ -1360,13 +1360,10 @@ static inline void execlists_schedule_in(struct 
i915_request *rq, int idx)
GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine));
trace_i915_request_in(rq, idx);
 
-   old = READ_ONCE(ce->inflight);
-   do {
-   if (!old) {
-   WRITE_ONCE(ce->inflight, __execlists_schedule_in(rq));
-   break;
-   }
-   } while (!try_cmpxchg(&ce->inflight, &old, ptr_inc(old)));
+   old = ce->inflight;
+   if (!old)
+   old = __execlists_schedule_in(rq);
+   WRITE_ONCE(ce->inflight, ptr_inc(old));
 
GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
 }
@@ -1404,12 +1401,11 @@ static void kick_siblings(struct i915_request *rq, 
struct intel_context *ce)
resubmit_virtual_request(rq, ve);
 }
 
-static inline void
-__execlists_schedule_out(struct i915_request *rq,
-struct intel_engine_cs * const engine,
-unsigned int ccid)
+static inline void __execlists_schedule_out(struct i915_request *rq)
 {
struct intel_context * const ce = rq->context;
+   struct intel_engine_cs * const engine = rq->engine;
+   unsigned int ccid;
 
/*
 * NB process_csb() is not under the engine->active.lock and hence
@@ -1417,7 +1413,7 @@ __execlists_schedule_out(struct i915_request *rq,
 * refrain from doing non-trivial work here.
 */
 
-   CE_TRACE(ce, "schedule-out, ccid:%x\n", ccid);
+   CE_TRACE(ce, "schedule-out, ccid:%x\n", ce->lrc.ccid);
 
/*
 * If we have just completed this context, the engine may now be
@@ -1427,12 +1423,13 @@ __execlists_schedule_out(struct i915_request *rq,
i915_request_completed(rq))
intel_engine_add_retire(engine, ce->timeline);
 
+   ccid = ce->lrc.ccid;
ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
ccid &= GEN12_MAX_CONTEXT_HW_ID;
if (ccid < BITS_PER_LONG) {
GEM_BUG_ON(ccid == 0);
GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag));
-   set_bit(ccid - 1, &engine->context_tag);
+   __set_bit(ccid - 1, &engine->context_tag);
}
 
intel_context_update_runtime(ce);
@@ -1453,26 +1450,23 @@ __execlists_schedule_out(struct i915_request *rq,
 */
if (ce->engine != engine)
kick_siblings(rq, ce);
-
-   intel_context_put(ce);
 }
 
 static inline void
 execlists_schedule_out(struct i915_request *rq)
 {
struct intel_context * const ce = rq->context;
-   struct intel_engine_cs *cur, *old;
-   u32 ccid;
 
trace_i915_request_out(rq);
 
-   ccid = rq->context->lrc.ccid;
-   old = READ_ONCE(ce->inflight);
-   do
-   cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL;
-   while (!try_cmpxchg(&ce->inflight, &old, cur));
-   if (!cur)
-   __execlists_schedule_out(rq, old, ccid);
+   GEM_BUG_ON(!ce->inflight);
+   ce->inflight = ptr_dec(ce->inflight);
+   if (!intel_context_inflight_count(ce)) {
+   GEM_BUG_ON(ce->inflight != rq->engine);
+   __execlists_schedule_out(rq);
+   WRITE_ONCE(ce->inflight, NULL);
+   intel_context_put(ce);
+   }
 
i915_request_put(rq);
 }
-- 
2.20.1
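The ptr_inc()/ptr_dec()/ptr_unmask_bits() helpers used on ce->inflight pack a
small in-flight count into the low bits of an aligned engine pointer. A
simplified, self-contained sketch of that tagging scheme (a fixed 3-bit tag
here, whereas the kernel helpers take the bit-width as an argument):

```c
#include <assert.h>
#include <stdint.h>

/* Low bits of a sufficiently aligned pointer are free to carry a count.
 * TAG_BITS mirrors the "3" passed to ptr_unmask_bits() in the patch. */
#define TAG_BITS 3
#define TAG_MASK ((1ull << TAG_BITS) - 1)

static void *ptr_inc(void *p) { return (void *)((uintptr_t)p + 1); }
static void *ptr_dec(void *p) { return (void *)((uintptr_t)p - 1); }

/* Recover the pointer itself, dropping the packed count. */
static void *ptr_mask_bits(void *p)
{
	return (void *)((uintptr_t)p & ~(uintptr_t)TAG_MASK);
}

/* Recover the packed count, dropping the pointer. */
static unsigned int ptr_unmask_bits(void *p)
{
	return (uintptr_t)p & TAG_MASK;
}
```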

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 21/42] drm/i915/gt: Simplify virtual engine handling for execlists_hold()

2020-08-02 Thread Chris Wilson
Now that the tasklet completely controls scheduling of the requests, and
we postpone scheduling out the old requests, we can keep a hanging
virtual request bound to the engine on which it hung, and remove it from
the queue. On release, it will be returned to the same engine and remain
in its queue until it is scheduled; after which point it will become
eligible for transfer to a sibling. Instead, we could opt to resubmit the
request along the virtual engine on unhold, making it eligible for load
balancing immediately -- but that seems like a pointless optimisation
for a hanging context.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 29 -
 1 file changed, 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index d1a45d5e4225..0d072356cf21 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2780,35 +2780,6 @@ static bool execlists_hold(struct intel_engine_cs 
*engine,
goto unlock;
}
 
-   if (rq->engine != engine) { /* preempted virtual engine */
-   struct virtual_engine *ve = to_virtual_engine(rq->engine);
-
-   /*
-* intel_context_inflight() is only protected by virtue
-* of process_csb() being called only by the tasklet (or
-* directly from inside reset while the tasklet is suspended).
-* Assert that neither of those are allowed to run while we
-* poke at the request queues.
-*/
-   GEM_BUG_ON(!reset_in_progress(&engine->execlists));
-
-   /*
-* An unsubmitted request along a virtual engine will
-* remain on the active (this) engine until we are able
-* to process the context switch away (and so mark the
-* context as no longer in flight). That cannot have happened
-* yet, otherwise we would not be hanging!
-*/
-   spin_lock(&ve->base.active.lock);
-   GEM_BUG_ON(intel_context_inflight(rq->context) != engine);
-   GEM_BUG_ON(ve->request != rq);
-   ve->request = NULL;
-   spin_unlock(&ve->base.active.lock);
-   i915_request_put(rq);
-
-   rq->engine = engine;
-   }
-
/*
 * Transfer this request onto the hold queue to prevent it
 * being resumbitted to HW (and potentially completed) before we have
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 26/42] drm/i915: Lift waiter/signaler iterators

2020-08-02 Thread Chris Wilson
Lift the list iteration defines for traversing the signaler/waiter lists
into i915_scheduler.h for reuse.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 10 --
 drivers/gpu/drm/i915/i915_scheduler_types.h | 10 ++
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index ccf0b43c02cf..3c022e621a38 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1835,16 +1835,6 @@ static void virtual_xfer_context(struct virtual_engine 
*ve,
}
 }
 
-#define for_each_waiter(p__, rq__) \
-   list_for_each_entry_lockless(p__, \
-&(rq__)->sched.waiters_list, \
-wait_link)
-
-#define for_each_signaler(p__, rq__) \
-   list_for_each_entry_rcu(p__, \
-   &(rq__)->sched.signalers_list, \
-   signal_link)
-
 static void defer_request(struct i915_request *rq, struct list_head * const pl)
 {
LIST_HEAD(list);
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h 
b/drivers/gpu/drm/i915/i915_scheduler_types.h
index f72e6c397b08..343ed44d5ed4 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -81,4 +81,14 @@ struct i915_dependency {
 #define I915_DEPENDENCY_WEAK   BIT(2)
 };
 
+#define for_each_waiter(p__, rq__) \
+   list_for_each_entry_lockless(p__, \
+&(rq__)->sched.waiters_list, \
+wait_link)
+
+#define for_each_signaler(p__, rq__) \
+   list_for_each_entry_rcu(p__, \
+   &(rq__)->sched.signalers_list, \
+   signal_link)
+
 #endif /* _I915_SCHEDULER_TYPES_H_ */
-- 
2.20.1
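The lifted for_each_waiter/for_each_signaler iterators are ordinary
intrusive-list walkers built from container_of(). A self-contained sketch of
that shape (a minimal hand-rolled list_head; the kernel versions additionally
use the lockless/RCU entry helpers shown in the diff):

```c
#include <assert.h>
#include <stddef.h>

/* A hand-rolled intrusive doubly-linked list, standing in for the
 * kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Walk every entry embedding a list_head named 'member'. */
#define list_for_each_entry(pos, head, type, member)                    \
	for (pos = container_of((head)->next, type, member);            \
	     &pos->member != (head);                                    \
	     pos = container_of(pos->member.next, type, member))

struct waiter {
	int prio;
	struct list_head wait_link;
};

/* A for_each_waiter-style wrapper specialised for struct waiter. */
#define for_each_waiter(p, head) \
	list_for_each_entry(p, head, struct waiter, wait_link)

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}
```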

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 17/42] drm/i915/gt: Use virtual_engine during execlists_dequeue

2020-08-02 Thread Chris Wilson
Rather than going back and forth between the rb_node entry and the
virtual_engine type, store the ve local and reuse it. As the
container_of conversion from rb_node to virtual_engine requires a
variable offset, performing that conversion just once shaves off a bit
of code.

v2: Keep a single virtual engine lookup, for typical use.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 239 
 1 file changed, 105 insertions(+), 134 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index dcbd47e8a513..c1ba1eee4b29 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -454,9 +454,15 @@ static int queue_prio(const struct intel_engine_execlists 
*execlists)
return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
 }
 
+static int virtual_prio(const struct intel_engine_execlists *el)
+{
+   struct rb_node *rb = rb_first_cached(&el->virtual);
+
+   return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN;
+}
+
 static inline bool need_preempt(const struct intel_engine_cs *engine,
-   const struct i915_request *rq,
-   struct rb_node *rb)
+   const struct i915_request *rq)
 {
int last_prio;
 
@@ -493,25 +499,6 @@ static inline bool need_preempt(const struct 
intel_engine_cs *engine,
rq_prio(list_next_entry(rq, sched.link)) > last_prio)
return true;
 
-   if (rb) {
-   struct virtual_engine *ve =
-   rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-   bool preempt = false;
-
-   if (engine == ve->siblings[0]) { /* only preempt one sibling */
-   struct i915_request *next;
-
-   rcu_read_lock();
-   next = READ_ONCE(ve->request);
-   if (next)
-   preempt = rq_prio(next) > last_prio;
-   rcu_read_unlock();
-   }
-
-   if (preempt)
-   return preempt;
-   }
-
/*
 * If the inflight context did not trigger the preemption, then maybe
 * it was the set of queued requests? Pick the highest priority in
@@ -522,7 +509,8 @@ static inline bool need_preempt(const struct 
intel_engine_cs *engine,
 * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same
 * context, it's priority would not exceed ELSP[0] aka last_prio.
 */
-   return queue_prio(&engine->execlists) > last_prio;
+   return max(virtual_prio(&engine->execlists),
+  queue_prio(&engine->execlists)) > last_prio;
 }
 
 __maybe_unused static inline bool
@@ -1807,6 +1795,35 @@ static bool virtual_matches(const struct virtual_engine 
*ve,
return true;
 }
 
+static struct virtual_engine *
+first_virtual_engine(struct intel_engine_cs *engine)
+{
+   struct intel_engine_execlists *el = &engine->execlists;
+   struct rb_node *rb = rb_first_cached(&el->virtual);
+
+   while (rb) {
+   struct virtual_engine *ve =
+   rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+   struct i915_request *rq = READ_ONCE(ve->request);
+
+   /* lazily cleanup after another engine handled rq */
+   if (!rq) {
+   rb_erase_cached(rb, &el->virtual);
+   RB_CLEAR_NODE(rb);
+   rb = rb_first_cached(&el->virtual);
+   continue;
+   }
+
+   if (!virtual_matches(ve, rq, engine)) {
+   rb = rb_next(rb);
+   continue;
+   }
+   return ve;
+   }
+
+   return NULL;
+}
+
 static void virtual_xfer_context(struct virtual_engine *ve,
 struct intel_engine_cs *engine)
 {
@@ -1905,32 +1922,15 @@ static void defer_active(struct intel_engine_cs *engine)
 
 static bool
 need_timeslice(const struct intel_engine_cs *engine,
-  const struct i915_request *rq,
-  const struct rb_node *rb)
+  const struct i915_request *rq)
 {
int hint;
 
if (!intel_engine_has_timeslices(engine))
return false;
 
-   hint = engine->execlists.queue_priority_hint;
-
-   if (rb) {
-   const struct virtual_engine *ve =
-   rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-   const struct intel_engine_cs *inflight =
-   intel_context_inflight(&ve->context);
-
-   if (!inflight || inflight == engine) {
-   struct i915_request *next;
-
-   rcu_read_lock();
-   next = READ_ONCE(ve->request);
-   if (next)
-   hint = max(hint, rq_prio(next));
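virtual_prio() returning INT_MIN for an empty virtual tree is what lets
need_preempt() fold both priority sources into a single max() comparison. A toy
illustration of that sentinel trick (plain ints standing in for the rbtree and
priolist queues; virtual_prio/queue_prio here are simplified stand-ins, not the
driver functions):

```c
#include <assert.h>
#include <limits.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-ins: each returns the best pending priority, or
 * INT_MIN when its queue is empty, so it can never win the max() below. */
static int virtual_prio(const int *ve_prio) { return ve_prio ? *ve_prio : INT_MIN; }
static int queue_prio(const int *q_prio)    { return q_prio ? *q_prio : INT_MIN; }

/* The preemption check folds both sources into one comparison. */
static int need_preempt(const int *ve, const int *q, int last_prio)
{
	return MAX(virtual_prio(ve), queue_prio(q)) > last_prio;
}
```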

[Intel-gfx] [PATCH 39/42] drm/i915/gt: Specify a deadline for the heartbeat

2020-08-02 Thread Chris Wilson
As we know when we expect the heartbeat to be checked for completion,
pass this information along as its deadline. We still do not complain if
the deadline is missed, at least until we have tried a few times, but it
will allow for quicker hang detection on systems where deadlines are
adhered to.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index b7b2f332bc3c..2e316759d1a1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -67,6 +67,16 @@ static void heartbeat_commit(struct i915_request *rq,
local_bh_enable();
 }
 
+static void set_heartbeat_deadline(struct intel_engine_cs *engine,
+  struct i915_request *rq)
+{
+   unsigned long interval;
+
+   interval = READ_ONCE(engine->props.heartbeat_interval_ms);
+   if (interval)
+   i915_request_set_deadline(rq, ktime_get() + (interval << 20));
+}
+
 static void show_heartbeat(const struct i915_request *rq,
   struct intel_engine_cs *engine)
 {
@@ -132,6 +142,8 @@ static void heartbeat(struct work_struct *wrk)
 
local_bh_disable();
i915_request_set_priority(rq, attr.priority);
+   if (attr.priority == I915_PRIORITY_BARRIER)
+   i915_request_set_deadline(rq, 0);
local_bh_enable();
} else {
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
@@ -166,6 +178,7 @@ static void heartbeat(struct work_struct *wrk)
if (engine->i915->params.enable_hangcheck)
engine->heartbeat.systole = i915_request_get(rq);
 
+   set_heartbeat_deadline(engine, rq);
+   heartbeat_commit(rq, &attr);
 
 unlock:
-- 
2.20.1
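The deadline computed as ktime_get() + (interval << 20) uses a shift as a cheap
millisecond-to-nanosecond conversion: 1 << 20 = 1048576 ≈ 10^6, overshooting by
about 4.9%, which errs on the lenient side for a soft heartbeat deadline. A
quick check of that approximation:

```c
#include <assert.h>

/* 1 ms is 1,000,000 ns; (ms << 20) multiplies by 1,048,576 instead,
 * a ~4.9% overestimate that is fine for a soft deadline. */
static unsigned long long ms_to_ns_shift(unsigned long long ms)
{
	return ms << 20;
}

static unsigned long long ms_to_ns_exact(unsigned long long ms)
{
	return ms * 1000000ull;
}
```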

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 19/42] drm/i915/gt: Defer schedule_out until after the next dequeue

2020-08-02 Thread Chris Wilson
Inside schedule_out, we do extra work upon idling the context, such as
updating the runtime, kicking off retires, kicking virtual engines.
However, if we are processing a series of single requests per context,
we may find ourselves scheduling out the context, only to
immediately schedule it back in during dequeue. This is just extra work
that we can avoid if we keep the context marked as inflight across the
dequeue. This becomes more significant later on for minimising virtual
engine misses.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |   4 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 127 --
 2 files changed, 86 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index a78c1c225ce3..9a28339b6d74 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -45,8 +45,8 @@ struct intel_context {
 
struct intel_engine_cs *engine;
struct intel_engine_cs *inflight;
-#define intel_context_inflight(ce) ptr_mask_bits(READ_ONCE((ce)->inflight), 2)
-#define intel_context_inflight_count(ce) ptr_unmask_bits(READ_ONCE((ce)->inflight), 2)
+#define intel_context_inflight(ce) ptr_mask_bits(READ_ONCE((ce)->inflight), 3)
+#define intel_context_inflight_count(ce) ptr_unmask_bits(READ_ONCE((ce)->inflight), 3)
 
struct i915_address_space *vm;
struct i915_gem_context __rcu *gem_context;
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 19cd5f4f74cf..a9ae0638e117 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1363,6 +1363,8 @@ __execlists_schedule_in(struct i915_request *rq)
execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
intel_engine_context_in(engine);
 
+   CE_TRACE(ce, "schedule-in, ccid:%x\n", ce->lrc.ccid);
+
return engine;
 }
 
@@ -1406,6 +1408,8 @@ __execlists_schedule_out(struct i915_request *rq,
 * refrain from doing non-trivial work here.
 */
 
+   CE_TRACE(ce, "schedule-out, ccid:%x\n", ccid);
+
/*
 * If we have just completed this context, the engine may now be
 * idle and we want to re-enter powersaving.
@@ -2053,19 +2057,6 @@ static void set_preempt_timeout(struct intel_engine_cs 
*engine,
 active_preempt_timeout(engine, rq));
 }
 
-static inline void clear_ports(struct i915_request **ports, int count)
-{
-   memset_p((void **)ports, NULL, count);
-}
-
-static inline void
-copy_ports(struct i915_request **dst, struct i915_request **src, int count)
-{
-   /* A memcpy_p() would be very useful here! */
-   while (count--)
-   WRITE_ONCE(*dst++, *src++); /* avoid write tearing */
-}
-
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -2400,26 +2391,44 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
start_timeslice(engine, execlists->queue_priority_hint);
 skip_submit:
ring_set_paused(engine, 0);
+   while (port-- != execlists->pending)
+   i915_request_put(*port);
*execlists->pending = NULL;
}
 }
 
-static void
-cancel_port_requests(struct intel_engine_execlists * const execlists)
+static inline void clear_ports(struct i915_request **ports, int count)
+{
+   memset_p((void **)ports, NULL, count);
+}
+
+static inline void
+copy_ports(struct i915_request **dst, struct i915_request **src, int count)
+{
+   /* A memcpy_p() would be very useful here! */
+   while (count--)
+   WRITE_ONCE(*dst++, *src++); /* avoid write tearing */
+}
+
+static struct i915_request **
+cancel_port_requests(struct intel_engine_execlists * const execlists,
+struct i915_request **inactive)
 {
struct i915_request * const *port;
 
for (port = execlists->pending; *port; port++)
-   execlists_schedule_out(*port);
+   *inactive++ = *port;
clear_ports(execlists->pending, ARRAY_SIZE(execlists->pending));
 
/* Mark the end of active before we overwrite *active */
for (port = xchg(&execlists->active, execlists->pending); *port; port++)
-   execlists_schedule_out(*port);
+   *inactive++ = *port;
clear_ports(execlists->inflight, ARRAY_SIZE(execlists->inflight));
 
smp_wmb(); /* complete the seqlock for execlists_active() */
WRITE_ONCE(execlists->active, execlists->inflight);
+
+   return inactive;
 }
 
 static inline void
@@ -2491,7 +2500,8 @@ gen8_csb_parse(const struct intel_engine_execlists 
*execlists, const u32 *csb)
return *csb & (GEN8_CTX_STATUS_IDLE_ACTIVE | GEN8_CTX_STATUS_PREEMPTED);
 }
 
-static void process_csb(struct intel_engine_cs *engine)

[Intel-gfx] [PATCH 16/42] drm/i915/gt: Replace direct submit with direct call to tasklet

2020-08-02 Thread Chris Wilson
Rather than having special case code for opportunistically calling
process_csb() and performing a direct submit while holding the engine
spinlock for submitting the request, simply call the tasklet directly.
This allows us to retain the direct submission path, including the CS
draining to allow fast/immediate submissions, without requiring any
duplicated code paths.

The trickiest part here is to ensure that paired operations (such as
schedule_in/schedule_out) remain under consistent locking domains,
e.g. when pulled outside of the engine->active.lock

v2: Use bh kicking, see commit 3c53776e29f8 ("Mark HI and TASKLET
softirq synchronous").
v3: Update engine-reset to be tasklet aware

Signed-off-by: Chris Wilson 
Reviewed-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   4 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   2 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  35 +++--
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   3 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   3 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 126 ++
 drivers/gpu/drm/i915/gt/intel_reset.c |  60 ++---
 drivers/gpu/drm/i915/gt/intel_reset.h |   2 +
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   7 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  27 ++--
 drivers/gpu/drm/i915/gt/selftest_reset.c  |   8 +-
 drivers/gpu/drm/i915/i915_request.c   |   2 +
 drivers/gpu/drm/i915/selftests/i915_request.c |   6 +-
 13 files changed, 144 insertions(+), 141 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 56700ddbed47..04d291a8d31d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -398,12 +398,14 @@ static bool __reset_engine(struct intel_engine_cs *engine)
if (!intel_has_reset_engine(gt))
return false;
 
+   local_bh_disable();
if (!test_and_set_bit(I915_RESET_ENGINE + engine->id,
  &gt->reset.flags)) {
-   success = intel_engine_reset(engine, NULL) == 0;
+   success = __intel_engine_reset_bh(engine, NULL) == 0;
clear_and_wake_up_bit(I915_RESET_ENGINE + engine->id,
  &gt->reset.flags);
}
+   local_bh_enable();
 
return success;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 89c7b33dd99f..9f676df68fae 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2453,7 +2453,9 @@ static void eb_request_add(struct i915_execbuffer *eb)
__i915_request_skip(rq);
}
 
+   local_bh_disable();
+   __i915_request_queue(rq, &attr);
+   local_bh_enable();
 }
 
 static int
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index ea4ba2afe9f9..26f2781e4a51 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1000,32 +1000,39 @@ static unsigned long stop_timeout(const struct 
intel_engine_cs *engine)
return READ_ONCE(engine->props.stop_timeout_ms);
 }
 
-int intel_engine_stop_cs(struct intel_engine_cs *engine)
+static int __intel_engine_stop_cs(struct intel_engine_cs *engine,
+ int fast_timeout_us,
+ int slow_timeout_ms)
 {
struct intel_uncore *uncore = engine->uncore;
-   const u32 base = engine->mmio_base;
-   const i915_reg_t mode = RING_MI_MODE(base);
+   const i915_reg_t mode = RING_MI_MODE(engine->mmio_base);
int err;
 
+   intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING));
+   err = __intel_wait_for_register_fw(engine->uncore, mode,
+  MODE_IDLE, MODE_IDLE,
+  fast_timeout_us,
+  slow_timeout_ms,
+  NULL);
+
+   /* A final mmio read to let GPU writes be hopefully flushed to memory */
+   intel_uncore_posting_read_fw(uncore, mode);
+   return err;
+}
+
+int intel_engine_stop_cs(struct intel_engine_cs *engine)
+{
+   int err = 0;
+
if (INTEL_GEN(engine->i915) < 3)
return -ENODEV;
 
ENGINE_TRACE(engine, "\n");
-
-   intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING));
-
-   err = 0;
-   if (__intel_wait_for_register_fw(uncore,
-mode, MODE_IDLE, MODE_IDLE,
-1000, stop_timeout(engine),
-NULL)) {
+   if (__intel_engine_stop_cs(engine, 1000, stop_timeout(engine))) {
ENGINE_TRACE(engine, "timed out on STOP_RING -> IDLE\n");

[Intel-gfx] [PATCH 30/42] drm/i915: Prune empty priolists

2020-08-02 Thread Chris Wilson
A side-effect of our priority inheritance scheme is that we promote
requests from one priority to the next, moving them from one list to the
next. This can often leave the old priority list empty, but still
resident in the rbtree, which we then have to traverse during HW
submission. rb_next() is a relatively expensive operation, so if we can
push that to the update where we can do piecemeal pruning and reuse the
nodes, this reduces the latency for HW submission.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_scheduler.c | 41 +--
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index bfbbd94dfcbc..4b465a571a83 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -64,9 +64,10 @@ struct list_head *
 i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio)
 {
struct intel_engine_execlists * const execlists = &engine->execlists;
-   struct i915_priolist *p;
+   struct list_head *free = NULL;
struct rb_node **parent, *rb;
-   bool first = true;
+   struct i915_priolist *p;
+   bool first;
 
lockdep_assert_held(&engine->active.lock);
assert_priolists(execlists);
@@ -77,22 +78,40 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, 
int prio)
 find_priolist:
/* most positive priority is scheduled first, equal priorities fifo */
rb = NULL;
+   first = true;
parent = >queue.rb_root.rb_node;
while (*parent) {
rb = *parent;
p = to_priolist(rb);
-   if (prio > p->priority) {
-   parent = &rb->rb_left;
-   } else if (prio < p->priority) {
-   parent = &rb->rb_right;
-   first = false;
-   } else {
-   return &p->requests;
+
+   if (prio == p->priority)
+   goto out;
+
+   /*
+* Prune an empty priolist, we can reuse it if we need to
+* allocate. After removing this node and rotating the subtrees
+* beneath its parent, we need to restart our descent from the
+* parent.
+*/
+   if (list_empty(&p->requests)) {
+   rb = rb_parent(&p->node);
+   parent = rb ? &rb : &execlists->queue.rb_root.rb_node;
+   rb_erase_cached(&p->node, &execlists->queue);
+   free = i915_priolist_free_defer(p, free);
+   continue;
}
+
+   if (prio < p->priority)
+   parent = &rb->rb_left;
+   else
+   parent = &rb->rb_right, first = false;
}
 
if (prio == I915_PRIORITY_NORMAL) {
p = &execlists->default_priolist;
+   } else if (free) {
+   p = container_of(free, typeof(*p), requests);
+   free = p->requests.next;
} else {
p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC);
/* Convert an allocation failure to a priority bump */
@@ -117,7 +136,11 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, 
int prio)
 
rb_link_node(&p->node, rb, parent);
rb_insert_color_cached(&p->node, &execlists->queue, first);
+   GEM_BUG_ON(rb_first_cached(&execlists->queue) !=
+  rb_first(&execlists->queue.rb_root));
 
+out:
+   i915_priolist_free_many(free);
return &p->requests;
 }
 
-- 
2.20.1
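i915_priolist_free_defer() threads pruned nodes onto a local chain that the
allocation path then pops from before falling back to the slab. A simplified
sketch of that defer-and-reuse chain (an explicit next pointer instead of
threading through the empty requests list head; free_defer/alloc_from are
hypothetical names for illustration):

```c
#include <assert.h>
#include <stddef.h>

struct priolist {
	int prio;
	struct priolist *reuse_next; /* stands in for threading through
	                              * the now-empty requests list head */
};

/* Defer freeing: push onto a local chain instead of freeing immediately. */
static struct priolist *free_defer(struct priolist *p, struct priolist *chain)
{
	p->reuse_next = chain;
	return p;
}

/* Allocation path: reuse a deferred node before allocating afresh. */
static struct priolist *alloc_from(struct priolist **chain)
{
	struct priolist *p = *chain;

	if (p)
		*chain = p->reuse_next;
	return p; /* NULL means the caller must really allocate */
}
```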

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 12/42] drm/i915: Reduce test_and_set_bit to set_bit in i915_request_submit()

2020-08-02 Thread Chris Wilson
Avoid the full-blown memory barrier of test_and_set_bit() by noting the
completed request and removing it from the lists.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_request.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 67267b0d5b06..1cd102591b36 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -537,8 +537,10 @@ bool __i915_request_submit(struct i915_request *request)
 * dropped upon retiring. (Otherwise if resubmit a *retired*
 * request, this would be a horrible use-after-free.)
 */
-   if (i915_request_completed(request))
-   goto xfer;
+   if (i915_request_completed(request)) {
+   list_del_init(&request->sched.link);
+   goto active;
+   }
 
if (unlikely(intel_context_is_banned(request->context)))
i915_request_set_error_once(request, -EIO);
@@ -572,11 +574,11 @@ bool __i915_request_submit(struct i915_request *request)
engine->serial++;
result = true;
 
-xfer:
-   if (!test_and_set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags)) {
-   list_move_tail(&request->sched.link, &engine->active.requests);
-   clear_bit(I915_FENCE_FLAG_PQUEUE, &request->fence.flags);
-   }
+   GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
+   list_move_tail(&request->sched.link, &engine->active.requests);
+active:
+   clear_bit(I915_FENCE_FLAG_PQUEUE, &request->fence.flags);
+   set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
 
/*
 * XXX Rollback bonded-execution on __i915_request_unsubmit()?
-- 
2.20.1
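The point of the patch: test_and_set_bit() is a full atomic read-modify-write,
but once external serialisation guarantees the bit's prior state, a plain set
with no value read back suffices. Sketched with C11 atomics (bit flags packed
in one word; this models the idea, not the kernel bitops):

```c
#include <assert.h>
#include <stdatomic.h>

#define ACTIVE_BIT (1u << 0)
#define PQUEUE_BIT (1u << 1)

/* test_and_set_bit() analogue: full RMW, returns the old bit value. */
static unsigned int submit_rmw(atomic_uint *flags)
{
	return atomic_fetch_or(flags, ACTIVE_BIT) & ACTIVE_BIT;
}

/* set_bit()-with-known-state analogue: the caller's locking guarantees
 * ACTIVE is clear, so just set it; relaxed order, nothing read back. */
static void submit_plain(atomic_uint *flags)
{
	atomic_fetch_or_explicit(flags, ACTIVE_BIT, memory_order_relaxed);
}
```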

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 31/42] drm/i915: Replace engine->schedule() with a known request operation

2020-08-02 Thread Chris Wilson
Looking to the future, we want to set the scheduling attributes
explicitly and so replace the generic engine->schedule() with the more
direct i915_request_set_priority()

What it loses in removing the 'schedule' name from the function, it
gains in having an explicit entry point with a stated goal.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/display/intel_display.c  |  9 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_wait.c  | 27 +--
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  3 --
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  4 +--
 drivers/gpu/drm/i915/gt/intel_engine_types.h  | 29 
 drivers/gpu/drm/i915/gt/intel_engine_user.c   |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  3 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  | 11 +++
 drivers/gpu/drm/i915/gt/selftest_lrc.c| 33 +--
 drivers/gpu/drm/i915/i915_request.c   | 11 ---
 drivers/gpu/drm/i915/i915_scheduler.c | 15 +
 drivers/gpu/drm/i915/i915_scheduler.h |  3 +-
 14 files changed, 58 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 7212204a4759..e713b673be4b 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -15917,13 +15917,6 @@ static void intel_plane_unpin_fb(struct 
intel_plane_state *old_plane_state)
intel_unpin_fb_vma(vma, old_plane_state->flags);
 }
 
-static void fb_obj_bump_render_priority(struct drm_i915_gem_object *obj)
-{
-   struct i915_sched_attr attr = { .priority = I915_PRIORITY_DISPLAY };
-
-   i915_gem_object_wait_priority(obj, 0, &attr);
-}
-
 /**
  * intel_prepare_plane_fb - Prepare fb for usage on plane
  * @_plane: drm plane to prepare for
@@ -16000,7 +15993,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
if (ret)
return ret;
 
-   fb_obj_bump_render_priority(obj);
+   i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
i915_gem_object_flush_frontbuffer(obj, ORIGIN_DIRTYFB);
 
if (!new_plane_state->uapi.fence) { /* implicit fencing */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 9f676df68fae..ab57b527c893 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2099,7 +2099,7 @@ static int __eb_pin_reloc_engine(struct i915_execbuffer 
*eb)
return PTR_ERR(ce);
 
/* Reuse eb->context->timeline with scheduler! */
-   if (engine->schedule)
+   if (intel_engine_has_scheduler(engine))
ce->timeline = intel_timeline_get(eb->context->timeline);
 
i915_vm_put(ce->vm);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index e5b9276d254c..631237c859e9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -474,7 +474,7 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj,
 long timeout);
 int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
  unsigned int flags,
- const struct i915_sched_attr *attr);
+ int prio);
 
 void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 enum fb_op_origin origin);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c 
b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index 8af55cd3e690..cefbbb3d9b52 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -93,28 +93,17 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
return timeout;
 }
 
-static void __fence_set_priority(struct dma_fence *fence,
-const struct i915_sched_attr *attr)
+static void __fence_set_priority(struct dma_fence *fence, int prio)
 {
-   struct i915_request *rq;
-   struct intel_engine_cs *engine;
-
if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence))
return;
 
-   rq = to_request(fence);
-   engine = rq->engine;
-
local_bh_disable();
-   rcu_read_lock(); /* RCU serialisation for set-wedged protection */
-   if (engine->schedule)
-   engine->schedule(rq, attr);
-   rcu_read_unlock();
+   i915_request_set_priority(to_request(fence), prio);
local_bh_enable(); /* kick the tasklets if queues were reprioritised */
 }
 
-static void fence_set_priority(struct dma_fence *fence,
-  const struct i915_sched_attr *attr)
+static void fence_set_priority(struct dma_fence *fence, int prio)
 {
 

[Intel-gfx] [PATCH 25/42] drm/i915/gt: Convert stats.active to plain unsigned int

2020-08-02 Thread Chris Wilson
As context-in/out is now always serialised, we do not have to worry
about concurrent enabling/disabling of the busy-stats and can reduce the
atomic_t active to a plain unsigned int, and the seqlock to a seqcount.
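The essence of the conversion is that a seqcount only protects readers against a concurrent writer; the writers must already be serialised among themselves, which the tasklet bitlock now guarantees. A minimal user-space sketch of that pattern (illustrative names only, not the kernel seqcount API):

```c
/* Sketch of the seqcount pattern: writers are externally serialised, so a
 * bare sequence counter suffices for readers to detect a torn update.
 * Hypothetical names; not the kernel's seqcount_t implementation. */
#include <assert.h>

struct busy_stats {
	unsigned int seq;	/* even: stable, odd: write in progress */
	unsigned int active;	/* plain int: writer already serialised */
	unsigned long long total;
};

static void stats_write_begin(struct busy_stats *s) { s->seq++; } /* odd */
static void stats_write_end(struct busy_stats *s)   { s->seq++; } /* even */

static unsigned long long stats_read_total(const struct busy_stats *s)
{
	unsigned int seq;
	unsigned long long total;

	/* Retry until we observe a stable (even, unchanged) sequence. */
	do {
		seq = s->seq;
		total = s->total;
	} while (s->seq != seq || (seq & 1));

	return total;
}
```

This mirrors the read_seqcount_begin()/read_seqcount_retry() loop in intel_engine_get_busy_time() above; the point of the patch is that once writes are serialised, the heavier seqlock (which also embeds a spinlock for writers) is unnecessary.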

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c|  8 ++--
 drivers/gpu/drm/i915/gt/intel_engine_stats.h | 45 
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  4 +-
 3 files changed, 34 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index b2f520bc1497..8de3105632ef 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -339,7 +339,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
engine->schedule = NULL;
 
ewma__engine_latency_init(&engine->latency);
-   seqlock_init(&engine->stats.lock);
+   seqcount_init(&engine->stats.lock);

ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
@@ -1714,7 +1714,7 @@ static ktime_t __intel_engine_get_busy_time(struct 
intel_engine_cs *engine,
 * add it to the total.
 */
*now = ktime_get();
-   if (atomic_read(&engine->stats.active))
+   if (READ_ONCE(engine->stats.active))
total = ktime_add(total, ktime_sub(*now, engine->stats.start));
 
return total;
@@ -1733,9 +1733,9 @@ ktime_t intel_engine_get_busy_time(struct intel_engine_cs 
*engine, ktime_t *now)
ktime_t total;
 
do {
-   seq = read_seqbegin(&engine->stats.lock);
+   seq = read_seqcount_begin(&engine->stats.lock);
total = __intel_engine_get_busy_time(engine, now);
-   } while (read_seqretry(&engine->stats.lock, seq));
+   } while (read_seqcount_retry(&engine->stats.lock, seq));
 
return total;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_stats.h 
b/drivers/gpu/drm/i915/gt/intel_engine_stats.h
index 58491eae3482..24fbdd94351a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_stats.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_stats.h
@@ -17,33 +17,44 @@ static inline void intel_engine_context_in(struct 
intel_engine_cs *engine)
 {
unsigned long flags;
 
-   if (atomic_add_unless(&engine->stats.active, 1, 0))
+   if (engine->stats.active) {
+   engine->stats.active++;
return;
-
-   write_seqlock_irqsave(&engine->stats.lock, flags);
-   if (!atomic_add_unless(&engine->stats.active, 1, 0)) {
-   engine->stats.start = ktime_get();
-   atomic_inc(&engine->stats.active);
}
-   write_sequnlock_irqrestore(&engine->stats.lock, flags);
+
+   /* The writer is serialised; but the pmu reader may be from hardirq */
+   local_irq_save(flags);
+   write_seqcount_begin(&engine->stats.lock);
+
+   engine->stats.start = ktime_get();
+   engine->stats.active++;
+
+   write_seqcount_end(&engine->stats.lock);
+   local_irq_restore(flags);
+
+   GEM_BUG_ON(!engine->stats.active);
 }
 
 static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 {
unsigned long flags;
 
-   GEM_BUG_ON(!atomic_read(&engine->stats.active));
-
-   if (atomic_add_unless(&engine->stats.active, -1, 1))
+   GEM_BUG_ON(!engine->stats.active);
+   if (engine->stats.active > 1) {
+   engine->stats.active--;
return;
-
-   write_seqlock_irqsave(&engine->stats.lock, flags);
-   if (atomic_dec_and_test(&engine->stats.active)) {
-   engine->stats.total =
-   ktime_add(engine->stats.total,
- ktime_sub(ktime_get(), engine->stats.start));
}
-   write_sequnlock_irqrestore(&engine->stats.lock, flags);
+
+   local_irq_save(flags);
+   write_seqcount_begin(&engine->stats.lock);
+
+   engine->stats.active--;
+   engine->stats.total =
+   ktime_add(engine->stats.total,
+ ktime_sub(ktime_get(), engine->stats.start));
+
+   write_seqcount_end(&engine->stats.lock);
+   local_irq_restore(flags);
 }
 
 #endif /* __INTEL_ENGINE_STATS_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 2361206fd026..7ee9e32d149b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -525,12 +525,12 @@ struct intel_engine_cs {
/**
 * @active: Number of contexts currently scheduled in.
 */
-   atomic_t active;
+   unsigned int active;
 
/**
 * @lock: Lock protecting the below fields.
 */
-   seqlock_t lock;
+   seqcount_t lock;
 
/**
 * @total: Total time this engine was busy.
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 07/42] drm/i915/gt: Split the breadcrumb spinlock between global and contexts

2020-08-02 Thread Chris Wilson
As we funnel more and more contexts into the breadcrumbs on an engine,
the hold time of b->irq_lock grows. As we may then contend with the
b->irq_lock during request submission, this increases the burden upon
the engine->active.lock and so directly impacts both our execution
latency and client latency. If we split the b->irq_lock by introducing a
per-context spinlock to manage the signalers within a context, we then
only need the b->irq_lock for enabling/disabling the interrupt and can
avoid taking the lock for walking the list of contexts within the signal
worker. Even with the current setup, this greatly reduces the number of
times we have to take and fight for b->irq_lock.
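The structure of the split can be sketched in user space: a coarse lock guards only irq arm/disarm, while each context's signal list has its own lock, so the worker walks contexts without ever holding the global lock. All names below are hypothetical, and plain ints stand in for spinlocks in this single-threaded illustration:

```c
/* Illustrative sketch of splitting one coarse lock into per-context locks.
 * trylock() mimics spin_trylock(): a contended context is simply skipped
 * and revisited on the next pass, as signal_irq_work() does above. */
#include <assert.h>
#include <stddef.h>

struct ctx {
	int signal_lock;	/* per-context: guards nsignals */
	int nsignals;
	struct ctx *next;
};

static int trylock(int *lock)
{
	if (*lock)
		return 0;	/* contended: skip for now */
	*lock = 1;
	return 1;
}

static void unlock(int *lock) { *lock = 0; }

/* Drain each context's pending signals under its own lock only;
 * no global lock is held across the walk. */
static int drain_signals(struct ctx *head)
{
	int drained = 0;
	struct ctx *ce;

	for (ce = head; ce; ce = ce->next) {
		if (!trylock(&ce->signal_lock))
			continue;
		drained += ce->nsignals;
		ce->nsignals = 0;
		unlock(&ce->signal_lock);
	}
	return drained;
}
```

In the patch itself the list of contexts is additionally walked under RCU (list_for_each_entry_rcu), which is what lets the walk proceed without b->irq_lock at all.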

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 154 ++
 drivers/gpu/drm/i915/gt/intel_context.c   |   1 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   1 +
 3 files changed, 86 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 9e7ac612fabb..98d323354082 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -103,17 +103,19 @@ static void __intel_breadcrumbs_disarm_irq(struct 
intel_breadcrumbs *b)
 static void add_signaling_context(struct intel_breadcrumbs *b,
  struct intel_context *ce)
 {
-   intel_context_get(ce);
-   list_add_tail(&ce->signal_link, &b->signalers);
-   if (list_is_first(&ce->signal_link, &b->signalers))
+   lockdep_assert_held(&b->irq_lock);
+
+   list_add_rcu(&ce->signal_link, &b->signalers);
+   if (list_is_last(&ce->signal_link, &b->signalers))
__intel_breadcrumbs_arm_irq(b);
 }
 
 static void remove_signaling_context(struct intel_breadcrumbs *b,
 struct intel_context *ce)
 {
-   list_del(&ce->signal_link);
-   intel_context_put(ce);
+   spin_lock(&b->irq_lock);
+   list_del_rcu(&ce->signal_link);
+   spin_unlock(&b->irq_lock);
 }
 
 static inline bool __request_completed(const struct i915_request *rq)
@@ -189,15 +191,12 @@ static void signal_irq_work(struct irq_work *work)
struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work);
const ktime_t timestamp = ktime_get();
struct llist_node *signal, *sn;
-   struct intel_context *ce, *cn;
-   struct list_head *pos, *next;
+   struct intel_context *ce;
 
signal = NULL;
if (unlikely(!llist_empty(&b->signaled_requests)))
signal = llist_del_all(&b->signaled_requests);

-   spin_lock(&b->irq_lock);
-
/*
 * Keep the irq armed until the interrupt after all listeners are gone.
 *
@@ -221,11 +220,24 @@ static void signal_irq_work(struct irq_work *work)
 * interrupt draw less ire from other users of the system and tools
 * like powertop.
 */
-   if (!signal && list_empty(&b->signalers))
-   __intel_breadcrumbs_disarm_irq(b);
+   if (!signal && READ_ONCE(b->irq_armed) && list_empty(&b->signalers)) {
+   if (spin_trylock(&b->irq_lock)) {
+   if (list_empty(&b->signalers))
+   __intel_breadcrumbs_disarm_irq(b);
+   spin_unlock(&b->irq_lock);
+   }
+   }
+
+   rcu_read_lock();
+   list_for_each_entry_rcu(ce, &b->signalers, signal_link) {
+   struct list_head *pos, *next;
+   bool release = false;

-   list_for_each_entry_safe(ce, cn, &b->signalers, signal_link) {
-   GEM_BUG_ON(list_empty(&ce->signals));
+   if (!spin_trylock(&ce->signal_lock))
+   continue;
+
+   if (list_empty(&ce->signals))
+   goto unlock;
 
list_for_each_safe(pos, next, &ce->signals) {
struct i915_request *rq =
@@ -258,11 +270,16 @@ static void signal_irq_work(struct irq_work *work)
if (&ce->signals == pos) { /* now empty */
add_retire(b, ce->timeline);
remove_signaling_context(b, ce);
+   release = true;
}
}
-   }
 
-   spin_unlock(&b->irq_lock);
+unlock:
+   spin_unlock(&ce->signal_lock);
+   if (release)
+   intel_context_put(ce);
+   }
+   rcu_read_unlock();
 
llist_for_each_safe(signal, sn, signal) {
struct i915_request *rq =
@@ -336,9 +353,9 @@ void intel_breadcrumbs_free(struct intel_breadcrumbs *b)
kfree(b);
 }
 
-static void insert_breadcrumb(struct i915_request *rq,
- struct intel_breadcrumbs *b)
+static void insert_breadcrumb(struct i915_request *rq)
 {
+   struct intel_breadcrumbs *b = READ_ONCE(rq->engine)->breadcrumbs;
struct intel_context *ce = rq->context;
struct list_head 

[Intel-gfx] [PATCH 34/42] drm/i915: Restructure priority inheritance

2020-08-02 Thread Chris Wilson
In anticipation of wanting to be able to call priority inheritance (pi)
from underneath an engine's active.lock, rework the priority inheritance
to primarily work along an engine's priority queue, delegating any other
engine that the chain may traverse to a worker. This reduces the scope
of the global spinlock from governing the entire priority-inheritance
depth-first search to a small lock around a single list.
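The deferral scheme can be sketched outside the kernel: a bump destined for a request on another engine is appended to a small list and applied later by a worker, rather than extending the lock-held walk across engines. This is a simplified, hypothetical model (fixed-size array instead of a locked list_head, no irq_work):

```c
/* Sketch of deferring cross-engine priority bumps to a worker: the caller
 * queues (request, priority) pairs; the worker drains them and raises each
 * request's priority if needed, never lowering it. Illustrative only. */
#include <assert.h>

#define MAX_IPI 8

struct request { int prio; };

static struct { struct request *rq; int prio; } ipi_pending[MAX_IPI];
static int ipi_count;

/* Queue a deferred bump (in the patch this runs under the small ipi_lock). */
static void ipi_queue(struct request *rq, int prio)
{
	ipi_pending[ipi_count].rq = rq;
	ipi_pending[ipi_count].prio = prio;
	ipi_count++;
}

/* Worker: drain the list, applying only upward priority changes. */
static void ipi_drain(void)
{
	int i;

	for (i = 0; i < ipi_count; i++) {
		struct request *rq = ipi_pending[i].rq;

		if (ipi_pending[i].prio > rq->prio)
			rq->prio = ipi_pending[i].prio;
	}
	ipi_count = 0;
}
```

The kernel version additionally re-checks i915_request_completed() before applying each bump, since a queued request may finish before the worker runs.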

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_scheduler.c   | 299 
 drivers/gpu/drm/i915/i915_scheduler_types.h |   6 +-
 2 files changed, 178 insertions(+), 127 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index 247d2e2bb3c8..45e000c257db 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -17,7 +17,70 @@ static struct i915_global_scheduler {
struct kmem_cache *slab_priorities;
 } global;
 
-static DEFINE_SPINLOCK(schedule_lock);
+static DEFINE_SPINLOCK(ipi_lock);
+static LIST_HEAD(ipi_list);
+
+static inline int rq_prio(const struct i915_request *rq)
+{
+   return READ_ONCE(rq->sched.attr.priority);
+}
+
+static void ipi_schedule(struct irq_work *wrk)
+{
+   rcu_read_lock();
+   do {
+   struct i915_dependency *p;
+   struct i915_request *rq;
+   int prio;
+
+   spin_lock(&ipi_lock);
+   p = list_first_entry_or_null(&ipi_list, typeof(*p), ipi_link);
+   if (p) {
+   rq = container_of(p->signaler, typeof(*rq), sched);
+   list_del_init(&p->ipi_link);
+
+   prio = p->ipi_priority;
+   p->ipi_priority = I915_PRIORITY_INVALID;
+   }
+   spin_unlock(&ipi_lock);
+   if (!p)
+   break;
+
+   if (i915_request_completed(rq))
+   continue;
+
+   i915_request_set_priority(rq, prio);
+   } while (1);
+   rcu_read_unlock();
+}
+
+static DEFINE_IRQ_WORK(ipi_work, ipi_schedule);
+
+static void ipi_queue(struct i915_dependency *p)
+{
+   list_move(&p->ipi_link, &ipi_list);
+   irq_work_queue(&ipi_work);
+}
+
+/*
+ * Virtual engines complicate acquiring the engine timeline lock,
+ * as their rq->engine pointer is not stable until under that
+ * engine lock. The simple ploy we use is to take the lock then
+ * check that the rq still belongs to the newly locked engine.
+ */
+#define lock_engine_irqsave(rq, flags) ({ \
+   struct i915_request * const rq__ = (rq); \
+   struct intel_engine_cs *engine__ = READ_ONCE(rq__->engine); \
+\
+   spin_lock_irqsave(&engine__->active.lock, (flags)); \
+   while (engine__ != READ_ONCE((rq__)->engine)) { \
+   spin_unlock(&engine__->active.lock); \
+   engine__ = READ_ONCE(rq__->engine); \
+   spin_lock(&engine__->active.lock); \
+   } \
+\
+   engine__; \
+})
 
 static const struct i915_request *
 node_to_request(const struct i915_sched_node *node)
@@ -162,42 +225,6 @@ void i915_priolist_free_many(struct list_head *list)
}
 }
 
-struct sched_cache {
-   struct list_head *priolist;
-};
-
-static struct intel_engine_cs *
-sched_lock_engine(const struct i915_sched_node *node,
- struct intel_engine_cs *locked,
- struct sched_cache *cache)
-{
-   const struct i915_request *rq = node_to_request(node);
-   struct intel_engine_cs *engine;
-
-   GEM_BUG_ON(!locked);
-
-   /*
-* Virtual engines complicate acquiring the engine timeline lock,
-* as their rq->engine pointer is not stable until under that
-* engine lock. The simple ploy we use is to take the lock then
-* check that the rq still belongs to the newly locked engine.
-*/
-   while (locked != (engine = READ_ONCE(rq->engine))) {
-   spin_unlock(&locked->active.lock);
-   memset(cache, 0, sizeof(*cache));
-   spin_lock(&engine->active.lock);
-   locked = engine;
-   }
-
-   GEM_BUG_ON(locked != engine);
-   return locked;
-}
-
-static inline int rq_prio(const struct i915_request *rq)
-{
-   return rq->sched.attr.priority;
-}
-
 static inline bool need_preempt(int prio, int active)
 {
/*
@@ -223,19 +250,17 @@ static void kick_submission(struct intel_engine_cs 
*engine,
if (prio <= engine->execlists.queue_priority_hint)
return;
 
-   rcu_read_lock();
-
/* Nothing currently active? We're overdue for a submission! */
inflight = execlists_active(&engine->execlists);
if (!inflight)
-   goto unlock;
+   return;
 
/*
 * If we are already the currently executing context, don't
 * bother evaluating if we should preempt ourselves.
 */
if (inflight->context == rq->context)
-   goto unlock;
+   return;
 
ENGINE_TRACE(engine,
 

[Intel-gfx] [PATCH 40/42] drm/i915: Replace the priority boosting for the display with a deadline

2020-08-02 Thread Chris Wilson
For a modeset/pageflip, there is a very precise deadline by which the
frame must be completed in order to hit the vblank and be shown. While
we don't pass along that exact information, we can at least inform the
scheduler that this request-chain needs to be completed asap.
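The change of interface from a priority to a deadline can be sketched as a change in the selection rule: instead of picking the highest-priority request, the scheduler picks the earliest deadline, so a display request stamped with "now" jumps to the head of the queue. A minimal, hypothetical illustration:

```c
/* Sketch of earliest-deadline-first selection. Each request carries a
 * deadline in scheduler ticks (cf. i915_sched_to_ticks(ktime) in the
 * patch); the request with the smallest deadline runs first. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct request {
	uint64_t deadline;	/* in ticks; smaller = more urgent */
};

/* Return the most urgent request, or NULL if the queue is empty. */
static struct request *pick_earliest(struct request *rq, size_t n)
{
	struct request *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!best || rq[i].deadline < best->deadline)
			best = &rq[i];
	return best;
}
```

In practice the queue is kept sorted (so selection is O(1)), but the ordering criterion is the same.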

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/display/intel_display.c |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h   |  4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_wait.c | 19 ++-
 drivers/gpu/drm/i915/i915_priolist_types.h   |  3 ---
 4 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index e713b673be4b..0327e93586c2 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -15993,7 +15993,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
if (ret)
return ret;
 
-   i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
+   i915_gem_object_wait_deadline(obj, 0, ktime_get() /* next vblank? */);
i915_gem_object_flush_frontbuffer(obj, ORIGIN_DIRTYFB);
 
if (!new_plane_state->uapi.fence) { /* implicit fencing */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 631237c859e9..3b1b0601adf3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -472,9 +472,9 @@ static inline void __start_cpu_write(struct 
drm_i915_gem_object *obj)
 int i915_gem_object_wait(struct drm_i915_gem_object *obj,
 unsigned int flags,
 long timeout);
-int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
+int i915_gem_object_wait_deadline(struct drm_i915_gem_object *obj,
  unsigned int flags,
- int prio);
+ ktime_t deadline);
 
 void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 enum fb_op_origin origin);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c 
b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index cefbbb3d9b52..3334817183f6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -93,17 +93,18 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
return timeout;
 }
 
-static void __fence_set_priority(struct dma_fence *fence, int prio)
+static void __fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
 {
if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence))
return;
 
local_bh_disable();
-   i915_request_set_priority(to_request(fence), prio);
+   i915_request_set_deadline(to_request(fence),
+ i915_sched_to_ticks(deadline));
local_bh_enable(); /* kick the tasklets if queues were reprioritised */
 }
 
-static void fence_set_priority(struct dma_fence *fence, int prio)
+static void fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
 {
/* Recurse once into a fence-array */
if (dma_fence_is_array(fence)) {
@@ -111,16 +112,16 @@ static void fence_set_priority(struct dma_fence *fence, 
int prio)
int i;
 
for (i = 0; i < array->num_fences; i++)
-   __fence_set_priority(array->fences[i], prio);
+   __fence_set_deadline(array->fences[i], deadline);
} else {
-   __fence_set_priority(fence, prio);
+   __fence_set_deadline(fence, deadline);
}
 }
 
 int
-i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
+i915_gem_object_wait_deadline(struct drm_i915_gem_object *obj,
  unsigned int flags,
- int prio)
+ ktime_t deadline)
 {
struct dma_fence *excl;
 
@@ -135,7 +136,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object 
*obj,
return ret;
 
for (i = 0; i < count; i++) {
-   fence_set_priority(shared[i], prio);
+   fence_set_deadline(shared[i], deadline);
dma_fence_put(shared[i]);
}
 
@@ -145,7 +146,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object 
*obj,
}
 
if (excl) {
-   fence_set_priority(excl, prio);
+   fence_set_deadline(excl, deadline);
dma_fence_put(excl);
}
return 0;
diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h 
b/drivers/gpu/drm/i915/i915_priolist_types.h
index 43a0ac45295f..ac6d9614ea23 100644
--- a/drivers/gpu/drm/i915/i915_priolist_types.h
+++ b/drivers/gpu/drm/i915/i915_priolist_types.h
@@ -20,9 +20,6 @@ enum {
/* A preemptive pulse used to monitor the health of each 

[Intel-gfx] [PATCH 28/42] drm/i915: Remove I915_USER_PRIORITY_SHIFT

2020-08-02 Thread Chris Wilson
As we do not have any internal priority levels, the priority can be set
directly from the user values.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/display/intel_display.c  |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  6 +--
 .../i915/gem/selftests/i915_gem_object_blt.c  |  4 +-
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  6 +--
 drivers/gpu/drm/i915/gt/selftest_lrc.c| 44 +++
 drivers/gpu/drm/i915/i915_priolist_types.h|  3 --
 drivers/gpu/drm/i915/i915_scheduler.c |  1 -
 7 files changed, 23 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index db2a5a1a9b35..7212204a4759 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -15919,9 +15919,7 @@ static void intel_plane_unpin_fb(struct 
intel_plane_state *old_plane_state)
 
 static void fb_obj_bump_render_priority(struct drm_i915_gem_object *obj)
 {
-   struct i915_sched_attr attr = {
-   .priority = I915_USER_PRIORITY(I915_PRIORITY_DISPLAY),
-   };
+   struct i915_sched_attr attr = { .priority = I915_PRIORITY_DISPLAY };
 
i915_gem_object_wait_priority(obj, 0, &attr);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 04d291a8d31d..b2cd488c3420 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -719,7 +719,7 @@ __create_context(struct drm_i915_private *i915)
 
kref_init(>ref);
ctx->i915 = i915;
-   ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL);
+   ctx->sched.priority = I915_PRIORITY_NORMAL;
mutex_init(>mutex);
INIT_LIST_HEAD(>link);
 
@@ -1992,7 +1992,7 @@ static int set_priority(struct i915_gem_context *ctx,
!capable(CAP_SYS_NICE))
return -EPERM;
 
-   ctx->sched.priority = I915_USER_PRIORITY(priority);
+   ctx->sched.priority = priority;
context_apply_all(ctx, __apply_priority, ctx);
 
return 0;
@@ -2496,7 +2496,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
*dev, void *data,
 
case I915_CONTEXT_PARAM_PRIORITY:
args->size = 0;
-   args->value = ctx->sched.priority >> I915_USER_PRIORITY_SHIFT;
+   args->value = ctx->sched.priority;
break;
 
case I915_CONTEXT_PARAM_SSEU:
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
index 23b6e11bbc3e..c4c04fb97d14 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
@@ -220,7 +220,7 @@ static int igt_fill_blt_thread(void *arg)
return PTR_ERR(ctx);
 
prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
-   ctx->sched.priority = I915_USER_PRIORITY(prio);
+   ctx->sched.priority = prio;
}
 
ce = i915_gem_context_get_engine(ctx, 0);
@@ -338,7 +338,7 @@ static int igt_copy_blt_thread(void *arg)
return PTR_ERR(ctx);
 
prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
-   ctx->sched.priority = I915_USER_PRIORITY(prio);
+   ctx->sched.priority = prio;
}
 
ce = i915_gem_context_get_engine(ctx, 0);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 469639a2cc30..5ab76c507e46 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -82,9 +82,7 @@ static void show_heartbeat(const struct i915_request *rq,
 
 static void heartbeat(struct work_struct *wrk)
 {
-   struct i915_sched_attr attr = {
-   .priority = I915_USER_PRIORITY(I915_PRIORITY_MIN),
-   };
+   struct i915_sched_attr attr = { .priority = I915_PRIORITY_MIN };
struct intel_engine_cs *engine =
container_of(wrk, typeof(*engine), heartbeat.work.work);
struct intel_context *ce = engine->kernel_context;
@@ -128,7 +126,7 @@ static void heartbeat(struct work_struct *wrk)
 */
attr.priority = 0;
if (rq->sched.attr.priority >= attr.priority)
-   attr.priority |= I915_USER_PRIORITY(I915_PRIORITY_HEARTBEAT);
+   attr.priority = I915_PRIORITY_HEARTBEAT;
if (rq->sched.attr.priority >= attr.priority)
attr.priority = I915_PRIORITY_BARRIER;
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 037145b51459..2b707f39b04c 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ 

[Intel-gfx] [PATCH 18/42] drm/i915/gt: Decouple inflight virtual engines

2020-08-02 Thread Chris Wilson
Once a virtual engine has been bound to a sibling, it will remain bound
until we finally schedule out the last active request. We cannot rebind
the context to a new sibling while it is inflight as the context save
would conflict, hence we wait. As we cannot then use any other sibling
while the context is inflight, only kick the bound sibling while it is
inflight, and upon scheduling out kick the rest (so that we can swap
engines on timeslicing if the previously bound engine becomes
oversubscribed).

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 28 
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index c1ba1eee4b29..19cd5f4f74cf 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1388,9 +1388,8 @@ static inline void execlists_schedule_in(struct 
i915_request *rq, int idx)
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
-   struct i915_request *next = READ_ONCE(ve->request);
 
-   if (next == rq || (next && next->execution_mask & ~rq->execution_mask))
+   if (READ_ONCE(ve->request))
tasklet_hi_schedule(&ve->base.execlists.tasklet);
 }
 
@@ -1807,17 +1806,13 @@ first_virtual_engine(struct intel_engine_cs *engine)
struct i915_request *rq = READ_ONCE(ve->request);
 
/* lazily cleanup after another engine handled rq */
-   if (!rq) {
+   if (!rq || !virtual_matches(ve, rq, engine)) {
rb_erase_cached(rb, &el->virtual);
RB_CLEAR_NODE(rb);
rb = rb_first_cached(&el->virtual);
continue;
}
 
-   if (!virtual_matches(ve, rq, engine)) {
-   rb = rb_next(rb);
-   continue;
-   }
return ve;
}
 
@@ -5469,7 +5464,6 @@ static void virtual_submission_tasklet(unsigned long data)
if (unlikely(!mask))
return;
 
-   local_irq_disable();
for (n = 0; n < ve->num_siblings; n++) {
struct intel_engine_cs *sibling = READ_ONCE(ve->siblings[n]);
struct ve_node * const node = &ve->nodes[sibling->id];
@@ -5479,20 +5473,19 @@ static void virtual_submission_tasklet(unsigned long 
data)
if (!READ_ONCE(ve->request))
break; /* already handled by a sibling's tasklet */
 
+   spin_lock_irq(&sibling->active.lock);
+
if (unlikely(!(mask & sibling->mask))) {
if (!RB_EMPTY_NODE(&node->rb)) {
-   spin_lock(&sibling->active.lock);
rb_erase_cached(&node->rb,
&sibling->execlists.virtual);
RB_CLEAR_NODE(&node->rb);
-   spin_unlock(&sibling->active.lock);
}
-   continue;
-   }
 
-   spin_lock(&sibling->active.lock);
+   goto unlock_engine;
+   }
 
-   if (!RB_EMPTY_NODE(&node->rb)) {
+   if (unlikely(!RB_EMPTY_NODE(&node->rb))) {
/*
 * Cheat and avoid rebalancing the tree if we can
 * reuse this node in situ.
@@ -5532,9 +5525,12 @@ static void virtual_submission_tasklet(unsigned long 
data)
if (first && prio > sibling->execlists.queue_priority_hint)
tasklet_hi_schedule(&sibling->execlists.tasklet);
 
-   spin_unlock(&sibling->active.lock);
+unlock_engine:
+   spin_unlock_irq(&sibling->active.lock);
+
+   if (intel_context_inflight(&ve->context))
+   break;
}
-   local_irq_enable();
 }
 
 static void virtual_submit_request(struct i915_request *rq)
-- 
2.20.1



[Intel-gfx] [PATCH 05/42] drm/i915/gt: Free stale request on destroying the virtual engine

2020-08-02 Thread Chris Wilson
Since preempt-to-busy, we may unsubmit a request while it is still on
the HW and it completes asynchronously. That means it may be retired and
in the process destroy the virtual engine (as the user has closed their
context), but that engine may still be holding onto the unsubmitted
completed request. Therefore we need to potentially clean up the old
request on destroying the virtual engine. We also have to keep the
virtual_engine alive until after the siblings' execlists_dequeue() have
finished peeking into the virtual engines, for which we serialise with
RCU.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 417f6b0c6c61..cb04bc5474be 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -180,6 +180,7 @@
 #define EXECLISTS_REQUEST_SIZE 64 /* bytes */
 
 struct virtual_engine {
+   struct rcu_head rcu;
struct intel_engine_cs base;
struct intel_context context;
 
@@ -5393,10 +5394,25 @@ static void virtual_context_destroy(struct kref *kref)
container_of(kref, typeof(*ve), context.ref);
unsigned int n;
 
-   GEM_BUG_ON(!list_empty(virtual_queue(ve)));
-   GEM_BUG_ON(ve->request);
GEM_BUG_ON(ve->context.inflight);
 
+   if (unlikely(ve->request)) {
+   struct i915_request *old;
+   unsigned long flags;
+
+   spin_lock_irqsave(&ve->base.active.lock, flags);
+
+   old = fetch_and_zero(&ve->request);
+   if (old) {
+   GEM_BUG_ON(!i915_request_completed(old));
+   __i915_request_submit(old);
+   i915_request_put(old);
+   }
+
+   spin_unlock_irqrestore(&ve->base.active.lock, flags);
+   }
+   GEM_BUG_ON(!list_empty(virtual_queue(ve)));
+
for (n = 0; n < ve->num_siblings; n++) {
struct intel_engine_cs *sibling = ve->siblings[n];
struct rb_node *node = &ve->nodes[sibling->id].rb;
@@ -5422,7 +5438,7 @@ static void virtual_context_destroy(struct kref *kref)
intel_engine_free_request_pool(&ve->base);
 
kfree(ve->bonds);
-   kfree(ve);
+   kfree_rcu(ve, rcu);
 }
 
 static void virtual_engine_initial_hint(struct virtual_engine *ve)
-- 
2.20.1



[Intel-gfx] [PATCH 23/42] drm/i915/gt: Drop atomic for engine->fw_active tracking

2020-08-02 Thread Chris Wilson
Since schedule-in/out is now entirely serialised by the tasklet bitlock,
we do not need to worry about concurrent in/out operations and so reduce
the atomic operations to plain instructions.
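The pattern being simplified is a serialised acquire/release refcount: take the resource on the 0→1 transition and drop it on the 1→0 transition. Once callers are serialised, the atomic_fetch_inc/atomic_dec_return pair collapses to a plain counter, as this hypothetical user-space sketch shows:

```c
/* Sketch of a serialised get/put counter. Both helpers assume the caller
 * already serialises them (the tasklet bitlock in the patch above);
 * fw_held stands in for the forcewake reference actually being taken. */
#include <assert.h>

static unsigned int fw_active;
static int fw_held;

static void fw_get(void)
{
	if (!fw_active++)		/* 0 -> 1: acquire */
		fw_held = 1;		/* stand-in for forcewake_get() */
}

static void fw_put(void)
{
	if (!--fw_active)		/* 1 -> 0: release */
		fw_held = 0;		/* stand-in for forcewake_put() */
}
```

Without external serialisation this would race (two callers could both observe the 0→1 transition), which is exactly why the original code used atomics.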

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 2 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 26f2781e4a51..b2f520bc1497 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1637,7 +1637,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
   ktime_to_ms(intel_engine_get_busy_time(engine,
  )));
drm_printf(m, "\tForcewake: %x domains, %d active\n",
-  engine->fw_domain, atomic_read(&engine->fw_active));
+  engine->fw_domain, READ_ONCE(engine->fw_active));
 
rcu_read_lock();
rq = READ_ONCE(engine->heartbeat.systole);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 3ea1e59d2b85..2361206fd026 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -328,7 +328,7 @@ struct intel_engine_cs {
 * as possible.
 */
enum forcewake_domains fw_domain;
-   atomic_t fw_active;
+   unsigned int fw_active;
 
unsigned long context_tag;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e3dccdc53faf..61dfc932201e 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1342,7 +1342,7 @@ __execlists_schedule_in(struct i915_request *rq)
ce->lrc.ccid |= engine->execlists.ccid;
 
__intel_gt_pm_get(engine->gt);
-   if (engine->fw_domain && !atomic_fetch_inc(&engine->fw_active))
+   if (engine->fw_domain && !engine->fw_active++)
intel_uncore_forcewake_get(engine->uncore, engine->fw_domain);
execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
intel_engine_context_in(engine);
@@ -1435,7 +1435,7 @@ static inline void __execlists_schedule_out(struct 
i915_request *rq)
intel_context_update_runtime(ce);
intel_engine_context_out(engine);
execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
-   if (engine->fw_domain && !atomic_dec_return(&engine->fw_active))
+   if (engine->fw_domain && !--engine->fw_active)
intel_uncore_forcewake_put(engine->uncore, engine->fw_domain);
intel_gt_pm_put_async(engine->gt);
 
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 38/42] drm/i915: Fair low-latency scheduling

2020-08-02 Thread Chris Wilson
The first "scheduler" was a topographical sorting of requests into
priority order. The execution order was deterministic, the earliest
submitted, highest priority request would be executed first. Priority
inheritance ensured that inversions were kept at bay, and allowed us to
dynamically boost priorities (e.g. for interactive pageflips).

The minimalistic timeslicing scheme was an attempt to introduce fairness
between long running requests, by evicting the active request at the end
of a timeslice and moving it to the back of its priority queue (while
ensuring that dependencies were kept in order). For short running
requests from many clients of equal priority, the scheme is still very
much FIFO submission ordering, and as unfair as before.

To impose fairness, we need an external metric that ensures that clients
are interspersed, so that we don't execute one long chain from client A
before executing any of client B. This could be imposed by the clients
by using fences based on an external clock; that is, they only submit
work for a
"frame" at frame-interval, instead of submitting as much work as they
are able to. The standard SwapBuffers approach is akin to double
buffering: as one frame is being executed, the next is being
submitted, such that there is always a maximum of two frames per client
in the pipeline. Even this scheme exhibits unfairness under load as a
single client will execute two frames back to back before the next, and
with enough clients, deadlines will be missed.

The idea introduced by BFS/MuQSS is that fairness is introduced by
metering with an external clock. Every request, when it becomes ready to
execute is assigned a virtual deadline, and execution order is then
determined by earliest deadline. Priority is used as a hint, rather than
strict ordering, where high priority requests have earlier deadlines,
but not necessarily earlier than outstanding work. Thus work is executed
in order of 'readiness', with timeslicing to demote long running work.
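
As a rough illustration of that rule, here is a toy model in plain C.
The deadline formula, names and types are invented for the sketch and
are not the i915 implementation:

```c
#include <limits.h>
#include <stddef.h>

struct fake_request {
    unsigned long ready; /* tick at which the request became ready */
    int prio;            /* hint: higher value = earlier deadline */
};

/*
 * The deadline starts from the moment of readiness, so work is ordered
 * by 'readiness'; priority merely shrinks the budget rather than
 * jumping the queue outright.
 */
static unsigned long virtual_deadline(const struct fake_request *rq,
                                      unsigned long slice)
{
    unsigned long budget = rq->prio > 0 ? slice >> rq->prio : slice;

    return rq->ready + (budget ? budget : 1);
}

/* Execution order: earliest virtual deadline first. */
static size_t pick_next(const struct fake_request *rq, size_t count,
                        unsigned long slice)
{
    unsigned long best_dl = ULONG_MAX;
    size_t i, best = 0;

    for (i = 0; i < count; i++) {
        unsigned long dl = virtual_deadline(&rq[i], slice);

        if (dl < best_dl) {
            best_dl = dl;
            best = i;
        }
    }
    return best;
}
```

Note that at equal priority the earliest-ready request wins, while a
high-priority latecomer only overtakes work whose deadline it actually
beats — the "hint, rather than strict ordering" behaviour above.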

The Achilles' heel of this scheduler is its strong preference for
low-latency and favouring of new queues. Whereas it was easy to dominate
the old scheduler by flooding it with many requests over a short period
of time, the new scheduler can be dominated by a 'synchronous' client
that waits for each of its requests to complete before submitting the
next. As such a client has no history, it is always considered
ready-to-run and receives an earlier deadline than the long running
requests.

To check the impact on throughput (often the downfall of latency
sensitive schedulers), we used gem_wsim to simulate various transcode
workloads with different load balancers, and varying the number of
competing [heterogenous] clients.

[ministat distribution plot omitted; per-client throughput delta (%) below]

Clients      Min      Max   Median          Avg       Stddev
1          -8.20     5.40   -0.045     -0.02375  0.094722134
2         -15.96    19.28   -0.640     -1.05       2.2428076
4          -5.11     2.95   -1.150     -1.068     0.72382651
8          -5.63     1.85   -0.905  -0.87122449   0.73390971

The impact was on average 1% under contention due to the change in context
execution order and number of context switches.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  12 +-
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   1 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |   4 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  14 -
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 287 ---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   5 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  41 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   6 +-
 drivers/gpu/drm/i915/i915_priolist_types.h|   7 +-
 drivers/gpu/drm/i915/i915_request.c   |   1 +
 drivers/gpu/drm/i915/i915_scheduler.c | 462 +++---
 drivers/gpu/drm/i915/i915_scheduler.h |  29 +-
 

[Intel-gfx] [PATCH 13/42] drm/i915/gt: Decouple completed requests on unwind

2020-08-02 Thread Chris Wilson
Since the introduction of preempt-to-busy, requests can complete in the
background, even while they are not on the engine->active.requests list.
As such, the engine->active.requests list itself is not in strict
retirement order, and we have to scan the entire list while unwinding to
not miss any. However, if the request is completed we currently leave it
on the list [until retirement], but we could just as simply remove it
and stop treating it as active. We would only have to then traverse it
once while unwinding in quick succession.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 6 --
 drivers/gpu/drm/i915/i915_request.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index cb04bc5474be..f9aa44f222db 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs 
*engine)
list_for_each_entry_safe_reverse(rq, rn,
 &engine->active.requests,
 sched.link) {
-   if (i915_request_completed(rq))
-   continue; /* XXX */
+   if (i915_request_completed(rq)) {
+   list_del_init(&rq->sched.link);
+   continue;
+   }
 
__i915_request_unsubmit(rq);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 1cd102591b36..ebfa195d6463 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -319,7 +319,8 @@ bool i915_request_retire(struct i915_request *rq)
 * after removing the breadcrumb and signaling it, so that we do not
 * inadvertently attach the breadcrumb to a completed request.
 */
-   remove_from_engine(rq);
+   if (!list_empty(>sched.link))
+   remove_from_engine(rq);
GEM_BUG_ON(!llist_empty(&rq->execute_cb));
 
__list_del_entry(&rq->link); /* poison neither prev/next (RCU walks) */
-- 
2.20.1



[Intel-gfx] [PATCH 01/42] drm/i915: Fix wrong return value

2020-08-02 Thread Chris Wilson
From: Tianjia Zhang 

In function i915_active_acquire_preallocate_barrier(), not all
paths have the return value set correctly, and in case of memory
allocation failure, a negative error code should be returned.

Cc: Chris Wilson 
Signed-off-by: Tianjia Zhang 
Reviewed-by: Chris Wilson 
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_active.c| 7 ++-
 drivers/gpu/drm/i915/selftests/i915_request.c | 2 +-
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c 
b/drivers/gpu/drm/i915/i915_active.c
index 3ee5e02c3e17..b0a6522be3d1 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -889,7 +889,6 @@ int i915_active_acquire_preallocate_barrier(struct 
i915_active *ref,
intel_engine_mask_t tmp, mask = engine->mask;
struct llist_node *first = NULL, *last = NULL;
struct intel_gt *gt = engine->gt;
-   int err;
 
GEM_BUG_ON(i915_active_is_idle(ref));
 
@@ -914,10 +913,8 @@ int i915_active_acquire_preallocate_barrier(struct 
i915_active *ref,
rcu_read_unlock();
if (!node) {
node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL);
-   if (!node) {
-   err = ENOMEM;
+   if (!node)
goto unwind;
-   }
 
RCU_INIT_POINTER(node->base.fence, NULL);
node->base.cb.func = node_retire;
@@ -965,7 +962,7 @@ int i915_active_acquire_preallocate_barrier(struct 
i915_active *ref,
 
kmem_cache_free(global.slab_cache, node);
}
-   return err;
+   return -ENOMEM;
 }
 
 void i915_active_acquire_barrier(struct i915_active *ref)
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c 
b/drivers/gpu/drm/i915/selftests/i915_request.c
index 57dd6f5122ee..c1dcd4b91bda 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -331,7 +331,7 @@ static int __igt_breadcrumbs_smoketest(void *arg)
if (!wait) {
i915_sw_fence_commit(submit);
heap_fence_put(submit);
-   err = ENOMEM;
+   err = -ENOMEM;
break;
}
 
-- 
2.20.1



[Intel-gfx] [PATCH 27/42] drm/i915: Strip out internal priorities

2020-08-02 Thread Chris Wilson
Since we are not using any internal priority levels, and in the next few
patches will introduce a new index for which the optimisation is not so
clear cut, discard the small table within the priolist.

Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 22 ++--
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  2 -
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  6 +--
 drivers/gpu/drm/i915/i915_priolist_types.h|  8 +--
 drivers/gpu/drm/i915/i915_scheduler.c | 51 +++
 drivers/gpu/drm/i915/i915_scheduler.h | 18 ++-
 7 files changed, 21 insertions(+), 88 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 5a0196dc5716..469639a2cc30 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -126,7 +126,7 @@ static void heartbeat(struct work_struct *wrk)
 * low latency and no jitter] the chance to naturally
 * complete before being preempted.
 */
-   attr.priority = I915_PRIORITY_MASK;
+   attr.priority = 0;
if (rq->sched.attr.priority >= attr.priority)
attr.priority |= 
I915_USER_PRIORITY(I915_PRIORITY_HEARTBEAT);
if (rq->sched.attr.priority >= attr.priority)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3c022e621a38..e8f6d0a80c8e 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -437,22 +437,13 @@ static int effective_prio(const struct i915_request *rq)
 
 static int queue_prio(const struct intel_engine_execlists *execlists)
 {
-   struct i915_priolist *p;
struct rb_node *rb;
 
rb = rb_first_cached(&execlists->queue);
if (!rb)
return INT_MIN;
 
-   /*
-* As the priolist[] are inverted, with the highest priority in [0],
-* we have to flip the index value to become priority.
-*/
-   p = to_priolist(rb);
-   if (!I915_USER_PRIORITY_SHIFT)
-   return p->priority;
-
-   return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
+   return to_priolist(rb)->priority;
 }
 
 static int virtual_prio(const struct intel_engine_execlists *el)
@@ -2249,9 +2240,8 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
while ((rb = rb_first_cached(&execlists->queue))) {
struct i915_priolist *p = to_priolist(rb);
struct i915_request *rq, *rn;
-   int i;
 
-   priolist_for_each_request_consume(rq, rn, p, i) {
+   priolist_for_each_request_consume(rq, rn, p) {
bool merge = true;
 
/*
@@ -4253,9 +4243,8 @@ static void execlists_reset_cancel(struct intel_engine_cs 
*engine)
/* Flush the queued requests to the timeline list (for retiring). */
while ((rb = rb_first_cached(&execlists->queue))) {
struct i915_priolist *p = to_priolist(rb);
-   int i;
 
-   priolist_for_each_request_consume(rq, rn, p, i) {
+   priolist_for_each_request_consume(rq, rn, p) {
mark_eio(rq);
__i915_request_submit(rq);
}
@@ -5296,7 +5285,7 @@ static int __execlists_context_alloc(struct intel_context 
*ce,
 
 static struct list_head *virtual_queue(struct virtual_engine *ve)
 {
-   return &ve->base.execlists.default_priolist.requests[0];
+   return &ve->base.execlists.default_priolist.requests;
 }
 
 static void virtual_context_destroy(struct kref *kref)
@@ -5863,9 +5852,8 @@ void intel_execlists_show_requests(struct intel_engine_cs 
*engine,
count = 0;
for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
-   int i;
 
-   priolist_for_each_request(rq, p, i) {
+   priolist_for_each_request(rq, p) {
if (count++ < max - 1)
show_request(m, rq, "\t\tQ ");
else
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 79c4a2d80770..037145b51459 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -1102,7 +1102,6 @@ create_rewinder(struct intel_context *ce,
 
intel_ring_advance(rq, cs);
 
-   rq->sched.attr.priority = I915_PRIORITY_MASK;
err = 0;
 err:
i915_request_get(rq);
@@ -5362,7 +5361,6 @@ create_timestamp(struct intel_context *ce, void *slot, 
int idx)
 
intel_ring_advance(rq, cs);
 
-   rq->sched.attr.priority = I915_PRIORITY_MASK;

[Intel-gfx] [PATCH 32/42] drm/i915/gt: Do not suspend bonded requests if one hangs

2020-08-02 Thread Chris Wilson
Treat the dependency between bonded requests as weak and leave the
remainder of the pair on the GPU if one hangs.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index fad58f227bca..3999bf6c6aee 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2700,6 +2700,9 @@ static void __execlists_hold(struct i915_request *rq)
struct i915_request *w =
container_of(p->waiter, typeof(*w), sched);
 
+   if (p->flags & I915_DEPENDENCY_WEAK)
+   continue;
+
/* Leave semaphores spinning on the other engines */
if (w->engine != rq->engine)
continue;
@@ -2795,6 +2798,9 @@ static void __execlists_unhold(struct i915_request *rq)
struct i915_request *w =
container_of(p->waiter, typeof(*w), sched);
 
+   if (p->flags & I915_DEPENDENCY_WEAK)
+   continue;
+
/* Propagate any change in error status */
if (rq->fence.error)
i915_request_set_error_once(w, rq->fence.error);
-- 
2.20.1



[Intel-gfx] [PATCH 02/42] drm/i915/gem: Don't drop the timeline lock during execbuf

2020-08-02 Thread Chris Wilson
Our timeline lock is our defence against a concurrent execbuf
interrupting our request construction. We need to hold it throughout or,
for example, a second thread may interject a relocation request in
between our own relocation request and execution in the ring.

A second, major benefit is that it allows us to preserve a large chunk
of the ringbuffer for our exclusive use; which should virtually
eliminate the threat of hitting a wait_for_space during request
construction -- although we should have already dropped other
contentious locks at that point.

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 414 +++---
 .../i915/gem/selftests/i915_gem_execbuffer.c  |  29 +-
 2 files changed, 288 insertions(+), 155 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 07cb2dd0f795..89c7b33dd99f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -259,6 +259,8 @@ struct i915_execbuffer {
bool has_fence : 1;
bool needs_unfenced : 1;
 
+   struct intel_context *ce;
+
struct i915_vma *target;
struct i915_request *rq;
struct i915_vma *rq_vma;
@@ -639,6 +641,35 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
return 0;
 }
 
+static void retire_requests(struct intel_timeline *tl)
+{
+   struct i915_request *rq, *rn;
+
+   list_for_each_entry_safe(rq, rn, &tl->requests, link)
+   if (!i915_request_retire(rq))
+   break;
+}
+
+static int wait_for_timeline(struct intel_timeline *tl)
+{
+   do {
+   struct dma_fence *fence;
+   int err;
+
+   fence = i915_active_fence_get(&tl->last_request);
+   if (!fence)
+   return 0;
+
+   err = dma_fence_wait(fence, true);
+   dma_fence_put(fence);
+   if (err)
+   return err;
+
+   /* Retiring may trigger a barrier, requiring an extra pass */
+   retire_requests(tl);
+   } while (1);
+}
+
 static int eb_reserve(struct i915_execbuffer *eb)
 {
const unsigned int count = eb->buffer_count;
@@ -646,7 +677,6 @@ static int eb_reserve(struct i915_execbuffer *eb)
struct list_head last;
struct eb_vma *ev;
unsigned int i, pass;
-   int err = 0;
 
/*
 * Attempt to pin all of the buffers into the GTT.
@@ -662,18 +692,37 @@ static int eb_reserve(struct i915_execbuffer *eb)
 * room for the earlier objects *unless* we need to defragment.
 */
 
-   if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
-   return -EINTR;
-
pass = 0;
do {
+   int err = 0;
+
+   /*
+* We need to hold one lock as we bind all the vma so that
+* we have a consistent view of the entire vm and can plan
+* evictions to fill the whole GTT. If we allow a second
+* thread to run as we do this, it will either unbind
+* everything we want pinned, or steal space that we need for
+* ourselves. The closer we are to a full GTT, the more likely
+* such contention will cause us to fail to bind the workload
+* for this batch. Since we know at this point we need to
+* find space for new buffers, we know that extra pressure
+* from contention is likely.
+*
+* In lieu of being able to hold vm->mutex for the entire
+* sequence (it's complicated!), we opt for struct_mutex.
+*/
+   if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
+   return -EINTR;
+
list_for_each_entry(ev, &eb->unbound, bind_link) {
err = eb_reserve_vma(eb, ev, pin_flags);
if (err)
break;
}
-   if (!(err == -ENOSPC || err == -EAGAIN))
-   break;
+   if (!(err == -ENOSPC || err == -EAGAIN)) {
+   mutex_unlock(&eb->i915->drm.struct_mutex);
+   return err;
+   }
 
/* Resort *all* the objects into priority order */
INIT_LIST_HEAD(&eb->unbound);
@@ -702,38 +751,50 @@ static int eb_reserve(struct i915_execbuffer *eb)
list_add_tail(&ev->bind_link, &last);
}
list_splice_tail(&last, &eb->unbound);
+   mutex_unlock(&eb->i915->drm.struct_mutex);
 
if (err == -EAGAIN) {
-   mutex_unlock(&eb->i915->drm.struct_mutex);
flush_workqueue(eb->i915->mm.userptr_wq);
-  

[Intel-gfx] [PATCH 36/42] drm/i915: Improve DFS for priority inheritance

2020-08-02 Thread Chris Wilson
The core of the scheduling algorithm is that we compute the topological
order of the fence DAG. Knowing that we have a DAG, we should be able to
use a DFS to compute the topological sort in linear time. However,
during the conversion of the recursive algorithm into an iterative one,
the memoization of how far we had progressed down a branch was
forgotten.
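
For illustration, the classic shape of that fix — an explicit stack of
per-node cursors so each edge is walked exactly once and the sort stays
O(V + E) — can be sketched with toy graph types (not the i915_sched_node
lists):

```c
#include <stddef.h>

#define MAX_NODES 8

struct node {
    int nr_children;
    int children[MAX_NODES];
    int visited;
};

/*
 * Post-order (reverse topological) walk using an explicit stack of
 * (node, child-cursor) frames instead of recursion. The cursor array
 * memoizes how far we have progressed down each branch, so resuming a
 * parent never rescans children already emitted.
 */
static int topo_sort(struct node *g, int root, int *out)
{
    int stack[MAX_NODES], cursor[MAX_NODES];
    int depth = 0, n = 0;

    stack[0] = root;
    cursor[0] = 0;
    g[root].visited = 1;

    while (depth >= 0) {
        int id = stack[depth];

        if (cursor[depth] < g[id].nr_children) {
            int child = g[id].children[cursor[depth]++];

            if (!g[child].visited) {
                g[child].visited = 1;
                stack[++depth] = child; /* descend; parent cursor kept */
                cursor[depth] = 0;
            }
        } else {
            out[n++] = id;              /* all children emitted first */
            depth--;
        }
    }
    return n;
}
```

Dropping the cursor (restarting each node's child list on resume) is
exactly the regression described above: the walk degrades from linear
to quadratic in the number of edges.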

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_scheduler.c | 57 ---
 1 file changed, 33 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index 3c1a0b001746..ca681e6d9c6d 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -287,15 +287,33 @@ static void ipi_priority(struct i915_dependency *p, int 
prio)
spin_unlock(_lock);
 }
 
+static struct i915_request *
+stack_push(struct i915_request *rq,
+  struct i915_request *stack,
+  struct list_head *pos)
+{
+   stack->sched.dfs.prev = pos;
+   rq->sched.dfs.next = (struct list_head *)stack;
+   return rq;
+}
+
+static struct i915_request *stack_pop(struct i915_request *rq,
+ struct list_head **pos)
+{
+   rq = (struct i915_request *)rq->sched.dfs.next;
+   if (rq)
+   *pos = rq->sched.dfs.prev;
+   return rq;
+}
+
 static void __i915_request_set_priority(struct i915_request *rq, int prio)
 {
struct intel_engine_cs *engine = rq->engine;
-   struct i915_request *rn;
+   struct list_head *pos = &rq->sched.signalers_list;
struct list_head *plist;
-   LIST_HEAD(dfs);
 
lockdep_assert_held(&engine->active.lock);
-   list_add(&rq->sched.dfs, &dfs);
+   plist = i915_sched_lookup_priolist(engine, prio);
 
/*
 * Recursively bump all dependent priorities to match the new request.
@@ -315,40 +333,31 @@ static void __i915_request_set_priority(struct 
i915_request *rq, int prio)
 * end result is a topological list of requests in reverse order, the
 * last element in the list is the request we must execute first.
 */
-   list_for_each_entry(rq, &dfs, sched.dfs) {
-   struct i915_dependency *p;
-
-   /* Also release any children on this engine that are ready */
-   GEM_BUG_ON(rq->engine != engine);
-
-   for_each_signaler(p, rq) {
+   rq->sched.dfs.next = NULL;
+   do {
+   list_for_each_continue(pos, &rq->sched.signalers_list) {
+   struct i915_dependency *p =
+   list_entry(pos, typeof(*p), signal_link);
struct i915_request *s =
container_of(p->signaler, typeof(*s), sched);
 
-   GEM_BUG_ON(s == rq);
-
if (rq_prio(s) >= prio)
continue;
 
if (i915_request_completed(s))
continue;
 
-   if (s->engine != rq->engine) {
+   if (s->engine != engine) {
ipi_priority(p, prio);
continue;
}
 
-   list_move_tail(&s->sched.dfs, &dfs);
+   /* Remember our position along this branch */
+   rq = stack_push(s, rq, pos);
pos = &s->sched.signalers_list;
}
-   }
-
-   plist = i915_sched_lookup_priolist(engine, prio);
-
-   /* Fifo and depth-first replacement ensure our deps execute first */
-   list_for_each_entry_safe_reverse(rq, rn, &dfs, sched.dfs) {
-   GEM_BUG_ON(rq->engine != engine);
 
-   INIT_LIST_HEAD(&rq->sched.dfs);
+   RQ_TRACE(rq, "set-priority:%d\n", prio);
WRITE_ONCE(rq->sched.attr.priority, prio);
 
/*
@@ -362,12 +371,13 @@ static void __i915_request_set_priority(struct 
i915_request *rq, int prio)
if (!i915_request_is_ready(rq))
continue;
 
+   GEM_BUG_ON(rq->engine != engine);
if (i915_request_in_priority_queue(rq))
list_move_tail(&rq->sched.link, plist);
 
/* Defer (tasklet) submission until after all updates. */
kick_submission(engine, rq, prio);
-   }
+   } while ((rq = stack_pop(rq, )));
 }
 
 void i915_request_set_priority(struct i915_request *rq, int prio)
@@ -435,7 +445,6 @@ void i915_sched_node_init(struct i915_sched_node *node)
INIT_LIST_HEAD(&node->signalers_list);
INIT_LIST_HEAD(&node->waiters_list);
INIT_LIST_HEAD(&node->link);
-   INIT_LIST_HEAD(&node->dfs);
 
i915_sched_node_reinit(node);
 }
-- 
2.20.1



[Intel-gfx] [PATCH 08/42] drm/i915: Drop i915_request.lock serialisation around await_start

2020-08-02 Thread Chris Wilson
Originally, we used the signal->lock as a means of following the
previous link in its timeline and peeking at the previous fence.
However, we have replaced the explicit serialisation with a series of
very careful probes that anticipate the links being deleted and the
fences recycled before we are able to acquire a strong reference to it.
We do not need the signal->lock crutch anymore, nor want the contention.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_request.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 0208e917d14a..f7865e22afa9 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -954,9 +954,16 @@ i915_request_await_start(struct i915_request *rq, struct 
i915_request *signal)
if (i915_request_started(signal))
return 0;
 
+   /*
+* The caller holds a reference on @signal, but we do not serialise
+* against it being retired and removed from the lists.
+*
+* We do not hold a reference to the request before @signal, and
+* so must be very careful to ensure that it is not _recycled_ as
+* we follow the link backwards.
+*/
fence = NULL;
rcu_read_lock();
-   spin_lock_irq(&signal->lock);
do {
struct list_head *pos = READ_ONCE(signal->link.prev);
struct i915_request *prev;
@@ -987,7 +994,6 @@ i915_request_await_start(struct i915_request *rq, struct 
i915_request *signal)
 
fence = &prev->fence;
} while (0);
-   spin_unlock_irq(&signal->lock);
rcu_read_unlock();
if (!fence)
return 0;
-- 
2.20.1



[Intel-gfx] [PATCH 09/42] drm/i915: Drop i915_request.lock requirement for intel_rps_boost()

2020-08-02 Thread Chris Wilson
Since we use a flag within i915_request.flags to indicate when we have
boosted the request (so that we only apply the boost) once, this can be
used as the serialisation with i915_request_retire() to avoid having to
explicitly take the i915_request.lock which is more heavily contended.
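
The idea — a single atomic test-and-set bit acting as a once-only gate
between the boost path and the retire path — can be sketched with C11
atomics standing in for the kernel's test_and_set_bit() (the struct and
names below are illustrative, not the i915 types):

```c
#include <stdatomic.h>

struct fake_fence {
    atomic_uint flags; /* stands in for rq->fence.flags */
};

#define FAKE_FLAG_BOOST 0x1u

/*
 * Returns 1 only for the first caller to set the bit; every later
 * caller (including a concurrent retire) sees it already set and
 * backs off. No spinlock is needed for this exactly-once behaviour.
 */
static int fence_claim_boost(struct fake_fence *f)
{
    unsigned int old = atomic_fetch_or(&f->flags, FAKE_FLAG_BOOST);

    return !(old & FAKE_FLAG_BOOST);
}
```

Whichever of the two paths runs first wins the bit; the loser does
nothing, which is what replaces the i915_request.lock serialisation in
the patch.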

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_rps.c | 15 ++-
 drivers/gpu/drm/i915/i915_request.c |  4 +---
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index e6a00eea0631..2a43e216e0d4 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -889,17 +889,15 @@ void intel_rps_park(struct intel_rps *rps)
 
 void intel_rps_boost(struct i915_request *rq)
 {
-   struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;
-   unsigned long flags;
-
-   if (i915_request_signaled(rq) || !intel_rps_is_active(rps))
+   if (i915_request_signaled(rq) || i915_request_has_waitboost(rq))
return;
 
/* Serializes with i915_request_retire() */
-   spin_lock_irqsave(&rq->lock, flags);
-   if (!i915_request_has_waitboost(rq) &&
-   !dma_fence_is_signaled_locked(&rq->fence)) {
-   set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags);
+   if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) {
+   struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;
+
+   if (!intel_rps_is_active(rps))
+   return;
 
GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
 rq->fence.context, rq->fence.seqno);
@@ -910,7 +908,6 @@ void intel_rps_boost(struct i915_request *rq)
 
atomic_inc(&rps->boosts);
}
-   spin_unlock_irqrestore(&rq->lock, flags);
 }
 
 int intel_rps_set(struct intel_rps *rps, u8 val)
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index f7865e22afa9..67267b0d5b06 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -306,10 +306,8 @@ bool i915_request_retire(struct i915_request *rq)
spin_unlock_irq(&rq->lock);
}
 
-   if (i915_request_has_waitboost(rq)) {
-   GEM_BUG_ON(!atomic_read(&rq->engine->gt->rps.num_waiters));
+   if (test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags))
atomic_dec(&rq->engine->gt->rps.num_waiters);
-   }
 
/*
 * We only loosely track inflight requests across preemption,
-- 
2.20.1



[Intel-gfx] [PATCH 41/42] drm/i915: Move saturated workload detection back to the context

2020-08-02 Thread Chris Wilson
When we introduced the saturated workload detection to tell us to back
off from semaphore usage [semaphores have a noticeable impact on
contended bus cycles with the CPU for some heavy workloads], we first
introduced it as a per-context tracker. This allows individual contexts
to try and optimise their own usage, but we found that with the local
tracking and the no-semaphore boosting, the first context to disable
semaphores got a massive priority boost and so would starve the rest and
all new contexts (as they started with semaphores enabled and lower
priority). Hence we moved the saturated workload detection to the
engine, and as a consequence had to disable semaphores on virtual engines.

Now that we do not have semaphore priority boosting, and try to fairly
schedule irrespective of semaphore usage, we can move the tracking back
to the context and virtual engines can now utilise the faster inter-engine
synchronisation. If we see that any context fails to use the semaphore,
because the system is oversubscribed and was busy doing something else
instead of spinning on the semaphore, we disable further usage of
semaphores with that context until it idles again. This should restrict
the semaphores to lightly utilised systems where the latency between
requests is more noticeable, and curtail the bus-contention from checking
for signaled semaphores.
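
A toy model of that per-context tracking (invented names; the real code
keeps an intel_engine_mask_t in struct intel_context and clears it when
the context re-activates, as the diff shows):

```c
/* One bit per engine; a set bit means "stop emitting semaphores to
 * this engine for this context until it idles". Illustrative only. */
typedef unsigned int engine_mask_t;

struct fake_context {
    engine_mask_t saturated;
};

/* May we use a semaphore to wait on work running on @engine? */
static int use_semaphore(const struct fake_context *ce, engine_mask_t engine)
{
    return !(ce->saturated & engine);
}

/* The semaphore wait proved futile (system oversubscribed): back off. */
static void mark_saturated(struct fake_context *ce, engine_mask_t engine)
{
    ce->saturated |= engine;
}

/* Fresh start when the context idles and is later re-activated. */
static void context_idle(struct fake_context *ce)
{
    ce->saturated = 0;
}
```

Because the mask lives in the context rather than the engine, a virtual
engine's context can track saturation across all of its siblings, which
is what lets the patch re-enable semaphores there.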

References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_context.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |  2 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  2 --
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  2 --
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 15 ---
 drivers/gpu/drm/i915/i915_request.c   |  4 ++--
 6 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index cde356c7754d..641e705c9289 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -115,6 +115,8 @@ static int __intel_context_active(struct i915_active 
*active)
 
CE_TRACE(ce, "active\n");
 
+   ce->saturated = 0;
+
intel_context_get(ce);
 
err = __ring_active(ce->ring);
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 9a28339b6d74..58fb40d44c85 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -79,6 +79,8 @@ struct intel_context {
} lrc;
u32 tag; /* cookie passed to HW to track this context on submission */
 
+   intel_engine_mask_t saturated; /* submitting semaphores too late? */
+
/* Time on GPU as tracked by the hw. */
struct {
struct ewma_runtime avg;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index ac56df945846..fa7b855902f9 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -232,8 +232,6 @@ static int __engine_park(struct intel_wakeref *wf)
struct intel_engine_cs *engine =
container_of(wf, typeof(*engine), wakeref);
 
-   engine->saturated = 0;
-
/*
 * If one and only one request is completed between pm events,
 * we know that we are inside the kernel context and it is
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 742db055d3d6..a200e2f85e7b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -325,8 +325,6 @@ struct intel_engine_cs {
 
struct intel_context *kernel_context; /* pinned */
 
-   intel_engine_mask_t saturated; /* submitting semaphores too late? */
-
struct {
struct delayed_work work;
struct i915_request *systole;
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index feec1b841cd3..36ab955545a0 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -5504,21 +5504,6 @@ intel_execlists_create_virtual(struct intel_engine_cs 
**siblings,
ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
ve->base.uabi_instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
 
-   /*
-* The decision on whether to submit a request using semaphores
-* depends on the saturated state of the engine. We only compute
-* this during HW submission of the request, and we need for this
-* state to be globally applied to all requests being submitted
-* to this engine. Virtual engines encompass more than one physical
-* engine and so we cannot accurately tell in advance if one of those
-* engines is already saturated and so cannot afford to use a 

[Intel-gfx] [PATCH 14/42] drm/i915/gt: Check for a completed last request once

2020-08-02 Thread Chris Wilson
Pull the repeated check for the last active request being completed to a
single spot, when deciding whether or not execlist preemption is
required.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f9aa44f222db..a870fd243e44 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2148,12 +2148,9 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
 */
 
if ((last = *active)) {
-   if (need_preempt(engine, last, rb)) {
-   if (i915_request_completed(last)) {
-   tasklet_hi_schedule(&execlists->tasklet);
-   return;
-   }
-
+   if (i915_request_completed(last)) {
+   goto check_secondary;
+   } else if (need_preempt(engine, last, rb)) {
ENGINE_TRACE(engine,
 "preempting last=%llx:%lld, prio=%d, 
hint=%d\n",
 last->fence.context,
@@ -2181,11 +2178,6 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
last = NULL;
} else if (need_timeslice(engine, last, rb) &&
   timeslice_expired(execlists, last)) {
-   if (i915_request_completed(last)) {
-   tasklet_hi_schedule(&execlists->tasklet);
-   return;
-   }
-
ENGINE_TRACE(engine,
 "expired last=%llx:%lld, prio=%d, hint=%d, 
yield?=%s\n",
 last->fence.context,
@@ -2221,6 +2213,7 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
 * we hopefully coalesce several updates into a single
 * submission.
 */
+check_secondary:
if (!list_is_last(&last->sched.link,
  &engine->active.requests)) {
/*
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 37/42] drm/i915/gt: Remove timeslice suppression

2020-08-02 Thread Chris Wilson
In the next patch, we remove the strict priority system and continuously
re-evaluate the relative priority of tasks. As such we need to enable
the timeslice whenever there is more than one context in the pipeline.
This simplifies the decision and removes some of the tweaks to suppress
timeslicing, allowing us to lift the timeslice enabling to a common spot
at the end of running the submission tasklet.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  10 --
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 148 +++
 2 files changed, 53 insertions(+), 105 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index bccde0e27300..de85b969a79e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -237,16 +237,6 @@ struct intel_engine_execlists {
 */
unsigned int port_mask;
 
-   /**
-* @switch_priority_hint: Second context priority.
-*
-* We submit multiple contexts to the HW simultaneously and would
-* like to occasionally switch between them to emulate timeslicing.
-* To know when timeslicing is suitable, we track the priority of
-* the context submitted second.
-*/
-   int switch_priority_hint;
-
/**
 * @queue_priority_hint: Highest pending priority.
 *
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index cb53f350848f..6b06cddcfd9a 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1885,25 +1885,6 @@ static void defer_active(struct intel_engine_cs *engine)
defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq)));
 }
 
-static bool
-need_timeslice(const struct intel_engine_cs *engine,
-  const struct i915_request *rq)
-{
-   int hint;
-
-   if (!intel_engine_has_timeslices(engine))
-   return false;
-
-   hint = max(engine->execlists.queue_priority_hint,
-  virtual_prio(&engine->execlists));
-
-   if (!list_is_last(&rq->sched.link, &engine->active.requests))
-   hint = max(hint, rq_prio(list_next_entry(rq, sched.link)));
-
-   GEM_BUG_ON(hint >= I915_PRIORITY_UNPREEMPTABLE);
-   return hint >= effective_prio(rq);
-}
-
 static bool
 timeslice_yield(const struct intel_engine_execlists *el,
const struct i915_request *rq)
@@ -1923,76 +1904,63 @@ timeslice_yield(const struct intel_engine_execlists *el,
return rq->context->lrc.ccid == READ_ONCE(el->yield);
 }
 
-static bool
-timeslice_expired(const struct intel_engine_execlists *el,
- const struct i915_request *rq)
+static bool needs_timeslice(const struct intel_engine_cs *engine,
+   const struct i915_request *rq)
 {
-   return timer_expired(&el->timer) || timeslice_yield(el, rq);
-}
+   /* If not currently active, or about to switch, wait for next event */
+   if (!rq || i915_request_completed(rq))
+   return false;
 
-static int
-switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq)
-{
-   if (list_is_last(&rq->sched.link, &engine->active.requests))
-   return engine->execlists.queue_priority_hint;
+   /* We do not need to start the timeslice until after the ACK */
+   if (READ_ONCE(engine->execlists.pending[0]))
+   return false;
 
-   return rq_prio(list_next_entry(rq, sched.link));
-}
+   /* If ELSP[1] is occupied, always check to see if worth slicing */
+   if (!list_is_last_rcu(&rq->sched.link, &engine->active.requests))
+   return true;
 
-static inline unsigned long
-timeslice(const struct intel_engine_cs *engine)
-{
-   return READ_ONCE(engine->props.timeslice_duration_ms);
+   /* Otherwise, ELSP[0] is by itself, but may be waiting in the queue */
+   if (!RB_EMPTY_ROOT(&engine->execlists.queue.rb_root))
+   return true;
+
+   return !RB_EMPTY_ROOT(&engine->execlists.virtual.rb_root);
 }
 
-static unsigned long active_timeslice(const struct intel_engine_cs *engine)
+static bool
+timeslice_expired(struct intel_engine_cs *engine, const struct i915_request 
*rq)
 {
-   const struct intel_engine_execlists *execlists = &engine->execlists;
-   const struct i915_request *rq = *execlists->active;
+   const struct intel_engine_execlists *el = >execlists;
 
-   if (!rq || i915_request_completed(rq))
-   return 0;
+   if (!intel_engine_has_timeslices(engine))
+   return false;
 
-   if (READ_ONCE(execlists->switch_priority_hint) < effective_prio(rq))
-   return 0;
+   if (i915_request_has_nopreempt(rq) && i915_request_started(rq))
+   return false;
+
+   if (!needs_timeslice(engine, rq))
+   return false;
 
-   return timeslice(engine);
+   return timer_expired(&el->timer) || timeslice_yield(el, rq);
 }

[Intel-gfx] [PATCH 15/42] drm/i915/gt: Refactor heartbeat request construction and submission

2020-08-02 Thread Chris Wilson
Pull the individual strands of creating a custom heartbeat requests into
a pair of common functions. This will reduce the number of changes we
will need to make in future.

Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 56 +--
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 8ffdf676c0a0..eb4393b2342e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -37,12 +37,33 @@ static bool next_heartbeat(struct intel_engine_cs *engine)
return true;
 }
 
+static struct i915_request *
+heartbeat_create(struct intel_context *ce, gfp_t gfp)
+{
+   struct i915_request *rq;
+
+   intel_context_enter(ce);
+   rq = __i915_request_create(ce, gfp);
+   intel_context_exit(ce);
+
+   return rq;
+}
+
 static void idle_pulse(struct intel_engine_cs *engine, struct i915_request *rq)
 {
engine->wakeref_serial = READ_ONCE(engine->serial) + 1;
i915_request_add_active_barriers(rq);
 }
 
+static void heartbeat_commit(struct i915_request *rq,
+const struct i915_sched_attr *attr)
+{
+   idle_pulse(rq->engine, rq);
+
+   __i915_request_commit(rq);
+   __i915_request_queue(rq, attr);
+}
+
 static void show_heartbeat(const struct i915_request *rq,
   struct intel_engine_cs *engine)
 {
@@ -137,18 +158,14 @@ static void heartbeat(struct work_struct *wrk)
goto out;
}
 
-   intel_context_enter(ce);
-   rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN);
-   intel_context_exit(ce);
+   rq = heartbeat_create(ce, GFP_NOWAIT | __GFP_NOWARN);
if (IS_ERR(rq))
goto unlock;
 
-   idle_pulse(engine, rq);
if (engine->i915->params.enable_hangcheck)
engine->heartbeat.systole = i915_request_get(rq);
 
-   __i915_request_commit(rq);
-   __i915_request_queue(rq, &attr);
+   heartbeat_commit(rq, &attr);
 
 unlock:
mutex_unlock(&ce->timeline->mutex);
@@ -220,19 +237,14 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
goto out_rpm;
}
 
-   intel_context_enter(ce);
-   rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN);
-   intel_context_exit(ce);
+   rq = heartbeat_create(ce, GFP_NOWAIT | __GFP_NOWARN);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto out_unlock;
}
 
__set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);
-   idle_pulse(engine, rq);
-
-   __i915_request_commit(rq);
-   __i915_request_queue(rq, &attr);
+   heartbeat_commit(rq, &attr);
GEM_BUG_ON(rq->sched.attr.priority < I915_PRIORITY_BARRIER);
err = 0;
 
@@ -245,8 +257,9 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
 
 int intel_engine_flush_barriers(struct intel_engine_cs *engine)
 {
+   struct intel_context *ce = engine->kernel_context;
struct i915_request *rq;
-   int err = 0;
+   int err;
 
if (llist_empty(&engine->barrier_tasks))
return 0;
@@ -254,15 +267,22 @@ int intel_engine_flush_barriers(struct intel_engine_cs 
*engine)
if (!intel_engine_pm_get_if_awake(engine))
return 0;
 
-   rq = i915_request_create(engine->kernel_context);
+   if (mutex_lock_interruptible(&ce->timeline->mutex)) {
+   err = -EINTR;
+   goto out_rpm;
+   }
+
+   rq = heartbeat_create(ce, GFP_KERNEL);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
-   goto out_rpm;
+   goto out_unlock;
}
 
-   idle_pulse(engine, rq);
-   i915_request_add(rq);
+   heartbeat_commit(rq, NULL);
 
+   err = 0;
+out_unlock:
+   mutex_unlock(&ce->timeline->mutex);
 out_rpm:
intel_engine_pm_put(engine);
return err;
-- 
2.20.1



[Intel-gfx] ✓ Fi.CI.BAT: success for Add generic i915_ggtt ballooning support

2020-08-02 Thread Patchwork
== Series Details ==

Series: Add generic i915_ggtt ballooning support
URL   : https://patchwork.freedesktop.org/series/80177/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8832 -> Patchwork_18294


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/index.html

Known issues


  Here are the changes found in Patchwork_18294 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-u2:  [PASS][1] -> [FAIL][2] ([i915#1888])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-tgl-u2/igt@gem_exec_susp...@basic-s3.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-tgl-u2/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_module_load@reload:
- fi-byt-j1900:   [PASS][3] -> [DMESG-WARN][4] ([i915#1982]) +1 similar 
issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-byt-j1900/igt@i915_module_l...@reload.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-byt-j1900/igt@i915_module_l...@reload.html

  
 Possible fixes 

  * igt@i915_module_load@reload:
- fi-bxt-dsi: [DMESG-WARN][5] ([i915#1635] / [i915#1982]) -> 
[PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bxt-dsi/igt@i915_module_l...@reload.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-bxt-dsi/igt@i915_module_l...@reload.html

  * igt@i915_selftest@live@active:
- {fi-ehl-1}: [DMESG-FAIL][7] ([i915#541]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-ehl-1/igt@i915_selftest@l...@active.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-ehl-1/igt@i915_selftest@l...@active.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
- fi-bsw-kefka:   [DMESG-WARN][9] ([i915#1982]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1:
- fi-icl-u2:  [DMESG-WARN][11] ([i915#1982]) -> [PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html

  
 Warnings 

  * igt@gem_exec_suspend@basic-s3:
- fi-kbl-x1275:   [DMESG-WARN][13] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][14] ([i915#62] / [i915#92]) +2 similar issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@gem_exec_susp...@basic-s3.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-kbl-x1275/igt@gem_exec_susp...@basic-s3.html

  * igt@kms_flip@basic-flip-vs-dpms@a-dp1:
- fi-kbl-x1275:   [DMESG-WARN][15] ([i915#62] / [i915#92]) -> 
[DMESG-WARN][16] ([i915#62] / [i915#92] / [i915#95]) +2 similar issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-d...@a-dp1.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-d...@a-dp1.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (42 -> 35)
--

  Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan 
fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_8832 -> Patchwork_18294

  CI-20190529: 20190529
  CI_DRM_8832: 00b9a4b2331e1dceef8994e3b144e2c5e8c55003 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5757: d78c7fd293cb228fe03ccff730202b033e25ff18 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_18294: c9d01ff1c89f338f52fd60f7261fe51395897a3e @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

c9d01ff1c89f drm/i915/vgt: Move VGT GGTT ballooning nodes to i915_ggtt
ab86294c92b2 drm/i915/ggtt: Add generic i915_ggtt ballooning support

== Logs ==

For more details see: 

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Add generic i915_ggtt ballooning support

2020-08-02 Thread Patchwork
== Series Details ==

Series: Add generic i915_ggtt ballooning support
URL   : https://patchwork.freedesktop.org/series/80177/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.




Re: [Intel-gfx] [PATCH 1/2] drm/i915/ggtt: Add generic i915_ggtt ballooning support

2020-08-02 Thread Chris Wilson
Quoting Michal Wajdeczko (2020-08-02 16:34:09)
> Reserving part of the GGTT for the GuC requires same steps
> as in VGT GGTT ballooning. Add generic GGTT ballooning
> helpers to intel_ggtt.c to avoid code duplication.
> 
> Signed-off-by: Michal Wajdeczko 
> Cc: Xiong Zhang 
> Cc: Chris Wilson 
> Cc: Jani Nikula 
> ---
>  drivers/gpu/drm/i915/gt/intel_ggtt.c | 69 ++--
>  drivers/gpu/drm/i915/gt/intel_gtt.h  |  4 ++
>  drivers/gpu/drm/i915/i915_vgpu.c | 64 +-
>  3 files changed, 70 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 33a3f627ddb1..7001252b4703 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -462,29 +462,17 @@ static void ggtt_unbind_vma(struct i915_address_space 
> *vm, struct i915_vma *vma)
>  
>  static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt)
>  {
> -   u64 size;
> -   int ret;
> -
> if (!intel_uc_uses_guc(&ggtt->vm.gt->uc))
> return 0;
>  
> GEM_BUG_ON(ggtt->vm.total <= GUC_GGTT_TOP);
> -   size = ggtt->vm.total - GUC_GGTT_TOP;
> -
> -   ret = i915_gem_gtt_reserve(&ggtt->vm, &ggtt->uc_fw, size,
> -  GUC_GGTT_TOP, I915_COLOR_UNEVICTABLE,
> -  PIN_NOEVICT);
> -   if (ret)
> -   drm_dbg(&ggtt->vm.i915->drm,
> -   "Failed to reserve top of GGTT for GuC\n");
> -
> -   return ret;
> +   return i915_ggtt_balloon(ggtt, GUC_GGTT_TOP, ggtt->vm.total,
> +&ggtt->uc_fw);

I still don't buy this, this is definitely not "ballooning". And I'm yet
to be convinced that ballooning is a central concept to the i915_ggtt
itself and not a client coordination facility on top.
-Chris


[Intel-gfx] [PATCH 2/2] drm/i915/vgt: Move VGT GGTT ballooning nodes to i915_ggtt

2020-08-02 Thread Michal Wajdeczko
Since VGT ballooning nodes are GGTT specific, we can move them
to i915_ggtt struct close to some other similar nodes. This way
we drop another place in driver that uses static data.

Signed-off-by: Michal Wajdeczko 
Cc: Xiong Zhang 
Cc: Chris Wilson 
Cc: Jani Nikula 
---
 drivers/gpu/drm/i915/gt/intel_gtt.h |  1 +
 drivers/gpu/drm/i915/i915_vgpu.c| 27 ---
 2 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 111306f2f8d6..72780543c9c6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -336,6 +336,7 @@ struct i915_ggtt {
struct mutex error_mutex;
struct drm_mm_node error_capture;
struct drm_mm_node uc_fw;
+   struct drm_mm_node balloon[4];
 };
 
 struct i915_ppgtt {
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index f505142d6dfc..6eb7657fb2ca 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -134,17 +134,6 @@ bool intel_vgpu_has_huge_gtt(struct drm_i915_private 
*dev_priv)
return dev_priv->vgpu.caps & VGT_CAPS_HUGE_GTT;
 }
 
-struct _balloon_info_ {
-   /*
-* There are up to 2 regions per mappable/unmappable graphic
-* memory that might be ballooned. Here, index 0/1 is for mappable
-* graphic memory, 2/3 for unmappable graphic memory.
-*/
-   struct drm_mm_node space[4];
-};
-
-static struct _balloon_info_ bl_info;
-
 /**
  * intel_vgt_deballoon - deballoon reserved graphics address trunks
  * @ggtt: the global GGTT from which we reserved earlier
@@ -163,7 +152,7 @@ void intel_vgt_deballoon(struct i915_ggtt *ggtt)
drm_dbg(_priv->drm, "VGT deballoon.\n");
 
for (i = 0; i < 4; i++)
-   i915_ggtt_deballoon(ggtt, &bl_info.space[i]);
+   i915_ggtt_deballoon(ggtt, &ggtt->balloon[i]);
 }
 
 /**
@@ -253,7 +242,7 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
/* Unmappable graphic memory ballooning */
if (unmappable_base > ggtt->mappable_end) {
ret = i915_ggtt_balloon(ggtt, ggtt->mappable_end,
-   unmappable_base, &bl_info.space[2]);
+   unmappable_base, &ggtt->balloon[2]);
 
if (ret)
goto err;
@@ -261,7 +250,7 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
 
if (unmappable_end < ggtt_end) {
ret = i915_ggtt_balloon(ggtt, unmappable_end, ggtt_end,
-   &bl_info.space[3]);
+   &ggtt->balloon[3]);
if (ret)
goto err_upon_mappable;
}
@@ -269,7 +258,7 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
/* Mappable graphic memory ballooning */
if (mappable_base) {
ret = i915_ggtt_balloon(ggtt, 0, mappable_base,
-   &bl_info.space[0]);
+   &ggtt->balloon[0]);
 
if (ret)
goto err_upon_unmappable;
@@ -277,7 +266,7 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
 
if (mappable_end < ggtt->mappable_end) {
ret = i915_ggtt_balloon(ggtt, mappable_end, ggtt->mappable_end,
-   &bl_info.space[1]);
+   &ggtt->balloon[1]);
 
if (ret)
goto err_below_mappable;
@@ -287,11 +276,11 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
return 0;
 
 err_below_mappable:
-   i915_ggtt_deballoon(ggtt, &bl_info.space[0]);
+   i915_ggtt_deballoon(ggtt, &ggtt->balloon[0]);
 err_upon_unmappable:
-   i915_ggtt_deballoon(ggtt, &bl_info.space[3]);
+   i915_ggtt_deballoon(ggtt, &ggtt->balloon[3]);
 err_upon_mappable:
-   i915_ggtt_deballoon(ggtt, &bl_info.space[2]);
+   i915_ggtt_deballoon(ggtt, &ggtt->balloon[2]);
 err:
drm_err(_priv->drm, "VGT balloon fail\n");
return ret;
-- 
2.27.0



[Intel-gfx] [PATCH 0/2] Add generic i915_ggtt ballooning support

2020-08-02 Thread Michal Wajdeczko
Rebase forgotten series [1]

[1] https://patchwork.freedesktop.org/series/71920/

Michal Wajdeczko (2):
  drm/i915/ggtt: Add generic i915_ggtt ballooning support
  drm/i915/vgt: Move VGT GGTT ballooning nodes to i915_ggtt

 drivers/gpu/drm/i915/gt/intel_ggtt.c | 69 +++--
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  5 ++
 drivers/gpu/drm/i915/i915_vgpu.c | 75 +---
 3 files changed, 71 insertions(+), 78 deletions(-)

-- 
2.27.0



[Intel-gfx] [PATCH 1/2] drm/i915/ggtt: Add generic i915_ggtt ballooning support

2020-08-02 Thread Michal Wajdeczko
Reserving part of the GGTT for the GuC requires same steps
as in VGT GGTT ballooning. Add generic GGTT ballooning
helpers to intel_ggtt.c to avoid code duplication.

Signed-off-by: Michal Wajdeczko 
Cc: Xiong Zhang 
Cc: Chris Wilson 
Cc: Jani Nikula 
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 69 ++--
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  4 ++
 drivers/gpu/drm/i915/i915_vgpu.c | 64 +-
 3 files changed, 70 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 33a3f627ddb1..7001252b4703 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -462,29 +462,17 @@ static void ggtt_unbind_vma(struct i915_address_space 
*vm, struct i915_vma *vma)
 
 static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt)
 {
-   u64 size;
-   int ret;
-
if (!intel_uc_uses_guc(&ggtt->vm.gt->uc))
return 0;
 
GEM_BUG_ON(ggtt->vm.total <= GUC_GGTT_TOP);
-   size = ggtt->vm.total - GUC_GGTT_TOP;
-
-   ret = i915_gem_gtt_reserve(&ggtt->vm, &ggtt->uc_fw, size,
-  GUC_GGTT_TOP, I915_COLOR_UNEVICTABLE,
-  PIN_NOEVICT);
-   if (ret)
-   drm_dbg(&ggtt->vm.i915->drm,
-   "Failed to reserve top of GGTT for GuC\n");
-
-   return ret;
+   return i915_ggtt_balloon(ggtt, GUC_GGTT_TOP, ggtt->vm.total,
+&ggtt->uc_fw);
 }
 
 static void ggtt_release_guc_top(struct i915_ggtt *ggtt)
 {
-   if (drm_mm_node_allocated(&ggtt->uc_fw))
-   drm_mm_remove_node(&ggtt->uc_fw);
+   i915_ggtt_deballoon(ggtt, &ggtt->uc_fw);
 }
 
 static void cleanup_init_ggtt(struct i915_ggtt *ggtt)
@@ -1464,3 +1452,54 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma)
}
return ret;
 }
+
+/**
+ * i915_ggtt_balloon - reserve fixed space in a GGTT
+ * @ggtt: the &struct i915_ggtt
+ * @start: start offset inside the GGTT,
+ *  must be #I915_GTT_MIN_ALIGNMENT aligned
+ * @end: end offset inside the GGTT,
+ *must be #I915_GTT_PAGE_SIZE aligned
+ * @node: the &struct drm_mm_node
+ *
+ * i915_ggtt_balloon() tries to reserve the @node from @start to @end inside
+ * the GGTT address space.
+ *
+ * Returns: 0 on success, -ENOSPC if no suitable hole is found.
+ */
+int i915_ggtt_balloon(struct i915_ggtt *ggtt, u64 start, u64 end,
+ struct drm_mm_node *node)
+{
+   u64 size = end - start;
+   int err;
+
+   GEM_BUG_ON(start >= end);
+   drm_dbg(&ggtt->vm.i915->drm, "%sGGTT [%#llx-%#llx] %lluK\n",
+   "ballooning ", start, end, size / SZ_1K);
+
+   err = i915_gem_gtt_reserve(&ggtt->vm, node, size, start,
+  I915_COLOR_UNEVICTABLE, PIN_NOEVICT);
+   if (unlikely(err)) {
+   drm_err(&ggtt->vm.i915->drm, "%sGGTT [%#llx-%#llx] %lluK\n",
+   "Failed to balloon ", node->start,
+   node->start + node->size, node->size / SZ_1K);
+   return err;
+   }
+
+   ggtt->vm.reserved += node->size;
+   return 0;
+}
+
+void i915_ggtt_deballoon(struct i915_ggtt *ggtt, struct drm_mm_node *node)
+{
+   if (!drm_mm_node_allocated(node))
+   return;
+
+   drm_dbg(&ggtt->vm.i915->drm, "%sGGTT [%#llx-%#llx] %lluK\n",
+   "deballooning ", node->start, node->start + node->size,
+   node->size / SZ_1K);
+
+   GEM_BUG_ON(ggtt->vm.reserved < node->size);
+   ggtt->vm.reserved -= node->size;
+   drm_mm_remove_node(node);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index c13c650ced22..111306f2f8d6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -495,6 +495,10 @@ static inline bool i915_ggtt_has_aperture(const struct 
i915_ggtt *ggtt)
return ggtt->mappable_end > 0;
 }
 
+int i915_ggtt_balloon(struct i915_ggtt *ggtt, u64 start, u64 end,
+ struct drm_mm_node *node);
+void i915_ggtt_deballoon(struct i915_ggtt *ggtt, struct drm_mm_node *node);
+
 int i915_ppgtt_init_hw(struct intel_gt *gt);
 
 struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index 70fca72f5162..f505142d6dfc 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -145,23 +145,6 @@ struct _balloon_info_ {
 
 static struct _balloon_info_ bl_info;
 
-static void vgt_deballoon_space(struct i915_ggtt *ggtt,
-   struct drm_mm_node *node)
-{
-   struct drm_i915_private *dev_priv = ggtt->vm.i915;
-   if (!drm_mm_node_allocated(node))
-   return;
-
-   drm_dbg(_priv->drm,
-   "deballoon space: range [0x%llx - 0x%llx] %llu KiB.\n",
-   node->start,
-   node->start + node->size,
-   node->size / 1024);
-
-   

Re: [Intel-gfx] [PATCH v4] drm/kmb: Add support for KeemBay Display

2020-08-02 Thread Sam Ravnborg
Hi Anitha.

On Thu, Jul 30, 2020 at 01:44:44PM -0700, Anitha Chrisanthus wrote:
> This is a basic KMS atomic modesetting display driver for KeemBay family of
> SOCs. Driver has no 2D or 3D graphics. It calls into the ADV bridge
> driver at the connector level.
> 
> Single CRTC with LCD controller->mipi DSI-> ADV bridge
> 
> Only 1080p resolution and single plane is supported at this time.
> 
> v2: moved extern to .h, removed license text
> use drm_dev_init, upclassed dev_private, removed HAVE_IRQ.
> 
> v3: Squashed all 59 commits to one
> 
> v4: review changes from Sam Ravnborg
>   renamed dev_p to kmb
>   moved clocks under kmb_clock, consolidated clk initializations
>   use drmm functions
>   use DRM_GEM_CMA_DRIVER_OPS_VMAP
> 

I have found neither the time nor the energy to take a look at v4.
But I applied locally and ran it through checkpatch
with my options:
checkpatch -q --emacs --strict --show-types --codespell --codespellfile 
/usr/lib/python3/dist-packages/codespell_lib/data/dictionary.txt

The options are from the dim script used when maintaining drm-misc-next with
codespell options added.

Please try to reproduce locally and fix relevant warnings.

Sam

-:146: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#146: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:58:
+   kmb_clr_bitmask_lcd(kmb, LCD_INT_ENABLE,
+   LCD_INT_VERT_COMP);

-:173: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#173: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:85:
+   drm_info(dev,
+   "vfp= %d vbp= %d vsyc_len=%d hfp=%d hbp=%d hsync_len=%d\n",

-:194: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#194: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:106:
+   drm_dbg(dev, "%s : %dactive height= %d vbp=%d vfp=%d vsync-w=%d 
h-active=%d h-bp=%d h-fp=%d hysnc-l=%d",
+   __func__, __LINE__,

-:199: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#199: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:111:
+   kmb_write_lcd(kmb, LCD_V_ACTIVEHEIGHT,
+   m->crtc_vdisplay - 1);

-:204: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#204: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:116:
+   kmb_write_lcd(kmb, LCD_H_ACTIVEWIDTH,
+   m->crtc_hdisplay - 1);

-:217: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#217: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:129:
+   kmb_write_lcd(kmb,
+   LCD_V_BACKPORCH_EVEN, vm.vback_porch);

-:219: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#219: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:131:
+   kmb_write_lcd(kmb,
+   LCD_V_FRONTPORCH_EVEN, vm.vfront_porch);

-:413: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#413: FILE: drivers/gpu/drm/kmb/kmb_drv.c:60:
+   drm_err(>drm,
+   "Failed to enable MIPI_ECFG clock: %d\n", ret);

-:420: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#420: FILE: drivers/gpu/drm/kmb/kmb_drv.c:67:
+   drm_err(>drm,
+   "Failed to enable MIPI_CFG clock: %d\n", ret);

-:427: CHECK:LINE_SPACING: Please don't use multiple blank lines
#427: FILE: drivers/gpu/drm/kmb/kmb_drv.c:74:
+
+

-:463: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV)
#463: FILE: drivers/gpu/drm/kmb/kmb_drv.c:110:
+   kmb->sys_clk_mhz = clk_get_rate(kmb_clk.clk_pll0)/100;
 ^

-:470: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#470: FILE: drivers/gpu/drm/kmb/kmb_drv.c:117:
+   drm_err(>drm, "failed to set to clk_lcd to %d\n",
+ KMB_LCD_DEFAULT_CLK);

-:479: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#479: FILE: drivers/gpu/drm/kmb/kmb_drv.c:126:
+   drm_err(>drm, "failed to set to clk_mipi to %d\n",
+ KMB_MIPI_DEFAULT_CLK);

-:506: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#506: FILE: drivers/gpu/drm/kmb/kmb_drv.c:153:
+   drm_err(>drm,
+   "failed to set clk_mipi_cfg to %d\n",

-:511: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#511: FILE: drivers/gpu/drm/kmb/kmb_drv.c:158:
+   drm_info(>drm,
+   "Get clk_mipi_cfg after set = %ld\n", clk);

-:561: CHECK:LINE_SPACING: Please don't use multiple blank lines
#561: FILE: drivers/gpu/drm/kmb/kmb_drv.c:208:
+
+

-:688: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#688: FILE: drivers/gpu/drm/kmb/kmb_drv.c:335:
+   if (status & LCD_INT_EOF) {
+

-:701: CHECK:CAMELCASE: Avoid CamelCase: 
#701: FILE: drivers/gpu/drm/kmb/kmb_drv.c:348:
+  

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Fix wrong return value

2020-08-02 Thread Patchwork
== Series Details ==

Series: drm/i915: Fix wrong return value
URL   : https://patchwork.freedesktop.org/series/80175/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8830_full -> Patchwork_18293_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_18293_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_18293_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_18293_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_eio@in-flight-suspend:
- shard-skl:  [PASS][1] -> [INCOMPLETE][2] +1 similar issue
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-skl9/igt@gem_...@in-flight-suspend.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-skl1/igt@gem_...@in-flight-suspend.html

  
Known issues


  Here are the changes found in Patchwork_18293_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_balancer@bonded-early:
- shard-kbl:  [PASS][3] -> [FAIL][4] ([i915#2079])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-kbl2/igt@gem_exec_balan...@bonded-early.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-kbl2/igt@gem_exec_balan...@bonded-early.html

  * igt@gem_exec_whisper@basic-queues-priority:
- shard-glk:  [PASS][5] -> [DMESG-WARN][6] ([i915#118] / [i915#95])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-glk8/igt@gem_exec_whis...@basic-queues-priority.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-glk2/igt@gem_exec_whis...@basic-queues-priority.html

  * igt@gen9_exec_parse@allowed-single:
- shard-skl:  [PASS][7] -> [DMESG-WARN][8] ([i915#1436] / [i915#716])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-skl1/igt@gen9_exec_pa...@allowed-single.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-skl1/igt@gen9_exec_pa...@allowed-single.html

  * igt@kms_big_fb@y-tiled-64bpp-rotate-180:
- shard-glk:  [PASS][9] -> [DMESG-FAIL][10] ([i915#118] / [i915#95]) +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-glk9/igt@kms_big...@y-tiled-64bpp-rotate-180.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-glk8/igt@kms_big...@y-tiled-64bpp-rotate-180.html

  * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-wc-ytiled:
- shard-apl:  [PASS][11] -> [DMESG-WARN][12] ([i915#1635] / [i915#1982])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-apl3/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-ytiled.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-apl6/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-ytiled.html

  * igt@kms_flip@2x-flip-vs-expired-vblank@ac-hdmi-a1-hdmi-a2:
- shard-glk:  [PASS][13] -> [FAIL][14] ([i915#79])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-glk5/igt@kms_flip@2x-flip-vs-expired-vbl...@ac-hdmi-a1-hdmi-a2.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-glk7/igt@kms_flip@2x-flip-vs-expired-vbl...@ac-hdmi-a1-hdmi-a2.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1:
- shard-skl:  [PASS][15] -> [FAIL][16] ([i915#79])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-skl1/igt@kms_flip@flip-vs-expired-vblank-interrupti...@a-edp1.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-skl9/igt@kms_flip@flip-vs-expired-vblank-interrupti...@a-edp1.html

  * igt@kms_flip_tiling@flip-changes-tiling:
- shard-skl:  [PASS][17] -> [DMESG-WARN][18] ([i915#1982]) +13 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-skl8/igt@kms_flip_til...@flip-changes-tiling.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-skl3/igt@kms_flip_til...@flip-changes-tiling.html

  * igt@kms_frontbuffer_tracking@fbc-badstride:
- shard-glk:  [PASS][19] -> [DMESG-WARN][20] ([i915#1982]) +2 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-glk6/igt@kms_frontbuffer_track...@fbc-badstride.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-glk4/igt@kms_frontbuffer_track...@fbc-badstride.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-render:
- shard-tglb: [PASS][21] -> [DMESG-WARN][22] ([i915#1982]) +1 similar issue
   [21]: 

Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value

2020-08-02 Thread Tom Rix


On 8/1/20 5:56 AM, Borislav Petkov wrote:
> On Sat, Aug 01, 2020 at 01:24:29PM +0200, Saheed O. Bolarinwa wrote:
>> The return value of pci_read_config_*() may not indicate a device error.
>> However, the value read by these functions is more likely to indicate
>> this kind of error. This presents two overlapping ways of reporting
>> errors and complicates error checking.
> So why isn't the *value check done in the pci_read_config_* functions
> instead of touching gazillion callers?
>
> For example, pci_conf{1,2}_read() could check whether the u32 *value it
> just read depending on the access method, whether that value is ~0 and
> return proper PCIBIOS_ error in that case.
>
> The check you're replicating
>
>   if (val32 == (u32)~0)
>
> everywhere, instead, is just ugly and tests a naked value ~0 which
> doesn't mean anything...
>
I agree, if there is a change, it should be in the pci_read_* functions.

Anything returning void should not fail and likely future users of the proposed 
change will not do the extra checks.

Tom

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong return value

2020-08-02 Thread Andi Shyti
> > > diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> > > index d960d0be5bd2..cc017e3cc9c5 100644
> > > --- a/drivers/gpu/drm/i915/i915_active.c
> > > +++ b/drivers/gpu/drm/i915/i915_active.c
> > > @@ -758,7 +758,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
> > >   intel_engine_mask_t tmp, mask = engine->mask;
> > >   struct llist_node *first = NULL, *last = NULL;
> > >   struct intel_gt *gt = engine->gt;
> > > - int err;
> > > + int err = 0;
> > 
> > you don't need the initialization here.
> 
> But it's close enough that I can munge the patch inline.
> Reviewed-by: Chris Wilson 

sure... you can also remove it before merging it and it might
also need:

Fixes: d8af05ff38ae7 ("drm/i915: Allow sharing the idle-barrier from other kernel requests")

but, yeah...

Reviewed-by: Andi Shyti 

... as well :)

Andi


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix wrong return value

2020-08-02 Thread Patchwork
== Series Details ==

Series: drm/i915: Fix wrong return value
URL   : https://patchwork.freedesktop.org/series/80175/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8830 -> Patchwork_18293


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/index.html

Known issues
------------

  Here are the changes found in Patchwork_18293 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@debugfs_test@read_all_entries:
- fi-bsw-nick:[PASS][1] -> [INCOMPLETE][2] ([i915#1250] / [i915#1436])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-bsw-nick/igt@debugfs_test@read_all_entries.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-bsw-nick/igt@debugfs_test@read_all_entries.html

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-u2:  [PASS][3] -> [FAIL][4] ([i915#1888])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-tgl-u2/igt@gem_exec_susp...@basic-s3.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-tgl-u2/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@active:
- fi-bdw-5557u:   [PASS][5] -> [DMESG-FAIL][6] ([i915#765])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-bdw-5557u/igt@i915_selftest@l...@active.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-bdw-5557u/igt@i915_selftest@l...@active.html

  * igt@i915_selftest@live@gem_contexts:
- fi-tgl-u2:  [PASS][7] -> [INCOMPLETE][8] ([i915#2045])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html

  * igt@i915_selftest@live@gt_lrc:
- fi-tgl-u2:  [PASS][9] -> [DMESG-FAIL][10] ([i915#1233])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html

  * igt@kms_chamelium@hdmi-crc-fast:
- fi-kbl-7500u:   [PASS][11] -> [FAIL][12] ([i915#1372])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-kbl-7500u/igt@kms_chamel...@hdmi-crc-fast.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-kbl-7500u/igt@kms_chamel...@hdmi-crc-fast.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
- fi-bsw-kefka:   [PASS][13] -> [DMESG-WARN][14] ([i915#1982]) +1 similar issue
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@b-edp1:
- fi-icl-u2:  [PASS][15] -> [DMESG-WARN][16] ([i915#1982])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2:
- fi-skl-guc: [PASS][17] -> [DMESG-WARN][18] ([i915#2203])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vbl...@c-hdmi-a2.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vbl...@c-hdmi-a2.html

  
 Possible fixes 

  * igt@i915_selftest@live@execlists:
- fi-icl-y:   [INCOMPLETE][19] -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-icl-y/igt@i915_selftest@l...@execlists.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-icl-y/igt@i915_selftest@l...@execlists.html

  * igt@kms_busy@basic@flip:
- fi-kbl-x1275:   [DMESG-WARN][21] ([i915#62] / [i915#92] / [i915#95]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-kbl-x1275/igt@kms_busy@ba...@flip.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-kbl-x1275/igt@kms_busy@ba...@flip.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-7500u:   [DMESG-WARN][23] ([i915#2203]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html

  
 Warnings 

  * igt@gem_exec_suspend@basic-s0:
- fi-kbl-x1275:   [DMESG-WARN][25] ([i915#62] / [i915#92]) -> [DMESG-WARN][26] ([i915#62] / [i915#92] / [i915#95]) +4 similar issues
  

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: Fix wrong return value

2020-08-02 Thread Patchwork
== Series Details ==

Series: drm/i915: Fix wrong return value
URL   : https://patchwork.freedesktop.org/series/80175/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.




[Intel-gfx] [CI] drm/i915: Fix wrong return value

2020-08-02 Thread Chris Wilson
From: Tianjia Zhang 

In function i915_active_acquire_preallocate_barrier(), not all
paths have the return value set correctly, and in case of memory
allocation failure, a negative error code should be returned.

Cc: Chris Wilson 
Signed-off-by: Tianjia Zhang 
Reviewed-by: Chris Wilson 
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_active.c            | 7 ++-----
 drivers/gpu/drm/i915/selftests/i915_request.c | 2 +-
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index 3ee5e02c3e17..b0a6522be3d1 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -889,7 +889,6 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
intel_engine_mask_t tmp, mask = engine->mask;
struct llist_node *first = NULL, *last = NULL;
struct intel_gt *gt = engine->gt;
-   int err;
 
GEM_BUG_ON(i915_active_is_idle(ref));
 
@@ -914,10 +913,8 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
rcu_read_unlock();
if (!node) {
node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL);
-   if (!node) {
-   err = ENOMEM;
+   if (!node)
goto unwind;
-   }
 
RCU_INIT_POINTER(node->base.fence, NULL);
node->base.cb.func = node_retire;
@@ -965,7 +962,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
 
kmem_cache_free(global.slab_cache, node);
}
-   return err;
+   return -ENOMEM;
 }
 
 void i915_active_acquire_barrier(struct i915_active *ref)
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index 57dd6f5122ee..c1dcd4b91bda 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -331,7 +331,7 @@ static int __igt_breadcrumbs_smoketest(void *arg)
if (!wait) {
i915_sw_fence_commit(submit);
heap_fence_put(submit);
-   err = ENOMEM;
+   err = -ENOMEM;
break;
}
 
-- 
2.20.1



Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong return value

2020-08-02 Thread Chris Wilson
Quoting Andi Shyti (2020-08-02 12:40:44)
> Hi Tianjia,
> 
> > diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> > index d960d0be5bd2..cc017e3cc9c5 100644
> > --- a/drivers/gpu/drm/i915/i915_active.c
> > +++ b/drivers/gpu/drm/i915/i915_active.c
> > @@ -758,7 +758,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
> >   intel_engine_mask_t tmp, mask = engine->mask;
> >   struct llist_node *first = NULL, *last = NULL;
> >   struct intel_gt *gt = engine->gt;
> > - int err;
> > + int err = 0;
> 
> you don't need the initialization here.

But it's close enough that I can munge the patch inline.
Reviewed-by: Chris Wilson 
-Chris


Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong return value

2020-08-02 Thread Andi Shyti
Hi Tianjia,

> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> index d960d0be5bd2..cc017e3cc9c5 100644
> --- a/drivers/gpu/drm/i915/i915_active.c
> +++ b/drivers/gpu/drm/i915/i915_active.c
> @@ -758,7 +758,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
>   intel_engine_mask_t tmp, mask = engine->mask;
>   struct llist_node *first = NULL, *last = NULL;
>   struct intel_gt *gt = engine->gt;
> - int err;
> + int err = 0;

you don't need the initialization here.

Thanks,
Andi


Re: [Intel-gfx] [PATCH v5 00/16] acpi/pwm/i915: Convert pwm-crc and i915 driver's PWM code to use the atomic PWM API

2020-08-02 Thread Andy Shevchenko
On Sat, Aug 01, 2020 at 04:38:16PM +0200, Hans de Goede wrote:
> On 7/29/20 12:54 PM, Andy Shevchenko wrote:
> > On Fri, Jul 17, 2020 at 03:37:37PM +0200, Hans de Goede wrote:

...

> > One comment to consider, though. There are three channels in that PWM AFAIU.
> > One of them is backlight control, another one can be attached to haptics. The
> > concern is how this series may (or may not?) affect haptics behaviour.
> 
> When you say "in that PWM" do you mean the LPSS one or the CRC one ?

CRC one. I have read it from PMIC spec, that's why the question.

> The CRC PWM driver patches do make it honor the requested output frequency,
> where before, because of a bug, it would stick with the out frequency
> setup by the firmware (or the power-on-reset value if no value is set
> by the firmware). This fix causing this is very clearly correct, but
> it could have unexpected side-effects.

-- 
With Best Regards,
Andy Shevchenko

