[Intel-gfx] ✓ Fi.CI.BAT: success for Introduce drm scaling filter property (rev7)
== Series Details == Series: Introduce drm scaling filter property (rev7) URL : https://patchwork.freedesktop.org/series/73883/ State : success == Summary == CI Bug Log - changes from CI_DRM_8832 -> Patchwork_18296 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/index.html Known issues Here are the changes found in Patchwork_18296 that come from known issues: ### IGT changes ### Issues hit * igt@kms_cursor_legacy@basic-flip-after-cursor-atomic: - fi-icl-u2: [PASS][1] -> [DMESG-WARN][2] ([i915#1982]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-u2/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-icl-u2/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html Possible fixes * igt@i915_module_load@reload: - fi-bxt-dsi: [DMESG-WARN][3] ([i915#1635] / [i915#1982]) -> [PASS][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bxt-dsi/igt@i915_module_l...@reload.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-bxt-dsi/igt@i915_module_l...@reload.html * igt@i915_selftest@live@active: - {fi-ehl-1}: [DMESG-FAIL][5] ([i915#541]) -> [PASS][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-ehl-1/igt@i915_selftest@l...@active.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-ehl-1/igt@i915_selftest@l...@active.html * igt@i915_selftest@live@execlists: - fi-icl-y: [INCOMPLETE][7] -> [PASS][8] [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-y/igt@i915_selftest@l...@execlists.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-icl-y/igt@i915_selftest@l...@execlists.html * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic: - fi-bsw-n3050: [DMESG-WARN][9] ([i915#1982]) -> [PASS][10] [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bsw-n3050/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-bsw-n3050/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html - fi-bsw-kefka: [DMESG-WARN][11] ([i915#1982]) -> [PASS][12] +1 similar issue [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1: - fi-icl-u2: [DMESG-WARN][13] ([i915#1982]) -> [PASS][14] [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html Warnings * igt@gem_exec_suspend@basic-s0: - fi-kbl-x1275: [DMESG-WARN][15] ([i915#62] / [i915#92]) -> [DMESG-WARN][16] ([i915#1982] / [i915#62] / [i915#92]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@gem_exec_susp...@basic-s0.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-kbl-x1275/igt@gem_exec_susp...@basic-s0.html * igt@kms_cursor_legacy@basic-flip-after-cursor-legacy: - fi-kbl-x1275: [DMESG-WARN][17] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][18] ([i915#62] / [i915#92]) +2 similar issues [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@kms_cursor_leg...@basic-flip-after-cursor-legacy.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-kbl-x1275/igt@kms_cursor_leg...@basic-flip-after-cursor-legacy.html * igt@kms_force_connector_basic@force-edid: - fi-kbl-x1275: [DMESG-WARN][19] ([i915#62] / [i915#92]) -> [DMESG-WARN][20] ([i915#62] / [i915#92] / [i915#95]) +3 similar issues [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@kms_force_connector_ba...@force-edid.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18296/fi-kbl-x1275/igt@kms_force_connector_ba...@force-edid.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635 [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982 [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541 [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579 [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62 [i915#665]: https://gitlab.freedesktop.org/drm/intel/issues/665
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Introduce drm scaling filter property (rev7)
== Series Details == Series: Introduce drm scaling filter property (rev7) URL : https://patchwork.freedesktop.org/series/73883/ State : warning == Summary == $ dim sparse --fast origin/drm-tip Sparse version: v0.6.0 Fast mode used, each commit won't be checked separately. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Introduce drm scaling filter property (rev7)
== Series Details == Series: Introduce drm scaling filter property (rev7) URL : https://patchwork.freedesktop.org/series/73883/ State : warning == Summary == $ dim checkpatch origin/drm-tip e4e6d491d898 drm: Introduce plane and CRTC scaling filter properties aef69ee0f157 drm/drm-kms.rst: Add plane and CRTC scaling filter property documentation 1e38228e0f26 drm/i915: Introduce scaling filter related registers and bit fields. -:70: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'id' - possible side-effects? #70: FILE: drivers/gpu/drm/i915/i915_reg.h:7514: +#define CNL_PS_COEF_INDEX_SET(pipe, id, set) _MMIO_PIPE(pipe,\ + _ID(id, _PS_COEF_SET0_INDEX_1A, _PS_COEF_SET0_INDEX_2A) + (set) * 8, \ + _ID(id, _PS_COEF_SET0_INDEX_1B, _PS_COEF_SET0_INDEX_2B) + (set) * 8) -:70: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'set' - possible side-effects? #70: FILE: drivers/gpu/drm/i915/i915_reg.h:7514: +#define CNL_PS_COEF_INDEX_SET(pipe, id, set) _MMIO_PIPE(pipe,\ + _ID(id, _PS_COEF_SET0_INDEX_1A, _PS_COEF_SET0_INDEX_2A) + (set) * 8, \ + _ID(id, _PS_COEF_SET0_INDEX_1B, _PS_COEF_SET0_INDEX_2B) + (set) * 8) -:74: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'id' - possible side-effects? #74: FILE: drivers/gpu/drm/i915/i915_reg.h:7518: +#define CNL_PS_COEF_DATA_SET(pipe, id, set) _MMIO_PIPE(pipe, \ + _ID(id, _PS_COEF_SET0_DATA_1A, _PS_COEF_SET0_DATA_2A) + (set) * 8, \ + _ID(id, _PS_COEF_SET0_DATA_1B, _PS_COEF_SET0_DATA_2B) + (set) * 8) -:74: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'set' - possible side-effects? #74: FILE: drivers/gpu/drm/i915/i915_reg.h:7518: +#define CNL_PS_COEF_DATA_SET(pipe, id, set) _MMIO_PIPE(pipe, \ + _ID(id, _PS_COEF_SET0_DATA_1A, _PS_COEF_SET0_DATA_2A) + (set) * 8, \ + _ID(id, _PS_COEF_SET0_DATA_1B, _PS_COEF_SET0_DATA_2B) + (set) * 8) total: 0 errors, 0 warnings, 4 checks, 47 lines checked 8eff2169f553 drm/i915/display: Add Nearest-neighbor based integer scaling support c7a29cfbc920 drm/i915: Enable scaling filter for plane and CRTC -:95: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #95: FILE: drivers/gpu/drm/i915/display/intel_display.c:16824: + drm_crtc_create_scaling_filter_property(>base, + BIT(DRM_SCALING_FILTER_DEFAULT) | -:155: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #155: FILE: drivers/gpu/drm/i915/display/intel_sprite.c:3172: + drm_plane_create_scaling_filter_property(>base, + BIT(DRM_SCALING_FILTER_DEFAULT) | total: 0 errors, 0 warnings, 2 checks, 104 lines checked ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v5 5/5] drm/i915: Enable scaling filter for plane and CRTC
GEN >= 10 hardware supports the programmable scaler filter. Attach scaling filter property for CRTC and plane for GEN >= 10 hardwares and program scaler filter based on the selected filter type. changes since v3: * None changes since v2: * Use updated functions * Add ps_ctrl var to contain the full PS_CTRL register value (Ville) * Duplicate the scaling filter in crtc and plane hw state (Ville) changes since v1: * None Changes since RFC: * Enable properties for GEN >= 10 platforms (Ville) * Do not round off the crtc co-ordinate (Danial Stone, Ville) * Add new functions to handle scaling filter setup (Ville) * Remove coefficient set 0 hardcoding. Signed-off-by: Shashank Sharma Signed-off-by: Ankit Nautiyal Signed-off-by: Pankaj Bharadiya --- .../gpu/drm/i915/display/intel_atomic_plane.c | 1 + drivers/gpu/drm/i915/display/intel_display.c | 18 -- .../gpu/drm/i915/display/intel_display_types.h | 2 ++ drivers/gpu/drm/i915/display/intel_sprite.c| 15 +-- 4 files changed, 32 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_atomic_plane.c b/drivers/gpu/drm/i915/display/intel_atomic_plane.c index 79032701873a..415d41b21915 100644 --- a/drivers/gpu/drm/i915/display/intel_atomic_plane.c +++ b/drivers/gpu/drm/i915/display/intel_atomic_plane.c @@ -262,6 +262,7 @@ void intel_plane_copy_uapi_to_hw_state(struct intel_plane_state *plane_state, plane_state->hw.rotation = from_plane_state->uapi.rotation; plane_state->hw.color_encoding = from_plane_state->uapi.color_encoding; plane_state->hw.color_range = from_plane_state->uapi.color_range; + plane_state->hw.scaling_filter = from_plane_state->uapi.scaling_filter; } void intel_plane_set_invisible(struct intel_crtc_state *crtc_state, diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 388999404e05..507932099b8d 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -6352,6 +6352,7 @@ static void skl_pfit_enable(const struct intel_crtc_state *crtc_state) int hscale, vscale; unsigned long irqflags; int id; + u32 ps_ctrl; if (!crtc_state->pch_pfit.enabled) return; @@ -6368,10 +6369,16 @@ static void skl_pfit_enable(const struct intel_crtc_state *crtc_state) id = scaler_state->scaler_id; + ps_ctrl = skl_scaler_get_filter_select(crtc_state->hw.scaling_filter, 0); + ps_ctrl |= PS_SCALER_EN | scaler_state->scalers[id].mode; + spin_lock_irqsave(_priv->uncore.lock, irqflags); - intel_de_write_fw(dev_priv, SKL_PS_CTRL(pipe, id), PS_SCALER_EN | - PS_FILTER_MEDIUM | scaler_state->scalers[id].mode); + skl_scaler_setup_filter(dev_priv, pipe, id, 0, + crtc_state->hw.scaling_filter); + + intel_de_write_fw(dev_priv, SKL_PS_CTRL(pipe, id), ps_ctrl); + intel_de_write_fw(dev_priv, SKL_PS_VPHASE(pipe, id), PS_Y_PHASE(0) | PS_UV_RGB_PHASE(uv_rgb_vphase)); intel_de_write_fw(dev_priv, SKL_PS_HPHASE(pipe, id), @@ -13332,6 +13339,7 @@ intel_crtc_copy_uapi_to_hw_state(struct intel_crtc_state *crtc_state) crtc_state->hw.active = crtc_state->uapi.active; crtc_state->hw.mode = crtc_state->uapi.mode; crtc_state->hw.adjusted_mode = crtc_state->uapi.adjusted_mode; + crtc_state->hw.scaling_filter = crtc_state->uapi.scaling_filter; intel_crtc_copy_uapi_to_hw_state_nomodeset(crtc_state); } @@ -13343,6 +13351,7 @@ static void intel_crtc_copy_hw_to_uapi_state(struct intel_crtc_state *crtc_state drm_atomic_set_mode_for_crtc(_state->uapi, _state->hw.mode) < 0); crtc_state->uapi.adjusted_mode = crtc_state->hw.adjusted_mode; + crtc_state->uapi.scaling_filter = crtc_state->hw.scaling_filter; /* copy color blobs to uapi */ drm_property_replace_blob(_state->uapi.degamma_lut, @@ -16810,6 +16819,11 @@ static int intel_crtc_init(struct drm_i915_private *dev_priv, enum pipe pipe) dev_priv->plane_to_crtc_mapping[i9xx_plane] = crtc; } + if (INTEL_GEN(dev_priv) >= 10) + drm_crtc_create_scaling_filter_property(>base, + BIT(DRM_SCALING_FILTER_DEFAULT) | + BIT(DRM_SCALING_FILTER_NEAREST_NEIGHBOR)); + intel_color_init(crtc); intel_crtc_crc_init(crtc); diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h index f581260e8dbf..670ab317134b 100644 --- a/drivers/gpu/drm/i915/display/intel_display_types.h +++ b/drivers/gpu/drm/i915/display/intel_display_types.h @@ -518,6 +518,7 @@ struct intel_plane_state { unsigned int rotation; enum drm_color_encoding
[Intel-gfx] [PATCH v5 4/5] drm/i915/display: Add Nearest-neighbor based integer scaling support
Integer scaling (IS) is a nearest-neighbor upscaling technique that simply scales up the existing pixels by an integer (i.e., whole number) multiplier.Nearest-neighbor (NN) interpolation works by filling in the missing color values in the upscaled image with that of the coordinate-mapped nearest source pixel value. Both IS and NN preserve the clarity of the original image. Integer scaling is particularly useful for pixel art games that rely on sharp, blocky images to deliver their distinctive look. Introduce functions to configure the scaler filter coefficients to enable nearest-neighbor filtering. Bspec: 49247 changes since v3: * None changes since v2: * Move APIs from 5/5 into this patch. * Change filter programming related function names to cnl_*, move filter select bits related code into inline function (Ville) changes since v1: * Rearrange skl_scaler_setup_nearest_neighbor_filter() to iterate the registers directly instead of the phases and taps (Ville) changes since RFC: * Refine the skl_scaler_setup_nearest_neighbor_filter() logic (Ville) Signed-off-by: Shashank Sharma Signed-off-by: Ankit Nautiyal Signed-off-by: Pankaj Bharadiya --- drivers/gpu/drm/i915/display/intel_display.c | 99 drivers/gpu/drm/i915/display/intel_display.h | 4 + 2 files changed, 103 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index db2a5a1a9b35..388999404e05 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -6233,6 +6233,105 @@ void skl_scaler_disable(const struct intel_crtc_state *old_crtc_state) skl_detach_scaler(crtc, i); } +static int cnl_coef_tap(int i) +{ + return i % 7; +} + +static u16 cnl_nearest_filter_coef(int t) +{ + return t == 3 ? 0x0800 : 0x3000; +} + +/** + * Theory behind setting nearest-neighbor integer scaling: + * + * 17 phase of 7 taps requires 119 coefficients in 60 dwords per set. + * The letter represents the filter tap (D is the center tap) and the number + * represents the coefficient set for a phase (0-16). + * + * ++++ + * |Index value | Data value coeffient 1 | Data value coeffient 2 | + * ++++ + * | 00h | B0| A0| + * ++++ + * | 01h | D0| C0| + * ++++ + * | 02h | F0| E0| + * ++++ + * | 03h | A1| G0| + * ++++ + * | 04h | C1| B1| + * ++++ + * | ... | ... | ... | + * ++++ + * | 38h | B16 | A16 | + * ++++ + * | 39h | D16 | C16 | + * ++++ + * | 3Ah | F16 | C16 | + * ++++ + * | 3Bh |Reserved| G16 | + * ++++ + * + * To enable nearest-neighbor scaling: program scaler coefficents with + * the center tap (Dxx) values set to 1 and all other values set to 0 as per + * SCALER_COEFFICIENT_FORMAT + * + */ + +static void cnl_program_nearest_filter_coefs(struct drm_i915_private *dev_priv, +enum pipe pipe, int id, int set) +{ + int i; + + intel_de_write_fw(dev_priv, CNL_PS_COEF_INDEX_SET(pipe, id, set), + PS_COEE_INDEX_AUTO_INC); + + for (i = 0; i < 17 * 7; i += 2) { + u32 tmp; + int t; + + t = cnl_coef_tap(i); + tmp = cnl_nearest_filter_coef(t); + + t = cnl_coef_tap(i + 1); + tmp |= cnl_nearest_filter_coef(t) << 16; + + intel_de_write_fw(dev_priv, CNL_PS_COEF_DATA_SET(pipe, id, set), + tmp); + } + + intel_de_write_fw(dev_priv, CNL_PS_COEF_INDEX_SET(pipe, id, set), 0); +} + +inline u32 skl_scaler_get_filter_select(enum
[Intel-gfx] [PATCH v5 2/5] drm/drm-kms.rst: Add plane and CRTC scaling filter property documentation
Add documentation for newly introduced KMS plane and CRTC scaling filter properties. changes since v3: * None changes since v1: * None changes since RFC: * Add separate documentation for plane and CRTC. Signed-off-by: Pankaj Bharadiya --- Documentation/gpu/drm-kms.rst | 12 1 file changed, 12 insertions(+) diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst index 3c5ae4f6dfd2..8e4031afbb1b 100644 --- a/Documentation/gpu/drm-kms.rst +++ b/Documentation/gpu/drm-kms.rst @@ -518,6 +518,18 @@ Variable Refresh Properties .. kernel-doc:: drivers/gpu/drm/drm_connector.c :doc: Variable refresh properties +Plane Scaling Filter Property +--- + +.. kernel-doc:: drivers/gpu/drm/drm_plane.c + :doc: Plane scaling filter property + +CRTC Scaling Filter Property +--- + +.. kernel-doc:: drivers/gpu/drm/drm_crtc.c + :doc: CRTC scaling filter property + Existing KMS Properties --- -- 2.23.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v5 1/5] drm: Introduce plane and CRTC scaling filter properties
Introduce per-plane and per-CRTC scaling filter properties to allow userspace to select the driver's default scaling filter or Nearest-neighbor(NN) filter for upscaling operations on CRTC and plane. Drivers can set up this property for a plane by calling drm_plane_create_scaling_filter() and for a CRTC by calling drm_crtc_create_scaling_filter(). NN filter works by filling in the missing color values in the upscaled image with that of the coordinate-mapped nearest source pixel value. NN filter for integer multiple scaling can be particularly useful for for pixel art games that rely on sharp, blocky images to deliver their distinctive look. changes since v3: * Refactor code, add new function for common code (Ville) changes since v2: * Create per-plane and per-CRTC scaling filter property (Ville) changes since v1: * None changes since RFC: * Add separate properties for plane and CRTC (Ville) Signed-off-by: Pankaj Bharadiya --- drivers/gpu/drm/drm_atomic_uapi.c | 8 +++ drivers/gpu/drm/drm_crtc.c | 48 +++ drivers/gpu/drm/drm_crtc_internal.h | 3 + drivers/gpu/drm/drm_plane.c | 90 + include/drm/drm_crtc.h | 16 + include/drm/drm_plane.h | 21 +++ 6 files changed, 186 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c index 25c269bc4681..ef82009035e6 100644 --- a/drivers/gpu/drm/drm_atomic_uapi.c +++ b/drivers/gpu/drm/drm_atomic_uapi.c @@ -469,6 +469,8 @@ static int drm_atomic_crtc_set_property(struct drm_crtc *crtc, return -EFAULT; set_out_fence_for_crtc(state->state, crtc, fence_ptr); + } else if (property == crtc->scaling_filter_property) { + state->scaling_filter = val; } else if (crtc->funcs->atomic_set_property) { return crtc->funcs->atomic_set_property(crtc, state, property, val); } else { @@ -503,6 +505,8 @@ drm_atomic_crtc_get_property(struct drm_crtc *crtc, *val = (state->gamma_lut) ? state->gamma_lut->base.id : 0; else if (property == config->prop_out_fence_ptr) *val = 0; + else if (property == crtc->scaling_filter_property) + *val = state->scaling_filter; else if (crtc->funcs->atomic_get_property) return crtc->funcs->atomic_get_property(crtc, state, property, val); else @@ -585,6 +589,8 @@ static int drm_atomic_plane_set_property(struct drm_plane *plane, sizeof(struct drm_rect), ); return ret; + } else if (property == plane->scaling_filter_property) { + state->scaling_filter = val; } else if (plane->funcs->atomic_set_property) { return plane->funcs->atomic_set_property(plane, state, property, val); @@ -643,6 +649,8 @@ drm_atomic_plane_get_property(struct drm_plane *plane, } else if (property == config->prop_fb_damage_clips) { *val = (state->fb_damage_clips) ? state->fb_damage_clips->base.id : 0; + } else if (property == plane->scaling_filter_property) { + *val = state->scaling_filter; } else if (plane->funcs->atomic_get_property) { return plane->funcs->atomic_get_property(plane, state, property, val); } else { diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 283bcc4362ca..70f5cd9704ba 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -776,3 +776,51 @@ int drm_mode_crtc_set_obj_prop(struct drm_mode_object *obj, return ret; } + +/** + * DOC: CRTC scaling filter property + * + * SCALING_FILTER: + * + * Indicates scaling filter to be used for CRTC scaler + * + * The value of this property can be one of the following: + * Default: + * Driver's default scaling filter + * Nearest Neighbor: + * Nearest Neighbor scaling filter + * + * Drivers can set up this property for a CRTC by calling + * drm_crtc_create_scaling_filter_property + */ + +/** + * drm_crtc_create_scaling_filter_property - create a new scaling filter + * property + * + * @crtc: drm CRTC + * @supported_filters: bitmask of supported scaling filters, must include + *BIT(DRM_SCALING_FILTER_DEFAULT). + * + * This function lets driver to enable the scaling filter property on a given + * CRTC. + * + * RETURNS: + * Zero for success or -errno + */ +int drm_crtc_create_scaling_filter_property(struct drm_crtc *crtc, + unsigned int supported_filters) +{ + struct drm_property *prop = + drm_create_scaling_filter_prop(crtc->dev, supported_filters); + + if (IS_ERR(prop)) + return PTR_ERR(prop); + + drm_object_attach_property(>base, prop, +
[Intel-gfx] [PATCH v5 3/5] drm/i915: Introduce scaling filter related registers and bit fields.
Introduce scaler registers and bit fields needed to configure the scaling filter in prgrammed mode and configure scaling filter coefficients. changes since v3: * None changes since v2: * Change macro names to CNL_* and use +(set)*8 instead of adding another trip through _PICK_EVEN (Ville). changes since v1: * None changes since RFC: * Parametrize scaler coeffient macros by 'set' (Ville) Signed-off-by: Shashank Sharma Signed-off-by: Ankit Nautiyal Signed-off-by: Pankaj Bharadiya --- drivers/gpu/drm/i915/i915_reg.h | 22 ++ 1 file changed, 22 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 5eae593ee784..e582021cc208 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -7391,6 +7391,7 @@ enum { #define PS_PLANE_SEL(plane) (((plane) + 1) << 25) #define PS_FILTER_MASK (3 << 23) #define PS_FILTER_MEDIUM (0 << 23) +#define PS_FILTER_PROGRAMMED (1 << 23) #define PS_FILTER_EDGE_ENHANCE (2 << 23) #define PS_FILTER_BILINEAR (3 << 23) #define PS_VERT3TAP(1 << 21) @@ -7405,6 +7406,10 @@ enum { #define PS_VADAPT_MODE_MOST_ADAPT (3 << 5) #define PS_PLANE_Y_SEL_MASK (7 << 5) #define PS_PLANE_Y_SEL(plane) (((plane) + 1) << 5) +#define PS_Y_VERT_FILTER_SELECT(set) ((set) << 4) +#define PS_Y_HORZ_FILTER_SELECT(set) ((set) << 3) +#define PS_UV_VERT_FILTER_SELECT(set) ((set) << 2) +#define PS_UV_HORZ_FILTER_SELECT(set) ((set) << 1) #define _PS_PWR_GATE_1A 0x68160 #define _PS_PWR_GATE_2A 0x68260 @@ -7467,6 +7472,17 @@ enum { #define _PS_ECC_STAT_2B 0x68AD0 #define _PS_ECC_STAT_1C 0x691D0 +#define _PS_COEF_SET0_INDEX_1A0x68198 +#define _PS_COEF_SET0_INDEX_2A0x68298 +#define _PS_COEF_SET0_INDEX_1B0x68998 +#define _PS_COEF_SET0_INDEX_2B0x68A98 +#define PS_COEE_INDEX_AUTO_INC(1 << 10) + +#define _PS_COEF_SET0_DATA_1A 0x6819C +#define _PS_COEF_SET0_DATA_2A 0x6829C +#define _PS_COEF_SET0_DATA_1B 0x6899C +#define _PS_COEF_SET0_DATA_2B 0x68A9C + #define _ID(id, a, b) _PICK_EVEN(id, a, b) #define SKL_PS_CTRL(pipe, id) _MMIO_PIPE(pipe,\ _ID(id, _PS_1A_CTRL, _PS_2A_CTRL), \ @@ -7495,7 +7511,13 @@ enum { #define SKL_PS_ECC_STAT(pipe, id) _MMIO_PIPE(pipe, \ _ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A), \ _ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B)) +#define CNL_PS_COEF_INDEX_SET(pipe, id, set) _MMIO_PIPE(pipe,\ + _ID(id, _PS_COEF_SET0_INDEX_1A, _PS_COEF_SET0_INDEX_2A) + (set) * 8, \ + _ID(id, _PS_COEF_SET0_INDEX_1B, _PS_COEF_SET0_INDEX_2B) + (set) * 8) +#define CNL_PS_COEF_DATA_SET(pipe, id, set) _MMIO_PIPE(pipe, \ + _ID(id, _PS_COEF_SET0_DATA_1A, _PS_COEF_SET0_DATA_2A) + (set) * 8, \ + _ID(id, _PS_COEF_SET0_DATA_1B, _PS_COEF_SET0_DATA_2B) + (set) * 8) /* legacy palette */ #define _LGC_PALETTE_A 0x4a000 #define _LGC_PALETTE_B 0x4a800 -- 2.23.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v5 0/5] Introduce drm scaling filter property
Earlier, I kept this series on hold since we wanted to have a reference userspace implementation in place. Now, Sameer has implemented Integer scaling in Kodi Retro gaming framework which demonstrate how Integer scaling gives distinctive look to pixel art games when played on higher resolution monitors. Kodi patches are almost reviewed and closer to merge now. Here is the userspace patch series link: https://github.com/xbmc/xbmc/pull/18194 Background on Integer scaling: Integer scaling (IS) is a nearest-neighbor upscaling technique that simply scales up the existing pixels by an integer (i.e., whole number) multiplier. Nearest-neighbor (NN) interpolation works by filling in the missing color values in the upscaled image with that of the coordinate-mapped nearest source pixel value. Both IS and NN preserve the clarity of the original image. In contrast, traditional upscaling algorithms, such as bilinear or bicubic interpolation, result in blurry upscaled images because they employ interpolation techniques that smooth out the transition from one pixel to another. Therefore, integer scaling is particularly useful for pixel art games that rely on sharp, blocky images to deliver their distinctive look. Many gaming communities have been asking for integer-mode scaling support, some links and background: https://software.intel.com/en-us/articles/integer-scaling-support-on-intel-graphics http://tanalin.com/en/articles/lossless-scaling/ https://community.amd.com/thread/209107 https://www.nvidia.com/en-us/geforce/forums/game-ready-drivers/13/1002/feature-request-nonblurry-upscaling-at-integer-rat/ * Changes in v5: - Rebase to latest drm-tip. Pankaj Bharadiya (5): drm: Introduce plane and CRTC scaling filter properties drm/drm-kms.rst: Add plane and CRTC scaling filter property documentation drm/i915: Introduce scaling filter related registers and bit fields. drm/i915/display: Add Nearest-neighbor based integer scaling support drm/i915: Enable scaling filter for plane and CRTC Documentation/gpu/drm-kms.rst | 12 ++ drivers/gpu/drm/drm_atomic_uapi.c | 8 ++ drivers/gpu/drm/drm_crtc.c| 48 +++ drivers/gpu/drm/drm_crtc_internal.h | 3 + drivers/gpu/drm/drm_plane.c | 90 ++ .../gpu/drm/i915/display/intel_atomic_plane.c | 1 + drivers/gpu/drm/i915/display/intel_display.c | 117 +- drivers/gpu/drm/i915/display/intel_display.h | 4 + .../drm/i915/display/intel_display_types.h| 2 + drivers/gpu/drm/i915/display/intel_sprite.c | 15 ++- drivers/gpu/drm/i915/i915_reg.h | 22 include/drm/drm_crtc.h| 16 +++ include/drm/drm_plane.h | 21 13 files changed, 355 insertions(+), 4 deletions(-) -- 2.23.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [01/42] drm/i915: Fix wrong return value
== Series Details == Series: series starting with [01/42] drm/i915: Fix wrong return value URL : https://patchwork.freedesktop.org/series/80179/ State : failure == Summary == CI Bug Log - changes from CI_DRM_8832_full -> Patchwork_18295_full Summary --- **FAILURE** Serious unknown changes coming with Patchwork_18295_full absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_18295_full, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_18295_full: ### IGT changes ### Possible regressions * igt@gem_ctx_exec@basic-nohangcheck: - shard-tglb: [PASS][1] -> [FAIL][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-tglb7/igt@gem_ctx_e...@basic-nohangcheck.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-tglb2/igt@gem_ctx_e...@basic-nohangcheck.html * igt@gem_ctx_ringsize@active@bcs0: - shard-skl: [PASS][3] -> [INCOMPLETE][4] +2 similar issues [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl6/igt@gem_ctx_ringsize@act...@bcs0.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-skl9/igt@gem_ctx_ringsize@act...@bcs0.html * igt@gem_exec_parallel@fds: - shard-tglb: NOTRUN -> [INCOMPLETE][5] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-tglb1/igt@gem_exec_paral...@fds.html * igt@gem_exec_whisper@basic-contexts-forked-all: - shard-glk: [PASS][6] -> [INCOMPLETE][7] [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-glk9/igt@gem_exec_whis...@basic-contexts-forked-all.html [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-glk2/igt@gem_exec_whis...@basic-contexts-forked-all.html * igt@kms_mmap_write_crc@main: - shard-glk: [PASS][8] -> [FAIL][9] [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-glk2/igt@kms_mmap_write_...@main.html [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-glk5/igt@kms_mmap_write_...@main.html New tests - New tests have been introduced between CI_DRM_8832_full and Patchwork_18295_full: ### New IGT tests (3) ### * igt@i915_selftest@live@scheduler: - Statuses : 7 pass(s) - Exec time: [0.45, 3.40] s * igt@i915_selftest@mock@scheduler: - Statuses : 7 pass(s) - Exec time: [0.11, 1.00] s * igt@i915_selftest@perf@scheduler: - Statuses : 7 pass(s) - Exec time: [0.44, 16.03] s Known issues Here are the changes found in Patchwork_18295_full that come from known issues: ### IGT changes ### Issues hit * igt@gem_ctx_isolation@nonpriv-switch@rcs0: - shard-apl: [PASS][10] -> [INCOMPLETE][11] ([i915#1635]) +1 similar issue [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-apl3/igt@gem_ctx_isolation@nonpriv-swi...@rcs0.html [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-apl6/igt@gem_ctx_isolation@nonpriv-swi...@rcs0.html * igt@gem_ctx_persistence@legacy-engines-mixed-process@render: - shard-skl: [PASS][12] -> [FAIL][13] ([i915#1528]) [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl9/igt@gem_ctx_persistence@legacy-engines-mixed-proc...@render.html [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-skl7/igt@gem_ctx_persistence@legacy-engines-mixed-proc...@render.html * igt@gem_exec_whisper@basic-contexts-forked-all: - shard-kbl: [PASS][14] -> [INCOMPLETE][15] ([i915#794]) [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-kbl2/igt@gem_exec_whis...@basic-contexts-forked-all.html [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-kbl6/igt@gem_exec_whis...@basic-contexts-forked-all.html * igt@i915_selftest@mock@contexts: - shard-skl: [PASS][16] -> [INCOMPLETE][17] ([i915#198]) [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl2/igt@i915_selftest@m...@contexts.html [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-skl10/igt@i915_selftest@m...@contexts.html * igt@kms_big_fb@y-tiled-64bpp-rotate-0: - shard-iclb: [PASS][18] -> [DMESG-WARN][19] ([i915#1982]) [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-iclb5/igt@kms_big...@y-tiled-64bpp-rotate-0.html [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/shard-iclb3/igt@kms_big...@y-tiled-64bpp-rotate-0.html * igt@kms_cursor_crc@pipe-c-cursor-suspend: - shard-glk: [PASS][20] -> [FAIL][21] ([i915#54]) [20]:
Re: [Intel-gfx] [PATCH v5 06/16] pwm: lpss: Use pwm_lpss_apply() when restoring state on resume
Hi, On 7/29/20 10:12 AM, Andy Shevchenko wrote: On Tue, Jul 28, 2020 at 09:55:22PM +0200, Hans de Goede wrote: On 7/28/20 8:57 PM, Andy Shevchenko wrote: On Fri, Jul 17, 2020 at 03:37:43PM +0200, Hans de Goede wrote: ... Maybe I'm too picky, but I would go even further and split apply to two versions static int pwm_lpss_apply_on_resume(struct pwm_chip *chip, struct pwm_device *pwm, const struct pwm_state *state) { struct pwm_lpss_chip *lpwm = to_lpwm(chip); if (state->enabled) return pwm_lpss_prepare_enable(lpwm, pwm, state, !pwm_is_enabled(pwm)); if (pwm_is_enabled(pwm)) { pwm_lpss_write(pwm, pwm_lpss_read(pwm) & ~PWM_ENABLE); return 0; } and another one for !from_resume. It is a bit picky :) But that is actually not a bad idea, although I would write it like this for more symmetry with the normal (not on_resume) apply version, while at it I also renamed the function: /* * This is a mirror of pwm_lpss_apply() without pm_runtime reference handling * for restoring the PWM state on resume. */ static int pwm_lpss_restore_state(struct pwm_chip *chip, struct pwm_device *pwm, const struct pwm_state *state) { struct pwm_lpss_chip *lpwm = to_lpwm(chip); int ret = 0; if (state->enabled) ret = pwm_lpss_prepare_enable(lpwm, pwm, state, !pwm_is_enabled(pwm)); else if (pwm_is_enabled(pwm)) pwm_lpss_write(pwm, pwm_lpss_read(pwm) & ~PWM_ENABLE); return ret; } Would that work for you? Yes. Ok, I've added the suggested/discussed helper in my personal tree. Is it ok if I add your Reviewed-by with that change in place. This is the last unreviewed bit, so I would rather not respin the series just for this (there will be one more respin when I rebase it on 5.9-rc1). If you want to check out what the patch looks like now, the new version from my personal tree is here: https://github.com/jwrdegoede/linux-sunxi/commit/e4869830d88bb8cb8251718e0086ac189abc0f56 Regards, Hans ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value
On Sun, Aug 02, 2020 at 02:14:06PM -0500, Bjorn Helgaas wrote: > Wait, I'm not convinced yet. I know that if a PCI read fails, you > normally get ~0 data because the host bridge fabricates it to complete > the CPU load. > > But what guarantees that a PCI config register cannot contain ~0? Well, I don't think you can differentiate that case, right? I guess this is where the driver knowledge comes into play: if the read returns ~0, the pci_read_config* should probably return in that case something like: PCIBIOS_READ_MAYBE_FAILED to denote it is all 1s and then the caller should be able to determine, based on any of domain:bus:slot.func and whatever else the driver knows about its hardware, whether the 1s are a valid value or an error. Hopefully. Or something better of which I cannot think of right now... -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v5 00/16] acpi/pwm/i915: Convert pwm-crc and i915 driver's PWM code to use the atomic PWM API
Hi, On 8/2/20 1:25 PM, Andy Shevchenko wrote: On Sat, Aug 01, 2020 at 04:38:16PM +0200, Hans de Goede wrote: On 7/29/20 12:54 PM, Andy Shevchenko wrote: On Fri, Jul 17, 2020 at 03:37:37PM +0200, Hans de Goede wrote: ... One comment to consider, though. There are three channels in that PWM AFAIU. One of them is backlight control, another one can be attached to haptics. The concern is how this series may (or may not?) affect haptics behaviour. When you say "in that PWM" do you mean the LPSS one or the CRC one ? CRC one. I have read it from PMIC spec, that's why the question. Ah I see, well the kernel driver only implements support for 1 PWM output, the one which is used for the backlight brighness control. So this series should not affect haptics behavior, since it looks like the haptic functionality is not supported in the mainline kernel at all. And I'm also not aware of any tablets with a CRC PMIC which have (non working) haptic support. Regards, Hans ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.IGT: failure for Add generic i915_ggtt ballooning support
== Series Details == Series: Add generic i915_ggtt ballooning support URL : https://patchwork.freedesktop.org/series/80177/ State : failure == Summary == CI Bug Log - changes from CI_DRM_8832_full -> Patchwork_18294_full Summary --- **FAILURE** Serious unknown changes coming with Patchwork_18294_full absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_18294_full, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_18294_full: ### IGT changes ### Possible regressions * igt@gem_mmap_gtt@fault-concurrent: - shard-snb: [PASS][1] -> [DMESG-WARN][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-snb6/igt@gem_mmap_...@fault-concurrent.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-snb4/igt@gem_mmap_...@fault-concurrent.html * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes: - shard-skl: [PASS][3] -> [INCOMPLETE][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl6/igt@kms_pl...@plane-panning-bottom-right-suspend-pipe-b-planes.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl5/igt@kms_pl...@plane-panning-bottom-right-suspend-pipe-b-planes.html * igt@runner@aborted: - shard-snb: NOTRUN -> [FAIL][5] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-snb4/igt@run...@aborted.html Known issues Here are the changes found in Patchwork_18294_full that come from known issues: ### IGT changes ### Issues hit * igt@gem_huc_copy@huc-copy: - shard-tglb: [PASS][6] -> [SKIP][7] ([i915#2190]) [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-tglb1/igt@gem_huc_c...@huc-copy.html [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-tglb6/igt@gem_huc_c...@huc-copy.html * igt@i915_selftest@mock@contexts: - shard-skl: [PASS][8] -> [INCOMPLETE][9] ([i915#198]) [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl2/igt@i915_selftest@m...@contexts.html [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl5/igt@i915_selftest@m...@contexts.html * igt@kms_cursor_edge_walk@pipe-c-256x256-bottom-edge: - shard-skl: [PASS][10] -> [DMESG-WARN][11] ([i915#1982]) +7 similar issues [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl6/igt@kms_cursor_edge_w...@pipe-c-256x256-bottom-edge.html [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl5/igt@kms_cursor_edge_w...@pipe-c-256x256-bottom-edge.html * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-wc-ytiled: - shard-apl: [PASS][12] -> [DMESG-WARN][13] ([i915#1635] / [i915#1982]) [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-apl3/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-ytiled.html [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-apl6/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-ytiled.html * igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-cpu: - shard-tglb: [PASS][14] -> [DMESG-WARN][15] ([i915#1982]) [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-tglb7/igt@kms_frontbuffer_track...@psr-1p-offscren-pri-shrfb-draw-mmap-cpu.html [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-tglb5/igt@kms_frontbuffer_track...@psr-1p-offscren-pri-shrfb-draw-mmap-cpu.html * igt@kms_hdr@bpc-switch-suspend: - shard-kbl: [PASS][16] -> [DMESG-WARN][17] ([i915#180]) +8 similar issues [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-kbl2/igt@kms_...@bpc-switch-suspend.html [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-kbl2/igt@kms_...@bpc-switch-suspend.html * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b: - shard-skl: [PASS][18] -> [INCOMPLETE][19] ([CI#80]) [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl1/igt@kms_pipe_crc_ba...@suspend-read-crc-pipe-b.html [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl10/igt@kms_pipe_crc_ba...@suspend-read-crc-pipe-b.html * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc: - shard-skl: [PASS][20] -> [FAIL][21] ([fdo#108145] / [i915#265]) [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/shard-skl7/igt@kms_plane_alpha_bl...@pipe-b-coverage-7efc.html [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/shard-skl3/igt@kms_plane_alpha_bl...@pipe-b-coverage-7efc.html * igt@kms_psr@psr2_cursor_mmap_cpu: - shard-iclb:
Re: [Intel-gfx] Time, where did it go?
Quoting Dave Airlie (2020-08-02 18:56:44) > On Mon, 3 Aug 2020 at 02:44, Chris Wilson wrote: > > > > Lots of small incremental improvements to reduce execution latency > > which basically offsets the small regressions incurred when compared to > > 5.7. And then there are some major fixes found while staring agape at > > lockstat. > > What introduced the 5.7 regressions? are they documented somewhere. No. There's a 5.8-rc1 bisect (to the merge but not into rc1) for something in the core causing perf fluctuations, but I have not yet reproduced that one to bisect into the rc1 merge. [The system that showed the issue has historically seen strong swings from p-state setup, might be that again?]. This is from measuring simulated transcode workloads that we've built up to track KPI. That we can then compare against the real workloads run by other groups. > What is the goal here, is there a benchmark or application that this > benefits that you can quantify the benefits? Entirely motivated by not wanting to have to explain why there's even a 1% regression in their client metrics. They wouldn't even notice for a few releases by which point the problem is likely compounded and we suddenly have crisis meetings. > Is the lack of userspace command submission a problem vs other vendors here? If you mean HW scheduling (which is the bit that we are most in dire need of for replacing this series), not really, our closest equivalent has not yet proven itself, at least in previous incarnations, adequate to their requirements. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value
On Sun, Aug 02, 2020 at 08:46:48PM +0200, Borislav Petkov wrote: > On Sun, Aug 02, 2020 at 07:28:00PM +0200, Saheed Bolarinwa wrote: > > Because the value ~0 has a meaning to some drivers and only > > No, ~0 means that the PCI read failed. For *every* PCI device I know. Wait, I'm not convinced yet. I know that if a PCI read fails, you normally get ~0 data because the host bridge fabricates it to complete the CPU load. But what guarantees that a PCI config register cannot contain ~0? If there's something about that in the spec I'd love to know where it is because it would simplify a lot of things. I don't think we should merge any of these patches as-is. If we *do* want to go this direction, we at least need some kind of macro or function that tests for ~0 so we have a clue about what's happening and can grep for it. Bjorn ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value
On Sun, Aug 02, 2020 at 07:28:00PM +0200, Saheed Bolarinwa wrote: > Because the value ~0 has a meaning to some drivers and only No, ~0 means that the PCI read failed. For *every* PCI device I know. Here's me reading from 0xf0 offset of my hostbridge: # setpci -s 00:00.0 0xf0.l 0100 That device doesn't have extended config space, so the last valid byte is 0xff. Let's read beyond that: # setpci -s 00:00.0 0x100.l > Again, only the drivers can determine if ~0 is a valid value. This > information is not available inside pci_config_read*(). Of course it is. *every* change you've done in 6/17 - this is the only patch I have received - checks for == ~0. So that check can just as well be moved inside pci_config_read_*(). Here's how one could do it: #define PCI_OP_READ(size, type, len) \ int noinline pci_bus_read_config_##size \ (struct pci_bus *bus, unsigned int devfn, int pos, type *value) \ { \ int res;\ unsigned long flags;\ u32 data = 0; \ if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER; \ pci_lock_config(flags); \ res = bus->ops->read(bus, devfn, pos, len, ); \ /* Check we actually read something which is not all 1s.*/ if (data == ~0) return PCIBIOS_READ_FAILED; *value = (type)data;\ pci_unlock_config(flags); \ return res; \ } Also, I'd prefer a function to *not* return void but return either an error or success. In the success case, the @value argument can be consumed by the caller and otherwise not. In any case, that change is a step in the wrong direction and I don't like it, sorry. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Time, where did it go?
On Mon, 3 Aug 2020 at 02:44, Chris Wilson wrote: > > Lots of small incremental improvements to reduce execution latency > which basically offsets the small regressions incurred when compared to > 5.7. And then there are some major fixes found while staring agape at > lockstat. What introduced the 5.7 regressions? are they documented somewhere. What is the goal here, is there a benchmark or application that this benefits that you can quantify the benefits? Is the lack of userspace command submission a problem vs other vendors here? Dave. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [01/42] drm/i915: Fix wrong return value
== Series Details == Series: series starting with [01/42] drm/i915: Fix wrong return value URL : https://patchwork.freedesktop.org/series/80179/ State : success == Summary == CI Bug Log - changes from CI_DRM_8832 -> Patchwork_18295 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/index.html New tests - New tests have been introduced between CI_DRM_8832 and Patchwork_18295: ### New IGT tests (1) ### * igt@i915_selftest@live@scheduler: - Statuses : 33 pass(s) - Exec time: [0.47, 2.95] s Known issues Here are the changes found in Patchwork_18295 that come from known issues: ### IGT changes ### Issues hit * igt@i915_pm_rpm@module-reload: - fi-byt-j1900: [PASS][1] -> [DMESG-WARN][2] ([i915#1982]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-byt-j1900/igt@i915_pm_...@module-reload.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-byt-j1900/igt@i915_pm_...@module-reload.html Possible fixes * igt@i915_module_load@reload: - fi-bxt-dsi: [DMESG-WARN][3] ([i915#1635] / [i915#1982]) -> [PASS][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bxt-dsi/igt@i915_module_l...@reload.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-bxt-dsi/igt@i915_module_l...@reload.html * igt@i915_pm_rpm@basic-pci-d3-state: - fi-bsw-kefka: [DMESG-WARN][5] ([i915#1982]) -> [PASS][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html * igt@i915_selftest@live@active: - {fi-ehl-1}: [DMESG-FAIL][7] ([i915#541]) -> [PASS][8] [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-ehl-1/igt@i915_selftest@l...@active.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-ehl-1/igt@i915_selftest@l...@active.html * igt@i915_selftest@live@gem_contexts: - fi-tgl-u2: [INCOMPLETE][9] ([i915#2045]) -> [PASS][10] [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1: - fi-icl-u2: [DMESG-WARN][11] ([i915#1982]) -> [PASS][12] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html Warnings * igt@i915_module_load@reload: - fi-kbl-x1275: [DMESG-WARN][13] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][14] ([i915#62] / [i915#92]) +1 similar issue [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@i915_module_l...@reload.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-kbl-x1275/igt@i915_module_l...@reload.html * igt@i915_pm_rpm@basic-rte: - fi-kbl-guc: [SKIP][15] ([fdo#109271]) -> [DMESG-FAIL][16] ([i915#2203]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-guc/igt@i915_pm_...@basic-rte.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-kbl-guc/igt@i915_pm_...@basic-rte.html * igt@i915_selftest@live@execlists: - fi-icl-y: [INCOMPLETE][17] -> [INCOMPLETE][18] ([i915#1684]) [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-y/igt@i915_selftest@l...@execlists.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-icl-y/igt@i915_selftest@l...@execlists.html * igt@kms_force_connector_basic@prune-stale-modes: - fi-kbl-x1275: [DMESG-WARN][19] ([i915#62] / [i915#92]) -> [DMESG-WARN][20] ([i915#62] / [i915#92] / [i915#95]) +7 similar issues [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@kms_force_connector_ba...@prune-stale-modes.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18295/fi-kbl-x1275/igt@kms_force_connector_ba...@prune-stale-modes.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635 [i915#1684]: https://gitlab.freedesktop.org/drm/intel/issues/1684 [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982 [i915#2045]: https://gitlab.freedesktop.org/drm/intel/issues/2045 [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/42] drm/i915: Fix wrong return value
== Series Details == Series: series starting with [01/42] drm/i915: Fix wrong return value URL : https://patchwork.freedesktop.org/series/80179/ State : warning == Summary == $ dim sparse --fast origin/drm-tip Sparse version: v0.6.0 Fast mode used, each commit won't be checked separately. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/42] drm/i915: Fix wrong return value
== Series Details == Series: series starting with [01/42] drm/i915: Fix wrong return value URL : https://patchwork.freedesktop.org/series/80179/ State : warning == Summary == $ dim checkpatch origin/drm-tip 641fe3760e7f drm/i915/gem: Don't drop the timeline lock during execbuf 1eace725c392 drm/i915/gem: Reduce context termination list iteration guard to RCU -:20: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line) #20: References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests -:20: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("")' - ie: 'commit d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early waits")' #20: References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests total: 1 errors, 1 warnings, 0 checks, 65 lines checked f418a1a9323c drm/i915/gt: Protect context lifetime with RCU ed19011a3244 drm/i915/gt: Free stale request on destroying the virtual engine b7deae3dd990 drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock 9dfbca235567 drm/i915/gt: Split the breadcrumb spinlock between global and contexts -:299: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment #299: FILE: drivers/gpu/drm/i915/gt/intel_context_types.h:54: + spinlock_t signal_lock; total: 0 errors, 0 warnings, 1 checks, 260 lines checked d64649d5b06d drm/i915: Drop i915_request.lock serialisation around await_start 5b3023be5688 drm/i915: Drop i915_request.lock requirement for intel_rps_boost() 4cc1911d1e27 drm/i915/gem: Reduce ctx->engine_mutex for reading the clone source a5c7155455bf drm/i915/gem: Reduce ctx->engines_mutex for get_engines() 84b6ff2b8884 drm/i915: Reduce test_and_set_bit to set_bit in i915_request_submit() 0fcdf7a5cdb8 drm/i915/gt: Decouple completed requests on unwind fa84ed47c393 drm/i915/gt: Check for a completed last request once 61ede9a148ba drm/i915/gt: Refactor heartbeat request construction and submission 570c67a58ab7 drm/i915/gt: Replace direct submit with direct call to tasklet 769442507dcd drm/i915/gt: Use virtual_engine during execlists_dequeue 122fabf291d6 drm/i915/gt: Decouple inflight virtual engines 515575064cb7 drm/i915/gt: Defer schedule_out until after the next dequeue 4adeb0f9dbfe drm/i915/gt: Resubmit the virtual engine on schedule-out 4f73843fd816 drm/i915/gt: Simplify virtual engine handling for execlists_hold() 3d5f28c98ebe drm/i915/gt: ce->inflight updates are now serialised 6cb4a83d25f2 drm/i915/gt: Drop atomic for engine->fw_active tracking a28b1bd35d4e drm/i915/gt: Extract busy-stats for ring-scheduler -:12: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating? #12: new file mode 100644 total: 0 errors, 1 warnings, 0 checks, 95 lines checked 323ff915d375 drm/i915/gt: Convert stats.active to plain unsigned int 9bdc1f8119ab drm/i915: Lift waiter/signaler iterators b1e01ae8f345 drm/i915: Strip out internal priorities fc2d75819d99 drm/i915: Remove I915_USER_PRIORITY_SHIFT 51c3bc9b7f8f drm/i915/gt: Defer the kmem_cache_free() until after the HW submit 8b5974e86d02 drm/i915: Prune empty priolists 508fbfb2fa86 drm/i915: Replace engine->schedule() with a known request operation acef80fd74c7 drm/i915/gt: Do not suspend bonded requests if one hangs c472b142c7b9 drm/i915: Teach the i915_dependency to use a double-lock cdad85c1538f drm/i915: Restructure priority inheritance fba4ba812a76 drm/i915/selftests: Measure set-priority duration -:52: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating? #52: new file mode 100644 total: 0 errors, 1 warnings, 0 checks, 553 lines checked 521577d83d5b drm/i915: Improve DFS for priority inheritance d61ada4e881a drm/i915/gt: Remove timeslice suppression 713774df6186 drm/i915: Fair low-latency scheduling 341457ff92bf drm/i915/gt: Specify a deadline for the heartbeat a32ab75c19ba drm/i915: Replace the priority boosting for the display with a deadline a445af000bcb drm/i915: Move saturated workload detection back to the context -:29: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line) #29: References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global") -:29: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("")' - ie: 'commit 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")' #29: References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global") total: 1 errors, 1 warnings, 0 checks, 69 lines checked 87199b330ee5 drm/i915/gt: Another tweak for flushing the tasklets ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 33/42] drm/i915: Teach the i915_dependency to use a double-lock
Currently, we construct and teardown the i915_dependency chains using a global spinlock. As the lists are entirely local, it should be possible to use an double-lock with an explicit nesting [signaler -> waiter, always] and so avoid the costly convenience of a global spinlock. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 6 +-- drivers/gpu/drm/i915/i915_request.c | 2 +- drivers/gpu/drm/i915/i915_scheduler.c | 44 + drivers/gpu/drm/i915/i915_scheduler.h | 2 +- drivers/gpu/drm/i915/i915_scheduler_types.h | 1 + 5 files changed, 34 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 3999bf6c6aee..cb53f350848f 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1847,7 +1847,7 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl) struct i915_request *w = container_of(p->waiter, typeof(*w), sched); - if (p->flags & I915_DEPENDENCY_WEAK) + if (!p->waiter || p->flags & I915_DEPENDENCY_WEAK) continue; /* Leave semaphores spinning on the other engines */ @@ -2700,7 +2700,7 @@ static void __execlists_hold(struct i915_request *rq) struct i915_request *w = container_of(p->waiter, typeof(*w), sched); - if (p->flags & I915_DEPENDENCY_WEAK) + if (!p->waiter || p->flags & I915_DEPENDENCY_WEAK) continue; /* Leave semaphores spinning on the other engines */ @@ -2798,7 +2798,7 @@ static void __execlists_unhold(struct i915_request *rq) struct i915_request *w = container_of(p->waiter, typeof(*w), sched); - if (p->flags & I915_DEPENDENCY_WEAK) + if (!p->waiter || p->flags & I915_DEPENDENCY_WEAK) continue; /* Propagate any change in error status */ diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 8a263556997c..3f9e3e6a34ed 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -329,7 +329,7 @@ bool i915_request_retire(struct i915_request *rq) intel_context_unpin(rq->context); free_capture_list(rq); - i915_sched_node_fini(>sched); + i915_sched_node_retire(>sched); i915_request_put(rq); return true; diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 4760d7c9b7cd..247d2e2bb3c8 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -389,6 +389,8 @@ void i915_request_set_priority(struct i915_request *rq, int prio) void i915_sched_node_init(struct i915_sched_node *node) { + spin_lock_init(>lock); + INIT_LIST_HEAD(>signalers_list); INIT_LIST_HEAD(>waiters_list); INIT_LIST_HEAD(>link); @@ -426,7 +428,8 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node, { bool ret = false; - spin_lock_irq(_lock); + /* The signal->lock is always the outer lock in this double-lock. */ + spin_lock_irq(>lock); if (!node_signaled(signal)) { INIT_LIST_HEAD(>dfs_link); @@ -435,15 +438,17 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node, dep->flags = flags; /* All set, now publish. Beware the lockless walkers. */ + spin_lock_nested(>lock, SINGLE_DEPTH_NESTING); list_add_rcu(>signal_link, >signalers_list); list_add_rcu(>wait_link, >waiters_list); + spin_unlock(>lock); /* Propagate the chains */ node->flags |= signal->flags; ret = true; } - spin_unlock_irq(_lock); + spin_unlock_irq(>lock); return ret; } @@ -469,39 +474,46 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node, return 0; } -void i915_sched_node_fini(struct i915_sched_node *node) +void i915_sched_node_retire(struct i915_sched_node *node) { struct i915_dependency *dep, *tmp; - spin_lock_irq(_lock); + spin_lock_irq(>lock); /* * Everyone we depended upon (the fences we wait to be signaled) * should retire before us and remove themselves from our list. * However, retirement is run independently on each timeline and -* so we may be called out-of-order. +* so we may be called out-of-order. As we need to avoid taking +* the signaler's lock, just mark up our completion and be wary +
[Intel-gfx] [PATCH 35/42] drm/i915/selftests: Measure set-priority duration
As a topological sort, we expect it to run in linear graph time, O(V+E). In removing the recursion, it is no longer a DFS but rather a BFS, and performs as O(VE). Let's demonstrate how bad this is with a few examples, and build a few test cases to verify a potential fix. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_scheduler.c | 4 + .../drm/i915/selftests/i915_live_selftests.h | 1 + .../drm/i915/selftests/i915_perf_selftests.h | 1 + .../gpu/drm/i915/selftests/i915_scheduler.c | 530 ++ 4 files changed, 536 insertions(+) create mode 100644 drivers/gpu/drm/i915/selftests/i915_scheduler.c diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 45e000c257db..3c1a0b001746 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -565,6 +565,10 @@ void i915_sched_node_retire(struct i915_sched_node *node) spin_unlock_irq(>lock); } +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftests/i915_scheduler.c" +#endif + static void i915_global_scheduler_shrink(void) { kmem_cache_shrink(global.slab_dependencies); diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h index a92c0e9b7e6b..2200a5baa68e 100644 --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h @@ -26,6 +26,7 @@ selftest(gt_mocs, intel_mocs_live_selftests) selftest(gt_pm, intel_gt_pm_live_selftests) selftest(gt_heartbeat, intel_heartbeat_live_selftests) selftest(requests, i915_request_live_selftests) +selftest(scheduler, i915_scheduler_live_selftests) selftest(active, i915_active_live_selftests) selftest(objects, i915_gem_object_live_selftests) selftest(mman, i915_gem_mman_live_selftests) diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h index c2389f8a257d..137e35283fee 100644 --- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h +++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h @@ -17,5 +17,6 @@ */ selftest(engine_cs, intel_engine_cs_perf_selftests) selftest(request, i915_request_perf_selftests) +selftest(scheduler, i915_scheduler_perf_selftests) selftest(blt, i915_gem_object_blt_perf_selftests) selftest(region, intel_memory_region_perf_selftests) diff --git a/drivers/gpu/drm/i915/selftests/i915_scheduler.c b/drivers/gpu/drm/i915/selftests/i915_scheduler.c new file mode 100644 index ..98d29fa8f5f8 --- /dev/null +++ b/drivers/gpu/drm/i915/selftests/i915_scheduler.c @@ -0,0 +1,530 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2020 Intel Corporation + */ + +#include "i915_selftest.h" + +#include "gt/intel_context.h" +#include "gt/selftest_engine_heartbeat.h" +#include "selftests/igt_spinner.h" + +static void scheduling_disable(struct intel_engine_cs *engine) +{ + engine->props.preempt_timeout_ms = 0; + engine->props.timeslice_duration_ms = 0; + + st_engine_heartbeat_disable(engine); +} + +static void scheduling_enable(struct intel_engine_cs *engine) +{ + st_engine_heartbeat_enable(engine); + + engine->props.preempt_timeout_ms = + engine->defaults.preempt_timeout_ms; + engine->props.timeslice_duration_ms = + engine->defaults.timeslice_duration_ms; +} + +static int first_engine(struct drm_i915_private *i915, + int (*chain)(struct intel_engine_cs *engine, +unsigned long param, +void (*fn)(struct i915_request *rq, + unsigned long v, + unsigned long e)), + unsigned long param, + void (*fn)(struct i915_request *rq, + unsigned long v, unsigned long e)) +{ + struct intel_engine_cs *engine; + + for_each_uabi_engine(engine, i915) { + if (!intel_engine_has_scheduler(engine)) + continue; + + return chain(engine, param, fn); + } + + return 0; +} + +static int all_engines(struct drm_i915_private *i915, + int (*chain)(struct intel_engine_cs *engine, + unsigned long param, + void (*fn)(struct i915_request *rq, + unsigned long v, + unsigned long e)), + unsigned long param, + void (*fn)(struct i915_request *rq, + unsigned long v, unsigned long e)) +{ + struct intel_engine_cs *engine; + int err; + + for_each_uabi_engine(engine, i915) { + if (!intel_engine_has_scheduler(engine)) +
[Intel-gfx] [PATCH 10/42] drm/i915/gem: Reduce ctx->engine_mutex for reading the clone source
When cloning the engines from the source context, we need to ensure that the engines are not freed as we copy them, and that the flags we clone from the source correspond with the engines we copy across. To do this we need only take a reference to the src->engines, rather than hold the src->engine_mutex, so long as we verify that nothing changed under the read. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 24 + 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index db893f6c516b..6530bd2e634e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -755,7 +755,8 @@ __create_context(struct drm_i915_private *i915) } static inline struct i915_gem_engines * -__context_engines_await(const struct i915_gem_context *ctx) +__context_engines_await(const struct i915_gem_context *ctx, + bool *user_engines) { struct i915_gem_engines *engines; @@ -764,6 +765,10 @@ __context_engines_await(const struct i915_gem_context *ctx) engines = rcu_dereference(ctx->engines); GEM_BUG_ON(!engines); + if (user_engines) + *user_engines = i915_gem_context_user_engines(ctx); + + /* successful await => strong mb */ if (unlikely(!i915_sw_fence_await(>fence))) continue; @@ -787,7 +792,7 @@ context_apply_all(struct i915_gem_context *ctx, struct intel_context *ce; int err = 0; - e = __context_engines_await(ctx); + e = __context_engines_await(ctx, NULL); for_each_gem_engine(ce, e, it) { err = fn(ce, data); if (err) @@ -1129,7 +1134,7 @@ static int context_barrier_task(struct i915_gem_context *ctx, return err; } - e = __context_engines_await(ctx); + e = __context_engines_await(ctx, NULL); if (!e) { i915_active_release(>base); return -ENOENT; @@ -2126,11 +2131,14 @@ static int copy_ring_size(struct intel_context *dst, static int clone_engines(struct i915_gem_context *dst, struct i915_gem_context *src) { - struct i915_gem_engines *e = i915_gem_context_lock_engines(src); - struct i915_gem_engines *clone; + struct i915_gem_engines *clone, *e; bool user_engines; unsigned long n; + e = __context_engines_await(src, _engines); + if (!e) + return -ENOENT; + clone = alloc_engines(e->num_engines); if (!clone) goto err_unlock; @@ -2172,9 +2180,7 @@ static int clone_engines(struct i915_gem_context *dst, } } clone->num_engines = n; - - user_engines = i915_gem_context_user_engines(src); - i915_gem_context_unlock_engines(src); + i915_sw_fence_complete(>fence); /* Serialised by constructor */ engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1)); @@ -2185,7 +2191,7 @@ static int clone_engines(struct i915_gem_context *dst, return 0; err_unlock: - i915_gem_context_unlock_engines(src); + i915_sw_fence_complete(>fence); return -ENOMEM; } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 06/42] drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock
Make b->signaled_requests a lockless-list so that we can manipulate it outside of the b->irq_lock. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 53 ++- .../gpu/drm/i915/gt/intel_breadcrumbs_types.h | 2 +- drivers/gpu/drm/i915/i915_request.h | 6 ++- 3 files changed, 46 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index d8b206e53660..9e7ac612fabb 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -174,16 +174,13 @@ static void add_retire(struct intel_breadcrumbs *b, struct intel_timeline *tl) intel_engine_add_retire(b->irq_engine, tl); } -static bool __signal_request(struct i915_request *rq, struct list_head *signals) +static bool __signal_request(struct i915_request *rq) { - clear_bit(I915_FENCE_FLAG_SIGNAL, >fence.flags); - if (!__dma_fence_signal(>fence)) { i915_request_put(rq); return false; } - list_add_tail(>signal_link, signals); return true; } @@ -191,17 +188,42 @@ static void signal_irq_work(struct irq_work *work) { struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work); const ktime_t timestamp = ktime_get(); + struct llist_node *signal, *sn; struct intel_context *ce, *cn; struct list_head *pos, *next; - LIST_HEAD(signal); + + signal = NULL; + if (unlikely(!llist_empty(>signaled_requests))) + signal = llist_del_all(>signaled_requests); spin_lock(>irq_lock); - if (list_empty(>signalers)) + /* +* Keep the irq armed until the interrupt after all listeners are gone. +* +* Enabling/disabling the interrupt is rather costly, roughly a couple +* of hundred microseconds. If we are proactive and enable/disable +* the interrupt around every request that wants a breadcrumb, we +* quickly drown in the extra orders of magnitude of latency imposed +* on request submission. +* +* So we try to be lazy, and keep the interrupts enabled until no +* more listeners appear within a breadcrumb interrupt interval (that +* is until a request completes that no one cares about). The +* observation is that listeners come in batches, and will often +* listen to a bunch of requests in succession. +* +* We also try to avoid raising too many interrupts, as they may +* be generated by userspace batches and it is unfortunately rather +* too easy to drown the CPU under a flood of GPU interrupts. Thus +* whenever no one appears to be listening, we turn off the interrupts. +* Fewer interrupts should conserve power -- at the very least, fewer +* interrupt draw less ire from other users of the system and tools +* like powertop. +*/ + if (!signal && list_empty(>signalers)) __intel_breadcrumbs_disarm_irq(b); - list_splice_init(>signaled_requests, ); - list_for_each_entry_safe(ce, cn, >signalers, signal_link) { GEM_BUG_ON(list_empty(>signals)); @@ -218,7 +240,11 @@ static void signal_irq_work(struct irq_work *work) * spinlock as the callback chain may end up adding * more signalers to the same context or engine. */ - __signal_request(rq, ); + clear_bit(I915_FENCE_FLAG_SIGNAL, >fence.flags); + if (__signal_request(rq)) { + rq->signal_node.next = signal; + signal = >signal_node; + } } /* @@ -238,9 +264,9 @@ static void signal_irq_work(struct irq_work *work) spin_unlock(>irq_lock); - list_for_each_safe(pos, next, ) { + llist_for_each_safe(signal, sn, signal) { struct i915_request *rq = - list_entry(pos, typeof(*rq), signal_link); + llist_entry(signal, typeof(*rq), signal_node); struct list_head cb_list; spin_lock(>lock); @@ -264,7 +290,7 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine) spin_lock_init(>irq_lock); INIT_LIST_HEAD(>signalers); - INIT_LIST_HEAD(>signaled_requests); + init_llist_head(>signaled_requests); init_irq_work(>irq_work, signal_irq_work); @@ -327,7 +353,8 @@ static void insert_breadcrumb(struct i915_request *rq, * its signal completion. */ if (__request_completed(rq)) { - if (__signal_request(rq, >signaled_requests)) + if (__signal_request(rq) && + llist_add(>signal_node,
[Intel-gfx] [PATCH 03/42] drm/i915/gem: Reduce context termination list iteration guard to RCU
As we now protect the timeline list using RCU, we can drop the timeline->mutex for guarding the list iteration during context close, as we are searching for an inflight request. Any new request will see the context is banned and not be submitted. In doing so, pull the checks for a concurrent submission of the request (notably the i915_request_completed()) under the engine spinlock, to fully serialise with __i915_request_submit()). That is in the case of preempt-to-busy where the request may be completed during the __i915_request_submit(), we need to be careful that we sample the request status after serialising so that we don't miss the request the engine is actually submitting. Fixes: 4a3174152147 ("drm/i915/gem: Refine occupancy test in kill_context()") References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622 Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 32 - 1 file changed, 19 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index d8cccbab7a51..db893f6c516b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -439,29 +439,36 @@ static bool __cancel_engine(struct intel_engine_cs *engine) return __reset_engine(engine); } -static struct intel_engine_cs *__active_engine(struct i915_request *rq) +static bool +__active_engine(struct i915_request *rq, struct intel_engine_cs **active) { struct intel_engine_cs *engine, *locked; + bool ret = false; /* * Serialise with __i915_request_submit() so that it sees * is-banned?, or we know the request is already inflight. +* +* Note that rq->engine is unstable, and so we double +* check that we have acquired the lock on the final engine. */ locked = READ_ONCE(rq->engine); spin_lock_irq(>active.lock); while (unlikely(locked != (engine = READ_ONCE(rq->engine { spin_unlock(>active.lock); - spin_lock(>active.lock); locked = engine; + spin_lock(>active.lock); } - engine = NULL; - if (i915_request_is_active(rq) && rq->fence.error != -EIO) - engine = rq->engine; + if (!i915_request_completed(rq)) { + if (i915_request_is_active(rq) && rq->fence.error != -EIO) + *active = locked; + ret = true; + } spin_unlock_irq(>active.lock); - return engine; + return ret; } static struct intel_engine_cs *active_engine(struct intel_context *ce) @@ -472,17 +479,16 @@ static struct intel_engine_cs *active_engine(struct intel_context *ce) if (!ce->timeline) return NULL; - mutex_lock(>timeline->mutex); - list_for_each_entry_reverse(rq, >timeline->requests, link) { - if (i915_request_completed(rq)) - break; + rcu_read_lock(); + list_for_each_entry_rcu(rq, >timeline->requests, link) { + if (i915_request_is_active(rq) && i915_request_completed(rq)) + continue; /* Check with the backend if the request is inflight */ - engine = __active_engine(rq); - if (engine) + if (__active_engine(rq, )) break; } - mutex_unlock(>timeline->mutex); + rcu_read_unlock(); return engine; } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Time, where did it go?
Lots of small incremental improvements to reduce execution latency which basically offsets the small regressions incurred when compared to 5.7. And then there are some major fixes found while staring agape at lockstat. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 20/42] drm/i915/gt: Resubmit the virtual engine on schedule-out
Having recognised that we do not change the sibling until we schedule out, we can then defer the decision to resubmit the virtual engine from the unwind of the active queue to scheduling out of the virtual context. By keeping the unwind order intact on the local engine, we can preserve data dependency ordering while doing a preempt-to-busy pass until we have determined the new ELSP. This means that if we try to timeslice between a virtual engine and a data-dependent ordinary request, the pair will maintain their relative ordering and we will avoid the resubmission, cancelling the timeslicing until further change. The dilemma though is that we then may end up in a situation where the 'demotion' of the virtual request to an ordinary request in the engine queue results in filling the ELSP[] with virtual requests instead of spreading the load across the engines. To compensate for this, we mark each virtual request and refuse to resubmit a virtual request in the secondary ELSP slots, thus forcing subsequent virtual requests to be scheduled out after timeslicing. By delaying the decision until we schedule out, we will avoid unnecessary resubmission. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_lrc.c| 124 +++-- drivers/gpu/drm/i915/gt/selftest_lrc.c | 2 +- 2 files changed, 78 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index a9ae0638e117..d1a45d5e4225 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -,39 +,23 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) __i915_request_unsubmit(rq); - /* -* Push the request back into the queue for later resubmission. -* If this request is not native to this physical engine (i.e. -* it came from a virtual source), push it back onto the virtual -* engine so that it can be moved across onto another physical -* engine as load dictates. -*/ - if (likely(rq->execution_mask == engine->mask)) { - GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID); - if (rq_prio(rq) != prio) { - prio = rq_prio(rq); - pl = i915_sched_lookup_priolist(engine, prio); - } - GEM_BUG_ON(RB_EMPTY_ROOT(>execlists.queue.rb_root)); - - list_move(>sched.link, pl); - set_bit(I915_FENCE_FLAG_PQUEUE, >fence.flags); + GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID); + if (rq_prio(rq) != prio) { + prio = rq_prio(rq); + pl = i915_sched_lookup_priolist(engine, prio); + } + GEM_BUG_ON(RB_EMPTY_ROOT(>execlists.queue.rb_root)); - /* Check in case we rollback so far we wrap [size/2] */ - if (intel_ring_direction(rq->ring, -intel_ring_wrap(rq->ring, -rq->tail), -rq->ring->tail) > 0) - rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE; + list_move(>sched.link, pl); + set_bit(I915_FENCE_FLAG_PQUEUE, >fence.flags); - active = rq; - } else { - struct intel_engine_cs *owner = rq->context->engine; + /* Check in case we rollback so far we wrap [size/2] */ + if (intel_ring_direction(rq->ring, +intel_ring_wrap(rq->ring, rq->tail), +rq->ring->tail) > 0) + rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE; - WRITE_ONCE(rq->engine, owner); - owner->submit_request(rq); - active = NULL; - } + active = rq; } return active; @@ -1387,12 +1371,37 @@ static inline void execlists_schedule_in(struct i915_request *rq, int idx) GEM_BUG_ON(intel_context_inflight(ce) != rq->engine); } +static void +resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve) +{ + struct intel_engine_cs *engine = rq->engine; + unsigned long flags; + + spin_lock_irqsave(>active.lock, flags); + + clear_bit(I915_FENCE_FLAG_PQUEUE, >fence.flags); + WRITE_ONCE(rq->engine, >base); + ve->base.submit_request(rq); + + spin_unlock_irqrestore(>active.lock, flags); +} + static void kick_siblings(struct i915_request *rq, struct intel_context *ce) { struct virtual_engine *ve = container_of(ce, typeof(*ve),
[Intel-gfx] [PATCH 29/42] drm/i915/gt: Defer the kmem_cache_free() until after the HW submit
Watching lock_stat, we noticed that the kmem_cache_free() would cause the occasional multi-millisecond spike (directly affecting max-holdtime and so the max-waittime). Delaying our submission of the next ELSP by a millisecond will leave the GPU idle, so defer the kmem_cache_free() until afterwards. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 10 +- drivers/gpu/drm/i915/i915_scheduler.c | 13 + drivers/gpu/drm/i915/i915_scheduler.h | 12 3 files changed, 34 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index e8f6d0a80c8e..7ac864cd57e3 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2029,6 +2029,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) struct i915_request **port = execlists->pending; struct i915_request ** const last_port = port + execlists->port_mask; struct i915_request *last = *execlists->active; + struct list_head *free = NULL; struct virtual_engine *ve; struct rb_node *rb; bool submit = false; @@ -2316,8 +2317,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine) } } + /* Remove the node, but defer the free for later */ rb_erase_cached(>node, >queue); - i915_priolist_free(p); + free = i915_priolist_free_defer(p, free); } done: *port++ = i915_request_get(last); @@ -2369,6 +2371,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine) i915_request_put(*port); *execlists->pending = NULL; } + + /* +* We noticed that kmem_cache_free() may cause 1ms+ latencies, so +* we defer the frees until after we have submitted the ELSP. +*/ + i915_priolist_free_many(free); } static inline void clear_ports(struct i915_request **ports, int count) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index a9973d7a724c..bfbbd94dfcbc 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -126,6 +126,19 @@ void __i915_priolist_free(struct i915_priolist *p) kmem_cache_free(global.slab_priorities, p); } +void i915_priolist_free_many(struct list_head *list) +{ + while (list) { + struct i915_priolist *p; + + p = container_of(list, typeof(*p), requests); + list = p->requests.next; + + GEM_BUG_ON(p->priority == I915_PRIORITY_NORMAL); + kmem_cache_free(global.slab_priorities, p); + } +} + struct sched_cache { struct list_head *priolist; }; diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index b089d5cace1d..d8bf335c5e96 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -46,4 +46,16 @@ static inline void i915_priolist_free(struct i915_priolist *p) __i915_priolist_free(p); } +void i915_priolist_free_many(struct list_head *list); + +static inline struct list_head * +i915_priolist_free_defer(struct i915_priolist *p, struct list_head *free) +{ + if (p->priority != I915_PRIORITY_NORMAL) { + p->requests.next = free; + free = >requests; + } + return free; +} + #endif /* _I915_SCHEDULER_H_ */ -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 11/42] drm/i915/gem: Reduce ctx->engines_mutex for get_engines()
Take a snapshot of the ctx->engines, so we can avoid taking the ctx->engines_mutex for a mere read in get_engines(). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 39 + 1 file changed, 8 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 6530bd2e634e..56700ddbed47 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1874,27 +1874,6 @@ set_engines(struct i915_gem_context *ctx, return 0; } -static struct i915_gem_engines * -__copy_engines(struct i915_gem_engines *e) -{ - struct i915_gem_engines *copy; - unsigned int n; - - copy = alloc_engines(e->num_engines); - if (!copy) - return ERR_PTR(-ENOMEM); - - for (n = 0; n < e->num_engines; n++) { - if (e->engines[n]) - copy->engines[n] = intel_context_get(e->engines[n]); - else - copy->engines[n] = NULL; - } - copy->num_engines = n; - - return copy; -} - static int get_engines(struct i915_gem_context *ctx, struct drm_i915_gem_context_param *args) @@ -1902,19 +1881,17 @@ get_engines(struct i915_gem_context *ctx, struct i915_context_param_engines __user *user; struct i915_gem_engines *e; size_t n, count, size; + bool user_engines; int err = 0; - err = mutex_lock_interruptible(>engines_mutex); - if (err) - return err; + e = __context_engines_await(ctx, _engines); + if (!e) + return -ENOENT; - e = NULL; - if (i915_gem_context_user_engines(ctx)) - e = __copy_engines(i915_gem_context_engines(ctx)); - mutex_unlock(>engines_mutex); - if (IS_ERR_OR_NULL(e)) { + if (!user_engines) { + i915_sw_fence_complete(>fence); args->size = 0; - return PTR_ERR_OR_ZERO(e); + return 0; } count = e->num_engines; @@ -1965,7 +1942,7 @@ get_engines(struct i915_gem_context *ctx, args->size = size; err_free: - free_engines(e); + i915_sw_fence_complete(>fence); return err; } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 24/42] drm/i915/gt: Extract busy-stats for ring-scheduler
Lift the busy-stats context-in/out implementation out of intel_lrc, so that we can reuse it for other scheduler implementations. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_stats.h | 49 drivers/gpu/drm/i915/gt/intel_lrc.c | 34 +- 2 files changed, 50 insertions(+), 33 deletions(-) create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_stats.h diff --git a/drivers/gpu/drm/i915/gt/intel_engine_stats.h b/drivers/gpu/drm/i915/gt/intel_engine_stats.h new file mode 100644 index ..58491eae3482 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_stats.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2020 Intel Corporation + */ + +#ifndef __INTEL_ENGINE_STATS_H__ +#define __INTEL_ENGINE_STATS_H__ + +#include +#include +#include + +#include "i915_gem.h" /* GEM_BUG_ON */ +#include "intel_engine.h" + +static inline void intel_engine_context_in(struct intel_engine_cs *engine) +{ + unsigned long flags; + + if (atomic_add_unless(>stats.active, 1, 0)) + return; + + write_seqlock_irqsave(>stats.lock, flags); + if (!atomic_add_unless(>stats.active, 1, 0)) { + engine->stats.start = ktime_get(); + atomic_inc(>stats.active); + } + write_sequnlock_irqrestore(>stats.lock, flags); +} + +static inline void intel_engine_context_out(struct intel_engine_cs *engine) +{ + unsigned long flags; + + GEM_BUG_ON(!atomic_read(>stats.active)); + + if (atomic_add_unless(>stats.active, -1, 1)) + return; + + write_seqlock_irqsave(>stats.lock, flags); + if (atomic_dec_and_test(>stats.active)) { + engine->stats.total = + ktime_add(engine->stats.total, + ktime_sub(ktime_get(), engine->stats.start)); + } + write_sequnlock_irqrestore(>stats.lock, flags); +} + +#endif /* __INTEL_ENGINE_STATS_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 61dfc932201e..ccf0b43c02cf 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -140,6 +140,7 @@ #include "intel_breadcrumbs.h" #include "intel_context.h" #include "intel_engine_pm.h" +#include "intel_engine_stats.h" #include "intel_gt.h" #include "intel_gt_pm.h" #include "intel_gt_requests.h" @@ -1156,39 +1157,6 @@ execlists_context_status_change(struct i915_request *rq, unsigned long status) status, rq); } -static void intel_engine_context_in(struct intel_engine_cs *engine) -{ - unsigned long flags; - - if (atomic_add_unless(>stats.active, 1, 0)) - return; - - write_seqlock_irqsave(>stats.lock, flags); - if (!atomic_add_unless(>stats.active, 1, 0)) { - engine->stats.start = ktime_get(); - atomic_inc(>stats.active); - } - write_sequnlock_irqrestore(>stats.lock, flags); -} - -static void intel_engine_context_out(struct intel_engine_cs *engine) -{ - unsigned long flags; - - GEM_BUG_ON(!atomic_read(>stats.active)); - - if (atomic_add_unless(>stats.active, -1, 1)) - return; - - write_seqlock_irqsave(>stats.lock, flags); - if (atomic_dec_and_test(>stats.active)) { - engine->stats.total = - ktime_add(engine->stats.total, - ktime_sub(ktime_get(), engine->stats.start)); - } - write_sequnlock_irqrestore(>stats.lock, flags); -} - static void execlists_check_context(const struct intel_context *ce, const struct intel_engine_cs *engine) -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 04/42] drm/i915/gt: Protect context lifetime with RCU
Allow a brief period for continued access to a dead intel_context by deferring the release of the struct until after an RCU grace period. As we are using a dedicated slab cache for the contexts, we can defer the release of the slab pages via RCU, with the caveat that individual structs may be reused from the freelist within an RCU grace period. To handle that, we have to avoid clearing members of the zombie struct. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context.c | 330 +--- drivers/gpu/drm/i915/i915_active.c | 10 + drivers/gpu/drm/i915/i915_active.h | 2 + drivers/gpu/drm/i915/i915_utils.h | 7 + 4 files changed, 202 insertions(+), 147 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 52db2bde44a3..4e7924640ffa 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -22,7 +22,7 @@ static struct i915_global_context { static struct intel_context *intel_context_alloc(void) { - return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL); + return kmem_cache_alloc(global.slab_ce, GFP_KERNEL); } void intel_context_free(struct intel_context *ce) @@ -30,6 +30,177 @@ void intel_context_free(struct intel_context *ce) kmem_cache_free(global.slab_ce, ce); } +static int __context_pin_state(struct i915_vma *vma) +{ + unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS; + int err; + + err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH); + if (err) + return err; + + err = i915_active_acquire(>active); + if (err) + goto err_unpin; + + /* +* And mark it as a globally pinned object to let the shrinker know +* it cannot reclaim the object until we release it. +*/ + i915_vma_make_unshrinkable(vma); + vma->obj->mm.dirty = true; + + return 0; + +err_unpin: + i915_vma_unpin(vma); + return err; +} + +static void __context_unpin_state(struct i915_vma *vma) +{ + i915_vma_make_shrinkable(vma); + i915_active_release(>active); + __i915_vma_unpin(vma); +} + +static int __ring_active(struct intel_ring *ring) +{ + int err; + + err = intel_ring_pin(ring); + if (err) + return err; + + err = i915_active_acquire(>vma->active); + if (err) + goto err_pin; + + return 0; + +err_pin: + intel_ring_unpin(ring); + return err; +} + +static void __ring_retire(struct intel_ring *ring) +{ + i915_active_release(>vma->active); + intel_ring_unpin(ring); +} + +__i915_active_call +static void __intel_context_retire(struct i915_active *active) +{ + struct intel_context *ce = container_of(active, typeof(*ce), active); + + CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n", +intel_context_get_total_runtime_ns(ce), +intel_context_get_avg_runtime_ns(ce)); + + set_bit(CONTEXT_VALID_BIT, >flags); + if (ce->state) + __context_unpin_state(ce->state); + + intel_timeline_unpin(ce->timeline); + __ring_retire(ce->ring); + + intel_context_put(ce); +} + +static int __intel_context_active(struct i915_active *active) +{ + struct intel_context *ce = container_of(active, typeof(*ce), active); + int err; + + CE_TRACE(ce, "active\n"); + + intel_context_get(ce); + + err = __ring_active(ce->ring); + if (err) + goto err_put; + + err = intel_timeline_pin(ce->timeline); + if (err) + goto err_ring; + + if (!ce->state) + return 0; + + err = __context_pin_state(ce->state); + if (err) + goto err_timeline; + + return 0; + +err_timeline: + intel_timeline_unpin(ce->timeline); +err_ring: + __ring_retire(ce->ring); +err_put: + intel_context_put(ce); + return err; +} + +static void __intel_context_ctor(void *arg) +{ + struct intel_context *ce = arg; + + INIT_LIST_HEAD(>signal_link); + INIT_LIST_HEAD(>signals); + + atomic_set(>pin_count, 0); + mutex_init(>pin_mutex); + + ce->active_count = 0; + i915_active_init(>active, +__intel_context_active, __intel_context_retire); + + ce->inflight = NULL; + ce->lrc_reg_state = NULL; + ce->lrc.desc = 0; +} + +static void +__intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine) +{ + GEM_BUG_ON(!engine->cops); + GEM_BUG_ON(!engine->gt->vm); + + kref_init(>ref); + i915_active_reinit(>active); + mutex_reinit(>pin_mutex); + + ce->engine = engine; + ce->ops = engine->cops; + ce->sseu = engine->sseu; + + ce->wa_bb_page = 0; + ce->flags = 0; + ce->tag = 0; + + memset(>runtime, 0, sizeof(ce->runtime)); + +
[Intel-gfx] [PATCH 42/42] drm/i915/gt: Another tweak for flushing the tasklets
tasklet_kill() ensures that we _yield_ the processor until a remote tasklet is completed. However, this leads to a starvation condition as being at the bottom of the scheduler's runqueue means that anything else is able to run, including all hogs keeping the tasklet occupied. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index e3bda3509cd0..5c08c29e58af 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1197,10 +1197,6 @@ void intel_engine_flush_submission(struct intel_engine_cs *engine) if (!t->func) return; - /* Synchronise and wait for the tasklet on another CPU */ - tasklet_kill(t); - - /* Having cancelled the tasklet, ensure that is run */ local_bh_disable(); if (tasklet_trylock(t)) { /* Must wait for any GPU reset in progress. */ @@ -1209,6 +1205,9 @@ void intel_engine_flush_submission(struct intel_engine_cs *engine) tasklet_unlock(t); } local_bh_enable(); + + /* Synchronise and wait for the tasklet on another CPU */ + tasklet_unlock_wait(t); } /** -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 22/42] drm/i915/gt: ce->inflight updates are now serialised
Since schedule-in and schedule-out are now both always under the tasklet bitlock, we can reduce the individual atomic operations to simple instructions and worry less. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 44 + 1 file changed, 19 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 0d072356cf21..e3dccdc53faf 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1333,7 +1333,7 @@ __execlists_schedule_in(struct i915_request *rq) unsigned int tag = ffs(READ_ONCE(engine->context_tag)); GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG); - clear_bit(tag - 1, >context_tag); + __clear_bit(tag - 1, >context_tag); ce->lrc.ccid = tag << (GEN11_SW_CTX_ID_SHIFT - 32); BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID); @@ -1360,13 +1360,10 @@ static inline void execlists_schedule_in(struct i915_request *rq, int idx) GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine)); trace_i915_request_in(rq, idx); - old = READ_ONCE(ce->inflight); - do { - if (!old) { - WRITE_ONCE(ce->inflight, __execlists_schedule_in(rq)); - break; - } - } while (!try_cmpxchg(>inflight, , ptr_inc(old))); + old = ce->inflight; + if (!old) + old = __execlists_schedule_in(rq); + WRITE_ONCE(ce->inflight, ptr_inc(old)); GEM_BUG_ON(intel_context_inflight(ce) != rq->engine); } @@ -1404,12 +1401,11 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce) resubmit_virtual_request(rq, ve); } -static inline void -__execlists_schedule_out(struct i915_request *rq, -struct intel_engine_cs * const engine, -unsigned int ccid) +static inline void __execlists_schedule_out(struct i915_request *rq) { struct intel_context * const ce = rq->context; + struct intel_engine_cs * const engine = rq->engine; + unsigned int ccid; /* * NB process_csb() is not under the engine->active.lock and hence @@ -1417,7 +1413,7 @@ __execlists_schedule_out(struct i915_request *rq, * refrain from doing non-trivial work here. */ - CE_TRACE(ce, "schedule-out, ccid:%x\n", ccid); + CE_TRACE(ce, "schedule-out, ccid:%x\n", ce->lrc.ccid); /* * If we have just completed this context, the engine may now be @@ -1427,12 +1423,13 @@ __execlists_schedule_out(struct i915_request *rq, i915_request_completed(rq)) intel_engine_add_retire(engine, ce->timeline); + ccid = ce->lrc.ccid; ccid >>= GEN11_SW_CTX_ID_SHIFT - 32; ccid &= GEN12_MAX_CONTEXT_HW_ID; if (ccid < BITS_PER_LONG) { GEM_BUG_ON(ccid == 0); GEM_BUG_ON(test_bit(ccid - 1, >context_tag)); - set_bit(ccid - 1, >context_tag); + __set_bit(ccid - 1, >context_tag); } intel_context_update_runtime(ce); @@ -1453,26 +1450,23 @@ __execlists_schedule_out(struct i915_request *rq, */ if (ce->engine != engine) kick_siblings(rq, ce); - - intel_context_put(ce); } static inline void execlists_schedule_out(struct i915_request *rq) { struct intel_context * const ce = rq->context; - struct intel_engine_cs *cur, *old; - u32 ccid; trace_i915_request_out(rq); - ccid = rq->context->lrc.ccid; - old = READ_ONCE(ce->inflight); - do - cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL; - while (!try_cmpxchg(>inflight, , cur)); - if (!cur) - __execlists_schedule_out(rq, old, ccid); + GEM_BUG_ON(!ce->inflight); + ce->inflight = ptr_dec(ce->inflight); + if (!intel_context_inflight_count(ce)) { + GEM_BUG_ON(ce->inflight != rq->engine); + __execlists_schedule_out(rq); + WRITE_ONCE(ce->inflight, NULL); + intel_context_put(ce); + } i915_request_put(rq); } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 21/42] drm/i915/gt: Simplify virtual engine handling for execlists_hold()
Now that the tasklet completely controls scheduling of the requests, and we postpone scheduling out the old requests, we can keep a hanging virtual request bound to the engine on which it hung, and remove it from te queue. On release, it will be returned to the same engine and remain in its queue until it is scheduled; after which point it will become eligible for transfer to a sibling. Instead, we could opt to resubmit the request along the virtual engine on unhold, making it eligible for load balancing immediately -- but that seems like a pointless optimisation for a hanging context. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 29 - 1 file changed, 29 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index d1a45d5e4225..0d072356cf21 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2780,35 +2780,6 @@ static bool execlists_hold(struct intel_engine_cs *engine, goto unlock; } - if (rq->engine != engine) { /* preempted virtual engine */ - struct virtual_engine *ve = to_virtual_engine(rq->engine); - - /* -* intel_context_inflight() is only protected by virtue -* of process_csb() being called only by the tasklet (or -* directly from inside reset while the tasklet is suspended). -* Assert that neither of those are allowed to run while we -* poke at the request queues. -*/ - GEM_BUG_ON(!reset_in_progress(>execlists)); - - /* -* An unsubmitted request along a virtual engine will -* remain on the active (this) engine until we are able -* to process the context switch away (and so mark the -* context as no longer in flight). That cannot have happened -* yet, otherwise we would not be hanging! -*/ - spin_lock(>base.active.lock); - GEM_BUG_ON(intel_context_inflight(rq->context) != engine); - GEM_BUG_ON(ve->request != rq); - ve->request = NULL; - spin_unlock(>base.active.lock); - i915_request_put(rq); - - rq->engine = engine; - } - /* * Transfer this request onto the hold queue to prevent it * being resumbitted to HW (and potentially completed) before we have -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 26/42] drm/i915: Lift waiter/signaler iterators
Lift the list iteration defines for traversing the signaler/waiter lists into i915_scheduler.h for reuse. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 10 -- drivers/gpu/drm/i915/i915_scheduler_types.h | 10 ++ 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index ccf0b43c02cf..3c022e621a38 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1835,16 +1835,6 @@ static void virtual_xfer_context(struct virtual_engine *ve, } } -#define for_each_waiter(p__, rq__) \ - list_for_each_entry_lockless(p__, \ -&(rq__)->sched.waiters_list, \ -wait_link) - -#define for_each_signaler(p__, rq__) \ - list_for_each_entry_rcu(p__, \ - &(rq__)->sched.signalers_list, \ - signal_link) - static void defer_request(struct i915_request *rq, struct list_head * const pl) { LIST_HEAD(list); diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h index f72e6c397b08..343ed44d5ed4 100644 --- a/drivers/gpu/drm/i915/i915_scheduler_types.h +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h @@ -81,4 +81,14 @@ struct i915_dependency { #define I915_DEPENDENCY_WEAK BIT(2) }; +#define for_each_waiter(p__, rq__) \ + list_for_each_entry_lockless(p__, \ +&(rq__)->sched.waiters_list, \ +wait_link) + +#define for_each_signaler(p__, rq__) \ + list_for_each_entry_rcu(p__, \ + &(rq__)->sched.signalers_list, \ + signal_link) + #endif /* _I915_SCHEDULER_TYPES_H_ */ -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 17/42] drm/i915/gt: Use virtual_engine during execlists_dequeue
Rather than going back and forth between the rb_node entry and the virtual_engine type, store the ve local and reuse it. As the container_of conversion from rb_node to virtual_engine requires a variable offset, performing that conversion just once shaves off a bit of code. v2: Keep a single virtual engine lookup, for typical use. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_lrc.c | 239 1 file changed, 105 insertions(+), 134 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index dcbd47e8a513..c1ba1eee4b29 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -454,9 +454,15 @@ static int queue_prio(const struct intel_engine_execlists *execlists) return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used); } +static int virtual_prio(const struct intel_engine_execlists *el) +{ + struct rb_node *rb = rb_first_cached(>virtual); + + return rb ? rb_entry(rb, struct ve_node, rb)->prio : INT_MIN; +} + static inline bool need_preempt(const struct intel_engine_cs *engine, - const struct i915_request *rq, - struct rb_node *rb) + const struct i915_request *rq) { int last_prio; @@ -493,25 +499,6 @@ static inline bool need_preempt(const struct intel_engine_cs *engine, rq_prio(list_next_entry(rq, sched.link)) > last_prio) return true; - if (rb) { - struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); - bool preempt = false; - - if (engine == ve->siblings[0]) { /* only preempt one sibling */ - struct i915_request *next; - - rcu_read_lock(); - next = READ_ONCE(ve->request); - if (next) - preempt = rq_prio(next) > last_prio; - rcu_read_unlock(); - } - - if (preempt) - return preempt; - } - /* * If the inflight context did not trigger the preemption, then maybe * it was the set of queued requests? Pick the highest priority in @@ -522,7 +509,8 @@ static inline bool need_preempt(const struct intel_engine_cs *engine, * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same * context, it's priority would not exceed ELSP[0] aka last_prio. */ - return queue_prio(>execlists) > last_prio; + return max(virtual_prio(>execlists), + queue_prio(>execlists)) > last_prio; } __maybe_unused static inline bool @@ -1807,6 +1795,35 @@ static bool virtual_matches(const struct virtual_engine *ve, return true; } +static struct virtual_engine * +first_virtual_engine(struct intel_engine_cs *engine) +{ + struct intel_engine_execlists *el = >execlists; + struct rb_node *rb = rb_first_cached(>virtual); + + while (rb) { + struct virtual_engine *ve = + rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + struct i915_request *rq = READ_ONCE(ve->request); + + /* lazily cleanup after another engine handled rq */ + if (!rq) { + rb_erase_cached(rb, >virtual); + RB_CLEAR_NODE(rb); + rb = rb_first_cached(>virtual); + continue; + } + + if (!virtual_matches(ve, rq, engine)) { + rb = rb_next(rb); + continue; + } + return ve; + } + + return NULL; +} + static void virtual_xfer_context(struct virtual_engine *ve, struct intel_engine_cs *engine) { @@ -1905,32 +1922,15 @@ static void defer_active(struct intel_engine_cs *engine) static bool need_timeslice(const struct intel_engine_cs *engine, - const struct i915_request *rq, - const struct rb_node *rb) + const struct i915_request *rq) { int hint; if (!intel_engine_has_timeslices(engine)) return false; - hint = engine->execlists.queue_priority_hint; - - if (rb) { - const struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); - const struct intel_engine_cs *inflight = - intel_context_inflight(>context); - - if (!inflight || inflight == engine) { - struct i915_request *next; - - rcu_read_lock(); - next = READ_ONCE(ve->request); - if (next) - hint = max(hint, rq_prio(next));
[Intel-gfx] [PATCH 39/42] drm/i915/gt: Specify a deadline for the heartbeat
As we know when we expect the heartbeat to be checked for completion, pass this information along as its deadline. We still do not complain if the deadline is missed, at least until we have tried a few times, but it will allow for quicker hang detection on systems where deadlines are adhered to. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index b7b2f332bc3c..2e316759d1a1 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -67,6 +67,16 @@ static void heartbeat_commit(struct i915_request *rq, local_bh_enable(); } +static void set_heartbeat_deadline(struct intel_engine_cs *engine, + struct i915_request *rq) +{ + unsigned long interval; + + interval = READ_ONCE(engine->props.heartbeat_interval_ms); + if (interval) + i915_request_set_deadline(rq, ktime_get() + (interval << 20)); +} + static void show_heartbeat(const struct i915_request *rq, struct intel_engine_cs *engine) { @@ -132,6 +142,8 @@ static void heartbeat(struct work_struct *wrk) local_bh_disable(); i915_request_set_priority(rq, attr.priority); + if (attr.priority == I915_PRIORITY_BARRIER) + i915_request_set_deadline(rq, 0); local_bh_enable(); } else { if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) @@ -166,6 +178,7 @@ static void heartbeat(struct work_struct *wrk) if (engine->i915->params.enable_hangcheck) engine->heartbeat.systole = i915_request_get(rq); + set_heartbeat_deadline(engine, rq); heartbeat_commit(rq, ); unlock: -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 19/42] drm/i915/gt: Defer schedule_out until after the next dequeue
Inside schedule_out, we do extra work upon idling the context, such as updating the runtime, kicking off retires, kicking virtual engines. However, if we are in a series of processing single requests per contexts, we may find ourselves scheduling out the context, only to immediately schedule it back in during dequeue. This is just extra work that we can avoid if we keep the context marked as inflight across the dequeue. This becomes more significant later on for minimising virtual engine misses. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context_types.h | 4 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 127 -- 2 files changed, 86 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index a78c1c225ce3..9a28339b6d74 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -45,8 +45,8 @@ struct intel_context { struct intel_engine_cs *engine; struct intel_engine_cs *inflight; -#define intel_context_inflight(ce) ptr_mask_bits(READ_ONCE((ce)->inflight), 2) -#define intel_context_inflight_count(ce) ptr_unmask_bits(READ_ONCE((ce)->inflight), 2) +#define intel_context_inflight(ce) ptr_mask_bits(READ_ONCE((ce)->inflight), 3) +#define intel_context_inflight_count(ce) ptr_unmask_bits(READ_ONCE((ce)->inflight), 3) struct i915_address_space *vm; struct i915_gem_context __rcu *gem_context; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 19cd5f4f74cf..a9ae0638e117 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1363,6 +1363,8 @@ __execlists_schedule_in(struct i915_request *rq) execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN); intel_engine_context_in(engine); + CE_TRACE(ce, "schedule-in, ccid:%x\n", ce->lrc.ccid); + return engine; } @@ -1406,6 +1408,8 @@ __execlists_schedule_out(struct i915_request *rq, * refrain from doing non-trivial work here. */ + CE_TRACE(ce, "schedule-out, ccid:%x\n", ccid); + /* * If we have just completed this context, the engine may now be * idle and we want to re-enter powersaving. @@ -2053,19 +2057,6 @@ static void set_preempt_timeout(struct intel_engine_cs *engine, active_preempt_timeout(engine, rq)); } -static inline void clear_ports(struct i915_request **ports, int count) -{ - memset_p((void **)ports, NULL, count); -} - -static inline void -copy_ports(struct i915_request **dst, struct i915_request **src, int count) -{ - /* A memcpy_p() would be very useful here! */ - while (count--) - WRITE_ONCE(*dst++, *src++); /* avoid write tearing */ -} - static void execlists_dequeue(struct intel_engine_cs *engine) { struct intel_engine_execlists * const execlists = >execlists; @@ -2400,26 +2391,44 @@ static void execlists_dequeue(struct intel_engine_cs *engine) start_timeslice(engine, execlists->queue_priority_hint); skip_submit: ring_set_paused(engine, 0); + while (port-- != execlists->pending) + i915_request_put(*port); *execlists->pending = NULL; } } -static void -cancel_port_requests(struct intel_engine_execlists * const execlists) +static inline void clear_ports(struct i915_request **ports, int count) +{ + memset_p((void **)ports, NULL, count); +} + +static inline void +copy_ports(struct i915_request **dst, struct i915_request **src, int count) +{ + /* A memcpy_p() would be very useful here! */ + while (count--) + WRITE_ONCE(*dst++, *src++); /* avoid write tearing */ +} + +static struct i915_request ** +cancel_port_requests(struct intel_engine_execlists * const execlists, +struct i915_request **inactive) { struct i915_request * const *port; for (port = execlists->pending; *port; port++) - execlists_schedule_out(*port); + *inactive++ = *port; clear_ports(execlists->pending, ARRAY_SIZE(execlists->pending)); /* Mark the end of active before we overwrite *active */ for (port = xchg(>active, execlists->pending); *port; port++) - execlists_schedule_out(*port); + *inactive++ = *port; clear_ports(execlists->inflight, ARRAY_SIZE(execlists->inflight)); smp_wmb(); /* complete the seqlock for execlists_active() */ WRITE_ONCE(execlists->active, execlists->inflight); + + return inactive; } static inline void @@ -2491,7 +2500,8 @@ gen8_csb_parse(const struct intel_engine_execlists *execlists, const u32 *csb) return *csb & (GEN8_CTX_STATUS_IDLE_ACTIVE | GEN8_CTX_STATUS_PREEMPTED); } -static void process_csb(struct intel_engine_cs *engine)
[Intel-gfx] [PATCH 16/42] drm/i915/gt: Replace direct submit with direct call to tasklet
Rather than having special case code for opportunistically calling process_csb() and performing a direct submit while holding the engine spinlock for submitting the request, simply call the tasklet directly. This allows us to retain the direct submission path, including the CS draining to allow fast/immediate submissions, without requiring any duplicated code paths. The trickiest part here is to ensure that paired operations (such as schedule_in/schedule_out) remain under consistent locking domains, e.g. when pulled outside of the engine->active.lock v2: Use bh kicking, see commit 3c53776e29f8 ("Mark HI and TASKLET softirq synchronous"). v3: Update engine-reset to be tasklet aware Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 4 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 2 + drivers/gpu/drm/i915/gt/intel_engine_cs.c | 35 +++-- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 3 + drivers/gpu/drm/i915/gt/intel_engine_types.h | 3 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 126 ++ drivers/gpu/drm/i915/gt/intel_reset.c | 60 ++--- drivers/gpu/drm/i915/gt/intel_reset.h | 2 + drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 7 +- drivers/gpu/drm/i915/gt/selftest_lrc.c| 27 ++-- drivers/gpu/drm/i915/gt/selftest_reset.c | 8 +- drivers/gpu/drm/i915/i915_request.c | 2 + drivers/gpu/drm/i915/selftests/i915_request.c | 6 +- 13 files changed, 144 insertions(+), 141 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 56700ddbed47..04d291a8d31d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -398,12 +398,14 @@ static bool __reset_engine(struct intel_engine_cs *engine) if (!intel_has_reset_engine(gt)) return false; + local_bh_disable(); if (!test_and_set_bit(I915_RESET_ENGINE + engine->id, >reset.flags)) { - success = intel_engine_reset(engine, NULL) == 0; + success = __intel_engine_reset_bh(engine, NULL) == 0; clear_and_wake_up_bit(I915_RESET_ENGINE + engine->id, >reset.flags); } + local_bh_enable(); return success; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 89c7b33dd99f..9f676df68fae 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2453,7 +2453,9 @@ static void eb_request_add(struct i915_execbuffer *eb) __i915_request_skip(rq); } + local_bh_disable(); __i915_request_queue(rq, ); + local_bh_enable(); } static int diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index ea4ba2afe9f9..26f2781e4a51 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1000,32 +1000,39 @@ static unsigned long stop_timeout(const struct intel_engine_cs *engine) return READ_ONCE(engine->props.stop_timeout_ms); } -int intel_engine_stop_cs(struct intel_engine_cs *engine) +static int __intel_engine_stop_cs(struct intel_engine_cs *engine, + int fast_timeout_us, + int slow_timeout_ms) { struct intel_uncore *uncore = engine->uncore; - const u32 base = engine->mmio_base; - const i915_reg_t mode = RING_MI_MODE(base); + const i915_reg_t mode = RING_MI_MODE(engine->mmio_base); int err; + intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING)); + err = __intel_wait_for_register_fw(engine->uncore, mode, + MODE_IDLE, MODE_IDLE, + fast_timeout_us, + slow_timeout_ms, + NULL); + + /* A final mmio read to let GPU writes be hopefully flushed to memory */ + intel_uncore_posting_read_fw(uncore, mode); + return err; +} + +int intel_engine_stop_cs(struct intel_engine_cs *engine) +{ + int err = 0; + if (INTEL_GEN(engine->i915) < 3) return -ENODEV; ENGINE_TRACE(engine, "\n"); - - intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING)); - - err = 0; - if (__intel_wait_for_register_fw(uncore, -mode, MODE_IDLE, MODE_IDLE, -1000, stop_timeout(engine), -NULL)) { + if (__intel_engine_stop_cs(engine, 1000, stop_timeout(engine))) { ENGINE_TRACE(engine, "timed out on STOP_RING -> IDLE\n");
[Intel-gfx] [PATCH 30/42] drm/i915: Prune empty priolists
A side-effect of our priority inheritance scheme is that we promote requests from one priority to the next, moving them from one list to the next. This can often leave the old priority list empty, but still resident in the rbtree, which we then have to traverse during HW submission. rb_next() is relatively expensive operation so if we can push that to the update where we can do piecemeal pruning and reuse the nodes, this reduces the latency for HW submission. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_scheduler.c | 41 +-- 1 file changed, 32 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index bfbbd94dfcbc..4b465a571a83 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -64,9 +64,10 @@ struct list_head * i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio) { struct intel_engine_execlists * const execlists = >execlists; - struct i915_priolist *p; + struct list_head *free = NULL; struct rb_node **parent, *rb; - bool first = true; + struct i915_priolist *p; + bool first; lockdep_assert_held(>active.lock); assert_priolists(execlists); @@ -77,22 +78,40 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio) find_priolist: /* most positive priority is scheduled first, equal priorities fifo */ rb = NULL; + first = true; parent = >queue.rb_root.rb_node; while (*parent) { rb = *parent; p = to_priolist(rb); - if (prio > p->priority) { - parent = >rb_left; - } else if (prio < p->priority) { - parent = >rb_right; - first = false; - } else { - return >requests; + + if (prio == p->priority) + goto out; + + /* +* Prune an empty priolist, we can reuse it if we need to +* allocate. After removing this node and rotating the subtrees +* beneath its parent, we need to restart our descent from the +* parent. +*/ + if (list_empty(>requests)) { + rb = rb_parent(>node); + parent = rb ? : >queue.rb_root.rb_node; + rb_erase_cached(>node, >queue); + free = i915_priolist_free_defer(p, free); + continue; } + + if (prio < p->priority) + parent = >rb_left; + else + parent = >rb_right, first = false; } if (prio == I915_PRIORITY_NORMAL) { p = >default_priolist; + } else if (free) { + p = container_of(free, typeof(*p), requests); + free = p->requests.next; } else { p = kmem_cache_alloc(global.slab_priorities, GFP_ATOMIC); /* Convert an allocation failure to a priority bump */ @@ -117,7 +136,11 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio) rb_link_node(>node, rb, parent); rb_insert_color_cached(>node, >queue, first); + GEM_BUG_ON(rb_first_cached(>queue) != + rb_first(>queue.rb_root)); +out: + i915_priolist_free_many(free); return >requests; } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 12/42] drm/i915: Reduce test_and_set_bit to set_bit in i915_request_submit()
Avoid the full blown memory barrier of test_and_set_bit() by noting the completed request and removing it from the lists. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_request.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 67267b0d5b06..1cd102591b36 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -537,8 +537,10 @@ bool __i915_request_submit(struct i915_request *request) * dropped upon retiring. (Otherwise if resubmit a *retired* * request, this would be a horrible use-after-free.) */ - if (i915_request_completed(request)) - goto xfer; + if (i915_request_completed(request)) { + list_del_init(>sched.link); + goto active; + } if (unlikely(intel_context_is_banned(request->context))) i915_request_set_error_once(request, -EIO); @@ -572,11 +574,11 @@ bool __i915_request_submit(struct i915_request *request) engine->serial++; result = true; -xfer: - if (!test_and_set_bit(I915_FENCE_FLAG_ACTIVE, >fence.flags)) { - list_move_tail(>sched.link, >active.requests); - clear_bit(I915_FENCE_FLAG_PQUEUE, >fence.flags); - } + GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, >fence.flags)); + list_move_tail(>sched.link, >active.requests); +active: + clear_bit(I915_FENCE_FLAG_PQUEUE, >fence.flags); + set_bit(I915_FENCE_FLAG_ACTIVE, >fence.flags); /* * XXX Rollback bonded-execution on __i915_request_unsubmit()? -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 31/42] drm/i915: Replace engine->schedule() with a known request operation
Looking to the future, we want to set the scheduling attributes explicitly and so replace the generic engine->schedule() with the more direct i915_request_set_priority() What it loses in removing the 'schedule' name from the function, it gains in having an explicit entry point with a stated goal. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/display/intel_display.c | 9 + .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 2 +- drivers/gpu/drm/i915/gem/i915_gem_object.h| 2 +- drivers/gpu/drm/i915/gem/i915_gem_wait.c | 27 +-- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 -- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 4 +-- drivers/gpu/drm/i915/gt/intel_engine_types.h | 29 drivers/gpu/drm/i915/gt/intel_engine_user.c | 2 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 11 +++ drivers/gpu/drm/i915/gt/selftest_lrc.c| 33 +-- drivers/gpu/drm/i915/i915_request.c | 11 --- drivers/gpu/drm/i915/i915_scheduler.c | 15 + drivers/gpu/drm/i915/i915_scheduler.h | 3 +- 14 files changed, 58 insertions(+), 96 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 7212204a4759..e713b673be4b 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -15917,13 +15917,6 @@ static void intel_plane_unpin_fb(struct intel_plane_state *old_plane_state) intel_unpin_fb_vma(vma, old_plane_state->flags); } -static void fb_obj_bump_render_priority(struct drm_i915_gem_object *obj) -{ - struct i915_sched_attr attr = { .priority = I915_PRIORITY_DISPLAY }; - - i915_gem_object_wait_priority(obj, 0, ); -} - /** * intel_prepare_plane_fb - Prepare fb for usage on plane * @_plane: drm plane to prepare for @@ -16000,7 +15993,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane, if (ret) return ret; - fb_obj_bump_render_priority(obj); + i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY); i915_gem_object_flush_frontbuffer(obj, ORIGIN_DIRTYFB); if (!new_plane_state->uapi.fence) { /* implicit fencing */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 9f676df68fae..ab57b527c893 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2099,7 +2099,7 @@ static int __eb_pin_reloc_engine(struct i915_execbuffer *eb) return PTR_ERR(ce); /* Reuse eb->context->timeline with scheduler! */ - if (engine->schedule) + if (intel_engine_has_scheduler(engine)) ce->timeline = intel_timeline_get(eb->context->timeline); i915_vm_put(ce->vm); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index e5b9276d254c..631237c859e9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -474,7 +474,7 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj, long timeout); int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, unsigned int flags, - const struct i915_sched_attr *attr); + int prio); void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj, enum fb_op_origin origin); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index 8af55cd3e690..cefbbb3d9b52 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -93,28 +93,17 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, return timeout; } -static void __fence_set_priority(struct dma_fence *fence, -const struct i915_sched_attr *attr) +static void __fence_set_priority(struct dma_fence *fence, int prio) { - struct i915_request *rq; - struct intel_engine_cs *engine; - if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence)) return; - rq = to_request(fence); - engine = rq->engine; - local_bh_disable(); - rcu_read_lock(); /* RCU serialisation for set-wedged protection */ - if (engine->schedule) - engine->schedule(rq, attr); - rcu_read_unlock(); + i915_request_set_priority(to_request(fence), prio); local_bh_enable(); /* kick the tasklets if queues were reprioritised */ } -static void fence_set_priority(struct dma_fence *fence, - const struct i915_sched_attr *attr) +static void fence_set_priority(struct dma_fence *fence, int prio) {
[Intel-gfx] [PATCH 25/42] drm/i915/gt: Convert stats.active to plain unsigned int
As context-in/out is now always serialised, we do not have to worry about concurrent enabling/disable of the busy-stats and can reduce the atomic_t active to a plain unsigned int, and the seqlock to a seqcount. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_cs.c| 8 ++-- drivers/gpu/drm/i915/gt/intel_engine_stats.h | 45 drivers/gpu/drm/i915/gt/intel_engine_types.h | 4 +- 3 files changed, 34 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index b2f520bc1497..8de3105632ef 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -339,7 +339,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) engine->schedule = NULL; ewma__engine_latency_init(>latency); - seqlock_init(>stats.lock); + seqcount_init(>stats.lock); ATOMIC_INIT_NOTIFIER_HEAD(>context_status_notifier); @@ -1714,7 +1714,7 @@ static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine, * add it to the total. */ *now = ktime_get(); - if (atomic_read(>stats.active)) + if (READ_ONCE(engine->stats.active)) total = ktime_add(total, ktime_sub(*now, engine->stats.start)); return total; @@ -1733,9 +1733,9 @@ ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine, ktime_t *now) ktime_t total; do { - seq = read_seqbegin(>stats.lock); + seq = read_seqcount_begin(>stats.lock); total = __intel_engine_get_busy_time(engine, now); - } while (read_seqretry(>stats.lock, seq)); + } while (read_seqcount_retry(>stats.lock, seq)); return total; } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_stats.h b/drivers/gpu/drm/i915/gt/intel_engine_stats.h index 58491eae3482..24fbdd94351a 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_stats.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_stats.h @@ -17,33 +17,44 @@ static inline void intel_engine_context_in(struct intel_engine_cs *engine) { unsigned long flags; - if (atomic_add_unless(>stats.active, 1, 0)) + if (engine->stats.active) { + engine->stats.active++; return; - - write_seqlock_irqsave(>stats.lock, flags); - if (!atomic_add_unless(>stats.active, 1, 0)) { - engine->stats.start = ktime_get(); - atomic_inc(>stats.active); } - write_sequnlock_irqrestore(>stats.lock, flags); + + /* The writer is serialised; but the pmu reader may be from hardirq */ + local_irq_save(flags); + write_seqcount_begin(>stats.lock); + + engine->stats.start = ktime_get(); + engine->stats.active++; + + write_seqcount_end(>stats.lock); + local_irq_restore(flags); + + GEM_BUG_ON(!engine->stats.active); } static inline void intel_engine_context_out(struct intel_engine_cs *engine) { unsigned long flags; - GEM_BUG_ON(!atomic_read(>stats.active)); - - if (atomic_add_unless(>stats.active, -1, 1)) + GEM_BUG_ON(!engine->stats.active); + if (engine->stats.active > 1) { + engine->stats.active--; return; - - write_seqlock_irqsave(>stats.lock, flags); - if (atomic_dec_and_test(>stats.active)) { - engine->stats.total = - ktime_add(engine->stats.total, - ktime_sub(ktime_get(), engine->stats.start)); } - write_sequnlock_irqrestore(>stats.lock, flags); + + local_irq_save(flags); + write_seqcount_begin(>stats.lock); + + engine->stats.active--; + engine->stats.total = + ktime_add(engine->stats.total, + ktime_sub(ktime_get(), engine->stats.start)); + + write_seqcount_end(>stats.lock); + local_irq_restore(flags); } #endif /* __INTEL_ENGINE_STATS_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 2361206fd026..7ee9e32d149b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -525,12 +525,12 @@ struct intel_engine_cs { /** * @active: Number of contexts currently scheduled in. */ - atomic_t active; + unsigned int active; /** * @lock: Lock protecting the below fields. */ - seqlock_t lock; + seqcount_t lock; /** * @total: Total time this engine was busy. -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 07/42] drm/i915/gt: Split the breadcrumb spinlock between global and contexts
As we funnel more and more contexts into the breadcrumbs on an engine, the hold time of b->irq_lock grows. As we may then contend with the b->irq_lock during request submission, this increases the burden upon the engine->active.lock and so directly impacts both our execution latency and client latency. If we split the b->irq_lock by introducing a per-context spinlock to manage the signalers within a context, we then only need the b->irq_lock for enabling/disabling the interrupt and can avoid taking the lock for walking the list of contexts within the signal worker. Even with the current setup, this greatly reduces the number of times we have to take and fight for b->irq_lock. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 154 ++ drivers/gpu/drm/i915/gt/intel_context.c | 1 + drivers/gpu/drm/i915/gt/intel_context_types.h | 1 + 3 files changed, 86 insertions(+), 70 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 9e7ac612fabb..98d323354082 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -103,17 +103,19 @@ static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b) static void add_signaling_context(struct intel_breadcrumbs *b, struct intel_context *ce) { - intel_context_get(ce); - list_add_tail(>signal_link, >signalers); - if (list_is_first(>signal_link, >signalers)) + lockdep_assert_held(>irq_lock); + + list_add_rcu(>signal_link, >signalers); + if (list_is_last(>signal_link, >signalers)) __intel_breadcrumbs_arm_irq(b); } static void remove_signaling_context(struct intel_breadcrumbs *b, struct intel_context *ce) { - list_del(>signal_link); - intel_context_put(ce); + spin_lock(>irq_lock); + list_del_rcu(>signal_link); + spin_unlock(>irq_lock); } static inline bool __request_completed(const struct i915_request *rq) @@ -189,15 +191,12 @@ static void signal_irq_work(struct irq_work *work) struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work); const ktime_t timestamp = ktime_get(); struct llist_node *signal, *sn; - struct intel_context *ce, *cn; - struct list_head *pos, *next; + struct intel_context *ce; signal = NULL; if (unlikely(!llist_empty(>signaled_requests))) signal = llist_del_all(>signaled_requests); - spin_lock(>irq_lock); - /* * Keep the irq armed until the interrupt after all listeners are gone. * @@ -221,11 +220,24 @@ static void signal_irq_work(struct irq_work *work) * interrupt draw less ire from other users of the system and tools * like powertop. */ - if (!signal && list_empty(>signalers)) - __intel_breadcrumbs_disarm_irq(b); + if (!signal && READ_ONCE(b->irq_armed) && list_empty(>signalers)) { + if (spin_trylock(>irq_lock)) { + if (list_empty(>signalers)) + __intel_breadcrumbs_disarm_irq(b); + spin_unlock(>irq_lock); + } + } + + rcu_read_lock(); + list_for_each_entry_rcu(ce, >signalers, signal_link) { + struct list_head *pos, *next; + bool release = false; - list_for_each_entry_safe(ce, cn, >signalers, signal_link) { - GEM_BUG_ON(list_empty(>signals)); + if (!spin_trylock(>signal_lock)) + continue; + + if (list_empty(>signals)) + goto unlock; list_for_each_safe(pos, next, >signals) { struct i915_request *rq = @@ -258,11 +270,16 @@ static void signal_irq_work(struct irq_work *work) if (>signals == pos) { /* now empty */ add_retire(b, ce->timeline); remove_signaling_context(b, ce); + release = true; } } - } - spin_unlock(>irq_lock); +unlock: + spin_unlock(>signal_lock); + if (release) + intel_context_put(ce); + } + rcu_read_unlock(); llist_for_each_safe(signal, sn, signal) { struct i915_request *rq = @@ -336,9 +353,9 @@ void intel_breadcrumbs_free(struct intel_breadcrumbs *b) kfree(b); } -static void insert_breadcrumb(struct i915_request *rq, - struct intel_breadcrumbs *b) +static void insert_breadcrumb(struct i915_request *rq) { + struct intel_breadcrumbs *b = READ_ONCE(rq->engine)->breadcrumbs; struct intel_context *ce = rq->context; struct list_head
[Intel-gfx] [PATCH 34/42] drm/i915: Restructure priority inheritance
In anticipation of wanting to be able to call pi from underneath an engine's active.lock, rework the priority inheritance to primarily work along an engine's priority queue, delegating any other engine that the chain may traverse to a worker. This reduces the global spinlock from governing the entire priority inheritance depth-first search, to a small lock around a single list. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_scheduler.c | 299 drivers/gpu/drm/i915/i915_scheduler_types.h | 6 +- 2 files changed, 178 insertions(+), 127 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 247d2e2bb3c8..45e000c257db 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -17,7 +17,70 @@ static struct i915_global_scheduler { struct kmem_cache *slab_priorities; } global; -static DEFINE_SPINLOCK(schedule_lock); +static DEFINE_SPINLOCK(ipi_lock); +static LIST_HEAD(ipi_list); + +static inline int rq_prio(const struct i915_request *rq) +{ + return READ_ONCE(rq->sched.attr.priority); +} + +static void ipi_schedule(struct irq_work *wrk) +{ + rcu_read_lock(); + do { + struct i915_dependency *p; + struct i915_request *rq; + int prio; + + spin_lock(_lock); + p = list_first_entry_or_null(_list, typeof(*p), ipi_link); + if (p) { + rq = container_of(p->signaler, typeof(*rq), sched); + list_del_init(>ipi_link); + + prio = p->ipi_priority; + p->ipi_priority = I915_PRIORITY_INVALID; + } + spin_unlock(_lock); + if (!p) + break; + + if (i915_request_completed(rq)) + continue; + + i915_request_set_priority(rq, prio); + } while (1); + rcu_read_unlock(); +} + +static DEFINE_IRQ_WORK(ipi_work, ipi_schedule); + +static void ipi_queue(struct i915_dependency *p) +{ + list_move(>ipi_link, _list); + irq_work_queue(_work); +} + +/* + * Virtual engines complicate acquiring the engine timeline lock, + * as their rq->engine pointer is not stable until under that + * engine lock. The simple ploy we use is to take the lock then + * check that the rq still belongs to the newly locked engine. + */ +#define lock_engine_irqsave(rq, flags) ({ \ + struct i915_request * const rq__ = (rq); \ + struct intel_engine_cs *engine__ = READ_ONCE(rq__->engine); \ +\ + spin_lock_irqsave(__->active.lock, (flags)); \ + while (engine__ != READ_ONCE((rq__)->engine)) { \ + spin_unlock(__->active.lock); \ + engine__ = READ_ONCE(rq__->engine); \ + spin_lock(__->active.lock); \ + } \ +\ + engine__; \ +}) static const struct i915_request * node_to_request(const struct i915_sched_node *node) @@ -162,42 +225,6 @@ void i915_priolist_free_many(struct list_head *list) } } -struct sched_cache { - struct list_head *priolist; -}; - -static struct intel_engine_cs * -sched_lock_engine(const struct i915_sched_node *node, - struct intel_engine_cs *locked, - struct sched_cache *cache) -{ - const struct i915_request *rq = node_to_request(node); - struct intel_engine_cs *engine; - - GEM_BUG_ON(!locked); - - /* -* Virtual engines complicate acquiring the engine timeline lock, -* as their rq->engine pointer is not stable until under that -* engine lock. The simple ploy we use is to take the lock then -* check that the rq still belongs to the newly locked engine. -*/ - while (locked != (engine = READ_ONCE(rq->engine))) { - spin_unlock(>active.lock); - memset(cache, 0, sizeof(*cache)); - spin_lock(>active.lock); - locked = engine; - } - - GEM_BUG_ON(locked != engine); - return locked; -} - -static inline int rq_prio(const struct i915_request *rq) -{ - return rq->sched.attr.priority; -} - static inline bool need_preempt(int prio, int active) { /* @@ -223,19 +250,17 @@ static void kick_submission(struct intel_engine_cs *engine, if (prio <= engine->execlists.queue_priority_hint) return; - rcu_read_lock(); - /* Nothing currently active? We're overdue for a submission! */ inflight = execlists_active(>execlists); if (!inflight) - goto unlock; + return; /* * If we are already the currently executing context, don't * bother evaluating if we should preempt ourselves. */ if (inflight->context == rq->context) - goto unlock; + return; ENGINE_TRACE(engine,
[Intel-gfx] [PATCH 40/42] drm/i915: Replace the priority boosting for the display with a deadline
For a modeset/pageflip, there is a very precise deadline by which the frame must be completed in order to hit the vblank and be shown. While we don't pass along that exact information, we can at least inform the scheduler that this request-chain needs to be completed asap. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/display/intel_display.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 4 ++-- drivers/gpu/drm/i915/gem/i915_gem_wait.c | 19 ++- drivers/gpu/drm/i915/i915_priolist_types.h | 3 --- 4 files changed, 13 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index e713b673be4b..0327e93586c2 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -15993,7 +15993,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane, if (ret) return ret; - i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY); + i915_gem_object_wait_deadline(obj, 0, ktime_get() /* next vblank? */); i915_gem_object_flush_frontbuffer(obj, ORIGIN_DIRTYFB); if (!new_plane_state->uapi.fence) { /* implicit fencing */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 631237c859e9..3b1b0601adf3 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -472,9 +472,9 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj) int i915_gem_object_wait(struct drm_i915_gem_object *obj, unsigned int flags, long timeout); -int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, +int i915_gem_object_wait_deadline(struct drm_i915_gem_object *obj, unsigned int flags, - int prio); + ktime_t deadline); void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj, enum fb_op_origin origin); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index cefbbb3d9b52..3334817183f6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -93,17 +93,18 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, return timeout; } -static void __fence_set_priority(struct dma_fence *fence, int prio) +static void __fence_set_deadline(struct dma_fence *fence, ktime_t deadline) { if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence)) return; local_bh_disable(); - i915_request_set_priority(to_request(fence), prio); + i915_request_set_deadline(to_request(fence), + i915_sched_to_ticks(deadline)); local_bh_enable(); /* kick the tasklets if queues were reprioritised */ } -static void fence_set_priority(struct dma_fence *fence, int prio) +static void fence_set_deadline(struct dma_fence *fence, ktime_t deadline) { /* Recurse once into a fence-array */ if (dma_fence_is_array(fence)) { @@ -111,16 +112,16 @@ static void fence_set_priority(struct dma_fence *fence, int prio) int i; for (i = 0; i < array->num_fences; i++) - __fence_set_priority(array->fences[i], prio); + __fence_set_deadline(array->fences[i], deadline); } else { - __fence_set_priority(fence, prio); + __fence_set_deadline(fence, deadline); } } int -i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, +i915_gem_object_wait_deadline(struct drm_i915_gem_object *obj, unsigned int flags, - int prio) + ktime_t deadline) { struct dma_fence *excl; @@ -135,7 +136,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, return ret; for (i = 0; i < count; i++) { - fence_set_priority(shared[i], prio); + fence_set_deadline(shared[i], deadline); dma_fence_put(shared[i]); } @@ -145,7 +146,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, } if (excl) { - fence_set_priority(excl, prio); + fence_set_deadline(excl, deadline); dma_fence_put(excl); } return 0; diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h index 43a0ac45295f..ac6d9614ea23 100644 --- a/drivers/gpu/drm/i915/i915_priolist_types.h +++ b/drivers/gpu/drm/i915/i915_priolist_types.h @@ -20,9 +20,6 @@ enum { /* A preemptive pulse used to monitor the health of each
[Intel-gfx] [PATCH 28/42] drm/i915: Remove I915_USER_PRIORITY_SHIFT
As we do not have any internal priority levels, the priority can be set directed from the user values. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/display/intel_display.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_context.c | 6 +-- .../i915/gem/selftests/i915_gem_object_blt.c | 4 +- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 6 +-- drivers/gpu/drm/i915/gt/selftest_lrc.c| 44 +++ drivers/gpu/drm/i915/i915_priolist_types.h| 3 -- drivers/gpu/drm/i915/i915_scheduler.c | 1 - 7 files changed, 23 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index db2a5a1a9b35..7212204a4759 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -15919,9 +15919,7 @@ static void intel_plane_unpin_fb(struct intel_plane_state *old_plane_state) static void fb_obj_bump_render_priority(struct drm_i915_gem_object *obj) { - struct i915_sched_attr attr = { - .priority = I915_USER_PRIORITY(I915_PRIORITY_DISPLAY), - }; + struct i915_sched_attr attr = { .priority = I915_PRIORITY_DISPLAY }; i915_gem_object_wait_priority(obj, 0, ); } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 04d291a8d31d..b2cd488c3420 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -719,7 +719,7 @@ __create_context(struct drm_i915_private *i915) kref_init(>ref); ctx->i915 = i915; - ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL); + ctx->sched.priority = I915_PRIORITY_NORMAL; mutex_init(>mutex); INIT_LIST_HEAD(>link); @@ -1992,7 +1992,7 @@ static int set_priority(struct i915_gem_context *ctx, !capable(CAP_SYS_NICE)) return -EPERM; - ctx->sched.priority = I915_USER_PRIORITY(priority); + ctx->sched.priority = priority; context_apply_all(ctx, __apply_priority, ctx); return 0; @@ -2496,7 +2496,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, case I915_CONTEXT_PARAM_PRIORITY: args->size = 0; - args->value = ctx->sched.priority >> I915_USER_PRIORITY_SHIFT; + args->value = ctx->sched.priority; break; case I915_CONTEXT_PARAM_SSEU: diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c index 23b6e11bbc3e..c4c04fb97d14 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c @@ -220,7 +220,7 @@ static int igt_fill_blt_thread(void *arg) return PTR_ERR(ctx); prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng); - ctx->sched.priority = I915_USER_PRIORITY(prio); + ctx->sched.priority = prio; } ce = i915_gem_context_get_engine(ctx, 0); @@ -338,7 +338,7 @@ static int igt_copy_blt_thread(void *arg) return PTR_ERR(ctx); prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng); - ctx->sched.priority = I915_USER_PRIORITY(prio); + ctx->sched.priority = prio; } ce = i915_gem_context_get_engine(ctx, 0); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index 469639a2cc30..5ab76c507e46 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -82,9 +82,7 @@ static void show_heartbeat(const struct i915_request *rq, static void heartbeat(struct work_struct *wrk) { - struct i915_sched_attr attr = { - .priority = I915_USER_PRIORITY(I915_PRIORITY_MIN), - }; + struct i915_sched_attr attr = { .priority = I915_PRIORITY_MIN }; struct intel_engine_cs *engine = container_of(wrk, typeof(*engine), heartbeat.work.work); struct intel_context *ce = engine->kernel_context; @@ -128,7 +126,7 @@ static void heartbeat(struct work_struct *wrk) */ attr.priority = 0; if (rq->sched.attr.priority >= attr.priority) - attr.priority |= I915_USER_PRIORITY(I915_PRIORITY_HEARTBEAT); + attr.priority = I915_PRIORITY_HEARTBEAT; if (rq->sched.attr.priority >= attr.priority) attr.priority = I915_PRIORITY_BARRIER; diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 037145b51459..2b707f39b04c 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++
[Intel-gfx] [PATCH 18/42] drm/i915/gt: Decouple inflight virtual engines
Once a virtual engine has been bound to a sibling, it will remain bound until we finally schedule out the last active request. We can not rebind the context to a new sibling while it is inflight as the context save will conflict, hence we wait. As we cannot then use any other sibliing while the context is inflight, only kick the bound sibling while it inflight and upon scheduling out the kick the rest (so that we can swap engines on timeslicing if the previously bound engine becomes oversubscribed). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 28 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index c1ba1eee4b29..19cd5f4f74cf 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1388,9 +1388,8 @@ static inline void execlists_schedule_in(struct i915_request *rq, int idx) static void kick_siblings(struct i915_request *rq, struct intel_context *ce) { struct virtual_engine *ve = container_of(ce, typeof(*ve), context); - struct i915_request *next = READ_ONCE(ve->request); - if (next == rq || (next && next->execution_mask & ~rq->execution_mask)) + if (READ_ONCE(ve->request)) tasklet_hi_schedule(>base.execlists.tasklet); } @@ -1807,17 +1806,13 @@ first_virtual_engine(struct intel_engine_cs *engine) struct i915_request *rq = READ_ONCE(ve->request); /* lazily cleanup after another engine handled rq */ - if (!rq) { + if (!rq || !virtual_matches(ve, rq, engine)) { rb_erase_cached(rb, >virtual); RB_CLEAR_NODE(rb); rb = rb_first_cached(>virtual); continue; } - if (!virtual_matches(ve, rq, engine)) { - rb = rb_next(rb); - continue; - } return ve; } @@ -5469,7 +5464,6 @@ static void virtual_submission_tasklet(unsigned long data) if (unlikely(!mask)) return; - local_irq_disable(); for (n = 0; n < ve->num_siblings; n++) { struct intel_engine_cs *sibling = READ_ONCE(ve->siblings[n]); struct ve_node * const node = >nodes[sibling->id]; @@ -5479,20 +5473,19 @@ static void virtual_submission_tasklet(unsigned long data) if (!READ_ONCE(ve->request)) break; /* already handled by a sibling's tasklet */ + spin_lock_irq(>active.lock); + if (unlikely(!(mask & sibling->mask))) { if (!RB_EMPTY_NODE(>rb)) { - spin_lock(>active.lock); rb_erase_cached(>rb, >execlists.virtual); RB_CLEAR_NODE(>rb); - spin_unlock(>active.lock); } - continue; - } - spin_lock(>active.lock); + goto unlock_engine; + } - if (!RB_EMPTY_NODE(>rb)) { + if (unlikely(!RB_EMPTY_NODE(>rb))) { /* * Cheat and avoid rebalancing the tree if we can * reuse this node in situ. @@ -5532,9 +5525,12 @@ static void virtual_submission_tasklet(unsigned long data) if (first && prio > sibling->execlists.queue_priority_hint) tasklet_hi_schedule(>execlists.tasklet); - spin_unlock(>active.lock); +unlock_engine: + spin_unlock_irq(>active.lock); + + if (intel_context_inflight(>context)) + break; } - local_irq_enable(); } static void virtual_submit_request(struct i915_request *rq) -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 05/42] drm/i915/gt: Free stale request on destroying the virtual engine
Since preempt-to-busy, we may unsubmit a request while it is still on the HW and completes asynchronously. That means it may be retired and in the process destroy the virtual engine (as the user has closed their context), but that engine may still be holding onto the unsubmitted compelted request. Therefore we need to potentially cleanup the old request on destroying the virtual engine. We also have to keep the virtual_engine alive until after the sibling's execlists_dequeue() have finished peeking into the virtual engines, for which we serialise with RCU. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_lrc.c | 22 +++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 417f6b0c6c61..cb04bc5474be 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -180,6 +180,7 @@ #define EXECLISTS_REQUEST_SIZE 64 /* bytes */ struct virtual_engine { + struct rcu_head rcu; struct intel_engine_cs base; struct intel_context context; @@ -5393,10 +5394,25 @@ static void virtual_context_destroy(struct kref *kref) container_of(kref, typeof(*ve), context.ref); unsigned int n; - GEM_BUG_ON(!list_empty(virtual_queue(ve))); - GEM_BUG_ON(ve->request); GEM_BUG_ON(ve->context.inflight); + if (unlikely(ve->request)) { + struct i915_request *old; + unsigned long flags; + + spin_lock_irqsave(>base.active.lock, flags); + + old = fetch_and_zero(>request); + if (old) { + GEM_BUG_ON(!i915_request_completed(old)); + __i915_request_submit(old); + i915_request_put(old); + } + + spin_unlock_irqrestore(>base.active.lock, flags); + } + GEM_BUG_ON(!list_empty(virtual_queue(ve))); + for (n = 0; n < ve->num_siblings; n++) { struct intel_engine_cs *sibling = ve->siblings[n]; struct rb_node *node = >nodes[sibling->id].rb; @@ -5422,7 +5438,7 @@ static void virtual_context_destroy(struct kref *kref) intel_engine_free_request_pool(>base); kfree(ve->bonds); - kfree(ve); + kfree_rcu(ve, rcu); } static void virtual_engine_initial_hint(struct virtual_engine *ve) -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 23/42] drm/i915/gt: Drop atomic for engine->fw_active tracking
Since schedule-in/out is now entirely serialised by the tasklet bitlock, we do not need to worry about concurrent in/out operations and so reduce the atomic operations to plain instructions. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_cs.c| 2 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 26f2781e4a51..b2f520bc1497 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1637,7 +1637,7 @@ void intel_engine_dump(struct intel_engine_cs *engine, ktime_to_ms(intel_engine_get_busy_time(engine, ))); drm_printf(m, "\tForcewake: %x domains, %d active\n", - engine->fw_domain, atomic_read(>fw_active)); + engine->fw_domain, READ_ONCE(engine->fw_active)); rcu_read_lock(); rq = READ_ONCE(engine->heartbeat.systole); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 3ea1e59d2b85..2361206fd026 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -328,7 +328,7 @@ struct intel_engine_cs { * as possible. */ enum forcewake_domains fw_domain; - atomic_t fw_active; + unsigned int fw_active; unsigned long context_tag; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index e3dccdc53faf..61dfc932201e 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1342,7 +1342,7 @@ __execlists_schedule_in(struct i915_request *rq) ce->lrc.ccid |= engine->execlists.ccid; __intel_gt_pm_get(engine->gt); - if (engine->fw_domain && !atomic_fetch_inc(>fw_active)) + if (engine->fw_domain && !engine->fw_active++) intel_uncore_forcewake_get(engine->uncore, engine->fw_domain); execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN); intel_engine_context_in(engine); @@ -1435,7 +1435,7 @@ static inline void __execlists_schedule_out(struct i915_request *rq) intel_context_update_runtime(ce); intel_engine_context_out(engine); execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT); - if (engine->fw_domain && !atomic_dec_return(>fw_active)) + if (engine->fw_domain && !--engine->fw_active) intel_uncore_forcewake_put(engine->uncore, engine->fw_domain); intel_gt_pm_put_async(engine->gt); -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 38/42] drm/i915: Fair low-latency scheduling
The first "scheduler" was a topographical sorting of requests into priority order. The execution order was deterministic, the earliest submitted, highest priority request would be executed first. Priority inherited ensured that inversions were kept at bay, and allowed us to dynamically boost priorities (e.g. for interactive pageflips). The minimalistic timeslicing scheme was an attempt to introduce fairness between long running requests, by evicting the active request at the end of a timeslice and moving it to the back of its priority queue (while ensuring that dependencies were kept in order). For short running requests from many clients of equal priority, the scheme is still very much FIFO submission ordering, and as unfair as before. To impose fairness, we need an external metric that ensures that clients are interpersed, we don't execute one long chain from client A before executing any of client B. This could be imposed by the clients by using a fences based on an external clock, that is they only submit work for a "frame" at frame-interval, instead of submitting as much work as they are able to. The standard SwapBuffers approach is akin to double bufferring, where as one frame is being executed, the next is being submitted, such that there is always a maximum of two frames per client in the pipeline. Even this scheme exhibits unfairness under load as a single client will execute two frames back to back before the next, and with enough clients, deadlines will be missed. The idea introduced by BFS/MuQSS is that fairness is introduced by metering with an external clock. Every request, when it becomes ready to execute is assigned a virtual deadline, and execution order is then determined by earliest deadline. Priority is used as a hint, rather than strict ordering, where high priority requests have earlier deadlines, but not necessarily earlier than outstanding work. Thus work is executed in order of 'readiness', with timeslicing to demote long running work. The Achille's heel of this scheduler is its strong preference for low-latency and favouring of new queues. Whereas it was easy to dominate the old scheduler by flooding it with many requests over a short period of time, the new scheduler can be dominated by a 'synchronous' client that waits for each of its requests to complete before submitting the next. As such a client has no history, it is always considered ready-to-run and receives an earlier deadline than the long running requests. To check the impact on throughput (often the downfall of latency sensitive schedulers), we used gem_wsim to simulate various transcode workloads with different load balancers, and varying the number of competing [heterogenous] clients. +mB+ | a | | cda | | c.a | | ..aa | | ..---. | | -.--+-.| |.c.-.-+++. b | | bbb.d-c-+--+++.aab aab b | |b b b b b. b ..---+++-+a. b. b b b bb b| 1 A| | 2 |___AM| | 3|A__| | 4|MA_| | +--+ Clients Min Max Median AvgStddev 1 -8.20 5.4 -0.045 -0.02375 0.094722134 2 -15.96 19.28 -0.64 -1.05 2.2428076 4 -5.11 2.95 -1.15-1.0680.72382651 8 -5.63 1.85 -0.905 -0.871224490.73390971 The impact was on average 1% under contention due to the change in context execution order and number of context switches. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 12 +- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 1 + drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 - drivers/gpu/drm/i915/gt/intel_lrc.c | 287 --- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 5 +- drivers/gpu/drm/i915/gt/selftest_lrc.c| 41 +- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 6 +- drivers/gpu/drm/i915/i915_priolist_types.h| 7 +- drivers/gpu/drm/i915/i915_request.c | 1 + drivers/gpu/drm/i915/i915_scheduler.c | 462 +++--- drivers/gpu/drm/i915/i915_scheduler.h | 29 +-
[Intel-gfx] [PATCH 13/42] drm/i915/gt: Decouple completed requests on unwind
Since the introduction of preempt-to-busy, requests can complete in the background, even while they are not on the engine->active.requests list. As such, the engine->active.request list itself is not in strict retirement order, and we have to scan the entire list while unwinding to not miss any. However, if the request is completed we currently leave it on the list [until retirement], but we could just as simply remove it and stop treating it as active. We would only have to then traverse it once while unwinding in quick succession. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 6 -- drivers/gpu/drm/i915/i915_request.c | 3 ++- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index cb04bc5474be..f9aa44f222db 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) list_for_each_entry_safe_reverse(rq, rn, >active.requests, sched.link) { - if (i915_request_completed(rq)) - continue; /* XXX */ + if (i915_request_completed(rq)) { + list_del_init(>sched.link); + continue; + } __i915_request_unsubmit(rq); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 1cd102591b36..ebfa195d6463 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -319,7 +319,8 @@ bool i915_request_retire(struct i915_request *rq) * after removing the breadcrumb and signaling it, so that we do not * inadvertently attach the breadcrumb to a completed request. */ - remove_from_engine(rq); + if (!list_empty(>sched.link)) + remove_from_engine(rq); GEM_BUG_ON(!llist_empty(>execute_cb)); __list_del_entry(>link); /* poison neither prev/next (RCU walks) */ -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 01/42] drm/i915: Fix wrong return value
From: Tianjia Zhang In function i915_active_acquire_preallocate_barrier(), not all paths have the return value set correctly, and in case of memory allocation failure, a negative error code should be returned. Cc: Chris Wilson Signed-off-by: Tianjia Zhang Reviewed-by: Chris Wilson Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_active.c| 7 ++- drivers/gpu/drm/i915/selftests/i915_request.c | 2 +- 2 files changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 3ee5e02c3e17..b0a6522be3d1 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -889,7 +889,6 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, intel_engine_mask_t tmp, mask = engine->mask; struct llist_node *first = NULL, *last = NULL; struct intel_gt *gt = engine->gt; - int err; GEM_BUG_ON(i915_active_is_idle(ref)); @@ -914,10 +913,8 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, rcu_read_unlock(); if (!node) { node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL); - if (!node) { - err = ENOMEM; + if (!node) goto unwind; - } RCU_INIT_POINTER(node->base.fence, NULL); node->base.cb.func = node_retire; @@ -965,7 +962,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, kmem_cache_free(global.slab_cache, node); } - return err; + return -ENOMEM; } void i915_active_acquire_barrier(struct i915_active *ref) diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index 57dd6f5122ee..c1dcd4b91bda 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -331,7 +331,7 @@ static int __igt_breadcrumbs_smoketest(void *arg) if (!wait) { i915_sw_fence_commit(submit); heap_fence_put(submit); - err = ENOMEM; + err = -ENOMEM; break; } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 27/42] drm/i915: Strip out internal priorities
Since we are not using any internal priority levels, and in the next few patches will introduce a new index for which the optimisation is not so lear cut, discard the small table within the priolist. Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 2 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 22 ++-- drivers/gpu/drm/i915/gt/selftest_lrc.c| 2 - .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 6 +-- drivers/gpu/drm/i915/i915_priolist_types.h| 8 +-- drivers/gpu/drm/i915/i915_scheduler.c | 51 +++ drivers/gpu/drm/i915/i915_scheduler.h | 18 ++- 7 files changed, 21 insertions(+), 88 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index 5a0196dc5716..469639a2cc30 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -126,7 +126,7 @@ static void heartbeat(struct work_struct *wrk) * low latency and no jitter] the chance to naturally * complete before being preempted. */ - attr.priority = I915_PRIORITY_MASK; + attr.priority = 0; if (rq->sched.attr.priority >= attr.priority) attr.priority |= I915_USER_PRIORITY(I915_PRIORITY_HEARTBEAT); if (rq->sched.attr.priority >= attr.priority) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 3c022e621a38..e8f6d0a80c8e 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -437,22 +437,13 @@ static int effective_prio(const struct i915_request *rq) static int queue_prio(const struct intel_engine_execlists *execlists) { - struct i915_priolist *p; struct rb_node *rb; rb = rb_first_cached(>queue); if (!rb) return INT_MIN; - /* -* As the priolist[] are inverted, with the highest priority in [0], -* we have to flip the index value to become priority. -*/ - p = to_priolist(rb); - if (!I915_USER_PRIORITY_SHIFT) - return p->priority; - - return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used); + return to_priolist(rb)->priority; } static int virtual_prio(const struct intel_engine_execlists *el) @@ -2249,9 +2240,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine) while ((rb = rb_first_cached(>queue))) { struct i915_priolist *p = to_priolist(rb); struct i915_request *rq, *rn; - int i; - priolist_for_each_request_consume(rq, rn, p, i) { + priolist_for_each_request_consume(rq, rn, p) { bool merge = true; /* @@ -4253,9 +4243,8 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine) /* Flush the queued requests to the timeline list (for retiring). */ while ((rb = rb_first_cached(>queue))) { struct i915_priolist *p = to_priolist(rb); - int i; - priolist_for_each_request_consume(rq, rn, p, i) { + priolist_for_each_request_consume(rq, rn, p) { mark_eio(rq); __i915_request_submit(rq); } @@ -5296,7 +5285,7 @@ static int __execlists_context_alloc(struct intel_context *ce, static struct list_head *virtual_queue(struct virtual_engine *ve) { - return >base.execlists.default_priolist.requests[0]; + return >base.execlists.default_priolist.requests; } static void virtual_context_destroy(struct kref *kref) @@ -5863,9 +5852,8 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine, count = 0; for (rb = rb_first_cached(>queue); rb; rb = rb_next(rb)) { struct i915_priolist *p = rb_entry(rb, typeof(*p), node); - int i; - priolist_for_each_request(rq, p, i) { + priolist_for_each_request(rq, p) { if (count++ < max - 1) show_request(m, rq, "\t\tQ "); else diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 79c4a2d80770..037145b51459 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -1102,7 +1102,6 @@ create_rewinder(struct intel_context *ce, intel_ring_advance(rq, cs); - rq->sched.attr.priority = I915_PRIORITY_MASK; err = 0; err: i915_request_get(rq); @@ -5362,7 +5361,6 @@ create_timestamp(struct intel_context *ce, void *slot, int idx) intel_ring_advance(rq, cs); - rq->sched.attr.priority = I915_PRIORITY_MASK;
[Intel-gfx] [PATCH 32/42] drm/i915/gt: Do not suspend bonded requests if one hangs
Treat the dependency between bonded requests as weak and leave the remainder of the pair on the GPU if one hangs. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index fad58f227bca..3999bf6c6aee 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2700,6 +2700,9 @@ static void __execlists_hold(struct i915_request *rq) struct i915_request *w = container_of(p->waiter, typeof(*w), sched); + if (p->flags & I915_DEPENDENCY_WEAK) + continue; + /* Leave semaphores spinning on the other engines */ if (w->engine != rq->engine) continue; @@ -2795,6 +2798,9 @@ static void __execlists_unhold(struct i915_request *rq) struct i915_request *w = container_of(p->waiter, typeof(*w), sched); + if (p->flags & I915_DEPENDENCY_WEAK) + continue; + /* Propagate any change in error status */ if (rq->fence.error) i915_request_set_error_once(w, rq->fence.error); -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 02/42] drm/i915/gem: Don't drop the timeline lock during execbuf
Our timeline lock is our defence against a concurrent execbuf interrupting our request construction. we need hold it throughout or, for example, a second thread may interject a relocation request in between our own relocation request and execution in the ring. A second, major benefit, is that it allows us to preserve a large chunk of the ringbuffer for our exclusive use; which should virtually eliminate the threat of hitting a wait_for_space during request construction -- although we should have already dropped other contentious locks at that point. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 414 +++--- .../i915/gem/selftests/i915_gem_execbuffer.c | 29 +- 2 files changed, 288 insertions(+), 155 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 07cb2dd0f795..89c7b33dd99f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -259,6 +259,8 @@ struct i915_execbuffer { bool has_fence : 1; bool needs_unfenced : 1; + struct intel_context *ce; + struct i915_vma *target; struct i915_request *rq; struct i915_vma *rq_vma; @@ -639,6 +641,35 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb, return 0; } +static void retire_requests(struct intel_timeline *tl) +{ + struct i915_request *rq, *rn; + + list_for_each_entry_safe(rq, rn, >requests, link) + if (!i915_request_retire(rq)) + break; +} + +static int wait_for_timeline(struct intel_timeline *tl) +{ + do { + struct dma_fence *fence; + int err; + + fence = i915_active_fence_get(>last_request); + if (!fence) + return 0; + + err = dma_fence_wait(fence, true); + dma_fence_put(fence); + if (err) + return err; + + /* Retiring may trigger a barrier, requiring an extra pass */ + retire_requests(tl); + } while (1); +} + static int eb_reserve(struct i915_execbuffer *eb) { const unsigned int count = eb->buffer_count; @@ -646,7 +677,6 @@ static int eb_reserve(struct i915_execbuffer *eb) struct list_head last; struct eb_vma *ev; unsigned int i, pass; - int err = 0; /* * Attempt to pin all of the buffers into the GTT. @@ -662,18 +692,37 @@ static int eb_reserve(struct i915_execbuffer *eb) * room for the earlier objects *unless* we need to defragment. */ - if (mutex_lock_interruptible(>i915->drm.struct_mutex)) - return -EINTR; - pass = 0; do { + int err = 0; + + /* +* We need to hold one lock as we bind all the vma so that +* we have a consistent view of the entire vm and can plan +* evictions to fill the whole GTT. If we allow a second +* thread to run as we do this, it will either unbind +* everything we want pinned, or steal space that we need for +* ourselves. The closer we are to a full GTT, the more likely +* such contention will cause us to fail to bind the workload +* for this batch. Since we know at this point we need to +* find space for new buffers, we know that extra pressure +* from contention is likely. +* +* In lieu of being able to hold vm->mutex for the entire +* sequence (it's complicated!), we opt for struct_mutex. +*/ + if (mutex_lock_interruptible(>i915->drm.struct_mutex)) + return -EINTR; + list_for_each_entry(ev, >unbound, bind_link) { err = eb_reserve_vma(eb, ev, pin_flags); if (err) break; } - if (!(err == -ENOSPC || err == -EAGAIN)) - break; + if (!(err == -ENOSPC || err == -EAGAIN)) { + mutex_unlock(>i915->drm.struct_mutex); + return err; + } /* Resort *all* the objects into priority order */ INIT_LIST_HEAD(>unbound); @@ -702,38 +751,50 @@ static int eb_reserve(struct i915_execbuffer *eb) list_add_tail(>bind_link, ); } list_splice_tail(, >unbound); + mutex_unlock(>i915->drm.struct_mutex); if (err == -EAGAIN) { - mutex_unlock(>i915->drm.struct_mutex); flush_workqueue(eb->i915->mm.userptr_wq); -
[Intel-gfx] [PATCH 36/42] drm/i915: Improve DFS for priority inheritance
The core of the scheduling algorithm is that we compute the topological order of the fence DAG. Knowing that we have a DAG, we should be able to use a DFS to compute the topological sort in linear time. However, during the conversion of the recursive algorithm into an iterative one, the memorization of how far we had progressed down a branch was forgotten. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_scheduler.c | 57 --- 1 file changed, 33 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 3c1a0b001746..ca681e6d9c6d 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -287,15 +287,33 @@ static void ipi_priority(struct i915_dependency *p, int prio) spin_unlock(_lock); } +static struct i915_request * +stack_push(struct i915_request *rq, + struct i915_request *stack, + struct list_head *pos) +{ + stack->sched.dfs.prev = pos; + rq->sched.dfs.next = (struct list_head *)stack; + return rq; +} + +static struct i915_request *stack_pop(struct i915_request *rq, + struct list_head **pos) +{ + rq = (struct i915_request *)rq->sched.dfs.next; + if (rq) + *pos = rq->sched.dfs.prev; + return rq; +} + static void __i915_request_set_priority(struct i915_request *rq, int prio) { struct intel_engine_cs *engine = rq->engine; - struct i915_request *rn; + struct list_head *pos = >sched.signalers_list; struct list_head *plist; - LIST_HEAD(dfs); lockdep_assert_held(>active.lock); - list_add(>sched.dfs, ); + plist = i915_sched_lookup_priolist(engine, prio); /* * Recursively bump all dependent priorities to match the new request. @@ -315,40 +333,31 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio) * end result is a topological list of requests in reverse order, the * last element in the list is the request we must execute first. */ - list_for_each_entry(rq, , sched.dfs) { - struct i915_dependency *p; - - /* Also release any children on this engine that are ready */ - GEM_BUG_ON(rq->engine != engine); - - for_each_signaler(p, rq) { + rq->sched.dfs.next = NULL; + do { + list_for_each_continue(pos, >sched.signalers_list) { + struct i915_dependency *p = + list_entry(pos, typeof(*p), signal_link); struct i915_request *s = container_of(p->signaler, typeof(*s), sched); - GEM_BUG_ON(s == rq); - if (rq_prio(s) >= prio) continue; if (i915_request_completed(s)) continue; - if (s->engine != rq->engine) { + if (s->engine != engine) { ipi_priority(p, prio); continue; } - list_move_tail(>sched.dfs, ); + /* Remember our position along this branch */ + rq = stack_push(s, rq, pos); + pos = >sched.signalers_list; } - } - - plist = i915_sched_lookup_priolist(engine, prio); - - /* Fifo and depth-first replacement ensure our deps execute first */ - list_for_each_entry_safe_reverse(rq, rn, , sched.dfs) { - GEM_BUG_ON(rq->engine != engine); - INIT_LIST_HEAD(>sched.dfs); + RQ_TRACE(rq, "set-priority:%d\n", prio); WRITE_ONCE(rq->sched.attr.priority, prio); /* @@ -362,12 +371,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio) if (!i915_request_is_ready(rq)) continue; + GEM_BUG_ON(rq->engine != engine); if (i915_request_in_priority_queue(rq)) list_move_tail(>sched.link, plist); /* Defer (tasklet) submission until after all updates. */ kick_submission(engine, rq, prio); - } + } while ((rq = stack_pop(rq, ))); } void i915_request_set_priority(struct i915_request *rq, int prio) @@ -435,7 +445,6 @@ void i915_sched_node_init(struct i915_sched_node *node) INIT_LIST_HEAD(>signalers_list); INIT_LIST_HEAD(>waiters_list); INIT_LIST_HEAD(>link); - INIT_LIST_HEAD(>dfs); i915_sched_node_reinit(node); } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 08/42] drm/i915: Drop i915_request.lock serialisation around await_start
Originally, we used the signal->lock as a means of following the previous link in its timeline and peeking at the previous fence. However, we have replaced the explicit serialisation with a series of very careful probes that anticipate the links being deleted and the fences recycled before we are able to acquire a strong reference to it. We do not need the signal->lock crutch anymore, nor want the contention. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_request.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 0208e917d14a..f7865e22afa9 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -954,9 +954,16 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal) if (i915_request_started(signal)) return 0; + /* +* The caller holds a reference on @signal, but we do not serialise +* against it being retired and removed from the lists. +* +* We do not hold a reference to the request before @signal, and +* so must be very careful to ensure that it is not _recycled_ as +* we follow the link backwards. +*/ fence = NULL; rcu_read_lock(); - spin_lock_irq(>lock); do { struct list_head *pos = READ_ONCE(signal->link.prev); struct i915_request *prev; @@ -987,7 +994,6 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal) fence = >fence; } while (0); - spin_unlock_irq(>lock); rcu_read_unlock(); if (!fence) return 0; -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 09/42] drm/i915: Drop i915_request.lock requirement for intel_rps_boost()
Since we use a flag within i915_request.flags to indicate when we have boosted the request (so that we only apply the boost) once, this can be used as the serialisation with i915_request_retire() to avoid having to explicitly take the i915_request.lock which is more heavily contended. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_rps.c | 15 ++- drivers/gpu/drm/i915/i915_request.c | 4 +--- 2 files changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index e6a00eea0631..2a43e216e0d4 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -889,17 +889,15 @@ void intel_rps_park(struct intel_rps *rps) void intel_rps_boost(struct i915_request *rq) { - struct intel_rps *rps = _ONCE(rq->engine)->gt->rps; - unsigned long flags; - - if (i915_request_signaled(rq) || !intel_rps_is_active(rps)) + if (i915_request_signaled(rq) || i915_request_has_waitboost(rq)) return; /* Serializes with i915_request_retire() */ - spin_lock_irqsave(>lock, flags); - if (!i915_request_has_waitboost(rq) && - !dma_fence_is_signaled_locked(>fence)) { - set_bit(I915_FENCE_FLAG_BOOST, >fence.flags); + if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, >fence.flags)) { + struct intel_rps *rps = _ONCE(rq->engine)->gt->rps; + + if (!intel_rps_is_active(rps)) + return; GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", rq->fence.context, rq->fence.seqno); @@ -910,7 +908,6 @@ void intel_rps_boost(struct i915_request *rq) atomic_inc(>boosts); } - spin_unlock_irqrestore(>lock, flags); } int intel_rps_set(struct intel_rps *rps, u8 val) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index f7865e22afa9..67267b0d5b06 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -306,10 +306,8 @@ bool i915_request_retire(struct i915_request *rq) spin_unlock_irq(>lock); } - if (i915_request_has_waitboost(rq)) { - GEM_BUG_ON(!atomic_read(>engine->gt->rps.num_waiters)); + if (test_and_set_bit(I915_FENCE_FLAG_BOOST, >fence.flags)) atomic_dec(>engine->gt->rps.num_waiters); - } /* * We only loosely track inflight requests across preemption, -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 41/42] drm/i915: Move saturated workload detection back to the context
When we introduced the saturated workload detection to tell us to back off from semaphore usage [semaphores have a noticeable impact on contended bus cycles with the CPU for some heavy workloads], we first introduced it as a per-context tracker. This allows individual contexts to try and optimise their own usage, but we found that with the local tracking and the no-semaphore boosting, the first context to disable semaphores got a massive priority boost and so would starve the rest and all new contexts (as they started with semaphores enabled and lower priority). Hence we moved the saturated workload detection to the engine, and a consequence had to disable semaphores on virtual engines. Now that we do not have semaphore priority boosting, and try to fairly schedule irrespective of semaphore usage, we can move the tracking back to the context and virtual engines can now utilise the faster inter-engine synchronisation. If we see that any context fairs to use the semaphore, because the system is oversubscribed and was busy doing something else instead of spinning on the semaphore, we disable further usage of semaphores with that context until it idles again. This should restrict the semaphores to lightly utilised system where the latency between requests is more noticeable, and curtail the bus-contention from checking for signaled semaphores. References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global") Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context.c | 2 ++ drivers/gpu/drm/i915/gt/intel_context_types.h | 2 ++ drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 -- drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 -- drivers/gpu/drm/i915/gt/intel_lrc.c | 15 --- drivers/gpu/drm/i915/i915_request.c | 4 ++-- 6 files changed, 6 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index cde356c7754d..641e705c9289 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -115,6 +115,8 @@ static int __intel_context_active(struct i915_active *active) CE_TRACE(ce, "active\n"); + ce->saturated = 0; + intel_context_get(ce); err = __ring_active(ce->ring); diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 9a28339b6d74..58fb40d44c85 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -79,6 +79,8 @@ struct intel_context { } lrc; u32 tag; /* cookie passed to HW to track this context on submission */ + intel_engine_mask_t saturated; /* submitting semaphores too late? */ + /* Time on GPU as tracked by the hw. */ struct { struct ewma_runtime avg; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index ac56df945846..fa7b855902f9 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -232,8 +232,6 @@ static int __engine_park(struct intel_wakeref *wf) struct intel_engine_cs *engine = container_of(wf, typeof(*engine), wakeref); - engine->saturated = 0; - /* * If one and only one request is completed between pm events, * we know that we are inside the kernel context and it is diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 742db055d3d6..a200e2f85e7b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -325,8 +325,6 @@ struct intel_engine_cs { struct intel_context *kernel_context; /* pinned */ - intel_engine_mask_t saturated; /* submitting semaphores too late? */ - struct { struct delayed_work work; struct i915_request *systole; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index feec1b841cd3..36ab955545a0 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -5504,21 +5504,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings, ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL; ve->base.uabi_instance = I915_ENGINE_CLASS_INVALID_VIRTUAL; - /* -* The decision on whether to submit a request using semaphores -* depends on the saturated state of the engine. We only compute -* this during HW submission of the request, and we need for this -* state to be globally applied to all requests being submitted -* to this engine. Virtual engines encompass more than one physical -* engine and so we cannot accurately tell in advance if one of those -* engines is already saturated and so cannot afford to use a
[Intel-gfx] [PATCH 14/42] drm/i915/gt: Check for a completed last request once
Pull the repeated check for the last active request being completed to a single spot, when deciding whether or not execlist preemption is required. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 15 --- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index f9aa44f222db..a870fd243e44 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2148,12 +2148,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine) */ if ((last = *active)) { - if (need_preempt(engine, last, rb)) { - if (i915_request_completed(last)) { - tasklet_hi_schedule(>tasklet); - return; - } - + if (i915_request_completed(last)) { + goto check_secondary; + } else if (need_preempt(engine, last, rb)) { ENGINE_TRACE(engine, "preempting last=%llx:%lld, prio=%d, hint=%d\n", last->fence.context, @@ -2181,11 +2178,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine) last = NULL; } else if (need_timeslice(engine, last, rb) && timeslice_expired(execlists, last)) { - if (i915_request_completed(last)) { - tasklet_hi_schedule(>tasklet); - return; - } - ENGINE_TRACE(engine, "expired last=%llx:%lld, prio=%d, hint=%d, yield?=%s\n", last->fence.context, @@ -2221,6 +2213,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * we hopefully coalesce several updates into a single * submission. */ +check_secondary: if (!list_is_last(>sched.link, >active.requests)) { /* -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 37/42] drm/i915/gt: Remove timeslice suppression
In the next patch, we remove the strict priority system and continuously re-evaluate the relative priority of tasks. As such we need to enable the timeslice whenever there is more than one context in the pipeline. This simplifies the decision and removes some of the tweaks to suppress timeslicing, allowing us to lift the timeslice enabling to a common spot at the end of running the submission tasklet. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_types.h | 10 -- drivers/gpu/drm/i915/gt/intel_lrc.c | 148 +++ 2 files changed, 53 insertions(+), 105 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index bccde0e27300..de85b969a79e 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -237,16 +237,6 @@ struct intel_engine_execlists { */ unsigned int port_mask; - /** -* @switch_priority_hint: Second context priority. -* -* We submit multiple contexts to the HW simultaneously and would -* like to occasionally switch between them to emulate timeslicing. -* To know when timeslicing is suitable, we track the priority of -* the context submitted second. -*/ - int switch_priority_hint; - /** * @queue_priority_hint: Highest pending priority. * diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index cb53f350848f..6b06cddcfd9a 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1885,25 +1885,6 @@ static void defer_active(struct intel_engine_cs *engine) defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq))); } -static bool -need_timeslice(const struct intel_engine_cs *engine, - const struct i915_request *rq) -{ - int hint; - - if (!intel_engine_has_timeslices(engine)) - return false; - - hint = max(engine->execlists.queue_priority_hint, - virtual_prio(>execlists)); - - if (!list_is_last(>sched.link, >active.requests)) - hint = max(hint, rq_prio(list_next_entry(rq, sched.link))); - - GEM_BUG_ON(hint >= I915_PRIORITY_UNPREEMPTABLE); - return hint >= effective_prio(rq); -} - static bool timeslice_yield(const struct intel_engine_execlists *el, const struct i915_request *rq) @@ -1923,76 +1904,63 @@ timeslice_yield(const struct intel_engine_execlists *el, return rq->context->lrc.ccid == READ_ONCE(el->yield); } -static bool -timeslice_expired(const struct intel_engine_execlists *el, - const struct i915_request *rq) +static bool needs_timeslice(const struct intel_engine_cs *engine, + const struct i915_request *rq) { - return timer_expired(>timer) || timeslice_yield(el, rq); -} + /* If not currently active, or about to switch, wait for next event */ + if (!rq || i915_request_completed(rq)) + return false; -static int -switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq) -{ - if (list_is_last(>sched.link, >active.requests)) - return engine->execlists.queue_priority_hint; + /* We do not need to start the timeslice until after the ACK */ + if (READ_ONCE(engine->execlists.pending[0])) + return false; - return rq_prio(list_next_entry(rq, sched.link)); -} + /* If ELSP[1] is occupied, always check to see if worth slicing */ + if (!list_is_last_rcu(>sched.link, >active.requests)) + return true; -static inline unsigned long -timeslice(const struct intel_engine_cs *engine) -{ - return READ_ONCE(engine->props.timeslice_duration_ms); + /* Otherwise, ELSP[0] is by itself, but may be waiting in the queue */ + if (!RB_EMPTY_ROOT(>execlists.queue.rb_root)) + return true; + + return !RB_EMPTY_ROOT(>execlists.virtual.rb_root); } -static unsigned long active_timeslice(const struct intel_engine_cs *engine) +static bool +timeslice_expired(struct intel_engine_cs *engine, const struct i915_request *rq) { - const struct intel_engine_execlists *execlists = >execlists; - const struct i915_request *rq = *execlists->active; + const struct intel_engine_execlists *el = >execlists; - if (!rq || i915_request_completed(rq)) - return 0; + if (!intel_engine_has_timeslices(engine)) + return false; - if (READ_ONCE(execlists->switch_priority_hint) < effective_prio(rq)) - return 0; + if (i915_request_has_nopreempt(rq) && i915_request_started(rq)) + return false; + + if (!needs_timeslice(engine, rq)) + return false; - return timeslice(engine); + return timer_expired(>timer) || timeslice_yield(el, rq); }
[Intel-gfx] [PATCH 15/42] drm/i915/gt: Refactor heartbeat request construction and submission
Pull the individual strands of creating a custom heartbeat requests into a pair of common functions. This will reduce the number of changes we will need to make in future. Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 56 +-- 1 file changed, 38 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index 8ffdf676c0a0..eb4393b2342e 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -37,12 +37,33 @@ static bool next_heartbeat(struct intel_engine_cs *engine) return true; } +static struct i915_request * +heartbeat_create(struct intel_context *ce, gfp_t gfp) +{ + struct i915_request *rq; + + intel_context_enter(ce); + rq = __i915_request_create(ce, gfp); + intel_context_exit(ce); + + return rq; +} + static void idle_pulse(struct intel_engine_cs *engine, struct i915_request *rq) { engine->wakeref_serial = READ_ONCE(engine->serial) + 1; i915_request_add_active_barriers(rq); } +static void heartbeat_commit(struct i915_request *rq, +const struct i915_sched_attr *attr) +{ + idle_pulse(rq->engine, rq); + + __i915_request_commit(rq); + __i915_request_queue(rq, attr); +} + static void show_heartbeat(const struct i915_request *rq, struct intel_engine_cs *engine) { @@ -137,18 +158,14 @@ static void heartbeat(struct work_struct *wrk) goto out; } - intel_context_enter(ce); - rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN); - intel_context_exit(ce); + rq = heartbeat_create(ce, GFP_NOWAIT | __GFP_NOWARN); if (IS_ERR(rq)) goto unlock; - idle_pulse(engine, rq); if (engine->i915->params.enable_hangcheck) engine->heartbeat.systole = i915_request_get(rq); - __i915_request_commit(rq); - __i915_request_queue(rq, ); + heartbeat_commit(rq, ); unlock: mutex_unlock(>timeline->mutex); @@ -220,19 +237,14 @@ int intel_engine_pulse(struct intel_engine_cs *engine) goto out_rpm; } - intel_context_enter(ce); - rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN); - intel_context_exit(ce); + rq = heartbeat_create(ce, GFP_NOWAIT | __GFP_NOWARN); if (IS_ERR(rq)) { err = PTR_ERR(rq); goto out_unlock; } __set_bit(I915_FENCE_FLAG_SENTINEL, >fence.flags); - idle_pulse(engine, rq); - - __i915_request_commit(rq); - __i915_request_queue(rq, ); + heartbeat_commit(rq, ); GEM_BUG_ON(rq->sched.attr.priority < I915_PRIORITY_BARRIER); err = 0; @@ -245,8 +257,9 @@ int intel_engine_pulse(struct intel_engine_cs *engine) int intel_engine_flush_barriers(struct intel_engine_cs *engine) { + struct intel_context *ce = engine->kernel_context; struct i915_request *rq; - int err = 0; + int err; if (llist_empty(>barrier_tasks)) return 0; @@ -254,15 +267,22 @@ int intel_engine_flush_barriers(struct intel_engine_cs *engine) if (!intel_engine_pm_get_if_awake(engine)) return 0; - rq = i915_request_create(engine->kernel_context); + if (mutex_lock_interruptible(>timeline->mutex)) { + err = -EINTR; + goto out_rpm; + } + + rq = heartbeat_create(ce, GFP_KERNEL); if (IS_ERR(rq)) { err = PTR_ERR(rq); - goto out_rpm; + goto out_unlock; } - idle_pulse(engine, rq); - i915_request_add(rq); + heartbeat_commit(rq, NULL); + err = 0; +out_unlock: + mutex_unlock(>timeline->mutex); out_rpm: intel_engine_pm_put(engine); return err; -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for Add generic i915_ggtt ballooning support
== Series Details == Series: Add generic i915_ggtt ballooning support URL : https://patchwork.freedesktop.org/series/80177/ State : success == Summary == CI Bug Log - changes from CI_DRM_8832 -> Patchwork_18294 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/index.html Known issues Here are the changes found in Patchwork_18294 that come from known issues: ### IGT changes ### Issues hit * igt@gem_exec_suspend@basic-s3: - fi-tgl-u2: [PASS][1] -> [FAIL][2] ([i915#1888]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-tgl-u2/igt@gem_exec_susp...@basic-s3.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-tgl-u2/igt@gem_exec_susp...@basic-s3.html * igt@i915_module_load@reload: - fi-byt-j1900: [PASS][3] -> [DMESG-WARN][4] ([i915#1982]) +1 similar issue [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-byt-j1900/igt@i915_module_l...@reload.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-byt-j1900/igt@i915_module_l...@reload.html Possible fixes * igt@i915_module_load@reload: - fi-bxt-dsi: [DMESG-WARN][5] ([i915#1635] / [i915#1982]) -> [PASS][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bxt-dsi/igt@i915_module_l...@reload.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-bxt-dsi/igt@i915_module_l...@reload.html * igt@i915_selftest@live@active: - {fi-ehl-1}: [DMESG-FAIL][7] ([i915#541]) -> [PASS][8] [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-ehl-1/igt@i915_selftest@l...@active.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-ehl-1/igt@i915_selftest@l...@active.html * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic: - fi-bsw-kefka: [DMESG-WARN][9] ([i915#1982]) -> [PASS][10] [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1: - fi-icl-u2: [DMESG-WARN][11] ([i915#1982]) -> [PASS][12] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html Warnings * igt@gem_exec_suspend@basic-s3: - fi-kbl-x1275: [DMESG-WARN][13] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][14] ([i915#62] / [i915#92]) +2 similar issues [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@gem_exec_susp...@basic-s3.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-kbl-x1275/igt@gem_exec_susp...@basic-s3.html * igt@kms_flip@basic-flip-vs-dpms@a-dp1: - fi-kbl-x1275: [DMESG-WARN][15] ([i915#62] / [i915#92]) -> [DMESG-WARN][16] ([i915#62] / [i915#92] / [i915#95]) +2 similar issues [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8832/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-d...@a-dp1.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18294/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-d...@a-dp1.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635 [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888 [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982 [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541 [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62 [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92 [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95 Participating hosts (42 -> 35) -- Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus Build changes - * Linux: CI_DRM_8832 -> Patchwork_18294 CI-20190529: 20190529 CI_DRM_8832: 00b9a4b2331e1dceef8994e3b144e2c5e8c55003 @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5757: d78c7fd293cb228fe03ccff730202b033e25ff18 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_18294: c9d01ff1c89f338f52fd60f7261fe51395897a3e @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == c9d01ff1c89f drm/i915/vgt: Move VGT GGTT ballooning nodes to i915_ggtt ab86294c92b2 drm/i915/ggtt: Add generic i915_ggtt ballooning support == Logs == For more details see:
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Add generic i915_ggtt ballooning support
== Series Details == Series: Add generic i915_ggtt ballooning support URL : https://patchwork.freedesktop.org/series/80177/ State : warning == Summary == $ dim sparse --fast origin/drm-tip Sparse version: v0.6.0 Fast mode used, each commit won't be checked separately. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] drm/i915/ggtt: Add generic i915_ggtt ballooning support
Quoting Michal Wajdeczko (2020-08-02 16:34:09) > Reserving part of the GGTT for the GuC requires same steps > as in VGT GGTT ballooning. Add generic GGTT ballooning > helpers to intel_ggtt.c to avoid code duplication. > > Signed-off-by: Michal Wajdeczko > Cc: Xiong Zhang > Cc: Chris Wilson > Cc: Jani Nikula > --- > drivers/gpu/drm/i915/gt/intel_ggtt.c | 69 ++-- > drivers/gpu/drm/i915/gt/intel_gtt.h | 4 ++ > drivers/gpu/drm/i915/i915_vgpu.c | 64 +- > 3 files changed, 70 insertions(+), 67 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c > b/drivers/gpu/drm/i915/gt/intel_ggtt.c > index 33a3f627ddb1..7001252b4703 100644 > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c > @@ -462,29 +462,17 @@ static void ggtt_unbind_vma(struct i915_address_space > *vm, struct i915_vma *vma) > > static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt) > { > - u64 size; > - int ret; > - > if (!intel_uc_uses_guc(>vm.gt->uc)) > return 0; > > GEM_BUG_ON(ggtt->vm.total <= GUC_GGTT_TOP); > - size = ggtt->vm.total - GUC_GGTT_TOP; > - > - ret = i915_gem_gtt_reserve(>vm, >uc_fw, size, > - GUC_GGTT_TOP, I915_COLOR_UNEVICTABLE, > - PIN_NOEVICT); > - if (ret) > - drm_dbg(>vm.i915->drm, > - "Failed to reserve top of GGTT for GuC\n"); > - > - return ret; > + return i915_ggtt_balloon(ggtt, GUC_GGTT_TOP, ggtt->vm.total, > +>uc_fw); I still don't buy this, this is definitely not "ballooning". And I'm yet to be convinced that ballooning is a central concept to the i915_ggtt itself and not a client coordination facility on top. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/2] drm/i915/vgt: Move VGT GGTT ballooning nodes to i915_ggtt
Since VGT ballooning nodes are GGTT specific, we can move them to i915_ggtt struct close to some other similar nodes. This way we drop another place in driver that uses static data. Signed-off-by: Michal Wajdeczko Cc: Xiong Zhang Cc: Chris Wilson Cc: Jani Nikula --- drivers/gpu/drm/i915/gt/intel_gtt.h | 1 + drivers/gpu/drm/i915/i915_vgpu.c| 27 --- 2 files changed, 9 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 111306f2f8d6..72780543c9c6 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -336,6 +336,7 @@ struct i915_ggtt { struct mutex error_mutex; struct drm_mm_node error_capture; struct drm_mm_node uc_fw; + struct drm_mm_node balloon[4]; }; struct i915_ppgtt { diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c index f505142d6dfc..6eb7657fb2ca 100644 --- a/drivers/gpu/drm/i915/i915_vgpu.c +++ b/drivers/gpu/drm/i915/i915_vgpu.c @@ -134,17 +134,6 @@ bool intel_vgpu_has_huge_gtt(struct drm_i915_private *dev_priv) return dev_priv->vgpu.caps & VGT_CAPS_HUGE_GTT; } -struct _balloon_info_ { - /* -* There are up to 2 regions per mappable/unmappable graphic -* memory that might be ballooned. Here, index 0/1 is for mappable -* graphic memory, 2/3 for unmappable graphic memory. -*/ - struct drm_mm_node space[4]; -}; - -static struct _balloon_info_ bl_info; - /** * intel_vgt_deballoon - deballoon reserved graphics address trunks * @ggtt: the global GGTT from which we reserved earlier @@ -163,7 +152,7 @@ void intel_vgt_deballoon(struct i915_ggtt *ggtt) drm_dbg(_priv->drm, "VGT deballoon.\n"); for (i = 0; i < 4; i++) - i915_ggtt_deballoon(ggtt, _info.space[i]); + i915_ggtt_deballoon(ggtt, >balloon[i]); } /** @@ -253,7 +242,7 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt) /* Unmappable graphic memory ballooning */ if (unmappable_base > ggtt->mappable_end) { ret = i915_ggtt_balloon(ggtt, ggtt->mappable_end, - unmappable_base, _info.space[2]); + unmappable_base, >balloon[2]); if (ret) goto err; @@ -261,7 +250,7 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt) if (unmappable_end < ggtt_end) { ret = i915_ggtt_balloon(ggtt, unmappable_end, ggtt_end, - _info.space[3]); + >balloon[3]); if (ret) goto err_upon_mappable; } @@ -269,7 +258,7 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt) /* Mappable graphic memory ballooning */ if (mappable_base) { ret = i915_ggtt_balloon(ggtt, 0, mappable_base, - _info.space[0]); + >balloon[0]); if (ret) goto err_upon_unmappable; @@ -277,7 +266,7 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt) if (mappable_end < ggtt->mappable_end) { ret = i915_ggtt_balloon(ggtt, mappable_end, ggtt->mappable_end, - _info.space[1]); + >balloon[1]); if (ret) goto err_below_mappable; @@ -287,11 +276,11 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt) return 0; err_below_mappable: - i915_ggtt_deballoon(ggtt, _info.space[0]); + i915_ggtt_deballoon(ggtt, >balloon[0]); err_upon_unmappable: - i915_ggtt_deballoon(ggtt, _info.space[3]); + i915_ggtt_deballoon(ggtt, >balloon[3]); err_upon_mappable: - i915_ggtt_deballoon(ggtt, _info.space[2]); + i915_ggtt_deballoon(ggtt, >balloon[2]); err: drm_err(_priv->drm, "VGT balloon fail\n"); return ret; -- 2.27.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 0/2] Add generic i915_ggtt ballooning support
Rebase forgotten series [1] [1] https://patchwork.freedesktop.org/series/71920/ Michal Wajdeczko (2): drm/i915/ggtt: Add generic i915_ggtt ballooning support drm/i915/vgt: Move VGT GGTT ballooning nodes to i915_ggtt drivers/gpu/drm/i915/gt/intel_ggtt.c | 69 +++-- drivers/gpu/drm/i915/gt/intel_gtt.h | 5 ++ drivers/gpu/drm/i915/i915_vgpu.c | 75 +--- 3 files changed, 71 insertions(+), 78 deletions(-) -- 2.27.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/2] drm/i915/ggtt: Add generic i915_ggtt ballooning support
Reserving part of the GGTT for the GuC requires same steps as in VGT GGTT ballooning. Add generic GGTT ballooning helpers to intel_ggtt.c to avoid code duplication. Signed-off-by: Michal Wajdeczko Cc: Xiong Zhang Cc: Chris Wilson Cc: Jani Nikula --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 69 ++-- drivers/gpu/drm/i915/gt/intel_gtt.h | 4 ++ drivers/gpu/drm/i915/i915_vgpu.c | 64 +- 3 files changed, 70 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 33a3f627ddb1..7001252b4703 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -462,29 +462,17 @@ static void ggtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma) static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt) { - u64 size; - int ret; - if (!intel_uc_uses_guc(>vm.gt->uc)) return 0; GEM_BUG_ON(ggtt->vm.total <= GUC_GGTT_TOP); - size = ggtt->vm.total - GUC_GGTT_TOP; - - ret = i915_gem_gtt_reserve(>vm, >uc_fw, size, - GUC_GGTT_TOP, I915_COLOR_UNEVICTABLE, - PIN_NOEVICT); - if (ret) - drm_dbg(>vm.i915->drm, - "Failed to reserve top of GGTT for GuC\n"); - - return ret; + return i915_ggtt_balloon(ggtt, GUC_GGTT_TOP, ggtt->vm.total, +>uc_fw); } static void ggtt_release_guc_top(struct i915_ggtt *ggtt) { - if (drm_mm_node_allocated(>uc_fw)) - drm_mm_remove_node(>uc_fw); + i915_ggtt_deballoon(ggtt, >uc_fw); } static void cleanup_init_ggtt(struct i915_ggtt *ggtt) @@ -1464,3 +1452,54 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma) } return ret; } + +/** + * i915_ggtt_balloon - reserve fixed space in an GGTT + * @ggtt: the i915_ggtt + * @start: start offset inside the GGTT, + * must be #I915_GTT_MIN_ALIGNMENT aligned + * @end: end offset inside the GGTT, + *must be #I915_GTT_PAGE_SIZE aligned + * @node: the drm_mm_node + * + * i915_ggtt_balloon() tries to reserve the @node from @start to @end inside + * GGTT the address space. + * + * Returns: 0 on success, -ENOSPC if no suitable hole is found. + */ +int i915_ggtt_balloon(struct i915_ggtt *ggtt, u64 start, u64 end, + struct drm_mm_node *node) +{ + u64 size = end - start; + int err; + + GEM_BUG_ON(start >= end); + drm_dbg(>vm.i915->drm, "%sGGTT [%#llx-%#llx] %lluK\n", + "ballooning ", start, end, size / SZ_1K); + + err = i915_gem_gtt_reserve(>vm, node, size, start, + I915_COLOR_UNEVICTABLE, PIN_NOEVICT); + if (unlikely(err)) { + drm_err(>vm.i915->drm, "%sGGTT [%#llx-%#llx] %lluK\n", + "Failed to balloon ", node->start, + node->start + node->size, node->size / SZ_1K); + return err; + } + + ggtt->vm.reserved += node->size; + return 0; +} + +void i915_ggtt_deballoon(struct i915_ggtt *ggtt, struct drm_mm_node *node) +{ + if (!drm_mm_node_allocated(node)) + return; + + drm_dbg(>vm.i915->drm, "%sGGTT [%#llx-%#llx] %lluK\n", + "deballooning ", node->start, node->start + node->size, + node->size / SZ_1K); + + GEM_BUG_ON(ggtt->vm.reserved < node->size); + ggtt->vm.reserved -= node->size; + drm_mm_remove_node(node); +} diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index c13c650ced22..111306f2f8d6 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -495,6 +495,10 @@ static inline bool i915_ggtt_has_aperture(const struct i915_ggtt *ggtt) return ggtt->mappable_end > 0; } +int i915_ggtt_balloon(struct i915_ggtt *ggtt, u64 start, u64 end, + struct drm_mm_node *node); +void i915_ggtt_deballoon(struct i915_ggtt *ggtt, struct drm_mm_node *node); + int i915_ppgtt_init_hw(struct intel_gt *gt); struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt); diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c index 70fca72f5162..f505142d6dfc 100644 --- a/drivers/gpu/drm/i915/i915_vgpu.c +++ b/drivers/gpu/drm/i915/i915_vgpu.c @@ -145,23 +145,6 @@ struct _balloon_info_ { static struct _balloon_info_ bl_info; -static void vgt_deballoon_space(struct i915_ggtt *ggtt, - struct drm_mm_node *node) -{ - struct drm_i915_private *dev_priv = ggtt->vm.i915; - if (!drm_mm_node_allocated(node)) - return; - - drm_dbg(_priv->drm, - "deballoon space: range [0x%llx - 0x%llx] %llu KiB.\n", - node->start, - node->start + node->size, - node->size / 1024); - -
Re: [Intel-gfx] [PATCH v4] drm/kmb: Add support for KeemBay Display
Hi Anitha. On Thu, Jul 30, 2020 at 01:44:44PM -0700, Anitha Chrisanthus wrote: > This is a basic KMS atomic modesetting display driver for KeemBay family of > SOCs. Driver has no 2D or 3D graphics.It calls into the ADV bridge > driver at the connector level. > > Single CRTC with LCD controller->mipi DSI-> ADV bridge > > Only 1080p resolution and single plane is supported at this time. > > v2: moved extern to .h, removed license text > use drm_dev_init, upclassed dev_private, removed HAVE_IRQ. > > v3: Squashed all 59 commits to one > > v4: review changes from Sam Ravnborg > renamed dev_p to kmb > moved clocks under kmb_clock, consolidated clk initializations > use drmm functions > use DRM_GEM_CMA_DRIVER_OPS_VMAP > I have not found time neither energy to take a look at v4. But I applied locally and ran it through checkpatch with my options: checkpatch -q --emacs --strict --show-types --codespell --codespellfile /usr/lib/python3/dist-packages/codespell_lib/data/dictionary.txt The options are from the dim script used when maintaining drm-misc-next with codespell options added. Please try to reproduce locally and fix relevant warnings. Sam -:146: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #146: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:58: + kmb_clr_bitmask_lcd(kmb, LCD_INT_ENABLE, + LCD_INT_VERT_COMP); -:173: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #173: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:85: + drm_info(dev, + "vfp= %d vbp= %d vsyc_len=%d hfp=%d hbp=%d hsync_len=%d\n", -:194: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #194: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:106: + drm_dbg(dev, "%s : %dactive height= %d vbp=%d vfp=%d vsync-w=%d h-active=%d h-bp=%d h-fp=%d hysnc-l=%d", + __func__, __LINE__, -:199: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #199: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:111: + kmb_write_lcd(kmb, LCD_V_ACTIVEHEIGHT, + m->crtc_vdisplay - 1); -:204: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #204: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:116: + kmb_write_lcd(kmb, LCD_H_ACTIVEWIDTH, + m->crtc_hdisplay - 1); -:217: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #217: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:129: + kmb_write_lcd(kmb, + LCD_V_BACKPORCH_EVEN, vm.vback_porch); -:219: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #219: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:131: + kmb_write_lcd(kmb, + LCD_V_FRONTPORCH_EVEN, vm.vfront_porch); -:413: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #413: FILE: drivers/gpu/drm/kmb/kmb_drv.c:60: + drm_err(>drm, + "Failed to enable MIPI_ECFG clock: %d\n", ret); -:420: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #420: FILE: drivers/gpu/drm/kmb/kmb_drv.c:67: + drm_err(>drm, + "Failed to enable MIPI_CFG clock: %d\n", ret); -:427: CHECK:LINE_SPACING: Please don't use multiple blank lines #427: FILE: drivers/gpu/drm/kmb/kmb_drv.c:74: + + -:463: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV) #463: FILE: drivers/gpu/drm/kmb/kmb_drv.c:110: + kmb->sys_clk_mhz = clk_get_rate(kmb_clk.clk_pll0)/100; ^ -:470: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #470: FILE: drivers/gpu/drm/kmb/kmb_drv.c:117: + drm_err(>drm, "failed to set to clk_lcd to %d\n", + KMB_LCD_DEFAULT_CLK); -:479: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #479: FILE: drivers/gpu/drm/kmb/kmb_drv.c:126: + drm_err(>drm, "failed to set to clk_mipi to %d\n", + KMB_MIPI_DEFAULT_CLK); -:506: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #506: FILE: drivers/gpu/drm/kmb/kmb_drv.c:153: + drm_err(>drm, + "failed to set clk_mipi_cfg to %d\n", -:511: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #511: FILE: drivers/gpu/drm/kmb/kmb_drv.c:158: + drm_info(>drm, + "Get clk_mipi_cfg after set = %ld\n", clk); -:561: CHECK:LINE_SPACING: Please don't use multiple blank lines #561: FILE: drivers/gpu/drm/kmb/kmb_drv.c:208: + + -:688: CHECK:BRACES: Blank lines aren't necessary after an open brace '{' #688: FILE: drivers/gpu/drm/kmb/kmb_drv.c:335: + if (status & LCD_INT_EOF) { + -:701: CHECK:CAMELCASE: Avoid CamelCase: #701: FILE: drivers/gpu/drm/kmb/kmb_drv.c:348: +
[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Fix wrong return value
== Series Details == Series: drm/i915: Fix wrong return value URL : https://patchwork.freedesktop.org/series/80175/ State : failure == Summary == CI Bug Log - changes from CI_DRM_8830_full -> Patchwork_18293_full Summary --- **FAILURE** Serious unknown changes coming with Patchwork_18293_full absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_18293_full, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_18293_full: ### IGT changes ### Possible regressions * igt@gem_eio@in-flight-suspend: - shard-skl: [PASS][1] -> [INCOMPLETE][2] +1 similar issue [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-skl9/igt@gem_...@in-flight-suspend.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-skl1/igt@gem_...@in-flight-suspend.html Known issues Here are the changes found in Patchwork_18293_full that come from known issues: ### IGT changes ### Issues hit * igt@gem_exec_balancer@bonded-early: - shard-kbl: [PASS][3] -> [FAIL][4] ([i915#2079]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-kbl2/igt@gem_exec_balan...@bonded-early.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-kbl2/igt@gem_exec_balan...@bonded-early.html * igt@gem_exec_whisper@basic-queues-priority: - shard-glk: [PASS][5] -> [DMESG-WARN][6] ([i915#118] / [i915#95]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-glk8/igt@gem_exec_whis...@basic-queues-priority.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-glk2/igt@gem_exec_whis...@basic-queues-priority.html * igt@gen9_exec_parse@allowed-single: - shard-skl: [PASS][7] -> [DMESG-WARN][8] ([i915#1436] / [i915#716]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-skl1/igt@gen9_exec_pa...@allowed-single.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-skl1/igt@gen9_exec_pa...@allowed-single.html * igt@kms_big_fb@y-tiled-64bpp-rotate-180: - shard-glk: [PASS][9] -> [DMESG-FAIL][10] ([i915#118] / [i915#95]) +1 similar issue [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-glk9/igt@kms_big...@y-tiled-64bpp-rotate-180.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-glk8/igt@kms_big...@y-tiled-64bpp-rotate-180.html * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-wc-ytiled: - shard-apl: [PASS][11] -> [DMESG-WARN][12] ([i915#1635] / [i915#1982]) [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-apl3/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-ytiled.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-apl6/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-ytiled.html * igt@kms_flip@2x-flip-vs-expired-vblank@ac-hdmi-a1-hdmi-a2: - shard-glk: [PASS][13] -> [FAIL][14] ([i915#79]) [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-glk5/igt@kms_flip@2x-flip-vs-expired-vbl...@ac-hdmi-a1-hdmi-a2.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-glk7/igt@kms_flip@2x-flip-vs-expired-vbl...@ac-hdmi-a1-hdmi-a2.html * igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1: - shard-skl: [PASS][15] -> [FAIL][16] ([i915#79]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-skl1/igt@kms_flip@flip-vs-expired-vblank-interrupti...@a-edp1.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-skl9/igt@kms_flip@flip-vs-expired-vblank-interrupti...@a-edp1.html * igt@kms_flip_tiling@flip-changes-tiling: - shard-skl: [PASS][17] -> [DMESG-WARN][18] ([i915#1982]) +13 similar issues [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-skl8/igt@kms_flip_til...@flip-changes-tiling.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-skl3/igt@kms_flip_til...@flip-changes-tiling.html * igt@kms_frontbuffer_tracking@fbc-badstride: - shard-glk: [PASS][19] -> [DMESG-WARN][20] ([i915#1982]) +2 similar issues [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/shard-glk6/igt@kms_frontbuffer_track...@fbc-badstride.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/shard-glk4/igt@kms_frontbuffer_track...@fbc-badstride.html * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-render: - shard-tglb: [PASS][21] -> [DMESG-WARN][22] ([i915#1982]) +1 similar issue [21]:
Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value
On 8/1/20 5:56 AM, Borislav Petkov wrote: > On Sat, Aug 01, 2020 at 01:24:29PM +0200, Saheed O. Bolarinwa wrote: >> The return value of pci_read_config_*() may not indicate a device error. >> However, the value read by these functions is more likely to indicate >> this kind of error. This presents two overlapping ways of reporting >> errors and complicates error checking. > So why isn't the *value check done in the pci_read_config_* functions > instead of touching gazillion callers? > > For example, pci_conf{1,2}_read() could check whether the u32 *value it > just read depending on the access method, whether that value is ~0 and > return proper PCIBIOS_ error in that case. > > The check you're replicating > > if (val32 == (u32)~0) > > everywhere, instead, is just ugly and tests a naked value ~0 which > doesn't mean anything... > I agree, if there is a change, it should be in the pci_read_* functions. Anything returning void should not fail and likely future users of the proposed change will not do the extra checks. Tom ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong return value
> > > diff --git a/drivers/gpu/drm/i915/i915_active.c > > > b/drivers/gpu/drm/i915/i915_active.c > > > index d960d0be5bd2..cc017e3cc9c5 100644 > > > --- a/drivers/gpu/drm/i915/i915_active.c > > > +++ b/drivers/gpu/drm/i915/i915_active.c > > > @@ -758,7 +758,7 @@ int i915_active_acquire_preallocate_barrier(struct > > > i915_active *ref, > > > intel_engine_mask_t tmp, mask = engine->mask; > > > struct llist_node *first = NULL, *last = NULL; > > > struct intel_gt *gt = engine->gt; > > > - int err; > > > + int err = 0; > > > > you don't need the initialization here. > > But it's close enough that I can munge the patch inline. > Reviewed-by: Chris Wilson sure... you can also remove it before merging it and it might also need: Fixes: d8af05ff38ae7 ("drm/i915: Allow sharing the idle-barrier from other kernel requests) but, yeah... Reviewed-by: Andi Shyti ... as well :) Andi ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix wrong return value
== Series Details == Series: drm/i915: Fix wrong return value URL : https://patchwork.freedesktop.org/series/80175/ State : success == Summary == CI Bug Log - changes from CI_DRM_8830 -> Patchwork_18293 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/index.html Known issues Here are the changes found in Patchwork_18293 that come from known issues: ### IGT changes ### Issues hit * igt@debugfs_test@read_all_entries: - fi-bsw-nick:[PASS][1] -> [INCOMPLETE][2] ([i915#1250] / [i915#1436]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-bsw-nick/igt@debugfs_test@read_all_entries.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-bsw-nick/igt@debugfs_test@read_all_entries.html * igt@gem_exec_suspend@basic-s3: - fi-tgl-u2: [PASS][3] -> [FAIL][4] ([i915#1888]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-tgl-u2/igt@gem_exec_susp...@basic-s3.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-tgl-u2/igt@gem_exec_susp...@basic-s3.html * igt@i915_selftest@live@active: - fi-bdw-5557u: [PASS][5] -> [DMESG-FAIL][6] ([i915#765]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-bdw-5557u/igt@i915_selftest@l...@active.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-bdw-5557u/igt@i915_selftest@l...@active.html * igt@i915_selftest@live@gem_contexts: - fi-tgl-u2: [PASS][7] -> [INCOMPLETE][8] ([i915#2045]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html * igt@i915_selftest@live@gt_lrc: - fi-tgl-u2: [PASS][9] -> [DMESG-FAIL][10] ([i915#1233]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html * igt@kms_chamelium@hdmi-crc-fast: - fi-kbl-7500u: [PASS][11] -> [FAIL][12] ([i915#1372]) [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-kbl-7500u/igt@kms_chamel...@hdmi-crc-fast.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-kbl-7500u/igt@kms_chamel...@hdmi-crc-fast.html * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic: - fi-bsw-kefka: [PASS][13] -> [DMESG-WARN][14] ([i915#1982]) +1 similar issue [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-bsw-kefka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html * igt@kms_flip@basic-flip-vs-wf_vblank@b-edp1: - fi-icl-u2: [PASS][15] -> [DMESG-WARN][16] ([i915#1982]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html * igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2: - fi-skl-guc: [PASS][17] -> [DMESG-WARN][18] ([i915#2203]) [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vbl...@c-hdmi-a2.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vbl...@c-hdmi-a2.html Possible fixes * igt@i915_selftest@live@execlists: - fi-icl-y: [INCOMPLETE][19] -> [PASS][20] [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-icl-y/igt@i915_selftest@l...@execlists.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-icl-y/igt@i915_selftest@l...@execlists.html * igt@kms_busy@basic@flip: - fi-kbl-x1275: [DMESG-WARN][21] ([i915#62] / [i915#92] / [i915#95]) -> [PASS][22] [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-kbl-x1275/igt@kms_busy@ba...@flip.html [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-kbl-x1275/igt@kms_busy@ba...@flip.html * igt@kms_chamelium@common-hpd-after-suspend: - fi-kbl-7500u: [DMESG-WARN][23] ([i915#2203]) -> [PASS][24] [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8830/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18293/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html Warnings * igt@gem_exec_suspend@basic-s0: - fi-kbl-x1275: [DMESG-WARN][25] ([i915#62] / [i915#92]) -> [DMESG-WARN][26] ([i915#62] / [i915#92] / [i915#95]) +4 similar issues
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: Fix wrong return value
== Series Details == Series: drm/i915: Fix wrong return value URL : https://patchwork.freedesktop.org/series/80175/ State : warning == Summary == $ dim sparse --fast origin/drm-tip Sparse version: v0.6.0 Fast mode used, each commit won't be checked separately. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI] drm/i915: Fix wrong return value
From: Tianjia Zhang In function i915_active_acquire_preallocate_barrier(), not all paths have the return value set correctly, and in case of memory allocation failure, a negative error code should be returned. Cc: Chris Wilson Signed-off-by: Tianjia Zhang Reviewed-by: Chris Wilson Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_active.c| 7 ++- drivers/gpu/drm/i915/selftests/i915_request.c | 2 +- 2 files changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 3ee5e02c3e17..b0a6522be3d1 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -889,7 +889,6 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, intel_engine_mask_t tmp, mask = engine->mask; struct llist_node *first = NULL, *last = NULL; struct intel_gt *gt = engine->gt; - int err; GEM_BUG_ON(i915_active_is_idle(ref)); @@ -914,10 +913,8 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, rcu_read_unlock(); if (!node) { node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL); - if (!node) { - err = ENOMEM; + if (!node) goto unwind; - } RCU_INIT_POINTER(node->base.fence, NULL); node->base.cb.func = node_retire; @@ -965,7 +962,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, kmem_cache_free(global.slab_cache, node); } - return err; + return -ENOMEM; } void i915_active_acquire_barrier(struct i915_active *ref) diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index 57dd6f5122ee..c1dcd4b91bda 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -331,7 +331,7 @@ static int __igt_breadcrumbs_smoketest(void *arg) if (!wait) { i915_sw_fence_commit(submit); heap_fence_put(submit); - err = ENOMEM; + err = -ENOMEM; break; } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong return value
Quoting Andi Shyti (2020-08-02 12:40:44) > Hi Tianjia, > > > diff --git a/drivers/gpu/drm/i915/i915_active.c > > b/drivers/gpu/drm/i915/i915_active.c > > index d960d0be5bd2..cc017e3cc9c5 100644 > > --- a/drivers/gpu/drm/i915/i915_active.c > > +++ b/drivers/gpu/drm/i915/i915_active.c > > @@ -758,7 +758,7 @@ int i915_active_acquire_preallocate_barrier(struct > > i915_active *ref, > > intel_engine_mask_t tmp, mask = engine->mask; > > struct llist_node *first = NULL, *last = NULL; > > struct intel_gt *gt = engine->gt; > > - int err; > > + int err = 0; > > you don't need the initialization here. But it's close enough that I can munge the patch inline. Reviewed-by: Chris Wilson -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong return value
Hi Tianjia, > diff --git a/drivers/gpu/drm/i915/i915_active.c > b/drivers/gpu/drm/i915/i915_active.c > index d960d0be5bd2..cc017e3cc9c5 100644 > --- a/drivers/gpu/drm/i915/i915_active.c > +++ b/drivers/gpu/drm/i915/i915_active.c > @@ -758,7 +758,7 @@ int i915_active_acquire_preallocate_barrier(struct > i915_active *ref, > intel_engine_mask_t tmp, mask = engine->mask; > struct llist_node *first = NULL, *last = NULL; > struct intel_gt *gt = engine->gt; > - int err; > + int err = 0; you don't need the initialization here. Thanks, Andi ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v5 00/16] acpi/pwm/i915: Convert pwm-crc and i915 driver's PWM code to use the atomic PWM API
On Sat, Aug 01, 2020 at 04:38:16PM +0200, Hans de Goede wrote: > On 7/29/20 12:54 PM, Andy Shevchenko wrote: > > On Fri, Jul 17, 2020 at 03:37:37PM +0200, Hans de Goede wrote: ... > > One comment to consider, though. There are three channels in that PWM AFAIU. > > One of them is backlight control, another one can be attached to haptics. > > The > > concern is how this series may (or may not?) affect haptics behaviour. > > When you say "in that PWM" do you mean the LPSS one or the CRC one ? CRC one. I have read it from PMIC spec, that's why the question. > The CRC PWM driver patches do make it honor the requested output frequency, > where before, because of a bug, it would stick with the out frequency > setup by the firmware (or the power-on-reset value if no value is set > by the firmware). This fix causing this is very clearly correct, but > it could have unexpected side-effects. -- With Best Regards, Andy Shevchenko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx