[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [v7,1/6] drm/i915: Fallback to lower link rate and lane count during link training (rev3)
== Series Details ==

Series: series starting with [v7,1/6] drm/i915: Fallback to lower link rate and lane count during link training (rev3)
URL   : https://patchwork.freedesktop.org/series/12534/
State : success

== Summary ==

Series 12534v3 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/12534/revisions/3/mbox/

Test kms_psr_sink_crc:
        Subgroup psr_basic:
                dmesg-warn -> PASS       (fi-skl-6700hq)

fi-bdw-5557u     total:244  pass:229  dwarn:0  dfail:0  fail:0  skip:15
fi-bsw-n3050     total:244  pass:202  dwarn:0  dfail:0  fail:0  skip:42
fi-byt-n2820     total:244  pass:208  dwarn:0  dfail:0  fail:1  skip:35
fi-hsw-4770k     total:244  pass:222  dwarn:0  dfail:0  fail:0  skip:22
fi-hsw-4770r     total:244  pass:222  dwarn:0  dfail:0  fail:0  skip:22
fi-ilk-650       total:244  pass:182  dwarn:0  dfail:0  fail:2  skip:60
fi-ivb-3520m     total:244  pass:219  dwarn:0  dfail:0  fail:0  skip:25
fi-ivb-3770      total:244  pass:207  dwarn:0  dfail:0  fail:0  skip:37
fi-skl-6260u     total:244  pass:230  dwarn:0  dfail:0  fail:0  skip:14
fi-skl-6700hq    total:244  pass:222  dwarn:0  dfail:0  fail:0  skip:22
fi-skl-6700k     total:244  pass:219  dwarn:1  dfail:0  fail:0  skip:24
fi-snb-2520m     total:244  pass:208  dwarn:0  dfail:0  fail:0  skip:36
fi-snb-2600      total:244  pass:207  dwarn:0  dfail:0  fail:0  skip:37

fi-skl-6770hq failed to collect.
IGT log at Patchwork_2563/fi-skl-6770hq/igt.log

Results at /archive/results/CI_IGT_test/Patchwork_2563/

4ca90e7c3b6e429e033b93fc56fc156da8f222ef drm-intel-nightly: 2016y-09m-20d-12h-43m-32s UTC integration manifest
fbf93bd drm/i915/dp/mst: Add support for upfront link training for DP MST
c35524e drm/i915/dp: Enable Upfront link training on DDI platforms
3e061c1 drm/i915: Code cleanup to use dev_priv and INTEL_GEN
ad24e7c drm/i915: Change the placement of some static functions in intel_dp.c
789b7b3 drm/i915: Remove the link rate and lane count loop in compute config
61d14b7 drm/i915: Fallback to lower link rate and lane count during link training

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v18 5/6] drm/i915/dp: Enable Upfront link training on DDI platforms
To support USB Type-C alternate DP mode, the display driver needs to know the number of lanes required by the DP panel as well as the number of lanes that can be supported by the Type-C cable. Sometimes the Type-C cable may limit the bandwidth even if the panel can support more lanes. To address these scenarios we need to train the link before modeset. This upfront link training caches the values of max link rate and max lane count that get used later during modeset. Upfront link training does not change any HW state; the link is disabled and PLL values are reset to previous values after upfront link training, so that the subsequent modeset is not aware of these changes.

This patch is based on prior work done by R, Durgadoss.

Changes since v17:
* Rebased on the latest nightly
Changes since v16:
* Use HAS_DDI macro for enabling this feature (Rodrigo Vivi)
* Fix some unnecessary removals/changes due to rebase (Rodrigo Vivi)
Changes since v15:
* Split this patch into two patches - one with functional changes to enable upfront and the other with moving the existing functions around so that they can be used for upfront (Jani Nikula)
* Cleaned up the commit message

Signed-off-by: Durgadoss R
Signed-off-by: Jim Bride
Signed-off-by: Manasi Navare
---
 drivers/gpu/drm/i915/intel_ddi.c              |  21 ++-
 drivers/gpu/drm/i915/intel_dp.c               | 190 +-
 drivers/gpu/drm/i915/intel_dp_link_training.c |   1 -
 drivers/gpu/drm/i915/intel_drv.h              |  14 +-
 4 files changed, 218 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 093038c..8e52507 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1676,7 +1676,8 @@ static void intel_ddi_pre_enable_dp(struct intel_encoder *encoder,
 		pll->config.crtc_mask = 0;

 	/* If Link Training fails, send a uevent to generate a hotplug */
-	if (!intel_ddi_link_train(intel_dp, link_rate, lane_count, link_mst))
+	if (!intel_ddi_link_train(intel_dp, link_rate, lane_count, link_mst,
+				  false))
 		drm_kms_helper_hotplug_event(encoder->base.dev);

 	pll->config = tmp_pll_config;
 }
@@ -2464,7 +2465,7 @@ intel_ddi_get_link_dpll(struct intel_dp *intel_dp, int clock)

 bool intel_ddi_link_train(struct intel_dp *intel_dp, int max_link_rate,
-			  uint8_t max_lane_count, bool link_mst)
+			  uint8_t max_lane_count, bool link_mst, bool is_upfront)
 {
 	struct intel_connector *connector = intel_dp->attached_connector;
 	struct intel_encoder *encoder = connector->encoder;
@@ -2513,6 +2514,7 @@ intel_ddi_link_train(struct intel_dp *intel_dp, int max_link_rate,
 			pll->funcs.disable(dev_priv, pll);
 			pll->config = tmp_pll_config;
 		}
+
 		if (ret) {
 			DRM_DEBUG_KMS("Link Training successful at link rate: %d lane: %d\n",
 				      link_rate, lane_count);
@@ -2522,6 +2524,21 @@ intel_ddi_link_train(struct intel_dp *intel_dp, int max_link_rate,
 	intel_dp_stop_link_train(intel_dp);

+	if (is_upfront) {
+		DRM_DEBUG_KMS("Upfront link train %s: link_clock:%d lanes:%d\n",
+			      ret ? "Passed" : "Failed",
+			      link_rate, lane_count);
+		/* Disable port followed by PLL for next retry/clean up */
+		intel_ddi_post_disable(encoder, NULL, NULL);
+		pll->funcs.disable(dev_priv, pll);
+		pll->config = tmp_pll_config;
+		if (ret) {
+			/* Save the upfront values */
+			intel_dp->max_lanes_upfront = lane_count;
+			intel_dp->max_link_rate_upfront = link_rate;
+		}
+	}
+
 	if (!lane_count)
 		DRM_ERROR("Link Training Failed\n");

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 8d9a8ab..a058d5d 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -153,12 +153,21 @@ intel_dp_max_link_bw(struct intel_dp *intel_dp)
 static u8 intel_dp_max_lane_count(struct intel_dp *intel_dp)
 {
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
-	u8 source_max, sink_max;
+	u8 temp, source_max, sink_max;

 	source_max = intel_dig_port->max_lanes;
 	sink_max = drm_dp_max_lane_count(intel_dp->dpcd);

-	return min(source_max, sink_max);
+	temp = min(source_max, sink_max);
+
+	/*
+	 * Limit max lanes
+	 * w.r.t. the max value found
+	 * using upfront link training also.
+	 */
+	if (intel_dp->max_lanes_upfront)
+		return min(temp,
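The hunk above is truncated in the archive, but the intended lane-count policy is described in the commit message: clamp the source/sink minimum to the value cached by upfront link training, when one exists. A minimal standalone sketch of that logic follows — the struct, field names, and `MIN()` helper are simplified stand-ins so it can compile outside the kernel, not the real i915 code.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Simplified stand-in for struct intel_dp, keeping only the fields
 * the lane-count decision needs. */
struct fake_intel_dp {
	uint8_t source_max;        /* intel_dig_port->max_lanes */
	uint8_t sink_max;          /* drm_dp_max_lane_count(dpcd) */
	uint8_t max_lanes_upfront; /* 0 when upfront training never ran */
};

static uint8_t max_lane_count(const struct fake_intel_dp *dp)
{
	uint8_t lanes = MIN(dp->source_max, dp->sink_max);

	/* Limit max lanes w.r.t. the value found by upfront link
	 * training, e.g. when a Type-C cable carries fewer lanes. */
	if (dp->max_lanes_upfront)
		lanes = MIN(lanes, dp->max_lanes_upfront);
	return lanes;
}
```

With a 4-lane source and sink but a cable that only trained at 2 lanes upfront, the sketch returns 2; with no upfront result cached it falls back to plain min(source, sink).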
[Intel-gfx] [PATCH v5] drm/i915/dp: DP audio API changes for MST
From: "Pandiyan, Dhinakaran"

DP MST provides the capability to send multiple video and audio streams through a single port. This requires the APIs between i915 and the audio drivers to distinguish between multiple audio-capable displays that can be connected to a port. Currently only the port identity is shared in the APIs. This patch adds support for MST with an additional parameter 'int pipe'. The existing parameter 'port' does not change its meaning.

pipe =
	MST     : display pipe that the stream originates from
	Non-MST : -1

Affected APIs:

struct i915_audio_component_ops
-	int (*sync_audio_rate)(struct device *, int port, int rate);
+	int (*sync_audio_rate)(struct device *, int port, int pipe,
+			       int rate);

-	int (*get_eld)(struct device *, int port, bool *enabled,
-		       unsigned char *buf, int max_bytes);
+	int (*get_eld)(struct device *, int port, int pipe,
+		       bool *enabled, unsigned char *buf, int max_bytes);

struct i915_audio_component_audio_ops
-	void (*pin_eld_notify)(void *audio_ptr, int port);
+	void (*pin_eld_notify)(void *audio_ptr, int port, int pipe);

This patch makes dummy changes in the audio drivers (thanks Libin) for the build to succeed. The audio side drivers will send the right 'pipe' values for MST in patches that will follow.

v2:
	Renamed the new API parameter from 'dev_id' to 'pipe'. (Jim, Ville)
	Included ASoC driver API compatibility changes from Jeeja.
	Added WARN_ON() for invalid pipe in get_saved_encoder(). (Takashi)
	Added comment for av_enc_map[] definition. (Takashi)

v3:
	Fixed logic error introduced while renaming 'dev_id' as 'pipe' (Ville)
	Renamed get_saved_encoder() to get_saved_enc() to reduce line length

v4:
	Rebased.
	Parameter check for pipe < -1 values in get_saved_enc() (Ville)
	Switched to for_each_pipe() in get_saved_enc() (Ville)
	Renamed 'pipe' to 'dev_id' in audio side code (Takashi)

v5:
	Included a comment for the dev_id arg.
	(Libin)

Signed-off-by: Dhinakaran Pandiyan
Reviewed-by: Takashi Iwai
Reviewed-by: Ville Syrjälä
---
 drivers/gpu/drm/i915/i915_drv.h    |  3 +-
 drivers/gpu/drm/i915/intel_audio.c | 94 ++
 include/drm/i915_component.h       |  6 +--
 include/sound/hda_i915.h           | 11 +++--
 sound/hda/hdac_i915.c              | 18 +---
 sound/pci/hda/patch_hdmi.c         |  7 +--
 sound/soc/codecs/hdac_hdmi.c       |  2 +-
 7 files changed, 94 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 270543c..fe14cca 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2080,7 +2080,8 @@ struct drm_i915_private {
 	/* perform PHY state sanity checks? */
 	bool chv_phy_assert[2];

-	struct intel_encoder *dig_port_map[I915_MAX_PORTS];
+	/* Used to save the pipe-to-encoder mapping for audio */
+	struct intel_encoder *av_enc_map[I915_MAX_PIPES];

 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
diff --git a/drivers/gpu/drm/i915/intel_audio.c b/drivers/gpu/drm/i915/intel_audio.c
index 40fbdd8..9583f43 100644
--- a/drivers/gpu/drm/i915/intel_audio.c
+++ b/drivers/gpu/drm/i915/intel_audio.c
@@ -491,6 +491,7 @@ void intel_audio_codec_enable(struct intel_encoder *intel_encoder)
 	struct drm_i915_private *dev_priv = to_i915(encoder->dev);
 	struct i915_audio_component *acomp = dev_priv->audio_component;
 	enum port port = intel_encoder->port;
+	enum pipe pipe = crtc->pipe;

 	connector = drm_select_eld(encoder);
 	if (!connector)
@@ -515,12 +516,18 @@ void intel_audio_codec_enable(struct intel_encoder *intel_encoder)

 	mutex_lock(&dev_priv->av_mutex);
 	intel_encoder->audio_connector = connector;
+	/* referred in audio callbacks */
-	dev_priv->dig_port_map[port] = intel_encoder;
+	dev_priv->av_enc_map[pipe] = intel_encoder;
 	mutex_unlock(&dev_priv->av_mutex);

+	/* audio drivers expect pipe = -1 to indicate Non-MST cases */
+	if (intel_encoder->type != INTEL_OUTPUT_DP_MST)
+		pipe = -1;
+
 	if (acomp && acomp->audio_ops && acomp->audio_ops->pin_eld_notify)
-		acomp->audio_ops->pin_eld_notify(acomp->audio_ops->audio_ptr, (int) port);
+		acomp->audio_ops->pin_eld_notify(acomp->audio_ops->audio_ptr,
+						 (int) port, (int) pipe);
 }

 /**
@@ -536,17 +543,24 @@ void intel_audio_codec_disable(struct intel_encoder *intel_encoder)
 	struct drm_i915_private *dev_priv = to_i915(encoder->dev);
 	struct i915_audio_component *acomp = dev_priv->audio_component;
 	enum port port = intel_encoder->port;
+	struct intel_crtc *crtc = to_intel_crtc(encoder->crtc);
+	enum
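The convention the patch establishes — MST streams report the pipe they originate from, everything else reports -1 — can be sketched as a tiny helper. The enum values below are invented stand-ins for the i915 `INTEL_OUTPUT_*` encoder types, so this is an illustration of the calling convention rather than the driver code itself.

```c
#include <assert.h>

/* Invented stand-ins for the i915 INTEL_OUTPUT_* encoder-type enum. */
enum fake_output_type { OUTPUT_DP, OUTPUT_DP_MST, OUTPUT_HDMI };

/* Decide which 'pipe' value to hand to the audio-side callbacks
 * (pin_eld_notify, get_eld, sync_audio_rate). */
static int audio_pipe_for_encoder(enum fake_output_type type, int crtc_pipe)
{
	/* Audio drivers expect pipe = -1 to indicate non-MST cases,
	 * since the port alone identifies the display there. */
	if (type != OUTPUT_DP_MST)
		return -1;

	/* For MST, the pipe disambiguates multiple streams on one port. */
	return crtc_pipe;
}
```

An HDMI or SST-DP encoder on pipe B still passes -1; only an MST encoder forwards its actual pipe, which the audio driver then uses to index its own per-device state (the `av_enc_map[]` counterpart on the i915 side).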
Re: [Intel-gfx] [PATCH v12 5/7] drm/i915/skl: Ensure pipes with changed wms get added to the state
Hi,

I've bisected back to this commit in the drm-intel-nightly branch:

05a76d3d6ad1ee9f9814f88949cc9305fc165460 is the first bad commit
commit 05a76d3d6ad1ee9f9814f88949cc9305fc165460
Author: Lyude
Date:   Wed Aug 17 15:55:57 2016 -0400

    drm/i915/skl: Ensure pipes with changed wms get added to the state

    If we're enabling a pipe, we'll need to modify the watermarks on all
    active planes. Since those planes won't be added to the state on their
    own, we need to add them ourselves.

    Signed-off-by: Lyude
    Reviewed-by: Matt Roper
    Cc: sta...@vger.kernel.org
    Cc: Ville Syrjälä
    Cc: Daniel Vetter
    Cc: Radhakrishna Sripada
    Cc: Hans de Goede
    Signed-off-by: Maarten Lankhorst
    Link: http://patchwork.freedesktop.org/patch/msgid/1471463761-26796-6-git-send-email-cp...@redhat.com

The symptoms I'm seeing look like tearing at the top of the screen, and it's especially noticeable in Chrome - reverting this commit makes the issue go away. Let me know if you'd like me to raise a bug.

Cheers,
Mike

(Re-sending from Gmail - as Inbox doesn't let me send as plain text)

On 17 August 2016 at 20:55, Lyude wrote:
> If we're enabling a pipe, we'll need to modify the watermarks on all
> active planes. Since those planes won't be added to the state on
> their own, we need to add them ourselves.
>
> Signed-off-by: Lyude
> Reviewed-by: Matt Roper
> Cc: sta...@vger.kernel.org
> Cc: Ville Syrjälä
> Cc: Daniel Vetter
> Cc: Radhakrishna Sripada
> Cc: Hans de Goede
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 849f039..a3d24cb 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4117,6 +4117,10 @@ skl_compute_ddb(struct drm_atomic_state *state)
>  		ret = skl_allocate_pipe_ddb(cstate, ddb);
>  		if (ret)
>  			return ret;
> +
> +		ret = drm_atomic_add_affected_planes(state,
> +						     &intel_crtc->base);
> +		if (ret)
> +			return ret;
>  	}
>
>  	return 0;
> --
> 2.7.4
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Intel-gfx] [PATCH] drm/i915/dp/mst: Validate modes against the available link bandwidth
Hi all,

Looking forward to more review comments and concerns, if any, regarding this patch.

Regards,
Anusha

>-----Original Message-----
>From: Jim Bride [mailto:jim.br...@linux.intel.com]
>Sent: Tuesday, September 6, 2016 11:57 AM
>To: Srivatsa, Anusha
>Cc: intel-gfx@lists.freedesktop.org
>Subject: Re: [Intel-gfx] [PATCH] drm/i915/dp/mst: Validate modes against the
>available link bandwidth
>
>On Tue, Sep 06, 2016 at 06:39:12PM +0000, Srivatsa, Anusha wrote:
>> Sending to the list again since Ville's review comment didn't hit the mailing list
>> last time.
>>
>> Regards,
>> Anusha
>>
>> -----Original Message-----
>> From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
>> Sent: Monday, September 5, 2016 4:39 AM
>> To: Srivatsa, Anusha
>> Cc: intel-gfx@lists.freedesktop.org
>> Subject: Re: [Intel-gfx] [PATCH] drm/i915/dp/mst: Validate modes
>> against the available link bandwidth
>>
>> On Thu, Aug 18, 2016 at 05:11:31PM -0700, Anusha Srivatsa wrote:
>> > Change intel_dp_mst_mode_valid() to use available link bandwidth
>> > rather than the link's maximum supported bandwidth to evaluate
>> > whether modes are legal for the current configuration. This takes
>> > into account the fact that link bandwidth may already be dedicated
>> > to other virtual channels.
>>
>> We can't do this. The results of mode_valid() aren't supposed to change like this
>> depending on what else is happening in the system. The only thing mode_valid()
>> tells us is whether it's possible to use the mode under *some* circumstances.
>> Only if the mode can't be used under *any* circumstances should it be filtered
>> out.
>>
>> If we wanted to change this, we'd at the very least have to synthesize hotplug
>> uevents whenever the conditions changed. But doing that would be a fairly big
>> behavioural change, so I'm not sure how people feel about it. It also doesn't
>> really solve the problem since e.g. atomic can go directly from 0->n active pipes,
>> so there's no way to know upfront which modes we can pick for each pipe.
>> >
>
>I won't dispute that this won't help for all cases, but it does make hotplugs, at a
>minimum, more sane. For that reason alone, I'd like to see this patch land.
>Longer term I think we should look at how we can make user space and atomic
>better handle MST. (IMHO, in multi-modeset operations on MST, atomic should
>ensure that the aggregate dotclock of the modes being set is less than the link
>bandwidth that's configured.) I think that user space also needs to be more MST
>aware and update what's available based on the current configuration at any
>given time. I'm not sure what the precise solution should look like here, but I'd
>think that what we want is for user space to fetch the list of available resolutions
>based on the current configuration any time we plug or unplug a device on a DP
>MST topology. In an ideal world it would be nice to have some sort of
>information about the total link bandwidth, and how much bandwidth each mode
>is taking up, so that the user could have a guide to tweaking their setup in such a
>way as to maximize how they choose to configure their monitors based on the
>available link bandwidth.
>
>Jim
>
>
>> >
>> > Signed-off-by: Anusha Srivatsa
>> > ---
>> >  drivers/gpu/drm/i915/intel_dp_mst.c | 12 +---
>> >  1 file changed, 9 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c
>> > b/drivers/gpu/drm/i915/intel_dp_mst.c
>> > index 629337d..39c58eb 100644
>> > --- a/drivers/gpu/drm/i915/intel_dp_mst.c
>> > +++ b/drivers/gpu/drm/i915/intel_dp_mst.c
>> > @@ -352,16 +352,22 @@ static enum drm_mode_status
>> >  intel_dp_mst_mode_valid(struct drm_connector *connector,
>> >  			struct drm_display_mode *mode)
>> >  {
>> > -	int max_dotclk = to_i915(connector->dev)->max_dotclk_freq;
>> > +	int req_pbn = 0;
>> > +	int slots = 0;
>> > +	struct intel_connector *intel_connector = to_intel_connector(connector);
>> > +	struct intel_dp *intel_dp = intel_connector->mst_port;
>> > +	struct drm_dp_mst_topology_mgr *mgr = &intel_dp->mst_mgr;
>> > +
>> > +	req_pbn = drm_dp_calc_pbn_mode(mode->clock, 24);
>> > +	slots = drm_dp_find_vcpi_slots(mgr, req_pbn);
>> >
>> > -	/* TODO - validate mode against available PBN for link */
>> >  	if (mode->clock < 1)
>> >  		return MODE_CLOCK_LOW;
>> >
>> >  	if (mode->flags & DRM_MODE_FLAG_DBLCLK)
>> >  		return MODE_H_ILLEGAL;
>> >
>> > -	if (mode->clock > max_dotclk)
>> > +	if (slots == -ENOSPC)
>> >  		return MODE_CLOCK_HIGH;
>> >
>> >  	return MODE_OK;
>> > --
>> > 2.7.4
>> >
>> > ___
>> > Intel-gfx mailing list
>> > Intel-gfx@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>
>> --
>> Ville Syrjälä
>> Intel OTC
>> ___
>> Intel-gfx mailing list
>>
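The patch leans on `drm_dp_calc_pbn_mode()` to turn a mode's dotclock into a PBN (Payload Bandwidth Number) request. A rough re-derivation of that conversion: one PBN unit is 54/64 MB/s of link payload, the dotclock in kHz times bpp/8 gives bytes/s, and a 1.006 factor adds DP's 0.6% downspread margin. The kernel helper does the same arithmetic with fixed-point helpers and a ceiling; the ceiling-division sketch below matches its documented outputs, but it is a simplified reconstruction, not the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate drm_dp_calc_pbn_mode(): PBN needed for a given dotclock
 * (kHz) and bits-per-pixel, with the 1.006 downspread margin folded in.
 *   PBN = ceil(clock_khz * bpp * 64 * 1006 / (54 * 8 * 1000 * 1000))   */
static int calc_pbn(int clock_khz, int bpp)
{
	uint64_t num = (uint64_t)clock_khz * bpp * 64 * 1006;
	uint64_t den = 54ULL * 8 * 1000 * 1000;

	return (int)((num + den - 1) / den); /* ceiling division */
}
```

A 1080p60 stream (148.5 MHz, 24 bpp) works out to 532 PBN; `drm_dp_find_vcpi_slots()` then checks whether that many PBN worth of timeslots are still free on the link, which is exactly the availability check Ville objects to doing inside mode_valid().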
[Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [v2,RESEND,1/2] drm/i915: Cleanup instdone collection
== Series Details ==

Series: series starting with [v2,RESEND,1/2] drm/i915: Cleanup instdone collection
URL   : https://patchwork.freedesktop.org/series/12714/
State : warning

== Summary ==

Series 12714v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/12714/revisions/1/mbox/

Test drv_module_reload_basic:
                pass       -> SKIP       (fi-skl-6770hq)
Test kms_pipe_crc_basic:
        Subgroup hang-read-crc-pipe-a:
                pass       -> SKIP       (fi-hsw-4770r)
Test kms_psr_sink_crc:
        Subgroup psr_basic:
                dmesg-warn -> PASS       (fi-skl-6700hq)

fi-bdw-5557u     total:244  pass:229  dwarn:0  dfail:0  fail:0  skip:15
fi-bsw-n3050     total:244  pass:202  dwarn:0  dfail:0  fail:0  skip:42
fi-hsw-4770k     total:244  pass:222  dwarn:0  dfail:0  fail:0  skip:22
fi-hsw-4770r     total:244  pass:221  dwarn:0  dfail:0  fail:0  skip:23
fi-ilk-650       total:244  pass:182  dwarn:0  dfail:0  fail:2  skip:60
fi-ivb-3520m     total:244  pass:219  dwarn:0  dfail:0  fail:0  skip:25
fi-ivb-3770      total:244  pass:207  dwarn:0  dfail:0  fail:0  skip:37
fi-skl-6260u     total:244  pass:230  dwarn:0  dfail:0  fail:0  skip:14
fi-skl-6700hq    total:244  pass:222  dwarn:0  dfail:0  fail:0  skip:22
fi-skl-6700k     total:244  pass:219  dwarn:1  dfail:0  fail:0  skip:24
fi-skl-6770hq    total:244  pass:227  dwarn:1  dfail:0  fail:1  skip:15
fi-snb-2520m     total:244  pass:208  dwarn:0  dfail:0  fail:0  skip:36
fi-snb-2600      total:244  pass:207  dwarn:0  dfail:0  fail:0  skip:37

Results at /archive/results/CI_IGT_test/Patchwork_2561/

4ca90e7c3b6e429e033b93fc56fc156da8f222ef drm-intel-nightly: 2016y-09m-20d-12h-43m-32s UTC integration manifest
4721d61 drm/i915: Try to print INSTDONE bits for all slice/subslice
6240439 drm/i915: Cleanup instdone collection
Re: [Intel-gfx] [PATCH 20/38] drm/i915: Implement pwrite without struct-mutex
Hi Chris,

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20160920]
[cannot apply to v4.8-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:  https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Allow-disabling-error-capture/20160920-164839
base: git://anongit.freedesktop.org/drm-intel for-linux-next
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/drm_modes.c:693: warning: No description found for parameter 'bus_flags'
   drivers/gpu/drm/drm_plane_helper.c:248: warning: No description found for parameter 'dst'
   drivers/gpu/drm/drm_plane_helper.c:248: warning: Excess function parameter 'dest' description in 'drm_plane_helper_check_update'
   drivers/gpu/drm/drm_plane_helper.c:247: warning: No description found for parameter 'dst'
   drivers/gpu/drm/drm_plane_helper.c:247: warning: Excess function parameter 'dest' description in 'drm_plane_helper_check_update'
>> drivers/gpu/drm/i915/i915_gem.c:1162: warning: Excess function parameter 'i915' description in 'i915_gem_gtt_pwrite_fast'
>> drivers/gpu/drm/i915/i915_gem.c:1162: warning: Excess function parameter 'file' description in 'i915_gem_gtt_pwrite_fast'
   drivers/gpu/drm/i915/i915_gem_fence.c:644: warning: No description found for parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:674: warning: No description found for parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:643: warning: No description found for parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:673: warning: No description found for parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:643: warning: No description found for parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:673: warning: No description found for parameter 'pages'
   drivers/gpu/drm/drm_crtc.c:1270: WARNING: Inline literal start-string without end-string.
   drivers/gpu/drm/drm_crtc.c:1385: WARNING: Inline literal start-string without end-string.
   include/drm/drm_crtc.h:1202: WARNING: Inline literal start-string without end-string.
   include/drm/drm_crtc.h:1255: WARNING: Inline literal start-string without end-string.
   include/drm/drm_crtc.h:1268: WARNING: Inline literal start-string without end-string.
   include/drm/drm_crtc.h:1272: WARNING: Inline literal start-string without end-string.
   drivers/gpu/drm/drm_irq.c:718: WARNING: Option list ends without a blank line; unexpected unindent.
   drivers/gpu/drm/drm_fb_helper.c:2195: WARNING: Inline emphasis start-string without end-string.
   drivers/gpu/drm/drm_simple_kms_helper.c:141: WARNING: Inline literal start-string without end-string.
   include/drm/drm_gem.h:212: WARNING: Inline emphasis start-string without end-string.
   drivers/gpu/drm/i915/i915_vgpu.c:176: WARNING: Literal block ends without a blank line; unexpected unindent.
   drivers/gpu/drm/i915/intel_audio.c:54: WARNING: Inline emphasis start-string without end-string.
   drivers/gpu/drm/i915/intel_audio.c:54: WARNING: Inline emphasis start-string without end-string.
   drivers/gpu/drm/i915/intel_guc_fwif.h:159: WARNING: Block quote ends without a blank line; unexpected unindent.
   drivers/gpu/drm/i915/intel_guc_fwif.h:178: WARNING: Enumerated list ends without a blank line; unexpected unindent.
   Documentation/gpu/drm-kms.rst:13: WARNING: Could not lex literal_block as "C". Highlighting skipped.
   Documentation/gpu/drm-kms-helpers.rst:16: WARNING: Could not lex literal_block as "C". Highlighting skipped.
   Documentation/gpu/i915.rst:57: WARNING: Could not lex literal_block as "C". Highlighting skipped.
vim +1162 drivers/gpu/drm/i915/i915_gem.c

9b5ec232 Chris Wilson   2016-09-20  1146  }
19709047 Chris Wilson   2016-09-20  1147
19709047 Chris Wilson   2016-09-20  1148  	return unwritten;
19709047 Chris Wilson   2016-09-20  1149  }
19709047 Chris Wilson   2016-09-20  1150
3de09aa3 Eric Anholt    2009-03-09  1151  /**
3de09aa3 Eric Anholt    2009-03-09  1152   * This is the fast pwrite path, where we copy the data directly from the
3de09aa3 Eric Anholt    2009-03-09  1153   * user into the GTT, uncached.
62f90b38 Daniel Vetter  2016-07-15  1154   * @i915: i915 device private data
14bb2c11 Tvrtko Ursulin 2016-06-03  1155   * @obj: i915 gem object
14bb2c11 Tvrtko Ursulin 2016-06-03  1156   * @args: pwrite arguments structure
14bb2c11 Tvrtko Ursulin 2016-06-03  1157   * @file: drm file pointer
3de09aa3 Eric Anholt    2009-03-09  1158   */
673a394b Eric Anholt    2008-07-30  1159  static int
9b5ec232 Chris Wilson   2016-09
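The two flagged warnings come from a kernel-doc invariant: every `@param:` line in the comment must correspond to a parameter in the function's current signature. The blame listing shows the comment still documents `@i915` and `@file`, so the refactored `i915_gem_gtt_pwrite_fast()` evidently dropped those parameters. A compilable illustration of the fix — the stub name and parameters below are invented, not the real i915 function:

```c
#include <assert.h>

/**
 * gtt_pwrite_fast - illustrative stub of a "fast pwrite" style function
 * @obj_id: identifier of the object being written (invented parameter)
 * @len: number of bytes to write (invented parameter)
 *
 * In kernel-doc, the @-lines above must exactly match the parameter
 * list below. Keeping a stale "@i915:" or "@file:" line after removing
 * the parameter produces "Excess function parameter ... description",
 * which is what the kbuild robot reported for i915_gem.c:1162.
 */
static int gtt_pwrite_fast(int obj_id, int len)
{
	/* Trivial body: reject obviously invalid arguments. */
	return (obj_id >= 0 && len >= 0) ? 0 : -1;
}
```

The converse mismatch — a parameter present in the signature but missing an `@param:` line — produces the "No description found for parameter" warnings seen for `drm_plane_helper_check_update` and the fence code above.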
[Intel-gfx] [PATCH v2 RESEND 1/2] drm/i915: Cleanup instdone collection
From: Ben Widawsky

Consolidate the instdone logic so we can get a bit fancier. This patch also removes the duplicated print of INSTDONE[0].

v2: (Imre)
- Rebased on top of hangcheck INSTDONE changes.
- Move all INSTDONE registers into a single struct, store it within the engine error struct during error capturing.

Signed-off-by: Ben Widawsky
Signed-off-by: Imre Deak
Reviewed-by: Mika Kuoppala
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 41 +++-
 drivers/gpu/drm/i915/i915_drv.h         |  8 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c   | 84 +++--
 drivers/gpu/drm/i915/i915_irq.c         | 69 ++-
 drivers/gpu/drm/i915/i915_reg.h         |  1 -
 drivers/gpu/drm/i915/intel_ringbuffer.h | 10 +++-
 6 files changed, 151 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a6d174a..ba155c0 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1277,15 +1277,36 @@ out:
 	return ret;
 }

+static void i915_instdone_info(struct drm_i915_private *dev_priv,
+			       struct seq_file *m,
+			       struct intel_instdone *instdone)
+{
+	seq_printf(m, "\t\tINSTDONE: 0x%08x\n",
+		   instdone->instdone);
+
+	if (INTEL_GEN(dev_priv) <= 3)
+		return;
+
+	seq_printf(m, "\t\tSC_INSTDONE: 0x%08x\n",
+		   instdone->slice_common);
+
+	if (INTEL_GEN(dev_priv) <= 6)
+		return;
+
+	seq_printf(m, "\t\tSAMPLER_INSTDONE: 0x%08x\n",
+		   instdone->sampler);
+	seq_printf(m, "\t\tROW_INSTDONE: 0x%08x\n",
+		   instdone->row);
+}
+
 static int i915_hangcheck_info(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct intel_engine_cs *engine;
 	u64 acthd[I915_NUM_ENGINES];
 	u32 seqno[I915_NUM_ENGINES];
-	u32 instdone[I915_NUM_INSTDONE_REG];
+	struct intel_instdone instdone;
 	enum intel_engine_id id;
-	int j;

 	if (test_bit(I915_WEDGED, &dev_priv->gpu_error.flags))
 		seq_printf(m, "Wedged\n");
@@ -1308,7 +1329,7 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 		seqno[id] = intel_engine_get_seqno(engine);
 	}
-	i915_get_extra_instdone(dev_priv, instdone);
+	i915_get_engine_instdone(dev_priv, RCS, &instdone);

 	intel_runtime_pm_put(dev_priv);

@@ -1336,18 +1357,14 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 		seq_printf(m, "\taction = %d\n", engine->hangcheck.action);

 		if (engine->id == RCS) {
-			seq_puts(m, "\tinstdone read =");
+			seq_puts(m, "\tinstdone read =\n");

-			for (j = 0; j < I915_NUM_INSTDONE_REG; j++)
-				seq_printf(m, " 0x%08x", instdone[j]);
+			i915_instdone_info(dev_priv, m, &instdone);

-			seq_puts(m, "\n\tinstdone accu =");
+			seq_puts(m, "\tinstdone accu =\n");

-			for (j = 0; j < I915_NUM_INSTDONE_REG; j++)
-				seq_printf(m, " 0x%08x",
-					   engine->hangcheck.instdone[j]);
-
-			seq_puts(m, "\n");
+			i915_instdone_info(dev_priv, m,
+					   &engine->hangcheck.instdone);
 		}
 	}

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4dd307e..3568995 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -759,7 +759,7 @@ struct drm_i915_error_state {
 	u32 gam_ecochk;
 	u32 gab_ctl;
 	u32 gfx_mode;
-	u32 extra_instdone[I915_NUM_INSTDONE_REG];
+
 	u64 fence[I915_MAX_NUM_FENCES];
 	struct intel_overlay_error_state *overlay;
 	struct intel_display_error_state *display;
@@ -791,7 +791,6 @@ struct drm_i915_error_state {
 		u32 hws;
 		u32 ipeir;
 		u32 ipehr;
-		u32 instdone;
 		u32 bbstate;
 		u32 instpm;
 		u32 instps;
@@ -802,6 +801,7 @@ struct drm_i915_error_state {
 		u64 faddr;
 		u32 rc_psmi; /* sleep state */
 		u32 semaphore_mboxes[I915_NUM_ENGINES - 1];
+		struct intel_instdone instdone;

 		struct drm_i915_error_object {
 			int page_count;
@@ -3542,7 +3542,9 @@ void i915_error_state_get(struct drm_device *dev,
 void i915_error_state_put(struct i915_error_state_file_priv *error_priv);
 void i915_destroy_error_state(struct drm_device *dev);

-void
[Intel-gfx] [PATCH v2 RESEND 2/2] drm/i915: Try to print INSTDONE bits for all slice/subslice
From: Ben Widawsky

v2: (Imre)
- Access only subslices that are known to exist.
- Reset explicitly the MCR selector to slice/sub-slice ID 0 after the readout.
- Use the subslice INSTDONE bits for the hangcheck/subunits-stuck detection too.
- Take the uncore lock for the MCR-select/subslice-readout sequence.

Signed-off-by: Ben Widawsky
Signed-off-by: Imre Deak
Reviewed-by: Mika Kuoppala
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 14 --
 drivers/gpu/drm/i915/i915_gpu_error.c   | 76 ++---
 drivers/gpu/drm/i915/i915_irq.c         | 25 ---
 drivers/gpu/drm/i915/i915_reg.h         |  5 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 23 +-
 5 files changed, 125 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ba155c0..d3f83c5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1281,6 +1281,9 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
 			       struct seq_file *m,
 			       struct intel_instdone *instdone)
 {
+	int slice;
+	int subslice;
+
 	seq_printf(m, "\t\tINSTDONE: 0x%08x\n",
 		   instdone->instdone);

@@ -1293,10 +1296,13 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
 	if (INTEL_GEN(dev_priv) <= 6)
 		return;

-	seq_printf(m, "\t\tSAMPLER_INSTDONE: 0x%08x\n",
-		   instdone->sampler);
-	seq_printf(m, "\t\tROW_INSTDONE: 0x%08x\n",
-		   instdone->row);
+	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
+		seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
+			   slice, subslice, instdone->sampler[slice][subslice]);
+
+	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
+		seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
+			   slice, subslice, instdone->row[slice][subslice]);
 }

 static int i915_hangcheck_info(struct seq_file *m, void *unused)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 6a2775a..2bbab22 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -231,6
+231,9 @@ static const char *hangcheck_action_to_str(enum intel_engine_hangcheck_action a) static void error_print_instdone(struct drm_i915_error_state_buf *m, struct drm_i915_error_engine *ee) { + int slice; + int subslice; + err_printf(m, " INSTDONE: 0x%08x\n", ee->instdone.instdone); @@ -243,10 +246,15 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, if (INTEL_GEN(m->i915) <= 6) return; - err_printf(m, " SAMPLER_INSTDONE: 0x%08x\n", - ee->instdone.sampler); - err_printf(m, " ROW_INSTDONE: 0x%08x\n", - ee->instdone.row); + for_each_instdone_slice_subslice(m->i915, slice, subslice) + err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n", + slice, subslice, + ee->instdone.sampler[slice][subslice]); + + for_each_instdone_slice_subslice(m->i915, slice, subslice) + err_printf(m, " ROW_INSTDONE[%d][%d]: 0x%08x\n", + slice, subslice, + ee->instdone.row[slice][subslice]); } static void error_print_engine(struct drm_i915_error_state_buf *m, @@ -1549,12 +1557,52 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type) } } +static inline uint32_t +read_subslice_reg(struct drm_i915_private *dev_priv, int slice, + int subslice, i915_reg_t reg) +{ + uint32_t mcr; + uint32_t ret; + enum forcewake_domains fw_domains; + + fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg, + FW_REG_READ); + fw_domains |= intel_uncore_forcewake_for_reg(dev_priv, + GEN8_MCR_SELECTOR, + FW_REG_READ | FW_REG_WRITE); + + spin_lock_irq(&dev_priv->uncore.lock); + intel_uncore_forcewake_get__locked(dev_priv, fw_domains); + + mcr = I915_READ_FW(GEN8_MCR_SELECTOR); + /* + * The HW expects the slice and subslice selectors to be reset to 0 + * after reading out the registers. + */ + WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK)); + mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK); + mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice); + I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr); + + ret =
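The per-slice/subslice printout in this patch only visits subslices that actually exist. A minimal userspace sketch of that iteration pattern — the mask values, limits, and macro shape here are hypothetical stand-ins, since the real for_each_instdone_slice_subslice() pulls the masks from the device's sseu info:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's fused-off topology masks;
 * the real driver reads these from fuse registers at init time. */
#define SLICE_MASK    0x1u  /* one slice present */
#define SUBSLICE_MASK 0x3u  /* two subslices per slice */
#define MAX_SLICES    3
#define MAX_SUBSLICES 3

/* Mirrors the shape of for_each_instdone_slice_subslice(): visit only
 * slice/subslice pairs that exist according to the masks. */
#define for_each_slice_subslice(slice, subslice) \
	for ((slice) = 0; (slice) < MAX_SLICES; (slice)++) \
		for ((subslice) = 0; (subslice) < MAX_SUBSLICES; (subslice)++) \
			if ((SLICE_MASK & (1u << (slice))) && \
			    (SUBSLICE_MASK & (1u << (subslice))))

/* Count the pairs the loop visits: with the masks above this is
 * 1 slice x 2 subslices = 2 pairs. */
static int count_present_pairs(void)
{
	int slice, subslice, n = 0;

	for_each_slice_subslice(slice, subslice)
		n++;
	return n;
}
```

With one slice and two subslices fused in, the loop visits exactly two slice/subslice pairs, which is why only existing subslices show up in the debugfs and error-state output above.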
Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Cleanup instdone collection
On Fri, 2016-09-16 at 12:56 +0300, Jani Nikula wrote: > On Tue, 06 Sep 2016, Mika Kuoppala wrote: > > Imre Deak writes: > > > > > From: Ben Widawsky > > > > > > Consolidate the instdone logic so we can get a bit fancier. This > > > patch also > > > removes the duplicated print of INSTDONE[0]. > > > > > > v2: (Imre) > > > - Rebased on top of hangcheck INSTDONE changes. > > > - Move all INSTDONE registers into a single struct, store it > > > within the > > > engine error struct during error capturing. > > > > > > Signed-off-by: Ben Widawsky > > > Signed-off-by: Imre Deak > > > > Reviewed-by: Mika Kuoppala > > I didn't receive the original messages, and I can't find them on the > list either. Can't find it in the list archive either, not sure where it got stuck. I'll resend it. --Imre > > BR, > Jani. > > > > > > > --- > > > drivers/gpu/drm/i915/i915_debugfs.c | 41 +++- > > > drivers/gpu/drm/i915/i915_drv.h | 8 ++-- > > > drivers/gpu/drm/i915/i915_gpu_error.c | 84 > > > +++-- > > > drivers/gpu/drm/i915/i915_irq.c | 69 ++- > > > > > > drivers/gpu/drm/i915/i915_reg.h | 1 - > > > drivers/gpu/drm/i915/intel_ringbuffer.h | 10 +++- > > > 6 files changed, 151 insertions(+), 62 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c > > > b/drivers/gpu/drm/i915/i915_debugfs.c > > > index 3fde507..45244f9 100644 > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > > > @@ -1277,15 +1277,36 @@ out: > > > return ret; > > > } > > > > > > +static void i915_instdone_info(struct drm_i915_private > > > *dev_priv, > > > + struct seq_file *m, > > > + struct intel_instdone *instdone) > > > +{ > > > + seq_printf(m, "\t\tINSTDONE: 0x%08x\n", > > > + instdone->instdone); > > > + > > > + if (INTEL_GEN(dev_priv) <= 3) > > > + return; > > > + > > > + seq_printf(m, "\t\tSC_INSTDONE: 0x%08x\n", > > > + instdone->slice_common); > > > + > > > + if (INTEL_GEN(dev_priv) <= 6) > > > + return; > > > + > > > + seq_printf(m,
"\t\tSAMPLER_INSTDONE: 0x%08x\n", > > > + instdone->sampler); > > > + seq_printf(m, "\t\tROW_INSTDONE: 0x%08x\n", > > > + instdone->row); > > > +} > > > + > > > static int i915_hangcheck_info(struct seq_file *m, void *unused) > > > { > > > struct drm_i915_private *dev_priv = node_to_i915(m- > > > >private); > > > struct intel_engine_cs *engine; > > > u64 acthd[I915_NUM_ENGINES]; > > > u32 seqno[I915_NUM_ENGINES]; > > > - u32 instdone[I915_NUM_INSTDONE_REG]; > > > + struct intel_instdone instdone; > > > enum intel_engine_id id; > > > - int j; > > > > > > if (!i915.enable_hangcheck) { > > > seq_printf(m, "Hangcheck disabled\n"); > > > @@ -1299,7 +1320,7 @@ static int i915_hangcheck_info(struct > > > seq_file *m, void *unused) > > > seqno[id] = intel_engine_get_seqno(engine); > > > } > > > > > > - i915_get_extra_instdone(dev_priv, instdone); > > > + i915_get_engine_instdone(dev_priv, RCS, &instdone); > > > > > > intel_runtime_pm_put(dev_priv); > > > > > > @@ -1327,18 +1348,14 @@ static int i915_hangcheck_info(struct > > > seq_file *m, void *unused) > > > seq_printf(m, "\taction = %d\n", engine- > > > >hangcheck.action); > > > > > > if (engine->id == RCS) { > > > - seq_puts(m, "\tinstdone read ="); > > > + seq_puts(m, "\tinstdone read =\n"); > > > > > > - for (j = 0; j < I915_NUM_INSTDONE_REG; > > > j++) > > > - seq_printf(m, " 0x%08x", > > > instdone[j]); > > > + i915_instdone_info(dev_priv, m, &instdone); > > > > > > - seq_puts(m, "\n\tinstdone accu ="); > > > + seq_puts(m, "\tinstdone accu =\n"); > > > > > > - for (j = 0; j < I915_NUM_INSTDONE_REG; > > > j++) > > > - seq_printf(m, " 0x%08x", > > > - engine- > > > >hangcheck.instdone[j]); > > > - > > > - seq_puts(m, "\n"); > > > + i915_instdone_info(dev_priv, m, > > > + &engine->hangcheck.instdone); > > > } > > > } > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h > > > b/drivers/gpu/drm/i915/i915_drv.h > > > index 757b1d1..20bf72e 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > +++
b/drivers/gpu/drm/i915/i915_drv.h > > > @@ -747,7 +747,7 @@ struct drm_i915_error_state { > > > u32 gam_ecochk; > > > u32 gab_ctl; > > > u32 gfx_mode; > > > - u32 extra_instdone[I915_NUM_INSTDONE_REG]; > > > + > > > u64 fence[I915_MAX_NUM_FENCES]; > > > struct intel_overlay_error_state *overlay; > > >
Re: [Intel-gfx] Kernel stability on baytrail machines
Hi, I think there might be another clue on this one. One of the comments is also mentioning an unfixed erratum of certain Baytrail processors, named as "EOI Transaction May Not be Sent if Software Enters Core C6 During an Interrupt Service Routine". This erratum can be found on several different processors, even on several non-baytrails, like Intel Xeon 3400 and similar. I also came across a patch that was created for SUSE and that seems to be addressing this issue in pre 4.X kernels: https://build.opensuse.org/package/view_file?file=22160-Intel-C6-EOI.patch=xen=home%3Acharlesa%3AopenSUSE11.3=7 --- Michal On 18.7.2016 15:30, One Thousand Gnomes wrote: On Tue, 12 Jul 2016 16:41:58 -0300 Ezequiel Garcia wrote: Hi Alan, (Adding interested people to this thread) On 09 Apr 08:14 PM, One Thousand Gnomes wrote: > > > I do feel that the importance of the mentioned bug is currently > > > underestimated. Can anyone here give a note, how much current linux > > > kernel is supposed to be stable on general baytrail machines? > > > > If you did not get any replies... you might want to check MAINTAINERS file, and > > put Intel x86 maintainers on Cc list. > > > > I'm sure someone cares :-). > > Yes we care, and there are people looking at the various reports. > Are there any updates on the status of this issue? The current bugzilla report [1] marks this as a power management issue. However, many reports indicate that it would only freeze when running X, so it's not completely clear if it's related to the gfx driver too. There are two things we are currently tracking. One of them was merged, which seems to have made my machine stable at least and fixes a problem related to the MMC. The second one we may need is a power-related change to SPI to hold the CPU in C0/C1 whenever the ACPI _SEM is held. Graphics shows these problems up because of the way the GPU causes power state changes.
Alan ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 2/2] drm/i915: Try to print INSTDONE bits for all slice/subslice
Imre Deak writes: > From: Ben Widawsky > > v2: (Imre) > - Access only subslices that are known to exist. > - Reset explicitly the MCR selector to slice/sub-slice ID 0 after the > readout. > - Use the subslice INSTDONE bits for the hangcheck/subunits-stuck > detection too. > - Take the uncore lock for the MCR-select/subslice-readout sequence. > > Signed-off-by: Ben Widawsky > Signed-off-by: Imre Deak Reviewed-by: Mika Kuoppala > --- > drivers/gpu/drm/i915/i915_debugfs.c | 14 -- > drivers/gpu/drm/i915/i915_gpu_error.c | 76 > ++--- > drivers/gpu/drm/i915/i915_irq.c | 25 --- > drivers/gpu/drm/i915/i915_reg.h | 5 +++ > drivers/gpu/drm/i915/intel_ringbuffer.h | 23 +- > 5 files changed, 125 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c > b/drivers/gpu/drm/i915/i915_debugfs.c > index 45244f9..0f84165 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -1281,6 +1281,9 @@ static void i915_instdone_info(struct drm_i915_private > *dev_priv, > struct seq_file *m, > struct intel_instdone *instdone) > { > + int slice; > + int subslice; > + > seq_printf(m, "\t\tINSTDONE: 0x%08x\n", > instdone->instdone); > > @@ -1293,10 +1296,13 @@ static void i915_instdone_info(struct > drm_i915_private *dev_priv, > if (INTEL_GEN(dev_priv) <= 6) > return; > > - seq_printf(m, "\t\tSAMPLER_INSTDONE: 0x%08x\n", > -instdone->sampler); > - seq_printf(m, "\t\tROW_INSTDONE: 0x%08x\n", > -instdone->row); > + for_each_instdone_slice_subslice(dev_priv, slice, subslice) > + seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n", > +slice, subslice, instdone->sampler[slice][subslice]); > + > + for_each_instdone_slice_subslice(dev_priv, slice, subslice) > + seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n", > +slice, subslice, instdone->row[slice][subslice]); > } > > static int i915_hangcheck_info(struct seq_file *m, void *unused) > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c > b/drivers/gpu/drm/i915/i915_gpu_error.c >
index 80fe101..06d4309 100644 > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > @@ -231,6 +231,9 @@ static const char *hangcheck_action_to_str(enum > intel_engine_hangcheck_action a) > static void error_print_instdone(struct drm_i915_error_state_buf *m, >struct drm_i915_error_engine *ee) > { > + int slice; > + int subslice; > + > err_printf(m, " INSTDONE: 0x%08x\n", > ee->instdone.instdone); > > @@ -243,10 +246,15 @@ static void error_print_instdone(struct > drm_i915_error_state_buf *m, > if (INTEL_GEN(m->i915) <= 6) > return; > > - err_printf(m, " SAMPLER_INSTDONE: 0x%08x\n", > -ee->instdone.sampler); > - err_printf(m, " ROW_INSTDONE: 0x%08x\n", > -ee->instdone.row); > + for_each_instdone_slice_subslice(m->i915, slice, subslice) > + err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n", > +slice, subslice, > +ee->instdone.sampler[slice][subslice]); > + > + for_each_instdone_slice_subslice(m->i915, slice, subslice) > + err_printf(m, " ROW_INSTDONE[%d][%d]: 0x%08x\n", > +slice, subslice, > +ee->instdone.row[slice][subslice]); > } > > static void error_print_engine(struct drm_i915_error_state_buf *m, > @@ -1534,12 +1542,52 @@ const char *i915_cache_level_str(struct > drm_i915_private *i915, int type) > } > } > > +static inline uint32_t > +read_subslice_reg(struct drm_i915_private *dev_priv, int slice, > + int subslice, i915_reg_t reg) > +{ > + uint32_t mcr; > + uint32_t ret; > + enum forcewake_domains fw_domains; > + > + fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg, > + FW_REG_READ); > + fw_domains |= intel_uncore_forcewake_for_reg(dev_priv, > + GEN8_MCR_SELECTOR, > + FW_REG_READ | > FW_REG_WRITE); > + > + spin_lock_irq(&dev_priv->uncore.lock); > + intel_uncore_forcewake_get__locked(dev_priv, fw_domains); > + > + mcr = I915_READ_FW(GEN8_MCR_SELECTOR); > + /* > + * The HW expects the slice and subslice selectors to be reset to 0 > + * after reading out the registers.
> + */ > + WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK)); > + mcr &=
Re: [Intel-gfx] [PATCH v3] drm/i915: Queue page flip work via a low latency, unbound workqueue
On 20-09-16 at 14:51, Chris Wilson wrote: > On Tue, Sep 20, 2016 at 02:58:19PM +0300, Imre Deak wrote: >> While user space has control over the scheduling priority of its page >> flipping thread, the corresponding work the driver schedules for MMIO >> flips always runs from the generic system workqueue which has some >> scheduling overhead due to it being CPU bound. This would hinder an >> application that wants more stringent guarantees over flip timing (to >> avoid missing a flip at the next frame count). >> >> Fix this by scheduling the work from the unbound system workqueue >> which provides for minimal scheduling latency. >> >> v2: >> - Use an unbound workqueue instead of a high-prio one. (Tvrtko, Chris) >> v3: >> - Use the system unbound wq instead of a dedicated one. (Maarten) >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97775 >> Testcase: igt/kms_cursor_legacy >> CC: Chris Wilson >> CC: Maarten Lankhorst >> CC: Tvrtko Ursulin >> Signed-off-by: Imre Deak >> Reviewed-by: Tvrtko Ursulin (v1) > We violate the unbound_wq rules no worse than the ordinary system_wq, > and this brings mmioflip on a par with nonblocking atomic modesets, so > Reviewed-by: Chris Wilson > -Chris > Reviewed-by: Maarten Lankhorst ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3] drm/i915: Queue page flip work via a low latency, unbound workqueue
On Tue, Sep 20, 2016 at 02:58:19PM +0300, Imre Deak wrote: > While user space has control over the scheduling priority of its page > flipping thread, the corresponding work the driver schedules for MMIO > flips always runs from the generic system workqueue which has some > scheduling overhead due to it being CPU bound. This would hinder an > application that wants more stringent guarantees over flip timing (to > avoid missing a flip at the next frame count). > > Fix this by scheduling the work from the unbound system workqueue > which provides for minimal scheduling latency. > > v2: > - Use an unbound workqueue instead of a high-prio one. (Tvrtko, Chris) > v3: > - Use the system unbound wq instead of a dedicated one. (Maarten) > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97775 > Testcase: igt/kms_cursor_legacy > CC: Chris Wilson > CC: Maarten Lankhorst > CC: Tvrtko Ursulin > Signed-off-by: Imre Deak > Reviewed-by: Tvrtko Ursulin (v1) We violate the unbound_wq rules no worse than the ordinary system_wq, and this brings mmioflip on a par with nonblocking atomic modesets, so Reviewed-by: Chris Wilson -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Unlock PPS registers after GPU reset
On to, 2016-09-15 at 07:20 +, Patchwork wrote: > == Series Details == > > Series: drm/i915: Unlock PPS registers after GPU reset > URL : https://patchwork.freedesktop.org/series/12446/ > State : success Thanks for the report and review, I pushed it to -dinq. > > == Summary == > > Series 12446v1 drm/i915: Unlock PPS registers after GPU reset > https://patchwork.freedesktop.org/api/1.0/series/12446/revisions/1/mb > ox/ > > Test gem_exec_suspend: > Subgroup basic-s3: > incomplete -> PASS (fi-hsw-4770k) > Test kms_frontbuffer_tracking: > Subgroup basic: > skip -> PASS (fi-ivb-3770) > Test kms_pipe_crc_basic: > Subgroup suspend-read-crc-pipe-a: > dmesg-warn -> PASS (fi-byt-j1900) > Subgroup suspend-read-crc-pipe-b: > dmesg-warn -> PASS (fi-byt-j1900) > > fi-bdw- > 5557u total:244 pass:229 dwarn:0 dfail:0 fail:0 skip:15 > fi-bsw- > n3050 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 > fi-byt- > j1900 total:244 pass:212 dwarn:0 dfail:0 fail:1 skip:31 > fi-byt- > n2820 total:244 pass:208 dwarn:0 dfail:0 fail:1 skip:35 > fi-hsw- > 4770k total:244 pass:226 dwarn:0 dfail:0 fail:0 skip:18 > fi-hsw- > 4770r total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22 > fi-ilk- > 650 total:244 pass:183 dwarn:0 dfail:0 fail:1 skip:60 > fi-ivb- > 3520m total:244 pass:219 dwarn:0 dfail:0 fail:0 skip:25 > fi-ivb- > 3770 total:244 pass:208 dwarn:0 dfail:0 fail:0 skip:36 > fi-skl- > 6260u total:244 pass:230 dwarn:0 dfail:0 fail:0 skip:14 > fi-skl- > 6700hqtotal:244 pass:221 dwarn:0 dfail:0 fail:1 skip:22 > fi-skl- > 6700k total:244 pass:219 dwarn:1 dfail:0 fail:0 skip:24 > fi-snb- > 2520m total:244 pass:208 dwarn:0 dfail:0 fail:0 skip:36 > fi-snb- > 2600 total:244 pass:207 dwarn:0 dfail:0 fail:0 skip:37 > > Results at /archive/results/CI_IGT_test/Patchwork_2530/ > > 9aa8c0cdbc076bcc0486d7a31922a0f77c032fe7 drm-intel-nightly: 2016y- > 09m-14d-09h-19m-25s UTC integration manifest > 424c774 drm/i915: Unlock PPS registers after GPU reset > ___ Intel-gfx mailing list 
Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: Queue page flip work with high priority (rev3)
== Series Details == Series: drm/i915: Queue page flip work with high priority (rev3) URL : https://patchwork.freedesktop.org/series/12336/ State : warning == Summary == Series 12336v3 drm/i915: Queue page flip work with high priority https://patchwork.freedesktop.org/api/1.0/series/12336/revisions/3/mbox/ Test gem_exec_gttfill: Subgroup basic: pass -> SKIP (fi-snb-2600) Test gem_exec_suspend: Subgroup basic-s3: incomplete -> PASS (fi-hsw-4770k) Test kms_pipe_crc_basic: Subgroup read-crc-pipe-c-frame-sequence: pass -> SKIP (fi-hsw-4770r) Subgroup suspend-read-crc-pipe-a: pass -> SKIP (fi-hsw-4770r) Subgroup suspend-read-crc-pipe-b: skip -> PASS (fi-hsw-4770r) Test kms_psr_sink_crc: Subgroup psr_basic: fail -> DMESG-WARN (fi-skl-6700hq) fi-bdw-5557u total:244 pass:229 dwarn:0 dfail:0 fail:0 skip:15 fi-bsw-n3050 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 fi-byt-n2820 total:244 pass:208 dwarn:0 dfail:0 fail:1 skip:35 fi-hsw-4770k total:208 pass:187 dwarn:0 dfail:0 fail:0 skip:20 fi-hsw-4770r total:244 pass:220 dwarn:0 dfail:0 fail:0 skip:24 fi-ilk-650 total:244 pass:182 dwarn:0 dfail:0 fail:2 skip:60 fi-ivb-3520m total:244 pass:219 dwarn:0 dfail:0 fail:0 skip:25 fi-ivb-3770 total:244 pass:207 dwarn:0 dfail:0 fail:0 skip:37 fi-skl-6260u total:244 pass:230 dwarn:0 dfail:0 fail:0 skip:14 fi-skl-6700hq total:244 pass:221 dwarn:1 dfail:0 fail:0 skip:22 fi-skl-6700k total:244 pass:219 dwarn:1 dfail:0 fail:0 skip:24 fi-skl-6770hq total:244 pass:228 dwarn:1 dfail:0 fail:1 skip:14 fi-snb-2520m total:244 pass:208 dwarn:0 dfail:0 fail:0 skip:36 fi-snb-2600 total:244 pass:206 dwarn:0 dfail:0 fail:0 skip:38 Results at /archive/results/CI_IGT_test/Patchwork_2560/ de934d54f765dcfa82c018ea5d256801cffee7af drm-intel-nightly: 2016y-09m-20d-09h-13m-55s UTC integration manifest 04838a2 drm/i915: Queue page flip work via a low latency, unbound workqueue ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions
On Fri, 2016-09-09 at 13:30 +0530, Kumar, Mahesh wrote: > From: Mahesh Kumar > > This patch makes use of the plane_wm variable directly instead of passing > the skl_plane_wm struct. This reduces the number of arguments required > in the watermark calculation functions. > > It also gives more freedom of decision making to implement Bspec WM > workarounds. This is just my personal opinion, but I don't think this patch is an improvement to the codebase. The way things are currently organized, there's no risk that we may end up reading some variable that's still not set/computed, so there's less risk that some later patch may end up using information it shouldn't. Also, by having these explicit out parameters, we clearly highlight what's the goal of the function: output those 3 values, instead of writing I-don't-know-what to a huge struct. Besides, the patch even keeps the out_ variables as local variables instead of parameters, which makes things even more confusing IMHO, since in_ and out_ variables are usually function parameters. I also see that this patch is not necessary for the series. We kinda use the new pipe_wm variable at patch 9, but we could just go with the variables we have. So, unless some new arguments are given, I'd suggest to just drop this patch.
> > Signed-off-by: Mahesh Kumar > --- > drivers/gpu/drm/i915/intel_pm.c | 29 +++-- > 1 file changed, 15 insertions(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_pm.c > b/drivers/gpu/drm/i915/intel_pm.c > index 86c6d66..3fdec4d 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -3542,9 +3542,7 @@ static int skl_compute_plane_wm(const struct > drm_i915_private *dev_priv, > struct intel_plane_state > *intel_pstate, > uint16_t ddb_allocation, > int level, > - uint16_t *out_blocks, /* out */ > - uint8_t *out_lines, /* out */ > - bool *enabled /* out */) > + struct skl_pipe_wm *pipe_wm) > { > struct drm_plane_state *pstate = &intel_pstate->base; > struct drm_framebuffer *fb = pstate->fb; > @@ -3557,6 +3555,11 @@ static int skl_compute_plane_wm(const struct > drm_i915_private *dev_priv, > uint32_t width = 0, height = 0; > uint32_t plane_pixel_rate; > uint32_t y_tile_minimum, y_min_scanlines; > + int id = skl_wm_plane_id(to_intel_plane(pstate->plane)); > + struct skl_wm_level *result = &pipe_wm->wm[level]; > + uint16_t *out_blocks = &result->plane_res_b[id]; > + uint8_t *out_lines = &result->plane_res_l[id]; > + bool *enabled = &result->plane_en[id]; > > if (latency == 0 || !cstate->base.active || !intel_pstate- > >base.visible) { > *enabled = false; > @@ -3673,7 +3676,7 @@ skl_compute_wm_level(const struct > drm_i915_private *dev_priv, > struct skl_ddb_allocation *ddb, > struct intel_crtc_state *cstate, > int level, > - struct skl_wm_level *result) > + struct skl_pipe_wm *pipe_wm) > { > struct drm_atomic_state *state = cstate->base.state; > struct intel_crtc *intel_crtc = to_intel_crtc(cstate- > >base.crtc); > @@ -3684,12 +3687,6 @@ skl_compute_wm_level(const struct > drm_i915_private *dev_priv, > enum pipe pipe = intel_crtc->pipe; > int ret; > > - /* > - * We'll only calculate watermarks for planes that are > actually > - * enabled, so make sure all other planes are set as > disabled.
> - */ > - memset(result, 0, sizeof(*result)); > - > for_each_intel_plane_mask(&dev_priv->drm, > intel_plane, > cstate->base.plane_mask) { > @@ -3727,9 +3724,7 @@ skl_compute_wm_level(const struct > drm_i915_private *dev_priv, > intel_pstate, > ddb_blocks, > level, > - &result->plane_res_b[i], > - &result->plane_res_l[i], > - &result->plane_en[i]); > + pipe_wm); > if (ret) > return ret; > } > @@ -3777,9 +3772,15 @@ static int skl_build_pipe_wm(struct > intel_crtc_state *cstate, > int level, max_level = ilk_wm_max_level(dev); > int ret; > > + /* > + * We'll only calculate watermarks for planes that are > actually > + * enabled, so make sure all other planes are set as > disabled. > + */ > + memset(pipe_wm, 0, sizeof(*pipe_wm)); > + > for (level = 0; level <= max_level; level++) { > ret = skl_compute_wm_level(dev_priv, ddb, cstate, > -
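For readers following the review: the disagreement is purely about function signatures, not about the computed values. A toy, self-contained sketch of the two styles under discussion (the names here are hypothetical miniatures, not the driver's actual watermark types):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical miniature of a watermark result. */
struct wm_level {
	uint16_t blocks;
	uint8_t lines;
	bool enabled;
};

/* Style kept by the reviewer: explicit out-parameters make it obvious
 * at the call site exactly which three values the function produces. */
static void compute_wm_out(uint16_t *out_blocks, uint8_t *out_lines,
			   bool *enabled)
{
	*out_blocks = 16;
	*out_lines = 2;
	*enabled = true;
}

/* Style proposed by the patch: hand over the aggregate and let the
 * callee decide which fields to write, hiding the outputs. */
static void compute_wm_struct(struct wm_level *result)
{
	result->blocks = 16;
	result->lines = 2;
	result->enabled = true;
}
```

Both forms compute identical results; Paulo's point is that the out-parameter form documents at the call site which fields the callee may touch, while the struct form trades that clarity for flexibility.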
[Intel-gfx] [PATCH v3] drm/i915: Queue page flip work via a low latency, unbound workqueue
While user space has control over the scheduling priority of its page flipping thread, the corresponding work the driver schedules for MMIO flips always runs from the generic system workqueue which has some scheduling overhead due to it being CPU bound. This would hinder an application that wants more stringent guarantees over flip timing (to avoid missing a flip at the next frame count). Fix this by scheduling the work from the unbound system workqueue which provides for minimal scheduling latency. v2: - Use an unbound workqueue instead of a high-prio one. (Tvrtko, Chris) v3: - Use the system unbound wq instead of a dedicated one. (Maarten) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97775 Testcase: igt/kms_cursor_legacy CC: Chris Wilson CC: Maarten Lankhorst CC: Tvrtko Ursulin Signed-off-by: Imre Deak Reviewed-by: Tvrtko Ursulin (v1) --- drivers/gpu/drm/i915/intel_display.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 7bb7874..8da9f3c 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -12266,7 +12266,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc, work->flip_queued_req = i915_gem_active_get(&obj->last_write, &obj->base.dev->struct_mutex); - schedule_work(&work->mmio_work); + queue_work(system_unbound_wq, &work->mmio_work); } else { request = i915_gem_request_alloc(engine, engine->last_context); if (IS_ERR(request)) { -- 2.5.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
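The whole fix is the choice of workqueue for an already-prepared work item. A kernel-style sketch of the before/after (not standalone-compilable; the `work` struct and flip work function are assumed from the patch context):

```c
/* Work item is initialized once, as before the patch: */
INIT_WORK(&work->mmio_work, intel_mmio_flip_work_func);

/* Before: schedule_work() targets the per-CPU system_wq, so the flip
 * work waits behind whatever else is queued on the submitting CPU. */
schedule_work(&work->mmio_work);

/* After: the shared unbound workqueue lets the scheduler run the work
 * on any idle CPU, minimizing queueing latency for the MMIO flip. */
queue_work(system_unbound_wq, &work->mmio_work);
```

Note that `schedule_work(w)` is just shorthand for `queue_work(system_wq, w)`, so the one-line diff really only swaps which system workqueue services the item.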
Re: [Intel-gfx] [PATCH 16/38] drm/i915: Pass around sg_table to get_pages/put_pages backend
Hi Chris, [auto build test WARNING on drm-intel/for-linux-next] [also build test WARNING on next-20160919] [cannot apply to v4.8-rc7] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] [Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on] [Check https://git-scm.com/docs/git-format-patch for more information] url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Allow-disabling-error-capture/20160920-164839 base: git://anongit.freedesktop.org/drm-intel for-linux-next reproduce: make htmldocs All warnings (new ones prefixed by >>): drivers/gpu/drm/drm_modes.c:693: warning: No description found for parameter 'bus_flags' drivers/gpu/drm/drm_plane_helper.c:248: warning: No description found for parameter 'dst' drivers/gpu/drm/drm_plane_helper.c:248: warning: Excess function parameter 'dest' description in 'drm_plane_helper_check_update' drivers/gpu/drm/drm_plane_helper.c:247: warning: No description found for parameter 'dst' drivers/gpu/drm/drm_plane_helper.c:247: warning: Excess function parameter 'dest' description in 'drm_plane_helper_check_update' >> drivers/gpu/drm/i915/i915_gem_fence.c:644: warning: No description found for >> parameter 'pages' drivers/gpu/drm/i915/i915_gem_fence.c:674: warning: No description found for parameter 'pages' drivers/gpu/drm/i915/i915_gem_fence.c:643: warning: No description found for parameter 'pages' drivers/gpu/drm/i915/i915_gem_fence.c:673: warning: No description found for parameter 'pages' drivers/gpu/drm/i915/i915_gem_fence.c:643: warning: No description found for parameter 'pages' drivers/gpu/drm/i915/i915_gem_fence.c:673: warning: No description found for parameter 'pages' drivers/gpu/drm/drm_crtc.c:1270: WARNING: Inline literal start-string without end-string. drivers/gpu/drm/drm_crtc.c:1385: WARNING: Inline literal start-string without end-string. 
include/drm/drm_crtc.h:1202: WARNING: Inline literal start-string without end-string. include/drm/drm_crtc.h:1255: WARNING: Inline literal start-string without end-string. include/drm/drm_crtc.h:1268: WARNING: Inline literal start-string without end-string. include/drm/drm_crtc.h:1272: WARNING: Inline literal start-string without end-string. drivers/gpu/drm/drm_irq.c:718: WARNING: Option list ends without a blank line; unexpected unindent. drivers/gpu/drm/drm_fb_helper.c:2195: WARNING: Inline emphasis start-string without end-string. drivers/gpu/drm/drm_simple_kms_helper.c:141: WARNING: Inline literal start-string without end-string. include/drm/drm_gem.h:212: WARNING: Inline emphasis start-string without end-string. drivers/gpu/drm/i915/i915_vgpu.c:176: WARNING: Literal block ends without a blank line; unexpected unindent. drivers/gpu/drm/i915/intel_audio.c:54: WARNING: Inline emphasis start-string without end-string. drivers/gpu/drm/i915/intel_audio.c:54: WARNING: Inline emphasis start-string without end-string. drivers/gpu/drm/i915/intel_guc_fwif.h:159: WARNING: Block quote ends without a blank line; unexpected unindent. drivers/gpu/drm/i915/intel_guc_fwif.h:178: WARNING: Enumerated list ends without a blank line; unexpected unindent. Documentation/gpu/drm-kms.rst:13: WARNING: Could not lex literal_block as "C". Highlighting skipped. Documentation/gpu/drm-kms-helpers.rst:16: WARNING: Could not lex literal_block as "C". Highlighting skipped. Documentation/gpu/i915.rst:57: WARNING: Could not lex literal_block as "C". Highlighting skipped. 
vim +/pages +644 drivers/gpu/drm/i915/i915_gem_fence.c 3271dca48 Daniel Vetter 2015-07-24 628 /** 3271dca48 Daniel Vetter 2015-07-24 629 * i915_gem_object_do_bit_17_swizzle - fixup bit 17 swizzling 3271dca48 Daniel Vetter 2015-07-24 630 * @obj: i915 GEM buffer object 3271dca48 Daniel Vetter 2015-07-24 631 * 3271dca48 Daniel Vetter 2015-07-24 632 * This function fixes up the swizzling in case any page frame number for this 3271dca48 Daniel Vetter 2015-07-24 633 * object has changed in bit 17 since that state has been saved with 3271dca48 Daniel Vetter 2015-07-24 634 * i915_gem_object_save_bit_17_swizzle(). 3271dca48 Daniel Vetter 2015-07-24 635 * 3271dca48 Daniel Vetter 2015-07-24 636 * This is called when pinning backing storage again, since the kernel is free 3271dca48 Daniel Vetter 2015-07-24 637 * to move unpinned backing storage around (either by directly moving pages or 3271dca48 Daniel Vetter 2015-07-24 638 * by swapping them out and back in again). 3271dca48 Daniel Vetter 2015-07-24 639 */ 7f96ecaf1 Daniel Vetter 2015-07-24 640 void eb1ae14ed Chris Wilson 2016-09-20 641 i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj, eb1ae14ed Chris Wilson 2016-09-20 642
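The kernel-doc warnings quoted above ("No description found for parameter 'pages'") are the usual result of adding a parameter to a documented function without extending its comment block; the fix is a matching `@pages:` line. A sketch — the description wording here is ours, not from the patch:

```c
/**
 * i915_gem_object_do_bit_17_swizzle - fixup bit 17 swizzling
 * @obj: i915 GEM buffer object
 * @pages: backing-storage sg_table to fix up (hypothetical wording;
 *         the real patch must supply its own description)
 *
 * ... rest of the existing comment unchanged ...
 */
void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
				       struct sg_table *pages);
```

Running `make htmldocs` again after adding the `@pages:` lines should silence the new warnings for i915_gem_fence.c while leaving the pre-existing drm core warnings untouched.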
[Intel-gfx] [PATCH] drm/i915: fix semicolon.cocci warnings
drivers/gpu/drm/i915/i915_gem_request.c:256:42-43: Unneeded semicolon Remove unneeded semicolon. Generated by: scripts/coccinelle/misc/semicolon.cocci CC: Chris Wilson Signed-off-by: Fengguang Wu --- i915_gem_request.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -253,7 +253,7 @@ static int i915_gem_init_global_seqno(st i915_gem_retire_requests(dev_priv); /* If the seqno wraps around, we need to clear the breadcrumb rbtree */ - timeline = &dev_priv->gt.global_timeline;; + timeline = &dev_priv->gt.global_timeline; if (!i915_seqno_passed(seqno, atomic_read(&timeline->next_seqno))) { while (intel_kick_waiters(dev_priv) || intel_kick_signalers(dev_priv)) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 35/38] drm/i915: Reserve space in the global seqno during request allocation
Hi Chris, [auto build test WARNING on drm-intel/for-linux-next] [also build test WARNING on next-20160919] [cannot apply to v4.8-rc7] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] [Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on] [Check https://git-scm.com/docs/git-format-patch for more information] url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Allow-disabling-error-capture/20160920-164839 base: git://anongit.freedesktop.org/drm-intel for-linux-next coccinelle warnings: (new ones prefixed by >>) >> drivers/gpu/drm/i915/i915_gem_request.c:256:42-43: Unneeded semicolon Please review and possibly fold the followup patch. --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH i-g-t] tests: kms_pipe_color: fix ctm tests
On Tue, Sep 20, 2016 at 09:46:58AM +0100, Lionel Landwerlin wrote: > Some of the Intel platforms have odd numbers of LUT entries and we > need to test a couple of values around the expected result. Bring > back the CRC equal function we need that doesn't trigger an assert > right away, while we still assert if we can't find a correct result in > the outer loop. > > v2: update Fixes field (Jani) > > bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97691 > Fixes: 582ce4cd19c6 ("lib/debugs: nuke igt_crc_equal again") > Cc: Daniel Vetter > Cc: Christophe Prigent > --- > tests/kms_pipe_color.c | 12 +++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/tests/kms_pipe_color.c b/tests/kms_pipe_color.c > index b0a2f07..ebfceab 100644 > --- a/tests/kms_pipe_color.c > +++ b/tests/kms_pipe_color.c > @@ -652,6 +652,16 @@ static void test_pipe_legacy_gamma_reset(data_t *data, > free(gamma_zero); > } > > +static bool crc_equal(igt_crc_t *a, igt_crc_t *b) > +{ > + int i; > + > + for (i = 0; i < a->n_words; i++) > +if (a->crc[i] != b->crc[i]) > + return false; > + return true; > +} memcmp()? > + > /* > * Draw 3 rectangles using before colors with the ctm matrix applied and verify > * the CRC is equal to using after colors with an identity ctm matrix. > @@ -724,7 +734,7 @@ static bool test_pipe_ctm(data_t *data, > /* Verify that the CRC of the software computed output is >* equal to the CRC of the CTM matrix transformation output. >*/ > - igt_assert_crc_equal(&crc_software, &crc_hardware); > + ret &= crc_equal(&crc_software, &crc_hardware); > > igt_output_set_pipe(output, PIPE_ANY); > } > -- > 2.8.1 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/6] drm/i915: keep declarations in i915_drv.h
On Tue, 20 Sep 2016, Chris Wilson wrote: > On Tue, Sep 20, 2016 at 11:57:06AM +0300, Jani Nikula wrote: >> On Mon, 19 Sep 2016, Joonas Lahtinen wrote: >> > On to, 2016-09-15 at 16:28 +0300, Jani Nikula wrote: >> >> Fix sparse warnings: >> >> >> >> drivers/gpu/drm/i915/i915_drv.c:1179:5: warning: symbol >> >> 'i915_driver_load' was not declared. Should it be static? >> >> >> >> drivers/gpu/drm/i915/i915_drv.c:1267:6: warning: symbol >> >> 'i915_driver_unload' was not declared. Should it be static? >> >> >> >> drivers/gpu/drm/i915/i915_drv.c:2444:25: warning: symbol 'i915_pm_ops' >> >> was not declared. Should it be static? >> >> >> > >> > Hmm, Chris, weren't these changes in the middle of a series? Did the cleanup patches just fall off? >> >> Can we just please use whichever patches that fix the issue? > > It was deliberate placement to avoid having those symbols exposed to > every user of i915_drv.h, i.e. everyone. Limit i915_drv.h to the > interface exposed by i915_drv.c and so used only by a handful of files. I can see that, but I prefer to get the warnings gone. BR, Jani. -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] tests/kms_psr_sink_crc: commit before querying mode
On 20-09-16 at 02:21, Rodrigo Vivi wrote: > From: Mika Kuoppala > > Commit to a mode before querying it. > > Tested-by: Rodrigo Vivi > References: https://bugs.freedesktop.org/show_bug.cgi?id=97172 > Cc: Maarten Lankhorst > Signed-off-by: Mika Kuoppala > Signed-off-by: Rodrigo Vivi > --- > tests/kms_psr_sink_crc.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/tests/kms_psr_sink_crc.c b/tests/kms_psr_sink_crc.c > index 8aafedb..68310d8 100644 > --- a/tests/kms_psr_sink_crc.c > +++ b/tests/kms_psr_sink_crc.c > @@ -112,6 +112,14 @@ static void setup_output(data_t *data) > continue; > > igt_output_set_pipe(output, pipe); > + igt_display_commit(display); > + > + if (!output->valid) { > + igt_output_set_pipe(output, PIPE_ANY); > + igt_display_commit(display); > + continue; > + } Could you be more verbose? I'm trying to get rid of this requirement, in particular the output->valid branch will never happen here. To be compatible with atomic, igt_display_commit should not be called until the setup is completed, with fb's set etc. Quick hack below, does this also fix the issue?
diff --git a/lib/igt_kms.c b/lib/igt_kms.c index 7d53360359ef..86884466c5c4 100644 --- a/lib/igt_kms.c +++ b/lib/igt_kms.c @@ -1215,13 +1215,16 @@ static void igt_display_log_shift(igt_display_t *display, int shift) igt_assert(display->log_shift >= 0); } -static void igt_output_refresh(igt_output_t *output) +static void igt_output_refresh(igt_output_t *output, bool final) { igt_display_t *display = output->display; unsigned long crtc_idx_mask; + crtc_idx_mask = output->pending_crtc_idx_mask; + /* we mask out the pipes already in use */ - crtc_idx_mask = output->pending_crtc_idx_mask & ~display->pipes_in_use; + if (final) + crtc_idx_mask &= ~display->pipes_in_use; kmstest_free_connector_config(&output->config); @@ -1504,7 +1507,7 @@ void igt_display_init(igt_display_t *display, int drm_fd) output->id = resources->connectors[i]; output->display = display; - igt_output_refresh(output); + igt_output_refresh(output, false); output->config.pipe_changed = true; } @@ -1591,7 +1594,7 @@ static void igt_display_refresh(igt_display_t *display) for (i = 0; i < display->n_outputs; i++) { igt_output_t *output = &display->outputs[i]; - igt_output_refresh(output); + igt_output_refresh(output, true); } } @@ -2462,6 +2465,8 @@ void igt_output_set_pipe(igt_output_t *output, enum pipe pipe) if (pipe != output->config.pipe) output->config.pipe_changed = true; + + igt_output_refresh(output, false); } void igt_output_set_scaling_mode(igt_output_t *output, uint64_t scaling_mode) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/6] drm/i915: keep declarations in i915_drv.h
On Tue, Sep 20, 2016 at 11:57:06AM +0300, Jani Nikula wrote: > On Mon, 19 Sep 2016, Joonas Lahtinen wrote: > > On to, 2016-09-15 at 16:28 +0300, Jani Nikula wrote: > >> Fix sparse warnings: > >> > >> drivers/gpu/drm/i915/i915_drv.c:1179:5: warning: symbol > >> 'i915_driver_load' was not declared. Should it be static? > >> > >> drivers/gpu/drm/i915/i915_drv.c:1267:6: warning: symbol > >> 'i915_driver_unload' was not declared. Should it be static? > >> > >> drivers/gpu/drm/i915/i915_drv.c:2444:25: warning: symbol 'i915_pm_ops' > >> was not declared. Should it be static? > >> > > > > Hmm, Chris, weren't these changes in the middle of a series? Did the cleanup patches just fall off? > > Can we just please use whichever patches that fix the issue? It was deliberate placement to avoid having those symbols exposed to every user of i915_drv.h, i.e. everyone. Limit i915_drv.h to the interface exposed by i915_drv.c and so used only by a handful of files. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [01/38] drm/i915: Allow disabling error capture
== Series Details == Series: series starting with [01/38] drm/i915: Allow disabling error capture URL : https://patchwork.freedesktop.org/series/12703/ State : failure == Summary == Series 12703v1 Series without cover letter https://patchwork.freedesktop.org/api/1.0/series/12703/revisions/1/mbox/ Test drv_module_reload_basic: pass -> INCOMPLETE (fi-ilk-650) pass -> DMESG-WARN (fi-bdw-5557u) Test kms_cursor_legacy: Subgroup basic-flip-after-cursor-varying-size: pass -> SKIP (fi-hsw-4770r) Subgroup basic-flip-before-cursor-varying-size: pass -> SKIP (fi-hsw-4770r) Test kms_force_connector_basic: Subgroup force-connector-state: pass -> SKIP (fi-hsw-4770k) Subgroup force-edid: pass -> SKIP (fi-hsw-4770k) Subgroup force-load-detect: pass -> SKIP (fi-hsw-4770k) Subgroup prune-stale-modes: pass -> SKIP (fi-hsw-4770k) fi-bdw-5557u total:244 pass:228 dwarn:1 dfail:0 fail:0 skip:15 fi-bsw-n3050 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 fi-byt-n2820 total:244 pass:208 dwarn:0 dfail:0 fail:1 skip:35 fi-hsw-4770k total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22 fi-hsw-4770r total:244 pass:220 dwarn:0 dfail:0 fail:0 skip:24 fi-ilk-650 total:31 pass:11 dwarn:0 dfail:0 fail:0 skip:19 fi-ivb-3520m total:244 pass:219 dwarn:0 dfail:0 fail:0 skip:25 fi-ivb-3770 total:244 pass:207 dwarn:0 dfail:0 fail:0 skip:37 fi-skl-6260u total:244 pass:230 dwarn:0 dfail:0 fail:0 skip:14 fi-skl-6700hq total:244 pass:221 dwarn:0 dfail:0 fail:1 skip:22 fi-skl-6700k total:244 pass:219 dwarn:1 dfail:0 fail:0 skip:24 fi-skl-6770hq total:244 pass:228 dwarn:1 dfail:0 fail:1 skip:14 fi-snb-2520m total:244 pass:208 dwarn:0 dfail:0 fail:0 skip:36 fi-snb-2600 total:244 pass:207 dwarn:0 dfail:0 fail:0 skip:37 Results at /archive/results/CI_IGT_test/Patchwork_2559/ 4c518aef024daa0223692124baa2d7399f54dd97 drm-intel-nightly: 2016y-09m-19d-20h-40m-51s UTC integration manifest 0a4cc56 drm/i915: Support explicit fencing for execbuf 761e0ee drm/i915: Enable userspace to opt-out of implicit fencing a673b6f
drm/i915: Enable multiple timelines 5526441 drm/i915: Reserve space in the global seqno during request allocation 2ce5a23 drm/i915: Create a unique name for the context 82dbbe1 drm/i915: Move the global sync optimisation to the timeline b44dc8d drm/i915: Defer request emission ba0e9ba drm/i915: Record space required for request emission 8702dfa drm/i915: Introduce a global_seqno for each request b483e16 drm/i915: Wait first for submission, before waiting for request completion b2b3b08 drm/i915: Queue the idling context switch after all other timelines 2dbc868 drm/i915: Combine seqno + tracking into a global timeline struct 1e5f30e drm/i915: Restore nonblocking awaits for modesetting e30f561 drm: Add reference counting to drm_atomic_state 3d6d7d3 drm/i915: Move GEM activity tracking into a common struct reservation_object 1223e3f drm/i915: Use lockless object free e22aea4 drm/i915: Move object release to a freelist + worker 59801b5 drm/i915: Acquire the backing storage outside of struct_mutex in set-domain 4d2cc56 drm/i915: Implement pwrite without struct-mutex 16ba351 drm/i915: Implement pread without struct-mutex 75967f2 drm/i915/dmabuf: Acquire the backing storage outside of struct_mutex 35ba93d drm/i915: Move object backing storage manipulation to its own locking ac493d3 drm/i915: Pass around sg_table to get_pages/put_pages backend 9901023 drm/i915: Refactor object page API 0f5c18f drm/i915: Use a radixtree for random access to the object's backing storage 58f3da0 drm/i915: Markup GEM API with lockdep asserts 339ff75 drm/i915: Reuse the active golden render state batch eb014cf drm/i915: Introduce an internal allocator for disposable private objects 2278008 drm/i915: Defer active reference until required 1a7a8ee drm/i915: Remove unused i915_gem_active_wait() in favour of _unlocked() a049b32 drm/i915: Rearrange i915_wait_request() accounting with callers 688719b drm/i915: Allow i915_sw_fence_await_sw_fence() to allocate f1aef72 drm/i915: Support asynchronous waits 
on struct fence from i915_gem_request 2da40ec drm/i915: Compress GPU objects in error state fd35d16 drm/i915: Consolidate error object printing af118c3 drm/i915: Always use the GTT for error capture a7f289e drm/i915: Stop the machine whilst capturing the GPU crash dump 0412489 drm/i915: Allow disabling error capture ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/2] drm/i915: fix pwm increment setup
On Tue, 20 Sep 2016, Ville Syrjälä wrote: > On Mon, Sep 19, 2016 at 01:35:24PM +0300, Jani Nikula wrote: >> CI got confused by all the patches flowing in the earlier thread, so >> resend. No changes. > > Series lgtm. > > Reviewed-by: Ville Syrjälä Thanks for the review, pushed to drm-intel-next-queued. BR, Jani. > >> >> BR, >> Jani. >> >> Jani Nikula (1): >> drm/i915/backlight: setup and cache pwm alternate increment value >> >> Shawn Lee (1): >> drm/i915/backlight: setup backlight pwm alternate increment on >> backlight enable >> >> drivers/gpu/drm/i915/intel_drv.h | 1 + >> drivers/gpu/drm/i915/intel_panel.c | 32 >> 2 files changed, 29 insertions(+), 4 deletions(-) >> >> -- >> 2.1.4 >> >> ___ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/6] drm/i915: keep declarations in i915_drv.h
On Mon, 19 Sep 2016, Joonas Lahtinen wrote: > On to, 2016-09-15 at 16:28 +0300, Jani Nikula wrote: >> Fix sparse warnings: >> >> drivers/gpu/drm/i915/i915_drv.c:1179:5: warning: symbol >> 'i915_driver_load' was not declared. Should it be static? >> >> drivers/gpu/drm/i915/i915_drv.c:1267:6: warning: symbol >> 'i915_driver_unload' was not declared. Should it be static? >> >> drivers/gpu/drm/i915/i915_drv.c:2444:25: warning: symbol 'i915_pm_ops' >> was not declared. Should it be static? >> > > Hmm, Chris, weren't these changes in the middle of a series? Did the cleanup patches just fall off? Can we just please use whichever patches that fix the issue? BR, Jani. > > Regards, Joonas -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH i-g-t] tests: kms_pipe_color: fix ctm tests
Some of the Intel platforms have odd numbers of LUT entries and we need to test a couple of values around the expected result. Bring back the CRC equal function we need that doesn't trigger an assert right away, while we still assert if we can't find a correct result in the outer loop. v2: update Fixes field (Jani) bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97691 Fixes: 582ce4cd19c6 ("lib/debugs: nuke igt_crc_equal again") Cc: Daniel Vetter Cc: Christophe Prigent --- tests/kms_pipe_color.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/tests/kms_pipe_color.c b/tests/kms_pipe_color.c index b0a2f07..ebfceab 100644 --- a/tests/kms_pipe_color.c +++ b/tests/kms_pipe_color.c @@ -652,6 +652,16 @@ static void test_pipe_legacy_gamma_reset(data_t *data, free(gamma_zero); } +static bool crc_equal(igt_crc_t *a, igt_crc_t *b) +{ + int i; + + for (i = 0; i < a->n_words; i++) +if (a->crc[i] != b->crc[i]) + return false; + return true; +} + /* * Draw 3 rectangles using before colors with the ctm matrix applied and verify * the CRC is equal to using after colors with an identity ctm matrix. */ @@ -724,7 +734,7 @@ static bool test_pipe_ctm(data_t *data, /* Verify that the CRC of the software computed output is * equal to the CRC of the CTM matrix transformation output. */ - igt_assert_crc_equal(&crc_software, &crc_hardware); + ret &= crc_equal(&crc_software, &crc_hardware); igt_output_set_pipe(output, PIPE_ANY); } -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/6] Remaining patches for upfront link training on DDI platforms
On Fri, 16 Sep 2016, Manasi Navare wrote: > This patch series includes some of the remaining patches to enable > upfront link training on DDI platforms for DP SST and MST. > They are based on some of the patches submitted earlier by > Ander and Durgadoss. When you post new versions of an entire series, please post them in a fresh thread, without --in-reply-to. Thanks, Jani. -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/2] drm/i915: fix pwm increment setup
On Mon, Sep 19, 2016 at 01:35:24PM +0300, Jani Nikula wrote: > CI got confused by all the patches flowing in the earlier thread, so > resend. No changes. Series lgtm. Reviewed-by: Ville Syrjälä > > BR, > Jani. > > Jani Nikula (1): > drm/i915/backlight: setup and cache pwm alternate increment value > > Shawn Lee (1): > drm/i915/backlight: setup backlight pwm alternate increment on > backlight enable > > drivers/gpu/drm/i915/intel_drv.h | 1 + > drivers/gpu/drm/i915/intel_panel.c | 32 > 2 files changed, 29 insertions(+), 4 deletions(-) > > -- > 2.1.4 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH i-g-t] lib/igt_core: Print stacktrace in initialization/fixture blocks.
On Tue, Sep 20, 2016 at 11:20:38AM +0300, Marius Vlad wrote: > Likely candidate for this behaviour is the igt_fixture block. Seen in the CI > by running tests/kms_psr_sink_crc which is causing segfaults in the fixture > block. > > While at it fix some minor printing bugs. > > Signed-off-by: Marius Vlad > CC: Chris Wilson > --- > @@ -1740,7 +1740,7 @@ static void fatal_sig_handler(int sig) > igt_assert_eq(write(STDERR_FILENO, ".\n", 2), 2); > } > > - if (in_subtest && crash_signal(sig)) { > + if (crash_signal(sig)) { > /* Linux standard to return exit code as 128 + signal */ > if (!failed_one) > igt_exitcode = 128 + sig; > @@ -1749,7 +1749,8 @@ static void fatal_sig_handler(int sig) > #ifdef HAVE_LIBUNWIND > print_backtrace_sig_safe(); > #endif > - exit_subtest("CRASH"); > + if (in_subtest) > + exit_subtest("CRASH"); Makes sense, I think - as much as I know this part of igt at least! -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 29/38] drm/i915: Wait first for submission, before waiting for request completion
In future patches, we will no longer be able to wait on a static global seqno and instead have to break our wait up into phases. First we wait for the global seqno assignment (upon submission to hardware), and once submitted we wait for the hardware to complete. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_request.c | 49 + 1 file changed, 49 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 998387e53e5e..c1fe3d28fa37 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -773,6 +773,49 @@ bool __i915_spin_request(const struct drm_i915_gem_request *req, return false; } +static long +__i915_request_wait_for_submit(struct drm_i915_gem_request *request, + unsigned int flags, + long timeout) +{ + const int state = flags & I915_WAIT_INTERRUPTIBLE ? + TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE; + wait_queue_head_t *q = &request->i915->gpu_error.wait_queue; + DEFINE_WAIT(reset); + DEFINE_WAIT(wait); + + if (flags & I915_WAIT_LOCKED) + add_wait_queue(q, &reset); + + do { + prepare_to_wait(&request->submit.wait, &wait, state); + + if (i915_sw_fence_done(&request->submit)) + break; + + if (flags & I915_WAIT_LOCKED && + i915_reset_in_progress(&request->i915->gpu_error)) { + __set_current_state(TASK_RUNNING); + i915_reset(request->i915); + reset_wait_queue(q, &reset); + continue; + } + + if (signal_pending_state(state, current)) { + timeout = -ERESTARTSYS; + break; + } + + timeout = io_schedule_timeout(timeout); + } while (timeout); + finish_wait(&request->submit.wait, &wait); + + if (flags & I915_WAIT_LOCKED) + remove_wait_queue(q, &reset); + + return timeout; +} + /** * i915_wait_request - wait until execution of request has finished * @req: duh!
@@ -811,6 +854,12 @@ long i915_wait_request(struct drm_i915_gem_request *req, trace_i915_gem_request_wait_begin(req); + if (!i915_sw_fence_done(&req->submit)) { + timeout = __i915_request_wait_for_submit(req, flags, timeout); + if (timeout < 0) + goto complete; + } + /* Optimistic short spin before touching IRQs */ if (i915_spin_request(req, state, 5)) goto complete; -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 31/38] drm/i915: Record space required for request emission
In the next patch, we will use deferred request emission. That requires reserving sufficient space in the ringbuffer to emit the request, which first requires us to know how large the request is. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_request.c | 1 + drivers/gpu/drm/i915/intel_lrc.c | 6 ++ drivers/gpu/drm/i915/intel_ringbuffer.c | 29 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 1 + 4 files changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 8a3193609bc3..c6c31aab9e38 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -427,6 +427,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, * away, e.g. because a GPU scheduler has deferred it. */ req->reserved_space = MIN_SPACE_FOR_ADD_REQUEST; + GEM_BUG_ON(req->reserved_space < engine->emit_request_sz); if (i915.enable_execlists) ret = intel_logical_ring_alloc_request_extras(req); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index a2421a566ed6..87e6bf4392fd 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1572,6 +1572,8 @@ static int gen8_emit_request(struct drm_i915_gem_request *request) return intel_logical_ring_advance(request); } +static const int gen8_emit_request_sz = 6 + WA_TAIL_DWORDS; + static int gen8_emit_request_render(struct drm_i915_gem_request *request) { struct intel_ring *ring = request->ring; @@ -1603,6 +1605,8 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request) return intel_logical_ring_advance(request); } +static const int gen8_emit_request_render_sz = 8 + WA_TAIL_DWORDS; + static int gen8_init_rcs_context(struct drm_i915_gem_request *req) { int ret; @@ -1677,6 +1681,7 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine) engine->reset_hw = reset_common_ring; engine->emit_flush = gen8_emit_flush;
engine->emit_request = gen8_emit_request; + engine->emit_request_sz = gen8_emit_request_sz; engine->submit_request = execlists_submit_request; engine->irq_enable = gen8_logical_ring_enable_irq; @@ -1799,6 +1804,7 @@ int logical_render_ring_init(struct intel_engine_cs *engine) engine->init_context = gen8_init_rcs_context; engine->emit_flush = gen8_emit_flush_render; engine->emit_request = gen8_emit_request_render; + engine->emit_request_sz = gen8_emit_request_render_sz; ret = intel_engine_create_scratch(engine, 4096); if (ret) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index de3876cbf58a..56915c1a9b7d 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1405,6 +1405,8 @@ static int i9xx_emit_request(struct drm_i915_gem_request *req) return 0; } +static const int i9xx_emit_request_sz = 4; + /** * gen6_sema_emit_request - Update the semaphore mailbox registers * @@ -1458,6 +1460,8 @@ static int gen8_render_emit_request(struct drm_i915_gem_request *req) return 0; } +static const int gen8_render_emit_request_sz = 8; + /** * intel_ring_sync - sync the waiter to the signaller on seqno * @@ -2677,8 +2681,21 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv, engine->reset_hw = reset_ring_common; engine->emit_request = i9xx_emit_request; - if (i915.semaphores) + engine->emit_request_sz = i9xx_emit_request_sz; + if (i915.semaphores) { + int num_rings; + engine->emit_request = gen6_sema_emit_request; + + num_rings = hweight32(INTEL_INFO(dev_priv)->ring_mask) - 1; + if (INTEL_GEN(dev_priv) >= 8) { + engine->emit_request_sz += num_rings * 6; + } else { + engine->emit_request_sz += num_rings * 3; + if (num_rings & 1) + engine->emit_request_sz++; + } + } engine->submit_request = i9xx_submit_request; if (INTEL_GEN(dev_priv) >= 8) @@ -2706,9 +2723,17 @@ int intel_init_render_ring_buffer(struct intel_engine_cs *engine) if (INTEL_GEN(dev_priv) >= 8) { 
engine->init_context = intel_rcs_ctx_init; engine->emit_request = gen8_render_emit_request; + engine->emit_request_sz = gen8_render_emit_request_sz; engine->emit_flush = gen8_render_ring_flush; - if (i915.semaphores) + if (i915.semaphores) { + int num_rings; +
[Intel-gfx] [PATCH 36/38] drm/i915: Enable multiple timelines
With the infrastructure converted over to tracking multiple timelines in the GEM API whilst preserving the efficiency of using a single execution timeline internally, we can now assign a separate timeline to every context with full-ppgtt. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 10 drivers/gpu/drm/i915/i915_gem.c | 5 ++ drivers/gpu/drm/i915/i915_gem_context.c | 4 +- drivers/gpu/drm/i915/i915_gem_evict.c | 11 +++-- drivers/gpu/drm/i915/i915_gem_gtt.c | 19 --- drivers/gpu/drm/i915/i915_gem_gtt.h | 4 +- drivers/gpu/drm/i915/i915_gem_request.c | 85 +++- drivers/gpu/drm/i915/i915_gem_timeline.c | 1 + drivers/gpu/drm/i915/i915_gem_timeline.h | 2 + drivers/gpu/drm/i915/intel_breadcrumbs.c | 12 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 5 -- 11 files changed, 101 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 48c63365184d..1187e8f51985 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3493,6 +3493,16 @@ static inline void i915_gem_context_put(struct i915_gem_context *ctx) kref_put(&ctx->ref, i915_gem_context_free); } +static inline struct intel_timeline * +i915_gem_context_lookup_timeline(struct i915_gem_context *ctx, +struct intel_engine_cs *engine) +{ + struct i915_address_space *vm; + + vm = ctx->ppgtt ?
&ctx->ppgtt->base : &ctx->i915->ggtt.base; + return &vm->timeline.engine[engine->id]; + } + static inline bool i915_gem_context_is_default(const struct i915_gem_context *c) { return c->user_handle == DEFAULT_CONTEXT_HANDLE; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index cd436f28e702..fd62e947322a 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2558,6 +2558,7 @@ static void i915_gem_reset_engine(struct intel_engine_cs *engine) { struct drm_i915_gem_request *request; struct i915_gem_context *incomplete_ctx; + struct intel_timeline *timeline; bool ring_hung; /* Ensure irq handler finishes, and not run again. */ @@ -2595,6 +2596,10 @@ static void i915_gem_reset_engine(struct intel_engine_cs *engine) list_for_each_entry_continue(request, &engine->timeline->requests, link) if (request->ctx == incomplete_ctx) reset_request(request); + + timeline = i915_gem_context_lookup_timeline(incomplete_ctx, engine); + list_for_each_entry(request, &timeline->requests, link) + reset_request(request); } void i915_gem_reset(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 762f3de9421f..5574bb1ca5c4 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -365,9 +365,9 @@ i915_gem_create_context(struct drm_device *dev, return ctx; if (USES_FULL_PPGTT(dev)) { - struct i915_hw_ppgtt *ppgtt = - i915_ppgtt_create(to_i915(dev), file_priv); + struct i915_hw_ppgtt *ppgtt; + ppgtt = i915_ppgtt_create(to_i915(dev), file_priv, ctx->name); if (IS_ERR(ppgtt)) { DRM_DEBUG_DRIVER("PPGTT setup failed (%ld)\n", PTR_ERR(ppgtt)); diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index 61f716c8768c..dcab3da29b04 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -33,13 +33,16 @@ #include "intel_drv.h" #include "i915_trace.h" -static bool
-gpu_is_idle(struct drm_i915_private *dev_priv) +static bool ggtt_is_idle(struct drm_i915_private *dev_priv) { + struct i915_ggtt *ggtt = &dev_priv->ggtt; struct intel_engine_cs *engine; for_each_engine(engine, dev_priv) { - if (intel_engine_is_active(engine)) + struct intel_timeline *tl; + + tl = &ggtt->base.timeline.engine[engine->id]; + if (i915_gem_active_isset(&tl->last_request)) return false; } @@ -153,7 +156,7 @@ search_again: if (!i915_is_ggtt(vm) || flags & PIN_NONBLOCK) return -ENOSPC; - if (gpu_is_idle(dev_priv)) { + if (ggtt_is_idle(dev_priv)) { /* If we still have pending pageflip completions, drop * back to userspace to give our workqueues time to * acquire our locks and unpin the old scanouts. diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 2a8eed5a70b9..0b7541b18195 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2112,8 +2112,10 @@ static int
[Intel-gfx] [PATCH 12/38] drm/i915: Reuse the active golden render state batch
The golden render state is constant, but we recreate the batch setting it up for every new context. If we keep that batch in a volatile cache we can safely reuse it whenever we need to initialise a new context. We mark the pages as purgeable and use the shrinker to recover pages from the batch whenever we face memory pressure, recreating that batch afresh on the next new context. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_render_state.c | 178 +-- drivers/gpu/drm/i915/i915_gem_render_state.h | 4 +- drivers/gpu/drm/i915/intel_engine_cs.c | 5 + drivers/gpu/drm/i915/intel_lrc.c | 2 +- drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +- drivers/gpu/drm/i915/intel_ringbuffer.h | 3 + 6 files changed, 124 insertions(+), 70 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 53175aac5cd0..31b75132f6f8 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -28,17 +28,19 @@ #include "i915_drv.h" #include "intel_renderstate.h" -struct render_state { +struct intel_render_state { const struct intel_renderstate_rodata *rodata; struct i915_vma *vma; - u32 aux_batch_size; - u32 aux_batch_offset; + u32 batch_offset; + u32 batch_size; + u32 aux_offset; + u32 aux_size; }; static const struct intel_renderstate_rodata * -render_state_get_rodata(const struct drm_i915_gem_request *req) +render_state_get_rodata(const struct intel_engine_cs *engine) { - switch (INTEL_GEN(req->i915)) { + switch (INTEL_GEN(engine->i915)) { case 6: return &gen6_null_state; case 7: @@ -63,29 +65,27 @@ render_state_get_rodata(const struct drm_i915_gem_request *req) */ #define OUT_BATCH(batch, i, val) \ do {\ - if (WARN_ON((i) >= PAGE_SIZE / sizeof(u32))) { \ - ret = -ENOSPC; \ - goto err_out; \ - } \ + if ((i) >= PAGE_SIZE / sizeof(u32)) \ + goto err; \ (batch)[(i)++] = (val); \ } while(0) -static int render_state_setup(struct render_state *so) +static int
render_state_setup(struct intel_render_state *so, + struct drm_i915_private *i915) { - struct drm_device *dev = so->vma->vm->dev; const struct intel_renderstate_rodata *rodata = so->rodata; - const bool has_64bit_reloc = INTEL_GEN(dev) >= 8; + const bool has_64bit_reloc = INTEL_GEN(i915) >= 8; + struct drm_i915_gem_object *obj = so->vma->obj; unsigned int i = 0, reloc_index = 0; - struct page *page; + unsigned int needs_clflush; u32 *d; int ret; - ret = i915_gem_object_set_to_cpu_domain(so->vma->obj, true); + ret = i915_gem_obj_prepare_shmem_write(obj, &needs_clflush); if (ret) return ret; - page = i915_gem_object_get_dirty_page(so->vma->obj, 0); - d = kmap(page); + d = kmap_atomic(i915_gem_object_get_dirty_page(obj, 0)); while (i < rodata->batch_items) { u32 s = rodata->batch[i]; @@ -95,10 +95,8 @@ static int render_state_setup(struct render_state *so) s = lower_32_bits(r); if (has_64bit_reloc) { if (i + 1 >= rodata->batch_items || - rodata->batch[i + 1] != 0) { - ret = -EINVAL; - goto err_out; - } + rodata->batch[i + 1] != 0) + goto err; d[i++] = s; s = upper_32_bits(r); @@ -110,12 +108,20 @@ static int render_state_setup(struct render_state *so) d[i++] = s; } + if (rodata->reloc[reloc_index] != -1) { + DRM_ERROR("only %d relocs resolved\n", reloc_index); + goto err; + } + + so->batch_offset = so->vma->node.start; + so->batch_size = rodata->batch_items * sizeof(u32); + while (i % CACHELINE_DWORDS) OUT_BATCH(d, i, MI_NOOP); - so->aux_batch_offset = i * sizeof(u32); + so->aux_offset = i * sizeof(u32); - if (HAS_POOLED_EU(dev)) { + if (HAS_POOLED_EU(i915)) { /* * We always program 3x6 pool config but depending upon which * subslice is disabled HW drops down to appropriate config @@
[Intel-gfx] [PATCH 28/38] drm/i915: Queue the idling context switch after all other timelines
Before suspend, we wait for the switch to the kernel context. In order for all the other context images to be complete upon suspend, that switch must be the last operation by the GPU (i.e. this idling request must not overtake any pending requests). To make this request execute last, we make it depend on every other inflight request. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem.c | 12 drivers/gpu/drm/i915/i915_gem_context.c | 21 +++-- 2 files changed, 27 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 768190594c4f..b10d26d3254d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4259,6 +4259,17 @@ void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj) i915_gem_object_put(obj); } +__maybe_unused static bool is_kernel_context(struct drm_i915_private *dev_priv) +{ + struct intel_engine_cs *engine; + + for_each_engine(engine, dev_priv) + if (engine->last_context != dev_priv->kernel_context) + return false; + + return true; +} + int i915_gem_suspend(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); @@ -4288,6 +4299,7 @@ int i915_gem_suspend(struct drm_device *dev) i915_gem_retire_requests(dev_priv); + GEM_BUG_ON(!is_kernel_context(dev_priv)); i915_gem_context_lost(dev_priv); mutex_unlock(&dev->struct_mutex); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 1d2ab73a8f43..fe45cd7640b3 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -929,21 +929,30 @@ int i915_switch_context(struct drm_i915_gem_request *req) int i915_gem_switch_to_kernel_context(struct drm_i915_private *dev_priv) { struct intel_engine_cs *engine; + struct i915_gem_timeline *timeline; for_each_engine(engine, dev_priv) { struct drm_i915_gem_request *req; int ret; - if (engine->last_context == NULL) - continue; - - if (engine->last_context ==
dev_priv->kernel_context) - continue; - req = i915_gem_request_alloc(engine, dev_priv->kernel_context); if (IS_ERR(req)) return PTR_ERR(req); + /* Queue this switch after all other activity */ + list_for_each_entry(timeline, &dev_priv->gt.timelines, link) { + struct drm_i915_gem_request *prev; + struct intel_timeline *tl; + + tl = &timeline->engine[engine->id]; + prev = i915_gem_active_raw(&tl->last_request, + &dev_priv->drm.struct_mutex); + if (prev) + i915_sw_fence_await_sw_fence(&req->submit, + &prev->submit, + NULL, GFP_KERNEL); + } + ret = i915_switch_context(req); i915_add_request_no_flush(req); if (ret) -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 35/38] drm/i915: Reserve space in the global seqno during request allocation
A restriction on our global seqnos is that they cannot wrap and that we cannot use the value 0. This allows us to detect when a request has not yet been submitted (its global seqno is still 0) and ensures that hardware semaphores are monotonic as required by older hardware. To meet these restrictions when we defer the assignment of the global seqno, we must check that we have an available slot in the global seqno space during request construction. If that test fails, we wait for all requests to be completed and reset the hardware back to 0. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen name, i915_gem_request_get_seqno(work->flip_queued_req), - dev_priv->gt.global_timeline.next_seqno, + atomic_read(&dev_priv->gt.global_timeline.next_seqno), intel_engine_get_seqno(engine), i915_gem_request_completed(work->flip_queued_req)); } else @@ -1055,7 +1055,7 @@ i915_next_seqno_get(void *data, u64 *val) { struct drm_i915_private *dev_priv = data; - *val = READ_ONCE(dev_priv->gt.global_timeline.next_seqno); + *val = atomic_read(&dev_priv->gt.global_timeline.next_seqno); return 0; } @@ -2331,8 +2331,8 @@ static int i915_rps_boost_info(struct seq_file *m, void *data) struct drm_file *file; seq_printf(m, "RPS enabled? %d\n", dev_priv->rps.enabled); - seq_printf(m, "GPU busy? %s [%x]\n", - yesno(dev_priv->gt.awake), dev_priv->gt.active_engines); + seq_printf(m, "GPU busy? %s [%d requests]\n", + yesno(dev_priv->gt.awake), dev_priv->gt.active_requests); seq_printf(m, "CPU waiting?
%d\n", count_irq_waiters(dev_priv)); seq_printf(m, "Frequency requested %d\n", intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq)); @@ -2367,7 +2367,7 @@ static int i915_rps_boost_info(struct seq_file *m, void *data) if (INTEL_GEN(dev_priv) >= 6 && dev_priv->rps.enabled && - dev_priv->gt.active_engines) { + dev_priv->gt.active_requests) { u32 rpup, rpupei; u32 rpdown, rpdownei; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bb2b8b41eb61..48c63365184d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2059,6 +2059,7 @@ struct drm_i915_private { struct list_head timelines; struct i915_gem_timeline global_timeline; + u32 active_requests; /** * Is the GPU currently considered idle, or busy executing @@ -2067,7 +2068,6 @@ struct drm_i915_private { * In order to reduce the effect on performance, there * is a slight delay before we do so. */ - unsigned int active_engines; bool awake; /** diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 659ee3b910d7..cd436f28e702 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2640,8 +2640,6 @@ static void i915_gem_cleanup_engine(struct intel_engine_cs *engine) memset(engine->execlist_port, 0, sizeof(engine->execlist_port)); spin_unlock(>execlist_lock); } - - engine->i915->gt.active_engines &= ~intel_engine_flag(engine); } void i915_gem_set_wedged(struct drm_i915_private *dev_priv) @@ -2696,7 +2694,7 @@ i915_gem_idle_work_handler(struct work_struct *work) if (!READ_ONCE(dev_priv->gt.awake)) return; - if (READ_ONCE(dev_priv->gt.active_engines)) + if (READ_ONCE(dev_priv->gt.active_requests)) return; rearm_hangcheck = @@ -2710,7 +2708,7 @@ i915_gem_idle_work_handler(struct work_struct *work) goto out_rearm; } - if (dev_priv->gt.active_engines)
[Intel-gfx] [PATCH 37/38] drm/i915: Enable userspace to opt-out of implicit fencing
Userspace is faced with a dilemma. The kernel requires implicit fencing to manage resource usage (we must always wait for the GPU to finish before releasing its PTE) and for third parties. However, userspace may wish to avoid this serialisation if it is using explicit fencing between parties and wants more fine-grained access to buffers (e.g. it may partition the buffer between uses and track fences on ranges rather than the implicit fences tracking the whole object). It follows that userspace needs a mechanism to avoid the kernel's serialisation on its implicit fences before execbuf execution. The next question is whether this is an object, execbuf or context flag. Hybrid users (such as using explicit EGL_ANDROID_native_sync fencing on shared winsys buffers, but implicit fencing on internal surfaces) require a per-object level flag. Given that this flag needs to be set only once for the lifetime of the object, this reduces the convenience of having an execbuf or context level flag (and avoids having multiple pieces of uABI controlling the same feature). Incorrect use of this flag will result in rendering corruption and GPU hangs - but will not result in use-after-free or similar resource tracking issues. Serious caveat: write ordering is not strictly correct after setting this flag on a render target on multiple engines. This affects all subsequent GEM operations (execbuf, set-domain, pread) and shared dma-buf operations. A fix is possible - but costly (both in terms of further ABI changes and runtime overhead).
Testcase: igt/gem_exec_async Signed-off-by: Chris WilsonReviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.c| 1 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +++ include/uapi/drm/i915_drm.h| 27 ++- 3 files changed, 30 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index b393347554bd..19ee76284371 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -332,6 +332,7 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_EXEC_HANDLE_LUT: case I915_PARAM_HAS_COHERENT_PHYS_GTT: case I915_PARAM_HAS_EXEC_SOFTPIN: + case I915_PARAM_HAS_EXEC_ASYNC: /* For the time being all of these are always true; * if some supported hardware does not have one of these * features this value needs to be provided from diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index e2d4f937d0b2..7038da9aa68f 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1107,6 +1107,9 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req, list_for_each_entry(vma, vmas, exec_list) { struct drm_i915_gem_object *obj = vma->obj; + if (vma->exec_entry->flags & EXEC_OBJECT_ASYNC) + continue; + ret = i915_gem_request_await_object (req, obj, obj->base.pending_write_domain); if (ret) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 03725fe89859..a2fa511b46b3 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -388,6 +388,10 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_POOLED_EU38 #define I915_PARAM_MIN_EU_IN_POOL 39 #define I915_PARAM_MMAP_GTT_VERSION 40 +/* Query whether DRM_I915_GEM_EXECBUFFER2 supports the ability to opt-out of + * synchronisation with implicit fencing on individual objects. 
+ */ +#define I915_PARAM_HAS_EXEC_ASYNC 41 typedef struct drm_i915_getparam { __s32 param; @@ -729,8 +733,29 @@ struct drm_i915_gem_exec_object2 { #define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3) #define EXEC_OBJECT_PINNED (1<<4) #define EXEC_OBJECT_PAD_TO_SIZE (1<<5) +/* The kernel implicitly tracks GPU activity on all GEM objects, and + * synchronises operations with outstanding rendering. This includes + * rendering on other devices if exported via dma-buf. However, sometimes + * this tracking is too coarse and the user knows better. For example, + * if the object is split into non-overlapping ranges shared between different + * clients or engines (i.e. suballocating objects), the implicit tracking + * by the kernel assumes that each operation affects the whole object rather + * than an individual range, causing needless synchronisation between clients. + * The kernel will also forgo any CPU cache flushes prior to rendering from + * the object as the client is expected to also be handling such domain + * tracking. + * + * The kernel maintains the implicit tracking in order to manage resources + * used by the GPU - this flag only disables the
[Intel-gfx] [PATCH 24/38] drm/i915: Move GEM activity tracking into a common struct reservation_object
In preparation to support many distinct timelines, we need to expand the activity tracking on the GEM object to handle more than just a request per engine. We already use the struct reservation_object on the dma-buf to handle many fence contexts, so integrating that into the GEM object itself is the preferred solution. (For example, we can now share the same reservation_object between every consumer/producer using this buffer and skip the manual import/export via dma-buf.) v2: Reimplement busy-ioctl (by walking the reservation object), postpone the ABI change for another day. Similarly use the reservation object to find the last_write request (if active and from i915) for choosing display CS flips. Caveats: * busy-ioctl: busy-ioctl only reports on the native fences, it will not warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is being rendered to by external fences. It also will not report the same busy state as wait-ioctl (or polling on the dma-buf) in the same circumstances. On the plus side, it does retain reporting of which *i915* engines are engaged with this object. * non-blocking atomic modesets take a step backwards as the wait for render completion blocks the ioctl. This is fixed in a subsequent patch to use a fence instead for awaiting on the rendering, see "drm/i915: Restore nonblocking awaits for modesetting" * dynamic array manipulation for shared-fences in reservation is slower than the previous lockless static assignment (e.g. gem_exec_lut_handle runtime on ivb goes from 42s to 72s). The runtime effect is far larger than the overhead added to execbuf as indicated by perf - interesting secondary effects? * loss of object-level retirement callbacks, emulated by VMA retirement tracking. 
* minor loss of object-level last activity information from debugfs, could be replaced with per-vma information if desired Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_debugfs.c| 32 ++-- drivers/gpu/drm/i915/i915_drv.h| 45 + drivers/gpu/drm/i915/i915_gem.c| 277 + drivers/gpu/drm/i915/i915_gem_batch_pool.c | 3 +- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 46 + drivers/gpu/drm/i915/i915_gem_dmabuf.h | 45 - drivers/gpu/drm/i915/i915_gem_execbuffer.c | 71 ++-- drivers/gpu/drm/i915/i915_gem_gtt.c| 32 drivers/gpu/drm/i915/i915_gem_gtt.h| 1 + drivers/gpu/drm/i915/i915_gem_request.c| 48 +++-- drivers/gpu/drm/i915/i915_gem_request.h| 37 +--- drivers/gpu/drm/i915/i915_gpu_error.c | 6 +- drivers/gpu/drm/i915/intel_atomic_plane.c | 2 - drivers/gpu/drm/i915/intel_display.c | 131 -- drivers/gpu/drm/i915/intel_drv.h | 3 - 15 files changed, 234 insertions(+), 545 deletions(-) delete mode 100644 drivers/gpu/drm/i915/i915_gem_dmabuf.h diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 355eec8f7cac..7ecdd5cc27dd 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -130,6 +130,23 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj) return size; } +static struct intel_engine_cs * +last_write_engine(struct drm_i915_gem_object *obj) +{ + struct intel_engine_cs *engine = NULL; + struct fence *fence; + + rcu_read_lock(); + fence = reservation_object_get_excl_rcu(obj->resv); + rcu_read_unlock(); + + if (fence && fence_is_i915(fence) && !fence_is_signaled(fence)) + engine = to_request(fence)->engine; + fence_put(fence); + + return engine; +} + static void describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) { @@ -138,11 +155,10 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) struct i915_vma *vma; unsigned int frontbuffer_bits; int pin_count = 0; - enum intel_engine_id id; lockdep_assert_held(>base.dev->struct_mutex); - seq_printf(m, "%pK: 
%c%c%c%c%c %8zdKiB %02x %02x [ ", + seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x %s%s%s", >base, get_active_flag(obj), get_pin_flag(obj), @@ -151,14 +167,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) get_pin_mapped_flag(obj), obj->base.size / 1024, obj->base.read_domains, - obj->base.write_domain); - for_each_engine_id(engine, dev_priv, id) - seq_printf(m, "%x ", - i915_gem_active_get_seqno(>last_read[id], - >base.dev->struct_mutex)); - seq_printf(m, "] %x %s%s%s", - i915_gem_active_get_seqno(>last_write, ->base.dev->struct_mutex), +
[Intel-gfx] [PATCH 21/38] drm/i915: Acquire the backing storage outside of struct_mutex in set-domain
As we can locklessly (well struct_mutex-lessly) acquire the backing storage, do so in set-domain-ioctl to reduce the contention on the struct_mutex. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem.c | 83 ++--- 1 file changed, 53 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 1c77c04422cc..4ee074305908 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1459,6 +1459,30 @@ write_origin(struct drm_i915_gem_object *obj, unsigned domain) obj->frontbuffer_ggtt_origin : ORIGIN_CPU); } +static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915; + struct list_head *list; + struct i915_vma *vma; + + list_for_each_entry(vma, &obj->vma_list, obj_link) { + if (!i915_vma_is_ggtt(vma)) + continue; + + if (i915_vma_is_active(vma)) + continue; + + if (!drm_mm_node_allocated(&vma->node)) + continue; + + list_move_tail(&vma->vm_link, &vma->vm->inactive_list); + } + + i915 = to_i915(obj->base.dev); + list = obj->bind_count ? &i915->mm.bound_list : &i915->mm.unbound_list; + list_move_tail(&obj->global_list, list); +} + /** * Called when user space prepares to use an object with the CPU, either * through the mmap ioctl's mapping or a GTT mapping. @@ -1500,25 +1524,40 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, MAX_SCHEDULE_TIMEOUT, to_rps_client(file)); if (ret) - goto err; + goto err_unlocked; + + /* Flush and acquire obj->pages so that we are coherent through +* direct access in memory with previous cached writes through +* shmemfs and that our cache domain tracking remains valid. +* For example, if the obj->filp was moved to swap without us +* being notified and releasing the pages, we would mistakenly +* continue to assume that the obj remained out of the CPU cached +* domain.
+*/ + ret = i915_gem_object_pin_pages(obj); + if (ret) + goto err_unlocked; ret = i915_mutex_lock_interruptible(dev); if (ret) - goto err; + goto err_pages; if (read_domains & I915_GEM_DOMAIN_GTT) ret = i915_gem_object_set_to_gtt_domain(obj, write_domain != 0); else ret = i915_gem_object_set_to_cpu_domain(obj, write_domain != 0); - if (write_domain != 0) - intel_fb_obj_invalidate(obj, write_origin(obj, write_domain)); + /* And bump the LRU for this access */ + i915_gem_object_bump_inactive_ggtt(obj); - i915_gem_object_put(obj); mutex_unlock(&dev->struct_mutex); - return ret; -err: + if (write_domain != 0) + intel_fb_obj_invalidate(obj, write_origin(obj, write_domain)); + +err_pages: + i915_gem_object_unpin_pages(obj); +err_unlocked: i915_gem_object_put_unlocked(obj); return ret; } @@ -1741,6 +1780,10 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf) if (ret) goto err; + ret = i915_gem_object_pin_pages(obj); + if (ret) + goto err; + intel_runtime_pm_get(dev_priv); ret = i915_mutex_lock_interruptible(dev); @@ -1822,6 +1865,7 @@ err_unlock: mutex_unlock(&dev->struct_mutex); err_rpm: intel_runtime_pm_put(dev_priv); + i915_gem_object_unpin_pages(obj); err: switch (ret) { case -EIO: @@ -3205,24 +3249,6 @@ i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj) I915_GEM_DOMAIN_CPU); } -static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj) -{ - struct i915_vma *vma; - - list_for_each_entry(vma, &obj->vma_list, obj_link) { - if (!i915_vma_is_ggtt(vma)) - continue; - - if (i915_vma_is_active(vma)) - continue; - - if (!drm_mm_node_allocated(&vma->node)) - continue; - - list_move_tail(&vma->vm_link, &vma->vm->inactive_list); - } -} - /** * Moves a single object to the GTT read, and possibly write domain. * @obj: object to act on @@ -3278,7 +3304,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) /* It should now be out of any other write domains, and we can update * the domain values for our changes.
*/ - BUG_ON((obj->base.write_domain & ~I915_GEM_DOMAIN_GTT) != 0); + GEM_BUG_ON((obj->base.write_domain & ~I915_GEM_DOMAIN_GTT) != 0);
[Intel-gfx] [PATCH 34/38] drm/i915: Create a unique name for the context
This will be used for communicating issues with this context to userspace, so we want to identify the parent process and the individual context. Note that the name isn't quite unique, it makes the presumption of there only being a single device fd per process. Signed-off-by: Chris WilsonReviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem_context.c | 23 ++- 2 files changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6daa4d538f42..bb2b8b41eb61 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -918,6 +918,7 @@ struct i915_gem_context { struct drm_i915_file_private *file_priv; struct i915_hw_ppgtt *ppgtt; struct pid *pid; + const char *name; struct i915_ctx_hang_stats hang_stats; diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index fe45cd7640b3..762f3de9421f 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -158,6 +158,7 @@ void i915_gem_context_free(struct kref *ctx_ref) __i915_gem_object_release_unless_active(ce->state->obj); } + kfree(ctx->name); put_pid(ctx->pid); list_del(>link); @@ -303,19 +304,28 @@ __create_hw_context(struct drm_device *dev, } /* Default context will never have a file_priv */ - if (file_priv != NULL) { + ret = DEFAULT_CONTEXT_HANDLE; + if (file_priv) { ret = idr_alloc(_priv->context_idr, ctx, DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL); if (ret < 0) goto err_out; - } else - ret = DEFAULT_CONTEXT_HANDLE; + } + ctx->user_handle = ret; ctx->file_priv = file_priv; - if (file_priv) + if (file_priv) { ctx->pid = get_task_pid(current, PIDTYPE_PID); + ctx->name = kasprintf(GFP_KERNEL, "%s[%d]/%x", + current->comm, + pid_nr(ctx->pid), + ctx->user_handle); + if (!ctx->name) { + ret = -ENOMEM; + goto err_pid; + } + } - ctx->user_handle = ret; /* NB: Mark all slices as needing a remap so that when the 
context first * loads it will restore whatever remap state already exists. If there * is no remap info, it will be a NOP. */ @@ -329,6 +339,9 @@ __create_hw_context(struct drm_device *dev, return ctx; +err_pid: + put_pid(ctx->pid); + idr_remove(&file_priv->context_idr, ctx->user_handle); err_out: context_close(ctx); return ERR_PTR(ret); -- 2.9.3
[Intel-gfx] [PATCH 32/38] drm/i915: Defer request emission
Move the actual emission of the request (the closing breadcrumb) from i915_add_request() to the submit callback. (It can be moved later when required.) This allows us to defer the allocation of the global_seqno from request construction to actual submission, allowing us to emit the requests out of order (wrt the order of their construction; they will still only be executed once all of their dependencies are resolved, including that all earlier requests on their timeline have been submitted.) We have to specialise how we then emit the request in order to write into the preallocated space, rather than at the tail of the ringbuffer (which will have been advanced by the addition of new requests). Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem_request.c | 26 ++--- drivers/gpu/drm/i915/intel_lrc.c| 120 --- drivers/gpu/drm/i915/intel_ringbuffer.c | 168 +++- drivers/gpu/drm/i915/intel_ringbuffer.h | 10 +- 4 files changed, 111 insertions(+), 213 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index c6c31aab9e38..c31cb7c4a61f 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -632,9 +632,7 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) struct intel_ring *ring = request->ring; struct intel_timeline *timeline = request->timeline; struct drm_i915_gem_request *prev; - u32 request_start; - u32 reserved_tail; - int ret; + int err; lockdep_assert_held(&request->i915->drm.struct_mutex); trace_i915_gem_request_add(request); @@ -644,8 +642,6 @@ ... * should already have been reserved in the ring buffer. Let the ring * know that it is time to use that space up.
*/ - request_start = ring->tail; - reserved_tail = request->reserved_space; request->reserved_space = 0; /* @@ -656,10 +652,10 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) * what. */ if (flush_caches) { - ret = engine->emit_flush(request, EMIT_FLUSH); + err = engine->emit_flush(request, EMIT_FLUSH); /* Not allowed to fail! */ - WARN(ret, "engine->emit_flush() failed: %d!\n", ret); + WARN(err, "engine->emit_flush() failed: %d!\n", err); } /* Record the position of the start of the breadcrumb so that @@ -667,20 +663,12 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) * GPU processing the request, we never over-estimate the * position of the ring's HEAD. */ + err = intel_ring_begin(request, engine->emit_request_sz); + GEM_BUG_ON(err); request->postfix = ring->tail; - /* Not allowed to fail! */ - ret = engine->emit_request(request); - WARN(ret, "(%s)->emit_request failed: %d!\n", engine->name, ret); - - /* Sanity check that the reserved size was large enough. */ - ret = ring->tail - request_start; - if (ret < 0) - ret += ring->size; - WARN_ONCE(ret > reserved_tail, - "Not enough space reserved (%d bytes) " - "for adding the request (%d bytes)\n", - reserved_tail, ret); + engine->emit_request(request, request->ring->vaddr + request->postfix); + ring->tail += engine->emit_request_sz * sizeof(u32); /* Seal the request and mark it as pending execution. 
Note that * we may inspect this state, without holding any locks, during diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 87e6bf4392fd..ffbb4da6cf95 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -360,7 +360,7 @@ static u64 execlists_update_context(struct drm_i915_gem_request *rq) struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt; u32 *reg_state = ce->lrc_reg_state; - reg_state[CTX_RING_TAIL+1] = intel_ring_offset(rq->ring, rq->tail); + reg_state[CTX_RING_TAIL+1] = rq->tail; /* True 32b PPGTT with dynamic page allocation: update PDP * registers and point the unallocated PDPs to scratch page. @@ -594,6 +594,15 @@ static void execlists_submit_request(struct drm_i915_gem_request *request) spin_lock_irqsave(&engine->execlist_lock, flags); + /* We keep the previous context alive until we retire the following +* request. This ensures that the context object is still pinned +* for any residual writes the HW makes into it on the context switch +* into the next object following the breadcrumb. Otherwise, we may +* retire the context too early. +*/ +
[Intel-gfx] [PATCH 22/38] drm/i915: Move object release to a freelist + worker
We want to hide the latency of releasing objects and their backing storage from the submission, so we move the actual free to a worker. This allows us to switch to struct_mutex freeing of the object in the next patch. Furthermore, if we know that the object we are dereferencing remains valid for the duration of our access, we can forgo the usual synchronisation barriers and atomic reference counting. To ensure this we defer freeing an object until after an RCU grace period, such that any lookup of the object within an RCU read critical section will remain valid until after we exit that critical section. We also employ this delay for rate-limiting the serialisation on reallocation - we have to slow down object creation in order to prevent resource starvation (in particular, files). Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_debugfs.c | 15 ++- drivers/gpu/drm/i915/i915_drv.c | 3 + drivers/gpu/drm/i915/i915_drv.h | 44 +++- drivers/gpu/drm/i915/i915_gem.c | 166 +-- drivers/gpu/drm/i915/i915_gem_shrinker.c | 20 ++-- drivers/gpu/drm/i915/i915_gem_tiling.c | 21 ++-- 6 files changed, 196 insertions(+), 73 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 0b2e0ff968cf..355eec8f7cac 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -4783,10 +4783,12 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_ring_test_irq_fops, #define DROP_BOUND 0x2 #define DROP_RETIRE 0x4 #define DROP_ACTIVE 0x8 -#define DROP_ALL (DROP_UNBOUND | \ - DROP_BOUND | \ - DROP_RETIRE | \ - DROP_ACTIVE) +#define DROP_FREED 0x10 +#define DROP_ALL (DROP_UNBOUND | \ + DROP_BOUND| \ + DROP_RETIRE | \ + DROP_ACTIVE | \ + DROP_FREED) static int i915_drop_caches_get(void *data, u64 *val) { @@ -4830,6 +4832,11 @@ i915_drop_caches_set(void *data, u64 val) unlock: mutex_unlock(&dev->struct_mutex); + if (val & DROP_FREED) { + synchronize_rcu(); + flush_work(&dev_priv->mm.free_work); + } + return ret; } diff --git
a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 7f4e8adec8a8..dded8af4092c 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -563,6 +563,9 @@ static void i915_gem_fini(struct drm_device *dev) i915_gem_context_fini(dev); mutex_unlock(&dev->struct_mutex); + synchronize_rcu(); + flush_work(&dev_priv->mm.free_work); + WARN_ON(!list_empty(&to_i915(dev)->context_list)); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e68746f616b0..ba0c8d097afc 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1337,11 +1337,17 @@ struct i915_gem_mm { struct list_head bound_list; /** * List of objects which are not bound to the GTT (thus -* are idle and not used by the GPU) but still have -* (presumably uncached) pages still attached. +* are idle and not used by the GPU). These objects may or may +* not actually have any pages attached. */ struct list_head unbound_list; + /** +* List of objects which are pending destruction. +*/ + struct llist_head free_list; + struct work_struct free_work; + /** Usable portion of the GTT for GEM */ unsigned long stolen_base; /* limited to low memory (32-bit) */ @@ -2196,6 +2202,10 @@ struct drm_i915_gem_object { /** Stolen memory for this object, instead of being backed by shmem. */ struct drm_mm_node *stolen; struct list_head global_list; + union { + struct rcu_head rcu; + struct llist_node freed; + }; /** Used in execbuf to temporarily hold a ref */ struct list_head obj_exec_link; @@ -2318,10 +2328,38 @@ to_intel_bo(struct drm_gem_object *gem) return container_of(gem, struct drm_i915_gem_object, base); } +/** + * i915_gem_object_lookup_rcu - look up a temporary GEM object from its handle + * @filp: DRM file private data + * @handle: userspace handle + * + * Returns: + * + * A pointer to the object named by the handle if such exists on @filp, NULL + * otherwise.
This object is only valid whilst under the RCU read lock, and + * note carefully the object may be in the process of being destroyed. + */ +static inline struct drm_i915_gem_object * +i915_gem_object_lookup_rcu(struct drm_file *file, u32 handle) +{ +#ifdef CONFIG_LOCKDEP + WARN_ON(!lock_is_held(&rcu_lock_map)); +#endif + return idr_find(&file->object_idr, handle); +} + static inline struct drm_i915_gem_object * i915_gem_object_lookup(struct drm_file *file, u32 handle) { - return
[Intel-gfx] [PATCH 30/38] drm/i915: Introduce a global_seqno for each request
Though we will have multiple timelines, we still have a single timeline of execution. This we can use to provide an execution and retirement order of requests. This keeps tracking execution of requests simple, and vital for preserving a single waiter (i.e. so that we can order the waiters so that only the earliest to wakeup need be woken). To accomplish this we distinguish the seqno used to order requests per-context (external) and that used internally for execution. Signed-off-by: Chris WilsonReviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_drv.h| 4 ++-- drivers/gpu/drm/i915/i915_gem.c| 4 ++-- drivers/gpu/drm/i915/i915_gem_request.c| 23 --- drivers/gpu/drm/i915/i915_gem_request.h| 30 +- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- drivers/gpu/drm/i915/i915_guc_submission.c | 4 ++-- drivers/gpu/drm/i915/i915_trace.h | 8 drivers/gpu/drm/i915/intel_breadcrumbs.c | 8 +--- drivers/gpu/drm/i915/intel_lrc.c | 4 ++-- drivers/gpu/drm/i915/intel_ringbuffer.c| 14 +++--- 11 files changed, 67 insertions(+), 36 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index ef1672b3a4bc..79ad67893d41 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -683,7 +683,7 @@ static int i915_gem_request_info(struct seq_file *m, void *data) rcu_read_lock(); task = pid ? pid_task(pid, PIDTYPE_PID) : NULL; seq_printf(m, "%x @ %d: %s [%d]\n", - req->fence.seqno, + req->global_seqno, (int) (jiffies - req->emitted_jiffies), task ? task->comm : "", task ? task->pid : -1); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 06b429f81ffd..ac83f449efa5 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3956,7 +3956,7 @@ __i915_request_irq_complete(struct drm_i915_gem_request *req) /* Before we do the heavier coherent read of the seqno, * check the value (hopefully) in the CPU cacheline. 
*/ - if (i915_gem_request_completed(req)) + if (__i915_gem_request_completed(req)) return true; /* Ensure our read of the seqno is coherent so that we @@ -4007,7 +4007,7 @@ __i915_request_irq_complete(struct drm_i915_gem_request *req) wake_up_process(tsk); rcu_read_unlock(); - if (i915_gem_request_completed(req)) + if (__i915_gem_request_completed(req)) return true; } diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index b10d26d3254d..659ee3b910d7 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2525,7 +2525,7 @@ i915_gem_find_active_request(struct intel_engine_cs *engine) * not need an engine->irq_seqno_barrier() before the seqno reads. */ list_for_each_entry(request, >timeline->requests, link) { - if (i915_gem_request_completed(request)) + if (__i915_gem_request_completed(request)) continue; if (!i915_sw_fence_done(>submit)) @@ -2575,7 +2575,7 @@ static void i915_gem_reset_engine(struct intel_engine_cs *engine) return; DRM_DEBUG_DRIVER("resetting %s to restart from tail of request 0x%x\n", -engine->name, request->fence.seqno); +engine->name, request->global_seqno); /* Setup the CS to resume from the breadcrumb of the hung request */ engine->reset_hw(engine, request); diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index c1fe3d28fa37..8a3193609bc3 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -358,7 +358,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, /* Move the oldest request to the slab-cache (if not in use!) */ req = list_first_entry_or_null(>timeline->requests, typeof(*req), link); - if (req && i915_gem_request_completed(req)) + if (req && __i915_gem_request_completed(req)) i915_gem_request_retire(req); /* Beware: Dragons be flying overhead. @@ -369,7 +369,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, * of being read by __i915_gem_active_get_rcu(). 
As such, * we have to be very careful when overwriting the contents. During
[Intel-gfx] [PATCH 20/38] drm/i915: Implement pwrite without struct-mutex
We only need struct_mutex within pwrite for a brief window where we need to serialise with rendering and control our cache domains. Elsewhere we can rely on the backing storage being pinned, and forgive userspace any races against us. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem.c | 339 ++-- 1 file changed, 117 insertions(+), 222 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 4b41f8f51dc2..1c77c04422cc 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1121,10 +1121,10 @@ out: */ static inline int -fast_user_write(struct io_mapping *mapping, - loff_t page_base, int page_offset, - char __user *user_data, - int length) +ggtt_write(struct io_mapping *mapping, + loff_t page_base, int page_offset, + char __user *user_data, + int length) { void __iomem *vaddr_atomic; void *vaddr; @@ -1136,28 +1136,15 @@ fast_user_write(struct io_mapping *mapping, unwritten = __copy_from_user_inatomic_nocache(vaddr, user_data, length); io_mapping_unmap_atomic(vaddr_atomic); - return unwritten; -} - -static inline unsigned long -slow_user_access(struct io_mapping *mapping, -uint64_t page_base, int page_offset, -char __user *user_data, -unsigned long length, bool pwrite) -{ - void __iomem *ioaddr; - void *vaddr; - uint64_t unwritten; - ioaddr = io_mapping_map_wc(mapping, page_base, PAGE_SIZE); - /* We can use the cpu mem copy function because this is X86. */ - vaddr = (void __force *)ioaddr + page_offset; - if (pwrite) - unwritten = __copy_from_user(vaddr, user_data, length); - else - unwritten = __copy_to_user(user_data, vaddr, length); + if (unwritten) { + vaddr_atomic = io_mapping_map_wc(mapping, page_base, PAGE_SIZE); + /* We can use the cpu mem copy function because this is X86. 
*/ + vaddr = (void __force*)vaddr_atomic + page_offset; + unwritten = copy_from_user(vaddr, user_data, length); + io_mapping_unmap(vaddr_atomic); + } - io_mapping_unmap(ioaddr); return unwritten; } @@ -1170,22 +1157,19 @@ slow_user_access(struct io_mapping *mapping, * @file: drm file pointer */ static int -i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915, -struct drm_i915_gem_object *obj, -struct drm_i915_gem_pwrite *args, -struct drm_file *file) +i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj, +const struct drm_i915_gem_pwrite *args) { - struct i915_ggtt *ggtt = >ggtt; - struct drm_device *dev = obj->base.dev; - struct i915_vma *vma; + struct i915_ggtt *ggtt = _i915(obj->base.dev)->ggtt; struct drm_mm_node node; - uint64_t remain, offset; - char __user *user_data; + struct i915_vma *vma; + u64 remain, offset; + void __user *user_data; int ret; - bool hit_slow_path = false; - if (i915_gem_object_is_tiled(obj)) - return -EFAULT; + ret = mutex_lock_interruptible(>base.dev->struct_mutex); + if (ret) + return ret; vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, PIN_MAPPABLE | PIN_NONBLOCK); @@ -1201,21 +1185,17 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915, if (IS_ERR(vma)) { ret = insert_mappable_node(ggtt, , PAGE_SIZE); if (ret) - goto out; - - ret = i915_gem_object_pin_pages(obj); - if (ret) { - remove_mappable_node(); - goto out; - } + goto out_unlock; + GEM_BUG_ON(!node.allocated); } ret = i915_gem_object_set_to_gtt_domain(obj, true); if (ret) goto out_unpin; + mutex_unlock(>base.dev->struct_mutex); + intel_fb_obj_invalidate(obj, ORIGIN_CPU); - obj->mm.dirty = true; user_data = u64_to_user_ptr(args->data_ptr); offset = args->offset; @@ -1246,92 +1226,36 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915, * If the object is non-shmem backed, we retry again with the * path that handles page fault. 
*/ - if (fast_user_write(>mappable, page_base, - page_offset, user_data, page_length)) { - hit_slow_path = true; - mutex_unlock(>struct_mutex); - if (slow_user_access(>mappable,
[Intel-gfx] [PATCH 14/38] drm/i915: Use a radixtree for random access to the object's backing storage
A while ago we switched from a contiguous array of pages into an sglist, for that was both more convenient for mapping to hardware and avoided the requirement for a vmalloc array of pages on every object. However, certain GEM API calls (like pwrite, pread as well as performing relocations) do desire access to individual struct pages. A quick hack was to introduce a cache of the last access such that finding the following page was quick - this works so long as the caller desired sequential access. Walking backwards, or multiple callers, still hits a slow linear search for each page. One solution is to store each successful lookup in a radix tree. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_drv.h | 57 drivers/gpu/drm/i915/i915_gem.c | 149 drivers/gpu/drm/i915/i915_gem_stolen.c | 4 +- drivers/gpu/drm/i915/i915_gem_userptr.c | 4 +- 4 files changed, 154 insertions(+), 60 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6ef2068ed399..33ce6db85ce4 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2261,9 +2261,12 @@ struct drm_i915_gem_object { struct sg_table *pages; int pages_pin_count; - struct get_page { - struct scatterlist *sg; - int last; + struct i915_gem_object_page_iter { + struct scatterlist *sg_pos; + unsigned long sg_idx; + + struct radix_tree_root radix; + struct mutex lock; } get_page; void *mapping; @@ -3152,45 +3155,21 @@ static inline int __sg_page_count(struct scatterlist *sg) return sg->length >> PAGE_SHIFT; } -struct page * -i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int n); - -static inline dma_addr_t -i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, int n) -{ - if (n < obj->get_page.last) { - obj->get_page.sg = obj->pages->sgl; - obj->get_page.last = 0; - } - - while (obj->get_page.last + __sg_page_count(obj->get_page.sg) <= n) { - obj->get_page.last += __sg_page_count(obj->get_page.sg++); - if 
(unlikely(sg_is_chain(obj->get_page.sg))) - obj->get_page.sg = sg_chain_ptr(obj->get_page.sg); - } - - return sg_dma_address(obj->get_page.sg) + ((n - obj->get_page.last) << PAGE_SHIFT); -} - -static inline struct page * -i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) -{ - if (WARN_ON(n >= obj->base.size >> PAGE_SHIFT)) - return NULL; +struct scatterlist * +i915_gem_object_get_sg(struct drm_i915_gem_object *obj, + unsigned long n, unsigned int *offset); - if (n < obj->get_page.last) { - obj->get_page.sg = obj->pages->sgl; - obj->get_page.last = 0; - } +struct page * +i915_gem_object_get_page(struct drm_i915_gem_object *obj, +unsigned long n); - while (obj->get_page.last + __sg_page_count(obj->get_page.sg) <= n) { - obj->get_page.last += __sg_page_count(obj->get_page.sg++); - if (unlikely(sg_is_chain(obj->get_page.sg))) - obj->get_page.sg = sg_chain_ptr(obj->get_page.sg); - } +struct page * +i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, + unsigned long n); - return nth_page(sg_page(obj->get_page.sg), n - obj->get_page.last); -} +dma_addr_t +i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, + unsigned long n); static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 0d8ae6c54e5a..253f92b95400 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2292,6 +2292,15 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj) kfree(obj->pages); } +static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj) +{ + struct radix_tree_iter iter; + void **slot; + + radix_tree_for_each_slot(slot, >get_page.radix, , 0) + radix_tree_delete(>get_page.radix, iter.index); +} + int i915_gem_object_put_pages(struct drm_i915_gem_object *obj) { @@ -2324,6 +2333,8 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) obj->mapping = NULL; } + 
__i915_gem_object_reset_page_iter(obj); + ops->put_pages(obj); obj->pages = NULL; @@ -2488,8 +2499,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) list_add_tail(>global_list, _priv->mm.unbound_list); - obj->get_page.sg = obj->pages->sgl; - obj->get_page.last = 0; + obj->get_page.sg_pos = obj->pages->sgl; +
[Intel-gfx] [PATCH 18/38] drm/i915/dmabuf: Acquire the backing storage outside of struct_mutex
Use the per-object mm.lock to allocate the backing storage (and hold a reference to it across the dmabuf access) without resorting to struct_mutex. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 51 ++ 1 file changed, 21 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index a4c90e915051..61b983efba3f 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -44,19 +44,15 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme struct scatterlist *src, *dst; int ret, i; - ret = i915_mutex_lock_interruptible(obj->base.dev); - if (ret) - goto err; - ret = i915_gem_object_pin_pages(obj); if (ret) - goto err_unlock; + goto err; /* Copy sg so that we make an independent mapping */ st = kmalloc(sizeof(struct sg_table), GFP_KERNEL); if (st == NULL) { ret = -ENOMEM; - goto err_unpin; + goto err_put_pages; } ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL); @@ -72,21 +68,18 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme } if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) { - ret =-ENOMEM; + ret = -ENOMEM; goto err_free_sg; } - mutex_unlock(>base.dev->struct_mutex); return st; err_free_sg: sg_free_table(st); err_free: kfree(st); -err_unpin: +err_put_pages: i915_gem_object_unpin_pages(obj); -err_unlock: - mutex_unlock(>base.dev->struct_mutex); err: return ERR_PTR(ret); } @@ -101,36 +94,21 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment, sg_free_table(sg); kfree(sg); - mutex_lock(>base.dev->struct_mutex); i915_gem_object_unpin_pages(obj); - mutex_unlock(>base.dev->struct_mutex); } static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf) { struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); - struct drm_device *dev = obj->base.dev; - void *addr; - int ret; - ret = i915_mutex_lock_interruptible(dev); - if (ret) - return 
ERR_PTR(ret); - - addr = i915_gem_object_pin_map(obj, I915_MAP_WB); - mutex_unlock(>struct_mutex); - - return addr; + return i915_gem_object_pin_map(obj, I915_MAP_WB); } static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr) { struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); - struct drm_device *dev = obj->base.dev; - mutex_lock(>struct_mutex); i915_gem_object_unpin_map(obj); - mutex_unlock(>struct_mutex); } static void *i915_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf, unsigned long page_num) @@ -177,15 +155,22 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire { struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); struct drm_device *dev = obj->base.dev; - int ret; bool write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE); + int ret; - ret = i915_mutex_lock_interruptible(dev); + ret = i915_gem_object_pin_pages(obj); if (ret) return ret; + ret = i915_mutex_lock_interruptible(dev); + if (ret) + goto err; + ret = i915_gem_object_set_to_cpu_domain(obj, write); mutex_unlock(>struct_mutex); + +err: + i915_gem_object_unpin_pages(obj); return ret; } @@ -195,13 +180,19 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direct struct drm_device *dev = obj->base.dev; int ret; - ret = i915_mutex_lock_interruptible(dev); + ret = i915_gem_object_pin_pages(obj); if (ret) return ret; + ret = i915_mutex_lock_interruptible(dev); + if (ret) + goto err; + ret = i915_gem_object_set_to_gtt_domain(obj, false); mutex_unlock(>struct_mutex); +err: + i915_gem_object_unpin_pages(obj); return ret; } -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 33/38] drm/i915: Move the global sync optimisation to the timeline
Currently we try to reduce the number of synchronisations (now the number of requests we need to wait upon) by noting that if we have earlier waited upon a request, all subsequent requests in the timeline will be after the wait. This only applies to requests in this timeline, as other timelines will not be ordered by that waiter. Signed-off-by: Chris WilsonReviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c | 9 -- drivers/gpu/drm/i915/i915_drv.h | 1 - drivers/gpu/drm/i915/i915_gem_request.c | 18 drivers/gpu/drm/i915/i915_gem_timeline.h | 1 + drivers/gpu/drm/i915/i915_gpu_error.c| 48 +++- drivers/gpu/drm/i915/intel_engine_cs.c | 2 -- drivers/gpu/drm/i915/intel_ringbuffer.c | 3 -- drivers/gpu/drm/i915/intel_ringbuffer.h | 23 --- 8 files changed, 43 insertions(+), 62 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 79ad67893d41..ca7d2e0da23c 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -3185,15 +3185,6 @@ static int i915_semaphore_status(struct seq_file *m, void *unused) seq_putc(m, '\n'); } - seq_puts(m, "\nSync seqno:\n"); - for_each_engine(engine, dev_priv) { - for (j = 0; j < num_rings; j++) - seq_printf(m, " 0x%08x ", - engine->semaphore.sync_seqno[j]); - seq_putc(m, '\n'); - } - seq_putc(m, '\n'); - intel_runtime_pm_put(dev_priv); mutex_unlock(>struct_mutex); return 0; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ac83f449efa5..6daa4d538f42 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -784,7 +784,6 @@ struct drm_i915_error_state { u32 cpu_ring_tail; u32 last_seqno; - u32 semaphore_seqno[I915_NUM_ENGINES - 1]; /* Register state */ u32 start; diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index c31cb7c4a61f..3d09301f7b40 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c 
@@ -238,8 +238,9 @@ static int i915_gem_check_wedge(struct drm_i915_private *dev_priv) static int i915_gem_init_global_seqno(struct drm_i915_private *dev_priv, u32 seqno) { + struct i915_gem_timeline *timeline; struct intel_engine_cs *engine; - int ret; + int ret, i; /* Carefully retire all requests without writing to the rings */ ret = i915_gem_wait_for_idle(dev_priv, @@ -262,6 +263,14 @@ static int i915_gem_init_global_seqno(struct drm_i915_private *dev_priv, for_each_engine(engine, dev_priv) intel_engine_init_global_seqno(engine, seqno); + list_for_each_entry(timeline, _priv->gt.timelines, link) { + for (i = 0; i < ARRAY_SIZE(timeline->engine); i++) { + struct intel_timeline *tl = >engine[i]; + + memset(tl->sync_seqno, 0, sizeof(tl->sync_seqno)); + } + } + return 0; } @@ -456,7 +465,7 @@ static int i915_gem_request_await_request(struct drm_i915_gem_request *to, struct drm_i915_gem_request *from) { - int idx, ret; + int ret; GEM_BUG_ON(to == from); @@ -476,8 +485,7 @@ i915_gem_request_await_request(struct drm_i915_gem_request *to, return ret < 0 ? ret : 0; } - idx = intel_engine_sync_index(from->engine, to->engine); - if (from->global_seqno <= from->engine->semaphore.sync_seqno[idx]) + if (from->global_seqno <= to->timeline->sync_seqno[from->engine->id]) return 0; trace_i915_gem_ring_sync_to(to, from); @@ -495,7 +503,7 @@ i915_gem_request_await_request(struct drm_i915_gem_request *to, return ret; } - from->engine->semaphore.sync_seqno[idx] = from->global_seqno; + to->timeline->sync_seqno[from->engine->id] = from->global_seqno; return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.h b/drivers/gpu/drm/i915/i915_gem_timeline.h index be34c2e5ac85..ec2e56352c4b 100644 --- a/drivers/gpu/drm/i915/i915_gem_timeline.h +++ b/drivers/gpu/drm/i915/i915_gem_timeline.h @@ -47,6 +47,7 @@ struct intel_timeline { * struct_mutex. 
*/ struct i915_gem_active last_request; + u32 sync_seqno[I915_NUM_ENGINES]; struct i915_gem_timeline *common; }; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index c263f42cca07..4ee31c9a2bd9 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++
[Intel-gfx] [PATCH 19/38] drm/i915: Implement pread without struct-mutex
We only need struct_mutex within pread for a brief window where we need to serialise with rendering and control our cache domains. Elsewhere we can rely on the backing storage being pinned, and forgive userspace any races against us. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem.c | 360 +--- 1 file changed, 156 insertions(+), 204 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c5e14c3f4dee..4b41f8f51dc2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -63,13 +63,13 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) } static int -insert_mappable_node(struct drm_i915_private *i915, +insert_mappable_node(struct i915_ggtt *ggtt, struct drm_mm_node *node, u32 size) { memset(node, 0, sizeof(*node)); - return drm_mm_insert_node_in_range_generic(>ggtt.base.mm, node, - size, 0, 0, 0, - i915->ggtt.mappable_end, + return drm_mm_insert_node_in_range_generic(>base.mm, node, + size, 0, -1, + 0, ggtt->mappable_end, DRM_MM_SEARCH_DEFAULT, DRM_MM_CREATE_DEFAULT); } @@ -820,32 +820,6 @@ err_unpin: return ret; } -/* Per-page copy function for the shmem pread fastpath. - * Flushes invalid cachelines before reading the target if - * needs_clflush is set. */ -static int -shmem_pread_fast(struct page *page, int shmem_page_offset, int page_length, -char __user *user_data, -bool page_do_bit17_swizzling, bool needs_clflush) -{ - char *vaddr; - int ret; - - if (unlikely(page_do_bit17_swizzling)) - return -EINVAL; - - vaddr = kmap_atomic(page); - if (needs_clflush) - drm_clflush_virt_range(vaddr + shmem_page_offset, - page_length); - ret = __copy_to_user_inatomic(user_data, - vaddr + shmem_page_offset, - page_length); - kunmap_atomic(vaddr); - - return ret ? 
-EFAULT : 0; -} - static void shmem_clflush_swizzled_range(char *addr, unsigned long length, bool swizzled) @@ -871,7 +845,7 @@ shmem_clflush_swizzled_range(char *addr, unsigned long length, /* Only difference to the fast-path function is that this can handle bit17 * and uses non-atomic copy and kmap functions. */ static int -shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length, +shmem_pread_slow(struct page *page, int offset, int length, char __user *user_data, bool page_do_bit17_swizzling, bool needs_clflush) { @@ -880,60 +854,132 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length, vaddr = kmap(page); if (needs_clflush) - shmem_clflush_swizzled_range(vaddr + shmem_page_offset, -page_length, + shmem_clflush_swizzled_range(vaddr + offset, length, page_do_bit17_swizzling); if (page_do_bit17_swizzling) - ret = __copy_to_user_swizzled(user_data, - vaddr, shmem_page_offset, - page_length); + ret = __copy_to_user_swizzled(user_data, vaddr, offset, length); else - ret = __copy_to_user(user_data, -vaddr + shmem_page_offset, -page_length); + ret = __copy_to_user(user_data, vaddr + offset, length); kunmap(page); return ret ? - EFAULT : 0; } -static inline unsigned long -slow_user_access(struct io_mapping *mapping, -uint64_t page_base, int page_offset, -char __user *user_data, -unsigned long length, bool pwrite) +static int +shmem_pread(struct page *page, int offset, int length, char __user *user_data, + bool page_do_bit17_swizzling, bool needs_clflush) { - void __iomem *ioaddr; + int ret; + + ret = -ENODEV; + if (!page_do_bit17_swizzling) { + char *vaddr = kmap_atomic(page); + + if (needs_clflush) + drm_clflush_virt_range(vaddr + offset, length); + ret = __copy_to_user_inatomic(user_data, vaddr + offset, length); + kunmap_atomic(vaddr); + } + if (ret == 0) + return 0; + + return shmem_pread_slow(page, offset, length, user_data, +
[Intel-gfx] [PATCH 27/38] drm/i915: Combine seqno + tracking into a global timeline struct
Our timelines are more than just a seqno. They also provide an ordered list of requests to be executed. Due to the restriction of handling individual address spaces, we are limited to a timeline per address space but we use a fence context per engine within. Our first step to introducing independent timelines per context (i.e. to allow each context to have a queue of requests to execute that have a defined set of dependencies on other requests) is to provide a timeline abstraction for the global execution queue. Signed-off-by: Chris WilsonReviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/i915_debugfs.c| 19 +++ drivers/gpu/drm/i915/i915_drv.c| 6 ++- drivers/gpu/drm/i915/i915_drv.h| 9 ++-- drivers/gpu/drm/i915/i915_gem.c| 80 ++ drivers/gpu/drm/i915/i915_gem.h| 2 + drivers/gpu/drm/i915/i915_gem_request.c| 74 --- drivers/gpu/drm/i915/i915_gem_request.h| 1 + drivers/gpu/drm/i915/i915_gem_timeline.c | 64 drivers/gpu/drm/i915/i915_gem_timeline.h | 69 ++ drivers/gpu/drm/i915/i915_gpu_error.c | 6 +-- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +- drivers/gpu/drm/i915/i915_irq.c| 2 +- drivers/gpu/drm/i915/intel_engine_cs.c | 13 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h| 35 ++--- 15 files changed, 271 insertions(+), 113 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_gem_timeline.c create mode 100644 drivers/gpu/drm/i915/i915_gem_timeline.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index b94a90f34d2d..bf07b9de078d 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -42,6 +42,7 @@ i915-y += i915_cmd_parser.o \ i915_gem_shrinker.o \ i915_gem_stolen.o \ i915_gem_tiling.o \ + i915_gem_timeline.o \ i915_gem_userptr.o \ i915_gpu_error.o \ i915_trace_points.o \ diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 8614bf4e479a..ef1672b3a4bc 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ 
b/drivers/gpu/drm/i915/i915_debugfs.c @@ -579,7 +579,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) seq_printf(m, "Flip queued on %s at seqno %x, next seqno %x [current breadcrumb %x], completed? %d\n", engine->name, i915_gem_request_get_seqno(work->flip_queued_req), - dev_priv->next_seqno, + dev_priv->gt.global_timeline.next_seqno, intel_engine_get_seqno(engine), i915_gem_request_completed(work->flip_queued_req)); } else @@ -670,13 +670,13 @@ static int i915_gem_request_info(struct seq_file *m, void *data) int count; count = 0; - list_for_each_entry(req, >request_list, link) + list_for_each_entry(req, >timeline->requests, link) count++; if (count == 0) continue; seq_printf(m, "%s requests: %d\n", engine->name, count); - list_for_each_entry(req, >request_list, link) { + list_for_each_entry(req, >timeline->requests, link) { struct pid *pid = req->ctx->pid; struct task_struct *task; @@ -1054,15 +1054,8 @@ static int i915_next_seqno_get(void *data, u64 *val) { struct drm_i915_private *dev_priv = data; - int ret; - - ret = mutex_lock_interruptible(_priv->drm.struct_mutex); - if (ret) - return ret; - - *val = dev_priv->next_seqno; - mutex_unlock(_priv->drm.struct_mutex); + *val = READ_ONCE(dev_priv->gt.global_timeline.next_seqno); return 0; } @@ -1077,7 +1070,7 @@ i915_next_seqno_set(void *data, u64 val) if (ret) return ret; - ret = i915_gem_set_seqno(dev, val); + ret = i915_gem_set_global_seqno(dev, val); mutex_unlock(>struct_mutex); return ret; @@ -1336,7 +1329,7 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused) seq_printf(m, "\tseqno = %x [current %x, last %x]\n", engine->hangcheck.seqno, seqno[id], - engine->last_submitted_seqno); + engine->timeline->last_submitted_seqno); seq_printf(m, "\twaiters? %s, fake irq active? %s\n",
[Intel-gfx] [PATCH 26/38] drm/i915: Restore nonblocking awaits for modesetting
After combining the dma-buf reservation object and the GEM reservation object, we lost the ability to do a nonblocking wait on the i915 request (as we blocked upon the reservation object during prepare_fb). We can instead convert the reservation object into a fence upon which we can asynchronously wait (including a forced timeout in case the DMA fence is never signaled). Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/intel_display.c | 78 ++-- drivers/gpu/drm/i915/intel_drv.h | 2 + 2 files changed, 51 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index c4b28a30b282..3909b55ca157 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -14408,12 +14408,33 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state) static void intel_atomic_commit_work(struct work_struct *work) { - struct drm_atomic_state *state = container_of(work, - struct drm_atomic_state, - commit_work); + struct drm_atomic_state *state = + container_of(work, struct drm_atomic_state, commit_work); + intel_atomic_commit_tail(state); } +static int __i915_sw_fence_call +intel_atomic_commit_ready(struct i915_sw_fence *fence, + enum i915_sw_fence_notify notify) +{ + struct intel_atomic_state *state = + container_of(fence, struct intel_atomic_state, commit_ready); + + switch (notify) { + case FENCE_COMPLETE: + if (state->base.commit_work.func) + queue_work(system_unbound_wq, >base.commit_work); + break; + + case FENCE_FREE: + drm_atomic_state_put(>base); + break; + } + + return NOTIFY_DONE; +} + static void intel_atomic_track_fbs(struct drm_atomic_state *state) { struct drm_plane_state *old_plane_state; @@ -14459,11 +14480,14 @@ static int intel_atomic_commit(struct drm_device *dev, if (ret) return ret; - INIT_WORK(>commit_work, intel_atomic_commit_work); + drm_atomic_state_get(state); + i915_sw_fence_init(_state->commit_ready, + intel_atomic_commit_ready); ret = 
intel_atomic_prepare_commit(dev, state); if (ret) { DRM_DEBUG_ATOMIC("Preparing state failed with %i\n", ret); + i915_sw_fence_commit(_state->commit_ready); return ret; } @@ -14474,10 +14498,14 @@ static int intel_atomic_commit(struct drm_device *dev, intel_atomic_track_fbs(state); drm_atomic_state_get(state); - if (nonblock) - queue_work(system_unbound_wq, >commit_work); - else + INIT_WORK(>commit_work, + nonblock ? intel_atomic_commit_work : NULL); + + i915_sw_fence_commit(_state->commit_ready); + if (!nonblock) { + i915_sw_fence_wait(_state->commit_ready); intel_atomic_commit_tail(state); + } return 0; } @@ -14593,8 +14621,7 @@ intel_prepare_plane_fb(struct drm_plane *plane, struct drm_framebuffer *fb = new_state->fb; struct drm_i915_gem_object *obj = intel_fb_obj(fb); struct drm_i915_gem_object *old_obj = intel_fb_obj(plane->state->fb); - long lret; - int ret = 0; + int ret; if (!obj && !old_obj) return 0; @@ -14602,7 +14629,6 @@ intel_prepare_plane_fb(struct drm_plane *plane, if (old_obj) { struct drm_crtc_state *crtc_state = drm_atomic_get_existing_crtc_state(new_state->state, plane->state->crtc); - long timeout = 0; /* Big Hammer, we also need to ensure that any pending * MI_WAIT_FOR_EVENT inside a user batch buffer on the @@ -14615,31 +14641,25 @@ intel_prepare_plane_fb(struct drm_plane *plane, * This should only fail upon a hung GPU, in which case we * can safely continue. */ - if (needs_modeset(crtc_state)) - timeout = i915_gem_object_wait(old_obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED, - MAX_SCHEDULE_TIMEOUT, - NULL); - if (timeout < 0) { - /* GPU hangs should have been swallowed by the wait */ - WARN_ON(timeout == -EIO); - return timeout; + if (needs_modeset(crtc_state)) { + ret =
[Intel-gfx] [PATCH 38/38] drm/i915: Support explicit fencing for execbuf
Now that the user can opt-out of implicit fencing, we need to give them back control over the fencing. We employ sync_file to wrap our drm_i915_gem_request and provide an fd that userspace can merge with other sync_file fds and pass back to the kernel to wait upon before future execution. Testcase: igt/gem_exec_fence Signed-off-by: Chris WilsonReviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.c| 3 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 53 +++--- include/uapi/drm/i915_drm.h| 36 +++- 4 files changed, 86 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 92ecced1bc8f..ca23fd17c3ce 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -19,6 +19,7 @@ config DRM_I915 select INPUT if ACPI select ACPI_VIDEO if ACPI select ACPI_BUTTON if ACPI + select SYNC_FILE help Choose this option if you have a system that has "Intel Graphics Media Accelerator" or "HD Graphics" integrated graphics, diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 19ee76284371..3acc1fd9a9e1 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -333,6 +333,7 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_COHERENT_PHYS_GTT: case I915_PARAM_HAS_EXEC_SOFTPIN: case I915_PARAM_HAS_EXEC_ASYNC: + case I915_PARAM_HAS_EXEC_FENCE: /* For the time being all of these are always true; * if some supported hardware does not have one of these * features this value needs to be provided from @@ -2535,7 +2536,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_HWS_ADDR, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY), DRM_IOCTL_DEF_DRV(I915_GEM_INIT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY), DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER, i915_gem_execbuffer, DRM_AUTH), - DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER2, i915_gem_execbuffer2, DRM_AUTH|DRM_RENDER_ALLOW), + 
DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER2_WR, i915_gem_execbuffer2, DRM_AUTH|DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_PIN, i915_gem_reject_pin_ioctl, DRM_AUTH|DRM_ROOT_ONLY), DRM_IOCTL_DEF_DRV(I915_GEM_UNPIN, i915_gem_reject_pin_ioctl, DRM_AUTH|DRM_ROOT_ONLY), DRM_IOCTL_DEF_DRV(I915_GEM_BUSY, i915_gem_busy_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 7038da9aa68f..a99bd002596c 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -28,6 +28,7 @@ #include #include +#include #include #include @@ -1585,6 +1586,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct i915_execbuffer_params *params = _master; const u32 ctx_id = i915_execbuffer2_get_context_id(*args); u32 dispatch_flags; + struct fence *in_fence = NULL; + struct sync_file *out_fence = NULL; + int out_fence_fd = -1; int ret; bool need_relocs; @@ -1628,6 +1632,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, dispatch_flags |= I915_DISPATCH_RS; } + if (args->flags & I915_EXEC_FENCE_IN) { + in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2)); + if (!in_fence) { + ret = -EINVAL; + goto pre_mutex_err; + } + } + + if (args->flags & I915_EXEC_FENCE_OUT) { + out_fence_fd = get_unused_fd_flags(O_CLOEXEC); + if (out_fence_fd < 0) { + ret = out_fence_fd; + out_fence_fd = -1; + goto pre_mutex_err; + } + } + /* Take a local wakeref for preparing to dispatch the execbuf as * we expect to access the hardware fairly frequently in the * process. 
Upon first dispatch, we acquire another prolonged @@ -1772,6 +1793,20 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err_batch_unpin; } + if (in_fence) { + ret = i915_gem_request_await_fence(params->request, in_fence); + if (ret < 0) + goto err_request; + } + + if (out_fence_fd != -1) { + out_fence = sync_file_create(fence_get(>request->fence)); + if (!out_fence) { + ret = -ENOMEM; + goto err_request; + } + } + /* Whilst this request exists, batch_obj will be on the
[Intel-gfx] [PATCH 09/38] drm/i915: Remove unused i915_gem_active_wait() in favour of _unlocked()
Since we only use the more generic unlocked variant, just rename it as the normal i915_gem_active_wait(). The temporary cost is that we need to always acquire the reference in a RCU safe manner, but the benefit is that we will combine the common paths. Signed-off-by: Chris WilsonReviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_request.h | 34 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +- 2 files changed, 4 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h index f1ce390c9244..45998eedda2c 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.h +++ b/drivers/gpu/drm/i915/i915_gem_request.h @@ -569,40 +569,13 @@ i915_gem_active_is_idle(const struct i915_gem_active *active, } /** - * i915_gem_active_wait - waits until the request is completed - * @active - the active request on which to wait - * - * i915_gem_active_wait() waits until the request is completed before - * returning. Note that it does not guarantee that the request is - * retired first, see i915_gem_active_retire(). - * - * i915_gem_active_wait() returns immediately if the active - * request is already complete. - */ -static inline int __must_check -i915_gem_active_wait(const struct i915_gem_active *active, struct mutex *mutex) -{ - struct drm_i915_gem_request *request; - long ret; - - request = i915_gem_active_peek(active, mutex); - if (!request) - return 0; - - ret = i915_wait_request(request, - I915_WAIT_INTERRUPTIBLE | I915_WAIT_LOCKED, - MAX_SCHEDULE_TIMEOUT); - return ret < 0 ? 
ret : 0; -} - -/** - * i915_gem_active_wait_unlocked - waits until the request is completed + * i915_gem_active_wait- waits until the request is completed * @active - the active request on which to wait * @flags - how to wait * @timeout - how long to wait at most * @rps - userspace client to charge for a waitboost * - * i915_gem_active_wait_unlocked() waits until the request is completed before + * i915_gem_active_wait() waits until the request is completed before * returning, without requiring any locks to be held. Note that it does not * retire any requests before returning. * @@ -618,8 +591,7 @@ i915_gem_active_wait(const struct i915_gem_active *active, struct mutex *mutex) * Returns 0 if successful, or a negative error code. */ static inline int -i915_gem_active_wait_unlocked(const struct i915_gem_active *active, - unsigned int flags) +i915_gem_active_wait(const struct i915_gem_active *active, unsigned int flags) { struct drm_i915_gem_request *request; long ret = 0; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 88ed21f1135b..80f2db3d63d9 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -503,7 +503,7 @@ static inline int intel_engine_idle(struct intel_engine_cs *engine, unsigned int flags) { /* Wait upon the last request to be completed */ - return i915_gem_active_wait_unlocked(>last_request, flags); + return i915_gem_active_wait(>last_request, flags); } int intel_init_render_ring_buffer(struct intel_engine_cs *engine); -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 13/38] drm/i915: Markup GEM API with lockdep asserts
Add lockdep_assert_held(struct_mutex) to the API preamble of the internal GEM interfaces. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem.c | 20 drivers/gpu/drm/i915/i915_gem_evict.c| 5 - drivers/gpu/drm/i915/i915_gem_gtt.c | 2 ++ drivers/gpu/drm/i915/i915_gem_render_state.c | 2 ++ drivers/gpu/drm/i915/i915_gem_request.c | 3 +++ 5 files changed, 31 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index e8373f264858..0d8ae6c54e5a 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -104,6 +104,8 @@ i915_gem_wait_for_error(struct i915_gpu_error *error) { int ret; + might_sleep(); + if (!i915_reset_in_progress(error)) return 0; @@ -2295,6 +2297,8 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) { const struct drm_i915_gem_object_ops *ops = obj->ops; + lockdep_assert_held(>base.dev->struct_mutex); + if (obj->pages == NULL) return 0; @@ -2466,6 +2470,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) const struct drm_i915_gem_object_ops *ops = obj->ops; int ret; + lockdep_assert_held(>base.dev->struct_mutex); + if (obj->pages) return 0; @@ -2745,6 +2751,8 @@ void i915_gem_reset(struct drm_i915_private *dev_priv) { struct intel_engine_cs *engine; + lockdep_assert_held(_priv->drm.struct_mutex); + i915_gem_retire_requests(dev_priv); for_each_engine(engine, dev_priv) @@ -2967,6 +2975,8 @@ int i915_vma_unbind(struct i915_vma *vma) unsigned long active; int ret; + lockdep_assert_held(>base.dev->struct_mutex); + /* First wait upon any activity as retiring the request may * have side-effects such as unpinning or even unbinding this vma. 
*/ @@ -3359,6 +3369,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) int ret; lockdep_assert_held(>base.dev->struct_mutex); + ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE | I915_WAIT_LOCKED | @@ -3437,6 +3448,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, struct i915_vma *vma; int ret = 0; + lockdep_assert_held(>base.dev->struct_mutex); + if (obj->cache_level == cache_level) goto out; @@ -3647,6 +3660,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, u32 old_read_domains, old_write_domain; int ret; + lockdep_assert_held(>base.dev->struct_mutex); + /* Mark the pin_display early so that we account for the * display coherency whilst setting up the cache domains. */ @@ -3713,6 +3728,7 @@ err_unpin_display: void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma) { + lockdep_assert_held(>vm->dev->struct_mutex); if (WARN_ON(vma->obj->pin_display == 0)) return; @@ -3742,6 +3758,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) int ret; lockdep_assert_held(>base.dev->struct_mutex); + ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE | I915_WAIT_LOCKED | @@ -3897,6 +3914,7 @@ int __i915_vma_do_pin(struct i915_vma *vma, unsigned int bound = vma->flags; int ret; + lockdep_assert_held(>vm->dev->struct_mutex); GEM_BUG_ON((flags & (PIN_GLOBAL | PIN_USER)) == 0); GEM_BUG_ON((flags & PIN_GLOBAL) && !i915_vma_is_ggtt(vma)); @@ -3937,6 +3955,8 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, struct i915_vma *vma; int ret; + lockdep_assert_held(>base.dev->struct_mutex); + vma = i915_gem_obj_lookup_or_create_vma(obj, vm, view); if (IS_ERR(vma)) return vma; diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index 5b6f81c1dbca..61f716c8768c 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -102,6 +102,7 @@ i915_gem_evict_something(struct 
i915_address_space *vm, struct i915_vma *vma, *next; int ret; + lockdep_assert_held(>dev->struct_mutex); trace_i915_gem_evict(vm, min_size, alignment, flags); /* @@ -212,6 +213,8 @@ i915_gem_evict_for_vma(struct i915_vma *target) { struct drm_mm_node *node, *next; + lockdep_assert_held(>vm->dev->struct_mutex); + list_for_each_entry_safe(node, next, >vm->mm.head_node.node_list, node_list) { @@ -265,7 +268,7 @@ int i915_gem_evict_vm(struct
[Intel-gfx] [PATCH 23/38] drm/i915: Use lockless object free
Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_drv.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 10 +- drivers/gpu/drm/i915/i915_gem.c | 21 ++--- drivers/gpu/drm/i915/i915_gem_tiling.c | 2 +- drivers/gpu/drm/i915/i915_gem_userptr.c | 4 ++-- drivers/gpu/drm/i915/intel_display.c| 6 +++--- drivers/gpu/drm/i915/intel_overlay.c| 4 ++-- drivers/gpu/drm/i915/intel_pm.c | 2 +- 8 files changed, 21 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index dded8af4092c..4222e3f11eea 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2579,7 +2579,7 @@ static struct drm_driver driver = { .set_busid = drm_pci_set_busid, .gem_close_object = i915_gem_close_object, - .gem_free_object = i915_gem_free_object, + .gem_free_object_unlocked = i915_gem_free_object, .gem_vm_ops = _gem_vm_ops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ba0c8d097afc..8a59e27dfee4 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2381,19 +2381,12 @@ __attribute__((nonnull)) static inline void i915_gem_object_put(struct drm_i915_gem_object *obj) { - drm_gem_object_unreference(>base); + __drm_gem_object_unreference(>base); } __deprecated extern void drm_gem_object_unreference(struct drm_gem_object *); -__attribute__((nonnull)) -static inline void -i915_gem_object_put_unlocked(struct drm_i915_gem_object *obj) -{ - drm_gem_object_unreference_unlocked(>base); -} - __deprecated extern void drm_gem_object_unreference_unlocked(struct drm_gem_object *); @@ -2495,7 +2488,6 @@ static inline struct i915_vma *i915_vma_get(struct i915_vma *vma) static inline void i915_vma_put(struct i915_vma *vma) { - lockdep_assert_held(>vm->dev->struct_mutex); i915_gem_object_put(vma->obj); } diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 4d69a58f8a2b..4b4b6f164540 100644 
--- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -616,7 +616,7 @@ i915_gem_create(struct drm_file *file, ret = drm_gem_handle_create(file, >base, ); /* drop reference from allocate - handle holds it now */ - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); if (ret) return ret; @@ -1115,7 +1115,7 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data, i915_gem_object_unpin_pages(obj); out: - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); return ret; } @@ -1451,7 +1451,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, i915_gem_object_unpin_pages(obj); err: - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); return ret; } @@ -1561,7 +1561,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, err_pages: i915_gem_object_unpin_pages(obj); err_unlocked: - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); return ret; } @@ -1592,7 +1592,7 @@ i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data, } } - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); return err; } @@ -1638,7 +1638,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data, * pages from. 
*/ if (!obj->base.filp) { - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); return -EINVAL; } @@ -1650,7 +1650,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data, struct vm_area_struct *vma; if (down_write_killable(>mmap_sem)) { - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); return -EINTR; } vma = find_vma(mm, addr); @@ -1664,7 +1664,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data, /* This may race, but that's ok, it only gets set */ WRITE_ONCE(obj->frontbuffer_ggtt_origin, ORIGIN_CPU); } - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); if (IS_ERR((void *)addr)) return addr; @@ -2074,7 +2074,7 @@ i915_gem_mmap_gtt(struct drm_file *file, if (ret == 0) *offset = drm_vma_node_offset_addr(>base.vma_node); - i915_gem_object_put_unlocked(obj); + i915_gem_object_put(obj); return ret; } @@ -2871,8 +2871,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file) args->timeout_ns = 0; } -
[Intel-gfx] [PATCH 25/38] drm: Add reference counting to drm_atomic_state
drm_atomic_state has a complicated single owner model that tracks the single reference from allocation through to destruction on another thread - or perhaps on a local error path. We can simplify this tracking by using reference counting (at a cost of a few more atomics). This is even more beneficial when the lifetime of the state becomes more convoluted than being passed to a single worker thread for the commit. v2: Double check !intel atomic_commit functions for missing gets v3: Update kerneldocs Signed-off-by: Chris WilsonCc: Daniel Vetter Cc: dri-de...@lists.freedesktop.org Reviewed-by: Eric Engestrom --- drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c | 3 +- drivers/gpu/drm/drm_atomic.c | 25 +++ drivers/gpu/drm/drm_atomic_helper.c | 98 +++- drivers/gpu/drm/drm_fb_helper.c | 9 +-- drivers/gpu/drm/exynos/exynos_drm_drv.c | 3 +- drivers/gpu/drm/i915/i915_debugfs.c | 3 +- drivers/gpu/drm/i915/intel_display.c | 31 + drivers/gpu/drm/i915/intel_sprite.c | 4 +- drivers/gpu/drm/mediatek/mtk_drm_drv.c | 3 +- drivers/gpu/drm/msm/msm_atomic.c | 3 +- drivers/gpu/drm/omapdrm/omap_drv.c | 3 +- drivers/gpu/drm/rcar-du/rcar_du_kms.c| 3 +- drivers/gpu/drm/sti/sti_drv.c| 3 +- drivers/gpu/drm/tegra/drm.c | 3 +- drivers/gpu/drm/tilcdc/tilcdc_drv.c | 2 - drivers/gpu/drm/vc4/vc4_kms.c| 3 +- include/drm/drm_atomic.h | 28 +++- include/drm/drm_crtc.h | 3 + 18 files changed, 101 insertions(+), 129 deletions(-) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c index 8e7483d90c47..c9e645fa1233 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c @@ -464,7 +464,7 @@ atmel_hlcdc_dc_atomic_complete(struct atmel_hlcdc_dc_commit *commit) drm_atomic_helper_cleanup_planes(dev, old_state); - drm_atomic_state_free(old_state); + drm_atomic_state_put(old_state); /* Complete the commit, wake up any waiter. 
*/ spin_lock(>commit.wait.lock); @@ -521,6 +521,7 @@ static int atmel_hlcdc_dc_atomic_commit(struct drm_device *dev, /* Swap the state, this is the point of no return. */ drm_atomic_helper_swap_state(state, true); + drm_atomic_state_get(state); if (async) queue_work(dc->wq, >work); else diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 23739609427d..5dd70540219c 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -74,6 +74,8 @@ EXPORT_SYMBOL(drm_atomic_state_default_release); int drm_atomic_state_init(struct drm_device *dev, struct drm_atomic_state *state) { + kref_init(>ref); + /* TODO legacy paths should maybe do a better job about * setting this appropriately? */ @@ -215,22 +217,16 @@ void drm_atomic_state_clear(struct drm_atomic_state *state) EXPORT_SYMBOL(drm_atomic_state_clear); /** - * drm_atomic_state_free - free all memory for an atomic state - * @state: atomic state to deallocate + * __drm_atomic_state_free - free all memory for an atomic state + * @ref: This atomic state to deallocate * * This frees all memory associated with an atomic state, including all the * per-object state for planes, crtcs and connectors. */ -void drm_atomic_state_free(struct drm_atomic_state *state) +void __drm_atomic_state_free(struct kref *ref) { - struct drm_device *dev; - struct drm_mode_config *config; - - if (!state) - return; - - dev = state->dev; - config = >mode_config; + struct drm_atomic_state *state = container_of(ref, typeof(*state), ref); + struct drm_mode_config *config = >dev->mode_config; drm_atomic_state_clear(state); @@ -243,7 +239,7 @@ void drm_atomic_state_free(struct drm_atomic_state *state) kfree(state); } } -EXPORT_SYMBOL(drm_atomic_state_free); +EXPORT_SYMBOL(__drm_atomic_state_free); /** * drm_atomic_get_crtc_state - get crtc state @@ -1742,7 +1738,7 @@ retry: if (arg->flags & DRM_MODE_ATOMIC_TEST_ONLY) { /* * Unlike commit, check_only does not clean up state. 
-* Below we call drm_atomic_state_free for it. +* Below we call drm_atomic_state_put for it. */ ret = drm_atomic_check_only(state); } else if (arg->flags & DRM_MODE_ATOMIC_NONBLOCK) { @@ -1775,8 +1771,7 @@ out: goto retry; } - if (ret || arg->flags & DRM_MODE_ATOMIC_TEST_ONLY) - drm_atomic_state_free(state); + drm_atomic_state_put(state);
[Intel-gfx] [PATCH 17/38] drm/i915: Move object backing storage manipulation to its own locking
Break the allocation of the backing storage away from struct_mutex into a per-object lock. This allows parallel page allocation, provided we can do so outside of struct_mutex (i.e. set-domain-ioctl, pwrite, GTT fault), i.e. before execbuf! The increased cost of the atomic counters are hidden behind i915_vma_pin() for the typical case of execbuf, i.e. as the object is typically bound between execbufs, the page_pin_count is static. The cost will be felt around set-domain and pwrite, but offset by the improvement from reduced struct_mutex contention. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_drv.h | 26 -- drivers/gpu/drm/i915/i915_gem.c | 86 +--- drivers/gpu/drm/i915/i915_gem_shrinker.c | 42 ++-- drivers/gpu/drm/i915/i915_gem_tiling.c | 2 + drivers/gpu/drm/i915/i915_gem_userptr.c | 10 ++-- 5 files changed, 100 insertions(+), 66 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e267f25fca6e..e68746f616b0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2249,7 +2249,8 @@ struct drm_i915_gem_object { unsigned int pin_display; struct { - unsigned int pages_pin_count; + struct mutex lock; + atomic_t pages_pin_count; struct sg_table *pages; void *mapping; @@ -3190,8 +3191,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); static inline int __must_check i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { - lockdep_assert_held(>base.dev->struct_mutex); - if (obj->mm.pages_pin_count++) + if (atomic_inc_not_zero(>mm.pages_pin_count)) return 0; return __i915_gem_object_get_pages(obj); @@ -3200,29 +3200,26 @@ i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) static inline void __i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { - lockdep_assert_held_unless(>base.dev->struct_mutex, - i915_gem_object_is_dead(obj)); GEM_BUG_ON(!obj->mm.pages); - obj->mm.pages_pin_count++; + atomic_inc(>mm.pages_pin_count); } static inline bool 
i915_gem_object_has_pinned_pages(struct drm_i915_gem_object *obj) { - return obj->mm.pages_pin_count; + return atomic_read(>mm.pages_pin_count); } static inline void __i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) { - lockdep_assert_held_unless(>base.dev->struct_mutex, - i915_gem_object_is_dead(obj)); GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); GEM_BUG_ON(!obj->mm.pages); - obj->mm.pages_pin_count--; + atomic_dec(>mm.pages_pin_count); } -static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) +static inline void +i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) { __i915_gem_object_unpin_pages(obj); } @@ -3245,8 +3242,8 @@ enum i915_map_type { * the kernel address space. Based on the @type of mapping, the PTE will be * set to either WriteBack or WriteCombine (via pgprot_t). * - * The caller must hold the struct_mutex, and is responsible for calling - * i915_gem_object_unpin_map() when the mapping is no longer required. + * The caller is responsible for calling i915_gem_object_unpin_map() when the + * mapping is no longer required. * * Returns the pointer through which to access the mapped object, or an * ERR_PTR() on error. @@ -3262,12 +3259,9 @@ void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj, * with your access, call i915_gem_object_unpin_map() to release the pin * upon the mapping. Once the pin count reaches zero, that mapping may be * removed. - * - * The caller must hold the struct_mutex. 
*/ static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj) { - lockdep_assert_held(>base.dev->struct_mutex); i915_gem_object_unpin_pages(obj); } diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 88ed07575f5c..c5e14c3f4dee 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2287,12 +2287,17 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj) { struct sg_table *pages; - lockdep_assert_held(>base.dev->struct_mutex); - if (i915_gem_object_has_pinned_pages(obj)) return; GEM_BUG_ON(obj->bind_count); + if (!READ_ONCE(obj->mm.pages)) + return; + + /* May be called by shrinker from within get_pages() (on another bo) */ + mutex_lock_nested(>mm.lock, SINGLE_DEPTH_NESTING); + if (unlikely(atomic_read(>mm.pages_pin_count))) + goto unlock; /* ->put_pages might need to allocate memory for the bit17 swizzle * array, hence
[Intel-gfx] [PATCH 11/38] drm/i915: Introduce an internal allocator for disposable private objects
Quite a few of our objects used for internal hardware programming do not benefit from being swappable or from being zero initialised. As such they do not benefit from using a shmemfs backing storage and since they are internal and never directly exposed to the user, we do not need to worry about providing a filp. For these we can use a drm_i915_gem_object wrapper around a sg_table of plain struct page. They are not swap backed and not automatically pinned. If they are reaped by the shrinker, the pages are released and the contents discarded. For the internal use case, this is fine as, for example, ringbuffers are pinned from being written by a request to be read by the hardware. Once they are idle, they can be discarded entirely. As such they are a good match for execlist ringbuffers and a small variety of other internal objects. In the first iteration, this is limited to the scratch batch buffers we use (for command parsing and state initialisation).

Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/Makefile                | 1 +
 drivers/gpu/drm/i915/i915_drv.h              | 5 +
 drivers/gpu/drm/i915/i915_gem_batch_pool.c   | 28 ++---
 drivers/gpu/drm/i915/i915_gem_internal.c     | 156 +++
 drivers/gpu/drm/i915/i915_gem_render_state.c | 2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c       | 2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c      | 2 +-
 7 files changed, 174 insertions(+), 22 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_internal.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a998c2bce70a..b94a90f34d2d 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
	  i915_gem_execbuffer.o \
	  i915_gem_fence.o \
	  i915_gem_gtt.o \
+	  i915_gem_internal.o \
	  i915_gem.o \
	  i915_gem_render_state.o \
	  i915_gem_request.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 89eab85ab11c..6ef2068ed399 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ 
b/drivers/gpu/drm/i915/i915_drv.h @@ -3522,6 +3522,11 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, u32 gtt_offset, u32 size); +/* i915_gem_internal.c */ +struct drm_i915_gem_object * +i915_gem_object_create_internal(struct drm_device *dev, + unsigned int size); + /* i915_gem_shrinker.c */ unsigned long i915_gem_shrink(struct drm_i915_private *dev_priv, unsigned long target, diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c index cb25cad3318c..3934c9103cf2 100644 --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c +++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c @@ -97,9 +97,9 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, size_t size) { struct drm_i915_gem_object *obj = NULL; - struct drm_i915_gem_object *tmp, *next; + struct drm_i915_gem_object *tmp; struct list_head *list; - int n; + int n, ret; lockdep_assert_held(>engine->i915->drm.struct_mutex); @@ -112,19 +112,12 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, n = ARRAY_SIZE(pool->cache_list) - 1; list = >cache_list[n]; - list_for_each_entry_safe(tmp, next, list, batch_pool_link) { + list_for_each_entry(tmp, list, batch_pool_link) { /* The batches are strictly LRU ordered */ if (!i915_gem_active_is_idle(>last_read[pool->engine->id], >base.dev->struct_mutex)) break; - /* While we're looping, do some clean up */ - if (tmp->madv == __I915_MADV_PURGED) { - list_del(>batch_pool_link); - i915_gem_object_put(tmp); - continue; - } - if (tmp->base.size >= size) { obj = tmp; break; @@ -132,19 +125,16 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, } if (obj == NULL) { - int ret; - - obj = i915_gem_object_create(>engine->i915->drm, size); + obj = i915_gem_object_create_internal(>engine->i915->drm, + size); if (IS_ERR(obj)) return obj; - - ret = i915_gem_object_get_pages(obj); - if (ret) - return ERR_PTR(ret); - - obj->madv = I915_MADV_DONTNEED; } + ret = i915_gem_object_get_pages(obj); + if (ret) + return 
ERR_PTR(ret); +
[Intel-gfx] [PATCH 15/38] drm/i915: Refactor object page API
The plan is to make obtaining the backing storage for the object avoid struct_mutex (i.e. use its own locking). The first step is to update the API so that normal users only call pin/unpin whilst working on the backing storage. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_cmd_parser.c | 2 +- drivers/gpu/drm/i915/i915_debugfs.c | 17 +-- drivers/gpu/drm/i915/i915_drv.h | 89 drivers/gpu/drm/i915/i915_gem.c | 207 +-- drivers/gpu/drm/i915/i915_gem_batch_pool.c | 3 +- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 14 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/i915_gem_fence.c| 4 +- drivers/gpu/drm/i915/i915_gem_gtt.c | 12 +- drivers/gpu/drm/i915/i915_gem_internal.c | 19 +-- drivers/gpu/drm/i915/i915_gem_render_state.c | 2 +- drivers/gpu/drm/i915/i915_gem_shrinker.c | 10 +- drivers/gpu/drm/i915/i915_gem_stolen.c | 24 ++-- drivers/gpu/drm/i915/i915_gem_tiling.c | 8 +- drivers/gpu/drm/i915/i915_gem_userptr.c | 30 ++-- drivers/gpu/drm/i915/i915_gpu_error.c| 4 +- drivers/gpu/drm/i915/intel_lrc.c | 6 +- 17 files changed, 235 insertions(+), 218 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 70980f82a15b..8d20020cb9f9 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -1290,7 +1290,7 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, } if (ret == 0 && needs_clflush_after) - drm_clflush_virt_range(shadow_batch_obj->mapping, batch_len); + drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len); i915_gem_object_unpin_map(shadow_batch_obj); return ret; diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 278fb18efde2..0b2e0ff968cf 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -114,7 +114,7 @@ static char get_global_flag(struct drm_i915_gem_object *obj) static char get_pin_mapped_flag(struct drm_i915_gem_object *obj) { - return 
obj->mapping ? 'M' : ' '; + return obj->mm.mapping ? 'M' : ' '; } static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj) @@ -160,8 +160,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) i915_gem_active_get_seqno(>last_write, >base.dev->struct_mutex), i915_cache_level_str(dev_priv, obj->cache_level), - obj->dirty ? " dirty" : "", - obj->madv == I915_MADV_DONTNEED ? " purgeable" : ""); + obj->mm.dirty ? " dirty" : "", + obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : ""); if (obj->base.name) seq_printf(m, " (name: %d)", obj->base.name); list_for_each_entry(vma, >vma_list, obj_link) { @@ -413,12 +413,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) size += obj->base.size; ++count; - if (obj->madv == I915_MADV_DONTNEED) { + if (obj->mm.madv == I915_MADV_DONTNEED) { purgeable_size += obj->base.size; ++purgeable_count; } - if (obj->mapping) { + if (obj->mm.mapping) { mapped_count++; mapped_size += obj->base.size; } @@ -435,12 +435,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) ++dpy_count; } - if (obj->madv == I915_MADV_DONTNEED) { + if (obj->mm.madv == I915_MADV_DONTNEED) { purgeable_size += obj->base.size; ++purgeable_count; } - if (obj->mapping) { + if (obj->mm.mapping) { mapped_count++; mapped_size += obj->base.size; } @@ -1978,7 +1978,7 @@ static void i915_dump_lrc_obj(struct seq_file *m, seq_printf(m, "\tBound in GGTT at 0x%08x\n", i915_ggtt_offset(vma)); - if (i915_gem_object_get_pages(vma->obj)) { + if (i915_gem_object_pin_pages(vma->obj)) { seq_puts(m, "\tFailed to get pages for context object\n\n"); return; } @@ -1997,6 +1997,7 @@ static void i915_dump_lrc_obj(struct seq_file *m, kunmap_atomic(reg_state); } + i915_gem_object_unpin_pages(vma->obj); seq_putc(m, '\n'); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 33ce6db85ce4..d51a4ec512fa 100644 ---
[Intel-gfx] [PATCH 16/38] drm/i915: Pass around sg_table to get_pages/put_pages backend
The plan is to move obj->pages out from under the struct_mutex into its own per-object lock. We need to prune any assumption of the struct_mutex from the get_pages/put_pages backends, and to make it easier we pass around the sg_table to operate on rather than indirectly via the obj. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_drv.h | 36 +-- drivers/gpu/drm/i915/i915_gem.c | 174 +++ drivers/gpu/drm/i915/i915_gem_dmabuf.c | 20 ++-- drivers/gpu/drm/i915/i915_gem_fence.c| 16 +-- drivers/gpu/drm/i915/i915_gem_gtt.c | 19 ++-- drivers/gpu/drm/i915/i915_gem_gtt.h | 6 +- drivers/gpu/drm/i915/i915_gem_internal.c | 22 ++-- drivers/gpu/drm/i915/i915_gem_shrinker.c | 11 +- drivers/gpu/drm/i915/i915_gem_stolen.c | 43 drivers/gpu/drm/i915/i915_gem_userptr.c | 88 10 files changed, 226 insertions(+), 209 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index d51a4ec512fa..e267f25fca6e 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2157,8 +2157,8 @@ struct drm_i915_gem_object_ops { * being released or under memory pressure (where we attempt to * reap pages for the shrinker). 
*/ - int (*get_pages)(struct drm_i915_gem_object *); - void (*put_pages)(struct drm_i915_gem_object *); + struct sg_table *(*get_pages)(struct drm_i915_gem_object *); + void (*put_pages)(struct drm_i915_gem_object *, struct sg_table *); int (*dmabuf_export)(struct drm_i915_gem_object *); void (*release)(struct drm_i915_gem_object *); @@ -2300,8 +2300,6 @@ struct drm_i915_gem_object { struct i915_gem_userptr { uintptr_t ptr; unsigned read_only :1; - unsigned workers :4; -#define I915_GEM_USERPTR_MAX_WORKERS 15 struct i915_mm_struct *mm; struct i915_mmu_object *mmu_object; @@ -2361,6 +2359,19 @@ __deprecated extern void drm_gem_object_unreference_unlocked(struct drm_gem_object *); static inline bool +i915_gem_object_is_dead(const struct drm_i915_gem_object *obj) +{ + return atomic_read(>base.refcount.refcount) == 0; +} + +#if IS_ENABLED(CONFIG_LOCKDEP) +#define lockdep_assert_held_unless(lock, cond) \ + GEM_BUG_ON(!lockdep_is_held(lock) && !(cond)) +#else +#define lockdep_assert_held_unless(lock, cond) +#endif + +static inline bool i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj) { return obj->ops->flags & I915_GEM_OBJECT_HAS_STRUCT_PAGE; @@ -3172,6 +3183,8 @@ dma_addr_t i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, unsigned long n); +void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, +struct sg_table *pages); int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); static inline int __must_check @@ -3187,7 +3200,8 @@ i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) static inline void __i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { - lockdep_assert_held(>base.dev->struct_mutex); + lockdep_assert_held_unless(>base.dev->struct_mutex, + i915_gem_object_is_dead(obj)); GEM_BUG_ON(!obj->mm.pages); obj->mm.pages_pin_count++; } @@ -3201,7 +3215,8 @@ i915_gem_object_has_pinned_pages(struct drm_i915_gem_object *obj) static inline void __i915_gem_object_unpin_pages(struct 
drm_i915_gem_object *obj) { - lockdep_assert_held(>base.dev->struct_mutex); + lockdep_assert_held_unless(>base.dev->struct_mutex, + i915_gem_object_is_dead(obj)); GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); GEM_BUG_ON(!obj->mm.pages); obj->mm.pages_pin_count--; @@ -3212,7 +3227,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) __i915_gem_object_unpin_pages(obj); } -int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj); +void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj); +void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj); enum i915_map_type { I915_MAP_WB = 0, @@ -3435,8 +3451,10 @@ i915_vma_unpin_fence(struct i915_vma *vma) void i915_gem_restore_fences(struct drm_device *dev); void i915_gem_detect_bit_6_swizzle(struct drm_device *dev); -void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj); -void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj); +void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj, + struct sg_table *pages); +void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj, +
[Intel-gfx] [PATCH 06/38] drm/i915: Support asynchronous waits on struct fence from i915_gem_request
We will need to wait on DMA completion (as signaled via struct fence) before executing our i915_gem_request. Therefore we want to expose a method for adding the await on the fence itself to the request. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem_request.c | 40 + drivers/gpu/drm/i915/i915_gem_request.h | 2 ++ 2 files changed, 42 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 40978bc12ceb..624d71bf4518 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -23,6 +23,7 @@ */ #include +#include #include "i915_drv.h" @@ -494,6 +495,45 @@ i915_gem_request_await_request(struct drm_i915_gem_request *to, return 0; } +int +i915_gem_request_await_fence(struct drm_i915_gem_request *req, +struct fence *fence) +{ + struct fence_array *array; + int ret; + int i; + + if (test_bit(FENCE_FLAG_SIGNALED_BIT, >flags)) + return 0; + + if (fence_is_i915(fence)) + return i915_gem_request_await_request(req, to_request(fence)); + + if (!fence_is_array(fence)) { + ret = i915_sw_fence_await_dma_fence(>submit, + fence, 10*HZ, + GFP_KERNEL); + return ret < 0 ? 
ret : 0; + } + + array = to_fence_array(fence); + for (i = 0; i < array->num_fences; i++) { + struct fence *child = array->fences[i]; + + if (fence_is_i915(child)) + ret = i915_gem_request_await_request(req, +to_request(child)); + else + ret = i915_sw_fence_await_dma_fence(>submit, + child, 10*HZ, + GFP_KERNEL); + if (ret < 0) + return ret; + } + + return 0; +} + /** * i915_gem_request_await_object - set this request to (async) wait upon a bo * diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h index 974bd7bcc801..c85a3d82febf 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.h +++ b/drivers/gpu/drm/i915/i915_gem_request.h @@ -214,6 +214,8 @@ int i915_gem_request_await_object(struct drm_i915_gem_request *to, struct drm_i915_gem_object *obj, bool write); +int i915_gem_request_await_fence(struct drm_i915_gem_request *req, +struct fence *fence); void __i915_add_request(struct drm_i915_gem_request *req, bool flush_caches); #define i915_add_request(req) \ -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 02/38] drm/i915: Stop the machine whilst capturing the GPU crash dump
The error state is purposefully racy as we expect it to be called at any time and so have avoided any locking whilst capturing the crash dump. However, with multi-engine GPUs and multiple CPUs, those races can manifest into OOPSes as we attempt to chase dangling pointers freed on other CPUs. Under discussion are lots of ways to slow down normal operation in order to protect the post-mortem error capture, but what if we take the opposite approach and freeze the machine whilst the error capture runs (note the GPU may still be running, but as long as we don't process any of the results the driver's bookkeeping will be static). Note that by itself, this is not a complete fix. It also depends on the compiler barriers in list_add/list_del to prevent traversing the lists into the void. We also depend on only requiring state from carefully controlled sources - i.e. all the state we require for post-mortem debugging should be reachable from the request itself so that we only have to worry about retrieving the request carefully. Once we have the request, we know that all pointers from it are intact. v2: Avoid drm_clflush_pages() inside stop_machine() as it may use stop_machine() itself for its wbinvd fallback.
Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gpu_error.c | 46 +-- 3 files changed, 31 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 10a6ac11b6a9..0f46a9c04c0e 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -4,6 +4,7 @@ config DRM_I915 depends on X86 && PCI select INTEL_GTT select INTERVAL_TREE + select STOP_MACHINE # we need shmfs for the swappable backing store, and in particular # the shmem_readpage() which depends upon tmpfs select SHMEM diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 1643acb02513..3fda31816fdd 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -735,6 +735,8 @@ struct drm_i915_error_state { struct kref ref; struct timeval time; + struct drm_i915_private *i915; + char error_msg[128]; bool simulated; int iommu; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 364f5b17df21..13ad86e11695 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -28,6 +28,7 @@ */ #include +#include #include "i915_drv.h" #ifdef CONFIG_DRM_I915_CAPTURE_ERROR @@ -720,14 +721,12 @@ i915_error_object_create(struct drm_i915_private *dev_priv, dst->page_count = num_pages; while (num_pages--) { - unsigned long flags; void *d; d = kmalloc(PAGE_SIZE, GFP_ATOMIC); if (d == NULL) goto unwind; - local_irq_save(flags); if (use_ggtt) { void __iomem *s; @@ -746,15 +745,10 @@ i915_error_object_create(struct drm_i915_private *dev_priv, page = i915_gem_object_get_page(src, i); - drm_clflush_pages(, 1); - s = kmap_atomic(page); memcpy(d, s, PAGE_SIZE); kunmap_atomic(s); - - drm_clflush_pages(, 1); } - local_irq_restore(flags); dst->pages[i++] = d; reloc_offset += PAGE_SIZE; @@ -1420,6 +1414,31 @@ static void i915_capture_gen_state(struct 
drm_i915_private *dev_priv, sizeof(error->device_info)); } +static int capture(void *data) +{ + struct drm_i915_error_state *error = data; + + /* Ensure that what we readback from memory matches what the GPU sees */ + wbinvd(); + + i915_capture_gen_state(error->i915, error); + i915_capture_reg_state(error->i915, error); + i915_gem_record_fences(error->i915, error); + i915_gem_record_rings(error->i915, error); + i915_capture_active_buffers(error->i915, error); + i915_capture_pinned_buffers(error->i915, error); + + do_gettimeofday(>time); + + error->overlay = intel_overlay_capture_error_state(error->i915); + error->display = intel_display_capture_error_state(error->i915); + + /* And make sure we don't leave trash in the CPU cache */ + wbinvd(); + + return 0; +} + /** * i915_capture_error_state - capture an error record for later analysis * @dev: drm device @@ -1451,18 +1470,9 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv, } kref_init(>ref); + error->i915 = dev_priv; - i915_capture_gen_state(dev_priv, error); -
[Intel-gfx] [PATCH 03/38] drm/i915: Always use the GTT for error capture
Since the GTT provides universal access to any GPU page, we can use it to reduce our plethora of read methods to just one. It also has the important characteristic of being exactly what the GPU sees - if there are incoherency problems, seeing the batch as executed (rather than as trapped inside the cpu cache) is important. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_gem_gtt.c | 43 drivers/gpu/drm/i915/i915_gem_gtt.h | 2 + drivers/gpu/drm/i915/i915_gpu_error.c | 121 -- 3 files changed, 74 insertions(+), 92 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 0bb4232f66bc..7906fc3aa488 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2717,6 +2717,7 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv) */ struct i915_ggtt *ggtt = _priv->ggtt; unsigned long hole_start, hole_end; + struct i915_hw_ppgtt *ppgtt; struct drm_mm_node *entry; int ret; @@ -2724,6 +2725,15 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv) if (ret) return ret; + /* Reserve a mappable slot for our lockless error capture */ + ret = drm_mm_insert_node_in_range_generic(>base.mm, + >gpu_error, + 4096, 0, -1, + 0, ggtt->mappable_end, + 0, 0); + if (ret) + return ret; + /* Clear any non-preallocated blocks */ drm_mm_for_each_hole(entry, >base.mm, hole_start, hole_end) { DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", @@ -2738,25 +2748,21 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv) true); if (USES_PPGTT(dev_priv) && !USES_FULL_PPGTT(dev_priv)) { - struct i915_hw_ppgtt *ppgtt; - ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL); - if (!ppgtt) - return -ENOMEM; + if (!ppgtt) { + ret = -ENOMEM; + goto err; + } ret = __hw_ppgtt_init(ppgtt, dev_priv); - if (ret) { - kfree(ppgtt); - return ret; - } + if (ret) + goto err_ppgtt; - if (ppgtt->base.allocate_va_range) + if (ppgtt->base.allocate_va_range) { ret = ppgtt->base.allocate_va_range(>base, 0, 
ppgtt->base.total); - if (ret) { - ppgtt->base.cleanup(>base); - kfree(ppgtt); - return ret; + if (ret) + goto err_ppgtt_cleanup; } ppgtt->base.clear_range(>base, @@ -2770,6 +2776,14 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv) } return 0; + +err_ppgtt_cleanup: + ppgtt->base.cleanup(>base); +err_ppgtt: + kfree(ppgtt); +err: + drm_mm_remove_node(>gpu_error); + return ret; } /** @@ -2788,6 +2802,9 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private *dev_priv) i915_gem_cleanup_stolen(_priv->drm); + if (drm_mm_node_allocated(>gpu_error)) + drm_mm_remove_node(>gpu_error); + if (drm_mm_initialized(>base.mm)) { intel_vgt_deballoon(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index ec78be2f8c77..a90999fe2e57 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -450,6 +450,8 @@ struct i915_ggtt { bool do_idle_maps; int mtrr; + + struct drm_mm_node gpu_error; }; struct i915_hw_ppgtt { diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 13ad86e11695..82a928b0b2f3 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -632,7 +632,7 @@ static void i915_error_object_free(struct drm_i915_error_object *obj) return; for (page = 0; page < obj->page_count; page++) - kfree(obj->pages[page]); + free_page((unsigned long)obj->pages[page]); kfree(obj); } @@ -669,98 +669,68 @@ static void i915_error_state_free(struct kref *error_ref) kfree(error); } +static int compress_page(void *src, struct drm_i915_error_object *dst) +{ + unsigned long page; + + page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN); + if (!page) + return -ENOMEM; + +
[Intel-gfx] [PATCH 08/38] drm/i915: Rearrange i915_wait_request() accounting with callers
Our low-level wait routine has evolved from our generic wait interface that handled unlocked, RPS boosting, waits with time tracking. If we push our GEM fence tracking to use reservation_objects (required for handling multiple timelines), we lose the ability to pass the required information down to i915_wait_request(). However, if we push the extra functionality from i915_wait_request() to the individual callsites (i915_gem_object_wait_rendering and i915_gem_wait_ioctl) that make use of those extras, we can both simplify our low level wait and prepare for extending the GEM interface for use of reservation_objects. * This causes a temporary regression in the use of wait-ioctl as a busy query -- it will fail to report immediately, but go into i915_wait_request() before timing out. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_drv.h | 7 +- drivers/gpu/drm/i915/i915_gem.c | 300 +++- drivers/gpu/drm/i915/i915_gem_request.c | 117 ++--- drivers/gpu/drm/i915/i915_gem_request.h | 32 ++-- drivers/gpu/drm/i915/i915_gem_userptr.c | 12 +- drivers/gpu/drm/i915/intel_display.c| 34 ++-- drivers/gpu/drm/i915/intel_ringbuffer.c | 14 +- drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +- 8 files changed, 283 insertions(+), 236 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index dc0cca48ff3d..a3401418bf4d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3279,9 +3279,10 @@ int __must_check i915_gem_wait_for_idle(struct drm_i915_private *dev_priv, int __must_check i915_gem_suspend(struct drm_device *dev); void i915_gem_resume(struct drm_device *dev); int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf); -int __must_check -i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj, - bool readonly); +int i915_gem_object_wait(struct drm_i915_gem_object *obj, +unsigned int flags, +long timeout, +struct intel_rps_client *rps); int __must_check 
i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c8bd02277b7d..d9214c9d31d2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -292,7 +292,12 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj) * must wait for all rendering to complete to the object (as unbinding * must anyway), and retire the requests. */ - ret = i915_gem_object_wait_rendering(obj, false); + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_LOCKED | + I915_WAIT_ALL, + MAX_SCHEDULE_TIMEOUT, + NULL); if (ret) return ret; @@ -311,88 +316,171 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj) return ret; } -/** - * Ensures that all rendering to the object has completed and the object is - * safe to unbind from the GTT or access from the CPU. - * @obj: i915 gem object - * @readonly: waiting for just read access or read-write access - */ -int -i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj, - bool readonly) +static long +i915_gem_object_wait_fence(struct fence *fence, + unsigned int flags, + long timeout, + struct intel_rps_client *rps) { - struct reservation_object *resv; - struct i915_gem_active *active; - unsigned long active_mask; - int idx; + struct drm_i915_gem_request *rq; - lockdep_assert_held(>base.dev->struct_mutex); + BUILD_BUG_ON(I915_WAIT_INTERRUPTIBLE != 0x1); - if (!readonly) { - active = obj->last_read; - active_mask = i915_gem_object_get_active(obj); - } else { - active_mask = 1; - active = >last_write; + if (test_bit(FENCE_FLAG_SIGNALED_BIT, >flags)) + return timeout; + + if (!fence_is_i915(fence)) + return fence_wait_timeout(fence, + flags & I915_WAIT_INTERRUPTIBLE, + timeout); + + rq = to_request(fence); + if (i915_gem_request_completed(rq)) + goto out; + + /* This client is about to stall waiting for the GPU. 
In many cases +* this is undesirable and limits the throughput of the system, as +* many clients cannot continue processing user input/output whilst +* blocked. RPS autotuning may take tens of milliseconds to respond +
[Intel-gfx] [PATCH 07/38] drm/i915: Allow i915_sw_fence_await_sw_fence() to allocate
In forthcoming patches, we want to be able to dynamically allocate the wait_queue_t used whilst awaiting. This is more convenient if we extend i915_sw_fence_await_sw_fence() to perform the allocation for us when we pass in a gfp mask instead of a preallocated struct. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_request.c | 2 +- drivers/gpu/drm/i915/i915_sw_fence.c| 22 -- drivers/gpu/drm/i915/i915_sw_fence.h| 7 ++- 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 624d71bf4518..7496661b295e 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -678,7 +678,7 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) &request->i915->drm.struct_mutex); if (prev) i915_sw_fence_await_sw_fence(&request->submit, &prev->submit, -&request->submitq); +&request->submitq, GFP_NOWAIT); request->emitted_jiffies = jiffies; request->previous_seqno = engine->last_submitted_seqno; diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index 1e5cbc585ca2..1fd20b69feea 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -13,6 +13,8 @@ #include "i915_sw_fence.h" +#define I915_SW_FENCE_FLAG_ALLOC BIT(0) + static DEFINE_SPINLOCK(i915_sw_fence_lock); static int __i915_sw_fence_notify(struct i915_sw_fence *fence, @@ -135,6 +137,8 @@ static int i915_sw_fence_wake(wait_queue_t *wq, unsigned mode, int flags, void * list_del(&wq->task_list); __i915_sw_fence_complete(wq->private, key); i915_sw_fence_put(wq->private); + if (wq->flags & I915_SW_FENCE_FLAG_ALLOC) + kfree(wq); return 0; } @@ -194,7 +198,7 @@ static bool i915_sw_fence_check_if_after(struct i915_sw_fence *fence, int i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence, struct i915_sw_fence *signaler, -wait_queue_t *wq) +wait_queue_t *wq, gfp_t gfp) { unsigned long flags;
int pending; @@ -206,8 +210,22 @@ int i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence, if (unlikely(i915_sw_fence_check_if_after(fence, signaler))) return -EINVAL; + pending = 0; + if (!wq) { + wq = kmalloc(sizeof(*wq), gfp); + if (!wq) { + if (!gfpflags_allow_blocking(gfp)) + return -ENOMEM; + + i915_sw_fence_wait(signaler); + return 0; + } + + pending |= I915_SW_FENCE_FLAG_ALLOC; + } + INIT_LIST_HEAD(>task_list); - wq->flags = 0; + wq->flags = pending; wq->func = i915_sw_fence_wake; wq->private = i915_sw_fence_get(fence); diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h index 373141602ca4..d221df381ec2 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.h +++ b/drivers/gpu/drm/i915/i915_sw_fence.h @@ -45,7 +45,7 @@ void i915_sw_fence_commit(struct i915_sw_fence *fence); int i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence, struct i915_sw_fence *after, -wait_queue_t *wq); +wait_queue_t *wq, gfp_t gfp); int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence, struct fence *dma, unsigned long timeout, @@ -62,4 +62,9 @@ static inline bool i915_sw_fence_done(const struct i915_sw_fence *fence) return atomic_read(>pending) < 0; } +static inline void i915_sw_fence_wait(struct i915_sw_fence *fence) +{ + wait_event(fence->wait, i915_sw_fence_done(fence)); +} + #endif /* _I915_SW_FENCE_H_ */ -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 10/38] drm/i915: Defer active reference until required
We only need the active reference to keep the object alive after the handle has been deleted (so as to prevent a synchronous gem_close). Why then pay the price of a kref on every execbuf when we can insert that final active ref just in time for the handle deletion? Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/i915_drv.h | 34 drivers/gpu/drm/i915/i915_gem.c | 25 +++- drivers/gpu/drm/i915/i915_gem_batch_pool.c | 2 +- drivers/gpu/drm/i915/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 -- drivers/gpu/drm/i915/i915_gem_gtt.c | 2 +- drivers/gpu/drm/i915/i915_gem_render_state.c | 2 +- drivers/gpu/drm/i915/intel_ringbuffer.c | 4 ++-- 8 files changed, 64 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a3401418bf4d..89eab85ab11c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2214,6 +2214,13 @@ struct drm_i915_gem_object { ((READ_ONCE((bo)->flags) >> I915_BO_ACTIVE_SHIFT) & I915_BO_ACTIVE_MASK) /** +* Have we taken a reference for the object for incomplete GPU +* activity? 
+*/ +#define I915_BO_ACTIVE_REF_SHIFT (I915_BO_ACTIVE_SHIFT + I915_NUM_ENGINES) +#define I915_BO_ACTIVE_REF BIT(I915_BO_ACTIVE_REF_SHIFT) + + /** * This is set if the object has been written to since last bound * to the GTT */ @@ -2383,6 +2390,28 @@ i915_gem_object_has_active_engine(const struct drm_i915_gem_object *obj, return obj->flags & BIT(engine + I915_BO_ACTIVE_SHIFT); } +static inline bool +i915_gem_object_has_active_reference(const struct drm_i915_gem_object *obj) +{ + return obj->flags & I915_BO_ACTIVE_REF; +} + +static inline void +i915_gem_object_set_active_reference(struct drm_i915_gem_object *obj) +{ + lockdep_assert_held(>base.dev->struct_mutex); + obj->flags |= I915_BO_ACTIVE_REF; +} + +static inline void +i915_gem_object_clear_active_reference(struct drm_i915_gem_object *obj) +{ + lockdep_assert_held(>base.dev->struct_mutex); + obj->flags &= ~I915_BO_ACTIVE_REF; +} + +void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj); + static inline unsigned int i915_gem_object_get_tiling(struct drm_i915_gem_object *obj) { @@ -2413,6 +2442,11 @@ static inline void i915_vma_put(struct i915_vma *vma) i915_gem_object_put(vma->obj); } +static inline void i915_vma_put_internal(struct i915_vma *vma) +{ + __i915_gem_object_release_unless_active(vma->obj); +} + /* * Optimised SGL iterator for GEM objects */ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d9214c9d31d2..e8373f264858 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2618,7 +2618,10 @@ i915_gem_object_retire__read(struct i915_gem_active *active, list_move_tail(>global_list, >i915->mm.bound_list); - i915_gem_object_put(obj); + if (i915_gem_object_has_active_reference(obj)) { + i915_gem_object_clear_active_reference(obj); + i915_gem_object_put(obj); + } } static bool i915_context_is_banned(const struct i915_gem_context *ctx) @@ -2881,6 +2884,12 @@ void i915_gem_close_object(struct drm_gem_object *gem, struct 
drm_file *file) list_for_each_entry_safe(vma, vn, >vma_list, obj_link) if (vma->vm->file == fpriv) i915_vma_close(vma); + + if (i915_gem_object_is_active(obj) && + !i915_gem_object_has_active_reference(obj)) { + i915_gem_object_set_active_reference(obj); + i915_gem_object_get(obj); + } mutex_unlock(>base.dev->struct_mutex); } @@ -4358,6 +4367,20 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) intel_runtime_pm_put(dev_priv); } +void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj) +{ + if (!obj) + return; + + lockdep_assert_held(>base.dev->struct_mutex); + + GEM_BUG_ON(i915_gem_object_has_active_reference(obj)); + if (i915_gem_object_is_active(obj)) + i915_gem_object_set_active_reference(obj); + else + i915_gem_object_put(obj); +} + int i915_gem_suspend(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c index ed989596d9a3..cb25cad3318c 100644 --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c +++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c @@ -73,7 +73,7 @@ void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool) list_for_each_entry_safe(obj, next,
[Intel-gfx] Multiple timelines, take 2
A minor hitch in handling the unification of vma + obj retirements is that we trigger a recursive use-after-free if we try to purge an object with an active reference. One way to side-step that issue was to pull in the deferred free work, which pulled in dependencies on the separate mm.lock and a few rearrangements to make that possible. -Chris
[Intel-gfx] [PATCH 01/38] drm/i915: Allow disabling error capture
We currently capture the GPU state after we detect a hang. This is vital for us to both triage and debug hangs in the wild (post-mortem debugging). However, it comes at the cost of running some potentially dangerous code (since it has to make very few assumptions about the state of the driver) that is quite resource intensive. Signed-off-by: Chris Wilson--- drivers/gpu/drm/i915/Kconfig | 10 ++ drivers/gpu/drm/i915/i915_debugfs.c | 6 ++ drivers/gpu/drm/i915/i915_drv.h | 16 drivers/gpu/drm/i915/i915_gpu_error.c | 7 +++ drivers/gpu/drm/i915/i915_params.c| 9 + drivers/gpu/drm/i915/i915_params.h| 1 + drivers/gpu/drm/i915/i915_sysfs.c | 8 drivers/gpu/drm/i915/intel_display.c | 4 drivers/gpu/drm/i915/intel_overlay.c | 4 9 files changed, 65 insertions(+) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 7769e469118f..10a6ac11b6a9 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -46,6 +46,16 @@ config DRM_I915_PRELIMINARY_HW_SUPPORT If in doubt, say "N". +config DRM_I915_CAPTURE_ERROR + bool "Enable capturing GPU state following a hang" + depends on DRM_I915 + default y + help + This option enables capturing the GPU state when a hang is detected. + This information is vital for triaging hangs and assists in debugging. + + If in doubt, say "Y".
+ config DRM_I915_USERPTR bool "Always enable userptr support" depends on DRM_I915 diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 27b0e34dadec..278fb18efde2 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -956,6 +956,8 @@ static int i915_hws_info(struct seq_file *m, void *data) return 0; } +#ifdef CONFIG_DRM_I915_CAPTURE_ERROR + static ssize_t i915_error_state_write(struct file *filp, const char __user *ubuf, @@ -1038,6 +1040,8 @@ static const struct file_operations i915_error_state_fops = { .release = i915_error_state_release, }; +#endif + static int i915_next_seqno_get(void *data, u64 *val) { @@ -5309,7 +5313,9 @@ static const struct i915_debugfs_files { {"i915_ring_missed_irq", _ring_missed_irq_fops}, {"i915_ring_test_irq", _ring_test_irq_fops}, {"i915_gem_drop_caches", _drop_caches_fops}, +#ifdef CONFIG_DRM_I915_CAPTURE_ERROR {"i915_error_state", _error_state_fops}, +#endif {"i915_next_seqno", _next_seqno_fops}, {"i915_display_crc_ctl", _display_crc_ctl_fops}, {"i915_pri_wm_latency", _pri_wm_latency_fops}, diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4dd307ed4336..1643acb02513 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3522,6 +3522,8 @@ static inline void intel_display_crc_init(struct drm_i915_private *dev_priv) {} #endif /* i915_gpu_error.c */ +#ifdef CONFIG_DRM_I915_CAPTURE_ERROR + __printf(2, 3) void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...); int i915_error_state_to_str(struct drm_i915_error_state_buf *estr, @@ -3542,6 +3544,20 @@ void i915_error_state_get(struct drm_device *dev, void i915_error_state_put(struct i915_error_state_file_priv *error_priv); void i915_destroy_error_state(struct drm_device *dev); +#else + +static inline void i915_capture_error_state(struct drm_i915_private *dev_priv, + u32 engine_mask, + const char *error_msg) +{ +} + +static 
inline void i915_destroy_error_state(struct drm_device *dev) +{ +} + +#endif + void i915_get_extra_instdone(struct drm_i915_private *dev_priv, uint32_t *instdone); const char *i915_cache_level_str(struct drm_i915_private *i915, int type); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 334f15df7c8d..364f5b17df21 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -30,6 +30,8 @@ #include #include "i915_drv.h" +#ifdef CONFIG_DRM_I915_CAPTURE_ERROR + static const char *engine_str(int engine) { switch (engine) { @@ -1435,6 +1437,9 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv, struct drm_i915_error_state *error; unsigned long flags; + if (!i915.error_capture) + return; + if (READ_ONCE(dev_priv->gpu_error.first_error)) return; @@ -1520,6 +1525,8 @@ void i915_destroy_error_state(struct drm_device *dev) kref_put(>ref, i915_error_state_free); } +#endif + const char *i915_cache_level_str(struct drm_i915_private *i915, int type) { switch (type) { diff --git a/drivers/gpu/drm/i915/i915_params.c
[Intel-gfx] [PATCH 04/38] drm/i915: Consolidate error object printing
Leave all the pretty printing to userspace and simplify the error capture to only have a single common object printer. It makes the kernel code more compact, and the refactoring allows us to apply more complex transformations like compressing the output. Signed-off-by: Chris WilsonReviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gpu_error.c | 100 +- 1 file changed, 25 insertions(+), 75 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 82a928b0b2f3..f4cc6746dc72 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -311,10 +311,22 @@ void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...) } static void print_error_obj(struct drm_i915_error_state_buf *m, + struct intel_engine_cs *engine, + const char *name, struct drm_i915_error_object *obj) { int page, offset, elt; + if (!obj) + return; + + if (name) { + err_printf(m, "%s --- %s = 0x%08x %08x\n", + engine ? 
engine->name : "global", name, + upper_32_bits(obj->gtt_offset), + lower_32_bits(obj->gtt_offset)); + } + for (page = offset = 0; page < obj->page_count; page++) { for (elt = 0; elt < PAGE_SIZE/4; elt++) { err_printf(m, "%08x : %08x\n", offset, @@ -342,8 +354,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, struct pci_dev *pdev = dev_priv->drm.pdev; struct drm_i915_error_state *error = error_priv->error; struct drm_i915_error_object *obj; - int i, j, offset, elt; int max_hangcheck_score; + int i, j; if (!error) { err_printf(m, "no error state collected\n"); @@ -467,15 +479,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, err_printf(m, " --- gtt_offset = 0x%08x %08x\n", upper_32_bits(obj->gtt_offset), lower_32_bits(obj->gtt_offset)); - print_error_obj(m, obj); - } - - obj = ee->wa_batchbuffer; - if (obj) { - err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n", - dev_priv->engine[i].name, - lower_32_bits(obj->gtt_offset)); - print_error_obj(m, obj); + print_error_obj(m, _priv->engine[i], NULL, obj); } if (ee->num_requests) { @@ -507,77 +511,23 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, } } - if ((obj = ee->ringbuffer)) { - err_printf(m, "%s --- ringbuffer = 0x%08x\n", - dev_priv->engine[i].name, - lower_32_bits(obj->gtt_offset)); - print_error_obj(m, obj); - } + print_error_obj(m, _priv->engine[i], + "ringbuffer", ee->ringbuffer); - if ((obj = ee->hws_page)) { - u64 hws_offset = obj->gtt_offset; - u32 *hws_page = >pages[0][0]; + print_error_obj(m, _priv->engine[i], + "HW Status", ee->hws_page); - if (i915.enable_execlists) { - hws_offset += LRC_PPHWSP_PN * PAGE_SIZE; - hws_page = >pages[LRC_PPHWSP_PN][0]; - } - err_printf(m, "%s --- HW Status = 0x%08llx\n", - dev_priv->engine[i].name, hws_offset); - offset = 0; - for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { - err_printf(m, "[%04x] %08x %08x %08x %08x\n", - offset, - hws_page[elt], - hws_page[elt+1], - hws_page[elt+2], - hws_page[elt+3]); - offset += 16; 
- } - } + print_error_obj(m, _priv->engine[i], + "HW context", ee->ctx); - obj = ee->wa_ctx; - if (obj) { - u64 wa_ctx_offset = obj->gtt_offset; - u32 *wa_ctx_page = >pages[0][0]; - struct intel_engine_cs *engine = _priv->engine[RCS]; -
[Intel-gfx] [PATCH 05/38] drm/i915: Compress GPU objects in error state
Our error states are quickly growing, pinning kernel memory with them. The
majority of the space is taken up by the error objects. These compress well
using zlib and without decode are mostly meaningless, so encoding them does
not hinder quickly parsing the error state for familiarity.

v2: Make the zlib dependency optional

Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/Kconfig          |  12 +++
 drivers/gpu/drm/i915/i915_drv.h       |   3 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 170 +-
 3 files changed, 163 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 0f46a9c04c0e..92ecced1bc8f 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -57,6 +57,18 @@ config DRM_I915_CAPTURE_ERROR
 	  If in doubt, say "Y".
 
+config DRM_I915_COMPRESS_ERROR
+	bool "Compress GPU error state"
+	depends on DRM_I915_CAPTURE_ERROR
+	select ZLIB_DEFLATE
+	default y
+	help
+	  This option selects ZLIB_DEFLATE if it isn't already
+	  selected and causes any error state captured upon a GPU hang
+	  to be compressed using zlib.
+
+	  If in doubt, say "Y".
+
 config DRM_I915_USERPTR
 	bool "Always enable userptr support"
 	depends on DRM_I915
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3fda31816fdd..dc0cca48ff3d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -806,9 +806,10 @@ struct drm_i915_error_state {
 		u32 semaphore_mboxes[I915_NUM_ENGINES - 1];
 		struct drm_i915_error_object {
-			int page_count;
 			u64 gtt_offset;
 			u64 gtt_size;
+			int page_count;
+			int unused;
 			u32 *pages[0];
 		} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index f4cc6746dc72..dcb279035b29 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -29,6 +29,7 @@
 #include
 #include
+#include <linux/zlib.h>
 
 #include "i915_drv.h"
 
 #ifdef CONFIG_DRM_I915_CAPTURE_ERROR
@@ -175,6 +176,110 @@ static void i915_error_puts(struct drm_i915_error_state_buf *e,
 #define err_printf(e, ...) i915_error_printf(e, __VA_ARGS__)
 #define err_puts(e, s) i915_error_puts(e, s)
 
+#ifdef CONFIG_DRM_I915_COMPRESS_ERROR
+
+static bool compress_init(struct z_stream_s *zstream)
+{
+	memset(zstream, 0, sizeof(*zstream));
+
+	zstream->workspace =
+		kmalloc(zlib_deflate_workspacesize(MAX_WBITS, MAX_MEM_LEVEL),
+			GFP_ATOMIC | __GFP_NOWARN);
+	if (!zstream->workspace)
+		return false;
+
+	if (zlib_deflateInit(zstream, Z_DEFAULT_COMPRESSION) != Z_OK) {
+		kfree(zstream->workspace);
+		return false;
+	}
+
+	return true;
+}
+
+static int compress_page(struct z_stream_s *zstream,
+			 void *src,
+			 struct drm_i915_error_object *dst)
+{
+	zstream->next_in = src;
+	zstream->avail_in = PAGE_SIZE;
+
+	do {
+		if (zstream->avail_out == 0) {
+			unsigned long page;
+
+			page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN);
+			if (!page)
+				return -ENOMEM;
+
+			dst->pages[dst->page_count++] = (void *)page;
+
+			zstream->next_out = (void *)page;
+			zstream->avail_out = PAGE_SIZE;
+		}
+
+		if (zlib_deflate(zstream, Z_SYNC_FLUSH) != Z_OK)
+			return -EIO;
+
+		/* Fallback to uncompressed if we increase size? */
+		if (0 && zstream->total_out > zstream->total_in)
+			return -E2BIG;
+	} while (zstream->avail_in);
+
+	return 0;
+}
+
+static void compress_fini(struct z_stream_s *zstream,
+			  struct drm_i915_error_object *dst)
+{
+	if (dst) {
+		zlib_deflate(zstream, Z_FINISH);
+		dst->unused = zstream->avail_out;
+	}
+
+	zlib_deflateEnd(zstream);
+	kfree(zstream->workspace);
+}
+
+static void err_compression_marker(struct drm_i915_error_state_buf *m)
+{
+	err_puts(m, ":");
+}
+
+#else
+
+static bool compress_init(struct z_stream_s *zstream)
+{
+	return true;
+}
+
+static int compress_page(struct z_stream_s *zstream,
+			 void *src,
+			 struct drm_i915_error_object *dst)
+{
+	unsigned long page;
+
+	page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN);
+	if (!page)
+		return -ENOMEM;
+
+
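The compress_page() scheme above is easy to prototype outside the kernel: one deflate stream per error object, a Z_SYNC_FLUSH after every input page, and the output chopped into page-sized chunks as they fill up. A rough userspace sketch of the same idea, using Python's stdlib zlib binding (the names and the 4 KiB page size are illustrative stand-ins, not the kernel API):

```python
import zlib

PAGE_SIZE = 4096  # stand-in for the kernel's PAGE_SIZE


def compress_object(pages):
    """One deflate stream for the whole object; sync-flush per input page.

    Mirrors the patch's compress_page()/compress_fini() flow: each input
    page is fed through the same stream, and the compressed byte stream
    is split into page-sized output chunks, like dst->pages[].
    """
    z = zlib.compressobj()
    out = bytearray()
    for page in pages:
        out += z.compress(page)
        out += z.flush(zlib.Z_SYNC_FLUSH)   # keep the stream decodable page by page
    out += z.flush(zlib.Z_FINISH)           # compress_fini(): terminate the stream
    return [bytes(out[i:i + PAGE_SIZE]) for i in range(0, len(out), PAGE_SIZE)]


# highly repetitive pages, the common case for GPU objects
pages = [bytes([i & 0xff]) * PAGE_SIZE for i in range(8)]
chunks = compress_object(pages)
restored = zlib.decompress(b"".join(chunks))
assert restored == b"".join(pages)
print(len(chunks), "output page(s) for", len(pages), "input pages")
```

Decompression only needs the concatenated chunks, which is what makes the error-state dump parseable after the fact.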
Re: [Intel-gfx] [PATCH 0/7] drm/i915: Add pipe scaler for Gen9 in atomic path
On Tue, Aug 30, 2016 at 10:30:54AM +0530, Nabendu Maiti wrote:
> Following patch series add pipe scaler functionality in atomic path. The pipe
> scaler can be changed dynamically without modeset. Apart from the default panel
> fitter supported scaling modes (CENTER/ASPECT/FULLSCREEN), a custom scaling
> mode is added as there is no such restriction on the Gen9 pipe scaler.

Some quick observations:

- missing any interaction with drm core, so all generic fb size checks and
  whatnot will not work out, I think
- the way it's done goes against what I've suggested in the past. My idea has
  been to add a "fixed mode" property to connectors instead, which I think
  would minimize the impact on the core, and it would follow closely the way
  eDP/LVDS/DSI already works today.
- There's no need to restrict the feature to gen9+ only. It should work out
  just fine for at least all pch platforms. gmch platforms would be more
  challenging
- the pfiter_recalculate thing looks pretty wrong atomic wise

>
> Nabendu Maiti (7):
>   drm/i915: Add pipe scaler pipe source drm property
>   drm/i915: Add pipe_src size property calculations in atomic path.
>   drm/i915: Panel fitting calculation for GEN9
>   drm/i915: Adding atomic fitting mode property for GEN9
>   drm/i915: Add pipe scaler co-ordinate and size property for Gen9
>   drm/i915: Update pipe-scaler according to destination size
>   drm/i915: Pipescaler destination size limit check on Gen9
>
>  drivers/gpu/drm/drm_atomic.c         |  35 ++
>  drivers/gpu/drm/drm_crtc.c           |  56 +++
>  drivers/gpu/drm/i915/intel_display.c | 128 +--
>  drivers/gpu/drm/i915/intel_drv.h     |   3 +
>  drivers/gpu/drm/i915/intel_panel.c   |  34 +-
>  include/drm/drm_crtc.h               |  14
>  include/uapi/drm/drm_mode.h          |   1 +
>  7 files changed, 263 insertions(+), 8 deletions(-)
>
> --
> 1.9.1
>
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Ville Syrjälä
Intel OTC

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH i-g-t] tests/kms_psr_sink_crc: Assert if we don't have a valid crtc.
In the fixture block for the test we receive a segfault with the following trace:

0 setup_output (data=) at kms_psr_sink_crc.c:115
1 display_init (data=0x7ffd46833cf0) at kms_psr_sink_crc.c:126
2 main (argc=1, argv=) at kms_psr_sink_crc.c:528

And in setup_output(), output.config is:

$2 = {crtc = 0x0, connector = 0x6ad460, encoder = 0x0,
  default_mode = {clock = 361310, hdisplay = 3200, hsync_start = 3248,
    hsync_end = 3280, htotal = 3316, hskew = 0, vdisplay = 1800,
    vsync_start = 1802, vsync_end = 1807, vtotal = 1816, vscan = 0,
    vrefresh = 60, flags = 10, type = 72, name = "3200x1800", '\000' },
  connector_scaling_mode = 0, connector_scaling_mode_changed = false,
  pipe_changed = true, atomic_props_connector = {44, 0}, pipe = -1,
  valid_crtc_idx_mask = 7}

Also, include some warnings in _kmstest_connector_config() in case we fail to
retrieve the connector or the crtc.

Signed-off-by: Marius Vlad
CC: Maarten Lankhorst
CC: Rodrigo Vivi
CC: Daniel Vetter
---
 lib/igt_kms.c            | 10 +-
 tests/kms_psr_sink_crc.c |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/lib/igt_kms.c b/lib/igt_kms.c
index 960ecbd..32cd5cf 100644
--- a/lib/igt_kms.c
+++ b/lib/igt_kms.c
@@ -862,8 +862,10 @@ static bool _kmstest_connector_config(int drm_fd, uint32_t connector_id,
 	else
 		connector = drmModeGetConnectorCurrent(drm_fd, connector_id);
 
-	if (!connector)
+	if (!connector) {
+		igt_warn("drmModeGetConnector(Current) failed\n");
 		goto err2;
+	}
 
 	if (connector->connector_id != connector_id) {
 		igt_warn("connector id doesn't match (%d != %d)\n",
@@ -896,6 +898,12 @@ static bool _kmstest_connector_config(int drm_fd, uint32_t connector_id,
 	config->encoder = _kmstest_connector_config_find_encoder(drm_fd, connector, config->pipe);
 	config->crtc = drmModeGetCrtc(drm_fd, resources->crtcs[config->pipe]);
 
+	if (!config->crtc) {
+		igt_warn("drmModeGetCrtc for pipe %s failed\n",
+			 kmstest_pipe_name(config->pipe));
+		goto err2;
+	}
+
 	if (connector->connection != DRM_MODE_CONNECTED)
 		goto err2;
 
diff --git a/tests/kms_psr_sink_crc.c b/tests/kms_psr_sink_crc.c
index 8aafedb..0890d5b 100644
--- a/tests/kms_psr_sink_crc.c
+++ b/tests/kms_psr_sink_crc.c
@@ -112,6 +112,7 @@ static void setup_output(data_t *data)
 			continue;
 
 		igt_output_set_pipe(output, pipe);
+		igt_assert(output->config.crtc != NULL);
 		data->crtc_id = output->config.crtc->crtc_id;
 		data->output = output;
 		data->mode = igt_output_get_mode(output);
--
2.5.0
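The one-line igt_assert() turns a later NULL dereference into an immediate, attributable failure at the point where the lookup went wrong. The same fail-fast idea, sketched in Python (ConfigError, get_crtc and the dict layout are invented stand-ins for the libdrm calls, not IGT API):

```python
class ConfigError(Exception):
    """Raised when a KMS resource lookup fails during setup."""


def get_crtc(crtcs, pipe):
    # stand-in for drmModeGetCrtc(): may return None on failure
    return crtcs.get(pipe)


def setup_output(crtcs, pipe):
    crtc = get_crtc(crtcs, pipe)
    if crtc is None:
        # analogue of the igt_warn() + goto err2 / igt_assert() additions:
        # fail here with context, not three frames later on crtc->crtc_id
        raise ConfigError("drmModeGetCrtc for pipe %d failed" % pipe)
    return crtc["crtc_id"]


crtcs = {0: {"crtc_id": 42}}
print(setup_output(crtcs, 0))  # prints 42; pipe 1 would raise ConfigError
```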
[Intel-gfx] [PATCH i-g-t] lib/igt_core: Print stacktrace in initialization/fixture blocks.
Likely candidate for this behaviour is the igt_fixture block. Seen in the CI by
running tests/kms_psr_sink_crc which is causing segfaults in the fixture block.
While at it fix some minor printing bugs.

Signed-off-by: Marius Vlad
CC: Chris Wilson
---
 lib/igt_core.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/igt_core.c b/lib/igt_core.c
index dd27a22..43db468 100644
--- a/lib/igt_core.c
+++ b/lib/igt_core.c
@@ -1122,7 +1122,7 @@ static void printnum(unsigned long long num, unsigned base)
 {
 	int i = 0;
-	unsigned long long __num;
+	unsigned long long __num = num;
 
 	/* determine from where we should start dividing */
 	do {
@@ -1264,7 +1264,7 @@ static void print_backtrace_sig_safe(void)
 		unw_word_t off;
 
 		if (unw_get_proc_name(, name, 255, ) < 0)
-			xstrlcpy(name, "<unknown>", 9);
+			xstrlcpy(name, "<unknown>", 10);
 
 		xprintf(" #%d [%s+0x%x]\n",
 			stack_num++, name, (unsigned int) off);
 
@@ -1740,7 +1740,7 @@ static void fatal_sig_handler(int sig)
 		igt_assert_eq(write(STDERR_FILENO, ".\n", 2), 2);
 	}
 
-	if (in_subtest && crash_signal(sig)) {
+	if (crash_signal(sig)) {
 		/* Linux standard to return exit code as 128 + signal */
 		if (!failed_one)
 			igt_exitcode = 128 + sig;
@@ -1749,7 +1749,8 @@ static void fatal_sig_handler(int sig)
 #ifdef HAVE_LIBUNWIND
 		print_backtrace_sig_safe();
 #endif
-		exit_subtest("CRASH");
+		if (in_subtest)
+			exit_subtest("CRASH");
 	}
 	break;
 }
--
2.5.0
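The "128 + signal" comment in the patch refers to the Linux/POSIX shell convention for processes that die by signal: signal N is reported as exit status 128 + N. A quick demonstration of both encodings, sketched in Python on a POSIX system (not IGT code):

```python
import signal
import subprocess
import sys

# spawn a child that kills itself with SIGTERM
proc = subprocess.run(
    [sys.executable, "-c",
     "import os, signal; os.kill(os.getpid(), signal.SIGTERM)"])

# the subprocess module reports death-by-signal as a negative returncode
assert proc.returncode == -signal.SIGTERM

# shells -- and igt's fatal_sig_handler -- re-encode that as 128 + N
shell_status = 128 + (-proc.returncode)
print("shell-style exit status:", shell_status)  # 143 = 128 + SIGTERM(15)
```

This is why a test runner can distinguish "failed assertion" exit codes from "crashed on signal" exit codes without parsing any output.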
Re: [Intel-gfx] Skylake graphics regression: projector failure with 4.8-rc3
On Mon, 19 Sep 2016, James Bottomley wrote:
> On Mon, 2016-09-19 at 08:09 -0700, James Bottomley wrote:
>> On Sun, 2016-09-18 at 13:35 +0200, Thorsten Leemhuis wrote:
>>> Hi! James & Paulo: What's the current status of this?
>>
>> No, the only interaction has been the suggestion below for a revert,
>> which didn't fix the problem.
>>
>>> Was this issue discussed elsewhere or even fixed in between? Just
>>> asking, because this issue is on the list of regressions for 4.8.
>>
>> I'm just about to try out -rc7, but it's not fixed so far.
>
> OK, with -rc7 and the i915 fixes, there seems to be a marked
> improvement. I can no longer crash the crtc by using lots of xrandr
> switches, which was the principal problem.
>
> I've so far only got one of these in the logs
>
> [14858.635035] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR*
> CPU pipe B FIFO underrun
>
> And the only residual problem seems to be that the monitor goes blank
> periodically, but this can be fixed by switching resolution a couple of
> times.
>
> I haven't seen any of the link training errors so far and I've run
> through my usual battery of be-nasty-to-the-external-monitor tests.

Thanks for the update. We still have fixes for the underruns in the pipeline.

BR,
Jani.

--
Jani Nikula, Intel Open Source Technology Center
Re: [Intel-gfx] [i915] monitor is not detected unless it was active during boot
On Mon, 19 Sep 2016, Maarten Maathuis wrote:
> Any advice on how to proceed?

Please file a bug at [1].

BR,
Jani.

[1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel

--
Jani Nikula, Intel Open Source Technology Center