[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [v7,1/6] drm/i915: Fallback to lower link rate and lane count during link training (rev3)

2016-09-20 Thread Patchwork
== Series Details ==

Series: series starting with [v7,1/6] drm/i915: Fallback to lower link rate and 
lane count during link training (rev3)
URL   : https://patchwork.freedesktop.org/series/12534/
State : success

== Summary ==

Series 12534v3 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/12534/revisions/3/mbox/

Test kms_psr_sink_crc:
Subgroup psr_basic:
dmesg-warn -> PASS   (fi-skl-6700hq)

fi-bdw-5557u total:244  pass:229  dwarn:0   dfail:0   fail:0   skip:15 
fi-bsw-n3050 total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
fi-byt-n2820 total:244  pass:208  dwarn:0   dfail:0   fail:1   skip:35 
fi-hsw-4770k total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-hsw-4770r total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-ilk-650   total:244  pass:182  dwarn:0   dfail:0   fail:2   skip:60 
fi-ivb-3520m total:244  pass:219  dwarn:0   dfail:0   fail:0   skip:25 
fi-ivb-3770  total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 
fi-skl-6260u total:244  pass:230  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-6700hq total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-skl-6700k total:244  pass:219  dwarn:1   dfail:0   fail:0   skip:24 
fi-snb-2520m total:244  pass:208  dwarn:0   dfail:0   fail:0   skip:36 
fi-snb-2600  total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 
fi-skl-6770hq failed to collect. IGT log at Patchwork_2563/fi-skl-6770hq/igt.log

Results at /archive/results/CI_IGT_test/Patchwork_2563/

4ca90e7c3b6e429e033b93fc56fc156da8f222ef drm-intel-nightly: 
2016y-09m-20d-12h-43m-32s UTC integration manifest
fbf93bd drm/i915/dp/mst: Add support for upfront link training for DP MST
c35524e drm/i915/dp: Enable Upfront link training on DDI platforms
3e061c1 drm/i915: Code cleanup to use dev_priv and INTEL_GEN
ad24e7c drm/i915: Change the placement of some static functions in intel_dp.c
789b7b3 drm/i915: Remove the link rate and lane count loop in compute config
61d14b7 drm/i915: Fallback to lower link rate and lane count during link 
training



[Intel-gfx] [PATCH v18 5/6] drm/i915/dp: Enable Upfront link training on DDI platforms

2016-09-20 Thread Manasi Navare
To support USB type C alternate DP mode, the display driver needs to
know the number of lanes required by the DP panel as well as the number
of lanes that can be supported by the type-C cable. Sometimes the
type-C cable may limit the bandwidth even if the panel can support
more lanes. To address these scenarios, we need to train the link before
modeset. This upfront link training caches the values of max link rate
and max lane count that get used later during modeset. Upfront link
training does not change any HW state: the link is disabled and the PLL
values are reset to their previous values after upfront link training,
so that the subsequent modeset is not aware of these changes.
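
As a rough sketch of that flow (illustrative only: apart from the
max_lanes_upfront/max_link_rate_upfront fields added by this series, the
helpers named below are placeholders, not the driver's functions):

/* Illustrative sketch of upfront link training, not the patch itself. */
static void upfront_link_train_sketch(struct intel_dp *intel_dp,
                                      int max_rate, u8 max_lanes)
{
        /* Train the link before any modeset; try_link_train() stands in
         * for the real DDI link training path. */
        bool ok = try_link_train(intel_dp, max_rate, max_lanes);

        if (ok) {
                /* Cache the values that a later modeset will honour. */
                intel_dp->max_link_rate_upfront = max_rate;
                intel_dp->max_lanes_upfront = max_lanes;
        }

        /* Tear the link back down and restore the previous PLL config so
         * the subsequent modeset does not see any HW state change;
         * disable_port_and_restore_pll() is also a placeholder. */
        disable_port_and_restore_pll(intel_dp);
}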

This patch is based on prior work done by
Durgadoss R.

Changes since v17:
* Rebased on the latest nightly
Changes since v16:
* Use HAS_DDI macro for enabling this feature (Rodrigo Vivi)
* Fix some unnecessary removals/changes due to rebase (Rodrigo Vivi)

Changes since v15:
* Split this patch into two patches - one with the functional
changes to enable upfront link training and the other moving the
existing functions around so that they can be used for upfront (Jani Nikula)
* Cleaned up the commit message

Signed-off-by: Durgadoss R 
Signed-off-by: Jim Bride 
Signed-off-by: Manasi Navare 
---
 drivers/gpu/drm/i915/intel_ddi.c  |  21 ++-
 drivers/gpu/drm/i915/intel_dp.c   | 190 +-
 drivers/gpu/drm/i915/intel_dp_link_training.c |   1 -
 drivers/gpu/drm/i915/intel_drv.h  |  14 +-
 4 files changed, 218 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 093038c..8e52507 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1676,7 +1676,8 @@ static void intel_ddi_pre_enable_dp(struct intel_encoder 
*encoder,
pll->config.crtc_mask = 0;
 
/* If Link Training fails, send a uevent to generate a hotplug */
-   if (!intel_ddi_link_train(intel_dp, link_rate, lane_count, link_mst))
+   if (!intel_ddi_link_train(intel_dp, link_rate, lane_count, link_mst,
+ false))
drm_kms_helper_hotplug_event(encoder->base.dev);
pll->config = tmp_pll_config;
 }
@@ -2464,7 +2465,7 @@ intel_ddi_get_link_dpll(struct intel_dp *intel_dp, int 
clock)
 
 bool
 intel_ddi_link_train(struct intel_dp *intel_dp, int max_link_rate,
-uint8_t max_lane_count, bool link_mst)
+uint8_t max_lane_count, bool link_mst, bool is_upfront)
 {
struct intel_connector *connector = intel_dp->attached_connector;
struct intel_encoder *encoder = connector->encoder;
@@ -2513,6 +2514,7 @@ intel_ddi_link_train(struct intel_dp *intel_dp, int 
max_link_rate,
pll->funcs.disable(dev_priv, pll);
pll->config = tmp_pll_config;
}
+
if (ret) {
DRM_DEBUG_KMS("Link Training successful at link rate: 
%d lane: %d\n",
  link_rate, lane_count);
@@ -2522,6 +2524,21 @@ intel_ddi_link_train(struct intel_dp *intel_dp, int 
max_link_rate,
 
intel_dp_stop_link_train(intel_dp);
 
+   if (is_upfront) {
+   DRM_DEBUG_KMS("Upfront link train %s: link_clock:%d lanes:%d\n",
+ ret ? "Passed" : "Failed",
+ link_rate, lane_count);
+   /* Disable port followed by PLL for next retry/clean up */
+   intel_ddi_post_disable(encoder, NULL, NULL);
+   pll->funcs.disable(dev_priv, pll);
+   pll->config = tmp_pll_config;
+   if (ret) {
+   /* Save the upfront values */
+   intel_dp->max_lanes_upfront = lane_count;
+   intel_dp->max_link_rate_upfront = link_rate;
+   }
+   }
+
if (!lane_count)
DRM_ERROR("Link Training Failed\n");
 
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 8d9a8ab..a058d5d 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -153,12 +153,21 @@ intel_dp_max_link_bw(struct intel_dp  *intel_dp)
 static u8 intel_dp_max_lane_count(struct intel_dp *intel_dp)
 {
struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
-   u8 source_max, sink_max;
+   u8 temp, source_max, sink_max;
 
source_max = intel_dig_port->max_lanes;
sink_max = drm_dp_max_lane_count(intel_dp->dpcd);
 
-   return min(source_max, sink_max);
+   temp = min(source_max, sink_max);
+
+   /*
+* Limit max lanes w.r.t to the max value found
+* using Upfront link training also.
+*/
+   if (intel_dp->max_lanes_upfront)
+   return min(temp, 

[Intel-gfx] [PATCH v5] drm/i915/dp: DP audio API changes for MST

2016-09-20 Thread Dhinakaran Pandiyan
From: "Pandiyan, Dhinakaran" 

DP MST provides the capability to send multiple video and audio streams
through a single port. This requires the APIs between i915 and the audio
drivers to distinguish between multiple audio-capable displays that can be
connected to a port. Currently only the port identity is shared in the
APIs. This patch adds support for MST with an additional parameter
'int pipe'. The existing parameter 'port' does not change its meaning.

pipe =
MST : display pipe that the stream originates from
Non-MST : -1

Affected APIs:
struct i915_audio_component_ops
-   int (*sync_audio_rate)(struct device *, int port, int rate);
+   int (*sync_audio_rate)(struct device *, int port, int pipe,
+int rate);

-   int (*get_eld)(struct device *, int port, bool *enabled,
-   unsigned char *buf, int max_bytes);
+   int (*get_eld)(struct device *, int port, int pipe,
+  bool *enabled, unsigned char *buf, int max_bytes);

struct i915_audio_component_audio_ops
-   void (*pin_eld_notify)(void *audio_ptr, int port);
+   void (*pin_eld_notify)(void *audio_ptr, int port, int pipe);

This patch makes dummy changes in the audio drivers (thanks Libin) for
the build to succeed. The audio side drivers will send the right 'pipe'
values for MST in patches that will follow.
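
As a hedged illustration of how an audio-side consumer might use the new
argument (the handler, context type and helper below are made up for this
sketch; only the pipe = -1 convention comes from this patch):

/* Sketch of an audio-side notify handler taking the new 'pipe' argument. */
static void pin_eld_notify_sketch(void *audio_ptr, int port, int pipe)
{
        struct my_audio_ctx *ctx = audio_ptr;   /* hypothetical context */

        if (pipe < 0) {
                /* Non-MST: a single display per port, key by port alone. */
                update_eld_for_port(ctx, port, 0);      /* placeholder */
        } else {
                /* MST: several streams share the port; 'pipe' identifies
                 * which stream/display the ELD belongs to. */
                update_eld_for_port(ctx, port, pipe);   /* placeholder */
        }
}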

v2:
Renamed the new API parameter from 'dev_id' to 'pipe'. (Jim, Ville)
Included Asoc driver API compatibility changes from Jeeja.
Added WARN_ON() for invalid pipe in get_saved_encoder(). (Takashi)
Added comment for av_enc_map[] definition. (Takashi)

v3:
Fixed logic error introduced while renaming 'dev_id' as 'pipe' (Ville)
Renamed get_saved_encoder() to get_saved_enc() to reduce line length

v4:
Rebased.
Parameter check for pipe < -1 values in get_saved_enc() (Ville)
Switched to for_each_pipe() in get_saved_enc() (Ville)
Renamed 'pipe' to 'dev_id' in audio side code (Takashi)

v5:
Included a comment for the dev_id arg. (Libin)

Signed-off-by: Dhinakaran Pandiyan 
Reviewed-by: Takashi Iwai 
Reviewed-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/i915_drv.h|  3 +-
 drivers/gpu/drm/i915/intel_audio.c | 94 ++
 include/drm/i915_component.h   |  6 +--
 include/sound/hda_i915.h   | 11 +++--
 sound/hda/hdac_i915.c  | 18 +---
 sound/pci/hda/patch_hdmi.c |  7 +--
 sound/soc/codecs/hdac_hdmi.c   |  2 +-
 7 files changed, 94 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 270543c..fe14cca 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2080,7 +2080,8 @@ struct drm_i915_private {
/* perform PHY state sanity checks? */
bool chv_phy_assert[2];
 
-   struct intel_encoder *dig_port_map[I915_MAX_PORTS];
+   /* Used to save the pipe-to-encoder mapping for audio */
+   struct intel_encoder *av_enc_map[I915_MAX_PIPES];
 
/*
 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
diff --git a/drivers/gpu/drm/i915/intel_audio.c 
b/drivers/gpu/drm/i915/intel_audio.c
index 40fbdd8..9583f43 100644
--- a/drivers/gpu/drm/i915/intel_audio.c
+++ b/drivers/gpu/drm/i915/intel_audio.c
@@ -491,6 +491,7 @@ void intel_audio_codec_enable(struct intel_encoder 
*intel_encoder)
struct drm_i915_private *dev_priv = to_i915(encoder->dev);
struct i915_audio_component *acomp = dev_priv->audio_component;
enum port port = intel_encoder->port;
+   enum pipe pipe = crtc->pipe;
 
connector = drm_select_eld(encoder);
if (!connector)
@@ -515,12 +516,18 @@ void intel_audio_codec_enable(struct intel_encoder 
*intel_encoder)
 
mutex_lock(&dev_priv->av_mutex);
intel_encoder->audio_connector = connector;
+
/* referred in audio callbacks */
-   dev_priv->dig_port_map[port] = intel_encoder;
+   dev_priv->av_enc_map[pipe] = intel_encoder;
mutex_unlock(&dev_priv->av_mutex);
 
+   /* audio drivers expect pipe = -1 to indicate Non-MST cases */
+   if (intel_encoder->type != INTEL_OUTPUT_DP_MST)
+   pipe = -1;
+
if (acomp && acomp->audio_ops && acomp->audio_ops->pin_eld_notify)
-   acomp->audio_ops->pin_eld_notify(acomp->audio_ops->audio_ptr, 
(int) port);
+   acomp->audio_ops->pin_eld_notify(acomp->audio_ops->audio_ptr,
+(int) port, (int) pipe);
 }
 
 /**
@@ -536,17 +543,24 @@ void intel_audio_codec_disable(struct intel_encoder 
*intel_encoder)
struct drm_i915_private *dev_priv = to_i915(encoder->dev);
struct i915_audio_component *acomp = dev_priv->audio_component;
enum port port = intel_encoder->port;
+   struct intel_crtc *crtc = to_intel_crtc(encoder->crtc);
+   enum 

Re: [Intel-gfx] [PATCH v12 5/7] drm/i915/skl: Ensure pipes with changed wms get added to the state

2016-09-20 Thread Mike Lothian
Hi

I've bisected back to this commit in the drm-intel-nightly branch

05a76d3d6ad1ee9f9814f88949cc9305fc165460 is the first bad commit
commit 05a76d3d6ad1ee9f9814f88949cc9305fc165460
Author: Lyude 
Date:   Wed Aug 17 15:55:57 2016 -0400

   drm/i915/skl: Ensure pipes with changed wms get added to the state

   If we're enabling a pipe, we'll need to modify the watermarks on all
   active planes. Since those planes won't be added to the state on
   their own, we need to add them ourselves.

   Signed-off-by: Lyude 
   Reviewed-by: Matt Roper 
   Cc: sta...@vger.kernel.org
   Cc: Ville Syrjälä 
   Cc: Daniel Vetter 
   Cc: Radhakrishna Sripada 
   Cc: Hans de Goede 
   Signed-off-by: Maarten Lankhorst 
   Link: 
http://patchwork.freedesktop.org/patch/msgid/1471463761-26796-6-git-send-email-cp...@redhat.com

The symptoms I'm seeing look like tearing at the top of the screen and
it's especially noticeable in Chrome - reverting this commit makes the
issue go away

Let me know if you'd like me to raise a bug

Cheers

Mike

(Re-sending from Gmail - as Inbox doesn't let me send as plain text)

On 17 August 2016 at 20:55, Lyude  wrote:
> If we're enabling a pipe, we'll need to modify the watermarks on all
> active planes. Since those planes won't be added to the state on
> their own, we need to add them ourselves.
>
> Signed-off-by: Lyude 
> Reviewed-by: Matt Roper 
> Cc: sta...@vger.kernel.org
> Cc: Ville Syrjälä 
> Cc: Daniel Vetter 
> Cc: Radhakrishna Sripada 
> Cc: Hans de Goede 
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 849f039..a3d24cb 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4117,6 +4117,10 @@ skl_compute_ddb(struct drm_atomic_state *state)
> ret = skl_allocate_pipe_ddb(cstate, ddb);
> if (ret)
> return ret;
> +
> +   ret = drm_atomic_add_affected_planes(state, 
> _crtc->base);
> +   if (ret)
> +   return ret;
> }
>
> return 0;
> --
> 2.7.4
>




Re: [Intel-gfx] [PATCH] drm/i915/dp/mst: Validate modes against the available link bandwidth

2016-09-20 Thread Srivatsa, Anusha
Hi All,

Looking forward to more review comments and concerns, if any, regarding
this patch.

Regards,
Anusha

>-Original Message-
>From: Jim Bride [mailto:jim.br...@linux.intel.com]
>Sent: Tuesday, September 6, 2016 11:57 AM
>To: Srivatsa, Anusha 
>Cc: intel-gfx@lists.freedesktop.org
>Subject: Re: [Intel-gfx] [PATCH] drm/i915/dp/mst: Validate modes against the
>available link bandwidth
>
>On Tue, Sep 06, 2016 at 06:39:12PM +, Srivatsa, Anusha wrote:
>> Sending to the list again since Ville's review comment  didn't hit the 
>> mailing list
>last time.
>>
>> Regards,
>> Anusha
>>
>> -Original Message-
>> From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
>> Sent: Monday, September 5, 2016 4:39 AM
>> To: Srivatsa, Anusha 
>> Cc: intel-gfx@lists.freedesktop.org
>> Subject: Re: [Intel-gfx] [PATCH] drm/i915/dp/mst: Validate modes
>> against the available link bandwidth
>>
>> On Thu, Aug 18, 2016 at 05:11:31PM -0700, Anusha Srivatsa wrote:
>> > Change intel_dp_mst_mode_valid() to use available link bandwidth
>> > rather than the link's maximum supported bandwidth to evaluate
>> > whether modes are legal for the current configuration. This takes
>> > into account the fact that link bandwidth may already be dedicated
>> > to other virtual channels.
>>
>> We can't do this. Results of mode_valid() aren't supposed to change like this
>depending on what else is happening in the system. The only thing mode_valid()
>tells us is whether it's possible to use the mode under *some* circumstances.
>Only if the mode can't be used under *any* circumstances should it be filtered
>out.
>>
>> If we wanted to change this, we'd at the very least have to synthesize 
>> hotplug
>uevents whenever the conditions changed. But doing that would be a fairly big
>behavioural change, so I'm not sure how people feel about it. It also doesn't
>really solve the problem since eg. atomic can go directly from 0->n active 
>pipes,
>so there's no way to know upfront which modes we can pick for each pipe.
>>
>
>I won't dispute that this won't help for all cases, but it does make hotplugs, 
>at a
>minimum, more sane.  For that reason alone, I'd like to see this patch land.
>Longer term I think we should look at how we can make user space and atomic
>better handle MST (IMHO in multi-modeset operations on MST atomic should
>ensure that the aggregate dotclock of the modes being set are less than the 
>link
>bandwidth that's configured.)  I think that user space also needs to be more 
>MST
>aware and update what's available based on the current configuration at any
>given time.  I'm not sure what the precise solution should look like here, but 
>I'd
>think that what we want is for user space to fetch the list of available 
>resolutions
>based on the current configuration any time we plug or unplug a device on a DP
>MST topology.  In an ideal world it would be nice to have some sort of
>information about the total link bandwidth, and how much bandwidth each mode
>is taking up so that the user could have a guide to tweaking their setup in 
>such a
>way to maximize how they choose to configure their monitors based on the
>available link bandwidth.
>
>Jim
>
>
>> >
>> > Signed-off-by: Anusha Srivatsa 
>> > ---
>> >  drivers/gpu/drm/i915/intel_dp_mst.c | 12 +---
>> >  1 file changed, 9 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c
>> > b/drivers/gpu/drm/i915/intel_dp_mst.c
>> > index 629337d..39c58eb 100644
>> > --- a/drivers/gpu/drm/i915/intel_dp_mst.c
>> > +++ b/drivers/gpu/drm/i915/intel_dp_mst.c
>> > @@ -352,16 +352,22 @@ static enum drm_mode_status
>> > intel_dp_mst_mode_valid(struct drm_connector *connector,
>> >struct drm_display_mode *mode)
>> >  {
>> > -  int max_dotclk = to_i915(connector->dev)->max_dotclk_freq;
>> > +  int req_pbn = 0;
>> > +  int slots = 0;
>> > +  struct intel_connector *intel_connector =
>to_intel_connector(connector);
>> > +  struct intel_dp *intel_dp = intel_connector->mst_port;
>> > +  struct drm_dp_mst_topology_mgr *mgr = &intel_dp->mst_mgr;
>> > +
>> > +  req_pbn = drm_dp_calc_pbn_mode(mode->clock, 24);
>> > +  slots = drm_dp_find_vcpi_slots(mgr, req_pbn);
>> >
>> > -  /* TODO - validate mode against available PBN for link */
>> >if (mode->clock < 1)
>> >return MODE_CLOCK_LOW;
>> >
>> >if (mode->flags & DRM_MODE_FLAG_DBLCLK)
>> >return MODE_H_ILLEGAL;
>> >
>> > -  if (mode->clock > max_dotclk)
>> > +  if (slots == -ENOSPC)
>> >return MODE_CLOCK_HIGH;
>> >
>> >return MODE_OK;
>> > --
>> > 2.7.4
>> >
>>
>> --
>> Ville Syrjälä
>> Intel OTC

[Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [v2,RESEND,1/2] drm/i915: Cleanup instdone collection

2016-09-20 Thread Patchwork
== Series Details ==

Series: series starting with [v2,RESEND,1/2] drm/i915: Cleanup instdone 
collection
URL   : https://patchwork.freedesktop.org/series/12714/
State : warning

== Summary ==

Series 12714v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/12714/revisions/1/mbox/

Test drv_module_reload_basic:
pass   -> SKIP   (fi-skl-6770hq)
Test kms_pipe_crc_basic:
Subgroup hang-read-crc-pipe-a:
pass   -> SKIP   (fi-hsw-4770r)
Test kms_psr_sink_crc:
Subgroup psr_basic:
dmesg-warn -> PASS   (fi-skl-6700hq)

fi-bdw-5557u total:244  pass:229  dwarn:0   dfail:0   fail:0   skip:15 
fi-bsw-n3050 total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
fi-hsw-4770k total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-hsw-4770r total:244  pass:221  dwarn:0   dfail:0   fail:0   skip:23 
fi-ilk-650   total:244  pass:182  dwarn:0   dfail:0   fail:2   skip:60 
fi-ivb-3520m total:244  pass:219  dwarn:0   dfail:0   fail:0   skip:25 
fi-ivb-3770  total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 
fi-skl-6260u total:244  pass:230  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-6700hq total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-skl-6700k total:244  pass:219  dwarn:1   dfail:0   fail:0   skip:24 
fi-skl-6770hq total:244  pass:227  dwarn:1   dfail:0   fail:1   skip:15 
fi-snb-2520m total:244  pass:208  dwarn:0   dfail:0   fail:0   skip:36 
fi-snb-2600  total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 

Results at /archive/results/CI_IGT_test/Patchwork_2561/

4ca90e7c3b6e429e033b93fc56fc156da8f222ef drm-intel-nightly: 
2016y-09m-20d-12h-43m-32s UTC integration manifest
4721d61 drm/i915: Try to print INSTDONE bits for all slice/subslice
6240439 drm/i915: Cleanup instdone collection



Re: [Intel-gfx] [PATCH 20/38] drm/i915: Implement pwrite without struct-mutex

2016-09-20 Thread kbuild test robot
Hi Chris,

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20160920]
[cannot apply to v4.8-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]
[Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for 
convenience) to record what (public, well-known) commit your patch series was 
built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:
https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Allow-disabling-error-capture/20160920-164839
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/drm_modes.c:693: warning: No description found for parameter 
'bus_flags'
   drivers/gpu/drm/drm_plane_helper.c:248: warning: No description found for 
parameter 'dst'
   drivers/gpu/drm/drm_plane_helper.c:248: warning: Excess function parameter 
'dest' description in 'drm_plane_helper_check_update'
   drivers/gpu/drm/drm_plane_helper.c:247: warning: No description found for 
parameter 'dst'
   drivers/gpu/drm/drm_plane_helper.c:247: warning: Excess function parameter 
'dest' description in 'drm_plane_helper_check_update'
>> drivers/gpu/drm/i915/i915_gem.c:1162: warning: Excess function parameter 
>> 'i915' description in 'i915_gem_gtt_pwrite_fast'
>> drivers/gpu/drm/i915/i915_gem.c:1162: warning: Excess function parameter 
>> 'file' description in 'i915_gem_gtt_pwrite_fast'
   drivers/gpu/drm/i915/i915_gem_fence.c:644: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:674: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:643: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:673: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:643: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:673: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/drm_crtc.c:1270: WARNING: Inline literal start-string 
without end-string.
   drivers/gpu/drm/drm_crtc.c:1385: WARNING: Inline literal start-string 
without end-string.
   include/drm/drm_crtc.h:1202: WARNING: Inline literal start-string without 
end-string.
   include/drm/drm_crtc.h:1255: WARNING: Inline literal start-string without 
end-string.
   include/drm/drm_crtc.h:1268: WARNING: Inline literal start-string without 
end-string.
   include/drm/drm_crtc.h:1272: WARNING: Inline literal start-string without 
end-string.
   drivers/gpu/drm/drm_irq.c:718: WARNING: Option list ends without a blank 
line; unexpected unindent.
   drivers/gpu/drm/drm_fb_helper.c:2195: WARNING: Inline emphasis start-string 
without end-string.
   drivers/gpu/drm/drm_simple_kms_helper.c:141: WARNING: Inline literal 
start-string without end-string.
   include/drm/drm_gem.h:212: WARNING: Inline emphasis start-string without 
end-string.
   drivers/gpu/drm/i915/i915_vgpu.c:176: WARNING: Literal block ends without a 
blank line; unexpected unindent.
   drivers/gpu/drm/i915/intel_audio.c:54: WARNING: Inline emphasis start-string 
without end-string.
   drivers/gpu/drm/i915/intel_audio.c:54: WARNING: Inline emphasis start-string 
without end-string.
   drivers/gpu/drm/i915/intel_guc_fwif.h:159: WARNING: Block quote ends without 
a blank line; unexpected unindent.
   drivers/gpu/drm/i915/intel_guc_fwif.h:178: WARNING: Enumerated list ends 
without a blank line; unexpected unindent.
   Documentation/gpu/drm-kms.rst:13: WARNING: Could not lex literal_block as 
"C". Highlighting skipped.
   Documentation/gpu/drm-kms-helpers.rst:16: WARNING: Could not lex 
literal_block as "C". Highlighting skipped.
   Documentation/gpu/i915.rst:57: WARNING: Could not lex literal_block as "C". 
Highlighting skipped.
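
For context, these come from kernel-doc: every parameter in a function's
prototype needs a matching @name line in the comment above it, and every
@name line must refer to a real parameter, so a comment that still
documents arguments a rework dropped produces the "Excess function
parameter" warnings above. A generic example of the rule (hypothetical
function and type, not the i915 code):

/**
 * copy_into_object - copy a buffer into a destination object
 * @obj: destination object
 * @buf: source buffer
 * @len: number of bytes to copy
 *
 * Each @name above matches a parameter below; an extra or missing @name
 * line is exactly what "make htmldocs" warns about.
 */
static int copy_into_object(struct my_object *obj, const void *buf, size_t len)
{
        /* ... */
        return 0;
}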

vim +1162 drivers/gpu/drm/i915/i915_gem.c

9b5ec232 Chris Wilson   2016-09-20  1146}
19709047 Chris Wilson   2016-09-20  1147  
19709047 Chris Wilson   2016-09-20  1148return unwritten;
19709047 Chris Wilson   2016-09-20  1149  }
19709047 Chris Wilson   2016-09-20  1150  
3de09aa3 Eric Anholt2009-03-09  1151  /**
3de09aa3 Eric Anholt2009-03-09  1152   * This is the fast pwrite path, 
where we copy the data directly from the
3de09aa3 Eric Anholt2009-03-09  1153   * user into the GTT, uncached.
62f90b38 Daniel Vetter  2016-07-15  1154   * @i915: i915 device private data
14bb2c11 Tvrtko Ursulin 2016-06-03  1155   * @obj: i915 gem object
14bb2c11 Tvrtko Ursulin 2016-06-03  1156   * @args: pwrite arguments 
structure
14bb2c11 Tvrtko Ursulin 2016-06-03  1157   * @file: drm file pointer
3de09aa3 Eric Anholt2009-03-09  1158   */
673a394b Eric Anholt2008-07-30  1159  static int
9b5ec232 Chris Wilson   2016-09

[Intel-gfx] [PATCH v2 RESEND 1/2] drm/i915: Cleanup instdone collection

2016-09-20 Thread Imre Deak
From: Ben Widawsky 

Consolidate the instdone logic so we can get a bit fancier. This patch also
removes the duplicated print of INSTDONE[0].
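
For orientation, the accesses in the diff below (instdone->instdone,
->slice_common, ->sampler, ->row) imply a consolidated struct roughly like
the following sketch; the real definition lives in intel_ringbuffer.h and
is not part of this excerpt:

/* Rough sketch of the consolidated INSTDONE state used below. */
struct intel_instdone {
        u32 instdone;           /* always present */
        u32 slice_common;       /* SC_INSTDONE, gen4+ */
        u32 sampler;            /* SAMPLER_INSTDONE, gen7+ */
        u32 row;                /* ROW_INSTDONE, gen7+ */
};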

v2: (Imre)
- Rebased on top of hangcheck INSTDONE changes.
- Move all INSTDONE registers into a single struct, store it within the
  engine error struct during error capturing.

Signed-off-by: Ben Widawsky 
Signed-off-by: Imre Deak 
Reviewed-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 41 +++-
 drivers/gpu/drm/i915/i915_drv.h |  8 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c   | 84 +++--
 drivers/gpu/drm/i915/i915_irq.c | 69 ++-
 drivers/gpu/drm/i915/i915_reg.h |  1 -
 drivers/gpu/drm/i915/intel_ringbuffer.h | 10 +++-
 6 files changed, 151 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index a6d174a..ba155c0 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1277,15 +1277,36 @@ out:
return ret;
 }
 
+static void i915_instdone_info(struct drm_i915_private *dev_priv,
+  struct seq_file *m,
+  struct intel_instdone *instdone)
+{
+   seq_printf(m, "\t\tINSTDONE: 0x%08x\n",
+  instdone->instdone);
+
+   if (INTEL_GEN(dev_priv) <= 3)
+   return;
+
+   seq_printf(m, "\t\tSC_INSTDONE: 0x%08x\n",
+  instdone->slice_common);
+
+   if (INTEL_GEN(dev_priv) <= 6)
+   return;
+
+   seq_printf(m, "\t\tSAMPLER_INSTDONE: 0x%08x\n",
+  instdone->sampler);
+   seq_printf(m, "\t\tROW_INSTDONE: 0x%08x\n",
+  instdone->row);
+}
+
 static int i915_hangcheck_info(struct seq_file *m, void *unused)
 {
struct drm_i915_private *dev_priv = node_to_i915(m->private);
struct intel_engine_cs *engine;
u64 acthd[I915_NUM_ENGINES];
u32 seqno[I915_NUM_ENGINES];
-   u32 instdone[I915_NUM_INSTDONE_REG];
+   struct intel_instdone instdone;
enum intel_engine_id id;
-   int j;
 
if (test_bit(I915_WEDGED, &dev_priv->gpu_error.flags))
seq_printf(m, "Wedged\n");
@@ -1308,7 +1329,7 @@ static int i915_hangcheck_info(struct seq_file *m, void 
*unused)
seqno[id] = intel_engine_get_seqno(engine);
}
 
-   i915_get_extra_instdone(dev_priv, instdone);
+   i915_get_engine_instdone(dev_priv, RCS, &instdone);
 
intel_runtime_pm_put(dev_priv);
 
@@ -1336,18 +1357,14 @@ static int i915_hangcheck_info(struct seq_file *m, void 
*unused)
seq_printf(m, "\taction = %d\n", engine->hangcheck.action);
 
if (engine->id == RCS) {
-   seq_puts(m, "\tinstdone read =");
+   seq_puts(m, "\tinstdone read =\n");
 
-   for (j = 0; j < I915_NUM_INSTDONE_REG; j++)
-   seq_printf(m, " 0x%08x", instdone[j]);
+   i915_instdone_info(dev_priv, m, &instdone);
 
-   seq_puts(m, "\n\tinstdone accu =");
+   seq_puts(m, "\tinstdone accu =\n");
 
-   for (j = 0; j < I915_NUM_INSTDONE_REG; j++)
-   seq_printf(m, " 0x%08x",
-  engine->hangcheck.instdone[j]);
-
-   seq_puts(m, "\n");
+   i915_instdone_info(dev_priv, m,
+  &engine->hangcheck.instdone);
}
}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4dd307e..3568995 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -759,7 +759,7 @@ struct drm_i915_error_state {
u32 gam_ecochk;
u32 gab_ctl;
u32 gfx_mode;
-   u32 extra_instdone[I915_NUM_INSTDONE_REG];
+
u64 fence[I915_MAX_NUM_FENCES];
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
@@ -791,7 +791,6 @@ struct drm_i915_error_state {
u32 hws;
u32 ipeir;
u32 ipehr;
-   u32 instdone;
u32 bbstate;
u32 instpm;
u32 instps;
@@ -802,6 +801,7 @@ struct drm_i915_error_state {
u64 faddr;
u32 rc_psmi; /* sleep state */
u32 semaphore_mboxes[I915_NUM_ENGINES - 1];
+   struct intel_instdone instdone;
 
struct drm_i915_error_object {
int page_count;
@@ -3542,7 +3542,9 @@ void i915_error_state_get(struct drm_device *dev,
 void i915_error_state_put(struct i915_error_state_file_priv *error_priv);
 void i915_destroy_error_state(struct drm_device *dev);
 
-void 

[Intel-gfx] [PATCH v2 RESEND 2/2] drm/i915: Try to print INSTDONE bits for all slice/subslice

2016-09-20 Thread Imre Deak
From: Ben Widawsky 

v2: (Imre)
- Access only subslices that are known to exist.
- Reset explictly the MCR selector to slice/sub-slice ID 0 after the
  readout.
- Use the subslice INSTDONE bits for the hangcheck/subunits-stuck
  detection too.
- Take the uncore lock for the MCR-select/subslice-readout sequence.

Signed-off-by: Ben Widawsky 
Signed-off-by: Imre Deak 
Reviewed-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 14 --
 drivers/gpu/drm/i915/i915_gpu_error.c   | 76 ++---
 drivers/gpu/drm/i915/i915_irq.c | 25 ---
 drivers/gpu/drm/i915/i915_reg.h |  5 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 23 +-
 5 files changed, 125 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index ba155c0..d3f83c5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1281,6 +1281,9 @@ static void i915_instdone_info(struct drm_i915_private 
*dev_priv,
   struct seq_file *m,
   struct intel_instdone *instdone)
 {
+   int slice;
+   int subslice;
+
seq_printf(m, "\t\tINSTDONE: 0x%08x\n",
   instdone->instdone);
 
@@ -1293,10 +1296,13 @@ static void i915_instdone_info(struct drm_i915_private 
*dev_priv,
if (INTEL_GEN(dev_priv) <= 6)
return;
 
-   seq_printf(m, "\t\tSAMPLER_INSTDONE: 0x%08x\n",
-  instdone->sampler);
-   seq_printf(m, "\t\tROW_INSTDONE: 0x%08x\n",
-  instdone->row);
+   for_each_instdone_slice_subslice(dev_priv, slice, subslice)
+   seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
+  slice, subslice, instdone->sampler[slice][subslice]);
+
+   for_each_instdone_slice_subslice(dev_priv, slice, subslice)
+   seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
+  slice, subslice, instdone->row[slice][subslice]);
 }
 
 static int i915_hangcheck_info(struct seq_file *m, void *unused)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 6a2775a..2bbab22 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -231,6 +231,9 @@ static const char *hangcheck_action_to_str(enum 
intel_engine_hangcheck_action a)
 static void error_print_instdone(struct drm_i915_error_state_buf *m,
 struct drm_i915_error_engine *ee)
 {
+   int slice;
+   int subslice;
+
err_printf(m, "  INSTDONE: 0x%08x\n",
   ee->instdone.instdone);
 
@@ -243,10 +246,15 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
if (INTEL_GEN(m->i915) <= 6)
return;
 
-   err_printf(m, "  SAMPLER_INSTDONE: 0x%08x\n",
-  ee->instdone.sampler);
-   err_printf(m, "  ROW_INSTDONE: 0x%08x\n",
-  ee->instdone.row);
+   for_each_instdone_slice_subslice(m->i915, slice, subslice)
+   err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
+  slice, subslice,
+  ee->instdone.sampler[slice][subslice]);
+
+   for_each_instdone_slice_subslice(m->i915, slice, subslice)
+   err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
+  slice, subslice,
+  ee->instdone.row[slice][subslice]);
 }
 
 static void error_print_engine(struct drm_i915_error_state_buf *m,
@@ -1549,12 +1557,52 @@ const char *i915_cache_level_str(struct 
drm_i915_private *i915, int type)
}
 }
 
+static inline uint32_t
+read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
+ int subslice, i915_reg_t reg)
+{
+   uint32_t mcr;
+   uint32_t ret;
+   enum forcewake_domains fw_domains;
+
+   fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg,
+   FW_REG_READ);
+   fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
+GEN8_MCR_SELECTOR,
+FW_REG_READ | 
FW_REG_WRITE);
+
+   spin_lock_irq(&dev_priv->uncore.lock);
+   intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
+
+   mcr = I915_READ_FW(GEN8_MCR_SELECTOR);
+   /*
+* The HW expects the slice and subslice selectors to be reset to 0
+* after reading out the registers.
+*/
+   WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK));
+   mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+   mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+   I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
+
+   ret = 

Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Cleanup instdone collection

2016-09-20 Thread Imre Deak
On Fri, 2016-09-16 at 12:56 +0300, Jani Nikula wrote:
> On Tue, 06 Sep 2016, Mika Kuoppala 
> wrote:
> > Imre Deak  writes:
> > 
> > > From: Ben Widawsky 
> > > 
> > > Consolidate the instdone logic so we can get a bit fancier. This
> > > patch also
> > > removes the duplicated print of INSTDONE[0].
> > > 
> > > v2: (Imre)
> > > - Rebased on top of hangcheck INSTDONE changes.
> > > - Move all INSTDONE registers into a single struct, store it
> > > within the
> > >   engine error struct during error capturing.
> > > 
> > > Signed-off-by: Ben Widawsky 
> > > Signed-off-by: Imre Deak 
> > 
> > Reviewed-by: Mika Kuoppala 
> 
> I didn't receive the original messages, and I can't find them on the
> list either.

Can't find it in the list archive either, not sure where it got stuck.
I'll resend it.

--Imre

> 
> BR,
> Jani.
> 
> 
> > 
> > > ---
> > >  drivers/gpu/drm/i915/i915_debugfs.c | 41 +++-
> > >  drivers/gpu/drm/i915/i915_drv.h |  8 ++--
> > >  drivers/gpu/drm/i915/i915_gpu_error.c   | 84
> > > +++--
> > >  drivers/gpu/drm/i915/i915_irq.c | 69 ++-
> > > 
> > >  drivers/gpu/drm/i915/i915_reg.h |  1 -
> > >  drivers/gpu/drm/i915/intel_ringbuffer.h | 10 +++-
> > >  6 files changed, 151 insertions(+), 62 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > > b/drivers/gpu/drm/i915/i915_debugfs.c
> > > index 3fde507..45244f9 100644
> > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > @@ -1277,15 +1277,36 @@ out:
> > >   return ret;
> > >  }
> > >  
> > > +static void i915_instdone_info(struct drm_i915_private
> > > *dev_priv,
> > > +    struct seq_file *m,
> > > +    struct intel_instdone *instdone)
> > > +{
> > > + seq_printf(m, "\t\tINSTDONE: 0x%08x\n",
> > > +    instdone->instdone);
> > > +
> > > + if (INTEL_GEN(dev_priv) <= 3)
> > > + return;
> > > +
> > > + seq_printf(m, "\t\tSC_INSTDONE: 0x%08x\n",
> > > +    instdone->slice_common);
> > > +
> > > + if (INTEL_GEN(dev_priv) <= 6)
> > > + return;
> > > +
> > > + seq_printf(m, "\t\tSAMPLER_INSTDONE: 0x%08x\n",
> > > +    instdone->sampler);
> > > + seq_printf(m, "\t\tROW_INSTDONE: 0x%08x\n",
> > > +    instdone->row);
> > > +}
> > > +
> > >  static int i915_hangcheck_info(struct seq_file *m, void *unused)
> > >  {
> > >   struct drm_i915_private *dev_priv = node_to_i915(m-
> > > >private);
> > >   struct intel_engine_cs *engine;
> > >   u64 acthd[I915_NUM_ENGINES];
> > >   u32 seqno[I915_NUM_ENGINES];
> > > - u32 instdone[I915_NUM_INSTDONE_REG];
> > > + struct intel_instdone instdone;
> > >   enum intel_engine_id id;
> > > - int j;
> > >  
> > >   if (!i915.enable_hangcheck) {
> > >   seq_printf(m, "Hangcheck disabled\n");
> > > @@ -1299,7 +1320,7 @@ static int i915_hangcheck_info(struct
> > > seq_file *m, void *unused)
> > >   seqno[id] = intel_engine_get_seqno(engine);
> > >   }
> > >  
> > > - i915_get_extra_instdone(dev_priv, instdone);
> + i915_get_engine_instdone(dev_priv, RCS, &instdone);
> > >  
> > >   intel_runtime_pm_put(dev_priv);
> > >  
> > > @@ -1327,18 +1348,14 @@ static int i915_hangcheck_info(struct
> > > seq_file *m, void *unused)
> > >   seq_printf(m, "\taction = %d\n", engine-
> > > >hangcheck.action);
> > >  
> > >   if (engine->id == RCS) {
> > > - seq_puts(m, "\tinstdone read =");
> > > + seq_puts(m, "\tinstdone read =\n");
> > >  
> > > - for (j = 0; j < I915_NUM_INSTDONE_REG;
> > > j++)
> > > - seq_printf(m, " 0x%08x",
> > > instdone[j]);
> > > + i915_instdone_info(dev_priv, m, &instdone);
> > >  
> > > - seq_puts(m, "\n\tinstdone accu =");
> > > + seq_puts(m, "\tinstdone accu =\n");
> > >  
> > > - for (j = 0; j < I915_NUM_INSTDONE_REG;
> > > j++)
> > > - seq_printf(m, " 0x%08x",
> > > -    engine-
> > > >hangcheck.instdone[j]);
> > > -
> > > - seq_puts(m, "\n");
> > > + i915_instdone_info(dev_priv, m,
> > > +    &engine->hangcheck.instdone);
> > >   }
> > >   }
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index 757b1d1..20bf72e 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -747,7 +747,7 @@ struct drm_i915_error_state {
> > >   u32 gam_ecochk;
> > >   u32 gab_ctl;
> > >   u32 gfx_mode;
> > > - u32 extra_instdone[I915_NUM_INSTDONE_REG];
> > > +
> > >   u64 fence[I915_MAX_NUM_FENCES];
> > >   struct intel_overlay_error_state *overlay;
> > >   

Re: [Intel-gfx] Kernel stability on baytrail machines

2016-09-20 Thread Michal Feix

Hi, I think there might be another clue on this one.

One of the comments also mentions an unfixed erratum of certain
Baytrail processors, named "EOI Transaction May Not be Sent if
Software Enters Core C6 During an Interrupt Service Routine". This
erratum can be found on several different processors, even on several
non-Baytrails, like the Intel Xeon 3400 series.


I also came across a patch that was created for SUSE and that seems to
be addressing this issue in pre-4.x kernels:


https://build.opensuse.org/package/view_file?file=22160-Intel-C6-EOI.patch=xen=home%3Acharlesa%3AopenSUSE11.3=7

---
Michal


On 18.7.2016 15:30, One Thousand Gnomes wrote:

On Tue, 12 Jul 2016 16:41:58 -0300
Ezequiel Garcia  wrote:


Hi Alan,

(Adding interested people to this thread)

On 09 Apr 08:14 PM, One Thousand Gnomes wrote:
> > > I do feel that the importance of the mentioned bug is currently
> > > underestimated. Can anyone here give a note, how much current linux
> > > kernel is supposed to be stable on general baytrail machines?
> >
> > If you did not get any replies... you might want to check MAINTAINERS file, 
and
> > put Intel x86 maintainers on Cc list.
> >
> > I'm sure someone cares :-).
>
> Yes we care, and there are people looking at the various reports.
>

Are there any updates on the status of this issue?

The current bugzilla report [1] marks this as a power management
issue. However, many reports indicate that it would only freeze
when running X, so it's not completely clear if it's related to
the gfx driver too.


There are two things we are currently tracking. One of them was merged,
which seems to have made my machine stable at least, and fixes a problem
related to the MMC. The second one we may need is a power-related change
to SPI to hold the CPU in C0/C1 whenever the ACPI _SEM is held.

Graphics shows these problems up because of the way the GPU causes power
state changes.

Alan



Re: [Intel-gfx] [PATCH v2 2/2] drm/i915: Try to print INSTDONE bits for all slice/subslice

2016-09-20 Thread Mika Kuoppala
Imre Deak  writes:

> From: Ben Widawsky 
>
> v2: (Imre)
> - Access only subslices that are known to exist.
> - Reset explictly the MCR selector to slice/sub-slice ID 0 after the
>   readout.
> - Use the subslice INSTDONE bits for the hangcheck/subunits-stuck
>   detection too.
> - Take the uncore lock for the MCR-select/subslice-readout sequence.
>
> Signed-off-by: Ben Widawsky 
> Signed-off-by: Imre Deak 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 14 --
>  drivers/gpu/drm/i915/i915_gpu_error.c   | 76 
> ++---
>  drivers/gpu/drm/i915/i915_irq.c | 25 ---
>  drivers/gpu/drm/i915/i915_reg.h |  5 +++
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 23 +-
>  5 files changed, 125 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 45244f9..0f84165 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1281,6 +1281,9 @@ static void i915_instdone_info(struct drm_i915_private 
> *dev_priv,
>  struct seq_file *m,
>  struct intel_instdone *instdone)
>  {
> + int slice;
> + int subslice;
> +
>   seq_printf(m, "\t\tINSTDONE: 0x%08x\n",
>  instdone->instdone);
>  
> @@ -1293,10 +1296,13 @@ static void i915_instdone_info(struct 
> drm_i915_private *dev_priv,
>   if (INTEL_GEN(dev_priv) <= 6)
>   return;
>  
> - seq_printf(m, "\t\tSAMPLER_INSTDONE: 0x%08x\n",
> -instdone->sampler);
> - seq_printf(m, "\t\tROW_INSTDONE: 0x%08x\n",
> -instdone->row);
> + for_each_instdone_slice_subslice(dev_priv, slice, subslice)
> + seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
> +slice, subslice, instdone->sampler[slice][subslice]);
> +
> + for_each_instdone_slice_subslice(dev_priv, slice, subslice)
> + seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
> +slice, subslice, instdone->row[slice][subslice]);
>  }
>  
>  static int i915_hangcheck_info(struct seq_file *m, void *unused)
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
> b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 80fe101..06d4309 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -231,6 +231,9 @@ static const char *hangcheck_action_to_str(enum 
> intel_engine_hangcheck_action a)
>  static void error_print_instdone(struct drm_i915_error_state_buf *m,
>struct drm_i915_error_engine *ee)
>  {
> + int slice;
> + int subslice;
> +
>   err_printf(m, "  INSTDONE: 0x%08x\n",
>  ee->instdone.instdone);
>  
> @@ -243,10 +246,15 @@ static void error_print_instdone(struct 
> drm_i915_error_state_buf *m,
>   if (INTEL_GEN(m->i915) <= 6)
>   return;
>  
> - err_printf(m, "  SAMPLER_INSTDONE: 0x%08x\n",
> -ee->instdone.sampler);
> - err_printf(m, "  ROW_INSTDONE: 0x%08x\n",
> -ee->instdone.row);
> + for_each_instdone_slice_subslice(m->i915, slice, subslice)
> + err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
> +slice, subslice,
> +ee->instdone.sampler[slice][subslice]);
> +
> + for_each_instdone_slice_subslice(m->i915, slice, subslice)
> + err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
> +slice, subslice,
> +ee->instdone.row[slice][subslice]);
>  }
>  
>  static void error_print_engine(struct drm_i915_error_state_buf *m,
> @@ -1534,12 +1542,52 @@ const char *i915_cache_level_str(struct 
> drm_i915_private *i915, int type)
>   }
>  }
>  
> +static inline uint32_t
> +read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
> +   int subslice, i915_reg_t reg)
> +{
> + uint32_t mcr;
> + uint32_t ret;
> + enum forcewake_domains fw_domains;
> +
> + fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg,
> + FW_REG_READ);
> + fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
> +  GEN8_MCR_SELECTOR,
> +  FW_REG_READ | 
> FW_REG_WRITE);
> +
> + spin_lock_irq(&dev_priv->uncore.lock);
> + intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
> +
> + mcr = I915_READ_FW(GEN8_MCR_SELECTOR);
> + /*
> +  * The HW expects the slice and subslice selectors to be reset to 0
> +  * after reading out the registers.
> +  */
> + WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK));
> + mcr &= 

Re: [Intel-gfx] [PATCH v3] drm/i915: Queue page flip work via a low latency, unbound workqueue

2016-09-20 Thread Maarten Lankhorst
Op 20-09-16 om 14:51 schreef Chris Wilson:
> On Tue, Sep 20, 2016 at 02:58:19PM +0300, Imre Deak wrote:
>> While user space has control over the scheduling priority of its page
>> flipping thread, the corresponding work the driver schedules for MMIO
>> flips always runs from the generic system workqueue which has some
>> scheduling overhead due it being CPU bound. This would hinder an
>> application that wants more stringent guarantees over flip timing (to
>> avoid missing a flip at the next frame count).
>>
>> Fix this by scheduling the work from the unbound system workqueue
>> which provides for minimal scheduling latency.
>>
>> v2:
>> - Use an unbound workqueue instead of a high-prio one. (Tvrtko, Chris)
>> v3:
>> - Use the system unbound wq instead of a dedicated one. (Maarten)
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97775
>> Testcase: igt/kms_cursor_legacy
>> CC: Chris Wilson 
>> CC: Maarten Lankhorst 
>> CC: Tvrtko Ursulin 
>> Signed-off-by: Imre Deak 
>> Reviewed-by: Tvrtko Ursulin  (v1)
> We violate the unbound_wq rules no worse than the ordinary system_wq,
> and this brings mmioflip on a par with nonblocking atomic modesets, so
> Reviewed-by: Chris Wilson 
> -Chris
>
Reviewed-by: Maarten Lankhorst 



Re: [Intel-gfx] [PATCH v3] drm/i915: Queue page flip work via a low latency, unbound workqueue

2016-09-20 Thread Chris Wilson
On Tue, Sep 20, 2016 at 02:58:19PM +0300, Imre Deak wrote:
> While user space has control over the scheduling priority of its page
> flipping thread, the corresponding work the driver schedules for MMIO
> flips always runs from the generic system workqueue which has some
> scheduling overhead due to it being CPU bound. This would hinder an
> application that wants more stringent guarantees over flip timing (to
> avoid missing a flip at the next frame count).
> 
> Fix this by scheduling the work from the unbound system workqueue
> which provides for minimal scheduling latency.
> 
> v2:
> - Use an unbound workqueue instead of a high-prio one. (Tvrtko, Chris)
> v3:
> - Use the system unbound wq instead of a dedicated one. (Maarten)
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97775
> Testcase: igt/kms_cursor_legacy
> CC: Chris Wilson 
> CC: Maarten Lankhorst 
> CC: Tvrtko Ursulin 
> Signed-off-by: Imre Deak 
> Reviewed-by: Tvrtko Ursulin  (v1)

We violate the unbound_wq rules no worse than the ordinary system_wq,
and this brings mmioflip on a par with nonblocking atomic modesets, so
Reviewed-by: Chris Wilson 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Unlock PPS registers after GPU reset

2016-09-20 Thread Imre Deak
On Thu, 2016-09-15 at 07:20 +, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Unlock PPS registers after GPU reset
> URL   : https://patchwork.freedesktop.org/series/12446/
> State : success

Thanks for the report and review, I pushed it to -dinq.

> 
> == Summary ==
> 
> Series 12446v1 drm/i915: Unlock PPS registers after GPU reset
> https://patchwork.freedesktop.org/api/1.0/series/12446/revisions/1/mbox/
> 
> Test gem_exec_suspend:
> Subgroup basic-s3:
> incomplete -> PASS   (fi-hsw-4770k)
> Test kms_frontbuffer_tracking:
> Subgroup basic:
> skip   -> PASS   (fi-ivb-3770)
> Test kms_pipe_crc_basic:
> Subgroup suspend-read-crc-pipe-a:
> dmesg-warn -> PASS   (fi-byt-j1900)
> Subgroup suspend-read-crc-pipe-b:
> dmesg-warn -> PASS   (fi-byt-j1900)
> 
> fi-bdw-5557u total:244  pass:229  dwarn:0   dfail:0   fail:0   skip:15 
> fi-bsw-n3050 total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
> fi-byt-j1900 total:244  pass:212  dwarn:0   dfail:0   fail:1   skip:31 
> fi-byt-n2820 total:244  pass:208  dwarn:0   dfail:0   fail:1   skip:35 
> fi-hsw-4770k total:244  pass:226  dwarn:0   dfail:0   fail:0   skip:18 
> fi-hsw-4770r total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
> fi-ilk-650   total:244  pass:183  dwarn:0   dfail:0   fail:1   skip:60 
> fi-ivb-3520m total:244  pass:219  dwarn:0   dfail:0   fail:0   skip:25 
> fi-ivb-3770  total:244  pass:208  dwarn:0   dfail:0   fail:0   skip:36 
> fi-skl-6260u total:244  pass:230  dwarn:0   dfail:0   fail:0   skip:14 
> fi-skl-6700hq total:244  pass:221  dwarn:0   dfail:0   fail:1   skip:22 
> fi-skl-6700k total:244  pass:219  dwarn:1   dfail:0   fail:0   skip:24 
> fi-snb-2520m total:244  pass:208  dwarn:0   dfail:0   fail:0   skip:36 
> fi-snb-2600  total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 
> 
> Results at /archive/results/CI_IGT_test/Patchwork_2530/
> 
> 9aa8c0cdbc076bcc0486d7a31922a0f77c032fe7 drm-intel-nightly: 
> 2016y-09m-14d-09h-19m-25s UTC integration manifest
> 424c774 drm/i915: Unlock PPS registers after GPU reset
> 


[Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: Queue page flip work with high priority (rev3)

2016-09-20 Thread Patchwork
== Series Details ==

Series: drm/i915: Queue page flip work with high priority (rev3)
URL   : https://patchwork.freedesktop.org/series/12336/
State : warning

== Summary ==

Series 12336v3 drm/i915: Queue page flip work with high priority
https://patchwork.freedesktop.org/api/1.0/series/12336/revisions/3/mbox/

Test gem_exec_gttfill:
Subgroup basic:
pass   -> SKIP   (fi-snb-2600)
Test gem_exec_suspend:
Subgroup basic-s3:
incomplete -> PASS   (fi-hsw-4770k)
Test kms_pipe_crc_basic:
Subgroup read-crc-pipe-c-frame-sequence:
pass   -> SKIP   (fi-hsw-4770r)
Subgroup suspend-read-crc-pipe-a:
pass   -> SKIP   (fi-hsw-4770r)
Subgroup suspend-read-crc-pipe-b:
skip   -> PASS   (fi-hsw-4770r)
Test kms_psr_sink_crc:
Subgroup psr_basic:
fail   -> DMESG-WARN (fi-skl-6700hq)

fi-bdw-5557u total:244  pass:229  dwarn:0   dfail:0   fail:0   skip:15 
fi-bsw-n3050 total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
fi-byt-n2820 total:244  pass:208  dwarn:0   dfail:0   fail:1   skip:35 
fi-hsw-4770k total:208  pass:187  dwarn:0   dfail:0   fail:0   skip:20 
fi-hsw-4770r total:244  pass:220  dwarn:0   dfail:0   fail:0   skip:24 
fi-ilk-650   total:244  pass:182  dwarn:0   dfail:0   fail:2   skip:60 
fi-ivb-3520m total:244  pass:219  dwarn:0   dfail:0   fail:0   skip:25 
fi-ivb-3770  total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 
fi-skl-6260u total:244  pass:230  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-6700hq total:244  pass:221  dwarn:1   dfail:0   fail:0   skip:22 
fi-skl-6700k total:244  pass:219  dwarn:1   dfail:0   fail:0   skip:24 
fi-skl-6770hq total:244  pass:228  dwarn:1   dfail:0   fail:1   skip:14 
fi-snb-2520m total:244  pass:208  dwarn:0   dfail:0   fail:0   skip:36 
fi-snb-2600  total:244  pass:206  dwarn:0   dfail:0   fail:0   skip:38 

Results at /archive/results/CI_IGT_test/Patchwork_2560/

de934d54f765dcfa82c018ea5d256801cffee7af drm-intel-nightly: 
2016y-09m-20d-09h-13m-55s UTC integration manifest
04838a2 drm/i915: Queue page flip work via a low latency, unbound workqueue



Re: [Intel-gfx] [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions

2016-09-20 Thread Paulo Zanoni
On Fri, 2016-09-09 at 13:30 +0530, Kumar, Mahesh wrote:
> From: Mahesh Kumar 
> 
> This patch makes use of the plane_wm variable directly instead of
> passing the skl_plane_wm struct. This reduces the number of arguments
> required in the watermark calculation functions.
> 
> It also gives more freedom in decision making when implementing Bspec WM
> workarounds.

This is just my personal opinion, but I don't think this patch is an
improvement to the codebase. The way things are currently organized,
there's no risk that we may end up reading some variable that's still
not set/computed, so there's less risk that some later patch may end up
using information it shouldn't. Also, by having these explicit out
parameters, we clearly highlight what's the goal of the function:
output those 3 values, instead of writing I-don't-know-what to a huge
struct.

Besides, the patch even keeps the out_ variables as local variables
instead of parameters, which makes things even more confusing IMHO,
since in_ and out_ variables are usually function parameters.
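
To make the contrast concrete, a small self-contained illustration of the
two styles (hypothetical names and values, not the driver code):

#include <stdbool.h>
#include <stdint.h>

struct wm_result { uint16_t blocks; uint8_t lines; bool enabled; };

/* Explicit out-parameters: the outputs are visible in the signature and
 * the caller decides exactly where they land. */
static void compute_wm_explicit(int level, uint16_t *out_blocks,
                                uint8_t *out_lines, bool *enabled)
{
        *out_blocks = 8 * (level + 1);
        *out_lines = 2;
        *enabled = true;
}

/* Whole-struct variant: a shorter argument list, but the signature no
 * longer says which fields the callee may write - the concern above. */
static void compute_wm_struct(int level, struct wm_result *res)
{
        compute_wm_explicit(level, &res->blocks, &res->lines, &res->enabled);
}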

I also see that this patch is not necessary for the series. We kinda
use the new pipe_wm variable at patch 9, but we could just go with the
variables we have.

So, unless some new arguments are given, I'd suggest to just drop this
patch.

> 
> Signed-off-by: Mahesh Kumar 
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 29 +++--
>  1 file changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index 86c6d66..3fdec4d 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3542,9 +3542,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>   struct intel_plane_state
> *intel_pstate,
>   uint16_t ddb_allocation,
>   int level,
> - uint16_t *out_blocks, /* out */
> - uint8_t *out_lines, /* out */
> - bool *enabled /* out */)
> + struct skl_pipe_wm *pipe_wm)
>  {
>   struct drm_plane_state *pstate = _pstate->base;
>   struct drm_framebuffer *fb = pstate->fb;
> @@ -3557,6 +3555,11 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>   uint32_t width = 0, height = 0;
>   uint32_t plane_pixel_rate;
>   uint32_t y_tile_minimum, y_min_scanlines;
> + int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
> + struct skl_wm_level *result = &pipe_wm->wm[level];
> + uint16_t *out_blocks = &result->plane_res_b[id];
> + uint8_t *out_lines = &result->plane_res_l[id];
> + bool *enabled = &result->plane_en[id];
>  
>   if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
>   *enabled = false;
> @@ -3673,7 +3676,7 @@ skl_compute_wm_level(const struct
> drm_i915_private *dev_priv,
>    struct skl_ddb_allocation *ddb,
>    struct intel_crtc_state *cstate,
>    int level,
> -  struct skl_wm_level *result)
> +  struct skl_pipe_wm *pipe_wm)
>  {
>   struct drm_atomic_state *state = cstate->base.state;
>   struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
> @@ -3684,12 +3687,6 @@ skl_compute_wm_level(const struct
> drm_i915_private *dev_priv,
>   enum pipe pipe = intel_crtc->pipe;
>   int ret;
>  
> - /*
> -  * We'll only calculate watermarks for planes that are
> actually
> -  * enabled, so make sure all other planes are set as
> disabled.
> -  */
> - memset(result, 0, sizeof(*result));
> -
>   for_each_intel_plane_mask(&dev_priv->drm,
>     intel_plane,
>     cstate->base.plane_mask) {
> @@ -3727,9 +3724,7 @@ skl_compute_wm_level(const struct
> drm_i915_private *dev_priv,
>      intel_pstate,
>      ddb_blocks,
>      level,
> -    &result->plane_res_b[i],
> -    &result->plane_res_l[i],
> -    &result->plane_en[i]);
> +    pipe_wm);
>   if (ret)
>   return ret;
>   }
> @@ -3777,9 +3772,15 @@ static int skl_build_pipe_wm(struct
> intel_crtc_state *cstate,
>   int level, max_level = ilk_wm_max_level(dev);
>   int ret;
>  
> + /*
> +  * We'll only calculate watermarks for planes that are
> actually
> +  * enabled, so make sure all other planes are set as
> disabled.
> +  */
> + memset(pipe_wm, 0, sizeof(*pipe_wm));
> +
>   for (level = 0; level <= max_level; level++) {
>   ret = skl_compute_wm_level(dev_priv, ddb, cstate,
> -

[Intel-gfx] [PATCH v3] drm/i915: Queue page flip work via a low latency, unbound workqueue

2016-09-20 Thread Imre Deak
While user space has control over the scheduling priority of its page
flipping thread, the corresponding work the driver schedules for MMIO
flips always runs from the generic system workqueue, which has some
scheduling overhead due to it being CPU bound. This would hinder an
application that wants more stringent guarantees over flip timing (to
avoid missing a flip at the next frame count).

Fix this by scheduling the work from the unbound system workqueue
which provides for minimal scheduling latency.

v2:
- Use an unbound workqueue instead of a high-prio one. (Tvrtko, Chris)
v3:
- Use the system unbound wq instead of a dedicated one. (Maarten)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97775
Testcase: igt/kms_cursor_legacy
CC: Chris Wilson 
CC: Maarten Lankhorst 
CC: Tvrtko Ursulin 
Signed-off-by: Imre Deak 
Reviewed-by: Tvrtko Ursulin  (v1)
---
 drivers/gpu/drm/i915/intel_display.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 7bb7874..8da9f3c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12266,7 +12266,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 
work->flip_queued_req = i915_gem_active_get(&obj->last_write,
						    &obj->base.dev->struct_mutex);
-   schedule_work(&work->mmio_work);
+   queue_work(system_unbound_wq, &work->mmio_work);
} else {
request = i915_gem_request_alloc(engine, engine->last_context);
if (IS_ERR(request)) {
-- 
2.5.0



Re: [Intel-gfx] [PATCH 16/38] drm/i915: Pass around sg_table to get_pages/put_pages backend

2016-09-20 Thread kbuild test robot
Hi Chris,

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20160919]
[cannot apply to v4.8-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]
[Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for 
convenience) to record what (public, well-known) commit your patch series was 
built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:
https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Allow-disabling-error-capture/20160920-164839
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/drm_modes.c:693: warning: No description found for parameter 
'bus_flags'
   drivers/gpu/drm/drm_plane_helper.c:248: warning: No description found for 
parameter 'dst'
   drivers/gpu/drm/drm_plane_helper.c:248: warning: Excess function parameter 
'dest' description in 'drm_plane_helper_check_update'
   drivers/gpu/drm/drm_plane_helper.c:247: warning: No description found for 
parameter 'dst'
   drivers/gpu/drm/drm_plane_helper.c:247: warning: Excess function parameter 
'dest' description in 'drm_plane_helper_check_update'
>> drivers/gpu/drm/i915/i915_gem_fence.c:644: warning: No description found for 
>> parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:674: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:643: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:673: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:643: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/i915/i915_gem_fence.c:673: warning: No description found for 
parameter 'pages'
   drivers/gpu/drm/drm_crtc.c:1270: WARNING: Inline literal start-string 
without end-string.
   drivers/gpu/drm/drm_crtc.c:1385: WARNING: Inline literal start-string 
without end-string.
   include/drm/drm_crtc.h:1202: WARNING: Inline literal start-string without 
end-string.
   include/drm/drm_crtc.h:1255: WARNING: Inline literal start-string without 
end-string.
   include/drm/drm_crtc.h:1268: WARNING: Inline literal start-string without 
end-string.
   include/drm/drm_crtc.h:1272: WARNING: Inline literal start-string without 
end-string.
   drivers/gpu/drm/drm_irq.c:718: WARNING: Option list ends without a blank 
line; unexpected unindent.
   drivers/gpu/drm/drm_fb_helper.c:2195: WARNING: Inline emphasis start-string 
without end-string.
   drivers/gpu/drm/drm_simple_kms_helper.c:141: WARNING: Inline literal 
start-string without end-string.
   include/drm/drm_gem.h:212: WARNING: Inline emphasis start-string without 
end-string.
   drivers/gpu/drm/i915/i915_vgpu.c:176: WARNING: Literal block ends without a 
blank line; unexpected unindent.
   drivers/gpu/drm/i915/intel_audio.c:54: WARNING: Inline emphasis start-string 
without end-string.
   drivers/gpu/drm/i915/intel_audio.c:54: WARNING: Inline emphasis start-string 
without end-string.
   drivers/gpu/drm/i915/intel_guc_fwif.h:159: WARNING: Block quote ends without 
a blank line; unexpected unindent.
   drivers/gpu/drm/i915/intel_guc_fwif.h:178: WARNING: Enumerated list ends 
without a blank line; unexpected unindent.
   Documentation/gpu/drm-kms.rst:13: WARNING: Could not lex literal_block as 
"C". Highlighting skipped.
   Documentation/gpu/drm-kms-helpers.rst:16: WARNING: Could not lex 
literal_block as "C". Highlighting skipped.
   Documentation/gpu/i915.rst:57: WARNING: Could not lex literal_block as "C". 
Highlighting skipped.

vim +/pages +644 drivers/gpu/drm/i915/i915_gem_fence.c

3271dca48 Daniel Vetter 2015-07-24  628  /**
3271dca48 Daniel Vetter 2015-07-24  629   * i915_gem_object_do_bit_17_swizzle - 
fixup bit 17 swizzling
3271dca48 Daniel Vetter 2015-07-24  630   * @obj: i915 GEM buffer object
3271dca48 Daniel Vetter 2015-07-24  631   *
3271dca48 Daniel Vetter 2015-07-24  632   * This function fixes up the 
swizzling in case any page frame number for this
3271dca48 Daniel Vetter 2015-07-24  633   * object has changed in bit 17 since 
that state has been saved with
3271dca48 Daniel Vetter 2015-07-24  634   * 
i915_gem_object_save_bit_17_swizzle().
3271dca48 Daniel Vetter 2015-07-24  635   *
3271dca48 Daniel Vetter 2015-07-24  636   * This is called when pinning backing 
storage again, since the kernel is free
3271dca48 Daniel Vetter 2015-07-24  637   * to move unpinned backing storage 
around (either by directly moving pages or
3271dca48 Daniel Vetter 2015-07-24  638   * by swapping them out and back in 
again).
3271dca48 Daniel Vetter 2015-07-24  639   */
7f96ecaf1 Daniel Vetter 2015-07-24  640  void
eb1ae14ed Chris Wilson  2016-09-20  641  
i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
eb1ae14ed Chris Wilson  2016-09-20  642  

[Intel-gfx] [PATCH] drm/i915: fix semicolon.cocci warnings

2016-09-20 Thread kbuild test robot
drivers/gpu/drm/i915/i915_gem_request.c:256:42-43: Unneeded semicolon


 Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Chris Wilson 
Signed-off-by: Fengguang Wu 
---

 i915_gem_request.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -253,7 +253,7 @@ static int i915_gem_init_global_seqno(st
i915_gem_retire_requests(dev_priv);
 
/* If the seqno wraps around, we need to clear the breadcrumb rbtree */
-   timeline = &dev_priv->gt.global_timeline;;
+   timeline = &dev_priv->gt.global_timeline;
if (!i915_seqno_passed(seqno, atomic_read(&timeline->next_seqno))) {
while (intel_kick_waiters(dev_priv) ||
   intel_kick_signalers(dev_priv))


Re: [Intel-gfx] [PATCH 35/38] drm/i915: Reserve space in the global seqno during request allocation

2016-09-20 Thread kbuild test robot
Hi Chris,

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20160919]
[cannot apply to v4.8-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]
[Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for 
convenience) to record what (public, well-known) commit your patch series was 
built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:
https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Allow-disabling-error-capture/20160920-164839
base:   git://anongit.freedesktop.org/drm-intel for-linux-next


coccinelle warnings: (new ones prefixed by >>)

>> drivers/gpu/drm/i915/i915_gem_request.c:256:42-43: Unneeded semicolon

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


Re: [Intel-gfx] [PATCH i-g-t] tests: kms_pipe_color: fix ctm tests

2016-09-20 Thread Ville Syrjälä
On Tue, Sep 20, 2016 at 09:46:58AM +0100, Lionel Landwerlin wrote:
> Some of the Intel platforms have odd numbers of LUT entries and we
> need to test a couple of values around the expected result. Bring
> back the CRC equal function we need that doesn't trigger an assert
> right away, while we still assert if we can't find a correct result in
> the outer loop.
> 
> v2: update Fixes field (Jani)
> 
> bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97691
> Fixes: 582ce4cd19c6 ("lib/debugs: nuke igt_crc_equal again")
> Cc: Daniel Vetter 
> Cc: Christophe Prigent 
> ---
>  tests/kms_pipe_color.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/kms_pipe_color.c b/tests/kms_pipe_color.c
> index b0a2f07..ebfceab 100644
> --- a/tests/kms_pipe_color.c
> +++ b/tests/kms_pipe_color.c
> @@ -652,6 +652,16 @@ static void test_pipe_legacy_gamma_reset(data_t *data,
>   free(gamma_zero);
>  }
>  
> +static bool crc_equal(igt_crc_t *a, igt_crc_t *b)
> +{
> +  int i;
> +
> +  for (i = 0; i < a->n_words; i++)
> +if (a->crc[i] != b->crc[i])
> +  return false;
> +  return true;
> +}

memcmp()?
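
A minimal sketch of that memcmp() variant, assuming igt_crc_t keeps its
words contiguously in crc[] (the type is stubbed below for illustration;
the real definition lives in the igt library):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct {
	int n_words;
	uint32_t crc[8];	/* stand-in; the real array size may differ */
} igt_crc_t;

static bool crc_equal(const igt_crc_t *a, const igt_crc_t *b)
{
	/* Equivalent to the word-by-word loop above, plus an explicit
	 * n_words check which that loop implicitly assumes. */
	return a->n_words == b->n_words &&
	       memcmp(a->crc, b->crc, a->n_words * sizeof(a->crc[0])) == 0;
}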

> +
>  /*
>   * Draw 3 rectangles using before colors with the ctm matrix apply and verify
>   * the CRC is equal to using after colors with an identify ctm matrix.
> @@ -724,7 +734,7 @@ static bool test_pipe_ctm(data_t *data,
>   /* Verify that the CRC of the software computed output is
>* equal to the CRC of the CTM matrix transformation output.
>*/
> - igt_assert_crc_equal(&crc_software, &crc_hardware);
> + ret &= crc_equal(&crc_software, &crc_hardware);
>  
>   igt_output_set_pipe(output, PIPE_ANY);
>   }
> -- 
> 2.8.1
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC


Re: [Intel-gfx] [PATCH 3/6] drm/i915: keep declarations in i915_drv.h

2016-09-20 Thread Jani Nikula
On Tue, 20 Sep 2016, Chris Wilson  wrote:
> On Tue, Sep 20, 2016 at 11:57:06AM +0300, Jani Nikula wrote:
>> On Mon, 19 Sep 2016, Joonas Lahtinen  wrote:
>> > On to, 2016-09-15 at 16:28 +0300, Jani Nikula wrote:
>> >> Fix sparse warnings:
>> >> 
>> >> drivers/gpu/drm/i915/i915_drv.c:1179:5: warning: symbol
>> >> 'i915_driver_load' was not declared. Should it be static?
>> >> 
>> >> drivers/gpu/drm/i915/i915_drv.c:1267:6: warning: symbol
>> >> 'i915_driver_unload' was not declared. Should it be static?
>> >> 
>> >> drivers/gpu/drm/i915/i915_drv.c:2444:25: warning: symbol 'i915_pm_ops'
>> >> was not declared. Should it be static?
>> >> 
>> >
>> > Hmm, Chris, weren't these changes in the middle of a series? Did the
>> > cleanup patches just fall off?
>> 
>> Can we please just use whichever patches fix the issue?
>
> It was deliberate placement to avoid having those symbols exposed to
> every user of i915_drv.h, i.e. everyone. Limit i915_drv.h to the
> interface exposed by i915_drv.c and so used only by a handful of files.

I can see that, but I prefer to get the warnings gone.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center


Re: [Intel-gfx] [PATCH] tests/kms_psr_sink_crc: commit before querying mode

2016-09-20 Thread Maarten Lankhorst
On 20-09-16 at 02:21, Rodrigo Vivi wrote:
> From: Mika Kuoppala 
>
> Commit to a mode before querying it.
>
> Tested-by: Rodrigo Vivi 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=97172
> Cc: Maarten Lankhorst 
> Signed-off-by: Mika Kuoppala 
> Signed-off-by: Rodrigo Vivi 
> ---
>  tests/kms_psr_sink_crc.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/tests/kms_psr_sink_crc.c b/tests/kms_psr_sink_crc.c
> index 8aafedb..68310d8 100644
> --- a/tests/kms_psr_sink_crc.c
> +++ b/tests/kms_psr_sink_crc.c
> @@ -112,6 +112,14 @@ static void setup_output(data_t *data)
>   continue;
>  
>   igt_output_set_pipe(output, pipe);
> + igt_display_commit(display);
> +
> + if (!output->valid) {
> + igt_output_set_pipe(output, PIPE_ANY);
> + igt_display_commit(display);
> + continue;
> + }

Could you be more verbose? I'm trying to get rid of this requirement, in 
particular the output->valid branch will never happen here.
To be compatible with atomic, igt_display_commit should not be called until the 
setup is completed, with fb's set etc.

Quick hack below, does this also fix the issue?

diff --git a/lib/igt_kms.c b/lib/igt_kms.c
index 7d53360359ef..86884466c5c4 100644
--- a/lib/igt_kms.c
+++ b/lib/igt_kms.c
@@ -1215,13 +1215,16 @@ static void igt_display_log_shift(igt_display_t 
*display, int shift)
igt_assert(display->log_shift >= 0);
 }
 
-static void igt_output_refresh(igt_output_t *output)
+static void igt_output_refresh(igt_output_t *output, bool final)
 {
igt_display_t *display = output->display;
unsigned long crtc_idx_mask;
 
+   crtc_idx_mask = output->pending_crtc_idx_mask;
+
/* we mask out the pipes already in use */
-   crtc_idx_mask = output->pending_crtc_idx_mask & ~display->pipes_in_use;
+   if (final)
+   crtc_idx_mask &= ~display->pipes_in_use;
 
kmstest_free_connector_config(&output->config);
 
@@ -1504,7 +1507,7 @@ void igt_display_init(igt_display_t *display, int drm_fd)
output->id = resources->connectors[i];
output->display = display;
 
-   igt_output_refresh(output);
+   igt_output_refresh(output, false);
 
output->config.pipe_changed = true;
}
@@ -1591,7 +1594,7 @@ static void igt_display_refresh(igt_display_t *display)
for (i = 0; i < display->n_outputs; i++) {
igt_output_t *output = &display->outputs[i];
 
-   igt_output_refresh(output);
+   igt_output_refresh(output, true);
}
 }
 
@@ -2462,6 +2465,8 @@ void igt_output_set_pipe(igt_output_t *output, enum pipe 
pipe)
 
if (pipe != output->config.pipe)
output->config.pipe_changed = true;
+
+   igt_output_refresh(output, false);
 }
 
 void igt_output_set_scaling_mode(igt_output_t *output, uint64_t scaling_mode)





Re: [Intel-gfx] [PATCH 3/6] drm/i915: keep declarations in i915_drv.h

2016-09-20 Thread Chris Wilson
On Tue, Sep 20, 2016 at 11:57:06AM +0300, Jani Nikula wrote:
> On Mon, 19 Sep 2016, Joonas Lahtinen  wrote:
> > On to, 2016-09-15 at 16:28 +0300, Jani Nikula wrote:
> >> Fix sparse warnings:
> >> 
> >> drivers/gpu/drm/i915/i915_drv.c:1179:5: warning: symbol
> >> 'i915_driver_load' was not declared. Should it be static?
> >> 
> >> drivers/gpu/drm/i915/i915_drv.c:1267:6: warning: symbol
> >> 'i915_driver_unload' was not declared. Should it be static?
> >> 
> >> drivers/gpu/drm/i915/i915_drv.c:2444:25: warning: symbol 'i915_pm_ops'
> >> was not declared. Should it be static?
> >> 
> >
> > Hmm, Chris, weren't these changes in the middle of a series? Did the
> > cleanup patches just fall off?
> 
> Can we please just use whichever patches fix the issue?

It was deliberate placement to avoid having those symbols exposed to
every user of i915_drv.h, i.e. everyone. Limit i915_drv.h to the
interface exposed by i915_drv.c and so used only by a handful of files.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [01/38] drm/i915: Allow disabling error capture

2016-09-20 Thread Patchwork
== Series Details ==

Series: series starting with [01/38] drm/i915: Allow disabling error capture
URL   : https://patchwork.freedesktop.org/series/12703/
State : failure

== Summary ==

Series 12703v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/12703/revisions/1/mbox/

Test drv_module_reload_basic:
pass   -> INCOMPLETE (fi-ilk-650)
pass   -> DMESG-WARN (fi-bdw-5557u)
Test kms_cursor_legacy:
Subgroup basic-flip-after-cursor-varying-size:
pass   -> SKIP   (fi-hsw-4770r)
Subgroup basic-flip-before-cursor-varying-size:
pass   -> SKIP   (fi-hsw-4770r)
Test kms_force_connector_basic:
Subgroup force-connector-state:
pass   -> SKIP   (fi-hsw-4770k)
Subgroup force-edid:
pass   -> SKIP   (fi-hsw-4770k)
Subgroup force-load-detect:
pass   -> SKIP   (fi-hsw-4770k)
Subgroup prune-stale-modes:
pass   -> SKIP   (fi-hsw-4770k)

fi-bdw-5557u total:244  pass:228  dwarn:1   dfail:0   fail:0   skip:15 
fi-bsw-n3050 total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
fi-byt-n2820 total:244  pass:208  dwarn:0   dfail:0   fail:1   skip:35 
fi-hsw-4770k total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-hsw-4770r total:244  pass:220  dwarn:0   dfail:0   fail:0   skip:24 
fi-ilk-650   total:31   pass:11   dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:244  pass:219  dwarn:0   dfail:0   fail:0   skip:25 
fi-ivb-3770  total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 
fi-skl-6260u total:244  pass:230  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-6700hq total:244  pass:221  dwarn:0   dfail:0   fail:1   skip:22 
fi-skl-6700k total:244  pass:219  dwarn:1   dfail:0   fail:0   skip:24 
fi-skl-6770hq total:244  pass:228  dwarn:1   dfail:0   fail:1   skip:14 
fi-snb-2520m total:244  pass:208  dwarn:0   dfail:0   fail:0   skip:36 
fi-snb-2600  total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 

Results at /archive/results/CI_IGT_test/Patchwork_2559/

4c518aef024daa0223692124baa2d7399f54dd97 drm-intel-nightly: 
2016y-09m-19d-20h-40m-51s UTC integration manifest
0a4cc56 drm/i915: Support explicit fencing for execbuf
761e0ee drm/i915: Enable userspace to opt-out of implicit fencing
a673b6f drm/i915: Enable multiple timelines
5526441 drm/i915: Reserve space in the global seqno during request allocation
2ce5a23 drm/i915: Create a unique name for the context
82dbbe1 drm/i915: Move the global sync optimisation to the timeline
b44dc8d drm/i915: Defer request emission
ba0e9ba drm/i915: Record space required for request emission
8702dfa drm/i915: Introduce a global_seqno for each request
b483e16 drm/i915: Wait first for submission, before waiting for request 
completion
b2b3b08 drm/i915: Queue the idling context switch after all other timelines
2dbc868 drm/i915: Combine seqno + tracking into a global timeline struct
1e5f30e drm/i915: Restore nonblocking awaits for modesetting
e30f561 drm: Add reference counting to drm_atomic_state
3d6d7d3 drm/i915: Move GEM activity tracking into a common struct 
reservation_object
1223e3f drm/i915: Use lockless object free
e22aea4 drm/i915: Move object release to a freelist + worker
59801b5 drm/i915: Acquire the backing storage outside of struct_mutex in 
set-domain
4d2cc56 drm/i915: Implement pwrite without struct-mutex
16ba351 drm/i915: Implement pread without struct-mutex
75967f2 drm/i915/dmabuf: Acquire the backing storage outside of struct_mutex
35ba93d drm/i915: Move object backing storage manipulation to its own locking
ac493d3 drm/i915: Pass around sg_table to get_pages/put_pages backend
9901023 drm/i915: Refactor object page API
0f5c18f drm/i915: Use a radixtree for random access to the object's backing 
storage
58f3da0 drm/i915: Markup GEM API with lockdep asserts
339ff75 drm/i915: Reuse the active golden render state batch
eb014cf drm/i915: Introduce an internal allocator for disposable private objects
2278008 drm/i915: Defer active reference until required
1a7a8ee drm/i915: Remove unused i915_gem_active_wait() in favour of _unlocked()
a049b32 drm/i915: Rearrange i915_wait_request() accounting with callers
688719b drm/i915: Allow i915_sw_fence_await_sw_fence() to allocate
f1aef72 drm/i915: Support asynchronous waits on struct fence from 
i915_gem_request
2da40ec drm/i915: Compress GPU objects in error state
fd35d16 drm/i915: Consolidate error object printing
af118c3 drm/i915: Always use the GTT for error capture
a7f289e drm/i915: Stop the machine whilst capturing the GPU crash dump
0412489 drm/i915: Allow disabling error capture



Re: [Intel-gfx] [PATCH 0/2] drm/i915: fix pwm increment setup

2016-09-20 Thread Jani Nikula
On Tue, 20 Sep 2016, Ville Syrjälä  wrote:
> On Mon, Sep 19, 2016 at 01:35:24PM +0300, Jani Nikula wrote:
>> CI got confused by all the patches flowing in the earlier thread, so
>> resend. No changes.
>
> Series lgtm.
>
> Reviewed-by: Ville Syrjälä 

Thanks for the review, pushed to drm-intel-next-queued.

BR,
Jani.


>
>> 
>> BR,
>> Jani.
>> 
>> Jani Nikula (1):
>>   drm/i915/backlight: setup and cache pwm alternate increment value
>> 
>> Shawn Lee (1):
>>   drm/i915/backlight: setup backlight pwm alternate increment on
>> backlight enable
>> 
>>  drivers/gpu/drm/i915/intel_drv.h   |  1 +
>>  drivers/gpu/drm/i915/intel_panel.c | 32 
>>  2 files changed, 29 insertions(+), 4 deletions(-)
>> 
>> -- 
>> 2.1.4
>> 
>> ___
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open Source Technology Center


Re: [Intel-gfx] [PATCH 3/6] drm/i915: keep declarations in i915_drv.h

2016-09-20 Thread Jani Nikula
On Mon, 19 Sep 2016, Joonas Lahtinen  wrote:
> On to, 2016-09-15 at 16:28 +0300, Jani Nikula wrote:
>> Fix sparse warnings:
>> 
>> drivers/gpu/drm/i915/i915_drv.c:1179:5: warning: symbol
>> 'i915_driver_load' was not declared. Should it be static?
>> 
>> drivers/gpu/drm/i915/i915_drv.c:1267:6: warning: symbol
>> 'i915_driver_unload' was not declared. Should it be static?
>> 
>> drivers/gpu/drm/i915/i915_drv.c:2444:25: warning: symbol 'i915_pm_ops'
>> was not declared. Should it be static?
>> 
>
> Hmm, Chris, weren't these changes in the middle of a series? Did the
> cleanup patches just fall off?

Can we please just use whichever patches fix the issue?

BR,
Jani.



>
> Regards, Joonas

-- 
Jani Nikula, Intel Open Source Technology Center


[Intel-gfx] [PATCH i-g-t] tests: kms_pipe_color: fix ctm tests

2016-09-20 Thread Lionel Landwerlin
Some of the Intel platforms have odd numbers of LUT entries and we
need to test a couple of values around the expected result. Bring
back the CRC equal function we need that doesn't trigger an assert
right away, while we still assert if we can't find a correct result in
the outer loop.

v2: update Fixes field (Jani)

bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97691
Fixes: 582ce4cd19c6 ("lib/debugs: nuke igt_crc_equal again")
Cc: Daniel Vetter 
Cc: Christophe Prigent 
---
 tests/kms_pipe_color.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tests/kms_pipe_color.c b/tests/kms_pipe_color.c
index b0a2f07..ebfceab 100644
--- a/tests/kms_pipe_color.c
+++ b/tests/kms_pipe_color.c
@@ -652,6 +652,16 @@ static void test_pipe_legacy_gamma_reset(data_t *data,
free(gamma_zero);
 }
 
+static bool crc_equal(igt_crc_t *a, igt_crc_t *b)
+{
+  int i;
+
+  for (i = 0; i < a->n_words; i++)
+if (a->crc[i] != b->crc[i])
+  return false;
+  return true;
+}
+
 /*
  * Draw 3 rectangles using before colors with the ctm matrix apply and verify
  * the CRC is equal to using after colors with an identify ctm matrix.
@@ -724,7 +734,7 @@ static bool test_pipe_ctm(data_t *data,
/* Verify that the CRC of the software computed output is
 * equal to the CRC of the CTM matrix transformation output.
 */
-   igt_assert_crc_equal(&crc_software, &crc_hardware);
+   ret &= crc_equal(&crc_software, &crc_hardware);
 
igt_output_set_pipe(output, PIPE_ANY);
}
-- 
2.8.1



Re: [Intel-gfx] [PATCH 0/6] Remaining patches for upfront link training on DDI platforms

2016-09-20 Thread Jani Nikula
On Fri, 16 Sep 2016, Manasi Navare  wrote:
> This patch series includes some of the remaining patches to enable
> upfront link training on DDI platforms for DP SST and MST.
> They are based on some of the patches submitted earlier by
> Ander and Durgadoss.

When you post new versions of an entire series, please post them in a
fresh thread, without --in-reply-to.

Thanks,
Jani.

-- 
Jani Nikula, Intel Open Source Technology Center


Re: [Intel-gfx] [PATCH 0/2] drm/i915: fix pwm increment setup

2016-09-20 Thread Ville Syrjälä
On Mon, Sep 19, 2016 at 01:35:24PM +0300, Jani Nikula wrote:
> CI got confused by all the patches flowing in the earlier thread, so
> resend. No changes.

Series lgtm.

Reviewed-by: Ville Syrjälä 

> 
> BR,
> Jani.
> 
> Jani Nikula (1):
>   drm/i915/backlight: setup and cache pwm alternate increment value
> 
> Shawn Lee (1):
>   drm/i915/backlight: setup backlight pwm alternate increment on
> backlight enable
> 
>  drivers/gpu/drm/i915/intel_drv.h   |  1 +
>  drivers/gpu/drm/i915/intel_panel.c | 32 
>  2 files changed, 29 insertions(+), 4 deletions(-)
> 
> -- 
> 2.1.4
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC


Re: [Intel-gfx] [PATCH i-g-t] lib/igt_core: Print stacktrace in initialization/fixture blocks.

2016-09-20 Thread Chris Wilson
On Tue, Sep 20, 2016 at 11:20:38AM +0300, Marius Vlad wrote:
A likely candidate for this behaviour is the igt_fixture block. Seen in the CI
by running tests/kms_psr_sink_crc, which is causing segfaults in the fixture
block.
> 
> While at it fix some minor printing bugs.
> 
> Signed-off-by: Marius Vlad 
> CC: Chris Wilson 
> ---
> @@ -1740,7 +1740,7 @@ static void fatal_sig_handler(int sig)
>   igt_assert_eq(write(STDERR_FILENO, ".\n", 2), 2);
>   }
>  
> - if (in_subtest && crash_signal(sig)) {
> + if (crash_signal(sig)) {
>   /* Linux standard to return exit code as 128 + signal */
>   if (!failed_one)
>   igt_exitcode = 128 + sig;
> @@ -1749,7 +1749,8 @@ static void fatal_sig_handler(int sig)
>  #ifdef HAVE_LIBUNWIND
>   print_backtrace_sig_safe();
>  #endif
> - exit_subtest("CRASH");
> + if (in_subtest)
> + exit_subtest("CRASH");

Makes sense, I think - as much as I know this part of igt at least!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] [PATCH 29/38] drm/i915: Wait first for submission, before waiting for request completion

2016-09-20 Thread Chris Wilson
In future patches, we will no longer be able to wait on a static global
seqno and instead have to break our wait up into phases. First we wait
for the global seqno assignment (upon submission to hardware), and once
submitted we wait for the hardware to complete.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_request.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 998387e53e5e..c1fe3d28fa37 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -773,6 +773,49 @@ bool __i915_spin_request(const struct drm_i915_gem_request 
*req,
return false;
 }
 
+static long
+__i915_request_wait_for_submit(struct drm_i915_gem_request *request,
+  unsigned int flags,
+  long timeout)
+{
+   const int state = flags & I915_WAIT_INTERRUPTIBLE ?
+   TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
+   wait_queue_head_t *q = &request->i915->gpu_error.wait_queue;
+   DEFINE_WAIT(reset);
+   DEFINE_WAIT(wait);
+
+   if (flags & I915_WAIT_LOCKED)
+   add_wait_queue(q, &reset);
+
+   do {
+   prepare_to_wait(&request->submit.wait, &wait, state);
+
+   if (i915_sw_fence_done(&request->submit))
+   break;
+
+   if (flags & I915_WAIT_LOCKED &&
+   i915_reset_in_progress(&request->i915->gpu_error)) {
+   __set_current_state(TASK_RUNNING);
+   i915_reset(request->i915);
+   reset_wait_queue(q, &reset);
+   continue;
+   }
+
+   if (signal_pending_state(state, current)) {
+   timeout = -ERESTARTSYS;
+   break;
+   }
+
+   timeout = io_schedule_timeout(timeout);
+   } while (timeout);
+   finish_wait(&request->submit.wait, &wait);
+
+   if (flags & I915_WAIT_LOCKED)
+   remove_wait_queue(q, &reset);
+
+   return timeout;
+}
+
 /**
  * i915_wait_request - wait until execution of request has finished
  * @req: duh!
@@ -811,6 +854,12 @@ long i915_wait_request(struct drm_i915_gem_request *req,
 
trace_i915_gem_request_wait_begin(req);
 
+   if (!i915_sw_fence_done(&req->submit)) {
+   timeout = __i915_request_wait_for_submit(req, flags, timeout);
+   if (timeout < 0)
+   goto complete;
+   }
+
/* Optimistic short spin before touching IRQs */
if (i915_spin_request(req, state, 5))
goto complete;
-- 
2.9.3



[Intel-gfx] [PATCH 31/38] drm/i915: Record space required for request emission

2016-09-20 Thread Chris Wilson
In the next patch, we will use deferred request emission. That requires
reserving sufficient space in the ringbuffer to emit the request, which
first requires us to know how large the request is.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_request.c |  1 +
 drivers/gpu/drm/i915/intel_lrc.c|  6 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 29 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 8a3193609bc3..c6c31aab9e38 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -427,6 +427,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 * away, e.g. because a GPU scheduler has deferred it.
 */
req->reserved_space = MIN_SPACE_FOR_ADD_REQUEST;
+   GEM_BUG_ON(req->reserved_space < engine->emit_request_sz);
 
if (i915.enable_execlists)
ret = intel_logical_ring_alloc_request_extras(req);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a2421a566ed6..87e6bf4392fd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1572,6 +1572,8 @@ static int gen8_emit_request(struct drm_i915_gem_request 
*request)
return intel_logical_ring_advance(request);
 }
 
+static const int gen8_emit_request_sz = 6 + WA_TAIL_DWORDS;
+
 static int gen8_emit_request_render(struct drm_i915_gem_request *request)
 {
struct intel_ring *ring = request->ring;
@@ -1603,6 +1605,8 @@ static int gen8_emit_request_render(struct 
drm_i915_gem_request *request)
return intel_logical_ring_advance(request);
 }
 
+static const int gen8_emit_request_render_sz = 8 + WA_TAIL_DWORDS;
+
 static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 {
int ret;
@@ -1677,6 +1681,7 @@ logical_ring_default_vfuncs(struct intel_engine_cs 
*engine)
engine->reset_hw = reset_common_ring;
engine->emit_flush = gen8_emit_flush;
engine->emit_request = gen8_emit_request;
+   engine->emit_request_sz = gen8_emit_request_sz;
engine->submit_request = execlists_submit_request;
 
engine->irq_enable = gen8_logical_ring_enable_irq;
@@ -1799,6 +1804,7 @@ int logical_render_ring_init(struct intel_engine_cs 
*engine)
engine->init_context = gen8_init_rcs_context;
engine->emit_flush = gen8_emit_flush_render;
engine->emit_request = gen8_emit_request_render;
+   engine->emit_request_sz = gen8_emit_request_render_sz;
 
ret = intel_engine_create_scratch(engine, 4096);
if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index de3876cbf58a..56915c1a9b7d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1405,6 +1405,8 @@ static int i9xx_emit_request(struct drm_i915_gem_request 
*req)
return 0;
 }
 
+static const int i9xx_emit_request_sz = 4;
+
 /**
  * gen6_sema_emit_request - Update the semaphore mailbox registers
  *
@@ -1458,6 +1460,8 @@ static int gen8_render_emit_request(struct 
drm_i915_gem_request *req)
return 0;
 }
 
+static const int gen8_render_emit_request_sz = 8;
+
 /**
  * intel_ring_sync - sync the waiter to the signaller on seqno
  *
@@ -2677,8 +2681,21 @@ static void intel_ring_default_vfuncs(struct 
drm_i915_private *dev_priv,
engine->reset_hw = reset_ring_common;
 
engine->emit_request = i9xx_emit_request;
-   if (i915.semaphores)
+   engine->emit_request_sz = i9xx_emit_request_sz;
+   if (i915.semaphores) {
+   int num_rings;
+
engine->emit_request = gen6_sema_emit_request;
+
+   num_rings = hweight32(INTEL_INFO(dev_priv)->ring_mask) - 1;
+   if (INTEL_GEN(dev_priv) >= 8) {
+   engine->emit_request_sz += num_rings * 6;
+   } else {
+   engine->emit_request_sz += num_rings * 3;
+   if (num_rings & 1)
+   engine->emit_request_sz++;
+   }
+   }
engine->submit_request = i9xx_submit_request;
 
if (INTEL_GEN(dev_priv) >= 8)
@@ -2706,9 +2723,17 @@ int intel_init_render_ring_buffer(struct intel_engine_cs 
*engine)
if (INTEL_GEN(dev_priv) >= 8) {
engine->init_context = intel_rcs_ctx_init;
engine->emit_request = gen8_render_emit_request;
+   engine->emit_request_sz = gen8_render_emit_request_sz;
engine->emit_flush = gen8_render_ring_flush;
-   if (i915.semaphores)
+   if (i915.semaphores) {
+   int num_rings;
+

[Intel-gfx] [PATCH 36/38] drm/i915: Enable multiple timelines

2016-09-20 Thread Chris Wilson
With the infrastructure converted over to tracking multiple timelines in
the GEM API whilst preserving the efficiency of using a single execution
timeline internally, we can now assign a separate timeline to every
context with full-ppgtt.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  | 10 
 drivers/gpu/drm/i915/i915_gem.c  |  5 ++
 drivers/gpu/drm/i915/i915_gem_context.c  |  4 +-
 drivers/gpu/drm/i915/i915_gem_evict.c| 11 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.c  | 19 ---
 drivers/gpu/drm/i915/i915_gem_gtt.h  |  4 +-
 drivers/gpu/drm/i915/i915_gem_request.c  | 85 +++-
 drivers/gpu/drm/i915/i915_gem_timeline.c |  1 +
 drivers/gpu/drm/i915/i915_gem_timeline.h |  2 +
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 12 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h  |  5 --
 11 files changed, 101 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 48c63365184d..1187e8f51985 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3493,6 +3493,16 @@ static inline void i915_gem_context_put(struct 
i915_gem_context *ctx)
kref_put(&ctx->ref, i915_gem_context_free);
 }
 
+static inline struct intel_timeline *
+i915_gem_context_lookup_timeline(struct i915_gem_context *ctx,
+struct intel_engine_cs *engine)
+{
+   struct i915_address_space *vm;
+
+   vm = ctx->ppgtt ? &ctx->ppgtt->base : &ctx->i915->ggtt.base;
+   return &vm->timeline.engine[engine->id];
+}
+
 static inline bool i915_gem_context_is_default(const struct i915_gem_context 
*c)
 {
return c->user_handle == DEFAULT_CONTEXT_HANDLE;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cd436f28e702..fd62e947322a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2558,6 +2558,7 @@ static void i915_gem_reset_engine(struct intel_engine_cs 
*engine)
 {
struct drm_i915_gem_request *request;
struct i915_gem_context *incomplete_ctx;
+   struct intel_timeline *timeline;
bool ring_hung;
 
/* Ensure irq handler finishes, and not run again. */
@@ -2595,6 +2596,10 @@ static void i915_gem_reset_engine(struct intel_engine_cs 
*engine)
list_for_each_entry_continue(request, &engine->timeline->requests, link)
if (request->ctx == incomplete_ctx)
reset_request(request);
+
+   timeline = i915_gem_context_lookup_timeline(incomplete_ctx, engine);
+   list_for_each_entry(request, &timeline->requests, link)
+   reset_request(request);
 }
 
 void i915_gem_reset(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 762f3de9421f..5574bb1ca5c4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -365,9 +365,9 @@ i915_gem_create_context(struct drm_device *dev,
return ctx;
 
if (USES_FULL_PPGTT(dev)) {
-   struct i915_hw_ppgtt *ppgtt =
-   i915_ppgtt_create(to_i915(dev), file_priv);
+   struct i915_hw_ppgtt *ppgtt;
 
+   ppgtt = i915_ppgtt_create(to_i915(dev), file_priv, ctx->name);
if (IS_ERR(ppgtt)) {
DRM_DEBUG_DRIVER("PPGTT setup failed (%ld)\n",
 PTR_ERR(ppgtt));
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 61f716c8768c..dcab3da29b04 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -33,13 +33,16 @@
 #include "intel_drv.h"
 #include "i915_trace.h"
 
-static bool
-gpu_is_idle(struct drm_i915_private *dev_priv)
+static bool ggtt_is_idle(struct drm_i915_private *dev_priv)
 {
+   struct i915_ggtt *ggtt = &dev_priv->ggtt;
struct intel_engine_cs *engine;
 
for_each_engine(engine, dev_priv) {
-   if (intel_engine_is_active(engine))
+   struct intel_timeline *tl;
+
+   tl = &ggtt->base.timeline.engine[engine->id];
+   if (i915_gem_active_isset(&tl->last_request))
return false;
}
 
@@ -153,7 +156,7 @@ search_again:
if (!i915_is_ggtt(vm) || flags & PIN_NONBLOCK)
return -ENOSPC;
 
-   if (gpu_is_idle(dev_priv)) {
+   if (ggtt_is_idle(dev_priv)) {
/* If we still have pending pageflip completions, drop
 * back to userspace to give our workqueues time to
 * acquire our locks and unpin the old scanouts.
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2a8eed5a70b9..0b7541b18195 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2112,8 +2112,10 @@ static int 

[Intel-gfx] [PATCH 12/38] drm/i915: Reuse the active golden render state batch

2016-09-20 Thread Chris Wilson
The golden render state is constant, but we recreate the batch setting
it up for every new context. If we keep that batch in a volatile cache
we can safely reuse it whenever we need to initialise a new context. We
mark the pages as purgeable and use the shrinker to recover pages from
the batch whenever we face memory pressues, recreating that batch afresh
on the next new context.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_render_state.c | 178 +--
 drivers/gpu/drm/i915/i915_gem_render_state.h |   4 +-
 drivers/gpu/drm/i915/intel_engine_cs.c   |   5 +
 drivers/gpu/drm/i915/intel_lrc.c |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   3 +
 6 files changed, 124 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c 
b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 53175aac5cd0..31b75132f6f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -28,17 +28,19 @@
 #include "i915_drv.h"
 #include "intel_renderstate.h"
 
-struct render_state {
+struct intel_render_state {
const struct intel_renderstate_rodata *rodata;
struct i915_vma *vma;
-   u32 aux_batch_size;
-   u32 aux_batch_offset;
+   u32 batch_offset;
+   u32 batch_size;
+   u32 aux_offset;
+   u32 aux_size;
 };
 
 static const struct intel_renderstate_rodata *
-render_state_get_rodata(const struct drm_i915_gem_request *req)
+render_state_get_rodata(const struct intel_engine_cs *engine)
 {
-   switch (INTEL_GEN(req->i915)) {
+   switch (INTEL_GEN(engine->i915)) {
case 6:
return &gen6_null_state;
case 7:
@@ -63,29 +65,27 @@ render_state_get_rodata(const struct drm_i915_gem_request 
*req)
  */
 #define OUT_BATCH(batch, i, val)   \
do {\
-   if (WARN_ON((i) >= PAGE_SIZE / sizeof(u32))) {  \
-   ret = -ENOSPC;  \
-   goto err_out;   \
-   }   \
+   if ((i) >= PAGE_SIZE / sizeof(u32)) \
+   goto err;   \
(batch)[(i)++] = (val); \
} while(0)
 
-static int render_state_setup(struct render_state *so)
+static int render_state_setup(struct intel_render_state *so,
+ struct drm_i915_private *i915)
 {
-   struct drm_device *dev = so->vma->vm->dev;
const struct intel_renderstate_rodata *rodata = so->rodata;
-   const bool has_64bit_reloc = INTEL_GEN(dev) >= 8;
+   const bool has_64bit_reloc = INTEL_GEN(i915) >= 8;
+   struct drm_i915_gem_object *obj = so->vma->obj;
unsigned int i = 0, reloc_index = 0;
-   struct page *page;
+   unsigned int needs_clflush;
u32 *d;
int ret;
 
-   ret = i915_gem_object_set_to_cpu_domain(so->vma->obj, true);
+   ret = i915_gem_obj_prepare_shmem_write(obj, &needs_clflush);
if (ret)
return ret;
 
-   page = i915_gem_object_get_dirty_page(so->vma->obj, 0);
-   d = kmap(page);
+   d = kmap_atomic(i915_gem_object_get_dirty_page(obj, 0));
 
while (i < rodata->batch_items) {
u32 s = rodata->batch[i];
@@ -95,10 +95,8 @@ static int render_state_setup(struct render_state *so)
s = lower_32_bits(r);
if (has_64bit_reloc) {
if (i + 1 >= rodata->batch_items ||
-   rodata->batch[i + 1] != 0) {
-   ret = -EINVAL;
-   goto err_out;
-   }
+   rodata->batch[i + 1] != 0)
+   goto err;
 
d[i++] = s;
s = upper_32_bits(r);
@@ -110,12 +108,20 @@ static int render_state_setup(struct render_state *so)
d[i++] = s;
}
 
+   if (rodata->reloc[reloc_index] != -1) {
+   DRM_ERROR("only %d relocs resolved\n", reloc_index);
+   goto err;
+   }
+
+   so->batch_offset = so->vma->node.start;
+   so->batch_size = rodata->batch_items * sizeof(u32);
+
while (i % CACHELINE_DWORDS)
OUT_BATCH(d, i, MI_NOOP);
 
-   so->aux_batch_offset = i * sizeof(u32);
+   so->aux_offset = i * sizeof(u32);
 
-   if (HAS_POOLED_EU(dev)) {
+   if (HAS_POOLED_EU(i915)) {
/*
 * We always program 3x6 pool config but depending upon which
 * subslice is disabled HW drops down to appropriate config
@@ 

[Intel-gfx] [PATCH 28/38] drm/i915: Queue the idling context switch after all other timelines

2016-09-20 Thread Chris Wilson
Before suspend, we wait for the switch to the kernel context. In order
for all the other context images to be complete upon suspend, that
switch must be the last operation by the GPU (i.e. this idling request
must not overtake any pending requests). To make this request execute last,
we make it depend on every other inflight request.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 12 
 drivers/gpu/drm/i915/i915_gem_context.c | 21 +++--
 2 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 768190594c4f..b10d26d3254d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4259,6 +4259,17 @@ void __i915_gem_object_release_unless_active(struct 
drm_i915_gem_object *obj)
i915_gem_object_put(obj);
 }
 
+__maybe_unused static bool is_kernel_context(struct drm_i915_private *dev_priv)
+{
+   struct intel_engine_cs *engine;
+
+   for_each_engine(engine, dev_priv)
+   if (engine->last_context != dev_priv->kernel_context)
+   return false;
+
+   return true;
+}
+
 int i915_gem_suspend(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
@@ -4288,6 +4299,7 @@ int i915_gem_suspend(struct drm_device *dev)
 
i915_gem_retire_requests(dev_priv);
 
+   GEM_BUG_ON(!is_kernel_context(dev_priv));
i915_gem_context_lost(dev_priv);
mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 1d2ab73a8f43..fe45cd7640b3 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -929,21 +929,30 @@ int i915_switch_context(struct drm_i915_gem_request *req)
 int i915_gem_switch_to_kernel_context(struct drm_i915_private *dev_priv)
 {
struct intel_engine_cs *engine;
+   struct i915_gem_timeline *timeline;
 
for_each_engine(engine, dev_priv) {
struct drm_i915_gem_request *req;
int ret;
 
-   if (engine->last_context == NULL)
-   continue;
-
-   if (engine->last_context == dev_priv->kernel_context)
-   continue;
-
req = i915_gem_request_alloc(engine, dev_priv->kernel_context);
if (IS_ERR(req))
return PTR_ERR(req);
 
+   /* Queue this switch after all other activity */
+   list_for_each_entry(timeline, &dev_priv->gt.timelines, link) {
+   struct drm_i915_gem_request *prev;
+   struct intel_timeline *tl;
+
+   tl = &timeline->engine[engine->id];
+   prev = i915_gem_active_raw(&tl->last_request,
+  &dev_priv->drm.struct_mutex);
+   if (prev)
+   i915_sw_fence_await_sw_fence(&req->submit,
+                                &prev->submit,
+                                NULL, GFP_KERNEL);
+   }
+
ret = i915_switch_context(req);
i915_add_request_no_flush(req);
if (ret)
-- 
2.9.3



[Intel-gfx] [PATCH 35/38] drm/i915: Reserve space in the global seqno during request allocation

2016-09-20 Thread Chris Wilson
A restriction on our global seqno is that they cannot wrap, and that we
cannot use the value 0. This allows us to detect when a request has not
yet been submitted, its global seqno is still 0, and ensures that
hardware semaphores are monotonic as required by older hardware. To
meet these restrictions when we defer the assignment of the global
seqno, we must check that we have an available slot in the global seqno
space during request construction. If that test fails, we wait for all
requests to be completed and reset the hardware back to 0.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen name,
   
i915_gem_request_get_seqno(work->flip_queued_req),
-   dev_priv->gt.global_timeline.next_seqno,
+   atomic_read(&dev_priv->gt.global_timeline.next_seqno),
   intel_engine_get_seqno(engine),
   
i915_gem_request_completed(work->flip_queued_req));
} else
@@ -1055,7 +1055,7 @@ i915_next_seqno_get(void *data, u64 *val)
 {
struct drm_i915_private *dev_priv = data;
 
-   *val = READ_ONCE(dev_priv->gt.global_timeline.next_seqno);
+   *val = atomic_read(&dev_priv->gt.global_timeline.next_seqno);
return 0;
 }
 
@@ -2331,8 +2331,8 @@ static int i915_rps_boost_info(struct seq_file *m, void 
*data)
struct drm_file *file;
 
seq_printf(m, "RPS enabled? %d\n", dev_priv->rps.enabled);
-   seq_printf(m, "GPU busy? %s [%x]\n",
-  yesno(dev_priv->gt.awake), dev_priv->gt.active_engines);
+   seq_printf(m, "GPU busy? %s [%d requests]\n",
+  yesno(dev_priv->gt.awake), dev_priv->gt.active_requests);
seq_printf(m, "CPU waiting? %d\n", count_irq_waiters(dev_priv));
seq_printf(m, "Frequency requested %d\n",
   intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq));
@@ -2367,7 +2367,7 @@ static int i915_rps_boost_info(struct seq_file *m, void 
*data)
 
if (INTEL_GEN(dev_priv) >= 6 &&
dev_priv->rps.enabled &&
-   dev_priv->gt.active_engines) {
+   dev_priv->gt.active_requests) {
u32 rpup, rpupei;
u32 rpdown, rpdownei;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bb2b8b41eb61..48c63365184d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2059,6 +2059,7 @@ struct drm_i915_private {
 
struct list_head timelines;
struct i915_gem_timeline global_timeline;
+   u32 active_requests;
 
/**
 * Is the GPU currently considered idle, or busy executing
@@ -2067,7 +2068,6 @@ struct drm_i915_private {
 * In order to reduce the effect on performance, there
 * is a slight delay before we do so.
 */
-   unsigned int active_engines;
bool awake;
 
/**
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 659ee3b910d7..cd436f28e702 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2640,8 +2640,6 @@ static void i915_gem_cleanup_engine(struct 
intel_engine_cs *engine)
memset(engine->execlist_port, 0, sizeof(engine->execlist_port));
spin_unlock(&engine->execlist_lock);
}
-
-   engine->i915->gt.active_engines &= ~intel_engine_flag(engine);
 }
 
 void i915_gem_set_wedged(struct drm_i915_private *dev_priv)
@@ -2696,7 +2694,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
if (!READ_ONCE(dev_priv->gt.awake))
return;
 
-   if (READ_ONCE(dev_priv->gt.active_engines))
+   if (READ_ONCE(dev_priv->gt.active_requests))
return;
 
rearm_hangcheck =
@@ -2710,7 +2708,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
goto out_rearm;
}
 
-   if (dev_priv->gt.active_engines)

[Intel-gfx] [PATCH 37/38] drm/i915: Enable userspace to opt-out of implicit fencing

2016-09-20 Thread Chris Wilson
Userspace is faced with a dilemma. The kernel requires implicit fencing
to manage resource usage (we always must wait for the GPU to finish
before releasing its PTE) and for third parties. However, userspace may
wish to avoid this serialisation if it is either using explicit fencing
between parties and wants more fine-grained access to buffers (e.g. it
may partition the buffer between uses and track fences on ranges rather
than the implicit fences tracking the whole object). It follows that
userspace needs a mechanism to avoid the kernel's serialisation on its
implicit fences before execbuf execution.

The next question is whether this is an object, execbuf or context flag.
Hybrid users (such as using explicit EGL_ANDROID_native_sync fencing on
shared winsys buffers, but implicit fencing on internal surfaces)
require a per-object level flag. Given that this flag needs to be set only
once for the lifetime of the object, this reduces the convenience of
having an execbuf or context level flag (and avoids having multiple
pieces of uABI controlling the same feature).

Incorrect use of this flag will result in rendering corruption and GPU
hangs - but will not result in use-after-free or similar resource
tracking issues.

Serious caveat: write ordering is not strictly correct after setting
this flag on a render target on multiple engines. This affects all
subsequent GEM operations (execbuf, set-domain, pread) and shared
dma-buf operations. A fix is possible - but costly (both in terms of
further ABI changes and runtime overhead).
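
For illustration, a hypothetical userspace sketch of how a per-object
opt-out like this could be used; the struct and the flag value below are
stand-ins, not the real uapi definitions (those live in the i915_drm.h
hunk of this patch):

#include <stdint.h>

#define EXEC_OBJECT_ASYNC_STUB	(1 << 6)	/* placeholder; use the real uapi value */

struct exec_object_stub {	/* stand-in for drm_i915_gem_exec_object2 */
	uint32_t handle;
	uint64_t flags;
};

/* The caller promises it orders access to this BO itself (e.g. with
 * explicit fences), so the kernel may skip the implicit-fence wait. */
static void mark_async(struct exec_object_stub *obj)
{
	obj->flags |= EXEC_OBJECT_ASYNC_STUB;
}

int main(void)
{
	struct exec_object_stub obj = { .handle = 1, .flags = 0 };

	mark_async(&obj);
	return (obj.flags & EXEC_OBJECT_ASYNC_STUB) ? 0 : 1;
}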

Testcase: igt/gem_exec_async
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.c|  1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  3 +++
 include/uapi/drm/i915_drm.h| 27 ++-
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b393347554bd..19ee76284371 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -332,6 +332,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_HAS_EXEC_HANDLE_LUT:
case I915_PARAM_HAS_COHERENT_PHYS_GTT:
case I915_PARAM_HAS_EXEC_SOFTPIN:
+   case I915_PARAM_HAS_EXEC_ASYNC:
/* For the time being all of these are always true;
 * if some supported hardware does not have one of these
 * features this value needs to be provided from
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e2d4f937d0b2..7038da9aa68f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1107,6 +1107,9 @@ i915_gem_execbuffer_move_to_gpu(struct 
drm_i915_gem_request *req,
list_for_each_entry(vma, vmas, exec_list) {
struct drm_i915_gem_object *obj = vma->obj;
 
+   if (vma->exec_entry->flags & EXEC_OBJECT_ASYNC)
+   continue;
+
ret = i915_gem_request_await_object
(req, obj, obj->base.pending_write_domain);
if (ret)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 03725fe89859..a2fa511b46b3 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -388,6 +388,10 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_POOLED_EU38
 #define I915_PARAM_MIN_EU_IN_POOL   39
 #define I915_PARAM_MMAP_GTT_VERSION 40
+/* Query whether DRM_I915_GEM_EXECBUFFER2 supports the ability to opt-out of
+ * synchronisation with implicit fencing on individual objects.
+ */
+#define I915_PARAM_HAS_EXEC_ASYNC   41
 
 typedef struct drm_i915_getparam {
__s32 param;
@@ -729,8 +733,29 @@ struct drm_i915_gem_exec_object2 {
 #define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
 #define EXEC_OBJECT_PINNED  (1<<4)
 #define EXEC_OBJECT_PAD_TO_SIZE (1<<5)
+/* The kernel implicitly tracks GPU activity on all GEM objects, and
+ * synchronises operations with outstanding rendering. This includes
+ * rendering on other devices if exported via dma-buf. However, sometimes
+ * this tracking is too coarse and the user knows better. For example,
+ * if the object is split into non-overlapping ranges shared between different
+ * clients or engines (i.e. suballocating objects), the implicit tracking
+ * by kernel assumes that each operation affects the whole object rather
+ * than an individual range, causing needless synchronisation between clients.
+ * The kernel will also forgo any CPU cache flushes prior to rendering from
+ * the object as the client is expected to be also handling such domain
+ * tracking.
+ *
+ * The kernel maintains the implicit tracking in order to manage resources
+ * used by the GPU - this flag only disables the 

[Intel-gfx] [PATCH 24/38] drm/i915: Move GEM activity tracking into a common struct reservation_object

2016-09-20 Thread Chris Wilson
In preparation to support many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)

v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.

Caveats:

 * busy-ioctl: busy-ioctl only reports on the native fences, it will not
warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.

 * non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch to use a fence instead for awaiting on the rendering, see
"drm/i915: Restore nonblocking awaits for modesetting"

 * dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 72s). The runtime effect is far larger
than the overhead added to execbuf as indicated by perf - interesting
secondary effects?

 * loss of object-level retirement callbacks, emulated by VMA retirement
tracking.

 * minor loss of object-level last activity information from debugfs,
could be replaced with per-vma information if desired

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_debugfs.c|  32 ++--
 drivers/gpu/drm/i915/i915_drv.h|  45 +
 drivers/gpu/drm/i915/i915_gem.c| 277 +
 drivers/gpu/drm/i915/i915_gem_batch_pool.c |   3 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |  46 +
 drivers/gpu/drm/i915/i915_gem_dmabuf.h |  45 -
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  71 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c|  32 
 drivers/gpu/drm/i915/i915_gem_gtt.h|   1 +
 drivers/gpu/drm/i915/i915_gem_request.c|  48 +++--
 drivers/gpu/drm/i915/i915_gem_request.h|  37 +---
 drivers/gpu/drm/i915/i915_gpu_error.c  |   6 +-
 drivers/gpu/drm/i915/intel_atomic_plane.c  |   2 -
 drivers/gpu/drm/i915/intel_display.c   | 131 --
 drivers/gpu/drm/i915/intel_drv.h   |   3 -
 15 files changed, 234 insertions(+), 545 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/i915_gem_dmabuf.h

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 355eec8f7cac..7ecdd5cc27dd 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -130,6 +130,23 @@ static u64 i915_gem_obj_total_ggtt_size(struct 
drm_i915_gem_object *obj)
return size;
 }
 
+static struct intel_engine_cs *
+last_write_engine(struct drm_i915_gem_object *obj)
+{
+   struct intel_engine_cs *engine = NULL;
+   struct fence *fence;
+
+   rcu_read_lock();
+   fence = reservation_object_get_excl_rcu(obj->resv);
+   rcu_read_unlock();
+
+   if (fence && fence_is_i915(fence) && !fence_is_signaled(fence))
+   engine = to_request(fence)->engine;
+   fence_put(fence);
+
+   return engine;
+}
+
 static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
@@ -138,11 +155,10 @@ describe_obj(struct seq_file *m, struct 
drm_i915_gem_object *obj)
struct i915_vma *vma;
unsigned int frontbuffer_bits;
int pin_count = 0;
-   enum intel_engine_id id;
 
lockdep_assert_held(&obj->base.dev->struct_mutex);
 
-   seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x [ ",
+   seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x %s%s%s",
   &obj->base,
   get_active_flag(obj),
   get_pin_flag(obj),
@@ -151,14 +167,7 @@ describe_obj(struct seq_file *m, struct 
drm_i915_gem_object *obj)
   get_pin_mapped_flag(obj),
   obj->base.size / 1024,
   obj->base.read_domains,
-  obj->base.write_domain);
-   for_each_engine_id(engine, dev_priv, id)
-   seq_printf(m, "%x ",
-  i915_gem_active_get_seqno(&obj->last_read[id],
- &obj->base.dev->struct_mutex));
-   seq_printf(m, "] %x %s%s%s",
-  i915_gem_active_get_seqno(&obj->last_write,
- &obj->base.dev->struct_mutex),
+  

[Intel-gfx] [PATCH 21/38] drm/i915: Acquire the backing storage outside of struct_mutex in set-domain

2016-09-20 Thread Chris Wilson
As we can locklessly (well struct_mutex-lessly) acquire the backing
storage, do so in set-domain-ioctl to reduce the contention on the
struct_mutex.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 83 ++---
 1 file changed, 53 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c77c04422cc..4ee074305908 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1459,6 +1459,30 @@ write_origin(struct drm_i915_gem_object *obj, unsigned 
domain)
obj->frontbuffer_ggtt_origin : ORIGIN_CPU);
 }
 
+static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj)
+{
+   struct drm_i915_private *i915;
+   struct list_head *list;
+   struct i915_vma *vma;
+
+   list_for_each_entry(vma, &obj->vma_list, obj_link) {
+   if (!i915_vma_is_ggtt(vma))
+   continue;
+
+   if (i915_vma_is_active(vma))
+   continue;
+
+   if (!drm_mm_node_allocated(&vma->node))
+   continue;
+
+   list_move_tail(&vma->vm_link, &vma->vm->inactive_list);
+   }
+
+   i915 = to_i915(obj->base.dev);
+   list = obj->bind_count ? &i915->mm.bound_list : &i915->mm.unbound_list;
+   list_move_tail(&obj->global_list, list);
+}
+
 /**
  * Called when user space prepares to use an object with the CPU, either
  * through the mmap ioctl's mapping or a GTT mapping.
@@ -1500,25 +1524,40 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
*data,
   MAX_SCHEDULE_TIMEOUT,
   to_rps_client(file));
if (ret)
-   goto err;
+   goto err_unlocked;
+
+   /* Flush and acquire obj->pages so that we are coherent through
+* direct access in memory with previous cached writes through
+* shmemfs and that our cache domain tracking remains valid.
+* For example, if the obj->filp was moved to swap without us
+* being notified and releasing the pages, we would mistakenly
+* continue to assume that the obj remained out of the CPU cached
+* domain.
+*/
+   ret = i915_gem_object_pin_pages(obj);
+   if (ret)
+   goto err_unlocked;
 
ret = i915_mutex_lock_interruptible(dev);
if (ret)
-   goto err;
+   goto err_pages;
 
if (read_domains & I915_GEM_DOMAIN_GTT)
ret = i915_gem_object_set_to_gtt_domain(obj, write_domain != 0);
else
ret = i915_gem_object_set_to_cpu_domain(obj, write_domain != 0);
 
-   if (write_domain != 0)
-   intel_fb_obj_invalidate(obj, write_origin(obj, write_domain));
+   /* And bump the LRU for this access */
+   i915_gem_object_bump_inactive_ggtt(obj);
 
-   i915_gem_object_put(obj);
mutex_unlock(&dev->struct_mutex);
-   return ret;
 
-err:
+   if (write_domain != 0)
+   intel_fb_obj_invalidate(obj, write_origin(obj, write_domain));
+
+err_pages:
+   i915_gem_object_unpin_pages(obj);
+err_unlocked:
i915_gem_object_put_unlocked(obj);
return ret;
 }
@@ -1741,6 +1780,10 @@ int i915_gem_fault(struct vm_area_struct *area, struct 
vm_fault *vmf)
if (ret)
goto err;
 
+   ret = i915_gem_object_pin_pages(obj);
+   if (ret)
+   goto err;
+
intel_runtime_pm_get(dev_priv);
 
ret = i915_mutex_lock_interruptible(dev);
@@ -1822,6 +1865,7 @@ err_unlock:
mutex_unlock(&dev->struct_mutex);
 err_rpm:
intel_runtime_pm_put(dev_priv);
+   i915_gem_object_unpin_pages(obj);
 err:
switch (ret) {
case -EIO:
@@ -3205,24 +3249,6 @@ i915_gem_object_flush_cpu_write_domain(struct 
drm_i915_gem_object *obj)
I915_GEM_DOMAIN_CPU);
 }
 
-static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj)
-{
-   struct i915_vma *vma;
-
-   list_for_each_entry(vma, &obj->vma_list, obj_link) {
-   if (!i915_vma_is_ggtt(vma))
-   continue;
-
-   if (i915_vma_is_active(vma))
-   continue;
-
-   if (!drm_mm_node_allocated(&vma->node))
-   continue;
-
-   list_move_tail(&vma->vm_link, &vma->vm->inactive_list);
-   }
-}
-
 /**
  * Moves a single object to the GTT read, and possibly write domain.
  * @obj: object to act on
@@ -3278,7 +3304,7 @@ i915_gem_object_set_to_gtt_domain(struct 
drm_i915_gem_object *obj, bool write)
/* It should now be out of any other write domains, and we can update
 * the domain values for our changes.
 */
-   BUG_ON((obj->base.write_domain & ~I915_GEM_DOMAIN_GTT) != 0);
+   GEM_BUG_ON((obj->base.write_domain & ~I915_GEM_DOMAIN_GTT) != 0);

[Intel-gfx] [PATCH 34/38] drm/i915: Create a unique name for the context

2016-09-20 Thread Chris Wilson
This will be used for communicating issues with this context to
userspace, so we want to identify the parent process and the individual
context. Note that the name isn't quite unique; it presumes that there is
only a single device fd per process.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem_context.c | 23 ++-
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6daa4d538f42..bb2b8b41eb61 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -918,6 +918,7 @@ struct i915_gem_context {
struct drm_i915_file_private *file_priv;
struct i915_hw_ppgtt *ppgtt;
struct pid *pid;
+   const char *name;
 
struct i915_ctx_hang_stats hang_stats;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index fe45cd7640b3..762f3de9421f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -158,6 +158,7 @@ void i915_gem_context_free(struct kref *ctx_ref)
__i915_gem_object_release_unless_active(ce->state->obj);
}
 
+   kfree(ctx->name);
put_pid(ctx->pid);
list_del(&ctx->link);
 
@@ -303,19 +304,28 @@ __create_hw_context(struct drm_device *dev,
}
 
/* Default context will never have a file_priv */
-   if (file_priv != NULL) {
+   ret = DEFAULT_CONTEXT_HANDLE;
+   if (file_priv) {
ret = idr_alloc(&file_priv->context_idr, ctx,
DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
if (ret < 0)
goto err_out;
-   } else
-   ret = DEFAULT_CONTEXT_HANDLE;
+   }
+   ctx->user_handle = ret;
 
ctx->file_priv = file_priv;
-   if (file_priv)
+   if (file_priv) {
ctx->pid = get_task_pid(current, PIDTYPE_PID);
+   ctx->name = kasprintf(GFP_KERNEL, "%s[%d]/%x",
+ current->comm,
+ pid_nr(ctx->pid),
+ ctx->user_handle);
+   if (!ctx->name) {
+   ret = -ENOMEM;
+   goto err_pid;
+   }
+   }
 
-   ctx->user_handle = ret;
/* NB: Mark all slices as needing a remap so that when the context first
 * loads it will restore whatever remap state already exists. If there
 * is no remap info, it will be a NOP. */
@@ -329,6 +339,9 @@ __create_hw_context(struct drm_device *dev,
 
return ctx;
 
+err_pid:
+   put_pid(ctx->pid);
+   idr_remove(&file_priv->context_idr, ctx->user_handle);
 err_out:
context_close(ctx);
return ERR_PTR(ret);
-- 
2.9.3



[Intel-gfx] [PATCH 32/38] drm/i915: Defer request emission

2016-09-20 Thread Chris Wilson
Move the actual emission of the request (the closing breadcrumb) from
i915_add_request() to the submit callback. (It can be moved later when
required.) This allows us to defer the allocation of the global_seqno
from request construction to actual submission, allowing us to emit the
requests out of order (with respect to the order of their construction;
they will still only be executed once all of their dependencies are
resolved, including that all earlier requests on their timeline have been
submitted.) We have to specialise how we then emit the request in order
to write into the preallocated space, rather than at the tail of the
ringbuffer (which will have been advanced by the addition of new
requests).
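
A rough sketch of the idea (illustrative names only, with a bare seqno write
standing in for the real breadcrumb commands): the slot for the breadcrumb is
reserved when the request is constructed, and only filled in once submission
has assigned a global seqno.

    struct sketch_request {
            u32 *ring_vaddr;        /* CPU mapping of the ringbuffer */
            u32 postfix;            /* byte offset of the reserved breadcrumb slot */
            u32 global_seqno;       /* known only at submission time */
    };

    static void sketch_emit_breadcrumb(struct sketch_request *rq)
    {
            u32 *cs = rq->ring_vaddr + rq->postfix / sizeof(u32);

            /* Write into the pre-reserved slot rather than at the current
             * ring tail, which may already have later requests behind it.
             */
            cs[0] = rq->global_seqno;
    }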

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_request.c |  26 ++---
 drivers/gpu/drm/i915/intel_lrc.c| 120 ---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 168 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  10 +-
 4 files changed, 111 insertions(+), 213 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index c6c31aab9e38..c31cb7c4a61f 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -632,9 +632,7 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
struct intel_ring *ring = request->ring;
struct intel_timeline *timeline = request->timeline;
struct drm_i915_gem_request *prev;
-   u32 request_start;
-   u32 reserved_tail;
-   int ret;
+   int err;
 
lockdep_assert_held(&request->i915->drm.struct_mutex);
trace_i915_gem_request_add(request);
@@ -644,8 +642,6 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
 * should already have been reserved in the ring buffer. Let the ring
 * know that it is time to use that space up.
 */
-   request_start = ring->tail;
-   reserved_tail = request->reserved_space;
request->reserved_space = 0;
 
/*
@@ -656,10 +652,10 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
 * what.
 */
if (flush_caches) {
-   ret = engine->emit_flush(request, EMIT_FLUSH);
+   err = engine->emit_flush(request, EMIT_FLUSH);
 
/* Not allowed to fail! */
-   WARN(ret, "engine->emit_flush() failed: %d!\n", ret);
+   WARN(err, "engine->emit_flush() failed: %d!\n", err);
}
 
/* Record the position of the start of the breadcrumb so that
@@ -667,20 +663,12 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
 * GPU processing the request, we never over-estimate the
 * position of the ring's HEAD.
 */
+   err = intel_ring_begin(request, engine->emit_request_sz);
+   GEM_BUG_ON(err);
request->postfix = ring->tail;
 
-   /* Not allowed to fail! */
-   ret = engine->emit_request(request);
-   WARN(ret, "(%s)->emit_request failed: %d!\n", engine->name, ret);
-
-   /* Sanity check that the reserved size was large enough. */
-   ret = ring->tail - request_start;
-   if (ret < 0)
-   ret += ring->size;
-   WARN_ONCE(ret > reserved_tail,
- "Not enough space reserved (%d bytes) "
- "for adding the request (%d bytes)\n",
- reserved_tail, ret);
+   engine->emit_request(request, request->ring->vaddr + request->postfix);
+   ring->tail += engine->emit_request_sz * sizeof(u32);
 
/* Seal the request and mark it as pending execution. Note that
 * we may inspect this state, without holding any locks, during
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 87e6bf4392fd..ffbb4da6cf95 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -360,7 +360,7 @@ static u64 execlists_update_context(struct 
drm_i915_gem_request *rq)
struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
u32 *reg_state = ce->lrc_reg_state;
 
-   reg_state[CTX_RING_TAIL+1] = intel_ring_offset(rq->ring, rq->tail);
+   reg_state[CTX_RING_TAIL+1] = rq->tail;
 
/* True 32b PPGTT with dynamic page allocation: update PDP
 * registers and point the unallocated PDPs to scratch page.
@@ -594,6 +594,15 @@ static void execlists_submit_request(struct 
drm_i915_gem_request *request)
 
spin_lock_irqsave(&engine->execlist_lock, flags);
 
+   /* We keep the previous context alive until we retire the following
+* request. This ensures that the context object is still pinned
+* for any residual writes the HW makes into it on the context switch
+* into the next object following the breadcrumb. Otherwise, we may
+* retire the context too early.
+*/
+  

[Intel-gfx] [PATCH 22/38] drm/i915: Move object release to a freelist + worker

2016-09-20 Thread Chris Wilson
We want to hide the latency of releasing objects and their backing
storage from the submission, so we move the actual free to a worker.
This allows us to switch to struct_mutex freeing of the object in the
next patch.

Furthermore, if we know that the object we are dereferencing remains valid
for the duration of our access, we can forgo the usual synchronisation
barriers and atomic reference counting. To ensure this we defer freeing
an object until after an RCU grace period, such that any lookup of the
object within an RCU read critical section will remain valid until
after we exit that critical section. We also employ this delay for
rate-limiting the serialisation on reallocation - we have to slow down
object creation in order to prevent resource starvation (in particular,
files).
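
The pattern described above boils down to a lock-free list plus a worker,
with call_rcu() providing the grace period. A minimal sketch in kernel C;
the names are illustrative, not the i915 ones:

    #include <linux/kernel.h>
    #include <linux/llist.h>
    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/workqueue.h>

    struct sketch_obj {
            union {
                    struct rcu_head rcu;        /* while waiting out the grace period */
                    struct llist_node freed;    /* once on the free list */
            };
    };

    static LLIST_HEAD(sketch_free_list);
    static void sketch_free_worker(struct work_struct *work);
    static DECLARE_WORK(sketch_free_work, sketch_free_worker);

    static void sketch_free_worker(struct work_struct *work)
    {
            struct llist_node *freed = llist_del_all(&sketch_free_list);
            struct sketch_obj *obj, *next;

            /* The expensive release happens here, off the submission path. */
            llist_for_each_entry_safe(obj, next, freed, freed)
                    kfree(obj);
    }

    static void sketch_free_rcu(struct rcu_head *head)
    {
            struct sketch_obj *obj = container_of(head, struct sketch_obj, rcu);

            /* Grace period over: hand the object to the worker. */
            if (llist_add(&obj->freed, &sketch_free_list))
                    schedule_work(&sketch_free_work);
    }

    static void sketch_obj_release(struct sketch_obj *obj)
    {
            call_rcu(&obj->rcu, sketch_free_rcu);
    }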

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_debugfs.c  |  15 ++-
 drivers/gpu/drm/i915/i915_drv.c  |   3 +
 drivers/gpu/drm/i915/i915_drv.h  |  44 +++-
 drivers/gpu/drm/i915/i915_gem.c  | 166 +--
 drivers/gpu/drm/i915/i915_gem_shrinker.c |  20 ++--
 drivers/gpu/drm/i915/i915_gem_tiling.c   |  21 ++--
 6 files changed, 196 insertions(+), 73 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 0b2e0ff968cf..355eec8f7cac 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4783,10 +4783,12 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_ring_test_irq_fops,
 #define DROP_BOUND 0x2
 #define DROP_RETIRE 0x4
 #define DROP_ACTIVE 0x8
-#define DROP_ALL (DROP_UNBOUND | \
- DROP_BOUND | \
- DROP_RETIRE | \
- DROP_ACTIVE)
+#define DROP_FREED 0x10
+#define DROP_ALL (DROP_UNBOUND | \
+ DROP_BOUND| \
+ DROP_RETIRE   | \
+ DROP_ACTIVE   | \
+ DROP_FREED)
 static int
 i915_drop_caches_get(void *data, u64 *val)
 {
@@ -4830,6 +4832,11 @@ i915_drop_caches_set(void *data, u64 val)
 unlock:
mutex_unlock(&dev->struct_mutex);

+   if (val & DROP_FREED) {
+   synchronize_rcu();
+   flush_work(&dev_priv->mm.free_work);
+   }
+
return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7f4e8adec8a8..dded8af4092c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -563,6 +563,9 @@ static void i915_gem_fini(struct drm_device *dev)
i915_gem_context_fini(dev);
mutex_unlock(>struct_mutex);
 
+   synchronize_rcu();
+   flush_work(&dev_priv->mm.free_work);
+
WARN_ON(!list_empty(&to_i915(dev)->context_list));
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e68746f616b0..ba0c8d097afc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1337,11 +1337,17 @@ struct i915_gem_mm {
struct list_head bound_list;
/**
 * List of objects which are not bound to the GTT (thus
-* are idle and not used by the GPU) but still have
-* (presumably uncached) pages still attached.
+* are idle and not used by the GPU). These objects may or may
+* not actually have any pages attached.
 */
struct list_head unbound_list;
 
+   /**
+* List of objects which are pending destruction.
+*/
+   struct llist_head free_list;
+   struct work_struct free_work;
+
/** Usable portion of the GTT for GEM */
unsigned long stolen_base; /* limited to low memory (32-bit) */
 
@@ -2196,6 +2202,10 @@ struct drm_i915_gem_object {
/** Stolen memory for this object, instead of being backed by shmem. */
struct drm_mm_node *stolen;
struct list_head global_list;
+   union {
+   struct rcu_head rcu;
+   struct llist_node freed;
+   };
 
/** Used in execbuf to temporarily hold a ref */
struct list_head obj_exec_link;
@@ -2318,10 +2328,38 @@ to_intel_bo(struct drm_gem_object *gem)
return container_of(gem, struct drm_i915_gem_object, base);
 }
 
+/**
+ * i915_gem_object_lookup_rcu - look up a temporary GEM object from its handle
+ * @filp: DRM file private date
+ * @handle: userspace handle
+ *
+ * Returns:
+ *
+ * A pointer to the object named by the handle if such exists on @filp, NULL
+ * otherwise. This object is only valid whilst under the RCU read lock, and
+ * note carefully the object may be in the process of being destroyed.
+ */
+static inline struct drm_i915_gem_object *
+i915_gem_object_lookup_rcu(struct drm_file *file, u32 handle)
+{
+#ifdef CONFIG_LOCKDEP
+   WARN_ON(!lock_is_held(&rcu_lock_map));
+#endif
+   return idr_find(&file->object_idr, handle);
+}
+
 static inline struct drm_i915_gem_object *
 i915_gem_object_lookup(struct drm_file *file, u32 handle)
 {
-   return 

[Intel-gfx] [PATCH 30/38] drm/i915: Introduce a global_seqno for each request

2016-09-20 Thread Chris Wilson
Though we will have multiple timelines, we still have a single timeline
of execution. This we can use to provide an execution and retirement order
of requests. This keeps tracking execution of requests simple, and vital
for preserving a single waiter (i.e. so that we can order the waiters so
that only the earliest to wakeup need be woken). To accomplish this we
distinguish the seqno used to order requests per-context (external) and
that used internally for execution.
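
The split can be pictured as two fields with different lifetimes (sketch
only; the completion test is written out inline rather than using the
driver's seqno helpers):

    struct sketch_request {
            u32 fence_seqno;        /* per-context/timeline order, set at construction */
            u32 global_seqno;       /* execution order, assigned only on submission */
    };

    /* Only a submitted request (non-zero global_seqno) can be judged
     * complete against the engine's hardware status page seqno.
     */
    static bool sketch_request_completed(const struct sketch_request *rq,
                                         u32 hws_seqno)
    {
            return rq->global_seqno &&
                   (s32)(hws_seqno - rq->global_seqno) >= 0;
    }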

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|  2 +-
 drivers/gpu/drm/i915/i915_drv.h|  4 ++--
 drivers/gpu/drm/i915/i915_gem.c|  4 ++--
 drivers/gpu/drm/i915/i915_gem_request.c| 23 ---
 drivers/gpu/drm/i915/i915_gem_request.h| 30 +-
 drivers/gpu/drm/i915/i915_gpu_error.c  |  2 +-
 drivers/gpu/drm/i915/i915_guc_submission.c |  4 ++--
 drivers/gpu/drm/i915/i915_trace.h  |  8 
 drivers/gpu/drm/i915/intel_breadcrumbs.c   |  8 +---
 drivers/gpu/drm/i915/intel_lrc.c   |  4 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c| 14 +++---
 11 files changed, 67 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index ef1672b3a4bc..79ad67893d41 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -683,7 +683,7 @@ static int i915_gem_request_info(struct seq_file *m, void 
*data)
rcu_read_lock();
task = pid ? pid_task(pid, PIDTYPE_PID) : NULL;
seq_printf(m, "%x @ %d: %s [%d]\n",
-  req->fence.seqno,
+  req->global_seqno,
   (int) (jiffies - req->emitted_jiffies),
   task ? task->comm : "",
   task ? task->pid : -1);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 06b429f81ffd..ac83f449efa5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3956,7 +3956,7 @@ __i915_request_irq_complete(struct drm_i915_gem_request 
*req)
/* Before we do the heavier coherent read of the seqno,
 * check the value (hopefully) in the CPU cacheline.
 */
-   if (i915_gem_request_completed(req))
+   if (__i915_gem_request_completed(req))
return true;
 
/* Ensure our read of the seqno is coherent so that we
@@ -4007,7 +4007,7 @@ __i915_request_irq_complete(struct drm_i915_gem_request 
*req)
wake_up_process(tsk);
rcu_read_unlock();
 
-   if (i915_gem_request_completed(req))
+   if (__i915_gem_request_completed(req))
return true;
}
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b10d26d3254d..659ee3b910d7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2525,7 +2525,7 @@ i915_gem_find_active_request(struct intel_engine_cs 
*engine)
 * not need an engine->irq_seqno_barrier() before the seqno reads.
 */
list_for_each_entry(request, &engine->timeline->requests, link) {
-   if (i915_gem_request_completed(request))
+   if (__i915_gem_request_completed(request))
continue;
 
if (!i915_sw_fence_done(&request->submit))
@@ -2575,7 +2575,7 @@ static void i915_gem_reset_engine(struct intel_engine_cs 
*engine)
return;
 
DRM_DEBUG_DRIVER("resetting %s to restart from tail of request 0x%x\n",
-engine->name, request->fence.seqno);
+engine->name, request->global_seqno);
 
/* Setup the CS to resume from the breadcrumb of the hung request */
engine->reset_hw(engine, request);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index c1fe3d28fa37..8a3193609bc3 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -358,7 +358,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
/* Move the oldest request to the slab-cache (if not in use!) */
req = list_first_entry_or_null(&engine->timeline->requests,
   typeof(*req), link);
-   if (req && i915_gem_request_completed(req))
+   if (req && __i915_gem_request_completed(req))
i915_gem_request_retire(req);
 
/* Beware: Dragons be flying overhead.
@@ -369,7 +369,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 * of being read by __i915_gem_active_get_rcu(). As such,
 * we have to be very careful when overwriting the contents. During

[Intel-gfx] [PATCH 20/38] drm/i915: Implement pwrite without struct-mutex

2016-09-20 Thread Chris Wilson
We only need struct_mutex within pwrite for a brief window where we need
to serialise with rendering and control our cache domains. Elsewhere we
can rely on the backing storage being pinned, and forgive userspace any
races against us.
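
The shape of the change, reduced to a sketch (every sketch_* identifier is a
hypothetical stand-in, not the driver's API): pin the backing storage without
the lock, take struct_mutex only around domain management, and perform the
user copy with no locks held.

    struct sketch_obj {
            struct mutex *struct_mutex;     /* the device-wide lock */
    };

    int sketch_pin_pages(struct sketch_obj *obj);
    void sketch_unpin_pages(struct sketch_obj *obj);
    int sketch_set_write_domain(struct sketch_obj *obj);
    int sketch_copy_from_user(struct sketch_obj *obj,
                              const void __user *user, u64 len);

    static int sketch_pwrite(struct sketch_obj *obj,
                             const void __user *user, u64 len)
    {
            int ret;

            ret = sketch_pin_pages(obj);            /* no struct_mutex needed */
            if (ret)
                    return ret;

            ret = mutex_lock_interruptible(obj->struct_mutex);
            if (ret)
                    goto out_unpin;
            ret = sketch_set_write_domain(obj);     /* serialise with the GPU */
            mutex_unlock(obj->struct_mutex);
            if (ret)
                    goto out_unpin;

            ret = sketch_copy_from_user(obj, user, len);    /* lockless copy */

    out_unpin:
            sketch_unpin_pages(obj);
            return ret;
    }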

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 339 ++--
 1 file changed, 117 insertions(+), 222 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4b41f8f51dc2..1c77c04422cc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1121,10 +1121,10 @@ out:
  */
 
 static inline int
-fast_user_write(struct io_mapping *mapping,
-   loff_t page_base, int page_offset,
-   char __user *user_data,
-   int length)
+ggtt_write(struct io_mapping *mapping,
+  loff_t page_base, int page_offset,
+  char __user *user_data,
+  int length)
 {
void __iomem *vaddr_atomic;
void *vaddr;
@@ -1136,28 +1136,15 @@ fast_user_write(struct io_mapping *mapping,
unwritten = __copy_from_user_inatomic_nocache(vaddr,
  user_data, length);
io_mapping_unmap_atomic(vaddr_atomic);
-   return unwritten;
-}
-
-static inline unsigned long
-slow_user_access(struct io_mapping *mapping,
-uint64_t page_base, int page_offset,
-char __user *user_data,
-unsigned long length, bool pwrite)
-{
-   void __iomem *ioaddr;
-   void *vaddr;
-   uint64_t unwritten;
 
-   ioaddr = io_mapping_map_wc(mapping, page_base, PAGE_SIZE);
-   /* We can use the cpu mem copy function because this is X86. */
-   vaddr = (void __force *)ioaddr + page_offset;
-   if (pwrite)
-   unwritten = __copy_from_user(vaddr, user_data, length);
-   else
-   unwritten = __copy_to_user(user_data, vaddr, length);
+   if (unwritten) {
+   vaddr_atomic = io_mapping_map_wc(mapping, page_base, PAGE_SIZE);
+   /* We can use the cpu mem copy function because this is X86. */
+   vaddr = (void __force*)vaddr_atomic + page_offset;
+   unwritten = copy_from_user(vaddr, user_data, length);
+   io_mapping_unmap(vaddr_atomic);
+   }
 
-   io_mapping_unmap(ioaddr);
return unwritten;
 }
 
@@ -1170,22 +1157,19 @@ slow_user_access(struct io_mapping *mapping,
  * @file: drm file pointer
  */
 static int
-i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
-struct drm_i915_gem_object *obj,
-struct drm_i915_gem_pwrite *args,
-struct drm_file *file)
+i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
+const struct drm_i915_gem_pwrite *args)
 {
-   struct i915_ggtt *ggtt = &i915->ggtt;
-   struct drm_device *dev = obj->base.dev;
-   struct i915_vma *vma;
struct i915_ggtt *ggtt = &to_i915(obj->base.dev)->ggtt;
struct drm_mm_node node;
-   uint64_t remain, offset;
-   char __user *user_data;
+   struct i915_vma *vma;
+   u64 remain, offset;
+   void __user *user_data;
int ret;
-   bool hit_slow_path = false;
 
-   if (i915_gem_object_is_tiled(obj))
-   return -EFAULT;
+   ret = mutex_lock_interruptible(&obj->base.dev->struct_mutex);
+   if (ret)
+   return ret;
 
vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
   PIN_MAPPABLE | PIN_NONBLOCK);
@@ -1201,21 +1185,17 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
if (IS_ERR(vma)) {
ret = insert_mappable_node(ggtt, &node, PAGE_SIZE);
if (ret)
-   goto out;
-
-   ret = i915_gem_object_pin_pages(obj);
-   if (ret) {
-   remove_mappable_node(&node);
-   goto out;
-   }
+   goto out_unlock;
+   GEM_BUG_ON(!node.allocated);
}
 
ret = i915_gem_object_set_to_gtt_domain(obj, true);
if (ret)
goto out_unpin;
 
+   mutex_unlock(&obj->base.dev->struct_mutex);
+
intel_fb_obj_invalidate(obj, ORIGIN_CPU);
-   obj->mm.dirty = true;
 
user_data = u64_to_user_ptr(args->data_ptr);
offset = args->offset;
@@ -1246,92 +1226,36 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
 * If the object is non-shmem backed, we retry again with the
 * path that handles page fault.
 */
-   if (fast_user_write(&ggtt->mappable, page_base,
-   page_offset, user_data, page_length)) {
-   hit_slow_path = true;
-   mutex_unlock(&dev->struct_mutex);
-   if (slow_user_access(&ggtt->mappable,

[Intel-gfx] [PATCH 14/38] drm/i915: Use a radixtree for random access to the object's backing storage

2016-09-20 Thread Chris Wilson
A while ago we switched from a contiguous array of pages into an sglist,
for that was both more convenient for mapping to hardware and avoided
the requirement for a vmalloc array of pages on every object. However,
certain GEM API calls (like pwrite, pread as well as performing
relocations) do desire access to individual struct pages. A quick hack
was to introduce a cache of the last access such that finding the
following page was quick - this works so long as the caller desired
sequential access. Walking backwards, or multiple callers, still hits a
slow linear search for each page. One solution is to store each
successful lookup in a radix tree.
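
A sketch of the resulting lookup (illustrative names; initialisation of the
radix tree and the locking around insertions are omitted here, and the real
code caches every looked-up index rather than only sg start indices):

    #include <linux/radix-tree.h>
    #include <linux/scatterlist.h>

    struct sketch_page_iter {
            struct scatterlist *sg_start;   /* head of the sg list */
            struct scatterlist *sg_pos;     /* cursor for sequential walks */
            unsigned long sg_idx;           /* first page index covered by sg_pos */
            struct radix_tree_root radix;   /* start page index -> sg entry */
    };

    static struct scatterlist *
    sketch_lookup_sg(struct sketch_page_iter *it, unsigned long n)
    {
            struct scatterlist *sg;

            /* Fast path: an sg entry starting exactly at n was cached before. */
            sg = radix_tree_lookup(&it->radix, n);
            if (sg)
                    return sg;

            /* Walking backwards past the cursor: restart from the beginning. */
            if (n < it->sg_idx) {
                    it->sg_pos = it->sg_start;
                    it->sg_idx = 0;
            }

            /* Walk forward, caching each entry's start index as we go; an
             * -ENOMEM from the insert only costs a future cache miss.
             */
            while (it->sg_idx + (it->sg_pos->length >> PAGE_SHIFT) <= n) {
                    it->sg_idx += it->sg_pos->length >> PAGE_SHIFT;
                    it->sg_pos = sg_next(it->sg_pos);
                    radix_tree_insert(&it->radix, it->sg_idx, it->sg_pos);
            }
            return it->sg_pos;
    }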

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h |  57 
 drivers/gpu/drm/i915/i915_gem.c | 149 
 drivers/gpu/drm/i915/i915_gem_stolen.c  |   4 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c |   4 +-
 4 files changed, 154 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6ef2068ed399..33ce6db85ce4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2261,9 +2261,12 @@ struct drm_i915_gem_object {
 
struct sg_table *pages;
int pages_pin_count;
-   struct get_page {
-   struct scatterlist *sg;
-   int last;
+   struct i915_gem_object_page_iter {
+   struct scatterlist *sg_pos;
+   unsigned long sg_idx;
+
+   struct radix_tree_root radix;
+   struct mutex lock;
} get_page;
void *mapping;
 
@@ -3152,45 +3155,21 @@ static inline int __sg_page_count(struct scatterlist 
*sg)
return sg->length >> PAGE_SHIFT;
 }
 
-struct page *
-i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int n);
-
-static inline dma_addr_t
-i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, int n)
-{
-   if (n < obj->get_page.last) {
-   obj->get_page.sg = obj->pages->sgl;
-   obj->get_page.last = 0;
-   }
-
-   while (obj->get_page.last + __sg_page_count(obj->get_page.sg) <= n) {
-   obj->get_page.last += __sg_page_count(obj->get_page.sg++);
-   if (unlikely(sg_is_chain(obj->get_page.sg)))
-   obj->get_page.sg = sg_chain_ptr(obj->get_page.sg);
-   }
-
-   return sg_dma_address(obj->get_page.sg) + ((n - obj->get_page.last) << 
PAGE_SHIFT);
-}
-
-static inline struct page *
-i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n)
-{
-   if (WARN_ON(n >= obj->base.size >> PAGE_SHIFT))
-   return NULL;
+struct scatterlist *
+i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
+  unsigned long n, unsigned int *offset);
 
-   if (n < obj->get_page.last) {
-   obj->get_page.sg = obj->pages->sgl;
-   obj->get_page.last = 0;
-   }
+struct page *
+i915_gem_object_get_page(struct drm_i915_gem_object *obj,
+unsigned long n);
 
-   while (obj->get_page.last + __sg_page_count(obj->get_page.sg) <= n) {
-   obj->get_page.last += __sg_page_count(obj->get_page.sg++);
-   if (unlikely(sg_is_chain(obj->get_page.sg)))
-   obj->get_page.sg = sg_chain_ptr(obj->get_page.sg);
-   }
+struct page *
+i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj,
+  unsigned long n);
 
-   return nth_page(sg_page(obj->get_page.sg), n - obj->get_page.last);
-}
+dma_addr_t
+i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj,
+   unsigned long n);
 
 static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0d8ae6c54e5a..253f92b95400 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2292,6 +2292,15 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object 
*obj)
kfree(obj->pages);
 }
 
+static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj)
+{
+   struct radix_tree_iter iter;
+   void **slot;
+
+   radix_tree_for_each_slot(slot, &obj->get_page.radix, &iter, 0)
+   radix_tree_delete(&obj->get_page.radix, iter.index);
+}
+
 int
 i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 {
@@ -2324,6 +2333,8 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
obj->mapping = NULL;
}
 
+   __i915_gem_object_reset_page_iter(obj);
+
ops->put_pages(obj);
obj->pages = NULL;
 
@@ -2488,8 +2499,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 
list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list);
 
-   obj->get_page.sg = obj->pages->sgl;
-   obj->get_page.last = 0;
+   obj->get_page.sg_pos = obj->pages->sgl;
+   

[Intel-gfx] [PATCH 18/38] drm/i915/dmabuf: Acquire the backing storage outside of struct_mutex

2016-09-20 Thread Chris Wilson
Use the per-object mm.lock to allocate the backing storage (and hold a
reference to it across the dmabuf access) without resorting to
struct_mutex.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 51 ++
 1 file changed, 21 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index a4c90e915051..61b983efba3f 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -44,19 +44,15 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
dma_buf_attachment *attachme
struct scatterlist *src, *dst;
int ret, i;
 
-   ret = i915_mutex_lock_interruptible(obj->base.dev);
-   if (ret)
-   goto err;
-
ret = i915_gem_object_pin_pages(obj);
if (ret)
-   goto err_unlock;
+   goto err;
 
/* Copy sg so that we make an independent mapping */
st = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
if (st == NULL) {
ret = -ENOMEM;
-   goto err_unpin;
+   goto err_put_pages;
}
 
ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL);
@@ -72,21 +68,18 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
dma_buf_attachment *attachme
}
 
if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
-   ret =-ENOMEM;
+   ret = -ENOMEM;
goto err_free_sg;
}
 
-   mutex_unlock(&obj->base.dev->struct_mutex);
return st;
 
 err_free_sg:
sg_free_table(st);
 err_free:
kfree(st);
-err_unpin:
+err_put_pages:
i915_gem_object_unpin_pages(obj);
-err_unlock:
-   mutex_unlock(>base.dev->struct_mutex);
 err:
return ERR_PTR(ret);
 }
@@ -101,36 +94,21 @@ static void i915_gem_unmap_dma_buf(struct 
dma_buf_attachment *attachment,
sg_free_table(sg);
kfree(sg);
 
-   mutex_lock(&obj->base.dev->struct_mutex);
i915_gem_object_unpin_pages(obj);
-   mutex_unlock(&obj->base.dev->struct_mutex);
 }
 
 static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 {
struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
-   struct drm_device *dev = obj->base.dev;
-   void *addr;
-   int ret;
 
-   ret = i915_mutex_lock_interruptible(dev);
-   if (ret)
-   return ERR_PTR(ret);
-
-   addr = i915_gem_object_pin_map(obj, I915_MAP_WB);
-   mutex_unlock(&dev->struct_mutex);
-
-   return addr;
+   return i915_gem_object_pin_map(obj, I915_MAP_WB);
 }
 
 static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
 {
struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
-   struct drm_device *dev = obj->base.dev;
 
-   mutex_lock(&dev->struct_mutex);
i915_gem_object_unpin_map(obj);
-   mutex_unlock(&dev->struct_mutex);
 }
 
 static void *i915_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf, unsigned 
long page_num)
@@ -177,15 +155,22 @@ static int i915_gem_begin_cpu_access(struct dma_buf 
*dma_buf, enum dma_data_dire
 {
struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
struct drm_device *dev = obj->base.dev;
-   int ret;
bool write = (direction == DMA_BIDIRECTIONAL || direction == 
DMA_TO_DEVICE);
+   int ret;
 
-   ret = i915_mutex_lock_interruptible(dev);
+   ret = i915_gem_object_pin_pages(obj);
if (ret)
return ret;
 
+   ret = i915_mutex_lock_interruptible(dev);
+   if (ret)
+   goto err;
+
ret = i915_gem_object_set_to_cpu_domain(obj, write);
mutex_unlock(&dev->struct_mutex);
+
+err:
+   i915_gem_object_unpin_pages(obj);
return ret;
 }
 
@@ -195,13 +180,19 @@ static int i915_gem_end_cpu_access(struct dma_buf 
*dma_buf, enum dma_data_direct
struct drm_device *dev = obj->base.dev;
int ret;
 
-   ret = i915_mutex_lock_interruptible(dev);
+   ret = i915_gem_object_pin_pages(obj);
if (ret)
return ret;
 
+   ret = i915_mutex_lock_interruptible(dev);
+   if (ret)
+   goto err;
+
ret = i915_gem_object_set_to_gtt_domain(obj, false);
mutex_unlock(&dev->struct_mutex);
 
+err:
+   i915_gem_object_unpin_pages(obj);
return ret;
 }
 
-- 
2.9.3



[Intel-gfx] [PATCH 33/38] drm/i915: Move the global sync optimisation to the timeline

2016-09-20 Thread Chris Wilson
Currently we try to reduce the number of synchronisations (now the
number of requests we need to wait upon) by noting that if we have
earlier waited upon a request, all subsequent requests in the timeline
will be after the wait. This only applies to requests in this timeline,
as other timelines will not be ordered by that waiter.
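
The bookkeeping this patch introduces can be sketched as a small per-timeline
cache of already-awaited seqnos (illustrative types; the real array lives in
struct intel_timeline, see the header hunk below):

    #define SKETCH_NUM_ENGINES 5

    struct sketch_timeline {
            u32 sync_seqno[SKETCH_NUM_ENGINES];
    };

    /* Returns true if an actual wait/semaphore must be emitted. */
    static bool sketch_needs_await(struct sketch_timeline *to,
                                   unsigned int from_engine, u32 from_seqno)
    {
            if (from_seqno <= to->sync_seqno[from_engine])
                    return false;   /* an earlier wait already covers this request */

            to->sync_seqno[from_engine] = from_seqno;
            return true;
    }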

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c  |  9 --
 drivers/gpu/drm/i915/i915_drv.h  |  1 -
 drivers/gpu/drm/i915/i915_gem_request.c  | 18 
 drivers/gpu/drm/i915/i915_gem_timeline.h |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c| 48 +++-
 drivers/gpu/drm/i915/intel_engine_cs.c   |  2 --
 drivers/gpu/drm/i915/intel_ringbuffer.c  |  3 --
 drivers/gpu/drm/i915/intel_ringbuffer.h  | 23 ---
 8 files changed, 43 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 79ad67893d41..ca7d2e0da23c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3185,15 +3185,6 @@ static int i915_semaphore_status(struct seq_file *m, 
void *unused)
seq_putc(m, '\n');
}
 
-   seq_puts(m, "\nSync seqno:\n");
-   for_each_engine(engine, dev_priv) {
-   for (j = 0; j < num_rings; j++)
-   seq_printf(m, "  0x%08x ",
-  engine->semaphore.sync_seqno[j]);
-   seq_putc(m, '\n');
-   }
-   seq_putc(m, '\n');
-
intel_runtime_pm_put(dev_priv);
mutex_unlock(&dev->struct_mutex);
return 0;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ac83f449efa5..6daa4d538f42 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -784,7 +784,6 @@ struct drm_i915_error_state {
u32 cpu_ring_tail;
 
u32 last_seqno;
-   u32 semaphore_seqno[I915_NUM_ENGINES - 1];
 
/* Register state */
u32 start;
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index c31cb7c4a61f..3d09301f7b40 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -238,8 +238,9 @@ static int i915_gem_check_wedge(struct drm_i915_private 
*dev_priv)
 static int i915_gem_init_global_seqno(struct drm_i915_private *dev_priv,
  u32 seqno)
 {
+   struct i915_gem_timeline *timeline;
struct intel_engine_cs *engine;
-   int ret;
+   int ret, i;
 
/* Carefully retire all requests without writing to the rings */
ret = i915_gem_wait_for_idle(dev_priv,
@@ -262,6 +263,14 @@ static int i915_gem_init_global_seqno(struct 
drm_i915_private *dev_priv,
for_each_engine(engine, dev_priv)
intel_engine_init_global_seqno(engine, seqno);
 
+   list_for_each_entry(timeline, &dev_priv->gt.timelines, link) {
+   for (i = 0; i < ARRAY_SIZE(timeline->engine); i++) {
+   struct intel_timeline *tl = &timeline->engine[i];
+
+   memset(tl->sync_seqno, 0, sizeof(tl->sync_seqno));
+   }
+   }
+
return 0;
 }
 
@@ -456,7 +465,7 @@ static int
 i915_gem_request_await_request(struct drm_i915_gem_request *to,
   struct drm_i915_gem_request *from)
 {
-   int idx, ret;
+   int ret;
 
GEM_BUG_ON(to == from);
 
@@ -476,8 +485,7 @@ i915_gem_request_await_request(struct drm_i915_gem_request 
*to,
return ret < 0 ? ret : 0;
}
 
-   idx = intel_engine_sync_index(from->engine, to->engine);
-   if (from->global_seqno <= from->engine->semaphore.sync_seqno[idx])
+   if (from->global_seqno <= to->timeline->sync_seqno[from->engine->id])
return 0;
 
trace_i915_gem_ring_sync_to(to, from);
@@ -495,7 +503,7 @@ i915_gem_request_await_request(struct drm_i915_gem_request 
*to,
return ret;
}
 
-   from->engine->semaphore.sync_seqno[idx] = from->global_seqno;
+   to->timeline->sync_seqno[from->engine->id] = from->global_seqno;
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.h 
b/drivers/gpu/drm/i915/i915_gem_timeline.h
index be34c2e5ac85..ec2e56352c4b 100644
--- a/drivers/gpu/drm/i915/i915_gem_timeline.h
+++ b/drivers/gpu/drm/i915/i915_gem_timeline.h
@@ -47,6 +47,7 @@ struct intel_timeline {
 * struct_mutex.
 */
struct i915_gem_active last_request;
+   u32 sync_seqno[I915_NUM_ENGINES];
 
struct i915_gem_timeline *common;
 };
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index c263f42cca07..4ee31c9a2bd9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ 

[Intel-gfx] [PATCH 19/38] drm/i915: Implement pread without struct-mutex

2016-09-20 Thread Chris Wilson
We only need struct_mutex within pread for a brief window where we need
to serialise with rendering and control our cache domains. Elsewhere we
can rely on the backing storage being pinned, and forgive userspace any
races against us.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 360 +---
 1 file changed, 156 insertions(+), 204 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c5e14c3f4dee..4b41f8f51dc2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -63,13 +63,13 @@ static bool cpu_write_needs_clflush(struct 
drm_i915_gem_object *obj)
 }
 
 static int
-insert_mappable_node(struct drm_i915_private *i915,
+insert_mappable_node(struct i915_ggtt *ggtt,
  struct drm_mm_node *node, u32 size)
 {
memset(node, 0, sizeof(*node));
-   return drm_mm_insert_node_in_range_generic(&i915->ggtt.base.mm, node,
-  size, 0, 0, 0,
-  i915->ggtt.mappable_end,
+   return drm_mm_insert_node_in_range_generic(&ggtt->base.mm, node,
+  size, 0, -1,
+  0, ggtt->mappable_end,
   DRM_MM_SEARCH_DEFAULT,
   DRM_MM_CREATE_DEFAULT);
 }
@@ -820,32 +820,6 @@ err_unpin:
return ret;
 }
 
-/* Per-page copy function for the shmem pread fastpath.
- * Flushes invalid cachelines before reading the target if
- * needs_clflush is set. */
-static int
-shmem_pread_fast(struct page *page, int shmem_page_offset, int page_length,
-char __user *user_data,
-bool page_do_bit17_swizzling, bool needs_clflush)
-{
-   char *vaddr;
-   int ret;
-
-   if (unlikely(page_do_bit17_swizzling))
-   return -EINVAL;
-
-   vaddr = kmap_atomic(page);
-   if (needs_clflush)
-   drm_clflush_virt_range(vaddr + shmem_page_offset,
-  page_length);
-   ret = __copy_to_user_inatomic(user_data,
- vaddr + shmem_page_offset,
- page_length);
-   kunmap_atomic(vaddr);
-
-   return ret ? -EFAULT : 0;
-}
-
 static void
 shmem_clflush_swizzled_range(char *addr, unsigned long length,
 bool swizzled)
@@ -871,7 +845,7 @@ shmem_clflush_swizzled_range(char *addr, unsigned long 
length,
 /* Only difference to the fast-path function is that this can handle bit17
  * and uses non-atomic copy and kmap functions. */
 static int
-shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
+shmem_pread_slow(struct page *page, int offset, int length,
 char __user *user_data,
 bool page_do_bit17_swizzling, bool needs_clflush)
 {
@@ -880,60 +854,132 @@ shmem_pread_slow(struct page *page, int 
shmem_page_offset, int page_length,
 
vaddr = kmap(page);
if (needs_clflush)
-   shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
-page_length,
+   shmem_clflush_swizzled_range(vaddr + offset, length,
 page_do_bit17_swizzling);
 
if (page_do_bit17_swizzling)
-   ret = __copy_to_user_swizzled(user_data,
- vaddr, shmem_page_offset,
- page_length);
+   ret = __copy_to_user_swizzled(user_data, vaddr, offset, length);
else
-   ret = __copy_to_user(user_data,
-vaddr + shmem_page_offset,
-page_length);
+   ret = __copy_to_user(user_data, vaddr + offset, length);
kunmap(page);
 
return ret ? - EFAULT : 0;
 }
 
-static inline unsigned long
-slow_user_access(struct io_mapping *mapping,
-uint64_t page_base, int page_offset,
-char __user *user_data,
-unsigned long length, bool pwrite)
+static int
+shmem_pread(struct page *page, int offset, int length, char __user *user_data,
+   bool page_do_bit17_swizzling, bool needs_clflush)
 {
-   void __iomem *ioaddr;
+   int ret;
+
+   ret = -ENODEV;
+   if (!page_do_bit17_swizzling) {
+   char *vaddr = kmap_atomic(page);
+
+   if (needs_clflush)
+   drm_clflush_virt_range(vaddr + offset, length);
+   ret = __copy_to_user_inatomic(user_data, vaddr + offset, 
length);
+   kunmap_atomic(vaddr);
+   }
+   if (ret == 0)
+   return 0;
+
+   return shmem_pread_slow(page, offset, length, user_data,
+ 

[Intel-gfx] [PATCH 27/38] drm/i915: Combine seqno + tracking into a global timeline struct

2016-09-20 Thread Chris Wilson
Our timelines are more than just a seqno. They also provide an ordered
list of requests to be executed. Due to the restriction of handling
individual address spaces, we are limited to a timeline per address
space but we use a fence context per engine within.

Our first step to introducing independent timelines per context (i.e. to
allow each context to have a queue of requests to execute that have a
defined set of dependencies on other requests) is to provide a timeline
abstraction for the global execution queue.
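
In rough outline (a paraphrase trimmed to the fields relevant here; the real
layout is the new i915_gem_timeline.h added below):

    struct sketch_engine_timeline {
            u64 fence_context;              /* one fence context per engine */
            u32 last_submitted_seqno;
            struct list_head requests;      /* ordered execution queue */
    };

    struct sketch_gem_timeline {
            struct list_head link;          /* on dev_priv->gt.timelines */
            u32 next_seqno;
            struct sketch_engine_timeline engine[5];        /* I915_NUM_ENGINES */
    };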

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/Makefile  |  1 +
 drivers/gpu/drm/i915/i915_debugfs.c| 19 +++
 drivers/gpu/drm/i915/i915_drv.c|  6 ++-
 drivers/gpu/drm/i915/i915_drv.h|  9 ++--
 drivers/gpu/drm/i915/i915_gem.c| 80 ++
 drivers/gpu/drm/i915/i915_gem.h|  2 +
 drivers/gpu/drm/i915/i915_gem_request.c| 74 ---
 drivers/gpu/drm/i915/i915_gem_request.h|  1 +
 drivers/gpu/drm/i915/i915_gem_timeline.c   | 64 
 drivers/gpu/drm/i915/i915_gem_timeline.h   | 69 ++
 drivers/gpu/drm/i915/i915_gpu_error.c  |  6 +--
 drivers/gpu/drm/i915/i915_guc_submission.c |  3 +-
 drivers/gpu/drm/i915/i915_irq.c|  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c | 13 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.h| 35 ++---
 15 files changed, 271 insertions(+), 113 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_timeline.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_timeline.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b94a90f34d2d..bf07b9de078d 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -42,6 +42,7 @@ i915-y += i915_cmd_parser.o \
  i915_gem_shrinker.o \
  i915_gem_stolen.o \
  i915_gem_tiling.o \
+ i915_gem_timeline.o \
  i915_gem_userptr.o \
  i915_gpu_error.o \
  i915_trace_points.o \
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 8614bf4e479a..ef1672b3a4bc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -579,7 +579,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void 
*data)
seq_printf(m, "Flip queued on %s at seqno %x, 
next seqno %x [current breadcrumb %x], completed? %d\n",
   engine->name,
   
i915_gem_request_get_seqno(work->flip_queued_req),
-  dev_priv->next_seqno,
+  
dev_priv->gt.global_timeline.next_seqno,
   intel_engine_get_seqno(engine),
   
i915_gem_request_completed(work->flip_queued_req));
} else
@@ -670,13 +670,13 @@ static int i915_gem_request_info(struct seq_file *m, void 
*data)
int count;
 
count = 0;
-   list_for_each_entry(req, &engine->request_list, link)
+   list_for_each_entry(req, &engine->timeline->requests, link)
count++;
if (count == 0)
continue;
 
seq_printf(m, "%s requests: %d\n", engine->name, count);
-   list_for_each_entry(req, &engine->request_list, link) {
+   list_for_each_entry(req, &engine->timeline->requests, link) {
struct pid *pid = req->ctx->pid;
struct task_struct *task;
 
@@ -1054,15 +1054,8 @@ static int
 i915_next_seqno_get(void *data, u64 *val)
 {
struct drm_i915_private *dev_priv = data;
-   int ret;
-
-   ret = mutex_lock_interruptible(&dev_priv->drm.struct_mutex);
-   if (ret)
-   return ret;
-
-   *val = dev_priv->next_seqno;
-   mutex_unlock(&dev_priv->drm.struct_mutex);
 
+   *val = READ_ONCE(dev_priv->gt.global_timeline.next_seqno);
return 0;
 }
 
@@ -1077,7 +1070,7 @@ i915_next_seqno_set(void *data, u64 val)
if (ret)
return ret;
 
-   ret = i915_gem_set_seqno(dev, val);
+   ret = i915_gem_set_global_seqno(dev, val);
mutex_unlock(&dev->struct_mutex);
 
return ret;
@@ -1336,7 +1329,7 @@ static int i915_hangcheck_info(struct seq_file *m, void 
*unused)
seq_printf(m, "\tseqno = %x [current %x, last %x]\n",
   engine->hangcheck.seqno,
   seqno[id],
-  engine->last_submitted_seqno);
+  engine->timeline->last_submitted_seqno);
seq_printf(m, "\twaiters? %s, fake irq active? %s\n",
   

[Intel-gfx] [PATCH 26/38] drm/i915: Restore nonblocking awaits for modesetting

2016-09-20 Thread Chris Wilson
After combining the dma-buf reservation object and the GEM reservation
object, we lost the ability to do a nonblocking wait on the i915 request
(as we blocked upon the reservation object during prepare_fb). We can
instead convert the reservation object into a fence upon which we can
asynchronously wait (including a forced timeout in case the DMA fence is
never signaled).
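
For reference, a blocking sketch of the same fence bookkeeping using the
4.9-era API (collect every fence from the reservation object and wait with a
forced timeout); the patch itself goes one step further and feeds these
fences into an i915_sw_fence so the wait happens asynchronously:

    static long sketch_wait_reservation(struct reservation_object *resv,
                                        long timeout)
    {
            struct fence *excl;
            struct fence **shared = NULL;
            unsigned int count, i;
            int ret;

            ret = reservation_object_get_fences_rcu(resv, &excl, &count, &shared);
            if (ret)
                    return ret;

            for (i = 0; i < count; i++) {
                    if (timeout > 0)
                            timeout = fence_wait_timeout(shared[i], true, timeout);
                    fence_put(shared[i]);
            }
            kfree(shared);

            if (excl && timeout > 0)
                    timeout = fence_wait_timeout(excl, true, timeout);
            fence_put(excl);

            return timeout; /* >0: jiffies left, 0: timed out, <0: error */
    }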

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/intel_display.c | 78 ++--
 drivers/gpu/drm/i915/intel_drv.h |  2 +
 2 files changed, 51 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index c4b28a30b282..3909b55ca157 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14408,12 +14408,33 @@ static void intel_atomic_commit_tail(struct 
drm_atomic_state *state)
 
 static void intel_atomic_commit_work(struct work_struct *work)
 {
-   struct drm_atomic_state *state = container_of(work,
- struct drm_atomic_state,
- commit_work);
+   struct drm_atomic_state *state =
+   container_of(work, struct drm_atomic_state, commit_work);
+
intel_atomic_commit_tail(state);
 }
 
+static int __i915_sw_fence_call
+intel_atomic_commit_ready(struct i915_sw_fence *fence,
+ enum i915_sw_fence_notify notify)
+{
+   struct intel_atomic_state *state =
+   container_of(fence, struct intel_atomic_state, commit_ready);
+
+   switch (notify) {
+   case FENCE_COMPLETE:
+   if (state->base.commit_work.func)
+   queue_work(system_unbound_wq, &state->base.commit_work);
+   break;
+
+   case FENCE_FREE:
+   drm_atomic_state_put(&state->base);
+   break;
+   }
+
+   return NOTIFY_DONE;
+}
+
 static void intel_atomic_track_fbs(struct drm_atomic_state *state)
 {
struct drm_plane_state *old_plane_state;
@@ -14459,11 +14480,14 @@ static int intel_atomic_commit(struct drm_device *dev,
if (ret)
return ret;
 
-   INIT_WORK(&state->commit_work, intel_atomic_commit_work);
+   drm_atomic_state_get(state);
+   i915_sw_fence_init(&intel_state->commit_ready,
+  intel_atomic_commit_ready);
 
ret = intel_atomic_prepare_commit(dev, state);
if (ret) {
DRM_DEBUG_ATOMIC("Preparing state failed with %i\n", ret);
+   i915_sw_fence_commit(&intel_state->commit_ready);
return ret;
}
 
@@ -14474,10 +14498,14 @@ static int intel_atomic_commit(struct drm_device *dev,
intel_atomic_track_fbs(state);
 
drm_atomic_state_get(state);
-   if (nonblock)
-   queue_work(system_unbound_wq, &state->commit_work);
-   else
+   INIT_WORK(&state->commit_work,
+ nonblock ? intel_atomic_commit_work : NULL);
+
+   i915_sw_fence_commit(&intel_state->commit_ready);
+   if (!nonblock) {
+   i915_sw_fence_wait(&intel_state->commit_ready);
intel_atomic_commit_tail(state);
+   }
 
return 0;
 }
@@ -14593,8 +14621,7 @@ intel_prepare_plane_fb(struct drm_plane *plane,
struct drm_framebuffer *fb = new_state->fb;
struct drm_i915_gem_object *obj = intel_fb_obj(fb);
struct drm_i915_gem_object *old_obj = intel_fb_obj(plane->state->fb);
-   long lret;
-   int ret = 0;
+   int ret;
 
if (!obj && !old_obj)
return 0;
@@ -14602,7 +14629,6 @@ intel_prepare_plane_fb(struct drm_plane *plane,
if (old_obj) {
struct drm_crtc_state *crtc_state =
drm_atomic_get_existing_crtc_state(new_state->state, 
plane->state->crtc);
-   long timeout = 0;
 
/* Big Hammer, we also need to ensure that any pending
 * MI_WAIT_FOR_EVENT inside a user batch buffer on the
@@ -14615,31 +14641,25 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 * This should only fail upon a hung GPU, in which case we
 * can safely continue.
 */
-   if (needs_modeset(crtc_state))
-   timeout = i915_gem_object_wait(old_obj,
-  I915_WAIT_INTERRUPTIBLE |
-  I915_WAIT_LOCKED,
-  MAX_SCHEDULE_TIMEOUT,
-  NULL);
-   if (timeout < 0) {
-   /* GPU hangs should have been swallowed by the wait */
-   WARN_ON(timeout == -EIO);
-   return timeout;
+   if (needs_modeset(crtc_state)) {
+   ret = 

[Intel-gfx] [PATCH 38/38] drm/i915: Support explicit fencing for execbuf

2016-09-20 Thread Chris Wilson
Now that the user can opt-out of implicit fencing, we need to give them
back control over the fencing. We employ sync_file to wrap our
drm_i915_gem_request and provide an fd that userspace can merge with
other sync_file fds and pass back to the kernel to wait upon before
future execution.
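
From the userspace side this looks roughly as follows (a fragment, not a
tested snippet: drm_fd, exec_objects, nr_objects, ring_id and in_fence_fd are
assumed context; the _WR ioctl name follows the patch, and returning the
out-fence fd in the upper half of rsvd2 is taken from the uapi this series
settles on):

    struct drm_i915_gem_execbuffer2 execbuf = {
            .buffers_ptr = (uintptr_t)exec_objects,
            .buffer_count = nr_objects,
            .flags = ring_id | I915_EXEC_FENCE_IN | I915_EXEC_FENCE_OUT,
            .rsvd2 = in_fence_fd,   /* lower 32 bits: sync_file fd to wait on */
    };

    if (drmIoctl(drm_fd, DRM_IOCTL_I915_GEM_EXECBUFFER2_WR, &execbuf) == 0) {
            int out_fence_fd = execbuf.rsvd2 >> 32; /* new sync_file fd */
            /* merge it with other fences or hand it to a waiter, then close() it */
    }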

Testcase: igt/gem_exec_fence
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/Kconfig   |  1 +
 drivers/gpu/drm/i915/i915_drv.c|  3 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 53 +++---
 include/uapi/drm/i915_drm.h| 36 +++-
 4 files changed, 86 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 92ecced1bc8f..ca23fd17c3ce 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -19,6 +19,7 @@ config DRM_I915
select INPUT if ACPI
select ACPI_VIDEO if ACPI
select ACPI_BUTTON if ACPI
+   select SYNC_FILE
help
  Choose this option if you have a system that has "Intel Graphics
  Media Accelerator" or "HD Graphics" integrated graphics,
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 19ee76284371..3acc1fd9a9e1 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -333,6 +333,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_HAS_COHERENT_PHYS_GTT:
case I915_PARAM_HAS_EXEC_SOFTPIN:
case I915_PARAM_HAS_EXEC_ASYNC:
+   case I915_PARAM_HAS_EXEC_FENCE:
/* For the time being all of these are always true;
 * if some supported hardware does not have one of these
 * features this value needs to be provided from
@@ -2535,7 +2536,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_HWS_ADDR, drm_noop, 
DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF_DRV(I915_GEM_INIT, drm_noop, 
DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER, i915_gem_execbuffer, DRM_AUTH),
-   DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER2, i915_gem_execbuffer2, 
DRM_AUTH|DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER2_WR, i915_gem_execbuffer2, 
DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_GEM_PIN, i915_gem_reject_pin_ioctl, 
DRM_AUTH|DRM_ROOT_ONLY),
DRM_IOCTL_DEF_DRV(I915_GEM_UNPIN, i915_gem_reject_pin_ioctl, 
DRM_AUTH|DRM_ROOT_ONLY),
DRM_IOCTL_DEF_DRV(I915_GEM_BUSY, i915_gem_busy_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 7038da9aa68f..a99bd002596c 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -28,6 +28,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1585,6 +1586,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
struct i915_execbuffer_params *params = &params_master;
const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
u32 dispatch_flags;
+   struct fence *in_fence = NULL;
+   struct sync_file *out_fence = NULL;
+   int out_fence_fd = -1;
int ret;
bool need_relocs;
 
@@ -1628,6 +1632,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
*data,
dispatch_flags |= I915_DISPATCH_RS;
}
 
+   if (args->flags & I915_EXEC_FENCE_IN) {
+   in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
+   if (!in_fence) {
+   ret = -EINVAL;
+   goto pre_mutex_err;
+   }
+   }
+
+   if (args->flags & I915_EXEC_FENCE_OUT) {
+   out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
+   if (out_fence_fd < 0) {
+   ret = out_fence_fd;
+   out_fence_fd = -1;
+   goto pre_mutex_err;
+   }
+   }
+
/* Take a local wakeref for preparing to dispatch the execbuf as
 * we expect to access the hardware fairly frequently in the
 * process. Upon first dispatch, we acquire another prolonged
@@ -1772,6 +1793,20 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
*data,
goto err_batch_unpin;
}
 
+   if (in_fence) {
+   ret = i915_gem_request_await_fence(params->request, in_fence);
+   if (ret < 0)
+   goto err_request;
+   }
+
+   if (out_fence_fd != -1) {
+   out_fence = sync_file_create(fence_get(&params->request->fence));
+   if (!out_fence) {
+   ret = -ENOMEM;
+   goto err_request;
+   }
+   }
+
/* Whilst this request exists, batch_obj will be on the

[Intel-gfx] [PATCH 09/38] drm/i915: Remove unused i915_gem_active_wait() in favour of _unlocked()

2016-09-20 Thread Chris Wilson
Since we only use the more generic unlocked variant, just rename it to
the normal i915_gem_active_wait(). The temporary cost is that we always
have to acquire the reference in an RCU-safe manner, but the benefit is
that we combine the common paths.
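
For reference, a simplified sketch of the body that survives the rename,
assuming the unlocked helpers introduced earlier in the series; the real
function is in the hunk below, this just spells out the RCU-safe acquisition:

    struct drm_i915_gem_request *request;
    long ret = 0;

    /* take a reference under RCU; no struct_mutex required */
    request = i915_gem_active_get_unlocked(active);
    if (request) {
            ret = i915_wait_request(request, flags, MAX_SCHEDULE_TIMEOUT);
            i915_gem_request_put(request);
    }

    return ret < 0 ? ret : 0;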

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_request.h | 34 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 2 files changed, 4 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.h 
b/drivers/gpu/drm/i915/i915_gem_request.h
index f1ce390c9244..45998eedda2c 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -569,40 +569,13 @@ i915_gem_active_is_idle(const struct i915_gem_active 
*active,
 }
 
 /**
- * i915_gem_active_wait - waits until the request is completed
- * @active - the active request on which to wait
- *
- * i915_gem_active_wait() waits until the request is completed before
- * returning. Note that it does not guarantee that the request is
- * retired first, see i915_gem_active_retire().
- *
- * i915_gem_active_wait() returns immediately if the active
- * request is already complete.
- */
-static inline int __must_check
-i915_gem_active_wait(const struct i915_gem_active *active, struct mutex *mutex)
-{
-   struct drm_i915_gem_request *request;
-   long ret;
-
-   request = i915_gem_active_peek(active, mutex);
-   if (!request)
-   return 0;
-
-   ret = i915_wait_request(request,
-   I915_WAIT_INTERRUPTIBLE | I915_WAIT_LOCKED,
-   MAX_SCHEDULE_TIMEOUT);
-   return ret < 0 ? ret : 0;
-}
-
-/**
- * i915_gem_active_wait_unlocked - waits until the request is completed
+ * i915_gem_active_wait - waits until the request is completed
  * @active - the active request on which to wait
  * @flags - how to wait
  * @timeout - how long to wait at most
  * @rps - userspace client to charge for a waitboost
  *
- * i915_gem_active_wait_unlocked() waits until the request is completed before
+ * i915_gem_active_wait() waits until the request is completed before
  * returning, without requiring any locks to be held. Note that it does not
  * retire any requests before returning.
  *
@@ -618,8 +591,7 @@ i915_gem_active_wait(const struct i915_gem_active *active, 
struct mutex *mutex)
  * Returns 0 if successful, or a negative error code.
  */
 static inline int
-i915_gem_active_wait_unlocked(const struct i915_gem_active *active,
- unsigned int flags)
+i915_gem_active_wait(const struct i915_gem_active *active, unsigned int flags)
 {
struct drm_i915_gem_request *request;
long ret = 0;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 88ed21f1135b..80f2db3d63d9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -503,7 +503,7 @@ static inline int intel_engine_idle(struct intel_engine_cs 
*engine,
unsigned int flags)
 {
/* Wait upon the last request to be completed */
-   return i915_gem_active_wait_unlocked(&engine->last_request, flags);
+   return i915_gem_active_wait(&engine->last_request, flags);
 }
 
 int intel_init_render_ring_buffer(struct intel_engine_cs *engine);
-- 
2.9.3



[Intel-gfx] [PATCH 13/38] drm/i915: Markup GEM API with lockdep asserts

2016-09-20 Thread Chris Wilson
Add lockdep_assert_held(struct_mutex) to the API preamble of the
internal GEM interfaces.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c  | 20 
 drivers/gpu/drm/i915/i915_gem_evict.c|  5 -
 drivers/gpu/drm/i915/i915_gem_gtt.c  |  2 ++
 drivers/gpu/drm/i915/i915_gem_render_state.c |  2 ++
 drivers/gpu/drm/i915/i915_gem_request.c  |  3 +++
 5 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e8373f264858..0d8ae6c54e5a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -104,6 +104,8 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
 {
int ret;
 
+   might_sleep();
+
if (!i915_reset_in_progress(error))
return 0;
 
@@ -2295,6 +2297,8 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 {
const struct drm_i915_gem_object_ops *ops = obj->ops;
 
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
if (obj->pages == NULL)
return 0;
 
@@ -2466,6 +2470,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
const struct drm_i915_gem_object_ops *ops = obj->ops;
int ret;
 
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
if (obj->pages)
return 0;
 
@@ -2745,6 +2751,8 @@ void i915_gem_reset(struct drm_i915_private *dev_priv)
 {
struct intel_engine_cs *engine;
 
+   lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
i915_gem_retire_requests(dev_priv);
 
for_each_engine(engine, dev_priv)
@@ -2967,6 +2975,8 @@ int i915_vma_unbind(struct i915_vma *vma)
unsigned long active;
int ret;
 
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
/* First wait upon any activity as retiring the request may
 * have side-effects such as unpinning or even unbinding this vma.
 */
@@ -3359,6 +3369,7 @@ i915_gem_object_set_to_gtt_domain(struct 
drm_i915_gem_object *obj, bool write)
int ret;
 
lockdep_assert_held(&obj->base.dev->struct_mutex);
+
ret = i915_gem_object_wait(obj,
   I915_WAIT_INTERRUPTIBLE |
   I915_WAIT_LOCKED |
@@ -3437,6 +3448,8 @@ int i915_gem_object_set_cache_level(struct 
drm_i915_gem_object *obj,
struct i915_vma *vma;
int ret = 0;
 
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
if (obj->cache_level == cache_level)
goto out;
 
@@ -3647,6 +3660,8 @@ i915_gem_object_pin_to_display_plane(struct 
drm_i915_gem_object *obj,
u32 old_read_domains, old_write_domain;
int ret;
 
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
/* Mark the pin_display early so that we account for the
 * display coherency whilst setting up the cache domains.
 */
@@ -3713,6 +3728,7 @@ err_unpin_display:
 void
 i915_gem_object_unpin_from_display_plane(struct i915_vma *vma)
 {
+   lockdep_assert_held(&vma->vm->dev->struct_mutex);
if (WARN_ON(vma->obj->pin_display == 0))
return;
 
@@ -3742,6 +3758,7 @@ i915_gem_object_set_to_cpu_domain(struct 
drm_i915_gem_object *obj, bool write)
int ret;
 
lockdep_assert_held(&obj->base.dev->struct_mutex);
+
ret = i915_gem_object_wait(obj,
   I915_WAIT_INTERRUPTIBLE |
   I915_WAIT_LOCKED |
@@ -3897,6 +3914,7 @@ int __i915_vma_do_pin(struct i915_vma *vma,
unsigned int bound = vma->flags;
int ret;
 
+   lockdep_assert_held(&vma->vm->dev->struct_mutex);
GEM_BUG_ON((flags & (PIN_GLOBAL | PIN_USER)) == 0);
GEM_BUG_ON((flags & PIN_GLOBAL) && !i915_vma_is_ggtt(vma));
 
@@ -3937,6 +3955,8 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
struct i915_vma *vma;
int ret;
 
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
vma = i915_gem_obj_lookup_or_create_vma(obj, vm, view);
if (IS_ERR(vma))
return vma;
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 5b6f81c1dbca..61f716c8768c 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -102,6 +102,7 @@ i915_gem_evict_something(struct i915_address_space *vm,
struct i915_vma *vma, *next;
int ret;
 
+   lockdep_assert_held(&vm->dev->struct_mutex);
trace_i915_gem_evict(vm, min_size, alignment, flags);
 
/*
@@ -212,6 +213,8 @@ i915_gem_evict_for_vma(struct i915_vma *target)
 {
struct drm_mm_node *node, *next;
 
+   lockdep_assert_held(&target->vm->dev->struct_mutex);
+
list_for_each_entry_safe(node, next,
&target->vm->mm.head_node.node_list,
node_list) {
@@ -265,7 +268,7 @@ int i915_gem_evict_vm(struct 

[Intel-gfx] [PATCH 23/38] drm/i915: Use lockless object free

2016-09-20 Thread Chris Wilson
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.h | 10 +-
 drivers/gpu/drm/i915/i915_gem.c | 21 ++---
 drivers/gpu/drm/i915/i915_gem_tiling.c  |  2 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c |  4 ++--
 drivers/gpu/drm/i915/intel_display.c|  6 +++---
 drivers/gpu/drm/i915/intel_overlay.c|  4 ++--
 drivers/gpu/drm/i915/intel_pm.c |  2 +-
 8 files changed, 21 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index dded8af4092c..4222e3f11eea 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2579,7 +2579,7 @@ static struct drm_driver driver = {
.set_busid = drm_pci_set_busid,
 
.gem_close_object = i915_gem_close_object,
-   .gem_free_object = i915_gem_free_object,
+   .gem_free_object_unlocked = i915_gem_free_object,
.gem_vm_ops = &i915_gem_vm_ops,
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ba0c8d097afc..8a59e27dfee4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2381,19 +2381,12 @@ __attribute__((nonnull))
 static inline void
 i915_gem_object_put(struct drm_i915_gem_object *obj)
 {
-   drm_gem_object_unreference(&obj->base);
+   __drm_gem_object_unreference(&obj->base);
 }
 
 __deprecated
 extern void drm_gem_object_unreference(struct drm_gem_object *);
 
-__attribute__((nonnull))
-static inline void
-i915_gem_object_put_unlocked(struct drm_i915_gem_object *obj)
-{
-   drm_gem_object_unreference_unlocked(&obj->base);
-}
-
 __deprecated
 extern void drm_gem_object_unreference_unlocked(struct drm_gem_object *);
 
@@ -2495,7 +2488,6 @@ static inline struct i915_vma *i915_vma_get(struct 
i915_vma *vma)
 
 static inline void i915_vma_put(struct i915_vma *vma)
 {
-   lockdep_assert_held(&vma->vm->dev->struct_mutex);
i915_gem_object_put(vma->obj);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4d69a58f8a2b..4b4b6f164540 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -616,7 +616,7 @@ i915_gem_create(struct drm_file *file,
 
ret = drm_gem_handle_create(file, &obj->base, &handle);
/* drop reference from allocate - handle holds it now */
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
if (ret)
return ret;
 
@@ -1115,7 +1115,7 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 
i915_gem_object_unpin_pages(obj);
 out:
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
return ret;
 }
 
@@ -1451,7 +1451,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 
i915_gem_object_unpin_pages(obj);
 err:
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
return ret;
 }
 
@@ -1561,7 +1561,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
*data,
 err_pages:
i915_gem_object_unpin_pages(obj);
 err_unlocked:
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
return ret;
 }
 
@@ -1592,7 +1592,7 @@ i915_gem_sw_finish_ioctl(struct drm_device *dev, void 
*data,
}
}
 
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
return err;
 }
 
@@ -1638,7 +1638,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 * pages from.
 */
if (!obj->base.filp) {
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
return -EINVAL;
}
 
@@ -1650,7 +1650,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
struct vm_area_struct *vma;
 
if (down_write_killable(&mm->mmap_sem)) {
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
return -EINTR;
}
vma = find_vma(mm, addr);
@@ -1664,7 +1664,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
/* This may race, but that's ok, it only gets set */
WRITE_ONCE(obj->frontbuffer_ggtt_origin, ORIGIN_CPU);
}
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
if (IS_ERR((void *)addr))
return addr;
 
@@ -2074,7 +2074,7 @@ i915_gem_mmap_gtt(struct drm_file *file,
if (ret == 0)
*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
 
-   i915_gem_object_put_unlocked(obj);
+   i915_gem_object_put(obj);
return ret;
 }
 
@@ -2871,8 +2871,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, 
struct drm_file *file)
args->timeout_ns = 0;
}
 
-   

[Intel-gfx] [PATCH 25/38] drm: Add reference counting to drm_atomic_state

2016-09-20 Thread Chris Wilson
drm_atomic_state has a complicated single owner model that tracks the
single reference from allocation through to destruction on another
thread - or perhaps on a local error path. We can simplify this tracking
by using reference counting (at a cost of a few more atomics). This is
even more beneficial when the lifetime of the state becomes more
convoluted than being passed to a single worker thread for the commit.
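
As a rough sketch of the resulting convention (helper names from the patch,
the surrounding commit worker is illustrative only): every user that keeps
the state beyond the current scope takes a reference, and each holder drops
it with a put instead of calling a free function directly:

    /* hand the swapped state over to an asynchronous commit worker */
    drm_atomic_state_get(state);
    queue_work(commit_wq, &commit->work);

    /* ... later, in the worker, once the hardware commit has finished ... */
    drm_atomic_helper_cleanup_planes(dev, old_state);
    drm_atomic_state_put(old_state);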

v2: Double check !intel atomic_commit functions for missing gets
v3: Update kerneldocs

Signed-off-by: Chris Wilson 
Cc: Daniel Vetter 
Cc: dri-de...@lists.freedesktop.org
Reviewed-by: Eric Engestrom 
---
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c |  3 +-
 drivers/gpu/drm/drm_atomic.c | 25 +++
 drivers/gpu/drm/drm_atomic_helper.c  | 98 +++-
 drivers/gpu/drm/drm_fb_helper.c  |  9 +--
 drivers/gpu/drm/exynos/exynos_drm_drv.c  |  3 +-
 drivers/gpu/drm/i915/i915_debugfs.c  |  3 +-
 drivers/gpu/drm/i915/intel_display.c | 31 +
 drivers/gpu/drm/i915/intel_sprite.c  |  4 +-
 drivers/gpu/drm/mediatek/mtk_drm_drv.c   |  3 +-
 drivers/gpu/drm/msm/msm_atomic.c |  3 +-
 drivers/gpu/drm/omapdrm/omap_drv.c   |  3 +-
 drivers/gpu/drm/rcar-du/rcar_du_kms.c|  3 +-
 drivers/gpu/drm/sti/sti_drv.c|  3 +-
 drivers/gpu/drm/tegra/drm.c  |  3 +-
 drivers/gpu/drm/tilcdc/tilcdc_drv.c  |  2 -
 drivers/gpu/drm/vc4/vc4_kms.c|  3 +-
 include/drm/drm_atomic.h | 28 +++-
 include/drm/drm_crtc.h   |  3 +
 18 files changed, 101 insertions(+), 129 deletions(-)

diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c 
b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
index 8e7483d90c47..c9e645fa1233 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
@@ -464,7 +464,7 @@ atmel_hlcdc_dc_atomic_complete(struct atmel_hlcdc_dc_commit 
*commit)
 
drm_atomic_helper_cleanup_planes(dev, old_state);
 
-   drm_atomic_state_free(old_state);
+   drm_atomic_state_put(old_state);
 
/* Complete the commit, wake up any waiter. */
spin_lock(&dc->commit.wait.lock);
@@ -521,6 +521,7 @@ static int atmel_hlcdc_dc_atomic_commit(struct drm_device 
*dev,
/* Swap the state, this is the point of no return. */
drm_atomic_helper_swap_state(state, true);
 
+   drm_atomic_state_get(state);
if (async)
queue_work(dc->wq, &commit->work);
else
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 23739609427d..5dd70540219c 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -74,6 +74,8 @@ EXPORT_SYMBOL(drm_atomic_state_default_release);
 int
 drm_atomic_state_init(struct drm_device *dev, struct drm_atomic_state *state)
 {
+   kref_init(&state->ref);
+
/* TODO legacy paths should maybe do a better job about
 * setting this appropriately?
 */
@@ -215,22 +217,16 @@ void drm_atomic_state_clear(struct drm_atomic_state 
*state)
 EXPORT_SYMBOL(drm_atomic_state_clear);
 
 /**
- * drm_atomic_state_free - free all memory for an atomic state
- * @state: atomic state to deallocate
+ * __drm_atomic_state_free - free all memory for an atomic state
+ * @ref: This atomic state to deallocate
  *
  * This frees all memory associated with an atomic state, including all the
  * per-object state for planes, crtcs and connectors.
  */
-void drm_atomic_state_free(struct drm_atomic_state *state)
+void __drm_atomic_state_free(struct kref *ref)
 {
-   struct drm_device *dev;
-   struct drm_mode_config *config;
-
-   if (!state)
-   return;
-
-   dev = state->dev;
-   config = &dev->mode_config;
+   struct drm_atomic_state *state = container_of(ref, typeof(*state), ref);
+   struct drm_mode_config *config = &state->dev->mode_config;
 
drm_atomic_state_clear(state);
 
@@ -243,7 +239,7 @@ void drm_atomic_state_free(struct drm_atomic_state *state)
kfree(state);
}
 }
-EXPORT_SYMBOL(drm_atomic_state_free);
+EXPORT_SYMBOL(__drm_atomic_state_free);
 
 /**
  * drm_atomic_get_crtc_state - get crtc state
@@ -1742,7 +1738,7 @@ retry:
if (arg->flags & DRM_MODE_ATOMIC_TEST_ONLY) {
/*
 * Unlike commit, check_only does not clean up state.
-* Below we call drm_atomic_state_free for it.
+* Below we call drm_atomic_state_put for it.
 */
ret = drm_atomic_check_only(state);
} else if (arg->flags & DRM_MODE_ATOMIC_NONBLOCK) {
@@ -1775,8 +1771,7 @@ out:
goto retry;
}
 
-   if (ret || arg->flags & DRM_MODE_ATOMIC_TEST_ONLY)
-   drm_atomic_state_free(state);
+   drm_atomic_state_put(state);
 

[Intel-gfx] [PATCH 17/38] drm/i915: Move object backing storage manipulation to its own locking

2016-09-20 Thread Chris Wilson
Break the allocation of the backing storage away from struct_mutex into
a per-object lock. This allows parallel page allocation, provided we can
do so outside of struct_mutex (i.e. set-domain-ioctl, pwrite, GTT
fault), i.e. before execbuf! The increased cost of the atomic counters
is hidden behind i915_vma_pin() for the typical case of execbuf, i.e.
as the object is typically bound between execbufs, the page_pin_count is
static. The cost will be felt around set-domain and pwrite, but offset
by the improvement from reduced struct_mutex contention.
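
A condensed sketch of the resulting pin paths (simplified from the hunks
below; allocate_backing_store() is a placeholder, not a real function):

    static int example_pin_pages(struct drm_i915_gem_object *obj)
    {
            int err = 0;

            /* fast path: pages already populated, only an atomic increment */
            if (atomic_inc_not_zero(&obj->mm.pages_pin_count))
                    return 0;

            /* slow path: serialise allocation on the new per-object lock */
            mutex_lock(&obj->mm.lock);
            if (!obj->mm.pages)
                    err = allocate_backing_store(obj);
            if (!err)
                    atomic_inc(&obj->mm.pages_pin_count);
            mutex_unlock(&obj->mm.lock);

            return err;
    }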

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  | 26 --
 drivers/gpu/drm/i915/i915_gem.c  | 86 +---
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 42 ++--
 drivers/gpu/drm/i915/i915_gem_tiling.c   |  2 +
 drivers/gpu/drm/i915/i915_gem_userptr.c  | 10 ++--
 5 files changed, 100 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e267f25fca6e..e68746f616b0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2249,7 +2249,8 @@ struct drm_i915_gem_object {
unsigned int pin_display;
 
struct {
-   unsigned int pages_pin_count;
+   struct mutex lock;
+   atomic_t pages_pin_count;
 
struct sg_table *pages;
void *mapping;
@@ -3190,8 +3191,7 @@ int __i915_gem_object_get_pages(struct 
drm_i915_gem_object *obj);
 static inline int __must_check
 i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 {
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
-   if (obj->mm.pages_pin_count++)
+   if (atomic_inc_not_zero(&obj->mm.pages_pin_count))
return 0;
 
return __i915_gem_object_get_pages(obj);
@@ -3200,29 +3200,26 @@ i915_gem_object_pin_pages(struct drm_i915_gem_object 
*obj)
 static inline void
 __i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 {
-   lockdep_assert_held_unless(&obj->base.dev->struct_mutex,
-  i915_gem_object_is_dead(obj));
GEM_BUG_ON(!obj->mm.pages);
-   obj->mm.pages_pin_count++;
+   atomic_inc(&obj->mm.pages_pin_count);
 }
 
 static inline bool
 i915_gem_object_has_pinned_pages(struct drm_i915_gem_object *obj)
 {
-   return obj->mm.pages_pin_count;
+   return atomic_read(&obj->mm.pages_pin_count);
 }
 
 static inline void
 __i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 {
-   lockdep_assert_held_unless(&obj->base.dev->struct_mutex,
-  i915_gem_object_is_dead(obj));
GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
GEM_BUG_ON(!obj->mm.pages);
-   obj->mm.pages_pin_count--;
+   atomic_dec(&obj->mm.pages_pin_count);
 }
 
-static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
+static inline void
+i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 {
__i915_gem_object_unpin_pages(obj);
 }
@@ -3245,8 +3242,8 @@ enum i915_map_type {
  * the kernel address space. Based on the @type of mapping, the PTE will be
  * set to either WriteBack or WriteCombine (via pgprot_t).
  *
- * The caller must hold the struct_mutex, and is responsible for calling
- * i915_gem_object_unpin_map() when the mapping is no longer required.
+ * The caller is responsible for calling i915_gem_object_unpin_map() when the
+ * mapping is no longer required.
  *
  * Returns the pointer through which to access the mapped object, or an
  * ERR_PTR() on error.
@@ -3262,12 +3259,9 @@ void *__must_check i915_gem_object_pin_map(struct 
drm_i915_gem_object *obj,
  * with your access, call i915_gem_object_unpin_map() to release the pin
  * upon the mapping. Once the pin count reaches zero, that mapping may be
  * removed.
- *
- * The caller must hold the struct_mutex.
  */
 static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj)
 {
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
i915_gem_object_unpin_pages(obj);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 88ed07575f5c..c5e14c3f4dee 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2287,12 +2287,17 @@ void __i915_gem_object_put_pages(struct 
drm_i915_gem_object *obj)
 {
struct sg_table *pages;
 
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
-
if (i915_gem_object_has_pinned_pages(obj))
return;
 
GEM_BUG_ON(obj->bind_count);
+   if (!READ_ONCE(obj->mm.pages))
+   return;
+
+   /* May be called by shrinker from within get_pages() (on another bo) */
+   mutex_lock_nested(&obj->mm.lock, SINGLE_DEPTH_NESTING);
+   if (unlikely(atomic_read(&obj->mm.pages_pin_count)))
+   goto unlock;
 
/* ->put_pages might need to allocate memory for the bit17 swizzle
 * array, hence 

[Intel-gfx] [PATCH 11/38] drm/i915: Introduce an internal allocator for disposable private objects

2016-09-20 Thread Chris Wilson
Quite a few of our objects used for internal hardware programming do not
benefit from being swappable or from being zero initialised. As such
they do not benefit from using a shmemfs backing storage and since they
are internal and never directly exposed to the user, we do not need to
worry about providing a filp. For these we can use a
drm_i915_gem_object wrapper around an sg_table of plain struct page. They
are not swap backed and not automatically pinned. If they are reaped
by the shrinker, the pages are released and the contents discarded. For
the internal use case, this is fine as for example, ringbuffers are
pinned from being written by a request to be read by the hardware. Once
they are idle, they can be discarded entirely. As such they are a good
match for execlist ringbuffers and a small varierty of other internal
objects.

In the first iteration, this is limited to the scratch batch buffers we
use (for command parsing and state initialisation).
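
A hedged usage sketch (the batch-pool and ringbuffer call sites below follow
the same shape): callers swap i915_gem_object_create() for the internal
variant and must treat the contents as uninitialised and discardable once
unpinned:

    struct drm_i915_gem_object *obj;

    obj = i915_gem_object_create_internal(&dev_priv->drm, PAGE_SIZE);
    if (IS_ERR(obj))
            return PTR_ERR(obj);

    /* contents are neither zeroed nor swap backed: write them while pinned,
     * and expect them to disappear if the shrinker reaps the object */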

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile|   1 +
 drivers/gpu/drm/i915/i915_drv.h  |   5 +
 drivers/gpu/drm/i915/i915_gem_batch_pool.c   |  28 ++---
 drivers/gpu/drm/i915/i915_gem_internal.c | 156 +++
 drivers/gpu/drm/i915/i915_gem_render_state.c |   2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c   |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   2 +-
 7 files changed, 174 insertions(+), 22 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_internal.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a998c2bce70a..b94a90f34d2d 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
  i915_gem_execbuffer.o \
  i915_gem_fence.o \
  i915_gem_gtt.o \
+ i915_gem_internal.o \
  i915_gem.o \
  i915_gem_render_state.o \
  i915_gem_request.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 89eab85ab11c..6ef2068ed399 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3522,6 +3522,11 @@ i915_gem_object_create_stolen_for_preallocated(struct 
drm_device *dev,
   u32 gtt_offset,
   u32 size);
 
+/* i915_gem_internal.c */
+struct drm_i915_gem_object *
+i915_gem_object_create_internal(struct drm_device *dev,
+   unsigned int size);
+
 /* i915_gem_shrinker.c */
 unsigned long i915_gem_shrink(struct drm_i915_private *dev_priv,
  unsigned long target,
diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c 
b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
index cb25cad3318c..3934c9103cf2 100644
--- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c
+++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
@@ -97,9 +97,9 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
size_t size)
 {
struct drm_i915_gem_object *obj = NULL;
-   struct drm_i915_gem_object *tmp, *next;
+   struct drm_i915_gem_object *tmp;
struct list_head *list;
-   int n;
+   int n, ret;
 
lockdep_assert_held(&pool->engine->i915->drm.struct_mutex);
 
@@ -112,19 +112,12 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
n = ARRAY_SIZE(pool->cache_list) - 1;
list = &pool->cache_list[n];
 
-   list_for_each_entry_safe(tmp, next, list, batch_pool_link) {
+   list_for_each_entry(tmp, list, batch_pool_link) {
/* The batches are strictly LRU ordered */
if (!i915_gem_active_is_idle(&tmp->last_read[pool->engine->id],
 &tmp->base.dev->struct_mutex))
break;
 
-   /* While we're looping, do some clean up */
-   if (tmp->madv == __I915_MADV_PURGED) {
-   list_del(&tmp->batch_pool_link);
-   i915_gem_object_put(tmp);
-   continue;
-   }
-
if (tmp->base.size >= size) {
obj = tmp;
break;
@@ -132,19 +125,16 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
}
 
if (obj == NULL) {
-   int ret;
-
-   obj = i915_gem_object_create(&pool->engine->i915->drm, size);
+   obj = i915_gem_object_create_internal(&pool->engine->i915->drm,
+ size);
if (IS_ERR(obj))
return obj;
-
-   ret = i915_gem_object_get_pages(obj);
-   if (ret)
-   return ERR_PTR(ret);
-
-   obj->madv = I915_MADV_DONTNEED;
}
 
+   ret = i915_gem_object_get_pages(obj);
+   if (ret)
+   return ERR_PTR(ret);
+

[Intel-gfx] [PATCH 15/38] drm/i915: Refactor object page API

2016-09-20 Thread Chris Wilson
The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.
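
The intended calling convention after this patch, sketched from the call-site
conversions in the diff: any direct access to the backing storage is bracketed
by a pin/unpin pair instead of a bare get_pages() call:

    int ret;

    ret = i915_gem_object_pin_pages(obj);
    if (ret)
            return ret;

    /* ... obj->mm.pages / obj->mm.mapping may be used safely here ... */

    i915_gem_object_unpin_pages(obj);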

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c   |   2 +-
 drivers/gpu/drm/i915/i915_debugfs.c  |  17 +--
 drivers/gpu/drm/i915/i915_drv.h  |  89 
 drivers/gpu/drm/i915/i915_gem.c  | 207 +--
 drivers/gpu/drm/i915/i915_gem_batch_pool.c   |   3 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c   |  14 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |   2 +-
 drivers/gpu/drm/i915/i915_gem_fence.c|   4 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c  |  12 +-
 drivers/gpu/drm/i915/i915_gem_internal.c |  19 +--
 drivers/gpu/drm/i915/i915_gem_render_state.c |   2 +-
 drivers/gpu/drm/i915/i915_gem_shrinker.c |  10 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c   |  24 ++--
 drivers/gpu/drm/i915/i915_gem_tiling.c   |   8 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c  |  30 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c|   4 +-
 drivers/gpu/drm/i915/intel_lrc.c |   6 +-
 17 files changed, 235 insertions(+), 218 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 70980f82a15b..8d20020cb9f9 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1290,7 +1290,7 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
*engine,
}
 
if (ret == 0 && needs_clflush_after)
-   drm_clflush_virt_range(shadow_batch_obj->mapping, batch_len);
+   drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
i915_gem_object_unpin_map(shadow_batch_obj);
 
return ret;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 278fb18efde2..0b2e0ff968cf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -114,7 +114,7 @@ static char get_global_flag(struct drm_i915_gem_object *obj)
 
 static char get_pin_mapped_flag(struct drm_i915_gem_object *obj)
 {
-   return obj->mapping ? 'M' : ' ';
+   return obj->mm.mapping ? 'M' : ' ';
 }
 
 static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
@@ -160,8 +160,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object 
*obj)
   i915_gem_active_get_seqno(&obj->last_write,
 &obj->base.dev->struct_mutex),
   i915_cache_level_str(dev_priv, obj->cache_level),
-  obj->dirty ? " dirty" : "",
-  obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
+  obj->mm.dirty ? " dirty" : "",
+  obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : "");
if (obj->base.name)
seq_printf(m, " (name: %d)", obj->base.name);
list_for_each_entry(vma, &obj->vma_list, obj_link) {
@@ -413,12 +413,12 @@ static int i915_gem_object_info(struct seq_file *m, void 
*data)
size += obj->base.size;
++count;
 
-   if (obj->madv == I915_MADV_DONTNEED) {
+   if (obj->mm.madv == I915_MADV_DONTNEED) {
purgeable_size += obj->base.size;
++purgeable_count;
}
 
-   if (obj->mapping) {
+   if (obj->mm.mapping) {
mapped_count++;
mapped_size += obj->base.size;
}
@@ -435,12 +435,12 @@ static int i915_gem_object_info(struct seq_file *m, void 
*data)
++dpy_count;
}
 
-   if (obj->madv == I915_MADV_DONTNEED) {
+   if (obj->mm.madv == I915_MADV_DONTNEED) {
purgeable_size += obj->base.size;
++purgeable_count;
}
 
-   if (obj->mapping) {
+   if (obj->mm.mapping) {
mapped_count++;
mapped_size += obj->base.size;
}
@@ -1978,7 +1978,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
seq_printf(m, "\tBound in GGTT at 0x%08x\n",
   i915_ggtt_offset(vma));
 
-   if (i915_gem_object_get_pages(vma->obj)) {
+   if (i915_gem_object_pin_pages(vma->obj)) {
seq_puts(m, "\tFailed to get pages for context object\n\n");
return;
}
@@ -1997,6 +1997,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
kunmap_atomic(reg_state);
}
 
+   i915_gem_object_unpin_pages(vma->obj);
seq_putc(m, '\n');
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 33ce6db85ce4..d51a4ec512fa 100644
--- 

[Intel-gfx] [PATCH 16/38] drm/i915: Pass around sg_table to get_pages/put_pages backend

2016-09-20 Thread Chris Wilson
The plan is to move obj->pages out from under the struct_mutex into its
own per-object lock. We need to prune any assumption of the struct_mutex
from the get_pages/put_pages backends, and to make it easier we pass
around the sg_table to operate on rather than indirectly via the obj.
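
A minimal sketch (not taken from the patch) of what a backend looks like
under the new signatures: get_pages() returns the sg_table it built and
put_pages() receives it for teardown, so neither side has to dereference
obj->mm.pages under struct_mutex:

    static struct sg_table *example_get_pages(struct drm_i915_gem_object *obj)
    {
            struct sg_table *st;

            st = kmalloc(sizeof(*st), GFP_KERNEL);
            if (!st)
                    return ERR_PTR(-ENOMEM);

            /* ... allocate the pages and populate the scatterlist ... */

            return st;
    }

    static void example_put_pages(struct drm_i915_gem_object *obj,
                                  struct sg_table *pages)
    {
            /* ... release the pages referenced by @pages ... */
            sg_free_table(pages);
            kfree(pages);
    }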

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  |  36 +--
 drivers/gpu/drm/i915/i915_gem.c  | 174 +++
 drivers/gpu/drm/i915/i915_gem_dmabuf.c   |  20 ++--
 drivers/gpu/drm/i915/i915_gem_fence.c|  16 +--
 drivers/gpu/drm/i915/i915_gem_gtt.c  |  19 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.h  |   6 +-
 drivers/gpu/drm/i915/i915_gem_internal.c |  22 ++--
 drivers/gpu/drm/i915/i915_gem_shrinker.c |  11 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c   |  43 
 drivers/gpu/drm/i915/i915_gem_userptr.c  |  88 
 10 files changed, 226 insertions(+), 209 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d51a4ec512fa..e267f25fca6e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2157,8 +2157,8 @@ struct drm_i915_gem_object_ops {
 * being released or under memory pressure (where we attempt to
 * reap pages for the shrinker).
 */
-   int (*get_pages)(struct drm_i915_gem_object *);
-   void (*put_pages)(struct drm_i915_gem_object *);
+   struct sg_table *(*get_pages)(struct drm_i915_gem_object *);
+   void (*put_pages)(struct drm_i915_gem_object *, struct sg_table *);
 
int (*dmabuf_export)(struct drm_i915_gem_object *);
void (*release)(struct drm_i915_gem_object *);
@@ -2300,8 +2300,6 @@ struct drm_i915_gem_object {
struct i915_gem_userptr {
uintptr_t ptr;
unsigned read_only :1;
-   unsigned workers :4;
-#define I915_GEM_USERPTR_MAX_WORKERS 15
 
struct i915_mm_struct *mm;
struct i915_mmu_object *mmu_object;
@@ -2361,6 +2359,19 @@ __deprecated
 extern void drm_gem_object_unreference_unlocked(struct drm_gem_object *);
 
 static inline bool
+i915_gem_object_is_dead(const struct drm_i915_gem_object *obj)
+{
+   return atomic_read(&obj->base.refcount.refcount) == 0;
+}
+
+#if IS_ENABLED(CONFIG_LOCKDEP)
+#define lockdep_assert_held_unless(lock, cond) \
+   GEM_BUG_ON(!lockdep_is_held(lock) && !(cond))
+#else
+#define lockdep_assert_held_unless(lock, cond)
+#endif
+
+static inline bool
 i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj)
 {
return obj->ops->flags & I915_GEM_OBJECT_HAS_STRUCT_PAGE;
@@ -3172,6 +3183,8 @@ dma_addr_t
 i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj,
unsigned long n);
 
+void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
+struct sg_table *pages);
 int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 
 static inline int __must_check
@@ -3187,7 +3200,8 @@ i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 static inline void
 __i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 {
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
+   lockdep_assert_held_unless(&obj->base.dev->struct_mutex,
+  i915_gem_object_is_dead(obj));
GEM_BUG_ON(!obj->mm.pages);
obj->mm.pages_pin_count++;
 }
@@ -3201,7 +3215,8 @@ i915_gem_object_has_pinned_pages(struct 
drm_i915_gem_object *obj)
 static inline void
 __i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 {
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
+   lockdep_assert_held_unless(&obj->base.dev->struct_mutex,
+  i915_gem_object_is_dead(obj));
GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
GEM_BUG_ON(!obj->mm.pages);
obj->mm.pages_pin_count--;
@@ -3212,7 +3227,8 @@ static inline void i915_gem_object_unpin_pages(struct 
drm_i915_gem_object *obj)
__i915_gem_object_unpin_pages(obj);
 }
 
-int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
+void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
+void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj);
 
 enum i915_map_type {
I915_MAP_WB = 0,
@@ -3435,8 +3451,10 @@ i915_vma_unpin_fence(struct i915_vma *vma)
 void i915_gem_restore_fences(struct drm_device *dev);
 
 void i915_gem_detect_bit_6_swizzle(struct drm_device *dev);
-void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj);
-void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj);
+void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
+  struct sg_table *pages);
+void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj,
+   

[Intel-gfx] [PATCH 06/38] drm/i915: Support asynchronous waits on struct fence from i915_gem_request

2016-09-20 Thread Chris Wilson
We will need to wait on DMA completion (as signaled via struct fence)
before executing our i915_gem_request. Therefore we want to expose a
method for adding the await on the fence itself to the request.
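
A hedged usage sketch of the new entry point: before the request is
committed, a caller (for instance the execbuf fence support later in the
series) points it at an external fence and lets the helper sort out the
dependency type:

    ret = i915_gem_request_await_fence(req, fence);
    if (ret < 0)
            return ret;

    /* i915 fences become ordinary inter-request dependencies, foreign fences
     * are waited upon via the i915_sw_fence machinery, and fence arrays are
     * decomposed into their children */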

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_request.c | 40 +
 drivers/gpu/drm/i915/i915_gem_request.h |  2 ++
 2 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 40978bc12ceb..624d71bf4518 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -23,6 +23,7 @@
  */
 
 #include 
+#include 
 
 #include "i915_drv.h"
 
@@ -494,6 +495,45 @@ i915_gem_request_await_request(struct drm_i915_gem_request 
*to,
return 0;
 }
 
+int
+i915_gem_request_await_fence(struct drm_i915_gem_request *req,
+struct fence *fence)
+{
+   struct fence_array *array;
+   int ret;
+   int i;
+
+   if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+   return 0;
+
+   if (fence_is_i915(fence))
+   return i915_gem_request_await_request(req, to_request(fence));
+
+   if (!fence_is_array(fence)) {
+   ret = i915_sw_fence_await_dma_fence(&req->submit,
+   fence, 10*HZ,
+   GFP_KERNEL);
+   return ret < 0 ? ret : 0;
+   }
+
+   array = to_fence_array(fence);
+   for (i = 0; i < array->num_fences; i++) {
+   struct fence *child = array->fences[i];
+
+   if (fence_is_i915(child))
+   ret = i915_gem_request_await_request(req,
+to_request(child));
+   else
+   ret = i915_sw_fence_await_dma_fence(&req->submit,
+   child, 10*HZ,
+   GFP_KERNEL);
+   if (ret < 0)
+   return ret;
+   }
+
+   return 0;
+}
+
 /**
  * i915_gem_request_await_object - set this request to (async) wait upon a bo
  *
diff --git a/drivers/gpu/drm/i915/i915_gem_request.h 
b/drivers/gpu/drm/i915/i915_gem_request.h
index 974bd7bcc801..c85a3d82febf 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -214,6 +214,8 @@ int
 i915_gem_request_await_object(struct drm_i915_gem_request *to,
  struct drm_i915_gem_object *obj,
  bool write);
+int i915_gem_request_await_fence(struct drm_i915_gem_request *req,
+struct fence *fence);
 
 void __i915_add_request(struct drm_i915_gem_request *req, bool flush_caches);
 #define i915_add_request(req) \
-- 
2.9.3



[Intel-gfx] [PATCH 02/38] drm/i915: Stop the machine whilst capturing the GPU crash dump

2016-09-20 Thread Chris Wilson
The error state is purposefully racy as we expect it to be called at any
time and so have avoided any locking whilst capturing the crash dump.
However, with multi-engine GPUs and multiple CPUs, those races can
manifest into OOPSes as we attempt to chase dangling pointers freed on
other CPUs. Under discussion are lots of ways to slow down normal
operation in order to protect the post-mortem error capture, but what it
we take the opposite approach and freeze the machine whilst the error
capture runs (note the GPU may still running, but as long as we don't
process any of the results the driver's bookkeeping will be static).

Note that by itself, this is not a complete fix. It also depends on
the compiler barriers in list_add/list_del to prevent traversing the
lists into the void. We also depend on only requiring state from
carefully controlled sources - i.e. all the state we require for
post-mortem debugging should be reachable from the request itself so
that we only have to worry about retrieving the request carefully. Once
we have the request, we know that all pointers from it are intact.
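
The call that dispatches the capture is not visible in the hunks quoted
here, but going by the description above it amounts to running the capture
callback under stop_machine(), for example:

    /* freeze every other CPU while the driver's bookkeeping is walked */
    stop_machine(capture, error, NULL);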

v2: Avoid drm_clflush_pages() inside stop_machine() as it may use
stop_machine() itself for its wbinvd fallback.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig  |  1 +
 drivers/gpu/drm/i915/i915_drv.h   |  2 ++
 drivers/gpu/drm/i915/i915_gpu_error.c | 46 +--
 3 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 10a6ac11b6a9..0f46a9c04c0e 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -4,6 +4,7 @@ config DRM_I915
depends on X86 && PCI
select INTEL_GTT
select INTERVAL_TREE
+   select STOP_MACHINE
# we need shmfs for the swappable backing store, and in particular
# the shmem_readpage() which depends upon tmpfs
select SHMEM
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1643acb02513..3fda31816fdd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -735,6 +735,8 @@ struct drm_i915_error_state {
struct kref ref;
struct timeval time;
 
+   struct drm_i915_private *i915;
+
char error_msg[128];
bool simulated;
int iommu;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 364f5b17df21..13ad86e11695 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -28,6 +28,7 @@
  */
 
 #include 
+#include 
 #include "i915_drv.h"
 
 #ifdef CONFIG_DRM_I915_CAPTURE_ERROR
@@ -720,14 +721,12 @@ i915_error_object_create(struct drm_i915_private 
*dev_priv,
 
dst->page_count = num_pages;
while (num_pages--) {
-   unsigned long flags;
void *d;
 
d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
if (d == NULL)
goto unwind;
 
-   local_irq_save(flags);
if (use_ggtt) {
void __iomem *s;
 
@@ -746,15 +745,10 @@ i915_error_object_create(struct drm_i915_private 
*dev_priv,
 
page = i915_gem_object_get_page(src, i);
 
-   drm_clflush_pages(&page, 1);
-
s = kmap_atomic(page);
memcpy(d, s, PAGE_SIZE);
kunmap_atomic(s);
-
-   drm_clflush_pages(&page, 1);
}
-   local_irq_restore(flags);
 
dst->pages[i++] = d;
reloc_offset += PAGE_SIZE;
@@ -1420,6 +1414,31 @@ static void i915_capture_gen_state(struct 
drm_i915_private *dev_priv,
   sizeof(error->device_info));
 }
 
+static int capture(void *data)
+{
+   struct drm_i915_error_state *error = data;
+
+   /* Ensure that what we readback from memory matches what the GPU sees */
+   wbinvd();
+
+   i915_capture_gen_state(error->i915, error);
+   i915_capture_reg_state(error->i915, error);
+   i915_gem_record_fences(error->i915, error);
+   i915_gem_record_rings(error->i915, error);
+   i915_capture_active_buffers(error->i915, error);
+   i915_capture_pinned_buffers(error->i915, error);
+
+   do_gettimeofday(&error->time);
+
+   error->overlay = intel_overlay_capture_error_state(error->i915);
+   error->display = intel_display_capture_error_state(error->i915);
+
+   /* And make sure we don't leave trash in the CPU cache */
+   wbinvd();
+
+   return 0;
+}
+
 /**
  * i915_capture_error_state - capture an error record for later analysis
  * @dev: drm device
@@ -1451,18 +1470,9 @@ void i915_capture_error_state(struct drm_i915_private 
*dev_priv,
}
 
kref_init(&error->ref);
+   error->i915 = dev_priv;
 
-   i915_capture_gen_state(dev_priv, error);
-   

[Intel-gfx] [PATCH 03/38] drm/i915: Always use the GTT for error capture

2016-09-20 Thread Chris Wilson
Since the GTT provides universal access to any GPU page, we can use it
to reduce our plethora of read methods to just one. It also has the
important characteristic of being exactly what the GPU sees - if there
are incoherency problems, seeing the batch as executed (rather than as
trapped inside the cpu cache) is important.
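
A rough sketch of the single read path this enables; the field and callback
names here are assumptions for illustration rather than quotes from the
patch. Each page of interest is bound into the slot reserved in the GGTT and
read back through the mappable aperture:

    void __iomem *s;

    /* bind the page into the reserved slot, then read it via the aperture */
    ggtt->base.insert_page(&ggtt->base, dma_addr, slot, I915_CACHE_NONE, 0);
    s = io_mapping_map_atomic_wc(ggtt->mappable, slot);
    memcpy(dst, (const void __force *)s, PAGE_SIZE);
    io_mapping_unmap_atomic(s);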

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  43 
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   2 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 121 --
 3 files changed, 74 insertions(+), 92 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0bb4232f66bc..7906fc3aa488 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2717,6 +2717,7 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
 */
struct i915_ggtt *ggtt = &dev_priv->ggtt;
unsigned long hole_start, hole_end;
+   struct i915_hw_ppgtt *ppgtt;
struct drm_mm_node *entry;
int ret;
 
@@ -2724,6 +2725,15 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
if (ret)
return ret;
 
+   /* Reserve a mappable slot for our lockless error capture */
+   ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm,
+ &ggtt->gpu_error,
+ 4096, 0, -1,
+ 0, ggtt->mappable_end,
+ 0, 0);
+   if (ret)
+   return ret;
+
/* Clear any non-preallocated blocks */
drm_mm_for_each_hole(entry, &ggtt->base.mm, hole_start, hole_end) {
DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
@@ -2738,25 +2748,21 @@ int i915_gem_init_ggtt(struct drm_i915_private 
*dev_priv)
   true);
 
if (USES_PPGTT(dev_priv) && !USES_FULL_PPGTT(dev_priv)) {
-   struct i915_hw_ppgtt *ppgtt;
-
ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL);
-   if (!ppgtt)
-   return -ENOMEM;
+   if (!ppgtt) {
+   ret = -ENOMEM;
+   goto err;
+   }
 
ret = __hw_ppgtt_init(ppgtt, dev_priv);
-   if (ret) {
-   kfree(ppgtt);
-   return ret;
-   }
+   if (ret)
+   goto err_ppgtt;
 
-   if (ppgtt->base.allocate_va_range)
+   if (ppgtt->base.allocate_va_range) {
ret = ppgtt->base.allocate_va_range(&ppgtt->base, 0,
ppgtt->base.total);
-   if (ret) {
-   ppgtt->base.cleanup(&ppgtt->base);
-   kfree(ppgtt);
-   return ret;
+   if (ret)
+   goto err_ppgtt_cleanup;
}
 
ppgtt->base.clear_range(&ppgtt->base,
@@ -2770,6 +2776,14 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
}
 
return 0;
+
+err_ppgtt_cleanup:
+   ppgtt->base.cleanup(&ppgtt->base);
+err_ppgtt:
+   kfree(ppgtt);
+err:
+   drm_mm_remove_node(&ggtt->gpu_error);
+   return ret;
 }
 
 /**
@@ -2788,6 +2802,9 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private 
*dev_priv)
 
i915_gem_cleanup_stolen(&dev_priv->drm);
 
+   if (drm_mm_node_allocated(&ggtt->gpu_error))
+   drm_mm_remove_node(&ggtt->gpu_error);
+
if (drm_mm_initialized(&ggtt->base.mm)) {
intel_vgt_deballoon(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index ec78be2f8c77..a90999fe2e57 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -450,6 +450,8 @@ struct i915_ggtt {
bool do_idle_maps;
 
int mtrr;
+
+   struct drm_mm_node gpu_error;
 };
 
 struct i915_hw_ppgtt {
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 13ad86e11695..82a928b0b2f3 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -632,7 +632,7 @@ static void i915_error_object_free(struct 
drm_i915_error_object *obj)
return;
 
for (page = 0; page < obj->page_count; page++)
-   kfree(obj->pages[page]);
+   free_page((unsigned long)obj->pages[page]);
 
kfree(obj);
 }
@@ -669,98 +669,68 @@ static void i915_error_state_free(struct kref *error_ref)
kfree(error);
 }
 
+static int compress_page(void *src, struct drm_i915_error_object *dst)
+{
+   unsigned long page;
+
+   page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN);
+   if (!page)
+   return -ENOMEM;
+
+   

[Intel-gfx] [PATCH 08/38] drm/i915: Rearrange i915_wait_request() accounting with callers

2016-09-20 Thread Chris Wilson
Our low-level wait routine has evolved from our generic wait interface
that handled unlocked, RPS boosting, waits with time tracking. If we
push our GEM fence tracking to use reservation_objects (required for
handling multiple timelines), we lose the ability to pass the required
information down to i915_wait_request(). However, if we push the extra
functionality from i915_wait_request() to the individual callsites
(i915_gem_object_wait_rendering and i915_gem_wait_ioctl) that make use
of those extras, we can both simplify our low level wait and prepare for
extending the GEM interface for use of reservation_objects.

* This causes a temporary regression in the use of wait-ioctl as a busy
query -- it will fail to report immediately, but go into
i915_wait_request() before timing out.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h |   7 +-
 drivers/gpu/drm/i915/i915_gem.c | 300 +++-
 drivers/gpu/drm/i915/i915_gem_request.c | 117 ++---
 drivers/gpu/drm/i915/i915_gem_request.h |  32 ++--
 drivers/gpu/drm/i915/i915_gem_userptr.c |  12 +-
 drivers/gpu/drm/i915/intel_display.c|  34 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c |  14 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |   3 +-
 8 files changed, 283 insertions(+), 236 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index dc0cca48ff3d..a3401418bf4d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3279,9 +3279,10 @@ int __must_check i915_gem_wait_for_idle(struct 
drm_i915_private *dev_priv,
 int __must_check i915_gem_suspend(struct drm_device *dev);
 void i915_gem_resume(struct drm_device *dev);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
-int __must_check
-i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
-  bool readonly);
+int i915_gem_object_wait(struct drm_i915_gem_object *obj,
+unsigned int flags,
+long timeout,
+struct intel_rps_client *rps);
 int __must_check
 i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj,
  bool write);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c8bd02277b7d..d9214c9d31d2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -292,7 +292,12 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 * must wait for all rendering to complete to the object (as unbinding
 * must anyway), and retire the requests.
 */
-   ret = i915_gem_object_wait_rendering(obj, false);
+   ret = i915_gem_object_wait(obj,
+  I915_WAIT_INTERRUPTIBLE |
+  I915_WAIT_LOCKED |
+  I915_WAIT_ALL,
+  MAX_SCHEDULE_TIMEOUT,
+  NULL);
if (ret)
return ret;
 
@@ -311,88 +316,171 @@ int i915_gem_object_unbind(struct drm_i915_gem_object 
*obj)
return ret;
 }
 
-/**
- * Ensures that all rendering to the object has completed and the object is
- * safe to unbind from the GTT or access from the CPU.
- * @obj: i915 gem object
- * @readonly: waiting for just read access or read-write access
- */
-int
-i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
-  bool readonly)
+static long
+i915_gem_object_wait_fence(struct fence *fence,
+  unsigned int flags,
+  long timeout,
+  struct intel_rps_client *rps)
 {
-   struct reservation_object *resv;
-   struct i915_gem_active *active;
-   unsigned long active_mask;
-   int idx;
+   struct drm_i915_gem_request *rq;
 
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
+   BUILD_BUG_ON(I915_WAIT_INTERRUPTIBLE != 0x1);
 
-   if (!readonly) {
-   active = obj->last_read;
-   active_mask = i915_gem_object_get_active(obj);
-   } else {
-   active_mask = 1;
-   active = &obj->last_write;
+   if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+   return timeout;
+
+   if (!fence_is_i915(fence))
+   return fence_wait_timeout(fence,
+ flags & I915_WAIT_INTERRUPTIBLE,
+ timeout);
+
+   rq = to_request(fence);
+   if (i915_gem_request_completed(rq))
+   goto out;
+
+   /* This client is about to stall waiting for the GPU. In many cases
+* this is undesirable and limits the throughput of the system, as
+* many clients cannot continue processing user input/output whilst
+* blocked. RPS autotuning may take tens of milliseconds to respond
+  

[Intel-gfx] [PATCH 07/38] drm/i915: Allow i915_sw_fence_await_sw_fence() to allocate

2016-09-20 Thread Chris Wilson
In forthcoming patches, we want to be able to dynamically allocate the
wait_queue_t used whilst awaiting. This is more convenient if we extend
i915_sw_fence_await_sw_fence() to perform the allocation for us when
we pass in a gfp mask as an alternative to a preallocated struct.
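
Usage then becomes (a sketch; the GFP_NOWAIT caller mirrors
__i915_add_request below, the GFP_KERNEL one is illustrative): pass a
preallocated wait_queue_t as before, or NULL plus a gfp mask to have one
allocated, and freed, on the caller's behalf:

    /* caller owns storage for the wait entry, nothing is allocated */
    err = i915_sw_fence_await_sw_fence(&request->submit, &prev->submit,
                                       &prev->submitq, GFP_NOWAIT);

    /* or: let the helper allocate (and later free) the entry itself */
    err = i915_sw_fence_await_sw_fence(&fence, &signaler, NULL, GFP_KERNEL);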

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_request.c |  2 +-
 drivers/gpu/drm/i915/i915_sw_fence.c| 22 --
 drivers/gpu/drm/i915/i915_sw_fence.h|  7 ++-
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 624d71bf4518..7496661b295e 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -678,7 +678,7 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
   >i915->drm.struct_mutex);
if (prev)
i915_sw_fence_await_sw_fence(&request->submit, &prev->submit,
-&prev->submitq);
+&prev->submitq, GFP_NOWAIT);
 
request->emitted_jiffies = jiffies;
request->previous_seqno = engine->last_submitted_seqno;
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index 1e5cbc585ca2..1fd20b69feea 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -13,6 +13,8 @@
 
 #include "i915_sw_fence.h"
 
+#define I915_SW_FENCE_FLAG_ALLOC BIT(0)
+
 static DEFINE_SPINLOCK(i915_sw_fence_lock);
 
 static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
@@ -135,6 +137,8 @@ static int i915_sw_fence_wake(wait_queue_t *wq, unsigned 
mode, int flags, void *
list_del(&wq->task_list);
__i915_sw_fence_complete(wq->private, key);
i915_sw_fence_put(wq->private);
+   if (wq->flags & I915_SW_FENCE_FLAG_ALLOC)
+   kfree(wq);
return 0;
 }
 
@@ -194,7 +198,7 @@ static bool i915_sw_fence_check_if_after(struct 
i915_sw_fence *fence,
 
 int i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence,
 struct i915_sw_fence *signaler,
-wait_queue_t *wq)
+wait_queue_t *wq, gfp_t gfp)
 {
unsigned long flags;
int pending;
@@ -206,8 +210,22 @@ int i915_sw_fence_await_sw_fence(struct i915_sw_fence 
*fence,
if (unlikely(i915_sw_fence_check_if_after(fence, signaler)))
return -EINVAL;
 
+   pending = 0;
+   if (!wq) {
+   wq = kmalloc(sizeof(*wq), gfp);
+   if (!wq) {
+   if (!gfpflags_allow_blocking(gfp))
+   return -ENOMEM;
+
+   i915_sw_fence_wait(signaler);
+   return 0;
+   }
+
+   pending |= I915_SW_FENCE_FLAG_ALLOC;
+   }
+
INIT_LIST_HEAD(&wq->task_list);
-   wq->flags = 0;
+   wq->flags = pending;
wq->func = i915_sw_fence_wake;
wq->private = i915_sw_fence_get(fence);
 
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index 373141602ca4..d221df381ec2 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -45,7 +45,7 @@ void i915_sw_fence_commit(struct i915_sw_fence *fence);
 
 int i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence,
 struct i915_sw_fence *after,
-wait_queue_t *wq);
+wait_queue_t *wq, gfp_t gfp);
 int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
  struct fence *dma,
  unsigned long timeout,
@@ -62,4 +62,9 @@ static inline bool i915_sw_fence_done(const struct 
i915_sw_fence *fence)
return atomic_read(&fence->pending) < 0;
 }
 
+static inline void i915_sw_fence_wait(struct i915_sw_fence *fence)
+{
+   wait_event(fence->wait, i915_sw_fence_done(fence));
+}
+
 #endif /* _I915_SW_FENCE_H_ */
-- 
2.9.3



[Intel-gfx] [PATCH 10/38] drm/i915: Defer active reference until required

2016-09-20 Thread Chris Wilson
We only need the active reference to keep the object alive after the
handle has been deleted (so as to prevent a synchronous gem_close). Why
then pay the price of a kref on every execbuf when we can insert that
final active ref just in time for the handle deletion?

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  | 34 
 drivers/gpu/drm/i915/i915_gem.c  | 25 +++-
 drivers/gpu/drm/i915/i915_gem_batch_pool.c   |  2 +-
 drivers/gpu/drm/i915/i915_gem_context.c  |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  2 --
 drivers/gpu/drm/i915/i915_gem_gtt.c  |  2 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |  4 ++--
 8 files changed, 64 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a3401418bf4d..89eab85ab11c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2214,6 +2214,13 @@ struct drm_i915_gem_object {
((READ_ONCE((bo)->flags) >> I915_BO_ACTIVE_SHIFT) & I915_BO_ACTIVE_MASK)
 
/**
+* Have we taken a reference for the object for incomplete GPU
+* activity?
+*/
+#define I915_BO_ACTIVE_REF_SHIFT (I915_BO_ACTIVE_SHIFT + I915_NUM_ENGINES)
+#define I915_BO_ACTIVE_REF BIT(I915_BO_ACTIVE_REF_SHIFT)
+
+   /**
 * This is set if the object has been written to since last bound
 * to the GTT
 */
@@ -2383,6 +2390,28 @@ i915_gem_object_has_active_engine(const struct 
drm_i915_gem_object *obj,
return obj->flags & BIT(engine + I915_BO_ACTIVE_SHIFT);
 }
 
+static inline bool
+i915_gem_object_has_active_reference(const struct drm_i915_gem_object *obj)
+{
+   return obj->flags & I915_BO_ACTIVE_REF;
+}
+
+static inline void
+i915_gem_object_set_active_reference(struct drm_i915_gem_object *obj)
+{
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+   obj->flags |= I915_BO_ACTIVE_REF;
+}
+
+static inline void
+i915_gem_object_clear_active_reference(struct drm_i915_gem_object *obj)
+{
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+   obj->flags &= ~I915_BO_ACTIVE_REF;
+}
+
+void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj);
+
 static inline unsigned int
 i915_gem_object_get_tiling(struct drm_i915_gem_object *obj)
 {
@@ -2413,6 +2442,11 @@ static inline void i915_vma_put(struct i915_vma *vma)
i915_gem_object_put(vma->obj);
 }
 
+static inline void i915_vma_put_internal(struct i915_vma *vma)
+{
+   __i915_gem_object_release_unless_active(vma->obj);
+}
+
 /*
  * Optimised SGL iterator for GEM objects
  */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9214c9d31d2..e8373f264858 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2618,7 +2618,10 @@ i915_gem_object_retire__read(struct i915_gem_active 
*active,
-   list_move_tail(&obj->global_list,
   &request->i915->mm.bound_list);
 
-   i915_gem_object_put(obj);
+   if (i915_gem_object_has_active_reference(obj)) {
+   i915_gem_object_clear_active_reference(obj);
+   i915_gem_object_put(obj);
+   }
 }
 
 static bool i915_context_is_banned(const struct i915_gem_context *ctx)
@@ -2881,6 +2884,12 @@ void i915_gem_close_object(struct drm_gem_object *gem, 
struct drm_file *file)
list_for_each_entry_safe(vma, vn, &obj->vma_list, obj_link)
if (vma->vm->file == fpriv)
i915_vma_close(vma);
+
+   if (i915_gem_object_is_active(obj) &&
+   !i915_gem_object_has_active_reference(obj)) {
+   i915_gem_object_set_active_reference(obj);
+   i915_gem_object_get(obj);
+   }
mutex_unlock(&obj->base.dev->struct_mutex);
 }
 
@@ -4358,6 +4367,20 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
intel_runtime_pm_put(dev_priv);
 }
 
+void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj)
+{
+   if (!obj)
+   return;
+
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
+   GEM_BUG_ON(i915_gem_object_has_active_reference(obj));
+   if (i915_gem_object_is_active(obj))
+   i915_gem_object_set_active_reference(obj);
+   else
+   i915_gem_object_put(obj);
+}
+
 int i915_gem_suspend(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c 
b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
index ed989596d9a3..cb25cad3318c 100644
--- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c
+++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
@@ -73,7 +73,7 @@ void i915_gem_batch_pool_fini(struct i915_gem_batch_pool 
*pool)
list_for_each_entry_safe(obj, next,
 

[Intel-gfx] Multiple timelines, take 2

2016-09-20 Thread Chris Wilson
A minor hitch in handling the unification of vma + obj retirements is
that we trigger a recursive use-after-free if we try to purge an object
with an active reference. One way to side-step that issue was to pull in
the deferred free work, which pulled in dependencies on the separate
mm.lock and a few rearrangements to make that possible.
-Chris

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 01/38] drm/i915: Allow disabling error capture

2016-09-20 Thread Chris Wilson
We currently capture the GPU state after we detect a hang. This is vital
for us to both triage and debug hangs in the wild (post-mortem
debugging). However, it comes at the cost of running some potentially
dangerous code (since it has to make very few assumptions about the state
of the driver) that is quite resource intensive.
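
The patch makes the capture code optional behind a Kconfig symbol, with empty inline
stubs so that callers need no #ifdefs of their own. A minimal standalone sketch of
that compile-out pattern, using an invented CONFIG_TOY_CAPTURE symbol rather than
the real option added below:

#include <stdio.h>

/* Illustrative only: when the (invented) config symbol is unset, the
 * feature collapses to a no-op stub and callers stay unchanged. */
#ifdef CONFIG_TOY_CAPTURE
void toy_capture_state(const char *reason); /* real implementation elsewhere */
#else
static inline void toy_capture_state(const char *reason) { (void)reason; }
#endif

int main(void)
{
	/* With CONFIG_TOY_CAPTURE undefined this call compiles to nothing. */
	toy_capture_state("hang detected");
	printf("capture requested\n");
	return 0;
}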

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig  | 10 ++
 drivers/gpu/drm/i915/i915_debugfs.c   |  6 ++
 drivers/gpu/drm/i915/i915_drv.h   | 16 
 drivers/gpu/drm/i915/i915_gpu_error.c |  7 +++
 drivers/gpu/drm/i915/i915_params.c|  9 +
 drivers/gpu/drm/i915/i915_params.h|  1 +
 drivers/gpu/drm/i915/i915_sysfs.c |  8 
 drivers/gpu/drm/i915/intel_display.c  |  4 
 drivers/gpu/drm/i915/intel_overlay.c  |  4 
 9 files changed, 65 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e469118f..10a6ac11b6a9 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -46,6 +46,16 @@ config DRM_I915_PRELIMINARY_HW_SUPPORT
 
  If in doubt, say "N".
 
+config DRM_I915_CAPTURE_ERROR
+   bool "Enable capturing GPU state following a hang"
+   depends on DRM_I915
+   default y
+   help
+ This option enables capturing the GPU state when a hang is detected.
+ This information is vital for triaging hangs and assists in debugging.
+
+ If in doubt, say "Y".
+
 config DRM_I915_USERPTR
bool "Always enable userptr support"
depends on DRM_I915
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 27b0e34dadec..278fb18efde2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -956,6 +956,8 @@ static int i915_hws_info(struct seq_file *m, void *data)
return 0;
 }
 
+#ifdef CONFIG_DRM_I915_CAPTURE_ERROR
+
 static ssize_t
 i915_error_state_write(struct file *filp,
   const char __user *ubuf,
@@ -1038,6 +1040,8 @@ static const struct file_operations i915_error_state_fops 
= {
.release = i915_error_state_release,
 };
 
+#endif
+
 static int
 i915_next_seqno_get(void *data, u64 *val)
 {
@@ -5309,7 +5313,9 @@ static const struct i915_debugfs_files {
{"i915_ring_missed_irq", _ring_missed_irq_fops},
{"i915_ring_test_irq", _ring_test_irq_fops},
{"i915_gem_drop_caches", _drop_caches_fops},
+#ifdef CONFIG_DRM_I915_CAPTURE_ERROR
{"i915_error_state", _error_state_fops},
+#endif
{"i915_next_seqno", _next_seqno_fops},
{"i915_display_crc_ctl", _display_crc_ctl_fops},
{"i915_pri_wm_latency", _pri_wm_latency_fops},
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4dd307ed4336..1643acb02513 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3522,6 +3522,8 @@ static inline void intel_display_crc_init(struct 
drm_i915_private *dev_priv) {}
 #endif
 
 /* i915_gpu_error.c */
+#ifdef CONFIG_DRM_I915_CAPTURE_ERROR
+
 __printf(2, 3)
 void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...);
 int i915_error_state_to_str(struct drm_i915_error_state_buf *estr,
@@ -3542,6 +3544,20 @@ void i915_error_state_get(struct drm_device *dev,
 void i915_error_state_put(struct i915_error_state_file_priv *error_priv);
 void i915_destroy_error_state(struct drm_device *dev);
 
+#else
+
+static inline void i915_capture_error_state(struct drm_i915_private *dev_priv,
+   u32 engine_mask,
+   const char *error_msg)
+{
+}
+
+static inline void i915_destroy_error_state(struct drm_device *dev)
+{
+}
+
+#endif
+
 void i915_get_extra_instdone(struct drm_i915_private *dev_priv, uint32_t 
*instdone);
 const char *i915_cache_level_str(struct drm_i915_private *i915, int type);
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 334f15df7c8d..364f5b17df21 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -30,6 +30,8 @@
 #include 
 #include "i915_drv.h"
 
+#ifdef CONFIG_DRM_I915_CAPTURE_ERROR
+
 static const char *engine_str(int engine)
 {
switch (engine) {
@@ -1435,6 +1437,9 @@ void i915_capture_error_state(struct drm_i915_private 
*dev_priv,
struct drm_i915_error_state *error;
unsigned long flags;
 
+   if (!i915.error_capture)
+   return;
+
if (READ_ONCE(dev_priv->gpu_error.first_error))
return;
 
@@ -1520,6 +1525,8 @@ void i915_destroy_error_state(struct drm_device *dev)
kref_put(&error->ref, i915_error_state_free);
 }
 
+#endif
+
 const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
 {
switch (type) {
diff --git a/drivers/gpu/drm/i915/i915_params.c 

[Intel-gfx] [PATCH 04/38] drm/i915: Consolidate error object printing

2016-09-20 Thread Chris Wilson
Leave all the pretty printing to userspace and simplify the error
capture to only have a single common object printer. It makes the kernel
code more compact, and the refactoring allows us to apply more complex
transformations like compressing the output.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 100 +-
 1 file changed, 25 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 82a928b0b2f3..f4cc6746dc72 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -311,10 +311,22 @@ void i915_error_printf(struct drm_i915_error_state_buf 
*e, const char *f, ...)
 }
 
 static void print_error_obj(struct drm_i915_error_state_buf *m,
+   struct intel_engine_cs *engine,
+   const char *name,
struct drm_i915_error_object *obj)
 {
int page, offset, elt;
 
+   if (!obj)
+   return;
+
+   if (name) {
+   err_printf(m, "%s --- %s = 0x%08x %08x\n",
+  engine ? engine->name : "global", name,
+  upper_32_bits(obj->gtt_offset),
+  lower_32_bits(obj->gtt_offset));
+   }
+
for (page = offset = 0; page < obj->page_count; page++) {
for (elt = 0; elt < PAGE_SIZE/4; elt++) {
err_printf(m, "%08x :  %08x\n", offset,
@@ -342,8 +354,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf 
*m,
struct pci_dev *pdev = dev_priv->drm.pdev;
struct drm_i915_error_state *error = error_priv->error;
struct drm_i915_error_object *obj;
-   int i, j, offset, elt;
int max_hangcheck_score;
+   int i, j;
 
if (!error) {
err_printf(m, "no error state collected\n");
@@ -467,15 +479,7 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
err_printf(m, " --- gtt_offset = 0x%08x %08x\n",
   upper_32_bits(obj->gtt_offset),
   lower_32_bits(obj->gtt_offset));
-   print_error_obj(m, obj);
-   }
-
-   obj = ee->wa_batchbuffer;
-   if (obj) {
-   err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n",
-  dev_priv->engine[i].name,
-  lower_32_bits(obj->gtt_offset));
-   print_error_obj(m, obj);
+   print_error_obj(m, &dev_priv->engine[i], NULL, obj);
}
 
if (ee->num_requests) {
@@ -507,77 +511,23 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
}
}
 
-   if ((obj = ee->ringbuffer)) {
-   err_printf(m, "%s --- ringbuffer = 0x%08x\n",
-  dev_priv->engine[i].name,
-  lower_32_bits(obj->gtt_offset));
-   print_error_obj(m, obj);
-   }
+   print_error_obj(m, &dev_priv->engine[i],
+   "ringbuffer", ee->ringbuffer);
 
-   if ((obj = ee->hws_page)) {
-   u64 hws_offset = obj->gtt_offset;
-   u32 *hws_page = &obj->pages[0][0];
+   print_error_obj(m, &dev_priv->engine[i],
+   "HW Status", ee->hws_page);
 
-   if (i915.enable_execlists) {
-   hws_offset += LRC_PPHWSP_PN * PAGE_SIZE;
-   hws_page = &obj->pages[LRC_PPHWSP_PN][0];
-   }
-   err_printf(m, "%s --- HW Status = 0x%08llx\n",
-  dev_priv->engine[i].name, hws_offset);
-   offset = 0;
-   for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
-   err_printf(m, "[%04x] %08x %08x %08x %08x\n",
-  offset,
-  hws_page[elt],
-  hws_page[elt+1],
-  hws_page[elt+2],
-  hws_page[elt+3]);
-   offset += 16;
-   }
-   }
+   print_error_obj(m, &dev_priv->engine[i],
+   "HW context", ee->ctx);
 
-   obj = ee->wa_ctx;
-   if (obj) {
-   u64 wa_ctx_offset = obj->gtt_offset;
-   u32 *wa_ctx_page = &obj->pages[0][0];
-   struct intel_engine_cs *engine = &dev_priv->engine[RCS];
-  

[Intel-gfx] [PATCH 05/38] drm/i915: Compress GPU objects in error state

2016-09-20 Thread Chris Wilson
Our error states are quickly growing, pinning kernel memory with them.
The majority of the space is taken up by the error objects. These
compress well using zlib, and since they are mostly meaningless without a
decoder, compressing them does not hinder a quick scan of the error state.

v2: Make the zlib dependency optional
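
Since compress_page()/compress_fini() below push the pages through ordinary zlib
deflate, a consumer that has recovered the raw bytes should be able to inflate them
with stock userspace zlib. A minimal sketch follows; the buffer handling is
illustrative only and says nothing about how the bytes are embedded in the
error-state text.

#include <stdio.h>
#include <string.h>
#include <zlib.h>

/* Inflate an in-memory zlib stream; stands in for the concatenated
 * pages produced by the kernel-side deflate. */
static int inflate_buffer(const unsigned char *src, size_t src_len,
			  unsigned char *dst, size_t dst_len)
{
	z_stream zs;
	int ret;

	memset(&zs, 0, sizeof(zs));
	if (inflateInit(&zs) != Z_OK)
		return -1;

	zs.next_in = (Bytef *)src;
	zs.avail_in = src_len;
	zs.next_out = dst;
	zs.avail_out = dst_len;

	ret = inflate(&zs, Z_FINISH);
	inflateEnd(&zs);

	return ret == Z_STREAM_END ? (int)(dst_len - zs.avail_out) : -1;
}

int main(void)
{
	/* Round-trip a small buffer so the example is self-contained. */
	unsigned char plain[] = "i915 error object page contents";
	unsigned char packed[128], unpacked[128];
	uLongf packed_len = sizeof(packed);
	int n;

	if (compress(packed, &packed_len, plain, sizeof(plain)) != Z_OK)
		return 1;

	n = inflate_buffer(packed, packed_len, unpacked, sizeof(unpacked));
	printf("inflated %d bytes: %s\n", n, unpacked);
	return 0;
}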

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig  |  12 +++
 drivers/gpu/drm/i915/i915_drv.h   |   3 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 170 +-
 3 files changed, 163 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 0f46a9c04c0e..92ecced1bc8f 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -57,6 +57,18 @@ config DRM_I915_CAPTURE_ERROR
 
  If in doubt, say "Y".
 
+config DRM_I915_COMPRESS_ERROR
+   bool "Compress GPU error state"
+   depends on DRM_I915_CAPTURE_ERROR
+   select ZLIB_DEFLATE
+   default y
+   help
+ This option selects ZLIB_DEFLATE if it isn't already
+ selected and causes any error state captured upon a GPU hang
+ to be compressed using zlib.
+
+ If in doubt, say "Y".
+
 config DRM_I915_USERPTR
bool "Always enable userptr support"
depends on DRM_I915
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3fda31816fdd..dc0cca48ff3d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -806,9 +806,10 @@ struct drm_i915_error_state {
u32 semaphore_mboxes[I915_NUM_ENGINES - 1];
 
struct drm_i915_error_object {
-   int page_count;
u64 gtt_offset;
u64 gtt_size;
+   int page_count;
+   int unused;
u32 *pages[0];
} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index f4cc6746dc72..dcb279035b29 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -29,6 +29,7 @@
 
 #include 
 #include 
+#include <linux/zlib.h>
 #include "i915_drv.h"
 
 #ifdef CONFIG_DRM_I915_CAPTURE_ERROR
@@ -175,6 +176,110 @@ static void i915_error_puts(struct 
drm_i915_error_state_buf *e,
 #define err_printf(e, ...) i915_error_printf(e, __VA_ARGS__)
 #define err_puts(e, s) i915_error_puts(e, s)
 
+#ifdef CONFIG_DRM_I915_COMPRESS_ERROR
+
+static bool compress_init(struct z_stream_s *zstream)
+{
+   memset(zstream, 0, sizeof(*zstream));
+
+   zstream->workspace =
+   kmalloc(zlib_deflate_workspacesize(MAX_WBITS, MAX_MEM_LEVEL),
+   GFP_ATOMIC | __GFP_NOWARN);
+   if (!zstream->workspace)
+   return false;
+
+   if (zlib_deflateInit(zstream, Z_DEFAULT_COMPRESSION) != Z_OK) {
+   kfree(zstream->workspace);
+   return false;
+   }
+
+   return true;
+}
+
+static int compress_page(struct z_stream_s *zstream,
+void *src,
+struct drm_i915_error_object *dst)
+{
+   zstream->next_in = src;
+   zstream->avail_in = PAGE_SIZE;
+
+   do {
+   if (zstream->avail_out == 0) {
+   unsigned long page;
+
+   page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN);
+   if (!page)
+   return -ENOMEM;
+
+   dst->pages[dst->page_count++] = (void *)page;
+
+   zstream->next_out = (void *)page;
+   zstream->avail_out = PAGE_SIZE;
+   }
+
+   if (zlib_deflate(zstream, Z_SYNC_FLUSH) != Z_OK)
+   return -EIO;
+
+   /* Fallback to uncompressed if we increase size? */
+   if (0 && zstream->total_out > zstream->total_in)
+   return -E2BIG;
+   } while (zstream->avail_in);
+
+   return 0;
+}
+
+static void compress_fini(struct z_stream_s *zstream,
+ struct drm_i915_error_object *dst)
+{
+   if (dst) {
+   zlib_deflate(zstream, Z_FINISH);
+   dst->unused = zstream->avail_out;
+   }
+
+   zlib_deflateEnd(zstream);
+   kfree(zstream->workspace);
+}
+
+static void err_compression_marker(struct drm_i915_error_state_buf *m)
+{
+   err_puts(m, ":");
+}
+
+#else
+
+static bool compress_init(struct z_stream_s *zstream)
+{
+   return true;
+}
+
+static int compress_page(struct z_stream_s *zstream,
+void *src,
+struct drm_i915_error_object *dst)
+{
+   unsigned long page;
+
+   page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN);
+   if (!page)
+   return -ENOMEM;
+
+   

Re: [Intel-gfx] [PATCH 0/7] drm/i915: Add pipe scaler for Gen9 in atomic path

2016-09-20 Thread Ville Syrjälä
On Tue, Aug 30, 2016 at 10:30:54AM +0530, Nabendu Maiti wrote:
> The following patch series adds pipe scaler functionality in the atomic path. The
> pipe scaler can be changed dynamically without a modeset. Apart from the default
> panel fitter scaling modes (CENTER/ASPECT/FULLSCREEN), a custom scaling mode is
> added, as there is no such restriction on the Gen9 pipe scaler.

Some quick observations:
- missing any interaction with drm core, so all generic fb size checks
  and whatnot will not work out, I think
- the way it's done goes against what I've suggested in the past. My
  idea has been to add a "fixed mode" property to connectors instead,
  which I think would minimize the impact on the core, and it would
  follow closely the way eDP/LVDS/DSI already works today.
- There's no need to restrict the feature to gen9+ only. It should work
  out just fine for at least all pch platforms. gmch platforms would be
  more challenging
- the pfiter_recalculate thing looks pretty wrong atomic wise

> 
> 
> 
> Nabendu Maiti (7):
>   drm/i915: Add pipe scaler pipe source drm property
>   drm/i915: Add pipe_src size property calculations in atomic path.
>   drm/i915: Panel fitting calculation for GEN9
>   drm/i915: Adding atomic fitting mode property for GEN9
>   drm/i915: Add pipe scaler co-ordinate and size property for Gen9
>   drm/i915: Update pipe-scaler according to destination size
>   drm/i915: Pipescaler destination size limit check on Gen9
> 
>  drivers/gpu/drm/drm_atomic.c |  35 ++
>  drivers/gpu/drm/drm_crtc.c   |  56 +++
>  drivers/gpu/drm/i915/intel_display.c | 128 
> +--
>  drivers/gpu/drm/i915/intel_drv.h |   3 +
>  drivers/gpu/drm/i915/intel_panel.c   |  34 +-
>  include/drm/drm_crtc.h   |  14 
>  include/uapi/drm/drm_mode.h  |   1 +
>  7 files changed, 263 insertions(+), 8 deletions(-)
> 
> -- 
> 1.9.1
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH i-g-t] tests/kms_psr_sink_crc: Assert if we don't have a valid crtc.

2016-09-20 Thread Marius Vlad
In the fixture block for the test we receive a segfault with the
following trace:

0  setup_output (data=) at kms_psr_sink_crc.c:115
1  display_init (data=0x7ffd46833cf0) at kms_psr_sink_crc.c:126
2  main (argc=1, argv=) at kms_psr_sink_crc.c:528

And in setup_output(), output.config is:

$2 = {crtc = 0x0, connector = 0x6ad460, encoder = 0x0, default_mode =
{clock = 361310, hdisplay = 3200, hsync_start = 3248,
hsync_end = 3280, htotal = 3316, hskew = 0, vdisplay = 1800,
vsync_start = 1802, vsync_end = 1807, vtotal = 1816,
vscan = 0, vrefresh = 60, flags = 10, type = 72,
name = "3200x1800", '\000' },
connector_scaling_mode = 0, connector_scaling_mode_changed =
false, pipe_changed = true, atomic_props_connector =
{44, 0}, pipe = -1, valid_crtc_idx_mask = 7}

Also, include some warnings in _kmstest_connector_config()
in case we fail to retrieve the connector or the crtc.

Signed-off-by: Marius Vlad 
CC: Maarten Lankhorst 
CC: Rodrigo Vivi 
CC: Daniel Vetter 

---
 lib/igt_kms.c| 10 +-
 tests/kms_psr_sink_crc.c |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/lib/igt_kms.c b/lib/igt_kms.c
index 960ecbd..32cd5cf 100644
--- a/lib/igt_kms.c
+++ b/lib/igt_kms.c
@@ -862,8 +862,10 @@ static bool _kmstest_connector_config(int drm_fd, uint32_t 
connector_id,
else
connector = drmModeGetConnectorCurrent(drm_fd, connector_id);
 
-   if (!connector)
+   if (!connector) {
+   igt_warn("drmModeGetConnector(Current) failed\n");
goto err2;
+   }
 
if (connector->connector_id != connector_id) {
igt_warn("connector id doesn't match (%d != %d)\n",
@@ -896,6 +898,12 @@ static bool _kmstest_connector_config(int drm_fd, uint32_t 
connector_id,
config->encoder = _kmstest_connector_config_find_encoder(drm_fd, 
connector, config->pipe);
config->crtc = drmModeGetCrtc(drm_fd, resources->crtcs[config->pipe]);
 
+   if (!config->crtc) {
+   igt_warn("drmModeGetCrtc for pipe %s failed\n",
+   kmstest_pipe_name(config->pipe));
+   goto err2;
+   }
+
if (connector->connection != DRM_MODE_CONNECTED)
goto err2;
 
diff --git a/tests/kms_psr_sink_crc.c b/tests/kms_psr_sink_crc.c
index 8aafedb..0890d5b 100644
--- a/tests/kms_psr_sink_crc.c
+++ b/tests/kms_psr_sink_crc.c
@@ -112,6 +112,7 @@ static void setup_output(data_t *data)
continue;
 
igt_output_set_pipe(output, pipe);
+   igt_assert(output->config.crtc != NULL);
data->crtc_id = output->config.crtc->crtc_id;
data->output = output;
data->mode = igt_output_get_mode(output);
-- 
2.5.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH i-g-t] lib/igt_core: Print stacktrace in initialization/fixture blocks.

2016-09-20 Thread Marius Vlad
The likely candidate for this behaviour is the igt_fixture block, as seen in CI when
running tests/kms_psr_sink_crc, which segfaults in its fixture block.

While at it, fix some minor printing bugs.
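
For reference, the libunwind walk that print_backtrace_sig_safe() is built around can
be exercised in a small standalone program. This sketch uses printf() for brevity,
which, unlike the write()-based output in IGT, is not async-signal-safe.

#define UNW_LOCAL_ONLY
#include <libunwind.h>
#include <stdio.h>

/* Walk the current call stack and print one line per frame. */
static void print_backtrace(void)
{
	unw_cursor_t cursor;
	unw_context_t uc;
	int frame = 0;

	unw_getcontext(&uc);
	unw_init_local(&cursor, &uc);

	while (unw_step(&cursor) > 0) {
		char name[256] = "<unknown>";
		unw_word_t off = 0;

		unw_get_proc_name(&cursor, name, sizeof(name), &off);
		printf(" #%d [%s+0x%lx]\n", frame++, name, (unsigned long)off);
	}
}

int main(void)
{
	print_backtrace();	/* build with: cc backtrace.c -lunwind */
	return 0;
}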

Signed-off-by: Marius Vlad 
CC: Chris Wilson 
---
 lib/igt_core.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/igt_core.c b/lib/igt_core.c
index dd27a22..43db468 100644
--- a/lib/igt_core.c
+++ b/lib/igt_core.c
@@ -1122,7 +1122,7 @@ static void
 printnum(unsigned long long num, unsigned base)
 {
int i = 0;
-   unsigned long long __num;
+   unsigned long long __num = num;
 
/* determine from where we should start dividing */
do {
@@ -1264,7 +1264,7 @@ static void print_backtrace_sig_safe(void)
unw_word_t off;
 
if (unw_get_proc_name(&cursor, name, 255, &off) < 0)
-   xstrlcpy(name, "<unknown>", 9);
+   xstrlcpy(name, "<unknown>", 10);
 
xprintf(" #%d [%s+0x%x]\n", stack_num++, name,
(unsigned int) off);
@@ -1740,7 +1740,7 @@ static void fatal_sig_handler(int sig)
igt_assert_eq(write(STDERR_FILENO, ".\n", 2), 2);
}
 
-   if (in_subtest && crash_signal(sig)) {
+   if (crash_signal(sig)) {
/* Linux standard to return exit code as 128 + signal */
if (!failed_one)
igt_exitcode = 128 + sig;
@@ -1749,7 +1749,8 @@ static void fatal_sig_handler(int sig)
 #ifdef HAVE_LIBUNWIND
print_backtrace_sig_safe();
 #endif
-   exit_subtest("CRASH");
+   if (in_subtest)
+   exit_subtest("CRASH");
}
break;
}
-- 
2.5.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Skylake graphics regression: projector failure with 4.8-rc3

2016-09-20 Thread Jani Nikula
On Mon, 19 Sep 2016, James Bottomley  
wrote:
> On Mon, 2016-09-19 at 08:09 -0700, James Bottomley wrote:
>> On Sun, 2016-09-18 at 13:35 +0200, Thorsten Leemhuis wrote:
>> > Hi! James & Paulo: What's the current status of this?
>> 
>> No, the only interaction has been the suggestion below for a revert,
>> which didn't fix the problem.
>> 
>> >  Was this issue discussed elsewhere or even fixed in between? Just 
>> > asking, because this issue is on the list of regressions for 4.8.
>> 
>> 
>> I'm just about to try out -rc7, but it's not fixed so far.
>
> OK, with -rc7 and the i915 fixes, there seems to be a marked
> improvement.  I can no longer crash the crtc by using lots of xrandr
> switches, which was the principal problem.
>
> I've so far only got one of these in the logs
>
> [14858.635035] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR*
> CPU pipe B FIFO underrun
>
> And the only residual problem seems to be that the monitor goes blank
> periodically, but this can be fixed by switching resolution a couple of
> times.
>
> I haven't seen any of the link training errors so far and I've run
> through my usual battery of be nasty to the external monitor tests.

Thanks for the update. We still have fixes for the underruns in the
pipeline.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [i915] monitor is not detected unless it was active during boot

2016-09-20 Thread Jani Nikula
On Mon, 19 Sep 2016, Maarten Maathuis  wrote:
> Any advice on how to proceed?

Please file a bug at [1].

BR,
Jani.


[1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel

-- 
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx