Re: [Intel-gfx] [PATCH] drm/i915: Reset request handling for gen8+

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 12:18:39PM +0100, Tomas Elf wrote:
 My point was more along the lines of bailing out if the reset
 request fails and not return an error message but simply keep track
 of the number of times we've attempted the reset request. By not
 returning an error we would allow more subsequent hang detections to
 happen (since the hang is still there), which would end up in the
 same reset request in the future. If the reset request would fail
 more times we would simply increment the counter and at one point we
 would decide that we've had too many unsuccessful reset request
 attempts and simply go ahead with the reset anyway and if the reset
 would fail we would return an error at that point in time, which
 would result in a terminally wedged state. But, yeah, I can see why
 we shouldn't do this.
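
(For concreteness, the counting scheme described above would look roughly like
the following stand-alone sketch; none of these names come from the driver.)

#include <stdbool.h>

#define MAX_RESET_REQUEST_ATTEMPTS 3

struct engine_state {
	unsigned int failed_reset_requests;
};

/* Returns true when the caller should stop bailing out and force the reset
 * anyway (and only report -EIO if that forced reset also fails). */
static bool too_many_failed_reset_requests(struct engine_state *e)
{
	if (++e->failed_reset_requests < MAX_RESET_REQUEST_ATTEMPTS)
		return false;	/* bail out; hangcheck will fire again later */

	e->failed_reset_requests = 0;
	return true;
}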

Skipping to the middle!

I understand the merit in trying the reset a few times before giving up,
it would just need a bit of restructuring to try the reset before
clearing gem state (trivial) and requeueing the hangcheck. I am just
wary of feature creep before we get stuck into TDR, which promises to
change how we think about resets entirely.

I am trying not to block your work by doing "it would be nice if" tasks
first! :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c

2015-06-18 Thread Dave Gordon
On 17/06/15 13:02, Daniel Vetter wrote:
 On Wed, Jun 17, 2015 at 08:23:40AM +0100, Dave Gordon wrote:
 On 15/06/15 21:09, Chris Wilson wrote:
 On Mon, Jun 15, 2015 at 07:36:19PM +0100, Dave Gordon wrote:
 From: Alex Dai yu@intel.com

 i915_gem_object_write() is a generic function to copy data from a plain
 linear buffer to a paged gem object.

 We will need this for the microcontroller firmware loading support code.

 Issue: VIZ-4884
 Signed-off-by: Alex Dai yu@intel.com
 Signed-off-by: Dave Gordon david.s.gor...@intel.com
 ---
  drivers/gpu/drm/i915/i915_drv.h |2 ++
  drivers/gpu/drm/i915/i915_gem.c |   28 
  2 files changed, 30 insertions(+)

 diff --git a/drivers/gpu/drm/i915/i915_drv.h 
 b/drivers/gpu/drm/i915/i915_drv.h
 index 611fbd8..9094c06 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -2713,6 +2713,8 @@ void *i915_gem_object_alloc(struct drm_device *dev);
  void i915_gem_object_free(struct drm_i915_gem_object *obj);
  void i915_gem_object_init(struct drm_i915_gem_object *obj,
 const struct drm_i915_gem_object_ops *ops);
 +int i915_gem_object_write(struct drm_i915_gem_object *obj,
 +const void *data, size_t size);
  struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
  size_t size);
  void i915_init_vm(struct drm_i915_private *dev_priv,
 diff --git a/drivers/gpu/drm/i915/i915_gem.c 
 b/drivers/gpu/drm/i915/i915_gem.c
 index be35f04..75d63c2 100644
 --- a/drivers/gpu/drm/i915/i915_gem.c
 +++ b/drivers/gpu/drm/i915/i915_gem.c
 @@ -5392,3 +5392,31 @@ bool i915_gem_obj_is_pinned(struct 
 drm_i915_gem_object *obj)
return false;
  }
  
 +/* Fill the @obj with the @size amount of @data */
 +int i915_gem_object_write(struct drm_i915_gem_object *obj,
 +  const void *data, size_t size)
 +{
 +  struct sg_table *sg;
 +  size_t bytes;
 +  int ret;
 +
 +  ret = i915_gem_object_get_pages(obj);
 +  if (ret)
 +  return ret;
 +
 +  i915_gem_object_pin_pages(obj);

 You don't set the object into the CPU domain, or instead manually handle
 the domain flushing. You don't handle objects that cannot be written
 directly by the CPU, nor do you handle objects whose representation in
 memory is not linear.
 -Chris

No, we don't handle just any random gem object, but we do return an error
 code for any types not supported. However, as we don't really need the
 full generality of writing into a gem object of any type, I will replace
 this function with one that combines the allocation of a new object
 (which will therefore definitely be of the correct type, in the correct
 domain, etc) and filling it with the data to be preserved.

The usage pattern for the particular case is going to be:
Once-only:
    Allocate
    Fill
Then each time GuC is (re-)initialised:
    Map to GTT
    DMA-read from buffer into GuC private memory
    Unmap
Only on unload:
    Dispose

So our object is write-once by the CPU (and that's always the first
operation), thereafter read-occasionally by the GuC's DMA engine.

 Domain handling is required for all gem objects, and the resulting bugs if
 you don't for one-off objects are absolutely no fun to track down.
 -Daniel

Is it not the case that the new object returned by
i915_gem_alloc_object() is
(a) of a type that can be mapped into the GTT, and
(b) initially in the CPU domain for both reading and writing?

So AFAICS the allocate-and-fill function I'm describing (to appear in
next patch series respin) doesn't need any further domain handling.
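
For concreteness, a hedged sketch of such an allocate-and-fill helper, built
only from calls visible in this thread plus i915_gem_object_set_to_cpu_domain();
the name is made up here, and the defensive domain flush should be a no-op if
(b) above holds:

/* Hypothetical helper, not the actual respin: allocate a fresh object and
 * fill it in one step, flushing to the CPU domain before writing. */
static struct drm_i915_gem_object *
i915_gem_alloc_and_fill(struct drm_device *dev, const void *data, size_t size)
{
	struct drm_i915_gem_object *obj;
	int ret;

	obj = i915_gem_alloc_object(dev, round_up(size, PAGE_SIZE));
	if (!obj)
		return ERR_PTR(-ENOMEM);

	/* Defensive: a freshly allocated object should already be in the
	 * CPU domain, so this is expected to be a no-op. */
	ret = i915_gem_object_set_to_cpu_domain(obj, true);
	if (ret == 0)
		ret = i915_gem_object_write(obj, data, size);

	if (ret) {
		drm_gem_object_unreference(&obj->base);
		return ERR_PTR(ret);
	}
	return obj;
}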

.Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Fwd: [PATCH] drm/i915: Fix IPS related flicker

2015-06-18 Thread Jani Nikula
On Thu, 18 Jun 2015, Jani Nikula jani.nik...@linux.intel.com wrote:
 On Thu, 18 Jun 2015, Ander Conselvan De Oliveira conselv...@gmail.com wrote:
 On Fri, 2015-06-05 at 12:11 +0300, Ville Syrjälä wrote:
 On Fri, Jun 05, 2015 at 11:51:42AM +0300, Jani Nikula wrote:
  On Thu, 04 Jun 2015, Rodrigo Vivi rodrigo.v...@gmail.com wrote:
   I just noticed that I had forgotten to reply-all...
  
   Jani, would you consider merge this fix with the explanation above
   related to Ville's question?
  
   or do you want/need any action here?
  
  Ville's question, I'd like Ville's ack on it.
 
 It's good enough for me. This part of the driver is quite a mess
 anyway currently, so doesn't matter too much what we stick in there.

 Ping. Seems like this still isn't merged. Does it need more work or did
 it just fall through the cracks?

 It fell between the cracks. I know the world isn't black and white, but
 it doesn't help the maintainers when review is some shade of grey.

 I've pushed this to drm-intel-next-fixes for now, but it has missed the
 train for both the v4.1 release and the main drm-next feature pull
 request for the v4.2 merge window. I expect this to land upstream in
 v4.2-rc2, unless there's an additional drm-next pull request during the
 merge window. I've added cc: stable.

 Thanks for the patch, and I guess the review was, uh, good enough for
 me now... :p

Argh, I'll take that back. This conflicts with dinq, and while doing so
also confuses git rerere enough to uncover a previous much bigger
conflict that I have no intention of resolving again before the
weekend. I'll return to it next week. Sorry.

BR,
Jani.




 BR,
 Jani.



 Thanks,
 Ander

 
  
  BR,
  Jani.
  
  
  
   Thanks,
   Rodrigo.
  
  
   -- Forwarded message --
   From: Rodrigo Vivi rodrigo.v...@gmail.com
   Date: Fri, May 29, 2015 at 9:45 AM
   Subject: Re: [Intel-gfx] [PATCH] drm/i915: Fix IPS related flicker
   To: Ville Syrjälä ville.syrj...@linux.intel.com
  
  
   On Fri, May 29, 2015 at 1:47 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
   On Thu, May 28, 2015 at 11:07:11AM -0700, Rodrigo Vivi wrote:
    We cannot leave IPS enabled with no plane on the pipe:
   
    BSpec: "IPS cannot be enabled until after at least one plane has
    been enabled for at least one vertical blank." and "IPS must be
    disabled while there is still at least one plane enabled on the
    same pipe as IPS." This restriction applies to HSW and BDW.
   
    However, a shortcut path in the update primary plane function
    to make the primary plane invisible by setting DSPCTRL to 0
    was letting IPS stay enabled while there was no
    other plane enabled on the pipe, causing flickering that we
    believed was caused by that other restriction where
    IPS cannot be used when the pixel rate is greater than 95% of cdclk.
  
   v2: Don't mess with Atomic path as pointed out by Ville.
  
   Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85583
   Cc: Ville Syrjälä ville.syrj...@linux.intel.com
   Cc: Paulo Zanoni paulo.r.zan...@intel.com
   Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
   ---
drivers/gpu/drm/i915/intel_display.c | 13 +
drivers/gpu/drm/i915/intel_drv.h |  1 +
2 files changed, 14 insertions(+)
  
   diff --git a/drivers/gpu/drm/i915/intel_display.c 
   b/drivers/gpu/drm/i915/intel_display.c
   index 4e3f302..5a6b17b 100644
   --- a/drivers/gpu/drm/i915/intel_display.c
   +++ b/drivers/gpu/drm/i915/intel_display.c
   @@ -13309,6 +13309,16 @@ intel_check_primary_plane(struct drm_plane 
   *plane,
  intel_crtc->atomic.wait_vblank = true;
 }
  
   + /*
   +  * FIXME: Actually if we will still have any other 
   plane enabled
   +  * on the pipe we could let IPS enabled still, but for
   +  * now lets consider that when we make primary invisible
   +  * by setting DSPCNTR to 0 on update_primary_plane 
   function
   +  * IPS needs to be disable.
   +  */
    + if (!state->visible || !fb)
    + intel_crtc->atomic.disable_ips = true;
   +
  
   How could it be visible without an fb?
  
   I don't like this !fb here as well, but I just tried to keep exactly
   same if statement that makes I915_WRITE(DSPCNTRL, 0) on update primary
   plane func...
  
  
  intel_crtc->atomic.fb_bits |=
  INTEL_FRONTBUFFER_PRIMARY(intel_crtc->pipe);
  
   @@ -13406,6 +13416,9 @@ static void intel_begin_crtc_commit(struct 
   drm_crtc *crtc)
  if (intel_crtc->atomic.disable_fbc)
 intel_fbc_disable(dev);
  
    + if (intel_crtc->atomic.disable_ips)
   + hsw_disable_ips(intel_crtc);
   +
  if (intel_crtc->atomic.pre_disable_primary)
 intel_pre_disable_primary(crtc);
  
   intel_pre_disable_primary() would already disable IPS. Except no one
   sets .pre_disable_primary=true. 

Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support

2015-06-18 Thread Dave Gordon
On 17/06/15 13:05, Daniel Vetter wrote:
 On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
 Current devices may contain one or more programmable microcontrollers
 that need to have a firmware image (aka binary blob) loaded from an
 external medium and transferred to the device's memory.

 This file provides generic support functions for doing this; they can
 then be used by each uC-specific loader, thus reducing code duplication
 and testing effort.

 Signed-off-by: Dave Gordon david.s.gor...@intel.com
 Signed-off-by: Alex Dai yu@intel.com
 
 Given that I'm just shredding the synchronization used by the dmc loader
 I'm not convinced this is a good idea. Abstraction has cost, and a bit of
 copy-paste for similar sounding but slightly different things doesn't
 sound awful to me. And the critical bit in all the firmware loading I've
 seen thus far is in synchronizing the loading with other operations,
 hiding that isn't a good idea. Worse if we enforce stuff like requiring
 dev->struct_mutex.
 -Daniel

It's precisely because it's in some sense trivial-but-tricky that we
should write it once, get it right, and use it everywhere. Copypaste
/does/ sound awful; I've seen how the code this was derived from had
already been cloned into three flavours, all different and all wrong.

It's a very simple abstraction: one early call to kick things off as
early as possible, no locking required. One late call with the
struct_mutex held to complete the synchronisation and actually do the
work, thus guaranteeing that the transfer to the target uC is done in a
controlled fashion, at a time of the caller's choice, and by the
driver's mainline thread, NOT by an asynchronous thread racing with
other activity (which was one of the things wrong with the original
version).

We should convert the DMC loader to use this too, so there need be only
one bit of code in the whole driver that needs to understand how to use
completions to get correct handover from a free-running no-locks-held
thread to the properly disciplined environment of driver mainline for
purposes of programming the h/w.
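
For reference, a hedged sketch of that two-call handover pattern; the struct
layout and names below are illustrative, not the actual intel_uc_loader API:

#include <linux/module.h>
#include <linux/firmware.h>
#include <linux/completion.h>

struct uc_fw {
	const char *path;
	const struct firmware *blob;	/* released elsewhere on unload */
	struct completion fetched;
};

static void uc_fw_fetch_done(const struct firmware *fw, void *context)
{
	struct uc_fw *uc = context;

	uc->blob = fw;			/* may be NULL on failure */
	complete(&uc->fetched);
}

/* Early call: no locks held, just kick off the asynchronous fetch. */
static int uc_fw_init(struct device *dev, struct uc_fw *uc)
{
	init_completion(&uc->fetched);
	return request_firmware_nowait(THIS_MODULE, true, uc->path, dev,
				       GFP_KERNEL, uc, uc_fw_fetch_done);
}

/* Late call: caller holds struct_mutex; wait for the fetch, then do the
 * actual transfer to the uC from the driver's own thread. */
static int uc_fw_finish(struct uc_fw *uc)
{
	wait_for_completion(&uc->fetched);
	if (!uc->blob)
		return -ENOENT;
	/* ... DMA the blob into uC memory here ... */
	return 0;
}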

.Dave.

 ---
  drivers/gpu/drm/i915/Makefile  |3 +
  drivers/gpu/drm/i915/intel_uc_loader.c |  312 
 
  drivers/gpu/drm/i915/intel_uc_loader.h |   82 +
  3 files changed, 397 insertions(+)
  create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c
  create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h

 diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
 index b7ddf48..607fa2a 100644
 --- a/drivers/gpu/drm/i915/Makefile
 +++ b/drivers/gpu/drm/i915/Makefile
 @@ -38,6 +38,9 @@ i915-y += i915_cmd_parser.o \
intel_ringbuffer.o \
intel_uncore.o
  
 +# generic ancillary microcontroller support
 +i915-y += intel_uc_loader.o
 +
  # autogenerated null render state
  i915-y += intel_renderstate_gen6.o \
intel_renderstate_gen7.o \
 diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c 
 b/drivers/gpu/drm/i915/intel_uc_loader.c
 new file mode 100644
 index 000..26f0fbe
 --- /dev/null
 +++ b/drivers/gpu/drm/i915/intel_uc_loader.c
 @@ -0,0 +1,312 @@
 +/*
 + * Copyright © 2014 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the 
 "Software"),
 + * to deal in the Software without restriction, including without limitation
 + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the next
 + * paragraph) shall be included in all copies or substantial portions of the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
 OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
 DEALINGS
 + * IN THE SOFTWARE.
 + *
 + * Author:
 + *  Dave Gordon david.s.gor...@intel.com
 + */
 +#include <linux/firmware.h>
 +#include "i915_drv.h"
 +#include "intel_uc_loader.h"
 +
 +/**
 + * DOC: Generic embedded microcontroller (uC) firmware loading support
 + *
 + * The functions in this file provide a generic way to load the firmware 
 that
 + * may be required by an embedded microcontroller (uC).
 + *
 + * The function intel_uc_fw_init() should be called early, and will initiate
 + * an asynchronous request to fetch the firmware image (aka binary blob).
 + * When the image has been fetched into memory, the kernel will call back to
 + * 

Re: [Intel-gfx] [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 01:14:56PM +0100, john.c.harri...@intel.com wrote:
 From: John Harrison john.c.harri...@intel.com
 
 The plan is to pass requests around as the basic submission tracking structure
 rather than rings and contexts. This patch updates the i915_gem_object_sync()
 code path.
 
 v2: Much more complex patch to share a single request between the sync and the
 page flip. The _sync() function now supports lazy allocation of the request
 structure. That is, if one is passed in then that will be used. If one is not,
 then a request will be allocated and passed back out. Note that the _sync() 
 code
 does not necessarily require a request. Thus one will only be created in
 certain situations. The reason the lazy allocation must be done within the
 _sync() code itself is because the decision to need one or not is not really
 something that code above can second guess (except in the case where one is
 definitely not required because no ring is passed in).
 
 The call chains above _sync() now support passing a request through, with most
 callers passing in NULL and assuming that no request will be required (because
 they also pass in NULL for the ring and therefore can't be generating any ring
 code).
 
 The exception is intel_crtc_page_flip() which now supports having a request
 returned from _sync(). If one is, then that request is shared by the page flip
 (if the page flip is of a type to need a request). If _sync() does not 
 generate
 a request but the page flip does need one, then the page flip path will create
 its own request.
 
 v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
 Elf review request). Rebased onto newer tree that significantly changed the
 synchronisation code.
 
 v4: Updated comments from review feedback (Tomas Elf)
 
 For: VIZ-5115
 Signed-off-by: John Harrison john.c.harri...@intel.com
 Reviewed-by: Tomas Elf tomas@intel.com
 ---
  drivers/gpu/drm/i915/i915_drv.h|4 ++-
  drivers/gpu/drm/i915/i915_gem.c|   48 
 +---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +-
  drivers/gpu/drm/i915/intel_display.c   |   17 +++---
  drivers/gpu/drm/i915/intel_drv.h   |3 +-
  drivers/gpu/drm/i915/intel_fbdev.c |2 +-
  drivers/gpu/drm/i915/intel_lrc.c   |2 +-
  drivers/gpu/drm/i915/intel_overlay.c   |2 +-
  8 files changed, 57 insertions(+), 23 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
 index 64a10fa..f69e9cb 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct 
 drm_i915_gem_object *obj)
  
  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 -  struct intel_engine_cs *to);
 +  struct intel_engine_cs *to,
 +  struct drm_i915_gem_request **to_req);

Nope. Did you forget to reorder the code to ensure that the request is
allocated along with the context switch at the start of execbuf?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Per-DDI I_boost override

2015-06-18 Thread Antti Koskipaa
An OEM may request increased I_boost beyond the recommended values
by specifying an I_boost value to be applied to all swing entries for
a port. These override values are specified in VBT.

Issue: VIZ-5676
Signed-off-by: Antti Koskipaa antti.koski...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_drv.h   |  3 +++
 drivers/gpu/drm/i915/intel_bios.c | 21 +
 drivers/gpu/drm/i915/intel_bios.h |  9 +
 drivers/gpu/drm/i915/intel_ddi.c  | 39 +++
 4 files changed, 64 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 09a57a5..e17fd56 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1402,6 +1402,9 @@ struct ddi_vbt_port_info {
uint8_t supports_dvi:1;
uint8_t supports_hdmi:1;
uint8_t supports_dp:1;
+
+   uint8_t dp_boost_level;
+   uint8_t hdmi_boost_level;
 };
 
 enum psr_lines_to_wait {
diff --git a/drivers/gpu/drm/i915/intel_bios.c 
b/drivers/gpu/drm/i915/intel_bios.c
index 198fc3c..06b5dc3 100644
--- a/drivers/gpu/drm/i915/intel_bios.c
+++ b/drivers/gpu/drm/i915/intel_bios.c
@@ -946,6 +946,17 @@ err:
memset(dev_priv->vbt.dsi.sequence, 0, 
sizeof(dev_priv->vbt.dsi.sequence));
 }
 
+static u8 translate_iboost(u8 val)
+{
+   static const u8 mapping[] = { 1, 3, 7 }; /* See VBT spec */
+
+   if (val >= ARRAY_SIZE(mapping)) {
+   DRM_DEBUG_KMS("Unsupported I_boost value found in VBT (%d), 
display may not work properly\n", val);
+   return 0;
+   }
+   return mapping[val];
+}
+
 static void parse_ddi_port(struct drm_i915_private *dev_priv, enum port port,
   const struct bdb_header *bdb)
 {
@@ -1046,6 +1057,16 @@ static void parse_ddi_port(struct drm_i915_private 
*dev_priv, enum port port,
  hdmi_level_shift);
info->hdmi_level_shift = hdmi_level_shift;
}
+
+   /* Parse the I_boost config for SKL and above */
+   if (bdb->version >= 196 && (child->common.flags_1 & IBOOST_ENABLE)) {
+   info->dp_boost_level = 
translate_iboost(child->common.iboost_level & 0xF);
+   DRM_DEBUG_KMS("VBT (e)DP boost level for port %c: %d\n",
+ port_name(port), info->dp_boost_level);
+   info->hdmi_boost_level = 
translate_iboost(child->common.iboost_level >> 4);
+   DRM_DEBUG_KMS("VBT HDMI boost level for port %c: %d\n",
+ port_name(port), info->hdmi_boost_level);
+   }
 }
 
 static void parse_ddi_ports(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_bios.h 
b/drivers/gpu/drm/i915/intel_bios.h
index af0b476..8edd75c 100644
--- a/drivers/gpu/drm/i915/intel_bios.h
+++ b/drivers/gpu/drm/i915/intel_bios.h
@@ -231,6 +231,10 @@ struct old_child_dev_config {
 /* This one contains field offsets that are known to be common for all BDB
  * versions. Notice that the meaning of the contents contents may still change,
  * but at least the offsets are consistent. */
+
+/* Definitions for flags_1 */
#define IBOOST_ENABLE (1<<3)
+
 struct common_child_dev_config {
u16 handle;
u16 device_type;
@@ -239,8 +243,13 @@ struct common_child_dev_config {
u8 not_common2[2];
u8 ddc_pin;
u16 edid_ptr;
+   u8 obsolete;
+   u8 flags_1;
+   u8 not_common3[13];
+   u8 iboost_level;
 } __packed;
 
+
 /* This field changes depending on the BDB version, so the most reliable way to
  * read it is by checking the BDB version and reading the raw pointer. */
 union child_device_config {
diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 3abcb43..8e5e94c 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -434,6 +434,7 @@ static void intel_prepare_ddi_buffers(struct drm_device 
*dev, enum port port,
 {
struct drm_i915_private *dev_priv = dev->dev_private;
u32 reg;
+   u32 iboost_bit = 0;
int i, n_hdmi_entries, n_dp_entries, n_edp_entries, hdmi_default_entry,
size;
int hdmi_level = dev_priv->vbt.ddi_port_info[port].hdmi_level_shift;
@@ -459,6 +460,10 @@ static void intel_prepare_ddi_buffers(struct drm_device 
*dev, enum port port,
ddi_translations_hdmi =
skl_get_buf_trans_hdmi(dev, n_hdmi_entries);
hdmi_default_entry = 8;
+   /* If we're boosting the current, set bit 31 of trans1 */
+   if (dev_priv->vbt.ddi_port_info[port].hdmi_boost_level ||
+   dev_priv->vbt.ddi_port_info[port].dp_boost_level)
+   iboost_bit = 1<<31;
} else if (IS_BROADWELL(dev)) {
ddi_translations_fdi = bdw_ddi_translations_fdi;
ddi_translations_dp = bdw_ddi_translations_dp;
@@ -519,7 +524,7 @@ static void intel_prepare_ddi_buffers(struct drm_device 
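
As an aside, the VBT I_boost decoding above can be restated as a stand-alone
example: the low nibble of iboost_level selects the (e)DP boost, the high
nibble the HDMI boost, and the value indexes the { 1, 3, 7 } table from the
VBT spec (the example byte below is made up):

#include <stdio.h>

static unsigned int translate_iboost(unsigned char val)
{
	static const unsigned char mapping[] = { 1, 3, 7 }; /* See VBT spec */

	/* out-of-range values fall back to 0, i.e. no override */
	return val < sizeof(mapping) ? mapping[val] : 0;
}

int main(void)
{
	unsigned char iboost_level = 0x21;	/* example VBT byte */

	printf("dp boost:   %u\n", translate_iboost(iboost_level & 0xf));  /* 3 */
	printf("hdmi boost: %u\n", translate_iboost(iboost_level >> 4));   /* 7 */
	return 0;
}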

[Intel-gfx] [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure

2015-06-18 Thread John . C . Harrison
From: John Harrison john.c.harri...@intel.com

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the i915_gem_object_sync()
code path.

v2: Much more complex patch to share a single request between the sync and the
page flip. The _sync() function now supports lazy allocation of the request
structure. That is, if one is passed in then that will be used. If one is not,
then a request will be allocated and passed back out. Note that the _sync() code
does not necessarily require a request. Thus one will only be created in
certain situations. The reason the lazy allocation must be done within the
_sync() code itself is because the decision to need one or not is not really
something that code above can second guess (except in the case where one is
definitely not required because no ring is passed in).

The call chains above _sync() now support passing a request through, with most
callers passing in NULL and assuming that no request will be required (because
they also pass in NULL for the ring and therefore can't be generating any ring
code).

The exception is intel_crtc_page_flip() which now supports having a request
returned from _sync(). If one is, then that request is shared by the page flip
(if the page flip is of a type to need a request). If _sync() does not generate
a request but the page flip does need one, then the page flip path will create
its own request.

v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
Elf review request). Rebased onto newer tree that significantly changed the
synchronisation code.

v4: Updated comments from review feedback (Tomas Elf)

For: VIZ-5115
Signed-off-by: John Harrison john.c.harri...@intel.com
Reviewed-by: Tomas Elf tomas@intel.com
---
 drivers/gpu/drm/i915/i915_drv.h|4 ++-
 drivers/gpu/drm/i915/i915_gem.c|   48 +---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +-
 drivers/gpu/drm/i915/intel_display.c   |   17 +++---
 drivers/gpu/drm/i915/intel_drv.h   |3 +-
 drivers/gpu/drm/i915/intel_fbdev.c |2 +-
 drivers/gpu/drm/i915/intel_lrc.c   |2 +-
 drivers/gpu/drm/i915/intel_overlay.c   |2 +-
 8 files changed, 57 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 64a10fa..f69e9cb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct 
drm_i915_gem_object *obj)
 
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
-struct intel_engine_cs *to);
+struct intel_engine_cs *to,
+struct drm_i915_gem_request **to_req);
 void i915_vma_move_to_active(struct i915_vma *vma,
 struct intel_engine_cs *ring);
 int i915_gem_dumb_create(struct drm_file *file_priv,
@@ -2889,6 +2890,7 @@ int __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 u32 alignment,
 struct intel_engine_cs *pipelined,
+struct drm_i915_gem_request 
**pipelined_request,
 const struct i915_ggtt_view *view);
 void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj,
  const struct i915_ggtt_view 
*view);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e59369a..d7c7127 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3095,25 +3095,26 @@ out:
 static int
 __i915_gem_object_sync(struct drm_i915_gem_object *obj,
   struct intel_engine_cs *to,
-  struct drm_i915_gem_request *req)
+  struct drm_i915_gem_request *from_req,
+  struct drm_i915_gem_request **to_req)
 {
struct intel_engine_cs *from;
int ret;
 
-   from = i915_gem_request_get_ring(req);
+   from = i915_gem_request_get_ring(from_req);
if (to == from)
return 0;
 
-   if (i915_gem_request_completed(req, true))
+   if (i915_gem_request_completed(from_req, true))
return 0;
 
-   ret = i915_gem_check_olr(req);
+   ret = i915_gem_check_olr(from_req);
if (ret)
return ret;
 
if (!i915_semaphore_is_enabled(obj->base.dev)) {
struct drm_i915_private *i915 = to_i915(obj->base.dev);
-   ret = __i915_wait_request(req,
+   ret = __i915_wait_request(from_req,
  
atomic_read(i915->gpu_error.reset_counter),
  

Re: [Intel-gfx] [PATCH v5] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 01:29:45PM +0100, Peter Antoine wrote:
 @@ -1379,6 +1380,13 @@ static int gen8_init_rcs_context(struct 
 intel_engine_cs *ring,
   if (ret)
   return ret;
  
 + /*
 +  * Failing to program the MOCS is non-fatal. The system will not
 +  * run at peak performance. So generate a warning and carry on.
 +  */
 + if (intel_rcs_context_init_mocs(ring, ctx) != 0)
 + DRM_ERROR("MOCS failed to program: expect performance issues.");

You said to expect display corruption as well if this failed.
Fortunately, if this fails, we have severe driver issues...

 +/**
 + * emit_mocs_l3cc_table() - emit the mocs control table
 + * @ringbuf: DRM device.
 + * @table:   The values to program into the control regs.
 + *
 + * This function simply emits a MI_LOAD_REGISTER_IMM command for the
 + * given table starting at the given address. This register set is  
 programmed
 + * in pairs.
 + *
 + * Return: Nothing.
 + */
 +static void emit_mocs_l3cc_table(struct intel_ringbuffer *ringbuf,
 +  struct drm_i915_mocs_table *table) {
 + unsigned int count;
 + unsigned int i;
 + u32 value;
 + u32 filler = (table->table[0].l3cc_value & 0xffff) |
 + ((table->table[0].l3cc_value & 0xffff) << 16);

l3cc_value is only u16, & 0xffff is just noise, and without it you don't need
the parentheses.

 +int intel_rcs_context_init_mocs(struct intel_engine_cs *ring,
 + struct intel_context *ctx)
 +{
 + int ret = 0;
 +
 + struct drm_i915_mocs_table t;
 + struct drm_device *dev = ring->dev;
 + struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
 +
 + if (get_mocs_settings(dev, &t)) {
 + u32 table_size;
 +
 + /*
 +  * OK. For each supported ring:
 +  *  number of mocs entries * 2 dwords for each control_value
 +  *  plus number of mocs entries /2 dwords for l3cc values.
 +  *
 +  *  Plus 1 for the load command and 1 for the NOOP per ring
 +  *  and the l3cc programming.
 *
 * With 5 rings and 63 mocs entries, this gives 715
 * dwords.
 +  */

 + table_size = GEN9_NUM_MOCS_RINGS *
 + ((2 * GEN9_NUM_MOCS_ENTRIES) + 2) +
 + GEN9_NUM_MOCS_ENTRIES + 2;

If you pushed the ring_begin into each function, not only would it be
easier to verify, you then don't need an explanation that starts with
"This looks like a mistake". Validation of ring_begin/ring_advance is by
review, so it has to be easy to review.
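
A hedged sketch of that restructuring, with each emit function claiming and
advancing its own ring space so the dword count is verifiable locally (the
table fields, the added ctx argument, and the intel_logical_ring_* usage are
assumed from the quoted code, not verified against the real patch):

/* Sketch of a restructured emit_mocs_control_table(): begin/advance locally.
 * Dword budget: 1 LRI header + 2 per entry + 1 trailing NOOP. */
static int emit_mocs_control_table(struct intel_ringbuffer *ringbuf,
				   struct intel_context *ctx,
				   const struct drm_i915_mocs_table *table,
				   u32 mocs_base)
{
	unsigned int i;
	int ret;

	ret = intel_logical_ring_begin(ringbuf, ctx, 2 * table->size + 2);
	if (ret)
		return ret;

	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(table->size));
	for (i = 0; i < table->size; i++) {
		intel_logical_ring_emit(ringbuf, mocs_base + i * 4);
		intel_logical_ring_emit(ringbuf, table->table[i].control_value);
	}
	intel_logical_ring_emit(ringbuf, MI_NOOP);
	intel_logical_ring_advance(ringbuf);

	return 0;
}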

 + ret = intel_logical_ring_begin(ringbuf, ctx, table_size);
 + if (ret) {
 + DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
 + return ret;
 + }


-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 1:21 PM, John Harrison
john.c.harri...@intel.com wrote:
 I'm still confused by what you are saying in the above referenced email.
 Part of it is about the sanity checks failing to handle the wrapping case
 correctly which has been fixed in the base reserve space patch (patch 2 in
 the series). The rest is either saying that you think we are potentially
 wrapping too early and wasting a few bytes of the ring buffer or that
 something is actually broken?

Yeah I didn't realize that this change was meant to fix the
ring->reserved_tail check since I didn't make that connection. It is
correct with that change, but the problem I see is that the
correctness of that debug aid isn't assured locally: now we need both
that check _and_ the correct handling of the reservation tracking at
wrap-around. If the check just handles wrapping it'll robustly stay in
working shape even when the wrapping behaviour changes.

 Point 2: 100 bytes of reserve, 160 bytes of execbuf and 200 bytes remaining.
 You seem to think this will fail somehow? Why? The wait_for_space(160) in
 the execbuf code will cause a wrap because the 100 bytes for the
 add_request reservation is added on and the wait is actually being done for
 260 bytes. So yes, we wrap earlier than would otherwise have been necessary
 but that is the only way to absolutely guarantee that the add_request() call
 cannot fail when trying to do the wrap itself.

There's no problem except that it's wasteful. And I tried to explain
that no, unconditionally force-wrapping for the entire reservation is
actually not needed, since the additional space needed to account for
the eventual wrapping is bounded by a factor of 2. It's much less in
practice since we split up the final request bits into multiple
smaller intel_ring_begin. And it feels a bit wasteful to throw that
space away (and make the gpu eat through MI_NOP) just because it makes
caring for the worst-case harder. And with GuC the 160 dwords is
actually a fairly substantial part of the ring.

Even more so when we completely switch to a transaction model for
request, where we only need to wrap for individual commands and hence
could place intel_ring_begin per-cmd (which is mostly what we do
already anyway).

 As Chris says, if the driver is attempting to create a single request that
 fills the entire ringbuffer then that is a bug that should be caught as soon
 as possible. Even with a GuC, the ring buffer is not small compared to the
 size of requests the driver currently produces. Part of the scheduler work
 is to limit the number of batch buffers that a given application/context can
 have outstanding in the ring buffer at any given time in order to prevent
 starvation of the rest of the system by one badly behaved app. Thus
 completely filling a large ring buffer becomes impossible anyway - the
 application will be blocked before it gets that far.

My proposal for this reservation wrapping business would have been:
- Increase the reservation by 31 dwords (to account for the worst-case
wrap in pc_render_add_request).
- Rework the reservation overflow WARN_ON in reserve_space_end to work
correctly even when wrapping while the reservation has been in use.
- Move the addition of reserved_space below the point where we wrap
the ring and only check against total free space, neglecting wrapping.
- Remove all other complications you've added.

Result is no forced wrapping for reservation and a debug check which
should even survive random changes by monkeys since the logic for that
check is fully contained within reserve_space_end. And for the check
we should be able to reuse __intel_free_space.

If I'm reading things correctly this shouldn't have any effect outside
of patch 2 and shouldn't cause any conflicts.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Fwd: [PATCH] drm/i915: Fix IPS related flicker

2015-06-18 Thread Jani Nikula
On Thu, 18 Jun 2015, Ander Conselvan De Oliveira conselv...@gmail.com wrote:
 On Fri, 2015-06-05 at 12:11 +0300, Ville Syrjälä wrote:
 On Fri, Jun 05, 2015 at 11:51:42AM +0300, Jani Nikula wrote:
  On Thu, 04 Jun 2015, Rodrigo Vivi rodrigo.v...@gmail.com wrote:
   I just noticed that I had forgotten to reply-all...
  
   Jani, would you consider merge this fix with the explanation above
   related to Ville's question?
  
   or do you want/need any action here?
  
  Ville's question, I'd like Ville's ack on it.
 
 It's good enough for me. This part of the driver is quite a mess
 anyway currently, so doesn't matter too much what we stick in there.

 Ping. Seems like this still isn't merged. Does it need more work or did
 it just fall through the cracks?

It fell between the cracks. I know the world isn't black and white, but
it doesn't help the maintainers when review is some shade of grey.

I've pushed this to drm-intel-next-fixes for now, but it has missed the
train for both the v4.1 release and the main drm-next feature pull
request for the v4.2 merge window. I expect this to land upstream in
v4.2-rc2, unless there's an additional drm-next pull request during the
merge window. I've added cc: stable.

Thanks for the patch, and I guess the review was, uh, good enough for
me now... :p

BR,
Jani.



 Thanks,
 Ander

 
  
  BR,
  Jani.
  
  
  
   Thanks,
   Rodrigo.
  
  
   -- Forwarded message --
   From: Rodrigo Vivi rodrigo.v...@gmail.com
   Date: Fri, May 29, 2015 at 9:45 AM
   Subject: Re: [Intel-gfx] [PATCH] drm/i915: Fix IPS related flicker
   To: Ville Syrjälä ville.syrj...@linux.intel.com
  
  
   On Fri, May 29, 2015 at 1:47 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
   On Thu, May 28, 2015 at 11:07:11AM -0700, Rodrigo Vivi wrote:
    We cannot leave IPS enabled with no plane on the pipe:
   
    BSpec: "IPS cannot be enabled until after at least one plane has
    been enabled for at least one vertical blank." and "IPS must be
    disabled while there is still at least one plane enabled on the
    same pipe as IPS." This restriction applies to HSW and BDW.
   
    However, a shortcut path in the update primary plane function
    to make the primary plane invisible by setting DSPCTRL to 0
    was letting IPS stay enabled while there was no
    other plane enabled on the pipe, causing flickering that we
    believed was caused by that other restriction where
    IPS cannot be used when the pixel rate is greater than 95% of cdclk.
  
   v2: Don't mess with Atomic path as pointed out by Ville.
  
   Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85583
   Cc: Ville Syrjälä ville.syrj...@linux.intel.com
   Cc: Paulo Zanoni paulo.r.zan...@intel.com
   Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
   ---
drivers/gpu/drm/i915/intel_display.c | 13 +
drivers/gpu/drm/i915/intel_drv.h |  1 +
2 files changed, 14 insertions(+)
  
   diff --git a/drivers/gpu/drm/i915/intel_display.c 
   b/drivers/gpu/drm/i915/intel_display.c
   index 4e3f302..5a6b17b 100644
   --- a/drivers/gpu/drm/i915/intel_display.c
   +++ b/drivers/gpu/drm/i915/intel_display.c
   @@ -13309,6 +13309,16 @@ intel_check_primary_plane(struct drm_plane 
   *plane,
  intel_crtc->atomic.wait_vblank = true;
 }
  
   + /*
   +  * FIXME: Actually if we will still have any other plane 
   enabled
   +  * on the pipe we could let IPS enabled still, but for
   +  * now lets consider that when we make primary invisible
   +  * by setting DSPCNTR to 0 on update_primary_plane 
   function
   +  * IPS needs to be disable.
   +  */
    + if (!state->visible || !fb)
    + intel_crtc->atomic.disable_ips = true;
   +
  
   How could it be visible without an fb?
  
   I don't like this !fb here as well, but I just tried to keep exactly
   same if statement that makes I915_WRITE(DSPCNTRL, 0) on update primary
   plane func...
  
  
  intel_crtc->atomic.fb_bits |=
  INTEL_FRONTBUFFER_PRIMARY(intel_crtc->pipe);
  
   @@ -13406,6 +13416,9 @@ static void intel_begin_crtc_commit(struct 
   drm_crtc *crtc)
  if (intel_crtc->atomic.disable_fbc)
 intel_fbc_disable(dev);
  
    + if (intel_crtc->atomic.disable_ips)
   + hsw_disable_ips(intel_crtc);
   +
  if (intel_crtc->atomic.pre_disable_primary)
 intel_pre_disable_primary(crtc);
  
   intel_pre_disable_primary() would already disable IPS. Except no one
   sets .pre_disable_primary=true. OTOH that thing mostly seems to do
   stuff that has nothing to do with the primary plane (cxsr disable,
   fifo underrun reporting disable on gen2), so I don't think we want
   to use that.
  
   In any case we should really have the IPS state as part of the crtc
   state. These global disable_foo things should just be killed IMO.
   Hmm, 

Re: [Intel-gfx] [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation

2015-06-18 Thread John Harrison

On 17/06/2015 16:52, Chris Wilson wrote:

On Wed, Jun 17, 2015 at 04:54:42PM +0200, Daniel Vetter wrote:

On Wed, Jun 17, 2015 at 03:27:08PM +0100, Chris Wilson wrote:

On Wed, Jun 17, 2015 at 03:31:59PM +0200, Daniel Vetter wrote:

On Fri, May 29, 2015 at 05:44:09PM +0100, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

Now that the *_ring_begin() functions no longer call the request allocation
code, it is finally safe for the request allocation code to call *_ring_begin().
This is important to guarantee that the space reserved for the subsequent
i915_add_request() call does actually get reserved.

v2: Renamed functions according to review feedback (Tomas Elf).

For: VIZ-5115
Signed-off-by: John Harrison john.c.harri...@intel.com

Still has my question open from the previos round:

http://mid.gmane.org/20150323091030.GL1349@phenom.ffwll.local

Note that this isn't all that unlikely with GuC mode since there the
ringbuffer is substantially smaller (due to firmware limitations) than
what we allocate ourselves right now.

Looking at this patch, I am still fundamentally opposed to reserving
space for the request. Detecting a request that wraps and cancelling
that request (after the appropriate WARN for the overlow) is trivial and
such a rare case (as it is a programming error) that it should only be
handled in the slow path.

I thought the entire point here that we don't have request half-committed
because the final request ringcmds didn't fit in. And that does require
that we reserve a bit of space for that postamble.

I guess if it's too much (atm it's super-pessimistic due to ilk) we can
make per-platform reservation limits to be really minimal.

Maybe we could go towards a rollback model longterm of rewingind the
ringbuffer. But if there's no clear need I'd like to avoid that
complexity.

Even if you didn't like the rollback model which helps handling the
partial state from context switches and what not, if you run out of
ringspace you can set the GPU as wedged. Issuing a request that fills
the entire ringbuffer is a programming bug that needs to be caught very
early in development.
-Chris



I'm still confused by what you are saying in the above referenced email. 
Part of it is about the sanity checks failing to handle the wrapping 
case correctly which has been fixed in the base reserve space patch 
(patch 2 in the series). The rest is either saying that you think we are 
potentially wrapping too early and wasting a few bytes of the ring 
buffer or that something is actually broken?


Point 2: 100 bytes of reserve, 160 bytes of execbuf and 200 bytes 
remaining. You seem to think this will fail somehow? Why? The 
wait_for_space(160) in the execbuf code will cause a wrap because the 
100 bytes for the add_request reservation is added on and the wait 
is actually being done for 260 bytes. So yes, we wrap earlier than would 
otherwise have been necessary but that is the only way to absolutely 
guarantee that the add_request() call cannot fail when trying to do the 
wrap itself.
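
To make that arithmetic concrete, a small stand-alone sketch of the wrap
decision being described (illustrative only, not driver code):

#include <assert.h>
#include <stdbool.h>

struct ring {
	int size, tail, reserved;
};

static int space_to_end(const struct ring *r)
{
	return r->size - r->tail;
}

static bool must_wrap(const struct ring *r, int request_bytes)
{
	/* wait_for_space() asks for request + reservation, so the wrap
	 * decision is taken against the padded size. */
	return request_bytes + r->reserved > space_to_end(r);
}

int main(void)
{
	/* 100 bytes of reserve, 200 bytes remaining before the wrap point. */
	struct ring r = { .size = 4096, .tail = 4096 - 200, .reserved = 100 };

	/* A 160-byte execbuf fits on its own, but the +100 reservation makes
	 * the wait effectively 260 bytes, forcing an early wrap. */
	assert(must_wrap(&r, 160));
	assert(!must_wrap(&r, 60));
	return 0;
}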


As Chris says, if the driver is attempting to create a single request 
that fills the entire ringbuffer then that is a bug that should be 
caught as soon as possible. Even with a GuC, the ring buffer is not 
small compared to the size of requests the driver currently produces. 
Part of the scheduler work is to limit the number of batch buffers that 
a given application/context can have outstanding in the ring buffer at 
any given time in order to prevent starvation of the rest of the system 
by one badly behaved app. Thus completely filling a large ring buffer 
becomes impossible anyway - the application will be blocked before it 
gets that far.


Note that with the removal of the OLR, all requests now have a definite 
start and a definite end. Thus the scheme could be extended to provide 
rollback of the ring buffer. Each new request takes a note of the ring 
pointers at creation time. If the request is cancelled it can reset the 
pointers to where they were before. Thus all half submitted work is 
discarded. That is a much bigger semantic change however, so I would 
really like to get the bare minimum anti-OLR patch set in first before 
trying to do fancy extra features.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 12:03:12PM +0100, John Harrison wrote:
 On 17/06/2015 15:21, Chris Wilson wrote:
 On Wed, Jun 17, 2015 at 04:06:05PM +0200, Daniel Vetter wrote:
 On Fri, May 29, 2015 at 05:44:16PM +0100, john.c.harri...@intel.com wrote:
 From: John Harrison john.c.harri...@intel.com
 
 The i915_gem_object_flush_active() call used to do lots. Over time it has 
 done
 less and less. Now all it does is check the various associated requests to 
 see if
 they can be retired. Hence this patch renames the function and updates the
 comments around it to match the current operation.
 
 For: VIZ-5115
 Signed-off-by: John Harrison john.c.harri...@intel.com
 When rebasing patches and especially like here when also renaming them a
 bit please leave some indication of what you've changed. Took me a while
 to figure out where one of my pending comments from the previous round
 went too.
 
 And please don't just "v2: rebase", but please add some indicators against
 what it conflicted if it's obvious.
 This function doesn't do an unconditional retire - the new name is much
 worse since it is inconsistent with how requests retire. In my make GEM
 umpteen times faster patches, I repurposed this function for reporting
 the object's current activeness and called it bool i915_gem_object_active()
   - though that is probably better as i915_gem_object_is_active().
 -Chris
 
 
 Retiring is generally not an unconditional operation.

In the code, I use object_retire to perform the retiring operation on
that object. I can rename i915_gem_retire_requests if that makes you
happier, but I don't think it needs to since retire_requests does not
imply to me that all requests are retired, just some indefinite value
(though positive indefinite at least!).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.

2015-06-18 Thread Dave Gordon
On 16/06/15 14:54, Chris Wilson wrote:
 On Tue, Jun 16, 2015 at 03:48:09PM +0200, Daniel Vetter wrote:
 On Mon, Jun 08, 2015 at 06:33:59PM +0100, Chris Wilson wrote:
 On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
 In preparation for per-engine reset add a way for setting context reset 
 stats.

 OPEN QUESTIONS:
 1. How do we deal with get_reset_stats and the GL robustness interface when
 introducing per-engine resets?

a. Do we set context that cause per-engine resets as guilty? If so, how
does this affect context banning?

 Yes. If the reset works quicker, then we can set a higher threshold for
 DoS detection, but we still do need DoS detection?
  
b. Do we extend the publically available reset stats to also contain
per-engine reset statistics? If so, would this break the ABI?

 No. The get_reset_stats is targetted at the GL API and describing it in
 terms of whether my context is guilty or has been affected. That is
 orthogonal to whether the reset was on a single ring or the entire GPU -
 the question is how broad do want the affected to be. Ideally a
 per-context reset wouldn't necessarily impact others, except for the
 surfaces shared between them...

 gl computes sharing sets itself, the kernel only tells it whether a given
 context has been victimized, i.e. one of its batches was not properly
 executed due to reset after a hang.
 
 So you don't think we should delete all pending requests that depend
 upon state from the hung request?
 -Chris

John Harrison & I discussed this yesterday; he's against doing so (even
though the scheduler is ideally placed to do it, if that were actually
the preferred policy). The primary argument (as I see it) is that you
actually don't and can't know the nature of an apparent dependency
between batches that share a buffer object. There are at least three cases:

1. tightly-coupled: the dependent batch is going to rely on data
produced by the earlier batch. In this case, GIGO applies and the
results will be undefined, possibly including a further hang. Subsequent
batches presumably belong to the same or a closely-related
(co-operating) task, and killing them might be a reasonable strategy here.

2. loosely-coupled: the dependent batch is going to access the data,
but not in any way that depends on the content (for example, blitting a
rectangle into a composition buffer). The result will be wrong, but only
in a limited way (e.g. window belonging to the faulty application will
appear corrupted). The dependent batches may well belong to unrelated
system tasks (e.g. X or surfaceflinger) and killing them is probably not
justified.

3. uncoupled: the dependent batch wants the /buffer/, not the data in
it (most likely a framebuffer or similar object). Any incorrect data in
the buffer is irrelevant. Killing off subsequent batches would be wrong.

Buffer access mode (readonly, read/write, writeonly) might allow us to
distinguish these somewhat, but probably not enough to help make the
right decision. So the default must be *not* to kill off dependants
automatically, but if the failure does propagate in such a way as to
cause further consequent hangs, then the context-banning mechanism
should eventually catch and block all the downstream effects.

.Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active()

2015-06-18 Thread John Harrison

On 18/06/2015 12:10, Chris Wilson wrote:

On Thu, Jun 18, 2015 at 12:03:12PM +0100, John Harrison wrote:

On 17/06/2015 15:21, Chris Wilson wrote:

On Wed, Jun 17, 2015 at 04:06:05PM +0200, Daniel Vetter wrote:

On Fri, May 29, 2015 at 05:44:16PM +0100, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

The i915_gem_object_flush_active() call used to do lots. Over time it has done
less and less. Now all it does is check the various associated requests to see if
they can be retired. Hence this patch renames the function and updates the
comments around it to match the current operation.

For: VIZ-5115
Signed-off-by: John Harrison john.c.harri...@intel.com

When rebasing patches and especially like here when also renaming them a
bit please leave some indication of what you've changed. Took me a while
to figure out where one of my pending comments from the previous round
went too.

And please don't just "v2: rebase", but please add some indicators against
what it conflicted if it's obvious.

This function doesn't do an unconditional retire - the new name is much
worse since it is inconsistent with how requests retire. In my make GEM
umpteen times faster patches, I repurposed this function for reporting
the object's current activeness and called it bool i915_gem_object_active()
  - though that is probably better as i915_gem_object_is_active().
-Chris


Retiring is generally not an unconditional operation.

In the code, I use object_retire to perform the retiring operation on
that object. I can rename i915_gem_retire_requests if that makes you
happier, but I don't think it needs to since retire_requests does not
imply to me that all requests are retired, just some indefinite value
(though positive indefinite at least!).
-Chris



Fair enough. I guess I'm still thinking of the driver as it was when I 
first wrote the patch series which was before your re-write for 
read/read optimisations. Like I said, the exact new name isn't as 
important as at least giving it a new name. The old name is definitely 
not valid any more. Feel free to suggest something better.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread Damien Lespiau
On Thu, Jun 18, 2015 at 01:29:45PM +0100, Peter Antoine wrote:
 @@ -1379,6 +1380,13 @@ static int gen8_init_rcs_context(struct 
 intel_engine_cs *ring,
   if (ret)
   return ret;
  
 + /*
 +  * Failing to program the MOCS is non-fatal. The system will not
 +  * run at peak performance. So generate a warning and carry on.
 +  */
 + if (intel_rcs_context_init_mocs(ring, ctx) != 0)
 + DRM_ERROR("MOCS failed to program: expect performance issues.");
 +

Missing a '\n'.

 +static const struct drm_i915_mocs_entry skylake_mocs_table[] = {
 +  /* {0x0009, 0x0010} */
 + {(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC_ELLC) |
 + MOCS_LRUM(0) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
 + MOC_PFM(0) | MOCS_SCF(0)),
 + (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
 +  /* {0x003b, 0x0030} */

We're still missing the usage hints for those configuration entries.
That'd help user space a lot, which would also make this patch land
quicker.

 +int intel_rcs_context_init_mocs(struct intel_engine_cs *ring,
 + struct intel_context *ctx)
 +{
 + int ret = 0;
 +
 + struct drm_i915_mocs_table t;
 + struct drm_device *dev = ring->dev;
 + struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
 +
 + if (get_mocs_settings(dev, &t)) {
 + u32 table_size;
 +
 + /*
 +  * OK. For each supported ring:
 +  *  number of mocs entries * 2 dwords for each control_value
 +  *  plus number of mocs entries /2 dwords for l3cc values.
 +  *
 +  *  Plus 1 for the load command and 1 for the NOOP per ring
 +  *  and the l3cc programming.
 +  */
 + table_size = GEN9_NUM_MOCS_RINGS *
 + ((2 * GEN9_NUM_MOCS_ENTRIES) + 2) +
 + GEN9_NUM_MOCS_ENTRIES + 2;
 + ret = intel_logical_ring_begin(ringbuf, ctx, table_size);
 + if (ret) {
 + DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
 + return ret;
 + }
 +
 + /* program the control registers */
 + emit_mocs_control_table(ringbuf, &t, GEN9_GFX_MOCS_0);
 + emit_mocs_control_table(ringbuf, &t, GEN9_MFX0_MOCS_0);
 + emit_mocs_control_table(ringbuf, &t, GEN9_MFX1_MOCS_0);
 + emit_mocs_control_table(ringbuf, &t, GEN9_VEBOX_MOCS_0);
 + emit_mocs_control_table(ringbuf, &t, GEN9_BLT_MOCS_0);

So, if I'm not mistaken, I think this only works because we fully
initialize the default context at start/reset time through:

  + i915_gem_init_hw()
+ i915_gem_context_enable()
  + cycle through all the rings and call ring->init_context()
+ gen8_init_rcs_context()
  + intel_rcs_context_init_mocs()
(initialize ALL the MOCS!)

So, intializing the other (non-render) MOCS in gen8_init_rcs_context()
isn't the most logical thing to do I'm afraid. What happens if we
suddenly decide that we don't want to fully initialize the default
context at startup but initialize each ring on-demand for that context
as well? We can end up in a situation where we use the blitter first and
we wouldn't have the blitter MOCS initialized.

In that sense, that code makes an assumption about how we do things in a
completely different part of the driver and that's always a potential
source of bugs.

Chris, how far am I? :p

One way to solve this (if that's indeed the issue pointed at by Chris)
would be to decouple the render MOCS from the others, still keep the
render ones in there as they need to be emitted from the ring but put
the other writes (which could be done through MMIO as well) higher in
the chain, could probably make sense in i915_gem_context_enable()?
(which, by the way, is awfully named, should have an _init somewhere?).
It could also be a per-ring vfunc I suppose.
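
A hedged sketch of that decoupling, programming a non-render MOCS bank via
MMIO at init time; the register names are the ones from the quoted patch,
while the table fields and loop structure are assumptions:

/* Hypothetical sketch only: write one engine's MOCS bank through MMIO. */
static void program_mocs_mmio(struct drm_i915_private *dev_priv,
			      const struct drm_i915_mocs_table *table,
			      u32 mocs_base)
{
	unsigned int i;

	for (i = 0; i < table->size; i++)
		I915_WRITE(mocs_base + i * 4, table->table[i].control_value);
}

/* ... called for GEN9_MFX0_MOCS_0, GEN9_MFX1_MOCS_0, GEN9_VEBOX_MOCS_0 and
 * GEN9_BLT_MOCS_0, leaving only GEN9_GFX_MOCS_0 to be emitted from the ring. */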

For similar reasons, I think the GuC MOCS should be part of the GuC
init as well so we don't couple too hard different part of the code.

Now, is that really a blocker? I'd say no if we had userspace ready and
could commit that today, because we really want it. Still something to
look at, I could be totally wrong.

The separate header for a single function isn't something we usually do
either, but that can always be folded in later.

-- 
Damien
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure

2015-06-18 Thread John Harrison

On 18/06/2015 13:21, Chris Wilson wrote:

On Thu, Jun 18, 2015 at 01:14:56PM +0100, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the i915_gem_object_sync()
code path.

v2: Much more complex patch to share a single request between the sync and the
page flip. The _sync() function now supports lazy allocation of the request
structure. That is, if one is passed in then that will be used. If one is not,
then a request will be allocated and passed back out. Note that the _sync() code
does not necessarily require a request. Thus one will only be created in
certain situations. The reason the lazy allocation must be done within the
_sync() code itself is because the decision to need one or not is not really
something that code above can second guess (except in the case where one is
definitely not required because no ring is passed in).

The call chains above _sync() now support passing a request through, with most
callers passing in NULL and assuming that no request will be required (because
they also pass in NULL for the ring and therefore can't be generating any ring
code).

The exception is intel_crtc_page_flip() which now supports having a request
returned from _sync(). If one is, then that request is shared by the page flip
(if the page flip is of a type to need a request). If _sync() does not generate
a request but the page flip does need one, then the page flip path will create
its own request.

v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
Elf review request). Rebased onto newer tree that significantly changed the
synchronisation code.

v4: Updated comments from review feedback (Tomas Elf)

For: VIZ-5115
Signed-off-by: John Harrison john.c.harri...@intel.com
Reviewed-by: Tomas Elf tomas@intel.com
---
  drivers/gpu/drm/i915/i915_drv.h|4 ++-
  drivers/gpu/drm/i915/i915_gem.c|   48 +---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +-
  drivers/gpu/drm/i915/intel_display.c   |   17 +++---
  drivers/gpu/drm/i915/intel_drv.h   |3 +-
  drivers/gpu/drm/i915/intel_fbdev.c |2 +-
  drivers/gpu/drm/i915/intel_lrc.c   |2 +-
  drivers/gpu/drm/i915/intel_overlay.c   |2 +-
  8 files changed, 57 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 64a10fa..f69e9cb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct 
drm_i915_gem_object *obj)
  
  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);

  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
-struct intel_engine_cs *to);
+struct intel_engine_cs *to,
+struct drm_i915_gem_request **to_req);

Nope. Did you forget to reorder the code to ensure that the request is
allocated along with the context switch at the start of execbuf?
-Chris

Not sure what you are objecting to? If you mean the lazily allocated 
request then that is for page flip code not execbuff code. If we get 
here from an execbuff call then the request will definitely have been 
allocated and will be passed in. Whereas the page flip code may or may 
not require a request (depending on whether MMIO or ring flips are in 
use). Likewise the sync code may or may not require a request (depending 
on whether there is anything to sync to or not). There is no point 
allocating and submitting an empty request in the MMIO/idle case. Hence 
the sync code needs to be able to use an existing request or create one 
if none already exists.
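
(A minimal sketch of the lazy-allocation shape being described -- illustrative
only, not the actual patch. The two-argument i915_gem_request_alloc() is the
allocator quoted later in this series; the default_context and
outstanding_lazy_request field names are assumptions.)

	/* Inside i915_gem_object_sync(obj, to, to_req): if the caller did not
	 * supply a request and we actually have a ring to sync to, create one
	 * lazily so the sync commands have something to be tracked against. */
	if (to != NULL && to_req != NULL && *to_req == NULL) {
		ret = i915_gem_request_alloc(to, to->default_context);
		if (ret)
			return ret;

		*to_req = to->outstanding_lazy_request;
	}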


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Reset request handling for gen8+

2015-06-18 Thread Tomas Elf
On 18/06/2015 11:36, Chris Wilson wrote: On Thu, Jun 18, 2015 at 
11:11:55AM +0100, Tomas Elf wrote:

 On 18/06/2015 10:51, Mika Kuoppala wrote:
 In order for gen8+ hardware to guarantee that no context switch
 takes place during engine reset and that current context is properly
 saved, the driver needs to notify and query hw before commencing
 with reset.

 There are gpu hangs where the engine gets so stuck that it never will
 report to be ready for reset. We could proceed with reset anyway, but
 with some hangs with skl, the forced gpu reset will result in a system
 hang. By inspection, the unreadiness for reset seems to correlate with
 the probable system hang.

 We will only proceed with reset if all engines report that they
 are ready for reset. If root cause for system hang is found and
 can be worked around with another means, we can reconsider if
 we can reinstate full reset for unreadiness case.

 v2: -EIO, Recovery, gen8 (Chris, Tomas, Daniel)
 v3: updated commit msg
 v4: timeout_ms, simpler error path (Chris)

 References: https://bugs.freedesktop.org/show_bug.cgi?id=89959
 References: https://bugs.freedesktop.org/show_bug.cgi?id=90854
 Testcase: igt/gem_concurrent_blit --r 
prw-blt-overwrite-source-read-rcs-forked
 Testcase: igt/gem_concurrent_blit --r 
gtt-blt-overwrite-source-read-rcs-forked

 Cc: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Daniel Vetter daniel.vet...@ffwll.ch
 Cc: Tomas Elf tomas@intel.com
 Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com
 ---
   drivers/gpu/drm/i915/i915_reg.h |  3 +++
   drivers/gpu/drm/i915/intel_uncore.c | 43 
-

   2 files changed, 45 insertions(+), 1 deletion(-)

 diff --git a/drivers/gpu/drm/i915/i915_reg.h 
b/drivers/gpu/drm/i915/i915_reg.h

 index 0b979ad..3684f92 100644
 --- a/drivers/gpu/drm/i915/i915_reg.h
 +++ b/drivers/gpu/drm/i915/i915_reg.h
 @@ -1461,6 +1461,9 @@ enum skl_disp_power_wells {
   #define RING_MAX_IDLE(base)  ((base)+0x54)
   #define RING_HWS_PGA(base)   ((base)+0x80)
   #define RING_HWS_PGA_GEN6(base)  ((base)+0x2080)
 +#define RING_RESET_CTL(base)  ((base)+0xd0)
 +#define   RESET_CTL_REQUEST_RESET  (1 << 0)
 +#define   RESET_CTL_READY_TO_RESET (1 << 1)

   #define HSW_GTT_CACHE_EN 0x4024
   #define   GTT_CACHE_EN_ALL   0xF0007FFF
 diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c

 index 4a86cf0..160a47a 100644
 --- a/drivers/gpu/drm/i915/intel_uncore.c
 +++ b/drivers/gpu/drm/i915/intel_uncore.c
 @@ -1455,9 +1455,50 @@ static int gen6_do_reset(struct drm_device *dev)
return ret;
   }

 +static int wait_for_register(struct drm_i915_private *dev_priv,
 +   const u32 reg,
 +   const u32 mask,
 +   const u32 value,
 +   const unsigned long timeout_ms)
 +{
 +  return wait_for((I915_READ(reg) & mask) == value, timeout_ms);
 +}
 +
 +static int gen8_do_reset(struct drm_device *dev)
 +{
 +  struct drm_i915_private *dev_priv = dev->dev_private;
 +  struct intel_engine_cs *engine;
 +  int i;
 +
 +  for_each_ring(engine, dev_priv, i) {
 +  I915_WRITE(RING_RESET_CTL(engine->mmio_base),
 + _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
 +
 +  if (wait_for_register(dev_priv,
 +RING_RESET_CTL(engine->mmio_base),
 +RESET_CTL_READY_TO_RESET,
 +RESET_CTL_READY_TO_RESET,
 +700)) {
 +  DRM_ERROR("%s: reset request timeout\n", engine->name);
 +  goto not_ready;
 +  }

 So just to be clear here: If one or more of the reset control
 registers decide that they are at a point where they will never
 again be ready for reset we will simply not do a full GPU reset
 until reboot? Is there perhaps a case where you would want to try
 reset request once or twice or like five times or whatever but then
 simply go ahead with the full GPU reset regardless of what the reset
 control register tells you? After all, it's our only way out if the
 hardware is truly stuck.

 What happens is that we skip the reset, report an error and that marks
 the GPU as wedged. To get out of that state requires user intervention,
 either by rebooting or through use of debugfs/i915_wedged.

That's a fair point, we will mark the GPU as terminally wedged. That's 
always been there as a final state where we simply give up. I guess it 
might be better to actively mark the GPU as terminally wedged from the 
driver's point of view rather than plow ahead in a last ditch effort to 
reset the GPU, which may or may not succeed and which may irrecoverably 
hang the system in the worst case. I guess we at least protect the 
currently running context if we just mark the GPU as terminally wedged 
instead of putting it in a potentially undefined state.
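
(A rough sketch of the bounded-retry alternative raised above -- purely
illustrative and not part of Mika's patch. request_engine_reset() stands in
for the RESET_CTL handshake shown in the patch, and the retry limit is
arbitrary.)

	static bool reset_request_with_retry(struct intel_engine_cs *engine)
	{
		int tries;

		for (tries = 0; tries < 5; tries++)
			if (request_engine_reset(engine)) /* hypothetical helper */
				return true;

		/* Caller then chooses: force the reset anyway, or mark the
		 * GPU wedged and wait for user intervention. */
		return false;
	}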



 We can try to repeat the reset from a workqueue, but we should first
 

Re: [Intel-gfx] [PATCH v4] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread Antoine, Peter
On Thu, 2015-06-18 at 10:10 +0100, ch...@chris-wilson.co.uk wrote:
 On Thu, Jun 18, 2015 at 08:45:10AM +, Antoine, Peter wrote:
  On Thu, 2015-06-18 at 08:49 +0100, ch...@chris-wilson.co.uk wrote:
   On Thu, Jun 18, 2015 at 07:36:41AM +, Antoine, Peter wrote:

On Wed, 2015-06-17 at 17:33 +0100, Chris Wilson wrote:
 On Wed, Jun 17, 2015 at 04:19:22PM +0100, Peter Antoine wrote:
  This change adds the programming of the MOCS registers to the gen 9+
  platforms. This change set programs the MOCS register values to a 
  set
  of values that are defined to be optimal.
  
  It creates a fixed register set that is programmed across the 
  different
  engines so that all engines have the same table. This is done as the
  main RCS context only holds the registers for itself and the shared
  L3 values. By trying to keep the registers consistent across the
  different engines it should make the programming for the registers
  consistent.
  
  v2:
  -'static const' for private data structures and style changes.(Matt 
  Turner)
  v3:
  - Make the tables slightly more readable. (Damien Lespiau)
  - Updated tables fix performance regression.
  v4:
  - Code formatting. (Chris Wilson)
  - re-privatised mocs code. (Daniel Vetter)
 
 Being really picky now, but reading your comments impressed upon me
 the importance of reinforcing one particular point...
 
   
  +   /*
   +* Failing to program the MOCS is non-fatal. The system will not
  +* run at peak performance. So generate a warning and carry on.
  +*/
  +   if (gen9_program_mocs(ring, ctx) != 0)
 
 I think this is better as intel_rcs_context_init_mocs(). Too me it is
 important that you emphasize this is to be run once during very early
 initialisation to setup the first context prior to anything else. i.e.
 All subsequent execution state must be derived from this. Renaming it 
 as
 intel_rcs_context_init_mocs():
 
 1 - indicates you have written it to handle all generation, this is
 important as you are otherwise passing in gen8 into a gen9 
 function.
 
 2 - it is only called during RCS-init_context() and must not be 
 called
 at any other time - this avoids the issue of modifying registers
 used by other rings at runtime, which is the trap you lead me into
 last time.
No problem with that. But adding rcs to the original name suggests that it
is only setting up the rcs engine and not all the engines. If any of the
other engines have their contexts extended then we may need to call
the function from the other rings' initialise functions.
   
   intel_rcs_context is the object
   init_mocs is the verb, with init being a fairly well defined phase
   of context operations.
   
   My suggestion is that is only run during RCS context init. The comments
   tell us that it affects all rings - and so we must emphasize that the
   RCS context init *must* be run before the other rings are enabled for
   submission.
   
   If we have contexts being initialised on other rings, then one would not
   think of calling intel_rcs_context_init* but instead think of how we
   would need to interact with concurrent engine initialisation. Being
   specifc here should stop someone simply calling the function and hoping
   for the best.
   
I'll change it to intel_context_emit_mocs() as this does say what it 
does
on the tin, it only emits the mocs to the context and does not program 
them.
   
   That misses the point I am trying to make.
  
  I don't get your point, the original seemed good to me.
  I'll change the name to what you want, as this needs to get in.
 
 My point is that it is not a generic function and must be called at a
 certain phase of context construction and lrc initialisation. I am
 trying to suggest a name that encapsulates that to avoid possible
 misuse.
 
  +   if (IS_SKYLAKE(dev)) {
   +   table->size  = ARRAY_SIZE(skylake_mocs_table);
   +   table->table = skylake_mocs_table;
   +   result = true;
   +   } else if (IS_BROXTON(dev)) {
   +   table->size  = ARRAY_SIZE(broxton_mocs_table);
   +   table->table = broxton_mocs_table;
  +   result = true;
  +   } else {
  +   /* Platform that should have a MOCS table does not */
   +   WARN_ON(INTEL_INFO(dev)->gen >= 9);
 
 result = false; here would be fewer lines of code today and tomorrow. 
 :)
Fail safe return value. Makes no difference here, but golden in larger
functions.
   
   Actually I don't see why you can't encode the ARRAY_SIZE into the static
   const tables, then the return value is just the appropriate table. If
   you don't set a default value, then you get a compiler warning telling
   you missed adding it your new 
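(A sketch of what is being suggested here -- the structure and entry-array
names are assumptions, and nothing below is from the actual patch: the size
lives with each static const table and the lookup simply returns the right
table, or NULL with the warning.)

	static const struct drm_i915_mocs_table skylake_mocs_table = {
		.size  = ARRAY_SIZE(skylake_mocs_entries),	/* entries array assumed */
		.table = skylake_mocs_entries,
	};
	/* broxton_mocs_table defined the same way */

	static const struct drm_i915_mocs_table *get_mocs_table(struct drm_device *dev)
	{
		if (IS_SKYLAKE(dev))
			return &skylake_mocs_table;
		if (IS_BROXTON(dev))
			return &broxton_mocs_table;

		/* Platform that should have a MOCS table does not */
		WARN_ON(INTEL_INFO(dev)->gen >= 9);
		return NULL;
	}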

[Intel-gfx] [PATCH v5] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread Peter Antoine
This change adds the programming of the MOCS registers to the gen 9+
platforms. This change set programs the MOCS register values to a set
of values that are defined to be optimal.

It creates a fixed register set that is programmed across the different
engines so that all engines have the same table. This is done as the
main RCS context only holds the registers for itself and the shared
L3 values. By trying to keep the registers consistent across the
different engines it should make the programming for the registers
consistent.

v2:
-'static const' for private data structures and style changes.(Matt Turner)
v3:
- Make the tables slightly more readable. (Damien Lespiau)
- Updated tables fix performance regression.
v4:
- Code formatting. (Chris Wilson)
- re-privatised mocs code. (Daniel Vetter)
v5:
- Changed the name of a function. (Chris Wilson)

Signed-off-by: Peter Antoine peter.anto...@intel.com
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/i915_reg.h   |   9 +
 drivers/gpu/drm/i915/intel_lrc.c  |  10 +-
 drivers/gpu/drm/i915/intel_lrc.h  |   4 +
 drivers/gpu/drm/i915/intel_mocs.c | 370 ++
 drivers/gpu/drm/i915/intel_mocs.h |  61 +++
 6 files changed, 454 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/intel_mocs.c
 create mode 100644 drivers/gpu/drm/i915/intel_mocs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b7ddf48..c781e19 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
  i915_irq.o \
  i915_trace_points.o \
  intel_lrc.o \
+ intel_mocs.o \
  intel_ringbuffer.o \
  intel_uncore.o
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7213224..3a435b5 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7829,4 +7829,13 @@ enum skl_disp_power_wells {
 #define _PALETTE_A (dev_priv->info.display_mmio_offset + 0xa000)
 #define _PALETTE_B (dev_priv->info.display_mmio_offset + 0xa800)
 
+/* MOCS (Memory Object Control State) registers */
+#define GEN9_LNCFCMOCS0		(0xB020)	/* L3 Cache Control base */
+
+#define GEN9_GFX_MOCS_0		(0xc800)	/* Graphics MOCS base register */
+#define GEN9_MFX0_MOCS_0	(0xc900)	/* Media 0 MOCS base register */
+#define GEN9_MFX1_MOCS_0	(0xcA00)	/* Media 1 MOCS base register */
+#define GEN9_VEBOX_MOCS_0	(0xcB00)	/* Video MOCS base register */
+#define GEN9_BLT_MOCS_0		(0xcc00)	/* Blitter MOCS base register */
+
 #endif /* _I915_REG_H_ */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9f5485d..dd01caf 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -135,6 +135,7 @@
 #include drm/drmP.h
 #include drm/i915_drm.h
 #include i915_drv.h
+#include intel_mocs.h
 
 #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
 #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
@@ -796,7 +797,7 @@ static int logical_ring_prepare(struct intel_ringbuffer 
*ringbuf,
  *
  * Return: non-zero if the ringbuffer is not ready to be written to.
  */
-static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
+int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
struct intel_context *ctx, int num_dwords)
 {
 struct intel_engine_cs *ring = ringbuf->ring;
@@ -1379,6 +1380,13 @@ static int gen8_init_rcs_context(struct intel_engine_cs 
*ring,
if (ret)
return ret;
 
+   /*
+* Failing to program the MOCS is non-fatal. The system will not
+* run at peak performance. So generate a warning and carry on.
+*/
+   if (intel_rcs_context_init_mocs(ring, ctx) != 0)
+   DRM_ERROR("MOCS failed to program: expect performance issues.");
+
return intel_lr_context_render_state_init(ring, ctx);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 04d3a6d..dbbd6af 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -44,6 +44,10 @@ int intel_logical_rings_init(struct drm_device *dev);
 
 int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf,
  struct intel_context *ctx);
+
+int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
+   struct intel_context *ctx, int num_dwords);
+
 /**
  * intel_logical_ring_advance() - advance the ringbuffer tail
  * @ringbuf: Ringbuffer to advance.
diff --git a/drivers/gpu/drm/i915/intel_mocs.c 
b/drivers/gpu/drm/i915/intel_mocs.c
new file mode 100644
index 000..1651379e
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_mocs.c
@@ -0,0 +1,370 @@
+/*
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * Permission is hereby 
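
(The body of the new intel_mocs.c is cut off above. Roughly, the per-engine
emission it implements looks like the sketch below -- the helper name matches
what earlier review mail quotes, but the table structure and field names are
assumptions; only the register bases are from the patch.)

	static int emit_mocs_control_table(struct intel_ringbuffer *ringbuf,
					   const struct drm_i915_mocs_table *table,
					   u32 reg_base)
	{
		unsigned int index;

		/* One LRI per engine: a <register, value> pair for every entry,
		 * registers laid out contiguously from the engine's base. */
		intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(table->size));

		for (index = 0; index < table->size; index++) {
			intel_logical_ring_emit(ringbuf, reg_base + index * 4);
			intel_logical_ring_emit(ringbuf,
						table->table[index].control_value);
		}

		intel_logical_ring_emit(ringbuf, MI_NOOP);

		return 0;
	}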

Re: [Intel-gfx] [PATCH] drm/i915: Per-DDI I_boost override

2015-06-18 Thread Antti Koskipää
Just FYI, this patch depends on David Weinehall's Buffer translation
improvements patch from earlier today.

-- 
- Antti

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands

2015-06-18 Thread John . C . Harrison
From: John Harrison john.c.harri...@intel.com

It is a bad idea for i915_add_request() to fail. The work will already have been
sent to the ring and will be processed, but there will not be any tracking or
management of that work.

The only way the add request call can fail is if it can't write its epilogue
commands to the ring (cache flushing, seqno updates, interrupt signalling). The
reasons for that are mostly down to running out of ring buffer space and the
problems associated with trying to get some more. This patch prevents that
situation from happening in the first place.

When a request is created, it marks sufficient space as reserved for the
epilogue commands. Thus guaranteeing that by the time the epilogue is written,
there will be plenty of space for it. Note that a ring_begin() call is required
to actually reserve the space (and do any potential waiting). However, that is
not currently done at request creation time. This is because the ring_begin()
code can allocate a request. Hence calling begin() from the request allocation
code would lead to infinite recursion! Later patches in this series remove the
need for begin() to do the allocate. At that point, it becomes safe for the
allocate to call begin() and really reserve the space.
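
(Not part of the commit message: as a rough illustration of the intended
lifecycle, with the _use()/_end() helpers taken from the diff below; the
reserve call itself lives in the part of the patch not quoted here, so its
name and the size constant are assumptions.)

	/* at request creation (i915_gem_request_alloc) */
	intel_ring_reserved_space_reserve(ringbuf, MIN_SPACE_FOR_ADD_REQUEST);

	/* ... the caller emits its own commands via *_ring_begin() ... */

	/* in __i915_add_request(): the epilogue may now dip into the reserve */
	intel_ring_reserved_space_use(ringbuf);
	/* ... emit cache flush / seqno write / interrupt commands ... */
	intel_ring_reserved_space_end(ringbuf);	/* sanity-check the estimate */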

Until then, there is a potential for insufficient space to be available at the
point of calling i915_add_request(). However, that would only be in the case
where the request was created and immediately submitted without ever calling
ring_begin() and adding any work to that request. Which should never happen. And
even if it does, and if that request happens to fall down the tiny window of
opportunity for failing due to being out of ring space then does it really
matter because the request wasn't doing anything in the first place?

v2: Updated the 'reserved space too small' warning to include the offending
sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
re-initialisation of tracking state after a buffer wrap to keep the sanity
checks accurate.

v3: Incremented the reserved size to accommodate Ironlake (after finally
managing to run on an ILK system). Also fixed missing wrap code in LRC mode.

v4: Added extra comment and removed duplicate WARN (feedback from Tomas).

For: VIZ-5115
CC: Tomas Elf tomas@intel.com
Signed-off-by: John Harrison john.c.harri...@intel.com
---
 drivers/gpu/drm/i915/i915_drv.h |1 +
 drivers/gpu/drm/i915/i915_gem.c |   37 
 drivers/gpu/drm/i915/intel_lrc.c|   21 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |   71 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |   25 +++
 5 files changed, 153 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0347eb9..eba1857 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
 
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
   struct intel_context *ctx);
+void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 
 static inline uint32_t
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 81f3512..85fa27b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
} else
 ringbuf = ring->buffer;
 
+   /*
+* To ensure that this call will not fail, space for its emissions
+* should already have been reserved in the ring buffer. Let the ring
+* know that it is time to use that space up.
+*/
+   intel_ring_reserved_space_use(ringbuf);
+
request_start = intel_ring_get_tail(ringbuf);
/*
 * Emit any outstanding flushes - execbuf can fail to emit the flush
@@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
   round_jiffies_up_relative(HZ));
 intel_mark_busy(dev_priv->dev);
 
+   /* Sanity check that the reserved size was large enough. */
+   intel_ring_reserved_space_end(ringbuf);
+
return 0;
 }
 
@@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
if (ret)
goto err;
 
+   /*
+* Reserve space in the ring buffer for all the commands required to
+* eventually emit this request. This is to guarantee that the
+* i915_add_request() call can't fail. Note that the reserve may need
+* to be redone if the request is not actually submitted straight
+* away, e.g. because a GPU scheduler has deferred it.
+*
+* Note further that this call merely notes the reserve request. A
+* subsequent call to *_ring_begin() is required to actually ensure
+* that the reservation is 

[Intel-gfx] [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring

2015-06-18 Thread John . C . Harrison
From: John Harrison john.c.harri...@intel.com

The i915_gem_init_hw() function calls a bunch of smaller initialisation
functions. Multiple of which have generic sections and per ring sections. This
means multiple passes are done over the rings. Each pass writes data to the ring
which floats around in that ring's OLR until some random point in the future
when an add_request() is done by some random other piece of code.

This patch breaks i915_ppgtt_init_hw() in two with the per ring initialisation
now being done in i915_ppgtt_init_ring(). The ring looping is now done at the
top level in i915_gem_init_hw().

v2: Fix dumb loop variable re-use.

For: VIZ-5115
Signed-off-by: John Harrison john.c.harri...@intel.com
Reviewed-by: Tomas Elf tomas@intel.com
---
 drivers/gpu/drm/i915/i915_gem.c |   27 ---
 drivers/gpu/drm/i915/i915_gem_gtt.c |   28 +++-
 drivers/gpu/drm/i915/i915_gem_gtt.h |1 +
 3 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ac893e3..dff21bd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5016,7 +5016,7 @@ i915_gem_init_hw(struct drm_device *dev)
 {
 struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_engine_cs *ring;
-   int ret, i;
+   int ret, i, j;
 
 if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
return -EIO;
@@ -5053,19 +5053,32 @@ i915_gem_init_hw(struct drm_device *dev)
 */
init_unused_rings(dev);
 
+   ret = i915_ppgtt_init_hw(dev);
+   if (ret) {
+   DRM_ERROR("PPGTT enable HW failed %d\n", ret);
+   goto out;
+   }
+
+   /* Need to do basic initialisation of all rings first: */
for_each_ring(ring, dev_priv, i) {
 ret = ring->init_hw(ring);
if (ret)
goto out;
}
 
-   for (i = 0; i < NUM_L3_SLICES(dev); i++)
-   i915_gem_l3_remap(dev_priv->ring[RCS], i);
+   /* Now it is safe to go back round and do everything else: */
+   for_each_ring(ring, dev_priv, i) {
+   if (ring->id == RCS) {
+   for (j = 0; j < NUM_L3_SLICES(dev); j++)
+   i915_gem_l3_remap(ring, j);
+   }
 
-   ret = i915_ppgtt_init_hw(dev);
-   if (ret && ret != -EIO) {
-   DRM_ERROR("PPGTT enable failed %d\n", ret);
-   i915_gem_cleanup_ringbuffer(dev);
+   ret = i915_ppgtt_init_ring(ring);
+   if (ret && ret != -EIO) {
+   DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret);
+   i915_gem_cleanup_ringbuffer(dev);
+   goto out;
+   }
}
 
ret = i915_gem_context_enable(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 17b7df0..b14ae63 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1543,11 +1543,6 @@ int i915_ppgtt_init(struct drm_device *dev, struct 
i915_hw_ppgtt *ppgtt)
 
 int i915_ppgtt_init_hw(struct drm_device *dev)
 {
-   struct drm_i915_private *dev_priv = dev->dev_private;
-   struct intel_engine_cs *ring;
-   struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-   int i, ret = 0;
-
/* In the case of execlists, PPGTT is enabled by the context descriptor
 * and the PDPs are contained within the context itself.  We don't
 * need to do anything here. */
@@ -1566,16 +1561,23 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
else
 MISSING_CASE(INTEL_INFO(dev)->gen);
 
-   if (ppgtt) {
-   for_each_ring(ring, dev_priv, i) {
-   ret = ppgtt->switch_mm(ppgtt, ring);
-   if (ret != 0)
-   return ret;
-   }
-   }
+   return 0;
+}
 
-   return ret;
+int i915_ppgtt_init_ring(struct intel_engine_cs *ring)
+{
+   struct drm_i915_private *dev_priv = ring->dev->dev_private;
+   struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
+
+   if (i915.enable_execlists)
+   return 0;
+
+   if (!ppgtt)
+   return 0;
+
+   return ppgtt->switch_mm(ppgtt, ring);
 }
+
 struct i915_hw_ppgtt *
 i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 0d46dd2..0caa9eb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -475,6 +475,7 @@ void i915_global_gtt_cleanup(struct drm_device *dev);
 
 int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
 int i915_ppgtt_init_hw(struct drm_device *dev);
+int i915_ppgtt_init_ring(struct intel_engine_cs *ring);
 void 

Re: [Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 12:49:55PM +0100, Dave Gordon wrote:
 On 17/06/15 13:02, Daniel Vetter wrote:
  Domain handling is required for all gem objects, and the resulting bugs if
  you don't for one-off objects are absolutely no fun to track down.
 
 Is it not the case that the new object returned by
 i915_gem_alloc_object() is
 (a) of a type that can be mapped into the GTT, and
 (b) initially in the CPU domain for both reading and writing?
 
 So AFAICS the allocate-and-fill function I'm describing (to appear in
 next patch series respin) doesn't need any further domain handling.

A i915_gem_object_create_from_data() is a reasonable addition, and I
suspect it will make the code a bit more succinct.

Whilst your statement is true today, calling set_domain is then a no-op,
and helps document how you use the object and so reduces the likelihood
of us introducing bugs in the future.
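
(To make that concrete -- a sketch only, under the assumption that the helpers
below keep their current behaviour, with locking and rounding details elided:)

	struct drm_i915_gem_object *
	i915_gem_object_create_from_data(struct drm_device *dev,
					 const void *data, size_t size)
	{
		struct drm_i915_gem_object *obj;
		int ret;

		obj = i915_gem_alloc_object(dev, round_up(size, PAGE_SIZE));
		if (obj == NULL)
			return ERR_PTR(-ENOMEM);

		/* A no-op today, but documents that we intend CPU writes. */
		ret = i915_gem_object_set_to_cpu_domain(obj, true);
		if (ret)
			goto err;

		ret = i915_gem_object_write(obj, data, size);
		if (ret)
			goto err;

		return obj;

	err:
		drm_gem_object_unreference(&obj->base); /* struct_mutex assumed held */
		return ERR_PTR(ret);
	}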
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] igt: remove deprecated reg access tools in favor of intel_reg

2015-06-18 Thread Jani Nikula
intel_iosf_sb_read, intel_iosf_sb_write, intel_reg_dumper,
intel_reg_read, intel_reg_snapshot, intel_reg_write, intel_vga_read, and
intel_vga_write have been deprecated in favor of intel_reg. Remove the
deprecated tools. intel_reg does everything they do, and more.

Signed-off-by: Jani Nikula jani.nik...@intel.com
---
 man/Makefile.am |3 -
 man/intel_reg_dumper.man|   33 -
 man/intel_reg_read.man  |   15 -
 man/intel_reg_snapshot.man  |   15 -
 man/intel_reg_write.man |   16 -
 tools/Makefile.sources  |8 -
 tools/intel_iosf_sb_read.c  |  153 ---
 tools/intel_iosf_sb_write.c |  140 --
 tools/intel_reg_dumper.c| 3020 ---
 tools/intel_reg_read.c  |  145 ---
 tools/intel_reg_snapshot.c  |   56 -
 tools/intel_reg_write.c |   60 -
 tools/intel_vga_read.c  |   97 --
 tools/intel_vga_write.c |   97 --
 14 files changed, 3858 deletions(-)
 delete mode 100644 man/intel_reg_dumper.man
 delete mode 100644 man/intel_reg_read.man
 delete mode 100644 man/intel_reg_snapshot.man
 delete mode 100644 man/intel_reg_write.man
 delete mode 100644 tools/intel_iosf_sb_read.c
 delete mode 100644 tools/intel_iosf_sb_write.c
 delete mode 100644 tools/intel_reg_dumper.c
 delete mode 100644 tools/intel_reg_read.c
 delete mode 100644 tools/intel_reg_snapshot.c
 delete mode 100644 tools/intel_reg_write.c
 delete mode 100644 tools/intel_vga_read.c
 delete mode 100644 tools/intel_vga_write.c

diff --git a/man/Makefile.am b/man/Makefile.am
index ee09156c934e..c42a91beb09b 100644
--- a/man/Makefile.am
+++ b/man/Makefile.am
@@ -10,9 +10,6 @@ appman_PRE =  \
intel_infoframes.man\
intel_lid.man   \
intel_panel_fitter.man  \
-   intel_reg_dumper.man\
-   intel_reg_read.man  \
-   intel_reg_write.man \
intel_stepping.man  \
intel_upload_blit_large.man \
intel_upload_blit_large_gtt.man \
diff --git a/man/intel_reg_dumper.man b/man/intel_reg_dumper.man
deleted file mode 100644
index 89f6b9f96072..
--- a/man/intel_reg_dumper.man
+++ /dev/null
@@ -1,33 +0,0 @@
-.\ shorthand for double quote that works everywhere.
-.ds q \N'34'
-.TH intel_reg_dumper __appmansuffix__ __xorgversion__
-.SH NAME
-intel_reg_dumper \- Decode a bunch of Intel GPU registers for debugging
-.SH SYNOPSIS
-.B intel_reg_dumper [ options ] [ file ]
-.SH DESCRIPTION
-.B intel_reg_dumper
-is a tool to read and decode the values of many Intel GPU registers.  It is
-commonly used in debugging video mode setting issues.  If the
-.B file
-argument is present, the registers will be decoded from the given file
-instead of the current registers.  Use the
-.B intel_reg_snapshot
-tool to generate such files.
-
-When the
-.B file
-argument is present and the
-.B -d
-argument is not present,
-.B intel_reg_dumper
-will assume the file was generated on an Ironlake machine.
-.SH OPTIONS
-.TP
-.B -d id
-when a dump file is used, use 'id' as device id (in hex)
-.TP
-.B -h
-prints a help message
-.SH SEE ALSO
-.BR intel_reg_snapshot(1)
diff --git a/man/intel_reg_read.man b/man/intel_reg_read.man
deleted file mode 100644
index cc2bf612eb35..
--- a/man/intel_reg_read.man
+++ /dev/null
@@ -1,15 +0,0 @@
-.\ shorthand for double quote that works everywhere.
-.ds q \N'34'
-.TH intel_reg_read __appmansuffix__ __xorgversion__
-.SH NAME
-intel_reg_read \- Reads an Intel GPU register value
-.SH SYNOPSIS
-.B intel_reg_read \fIregister\fR
-.SH DESCRIPTION
-.B intel_reg_read
-is a tool to read Intel GPU registers, for use in debugging.  The
-\fIregister\fR argument is given as hexadecimal.
-.SH EXAMPLES
-.TP
-intel_reg_read 0x61230
-Shows the register value for the first internal panel fitter.
diff --git a/man/intel_reg_snapshot.man b/man/intel_reg_snapshot.man
deleted file mode 100644
index 1930f613fb26..
--- a/man/intel_reg_snapshot.man
+++ /dev/null
@@ -1,15 +0,0 @@
-.\ shorthand for double quote that works everywhere.
-.ds q \N'34'
-.TH intel_reg_snapshot __appmansuffix__ __xorgversion__
-.SH NAME
-intel_reg_snapshot \- Take a GPU register snapshot
-.SH SYNOPSIS
-.B intel_reg_snapshot
-.SH DESCRIPTION
-.B intel_reg_snapshot
-takes a snapshot of the registers of an Intel GPU, and writes it to standard
-output.  These files can be inspected later with the
-.B intel_reg_dumper
-tool.
-.SH SEE ALSO
-.BR intel_reg_dumper(1)
diff --git a/man/intel_reg_write.man b/man/intel_reg_write.man
deleted file mode 100644
index cb1731c6f04b..
--- a/man/intel_reg_write.man
+++ /dev/null
@@ -1,16 +0,0 @@
-.\ shorthand for double quote that works everywhere.
-.ds q \N'34'
-.TH intel_reg_write __appmansuffix__ __xorgversion__
-.SH NAME
-intel_reg_write \- Set an Intel GPU register to a value
-.SH SYNOPSIS
-.B intel_reg_write \fIregister\fR \fIvalue\fR
-.SH DESCRIPTION
-.B intel_reg_write
-is a tool to set Intel GPU 

[Intel-gfx] [PATCH v5 5/6] drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch workaround

2015-06-18 Thread Arun Siluvery
In Indirect context w/a batch buffer,
WaClearSlmSpaceAtContextSwitch

v2: s/PIPE_CONTROL_FLUSH_RO_CACHES/PIPE_CONTROL_FLUSH_L3 (Ville)

Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 drivers/gpu/drm/i915/intel_lrc.c | 16 
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d14ad20..7637e64 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -410,6 +410,7 @@
 #define   DISPLAY_PLANE_A   (0<<20)
 #define   DISPLAY_PLANE_B   (1<<20)
 #define GFX_OP_PIPE_CONTROL(len)   ((0x3<<29)|(0x3<<27)|(0x2<<24)|(len-2))
+#define   PIPE_CONTROL_FLUSH_L3    (1<<27)
 #define   PIPE_CONTROL_GLOBAL_GTT_IVB  (1<<24) /* gen7+ */
 #define   PIPE_CONTROL_MMIO_WRITE  (1<<23)
 #define   PIPE_CONTROL_STORE_DATA_INDEX (1<<21)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index dff8303..792d559 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1106,6 +1106,7 @@ static int gen8_init_indirectctx_bb(struct 
intel_engine_cs *ring,
uint32_t *num_dwords)
 {
uint32_t index;
+   uint32_t scratch_addr;
uint32_t *batch = *wa_ctx_batch;
 
index = offset;
@@ -1136,6 +1137,21 @@ static int gen8_init_indirectctx_bb(struct 
intel_engine_cs *ring,
 wa_ctx_emit(batch, l3sqc4_flush & ~GEN8_LQSC_FLUSH_COHERENT_LINES);
}
 
+   /* WaClearSlmSpaceAtContextSwitch:bdw,chv */
+   /* Actual scratch location is at 128 bytes offset */
+   scratch_addr = ring->scratch.gtt_offset + 2*CACHELINE_BYTES;
+   scratch_addr |= PIPE_CONTROL_GLOBAL_GTT;
+
+   wa_ctx_emit(batch, GFX_OP_PIPE_CONTROL(6));
+   wa_ctx_emit(batch, (PIPE_CONTROL_FLUSH_L3 |
+   PIPE_CONTROL_GLOBAL_GTT_IVB |
+   PIPE_CONTROL_CS_STALL |
+   PIPE_CONTROL_QW_WRITE));
+   wa_ctx_emit(batch, scratch_addr);
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+
/* padding */
 while (((unsigned long) (batch + index) % CACHELINE_BYTES) != 0)
wa_ctx_emit(batch, MI_NOOP);
-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 4/6] drm/i915/gen8: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaround

2015-06-18 Thread Arun Siluvery
In Indirect context w/a batch buffer,
+WaFlushCoherentL3CacheLinesAtContextSwitch:bdw

v2: Add LRI commands to set/reset bit that invalidates coherent lines,
update WA to include programming restrictions and exclude CHV as
it is not required (Ville)

v3: Avoid unnecessary read when it can be done by reading register once (Chris).

Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_reg.h  |  2 ++
 drivers/gpu/drm/i915/intel_lrc.c | 23 +++
 2 files changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 84af255..d14ad20 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -426,6 +426,7 @@
 #define   PIPE_CONTROL_INDIRECT_STATE_DISABLE  (1<<9)
 #define   PIPE_CONTROL_NOTIFY  (1<<8)
 #define   PIPE_CONTROL_FLUSH_ENABLE    (1<<7) /* gen7+ */
+#define   PIPE_CONTROL_DC_FLUSH_ENABLE (1<<5)
 #define   PIPE_CONTROL_VF_CACHE_INVALIDATE (1<<4)
 #define   PIPE_CONTROL_CONST_CACHE_INVALIDATE  (1<<3)
 #define   PIPE_CONTROL_STATE_CACHE_INVALIDATE  (1<<2)
@@ -5788,6 +5789,7 @@ enum skl_disp_power_wells {
 
 #define GEN8_L3SQCREG4 0xb118
 #define  GEN8_LQSC_RO_PERF_DIS (1<<27)
+#define  GEN8_LQSC_FLUSH_COHERENT_LINES    (1<<21)
 
 /* GEN8 chicken */
 #define HDC_CHICKEN0   0x7300
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 8d5932a..dff8303 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1113,6 +1113,29 @@ static int gen8_init_indirectctx_bb(struct 
intel_engine_cs *ring,
/* WaDisableCtxRestoreArbitration:bdw,chv */
wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_DISABLE);
 
+   /* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
+   if (IS_BROADWELL(ring->dev)) {
+   struct drm_i915_private *dev_priv = to_i915(ring->dev);
+   uint32_t l3sqc4_flush = (I915_READ(GEN8_L3SQCREG4) |
+GEN8_LQSC_FLUSH_COHERENT_LINES);
+
+   wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1));
+   wa_ctx_emit(batch, GEN8_L3SQCREG4);
+   wa_ctx_emit(batch, l3sqc4_flush);
+
+   wa_ctx_emit(batch, GFX_OP_PIPE_CONTROL(6));
+   wa_ctx_emit(batch, (PIPE_CONTROL_CS_STALL |
+   PIPE_CONTROL_DC_FLUSH_ENABLE));
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+
+   wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1));
+   wa_ctx_emit(batch, GEN8_L3SQCREG4);
+   wa_ctx_emit(batch, l3sqc4_flush & ~GEN8_LQSC_FLUSH_COHERENT_LINES);
+   }
+
/* padding */
 while (((unsigned long) (batch + index) % CACHELINE_BYTES) != 0)
wa_ctx_emit(batch, MI_NOOP);
-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 0/6] Add Per-context WA using WA batch buffers

2015-06-18 Thread Arun Siluvery
From Gen8+ we have some workarounds that are applied per context, using
special batch buffers called WA batch buffers.
HW executes them at specific stages during context save/restore.
The patches in this series adds this framework to i915.

I did some basic testing on BDW by running glmark2 and didn't see any issues.
These WA are mainly required when preemption is enabled.

All of the previous comments are addressed in latest revision v5

[v1] http://lists.freedesktop.org/archives/intel-gfx/2015-February/060707.html
[v2] http://www.spinics.net/lists/intel-gfx/msg67804.html

[v3] In v2, two separate ring_buffer objects were used to load WA instructions
and they were part of every context which is not really required.
Chris suggested a better approach of adding a page to context itself and using
it for this purpose. Since GuC is also planning to do the same it can probably
be shared with GuC. But after discussions it is agreed to use an independent
page as GuC area might grow in future. Independent page also makes sense because
these WA are only initialized once and not changed afterwards so we can share
them across all contexts.

[v4] Changes in this revision,
In the previous version the size of batch buffers are fixed during
initialization which is not a good idea. This is corrected by updating the
functions that load WA to return the number of dwords written and caller
updates the size once all WA are initialized. The functions now also accept
offset field which allows us to have multiple batches so that required batch
can be selected based on a criteria. This is not a requirement at this point
but could be useful in future.

WaFlushCoherentL3CacheLinesAtContextSwitch implementation was incomplete which
is fixed and programming restrictions correctly applied.

http://www.spinics.net/lists/intel-gfx/msg68947.html

[v5] No major changes in this revision but switched to new revision as changes
affected all patches. Introduced macro to add commands which also checks for
page overflow. Moved code around to simplify, indentation fixes and other
improvements suggested by Chris.

Since we don't know the number of WA applied upfront, Chris suggested a two-pass
approach but that brings additional complexity which is not necessary.
Discussed with Chris and agreed upon on single page setup as simpler code wins
and also single page is sufficient for our requirement.

Please see the patches for more details.

Arun Siluvery (6):
  drm/i915/gen8: Add infrastructure to initialize WA batch buffers
  drm/i915/gen8: Re-order init pipe_control in lrc mode
  drm/i915/gen8: Add WaDisableCtxRestoreArbitration workaround
  drm/i915/gen8: Add WaFlushCoherentL3CacheLinesAtContextSwitch
workaround
  drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch workaround
  drm/i915/gen8: Add WaRsRestoreWithPerCtxtBb workaround

 drivers/gpu/drm/i915/i915_reg.h |  32 +++-
 drivers/gpu/drm/i915/intel_lrc.c| 298 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  18 ++
 3 files changed, 341 insertions(+), 7 deletions(-)

-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers

2015-06-18 Thread Arun Siluvery
Some of the WA are to be applied during context save but before restore and
some at the end of context save/restore but before executing the instructions
in the ring, WA batch buffers are created for this purpose and these WA cannot
be applied using normal means. Each context has two registers to load the
offsets of these batch buffers. If they are non-zero, HW understands that it
needs to execute these batches.

v1: In this version two separate ring_buffer objects were used to load WA
instructions for indirect and per context batch buffers and they were part
of every context.

v2: Chris suggested to include additional page in context and use it to load
these WA instead of creating separate objects. This will simplify a lot of things
as we need not explicitly pin/unpin them. Thomas Daniel further pointed out that GuC
is planning to use a similar setup to share data between GuC and driver and
WA batch buffers can probably share that page. However after discussions with
Dave who is implementing GuC changes, he suggested using an independent page
for these reasons: the GuC area might grow, and these WA are initialized only once
and are not changed afterwards, so we can share them across all contexts.

The page is updated with WA during render ring init. This has an advantage of
not adding more special cases to default_context.

We don't know upfront the number of WA we will be applying using these batch buffers.
For this reason the size was fixed earlier, but that is not a good idea. To fix this,
the functions that load instructions are modified to report the number of commands
inserted, and the size is now calculated after the batch is updated. A macro is
introduced to add commands to these batch buffers which also checks for overflow
and returns error.
We have a full page dedicated for these WA so that should be sufficient for
good number of WA, anything more means we have major issues.
The list for Gen8 is small, same for Gen9 also, maybe few more gets added
going forward but not close to filling entire page. Chris suggested a two-pass
approach but we agreed to go with single page setup as it is a one-off routine
and simpler code wins. Moved around functions to simplify it further, add 
comments.

One additional option is offset field which is helpful if we would like to
have multiple batches at different offsets within the page and select them
based on some criteria. This is not a requirement at this point but could
help in future (Dave).

(Thanks to Chris, Dave and Thomas for their reviews and inputs)

Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/intel_lrc.c| 204 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  18 +++
 2 files changed, 218 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0413b8f..ad0b189 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -211,6 +211,7 @@ enum {
FAULT_AND_CONTINUE /* Unsupported */
 };
 #define GEN8_CTX_ID_SHIFT 32
+#define CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT  0x17
 
 static int intel_lr_context_pin(struct intel_engine_cs *ring,
struct intel_context *ctx);
@@ -1077,6 +1078,173 @@ static int intel_logical_ring_workarounds_emit(struct 
intel_engine_cs *ring,
return 0;
 }
 
+#define wa_ctx_emit(batch, cmd) {  \
+   if (WARN_ON(index >= (PAGE_SIZE / sizeof(uint32_t)))) { \
+   return -ENOSPC; \
+   }   \
+   batch[index++] = (cmd); \
+   }
+
+/**
+ * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA
+ *
+ * @ring: only applicable for RCS
+ * @wa_ctx_batch: page in which WA are loaded
+ * @offset: This is for future use in case if we would like to have multiple
+ *  batches at different offsets and select them based on a criteria.
+ * @num_dwords: The number of WA applied is not known up front; this returns
+ * the number of DWORDS written. This batch does not contain MI_BATCH_BUFFER_END
+ * so it adds padding to make it cacheline aligned. MI_BATCH_BUFFER_END will be
+ * added to perctx batch and both of them together makes a complete batch 
buffer.
+ *
+ * Return: non-zero if we exceed the PAGE_SIZE limit.
+ */
+
+static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring,
+   uint32_t **wa_ctx_batch,
+   uint32_t offset,
+   uint32_t *num_dwords)
+{
+   uint32_t index;
+   uint32_t *batch = *wa_ctx_batch;
+
+   index = offset;
+
+   /* FIXME: fill one cacheline with NOOPs.
+* Replace these instructions with WA
+*/
+   while (index < (offset + 16))
+   wa_ctx_emit(batch, MI_NOOP);

[Intel-gfx] [PATCH v5 3/6] drm/i915/gen8: Add WaDisableCtxRestoreArbitration workaround

2015-06-18 Thread Arun Siluvery
In Indirect and Per context w/a batch buffer,
+WaDisableCtxRestoreArbitration

Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/intel_lrc.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1d31eb5..8d5932a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1110,10 +1110,11 @@ static int gen8_init_indirectctx_bb(struct 
intel_engine_cs *ring,
 
index = offset;
 
-   /* FIXME: fill one cacheline with NOOPs.
-* Replace these instructions with WA
-*/
-   while (index < (offset + 16))
+   /* WaDisableCtxRestoreArbitration:bdw,chv */
+   wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_DISABLE);
+
+   /* padding */
+while (((unsigned long) (batch + index) % CACHELINE_BYTES) != 0)
wa_ctx_emit(batch, MI_NOOP);
 
/*
@@ -1143,13 +1144,10 @@ static int gen8_init_perctx_bb(struct intel_engine_cs 
*ring,
 
index = offset;
 
-   /* FIXME: fill one cacheline with NOOPs.
-* Replace these instructions with WA
-*/
-   while (index < (offset + 16))
-   wa_ctx_emit(batch, MI_NOOP);
+   /* WaDisableCtxRestoreArbitration:bdw,chv */
+   wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_ENABLE);
 
-   batch[index - 1] = MI_BATCH_BUFFER_END;
+   wa_ctx_emit(batch, MI_BATCH_BUFFER_END);
 
*num_dwords = index - offset;
 
-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 6/6] drm/i915/gen8: Add WaRsRestoreWithPerCtxtBb workaround

2015-06-18 Thread Arun Siluvery
In Per context w/a batch buffer,
WaRsRestoreWithPerCtxtBb

v2: This patches modifies definitions of MI_LOAD_REGISTER_MEM and
MI_LOAD_REGISTER_REG; Add GEN8 specific defines for these instructions
so as to not break any future users of existing definitions (Michel)

v3: Length defined in current definitions of LRM, LRR instructions was specified
as 0. It seems to be a common convention for instructions whose length varies between
platforms. This is not an issue so far because they are not used anywhere except
the command parser; now that we use them in this patch, update them with the correct
length and also move them out of the command parser placeholder to an appropriate
place. Remove unnecessary padding and follow the WA programming sequence exactly
as mentioned in spec which is essential for this WA (Dave).

Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_reg.h  | 29 +++--
 drivers/gpu/drm/i915/intel_lrc.c | 54 
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7637e64..208620d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -347,6 +347,31 @@
 #define   MI_INVALIDATE_BSD		(1<<7)
 #define   MI_FLUSH_DW_USE_GTT		(1<<2)
 #define   MI_FLUSH_DW_USE_PPGTT	(0<<2)
+#define MI_LOAD_REGISTER_MEM		MI_INSTR(0x29, 1)
+#define MI_LOAD_REGISTER_MEM_GEN8	MI_INSTR(0x29, 2)
+#define   MI_LRM_USE_GLOBAL_GTT	(1<<22)
+#define   MI_LRM_ASYNC_MODE_ENABLE	(1<<21)
+#define MI_LOAD_REGISTER_REG		MI_INSTR(0x2A, 1)
+#define MI_ATOMIC(len)			MI_INSTR(0x2F, (len-2))
+#define   MI_ATOMIC_MEMORY_TYPE_GGTT	(1<<22)
+#define   MI_ATOMIC_INLINE_DATA	(1<<18)
+#define   MI_ATOMIC_CS_STALL		(1<<17)
+#define   MI_ATOMIC_RETURN_DATA_CTL	(1<<16)
+#define MI_ATOMIC_OP_MASK(op)		((op) << 8)
+#define MI_ATOMIC_AND  MI_ATOMIC_OP_MASK(0x01)
+#define MI_ATOMIC_OR   MI_ATOMIC_OP_MASK(0x02)
+#define MI_ATOMIC_XOR  MI_ATOMIC_OP_MASK(0x03)
+#define MI_ATOMIC_MOVE MI_ATOMIC_OP_MASK(0x04)
+#define MI_ATOMIC_INC  MI_ATOMIC_OP_MASK(0x05)
+#define MI_ATOMIC_DEC  MI_ATOMIC_OP_MASK(0x06)
+#define MI_ATOMIC_ADD  MI_ATOMIC_OP_MASK(0x07)
+#define MI_ATOMIC_SUB  MI_ATOMIC_OP_MASK(0x08)
+#define MI_ATOMIC_RSUB MI_ATOMIC_OP_MASK(0x09)
+#define MI_ATOMIC_IMAX MI_ATOMIC_OP_MASK(0x0A)
+#define MI_ATOMIC_IMIN MI_ATOMIC_OP_MASK(0x0B)
+#define MI_ATOMIC_UMAX MI_ATOMIC_OP_MASK(0x0C)
+#define MI_ATOMIC_UMIN MI_ATOMIC_OP_MASK(0x0D)
+
 #define MI_BATCH_BUFFERMI_INSTR(0x30, 1)
 #define   MI_BATCH_NON_SECURE  (1)
 /* for snb/ivb/vlv this also means batch in ppgtt when ppgtt is enabled. */
@@ -451,8 +476,6 @@
 #define MI_CLFLUSH  MI_INSTR(0x27, 0)
 #define MI_REPORT_PERF_COUNTMI_INSTR(0x28, 0)
 #define   MI_REPORT_PERF_COUNT_GGTT (1<<0)
-#define MI_LOAD_REGISTER_MEM	MI_INSTR(0x29, 0)
-#define MI_LOAD_REGISTER_REG	MI_INSTR(0x2A, 0)
 #define MI_RS_STORE_DATA_IMMMI_INSTR(0x2B, 0)
 #define MI_LOAD_URB_MEM MI_INSTR(0x2C, 0)
 #define MI_STORE_URB_MEMMI_INSTR(0x2D, 0)
@@ -1799,6 +1822,8 @@ enum skl_disp_power_wells {
 #define   GEN8_RC_SEMA_IDLE_MSG_DISABLE	(1 << 12)
 #define   GEN8_FF_DOP_CLOCK_GATE_DISABLE	(1<<10)
 
+#define GEN8_RS_PREEMPT_STATUS 0x215C
+
 /* Fuse readout registers for GT */
 #define CHV_FUSE_GT		(VLV_DISPLAY_BASE + 0x2168)
 #define   CHV_FGT_DISABLE_SS0	(1 << 10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 792d559..19a3460 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1179,13 +1179,67 @@ static int gen8_init_perctx_bb(struct intel_engine_cs 
*ring,
   uint32_t *num_dwords)
 {
uint32_t index;
+   uint32_t scratch_addr;
uint32_t *batch = *wa_ctx_batch;
 
index = offset;
 
+   /* Actual scratch location is at 128 bytes offset */
+   scratch_addr = ring->scratch.gtt_offset + 2*CACHELINE_BYTES;
+   scratch_addr |= PIPE_CONTROL_GLOBAL_GTT;
+
/* WaDisableCtxRestoreArbitration:bdw,chv */
wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_ENABLE);
 
+   /*
+* As per Bspec, to workaround a known HW issue, SW must perform the
+* below programming sequence prior to programming MI_BATCH_BUFFER_END.
+*
+* This is only applicable for Gen8.
+*/
+
+   /* WaRsRestoreWithPerCtxtBb:bdw,chv */
+   wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1));
+   wa_ctx_emit(batch, INSTPM);
+   wa_ctx_emit(batch, _MASKED_BIT_DISABLE(INSTPM_FORCE_ORDERING));
+
+   wa_ctx_emit(batch, (MI_ATOMIC(5) |
+   MI_ATOMIC_MEMORY_TYPE_GGTT |
+   MI_ATOMIC_INLINE_DATA |
+   MI_ATOMIC_CS_STALL |
+   

[Intel-gfx] [PATCH v5 2/6] drm/i915/gen8: Re-order init pipe_control in lrc mode

2015-06-18 Thread Arun Siluvery
Some of the WA applied using WA batch buffers perform writes to scratch page.
In the current flow WA are initialized before scratch obj is allocated.
This patch reorders intel_init_pipe_control() to have a valid scratch obj
before we initialize WA.

Signed-off-by: Michel Thierry michel.thie...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/intel_lrc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ad0b189..1d31eb5 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1641,7 +1641,8 @@ static int logical_render_ring_init(struct drm_device 
*dev)
 ring->emit_bb_start = gen8_emit_bb_start;
 
 ring->dev = dev;
-   ret = logical_ring_init(dev, ring);
+
+   ret = intel_init_pipe_control(ring);
if (ret)
return ret;
 
@@ -1653,7 +1654,7 @@ static int logical_render_ring_init(struct drm_device 
*dev)
}
}
 
-   ret = intel_init_pipe_control(ring);
+   ret = logical_ring_init(dev, ring);
if (ret) {
 if (ring->wa_ctx.obj)
lrc_destroy_wa_ctx_obj(ring);
-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Reset request handling for gen8+

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 01:22:36PM +0300, Mika Kuoppala wrote:
 Chris Wilson ch...@chris-wilson.co.uk writes:
 
  On Thu, Jun 18, 2015 at 12:51:40PM +0300, Mika Kuoppala wrote:
  In order for gen8+ hardware to guarantee that no context switch
  takes place during engine reset and that current context is properly
  saved, the driver needs to notify and query hw before commencing
  with reset.
  
  There are gpu hangs where the engine gets so stuck that it never will
  report to be ready for reset. We could proceed with reset anyway, but
  with some hangs with skl, the forced gpu reset will result in a system
   hang. By inspection, the unreadiness for reset seems to correlate with
   the probable system hang.
  
  We will only proceed with reset if all engines report that they
  are ready for reset. If root cause for system hang is found and
  can be worked around with another means, we can reconsider if
  we can reinstate full reset for unreadiness case.
  
  v2: -EIO, Recovery, gen8 (Chris, Tomas, Daniel)
  v3: updated commit msg
  v4: timeout_ms, simpler error path (Chris)
  
  References: https://bugs.freedesktop.org/show_bug.cgi?id=89959
  References: https://bugs.freedesktop.org/show_bug.cgi?id=90854
  Testcase: igt/gem_concurrent_blit --r 
  prw-blt-overwrite-source-read-rcs-forked
  Testcase: igt/gem_concurrent_blit --r 
  gtt-blt-overwrite-source-read-rcs-forked
 
  Is this the new format for subtests?
 
 No. It is me cutpasting from scripts. Daniel could you please
 fix while merging.

Done and queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Factor out p2 divider selection for pre-ilk platforms

2015-06-18 Thread Imre Deak
On Thu, 2015-06-18 at 13:47 +0300, ville.syrj...@linux.intel.com wrote:
 From: Ville Syrjälä ville.syrj...@linux.intel.com
 
 The same dpll p2 divider selection is repeated three times in the
 gen2-4 .find_dpll() functions. Factor it out.
 
 Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com

Looks ok to me:
Reviewed-by: Imre Deak imre.d...@intel.com

 ---
  drivers/gpu/drm/i915/intel_display.c | 78 
 ++--
  1 file changed, 30 insertions(+), 48 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index 2fa81ed..2cc8ae7 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -643,16 +643,12 @@ static bool intel_PLL_is_valid(struct drm_device *dev,
   return true;
  }
  
 -static bool
 -i9xx_find_best_dpll(const intel_limit_t *limit,
 - struct intel_crtc_state *crtc_state,
 - int target, int refclk, intel_clock_t *match_clock,
 - intel_clock_t *best_clock)
 +static int
 +i9xx_select_p2_div(const intel_limit_t *limit,
 +const struct intel_crtc_state *crtc_state,
 +int target)
  {
  - struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
  - struct drm_device *dev = crtc->base.dev;
  - intel_clock_t clock;
  - int err = target;
  + struct drm_device *dev = crtc_state->base.crtc->dev;
  
   if (intel_pipe_will_have_type(crtc_state, INTEL_OUTPUT_LVDS)) {
   /*
 @@ -661,18 +657,31 @@ i9xx_find_best_dpll(const intel_limit_t *limit,
* single/dual channel state, if we even can.
*/
   if (intel_is_dual_link_lvds(dev))
  - clock.p2 = limit->p2.p2_fast;
  + return limit->p2.p2_fast;
    else
  - clock.p2 = limit->p2.p2_slow;
  + return limit->p2.p2_slow;
    } else {
    if (target < limit->p2.dot_limit)
  - clock.p2 = limit->p2.p2_slow;
  + return limit->p2.p2_slow;
    else
  - clock.p2 = limit->p2.p2_fast;
  + return limit->p2.p2_fast;
   }
 +}
 +
 +static bool
 +i9xx_find_best_dpll(const intel_limit_t *limit,
 + struct intel_crtc_state *crtc_state,
 + int target, int refclk, intel_clock_t *match_clock,
 + intel_clock_t *best_clock)
 +{
  + struct drm_device *dev = crtc_state->base.crtc->dev;
 + intel_clock_t clock;
 + int err = target;
  
   memset(best_clock, 0, sizeof(*best_clock));
  
 + clock.p2 = i9xx_select_p2_div(limit, crtc_state, target);
 +
    for (clock.m1 = limit->m1.min; clock.m1 <= limit->m1.max;
         clock.m1++) {
    for (clock.m2 = limit->m2.min;
 @@ -712,30 +721,14 @@ pnv_find_best_dpll(const intel_limit_t *limit,
  int target, int refclk, intel_clock_t *match_clock,
  intel_clock_t *best_clock)
  {
  - struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
  - struct drm_device *dev = crtc->base.dev;
  + struct drm_device *dev = crtc_state->base.crtc->dev;
    intel_clock_t clock;
    int err = target;
   
  - if (intel_pipe_will_have_type(crtc_state, INTEL_OUTPUT_LVDS)) {
  - /*
  -  * For LVDS just rely on its current settings for dual-channel.
  -  * We haven't figured out how to reliably set up different
  -  * single/dual channel state, if we even can.
  -  */
  - if (intel_is_dual_link_lvds(dev))
  - clock.p2 = limit->p2.p2_fast;
  - else
  - clock.p2 = limit->p2.p2_slow;
  - } else {
  - if (target > limit->p2.dot_limit)
  - clock.p2 = limit->p2.p2_slow;
  - else
  - clock.p2 = limit->p2.p2_fast;
  - }
  -
    memset(best_clock, 0, sizeof(*best_clock));
   
  + clock.p2 = i9xx_select_p2_div(limit, crtc_state, target);
  +
    for (clock.m1 = limit->m1.min; clock.m1 <= limit->m1.max;
         clock.m1++) {
    for (clock.m2 = limit->m2.min;
 @@ -773,28 +766,17 @@ g4x_find_best_dpll(const intel_limit_t *limit,
  int target, int refclk, intel_clock_t *match_clock,
  intel_clock_t *best_clock)
  {
  - struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
  - struct drm_device *dev = crtc->base.dev;
  + struct drm_device *dev = crtc_state->base.crtc->dev;
    intel_clock_t clock;
    int max_n;
  - bool found;
  + bool found = false;
    /* approximately equals target * 0.00585 */
    int err_most = (target >> 8) + (target >> 9);
  - found = false;
  -
  - if (intel_pipe_will_have_type(crtc_state, INTEL_OUTPUT_LVDS)) {
  - if (intel_is_dual_link_lvds(dev))
  - clock.p2 = limit->p2.p2_fast;
 - 

Re: [Intel-gfx] [PATCH v5] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread ch...@chris-wilson.co.uk
On Thu, Jun 18, 2015 at 04:25:47PM +0100, Damien Lespiau wrote:
 On Thu, Jun 18, 2015 at 03:45:44PM +0100, Antoine, Peter wrote:
  So, initializing the other (non-render) MOCS in gen8_init_rcs_context()
  isn't the most logical thing to do I'm afraid. What happens if we
  suddenly decide that we don't want to fully initialize the default
  context at startup but initialize each ring on-demand for that context
  as well? We can end up in a situation where we use the blitter first
  and we wouldn't have the blitter MOCS initialized.
  
  In that sense, that code makes an assumption about how we do things in
  a completely different part of the driver and that's always a
  potential source of bugs.
  
 
 Yes, but this is the same with the golden context and the workarounds
 (as I understand it) so all this code would have to be moved. 
 
 Ah, but the workarounds in that function are only for registers in the
 render context, not other rings/engine.

Yes, but it just so happens that we initialise the default context
before userspace so that we know that context is pristine before sending
batches to the GPU.

This is the reason why I think it is important to mark this function as
being executed at that stage, so that all parties can be sure that the
execution is before real use of the GPU and so we can use the RCS to
initialise the other rings. At the moment, I am happy with baking that
assumption into the code, we can readdress it later if there are non-RCS
operations that must be performed at context init and conflict with the
RCS programming.

If you can think of a suitable comment to forewarn us in future about
potential conflicts in adding xcs->init_context(), be my guest.
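
For what it's worth, a sketch of the kind of forewarning comment being asked
for (wording invented here, not taken from the patch):

    /*
     * NOTE: gen8_init_rcs_context() runs against the default context before
     * any userspace batch is submitted, so it is safe to program the
     * non-render MOCS registers from the RCS here.  If an xcs->init_context()
     * hook is ever added that must run at context init, revisit this
     * assumption.
     */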
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote:
 On 17/06/15 13:05, Daniel Vetter wrote:
  On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
  Current devices may contain one or more programmable microcontrollers
  that need to have a firmware image (aka binary blob) loaded from an
  external medium and transferred to the device's memory.
 
  This file provides generic support functions for doing this; they can
  then be used by each uC-specific loader, thus reducing code duplication
  and testing effort.
 
  Signed-off-by: Dave Gordon david.s.gor...@intel.com
  Signed-off-by: Alex Dai yu@intel.com
  
  Given that I'm just shredding the synchronization used by the dmc loader
  I'm not convinced this is a good idea. Abstraction has cost, and a bit of
  copy-paste for similar sounding but slightly different things doesn't
  sound awful to me. And the critical bit in all the firmware loading I've
  seen thus far is in synchronizing the loading with other operations,
  hiding that isn't a good idea. Worse if we enforce stuff like requiring
 dev->struct_mutex.
  -Daniel
 
 It's precisely because it's in some sense trivial-but-tricky that we
 should write it once, get it right, and use it everywhere. Copypaste
 /does/ sound awful; I've seen how the code this was derived from had
 already been cloned into three flavours, all different and all wrong.
 
 It's a very simple abstraction: one early call to kick things off as
 early as possible, no locking required. One late call with the
 struct_mutex held to complete the synchronisation and actually do the
 work, thus guaranteeing that the transfer to the target uC is done in a
 controlled fashion, at a time of the caller's choice, and by the
 driver's mainline thread, NOT by an asynchronous thread racing with
 other activity (which was one of the things wrong with the original
 version).
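 
 A minimal sketch of the two-call pattern being described, assuming a
 completion is used for the handover (names are illustrative, not the API
 from the patch):
 
     struct intel_uc_fw {
             const struct firmware *blob;
             struct completion fetched;      /* signalled by the async fetch */
     };
 
     /* Early, no locks held: just kick off the fetch from the filesystem,
      * e.g. via request_firmware_nowait(), completing fw->fetched when done. */
     static void uc_fw_fetch_async(struct drm_device *dev, struct intel_uc_fw *fw)
     {
             init_completion(&fw->fetched);
             /* ... schedule the asynchronous firmware request here ... */
     }
 
     /* Late, from driver mainline with dev->struct_mutex held: wait for the
      * blob and only then transfer it to the microcontroller. */
     static int uc_fw_load(struct drm_device *dev, struct intel_uc_fw *fw)
     {
             WARN_ON(!mutex_is_locked(&dev->struct_mutex));
 
             wait_for_completion(&fw->fetched);
             if (!fw->blob)
                     return -ENOENT;
 
             /* ... copy fw->blob->data into a GEM object and DMA it to the uC ... */
             return 0;
     }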

Yeah I've seen the origins of this in the display code, and that code gets
the syncing wrong. The only thing that one has do to is grab a runtime pm
reference for the appropriate power well to prevent dc5 entry, and release
it when the firmware is loaded and initialized.
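
A sketch of that synchronization, assuming the existing display power domain
helpers are the right tool (the domain used here is only an example):

    /* Block DC5/DC6 entry while the firmware is being fetched and loaded. */
    intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);

    /* ... asynchronous firmware fetch + DMC programming ... */

    /* Firmware is live: allow DC states again. */
    intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);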

Which means any kind of firmware loader which requires/uses
dev->struct_mutex get stuff wrong and is not appropriate everywhere.

 We should convert the DMC loader to use this too, so there need be only
 one bit of code in the whole driver that needs to understand how to use
 completions to get correct handover from a free-running no-locks-held
 thread to the properly disciplined environment of driver mainline for
 purposes of programming the h/w.

Nack on using this for dmc, since I want them to convert it to the above
synchronization, since that's how all the other async power initialization
is done.

Guc is different since we really must have it ready for execbuf, and for
that usecase a completion at drm_open time sounds like the right thing.

As a rule of thumb for refactoring and shared infrastructure we use the
following recipe in drm:
- first driver implements things as straightforward as possible
- 2nd user copypastes
- 3rd one has the duty to figure out whether some refactoring is in order
  or not.

Imo that approach leads to a really good balance between avoiding
overengineering and having maintainable code.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2 04/10] drm: Add Gamma correction structure

2015-06-18 Thread Emil Velikov
On 14 June 2015 at 10:02, Sharma, Shashank shashank.sha...@intel.com wrote:
 Hi, Emil Velikov

 The reason behind a zero sized array is that we want to use the same variable 
 for the various color corrections possible across various drivers.
 Due to the current blob implementation, it doesn’t look very efficient to have 
 another pointer in the structure, so we are left with this option only.

Can you elaborate (or suggest any reading material) on those inefficiencies?

 I guess as long as we are using gcc (which is for all Linux distributions), 
 we are good. The size of the zero sized array will be zero, so no alignment 
 errors as such.

Note that most of the DRM subsystem code is dual-licensed. As such it
is used in other OSes - Solaris, *BSD, not to mention the work (in
progress) about using clang/LLVM to build the kernel. In the former
case not everyone uses GCC.

Thanks
Emil
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/5] drm/i915/bxt: add missing DDI PLL registers to the state checking

2015-06-18 Thread Imre Deak
Although we have a fixed setting for the PLL9 and EBB4 registers, it
still makes sense to check them together with the rest of PLL registers.

While at it also remove a redundant comment about 10 bit clock enabling.

Signed-off-by: Imre Deak imre.d...@intel.com
---
 drivers/gpu/drm/i915/i915_drv.h  |  3 ++-
 drivers/gpu/drm/i915/i915_reg.h  |  3 ++-
 drivers/gpu/drm/i915/intel_ddi.c | 16 +---
 drivers/gpu/drm/i915/intel_display.c |  6 --
 4 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 491ef0c..bf235ff 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -366,7 +366,8 @@ struct intel_dpll_hw_state {
uint32_t cfgcr1, cfgcr2;
 
/* bxt */
-   uint32_t ebb0, pll0, pll1, pll2, pll3, pll6, pll8, pll10, pcsdw12;
+   uint32_t ebb0, ebb4, pll0, pll1, pll2, pll3, pll6, pll8, pll9, pll10,
+pcsdw12;
 };
 
 struct intel_shared_dpll_config {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 4bbc85a..bba0691 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1207,7 +1207,8 @@ enum skl_disp_power_wells {
 /* PORT_PLL_8_A */
 #define   PORT_PLL_TARGET_CNT_MASK 0x3FF
 /* PORT_PLL_9_A */
-#define  PORT_PLL_LOCK_THRESHOLD_MASK  0xe
+#define  PORT_PLL_LOCK_THRESHOLD_SHIFT 1
+#define  PORT_PLL_LOCK_THRESHOLD_MASK  (0x7 << PORT_PLL_LOCK_THRESHOLD_SHIFT)
 /* PORT_PLL_10_A */
 #define  PORT_PLL_DCO_AMP_OVR_EN_H (1 << 27)
 #define  PORT_PLL_DCO_AMP_MASK 0x3c00
diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index bdc5677..ca970ba 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1476,11 +1476,15 @@ bxt_ddi_pll_select(struct intel_crtc *intel_crtc,
 
 crtc_state->dpll_hw_state.pll8 = targ_cnt;
 
+   crtc_state->dpll_hw_state.pll9 = 5 << PORT_PLL_LOCK_THRESHOLD_SHIFT;
+
 if (dcoampovr_en_h)
 crtc_state->dpll_hw_state.pll10 = PORT_PLL_DCO_AMP_OVR_EN_H;
 
 crtc_state->dpll_hw_state.pll10 |= PORT_PLL_DCO_AMP(dco_amp);
 
+   crtc_state->dpll_hw_state.ebb4 = PORT_PLL_10BIT_CLK_ENABLE;
+
 crtc_state->dpll_hw_state.pcsdw12 =
 LANESTAGGER_STRAP_OVRD | lanestagger;
 
@@ -2414,7 +2418,7 @@ static void bxt_ddi_pll_enable(struct drm_i915_private 
*dev_priv,
 
 temp = I915_READ(BXT_PORT_PLL(port, 9));
 temp &= ~PORT_PLL_LOCK_THRESHOLD_MASK;
-   temp |= (5 << 1);
+   temp |= pll->config.hw_state.pll9;
 I915_WRITE(BXT_PORT_PLL(port, 9), temp);
 
 temp = I915_READ(BXT_PORT_PLL(port, 10));
 @@ -2427,8 +2431,8 @@ static void bxt_ddi_pll_enable(struct drm_i915_private 
 *dev_priv,
 temp = I915_READ(BXT_PORT_PLL_EBB_4(port));
 temp |= PORT_PLL_RECALIBRATE;
 I915_WRITE(BXT_PORT_PLL_EBB_4(port), temp);
-   /* Enable 10 bit clock */
-   temp |= PORT_PLL_10BIT_CLK_ENABLE;
+   temp &= ~PORT_PLL_10BIT_CLK_ENABLE;
+   temp |= pll->config.hw_state.ebb4;
 I915_WRITE(BXT_PORT_PLL_EBB_4(port), temp);
 
/* Enable PLL */
@@ -2481,6 +2485,9 @@ static bool bxt_ddi_pll_get_hw_state(struct 
drm_i915_private *dev_priv,
 hw_state->ebb0 = I915_READ(BXT_PORT_PLL_EBB_0(port));
 hw_state->ebb0 &= PORT_PLL_P1_MASK | PORT_PLL_P2_MASK;
 
+   hw_state->ebb4 = I915_READ(BXT_PORT_PLL_EBB_4(port));
+   hw_state->ebb4 &= PORT_PLL_10BIT_CLK_ENABLE;
+
 hw_state->pll0 = I915_READ(BXT_PORT_PLL(port, 0));
 hw_state->pll0 &= PORT_PLL_M2_MASK;
 
 @@ -2501,6 +2508,9 @@ static bool bxt_ddi_pll_get_hw_state(struct 
 drm_i915_private *dev_priv,
 hw_state->pll8 = I915_READ(BXT_PORT_PLL(port, 8));
 hw_state->pll8 &= PORT_PLL_TARGET_CNT_MASK;
 
+   hw_state->pll9 = I915_READ(BXT_PORT_PLL(port, 9));
+   hw_state->pll9 &= PORT_PLL_LOCK_THRESHOLD_MASK;
+
 hw_state->pll10 = I915_READ(BXT_PORT_PLL(port, 10));
 hw_state->pll10 &= PORT_PLL_DCO_AMP_OVR_EN_H |
    PORT_PLL_DCO_AMP_MASK;
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 9149410..6f79680 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11905,17 +11905,19 @@ static void intel_dump_pipe_config(struct intel_crtc 
*crtc,
 DRM_DEBUG_KMS("double wide: %i\n", pipe_config->double_wide);
 
 if (IS_BROXTON(dev)) {
-   DRM_DEBUG_KMS("ddi_pll_sel: %u; dpll_hw_state: ebb0: 0x%x, "
+   DRM_DEBUG_KMS("ddi_pll_sel: %u; dpll_hw_state: ebb0: 0x%x, ebb4: 0x%x, "
   "pll0: 0x%x, pll1: 0x%x, pll2: 0x%x, pll3: 0x%x, "
- "pll6: 0x%x, pll8: 0x%x, pcsdw12: 0x%x\n",
+ "pll6: 0x%x, pll8: 0x%x, pll9: 0x%x, pcsdw12: 0x%x\n",
   pipe_config->ddi_pll_sel,
 

[Intel-gfx] [PATCH 5/5] drm/i915/bxt: add DDI port HW readout support

2015-06-18 Thread Imre Deak
Add support for reading out the HW state for DDI ports. Since the actual
programming is very similar to the CHV/VLV DPIO PLL programming we can
reuse much of the logic from there.

This fixes the state checker failures I saw on my BXT with HDMI output.

Signed-off-by: Imre Deak imre.d...@intel.com
---
 drivers/gpu/drm/i915/i915_reg.h  | 15 +--
 drivers/gpu/drm/i915/intel_ddi.c | 22 --
 2 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index bba0691..fcf6ad5 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1169,10 +1169,12 @@ enum skl_disp_power_wells {
 #define _PORT_PLL_EBB_0_A  0x162034
 #define _PORT_PLL_EBB_0_B  0x6C034
 #define _PORT_PLL_EBB_0_C  0x6C340
-#define   PORT_PLL_P1_MASK (0x07 << 13)
-#define   PORT_PLL_P1(x)   ((x) << 13)
-#define   PORT_PLL_P2_MASK (0x1f << 8)
-#define   PORT_PLL_P2(x)   ((x) << 8)
+#define   PORT_PLL_P1_SHIFT    13
+#define   PORT_PLL_P1_MASK (0x07 << PORT_PLL_P1_SHIFT)
+#define   PORT_PLL_P1(x)   ((x) << PORT_PLL_P1_SHIFT)
+#define   PORT_PLL_P2_SHIFT    8
+#define   PORT_PLL_P2_MASK (0x1f << PORT_PLL_P2_SHIFT)
+#define   PORT_PLL_P2(x)   ((x) << PORT_PLL_P2_SHIFT)
 #define BXT_PORT_PLL_EBB_0(port)   _PORT3(port, _PORT_PLL_EBB_0_A, \
_PORT_PLL_EBB_0_B,  \
_PORT_PLL_EBB_0_C)
@@ -1192,8 +1194,9 @@ enum skl_disp_power_wells {
 /* PORT_PLL_0_A */
 #define   PORT_PLL_M2_MASK 0xFF
 /* PORT_PLL_1_A */
-#define   PORT_PLL_N_MASK  (0x0F << 8)
-#define   PORT_PLL_N(x)    ((x) << 8)
+#define   PORT_PLL_N_SHIFT 8
+#define   PORT_PLL_N_MASK  (0x0F << PORT_PLL_N_SHIFT)
+#define   PORT_PLL_N(x)    ((x) << PORT_PLL_N_SHIFT)
 /* PORT_PLL_2_A */
 #define   PORT_PLL_M2_FRAC_MASK0x3F
 /* PORT_PLL_3_A */
diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index ca970ba..6859068 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -971,8 +971,26 @@ static void hsw_ddi_clock_get(struct intel_encoder 
*encoder,
 static int bxt_calc_pll_link(struct drm_i915_private *dev_priv,
enum intel_dpll_id dpll)
 {
-   /* FIXME formula not available in bspec */
-   return 0;
+   struct intel_shared_dpll *pll;
+   struct intel_dpll_hw_state *state;
+   intel_clock_t clock;
+
+   /* For DDI ports we always use a shared PLL. */
+   if (WARN_ON(dpll == DPLL_ID_PRIVATE))
+   return 0;
+
+   pll = &dev_priv->shared_dplls[dpll];
+   state = &pll->config.hw_state;
+
+   clock.m1 = 2;
+   clock.m2 = (state->pll0 & PORT_PLL_M2_MASK) << 22;
+   if (state->pll3 & PORT_PLL_M2_FRAC_ENABLE)
+   clock.m2 |= state->pll2 & PORT_PLL_M2_FRAC_MASK;
+   clock.n = (state->pll1 & PORT_PLL_N_MASK) >> PORT_PLL_N_SHIFT;
+   clock.p1 = (state->ebb0 & PORT_PLL_P1_MASK) >> PORT_PLL_P1_SHIFT;
+   clock.p2 = (state->ebb0 & PORT_PLL_P2_MASK) >> PORT_PLL_P2_SHIFT;
+
+   return vlv_calc_port_clock(10, &clock);
 }
 
 static void bxt_ddi_clock_get(struct intel_encoder *encoder,
-- 
2.1.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 1/5] drm/i915/bxt: mask off the DPLL state checker bits we don't program

2015-06-18 Thread Imre Deak
For the purpose of state checking we only care about the DPLL HW flags
that we actually program, so mask off the ones that we don't.

This fixes one set of DPLL state check failures.

Signed-off-by: Imre Deak imre.d...@intel.com
---
 drivers/gpu/drm/i915/intel_ddi.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 9ae297a..bdc5677 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -2479,13 +2479,32 @@ static bool bxt_ddi_pll_get_hw_state(struct 
drm_i915_private *dev_priv,
return false;
 
 hw_state->ebb0 = I915_READ(BXT_PORT_PLL_EBB_0(port));
+   hw_state->ebb0 &= PORT_PLL_P1_MASK | PORT_PLL_P2_MASK;
+
 hw_state->pll0 = I915_READ(BXT_PORT_PLL(port, 0));
+   hw_state->pll0 &= PORT_PLL_M2_MASK;
+
 hw_state->pll1 = I915_READ(BXT_PORT_PLL(port, 1));
+   hw_state->pll1 &= PORT_PLL_N_MASK;
+
 hw_state->pll2 = I915_READ(BXT_PORT_PLL(port, 2));
+   hw_state->pll2 &= PORT_PLL_M2_FRAC_MASK;
+
 hw_state->pll3 = I915_READ(BXT_PORT_PLL(port, 3));
+   hw_state->pll3 &= PORT_PLL_M2_FRAC_ENABLE;
+
 hw_state->pll6 = I915_READ(BXT_PORT_PLL(port, 6));
+   hw_state->pll6 &= PORT_PLL_PROP_COEFF_MASK |
+ PORT_PLL_INT_COEFF_MASK |
+ PORT_PLL_GAIN_CTL_MASK;
+
 hw_state->pll8 = I915_READ(BXT_PORT_PLL(port, 8));
+   hw_state->pll8 &= PORT_PLL_TARGET_CNT_MASK;
+
 hw_state->pll10 = I915_READ(BXT_PORT_PLL(port, 10));
+   hw_state->pll10 &= PORT_PLL_DCO_AMP_OVR_EN_H |
+  PORT_PLL_DCO_AMP_MASK;
+
 /*
  * While we write to the group register to program all lanes at once we
  * can read only lane registers. We configure all lanes the same way, so
 @@ -2496,6 +2515,7 @@ static bool bxt_ddi_pll_get_hw_state(struct 
 drm_i915_private *dev_priv,
 DRM_DEBUG_DRIVER("lane stagger config different for lane 01 (%08x) and 23 (%08x)\n",
  hw_state->pcsdw12,
  I915_READ(BXT_PORT_PCS_DW12_LN23(port)));
+   hw_state->pcsdw12 &= LANE_STAGGER_MASK | LANESTAGGER_STRAP_OVRD;
 
return true;
 }
-- 
2.1.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 4/5] drm/i915/vlv: factor out vlv_calc_port_clock

2015-06-18 Thread Imre Deak
This functionality will be needed by the next patch adding HW readout
support for DDI ports on BXT, so factor it out.

No functional change.

Signed-off-by: Imre Deak imre.d...@intel.com
---
 drivers/gpu/drm/i915/intel_display.c | 18 ++
 drivers/gpu/drm/i915/intel_drv.h |  2 ++
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 0e5c613..6cf2a15 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7993,6 +7993,14 @@ static void i9xx_get_pfit_config(struct intel_crtc *crtc,
 I915_READ(LVDS) & LVDS_BORDER_ENABLE;
 }
 
+int vlv_calc_port_clock(int refclk, intel_clock_t *pll_clock)
+{
+   chv_clock(refclk, pll_clock);
+
+   /* clock.dot is the fast clock */
+   return pll_clock->dot / 5;
+}
+
 static void vlv_crtc_clock_get(struct intel_crtc *crtc,
   struct intel_crtc_state *pipe_config)
 {
@@ -8017,10 +8025,7 @@ static void vlv_crtc_clock_get(struct intel_crtc *crtc,
 clock.p1 = (mdiv >> DPIO_P1_SHIFT) & 7;
 clock.p2 = (mdiv >> DPIO_P2_SHIFT) & 0x1f;
 
-   vlv_clock(refclk, &clock);
-
-   /* clock.dot is the fast clock */
-   pipe_config->port_clock = clock.dot / 5;
+   pipe_config->port_clock = vlv_calc_port_clock(refclk, &clock);
 }
 
 static void
@@ -8116,10 +8121,7 @@ static void chv_crtc_clock_get(struct intel_crtc *crtc,
 clock.p1 = (cmn_dw13 >> DPIO_CHV_P1_DIV_SHIFT) & 0x7;
 clock.p2 = (cmn_dw13 >> DPIO_CHV_P2_DIV_SHIFT) & 0x1f;
 
-   chv_clock(refclk, &clock);
-
-   /* clock.dot is the fast clock */
-   pipe_config->port_clock = clock.dot / 5;
+   pipe_config->port_clock = vlv_calc_port_clock(refclk, &clock);
 }
 
 static bool i9xx_get_pipe_config(struct intel_crtc *crtc,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index bcafefc..95e14bb 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1139,6 +1139,8 @@ ironlake_check_encoder_dotclock(const struct 
intel_crtc_state *pipe_config,
int dotclock);
 bool bxt_find_best_dpll(struct intel_crtc_state *crtc_state, int target_clock,
intel_clock_t *best_clock);
+int vlv_calc_port_clock(int refclk, intel_clock_t *pll_clock);
+
 bool intel_crtc_active(struct drm_crtc *crtc);
 void hsw_enable_ips(struct intel_crtc *crtc);
 void hsw_disable_ips(struct intel_crtc *crtc);
-- 
2.1.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 3/5] drm/i915/bxt: add PLL10 to the PLL state dumper

2015-06-18 Thread Imre Deak
Signed-off-by: Imre Deak imre.d...@intel.com
---
 drivers/gpu/drm/i915/intel_display.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 6f79680..0e5c613 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11907,7 +11907,7 @@ static void intel_dump_pipe_config(struct intel_crtc 
*crtc,
if (IS_BROXTON(dev)) {
 DRM_DEBUG_KMS("ddi_pll_sel: %u; dpll_hw_state: ebb0: 0x%x, ebb4: 0x%x, "
   "pll0: 0x%x, pll1: 0x%x, pll2: 0x%x, pll3: 0x%x, "
- "pll6: 0x%x, pll8: 0x%x, pll9: 0x%x, pcsdw12: 0x%x\n",
+ "pll6: 0x%x, pll8: 0x%x, pll9: 0x%x, pll10: 0x%x, pcsdw12: 0x%x\n",
   pipe_config->ddi_pll_sel,
   pipe_config->dpll_hw_state.ebb0,
   pipe_config->dpll_hw_state.ebb4,
 @@ -11918,6 +11918,7 @@ static void intel_dump_pipe_config(struct intel_crtc 
 *crtc,
   pipe_config->dpll_hw_state.pll6,
   pipe_config->dpll_hw_state.pll8,
   pipe_config->dpll_hw_state.pll9,
+ pipe_config->dpll_hw_state.pll10,
   pipe_config->dpll_hw_state.pcsdw12);
 } else if (IS_SKYLAKE(dev)) {
 DRM_DEBUG_KMS("ddi_pll_sel: %u; dpll_hw_state: 
-- 
2.1.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Initialize HWS page address after GPU reset

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 03:05:12PM +0100, Siluvery, Arun wrote:
 On 15/06/2015 06:20, Daniel Vetter wrote:
 On Wed, Jun 3, 2015 at 6:14 PM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
 I was going to suggest removing the same thing from the
 lrc_setup_hardware_status_page(), but after another look it seems we
 sometimes call .init_hw() before the context setup. Would be nice to
 have a more consistent sequence for init and reset. But anyway the patch
 looks OK to me. I verified that we indeed lose this register on GPU
 reset.
 
 Yep, this is a mess. And historically _any_ difference between driver
 load and gpu reset (or resume fwiw) has lead to hilarious bugs, so
 this difference is really troubling to me. Arun, can you please work
 on a patch to unify the setup sequence here, so that both driver load
 gpu resets work the same way? By the time we're calling gem_init_hw
 the default context should have been created already, and hence we
 should be able to write to HWS_PGA in ring->init_hw only.
 
 
 Hi Daniel,
 
 I think the problem in this case was the code to init HWS page after reset
 was missing for Gen8+. For Gen7 we are doing this as part of ring->init_hw.
 
 Gen7:
 i915_reset()
 +-- i915_gem_init_hw()
 +-- ring->init_hw() which is init_render_ring()
 +-- init_ring_common()
 + intel_ring_setup_status_page()
 
 Gen8:
 i915_reset()
 +-- i915_gem_init_hw()
 +-- ring->init_hw() which is gen8_init_render_ring()
 + gen8_init_common_ring() - I added changes in this function.
 
 We could probably use intel_ring_setup_status_page() for both cases, does it
 have to be Gen7 specific?

My concern isn't that we have two functions doing hws setup. My concern is
that we now have 2 callsites for execlist mode doing hws setup, with
slight differences between reset/driver load and resume.

I want one, unconditional call to set up the hws page at exactly the right
place in the setup sequence. That might require some refactoring, I
haven't looked that closely at intel_lrc.c

The usual approach is that gem_init does exclusively software setup, and
gem_init_hw does all the register writes an actual enabling. I think the
hws setup in the driver load code is currently called from gem_init(),
which is the wrong place.

 Also I wonder about resume, where's the HWS_PGA restore for that case?
 It is covered.
 
 i915_drm_resume()
 +--i915_gem_init_hw

Ok, so should be covered with whatever fix we have for gpu reset.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Reset request handling for gen8+

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 12:42:55PM +0100, Chris Wilson wrote:
 On Thu, Jun 18, 2015 at 12:18:39PM +0100, Tomas Elf wrote:
  My point was more along the lines of bailing out if the reset
  request fails and not return an error message but simply keep track
  of the number of times we've attempted the reset request. By not
  returning an error we would allow more subsequent hang detections to
  happen (since the hang is still there), which would end up in the
  same reset request in the future. If the reset request would fail
  more times we would simply increment the counter and at one point we
  would decide that we've had too many unsuccessful reset request
  attempts and simply go ahead with the reset anyway and if the reset
  would fail we would return an error at that point in time, which
  would result in a terminally wedged state. But, yeah, I can see why
  we shouldn't do this.
 
 Skipping to the middle!
 
 I understand the merit in trying the reset a few times before giving up,
 it would just need a bit of restructuring to try the reset before
 clearing gem state (trivial) and requeueing the hangcheck. I am just
 wary of feature creep before we get stuck into TDR, which promises to
 change how we think about resets entirely.

My maintainer concern here is always that we should err on the side of not
killing the machine. If the reset failed, or if the gpu reinit failed then
marking the gpu as wedged has historically been the safe option. The
system will still run, display mostly works and there's a reasonable
chance you can gather debug data.

We do have i915.reset to disable the reset for these cases, but it's
always a nuisance to have to resort to that.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/skl: Buffer translation improvements

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 12:50:33PM +0300, David Weinehall wrote:
 @@ -3520,6 +3545,9 @@ intel_dp_set_signal_levels(struct intel_dp *intel_dp, 
 uint32_t *DP)
   } else if (HAS_DDI(dev)) {
   signal_levels = hsw_signal_levels(train_set);
   mask = DDI_BUF_EMP_MASK;
 +
 + if (IS_SKYLAKE(dev))
 + skl_set_iboost(intel_dp);

Imo this should be put into hsw_signal_levels and then hsw_signal_levels
 be moved into intel_ddi.c - that way everything related to low-level ddi
 DP signal level code is in intel_ddi.c.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread Damien Lespiau
On Thu, Jun 18, 2015 at 03:45:44PM +0100, Antoine, Peter wrote:
 So, initializing the other (non-render) MOCS in gen8_init_rcs_context()
 isn't the most logical thing to do I'm afraid. What happens if we
 suddenly decide that we don't want to fully initialize the default
 context at startup but initialize each ring on-demand for that context
 as well? We can end up in a situation where we use the blitter first
 and we wouldn't have the blitter MOCS initialized.
 
 In that sense, that code makes an assumption about how we do things in
 a completely different part of the driver and that's always a
 potential source of bugs.
 

Yes, but this is the same with the golden context and the workarounds
(as I understand it) so all this code would have to be moved. 

Ah, but the workarounds in that function are only for registers in the
render context, not other rings/engine.

-- 
Damien
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 04:24:53PM +0200, Daniel Vetter wrote:
 On Thu, Jun 18, 2015 at 01:59:13PM +0100, John Harrison wrote:
  On 18/06/2015 13:21, Chris Wilson wrote:
  On Thu, Jun 18, 2015 at 01:14:56PM +0100, john.c.harri...@intel.com wrote:
  From: John Harrison john.c.harri...@intel.com
  
  The plan is to pass requests around as the basic submission tracking 
  structure
  rather than rings and contexts. This patch updates the 
  i915_gem_object_sync()
  code path.
  
  v2: Much more complex patch to share a single request between the sync 
  and the
  page flip. The _sync() function now supports lazy allocation of the 
  request
  structure. That is, if one is passed in then that will be used. If one is 
  not,
  then a request will be allocated and passed back out. Note that the 
  _sync() code
  does not necessarily require a request. Thus one will only be created 
  in
  certain situations. The reason the lazy allocation must be done within the
  _sync() code itself is because the decision to need one or not is not 
  really
  something that code above can second guess (except in the case where one 
  is
  definitely not required because no ring is passed in).
  
  The call chains above _sync() now support passing a request through which 
  most
  callers passing in NULL and assuming that no request will be required 
  (because
  they also pass in NULL for the ring and therefore can't be generating any 
  ring
  code).
  
  The exception is intel_crtc_page_flip() which now supports having a 
  request
  returned from _sync(). If one is, then that request is shared by the page 
  flip
  (if the page flip is of a type to need a request). If _sync() does not 
  generate
  a request but the page flip does need one, then the page flip path will 
  create
  its own request.
  
  v3: Updated comment description to be clearer about 'to_req' parameter 
  (Tomas
  Elf review request). Rebased onto newer tree that significantly changed 
  the
  synchronisation code.
  
  v4: Updated comments from review feedback (Tomas Elf)
  
  For: VIZ-5115
  Signed-off-by: John Harrison john.c.harri...@intel.com
  Reviewed-by: Tomas Elf tomas@intel.com
  ---
drivers/gpu/drm/i915/i915_drv.h|4 ++-
drivers/gpu/drm/i915/i915_gem.c|   48 
   +---
drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +-
drivers/gpu/drm/i915/intel_display.c   |   17 +++---
drivers/gpu/drm/i915/intel_drv.h   |3 +-
drivers/gpu/drm/i915/intel_fbdev.c |2 +-
drivers/gpu/drm/i915/intel_lrc.c   |2 +-
drivers/gpu/drm/i915/intel_overlay.c   |2 +-
8 files changed, 57 insertions(+), 23 deletions(-)
  
  diff --git a/drivers/gpu/drm/i915/i915_drv.h 
  b/drivers/gpu/drm/i915/i915_drv.h
  index 64a10fa..f69e9cb 100644
  --- a/drivers/gpu/drm/i915/i915_drv.h
  +++ b/drivers/gpu/drm/i915/i915_drv.h
  @@ -2778,7 +2778,8 @@ static inline void 
  i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
int i915_gem_object_sync(struct drm_i915_gem_object *obj,
  -  struct intel_engine_cs *to);
  +  struct intel_engine_cs *to,
  +  struct drm_i915_gem_request **to_req);
  Nope. Did you forget to reorder the code to ensure that the request is
  allocated along with the context switch at the start of execbuf?
  -Chris
  
  Not sure what you are objecting to? If you mean the lazily allocated request
  then that is for page flip code not execbuff code. If we get here from an
  execbuff call then the request will definitely have been allocated and will
  be passed in. Whereas the page flip code may or may not require a request
  (depending on whether MMIO or ring flips are in use. Likewise the sync code
  may or may not require a request (depending on whether there is anything to
  sync to or not). There is no point allocating and submitting an empty
  request in the MMIO/idle case. Hence the sync code needs to be able to use
  an existing request or create one if none already exists.
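 
 A sketch of the lazy-allocation idea described here (function and parameter
 names approximate the series rather than quote it):
 
     static int object_sync_sketch(struct drm_i915_gem_object *obj,
                                   struct intel_engine_cs *to,
                                   struct drm_i915_gem_request **to_req)
     {
             int ret;
 
             if (to == NULL)
                     return 0;       /* nothing to sync to, no request needed */
 
             if (*to_req == NULL) {
                     /* caller (e.g. an MMIO flip) brought no request: make one */
                     ret = i915_gem_request_alloc(to, to->default_context, to_req);
                     if (ret)
                             return ret;
             }
 
             /* ... emit the semaphore/CPU wait against *to_req as before ... */
             return 0;
     }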
 
 I guess Chris' comment was that if you have a non-NULL to, then you better
 have a non-NULL to_req. And since we link up requests to the engine
 they'll run on the former shouldn't be required any more. So either that's
 true and we can remove the to or we don't understand something yet (and
 perhaps that should be done as a follow-up).

I am sure I sent a patch that outlined in great detail how that we need
only the request parameter in i915_gem_object_sync(), for handling both
execbuffer, pipelined pin_and_fence and synchronous pin_and_fence.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote:
 Guc is different since we really must have it ready for execbuf, and for
 that usecase a completion at drm_open time sounds like the right thing.

But do we? It would be nice if we had a definite answer that the hw was
ready before we started using it in anger, but I don't see any reason
why we would have to delay userspace for a slow microcode update...

(This presupposes that userspace batches are unaffected by GuC/execlist
setup, which for userspace sanity I hope they are - or at least using
predicate registers and conditional execution.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm: Avoid the double clflush on the last cache line in drm_clflush_virt_range()

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 06:31:18PM +0300, Imre Deak wrote:
 On to, 2015-06-11 at 09:33 +0100, Chris Wilson wrote:
  On Thu, Jun 11, 2015 at 09:25:16AM +0100, Dave Gordon wrote:
   On 10/06/15 15:58, Chris Wilson wrote:
As the clflush operates on cache lines, and we can flush any byte
address, in order to flush all bytes given in the range we issue an
extra clflush on the last byte to ensure the last cacheline is flushed.
We can change the iteration to be over the actual cache lines to avoid this
double clflush on the last byte.
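
(For reference, the resulting loop looks roughly like the following, with the
address rounded down to a cache-line boundary so each line is touched exactly
once:)

    const int size = boot_cpu_data.x86_clflush_size;
    void *end = addr + length;

    addr = (void *)(((unsigned long)addr) & -size);
    mb();
    for (; addr < end; addr += size)
            clflushopt(addr);
    mb();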

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Imre Deak imre.d...@intel.com
---
 drivers/gpu/drm/drm_cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 9a62d7a53553..6743ff7dccfa 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -130,11 +130,12 @@ drm_clflush_virt_range(void *addr, unsigned long 
length)
 {
 #if defined(CONFIG_X86)
if (cpu_has_clflush) {
+   const int size = boot_cpu_data.x86_clflush_size;
void *end = addr + length;
+   addr = (void *)(((unsigned long)addr) & -size);
   
   Should this cast be to uintptr_t?
  
  The kernel has a strict equivalence between sizeof(unsigned long) and
  sizeof(pointer). You will see unsigned long used universally to pass
  along pointers to functions and as closures.
  
   Or intptr_t, as size has somewhat
   strangely been defined as signed? To complete the mix, x86_clflush_size
   is 'u16'! So maybe we should write
   
   + const size_t size = boot_cpu_data.x86_clflush_size;
   + const size_t mask = ~(size - 1);
 void *end = addr + length;
   + addr = (void *)(((uintptr_t)addr) & mask);
  
  No. size_t has very poor definition inside the kernel - what does the
  maximum size of a userspace allocation have to do with kernel internals?
  
  Let's keep userspace types in userspace, or else we end up with
  i915_gem_gtt.c.
 
 I also think using unsigned long for virtual addresses is standard in
 the kernel and I can't see how using int would lead to problems given
 the expected range of x86_clflush_size, so this looks ok to me:
 Reviewed-by: Imre Deak imre.d...@intel.com

Applied to drm-misc, thanks.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Initialize HWS page address after GPU reset

2015-06-18 Thread Siluvery, Arun

On 15/06/2015 06:20, Daniel Vetter wrote:

On Wed, Jun 3, 2015 at 6:14 PM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:

I was going to suggest removing the same thing from the
lrc_setup_hardware_status_page(), but after another look it seems we
sometimes call .init_hw() before the context setup. Would be nice to
have a more consistent sequence for init and reset. But anyway the patch
looks OK to me. I verified that we indeed lose this register on GPU
reset.


Yep, this is a mess. And historically _any_ difference between driver
load and gpu reset (or resume fwiw) has lead to hilarious bugs, so
this difference is really troubling to me. Arun, can you please work
on a patch to unify the setup sequence here, so that both driver load
gpu resets work the same way? By the time we're calling gem_init_hw
the default context should have been created already, and hence we
should be able to write to HWS_PGA in ring->init_hw only.



Hi Daniel,

I think the problem in this case was the code to init HWS page after 
reset was missing for Gen8+. For Gen7 we are doing this as part of 
ring->init_hw.


Gen7:
i915_reset()
+-- i915_gem_init_hw()
+-- ring->init_hw() which is init_render_ring()
+-- init_ring_common()
+ intel_ring_setup_status_page()

Gen8:
i915_reset()
+-- i915_gem_init_hw()
+-- ring->init_hw() which is gen8_init_render_ring()
+ gen8_init_common_ring() - I added changes in this function.

We could probably use intel_ring_setup_status_page() for both cases, 
does it have to be Gen7 specific?



Also I wonder about resume, where's the HWS_PGA restore for that case?

It is covered.

i915_drm_resume()
+--i915_gem_init_hw

regards
Arun


-Daniel


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread Antoine, Peter


-Original Message-
From: Lespiau, Damien 
Sent: Thursday, June 18, 2015 2:51 PM
To: Antoine, Peter
Cc: intel-gfx@lists.freedesktop.org; 
daniel.vetter.intel@irsmsx102.ger.corp.intel.com; ch...@chris-wilson.co.uk; 
matts...@gmail.com
Subject: Re: [PATCH v5] drm/i915 : Added Programming of the MOCS

On Thu, Jun 18, 2015 at 01:29:45PM +0100, Peter Antoine wrote:
 @@ -1379,6 +1380,13 @@ static int gen8_init_rcs_context(struct 
 intel_engine_cs *ring,
   if (ret)
   return ret;
  
 + /*
 +  * Failing to program the MOCS is non-fatal. The system will not
 +  * run at peak performance. So generate a warning and carry on.
 +  */
 + if (intel_rcs_context_init_mocs(ring, ctx) != 0)
 + if (intel_rcs_context_init_mocs(ring, ctx) != 0)
 + DRM_ERROR("MOCS failed to program: expect performance issues.");
 +

Missing a '\n'.

Will fix.

 +static const struct drm_i915_mocs_entry skylake_mocs_table[] = {
 +  /* {0x0009, 0x0010} */
 + {(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC_ELLC) |
 + MOCS_LRUM(0) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
 + MOC_PFM(0) | MOCS_SCF(0)),
 + (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
 +  /* {0x003b, 0x0030} */

We're still missing the usage hints for those configuration entries. That'd help 
user space a lot, which would also make this patch land quicker.

These are boiled down from 250+ requirements from different usecases (opencl, 
Media, etc...), I can't really generate anymore usage hints.

 +int intel_rcs_context_init_mocs(struct intel_engine_cs *ring,
 + struct intel_context *ctx)
 +{
 + int ret = 0;
 +
 + struct drm_i915_mocs_table t;
 + struct drm_device *dev = ring->dev;
 + struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
 +
 + if (get_mocs_settings(dev, &t)) {
 + u32 table_size;
 +
 + /*
 +  * OK. For each supported ring:
 +  *  number of mocs entries * 2 dwords for each control_value
 +  *  plus number of mocs entries /2 dwords for l3cc values.
 +  *
 +  *  Plus 1 for the load command and 1 for the NOOP per ring
 +  *  and the l3cc programming.
 +  */
 + table_size = GEN9_NUM_MOCS_RINGS *
 + ((2 * GEN9_NUM_MOCS_ENTRIES) + 2) +
 + GEN9_NUM_MOCS_ENTRIES + 2;
 + ret = intel_logical_ring_begin(ringbuf, ctx, table_size);
 + if (ret) {
 + DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
 + return ret;
 + }
 +
 + /* program the control registers */
 + emit_mocs_control_table(ringbuf, &t, GEN9_GFX_MOCS_0);
 + emit_mocs_control_table(ringbuf, &t, GEN9_MFX0_MOCS_0);
 + emit_mocs_control_table(ringbuf, &t, GEN9_MFX1_MOCS_0);
 + emit_mocs_control_table(ringbuf, &t, GEN9_VEBOX_MOCS_0);
 + emit_mocs_control_table(ringbuf, &t, GEN9_BLT_MOCS_0);

So, if I'm not mistaken, I think this only works because we fully initialize 
the default context at start/reset time through:

  + i915_gem_init_hw()
+ i915_gem_context_enable()
  + cycle through all the rings and call ring->init_context()
+ gen8_init_rcs_context()
  + intel_rcs_context_init_mocs()
(initalize ALL the MOCS!)

Yes.

So, initializing the other (non-render) MOCS in gen8_init_rcs_context() isn't 
the most logical thing to do I'm afraid. What happens if we suddenly decide 
that we don't want to fully initialize the default context at startup but 
initialize each ring on-demand for that context as well? We can end up in a 
situation where we use the blitter first and we wouldn't have the blitter MOCS 
initialized.

In that sense, that code makes an assumption about how we do things in a 
completely different part of the driver and that's always a potential source of 
bugs.

Yes, but this is the same with the golden context and the workarounds (as I 
understand it) so all this code would have to be moved. 

Chris, how far am I ? :p

One way to solve this (if that's indeed the issue pointed at by Chris) would 
be to decouple the render MOCS from the others, still keep the render ones in 
there as they need to be emitted from the ring but put the other writes (which 
could be done through MMIO as well) higher in the chain, could probably make 
sense in i915_gem_context_enable()?
(which, by the way, is awfully named, should have an _init somewhere?).
It could also be a per-ring vfunc I suppose.

For similar reasons, I think the GuC MOCS should be part of the GuC init as 
well so we don't couple too hard different part of the code.

Now, is that really a blocker? I'd say no if we had userspace ready and could 
commit that today, because we really want it. Still something to look at, I 
could be totally wrong.

Not a blocker. It gets a little 

Re: [Intel-gfx] [PATCH v3 15/19] drm/i915: atomic plane updates in a nutshell

2015-06-18 Thread Ville Syrjälä
On Mon, Jun 15, 2015 at 12:33:52PM +0200, Maarten Lankhorst wrote:
 Now that all planes are added during a modeset we can use the
 calculated changes before disabling a plane, and then either commit
 or force disable a plane before disabling the crtc.
 
 The code is shared with atomic_begin/flush, except watermark updating
 and vblank evasion are not used.
 
 This is needed for proper atomic suspend/resume support.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90868
 Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com
 ---
  drivers/gpu/drm/i915/intel_display.c | 103 
 ---
  drivers/gpu/drm/i915/intel_sprite.c  |   4 +-
  2 files changed, 23 insertions(+), 84 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index cc4ca4970716..beb69281f45c 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -2217,28 +2217,6 @@ static void intel_disable_pipe(struct intel_crtc *crtc)
   intel_wait_for_pipe_off(crtc);
  }
  
 -/**
 - * intel_enable_primary_hw_plane - enable the primary plane on a given pipe
 - * @plane:  plane to be enabled
 - * @crtc: crtc for the plane
 - *
 - * Enable @plane on @crtc, making sure that the pipe is running first.
 - */
 -static void intel_enable_primary_hw_plane(struct drm_plane *plane,
 -   struct drm_crtc *crtc)
 -{
 - struct drm_device *dev = plane->dev;
 - struct drm_i915_private *dev_priv = dev->dev_private;
 - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 -
 - /* If the pipe isn't enabled, we can't pump pixels and may hang */
 - assert_pipe_enabled(dev_priv, intel_crtc->pipe);
 - to_intel_plane_state(plane->state)->visible = true;
 -
 - dev_priv->display.update_primary_plane(crtc, plane->fb,
 -crtc->x, crtc->y);
 -}
 -
  static bool need_vtd_wa(struct drm_device *dev)
  {
  #ifdef CONFIG_INTEL_IOMMU
 @@ -4508,20 +4486,6 @@ static void ironlake_pfit_enable(struct intel_crtc 
 *crtc)
   }
  }
  
 -static void intel_enable_sprite_planes(struct drm_crtc *crtc)
 -{
 - struct drm_device *dev = crtc->dev;
 - enum pipe pipe = to_intel_crtc(crtc)->pipe;
 - struct drm_plane *plane;
 - struct intel_plane *intel_plane;
 -
 - drm_for_each_legacy_plane(plane, dev->mode_config.plane_list) {
 - intel_plane = to_intel_plane(plane);
 - if (intel_plane->pipe == pipe)
 - intel_plane_restore(&intel_plane->base);
 - }
 -}
 -
  void hsw_enable_ips(struct intel_crtc *crtc)
  {
   struct drm_device *dev = crtc->base.dev;
 @@ -4817,27 +4781,6 @@ static void intel_pre_plane_update(struct intel_crtc 
 *crtc)
   intel_pre_disable_primary(&crtc->base);
  }
  
 -static void intel_crtc_enable_planes(struct drm_crtc *crtc)
 -{
 - struct drm_device *dev = crtc->dev;
 - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 - int pipe = intel_crtc->pipe;
 -
 - intel_enable_primary_hw_plane(crtc->primary, crtc);
 - intel_enable_sprite_planes(crtc);
 - if (to_intel_plane_state(crtc->cursor->state)->visible)
 - intel_crtc_update_cursor(crtc, true);
 -
 - intel_post_enable_primary(crtc);
 -
 - /*
 -  * FIXME: Once we grow proper nuclear flip support out of this we need
 -  * to compute the mask of flip planes precisely. For the time being
 -  * consider this a flip to a NULL plane.
 -  */
 - intel_frontbuffer_flip(dev, INTEL_FRONTBUFFER_ALL_MASK(pipe));
 -}
 -
  static void intel_crtc_disable_planes(struct drm_crtc *crtc, unsigned 
 plane_mask)
  {
   struct drm_device *dev = crtc->dev;
 @@ -4845,10 +4788,6 @@ static void intel_crtc_disable_planes(struct drm_crtc 
 *crtc, unsigned plane_mask)
   struct drm_plane *p;
   int pipe = intel_crtc->pipe;
  
 - intel_crtc_wait_for_pending_flips(crtc);
 -
 - intel_pre_disable_primary(crtc);
 -
   intel_crtc_dpms_overlay_disable(intel_crtc);
  
   drm_for_each_plane_mask(p, dev, plane_mask)
 @@ -6270,6 +6209,11 @@ static void intel_crtc_disable_noatomic(struct 
 drm_crtc *crtc)
   if (!intel_crtc->active)
   return;
  
 + if (to_intel_plane_state(crtc->primary->state)->visible) {
 + intel_crtc_wait_for_pending_flips(crtc);
 + intel_pre_disable_primary(crtc);
 + }
 +
   intel_crtc_disable_planes(crtc, crtc->state->plane_mask);
   dev_priv->display.crtc_disable(crtc);
  
 @@ -11783,10 +11727,6 @@ int intel_plane_atomic_calc_changes(struct 
 drm_crtc_state *crtc_state,
   if (old_plane_state->base.fb && !fb)
   intel_crtc->atomic.disabled_planes |= 1 << i;
  
 - /* don't run rest during modeset yet */
 - if (!intel_crtc->active || mode_changed)
 - return 0;
 -
   was_visible = old_plane_state->visible;
   visible = to_intel_plane_state(plane_state)->visible;
  
 @@ 

Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 04:27:52PM +0100, Chris Wilson wrote:
 On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote:
  Guc is different since we really must have it ready for execbuf, and for
  that usecase a completion at drm_open time sounds like the right thing.
 
 But do we? It would be nice if we had a definite answer that the hw was
 ready before we started using it in anger, but I don't see any reason
 why we would have to delay userspace for a slow microcode update...
 
 (This presupposes that userspace batches are unaffected by GuC/execlist
 setup, which for userspace sanity I hope they are - or at least using
 predicate registers and conditional execution.)

Well I figured a wait_completion or flush_work unconditionally in execbuf
is not to your liking, and it's better to keep that in open. But I think
we should be able to get away with this at execbuf time. Might even be
better since this wouldn't block sw-rendered boot-splashs.

But either way should be suitable I think.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm: Avoid the double clflush on the last cache line in drm_clflush_virt_range()

2015-06-18 Thread Imre Deak
On to, 2015-06-11 at 09:33 +0100, Chris Wilson wrote:
 On Thu, Jun 11, 2015 at 09:25:16AM +0100, Dave Gordon wrote:
  On 10/06/15 15:58, Chris Wilson wrote:
   As the clflush operates on cache lines, and we can flush any byte
   address, in order to flush all bytes given in the range we issue an
   extra clflush on the last byte to ensure the last cacheline is flushed.
   We can change the iteration to be over the actual cache lines to avoid this
   double clflush on the last byte.
   
   Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
   Cc: Imre Deak imre.d...@intel.com
   ---
drivers/gpu/drm/drm_cache.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
   
   diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
   index 9a62d7a53553..6743ff7dccfa 100644
   --- a/drivers/gpu/drm/drm_cache.c
   +++ b/drivers/gpu/drm/drm_cache.c
   @@ -130,11 +130,12 @@ drm_clflush_virt_range(void *addr, unsigned long 
   length)
{
#if defined(CONFIG_X86)
 if (cpu_has_clflush) {
   + const int size = boot_cpu_data.x86_clflush_size;
 void *end = addr + length;
   + addr = (void *)(((unsigned long)addr) & -size);
  
  Should this cast be to uintptr_t?
 
 The kernel has a strict equivalence between sizeof(unsigned long) and
 sizeof(pointer). You will see unsigned long used universally to pass
 along pointers to functions and as closures.
 
  Or intptr_t, as size has somewhat
  strangely been defined as signed? To complete the mix, x86_clflush_size
  is 'u16'! So maybe we should write
  
  +   const size_t size = boot_cpu_data.x86_clflush_size;
  +   const size_t mask = ~(size - 1);
  void *end = addr + length;
  +   addr = (void *)(((uintptr_t)addr) & mask);
 
 No. size_t has very poor definition inside the kernel - what does the
 maximum size of a userspace allocation have to do with kernel internals?
 
 Let's keep userspace types in userspace, or else we end up with
 i915_gem_gtt.c.

I also think using unsigned long for virtual addresses is standard in
the kernel and I can't see how using int would lead to problems given
the expected range of x86_clflush_size, so this looks ok to me:
Reviewed-by: Imre Deak imre.d...@intel.com

 -Chris
 


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 12:49:55PM +0100, Dave Gordon wrote:
 On 17/06/15 13:02, Daniel Vetter wrote:
  On Wed, Jun 17, 2015 at 08:23:40AM +0100, Dave Gordon wrote:
  On 15/06/15 21:09, Chris Wilson wrote:
  On Mon, Jun 15, 2015 at 07:36:19PM +0100, Dave Gordon wrote:
  From: Alex Dai yu@intel.com
 
  i915_gem_object_write() is a generic function to copy data from a plain
  linear buffer to a paged gem object.
 
  We will need this for the microcontroller firmware loading support code.
 
  Issue: VIZ-4884
  Signed-off-by: Alex Dai yu@intel.com
  Signed-off-by: Dave Gordon david.s.gor...@intel.com
  ---
   drivers/gpu/drm/i915/i915_drv.h |2 ++
   drivers/gpu/drm/i915/i915_gem.c |   28 
   2 files changed, 30 insertions(+)
 
  diff --git a/drivers/gpu/drm/i915/i915_drv.h 
  b/drivers/gpu/drm/i915/i915_drv.h
  index 611fbd8..9094c06 100644
  --- a/drivers/gpu/drm/i915/i915_drv.h
  +++ b/drivers/gpu/drm/i915/i915_drv.h
  @@ -2713,6 +2713,8 @@ void *i915_gem_object_alloc(struct drm_device 
  *dev);
   void i915_gem_object_free(struct drm_i915_gem_object *obj);
   void i915_gem_object_init(struct drm_i915_gem_object *obj,
const struct drm_i915_gem_object_ops *ops);
  +int i915_gem_object_write(struct drm_i915_gem_object *obj,
  +  const void *data, size_t size);
   struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device 
  *dev,
 size_t size);
   void i915_init_vm(struct drm_i915_private *dev_priv,
  diff --git a/drivers/gpu/drm/i915/i915_gem.c 
  b/drivers/gpu/drm/i915/i915_gem.c
  index be35f04..75d63c2 100644
  --- a/drivers/gpu/drm/i915/i915_gem.c
  +++ b/drivers/gpu/drm/i915/i915_gem.c
  @@ -5392,3 +5392,31 @@ bool i915_gem_obj_is_pinned(struct 
  drm_i915_gem_object *obj)
   return false;
   }
   
  +/* Fill the @obj with the @size amount of @data */
  +int i915_gem_object_write(struct drm_i915_gem_object *obj,
  +const void *data, size_t size)
  +{
  +struct sg_table *sg;
  +size_t bytes;
  +int ret;
  +
  +ret = i915_gem_object_get_pages(obj);
  +if (ret)
  +return ret;
  +
  +i915_gem_object_pin_pages(obj);
 
  You don't set the object into the CPU domain, or instead manually handle
  the domain flushing. You don't handle objects that cannot be written
  directly by the CPU, nor do you handle objects whose representation in
  memory is not linear.
  -Chris
 
  No we don't handle just any random gem object, but we do return an error
  code for any types not supported. However, as we don't really need the
  full generality of writing into a gem object of any type, I will replace
  this function with one that combines the allocation of a new object
  (which will therefore definitely be of the correct type, in the correct
  domain, etc) and filling it with the data to be preserved.
 
 The usage pattern for the particular case is going to be:
   Once-only:
   Allocate
   Fill
   Then each time GuC is (re-)initialised:
   Map to GTT
   DMA-read from buffer into GuC private memory
   Unmap
   Only on unload:
   Dispose
 
 So our object is write-once by the CPU (and that's always the first
 operation), thereafter read-occasionally by the GuC's DMA engine.
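 
 Roughly, in code (error handling omitted; the pin/unpin helpers named here
 are my assumption about what the final version would use):
 
     /* Once-only: allocate a GEM object and fill it from the linear image */
     obj = i915_gem_alloc_object(dev, round_up(fw_size, PAGE_SIZE));
     ret = i915_gem_object_write(obj, fw_data, fw_size);
 
     /* On each (re-)initialisation: make it visible to the GuC DMA engine */
     ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, 0);
     /* ... point the GuC DMA source at i915_gem_obj_ggtt_offset(obj) ... */
     i915_gem_object_ggtt_unpin(obj);
 
     /* Only on unload */
     drm_gem_object_unreference(&obj->base);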

Yup. The problem is more that on atom platforms the objects aren't
coherent by default and generally you need to do something. Hence we
either have
- an explicit set_caching call to document that this is a gpu object which
  is always coherent (so also on chv/bxt), even when that's a no-op on big
  core
- or wrap everything in set_domain calls, even when those are no-ops too.

If either of those lack, reviews tend to freak out preemptively and the
reptile brain takes over ;-)
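
Something like the following, for the set_domain flavour (illustrative only,
using the existing helpers):

    /* Make the object CPU-coherent before filling it ... */
    ret = i915_gem_object_set_to_cpu_domain(obj, true);
    if (ret)
            return ret;

    /* ... CPU writes the firmware image into the object's pages here ... */

    /* ... then hand it back to the GPU side before the GuC DMA reads it. */
    ret = i915_gem_object_set_to_gtt_domain(obj, false);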

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 18/19] drm/i915: Remove transitional references from intel_plane_atomic_check.

2015-06-18 Thread Matt Roper
On Mon, Jun 15, 2015 at 12:33:55PM +0200, Maarten Lankhorst wrote:
 All transitional plane helpers are gone, party!
 
 Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com

There's also a reference in skylake_update_primary_plane() that I assume
can be removed?


Matt


 ---
  drivers/gpu/drm/i915/intel_atomic_plane.c | 19 ++-
  1 file changed, 6 insertions(+), 13 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c 
 b/drivers/gpu/drm/i915/intel_atomic_plane.c
 index 10a8ecedc942..f1ab8e4b9c11 100644
 --- a/drivers/gpu/drm/i915/intel_atomic_plane.c
 +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
 @@ -115,6 +115,7 @@ static int intel_plane_atomic_check(struct drm_plane 
 *plane,
   struct intel_crtc_state *crtc_state;
   struct intel_plane *intel_plane = to_intel_plane(plane);
   struct intel_plane_state *intel_state = to_intel_plane_state(state);
 + struct drm_crtc_state *drm_crtc_state;
   int ret;
  
   crtc = crtc ? crtc : plane-state-crtc;
 @@ -129,19 +130,11 @@ static int intel_plane_atomic_check(struct drm_plane 
 *plane,
   if (!crtc)
   return 0;
  
 - /* FIXME: temporary hack necessary while we still use the plane update
 -  * helper. */
 - if (state-state) {
 - struct drm_crtc_state *drm_crtc_state =
 - drm_atomic_get_existing_crtc_state(state-state, crtc);
 + drm_crtc_state = drm_atomic_get_existing_crtc_state(state-state, crtc);
 + if (WARN_ON(!drm_crtc_state))
 + return -EINVAL;
  
 - if (WARN_ON(!drm_crtc_state))
 - return -EINVAL;
 -
 - crtc_state = to_intel_crtc_state(drm_crtc_state);
 - } else {
 - crtc_state = intel_crtc-config;
 - }
 + crtc_state = to_intel_crtc_state(drm_crtc_state);
  
   /*
* The original src/dest coordinates are stored in state-base, but
 @@ -191,7 +184,7 @@ static int intel_plane_atomic_check(struct drm_plane 
 *plane,
  
   intel_state-visible = false;
   ret = intel_plane-check_plane(plane, crtc_state, intel_state);
 - if (ret || !state-state)
 + if (ret)
   return ret;
  
   return intel_plane_atomic_calc_changes(crtc_state-base, state);
 -- 
 2.1.0
 
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling  Development
Intel Corporation
(916) 356-2795
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 01:59:13PM +0100, John Harrison wrote:
 On 18/06/2015 13:21, Chris Wilson wrote:
 On Thu, Jun 18, 2015 at 01:14:56PM +0100, john.c.harri...@intel.com wrote:
 From: John Harrison john.c.harri...@intel.com
 
 The plan is to pass requests around as the basic submission tracking 
 structure
 rather than rings and contexts. This patch updates the 
 i915_gem_object_sync()
 code path.
 
 v2: Much more complex patch to share a single request between the sync and 
 the
 page flip. The _sync() function now supports lazy allocation of the request
 structure. That is, if one is passed in then that will be used. If one is 
 not,
 then a request will be allocated and passed back out. Note that the _sync() 
 code
 does not necessarily require a request. Thus one will only be created in
 certain situations. The reason the lazy allocation must be done within the
 _sync() code itself is because the decision to need one or not is not really
 something that code above can second guess (except in the case where one is
 definitely not required because no ring is passed in).
 
 The call chains above _sync() now support passing a request through which 
 most
 callers passing in NULL and assuming that no request will be required 
 (because
 they also pass in NULL for the ring and therefore can't be generating any 
 ring
 code).
 
 The exception is intel_crtc_page_flip() which now supports having a request
 returned from _sync(). If one is, then that request is shared by the page 
 flip
 (if the page flip is of a type to need a request). If _sync() does not 
 generate
 a request but the page flip does need one, then the page flip path will 
 create
 its own request.
 
 v3: Updated comment description to be clearer about 'to_req' parameter 
 (Tomas
 Elf review request). Rebased onto newer tree that significantly changed the
 synchronisation code.
 
 v4: Updated comments from review feedback (Tomas Elf)
 
 For: VIZ-5115
 Signed-off-by: John Harrison john.c.harri...@intel.com
 Reviewed-by: Tomas Elf tomas@intel.com
 ---
   drivers/gpu/drm/i915/i915_drv.h|4 ++-
   drivers/gpu/drm/i915/i915_gem.c|   48 
  +---
   drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +-
   drivers/gpu/drm/i915/intel_display.c   |   17 +++---
   drivers/gpu/drm/i915/intel_drv.h   |3 +-
   drivers/gpu/drm/i915/intel_fbdev.c |2 +-
   drivers/gpu/drm/i915/intel_lrc.c   |2 +-
   drivers/gpu/drm/i915/intel_overlay.c   |2 +-
   8 files changed, 57 insertions(+), 23 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_drv.h 
 b/drivers/gpu/drm/i915/i915_drv.h
 index 64a10fa..f69e9cb 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct 
 drm_i915_gem_object *obj)
   int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
   int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 -struct intel_engine_cs *to);
 +struct intel_engine_cs *to,
 +struct drm_i915_gem_request **to_req);
 Nope. Did you forget to reorder the code to ensure that the request is
 allocated along with the context switch at the start of execbuf?
 -Chris
 
 Not sure what you are objecting to? If you mean the lazily allocated request
 then that is for page flip code not execbuff code. If we get here from an
 execbuff call then the request will definitely have been allocated and will
 be passed in. Whereas the page flip code may or may not require a request
 (depending on whether MMIO or ring flips are in use). Likewise the sync code
 may or may not require a request (depending on whether there is anything to
 sync to or not). There is no point allocating and submitting an empty
 request in the MMIO/idle case. Hence the sync code needs to be able to use
 an existing request or create one if none already exists.

I guess Chris' comment was that if you have a non-NULL to, then you better
have a non-NULL to_req. And since we link up requests to the engine
they'll run on, the former shouldn't be required any more. So either that's
true and we can remove the to, or we don't understand something yet (and
perhaps that should be done as a follow-up).
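
To make the intended calling convention concrete, here is a rough sketch of
the page-flip usage described in the commit message. Local names and control
flow are illustrative only, and use_mmio_flip() is assumed to be the existing
helper in intel_display.c:

	struct drm_i915_gem_request *request = NULL;
	int ret;

	/* A no-op if obj is idle: no request is created. Otherwise _sync()
	 * allocates a request and returns it via &request. */
	ret = i915_gem_object_sync(obj, ring, &request);
	if (ret)
		return ret;

	if (!use_mmio_flip(ring, obj) && request == NULL) {
		/* Ring flip, but _sync() had nothing to do: the flip path
		 * allocates its own request at this point. */
	}
	/* Otherwise the request returned by _sync(), if any, is shared by
	 * the page flip. */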
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 15/19] drm/i915: atomic plane updates in a nutshell

2015-06-18 Thread Matt Roper
On Mon, Jun 15, 2015 at 12:33:52PM +0200, Maarten Lankhorst wrote:
 Now that all planes are added during a modeset we can use the
 calculated changes before disabling a plane, and then either commit
 or force disable a plane before disabling the crtc.
 
 The code is shared with atomic_begin/flush, except watermark updating
 and vblank evasion are not used.
 
 This is needed for proper atomic suspend/resume support.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90868
 Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com
 ---
  drivers/gpu/drm/i915/intel_display.c | 103 
 ---
  drivers/gpu/drm/i915/intel_sprite.c  |   4 +-
  2 files changed, 23 insertions(+), 84 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index cc4ca4970716..beb69281f45c 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -2217,28 +2217,6 @@ static void intel_disable_pipe(struct intel_crtc *crtc)
   intel_wait_for_pipe_off(crtc);
  }
  
 -/**
 - * intel_enable_primary_hw_plane - enable the primary plane on a given pipe
 - * @plane:  plane to be enabled
 - * @crtc: crtc for the plane
 - *
 - * Enable @plane on @crtc, making sure that the pipe is running first.
 - */
 -static void intel_enable_primary_hw_plane(struct drm_plane *plane,
 -   struct drm_crtc *crtc)
 -{
 - struct drm_device *dev = plane-dev;
 - struct drm_i915_private *dev_priv = dev-dev_private;
 - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 -
 - /* If the pipe isn't enabled, we can't pump pixels and may hang */
 - assert_pipe_enabled(dev_priv, intel_crtc-pipe);
 - to_intel_plane_state(plane-state)-visible = true;
 -
 - dev_priv-display.update_primary_plane(crtc, plane-fb,
 -crtc-x, crtc-y);
 -}
 -
  static bool need_vtd_wa(struct drm_device *dev)
  {
  #ifdef CONFIG_INTEL_IOMMU
 @@ -4508,20 +4486,6 @@ static void ironlake_pfit_enable(struct intel_crtc 
 *crtc)
   }
  }
  
 -static void intel_enable_sprite_planes(struct drm_crtc *crtc)
 -{
 - struct drm_device *dev = crtc-dev;
 - enum pipe pipe = to_intel_crtc(crtc)-pipe;
 - struct drm_plane *plane;
 - struct intel_plane *intel_plane;
 -
 - drm_for_each_legacy_plane(plane, dev-mode_config.plane_list) {
 - intel_plane = to_intel_plane(plane);
 - if (intel_plane-pipe == pipe)
 - intel_plane_restore(intel_plane-base);
 - }
 -}
 -
  void hsw_enable_ips(struct intel_crtc *crtc)
  {
   struct drm_device *dev = crtc-base.dev;
 @@ -4817,27 +4781,6 @@ static void intel_pre_plane_update(struct intel_crtc 
 *crtc)
   intel_pre_disable_primary(crtc-base);
  }
  
 -static void intel_crtc_enable_planes(struct drm_crtc *crtc)
 -{
 - struct drm_device *dev = crtc-dev;
 - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 - int pipe = intel_crtc-pipe;
 -
 - intel_enable_primary_hw_plane(crtc-primary, crtc);
 - intel_enable_sprite_planes(crtc);
 - if (to_intel_plane_state(crtc-cursor-state)-visible)
 - intel_crtc_update_cursor(crtc, true);
 -
 - intel_post_enable_primary(crtc);
 -
 - /*
 -  * FIXME: Once we grow proper nuclear flip support out of this we need
 -  * to compute the mask of flip planes precisely. For the time being
 -  * consider this a flip to a NULL plane.
 -  */
 - intel_frontbuffer_flip(dev, INTEL_FRONTBUFFER_ALL_MASK(pipe));
 -}
 -
  static void intel_crtc_disable_planes(struct drm_crtc *crtc, unsigned 
 plane_mask)
  {
   struct drm_device *dev = crtc-dev;
 @@ -4845,10 +4788,6 @@ static void intel_crtc_disable_planes(struct drm_crtc 
 *crtc, unsigned plane_mask
   struct drm_plane *p;
   int pipe = intel_crtc-pipe;
  
 - intel_crtc_wait_for_pending_flips(crtc);
 -
 - intel_pre_disable_primary(crtc);
 -
   intel_crtc_dpms_overlay_disable(intel_crtc);
  
   drm_for_each_plane_mask(p, dev, plane_mask)
 @@ -6270,6 +6209,11 @@ static void intel_crtc_disable_noatomic(struct 
 drm_crtc *crtc)
   if (!intel_crtc-active)
   return;
  
 + if (to_intel_plane_state(crtc-primary-state)-visible) {
 + intel_crtc_wait_for_pending_flips(crtc);
 + intel_pre_disable_primary(crtc);
 + }
 +
   intel_crtc_disable_planes(crtc, crtc-state-plane_mask);
   dev_priv-display.crtc_disable(crtc);
  
 @@ -11783,10 +11727,6 @@ int intel_plane_atomic_calc_changes(struct 
 drm_crtc_state *crtc_state,
  	if (old_plane_state->base.fb && !fb)
  		intel_crtc->atomic.disabled_planes |= 1 << i;
  
 - /* don't run rest during modeset yet */
 - if (!intel_crtc-active || mode_changed)
 - return 0;
 -
   was_visible = old_plane_state-visible;
   visible = to_intel_plane_state(plane_state)-visible;
  
 @@ 

Re: [Intel-gfx] [PATCH v3 17/19] drm/i915: Make setting color key atomic.

2015-06-18 Thread Matt Roper
On Mon, Jun 15, 2015 at 12:33:54PM +0200, Maarten Lankhorst wrote:
 By making color key atomic there are no more transitional helpers.
 The plane check function will reject the color key when a scaler is
 active.
 
 Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com
 ---
  drivers/gpu/drm/i915/intel_atomic_plane.c |  1 +
  drivers/gpu/drm/i915/intel_display.c  |  7 ++-
  drivers/gpu/drm/i915/intel_drv.h  |  6 +--
  drivers/gpu/drm/i915/intel_sprite.c   | 85 
 +++
  4 files changed, 46 insertions(+), 53 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c 
 b/drivers/gpu/drm/i915/intel_atomic_plane.c
 index 91d53768df9d..10a8ecedc942 100644
 --- a/drivers/gpu/drm/i915/intel_atomic_plane.c
 +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
 @@ -56,6 +56,7 @@ intel_create_plane_state(struct drm_plane *plane)
  
   state-base.plane = plane;
   state-base.rotation = BIT(DRM_ROTATE_0);
 + state-ckey.flags = I915_SET_COLORKEY_NONE;
  
   return state;
  }
 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index 5facd0501a34..746c73d2ab84 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -4401,9 +4401,9 @@ static int skl_update_scaler_plane(struct 
 intel_crtc_state *crtc_state,
   return ret;
  
   /* check colorkey */
 - if (WARN_ON(intel_plane-ckey.flags != I915_SET_COLORKEY_NONE)) {
 + if (plane_state-ckey.flags != I915_SET_COLORKEY_NONE) {
  	DRM_DEBUG_KMS("[PLANE:%d] scaling with color key not allowed",
  -		intel_plane->base.base.id);
  +		      intel_plane->base.base.id);
   return -EINVAL;
   }
  
 @@ -13733,7 +13733,7 @@ intel_check_primary_plane(struct drm_plane *plane,
  
   /* use scaler when colorkey is not required */
  	if (INTEL_INFO(plane->dev)->gen >= 9 &&
  -	    to_intel_plane(plane)->ckey.flags == I915_SET_COLORKEY_NONE) {
  +	    state->ckey.flags == I915_SET_COLORKEY_NONE) {
   min_scale = 1;
   max_scale = skl_max_scale(to_intel_crtc(crtc), crtc_state);
   can_position = true;
 @@ -13881,7 +13881,6 @@ static struct drm_plane 
 *intel_primary_plane_create(struct drm_device *dev,
   primary-check_plane = intel_check_primary_plane;
   primary-commit_plane = intel_commit_primary_plane;
   primary-disable_plane = intel_disable_primary_plane;
 - primary-ckey.flags = I915_SET_COLORKEY_NONE;
  	if (HAS_FBC(dev) && INTEL_INFO(dev)->gen < 4)
   primary-plane = !pipe;
  
 diff --git a/drivers/gpu/drm/i915/intel_drv.h 
 b/drivers/gpu/drm/i915/intel_drv.h
 index 93b9542ab8dc..3a2ac82b0970 100644
 --- a/drivers/gpu/drm/i915/intel_drv.h
 +++ b/drivers/gpu/drm/i915/intel_drv.h
 @@ -274,6 +274,8 @@ struct intel_plane_state {
* update_scaler_plane.
*/
   int scaler_id;
 +
 + struct drm_intel_sprite_colorkey ckey;
  };
  
  struct intel_initial_plane_config {
 @@ -588,9 +590,6 @@ struct intel_plane {
   bool can_scale;
   int max_downscale;
  
 - /* FIXME convert to properties */
 - struct drm_intel_sprite_colorkey ckey;
 -
   /* Since we need to change the watermarks before/after
* enabling/disabling the planes, we need to store the parameters here
* as the other pieces of the struct may not reflect the values we want
 @@ -1390,7 +1389,6 @@ bool intel_sdvo_init(struct drm_device *dev, uint32_t 
 sdvo_reg, bool is_sdvob);
  
  /* intel_sprite.c */
  int intel_plane_init(struct drm_device *dev, enum pipe pipe, int plane);
 -int intel_plane_restore(struct drm_plane *plane);
  int intel_sprite_set_colorkey(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
  bool intel_pipe_update_start(struct intel_crtc *crtc,
 diff --git a/drivers/gpu/drm/i915/intel_sprite.c 
 b/drivers/gpu/drm/i915/intel_sprite.c
 index 168f90f346c2..21d3f7882c4d 100644
 --- a/drivers/gpu/drm/i915/intel_sprite.c
 +++ b/drivers/gpu/drm/i915/intel_sprite.c
 @@ -182,7 +182,8 @@ skl_update_plane(struct drm_plane *drm_plane, struct 
 drm_crtc *crtc,
   const int plane = intel_plane-plane + 1;
   u32 plane_ctl, stride_div, stride;
   int pixel_size = drm_format_plane_cpp(fb-pixel_format, 0);
 - const struct drm_intel_sprite_colorkey *key = intel_plane-ckey;
 + const struct drm_intel_sprite_colorkey *key =
 + to_intel_plane_state(drm_plane-state)-ckey;
   unsigned long surf_addr;
   u32 tile_height, plane_offset, plane_size;
   unsigned int rotation;
 @@ -344,7 +345,8 @@ vlv_update_plane(struct drm_plane *dplane, struct 
 drm_crtc *crtc,
   u32 sprctl;
   unsigned long sprsurf_offset, linear_offset;
   int pixel_size = drm_format_plane_cpp(fb-pixel_format, 0);
 - const struct drm_intel_sprite_colorkey *key = intel_plane-ckey;
 + const struct 

Re: [Intel-gfx] [PATCH 25/55] drm/i915: Update i915_gem_object_sync() to take a request structure

2015-06-18 Thread John Harrison

On 18/06/2015 16:39, Chris Wilson wrote:

On Thu, Jun 18, 2015 at 04:24:53PM +0200, Daniel Vetter wrote:

On Thu, Jun 18, 2015 at 01:59:13PM +0100, John Harrison wrote:

On 18/06/2015 13:21, Chris Wilson wrote:

On Thu, Jun 18, 2015 at 01:14:56PM +0100, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the i915_gem_object_sync()
code path.

v2: Much more complex patch to share a single request between the sync and the
page flip. The _sync() function now supports lazy allocation of the request
structure. That is, if one is passed in then that will be used. If one is not,
then a request will be allocated and passed back out. Note that the _sync() code
does not necessarily require a request. Thus one will only be created in
certain situations. The reason the lazy allocation must be done within the
_sync() code itself is because the decision to need one or not is not really
something that code above can second guess (except in the case where one is
definitely not required because no ring is passed in).

The call chains above _sync() now support passing a request through which most
callers passing in NULL and assuming that no request will be required (because
they also pass in NULL for the ring and therefore can't be generating any ring
code).

The exception is intel_crtc_page_flip() which now supports having a request
returned from _sync(). If one is, then that request is shared by the page flip
(if the page flip is of a type to need a request). If _sync() does not generate
a request but the page flip does need one, then the page flip path will create
its own request.

v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
Elf review request). Rebased onto newer tree that significantly changed the
synchronisation code.

v4: Updated comments from review feedback (Tomas Elf)

For: VIZ-5115
Signed-off-by: John Harrison john.c.harri...@intel.com
Reviewed-by: Tomas Elf tomas@intel.com
---
  drivers/gpu/drm/i915/i915_drv.h|4 ++-
  drivers/gpu/drm/i915/i915_gem.c|   48 +---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +-
  drivers/gpu/drm/i915/intel_display.c   |   17 +++---
  drivers/gpu/drm/i915/intel_drv.h   |3 +-
  drivers/gpu/drm/i915/intel_fbdev.c |2 +-
  drivers/gpu/drm/i915/intel_lrc.c   |2 +-
  drivers/gpu/drm/i915/intel_overlay.c   |2 +-
  8 files changed, 57 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 64a10fa..f69e9cb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2778,7 +2778,8 @@ static inline void i915_gem_object_unpin_pages(struct 
drm_i915_gem_object *obj)
  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
-struct intel_engine_cs *to);
+struct intel_engine_cs *to,
+struct drm_i915_gem_request **to_req);

Nope. Did you forget to reorder the code to ensure that the request is
allocated along with the context switch at the start of execbuf?
-Chris


Not sure what you are objecting to? If you mean the lazily allocated request
then that is for page flip code not execbuff code. If we get here from an
execbuff call then the request will definitely have been allocated and will
be passed in. Whereas the page flip code may or may not require a request
(depending on whether MMIO or ring flips are in use. Likewise the sync code
may or may not require a request (depending on whether there is anything to
sync to or not). There is no point allocating and submitting an empty
request in the MMIO/idle case. Hence the sync code needs to be able to use
an existing request or create one if none already exists.

I guess Chris' comment was that if you have a non-NULL to, then you better
have a non-NULL to_req. And since we link up requests to the engine
they'll run on, the former shouldn't be required any more. So either that's
true and we can remove the to, or we don't understand something yet (and
perhaps that should be done as a follow-up).

I am sure I sent a patch that outlined in great detail how we need
only the request parameter in i915_gem_object_sync(), for handling both
execbuffer, pipelined pin_and_fence and synchronous pin_and_fence.
-Chris



As the driver stands, the page flip code wants to synchronise with the 
framebuffer object but potentially without touching the ring and 
therefore without creating a request. If the synchronisation is a no-op 
(because there are no outstanding operations on the given object) then 
there is no need for a request anywhere in the call chain. Thus there is 
a need to pass in the ring together with an optional 

[Intel-gfx] [PATCH] drm/i915: Add the ddi get cdclk code for BXT (v2)

2015-06-18 Thread Matt Roper
From: Bob Paauwe bob.j.paa...@intel.com

The registers and process differ from other platforms.

v2(Matt): Return 19.2 MHz when DE PLL is disabled (Ville)

Cc: Ville Syrjälä ville.syrj...@linux.intel.com
Cc: Imre Deak imre.d...@intel.com
Signed-off-by: Bob Paauwe bob.j.paa...@intel.com
Signed-off-by: Matt Roper matthew.d.ro...@intel.com
---
 drivers/gpu/drm/i915/intel_display.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 3ee7dbc..294c4e4 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6689,6 +6689,34 @@ static int skylake_get_display_clock_speed(struct 
drm_device *dev)
return 24000;
 }
 
+static int broxton_get_display_clock_speed(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   uint32_t cdctl = I915_READ(CDCLK_CTL);
+	uint32_t pll_freq = I915_READ(BXT_DE_PLL_CTL) & BXT_DE_PLL_RATIO_MASK;
+	uint32_t pll_enab = I915_READ(BXT_DE_PLL_ENABLE);
+
+	if (!(pll_enab & BXT_DE_PLL_PLL_ENABLE))
+		return 19200;
+
+	switch (cdctl & BXT_CDCLK_CD2X_DIV_SEL_MASK) {
+   case BXT_CDCLK_CD2X_DIV_SEL_1:
+   if (pll_freq == BXT_DE_PLL_RATIO(60)) /* PLL freq = 1152MHz */
+   return 576000;
+   else /* PLL freq = 1248MHz */
+   return 624000;
+   case BXT_CDCLK_CD2X_DIV_SEL_1_5:
+   return 384000;
+   case BXT_CDCLK_CD2X_DIV_SEL_2:
+   return 288000;
+   case BXT_CDCLK_CD2X_DIV_SEL_4:
+   return 144000;
+   }
+
+	/* error case, assume higher PLL freq. */
+   return 624000;
+}
+
 static int broadwell_get_display_clock_speed(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = dev-dev_private;
@@ -14715,6 +14743,9 @@ static void intel_init_display(struct drm_device *dev)
if (IS_SKYLAKE(dev))
dev_priv-display.get_display_clock_speed =
skylake_get_display_clock_speed;
+   else if (IS_BROXTON(dev))
+   dev_priv-display.get_display_clock_speed =
+   broxton_get_display_clock_speed;
else if (IS_BROADWELL(dev))
dev_priv-display.get_display_clock_speed =
broadwell_get_display_clock_speed;
-- 
1.8.5.1
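
For reference, the frequencies returned by broxton_get_display_clock_speed()
above are consistent with cdclk = DE PLL frequency / (2 * CD2X divider); that
relationship, and the 1152 MHz PLL assumed for the non-1 dividers, are stated
here for illustration rather than taken from the patch:

/*
 * Illustrative arithmetic only (assumes cdclk = PLL freq / (2 * CD2X div)):
 *   DIV_SEL_1:   1152 MHz / 2 = 576000 kHz,  1248 MHz / 2 = 624000 kHz
 *   DIV_SEL_1_5: 1152 MHz / 3 = 384000 kHz
 *   DIV_SEL_2:   1152 MHz / 4 = 288000 kHz
 *   DIV_SEL_4:   1152 MHz / 8 = 144000 kHz
 */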

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 05:35:29PM +0200, Daniel Vetter wrote:
 On Thu, Jun 18, 2015 at 04:27:52PM +0100, Chris Wilson wrote:
  On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote:
   Guc is different since we really must have it ready for execbuf, and for
   that usecase a completion at drm_open time sounds like the right thing.
  
  But do we? It would be nice if we had a definite answer that the hw was
  ready before we started using it in anger, but I don't see any reason
  why we would have to delay userspace for a slow microcode update...
  
  (This presupposes that userspace batches are unaffected by GuC/execlist
  setup, which for userspace sanity I hope they are - or at least using
  predicate registers and conditional execution.)
 
 Well I figured a wait_completion or flush_work unconditionally in execbuf
 is not to your liking, and it's better to keep that in open. But I think
 we should be able to get away with this at execbuf time. Might even be
 better since this wouldn't block sw-rendered boot splashes.
 
 But either way should be suitable I think.

I am optimistic that we can make the request interface robust enough to be
able to queue up not only the ring initialisation and ppgtt initialisation
requests, but also userspace requests. If it all works out, we only need
to truly worry about microcode completion in hangcheck.
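
As a minimal sketch of the execbuf-time gate being discussed (everything named
here is hypothetical: guc_fw_load_done is an invented completion that the
firmware loader would signal once the microcode DMA finishes, and returning
-EIO is only one of the options mentioned):

	/* Hypothetical gate at the start of execbuf */
	if (!wait_for_completion_timeout(&dev_priv->guc_fw_load_done,
					 msecs_to_jiffies(1000)))
		return -EIO;	/* or defer the worry to hangcheck, as above */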
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers

2015-06-18 Thread Chris Wilson
I'm pretty happy with the code; I was just confused by the series
changing the setup halfway through.

On Thu, Jun 18, 2015 at 02:07:30PM +0100, Arun Siluvery wrote:
 +static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring,
 + uint32_t **wa_ctx_batch,
 + uint32_t offset,
 + uint32_t *num_dwords)
 +{
 + uint32_t index;
 + uint32_t *batch = *wa_ctx_batch;
 +
 + index = offset;
 +
 + /* FIXME: fill one cacheline with NOOPs.
 +  * Replace these instructions with WA
 +  */
 +	while (index < (offset + 16))
 + wa_ctx_emit(batch, MI_NOOP);

If this was

/* Replace me with WA */
wa_ctx_emit(batch, MI_NOOP)

/* Pad to end of cacheline */
while (index % 16)
wa_ctx_emit(batch, MI_NOOP);

You then don't need to alter the code when you add the real w/a. Note
that using (unsigned long)batch as you do later for cacheline
calculation is wrong, as that is a local physical CPU address (not the
virtual address used by the cache in the GPU) and was page aligned
anyway.

Similary,

 +static int gen8_init_perctx_bb(struct intel_engine_cs *ring,
 +uint32_t **wa_ctx_batch,
 +uint32_t offset,
 +uint32_t *num_dwords)
 +{
 + uint32_t index;
 + uint32_t *batch = *wa_ctx_batch;
 +
 + index = offset;
 +

If this just did
wa_ctx_emit(batch, MI_BATCH_BUFFER_END);
rather than insert a cacheline of noops, again you wouldn't need to
touch this infrastructure as you added the w/a.

As it stands, I was a little worried halfway through when the cache
alignment suddenly disappeared - but this patch implied to me that it
was necessary.
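
Putting Chris' two suggestions together, the per-context buffer would then
need no further changes as workarounds are added. A sketch only, keeping the
function signature quoted above and the wa_ctx_emit() form used in this
version of the series; how num_dwords is reported back is guessed here:

static int gen8_init_perctx_bb(struct intel_engine_cs *ring,
			       uint32_t **wa_ctx_batch,
			       uint32_t offset,
			       uint32_t *num_dwords)
{
	uint32_t index = offset;
	uint32_t *batch = *wa_ctx_batch;

	/* Future per-context workarounds get emitted here, each as
	 * wa_ctx_emit(batch, ...); nothing else needs to change. */

	wa_ctx_emit(batch, MI_BATCH_BUFFER_END);

	*num_dwords = index - offset;

	return 0;
}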
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] 3.16 backlight kernel options

2015-06-18 Thread Stéphane ANCELOT

Hi,

Which Linux kernel option is required to be able to control the
brightness of the display?


Regards,
Steph

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread Damien Lespiau
On Thu, Jun 18, 2015 at 03:45:44PM +0100, Antoine, Peter wrote:
 Not a blocker. It gets a little more interesting, as the L3CC
 registers are shared across all engines, but is only saved in the RCS
 context. But, it is reset on the context switch when ELSP is set. So
 we would have to program it (i.e. MMIO) and also set it in the batch
 start for the RCS. Each ring would have to have a proper
 init_context() and these registers programmed there.

Hum, so yes, it's like you say. I think leaving a comment somewhere in
the init path telling us we rely on the RCS init_context() for all the
rings would be nice, but that's extra topping that can be done any time.

-- 
Damien
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 15/19] drm/i915: atomic plane updates in a nutshell

2015-06-18 Thread Maarten Lankhorst
On 18-06-15 at 16:21, Matt Roper wrote:
 On Mon, Jun 15, 2015 at 12:33:52PM +0200, Maarten Lankhorst wrote:
 Now that all planes are added during a modeset we can use the
 calculated changes before disabling a plane, and then either commit
 or force disable a plane before disabling the crtc.

 The code is shared with atomic_begin/flush, except watermark updating
 and vblank evasion are not used.

 This is needed for proper atomic suspend/resume support.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90868
 Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com
 ---
  drivers/gpu/drm/i915/intel_display.c | 103 
 ---
  drivers/gpu/drm/i915/intel_sprite.c  |   4 +-
  2 files changed, 23 insertions(+), 84 deletions(-)

 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index cc4ca4970716..beb69281f45c 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -2217,28 +2217,6 @@ static void intel_disable_pipe(struct intel_crtc 
 *crtc)
  intel_wait_for_pipe_off(crtc);
  }
  
 -/**
 - * intel_enable_primary_hw_plane - enable the primary plane on a given pipe
 - * @plane:  plane to be enabled
 - * @crtc: crtc for the plane
 - *
 - * Enable @plane on @crtc, making sure that the pipe is running first.
 - */
 -static void intel_enable_primary_hw_plane(struct drm_plane *plane,
 -  struct drm_crtc *crtc)
 -{
 -struct drm_device *dev = plane-dev;
 -struct drm_i915_private *dev_priv = dev-dev_private;
 -struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 -
 -/* If the pipe isn't enabled, we can't pump pixels and may hang */
 -assert_pipe_enabled(dev_priv, intel_crtc-pipe);
 -to_intel_plane_state(plane-state)-visible = true;
 -
 -dev_priv-display.update_primary_plane(crtc, plane-fb,
 -   crtc-x, crtc-y);
 -}
 -
  static bool need_vtd_wa(struct drm_device *dev)
  {
  #ifdef CONFIG_INTEL_IOMMU
 @@ -4508,20 +4486,6 @@ static void ironlake_pfit_enable(struct intel_crtc 
 *crtc)
  }
  }
  
 -static void intel_enable_sprite_planes(struct drm_crtc *crtc)
 -{
 -struct drm_device *dev = crtc-dev;
 -enum pipe pipe = to_intel_crtc(crtc)-pipe;
 -struct drm_plane *plane;
 -struct intel_plane *intel_plane;
 -
 -drm_for_each_legacy_plane(plane, dev-mode_config.plane_list) {
 -intel_plane = to_intel_plane(plane);
 -if (intel_plane-pipe == pipe)
 -intel_plane_restore(intel_plane-base);
 -}
 -}
 -
  void hsw_enable_ips(struct intel_crtc *crtc)
  {
  struct drm_device *dev = crtc-base.dev;
 @@ -4817,27 +4781,6 @@ static void intel_pre_plane_update(struct intel_crtc 
 *crtc)
  intel_pre_disable_primary(crtc-base);
  }
  
 -static void intel_crtc_enable_planes(struct drm_crtc *crtc)
 -{
 -struct drm_device *dev = crtc-dev;
 -struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 -int pipe = intel_crtc-pipe;
 -
 -intel_enable_primary_hw_plane(crtc-primary, crtc);
 -intel_enable_sprite_planes(crtc);
 -if (to_intel_plane_state(crtc-cursor-state)-visible)
 -intel_crtc_update_cursor(crtc, true);
 -
 -intel_post_enable_primary(crtc);
 -
 -/*
 - * FIXME: Once we grow proper nuclear flip support out of this we need
 - * to compute the mask of flip planes precisely. For the time being
 - * consider this a flip to a NULL plane.
 - */
 -intel_frontbuffer_flip(dev, INTEL_FRONTBUFFER_ALL_MASK(pipe));
 -}
 -
  static void intel_crtc_disable_planes(struct drm_crtc *crtc, unsigned 
 plane_mask)
  {
  struct drm_device *dev = crtc-dev;
 @@ -4845,10 +4788,6 @@ static void intel_crtc_disable_planes(struct drm_crtc 
 *crtc, unsigned plane_mask
  struct drm_plane *p;
  int pipe = intel_crtc-pipe;
  
 -intel_crtc_wait_for_pending_flips(crtc);
 -
 -intel_pre_disable_primary(crtc);
 -
  intel_crtc_dpms_overlay_disable(intel_crtc);
  
  drm_for_each_plane_mask(p, dev, plane_mask)
 @@ -6270,6 +6209,11 @@ static void intel_crtc_disable_noatomic(struct 
 drm_crtc *crtc)
  if (!intel_crtc-active)
  return;
  
 +if (to_intel_plane_state(crtc-primary-state)-visible) {
 +intel_crtc_wait_for_pending_flips(crtc);
 +intel_pre_disable_primary(crtc);
 +}
 +
  intel_crtc_disable_planes(crtc, crtc-state-plane_mask);
  dev_priv-display.crtc_disable(crtc);
  
 @@ -11783,10 +11727,6 @@ int intel_plane_atomic_calc_changes(struct 
 drm_crtc_state *crtc_state,
  	if (old_plane_state->base.fb && !fb)
  		intel_crtc->atomic.disabled_planes |= 1 << i;
  
 -/* don't run rest during modeset yet */
 -if (!intel_crtc-active || mode_changed)
 -return 0;
 -
  was_visible = old_plane_state-visible;
  visible = to_intel_plane_state(plane_state)-visible;
  
 @@ -13255,15 +13195,18 

Re: [Intel-gfx] [PATCH v3 17/19] drm/i915: Make setting color key atomic.

2015-06-18 Thread Maarten Lankhorst
On 18-06-15 at 16:21, Matt Roper wrote:
 On Mon, Jun 15, 2015 at 12:33:54PM +0200, Maarten Lankhorst wrote:
 By making color key atomic there are no more transitional helpers.
 The plane check function will reject the color key when a scaler is
 active.

 Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com
 ---
  drivers/gpu/drm/i915/intel_atomic_plane.c |  1 +
  drivers/gpu/drm/i915/intel_display.c  |  7 ++-
  drivers/gpu/drm/i915/intel_drv.h  |  6 +--
  drivers/gpu/drm/i915/intel_sprite.c   | 85 
 +++
  4 files changed, 46 insertions(+), 53 deletions(-)

 diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c 
 b/drivers/gpu/drm/i915/intel_atomic_plane.c
 index 91d53768df9d..10a8ecedc942 100644
 --- a/drivers/gpu/drm/i915/intel_atomic_plane.c
 +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
 @@ -56,6 +56,7 @@ intel_create_plane_state(struct drm_plane *plane)
  
  state-base.plane = plane;
  state-base.rotation = BIT(DRM_ROTATE_0);
 +state-ckey.flags = I915_SET_COLORKEY_NONE;
  
  return state;
  }
 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index 5facd0501a34..746c73d2ab84 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -4401,9 +4401,9 @@ static int skl_update_scaler_plane(struct 
 intel_crtc_state *crtc_state,
  return ret;
  
  /* check colorkey */
 -if (WARN_ON(intel_plane-ckey.flags != I915_SET_COLORKEY_NONE)) {
 +if (plane_state-ckey.flags != I915_SET_COLORKEY_NONE) {
  	DRM_DEBUG_KMS("[PLANE:%d] scaling with color key not allowed",
 -		intel_plane->base.base.id);
 +		      intel_plane->base.base.id);
  return -EINVAL;
  }
  
 @@ -13733,7 +13733,7 @@ intel_check_primary_plane(struct drm_plane *plane,
  
  /* use scaler when colorkey is not required */
  	if (INTEL_INFO(plane->dev)->gen >= 9 &&
 -	    to_intel_plane(plane)->ckey.flags == I915_SET_COLORKEY_NONE) {
 +	    state->ckey.flags == I915_SET_COLORKEY_NONE) {
  min_scale = 1;
  max_scale = skl_max_scale(to_intel_crtc(crtc), crtc_state);
  can_position = true;
 @@ -13881,7 +13881,6 @@ static struct drm_plane 
 *intel_primary_plane_create(struct drm_device *dev,
  primary-check_plane = intel_check_primary_plane;
  primary-commit_plane = intel_commit_primary_plane;
  primary-disable_plane = intel_disable_primary_plane;
 -primary-ckey.flags = I915_SET_COLORKEY_NONE;
  	if (HAS_FBC(dev) && INTEL_INFO(dev)->gen < 4)
  primary-plane = !pipe;
  
 diff --git a/drivers/gpu/drm/i915/intel_drv.h 
 b/drivers/gpu/drm/i915/intel_drv.h
 index 93b9542ab8dc..3a2ac82b0970 100644
 --- a/drivers/gpu/drm/i915/intel_drv.h
 +++ b/drivers/gpu/drm/i915/intel_drv.h
 @@ -274,6 +274,8 @@ struct intel_plane_state {
   * update_scaler_plane.
   */
  int scaler_id;
 +
 +struct drm_intel_sprite_colorkey ckey;
  };
  
  struct intel_initial_plane_config {
 @@ -588,9 +590,6 @@ struct intel_plane {
  bool can_scale;
  int max_downscale;
  
 -/* FIXME convert to properties */
 -struct drm_intel_sprite_colorkey ckey;
 -
  /* Since we need to change the watermarks before/after
   * enabling/disabling the planes, we need to store the parameters here
   * as the other pieces of the struct may not reflect the values we want
 @@ -1390,7 +1389,6 @@ bool intel_sdvo_init(struct drm_device *dev, uint32_t 
 sdvo_reg, bool is_sdvob);
  
  /* intel_sprite.c */
  int intel_plane_init(struct drm_device *dev, enum pipe pipe, int plane);
 -int intel_plane_restore(struct drm_plane *plane);
  int intel_sprite_set_colorkey(struct drm_device *dev, void *data,
struct drm_file *file_priv);
  bool intel_pipe_update_start(struct intel_crtc *crtc,
 diff --git a/drivers/gpu/drm/i915/intel_sprite.c 
 b/drivers/gpu/drm/i915/intel_sprite.c
 index 168f90f346c2..21d3f7882c4d 100644
 --- a/drivers/gpu/drm/i915/intel_sprite.c
 +++ b/drivers/gpu/drm/i915/intel_sprite.c
 @@ -182,7 +182,8 @@ skl_update_plane(struct drm_plane *drm_plane, struct 
 drm_crtc *crtc,
  const int plane = intel_plane-plane + 1;
  u32 plane_ctl, stride_div, stride;
  int pixel_size = drm_format_plane_cpp(fb-pixel_format, 0);
 -const struct drm_intel_sprite_colorkey *key = intel_plane-ckey;
 +const struct drm_intel_sprite_colorkey *key =
 +to_intel_plane_state(drm_plane-state)-ckey;
  unsigned long surf_addr;
  u32 tile_height, plane_offset, plane_size;
  unsigned int rotation;
 @@ -344,7 +345,8 @@ vlv_update_plane(struct drm_plane *dplane, struct 
 drm_crtc *crtc,
  u32 sprctl;
  unsigned long sprsurf_offset, linear_offset;
  int pixel_size = drm_format_plane_cpp(fb-pixel_format, 0);
 -const struct drm_intel_sprite_colorkey *key = intel_plane-ckey;
 +const struct 

Re: [Intel-gfx] [PATCH v3 18/19] drm/i915: Remove transitional references from intel_plane_atomic_check.

2015-06-18 Thread Maarten Lankhorst
On 18-06-15 at 16:21, Matt Roper wrote:
 On Mon, Jun 15, 2015 at 12:33:55PM +0200, Maarten Lankhorst wrote:
 All transitional plane helpers are gone, party!

 Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com
 There's also a reference in skylake_update_primary_plane() that I assume
 can be removed?

Sure, I left it in because people will complain about unrelated changes 
otherwise. :P
It should be a separate patch though.

~Maarten
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [alsa-devel] DP MST audio support

2015-06-18 Thread Lin, Mengdong
 -Original Message-
 From: Takashi Iwai [mailto:ti...@suse.de]
 Sent: Monday, May 18, 2015 5:21 PM
 
 At Thu, 14 May 2015 09:10:33 +1000,
 Dave Airlie wrote:
 
  On 12 May 2015 at 13:27, Dave Airlie airl...@gmail.com wrote:
   On 12 May 2015 at 11:50, Dave Airlie airl...@gmail.com wrote:
   Hi,
  
   So I have a branch that makes no sound,
   http://cgit.freedesktop.org/~airlied/linux/log/?h=dp-mst-audio
  
   and I'm not sure where I need to turn to next,
  
   The Intel docs I've read are kinda vague, assuming you know lots of
   things I clearly don't.
  
   so in theory my branch, sets up the SDP stream to the monitor in
   the payload creation, enables the codec in the intel GPU driver,
   and passes the ELD to the audio driver.
  
   The audio driver uses the device list to get the presence/valid
   bits per device, and manages to retrieve the ELD. I even create ELD
   files in /proc/asound/HDMI/ that have sensible values in them
  
   So it looks like I'm just missing some routing somewhere, most
   likely in the audio driver, then again I  could be missing a lot
   more than that.
  
   Just looking for any ideas or knowledge people may have locked in
   their brains or inside their firewalls.
  
   Okay the branch now has audio on my test setup,
  
    I've had to hack out the intel_not_share_assigned_cvt function; it
    appears to do bad things. I set pin 6 to connection 0 (pin 2), but the
    later sets of pins 5/7 to connection 1 by that function seem to reprogram
  pin 6.
  
   I'm guessing the connection is assigned to a device not a pin in the
   new hw, and the same device is routed via pin 5/7 so I end up trashing it.
  
   my test setup is a Haswell Lenovo t440s + docking station + Dell U2410.
  ping audio guys?
 
  can someone from alsa please take a look or some interest in this?
 
 Sorry, I've been on vacation for last two weeks.  Now still swimming in the
 flood of backlogs...

Hi Artie,

Sorry for the late reply.

We don't have Haswell Lenovo t440s atm, so could you share more info?
- Dell U2410 should support both HDMI and DP input. But I guess it cannot
 support DP MST, right?
- Are you connecting this monitor with a DP cable?
 Which DDI port is used? DDI B, C or D?
- Does audio fail after i915 enables DP MST?
- Is the patch "snd/hdmi: hack out haswell codec workaround" the only change
 on the audio driver side?

  The graphics side patches are fairly trivial, also it would be good to
  get a good explaination of how the hw works,
 
  from what I can see devices get connections not pins on this hw, and I
  notice that I don't always get 3 devices, so I'm not sure if devices
  are a dynamic thing we should be reprobing on some signal.

Do you mean 3 PCM devices here, like pcmC0D3p, pcmC0D7p, pcmC0D8p?
Now the devices are not dynamic; a PCM device is created for each pin.
It seems we need to revise this for DP MST, since a pin can be used to send
up to 3 independent streams on an Intel GPU, which has 3 display pipelines.

 
 The intel_not_share_assigned_cvt() was needed for Haswell HDMI/DP as there
 was static routing between the pin and the converter widgets although the
 codec graph shows it's selectable.

We need to check why this fails. Even if MST is enabled, the converters should
also be selectable.

 Was the pin default config of pin 6 enabled by BIOS properly?
 
 Anyway, Intel people should have a better clue about this; it's always been a
 strange behavior that is tied to the graphics...
 
 
 thanks,
 
 Takashi
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 15/19] drm/i915: atomic plane updates in a nutshell

2015-06-18 Thread Maarten Lankhorst
On 18-06-15 at 17:28, Ville Syrjälä wrote:
 On Mon, Jun 15, 2015 at 12:33:52PM +0200, Maarten Lankhorst wrote:
 Now that all planes are added during a modeset we can use the
 calculated changes before disabling a plane, and then either commit
 or force disable a plane before disabling the crtc.

 The code is shared with atomic_begin/flush, except watermark updating
 and vblank evasion are not used.

 This is needed for proper atomic suspend/resume support.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90868
 Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com
 ---
  drivers/gpu/drm/i915/intel_display.c | 103 
 ---
  drivers/gpu/drm/i915/intel_sprite.c  |   4 +-
  2 files changed, 23 insertions(+), 84 deletions(-)

 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index cc4ca4970716..beb69281f45c 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -2217,28 +2217,6 @@ static void intel_disable_pipe(struct intel_crtc 
 *crtc)
  intel_wait_for_pipe_off(crtc);
  }
  
 -/**
 - * intel_enable_primary_hw_plane - enable the primary plane on a given pipe
 - * @plane:  plane to be enabled
 - * @crtc: crtc for the plane
 - *
 - * Enable @plane on @crtc, making sure that the pipe is running first.
 - */
 -static void intel_enable_primary_hw_plane(struct drm_plane *plane,
 -  struct drm_crtc *crtc)
 -{
 -struct drm_device *dev = plane-dev;
 -struct drm_i915_private *dev_priv = dev-dev_private;
 -struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 -
 -/* If the pipe isn't enabled, we can't pump pixels and may hang */
 -assert_pipe_enabled(dev_priv, intel_crtc-pipe);
 -to_intel_plane_state(plane-state)-visible = true;
 -
 -dev_priv-display.update_primary_plane(crtc, plane-fb,
 -   crtc-x, crtc-y);
 -}
 -
  static bool need_vtd_wa(struct drm_device *dev)
  {
  #ifdef CONFIG_INTEL_IOMMU
 @@ -4508,20 +4486,6 @@ static void ironlake_pfit_enable(struct intel_crtc 
 *crtc)
  }
  }
  
 -static void intel_enable_sprite_planes(struct drm_crtc *crtc)
 -{
 -struct drm_device *dev = crtc-dev;
 -enum pipe pipe = to_intel_crtc(crtc)-pipe;
 -struct drm_plane *plane;
 -struct intel_plane *intel_plane;
 -
 -drm_for_each_legacy_plane(plane, dev-mode_config.plane_list) {
 -intel_plane = to_intel_plane(plane);
 -if (intel_plane-pipe == pipe)
 -intel_plane_restore(intel_plane-base);
 -}
 -}
 -
  void hsw_enable_ips(struct intel_crtc *crtc)
  {
  struct drm_device *dev = crtc-base.dev;
 @@ -4817,27 +4781,6 @@ static void intel_pre_plane_update(struct intel_crtc 
 *crtc)
  intel_pre_disable_primary(crtc-base);
  }
  
 -static void intel_crtc_enable_planes(struct drm_crtc *crtc)
 -{
 -struct drm_device *dev = crtc-dev;
 -struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 -int pipe = intel_crtc-pipe;
 -
 -intel_enable_primary_hw_plane(crtc-primary, crtc);
 -intel_enable_sprite_planes(crtc);
 -if (to_intel_plane_state(crtc-cursor-state)-visible)
 -intel_crtc_update_cursor(crtc, true);
 -
 -intel_post_enable_primary(crtc);
 -
 -/*
 - * FIXME: Once we grow proper nuclear flip support out of this we need
 - * to compute the mask of flip planes precisely. For the time being
 - * consider this a flip to a NULL plane.
 - */
 -intel_frontbuffer_flip(dev, INTEL_FRONTBUFFER_ALL_MASK(pipe));
 -}
 -
  static void intel_crtc_disable_planes(struct drm_crtc *crtc, unsigned 
 plane_mask)
  {
  struct drm_device *dev = crtc-dev;
 @@ -4845,10 +4788,6 @@ static void intel_crtc_disable_planes(struct drm_crtc 
 *crtc, unsigned plane_mask
  struct drm_plane *p;
  int pipe = intel_crtc-pipe;
  
 -intel_crtc_wait_for_pending_flips(crtc);
 -
 -intel_pre_disable_primary(crtc);
 -
  intel_crtc_dpms_overlay_disable(intel_crtc);
  
  drm_for_each_plane_mask(p, dev, plane_mask)
 @@ -6270,6 +6209,11 @@ static void intel_crtc_disable_noatomic(struct 
 drm_crtc *crtc)
  if (!intel_crtc-active)
  return;
  
 +if (to_intel_plane_state(crtc-primary-state)-visible) {
 +intel_crtc_wait_for_pending_flips(crtc);
 +intel_pre_disable_primary(crtc);
 +}
 +
  intel_crtc_disable_planes(crtc, crtc-state-plane_mask);
  dev_priv-display.crtc_disable(crtc);
  
 @@ -11783,10 +11727,6 @@ int intel_plane_atomic_calc_changes(struct 
 drm_crtc_state *crtc_state,
  	if (old_plane_state->base.fb && !fb)
  		intel_crtc->atomic.disabled_planes |= 1 << i;
  
 -/* don't run rest during modeset yet */
 -if (!intel_crtc-active || mode_changed)
 -return 0;
 -
  was_visible = old_plane_state-visible;
  visible = to_intel_plane_state(plane_state)-visible;
  
 @@ -13255,15 

Re: [Intel-gfx] [PATCH] drm/atomic: Extract needs_modeset function

2015-06-18 Thread Maarten Lankhorst
On 18-06-15 at 11:25, Daniel Vetter wrote:
 We use the same check already in the atomic core, so might as well
 make this official. And it's also reused in e.g. i915.

 Motivated by Maarten's idea to extract a connector_changed state out
 of mode_changed.

 Cc: Maarten Lankhorst maarten.lankho...@linux.intel.com
 Signed-off-by: Daniel Vetter daniel.vet...@intel.com

Reviewed-By: Maarten Lankhorst maarten.lankho...@linux.intel.com
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PULL] drm-intel-next-fixes

2015-06-18 Thread Dave Airlie
On 18 June 2015 at 16:04, Jani Nikula jani.nik...@intel.com wrote:

 Hi Dave, i915 fixes for drm-next/v4.2.

 BR,
 Jani.

And my gcc says:

/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:
In function ‘__intel_set_mode’:
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:11850:14:
warning: ‘crtc_state’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
  return state-mode_changed || state-active_changed;
  ^
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:11860:25:
note: ‘crtc_state’ was declared here
  struct drm_crtc_state *crtc_state;
 ^
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:11874:6:
warning: ‘crtc’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
   if (crtc != intel_encoder-base.crtc)
  ^
/home/airlied/devel/kernel/drm-next/drivers/gpu/drm/i915/intel_display.c:11859:19:
note: ‘crtc’ was declared here
  struct drm_crtc *crtc;
   ^

No idea if this is true, but I don't think I've seen it before now.

gcc 5.1.1 on fedora 22

Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 2/6] drm/i915/gen8: Re-order init pipe_control in lrc mode

2015-06-18 Thread Arun Siluvery
Some of the WA applied using WA batch buffers perform writes to the scratch page.
In the current flow the WA are initialized before the scratch obj is allocated.
This patch reorders intel_init_pipe_control() so that we have a valid scratch obj
before we initialize the WA.

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Dave Gordon david.s.gor...@intel.com
Signed-off-by: Michel Thierry michel.thie...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/intel_lrc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 8cc851dd..62486cd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1636,7 +1636,8 @@ static int logical_render_ring_init(struct drm_device 
*dev)
ring-emit_bb_start = gen8_emit_bb_start;
 
ring-dev = dev;
-   ret = logical_ring_init(dev, ring);
+
+   ret = intel_init_pipe_control(ring);
if (ret)
return ret;
 
@@ -1648,7 +1649,7 @@ static int logical_render_ring_init(struct drm_device 
*dev)
}
}
 
-   ret = intel_init_pipe_control(ring);
+   ret = logical_ring_init(dev, ring);
if (ret) {
if (ring-wa_ctx.obj)
lrc_destroy_wa_ctx_obj(ring);
-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 5/6] drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch workaround

2015-06-18 Thread Arun Siluvery
In the indirect context w/a batch buffer, apply
WaClearSlmSpaceAtContextSwitch.

v2: s/PIPE_CONTROL_FLUSH_RO_CACHES/PIPE_CONTROL_FLUSH_L3 (Ville)

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Dave Gordon david.s.gor...@intel.com
Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 drivers/gpu/drm/i915/intel_lrc.c | 16 
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d14ad20..7637e64 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -410,6 +410,7 @@
 #define   DISPLAY_PLANE_A   (0<<20)
 #define   DISPLAY_PLANE_B   (1<<20)
 #define GFX_OP_PIPE_CONTROL(len)   ((0x3<<29)|(0x3<<27)|(0x2<<24)|(len-2))
+#define   PIPE_CONTROL_FLUSH_L3 (1<<27)
 #define   PIPE_CONTROL_GLOBAL_GTT_IVB  (1<<24) /* gen7+ */
 #define   PIPE_CONTROL_MMIO_WRITE  (1<<23)
 #define   PIPE_CONTROL_STORE_DATA_INDEX (1<<21)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3291ef4..b631390 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1106,6 +1106,7 @@ static int gen8_init_indirectctx_bb(struct 
intel_engine_cs *ring,
uint32_t *num_dwords)
 {
uint32_t index;
+   uint32_t scratch_addr;
uint32_t *batch = *wa_ctx_batch;
 
index = offset;
@@ -1136,6 +1137,21 @@ static int gen8_init_indirectctx_bb(struct 
intel_engine_cs *ring,
		wa_ctx_emit(batch, l3sqc4_flush &
				   ~GEN8_LQSC_FLUSH_COHERENT_LINES);
}
 
+   /* WaClearSlmSpaceAtContextSwitch:bdw,chv */
+   /* Actual scratch location is at 128 bytes offset */
+   scratch_addr = ring-scratch.gtt_offset + 2*CACHELINE_BYTES;
+   scratch_addr |= PIPE_CONTROL_GLOBAL_GTT;
+
+   wa_ctx_emit(batch, GFX_OP_PIPE_CONTROL(6));
+   wa_ctx_emit(batch, (PIPE_CONTROL_FLUSH_L3 |
+   PIPE_CONTROL_GLOBAL_GTT_IVB |
+   PIPE_CONTROL_CS_STALL |
+   PIPE_CONTROL_QW_WRITE));
+   wa_ctx_emit(batch, scratch_addr);
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+
/* Pad to end of cacheline */
while (index % CACHELINE_DWORDS)
wa_ctx_emit(batch, MI_NOOP);
-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/5] drm/i915: PSR: Remove Low Power HW tracking mask.

2015-06-18 Thread Rodrigo Vivi
By spec we should only mask memup and hotplug detection
for the hardware tracking cases. However, we have always
also masked LPSP, which is there for low power tracking
support, because without it PSR was constantly exiting
and never really getting activated.

Now with runtime PM enabled by default, Matthew reported
that he was seeing missed screen updates. So let's remove
this undesirable mask and let HW tracking take care of
cases like this where power saving features are also
running.

WARNING: With this patch PSR depends on audio and GPU
runtime PM being properly enabled, working on auto.
If either audio runtime PM or GPU runtime PM is not
properly set, PSR will constantly exit and the
performance counter will be 0.

But the best thing about this patch is that with one more
piece of HW tracking working, the risk of missed updates
or blank screens is minimized.

This affects just the core platforms where PSR exits are
also helped by HW tracking: Haswell, Broadwell and Skylake
for now.

Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Matthew Garrett mj...@srcf.ucam.org via codon.org.uk
Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
---
 drivers/gpu/drm/i915/intel_psr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 5ee0fa5..6549d58 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -400,7 +400,7 @@ void intel_psr_enable(struct intel_dp *intel_dp)
 
/* Avoid continuous PSR exit by masking memup and hpd */
I915_WRITE(EDP_PSR_DEBUG_CTL(dev), EDP_PSR_DEBUG_MASK_MEMUP |
-  EDP_PSR_DEBUG_MASK_HPD | EDP_PSR_DEBUG_MASK_LPSP);
+  EDP_PSR_DEBUG_MASK_HPD);
 
/* Enable PSR on the panel */
hsw_psr_enable_sink(intel_dp);
-- 
2.1.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 3/5] drm/i915: Remove unused ring argument from frontbuffer invalidate and busy functions.

2015-06-18 Thread Rodrigo Vivi
This patch doesn't have any functional change, but organizes the frontbuffer
invalidate and busy paths by removing the unnecessary ring argument from their
signatures.

It was unused in mark_fb_busy and only used in fb_obj_invalidate for the
same ORIGIN_CS usage. So let's clean it up a bit.

Cc: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
---
 drivers/gpu/drm/i915/i915_gem.c| 10 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/intel_drv.h   |  1 -
 drivers/gpu/drm/i915/intel_fbdev.c |  4 ++--
 drivers/gpu/drm/i915/intel_frontbuffer.c   | 14 +-
 5 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 248fd1a..49beca2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -350,7 +350,7 @@ i915_gem_phys_pwrite(struct drm_i915_gem_object *obj,
if (ret)
return ret;
 
-   intel_fb_obj_invalidate(obj, NULL, ORIGIN_CPU);
+   intel_fb_obj_invalidate(obj, ORIGIN_CPU);
if (__copy_from_user_inatomic_nocache(vaddr, user_data, args-size)) {
unsigned long unwritten;
 
@@ -804,7 +804,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 
offset = i915_gem_obj_ggtt_offset(obj) + args-offset;
 
-   intel_fb_obj_invalidate(obj, NULL, ORIGIN_GTT);
+   intel_fb_obj_invalidate(obj, ORIGIN_GTT);
 
while (remain  0) {
/* Operation in this page
@@ -948,7 +948,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
if (ret)
return ret;
 
-   intel_fb_obj_invalidate(obj, NULL, ORIGIN_CPU);
+   intel_fb_obj_invalidate(obj, ORIGIN_CPU);
 
i915_gem_object_pin_pages(obj);
 
@@ -3939,7 +3939,7 @@ i915_gem_object_set_to_gtt_domain(struct 
drm_i915_gem_object *obj, bool write)
}
 
if (write)
-   intel_fb_obj_invalidate(obj, NULL, ORIGIN_GTT);
+   intel_fb_obj_invalidate(obj, ORIGIN_GTT);
 
trace_i915_gem_object_change_domain(obj,
old_read_domains,
@@ -4212,7 +4212,7 @@ i915_gem_object_set_to_cpu_domain(struct 
drm_i915_gem_object *obj, bool write)
}
 
if (write)
-   intel_fb_obj_invalidate(obj, NULL, ORIGIN_CPU);
+   intel_fb_obj_invalidate(obj, ORIGIN_CPU);
 
trace_i915_gem_object_change_domain(obj,
old_read_domains,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3336e1c..edb8c45 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1038,7 +1038,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
obj-dirty = 1;
i915_gem_request_assign(obj-last_write_req, req);
 
-   intel_fb_obj_invalidate(obj, ring, ORIGIN_CS);
+   intel_fb_obj_invalidate(obj, ORIGIN_CS);
 
/* update for the implicit flush after a batch */
obj-base.write_domain = ~I915_GEM_GPU_DOMAINS;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index bcafefc..64fb9fe 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -955,7 +955,6 @@ void bxt_ddi_vswing_sequence(struct drm_device *dev, u32 
level,
 
 /* intel_frontbuffer.c */
 void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
-struct intel_engine_cs *ring,
 enum fb_op_origin origin);
 void intel_frontbuffer_flip_prepare(struct drm_device *dev,
unsigned frontbuffer_bits);
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
b/drivers/gpu/drm/i915/intel_fbdev.c
index 6372cfc..8382146 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -89,7 +89,7 @@ static int intel_fbdev_blank(int blank, struct fb_info *info)
 * now until we solve this for real.
 */
mutex_lock(fb_helper-dev-struct_mutex);
-   intel_fb_obj_invalidate(ifbdev-fb-obj, NULL, ORIGIN_GTT);
+   intel_fb_obj_invalidate(ifbdev-fb-obj, ORIGIN_GTT);
mutex_unlock(fb_helper-dev-struct_mutex);
}
 
@@ -115,7 +115,7 @@ static int intel_fbdev_pan_display(struct fb_var_screeninfo 
*var,
 * now until we solve this for real.
 */
mutex_lock(fb_helper-dev-struct_mutex);
-   intel_fb_obj_invalidate(ifbdev-fb-obj, NULL, ORIGIN_GTT);
+   intel_fb_obj_invalidate(ifbdev-fb-obj, ORIGIN_GTT);
mutex_unlock(fb_helper-dev-struct_mutex);
}
 
diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c 
b/drivers/gpu/drm/i915/intel_frontbuffer.c

[Intel-gfx] [PATCH 5/5] drm/i915: Enable PSR by default.

2015-06-18 Thread Rodrigo Vivi
With reliable frontbuffer tracking and all instability corner cases solved,
let's re-enable PSR by default on all supported platforms.

Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
---
 drivers/gpu/drm/i915/i915_params.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 8ac5a1b..e864e67 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -37,7 +37,7 @@ struct i915_params i915 __read_mostly = {
.enable_execlists = -1,
.enable_hangcheck = true,
.enable_ppgtt = -1,
-   .enable_psr = 0,
+   .enable_psr = 1,
.preliminary_hw_support = 
IS_ENABLED(CONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT),
.disable_power_well = 1,
.enable_ips = 1,
@@ -124,7 +124,7 @@ MODULE_PARM_DESC(enable_execlists,
(-1=auto [default], 0=disabled, 1=enabled));
 
 module_param_named(enable_psr, i915.enable_psr, int, 0600);
-MODULE_PARM_DESC(enable_psr, Enable PSR (default: false));
+MODULE_PARM_DESC(enable_psr, Enable PSR (default: true));
 
 module_param_named(preliminary_hw_support, i915.preliminary_hw_support, int, 
0600);
 MODULE_PARM_DESC(preliminary_hw_support,
-- 
2.1.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 6/8] drivers/pwm: Add Crystalcove (CRC) PWM driver

2015-06-18 Thread Shobhit Kumar
On Fri, May 1, 2015 at 2:42 AM, Paul Bolle pebo...@tiscali.nl wrote:
 On Wed, 2015-04-29 at 19:30 +0530, Shobhit Kumar wrote:
 --- a/drivers/pwm/Kconfig
 +++ b/drivers/pwm/Kconfig

 +config PWM_CRC
 + bool Intel Crystalcove (CRC) PWM support
 + depends on X86  INTEL_SOC_PMIC
 + help
 +   Generic PWM framework driver for Crystalcove (CRC) PMIC based PWM
 +   control.

 --- a/drivers/pwm/Makefile
 +++ b/drivers/pwm/Makefile

 +obj-$(CONFIG_PWM_CRC)+= pwm-crc.o

 PWM_CRC is a bool symbol. So pwm-crc.o can never be part of a module.

I actually started this as a module but later decided to make it a bool
because INTEL_SOC_PMIC, on which this depends, is itself a bool as
well. Still, it is good to keep the module-based initialization.
Firstly because it causes no harm and, even though some of the macros
are pre-processed out, it gives info about the driver. Secondly, there
were discussions on why INTEL_SOC_PMIC is bool (note that driver also
has module-based initialization even though it is bool). I am guessing because
of some tricky module load order dependencies. If that ever becomes a
module, this can mostly be loaded as a module unchanged.

Regards
Shobhit


 (If I'm wrong, and that object file can actually be part of a module,
 you can stop reading here.)

 --- /dev/null
 +++ b/drivers/pwm/pwm-crc.c

 +#include linux/module.h

 Perhaps this include is not needed.

 +static const struct pwm_ops crc_pwm_ops = {
 + .config = crc_pwm_config,
 + .enable = crc_pwm_enable,
 + .disable = crc_pwm_disable,
 + .owner = THIS_MODULE,

 For built-in only code THIS_MODULE is basically equivalent to NULL (see
 include/linux/export.h). So I guess this line can be dropped.

 +};

 +static struct platform_driver crystalcove_pwm_driver = {
 + .probe = crystalcove_pwm_probe,
 + .remove = crystalcove_pwm_remove,
 + .driver = {
 + .name = crystal_cove_pwm,
 + },
 +};
 +
 +module_platform_driver(crystalcove_pwm_driver);

 Speaking from memory: for built-in only code this is equivalent to
 calling
 platform_driver_register(crystalcove_pwm_driver);

 from a wrapper, and marking that wrapper with device_initcall().
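
(For reference, a minimal sketch of that built-in-only form; the wrapper
name is assumed for illustration:)

static int __init crystalcove_pwm_init(void)
{
	return platform_driver_register(&crystalcove_pwm_driver);
}
device_initcall(crystalcove_pwm_init);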

 +MODULE_AUTHOR(Shobhit Kumar shobhit.ku...@intel.com);
 +MODULE_DESCRIPTION(Intel Crystal Cove PWM Driver);
 +MODULE_LICENSE(GPL v2);

 These macros will be effectively preprocessed away for built-in only
 code.


 Paul Bolle

 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 05/15] drm/i915: GuC-specific firmware loader

2015-06-18 Thread Yu Dai



On 06/15/2015 01:30 PM, Chris Wilson wrote:

On Mon, Jun 15, 2015 at 07:36:23PM +0100, Dave Gordon wrote:
snip
 + * Return true if get a success code from normal boot or RC6 boot
 + */
 +static inline bool i915_guc_get_status(struct drm_i915_private *dev_priv,
 +  u32 *status)
 +{
 +  *status = I915_READ(GUC_STATUS);
 +  return (((*status) & GS_UKERNEL_MASK) == GS_UKERNEL_READY ||
 +  ((*status) & GS_UKERNEL_MASK) == GS_UKERNEL_LAPIC_DONE);

Weird function. It does two things, and only one of those is get_status. Maybe
you would like to split this up better and use a switch when you mean a
switch. Or rename it to reflect its use only as a condition.
Yes. It makes sense to change it to something like 
i915_guc_is_ucode_loaded().
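
A minimal sketch of what that renamed predicate could look like (the name and
exact shape are assumptions; the raw status can still be read out separately
where it is needed for error reporting):

static inline bool i915_guc_is_ucode_loaded(struct drm_i915_private *dev_priv)
{
	u32 status = I915_READ(GUC_STATUS);

	return (status & GS_UKERNEL_MASK) == GS_UKERNEL_READY ||
	       (status & GS_UKERNEL_MASK) == GS_UKERNEL_LAPIC_DONE;
}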

 +}
 +
 +/* Transfers the firmware image to RAM for execution by the microcontroller.
 + *
 + * GuC Firmware layout:
 + * +---+  
 + * |  CSS header   |  128B
 + * +---+  
 + * | uCode |
 + * +---+  
 + * | RSA signature |  256B
 + * +---+  
 + * | RSA public Key|  256B
 + * +---+  
 + * |   Public key modulus  |4B
 + * +---+  
 + *
 + * Architecturally, the DMA engine is bidirectional, and it can potentially
 + * even transfer between GTT locations. This functionality is left out of the
 + * API for now as there is no need for it.
 + *
 + * Note that the GuC needs the CSS header plus uKernel code to be copied as one
 + * chunk of data. The RSA signature data is loaded via MMIO.
 + */
 +static int guc_ucode_xfer_dma(struct drm_i915_private *dev_priv)
 +{
 +  struct intel_uc_fw *guc_fw = dev_priv-guc.guc_fw;
 +  struct drm_i915_gem_object *fw_obj = guc_fw-uc_fw_obj;
 +  unsigned long offset;
 +  struct sg_table *sg = fw_obj-pages;
 +  u32 status, ucode_size, rsa[UOS_RSA_SIG_SIZE / sizeof(u32)];
 +  int i, ret = 0;
 +
 +  /* uCode size, also is where RSA signature starts */
 +  offset = ucode_size = guc_fw-uc_fw_size - UOS_CSS_SIGNING_SIZE;
 +
 +  /* Copy RSA signature from the fw image to HW for verification */
 +  sg_pcopy_to_buffer(sg-sgl, sg-nents, rsa, UOS_RSA_SIG_SIZE, offset);
 +  for (i = 0; i  UOS_RSA_SIG_SIZE / sizeof(u32); i++)
 +  I915_WRITE(UOS_RSA_SCRATCH_0 + i * sizeof(u32), rsa[i]);
 +
 +  /* Set the source address for the new blob */
 +  offset = i915_gem_obj_ggtt_offset(fw_obj);

Why would it even have a GGTT vma? There's no precondition here to
assert that it should.

It is pinned into GGTT inside gem_allocate_guc_obj.

 +  I915_WRITE(DMA_ADDR_0_LOW, lower_32_bits(offset));
 +  I915_WRITE(DMA_ADDR_0_HIGH, upper_32_bits(offset)  0x);
 +
 +  /* Set the destination. Current uCode expects an 8k stack starting from
 +   * offset 0. */
 +  I915_WRITE(DMA_ADDR_1_LOW, 0x2000);
 +
 +  /* XXX: The image is automatically transfered to SRAM after the RSA
 +   * verification. This is why the address space is chosen as such. */
 +  I915_WRITE(DMA_ADDR_1_HIGH, DMA_ADDRESS_SPACE_WOPCM);
 +
 +  I915_WRITE(DMA_COPY_SIZE, ucode_size);
 +
 +  /* Finally start the DMA */
 +  I915_WRITE(DMA_CTRL, _MASKED_BIT_ENABLE(UOS_MOVE | START_DMA));
 +

Just assuming that the writes land and in the order you expect?
Is a POSTING_READ of DMA_COPY_SIZE before issuing the DMA enough here? Or
should we POSTING_READ all of those writes?
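
For illustration, the single-posting-read variant being asked about would look
roughly like this (a sketch only; whether one read is enough is exactly the
open question):

	I915_WRITE(DMA_COPY_SIZE, ucode_size);

	/* Flush the queued register writes before kicking off the transfer */
	POSTING_READ(DMA_COPY_SIZE);

	/* Finally start the DMA */
	I915_WRITE(DMA_CTRL, _MASKED_BIT_ENABLE(UOS_MOVE | START_DMA));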


-Alex
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/5] drm/i915: Enable runtime pm

2015-06-18 Thread Rodrigo Vivi
I understand this patch is still under discussion. I just re-sent it to
flag that the following one depends on it. Otherwise it is better to
remove it and proceed with the last 3 patches of the series.

Thanks

On Thu, Jun 18, 2015 at 11:43 AM, Rodrigo Vivi rodrigo.v...@intel.com wrote:
 From: Daniel Vetter daniel.vet...@ffwll.ch

 Like with every other feature that's not enabled by default we break
 runtime pm support way too often by accident because the overall test
 coverage isn't great. And it's been almost 2 years since we enabled
 the power well code by default

 commit bf51d5e2cda5d36d98e4b46ac7fca9461e512c41
 Author: Paulo Zanoni paulo.r.zan...@intel.com
 Date:   Wed Jul 3 17:12:13 2013 -0300

 drm/i915: switch disable_power_well default value to 1

 It's really more than overdue for runtime pm itself to follow!

 Note that in practice this won't do a whole lot yet, since we're still
 gated on snd-hda-intel doing proper runtime pm. But I've discussed
 this with Liam and we agreed that this needs to be done. And the audio
 team is working to hold up their end of this bargain.

 And the justification for updating the autosuspend delay to 100ms:
 Quick measurement shows that we can do a full rpm cycle in about 5ms,
 which means the delay should still be really conservative from a power
 conservation pov. The only workload that would suffer from ping-pong
 is also only gpu/compute with all screens off. 100ms should cover any
 kind of latency with submitting follow-up batches.

 Cc: Takashi Iwai ti...@suse.de
 Cc: Liam Girdwood liam.r.girdw...@intel.com
 Cc: Yang, Libin libin.y...@intel.com
 Cc: Lin, Mengdong mengdong@intel.com
 Cc: Li, Jocelyn jocelyn...@intel.com
 Cc: Kaskinen, Tanu tanu.kaski...@intel.com
 Cc: Zanoni, Paulo R paulo.r.zan...@intel.com
 Signed-off-by: Daniel Vetter daniel.vet...@intel.com
 Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: 
 shuang...@intel.com)
 Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
 ---
  drivers/gpu/drm/i915/intel_runtime_pm.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
 b/drivers/gpu/drm/i915/intel_runtime_pm.c
 index 1a45385..2628b21 100644
 --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
 +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
 @@ -1831,9 +1831,10 @@ void intel_runtime_pm_enable(struct drm_i915_private 
 *dev_priv)
 return;
 }

 -   pm_runtime_set_autosuspend_delay(device, 1); /* 10s */
 +   pm_runtime_set_autosuspend_delay(device, 100);
 pm_runtime_mark_last_busy(device);
 pm_runtime_use_autosuspend(device);
 +   pm_runtime_allow(device);

 pm_runtime_put_autosuspend(device);
  }
 --
 2.1.0

 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 1/5] drm/i915: Enable runtime pm

2015-06-18 Thread Rodrigo Vivi
From: Daniel Vetter daniel.vet...@ffwll.ch

Like with every other feature that's not enabled by default we break
runtime pm support way too often by accident because the overall test
coverage isn't great. And it's been almost 2 years since we enabled
the power well code by default

commit bf51d5e2cda5d36d98e4b46ac7fca9461e512c41
Author: Paulo Zanoni paulo.r.zan...@intel.com
Date:   Wed Jul 3 17:12:13 2013 -0300

drm/i915: switch disable_power_well default value to 1

It's really more than overdue for runtime pm itself to follow!

Note that in practice this won't do a whole lot yet, since we're still
gated on snd-hda-intel doing proper runtime pm. But I've discussed
this with Liam and we agreed that this needs to be done. And the audio
team is working to hold up their end of this bargain.

And the justification for updating the autosuspend delay to 100ms:
Quick measurement shows that we can do a full rpm cycle in about 5ms,
which means the delay should still be really conservative from a power
conservation pov. The only workload that would suffer from ping-pong
is also only gpu/compute with all screens off. 100ms should cover any
kind of latency with submitting follow-up batches.

Cc: Takashi Iwai ti...@suse.de
Cc: Liam Girdwood liam.r.girdw...@intel.com
Cc: Yang, Libin libin.y...@intel.com
Cc: Lin, Mengdong mengdong@intel.com
Cc: Li, Jocelyn jocelyn...@intel.com
Cc: Kaskinen, Tanu tanu.kaski...@intel.com
Cc: Zanoni, Paulo R paulo.r.zan...@intel.com
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: 
shuang...@intel.com)
Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
---
 drivers/gpu/drm/i915/intel_runtime_pm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 1a45385..2628b21 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -1831,9 +1831,10 @@ void intel_runtime_pm_enable(struct drm_i915_private 
*dev_priv)
return;
}
 
-   pm_runtime_set_autosuspend_delay(device, 1); /* 10s */
+   pm_runtime_set_autosuspend_delay(device, 100);
pm_runtime_mark_last_busy(device);
pm_runtime_use_autosuspend(device);
+   pm_runtime_allow(device);
 
pm_runtime_put_autosuspend(device);
 }
-- 
2.1.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 4/5] drm/i915: Invalidate frontbuffer bits on FBDEV sync

2015-06-18 Thread Rodrigo Vivi
Before this we had some duct tape to cover known cases where
FBDEV would cause a frontbuffer flush, so we invalidated it again.

However other cases appeared, like the boot splash screen doing a
modeset and flushing it. So let's fix it for all cases.

FBDEV ops provide an fb_sync hook that was designed
to wait for blit idle. We don't need to wait for blit idle for
these operations, but we can use this function to let frontbuffer
tracking know that fbdev is about to do something.

So this patch introduces a reliable way to know when fbdev is
performing any operation.

I could've used ORIGIN_FBDEV to set the fbdev_running bool inside
the invalidate function, however I decided to leave it in fbdev
so we can use the single lock to know when we need to invalidate,
minimizing the struct_mutex locks and the invalidates themselves.
So the actual invalidate happens only on the first fbdev frontbuffer
touch, or whenever needed. For instance, if the splash screen
called a modeset during boot, fbdev will invalidate on the next
screen drawn, so there is no risk of missing screen updates
if PSR is enabled.

The fbdev_running unset happens in the frontbuffer tracking code
when an async flip completes. Since fbdev has no reliable place
to tell when it got paused, we can use this place, which is reached
when something else completes a modeset. The risk of a false positive
exists but is minimal, since any proper alternation will go through
this path. Also, a false positive while we don't get the proper
modeset is better than the risk of missing screen updates.

Although fbdev presumes that all callbacks work from atomic
context, I don't believe that any wait for idle is atomic. So I
also removed the FIXME comments we had for using struct_mutex
there in fb_ops.
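
A rough sketch of the hook described above (illustrative only; the function
name is assumed, and the actual patch wires this up through fbdev_running
under fb_tracking.lock together with the new ORIGIN_FBDEV origin):

static int intel_fbdev_sync(struct fb_info *info)
{
	struct drm_fb_helper *fb_helper = info->par;
	struct intel_fbdev *ifbdev =
		container_of(fb_helper, struct intel_fbdev, helper);
	struct drm_i915_private *dev_priv = fb_helper->dev->dev_private;

	/* Only invalidate on the first fbdev touch since the last flip */
	mutex_lock(&dev_priv->fb_tracking.lock);
	if (dev_priv->fb_tracking.fbdev_running) {
		mutex_unlock(&dev_priv->fb_tracking.lock);
		return 0;
	}
	dev_priv->fb_tracking.fbdev_running = true;
	mutex_unlock(&dev_priv->fb_tracking.lock);

	mutex_lock(&fb_helper->dev->struct_mutex);
	intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_FBDEV);
	mutex_unlock(&fb_helper->dev->struct_mutex);

	return 0;
}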

Cc: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
---
 drivers/gpu/drm/i915/i915_debugfs.c  |   5 ++
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/intel_fbdev.c   | 106 ++-
 drivers/gpu/drm/i915/intel_frontbuffer.c |  19 ++
 4 files changed, 59 insertions(+), 73 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index c49fe2a..e3adddb 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2376,6 +2376,11 @@ static int i915_edp_psr_status(struct seq_file *m, void 
*data)
}
mutex_unlock(dev_priv-psr.lock);
 
+   mutex_lock(dev_priv-fb_tracking.lock);
+   seq_printf(m, FBDEV running: %s\n,
+  yesno(dev_priv-fb_tracking.fbdev_running));
+   mutex_unlock(dev_priv-fb_tracking.lock);
+
intel_runtime_pm_put(dev_priv);
return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 491ef0c..f1478f5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -888,6 +888,7 @@ enum fb_op_origin {
ORIGIN_CPU,
ORIGIN_CS,
ORIGIN_FLIP,
+   ORIGIN_FBDEV,
 };
 
 struct i915_fbc {
@@ -1627,6 +1628,7 @@ struct i915_frontbuffer_tracking {
 */
unsigned busy_bits;
unsigned flip_bits;
+   bool fbdev_running;
 };
 
 struct i915_wa_reg {
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
b/drivers/gpu/drm/i915/intel_fbdev.c
index 8382146..4a96c20 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -45,92 +45,52 @@
 #include drm/i915_drm.h
 #include i915_drv.h
 
-static int intel_fbdev_set_par(struct fb_info *info)
-{
-   struct drm_fb_helper *fb_helper = info-par;
-   struct intel_fbdev *ifbdev =
-   container_of(fb_helper, struct intel_fbdev, helper);
-   int ret;
-
-   ret = drm_fb_helper_set_par(info);
-
-   if (ret == 0) {
-   /*
-* FIXME: fbdev presumes that all callbacks also work from
-* atomic contexts and relies on that for emergency oops
-* printing. KMS totally doesn't do that and the locking here is
-* by far not the only place this goes wrong.  Ignore this for
-* now until we solve this for real.
-*/
-   mutex_lock(fb_helper-dev-struct_mutex);
-   ret = i915_gem_object_set_to_gtt_domain(ifbdev-fb-obj,
-   true);
-   mutex_unlock(fb_helper-dev-struct_mutex);
-   }
-
-   return ret;
-}
-
-static int intel_fbdev_blank(int blank, struct fb_info *info)
-{
-   struct drm_fb_helper *fb_helper = info-par;
-   struct intel_fbdev *ifbdev =
-   container_of(fb_helper, struct intel_fbdev, helper);
-   int ret;
-
-   ret = drm_fb_helper_blank(blank, info);
-
-   if (ret == 0) {
-   /*
-* FIXME: fbdev presumes that all callbacks also work from
-* atomic contexts and relies on that for emergency oops
-

Re: [Intel-gfx] [PATCH 05/15] drm/i915: GuC-specific firmware loader

2015-06-18 Thread Dave Gordon
On 15/06/15 21:30, Chris Wilson wrote:
 On Mon, Jun 15, 2015 at 07:36:23PM +0100, Dave Gordon wrote:
 +/* We can't enable contexts until all firmware is loaded */
 +ret = intel_guc_ucode_load(dev, false);
 
 Pardon. I know context initialisation is broken, but adding to that
 breakage is not pleasant.

Sorry, but that's just the way it works. If you want to use the GuC for
batch submission, then you cannot submit any commands to any engine via
the GuC before its firmware is loaded, nor can you submit anything at
all directly to the ELSPs.

However in /this/ patch the 'false' above should have been 'true' to
give synchronous load semantics; and then ignoring the return is
intentional, because either it's worked and we're going to use the GuC,
or it hasn't and we're not (and it's already printed a message). Then
there's a later patch that tries to decouple engine MMIO setup from
engine setup using batches & contexts, at which point we can make use of
the return code.

  ret = i915_gem_context_enable(dev_priv);
  if (ret  ret != -EIO) {
  DRM_ERROR(Context enable failed %d\n, ret);
 
 diff --git a/drivers/gpu/drm/i915/intel_guc.h 
 b/drivers/gpu/drm/i915/intel_guc.h
 index 82367c9..0b44265 100644
 --- a/drivers/gpu/drm/i915/intel_guc.h
 +++ b/drivers/gpu/drm/i915/intel_guc.h
 @@ -166,4 +166,9 @@ struct intel_guc {
  #define GUC_WD_VECS_IER 0xC558
  #define GUC_PM_P24C_IER 0xC55C
  
 +/* intel_guc_loader.c */
 +extern void intel_guc_ucode_init(struct drm_device *dev);
 +extern int intel_guc_ucode_load(struct drm_device *dev, bool wait);
 +extern void intel_guc_ucode_fini(struct drm_device *dev);
 +
  #endif
 diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
 b/drivers/gpu/drm/i915/intel_guc_loader.c
 new file mode 100644
 index 000..16eef4c
 --- /dev/null
 +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
 @@ -0,0 +1,416 @@
 +/*
 + * Copyright © 2014 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the 
 Software),
 + * to deal in the Software without restriction, including without limitation
 + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the next
 + * paragraph) shall be included in all copies or substantial portions of the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS 
 OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
 DEALINGS
 + * IN THE SOFTWARE.
 + *
 + * Authors:
 + *Vinit Azad vinit.a...@intel.com
 + *Ben Widawsky b...@bwidawsk.net
 + *Dave Gordon david.s.gor...@intel.com
 + *Alex Dai yu@intel.com
 + */
 +#include linux/firmware.h
 +#include i915_drv.h
 +#include intel_guc.h
 +
 +/**
 + * DOC: GuC
 + *
 + * intel_guc:
 + * Top level structure of guc. It handles firmware loading and manages 
 client
 + * pool and doorbells. intel_guc owns a i915_guc_client to replace the 
 legacy
 + * ExecList submission.
 + *
 + * Firmware versioning:
 + * The firmware build process will generate a version header file with 
 major and
 + * minor version defined. The versions are built into CSS header of 
 firmware.
 + * i915 kernel driver set the minimal firmware version required per 
 platform.
 + * The firmware installation package will install (symbolic link) proper 
 version
 + * of firmware.
 + *
 + * GuC address space:
 + * GuC does not allow any gfx GGTT address that falls into range [0, 
 WOPCM_TOP),
 + * which is reserved for Boot ROM, SRAM and WOPCM. Currently this top 
 address is
 + * 512K. In order to exclude 0-512K address space from GGTT, all gfx objects
 + * used by GuC is pinned with PIN_OFFSET_BIAS along with size of WOPCM.
 + *
 + * Firmware log:
 + * Firmware log is enabled by setting i915.guc_log_level to non-negative 
 level.
 + * Log data is printed out via reading debugfs i915_guc_log_dump. Reading 
 from
 + * i915_guc_load_status will print out firmware loading status and scratch
 + * registers value.
 + *
 + */
 +
 +#define I915_SKL_GUC_UCODE i915/skl_guc_ver3.bin
 +MODULE_FIRMWARE(I915_SKL_GUC_UCODE);
 +
 +static u32 get_gttype(struct drm_device *dev)
 +{
 +/* XXX: GT type based on PCI device ID? field seems unused by fw */
 +return 0;
 +}
 +
 +static u32 get_core_family(struct drm_device *dev)
 
 For new code we really should be in the habit of passing around the

Re: [Intel-gfx] [PATCH 5/5] drm/i915: Enable PSR by default.

2015-06-18 Thread Paulo Zanoni
2015-06-18 15:43 GMT-03:00 Rodrigo Vivi rodrigo.v...@intel.com:
 With reliable frontbuffer tracking and all instability corner cases solved,
 let's re-enable PSR by default on all supported platforms.

Are we now passing all the PSR tests from kms_frontbuffer_tracking too?


 Signed-off-by: Rodrigo Vivi rodrigo.v...@intel.com
 ---
  drivers/gpu/drm/i915/i915_params.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/drivers/gpu/drm/i915/i915_params.c 
 b/drivers/gpu/drm/i915/i915_params.c
 index 8ac5a1b..e864e67 100644
 --- a/drivers/gpu/drm/i915/i915_params.c
 +++ b/drivers/gpu/drm/i915/i915_params.c
 @@ -37,7 +37,7 @@ struct i915_params i915 __read_mostly = {
 .enable_execlists = -1,
 .enable_hangcheck = true,
 .enable_ppgtt = -1,
 -   .enable_psr = 0,
 +   .enable_psr = 1,
 .preliminary_hw_support = 
 IS_ENABLED(CONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT),
 .disable_power_well = 1,
 .enable_ips = 1,
 @@ -124,7 +124,7 @@ MODULE_PARM_DESC(enable_execlists,
 (-1=auto [default], 0=disabled, 1=enabled));

  module_param_named(enable_psr, i915.enable_psr, int, 0600);
 -MODULE_PARM_DESC(enable_psr, Enable PSR (default: false));
 +MODULE_PARM_DESC(enable_psr, Enable PSR (default: true));

  module_param_named(preliminary_hw_support, i915.preliminary_hw_support, int, 
 0600);
  MODULE_PARM_DESC(preliminary_hw_support,
 --
 2.1.0

 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Paulo Zanoni
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 4/6] drm/i915/gen8: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaround

2015-06-18 Thread Arun Siluvery
In Indirect context w/a batch buffer,
+WaFlushCoherentL3CacheLinesAtContextSwitch:bdw

v2: Add LRI commands to set/reset bit that invalidates coherent lines,
update WA to include programming restrictions and exclude CHV as
it is not required (Ville)

v3: Avoid unnecessary read when it can be done by reading register once (Chris).

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Dave Gordon david.s.gor...@intel.com
Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_reg.h  |  2 ++
 drivers/gpu/drm/i915/intel_lrc.c | 23 +++
 2 files changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 84af255..d14ad20 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -426,6 +426,7 @@
 #define   PIPE_CONTROL_INDIRECT_STATE_DISABLE  (19)
 #define   PIPE_CONTROL_NOTIFY  (18)
 #define   PIPE_CONTROL_FLUSH_ENABLE(17) /* gen7+ */
+#define   PIPE_CONTROL_DC_FLUSH_ENABLE (15)
 #define   PIPE_CONTROL_VF_CACHE_INVALIDATE (14)
 #define   PIPE_CONTROL_CONST_CACHE_INVALIDATE  (13)
 #define   PIPE_CONTROL_STATE_CACHE_INVALIDATE  (12)
@@ -5788,6 +5789,7 @@ enum skl_disp_power_wells {
 
 #define GEN8_L3SQCREG4 0xb118
 #define  GEN8_LQSC_RO_PERF_DIS (127)
+#define  GEN8_LQSC_FLUSH_COHERENT_LINES(121)
 
 /* GEN8 chicken */
 #define HDC_CHICKEN0   0x7300
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c4b3493..3291ef4 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1113,6 +1113,29 @@ static int gen8_init_indirectctx_bb(struct 
intel_engine_cs *ring,
/* WaDisableCtxRestoreArbitration:bdw,chv */
wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_DISABLE);
 
+   /* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
+   if (IS_BROADWELL(ring-dev)) {
+   struct drm_i915_private *dev_priv = to_i915(ring-dev);
+   uint32_t l3sqc4_flush = (I915_READ(GEN8_L3SQCREG4) |
+GEN8_LQSC_FLUSH_COHERENT_LINES);
+
+   wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1));
+   wa_ctx_emit(batch, GEN8_L3SQCREG4);
+   wa_ctx_emit(batch, l3sqc4_flush);
+
+   wa_ctx_emit(batch, GFX_OP_PIPE_CONTROL(6));
+   wa_ctx_emit(batch, (PIPE_CONTROL_CS_STALL |
+   PIPE_CONTROL_DC_FLUSH_ENABLE));
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+   wa_ctx_emit(batch, 0);
+
+   wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1));
+   wa_ctx_emit(batch, GEN8_L3SQCREG4);
+   wa_ctx_emit(batch, l3sqc4_flush  
~GEN8_LQSC_FLUSH_COHERENT_LINES);
+   }
+
/* Pad to end of cacheline */
while (index % CACHELINE_DWORDS)
wa_ctx_emit(batch, MI_NOOP);
-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers

2015-06-18 Thread Arun Siluvery
Some of the WA are to be applied during context save but before restore and
some at the end of context save/restore but before executing the instructions
in the ring, WA batch buffers are created for this purpose and these WA cannot
be applied using normal means. Each context has two registers to load the
offsets of these batch buffers. If they are non-zero, HW understands that it
need to execute these batches.

v1: In this version two separate ring_buffer objects were used to load WA
instructions for indirect and per context batch buffers and they were part
of every context.

v2: Chris suggested including an additional page in the context and using it to
load these WA instead of creating separate objects. This will simplify a lot of
things as we need not explicitly pin/unpin them. Thomas Daniel further pointed
out that GuC is planning to use a similar setup to share data between GuC and
the driver, and the WA batch buffers could probably share that page. However,
after discussions with Dave, who is implementing the GuC changes, he suggested
using an independent page for these reasons: the GuC area might grow, and these
WA are initialized only once and are not changed afterwards, so we can share
them across all contexts.

The page is updated with WA during render ring init. This has the advantage of
not adding more special cases to default_context.

We don't know upfront the number of WA we will be applying using these batch
buffers. For this reason the size was fixed earlier, but that is not a good
idea. To fix this, the functions that load instructions are modified to report
the number of commands inserted, and the size is now calculated after the batch
is updated. A macro is introduced to add commands to these batch buffers; it
also checks for overflow and returns an error.
We have a full page dedicated to these WA, so that should be sufficient for a
good number of WA; anything more means we have major issues.
The list for Gen8 is small, and the same goes for Gen9; maybe a few more get
added going forward, but nowhere close to filling the entire page. Chris
suggested a two-pass approach, but we agreed to go with the single-page setup
as it is a one-off routine and simpler code wins. Moved functions around to
simplify it further, added comments and fixed the alignment check.

One additional option is offset field which is helpful if we would like to
have multiple batches at different offsets within the page and select them
based on some criteria. This is not a requirement at this point but could
help in future (Dave).

(Many thanks to Chris, Dave and Thomas for their reviews and inputs)

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Dave Gordon david.s.gor...@intel.com
Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/intel_lrc.c| 199 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  18 +++
 2 files changed, 213 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0413b8f..8cc851dd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -211,6 +211,7 @@ enum {
FAULT_AND_CONTINUE /* Unsupported */
 };
 #define GEN8_CTX_ID_SHIFT 32
+#define CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT  0x17
 
 static int intel_lr_context_pin(struct intel_engine_cs *ring,
struct intel_context *ctx);
@@ -1077,6 +1078,168 @@ static int intel_logical_ring_workarounds_emit(struct 
intel_engine_cs *ring,
return 0;
 }
 
+#define wa_ctx_emit(batch, cmd) {  \
+   if (WARN_ON(index = (PAGE_SIZE / sizeof(uint32_t { \
+   return -ENOSPC; \
+   }   \
+   batch[index++] = (cmd); \
+   }
+
+/**
+ * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA
+ *
+ * @ring: only applicable for RCS
+ * @wa_ctx_batch: page in which WA are loaded
+ * @offset: This is for future use in case if we would like to have multiple
+ *  batches at different offsets and select them based on a criteria.
+ * @num_dwords: The number of WA applied are known at the beginning, it returns
+ * the no of DWORDS written. This batch does not contain MI_BATCH_BUFFER_END
+ * so it adds padding to make it cacheline aligned. MI_BATCH_BUFFER_END will be
+ * added to perctx batch and both of them together makes a complete batch 
buffer.
+ *
+ * Return: non-zero if we exceed the PAGE_SIZE limit.
+ */
+
+static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring,
+   uint32_t **wa_ctx_batch,
+   uint32_t offset,
+   uint32_t *num_dwords)
+{
+   uint32_t index;
+   uint32_t *batch = *wa_ctx_batch;
+
+   index = offset;
+
+   /* FIXME: Replace me with WA */
+   wa_ctx_emit(batch, 

[Intel-gfx] [PATCH v5 6/6] drm/i915/gen8: Add WaRsRestoreWithPerCtxtBb workaround

2015-06-18 Thread Arun Siluvery
In Per context w/a batch buffer,
WaRsRestoreWithPerCtxtBb

v2: This patches modifies definitions of MI_LOAD_REGISTER_MEM and
MI_LOAD_REGISTER_REG; Add GEN8 specific defines for these instructions
so as to not break any future users of existing definitions (Michel)

v3: The length in the current definitions of the LRM and LRR instructions was
specified as 0. It seems this is a common convention for instructions whose
length varies between platforms. This has not been an issue so far because they
are not used anywhere except the command parser; now that we use them in this
patch, update them with the correct length and also move them out of the
command parser placeholder to an appropriate place.
Remove unnecessary padding and follow the WA programming sequence exactly
as mentioned in the spec, which is essential for this WA (Dave).

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Dave Gordon david.s.gor...@intel.com
Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_reg.h  | 29 +++--
 drivers/gpu/drm/i915/intel_lrc.c | 54 
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7637e64..208620d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -347,6 +347,31 @@
 #define   MI_INVALIDATE_BSD(17)
 #define   MI_FLUSH_DW_USE_GTT  (12)
 #define   MI_FLUSH_DW_USE_PPGTT(02)
+#define MI_LOAD_REGISTER_MEMMI_INSTR(0x29, 1)
+#define MI_LOAD_REGISTER_MEM_GEN8 MI_INSTR(0x29, 2)
+#define   MI_LRM_USE_GLOBAL_GTT (122)
+#define   MI_LRM_ASYNC_MODE_ENABLE (121)
+#define MI_LOAD_REGISTER_REGMI_INSTR(0x2A, 1)
+#define MI_ATOMIC(len) MI_INSTR(0x2F, (len-2))
+#define   MI_ATOMIC_MEMORY_TYPE_GGTT   (122)
+#define   MI_ATOMIC_INLINE_DATA(118)
+#define   MI_ATOMIC_CS_STALL   (117)
+#define   MI_ATOMIC_RETURN_DATA_CTL(116)
+#define MI_ATOMIC_OP_MASK(op)  ((op)  8)
+#define MI_ATOMIC_AND  MI_ATOMIC_OP_MASK(0x01)
+#define MI_ATOMIC_OR   MI_ATOMIC_OP_MASK(0x02)
+#define MI_ATOMIC_XOR  MI_ATOMIC_OP_MASK(0x03)
+#define MI_ATOMIC_MOVE MI_ATOMIC_OP_MASK(0x04)
+#define MI_ATOMIC_INC  MI_ATOMIC_OP_MASK(0x05)
+#define MI_ATOMIC_DEC  MI_ATOMIC_OP_MASK(0x06)
+#define MI_ATOMIC_ADD  MI_ATOMIC_OP_MASK(0x07)
+#define MI_ATOMIC_SUB  MI_ATOMIC_OP_MASK(0x08)
+#define MI_ATOMIC_RSUB MI_ATOMIC_OP_MASK(0x09)
+#define MI_ATOMIC_IMAX MI_ATOMIC_OP_MASK(0x0A)
+#define MI_ATOMIC_IMIN MI_ATOMIC_OP_MASK(0x0B)
+#define MI_ATOMIC_UMAX MI_ATOMIC_OP_MASK(0x0C)
+#define MI_ATOMIC_UMIN MI_ATOMIC_OP_MASK(0x0D)
+
 #define MI_BATCH_BUFFERMI_INSTR(0x30, 1)
 #define   MI_BATCH_NON_SECURE  (1)
 /* for snb/ivb/vlv this also means batch in ppgtt when ppgtt is enabled. */
@@ -451,8 +476,6 @@
 #define MI_CLFLUSH  MI_INSTR(0x27, 0)
 #define MI_REPORT_PERF_COUNTMI_INSTR(0x28, 0)
 #define   MI_REPORT_PERF_COUNT_GGTT (10)
-#define MI_LOAD_REGISTER_MEMMI_INSTR(0x29, 0)
-#define MI_LOAD_REGISTER_REGMI_INSTR(0x2A, 0)
 #define MI_RS_STORE_DATA_IMMMI_INSTR(0x2B, 0)
 #define MI_LOAD_URB_MEM MI_INSTR(0x2C, 0)
 #define MI_STORE_URB_MEMMI_INSTR(0x2D, 0)
@@ -1799,6 +1822,8 @@ enum skl_disp_power_wells {
 #define   GEN8_RC_SEMA_IDLE_MSG_DISABLE(1  12)
 #define   GEN8_FF_DOP_CLOCK_GATE_DISABLE   (110)
 
+#define GEN8_RS_PREEMPT_STATUS 0x215C
+
 /* Fuse readout registers for GT */
 #define CHV_FUSE_GT(VLV_DISPLAY_BASE + 0x2168)
 #define   CHV_FGT_DISABLE_SS0  (1  10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b631390..281aec6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1179,13 +1179,67 @@ static int gen8_init_perctx_bb(struct intel_engine_cs 
*ring,
   uint32_t *num_dwords)
 {
uint32_t index;
+   uint32_t scratch_addr;
uint32_t *batch = *wa_ctx_batch;
 
index = offset;
 
+   /* Actual scratch location is at 128 bytes offset */
+   scratch_addr = ring-scratch.gtt_offset + 2*CACHELINE_BYTES;
+   scratch_addr |= PIPE_CONTROL_GLOBAL_GTT;
+
/* WaDisableCtxRestoreArbitration:bdw,chv */
wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_ENABLE);
 
+   /*
+* As per Bspec, to workaround a known HW issue, SW must perform the
+* below programming sequence prior to programming MI_BATCH_BUFFER_END.
+*
+* This is only applicable for Gen8.
+*/
+
+   /* WaRsRestoreWithPerCtxtBb:bdw,chv */
+   wa_ctx_emit(batch, MI_LOAD_REGISTER_IMM(1));
+   wa_ctx_emit(batch, INSTPM);
+   wa_ctx_emit(batch, _MASKED_BIT_DISABLE(INSTPM_FORCE_ORDERING));
+
+   wa_ctx_emit(batch, (MI_ATOMIC(5) |
+   MI_ATOMIC_MEMORY_TYPE_GGTT |
+   MI_ATOMIC_INLINE_DATA |

[Intel-gfx] [PATCH v5 3/6] drm/i915/gen8: Add WaDisableCtxRestoreArbitration workaround

2015-06-18 Thread Arun Siluvery
In Indirect and Per context w/a batch buffer,
+WaDisableCtxRestoreArbitration

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Dave Gordon david.s.gor...@intel.com
Signed-off-by: Rafael Barbalho rafael.barba...@intel.com
Signed-off-by: Arun Siluvery arun.siluv...@linux.intel.com
---
 drivers/gpu/drm/i915/intel_lrc.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 62486cd..c4b3493 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1110,8 +1110,8 @@ static int gen8_init_indirectctx_bb(struct 
intel_engine_cs *ring,
 
index = offset;
 
-   /* FIXME: Replace me with WA */
-   wa_ctx_emit(batch, MI_NOOP);
+   /* WaDisableCtxRestoreArbitration:bdw,chv */
+   wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_DISABLE);
 
/* Pad to end of cacheline */
while (index % CACHELINE_DWORDS)
@@ -1144,6 +1144,9 @@ static int gen8_init_perctx_bb(struct intel_engine_cs 
*ring,
 
index = offset;
 
+   /* WaDisableCtxRestoreArbitration:bdw,chv */
+   wa_ctx_emit(batch, MI_ARB_ON_OFF | MI_ARB_ENABLE);
+
wa_ctx_emit(batch, MI_BATCH_BUFFER_END);
 
*num_dwords = index - offset;
-- 
2.3.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c

2015-06-18 Thread Dave Gordon
On 18/06/15 13:10, Chris Wilson wrote:
 On Thu, Jun 18, 2015 at 12:49:55PM +0100, Dave Gordon wrote:
 On 17/06/15 13:02, Daniel Vetter wrote:
 Domain handling is required for all gem objects, and the resulting bugs if
 you don't for one-off objects are absolutely no fun to track down.

 Is it not the case that the new object returned by
 i915_gem_alloc_object() is
 (a) of a type that can be mapped into the GTT, and
 (b) initially in the CPU domain for both reading and writing?

 So AFAICS the allocate-and-fill function I'm describing (to appear in
 next patch series respin) doesn't need any further domain handling.
 
 A i915_gem_object_create_from_data() is a reasonable addition, and I
 suspect it will make the code a bit more succinct.

I shall adopt this name for it :)

 Whilst your statement is true today, calling set_domain is then a no-op,
 and helps document how you use the object and so reduces the likelihood
 of us introducing bugs in the future.
 -Chris

So here's the new function ... where should the set-to-cpu-domain go?
After the pin_pages and before the sg_copy_from_buffer?

/* Allocate a new GEM object and fill it with the supplied data */
struct drm_i915_gem_object *
i915_gem_object_create_from_data(struct drm_device *dev,
				 const void *data, size_t size)
{
	struct drm_i915_gem_object *obj;
	struct sg_table *sg;
	size_t bytes;
	int ret;

	obj = i915_gem_alloc_object(dev, round_up(size, PAGE_SIZE));
	if (!obj)
		return NULL;

	ret = i915_gem_object_get_pages(obj);
	if (ret)
		goto fail;

	i915_gem_object_pin_pages(obj);
	sg = obj->pages;
	bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
	i915_gem_object_unpin_pages(obj);

	if (WARN_ON(bytes != size)) {
		DRM_ERROR("Incomplete copy, wrote %zu of %zu", bytes, size);
		goto fail;
	}

	return obj;

fail:
	drm_gem_object_unreference(&obj->base);
	return NULL;
}
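
To make the question concrete, the placement being asked about would look
roughly like this (a sketch, not a confirmed answer):

	i915_gem_object_pin_pages(obj);

	/* Assumed placement: move to the CPU write domain before filling */
	ret = i915_gem_object_set_to_cpu_domain(obj, true);
	if (ret) {
		i915_gem_object_unpin_pages(obj);
		goto fail;
	}

	sg = obj->pages;
	bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
	i915_gem_object_unpin_pages(obj);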

.Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 6/8] drivers/pwm: Add Crystalcove (CRC) PWM driver

2015-06-18 Thread Paul Bolle
Hi Shobhit,

On Thu, 2015-06-18 at 23:24 +0530, Shobhit Kumar wrote:
 On Fri, May 1, 2015 at 2:42 AM, Paul Bolle pebo...@tiscali.nl wrote:
  On Wed, 2015-04-29 at 19:30 +0530, Shobhit Kumar wrote:
  --- a/drivers/pwm/Kconfig
  +++ b/drivers/pwm/Kconfig
 
  +config PWM_CRC
  + bool Intel Crystalcove (CRC) PWM support
  + depends on X86  INTEL_SOC_PMIC
  + help
  +   Generic PWM framework driver for Crystalcove (CRC) PMIC based PWM
  +   control.
 
  --- a/drivers/pwm/Makefile
  +++ b/drivers/pwm/Makefile
 
  +obj-$(CONFIG_PWM_CRC)+= pwm-crc.o
 
  PWM_CRC is a bool symbol. So pwm-crc.o can never be part of a module.
 
 I actually started this as a module but later decided to make it as
 bool because INTEL_SOC_PMIC on which this depends is itself a bool as
 well.

As does GPIO_CRYSTAL_COVE and that's a tristate. So?

 Still it is good to keep the module based initialization.
 Firstly because it causes no harm

If I got a dime for every time people used an argument like that I ... I
could treat myself to an ice cream. A really big ice cream. Hmm, that
doesn't sound too impressive. But still, "causes no harm" is below the
bar for kernel code. Kernel code needs to add value.

 and even though some of the macros
 are pre-processed out, gives info about the driver.

None of which can't be gotten elsewhere (i.e., the commit message, or the
file these macros reside in).

 Secondly there
 were discussion on why INTEL_SOC_PMIC is bool (note this driver also
 has module based initialization even when bool).

Yes, there's copy and paste going on even in kernel development.

 I am guessing because
 of some tricky module load order dependencies. If ever that becomes a
 module, this can mostly be unchanged to be loaded as a module.

You put in a macro, or any other bit of code, when it's needed, not
beforehand, just in case. That's silly.

Thanks,


Paul Bolle

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c

2015-06-18 Thread Dave Gordon
On 18/06/15 15:31, Daniel Vetter wrote:
 On Thu, Jun 18, 2015 at 12:49:55PM +0100, Dave Gordon wrote:
 On 17/06/15 13:02, Daniel Vetter wrote:
 On Wed, Jun 17, 2015 at 08:23:40AM +0100, Dave Gordon wrote:
 On 15/06/15 21:09, Chris Wilson wrote:
 On Mon, Jun 15, 2015 at 07:36:19PM +0100, Dave Gordon wrote:
 From: Alex Dai yu@intel.com

 i915_gem_object_write() is a generic function to copy data from a plain
 linear buffer to a paged gem object.

 We will need this for the microcontroller firmware loading support code.

 Issue: VIZ-4884
 Signed-off-by: Alex Dai yu@intel.com
 Signed-off-by: Dave Gordon david.s.gor...@intel.com
 ---
  drivers/gpu/drm/i915/i915_drv.h |2 ++
  drivers/gpu/drm/i915/i915_gem.c |   28 
  2 files changed, 30 insertions(+)

 diff --git a/drivers/gpu/drm/i915/i915_drv.h 
 b/drivers/gpu/drm/i915/i915_drv.h
 index 611fbd8..9094c06 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -2713,6 +2713,8 @@ void *i915_gem_object_alloc(struct drm_device 
 *dev);
  void i915_gem_object_free(struct drm_i915_gem_object *obj);
  void i915_gem_object_init(struct drm_i915_gem_object *obj,
   const struct drm_i915_gem_object_ops *ops);
 +int i915_gem_object_write(struct drm_i915_gem_object *obj,
 +  const void *data, size_t size);
  struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device 
 *dev,
size_t size);
  void i915_init_vm(struct drm_i915_private *dev_priv,
 diff --git a/drivers/gpu/drm/i915/i915_gem.c 
 b/drivers/gpu/drm/i915/i915_gem.c
 index be35f04..75d63c2 100644
 --- a/drivers/gpu/drm/i915/i915_gem.c
 +++ b/drivers/gpu/drm/i915/i915_gem.c
 @@ -5392,3 +5392,31 @@ bool i915_gem_obj_is_pinned(struct 
 drm_i915_gem_object *obj)
  return false;
  }
  
 +/* Fill the @obj with the @size amount of @data */
 +int i915_gem_object_write(struct drm_i915_gem_object *obj,
 +const void *data, size_t size)
 +{
 +struct sg_table *sg;
 +size_t bytes;
 +int ret;
 +
 +ret = i915_gem_object_get_pages(obj);
 +if (ret)
 +return ret;
 +
 +i915_gem_object_pin_pages(obj);

 You don't set the object into the CPU domain, or instead manually handle
 the domain flushing. You don't handle objects that cannot be written
 directly by the CPU, nor do you handle objects whose representation in
 memory is not linear.
 -Chris

 No we don't handle just any random gem object, but we do return an error
 code for any types not supported. However, as we don't really need the
 full generality of writing into a gem object of any type, I will replace
 this function with one that combines the allocation of a new object
 (which will therefore definitely be of the correct type, in the correct
 domain, etc) and filling it with the data to be preserved.

 The usage pattern for the particular case is going to be:
  Once-only:
  Allocate
  Fill
  Then each time GuC is (re-)initialised:
  Map to GTT
  DMA-read from buffer into GuC private memory
  Unmap
  Only on unload:
  Dispose

 So our object is write-once by the CPU (and that's always the first
 operation), thereafter read-occasionally by the GuC's DMA engine.
 
 Yup. The problem is more that on atom platforms the objects aren't
 coherent by default and generally you need to do something. Hence we
 either have
 - an explicit set_caching call to document that this is a gpu object which
   is always coherent (so also on chv/bxt), even when that's a no-op on big
   core
 - or wrap everything in set_domain calls, even when those are no-ops too.
 
 If either of those is lacking, reviewers tend to freak out preemptively and the
 reptile brain takes over ;-)
 
 Cheers, Daniel

We don't need coherency as such. The buffer is filled (once only) by
the CPU (so I should put a set-to-cpu-domain between the allocate and
fill stages?) Once it's filled, the CPU need not read or write it ever
again.

Then before the DMA engine accesses it, we call i915_gem_obj_ggtt_pin,
which I'm assuming will take care of any coherency issues (making sure
the data written by the CPU is now visible to the DMA engine) when it
puts the buffer into the GTT-readable domain. Is that not sufficient?

.Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 05/15] drm/i915: GuC-specific firmware loader

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 10:53:10AM -0700, Yu Dai wrote:
 
 
 On 06/15/2015 01:30 PM, Chris Wilson wrote:
 On Mon, Jun 15, 2015 at 07:36:23PM +0100, Dave Gordon wrote:
  +  /* Set the source address for the new blob */
  +  offset = i915_gem_obj_ggtt_offset(fw_obj);
 
 Why would it even have a GGTT vma? There's no precondition here to
 assert that it should.
 It is pinned into GGTT inside gem_allocate_guc_obj.

The basic rules when reviewing is pinning is:
- is there a reason for this pin?
- is the lifetime of the pin bound to the hardware access?
- are the pad-to-size/alignment correct?
- is the vma in the wrong location?

Pinning early (and then not even stating in the function preamble that
you expect the object to be pinned) makes it hard to review both the
reason for the pin and its lifetime. An easy way to avoid the
assumption of having a pinned object is to pass around the vma instead.
Though because you pin too early, it is not clear what the reason for the
pin is, nor that you only pin it for the lifetime of the hardware access, and
you have to scour the code to ensure that the pin isn't randomly dropped
or reused for another access.
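
As an illustration of the lifetime point, the pin could be taken and dropped
around the DMA itself (a sketch; names follow the patch under review, and the
real code would also need the WOPCM offset bias when pinning):

	ret = i915_gem_obj_ggtt_pin(fw_obj, 0, 0);
	if (ret)
		return ret;

	offset = i915_gem_obj_ggtt_offset(fw_obj);
	I915_WRITE(DMA_ADDR_0_LOW, lower_32_bits(offset));
	I915_WRITE(DMA_ADDR_0_HIGH, upper_32_bits(offset));
	/* ... program size and destination, start the DMA, wait for it ... */

	i915_gem_object_ggtt_unpin(fw_obj);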
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [RFC 06/14] drm/i915: Disable vblank interrupt for disabling MIPI cmd mode

2015-06-18 Thread Gaurav K Singh
The vblank interrupt should be disabled before starting the disable
sequence for MIPI command mode. Otherwise, when the pipe is disabled,
the TE interrupt will still be handled and one memory write command
will be sent with the pipe disabled. This makes the pipe HW get
stuck, and it doesn't recover in the next enable sequence, causing a
display blank out.

v2: Use drm_vblank_off instead of platform-specific disable vblank functions
(Daniel)

Signed-off-by: Yogesh Mohan Marimuthu yogesh.mohan.marimu...@intel.com
Signed-off-by: Gaurav K Singh gaurav.k.si...@intel.com
---
 drivers/gpu/drm/i915/intel_dsi.c |   14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_dsi.c b/drivers/gpu/drm/i915/intel_dsi.c
index d378246..7021591 100644
--- a/drivers/gpu/drm/i915/intel_dsi.c
+++ b/drivers/gpu/drm/i915/intel_dsi.c
@@ -513,11 +513,25 @@ static void intel_dsi_enable_nop(struct intel_encoder 
*encoder)
 
 static void intel_dsi_pre_disable(struct intel_encoder *encoder)
 {
+   struct drm_device *dev = encoder-base.dev;
struct intel_dsi *intel_dsi = enc_to_intel_dsi(encoder-base);
+   struct intel_crtc *intel_crtc = to_intel_crtc(encoder-base.crtc);
+   int pipe = intel_crtc-pipe;
enum port port;
 
DRM_DEBUG_KMS(\n);
 
+   if (is_cmd_mode(intel_dsi)) {
+   drm_vblank_off(dev, pipe);
+
+   /*
+* Make sure that the last frame is sent otherwise pipe can get
+* stuck. Currently providing delay time for ~2 vblanks
+* assuming 60fps.
+*/
+   mdelay(40);
+   }
+
if (is_vid_mode(intel_dsi)) {
/* Send Shutdown command to the panel in LP mode */
for_each_dsi_port(port, intel_dsi-ports)
-- 
1.7.9.5

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [RFC 07/14] drm/i915: Disable MIPI display self refresh mode

2015-06-18 Thread Gaurav K Singh
During disable sequence for MIPI encoder in command mode, disable
MIPI display self-refresh mode bit in Pipe Ctrl reg.

v2: Use crtc state flag instead of loop over encoders (Daniel)

Signed-off-by: Gaurav K Singh gaurav.k.si...@intel.com
Signed-off-by: Yogesh Mohan Marimuthu yogesh.mohan.marimu...@intel.com
Signed-off-by: Shobhit Kumar shobhit.ku...@intel.com
---
 drivers/gpu/drm/i915/intel_display.c |3 +++
 drivers/gpu/drm/i915/intel_drv.h |3 +++
 drivers/gpu/drm/i915/intel_dsi.c |3 +++
 3 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 067b1de..dd518d6 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2193,6 +2193,9 @@ static void intel_disable_pipe(struct intel_crtc *crtc)
if ((val  PIPECONF_ENABLE) == 0)
return;
 
+   if (crtc-config-dsi_self_refresh)
+   val = val  ~PIPECONF_MIPI_DSR_ENABLE;
+
/*
 * Double wide has implications for planes
 * so best keep it disabled when not needed.
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 14562c6..4298a00 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -444,6 +444,9 @@ struct intel_crtc_state {
bool double_wide;
 
bool dp_encoder_is_mst;
+
+   bool dsi_self_refresh;
+
int pbn;
 
struct intel_crtc_scaler_state scaler_state;
diff --git a/drivers/gpu/drm/i915/intel_dsi.c b/drivers/gpu/drm/i915/intel_dsi.c
index 7021591..36d8ad6 100644
--- a/drivers/gpu/drm/i915/intel_dsi.c
+++ b/drivers/gpu/drm/i915/intel_dsi.c
@@ -308,6 +308,9 @@ static bool intel_dsi_compute_config(struct intel_encoder 
*encoder,
 
DRM_DEBUG_KMS(\n);
 
+   if (is_cmd_mode(intel_dsi))
+   config-dsi_self_refresh = true;
+
if (fixed_mode)
intel_fixed_panel_mode(fixed_mode, adjusted_mode);
 
-- 
1.7.9.5

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [RFC 11/14] drm/i915: Enable MIPI display self refresh mode

2015-06-18 Thread Gaurav K Singh
During enable sequence for MIPI encoder in command mode, enable
MIPI display self-refresh mode bit in Pipe Ctrl reg.

v2: Use crtc state flag instead of loop over encoders (Daniel)

Signed-off-by: Gaurav K Singh gaurav.k.si...@intel.com
Signed-off-by: Yogesh Mohan Marimuthu yogesh.mohan.marimu...@intel.com
Signed-off-by: Shobhit Kumar shobhit.ku...@intel.com
---
 drivers/gpu/drm/i915/intel_display.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index dd518d6..c53f66d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2158,6 +2158,11 @@ static void intel_enable_pipe(struct intel_crtc *crtc)
return;
}
 
+   if (crtc-config-dsi_self_refresh) {
+   val = val | PIPECONF_MIPI_DSR_ENABLE;
+   I915_WRITE(reg, val);
+   }
+
I915_WRITE(reg, val | PIPECONF_ENABLE);
POSTING_READ(reg);
 }
-- 
1.7.9.5

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] {Intel-gfx] [RFC 01/14] drm/i915: allocate gem memory for mipi dbi cmd buffer

2015-06-18 Thread Gaurav K Singh
Allocate gem memory for the MIPI DBI command buffer. This memory
will be used when sending commands via the DBI interface.

v2: lock mutex before gem object unreference and later set gem obj ptr to NULL 
(Gaurav)

Signed-off-by: Yogesh Mohan Marimuthu yogesh.mohan.marimu...@intel.com
Signed-off-by: Gaurav K Singh gaurav.k.si...@intel.com
Signed-off-by: Shobhit Kumar shobhit.ku...@intel.com
---
 drivers/gpu/drm/i915/intel_dsi.c |   40 ++
 drivers/gpu/drm/i915/intel_dsi.h |4 
 2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_dsi.c b/drivers/gpu/drm/i915/intel_dsi.c
index 98998e9..011fef2 100644
--- a/drivers/gpu/drm/i915/intel_dsi.c
+++ b/drivers/gpu/drm/i915/intel_dsi.c
@@ -407,9 +407,35 @@ static void intel_dsi_pre_enable(struct intel_encoder 
*encoder)
enum pipe pipe = intel_crtc-pipe;
enum port port;
u32 tmp;
+   int ret;
 
DRM_DEBUG_KMS(\n);
 
+   if (!intel_dsi-gem_obj  is_cmd_mode(intel_dsi)) {
+   intel_dsi-gem_obj = i915_gem_alloc_object(dev, 4096);
+   if (!intel_dsi-gem_obj) {
+   DRM_ERROR(Failed to allocate seqno page\n);
+   return;
+   }
+
+   ret = i915_gem_object_set_cache_level(intel_dsi-gem_obj,
+ I915_CACHE_LLC);
+   if (ret)
+   goto err_unref;
+
+   ret = i915_gem_obj_ggtt_pin(intel_dsi-gem_obj, 4096, 0);
+   if (ret) {
+err_unref:
+   drm_gem_object_unreference(intel_dsi-gem_obj-base);
+   return;
+   }
+
+   intel_dsi-cmd_buff =
+   kmap(sg_page(intel_dsi-gem_obj-pages-sgl));
+   intel_dsi-cmd_buff_phy_addr = page_to_phys(
+   sg_page(intel_dsi-gem_obj-pages-sgl));
+   }
+
/* Disable DPOunit clock gating, can stall pipe
 * and we need DPLL REFA always enabled */
tmp = I915_READ(DPLL(pipe));
@@ -555,6 +581,7 @@ static void intel_dsi_post_disable(struct intel_encoder 
*encoder)
 {
	struct drm_i915_private *dev_priv = encoder->base.dev->dev_private;
	struct intel_dsi *intel_dsi = enc_to_intel_dsi(&encoder->base);
+	struct drm_device *dev = encoder->base.dev;
u32 val;
 
	DRM_DEBUG_KMS("\n");
@@ -571,6 +598,15 @@ static void intel_dsi_post_disable(struct intel_encoder 
*encoder)
 
	msleep(intel_dsi->panel_off_delay);
	msleep(intel_dsi->panel_pwr_cycle_delay);
+
+	if (intel_dsi->gem_obj) {
+		kunmap(intel_dsi->cmd_buff);
+		i915_gem_object_ggtt_unpin(intel_dsi->gem_obj);
+		mutex_lock(&dev->struct_mutex);
+		drm_gem_object_unreference(&intel_dsi->gem_obj->base);
+		mutex_unlock(&dev->struct_mutex);
+	}
+	intel_dsi->gem_obj = NULL;
 }
 
 static bool intel_dsi_get_hw_state(struct intel_encoder *encoder,
@@ -1042,6 +1078,10 @@ void intel_dsi_init(struct drm_device *dev)
		intel_dsi->ports = (1 << PORT_C);
}
 
+	intel_dsi->cmd_buff = NULL;
+	intel_dsi->cmd_buff_phy_addr = 0;
+	intel_dsi->gem_obj = NULL;
+
/* Create a DSI host (and a device) for each port. */
	for_each_dsi_port(port, intel_dsi->ports) {
struct intel_dsi_host *host;
diff --git a/drivers/gpu/drm/i915/intel_dsi.h b/drivers/gpu/drm/i915/intel_dsi.h
index 2784ac4..36ca3cc 100644
--- a/drivers/gpu/drm/i915/intel_dsi.h
+++ b/drivers/gpu/drm/i915/intel_dsi.h
@@ -44,6 +44,10 @@ struct intel_dsi {
 
struct intel_connector *attached_connector;
 
+   struct drm_i915_gem_object *gem_obj;
+   void *cmd_buff;
+   dma_addr_t cmd_buff_phy_addr;
+
/* bit mask of ports being driven */
u16 ports;
 
-- 
1.7.9.5

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] {Intel-gfx] [RFC 01/14] drm/i915: allocate gem memory for mipi dbi cmd buffer

2015-06-18 Thread Singh, Gaurav K



On 6/19/2015 3:32 AM, Gaurav K Singh wrote:

[snip - patch quoted in full above]
Corrected the initial patch. Working on the dma_alloc_coherent patch,
will update soon.
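
A rough sketch of how a dma_alloc_coherent() based variant could look
(this is an assumption about the follow-up, not posted code; the 4096
size and the cmd_buff/cmd_buff_phy_addr fields are taken from the patch
above, and <linux/dma-mapping.h> is needed):

	/* pre_enable: one coherent 4KiB buffer instead of a pinned GEM
	 * object plus kmap(); the returned bus address lands directly in
	 * cmd_buff_phy_addr, which is already a dma_addr_t.
	 */
	intel_dsi->cmd_buff = dma_alloc_coherent(dev->dev, 4096,
						 &intel_dsi->cmd_buff_phy_addr,
						 GFP_KERNEL);
	if (!intel_dsi->cmd_buff) {
		DRM_ERROR("Failed to allocate DSI cmd buffer\n");
		return;
	}

	/* post_disable: no kunmap/unpin/unreference needed any more */
	if (intel_dsi->cmd_buff) {
		dma_free_coherent(dev->dev, 4096, intel_dsi->cmd_buff,
				  intel_dsi->cmd_buff_phy_addr);
		intel_dsi->cmd_buff = NULL;
	}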


With regards,
Gaurav
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PULL] drm-intel-next-fixes

2015-06-18 Thread Jani Nikula

Hi Dave, i915 fixes for drm-next/v4.2.

BR,
Jani.


The following changes since commit bf546f8158e2df2656494a475e6235634121c87c:

  drm/i915/skl: Fix DMC API version in firmware file name (2015-06-05 12:08:01 
+0300)

are available in the git repository at:

  git://anongit.freedesktop.org/drm-intel tags/drm-intel-next-fixes-2015-06-18

for you to fetch changes up to 4ed9fb371ccdfe465bd3bbb69e4cad5243e6c4e2:

  drm/i915: Don't set enabled value of all CRTCs when restoring the mode 
(2015-06-17 14:21:01 +0300)


Ander Conselvan de Oliveira (3):
  drm/i915: Don't check modeset state in the hw state force restore path
  drm/i915: Don't update staged config during force restore modesets
  drm/i915: Don't set enabled value of all CRTCs when restoring the mode

Francisco Jerez (3):
  drm/i915: Fix command parser to validate multiple register access with 
the same command.
  drm/i915: Extend the parser to check register writes against a mask/value 
pair.
  drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist.

Ville Syrjälä (1):
  drm/i915: Don't skip request retirement if the active list is empty

 drivers/gpu/drm/i915/i915_cmd_parser.c  | 197 +---
 drivers/gpu/drm/i915/i915_drv.h |   5 +
 drivers/gpu/drm/i915/i915_gem.c |   3 -
 drivers/gpu/drm/i915/intel_display.c|  54 -
 drivers/gpu/drm/i915/intel_ringbuffer.h |   5 +-
 5 files changed, 164 insertions(+), 100 deletions(-)

-- 
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 06/19] drm/i915: Split skl_update_scaler, v3.

2015-06-18 Thread Daniel Vetter
On Thu, Jun 18, 2015 at 07:42:10AM +0200, Maarten Lankhorst wrote:
 Op 18-06-15 om 03:48 schreef Matt Roper:
  On Mon, Jun 15, 2015 at 12:33:43PM +0200, Maarten Lankhorst wrote:
  It's easier to read separate functions for crtc and plane scaler state.
 
  Changes since v1:
   - Update documentation.
  Changes since v2:
   - Get rid of parameters to skl_update_scaler only used for traces.
 This avoids needing to document the other parameters.
 
  Signed-off-by: Maarten Lankhorst maarten.lankho...@linux.intel.com
  ---
   drivers/gpu/drm/i915/intel_display.c | 211 
  +++
   drivers/gpu/drm/i915/intel_dp.c  |   2 +-
   drivers/gpu/drm/i915/intel_drv.h |  12 +-
   drivers/gpu/drm/i915/intel_sprite.c  |   3 +-
   4 files changed, 121 insertions(+), 107 deletions(-)
 
  diff --git a/drivers/gpu/drm/i915/intel_display.c 
  b/drivers/gpu/drm/i915/intel_display.c
  index 0f7652a31c95..26d610acb61f 100644
  --- a/drivers/gpu/drm/i915/intel_display.c
  +++ b/drivers/gpu/drm/i915/intel_display.c
  @@ -4303,62 +4303,16 @@ static void cpt_verify_modeset(struct drm_device 
  *dev, int pipe)
 }
   }
   
  -/**
  - * skl_update_scaler_users - Stages update to crtc's scaler state
  - * @intel_crtc: crtc
  - * @crtc_state: crtc_state
  - * @plane: plane (NULL indicates crtc is requesting update)
  - * @plane_state: plane's state
  - * @force_detach: request unconditional detachment of scaler
  - *
  - * This function updates scaler state for requested plane or crtc.
  - * To request scaler usage update for a plane, caller shall pass plane 
  pointer.
  - * To request scaler usage update for crtc, caller shall pass plane 
  pointer
  - * as NULL.
  - *
  - * Return
  - * 0 - scaler_usage updated successfully
  - *error - requested scaling cannot be supported or other error 
  condition
  - */
  -int
  -skl_update_scaler_users(
  -  struct intel_crtc *intel_crtc, struct intel_crtc_state *crtc_state,
  -  struct intel_plane *intel_plane, struct intel_plane_state *plane_state,
  -  int force_detach)
  +static int
  +skl_update_scaler(struct intel_crtc_state *crtc_state, bool force_detach,
  +unsigned scaler_idx, int *scaler_id, unsigned int rotation,
 ^^
  This parameter isn't actually the scaler index is it (that's what
  scaler_id winds up being once assigned here)?  I think this one is the
  plane index that we're assigning a scaler for (or the special value of
  SKL_CRTC_INDEX if we're assigning for the CRTC instead of a plane).
 
  Maybe 'scaler_target' or 'scaler_user' would be better?
 
 Could we call it 'i'?

Not for a function argument really ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4] drm/i915 : Added Programming of the MOCS

2015-06-18 Thread Daniel Vetter
On Wed, Jun 17, 2015 at 04:19:22PM +0100, Peter Antoine wrote:
 This change adds the programming of the MOCS registers to the gen 9+
 platforms. This change set programs the MOCS register values to a set
 of values that are defined to be optimal.
 
 It creates a fixed register set that is programmed across the different
 engines so that all engines have the same table. This is done as the
 main RCS context only holds the registers for itself and the shared
 L3 values. By trying to keep the registers consistent across the
 different engines it should make the programming for the registers
 consistent.
 
 v2:
 -'static const' for private data structures and style changes.(Matt Turner)
 v3:
 - Make the tables slightly more readable. (Damien Lespiau)
 - Updated tables fix performance regression.
 v4:
 - Code formatting. (Chris Wilson)
 - re-privatised mocs code. (Daniel Vetter)
 
 Signed-off-by: Peter Antoine peter.anto...@intel.com
 ---
  drivers/gpu/drm/i915/Makefile |   1 +
  drivers/gpu/drm/i915/i915_reg.h   |   9 +
  drivers/gpu/drm/i915/intel_lrc.c  |  10 +-
  drivers/gpu/drm/i915/intel_lrc.h  |   4 +
  drivers/gpu/drm/i915/intel_mocs.c | 373 
 ++
  drivers/gpu/drm/i915/intel_mocs.h |  64 +++
  6 files changed, 460 insertions(+), 1 deletion(-)
  create mode 100644 drivers/gpu/drm/i915/intel_mocs.c
  create mode 100644 drivers/gpu/drm/i915/intel_mocs.h
 
 diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
 index b7ddf48..c781e19 100644
 --- a/drivers/gpu/drm/i915/Makefile
 +++ b/drivers/gpu/drm/i915/Makefile
 @@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
 i915_irq.o \
 i915_trace_points.o \
 intel_lrc.o \
 +   intel_mocs.o \
 intel_ringbuffer.o \
 intel_uncore.o
  
 diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
 index 7213224..3a435b5 100644
 --- a/drivers/gpu/drm/i915/i915_reg.h
 +++ b/drivers/gpu/drm/i915/i915_reg.h
 @@ -7829,4 +7829,13 @@ enum skl_disp_power_wells {
   #define _PALETTE_A (dev_priv->info.display_mmio_offset + 0xa000)
   #define _PALETTE_B (dev_priv->info.display_mmio_offset + 0xa800)
  
 +/* MOCS (Memory Object Control State) registers */
 +#define GEN9_LNCFCMOCS0		(0xB020)	/* L3 Cache Control base */
 +
 +#define GEN9_GFX_MOCS_0		(0xc800)	/* Graphics MOCS base register*/
 +#define GEN9_MFX0_MOCS_0	(0xc900)	/* Media 0 MOCS base register*/
 +#define GEN9_MFX1_MOCS_0	(0xcA00)	/* Media 1 MOCS base register*/
 +#define GEN9_VEBOX_MOCS_0	(0xcB00)	/* Video MOCS base register*/
 +#define GEN9_BLT_MOCS_0		(0xcc00)	/* Blitter MOCS base register*/
 +
  #endif /* _I915_REG_H_ */
 diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
 b/drivers/gpu/drm/i915/intel_lrc.c
 index 9f5485d..73b919d 100644
 --- a/drivers/gpu/drm/i915/intel_lrc.c
 +++ b/drivers/gpu/drm/i915/intel_lrc.c
 @@ -135,6 +135,7 @@
  #include drm/drmP.h
  #include drm/i915_drm.h
  #include i915_drv.h
 +#include intel_mocs.h
  
  #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
  #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
 @@ -796,7 +797,7 @@ static int logical_ring_prepare(struct intel_ringbuffer 
 *ringbuf,
   *
   * Return: non-zero if the ringbuffer is not ready to be written to.
   */
 -static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
 +int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
   struct intel_context *ctx, int num_dwords)
  {
   struct intel_engine_cs *ring = ringbuf->ring;
 @@ -1379,6 +1380,13 @@ static int gen8_init_rcs_context(struct 
 intel_engine_cs *ring,
   if (ret)
   return ret;
  
 + /*
 +  * Failing to program the MOCS is non-fatal. The system will not
 +  * run at peak performance. So generate a warning and carry on.
 +  */

Is this really true? Userspace must make sure that they don't
inappropriately overwrite the caching settings using MOCS for frontbuffers
and in doing so causing coherency issues with the display block. If we
fail to program MOCS correctly then things won't look pretty.

Sounds like even more reasons imo why we really need the userspace side of
this ...

Also the general approach for render side setup failures is to return
-EIO, which will result in a wedged gpu. No reason imo here to eat this
failure.
-Daniel
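
A minimal sketch of that alternative, assuming gen9_program_mocs() keeps
returning a non-zero error code as in the hunk below (illustration of
propagating the failure, not a replacement hunk):

	ret = gen9_program_mocs(ring, ctx);
	if (ret)
		return ret;	/* wedge the GPU like any other init failure */

	return intel_lr_context_render_state_init(ring, ctx);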

 + if (gen9_program_mocs(ring, ctx) != 0)
 +		DRM_ERROR("MOCS failed to program: expect performance issues.");
 +
   return intel_lr_context_render_state_init(ring, ctx);
  }
  
 diff --git a/drivers/gpu/drm/i915/intel_lrc.h 
 b/drivers/gpu/drm/i915/intel_lrc.h
 index 04d3a6d..dbbd6af 100644
 --- a/drivers/gpu/drm/i915/intel_lrc.h
 +++ b/drivers/gpu/drm/i915/intel_lrc.h
 @@ -44,6 +44,10 @@ int intel_logical_rings_init(struct drm_device *dev);
  
  int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf,

Re: [Intel-gfx] [PATCH v2 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset

2015-06-18 Thread Chris Wilson
On Thu, Jun 18, 2015 at 08:45:50AM +0200, Daniel Vetter wrote:
 On Wed, Jun 17, 2015 at 06:37:03PM +0100, Chris Wilson wrote:
  On Wed, Jun 17, 2015 at 05:03:19PM +0200, Daniel Vetter wrote:
   On Wed, Jun 17, 2015 at 01:53:17PM +0100, Chris Wilson wrote:
On Wed, Jun 17, 2015 at 02:49:47PM +0200, Daniel Vetter wrote:
 On Wed, Jun 10, 2015 at 07:09:03PM +0100, Chris Wilson wrote:
  On Wed, Jun 10, 2015 at 05:46:54PM +0100, Michel Thierry wrote:
   There are some allocations that must be only referenced by 32bit
   offsets. To limit the chances of having the first 4GB already 
   full,
   objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
   DRM_MM_CREATE_TOP flags
   
   User must pass I915_EXEC_SUPPORTS_48BADDRESS flag to indicate it 
   can
   be allocated above the 32b address range.
  
  This should be a per-object flag not per-execbuffer.
 
 We need both. This one to opt into the large address space, the 
 per-object
 one to apply the w/a. Also libdrm/mesa patches for this are still 
 missing.

Do we need the opt in on the context? The 48bit vm is lazily
constructed, if no object asks to use the high range, it will never be
populated. Or is there a cost with preparing a 48bit vm?
   
   If we restrict to 4G we'll evict objects if we run out, and will stay
   correct even when processing fairly large workloads. With just lazily
   eating into 48b that won't be the case. A bit far-fetched, but if we go
   to the trouble of implementing this might as well do it right.
  
  i915_evict_something runs between the range requested for pinning. If we
  run out of 4G space and the desired pin does not opt into 48bit, we will
  evict from the lower 4G.
  
  I obviously missed your concern. Care to elaborate?
 
 Current situation: You always get an address below 4G for all objects,
 even if you use more than 4G of textures - the evict code will make space.
 
 New situation with 48b address space enabled but existing userspace and a
 total BO set bigger than 4G: The kernel will eventually hand out ppgtt
 addresses  4G, which means if we get such an address potentially even for
 an object where this wa needs to apply. This would be a regression. But if
 we make 48b strictly opt-in the kernel will restrict _all_ objects to
 below 4G, creating no regression.

How? The pin code requires PIN_48BIT to be set to hand out higher
addresses. That is only set by execbuffer if execobject->flags is also set.
 
 Ofc new userspace on 48b would set both the execbuf opt-in (or context
 flag, we have those now) plus the per-obj I need this below 4G flag for
 the objects that need this wa.

I don't see why we need another flag beyond the per-object flag. If you
are thinking validation, we have to validate per-object flags anyway.
-Chris
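
For reference, the gating being described might look roughly like this in
the execbuffer reservation path (PIN_48BIT is the name used in this thread;
EXEC_OBJECT_SUPPORTS_48B_ADDRESS and the exact pin call are assumptions for
illustration, not the final uapi):

	u64 flags = PIN_USER;

	/* only objects that explicitly opt in may be placed above 4GiB */
	if (entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
		flags |= PIN_48BIT;

	/* without PIN_48BIT, binding (and any eviction it triggers) is
	 * restricted to the low 4GiB of the ppgtt
	 */
	ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags);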

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

