Re: [Intel-gfx] [PATCH] uxa/glamor: Enable the rest glamor rendering functions.

2011-12-13 Thread Zhigang Gong
> -----Original Message-----
> From: Chris Wilson [mailto:ch...@chris-wilson.co.uk]
> Sent: Wednesday, December 14, 2011 2:45 AM
> To: zhigang.g...@linux.intel.com
> Cc: intel-gfx@lists.freedesktop.org; zhigang.g...@gmail.com;
> zhigang.g...@linux.intel.com
> Subject: Re: [PATCH] uxa/glamor: Enable the rest glamor rendering
> functions.
> 
> On Tue, 13 Dec 2011 22:31:41 +0800, zhigang.g...@linux.intel.com
> wrote:
> > From: Zhigang Gong 
> >
> > This commit enable all the rest glamor rendering functions.
> > Tested with latest glamor master branch, can pass rendercheck.
> 
> Hmm, it exposes an issue with keeping a bo cache independent of mesa
> and trying to feed it our own handles:
> 
>  Region for name 6 already exists but is not compatible
> 
> The w/a for this would be:
> 
> diff --git a/src/intel_glamor.c b/src/intel_glamor.c
> index 0cf8ed7..2757fd6 100644
> --- a/src/intel_glamor.c
> +++ b/src/intel_glamor.c
> @@ -91,6 +91,7 @@ intel_glamor_create_textured_pixmap(PixmapPtr pixmap)
>         priv = intel_get_pixmap_private(pixmap);
>         if (glamor_egl_create_textured_pixmap(pixmap, priv->bo->handle,
>                                               priv->stride)) {
> +               drm_intel_bo_disable_reuse(priv->bo);
>                 priv->pinned = 1;
>                 return TRUE;
>         } else
> 
> but that gives up all pretense of maintaining a bo cache.

Yes, I think this would hurt performance. I actually noticed this problem
and spent some time tracking down the root cause. If everything works
correctly, this error should not be triggered. Although the same BO may be
reused to create a new pixmap, the previous pixmap that owned this BO should
already have been destroyed, and the image created from that pixmap should
have been destroyed as well.

Then, when we create a new pixmap/image with this BO, Mesa should not find
any existing image/region for it. But it does. I tracked further into Mesa's
internals and found that the previous image is not destroyed when we call
eglDestroyImageKHR, as it would be if its reference count had dropped to
zero. That looked weird to me. Further tracking shows the root cause: when I
use the texture (bound to the image) as a shader's source texture and call
glDrawArrays to perform the rendering, the texture's reference count is
increased by 1 before glDrawArrays returns, and I failed to find any API to
decrease it. As a result the texture can't be freed when it is deleted, so
the image's reference count also remains at 1 and the image can't be freed
either.
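
To make the chain of references easier to follow, here is a toy model in C.
It is only a sketch: the toy_* structs and helpers are hypothetical and do
not mirror Mesa's real data structures. It assumes the texture holds one
reference on the image, and that drawing leaves one extra reference on the
texture, as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical types, not Mesa's real internals. */
struct toy_image {
    int refcount;
    bool destroyed;
};

struct toy_texture {
    int refcount;
    struct toy_image *image;   /* the texture holds one reference on its image */
    bool freed;
};

static void toy_image_unref(struct toy_image *img)
{
    assert(img->refcount > 0);
    if (--img->refcount == 0)
        img->destroyed = true;
}

static void toy_texture_unref(struct toy_texture *tex)
{
    assert(tex->refcount > 0);
    if (--tex->refcount == 0) {
        tex->freed = true;
        toy_image_unref(tex->image);   /* drop the texture's hold on the image */
    }
}

/* Bind an image to a fresh texture; the texture takes an image reference. */
static void toy_texture_bind_image(struct toy_texture *tex,
                                   struct toy_image *img)
{
    tex->refcount = 1;
    tex->image = img;
    tex->freed = false;
    img->refcount++;
}

/* Models the reported behaviour: a draw leaves one extra texture reference
 * that nothing releases afterwards. */
static void toy_draw_with_texture(struct toy_texture *tex)
{
    tex->refcount++;
}

/* Create, optionally draw, tear down; report whether the image went away. */
static bool toy_teardown_destroys_image(bool drew)
{
    struct toy_image img = { .refcount = 1, .destroyed = false };
    struct toy_texture tex;

    toy_texture_bind_image(&tex, &img);
    if (drew)
        toy_draw_with_texture(&tex);

    toy_texture_unref(&tex);   /* stands in for deleting the texture */
    toy_image_unref(&img);     /* stands in for eglDestroyImageKHR */
    return img.destroyed;
}
```

In this model the image is destroyed only when no draw happened in between;
after a draw, the leftover texture reference keeps the image alive.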

The attached file is a simple test case that shows this bug. It is modified
from eglkms.c in mesa-demos.

I did report this issue to mesa-dev, but there is no solution or explanation
so far. Any ideas?

> -Chris
> 
> --
> Chris Wilson, Intel Open Source Technology Centre


eglkms_mod.c
Description: Binary data
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] uxa/glamor: Enable the rest glamor rendering functions.

2011-12-13 Thread Zhigang Gong
> -----Original Message-----
> From: Chris Wilson [mailto:ch...@chris-wilson.co.uk]
> Sent: Wednesday, December 14, 2011 2:20 AM
> To: zhigang.g...@linux.intel.com
> Cc: intel-gfx@lists.freedesktop.org; zhigang.g...@gmail.com;
> zhigang.g...@linux.intel.com
> Subject: Re: [PATCH] uxa/glamor: Enable the rest glamor rendering
> functions.
> 
> On Tue, 13 Dec 2011 22:31:41 +0800, zhigang.g...@linux.intel.com
> wrote:
> > From: Zhigang Gong 
> >
> > This commit enable all the rest glamor rendering functions.
> > Tested with latest glamor master branch, can pass rendercheck.
> >
> > One thing need to be pointed out is the picture's handling.
> > Pictures support many different color formats, but glamor's texture
> > only support a few color formats. And the most common scenario is that
> > we create a pixmap with a color depth and then attach it to a picture
> > which has a specific color format with the same color depth. But there
> > is no way to change a texture's internal format after the texture was
> > allocated.
> > If you do that, OpenGL will allocate a new texture, and then the
> > glamor side and the UXA side will be inconsistent. So for all
> > picture-related operations, we can't fall back to the UXA path directly,
> > even for a rather straightforward operation. So for get_image, Addtraps,
> > etc., we have to add wrapper functions that jump into glamor first.
> 
> Can we create multiple textures referencing the same bo but with
> different formats?
AFAIK, it's impossible to map every possible picture format to an OpenGL
internal format. We have to attach a new texture to glamor for an
incompatible format. The old texture is created from the DDX's BO and has an
incorrect internal format. IMO, we can't make any use of this wrong texture
within glamor, so I just don't create it and return false to the DDX layer
to tell the DDX to unlink the BO. All subsequent rendering operations on
this pixmap will be handled within glamor and will target the new texture
with the correct format.

> Or are we going to run afoul of the coherency model
> with GL?
My understanding is, if the picture's format is incompatible with OpenGL's
internal
format.

--Zhigang

> -Chris
> 
> --
> Chris Wilson, Intel Open Source Technology Centre



Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Keith Packard
On Tue, 13 Dec 2011 17:09:45 -0800, Eric Anholt  wrote:

> We introduced this complexity with no evidence that it would help, just
> because we thought, like you, that "avoiding cache flushes should be
> good, right?".  Experiments so far say we were wrong.

Right, you'd think we'd have learned to not optimize in advance of
data. Someday maybe we'll know better...

And, of course, future hardware may require different code for optimal
performance. Who would have guessed that?

-- 
keith.pack...@intel.com




Re: [Intel-gfx] More crashes with intel driver 1.10.4 and corrupted stack traces for GM45

2011-12-13 Thread Marc MERLIN
On Wed, Dec 14, 2011 at 12:04:03AM +0100, Daniel Vetter wrote:
> > > One of my X logs had some of these:
> > > [1535818.200] (WW) intel(0): intel_uxa_prepare_access: bo map failed: No 
> > > space left on device
> > > [1535818.208] (WW) intel(0): intel_uxa_prepare_access: bo map failed: No 
> > > space left on device
> > > [1535818.208] (WW) intel(0): intel_uxa_prepare_access: bo map failed: No 
> > > space left on device
> 
> This one sounds like a vma exhausting issue Chris Wilson recently fixed.
> Please retest with the latest versions of xf86-video-intel and libdrm from
> git.

On Tue, Dec 13, 2011 at 11:16:17PM +, Chris Wilson wrote:
> On Tue, 13 Dec 2011 14:13:18 -0800, Marc MERLIN  wrote:
> > On Sun, Dec 11, 2011 at 12:32:00PM -0800, Marc MERLIN wrote:
> > > Hi Eugeni and other folks on this list,
> > 
> > Am I sending this to the wrong place or is GM45 unsupported?
> 
> Apologies, it is a known issue. I was thrown by the bizarre stack trace,
> but it seems like it is just the mmap failure blowing up in spectacular
> fashion. The ENOSPC issue arises when we have either exhausted
> or badly fragmented the (on your system presumably) 32 bits of address
> space for GTT mappings such that we are no longer able to allocate new
> ones. I have recently begun to tackle a similar issue whereby we exhausted
> the per-process map limit with over 65,000 vmas kept open. The solution
> there of capping the number of cached vmas is likely to resolve this
> problem as well.
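
The idea of capping the number of cached mappings can be sketched roughly as
below. This is not the actual libdrm or xf86-video-intel change; the toy_*
names and the cap of 4 are assumptions picked for illustration. The sketch
keeps cached mappings in a FIFO and releases the oldest one once the cap is
reached, so the number of live mappings stays bounded.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CACHED_MAPS 4   /* hypothetical cap, for illustration only */

struct toy_map_cache {
    int handles[TOY_MAX_CACHED_MAPS];   /* FIFO ring of cached mapping handles */
    size_t head;                        /* index of the oldest entry */
    size_t count;                       /* entries currently cached */
    size_t unmapped;                    /* mappings released so far */
};

/* Cache a mapping. If the cache is full, release the oldest entry first.
 * Returns the evicted handle, or -1 if nothing was evicted. */
static int toy_cache_mapping(struct toy_map_cache *c, int handle)
{
    int evicted = -1;

    assert(c->count <= TOY_MAX_CACHED_MAPS);
    if (c->count == TOY_MAX_CACHED_MAPS) {
        evicted = c->handles[c->head];
        c->head = (c->head + 1) % TOY_MAX_CACHED_MAPS;
        c->count--;
        c->unmapped++;   /* a real implementation would unmap here */
    }
    c->handles[(c->head + c->count) % TOY_MAX_CACHED_MAPS] = handle;
    c->count++;
    return evicted;
}
```

With a zero-initialised cache, the first four calls evict nothing; each later
call releases the oldest cached handle before storing the new one.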

Since I'm a newbie at this, there isn't a 'head' per se, and I see that
not all patches are production ready.
Could you point me to what I could sync from that would be
reasonably likely to work? :)

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/


Re: [Intel-gfx] Fan running with Intel Graphics

2011-12-13 Thread Eugeni Dodonov
On Tue, Dec 13, 2011 at 19:23, Jesse Barnes wrote:

>
> If that fixes your fan issue, it
> means the GPU frequency is to blame.  I think Eugeni is working on a
> nice API to let users control perf vs power a bit better than our
> current default of "try to run as fast as possible under any load, even
> a tiny one".
>

Slowly, but yes :).

But so far, setting the max_frequency manually is the only way to control
this. This will make your card run at the slowest frequency at all times and
in theory should improve the power consumption.

But meanwhile, to better understand what is going on with your machine,
could you set up some thermal sensors and check what they say about the
temperature? And also check with powertop what is using the most
power?

-- 
Eugeni Dodonov



[Intel-gfx] [PATCH 4/4] drm/i915: add support to get/set cache type for bo

2011-12-13 Thread Eugeni Dodonov
From: Eugeni Dodonov 

Allow userspace to discover current cache level for a bo and set a
specific cache level if necessary.

The patch is mostly based on the original patch from Ben Widawsky.

Signed-off-by: Eugeni Dodonov 
---
 drivers/gpu/drm/i915/i915_dma.c  |2 +
 drivers/gpu/drm/i915/i915_gem.c  |   74 ++
 drivers/gpu/drm/i915/intel_drv.h |5 +++
 include/drm/i915_drm.h   |   14 +++
 4 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 5ffbd95..adb8bca 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2301,6 +2301,8 @@ struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_GEM_MADVISE, i915_gem_madvise_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF_DRV(I915_OVERLAY_PUT_IMAGE, intel_overlay_put_image, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
DRM_IOCTL_DEF_DRV(I915_OVERLAY_ATTRS, intel_overlay_attrs, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
+   DRM_IOCTL_DEF_DRV(I915_GET_CACHE_TYPE, intel_get_cache_type_ioctl, DRM_UNLOCKED),
+   DRM_IOCTL_DEF_DRV(I915_SET_CACHE_TYPE, intel_set_cache_type_ioctl, DRM_UNLOCKED),
 };
 
 int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dccec77..7dc1dc4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4208,3 +4208,77 @@ rescan:
mutex_unlock(&dev->struct_mutex);
return cnt / 100 * sysctl_vfs_cache_pressure;
 }
+
+/**
+ * Sets the cache mode of an object.
+ */
+int
+intel_set_cache_type_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file)
+{
+   struct drm_i915_gem_get_cache_type *args = data;
+   struct drm_i915_gem_object *obj;
+   int ret = 0;
+
+   ret = i915_mutex_lock_interruptible(dev);
+   if (ret)
+   return ret;
+
+   obj = to_intel_bo(drm_gem_object_lookup(dev, file, args->handle));
+   if (&obj->base == NULL) {
+   ret = -ENOENT;
+   goto unlock;
+   }
+
+   switch (args->cache_level) {
+   case I915_CACHE_LLC:
+   if (!HAS_LLC(dev)) {
+   ret = -EINVAL;
+   break;
+   }
+   ret = i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+   break;
+   case I915_CACHE_NONE:
+   ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
+   break;
+   default:
+   ret = -EINVAL;
+   break;
+   }
+
+   drm_gem_object_unreference(&obj->base);
+unlock:
+   mutex_unlock(&dev->struct_mutex);
+
+   return ret;
+}
+
+/**
+ * Returns the cache mode of an object.
+ */
+int
+intel_get_cache_type_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file)
+{
+   struct drm_i915_gem_get_cache_type *args = data;
+   struct drm_i915_gem_object *obj;
+   int ret = 0;
+
+   mutex_lock(&dev->struct_mutex);
+   obj = to_intel_bo(drm_gem_object_lookup(dev, file, args->handle));
+   if (&obj->base == NULL) {
+   ret = -ENOENT;
+   goto unlock;
+   }
+
+   /* Assume that kernel always knows what he is doing, so we shouldn't
+* have unknown cache level for exporting. If we do, we have bigger
+* things to worry about anyway. */
+   args->cache_level = obj->cache_level;
+
+   drm_gem_object_unreference(&obj->base);
+unlock:
+   mutex_unlock(&dev->struct_mutex);
+
+   return ret;
+}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index bd9a604..2e0841a 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -376,6 +376,11 @@ extern int intel_overlay_put_image(struct drm_device *dev, void *data,
 extern int intel_overlay_attrs(struct drm_device *dev, void *data,
   struct drm_file *file_priv);
 
+extern int intel_get_cache_type_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file_priv);
+extern int intel_set_cache_type_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file_priv);
+
 extern void intel_fb_output_poll_changed(struct drm_device *dev);
 extern void intel_fb_restore_mode(struct drm_device *dev);
 
diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index 7f778f5..de6bb61 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -198,6 +198,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_OVERLAY_PUT_IMAGE 0x27
 #define DRM_I915_OVERLAY_ATTRS 0x28
 #define DRM_I915_GEM_EXECBUFFER2   0x29
+#define DRM_I915_GET_CACHE_TYPE0x2a
+#define DRM_I915_SET_CACHE_TYPE0x2b
 
 #define DRM_IOCTL_I915_INITDRM_IOW( DRM_COMMAND_BASE 

[Intel-gfx] [PATCH 3/4] drm/i915: add drm PARAM to query available cache levels

2011-12-13 Thread Eugeni Dodonov
From: Eugeni Dodonov 

This allows querying the available cache levels from libdrm and checking
for the presence of LLC from userspace.

Signed-off-by: Eugeni Dodonov 
---
 drivers/gpu/drm/i915/i915_dma.c |6 ++
 include/drm/i915_drm.h  |1 +
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a9533c5..5ffbd95 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -781,6 +781,12 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_HAS_RELAXED_DELTA:
value = 1;
break;
+   case I915_PARAM_CACHE_LEVELS:
+   /* Everyone has CACHE_NONE but not everyone has LLC */
+   value = 1 << I915_CACHE_NONE;
+   if (HAS_LLC(dev))
+   value |= 1 << I915_CACHE_LLC;
+   break;
default:
DRM_DEBUG_DRIVER("Unknown parameter %d\n",
 param->param);
diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index e9f1cf4..7f778f5 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -291,6 +291,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_COHERENT_RINGS   13
 #define I915_PARAM_HAS_EXEC_CONSTANTS   14
 #define I915_PARAM_HAS_RELAXED_DELTA15
+#define I915_PARAM_CACHE_LEVELS 16
 
 typedef struct drm_i915_getparam {
int param;
-- 
1.7.7.4
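
For readers following the series, the value returned for
I915_PARAM_CACHE_LEVELS is a bitmask with one bit per cache level. The sketch
below shows how such a mask could be built and tested. The I915_CACHE_*
values are copied from PATCH 2/4; the toy_* helpers and the has_llc argument
(standing in for HAS_LLC(dev)) are assumptions for illustration and are not
part of the patch or of libdrm.

```c
#include <assert.h>
#include <stdbool.h>

/* Cache level values as defined in PATCH 2/4 of this series. */
#define I915_CACHE_NONE    0
#define I915_CACHE_LLC     1
#define I915_CACHE_LLC_MLC 2

/* Mirrors the mask built in the i915_getparam() hunk above.
 * has_llc is a stand-in for HAS_LLC(dev). */
static int toy_cache_levels_mask(bool has_llc)
{
    int value = 1 << I915_CACHE_NONE;   /* everyone has CACHE_NONE */

    if (has_llc)
        value |= 1 << I915_CACHE_LLC;
    return value;
}

/* Hypothetical userspace-side helper: is a given level set in the mask? */
static bool toy_has_cache_level(int mask, int level)
{
    assert(level >= 0 && level < 31);
    return (mask & (1 << level)) != 0;
}
```

A caller would query the parameter once and then test individual levels
against the mask, rather than inferring LLC support from the GPU generation.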



[Intel-gfx] [PATCH 2/4] drm/i915: simplify cache parameters for userspace

2011-12-13 Thread Eugeni Dodonov
From: Eugeni Dodonov 

Simplify the code and cache-related parameters handling, and prepare to
export them to userspace in subsequent patches.

Signed-off-by: Eugeni Dodonov 
---
 drivers/gpu/drm/i915/i915_drv.h |   10 ++
 drivers/gpu/drm/i915/i915_gem.c |2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c |4 ++--
 include/drm/i915_drm.h  |8 
 4 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index abbbf32..45b3f5b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -737,12 +737,6 @@ typedef struct drm_i915_private {
atomic_t forcewake_count;
 } drm_i915_private_t;
 
-enum i915_cache_level {
-   I915_CACHE_NONE,
-   I915_CACHE_LLC,
-   I915_CACHE_LLC_MLC, /* gen6+ */
-};
-
 struct drm_i915_gem_object {
struct drm_gem_object base;
 
@@ -1213,13 +1207,13 @@ i915_gem_get_unfenced_gtt_alignment(struct drm_device *dev,
int tiling_mode);
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
-   enum i915_cache_level cache_level);
+   uint32_t cache_level);
 
 /* i915_gem_gtt.c */
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_rebind_object(struct drm_i915_gem_object *obj,
-   enum i915_cache_level cache_level);
+   uint32_t cache_level);
 void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 
 /* i915_gem_evict.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fb69337..dccec77 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2949,7 +2949,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 }
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
-   enum i915_cache_level cache_level)
+   uint32_t cache_level)
 {
int ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6042c5e..d556dc8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -31,7 +31,7 @@
 
 /* XXX kill agp_type! */
 static unsigned int cache_level_to_agp_type(struct drm_device *dev,
-   enum i915_cache_level cache_level)
+   uint32_t cache_level)
 {
switch (cache_level) {
case I915_CACHE_LLC_MLC:
@@ -117,7 +117,7 @@ int i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj)
 }
 
 void i915_gem_gtt_rebind_object(struct drm_i915_gem_object *obj,
-   enum i915_cache_level cache_level)
+   uint32_t cache_level)
 {
struct drm_device *dev = obj->base.dev;
struct drm_i915_private *dev_priv = dev->dev_private;
diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index 28c0d11..e9f1cf4 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -844,4 +844,12 @@ struct drm_intel_overlay_attrs {
__u32 gamma5;
 };
 
+
+/**
+ * Cache level definitions
+ */
+#define I915_CACHE_NONE0
+#define I915_CACHE_LLC 1
+#define I915_CACHE_LLC_MLC 2
+
 #endif /* _I915_DRM_H_ */
-- 
1.7.7.4



[Intel-gfx] [PATCH 1/4] drm/i915: add a LLC feature flag in device description

2011-12-13 Thread Eugeni Dodonov
From: Eugeni Dodonov 

LLC is not SNB-specific, so we should check for it in a more generic way.

v2: export LLC support status via debugfs.

Signed-off-by: Eugeni Dodonov 
---
 drivers/gpu/drm/i915/i915_debugfs.c |1 +
 drivers/gpu/drm/i915/i915_drv.c |4 
 drivers/gpu/drm/i915/i915_drv.h |2 ++
 drivers/gpu/drm/i915/i915_gem.c |4 ++--
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d09a6e0..cb8a153 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -82,6 +82,7 @@ static int i915_capabilities(struct seq_file *m, void *data)
B(supports_tv);
B(has_bsd_ring);
B(has_blt_ring);
+   B(has_llc);
 #undef B
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 15bfa91..19fb7a4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -214,6 +214,7 @@ static const struct intel_device_info intel_sandybridge_d_info = {
.need_gfx_hws = 1, .has_hotplug = 1,
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct intel_device_info intel_sandybridge_m_info = {
@@ -222,6 +223,7 @@ static const struct intel_device_info intel_sandybridge_m_info = {
.has_fbc = 1,
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct intel_device_info intel_ivybridge_d_info = {
@@ -229,6 +231,7 @@ static const struct intel_device_info intel_ivybridge_d_info = {
.need_gfx_hws = 1, .has_hotplug = 1,
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct intel_device_info intel_ivybridge_m_info = {
@@ -237,6 +240,7 @@ static const struct intel_device_info intel_ivybridge_m_info = {
.has_fbc = 0,   /* FBC is not enabled on Ivybridge mobile yet */
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct pci_device_id pciidlist[] = {  /* aka */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4a9c1b9..abbbf32 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -250,6 +250,7 @@ struct intel_device_info {
u8 supports_tv:1;
u8 has_bsd_ring:1;
u8 has_blt_ring:1;
+   u8 has_llc:1;
 };
 
 enum no_fbc_reason {
@@ -961,6 +962,7 @@ struct drm_i915_file_private {
 
 #define HAS_BSD(dev)(INTEL_INFO(dev)->has_bsd_ring)
 #define HAS_BLT(dev)(INTEL_INFO(dev)->has_blt_ring)
+#define HAS_LLC(dev)(INTEL_INFO(dev)->has_llc)
 #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws)
 
 #define HAS_OVERLAY(dev)   (INTEL_INFO(dev)->has_overlay)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 60ff1b6..fb69337 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3620,8 +3620,8 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 
-   if (IS_GEN6(dev) || IS_GEN7(dev)) {
-   /* On Gen6, we can have the GPU use the LLC (the CPU
+   if (HAS_LLC(dev)) {
+   /* On some devices, we can have the GPU use the LLC (the CPU
 * cache) for about a 10% performance improvement
 * compared to uncached.  Graphics requests other than
 * display scanout are coherent with the CPU in
-- 
1.7.7.4
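
The pattern this patch follows, a per-device feature bit checked through a
macro instead of a generation test, can be sketched in isolation as below.
The toy_* names are hypothetical stand-ins for struct intel_device_info and
HAS_LLC(); the choice of default cache level is an assumption based on the
comment in the last hunk, since the rest of that function is not shown here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_NONE 0
#define TOY_CACHE_LLC  1

/* Cut-down, hypothetical stand-in for struct intel_device_info. */
struct toy_device_info {
    unsigned int gen;
    unsigned int has_llc:1;   /* feature bit, set per device table entry */
};

#define TOY_HAS_LLC(info) ((info)->has_llc)

/* Two example table entries: one without LLC, one with it. */
static const struct toy_device_info toy_gen5 = { .gen = 5 };
static const struct toy_device_info toy_gen6 = { .gen = 6, .has_llc = 1 };

/* Pick a default cache level from the feature bit, not from the generation. */
static int toy_default_cache_level(const struct toy_device_info *info)
{
    assert(info != NULL);
    return TOY_HAS_LLC(info) ? TOY_CACHE_LLC : TOY_CACHE_NONE;
}
```

The benefit is that a new device only needs the bit set in its table entry;
no generation check has to be extended elsewhere.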



[Intel-gfx] [PATCH] [RFC] LLC and per-BO cache handling

2011-12-13 Thread Eugeni Dodonov
Based on comments and feedback on LLC handling from Daniel Vetter, Chris
Wilson and Ben Widawsky, I reworked my patch into a more useful approach for
detecting the presence of different cache levels from userspace.

So, in this context, PATCH 1 adds the very same .has_llc flag and a debugfs
entry for it. PATCH 2 does the initial job of preparing to export cache
levels to userspace, which can be queried via the DRM interface by means of
PATCH 3. And finally, PATCH 4 allows getting and setting a specific cache
level for each bo.

The i915_gem_object_set_cache_level() routine does a good job of checking
whether the objects should be put into specific cache levels already. When
thinking about possible optimizations and tweaks for different cache levels,
this is what we thought:
 - The most usual case is an I915_MADV_WILLNEED object, where we can change
   its cache level as we want.
 - If we are trying to change the cache level of an I915_MADV_DONTNEED
   object, at first glance it is pointless to do so. However, it could make
   sense for some sort of prefetching or a quick change of mind (we certainly
   can mark an object as DONTNEED and then reallocate it and mark it WILLNEED
   again). So why not?
 - If we are trying to change the cache level of a PURGED object, nothing
   will happen, as it has no GTT space anyway.

One possible optimization would be to remove an object from LLC on
gem_madvise_ioctl(..I915_MADV_DONTNEED). However, I don't think it is worth
the extra cycles - at some point, it will be removed eventually anyway, even
if we leave it at LLC.

So those are all the possible cases I could think of. Of course, I could
have missed something, so just yell if so :).
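
The three cases listed above can be summarised in a small sketch. The enum
and function names are hypothetical and only encode the reasoning in this
mail; the kernel tracks madvise state differently and none of this is taken
from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state tags for illustration only. */
enum toy_madv_state {
    TOY_MADV_WILLNEED,
    TOY_MADV_DONTNEED,
    TOY_MADV_PURGED
};

/* Returns true if changing the cache level is expected to have an effect,
 * following the three cases described in this mail. */
static bool toy_cache_change_has_effect(enum toy_madv_state state)
{
    switch (state) {
    case TOY_MADV_WILLNEED:
        return true;    /* usual case: change the level as we want */
    case TOY_MADV_DONTNEED:
        return true;    /* allowed: the object may become WILLNEED again */
    case TOY_MADV_PURGED:
        return false;   /* nothing happens: the object has no GTT space */
    }
    assert(0 && "unknown madvise state");
    return false;
}
```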



Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem

2011-12-13 Thread Daniel Vetter
On Mon, Dec 12, 2011 at 10:16, Rocko Requin  wrote:
>> If you can wire up netconsole you should be able to gather the full
>> backtrace and that would be really useful. Otherwise can you please
>> confirm by reverting that commit from your current tree that it is
>> indeed the culprit? Otherwise please bisect the issue.
>
> I built 3.2-rc5 with the patch from commit
> eb1711bb94991e93669c5a1b5f84f11be2d51ea1 reversed, and have been using it
> now for a day and a half without any i915_gem issues. So at this stage it
> does seem likely it is the culprit, based on the fact that I had at least 2
> and probably 3 i915_gem crashes in around 12 hours with the commit applied.
> When I get some free time I'll reapply the patch and see if I can reproduce
> the crash and get a netconsole dump.

Backtraces from another reporter seriously look like we're hitting
some ugly use-after free. Can you please test whether the patch
"drm/i915: Only clear the GPU domains upon a successful finish" by
Chris Wilson fixes anything for you? You can grab it from
http://cgit.freedesktop.org/~danvet/drm/patch/?id=389a55581e30607af0fcde6cdb4e54f189cf46cf

Thanks, Daniel



-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Eric Anholt
On Tue, 13 Dec 2011 14:49:49 -0800, Ben Widawsky  wrote:
> On 12/13/2011 08:36 AM, Keith Packard wrote:
> > On Tue, 13 Dec 2011 17:01:33 +0100, Daniel Vetter  wrote:
> > 
> >> - or remove it all and invalidate/flush unconditionally.
> > 
> > Eric and I were chatting yesterday about trying this -- it seems like
> > we'd be able to dramatically simplify the kernel module by doing this,
> > and given how much flushing already occurs, I doubt we'd see any
> > significant performance difference, and we'd save a pile of CPU time,
> > which might actually improve performance.
> 
> 
> Would we want to keep domain tracking if the HW worked correctly and we
> didn't have to always flush. It seems like a shame to just gut the code
> if it actually could offer a benefit on future generations.

It's not a matter of hardware behavior.  It's a matter of always needing
to flush the caches anyway because you're emitting new commands at the
GPU and you're wanting results completed on the screen at the end.  So
we go to all this fragile, expensive CPU work to get no benefit.

We introduced this complexity with no evidence that it would help, just
because we thought, like you, that "avoiding cache flushes should be
good, right?".  Experiments so far say we were wrong.




[Intel-gfx] [PATCH] drm/i915/dp: Dither down to 6bpc if it makes the mode fit

2011-12-13 Thread Eric Anholt
From: Adam Jackson 

Some active adaptors (VGA usually) only have two lanes at 2.7GHz.
That's a maximum pixel clock of 144MHz at 8bpc, but 192MHz at 6bpc.

Fixes Asus UX31 panel being black at startup due to no valid modes since
dc22ee6fc18ce0f15424e753e8473c306ece95c1.

v2: Rebased to current code, resulting in the fix applying to EDP panels as
well.  Also changed from spatio-temporal to just spatial dithering on
pre-ironlake, to be consistent (and to reduce visual flicker)

Signed-off-by: Adam Jackson 
Signed-off-by: Eric Anholt 
Tested-by: Eric Anholt 
---
 drivers/gpu/drm/i915/intel_display.c |   22 --
 drivers/gpu/drm/i915/intel_dp.c  |   24 ++--
 drivers/gpu/drm/i915/intel_drv.h |1 +
 3 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0d7d648..7eaea5a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -4670,6 +4670,7 @@ static inline bool intel_panel_use_ssc(struct drm_i915_private *dev_priv)
 /**
  * intel_choose_pipe_bpp_dither - figure out what color depth the pipe should send
  * @crtc: CRTC structure
+ * @mode: requested mode
  *
  * A pipe may be connected to one or more outputs.  Based on the depth of the
  * attached framebuffer, choose a good color depth to use on the pipe.
@@ -4681,13 +4682,15 @@ static inline bool intel_panel_use_ssc(struct drm_i915_private *dev_priv)
  *HDMI supports only 8bpc or 12bpc, so clamp to 8bpc with dither for 10bpc
  *Displays may support a restricted set as well, check EDID and clamp as
  *  appropriate.
+ *DP may want to dither down to 6bpc to fit larger modes
  *
  * RETURNS:
  * Dithering requirement (i.e. false if display bpc and pipe bpc match,
  * true if they don't match).
  */
 static bool intel_choose_pipe_bpp_dither(struct drm_crtc *crtc,
-unsigned int *pipe_bpp)
+unsigned int *pipe_bpp,
+struct drm_display_mode *mode)
 {
struct drm_device *dev = crtc->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -4758,6 +4761,11 @@ static bool intel_choose_pipe_bpp_dither(struct drm_crtc *crtc,
}
}
 
+   if (mode->private_flags & INTEL_MODE_DP_FORCE_6BPC) {
+   DRM_DEBUG_KMS("Dithering DP to 6bpc\n");
+   display_bpc = 6;
+   }
+
/*
 * We could just drive the pipe at the highest bpc all the time and
 * enable dithering as needed, but that costs bandwidth.  So choose
@@ -5019,6 +5027,16 @@ static int i9xx_crtc_mode_set(struct drm_crtc *crtc,
pipeconf &= ~PIPECONF_DOUBLE_WIDE;
}
 
+   /* default to 8bpc */
+   pipeconf &= ~(PIPECONF_BPP_MASK | PIPECONF_DITHER_EN);
+   if (is_dp) {
+   if (mode->private_flags & INTEL_MODE_DP_FORCE_6BPC) {
+   pipeconf |= PIPECONF_BPP_6 |
+   PIPECONF_DITHER_EN |
+   PIPECONF_DITHER_TYPE_SP;
+   }
+   }
+
dpll |= DPLL_VCO_ENABLE;
 
DRM_DEBUG_KMS("Mode for pipe %c:\n", pipe == 0 ? 'A' : 'B');
@@ -5480,7 +5498,7 @@ static int ironlake_crtc_mode_set(struct drm_crtc *crtc,
/* determine panel color depth */
temp = I915_READ(PIPECONF(pipe));
temp &= ~PIPE_BPC_MASK;
-   dither = intel_choose_pipe_bpp_dither(crtc, &pipe_bpp);
+   dither = intel_choose_pipe_bpp_dither(crtc, &pipe_bpp, mode);
switch (pipe_bpp) {
case 18:
temp |= PIPE_6BPC;
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 1bff19a..db3b461 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -208,13 +208,15 @@ intel_dp_link_clock(uint8_t link_bw)
  */
 
 static int
-intel_dp_link_required(struct intel_dp *intel_dp, int pixel_clock)
+intel_dp_link_required(struct intel_dp *intel_dp, int pixel_clock, int check_bpp)
 {
struct drm_crtc *crtc = intel_dp->base.base.crtc;
struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
int bpp = 24;
 
-   if (intel_crtc)
+   if (check_bpp)
+   bpp = check_bpp;
+   else if (intel_crtc)
bpp = intel_crtc->bpp;
 
return (pixel_clock * bpp + 9) / 10;
@@ -233,6 +235,7 @@ intel_dp_mode_valid(struct drm_connector *connector,
struct intel_dp *intel_dp = intel_attached_dp(connector);
int max_link_clock = intel_dp_link_clock(intel_dp_max_link_bw(intel_dp));
int max_lanes = intel_dp_max_lane_count(intel_dp);
+   int max_rate, mode_rate;
 
if (is_edp(intel_dp) && intel_dp->panel_fixed_mode) {
if (mode->hdisplay > intel_dp->panel_fixed_mode->hdisplay)
@@ -242,9 +245,17 @@ intel_dp_mode_valid(struct drm
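
As a quick check of the figures in the commit message, the sketch below
reuses the rounding from the patched intel_dp_link_required() and takes the
stated 144MHz limit at 8bpc (24 bits per pixel) as given. It does not derive
that limit from the lane count or link rate. The toy_* names are
hypothetical; clocks are in kHz. Dropping to 6bpc (18 bits per pixel) needs
18/24 of the bandwidth, so the same budget fits a 192MHz pixel clock.

```c
#include <assert.h>

/* Same formula as the patched intel_dp_link_required():
 * pixel clock in kHz, bpp in bits per pixel. */
static int toy_dp_link_required(int pixel_clock_khz, int bpp)
{
    assert(pixel_clock_khz >= 0 && bpp > 0);
    return (pixel_clock_khz * bpp + 9) / 10;
}

/* Largest pixel clock (kHz) that fits a given budget at a given bpp,
 * inverting the formula above and ignoring its rounding term. */
static int toy_max_pixel_clock(int budget, int bpp)
{
    assert(budget >= 0 && bpp > 0);
    return budget * 10 / bpp;
}
```

Here toy_dp_link_required(144000, 24) and toy_dp_link_required(192000, 18)
give the same value, which matches the 144MHz and 192MHz figures quoted in
the commit message.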

Re: [Intel-gfx] Patches queued to drm-intel-fixes

2011-12-13 Thread Keith Packard
On Wed, 14 Dec 2011 00:04:26 +0100, Daniel Vetter  wrote:
> Hi Keith,
> 
> I've noticed that you merged my patch "drm/i915: properly prefault for
> pread/pwrite" into your -fixes branch (which I assume is headed for
> 3.2). Please remove that from your queue again for the following
> reasons:

Thanks! I'll rework the patch series before submitting it.

-- 
keith.pack...@intel.com


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Keith Packard
On Tue, 13 Dec 2011 14:49:49 -0800, Ben Widawsky  wrote:

> Would we want to keep domain tracking if the HW worked correctly and we
> didn't have to always flush? It seems like a shame to just gut the code
> if it actually could offer a benefit on future generations.

That sounds like premature optimization to me. If we want something
similar on future hardware, we can resurrect the old code and see what
pieces are useful. For now, we're fighting correctness and stability
issues, and given the limited (zero? negative?) performance benefits, we
just need to get to code which works reliably and provides good
performance.

The current code has gotten to the 'piles of kludges on kludges' stage,
which makes it very fragile -- see the regression caused by changing
flushing orders in the VT-d work-around.

-- 
keith.pack...@intel.com




Re: [Intel-gfx] More crashes with intel driver 1.10.4 and corrupted stack traces for GM45

2011-12-13 Thread Chris Wilson
On Tue, 13 Dec 2011 14:13:18 -0800, Marc MERLIN  wrote:
> On Sun, Dec 11, 2011 at 12:32:00PM -0800, Marc MERLIN wrote:
> > Hi Eugeni and other folks on this list,
> 
> Am I sending this to the wrong place or is GM45 unsupported?

Apologies, it is a known issue. I was thrown by the bizarre stack trace,
but it seems it is just the mmap failure blowing up in spectacular
fashion. The ENOSPC issue arises when we have either exhausted or badly
fragmented the (on your system presumably) 32 bits of address space for
GTT mappings, such that we are no longer able to allocate new ones. I
have recently begun to tackle a similar issue whereby we exhausted the
per-process map limit by keeping over 65,000 vmas open. The solution
there, capping the number of cached vmas, is likely to resolve this
problem as well.
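
The idea of capping the cache can be sketched in a few lines. This is only a toy model under stated assumptions: the struct, the function name and the tiny limit are hypothetical, mappings are represented by plain integer ids, and the place where real code would unmap is marked with a comment.

```c
#include <assert.h>
#include <stddef.h>

#define VMA_CACHE_MAX 4 /* illustrative cap; a real limit would be far larger */

struct vma_cache {
	int slots[VMA_CACHE_MAX]; /* ids of idle mappings kept around for reuse */
	size_t head;              /* index of the oldest cached mapping */
	size_t count;             /* number of mappings currently cached */
	int unmapped;             /* how many mappings have been released */
};

/*
 * Park an idle mapping in the cache. When the cache is already full the
 * oldest entry is released first, so the number of live mappings stays
 * bounded. Returns the evicted id, or -1 if nothing had to be evicted.
 */
int vma_cache_put(struct vma_cache *c, int id)
{
	int evicted = -1;

	assert(c != NULL);
	if (c->count == VMA_CACHE_MAX) {
		evicted = c->slots[c->head]; /* real code would unmap here */
		c->head = (c->head + 1) % VMA_CACHE_MAX;
		c->count--;
		c->unmapped++;
	}
	c->slots[(c->head + c->count) % VMA_CACHE_MAX] = id;
	c->count++;
	return evicted;
}
```

Evicting the oldest entry is just one possible policy; the point is only that the number of cached mappings can never exceed the cap.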
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] Patches queued to drm-intel-fixes

2011-12-13 Thread Daniel Vetter
Hi Keith,

I've noticed that you merged my patch "drm/i915: properly prefault for
pread/pwrite" into your -fixes branch (which I assume is headed for
3.2). Please remove that from your queue again for the following
reasons:

- The right thing to do is to fix up the prefault handlers in pagemap.h
- It's really ugly code (which Chris Wilson later on complained
about), so ugly actually that it confused you while reviewing it.
- It changes the semantics of pread/pwrite in funny ways (something
you actually complained about in review a while ago, too).
- This bug has been lying around for almost half a year already, so I
don't see the need for a rush now.
- It only papers over the underlying issue, the real minimal and
proper fix is queued up (and reviewed) in my-next in my own git repo
for 3.3.
- I actually managed to blow things up while playing with the prefault
stuff, so it's imo not really risk-free.
- But most importantly, this late in the -rc cycle: it doesn't fix a regression.

I've also noticed that you have my patch "drm/i915: check ACTHD of all
rings" queued up in -fixes. I wouldn't have minded this getting merged
a few weeks ago into an early -rc but again I think it's too late for
this one for the following reasons:
- It touches on the hangcheck code, one of the most important pieces
to be able to debug issues and hence support users of our driver, but
also one of the least tested ones (we essentially only test it when
hitting actual hangs).
- A similar patch by Ben Widawsky actually blew things up for Chris Wilson.
- Again it doesn't fix a regression.

Dave, please reject a pull request for 3.2 containing these patches -
I've already embarrassed myself with the vt-d oneliner (which should
imo have been merged about 4 weeks ago, but mea culpa).

Yours, Daniel
-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Intel-gfx] More crashes with intel driver 1.10.4 and corrupted stack traces for GM45

2011-12-13 Thread Daniel Vetter
On Tue, Dec 13, 2011 at 02:13:18PM -0800, Marc MERLIN wrote:
> On Sun, Dec 11, 2011 at 12:32:00PM -0800, Marc MERLIN wrote:
> Am I sending this to the wrong place or is GM45 unsupported?

Sorry for missing this one, we're busy people. In the future, just prod us
again like you've done.

> > I reported some Xorg crashes on my lenovo W500.
> > In the past, I thought it might have been virtualbox drivers, so I make
> > sure I don't load them anymore.
> > 
> > In the meantime, I upgraded to ubuntu oneiric.
> > 
> > Kernel is 3.1.0:
> > agpgart-intel :00:00.0: Intel GM45 Chipset
> > agpgart-intel :00:00.0: detected gtt size: 2097152K total, 262144K mappable
> > agpgart-intel :00:00.0: detected 32768K stolen memory
> > agpgart-intel :00:00.0: AGP aperture is 256M @ 0xd000
> > 
> > I have ulimit -c unlimited, so I do get cores.
> > 
> > One of my X logs had some of these:
> > [1535818.200] (WW) intel(0): intel_uxa_prepare_access: bo map failed: No space left on device
> > [1535818.208] (WW) intel(0): intel_uxa_prepare_access: bo map failed: No space left on device
> > [1535818.208] (WW) intel(0): intel_uxa_prepare_access: bo map failed: No space left on device

This one sounds like the vma exhaustion issue Chris Wilson recently fixed.
Please retest with the latest versions of xf86-video-intel and libdrm from
git.

Thanks, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


Re: [Intel-gfx] [PATCH] drm/i915: add color key support v4

2011-12-13 Thread Daniel Vetter
On Tue, Dec 13, 2011 at 02:09:11PM -0800, Jesse Barnes wrote:
> Add new ioctls for getting and setting the current destination color
> key.  This allows for simple overlay display control by matching a color
> key value in the primary plane before blending the overlay on top.
> 
> v2: remove unnecessary mutex acquire/release around reg accesses
> v3: add support for full color key management
> v4: fix copy & paste bug in snb_get_colorkey
>     don't bother checking min/max values against docs as the docs are likely
>     wrong (how could we handle 10bpc surface formats?)

With the changes in v4 I can actually slap an r-b onto this and fix up
Jesse's slightly over-eager use of that tag ;-)

Reviewed-by: Daniel Vetter 
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Ben Widawsky
On 12/13/2011 08:36 AM, Keith Packard wrote:
> On Tue, 13 Dec 2011 17:01:33 +0100, Daniel Vetter  wrote:
> 
>> - or remove it all and invalidate/flush unconditionally.
> 
> Eric and I were chatting yesterday about trying this -- it seems like
> we'd be able to dramatically simplify the kernel module by doing this,
> and given how much flushing already occurs, I doubt we'd see any
> significant performance difference, and we'd save a pile of CPU time,
> which might actually improve performance.


Would we want to keep domain tracking if the HW worked correctly and we
didn't have to always flush? It seems like a shame to just gut the code
if it actually could offer a benefit on future generations.

I know Daniel has the same idea about gutting it...

Ben


Re: [Intel-gfx] i915_init takes a full second of kernel init time

2011-12-13 Thread Chris Wilson
On Tue, 13 Dec 2011 14:01:29 -0800, Jesse Barnes wrote:
> We had some async code to take all of this out of the boot time
> critical path at least...  I thought Chris merged them long ago but I
> guess they were dropped.  Chris?

It never made it upstream because it had a tendency to hang machines
during boot, as the async code was broken at the time wrt handling
multiple async domains and it interacted badly with PIO hard disk
controllers.

After a little bit of digging I found:
http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=async&id=470d6985b508466308fc4c6aec945cdbf6de39b8
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] [PATCH] drm/i915: add color key support v4

2011-12-13 Thread Jesse Barnes
Add new ioctls for getting and setting the current destination color
key.  This allows for simple overlay display control by matching a color
key value in the primary plane before blending the overlay on top.

v2: remove unnecessary mutex acquire/release around reg accesses
v3: add support for full color key management
v4: fix copy & paste bug in snb_get_colorkey
    don't bother checking min/max values against docs as the docs are likely
    wrong (how could we handle 10bpc surface formats?)

Reviewed-by: Daniel Vetter 
Signed-off-by: Jesse Barnes 
---
 drivers/gpu/drm/i915/i915_dma.c |2 +
 drivers/gpu/drm/i915/i915_reg.h |3 +
 drivers/gpu/drm/i915/intel_drv.h|   11 ++
 drivers/gpu/drm/i915/intel_sprite.c |  177 +++
 include/drm/i915_drm.h  |   36 +++
 5 files changed, 229 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a9533c5..12615eb 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2295,6 +2295,8 @@ struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_GEM_MADVISE, i915_gem_madvise_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF_DRV(I915_OVERLAY_PUT_IMAGE, intel_overlay_put_image, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
DRM_IOCTL_DEF_DRV(I915_OVERLAY_ATTRS, intel_overlay_attrs, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
+   DRM_IOCTL_DEF_DRV(I915_SET_SPRITE_COLORKEY, intel_sprite_set_colorkey, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
+   DRM_IOCTL_DEF_DRV(I915_GET_SPRITE_COLORKEY, intel_sprite_get_colorkey, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
 };
 
 int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f872ba2..25ec240 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2726,9 +2726,12 @@
 #define DVSSTRIDE(pipe) _PIPE(pipe, _DVSASTRIDE, _DVSBSTRIDE)
 #define DVSPOS(pipe) _PIPE(pipe, _DVSAPOS, _DVSBPOS)
 #define DVSSURF(pipe) _PIPE(pipe, _DVSASURF, _DVSBSURF)
+#define DVSKEYMAX(pipe) _PIPE(pipe, _DVSAKEYMAXVAL, _DVSBKEYMAXVAL)
 #define DVSSIZE(pipe) _PIPE(pipe, _DVSASIZE, _DVSBSIZE)
 #define DVSSCALE(pipe) _PIPE(pipe, _DVSASCALE, _DVSBSCALE)
 #define DVSTILEOFF(pipe) _PIPE(pipe, _DVSATILEOFF, _DVSBTILEOFF)
+#define DVSKEYVAL(pipe) _PIPE(pipe, _DVSAKEYVAL, _DVSBKEYVAL)
+#define DVSKEYMSK(pipe) _PIPE(pipe, _DVSAKEYMSK, _DVSBKEYMSK)
 
 #define _SPRA_CTL  0x70280
 #define   SPRITE_ENABLE(1<<31)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 97bbbc5..fd1a2cc 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -26,6 +26,7 @@
 #define __INTEL_DRV_H__
 
 #include 
+#include "i915_drm.h"
 #include "i915_drv.h"
 #include "drm_crtc.h"
 #include "drm_crtc_helper.h"
@@ -191,6 +192,10 @@ struct intel_plane {
 uint32_t x, uint32_t y,
 uint32_t src_w, uint32_t src_h);
void (*disable_plane)(struct drm_plane *plane);
+   int (*update_colorkey)(struct drm_plane *plane,
+  struct drm_intel_sprite_colorkey *key);
+   void (*get_colorkey)(struct drm_plane *plane,
+struct drm_intel_sprite_colorkey *key);
 };
 
 #define to_intel_crtc(x) container_of(x, struct intel_crtc, base)
@@ -413,4 +418,10 @@ extern void sandybridge_update_wm(struct drm_device *dev);
 extern void intel_update_sprite_watermarks(struct drm_device *dev, int pipe,
   uint32_t sprite_width,
   int pixel_size);
+
+extern int intel_sprite_set_colorkey(struct drm_device *dev, void *data,
+struct drm_file *file_priv);
+extern int intel_sprite_get_colorkey(struct drm_device *dev, void *data,
+struct drm_file *file_priv);
+
 #endif /* __INTEL_DRV_H__ */
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 735c8ab..2273173 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -95,6 +95,7 @@ ivb_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
/* must disable */
sprctl |= SPRITE_TRICKLE_FEED_DISABLE;
sprctl |= SPRITE_ENABLE;
+   sprctl |= SPRITE_DEST_KEY;
 
/* Sizes are 0 based */
src_w--;
@@ -153,6 +154,60 @@ ivb_disable_plane(struct drm_plane *plane)
POSTING_READ(SPRSURF(pipe));
 }
 
+static int
+ivb_update_colorkey(struct drm_plane *plane,
+   struct drm_intel_sprite_colorkey *key)
+{
+   struct drm_device *dev = plane->dev;
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct intel_plane *intel_plane;
+   u32 sprctl;
+   int ret = 0;
+
+   in

Re: [Intel-gfx] i915_init takes a full second of kernel init time

2011-12-13 Thread Jesse Barnes
On Tue, 13 Dec 2011 13:34:38 -0800
Scott James Remnant  wrote:

> On Tue, Dec 13, 2011 at 12:02 PM, Jesse Barnes wrote:
> > On Tue, 13 Dec 2011 11:55:06 -0800
> > Scott James Remnant  wrote:
> >
> >> I've been investigating Chrome OS boot time and noticed the anomaly
> >> where i915_init takes up a considerable amount of kernel startup time,
> >> one second in fact. I've attached a full dmesg with drm.debug=0xff for
> >> analysis at Daniel's suggestion.
> >
> > I'm not surprised... we haven't optimized init time in a while and lots
> > of delays have crept in.
> >
> > What kind of panel does this laptop have?  Can you enable drm debugging
> > (drm.debug=1 on the boot line) and see where the big delays are in
> > modesetting?
> >
> Is this different to drm.debug=0xff ?

Should be the same... and now I look again and see you already attached
it, doh!

Looks like there are a couple of big jumps:

[0.899057] [drm:intel_crtc_init], swapping pipes & planes for FBC
[0.952613] [drm:drm_sysfs_connector_add], adding "LVDS-1" to sysfs

Between these two calls, we do the output setup stuff in i915.  There
are likely some delays there due to trying to fetch the EDID.

[0.952846] [drm:intel_panel_set_backlight], set backlight PWM = 0
[1.206060] [drm:i915_get_vblank_counter], trying to get vblank count for disabled pipe A

Not sure where this is coming from offhand...

[1.206153] [drm:pineview_update_wm], Self-refresh is disabled
[1.214189] [drm:init_status_page], render ring hws offset: 0x

This looks like a mode set or CRTC disable happened before we
initialized the status page.

[1.239347] [drm:drm_mode_debug_printmodeline], Modeline 8:"1280x800" 60 70700 1280 1296 1344 1440 800 801 804 818 0x48 0xa
[1.273051] [drm:i9xx_update_plane], Writing base 0003  0 0 5120

Looks like a mode set occurred, that'll also take a while.

[1.273758] [drm:intel_lvds_enable], applying panel-fitter: 8, 0
[1.673051] [drm:intel_panel_set_backlight], set backlight PWM = 13046

So it's an LVDS machine... this is probably part of the LVDS mode set.
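
To put numbers on the jumps quoted above, the gaps between consecutive log lines can be computed from the timestamp prefixes. The helper below is a minimal sketch; the function names are made up for illustration, and it assumes every stamp has the "[seconds.microseconds]" form with exactly six fractional digits, as in the lines shown.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse the "[seconds.microseconds]" prefix of a dmesg line.
 * Returns the timestamp in microseconds, or -1 if there is no such prefix.
 */
long long dmesg_stamp_us(const char *line)
{
	long long sec, usec;

	assert(line != NULL);
	if (sscanf(line, " [%lld.%6lld]", &sec, &usec) != 2)
		return -1;
	return sec * 1000000LL + usec;
}

/* Gap in microseconds between two lines, or -1 if either fails to parse.
 * Assumes "after" was logged no earlier than "before". */
long long dmesg_gap_us(const char *before, const char *after)
{
	long long a = dmesg_stamp_us(before);
	long long b = dmesg_stamp_us(after);

	if (a < 0 || b < 0)
		return -1;
	return b - a;
}
```

Applied to the five pairs above, the gaps come out at about 54 ms, 253 ms, 8 ms, 34 ms and 399 ms, so the backlight/vblank pair and the LVDS enable account for most of the time.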

We had some async code to take all of this out of the boot time
critical path at least...  I thought Chris merged them long ago but I
guess they were dropped.  Chris?

In general, we can put a lot of the stuff we do into delayed work
handlers; e.g. when we shut things off we're often supposed to wait for
a full frame (i.e. the next vblank) before doing anything else.  This
could be done with a semaphore and delayed work though to keep things
snappy.

-- 
Jesse Barnes, Intel Open Source Technology Center




Re: [Intel-gfx] Fan running with Intel Graphics

2011-12-13 Thread Jesse Barnes
On Tue, 13 Dec 2011 22:30:42 +0100
Johannes Bauer  wrote:
> > then echo that value into the max freq file, e.g.:
> > 
> > $ echo 400 > /sys/kernel/debug/dri/0/i915_max_freq
> 
> This doesn't work for
> 
> joelaptop [/sys/kernel/debug/dri]: uname -a
> Linux joelaptop 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:28:43 UTC
> 2011 x86_64 x86_64 x86_64 GNU/Linux
> 
> Since i915_max_freq does not exist -- do I need to switch to a more
> recent kernel version?

Yes.  Try running drm-intel-next from Keith's git tree:
git://people.freedesktop.org/~keithp/linux

> I also noted that dri/ has two subdirectories, 0 and 64. Does this mean
> anything?

You can ignore the "64" directory, it's currently unused.

-- 
Jesse Barnes, Intel Open Source Technology Center




Re: [Intel-gfx] i915_init takes a full second of kernel init time

2011-12-13 Thread Scott James Remnant
On Tue, Dec 13, 2011 at 12:02 PM, Jesse Barnes  wrote:
> On Tue, 13 Dec 2011 11:55:06 -0800
> Scott James Remnant  wrote:
>
>> I've been investigating Chrome OS boot time and noticed the anomaly
>> where i915_init takes up a considerable amount of kernel startup time,
>> one second in fact. I've attached a full dmesg with drm.debug=0xff for
>> analysis at Daniel's suggestion.
>
> I'm not surprised... we haven't optimized init time in a while and lots
> of delays have crept in.
>
> What kind of panel does this laptop have?  Can you enable drm debugging
> (drm.debug=1 on the boot line) and see where the big delays are in
> modesetting?
>
Is this different to drm.debug=0xff ?

Scott
-- 
Scott James Remnant | Chrome OS Systems | key...@google.com | Google


Re: [Intel-gfx] Fan running with Intel Graphics

2011-12-13 Thread Johannes Bauer
Hi Jesse,

wow - that was fast! :-)

On 13.12.2011 22:23, Jesse Barnes wrote:
> On Tue, 13 Dec 2011 22:12:12 +0100
> Johannes Bauer  wrote:
>> And it's not even doing *anything*, the CPUs are all at almost 0%
>> (therefore I don't think there's much heat coming from there). I'm not
>> doing heavy graphics (not even light graphics, not even moving the mouse!).
>>
>> How can I measure the graphics card load? Would it improve anything if I
>> compiled a kernel myself and switched to 3.1? Is there anything at all I
>> can do?
> 
> In the console, as root, can you:
> 
> $ cat /sys/kernel/debug/dri/0/i915_cur_delayinfo | grep Lowest

Okay:

joelaptop [/sys/kernel/debug/dri]: cat /sys/kernel/debug/dri/0/i915_cur_delayinfo | grep Lowest
Lowest (RPN) frequency: 650MHz

> then echo that value into the max freq file, e.g.:
> 
> $ echo 400 > /sys/kernel/debug/dri/0/i915_max_freq

This doesn't work for

joelaptop [/sys/kernel/debug/dri]: uname -a
Linux joelaptop 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:28:43 UTC
2011 x86_64 x86_64 x86_64 GNU/Linux

Since i915_max_freq does not exist -- do I need to switch to a more
recent kernel version?

I also noted that dri/ has two subdirectories, 0 and 64. Does this mean
anything?

Best regards and thank you for your help,
Joe


Re: [Intel-gfx] Fan running with Intel Graphics

2011-12-13 Thread Jesse Barnes
On Tue, 13 Dec 2011 22:12:12 +0100
Johannes Bauer  wrote:
> And it's not even doing *anything*, the CPUs are all at almost 0%
> (therefore I don't think there's much heat coming from there). I'm not
> doing heavy graphics (not even light graphics, not even moving the mouse!).
> 
> How can I measure the graphics card load? Would it improve anything if I
> compiled a kernel myself and switched to 3.1? Is there anything at all I
> can do?

In the console, as root, can you:

$ cat /sys/kernel/debug/dri/0/i915_cur_delayinfo | grep Lowest

then echo that value into the max freq file, e.g.:

$ echo 400 > /sys/kernel/debug/dri/0/i915_max_freq

then try logging in and doing stuff.  If that fixes your fan issue, it
means the GPU frequency is to blame.  I think Eugeni is working on a
nice API to let users control perf vs power a bit better than our
current default of "try to run as fast as possible under any load, even
a tiny one".
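
The two manual steps above could also be scripted. The following is a rough sketch only: the helper names are hypothetical, the "Lowest (RPN) frequency: <n>MHz" line format is assumed to be what i915_cur_delayinfo prints, and writing to the debugfs file needs root and a kernel that provides i915_max_freq.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Extract the MHz value from a line such as
 * "Lowest (RPN) frequency: 650MHz". Returns -1 if the line does not match.
 */
int parse_rpn_mhz(const char *line)
{
	int mhz;

	assert(line != NULL);
	if (sscanf(line, "Lowest (RPN) frequency: %dMHz", &mhz) != 1)
		return -1;
	return mhz;
}

/* Write a frequency in MHz to the given file, e.g. the i915_max_freq
 * debugfs entry. Returns 0 on success, -1 if the file cannot be written. */
int write_freq(const char *path, int mhz)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (f == NULL)
		return -1;
	ok = fprintf(f, "%d\n", mhz) > 0;
	return (fclose(f) == 0 && ok) ? 0 : -1;
}
```

A caller would read i915_cur_delayinfo line by line, pass each line to parse_rpn_mhz() until it returns a non-negative value, and hand that value to write_freq() with the i915_max_freq path.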

-- 
Jesse Barnes, Intel Open Source Technology Center




[Intel-gfx] [PATCH 1/3] drm/i915: add SNB and IVB video sprite support v6

2011-12-13 Thread Jesse Barnes
The video sprites support various video surface formats natively and can
handle scaling as well.  So add support for them using the new DRM core
sprite support functions.

v2: use drm specific fourcc header and defines
v3: address Daniel's comments:
  - don't take struct mutex around register access (only needed for
regs in the GT power well)
  - don't hold struct mutex across vblank waits
  - fix up update_plane API (pass obj instead of GTT offset)
  - add interlaced defines for sprite regs
  - drop unnecessary 'reg' variables
  - comment double buffered reg flushing
  Also fix w/h confusion when writing the scaling reg.
v4: more fixes, address more comments from Daniel, and include Hai's fix
  - prevent divide by zero in scaling calculation (Hai Lan)
  - update to Ville's new DRM_FORMAT_* types
  - fix sprite watermark handling (calc based on CRTC size, separate
from normal display wm)
  - remove private refcounts now that the fb cleanup handles things
v5: add linear surface support
v6: remove color key clearing & setting from update_plane

For this version, I tested DPMS since it came up in the last review;
DPMS off/on works ok when a video player is working under X, but for
power saving we'll probably want to do something smarter.  I'll leave
that for a separate patch on top.  Likewise with the refcounting/fb
layer handling, which are really separate cleanups.

Reviewed-by: Daniel Vetter 
Signed-off-by: Jesse Barnes 
---
 drivers/gpu/drm/i915/Makefile|1 +
 drivers/gpu/drm/i915/i915_drv.h  |3 +
 drivers/gpu/drm/i915/i915_reg.h  |  133 ++
 drivers/gpu/drm/i915/intel_display.c |  174 +-
 drivers/gpu/drm/i915/intel_drv.h |   28 ++
 drivers/gpu/drm/i915/intel_fb.c  |6 +
 drivers/gpu/drm/i915/intel_sprite.c  |  450 ++
 7 files changed, 788 insertions(+), 7 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_sprite.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0ae6a7c..808b255 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -28,6 +28,7 @@ i915-y := i915_drv.o i915_dma.o i915_irq.o i915_mem.o \
  intel_dvo.o \
  intel_ringbuffer.o \
  intel_overlay.o \
+ intel_sprite.o \
  intel_opregion.o \
  dvo_ch7xxx.o \
  dvo_ch7017.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 06a37f4..0920b6b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -203,6 +203,8 @@ struct drm_i915_display_funcs {
int (*get_display_clock_speed)(struct drm_device *dev);
int (*get_fifo_size)(struct drm_device *dev, int plane);
void (*update_wm)(struct drm_device *dev);
+   void (*update_sprite_wm)(struct drm_device *dev, int pipe,
+uint32_t sprite_width, int pixel_size);
int (*crtc_mode_set)(struct drm_crtc *crtc,
 struct drm_display_mode *mode,
 struct drm_display_mode *adjusted_mode,
@@ -344,6 +346,7 @@ typedef struct drm_i915_private {
 
/* overlay */
struct intel_overlay *overlay;
+   bool sprite_scaling_enabled;
 
/* LVDS info */
int backlight_level;  /* restore backlight to this value */
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5a09416..f872ba2 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2450,6 +2450,8 @@
 #define WM3_LP_ILK 0x45110
 #define  WM3_LP_EN (1<<31)
 #define WM1S_LP_ILK0x45120
+#define WM2S_LP_IVB0x45124
+#define WM3S_LP_IVB0x45128
 #define  WM1S_LP_EN(1<<31)
 
 /* Memory latency timer register */
@@ -2666,6 +2668,137 @@
 #define _DSPBSURF  0x7119C
 #define _DSPBTILEOFF   0x711A4
 
+/* Sprite A control */
+#define _DVSACNTR  0x72180
+#define   DVS_ENABLE   (1<<31)
+#define   DVS_GAMMA_ENABLE (1<<30)
+#define   DVS_PIXFORMAT_MASK   (3<<25)
+#define   DVS_FORMAT_YUV422(0<<25)
+#define   DVS_FORMAT_RGBX101010(1<<25)
+#define   DVS_FORMAT_RGBX888   (2<<25)
+#define   DVS_FORMAT_RGBX161616(3<<25)
+#define   DVS_SOURCE_KEY   (1<<22)
+#define   DVS_RGB_ORDER_RGBX   (1<<20)
+#define   DVS_YUV_BYTE_ORDER_MASK (3<<16)
+#define   DVS_YUV_ORDER_YUYV   (0<<16)
+#define   DVS_YUV_ORDER_UYVY   (1<<16)
+#define   DVS_YUV_ORDER_YVYU   (2<<16)
+#define   DVS_YUV_ORDER_VYUY   (3<<16)
+#define   DVS_DEST_KEY (1<<2)
+#define   DVS_TRICKLE_FEED_DISABLE (1<<14)
+#define   DVS_TILED(1<<10)
+#define _DVSALINOFF0x72184
+#define _DVSASTRIDE0x72188
+#define _DVSAPOS   0x7218c
+#define _DVSASIZE  0x72190
+#define _DVSAKEYVAL0x72194
+#define _DVSAKEYMSK0x72198
+#define _DVSASURF 

[Intel-gfx] [PATCH 3/3] drm/i915: add color key support v3

2011-12-13 Thread Jesse Barnes
Add new ioctls for getting and setting the current destination color
key.  This allows for simple overlay display control by matching a color
key value in the primary plane before blending the overlay on top.

v2: remove unnecessary mutex acquire/release around reg accesses
v3: add support for full color key management

Reviewed-by: Daniel Vetter 
Signed-off-by: Jesse Barnes 
---
 drivers/gpu/drm/i915/i915_dma.c |2 +
 drivers/gpu/drm/i915/i915_reg.h |3 +
 drivers/gpu/drm/i915/intel_drv.h|   11 ++
 drivers/gpu/drm/i915/intel_sprite.c |  188 +++
 include/drm/i915_drm.h  |   36 +++
 5 files changed, 240 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a9533c5..12615eb 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2295,6 +2295,8 @@ struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_GEM_MADVISE, i915_gem_madvise_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF_DRV(I915_OVERLAY_PUT_IMAGE, intel_overlay_put_image, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
DRM_IOCTL_DEF_DRV(I915_OVERLAY_ATTRS, intel_overlay_attrs, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
+   DRM_IOCTL_DEF_DRV(I915_SET_SPRITE_COLORKEY, intel_sprite_set_colorkey, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
+   DRM_IOCTL_DEF_DRV(I915_GET_SPRITE_COLORKEY, intel_sprite_get_colorkey, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
 };
 
 int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f872ba2..25ec240 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2726,9 +2726,12 @@
 #define DVSSTRIDE(pipe) _PIPE(pipe, _DVSASTRIDE, _DVSBSTRIDE)
 #define DVSPOS(pipe) _PIPE(pipe, _DVSAPOS, _DVSBPOS)
 #define DVSSURF(pipe) _PIPE(pipe, _DVSASURF, _DVSBSURF)
+#define DVSKEYMAX(pipe) _PIPE(pipe, _DVSAKEYMAXVAL, _DVSBKEYMAXVAL)
 #define DVSSIZE(pipe) _PIPE(pipe, _DVSASIZE, _DVSBSIZE)
 #define DVSSCALE(pipe) _PIPE(pipe, _DVSASCALE, _DVSBSCALE)
 #define DVSTILEOFF(pipe) _PIPE(pipe, _DVSATILEOFF, _DVSBTILEOFF)
+#define DVSKEYVAL(pipe) _PIPE(pipe, _DVSAKEYVAL, _DVSBKEYVAL)
+#define DVSKEYMSK(pipe) _PIPE(pipe, _DVSAKEYMSK, _DVSBKEYMSK)
 
 #define _SPRA_CTL  0x70280
 #define   SPRITE_ENABLE(1<<31)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 97bbbc5..fd1a2cc 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -26,6 +26,7 @@
 #define __INTEL_DRV_H__
 
 #include 
+#include "i915_drm.h"
 #include "i915_drv.h"
 #include "drm_crtc.h"
 #include "drm_crtc_helper.h"
@@ -191,6 +192,10 @@ struct intel_plane {
 uint32_t x, uint32_t y,
 uint32_t src_w, uint32_t src_h);
void (*disable_plane)(struct drm_plane *plane);
+   int (*update_colorkey)(struct drm_plane *plane,
+  struct drm_intel_sprite_colorkey *key);
+   void (*get_colorkey)(struct drm_plane *plane,
+struct drm_intel_sprite_colorkey *key);
 };
 
 #define to_intel_crtc(x) container_of(x, struct intel_crtc, base)
@@ -413,4 +418,10 @@ extern void sandybridge_update_wm(struct drm_device *dev);
 extern void intel_update_sprite_watermarks(struct drm_device *dev, int pipe,
   uint32_t sprite_width,
   int pixel_size);
+
+extern int intel_sprite_set_colorkey(struct drm_device *dev, void *data,
+struct drm_file *file_priv);
+extern int intel_sprite_get_colorkey(struct drm_device *dev, void *data,
+struct drm_file *file_priv);
+
 #endif /* __INTEL_DRV_H__ */
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 735c8ab..a2d523b 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -95,6 +95,7 @@ ivb_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
/* must disable */
sprctl |= SPRITE_TRICKLE_FEED_DISABLE;
sprctl |= SPRITE_ENABLE;
+   sprctl |= SPRITE_DEST_KEY;
 
/* Sizes are 0 based */
src_w--;
@@ -153,6 +154,63 @@ ivb_disable_plane(struct drm_plane *plane)
POSTING_READ(SPRSURF(pipe));
 }
 
+static int
+ivb_update_colorkey(struct drm_plane *plane,
+   struct drm_intel_sprite_colorkey *key)
+{
+   struct drm_device *dev = plane->dev;
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct intel_plane *intel_plane;
+   u32 sprctl;
+   int ret = 0;
+
+   if (key->min_value > 0xff)
+   return -EINVAL;
+
+   intel_plane = to_intel_plane(plane);
+
+   I915_WRITE(SPRKEYVAL(intel_plane->pipe), key->min_value);
+

[Intel-gfx] [PATCH 2/3] drm/i915: track sprite coverage and disable primary plane if possible

2011-12-13 Thread Jesse Barnes
To save power when the sprite is full screen, we can disable the primary
plane on the same pipe.  Track the sprite status and enable/disable the
primary opportunistically.

v2: remove primary plane enable/disable hooks; they're identical

Reviewed-by: Daniel Vetter 
Signed-off-by: Jesse Barnes 
---
 drivers/gpu/drm/i915/intel_drv.h|1 +
 drivers/gpu/drm/i915/intel_sprite.c |   41 +++
 2 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 089cdde..97bbbc5 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -180,6 +180,7 @@ struct intel_plane {
struct drm_plane base;
enum pipe pipe;
struct drm_i915_gem_object *obj;
+   bool primary_disabled;
int max_downscale;
u32 lut_r[1024], lut_g[1024], lut_b[1024];
void (*update_plane)(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 87cafbe..735c8ab 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -256,6 +256,28 @@ snb_disable_plane(struct drm_plane *plane)
POSTING_READ(DVSSURF(pipe));
 }
 
+static void
+intel_enable_primary(struct drm_crtc *crtc)
+{
+   struct drm_device *dev = crtc->dev;
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+   int reg = DSPCNTR(intel_crtc->plane);
+
+   I915_WRITE(reg, I915_READ(reg) | DISPLAY_PLANE_ENABLE);
+}
+
+static void
+intel_disable_primary(struct drm_crtc *crtc)
+{
+   struct drm_device *dev = crtc->dev;
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+   int reg = DSPCNTR(intel_crtc->plane);
+
+   I915_WRITE(reg, I915_READ(reg) & ~DISPLAY_PLANE_ENABLE);
+}
+
 static int
 intel_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
   struct drm_framebuffer *fb, int crtc_x, int crtc_y,
@@ -342,9 +364,23 @@ intel_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
 
intel_plane->obj = obj;
 
+   /*
+* Be sure to re-enable the primary before the sprite is no longer
+* covering it fully.
+*/
+   if (!disable_primary && intel_plane->primary_disabled) {
+   intel_enable_primary(crtc);
+   intel_plane->primary_disabled = false;
+   }
+
intel_plane->update_plane(plane, fb, obj, crtc_x, crtc_y,
  crtc_w, crtc_h, x, y, src_w, src_h);
 
+   if (disable_primary) {
+   intel_disable_primary(crtc);
+   intel_plane->primary_disabled = true;
+   }
+
/* Unpin old obj after new one is active to avoid ugliness */
if (old_obj) {
/*
@@ -374,6 +410,11 @@ intel_disable_plane(struct drm_plane *plane)
struct intel_plane *intel_plane = to_intel_plane(plane);
int ret = 0;
 
+   if (intel_plane->primary_disabled) {
+   intel_enable_primary(plane->crtc);
+   intel_plane->primary_disabled = false;
+   }
+
intel_plane->disable_plane(plane);
 
if (!intel_plane->obj)
-- 
1.7.4.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
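
Editor's sketch for the patch above: the enable/disable ordering in
intel_update_plane() can be shown outside the driver roughly as follows.
This is not the driver code. struct sketch_pipe and the sketch_*
functions are made-up names, and the full-coverage test is an assumption,
since the hunk that computes disable_primary is not part of the quoted
diff context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-pipe state; not an i915 structure. */
struct sketch_pipe {
	int width, height;               /* active CRTC size in pixels */
	bool primary_enabled;            /* models DISPLAY_PLANE_ENABLE */
	bool primary_disabled_by_sprite; /* models intel_plane->primary_disabled */
};

/* Assumed coverage rule: the sprite rectangle spans the whole CRTC. */
static bool sketch_sprite_covers_pipe(const struct sketch_pipe *p,
				      int x, int y, int w, int h)
{
	assert(p->width > 0 && p->height > 0);
	return x <= 0 && y <= 0 &&
	       x + w >= p->width && y + h >= p->height;
}

static void sketch_update_sprite(struct sketch_pipe *p,
				 int x, int y, int w, int h)
{
	bool disable_primary = sketch_sprite_covers_pipe(p, x, y, w, h);

	/* Turn the primary back on before the sprite stops covering it. */
	if (!disable_primary && p->primary_disabled_by_sprite) {
		p->primary_enabled = true;
		p->primary_disabled_by_sprite = false;
	}

	/* ... the sprite registers would be programmed here ... */

	/* Only turn the primary off once the full-screen sprite is set up. */
	if (disable_primary) {
		p->primary_enabled = false;
		p->primary_disabled_by_sprite = true;
	}
}

static void sketch_disable_sprite(struct sketch_pipe *p)
{
	/* Restore the primary before the sprite goes away. */
	if (p->primary_disabled_by_sprite) {
		p->primary_enabled = true;
		p->primary_disabled_by_sprite = false;
	}
	/* ... the sprite would be disabled here ... */
}
```

The ordering is the point: the primary comes back before the sprite
shrinks or is disabled, and goes away only after the full-screen sprite
has been programmed, so some plane is always scanning out.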


[Intel-gfx] Fan running with Intel Graphics

2011-12-13 Thread Johannes Bauer
Hi list,

I hope that this is the right place to come. I have a Dell Latitude
E5520 Laptop (Sandy Bridge, Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz)
and I'm using the Intel graphics driver that ships with Ubuntu (3.0.0
kernel):

joelaptop [~]: lspci -nn | grep VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd
Generation Core Processor Family Integrated Graphics Controller
[8086:0126] (rev 09)

joelaptop [~]: dpkg -l | grep xorg | grep int
ii  xserver-xorg-video-intel   2:2.15.901-1ubuntu2.1
   X.Org X server -- Intel i8xx, i9xx display driver

I have one problem with this setup, however: The fan is running all the
time. It annoys the shit out of me. The reason why I suspect that
something might be off with the graphics card driver is that in
framebuffer mode, the fan is off. Only when I type in the password to my
cryptofs (which leaves the framebuffer and starts Xorg) the fan starts
running almost instantaneously.

And it's not even doing *anything*, the CPUs are all at almost 0%
(therefore I don't think there's much heat coming from there). I'm not
doing heavy graphics (not even light graphics, not even moving the mouse!).

How can I measure the graphics card load? Would it improve anything if I
compiled a kernel myself and switched to 3.1? Is there anything at all I
can do?

Best regards,
Joe


Re: [Intel-gfx] [PATCH 1/3] drm/i915: add SNB and IVB video sprite support v5

2011-12-13 Thread Jesse Barnes
On Mon, 12 Dec 2011 22:56:34 +0100
Daniel Vetter  wrote:

> On Wed, Dec 07, 2011 at 12:29:21PM -0800, Jesse Barnes wrote:
> > The video sprites support various video surface formats natively and can
> > handle scaling as well.  So add support for them using the new DRM core
> > sprite support functions.
> > 
> > v2: use drm specific fourcc header and defines
> > v3: address Daniel's comments:
> >   - don't take struct mutex around register access (only needed for
> > regs in the GT power well)
> >   - don't hold struct mutex across vblank waits
> >   - fix up update_plane API (pass obj instead of GTT offset)
> >   - add interlaced defines for sprite regs
> >   - drop unnecessary 'reg' variables
> >   - comment double buffered reg flushing
> >   Also fix w/h confusion when writing the scaling reg.
> > v4: more fixes, address more comments from Daniel, and include Hai's fix
> >   - prevent divide by zero in scaling calculation (Hai Lan)
> >   - update to Ville's new DRM_FORMAT_* types
> >   - fix sprite watermark handling (calc based on CRTC size, separate
> > from normal display wm)
> >   - remove private refcounts now that the fb cleanups handles things
> > v5: add linear surface support
> > 
> > For this version, I tested DPMS since it came up in the last review;
> > DPMS off/on works ok when a video player is working under X, but for
> > power saving we'll probably want to do something smarter.  I'll leave
> > that for a separate patch on top.  Likewise with the refcounting/fb
> > layer handling, which are really separate cleanups.
> > 
> > Signed-off-by: Jesse Barnes 
> 
> I didn't bother to recheck the regs and the wm stuff looks like the
> usual magic ;-) Otherwise you've implemented way more of my review
> comments than I could possibly still remember, so it must be good.
> 
> Reviewed-by: Daniel Vetter 

I take it back, I need to re-post this one minus a few lines now that
I'm updating the color key support.  New series on its way.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center




[Intel-gfx] [PATCH] drm/i915: add a LLC feature flag in device description

2011-12-13 Thread Eugeni Dodonov
From: Eugeni Dodonov 

LLC is not SNB-specific, so we should check for it in a more generic way.

v2: export LLC support status via debugfs and DRM GETPARAM.

v3: rebase on newer kernel version which says that IVB supports LLC as
well.

Reviewed-by: Chris Wilson 
Reviewed-by: Daniel Vetter 
Signed-off-by: Eugeni Dodonov 
---
 drivers/gpu/drm/i915/i915_debugfs.c |1 +
 drivers/gpu/drm/i915/i915_dma.c |3 +++
 drivers/gpu/drm/i915/i915_drv.c |4 
 drivers/gpu/drm/i915/i915_drv.h |2 ++
 drivers/gpu/drm/i915/i915_gem.c |4 ++--
 include/drm/i915_drm.h  |1 +
 6 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d09a6e0..cb8a153 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -82,6 +82,7 @@ static int i915_capabilities(struct seq_file *m, void *data)
B(supports_tv);
B(has_bsd_ring);
B(has_blt_ring);
+   B(has_llc);
 #undef B
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a9533c5..938ad57 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -781,6 +781,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_HAS_RELAXED_DELTA:
value = 1;
break;
+   case I915_PARAM_HAS_LLC:
+   value = HAS_LLC(dev);
+   break;
default:
DRM_DEBUG_DRIVER("Unknown parameter %d\n",
 param->param);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 15bfa91..19fb7a4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -214,6 +214,7 @@ static const struct intel_device_info intel_sandybridge_d_info = {
.need_gfx_hws = 1, .has_hotplug = 1,
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct intel_device_info intel_sandybridge_m_info = {
@@ -222,6 +223,7 @@ static const struct intel_device_info intel_sandybridge_m_info = {
.has_fbc = 1,
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct intel_device_info intel_ivybridge_d_info = {
@@ -229,6 +231,7 @@ static const struct intel_device_info intel_ivybridge_d_info = {
.need_gfx_hws = 1, .has_hotplug = 1,
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct intel_device_info intel_ivybridge_m_info = {
@@ -237,6 +240,7 @@ static const struct intel_device_info intel_ivybridge_m_info = {
.has_fbc = 0,   /* FBC is not enabled on Ivybridge mobile yet */
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct pci_device_id pciidlist[] = {  /* aka */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4a9c1b9..abbbf32 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -250,6 +250,7 @@ struct intel_device_info {
u8 supports_tv:1;
u8 has_bsd_ring:1;
u8 has_blt_ring:1;
+   u8 has_llc:1;
 };
 
 enum no_fbc_reason {
@@ -961,6 +962,7 @@ struct drm_i915_file_private {
 
 #define HAS_BSD(dev)(INTEL_INFO(dev)->has_bsd_ring)
 #define HAS_BLT(dev)(INTEL_INFO(dev)->has_blt_ring)
+#define HAS_LLC(dev)(INTEL_INFO(dev)->has_llc)
 #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws)
 
 #define HAS_OVERLAY(dev)   (INTEL_INFO(dev)->has_overlay)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 60ff1b6..fb69337 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3620,8 +3620,8 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 
-   if (IS_GEN6(dev) || IS_GEN7(dev)) {
-   /* On Gen6, we can have the GPU use the LLC (the CPU
+   if (HAS_LLC(dev)) {
+   /* On some devices, we can have the GPU use the LLC (the CPU
 * cache) for about a 10% performance improvement
 * compared to uncached.  Graphics requests other than
 * display scanout are coherent with the CPU in
diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index 28c0d11..b34e630 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -291,6 +291,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_COHERENT_RINGS   13
 #define I915_PARAM_HAS_EXEC_CONSTANTS   14
 #define I915_PARAM_HAS_RELAXED_DELTA15
+#define I915_PARAM_HAS_LLC  16
 
 typedef struct drm_i915_getparam {
int param;
-- 
1.7.7.4
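
Editor's sketch for the patch above: the feature-flag pattern it relies
on (a one-bit field in a per-device info table, read through a macro
instead of a generation check) looks roughly like this in isolation. All
names here (sketch_device_info, SKETCH_HAS_LLC, the two sample tables)
are illustrative only and are not the i915 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down device description with one-bit feature flags. */
struct sketch_device_info {
	unsigned int gen;
	unsigned int has_blt_ring:1;
	unsigned int has_llc:1;
};

/* Two made-up table entries: one part with an LLC, one without. */
static const struct sketch_device_info sketch_with_llc = {
	.gen = 6, .has_blt_ring = 1, .has_llc = 1,
};
static const struct sketch_device_info sketch_without_llc = {
	.gen = 5,
};

/* Callers ask about the feature, not about the generation. */
#define SKETCH_HAS_LLC(info) ((info)->has_llc)

enum sketch_cache_level { SKETCH_CACHE_NONE, SKETCH_CACHE_LLC };

/* Pick the default caching for a new object from the feature flag. */
static enum sketch_cache_level
sketch_default_cache_level(const struct sketch_device_info *info)
{
	assert(info != NULL);
	return SKETCH_HAS_LLC(info) ? SKETCH_CACHE_LLC : SKETCH_CACHE_NONE;
}
```

With the flag in the table, supporting a new part is a one-line change
to its table entry rather than another IS_GENx() term at every call site.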


Re: [Intel-gfx] i915_init takes a full second of kernel init time

2011-12-13 Thread Jesse Barnes
On Tue, 13 Dec 2011 11:55:06 -0800
Scott James Remnant  wrote:

> I've been investigating Chrome OS boot time and noticed the anomaly
> where i915_init takes up a considerable amount of kernel startup time,
> one second in fact. I've attached a full dmesg with drm.debug=0xff for
> analysis at Daniel's suggestion.

I'm not surprised... we haven't optimized init time in a while and lots
of delays have crept in.

What kind of panel does this laptop have?  Can you enable drm debugging
(drm.debug=1 on the boot line) and see where the big delays are in
modesetting?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center




Re: [Intel-gfx] [PATCH 1/3] drm/i915: add SNB and IVB video sprite support v5

2011-12-13 Thread Jesse Barnes
On Mon, 12 Dec 2011 22:56:34 +0100
Daniel Vetter  wrote:

> On Wed, Dec 07, 2011 at 12:29:21PM -0800, Jesse Barnes wrote:
> > The video sprites support various video surface formats natively and can
> > handle scaling as well.  So add support for them using the new DRM core
> > sprite support functions.
> > 
> > v2: use drm specific fourcc header and defines
> > v3: address Daniel's comments:
> >   - don't take struct mutex around register access (only needed for
> > regs in the GT power well)
> >   - don't hold struct mutex across vblank waits
> >   - fix up update_plane API (pass obj instead of GTT offset)
> >   - add interlaced defines for sprite regs
> >   - drop unnecessary 'reg' variables
> >   - comment double buffered reg flushing
> >   Also fix w/h confusion when writing the scaling reg.
> > v4: more fixes, address more comments from Daniel, and include Hai's fix
> >   - prevent divide by zero in scaling calculation (Hai Lan)
> >   - update to Ville's new DRM_FORMAT_* types
> >   - fix sprite watermark handling (calc based on CRTC size, separate
> > from normal display wm)
> >   - remove private refcounts now that the fb cleanups handles things
> > v5: add linear surface support
> > 
> > For this version, I tested DPMS since it came up in the last review;
> > DPMS off/on works ok when a video player is working under X, but for
> > power saving we'll probably want to do something smarter.  I'll leave
> > that for a separate patch on top.  Likewise with the refcounting/fb
> > layer handling, which are really separate cleanups.
> > 
> > Signed-off-by: Jesse Barnes 
> 
> I didn't bother to recheck the regs and the wm stuff looks like the
> usual magic ;-) Otherwise you've implemented way more of my review
> comments than I could possibly still remember, so it must be good.
> 
> Reviewed-by: Daniel Vetter 

The one thing I kept out was the "disable planes in generic code"
call.  The TI guys are working on using planes to represent all planes,
not just additional overlays, so calling disable from generic code
seemed unfriendly.

Keith, this one is ready for -next.  I'll clean up the ioctl now and
re-post; without it the overlays will always sit on top without
blending, so the patch is still safe.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center




Re: [Intel-gfx] [PATCH 3/4] drm/i915: Update GEN6_RP_CONTROL definitions

2011-12-13 Thread Jesse Barnes
On Mon, 12 Dec 2011 19:21:59 -0800
Ben Widawsky  wrote:

> This matches the modern specs more accurately.
> 
> This will be used by the following patch to fix the way we display RC
> status.
> 
> Signed-off-by: Ben Widawsky 
> ---

Looks good.

Reviewed-by: Jesse Barnes 

-- 
Jesse Barnes, Intel Open Source Technology Center




Re: [Intel-gfx] [PATCH 4/4] drm/i915: drpc debugfs update for gen6

2011-12-13 Thread Jesse Barnes
On Mon, 12 Dec 2011 19:22:00 -0800
Ben Widawsky  wrote:

> Many of the old fields from Ironlake have gone away. Strip all those
> fields, and try to update to fields people care about. RC information
> isn't exactly ideal anymore. All we can guarantee when we read the
> register is that we're not using forcewake, ie. the software isn't
> forcing the hardware to stay awake. The downside is that in doing this
> we may wait a while and that causes an unnaturally idle state on the
> GPU.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42578
> Signed-off-by: Ben Widawsky 
> ---

Aside from the merge error:

Reviewed-by: Jesse Barnes 

-- 
Jesse Barnes, Intel Open Source Technology Center




Re: [Intel-gfx] [PATCH] uxa/glamor: Enable the rest glamor rendering functions.

2011-12-13 Thread Chris Wilson
On Tue, 13 Dec 2011 22:31:41 +0800, zhigang.g...@linux.intel.com wrote:
> From: Zhigang Gong 
> 
> This commit enables all the remaining glamor rendering functions.
> Tested with the latest glamor master branch; it passes rendercheck.

Hmm, it exposes an issue with keeping a bo cache independent of mesa and
trying to feed it our own handles:

 Region for name 6 already exists but is not compatible

The w/a for this would be:

diff --git a/src/intel_glamor.c b/src/intel_glamor.c
index 0cf8ed7..2757fd6 100644
--- a/src/intel_glamor.c
+++ b/src/intel_glamor.c
@@ -91,6 +91,7 @@ intel_glamor_create_textured_pixmap(PixmapPtr pixmap)
priv = intel_get_pixmap_private(pixmap);
if (glamor_egl_create_textured_pixmap(pixmap, priv->bo->handle,
  priv->stride)) {
+   drm_intel_bo_disable_reuse(priv->bo);
priv->pinned = 1;
return TRUE;
} else

but that gives up all pretense of maintaining a bo cache.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Ben Widawsky

On 12/13/2011 09:22 AM, Eric Anholt wrote:
> On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky  wrote:
> > Since we don't differentiate on the different GPU read domains, it
> > should be safe to allow back to back reads to occur without issuing a
> > wait (or flush in the non-semaphore case).
> >
> > This has the unfortunate side effect that we need to keep track of all
> > the outstanding buffer reads so that we can synchronize on a write, to
> > another ring (since we don't know which read finishes first). In other
> > words, the code is quite simple for two rings, but gets more tricky for
> > > 2 rings.
> >
> > Here is a picture of the solution to the above problem
> >
> > Ring 0           Ring 1           Ring 2
> > batch 0          batch 1          batch 2
> > read buffer A    read buffer A    wait batch 0
> >                                   wait batch 1
> >                                   write buffer A
> >
> > This code is really untested. I'm hoping for some feedback if this is
> > worth cleaning up, and testing more thoroughly.
>
> You say it's an optimization -- do you have performance numbers?

33% improvement on a hacked version of gem_ring_sync_loop with.

It's not really a valid test as it's not coherent, but this is
approximately the best case improvement.

Oddly semaphores don't make much difference in this test, which was
surprising.
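
Editor's sketch of the bookkeeping the picture implies: reads from
different rings don't wait on each other, but a write has to wait on
every other ring that still has a read (or the last write) outstanding.
This is a toy model under loud assumptions: struct sketch_obj and the
sketch_* functions are invented for illustration, three rings are
assumed, and request retirement is ignored entirely.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_RINGS 3

/* Hypothetical per-object tracking; not the i915_gem_object layout. */
struct sketch_obj {
	uint32_t last_read_seqno[SKETCH_NUM_RINGS]; /* 0 = no read pending */
	uint32_t last_write_seqno;
	int last_write_ring;                        /* -1 = never written */
};

/*
 * Record a read on 'ring'. Returns the ring whose write must be waited
 * on first, or -1 if no cross-ring wait is needed (read after read).
 */
static int sketch_add_read(struct sketch_obj *obj, int ring, uint32_t seqno)
{
	int wait_ring = -1;

	assert(ring >= 0 && ring < SKETCH_NUM_RINGS && seqno != 0);
	if (obj->last_write_ring >= 0 && obj->last_write_ring != ring)
		wait_ring = obj->last_write_ring;
	obj->last_read_seqno[ring] = seqno;
	return wait_ring;
}

/*
 * Record a write on 'ring'. Returns a bitmask of the other rings that
 * must be waited on: every ring with a pending read, plus the last writer.
 */
static unsigned int sketch_add_write(struct sketch_obj *obj, int ring,
				     uint32_t seqno)
{
	unsigned int wait_mask = 0;
	int i;

	assert(ring >= 0 && ring < SKETCH_NUM_RINGS && seqno != 0);
	for (i = 0; i < SKETCH_NUM_RINGS; i++) {
		if (i != ring && obj->last_read_seqno[i] != 0)
			wait_mask |= 1u << i;
		obj->last_read_seqno[i] = 0; /* the write supersedes old reads */
	}
	if (obj->last_write_ring >= 0 && obj->last_write_ring != ring)
		wait_mask |= 1u << obj->last_write_ring;
	obj->last_write_seqno = seqno;
	obj->last_write_ring = ring;
	return wait_mask;
}
```

Replaying the picture: reads on ring 0 and ring 1 return -1 (no wait),
and the following write on ring 2 returns a mask naming both ring 0 and
ring 1, because either read may be the last to finish.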



Re: [Intel-gfx] [PATCH] uxa/glamor: Enable the rest glamor rendering functions.

2011-12-13 Thread Chris Wilson
On Tue, 13 Dec 2011 22:31:41 +0800, zhigang.g...@linux.intel.com wrote:
> From: Zhigang Gong 
> 
> This commit enables all the remaining glamor rendering functions.
> Tested with the latest glamor master branch; it passes rendercheck.
> 
> One thing that needs to be pointed out is the handling of pictures.
> Pictures support many different color formats, but glamor's
> textures only support a few. The most common
> scenario is that we create a pixmap with a given color depth and
> then attach it to a picture which has a specific color format
> with the same color depth. But there is no way to change a
> texture's internal format after the texture has been allocated.
> If you do that, OpenGL will allocate a new texture, and
> then the glamor side and the UXA side will be inconsistent. So
> for all the picture-related operations, we can't fall back to
> the UXA path directly, even for a rather straightforward
> operation. So for get_image, Addtraps, etc., we have to add
> wrapper functions that jump into glamor first.

Can we create multiple textures referencing the same bo but with
different formats? Or are we going to run afoul of the coherency model
with GL?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] drm/i915: add a LLC feature flag in device description

2011-12-13 Thread Daniel Vetter
On Tue, Dec 13, 2011 at 09:20:40AM -0800, Eric Anholt wrote:
> On Tue, 13 Dec 2011 17:09:37 +0100, Daniel Vetter  wrote:
> > On Tue, Dec 13, 2011 at 11:05:15AM -0200, Eugeni Dodonov wrote:
> > > From: Eugeni Dodonov 
> > > 
> > > LLC is not SNB-specific, so we should check for it in a more generic way.
> > > 
> > > v2: export LLC support status via debugfs and DRM GETPARAM.
> > > 
> > > Signed-off-by: Eugeni Dodonov 
> > 
> > Nice patch and would get an r-b from me save for the new GETPARAM. I
> > really think we need to export this on a per-bo basis (and with the caveat
> > that the kernel is free to change the caching on every ioctl that uses
> > it). I.e. without forcing userspace to check the caching bits before any
> > bo access I fear that we won't be able to change the kernel's behaviour in
> > this area, which surely results in backwards-compat hell when the first
> > w/a that needs such changes comes around. Hence in its current form
> > 
> > Nacked-by: Daniel Vetter 
> > 
> > So please drop the GETPARAM. For the per-bo get_cache_flags ioctl there's
> > already a patch by Ben floating around.
> 
> The way the getparam would be useful is that right now we're taking some
> different paths for performance reasons in Mesa on gen6, assuming that
> LLC is present.  Knowing whether or not we expect BOs in general to be
> LLC for performance would be nice for that -- without that, I'll just
> make assumptions based on chipset generation.

Ok, I'll reconsider: In the mesa example (and any other use-case for llc
accelerated up/download) we don't depend upon llc for correctness and
we're using the caching on a newly created buffer, so the per-bo ioctls
aren't much use. So I think the backwards-compat mess is manageable if
people promise to use the HAS_LLC getparam only for such optimizations ...

I still think we want the full get/set_cache_level in addition to this.
But for this patch:

Reviewed-by: Daniel Vetter 
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Eric Anholt
On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky  wrote:
> Since we don't differentiate on the different GPU read domains, it
> should be safe to allow back to back reads to occur without issuing a
> wait (or flush in the non-semaphore case).
> 
> This has the unfortunate side effect that we need to keep track of all
> the outstanding buffer reads so that we can synchronize on a write, to
> another ring (since we don't know which read finishes first). In other
> words, the code is quite simple for two rings, but gets more tricky for
> > 2 rings.
> 
> Here is a picture of the solution to the above problem
> 
> Ring 0           Ring 1           Ring 2
> batch 0          batch 1          batch 2
> read buffer A    read buffer A    wait batch 0
>                                   wait batch 1
>                                   write buffer A
> 
> This code is really untested. I'm hoping for some feedback if this is
> worth cleaning up, and testing more thoroughly.

You say it's an optimization -- do you have performance numbers?




Re: [Intel-gfx] [PATCH] drm/i915: add a LLC feature flag in device description

2011-12-13 Thread Eric Anholt
On Tue, 13 Dec 2011 17:09:37 +0100, Daniel Vetter  wrote:
> On Tue, Dec 13, 2011 at 11:05:15AM -0200, Eugeni Dodonov wrote:
> > From: Eugeni Dodonov 
> > 
> > LLC is not SNB-specific, so we should check for it in a more generic way.
> > 
> > v2: export LLC support status via debugfs and DRM GETPARAM.
> > 
> > Signed-off-by: Eugeni Dodonov 
> 
> Nice patch and would get an r-b from me save for the new GETPARAM. I
> really think we need to export this on a per-bo basis (and with the caveat
> that the kernel is free to change the caching on every ioctl that uses
> it). I.e. without forcing userspace to check the caching bits before any
> bo access I fear that we won't be able to change the kernel's behaviour in
> this area, which surely results in backwards-compat hell when the first
> w/a that needs such changes comes around. Hence in its current form
> 
> Nacked-by: Daniel Vetter 
> 
> So please drop the GETPARAM. For the per-bo get_cache_flags ioctl there's
> already a patch by Ben floating around.

The way the getparam would be useful is that right now we're taking some
different paths for performance reasons in Mesa on gen6, assuming that
LLC is present.  Knowing whether or not we expect BOs in general to be
LLC for performance would be nice for that -- without that, I'll just
make assumptions based on chipset generation.
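
Editor's sketch of the userspace decision described above: prefer the
kernel's answer when the parameter is known, and fall back to a
generation-based guess when an older kernel rejects it. The function
name and its arguments are hypothetical, not Mesa code. getparam_ret
stands for the return value of the GETPARAM ioctl (0 on success), and
the gen 6/7 fallback simply mirrors the IS_GEN6 || IS_GEN7 check that
the patch replaces.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether to take the LLC-oriented fast paths.
 * Only a performance hint: correctness must not depend on the answer.
 */
static bool sketch_expect_llc(int getparam_ret, int getparam_value, int gen)
{
	assert(gen > 0);

	/* Kernel knows the parameter: trust what it reports. */
	if (getparam_ret == 0)
		return getparam_value != 0;

	/* Older kernel: guess from the chipset generation instead. */
	return gen == 6 || gen == 7;
}
```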




Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Chris Wilson
On Tue, 13 Dec 2011 17:01:33 +0100, Daniel Vetter  wrote:
> Afaik the only use-case for parallel reads is video decode with
> post-processing on the render ring. The decode ring needs read-only access
> to reference frames to decode the next frame and the render ring read-only
> access to past frames for post-processing (e.g. deinterlacing). But given
> the general state of perf optimizations in libva I think we have lower
> hanging fruit to chase if we actually miss a performance target for this
> use-case.

One in the near future will be: render to backbuffer (RCS),
pageflip to scanout (BCS), read from front (RCS).

And in its current form UXA will do the back-to-front blit on the BCS.
But that is async and so not a large race window, whereas the pageflip
may take ~16ms to process. I don't think it is entirely unfeasible that
we see some form of this whilst running compositors or games. Or at
least would if we enabled semaphores for pageflips. Except in the
pageflip scenario we know we are protected by the fb ref, so consider
the hypothetical scenario where we have a working vsync'ed blit...

The real question is in any event do we have enough instrumentation to
diagnose GPU stalls upon buffer migration? Then we can replace the read
optimisation with a tracepoint and wait for a test case.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Keith Packard
On Tue, 13 Dec 2011 17:01:33 +0100, Daniel Vetter  wrote:

> - or remove it all and invalidate/flush unconditionally.

Eric and I were chatting yesterday about trying this -- it seems like
we'd be able to dramatically simplify the kernel module by doing this,
and given how much flushing already occurs, I doubt we'd see any
significant performance difference, and we'd save a pile of CPU time,
which might actually improve performance.

-- 
keith.pack...@intel.com




Re: [Intel-gfx] [PATCH] drm/i915/sdvo: Enforce more timing requirements

2011-12-13 Thread Paulo Zanoni
Hi

2011/12/13 Adam Jackson :
> +       if (mode->vtotal - mode->vdisplay < 3)
> +               return MODE_VBLANK_NARROW;
> +
> +       if (mode->vsync_end - mode->vsync_start < 1)
> +               return MODE_VSYNC_NARROW;
> +
> +       if (mode->htotal - mode->hdisplay < 16)
> +               return MODE_HBLANK_NARROW;
> +
> +       if (mode->hsync_end - mode->hsync_start < 16)

I believe in this line above it should be 2 instead of 16.

> +               return MODE_HSYNC_NARROW;
> +
>        return MODE_OK;
>  }
>
> --
> 1.7.6.4
>
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Paulo Zanoni


Re: [Intel-gfx] [PATCH] drm/i915: add a LLC feature flag in device description

2011-12-13 Thread Daniel Vetter
On Tue, Dec 13, 2011 at 11:05:15AM -0200, Eugeni Dodonov wrote:
> From: Eugeni Dodonov 
> 
> LLC is not SNB-specific, so we should check for it in a more generic way.
> 
> v2: export LLC support status via debugfs and DRM GETPARAM.
> 
> Signed-off-by: Eugeni Dodonov 

Nice patch and would get an r-b from me save for the new GETPARAM. I
really think we need to export this on a per-bo basis (and with the caveat
that the kernel is free to change the caching on every ioctl that uses
it). I.e. without forcing userspace to check the caching bits before any
bo access I fear that we won't be able to change the kernel's behaviour in
this area, which surely results in backwards-compat hell when the first
w/a that needs such changes comes around. Hence in its current form

Nacked-by: Daniel Vetter 

So please drop the GETPARAM. For the per-bo get_cache_flags ioctl there's
already a patch by Ben floating around.
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Daniel Vetter
On Tue, Dec 13, 2011 at 09:49:34AM +, Chris Wilson wrote:
> On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky  wrote:
> > Since we don't differentiate on the different GPU read domains, it
> > should be safe to allow back to back reads to occur without issuing a
> > wait (or flush in the non-semaphore case).
> > 
> > This has the unfortunate side effect that we need to keep track of all
> > the outstanding buffer reads so that we can synchronize on a write, to
> > another ring (since we don't know which read finishes first). In other
> > words, the code is quite simple for two rings, but gets more tricky for
> > > 2 rings.
> > 
> > Here is a picture of the solution to the above problem
> > 
> > Ring 0           Ring 1           Ring 2
> > batch 0          batch 1          batch 2
> > read buffer A    read buffer A    wait batch 0
> >                                   wait batch 1
> >                                   write buffer A
> > 
> > This code is really untested. I'm hoping for some feedback if this is
> > worth cleaning up, and testing more thoroughly.
> 
> Yes, that race is quite valid and the reason why I thought I hadn't made
> that optimisation. Darn. :(
> 
> To go a step further, we can split the obj->ring_list into
> (obj->ring_read_list[NUM_RINGS], obj->num_readers, obj->last_read_seqno) and
> (obj->ring_write_list, obj->last_write_seqno). At which point Daniel
> complains about bloating every i915_gem_object, and we probably should
> kmem_cache_alloc a i915_gem_object_seqno on demand. This allows us to track
> objects in multiple rings and implement read-write locking, albeit at
> significantly more complexity in managing the active lists.

I think the i915_gem_object bloat can be fought by stealing a few bits
from the various seqnos and storing the ring id in there. The thing that
makes me more uneasy is that I don't trust our gpu domain tracking
(especially since it's not per-ring). So either
- extend it to be per-ring
- or remove it all and invalidate/flush unconditionally.
In the light of all the complexity and the fact that due to our various
w/as I prefer the latter.

Afaik the only use-case for parallel reads is video decode with
post-processing on the render ring. The decode ring needs read-only access
to reference frames to decode the next frame and the render ring read-only
access to past frames for post-processing (e.g. deinterlacing). But given
the general state of perf optimizations in libva I think we have lower
hanging fruit to chase if we actually miss a performance target for this
use-case.

Cheers, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
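
Editor's sketch of the "steal a few bits from the seqno" idea mentioned
above: pack the ring id into the top bits of a 32-bit value so no
separate per-object ring field is needed. The 2-bit/30-bit split and all
sketch_* names are assumptions made for illustration; nothing here is
taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RING_BITS  2u  /* assumed: enough for up to 4 rings */
#define SKETCH_SEQNO_BITS (32u - SKETCH_RING_BITS)
#define SKETCH_SEQNO_MASK ((1u << SKETCH_SEQNO_BITS) - 1u)

/* Combine a ring id and a (truncated) seqno into one 32-bit word. */
static uint32_t sketch_pack(unsigned int ring, uint32_t seqno)
{
	assert(ring < (1u << SKETCH_RING_BITS));
	return ((uint32_t)ring << SKETCH_SEQNO_BITS) |
	       (seqno & SKETCH_SEQNO_MASK);
}

/* Recover the ring id from the top bits. */
static unsigned int sketch_ring(uint32_t packed)
{
	return packed >> SKETCH_SEQNO_BITS;
}

/* Recover the seqno from the low bits. */
static uint32_t sketch_seqno(uint32_t packed)
{
	return packed & SKETCH_SEQNO_MASK;
}
```

The cost of the trick is visible in the mask: the seqno space shrinks
from 32 to 30 bits in this sketch, so it wraps four times sooner and any
wraparound handling has to use the narrower width.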


[Intel-gfx] [PATCH] drm/i915/sdvo: Enforce more timing requirements

2011-12-13 Thread Adam Jackson
Signed-off-by: Adam Jackson 
---
 drivers/gpu/drm/i915/intel_sdvo.c |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_sdvo.c b/drivers/gpu/drm/i915/intel_sdvo.c
index 3003fb2..82de0b0 100644
--- a/drivers/gpu/drm/i915/intel_sdvo.c
+++ b/drivers/gpu/drm/i915/intel_sdvo.c
@@ -1174,6 +1174,18 @@ static int intel_sdvo_mode_valid(struct drm_connector *connector,
return MODE_PANEL;
}
 
+   if (mode->vtotal - mode->vdisplay < 3)
+   return MODE_VBLANK_NARROW;
+
+   if (mode->vsync_end - mode->vsync_start < 1)
+   return MODE_VSYNC_NARROW;
+
+   if (mode->htotal - mode->hdisplay < 16)
+   return MODE_HBLANK_NARROW;
+
+   if (mode->hsync_end - mode->hsync_start < 16)
+   return MODE_HSYNC_NARROW;
+
return MODE_OK;
 }
 
-- 
1.7.6.4
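
Editor's sketch of the checks this patch adds, pulled out into a
standalone function so the arithmetic is easy to see. The struct, enum
and function names are hypothetical stand-ins, not the DRM types. The
minimum hsync width is left as a parameter because the posted patch uses
16 while the review reply elsewhere in this thread suggests 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down display mode: only the timing fields used here. */
struct sketch_mode {
	int hdisplay, hsync_start, hsync_end, htotal;
	int vdisplay, vsync_start, vsync_end, vtotal;
};

enum sketch_mode_status {
	SKETCH_MODE_OK,
	SKETCH_MODE_VBLANK_NARROW,
	SKETCH_MODE_VSYNC_NARROW,
	SKETCH_MODE_HBLANK_NARROW,
	SKETCH_MODE_HSYNC_NARROW,
};

/* Reject modes whose blanking or sync intervals are below the minimums. */
static enum sketch_mode_status
sketch_check_blanking(const struct sketch_mode *m, int min_hsync)
{
	assert(m != NULL && min_hsync > 0);

	if (m->vtotal - m->vdisplay < 3)
		return SKETCH_MODE_VBLANK_NARROW;
	if (m->vsync_end - m->vsync_start < 1)
		return SKETCH_MODE_VSYNC_NARROW;
	if (m->htotal - m->hdisplay < 16)
		return SKETCH_MODE_HBLANK_NARROW;
	if (m->hsync_end - m->hsync_start < min_hsync)
		return SKETCH_MODE_HSYNC_NARROW;
	return SKETCH_MODE_OK;
}
```

A mode with an 8-pixel hsync pulse is rejected with min_hsync = 16 and
accepted with min_hsync = 2, which is exactly the difference the review
comment is about.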



[Intel-gfx] [PATCH] uxa/glamor: Enable the rest glamor rendering functions.

2011-12-13 Thread zhigang . gong
From: Zhigang Gong 

This commit enables all the remaining glamor rendering functions.
Tested with the latest glamor master branch; it passes rendercheck.

One thing that needs to be pointed out is the handling of pictures.
Pictures support many different color formats, but glamor's
textures only support a few. The most common scenario is that we
create a pixmap with a given color depth and then attach it to a
picture which has a specific color format of the same depth. But
there is no way to change a texture's internal format after the
texture has been allocated. If you try, OpenGL will allocate a new
texture, and then the glamor side and the UXA side will be
inconsistent. So for all picture-related operations we can't fall
back to the UXA path directly, even for a rather straightforward
operation. For get_image, AddTraps, etc., we therefore have to add
wrapper functions that jump into glamor first.

Signed-off-by: Zhigang Gong 
---
 src/intel_uxa.c  |   14 +++-
 uxa/uxa-accel.c  |   88 -
 uxa/uxa-glamor.h |   16 -
 uxa/uxa-glyphs.c |   21 
 uxa/uxa-priv.h   |8 +
 uxa/uxa-render.c |   90 ++
 uxa/uxa.c|4 +-
 7 files changed, 234 insertions(+), 7 deletions(-)

diff --git a/src/intel_uxa.c b/src/intel_uxa.c
index e4a5270..a79affa 100644
--- a/src/intel_uxa.c
+++ b/src/intel_uxa.c
@@ -1108,6 +1108,9 @@ intel_uxa_create_pixmap(ScreenPtr screen, int w, int h, int depth,
list_del(&priv->in_flight);
screen->ModifyPixmapHeader(pixmap, w, h, 0, 0, stride, NULL);
intel_set_pixmap_private(pixmap, priv);
+
+   if (!intel_glamor_create_textured_pixmap(pixmap))
+   intel_set_pixmap_bo(pixmap, NULL);
return pixmap;
}
}
@@ -1145,8 +1148,15 @@ intel_uxa_create_pixmap(ScreenPtr screen, int w, int h, int depth,
list_init(&priv->batch);
list_init(&priv->flush);
intel_set_pixmap_private(pixmap, priv);
-
-   intel_glamor_create_textured_pixmap(pixmap);
+   /* If creating the textured pixmap fails, glamor could not
+* create a texture from the BO for some reason. Glamor then
+* creates a new texture attached to the pixmap, and all
+* subsequent rendering operations on this pixmap will never
+* fall back to the UXA path, so we don't need to hold the
+* useless BO in that case.
+*/
+   if (!intel_glamor_create_textured_pixmap(pixmap))
+   intel_set_pixmap_bo(pixmap, NULL);
}
 
return pixmap;
diff --git a/uxa/uxa-accel.c b/uxa/uxa-accel.c
index e4afd13..05c64f6 100644
--- a/uxa/uxa-accel.c
+++ b/uxa/uxa-accel.c
@@ -207,8 +207,23 @@ static void
 uxa_put_image(DrawablePtr pDrawable, GCPtr pGC, int depth, int x, int y,
  int w, int h, int leftPad, int format, char *bits)
 {
+   uxa_screen_t *uxa_screen = uxa_get_screen(pDrawable->pScreen);
+
+   if (uxa_screen->info->flags & UXA_USE_GLAMOR) {
+   uxa_prepare_access(pDrawable, UXA_GLAMOR_ACCESS_RW);
+   if (glamor_put_image_nf(pDrawable,
+   pGC, depth, x, y, w, h,
+   leftPad, format, bits)) {
+   uxa_finish_access(pDrawable, UXA_GLAMOR_ACCESS_RW);
+   return;
+   }
+   uxa_finish_access(pDrawable, UXA_GLAMOR_ACCESS_RO);
+   goto fallback;
+   }
+
if (!uxa_do_put_image(pDrawable, pGC, depth, x, y, w, h, format, bits,
  PixmapBytePad(w, pDrawable->depth)))
+fallback:
uxa_check_put_image(pDrawable, pGC, depth, x, y, w, h, leftPad,
format, bits);
 }
@@ -352,6 +367,22 @@ uxa_copy_n_to_n(DrawablePtr pSrcDrawable,
int dst_off_x, dst_off_y;
PixmapPtr pSrcPixmap, pDstPixmap;
 
+   if (uxa_screen->info->flags & UXA_USE_GLAMOR) {
+   uxa_prepare_access(pSrcDrawable, UXA_GLAMOR_ACCESS_RO);
+   uxa_prepare_access(pDstDrawable, UXA_GLAMOR_ACCESS_RW);
+   if (glamor_copy_n_to_n_nf(pSrcDrawable, pDstDrawable,
+ pGC, pbox, nbox, dx, dy,
+ reverse, upsidedown, bitplane,
+ closure)) {
+   uxa_finish_access(pDstDrawable, UXA_GLAMOR_ACCESS_RW);
+   uxa_finish_access(pSrcDrawable, UXA_GLAMOR_ACCESS_RO);
+   return;
+   }
+   uxa_finish_access(pDstDrawable, UXA_GLAMOR_ACCESS_RO

Re: [Intel-gfx] [PATCH] drm/i915: add a LLC feature flag in device description

2011-12-13 Thread Chris Wilson
On Tue, 13 Dec 2011 11:05:15 -0200, Eugeni Dodonov  wrote:
> From: Eugeni Dodonov 
> 
> LLC is not SNB-specific, so we should check for it in a more generic way.
> 
> v2: export LLC support status via debugfs and DRM GETPARAM.
> 
> Signed-off-by: Eugeni Dodonov 
Reviewed-by: Chris Wilson 

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] [PATCH] drm/i915: add a LLC feature flag in device description

2011-12-13 Thread Eugeni Dodonov
From: Eugeni Dodonov 

LLC is not SNB-specific, so we should check for it in a more generic way.

v2: export LLC support status via debugfs and DRM GETPARAM.

Signed-off-by: Eugeni Dodonov 
---
 drivers/gpu/drm/i915/i915_debugfs.c |1 +
 drivers/gpu/drm/i915/i915_dma.c |3 +++
 drivers/gpu/drm/i915/i915_drv.c |2 ++
 drivers/gpu/drm/i915/i915_drv.h |2 ++
 drivers/gpu/drm/i915/i915_gem.c |4 ++--
 include/drm/i915_drm.h  |1 +
 6 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 4f40f1c..070dbd9 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -82,6 +82,7 @@ static int i915_capabilities(struct seq_file *m, void *data)
B(supports_tv);
B(has_bsd_ring);
B(has_blt_ring);
+   B(has_llc);
 #undef B
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a9533c5..938ad57 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -781,6 +781,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_HAS_RELAXED_DELTA:
value = 1;
break;
+   case I915_PARAM_HAS_LLC:
+   value = HAS_LLC(dev);
+   break;
default:
DRM_DEBUG_DRIVER("Unknown parameter %d\n",
 param->param);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e9c2cfe..b7ac903 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -214,6 +214,7 @@ static const struct intel_device_info intel_sandybridge_d_info = {
.need_gfx_hws = 1, .has_hotplug = 1,
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct intel_device_info intel_sandybridge_m_info = {
@@ -222,6 +223,7 @@ static const struct intel_device_info intel_sandybridge_m_info = {
.has_fbc = 1,
.has_bsd_ring = 1,
.has_blt_ring = 1,
+   .has_llc = 1,
 };
 
 static const struct intel_device_info intel_ivybridge_d_info = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 06a37f4..24969eb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -247,6 +247,7 @@ struct intel_device_info {
u8 supports_tv:1;
u8 has_bsd_ring:1;
u8 has_blt_ring:1;
+   u8 has_llc:1;
 };
 
 enum no_fbc_reason {
@@ -960,6 +961,7 @@ struct drm_i915_file_private {
 
 #define HAS_BSD(dev)(INTEL_INFO(dev)->has_bsd_ring)
 #define HAS_BLT(dev)(INTEL_INFO(dev)->has_blt_ring)
+#define HAS_LLC(dev)(INTEL_INFO(dev)->has_llc)
 #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws)
 
 #define HAS_OVERLAY(dev)   (INTEL_INFO(dev)->has_overlay)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d18b07a..2c0fa78 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3613,8 +3613,8 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 
-   if (IS_GEN6(dev)) {
-   /* On Gen6, we can have the GPU use the LLC (the CPU
+   if (HAS_LLC(dev)) {
+   /* On some devices, we can have the GPU use the LLC (the CPU
 * cache) for about a 10% performance improvement
 * compared to uncached.  Graphics requests other than
 * display scanout are coherent with the CPU in
diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index 28c0d11..b34e630 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -291,6 +291,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_COHERENT_RINGS   13
 #define I915_PARAM_HAS_EXEC_CONSTANTS   14
 #define I915_PARAM_HAS_RELAXED_DELTA15
+#define I915_PARAM_HAS_LLC  16
 
 typedef struct drm_i915_getparam {
int param;
-- 
1.7.7.4
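
The pattern the patch follows (a per-device capability bit plus an accessor
macro, replacing a generation check) can be sketched in isolation. All
sketch_* and SKETCH_* names below are hypothetical stand-ins, and the two
device entries are made up for illustration; they are not the driver's
tables.

```c
/* Hypothetical stand-in for a per-device description table entry. */
struct sketch_device_info {
	unsigned int gen;
	unsigned int has_bsd_ring:1;
	unsigned int has_blt_ring:1;
	unsigned int has_llc:1;	/* capability bit instead of a gen check */
};

/* Accessor macro in the style of HAS_BSD()/HAS_BLT() from the patch. */
#define SKETCH_HAS_LLC(info) ((info)->has_llc)

/* Illustrative device with a shared last-level cache. */
static const struct sketch_device_info sketch_with_llc = {
	.gen = 6,
	.has_bsd_ring = 1,
	.has_blt_ring = 1,
	.has_llc = 1,
};

/* Illustrative device without one; unset bits default to 0. */
static const struct sketch_device_info sketch_without_llc = {
	.gen = 5,
	.has_bsd_ring = 1,
};

/* Callers ask about the capability, not about the generation. */
static int sketch_use_llc_caching(const struct sketch_device_info *info)
{
	return SKETCH_HAS_LLC(info) ? 1 : 0;
}
```

With this shape, enabling the feature for a new device is a one-line change
in its description entry, and no call site needs to learn about the new
generation.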



Re: [Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

2011-12-13 Thread Chris Wilson
On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky  wrote:
> Since we don't differentiate on the different GPU read domains, it
> should be safe to allow back to back reads to occur without issuing a
> wait (or flush in the non-semaphore case).
> 
> This has the unfortunate side effect that we need to keep track of all
> the outstanding buffer reads so that we can synchronize on a write, to
> another ring (since we don't know which read finishes first). In other
> words, the code is quite simple for two rings, but gets more tricky for
> more than 2 rings.
> 
> Here is a picture of the solution to the above problem
> 
> Ring 0            Ring 1            Ring 2
> batch 0           batch 1           batch 2
>   read buffer A     read buffer A     wait batch 0
>                                       wait batch 1
>                                       write buffer A
> 
> This code is really untested. I'm hoping for some feedback if this is
> worth cleaning up, and testing more thoroughly.

Yes, that race is quite valid and the reason why I thought I hadn't made
that optimisation. Darn. :(

To go a step further, we can split the obj->ring_list into
(obj->ring_read_list[NUM_RINGS], obj->num_readers, obj->last_read_seqno) and
(obj->ring_write_list, obj->last_write_seqno). At which point Daniel
complains about bloating every i915_gem_object, and we probably should
kmem_cache_alloc a i915_gem_object_seqno on demand. This allows us to track
objects in multiple rings and implement read-write locking, albeit at
significantly more complexity in managing the active lists.
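
As a rough illustration of the per-ring read tracking, here is a minimal
sketch. It assumes three rings and uses a seqno of 0 to mean "no outstanding
read"; the sketch_* names are hypothetical, pending writes are left out, and
none of this is the driver's actual structure.

```c
#include <stdint.h>

#define SKETCH_NUM_RINGS 3

/* Hypothetical per-object tracking: one outstanding read seqno per ring. */
struct sketch_obj_tracking {
	uint32_t last_read_seqno[SKETCH_NUM_RINGS];	/* 0 = no read pending */
};

/* Record a read; back-to-back reads on different rings need no wait. */
static void sketch_mark_read(struct sketch_obj_tracking *t, int ring,
			     uint32_t seqno)
{
	t->last_read_seqno[ring] = seqno;
}

/* Before a write on ring 'writer', collect every other ring that still has
 * a read outstanding. Returns a bitmask of rings to wait on and fills
 * wait_seqno[] with the seqno to wait for on each of them (0 = no wait). */
static unsigned int
sketch_rings_to_wait_for_write(const struct sketch_obj_tracking *t, int writer,
			       uint32_t wait_seqno[SKETCH_NUM_RINGS])
{
	unsigned int mask = 0;
	int i;

	for (i = 0; i < SKETCH_NUM_RINGS; i++) {
		wait_seqno[i] = 0;
		if (i == writer)
			continue;	/* a ring executes its own batches in order */
		if (t->last_read_seqno[i] != 0) {
			mask |= 1u << i;
			wait_seqno[i] = t->last_read_seqno[i];
		}
	}
	return mask;
}
```

For the three-ring picture quoted above, reads on rings 0 and 1 followed by
a write on ring 2 yield a mask covering rings 0 and 1, i.e. the writer has
to wait for both readers since we don't know which one finishes first.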
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre