[Mesa-dev] [Bug 53199] New: out-of-bounds read src/gallium/drivers/softpipe/sp_flush.c:59

2012-08-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=53199

 Bug #: 53199
   Summary: out-of-bounds read src/gallium/drivers/softpipe/sp_flush.c:59
Classification: Unclassified
   Product: Mesa
   Version: unspecified
  Platform: All
OS/Version: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
AssignedTo: mesa-dev@lists.freedesktop.org
ReportedBy: v...@freedesktop.org
CC: bri...@vmware.com


mesa: 7d65356d8a4d268dce4c933d7704d709e1cdacfa (master)

Coverity reports an out-of-bounds read defect.

 44 void
 45 softpipe_flush( struct pipe_context *pipe,
 46                 unsigned flags,
 47                 struct pipe_fence_handle **fence )
 48 {
 49    struct softpipe_context *softpipe = softpipe_context(pipe);
 50    uint i;
 51
 52    draw_flush(softpipe->draw);
 53
At (1): Condition "flags & 2U", taking true branch
 54    if (flags & SP_FLUSH_TEXTURE_CACHE) {
 55       unsigned sh;
 56
At (2): Condition "sh < 4U", taking true branch
At (9): Condition "sh < 4U", taking true branch
At (16): Condition "sh < 4U", taking true branch
 57       for (sh = 0; sh < PIPE_SHADER_TYPES; sh++) {
At (3): Condition "i < softpipe->num_sampler_views[sh]", taking true branch
At (5): Condition "i < softpipe->num_sampler_views[sh]", taking true branch
At (7): Condition "i < softpipe->num_sampler_views[sh]", taking false branch
At (10): Condition "i < softpipe->num_sampler_views[sh]", taking true branch
At (12): Condition "i < softpipe->num_sampler_views[sh]", taking true branch
At (14): Condition "i < softpipe->num_sampler_views[sh]", taking false branch
At (17): Condition "i < softpipe->num_sampler_views[sh]", taking true branch
 58          for (i = 0; i < softpipe->num_sampler_views[sh]; i++) {
CID 714585: Out-of-bounds read (OVERRUN) [select defect]
CID 714587: Out-of-bounds read (OVERRUN_STATIC)
At (18): Overrunning static array "softpipe->tex_cache", with 3 elements, at position 3 with index variable "sh".
 59             sp_flush_tex_tile_cache(softpipe->tex_cache[sh][i]);
At (4): Jumping back to the beginning of the loop
At (6): Jumping back to the beginning of the loop
At (11): Jumping back to the beginning of the loop
At (13): Jumping back to the beginning of the loop
 60          }
At (8): Jumping back to the beginning of the loop
At (15): Jumping back to the beginning of the loop
 61       }
 62    }



src/gallium/include/pipe/p_defines.h 
   347  /**
   348   * Shaders
   349   */
   350  #define PIPE_SHADER_VERTEX   0
   351  #define PIPE_SHADER_FRAGMENT 1
   352  #define PIPE_SHADER_GEOMETRY 2
   353  #define PIPE_SHADER_COMPUTE  3
   354  #define PIPE_SHADER_TYPES    4


src/gallium/drivers/softpipe/sp_context.h
   180 /*
   181  * Texture caches for vertex, fragment, geometry stages.
   182  * Don't use PIPE_SHADER_TYPES here to avoid allocating unused memory
   183  * for compute shaders.
   184  */
   185 struct softpipe_tex_tile_cache *tex_cache[PIPE_SHADER_GEOMETRY+1][PIPE_MAX_SAMPLERS];
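
The overrun follows from the two excerpts above: the outer loop runs to PIPE_SHADER_TYPES (4), while the first dimension of tex_cache is PIPE_SHADER_GEOMETRY+1 (3). A minimal sketch of one way to keep the loop and the array in step, by bounding the loop with the array's own size. The `struct ctx`, `MAX_SAMPLERS`, `ARRAY_SIZE` and `count_visited` names are hypothetical stand-ins, not softpipe code, and this is not necessarily how the bug was actually fixed.

```c
#include <stddef.h>

#define PIPE_SHADER_VERTEX   0
#define PIPE_SHADER_FRAGMENT 1
#define PIPE_SHADER_GEOMETRY 2
#define PIPE_SHADER_COMPUTE  3
#define PIPE_SHADER_TYPES    4

#define MAX_SAMPLERS 2   /* hypothetical; stands in for PIPE_MAX_SAMPLERS */

/* Number of elements in a true array (not a pointer). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the relevant softpipe_context fields. */
struct ctx {
   unsigned num_sampler_views[PIPE_SHADER_TYPES];
   int *tex_cache[PIPE_SHADER_GEOMETRY + 1][MAX_SAMPLERS];
};

/* Visit every cache slot, bounding the outer loop by the array's own
 * first dimension (3) instead of PIPE_SHADER_TYPES (4). */
static unsigned
count_visited(const struct ctx *c)
{
   unsigned sh, i, visited = 0;

   for (sh = 0; sh < ARRAY_SIZE(c->tex_cache); sh++) {
      for (i = 0; i < c->num_sampler_views[sh] && i < MAX_SAMPLERS; i++) {
         (void) c->tex_cache[sh][i];   /* in-bounds read */
         visited++;
      }
   }
   return visited;
}
```

With this bound, the compute-shader entry of num_sampler_views is never used to index tex_cache, whatever value it holds.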

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] translate: Fix typo in is_legal_int_format_combo.

2012-08-07 Thread Vinson Lee
Fixes "same on both sides" defect reported by Coverity.

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/gallium/auxiliary/translate/translate_generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/translate/translate_generic.c b/src/gallium/auxiliary/translate/translate_generic.c
index 0b6ebf5..72099af 100644
--- a/src/gallium/auxiliary/translate/translate_generic.c
+++ b/src/gallium/auxiliary/translate/translate_generic.c
@@ -773,7 +773,7 @@ is_legal_int_format_combo( const struct util_format_description *src,
 
    for (i = 0; i < nr; i++) {
       /* The signs must match. */
-      if (src->channel[i].type != src->channel[i].type) {
+      if (src->channel[i].type != dst->channel[i].type) {
          return FALSE;
       }
 
-- 
1.7.11.1
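
Before the patch, the check compared src against itself, so the inequality could never be true and the function never rejected a sign mismatch. A minimal sketch of the corrected comparison, assuming reduced, hypothetical `struct format_desc` and `enum chan_type` types in place of the real util_format_description:

```c
#include <stdbool.h>

enum chan_type { TYPE_UNSIGNED, TYPE_SIGNED };      /* hypothetical */

struct channel { enum chan_type type; };
struct format_desc { struct channel channel[4]; };  /* hypothetical stand-in */

/* Return true only if the first nr channels of src and dst agree in type. */
static bool
signs_match(const struct format_desc *src,
            const struct format_desc *dst, unsigned nr)
{
   unsigned i;

   for (i = 0; i < nr; i++) {
      /* Compare source against destination, channel by channel. */
      if (src->channel[i].type != dst->channel[i].type)
         return false;
   }
   return true;
}
```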



Re: [Mesa-dev] R600 VDPAU 422 regression since r600g: make sure copying of all texture formats is accelerated

2012-08-07 Thread Andy Furniss

Marek Olšák wrote:
> Does the attached patch fix this issue?

Not properly - it fixes the invalid command stream but the output is not
quite right -

http://www.andyqos.ukfsn.org/vdpau-422-patched.png

> Marek
>
> On Mon, Aug 6, 2012 at 5:40 PM, Andy Furniss andy...@ukfsn.org wrote:
>> Kernel is dcn card is rv790 - vdpau csc/scale regressed.
>>
>> This only shows with 422 colour so most things work.
>>
>> commit 7c371f46958910dd2ca9487c89af1b72bbfdada9
>> Author: Marek Olšák mar...@gmail.com
>> Date:   Sat Jul 28 00:38:42 2012 +0200
>>
>>     r600g: make sure copying of all texture formats is accelerated
>>
>> [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !
>> radeon :01:00.0: texture bo too small ((704 576) (1 1) 0 26 0 - 1622016 have 884736)
>> radeon :01:00.0: alignments 384 1 1 1



Re: [Mesa-dev] [PATCH] translate: Fix typo in is_legal_int_format_combo.

2012-08-07 Thread Jose Fonseca
Good catch.

Reviewed-by: Jose Fonseca jfons...@vmware.com

----- Original Message -----
> Fixes "same on both sides" defect reported by Coverity.
>
> Signed-off-by: Vinson Lee v...@freedesktop.org
> ---
>  src/gallium/auxiliary/translate/translate_generic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/translate/translate_generic.c b/src/gallium/auxiliary/translate/translate_generic.c
> index 0b6ebf5..72099af 100644
> --- a/src/gallium/auxiliary/translate/translate_generic.c
> +++ b/src/gallium/auxiliary/translate/translate_generic.c
> @@ -773,7 +773,7 @@ is_legal_int_format_combo( const struct util_format_description *src,
>
>     for (i = 0; i < nr; i++) {
>        /* The signs must match. */
> -      if (src->channel[i].type != src->channel[i].type) {
> +      if (src->channel[i].type != dst->channel[i].type) {
>           return FALSE;
>        }
>
> --
> 1.7.11.1


Re: [Mesa-dev] down to 1 test page failing in WebGL 1.0.1 test on Radeon driver

2012-08-07 Thread Andreas Boll
2012/8/6 Laurent Carlier lordhea...@gmail.com:
> On Monday, 6 August 2012 17:14:52, Alex Deucher wrote:
>> On Mon, Aug 6, 2012 at 5:14 PM, Alex Deucher alexdeuc...@gmail.com wrote:
>>> On Mon, Aug 6, 2012 at 12:43 AM, Benoit Jacob bja...@mozilla.com wrote:
>>>> Hi,
>>>>
>>>> Just so you know: the WebGL 1.0.1 tests are now passing on 2 drivers on
>>>> Linux: the Intel Mesa driver, and the NVIDIA driver.
>>>>
>>>> Technically that's enough for us to claim conformance (we need to pass
>>>> with 2 drivers on each OS we support).
>>>>
>>>> But I'd really like to include the Radeon driver in the list of drivers
>>>> we can claim to fully pass conformance tests on.
>>>>
>>>> As of Ubuntu 12.04 64bit / Gallium 0.4 on AMD RV710 / Mesa 8.0.2, I have
>>>> this single test page failing:
>>>>
>>>> https://www.khronos.org/registry/webgl/conformance-suites/1.0.1/conformance/textures/texture-mips.html
>>>
>>> Does it still fail with mesa from git (soon to be 8.1)? I think the
>>> tiling rework may have fixed this, but is too invasive to backport to
>>> the 8.x branch.
>>
>> 8.0 branch that is.
>>
>> Alex
>
> Two fail here with r600g/hd6870 from git and kernel 3.5

I can confirm that with my rv770 / kernel 3.5 / mesa 8.1-git

conformance/textures/texture-mips.html (17 of 19 passed)

failed: texture that is only using the smallest 2 mips should draw with green
failed: texture that is only using smallest mips should draw with cyan

Andreas


[Mesa-dev] [Bug 53199] out-of-bounds read src/gallium/drivers/softpipe/sp_flush.c:59

2012-08-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=53199

Brian Paul brian.e.p...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #1 from Brian Paul brian.e.p...@gmail.com 2012-08-07 14:01:27 UTC ---
Fixed by commit 99695f58fde6d364f2310d97303768782a1e537d



Re: [Mesa-dev] someone regressed tinderbox

2012-08-07 Thread Brian Paul

On 08/06/2012 10:38 PM, Dave Airlie wrote:
> http://tinderbox.x.org/builds/2012-08-06-0020/logs/libGL/#build
>
> Making all in glx
> gmake[4]: Entering directory `/home/tinderbox/mesa/mesa/src/egl/drivers/glx'
>   CC egl_glx.lo
> In file included from ../../../../src/egl/main/egltypedefs.h:37,
>                  from ../../../../src/egl/main/eglconfig.h:37,
>                  from egl_glx.c:44:
> ../../../../include/EGL/eglext.h:454: error: redefinition of typedef 'PFNEGLQUERYSTREAMTIMEKHRPROC'
> ../../../../include/EGL/eglext.h:407: note: previous declaration of 'PFNEGLQUERYSTREAMTIMEKHRPROC' was here
> gmake[4]: Leaving directory `/home/tinderbox/mesa/mesa/src/egl/drivers/glx'

The eglext.h patch I posted last night fixes this.  I'll go ahead and
push it.


-Brian



[Mesa-dev] [PATCH] gbm: Fix build without gallium_drm_loader

2012-08-07 Thread Chí-Thanh Christopher Nguyễn
pipe_loader_drm_probe_fd only exists if HAVE_PIPE_LOADER_DRM is defined.
This addresses https://bugs.freedesktop.org/show_bug.cgi?id=52962
---
 src/gallium/targets/gbm/gbm.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/src/gallium/targets/gbm/gbm.c b/src/gallium/targets/gbm/gbm.c
index 7d2af51..3ab3b32 100644
--- a/src/gallium/targets/gbm/gbm.c
+++ b/src/gallium/targets/gbm/gbm.c
@@ -51,9 +51,11 @@ gallium_screen_create(struct gbm_gallium_drm_device *gdrm)
    struct pipe_loader_device *dev;
    int ret;
 
+#ifdef HAVE_PIPE_LOADER_DRM
    ret = pipe_loader_drm_probe_fd(&dev, gdrm->base.base.fd);
    if (!ret)
       return -1;
+#endif /* HAVE_PIPE_LOADER_DRM */
 
    gdrm->screen = pipe_loader_create_screen(dev, get_library_search_path());
    if (gdrm->screen == NULL) {
-- 
1.7.8.6



Re: [Mesa-dev] [PATCH 13/15] dri: Simplify use of driConcatConfigs

2012-08-07 Thread Chad Versace
On 08/06/2012 07:33 PM, Eric Anholt wrote:
> Chad Versace chad.vers...@linux.intel.com writes:
>
>> If either argument to driConcatConfigs(a, b) is null or the empty list,
>> then simply return the other argument as the resultant list.
>>
>> All callers were accomplishing that same behavior anyway. And each
>> caller accomplished it with the same pattern. So this patch moves that
>> external pattern into the function.
>>
>> CC: Ian Romanick i...@freedesktop.org
>> Reviewed-by: e...@anholt.net
>> Signed-off-by: Chad Versace chad.vers...@linux.intel.com
>
> I was going to say reviewed-by on the last patchset with this change.

You gave it reviewed-by on the previous patchset, where it was patch 14/16.
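
The behavior described in the commit message can be sketched as follows. This is only an illustration under stated assumptions: `struct config`, `list_len` and `concat_configs` are hypothetical names, lists are modeled as NULL-terminated pointer arrays, and unlike a real implementation the sketch does not take ownership of or free its inputs.

```c
#include <stdlib.h>

struct config { int id; };   /* hypothetical stand-in for a DRI config */

/* Count the entries of a NULL-terminated pointer list. */
static size_t
list_len(struct config **l)
{
   size_t n = 0;

   while (l[n])
      n++;
   return n;
}

/* Concatenate two lists; if either is NULL or empty, return the other. */
static struct config **
concat_configs(struct config **a, struct config **b)
{
   struct config **all;
   size_t na, nb, i;

   if (a == NULL || a[0] == NULL)
      return b;
   if (b == NULL || b[0] == NULL)
      return a;

   na = list_len(a);
   nb = list_len(b);
   all = malloc((na + nb + 1) * sizeof(*all));
   if (all == NULL)
      return NULL;

   for (i = 0; i < na; i++)
      all[i] = a[i];
   for (i = 0; i < nb; i++)
      all[na + i] = b[i];
   all[na + nb] = NULL;
   return all;
}
```

Folding the early returns into the function means callers no longer need their own "is this the first list?" checks before concatenating.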


Re: [Mesa-dev] [PATCH 14/15] intel: Clarify intel_screen_make_configs

2012-08-07 Thread Chad Versace
On 08/06/2012 07:40 PM, Eric Anholt wrote:
> Chad Versace chad.vers...@linux.intel.com writes:
>
>> This function felt sloppy, so this patch cleans it up a little bit.
>>
>> - Rename `color` to `i`. It is not a color value, only an iterator int.
>
> I'm meh on this change.

A quick explanation of why I renamed it... The variable name `color` confused
me the first time I encountered this function. It's uncommon for integer
iterator variables to have colorful names so I suspected something special was
going on, but nothing was.

>> - Move `depth_bits[0] = 0` into the non-accum loop because that is where
>>   it is used. The accum loop later overwrites depth_bits[0].
>
> This makes sense -- move it next to the place that sets up the rest of the
> array.
>
>> - Redefine `msaa_samples_array` as static const because it is never
>>   modified.
>
> Maybe instead, singlesample_samples[] = {0} and multisample_samples[] = {4,
> 8}?  The array math in the next patch was not pretty.

Good idea. I don't want git blame to associate me with that array math. I
incorporated your suggestion into this and the next patch.
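
A minimal sketch of the two-array layout suggested above, so that each loop walks its own array with no offset arithmetic. The array names and contents come from the suggestion; the `list_sample_counts` helper and `ARRAY_SIZE` macro are hypothetical and only illustrate the iteration, not the actual config-generation code.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const unsigned singlesample_samples[] = { 0 };
static const unsigned multisample_samples[]  = { 4, 8 };

/* Hypothetical helper: write the sample counts in the order the configs
 * would be generated (singlesample first, then multisample) into out,
 * which holds up to cap entries.  Returns the number written. */
static size_t
list_sample_counts(unsigned *out, size_t cap)
{
   size_t n = 0, i;

   for (i = 0; i < ARRAY_SIZE(singlesample_samples) && n < cap; i++)
      out[n++] = singlesample_samples[i];
   for (i = 0; i < ARRAY_SIZE(multisample_samples) && n < cap; i++)
      out[n++] = multisample_samples[i];
   return n;
}
```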



Re: [Mesa-dev] [PATCH mesa] i965: add more Haswell PCI IDs

2012-08-07 Thread Paulo Zanoni
2012/8/6 Kenneth Graunke kenn...@whitecape.org:
> On 08/06/2012 02:50 PM, Paulo Zanoni wrote:
>> From: Paulo Zanoni paulo.r.zan...@intel.com
>>
>> Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com
>
> Reviewed-by: Kenneth Graunke kenn...@whitecape.org
>
> Do you have push access?  If not, I can commit this for you.

I just discovered I have. Patch committed, thanks.


-- 
Paulo Zanoni


Re: [Mesa-dev] [PATCH 10/15] intel: Support mapping multisample miptrees (v2)

2012-08-07 Thread Chad Versace
On 08/06/2012 07:32 PM, Eric Anholt wrote:
> Chad Versace chad.vers...@linux.intel.com writes:
>
>> Add two new functions: intel_miptree_{map,unmap}_multisample, to which
>> intel_miptree_{map,unmap} dispatch. Only mapping flat, renderbuffer-like
>> miptrees are supported.
>>
>> v2:
>> - Move the introduction of
>>   intel_mipmap_tree::singlesample_{width0,height0} to this patch, per
>>   Anholt.
>> - Replace relations `mt->num_samples == 0` and `mt->num_samples > 0`
>>   with `<= 1` and `> 0`, per Anholt.
>> - Don't downsample unnecessarily, found by Anholt.
>>
>> CC: Eric Anholt e...@anholt.net
>> CC: Paul Berry stereotype...@gmail.com
>> Signed-off-by: Chad Versace chad.vers...@linux.intel.com
>> ---
>>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 115 +++--
>>  src/mesa/drivers/dri/intel/intel_mipmap_tree.h |  18 
>>  2 files changed, 127 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> index 23d84c0..6ecb48f 100644
>> --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> +   if (!mt->singlesample_mt) {
>> +      map->singlesample_mt_is_tmp = true;
>> +      mt->need_downsample = true;
>
> Move this mt->need_downsample flag setup to after you've successfully
> alloced?

Done. That's a sensible change, and removes the need for unsetting
mt->need_downsample in the failure path.

>> +      mt->singlesample_mt =
>> +         intel_miptree_create_for_renderbuffer(intel,
>> +                                               mt->format,
>> +                                               mt->singlesample_width0,
>> +                                               mt->singlesample_height0,
>> +                                               0 /*num_samples*/);
>> +      if (!mt->singlesample_mt) {
>> +         mt->need_downsample = false;
>> +         goto fail;
>> +      }
>> +   }
>> +
>> +   if (mode & GL_MAP_INVALIDATE_RANGE_BIT)
>> +      mt->need_downsample = false;
>> +
>> +   intel_miptree_downsample(intel, mt);
>
> I don't think you can clear need_downsample for
> GL_MAP_INVALIDATE_RANGE_BIT, because the GL_MAP_WRITE_BIT case in the
> unmap (implied by INVALIDATE_RANGE) will upsample the whole singlesample
> buffer back, not just the mapped subset.  Dropping the INVALIDATE_RANGE
> gets the series up to this patch my r-b.

Ah, you're right. Thanks for catching that subtle error.
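
The reordering agreed on above (set need_downsample only once the allocation has succeeded, so the failure path has nothing to undo) can be sketched like this. The `struct miptree`, `alloc_singlesample` and `map_multisample` names are hypothetical, heavily reduced stand-ins, and the `simulate_alloc_failure` parameter exists only so the failure path can be exercised.

```c
#include <stdbool.h>
#include <stdlib.h>

struct miptree {                      /* hypothetical, much reduced */
   struct miptree *singlesample_mt;
   bool need_downsample;
};

/* Hypothetical allocator; returns NULL when asked to simulate failure. */
static struct miptree *
alloc_singlesample(bool fail)
{
   return fail ? NULL : calloc(1, sizeof(struct miptree));
}

static bool
map_multisample(struct miptree *mt, bool simulate_alloc_failure)
{
   if (!mt->singlesample_mt) {
      mt->singlesample_mt = alloc_singlesample(simulate_alloc_failure);
      if (!mt->singlesample_mt)
         return false;                /* flag untouched, nothing to unset */

      /* Set the flag only after the allocation has succeeded. */
      mt->need_downsample = true;
   }
   return true;
}
```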


Re: [Mesa-dev] [PATCH 15/15] intel: Advertise multisample DRI2 configs on gen >= 6

2012-08-07 Thread Chad Versace
On 08/06/2012 07:49 PM, Eric Anholt wrote:
> Chad Versace chad.vers...@linux.intel.com writes:
>> +   /* Generate multisample configs.
>> +    *
>> +    * This loop breaks early, and hence is a no-op, on gen < 6.
>> +    *
>> +    * Multisample configs must follow the singlesample configs in order to
>> +    * work around an X server bug present in 1.12. The X server chooses to
>> +    * associate the first listed RGBA888-Z24S8 config, regardless of its
>> +    * sample count, with the 32-bit depth visual used for compositing.
>> +    *
>> +    * Only doublebuffer configs with GLX_SWAP_UNDEFINED_OML behavior are
>> +    * supported. Singlebuffer configs are not supported because that would
>> +    * require that rendering be eventually written to the singlesample buffer
>> +    * even if DRI2Flush is never called; yet we downsample to the singlesample
>> +    * buffer only on DRI2Flush.  GLX_SWAP_COPY_OML is not supported because we
>> +    * have no tests for its interaction with MSAA.
>> +    */
>
> We actually need to remove our claiming of GLX_SWAP_COPY_OML in general,
> because pageflipping means that we don't actually support SWAP_COPY.  We
> only do UNDEFINED.
>
> I'd say instead singlebuffer configs are not supported because nobody
> wants them.

I'll update the comments to say:

    * Only doublebuffer configs with GLX_SWAP_UNDEFINED_OML behavior are
    * supported.  Singlebuffer configs are not supported because no one wants
    * them. GLX_SWAP_COPY_OML is not supported due to page flipping.
    */

I'll follow up later with a patch that removes advertising of SWAP_COPY configs.

> I think all you need is (pessimistically) intel_downsample_for_dri2_flush
> in intel_flush_front() to make front buffer rendering actually work, and
> it's a problem that exists even in a doublebuffer config.

Ah, I failed to realize that this problem exists even in a doublebuffer
config. I'll follow up later with a patch that downsamples in intel_flush_front.

> I think this concludes my review.  Great work!  I'm excited to see this
> finally land.

Thanks. Is this an implicit r-b on this patch?


Re: [Mesa-dev] [PATCH 15/15] intel: Advertise multisample DRI2 configs on gen >= 6

2012-08-07 Thread Eric Anholt
Chad Versace chad.vers...@linux.intel.com writes:

> On 08/06/2012 07:49 PM, Eric Anholt wrote:
>> Chad Versace chad.vers...@linux.intel.com writes:
>>> +   /* Generate multisample configs.
>>> +    *
>>> +    * This loop breaks early, and hence is a no-op, on gen < 6.
>>> +    *
>>> +    * Multisample configs must follow the singlesample configs in order to
>>> +    * work around an X server bug present in 1.12. The X server chooses to
>>> +    * associate the first listed RGBA888-Z24S8 config, regardless of its
>>> +    * sample count, with the 32-bit depth visual used for compositing.
>>> +    *
>>> +    * Only doublebuffer configs with GLX_SWAP_UNDEFINED_OML behavior are
>>> +    * supported. Singlebuffer configs are not supported because that would
>>> +    * require that rendering be eventually written to the singlesample buffer
>>> +    * even if DRI2Flush is never called; yet we downsample to the singlesample
>>> +    * buffer only on DRI2Flush.  GLX_SWAP_COPY_OML is not supported because we
>>> +    * have no tests for its interaction with MSAA.
>>> +    */
>>
>> We actually need to remove our claiming of GLX_SWAP_COPY_OML in general,
>> because pageflipping means that we don't actually support SWAP_COPY.  We
>> only do UNDEFINED.
>>
>> I'd say instead singlebuffer configs are not supported because nobody
>> wants them.
>
> I'll update the comments to say:
>
>     * Only doublebuffer configs with GLX_SWAP_UNDEFINED_OML behavior are
>     * supported.  Singlebuffer configs are not supported because no one wants
>     * them. GLX_SWAP_COPY_OML is not supported due to page flipping.
>     */
>
> I'll follow up later with a patch that removes advertising of SWAP_COPY
> configs.
>
>> I think all you need is (pessimistically) intel_downsample_for_dri2_flush
>> in intel_flush_front() to make front buffer rendering actually work, and
>> it's a problem that exists even in a doublebuffer config.
>
> Ah, I failed to realize that this problem exists even in a doublebuffer
> config. I'll follow up later with a patch that downsamples in
> intel_flush_front.
>
>> I think this concludes my review.  Great work!  I'm excited to see this
>> finally land.
>
> Thanks. Is this an implicit r-b on this patch?

Yeah.




Re: [Mesa-dev] [PATCH] i965/msaa: Add sample-alpha-to-coverage support for multiple render targets

2012-08-07 Thread Eric Anholt
Anuj Phogat anuj.pho...@gmail.com writes:

> Render Target Write message should include source zero alpha value when
> sample-alpha-to-coverage is enabled for an FBO with multiple render targets.
> Source zero alpha value is used as fragment coverage for all the render
> targets.
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index fefe2c7..7fc28ac 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1930,14 +1930,24 @@ fs_visitor::emit_color_write(int target, int index, int first_color_mrf)
>  {
>     int reg_width = c->dispatch_width / 8;
>     fs_inst *inst;
> -   fs_reg color = outputs[target];
> +   fs_reg color;
> +   bool src0_alpha_to_render_target = target > 0 &&
> +                                      c->key.nr_color_regions > 1 &&
> +                                      c->key.sample_alpha_to_coverage;
> +
> +   color = (src0_alpha_to_render_target && !index) ?
> +           outputs[0] :
> +           outputs[target];
>     fs_reg mrf;
>
>     /* If there's no color data to be written, skip it. */
>     if (color.file == BAD_FILE)
>        return;
>
> -   color.reg_offset += index;
> +   if (src0_alpha_to_render_target)
> +      color.reg_offset += !index ? 3 : index - 1;
> +   else
> +      color.reg_offset += index;

Ew, this is really awful.

How about instead...

> -      for (unsigned i = 0; i < this->output_components[target]; i++)
> -         emit_color_write(target, i, color_mrf);
> +      /* If src0_alpha_to_render_target is true, include source zero alpha
> +       * data in RenderTargetWrite message for targets > 0.
> +       */
> +      output_components = (target && src0_alpha_to_render_target) ?
> +                          (this->output_components[target] + 1) :
> +                          this->output_components[target];
>
> -      fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
> +      for (unsigned i = 0; i < output_components; i++)
> +         emit_color_write(target, i, color_mrf);

Replace all of this change with:

      if (src0_alpha_to_render_target) {
         emit_color_write(0, 3, color_mrf);
         color_mrf += reg_width;
      }
      for (unsigned i = 0; i < this->output_components[target]; i++)
         emit_color_write(target, i, color_mrf);

> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
> index 5ab0547..210b078 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -546,6 +546,8 @@ static void brw_wm_populate_key( struct brw_context *brw,
>     /* _NEW_BUFFERS */
>     key->nr_color_regions = ctx->DrawBuffer->_NumColorDrawBuffers;

Needs
      /* _NEW_MULTISAMPLE */
> +   key->sample_alpha_to_coverage = ctx->Multisample.SampleAlphaToCoverage;

and corresponding addition to the state struct below.
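
To make the suggested ordering concrete, here is a small sketch that only records which (target, component) pairs would be written and in what order: source zero alpha first for targets above 0 when alpha-to-coverage and multiple render targets are in use, then the target's own components. `struct write` and `plan_color_writes` are hypothetical names; the sketch deliberately does not model MRF register numbers or the real emit_color_write.

```c
#include <stdbool.h>
#include <stddef.h>

struct write { int target; int component; };   /* hypothetical record */

/* Fill out with the planned writes for one render target and return how
 * many were recorded.  out must have room for num_components + 1 entries. */
static size_t
plan_color_writes(struct write *out, int target, int num_components,
                  int nr_color_regions, bool sample_alpha_to_coverage)
{
   size_t n = 0;
   int i;
   bool src0_alpha_to_render_target =
      target > 0 && nr_color_regions > 1 && sample_alpha_to_coverage;

   if (src0_alpha_to_render_target) {
      /* Alpha (component 3) of render target 0 goes first. */
      out[n].target = 0;
      out[n].component = 3;
      n++;
   }
   for (i = 0; i < num_components; i++) {
      out[n].target = target;
      out[n].component = i;
      n++;
   }
   return n;
}
```

Keeping the extra alpha write outside the per-component loop avoids the index remapping that the review objects to.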




[Mesa-dev] [PATCH 1/8] intel: Rename INTEL_DEBUG=fall to INTEL_DEBUG=perf.

2012-08-07 Thread Eric Anholt
I want to introduce some more debug output for performance surprises, which
include fallbacks but aren't necessarily software rasterization.  Leave
INTEL_DEBUG=fall in place for those that have used that flag before.
---
 src/mesa/drivers/dri/i915/i915_program.c|2 +-
 src/mesa/drivers/dri/i915/intel_tris.c  |4 ++--
 src/mesa/drivers/dri/i965/brw_fallback.c|2 +-
 src/mesa/drivers/dri/i965/brw_urb.c |2 +-
 src/mesa/drivers/dri/intel/intel_context.c  |3 ++-
 src/mesa/drivers/dri/intel/intel_context.h  |4 ++--
 src/mesa/drivers/dri/intel/intel_tex_copy.c |4 ++--
 7 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/i915_program.c b/src/mesa/drivers/dri/i915/i915_program.c
index 0a600d3..4437167 100644
--- a/src/mesa/drivers/dri/i915/i915_program.c
+++ b/src/mesa/drivers/dri/i915/i915_program.c
@@ -442,7 +442,7 @@ i915_emit_param4fv(struct i915_fragment_program * p, const GLfloat * values)
 void
 i915_program_error(struct i915_fragment_program *p, const char *fmt, ...)
 {
-   if (unlikely((INTEL_DEBUG & (DEBUG_WM | DEBUG_FALLBACKS)) != 0)) {
+   if (unlikely((INTEL_DEBUG & (DEBUG_WM | DEBUG_PERF)) != 0)) {
       va_list args;
 
       fprintf(stderr, "i915_program_error: ");
diff --git a/src/mesa/drivers/dri/i915/intel_tris.c b/src/mesa/drivers/dri/i915/intel_tris.c
index 5954b24..549af5e 100644
--- a/src/mesa/drivers/dri/i915/intel_tris.c
+++ b/src/mesa/drivers/dri/i915/intel_tris.c
@@ -1223,7 +1223,7 @@ intelFallback(struct intel_context *intel, GLbitfield bit, bool mode)
          assert(!intel->tnl_pipeline_running);
 
          intel_flush(ctx);
-         if (INTEL_DEBUG & DEBUG_FALLBACKS)
+         if (INTEL_DEBUG & DEBUG_PERF)
             fprintf(stderr, "ENTER FALLBACK %x: %s\n",
                     bit, getFallbackString(bit));
          _swsetup_Wakeup(ctx);
@@ -1236,7 +1236,7 @@ intelFallback(struct intel_context *intel, GLbitfield bit, bool mode)
          assert(!intel->tnl_pipeline_running);
 
          _swrast_flush(ctx);
-         if (INTEL_DEBUG & DEBUG_FALLBACKS)
+         if (INTEL_DEBUG & DEBUG_PERF)
             fprintf(stderr, "LEAVE FALLBACK %s\n", getFallbackString(bit));
          tnl->Driver.Render.Start = intelRenderStart;
          tnl->Driver.Render.PrimitiveNotify = intelRenderPrimitive;
diff --git a/src/mesa/drivers/dri/i965/brw_fallback.c b/src/mesa/drivers/dri/i965/brw_fallback.c
index 81fc23a..1ae6fc8 100644
--- a/src/mesa/drivers/dri/i965/brw_fallback.c
+++ b/src/mesa/drivers/dri/i965/brw_fallback.c
@@ -37,7 +37,7 @@
 #include "tnl/tnl.h"
 #include "brw_context.h"
 
-#define FILE_DEBUG_FLAG DEBUG_FALLBACKS
+#define FILE_DEBUG_FLAG DEBUG_PERF
 
 static bool do_check_fallback(struct brw_context *brw)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_urb.c b/src/mesa/drivers/dri/i965/brw_urb.c
index 7643dc2..b1126b5 100644
--- a/src/mesa/drivers/dri/i965/brw_urb.c
+++ b/src/mesa/drivers/dri/i965/brw_urb.c
@@ -190,7 +190,7 @@ static void recalculate_urb_fence( struct brw_context *brw )
             exit(1);
          }
 
-         if (unlikely(INTEL_DEBUG & (DEBUG_URB|DEBUG_FALLBACKS)))
+         if (unlikely(INTEL_DEBUG & (DEBUG_URB|DEBUG_PERF)))
             printf("URB CONSTRAINED\n");
       }
 
diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c
index 759fead..a39462b 100644
--- a/src/mesa/drivers/dri/intel/intel_context.c
+++ b/src/mesa/drivers/dri/intel/intel_context.c
@@ -427,7 +427,8 @@ static const struct dri_debug_control debug_control[] = {
    { "ioctl", DEBUG_IOCTL},
    { "blit",  DEBUG_BLIT},
    { "mip",   DEBUG_MIPTREE},
-   { "fall",  DEBUG_FALLBACKS},
+   { "fall",  DEBUG_PERF},
+   { "perf",  DEBUG_PERF},
    { "verb",  DEBUG_VERBOSE},
    { "bat",   DEBUG_BATCH},
    { "pix",   DEBUG_PIXEL},
diff --git a/src/mesa/drivers/dri/intel/intel_context.h b/src/mesa/drivers/dri/intel/intel_context.h
index 29ab187..6d1a81c 100644
--- a/src/mesa/drivers/dri/intel/intel_context.h
+++ b/src/mesa/drivers/dri/intel/intel_context.h
@@ -430,7 +430,7 @@ extern int INTEL_DEBUG;
 #define DEBUG_IOCTL     0x4
 #define DEBUG_BLIT      0x8
 #define DEBUG_MIPTREE   0x10
-#define DEBUG_FALLBACKS 0x20
+#define DEBUG_PERF      0x20
 #define DEBUG_VERBOSE   0x40
 #define DEBUG_BATCH     0x80
 #define DEBUG_PIXEL     0x100
@@ -459,7 +459,7 @@ extern int INTEL_DEBUG;
 } while(0)
 
 #define fallback_debug(...) do {                \
-   if (unlikely(INTEL_DEBUG & DEBUG_FALLBACKS)) \
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF))      \
       printf(__VA_ARGS__);                      \
 } while(0)
 
diff --git a/src/mesa/drivers/dri/intel/intel_tex_copy.c b/src/mesa/drivers/dri/intel/intel_tex_copy.c
index 6da4ec6..f436633 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_copy.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_copy.c
@@ -61,7 +61,7 @@ intel_copy_texsubimage(struct intel_context *intel,
intel_prepare_render(intel);
 
  

[Mesa-dev] [PATCH 2/8] i965: Add INTEL_DEBUG=perf for failure to compile 16-wide shaders.

2012-08-07 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  |5 -
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp |3 ++-
 src/mesa/drivers/dri/intel/intel_context.h|5 +
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d06858e..298c708 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2078,7 +2078,10 @@ brw_wm_fs_emit(struct brw_context *brw, struct brw_wm_compile *c,
       c->dispatch_width = 16;
       fs_visitor v2(c, prog, shader);
       v2.import_uniforms(&v);
-      v2.run();
+      if (!v2.run()) {
+         perf_debug("16-wide shader failed to compile, falling back to "
+                    "8-wide at a 10-20%% performance cost: %s", v2.fail_msg);
+      }
    }
 
    c->prog_data.dispatch_width = 8;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
index 7618047..e7f11ae 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
@@ -238,7 +238,8 @@ fs_visitor::assign_regs()
       if (reg == -1) {
          fail("no register to spill\n");
       } else if (c->dispatch_width == 16) {
-         fail("no spilling support on 16-wide yet\n");
+         fail("Failure to register allocate.  Reduce number of live scalar "
+              "values to avoid this.");
       } else {
          spill_reg(reg);
       }
diff --git a/src/mesa/drivers/dri/intel/intel_context.h b/src/mesa/drivers/dri/intel/intel_context.h
index 6d1a81c..c4efa54 100644
--- a/src/mesa/drivers/dri/intel/intel_context.h
+++ b/src/mesa/drivers/dri/intel/intel_context.h
@@ -463,6 +463,11 @@ extern int INTEL_DEBUG;
       printf(__VA_ARGS__);                      \
 } while(0)
 
+#define perf_debug(...) do {                    \
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF))      \
+      printf(__VA_ARGS__);                      \
+} while(0)
+
+
 #define PCI_CHIP_845_G 0x2562
 #define PCI_CHIP_I830_M0x3577
 #define PCI_CHIP_I855_GM   0x3582
-- 
1.7.10.4



[Mesa-dev] intel: performance debug flag.

2012-08-07 Thread Eric Anholt
One of Valve's requests was for GL_ARB_debug_output for performance traps they
should know about.  Unfortunately, Mesa's ARB_debug_output support is very
limited at the moment, so this just gets messages in place, which we can
convert to GL_ARB_debug_output at some later time.



[Mesa-dev] [PATCH 3/8] i965: Add performance debug for register spilling.

2012-08-07 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/brw_vs.c |    4 ++++
 src/mesa/drivers/dri/i965/brw_wm.c |    4 ++++
 2 files changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c
index b1b073e..5120167 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -254,6 +254,10 @@ do_vs_prog(struct brw_context *brw,
 
    /* Scratch space is used for register spilling */
    if (c.last_scratch) {
+      perf_debug("Vertex shader triggered register spilling.  "
+                 "Try reducing the number of live vec4 values to "
+                 "improve performance.\n");
+
       c.prog_data.total_scratch = brw_get_scratch_size(c.last_scratch);
 
       brw_get_scratch_bo(intel, &brw->vs.scratch_bo,
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
index 5ab0547..3abc696 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -321,6 +321,10 @@ bool do_wm_prog(struct brw_context *brw,
 
    /* Scratch space is used for register spilling */
    if (c->last_scratch) {
+      perf_debug("Fragment shader triggered register spilling.  "
+                 "Try reducing the number of live scalar values to "
+                 "improve performance.\n");
+
       c->prog_data.total_scratch = brw_get_scratch_size(c->last_scratch);
 
       brw_get_scratch_bo(intel, &brw->wm.scratch_bo,
-- 
1.7.10.4



[Mesa-dev] [PATCH 4/8] intel: Add performance debug for some common GPU stalls.

2012-08-07 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/brw_queryobj.c          |    6 ++++++
 src/mesa/drivers/dri/intel/intel_buffer_objects.c |    8 +++++++-
 src/mesa/drivers/dri/intel/intel_regions.c        |    6 ++++++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 240fe32..f84ad0e 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -58,6 +58,12 @@ brw_queryobj_get_results(struct gl_context *ctx,
    if (query->bo == NULL)
       return;
 
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF)) {
+      if (drm_intel_bo_busy(query->bo)) {
+         perf_debug("Stalling on the GPU waiting for a query object.\n");
+      }
+   }
+
    drm_intel_bo_map(query->bo, false);
    results = query->bo->virtual;
    switch (query->Base.Target) {
diff --git a/src/mesa/drivers/dri/intel/intel_buffer_objects.c b/src/mesa/drivers/dri/intel/intel_buffer_objects.c
index 37dc75c..df8ac7f 100644
--- a/src/mesa/drivers/dri/intel/intel_buffer_objects.c
+++ b/src/mesa/drivers/dri/intel/intel_buffer_objects.c
@@ -212,7 +212,8 @@ intel_bufferobj_subdata(struct gl_context * ctx,
          intel_bufferobj_alloc_buffer(intel, intel_obj);
          drm_intel_bo_subdata(intel_obj->buffer, 0, size, data);
       } else {
-         /* Use the blitter to upload the new data. */
+         perf_debug("Using a blit copy to avoid stalling on glBufferSubData() "
+                    "to a busy buffer object.\n");
          drm_intel_bo *temp_bo =
             drm_intel_bo_alloc(intel->bufmgr, "subdata temp", size, 64);
 
@@ -226,6 +227,11 @@ intel_bufferobj_subdata(struct gl_context * ctx,
          drm_intel_bo_unreference(temp_bo);
       }
    } else {
+      if (unlikely(INTEL_DEBUG & DEBUG_PERF)) {
+         if (drm_intel_bo_busy(intel_obj->buffer)) {
+            perf_debug("Stalling on the GPU in glBufferSubData().\n");
+         }
+      }
       drm_intel_bo_subdata(intel_obj->buffer, offset, size, data);
    }
 }
diff --git a/src/mesa/drivers/dri/intel/intel_regions.c b/src/mesa/drivers/dri/intel/intel_regions.c
index 1ef1ac6..9bf9c66 100644
--- a/src/mesa/drivers/dri/intel/intel_regions.c
+++ b/src/mesa/drivers/dri/intel/intel_regions.c
@@ -123,6 +123,12 @@ intel_region_map(struct intel_context *intel, struct intel_region *region,
     * flush is only needed on first map of the buffer.
     */
 
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF)) {
+      if (drm_intel_bo_busy(region->bo)) {
+         perf_debug("Mapping a busy BO, causing a stall on the GPU.\n");
+      }
+   }
+
    _DBG("%s %p\n", __FUNCTION__, region);
    if (!region->map_refcount) {
       intel_flush(&intel->ctx);
-- 
1.7.10.4
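
The three hunks in this patch share one shape: ask whether the buffer is still
busy just before an operation that would implicitly wait for it, and only do
so when perf debugging is on.  A rough sketch of that shape follows.  The
struct and function names are hypothetical stand-ins, not libdrm's API, and
the wait is modelled by simply clearing a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer object; `busy` models "the GPU still uses this". */
struct sketch_bo {
   bool busy;
   char data[16];
};

/* Number of stalls that would have been reported. */
unsigned sketch_stall_count = 0;

/* Map the buffer for CPU access.  A real map blocks until the GPU is done;
 * here that wait is modelled by clearing `busy`. */
char *sketch_bo_map(struct sketch_bo *bo, bool debug_perf)
{
   assert(bo != NULL);

   /* Only pay for the busy query when perf debugging is enabled. */
   if (debug_perf && bo->busy)
      sketch_stall_count++;

   bo->busy = false;   /* models the implicit wait inside the map call */
   return bo->data;
}
```

The busy query sits behind the debug check because it is extra work that a
normal run has no use for.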



[Mesa-dev] [PATCH 6/8] i965: Add performance debug for shader recompiles.

2012-08-07 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/brw_context.h |2 +
 src/mesa/drivers/dri/i965/brw_fs.cpp|6 ++
 src/mesa/drivers/dri/i965/brw_program.h |2 +
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp |6 ++
 src/mesa/drivers/dri/i965/brw_wm.c  |   84 +++
 src/mesa/drivers/dri/i965/brw_wm.h  |3 +
 6 files changed, 103 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 8a082ab..bc43557 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -275,6 +275,8 @@ struct brw_fragment_program {
 struct brw_shader {
    struct gl_shader base;
 
+   bool compiled_once;
+
    /** Shader IR transformed for native compile, at link time. */
    struct exec_list *ir;
 };
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 298c708..90a1d92 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2086,6 +2086,12 @@ brw_wm_fs_emit(struct brw_context *brw, struct brw_wm_compile *c,
 
    c->prog_data.dispatch_width = 8;
 
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF)) {
+      if (shader->compiled_once)
+         brw_wm_debug_recompile(brw, prog, &c->key);
+      shader->compiled_once = true;
+   }
+
    return true;
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_program.h b/src/mesa/drivers/dri/i965/brw_program.h
index 874238f..9fbc201 100644
--- a/src/mesa/drivers/dri/i965/brw_program.h
+++ b/src/mesa/drivers/dri/i965/brw_program.h
@@ -45,5 +45,7 @@ struct brw_sampler_prog_key_data {
 void brw_populate_sampler_prog_key_data(struct gl_context *ctx,
                                         const struct gl_program *prog,
                                         struct brw_sampler_prog_key_data *key);
+bool brw_debug_recompile_sampler_key(const struct brw_sampler_prog_key_data *old_key,
+                                     const struct brw_sampler_prog_key_data *key);
 
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index 9df7b11..788d7b5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -1031,6 +1031,10 @@ brw_vs_emit(struct gl_shader_program *prog, struct brw_vs_compile *c)
       printf("\n\n");
    }
 
+   if (shader->compiled_once) {
+      perf_debug("Recompiling vertex shader for program %d\n", prog->Name);
+   }
+
    vec4_visitor v(c, prog, shader);
    if (!v.run()) {
       prog->LinkStatus = false;
@@ -1038,6 +1042,8 @@ brw_vs_emit(struct gl_shader_program *prog, struct brw_vs_compile *c)
       return false;
    }
 
+   shader->compiled_once = true;
+
    return true;
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
index 3abc696..323eabd 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -347,6 +347,90 @@ bool do_wm_prog(struct brw_context *brw,
    return true;
 }
 
+static bool
+key_debug(const char *name, int a, int b)
+{
+   if (a != b) {
+      perf_debug("  %s %d->%d\n", name, a, b);
+      return true;
+   } else {
+      return false;
+   }
+}
+
+bool
+brw_debug_recompile_sampler_key(const struct brw_sampler_prog_key_data *old_key,
+                                const struct brw_sampler_prog_key_data *key)
+{
+   bool found = false;
+
+   for (unsigned int i = 0; i < BRW_MAX_TEX_UNIT; i++) {
+      found |= key_debug("EXT_texture_swizzle or DEPTH_TEXTURE_MODE",
+                         key->swizzles[i], old_key->swizzles[i]);
+   }
+   found |= key_debug("GL_CLAMP enabled on any texture unit's 1st coordinate",
+                      key->gl_clamp_mask[0], old_key->gl_clamp_mask[0]);
+   found |= key_debug("GL_CLAMP enabled on any texture unit's 2nd coordinate",
+                      key->gl_clamp_mask[1], old_key->gl_clamp_mask[1]);
+   found |= key_debug("GL_CLAMP enabled on any texture unit's 3rd coordinate",
+                      key->gl_clamp_mask[2], old_key->gl_clamp_mask[2]);
+   found |= key_debug("GL_MESA_ycbcr texturing\n",
+                      key->yuvtex_mask, old_key->yuvtex_mask);
+   found |= key_debug("GL_MESA_ycbcr UV swapping\n",
+                      key->yuvtex_swap_mask, old_key->yuvtex_swap_mask);
+
+   return found;
+}
+
+void
+brw_wm_debug_recompile(struct brw_context *brw,
+                       struct gl_shader_program *prog,
+                       const struct brw_wm_prog_key *key)
+{
+   struct brw_cache_item *c = NULL;
+   const struct brw_wm_prog_key *old_key = NULL;
+   bool found = false;
+
+   perf_debug("Recompiling fragment shader for program %d\n", prog->Name);
+
+   for (unsigned int i = 0; i < brw->cache.size; i++) {
+      for (c = brw->cache.items[i]; c; c = c->next) {
+         if (c->cache_id == BRW_WM_PROG) {
+            old_key = c->key;
+
+            if (old_key->program_string_id == key->program_string_id)
+               break;
+         }
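
The key_debug() helper in this patch reports which program-key field changed
between two compiles.  A stripped-down sketch of that idea, with a
hypothetical two-field key (struct sketch_key and both function names are
made up for illustration; the real keys carry many more fields):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical key, far smaller than the driver's real one. */
struct sketch_key {
   int clamp_mask;
   int yuvtex_mask;
};

/* Report one field if it changed between the old and the new key. */
bool sketch_key_debug(const char *name, int old_value, int new_value)
{
   if (old_value == new_value)
      return false;

   printf("  %s %d->%d\n", name, old_value, new_value);
   return true;
}

/* Returns true if any field differs, printing every difference. */
bool sketch_key_diff(const struct sketch_key *old_key,
                     const struct sketch_key *key)
{
   bool found = false;

   assert(old_key != NULL && key != NULL);

   /* |= rather than || so that every differing field gets printed. */
   found |= sketch_key_debug("clamp mask",
                             old_key->clamp_mask, key->clamp_mask);
   found |= sketch_key_debug("yuvtex mask",
                             old_key->yuvtex_mask, key->yuvtex_mask);
   return found;
}
```

Accumulating with |= matters here: a short-circuiting || would stop at the
first changed field and hide the rest.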
+  

[Mesa-dev] [PATCH 5/8] i965: Add performance debug for fast clear fallbacks.

2012-08-07 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/brw_clear.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
index 31c2e45..71d7c48 100644
--- a/src/mesa/drivers/dri/i965/brw_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_clear.c
@@ -107,14 +107,22 @@ brw_fast_clear_depth(struct gl_context *ctx)
     * a previous clear had happened at a different clear value and resolve it
     * first.
     */
-   if (ctx->Scissor.Enabled)
+   if (ctx->Scissor.Enabled) {
+      perf_debug("Failed to fast clear depth due to scissor being enabled.  "
+                 "Possible 5%% performance win if avoided.\n");
       return false;
+   }
 
    /* The rendered area has to be 8x4 samples, not resolved pixels, so we look
     * at the miptree slice dimensions instead of renderbuffer size.
     */
    if (mt->level[depth_irb->mt_level].width % 8 != 0 ||
        mt->level[depth_irb->mt_level].height % 4 != 0) {
+      perf_debug("Failed to fast clear depth due to width/height %d,%d not "
+                 "being aligned to 8,4.  Possible 5%% performance win if "
+                 "avoided\n",
+                 mt->level[depth_irb->mt_level].width,
+                 mt->level[depth_irb->mt_level].height);
       return false;
    }
 
 
-- 
1.7.10.4
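
As a quick illustration of the alignment condition above, here is the 8x4
test pulled out into a standalone predicate.  The function name is
hypothetical; the patch itself performs the check inline on the miptree slice
dimensions.

```c
#include <assert.h>
#include <stdbool.h>

/* True when a depth slice of the given size meets the 8x4 alignment the
 * patch checks before attempting a fast clear. */
bool sketch_fast_clear_aligned(unsigned width, unsigned height)
{
   return width % 8 == 0 && height % 4 == 0;
}
```

For example, 1024x768 passes, while 1366x768 does not because 1366 is not a
multiple of 8.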



[Mesa-dev] [PATCH 8/8] i965: Add perf debug for stalls during shader compiles.

2012-08-07 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/brw_fs.cpp        |   13 +++++++++++++
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp |   20 ++++++++++++++++++--
 src/mesa/drivers/dri/intel/intel_screen.c   |   13 +++++++++++++
 src/mesa/drivers/dri/intel/intel_screen.h   |    1 +
 4 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 90a1d92..dfd101f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2044,10 +2044,18 @@ brw_wm_fs_emit(struct brw_context *brw, struct brw_wm_compile *c,
                struct gl_shader_program *prog)
 {
    struct intel_context *intel = &brw->intel;
+   bool start_busy = false;
+   float start_time = 0;
 
    if (!prog)
       return false;
 
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF)) {
+      start_busy = (intel->batch.last_bo &&
+                    drm_intel_bo_busy(intel->batch.last_bo));
+      start_time = get_time();
+   }
+
    struct brw_shader *shader =
      (brw_shader *) prog->_LinkedShaders[MESA_SHADER_FRAGMENT];
    if (!shader)
@@ -2090,6 +2098,11 @@ brw_wm_fs_emit(struct brw_context *brw, struct brw_wm_compile *c,
       if (shader->compiled_once)
          brw_wm_debug_recompile(brw, prog, &c->key);
       shader->compiled_once = true;
+
+      if (start_busy && !drm_intel_bo_busy(intel->batch.last_bo)) {
+         perf_debug("FS compile took %.03f ms and stalled the GPU\n",
+                    (get_time() - start_time) / 1000);
+      }
    }
 
    return true;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index 788d7b5..0db435b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -1017,9 +1017,19 @@ extern "C" {
 bool
 brw_vs_emit(struct gl_shader_program *prog, struct brw_vs_compile *c)
 {
+   struct intel_context *intel = &c->func.brw->intel;
+   bool start_busy = false;
+   float start_time = 0;
+
    if (!prog)
       return false;
 
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF)) {
+      start_busy = (intel->batch.last_bo &&
+                    drm_intel_bo_busy(intel->batch.last_bo));
+      start_time = get_time();
+   }
+
    struct brw_shader *shader =
      (brw_shader *) prog->_LinkedShaders[MESA_SHADER_VERTEX];
    if (!shader)
@@ -1031,8 +1041,14 @@ brw_vs_emit(struct gl_shader_program *prog, struct brw_vs_compile *c)
       printf("\n\n");
    }
 
-   if (shader->compiled_once) {
-      perf_debug("Recompiling vertex shader for program %d\n", prog->Name);
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF)) {
+      if (shader->compiled_once) {
+         perf_debug("Recompiling vertex shader for program %d\n", prog->Name);
+      }
+      if (start_busy && !drm_intel_bo_busy(intel->batch.last_bo)) {
+         perf_debug("VS compile took %.03f ms and stalled the GPU\n",
+                    (get_time() - start_time) / 1000);
+      }
    }
 
    vec4_visitor v(c, prog, shader);
diff --git a/src/mesa/drivers/dri/intel/intel_screen.c b/src/mesa/drivers/dri/intel/intel_screen.c
index 5c38c8d..56abc12 100644
--- a/src/mesa/drivers/dri/intel/intel_screen.c
+++ b/src/mesa/drivers/dri/intel/intel_screen.c
@@ -109,6 +109,19 @@ const GLuint __driNConfigOptions = 15;
 static PFNGLXCREATECONTEXTMODES create_context_modes = NULL;
 #endif /*USE_NEW_INTERFACE */
 
+/**
+ * For debugging, this returns a time in seconds since the first call.
+ */
+double
+get_time(void)
+{
+   struct timespec tp;
+
+   clock_gettime(CLOCK_MONOTONIC, &tp);
+
+   return tp.tv_sec + tp.tv_nsec / 1000000000.0;
+}
+
 void
 aub_dump_bmp(struct gl_context *ctx)
 {
diff --git a/src/mesa/drivers/dri/intel/intel_screen.h b/src/mesa/drivers/dri/intel/intel_screen.h
index c0cc284..f5a374d 100644
--- a/src/mesa/drivers/dri/intel/intel_screen.h
+++ b/src/mesa/drivers/dri/intel/intel_screen.h
@@ -81,6 +81,7 @@ intelMakeCurrent(__DRIcontext * driContextPriv,
                  __DRIdrawable * driDrawPriv,
                  __DRIdrawable * driReadPriv);
 
+double get_time(void);
 void aub_dump_bmp(struct gl_context *ctx);
 
 #endif
-- 
1.7.10.4
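
The timing in this patch can be sketched on its own as follows.  The names
sketch_get_time() and sketch_elapsed_ms() are hypothetical.  One assumption
here differs from the patch as posted: if get_time() returns seconds, as its
comment says, then converting an interval to milliseconds is a multiplication
by 1000, and holding the start time in a double rather than a float avoids
losing precision on large monotonic clock values.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <time.h>

/* Monotonic time in seconds, as a double. */
double sketch_get_time(void)
{
   struct timespec tp;
   int ret = clock_gettime(CLOCK_MONOTONIC, &tp);

   assert(ret == 0);
   (void) ret;   /* keeps NDEBUG builds free of unused-variable warnings */

   return tp.tv_sec + tp.tv_nsec / 1000000000.0;
}

/* Convert an interval measured in seconds to milliseconds. */
double sketch_elapsed_ms(double start_seconds, double end_seconds)
{
   return (end_seconds - start_seconds) * 1000.0;
}
```

A caller would record sketch_get_time() before the compile and pass both
timestamps to sketch_elapsed_ms() afterwards.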



[Mesa-dev] [PATCH 7/8] i965: Add performance debug for when the state cache gets nuked.

2012-08-07 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/brw_state_cache.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_cache.c b/src/mesa/drivers/dri/i965/brw_state_cache.c
index 4ae8e12..57a5ee9 100644
--- a/src/mesa/drivers/dri/i965/brw_state_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_state_cache.c
@@ -375,8 +375,11 @@ brw_state_cache_check_size(struct brw_context *brw)
    /* un-tuned guess.  Each object is generally a page, so 1000 of them is 4 MB of
     * state cache.
     */
-   if (brw->cache.n_items > 1000)
+   if (brw->cache.n_items > 1000) {
+      perf_debug("Exceeded state cache size limit.  Clearing the set "
+                 "of compiled programs, which will trigger recompiles\n");
       brw_clear_cache(brw, &brw->cache);
+   }
 }
 
 
-- 
1.7.10.4
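
The "1000 objects is about 4 MB" estimate in the comment, and the
clear-on-overflow behaviour, can be sketched like this.  Everything here is
hypothetical illustration: struct sketch_cache, the function names, and the
4096-byte page size are assumptions, not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CACHE_LIMIT 1000u   /* item count, as in the patch */
#define SKETCH_PAGE_SIZE   4096u   /* assumed page size in bytes */

struct sketch_cache {
   unsigned n_items;
   unsigned clears;   /* how many times the cache has been emptied */
};

/* Approximate footprint, assuming one page per cached object. */
size_t sketch_cache_bytes(const struct sketch_cache *cache)
{
   return (size_t) cache->n_items * SKETCH_PAGE_SIZE;
}

/* Empty the cache once it exceeds the limit; returns true if it did. */
bool sketch_cache_check_size(struct sketch_cache *cache)
{
   assert(cache != NULL);

   if (cache->n_items <= SKETCH_CACHE_LIMIT)
      return false;

   cache->n_items = 0;
   cache->clears++;
   return true;
}
```

At exactly 1000 items the estimate is 4,096,000 bytes, and the cache is only
cleared once the count goes past that.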



Re: [Mesa-dev] [PATCH 2/8] i965: Add INTEL_DEBUG=perf for failure to compile 16-wide shaders.

2012-08-07 Thread Jordan Justen
On Tue, 2012-08-07 at 11:04 -0700, Eric Anholt wrote:
 diff --git a/src/mesa/drivers/dri/intel/intel_context.h b/src/mesa/drivers/dri/intel/intel_context.h
 index 6d1a81c..c4efa54 100644
 --- a/src/mesa/drivers/dri/intel/intel_context.h
 +++ b/src/mesa/drivers/dri/intel/intel_context.h
 @@ -463,6 +463,11 @@ extern int INTEL_DEBUG;
                  printf(__VA_ARGS__);                    \
   } while(0)
  
 +#define perf_debug(...) do {                                    \
 +   if (unlikely(INTEL_DEBUG & DEBUG_PERF))                      \
 +      printf(__VA_ARGS__);                                      \
 +} while(0)

Should perf_debug be used in the paths in PATCH 1/8?

For series:
Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

-Jordan




Re: [Mesa-dev] [PATCH] glsl: Add a lowering pass to turn complicated UBO references to vector loads.

2012-08-07 Thread Kenneth Graunke
On 08/06/2012 07:00 PM, Eric Anholt wrote:
 v2: Reduce the impenetrable code in emit_ubo_loads() by 23 lines by keeping
 the ir_variable as the variable part of the offset from handle_rvalue(),
 and track the constant offsets from that with a plain old integer value,
 avoiding a bunch of temporary variables in the array and struct handling.
 Also, fix file description doxygen.
 ---
  src/glsl/Makefile.sources|1 +
  src/glsl/ir_optimization.h   |1 +
  src/glsl/lower_ubo_reference.cpp |  313 
 ++
  3 files changed, 315 insertions(+)
  create mode 100644 src/glsl/lower_ubo_reference.cpp
 
 diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
 index f2743f7..765f06a 100644
 --- a/src/glsl/Makefile.sources
 +++ b/src/glsl/Makefile.sources
 @@ -66,6 +66,7 @@ LIBGLSL_CXX_FILES = \
   $(GLSL_SRCDIR)/lower_vec_index_to_swizzle.cpp \
   $(GLSL_SRCDIR)/lower_vector.cpp \
   $(GLSL_SRCDIR)/lower_output_reads.cpp \
 + $(GLSL_SRCDIR)/lower_ubo_reference.cpp \
   $(GLSL_SRCDIR)/opt_algebraic.cpp \
   $(GLSL_SRCDIR)/opt_array_splitting.cpp \
   $(GLSL_SRCDIR)/opt_constant_folding.cpp \
 diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
 index c435d77..2220d51 100644
 --- a/src/glsl/ir_optimization.h
 +++ b/src/glsl/ir_optimization.h
 @@ -74,6 +74,7 @@ bool lower_variable_index_to_cond_assign(exec_list 
 *instructions,
  bool lower_quadop_vector(exec_list *instructions, bool dont_lower_swz);
  bool lower_clip_distance(exec_list *instructions);
  void lower_output_reads(exec_list *instructions);
 +void lower_ubo_reference(struct gl_shader *shader, exec_list *instructions);
  bool optimize_redundant_jumps(exec_list *instructions);
  bool optimize_split_arrays(exec_list *instructions, bool linked);
  
 diff --git a/src/glsl/lower_ubo_reference.cpp b/src/glsl/lower_ubo_reference.cpp
 new file mode 100644
 index 0000000..f930da5
 --- /dev/null
 +++ b/src/glsl/lower_ubo_reference.cpp
 @@ -0,0 +1,313 @@
 +/*
 + * Copyright © 2012 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the "Software"),
 + * to deal in the Software without restriction, including without limitation
 + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the next
 + * paragraph) shall be included in all copies or substantial portions of the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +/**
 + * \file lower_ubo_reference.cpp
 + *
 + * IR lower pass to replace dereferences of variables in a uniform
 + * buffer object with usage of ir_binop_ubo_load expressions, each of
 + * which can read data up to the size of a vec4.
 + *
 + * This relieves drivers of the responsibility to deal with tricky UBO
 + * layout issues like std140 structures and row_major matrices on
 + * their own.
 + */
 +
 +#include "ir.h"
 +#include "ir_builder.h"
 +#include "ir_rvalue_visitor.h"
 +#include "main/macros.h"
 +
 +using namespace ir_builder;
 +
 +namespace {
 +class lower_ubo_reference_visitor : public ir_rvalue_enter_visitor {
 +public:
 +   lower_ubo_reference_visitor(struct gl_shader *shader)
 +   : shader(shader)
 +   {
 +   }
 +
 +   void handle_rvalue(ir_rvalue **rvalue);
 +   void emit_ubo_loads(ir_dereference *deref, ir_variable *base_offset,
 +unsigned int deref_offset);
 +   ir_expression *ubo_load(const struct glsl_type *type,
 +ir_rvalue *offset);
 +
 +   void *mem_ctx;
 +   struct gl_shader *shader;
 +   struct gl_uniform_buffer_variable *ubo_var;
 +   unsigned uniform_block;
 +   bool progress;
 +};
 +
 +static inline unsigned int
 +align(unsigned int a, unsigned int align)
 +{
 +   return (a + align - 1) / align * align;
 +}
 +
 +void
 +lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue)
 +{
 +   if (!*rvalue)
 +  return;
 +
 +   ir_dereference *deref = (*rvalue)->as_dereference();
 +   if (!deref)
 +      return;
 +
 +   ir_variable *var = deref->variable_referenced();
 +   if (!var || var->uniform_block == -1)
 +      return;
 +
 +   mem_ctx = ralloc_parent(*rvalue);
 +   uniform_block = var->uniform_block;
 +   struct gl_uniform_block *block = 
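
The align() helper quoted above rounds an offset up to the next multiple of
an alignment, which is the basic operation when laying out UBO members.  A
standalone sketch, with a hypothetical name and an added precondition that
the alignment be nonzero:

```c
#include <assert.h>

/* Round `a` up to the next multiple of `alignment` (alignment > 0).
 * Works for any positive alignment, not only powers of two. */
unsigned int sketch_align(unsigned int a, unsigned int alignment)
{
   assert(alignment > 0);
   return (a + alignment - 1) / alignment * alignment;
}
```

For instance, with the 16-byte alignment that std140 gives a vec4, an offset
of 17 rounds up to 32, while an offset of 16 stays at 16.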

[Mesa-dev] [PATCH] mesa: Fix glPopAttrib() behavior on GL_FRAMEBUFFER_SRGB.

2012-08-07 Thread Eric Anholt
I happened to notice this while looking at a blit pass in l4d2, which had an
optional push/pop around framebuffer srgb setting.  It didn't matter in the
end, but the fix is sitting in my tree now.
---
 src/mesa/main/attrib.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
index 8bc7c34..9cab35b 100644
--- a/src/mesa/main/attrib.c
+++ b/src/mesa/main/attrib.c
@@ -135,6 +135,9 @@ struct gl_enable_attrib
    /* GL_ARB_point_sprite / GL_NV_point_sprite */
    GLboolean PointSprite;
    GLboolean FragmentShaderATI;
+
+   /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
+   GLboolean sRGBEnabled;
 };
 
 
@@ -322,6 +325,9 @@ _mesa_PushAttrib(GLbitfield mask)
       attr->VertexProgramPointSize = ctx->VertexProgram.PointSizeEnabled;
       attr->VertexProgramTwoSide = ctx->VertexProgram.TwoSideEnabled;
       save_attrib_data(&head, GL_ENABLE_BIT, attr);
+
+      /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
+      attr->sRGBEnabled = ctx->Color.sRGBEnabled;
    }
 
    if (mask & GL_EVAL_BIT) {
@@ -617,6 +623,10 @@ pop_enable_group(struct gl_context *ctx, const struct gl_enable_attrib *enable)
                    enable->VertexProgramTwoSide,
                    GL_VERTEX_PROGRAM_TWO_SIDE_ARB);
 
+   /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
+   TEST_AND_UPDATE(ctx->Color.sRGBEnabled, enable->sRGBEnabled,
+                   GL_FRAMEBUFFER_SRGB);
+
    /* texture unit enables */
    for (i = 0; i < ctx->Const.MaxTextureUnits; i++) {
       const GLbitfield enabled = enable->Texture[i];
@@ -981,6 +991,9 @@ _mesa_PopAttrib(void)
                _mesa_set_enable(ctx, GL_DITHER, color->DitherFlag);
                _mesa_ClampColorARB(GL_CLAMP_FRAGMENT_COLOR_ARB, color->ClampFragmentColor);
                _mesa_ClampColorARB(GL_CLAMP_READ_COLOR_ARB, color->ClampReadColor);
+
+               /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
+               _mesa_set_enable(ctx, GL_FRAMEBUFFER_SRGB, color->sRGBEnabled);
             }
             break;
          case GL_CURRENT_BIT:
-- 
1.7.10.4



[Mesa-dev] [Bug 53226] New: mesa/demos does not build with mesa git because of gbm API changes

2012-08-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=53226

 Bug #: 53226
   Summary: mesa/demos does not build with mesa git because of gbm
API changes
Classification: Unclassified
   Product: Mesa
   Version: git
  Platform: Other
OS/Version: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Demos
AssignedTo: mesa-dev@lists.freedesktop.org
ReportedBy: freedesk...@blino.org


mesa/demos does not build with mesa git because of gbm API changes.
gbm_bo_get_pitch() is now gbm_bo_get_stride(), as of mesa commit
7250cd506baa0bd4649b30d87509cdd0cbc06a57

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Mesa-dev] [Bug 53226] mesa/demos does not build with mesa git because of gbm API changes

2012-08-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=53226

--- Comment #1 from Olivier Blin freedesk...@blino.org 2012-08-07 21:14:27 UTC ---
Created attachment 65255
  -- https://bugs.freedesktop.org/attachment.cgi?id=65255
eglkms: adapt to gbm stride API change

This patch fixes build with mesa git.



Re: [Mesa-dev] R600 VDPAU 422 regression since r600g: make sure copying of all texture formats is accelerated

2012-08-07 Thread Marek Olšák
Do you have any idea what could be wrong with the patch? Also, could
you please tell me how to set up VDPAU and where to download the tests, so
that I can test this?

Marek

On Tue, Aug 7, 2012 at 11:25 AM, Andy Furniss andy...@ukfsn.org wrote:
 Marek Olšák wrote:

 Does the attached patch fix this issue?


 Not properly - it fixes the invalid command stream but the output is not
 quite right -

 http://www.andyqos.ukfsn.org/vdpau-422-patched.png




 Marek

 On Mon, Aug 6, 2012 at 5:40 PM, Andy Furniss andy...@ukfsn.org wrote:

 Kernel is dcn card is rv790 - vdpau csc/scale regressed.

 This only shows with 422 colour so most things work.

 commit 7c371f46958910dd2ca9487c89af1b72bbfdada9
 Author: Marek Olšák mar...@gmail.com
 Date:   Sat Jul 28 00:38:42 2012 +0200

  r600g: make sure copying of all texture formats is accelerated

 [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !
 radeon 0000:01:00.0: texture bo too small ((704 576) (1 1) 0 26 0 -> 1622016 have 884736)
 radeon 0000:01:00.0: alignments 384 1 1 1






[Mesa-dev] [Bug 51749] make[6]: ../../../../src/mesa/Makefile.old: No such file or directory

2012-08-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=51749

--- Comment #1 from Matt Turner matts...@gmail.com 2012-08-07 21:49:33 UTC ---
Not a problem now, right?



Re: [Mesa-dev] R600 VDPAU 422 regression since r600g: make sure copying of all texture formats is accelerated

2012-08-07 Thread Alex Deucher
On Tue, Aug 7, 2012 at 5:43 PM, Marek Olšák mar...@gmail.com wrote:
 Do you have any idea what could be wrong with the patch? Also could
 please tell me how to setup VDPAU and where to download the tests, so
 that I can test this.

Just add:
--enable-vdpau
to your mesa configure line to enable it.  To test it, try and play
back an MPEG1/2 file with mplayer or another app that supports vdpau.

Alex


 Marek

 On Tue, Aug 7, 2012 at 11:25 AM, Andy Furniss andy...@ukfsn.org wrote:
 Marek Olšák wrote:

 Does the attached patch fix this issue?


 Not properly - it fixes the invalid command stream but the output is not
 quite right -

 http://www.andyqos.ukfsn.org/vdpau-422-patched.png




 Marek

 On Mon, Aug 6, 2012 at 5:40 PM, Andy Furniss andy...@ukfsn.org wrote:

 Kernel is dcn card is rv790 - vdpau csc/scale regressed.

 This only shows with 422 colour so most things work.

 commit 7c371f46958910dd2ca9487c89af1b72bbfdada9
 Author: Marek Olšák mar...@gmail.com
 Date:   Sat Jul 28 00:38:42 2012 +0200

  r600g: make sure copying of all texture formats is accelerated

 [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !
 radeon 0000:01:00.0: texture bo too small ((704 576) (1 1) 0 26 0 -> 1622016 have 884736)
 radeon 0000:01:00.0: alignments 384 1 1 1





Re: [Mesa-dev] [PATCH] mesa: Fix glPopAttrib() behavior on GL_FRAMEBUFFER_SRGB.

2012-08-07 Thread Brian Paul

On 08/07/2012 03:05 PM, Eric Anholt wrote:

I happened to notice this while looking at a blit pass in l4d2, which had an
optional push/pop around framebuffer srgb setting.  It didn't matter in the
end, but the fix is sitting in my tree now.
---
 src/mesa/main/attrib.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
index 8bc7c34..9cab35b 100644
--- a/src/mesa/main/attrib.c
+++ b/src/mesa/main/attrib.c
@@ -135,6 +135,9 @@ struct gl_enable_attrib
    /* GL_ARB_point_sprite / GL_NV_point_sprite */
    GLboolean PointSprite;
    GLboolean FragmentShaderATI;
+
+   /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
+   GLboolean sRGBEnabled;
 };
 
 
@@ -322,6 +325,9 @@ _mesa_PushAttrib(GLbitfield mask)
       attr->VertexProgramPointSize = ctx->VertexProgram.PointSizeEnabled;
       attr->VertexProgramTwoSide = ctx->VertexProgram.TwoSideEnabled;
       save_attrib_data(&head, GL_ENABLE_BIT, attr);
+
+      /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
+      attr->sRGBEnabled = ctx->Color.sRGBEnabled;
    }
 
    if (mask & GL_EVAL_BIT) {
@@ -617,6 +623,10 @@ pop_enable_group(struct gl_context *ctx, const struct gl_enable_attrib *enable)
                    enable->VertexProgramTwoSide,
                    GL_VERTEX_PROGRAM_TWO_SIDE_ARB);
 
+   /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
+   TEST_AND_UPDATE(ctx->Color.sRGBEnabled, enable->sRGBEnabled,
+                   GL_FRAMEBUFFER_SRGB);
+
    /* texture unit enables */
    for (i = 0; i < ctx->Const.MaxTextureUnits; i++) {
       const GLbitfield enabled = enable->Texture[i];
@@ -981,6 +991,9 @@ _mesa_PopAttrib(void)
                _mesa_set_enable(ctx, GL_DITHER, color->DitherFlag);
                _mesa_ClampColorARB(GL_CLAMP_FRAGMENT_COLOR_ARB, color->ClampFragmentColor);
                _mesa_ClampColorARB(GL_CLAMP_READ_COLOR_ARB, color->ClampReadColor);
+
+               /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
+               _mesa_set_enable(ctx, GL_FRAMEBUFFER_SRGB, color->sRGBEnabled);
             }
             break;
          case GL_CURRENT_BIT:


Looks OK to me.  Candidate for 8.0 branch?

Reviewed-by: Brian Paul bri...@vmware.com



Re: [Mesa-dev] R600 VDPAU 422 regression since r600g: make sure copying of all texture formats is accelerated

2012-08-07 Thread Andy Furniss

Marek Olšák wrote:

Do you have any idea what could be wrong with the patch? Also could
please tell me how to setup VDPAU and where to download the tests, so
that I can test this.


I don't know about the patch.

One thing which may be a clue or a red herring is that when Christian 
first implemented 422 it was corrupt in the same way. He said he thought 
it was to do with tiling - and fixed it in (IIRC) ~deathsimple/mesa


I run LFS so to get VDPAU I installed the lib from -

git://people.freedesktop.org/~aplattner/libvdpau

mplayer built from svn should find the lib and enable vdpau during 
configure.


svn checkout svn://svn.mplayerhq.hu/mplayer/trunk mplayer

I guess

http://www.mplayerhq.hu/MPlayer/releases/MPlayer-1.1.tar.xz

will also work if you don't want to do svn.

For me -vo vdpau is the default output with mplayer - You may need to be 
explicit I guess if you already have some distro version with a config 
installed.


This issue is only with output (-vo vdpau), not decode (-vc ffmpeg12vdpau), 
as 422 isn't implemented for decode anyway.


When I autogen mesa I have --enable-gallium-g3dvl

The sample I am using is from

ftp://ftp.tek.com/tv/test/streams/Element/MPEG-Video/625/

The ones ending in 400 are 422; the others are 420. The exact file is

ftp://ftp.tek.com/tv/test/streams/Element/MPEG-Video/625/flwr_400.m2v




[Mesa-dev] Pipe control patches redux

2012-08-07 Thread Kenneth Graunke
Here's v3 of Daniel's PIPE_CONTROL series.  I reworked it substantially,
moving the length change to the beginning and splitting up the patches
into smaller ones that only do one thing at a time, to make it easier to
bisect or revert if there are any issues.  (I'm pretty paranoid when it
comes to PIPE_CONTROLs.)

If there are no objections, I'll push these tomorrow.



[Mesa-dev] [PATCH 1/7] intel: Make the length for PIPE_CONTROL explicit.

2012-08-07 Thread Kenneth Graunke
PIPE_CONTROL has variable length, depending upon generation and whether
we want to do 32-bit or 64-bit data writes.  Make it explicit, rather
than hiding a length of 4 in the #define for _3DSTATE_PIPE_CONTROL.

Generated by s/3DSTATE_PIPE_CONTROL/3DSTATE_PIPE_CONTROL | (4 - 2)/g.
This is equivalent since the #define used to have | 2 in it.  A grep
through the sources shows that all instances have been converted, so
it's safe to remove the | 2 from the #define.
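
To illustrate why the substitution does not change behavior, here is a
minimal standalone sketch of the header encoding.  The names
example_pipe_control_header and EXAMPLE_PIPE_CONTROL_OPCODE are made up
for this illustration, and the opcode value is a placeholder rather than
the definition from intel_reg.h.  The only thing taken from the patch is
that the low bits carry the packet length in dwords minus two.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder opcode bits, for illustration only; the real
 * _3DSTATE_PIPE_CONTROL definition lives in intel_reg.h. */
#define EXAMPLE_PIPE_CONTROL_OPCODE 0x7a000000u

/* Build the first dword of a PIPE_CONTROL packet.  The length field
 * holds (total dwords - 2), matching the "| (4 - 2)" form in the patch. */
static uint32_t
example_pipe_control_header(unsigned total_dwords)
{
   assert(total_dwords >= 2);
   return EXAMPLE_PIPE_CONTROL_OPCODE | (total_dwords - 2);
}
```

With this sketch, example_pipe_control_header(4) yields the same bits as
the old "opcode | 2" define, while a 5-dword packet simply passes 5.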

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_queryobj.c   | 20 ++--
 src/mesa/drivers/dri/i965/gen6_vs_state.c  |  2 +-
 src/mesa/drivers/dri/intel/intel_batchbuffer.c | 16 
 src/mesa/drivers/dri/intel/intel_reg.h |  2 +-
 4 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 240fe32..921fecd 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -132,7 +132,7 @@ brw_begin_query(struct gl_context *ctx, struct gl_query_object *q)
 
   if (intel->gen >= 6) {
  BEGIN_BATCH(4);
- OUT_BATCH(_3DSTATE_PIPE_CONTROL);
+ OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
  OUT_BATCH(PIPE_CONTROL_WRITE_TIMESTAMP);
  OUT_RELOC(query->bo,
  I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
@@ -143,7 +143,7 @@ brw_begin_query(struct gl_context *ctx, struct gl_query_object *q)
   
   } else {
  BEGIN_BATCH(4);
- OUT_BATCH(_3DSTATE_PIPE_CONTROL |
+ OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
  PIPE_CONTROL_WRITE_TIMESTAMP);
  OUT_RELOC(query->bo,
  I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
@@ -202,7 +202,7 @@ brw_end_query(struct gl_context *ctx, struct gl_query_object *q)
case GL_TIME_ELAPSED_EXT:
   if (intel->gen >= 6) {
  BEGIN_BATCH(4);
- OUT_BATCH(_3DSTATE_PIPE_CONTROL);
+ OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
  OUT_BATCH(PIPE_CONTROL_WRITE_TIMESTAMP);
  OUT_RELOC(query->bo,
  I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
@@ -213,7 +213,7 @@ brw_end_query(struct gl_context *ctx, struct gl_query_object *q)
   
   } else {
  BEGIN_BATCH(4);
- OUT_BATCH(_3DSTATE_PIPE_CONTROL |
+ OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
  PIPE_CONTROL_WRITE_TIMESTAMP);
  OUT_RELOC(query->bo,
  I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
@@ -340,12 +340,12 @@ brw_emit_query_begin(struct brw_context *brw)
BEGIN_BATCH(8);
 
/* workaround: CS stall required before depth stall. */
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL);
+   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
OUT_BATCH(PIPE_CONTROL_CS_STALL);
OUT_BATCH(0); /* write address */
OUT_BATCH(0); /* write data */
 
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL);
+   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
 PIPE_CONTROL_WRITE_DEPTH_COUNT);
OUT_RELOC(brw->query.bo,
@@ -357,7 +357,7 @@ brw_emit_query_begin(struct brw_context *brw)

} else {
BEGIN_BATCH(4);
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL |
+   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
   PIPE_CONTROL_DEPTH_STALL |
   PIPE_CONTROL_WRITE_DEPTH_COUNT);
/* This object could be mapped cacheable, but we don't have an exposed
@@ -397,12 +397,12 @@ brw_emit_query_end(struct brw_context *brw)
if (intel->gen >= 6) {
BEGIN_BATCH(8);
/* workaround: CS stall required before depth stall. */
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL);
+   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
OUT_BATCH(PIPE_CONTROL_CS_STALL);
OUT_BATCH(0); /* write address */
OUT_BATCH(0); /* write data */
 
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL);
+   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
 PIPE_CONTROL_WRITE_DEPTH_COUNT);
OUT_RELOC(brw->query.bo,
@@ -414,7 +414,7 @@ brw_emit_query_end(struct brw_context *brw)

} else {
BEGIN_BATCH(4);
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL |
+   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
   PIPE_CONTROL_DEPTH_STALL |
   PIPE_CONTROL_WRITE_DEPTH_COUNT);
OUT_RELOC(brw->query.bo,
diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c b/src/mesa/drivers/dri/i965/gen6_vs_state.c
index 3392a9f..c562cc7 100644
--- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
@@ -216,7 +216,7 @@ upload_vs_state(struct brw_context *brw)
intel_emit_post_sync_nonzero_flush(intel);
 
BEGIN_BATCH(4);
-   

[Mesa-dev] [PATCH 2/7] i965: Refactor timestamp write PIPE_CONTROLs into a helper function.

2012-08-07 Thread Kenneth Graunke
This consolidates the complexity in one place, which is important
because it's about to get even more complicated.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_queryobj.c | 80 
 1 file changed, 30 insertions(+), 50 deletions(-)

Eric wanted a helper function.  He didn't say exactly what he wanted, so I
made this.  It at least consolidates the two timestamp bits.
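
For illustration only (this is not part of the patch): the helper takes a
slot index and turns it into a byte offset inside the query BO, so the
begin timestamp lands in slot 0 and the end timestamp in slot 1, where the
old code hard-coded 0 and 8.  The function name below is hypothetical; the
arithmetic is the same "idx * sizeof(uint64_t)" used in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of 64-bit result slot idx within the query buffer.
 * Hypothetical helper mirroring "idx * sizeof(uint64_t)" in the patch. */
static size_t
example_timestamp_offset(int idx)
{
   assert(idx >= 0);
   return (size_t)idx * sizeof(uint64_t);
}
```

Calling it with 0 and 1 gives offsets 0 and 8, matching the constants
removed from brw_begin_query() and brw_end_query().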

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 921fecd..229aeb7 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -45,6 +45,33 @@
 #include "intel_batchbuffer.h"
 #include "intel_reg.h"
 
+static void
+write_timestamp(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
+{
+   if (intel->gen >= 6) {
+  BEGIN_BATCH(4);
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+  OUT_BATCH(PIPE_CONTROL_WRITE_TIMESTAMP);
+  OUT_RELOC(query_bo,
+I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+PIPE_CONTROL_GLOBAL_GTT_WRITE |
+idx * sizeof(uint64_t));
+  OUT_BATCH(0);
+  ADVANCE_BATCH();
+   } else {
+  BEGIN_BATCH(4);
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
+PIPE_CONTROL_WRITE_TIMESTAMP);
+  OUT_RELOC(query_bo,
+I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+PIPE_CONTROL_GLOBAL_GTT_WRITE |
+idx * sizeof(uint64_t));
+  OUT_BATCH(0);
+  OUT_BATCH(0);
+  ADVANCE_BATCH();
+   }
+}
+
 /** Waits on the query object's BO and totals the results for this query */
 static void
 brw_queryobj_get_results(struct gl_context *ctx,
@@ -127,32 +154,8 @@ brw_begin_query(struct gl_context *ctx, struct gl_query_object *q)
switch (query->Base.Target) {
case GL_TIME_ELAPSED_EXT:
   drm_intel_bo_unreference(query->bo);
-  query->bo = drm_intel_bo_alloc(intel->bufmgr, "timer query",
-4096, 4096);
-
-  if (intel->gen >= 6) {
- BEGIN_BATCH(4);
- OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
- OUT_BATCH(PIPE_CONTROL_WRITE_TIMESTAMP);
- OUT_RELOC(query->bo,
- I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
- PIPE_CONTROL_GLOBAL_GTT_WRITE |
- 0);
- OUT_BATCH(0);
- ADVANCE_BATCH();
-  
-  } else {
- BEGIN_BATCH(4);
- OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
- PIPE_CONTROL_WRITE_TIMESTAMP);
- OUT_RELOC(query->bo,
- I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
- PIPE_CONTROL_GLOBAL_GTT_WRITE |
- 0);
- OUT_BATCH(0);
- OUT_BATCH(0);
- ADVANCE_BATCH();
-  }
+  query->bo = drm_intel_bo_alloc(intel->bufmgr, "timer query", 4096, 4096);
+  write_timestamp(intel, query->bo, 0);
   break;
 
case GL_SAMPLES_PASSED_ARB:
@@ -200,30 +203,7 @@ brw_end_query(struct gl_context *ctx, struct gl_query_object *q)
 
switch (query->Base.Target) {
case GL_TIME_ELAPSED_EXT:
-  if (intel->gen >= 6) {
- BEGIN_BATCH(4);
- OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
- OUT_BATCH(PIPE_CONTROL_WRITE_TIMESTAMP);
- OUT_RELOC(query->bo,
- I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
- PIPE_CONTROL_GLOBAL_GTT_WRITE |
- 8);
- OUT_BATCH(0);
- ADVANCE_BATCH();
-  
-  } else {
- BEGIN_BATCH(4);
- OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
- PIPE_CONTROL_WRITE_TIMESTAMP);
- OUT_RELOC(query->bo,
- I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
- PIPE_CONTROL_GLOBAL_GTT_WRITE |
- 8);
- OUT_BATCH(0);
- OUT_BATCH(0);
- ADVANCE_BATCH();
-  }
-
+  write_timestamp(intel, query->bo, 1);
   intel_batchbuffer_flush(intel);
   break;
 
-- 
1.7.11.4



[Mesa-dev] [PATCH 3/7] i965: Use 64-bit writes for timestamp queries.

2012-08-07 Thread Kenneth Graunke
The hardware seems to use the length of the PIPE_CONTROL command to
indicate whether the write is 64 bits or 32 bits, which makes sense
for immediate writes.

Daniel discovered this by writing a pattern into the query object bo
and noticing that the high 32 bits were left intact, even on those
pipe control writes that seemingly worked.
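
The detection idea can be sketched in plain C without any hardware.  This
is only a simulation under stated assumptions: the pattern value and all
example_* names are invented here, and the "32-bit write" is modeled as
replacing just the low half of a 64-bit slot.

```c
#include <assert.h>
#include <stdint.h>

/* Arbitrary fill pattern, chosen for illustration. */
#define EXAMPLE_PATTERN 0xdeadbeefdeadbeefull

/* Model of a write that only updates the low 32 bits of a slot. */
static uint64_t
example_write32(uint64_t slot, uint64_t value)
{
   return (slot & 0xffffffff00000000ull) | (value & 0x00000000ffffffffull);
}

/* Pre-fill a slot with the pattern, perform a narrow or wide write, and
 * report whether the high half still holds the pattern (1 = truncated). */
static int
example_detects_truncation(uint64_t value, int wide)
{
   /* The check is only meaningful if the value's own high half differs
    * from the pattern's high half. */
   assert((value >> 32) != (EXAMPLE_PATTERN >> 32));

   uint64_t slot = EXAMPLE_PATTERN;
   slot = wide ? value : example_write32(slot, value);
   return (slot >> 32) == (EXAMPLE_PATTERN >> 32);
}
```

A narrow write leaves the pattern's high half behind and is flagged; a
full 64-bit write overwrites it and is not.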

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_queryobj.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 229aeb7..afa3091 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -49,14 +49,15 @@ static void
 write_timestamp(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
 {
if (intel->gen >= 6) {
-  BEGIN_BATCH(4);
-  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+  BEGIN_BATCH(5);
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
   OUT_BATCH(PIPE_CONTROL_WRITE_TIMESTAMP);
   OUT_RELOC(query_bo,
 I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
 PIPE_CONTROL_GLOBAL_GTT_WRITE |
 idx * sizeof(uint64_t));
   OUT_BATCH(0);
+  OUT_BATCH(0);
   ADVANCE_BATCH();
} else {
   BEGIN_BATCH(4);
-- 
1.7.11.4



[Mesa-dev] [PATCH 4/7] i965: Emit a CS stall before timestamp writes.

2012-08-07 Thread Kenneth Graunke
This implements one of the Sandybridge PIPE_CONTROL workarounds.  It
doesn't appear to be required for Ivybridge.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 src/mesa/drivers/dri/i965/brw_queryobj.c | 14 ++
 1 file changed, 14 insertions(+)

Unlike Daniel's series, I made this only apply on Sandybridge.  From reading
the docs, it appears that it isn't required on Ivybridge, and I believe Eric
made a comment to that effect during the v1 review.

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index afa3091..cbe67ad 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -49,6 +49,20 @@ static void
 write_timestamp(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
 {
if (intel->gen >= 6) {
+  /* Emit workaround flushes: */
+  if (intel->gen == 6) {
+ /* The timestamp write below is a non-zero post-sync op, which on
+  * Gen6 necessitates a CS stall.  CS stalls need stall at scoreboard
+  * set.  See the comments for intel_emit_post_sync_nonzero_flush().
+  */
+ BEGIN_BATCH(4);
+ OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+ OUT_BATCH(PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ ADVANCE_BATCH();
+  }
+
   BEGIN_BATCH(5);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
   OUT_BATCH(PIPE_CONTROL_WRITE_TIMESTAMP);
-- 
1.7.11.4



[Mesa-dev] [PATCH 5/7] i965: Refactor depth count write PIPE_CONTROLs into a helper function.

2012-08-07 Thread Kenneth Graunke
This consolidates the complexity in one place, which is important
because it's about to get even more complicated.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 src/mesa/drivers/dri/i965/brw_queryobj.c | 111 ---
 1 file changed, 43 insertions(+), 68 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index cbe67ad..d45edc1 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -87,6 +87,47 @@ write_timestamp(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
}
 }
 
+static void
+write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
+{
+   if (intel->gen >= 6) {
+  BEGIN_BATCH(8);
+
+  /* workaround: CS stall required before depth stall. */
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+  OUT_BATCH(PIPE_CONTROL_CS_STALL);
+  OUT_BATCH(0); /* write address */
+  OUT_BATCH(0); /* write data */
+
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+  OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
+PIPE_CONTROL_WRITE_DEPTH_COUNT);
+  OUT_RELOC(query_bo,
+I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+PIPE_CONTROL_GLOBAL_GTT_WRITE |
+(idx * sizeof(uint64_t)));
+  OUT_BATCH(0);
+  ADVANCE_BATCH();
+   } else {
+  BEGIN_BATCH(4);
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
+PIPE_CONTROL_DEPTH_STALL |
+PIPE_CONTROL_WRITE_DEPTH_COUNT);
+  /* This object could be mapped cacheable, but we don't have an exposed
+   * mechanism to support that.  Since it's going uncached, tell GEM that
+   * we're writing to it.  The usual clflush should be all that's required
+   * to pick up the results.
+   */
+  OUT_RELOC(query_bo,
+I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+PIPE_CONTROL_GLOBAL_GTT_WRITE |
+(idx * sizeof(uint64_t)));
+  OUT_BATCH(0);
+  OUT_BATCH(0);
+  ADVANCE_BATCH();
+   }
+}
+
 /** Waits on the query object's BO and totals the results for this query */
 static void
 brw_queryobj_get_results(struct gl_context *ctx,
@@ -331,43 +372,7 @@ brw_emit_query_begin(struct brw_context *brw)
if (!query || brw->query.active)
   return;
 
-   if (intel->gen >= 6) {
-   BEGIN_BATCH(8);
-
-   /* workaround: CS stall required before depth stall. */
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
-   OUT_BATCH(PIPE_CONTROL_CS_STALL);
-   OUT_BATCH(0); /* write address */
-   OUT_BATCH(0); /* write data */
-
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
-   OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
-PIPE_CONTROL_WRITE_DEPTH_COUNT);
-   OUT_RELOC(brw->query.bo,
-I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-PIPE_CONTROL_GLOBAL_GTT_WRITE |
-((brw->query.index * 2) * sizeof(uint64_t)));
-   OUT_BATCH(0);
-   ADVANCE_BATCH();
-   
-   } else {
-   BEGIN_BATCH(4);
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
-  PIPE_CONTROL_DEPTH_STALL |
-  PIPE_CONTROL_WRITE_DEPTH_COUNT);
-   /* This object could be mapped cacheable, but we don't have an exposed
-   * mechanism to support that.  Since it's going uncached, tell GEM that
-   * we're writing to it.  The usual clflush should be all that's required
-   * to pick up the results.
-   */
-   OUT_RELOC(brw->query.bo,
-  I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-  PIPE_CONTROL_GLOBAL_GTT_WRITE |
-  ((brw->query.index * 2) * sizeof(uint64_t)));
-   OUT_BATCH(0);
-   OUT_BATCH(0);
-   ADVANCE_BATCH();
-   }
+   write_depth_count(intel, brw->query.bo, brw->query.index * 2);
 
if (query->bo != brw->query.bo) {
   if (query->bo != NULL)
@@ -389,37 +394,7 @@ brw_emit_query_end(struct brw_context *brw)
if (!brw->query.active)
   return;
 
-   if (intel->gen >= 6) {
-   BEGIN_BATCH(8);
-   /* workaround: CS stall required before depth stall. */
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
-   OUT_BATCH(PIPE_CONTROL_CS_STALL);
-   OUT_BATCH(0); /* write address */
-   OUT_BATCH(0); /* write data */
-
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
-   OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
-PIPE_CONTROL_WRITE_DEPTH_COUNT);
-   OUT_RELOC(brw->query.bo,
-I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-PIPE_CONTROL_GLOBAL_GTT_WRITE |
-((brw->query.index * 2 + 1) * sizeof(uint64_t)));
-   OUT_BATCH(0);
-   ADVANCE_BATCH();
-   
-   } else {
-   BEGIN_BATCH(4);
-   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
-  PIPE_CONTROL_DEPTH_STALL |
-  

[Mesa-dev] [PATCH 6/7] i965: Use 64-bit writes for occlusion queries.

2012-08-07 Thread Kenneth Graunke
The hardware seems to use the length of the PIPE_CONTROL command to
indicate whether the write is 64 bits or 32 bits, which makes sense
for immediate writes.

Daniel discovered this by writing a pattern into the query object bo
and noticing that the high 32 bits were left intact, even on those
pipe control writes that seemingly worked.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_queryobj.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index d45edc1..1e03d08 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -91,7 +91,7 @@ static void
 write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
 {
if (intel->gen >= 6) {
-  BEGIN_BATCH(8);
+  BEGIN_BATCH(9);
 
   /* workaround: CS stall required before depth stall. */
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
@@ -99,7 +99,7 @@ write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
   OUT_BATCH(0); /* write address */
   OUT_BATCH(0); /* write data */
 
-  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
   OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
 PIPE_CONTROL_WRITE_DEPTH_COUNT);
   OUT_RELOC(query_bo,
@@ -107,6 +107,7 @@ write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
 PIPE_CONTROL_GLOBAL_GTT_WRITE |
 (idx * sizeof(uint64_t)));
   OUT_BATCH(0);
+  OUT_BATCH(0);
   ADVANCE_BATCH();
} else {
   BEGIN_BATCH(4);
-- 
1.7.11.4



[Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.

2012-08-07 Thread Kenneth Graunke
Separate out the depth stall from the depth count write.  Workarounds
say that a depth stall needs to be preceded by a non-zero post-sync
op (in this case, the depth count write).  Also, before the non-zero
post-sync op, we need a CS stall, which needs a stall at scoreboard.
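
As a reading aid (not part of the patch), the resulting packet order on
Gen6+ can be modeled like this.  The enum and function names are
hypothetical; the ordering and the dword counts (4, 5, 4) are taken from
the BEGIN_BATCH() calls in the diff below.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative labels for the three PIPE_CONTROL packets. */
enum example_step {
   EXAMPLE_CS_STALL_AT_SCOREBOARD, /* 4 dwords, Gen6 (Sandybridge) only */
   EXAMPLE_WRITE_DEPTH_COUNT,      /* 5 dwords, 64-bit post-sync write */
   EXAMPLE_DEPTH_STALL             /* 4 dwords, after the post-sync op */
};

/* Fill out[] with the packets emitted for one depth count write and
 * return how many there are; *dwords receives the total batch space. */
static size_t
example_depth_count_sequence(int gen, enum example_step out[3],
                             unsigned *dwords)
{
   size_t n = 0;

   assert(gen >= 6);
   *dwords = 0;

   if (gen == 6) {
      out[n++] = EXAMPLE_CS_STALL_AT_SCOREBOARD;
      *dwords += 4;
   }
   out[n++] = EXAMPLE_WRITE_DEPTH_COUNT;
   *dwords += 5;
   out[n++] = EXAMPLE_DEPTH_STALL;
   *dwords += 4;
   return n;
}
```

Under this model Sandybridge emits three packets (13 dwords) and
Ivybridge two (9 dwords), since the leading CS stall is skipped there.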

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_queryobj.c | 36 
 1 file changed, 27 insertions(+), 9 deletions(-)

This does remove the CS stall on Ivybridge.

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 1e03d08..4c561ad 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -91,17 +91,24 @@ static void
 write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
 {
if (intel->gen >= 6) {
-  BEGIN_BATCH(9);
-
-  /* workaround: CS stall required before depth stall. */
-  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
-  OUT_BATCH(PIPE_CONTROL_CS_STALL);
-  OUT_BATCH(0); /* write address */
-  OUT_BATCH(0); /* write data */
+  /* Emit Sandybridge workaround flush: */
+  if (intel->gen == 6) {
+ /* The depth count write below is a non-zero post-sync op, which on
+  * Gen6 necessitates a CS stall.  CS stalls need stall at scoreboard
+  * set.  See the comments for intel_emit_post_sync_nonzero_flush().
+  */
+ BEGIN_BATCH(4);
+ OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+ OUT_BATCH(PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ ADVANCE_BATCH();
+  }
 
+  /* Emit the actual depth count write: */
+  BEGIN_BATCH(5);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
-  OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
-PIPE_CONTROL_WRITE_DEPTH_COUNT);
+  OUT_BATCH(PIPE_CONTROL_WRITE_DEPTH_COUNT);
   OUT_RELOC(query_bo,
 I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
 PIPE_CONTROL_GLOBAL_GTT_WRITE |
@@ -109,6 +116,17 @@ write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
   OUT_BATCH(0);
   OUT_BATCH(0);
   ADVANCE_BATCH();
+
+  /* We need to emit a depth stall to get the right value for the depth
+   * count.  As a workaround this needs a preceding PIPE_CONTROL with a
+   * non-zero post-sync op.  The depth count write above does that for us.
+   */
+  BEGIN_BATCH(4);
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+  OUT_BATCH(PIPE_CONTROL_DEPTH_STALL);
+  OUT_BATCH(0);
+  OUT_BATCH(0);
+  ADVANCE_BATCH();
} else {
   BEGIN_BATCH(4);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
-- 
1.7.11.4
