Re: [Mesa-dev] shader-db, and justifying an i965 compiler optimization.

2011-05-18 Thread Ian Romanick

On 05/18/2011 05:22 AM, Eric Anholt wrote:
 One of the pain points of working on compiler optimizations has been
 justifying them -- sometimes I come up with something I think is
 useful and spend a day or two on it, but the value doesn't show up as
 fps in the application that suggested the optimization to me.  Then I
 wonder if this transformation of the code is paying off in general,
 and thus if I should push it.  If I don't push it, I end up bringing
 that patch out on every application I look at that it could affect, to
 see if now I finally have justification to get it out of a private
 branch.
 
 At a conference this week, we heard about how another team is
 using a database of (assembly) shaders, which they run through their
 compiler and count resulting instructions for testing purposes.  This
 sounded like a fun idea, so I threw one together.  Patch #1 is good in

This is one of those ideas that seems so obvious after you hear about it
that you can't believe you hadn't thought of it years ago.  This seems
like something we'd want in piglit, but I'm not sure how that would look.

The first problem is, obviously, using INTEL_DEBUG=wm to get the
instruction counts won't work. :)  Perhaps we could extend some of the
existing assembly program queries (e.g.,
GL_PROGRAM_NATIVE_INSTRUCTIONS_ARB) to GLSL.  That would help even if we
didn't incorporate this into piglit.
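
For reference, the assembly-program query already exists on the application
side; a GLSL equivalent would presumably look similar.  A minimal sketch (it
assumes an ARB program object is already bound to 'target' and skips error
handling):

#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>
#include <stdio.h>

/* Assumes an ARB_vertex_program/ARB_fragment_program program is bound to
 * 'target' (GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB). */
static void
print_native_instruction_count(GLenum target)
{
   GLint total = 0, native = 0;

   glGetProgramivARB(target, GL_PROGRAM_INSTRUCTIONS_ARB, &total);
   glGetProgramivARB(target, GL_PROGRAM_NATIVE_INSTRUCTIONS_ARB, &native);

   printf("%s: %d instructions, %d native\n",
          target == GL_FRAGMENT_PROGRAM_ARB ? "fp" : "vp", total, native);
}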

The other problem is what the test would report for a result.  Hmm...

 general (hey, link errors, finally!), but also means that a quick hack
 to glslparsertest makes it link a passing compile shader and therefore
 generate assembly that gets dumped under INTEL_DEBUG=wm.  Patch #2 I
 used for automatic scraping of shaders in every application I could
 find on my system at the time.  The open-source ones I pushed to:
 
 http://cgit.freedesktop.org/~anholt/shader-db
 
 And finally, patch #3 is something I built before but couldn't really
 justify until now.  However, given that it reduced fragment shader
 instructions 0.3% across 831 shaders (affecting 52 of them including
 yofrankie, warsow, norsetto, and gstreamer) and didn't increase
 instructions anywhere, I'm a lot happier now.

We'll probably want to be able to disable this once we have some sort of
CSE on the low-level IR.  This sort of optimization can cause problems
for CSE in cases where the same register is a source and a destination.
 Imagine something like

z = sqrt(x) + y;
z = z * w;
q = sqrt(x) + y;

If the result of the first 'sqrt(x) + y' is written directly to z, the
value is gone when the second 'sqrt(x) + y' is executed.  If that
result is written to a temporary register that is then copied to z, the
value is still around at the second instance.

Since we don't have any CSE, this doesn't matter now.  However, it's
something to keep in mind.
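
To make the register-reuse point concrete, here is a toy availability check in
C -- not Mesa's IR, just made-up instruction/register names -- showing that the
second 'sqrt(x) + y' only finds a register to reuse when the first result went
through a temporary:

#include <stdio.h>
#include <string.h>

struct inst { char dst[8]; char expr[32]; };

/* Return the register that still holds 'expr' right before prog[at],
 * or NULL if every copy has since been overwritten. */
static const char *
find_available(const struct inst *prog, int at, const char *expr)
{
   for (int i = 0; i < at; i++) {
      if (strcmp(prog[i].expr, expr) != 0)
         continue;
      int clobbered = 0;
      for (int j = i + 1; j < at; j++)
         if (strcmp(prog[j].dst, prog[i].dst) == 0)
            clobbered = 1;        /* the holding register was rewritten */
      if (!clobbered)
         return prog[i].dst;
   }
   return NULL;
}

int main(void)
{
   /* Lowering A: the result goes straight to z, then z is overwritten. */
   const struct inst direct[] = {
      { "z", "sqrt(x)+y" },
      { "z", "z*w"       },
      { "q", "sqrt(x)+y" },      /* nothing left to reuse */
   };
   /* Lowering B: the result goes to a temp first; the temp stays intact. */
   const struct inst via_temp[] = {
      { "t0", "sqrt(x)+y" },
      { "z",  "t0"        },
      { "z",  "z*w"       },
      { "q",  "sqrt(x)+y" },     /* t0 still holds the value */
   };

   const char *a = find_available(direct, 2, "sqrt(x)+y");
   const char *b = find_available(via_temp, 3, "sqrt(x)+y");
   printf("direct:   %s\n", a ? a : "recompute");
   printf("via temp: %s\n", b ? b : "recompute");
   return 0;
}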

 Hopefully we hook up EXT_timer_query to apitrace soon so I can do more
 targeted optimizations and need this less :) In the meantime, I hope
 this can prove useful to others -- if you want to contribute
 appropriately-licensed shaders to the database so we track those, or
 if you want to make the analysis work on your hardware backend, feel
 free.


Re: [Mesa-dev] [PATCH 2/3] mesa: Include shader target in dumps of GLSL source.

2011-05-18 Thread Kenneth Graunke

On 05/17/2011 08:22 PM, Eric Anholt wrote:

This makes automatic parsing of MESA_GLSL=dump output easier.
---
  src/mesa/program/ir_to_mesa.cpp |3 ++-
  1 files changed, 2 insertions(+), 1 deletions(-)


This makes human parsing of MESA_GLSL=dump easier, too.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org



[Mesa-dev] [PATCH] mesa/st: split updating vertex and fragment shader stages.

2011-05-18 Thread Dave Airlie
From: Dave Airlie airl...@redhat.com

(I've already pushed this by accident; if it's bad I can revert it.)

This seems like a logical thing to do, and it sets the correct st flags
for vertex textures.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/state_tracker/st_atom.c |1 +
 src/mesa/state_tracker/st_atom.h |1 +
 src/mesa/state_tracker/st_atom_texture.c |   18 ++
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom.c b/src/mesa/state_tracker/st_atom.c
index bf160fe..e1eac81 100644
--- a/src/mesa/state_tracker/st_atom.c
+++ b/src/mesa/state_tracker/st_atom.c
@@ -56,6 +56,7 @@ static const struct st_tracked_state *atoms[] =
st_update_scissor,
st_update_blend,
st_update_sampler,
+   st_update_vertex_texture,
st_update_texture,
st_update_framebuffer,
st_update_msaa,
diff --git a/src/mesa/state_tracker/st_atom.h b/src/mesa/state_tracker/st_atom.h
index 6a5ea36..930a084 100644
--- a/src/mesa/state_tracker/st_atom.h
+++ b/src/mesa/state_tracker/st_atom.h
@@ -60,6 +60,7 @@ extern const struct st_tracked_state st_update_blend;
 extern const struct st_tracked_state st_update_msaa;
 extern const struct st_tracked_state st_update_sampler;
 extern const struct st_tracked_state st_update_texture;
+extern const struct st_tracked_state st_update_vertex_texture;
 extern const struct st_tracked_state st_finalize_textures;
 extern const struct st_tracked_state st_update_fs_constants;
 extern const struct st_tracked_state st_update_gs_constants;
diff --git a/src/mesa/state_tracker/st_atom_texture.c b/src/mesa/state_tracker/st_atom_texture.c
index 990b504..072eb97 100644
--- a/src/mesa/state_tracker/st_atom_texture.c
+++ b/src/mesa/state_tracker/st_atom_texture.c
@@ -317,20 +317,22 @@ update_fragment_textures(struct st_context *st)
   st->state.sampler_views);
 }
 
-static void 
-update_textures(struct st_context *st)
-{
-  update_fragment_textures(st);
-  update_vertex_textures(st);
-}
-
 const struct st_tracked_state st_update_texture = {
    "st_update_texture",            /* name */
    {                               /* dirty */
       _NEW_TEXTURE,                /* mesa */
       ST_NEW_FRAGMENT_PROGRAM,     /* st */
    },
-   update_textures                 /* update */
+   update_fragment_textures        /* update */
+};
+
+const struct st_tracked_state st_update_vertex_texture = {
+   "st_update_vertex_texture",     /* name */
+   {                               /* dirty */
+      _NEW_TEXTURE,                /* mesa */
+      ST_NEW_VERTEX_PROGRAM,       /* st */
+   },
+   update_vertex_textures          /* update */
 };
 
 static void 
-- 
1.7.5.1



[Mesa-dev] [Bug 36651] mesa requires bison and flex to build but configure does not check for them

2011-05-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=36651

Brian Paul brian.e.p...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #5 from Brian Paul brian.e.p...@gmail.com 2011-05-18 06:52:19 PDT ---
I've committed the patch: de1df26b5c11a45f2b1ff2ddc7b8ec764356aa94



Re: [Mesa-dev] [PATCH] Anisotropic filtering extension for swrast

2011-05-18 Thread Brian Paul

On 05/17/2011 05:08 AM, Andreas Faenger wrote:

Hi,

This patch makes it possible to have high quality texture filtering
with the pure software renderer. The main purpose is to use it with
osmesa. The anisotropic filtering is based on an Elliptical Weighted Average (EWA).

The patch was designed to make as few changes to the existing codebase as
possible. Therefore, the existing texture_sample_func signature has not
been adjusted although this would have been required; a hack was used
instead to pass the required arguments.

I provide this patch as other people might be interested in
using anisotropic filtering for osmesa, especially when rendering
images in a headless environment.
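
For anyone not familiar with EWA, the idea looks roughly like this in C.  This
is not Andreas' actual code; fetch_texel() is a hypothetical helper and the
weighting is a simple Gaussian falloff, but it shows how the texcoord
derivatives define an ellipse in texel space whose covered texels are averaged:

#include <math.h>

static void
ewa_sample(float dudx, float dvdx, float dudy, float dvdy,
           float u0, float v0, int width, int height,
           void (*fetch_texel)(int u, int v, float rgba[4]),
           float out[4])
{
   /* Ellipse coefficients (Heckbert): A*du^2 + B*du*dv + C*dv^2 = F.
    * The +1 keeps the footprint at least about one texel wide. */
   float A = dvdx * dvdx + dvdy * dvdy + 1.0f;
   float B = -2.0f * (dudx * dvdx + dudy * dvdy);
   float C = dudx * dudx + dudy * dudy + 1.0f;
   float F = A * C - 0.25f * B * B;

   /* Bounding box of the ellipse around (u0, v0); with F as above the
    * half extents are sqrt(C) in u and sqrt(A) in v. */
   int u1 = (int)floorf(u0 - sqrtf(C)), u2 = (int)ceilf(u0 + sqrtf(C));
   int v1 = (int)floorf(v0 - sqrtf(A)), v2 = (int)ceilf(v0 + sqrtf(A));

   float sum[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
   float den = 0.0f;

   for (int v = v1; v <= v2; v++) {
      for (int u = u1; u <= u2; u++) {
         float du = u + 0.5f - u0, dv = v + 0.5f - v0;
         float Q = A * du * du + B * du * dv + C * dv * dv;
         /* Texels outside the texture are simply skipped here; a real
          * implementation would apply the wrap mode instead. */
         if (Q < F && u >= 0 && u < width && v >= 0 && v < height) {
            float w = expf(-2.0f * Q / F);
            float rgba[4];
            fetch_texel(u, v, rgba);
            for (int k = 0; k < 4; k++)
               sum[k] += w * rgba[k];
            den += w;
         }
      }
   }

   if (den > 0.0f) {
      for (int k = 0; k < 4; k++)
         out[k] = sum[k] / den;
   } else {
      /* Degenerate footprint: fall back to the nearest texel. */
      int cu = (int)u0, cv = (int)v0;
      if (cu < 0) cu = 0;
      if (cu > width - 1) cu = width - 1;
      if (cv < 0) cv = 0;
      if (cv > height - 1) cv = height - 1;
      fetch_texel(cu, cv, out);
   }
}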


Thanks.  I'm about to commit your patch (with some formatting fixes).

Would you be interested in implementing this feature in the Gallium 
softpipe driver too?


-Brian


Re: [Mesa-dev] shader-db, and justifying an i965 compiler optimization.

2011-05-18 Thread Jerome Glisse
On Tue, May 17, 2011 at 11:22 PM, Eric Anholt e...@anholt.net wrote:
 One of the pain points of working on compiler optimizations has been
 justifying them -- sometimes I come up with something I think is
 useful and spend a day or two on it, but the value doesn't show up as
 fps in the application that suggested the optimization to me.  Then I
 wonder if this transformation of the code is paying off in general,
 and thus if I should push it.  If I don't push it, I end up bringing
 that patch out on every application I look at that it could affect, to
 see if now I finally have justification to get it out of a private
 branch.

 At a conference this week, we heard about how another team is
 using a database of (assembly) shaders, which they run through their
 compiler and count resulting instructions for testing purposes.  This
 sounded like a fun idea, so I threw one together.  Patch #1 is good in
 general (hey, link errors, finally!), but also means that a quick hack
 to glslparsertest makes it link a passing compile shader and therefore
 generate assembly that gets dumped under INTEL_DEBUG=wm.  Patch #2 I
 used for automatic scraping of shaders in every application I could
 find on my system at the time.  The open-source ones I pushed to:

 http://cgit.freedesktop.org/~anholt/shader-db

 And finally, patch #3 is something I built before but couldn't really
 justify until now.  However, given that it reduced fragment shader
 instructions 0.3% across 831 shaders (affecting 52 of them including
 yofrankie, warsow, norsetto, and gstreamer) and didn't increase
 instructions anywhere, I'm a lot happier now.

 Hopefully we hook up EXT_timer_query to apitrace soon so I can do more
 targeted optimizations and need this less :) In the meantime, I hope
 this can prove useful to others -- if you want to contribute
 appropriately-licensed shaders to the database so we track those, or
 if you want to make the analysis work on your hardware backend, feel
 free.


I have been thinking of doing something slightly different. Sadly,
instruction count is not necessarily the best metric for evaluating the
optimizations a shader compiler performs. Hiding the texture fetch
latency of a shader can improve performance a lot more than saving two
instructions. So my idea was to write a GL app that renders the same
shader into a framebuffer thousands of times. The point of using an FBO
is to keep things like swapbuffers from playing a role while we are
solely interested in shader performance. We should also use an FBO as big
as possible so the fragment shader has a lot of pixels to go through, and
I believe we should disable things like blending and the zbuffer so that
no other part of the pipeline affects the shader in any way.

Other things might play a role too. For instance, if we provide a small
dummy texture we might hide the gain a texture fetch optimization would
give, since the GPU may be able to keep the texture in cache and thus
have very low latency on each texture fetch. Likewise, if we use the
same texture for every unit, the texture cache might hide latency that a
real application would otherwise face. So I think we need dummy textures
that are big enough, say 512*512, a different one for each unit, and
random u,v coordinates for the texture fetches so that the texture cache
doesn't hide too much of the latency.

I am sure I am missing other factors that we should try to minimize
while testing shader performance.
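
A rough sketch of that measurement loop, against EXT_framebuffer_object and
EXT_timer_query.  It assumes a current GL context with the shader under test
already bound, and draw_fullscreen_quad() is a hypothetical helper that issues
the actual draw:

#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>
#include <stdio.h>

#define FBO_SIZE 2048
#define RUNS     1000

static GLuint64EXT
time_shader(void (*draw_fullscreen_quad)(void))
{
   GLuint fbo, color, query;
   GLuint64EXT ns = 0;

   /* Big offscreen target so the fragment shader dominates the cost. */
   glGenTextures(1, &color);
   glBindTexture(GL_TEXTURE_2D, color);
   glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, FBO_SIZE, FBO_SIZE, 0,
                GL_RGBA, GL_UNSIGNED_BYTE, NULL);

   glGenFramebuffersEXT(1, &fbo);
   glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
   glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                             GL_TEXTURE_2D, color, 0);
   glViewport(0, 0, FBO_SIZE, FBO_SIZE);

   /* Keep the rest of the pipeline out of the way. */
   glDisable(GL_BLEND);
   glDisable(GL_DEPTH_TEST);

   glGenQueries(1, &query);
   glBeginQuery(GL_TIME_ELAPSED_EXT, query);
   for (int i = 0; i < RUNS; i++)
      draw_fullscreen_quad();
   glEndQuery(GL_TIME_ELAPSED_EXT);

   /* Blocks until the GPU has finished all RUNS draws. */
   glGetQueryObjectui64vEXT(query, GL_QUERY_RESULT, &ns);

   printf("%d runs: %llu ns\n", RUNS, (unsigned long long)ns);
   return ns;
}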

I don't think such a thing is a good fit for piglit, but it could still be
added as a subtool (so that we don't add yet another repository).

Thanks a lot for extracting all those shaders; I am sure we can get
some people to write us shaders with somewhat advanced math under an
acceptable license.

Cheers,
Jerome


[Mesa-dev] [Bug 36651] mesa requires bison and flex to build but configure does not check for them

2011-05-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=36651

--- Comment #6 from Michal Suchanek hramr...@gmail.com 2011-05-18 08:51:04 PDT ---
AFAIK this is only a part of the solution.

The patch makes configure check for flex but does not make it fail when flex is
not found.



[Mesa-dev] [Bug 36651] mesa requires bison and flex to build but configure does not check for them

2011-05-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=36651

--- Comment #7 from Brian Paul brian.e.p...@gmail.com 2011-05-18 09:03:30 PDT ---
Are you sure you're looking at the updated patch?

See:

+AC_PATH_PROG([FLEX], [flex])
+test x$FLEX = x && AC_MSG_ERROR([flex is needed to build Mesa])
+
+AC_PATH_PROG([BISON], [bison])
+test x$BISON = x && AC_MSG_ERROR([bison is needed to build Mesa])
+

I removed flex/bison and tested and configure errors/exits as expected.



[Mesa-dev] [Bug 36651] mesa requires bison and flex to build but configure does not check for them

2011-05-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=36651

Michal Suchanek hramr...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |VERIFIED

--- Comment #8 from Michal Suchanek hramr...@gmail.com 2011-05-18 09:05:34 PDT ---
Yes, with the updated patch it works as expected.

I missed the previous one where the check was introduced.

Thanks



Re: [Mesa-dev] [PATCH] Anisotropic filtering extension for swrast

2011-05-18 Thread Maxim Levitsky
On Wed, 2011-05-18 at 08:06 -0600, Brian Paul wrote: 
 On 05/17/2011 05:08 AM, Andreas Faenger wrote:
  Hi,
 
  This patch makes it possible to have high quality texture filtering
  with the pure software renderer. The main purpose is to use it with
  osmesa. The anisotropic filtering is based on an Elliptical Weighted Average
  (EWA).

  The patch was designed to make as few changes to the existing codebase
  as possible. Therefore, the existing texture_sample_func signature has not
  been adjusted although this would have been required; a hack was used
  instead to pass the required arguments.

  I provide this patch as other people might be interested in
  using anisotropic filtering for osmesa, especially when rendering
  images in a headless environment.
 
 Thanks.  I'm about to commit your patch (with some formatting fixes).
 
 Would you be interested in implementing this feature in the Gallium 
 softpipe driver too?

Off-topic: @phoronix: please don't write an excited article about how
Mesa now supports anisotropic filtering (with a note in very small text
that it is only for the software renderer...) :-)

Best regards,
Maxim Levitsky



[Mesa-dev] [PATCH v2] nv50: add support for user clip planes.

2011-05-18 Thread Maxim Levitsky
The clip distance is calculated each time the vertex position is written,
which is suboptimal in some cases but very safe.
User clip planes are an obsolete feature anyway.

Every time the number of clip planes increases, the vertex program is
recompiled.  That ensures no overhead in the normal case (no user clip planes)
and reasonable overhead otherwise.

Fixes 3D windows in compiz and the reflection effect in neverball.
Also fixes the compiz expo plugin, where each dragged window was shown
3 times.

Thanks to Christoph Bumiller for writing the shader compiler, and for helping
me learn it well enough to fix this little issue.
Also, this is based on an old patch by him that added this support to an
older version of the shader compiler.

V2: * revalidate shader linkage only when the vertex program is rebuilt,
      as suggested by Christoph Bumiller
    * little cosmetic fixes

Signed-off-by: Maxim Levitsky maximlevit...@gmail.com
---
 src/gallium/drivers/nv50/nv50_program.c|3 ++
 src/gallium/drivers/nv50/nv50_shader_state.c   |8 ++-
 src/gallium/drivers/nv50/nv50_state_validate.c |4 +++
 src/gallium/drivers/nv50/nv50_tgsi_to_nc.c |   28 
 4 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/src/gallium/drivers/nv50/nv50_program.c b/src/gallium/drivers/nv50/nv50_program.c
index 41d3e14..4def93d 100644
--- a/src/gallium/drivers/nv50/nv50_program.c
+++ b/src/gallium/drivers/nv50/nv50_program.c
@@ -395,6 +395,9 @@ nv50_vertprog_prepare(struct nv50_translation_info *ti)
   }
}
 
+   p->vp.clpd = p->max_out;
+   p->max_out += p->vp.clpd_nr;
+
    for (i = 0; i < TGSI_SEMANTIC_COUNT; ++i) {
       switch (ti->sysval_map[i]) {
   case 2:
diff --git a/src/gallium/drivers/nv50/nv50_shader_state.c b/src/gallium/drivers/nv50/nv50_shader_state.c
index 82c346c..065a9e7 100644
--- a/src/gallium/drivers/nv50/nv50_shader_state.c
+++ b/src/gallium/drivers/nv50/nv50_shader_state.c
@@ -170,6 +170,12 @@ nv50_vertprog_validate(struct nv50_context *nv50)
    struct nouveau_channel *chan = nv50->screen->base.channel;
    struct nv50_program *vp = nv50->vertprog;
 
+   if (nv50->clip.nr > vp->vp.clpd_nr) {
+      if (vp->translated)
+         nv50_program_destroy(nv50, vp);
+      vp->vp.clpd_nr = nv50->clip.nr;
+   }
+
    if (!nv50_program_validate(nv50, vp))
       return;
 
@@ -369,7 +375,7 @@ nv50_fp_linkage_validate(struct nv50_context *nv50)
    m = nv50_vec4_map(map, 0, lin, &dummy, &vp->out[0]);
 
    for (c = 0; c < vp->vp.clpd_nr; ++c)
-      map[m++] |= vp->vp.clpd + c;
+      map[m++] = vp->vp.clpd + c;
 
    colors |= m << 8; /* adjust BFC0 id */
 
diff --git a/src/gallium/drivers/nv50/nv50_state_validate.c b/src/gallium/drivers/nv50/nv50_state_validate.c
index cdf1a98..c8a0d50 100644
--- a/src/gallium/drivers/nv50/nv50_state_validate.c
+++ b/src/gallium/drivers/nv50/nv50_state_validate.c
@@ -225,6 +225,10 @@ nv50_validate_clip(struct nv50_context *nv50)
 
    BEGIN_RING(chan, RING_3D(VP_CLIP_DISTANCE_ENABLE), 1);
    OUT_RING  (chan, (1 << nv50->clip.nr) - 1);
+
+   if (nv50->vertprog && nv50->vertprog->translated &&
+       nv50->clip.nr > nv50->vertprog->vp.clpd_nr)
+      nv50->dirty |= NV50_NEW_VERTPROG;
 }
 
 static void
diff --git a/src/gallium/drivers/nv50/nv50_tgsi_to_nc.c b/src/gallium/drivers/nv50/nv50_tgsi_to_nc.c
index 25dcaae..5efa99c 100644
--- a/src/gallium/drivers/nv50/nv50_tgsi_to_nc.c
+++ b/src/gallium/drivers/nv50/nv50_tgsi_to_nc.c
@@ -1990,6 +1990,34 @@ bld_instruction(struct bld_context *bld,
 
FOR_EACH_DST0_ENABLED_CHANNEL(c, insn)
   emit_store(bld, insn, c, dst0[c]);
+
+
+   const struct tgsi_full_dst_register *dreg = &insn->Dst[0];
+   struct nv50_program *prog = bld->ti->p;
+
+   if (prog->vp.clpd_nr && prog->type == PIPE_SHADER_VERTEX &&
+       dreg->Register.File == TGSI_FILE_OUTPUT &&
+       prog->out[dreg->Register.Index].sn == TGSI_SEMANTIC_POSITION) {
+
+      for (int p = 0 ; p < prog->vp.clpd_nr ; p++) {
+         struct nv_value *clipd = NULL;
+
+         for (int c = 0 ; c < 4 ; c++) {
+            temp = new_value(bld->pc, NV_FILE_MEM_C(15), NV_TYPE_F32);
+            temp->reg.id = p * 4 + c;
+            temp = bld_insn_1(bld, NV_OP_LDA, temp);
+
+            clipd = clipd ?
+               bld_insn_3(bld, NV_OP_MAD, dst0[c], temp, clipd) :
+               bld_insn_2(bld, NV_OP_MUL, dst0[c], temp);
+         }
+
+         temp = bld_insn_1(bld, NV_OP_MOV, clipd);
+         temp->reg.file = NV_FILE_OUT;
+         temp->reg.id = bld->ti->p->vp.clpd + p;
+         temp->insn->fixed = 1;
+      }
+   }
 }
 
 static INLINE void
-- 
1.7.4.1



[Mesa-dev] [PATCH] egl: Link wayland-drm.a into libEGL after egl_dri2

2011-05-18 Thread Thierry Reding
Fixes the following build error in wayland-demos:

  CCLD   wayland-compositor
/usr/lib/libEGL.so: undefined reference to `wayland_drm_buffer_get_buffer'
/usr/lib/libEGL.so: undefined reference to `wayland_drm_uninit'
/usr/lib/libEGL.so: undefined reference to `wayland_buffer_is_drm'
/usr/lib/libEGL.so: undefined reference to `wayland_drm_init'
/usr/lib/libEGL.so: undefined reference to `wl_drm_interface'
---
 src/egl/main/Makefile |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/egl/main/Makefile b/src/egl/main/Makefile
index 6c24255..6c4a392 100644
--- a/src/egl/main/Makefile
+++ b/src/egl/main/Makefile
@@ -54,10 +54,6 @@ OBJECTS = $(SOURCES:.c=.o)
 LOCAL_CFLAGS = -D_EGL_OS_UNIX=1
 LOCAL_LIBS =
 
-ifneq ($(findstring wayland, $(EGL_PLATFORMS)),)
-LOCAL_LIBS += $(TOP)/src/egl/wayland/wayland-drm/libwayland-drm.a
-endif
-
 # egl_dri2 and egl_glx are built-ins
 ifeq ($(filter dri2, $(EGL_DRIVERS_DIRS)),dri2)
 LOCAL_CFLAGS += -D_EGL_BUILT_IN_DRIVER_DRI2
@@ -68,6 +64,10 @@ endif
 EGL_LIB_DEPS += $(LIBUDEV_LIBS) $(DLOPEN_LIBS) $(LIBDRM_LIB) $(WAYLAND_LIBS)
 endif
 
+ifneq ($(findstring wayland, $(EGL_PLATFORMS)),)
+LOCAL_LIBS += $(TOP)/src/egl/wayland/wayland-drm/libwayland-drm.a
+endif
+
 ifeq ($(filter glx, $(EGL_DRIVERS_DIRS)),glx)
 LOCAL_CFLAGS += -D_EGL_BUILT_IN_DRIVER_GLX
 LOCAL_LIBS += $(TOP)/src/egl/drivers/glx/libegl_glx.a
-- 
1.7.5.1



Re: [Mesa-dev] shader-db, and justifying an i965 compiler optimization.

2011-05-18 Thread Eric Anholt
On Wed, 18 May 2011 09:00:09 +0200, Ian Romanick i...@freedesktop.org wrote:
 
 On 05/18/2011 05:22 AM, Eric Anholt wrote:
  One of the pain points of working on compiler optimizations has been
  justifying them -- sometimes I come up with something I think is
  useful and spend a day or two on it, but the value doesn't show up as
  fps in the application that suggested the optimization to me.  Then I
  wonder if this transformation of the code is paying off in general,
  and thus if I should push it.  If I don't push it, I end up bringing
  that patch out on every application I look at that it could affect, to
  see if now I finally have justification to get it out of a private
  branch.
  
  At a conference this week, we heard about how another team is
  using a database of (assembly) shaders, which they run through their
  compiler and count resulting instructions for testing purposes.  This
  sounded like a fun idea, so I threw one together.  Patch #1 is good in
 
 This is one of those ideas that seems so obvious after you hear about it
 that you can't believe you hadn't thought of it years ago.  This seems
 like something we'd want in piglit, but I'm not sure how that would look.

Incidentally, Tom Stellard has apparently been doing this across piglit
already.  This makes me think that maybe I want to just roll the
captured open-source shaders into glslparsertest, and just use the
analysis stuff on piglit.

 The first problem is, obviously, using INTEL_DEBUG=wm to get the
 instruction counts won't work. :)  Perhaps we could extend some of the
 existing assembly program queries (e.g.,
 GL_PROGRAM_NATIVE_INSTRUCTIONS_ARB) to GLSL.  That would help even if we
 didn't incorporate this into piglit.

You say it won't work, but I'm using it and it is working :)

Oh, you mean you want a clean solution and not a dirty hack?  Yeah, I'd
really like to have an interface for apps (read: shader debuggers) to
get our annotated assembly out.

  And finally, patch #3 is something I built before but couldn't really
  justify until now.  However, given that it reduced fragment shader
  instructions 0.3% across 831 shaders (affecting 52 of them including
  yofrankie, warsow, norsetto, and gstreamer) and didn't increase
  instructions anywhere, I'm a lot happier now.
 
 We'll probably want to be able to disable this once we have some sort of
 CSE on the low-level IR.  This sort of optimization can cause problems
 for CSE in cases where the same register is a source and a destination.
  Imagine something like
 
   z = sqrt(x) + y;
   z = z * w;
   q = sqrt(x) + y;
 
 If the result of the first 'sqrt(x) + y' is written directly to z, the
 value is gone when the second 'sqrt(x) + y' is executed.  If that
 result is written to a temporary register that is then copied to z, the
 value is still around at the second instance.
 
 Since we don't have any CSE, this doesn't matter now.  However, it's
 something to keep in mind.

I think for CSE on 965 LIR, we'll want to be aggressive, and just
consider whether the RHS values are still around, so we can execute to a
temp and reuse it on math instructions.  Otherwise, you end up with
weird ordering requirements on the optimization passes to ensure that
register coalescing doesn't kill these CSE opportunities.




Re: [Mesa-dev] shader-db, and justifying an i965 compiler optimization.

2011-05-18 Thread Eric Anholt
On Wed, 18 May 2011 11:05:39 -0400, Jerome Glisse j.gli...@gmail.com wrote:
 On Tue, May 17, 2011 at 11:22 PM, Eric Anholt e...@anholt.net wrote:
  One of the pain points of working on compiler optimizations has been
  justifying them -- sometimes I come up with something I think is
  useful and spend a day or two on it, but the value doesn't show up as
  fps in the application that suggested the optimization to me.  Then I
  wonder if this transformation of the code is paying off in general,
  and thus if I should push it.  If I don't push it, I end up bringing
  that patch out on every application I look at that it could affect, to
  see if now I finally have justification to get it out of a private
  branch.
 
  At a conference this week, we heard about how another team is
  using a database of (assembly) shaders, which they run through their
  compiler and count resulting instructions for testing purposes.  This
  sounded like a fun idea, so I threw one together.  Patch #1 is good in
  general (hey, link errors, finally!), but also means that a quick hack
  to glslparsertest makes it link a passing compile shader and therefore
  generate assembly that gets dumped under INTEL_DEBUG=wm.  Patch #2 I
  used for automatic scraping of shaders in every application I could
  find on my system at the time.  The open-source ones I pushed to:
 
  http://cgit.freedesktop.org/~anholt/shader-db
 
  And finally, patch #3 is something I built before but couldn't really
  justify until now.  However, given that it reduced fragment shader
  instructions 0.3% across 831 shaders (affecting 52 of them including
  yofrankie, warsow, norsetto, and gstreamer) and didn't increase
  instructions anywhere, I'm a lot happier now.
 
  Hopefully we hook up EXT_timer_query to apitrace soon so I can do more
  targeted optimizations and need this less :) In the meantime, I hope
  this can prove useful to others -- if you want to contribute
  appropriately-licensed shaders to the database so we track those, or
  if you want to make the analysis work on your hardware backend, feel
  free.
 
 
 I have been thinking of doing something slightly different. Sadly,
 instruction count is not necessarily the best metric for evaluating the
 optimizations a shader compiler performs. Hiding the texture fetch
 latency of a shader can improve performance a lot more than saving two
 instructions. So my idea was to write a GL app that renders the same
 shader into a framebuffer thousands of times. The point of using an FBO
 is to keep things like swapbuffers from playing a role while we are
 solely interested in shader performance. We should also use an FBO as
 big as possible so the fragment shader has a lot of pixels to go
 through, and I believe we should disable things like blending and the
 zbuffer so that no other part of the pipeline affects the shader in any
 way.

You might take a look at mesa-demos/src/perf for that.  I haven't had
success using them for performance work due to the noisiness of the
results.

More generally, imo, the problem with that plan is you have to build the
shaders yourself and justify to yourself why that shader you wrote is
representative, and you spend all your time on building the tests when
you just wanted to know if an instruction-reduction optimization did
anything.  shader-db took me one evening to build and collect for all
applications I had (I've got a personal branch for all the closed-source
stuff :/ )

For actual performance testing of apps without idsoftware-style
timedemos, I'm way more excited by the potential of using apitrace with
EXT_timer_query to decide which shaders I should be analyzing, and then
I'd know afterward whether I impacted a real application by replaying
the trace.  That is, assuming I didn't increase CPU costs in the
process, which is where an apitrace replay would not be representative.

Our perspective is: if we are driving the hardware anywhere below what
is possible, that is a bug that we should fix.  Analyzing the costs of
instructions, scheduling impacts, CPU overhead impacts, etc. may be out
of scope for shader-db, but it does make some types of analysis quick and
easy (test all the shaders you have ever seen in a couple of minutes).




Re: [Mesa-dev] Status of VDPAU and XvMC state-trackers (was Re: Build error on current xvmc-r600 pipe-video)

2011-05-18 Thread Christian König
On Monday, 16.05.2011 at 19:54 +0100, Andy Furniss wrote:
 I noticed another strange thing with pipe-video on my rv670.
 
 Until recently there was a bug that made the mesa demo lodbias misrender.
 
 It's fixed now in master and pipe-video, but if I use pipe-video + vdpau 
 decode (xvmc untested) then lodbias reverts to the broken state.

 I don't install pipe-video, so just using libvdpau_g3dvl makes my 
 installed (master) driver behave differently on that test. I have to 
 reboot to get working lodbias and have failed to regress it any other way.
Sounds like some register bits for LOD biasing didn't get set correctly.
So they are programmed once by the vdpau driver and never get reset
to their initial state. Which piglit test exactly shows a regression?
It shouldn't be too hard to reproduce and fix.

Christian.



Re: [Mesa-dev] shader-db, and justifying an i965 compiler optimization.

2011-05-18 Thread Jerome Glisse
On Wed, May 18, 2011 at 3:16 PM, Eric Anholt e...@anholt.net wrote:
 On Wed, 18 May 2011 11:05:39 -0400, Jerome Glisse j.gli...@gmail.com wrote:
 On Tue, May 17, 2011 at 11:22 PM, Eric Anholt e...@anholt.net wrote:
  One of the pain points of working on compiler optimizations has been
  justifying them -- sometimes I come up with something I think is
  useful and spend a day or two on it, but the value doesn't show up as
  fps in the application that suggested the optimization to me.  Then I
  wonder if this transformation of the code is paying off in general,
  and thus if I should push it.  If I don't push it, I end up bringing
  that patch out on every application I look at that it could affect, to
  see if now I finally have justification to get it out of a private
  branch.
 
  At a conference this week, we heard about how another team is
  using a database of (assembly) shaders, which they run through their
  compiler and count resulting instructions for testing purposes.  This
  sounded like a fun idea, so I threw one together.  Patch #1 is good in
  general (hey, link errors, finally!), but also means that a quick hack
  to glslparsertest makes it link a passing compile shader and therefore
  generate assembly that gets dumped under INTEL_DEBUG=wm.  Patch #2 I
  used for automatic scraping of shaders in every application I could
  find on my system at the time.  The open-source ones I pushed to:
 
  http://cgit.freedesktop.org/~anholt/shader-db
 
  And finally, patch #3 is something I built before but couldn't really
  justify until now.  However, given that it reduced fragment shader
  instructions 0.3% across 831 shaders (affecting 52 of them including
  yofrankie, warsow, norsetto, and gstreamer) and didn't increase
  instructions anywhere, I'm a lot happier now.
 
  Hopefully we hook up EXT_timer_query to apitrace soon so I can do more
  targeted optimizations and need this less :) In the meantime, I hope
  this can prove useful to others -- if you want to contribute
  appropriately-licensed shaders to the database so we track those, or
  if you want to make the analysis work on your hardware backend, feel
  free.
 

 I have been thinking of doing something slightly different. Sadly,
 instruction count is not necessarily the best metric for evaluating the
 optimizations a shader compiler performs. Hiding the texture fetch
 latency of a shader can improve performance a lot more than saving two
 instructions. So my idea was to write a GL app that renders the same
 shader into a framebuffer thousands of times. The point of using an FBO
 is to keep things like swapbuffers from playing a role while we are
 solely interested in shader performance. We should also use an FBO as
 big as possible so the fragment shader has a lot of pixels to go
 through, and I believe we should disable things like blending and the
 zbuffer so that no other part of the pipeline affects the shader in any
 way.

 You might take a look at mesa-demos/src/perf for that.  I haven't had
 success using them for performance work due to the noisiness of the
 results.

 More generally, imo, the problem with that plan is you have to build the
 shaders yourself and justify to yourself why that shader you wrote is
 representative, and you spend all your time on building the tests when
 you just wanted to know if an instruction-reduction optimization did
 anything.  shader-db took me one evening to build and collect for all
 applications I had (I've got a personal branch for all the closed-source
 stuff :/ )

A shader is a bunch of inputs, so for each shader collected the issue is
to provide proper input. Textures could use dummy data unless the shader
has some dependency on the texture data (for instance if the fetched texel
data determines the number of iterations or is used to kill a fragment, ...).
It's all about going through the known shaders and building a reasonable set
of inputs for each of them; it's time consuming, but I believe it brings a
lot more from a testing point of view.

 For actual performance testing of apps without idsoftware-style
 timedemos, I'm way more excited by the potential of using apitrace with
 EXT_timer_query to decide which shaders I should be analyzing, and then
 I'd know afterward whether I impacted a real application by replaying
 the trace.  That is, assuming I didn't increase CPU costs in the
 process, which is where an apitrace replay would not be representative.

 Our perspective is: if we are driving the hardware anywhere below what
 is possible, that is a bug that we should fix.  Analyzing the costs of
 instructions, scheduling impacts, CPU overhead impacts, etc. may be out
 of scope for shader-db, but does make some types of analysis quick and
 easy (test all shaders you have ever seen of in a couple minutes).

I agree that shader-db provides a useful tool; I am just convinced
that the number of instructions in a complex shader is a bad metric,
especially when considering things like r6xx and newer classes of hw where
texture fetches and instructions can run 

[Mesa-dev] [PATCH] mesa: fix vertex array enable checking in check_valid_to_render()

2011-05-18 Thread Brian Paul
In particular, this fixes the case where a vertex shader only uses
generic vertex attributes (non-0th).  Before, we were no-op'ing the
glDrawArrays/Elements().

This fixes the new piglit pos-array test.

NOTE: This is a candidate for the 7.10 branch.
---
 src/mesa/main/api_validate.c |   34 --
 1 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 7c4652f..993519f 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -116,17 +116,39 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
   break;
 #endif
 
-#if FEATURE_ES1 || FEATURE_GL
+#if FEATURE_ES1
case API_OPENGLES:
-   case API_OPENGL:
-  /* For regular OpenGL, only draw if we have vertex positions
-   * (regardless of whether or not we have a vertex program/shader). */
-      if (!ctx->Array.ArrayObj->Vertex.Enabled &&
-          !ctx->Array.ArrayObj->VertexAttrib[0].Enabled)
+  /* For OpenGL ES, only draw if we have vertex positions
+   */
+      if (!ctx->Array.ArrayObj->Vertex.Enabled)
 return GL_FALSE;
   break;
 #endif
 
+#if FEATURE_GL
+   case API_OPENGL:
+  {
+         const struct gl_shader_program *vsProg =
+            ctx->Shader.CurrentVertexProgram;
+         GLboolean haveVertexShader = (vsProg && vsProg->LinkStatus);
+         GLboolean haveVertexProgram = ctx->VertexProgram._Enabled;
+ if (haveVertexShader || haveVertexProgram) {
+/* Draw regardless of whether or not we have any vertex arrays.
+ * (Ex: could draw a point using a constant vertex pos)
+ */
+return GL_TRUE;
+ }
+ else {
+/* Draw if we have vertex positions (GL_VERTEX_ARRAY or generic
+ * array [0]).
+ */
+            return (ctx->Array.ArrayObj->Vertex.Enabled ||
+                    ctx->Array.ArrayObj->VertexAttrib[0].Enabled);
+ }
+  }
+  break;
+#endif
+
default:
   ASSERT_NO_FEATURE();
}
-- 
1.7.3.4
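
To make the fixed case concrete, here is a minimal sketch (GL 2.0 API,
hypothetical helper) of a draw that uses only a generic vertex attribute; when
that attribute lands in a slot other than 0, check_valid_to_render() used to
no-op it:

#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>

/* The vertex shader reads only the generic attribute "pos", never gl_Vertex,
 * so neither GL_VERTEX_ARRAY nor generic array 0 needs to be enabled. */
static const char *vs_src =
   "attribute vec4 pos;\n"
   "void main() { gl_Position = pos; }\n";

static void
draw_with_generic_attrib_only(GLuint prog, const GLfloat *verts, GLsizei count)
{
   /* 'prog' is assumed to have been built from vs_src elsewhere. */
   GLint pos_loc = glGetAttribLocation(prog, "pos");   /* often != 0 */

   glUseProgram(prog);
   glEnableVertexAttribArray(pos_loc);
   glVertexAttribPointer(pos_loc, 2, GL_FLOAT, GL_FALSE, 0, verts);
   glDrawArrays(GL_TRIANGLE_STRIP, 0, count);   /* was no-op'd before the fix */
   glDisableVertexAttribArray(pos_loc);
}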



[Mesa-dev] wayland installed on meego

2011-05-18 Thread 李海梅
Hi, everyone. I have run into a problem.

I installed wayland on MeeGo.
When running the compositor, I get a warning as follows:

mesa warning:dri2 failed to create dri screen

My card is i915.
Has anybody met this problem? I guess it is related to the setup of mesa.
I hope to get some reply on this mailing list.

Thanks a lot.
lanyijia


[Mesa-dev] wayland works on meego

2011-05-18 Thread 李海梅
Hi, everyone. I have run into a problem.

I installed wayland on MeeGo.
When running the compositor, I get a warning as follows:

mesa warning:dri2 failed to create dri screen

My card is i915.
Has anybody met this problem? I guess it is related to the setup of mesa,
or there is some important step that is needed when installing mesa.
I hope to get some replies on this mailing list.

Thanks a lot.
lanyijia


Re: [Mesa-dev] shader-db, and justifying an i965 compiler optimization.

2011-05-18 Thread Tom Stellard
On Wed, May 18, 2011 at 12:23:40PM -0700, Eric Anholt wrote:
 On Wed, 18 May 2011 09:00:09 +0200, Ian Romanick i...@freedesktop.org wrote:
  
  On 05/18/2011 05:22 AM, Eric Anholt wrote:
   One of the pain points of working on compiler optimizations has been
   justifying them -- sometimes I come up with something I think is
   useful and spend a day or two on it, but the value doesn't show up as
   fps in the application that suggested the optimization to me.  Then I
   wonder if this transformation of the code is paying off in general,
   and thus if I should push it.  If I don't push it, I end up bringing
   that patch out on every application I look at that it could affect, to
   see if now I finally have justification to get it out of a private
   branch.
   
   At a conference this week, we heard about how another team is
   using a database of (assembly) shaders, which they run through their
   compiler and count resulting instructions for testing purposes.  This
   sounded like a fun idea, so I threw one together.  Patch #1 is good in
  
  This is one of those ideas that seems so obvious after you hear about it
  that you can't believe you hadn't thought of it years ago.  This seems
  like something we'd want in piglit, but I'm not sure how that would look.
 
 Incidentally, Tom Stellard has apparently been doing this across piglit
 already.  This makes me think that maybe I want to just roll the
 captured open-source shaders into glslparsertest, and just use the
 analysis stuff on piglit.


I use this piglit patch to help capture shader stats:
http://lists.freedesktop.org/archives/piglit/2010-December/000189.html

It redirects any line of output that begins with ~ to a stats file. Then
I use sdiff to compare stats files from different piglit runs.

The output looks like this:

shaders/glsl-orangebook-ch06-bump
 FRAGMENT PROGRAM ~~~
~  25 Instructions
~  25 Vector Instructions (RGB)
~   4 Scalar Instructions (Alpha)
~   0 Flow Control Instructions
~   0 Texture Instructions
~   2 Presub Operations
~   6 Temporary Registers
~~ END ~~

This patch is probably a little overkill, though, because as Marek
pointed out, the same thing could be accomplished by grep'ing the raw
output from piglit.  This has been useful for testing compiler
optimizations, but it would be much better if there were some real world
shaders in piglit.

Also, the glslparsertest hack isn't working on r300g, because shaders
don't get compiled in the r300 backend until the first time they are used.
It's done this way so the driver can emulate things like shadow samplers
in the shader.  I'm not sure what the best solution is for this.  Maybe
we could add an environment variable to force compilation at link time.
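
Something along these lines (hypothetical names, not actual r300g code) is
probably all the override would take -- a sketch of gating early compilation
on an environment variable:

#include <stdlib.h>

/* Hypothetical switch: when set, compile shaders at create/link time
 * instead of deferring until the first draw that uses them. */
static int
r300_force_early_compile(void)
{
   static int cached = -1;
   if (cached < 0) {
      const char *s = getenv("R300_FORCE_COMPILE");   /* made-up name */
      cached = (s != NULL && s[0] != '0');
   }
   return cached;
}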

-Tom

