Re: [Mesa-dev] [PATCH 00/17] i965/vs: Generalize VS compiler back-end in preparation for GS.

2013-04-08 Thread Jordan Justen
Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On Sun, Apr 7, 2013 at 3:53 PM, Paul Berry stereotype...@gmail.com wrote:
 This patch series lays the groundwork for the i965 geometry shader
 back-end by separating the functions and data structures which are
 specific to vertex shaders from those that can also be used to compile
 geometry shaders.  (Following a naming convention that is already
 present in the codebase, this common code is referred to as vec4
 code, since in the future we should be able to use it for any shader
 stage where the hardware expects a pair of vec4s to be stored in each
 register.  This includes tessellation control and tessellation
 evaluation shaders.)

 In particular, the following structs/classes have been split into a
 base class containing vec4-generic data, and a derived class
 containing VS-specific data:

 - brw_vs_compile (new base struct is brw_vec4_compile)
 - brw_vs_prog_key (new base struct is brw_vec4_prog_key)
 - brw_vs_prog_data (new base struct is brw_vec4_prog_data)
 - vec4_visitor (new derived class is vec4_vs_visitor)

 In the case of vec4_visitor, standard C++ inheritance is used, and
 VS-specific behaviours are moved into virtual functions.  The other
 three cases use C-style inheritance (the derived struct has an
 explicit base element, and there are no virtual functions), since
 these structs need to be accessible from plain C code.
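
A minimal sketch of the C-style inheritance described here, assuming hypothetical struct and function names (everything ending in _sketch is made up for illustration and is not the actual Mesa definition). The derived struct embeds the base struct as its first element, generic code takes a pointer to the base, and stage-specific code recovers the derived struct from it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the vec4-generic data. */
struct vec4_prog_data_sketch {
   unsigned total_grf;
};

/* Hypothetical stand-in for the VS-specific data. */
struct vs_prog_data_sketch {
   struct vec4_prog_data_sketch base;   /* explicit base element */
   unsigned urb_read_length;            /* VS-only field */
};

/* Stage-generic code only ever sees the base struct. */
static unsigned vec4_total_grf_sketch(const struct vec4_prog_data_sketch *base)
{
   assert(base != NULL);
   return base->total_grf;
}

/* Downcast from base to derived.  Only valid when `base` really is the
 * `base` member of a vs_prog_data_sketch.
 */
static struct vs_prog_data_sketch *
vs_prog_data_from_base_sketch(struct vec4_prog_data_sketch *base)
{
   assert(base != NULL);
   return (struct vs_prog_data_sketch *)
      ((char *) base - offsetof(struct vs_prog_data_sketch, base));
}
```

Because there are no virtual functions, plain C code can use both structs directly; the cost is that the downcast is unchecked.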

 In addition, small modifications have been made to the vec4_generator
 class and the brw_compute_vue_map() function to generalize them for
 use by both vertex and geometry shaders.

 To keep merge conflicts to a minimum (since this patch series has been
 in development for several weeks), I've tried to minimize the amount
 of code motion introduced by this change.  As a result, the patch
 series leaves vec4_vs_visitor functions scattered in several files
 (brw_vec4.cpp, brw_vec4_visitor.cpp, and brw_vec4_vp.cpp).  Once this
 series lands, I'd like to follow up with a patch series that moves all
 of the vec4_vs_visitor functions to a new brw_vec4_vs_visitor.cpp
 file, and moves the class declaration to a corresponding header.  I'll
 wait until this patch series has landed before starting on that, and
 try not to do it while anyone is in the middle of major VS back-end
 work.

 Note that this patch series must be applied atop the patch "i965/vs:
 Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes", which I
 sent out for review this morning.  You can find the complete series in
 context at branch "gs-backend-prep" of
 git://github.com/stereotype441/mesa.git.

 In order to verify that it's actually possible to build geometry
 shader functionality atop these changes, I have begun prototyping an
 implementation of geometry shaders which passes all existing geometry
 shader Piglit tests.  (Sadly, there are many more tests left to write,
 and features left to implement!)  I'll be sending out those patches in
 the coming month(s), as they mature.  You can find that series in
 branch "gs" of git://github.com/stereotype441/mesa.git.  Note that the
 "gs" branch is *highly volatile*, so if you want to base work on it,
 please let me know so we can coordinate.

 Piglit-tested on i965/Gen7--no regressions.
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] radeonsi: Handle new format for configuration values emitted by the LLVM backend

2013-04-08 Thread Michel Dänzer
On Fri, 2013-04-05 at 14:54 -0400, Tom Stellard wrote:
 From: Tom Stellard thomas.stell...@amd.com
 
 Instead of emitting configuration values (e.g. number of gprs used) in a
 predefined order, the LLVM backend now emits these values in
 register/value pairs.  The first dword contains the register address and
 the second dword contains the value to write.
 ---
  src/gallium/drivers/radeonsi/radeonsi_shader.c | 26 +++---
  1 file changed, 23 insertions(+), 3 deletions(-)
 
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
 index 0aeecc2..78c1cf4 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
 +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
 @@ -1175,9 +1175,29 @@ int si_pipe_shader_create(
 }
 }
  
 -   shader->num_sgprs = util_le32_to_cpu(*(uint32_t*)binary.config);
 -   shader->num_vgprs = util_le32_to_cpu(*(uint32_t*)(binary.config + 4));
 -   shader->spi_ps_input_ena = util_le32_to_cpu(*(uint32_t*)(binary.config + 8));
 +   /* XXX: We may be able to emit some of these values directly rather than
 +    * extracting fields to be emitted later.
 +    */
 +   for (i = 0; i < binary.config_size; i+= 8) {
 +   unsigned reg = util_le32_to_cpu(*(uint32_t*)(binary.config + i));
 +   unsigned value = util_le32_to_cpu(*(uint32_t*)(binary.config + i + 4));
 +   switch (reg) {
 +   case R_00B028_SPI_SHADER_PGM_RSRC1_PS:
 +   case R_00B128_SPI_SHADER_PGM_RSRC1_VS:
 +   case R_00B228_SPI_SHADER_PGM_RSRC1_GS:
 +   case R_00B848_COMPUTE_PGM_RSRC1:
 +   shader->num_sgprs = (G_00B028_SGPRS(value) * 8) + 1;
 +   shader->num_vgprs = (G_00B028_VGPRS(value) * 4) + 1;

This results in the correct values being written to the registers, but I
think something like

  shader->num_sgprs = (G_00B028_SGPRS(value) + 1) * 8;
  shader->num_vgprs = (G_00B028_VGPRS(value) + 1) * 4;

makes clearer how many GPRs are allocated in the hardware.
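
To illustrate why both forms end up writing the same register value, here is a small sketch. It rests on assumptions that are not stated in this thread: the SKETCH_ macros use an assumed field layout (VGPRS in bits 0-5, SGPRS in bits 6-9), and the driver is assumed to re-encode the count as (n - 1) / 8 when it later emits the register. All helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define SKETCH_G_VGPRS(x) (((x) >> 0) & 0x3F)
#define SKETCH_G_SGPRS(x) (((x) >> 6) & 0x0F)

/* Suggested form: number of registers actually allocated. */
static unsigned sgprs_allocated_sketch(uint32_t rsrc1)
{
   return (SKETCH_G_SGPRS(rsrc1) + 1) * 8;
}

static unsigned vgprs_allocated_sketch(uint32_t rsrc1)
{
   return (SKETCH_G_VGPRS(rsrc1) + 1) * 4;
}

/* Form used in the patch under review. */
static unsigned sgprs_patch_form_sketch(uint32_t rsrc1)
{
   return SKETCH_G_SGPRS(rsrc1) * 8 + 1;
}

/* Assumed re-encoding done when the register is emitted later. */
static unsigned sgprs_to_field_sketch(unsigned num_sgprs)
{
   assert(num_sgprs >= 1);
   return (num_sgprs - 1) / 8;
}
```

With integer division, 8f + 1 and 8(f + 1) both map back to the field value f, so the two forms differ only in how readable the intermediate count is.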


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture

2013-04-08 Thread Michel Dänzer
On Fri, 2013-04-05 at 17:36 -0400, j.gli...@gmail.com wrote:
 From: Jerome Glisse jgli...@redhat.com
 
 Most tests pass; the remaining issues are with border color and swizzle.

FWIW, those issues are there with non-compressed formats as well. I'm
afraid we might need to change the hardware border colour depending on
the swizzle.


 diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
 index ca9e8b4..d968b95 100644
 --- a/src/gallium/drivers/radeonsi/si_state.c
 +++ b/src/gallium/drivers/radeonsi/si_state.c
 @@ -1206,6 +1209,51 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen,
 }
  
 /* TODO compressed formats */

Remove this comment?


 @@ -1541,67 +1589,16 @@ boolean si_is_format_supported(struct pipe_screen *screen,
 return retval == usage;
  }
  
 -static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned level)
 -{
 -   if (util_format_is_depth_or_stencil(rtex->real_format)) {
 -   if (rtex->surface.level[level].mode == RADEON_SURF_MODE_1D) {
 -   return 4;
 -   } else if (rtex->surface.level[level].mode == RADEON_SURF_MODE_2D) {
 -   switch (rtex->real_format) {
 -   case PIPE_FORMAT_Z16_UNORM:
 -   return 5;
 -   case PIPE_FORMAT_S8_UINT_Z24_UNORM:
 -   case PIPE_FORMAT_X8Z24_UNORM:
 -   case PIPE_FORMAT_Z24X8_UNORM:
 -   case PIPE_FORMAT_Z24_UNORM_S8_UINT:
 -   case PIPE_FORMAT_Z32_FLOAT:
 -   case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
 -   return 6;
 -   default:
 -   return 7;
 -   }
 -   }
 -   }
 +static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned level, bool stencil)
 +{
 +   unsigned tile_mode_index = 0;
  
 -   switch (rtex->surface.level[level].mode) {
 -   default:
 -   assert(!"Invalid surface mode");
 -   /* Fall through */
 -   case RADEON_SURF_MODE_LINEAR_ALIGNED:
 -   return 8;
 -   case RADEON_SURF_MODE_1D:
 -   if (rtex->surface.flags & RADEON_SURF_SCANOUT)
 -   return 9;
 -   else
 -   return 13;
 -   case RADEON_SURF_MODE_2D:
 -   if (rtex->surface.flags & RADEON_SURF_SCANOUT) {
 -   switch (util_format_get_blocksize(rtex->real_format)) {
 -   case 1:
 -   return 10;
 -   case 2:
 -   return 11;
 -   default:
 -   assert(!"Invalid block size");
 -   /* Fall through */
 -   case 4:
 -   return 12;
 -   }
 -   } else {
 -   switch (util_format_get_blocksize(rtex->real_format)) {
 -   case 1:
 -   return 14;
 -   case 2:
 -   return 15;
 -   case 4:
 -   return 16;
 -   case 8:
 -   return 17;
 -   default:
 -   return 13;
 -   }
 -   }
 +   if (stencil) {
 +   tile_mode_index = rtex->surface.stencil_tiling_index[level];
 +   } else {
 +   tile_mode_index = rtex->surface.tiling_index[level];
 }
 +   return tile_mode_index;
  }
  
  /*
 @@ -1638,7 +1635,7 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
 slice = slice - 1;
 }
  
 -   tile_mode_index = si_tile_mode_index(rtex, level);
 +   tile_mode_index = si_tile_mode_index(rtex, level, false);
  
 desc = util_format_description(surf->base.format);
 for (i = 0; i < 4; i++) {
 @@ -1780,15 +1777,9 @@ static void si_db(struct r600_context *rctx, struct si_pm4_state *pm4,
 else
 s_info = S_028044_FORMAT(V_028044_STENCIL_INVALID);
  
 -   tile_mode_index = si_tile_mode_index(rtex, level);
 -   if (tile_mode_index < 4 || tile_mode_index > 7) {
 -   R600_ERR("Invalid DB tiling mode %d!\n",
 -            rtex->surface.level[level].mode);
 -   si_pm4_set_reg(pm4, R_028040_DB_Z_INFO, S_028040_FORMAT(V_028040_Z_INVALID));
 -   si_pm4_set_reg(pm4, R_028044_DB_STENCIL_INFO, S_028044_FORMAT(V_028044_STENCIL_INVALID));
 -   return;
 -   }
 +   tile_mode_index = si_tile_mode_index(rtex, level, false);
 z_info |= S_028040_TILE_MODE_INDEX(tile_mode_index);
 +   

Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture

2013-04-08 Thread Marek Olšák
On Mon, Apr 8, 2013 at 11:29 AM, Michel Dänzer mic...@daenzer.net wrote:

 On Fri, 2013-04-05 at 17:36 -0400, j.gli...@gmail.com wrote:
  From: Jerome Glisse jgli...@redhat.com
 
  Most tests pass; the remaining issues are with border color and swizzle.

 FWIW, those issues are there with non-compressed formats as well. I'm
 afraid we might need to change the hardware border colour depending on
 the swizzle.


I don't think so. The issue with the swizzled border color seems to be a
bad hardware design decision present since r600 rather than a hardware bug.
I tried fixing it for older chipsets with no success. I doubt the hw
designers fixed this for SI. The problem is the hardware tries to guess
what the border color swizzle is from the combined pipe_format+sampler view
swizzle combination. You need 2 texture swizzle states in the texture unit
for the border color to be swizzled correctly, because texels must be
swizzled by the pipe_format swizzle and sampler view swizzle, but the
border color must be swizzled by the sampler view only. The main problem is
that the hardware internally tries to undo the pipe_format swizzle in a way
that just doesn't work. I don't remember the exact swizzles being used by
hardware, but I got crazy cases like if I set texture swizzle to ywzx, the
border color will be ywyy. There is no way to access those zx components of
the border color for that specific swizzling. For some cases, the hardware
succeeds in guessing what the border color should be, e.g. if I set texture
swizzle to .zyxw, the returned border color will be .xyzw (and that would
be correct if the swizzle came from pipe_format, and incorrect if the
swizzle came from sampler view).

It was easy with r300, because I could just undo pipe_format swizzling
before passing the border color to the hardware.
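
A rough sketch of that "undo the pipe_format swizzle" idea, under simplifying assumptions that are mine rather than a description of any real hardware: swizzles are plain permutations of x, y, z, w (no constant 0/1 selectors), and the hardware is modelled as applying the combined format+view swizzle to whatever border color is stored. All function names are hypothetical.

```c
#include <assert.h>

/* out[i] = in[swz[i]]; selectors are 0..3 for x, y, z, w. */
static void apply_swizzle_sketch(const float in[4],
                                 const unsigned char swz[4], float out[4])
{
   for (int i = 0; i < 4; i++) {
      assert(swz[i] < 4);
      out[i] = in[swz[i]];
   }
}

/* Format swizzle first, then view swizzle: combined[i] = fmt[view[i]]. */
static void compose_swizzle_sketch(const unsigned char fmt[4],
                                   const unsigned char view[4],
                                   unsigned char combined[4])
{
   for (int i = 0; i < 4; i++) {
      assert(view[i] < 4);
      combined[i] = fmt[view[i]];
   }
}

/* Pre-permute the border color so that applying `fmt` to the stored
 * value yields the original color again.  Requires `fmt` to be a
 * permutation, otherwise some stored components are never written.
 */
static void undo_format_swizzle_sketch(const float border[4],
                                       const unsigned char fmt[4],
                                       float stored[4])
{
   for (int j = 0; j < 4; j++) {
      assert(fmt[j] < 4);
      stored[fmt[j]] = border[j];
   }
}
```

In this model the stored color, once run through the combined swizzle, equals the application's border color swizzled by the sampler view alone. The r600 problem described above is that the hardware does not behave like this simple model.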

Marek


Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture

2013-04-08 Thread Christoph Bumiller
On 08.04.2013 12:03, Marek Olšák wrote:
 On Mon, Apr 8, 2013 at 11:29 AM, Michel Dänzer mic...@daenzer.net wrote:

 On Fri, 2013-04-05 at 17:36 -0400, j.gli...@gmail.com wrote:
  From: Jerome Glisse jgli...@redhat.com
 
  Most tests pass; the remaining issues are with border color and swizzle.

 FWIW, those issues are there with non-compressed formats as well. I'm
 afraid we might need to change the hardware border colour depending on
 the swizzle.


 I don't think so. The issue with the swizzled border color seems to be
 a bad hardware design decision present since r600 rather than a
 hardware bug. I tried fixing it for older chipsets with no success. I
 doubt the hw designers fixed this for SI. The problem is the hardware
 tries to guess what the border color swizzle is from the combined
 pipe_format+sampler view swizzle combination. You need 2 texture
 swizzle states in the texture unit for the border color to be swizzled
 correctly, because texels must be swizzled by the pipe_format swizzle
 and sampler view swizzle, but the border color must be swizzled by the
 sampler view only. The main problem is that the hardware internally
 tries to undo the pipe_format swizzle in a way that just doesn't work.
 I don't remember the exact swizzles being used by hardware, but I got
 crazy cases like if I set texture swizzle to ywzx, the border color
 will be ywyy. There is no way to access those zx components of the
 border color for that specific swizzling. For some cases, the hardware
 succeeds in guessing what the border color should be, e.g. if I set
 texture swizzle to .zyxw, the returned border color will be .xyzw (and
 that would be correct if the swizzle came from pipe_format, and
 incorrect if the swizzle came from sampler view).

 It was easy with r300, because I could just undo pipe_format swizzling
 before passing the border color to the hardware.


Ah yes, border colour swizzle, it's a problem on NV, too, because the
border colour isn't getting swizzled at all [as far as we know].
The main issue is the separation of samplers and textures in gallium; if
that weren't the case, samplers and textures would be coupled and the
sampler state could be set according to texture view state (if it's just
OpenGL; and if it's just D3D there's no swizzle).
So I just leave it broken. I can't bring myself to destroy the elegant
separation because of such an unimportant detail; that hurts too much.

(Also, if someone were to use multiple samplers and views in gallium and
index them dynamically, I'd have to set up all combinations of textures
and samplers, which is simply ridiculous.
And now I'm going to look for some secret sampler setup bit that says
"swizzle according to texture view state". Maybe, looking into the future
of OpenGL, someone's been wise enough to add that. But then I'd have the
same problem as you. An intensity texture simply doesn't have separate
values for R,G,B,A.)

Possible solution:
Maybe the state tracker could just do the swizzling, because it knows
that samplers and views are coupled, and it knows the swizzle ?

 Marek




Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture

2013-04-08 Thread Marek Olšák
Christoph,

You're talking about something entirely different. I was trying to explain
that a correct swizzled border color is *impossible* on r600 and later
chipsets. I think your hardware is actually good and can do swizzled border
color with a little bit of driver work you refuse to do. :) You have the
option, we don't. The fact D3D doesn't have sampler swizzling actually
explains a lot.

In any case, all radeon drivers should be able to pass the normal
(unswizzled) border color tests.

Marek




[Mesa-dev] [Bug 63269] New: explicitly symlinking libraries without libtool breaks OpenBSD build

2013-04-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=63269

          Priority: medium
            Bug ID: 63269
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: explicitly symlinking libraries without libtool breaks
                    OpenBSD build
          Severity: normal
    Classification: Unclassified
                OS: OpenBSD
          Reporter: j...@openbsd.org
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Other
           Product: Mesa

Many of the Mesa Makefiles now contain a comment like:

# Provide compatibility with scripts for the old Mesa build system for
# a while by putting a link to the driver into /lib of the build tree

This is followed by commands that explicitly assume libraries are named as
they are on Linux:

../../../bin/install-sh -c -d ../../../lib
ln -f .libs/libglapi.so.0.0.0 ../../../lib/libglapi.so.0.0.0
ln: .libs/libglapi.so.0.0.0: No such file or directory

The .libs dir already contains a libglapi.so.0.0 here.  There is no symbol
versioning on OpenBSD, and library names use just so.major.minor.

Would it be possible to remove this 'compatibility with scripts' so Mesa can
build on more platforms?

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] tgsi: Ensure struct tgsi_ind_register field Index is initialized.

2013-04-08 Thread Brian Paul

On 04/06/2013 10:33 PM, Vinson Lee wrote:

Fixes uninitialized scalar variable defect reported by Coverity.

Signed-off-by: Vinson Lee v...@freedesktop.org
---
  src/gallium/auxiliary/tgsi/tgsi_build.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 509bc5c..523430b 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -835,6 +835,7 @@ tgsi_default_ind_register( void )
 struct tgsi_ind_register ind_register;

 ind_register.File = TGSI_FILE_NULL;
+   ind_register.Index = 0;
 ind_register.Swizzle = TGSI_SWIZZLE_X;
 ind_register.ArrayID = 0;



Reviewed-by: Brian Paul bri...@vmware.com


Re: [Mesa-dev] [PATCH] st/mesa: fix levels in initial texture creation

2013-04-08 Thread Brian Paul

On 04/06/2013 10:31 PM, Dave Airlie wrote:

From: Dave Airlie airl...@redhat.com

calim pointed out that we were getting mipmap levels for array multisamples,
which didn't make sense. I then noticed that this function takes last_level,
so we were passing in a value that is one too high.

I think this should fix the case he was seeing.

Signed-off-by: Dave Airlie airl...@redhat.com
---
  src/mesa/state_tracker/st_cb_texture.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c
index 25ee352..85b5609 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -1691,7 +1691,7 @@ st_AllocTextureStorage(struct gl_context *ctx,
 stObj->pt = st_texture_create(st,
   gl_target_to_pipe(texObj->Target),
   fmt,
- levels,
+ levels - 1,
   ptWidth,
   ptHeight,
   ptDepth,


Reviewed-by: Brian Paul bri...@vmware.com
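
For readers who trip over the same off-by-one, a small sketch of the distinction between a level count and a last-level index. The helper names are hypothetical and this is not the state tracker's code; it only shows the arithmetic behind passing "levels - 1".

```c
#include <assert.h>

/* Number of levels in a full mipmap chain for a width x height image. */
static unsigned full_mip_chain_levels_sketch(unsigned width, unsigned height)
{
   unsigned size = width > height ? width : height;
   unsigned levels = 1;

   assert(size > 0);
   while (size > 1) {
      size >>= 1;
      levels++;
   }
   return levels;
}

/* A count of N levels means the last valid level index is N - 1. */
static unsigned last_level_from_levels_sketch(unsigned levels)
{
   assert(levels >= 1);
   return levels - 1;
}
```

A texture with a single level therefore has last_level 0; passing the count itself would ask for one extra mipmap level.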


Re: [Mesa-dev] [PATCH] clover: Fix linkage of libOpenCL

2013-04-08 Thread Tom Stellard
On Thu, Apr 04, 2013 at 11:26:45PM +0200, Niels Ole Salscheider wrote:
 Clover needs the irreader component of llvm
 
 v2: Check for irreader component
 irreader is only available with LLVM 3.3 >= 177971
 
 Signed-off-by: Niels Ole Salscheider niels_...@salscheider-online.de

I've pushed this, thanks.

btw, I also pushed your libclc build fixes to my libclc repo.

-Tom

 ---
  configure.ac | 4 
  1 file changed, 4 insertions(+)
 
 diff --git a/configure.ac b/configure.ac
 index 81d4a3f..fea5868 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -1650,6 +1650,10 @@ if test "x$enable_gallium_llvm" = xyes; then
  
  if test "x$enable_opencl" = xyes; then
  LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
 +# LLVM 3.3 >= 177971 requires IRReader
 +if $LLVM_CONFIG --components | grep -q '\<irreader\>'; then
 +LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"
 +fi
  fi
   LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
   LLVM_BINDIR=`$LLVM_CONFIG --bindir`
 -- 
 1.8.2
 


Re: [Mesa-dev] [PATCH 01/17] i965/vs: Make type of vec4_visitor::vp more generic.

2013-04-08 Thread Eric Anholt
Paul Berry stereotype...@gmail.com writes:

 The vec4_visitor functions don't use any VS specific data from
 vec4_visitor::vp.  So rename it to just "p" and change its type from
 "struct gl_vertex_program *" to "struct gl_program *".  This will allow
 the code to be re-used for geometry shaders.

In many other places in the driver, "p" is a brw_compile.  I'd rather
not overload that name.  In a couple other cases where we've had both
the gl_shader_program and the gl_program, the shader_program becomes
"shader_prog" (only about 8 instances in brw_vec4) and gl_program gets
to be just "prog".




Re: [Mesa-dev] [PATCH 1/5] i965: Remove the BRW_NEW_INPUT_DIMENSIONS flag.

2013-04-08 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 When I removed the proj_attrib_mask optimization, I also removed the
 last consumer of this bit without realizing it.

 Since nobody uses it, there's no point in flagging it.

Series is:

Reviewed-by: Eric Anholt e...@anholt.net




Re: [Mesa-dev] [PATCH 11/17] i965/vs: move VS-specific data members to vs_vec4_visitor.

2013-04-08 Thread Eric Anholt
Paul Berry stereotype...@gmail.com writes:

 This patch moves the following data structures from vec4_visitor to
 vec4_vs_visitor, since they contain VS-specific data:

 - struct brw_vs_compile *c
 - struct brw_vs_prog_data *prog_data
 - src_reg *vp_temp_regs
 - src_reg vp_addr_reg

 Since brw_vs_compile and brw_vs_prog_data also contain vec4-generic
 data, the following pointers are added to the base class, to allow it
 to access the vec4-generic portions of these data structures:

 - struct brw_vec4_compile *vec4_compile
 - struct brw_vec4_prog_key *vec4_key
 - struct brw_vec4_prog_data *vec4_prog_data

I would lean toward the base class (which contains most of the members
and usages, I think) having the short name, and the derived class having
the more specific name.  Either way, patches 7-11 are:

Reviewed-by: Eric Anholt e...@anholt.net




Re: [Mesa-dev] [PATCH 02/17] i965: Generalize computation of VUE map in preparation for GS.

2013-04-08 Thread Eric Anholt
Paul Berry stereotype...@gmail.com writes:

 This patch modifies the arguments to brw_compute_vue_map() so that
 they no longer bake in the assumption that we are generating a VUE map
 for vertex shader outputs.  It also makes the function non-static so
 that we can re-use it for geometry shader outputs.

Reviewed-by: Eric Anholt e...@anholt.net




Re: [Mesa-dev] [PATCH 06/17] i965/vs: split brw_vs_prog_data into generic and VS-specific parts.

2013-04-08 Thread Eric Anholt
Paul Berry stereotype...@gmail.com writes:
 -/* Note: brw_vs_prog_data_compare() must be updated when adding fields to this
 - * struct!
 +
 +/* Note: brw_vec4_prog_data_compare() must be updated when adding fields to
 + * this struct!
   */
 -struct brw_vs_prog_data {
 +struct brw_vec4_prog_data {
 struct brw_vue_map vue_map;
  
 GLuint curb_read_length;
 -   GLuint urb_read_length;
 GLuint total_grf;
 GLuint nr_params;   /**< number of float params/constants */
 GLuint nr_pull_params; /**< number of dwords referenced by pull_param[] */
 GLuint total_scratch;
  
 +   int num_surfaces;
 +
 +   /* These pointers must appear last.  See brw_vec4_prog_data_compare(). */
 +   const float **param;
 +   const float **pull_param;
 +};
 +
 +
 +/* Note: brw_vs_prog_data_compare() must be updated when adding fields to this
 + * struct!
 + */
 +struct brw_vs_prog_data {
 +   struct brw_vec4_prog_data base;
 +
 +   GLuint urb_read_length;

There's a URB read length in the GS state packet, so it seems like you'd
want this field in the GS case as well as VS.  I'm confused.  I also
would have expected urb_entry_size in GS.
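
As background for the "these pointers must appear last" comment in the quoted hunk, here is a sketch of the kind of compare function such a layout allows. The struct and function are hypothetical simplifications, not the real brw_vec4_prog_data_compare(), and the sketch assumes both structs were zero-initialized so that any padding bytes compare equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified prog_data layout. */
struct prog_data_sketch {
   unsigned total_grf;
   unsigned nr_params;     /* number of entries in param[] */

   /* Pointers must stay last so the prefix compare can skip them. */
   const float **param;
};

static bool prog_data_equal_sketch(const struct prog_data_sketch *a,
                                   const struct prog_data_sketch *b)
{
   assert(a != NULL && b != NULL);

   /* Compare every field before the first pointer byte-for-byte. */
   if (memcmp(a, b, offsetof(struct prog_data_sketch, param)) != 0)
      return false;

   /* nr_params already matched above; an empty array needs no compare. */
   if (a->nr_params == 0)
      return true;

   /* Compare the array elements, not the addresses of the arrays. */
   return memcmp(a->param, b->param,
                 a->nr_params * sizeof(*a->param)) == 0;
}
```

Any field added after the pointers would silently be skipped by the prefix compare, which is presumably why the comment asks for the compare function to be updated together with the struct.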




Re: [Mesa-dev] [PATCH 12/17] i965/vs: Generalize data structures pointed to by vec4_generator.

2013-04-08 Thread Eric Anholt
Paul Berry stereotype...@gmail.com writes:

 This patch removes the following field from vec4_generator, since it
 is not used:

 - struct brw_vs_compile *c

 And changes the following field:

 - struct gl_vertex_program *vp => struct gl_program *glprog

Same comment about prog/shader_prog as a naming solution.




Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture

2013-04-08 Thread Vadim Girlin

On 04/08/2013 02:03 PM, Marek Olšák wrote:

On Mon, Apr 8, 2013 at 11:29 AM, Michel Dänzer mic...@daenzer.net wrote:


On Fri, 2013-04-05 at 17:36 -0400, j.gli...@gmail.com wrote:

From: Jerome Glisse jgli...@redhat.com

Most tests pass; the remaining issues are with border color and swizzle.


FWIW, those issues are there with non-compressed formats as well. I'm
afraid we might need to change the hardware border colour depending on
the swizzle.



I don't think so. The issue with the swizzled border color seems to be a
bad hardware design decision present since r600 rather than a hardware bug.
I tried fixing it for older chipsets with no success. I doubt the hw
designers fixed this for SI. The problem is the hardware tries to guess
what the border color swizzle is from the combined pipe_format+sampler view
swizzle combination. You need 2 texture swizzle states in the texture unit
for the border color to be swizzled correctly, because texels must be
swizzled by the pipe_format swizzle and sampler view swizzle, but the
border color must be swizzled by the sampler view only. The main problem is
that the hardware internally tries to undo the pipe_format swizzle in a way
that just doesn't work. I don't remember the exact swizzles being used by
hardware, but I got crazy cases like if I set texture swizzle to ywzx, the
border color will be ywyy. There is no way to access those zx components of
the border color for that specific swizzling. For some cases, the hardware
succeeds in guessing what the border color should be, e.g. if I set texture
swizzle to .zyxw, the returned border color will be .xyzw (and that would
be correct if the swizzle came from pipe_format, and incorrect if the
swizzle came from sampler view).



I also looked into this issue some time ago (on evergreen) and IIRC I 
found that the swizzle is actually applied twice to border color in most 
cases (at least when swizzle_y is not 2 or 3), I think it's just a bug 
(or we are missing something in the hw configuration).


Anyway, according to my tests in many cases (960 of 1296 total swizzles, 
74%) it's possible to apply some precomputed swizzle to border color 
before writing it to the registers to get the correct result in the end, 
but I'm not sure if it makes sense to implement that.


Vadim


It was easy with r300, because I could just undo pipe_format swizzling
before passing the border color to the hardware.

Marek





Re: [Mesa-dev] remove mfeatures.h, take two

2013-04-08 Thread Matt Turner
Ready to commit?


[Mesa-dev] [PATCH 1/2] mesa: Update comments to match newer specs.

2013-04-08 Thread Matt Turner
Old GL 1.x specs used 'b' but newer specs use 'p'. The line immediately
above the second hunk also uses 'p'.
---
 src/mesa/main/mtypes.h |2 +-
 src/mesa/main/texobj.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 008f68b..3d8f359 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1171,7 +1171,7 @@ struct gl_texture_object
    GLint MaxLevel;             /**< max mipmap level, OpenGL 1.2 */
    GLint ImmutableLevels;      /**< ES 3.0 / ARB_texture_view */
    GLint _MaxLevel;            /**< actual max mipmap level (q in the spec) */
-   GLfloat _MaxLambda;         /**< = _MaxLevel - BaseLevel (q - b in spec) */
+   GLfloat _MaxLambda;         /**< = _MaxLevel - BaseLevel (q - p in spec) */
    GLint CropRect[4];          /**< GL_OES_draw_texture */
    GLenum Swizzle[4];          /**< GL_EXT_texture_swizzle */
    GLuint _Swizzle;            /**< same as Swizzle, but SWIZZLE_* format */
diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index 66377c8..d0fcb12 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -553,7 +553,7 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx,
    t->_MaxLevel = MIN2(t->_MaxLevel, t->MaxLevel);
    t->_MaxLevel = MIN2(t->_MaxLevel, maxLevels - 1); /* 'q' in the GL spec */
 
-   /* Compute _MaxLambda = q - b (see the 1.2 spec) used during mipmapping */
+   /* Compute _MaxLambda = q - p in the spec used during mipmapping */
    t->_MaxLambda = (GLfloat) (t->_MaxLevel - baseLevel);
 
    if (t->Immutable) {
-- 
1.7.8.6



[Mesa-dev] [PATCH 2/2] mesa: Use MIN3 instead of two MIN2s.

2013-04-08 Thread Matt Turner
---
 src/mesa/main/texobj.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index d0fcb12..28b8130 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -548,10 +548,11 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx,
 
    ASSERT(maxLevels > 0);
 
-   t->_MaxLevel =
-      baseLevel + baseImage->MaxNumLevels - 1; /* 'p' in the GL spec */
-   t->_MaxLevel = MIN2(t->_MaxLevel, t->MaxLevel);
-   t->_MaxLevel = MIN2(t->_MaxLevel, maxLevels - 1); /* 'q' in the GL spec */
+   t->_MaxLevel = MIN3(t->MaxLevel,
+                       /* 'p' in the GL spec */
+                       baseLevel + baseImage->MaxNumLevels - 1,
+                       /* 'q' in the GL spec */
+                       maxLevels - 1);
 
    /* Compute _MaxLambda = q - p in the spec used during mipmapping */
    t->_MaxLambda = (GLfloat) (t->_MaxLevel - baseLevel);
-- 
1.7.8.6



Re: [Mesa-dev] [PATCH] i965: Use software primitive restart when transform feedback active.

2013-04-08 Thread Jordan Justen
Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On Sat, Apr 6, 2013 at 8:25 PM, Paul Berry stereotype...@gmail.com wrote:
 When transform feedback is active, the driver manually counts the
 number of primitives that run through the pipeline, so that if a batch
 buffer flush happens, the next batch buffer can pick up transform
 feedback where the last batch buffer left off.  Hardware-accelerated
 primitive restart interferes with this process (because it makes the
 primitive count depend not just on the number of vertices entering the
 pipeline, but also on the contents of the index buffer).  So, when
 transform feedback is active, we need to fall back to the software
 implementation of primitive restart.

 Fixes piglit test spec/!OpenGL 3.1/primitive-restart-xfb flush.

 NOTE: This is a candidate for stable release branches.
 ---
  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c 
 b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 index e6902b4..d0f0038 100644
 --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 @@ -27,6 +27,7 @@

  #include "main/imports.h"
  #include "main/bufferobj.h"
 +#include "main/transformfeedback.h"

  #include "brw_context.h"
  #include "brw_defines.h"
 @@ -81,11 +82,18 @@ can_cut_index_handle_prims(struct gl_context *ctx,
     struct brw_context *brw = brw_context(ctx);

     if (brw->sol.counting_primitives_generated ||
 -       brw->sol.counting_primitives_written) {
 +       brw->sol.counting_primitives_written ||
 +       _mesa_is_xfb_active_and_unpaused(ctx)) {
        /* Counting primitives generated in hardware is not currently
         * supported, so take the software path. We need to investigate
         * the *_PRIMITIVES_COUNT registers to allow this to be handled
         * entirely in hardware.
 +       *
 +       * Note that when transform feedback is active, we also count primitives
 +       * (even if the client hasn't requested it), since that is the only way
 +       * we can start at the proper place in the transform feedback buffer
 +       * after a flush.  So we also have to fall back to software when
 +       * transform feedback is active and unpaused.
         */
        return false;
     }
 --
 1.8.2



Re: [Mesa-dev] [PATCH] i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.

2013-04-08 Thread Ian Romanick

On 04/07/2013 06:42 AM, Paul Berry wrote:

The call to emit_shader_time_end() before the second URB write was
conditioned with if (eot), but eot is always false in this code
path, so emit_shader_time_end() was never being called for vertex
shaders that performed 2 URB writes.


I had to look at that code for way too long to convince myself that your 
patch was correct.  I think it might be better to remove both the 
conditional emit_shader_time_end calls and put this block of code at the 
very bottom (unless emit_shader_time_end has some side effect that I 
don't see):


   if (inst->eot) {
      if (INTEL_DEBUG & DEBUG_SHADER_TIME)
         emit_shader_time_end();
   }

Or does the last URB write have to be the last instruction?


---
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 8bd2fd8..ca1cfe8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2664,10 +2664,8 @@ vec4_visitor::emit_urb_writes()
   emit_urb_slot(mrf++, c->prog_data.vue_map.slot_to_varying[slot]);
}

-  if (eot) {
-     if (INTEL_DEBUG & DEBUG_SHADER_TIME)
-        emit_shader_time_end();
-  }
+  if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+     emit_shader_time_end();

current_annotation = "URB write";
inst = emit(VS_OPCODE_URB_WRITE);





Re: [Mesa-dev] [PATCH 1/2] mesa: Update comments to match newer specs.

2013-04-08 Thread Ian Romanick

On 04/08/2013 10:29 AM, Matt Turner wrote:

Old GL 1.x specs used 'b' but newer specs use 'p'. The line immediately
above the second hunk also uses 'p'.


Series is

Reviewed-by: Ian Romanick ian.d.roman...@intel.com


---
  src/mesa/main/mtypes.h |2 +-
  src/mesa/main/texobj.c |2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 008f68b..3d8f359 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1171,7 +1171,7 @@ struct gl_texture_object
    GLint MaxLevel;             /**< max mipmap level, OpenGL 1.2 */
    GLint ImmutableLevels;      /**< ES 3.0 / ARB_texture_view */
    GLint _MaxLevel;            /**< actual max mipmap level (q in the spec) */
-   GLfloat _MaxLambda;         /**< = _MaxLevel - BaseLevel (q - b in spec) */
+   GLfloat _MaxLambda;         /**< = _MaxLevel - BaseLevel (q - p in spec) */
    GLint CropRect[4];          /**< GL_OES_draw_texture */
    GLenum Swizzle[4];          /**< GL_EXT_texture_swizzle */
    GLuint _Swizzle;            /**< same as Swizzle, but SWIZZLE_* format */
diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index 66377c8..d0fcb12 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -553,7 +553,7 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx,
    t->_MaxLevel = MIN2(t->_MaxLevel, t->MaxLevel);
    t->_MaxLevel = MIN2(t->_MaxLevel, maxLevels - 1); /* 'q' in the GL spec */

-   /* Compute _MaxLambda = q - b (see the 1.2 spec) used during mipmapping */
+   /* Compute _MaxLambda = q - p in the spec used during mipmapping */
    t->_MaxLambda = (GLfloat) (t->_MaxLevel - baseLevel);

    if (t->Immutable) {





Re: [Mesa-dev] [PATCH] i965: Use software primitive restart when transform feedback active.

2013-04-08 Thread Ian Romanick

On 04/06/2013 08:25 PM, Paul Berry wrote:

When transform feedback is active, the driver manually counts the
number of primitives that run through the pipeline, so that if a batch
buffer flush happens, the next batch buffer can pick up transform
feedback where the last batch buffer left off.  Hardware-accelerated
primitive restart interferes with this process (because it makes the
primitive count depend not just on the number of vertices entering the
pipeline, but also on the contents of the index buffer).  So, when
transform feedback is active, we need to fall back to the software
implementation of primitive restart.
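
To illustrate why the primitive count stops being a function of the vertex count alone, here is a small sketch for GL_TRIANGLE_STRIP. The helper names are hypothetical and this is not the driver's counting code:

```c
#include <assert.h>
#include <stddef.h>

/* A strip of n vertices yields n - 2 triangles once n >= 3. */
static unsigned strip_triangles(unsigned n)
{
   return n >= 3 ? n - 2 : 0;
}

/* Count the triangles of a triangle-strip draw in which every occurrence
 * of restart_index ends the current strip and starts a new one. */
static unsigned count_strip_triangles(const unsigned *indices, size_t count,
                                      unsigned restart_index)
{
   unsigned prims = 0;
   unsigned run = 0;   /* vertices in the current strip */

   assert(indices != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (indices[i] == restart_index) {
         prims += strip_triangles(run);
         run = 0;
      } else {
         run++;
      }
   }
   return prims + strip_triangles(run);
}
```

Seven indices without a restart give 5 triangles, but the same seven slots with one restart index in the middle give only 2, so the count cannot be derived without reading the index buffer.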

Fixes piglit test spec/!OpenGL 3.1/primitive-restart-xfb flush.

NOTE: This is a candidate for stable release branches.


Oof.  This shouldn't be a performance hit on too many applications, 
thankfully.  Do we know when we're going to get real hardware counting 
support? :(


Reviewed-by: Ian Romanick ian.d.roman...@intel.com


---
  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c 
b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
index e6902b4..d0f0038 100644
--- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
+++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
@@ -27,6 +27,7 @@

  #include "main/imports.h"
  #include "main/bufferobj.h"
+#include "main/transformfeedback.h"

  #include "brw_context.h"
  #include "brw_defines.h"
@@ -81,11 +82,18 @@ can_cut_index_handle_prims(struct gl_context *ctx,
    struct brw_context *brw = brw_context(ctx);

    if (brw->sol.counting_primitives_generated ||
-       brw->sol.counting_primitives_written) {
+       brw->sol.counting_primitives_written ||
+       _mesa_is_xfb_active_and_unpaused(ctx)) {
       /* Counting primitives generated in hardware is not currently
        * supported, so take the software path. We need to investigate
        * the *_PRIMITIVES_COUNT registers to allow this to be handled
        * entirely in hardware.
+       *
+       * Note that when transform feedback is active, we also count primitives
+       * (even if the client hasn't requested it), since that is the only way
+       * we can start at the proper place in the transform feedback buffer
+       * after a flush.  So we also have to fall back to software when
+       * transform feedback is active and unpaused.
        */
       return false;
    }





[Mesa-dev] [PATCH 1/2] radeonsi: add 2d tiling support for texture v3

2013-04-08 Thread j . glisse
From: Jerome Glisse jgli...@redhat.com

v2: Remove left over code
v3: Restage the commit properly so that hunks from the first patch are not in
the second one.

Signed-off-by: Jerome Glisse jgli...@redhat.com
---
 src/gallium/drivers/radeonsi/r600_texture.c | 11 ++--
 src/gallium/drivers/radeonsi/si_state.c | 81 +
 2 files changed, 20 insertions(+), 72 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/r600_texture.c 
b/src/gallium/drivers/radeonsi/r600_texture.c
index 1b8382f..8992f9a 100644
--- a/src/gallium/drivers/radeonsi/r600_texture.c
+++ b/src/gallium/drivers/radeonsi/r600_texture.c
@@ -47,7 +47,6 @@ static void r600_copy_to_staging_texture(struct pipe_context 
*ctx, struct r600_t
transfer-box);
 }
 
-
 /* Copy from a transfer's staging texture to a full GPU one. */
 static void r600_copy_from_staging_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
 {
@@ -152,12 +151,12 @@ static int r600_init_surface(struct r600_screen *rscreen,
 
    if (!is_flushed_depth && is_depth) {
        surface->flags |= RADEON_SURF_ZBUFFER;
-
        if (is_stencil) {
            surface->flags |= RADEON_SURF_SBUFFER |
                RADEON_SURF_HAS_SBUFFER_MIPTREE;
        }
    }
+   surface->flags |= RADEON_SURF_HAS_TILE_MODE_INDEX;
    return 0;
 }
 
@@ -530,7 +529,11 @@ struct pipe_resource *si_texture_create(struct pipe_screen *screen,
 
    if (!(templ->flags & R600_RESOURCE_FLAG_TRANSFER) &&
        !(templ->bind & PIPE_BIND_SCANOUT)) {
-       array_mode = V_009910_ARRAY_1D_TILED_THIN1;
+       if (util_format_is_compressed(templ->format)) {
+           array_mode = V_009910_ARRAY_1D_TILED_THIN1;
+       } else {
+           array_mode = V_009910_ARRAY_2D_TILED_THIN1;
+       }
    }
 
r = r600_init_surface(rscreen, surface, templ, array_mode,
@@ -620,6 +623,8 @@ struct pipe_resource *si_texture_from_handle(struct 
pipe_screen *screen,
if (r) {
return NULL;
}
+   /* always set the scanout flags */
+   surface.flags |= RADEON_SURF_SCANOUT;
return (struct pipe_resource *)r600_texture_create_object(screen, 
templ, array_mode,
  stride, 0, 
buf, FALSE, surface);
 }
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index ca9e8b4..61ede64 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1541,67 +1541,16 @@ boolean si_is_format_supported(struct pipe_screen 
*screen,
return retval == usage;
 }
 
-static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned level)
-{
-   if (util_format_is_depth_or_stencil(rtex->real_format)) {
-       if (rtex->surface.level[level].mode == RADEON_SURF_MODE_1D) {
-           return 4;
-       } else if (rtex->surface.level[level].mode == RADEON_SURF_MODE_2D) {
-           switch (rtex->real_format) {
-           case PIPE_FORMAT_Z16_UNORM:
-               return 5;
-           case PIPE_FORMAT_S8_UINT_Z24_UNORM:
-           case PIPE_FORMAT_X8Z24_UNORM:
-           case PIPE_FORMAT_Z24X8_UNORM:
-           case PIPE_FORMAT_Z24_UNORM_S8_UINT:
-           case PIPE_FORMAT_Z32_FLOAT:
-           case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
-               return 6;
-           default:
-               return 7;
-           }
-       }
-   }
+static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned level, bool stencil)
+{
+   unsigned tile_mode_index = 0;
 
-   switch (rtex->surface.level[level].mode) {
-   default:
-       assert(!"Invalid surface mode");
-       /* Fall through */
-   case RADEON_SURF_MODE_LINEAR_ALIGNED:
-       return 8;
-   case RADEON_SURF_MODE_1D:
-       if (rtex->surface.flags & RADEON_SURF_SCANOUT)
-           return 9;
-       else
-           return 13;
-   case RADEON_SURF_MODE_2D:
-       if (rtex->surface.flags & RADEON_SURF_SCANOUT) {
-           switch (util_format_get_blocksize(rtex->real_format)) {
-           case 1:
-               return 10;
-           case 2:
-               return 11;
-           default:
-               assert(!"Invalid block size");
-               /* Fall through */
-           case 4:
-               return 12;
-           }
-       } else {
-           switch (util_format_get_blocksize(rtex->real_format)) {
- 

[Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture v2

2013-04-08 Thread j . glisse
From: Jerome Glisse jgli...@redhat.com

Most tests pass; the remaining issues are with border color and swizzle.

Based on ircnickmaelcum patch.

v2: Restaged commit hunk

Signed-off-by: Jerome Glisse jgli...@redhat.com
---
 src/gallium/drivers/radeonsi/si_state.c | 71 -
 src/gallium/drivers/radeonsi/sid.h  |  7 
 2 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 61ede64..a39843c 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -30,6 +30,7 @@
 #include "util/u_helpers.h"
 #include "util/u_math.h"
 #include "util/u_pack_color.h"
+#include "util/u_format_s3tc.h"
 #include "tgsi/tgsi_parse.h"
 #include "radeonsi_pipe.h"
 #include "radeonsi_shader.h"
@@ -1164,6 +1165,8 @@ static uint32_t si_translate_texformat(struct pipe_screen 
*screen,
   const struct util_format_description 
*desc,
   int first_non_void)
 {
+   struct r600_screen *rscreen = (struct r600_screen*)screen;
+   bool enable_s3tc = rscreen->info.drm_minor >= 31;
    boolean uniform = TRUE;
    int i;
 
@@ -1205,7 +1208,51 @@ static uint32_t si_translate_texformat(struct 
pipe_screen *screen,
break;
}
 
-   /* TODO compressed formats */
+   if (desc->layout == UTIL_FORMAT_LAYOUT_RGTC) {
+       if (!enable_s3tc)
+           goto out_unknown;
+
+       switch (format) {
+       case PIPE_FORMAT_RGTC1_SNORM:
+       case PIPE_FORMAT_LATC1_SNORM:
+       case PIPE_FORMAT_RGTC1_UNORM:
+       case PIPE_FORMAT_LATC1_UNORM:
+           return V_008F14_IMG_DATA_FORMAT_BC4;
+       case PIPE_FORMAT_RGTC2_SNORM:
+       case PIPE_FORMAT_LATC2_SNORM:
+       case PIPE_FORMAT_RGTC2_UNORM:
+       case PIPE_FORMAT_LATC2_UNORM:
+           return V_008F14_IMG_DATA_FORMAT_BC5;
+       default:
+           goto out_unknown;
+       }
+   }
+
+   if (desc->layout == UTIL_FORMAT_LAYOUT_S3TC) {
+
+       if (!enable_s3tc)
+           goto out_unknown;
+
+       if (!util_format_s3tc_enabled) {
+           goto out_unknown;
+       }
+
+       switch (format) {
+       case PIPE_FORMAT_DXT1_RGB:
+       case PIPE_FORMAT_DXT1_RGBA:
+       case PIPE_FORMAT_DXT1_SRGB:
+       case PIPE_FORMAT_DXT1_SRGBA:
+           return V_008F14_IMG_DATA_FORMAT_BC1;
+       case PIPE_FORMAT_DXT3_RGBA:
+       case PIPE_FORMAT_DXT3_SRGBA:
+           return V_008F14_IMG_DATA_FORMAT_BC2;
+       case PIPE_FORMAT_DXT5_RGBA:
+       case PIPE_FORMAT_DXT5_SRGBA:
+           return V_008F14_IMG_DATA_FORMAT_BC3;
+       default:
+           goto out_unknown;
+       }
+   }
 
if (format == PIPE_FORMAT_R9G9B9E5_FLOAT) {
return V_008F14_IMG_DATA_FORMAT_5_9_9_9;
@@ -2109,7 +2156,27 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
        break;
    default:
        if (first_non_void < 0) {
-           num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
+           if (util_format_is_compressed(pipe_format)) {
+               switch (pipe_format) {
+               case PIPE_FORMAT_DXT1_SRGB:
+               case PIPE_FORMAT_DXT1_SRGBA:
+               case PIPE_FORMAT_DXT3_SRGBA:
+               case PIPE_FORMAT_DXT5_SRGBA:
+                   num_format = V_008F14_IMG_NUM_FORMAT_SRGB;
+                   break;
+               case PIPE_FORMAT_RGTC1_SNORM:
+               case PIPE_FORMAT_LATC1_SNORM:
+               case PIPE_FORMAT_RGTC2_SNORM:
+               case PIPE_FORMAT_LATC2_SNORM:
+                   num_format = V_008F14_IMG_NUM_FORMAT_SNORM;
+                   break;
+               default:
+                   num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
+                   break;
+               }
+           } else {
+               num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
+           }
        } else if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB) {
            num_format = V_008F14_IMG_NUM_FORMAT_SRGB;
        } else {
diff --git a/src/gallium/drivers/radeonsi/sid.h 
b/src/gallium/drivers/radeonsi/sid.h
index 8528981..2722c79 100644
--- a/src/gallium/drivers/radeonsi/sid.h
+++ 

Re: [Mesa-dev] [PATCH 3/4] st/mesa: add support for ARB_texture_multisample

2013-04-08 Thread Ian Romanick

On 04/06/2013 03:05 AM, Dave Airlie wrote:

From: Dave Airlie airl...@redhat.com

This adds support to the mesa state tracker for ARB_texture_multisample.

Hardware doesn't seem to use different texture instructions, so
I don't think we need to create one for TGSI at this time.

Thanks to Marek for fixes to sample number picking.

Reviewed-by: Marek Olšák mar...@gmail.com
Signed-off-by: Dave Airlie airl...@redhat.com
---
  src/mesa/state_tracker/st_atom_framebuffer.c |  1 +
  src/mesa/state_tracker/st_atom_msaa.c|  2 ++
  src/mesa/state_tracker/st_cb_bitmap.c|  4 +--
  src/mesa/state_tracker/st_cb_drawpixels.c|  2 +-
  src/mesa/state_tracker/st_cb_fbo.c   |  2 +-
  src/mesa/state_tracker/st_cb_texture.c   | 41 
  src/mesa/state_tracker/st_extensions.c   |  6 +++-
  src/mesa/state_tracker/st_gen_mipmap.c   |  1 +
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 17 ++--
  src/mesa/state_tracker/st_mesa_to_tgsi.c |  2 ++
  src/mesa/state_tracker/st_texture.c  |  8 +-
  src/mesa/state_tracker/st_texture.h  |  1 +
  12 files changed, 72 insertions(+), 15 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c 
b/src/mesa/state_tracker/st_atom_framebuffer.c
index 3df8691..c752640 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -59,6 +59,7 @@ update_renderbuffer_surface(struct st_context *st,
    enum pipe_format format = st->ctx->Color.sRGBEnabled ? resource->format : util_format_linear(resource->format);

    if (!strb->surface ||
+       strb->surface->texture->nr_samples != strb->Base.NumSamples ||
        strb->surface->format != format ||
        strb->surface->texture != resource ||
        strb->surface->width != rtt_width ||
diff --git a/src/mesa/state_tracker/st_atom_msaa.c 
b/src/mesa/state_tracker/st_atom_msaa.c
index 9baa4fc..b749a17 100644
--- a/src/mesa/state_tracker/st_atom_msaa.c
+++ b/src/mesa/state_tracker/st_atom_msaa.c
@@ -63,6 +63,8 @@ static void update_sample_mask( struct st_context *st )
  sample_mask = ~sample_mask;
}
/* TODO merge with app-supplied sample mask */
+      if (st->ctx->Multisample.SampleMask)
+         sample_mask = st->ctx->Multisample.SampleMaskValue;
 }

 /* mask off unused bits or don't care? */
diff --git a/src/mesa/state_tracker/st_cb_bitmap.c 
b/src/mesa/state_tracker/st_cb_bitmap.c
index 0513814..ee66ab3 100644
--- a/src/mesa/state_tracker/st_cb_bitmap.c
+++ b/src/mesa/state_tracker/st_cb_bitmap.c
@@ -299,7 +299,7 @@ make_bitmap_texture(struct gl_context *ctx, GLsizei width, 
GLsizei height,
  * Create texture to hold bitmap pattern.
  */
    pt = st_texture_create(st, st->internal_target, st->bitmap.tex_format,
-                          0, width, height, 1, 1,
+                          0, width, height, 1, 1, 0,
                           PIPE_BIND_SAMPLER_VIEW);
 if (!pt) {
_mesa_unmap_pbo_source(ctx, unpack);
@@ -568,7 +568,7 @@ reset_cache(struct st_context *st)
    cache->texture = st_texture_create(st, PIPE_TEXTURE_2D,
                                       st->bitmap.tex_format, 0,
                                       BITMAP_CACHE_WIDTH, BITMAP_CACHE_HEIGHT,
-                                      1, 1,
+                                      1, 1, 0,
                                       PIPE_BIND_SAMPLER_VIEW);
  }

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index b25b776..db2f03a 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -466,7 +466,7 @@ alloc_texture(struct st_context *st, GLsizei width, GLsizei 
height,
 struct pipe_resource *pt;

    pt = st_texture_create(st, st->internal_target, texFormat, 0,
-                          width, height, 1, 1, PIPE_BIND_SAMPLER_VIEW);
+                          width, height, 1, 1, 0, PIPE_BIND_SAMPLER_VIEW);

 return pt;
  }
diff --git a/src/mesa/state_tracker/st_cb_fbo.c 
b/src/mesa/state_tracker/st_cb_fbo.c
index 4452e52..127b123 100644
--- a/src/mesa/state_tracker/st_cb_fbo.c
+++ b/src/mesa/state_tracker/st_cb_fbo.c
@@ -433,7 +433,7 @@ st_render_texture(struct gl_context *ctx,
    strb->rtt_level = att->TextureLevel;
    strb->rtt_face = att->CubeMapFace;
    strb->rtt_slice = att->Zoffset;
-
+   rb->NumSamples = texImage->NumSamples;
    rb->Width = texImage->Width2;
    rb->Height = texImage->Height2;
    rb->_BaseFormat = texImage->_BaseFormat;
diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index 0cd0d77..25ee352 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -78,6 +78,8 @@ gl_target_to_pipe(GLenum target)
 case GL_TEXTURE_2D:
 case GL_PROXY_TEXTURE_2D:
 case GL_TEXTURE_EXTERNAL_OES:
+   case GL_TEXTURE_2D_MULTISAMPLE:
+   case GL_PROXY_TEXTURE_2D_MULTISAMPLE:

[Mesa-dev] [Bug 56542] [bisected] Piglit gl_select tests crash on exit

2013-04-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=56542

--- Comment #3 from Jerome Glisse gli...@freedesktop.org ---
Dunno if it's a known freeglut bug. They could argue it's not a bug, but really
using atexit to do Xorg cleanup is bad.



Re: [Mesa-dev] [PATCH 3/3] glsl/linker: Adapt flat varying handling in preparation for geometry shaders.

2013-04-08 Thread Ian Romanick

On 04/06/2013 07:49 PM, Paul Berry wrote:

When a varying is consumed by transform feedback, but is not used by
the fragment shader, assign_varying_locations() sets its interpolation
type to flat in order to ensure that lower_packed_varyings never has
to deal with non-flat integral varyings (the GLSL spec doesn't require
integral vertex outputs to be flat if they aren't consumed by the
fragment shader).

A similar situation will arise when geometry shader support is added,
since the GLSL spec only requires integral vertex shader outputs to be
flat when they are consumed by the geometry shader.  This patch

 
fragment?


modifies the linker to handle this situation too.
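
The rule described above, reduced to a standalone sketch. The types and the function name are hypothetical stand-ins for ir_variable and varying_matches::record(), not the linker code itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the interpolation qualifier and ir_variable. */
enum interp { INTERP_SMOOTH, INTERP_FLAT };

struct varying {
   enum interp interpolation;
   bool centroid;
};

/* If the varying is not consumed by a fragment shader, its interpolation
 * mode cannot affect rendering, so force both ends to flat. That way a
 * later packing pass never sees a non-flat integer varying. */
static void force_flat_if_not_fs(struct varying *producer,
                                 struct varying *consumer,
                                 bool consumer_is_fs)
{
   assert(producer != NULL);

   if (consumer == NULL || !consumer_is_fs) {
      producer->centroid = false;
      producer->interpolation = INTERP_FLAT;

      /* A non-FS consumer (e.g. a geometry shader input) must match. */
      if (consumer != NULL) {
         consumer->centroid = false;
         consumer->interpolation = INTERP_FLAT;
      }
   }
}
```

A VS output that feeds only transform feedback (no consumer) or a GS input gets flattened; a VS output consumed by the fragment shader keeps whatever qualifiers it had.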
---
  src/glsl/link_varyings.cpp | 30 --
  1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index 431d8fd..7e90beb 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -541,7 +541,7 @@ store_tfeedback_info(struct gl_context *ctx, struct 
gl_shader_program *prog,
  class varying_matches
  {
  public:
-   varying_matches(bool disable_varying_packing);
+   varying_matches(bool disable_varying_packing, bool consumer_is_fs);
 ~varying_matches();
 void record(ir_variable *producer_var, ir_variable *consumer_var);
 unsigned assign_locations();
@@ -621,11 +621,15 @@ private:
  * it was allocated.
  */
 unsigned matches_capacity;
+
+   const bool consumer_is_fs;
  };


-varying_matches::varying_matches(bool disable_varying_packing)
-   : disable_varying_packing(disable_varying_packing)
+varying_matches::varying_matches(bool disable_varying_packing,
+ bool consumer_is_fs)
+   : disable_varying_packing(disable_varying_packing),
+ consumer_is_fs(consumer_is_fs)
  {
 /* Note: this initial capacity is rather arbitrarily chosen to be large
  * enough for many cases without wasting an unreasonable amount of space.
@@ -672,12 +676,12 @@ varying_matches::record(ir_variable *producer_var, 
ir_variable *consumer_var)
return;
 }

-   if (consumer_var == NULL) {
-  /* Since there is no consumer_var, the interpolation type of this
-   * varying cannot possibly affect rendering.  Also, since the GL spec
-   * only requires integer varyings to be flat when they are fragment
-   * shader inputs, it is possible that this variable is non-flat and is
-   * (or contains) an integer.
+   if (consumer_var == NULL || !consumer_is_fs) {
+  /* Since this varying is not being consumed by the fragment shader, its
+   * interpolation type varying cannot possibly affect rendering.  Also,
+   * since the GL spec only requires integer varyings to be flat when
+   * they are fragment shader inputs, it is possible that this variable is
+   * non-flat and is (or contains) an integer.
 *
 * lower_packed_varyings requires all integer varyings to flat,
 * regardless of where they appear.  We can trivially satisfy that
@@ -685,6 +689,11 @@ varying_matches::record(ir_variable *producer_var, 
ir_variable *consumer_var)
 */
       producer_var->centroid = false;
       producer_var->interpolation = INTERP_QUALIFIER_FLAT;
+
+      if (consumer_var) {
+         consumer_var->centroid = false;
+         consumer_var->interpolation = INTERP_QUALIFIER_FLAT;
+      }
    }

    if (this->num_matches == this->matches_capacity) {
@@ -979,7 +988,8 @@ assign_varying_locations(struct gl_context *ctx,
  {
 const unsigned producer_base = VARYING_SLOT_VAR0;
 const unsigned consumer_base = VARYING_SLOT_VAR0;
-   varying_matches matches(ctx->Const.DisableVaryingPacking);
+   varying_matches matches(ctx->Const.DisableVaryingPacking,
+                           consumer && consumer->Type == GL_FRAGMENT_SHADER);
 hash_table *tfeedback_candidates
= hash_table_ctor(0, hash_table_string_hash, hash_table_string_compare);
 hash_table *consumer_inputs





Re: [Mesa-dev] [PATCH] i965: Use software primitive restart when transform feedback active.

2013-04-08 Thread Paul Berry
On 8 April 2013 10:40, Ian Romanick i...@freedesktop.org wrote:

 On 04/06/2013 08:25 PM, Paul Berry wrote:

 When transform feedback is active, the driver manually counts the
 number of primitives that run through the pipeline, so that if a batch
 buffer flush happens, the next batch buffer can pick up transform
 feedback where the last batch buffer left off.  Hardware-accelerated
 primitive restart interferes with this process (because it makes the
 primitive count depend not just on the number of vertices entering the
 pipeline, but also on the contents of the index buffer).  So, when
 transform feedback is active, we need to fall back to the software
 implementation of primitive restart.

 Fixes piglit test spec/!OpenGL 3.1/primitive-restart-xfb flush.

 NOTE: This is a candidate for stable release branches.


 Oof.  This shouldn't be a performance hit on too many applications,
 thankfully.  Do we know when we're going to get real hardware counting
 support? :(


We just had a discussion about that this morning.  There's no hardware
limitation, just kernel limitations.  As far as this bug is concerned, all
we need is hardware context support (which we have today).  I believe Eric
is working on this.

As for the GL_PRIMITIVES_GENERATED and
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries, I believe we need some
kernel changes to allow us to read the hardware counters.  I believe Eric
is pinging some of the kernel folks on IRC to request that.

All of this stuff needs to get sorted out before we can implement geometry
shaders, so I'm highly motivated to keep an eye on it and make sure it gets
settled soon :)



 Reviewed-by: Ian Romanick ian.d.roman...@intel.com


  ---
   src/mesa/drivers/dri/i965/brw_primitive_restart.c | 10 +-
   1 file changed, 9 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 index e6902b4..d0f0038 100644
 --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 @@ -27,6 +27,7 @@

   #include "main/imports.h"
   #include "main/bufferobj.h"
 +#include "main/transformfeedback.h"

   #include "brw_context.h"
   #include "brw_defines.h"
 @@ -81,11 +82,18 @@ can_cut_index_handle_prims(struct gl_context *ctx,
     struct brw_context *brw = brw_context(ctx);

     if (brw->sol.counting_primitives_generated ||
 -       brw->sol.counting_primitives_written) {
 +       brw->sol.counting_primitives_written ||
 +       _mesa_is_xfb_active_and_unpaused(ctx)) {
        /* Counting primitives generated in hardware is not currently
         * supported, so take the software path. We need to investigate
         * the *_PRIMITIVES_COUNT registers to allow this to be handled
         * entirely in hardware.
 +       *
 +       * Note that when transform feedback is active, we also count primitives
 +       * (even if the client hasn't requested it), since that is the only way
 +       * we can start at the proper place in the transform feedback buffer
 +       * after a flush.  So we also have to fall back to software when
 +       * transform feedback is active and unpaused.
         */
        return false;
     }





Re: [Mesa-dev] [PATCH 3/3] glsl/linker: Adapt flat varying handling in preparation for geometry shaders.

2013-04-08 Thread Paul Berry
On 8 April 2013 10:57, Ian Romanick i...@freedesktop.org wrote:

 On 04/06/2013 07:49 PM, Paul Berry wrote:

 When a varying is consumed by transform feedback, but is not used by
 the fragment shader, assign_varying_locations() sets its interpolation
 type to flat in order to ensure that lower_packed_varyings never has
 to deal with non-flat integral varyings (the GLSL spec doesn't require
 integral vertex outputs to be flat if they aren't consumed by the
 fragment shader).

 A similar situation will arise when geometry shader support is added,
 since the GLSL spec only requires integral vertex shader outputs to be
 flat when they are consumed by the geometry shader.  This patch

  
 fragment?


Oops, yes.  Thanks.




  modifies the linker to handle this situation too.
 ---
   src/glsl/link_varyings.cpp | 30 --
   1 file changed, 20 insertions(+), 10 deletions(-)

 diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
 index 431d8fd..7e90beb 100644
 --- a/src/glsl/link_varyings.cpp
 +++ b/src/glsl/link_varyings.cpp
 @@ -541,7 +541,7 @@ store_tfeedback_info(struct gl_context *ctx, struct
 gl_shader_program *prog,
   class varying_matches
   {
   public:
 -   varying_matches(bool disable_varying_packing);
 +   varying_matches(bool disable_varying_packing, bool consumer_is_fs);
  ~varying_matches();
  void record(ir_variable *producer_var, ir_variable *consumer_var);
  unsigned assign_locations();
 @@ -621,11 +621,15 @@ private:
   * it was allocated.
   */
  unsigned matches_capacity;
 +
 +   const bool consumer_is_fs;
   };


 -varying_matches::varying_matches(bool disable_varying_packing)
 -   : disable_varying_packing(disable_varying_packing)
 +varying_matches::varying_matches(bool disable_varying_packing,
 + bool consumer_is_fs)
 +   : disable_varying_packing(disable_varying_packing),
 + consumer_is_fs(consumer_is_fs)
   {
  /* Note: this initial capacity is rather arbitrarily chosen to be
 large
   * enough for many cases without wasting an unreasonable amount of
 space.
 @@ -672,12 +676,12 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
 return;
  }

 -   if (consumer_var == NULL) {
 -  /* Since there is no consumer_var, the interpolation type of this
 -   * varying cannot possibly affect rendering.  Also, since the GL
 spec
 -   * only requires integer varyings to be flat when they are
 fragment
 -   * shader inputs, it is possible that this variable is non-flat
 and is
 -   * (or contains) an integer.
 +   if (consumer_var == NULL || !consumer_is_fs) {
 +  /* Since this varying is not being consumed by the fragment
 shader, its
 +   * interpolation type varying cannot possibly affect rendering.
  Also,
 +   * since the GL spec only requires integer varyings to be flat
 when
 +   * they are fragment shader inputs, it is possible that this
 variable is
 +   * non-flat and is (or contains) an integer.
  *
  * lower_packed_varyings requires all integer varyings to flat,
  * regardless of where they appear.  We can trivially satisfy that
 @@ -685,6 +689,11 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
  */
 producer_var->centroid = false;
 producer_var->interpolation = INTERP_QUALIFIER_FLAT;
 +
 +  if (consumer_var) {
 + consumer_var->centroid = false;
 + consumer_var->interpolation = INTERP_QUALIFIER_FLAT;
 +  }
  }

  if (this->num_matches == this->matches_capacity) {
 @@ -979,7 +988,8 @@ assign_varying_locations(struct gl_context *ctx,
   {
  const unsigned producer_base = VARYING_SLOT_VAR0;
  const unsigned consumer_base = VARYING_SLOT_VAR0;
 -   varying_matches matches(ctx->Const.DisableVaryingPacking);
 +   varying_matches matches(ctx->Const.DisableVaryingPacking,
 +   consumer && consumer->Type == GL_FRAGMENT_SHADER);
  hash_table *tfeedback_candidates
 = hash_table_ctor(0, hash_table_string_hash,
 hash_table_string_compare);
  hash_table *consumer_inputs
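
For anyone skimming the thread, the rule that record() applies after this patch can be sketched as a small standalone C program fragment. This is only an illustration: the struct and enum names below are hypothetical stand-ins for ir_variable and the INTERP_QUALIFIER_* values, and the real code is C++ inside the linker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the linker's interpolation qualifiers. */
enum sketch_interp {
   SKETCH_INTERP_NONE,
   SKETCH_INTERP_SMOOTH,
   SKETCH_INTERP_FLAT,
   SKETCH_INTERP_NOPERSPECTIVE
};

/* Hypothetical stand-in for the two ir_variable fields the patch touches. */
struct sketch_var {
   enum sketch_interp interpolation;
   bool centroid;
};

/*
 * Mirrors the condition in the patch: when the varying is not consumed by a
 * fragment shader, its interpolation cannot affect rendering, so both ends
 * are forced to flat to keep the packing pass simple.
 */
static void
sketch_record(struct sketch_var *producer, struct sketch_var *consumer,
              bool consumer_is_fs)
{
   assert(producer != NULL);

   if (consumer == NULL || !consumer_is_fs) {
      producer->centroid = false;
      producer->interpolation = SKETCH_INTERP_FLAT;

      /* A non-fragment consumer (for example a geometry shader) must agree
       * with the producer, so it is rewritten as well.
       */
      if (consumer != NULL) {
         consumer->centroid = false;
         consumer->interpolation = SKETCH_INTERP_FLAT;
      }
   }
}
```

With a fragment shader consumer both variables are left untouched, which is the pre-patch behaviour for the VS to FS case.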





Re: [Mesa-dev] [PATCH] i965: Use software primitive restart when transform feedback active.

2013-04-08 Thread Kenneth Graunke

On 04/06/2013 08:25 PM, Paul Berry wrote:

When transform feedback is active, the driver manually counts the
number of primitives that run through the pipeline, so that if a batch
buffer flush happens, the next batch buffer can pick up transform
feedback where the last batch buffer left off.  Hardware-accelerated
primitive restart interferes with this process (because it makes the
primitive count depend not just on the number of vertices entering the
pipeline, but also on the contents of the index buffer).  So, when
transform feedback is active, we need to fall back to the software
implementation of primitive restart.

Fixes piglit test spec/!OpenGL 3.1/primitive-restart-xfb flush.

NOTE: This is a candidate for stable release branches.
---
  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c 
b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
index e6902b4..d0f0038 100644
--- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
+++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
@@ -27,6 +27,7 @@

  #include "main/imports.h"
  #include "main/bufferobj.h"
+#include "main/transformfeedback.h"

  #include "brw_context.h"
  #include "brw_defines.h"
@@ -81,11 +82,18 @@ can_cut_index_handle_prims(struct gl_context *ctx,
 struct brw_context *brw = brw_context(ctx);

 if (brw->sol.counting_primitives_generated ||
-   brw->sol.counting_primitives_written) {
+   brw->sol.counting_primitives_written ||
+   _mesa_is_xfb_active_and_unpaused(ctx)) {
/* Counting primitives generated in hardware is not currently
 * supported, so take the software path. We need to investigate
 * the *_PRIMITIVES_COUNT registers to allow this to be handled
 * entirely in hardware.
+   *
+   * Note that when transform feedback is active, we also count primitives
+   * (even if the client hasn't requested it), since that is the only way
+   * we can start at the proper place in the transform feedback buffer
+   * after a flush.  So we also have to fall back to software when
+   * transform feedback is active and unpaused.
 */
return false;
 }


Gah.  This is unfortunate, but necessary.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org


Re: [Mesa-dev] [PATCH] i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.

2013-04-08 Thread Paul Berry
On 8 April 2013 10:37, Ian Romanick i...@freedesktop.org wrote:

 On 04/07/2013 06:42 AM, Paul Berry wrote:

 The call to emit_shader_time_end() before the second URB write was
 conditioned with if (eot), but eot is always false in this code
 path, so emit_shader_time_end() was never being called for vertex
 shaders that performed 2 URB writes.


 I had to look at that code for way too long to convince myself that your
 patch was correct.  I think it might be better to remove both the
 conditional emit_shader_time_end calls and put this block of code at the
 very bottom (unless emit_shader_time_end has some side effect that I don't
 see):

if (inst->eot) {
   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
  emit_shader_time_end();
}

 Or does the last URB write have to be the last instruction?


The last URB write has to be the last instruction, since it's actually the
URB write that ends the thread (eot stands for end of thread).

For GL 3.2 we're going to need to refactor this function to use a loop,
since GL 3.2 doubles the number of varying components permitted for VS->GS
linkage (so we'll need up to 4 URB writes instead of 2).  I think once that
change is made the function is going to be a lot easier to follow.  Maybe I
should just do that refactor now?
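
A rough idea of what the loop-based version might look like, written as a standalone C sketch rather than as vec4_visitor code. Everything here is hypothetical (the record type, the function name, the slots_per_write parameter); the only points taken from the discussion above are that the final URB write carries eot and that the shader-time epilogue therefore has to be emitted just before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one emitted instruction, for illustration only. */
enum sketch_op { SKETCH_SHADER_TIME_END, SKETCH_URB_WRITE };

struct sketch_inst {
   enum sketch_op op;
   int first_slot;   /* first varying slot covered by a URB write */
   bool eot;         /* true only on the write that ends the thread */
};

/*
 * Emit enough URB writes to cover num_slots, at most slots_per_write each.
 * Returns the number of instructions stored in out[].
 */
static size_t
sketch_emit_urb_writes(int num_slots, int slots_per_write, bool shader_time,
                       struct sketch_inst *out, size_t out_capacity)
{
   size_t n = 0;
   int slot = 0;

   assert(num_slots > 0 && slots_per_write > 0);

   while (slot < num_slots) {
      bool last = slot + slots_per_write >= num_slots;

      /* The timing epilogue must come before the final write, because
       * nothing can execute after the end-of-thread message.
       */
      if (last && shader_time) {
         assert(n < out_capacity);
         out[n++] = (struct sketch_inst){ SKETCH_SHADER_TIME_END, 0, false };
      }

      assert(n < out_capacity);
      out[n++] = (struct sketch_inst){ SKETCH_URB_WRITE, slot, last };
      slot += slots_per_write;
   }

   return n;
}
```

Written this way there is a single place that decides which write is last, so the "eot is always false on this path" mistake cannot recur when the write count grows from 2 to 4.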


[Mesa-dev] [PATCH 1/3] intel: Refactor selection of miptree tiling

2013-04-08 Thread Kenneth Graunke
From: Chad Versace chad.vers...@linux.intel.com

This patch (1) extracts from intel_miptree_create() the spaghetti logic
that selects the tiling format, (2) rewrites that spaghetti into a lucid
form, and (3) moves it to a new function, intel_miptree_choose_tiling().
No behavioral change.

As a bonus, it is now evident that the force_y_tiling parameter to
intel_miptree_create() does not really force Y tiling.

Signed-off-by: Chad Versace chad.vers...@linux.intel.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 90 +++---
 1 file changed, 54 insertions(+), 36 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 66cadeb..402972a 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -297,6 +297,57 @@ intel_miptree_create_layout(struct intel_context *intel,
return mt;
 }
 
+/**
+ * \brief Helper function for intel_miptree_create().
+ */
+static uint32_t
+intel_miptree_choose_tiling(struct intel_context *intel,
+gl_format format,
+uint32_t width0,
+uint32_t num_samples,
+bool force_y_tiling)
+{
+
+   if (format == MESA_FORMAT_S8) {
+  /* The stencil buffer is W tiled. However, we request from the kernel a
+   * non-tiled buffer because the GTT is incapable of W fencing.
+   */
+  return I915_TILING_NONE;
+   }
+
+   if (!intel->use_texture_tiling || _mesa_is_format_compressed(format))
+  return I915_TILING_NONE;
+
+   if (force_y_tiling)
+  return I915_TILING_Y;
+
+   if (num_samples > 1) {
+  /* From p82 of the Sandy Bridge PRM, dw3[1] of SURFACE_STATE (Tiled
+   * Surface):
+   *
+   *   [DevSNB+]: For multi-sample render targets, this field must be
+   *   1. MSRTs can only be tiled.
+   *
+   * Our usual reason for preferring X tiling (fast blits using the
+   * blitting engine) doesn't apply to MSAA, since we'll generally be
+   * downsampling or upsampling when blitting between the MSAA buffer
+   * and another buffer, and the blitting engine doesn't support that.
+   * So use Y tiling, since it makes better use of the cache.
+   */
+  return I915_TILING_Y;
+   }
+
+   GLenum base_format = _mesa_get_format_base_format(format);
+   if (intel->gen >= 4 &&
+   (base_format == GL_DEPTH_COMPONENT ||
+base_format == GL_DEPTH_STENCIL_EXT))
+  return I915_TILING_Y;
+
+   if (width0 >= 64)
+  return I915_TILING_X;
+
+   return I915_TILING_NONE;
+}
 
 struct intel_mipmap_tree *
 intel_miptree_create(struct intel_context *intel,
@@ -312,8 +363,6 @@ intel_miptree_create(struct intel_context *intel,
  bool force_y_tiling)
 {
struct intel_mipmap_tree *mt;
-   uint32_t tiling = I915_TILING_NONE;
-   GLenum base_format;
gl_format tex_format = format;
gl_format etc_format = MESA_FORMAT_NONE;
GLuint total_width, total_height;
@@ -352,35 +401,6 @@ intel_miptree_create(struct intel_context *intel,
}
 
etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
-   base_format = _mesa_get_format_base_format(format);
-
-   if (num_samples > 1) {
-  /* From p82 of the Sandy Bridge PRM, dw3[1] of SURFACE_STATE (Tiled
-   * Surface):
-   *
-   *   [DevSNB+]: For multi-sample render targets, this field must be
-   *   1. MSRTs can only be tiled.
-   *
-   * Our usual reason for preferring X tiling (fast blits using the
-   * blitting engine) doesn't apply to MSAA, since we'll generally be
-   * downsampling or upsampling when blitting between the MSAA buffer
-   * and another buffer, and the blitting engine doesn't support that.
-   * So use Y tiling, since it makes better use of the cache.
-   */
-  force_y_tiling = true;
-   }
-
-   if (intel->use_texture_tiling && !_mesa_is_format_compressed(format)) {
-  if (intel->gen >= 4 &&
- (base_format == GL_DEPTH_COMPONENT ||
-  base_format == GL_DEPTH_STENCIL_EXT))
-tiling = I915_TILING_Y;
-  else if (force_y_tiling) {
- tiling = I915_TILING_Y;
-  } else if (width0 >= 64)
-tiling = I915_TILING_X;
-   }
-
mt = intel_miptree_create_layout(intel, target, format,
  first_level, last_level, width0,
  height0, depth0,
@@ -397,15 +417,13 @@ intel_miptree_create(struct intel_context *intel,
 total_height = mt->total_height;
 
if (format == MESA_FORMAT_S8) {
-  /* The stencil buffer is W tiled. However, we request from the kernel a
-   * non-tiled buffer because the GTT is incapable of W fencing.  So round
-   * up the width and height to match the size of W tiles (64x64).
-   */
-  tiling = I915_TILING_NONE;
+  /* Align to size 

[Mesa-dev] [PATCH 2/3] i965: Use tiling even for compressed textures.

2013-04-08 Thread Kenneth Graunke
The code has no rationale for why we would force compressed textures to
be untiled, and it appears to work fine.  Git archeology indicates that
it's been that way dating back to when we first started tiling.

Improves performance in GLB27_TRex_C24Z16_FixedTimeStep at 1280x720 by
10.0529% +/- 0.573075% (n=12).  Improves performance in Xonotic by
4.56409% +/- 0.27965% (n=3).

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 402972a..8dd04be 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -315,7 +315,7 @@ intel_miptree_choose_tiling(struct intel_context *intel,
   return I915_TILING_NONE;
}
 
-   if (!intel->use_texture_tiling || _mesa_is_format_compressed(format))
+   if (!intel->use_texture_tiling)
   return I915_TILING_NONE;
 
if (force_y_tiling)
-- 
1.8.1.1



[Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.

2013-04-08 Thread Kenneth Graunke
In the past, we preferred X-tiling for color buffers because our BLT
code couldn't handle Y-tiling.  However, the BLT paths have been largely
replaced by BLORP on Gen6+, which can handle any kind of tiling.

We hadn't measured any performance improvement in the past, but that's
probably because compressed textures were all uncompressed anyway.

Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 8dd04be..6a9f08c 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -344,7 +344,7 @@ intel_miptree_choose_tiling(struct intel_context *intel,
   return I915_TILING_Y;
 
 if (width0 >= 64)
-  return I915_TILING_X;
+  return intel->gen >= 6 ? I915_TILING_Y : I915_TILING_X;
 
return I915_TILING_NONE;
 }
-- 
1.8.1.1
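
As a reading aid, the order of checks in intel_miptree_choose_tiling() once all three patches in this series are applied can be sketched as the standalone C function below. It is not driver code: the enum, the struct, and the boolean flags are hypothetical replacements for the I915_TILING_* constants and the format queries, chosen so the fragment compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for I915_TILING_NONE / _X / _Y. */
enum sketch_tiling { SKETCH_TILING_NONE, SKETCH_TILING_X, SKETCH_TILING_Y };

/* Hypothetical summary of the properties the real function derives from
 * the gl_format and the miptree parameters.
 */
struct sketch_surface {
   bool is_s8_stencil;   /* format == MESA_FORMAT_S8 in the patch */
   bool is_depth;        /* depth or depth/stencil base format */
   uint32_t width0;
   uint32_t num_samples;
};

static enum sketch_tiling
sketch_choose_tiling(int gen, bool use_texture_tiling, bool force_y_tiling,
                     const struct sketch_surface *s)
{
   assert(s != NULL);

   /* Stencil is W tiled by the hardware but allocated untiled. */
   if (s->is_s8_stencil)
      return SKETCH_TILING_NONE;

   /* After patch 2/3 compressed formats no longer opt out here. */
   if (!use_texture_tiling)
      return SKETCH_TILING_NONE;

   if (force_y_tiling)
      return SKETCH_TILING_Y;

   /* Multisampled render targets must be tiled; Y is preferred. */
   if (s->num_samples > 1)
      return SKETCH_TILING_Y;

   if (gen >= 4 && s->is_depth)
      return SKETCH_TILING_Y;

   /* After patch 3/3 wide color surfaces prefer Y on Gen6 and later. */
   if (s->width0 >= 64)
      return gen >= 6 ? SKETCH_TILING_Y : SKETCH_TILING_X;

   return SKETCH_TILING_NONE;
}
```

The first two early returns are why force_y_tiling does not strictly force Y tiling, as noted in the commit message of patch 1/3.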



[Mesa-dev] [PATCH] i965: Skip resetting SOL offsets at batch start when contexts are present.

2013-04-08 Thread Eric Anholt
We won't be able to compute them in software with the advent of geometry
shaders.

Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
NOTE: This is a candidate for the 9.1 branch.
---
 src/mesa/drivers/dri/i965/gen6_sol.c   |9 +
 src/mesa/drivers/dri/i965/gen7_sol_state.c |   18 --
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index 9c09ade..a7b63f6 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -159,6 +159,7 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
 struct gl_transform_feedback_object *obj)
 {
 struct brw_context *brw = brw_context(ctx);
+   struct intel_context *intel = &brw->intel;
 const struct gl_shader_program *vs_prog =
   ctx->Shader.CurrentVertexProgram;
 const struct gl_transform_feedback_info *linked_xfb_info =
@@ -180,6 +181,14 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
 brw->sol.svbi_0_starting_index = 0;
 brw->sol.svbi_0_max_index = max_index;
 brw->sol.offset_0_batch_start = 0;
+
+   if (intel->gen >= 7) {
+  /* Ask the kernel to reset the SO offsets for any previous transform
+   * feedback, so we start at the start of the user's buffer. (note: these
+   * are not the query counters)
+   */
+  intel->batch.needs_sol_reset = true;
+   }
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index c83b2df..03709ea 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -82,12 +82,14 @@ upload_3dstate_so_buffers(struct brw_context *brw)
   end = ALIGN(start + xfb_obj->Size[i], 4);
   assert(end <= bo->size);
 
-  /* Offset the starting offset by the current vertex index into the
-   * feedback buffer, offset register is always set to 0 at the start of the
-   * batchbuffer.
+  /* If we don't have hardware contexts, then we reset our offsets at the
+   * start of every batch, so we track the number of vertices written in
+   * software and increment our pointers by that many.
 */
-  start += brw->sol.offset_0_batch_start * stride;
-  assert(start <= end);
+  if (!intel->hw_ctx) {
+ start += brw->sol.offset_0_batch_start * stride;
+ assert(start <= end);
+  }
 
   BEGIN_BATCH(4);
   OUT_BATCH(_3DSTATE_SO_BUFFER << 16 | (4 - 2));
@@ -244,7 +246,11 @@ upload_sol_state(struct brw_context *brw)
   /* BRW_NEW_VUE_MAP_GEOM_OUT */
   upload_3dstate_so_decl_list(brw, &brw->vue_map_geom_out);
 
-  intel->batch.needs_sol_reset = true;
+  /* If we don't have hardware contexts, then some other client may have
+   * changed the SO write offsets, and we need to rewrite them.
+   */
+  if (!intel->hw_ctx)
+ intel->batch.needs_sol_reset = true;
 }
 
 /* Finally, set up the SOL stage.  This command must always follow updates to
-- 
1.7.10.4
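
The buffer start computation that this patch makes conditional can be summarised with a tiny standalone C sketch. The function and parameter names are hypothetical; verts_written plays the role of brw->sol.offset_0_batch_start, and has_hw_context the role of intel->hw_ctx being non-NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns the byte offset at which the stream-output buffer should start
 * for the current batch.
 *
 * Without a hardware context the SO write offsets are reset at the start
 * of every batch, so the vertices already written are skipped in software.
 * With a hardware context the offsets survive across batches and the
 * start of the user's buffer is used unchanged.
 */
static uint32_t
sketch_so_buffer_start(uint32_t buffer_offset, uint32_t verts_written,
                       uint32_t stride, bool has_hw_context)
{
   uint32_t start = buffer_offset;

   assert(stride > 0);

   if (!has_hw_context)
      start += verts_written * stride;

   return start;
}
```

For example, with a 16-byte stride and 10 vertices already written, a buffer bound at offset 64 starts at 224 without a hardware context and stays at 64 with one.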



Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.

2013-04-08 Thread Matt Turner
On Mon, Apr 8, 2013 at 7:27 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 In the past, we preferred X-tiling for color buffers because our BLT
 code couldn't handle Y-tiling.  However, the BLT paths have been largely
 replaced by BLORP on Gen6+, which can handle any kind of tiling.

 We hadn't measured any performance improvement in the past, but that's
 probably because compressed textures were all uncompressed anyway.

s/uncompressed/untiled/

Series is

Reviewed-by: Matt Turner matts...@gmail.com


[Mesa-dev] [PATCH] intel: Remove the texture_tiling driconf option.

2013-04-08 Thread Kenneth Graunke
This option can force textures to be untiled.  However, on Gen6+, depth
buffers must be Y-tiled.  MSAA buffers also must be Y-tiled.  So setting
this option on even a trivial application like glxgears causes assertion
failures in a debug build, and likely GPU hangs in a release build.

It's just giving users a license to shoot themselves in the foot.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_context.c | 2 --
 src/mesa/drivers/dri/intel/intel_context.h | 1 -
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 3 ---
 src/mesa/drivers/dri/intel/intel_screen.c  | 6 +-
 4 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_context.c 
b/src/mesa/drivers/dri/intel/intel_context.c
index bf4045e..990fbea 100644
--- a/src/mesa/drivers/dri/intel/intel_context.c
+++ b/src/mesa/drivers/dri/intel/intel_context.c
@@ -811,8 +811,6 @@ intelInitContext(struct intel_context *intel,
 
intel_fbo_init(intel);
 
-   intel->use_texture_tiling = driQueryOptionb(&intel->optionCache,
-  "texture_tiling");
 intel->use_early_z = driQueryOptionb(&intel->optionCache, "early_z");
 
 if (!driQueryOptionb(&intel->optionCache, "hiz")) {
diff --git a/src/mesa/drivers/dri/intel/intel_context.h 
b/src/mesa/drivers/dri/intel/intel_context.h
index b2ded49..22d29be 100644
--- a/src/mesa/drivers/dri/intel/intel_context.h
+++ b/src/mesa/drivers/dri/intel/intel_context.h
@@ -343,7 +343,6 @@ struct intel_context
 */
bool is_front_buffer_reading;
 
-   bool use_texture_tiling;
bool use_early_z;
 
int driFd;
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 6a9f08c..9aff109 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -315,9 +315,6 @@ intel_miptree_choose_tiling(struct intel_context *intel,
   return I915_TILING_NONE;
}
 
-   if (!intel->use_texture_tiling)
-  return I915_TILING_NONE;
-
if (force_y_tiling)
   return I915_TILING_Y;
 
diff --git a/src/mesa/drivers/dri/intel/intel_screen.c 
b/src/mesa/drivers/dri/intel/intel_screen.c
index 3ca10c8..ccd513e 100644
--- a/src/mesa/drivers/dri/intel/intel_screen.c
+++ b/src/mesa/drivers/dri/intel/intel_screen.c
@@ -55,10 +55,6 @@ PUBLIC const char __driConfigOptions[] =
 DRI_CONF_DESC_END
   DRI_CONF_OPT_END
 
-  DRI_CONF_OPT_BEGIN(texture_tiling, bool, true)
-DRI_CONF_DESC(en, "Enable texture tiling")
-  DRI_CONF_OPT_END
-
   DRI_CONF_OPT_BEGIN(hiz, bool, true)
 DRI_CONF_DESC(en, "Enable Hierarchical Z on gen6+")
   DRI_CONF_OPT_END
@@ -95,7 +91,7 @@ PUBLIC const char __driConfigOptions[] =
DRI_CONF_SECTION_END
 DRI_CONF_END;
 
-const GLuint __driNConfigOptions = 17;
+const GLuint __driNConfigOptions = 16;
 
 #include "intel_batchbuffer.h"
 #include "intel_buffers.h"
-- 
1.8.1.1



Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.

2013-04-08 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 In the past, we preferred X-tiling for color buffers because our BLT
 code couldn't handle Y-tiling.  However, the BLT paths have been largely
 replaced by BLORP on Gen6+, which can handle any kind of tiling.

 We hadn't measured any performance improvement in the past, but that's
 probably because compressed textures were all uncompressed anyway.

 Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.

This series is:

Reviewed-by: Eric Anholt e...@anholt.net




Re: [Mesa-dev] [PATCH 1/2] mesa: fix glGet queries depending on derived framebuffer state

2013-04-08 Thread Eric Anholt
Marek Olšák mar...@gmail.com writes:

 ctx->DrawBuffer->Visual might be invalid if (NewState & _NEW_BUFFERS) != 0.

 NOTE: This is a candidate for stable branches.
 ---
  src/mesa/main/get_hash_params.py |8 
  1 file changed, 4 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/main/get_hash_params.py 
 b/src/mesa/main/get_hash_params.py
 index 4ef2324..580e62f 100644
 --- a/src/mesa/main/get_hash_params.py
 +++ b/src/mesa/main/get_hash_params.py
 @@ -8,7 +8,7 @@ descriptor=[
[ COLOR_WRITEMASK, LOC_CUSTOM, TYPE_INT_4, 0, NO_EXTRA ],
[ CULL_FACE, CONTEXT_BOOL(Polygon.CullFlag), NO_EXTRA ],
[ CULL_FACE_MODE, CONTEXT_ENUM(Polygon.CullFaceMode), NO_EXTRA ],
 -  [ DEPTH_BITS, BUFFER_INT(Visual.depthBits), NO_EXTRA ],
 +  [ DEPTH_BITS, BUFFER_INT(Visual.depthBits), extra_new_buffers ],
[ DEPTH_CLEAR_VALUE, CONTEXT_FIELD(Depth.Clear, TYPE_DOUBLEN), 
 NO_EXTRA ],
[ DEPTH_FUNC, CONTEXT_ENUM(Depth.Func), NO_EXTRA ],
[ DEPTH_RANGE, CONTEXT_FIELD(Viewport.Near, TYPE_FLOATN_2), NO_EXTRA ],
 @@ -31,7 +31,7 @@ descriptor=[
[ RED_BITS, BUFFER_INT(Visual.redBits), extra_new_buffers ],
[ SCISSOR_BOX, LOC_CUSTOM, TYPE_INT_4, 0, NO_EXTRA ],
[ SCISSOR_TEST, CONTEXT_BOOL(Scissor.Enabled), NO_EXTRA ],
 -  [ STENCIL_BITS, BUFFER_INT(Visual.stencilBits), NO_EXTRA ],
 +  [ STENCIL_BITS, BUFFER_INT(Visual.stencilBits), extra_new_buffers ],
[ STENCIL_CLEAR_VALUE, CONTEXT_INT(Stencil.Clear), NO_EXTRA ],
[ STENCIL_FAIL, LOC_CUSTOM, TYPE_ENUM, NO_OFFSET, NO_EXTRA ],
[ STENCIL_FUNC, LOC_CUSTOM, TYPE_ENUM, NO_OFFSET, NO_EXTRA ],
 @@ -80,8 +80,8 @@ descriptor=[
[ SAMPLE_COVERAGE_ARB, CONTEXT_BOOL(Multisample.SampleCoverage), 
 NO_EXTRA ],
[ SAMPLE_COVERAGE_VALUE_ARB, 
 CONTEXT_FLOAT(Multisample.SampleCoverageValue), NO_EXTRA ],
[ SAMPLE_COVERAGE_INVERT_ARB, 
 CONTEXT_BOOL(Multisample.SampleCoverageInvert), NO_EXTRA ],
 -  [ SAMPLE_BUFFERS_ARB, BUFFER_INT(Visual.sampleBuffers), NO_EXTRA ],
 -  [ SAMPLES_ARB, BUFFER_INT(Visual.samples), NO_EXTRA ],
 +  [ SAMPLE_BUFFERS_ARB, BUFFER_INT(Visual.sampleBuffers), 
 extra_new_buffers ],
 +  [ SAMPLES_ARB, BUFFER_INT(Visual.samples), extra_new_buffers ],

Don't RGBA_FLOAT_MODE_ARB and FRAMEBUFFER_SRGB_CAPABLE_EXT also need
this treatment?




Re: [Mesa-dev] [PATCH 2/2] mesa: update derived framebuffer state in GetMultisamplefv

2013-04-08 Thread Eric Anholt
Marek Olšák mar...@gmail.com writes:

 This makes sure that ctx->DrawBuffer->Visual.samples is up-to-date.

Reviewed-by: Eric Anholt e...@anholt.net




[Mesa-dev] [PATCH 2/2] i965/gen7.5: Allow HW primitive restart for all primitive types.

2013-04-08 Thread Paul Berry
Gen7.5 (Haswell) hardware supports primitive restart for all primitive
types.  It also handles all possible primitive restart indices.
Rather than specialize both can_cut_index_handle_restart_index() and
the switch statement in can_cut_index_handle_prims() for Haswell, just
return early if the hardware is Haswell because we know it can handle
everything.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_primitive_restart.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c 
b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
index e6902b4..10581b3 100644
--- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
+++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
@@ -36,18 +36,12 @@
 
 /**
  * Check if the hardware's cut index support can handle the primitive
- * restart index value.
+ * restart index value (pre-Haswell only).
  */
 static bool
 can_cut_index_handle_restart_index(struct gl_context *ctx,
const struct _mesa_index_buffer *ib)
 {
-   struct intel_context *intel = intel_context(ctx);
-
-   /* Haswell supports an arbitrary cut index. */
-   if (intel->is_haswell)
-  return true;
-
 bool cut_index_will_work;
 
 switch (ib->type) {
@@ -78,6 +72,7 @@ can_cut_index_handle_prims(struct gl_context *ctx,
GLuint nr_prims,
const struct _mesa_index_buffer *ib)
 {
+   struct intel_context *intel = intel_context(ctx);
struct brw_context *brw = brw_context(ctx);
 
 if (brw->sol.counting_primitives_generated ||
@@ -90,6 +85,10 @@ can_cut_index_handle_prims(struct gl_context *ctx,
   return false;
}
 
+   /* Otherwise Haswell can do it all. */
+   if (intel->is_haswell)
+  return true;
+
if (!can_cut_index_handle_restart_index(ctx, ib)) {
   /* The primitive restart index can't be handled, so take
* the software path
-- 
1.8.2
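
The resulting order of checks in can_cut_index_handle_prims() can be sketched as the standalone C function below. It is a simplification under stated assumptions: the two pre-Haswell checks (the restart index test and the per-primitive switch) are reduced to precomputed booleans, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the inputs to the hardware-restart decision. */
struct sketch_restart_state {
   bool counting_primitives;  /* primitives generated/written being counted */
   bool is_haswell;
   bool restart_index_ok;     /* outcome of the pre-Haswell cut index check */
   bool prim_types_ok;        /* outcome of the pre-Haswell primitive switch */
};

/* Returns true when hardware primitive restart can be used. */
static bool
sketch_can_use_hw_restart(const struct sketch_restart_state *s)
{
   assert(s != NULL);

   /* Primitive counting still forces the software path on every
    * generation, Haswell included.
    */
   if (s->counting_primitives)
      return false;

   /* Haswell handles any restart index and any primitive type. */
   if (s->is_haswell)
      return true;

   if (!s->restart_index_ok)
      return false;

   return s->prim_types_ok;
}
```

Placing the Haswell return after the counting check, rather than first, is what keeps the software fallback for primitive counting intact.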



[Mesa-dev] [PATCH 1/2] i965: Only use brw_draw.c's trim() function when necessary.

2013-04-08 Thread Paul Berry
brw_draw.c contains a trim() function which modifies the vertex count
for quads and quad strips in order to discard dangling vertices.  In
principle this shouldn't be necessary, since hardware since Gen4 is
capable of discarding dangling vertices by itself.  However, it's
necessary because as a hack to speed up rendering on Gen 4-5, we
sometimes convert quads to trifans and quad strips to tristrips.  The
trim() function isn't necessary on Gen6 and up.

This patch documents why and when the trim() function is necessary,
and avoids calling it when it's not needed.

This will avoid creating problems when we enable hardware support for
primitive restart of quads and quad strips on Haswell.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_draw.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 809bcc5..43a4f05 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -136,6 +136,14 @@ static void gen6_set_prim(struct brw_context *brw,
 }
 
 
+/**
+ * The hardware is capable of removing dangling vertices on its own; however,
+ * prior to Gen6, we sometimes convert quads into trifans (and quad strips
+ * into tristrips), since pre-Gen6 hardware requires a GS to render quads.
+ * This function manually trims dangling vertices from a draw call involving
+ * quads so that those dangling vertices won't get drawn when we convert to
+ * trifans/tristrips.
+ */
 static GLuint trim(GLenum prim, GLuint length)
 {
if (prim == GL_QUAD_STRIP)
@@ -171,7 +179,11 @@ static void brw_emit_prim(struct brw_context *brw,
   start_vertex_location += brw->vb.start_vertex_bias;
 }
 
-   verts_per_instance = trim(prim->mode, prim->count);
+   /* We only need to trim the primitive count on pre-Gen6. */
+   if (intel->gen < 6)
+  verts_per_instance = trim(prim->mode, prim->count);
+   else
+  verts_per_instance = prim->count;
 
/* If nothing to emit, just return. */
if (verts_per_instance == 0)
@@ -228,7 +240,7 @@ static void gen7_emit_prim(struct brw_context *brw,
   start_vertex_location += brw->vb.start_vertex_bias;
 }
 
-   verts_per_instance = trim(prim->mode, prim->count);
+   verts_per_instance = prim->count;
 
/* If nothing to emit, just return. */
if (verts_per_instance == 0)
-- 
1.8.2
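
To illustrate what "discarding dangling vertices" means here, the standalone C sketch below trims a vertex count for quads and quad strips and applies the trim only before Gen6. The arithmetic is an assumption about what such a trim has to do (whole quads need multiples of 4 vertices, quad strips need an even count of at least 4); it is not copied from brw_draw.c, and the enum and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-ins for the GL primitive modes of interest. */
enum sketch_prim { SKETCH_QUADS, SKETCH_QUAD_STRIP, SKETCH_OTHER };

/* Drop vertices that cannot form a complete quad. */
static unsigned
sketch_trim(enum sketch_prim prim, unsigned length)
{
   switch (prim) {
   case SKETCH_QUAD_STRIP:
      /* A strip needs 4 vertices for the first quad, then 2 per quad. */
      return length > 3 ? length - length % 2 : 0;
   case SKETCH_QUADS:
      return length - length % 4;
   default:
      return length;
   }
}

/* Only pre-Gen6 converts quads to trifans/tristrips, so only it trims. */
static unsigned
sketch_verts_per_instance(int gen, enum sketch_prim prim, unsigned count)
{
   assert(gen >= 4);
   return gen < 6 ? sketch_trim(prim, count) : count;
}
```

On Gen6 and later the count is passed through untouched and the hardware is left to discard the incomplete primitive itself.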



[Mesa-dev] [PATCH 1/2] i965/fs/gen7: Allow reads from MRFs.

2013-04-08 Thread Matt Turner
Since they're actually GRFs, we can read from them.

total instructions in shared programs: 852751 - 851371 (-0.16%)
instructions in affected programs: 227286 - 225906 (-0.61%)
(no regressions)
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |   22 --
 1 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c12ba45..57be319 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2121,16 +2121,18 @@ fs_visitor::compute_to_mrf()
 /* You can't read from an MRF, so if someone else reads our
  * MRF's source GRF that we wanted to rewrite, that stops us.
  */
-bool interfered = false;
-for (int i = 0; i < 3; i++) {
-   if (scan_inst->src[i].file == GRF &&
-   scan_inst->src[i].reg == inst->src[0].reg &&
-   scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
-  interfered = true;
-   }
-}
-if (interfered)
-   break;
+ if (intel->gen < 7) {
+bool interfered = false;
+for (int i = 0; i < 3; i++) {
+   if (scan_inst->src[i].file == GRF &&
+   scan_inst->src[i].reg == inst->src[0].reg &&
+   scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
+  interfered = true;
+   }
+}
+if (interfered)
+   break;
+ }
 
 if (scan_inst->dst.file == MRF) {
/* If somebody else writes our MRF here, we can't
-- 
1.7.8.6
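
The check being relaxed can be shown in isolation with the standalone C sketch below. The types and names are hypothetical simplifications of the back-end IR; the only behaviour taken from the patch is that a read of the source GRF by an intervening instruction stops the rewrite before Gen7 and is ignored from Gen7 on, where MRFs are really GRFs and can be read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much reduced model of a register operand. */
enum sketch_file { SKETCH_FILE_GRF, SKETCH_FILE_MRF, SKETCH_FILE_IMM };

struct sketch_reg {
   enum sketch_file file;
   int reg;
   int reg_offset;
};

/* Hypothetical instruction with one destination and three sources. */
struct sketch_mrf_inst {
   struct sketch_reg dst;
   struct sketch_reg src[3];
};

/* True if any source of scan reads exactly the given GRF. */
static bool
sketch_reads_grf(const struct sketch_mrf_inst *scan,
                 const struct sketch_reg *grf)
{
   for (int i = 0; i < 3; i++) {
      if (scan->src[i].file == SKETCH_FILE_GRF &&
          scan->src[i].reg == grf->reg &&
          scan->src[i].reg_offset == grf->reg_offset)
         return true;
   }
   return false;
}

/*
 * True if an intervening read of the GRF prevents rewriting its producer
 * to target the MRF directly.
 */
static bool
sketch_read_blocks_rewrite(int gen, const struct sketch_mrf_inst *scan,
                           const struct sketch_reg *grf)
{
   assert(scan != NULL && grf != NULL);

   /* From Gen7 on the reader could be pointed at the MRF instead. */
   if (gen >= 7)
      return false;

   return sketch_reads_grf(scan, grf);
}
```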



[Mesa-dev] [PATCH 2/2] i965/vs/gen7: Allow reads from MRFs.

2013-04-08 Thread Matt Turner
Since they're actually GRFs, we can read from them.

total instructions in shared programs: 344973 - 342483 (-0.72%)
instructions in affected programs: 245602 - 243112 (-1.01%)
(no regressions)
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp |   23 +--
 1 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index c58fb44..e337738 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -927,16 +927,18 @@ vec4_visitor::opt_register_coalesce()
   * GRF we're trying to coalesce to, we don't actually handle
   * rewriting sources so bail in that case as well.
   */
-bool interfered = false;
-for (int i = 0; i < 3; i++) {
-   if (scan_inst->src[i].file == GRF &&
-   scan_inst->src[i].reg == inst->src[0].reg &&
-   scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
-  interfered = true;
-   }
-}
-if (interfered)
-   break;
+ if (intel->gen < 7) {
+bool interfered = false;
+for (int i = 0; i < 3; i++) {
+   if (scan_inst->src[i].file == GRF &&
+   scan_inst->src[i].reg == inst->src[0].reg &&
+   scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
+  interfered = true;
+   }
+}
+if (interfered)
+   break;
+ }
 
  /* If somebody else writes our destination here, we can't coalesce
   * before that.
@@ -956,6 +958,7 @@ vec4_visitor::opt_register_coalesce()
break;
 }
  } else {
+bool interfered = false;
 for (int i = 0; i < 3; i++) {
if (scan_inst->src[i].file == inst->dst.file &&
scan_inst->src[i].reg == inst->dst.reg &&
-- 
1.7.8.6



Re: [Mesa-dev] [PATCH 2/2] i965/gen7.5: Allow HW primitive restart for all primitive types.

2013-04-08 Thread Jordan Justen
Series Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On Mon, Apr 8, 2013 at 11:57 AM, Paul Berry stereotype...@gmail.com wrote:
 Gen7.5 (Haswell) hardware supports primitive restart for all primitive
 types.  It also handles all possible primitive restart indices.
 Rather than specialize both can_cut_index_handle_restart_index() and
 the switch statement in can_cut_index_handle_prims() for Haswell, just
 return early if the hardware is Haswell because we know it can handle
 everything.

 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 13 ++---
  1 file changed, 6 insertions(+), 7 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c 
 b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 index e6902b4..10581b3 100644
 --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 @@ -36,18 +36,12 @@

  /**
   * Check if the hardware's cut index support can handle the primitive
 - * restart index value.
 + * restart index value (pre-Haswell only).
   */
  static bool
  can_cut_index_handle_restart_index(struct gl_context *ctx,
 const struct _mesa_index_buffer *ib)
  {
 -   struct intel_context *intel = intel_context(ctx);
 -
 -   /* Haswell supports an arbitrary cut index. */
 -   if (intel->is_haswell)
 -  return true;
 -
 bool cut_index_will_work;

  switch (ib->type) {
 @@ -78,6 +72,7 @@ can_cut_index_handle_prims(struct gl_context *ctx,
 GLuint nr_prims,
 const struct _mesa_index_buffer *ib)
  {
 +   struct intel_context *intel = intel_context(ctx);
 struct brw_context *brw = brw_context(ctx);

  if (brw->sol.counting_primitives_generated ||
 @@ -90,6 +85,10 @@ can_cut_index_handle_prims(struct gl_context *ctx,
return false;
 }

 +   /* Otherwise Haswell can do it all. */
 +   if (intel->is_haswell)
 +  return true;
 +
 if (!can_cut_index_handle_restart_index(ctx, ib)) {
/* The primitive restart index can't be handled, so take
 * the software path
 --
 1.8.2



Re: [Mesa-dev] [PATCH] intel: Remove the texture_tiling driconf option.

2013-04-08 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 This option can force textures to be untiled.  However, on Gen6+, depth
 buffers must be Y-tiled.  MSAA buffers also must be Y-tiled.  So setting
 this option on even a trivial application like glxgears causes assertion
 failures in a debug build, and likely GPU hangs in a release build.

 It's just giving users a license to shoot themselves in the foot.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org

Long long ago, it was occasionally useful for testing.  No more.

Reviewed-by: Eric Anholt e...@anholt.net




Re: [Mesa-dev] [PATCH] intel: Allocate hiz in intel_renderbuffer_move_to_temp()

2013-04-08 Thread Paul Berry
On 5 April 2013 16:51, Paul Berry stereotype...@gmail.com wrote:

 On 5 April 2013 15:28, Chad Versace chad.vers...@linux.intel.com wrote:

 When moving the renderbuffer to a new miptree, we neglected to allocate
 the hiz buffer for the new miptree. Oops.

 Fixes all Piglit depthstencil-render-miplevels tests from crash to pass on
 Sandybridge.

 CC: Paul Berry stereotype...@gmail.com
 CC: Eric Anholt e...@anholt.net
 Signed-off-by: Chad Versace chad.vers...@linux.intel.com


 I haven't had a chance to review this yet, but:

 Candidate for the 9.1 stable release branch?


Ok, if this is marked as a candidate for 9.1, then it is:

Reviewed-by: Paul Berry stereotype...@gmail.com




 ---
  src/mesa/drivers/dri/intel/intel_fbo.c | 4 
  1 file changed, 4 insertions(+)

 diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c
 b/src/mesa/drivers/dri/intel/intel_fbo.c
 index b91d6e0..2977568 100644
 --- a/src/mesa/drivers/dri/intel/intel_fbo.c
 +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
 @@ -1010,6 +1010,10 @@ intel_renderbuffer_move_to_temp(struct
 intel_context *intel,
   irb->mt->num_samples,
   false /* force_y_tiling */);

 +   if (intel->vtbl.is_hiz_depth_format(intel, new_mt->format)) {
 +  intel_miptree_alloc_hiz(intel, new_mt, irb->mt->num_samples);
 +   }
 +
 intel_miptree_copy_teximage(intel, intel_image, new_mt, invalidate);

  intel_miptree_reference(&irb->mt, intel_image->mt);
 --
 1.8.1.4





Re: [Mesa-dev] [PATCH 2/2] i965/gen7.5: Allow HW primitive restart for all primitive types.

2013-04-08 Thread Eric Anholt
Paul Berry stereotype...@gmail.com writes:

 Gen7.5 (Haswell) hardware supports primitive restart for all primitive
 types.  It also handles all possible primitive restart indices.
 Rather than specialize both can_cut_index_handle_restart_index() and
 the switch statement in can_cut_index_handle_prims() for Haswell, just
 return early if the hardware is Haswell because we know it can handle
 everything.

 Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Series is:

Reviewed-by: Eric Anholt e...@anholt.net
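
For readers skimming the thread, the order of checks that the quoted patch leaves in can_cut_index_handle_prims() can be sketched roughly as follows.  This is only an illustration: RestartQuery, its fields, and can_use_hw_restart() are made-up names, the first condition is abbreviated (the real one has more terms than the diff context shows), and the per-primitive switch is collapsed into a single flag.

```cpp
#include <cassert>

// Hypothetical summary of the inputs to the decision; not a Mesa struct.
struct RestartQuery {
   bool counting_primitives;  // stands in for the brw->sol.* conditions
   bool is_haswell;           // Gen7.5
   bool restart_index_ok;     // result of the restart-index check
   bool prim_types_ok;        // result of the per-primitive-type switch
};

// Returns true when the hardware cut index can be used for this draw.
bool can_use_hw_restart(const RestartQuery &q)
{
   // Cases that force the software path are tested first on all hardware.
   if (q.counting_primitives)
      return false;

   // Haswell handles every restart index and primitive type, so the
   // remaining checks only matter on earlier generations.
   if (q.is_haswell)
      return true;

   if (!q.restart_index_ok)
      return false;

   return q.prim_types_ok;
}
```

The point of the ordering is that the Haswell early return sits after the software-path conditions but before both pre-Haswell checks, so neither of those needs a Haswell special case.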




Re: [Mesa-dev] [PATCH 1/2] i965/fs/gen7: Allow reads from MRFs.

2013-04-08 Thread Eric Anholt
Matt Turner matts...@gmail.com writes:

 Since they're actually GRFs, we can read from them.

 total instructions in shared programs: 852751 -> 851371 (-0.16%)
 instructions in affected programs: 227286 -> 225906 (-0.61%)
 (no regressions)

I don't see you actually rewriting these GRF reads to be the new MRF, so
they'll now be reading uninitialized values.
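
A rough sketch of the missing step, as I read the objection: once the instruction that produced the GRF is retargeted to write the MRF, every later read of that GRF has to be redirected to the MRF as well, otherwise it reads a register nothing wrote.  The types and the rewrite_reads() helper below are hypothetical and heavily simplified; they are not the fs_visitor data structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified register and instruction types (not Mesa's).
enum class RegFile { GRF, MRF };

struct Reg {
   RegFile file;
   int nr;
   bool operator==(const Reg &o) const { return file == o.file && nr == o.nr; }
};

struct Inst {
   Reg dst;
   std::vector<Reg> src;
};

// Redirect every source that reads `grf` so that it reads `mrf` instead.
// Returns the number of sources that were rewritten.
int rewrite_reads(std::vector<Inst> &insts, const Reg &grf, const Reg &mrf)
{
   assert(grf.file == RegFile::GRF && mrf.file == RegFile::MRF);

   int rewritten = 0;
   for (Inst &inst : insts) {
      for (Reg &s : inst.src) {
         if (s == grf) {
            s = mrf;
            ++rewritten;
         }
      }
   }
   return rewritten;
}
```

Skipping the interference check on gen7 without a step like this leaves the intervening readers pointing at the old GRF.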





Re: [Mesa-dev] Mesa 9.1.2? (was Re: Mesa (9.1): 21 new commits)

2013-04-08 Thread Ian Romanick

On 04/05/2013 07:51 PM, Jordan Justen wrote:

On Fri, Apr 5, 2013 at 7:03 PM, Ian Romanick i...@freedesktop.org wrote:

I just cherry picked (almost) all of the marked patches from master that
have been out for two weeks or more.  There are a couple that I did not
pick.



With all that out of the way... how does a Mesa 9.1.2 release next Friday
sound?  43 patches have been cherry picked since 9.1.1, so it seems like a
good time.


0967c362 brings  gen6 from not working at all on TF2, to somewhat
working with major issues.

So, if it is not considered too risky, then getting it onto 9.1 might be nice.


The problem is that patch seems to depend on a pile of other patches... 
at least 8fbc22e8, but maybe also 463ef47 and a593a1b.


Perhaps someone can recommend an alternate patch specifically for 9.1?



Re: [Mesa-dev] [PATCH] i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.

2013-04-08 Thread Ian Romanick

On 04/08/2013 11:17 AM, Paul Berry wrote:

On 8 April 2013 10:37, Ian Romanick i...@freedesktop.org wrote:

On 04/07/2013 06:42 AM, Paul Berry wrote:

The call to emit_shader_time_end() before the second URB write was
conditioned with if (eot), but eot is always false in this code
path, so emit_shader_time_end() was never being called for vertex
shaders that performed 2 URB writes.


I had to look at that code for way too long to convince myself that
your patch was correct.  I think it might be better to remove both
the conditional emit_shader_time_end calls and put this block of
code at the very bottom (unless emit_shader_time_end has some side
effect that I don't see):

if (inst->eot) {
   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
      emit_shader_time_end();
}

Or does the last URB write have to be the last instruction?


The last URB write has to be the last instruction, since it's actually
the URB write that ends the thread (eot stands for end of thread).


I suspected it was something like that.


For GL 3.2 we're going to need to refactor this function to use a loop,
since GL 3.2 doubles the number of varying components permitted for
VS->GS linkage (so we'll need up to 4 URB writes instead of 2).  I think
once that change is made the function is going to be a lot easier to
follow.  Maybe I should just do that refactor now?


It's up to you.  I think the code in your patch is okay for now.

Reviewed-by: Ian Romanick ian.d.roman...@intel.com



[Mesa-dev] [PATCH 00/12] Death to array dereferences of vectors!

2013-04-08 Thread Ian Romanick
This series gradually replaces array dereferences of vectors with two
new expressions.  It takes so many patches because changes are needed to
the existing lowering passes and because several places in the code
generate array dereferences of vectors (e.g., lowering accesses to
gl_ClipDistance).  There is also some challenge in dealing with function
inout parameters that are indexed vectors.

The two new expressions are ir_binop_vector_extract and
ir_triop_vector_insert.  The former has a vector operand and a scalar
operand.  The result is the scalar element of the vector selected by
the scalar index.  The latter takes a vector and two scalars.  The result
is a new vector with one indexed field replaced by a scalar value.

Together this series fixes piglit tests glsl-vs-channel-overwrite-01 and
glsl-vs-channel-overwrite-03.
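
As a rough model of the two operations (not the IR classes themselves), their semantics can be written as plain functions.  Vec4, vector_extract() and vector_insert() are illustrative names only, and the operand order for the insert follows the ir.h comment in patch 02 (vector, value, index).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 4-component GLSL vector; not a Mesa type.
using Vec4 = std::array<float, 4>;

// Models ir_binop_vector_extract: a vector and a scalar index yield one
// scalar element.
float vector_extract(const Vec4 &v, std::size_t index)
{
   assert(index < v.size());
   return v[index];
}

// Models ir_triop_vector_insert: the result is a new vector with one field
// replaced; the input vector is taken by value and the caller's copy is
// left untouched.
Vec4 vector_insert(Vec4 v, float value, std::size_t index)
{
   assert(index < v.size());
   v[index] = value;
   return v;
}
```

Under this model a read of v[i] corresponds to vector_extract(v, i), and an assignment v[i] = x corresponds to v = vector_insert(v, x, i), so the indexed write becomes an ordinary whole-vector assignment.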



[Mesa-dev] [PATCH 01/12] glsl: Add ir_binop_vector_extract

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

The new opcode is used to get a single field from a vector.  The field
index may not be constant.  This will eventually replace
ir_dereference_array of vectors.  This is similar to the extractelement
instruction in LLVM IR.

http://llvm.org/docs/LangRef.html#extractelement-instruction

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.cpp |  5 +
 src/glsl/ir.h   | 10 +-
 src/glsl/ir_constant_expression.cpp | 35 ---
 src/glsl/ir_validate.cpp|  6 ++
 src/mesa/program/ir_to_mesa.cpp |  1 +
 5 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 05b77da..f4596db 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -399,6 +399,10 @@ ir_expression::ir_expression(int op, ir_rvalue *op0, ir_rvalue *op1)
   this->type = op0->type;
   break;
 
+   case ir_binop_vector_extract:
+  this->type = op0->type->get_scalar_type();
+  break;
+
default:
   assert(!"not reached: missing automatic type setup for ir_expression");
   this->type = glsl_type::float_type;
@@ -505,6 +509,7 @@ static const char *const operator_strs[] = {
"pow",
"packHalf2x16_split",
"ubo_load",
+   "vector_extract",
"lrp",
"vector",
 };
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 0c3e399..4da54fc 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -1115,9 +1115,17 @@ enum ir_expression_operation {
ir_binop_ubo_load,
 
/**
+* Extract a scalar from a vector
+*
+* operand0 is the vector
+* operand1 is the index of the field to read from operand0
+*/
+   ir_binop_vector_extract,
+
+   /**
 * A sentinel marking the last of the binary operations.
 */
-   ir_last_binop = ir_binop_ubo_load,
+   ir_last_binop = ir_binop_vector_extract,
 
ir_triop_lrp,
 
diff --git a/src/glsl/ir_constant_expression.cpp 
b/src/glsl/ir_constant_expression.cpp
index c09e56a..e802e6c 100644
--- a/src/glsl/ir_constant_expression.cpp
+++ b/src/glsl/ir_constant_expression.cpp
@@ -391,9 +391,16 @@ ir_expression::constant_expression_value(struct hash_table *variable_context)
}
 
if (op[1] != NULL)
-  assert(op[0]->type->base_type == op[1]->type->base_type ||
-this->operation == ir_binop_lshift ||
-this->operation == ir_binop_rshift);
+  switch (this->operation) {
+  case ir_binop_lshift:
+  case ir_binop_rshift:
+  case ir_binop_vector_extract:
+break;
+
+  default:
+assert(op[0]->type->base_type == op[1]->type->base_type);
+break;
+  }
 
bool op0_scalar = op[0]->type->is_scalar();
bool op1_scalar = op[1] != NULL && op[1]->type->is_scalar();
@@ -1230,6 +1237,28 @@ ir_expression::constant_expression_value(struct hash_table *variable_context)
   }
   break;
 
+   case ir_binop_vector_extract: {
+  const int c = op[1]->value.i[0];
+
+  switch (op[0]->type->base_type) {
+  case GLSL_TYPE_UINT:
+data.u[0] = op[0]->value.u[c];
+break;
+  case GLSL_TYPE_INT:
+data.i[0] = op[0]->value.i[c];
+break;
+  case GLSL_TYPE_FLOAT:
+data.f[0] = op[0]->value.f[c];
+break;
+  case GLSL_TYPE_BOOL:
+data.b[0] = op[0]->value.b[c];
+break;
+  default:
+assert(0);
+  }
+  break;
+   }
+
case ir_binop_bit_xor:
   for (unsigned c = 0, c0 = 0, c1 = 0;
c < components;
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 699c192..83519cf 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -468,6 +468,12 @@ ir_validate::visit_leave(ir_expression *ir)
   assert(ir->operands[1]->type == glsl_type::uint_type);
   break;
 
+   case ir_binop_vector_extract:
+  assert(ir->operands[0]->type->is_vector());
+  assert(ir->operands[1]->type->is_scalar()
+ && ir->operands[1]->type->is_integer());
+  break;
+
case ir_triop_lrp:
   assert(ir->operands[0]->type->base_type == GLSL_TYPE_FLOAT);
   assert(ir->operands[0]->type == ir->operands[1]->type);
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 14cf5ba..7d351c0 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -1485,6 +1485,7 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
   emit(ir, OPCODE_LRP, result_dst, op[2], op[1], op[0]);
   break;
 
+   case ir_binop_vector_extract:
case ir_quadop_vector:
   /* This operation should have already been handled.
*/
-- 
1.8.1.4



[Mesa-dev] [PATCH 02/12] glsl: Add ir_triop_vector_insert

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

The new opcode is used to generate a new vector with a single field from
the source vector replaced.  This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.cpp |  1 +
 src/glsl/ir.h   | 11 ++-
 src/glsl/ir_validate.cpp|  9 +
 src/mesa/program/ir_to_mesa.cpp |  1 +
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index f4596db..336ff95 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -511,6 +511,7 @@ static const char *const operator_strs[] = {
"ubo_load",
"vector_extract",
"lrp",
+   "vector_insert",
"vector",
 };
 
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 4da54fc..7106cde 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -1130,9 +1130,18 @@ enum ir_expression_operation {
ir_triop_lrp,
 
/**
+* Generate a value with one field of a vector changed
+*
+* operand0 is the vector
+* operand1 is the value to write into the vector result
+* operand2 is the index in operand0 to be modified
+*/
+   ir_triop_vector_insert,
+
+   /**
 * A sentinel marking the last of the ternary operations.
 */
-   ir_last_triop = ir_triop_lrp,
+   ir_last_triop = ir_triop_vector_insert,
 
ir_quadop_vector,
 
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 83519cf..f304af4 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -480,6 +480,15 @@ ir_validate::visit_leave(ir_expression *ir)
   assert(ir->operands[2]->type == ir->operands[0]->type || ir->operands[2]->type == glsl_type::float_type);
   break;
 
+   case ir_triop_vector_insert:
+  assert(ir->operands[0]->type->is_vector());
+  assert(ir->operands[1]->type->is_scalar());
+  assert(ir->operands[0]->type->base_type == ir->operands[1]->type->base_type);
+  assert(ir->operands[2]->type->is_scalar()
+ && ir->operands[2]->type->is_integer());
+  assert(ir->type == ir->operands[0]->type);
+  break;
+
case ir_quadop_vector:
   /* The vector operator collects some number of scalars and generates a
* vector from them.
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 7d351c0..eb64347 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -1486,6 +1486,7 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
   break;
 
case ir_binop_vector_extract:
+   case ir_triop_vector_insert:
case ir_quadop_vector:
   /* This operation should have already been handled.
*/
-- 
1.8.1.4



[Mesa-dev] [PATCH 03/12] glsl: Refactor part of convert_vec_index_to_cond_assign

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Split the conversion into two functions.  The first extracts the vector
being indexed and the index from the deref, then calls the second, which
does the real work.

Coming patches will add a new ir_expression for variable indexing into a
vector.  Having the lowering pass split into two functions will make it
much easier to lower the new ir_expression.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/lower_vec_index_to_cond_assign.cpp | 47 ++---
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/src/glsl/lower_vec_index_to_cond_assign.cpp 
b/src/glsl/lower_vec_index_to_cond_assign.cpp
index f85875f..6572cc4 100644
--- a/src/glsl/lower_vec_index_to_cond_assign.cpp
+++ b/src/glsl/lower_vec_index_to_cond_assign.cpp
@@ -53,6 +53,9 @@ public:
}
 
ir_rvalue *convert_vec_index_to_cond_assign(ir_rvalue *val);
+   ir_rvalue *convert_vec_index_to_cond_assign(void *mem_ctx,
+  ir_rvalue *orig_vector,
+  ir_rvalue *orig_index);
 
virtual ir_visitor_status visit_enter(ir_expression *);
virtual ir_visitor_status visit_enter(ir_swizzle *);
@@ -65,24 +68,15 @@ public:
 };
 
 ir_rvalue *
-ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue *ir)
+ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_ctx,
+ ir_rvalue *orig_vector,
+ ir_rvalue *orig_index)
 {
-   ir_dereference_array *orig_deref = ir->as_dereference_array();
ir_assignment *assign, *value_assign;
ir_variable *index, *var, *value;
ir_dereference *deref, *deref_value;
unsigned i;
 
-   if (!orig_deref)
-  return ir;
-
-   if (orig_deref->array->type->is_matrix() ||
-   orig_deref->array->type->is_array())
-  return ir;
-
-   void *mem_ctx = ralloc_parent(ir);
-
-   assert(orig_deref->array_index->type->base_type == GLSL_TYPE_INT);
 
exec_list list;
 
@@ -92,15 +86,15 @@ 
ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue
ir_var_temporary);
list.push_tail(index);
deref = new(base_ir) ir_dereference_variable(index);
-   assign = new(base_ir) ir_assignment(deref, orig_deref->array_index, NULL);
+   assign = new(base_ir) ir_assignment(deref, orig_index, NULL);
list.push_tail(assign);
 
/* Store the value inside a temp, thus avoiding matrixes duplication */
-   value = new(base_ir) ir_variable(orig_deref->array->type, "vec_value_tmp",
+   value = new(base_ir) ir_variable(orig_vector->type, "vec_value_tmp",
ir_var_temporary);
list.push_tail(value);
deref_value = new(base_ir) ir_dereference_variable(value);
-   value_assign = new(base_ir) ir_assignment(deref_value, orig_deref->array);
+   value_assign = new(base_ir) ir_assignment(deref_value, orig_vector);
list.push_tail(value_assign);
 
/* Temporary where we store whichever value we swizzle out. */
@@ -113,11 +107,11 @@ 
ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue
 */
ir_rvalue *const cond_deref =
   compare_index_block(list, index, 0,
- orig_deref->array->type->vector_elements,
+ orig_vector->type->vector_elements,
  mem_ctx);
 
/* Generate a conditional move of each vector element to the temp. */
-   for (i = 0; i < orig_deref->array->type->vector_elements; i++) {
+   for (i = 0; i < orig_vector->type->vector_elements; i++) {
   ir_rvalue *condition_swizzle =
 new(base_ir) ir_swizzle(cond_deref->clone(ir, NULL), i, 0, 0, 0, 1);
 
@@ -142,6 +136,25 @@ 
ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue
return new(base_ir) ir_dereference_variable(var);
 }
 
+ir_rvalue *
+ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue *ir)
+{
+   ir_dereference_array *orig_deref = ir->as_dereference_array();
+
+   if (!orig_deref)
+  return ir;
+
+   if (orig_deref->array->type->is_matrix() ||
+   orig_deref->array->type->is_array())
+  return ir;
+
+   assert(orig_deref->array_index->type->base_type == GLSL_TYPE_INT);
+
+   return convert_vec_index_to_cond_assign(ralloc_parent(ir),
+  orig_deref->array,
+  orig_deref->array_index);
+}
+
 ir_visitor_status
 ir_vec_index_to_cond_assign_visitor::visit_enter(ir_expression *ir)
 {
-- 
1.8.1.4



[Mesa-dev] [PATCH 05/12] glsl: Lower ir_binop_vector_extract to conditional moves

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Lower ir_binop_vector_extract with a non-constant index to a series of
conditional moves.  This is exactly like ir_dereference_array of a
vector with a non-constant index.
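
A minimal sketch of what "a series of conditional moves" means here, written as ordinary C++ rather than IR.  Vec4 and extract_via_cond_moves() are made-up names.  The real pass emits one conditional assignment per vector element into a temporary; the zero initialisation below is my own addition so that the sketch has defined behaviour for an out-of-range index.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a 4-component GLSL vector; not a Mesa type.
using Vec4 = std::array<float, 4>;

// Select v[index] without ever indexing v by a run-time value: every
// statement reads one fixed component and is guarded by a comparison
// against the index.
float extract_via_cond_moves(const Vec4 &v, int index)
{
   float tmp = 0.0f;  // assumption: initialised only to keep the sketch defined

   if (index == 0) tmp = v[0];
   if (index == 1) tmp = v[1];
   if (index == 2) tmp = v[2];
   if (index == 3) tmp = v[3];

   return tmp;
}
```

At most one of the guarded assignments fires, which is what lets hardware without variable indexing of vector components handle a non-constant index.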

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/lower_vec_index_to_cond_assign.cpp | 45 +
 1 file changed, 39 insertions(+), 6 deletions(-)

diff --git a/src/glsl/lower_vec_index_to_cond_assign.cpp 
b/src/glsl/lower_vec_index_to_cond_assign.cpp
index 6572cc4..2cd540c 100644
--- a/src/glsl/lower_vec_index_to_cond_assign.cpp
+++ b/src/glsl/lower_vec_index_to_cond_assign.cpp
@@ -55,7 +55,10 @@ public:
ir_rvalue *convert_vec_index_to_cond_assign(ir_rvalue *val);
ir_rvalue *convert_vec_index_to_cond_assign(void *mem_ctx,
   ir_rvalue *orig_vector,
-  ir_rvalue *orig_index);
+  ir_rvalue *orig_index,
+  const glsl_type *type);
+
+   ir_rvalue *convert_vector_extract_to_cond_assign(ir_rvalue *ir);
 
virtual ir_visitor_status visit_enter(ir_expression *);
virtual ir_visitor_status visit_enter(ir_swizzle *);
@@ -70,7 +73,8 @@ public:
 ir_rvalue *
 ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_ctx,
  ir_rvalue *orig_vector,
- ir_rvalue *orig_index)
+ ir_rvalue *orig_index,
+ const glsl_type *type)
 {
ir_assignment *assign, *value_assign;
ir_variable *index, *var, *value;
@@ -98,7 +102,7 @@ 
ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_
list.push_tail(value_assign);
 
/* Temporary where we store whichever value we swizzle out. */
-   var = new(base_ir) ir_variable(ir->type, "vec_index_tmp_v",
+   var = new(base_ir) ir_variable(type, "vec_index_tmp_v",
  ir_var_temporary);
list.push_tail(var);
 
@@ -113,7 +117,7 @@ 
ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_
/* Generate a conditional move of each vector element to the temp. */
for (i = 0; i < orig_vector->type->vector_elements; i++) {
   ir_rvalue *condition_swizzle =
-new(base_ir) ir_swizzle(cond_deref->clone(ir, NULL), i, 0, 0, 0, 1);
+new(base_ir) ir_swizzle(cond_deref->clone(mem_ctx, NULL), i, 0, 0, 0, 1);
 
   /* Just clone the rest of the deref chain when trying to get at the
* underlying variable.
@@ -152,7 +156,22 @@ 
ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue
 
return convert_vec_index_to_cond_assign(ralloc_parent(ir),
   orig_deref->array,
-  orig_deref->array_index);
+  orig_deref->array_index,
+  ir->type);
+}
+
+ir_rvalue *
+ir_vec_index_to_cond_assign_visitor::convert_vector_extract_to_cond_assign(ir_rvalue *ir)
+{
+   ir_expression *const expr = ir->as_expression();
+
+   if (expr == NULL || expr->operation != ir_binop_vector_extract)
+  return ir;
+
+   return convert_vec_index_to_cond_assign(ralloc_parent(ir),
+  expr->operands[0],
+  expr->operands[1],
+  ir->type);
 }
 
 ir_visitor_status
@@ -162,6 +181,7 @@ 
ir_vec_index_to_cond_assign_visitor::visit_enter(ir_expression *ir)
 
for (i = 0; i < ir->get_num_operands(); i++) {
   ir->operands[i] = convert_vec_index_to_cond_assign(ir->operands[i]);
+  ir->operands[i] = convert_vector_extract_to_cond_assign(ir->operands[i]);
}
 
return visit_continue;
@@ -175,6 +195,7 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_swizzle 
*ir)
 * using swizzling of scalars for vector construction.
 */
ir->val = convert_vec_index_to_cond_assign(ir->val);
+   ir->val = convert_vector_extract_to_cond_assign(ir->val);
 
return visit_continue;
 }
@@ -188,8 +209,12 @@ 
ir_vec_index_to_cond_assign_visitor::visit_leave(ir_assignment *ir)
unsigned i;
 
ir->rhs = convert_vec_index_to_cond_assign(ir->rhs);
-   if (ir->condition)
+   ir->rhs = convert_vector_extract_to_cond_assign(ir->rhs);
+
+   if (ir->condition) {
   ir->condition = convert_vec_index_to_cond_assign(ir->condition);
+  ir->condition = convert_vector_extract_to_cond_assign(ir->condition);
+   }
 
/* Last, handle the LHS */
ir_dereference_array *orig_deref = ir->lhs->as_dereference_array();
@@ -279,6 +304,12 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_call 
*ir)
 
   if (new_param != 

[Mesa-dev] [PATCH 04/12] glsl: Lower ir_binop_vector_extract to swizzle

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Lower ir_binop_vector_extract with a constant index to a swizzle.  This
is exactly like ir_dereference_array of a vector with a constant index.
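
For reference, the index handling that the new convert_vector_extract_to_swizzle() in the diff below performs with MIN2/MAX2 can be sketched in standalone C++.  clamp_swizzle_index() is an illustrative name, not a function in the patch.

```cpp
#include <algorithm>
#include <cassert>

// Clamp a constant index into [0, vector_elements - 1] so that it is always
// a valid swizzle component.
int clamp_swizzle_index(int index, int vector_elements)
{
   assert(vector_elements >= 1);
   return std::min(std::max(index, 0), vector_elements - 1);
}
```

For a vec4, an index of -1 maps to component 0 and an index of 7 maps to component 3, while in-range indices pass through unchanged.  Clamping is a simplification that the spec text quoted in the diff permits, since out-of-range indexing is undefined behaviour anyway.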

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/lower_vec_index_to_swizzle.cpp | 45 +
 1 file changed, 45 insertions(+)

diff --git a/src/glsl/lower_vec_index_to_swizzle.cpp 
b/src/glsl/lower_vec_index_to_swizzle.cpp
index 264d6dc..ad09dd2 100644
--- a/src/glsl/lower_vec_index_to_swizzle.cpp
+++ b/src/glsl/lower_vec_index_to_swizzle.cpp
@@ -47,6 +47,7 @@ public:
}
 
ir_rvalue *convert_vec_index_to_swizzle(ir_rvalue *val);
+   ir_rvalue *convert_vector_extract_to_swizzle(ir_rvalue *val);
 
virtual ir_visitor_status visit_enter(ir_expression *);
virtual ir_visitor_status visit_enter(ir_swizzle *);
@@ -98,6 +99,40 @@ 
ir_vec_index_to_swizzle_visitor::convert_vec_index_to_swizzle(ir_rvalue *ir)
return new(ctx) ir_swizzle(deref->array, i, 0, 0, 0, 1);
 }
 
+ir_rvalue *
+ir_vec_index_to_swizzle_visitor::convert_vector_extract_to_swizzle(ir_rvalue *ir)
+{
+   ir_expression *const expr = ir->as_expression();
+   if (expr == NULL || expr->operation != ir_binop_vector_extract)
+  return ir;
+
+   ir_constant *const idx = expr->operands[1]->constant_expression_value();
+   if (idx == NULL)
+  return ir;
+
+   void *ctx = ralloc_parent(ir);
+   this->progress = true;
+
+   /* Page 40 of the GLSL 1.20 spec says:
+*
+* When indexing with non-constant expressions, behavior is undefined
+* if the index is negative, or greater than or equal to the size of
+* the vector.
+*
+* The quoted spec text mentions non-constant expressions, but this code
+* operates on constants.  These constants are the result of non-constant
+* expressions that have been optimized to constants.  The common case here
+* is a loop counter from an unrolled loop that is used to index a vector.
+*
+* The ir_swizzle constructor gets angry if the index is negative or too
+* large.  For simplicity sake, just clamp the index to [0, size-1].
+*/
+   const int i = MIN2(MAX2(idx->value.i[0], 0),
+ ((int) expr->operands[0]->type->vector_elements - 1));
+
+   return new(ctx) ir_swizzle(expr->operands[0], i, 0, 0, 0, 1);
+}
+
 ir_visitor_status
 ir_vec_index_to_swizzle_visitor::visit_enter(ir_expression *ir)
 {
@@ -105,6 +140,7 @@ ir_vec_index_to_swizzle_visitor::visit_enter(ir_expression 
*ir)
 
for (i = 0; i < ir->get_num_operands(); i++) {
   ir->operands[i] = convert_vec_index_to_swizzle(ir->operands[i]);
+  ir->operands[i] = convert_vector_extract_to_swizzle(ir->operands[i]);
}
 
return visit_continue;
@@ -127,6 +163,7 @@ ir_vec_index_to_swizzle_visitor::visit_enter(ir_assignment 
*ir)
 {
ir->set_lhs(convert_vec_index_to_swizzle(ir->lhs));
ir->rhs = convert_vec_index_to_swizzle(ir->rhs);
+   ir->rhs = convert_vector_extract_to_swizzle(ir->rhs);
 
return visit_continue;
 }
@@ -140,6 +177,12 @@ ir_vec_index_to_swizzle_visitor::visit_enter(ir_call *ir)
 
   if (new_param != param) {
 param->replace_with(new_param);
+  } else {
+new_param = convert_vec_index_to_swizzle(param);
+
+if (new_param != param) {
+   param->replace_with(new_param);
+}
   }
}
 
@@ -151,6 +194,7 @@ ir_vec_index_to_swizzle_visitor::visit_enter(ir_return *ir)
 {
if (ir->value) {
   ir->value = convert_vec_index_to_swizzle(ir->value);
+  ir->value = convert_vector_extract_to_swizzle(ir->value);
}
 
return visit_continue;
@@ -160,6 +204,7 @@ ir_visitor_status
 ir_vec_index_to_swizzle_visitor::visit_enter(ir_if *ir)
 {
ir->condition = convert_vec_index_to_swizzle(ir->condition);
+   ir->condition = convert_vector_extract_to_swizzle(ir->condition);
 
return visit_continue;
 }
-- 
1.8.1.4



[Mesa-dev] [PATCH 06/12] glsl: Add lowering pass for ir_triop_vector_insert

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

This will eventually replace do_vec_index_to_cond_assign.  This lowering
pass is called in all the places where do_vec_index_to_cond_assign or
do_vec_index_to_swizzle is called.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/Makefile.sources  |   1 +
 src/glsl/glsl_parser_extras.cpp|   1 +
 src/glsl/ir_optimization.h |   1 +
 src/glsl/lower_vector_insert.cpp   | 157 +
 src/mesa/drivers/dri/i965/brw_shader.cpp   |   1 +
 src/mesa/program/ir_to_mesa.cpp|   1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   1 +
 7 files changed, 163 insertions(+)
 create mode 100644 src/glsl/lower_vector_insert.cpp

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 674a05f..8e2dc1b 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -69,6 +69,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/lower_vec_index_to_cond_assign.cpp \
$(GLSL_SRCDIR)/lower_vec_index_to_swizzle.cpp \
$(GLSL_SRCDIR)/lower_vector.cpp \
+   $(GLSL_SRCDIR)/lower_vector_insert.cpp \
$(GLSL_SRCDIR)/lower_output_reads.cpp \
$(GLSL_SRCDIR)/lower_ubo_reference.cpp \
$(GLSL_SRCDIR)/opt_algebraic.cpp \
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 0992294..d38e967 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1236,6 +1236,7 @@ do_common_optimization(exec_list *ir, bool linked,
progress = do_algebraic(ir) || progress;
progress = do_lower_jumps(ir) || progress;
progress = do_vec_index_to_swizzle(ir) || progress;
+   progress = lower_vector_insert(ir, false) || progress;
progress = do_swizzle_swizzle(ir) || progress;
progress = do_noop_swizzle(ir) || progress;
 
diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index a8885d7..0216e46 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -106,6 +106,7 @@ void lower_ubo_reference(struct gl_shader *shader, exec_list *instructions);
 void lower_packed_varyings(void *mem_ctx, unsigned location_base,
unsigned locations_used, ir_variable_mode mode,
gl_shader *shader);
+bool lower_vector_insert(exec_list *instructions, bool lower_nonconstant_index);
 bool optimize_redundant_jumps(exec_list *instructions);
 bool optimize_split_arrays(exec_list *instructions, bool linked);
 
diff --git a/src/glsl/lower_vector_insert.cpp b/src/glsl/lower_vector_insert.cpp
new file mode 100644
index 000..da1485c
--- /dev/null
+++ b/src/glsl/lower_vector_insert.cpp
@@ -0,0 +1,157 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+#include "ir.h"
+#include "ir_builder.h"
+#include "ir_rvalue_visitor.h"
+#include "ir_optimization.h"
+
+using namespace ir_builder;
+
+class vector_insert_visitor : public ir_rvalue_visitor {
+public:
+   vector_insert_visitor(bool lower_nonconstant_index)
+      : progress(false), lower_nonconstant_index(lower_nonconstant_index)
+   {
+      factory.instructions = &factory_instructions;
+   }
+
+   virtual ~vector_insert_visitor()
+   {
+      assert(factory_instructions.is_empty());
+   }
+
+   virtual void handle_rvalue(ir_rvalue **rv);
+
+   ir_factory factory;
+   exec_list factory_instructions;
+   bool progress;
+   bool lower_nonconstant_index;
+};
+
+
+void
+vector_insert_visitor::handle_rvalue(ir_rvalue **rv)
+{
+   if (*rv == NULL || (*rv)->ir_type != ir_type_expression)
+      return;
+
+   ir_expression *const expr = (ir_expression *) *rv;
+
+   if (likely(expr->operation != ir_triop_vector_insert))
+      return;
+
+   factory.mem_ctx = ralloc_parent(expr);
+
+   ir_constant *const idx = expr->operands[2]->constant_expression_value();
+   if (idx != NULL) {
+  /* 

[Mesa-dev] [PATCH 07/12] glsl: Convert ir_binop_vector_extract in the LHS to ir_triop_vector_insert

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

The ast_array_index code can't know whether to generate an
ir_binop_vector_extract or an ir_triop_vector_insert.  Instead it will
always generate ir_binop_vector_extract, and the LHS and RHS have to be
re-written.
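
The rewrite described above can be sketched on a toy expression tree. Everything in this block (`Expr`, `Assignment`, `fix_vector_extract_lhs`, `to_string`) is hypothetical and far simpler than Mesa's IR classes; only the operand order of the insert (vector, new value, index) is taken from the patch below.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node; a hypothetical stand-in for ir_rvalue.
struct Expr {
   std::string op;      // "var", "vector_extract" or "vector_insert"
   std::string name;    // only used when op == "var"
   std::vector<std::shared_ptr<Expr>> operands;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr var(const std::string &n)
{
   return std::make_shared<Expr>(Expr{"var", n, {}});
}

ExprPtr expr(const std::string &op, std::vector<ExprPtr> ops)
{
   return std::make_shared<Expr>(Expr{op, "", std::move(ops)});
}

struct Assignment {
   ExprPtr lhs;
   ExprPtr rhs;
};

// If the LHS is (vector_extract vec idx), turn  vec[idx] = rhs  into
// vec = (vector_insert vec rhs idx).  Any other LHS is left untouched.
void fix_vector_extract_lhs(Assignment &a)
{
   if (a.lhs->op != "vector_extract")
      return;

   ExprPtr vec = a.lhs->operands[0];
   ExprPtr idx = a.lhs->operands[1];
   a.rhs = expr("vector_insert", {vec, a.rhs, idx});
   a.lhs = vec;
}

// Print an expression in a small s-expression form for inspection.
std::string to_string(const ExprPtr &e)
{
   if (e->op == "var")
      return e->name;

   std::string s = "(" + e->op;
   for (const ExprPtr &o : e->operands)
      s += " " + to_string(o);
   return s + ")";
}
```

Applied to `v[i] = f`, the sketch leaves `v` on the left and `(vector_insert v f i)` on the right, so the assignment target is a plain dereference again.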

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_to_hir.cpp | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index a0ec71c..5414e18 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -672,6 +672,30 @@ do_assignment(exec_list *instructions, struct _mesa_glsl_parse_state *state,
    void *ctx = state;
    bool error_emitted = (lhs->type->is_error() || rhs->type->is_error());
 
+   /* If the assignment LHS comes back as an ir_binop_vector_extract
+    * expression, move it to the RHS as an ir_triop_vector_insert.
+    */
+   if (lhs->ir_type == ir_type_expression) {
+      ir_expression *const expr = lhs->as_expression();
+
+      if (unlikely(expr->operation == ir_binop_vector_extract)) {
+         ir_rvalue *new_rhs =
+            validate_assignment(state, lhs->type, rhs, is_initializer);
+
+         if (new_rhs == NULL) {
+            _mesa_glsl_error(& lhs_loc, state, "type mismatch");
+            return lhs;
+         } else {
+            rhs = new(ctx) ir_expression(ir_triop_vector_insert,
+                                         expr->operands[0]->type,
+                                         expr->operands[0],
+                                         new_rhs,
+                                         expr->operands[1]);
+            lhs = expr->operands[0]->clone(ctx, NULL);
+         }
+      }
+   }
+
    ir_variable *lhs_var = lhs->variable_referenced();
    if (lhs_var)
       lhs_var->assigned = true;
-- 
1.8.1.4



[Mesa-dev] [PATCH 08/12] glsl: Generate ir_binop_vector_extract for indexing of vectors

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Now ir_dereference_array of a vector will never occur in the RHS of an
expression.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_array_index.cpp | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index 862f64c..e7bc299 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -31,17 +31,13 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
                              ir_rvalue *array, ir_rvalue *idx,
                              YYLTYPE &loc, YYLTYPE &idx_loc)
 {
-   ir_rvalue *result = new(mem_ctx) ir_dereference_array(array, idx);
-
    if (!array->type->is_error()
        && !array->type->is_array()
        && !array->type->is_matrix()
-       && !array->type->is_vector()) {
+       && !array->type->is_vector())
       _mesa_glsl_error(& idx_loc, state,
                        "cannot dereference non-array / non-matrix / "
                        "non-vector");
-      result->type = glsl_type::error_type;
-   }
 
    if (!idx->type->is_error()) {
       if (!idx->type->is_integer()) {
@@ -174,5 +170,20 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
       }
    }
 
-   return result;
+   /* After performing all of the error checking, generate the IR for the
+    * expression.
+    */
+   if (array->type->is_array()
+       || array->type->is_matrix()) {
+      return new(mem_ctx) ir_dereference_array(array, idx);
+   } else if (array->type->is_vector()) {
+      return new(mem_ctx) ir_expression(ir_binop_vector_extract, array, idx);
+   } else if (array->type->is_error()) {
+      return array;
+   } else {
+      ir_rvalue *result = new(mem_ctx) ir_dereference_array(array, idx);
+      result->type = glsl_type::error_type;
+
+      return result;
+   }
 }
-- 
1.8.1.4



[Mesa-dev] [PATCH 11/12] glsl: Generate correct ir_binop_vector_extract code for out and inout parameters

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Like with type conversions on out parameters, some extra copies need to
occur to handle these cases.  The fundamental problem is that
ir_binop_vector_extract is not an lvalue, but out and inout parameters
must be lvalues.  A previous patch dealt with a similar problem in the
LHS of ir_assignment.
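
The extra copies can be pictured with a small standalone sketch. It is hypothetical and not the generated IR: `Vec4`, `set_to_seven`, `add_one`, `call_out` and `call_inout` are made-up names, and C++ references stand in for GLSL `out` / `inout` parameters.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a GLSL vec4; not a Mesa type.
using Vec4 = std::array<float, 4>;

// Hypothetical callees: the first behaves like an 'out' parameter,
// the second like an 'inout' parameter.
void set_to_seven(float &x) { x = 7.0f; }
void add_one(float &x)      { x += 1.0f; }

// v = vector_insert(v, s, i)
Vec4 vector_insert(Vec4 v, float s, int i)
{
   v[i] = s;
   return v;
}

// Equivalent of calling set_to_seven(v[i]): the call writes a temporary,
// and the temporary is inserted into the vector after the call returns.
void call_out(Vec4 &v, int i)
{
   float tmp = 0.0f;   // never read by an 'out' callee; zeroed for the sketch
   set_to_seven(tmp);
   v = vector_insert(v, tmp, i);
}

// Equivalent of calling add_one(v[i]): copy in before the call,
// insert back after it.
void call_inout(Vec4 &v, int i)
{
   float tmp = v[i];
   add_one(tmp);
   v = vector_insert(v, tmp, i);
}
```

The only difference between the two cases is the copy-in before the call, which matches the `parameter_is_inout` branch in the patch.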

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_function.cpp | 149 +++---
 1 file changed, 102 insertions(+), 47 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 26f72cf..0d32241 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -165,10 +165,18 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
                              actual->variable_referenced()->name);
             return false;
          } else if (!actual->is_lvalue()) {
-            _mesa_glsl_error(&loc, state,
-                             "function parameter '%s %s' is not an lvalue",
-                             mode, formal->name);
-            return false;
+            /* Even though ir_binop_vector_extract is not an l-value, let it
+             * slop through.  generate_call will handle it correctly.
+             */
+            ir_expression *const expr = ((ir_rvalue *) actual)->as_expression();
+            if (expr == NULL
+                || expr->operation != ir_binop_vector_extract
+                || !expr->operands[0]->is_lvalue()) {
+               _mesa_glsl_error(&loc, state,
+                                "function parameter '%s %s' is not an lvalue",
+                                mode, formal->name);
+               return false;
+            }
          }
       }
 
@@ -178,6 +186,93 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
    return true;
 }
 
+static void
+fix_parameter(void *mem_ctx, ir_rvalue *actual, const glsl_type *formal_type,
+              exec_list *before_instructions, exec_list *after_instructions,
+              bool parameter_is_inout)
+{
+   ir_expression *const expr = actual->as_expression();
+
+   /* If the types match exactly and the parameter is not a vector-extract,
+    * nothing needs to be done to fix the parameter.
+    */
+   if (formal_type == actual->type
+       && (expr == NULL || expr->operation != ir_binop_vector_extract))
+      return;
+
+   /* To convert an out parameter, we need to create a temporary variable to
+    * hold the value before conversion, and then perform the conversion after
+    * the function call returns.
+    *
+    * This has the effect of transforming code like this:
+    *
+    *   void f(out int x);
+    *   float value;
+    *   f(value);
+    *
+    * Into IR that's equivalent to this:
+    *
+    *   void f(out int x);
+    *   float value;
+    *   int out_parameter_conversion;
+    *   f(out_parameter_conversion);
+    *   value = float(out_parameter_conversion);
+    *
+    * If the parameter is an ir_expression of ir_binop_vector_extract,
+    * additional conversion is needed in the post-call re-write.
+    */
+   ir_variable *tmp =
+      new(mem_ctx) ir_variable(formal_type, "inout_tmp", ir_var_temporary);
+
+   before_instructions->push_tail(tmp);
+
+   /* If the parameter is an inout parameter, copy the value of the actual
+    * parameter to the new temporary.  Note that no type conversion is allowed
+    * here because inout parameters must match types exactly.
+    */
+   if (parameter_is_inout) {
+      /* Inout parameters should never require conversion, since that would
+       * require an implicit conversion to exist both to and from the formal
+       * parameter type, and there are no bidirectional implicit conversions.
+       */
+      assert (actual->type == formal_type);
+
+      ir_dereference_variable *const deref_tmp_1 =
+         new(mem_ctx) ir_dereference_variable(tmp);
+      ir_assignment *const assignment =
+         new(mem_ctx) ir_assignment(deref_tmp_1, actual);
+      before_instructions->push_tail(assignment);
+   }
+
+   /* Replace the parameter in the call with a dereference of the new
+    * temporary.
+    */
+   ir_dereference_variable *const deref_tmp_2 =
+      new(mem_ctx) ir_dereference_variable(tmp);
+   actual->replace_with(deref_tmp_2);
+
+
+   /* Copy the temporary variable to the actual parameter with optional
+    * type conversion applied.
+    */
+   ir_rvalue *rhs = new(mem_ctx) ir_dereference_variable(tmp);
+   if (actual->type != formal_type)
+      rhs = convert_component(rhs, actual->type);
+
+   ir_rvalue *lhs = actual;
+   if (expr != NULL && expr->operation == ir_binop_vector_extract) {
+      rhs = new(mem_ctx) ir_expression(ir_triop_vector_insert,
+                                       expr->operands[0]->type,
+                                       expr->operands[0]->clone(mem_ctx, NULL),
+                                       rhs,
+                                       expr->operands[1]->clone(mem_ctx, NULL));
+      lhs = expr->operands[0]->clone(mem_ctx, NULL);
+   }

[Mesa-dev] [PATCH 09/12] glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Right now the lower_clip_distance_visitor lowers variable indexing into
gl_ClipDistance into variable indexing into both the array
gl_ClipDistanceMESA and the vectors of that array.  For example,

gl_ClipDistance[i] = f;

becomes

gl_ClipDistanceMESA[i/4][i%4] = f;

However, variable indexing into vectors using ir_dereference_array is
being removed.  Instead, ir_expression with ir_triop_vector_insert will
be used.  The above code will become

gl_ClipDistanceMESA[i/4] = vector_insert(gl_ClipDistanceMESA[i/4],
 i % 4,
 f);

In order to do this, an ir_rvalue_visitor will need to be used.  This
commit is really just a refactor to get ready for that.
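
The index split described above can be checked with a small standalone sketch. It is hypothetical: `ClipDistanceMESA`, `store_clip_distance` and `load_clip_distance` are made-up names, and the size of two vec4s (eight clip distances) is an assumption made for the example.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a GLSL vec4; not a Mesa type.
using Vec4 = std::array<float, 4>;

// Assumption for the sketch: eight clip distances packed into two vec4s.
using ClipDistanceMESA = std::array<Vec4, 2>;

// Same argument order as in the commit message: vector, component, value.
Vec4 vector_insert(Vec4 v, int component, float s)
{
   v[component] = s;
   return v;
}

// gl_ClipDistance[i] = f;
void store_clip_distance(ClipDistanceMESA &packed, int i, float f)
{
   packed[i / 4] = vector_insert(packed[i / 4], i % 4, f);
}

// f = gl_ClipDistance[i];
float load_clip_distance(const ClipDistanceMESA &packed, int i)
{
   return packed[i / 4][i % 4];
}
```

Writing element 6, for instance, touches component 2 of the second vec4 and leaves every other component alone.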

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: Paul Berry stereotype...@gmail.com
---
 src/glsl/lower_clip_distance.cpp | 136 +--
 1 file changed, 86 insertions(+), 50 deletions(-)

diff --git a/src/glsl/lower_clip_distance.cpp b/src/glsl/lower_clip_distance.cpp
index 643807d..26a0feb 100644
--- a/src/glsl/lower_clip_distance.cpp
+++ b/src/glsl/lower_clip_distance.cpp
@@ -46,10 +46,10 @@
  */
 
 #include "glsl_symbol_table.h"
-#include "ir_hierarchical_visitor.h"
+#include "ir_rvalue_visitor.h"
 #include "ir.h"
 
-class lower_clip_distance_visitor : public ir_hierarchical_visitor {
+class lower_clip_distance_visitor : public ir_rvalue_visitor {
 public:
    lower_clip_distance_visitor()
       : progress(false), old_clip_distance_var(NULL),
@@ -59,11 +59,12 @@ public:
 
    virtual ir_visitor_status visit(ir_variable *);
    void create_indices(ir_rvalue*, ir_rvalue *&, ir_rvalue *&);
-   virtual ir_visitor_status visit_leave(ir_dereference_array *);
    virtual ir_visitor_status visit_leave(ir_assignment *);
    void visit_new_assignment(ir_assignment *ir);
    virtual ir_visitor_status visit_leave(ir_call *);
 
+   virtual void handle_rvalue(ir_rvalue **rvalue);
+
    bool progress;
 
    /**
@@ -173,33 +174,35 @@ lower_clip_distance_visitor::create_indices(ir_rvalue *old_index,
 }
 
 
-/**
- * Replace any expression that indexes into the gl_ClipDistance array with an
- * expression that indexes into one of the vec4's in gl_ClipDistanceMESA and
- * accesses the appropriate component.
- */
-ir_visitor_status
-lower_clip_distance_visitor::visit_leave(ir_dereference_array *ir)
+void
+lower_clip_distance_visitor::handle_rvalue(ir_rvalue **rv)
 {
    /* If the gl_ClipDistance var hasn't been declared yet, then
     * there's no way this deref can refer to it.
     */
-   if (!this->old_clip_distance_var)
-      return visit_continue;
-
-   ir_dereference_variable *old_var_ref = ir->array->as_dereference_variable();
-   if (old_var_ref && old_var_ref->var == this->old_clip_distance_var) {
-      this->progress = true;
-      ir_rvalue *array_index;
-      ir_rvalue *swizzle_index;
-      this->create_indices(ir->array_index, array_index, swizzle_index);
-      void *mem_ctx = ralloc_parent(ir);
-      ir->array = new(mem_ctx) ir_dereference_array(
-         this->new_clip_distance_var, array_index);
-      ir->array_index = swizzle_index;
+   if (!this->old_clip_distance_var || *rv == NULL)
+      return;
+
+   ir_dereference_array *const array = (*rv)->as_dereference_array();
+   if (array != NULL) {
+      /* Replace any expression that indexes into the gl_ClipDistance array
+       * with an expression that indexes into one of the vec4's in
+       * gl_ClipDistanceMESA and accesses the appropriate component.
+       */
+      ir_dereference_variable *old_var_ref =
+         array->array->as_dereference_variable();
+      if (old_var_ref && old_var_ref->var == this->old_clip_distance_var) {
+         this->progress = true;
+         ir_rvalue *array_index;
+         ir_rvalue *swizzle_index;
+         this->create_indices(array->array_index, array_index, swizzle_index);
+         void *mem_ctx = ralloc_parent(array);
+         array->array =
+            new(mem_ctx) ir_dereference_array(this->new_clip_distance_var,
+                                              array_index);
+         array->array_index = swizzle_index;
+      }
    }
-
-   return visit_continue;
 }
 
 
@@ -214,38 +217,71 @@ lower_clip_distance_visitor::visit_leave(ir_assignment *ir)
 {
    ir_dereference_variable *lhs_var = ir->lhs->as_dereference_variable();
    ir_dereference_variable *rhs_var = ir->rhs->as_dereference_variable();
-   if ((lhs_var && lhs_var->var == this->old_clip_distance_var)
-       || (rhs_var && rhs_var->var == this->old_clip_distance_var)) {
-      /* LHS or RHS of the assignment is the entire gl_ClipDistance array.
-       * Since we are reshaping gl_ClipDistance from an array of floats to an
-       * array of vec4's, this isn't going to work as a bulk assignment
-       * anymore, so unroll it to element-by-element assignments and lower
-       * each of them.
-       *
-       * Note: to unroll into element-by-element assignments, we 

[Mesa-dev] [PATCH 12/12] glsl: Death to array dereferences of vectors!

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Now that all the places that used to generate array dereferences of
vectors have been changed to generate either ir_binop_vector_extract or
ir_triop_vector_insert (or both), remove all support for dealing with
this deprecated construct.

As an added safeguard, modify ir_validate to reject ir_dereference_array
of a vector.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir_validate.cpp|  29 +++
 src/glsl/lower_vec_index_to_cond_assign.cpp | 116 +---
 src/glsl/lower_vec_index_to_swizzle.cpp |  56 +-
 3 files changed, 32 insertions(+), 169 deletions(-)

diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index f304af4..4957146 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -69,6 +69,8 @@ public:
    virtual ir_visitor_status visit_leave(ir_expression *ir);
    virtual ir_visitor_status visit_leave(ir_swizzle *ir);
 
+   virtual ir_visitor_status visit_enter(class ir_dereference_array *);
+
    virtual ir_visitor_status visit_enter(ir_assignment *ir);
    virtual ir_visitor_status visit_enter(ir_call *ir);
 
@@ -102,6 +104,33 @@ ir_validate::visit(ir_dereference_variable *ir)
 }
 
 ir_visitor_status
+ir_validate::visit_enter(class ir_dereference_array *ir)
+{
+   if (!ir->array->type->is_array() && !ir->array->type->is_matrix()) {
+      printf("ir_dereference_array @ %p does not specify an array or a "
+             "matrix\n",
+             (void *) ir);
+      ir->print();
+      printf("\n");
+      abort();
+   }
+
+   if (!ir->array_index->type->is_scalar()) {
+      printf("ir_dereference_array @ %p does not have scalar index: %s\n",
+             (void *) ir, ir->array_index->type->name);
+      abort();
+   }
+
+   if (!ir->array_index->type->is_integer()) {
+      printf("ir_dereference_array @ %p does not have integer index: %s\n",
+             (void *) ir, ir->array_index->type->name);
+      abort();
+   }
+
+   return visit_continue;
+}
+
+ir_visitor_status
 ir_validate::visit_enter(ir_if *ir)
 {
    if (ir->condition->type != glsl_type::bool_type) {
diff --git a/src/glsl/lower_vec_index_to_cond_assign.cpp b/src/glsl/lower_vec_index_to_cond_assign.cpp
index 2cd540c..f74e1d9 100644
--- a/src/glsl/lower_vec_index_to_cond_assign.cpp
+++ b/src/glsl/lower_vec_index_to_cond_assign.cpp
@@ -52,7 +52,6 @@ public:
       progress = false;
    }
 
-   ir_rvalue *convert_vec_index_to_cond_assign(ir_rvalue *val);
    ir_rvalue *convert_vec_index_to_cond_assign(void *mem_ctx,
                                                ir_rvalue *orig_vector,
                                                ir_rvalue *orig_index,
@@ -141,26 +140,6 @@ ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_
 }
 
 ir_rvalue *
-ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue *ir)
-{
-   ir_dereference_array *orig_deref = ir->as_dereference_array();
-
-   if (!orig_deref)
-      return ir;
-
-   if (orig_deref->array->type->is_matrix() ||
-       orig_deref->array->type->is_array())
-      return ir;
-
-   assert(orig_deref->array_index->type->base_type == GLSL_TYPE_INT);
-
-   return convert_vec_index_to_cond_assign(ralloc_parent(ir),
-                                           orig_deref->array,
-                                           orig_deref->array_index,
-                                           ir->type);
-}
-
-ir_rvalue *
 ir_vec_index_to_cond_assign_visitor::convert_vector_extract_to_cond_assign(ir_rvalue *ir)
 {
    ir_expression *const expr = ir->as_expression();
@@ -180,7 +159,6 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_expression *ir)
    unsigned int i;
 
    for (i = 0; i < ir->get_num_operands(); i++) {
-      ir->operands[i] = convert_vec_index_to_cond_assign(ir->operands[i]);
       ir->operands[i] = convert_vector_extract_to_cond_assign(ir->operands[i]);
    }
 
@@ -194,7 +172,6 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_swizzle *ir)
     * the result of indexing a vector is.  But maybe at some point we'll end up
     * using swizzling of scalars for vector construction.
     */
-   ir->val = convert_vec_index_to_cond_assign(ir->val);
    ir->val = convert_vector_extract_to_cond_assign(ir->val);
 
    return visit_continue;
@@ -203,95 +180,12 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_swizzle *ir)
 ir_visitor_status
 ir_vec_index_to_cond_assign_visitor::visit_leave(ir_assignment *ir)
 {
-   ir_variable *index, *var;
-   ir_dereference_variable *deref;
-   ir_assignment *assign;
-   unsigned i;
-
-   ir->rhs = convert_vec_index_to_cond_assign(ir->rhs);
    ir->rhs = convert_vector_extract_to_cond_assign(ir->rhs);
 
    if (ir->condition) {
-      ir->condition = convert_vec_index_to_cond_assign(ir->condition);
       ir->condition = convert_vector_extract_to_cond_assign(ir->condition);
    }
 
-   /* Last, handle the LHS */
-   ir_dereference_array *orig_deref = ir->lhs->as_dereference_array();
-
-   if 

[Mesa-dev] [PATCH 10/12] glsl: Use vector-insert and vector-extract on elements of gl_ClipDistanceMESA

2013-04-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Variable indexing into vectors using ir_dereference_array is being
removed, so this lowering pass has to generate something different.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: Paul Berry stereotype...@gmail.com
---
 src/glsl/lower_clip_distance.cpp | 36 ++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/src/glsl/lower_clip_distance.cpp b/src/glsl/lower_clip_distance.cpp
index 26a0feb..fd6e3f0 100644
--- a/src/glsl/lower_clip_distance.cpp
+++ b/src/glsl/lower_clip_distance.cpp
@@ -197,10 +197,17 @@ lower_clip_distance_visitor::handle_rvalue(ir_rvalue **rv)
          ir_rvalue *swizzle_index;
          this->create_indices(array->array_index, array_index, swizzle_index);
          void *mem_ctx = ralloc_parent(array);
-         array->array =
+
+         ir_dereference_array *const ClipDistanceMESA_deref =
             new(mem_ctx) ir_dereference_array(this->new_clip_distance_var,
                                               array_index);
-         array->array_index = swizzle_index;
+
+         ir_expression *const expr =
+            new(mem_ctx) ir_expression(ir_binop_vector_extract,
+                                       ClipDistanceMESA_deref,
+                                       swizzle_index);
+
+         *rv = expr;
       }
    }
 }
@@ -280,7 +287,32 @@ lower_clip_distance_visitor::visit_leave(ir_assignment *ir)
       return visit_continue;
    }
 
+   /* Handle the LHS as if it were an r-value.  This may cause the LHS to get
+    * replaced with an ir_expression or ir_binop_vector_extract.  If this
+    * occurs, replace it with a dereference of the vector, and replace the RHS
+    * with an ir_triop_vector_insert.
+    */
    handle_rvalue((ir_rvalue **)&ir->lhs);
+   if (ir->lhs->ir_type == ir_type_expression) {
+      ir_expression *const expr = (ir_expression *) ir->lhs;
+
+      /* The expression must be of the form:
+       *
+       *     (vector_extract gl_ClipDistanceMESA[i], j).
+       */
+      assert(expr->operation == ir_binop_vector_extract);
+      assert(expr->operands[0]->ir_type == ir_type_dereference_array);
+
+      ir_dereference *const new_lhs = (ir_dereference *) expr->operands[0];
+      ir->rhs = new(ctx) ir_expression(ir_triop_vector_insert,
+                                       new_lhs->type,
+                                       new_lhs->clone(ctx, NULL),
+                                       ir->rhs,
+                                       expr->operands[1]);
+      ir->set_lhs(new_lhs);
+      ir->write_mask = (1U << new_lhs->type->vector_elements) - 1;
+   }
+
    return rvalue_visit(ir);
 }
 
 
-- 
1.8.1.4



Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.

2013-04-08 Thread Daniel Vetter
On Mon, Apr 08, 2013 at 07:27:38PM -0700, Kenneth Graunke wrote:
 In the past, we preferred X-tiling for color buffers because our BLT
 code couldn't handle Y-tiling.  However, the BLT paths have been largely
 replaced by BLORP on Gen6+, which can handle any kind of tiling.
 
 We hadn't measured any performance improvement in the past, but that's
 probably because compressed textures were all uncompressed anyway.

Long ago, when I drew diagrams showing which pixels lie in which
cachelines for enabling tiling on i915g, I figured that at least for the
4x4 block compressed layouts with 128 bits per block, X and Y tiling should
result in about equally optimal layouts (the cachelines just stack
differently): X-tiled actually gives you an 8x8 grid of 4x4 blocks, so
I figured that'd be better for TLB efficiency.

Anyway, I've never done real benchmarks. I'm just curious that you attribute
all the speedup here to compressed textures, and wonder a bit what it would
look like if (some of) the compressed layouts kept using X-tiling.
But it's getting a bit late here ;-)
-Daniel

 
 Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
 index 8dd04be..6a9f08c 100644
 --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
 +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
 @@ -344,7 +344,7 @@ intel_miptree_choose_tiling(struct intel_context *intel,
return I915_TILING_Y;
  
  if (width0 >= 64)
 -  return I915_TILING_X;
 +  return intel->gen >= 6 ? I915_TILING_Y : I915_TILING_X;
  
 return I915_TILING_NONE;
  }
 -- 
 1.8.1.1
 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.

2013-04-08 Thread Daniel Vetter
On Tue, Apr 09, 2013 at 01:17:39AM +0200, Daniel Vetter wrote:
 On Mon, Apr 08, 2013 at 07:27:38PM -0700, Kenneth Graunke wrote:
  In the past, we preferred X-tiling for color buffers because our BLT
  code couldn't handle Y-tiling.  However, the BLT paths have been largely
  replaced by BLORP on Gen6+, which can handle any kind of tiling.
  
  We hadn't measured any performance improvement in the past, but that's
  probably because compressed textures were all uncompressed anyway.
 
 Long ago, when I drew diagrams showing which pixels lie in which
 cachelines for enabling tiling on i915g, I figured that at least for the
 4x4 block compressed layouts with 128 bits per block, X and Y tiling should
 result in about equally optimal layouts (the cachelines just stack
 differently): X-tiled actually gives you an 8x8 grid of 4x4 blocks, so
 I figured that'd be better for TLB efficiency.

Blergh, can't do math, should be 8x32 or 32x8 grids of 4x4 blocks in a
tile. So on a quick look x/y-tiled are about equally nicely laid out. I've
mixed up the 8x8 with the cacheline pattern of y-tiled, where each
cacheline is a 4x4 pixel block (at least for 32bit-per-pixel stuff).
-Daniel
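
The corrected numbers can be reproduced with a few lines of arithmetic. The tile geometry below is an assumption stated for the sketch (4 KiB tiles, X-tiles of 512 bytes by 8 rows, Y-tiles of 128 bytes by 32 rows), and `Tile`, `blocks_wide` and `blocks_high` are hypothetical names.

```cpp
#include <cassert>

// Assumed tile geometry: both tile kinds cover 4 KiB.
struct Tile {
   int row_bytes;   // bytes per tile row
   int rows;        // number of rows in the tile
};

constexpr Tile x_tile{512, 8};
constexpr Tile y_tile{128, 32};

// One 4x4 compressed block at 128 bits per block.
constexpr int block_bytes = 128 / 8;

// For a block-compressed surface, each tile row holds one row of blocks.
constexpr int blocks_wide(Tile t) { return t.row_bytes / block_bytes; }
constexpr int blocks_high(Tile t) { return t.rows; }

static_assert(x_tile.row_bytes * x_tile.rows == 4096, "X tile is 4 KiB");
static_assert(y_tile.row_bytes * y_tile.rows == 4096, "Y tile is 4 KiB");
```

Under these assumptions an X-tile holds a 32x8 grid of blocks and a Y-tile an 8x32 grid, so both cover the same 256 blocks per tile.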

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Mesa-dev] remove mfeatures.h, take two

2013-04-08 Thread Brian Paul

On 04/08/2013 11:26 AM, Matt Turner wrote:

Ready to commit?


Thanks for the reminder.  I think it's ready but IIRC only one person 
besides myself really tested it.  I think I could cherry-pick the 
commits a few at a time to master...


-Brian


Re: [Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman

2013-04-08 Thread Marek Olšák
Pushed, thanks. The transform feedback test still doesn't pass, but at
least the hardlocks are gone.

Marek


On Sun, Apr 7, 2013 at 6:29 PM, Martin Andersson g02ma...@gmail.com wrote:

 If there are no objections or comments on this, it would be nice if
 someone could commit it.

 //Martin

 On Tue, Apr 2, 2013 at 10:43 PM, Martin Andersson g02ma...@gmail.com wrote:
  The multiplication part of tgsi_umad did not work on Cayman, because it did
  not populate the correct vector slots.
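
The slot handling in the Cayman branch of the quoted patch can be pictured with a simplified sketch. `AluSlot` and `emit_mullo_cayman` are hypothetical names for the illustration, not r600g structures; the sketch only records which slot an op is issued in, which result is kept, and where the group ends.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for one ALU op in an instruction group.
struct AluSlot {
   int dst_chan;   // vector slot the op is issued in
   int src_chan;   // source channel read by the operands
   bool write;     // whether the result is written back
   bool last;      // marks the end of the instruction group
};

// For destination channel i, issue the multiply in all four vector slots,
// keep only the result from slot i, and close the group on the fourth op.
std::vector<AluSlot> emit_mullo_cayman(int i)
{
   std::vector<AluSlot> group;
   for (int j = 0; j < 4; j++)
      group.push_back(AluSlot{j, i, j == i, j == 3});
   return group;
}
```

Every op reads the same source channel `i`; only the slot with `j == i` has its write enabled.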
  ---
   src/gallium/drivers/r600/r600_shader.c | 45
 --
   1 file changed, 32 insertions(+), 13 deletions(-)
 
  diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
  index 82885d1..6c4cc8f 100644
  --- a/src/gallium/drivers/r600/r600_shader.c
  +++ b/src/gallium/drivers/r600/r600_shader.c
  @@ -5840,7 +5840,7 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
   {
       struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
       struct r600_bytecode_alu alu;
  -    int i, j, r;
  +    int i, j, k, r;
       int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
  
       /* src0 * src1 */
  @@ -5848,21 +5848,40 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
           if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
               continue;
  
  -        memset(&alu, 0, sizeof(struct r600_bytecode_alu));
  +        if (ctx->bc->chip_class == CAYMAN) {
  +            for (j = 0 ; j < 4; j++) {
  +                memset(&alu, 0, sizeof(struct r600_bytecode_alu));
  
  -        alu.dst.chan = i;
  -        alu.dst.sel = ctx->temp_reg;
  -        alu.dst.write = 1;
  +                alu.op = ALU_OP2_MULLO_UINT;
  +                for (k = 0; k < inst->Instruction.NumSrcRegs; k++) {
  +                    r600_bytecode_src(&alu.src[k], &ctx->src[k], i);
  +                }
  +                tgsi_dst(ctx, &inst->Dst[0], j, &alu.dst);
  +                alu.dst.sel = ctx->temp_reg;
  +                alu.dst.write = (j == i);
  +                if (j == 3)
  +                    alu.last = 1;
  +                r = r600_bytecode_add_alu(ctx->bc, &alu);
  +                if (r)
  +                    return r;
  +            }
  +        } else {
  +            memset(&alu, 0, sizeof(struct r600_bytecode_alu));
  
  -        alu.op = ALU_OP2_MULLO_UINT;
  -        for (j = 0; j < 2; j++) {
  -            r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
  -        }
  +            alu.dst.chan = i;
  +            alu.dst.sel = ctx->temp_reg;
  +            alu.dst.write = 1;
  
  -        alu.last = 1;
  -        r = r600_bytecode_add_alu(ctx->bc, &alu);
  -        if (r)
  -            return r;
  +            alu.op = ALU_OP2_MULLO_UINT;
  +            for (j = 0; j < 2; j++) {
  +                r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
  +            }
  +
  +            alu.last = 1;
  +            r = r600_bytecode_add_alu(ctx->bc, &alu);
  +            if (r)
  +                return r;
  +        }
       }
 
 
  --
  1.8.2
 



[Mesa-dev] [Bug 63117] OSMesa Gallium Empty Output

2013-04-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=63117

--- Comment #2 from Brian Paul bri...@vmware.com ---
Created attachment 77643
  --> https://bugs.freedesktop.org/attachment.cgi?id=77643&action=edit
patch for osmesa.c

Kevin, can you try this patch?

I think the unique thing that vtk is doing is calling OSMesaMakeCurrent()
several times per frame.  Each time OSMesaMakeCurrent() was called, we were
creating new gallium drawing surfaces, so any previous rendering to the frame
was getting lost.
The patch tries to re-use gallium surfaces from one MakeCurrent to the next.
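
The idea of re-using surfaces can be sketched in isolation. This is not the attached osmesa.c patch: `Surfaces`, `Context` and `make_current` are hypothetical names, and treating an unchanged width and height as the condition for re-use is an assumption made for the sketch.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a set of drawing surfaces; not a gallium type.
struct Surfaces {
   int width;
   int height;
   int generation;   // changes only when the surfaces are recreated
};

class Context {
public:
   // Re-use the existing surfaces when the size is unchanged, so earlier
   // rendering to them is kept; recreate them only when the size differs.
   const Surfaces &make_current(int width, int height)
   {
      if (!surfaces_ || surfaces_->width != width
          || surfaces_->height != height) {
         surfaces_.reset(new Surfaces{width, height, next_generation_++});
      }
      return *surfaces_;
   }

private:
   std::unique_ptr<Surfaces> surfaces_;
   int next_generation_ = 0;
};
```

Calling `make_current` twice with the same size returns the same surfaces, which is the behaviour an application calling MakeCurrent several times per frame relies on.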

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.

2013-04-08 Thread Kenneth Graunke

On 04/07/2013 06:42 AM, Paul Berry wrote:

The call to emit_shader_time_end() before the second URB write was
conditioned with "if (eot)", but eot is always false in this code
path, so emit_shader_time_end() was never being called for vertex
shaders that performed 2 URB writes.
---
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 8bd2fd8..ca1cfe8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2664,10 +2664,8 @@ vec4_visitor::emit_urb_writes()
          emit_urb_slot(mrf++, c->prog_data.vue_map.slot_to_varying[slot]);
       }

-      if (eot) {
-         if (INTEL_DEBUG & DEBUG_SHADER_TIME)
-            emit_shader_time_end();
-      }
+      if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+         emit_shader_time_end();

       current_annotation = "URB write";
       inst = emit(VS_OPCODE_URB_WRITE);


Yeah...sorry for missing this in the last round of review.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
