[Mesa-dev] [PATCH] i965/skl: Don't try to apply the opt_sampler_eot extension for vs

2015-04-28 Thread Neil Roberts
The opt_sampler_eot optimisation of fs_visitor effectively assumes
that it is running on a fragment shader because it casts the program
key to a brw_wm_prog_key. However, on Skylake fs_visitor can also be
used for vertex shaders. It looks like this usually works anyway
because the optimisation is skipped if key->nr_color_regions != 1.
For a vertex shader, however, the key is actually a brw_vs_prog_key,
so the space for nr_color_regions is probably taken up by
key->base.program_string_id. This can end up making nr_color_regions
appear to be 1, in which case the function will later assert when the
last instruction is not FS_OPCODE_FB_WRITE. This was making the DEQP
test suite assert. Presumably this only happens there because it
compiles a lot of shaders, so it ends up with a high value for
program_string_id.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 61ee056..255ddf4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2548,6 +2548,9 @@ fs_visitor::opt_sampler_eot()
 {
    brw_wm_prog_key *key = (brw_wm_prog_key*) this->key;
 
+   if (stage != MESA_SHADER_FRAGMENT)
+      return false;
+
    if (devinfo->gen < 9 && !devinfo->is_cherryview)
       return false;
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
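
The failure mode described in the commit message above can be sketched in
isolation. The struct layouts below are hypothetical stand-ins, not the real
brw_vs_prog_key / brw_wm_prog_key definitions; they only assume that the two
keys place different fields in the same slot. The sketch shows how reading a
vertex shader key as a fragment shader key can make an unrelated field look
like nr_color_regions == 1, and how checking the stage first avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified key layouts for illustration only. */
enum sketch_stage { SKETCH_STAGE_VERTEX, SKETCH_STAGE_FRAGMENT };

struct sketch_vs_key {
   unsigned program_string_id;   /* changes as more shaders are compiled */
   unsigned other_vs_state;
};

struct sketch_wm_key {
   unsigned nr_color_regions;    /* assumed to occupy the same first slot */
   unsigned other_wm_state;
};

/* Mimics the unguarded check: reinterprets the key bytes as a WM key. */
static bool
sketch_unguarded_would_optimise(const void *key)
{
   struct sketch_wm_key wm;

   assert(key != NULL);
   memcpy(&wm, key, sizeof(wm));   /* well-defined stand-in for the cast */
   return wm.nr_color_regions == 1;
}

/* Mimics the patched check: bail out unless this is a fragment shader. */
static bool
sketch_guarded_would_optimise(enum sketch_stage stage, const void *key)
{
   if (stage != SKETCH_STAGE_FRAGMENT)
      return false;
   return sketch_unguarded_would_optimise(key);
}
```

With a vertex key whose program_string_id happens to be 1, the unguarded
check reports that the optimisation applies, while the guarded one does not.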


Re: [Mesa-dev] [PATCH 4/7] helper-conveniance functions for drivers to implement ARB_framebuffer_no_attachment

2015-04-28 Thread Rogovin, Kevin
Hi, 

 This is in fact two changes: introduction of the helpers and refactoring of 
 the intersection code to be used with caller provided bounding box.

Is this a request to change the commit message, or to split the patch as well?
I think splitting it is silly, but if it makes you happy, so be it: a split
would not introduce any lurking surprises or inconsistencies.

 
 ---
  src/mesa/main/framebuffer.c | 49 
 ++---
  src/mesa/main/framebuffer.h | 29 +++
  src/mesa/main/mtypes.h  | 21 ++-
  3 files changed, 74 insertions(+), 25 deletions(-)
 
 diff --git a/src/mesa/main/framebuffer.c b/src/mesa/main/framebuffer.c 
 index 4e4d896..7d8921b 100644
 --- a/src/mesa/main/framebuffer.c
 +++ b/src/mesa/main/framebuffer.c
 @@ -357,30 +357,20 @@ update_framebuffer_size(struct gl_context *ctx, 
 struct gl_framebuffer *fb)  }
  
  
 +
  /**
 - * Calculate the inclusive bounding box for the scissor of a specific viewport
 + * Given a bounding box, intersect the bounding box with the scissor of
 + * a specified viewport.
   *
   * \param ctx GL context.
 - * \param buffer  Framebuffer to be checked against
   * \param idx Index of the desired viewport
   * \param bboxBounding box for the scissored viewport.  Stored as xmin,
   *xmax, ymin, ymax.
 - *
 - * \warning This function assumes that the framebuffer dimensions are 
 up to
 - * date (e.g., update_framebuffer_size has been recently called on \c 
 buffer).
 - *
 - * \sa _mesa_clip_to_region
   */
 -void
 -_mesa_scissor_bounding_box(const struct gl_context *ctx,
 -   const struct gl_framebuffer *buffer,
 -   unsigned idx, int *bbox)
 +extern void
 +_mesa_intersect_scissor_bounding_box(const struct gl_context *ctx,
 + unsigned idx, int *bbox)
  {
 -   bbox[0] = 0;
 -   bbox[2] = 0;
 -   bbox[1] = buffer->Width;
 -   bbox[3] = buffer->Height;
 -
     if (ctx->Scissor.EnableFlags & (1u << idx)) {
        if (ctx->Scissor.ScissorArray[idx].X > bbox[0]) {
           bbox[0] = ctx->Scissor.ScissorArray[idx].X;
 @@ -402,6 +392,33 @@ _mesa_scissor_bounding_box(const struct gl_context *ctx,
           bbox[2] = bbox[3];
        }
     }
 +}
 +
 +/**
 + * Calculate the inclusive bounding box for the scissor of a specific 
 +viewport
 + *
 + * \param ctx GL context.
 + * \param buffer  Framebuffer to be checked against
 + * \param idx Index of the desired viewport
 + * \param bboxBounding box for the scissored viewport.  Stored as xmin,
 + *xmax, ymin, ymax.
 + *
 + * \warning This function assumes that the framebuffer dimensions are 
 +up to
 + * date (e.g., update_framebuffer_size has been recently called on \c 
 buffer).
 + *
 + * \sa _mesa_clip_to_region
 + */
 +void
 +_mesa_scissor_bounding_box(const struct gl_context *ctx,
 +   const struct gl_framebuffer *buffer,
 +   unsigned idx, int *bbox) {
 +   bbox[0] = 0;
 +   bbox[2] = 0;
 +   bbox[1] = buffer->Width;
 +   bbox[3] = buffer->Height;
 +
 +   _mesa_intersect_scissor_bounding_box(ctx, idx, bbox);
  
     assert(bbox[0] <= bbox[1]);
     assert(bbox[2] <= bbox[3]);
 diff --git a/src/mesa/main/framebuffer.h b/src/mesa/main/framebuffer.h 
 index a427421..8b84d26 100644
 --- a/src/mesa/main/framebuffer.h
 +++ b/src/mesa/main/framebuffer.h
 @@ -76,6 +76,35 @@ _mesa_scissor_bounding_box(const struct gl_context *ctx,
 const struct gl_framebuffer *buffer,
 unsigned idx, int *bbox);
  
 +extern void
 +_mesa_intersect_scissor_bounding_box(const struct gl_context *ctx,
 + unsigned idx, int *bbox);
 +
 +static inline GLuint
 +_mesa_geometric_width(const struct gl_framebuffer *buffer)
 +{
 +   return buffer->_HasAttachments ?
 +      buffer->Width : buffer->DefaultGeometry.Width;
 +}
 +
 +
 +static inline GLuint
 +_mesa_geometric_height(const struct gl_framebuffer *buffer)
 +{
 +   return buffer->_HasAttachments ?
 +      buffer->Height : buffer->DefaultGeometry.Height;
 +}
 +
 +static inline GLuint
 +_mesa_geometric_samples(const struct gl_framebuffer *buffer)
 +{
 +   return buffer->_HasAttachments ?
 +      buffer->Visual.samples : buffer->DefaultGeometry.NumSamples;
 +}
 +
 +static inline GLuint
 +_mesa_geometric_layers(const struct gl_framebuffer *buffer)
 +{
 +   return buffer->_HasAttachments ?
 +      buffer->MaxNumLayers : buffer->DefaultGeometry.Layers;
 +}
 +
  extern void
  _mesa_update_draw_buffer_bounds(struct gl_context *ctx);
  
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 
 38a3817..ac7cdb6 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -3134,13 +3134,13 @@ struct gl_framebuffer
 struct gl_config Visual;
  
 /**
 -* size of frame buffer in pixels, 
 -* no attachments has these values as 0 
 +* size 
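
For readers following the refactoring discussed in this message, here is a
minimal, self-contained sketch of the split between "intersect a
caller-provided box with the scissor" and "seed the box, then intersect". The
structs are simplified, hypothetical stand-ins for gl_context and
gl_framebuffer, and the clamping of the max edges is an assumption filled in
for illustration (the quoted hunk elides that part). Note that the quoted
patch seeds the box with buffer->Width/Height; seeding with the geometric size
below is just one way a driver might combine the two helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the Mesa structures. */
struct sketch_rect { int x, y, width, height; };

struct sketch_fb {
   bool has_attachments;
   int width, height;                   /* size derived from attachments */
   int default_width, default_height;   /* used when nothing is attached */
};

/* Clamp bbox (xmin, xmax, ymin, ymax) against one scissor rectangle. */
static void
sketch_intersect_scissor(bool scissor_enabled, const struct sketch_rect *s,
                         int *bbox)
{
   if (!scissor_enabled)
      return;

   if (s->x > bbox[0])
      bbox[0] = s->x;
   if (s->y > bbox[2])
      bbox[2] = s->y;
   if (s->x + s->width < bbox[1])
      bbox[1] = s->x + s->width;
   if (s->y + s->height < bbox[3])
      bbox[3] = s->y + s->height;

   /* An empty intersection collapses to a zero-sized box. */
   if (bbox[0] > bbox[1])
      bbox[0] = bbox[1];
   if (bbox[2] > bbox[3])
      bbox[2] = bbox[3];
}

/* Fall back to the default geometry when there are no attachments. */
static int
sketch_geometric_width(const struct sketch_fb *fb)
{
   return fb->has_attachments ? fb->width : fb->default_width;
}

static int
sketch_geometric_height(const struct sketch_fb *fb)
{
   return fb->has_attachments ? fb->height : fb->default_height;
}

/* Wrapper: seed the box with the framebuffer size, then intersect. */
static void
sketch_scissor_bounding_box(const struct sketch_fb *fb, bool scissor_enabled,
                            const struct sketch_rect *s, int *bbox)
{
   bbox[0] = 0;
   bbox[2] = 0;
   bbox[1] = sketch_geometric_width(fb);
   bbox[3] = sketch_geometric_height(fb);

   sketch_intersect_scissor(scissor_enabled, s, bbox);

   assert(bbox[0] <= bbox[1]);
   assert(bbox[2] <= bbox[3]);
}
```

Keeping the intersection separate means a caller that already has its own
bounds can reuse it without going through a framebuffer at all.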

Re: [Mesa-dev] [PATCH 2/7] Define constants and functions for ARB_framebuffer_no_attachment extension

2015-04-28 Thread Rogovin, Kevin
Hello,

 H that's surprising.

 src/mesa/main/tests/dispatch_sanity.cpp:// {
glFramebufferParameteri, 43, -1 },   // XXX: Add to xml

I thought it should detect that there's a new API and complain loudly.
 At least that's how I remembered it working, but that doesn't seem to be the 
 case?
 Are you sure you had a clean build? Either way, those should probably get 
 uncommented, 

No issue showed up because the entries for both glFramebufferParameteri
and glGetFramebufferParameteriv are already commented out. Looking at the
commit log, they have been commented out for quite some time. The commit from
Jordan Justen (dated Oct 24, 2012) titled "dispatch sanity test: Add GL CORE
3.1 test" added the line already commented out. In fact, looking at the
commit log with git log -p, the only reference to those functions has both
commented out.

 and there are probably interactions with ARB_dsa as well, should probably 
 figure out if you or Laura should add support for that (or perhaps you had it 
 in your patches already).

The ARB_framebuffer_no_attachments extension does NOT define the DSA-style
functions for ARB_direct_state_access. Instead, it only defines them for
EXT_direct_state_access. I think the implementers of
GL_ARB_direct_state_access are the ones who need to define and implement
gl[Get]NamedFramebufferParameteri. I wrote patch 3 so that it is trivial to
implement the DSA functions, though.

-Kevin

On Fri, Apr 24, 2015 at 11:06 AM, Rogovin, Kevin kevin.rogo...@intel.com 
wrote:
 Hi,

  I agree with the comments about the code (and once the last patch of the
 series is reviewed I will resubmit the series with the review comments
 applied), but when I applied just Patch 1 and Patch 2 and ran
 src/mesa/main/tests/main-test (after a git clean -dfx and all that cleaning),
 all tests pass, in particular the 4 DispatchSanity_tests:
 DispatchSanity_test.GL31_CORE, DispatchSanity_test.GLES11,
 DispatchSanity_test.GLES2 and DispatchSanity_test.GLES3. In addition, make
 check passes all tests as well. If you are referring to another test, which
 test is that?

  -Kevin

 -Original Message-
 From: ibmir...@gmail.com [mailto:ibmir...@gmail.com] On Behalf Of Ilia 
 Mirkin
 Sent: Friday, April 24, 2015 4:36 PM
 To: Matt Turner
 Cc: Rogovin, Kevin; mesa-...@freedesktop.org
 Subject: Re: [Mesa-dev] [PATCH 2/7] Define constants and functions for 
 ARB_framebuffer_no_attachment extension

 This change will make the dispatch_sanity test fail.

 On Fri, Apr 24, 2015 at 3:05 AM, Matt Turner matts...@gmail.com wrote:
 The subject should be prefixed with mesa:

 On Thu, Apr 23, 2015 at 11:59 PM,  kevin.rogo...@intel.com wrote:
 From: Kevin Rogovin kevin.rogo...@intel.com

 Define enumerations, functions and associated glGet's for extension 
 ARB_framebuffer_no_attachment.

 ---
  .../glapi/gen/ARB_framebuffer_no_attachments.xml   | 33 ++
  src/mapi/glapi/gen/Makefile.am |  1 +
  src/mapi/glapi/gen/gl_API.xml  |  1 +
  src/mesa/main/fbobject.c   | 12 +++
  src/mesa/main/fbobject.h   |  7 
  src/mesa/main/get.c|  3 ++
  src/mesa/main/get_hash_params.py   | 40 
 ++
  7 files changed, 97 insertions(+)
  create mode 100644
 src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml

 diff --git a/src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml
 b/src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml
 new file mode 100644
 index 000..60e40d0
 --- /dev/null
 +++ b/src/mapi/glapi/gen/ARB_framebuffer_no_attachments.xml
 @@ -0,0 +1,33 @@
 +<?xml version="1.0"?>
 +<!DOCTYPE OpenGLAPI SYSTEM "gl_API.dtd">
 +
 +<OpenGLAPI>
 +
 +<category name="GL_ARB_framebuffer_no_attachments" number="130">
 +
 +    <enum name="FRAMEBUFFER_DEFAULT_WIDTH" value="0x9310" />
 +    <enum name="FRAMEBUFFER_DEFAULT_HEIGHT" value="0x9311" />
 +    <enum name="FRAMEBUFFER_DEFAULT_LAYERS" value="0x9312" />
 +    <enum name="FRAMEBUFFER_DEFAULT_SAMPLES" value="0x9313" />
 +    <enum name="FRAMEBUFFER_DEFAULT_FIXED_SAMPLE_LOCATIONS" value="0x9314" />
 +    <enum name="MAX_FRAMEBUFFER_WIDTH" value="0x9315" />
 +    <enum name="MAX_FRAMEBUFFER_HEIGHT" value="0x9316" />
 +    <enum name="MAX_FRAMEBUFFER_LAYERS" value="0x9317" />
 +    <enum name="MAX_FRAMEBUFFER_SAMPLES" value="0x9318" />
 +
 +
 +    <function name="FramebufferParameteri" offset="assign">
 +        <param name="target" type="GLenum" />
 +        <param name="pname" type="GLenum" />
 +        <param name="param" type="GLint" />
 +    </function>
 +
 +    <function name="GetFramebufferParameteriv" offset="assign">
 +        <param name="target" type="GLenum" />
 +        <param name="pname" type="GLenum" />
 +        <param name="params" type="GLint *" />
 +    </function>
 +
 +</category>
 +
 +</OpenGLAPI>
 diff --git a/src/mapi/glapi/gen/Makefile.am 
 b/src/mapi/glapi/gen/Makefile.am index 1c4b86a..9a0e944 100644
 --- a/src/mapi/glapi/gen/Makefile.am
 +++ b/src/mapi/glapi/gen/Makefile.am
 @@ -130,6 +130,7 @@ API_XML = \
 

Re: [Mesa-dev] [PATCH 11/38] main: Major refactor of get_texture_for_framebuffer.

2015-04-28 Thread Fredrik Höglund
On Sunday 12 April 2015, Fredrik Höglund wrote:
 This looks like a very nice cleanup indeed!
 
 One thing I noticed though is that fbobject.c seems to use two blank
 lines between function definitions, although it's not consistent about it.
 The functions introduced in this patch only have one blank line between
 them.
 
 Some more comments (mostly nitpicks) below.
 
 On Wednesday 04 March 2015, Laura Ekstrand wrote:
  This splits off the (still) rather large chunk that is
  get_texture_for_framebuffer into lots of smaller functions specialized to
  service the wide variety of unique needs of *FramebufferTexture* entry 
  points.
  The result is much cleaner because, rather than having a pile of branches 
  and
  confusing conditions (like the boolean layered), the uniqueness is baked 
  into
  the entry points. The entry points know whether or not they are layered or 
  use
  a textarget.
  ---
   src/mesa/main/fbobject.c | 457 
  +--
   src/mesa/main/fbobject.h |   2 +-
   2 files changed, 247 insertions(+), 212 deletions(-)
  
  diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
  index f634aed..4df0b6b 100644
  --- a/src/mesa/main/fbobject.c
  +++ b/src/mesa/main/fbobject.c
  @@ -2415,14 +2415,7 @@ reuse_framebuffer_texture_attachment(struct 
  gl_framebuffer *fb,
   
   /**
* Common code called by gl*FramebufferTexture*() to retrieve the correct
  - * texture object pointer and check for associated errors.
  - *
  - * \param textarget is the textarget that was passed to the
  - * glFramebufferTexture...() function, or 0 if the corresponding function
  - * doesn't have a textarget parameter.
  - *
  - * \param layered is true if this function was called from
  - * gl*FramebufferTexture(), false otherwise.
  + * texture object pointer.
*
* \param texObj where the pointer to the texture object is returned.  Note
* that a successful call may return texObj = NULL.
  @@ -2430,20 +2423,12 @@ reuse_framebuffer_texture_attachment(struct 
  gl_framebuffer *fb,
* \return true if no errors, false if errors
*/
   static bool
  -get_texture_for_framebuffer(struct gl_context *ctx,
  -GLuint texture, GLenum textarget,
  -GLint level, GLuint zoffset, GLboolean 
  *layered,
  -const char *caller,
  +get_texture_for_framebuffer(struct gl_context *ctx, GLuint texture,
  +bool layered, const char *caller,
   struct gl_texture_object **texObj)
   {
  -   GLenum maxLevelsTarget;
  -   GLboolean err = GL_TRUE;
  -
  *texObj = NULL; /* This will get returned if texture = 0. */
   
  -   /* The textarget, level, and zoffset parameters are only validated if
  -* texture is non-zero.
  -*/
  if (!texture)
 return true;
   
  @@ -2458,27 +2443,40 @@ get_texture_for_framebuffer(struct gl_context *ctx,
  * value, while the other commands throw invalid operation (where
  * *layered = GL_FALSE).
  */
  -  GLenum no_texobj_err = *layered ? GL_INVALID_VALUE :
  +  GLenum no_texobj_err = layered ? GL_INVALID_VALUE :
                             GL_INVALID_OPERATION;
         _mesa_error(ctx, no_texobj_err,
                     "%s(non-generated texture %u)", caller, texture);
         return false;
  }
   
  -   if (textarget == 0) {
  -  if (*layered) {
  - /* We're being called by gl*FramebufferTexture() and textarget
  -  * is not used.
  -  */
  - switch ((*texObj)->Target) {
  +   return true;
  +}
  +
  +/**
  + * Common code called by gl*FramebufferTexture() to verify the texture 
  target
  + * and decide whether or not the attachment should truly be considered
  + * layered.
  + *
  + * \param layered true if attachment should be considered layered, false if
  + * not
  + *
  + * \return true if no errors, false if errors
  + */
  +static bool
  +check_layered_texture_target(struct gl_context *ctx, GLenum target,
  + const char *caller, bool *layered)
 
 I don't think layered should be a bool here, because it's stored as a
 GLboolean in gl_renderbuffer_attachment, and can be queried with
 glGetFramebufferAttachmentParameteriv.
 
  +{
  +   *layered = true;
  +
  + switch (target) {
case GL_TEXTURE_3D:
case GL_TEXTURE_1D_ARRAY_EXT:
case GL_TEXTURE_2D_ARRAY_EXT:
case GL_TEXTURE_CUBE_MAP:
case GL_TEXTURE_CUBE_MAP_ARRAY:
case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
  -err = false;
  -break;
  +return true;
case GL_TEXTURE_1D:
case GL_TEXTURE_2D:
case GL_TEXTURE_RECTANGLE:
  @@ -2486,41 +2484,139 @@ get_texture_for_framebuffer(struct gl_context *ctx,
   /* These texture types are valid to pass to
* glFramebufferTexture(), 

Re: [Mesa-dev] [PATCH 03/18] winsys/amdgpu: add a new winsys for the new kernel driver

2015-04-28 Thread Marek Olšák
Hi Emil,

I think I have fixed everything that you suggested. You can review the
branch here:

http://cgit.freedesktop.org/~mareko/mesa/log/?h=amdgpu

Thanks,

Marek



On Tue, Apr 28, 2015 at 11:01 AM, Emil Velikov emil.l.veli...@gmail.com wrote:
 On 28 April 2015 at 01:02, Marek Olšák mar...@gmail.com wrote:
 On Tue, Apr 21, 2015 at 5:12 PM, Emil Velikov emil.l.veli...@gmail.com 
 wrote:
 Hi Marek,

 I must admit that the current split/location of the new winsys looks a
 bit strange. I'm thinking that if one places the new winsys alongside
 the radeon one (i.e. winsys/amdgpu/drm) things should still work, and
 we would end up with a shorter and cleaner patch. See below for more details.

 I've moved it now and I'm in the middle of a rebase right now.



 On 20/04/15 22:23, Marek Olšák wrote:
 From: Marek Olšák marek.ol...@amd.com

 ---
  configure.ac  |   5 +
  src/gallium/Makefile.am   |   1 +
  src/gallium/drivers/r300/Automake.inc |   6 +-
  src/gallium/drivers/r600/Automake.inc |   6 +-
  src/gallium/drivers/radeonsi/Automake.inc |   6 +-
  src/gallium/targets/pipe-loader/Makefile.am   |  12 +-
  src/gallium/winsys/radeon/amdgpu/Android.mk   |  40 ++
  src/gallium/winsys/radeon/amdgpu/Makefile.am  |  12 +
  src/gallium/winsys/radeon/amdgpu/Makefile.sources |   8 +
  src/gallium/winsys/radeon/amdgpu/amdgpu_bo.c  | 643 
 ++
  src/gallium/winsys/radeon/amdgpu/amdgpu_bo.h  |  75 +++
  src/gallium/winsys/radeon/amdgpu/amdgpu_cs.c  | 578 
 +++
  src/gallium/winsys/radeon/amdgpu/amdgpu_cs.h  | 149 +
  src/gallium/winsys/radeon/amdgpu/amdgpu_public.h  |  14 +
  src/gallium/winsys/radeon/amdgpu/amdgpu_winsys.c  | 491 +
  src/gallium/winsys/radeon/amdgpu/amdgpu_winsys.h  |  80 +++
  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c |   8 +
  src/gallium/winsys/radeon/radeon_winsys.h |   4 +
  18 files changed, 2129 insertions(+), 9 deletions(-)
  create mode 100644 src/gallium/winsys/radeon/amdgpu/Android.mk
  create mode 100644 src/gallium/winsys/radeon/amdgpu/Makefile.am
  create mode 100644 src/gallium/winsys/radeon/amdgpu/Makefile.sources
  create mode 100644 src/gallium/winsys/radeon/amdgpu/amdgpu_bo.c
  create mode 100644 src/gallium/winsys/radeon/amdgpu/amdgpu_bo.h
  create mode 100644 src/gallium/winsys/radeon/amdgpu/amdgpu_cs.c
  create mode 100644 src/gallium/winsys/radeon/amdgpu/amdgpu_cs.h
  create mode 100644 src/gallium/winsys/radeon/amdgpu/amdgpu_public.h
  create mode 100644 src/gallium/winsys/radeon/amdgpu/amdgpu_winsys.c
  create mode 100644 src/gallium/winsys/radeon/amdgpu/amdgpu_winsys.h

 diff --git a/configure.ac b/configure.ac
 index 095e23e..f22975f 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -68,6 +68,7 @@ AC_SUBST([OSMESA_VERSION])
  dnl Versions for external dependencies
  LIBDRM_REQUIRED=2.4.38
  LIBDRM_RADEON_REQUIRED=2.4.56
 +LIBDRM_AMDGPU_REQUIRED=2.4.60
 I guess this will need to be changed once the libdrm patches land ?

 Yes.


  LIBDRM_INTEL_REQUIRED=2.4.60
  LIBDRM_NVVIEUX_REQUIRED=2.4.33
  LIBDRM_NOUVEAU_REQUIRED="2.4.33 libdrm >= 2.4.41"
 @@ -2091,6 +2092,7 @@ if test -n $with_gallium_drivers; then
  xr300)
  HAVE_GALLIUM_R300=yes
  PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
 +PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
  gallium_require_drm Gallium R300
  gallium_require_drm_loader
  gallium_require_llvm Gallium R300
 @@ -2098,6 +2100,7 @@ if test -n $with_gallium_drivers; then
  xr600)
  HAVE_GALLIUM_R600=yes
  PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
 +PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
 We can drop the above two hunks.

  gallium_require_drm Gallium R600
  gallium_require_drm_loader
  if test x$enable_r600_llvm = xyes -o x$enable_opencl = 
 xyes; then
 @@ -2114,6 +2117,7 @@ if test -n $with_gallium_drivers; then
  xradeonsi)
  HAVE_GALLIUM_RADEONSI=yes
  PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
 +PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
  gallium_require_drm radeonsi
  gallium_require_drm_loader
  radeon_llvm_check radeonsi
 @@ -2384,6 +2388,7 @@ AC_CONFIG_FILES([Makefile
   src/gallium/winsys/intel/drm/Makefile
   src/gallium/winsys/nouveau/drm/Makefile
   src/gallium/winsys/radeon/drm/Makefile
 + src/gallium/winsys/radeon/amdgpu/Makefile
   src/gallium/winsys/svga/drm/Makefile
   src/gallium/winsys/sw/dri/Makefile
   src/gallium/winsys/sw/kms-dri/Makefile
 diff --git 

[Mesa-dev] [PATCH 06/27] i965: Define gather push constants opcodes

2015-04-28 Thread Abdiel Janulgue
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_defines.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index da288d3..8079433 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -2209,6 +2209,29 @@ enum brw_wm_barycentric_interp_mode {
 #define _3DSTATE_CONSTANT_HS                  0x7819 /* GEN7+ */
 #define _3DSTATE_CONSTANT_DS                  0x781A /* GEN7+ */
 
+/* Resource streamer gather constants */
+#define _3DSTATE_GATHER_POOL_ALLOC            0x791A /* GEN7.5+ */
+#define _3DSTATE_GATHER_CONSTANT_VS           0x7834
+#define _3DSTATE_GATHER_CONSTANT_GS           0x7835
+#define _3DSTATE_GATHER_CONSTANT_HS           0x7836
+#define _3DSTATE_GATHER_CONSTANT_DS           0x7837
+#define _3DSTATE_GATHER_CONSTANT_PS           0x7838
+/* Only required in HSW */
+#define HSW_GATHER_CONSTANTS_RESERVED         (3 << 4)
+
+#define BRW_GATHER_CONSTANTS_ENABLE_SHIFT     11 /* GEN7.5+ */
+#define BRW_GATHER_CONSTANTS_ENABLE_MASK      INTEL_MASK(11, 11)
+#define BRW_GATHER_CONSTANTS_ON               1
+#define BRW_GATHER_CONSTANTS_OFF              0
+#define BRW_GATHER_BUFFER_VALID_SHIFT         16
+#define BRW_GATHER_BUFFER_VALID_MASK          INTEL_MASK(31, 16)
+#define BRW_GATHER_BINDING_TABLE_BLOCK_SHIFT  12
+#define BRW_GATHER_BINDING_TABLE_BLOCK_MASK   INTEL_MASK(15, 12)
+#define BRW_GATHER_CONST_BUFFER_OFFSET_SHIFT  8
+#define BRW_GATHER_CONST_BUFFER_OFFSET_MASK   INTEL_MASK(15, 8)
+#define BRW_GATHER_CHANNEL_MASK_SHIFT         4
+#define BRW_GATHER_CHANNEL_MASK_MASK          INTEL_MASK(7, 4)
+
 #define _3DSTATE_STREAMOUT                    0x781e /* GEN7+ */
 /* DW1 */
 # define SO_FUNCTION_ENABLE                   (1 << 31)
-- 
1.9.1
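
As a reading aid for the _SHIFT/_MASK pairs defined in this patch, here is a
minimal sketch of how such pairs are typically consumed. The SKETCH_* macros
are modelled on the INTEL_MASK/SET_FIELD idiom but are written here from
scratch, so their exact shape is an assumption. The field positions are copied
from the patch; packing the offset and channel mask into one dword is purely
illustrative and not a statement about the actual command layout.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch macros in the spirit of INTEL_MASK/SET_FIELD (assumed shapes). */
#define SKETCH_MASK(high, low) (((1u << ((high) - (low) + 1)) - 1) << (low))
#define SKETCH_SET_FIELD(value, field) \
   (((uint32_t)(value) << field##_SHIFT) & field##_MASK)
#define SKETCH_GET_FIELD(word, field) \
   (((uint32_t)(word) & field##_MASK) >> field##_SHIFT)

/* Field positions copied from the patch above. */
#define GATHER_BUFFER_VALID_SHIFT         16
#define GATHER_BUFFER_VALID_MASK          SKETCH_MASK(31, 16)
#define GATHER_CONST_BUFFER_OFFSET_SHIFT  8
#define GATHER_CONST_BUFFER_OFFSET_MASK   SKETCH_MASK(15, 8)
#define GATHER_CHANNEL_MASK_SHIFT         4
#define GATHER_CHANNEL_MASK_MASK          SKETCH_MASK(7, 4)

/* Illustrative packing of two fields into a single dword. */
static uint32_t
sketch_pack_gather_entry(uint32_t const_offset, uint32_t channel_mask)
{
   assert(const_offset <= 0xff);
   assert(channel_mask <= 0xf);

   return SKETCH_SET_FIELD(const_offset, GATHER_CONST_BUFFER_OFFSET) |
          SKETCH_SET_FIELD(channel_mask, GATHER_CHANNEL_MASK);
}
```

For example, an offset of 0x12 with all four channels enabled (0xf) packs to
0x12f0, and SKETCH_GET_FIELD recovers each field from that word.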



[Mesa-dev] [PATCH 04/27] i965: Implement interface to edit binding table entries

2015-04-28 Thread Abdiel Janulgue
Unlike normal software binding tables, where the driver has to manually
generate and fill a binding table array which is then uploaded to the
hardware, the resource streamer instead presents the driver with an option
to fill out slots for individual binding table indices. The hardware
accumulates the state for these combined edits, which it then automatically
flushes to a binding table pool when the binding table pointer state
command is invoked.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c | 49 ++
 src/mesa/drivers/dri/i965/brw_state.h  |  9 +
 2 files changed, 58 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index a58e32e..70b8751 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -44,6 +44,11 @@
 #include "brw_state.h"
 #include "intel_batchbuffer.h"
 
+static const GLuint stage_to_bt_edit[MESA_SHADER_FRAGMENT + 1] = {
+   _3DSTATE_BINDING_TABLE_EDIT_VS,
+   _3DSTATE_BINDING_TABLE_EDIT_GS,
+   _3DSTATE_BINDING_TABLE_EDIT_PS,
+};
 /* Somehow the hw-binding table pool offset must start here, otherwise
  * the GPU will hang
  */
@@ -172,6 +177,50 @@ const struct brw_tracked_state brw_gs_binding_table = {
  * Hardware-generated binding tables for the resource streamer
  */
 void
+gen7_update_binding_table(struct brw_context *brw,
+                          gl_shader_stage stage,
+                          uint32_t index,
+                          uint32_t surf_offset)
+{
+   assert(stage <= MESA_SHADER_FRAGMENT);
+
+   /* The surface state offset is a 16-bit value aligned to 32 bytes but
+    * Surface State Pointer in dw2 is [15:0]. Right shift surf_offset
+    * by 5 bits so it won't disturb bit 16 (which is used as the binding
+    * table index entry), otherwise it would hang the GPU.
+    */
+   uint32_t dw2 = SET_FIELD(index, BRW_BINDING_TABLE_INDEX) | (surf_offset >> 5);
+
+   BEGIN_BATCH(3);
+   OUT_BATCH(stage_to_bt_edit[stage] << 16 | (3 - 2));
+   OUT_BATCH(BRW_BINDING_TABLE_EDIT_TARGET_ALL);
+   OUT_BATCH(dw2);
+   ADVANCE_BATCH();
+}
+
+/**
+ * Hardware-generated binding tables for the resource streamer
+ */
+void
+gen7_update_binding_table_from_array(struct brw_context *brw,
+                                     gl_shader_stage stage,
+                                     const uint32_t* binding_table,
+                                     int size)
+{
+   uint32_t dw2 = 0;
+   assert(stage <= MESA_SHADER_FRAGMENT);
+
+   BEGIN_BATCH(size + 2);
+   OUT_BATCH(stage_to_bt_edit[stage] << 16 | size);
+   OUT_BATCH(BRW_BINDING_TABLE_EDIT_TARGET_ALL);
+   for (int i = 0; i < size; i++) {
+      dw2 = SET_FIELD(i, BRW_BINDING_TABLE_INDEX) | (binding_table[i] >> 5);
+      OUT_BATCH(dw2);
+   }
+   ADVANCE_BATCH();
+}
+}
+
+void
 gen7_disable_hw_binding_tables(struct brw_context *brw)
 {
BEGIN_BATCH(3);
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index d882bdd..129a780 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -300,6 +300,15 @@ gen7_upload_constant_state(struct brw_context *brw,
 
 /* gen7_misc_state.c */
 void gen7_rs_control(struct brw_context *brw, int enable);
+
+void gen7_update_binding_table(struct brw_context *brw,
+   gl_shader_stage stage,
+   uint32_t index,
+   uint32_t surf_offset);
+void gen7_update_binding_table_from_array(struct brw_context *brw,
+  gl_shader_stage stage,
+  const uint32_t* binding_table,
+  int size);
 void gen7_enable_hw_binding_tables(struct brw_context *brw);
 void gen7_disable_hw_binding_tables(struct brw_context *brw);
 void gen7_reset_rs_pool_offsets(struct brw_context *brw);
-- 
1.9.1
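
The dword packing used by both functions in this patch can be tried out on
its own. The sketch below is a standalone approximation: it hard-codes the
binding table index field as bits 23:16 (shift 16), matching the
BRW_BINDING_TABLE_INDEX definitions elsewhere in the series, and the helper
names are hypothetical. It does not emit anything to a batch buffer; it only
computes the values that would be written.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BT_INDEX_SHIFT 16
#define SKETCH_BT_INDEX_MASK  0x00ff0000u   /* bits 23:16 */

/* Pack one binding table edit entry: index in bits 23:16, surface state
 * offset (32-byte aligned) shifted down by 5 into the low 16 bits.
 */
static uint32_t
sketch_bt_edit_entry(uint32_t index, uint32_t surf_offset)
{
   assert((surf_offset & 0x1f) == 0);       /* must be 32-byte aligned */
   assert((surf_offset >> 5) <= 0xffff);    /* must not spill into bit 16 */

   return ((index << SKETCH_BT_INDEX_SHIFT) & SKETCH_BT_INDEX_MASK) |
          (surf_offset >> 5);
}

/* Pack a whole table, using each array position as the entry index. */
static void
sketch_bt_edit_entries(const uint32_t *binding_table, int size,
                       uint32_t *out)
{
   for (int i = 0; i < size; i++)
      out[i] = sketch_bt_edit_entry((uint32_t)i, binding_table[i]);
}
```

For instance, index 3 with a surface offset of 0x400 packs to 0x00030020: the
offset becomes 0x20 after the shift and stays clear of the index field.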



[Mesa-dev] [PATCH 11/27] i965: Store gather table information in the program data

2015-04-28 Thread Abdiel Janulgue
The resource streamer is able to gather and pack sparsely-located
constant data from any buffer object by referring to a gather table.
This patch adds support for keeping track of these constant data
fetches in a gather table.

The gather table is generated from two sources. Ordinary uniform fetches
are stored first. These are then combined with a separate table containing
UBO entries. The separate entry for UBOs is needed to make it easier to
generate the gather mask when combining and packing the constant data.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.h  |  9 +
 src/mesa/drivers/dri/i965/brw_gs.c   |  4 
 src/mesa/drivers/dri/i965/brw_program.c  |  5 +
 src/mesa/drivers/dri/i965/brw_shader.cpp |  4 +++-
 src/mesa/drivers/dri/i965/brw_shader.h   | 11 +++
 src/mesa/drivers/dri/i965/brw_vs.c   |  5 +
 src/mesa/drivers/dri/i965/brw_wm.c   |  5 +
 7 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 7fd49e9..e25c64d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -355,9 +355,12 @@ struct brw_stage_prog_data {
 
    GLuint nr_params;   /**< number of float params/constants */
GLuint nr_pull_params;
+   GLuint nr_ubo_params;
+   GLuint nr_gather_table;
 
unsigned curb_read_length;
unsigned total_scratch;
+   unsigned max_ubo_const_block;
 
/**
 * Register where the thread expects to find input data from the URB
@@ -375,6 +378,12 @@ struct brw_stage_prog_data {
 */
const gl_constant_value **param;
const gl_constant_value **pull_param;
+   struct {
+  int reg;
+  unsigned channel_mask;
+  unsigned const_block;
+  unsigned const_offset;
+   } *gather_table;
 };
 
 /* Data about a particular attempt to compile a program.  Note that
diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
b/src/mesa/drivers/dri/i965/brw_gs.c
index bea90d8..97658d5 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_gs.c
@@ -70,6 +70,10 @@ brw_compile_gs_prog(struct brw_context *brw,
    c.prog_data.base.base.pull_param =
       rzalloc_array(NULL, const gl_constant_value *, param_count);
    c.prog_data.base.base.nr_params = param_count;
+   c.prog_data.base.base.nr_gather_table = 0;
+   c.prog_data.base.base.gather_table =
+      rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
+                   (c.prog_data.base.base.nr_params + c.prog_data.base.base.nr_ubo_params));
 
    if (brw->gen >= 7) {
       if (gp->program.OutputType == GL_POINTS) {
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 81a0c19..f27c799 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -573,6 +573,10 @@ brw_stage_prog_data_compare(const struct brw_stage_prog_data *a,
    if (memcmp(a->pull_param, b->pull_param, a->nr_pull_params * sizeof(void *)))
       return false;
 
+   if (memcmp(a->gather_table, b->gather_table,
+              a->nr_gather_table * sizeof(*a->gather_table)))
+      return false;
+
    return true;
 }
 
@@ -583,6 +587,7 @@ brw_stage_prog_data_free(const void *p)
 
    ralloc_free(prog_data->param);
    ralloc_free(prog_data->pull_param);
+   ralloc_free(prog_data->gather_table);
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 0d6ac0c..8769f67 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -739,11 +739,13 @@ backend_visitor::backend_visitor(struct brw_context *brw,
  prog(prog),
  stage_prog_data(stage_prog_data),
  cfg(NULL),
- stage(stage)
+ stage(stage),
+ ubo_gather_table(NULL)
 {
    debug_enabled = INTEL_DEBUG & intel_debug_flag_for_shader_stage(stage);
    stage_name = _mesa_shader_stage_to_string(stage);
    stage_abbrev = _mesa_shader_stage_to_abbrev(stage);
+   this->nr_ubo_gather_table = 0;
 }
 
 bool
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index 8a3263e..db0018f 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -204,6 +204,17 @@ public:
void assign_common_binding_table_offsets(uint32_t 
next_binding_table_offset);
 
virtual void invalidate_live_intervals() = 0;
+
+   /** Gather table entries for UBOs */
+   unsigned nr_ubo_gather_table;
+
+   struct gather_table {
+  int reg;
+  unsigned channel_mask;
+  unsigned const_block;
+  unsigned const_offset;
+   };
+   gather_table *ubo_gather_table;
 };
 
 uint32_t brw_texture_offset(struct gl_context *ctx, int *offsets,
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index dabff43..52333c9 100644
--- 
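
The commit message above describes a gather table built from two sources:
ordinary uniform entries first, followed by entries from a separate UBO
table. A minimal sketch of that combination step follows. The entry struct
mirrors the fields added in this patch, but the function name, its signature
and the simple "append" strategy are assumptions made for illustration; the
real series may combine the tables differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the fields of the gather table entry added by the patch. */
struct sketch_gather_entry {
   int reg;
   unsigned channel_mask;
   unsigned const_block;
   unsigned const_offset;
};

/* Hypothetical helper: copy uniform entries first, then UBO entries.
 * Returns the number of entries written to dst.
 */
static size_t
sketch_combine_gather_tables(struct sketch_gather_entry *dst, size_t dst_cap,
                             const struct sketch_gather_entry *uniforms,
                             size_t nr_uniforms,
                             const struct sketch_gather_entry *ubos,
                             size_t nr_ubos)
{
   assert(nr_uniforms + nr_ubos <= dst_cap);

   if (nr_uniforms)
      memcpy(dst, uniforms, nr_uniforms * sizeof(*dst));
   if (nr_ubos)
      memcpy(dst + nr_uniforms, ubos, nr_ubos * sizeof(*dst));

   return nr_uniforms + nr_ubos;
}
```

Sizing the destination as nr_params + nr_ubo_params, as the allocation in the
patch does, guarantees the capacity check in this sketch cannot fail.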

[Mesa-dev] [PATCH 02/27] i965: Pass resource streamer enable flags on batchbuffer start

2015-04-28 Thread Abdiel Janulgue
This flag is passed to the kernel to set the resource streamer enable bit
on MI_BATCHBUFFER_START.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index e522e4e..bce5830 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -293,7 +293,8 @@ do_flush_locked(struct brw_context *brw)
       if (brw->gen >= 6 && batch->ring == BLT_RING) {
          flags = I915_EXEC_BLT;
       } else {
-         flags = I915_EXEC_RENDER;
+         flags = I915_EXEC_RENDER |
+            (brw->has_resource_streamer ? I915_EXEC_RESOURCE_STREAMER : 0);
       }
       if (batch->needs_sol_reset)
          flags |= I915_EXEC_GEN7_SOL_RESET;
-- 
1.9.1
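
The flag selection in this hunk can be exercised in isolation. In the sketch
below the SKETCH_EXEC_* values are hypothetical placeholders (the real
I915_EXEC_* values come from the kernel's i915_drm.h), and the function takes
plain parameters instead of a brw_context. It only demonstrates that the
resource streamer bit is requested on the render ring and never on the BLT
ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_EXEC_RENDER            (1u << 0)
#define SKETCH_EXEC_BLT               (3u << 0)
#define SKETCH_EXEC_SOL_RESET         (1u << 8)
#define SKETCH_EXEC_RESOURCE_STREAMER (1u << 15)

/* Mirrors the structure of the flag computation in do_flush_locked(). */
static uint32_t
sketch_exec_flags(int gen, bool blt_ring, bool has_resource_streamer,
                  bool needs_sol_reset)
{
   uint32_t flags;

   assert(gen > 0);

   if (gen >= 6 && blt_ring) {
      flags = SKETCH_EXEC_BLT;
   } else {
      flags = SKETCH_EXEC_RENDER |
         (has_resource_streamer ? SKETCH_EXEC_RESOURCE_STREAMER : 0);
   }
   if (needs_sol_reset)
      flags |= SKETCH_EXEC_SOL_RESET;

   return flags;
}
```

Because the bit is ORed in only on the render path, a driver that leaves
has_resource_streamer false submits exactly the same flags as before.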



[Mesa-dev] [PATCH 01/27] i965: Define HW-binding table and resource streamer control opcodes

2015-04-28 Thread Abdiel Janulgue
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.h |  1 +
 src/mesa/drivers/dri/i965/brw_defines.h | 24 
 src/mesa/drivers/dri/i965/intel_reg.h   |  3 +++
 3 files changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index a6d6787..07626af 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1105,6 +1105,7 @@ struct brw_context
bool no_simd8;
bool use_rep_send;
bool scalar_vs;
+   bool has_resource_streamer;
 
/**
 * Some versions of Gen hardware don't do centroid interpolation correctly
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index a97a944..da288d3 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1586,6 +1586,30 @@ enum brw_message_target {
 #define _3DSTATE_BINDING_TABLE_POINTERS_GS 0x7829 /* GEN7+ */
 #define _3DSTATE_BINDING_TABLE_POINTERS_PS 0x782A /* GEN7+ */
 
+#define _3DSTATE_BINDING_TABLE_POOL_ALLOC   0x7919 /* GEN7.5+ */
+#define BRW_HW_BINDING_TABLE_ENABLE_SHIFT   11 /* GEN7.5+ */
+#define BRW_HW_BINDING_TABLE_ENABLE_MASK    INTEL_MASK(11, 11)
+#define BRW_HW_BINDING_TABLE_ON             1
+#define BRW_HW_BINDING_TABLE_OFF            0
+#define GEN7_HW_BT_MOCS_SHIFT               7
+#define GEN7_HW_BT_MOCS_MASK                INTEL_MASK(10, 7)
+#define GEN8_HW_BT_MOCS_SHIFT               0
+#define GEN8_HW_BT_MOCS_MASK                INTEL_MASK(6, 0)
+/* Only required in HSW */
+#define HSW_HW_BINDING_TABLE_RESERVED       (3 << 5)
+
+#define _3DSTATE_BINDING_TABLE_EDIT_VS  0x7843 /* GEN7.5 */
+#define _3DSTATE_BINDING_TABLE_EDIT_GS  0x7844 /* GEN7.5 */
+#define _3DSTATE_BINDING_TABLE_EDIT_HS  0x7845 /* GEN7.5 */
+#define _3DSTATE_BINDING_TABLE_EDIT_DS  0x7846 /* GEN7.5 */
+#define _3DSTATE_BINDING_TABLE_EDIT_PS  0x7847 /* GEN7.5 */
+#define BRW_BINDING_TABLE_INDEX_SHIFT   16
+#define BRW_BINDING_TABLE_INDEX_MASK    INTEL_MASK(23, 16)
+
+#define BRW_BINDING_TABLE_EDIT_TARGET_ALL   3
+#define BRW_BINDING_TABLE_EDIT_TARGET_CORE1 2
+#define BRW_BINDING_TABLE_EDIT_TARGET_CORE0 1
+
 #define _3DSTATE_SAMPLER_STATE_POINTERS    0x7802 /* GEN6+ */
 # define PS_SAMPLER_STATE_CHANGE   (1 << 12)
 # define GS_SAMPLER_STATE_CHANGE   (1 << 9)
diff --git a/src/mesa/drivers/dri/i965/intel_reg.h 
b/src/mesa/drivers/dri/i965/intel_reg.h
index 488fb5b..9cdb3ca 100644
--- a/src/mesa/drivers/dri/i965/intel_reg.h
+++ b/src/mesa/drivers/dri/i965/intel_reg.h
@@ -47,6 +47,9 @@
 /* Load a value from memory into a register.  Only available on Gen7+. */
 #define GEN7_MI_LOAD_REGISTER_MEM  (CMD_MI | (0x29 << 23))
 # define MI_LOAD_REGISTER_MEM_USE_GGTT (1 << 22)
+/* Haswell RS control */
+#define MI_RS_CONTROL           (CMD_MI | (0x6 << 23))
+#define MI_RS_STORE_DATA_IMM    (CMD_MI | (0x2b << 23))
 
 /** @{
  *
-- 
1.9.1



[Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms

2015-04-28 Thread Abdiel Janulgue
Reserve space in the gather pool where the gathered uniforms are flushed.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/gen6_vs_state.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
b/src/mesa/drivers/dri/i965/gen6_vs_state.c
index 35d10ef..aebaa49 100644
--- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
@@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
*/
       assert(stage_state->push_const_size <= 32);
    }
+   /* Allocate gather pool space for uniform and UBO entries in 512-bit chunks */
+   if (brw->gather_pool.bo != NULL) {
+      if (prog_data->nr_params > 0) {
+         int num_consts = ALIGN(prog_data->nr_params, 4) / 4;
+         stage_state->push_const_offset = brw->gather_pool.next_offset;
+         brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
+      }
+   }
 }
 
 static void
-- 
1.9.1
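
The sizing arithmetic in the hunk above can be sketched in isolation. This is only an illustration of the two rounding steps as they appear in the patch; `align_up` and `gather_pool_bytes` are hypothetical names, and the sketch assumes each parameter is one 32-bit component, so four parameters make a vec4 and four vec4s fill one 512-bit (64-byte) chunk.

```c
#include <assert.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static unsigned align_up(unsigned v, unsigned a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Bytes of gather pool space reserved for nr_params 32-bit components,
 * mirroring the hunk: components -> vec4 constants -> 64-byte chunks.
 */
static unsigned gather_pool_bytes(unsigned nr_params)
{
   unsigned num_consts = align_up(nr_params, 4) / 4;   /* vec4 slots */
   unsigned num_chunks = align_up(num_consts, 4) / 4;  /* 4 vec4s per chunk */
   return num_chunks * 64;
}
```

With this arithmetic, anything from 1 to 16 components reserves a single 64-byte chunk, and a 17th component pushes the reservation to 128 bytes.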



[Mesa-dev] [PATCH 16/27] i965: Include UBO parameter sizes in push constant parameters

2015-04-28 Thread Abdiel Janulgue
Now that we consider UBO constants as push constants, we need to include
the sizes of the UBO's constant slots in the visitor's uniform slot sizes.
This information is needed to properly pack vector constants tightly next to
each other.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_gs.c | 11 +++
 src/mesa/drivers/dri/i965/brw_vs.c | 13 +
 src/mesa/drivers/dri/i965/brw_wm.c | 13 +
 3 files changed, 37 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
b/src/mesa/drivers/dri/i965/brw_gs.c
index 97658d5..2dc3ea1 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_gs.c
@@ -32,6 +32,7 @@
 #include "brw_vec4_gs_visitor.h"
 #include "brw_state.h"
 #include "brw_ff_gs.h"
+#include "glsl/nir/nir_types.h"
 
 
 bool
@@ -70,6 +71,16 @@ brw_compile_gs_prog(struct brw_context *brw,
c.prog_data.base.base.pull_param =
   rzalloc_array(NULL, const gl_constant_value *, param_count);
c.prog_data.base.base.nr_params = param_count;
+   c.prog_data.base.base.nr_ubo_params = 0;
+   for (int i = 0; i < gs->NumUniformBlocks; i++) {
+      for (int p = 0; p < gs->UniformBlocks[i].NumUniforms; p++) {
+         const struct glsl_type *type = gs->UniformBlocks[i].Uniforms[p].Type;
+         const struct glsl_type *elem = glsl_get_element_type(type);
+         int array_sz = elem ? glsl_get_array_size(type) : 1;
+         int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
+         c.prog_data.base.base.nr_ubo_params += components * array_sz;
+      }
+   }
c.prog_data.base.base.nr_gather_table = 0;
c.prog_data.base.base.gather_table =
   rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index 52333c9..86bef5e 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -37,6 +37,7 @@
 #include "brw_state.h"
 #include "program/prog_print.h"
 #include "program/prog_parameter.h"
+#include "glsl/nir/nir_types.h"
 
 #include "util/ralloc.h"
 
@@ -243,6 +244,18 @@ brw_compile_vs_prog(struct brw_context *brw,
   rzalloc_array(NULL, const gl_constant_value *, param_count);
    stage_prog_data->nr_params = param_count;
 
+   stage_prog_data->nr_ubo_params = 0;
+   if (vs) {
+      for (int i = 0; i < vs->NumUniformBlocks; i++) {
+         for (int p = 0; p < vs->UniformBlocks[i].NumUniforms; p++) {
+            const struct glsl_type *type = vs->UniformBlocks[i].Uniforms[p].Type;
+            const struct glsl_type *elem = glsl_get_element_type(type);
+            int array_sz = elem ? glsl_get_array_size(type) : 1;
+            int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
+            stage_prog_data->nr_ubo_params += components * array_sz;
+         }
+      }
+   }
    stage_prog_data->nr_gather_table = 0;
    stage_prog_data->gather_table = rzalloc_size(NULL, sizeof(*stage_prog_data->gather_table) *
                                                 (stage_prog_data->nr_params +
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
b/src/mesa/drivers/dri/i965/brw_wm.c
index 13a64d8..2060eab 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -38,6 +38,7 @@
 #include "main/samplerobj.h"
 #include "program/prog_parameter.h"
 #include "program/program.h"
+#include "glsl/nir/nir_types.h"
 #include "intel_mipmap_tree.h"
 
 #include "util/ralloc.h"
@@ -205,6 +206,18 @@ brw_compile_wm_prog(struct brw_context *brw,
   rzalloc_array(NULL, const gl_constant_value *, param_count);
prog_data.base.nr_params = param_count;
 
+   prog_data.base.nr_ubo_params = 0;
+   if (fs) {
+      for (int i = 0; i < fs->NumUniformBlocks; i++) {
+         for (int p = 0; p < fs->UniformBlocks[i].NumUniforms; p++) {
+            const struct glsl_type *type = fs->UniformBlocks[i].Uniforms[p].Type;
+            const struct glsl_type *elem = glsl_get_element_type(type);
+            int array_sz = elem ? glsl_get_array_size(type) : 1;
+            int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
+            prog_data.base.nr_ubo_params += components * array_sz;
+         }
+      }
+   }
prog_data.base.nr_gather_table = 0;
prog_data.base.gather_table = rzalloc_size(NULL, 
sizeof(*prog_data.base.gather_table) *
   (prog_data.base.nr_params +
-- 
1.9.1
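
The per-stage counting loops in this patch all follow the same pattern, which can be sketched without the Mesa types. `struct ubo_uniform` and `count_ubo_params` are hypothetical stand-ins for the `glsl_type` queries used in the diff; the sketch assumes an `array_size` of 0 marks a non-array uniform, which then counts as a single element.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one uniform inside a UBO. */
struct ubo_uniform {
   unsigned components;  /* components per element, e.g. 4 for a vec4 */
   unsigned array_size;  /* number of array elements, 0 if not an array */
};

/* Sum the push constant slots needed by every uniform in a block:
 * components per element times the number of elements.
 */
static unsigned count_ubo_params(const struct ubo_uniform *u, size_t n)
{
   unsigned total = 0;

   for (size_t i = 0; i < n; i++) {
      unsigned array_sz = u[i].array_size ? u[i].array_size : 1;
      total += u[i].components * array_sz;
   }
   return total;
}
```

For a block holding a vec4, a vec3[2] and a float, this yields 4 + 6 + 1 = 11 components.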



[Mesa-dev] [PATCH 18/27] i965/fs: Append ir_binop_ubo_load entries to the gather table

2015-04-28 Thread Abdiel Janulgue
Append ir_binop_ubo_load entries to the gather table when the const block
and offset are immediate values. Otherwise, just fall back to the previous
method of uploading the UBO constant data to the GRF using pull constants.
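
As a rough illustration of the size limit this patch applies before promoting a UBO load to a push constant, here is a minimal sketch. `fits_in_push_budget` is a hypothetical name, and the comparison direction is an assumption: the operator is damaged in the archived copy of the diff, and rejecting loads that reach the limit is the reading that matches the "only allow 16 registers" comment.

```c
#include <stdbool.h>

/* 16 registers of 8 components each, per the comment in the patch. */
enum { MAX_PUSH_COMPONENTS = 16 * 8 };

/* Returns true when a load of vector_elements components still fits,
 * given the uniform and UBO components already assigned.
 */
static bool fits_in_push_budget(unsigned uniforms, unsigned ubo_uniforms,
                                unsigned vector_elements)
{
   unsigned param_index = uniforms + ubo_uniforms;
   return param_index + vector_elements < MAX_PUSH_COMPONENTS;
}
```

Loads that fail this check would take the existing pull constant path instead.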

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 11 
 src/mesa/drivers/dri/i965/brw_fs.h   |  4 ++
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 +++-
 3 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 071ac59..031d807 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2273,6 +2273,7 @@ fs_visitor::assign_constant_locations()
}
 
    stage_prog_data->nr_params = 0;
+   stage_prog_data->nr_ubo_params = ubo_uniforms;
 
    unsigned const_reg_access[uniforms];
    memset(const_reg_access, 0, sizeof(const_reg_access));
@@ -2302,6 +2303,16 @@ fs_visitor::assign_constant_locations()
       stage_prog_data->gather_table[p].channel_mask =
          const_reg_access[i];
    }
+
+   for (unsigned i = 0; i < this->nr_ubo_gather_table; i++) {
+      int p = stage_prog_data->nr_gather_table++;
+      stage_prog_data->gather_table[p].reg = this->ubo_gather_table[i].reg;
+      stage_prog_data->gather_table[p].channel_mask = this->ubo_gather_table[i].channel_mask;
+      stage_prog_data->gather_table[p].const_block = this->ubo_gather_table[i].const_block;
+      stage_prog_data->gather_table[p].const_offset = this->ubo_gather_table[i].const_offset;
+      stage_prog_data->max_ubo_const_block = MAX2(stage_prog_data->max_ubo_const_block,
+                                                  this->ubo_gather_table[i].const_block);
+   }
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 32063f0..a48b2bb 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -417,6 +417,7 @@ public:
void setup_uniform_values(ir_variable *ir);
void setup_builtin_uniform_values(ir_variable *ir);
int implied_mrf_writes(fs_inst *inst);
+   bool generate_ubo_gather_table(ir_expression* ir);
 
virtual void dump_instructions();
virtual void dump_instructions(const char *name);
@@ -445,6 +446,9 @@ public:
/** Total number of direct uniforms we can get from NIR */
unsigned num_direct_uniforms;
 
+   /** Number of ubo uniform variable components visited. */
+   unsigned ubo_uniforms;
+
/** Byte-offset for the next available spot in the scratch space buffer. */
unsigned last_scratch;
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4e99366..11e608b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1179,11 +1179,18 @@ fs_visitor::visit(ir_expression *ir)
       emit(FS_OPCODE_PACK_HALF_2x16_SPLIT, this->result, op[0], op[1]);
       break;
    case ir_binop_ubo_load: {
+      /* Use gather push constants if at all possible, otherwise just
+       * fall back to pull constants for UBOs
+       */
+      if (generate_ubo_gather_table(ir))
+         break;
+
       /* This IR node takes a constant uniform block and a constant or
        * variable byte offset within the block and loads a vector from that.
        */
       ir_constant *const_uniform_block = ir->operands[0]->as_constant();
       ir_constant *const_offset = ir->operands[1]->as_constant();
+
       fs_reg surf_index;
 
       if (const_uniform_block) {
@@ -4144,6 +4151,79 @@ fs_visitor::resolve_bool_comparison(ir_rvalue *rvalue, fs_reg *reg)
    *reg = neg_result;
 }
 
+bool
+fs_visitor::generate_ubo_gather_table(ir_expression *ir)
+{
+   ir_constant *const_uniform_block = ir->operands[0]->as_constant();
+   ir_constant *const_offset = ir->operands[1]->as_constant();
+
+   if (ir->operation != ir_binop_ubo_load ||
+       !brw->has_resource_streamer ||
+       !brw->fs_ubo_gather ||
+       !const_uniform_block ||
+       !const_offset)
+      return false;
+
+   /* Only allow 16 registers (128 uniform components) as push constants.
+    */
+   unsigned int max_push_components = 16 * 8;
+   unsigned param_index = uniforms + ubo_uniforms;
+   if ((param_index + ir->type->vector_elements) >= max_push_components)
+      return false;
+
+   fs_reg reg;
+   if (dispatch_width == 16) {
+      for (int i = 0; i < (int) this->nr_ubo_gather_table; i++) {
+         if ((this->ubo_gather_table[i].const_block ==
+              const_uniform_block->value.u[0]) &&
+             (this->ubo_gather_table[i].const_offset ==
+              const_offset->value.u[0])) {
+            reg = fs_reg(UNIFORM, this->ubo_gather_table[i].reg);
+            reg.type = brw_type_for_base_type(ir->type);
+            break;
+         }
+      }
+      assert(reg.file == UNIFORM);
+   }
+
+   if (reg.file != UNIFORM) {
+      reg = fs_reg(UNIFORM, param_index);
+  

[Mesa-dev] [PATCH 12/27] i965: Assign hw-binding table index for each UBO constant buffer.

2015-04-28 Thread Abdiel Janulgue
To be able to refer to a constant buffer, the resource streamer needs
to index it with a hardware binding table entry. This patch assigns a
hardware binding table index to each UBO buffer.

The gather constants hardware fetches in 16-entry binding table blocks,
so we need to use a block that is otherwise unused.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.h  | 11 +++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  6 ++
 2 files changed, 17 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index e25c64d..276c359 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -678,6 +678,17 @@ struct brw_vs_prog_data {
 
 #define SURF_INDEX_GEN6_SOL_BINDING(t) (t)
 
+/** Start of hardware binding table index for uniform gather constant entries.
+ *  This must be aligned to the start of a hardware binding table block (a block
+ *  is a group of 16 binding table entries).
+ */
+#define BRW_UNIFORM_GATHER_INDEX_START 32
+
+/** Appended to the end of the binding table index for uniform constant buffers
+ *  to indicate the start of the UBO gather constant binding table.
+ */
+#define BRW_UBO_GATHER_INDEX_APPEND 2
+
 /* Note: brw_gs_prog_data_compare() must be updated when adding fields to
  * this struct!
  */
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 161d140..ce61554 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -884,6 +884,7 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
 
    uint32_t *surf_offsets =
       &stage_state->surf_offset[prog_data->binding_table.ubo_start];
+   bool use_gather = (brw->gather_pool.bo != NULL);
 
    for (int i = 0; i < shader->NumUniformBlocks; i++) {
       struct gl_uniform_buffer_binding *binding;
@@ -904,6 +905,11 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
                                   bo->size - binding->Offset,
                                   &surf_offsets[i],
                                   dword_pitch);
+      if (use_gather) {
+         int bt_idx = BRW_UNIFORM_GATHER_INDEX_START + BRW_UBO_GATHER_INDEX_APPEND + i;
+         gen7_update_binding_table(brw, stage_state->stage,
+                                   bt_idx, surf_offsets[i]);
+      }
    }
 
    if (shader->NumUniformBlocks)
-- 
1.9.1
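
The index assignment above is plain addition, and a small sketch shows which hardware binding table block a given UBO ends up in. The two constants are copied from the defines in the patch; `HW_BT_BLOCK_ENTRIES`, `ubo_gather_bt_index` and `hw_bt_block` are hypothetical names used only for this illustration, with the block size taken from the 16-entry figure in the commit message.

```c
#include <assert.h>

enum {
   BRW_UNIFORM_GATHER_INDEX_START = 32,  /* from the patch */
   BRW_UBO_GATHER_INDEX_APPEND = 2,      /* from the patch */
   HW_BT_BLOCK_ENTRIES = 16,             /* hypothetical name for the block size */
};

/* Hardware binding table index used for the i-th UBO of a stage. */
static int ubo_gather_bt_index(int ubo_index)
{
   assert(ubo_index >= 0);
   return BRW_UNIFORM_GATHER_INDEX_START + BRW_UBO_GATHER_INDEX_APPEND + ubo_index;
}

/* Which 16-entry block a binding table index falls into. */
static int hw_bt_block(int bt_idx)
{
   return bt_idx / HW_BT_BLOCK_ENTRIES;
}
```

The first UBO lands at index 34, inside block 2, which starts at the block-aligned index 32.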



[Mesa-dev] [PATCH 23/27] i965/vec4: Pack UBO registers right after uniform registers

2015-04-28 Thread Abdiel Janulgue
Since we now consider UBOs as push constants, we need to lay out
our push constant register space in such a way that UBO registers
are packed right after uniform registers.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 38 ++
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 5365af0..1e5cdf6 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -492,9 +492,10 @@ vec4_visitor::split_uniform_registers()
 void
 vec4_visitor::pack_uniform_registers()
 {
-   bool uniform_used[this->uniforms];
-   int new_loc[this->uniforms];
-   int new_chan[this->uniforms];
+   int total_uniforms = this->uniforms + this->ubo_uniforms;
+   bool uniform_used[total_uniforms];
+   int new_loc[total_uniforms];
+   int new_chan[total_uniforms];
 
memset(uniform_used, 0, sizeof(uniform_used));
memset(new_loc, 0, sizeof(new_loc));
@@ -518,7 +519,7 @@ vec4_visitor::pack_uniform_registers()
/* Now, figure out a packing of the live uniform vectors into our
 * push constants.
 */
-   for (int src = 0; src < uniforms; src++) {
+   for (int src = 0; src < total_uniforms; src++) {
       assert(src < uniform_array_size);
       int size = this->uniform_vector_size[src];
 
@@ -528,9 +529,16 @@ vec4_visitor::pack_uniform_registers()
   }
 
   int dst;
-      /* Find the lowest place we can slot this uniform in. */
+      /* Find the lowest place we can slot this uniform in. However, when
+       * our constants come from a mix of UBO and uniform sources, don't allow
+       * registers assigned to UBOs to fall into half-filled uniform slots when
+       * repacking, otherwise we could mix up uniform and UBO register fetches
+       * in one vec4.
+       */
       for (dst = 0; dst < src; dst++) {
-         if (this->uniform_vector_size[dst] + size <= 4)
+         bool allow_repack = ((src >= uniforms && dst >= uniforms) ||
+                              (src < uniforms && dst < uniforms)   ||
+                              this->uniform_vector_size[dst] == 0);
+         if (this->uniform_vector_size[dst] + size <= 4 && allow_repack)
             break;
   }
 
@@ -541,17 +549,20 @@ vec4_visitor::pack_uniform_registers()
 new_loc[src] = dst;
 new_chan[src] = this-uniform_vector_size[dst];
 
-         /* Move the references to the data */
-         for (int j = 0; j < size; j++) {
-            stage_prog_data->param[dst * 4 + new_chan[src] + j] =
-               stage_prog_data->param[src * 4 + j];
-         }
+         /* Move the references only for uniform data */
+         if (src < uniforms) {
+            for (int j = 0; j < size; j++) {
+               stage_prog_data->param[dst * 4 + new_chan[src] + j] =
+                  stage_prog_data->param[src * 4 + j];
+            }
+         }
 
          this->uniform_vector_size[dst] += size;
          this->uniform_vector_size[src] = 0;
       }
 
-      new_uniform_count = MAX2(new_uniform_count, dst + 1);
+      if (src < uniforms)
+         new_uniform_count = MAX2(new_uniform_count, dst + 1);
}
 
    this->uniforms = new_uniform_count;
@@ -1542,7 +1553,8 @@ vec4_visitor::setup_uniforms(int reg)
       this->uniforms++;
       reg++;
    } else {
-      reg += ALIGN(uniforms, 2) / 2;
+      int ubo_regs = ALIGN(ubo_uniforms, 4) / 4;
+      reg += ALIGN(ubo_regs + uniforms, 2) / 2;
    }
 
    stage_prog_data->nr_params = this->uniforms * 4;
-- 
1.9.1
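
The repacking rule this patch adds to pack_uniform_registers() can be exercised on its own. The sketch below is a simplified, hypothetical restatement: `allow_repack` and `find_slot` are illustrative free functions, `vector_size[]` stands in for `uniform_vector_size[]`, and indices below `uniforms` are assumed to be ordinary uniforms with everything at or above it coming from UBOs.

```c
#include <stdbool.h>

/* A source may share a destination slot only if both are UBO-backed,
 * both are ordinary uniforms, or the destination slot is still empty.
 */
static bool allow_repack(int src, int dst, int uniforms, const int *vector_size)
{
   bool both_ubo = src >= uniforms && dst >= uniforms;
   bool both_uniform = src < uniforms && dst < uniforms;
   return both_ubo || both_uniform || vector_size[dst] == 0;
}

/* Lowest slot below src with room for `size` channels, or src itself
 * when no earlier slot qualifies.
 */
static int find_slot(int src, int size, int uniforms, const int *vector_size)
{
   int dst;

   for (dst = 0; dst < src; dst++) {
      if (vector_size[dst] + size <= 4 &&
          allow_repack(src, dst, uniforms, vector_size))
         break;
   }
   return dst;
}
```

With slot sizes {2, 4, 2} and two ordinary uniforms, a two-channel UBO entry at index 2 stays where it is rather than filling the half-used uniform slot 0.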



[Mesa-dev] i965: enable resource streamer gather constants for UBOs

2015-04-28 Thread Abdiel Janulgue
This patch series enables resource streamer gather constants for UBOs.
With this feature, we treat UBO fetches as push constants instead of
pull constants. The resource streamer hardware makes it possible to
gather and pack, with minimal overhead, non-contiguous blocks of constant
data from an arbitrary buffer object, as is the case for UBO sources,
so the push constant state can treat the gathered constants as one GRF
block. I've initially targeted UBOs, but the same idea can in theory be
applied to any scattered uniform fetch as well, which I plan to focus on
next.
 
Mostly tested on Haswell, v2 has been incubating for some time and I
believe I've ironed out most of the major issues on the fs-backend. All
piglit tests for fragment shaders are passing. The vec4 backend still
needs some additional fine-tuning but it passes all vertex and geometry
shader piglit tests as well except gs-mat4x3. I've added a new
environment flag to selectively enable which shader stages to optimize.
 
Initial posting here if someone needs the original overview of the
series: 
http://lists.freedesktop.org/archives/mesa-dev/2015-January/073594.html

Entire series lives here:
git://people.freedesktop.org/~abj/mesa:rs_gather_constants_NIR
 
Below are some real-world results from Unreal Engine 4 demos which
feature heavy UBO usage. The benchmark enabled use of gather constants
only for the fragment shaders.
 
EffectsCave (NIR disabled):
x  fs gather constants disabled
+  fs gather constants enabled
    N          Min          Max       Median          Avg       Stddev
x  10       4.6008      4.83961      4.80967     4.791587   0.06943449
+  10      5.05152      5.14954      5.11507     5.106432  0.031042147
Difference at 95.0% confidence
0.314845 ± 0.0505323
6.57079% ± 1.0546%

EffectsCave (NIR enabled):
x  fs gather constants disabled
+  fs gather constants enabled
    N          Min          Max       Median          Avg       Stddev
x  10      3.99146      4.26072      4.19591     4.157199  0.093623634
+  10      4.51396      4.59149      4.58185     4.574359  0.022251777
Difference at 95.0% confidence
0.41716 ± 0.0639358
10.0346% ± 1.53795%
 
Reflections Subway (NIR disabled):
x  fs gather constants disabled
+  fs gather constants enabled
    N          Min          Max       Median          Avg       Stddev
x  10      6.64539      7.28898      7.11371     7.083675   0.19290418
+  10      7.58844      7.66247      7.64003     7.632628  0.022702317
Difference at 95.0% confidence
0.548953 ± 0.129049
7.74955% ± 1.82178%
 
Reflections Subway (NIR enabled):
x  fs gather constants disabled
+  fs gather constants enabled
    N          Min          Max       Median          Avg       Stddev
x  10      6.03644      6.19722      6.08858     6.097111  0.062671415
+  10      6.30447       6.4363      6.35115     6.358372  0.043168601
Difference at 95.0% confidence
0.261261 ± 0.0505605
4.285% ± 0.829254%
 
What's changed since initial posting:

* Lots of squashed patches (~50 --> ~30)!
* Use environment variable INTEL_UBO_GATHER=vs,fs,gs to selectively enable
  which shader stage to optimize with this feature.
* NIR support for the fs-backend.
* Remove unrelated fine-grained uniform support which I'll resubmit in a
  separate patch series.
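
For readers wondering how a value such as INTEL_UBO_GATHER=vs,fs,gs might be turned into per-stage switches, here is a minimal sketch. It is not taken from the series: the flag names and `parse_ubo_gather` are hypothetical, and the only assumption drawn from the posting is that the variable holds a comma-separated list of stage names.

```c
#include <string.h>

/* Hypothetical per-stage flag bits, for illustration only. */
enum {
   UBO_GATHER_VS = 1 << 0,
   UBO_GATHER_FS = 1 << 1,
   UBO_GATHER_GS = 1 << 2,
};

/* Parse a comma-separated stage list such as "vs,fs,gs" into a bitmask.
 * Unknown or empty tokens are ignored; NULL means the variable is unset.
 */
static unsigned parse_ubo_gather(const char *value)
{
   unsigned stages = 0;

   if (value == NULL)
      return 0;

   while (*value != '\0') {
      size_t len = strcspn(value, ",");  /* length of the current token */

      if (len == 2 && strncmp(value, "vs", 2) == 0)
         stages |= UBO_GATHER_VS;
      else if (len == 2 && strncmp(value, "fs", 2) == 0)
         stages |= UBO_GATHER_FS;
      else if (len == 2 && strncmp(value, "gs", 2) == 0)
         stages |= UBO_GATHER_GS;

      value += len;
      if (*value == ',')
         value++;
   }
   return stages;
}
```

A caller would typically pass the result of getenv("INTEL_UBO_GATHER") and test the returned bits once per shader stage.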
 
Dependencies:

* You'll need the i915 kernel driver which enables the resource streamer. I
  plan to submit this in a separate patch series to the i915 mailing list:
  git://people.freedesktop.org/~abj/linux:intel_resource_streamer_2

* libdrm with updated headers:
  git://people.freedesktop.org/~abj/libdrm:libdrm_rs
  
Patch overview:
 
Patches 1-5:   Enables core resource streamer functionality and
               hardware-generated binding tables
Patches 6-10:  Switches on the hardware bits for gather push constants
Patches 11-16: Core compiler support
Patches 17-20: Support for original i965 fs backend
Patches 19:    Support for NIR fs backend
Patches 21-23: Support for vec4 backend
Patches 24-26: Required state setup and workarounds
Patches 29:    Switch on push constants whenever we have UBO entries.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
src/glsl/nir/nir_types.cpp|  11 ++
 src/glsl/nir/nir_types.h  |   4 +
 .../drivers/dri/i965/brw_binding_tables.c | 180 +-
 src/mesa/drivers/dri/i965/brw_context.c   |  41 
 src/mesa/drivers/dri/i965/brw_context.h   |  36 
 src/mesa/drivers/dri/i965/brw_defines.h   |  47 +
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  71 ++-
 src/mesa/drivers/dri/i965/brw_fs.h|   6 +
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |  59 ++
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |  86 -
 src/mesa/drivers/dri/i965/brw_gs.c|  15 ++
 src/mesa/drivers/dri/i965/brw_program.c   |   5 +
 src/mesa/drivers/dri/i965/brw_shader.cpp  |   4 +-
 src/mesa/drivers/dri/i965/brw_shader.h| 

[Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.

2015-04-28 Thread Abdiel Janulgue
This patch implements the binding table enable command, which is also
used to allocate a binding table pool into which hardware-generated
binding table entries are flushed. Each binding table offset in
the binding table pool is unique to each shader stage that is
enabled within a batch.

Also insert the required brw_tracked_state objects to enable
hw-generated binding tables in the normal render path.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 ++
 src/mesa/drivers/dri/i965/brw_context.c|  4 ++
 src/mesa/drivers/dri/i965/brw_context.h|  5 ++
 src/mesa/drivers/dri/i965/brw_state.h  |  7 +++
 src/mesa/drivers/dri/i965/brw_state_upload.c   |  2 +
 src/mesa/drivers/dri/i965/intel_batchbuffer.c  |  4 ++
 6 files changed, 92 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index 459165a..a58e32e 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -44,6 +44,11 @@
 #include "brw_state.h"
 #include "intel_batchbuffer.h"
 
+/* Somehow the hw-binding table pool offset must start here, otherwise
+ * the GPU will hang
+ */
+#define HW_BT_START_OFFSET 256
+
 /**
  * Upload a shader stage's binding table as indirect state.
  *
@@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = {
.emit = brw_gs_upload_binding_table,
 };
 
+/**
+ * Hardware-generated binding tables for the resource streamer
+ */
+void
+gen7_disable_hw_binding_tables(struct brw_context *brw)
+{
+   BEGIN_BATCH(3);
+   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
+   OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, BRW_HW_BINDING_TABLE_ENABLE) |
+             (brw->is_haswell ? HSW_HW_BINDING_TABLE_RESERVED : 0));
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+
+   /* Pipe control workaround */
+   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
+}
+
+void
+gen7_enable_hw_binding_tables(struct brw_context *brw)
+{
+   if (!brw->has_resource_streamer) {
+      gen7_disable_hw_binding_tables(brw);
+      return;
+   }
+
+   if (!brw->hw_bt_pool.bo) {
+      /* From the BSpec, 3D Pipeline > Resource Streamer > Hardware Binding Tables:
+       *
+       *  "A maximum of 16,383 Binding tables are allowed in any batch buffer."
+       */
+      int max_size = 16383 * 4;
+      brw->hw_bt_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "hw_bt",
+                                              max_size, 64);
+      brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
+   }
+
+   uint32_t dw1 = SET_FIELD(BRW_HW_BINDING_TABLE_ON, BRW_HW_BINDING_TABLE_ENABLE);
+   if (brw->is_haswell)
+      dw1 |= SET_FIELD(GEN7_MOCS_L3, GEN7_HW_BT_MOCS) | HSW_HW_BINDING_TABLE_RESERVED;
+
+   BEGIN_BATCH(3);
+   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
+   OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
+   OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0,
+             brw->hw_bt_pool.bo->size);
+   ADVANCE_BATCH();
+
+   /* Pipe control workaround */
+   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
+}
+
+void
+gen7_reset_rs_pool_offsets(struct brw_context *brw)
+{
+   brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
+}
+
+const struct brw_tracked_state gen7_hw_binding_tables = {
+   .dirty = {
+  .mesa = 0,
+  .brw = BRW_NEW_BATCH,
+   },
+   .emit = gen7_enable_hw_binding_tables
+};
+
 /** @} */
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index c7e1e81..9c7ccae 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -953,6 +953,10 @@ intelDestroyContext(__DRIcontext * driContextPriv)
    if (brw->wm.base.scratch_bo)
       drm_intel_bo_unreference(brw->wm.base.scratch_bo);
 
+   gen7_reset_rs_pool_offsets(brw);
+   drm_intel_bo_unreference(brw->hw_bt_pool.bo);
+   brw->hw_bt_pool.bo = NULL;
+
    drm_intel_gem_context_destroy(brw->hw_ctx);
 
    if (ctx->swrast_context) {
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 07626af..1c72b74 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1360,6 +1360,11 @@ struct brw_context
   uint32_t fast_clear_op;
} wm;
 
+   /* RS hardware binding table */
+   struct {
+  drm_intel_bo *bo;
+  uint32_t next_offset;
+   } hw_bt_pool;
 
struct {
   uint32_t state_offset;
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index cfa67b6..d882bdd 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -130,6 +130,7 @@ extern const struct brw_tracked_state gen7_sol_state;
 extern const struct brw_tracked_state gen7_urb;
 extern const struct brw_tracked_state gen7_vs_state;
 extern const 

[Mesa-dev] [PATCH 15/27] nir: Add glsl_get_array_size() wrapper.

2015-04-28 Thread Abdiel Janulgue
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/nir/nir_types.cpp | 6 ++
 src/glsl/nir/nir_types.h   | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp
index 249678f..7218eeb 100644
--- a/src/glsl/nir/nir_types.cpp
+++ b/src/glsl/nir/nir_types.cpp
@@ -164,3 +164,9 @@ glsl_array_type(const glsl_type *base, unsigned elements)
 {
return glsl_type::get_array_instance(base, elements);
 }
+
+unsigned
+glsl_get_array_size(const struct glsl_type *type)
+{
+   return type->array_size();
+}
diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h
index 125f075..32e18f3 100644
--- a/src/glsl/nir/nir_types.h
+++ b/src/glsl/nir/nir_types.h
@@ -53,6 +53,8 @@ const struct glsl_type *glsl_get_element_type(const struct glsl_type *type);
 
 enum glsl_base_type glsl_get_base_type(const struct glsl_type *type);
 
+unsigned glsl_get_array_size(const struct glsl_type *type);
+
 unsigned glsl_get_vector_elements(const struct glsl_type *type);
 
 unsigned glsl_get_components(const struct glsl_type *type);
-- 
1.9.1



[Mesa-dev] [PATCH 20/27] i965/fs: Pack UBO registers right after uniform registers

2015-04-28 Thread Abdiel Janulgue
This is done by generating the gather constants channel mask.
Each gather push constant entry equates to a constant buffer
fetch for entries scattered around the buffer in 128-bit increments.
To select which bits are loaded into an entry, the hardware provides
a channel mask interface to narrow down which channels are loaded
into the packing gather pool. This patch generates the mask for
enabled entries.

This is accomplished by walking the live registers and appending them
to this channel mask. Note that the ir_swizzle visitor, which is run
prior to assign_push_constant_locations(), determines which registers
are loaded in the push constant array.

We have two sources of constant buffers: UBOs and ordinary uniforms.
After assigning a block of push constant hw-registers to normal uniforms,
just pack the UBO registers right after it.
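
One visible effect of packing UBO registers after the uniform registers is that the push constant read length has to cover both. A minimal sketch of that computation, assuming eight 32-bit components per register (consistent with the "16 registers (128 uniform components)" comment elsewhere in the series); `curb_read_length` here is a hypothetical free function mirroring the ALIGN(..., 8) / 8 expression in the patch:

```c
/* Number of 8-component registers needed to hold the ordinary uniform
 * components followed by the UBO components, rounded up.
 */
static unsigned curb_read_length(unsigned nr_params, unsigned nr_ubo_params)
{
   unsigned total = nr_params + nr_ubo_params;
   return (total + 7) / 8;  /* same result as ALIGN(total, 8) / 8 */
}
```

For example, 5 uniform components alone need one register, while 5 uniform plus 4 UBO components need two.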

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 43 
 1 file changed, 34 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 031d807..e4d6300 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1145,7 +1145,10 @@ fs_visitor::import_uniforms(fs_visitor *v)
    this->push_constant_loc = v->push_constant_loc;
    this->pull_constant_loc = v->pull_constant_loc;
    this->uniforms = v->uniforms;
+   this->ubo_uniforms = v->ubo_uniforms;
    this->param_size = v->param_size;
+   this->nr_ubo_gather_table = v->nr_ubo_gather_table;
+   this->ubo_gather_table = v->ubo_gather_table;
 }
 
 /* Our support for uniforms is piggy-backed on the struct
@@ -1724,7 +1727,8 @@ fs_visitor::assign_curb_setup()
       prog_data->dispatch_grf_start_reg_16 = payload.num_regs;
    }
 
-   prog_data->curb_read_length = ALIGN(stage_prog_data->nr_params, 8) / 8;
+   prog_data->curb_read_length = ALIGN(stage_prog_data->nr_params + stage_prog_data->nr_ubo_params,
+                                       8) / 8;
 
/* Map the offsets in the UNIFORM file to fixed HW regs. */
foreach_block_and_inst(block, fs_inst, inst, cfg) {
@@ -1732,7 +1736,7 @@ fs_visitor::assign_curb_setup()
          if (inst->src[i].file == UNIFORM) {
             int uniform_nr = inst->src[i].reg + inst->src[i].reg_offset;
             int constant_nr;
-            if (uniform_nr >= 0 && uniform_nr < (int) uniforms) {
+            if (uniform_nr >= 0 && uniform_nr < (int) (uniforms + ubo_uniforms)) {
                constant_nr = push_constant_loc[uniform_nr];
 } else {
/* Section 5.11 of the OpenGL 4.1 spec says:
@@ -2167,8 +2171,9 @@ fs_visitor::move_uniform_array_access_to_pull_constants()
if (dispatch_width != 8)
   return;
 
-   pull_constant_loc = ralloc_array(mem_ctx, int, uniforms);
-   memset(pull_constant_loc, -1, sizeof(pull_constant_loc[0]) * uniforms);
+   unsigned int total_uniforms = uniforms + ubo_uniforms;
+   pull_constant_loc = ralloc_array(mem_ctx, int, total_uniforms);
+   memset(pull_constant_loc, -1, sizeof(pull_constant_loc[0]) * total_uniforms);
 
/* Walk through and find array access of uniforms.  Put a copy of that
 * uniform in the pull constant buffer.
@@ -2218,9 +2223,10 @@ fs_visitor::assign_constant_locations()
if (dispatch_width != 8)
   return;
 
+   unsigned int total_uniforms = uniforms + ubo_uniforms;
/* Find which UNIFORM registers are still in use. */
-   bool is_live[uniforms];
-   for (unsigned int i = 0; i < uniforms; i++) {
+   bool is_live[total_uniforms];
+   for (unsigned int i = 0; i < total_uniforms; i++) {
   is_live[i] = false;
}
 
@@ -2230,8 +2236,26 @@ fs_visitor::assign_constant_locations()
 continue;
 
          int constant_nr = inst->src[i].reg + inst->src[i].reg_offset;
-         if (constant_nr >= 0 && constant_nr < (int) uniforms)
+         if (constant_nr >= 0 && constant_nr < (int) total_uniforms) {
             is_live[constant_nr] = true;
+
+            for (unsigned int p = 0; p < this->nr_ubo_gather_table; p++) {
+               if (this->ubo_gather_table[p].reg == inst->src[i].reg) {
+                  /* Generate the channel mask to determine which entries starting from
+                   * the offset above should be packed into the 16-byte entry. If the
+                   * offset is aligned to a 16-byte boundary, just set the position based on
+                   * the reg_offset. Otherwise, set the mask based on the position of the offset
+                   * from the boundary.
+                   */
+                  unsigned mask = ((this->ubo_gather_table[p].const_offset % 16) == 0) ?
+                     1 << inst->src[i].reg_offset :
+                     1 << ((this->ubo_gather_table[p].const_offset % 16) / 4);
+
+                  this->ubo_gather_table[p].channel_mask |= mask;
+                  break;
+               }
+            }
+         }
   }
}
 
@@ -2246,9 +2270,9 @@ 

[Mesa-dev] [PATCH 24/27] i965: Upload UBO surfaces before emitting constant state packet

2015-04-28 Thread Abdiel Janulgue
Now that UBOs are uploaded as push constants, we need to obtain and
append the number of push constant entries generated by the UBO entry
fetches to the 3DSTATE_CONSTANT_* packets.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_state_upload.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index e9f3bbd..bee6c56 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -192,6 +192,10 @@ static const struct brw_tracked_state *gen7_render_atoms[] =
 
    gen7_hw_binding_tables, /* Enable hw-generated binding tables for Haswell */
 
+   brw_vs_ubo_surfaces,
+   brw_gs_ubo_surfaces,
+   brw_wm_ubo_surfaces,
+
gen6_vs_push_constants, /* Before vs_state */
gen6_gs_push_constants, /* Before gs_state */
gen6_wm_push_constants, /* Before wm_surfaces and constant_buffer */
@@ -200,13 +204,10 @@ static const struct brw_tracked_state *gen7_render_atoms[] =
 * table upload must be last.
 */
brw_vs_pull_constants,
-   brw_vs_ubo_surfaces,
brw_vs_abo_surfaces,
brw_gs_pull_constants,
-   brw_gs_ubo_surfaces,
brw_gs_abo_surfaces,
brw_wm_pull_constants,
-   brw_wm_ubo_surfaces,
brw_wm_abo_surfaces,
gen6_renderbuffer_surfaces,
brw_texture_surfaces,
-- 
1.9.1



[Mesa-dev] [PATCH 22/27] i965/vec4: Append ir_binop_ubo_load entries to the gather table

2015-04-28 Thread Abdiel Janulgue
Do this when the const block and offset are immediate values. Otherwise
just fall back to the previous method of uploading the UBO constant data
to GRF using pull constants.
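
The lookup-or-append behaviour of the gather table can be sketched
roughly as below. This is a standalone approximation, not the code in
this patch: the struct names, MAX_GATHER_ENTRIES and
gather_table_get_reg() are hypothetical, and the capacity check stands
in for the push constant limit that the real visitor tests.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_GATHER_ENTRIES 64   /* assumption: arbitrary capacity for this sketch */

struct gather_entry {
   int reg;                /* UNIFORM register backing this entry */
   uint32_t const_block;   /* UBO binding index */
   uint32_t const_offset;  /* byte offset inside the UBO */
   uint32_t channel_mask;  /* enabled 32-bit channels */
};

struct gather_table {
   struct gather_entry entries[MAX_GATHER_ENTRIES];
   unsigned count;
};

/* Returns the register holding (block, offset), reusing an existing entry
 * when one matches. Returns -1 when the table is full, in which case the
 * caller would fall back to pull constants.
 */
static int
gather_table_get_reg(struct gather_table *t, uint32_t block, uint32_t offset,
                     unsigned components, int next_free_reg)
{
   assert(components >= 1 && components <= 4);

   for (unsigned i = 0; i < t->count; i++) {
      if (t->entries[i].const_block == block &&
          t->entries[i].const_offset == offset)
         return t->entries[i].reg;
   }

   if (t->count == MAX_GATHER_ENTRIES)
      return -1;

   struct gather_entry *e = &t->entries[t->count++];
   e->reg = next_free_reg;
   e->const_block = block;
   e->const_offset = offset;
   /* Enable one channel per vector component, lowest channel first. */
   e->channel_mask = (1u << components) - 1;
   return e->reg;
}
```

Reusing an entry for a repeated (block, offset) pair keeps the table,
and therefore the number of push constant registers, from growing with
every load of the same UBO member.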

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 12 
 src/mesa/drivers/dri/i965/brw_vec4.h   |  2 +
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 80 ++
 3 files changed, 94 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 799d79e..5365af0 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -580,6 +580,18 @@ vec4_visitor::generate_gather_table()
   stage_prog_data-gather_table[p].reg = -1;
   stage_prog_data-gather_table[p].channel_mask = 0xf;
}
+
+   for (unsigned i = 0; i < this->nr_ubo_gather_table; i++) {
+      int p = stage_prog_data->nr_gather_table++;
+      stage_prog_data->gather_table[p].reg = this->ubo_gather_table[i].reg;
+      stage_prog_data->gather_table[p].channel_mask = this->ubo_gather_table[i].channel_mask;
+      stage_prog_data->gather_table[p].const_block = this->ubo_gather_table[i].const_block;
+      stage_prog_data->gather_table[p].const_offset = this->ubo_gather_table[i].const_offset;
+      stage_prog_data->max_ubo_const_block = MAX2(stage_prog_data->max_ubo_const_block,
+                                                  this->ubo_gather_table[i].const_block);
+   }
+
+   stage_prog_data->nr_ubo_params = ubo_uniforms;
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index e58ea25..858f9ea 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -173,6 +173,7 @@ public:
int *uniform_vector_size;
int uniform_array_size; /* Size of uniform_[vector_]size arrays */
int uniforms;
+   int ubo_uniforms;
 
src_reg shader_start_time;
 
@@ -386,6 +387,7 @@ public:
void dump_instruction(backend_instruction *inst, FILE *file);
 
void visit_atomic_counter_intrinsic(ir_call *ir);
+   bool generate_ubo_gather_table(ir_expression *ir, const dst_reg result_dst);
 
 protected:
void emit_vertex();
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 3d16caa..9408f75 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1800,6 +1800,12 @@ vec4_visitor::visit(ir_expression *ir)
   break;
 
case ir_binop_ubo_load: {
+  /* Use gather push constants if at all possible, otherwise just
+   * fall back to pull constants for UBOs
+   */
+  if (generate_ubo_gather_table(ir, result_dst))
+ break;
+
   ir_constant *const_uniform_block = ir-operands[0]-as_constant();
   ir_constant *const_offset_ir = ir-operands[1]-as_constant();
   unsigned const_offset = const_offset_ir ? const_offset_ir-value.u[0] : 
0;
@@ -3645,6 +3651,72 @@ vec4_visitor::resolve_bool_comparison(ir_rvalue *rvalue, 
src_reg *reg)
*reg = neg_result;
 }
 
+bool
+vec4_visitor::generate_ubo_gather_table(ir_expression *ir, const dst_reg result_dst)
+{
+   ir_constant *const_uniform_block = ir->operands[0]->as_constant();
+   ir_constant *const_offset_ir = ir->operands[1]->as_constant();
+   unsigned const_offset = const_offset_ir ? const_offset_ir->value.u[0] : 0;
+
+   if (ir->operation != ir_binop_ubo_load ||
+       !brw->has_resource_streamer||
+       !const_uniform_block   ||
+       !const_offset_ir)
+      return false;
+
+   if (stage == MESA_SHADER_VERTEX && !brw->vs_ubo_gather)
+      return false;
+   if (stage == MESA_SHADER_GEOMETRY && !brw->gs_ubo_gather)
+      return false;
+
+   /* Only allow 32 registers (256 uniform components) as push constants,
+    */
+   int max_uniform_components = 32 * 8;
+   int param_index = uniforms + ubo_uniforms;
+   if ((param_index + ir->type->vector_elements) >= max_uniform_components)
+      return false;
+
+   dst_reg reg;
+   for (int i = 0; i < (int) this->nr_ubo_gather_table; i++) {
+      if ((this->ubo_gather_table[i].const_block ==
+           const_uniform_block->value.u[0]) &&
+          (this->ubo_gather_table[i].const_offset ==
+           const_offset)) {
+         reg = dst_reg(UNIFORM, this->ubo_gather_table[i].reg);
+         break;
+      }
+   }
+
+   if (reg.file != UNIFORM) {
+      reg = dst_reg(UNIFORM, param_index);
+      uniform_vector_size[param_index] = ir->type->vector_elements;
+
+      int gather = this->nr_ubo_gather_table++;
+      this->ubo_gather_table[gather].reg = reg.reg;
+      this->ubo_gather_table[gather].const_block =
+         const_uniform_block->value.u[0];
+      this->ubo_gather_table[gather].const_offset = const_offset;
+
+      for (int i = 0; i < ir->type->vector_elements; i++) {
+         this->ubo_gather_table[gather].channel_mask |= (1 << i);
+      }
+  

[Mesa-dev] [PATCH 07/27] i965: Enable gather push constants

2015-04-28 Thread Abdiel Janulgue
The 3DSTATE_GATHER_POOL_ALLOC is used to enable or disable the gather
push constants feature within a context. This patch provides the toggle
functionality of using gather push constants to program constant data
within a batch.

Using gather push constants requires that a gather pool be allocated so
that the resource streamer can flush the packed constants it gathered.
The pool is later referenced by the 3DSTATE_CONSTANT_* command to
program the push constant data.

Also introduce INTEL_UBO_GATHER to select which shader stages use
gather constants for UBO fetches.
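
As a rough illustration of how a value such as INTEL_UBO_GATHER=vs,fs
maps to per-stage flags, the sketch below parses a comma-separated list
into a bitmask. The patch itself relies on driParseDebugString(); the
parser here is a standalone stand-in whose tokenisation rules are an
assumption, and the stage indices are placeholders for Mesa's
MESA_SHADER_* values.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: placeholder stage indices standing in for MESA_SHADER_*. */
enum sketch_stage { STAGE_VERTEX = 0, STAGE_GEOMETRY = 1, STAGE_FRAGMENT = 2 };

struct stage_flag {
   const char *name;
   uint64_t flag;
};

static const struct stage_flag gather_control[] = {
   { "vs", 1ull << STAGE_VERTEX },
   { "gs", 1ull << STAGE_GEOMETRY },
   { "fs", 1ull << STAGE_FRAGMENT },
   { NULL, 0 }
};

/* Parse a comma-separated list such as "vs,fs" into a stage bitmask.
 * A NULL or empty string yields 0; unknown tokens are ignored.
 */
static uint64_t
parse_ubo_gather(const char *value)
{
   uint64_t mask = 0;

   if (value == NULL)
      return 0;

   while (*value) {
      size_t len = strcspn(value, ",");

      for (const struct stage_flag *f = gather_control; f->name; f++) {
         if (strlen(f->name) == len && strncmp(value, f->name, len) == 0)
            mask |= f->flag;
      }

      value += len;
      if (*value == ',')
         value++;
   }
   return mask;
}
```

A caller would pass getenv("INTEL_UBO_GATHER") and test the individual
stage bits, much as the patch does when it sets vs_ubo_gather,
gs_ubo_gather and fs_ubo_gather.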

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c | 43 +-
 src/mesa/drivers/dri/i965/brw_context.c| 37 ++
 src/mesa/drivers/dri/i965/brw_context.h| 10 ++
 src/mesa/drivers/dri/i965/brw_state.h  |  1 +
 4 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index c1d188e..4793fbc 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -236,9 +236,47 @@ gen7_update_binding_table_from_array(struct brw_context 
*brw,
ADVANCE_BATCH();
 }
 
+static void
+gen7_init_gather_pool(struct brw_context *brw)
+{
+   if (!brw->has_resource_streamer)
+      return;
+
+   if (!brw->gather_pool.bo) {
+      brw->gather_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "gather_pool",
+                                               brw->gather_pool.size, 4096);
+      brw->gather_pool.next_offset = 0;
+   }
+}
+
+void
+gen7_toggle_gather_constants(struct brw_context *brw, bool enable)
+{
+   if (enable && !brw->has_resource_streamer)
+      return;
+
+   uint32_t dw1 = brw->is_haswell ? HSW_GATHER_CONSTANTS_RESERVED : 0;
+
+   BEGIN_BATCH(3);
+   OUT_BATCH(_3DSTATE_GATHER_POOL_ALLOC << 16 | (3 - 2));
+   if (enable) {
+      dw1 |= SET_FIELD(BRW_GATHER_CONSTANTS_ON, BRW_GATHER_CONSTANTS_ENABLE) |
+             (brw->is_haswell ? GEN7_MOCS_L3 : 0);
+      OUT_RELOC(brw->gather_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
+      OUT_RELOC(brw->gather_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0,
+                brw->gather_pool.bo->size);
+   } else {
+  OUT_BATCH(dw1);
+  OUT_BATCH(0);
+   }
+   ADVANCE_BATCH();
+}
+
 void
 gen7_disable_hw_binding_tables(struct brw_context *brw)
 {
+   gen7_toggle_gather_constants(brw, false);
+
BEGIN_BATCH(3);
OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC  16 | (3 - 2));
OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, BRW_HW_BINDING_TABLE_ENABLE) |
@@ -280,6 +318,9 @@ gen7_enable_hw_binding_tables(struct brw_context *brw)
  brw-hw_bt_pool.bo-size);
ADVANCE_BATCH();
 
+   gen7_init_gather_pool(brw);
+   gen7_toggle_gather_constants(brw, true);
+
/* Pipe control workaround */
brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
 }
@@ -288,6 +329,7 @@ void
 gen7_reset_rs_pool_offsets(struct brw_context *brw)
 {
brw-hw_bt_pool.next_offset = HW_BT_START_OFFSET;
+   brw-gather_pool.next_offset = 0;
 }
 
 const struct brw_tracked_state gen7_hw_binding_tables = {
@@ -371,5 +413,4 @@ const struct brw_tracked_state gen6_binding_table_pointers 
= {
},
.emit = gen6_upload_binding_table_pointers,
 };
-
 /** @} */
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 9c7ccae..685ca70 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -67,6 +67,7 @@
 #include "tnl/tnl.h"
 #include "tnl/t_pipeline.h"
 #include "util/ralloc.h"
+#include "util/u_atomic.h"
 
 #include "glsl/nir/nir.h"
 
@@ -692,6 +693,25 @@ brw_get_revision(int fd)
return revision;
 }
 
+static void
+brw_process_intel_gather_variable(struct brw_context *brw)
+{
+   uint64_t INTEL_UBO_GATHER = 0;
+
+   static const struct dri_debug_control gather_control[] = {
+      { "vs", (1 << MESA_SHADER_VERTEX)},
+      { "gs", (1 << MESA_SHADER_GEOMETRY)},
+      { "fs", (1 << MESA_SHADER_FRAGMENT)},
+      { NULL, 0 }
+   };
+   uint64_t intel_ubo_gather = driParseDebugString(getenv("INTEL_UBO_GATHER"), gather_control);
+   (void) p_atomic_cmpxchg(&INTEL_UBO_GATHER, 0, intel_ubo_gather);
+
+   brw->vs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_VERTEX));
+   brw->gs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_GEOMETRY));
+   brw->fs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_FRAGMENT));
+}
+
 GLboolean
 brwCreateContext(gl_api api,
 const struct gl_config *mesaVis,
@@ -755,6 +775,10 @@ brwCreateContext(gl_api api,
brw-must_use_separate_stencil = screen-hw_must_use_separate_stencil;
brw-has_swizzling = screen-hw_has_swizzling;
 
+   brw_process_intel_gather_variable(brw);
+   brw->has_resource_streamer = brw->is_haswell &&
+      (brw->vs_ubo_gather ||  brw->gs_ubo_gather ||  brw->fs_ubo_gather);
+
brw-vs.base.stage = MESA_SHADER_VERTEX;

[Mesa-dev] [PATCH 21/27] i965/vec4: Append uniform entries to the gather table

2015-04-28 Thread Abdiel Janulgue
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 12 
 src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
 2 files changed, 13 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index c4c77b2..799d79e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -571,6 +571,17 @@ vec4_visitor::pack_uniform_registers()
}
 }
 
+void
+vec4_visitor::generate_gather_table()
+{
+   int num_consts = ALIGN(stage_prog_data->nr_params, 4) / 4;
+   for (int i = 0; i < num_consts; i++) {
+      int p = stage_prog_data->nr_gather_table++;
+      stage_prog_data->gather_table[p].reg = -1;
+      stage_prog_data->gather_table[p].channel_mask = 0xf;
+   }
+}
+
 /**
  * Does algebraic optimizations (0 * a = 0, 1 * a = a, a + 0 = a).
  *
@@ -1757,6 +1768,7 @@ vec4_visitor::run()
   return false;
 
setup_payload();
+   generate_gather_table();
 
if (false) {
   /* Debug of register spilling: Go spill everything. */
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index a0ee2cc..e58ea25 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -212,6 +212,7 @@ public:
bool is_dep_ctrl_unsafe(const vec4_instruction *inst);
void opt_set_dependency_control();
void opt_schedule_instructions();
+   void generate_gather_table();
 
vec4_instruction *emit(vec4_instruction *inst);
 
-- 
1.9.1



[Mesa-dev] [PATCH 13/27] i965: Assign hw-binding table index for uniform constant buffer block

2015-04-28 Thread Abdiel Janulgue
Assign the uploaded uniform block with hardware binding table indices.
This is indexed by the resource streamer to fetch the constant buffers
referred to by our gather table entries.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/gen6_vs_state.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
b/src/mesa/drivers/dri/i965/gen6_vs_state.c
index 7325c6e..bce597f 100644
--- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
@@ -72,9 +72,16 @@ gen6_upload_push_constants(struct brw_context *brw,
   gl_constant_value *param;
   int i;
 
-      param = brw_state_batch(brw, type,
-                              prog_data->nr_params * sizeof(gl_constant_value),
+      uint32_t size = prog_data->nr_params * sizeof(gl_constant_value);
+      param = brw_state_batch(brw, type, size,
                               32, &stage_state->push_const_offset);
+      if (brw->gather_pool.bo != NULL) {
+         uint32_t surf_offset = 0;
+         brw_create_constant_surface(brw, brw->batch.bo, stage_state->push_const_offset,
+                                     size, &surf_offset, false);
+         gen7_update_binding_table(brw, stage_state->stage, BRW_UNIFORM_GATHER_INDEX_START,
+                                   surf_offset);
+  }
 
   STATIC_ASSERT(sizeof(gl_constant_value) == sizeof(float));
 
-- 
1.9.1



[Mesa-dev] [PATCH 26/27] i965: Disable gather push constants for null constants

2015-04-28 Thread Abdiel Janulgue
Programming null constants with gather constant tables seems to
be unsupported and results in a GPU lockup, even with the prescribed
GPU workarounds in the bspec. Trial and error showed that disabling
HW gather constants whenever the constant state for a stage needs to
be nullified is the only way to work around the issue.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/gen7_disable.c  |  4 
 src/mesa/drivers/dri/i965/gen7_vs_state.c | 11 +++
 2 files changed, 15 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_disable.c 
b/src/mesa/drivers/dri/i965/gen7_disable.c
index 2c43cd7..ba7fbf8 100644
--- a/src/mesa/drivers/dri/i965/gen7_disable.c
+++ b/src/mesa/drivers/dri/i965/gen7_disable.c
@@ -29,6 +29,8 @@
 static void
 disable_stages(struct brw_context *brw)
 {
+   gen7_toggle_gather_constants(brw, false);
+
/* Disable the HS Unit */
BEGIN_BATCH(7);
    OUT_BATCH(_3DSTATE_CONSTANT_HS << 16 | (7 - 2));
@@ -87,6 +89,8 @@ disable_stages(struct brw_context *brw)
    OUT_BATCH(_3DSTATE_BINDING_TABLE_POINTERS_DS << 16 | (2 - 2));
OUT_BATCH(0);
ADVANCE_BATCH();
+
+   gen7_toggle_gather_constants(brw, true);
 }
 
 const struct brw_tracked_state gen7_disable_stages = {
diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c 
b/src/mesa/drivers/dri/i965/gen7_vs_state.c
index adfaa59..f5e77ed 100644
--- a/src/mesa/drivers/dri/i965/gen7_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c
@@ -85,6 +85,13 @@ gen7_upload_constant_state(struct brw_context *brw,
int const_loc = use_gather ? 16 : 0;
    int dwords = brw->gen >= 8 ? 11 : 7;
 
+   /* Disable gather constants when zeroing constant states */
+   bool gather_switched_off = false;
+   if (use_gather && !active) {
+      gen7_toggle_gather_constants(brw, false);
+      gather_switched_off = true;
+   }
+
    struct brw_stage_prog_data *prog_data = stage_state->prog_data;
    if (prog_data && use_gather && active) {
       gen7_submit_gather_table(brw, stage_state, prog_data, gather_opcode);
@@ -115,6 +122,10 @@ gen7_upload_constant_state(struct brw_context *brw,
 
ADVANCE_BATCH();
 
+   /* Re-enable gather again if required */
+   if (gather_switched_off)
+  gen7_toggle_gather_constants(brw, true);
+
   /* On SKL+ the new constants don't take effect until the next corresponding
* 3DSTATE_BINDING_TABLE_POINTER_* command is parsed so we need to ensure
* that is sent
-- 
1.9.1



[Mesa-dev] [PATCH 17/27] i965/fs: Append uniform entries to the gather table

2015-04-28 Thread Abdiel Janulgue
Also generate the gather mask constant entries from our uniform data.
Data generated here will later be packed together with UBO constants.
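
The per-entry mask bookkeeping can be sketched on its own as below. This
is an illustration only: build_uniform_channel_masks() and ALIGN_UP are
hypothetical names, and the sketch indexes by p / 4, which is assumed to
be equivalent to the (ALIGN(nr_params, 4) / 4) - 1 index used in the
patch after nr_params has been incremented.

```c
#include <stdint.h>

/* Round v up to the next multiple of a (hypothetical stand-in for ALIGN). */
#define ALIGN_UP(v, a) (((v) + (a) - 1) / (a) * (a))

/* Fill masks[] with one 4-bit channel mask per group of four 32-bit
 * params and return the number of gather entries produced. The caller
 * must provide room for ALIGN_UP(nr_params, 4) / 4 masks.
 */
static unsigned
build_uniform_channel_masks(unsigned nr_params, uint32_t *masks)
{
   unsigned num_consts = ALIGN_UP(nr_params, 4) / 4;

   for (unsigned i = 0; i < num_consts; i++)
      masks[i] = 0;

   /* Param p lives in entry p / 4, channel p % 4. */
   for (unsigned p = 0; p < nr_params; p++)
      masks[p / 4] |= 1u << (p % 4);

   return num_consts;
}
```

With six live params, for example, the first entry has all four channels
enabled and the second only its lowest two, so the partially filled last
entry does not request channels that carry no data.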

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 7cc88ea..071ac59 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2272,7 +2272,10 @@ fs_visitor::assign_constant_locations()
   }
}
 
-   stage_prog_data->nr_params = num_push_constants;
+   stage_prog_data->nr_params = 0;
+
+   unsigned const_reg_access[uniforms];
+   memset(const_reg_access, 0, sizeof(const_reg_access));
 
/* Up until now, the param[] array has been indexed by reg + reg_offset
 * of UNIFORM registers.  Condense it to only contain the uniforms we
@@ -2286,6 +2289,18 @@ fs_visitor::assign_constant_locations()
 
       assert(remapped <= (int)i);
       stage_prog_data->param[remapped] = stage_prog_data->param[i];
+      int p = stage_prog_data->nr_params++;
+
+      /* access table for uniform registers*/
+      const_reg_access[(ALIGN(prog_data->nr_params, 4) / 4) - 1] |= (1 << (p % 4));
+   }
+
+   int num_consts = ALIGN(prog_data->nr_params, 4) / 4;
+   for (int i = 0; i < num_consts; i++) {
+      int p = stage_prog_data->nr_gather_table++;
+      stage_prog_data->gather_table[p].reg = -1;
+      stage_prog_data->gather_table[p].channel_mask =
+         const_reg_access[i];
}
 }
 
-- 
1.9.1



[Mesa-dev] [PATCH 19/27] i965/fs/nir: Append nir_intrinsic_load_ubo entries to the gather table

2015-04-28 Thread Abdiel Janulgue
Do this when the const block and offset are immediate values. Otherwise
just fall back to the previous method of uploading the UBO constant data
to GRF using pull constants.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  2 ++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 59 
 2 files changed, 61 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index a48b2bb..5247fa1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -418,6 +418,8 @@ public:
void setup_builtin_uniform_values(ir_variable *ir);
int implied_mrf_writes(fs_inst *inst);
bool generate_ubo_gather_table(ir_expression* ir);
+   bool nir_generate_ubo_gather_table(nir_intrinsic_instr *instr, fs_reg dest,
+  bool has_indirect);
 
virtual void dump_instructions();
virtual void dump_instructions(const char *name);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 3972581..b68f221 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1377,6 +1377,9 @@ fs_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
   has_indirect = true;
   /* fallthrough */
case nir_intrinsic_load_ubo: {
+  if (nir_generate_ubo_gather_table(instr, dest, has_indirect))
+ break;
+
   nir_const_value *const_index = nir_src_as_const_value(instr-src[0]);
   fs_reg surf_index;
 
@@ -1774,3 +1777,59 @@ fs_visitor::nir_emit_jump(nir_jump_instr *instr)
   unreachable(unknown jump);
}
 }
+
+bool
+fs_visitor::nir_generate_ubo_gather_table(nir_intrinsic_instr *instr, fs_reg dest,
+                                          bool has_indirect)
+{
+   nir_const_value *const_index = nir_src_as_const_value(instr->src[0]);
+
+   if (!const_index || has_indirect || !brw->fs_ubo_gather || !brw->has_resource_streamer)
+      return false;
+
+   /* Only allow 16 registers (128 uniform components) as push constants.
+    */
+   unsigned int max_push_components = 16 * 8;
+   unsigned param_index = uniforms + ubo_uniforms;
+   if ((MAX2(param_index, num_direct_uniforms) +
+        instr->num_components) > max_push_components)
+      return false;
+
+   fs_reg uniform_reg;
+   if (dispatch_width == 16) {
+      for (int i = 0; i < (int) this->nr_ubo_gather_table; i++) {
+         if ((this->ubo_gather_table[i].const_block ==
+              const_index->u[0]) &&
+             (this->ubo_gather_table[i].const_offset ==
+              (unsigned) instr->const_index[0])) {
+            uniform_reg = fs_reg(UNIFORM, this->ubo_gather_table[i].reg);
+            break;
+         }
+      }
+      if (uniform_reg.file != UNIFORM) {
+         /* Unlikely but this means that SIMD8 wasn't able to allocate push constant
+          * registers for this ubo load. Fall back to pull-constant method.
+          */
+         return false;
+      }
+   }
+
+   if (uniform_reg.file != UNIFORM) {
+      uniform_reg = fs_reg(UNIFORM, param_index);
+      int gather = this->nr_ubo_gather_table++;
+
+      assert(instr->num_components <= 4);
+      ubo_uniforms += instr->num_components;
+      this->ubo_gather_table[gather].reg = uniform_reg.reg;
+      this->ubo_gather_table[gather].const_block = const_index->u[0];
+      this->ubo_gather_table[gather].const_offset = instr->const_index[0];
+   }
+
+   for (unsigned j = 0; j < instr->num_components; j++) {
+  fs_reg src = offset(retype(uniform_reg, dest.type), j);
+  emit(MOV(dest, src));
+  dest = offset(dest, 1);
+   }
+
+   return true;
+}
-- 
1.9.1



[Mesa-dev] [PATCH 10/27] i965: Allocate space on the gather pool for UBO entries

2015-04-28 Thread Abdiel Janulgue
If there are UBO constant entries, append them to stage_state->push_const_size.
The gather pool contains the combined entries of both ordinary uniforms
and UBO constants.
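
The sizing arithmetic used for the gather pool and the push constant
length can be checked with a small standalone sketch. The function names
and ALIGN_UP below are hypothetical; the formulas mirror the ones in
this patch, assuming four 32-bit params per 128-bit gather entry, pool
space handed out in 512-bit (64-byte) chunks, and a push constant length
counted in 8-dword registers.

```c
#include <stdint.h>

/* Round v up to the next multiple of a (hypothetical stand-in for ALIGN). */
#define ALIGN_UP(v, a) (((v) + (a) - 1) / (a) * (a))

/* Bytes of gather pool space needed for nr_params 32-bit constants. */
static uint32_t
gather_pool_bytes(uint32_t nr_params)
{
   /* Four params per 128-bit gather entry. */
   uint32_t num_consts = ALIGN_UP(nr_params, 4) / 4;

   /* Four entries per 512-bit chunk, 64 bytes per chunk. */
   return (ALIGN_UP(num_consts, 4) / 4) * 64;
}

/* Push constant length in registers of eight dwords each. */
static uint32_t
push_const_regs(uint32_t nr_params, uint32_t nr_ubo_params)
{
   return ALIGN_UP(nr_params + nr_ubo_params, 8) / 8;
}
```

Because uniforms and UBO params are rounded up separately for the pool
but together for the register count, a stage with 5 uniform params and
4 UBO params would reserve two 64-byte chunks yet read only two push
constant registers.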

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/gen6_vs_state.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
b/src/mesa/drivers/dri/i965/gen6_vs_state.c
index aebaa49..7325c6e 100644
--- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
@@ -59,7 +59,9 @@ gen6_upload_push_constants(struct brw_context *brw,
    struct gl_context *ctx = &brw->ctx;
 
    if (prog_data->nr_params == 0) {
-      stage_state->push_const_size = 0;
+      if (prog_data->nr_ubo_params == 0) {
+         stage_state->push_const_size = 0;
+      }
} else {
   /* Updates the ParamaterValues[i] pointers for all parameters of the
* basic type of PROGRAM_STATE_VAR.
@@ -122,10 +124,24 @@ gen6_upload_push_constants(struct brw_context *brw,
}
    /* Allocate gather pool space for uniform and UBO entries in 512-bit chunks*/
    if (brw->gather_pool.bo != NULL) {
+      unsigned gather_pool_next_offset = brw->gather_pool.next_offset;
+
       if (prog_data->nr_params > 0) {
          int num_consts = ALIGN(prog_data->nr_params, 4) / 4;
+         gather_pool_next_offset += (ALIGN(num_consts, 4) / 4) * 64;
+      }
+
+      if (prog_data->nr_ubo_params > 0) {
+         stage_state->push_const_size = ALIGN(prog_data->nr_params + prog_data->nr_ubo_params, 8) / 8;
+         uint32_t num_constants = ALIGN(prog_data->nr_ubo_params, 4) / 4;
+         gather_pool_next_offset += (ALIGN(num_constants, 4) / 4) * 64;
+      }
+
+      if (gather_pool_next_offset > brw->gather_pool.next_offset) {
          stage_state->push_const_offset = brw->gather_pool.next_offset;
-         brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
+         brw->gather_pool.next_offset = gather_pool_next_offset;
+         assert(brw->gather_pool.next_offset < brw->gather_pool.bo->size);
+         assert(stage_state->push_const_offset < brw->gather_pool.next_offset);
   }
}
 }
-- 
1.9.1



[Mesa-dev] [PATCH 25/27] i965: Program the push constants state using the gather table

2015-04-28 Thread Abdiel Janulgue
Use the gather table generated from the uniform uploads and
ir_binop_ubo_load to gather and pack the constants to the gather pool.

Note that the 3DSTATE_CONSTANT_* packet now refers to the gather
pool generated by the resource streamer instead of the constant buffer
pointed to by an offset of the dynamic state base address.
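
How the constant buffer valid bits are derived can be sketched as
follows. This is a guess at the intent, not a copy of the patch: the
value of BRW_UBO_GATHER_INDEX_APPEND is not visible in this message, so
the sketch uses a placeholder of 1, and the 16-bit width of the valid
field is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: placeholder for BRW_UBO_GATHER_INDEX_APPEND, whose real
 * value is defined elsewhere in the series.
 */
#define SKETCH_UBO_GATHER_INDEX_APPEND 1

static uint32_t
gather_buffer_valid_mask(uint32_t nr_ubo_params, uint32_t max_ubo_const_block)
{
   /* Slot 0 always holds the ordinary uniforms. */
   uint32_t cb_valid = 1;

   /* Mark every slot up to and including the highest UBO block as valid. */
   if (nr_ubo_params > 0)
      cb_valid |= (2u << (SKETCH_UBO_GATHER_INDEX_APPEND + max_ubo_const_block)) - 1;

   /* Assumption: the valid field holds 16 slots. */
   assert(cb_valid < 0x10000);
   return cb_valid;
}
```

The mask is contiguous by construction, so a shader that only touches
UBO block 2 still marks the slots for blocks 0 and 1 as valid.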

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_state.h |  2 +-
 src/mesa/drivers/dri/i965/gen6_gs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen6_vs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen6_wm_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_vs_state.c | 62 +--
 5 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 342157d..6536085 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -296,7 +296,7 @@ brw_upload_pull_constants(struct brw_context *brw,
 void
 gen7_upload_constant_state(struct brw_context *brw,
const struct brw_stage_state *stage_state,
-   bool active, unsigned opcode);
+   bool active, unsigned opcode, unsigned gather_op);
 
 /* gen7_misc_state.c */
 void gen7_rs_control(struct brw_context *brw, int enable);
diff --git a/src/mesa/drivers/dri/i965/gen6_gs_state.c 
b/src/mesa/drivers/dri/i965/gen6_gs_state.c
index eb4c586..79a899e 100644
--- a/src/mesa/drivers/dri/i965/gen6_gs_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_gs_state.c
@@ -48,7 +48,7 @@ gen6_upload_gs_push_constants(struct brw_context *brw)
}
 
    if (brw->gen >= 7)
-      gen7_upload_constant_state(brw, stage_state, gp, _3DSTATE_CONSTANT_GS);
+      gen7_upload_constant_state(brw, stage_state, gp, _3DSTATE_CONSTANT_GS, _3DSTATE_GATHER_CONSTANT_GS);
 }
 
 const struct brw_tracked_state gen6_gs_push_constants = {
diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
b/src/mesa/drivers/dri/i965/gen6_vs_state.c
index bce597f..025cef7 100644
--- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
@@ -172,7 +172,7 @@ gen6_upload_vs_push_constants(struct brw_context *brw)
  gen7_emit_vs_workaround_flush(brw);
 
   gen7_upload_constant_state(brw, stage_state, true /* active */,
- _3DSTATE_CONSTANT_VS);
+                                 _3DSTATE_CONSTANT_VS, _3DSTATE_GATHER_CONSTANT_VS);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c 
b/src/mesa/drivers/dri/i965/gen6_wm_state.c
index 8e673a4..798399e 100644
--- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
@@ -50,7 +50,7 @@ gen6_upload_wm_push_constants(struct brw_context *brw)
 
    if (brw->gen >= 7) {
       gen7_upload_constant_state(brw, &brw->wm.base, true,
-                                 _3DSTATE_CONSTANT_PS);
+                                 _3DSTATE_CONSTANT_PS, _3DSTATE_GATHER_CONSTANT_PS);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c 
b/src/mesa/drivers/dri/i965/gen7_vs_state.c
index 278b3ec..adfaa59 100644
--- a/src/mesa/drivers/dri/i965/gen7_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c
@@ -28,28 +28,82 @@
 #include "program/prog_parameter.h"
 #include "program/prog_statevars.h"
 #include "intel_batchbuffer.h"
+#include "glsl/glsl_parser_extras.h"
 
+static void
+gen7_submit_gather_table(struct brw_context* brw,
+                         const struct brw_stage_state *stage_state,
+                         const struct brw_stage_prog_data *prog_data,
+                         unsigned gather_opcode)
+{
+   uint32_t gather_dwords = 3 + prog_data->nr_gather_table;
+
+   /* Ordinary uniforms are assigned to the first constant buffer slot */
+   unsigned cb_valid = 1;
+   /* Assign subsequent constant buffer slots to UBOs if any */
+   cb_valid |= (prog_data->nr_ubo_params > 0) ?
+      (2 << (BRW_UBO_GATHER_INDEX_APPEND + prog_data->max_ubo_const_block)) - 1 : 0;
+
+   assert(cb_valid  0x);
+
+   BEGIN_BATCH(gather_dwords);
+   OUT_BATCH(gather_opcode << 16 | (gather_dwords - 2));
+   OUT_BATCH(SET_FIELD(cb_valid, BRW_GATHER_BUFFER_VALID) |
+             SET_FIELD(BRW_UNIFORM_GATHER_INDEX_START / 16, BRW_GATHER_BINDING_TABLE_BLOCK));
+   OUT_BATCH(stage_state->push_const_offset);
+   for (int i = 0; i < prog_data->nr_gather_table; i++) {
+      /* Which bo are we referring to? The uniform constant buffer or
+       * the UBO block?
+       */
+      bool is_uniform = prog_data->gather_table[i].reg == -1;
+      int cb_offset = is_uniform ? i : (prog_data->gather_table[i].const_offset / 16);
+      int bt_offset = is_uniform ? 0 :
+         (prog_data->gather_table[i].const_block + BRW_UBO_GATHER_INDEX_APPEND);
+
+      assert(cb_offset < 256);
+      assert(bt_offset < 16);
+
+      OUT_BATCH(SET_FIELD(cb_offset, BRW_GATHER_CONST_BUFFER_OFFSET) |
+

[Mesa-dev] [PATCH 05/27] i965: Upload binding tables in hw-generated binding table format.

2015-04-28 Thread Abdiel Janulgue
When hardware-generated binding tables are enabled, use the hw-generated
binding table format when uploading binding table state.

Normally, the CS will just consume the binding table pointer commands
as pipelined state. When the RS is enabled however, the RS flushes whatever
edited surface state entries of our on-chip binding table to the binding
table pool before passing the command on to the CS.

Note that the binding table pointer offset is relative to the binding table
pool base address when the resource streamer is enabled, instead of the
surface state base address.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index 70b8751..c1d188e 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -75,7 +75,12 @@ brw_upload_binding_table(struct brw_context *brw,
  return;
 
       stage_state->bind_bo_offset = 0;
-   } else {
+   }
+
+   /* If resource streamer is enabled, skip manual binding table upload */
+   if (!brw->hw_bt_pool.bo) {
+      /* CACHE_NEW_*_PROG */
+
       /* Upload a new binding table. */
       if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
          brw->vtbl.emit_buffer_surface_state(
@@ -92,15 +97,26 @@ brw_upload_binding_table(struct brw_context *brw,
       /* BRW_NEW_SURFACES and BRW_NEW_*_CONSTBUF */
       memcpy(bind, stage_state->surf_offset,
              prog_data->binding_table.size_bytes);
+   } else {
+      gen7_update_binding_table_from_array(brw, stage_state->stage,
+                                           stage_state->surf_offset,
+                                           prog_data->binding_table.size_bytes / 4);
    }
 
    brw->ctx.NewDriverState |= brw_new_binding_table;
 
    if (brw->gen >= 7) {
+
+      if (brw->has_resource_streamer)
+         stage_state->bind_bo_offset = brw->hw_bt_pool.next_offset;
+
       BEGIN_BATCH(2);
       OUT_BATCH(packet_name << 16 | (2 - 2));
       OUT_BATCH(stage_state->bind_bo_offset);
       ADVANCE_BATCH();
+
+      if (brw->has_resource_streamer)
+         brw->hw_bt_pool.next_offset += ALIGN(prog_data->binding_table.size_bytes, 64);
    }
 }
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
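
The offset bookkeeping in the patch above can be sketched in isolation. This is a minimal illustration, assuming only what the diff shows: each binding table takes the pool's current next_offset as its pointer, and the pool then advances by the table size rounded up to 64 bytes. The names bt_pool, align_u32 and bt_pool_reserve are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's hw_bt_pool bookkeeping. */
struct bt_pool {
   uint32_t next_offset;   /* next free byte, relative to the pool base */
};

/* Round value up to a power-of-two alignment. */
static uint32_t
align_u32(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Reserve room for one binding table and return the pool-relative offset
 * that would be written into the binding table pointer packet. */
static uint32_t
bt_pool_reserve(struct bt_pool *pool, uint32_t size_bytes)
{
   uint32_t offset = pool->next_offset;

   pool->next_offset += align_u32(size_bytes, 64);
   return offset;
}
```

Because the returned offset is relative to the pool base, it is only meaningful while the resource streamer path is active; the non-RS path keeps using offsets relative to the surface state base address, as the commit message notes.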


[Mesa-dev] [PATCH 14/27] nir: Add glsl_get_element_type() wrapper.

2015-04-28 Thread Abdiel Janulgue
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/nir/nir_types.cpp | 5 +
 src/glsl/nir/nir_types.h   | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp
index f0d0b46..249678f 100644
--- a/src/glsl/nir/nir_types.cpp
+++ b/src/glsl/nir/nir_types.cpp
@@ -82,6 +82,11 @@ glsl_get_base_type(const struct glsl_type *type)
    return type->base_type;
 }
 
+const struct glsl_type *
+glsl_get_element_type(const struct glsl_type *type)
+{
+   return type->element_type();
+}
 unsigned
 glsl_get_vector_elements(const struct glsl_type *type)
 {
diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h
index 276d4ad..125f075 100644
--- a/src/glsl/nir/nir_types.h
+++ b/src/glsl/nir/nir_types.h
@@ -49,6 +49,8 @@ const struct glsl_type *glsl_get_array_element(const struct 
glsl_type *type);
 
 const struct glsl_type *glsl_get_column_type(const struct glsl_type *type);
 
+const struct glsl_type *glsl_get_element_type(const struct glsl_type *type);
+
 enum glsl_base_type glsl_get_base_type(const struct glsl_type *type);
 
 unsigned glsl_get_vector_elements(const struct glsl_type *type);
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/27] i965: Disable resource streamer in BLORP

2015-04-28 Thread Abdiel Janulgue
Switch off hardware-generated binding tables and gather push
constants in the blorp. Blorp requires only a minimal set of
simple constants. There is no need for the extra complexity
to program a gather table entry into the pipeline.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/gen7_blorp.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp 
b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index fb6a0dd..bc428e6 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -821,6 +821,7 @@ gen7_blorp_exec(struct brw_context *brw,
depthstencil_offset = gen6_blorp_emit_depth_stencil_state(brw, params);
gen7_blorp_emit_depth_stencil_state_pointers(brw, params,
 depthstencil_offset);
+   gen7_disable_hw_binding_tables(brw);
    if (params->use_wm_prog) {
   uint32_t wm_surf_offset_renderbuffer;
   uint32_t wm_surf_offset_texture = 0;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 27/27] i965: Enable push constants for UBOs

2015-04-28 Thread Abdiel Janulgue
Switches on push constants whenever we have UBO entries.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/gen7_wm_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_state.c
index 923414e..1dfe697 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
@@ -152,7 +152,7 @@ upload_ps_state(struct brw_context *brw)
 
    dw4 |= (brw->max_wm_threads - 1) << max_threads_shift;
 
-   if (prog_data->base.nr_params > 0)
+   if (prog_data->base.nr_params > 0 || prog_data->base.nr_ubo_params > 0)
   dw4 |= GEN7_PS_PUSH_CONSTANT_ENABLE;
 
/* From the IVB PRM, volume 2 part 1, page 287:
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] st/mesa: adjust blending modes if we don't have destination alpha

2015-04-28 Thread Brian Paul

On 04/28/2015 04:35 PM, Ilia Mirkin wrote:

On Tue, Apr 28, 2015 at 6:26 PM, Ilia Mirkin imir...@alum.mit.edu wrote:

Reviewed-by: Ilia Mirkin imir...@alum.mit.edu


Actually



Awesome! Now I can go remove this set of hacks from freedreno. And
this fixes the same issue in nouveau. Thanks for doing it the real way
:)

On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:

If the user requested a GL_RGB texture but the driver actually allocated
an RGBA texture, the alpha values in the texture may not be defined.

If we later bind the texture as a color target and try to blend into
it with GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA we may blend with
undefined alpha values when, in fact, the dest alpha value should be one.
So replace GL_DST_ALPHA/GL_ONE_MINUS_DST_ALPHA with GL_ONE/GL_ZERO.

Fixes the piglit fbo-blending-formats test for some GL_RGB formats
with the VMware driver.  Also tested with llvmpipe.
---
  src/mesa/state_tracker/st_atom_blend.c | 38 +-
  1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c 
b/src/mesa/state_tracker/st_atom_blend.c
index 6bb4077..30bff7a 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -44,10 +44,21 @@
  /**
   * Convert GLenum blend tokens to pipe tokens.
   * Both blend factors and blend funcs are accepted.
+ * \param destBaseFormat  the base format of the render target, such as
+ *GL_RGBA, GL_RGB, GL_RED, GL_ALPHA, etc.
   */
  static GLuint
-translate_blend(GLenum blend)
+translate_blend(GLenum blend, GLenum destBaseFormat)
  {
+   /* If we don't have destination alpha and the blend factor is either
+* GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA then we use
+* PIPE_BLENDFACTOR_ONE or _ZERO instead.
+*/
+   const bool haveDstA = (destBaseFormat == GL_RGBA ||
+  destBaseFormat == GL_ALPHA ||
+  destBaseFormat == GL_INTENSITY ||
+  destBaseFormat == GL_LUMINANCE_ALPHA);
+
 switch (blend) {
 /* blend functions */
 case GL_FUNC_ADD:
@@ -69,7 +80,7 @@ translate_blend(GLenum blend)
 case GL_SRC_ALPHA:
return PIPE_BLENDFACTOR_SRC_ALPHA;
 case GL_DST_ALPHA:
-  return PIPE_BLENDFACTOR_DST_ALPHA;
+  return haveDstA ? PIPE_BLENDFACTOR_DST_ALPHA : PIPE_BLENDFACTOR_ONE;
 case GL_DST_COLOR:
return PIPE_BLENDFACTOR_DST_COLOR;
 case GL_SRC_ALPHA_SATURATE:
@@ -91,7 +102,7 @@ translate_blend(GLenum blend)
 case GL_ONE_MINUS_DST_COLOR:
return PIPE_BLENDFACTOR_INV_DST_COLOR;
 case GL_ONE_MINUS_DST_ALPHA:
-  return PIPE_BLENDFACTOR_INV_DST_ALPHA;
+  return haveDstA ? PIPE_BLENDFACTOR_INV_DST_ALPHA : PIPE_BLENDFACTOR_ZERO;
 case GL_ONE_MINUS_CONSTANT_COLOR:
return PIPE_BLENDFACTOR_INV_CONST_COLOR;
 case GL_ONE_MINUS_CONSTANT_ALPHA:
@@ -208,14 +219,21 @@ update_blend( struct st_context *st )
    else if (ctx->Color.BlendEnabled) {
       /* blending enabled */
       for (i = 0, j = 0; i < num_state; i++) {
+         const struct gl_renderbuffer *rb;
+         GLenum baseFormat;

          blend->rt[i].blend_enable = (ctx->Color.BlendEnabled >> i) & 0x1;

          if (ctx->Extensions.ARB_draw_buffers_blend)
             j = i;

+         /* _NEW_BUFFERS */
+         /* Get the base format of the render target */
+         rb = ctx->DrawBuffer->_ColorDrawBuffers[j];


That's the wrong render target, no? You need the i'th render target.


Ah, I think you're right.

BTW, I think there's more i/j mix-ups in this code (independent of this 
patch).  I'll send a separate patch for that after I check the specs.




And what happens if I'm not using independent blend but one of the
RT's is RGB while the other is RGBA?


Yeah, there's no simple fix for that, AFAIK.





   -ilia


+         baseFormat = rb ? rb->_BaseFormat : GL_RGBA;
+
          blend->rt[i].rgb_func =
-            translate_blend(ctx->Color.Blend[j].EquationRGB);
+            translate_blend(ctx->Color.Blend[j].EquationRGB, baseFormat);

          if (ctx->Color.Blend[i].EquationRGB == GL_MIN ||
              ctx->Color.Blend[i].EquationRGB == GL_MAX) {
@@ -225,13 +243,13 @@ update_blend( struct st_context *st )
          }
          else {
             blend->rt[i].rgb_src_factor =
-               translate_blend(ctx->Color.Blend[j].SrcRGB);
+               translate_blend(ctx->Color.Blend[j].SrcRGB, baseFormat);
             blend->rt[i].rgb_dst_factor =
-               translate_blend(ctx->Color.Blend[j].DstRGB);
+               translate_blend(ctx->Color.Blend[j].DstRGB, baseFormat);
          }

          blend->rt[i].alpha_func =
-            translate_blend(ctx->Color.Blend[j].EquationA);
+            translate_blend(ctx->Color.Blend[j].EquationA, baseFormat);

          if (ctx->Color.Blend[i].EquationA == GL_MIN ||
              ctx->Color.Blend[i].EquationA == GL_MAX) {
@@ -241,9 +259,9 

Re: [Mesa-dev] [PATCH 5/7] i965: use _mesa_geometry_width/height/layers/samples for programming geometry of framebuffer to GEN

2015-04-28 Thread Rogovin, Kevin
Hello,


 No, because the non-shared code is (by your own admission) untested and/or 
 dead code.  Untested code is broken code.  I would personally be ok with a 
 lot of the changes that just replace fb->Width with
 _mesa_geometric_width(fb) since it's effectively just replacing a direct 
 access with a getter.  However, almost half of the patch is updating the 
 upload_sf_vp function which is only used for gen <= 5.  A comment or assert 
 there would be sufficient rather than reworking it.

Fair enough. Would the following be good:
 - keep all those that replace fb->whatever with _mesa_geometric_whatever,
 - instead of the ick I have done to upload_sf_vp, place a big comment warning

I would be happy with the above  as it addresses my main concern and the 
dead-is-broken code concern as well. If I had physical access to a Gen4 and 5 
box I would test it and if it worked, enable the extension on those platforms 
as well.

-Kevin
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] i965: use _mesa_geometry_width/height/layers/samples for programming geometry of framebuffer to GEN

2015-04-28 Thread Jason Ekstrand
On Tue, Apr 28, 2015 at 6:17 PM, Rogovin, Kevin kevin.rogo...@intel.com wrote:
 Hello,


 No, because the non-shared code is (by your own admission) untested and/or 
 dead code.  Untested code is broken code.  I would personally be ok with a 
 lot of the changes that just replace fb->Width with
 _mesa_geometric_width(fb) since it's effectively just replacing a direct 
 access with a getter.  However, almost half of the patch is updating the 
 upload_sf_vp function which is only used for gen <= 5.  A comment or 
 assert there would be sufficient rather than reworking it.

 Fair enough. Would the following be good:
  - keep all those that replace fb->whatever with _mesa_geometric_whatever,
  - instead of the ick I have done to upload_sf_vp, place a big comment warning

Yes, I think that would be sufficient.

 I would be happy with the above  as it addresses my main concern and the 
 dead-is-broken code concern as well. If I had physical access to a Gen4 and 5 
 box I would test it and if it worked, enable the extension on those platforms 
 as well.

I don't think there's any good reason to turn it on for Gen5 or older.
However, we should still test it because it does touch code that hits
those platforms.  Testing across all the hardware can be done fairly
easily by pushing to our Jenkins system.  I know that Martin and Topi
(and probably Curro) have accounts and could run it easily enough.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
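
The "direct access replaced with a getter" change discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not the Mesa implementation: it assumes the getter returns the attachment-derived size when the framebuffer has attachments and an application-supplied default size otherwise, which is the behaviour ARB_framebuffer_no_attachments describes. The struct fb_sketch and its field names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified framebuffer: only the fields this sketch needs. */
struct fb_sketch {
   bool has_attachments;      /* any color/depth/stencil attachment bound? */
   unsigned width;            /* size derived from the attachments */
   unsigned height;
   unsigned default_width;    /* size used when nothing is attached */
   unsigned default_height;
};

/* Getter that callers use instead of reading fb->width directly. */
static unsigned
fb_geometric_width(const struct fb_sketch *fb)
{
   assert(fb != NULL);
   return fb->has_attachments ? fb->width : fb->default_width;
}

static unsigned
fb_geometric_height(const struct fb_sketch *fb)
{
   assert(fb != NULL);
   return fb->has_attachments ? fb->height : fb->default_height;
}
```

For a framebuffer that always has attachments, which is the only case on hardware where the extension is not exposed, the getter returns the same value as the direct field access. That is why the mechanical replacement is considered low-risk here, while the reworked upload_sf_vp logic is not.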


Re: [Mesa-dev] [PATCH 2/2] st/mesa: also try PIPE_FORMAT_R10G10B10A2_UNORM for GL_RGB10

2015-04-28 Thread Ilia Mirkin
Yes, sorry, thought that was implied since I had given it earlier
pending my (as it turns out, incorrect) suggestion.

Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

On Tue, Apr 28, 2015 at 8:20 PM, Brian Paul bri...@vmware.com wrote:
 R-b?

 -Brian


 On 04/28/2015 05:47 PM, Ilia Mirkin wrote:

 That's... really asymmetrical. There's a
 PIPE_FORMAT_R10G10B10X2_SNORM. Oh well -- no reason to add it in just
 for this.

 On Tue, Apr 28, 2015 at 7:47 PM, Brian Paul bri...@vmware.com wrote:

 Unless I'm not seeing it, there is no such gallium format.

 -Brian


 On 04/28/2015 04:22 PM, Ilia Mirkin wrote:


 Presumably you should also include RGB10_X2 while you're at it?

 With that, Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

 On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:


 ---
src/mesa/state_tracker/st_format.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/mesa/state_tracker/st_format.c
 b/src/mesa/state_tracker/st_format.c
 index 181465d..db7b5b7 100644
 --- a/src/mesa/state_tracker/st_format.c
 +++ b/src/mesa/state_tracker/st_format.c
 @@ -991,7 +991,7 @@ static const struct format_mapping format_map[] = {
   {
  { GL_RGB10, 0 },
  { PIPE_FORMAT_B10G10R10X2_UNORM,
 PIPE_FORMAT_B10G10R10A2_UNORM,
 -DEFAULT_RGB_FORMATS }
 +PIPE_FORMAT_R10G10B10A2_UNORM, DEFAULT_RGB_FORMATS }
   },
   {
  { GL_RGB10_A2, 0 },
 --
 1.9.1

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org


 http://lists.freedesktop.org/mailman/listinfo/mesa-dev




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] st/mesa: also try PIPE_FORMAT_R10G10B10A2_UNORM for GL_RGB10

2015-04-28 Thread Matt Turner
On Tue, Apr 28, 2015 at 5:23 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Yes, sorry, thought that was implied since I had given it earlier
 pending my (as it turns out, incorrect) suggestion.

A number of people have been confused (rightly so) by LGTM implying
Reviewed-by. Let's please not ever start implying or inferring
Reviewed-bys from anything less than a reply actually saying
Reviewed-by in the format expected to go into the commit message.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] i965: use _mesa_geometry_width/height/layers/samples for programming geometry of framebuffer to GEN

2015-04-28 Thread Jason Ekstrand
On Tue, Apr 28, 2015 at 3:37 PM, Rogovin, Kevin kevin.rogo...@intel.com wrote:

 I read the patch again and I'm still in the opinion that the changes to the
 pure pre-gen7 logic (i.e., logic that is not re-used for later gens) are not 
 needed.

 As I have tried and apparently failed to communicate, it is -better- and more 
 consistent. Need
 is a far stronger word. Without a doubt, if the extension is never enabled 
 for those older
 Gens, then it does not matter in terms of produced output. However, I stated 
 that it leaves
 a trap and an inconsistency which I find quite bothering.

You have very clearly communicated that you *think* it's better to
change it everywhere.  Others have chosen to disagree for a variety of
reasons.  Defending your choice to the death isn't aiding in the
discussion.

 The shared logic between pre-gen7 and later, namely setup for renderbuffers, 
 drawing rectangle and
 fragment shader compilation key are safe to do as they only introduce new 
 logic that is conditional to
 no-attachments being used.

 And that is exactly for the case for that code that is not shared. Indeed, if 
 the shared code is safe
 for pre-Gen7, then so is the non-shared code.

No, because the non-shared code is (by your own admission) untested
and/or dead code.  Untested code is broken code.  I would personally
be ok with a lot of the changes that just replace fb->Width with
_mesa_geometric_width(fb) since it's effectively just replacing a
direct access with a getter.  However, almost half of the patch is
updating the upload_sf_vp function which is only used for gen <= 5.  A
comment or assert there would be sufficient rather than reworking it.

 Your concern about the readers getting confused could be also addressed with 
 assert(brw->gen >= 7)
  and a comment saying that the no-attachment specific path is not applicable 
 for older gens.

 There is only one occurrence of no-attachment specific code paths in these 
 i965 patches
 and that is associated to scissoring.  The rest is existing code is changed 
 from accessing Width,
 Height of gl_framebuffer to getting those values from a function. There is 
 no proper place
 to insert an assert(brw->gen >= 7), since, with the exception of the 
 scissoring (and it is just
 one if block) there is no such no attachment code path. I had thought the 
 diffs of the series
 made that quite clear.

 And when it comes to the pure pre-gen7 logic, I, in fact, have just the 
 opposite opinion on making it to go through the no-attachment-aware path.
 As the extension is not possible for older gens, I find it clearer that 
 logic explicitly by-passes such paths that even consider it.

 Um, I am pretty sure than pre Gen7 hardware can do the extension. The crux is 
 that the extension
 is pointless for such hardware because pre Gen7 hardware does not (AFAIK) 
 have a feature that
 allows for a fragment shader to have a side effect. Even that statement is 
 not totally true. Indeed,
 one can argue performance queries and occlusion queries with 
 framebuffer_no_attachments
 make some form of sense (it would give an application a count of sorts).

That's a contingency I think we can ignore for the moment.  If someone
really wants to do occlusion queries with no framebuffer on ILK, we
can add it then.  Until that unlikely event happens, let's concentrate
on HW that at least exposes atomics.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] st/mesa: adjust blending modes if we don't have destination alpha

2015-04-28 Thread Ilia Mirkin
On Tue, Apr 28, 2015 at 7:52 PM, Brian Paul bri...@vmware.com wrote:
 On 04/28/2015 04:35 PM, Ilia Mirkin wrote:

 On Tue, Apr 28, 2015 at 6:26 PM, Ilia Mirkin imir...@alum.mit.edu wrote:

 Reviewed-by: Ilia Mirkin imir...@alum.mit.edu


 Actually


 Awesome! Now I can go remove this set of hacks from freedreno. And
 this fixes the same issue in nouveau. Thanks for doing it the real way
 :)

 On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:

 If the user requested a GL_RGB texture but the driver actually allocated
 an RGBA texture, the alpha values in the texture may not be defined.

 If we later bind the texture as a color target and try to blend into
 it with GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA we may blend with
 undefined alpha values when, in fact, the dest alpha value should be
 one.
 So replace GL_DST_ALPHA/GL_ONE_MINUS_DST_ALPHA with GL_ONE/GL_ZERO.

 Fixes the piglit fbo-blending-formats test for some GL_RGB formats
 with the VMware driver.  Also tested with llvmpipe.
 ---
   src/mesa/state_tracker/st_atom_blend.c | 38
 +-
   1 file changed, 28 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/state_tracker/st_atom_blend.c
 b/src/mesa/state_tracker/st_atom_blend.c
 index 6bb4077..30bff7a 100644
 --- a/src/mesa/state_tracker/st_atom_blend.c
 +++ b/src/mesa/state_tracker/st_atom_blend.c
 @@ -44,10 +44,21 @@
   /**
* Convert GLenum blend tokens to pipe tokens.
* Both blend factors and blend funcs are accepted.
 + * \param destBaseFormat  the base format of the render target, such as
 + *GL_RGBA, GL_RGB, GL_RED, GL_ALPHA, etc.
*/
   static GLuint
 -translate_blend(GLenum blend)
 +translate_blend(GLenum blend, GLenum destBaseFormat)
   {
 +   /* If we don't have destination alpha and the blend factor is either
 +* GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA then we use
 +* PIPE_BLENDFACTOR_ONE or _ZERO instead.
 +*/
 +   const bool haveDstA = (destBaseFormat == GL_RGBA ||
 +  destBaseFormat == GL_ALPHA ||
 +  destBaseFormat == GL_INTENSITY ||
 +  destBaseFormat == GL_LUMINANCE_ALPHA);
 +
  switch (blend) {
  /* blend functions */
  case GL_FUNC_ADD:
 @@ -69,7 +80,7 @@ translate_blend(GLenum blend)
  case GL_SRC_ALPHA:
 return PIPE_BLENDFACTOR_SRC_ALPHA;
  case GL_DST_ALPHA:
 -  return PIPE_BLENDFACTOR_DST_ALPHA;
 +  return haveDstA ? PIPE_BLENDFACTOR_DST_ALPHA :
 PIPE_BLENDFACTOR_ONE;
  case GL_DST_COLOR:
 return PIPE_BLENDFACTOR_DST_COLOR;
  case GL_SRC_ALPHA_SATURATE:
 @@ -91,7 +102,7 @@ translate_blend(GLenum blend)
  case GL_ONE_MINUS_DST_COLOR:
 return PIPE_BLENDFACTOR_INV_DST_COLOR;
  case GL_ONE_MINUS_DST_ALPHA:
 -  return PIPE_BLENDFACTOR_INV_DST_ALPHA;
 +  return haveDstA ? PIPE_BLENDFACTOR_INV_DST_ALPHA :
 PIPE_BLENDFACTOR_ZERO;
  case GL_ONE_MINUS_CONSTANT_COLOR:
 return PIPE_BLENDFACTOR_INV_CONST_COLOR;
  case GL_ONE_MINUS_CONSTANT_ALPHA:
 @@ -208,14 +219,21 @@ update_blend( struct st_context *st )
  else if (ctx-Color.BlendEnabled) {
 /* blending enabled */
 for (i = 0, j = 0; i  num_state; i++) {
 + const struct gl_renderbuffer *rb;
 + GLenum baseFormat;

blend-rt[i].blend_enable = (ctx-Color.BlendEnabled  i) 
 0x1;

if (ctx-Extensions.ARB_draw_buffers_blend)
   j = i;

 + /* _NEW_BUFFERS */
 + /* Get the base format of the render target */
 + rb = ctx-DrawBuffer-_ColorDrawBuffers[j];


 That's the wrong render target, no? You need the i'th render target.


 Ah, I think you're right.

 BTW, I think there's more i/j mix-ups in this code (independent of this
 patch).  I'll send a separate patch for that after I check the specs.


 And what happens if I'm not using independent blend but one of the
 RT's is RGB while the other is RGBA?


 Yeah, there's no simple fix for that, AFAIK.

Well, presumably if the driver supports independent blend, you could
turn that on? If the driver doesn't do independent blend, then you're
SOL, but you should at least leave the DST_ALPHA setting alone...
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
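
The factor substitution under review can be sketched on its own. This is a minimal illustration assuming only the rule stated in the commit message: when the destination's base format has no alpha channel, destination alpha is treated as one, so DST_ALPHA becomes ONE and ONE_MINUS_DST_ALPHA becomes ZERO. The enums and the function names has_dst_alpha, adjust_factor and example_usage are hypothetical stand-ins for the GL and gallium tokens.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL base formats named in the patch. */
enum base_format {
   FMT_RGBA, FMT_RGB, FMT_RED, FMT_ALPHA, FMT_INTENSITY, FMT_LUMINANCE_ALPHA
};

/* Hypothetical stand-ins for the blend factors involved. */
enum blend_factor {
   FACTOR_ZERO, FACTOR_ONE, FACTOR_SRC_ALPHA,
   FACTOR_DST_ALPHA, FACTOR_INV_DST_ALPHA
};

/* Same format list as the haveDstA test in the patch. */
static bool
has_dst_alpha(enum base_format fmt)
{
   return fmt == FMT_RGBA || fmt == FMT_ALPHA ||
          fmt == FMT_INTENSITY || fmt == FMT_LUMINANCE_ALPHA;
}

/* With no destination alpha, A_dst is defined to be 1, so
 * DST_ALPHA -> ONE and (1 - A_dst) -> ZERO.  Other factors pass through. */
static enum blend_factor
adjust_factor(enum blend_factor factor, enum base_format dest)
{
   if (has_dst_alpha(dest))
      return factor;

   switch (factor) {
   case FACTOR_DST_ALPHA:
      return FACTOR_ONE;
   case FACTOR_INV_DST_ALPHA:
      return FACTOR_ZERO;
   default:
      return factor;
   }
}

/* Usage example. */
static void
example_usage(void)
{
   assert(adjust_factor(FACTOR_DST_ALPHA, FMT_RGB) == FACTOR_ONE);
   assert(adjust_factor(FACTOR_DST_ALPHA, FMT_RGBA) == FACTOR_DST_ALPHA);
}
```

The sketch takes the destination format per call, which matches the review comment that the format must come from the render target actually being blended into.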


Re: [Mesa-dev] [PATCH 1/2] st/mesa: adjust blending modes if we don't have destination alpha

2015-04-28 Thread Roland Scheidegger
Am 29.04.2015 um 01:52 schrieb Brian Paul:
 On 04/28/2015 04:35 PM, Ilia Mirkin wrote:
 On Tue, Apr 28, 2015 at 6:26 PM, Ilia Mirkin imir...@alum.mit.edu
 wrote:
 Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

 Actually


 Awesome! Now I can go remove this set of hacks from freedreno. And
 this fixes the same issue in nouveau. Thanks for doing it the real way
 :)

 On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:
 If the user requested a GL_RGB texture but the driver actually
 allocated
 an RGBA texture, the alpha values in the texture may not be defined.

 If we later bind the texture as a color target and try to blend into
 it with GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA we may blend with
 undefined alpha values when, in fact, the dest alpha value should be
 one.
 So replace GL_DST_ALPHA/GL_ONE_MINUS_DST_ALPHA with GL_ONE/GL_ZERO.

 Fixes the piglit fbo-blending-formats test for some GL_RGB formats
 with the VMware driver.  Also tested with llvmpipe.
 ---
   src/mesa/state_tracker/st_atom_blend.c | 38
 +-
   1 file changed, 28 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/state_tracker/st_atom_blend.c
 b/src/mesa/state_tracker/st_atom_blend.c
 index 6bb4077..30bff7a 100644
 --- a/src/mesa/state_tracker/st_atom_blend.c
 +++ b/src/mesa/state_tracker/st_atom_blend.c
 @@ -44,10 +44,21 @@
   /**
* Convert GLenum blend tokens to pipe tokens.
* Both blend factors and blend funcs are accepted.
 + * \param destBaseFormat  the base format of the render target,
 such as
 + *GL_RGBA, GL_RGB, GL_RED, GL_ALPHA, etc.
*/
   static GLuint
 -translate_blend(GLenum blend)
 +translate_blend(GLenum blend, GLenum destBaseFormat)
   {
 +   /* If we don't have destination alpha and the blend factor is
 either
 +* GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA then we use
 +* PIPE_BLENDFACTOR_ONE or _ZERO instead.
 +*/
 +   const bool haveDstA = (destBaseFormat == GL_RGBA ||
 +  destBaseFormat == GL_ALPHA ||
 +  destBaseFormat == GL_INTENSITY ||
 +  destBaseFormat == GL_LUMINANCE_ALPHA);
 +
  switch (blend) {
  /* blend functions */
  case GL_FUNC_ADD:
 @@ -69,7 +80,7 @@ translate_blend(GLenum blend)
  case GL_SRC_ALPHA:
 return PIPE_BLENDFACTOR_SRC_ALPHA;
  case GL_DST_ALPHA:
 -  return PIPE_BLENDFACTOR_DST_ALPHA;
 +  return haveDstA ? PIPE_BLENDFACTOR_DST_ALPHA :
 PIPE_BLENDFACTOR_ONE;
  case GL_DST_COLOR:
 return PIPE_BLENDFACTOR_DST_COLOR;
  case GL_SRC_ALPHA_SATURATE:
 @@ -91,7 +102,7 @@ translate_blend(GLenum blend)
  case GL_ONE_MINUS_DST_COLOR:
 return PIPE_BLENDFACTOR_INV_DST_COLOR;
  case GL_ONE_MINUS_DST_ALPHA:
 -  return PIPE_BLENDFACTOR_INV_DST_ALPHA;
 +  return haveDstA ? PIPE_BLENDFACTOR_INV_DST_ALPHA :
 PIPE_BLENDFACTOR_ZERO;
  case GL_ONE_MINUS_CONSTANT_COLOR:
 return PIPE_BLENDFACTOR_INV_CONST_COLOR;
  case GL_ONE_MINUS_CONSTANT_ALPHA:
 @@ -208,14 +219,21 @@ update_blend( struct st_context *st )
  else if (ctx-Color.BlendEnabled) {
 /* blending enabled */
 for (i = 0, j = 0; i  num_state; i++) {
 + const struct gl_renderbuffer *rb;
 + GLenum baseFormat;

blend-rt[i].blend_enable = (ctx-Color.BlendEnabled 
 i)  0x1;

if (ctx-Extensions.ARB_draw_buffers_blend)
   j = i;

 + /* _NEW_BUFFERS */
 + /* Get the base format of the render target */
 + rb = ctx-DrawBuffer-_ColorDrawBuffers[j];

 That's the wrong render target, no? You need the i'th render target.
 
 Ah, I think you're right.
 
 BTW, I think there's more i/j mix-ups in this code (independent of this
 patch).  I'll send a separate patch for that after I check the specs.
 
 
 And what happens if I'm not using independent blend but one of the
 RT's is RGB while the other is RGBA?
 
 Yeah, there's no simple fix for that, AFAIK.

Well could turn that into independent blend if the driver supports
it. This would probably be a reason why drivers may be better suited to
handle this (if they don't support independent blend they may or may not
still be able to handle this correctly), though it's true this logic is
quite duplicated among all drivers in practice.

Anyway, this is probably a good idea though I suspect unfortunately we
can't get rid of the same code in llvmpipe due to other state trackers
(not entirely sure though).

Roland

 
 
 

-ilia

 + baseFormat = rb ? rb-_BaseFormat : GL_RGBA;
 +
blend-rt[i].rgb_func =
 -translate_blend(ctx-Color.Blend[j].EquationRGB);
 +translate_blend(ctx-Color.Blend[j].EquationRGB,
 baseFormat);

if (ctx-Color.Blend[i].EquationRGB == GL_MIN ||
ctx-Color.Blend[i].EquationRGB == GL_MAX) {
 @@ -225,13 +243,13 @@ update_blend( struct st_context *st )
}
else {

Re: [Mesa-dev] [PATCH 1/2] st/mesa: adjust blending modes if we don't have destination alpha

2015-04-28 Thread Brian Paul

On 04/28/2015 05:52 PM, Ilia Mirkin wrote:

On Tue, Apr 28, 2015 at 7:52 PM, Brian Paul bri...@vmware.com wrote:

On 04/28/2015 04:35 PM, Ilia Mirkin wrote:


On Tue, Apr 28, 2015 at 6:26 PM, Ilia Mirkin imir...@alum.mit.edu wrote:


Reviewed-by: Ilia Mirkin imir...@alum.mit.edu



Actually



Awesome! Now I can go remove this set of hacks from freedreno. And
this fixes the same issue in nouveau. Thanks for doing it the real way
:)

On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:


If the user requested a GL_RGB texture but the driver actually allocated
an RGBA texture, the alpha values in the texture may not be defined.

If we later bind the texture as a color target and try to blend into
it with GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA we may blend with
undefined alpha values when, in fact, the dest alpha value should be
one.
So replace GL_DST_ALPHA/GL_ONE_MINUS_DST_ALPHA with GL_ONE/GL_ZERO.

Fixes the piglit fbo-blending-formats test for some GL_RGB formats
with the VMware driver.  Also tested with llvmpipe.
---
   src/mesa/state_tracker/st_atom_blend.c | 38
+-
   1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c
b/src/mesa/state_tracker/st_atom_blend.c
index 6bb4077..30bff7a 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -44,10 +44,21 @@
   /**
* Convert GLenum blend tokens to pipe tokens.
* Both blend factors and blend funcs are accepted.
+ * \param destBaseFormat  the base format of the render target, such as
+ *GL_RGBA, GL_RGB, GL_RED, GL_ALPHA, etc.
*/
   static GLuint
-translate_blend(GLenum blend)
+translate_blend(GLenum blend, GLenum destBaseFormat)
   {
+   /* If we don't have destination alpha and the blend factor is either
+* GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA then we use
+* PIPE_BLENDFACTOR_ONE or _ZERO instead.
+*/
+   const bool haveDstA = (destBaseFormat == GL_RGBA ||
+  destBaseFormat == GL_ALPHA ||
+  destBaseFormat == GL_INTENSITY ||
+  destBaseFormat == GL_LUMINANCE_ALPHA);
+
  switch (blend) {
  /* blend functions */
  case GL_FUNC_ADD:
@@ -69,7 +80,7 @@ translate_blend(GLenum blend)
  case GL_SRC_ALPHA:
 return PIPE_BLENDFACTOR_SRC_ALPHA;
  case GL_DST_ALPHA:
-  return PIPE_BLENDFACTOR_DST_ALPHA;
+  return haveDstA ? PIPE_BLENDFACTOR_DST_ALPHA :
PIPE_BLENDFACTOR_ONE;
  case GL_DST_COLOR:
 return PIPE_BLENDFACTOR_DST_COLOR;
  case GL_SRC_ALPHA_SATURATE:
@@ -91,7 +102,7 @@ translate_blend(GLenum blend)
  case GL_ONE_MINUS_DST_COLOR:
 return PIPE_BLENDFACTOR_INV_DST_COLOR;
  case GL_ONE_MINUS_DST_ALPHA:
-  return PIPE_BLENDFACTOR_INV_DST_ALPHA;
+  return haveDstA ? PIPE_BLENDFACTOR_INV_DST_ALPHA :
PIPE_BLENDFACTOR_ZERO;
  case GL_ONE_MINUS_CONSTANT_COLOR:
 return PIPE_BLENDFACTOR_INV_CONST_COLOR;
  case GL_ONE_MINUS_CONSTANT_ALPHA:
@@ -208,14 +219,21 @@ update_blend( struct st_context *st )
  else if (ctx-Color.BlendEnabled) {
 /* blending enabled */
 for (i = 0, j = 0; i  num_state; i++) {
+ const struct gl_renderbuffer *rb;
+ GLenum baseFormat;

blend-rt[i].blend_enable = (ctx-Color.BlendEnabled  i) 
0x1;

if (ctx-Extensions.ARB_draw_buffers_blend)
   j = i;

+ /* _NEW_BUFFERS */
+ /* Get the base format of the render target */
+ rb = ctx-DrawBuffer-_ColorDrawBuffers[j];



>>> That's the wrong render target, no? You need the i'th render target.
>>
>> Ah, I think you're right.
>>
>> BTW, I think there's more i/j mix-ups in this code (independent of this
>> patch).  I'll send a separate patch for that after I check the specs.
>>
>>> And what happens if I'm not using independent blend but one of the
>>> RT's is RGB while the other is RGBA?
>>
>> Yeah, there's no simple fix for that, AFAIK.
>
> Well, presumably if the driver supports independent blend, you could
> turn that on? If the driver doesn't do independent blend, then you're
> SOL, but you should at least leave the DST_ALPHA setting alone...


If we have independent blend, the patch takes care of this.  I assumed 
you were asking about the RGBA+RGB case if we don't have independent 
blend.  I guess I'm not too concerned about that for now.  Off-hand, I 
doubt we have any piglit tests that really exercise mixed buffer 
formats.  And I don't have time to write them myself right now.


-Brian




[Mesa-dev] [PATCH 2/2] st/mesa: fix i/j indexing mix-up for blend equations

2015-04-28 Thread Brian Paul
This doesn't actually change behavior, but it matches the surrounding
code and makes more sense.

If independent blend mode is supported (GL_ARB_draw_buffers_blend) i==j
so there's no difference.  If independent blend mode is not supported,
Blend[i].EquationRGB/A will never be GL_MIN/MAX if i > 0.
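
The indexing convention described above can be sketched in isolation. This is a minimal illustration rather than the state tracker's code: `blend_state_index` and `rt_blend_enabled` are hypothetical helper names, and the enable mask is assumed to hold one bit per render target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: pick which GL blend-state slot (j) feeds render
 * target i.  With independent blending each target has its own slot;
 * without it every target shares slot 0. */
static unsigned
blend_state_index(unsigned i, bool independent_blend)
{
   return independent_blend ? i : 0;
}

/* Hypothetical helper: extract the per-target enable bit from a mask
 * that is assumed to hold one bit per render target. */
static bool
rt_blend_enabled(unsigned enabled_mask, unsigned i)
{
   assert(i < 32);
   return ((enabled_mask >> i) & 0x1) != 0;
}
```

With independent blending the two indices coincide, which is why reading Blend[i] or Blend[j] gives the same result in that case.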
---
 src/mesa/state_tracker/st_atom_blend.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c 
b/src/mesa/state_tracker/st_atom_blend.c
index 6337e1c..2d0420e 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -235,8 +235,8 @@ update_blend( struct st_context *st )
          blend->rt[i].rgb_func =
             translate_blend(ctx->Color.Blend[j].EquationRGB, baseFormat);
 
-         if (ctx->Color.Blend[i].EquationRGB == GL_MIN ||
-             ctx->Color.Blend[i].EquationRGB == GL_MAX) {
+         if (ctx->Color.Blend[j].EquationRGB == GL_MIN ||
+             ctx->Color.Blend[j].EquationRGB == GL_MAX) {
             /* Min/max are special */
             blend->rt[i].rgb_src_factor = PIPE_BLENDFACTOR_ONE;
             blend->rt[i].rgb_dst_factor = PIPE_BLENDFACTOR_ONE;
@@ -251,8 +251,8 @@ update_blend( struct st_context *st )
          blend->rt[i].alpha_func =
             translate_blend(ctx->Color.Blend[j].EquationA, baseFormat);
 
-         if (ctx->Color.Blend[i].EquationA == GL_MIN ||
-             ctx->Color.Blend[i].EquationA == GL_MAX) {
+         if (ctx->Color.Blend[j].EquationA == GL_MIN ||
+             ctx->Color.Blend[j].EquationA == GL_MAX) {
             /* Min/max are special */
             blend->rt[i].alpha_src_factor = PIPE_BLENDFACTOR_ONE;
             blend->rt[i].alpha_dst_factor = PIPE_BLENDFACTOR_ONE;
-- 
1.9.1



[Mesa-dev] [PATCH 1/2] st/mesa: adjust blending modes if we don't have destination alpha

2015-04-28 Thread Brian Paul
If the user requested a GL_RGB texture but the driver actually allocated
an RGBA texture, the alpha values in the texture may not be defined.

If we later bind the texture as a color target and try to blend into
it with GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA we may blend with
undefined alpha values when, in fact, the dest alpha value should be one.
So replace GL_DST_ALPHA/GL_ONE_MINUS_DST_ALPHA with GL_ONE/GL_ZERO.

Fixes the piglit fbo-blending-formats test for some GL_RGB formats
with the VMware driver.  Also tested with llvmpipe.
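
The substitution described above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the enums below are hypothetical stand-ins for the GL base formats and the gallium blend factors, and `adjust_factor` is not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL base formats. */
enum base_format {
   BASE_RGBA, BASE_RGB, BASE_RED, BASE_ALPHA,
   BASE_INTENSITY, BASE_LUMINANCE_ALPHA
};

/* Hypothetical stand-ins for the blend factors involved. */
enum blend_factor {
   FACTOR_ZERO, FACTOR_ONE, FACTOR_SRC_ALPHA,
   FACTOR_DST_ALPHA, FACTOR_INV_DST_ALPHA
};

/* True if the render target's base format stores an alpha channel. */
static bool
have_dst_alpha(enum base_format fmt)
{
   return fmt == BASE_RGBA || fmt == BASE_ALPHA ||
          fmt == BASE_INTENSITY || fmt == BASE_LUMINANCE_ALPHA;
}

/* Replace destination-alpha factors for targets without alpha, where the
 * destination alpha is defined to read as one. */
static enum blend_factor
adjust_factor(enum blend_factor factor, enum base_format dest)
{
   enum blend_factor result = factor;

   if (!have_dst_alpha(dest)) {
      if (factor == FACTOR_DST_ALPHA)
         result = FACTOR_ONE;        /* dst alpha reads as 1 */
      else if (factor == FACTOR_INV_DST_ALPHA)
         result = FACTOR_ZERO;       /* 1 - dst alpha reads as 0 */
   }

   /* Postcondition: no dst-alpha factor survives for alpha-less targets. */
   assert(have_dst_alpha(dest) ||
          (result != FACTOR_DST_ALPHA && result != FACTOR_INV_DST_ALPHA));
   return result;
}
```

Factors that do not read destination alpha pass through unchanged, so the adjustment only affects the two cases named in the commit message.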

v2: use the i-th (not j-th) render buffer's base format
---
 src/mesa/state_tracker/st_atom_blend.c | 38 +-
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c 
b/src/mesa/state_tracker/st_atom_blend.c
index 6bb4077..6337e1c 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -44,10 +44,21 @@
 /**
  * Convert GLenum blend tokens to pipe tokens.
  * Both blend factors and blend funcs are accepted.
+ * \param destBaseFormat  the base format of the render target, such as
+ *GL_RGBA, GL_RGB, GL_RED, GL_ALPHA, etc.
  */
 static GLuint
-translate_blend(GLenum blend)
+translate_blend(GLenum blend, GLenum destBaseFormat)
 {
+   /* If we don't have destination alpha and the blend factor is either
+* GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA then we use
+* PIPE_BLENDFACTOR_ONE or _ZERO instead.
+*/
+   const bool haveDstA = (destBaseFormat == GL_RGBA ||
+  destBaseFormat == GL_ALPHA ||
+  destBaseFormat == GL_INTENSITY ||
+  destBaseFormat == GL_LUMINANCE_ALPHA);
+
switch (blend) {
/* blend functions */
case GL_FUNC_ADD:
@@ -69,7 +80,7 @@ translate_blend(GLenum blend)
case GL_SRC_ALPHA:
   return PIPE_BLENDFACTOR_SRC_ALPHA;
case GL_DST_ALPHA:
-  return PIPE_BLENDFACTOR_DST_ALPHA;
+  return haveDstA ? PIPE_BLENDFACTOR_DST_ALPHA : PIPE_BLENDFACTOR_ONE;
case GL_DST_COLOR:
   return PIPE_BLENDFACTOR_DST_COLOR;
case GL_SRC_ALPHA_SATURATE:
@@ -91,7 +102,7 @@ translate_blend(GLenum blend)
case GL_ONE_MINUS_DST_COLOR:
   return PIPE_BLENDFACTOR_INV_DST_COLOR;
case GL_ONE_MINUS_DST_ALPHA:
-  return PIPE_BLENDFACTOR_INV_DST_ALPHA;
+  return haveDstA ? PIPE_BLENDFACTOR_INV_DST_ALPHA : PIPE_BLENDFACTOR_ZERO;
case GL_ONE_MINUS_CONSTANT_COLOR:
   return PIPE_BLENDFACTOR_INV_CONST_COLOR;
case GL_ONE_MINUS_CONSTANT_ALPHA:
@@ -208,14 +219,21 @@ update_blend( struct st_context *st )
    else if (ctx->Color.BlendEnabled) {
       /* blending enabled */
       for (i = 0, j = 0; i < num_state; i++) {
+         const struct gl_renderbuffer *rb;
+         GLenum baseFormat;
 
          blend->rt[i].blend_enable = (ctx->Color.BlendEnabled >> i) & 0x1;
 
          if (ctx->Extensions.ARB_draw_buffers_blend)
             j = i;
 
+         /* _NEW_BUFFERS */
+         /* Get the base format of the i-th render target */
+         rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
+         baseFormat = rb ? rb->_BaseFormat : GL_RGBA;
+
          blend->rt[i].rgb_func =
-            translate_blend(ctx->Color.Blend[j].EquationRGB);
+            translate_blend(ctx->Color.Blend[j].EquationRGB, baseFormat);
 
          if (ctx->Color.Blend[i].EquationRGB == GL_MIN ||
              ctx->Color.Blend[i].EquationRGB == GL_MAX) {
@@ -225,13 +243,13 @@ update_blend( struct st_context *st )
          }
          else {
             blend->rt[i].rgb_src_factor =
-               translate_blend(ctx->Color.Blend[j].SrcRGB);
+               translate_blend(ctx->Color.Blend[j].SrcRGB, baseFormat);
             blend->rt[i].rgb_dst_factor =
-               translate_blend(ctx->Color.Blend[j].DstRGB);
+               translate_blend(ctx->Color.Blend[j].DstRGB, baseFormat);
          }
 
          blend->rt[i].alpha_func =
-            translate_blend(ctx->Color.Blend[j].EquationA);
+            translate_blend(ctx->Color.Blend[j].EquationA, baseFormat);
 
          if (ctx->Color.Blend[i].EquationA == GL_MIN ||
              ctx->Color.Blend[i].EquationA == GL_MAX) {
@@ -241,9 +259,9 @@ update_blend( struct st_context *st )
          }
          else {
             blend->rt[i].alpha_src_factor =
-               translate_blend(ctx->Color.Blend[j].SrcA);
+               translate_blend(ctx->Color.Blend[j].SrcA, baseFormat);
             blend->rt[i].alpha_dst_factor =
-               translate_blend(ctx->Color.Blend[j].DstA);
+               translate_blend(ctx->Color.Blend[j].DstA, baseFormat);
          }
       }
}
@@ -285,7 +303,7 @@ update_blend( struct st_context *st )
 const struct st_tracked_state st_update_blend = {
st_update_blend,  /* name */
{   /* dirty */
-      (_NEW_COLOR | _NEW_MULTISAMPLE),  /* XXX _NEW_BLEND someday? */  /* mesa */
+  

Re: [Mesa-dev] [PATCH 6/7] i965/cs: Implement brw_emit_gpgpu_walker

2015-04-28 Thread Kenneth Graunke
On Tuesday, April 28, 2015 12:04:50 AM Jordan Justen wrote:
> On 2015-04-27 19:02:38, Kenneth Graunke wrote:
> > On Friday, April 24, 2015 04:33:43 PM Jordan Justen wrote:
> > > +   BEGIN_BATCH(dwords);
> > > +   OUT_BATCH(GPGPU_WALKER << 16 | (dwords - 2));
> >
> > I was going to suggest splitting this into separate Gen8+ and Gen7
> > blocks, but now that I look at the code...these two are slightly
> > different indirect handling, and the later one is just a DWord of MBZ,
> > so...it's not really that different.  I think what you have is fine :)
>
> Hmm. Maybe time to press my luck. :)
>
> In my other 20 patch series
> [PATCH v2 19/20] i965/cs: Upload brw_cs_state
>
> We discussed this somewhat ugly code:
> > +   int dw = 0;
> > +   desc[dw++] = brw->cs.base.prog_offset;
> > +   if (brw->gen >= 8)
> > +      dw++; /* Kernel Start Pointer High */
> > +   dw++;
> > +   dw++;
> > +   desc[dw++] = stage_state->bind_bo_offset;
>
> It turns out it eventually doesn't look quite so pointless to use the
> dw var. Later, it would look like:
>
> http://cgit.freedesktop.org/~jljusten/mesa/tree/src/mesa/drivers/dri/i965/brw_cs.cpp?h=cs-27#n392
>
> Basically, the structure is pretty similar, but an extra dword appears
> for the high address in gen8.
>
> If it seems cleaner, I wouldn't mind splitting either or both of these
> to be initialized in separate paths based on the gen.
>
> Does the link above change your opinion on the other patch?
>
> Thanks for your time,
>
> -Jordan

I guess that's fine.  In the OUT_BATCH paradigm, we'd do things like:

   if (brw->gen >= 8) {
      OUT_RELOC64(...)
   } else {
      OUT_RELOC(...)
   }

which effectively does the same thing.  But here you're writing indirect
state, so you have to track dwords yourself.

How about just changing 'dw++' to something like:

   desc[dw++] = 0; /* MBZ */
   desc[dw++] = 0; /* Kernel Start Pointer High */

I feel like that makes it readily apparent that you're just filling in
DWords in order, and not doing anything fancy.

Thoughts?
--Ken
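
The suggestion above (write every DWord explicitly, in order) can be sketched as follows. This is a hedged illustration, not the i965 code: `fill_descriptor` is a hypothetical helper, the generation is passed as a plain integer, and the two DWords that the quoted patch skips are written as zero without claiming to know their hardware meaning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill an indirect-state descriptor one DWord at a
 * time and return how many DWords were written. */
static size_t
fill_descriptor(uint32_t *desc, int gen,
                uint32_t prog_offset, uint32_t bind_offset)
{
   size_t dw = 0;

   assert(desc != NULL);

   desc[dw++] = prog_offset;   /* Kernel Start Pointer */
   if (gen >= 8)
      desc[dw++] = 0;          /* Kernel Start Pointer High (gen8+ only) */
   desc[dw++] = 0;             /* skipped in the quoted patch; left zero */
   desc[dw++] = 0;             /* skipped in the quoted patch; left zero */
   desc[dw++] = bind_offset;   /* binding table offset */

   return dw;
}
```

Writing the zeros explicitly keeps the buffer contents defined and makes the gen8 layout difference visible as a single extra line.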




Re: [Mesa-dev] [PATCH 1/2] st/mesa: adjust blending modes if we don't have destination alpha

2015-04-28 Thread Ilia Mirkin
   if (blend_per_rt(ctx) || colormask_per_rt(ctx)) {
      num_state = ctx->Const.MaxDrawBuffers;
      blend->independent_blend_enable = 1;
   }

So independent blend won't get enabled even though the backend
supports it in the case where you have the same settings but with one
RGB and one RGBA target.

My concern is that this might end up messing up scenarios that work
today (e.g. when the driver properly supports RGBX and RGBA, but
you'll end up forcing the RGBX one to be ONE instead of DST_ALPHA).


On Tue, Apr 28, 2015 at 8:19 PM, Brian Paul bri...@vmware.com wrote:
 If the user requested a GL_RGB texture but the driver actually allocated
 an RGBA texture, the alpha values in the texture may not be defined.

 If we later bind the texture as a color target and try to blend into
 it with GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA we may blend with
 undefined alpha values when, in fact, the dest alpha value should be one.
 So replace GL_DST_ALPHA/GL_ONE_MINUS_DST_ALPHA with GL_ONE/GL_ZERO.

 Fixes the piglit fbo-blending-formats test for some GL_RGB formats
 with the VMware driver.  Also tested with llvmpipe.

 v2: use the i-th (not j-th) render buffer's base format
 ---
  src/mesa/state_tracker/st_atom_blend.c | 38 
 +-
  1 file changed, 28 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/state_tracker/st_atom_blend.c 
 b/src/mesa/state_tracker/st_atom_blend.c
 index 6bb4077..6337e1c 100644
 --- a/src/mesa/state_tracker/st_atom_blend.c
 +++ b/src/mesa/state_tracker/st_atom_blend.c
 @@ -44,10 +44,21 @@
  /**
   * Convert GLenum blend tokens to pipe tokens.
   * Both blend factors and blend funcs are accepted.
 + * \param destBaseFormat  the base format of the render target, such as
 + *GL_RGBA, GL_RGB, GL_RED, GL_ALPHA, etc.
   */
  static GLuint
 -translate_blend(GLenum blend)
 +translate_blend(GLenum blend, GLenum destBaseFormat)
  {
 +   /* If we don't have destination alpha and the blend factor is either
 +* GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA then we use
 +* PIPE_BLENDFACTOR_ONE or _ZERO instead.
 +*/
 +   const bool haveDstA = (destBaseFormat == GL_RGBA ||
 +  destBaseFormat == GL_ALPHA ||
 +  destBaseFormat == GL_INTENSITY ||
 +  destBaseFormat == GL_LUMINANCE_ALPHA);
 +
 switch (blend) {
 /* blend functions */
 case GL_FUNC_ADD:
 @@ -69,7 +80,7 @@ translate_blend(GLenum blend)
 case GL_SRC_ALPHA:
return PIPE_BLENDFACTOR_SRC_ALPHA;
 case GL_DST_ALPHA:
 -  return PIPE_BLENDFACTOR_DST_ALPHA;
 +  return haveDstA ? PIPE_BLENDFACTOR_DST_ALPHA : PIPE_BLENDFACTOR_ONE;
 case GL_DST_COLOR:
return PIPE_BLENDFACTOR_DST_COLOR;
 case GL_SRC_ALPHA_SATURATE:
 @@ -91,7 +102,7 @@ translate_blend(GLenum blend)
 case GL_ONE_MINUS_DST_COLOR:
return PIPE_BLENDFACTOR_INV_DST_COLOR;
 case GL_ONE_MINUS_DST_ALPHA:
 -  return PIPE_BLENDFACTOR_INV_DST_ALPHA;
 +  return haveDstA ? PIPE_BLENDFACTOR_INV_DST_ALPHA : 
 PIPE_BLENDFACTOR_ZERO;
 case GL_ONE_MINUS_CONSTANT_COLOR:
return PIPE_BLENDFACTOR_INV_CONST_COLOR;
 case GL_ONE_MINUS_CONSTANT_ALPHA:
 @@ -208,14 +219,21 @@ update_blend( struct st_context *st )
     else if (ctx->Color.BlendEnabled) {
        /* blending enabled */
        for (i = 0, j = 0; i < num_state; i++) {
 +         const struct gl_renderbuffer *rb;
 +         GLenum baseFormat;

           blend->rt[i].blend_enable = (ctx->Color.BlendEnabled >> i) & 0x1;

           if (ctx->Extensions.ARB_draw_buffers_blend)
              j = i;

 +         /* _NEW_BUFFERS */
 +         /* Get the base format of the i-th render target */
 +         rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
 +         baseFormat = rb ? rb->_BaseFormat : GL_RGBA;
 +
           blend->rt[i].rgb_func =
 -            translate_blend(ctx->Color.Blend[j].EquationRGB);
 +            translate_blend(ctx->Color.Blend[j].EquationRGB, baseFormat);

           if (ctx->Color.Blend[i].EquationRGB == GL_MIN ||
               ctx->Color.Blend[i].EquationRGB == GL_MAX) {
 @@ -225,13 +243,13 @@ update_blend( struct st_context *st )
           }
           else {
              blend->rt[i].rgb_src_factor =
 -               translate_blend(ctx->Color.Blend[j].SrcRGB);
 +               translate_blend(ctx->Color.Blend[j].SrcRGB, baseFormat);
              blend->rt[i].rgb_dst_factor =
 -               translate_blend(ctx->Color.Blend[j].DstRGB);
 +               translate_blend(ctx->Color.Blend[j].DstRGB, baseFormat);
           }

           blend->rt[i].alpha_func =
 -            translate_blend(ctx->Color.Blend[j].EquationA);
 +            translate_blend(ctx->Color.Blend[j].EquationA, baseFormat);

           if (ctx->Color.Blend[i].EquationA == GL_MIN ||
               ctx->Color.Blend[i].EquationA == GL_MAX) {
 @@ -241,9 +259,9 @@ update_blend( struct st_context *st )
           }
           else {
              blend->rt[i].alpha_src_factor =
 

Re: [Mesa-dev] [PATCH 5/7] i965: use _mesa_geometry_width/height/layers/samples for programming geometry of framebuffer to GEN

2015-04-28 Thread Rogovin, Kevin

> I read the patch again and I'm still in the opinion that the changes to the
> pure pre-gen7 logic (i.e., logic that is not re-used for later gens) are not
> needed.

As I have tried and apparently failed to communicate, it is -better- and more
consistent. "Need" is a far stronger word. Without a doubt, if the extension is
never enabled for those older Gens, then it does not matter in terms of
produced output. However, I stated that it leaves a trap and an inconsistency,
which I find quite bothersome.

> The shared logic between pre-gen7 and later, namely setup for renderbuffers,
> drawing rectangle and fragment shader compilation key are safe to do as they
> only introduce new logic that is conditional to no-attachments being used.

And that is exactly the case for the code that is not shared. Indeed, if the
shared code is safe for pre-Gen7, then so is the non-shared code.

> Your concern about the readers getting confused could be also addressed with
> assert(brw->gen >= 7) and a comment saying that the no-attachment specific
> path is not applicable for older gens.

There is only one occurrence of a no-attachment-specific code path in these
i965 patches, and that is associated with scissoring. The rest is existing code
changed from accessing Width and Height of gl_framebuffer to getting those
values from a function. There is no proper place to insert an
assert(brw->gen >= 7), since, with the exception of the scissoring (and it is
just one if block), there is no such no-attachment code path. I had thought the
diffs of the series made that quite clear.

> And when it comes to the pure pre-gen7 logic, I, in fact, have just the
> opposite opinion on making it to go through the no-attachment-aware path.
> As the extension is not possible for older gens, I find it clearer that logic
> explicitly by-passes such paths that even consider it.

Um, I am pretty sure that pre-Gen7 hardware can do the extension. The crux is
that the extension is pointless for such hardware because pre-Gen7 hardware
does not (AFAIK) have a feature that allows a fragment shader to have a side
effect. Even that statement is not totally true. Indeed, one can argue that
performance queries and occlusion queries with framebuffer_no_attachments make
some form of sense (they would give an application a count of sorts).

-Kevin






Re: [Mesa-dev] [PATCH] i965/skl: Don't try to apply the opt_sampler_eot extension for vs

2015-04-28 Thread Kenneth Graunke
On Tuesday, April 28, 2015 02:27:17 PM Neil Roberts wrote:
> The opt_sampler_eot optimisation of fs_visitor effectively assumes
> that it is running on a fragment shader because it casts the program
> key to a brw_wm_prog_key. However on Skylake fs_visitor can also be
> used for vertex shaders. It looks like this usually works anyway
> because the optimisation is skipped if key->nr_color_regions != 1.
> However for a vertex shader the key is actually a brw_vs_prog_key so
> the space for nr_color_regions is probably taken up by
> key->base.program_string_id. This can end up making nr_color_regions
> be 1 in which case the function will later assert when the last
> instruction is not FS_OPCODE_FB_WRITE. This was making the DEQP test
> suite assert. Presumably this only happens there because that compiles
> a lot of shaders so it would end up with a high value for
> program_string_id.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 61ee056..255ddf4 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2548,6 +2548,9 @@ fs_visitor::opt_sampler_eot()
>  {
>     brw_wm_prog_key *key = (brw_wm_prog_key*) this->key;
>
> +   if (stage != MESA_SHADER_FRAGMENT)
> +      return false;
> +
>     if (devinfo->gen < 9 && !devinfo->is_cherryview)
>        return false;
 

Good catch.

I'd remove skl from the subject line - this isn't platform specific
breakage (the optimization breaks VS on all platforms where it runs,
such as Cherryview).

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
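
The hazard discussed in this thread (reading a field through a key of the wrong type) can be sketched in plain C. The structs below are deliberately simplified, hypothetical stand-ins, not the real brw_vs_prog_key or brw_wm_prog_key layouts; the point is only that the stage check has to come before the key is interpreted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT };

/* Hypothetical, simplified keys: each stage stores something different
 * at the start of its key. */
struct vs_key { unsigned program_string_id; };
struct wm_key { unsigned nr_color_regions; };

struct visitor {
   enum shader_stage stage;
   const void *key;   /* points at a vs_key or a wm_key, depending on stage */
};

/* Only interpret the key as a wm_key once the stage is known to be
 * fragment; otherwise a vertex key's bytes would be misread. */
static bool
can_opt_sampler_eot(const struct visitor *v)
{
   assert(v != NULL && v->key != NULL);

   if (v->stage != STAGE_FRAGMENT)
      return false;

   const struct wm_key *key = v->key;
   return key->nr_color_regions == 1;
}
```

Without the early return, a vertex key whose first field happens to equal 1 would be mistaken for a fragment key with a single color region.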




Re: [Mesa-dev] [PATCH 2/2] st/mesa: also try PIPE_FORMAT_R10G10B10A2_UNORM for GL_RGB10

2015-04-28 Thread Ilia Mirkin
That's... really asymmetrical. There's a
PIPE_FORMAT_R10G10B10X2_SNORM. Oh well -- no reason to add it in just
for this.

On Tue, Apr 28, 2015 at 7:47 PM, Brian Paul bri...@vmware.com wrote:
 Unless I'm not seeing it, there is no such gallium format.

 -Brian


 On 04/28/2015 04:22 PM, Ilia Mirkin wrote:

 Presumably you should also include RGB10_X2 while you're at it?

 With that, Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

 On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:

 ---
   src/mesa/state_tracker/st_format.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/mesa/state_tracker/st_format.c
 b/src/mesa/state_tracker/st_format.c
 index 181465d..db7b5b7 100644
 --- a/src/mesa/state_tracker/st_format.c
 +++ b/src/mesa/state_tracker/st_format.c
 @@ -991,7 +991,7 @@ static const struct format_mapping format_map[] = {
  {
 { GL_RGB10, 0 },
 { PIPE_FORMAT_B10G10R10X2_UNORM, PIPE_FORMAT_B10G10R10A2_UNORM,
 -DEFAULT_RGB_FORMATS }
 +PIPE_FORMAT_R10G10B10A2_UNORM, DEFAULT_RGB_FORMATS }
  },
  {
 { GL_RGB10_A2, 0 },
 --
 1.9.1





Re: [Mesa-dev] [PATCH 2/2] st/mesa: also try PIPE_FORMAT_R10G10B10A2_UNORM for GL_RGB10

2015-04-28 Thread Ilia Mirkin
On Tue, Apr 28, 2015 at 9:03 PM, Matt Turner matts...@gmail.com wrote:
> On Tue, Apr 28, 2015 at 5:23 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>> Yes, sorry, thought that was implied since I had given it earlier
>> pending my (as it turns out, incorrect) suggestion.
>
> A number of people have been confused (rightly so) by LGTM implying
> Reviewed-by. Let's please not ever start implying or inferring
> Reviewed-bys from anything less than a reply actually saying
> Reviewed-by in the format expected to go into the commit message.

Well, I had supplied it before, conditional on a (seemingly at the
time) trivial change. Then I agreed that my condition was wrong. I
thought that qualified as good enough to not have to re-give the
R-b.

  -ilia


Re: [Mesa-dev] [PATCH 1/2] st/mesa: adjust blending modes if we don't have destination alpha

2015-04-28 Thread Ilia Mirkin
On Tue, Apr 28, 2015 at 6:26 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

Actually


 Awesome! Now I can go remove this set of hacks from freedreno. And
 this fixes the same issue in nouveau. Thanks for doing it the real way
 :)

 On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:
 If the user requested a GL_RGB texture but the driver actually allocated
 an RGBA texture, the alpha values in the texture may not be defined.

 If we later bind the texture as a color target and try to blend into
 it with GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA we may blend with
 undefined alpha values when, in fact, the dest alpha value should be one.
 So replace GL_DST_ALPHA/GL_ONE_MINUS_DST_ALPHA with GL_ONE/GL_ZERO.

 Fixes the piglit fbo-blending-formats test for some GL_RGB formats
 with the VMware driver.  Also tested with llvmpipe.
 ---
  src/mesa/state_tracker/st_atom_blend.c | 38 
 +-
  1 file changed, 28 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/state_tracker/st_atom_blend.c 
 b/src/mesa/state_tracker/st_atom_blend.c
 index 6bb4077..30bff7a 100644
 --- a/src/mesa/state_tracker/st_atom_blend.c
 +++ b/src/mesa/state_tracker/st_atom_blend.c
 @@ -44,10 +44,21 @@
  /**
   * Convert GLenum blend tokens to pipe tokens.
   * Both blend factors and blend funcs are accepted.
 + * \param destBaseFormat  the base format of the render target, such as
 + *GL_RGBA, GL_RGB, GL_RED, GL_ALPHA, etc.
   */
  static GLuint
 -translate_blend(GLenum blend)
 +translate_blend(GLenum blend, GLenum destBaseFormat)
  {
 +   /* If we don't have destination alpha and the blend factor is either
 +* GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA then we use
 +* PIPE_BLENDFACTOR_ONE or _ZERO instead.
 +*/
 +   const bool haveDstA = (destBaseFormat == GL_RGBA ||
 +  destBaseFormat == GL_ALPHA ||
 +  destBaseFormat == GL_INTENSITY ||
 +  destBaseFormat == GL_LUMINANCE_ALPHA);
 +
 switch (blend) {
 /* blend functions */
 case GL_FUNC_ADD:
 @@ -69,7 +80,7 @@ translate_blend(GLenum blend)
 case GL_SRC_ALPHA:
return PIPE_BLENDFACTOR_SRC_ALPHA;
 case GL_DST_ALPHA:
 -  return PIPE_BLENDFACTOR_DST_ALPHA;
 +  return haveDstA ? PIPE_BLENDFACTOR_DST_ALPHA : PIPE_BLENDFACTOR_ONE;
 case GL_DST_COLOR:
return PIPE_BLENDFACTOR_DST_COLOR;
 case GL_SRC_ALPHA_SATURATE:
 @@ -91,7 +102,7 @@ translate_blend(GLenum blend)
 case GL_ONE_MINUS_DST_COLOR:
return PIPE_BLENDFACTOR_INV_DST_COLOR;
 case GL_ONE_MINUS_DST_ALPHA:
 -  return PIPE_BLENDFACTOR_INV_DST_ALPHA;
 +  return haveDstA ? PIPE_BLENDFACTOR_INV_DST_ALPHA : 
 PIPE_BLENDFACTOR_ZERO;
 case GL_ONE_MINUS_CONSTANT_COLOR:
return PIPE_BLENDFACTOR_INV_CONST_COLOR;
 case GL_ONE_MINUS_CONSTANT_ALPHA:
 @@ -208,14 +219,21 @@ update_blend( struct st_context *st )
     else if (ctx->Color.BlendEnabled) {
        /* blending enabled */
        for (i = 0, j = 0; i < num_state; i++) {
 +         const struct gl_renderbuffer *rb;
 +         GLenum baseFormat;

           blend->rt[i].blend_enable = (ctx->Color.BlendEnabled >> i) & 0x1;

           if (ctx->Extensions.ARB_draw_buffers_blend)
              j = i;

 +         /* _NEW_BUFFERS */
 +         /* Get the base format of the render target */
 +         rb = ctx->DrawBuffer->_ColorDrawBuffers[j];

That's the wrong render target, no? You need the i'th render target.
And what happens if I'm not using independent blend but one of the
RT's is RGB while the other is RGBA?

  -ilia

 +         baseFormat = rb ? rb->_BaseFormat : GL_RGBA;
 +
           blend->rt[i].rgb_func =
 -            translate_blend(ctx->Color.Blend[j].EquationRGB);
 +            translate_blend(ctx->Color.Blend[j].EquationRGB, baseFormat);

           if (ctx->Color.Blend[i].EquationRGB == GL_MIN ||
               ctx->Color.Blend[i].EquationRGB == GL_MAX) {
 @@ -225,13 +243,13 @@ update_blend( struct st_context *st )
           }
           else {
              blend->rt[i].rgb_src_factor =
 -               translate_blend(ctx->Color.Blend[j].SrcRGB);
 +               translate_blend(ctx->Color.Blend[j].SrcRGB, baseFormat);
              blend->rt[i].rgb_dst_factor =
 -               translate_blend(ctx->Color.Blend[j].DstRGB);
 +               translate_blend(ctx->Color.Blend[j].DstRGB, baseFormat);
           }

           blend->rt[i].alpha_func =
 -            translate_blend(ctx->Color.Blend[j].EquationA);
 +            translate_blend(ctx->Color.Blend[j].EquationA, baseFormat);

           if (ctx->Color.Blend[i].EquationA == GL_MIN ||
               ctx->Color.Blend[i].EquationA == GL_MAX) {
 @@ -241,9 +259,9 @@ update_blend( struct st_context *st )
           }
           else {
              blend->rt[i].alpha_src_factor =
 -               translate_blend(ctx->Color.Blend[j].SrcA);
 +   

Re: [Mesa-dev] [PATCH 2/2] st/mesa: also try PIPE_FORMAT_R10G10B10A2_UNORM for GL_RGB10

2015-04-28 Thread Matt Turner
On Tue, Apr 28, 2015 at 6:08 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> On Tue, Apr 28, 2015 at 9:03 PM, Matt Turner matts...@gmail.com wrote:
>> On Tue, Apr 28, 2015 at 5:23 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>>> Yes, sorry, thought that was implied since I had given it earlier
>>> pending my (as it turns out, incorrect) suggestion.
>>
>> A number of people have been confused (rightly so) by LGTM implying
>> Reviewed-by. Let's please not ever start implying or inferring
>> Reviewed-bys from anything less than a reply actually saying
>> Reviewed-by in the format expected to go into the commit message.
>
> Well, I had supplied it before, conditional on a (seemingly at the
> time) trivial change. Then I agreed that my condition was wrong. I
> thought that qualified as good enough to not have to re-give the
> R-b.

Ah, right. I only saw your second reply, and not the first with the R-b.


Re: [Mesa-dev] Regarding X.Org Endless Vacation of Code

2015-04-28 Thread Timothy Arceri
On Tue, 2015-04-28 at 04:28 +0530, Anish Kanchan wrote:
> Hello,
>
> I am a final year Computer Engineering student from Sardar Patel
> Institute of Technology. I have a good understanding of C and I have
> worked on projects using C as a part of my coursework. I am interested
> in working on the "Improved application of GLSL compiler
> optimizations" project idea for EVoC.
>
> So before submitting an application, I understand that I need to
> understand the code base and submit patches to show my programming
> proficiency and problem solving skills.
>
> However, before I begin doing so, I would like you to answer a few
> queries -
>
> 1. Can a proposal have only one 'easy' project idea (or will one have
> to club two easy ones)?
> 2. Could you point me to the necessary resources which will help me
> understand the project?

There is some basic developer info in the docs folder: devinfo.html
(which provides some tips for new developers) and sourcetree.html (which
has an excellent overview of the source code directory structure) are
two files I found helpful.

> 3. Could you also tell me how to set up the development environment
> for this project?
>
> Thanks,
> Anish Kanchan
> Student, University of Mumbai




[Mesa-dev] [Bug 90213] glDrawPixels with GL_COLOR_INDEX never returns.

2015-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90213

Bug ID: 90213
   Summary: glDrawPixels with GL_COLOR_INDEX never returns.
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: juhapekka.heikk...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

When I call glDrawPixels(width, height, GL_COLOR_INDEX, GL_UNSIGNED_BYTE, data),
execution never returns.

Commit 84eb402c01962e3ff4b70c9948c85a61ed81678f changed the path so that we end
up in _mesa_format_from_format_and_type at src/mesa/main/glformats.c:2682, which
does not seem to know anything about GL_COLOR_INDEX. On a release build this
function should return a value but doesn't, so things get confused; on a debug
build I get the assertion failure
_mesa_format_from_format_and_type: Assertion `!"Unsupported format"' failed.

I made a minimal test for this; it is available here:
https://github.com/juhapekka/CI_test.git

There also seems to be a Piglit test that does something similar and hangs the
Piglit run. The Piglit test is gl-1.0-drawpixels-color-index.
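
The failure mode described above (a format lookup that asserts in debug builds and falls off the end in release builds) can be sketched as follows. This is not the Mesa function and not a proposed fix: `internal_format_for`, the enum, and the sentinel are hypothetical names used only to show a lookup that always returns a defined value.

```c
#include <assert.h>

/* Hypothetical stand-ins for the pixel formats a caller might pass. */
enum pixel_format { PIXEL_RGBA, PIXEL_RGB, PIXEL_COLOR_INDEX };

/* Hypothetical sentinel meaning "no direct mapping exists". */
#define INTERNAL_FORMAT_NONE 0u

/* Map a pixel format to an internal format code.  Unhandled formats get
 * an explicit sentinel so the caller can choose a fallback path instead
 * of using an undefined return value. */
static unsigned
internal_format_for(enum pixel_format fmt)
{
   switch (fmt) {
   case PIXEL_RGBA:
      return 1u;
   case PIXEL_RGB:
      return 2u;
   default:
      return INTERNAL_FORMAT_NONE;
   }
}
```

A caller would then test for the sentinel before using the result, which keeps debug and release builds on the same code path.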



Re: [Mesa-dev] Statically linking libstdc++ and libgcc

2015-04-28 Thread Emil Velikov
On 28 April 2015 at 11:16, Jose Fonseca jfons...@vmware.com wrote:
 Hi,


 I don't know if in the end of this thread there was an agreement that Valve
 should only use bundled libstdc++ if it's newer than the system's libstdc++,
 or just no agreement at all.


It seems to me that there is no agreement atm :-(

 But just for future reference (or in case any distro decides to apply the
 patches themselves), I'd like to point out there a couple of technical
 issues with the proposed patch.

 I actually modified apitrace's LD_PRELOAD wrappers precisely as Vivek
 proposed (so apitrace can safely trace any application, without fear of
 symbol collision, no matter what), but ran against two problems:


 - For a long time static libstdcxx wasn't built with -fPIC ( see
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28811 ) so I actually had to
 add a configure-time check to see whether it works or not:


 https://github.com/apitrace/apitrace/commit/09531388e2aea19018ef03487d37a12547eb9325


Good catch. I would assume that as we mandate GCC 4.2 we should be
save, although the configure check won't hurt either.

 - libstdc++.a's symbols are not hidden (they have default visibility), and
 while `-Wl,--exclude-libs,libstdc++.a` prevents normal libstdc++.a's symbols
 from being exported, it does not work for weak symbols.  Ie, weak symbols
 from libstdc++.a's are still exported, and can clash with the weak symbols
 from the dynamically linked libstdc++.so.

   (Ironically I spotted this issue while tracing with Mesa's drivers.)

   So in the end I had to actually use LD version scripts:


 https://github.com/apitrace/apitrace/commit/946652f4fc103854d4f643551331eb72e8fb0345


We already use version scripts for the gallium drivers (dri, vdpau...)
due to static linking LLVM, which causes similar problem.



 IMHO I think that the solution that makes more sense for Valve is to
 manipulate LD_LIBRARY_PATH so that libstdc++ is only picked when necessary
 (newer than systems'), as Michel suggested.  It's the only way to guarantee
 maximum compatibility.

 Mesa could provide the option to statically link libstdc++, as a defense
 against third-party vendors that invariably dynamically link their own. But
 it seems unrealistic for a third-party app vendor to assume it's safe to
 always use a bundled libstdc++.  It's a matter of time until a system
 component relies on libstdc++.so.  If it's not a Mesa driver, it could be
 anything else.  Here, for example, are all the system shared objects that are
 loaded by CSGO on my home laptop:

 $ cat /proc/`pidof csgo_linux`/maps | grep -o '\s/\(usr/\)\?lib\S\+' | sort -u
  /lib/i386-linux-gnu/i686/cmov/libc-2.19.so
  /lib/i386-linux-gnu/i686/cmov/libdl-2.19.so
  /lib/i386-linux-gnu/i686/cmov/libm-2.19.so
  /lib/i386-linux-gnu/i686/cmov/libnsl-2.19.so
  /lib/i386-linux-gnu/i686/cmov/libnss_files-2.19.so
  /lib/i386-linux-gnu/i686/cmov/libpthread-2.19.so
  /lib/i386-linux-gnu/i686/cmov/libresolv-2.19.so
  /lib/i386-linux-gnu/i686/cmov/librt-2.19.so
  /lib/i386-linux-gnu/ld-2.19.so
  /lib/i386-linux-gnu/libudev.so.1.5.0
  /usr/lib/i386-linux-gnu/dri/i965_dri.so
  /usr/lib/i386-linux-gnu/gconv/gconv-modules.cache
  /usr/lib/i386-linux-gnu/gconv/UTF-32.so
  /usr/lib/i386-linux-gnu/libdrm_intel.so.1.0.0
  /usr/lib/i386-linux-gnu/libdrm_nouveau.so.2.0.0
  /usr/lib/i386-linux-gnu/libdrm_radeon.so.1.0.1
  /usr/lib/i386-linux-gnu/libdrm.so.2.4.0
  /usr/lib/i386-linux-gnu/libglapi.so.0.0.0
  /usr/lib/i386-linux-gnu/libGL.so.1.2.0
  /usr/lib/i386-linux-gnu/libpciaccess.so.0.11.1
  /usr/lib/i386-linux-gnu/libxshmfence.so.1.0.0
  /usr/lib/locale/locale-archive

 Who can say for sure that one of these won't one day get rewritten in C++, or
 introduce a new dependency on a module written in C++?

 In short, although I personally have no objection to providing the option to
 build Mesa with static libstdc++, I think it's unsustainable for Valve to
 keep making this assumption.

Great write-up Jose. Thank you very much.

-Emil


Re: [Mesa-dev] [PATCH 2/2] st/mesa: also try PIPE_FORMAT_R10G10B10A2_UNORM for GL_RGB10

2015-04-28 Thread Brian Paul

Unless I'm not seeing it, there is no such gallium format.

-Brian

On 04/28/2015 04:22 PM, Ilia Mirkin wrote:

Presumably you should also include RGB10_X2 while you're at it?

With that, Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:

---
  src/mesa/state_tracker/st_format.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_format.c 
b/src/mesa/state_tracker/st_format.c
index 181465d..db7b5b7 100644
--- a/src/mesa/state_tracker/st_format.c
+++ b/src/mesa/state_tracker/st_format.c
@@ -991,7 +991,7 @@ static const struct format_mapping format_map[] = {
 {
{ GL_RGB10, 0 },
{ PIPE_FORMAT_B10G10R10X2_UNORM, PIPE_FORMAT_B10G10R10A2_UNORM,
-DEFAULT_RGB_FORMATS }
+PIPE_FORMAT_R10G10B10A2_UNORM, DEFAULT_RGB_FORMATS }
 },
 {
{ GL_RGB10_A2, 0 },
--
1.9.1



Re: [Mesa-dev] [PATCH 2/2] st/mesa: also try PIPE_FORMAT_R10G10B10A2_UNORM for GL_RGB10

2015-04-28 Thread Brian Paul

R-b?

-Brian

On 04/28/2015 05:47 PM, Ilia Mirkin wrote:

That's... really asymmetrical. There's a
PIPE_FORMAT_R10G10B10X2_SNORM. Oh well -- no reason to add it in just
for this.

On Tue, Apr 28, 2015 at 7:47 PM, Brian Paul bri...@vmware.com wrote:

Unless I'm not seeing it, there is no such gallium format.

-Brian


On 04/28/2015 04:22 PM, Ilia Mirkin wrote:


Presumably you should also include RGB10_X2 while you're at it?

With that, Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

On Tue, Apr 28, 2015 at 6:16 PM, Brian Paul bri...@vmware.com wrote:


---
   src/mesa/state_tracker/st_format.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_format.c
b/src/mesa/state_tracker/st_format.c
index 181465d..db7b5b7 100644
--- a/src/mesa/state_tracker/st_format.c
+++ b/src/mesa/state_tracker/st_format.c
@@ -991,7 +991,7 @@ static const struct format_mapping format_map[] = {
  {
 { GL_RGB10, 0 },
 { PIPE_FORMAT_B10G10R10X2_UNORM, PIPE_FORMAT_B10G10R10A2_UNORM,
-DEFAULT_RGB_FORMATS }
+PIPE_FORMAT_R10G10B10A2_UNORM, DEFAULT_RGB_FORMATS }
  },
  {
 { GL_RGB10_A2, 0 },
--
1.9.1



[Mesa-dev] [Bug 90207] [r600g, bisected] regression: NI/Turks crash on WebGL Water (most WebGL stuff)

2015-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90207

--- Comment #15 from Dieter Nützel die...@nuetzel-hh.de ---
(In reply to Aaron Watry from comment #14)
 (In reply to José Fonseca from comment #13)
  I think I figured out the problem.
  
  As commented on DECL_RESOURCE_FUNC macro, the RESOURCE_VAR inline function
  is not type safe, and stuff that's not a ir_variable is wrongly being casted
  into it. 
  
  This patch seems to do the trick, but I'm not 100% sure.
  
 
 That patch fixes the crash in Konqueror for me.
 
 Whether or not it's correct, that's up to people who know the code better
 than myself.

For me, too.
git-6fe0d4f + José's patch

Thanks José!

Your WebGL implementation:

Renderer: WebKit WebGL
Vendor: WebKit
WebGL version: WebGL 1.0 (3.0 Mesa 10.6.0-devel (git-6fe0d4f))
GLSL version: WebGL GLSL ES 1.0 (1.30)
Supported WebGL extensions:

OES_texture_float   -- Where is this defined? ;-)
OES_standard_derivatives
WEBKIT_EXT_texture_filter_anisotropic
OES_vertex_array_object
OES_element_index_uint
WEBKIT_WEBGL_lose_context
WEBKIT_WEBGL_compressed_texture_s3tc
WEBKIT_WEBGL_depth_texture



Re: [Mesa-dev] Statically linking libstdc++ and libgcc

2015-04-28 Thread Jose Fonseca

On 28/04/15 12:22, Emil Velikov wrote:

On 28 April 2015 at 11:16, Jose Fonseca jfons...@vmware.com wrote:

Hi,


I don't know if in the end of this thread there was an agreement that Valve
should only use bundled libstdc++ if it's newer than the system's libstdc++,
or just no agreement at all.



It seems to me that there is no agreement atm :-(


But just for future reference (or in case any distro decides to apply the
patches themselves), I'd like to point out that there are a couple of technical
issues with the proposed patch.

I actually modified apitrace's LD_PRELOAD wrappers precisely as Vivek
proposed (so apitrace can safely trace any application, without fear of
symbol collision, no matter what), but ran against two problems:


- For a long time static libstdcxx wasn't built with -fPIC ( see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28811 ) so I actually had to
add a configure-time check to see whether it works or not:


https://github.com/apitrace/apitrace/commit/09531388e2aea19018ef03487d37a12547eb9325



Good catch. I would assume that as we mandate GCC 4.2 we should be
safe, although the configure check won't hurt either.


IIUC, from that bug report it looks like -fPIC on x86_64 was busted all 
the way till GCC 4.7.1.


Jose



[Mesa-dev] [Bug 90207] [r600g, bisected] regression: NI/Turks crash on WebGL Water (most WebGL stuff)

2015-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90207

--- Comment #9 from Aaron Watry awa...@gmail.com ---
Note that firefox 37 works fine for me on my CEDAR (Radeon 5400) on r600g on
current mesa master.

Konqueror 4.14.6 crashes while loading the OpenGL demo.

When running konqueror with LIBGL_DEBUG=verbose, I get the following:

awatry@ws-awatry:~$ LIBGL_DEBUG=verbose konqueror 
libGL: OpenDriver: trying /usr/local/lib/dri/tls/r600_dri.so
libGL: OpenDriver: trying /usr/local/lib/dri/r600_dri.so
libGL: OpenDriver: trying /usr/local/lib/dri/tls/r600_dri.so
libGL: OpenDriver: trying /usr/local/lib/dri/r600_dri.so
NOT SANDBOXED
Mesa: User error: GL_INVALID_VALUE in glGetActiveAttrib(index)
KCrash: Application 'konqueror' crashing...



Re: [Mesa-dev] [PATCH] mesa: Fix glGetProgramiv(GL_ACTIVE_ATTRIBUTES).

2015-04-28 Thread Tapani


Argh, this is exactly what I had in mind and I did it with 
_mesa_longest_attribute_name_length but forgot it in 
_mesa_count_active_attribs ... many thanks for catching this. I agree on 
the assertion change, that is simple and will make these bugs much 
easier to catch.


My overall plan is to go further to get rid of ir_variable usage 
completely and have only the required bits of it, this is still WIP though.


Reviewed-by: Tapani Pälli tapani.pa...@intel.com

On 04/28/2015 11:56 PM, Jose Fonseca wrote:

It's returning random values, because RESOURCE_VAR() is casting
different objects into ir_variable pointers.

This updates _mesa_count_active_attribs to filter the resources with
the same logic used in _mesa_longest_attribute_name_length.

https://bugs.freedesktop.org/show_bug.cgi?id=90207

P.S.: RESOURCE_VAR cast helper should have assertions to catch this.
---
  src/mesa/main/shader_query.cpp | 6 ++++--
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index a84ec84..d2ca49b 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -302,8 +302,10 @@ _mesa_count_active_attribs(struct gl_shader_program *shProg)
    struct gl_program_resource *res = shProg->ProgramResourceList;
    unsigned count = 0;
    for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
-      if (is_active_attrib(RESOURCE_VAR(res)))
-         count++;
+      if (res->Type == GL_PROGRAM_INPUT &&
+          res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
+          is_active_attrib(RESOURCE_VAR(res)))
+         count++;
    }
    return count;
 }




[Mesa-dev] [Bug 90207] [r600g, bisected] regression: NI/Turks crash on WebGL Water (most WebGL stuff)

2015-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90207

--- Comment #16 from Tapani Pälli lem...@gmail.com ---
(In reply to José Fonseca from comment #13)
 I think I figured out the problem.
 
 As commented on DECL_RESOURCE_FUNC macro, the RESOURCE_VAR inline function
 is not type safe, and stuff that's not a ir_variable is wrongly being casted
 into it. 
 
 This patch seems to do the trick, but I'm not 100% sure.

Yep it is correct, just like with _mesa_longest_attribute_name_length. Not sure
why I did not manage to hit this.

 diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
 index a84ec84..d2ca49b 100644
 --- a/src/mesa/main/shader_query.cpp
 +++ b/src/mesa/main/shader_query.cpp
 @@ -302,8 +302,10 @@ _mesa_count_active_attribs(struct gl_shader_program *shProg)
     struct gl_program_resource *res = shProg->ProgramResourceList;
     unsigned count = 0;
     for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
 -      if (is_active_attrib(RESOURCE_VAR(res)))
 -         count++;
 +      if (res->Type == GL_PROGRAM_INPUT &&
 +          res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
 +          is_active_attrib(RESOURCE_VAR(res)))
 +         count++;
     }
     return count;
  }
 
 
 
 I think we should invest the time to make RESOURCE_VAR and friends more
 robust, by adding assertions that the types are indeed correct.  (Replace
 DECL_RESOURCE_FUNC helper macro by manually written code if necessary.)
 
 This sort of bugs is hard to find otherwise.

I agree, current way is too fragile. IMO a type template would be best
solution. Maybe first just adding the assert for the correct type is good
start.
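
To illustrate the kind of checked accessor discussed above, here is a minimal
sketch. It is not Mesa's code: the types resource_type, variable, uniform and
program_resource, and the helpers resource_var_checked and
count_active_inputs, are hypothetical stand-ins invented for this example. The
idea is only that the cast helper looks at the resource type before
reinterpreting the payload, so a mismatched resource yields a null pointer (or
trips an assert) instead of being read as a variable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the real types; only the fields needed here.
enum resource_type { PROGRAM_INPUT, PROGRAM_OUTPUT, UNIFORM };

struct variable { const char *name; int location; };
struct uniform  { const char *name; };

struct program_resource {
   resource_type type;
   const void *data;   // points at a variable or a uniform, depending on type
};

// Checked accessor: returns the payload as a variable only when the
// resource type says it really is one, and nullptr otherwise.
inline const variable *
resource_var_checked(const program_resource *res)
{
   if (res->type != PROGRAM_INPUT && res->type != PROGRAM_OUTPUT)
      return nullptr;
   return static_cast<const variable *>(res->data);
}

// Count inputs with a valid location, filtering on the resource type
// before the payload is ever touched.
inline unsigned
count_active_inputs(const program_resource *list, size_t n)
{
   unsigned count = 0;
   for (size_t i = 0; i < n; i++) {
      if (list[i].type != PROGRAM_INPUT)
         continue;
      const variable *var = resource_var_checked(&list[i]);
      assert(var != nullptr);
      if (var->location != -1)
         count++;
   }
   return count;
}
```

Returning nullptr rather than asserting inside the accessor is a choice made
for this sketch; an assert in the accessor itself would match the suggestion
in the thread more literally.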



[Mesa-dev] [Bug 90207] [r600g, bisected] regression: NI/Turks crash on WebGL Water (most WebGL stuff)

2015-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90207

José Fonseca jfons...@vmware.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #17 from José Fonseca jfons...@vmware.com ---
(In reply to Tapani Pälli from comment #16)
 Yep it is correct, just like with _mesa_longest_attribute_name_length. 

Thanks. I pushed the fix now.

 Not sure why I did not manage to hit this.

It's random -- I think that is_active_attrib ends up interpreting random
pointer addresses as enums and deciding based on them.  Maybe the randomness is
higher with gallium drivers.

It was only when I tried to print var->name that things started to crash
consistently for me.

  I think we should invest the time to make RESOURCE_VAR and friends more
  robust, by adding assertions that the types are indeed correct.  (Replace
  DECL_RESOURCE_FUNC helper macro by manually written code if necessary.)
  
  This sort of bugs is hard to find otherwise.
 
 I agree, current way is too fragile. IMO a type template would be best
 solution. Maybe first just adding the assert for the correct type is good
 start.

Yep.



Re: [Mesa-dev] [PATCH 03/16] i965/blorp: Allow caller to provide sampler settings

2015-04-28 Thread Pohjolainen, Topi
On Tue, Apr 28, 2015 at 03:17:30PM -0700, Kenneth Graunke wrote:
 On Thursday, April 23, 2015 09:00:28 PM Topi Pohjolainen wrote:
  Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
  ---
   src/mesa/drivers/dri/i965/brw_blorp.h|  4 +++-
   src/mesa/drivers/dri/i965/gen6_blorp.cpp | 15 +--
   src/mesa/drivers/dri/i965/gen7_blorp.cpp |  3 ++-
   3 files changed, 14 insertions(+), 8 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
  b/src/mesa/drivers/dri/i965/brw_blorp.h
  index 59aecab..63dfe5b 100644
  --- a/src/mesa/drivers/dri/i965/brw_blorp.h
  +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
  @@ -415,7 +415,9 @@ gen6_blorp_emit_drawing_rectangle(struct brw_context 
  *brw,
   
   uint32_t
   gen6_blorp_emit_sampler_state(struct brw_context *brw,
  -  const brw_blorp_params *params);
  +  unsigned tex_filter, unsigned max_lod,
  +  bool use_unorm_coords);
 
 Let's not call this use_unorm_coords.  UNORM is a data format -
 unsigned normalized fixed point numbers.
 
 This means non-normalized texture coordinates, i.e. ones that range from
 [0, width]x[0, height] instead of [0.0, 1.0]x[0.0, 1.0].
 
 is_rectangle_sampler, non_normalized_coords, or use_rect_coords
 all seem like reasonable names.  I'm open to other ideas too.

I agree, and I wasn't too happy with the name either. I'll switch it to
non_normalized_coords, it is closest to the spec also. Thanks!
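
As a small illustration of the distinction made above, the following sketch
maps a normalized coordinate to the non-normalized range. The tex_coord struct
and to_non_normalized helper are hypothetical names for this example only and
do not appear in the patch.

```cpp
#include <cassert>

struct tex_coord { float s, t; };

// Convert a normalized coordinate in [0.0, 1.0]x[0.0, 1.0] to a
// non-normalized one in [0, width]x[0, height].
inline tex_coord
to_non_normalized(tex_coord c, unsigned width, unsigned height)
{
   return { c.s * width, c.t * height };
}
```

For a 256x128 surface, the normalized coordinate (0.5, 0.25) corresponds to
the non-normalized coordinate (128, 32).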


Re: [Mesa-dev] [PATCH 14/18] i965/wm/gen6: Refactor program offset setup

2015-04-28 Thread Pohjolainen, Topi
On Tue, Apr 28, 2015 at 03:01:43PM -0700, Kenneth Graunke wrote:
 On Wednesday, April 22, 2015 11:47:34 PM Topi Pohjolainen wrote:
  Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
  ---
   src/mesa/drivers/dri/i965/brw_state.h |  8 +
   src/mesa/drivers/dri/i965/gen6_wm_state.c | 56 
  ++-
   2 files changed, 41 insertions(+), 23 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
  b/src/mesa/drivers/dri/i965/brw_state.h
  index 23f36c0..ca3274d 100644
  --- a/src/mesa/drivers/dri/i965/brw_state.h
  +++ b/src/mesa/drivers/dri/i965/brw_state.h
  @@ -292,6 +292,14 @@ void brw_update_sampler_state(struct brw_context *brw,
 uint32_t *sampler_state,
 uint32_t batch_offset_for_sampler_state);
   
  +/* gen6_wm_state.c */
  +void
  +gen6_wm_state_set_programs(const struct brw_wm_prog_data *prog_data,
  +   const struct brw_stage_state *stage_state,
  +   int min_inv_per_frag,
  +   uint32_t *ksp0, uint32_t *ksp2,
  +   uint32_t *dw4, uint32_t *dw5, uint32_t *dw6);
  +
   /* gen6_sf_state.c */
   void
   calculate_attr_overrides(const struct brw_context *brw,
  diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c 
  b/src/mesa/drivers/dri/i965/gen6_wm_state.c
  index 8e673a4..bc921e5 100644
  --- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
  +++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
  @@ -65,6 +65,37 @@ const struct brw_tracked_state gen6_wm_push_constants = {
  .emit = gen6_upload_wm_push_constants,
   };
   
  +void
  +gen6_wm_state_set_programs(const struct brw_wm_prog_data *prog_data,
  +   const struct brw_stage_state *stage_state,
  +   int min_inv_per_frag,
  +   uint32_t *ksp0, uint32_t *ksp2,
  +   uint32_t *dw4, uint32_t *dw5, uint32_t *dw6)
  +{
  +   if (prog_data->prog_offset_16 || prog_data->no_8) {
  +  *dw5 |= GEN6_WM_16_DISPATCH_ENABLE;
  +
  +  if (!prog_data->no_8 && min_inv_per_frag == 1) {
  + *dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
  + *dw4 |= (prog_data->base.dispatch_grf_start_reg <<
  +  GEN6_WM_DISPATCH_START_GRF_SHIFT_0);
  + *dw4 |= (prog_data->dispatch_grf_start_reg_16 <<
  +  GEN6_WM_DISPATCH_START_GRF_SHIFT_2);
  + *ksp0 = stage_state->prog_offset;
  + *ksp2 = stage_state->prog_offset + prog_data->prog_offset_16;
  +  } else {
  + *dw4 |= (prog_data->dispatch_grf_start_reg_16 <<
  +  GEN6_WM_DISPATCH_START_GRF_SHIFT_0);
  + *ksp0 = stage_state->prog_offset + prog_data->prog_offset_16;
  +  }
  +   } else {
  +  *dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
  +  *dw4 |= (prog_data->base.dispatch_grf_start_reg <<
  +   GEN6_WM_DISPATCH_START_GRF_SHIFT_0);
  +  *ksp0 = stage_state->prog_offset;
  +   }
  +}
  +
 
 This split feels awkward to me - the code to emit 3DSTATE_WM is now
 split across multiple functions...and it has 5 out parameters.  I really
 prefer keeping the code to fill out a packet's DWords together in one
 function.
 
 Could we keep it in one function, but instead make upload_wm_state()
 take additional parameters, rather than poking at brw-> directly?
 
 Sorry for the trouble...

I'll take another look. I can't remember all the details but I started that
way, it became ugly and so I decided to do this instead.


Re: [Mesa-dev] [PATCH 13/18] i965: Pass slice details as parameters for surface setup

2015-04-28 Thread Pohjolainen, Topi
On Tue, Apr 28, 2015 at 02:45:27PM -0700, Kenneth Graunke wrote:
 On Wednesday, April 22, 2015 11:47:33 PM Topi Pohjolainen wrote:
  Also changed a couple of direct shifts into SET_FIELD().
  
  Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
  ---
   src/mesa/drivers/dri/i965/brw_context.h   |  3 ++-
   src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 30 
  +--
   src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 14 +--
   src/mesa/drivers/dri/i965/gen8_surface_state.c| 10 +++-
   4 files changed, 29 insertions(+), 28 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
  b/src/mesa/drivers/dri/i965/brw_context.h
  index b90d329..ae28955 100644
  --- a/src/mesa/drivers/dri/i965/brw_context.h
  +++ b/src/mesa/drivers/dri/i965/brw_context.h
  @@ -964,10 +964,11 @@ struct brw_context
  {
 void (*update_texture_surface)(struct brw_context *brw,
const struct intel_mipmap_tree *mt,
  - struct gl_texture_object *tObj,
uint32_t tex_format,
bool is_integer_format,
GLenum target, uint32_t 
  effective_depth,
  + uint32_t min_layer,
  + uint32_t min_lod, uint32_t mip_count, 
int swizzle, uint32_t *surf_offset,
bool for_gather);
 uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
  diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
  b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
  index f7acad4..ad5ddb5 100644
  --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
  +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
  @@ -310,16 +310,16 @@ update_buffer_texture_surface(struct gl_context *ctx,
   static void
   brw_update_texture_surface(struct brw_context *brw,
  const struct intel_mipmap_tree *mt,
  -   struct gl_texture_object *tObj,
  uint32_t tex_format,
  bool is_integer_format /* unused */,
  GLenum target,
  uint32_t effective_depth /* unused */,
  +   uint32_t min_layer /* unused */,
  +   uint32_t min_lod, uint32_t mip_count, 
  int swizzle /* unused */,
  uint32_t *surf_offset,
  bool for_gather)
   {
  -   struct intel_texture_object *intelObj = intel_texture_object(tObj);
  uint32_t *surf;
   
  surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
  @@ -361,16 +361,16 @@ brw_update_texture_surface(struct brw_context *brw,
   
  surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
   
  -   surf[2] = ((intelObj->_MaxLevel - tObj->BaseLevel) << BRW_SURFACE_LOD_SHIFT |
  - (mt->logical_width0 - 1) << BRW_SURFACE_WIDTH_SHIFT |
  - (mt->logical_height0 - 1) << BRW_SURFACE_HEIGHT_SHIFT);
  +   surf[2] = SET_FIELD(mip_count, BRW_SURFACE_LOD) |
  + SET_FIELD(mt->logical_width0 - 1, BRW_SURFACE_WIDTH) |
  + SET_FIELD(mt->logical_height0 - 1, BRW_SURFACE_HEIGHT);
   
  -   surf[3] = (brw_get_surface_tiling_bits(mt->tiling) |
  - (mt->logical_depth0 - 1) << BRW_SURFACE_DEPTH_SHIFT |
  - (mt->pitch - 1) << BRW_SURFACE_PITCH_SHIFT);
  +   surf[3] = brw_get_surface_tiling_bits(mt->tiling) |
  +SET_FIELD(mt->logical_depth0 - 1, BRW_SURFACE_DEPTH) |
  +SET_FIELD(mt->pitch - 1, BRW_SURFACE_PITCH);
   
  -   surf[4] = (brw_get_surface_num_multisamples(mt->num_samples) |
  -  SET_FIELD(tObj->BaseLevel - mt->first_level, BRW_SURFACE_MIN_LOD));
  +   surf[4] = brw_get_surface_num_multisamples(mt->num_samples) |
  + SET_FIELD(min_lod, BRW_SURFACE_MIN_LOD);
 
 This is not equivalent...Min Lod used to be:
 
tObj->BaseLevel - mt->first_level
 
 and now it is:
 
tObj->MinLevel + tObj->BaseLevel - mt->first_level
 
 I would really appreciate it if you could make this a separate patch
 from the refactoring, for easier bisectability.  (First add tObj->MinLevel
 to the Gen4-6 code, then do this refactor.)
 
 It seems like a fine change, but is certainly worth noting in the commit
 message.  Perhaps this is what fixed some tests?

Mark helped me to bisect that with Jenkins, it wasn't this patch. It was

i965: Pass slice details as parameters for surface setup


Anyway, as the series grew I started forgetting things I meant to say or
fix. This was something I meant to address; my apologies that you needed to
figure it out manually. I actually thought of adding an assert and a comment here:

   /* Minimum level setting is only used for ARB_texture_view which isn't
* enabled before gen7.

Re: [Mesa-dev] [PATCH 1/3] mesa: remove unused options var in compile_shader()

2015-04-28 Thread Anuj Phogat
On Mon, Apr 27, 2015 at 4:35 PM, Brian Paul bri...@vmware.com wrote:
 ---
  src/mesa/main/shaderapi.c | 3 ---
  1 file changed, 3 deletions(-)

 diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
 index cc001ba..a04b287 100644
 --- a/src/mesa/main/shaderapi.c
 +++ b/src/mesa/main/shaderapi.c
 @@ -861,14 +861,11 @@ static void
  compile_shader(struct gl_context *ctx, GLuint shaderObj)
  {
 struct gl_shader *sh;
 -   struct gl_shader_compiler_options *options;

 sh = _mesa_lookup_shader_err(ctx, shaderObj, "glCompileShader");
 if (!sh)
return;

 -   options = &ctx->Const.ShaderCompilerOptions[sh->Stage];
 -
 if (!sh->Source) {
/* If the user called glCompileShader without first calling
 * glShaderSource, we should fail to compile, but not raise a GL_ERROR.
 --
 1.9.1


Series is:
Reviewed-by: Anuj Phogat anuj.pho...@gmail.com


[Mesa-dev] [PATCH 18/21] i965: Define consistent interface to perform cross-component flag result reduction.

2015-04-28 Thread Francisco Jerez
This is the only non-trivial instruction manipulator.  It enables
using ALIGN16 predication modes in the scalar back-end without
emitting any additional instructions by using a combination of
predication and conditional mods.  exec_reduce() prepares an
instruction for the desired reduction mode (a no-op except for SVEC4
instructions in the scalar back-end), subsequent instructions can then
predicate on the result of the reduction by using the NORMAL
predication mode on the single flag register written by the generating
instruction.
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h|  9 +++
 src/mesa/drivers/dri/i965/brw_ir_svec4.h | 45 
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  | 10 +++
 3 files changed, 64 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 7e5083c..c9d40ce 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -373,4 +373,13 @@ exec_saturate(bool saturate, fs_inst *inst)
return inst;
 }
 
+/**
+ * No-op.  See the SVEC4 implementation.
+ */
+static inline fs_inst *
+exec_reduce(brw_predicate pred, fs_inst *inst)
+{
+   return inst;
+}
+
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_svec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
index 508ed5e..36164a9 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_svec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
@@ -439,6 +439,51 @@ namespace brw {
 
   return inst;
}
+
+   /**
+* Perform a vector reduction on the flag result of \p inst.  This allows
+* using vector predication modes (the ALIGN16 ones) with SVEC4
+* instructions, even though we may not have enough flag registers
+* available to hold the flag results from all components.
+*
+* The (largely inconsequential) limitation is that the predication mode
+* has to be already known by the generating instruction (i.e. \p inst), so
+* you cannot apply different ALIGN16 predication modes on the flag result
+* of the same reduced instruction.
+*/
+   inline svec4_inst *
+   exec_reduce(brw_predicate pred, svec4_inst *inst)
+   {
+  switch (pred) {
+  case BRW_PREDICATE_ALIGN16_REPLICATE_X:
+  case BRW_PREDICATE_ALIGN16_REPLICATE_Y:
+  case BRW_PREDICATE_ALIGN16_REPLICATE_Z:
+  case BRW_PREDICATE_ALIGN16_REPLICATE_W: {
+ const unsigned j = pred - BRW_PREDICATE_ALIGN16_REPLICATE_X;
+
+ for (unsigned i = 0; i < ARRAY_SIZE(inst->v); ++i) {
+if (inst->v[i] && i != j)
+   exec_condmod(BRW_CONDITIONAL_NONE, inst->v[i]);
+ }
+
+ return inst;
+  }
+  case BRW_PREDICATE_ALIGN16_ANY4H:
+  case BRW_PREDICATE_ALIGN16_ALL4H: {
+ const bool invert = (pred == BRW_PREDICATE_ALIGN16_ANY4H);
+ unsigned j = 0;
+
+ for (unsigned i = 0; i < ARRAY_SIZE(inst->v); ++i) {
+if (inst->v[i] && j++ > 0)
+   exec_predicate_inv(BRW_PREDICATE_NORMAL, invert, inst->v[i]);
+ }
+
+ return inst;
+  }
+  default:
+ unreachable("Not reached");
+  }
+   }
 }
 
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index a407ec4..a9a1e0b 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -374,6 +374,16 @@ exec_saturate(bool saturate, vec4_instruction *inst)
inst->saturate = saturate;
return inst;
 }
+
+/**
+ * No-op.  See the SVEC4 implementation.
+ */
+inline vec4_instruction *
+exec_reduce(brw_predicate pred, vec4_instruction *inst)
+{
+   return inst;
+}
+
 } /* namespace brw */
 
 #endif
-- 
2.3.5



[Mesa-dev] [PATCH 08/21] i965/vec4: Make src_reg conversion constructor from dst_reg implicit.

2015-04-28 Thread Francisco Jerez
The dst_reg to src_reg conversion is fairly safe since
430c6bf70e48c08ba4dc9e00f2b88e2230793010.  No information is lost and
OP(dst, src_reg(dst), src1) does what one would expect -- This seems
just annoying now.  The implicit conversion allows you to declare
temporaries that are both written and read from as dst_reg and have
them conveniently converted to src_reg when they are used.  They also
avoid redundant expressions like 'negate(src_reg(tmp))',
'swizzle(src_reg(tmp), ...)' or 'src_vector(src_reg(tmp), ...)' (the
latter function will be defined in a future commit).

The src_reg to dst_reg conversion is kept explicit because it does
lose component ordering information.
---
 src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index c65c148..a5fc26f 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -46,6 +46,7 @@ public:
src_reg(int32_t i);
src_reg(uint8_t vf[4]);
src_reg(uint8_t vf0, uint8_t vf1, uint8_t vf2, uint8_t vf3);
+   src_reg(const dst_reg reg);
src_reg(struct brw_reg reg);
 
bool equals(const src_reg r) const;
@@ -53,8 +54,6 @@ public:
src_reg(class vec4_visitor *v, const struct glsl_type *type);
src_reg(class vec4_visitor *v, const struct glsl_type *type, int size);
 
-   explicit src_reg(const dst_reg reg);
-
unsigned swizzle; /**< BRW_SWIZZLE_XYZW macros from brw_reg.h. */
 
src_reg *reladdr;
-- 
2.3.5
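
The effect of dropping the explicit keyword can be sketched as follows. This
is a heavily simplified illustration, not the real brw_ir_vec4.h: the dst_reg
and src_reg structs below carry only a few hypothetical fields, and the
conversion simply assumes an identity swizzle rather than deriving one from
the writemask.

```cpp
#include <cassert>

// Hypothetical, heavily simplified register types (not the real ones).
struct dst_reg {
   int nr;
   unsigned writemask;
};

struct src_reg {
   int nr;
   unsigned swizzle;
   bool negate;

   src_reg(int nr_, unsigned swizzle_)
      : nr(nr_), swizzle(swizzle_), negate(false) {}

   // Non-explicit converting constructor: a dst_reg can be passed
   // wherever a src_reg is expected.  The identity swizzle used here
   // is an assumption made for this sketch.
   src_reg(const dst_reg &reg)
      : nr(reg.nr), swizzle(0xe4 /* XYZW */), negate(false) {}
};

// Source modifier helper taking a src_reg by value.
inline src_reg negate(src_reg reg)
{
   reg.negate = !reg.negate;
   return reg;
}
```

With the converting constructor implicit, a temporary declared as dst_reg can
be written as negate(tmp); were the constructor marked explicit, the call
would have to be spelled negate(src_reg(tmp)).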



[Mesa-dev] [PATCH 19/21] i965/fs: Introduce FS IR builder.

2015-04-28 Thread Francisco Jerez
The purpose of this change is threefold: First, it improves the
modularity of the compiler back-end by separating the functionality
required to construct an i965 IR program from the rest of the visitor
god-object, what in turn will reduce the coupling between other
components and the visitor allowing a more modular design.  This patch
doesn't yet remove the equivalent functionality from the visitor
classes, that task will be undertaken by a separate series, as it
involves major back-end surgery.

Second, it improves consistency between the scalar and vector
back-ends in order to enable back-end agnostic IR generation.  The FS
and VEC4 builders can both be used to generate scalar code with a
compatible interface or they can be used to generate natural vector
width code -- 1 or 4 components respectively according to
traits::chan_size.  The SVEC4 builder is an alternative FS IR builder
that generates implicitly scalarized vector code, with a similar
interface to the VEC4 builder.

Third, the approach to IR construction is somewhat different to what
the visitor classes currently do.  All parameters affecting code
generation (execution size, half control, point in the program where
new instructions are inserted, etc.) are encapsulated in a stand-alone
object rather than being quasi-global state (yes, anything defined in
one of the visitor classes is effectively global due to the tight
coupling with virtually everything else in the compiler back-end).
This object is lightweight and can be copied, mutated and passed
around, making helper IR-building functions more flexible because they
can now simply take a builder object as argument and will inherit its
IR generation properties in exactly the same way that a discrete
instruction would from the same builder object.
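
As a rough illustration of this third point (a toy sketch, not code from
this series): toy_builder, toy_inst and emit_payload_setup() below are
hypothetical names, and the only parameters modelled are the execution
size and the half index.  The helper takes a builder and never touches
execution controls itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy IR instruction: records only the parameters it was emitted with.
struct toy_inst {
   std::string opcode;
   unsigned exec_size;   // number of channels the instruction operates on
   unsigned half;        // which half of the dispatch the instruction covers
};

// Hypothetical stand-in for a builder: all code generation parameters live
// in a small copyable object instead of in the visitor.
class toy_builder {
public:
   toy_builder(std::vector<toy_inst> *program, unsigned dispatch_width) :
      program(program), width(dispatch_width), half_index(0)
   {
   }

   // Return a copy that emits half-width instructions for the i-th half.
   toy_builder
   half(unsigned i) const
   {
      toy_builder bld = *this;
      bld.width = width / 2;
      bld.half_index = i;
      return bld;
   }

   unsigned
   dispatch_width() const
   {
      return width;
   }

   // Emit an instruction that inherits the builder's parameters.
   void
   emit(const std::string &opcode) const
   {
      program->push_back(toy_inst{opcode, width, half_index});
   }

private:
   std::vector<toy_inst> *program;
   unsigned width;
   unsigned half_index;
};

// A helper only needs the builder; it never sets execution controls itself.
inline void
emit_payload_setup(const toy_builder &bld)
{
   bld.emit("MOV");
   bld.emit("ADD");
}
```

Calling emit_payload_setup(bld.half(1)) makes every instruction of the
helper half-width without the helper knowing about it, which is the
property the paragraph above relies on.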

The emit_typed_write() function from my image-load-store branch is an
example that illustrates the usefulness of the latter point: Due to
hardware limitations the function may have to split the untyped
surface message into 8-wide chunks.  That means that the several
functions called to help with the construction of the message payload
are themselves required to set the execution width and half control
correctly on the instructions they emit, and to allocate all registers
with half the default width.  With the previous approach this would
require the helper functions involved to be aware of the parameters that
might differ from the default state and explicitly set the instruction
bits accordingly.  With the new approach they would get a modified
builder object as argument that would influence all instructions
emitted by the helper function as if it were the default state.

Another example is the fs_visitor::VARYING_PULL_CONSTANT_LOAD()
method.  It doesn't actually emit any instructions; they are simply
created and inserted into an exec_list which is returned for the
caller to emit at some location of the program.  This sort of two-step
emission becomes unnecessary with the builder interface because the
insertion point is one more of the code generation parameters which
are part of the builder object.  The caller can simply pass
VARYING_PULL_CONSTANT_LOAD() a modified builder object pointing at the
location of the program where the effect of the constant load is
desired.  This two-step emission (which pervades the compiler back-end
and is in most cases redundant) goes away: E.g. ADD() now actually
emits an instruction adding two registers rather than just creating an
ADD instruction in memory, so emit(ADD()) is no longer necessary.
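
The insertion-point idea can be sketched roughly as follows.  This is a
toy model under stated assumptions, not the interface added by the patch:
toy_cursor_builder, at() and emit_constant_load() are hypothetical names,
and the program is just a list of opcode strings.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

typedef std::list<std::string> toy_program;

// Hypothetical builder whose only parameter is the insertion point.
class toy_cursor_builder {
public:
   toy_cursor_builder(toy_program *program, toy_program::iterator where) :
      program(program), where(where)
   {
   }

   // Return a copy of this builder that inserts before pos instead.
   toy_cursor_builder
   at(toy_program::iterator pos) const
   {
      return toy_cursor_builder(program, pos);
   }

   // Create the instruction and insert it in one step.
   void
   ADD() const
   {
      program->insert(where, std::string("ADD"));
   }

   void
   MOV() const
   {
      program->insert(where, std::string("MOV"));
   }

private:
   toy_program *program;
   toy_program::iterator where;
};

// Helper that emits a two-instruction sequence wherever the builder points.
inline void
emit_constant_load(const toy_cursor_builder &bld)
{
   bld.MOV();
   bld.ADD();
}
```

The caller picks the location by passing a modified builder, so the
helper has no need to return a list of instructions for later emission.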
---
 src/mesa/drivers/dri/i965/Makefile.sources |   1 +
 src/mesa/drivers/dri/i965/brw_fs_builder.h | 682 +
 2 files changed, 683 insertions(+)
 create mode 100644 src/mesa/drivers/dri/i965/brw_fs_builder.h

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 83acbd0..20cbdb2 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -38,6 +38,7 @@ i965_FILES = \
brw_ff_gs.c \
brw_ff_gs_emit.c \
brw_ff_gs.h \
+   brw_fs_builder.h \
brw_fs_channel_expressions.cpp \
brw_fs_cmod_propagation.cpp \
brw_fs_combine_constants.cpp \
diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h b/src/mesa/drivers/dri/i965/brw_fs_builder.h
new file mode 100644
index 000..6b36d1f
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
@@ -0,0 +1,682 @@
+/* -*- c++ -*- */
+/*
+ * Copyright © 2010-2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The 

[Mesa-dev] [PATCH 07/21] i965: Add resize() register helper function.

2015-04-28 Thread Francisco Jerez
The resize() function takes a number of vector components from the
register given as argument.  Until now the VEC4 back-end would use
swizzle() or writemask() depending on the register type, and the FS
back-end would leave the register untouched.  This provides a
consistent interface to do the same operation on any register type on
either back-end.
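
For illustration only, the two VEC4 cases can be modelled like this.
The functions below are hypothetical toy versions (not the Mesa helpers)
that assume a swizzle made of four 2-bit selectors with x in the low bits
and a 4-bit writemask; the source-side variant repeats the last valid
component, which is an assumption of this sketch.

```cpp
#include <cassert>

// Pack four 2-bit component selectors, x in the low bits (toy encoding).
inline unsigned
toy_swizzle4(unsigned a, unsigned b, unsigned c, unsigned d)
{
   return a | (b << 2) | (c << 4) | (d << 6);
}

// Source side: select the first n components, repeating the last valid one.
inline unsigned
toy_swizzle_for_size(unsigned n)
{
   assert(n >= 1 && n <= 4);
   const unsigned last = n - 1;
   return toy_swizzle4(0,
                       1 < n ? 1 : last,
                       2 < n ? 2 : last,
                       3 < n ? 3 : last);
}

// Destination side: enable writes to the first n components only.
inline unsigned
toy_writemask_for_size(unsigned n)
{
   assert(n <= 4);
   return (1u << n) - 1;
}
```

With n = 2 the source reads components x, y, y, y while the destination
only writes x and y.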
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h   | 10 ++
 src/mesa/drivers/dri/i965/brw_ir_vec4.h | 18 ++
 2 files changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index ee5606f..89c8e15 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -170,6 +170,16 @@ component(const fs_reg reg, unsigned i)
return offset(reg, i);
 }
 
+/**
+ * Return a register with the first \p n components of \p reg.  No-op since FS
+ * registers have an unspecified number of components.
+ */
+static inline fs_reg
+resize(const fs_reg reg, unsigned n)
+{
+   return reg;
+}
+
 static inline bool
 is_uniform(const fs_reg reg)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index f0bdd29..c65c148 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -105,6 +105,15 @@ component(const src_reg reg, unsigned i)
return swizzle(reg, BRW_SWIZZLE4(i, i, i, i));
 }
 
+/**
+ * Return a register with the first \p n components of \p reg.
+ */
+static inline src_reg
+resize(const src_reg reg, unsigned n)
+{
+   return swizzle(reg, brw_swizzle_for_size(n));
+}
+
 static inline bool
 is_uniform(const src_reg reg)
 {
@@ -169,6 +178,15 @@ component(const dst_reg reg, unsigned i)
return writemask(reg, 1 << i);
 }
 
+/**
+ * Return a register with the first \p n components of \p reg.
+ */
+static inline dst_reg
+resize(const dst_reg reg, unsigned n)
+{
+   return writemask(reg, (1 << n) - 1);
+}
+
 class vec4_instruction : public backend_instruction {
 public:
DECLARE_RALLOC_CXX_OPERATORS(vec4_instruction)
-- 
2.3.5



[Mesa-dev] [PATCH 20/21] i965/fs: Introduce scalarizing SVEC4 IR builder.

2015-04-28 Thread Francisco Jerez
See "i965/fs: Introduce FS IR builder." for the rationale.
---
 src/mesa/drivers/dri/i965/brw_fs_builder.h | 426 +
 1 file changed, 426 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h b/src/mesa/drivers/dri/i965/brw_fs_builder.h
index 6b36d1f..0368d2b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
@@ -677,6 +677,432 @@ namespace brw {
   const void *base_ir;
   /** @} */
};
+
+   /**
+* Toolbox to assemble an FS IR program out of vector instructions,
+* scalarizing them on emission.  It's meant to be largely compatible with
+* brw::vec4_builder in order to enable generic FS/VEC4 programming.
+*/
+   class svec4_builder {
+   public:
+  /** Type used in this IR to represent a source of an instruction. */
+  typedef src_svec4 src_reg;
+
+  /** Type used in this IR to represent the destination of an instruction. */
+  typedef dst_svec4 dst_reg;
+
+  /** Type used in this IR to represent an instruction. */
+  typedef svec4_inst instruction;
+
+  /** You can use this to do scalar operations on the same IR. */
+  typedef fs_builder scalar_builder;
+
+  /** We build vector instructions. */
+  typedef svec4_builder vector_builder;
+
+  /**
+   * Construct a scalarizing vector builder stacked on top of a scalar
+   * builder.
+   */
+  svec4_builder(const fs_builder bld) :
+ devinfo(bld.devinfo), bld(bld)
+  {
+  }
+
+  /**
+   * Construct a scalar builder inheriting other code generation
+   * parameters from this.
+   */
+  const fs_builder 
+  scalar() const
+  {
+ return bld;
+  }
+
+  /**
+   * Construct a vector builder inheriting other code generation
+   * parameters from this.
+   */
+  svec4_builder
+  vector() const
+  {
+ return *this;
+  }
+
+  /**
+   * Construct a builder of half-SIMD-width instructions inheriting other
+   * code generation parameters from this.  Predication and control flow
+   * masking will use the enable signals for the i-th half.
+   */
+  svec4_builder
+  half(unsigned i) const
+  {
+ return svec4_builder(bld.half(i));
+  }
+
+  /**
+   * Get the SIMD width in use.
+   */
+  unsigned
+  dispatch_width() const
+  {
+ return bld.dispatch_width();
+  }
+
+  /**
+   * Get the lowered predicate to be used to interpret the flag result
+   * written by a reduced SVEC4 instruction (i.e. having called
+   * brw::exec_reduce() on the instruction with \p pred as argument).
+   * This can be used to map an ALIGN16 predication mode into an ALIGN1
+   * mode, allowing vector comparisons in the scalar back-end.
+   *
+   * \sa brw::exec_reduce().
+   */
+  static brw_predicate
+  reduced_predicate(brw_predicate pred)
+  {
+ return (pred == BRW_PREDICATE_NONE ? BRW_PREDICATE_NONE :
+ BRW_PREDICATE_NORMAL);
+  }
+
+  /**
+   * Allocate a virtual register of natural vector size and SIMD width.
+   * \p n gives the amount of space to allocate in dispatch_width units
+   * (which is just enough space for one logical component in this IR).
+   */
+  dst_reg
+  natural_reg(brw_reg_type type, unsigned n = 4) const
+  {
+ return resize(dst_reg(bld.natural_reg(type, n)), n);
+  }
+
+  /**
+   * Create a register of natural vector size and SIMD width using array
+   * \p reg as storage.
+   */
+  dst_reg
+  natural_reg(const array_reg reg) const
+  {
+ return bld.natural_reg(reg);
+  }
+
+  /**
+   * Allocate a virtual register of vector size one and natural SIMD
+   * width.
+   */
+  dst_reg
+  scalar_reg(brw_reg_type type) const
+  {
+ return dst_reg(bld.natural_reg(type), WRITEMASK_X);
+  }
+
+  /**
+   * Allocate a raw chunk of memory from the virtual GRF file with no
+   * special vector size or SIMD width.  \p n is given in units of 32B
+   * registers.
+   */
+  ::array_reg
+  array_reg(enum brw_reg_type type, unsigned n) const
+  {
+ return bld.array_reg(type, n);
+  }
+
+  /**
+   * Create a null register of floating type.
+   */
+  dst_reg
+  null_reg_f() const
+  {
+ return dst_reg(retype(brw_null_vec(dispatch_width()),
+   BRW_REGISTER_TYPE_F));
+  }
+
+  /**
+   * Create a null register of signed integer type.
+   */
+  dst_reg
+  null_reg_d() const
+  {
+ return dst_reg(retype(brw_null_vec(dispatch_width()),
+   BRW_REGISTER_TYPE_D));
+  }
+
+  /**
+   * Create a null register of unsigned integer type.
+   */
+  dst_reg
+  null_reg_ud() 

[Mesa-dev] [PATCH 02/21] i965/fs: Fix offset() for registers with zero stride.

2015-04-28 Thread Francisco Jerez
stride == 0 implies that the register has one channel per vector
component.
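
A small sketch of the arithmetic, assuming a hypothetical toy_region
struct with only the fields involved (this is not the fs_reg type):

```cpp
#include <algorithm>
#include <cassert>

// Toy register region: only the fields that matter for the computation.
struct toy_region {
   unsigned width;     // channels per component
   unsigned stride;    // element stride between channels, 0 = scalar
   unsigned type_size; // bytes per element
};

// Bytes to skip to reach the component `delta` components further along.
// With stride == 0 the product width * stride is zero, so the region is
// treated as holding one element per component.
inline unsigned
component_byte_offset(const toy_region &reg, unsigned delta)
{
   return delta * std::max(reg.width * reg.stride, 1u) * reg.type_size;
}
```

For a SIMD8 float region each component is 32 bytes apart, while for a
zero-stride region consecutive components are only 4 bytes apart.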
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 68a2818..e2d2617 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -139,14 +139,15 @@ horiz_offset(fs_reg reg, unsigned delta)
 static inline fs_reg
 offset(fs_reg reg, unsigned delta)
 {
-   assert(reg.stride > 0);
switch (reg.file) {
case BAD_FILE:
   break;
case GRF:
case MRF:
case ATTR:
-  return byte_offset(reg, delta * reg.width * reg.stride * type_sz(reg.type));
+  return byte_offset(reg,
+ delta * MAX2(reg.width * reg.stride, 1) *
+ type_sz(reg.type));
case UNIFORM:
   reg.reg_offset += delta;
   break;
-- 
2.3.5



[Mesa-dev] [PATCH 15/21] i965: Define consistent interface to predicate an instruction.

2015-04-28 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h| 22 ++
 src/mesa/drivers/dri/i965/brw_ir_svec4.h | 26 ++
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  | 22 ++
 3 files changed, 70 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 1bbe164..b2dfa00 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -329,4 +329,26 @@ exec_all(fs_inst *inst)
return inst;
 }
 
+/**
+ * Make the execution of \p inst dependent on the evaluation of a possibly
+ * inverted predicate.
+ */
+static inline fs_inst *
+exec_predicate_inv(enum brw_predicate pred, bool inverse,
+   fs_inst *inst)
+{
+   inst->predicate = pred;
+   inst->predicate_inverse = inverse;
+   return inst;
+}
+
+/**
+ * Make the execution of \p inst dependent on the evaluation of a predicate.
+ */
+static inline fs_inst *
+exec_predicate(enum brw_predicate pred, fs_inst *inst)
+{
+   return exec_predicate_inv(pred, false, inst);
+}
+
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_svec4.h b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
index f4585d7..58c04c1 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_svec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
@@ -381,6 +381,32 @@ namespace brw {
 
   return inst;
}
+
+   /**
+* Make the execution of \p inst dependent on the evaluation of a possibly
+* inverted predicate.
+*/
+   inline svec4_inst *
+   exec_predicate_inv(brw_predicate pred, bool inverse,
+  svec4_inst *inst)
+   {
+  for (unsigned i = 0; i < ARRAY_SIZE(inst->v); ++i) {
+ if (inst->v[i])
+exec_predicate_inv(pred, inverse, inst->v[i]);
+  }
+
+  return inst;
+   }
+
+   /**
+* Make the execution of \p inst dependent on the evaluation of a
+* predicate.
+*/
+   inline svec4_inst *
+   exec_predicate(enum brw_predicate pred, svec4_inst *inst)
+   {
+  return exec_predicate_inv(pred, false, inst);
+   }
 }
 
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index 1ad57d9..325e661 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -330,6 +330,28 @@ exec_all(vec4_instruction *inst)
inst->force_writemask_all = true;
return inst;
 }
+
+/**
+ * Make the execution of \p inst dependent on the evaluation of a possibly
+ * inverted predicate.
+ */
+inline vec4_instruction *
+exec_predicate_inv(enum brw_predicate pred, bool inverse,
+   vec4_instruction *inst)
+{
+   inst->predicate = pred;
+   inst->predicate_inverse = inverse;
+   return inst;
+}
+
+/**
+ * Make the execution of \p inst dependent on the evaluation of a predicate.
+ */
+inline vec4_instruction *
+exec_predicate(enum brw_predicate pred, vec4_instruction *inst)
+{
+   return exec_predicate_inv(pred, false, inst);
+}
 } /* namespace brw */
 
 #endif
-- 
2.3.5



[Mesa-dev] [PATCH 03/21] i965/fs: Rename component() to channel().

2015-04-28 Thread Francisco Jerez
Let's avoid confusion between vector components (i.e. semantically
different values, each in turn represented as a vector with a separate
value for each logical thread being executed in the same SIMD thread)
and channels (i.e. one of the N instances of some scalar value of the
program running in SIMD(Mx)N mode).  component() was giving you the
latter.  A future commit will introduce a proper component() helper
function with consistent behaviour across back-ends.
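
To make the distinction concrete, here is a toy model (hypothetical
types, not the i965 register classes) in which a vec4 is stored the way
the scalar back-end sees it: four components, each holding one value per
logical thread.

```cpp
#include <array>
#include <cassert>

const unsigned SIMD_WIDTH = 8;   // logical threads per SIMD thread (toy value)

// One scalar program value: a separate instance per logical thread.
typedef std::array<float, SIMD_WIDTH> toy_scalar;

// A GLSL-style vec4 as the scalar back-end sees it: four such values.
typedef std::array<toy_scalar, 4> toy_vec4;

// Select a vector component (x/y/z/w): still one value per logical thread.
inline const toy_scalar &
toy_component(const toy_vec4 &v, unsigned i)
{
   return v.at(i);
}

// Select a SIMD channel: the instance of a scalar value seen by one thread.
inline float
toy_channel(const toy_scalar &s, unsigned i)
{
   return s.at(i);
}
```

Selecting a component yields SIMD_WIDTH values, while selecting a channel
yields a single one; the renamed helper corresponds to the second case.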
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 17 -
 src/mesa/drivers/dri/i965/brw_ir_fs.h| 23 +--
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index bee8a54..b9eb561 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2544,8 +2544,8 @@ fs_visitor::opt_algebraic()
 
  } else if (inst->src[1].file == IMM) {
 inst->opcode = BRW_OPCODE_MOV;
-inst->src[0] = component(inst->src[0],
- inst->src[1].fixed_hw_reg.dw1.ud);
+inst->src[0] = channel(inst->src[0],
+   inst->src[1].fixed_hw_reg.dw1.ud);
 inst->sources = 1;
 inst->force_writemask_all = true;
 progress = true;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9aff84d..0d4dd5a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -339,10 +339,9 @@ fs_visitor::emit_uniformize(const fs_reg dst, const fs_reg src)
 {
const fs_reg chan_index = vgrf(glsl_type::uint_type);
 
-   emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, component(chan_index, 0))
+   emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, channel(chan_index, 0))
   ->force_writemask_all = true;
-   emit(SHADER_OPCODE_BROADCAST, component(dst, 0),
-src, component(chan_index, 0))
+   emit(SHADER_OPCODE_BROADCAST, channel(dst, 0), src, channel(chan_index, 0))
   ->force_writemask_all = true;
 }
 
@@ -3255,10 +3254,10 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, unsigned surf_index,
 
if (stage == MESA_SHADER_FRAGMENT) {
   if (((brw_wm_prog_data*)this->prog_data)->uses_kill) {
- emit(MOV(component(sources[0], 7), brw_flag_reg(0, 1)))
+ emit(MOV(channel(sources[0], 7), brw_flag_reg(0, 1)))
 ->force_writemask_all = true;
   } else {
- emit(MOV(component(sources[0], 7),
+ emit(MOV(channel(sources[0], 7),
   retype(brw_vec1_grf(1, 7), BRW_REGISTER_TYPE_UD)))
 ->force_writemask_all = true;
   }
@@ -3269,7 +3268,7 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, unsigned surf_index,
* the atomic operation.
*/
   assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
-  emit(MOV(component(sources[0], 7),
+  emit(MOV(channel(sources[0], 7),
fs_reg(0xu)))->force_writemask_all = true;
}
length++;
@@ -3318,10 +3317,10 @@ fs_visitor::emit_untyped_surface_read(unsigned surf_index, fs_reg dst,
 
if (stage == MESA_SHADER_FRAGMENT) {
   if (((brw_wm_prog_data*)this->prog_data)->uses_kill) {
- emit(MOV(component(sources[0], 7), brw_flag_reg(0, 1)))
+ emit(MOV(channel(sources[0], 7), brw_flag_reg(0, 1)))
 ->force_writemask_all = true;
   } else {
- emit(MOV(component(sources[0], 7),
+ emit(MOV(channel(sources[0], 7),
   retype(brw_vec1_grf(1, 7), BRW_REGISTER_TYPE_UD)))
 ->force_writemask_all = true;
   }
@@ -3332,7 +3331,7 @@ fs_visitor::emit_untyped_surface_read(unsigned surf_index, fs_reg dst,
* the atomic operation.
*/
   assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
-  emit(MOV(component(sources[0], 7),
+  emit(MOV(channel(sources[0], 7),
fs_reg(0xu)))->force_writemask_all = true;
}
 
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index e2d2617..e4ad657e 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -157,16 +157,6 @@ offset(fs_reg reg, unsigned delta)
return reg;
 }
 
-static inline fs_reg
-component(fs_reg reg, unsigned idx)
-{
-   assert(reg.subreg_offset == 0);
-   assert(idx < reg.width);
-   reg.subreg_offset = idx * type_sz(reg.type);
-   reg.width = 1;
-   return reg;
-}
-
 static inline bool
 is_uniform(const fs_reg reg)
 {
@@ -193,6 +183,19 @@ half(fs_reg reg, unsigned idx)
return horiz_offset(reg, 8 * idx);
 }
 
+/**
+ * Return the i-th SIMD channel of a register.
+ */
+static inline fs_reg
+channel(fs_reg reg, unsigned i)
+{
+   assert(reg.subreg_offset == 0);
+   assert(i < reg.width);
+   reg.subreg_offset = i * 

[Mesa-dev] [PATCH 14/21] i965: Define consistent interface to disable control flow execution masking.

2015-04-28 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h| 10 ++
 src/mesa/drivers/dri/i965/brw_ir_svec4.h | 14 ++
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  |  9 +
 3 files changed, 33 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index e8c9cbc..1bbe164 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -319,4 +319,14 @@ public:
bool pi_noperspective:1;   /**< Pixel interpolator noperspective flag */
 };
 
+/**
+ * Disable per-channel control flow execution masking on \p inst.
+ */
+static inline fs_inst *
+exec_all(fs_inst *inst)
+{
+   inst->force_writemask_all = true;
+   return inst;
+}
+
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_svec4.h b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
index e023b9e..f4585d7 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_svec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
@@ -367,6 +367,20 @@ namespace brw {
 
   fs_inst *v[4];
};
+
+   /**
+* Disable per-channel control flow execution masking on \p inst.
+*/
+   inline svec4_inst *
+   exec_all(svec4_inst *inst)
+   {
+  for (unsigned i = 0; i < ARRAY_SIZE(inst->v); ++i) {
+ if (inst->v[i])
+exec_all(inst->v[i]);
+  }
+
+  return inst;
+   }
 }
 
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index e79f70f..1ad57d9 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -321,6 +321,15 @@ public:
}
 };
 
+/**
+ * Disable per-channel control flow execution masking on \p inst.
+ */
+inline vec4_instruction *
+exec_all(vec4_instruction *inst)
+{
+   inst->force_writemask_all = true;
+   return inst;
+}
 } /* namespace brw */
 
 #endif
-- 
2.3.5



[Mesa-dev] [PATCH 01/21] i965/fs: Fix passing an immediate to half().

2015-04-28 Thread Francisco Jerez
Immediates are generally uniform: they yield the same value to both
halves of any instruction.
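
The intended behaviour can be sketched with a toy register type.  The
names toy_reg, toy_file and toy_half() are hypothetical, and the model
only tracks the register file, the width and the first channel.

```cpp
#include <cassert>

enum toy_file { TOY_GRF, TOY_UNIFORM, TOY_IMM };

struct toy_reg {
   toy_file file;
   unsigned width;    // channels covered by the region
   unsigned channel;  // first channel of the region
};

// Return the region read by the idx-th half of a SIMD16 instruction.
inline toy_reg
toy_half(toy_reg reg, unsigned idx)
{
   assert(idx < 2);

   // Uniforms and immediates hold one value shared by every channel, so
   // both halves read the register unchanged.
   if (reg.file == TOY_UNIFORM || reg.file == TOY_IMM)
      return reg;

   assert(reg.width == 16);
   reg.width = 8;
   reg.channel += 8 * idx;
   return reg;
}
```

Taking the second half of an immediate is therefore a no-op rather than
an assertion failure.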
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index b4ff3dc..68a2818 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -183,10 +183,10 @@ half(fs_reg reg, unsigned idx)
 {
assert(idx < 2);
 
-   if (reg.file == UNIFORM)
+   if (reg.file == UNIFORM || reg.file == IMM)
   return reg;
 
-   assert(idx == 0 || (reg.file != HW_REG && reg.file != IMM));
+   assert(idx == 0 || reg.file != HW_REG);
assert(reg.width == 16);
reg.width = 8;
return horiz_offset(reg, 8 * idx);
-- 
2.3.5



[Mesa-dev] [PATCH 06/21] i965: Add helper function to get a vector component of some register.

2015-04-28 Thread Francisco Jerez
Define a helper function that returns the vector component of its
argument given by an index (as in, the x/y/z/w components of an actual
GLSL vector).  The purpose is to have a mechanism to do it
consistently across back-ends -- until now the FS back-end had to use
offset() and the VEC4 back-end had to call swizzle() or writemask()
depending on the register type.
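
A hedged sketch of why a common name helps: with one component()
overload per register type, generic code can be written once.  The toy
register structs and select_y() below are hypothetical, and the toy
source overload overwrites the swizzle instead of composing it with one
already applied.

```cpp
#include <cassert>

// Scalar-style register: each component occupies its own register offset.
struct toy_fs_reg {
   unsigned reg_offset;
};

// Vector-style source: a swizzle of four 2-bit selectors picks components.
struct toy_vec4_src {
   unsigned swizzle;
};

// Vector-style destination: a 4-bit mask enables components.
struct toy_vec4_dst {
   unsigned writemask;
};

inline toy_fs_reg
component(toy_fs_reg reg, unsigned i)
{
   reg.reg_offset += i;
   return reg;
}

inline toy_vec4_src
component(toy_vec4_src reg, unsigned i)
{
   // Replicate selector i into all four swizzle slots.
   reg.swizzle = i | (i << 2) | (i << 4) | (i << 6);
   return reg;
}

inline toy_vec4_dst
component(toy_vec4_dst reg, unsigned i)
{
   reg.writemask = 1u << i;
   return reg;
}

// Back-end agnostic code: the same call works for any register type.
template<typename R>
R
select_y(const R &reg)
{
   return component(reg, 1);
}
```

select_y() compiles for all three toy register types without knowing
which mechanism each one uses to address a component.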
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h   | 13 +
 src/mesa/drivers/dri/i965/brw_ir_vec4.h | 20 
 2 files changed, 33 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 7139934..ee5606f 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -157,6 +157,19 @@ offset(fs_reg reg, unsigned delta)
return reg;
 }
 
+/**
+ * Return the i-th logical component of register \p reg.  A logical component
+ * is itself a vector with as many channels as the SIMD width of \p reg.  Note
+ * that this happens to be equivalent to offset() in the FS IR because the
+ * number of vector components per addressing unit is one
+ * (cf. fs_reg::traits::chan_size), but that's not in general the case.
+ */
+static inline fs_reg
+component(const fs_reg reg, unsigned i)
+{
+   return offset(reg, i);
+}
+
 static inline bool
 is_uniform(const fs_reg reg)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index b52ebb3..f0bdd29 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -95,6 +95,16 @@ negate(src_reg reg)
return reg;
 }
 
+/**
+ * Return the i-th logical component of register \p reg.  A logical component
+ * is itself a vector with as many channels as the SIMD width of \p reg.
+ */
+static inline src_reg
+component(const src_reg reg, unsigned i)
+{
+   return swizzle(reg, BRW_SWIZZLE4(i, i, i, i));
+}
+
 static inline bool
 is_uniform(const src_reg reg)
 {
@@ -149,6 +159,16 @@ writemask(dst_reg reg, unsigned mask)
return reg;
 }
 
+/**
+ * Return the i-th logical component of register \p reg.  A logical component
+ * is itself a vector with as many channels as the SIMD width of \p reg.
+ */
+static inline dst_reg
+component(const dst_reg reg, unsigned i)
+{
+   return writemask(reg, 1 << i);
+}
+
 class vec4_instruction : public backend_instruction {
 public:
DECLARE_RALLOC_CXX_OPERATORS(vec4_instruction)
-- 
2.3.5



[Mesa-dev] [PATCH 12/21] i965: Define register trait structures.

2015-04-28 Thread Francisco Jerez
Register traits are a mechanism for back-end agnostic code to query
static properties of the IR the program is being compiled into.  These
properties are fully determined by the flavour of back-end in use,
like the source and destination register types of an instruction, the
number of SIMD slots available per logical thread dispatched to a
single SIMD thread (AKA vector size of the IR), and whether some
specific register type is able to represent vector swizzling and
writemasking.

The trait structures are somewhat unusual in that they are defined
directly inside the register object they describe rather than being a
template parameterized on the register type.  E.g. 't::traits::dst_reg'
queries the matching destination register type for register type 't'.
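
As a minimal sketch of the nested-traits idea (the toy_* register types,
regs_for_components() and make_dst() are hypothetical, and the chan_size
values of 1 and 4 are taken from the scalar and vector cases described in
this series):

```cpp
#include <cassert>

struct toy_scalar_reg {
   struct traits {
      typedef toy_scalar_reg src_reg;
      typedef toy_scalar_reg dst_reg;
      static const unsigned chan_size = 1;
      static const bool allows_swizzle = false;
   };
};

struct toy_vector_dst;

struct toy_vector_src {
   struct traits {
      typedef toy_vector_src src_reg;
      typedef toy_vector_dst dst_reg;
      static const unsigned chan_size = 4;
      static const bool allows_swizzle = true;
   };
};

struct toy_vector_dst {
   struct traits {
      typedef toy_vector_src src_reg;
      typedef toy_vector_dst dst_reg;
      static const unsigned chan_size = 4;
      static const bool allows_swizzle = false;
   };
};

// Registers needed to hold n logical components in the IR of type T.
template<typename T>
unsigned
regs_for_components(unsigned n)
{
   return (n + T::traits::chan_size - 1) / T::traits::chan_size;
}

// Map any register type to the destination type of the same IR.
template<typename T>
typename T::traits::dst_reg
make_dst()
{
   return typename T::traits::dst_reg();
}
```

Because the traits live inside the register type, generic code only
needs the register type itself as a template argument.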
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h| 32 
 src/mesa/drivers/dri/i965/brw_ir_svec4.h | 66 
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  | 64 +++
 3 files changed, 162 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index b0e07ad..676ed0d 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -79,6 +79,38 @@ public:
 
/** Register region horizontal stride */
uint8_t stride;
+
+   struct traits {
+  /**
+   * Type used in this IR to represent a source of an instruction.
+   */
+  typedef fs_reg src_reg;
+
+  /**
+   * Type used in this IR to represent the destination of an instruction.
+   */
+  typedef fs_reg dst_reg;
+
+  /**
+   * Base vector size of the IR.  Number of logically independent vector
+   * components available for each channel in a hardware SIMD instruction
+   * or in a dispatch_width-wide register.  This is the number of logical
+   * vector components you get when you allocate a dispatch_width-wide
+   * register, and the number of logical components that offset(reg, 1)
+   * skips over.
+   */
+  static const unsigned chan_size = 1;
+
+  /**
+   * Whether this register type is able to represent vector swizzles.
+   */
+  static const bool allows_swizzle = false;
+
+  /**
+   * Whether this register type is able to represent vector writemasking.
+   */
+  static const bool allows_writemask = false;
+   };
 };
 
 static inline fs_reg
diff --git a/src/mesa/drivers/dri/i965/brw_ir_svec4.h b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
index d1eafdd..90e0305 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_svec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
@@ -91,6 +91,39 @@ namespace brw {
   using fs_reg::reladdr;
 
   unsigned swizzle;
+
+  struct traits {
+ /**
+  * Type used in this IR to represent a source of an instruction.
+  */
+ typedef src_svec4 src_reg;
+
+ /**
+  * Type used in this IR to represent the destination of an
+  * instruction.
+  */
+ typedef dst_svec4 dst_reg;
+
+ /**
+  * Base vector size of the IR.  Number of logically independent
+  * vector components available for each channel in a hardware SIMD
+  * instruction or in a dispatch_width-wide register.  This is the
+  * number of logical vector components you get when you allocate a
+  * dispatch_width-wide register, and the number of logical components
+  * that offset(reg, 1) skips over.
+  */
+ static const unsigned chan_size = fs_reg::traits::chan_size;
+
+ /**
+  * Whether this register type is able to represent vector swizzles.
+  */
+ static const bool allows_swizzle = true;
+
+ /**
+  * Whether this register type is able to represent vector writemasking.
+  */
+ static const bool allows_writemask = false;
+  };
};
 
/**
@@ -208,6 +241,39 @@ namespace brw {
   using fs_reg::reladdr;
 
   unsigned writemask;
+
+  struct traits {
+ /**
+  * Type used in this IR to represent a source of an instruction.
+  */
+ typedef src_svec4 src_reg;
+
+ /**
+  * Type used in this IR to represent the destination of an
+  * instruction.
+  */
+ typedef dst_svec4 dst_reg;
+
+ /**
+  * Base vector size of the IR.  Number of logically independent
+  * vector components available for each channel in a hardware SIMD
+  * instruction or in a dispatch_width-wide register.  This is the
+  * number of logical vector components you get when you allocate a
+  * dispatch_width-wide register, and the number of logical components
+  * that offset(reg, 1) skips over.
+  */
+ static const unsigned chan_size = fs_reg::traits::chan_size;
+
+ /**
+  * Whether this register type is able to represent vector 

[Mesa-dev] [PATCH 09/21] i965: Define an array register object.

2015-04-28 Thread Francisco Jerez
An array_reg is just a location in the register file with a size.  It
will be used to pass around chunks of message payloads to transform
them in several ways and assemble them.  The usual register types
aren't suitable for this because they don't carry size information,
and support complex addressing modes which aren't needed for this
purpose and unnecessarily increase the number of combinations we have
to handle.

As it doesn't know about align1 or align16 addressing modes, the same
type can be used in the VEC4 or FS backend, but an array_reg has to be
converted to a native register type, filling in the missing
regioning information (width, swizzle, writemask, etc.), before it
can be used with normal instructions.
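
A toy sketch of that split, with hypothetical types: toy_array_reg
carries only a location and a size, and to_native() stands in for
whatever conversion a back-end provides (here the caller supplies just a
width).

```cpp
#include <cassert>

// Toy counterpart of a sized register region (sizes in 32B registers).
struct toy_array_reg {
   unsigned reg;          // first register of the region
   unsigned reg_offset;   // offset from `reg`, in registers
   unsigned size;         // length of the region, in registers
};

// Toy native register: adds the regioning information the array lacks.
struct toy_native_reg {
   unsigned reg;
   unsigned reg_offset;
   unsigned width;        // SIMD width chosen by the caller
};

// Advance the base of the region by delta registers; size is left as is.
inline toy_array_reg
offset(toy_array_reg r, unsigned delta)
{
   r.reg_offset += delta;
   return r;
}

// Complete the missing regioning information to obtain a usable register.
inline toy_native_reg
to_native(const toy_array_reg &r, unsigned width)
{
   toy_native_reg n = { r.reg, r.reg_offset, width };
   return n;
}
```

Payload-handling code can pass toy_array_reg values around freely and
only pick a width at the point where an instruction is emitted.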
---
 src/mesa/drivers/dri/i965/brw_shader.h | 35 ++
 1 file changed, 35 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h
index ac4e62a..df84cbd 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -138,6 +138,41 @@ struct backend_reg
bool abs;
 };
 
+#ifdef __cplusplus
+
+/**
+ * A plain contiguous region of memory in your register file, with
+ * well-defined size and no fancy addressing modes, swizzling or striding.
+ */
+struct array_reg : public backend_reg {
+   array_reg() : backend_reg(), size(0)
+   {
+   }
+
+   explicit
+   array_reg(const backend_reg reg, unsigned size = 1) :
+  backend_reg(reg), size(size)
+   {
+   }
+
+   /** Size of the region in 32B registers. */
+   unsigned size;
+};
+
+/**
+ * Increase the register base offset by the specified amount given in
+ * 32B registers.
+ */
+inline array_reg
+offset(array_reg reg, unsigned delta)
+{
+   assert(delta == 0 || (reg.file != HW_REG && reg.file != IMM));
+   reg.reg_offset += delta;
+   return reg;
+}
+
+#endif
+
 struct cfg_t;
 struct bblock_t;
 
-- 
2.3.5



[Mesa-dev] [PATCH 16/21] i965: Define consistent interface to enable instruction conditional modifiers.

2015-04-28 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h| 11 +++
 src/mesa/drivers/dri/i965/brw_ir_svec4.h | 17 +
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  | 11 +++
 3 files changed, 39 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index b2dfa00..d6f40ee 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -351,4 +351,15 @@ exec_predicate(enum brw_predicate pred, fs_inst *inst)
return exec_predicate_inv(pred, false, inst);
 }
 
+/**
+ * Write the result of evaluating the condition given by \p mod to a flag
+ * register.
+ */
+static inline fs_inst *
+exec_condmod(enum brw_conditional_mod mod, fs_inst *inst)
+{
+   inst->conditional_mod = mod;
+   return inst;
+}
+
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_svec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
index 58c04c1..4be3554 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_svec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
@@ -407,6 +407,23 @@ namespace brw {
{
   return exec_predicate_inv(pred, false, inst);
}
+
+   /**
+* Write the result of evaluating the condition given by \p mod to a flag
+* register.  This will typically be accompanied by exec_reduce() to select
+* how the per-component flag results are to be combined to give the final
+* flag result.
+*/
+   inline svec4_inst *
+   exec_condmod(brw_conditional_mod mod, svec4_inst *inst)
+   {
+  for (unsigned i = 0; i < ARRAY_SIZE(inst->v); ++i) {
+ if (inst->v[i])
+exec_condmod(mod, inst->v[i]);
+  }
+
+  return inst;
+   }
 }
 
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index 325e661..c4021d8 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -352,6 +352,17 @@ exec_predicate(enum brw_predicate pred, vec4_instruction *inst)
 {
return exec_predicate_inv(pred, false, inst);
 }
+
+/**
+ * Write the result of evaluating the condition given by \p mod to a flag
+ * register.
+ */
+inline vec4_instruction *
+exec_condmod(enum brw_conditional_mod mod, vec4_instruction *inst)
+{
+   inst->conditional_mod = mod;
+   return inst;
+}
 } /* namespace brw */
 
 #endif
-- 
2.3.5



[Mesa-dev] [PATCH 04/21] i965/fs: Fix channel vs. component usage in a comment.

2015-04-28 Thread Francisco Jerez
And remove redundant note.
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index e4ad657e..16b113e 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -165,9 +165,7 @@ is_uniform(const fs_reg &reg)
 }
 
 /**
- * Get either of the 8-component halves of a 16-component register.
- *
- * Note: this also works if \c reg represents a SIMD16 pair of registers.
+ * Get either of the 8-channel halves of a 16-channel register.
  */
 static inline fs_reg
 half(fs_reg reg, unsigned idx)
-- 
2.3.5



[Mesa-dev] [PATCH 05/21] i965/fs: Have channel() set the register stride to zero.

2015-04-28 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 16b113e..7139934 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -191,6 +191,7 @@ channel(fs_reg reg, unsigned i)
assert(i < reg.width);
reg.subreg_offset = i * type_sz(reg.type);
reg.width = 1;
+   reg.stride = 0;
return reg;
 }
 
-- 
2.3.5
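
To illustrate why a zero stride matters for a single-channel region, here is
a small sketch under simplifying assumptions: fs_reg_sketch is a reduced
stand-in for fs_reg with a fixed type size instead of type_sz(reg.type), and
byte_read_by_lane() is a hypothetical helper that models which byte each
execution lane would read.

```cpp
#include <cassert>

// Reduced stand-in for fs_reg; type_size replaces type_sz(reg.type).
struct fs_reg_sketch {
   unsigned subreg_offset = 0;
   unsigned width = 8;
   unsigned stride = 1;
   unsigned type_size = 4;
};

// Same steps as channel() after this patch.
inline fs_reg_sketch
channel(fs_reg_sketch reg, unsigned i)
{
   assert(i < reg.width);
   reg.subreg_offset = i * reg.type_size;
   reg.width = 1;
   reg.stride = 0;
   return reg;
}

// Hypothetical model of the byte each lane reads from the region.
inline unsigned
byte_read_by_lane(const fs_reg_sketch &reg, unsigned lane)
{
   return reg.subreg_offset + lane * reg.stride * reg.type_size;
}
```

With the stride left at one, lane 5 of channel(reg, 3) would read byte 32 in
this model; with the stride at zero every lane reads byte 12, which is the
scalar broadcast one would expect from a single channel.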



[Mesa-dev] [PATCH 17/21] i965: Define consistent interface to enable instruction result saturation.

2015-04-28 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h| 11 +++
 src/mesa/drivers/dri/i965/brw_ir_svec4.h | 15 +++
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  | 11 +++
 3 files changed, 37 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index d6f40ee..7e5083c 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -362,4 +362,15 @@ exec_condmod(enum brw_conditional_mod mod, fs_inst *inst)
return inst;
 }
 
+/**
+ * Clamp the result of \p inst to the saturation range of its destination
+ * datatype.
+ */
+static inline fs_inst *
+exec_saturate(bool saturate, fs_inst *inst)
+{
+   inst->saturate = saturate;
+   return inst;
+}
+
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_svec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
index 4be3554..508ed5e 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_svec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
@@ -424,6 +424,21 @@ namespace brw {
 
   return inst;
}
+
+   /**
+* Clamp the result of \p inst to the saturation range of its destination
+* datatype.
+*/
+   inline svec4_inst *
+   exec_saturate(bool saturate, svec4_inst *inst)
+   {
+  for (unsigned i = 0; i < ARRAY_SIZE(inst->v); ++i) {
+ if (inst->v[i])
+exec_saturate(saturate, inst->v[i]);
+  }
+
+  return inst;
+   }
 }
 
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index c4021d8..a407ec4 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -363,6 +363,17 @@ exec_condmod(enum brw_conditional_mod mod, vec4_instruction *inst)
inst->conditional_mod = mod;
return inst;
 }
+
+/**
+ * Clamp the result of \p inst to the saturation range of its destination
+ * datatype.
+ */
+inline vec4_instruction *
+exec_saturate(bool saturate, vec4_instruction *inst)
+{
+   inst->saturate = saturate;
+   return inst;
+}
 } /* namespace brw */
 
 #endif
-- 
2.3.5
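
Because each exec_*() helper returns the instruction it was given, the calls
can be chained, and a template can be written once for either back-end.  The
sketch below assumes reduced fs_inst and vec4_instruction stand-ins and a
two-value stand-in enum; saturate_and_test_nonzero() is a hypothetical name
used only for illustration.

```cpp
#include <cassert>

// Stand-in enum with just the values used here.
enum brw_conditional_mod { BRW_CONDITIONAL_NONE, BRW_CONDITIONAL_NZ };

// Reduced instruction types for the two back-ends.
struct fs_inst {
   brw_conditional_mod conditional_mod = BRW_CONDITIONAL_NONE;
   bool saturate = false;
};

struct vec4_instruction {
   brw_conditional_mod conditional_mod = BRW_CONDITIONAL_NONE;
   bool saturate = false;
};

inline fs_inst *
exec_condmod(enum brw_conditional_mod mod, fs_inst *inst)
{
   inst->conditional_mod = mod;
   return inst;
}

inline fs_inst *
exec_saturate(bool saturate, fs_inst *inst)
{
   inst->saturate = saturate;
   return inst;
}

inline vec4_instruction *
exec_condmod(enum brw_conditional_mod mod, vec4_instruction *inst)
{
   inst->conditional_mod = mod;
   return inst;
}

inline vec4_instruction *
exec_saturate(bool saturate, vec4_instruction *inst)
{
   inst->saturate = saturate;
   return inst;
}

// Back-end-agnostic transformation: overload resolution picks the
// matching exec_*() helpers for whichever instruction type is passed.
template<typename Inst>
Inst *
saturate_and_test_nonzero(Inst *inst)
{
   return exec_condmod(BRW_CONDITIONAL_NZ, exec_saturate(true, inst));
}
```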



[Mesa-dev] [PATCH 13/21] i965: Document the offset() function.

2015-04-28 Thread Francisco Jerez
It was far from obvious what unit the 'delta' argument is expressed in.
---
 src/mesa/drivers/dri/i965/brw_ir_fs.h|  5 +
 src/mesa/drivers/dri/i965/brw_ir_svec4.h | 10 ++
 src/mesa/drivers/dri/i965/brw_ir_vec4.h  | 10 ++
 3 files changed, 25 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 676ed0d..e8c9cbc 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -169,6 +169,11 @@ horiz_offset(fs_reg reg, unsigned delta)
return reg;
 }
 
+/**
+ * Increase the register base offset by the specified amount given in units of
+ * the register width, which is one logical component for this IR (cf.
+ * fs_reg::traits::chan_size).
+ */
 static inline fs_reg
 offset(fs_reg reg, unsigned delta)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_ir_svec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
index 90e0305..e023b9e 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_svec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
@@ -142,6 +142,11 @@ namespace brw {
   return reg;
}
 
+   /**
+* Increase the register base offset by the specified amount given in units
+* of the register width, which is one logical component for this IR (cf.
+* src_svec4::traits::chan_size).
+*/
inline src_svec4
offset(const src_svec4 reg, unsigned delta)
{
@@ -292,6 +297,11 @@ namespace brw {
   return reg;
}
 
+   /**
+* Increase the register base offset by the specified amount given in units
+* of the register width, which is one logical component for this IR (cf.
+* dst_svec4::traits::chan_size).
+*/
inline dst_svec4
offset(const dst_svec4 &reg, unsigned delta)
{
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index 4a79c57..e79f70f 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -99,6 +99,11 @@ retype(src_reg reg, enum brw_reg_type type)
return reg;
 }
 
+/**
+ * Increase the register base offset by the specified amount given in units of
+ * the register width, which is four logical components for this IR (cf.
+ * src_reg::traits::chan_size).
+ */
 static inline src_reg
 offset(src_reg reg, unsigned delta)
 {
@@ -225,6 +230,11 @@ retype(dst_reg reg, enum brw_reg_type type)
return reg;
 }
 
+/**
+ * Increase the register base offset by the specified amount given in units of
+ * the register width, which is four logical components for this IR (cf.
+ * dst_reg::traits::chan_size).
+ */
 static inline dst_reg
 offset(dst_reg reg, unsigned delta)
 {
-- 
2.3.5
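
As a worked example of the unit difference documented above, the sketch
below counts how many logical components an offset() of a given delta skips
in each IR.  The trait structs are stand-ins, and the chan_size values (1
for the FS and SVEC4 IRs, 4 for the VEC4 IR) are inferred from the comments
in this patch rather than taken from the traits patch itself.

```cpp
#include <cassert>

// Stand-in traits; chan_size is the number of logical components per
// register-width unit, as described by the comments in this patch.
struct fs_traits_sketch {
   static constexpr unsigned chan_size = 1;
};

struct vec4_traits_sketch {
   static constexpr unsigned chan_size = 4;
};

// Hypothetical helper: logical components skipped by offset(reg, delta).
template<typename Traits>
constexpr unsigned
components_skipped(unsigned delta)
{
   return delta * Traits::chan_size;
}
```

So offset(reg, 3) would step over three logical components in the scalar IRs
and twelve in the VEC4 IR under these assumptions.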



[Mesa-dev] [PATCH 11/21] i965/fs: Define scalarizing VEC4 pseudo-IR.

2015-04-28 Thread Francisco Jerez
This is not a real IR in the sense of a long-lived representation of
the program.  An SVEC4 instruction, defined as an opcode operating on
4-vectors of FS registers, is broken up into its scalar components
(each an fs_inst) as soon as it's emitted.  The svec4_inst object is a
convenient way to carry around the expanded FS instructions and apply
some transformations on them (using the back-end-independent exec_*()
API introduced by a future commit).

A src_svec4 register is a vector of FS registers with its components
ordered according to a swizzle, and a dst_svec4 register is a subset
of vector components used as destination of some vector operation,
pretty much like the source and destination registers of the VEC4
back-end.

On the one hand this simplifies the translation of VEC4 higher level
languages (e.g. GLSL IR) and VEC4-centric APIs
(e.g. ARB_shader_image_load_store) into the scalar i965 back-end IR,
and on the other hand it can greatly reduce the amount of duplication
between back-ends, as it provides an interface to generate scalar IR
with semantics consistent with the VEC4 IR interface.

This patch only defines the essential data structures of the SVEC4
pseudo-IR.  The interface to construct, scalarize and emit SVEC4
instructions will be introduced in a future commit.
---
 src/mesa/drivers/dri/i965/Makefile.sources |   1 +
 src/mesa/drivers/dri/i965/brw_ir_svec4.h   | 296 +
 src/mesa/drivers/dri/i965/brw_ir_vec4.h|  18 ++
 3 files changed, 315 insertions(+)
 create mode 100644 src/mesa/drivers/dri/i965/brw_ir_svec4.h

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 6d4659f..83acbd0 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -66,6 +66,7 @@ i965_FILES = \
brw_interpolation_map.c \
brw_ir_allocator.h \
brw_ir_fs.h \
+   brw_ir_svec4.h \
brw_ir_vec4.h \
brw_lower_texture_gradients.cpp \
brw_lower_unnormalized_offset.cpp \
diff --git a/src/mesa/drivers/dri/i965/brw_ir_svec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
new file mode 100644
index 000..d1eafdd
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_ir_svec4.h
@@ -0,0 +1,296 @@
+/* -*- c++ -*- */
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef BRW_IR_SVEC4_H
+#define BRW_IR_SVEC4_H
+
+#include "brw_ir_fs.h"
+
+namespace brw {
+   class dst_svec4;
+
+   /**
+* Source vector of one to four scalar FS registers.  These are the sources
+* of SVEC4 pseudo-instructions and provide VEC4-like semantics in the
+* scalar back-end with implicit scalarization.
+*
+* It inherits from fs_reg privately because there is an
+* implemented-in-terms-of relationship rather than an is-a relationship
+* (the Liskov substitution principle doesn't hold).
+*/
+   class src_svec4 : private fs_reg {
+   public:
+  src_svec4() : fs_reg(), swizzle(0)
+  {
+  }
+
+  src_svec4(const fs_reg &reg, unsigned swizzle = BRW_SWIZZLE_NOOP) :
+ fs_reg(reg), swizzle(swizzle)
+  {
+  }
+
+  src_svec4(float f) : fs_reg(f), swizzle(BRW_SWIZZLE_)
+  {
+  }
+
+  src_svec4(int32_t i) : fs_reg(i), swizzle(BRW_SWIZZLE_)
+  {
+  }
+
+  src_svec4(uint32_t u) : fs_reg(u), swizzle(BRW_SWIZZLE_)
+  {
+  }
+
+  /**
+   * Construct a source vector from a destination vector.
+   */
+  inline
+  src_svec4(const dst_svec4 &reg);
+
+  /**
+   * Return the standard representation of this register in the IR.  This
+   * is basically an up-cast but it's exposed as a function to prevent
+   * accidental casts which are unsafe in general.
+   */
+  friend const fs_reg &
+  repr(const src_svec4 &reg)
+  {
+ return reg;
+   

[Mesa-dev] [PATCH 21/21] i965/vec4: Introduce VEC4 IR builder.

2015-04-28 Thread Francisco Jerez
See "i965/fs: Introduce FS IR builder." for the rationale.
---
 src/mesa/drivers/dri/i965/Makefile.sources   |   1 +
 src/mesa/drivers/dri/i965/brw_vec4_builder.h | 664 +++
 2 files changed, 665 insertions(+)
 create mode 100644 src/mesa/drivers/dri/i965/brw_vec4_builder.h

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 20cbdb2..5bb6f06 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -110,6 +110,7 @@ i965_FILES = \
brw_urb.c \
brw_util.c \
brw_util.h \
+   brw_vec4_builder.h \
brw_vec4_copy_propagation.cpp \
brw_vec4.cpp \
brw_vec4_cse.cpp \
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_builder.h 
b/src/mesa/drivers/dri/i965/brw_vec4_builder.h
new file mode 100644
index 000..8c4f222
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_vec4_builder.h
@@ -0,0 +1,664 @@
+/* -*- c++ -*- */
+/*
+ * Copyright © 2010-2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef BRW_VEC4_BUILDER_H
+#define BRW_VEC4_BUILDER_H
+
+#include "brw_ir_vec4.h"
+#include "brw_ir_allocator.h"
+#include "brw_context.h"
+
+namespace brw {
+   /**
+* Toolbox to assemble a VEC4 IR program out of individual instructions.
+*
+* This object is meant to have an interface consistent with
+* brw::fs_builder in order to enable generic FS/VEC4 programming.  They
+* cannot be fully interchangeable because brw::fs_builder generates scalar
+* code while brw::vec4_builder generates vector code.  For a drop-in
+* replacement of brw::vec4_builder see brw::svec4_builder.
+*/
+   class vec4_builder {
+   public:
+  /** Type used in this IR to represent a source of an instruction. */
+  typedef brw::src_reg src_reg;
+
+  /** Type used in this IR to represent the destination of an instruction. */
+  typedef brw::dst_reg dst_reg;
+
+  /** Type used in this IR to represent an instruction. */
+  typedef vec4_instruction instruction;
+
+  /** We can build scalar instructions. */
+  typedef vec4_builder scalar_builder;
+
+  /** We can build vector instructions too. */
+  typedef vec4_builder vector_builder;
+
+  /**
+   * Construct a vec4_builder appending instructions at the end of the
+   * list \p instructions.  \p alloc provides book-keeping of virtual
+   * registers allocated through the builder.
+   */
+  vec4_builder(const brw_device_info *devinfo,
+   void *mem_ctx,
+   simple_allocator &alloc,
+   exec_list &instructions) :
+ devinfo(devinfo), mem_ctx(mem_ctx),
+ alloc(alloc), block(NULL),
+ cursor((exec_node *)instructions.tail)
+  {
+  }
+
+  /**
+   * Construct a vec4_builder that inserts instructions before \p cursor
+   * in basic block \p block, inheriting other code generation parameters
+   * from this.
+   */
+  vec4_builder
+  at(bblock_t *block, instruction *cursor) const
+  {
+ vec4_builder bld = *this;
+ bld.block = block;
+ bld.cursor = cursor;
+ return bld;
+  }
+
+  /**
+   * Construct a scalar builder inheriting other code generation
+   * parameters from this.
+   */
+  vec4_builder
+  scalar() const
+  {
+ return *this;
+  }
+
+  /**
+   * Construct a vector builder inheriting other code generation
+   * parameters from this.
+   */
+  vec4_builder
+  vector() const
+  {
+ return *this;
+  }
+
+  /**
+   * Construct a builder of half-SIMD-width instructions inheriting other
+   * code generation parameters from this.  No-op.
+   */
+  const vec4_builder &
+  half(unsigned i) const
+  {
+ return *this;

[Mesa-dev] i965 FS/VEC4 generic programming.

2015-04-28 Thread Francisco Jerez
This series is motivated by the ridiculous amount of duplicated code
between the i965 FS and VEC4 compiler back-ends.  My next 2k LoC
patch series implementing the built-ins defined by
ARB_shader_image_load_store on the i965 back-end would have been plain
insulting without some mechanism to generate IR in a back-end-agnostic
manner.

The framework introduced in this series is expressive enough to
implement most of the translation from GLSL IR or NIR into i965 IR
independent of the back-end, but at this point only my image
load/store and atomic counters implementations make full use of it.
Both can be found in the image-load-store branch of my tree [1]
together with their dependencies, this series included.

Patches 1 to 8 simply fix some bugs and improve existing IR
manipulation interfaces to make them more consistent across back-ends.
Patches 19 to 21 introduce the builder interface that can be used to
construct i965 IR regardless of the back-end in use.  In combination
with the representation defined in patch 11 for vectors of FS scalar
registers it allows consistent generation of vector code on either
back-end with implicit scalarization in the FS case, or it can be used
to generate scalar or natural vector width code on either back-end
when that's sufficient.

Patches 14 to 18 define some helper functions that perform simple
transformations on instructions with a compatible interface across
back-ends.  Patch 12 provides a mechanism complementary to the builder
interface to query static properties of the IR -- This is especially
useful while performing back-end-independent transformations on the
program, but is also sometimes required to generate IR.

[1] http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store
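
The back-end-agnostic style described above can be sketched roughly as
follows.  Everything here is hypothetical: the two builder structs, their
reg typedef, emit_add() and emit_sum3() are illustrative stand-ins and not
the interface introduced by patches 19 to 21.  The point is only that one
template can drive either a scalarizing or a vector builder.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical scalarizing builder: one vector ADD becomes four scalar
// instructions, recorded as text.
struct scalar_builder_sketch {
   typedef std::string reg;
   std::vector<std::string> *log;

   void
   emit_add(const reg &dst, const reg &a, const reg &b) const
   {
      static const char comps[] = "xyzw";
      for (unsigned i = 0; i < 4; ++i) {
         const std::string c(1, comps[i]);
         log->push_back("add " + dst + "." + c + ", " +
                        a + "." + c + ", " + b + "." + c);
      }
   }
};

// Hypothetical vector builder: one vector ADD stays one instruction.
struct vector_builder_sketch {
   typedef std::string reg;
   std::vector<std::string> *log;

   void
   emit_add(const reg &dst, const reg &a, const reg &b) const
   {
      log->push_back("add " + dst + ".xyzw, " + a + ".xyzw, " + b + ".xyzw");
   }
};

// Written once, usable with either builder: dst = a + b + c.
template<typename Builder>
void
emit_sum3(const Builder &bld,
          const typename Builder::reg &dst,
          const typename Builder::reg &a,
          const typename Builder::reg &b,
          const typename Builder::reg &c)
{
   bld.emit_add(dst, a, b);
   bld.emit_add(dst, dst, c);
}
```

The same emit_sum3() call yields eight scalar instructions through the
scalarizing sketch and two vector instructions through the vector sketch.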

src/mesa/drivers/dri/i965/Makefile.sources   |    3 +
src/mesa/drivers/dri/i965/brw_fs.cpp         |   10 +-
src/mesa/drivers/dri/i965/brw_fs_builder.h   | 1108 ++
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   17 ++-
src/mesa/drivers/dri/i965/brw_ir_fs.h        |  151 --
src/mesa/drivers/dri/i965/brw_ir_svec4.h     |  489 ++
src/mesa/drivers/dri/i965/brw_ir_vec4.h      |  198 -
src/mesa/drivers/dri/i965/brw_shader.h       |   35 +
src/mesa/drivers/dri/i965/brw_vec4.cpp       |   10 ++
src/mesa/drivers/dri/i965/brw_vec4_builder.h |  664 +++
10 files changed, 2660 insertions(+), 25 deletions(-)

[PATCH 01/21] i965/fs: Fix passing an immediate to half().
[PATCH 02/21] i965/fs: Fix offset() for registers with zero stride.
[PATCH 03/21] i965/fs: Rename component() to channel().
[PATCH 04/21] i965/fs: Fix channel vs. component usage in a comment.
[PATCH 05/21] i965/fs: Have channel() set the register stride to zero.
[PATCH 06/21] i965: Add helper function to get a vector component of some register.
[PATCH 07/21] i965: Add resize() register helper function.
[PATCH 08/21] i965/vec4: Make src_reg conversion constructor from dst_reg implicit.
[PATCH 09/21] i965: Define an array register object.
[PATCH 10/21] i965: Add register constructors taking an array_reg as argument.
[PATCH 11/21] i965/fs: Define scalarizing VEC4 pseudo-IR.
[PATCH 12/21] i965: Define register trait structures.
[PATCH 13/21] i965: Document the offset() function.
[PATCH 14/21] i965: Define consistent interface to disable control flow execution masking.
[PATCH 15/21] i965: Define consistent interface to predicate an instruction.
[PATCH 16/21] i965: Define consistent interface to enable instruction conditional modifiers.
[PATCH 17/21] i965: Define consistent interface to enable instruction result saturation.
[PATCH 18/21] i965: Define consistent interface to perform cross-component flag result reduction.
[PATCH 19/21] i965/fs: Introduce FS IR builder.
[PATCH 20/21] i965/fs: Introduce scalarizing SVEC4 IR builder.
[PATCH 21/21] i965/vec4: Introduce VEC4 IR builder.


[Mesa-dev] [PATCH 10/21] i965: Add register constructors taking an array_reg as argument.

2015-04-28 Thread Francisco Jerez
These are going to be used to convert an array_reg (chunk of the
register space without fancy regioning parameters) back to a normal
FS/VEC4 register.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp|  6 ++
 src/mesa/drivers/dri/i965/brw_ir_fs.h   |  1 +
 src/mesa/drivers/dri/i965/brw_ir_vec4.h |  2 ++
 src/mesa/drivers/dri/i965/brw_vec4.cpp  | 10 ++
 4 files changed, 19 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b9eb561..42b2c9d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -606,6 +606,12 @@ fs_reg::fs_reg(uint8_t vf0, uint8_t vf1, uint8_t vf2, uint8_t vf3)
(vf3 << 24);
 }
 
+fs_reg::fs_reg(const array_reg &reg, unsigned width) :
+   backend_reg(reg), subreg_offset(0), reladdr(NULL),
+   width(width), effective_width(0), stride(1)
+{
+}
+
 /** Fixed brw_reg. */
 fs_reg::fs_reg(struct brw_reg fixed_hw_reg)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 89c8e15..b0e07ad 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -41,6 +41,7 @@ public:
explicit fs_reg(uint32_t u);
explicit fs_reg(uint8_t vf[4]);
explicit fs_reg(uint8_t vf0, uint8_t vf1, uint8_t vf2, uint8_t vf3);
+   fs_reg(const array_reg &reg, unsigned width);
fs_reg(struct brw_reg fixed_hw_reg);
fs_reg(enum register_file file, int reg);
fs_reg(enum register_file file, int reg, enum brw_reg_type type);
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index a5fc26f..7bb9459 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -46,6 +46,7 @@ public:
src_reg(int32_t i);
src_reg(uint8_t vf[4]);
src_reg(uint8_t vf0, uint8_t vf1, uint8_t vf2, uint8_t vf3);
+   src_reg(const array_reg &reg, unsigned swizzle);
src_reg(const dst_reg &reg);
src_reg(struct brw_reg reg);
 
@@ -131,6 +132,7 @@ public:
dst_reg(register_file file, int reg);
dst_reg(register_file file, int reg, const glsl_type *type,
unsigned writemask);
+   dst_reg(const array_reg &reg, unsigned writemask);
dst_reg(struct brw_reg reg);
dst_reg(class vec4_visitor *v, const struct glsl_type *type);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 9256747..dea35dc 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -138,6 +138,11 @@ src_reg::src_reg(const dst_reg &reg)
this->swizzle = brw_swizzle_for_mask(reg.writemask);
 }
 
+src_reg::src_reg(const array_reg &reg, unsigned swizzle) :
+   backend_reg(reg), swizzle(swizzle), reladdr(NULL)
+{
+}
+
 void
 dst_reg::init()
 {
@@ -170,6 +175,11 @@ dst_reg::dst_reg(register_file file, int reg, const glsl_type *type,
this->writemask = writemask;
 }
 
+dst_reg::dst_reg(const array_reg &reg, unsigned writemask) :
+   backend_reg(reg), writemask(writemask), reladdr(NULL)
+{
+}
+
 dst_reg::dst_reg(struct brw_reg reg)
 {
init();
-- 
2.3.5



[Mesa-dev] [Bug 90207] [r600g, bisected] regression: NI/Turks crash on WebGL Water (most WebGL stuff)

2015-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90207

--- Comment #14 from Aaron Watry awa...@gmail.com ---
(In reply to José Fonseca from comment #13)
 I think I figured out the problem.
 
 As commented on DECL_RESOURCE_FUNC macro, the RESOURCE_VAR inline function
 is not type safe, and stuff that's not a ir_variable is wrongly being casted
 into it. 
 
 This patch seems to do the trick, but I'm not 100% sure.
 

That patch fixes the crash in Konqueror for me.

Whether or not it's correct, that's up to people who know the code better than
myself.



Re: [Mesa-dev] [PATCH 03/18] i965: Refactor and expose brw_upload_binding_table()

2015-04-28 Thread Kenneth Graunke
On Wednesday, April 22, 2015 11:47:23 PM Topi Pohjolainen wrote:
 Read and write parts of the state stage are also split into
 explicit arguments allowing future patches to use constant
 program data.
 
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_binding_tables.c | 21 ++---
  src/mesa/drivers/dri/i965/brw_state.h  |  7 +++
  2 files changed, 21 insertions(+), 7 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
 b/src/mesa/drivers/dri/i965/brw_binding_tables.c
 index 459165a..1142c67 100644
 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
 +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
 @@ -124,9 +125,12 @@ const struct brw_tracked_state brw_vs_binding_table = {
  static void
  brw_upload_wm_binding_table(struct brw_context *brw)
  {
 +   /* BRW_NEW_WM_PROG_DATA */
 +   const struct brw_stage_prog_data *prog_data = brw->wm.base.prog_data;

   /* BRW_NEW_FS_PROG_DATA */ (it's not WM anymore)




Re: [Mesa-dev] [PATCH v5] i965/aa: fixing anti-aliasing bug for thinnest width lines - GEN6

2015-04-28 Thread Kenneth Graunke
On Tuesday, April 28, 2015 01:16:14 PM Matt Turner wrote:
 Could anyone spare a minute to take a look at this patch? It seems
 fine to me... but line rasterization rules are not something I really
 claim to understand. FWIW, it does fix Eric's line-aa-width piglit
 test.
 
 I'm inclined to commit it (and the Gen7 patch).

I don't understand them either - I'd always hoped to figure out what the
Windows driver does at some point and implement what they do.

That said, the current behavior is clearly broken and this appears to be
better, so I think we should commit it, too.

Acked-by: Kenneth Graunke kenn...@whitecape.org

Thanks Marius!




Re: [Mesa-dev] i965: Blorp state setup refactors

2015-04-28 Thread Kenneth Graunke
On Thursday, April 23, 2015 09:00:25 PM Topi Pohjolainen wrote:
 This series introduces virtual member functions for blorp parameters
 that know how certain part of the batch is to be programmed for the
 shader in question.
 
 This will be taken advantage of later on when I add support for
 launching glsl-based programs.
 
 Topi Pohjolainen (16):
   i965/blorp: Remove constant parameter
   i965/blorp: Refactor vertex buffer state setup
   i965/blorp: Allow caller to provide sampler settings
   i965/gen7/blorp: Remove unused arguments
   i965/blorp: Remove unused arguments
   i965/blorp: Prepare for attributes other than render position
   i965/blorp: Allow blend state to be set for multiple render targets
   i965/blorp: Add support for layered rendering
   i965/blorp: Prepare drawing rectangle for flipped coordinates

Patches 1-9 are:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

I'm still planning to read through 10-16.

   i965/blorp: Use virtual function for wm/ps configuration
   i965/blorp: Move push const setup for the parameter type to handle
   i965/blorp: Move sampler setup for the parameter type to handle
   i965/blorp/gen6: Move surface setup for the parameter type to handle
   i965/blorp/gen7: Move surface setup for the parameter type to handle
   i965/blorp: Move vertex uploading for parameter type to handle
   i965/blorp: Move multisample setup for parameter type to handle
 
  src/mesa/drivers/dri/i965/brw_blorp.cpp  |  23 +--
  src/mesa/drivers/dri/i965/brw_blorp.h|  73 ---
  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp |  13 +-
  src/mesa/drivers/dri/i965/gen6_blorp.cpp | 273 ---
  src/mesa/drivers/dri/i965/gen6_blorp.h   |   2 +-
  src/mesa/drivers/dri/i965/gen7_blorp.cpp | 204 +---
  src/mesa/drivers/dri/i965/gen7_blorp.h   |   2 +-
  7 files changed, 316 insertions(+), 274 deletions(-)




Re: [Mesa-dev] [PATCH v5] i965/aa: fixing anti-aliasing bug for thinnest width lines - GEN6

2015-04-28 Thread Matt Turner
Could anyone spare a minute to take a look at this patch? It seems
fine to me... but line rasterization rules are not something I really
claim to understand. FWIW, it does fix Eric's line-aa-width piglit
test.

I'm inclined to commit it (and the Gen7 patch).


Re: [Mesa-dev] [PATCH v5] i965/aa: fixing anti-aliasing bug for thinnest width lines - GEN6

2015-04-28 Thread Chris Forbes
Have an:

Acked-by: Chris Forbes chr...@ijw.co.nz

On Fri, Apr 24, 2015 at 3:41 AM, Marius Predut marius.pre...@intel.com wrote:
 On SNB and IVB hw, for 1 pixel line thickness or less,
 the general anti-aliasing algorithm gives up and a garbage line is generated.
 Setting a Line Width of 0.0 specifies the rasterization of
 the “thinnest” (one-pixel-wide), non-antialiased lines.
 Lines rendered with zero Line Width are rasterized using
 Grid Intersection Quantization rules as specified
 by bspec section 6.3.12.1 Zero-Width (Cosmetic) Line Rasterization.

 v2: Daniel Stone: Fix = used instead of == in an if-statement.
 v3: Ian Romanick: Use the ._Enabled flag instead of .Enabled.
 Add code comments. Re-word-wrap the commit message.
 Add a complete Bugzilla list.
 Improve the hardcoded values to produce better results.
 v4: Matt Turner: typo fixes and adjust <= 1.49 to become < 1.5

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28832
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=9951
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27007
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60797
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=15006

 Signed-off-by: Marius Predut marius.pre...@intel.com
 ---
  src/mesa/drivers/dri/i965/gen6_sf_state.c | 22 +++---
  1 file changed, 19 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c 
 b/src/mesa/drivers/dri/i965/gen6_sf_state.c
 index ea5c47a..e445ce2 100644
 --- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
 @@ -367,9 +367,25 @@ upload_sf_state(struct brw_context *brw)
float line_width =
   roundf(CLAMP(ctx->Line.Width, 0.0, ctx->Const.MaxLineWidth));
uint32_t line_width_u3_7 = U_FIXED(line_width, 7);
 -  /* TODO: line width of 0 is not allowed when MSAA enabled */
 -  if (line_width_u3_7 == 0)
 - line_width_u3_7 = 1;
 +
 +  /* Line width of 0 is not allowed when MSAA enabled */
 +  if (ctx->Multisample._Enabled) {
 + if (line_width_u3_7 == 0)
 + line_width_u3_7 = 1;
 +  } else if (ctx->Line.SmoothFlag && ctx->Line.Width < 1.5) {
 + /* For 1 pixel line thickness or less, the general
 +  * anti-aliasing algorithm gives up, and a garbage line is
 +  * generated.  Setting a Line Width of 0.0 specifies the
 +  * rasterization of the thinnest (one-pixel-wide),
 +  * non-antialiased lines.
 +  *
 +  * Lines rendered with zero Line Width are rasterized using
 +  * Grid Intersection Quantization rules as specified by
 +  * bspec section 6.3.12.1 Zero-Width (Cosmetic) Line
 +  * Rasterization.
 +  */
 + line_width_u3_7 = 0;
 +  }
dw3 |= line_width_u3_7 << GEN6_SF_LINE_WIDTH_SHIFT;
 }
 if (ctx->Line.SmoothFlag) {
 --
 1.9.1
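
For anyone who wants to poke at the width selection outside the driver, here is a minimal standalone sketch of the logic in the hunk above. It is not the driver code: to_ufixed(), round_nonneg(), clampf() and pick_line_width_u3_7() are hypothetical stand-ins for U_FIXED(), roundf(), CLAMP() and the inline code in upload_sf_state(), and the context fields are passed in as plain arguments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for U_FIXED(): unsigned fixed point with
 * frac_bits fractional bits. */
static uint32_t to_ufixed(float value, unsigned frac_bits)
{
   return (uint32_t)(value * (float)(1u << frac_bits));
}

/* Stand-in for CLAMP(). */
static float clampf(float x, float lo, float hi)
{
   return x < lo ? lo : (x > hi ? hi : x);
}

/* Stand-in for roundf(), valid for the non-negative values this sketch
 * feeds it (the input is clamped to >= 0 first). */
static float round_nonneg(float x)
{
   return (float)(uint32_t)(x + 0.5f);
}

/* Returns the U3.7 line width that would be programmed:
 *  - with multisampling, a width of 0 is bumped to the smallest
 *    non-zero value;
 *  - otherwise, smooth lines narrower than 1.5 pixels get width 0,
 *    which selects the thinnest non-antialiased rasterization. */
static uint32_t pick_line_width_u3_7(float width, float max_width,
                                     bool multisample, bool smooth)
{
   float line_width = round_nonneg(clampf(width, 0.0f, max_width));
   uint32_t fixed = to_ufixed(line_width, 7);

   if (multisample) {
      if (fixed == 0)
         fixed = 1;
   } else if (smooth && width < 1.5f) {
      fixed = 0;
   }
   return fixed;
}
```

Note that the 1.5 comparison uses the unrounded width, as in the patch, so a smooth line of width 1.5 keeps its rounded width of 2 pixels while 1.4 drops to the zero-width path.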



Re: [Mesa-dev] [PATCH 14/18] i965/wm/gen6: Refactor program offset setup

2015-04-28 Thread Kenneth Graunke
On Wednesday, April 22, 2015 11:47:34 PM Topi Pohjolainen wrote:
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_state.h |  8 +
  src/mesa/drivers/dri/i965/gen6_wm_state.c | 56 
 ++-
  2 files changed, 41 insertions(+), 23 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
 b/src/mesa/drivers/dri/i965/brw_state.h
 index 23f36c0..ca3274d 100644
 --- a/src/mesa/drivers/dri/i965/brw_state.h
 +++ b/src/mesa/drivers/dri/i965/brw_state.h
 @@ -292,6 +292,14 @@ void brw_update_sampler_state(struct brw_context *brw,
uint32_t *sampler_state,
uint32_t batch_offset_for_sampler_state);
  
 +/* gen6_wm_state.c */
 +void
 +gen6_wm_state_set_programs(const struct brw_wm_prog_data *prog_data,
 +   const struct brw_stage_state *stage_state,
 +   int min_inv_per_frag,
 +   uint32_t *ksp0, uint32_t *ksp2,
 +   uint32_t *dw4, uint32_t *dw5, uint32_t *dw6);
 +
  /* gen6_sf_state.c */
  void
  calculate_attr_overrides(const struct brw_context *brw,
 diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c 
 b/src/mesa/drivers/dri/i965/gen6_wm_state.c
 index 8e673a4..bc921e5 100644
 --- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
 @@ -65,6 +65,37 @@ const struct brw_tracked_state gen6_wm_push_constants = {
 .emit = gen6_upload_wm_push_constants,
  };
  
 +void
 +gen6_wm_state_set_programs(const struct brw_wm_prog_data *prog_data,
 +   const struct brw_stage_state *stage_state,
 +   int min_inv_per_frag,
 +   uint32_t *ksp0, uint32_t *ksp2,
 +   uint32_t *dw4, uint32_t *dw5, uint32_t *dw6)
 +{
 +   if (prog_data->prog_offset_16 || prog_data->no_8) {
 +  *dw5 |= GEN6_WM_16_DISPATCH_ENABLE;
 +
 +  if (!prog_data->no_8 && min_inv_per_frag == 1) {
 + *dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
 + *dw4 |= (prog_data->base.dispatch_grf_start_reg <<
 +  GEN6_WM_DISPATCH_START_GRF_SHIFT_0);
 + *dw4 |= (prog_data->dispatch_grf_start_reg_16 <<
 +  GEN6_WM_DISPATCH_START_GRF_SHIFT_2);
 + *ksp0 = stage_state->prog_offset;
 + *ksp2 = stage_state->prog_offset + prog_data->prog_offset_16;
 +  } else {
 + *dw4 |= (prog_data->dispatch_grf_start_reg_16 <<
 +  GEN6_WM_DISPATCH_START_GRF_SHIFT_0);
 + *ksp0 = stage_state->prog_offset + prog_data->prog_offset_16;
 +  }
 +   } else {
 +  *dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
 +  *dw4 |= (prog_data->base.dispatch_grf_start_reg <<
 +   GEN6_WM_DISPATCH_START_GRF_SHIFT_0);
 +  *ksp0 = stage_state->prog_offset;
 +   }
 +   }
 +}
 +

This split feels awkward to me - the code to emit 3DSTATE_WM is now
split across multiple functions...and it has 5 out parameters.  I really
prefer keeping the code to fill out a packet's DWords together in one
function.

Could we keep it in one function, but instead make upload_wm_state()
take additional parameters, rather than poking at brw-> directly?

Sorry for the trouble...
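
To make the concern about the five out parameters concrete, here is one possible shape, purely as a sketch: the dispatch decision is returned as a plain struct, so all DWord packing can stay in the one function that emits the packet. This is not what either the patch or the review proposes verbatim; struct wm_prog_info, struct wm_dispatch and choose_wm_dispatch() are hypothetical names, and the fields are simplified stand-ins for the brw_wm_prog_data members used above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the program data used by the hunk. */
struct wm_prog_info {
   uint32_t prog_offset_16;   /* offset of the SIMD16 program, 0 if none */
   bool no_8;                 /* true if there is no usable SIMD8 program */
   uint32_t grf_start_8;      /* dispatch start GRF for SIMD8 */
   uint32_t grf_start_16;     /* dispatch start GRF for SIMD16 */
};

/* Result of the decision; the caller packs these into dw4/dw5/ksp. */
struct wm_dispatch {
   bool enable_8, enable_16;
   uint32_t ksp0, ksp2;
   uint32_t grf_start_0, grf_start_2;
};

static struct wm_dispatch
choose_wm_dispatch(const struct wm_prog_info *prog, uint32_t base_offset,
                   int min_inv_per_frag)
{
   struct wm_dispatch d = { false, false, 0, 0, 0, 0 };

   if (prog->prog_offset_16 || prog->no_8) {
      d.enable_16 = true;

      if (!prog->no_8 && min_inv_per_frag == 1) {
         /* Both widths: SIMD8 in slot 0, SIMD16 in slot 2. */
         d.enable_8 = true;
         d.grf_start_0 = prog->grf_start_8;
         d.grf_start_2 = prog->grf_start_16;
         d.ksp0 = base_offset;
         d.ksp2 = base_offset + prog->prog_offset_16;
      } else {
         /* SIMD16 only: it takes slot 0. */
         d.grf_start_0 = prog->grf_start_16;
         d.ksp0 = base_offset + prog->prog_offset_16;
      }
   } else {
      /* SIMD8 only. */
      d.enable_8 = true;
      d.grf_start_0 = prog->grf_start_8;
      d.ksp0 = base_offset;
   }
   return d;
}
```

The branch structure mirrors the quoted hunk one to one; only the way the result travels back to the caller differs.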




Re: [Mesa-dev] [PATCH 0/4] clover: this serie remove util/compat.*

2015-04-28 Thread Tom Stellard
On Fri, Apr 24, 2015 at 12:59:53PM +0200, EdB wrote:
 Since clover should compile using -std=c++11,
 the compat classes are no longer necessary
 
 EdB (4):
   clover: remove compat class that matche std one
   clover: remove compat::string
   clover: make module::symbol::name a string
   clover: remove util/compat

I don't think patch 4 ever made it to the list.  Maybe it is too big?
Could you try resending it or post a link to a public git repo with that
commit?

Thanks,
Tom

 
  src/gallium/state_trackers/clover/Makefile.sources |   2 -
  src/gallium/state_trackers/clover/api/program.cpp  |  19 +-
  .../state_trackers/clover/core/compiler.hpp|  14 +-
  src/gallium/state_trackers/clover/core/error.hpp   |  10 +-
  src/gallium/state_trackers/clover/core/kernel.cpp  |   2 +-
  src/gallium/state_trackers/clover/core/module.cpp  |  56 ++-
  src/gallium/state_trackers/clover/core/module.hpp  |  23 +-
  src/gallium/state_trackers/clover/core/program.cpp |   4 +-
  src/gallium/state_trackers/clover/core/program.hpp |   2 +-
  .../state_trackers/clover/llvm/invocation.cpp  |  42 +-
  .../state_trackers/clover/tgsi/compiler.cpp|  12 +-
  src/gallium/state_trackers/clover/util/compat.cpp  |  38 --
  src/gallium/state_trackers/clover/util/compat.hpp  | 444 
 -
  13 files changed, 105 insertions(+), 563 deletions(-)
  delete mode 100644 src/gallium/state_trackers/clover/util/compat.cpp
  delete mode 100644 src/gallium/state_trackers/clover/util/compat.hpp
 
 -- 
 2.3.6
 


Re: [Mesa-dev] [PATCH 13/18] i965: Pass slice details as parameters for surface setup

2015-04-28 Thread Kenneth Graunke
On Wednesday, April 22, 2015 11:47:33 PM Topi Pohjolainen wrote:
 Also changed a couple of direct shifts into SET_FIELD().
 
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_context.h   |  3 ++-
  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 30 
 +--
  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 14 +--
  src/mesa/drivers/dri/i965/gen8_surface_state.c| 10 +++-
  4 files changed, 29 insertions(+), 28 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index b90d329..ae28955 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -964,10 +964,11 @@ struct brw_context
 {
void (*update_texture_surface)(struct brw_context *brw,
   const struct intel_mipmap_tree *mt,
 - struct gl_texture_object *tObj,
   uint32_t tex_format,
   bool is_integer_format,
   GLenum target, uint32_t effective_depth,
 + uint32_t min_layer,
 + uint32_t min_lod, uint32_t mip_count, 
   int swizzle, uint32_t *surf_offset,
   bool for_gather);
uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
 diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
 b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 index f7acad4..ad5ddb5 100644
 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 @@ -310,16 +310,16 @@ update_buffer_texture_surface(struct gl_context *ctx,
  static void
  brw_update_texture_surface(struct brw_context *brw,
 const struct intel_mipmap_tree *mt,
 -   struct gl_texture_object *tObj,
 uint32_t tex_format,
 bool is_integer_format /* unused */,
 GLenum target,
 uint32_t effective_depth /* unused */,
 +   uint32_t min_layer /* unused */,
 +   uint32_t min_lod, uint32_t mip_count, 
 int swizzle /* unused */,
 uint32_t *surf_offset,
 bool for_gather)
  {
 -   struct intel_texture_object *intelObj = intel_texture_object(tObj);
 uint32_t *surf;
  
 surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
 @@ -361,16 +361,16 @@ brw_update_texture_surface(struct brw_context *brw,
  
    surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
  
 -   surf[2] = ((intelObj->_MaxLevel - tObj->BaseLevel) << BRW_SURFACE_LOD_SHIFT |
 -   (mt->logical_width0 - 1) << BRW_SURFACE_WIDTH_SHIFT |
 -   (mt->logical_height0 - 1) << BRW_SURFACE_HEIGHT_SHIFT);
 +   surf[2] = SET_FIELD(mip_count, BRW_SURFACE_LOD) |
 + SET_FIELD(mt->logical_width0 - 1, BRW_SURFACE_WIDTH) |
 + SET_FIELD(mt->logical_height0 - 1, BRW_SURFACE_HEIGHT);
  
 -   surf[3] = (brw_get_surface_tiling_bits(mt->tiling) |
 -   (mt->logical_depth0 - 1) << BRW_SURFACE_DEPTH_SHIFT |
 -   (mt->pitch - 1) << BRW_SURFACE_PITCH_SHIFT);
 +   surf[3] = brw_get_surface_tiling_bits(mt->tiling) |
 +  SET_FIELD(mt->logical_depth0 - 1, BRW_SURFACE_DEPTH) |
 +  SET_FIELD(mt->pitch - 1, BRW_SURFACE_PITCH);
  
 -   surf[4] = (brw_get_surface_num_multisamples(mt->num_samples) |
 -  SET_FIELD(tObj->BaseLevel - mt->first_level, BRW_SURFACE_MIN_LOD));
 +   surf[4] = brw_get_surface_num_multisamples(mt->num_samples) |
 + SET_FIELD(min_lod, BRW_SURFACE_MIN_LOD);

This is not equivalent...Min Lod used to be:

   tObj->BaseLevel - mt->first_level

and now it is:

   tObj->MinLevel + tObj->BaseLevel - mt->first_level

I would really appreciate it if you could make this a separate patch
from the refactoring, for easier bisectability.  (First add tObj->MinLevel
to the Gen4-6 code, then do this refactor.)

It seems like a fine change, but is certainly worth noting in the commit
message.  Perhaps this is what fixed some tests?

Thanks!
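
A tiny worked example of the difference, as a sketch only: the two helper functions below are hypothetical and just restate the two expressions quoted above, and the EX_MIN_LOD shift and mask are made-up illustrative values, not the real BRW_SURFACE_MIN_LOD definitions. EX_SET_FIELD is likewise a simplified stand-in for a shift-and-mask field macro.

```c
#include <stdint.h>

/* Illustrative field layout only; not taken from the hardware docs. */
#define EX_MIN_LOD_SHIFT 28
#define EX_MIN_LOD_MASK  (0xfu << EX_MIN_LOD_SHIFT)

/* Simplified shift-and-mask helper in the spirit of SET_FIELD(). */
#define EX_SET_FIELD(value, field) \
   (((uint32_t)(value) << field##_SHIFT) & field##_MASK)

/* Min LOD as computed before the patch. */
static uint32_t min_lod_before(uint32_t base_level, uint32_t first_level)
{
   return base_level - first_level;
}

/* Min LOD as computed after the patch: MinLevel is now added in. */
static uint32_t min_lod_after(uint32_t min_level, uint32_t base_level,
                              uint32_t first_level)
{
   return min_level + base_level - first_level;
}
```

With MinLevel == 0 the two agree, so the behaviour only changes for texture views with a non-zero MinLevel: for BaseLevel 2, first_level 0 and MinLevel 1 the old value is 2 and the new one is 3.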




Re: [Mesa-dev] i965: Batch emission refactoring

2015-04-28 Thread Kenneth Graunke
On Wednesday, April 22, 2015 11:47:20 PM Topi Pohjolainen wrote:
 Currently batch emission logic is bolted into using the current
 gl-state and currently bound user shader programs as input. This
 series refactors the api to allow caller to give individual bits of
 information needed explicitly instead of the emission logic
 deducing them from the current state.
 
 This is needed to support blorp style gl-state-agnostic launching
 of internal utility shaders - shaders used for 2D blitting and
 buffer clearing/resolving.
 
 I have a follow-up series ready that is actually leveraging this;
 this series is a simple set of refactors. I didn't mean it to, but
 it actually fixes one piglit test on ILK due to the way formats
 are set for texture surfaces: arb_copy_image.arb_copy_image-formats.
 
 Patches 6-13 all address texture surface setup. They move all the
 decision making of values into the hardware agnostic dispatcher
 leaving the hw-specific part just to deal with formatting.
 
 Topi Pohjolainen (18):
   i965: Refactor rb surface setup to allow caller to store offsets
   i965: Expose and refactor brw_update_renderbuffer_surfaces()
   i965: Refactor and expose brw_upload_binding_table()
   i965: Remove dependency to tex object in default color setup
   i965: Refactor sampler state setup
   i965: Move texture buffer dispatch into single location
   i965/gen8: Use miptree format in the surface setup
   i965: Move tex miptree and format resolving into dispatcher
   i965: Move texture swizzle resolving into dispatcher
   i965: Pass integer format flag as parameter to surface setup
   i965: Refactor effective depth calculation
   i965: Pass texture target as parameter for surface setup
   i965: Pass slice details as parameters for surface setup

I requested a small change on this patch.

   i965/wm/gen6: Refactor program offset setup

I NAK'd this one.

The rest of this 18 patch series looks great to me and is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

It looks like Curro landed different texture surface state refactoring
patches in the meantime, though...so the two of you will need to decide
how to sort that out :(

   i965/wm/gen6: Refactor push constant state uploading
   i965/ps/gen7: Refactor state uploading
   i965/ps/gen8: Refactor state uploading
   i965/gen8: Expose state base address setup




[Mesa-dev] [PATCH] mesa: Fix glGetProgramiv(GL_ACTIVE_ATTRIBUTES).

2015-04-28 Thread Jose Fonseca
It's returning random values, because RESOURCE_VAR() is casting
different objects into ir_variable pointers.

This updates _mesa_count_active_attribs to filter the resources with
the same logic used in _mesa_longest_attribute_name_length.

https://bugs.freedesktop.org/show_bug.cgi?id=90207

P.S.: RESOURCE_VAR cast helper should have assertions to catch this.
---
 src/mesa/main/shader_query.cpp | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index a84ec84..d2ca49b 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -302,8 +302,10 @@ _mesa_count_active_attribs(struct gl_shader_program *shProg)
struct gl_program_resource *res = shProg->ProgramResourceList;
unsigned count = 0;
for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
- if (is_active_attrib(RESOURCE_VAR(res)))
-count++;
+  if (res->Type == GL_PROGRAM_INPUT &&
+  res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
+  is_active_attrib(RESOURCE_VAR(res)))
+ count++;
}
return count;
 }
-- 
2.1.0
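
The underlying pattern, checking the type tag of a heterogeneous list entry before casting its payload, can be sketched standalone as below. Everything here is hypothetical and simplified for illustration: enum res_type, struct resource, struct variable and count_active_attribs() are not the Mesa types, they only mimic the shape of the program resource list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type tags and stage indices. */
enum res_type { RES_INPUT, RES_OUTPUT, RES_UNIFORM };
#define STAGE_VERTEX   0
#define STAGE_FRAGMENT 1

/* Two unrelated payload types that share one list. */
struct variable { bool active_attrib; };
struct uniform  { int location; };

struct resource {
   enum res_type type;
   unsigned stage_refs;   /* bitmask of (1 << stage) */
   const void *data;      /* points at a payload matching `type` */
};

/* Counts active vertex inputs. The payload is only interpreted as a
 * struct variable after the tag and stage mask have been checked;
 * treating every entry as a variable would read unrelated memory. */
static unsigned count_active_attribs(const struct resource *list, size_t n)
{
   unsigned count = 0;

   for (size_t i = 0; i < n; i++) {
      const struct resource *res = &list[i];

      if (res->type != RES_INPUT ||
          !(res->stage_refs & (1u << STAGE_VERTEX)))
         continue;

      const struct variable *var = res->data;
      if (var->active_attrib)
         count++;
   }
   return count;
}
```

An assertion on the tag inside the cast helper itself, as the P.S. suggests for RESOURCE_VAR, would catch the unfiltered case in debug builds.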


