Re: [Mesa-dev] [PATCH] i965: Momentarily pretend to support ARB_texture_stencil8 for blits.

2015-06-10 Thread Jason Ekstrand
On Jun 10, 2015 9:57 AM, "Neil Roberts"  wrote:
>
> Kenneth Graunke  writes:
>
> > _mesa_meta_fb_tex_blit_begin(ctx, &blit);
> > +   ctx->Extensions.ARB_texture_stencil8 = true;
>
> Maybe you could put assert(ctx->Extensions.ARB_texture_stencil8==false)
> just before setting it to true so that we'll definitely remember to
> remove it if we eventually enable the extension. Otherwise as it stands
> if we forget about this it would probably not break any tests but the
> extension would mysteriously disable itself and no-one would notice
> because they would probably check the extensions just once upfront.

I'll second that if it's not already pushed.

> - Neil
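
For illustration, a minimal sketch of the guard Neil suggests. The types and the with_stencil8_forced() helper are hypothetical stand-ins, not the real gl_context or any Mesa function; only the flag handling is meant to carry over.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real Mesa types; only the flag matters here.
struct extensions { bool ARB_texture_stencil8; };
struct context { extensions Extensions; };

// Runs `blit` with the extension flag temporarily forced on and returns
// whatever the blit callback reports, so the effect is observable.
template <typename Blit>
bool with_stencil8_forced(context *ctx, Blit blit)
{
   // Trips as soon as the extension is enabled for real, so the temporary
   // override cannot silently switch it back off afterwards.
   assert(!ctx->Extensions.ARB_texture_stencil8);

   ctx->Extensions.ARB_texture_stencil8 = true;
   const bool seen = blit(ctx);
   ctx->Extensions.ARB_texture_stencil8 = false;
   return seen;
}
```

The assert costs nothing in release builds, and in debug builds it turns a silent "extension disables itself after the first blit" into an immediate failure.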
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Remove one more fixed brw_null_reg() from the visitor.

2015-06-10 Thread Jason Ekstrand
LGTM

Reviewed-by: Jason Ekstrand 
On Jun 10, 2015 7:39 AM, "Francisco Jerez"  wrote:

> Instead use fs_builder::null_reg_f() which has the correct register
> width.  Avoids the assertion failure in fs_builder::emit() hit by the
> "ES3-CTS.shaders.loops.for_dynamic_iterations.unconditional_break_fragment"
> GLES3 conformance test introduced by
> 4af4cfba9ee1014baa4a777660fc9d53d57e4c82.
>
> Reported-and-reviewed-by: Tapani Pälli 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 7789ca7..5563c5a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3234,7 +3234,7 @@ fs_visitor::lower_integer_multiplication()
>   ibld.ADD(dst, low, high);
>
>   if (inst->conditional_mod) {
> -fs_reg null(retype(brw_null_reg(), inst->dst.type));
> +fs_reg null(retype(ibld.null_reg_f(), inst->dst.type));
>  set_condmod(inst->conditional_mod,
>  ibld.MOV(null, inst->dst));
>   }
> --
> 2.3.5
>


Re: [Mesa-dev] [PATCH] i965: Re-index SSA definitions before printing NIR code.

2015-06-11 Thread Jason Ekstrand
On Thu, Jun 11, 2015 at 8:12 AM, Connor Abbott  wrote:
> The one thing this will hurt is that diff'ing shaders from before and
> after an optimization becomes harder, since just printing the shader
> will re-order the numbers and add spurious changes. If we want to make
> the result of doing INTEL_DEBUG=fs more reasonable, we could just do
> it at the end of the optimization loop or before dumping the shader...

Um... I think that's what this patch does...

> On Wed, Jun 10, 2015 at 2:39 AM, Kenneth Graunke  
> wrote:
>> This makes the SSA definitions use sequential numbers (0, 1, 2, ...)
>> instead of seemingly random ones.  There's not much point normally,
>> but it makes debug output much easier to read.
>>
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/mesa/drivers/dri/i965/brw_nir.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
>> b/src/mesa/drivers/dri/i965/brw_nir.c
>> index 142162c..c13708a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
>> @@ -167,6 +167,12 @@ brw_create_nir(struct brw_context *brw,
>> nir_validate_shader(nir);
>>
>> if (unlikely(debug_enabled)) {
>> +  /* Re-index SSA defs so we print more sensible numbers. */
>> +  nir_foreach_overload(nir, overload) {
>> + if (overload->impl)
>> +nir_index_ssa_defs(overload->impl);
>> +  }
>> +
>>fprintf(stderr, "NIR (SSA form) for %s shader:\n",
>>_mesa_shader_stage_to_string(stage));
>>nir_print_shader(nir, stderr);
>> --
>> 2.4.2
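
As a rough picture of what re-indexing means here, the sketch below renumbers a list of definitions in program order. The ssa_def struct and reindex_ssa_defs() are hypothetical simplifications, not the implementation of nir_index_ssa_defs().

```cpp
#include <vector>

// Hypothetical, simplified SSA definition: just an index.
struct ssa_def { unsigned index; };

// Assigns sequential indices (0, 1, 2, ...) in program order and returns
// the number of definitions that were renumbered.
unsigned reindex_ssa_defs(std::vector<ssa_def> &defs)
{
   unsigned next = 0;
   for (ssa_def &def : defs)
      def.index = next++;
   return next;
}
```

Because the new numbers depend only on order, a pass that inserts or removes one definition shifts every later index, which is the diffing cost Connor points out.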
>>


Re: [Mesa-dev] [PATCH 2/2] i965: Delete linked GLSL IR when using NIR.

2015-06-11 Thread Jason Ekstrand
On Thu, Jun 11, 2015 at 12:41 AM, Tapani Pälli  wrote:
> This is based on Kenneth's patch to delete 'most of the IR'. Due to
> linker changes to clone variables, we can now free all of IR.
>
> Saves 58MB of memory when replaying a Dota 2 trace on Broadwell.

I think we've saved ~50 MB 3 times now on that one dota trace.  Good work guys!

> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 76285f2..99de1cd 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -297,8 +297,11 @@ brw_link_shader(struct gl_context *ctx, struct 
> gl_shader_program *shProg)
>
>brw_add_texrect_params(prog);
>
> -  if (options->NirOptions)
> +  if (options->NirOptions) {
>   prog->nir = brw_create_nir(brw, shProg, prog, (gl_shader_stage) 
> stage);
> + ralloc_free(shader->ir);
> + shader->ir = NULL;
> +  }
>
>_mesa_reference_program(ctx, &prog, NULL);
> }
> --
> 2.1.0
>


Re: [Mesa-dev] [PATCH] i965: Don't create a temp PBO when uploading data from glTexImage*

2015-06-12 Thread Jason Ekstrand
On Fri, Jun 12, 2015 at 7:34 AM, Neil Roberts  wrote:
> Previously when glTexImage* is called it would attempt to create a
> temporary PBO if the texture is busy in order to avoid blocking when
> mapping the texture. This doesn't make much sense for glTexImage
> because in that case we are completely replacing the texture anyway so
> instead of allocating a PBO we can just allocate new storage for the
> texture.
>
> The code was buggy anyway because it was checking whether the buffer
> was busy before calling Driver->AllocTextureImageBuffer. That function
> actually always frees the buffer and recreates a new one so it was
> checking whether the previous buffer was busy and this is irrelevant.

I'm not sure this is correct.  You can still do partial updates with
TexImage if you are updating a single miplevel or cube face.

> In practice I think this wouldn't matter too much because the upper
> layers of Mesa always call Driver->FreeTextureImageBuffer before
> calling Driver->TexImage anyway so there would never be a buffer that
> could be busy.
> ---
>  src/mesa/drivers/dri/i965/intel_tex_image.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 85d3d04..2874e5b 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -95,8 +95,6 @@ intelTexImage(struct gl_context * ctx,
> struct intel_texture_image *intelImage = intel_texture_image(texImage);
> bool ok;
>
> -   bool tex_busy = intelImage->mt && drm_intel_bo_busy(intelImage->mt->bo);
> -
> DBG("%s mesa_format %s target %s format %s type %s level %d %dx%dx%d\n",
> __func__, _mesa_get_format_name(texImage->TexFormat),
> _mesa_lookup_enum_by_nr(texImage->TexObject->Target),
> @@ -116,7 +114,8 @@ intelTexImage(struct gl_context * ctx,
> texImage->Depth,
> format, type, pixels,
> false /*allocate_storage*/,
> -   tex_busy, unpack);
> +   false /*create_pbo*/,
> +   unpack);
> if (ok)
>return;
>
> --
> 1.9.3
>


Re: [Mesa-dev] [PATCH 02/14] meta: Fix transfer operations check in meta pbo path for readpixels

2015-06-16 Thread Jason Ekstrand
On Jun 16, 2015 11:15 AM, "Anuj Phogat"  wrote:
>
> Without this patch, arb_color_buffer_float-readpixels test fails, when
> forced to use meta pbo path.
>
> Signed-off-by: Anuj Phogat 
> Cc: 
> ---
>  src/mesa/drivers/common/meta_tex_subimage.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta_tex_subimage.c
b/src/mesa/drivers/common/meta_tex_subimage.c
> index d2474f5..00364f8 100644
> --- a/src/mesa/drivers/common/meta_tex_subimage.c
> +++ b/src/mesa/drivers/common/meta_tex_subimage.c
> @@ -273,12 +273,14 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context
*ctx, GLuint dims,
> format == GL_COLOR_INDEX)
>return false;
>
> -   if (ctx->_ImageTransferState)
> -  return false;
> -
> -
> +   /* Don't use meta path for readpixels in below conditions. */

A more descriptive comment would be nice.

> if (!tex_image) {
>rb = ctx->ReadBuffer->_ColorReadBuffer;
> +
> +  if (_mesa_get_readpixels_transfer_ops(ctx, rb->Format, format,
> +type, GL_FALSE))
> + return false;
> +
>if (_mesa_need_rgb_to_luminance_conversion(rb->Format, format))
>   return false;
> }
> --
> 1.9.3
>


Re: [Mesa-dev] [PATCH 03/14] mesa: Fix conditions to test signed, unsigned integer format

2015-06-16 Thread Jason Ekstrand
Please note in the commit message exactly what is broken.
On Jun 16, 2015 11:15, "Anuj Phogat"  wrote:

> Signed-off-by: Anuj Phogat 
> Cc: 
> ---
>  src/mesa/main/readpix.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
> index caa2648..a9416ef 100644
> --- a/src/mesa/main/readpix.c
> +++ b/src/mesa/main/readpix.c
> @@ -160,10 +160,12 @@ _mesa_readpixels_needs_slow_path(const struct
> gl_context *ctx, GLenum format,
>srcType = _mesa_get_format_datatype(rb->Format);
>
>if ((srcType == GL_INT &&
> +   _mesa_is_enum_format_integer(format) &&
> (type == GL_UNSIGNED_INT ||
>  type == GL_UNSIGNED_SHORT ||
>  type == GL_UNSIGNED_BYTE)) ||
>(srcType == GL_UNSIGNED_INT &&
> +   _mesa_is_enum_format_integer(format) &&
> (type == GL_INT ||
>  type == GL_SHORT ||
>  type == GL_BYTE))) {
> --
> 1.9.3
>


Re: [Mesa-dev] [PATCH 05/14] meta: Abort meta pbo path if readpixels need signed-unsigned conversion

2015-06-16 Thread Jason Ekstrand
On Jun 16, 2015 11:15, "Anuj Phogat"  wrote:
>
> Without this patch, piglit test fbo_integer_readpixels_sint_uint fails,
when
> forced to use the meta pbo path.
>
> Signed-off-by: Anuj Phogat 
> Cc: 
> ---
>  src/mesa/drivers/common/meta_tex_subimage.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/drivers/common/meta_tex_subimage.c
b/src/mesa/drivers/common/meta_tex_subimage.c
> index 00364f8..84cbc50 100644
> --- a/src/mesa/drivers/common/meta_tex_subimage.c
> +++ b/src/mesa/drivers/common/meta_tex_subimage.c
> @@ -283,6 +283,9 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx,
GLuint dims,
>
>if (_mesa_need_rgb_to_luminance_conversion(rb->Format, format))
>   return false;
> +
> +  if (_mesa_need_signed_unsigned_int_conversion(rb->Format, format,
type))
> + return false;

Hrm... This seems fishy.  Isn't glBlitFramebuffer supposed to handle
format conversion with integers?  If so, we should probably fix it rather
than just skip it for the meta pbo path.

> }
>
> /* For arrays, use a tall (height * depth) 2D texture but taking into
> --
> 1.9.3
>


Re: [Mesa-dev] [PATCH] i965: Add missing braces around if-statement.

2015-06-18 Thread Jason Ekstrand
Wow... Reviewed-by: Jason Ekstrand 

On Thu, Jun 18, 2015 at 4:19 PM, Matt Turner  wrote:
> Fixes a performance problem caused by commit b639ed2f.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895
> ---
>  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
> b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> index c0c8dfa..49f2e3e 100644
> --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> @@ -339,12 +339,13 @@ is_color_fast_clear_compatible(struct brw_context *brw,
> mesa_format format,
> const union gl_color_union *color)
>  {
> -   if (_mesa_is_format_integer_color(format))
> +   if (_mesa_is_format_integer_color(format)) {
>if (brw->gen >= 8) {
>   perf_debug("Integer fast clear not enabled for (%s)",
>  _mesa_get_format_name(format));
>}
>return false;
> +   }
>
> for (int i = 0; i < 4; i++) {
>if (color->f[i] != 0.0 && color->f[i] != 1.0 &&
> --
> 2.3.6
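
The control flow behind this fix can be reduced to a toy model. The function names and the warnings counter below are hypothetical; only the brace placement mirrors the patch. Without braces the outer `if` guards just the inner `if`, so `return false` runs for every format and nothing is ever reported as fast-clear compatible.

```cpp
// Before the fix: the indentation suggests `return false` belongs to the
// outer `if`, but it is a separate statement and always executes.
// Compilers with a misleading-indentation warning may flag this.
bool compatible_before(bool is_integer, int gen, int *warnings)
{
   if (is_integer)
      if (gen >= 8) {
         ++*warnings;   // stands in for perf_debug()
      }
      return false;

   return true;   // unreachable
}

// After the fix: the braces scope the early return to integer formats.
bool compatible_after(bool is_integer, int gen, int *warnings)
{
   if (is_integer) {
      if (gen >= 8) {
         ++*warnings;   // stands in for perf_debug()
      }
      return false;
   }

   return true;
}
```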
>


Re: [Mesa-dev] New stable-branch 10.5 candidate pushed

2015-06-18 Thread Jason Ekstrand
We should also pull in Chris' 3-patch drawbuffers series if it applies:

http://lists.freedesktop.org/archives/mesa-dev/2015-June/085851.html

On Thu, Jun 18, 2015 at 11:30 AM, Emil Velikov  wrote:
> Hello list,
>
> The candidate for the Mesa 10.5.8 is now available. Currently we have:
>  - 11 queued
>  - 10 nominated (outstanding)
>  - and 0 rejected (obsolete) patches
>
> The present queue consists of a few nouveau and i965 patches, along with
> a couple of libEGL bug fixes and a build fix for people using llvm/clang.
>
> Take a look at section "Mesa stable queue" for more information.
>
> Testing
> ---
> The following results are against piglit 305ecc3ac89.
>
>
> Changes - classic i965(snb)
> ---
> None.
>
>
> Changes - swrast classic
> 
> None.
>
>
> Changes - gallium softpipe
> --
> None.
>
>
> Changes - gallium llvmpipe (LLVM 3.6)
> -
> None.
>
>
> Testing reports/general approval
> 
> Any testing reports (or general approval of the state of the branch)
> will be greatly appreciated.
>
>
> Trivial merge conflicts
> ---
> commit bb00457f49177d8d43417855f843887de3148e99
> Author: Jason Ekstrand 
>
> i965/fs: Don't let the EOT send message interfere with the MRF hack
>
> (cherry picked from commit 86e5afbfee5492235cab1a7be4ea49ac02be1644)
>
>
>
> The plan is to have 10.5.8 this Friday (19th of June).
>
> If you have any questions or comments that you would like to share
> before the release, please go ahead.
>
>
> Cheers,
> Emil
>
>
> Mesa stable queue
> -
>
> Nominated (10)
> ==
>
> Boyan Ding (2):
>   egl/x11: Remove duplicate call to dri2_x11_add_configs_for_visuals
>   i915: Add XRGB format to intel_screen_make_configs
>
> Brian Paul (1):
>   configure: don't try to build gallium DRI drivers if --disable-dri is set
>
> Ilia Mirkin (1):
>   glsl: add version checks to conditionals for builtin variable enablement
>   mesa: add GL_PROGRAM_PIPELINE support in KHR_debug calls
>
> Mario Kleiner (1):
>   nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads
>
> Tapani Pälli (2):
>   glsl: Allow dynamic sampler array indexing with GLSL ES < 3.00
>   glsl: validate sampler array indexing for 'constant-index-expression'
>
> Tom Stellard (3):
>   clover: Call clBuildProgram() notification function when build completes v2
>   gallium/drivers: Add threadsafe wrappers for pipe_context and pipe_screen
>   clover: Use threadsafe wrappers for pipe_screen and pipe_context
>
>
> Queued (11)
> ===
>
> Ben Widawsky (1):
>   i965: Disable compaction for EOT send messages
>
> Boyan Ding (1):
>   egl/x11: Set version of swrastLoader to 2
>
> Emil Velikov (1):
>   docs: Add sha256sums for the 10.5.7 release
>
> Erik Faye-Lund (1):
>   mesa: build xmlconfig to a separate static library
>
> Francisco Jerez (1):
>   i965: Don't compact instructions with unmapped bits.
>
> Ilia Mirkin (3):
>   nvc0/ir: fix collection of first uses for texture barrier insertion
>   nv50,nvc0: clamp uniform size to 64k
>   nvc0/ir: can't have a join on a load with an indirect source
>
> Jason Ekstrand (1):
>   i965/fs: Don't let the EOT send message interfere with the MRF hack
>
> Marek Olšák (1):
>   egl: fix setting context flags
>
> Roland Scheidegger (1):
>   draw: (trivial) fix NULL pointer dereference


[Mesa-dev] [PATCH 04/17] i965/fs: Explicitly set the exec_size on the add(32) in interpolation setup

2015-06-18 Thread Jason Ekstrand
Soon we will start using the builder to explicitly set all the execution
sizes.  We could make a 32-wide builder, but the builder asserts that we
never grow it, which is usually a reasonable assumption.  Since this one
instruction is a bit of an oddball, we just set the exec_size explicitly.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4770838..b00825e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1357,10 +1357,11 @@ fs_visitor::emit_interpolation_setup_gen6()
*/
   fs_reg int_pixel_xy(GRF, alloc.allocate(dispatch_width / 8),
   BRW_REGISTER_TYPE_UW, dispatch_width * 2);
-  abld.exec_all()
-  .ADD(int_pixel_xy,
-   fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)),
-   fs_reg(brw_imm_v(0x11001010)));
+  fs_inst *add = abld.exec_all()
+ .ADD(int_pixel_xy,
+  fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)),
+  fs_reg(brw_imm_v(0x11001010)));
+  add->exec_size = dispatch_width * 2;
 
   this->pixel_x = vgrf(glsl_type::float_type);
   this->pixel_y = vgrf(glsl_type::float_type);
-- 
2.4.3



[Mesa-dev] [PATCH 02/17] i965/fs: Fix fs_inst::regs_read() for uniform pull constant loads

2015-06-18 Thread Jason Ekstrand
Previously, fs_inst::regs_read() fell back to depending on the register
width for the second source.  This isn't really correct since it isn't a
SIMD8 value at all, but a SIMD4x2 value.  This commit changes it to
explicitly be always one register.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 37b6d0d..ce56657 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -763,6 +763,12 @@ fs_inst::regs_read(int arg) const
  return exec_size / 4;
   break;
 
+   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7:
+  /* The second argument is a single SIMD4x2 register */
+  if (arg == 1)
+ return 1;
+  break;
+
default:
   if (is_tex() && arg == 0 && src[0].file == GRF)
  return mlen;
-- 
2.4.3



[Mesa-dev] [PATCH 05/17] i965/fs: Set the builder group for emitting FB-write stencil/AA alpha

2015-06-18 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index b00825e..8a43ec8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1528,7 +1528,7 @@ fs_visitor::emit_single_fb_write(const fs_builder &bld,
 
if (payload.aa_dest_stencil_reg) {
   sources[length] = fs_reg(GRF, alloc.allocate(1));
-  bld.exec_all().annotate("FB write stencil/AA alpha")
+  bld.group(8, 0).exec_all().annotate("FB write stencil/AA alpha")
  .MOV(sources[length],
   fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0)));
   length++;
-- 
2.4.3



[Mesa-dev] [PATCH 01/17] i965/fs: Use a switch statement in fs_inst::regs_read()

2015-06-18 Thread Jason Ekstrand
This makes things a little simpler, more efficient, and quite a bit more
readable.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 45 ++--
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5563c5a..37b6d0d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -744,28 +744,29 @@ fs_inst::is_partial_write() const
 int
 fs_inst::regs_read(int arg) const
 {
-   if (is_tex() && arg == 0 && src[0].file == GRF) {
-  return mlen;
-   } else if (opcode == FS_OPCODE_FB_WRITE && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_URB_WRITE_SIMD8 && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_UNTYPED_ATOMIC && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_UNTYPED_SURFACE_READ && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_UNTYPED_SURFACE_WRITE && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_TYPED_ATOMIC && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_TYPED_SURFACE_READ && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_TYPED_SURFACE_WRITE && arg == 0) {
-  return mlen;
-   } else if (opcode == FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET && arg == 0) {
-  return mlen;
-   } else if (opcode == FS_OPCODE_LINTERP && arg == 0) {
-  return exec_size / 4;
+   switch (opcode) {
+   case FS_OPCODE_FB_WRITE:
+   case SHADER_OPCODE_URB_WRITE_SIMD8:
+   case SHADER_OPCODE_UNTYPED_ATOMIC:
+   case SHADER_OPCODE_UNTYPED_SURFACE_READ:
+   case SHADER_OPCODE_UNTYPED_SURFACE_WRITE:
+   case SHADER_OPCODE_TYPED_ATOMIC:
+   case SHADER_OPCODE_TYPED_SURFACE_READ:
+   case SHADER_OPCODE_TYPED_SURFACE_WRITE:
+   case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
+  if (arg == 0)
+ return mlen;
+  break;
+
+   case FS_OPCODE_LINTERP:
+  if (arg == 0)
+ return exec_size / 4;
+  break;
+
+   default:
+  if (is_tex() && arg == 0 && src[0].file == GRF)
+ return mlen;
+  break;
}
 
switch (src[arg].file) {
-- 
2.4.3



[Mesa-dev] [PATCH 03/17] i965/fs: Report the right value in fs_inst::regs_read() for PIXEL_X/Y

2015-06-18 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ce56657..4f98d63 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -769,6 +769,12 @@ fs_inst::regs_read(int arg) const
  return 1;
   break;
 
+   case FS_OPCODE_PIXEL_X:
+   case FS_OPCODE_PIXEL_Y:
+  if (arg == 0)
+ return 2;
+  break;
+
default:
   if (is_tex() && arg == 0 && src[0].file == GRF)
  return mlen;
-- 
2.4.3
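
Taken together, patches 01 to 03 leave the special cases of fs_inst::regs_read() shaped roughly like the standalone sketch below. The opcode names, the inst struct and the -1 "fall through to the generic path" convention are hypothetical simplifications, not the driver's code.

```cpp
// Hypothetical, trimmed-down opcode list and instruction type.
enum opcode {
   OP_FB_WRITE, OP_URB_WRITE, OP_LINTERP,
   OP_PULL_CONSTANT_LOAD, OP_PIXEL_X, OP_MOV
};

struct inst {
   opcode op;
   int mlen;        // message length in registers
   int exec_size;   // execution width in channels
};

// Returns the special-case register count for source `arg`, or -1 when
// the generic per-register-file computation should be used instead.
int special_regs_read(const inst &i, int arg)
{
   switch (i.op) {
   case OP_FB_WRITE:
   case OP_URB_WRITE:
      if (arg == 0)
         return i.mlen;          // message payload
      break;

   case OP_LINTERP:
      if (arg == 0)
         return i.exec_size / 4;
      break;

   case OP_PULL_CONSTANT_LOAD:
      if (arg == 1)
         return 1;               // a single SIMD4x2 register
      break;

   case OP_PIXEL_X:
      if (arg == 0)
         return 2;
      break;

   default:
      break;
   }

   return -1;
}
```

Grouping the opcodes that share a result under one case body is what lets the later patches add a new special case as a few lines instead of another else-if branch.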



[Mesa-dev] [PATCH 09/17] i965/fs: Remove fs_inst constructors that don't take an explicit exec_size

2015-06-18 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 30 ++
 src/mesa/drivers/dri/i965/brw_fs_builder.h |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  6 --
 src/mesa/drivers/dri/i965/brw_ir_fs.h  |  9 +
 4 files changed, 8 insertions(+), 39 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 740b51d..61235d7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -126,9 +126,9 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size)
init(opcode, exec_size, reg_undef, NULL, 0);
 }
 
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst)
+fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst)
 {
-   init(opcode, 0, dst, NULL, 0);
+   init(opcode, exec_size, dst, NULL, 0);
 }
 
 fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
@@ -138,12 +138,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, 
const fs_reg &dst,
init(opcode, exec_size, dst, src, 1);
 }
 
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0)
-{
-   const fs_reg src[1] = { src0 };
-   init(opcode, 0, dst, src, 1);
-}
-
 fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
  const fs_reg &src0, const fs_reg &src1)
 {
@@ -151,13 +145,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, 
const fs_reg &dst,
init(opcode, exec_size, dst, src, 2);
 }
 
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
- const fs_reg &src1)
-{
-   const fs_reg src[2] = { src0, src1 };
-   init(opcode, 0, dst, src, 2);
-}
-
 fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
  const fs_reg &src0, const fs_reg &src1, const fs_reg &src2)
 {
@@ -165,19 +152,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, 
const fs_reg &dst,
init(opcode, exec_size, dst, src, 3);
 }
 
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
- const fs_reg &src1, const fs_reg &src2)
-{
-   const fs_reg src[3] = { src0, src1, src2 };
-   init(opcode, 0, dst, src, 3);
-}
-
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst,
- const fs_reg src[], unsigned sources)
-{
-   init(opcode, 0, dst, src, sources);
-}
-
 fs_inst::fs_inst(enum opcode opcode, uint8_t exec_width, const fs_reg &dst,
  const fs_reg src[], unsigned sources)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h 
b/src/mesa/drivers/dri/i965/brw_fs_builder.h
index 594e252..74fd2c9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
@@ -281,7 +281,7 @@ namespace brw {
   instruction *
   emit(enum opcode opcode, const dst_reg &dst) const
   {
- return emit(instruction(opcode, dst));
+ return emit(instruction(opcode, dst.width, dst));
   }
 
   /**
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 0ede634..9941116 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -109,7 +109,8 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
  if (var->data.location == VARYING_SLOT_POS) {
 reg = *emit_fragcoord_interpolation(var->data.pixel_center_integer,
 var->data.origin_upper_left);
-emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, input, reg), 0xF);
+emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),
+  input, reg), 0xF);
  } else {
 emit_general_interpolation(input, var->name, var->type,
(glsl_interp_qualifier) 
var->data.interpolation,
@@ -1743,7 +1744,8 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, 
nir_tex_instr *instr)
fs_reg dest = get_nir_dest(instr->dest);
dest.type = this->result.type;
unsigned num_components = nir_tex_instr_dest_size(instr);
-   emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, dest, this->result),
+   emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),
+ dest, this->result),
 (1 << num_components) - 1);
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 10033ca..4e62efa 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -159,20 +159,13 @@ public:
 
fs_inst();
fs_inst(enum opcode opcode, uint8_t exec_size);
-   fs_inst(enum opcode opcode, const fs_reg &dst);
+   fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst);
fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
const fs_reg &src0);
-   fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0);
  

[Mesa-dev] [PATCH 08/17] i965/fs: Make better use of the builder in shader_time

2015-06-18 Thread Jason Ekstrand
Previously, we were just depending on register widths to ensure that
various things were exec_size of 1 etc.  Now, we do so explicitly using the
builder.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c13ac7d..740b51d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -557,7 +557,7 @@ fs_visitor::get_timestamp(const fs_builder &bld)
/* We want to read the 3 fields we care about even if it's not enabled in
 * the dispatch.
 */
-   bld.exec_all().MOV(dst, ts);
+   bld.group(4, 0).exec_all().MOV(dst, ts);
 
/* The caller wants the low 32 bits of the timestamp.  Since it's running
 * at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds,
@@ -637,17 +637,19 @@ fs_visitor::emit_shader_time_end()
start.negate = true;
fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1);
diff.set_smear(0);
-   ibld.ADD(diff, start, shader_end_time);
+
+   const fs_builder cbld = ibld.group(1, 0);
+   cbld.group(1, 0).ADD(diff, start, shader_end_time);
 
/* If there were no instructions between the two timestamp gets, the diff
 * is 2 cycles.  Remove that overhead, so I can forget about that when
 * trying to determine the time taken for single instructions.
 */
-   ibld.ADD(diff, diff, fs_reg(-2u));
-   SHADER_TIME_ADD(ibld, type, diff);
-   SHADER_TIME_ADD(ibld, written_type, fs_reg(1u));
+   cbld.ADD(diff, diff, fs_reg(-2u));
+   SHADER_TIME_ADD(cbld, type, diff);
+   SHADER_TIME_ADD(cbld, written_type, fs_reg(1u));
ibld.emit(BRW_OPCODE_ELSE);
-   SHADER_TIME_ADD(ibld, reset_type, fs_reg(1u));
+   SHADER_TIME_ADD(cbld, reset_type, fs_reg(1u));
ibld.emit(BRW_OPCODE_ENDIF);
 }
 
-- 
2.4.3



[Mesa-dev] [PATCH 12/17] i965/fs: Remove exec_size guessing from fs_inst::init()

2015-06-18 Thread Jason Ekstrand
Now that all of the non-explicit constructors are gone, we don't need to
guess anymore.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 22 --
 1 file changed, 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index cff27e7..d9b7f75 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -68,28 +68,6 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const 
fs_reg &dst,
 
assert(dst.file != IMM && dst.file != UNIFORM);
 
-   /* If exec_size == 0, try to guess it from the registers.  Since all
-* manner of things may use hardware registers, we first try to guess
-* based on GRF registers.  If this fails, we will go ahead and take the
-* width from the destination register.
-*/
-   if (this->exec_size == 0) {
-  if (dst.file == GRF) {
- this->exec_size = dst.width;
-  } else {
- for (unsigned i = 0; i < sources; ++i) {
-if (src[i].file != GRF && src[i].file != ATTR)
-   continue;
-
-if (this->exec_size <= 1)
-   this->exec_size = src[i].width;
-assert(src[i].width == 1 || src[i].width == this->exec_size);
- }
-  }
-
-  if (this->exec_size == 0 && dst.file != BAD_FILE)
- this->exec_size = dst.width;
-   }
assert(this->exec_size != 0);
 
this->conditional_mod = BRW_CONDITIONAL_NONE;
-- 
2.4.3



[Mesa-dev] [PATCH 14/17] i965/fs_builder: Use dispatch_width instead of reg.width for offset and half

2015-06-18 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs_builder.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h 
b/src/mesa/drivers/dri/i965/brw_fs_builder.h
index 7d3c8ab..58519d7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
@@ -161,7 +161,7 @@ namespace brw {
  case MRF:
  case ATTR:
 return byte_offset(reg,
-   delta * MAX2(reg.width * reg.stride, 1) *
+   delta * dispatch_width() * reg.stride *
type_sz(reg.type));
  case UNIFORM:
 reg.reg_offset += delta;
@@ -185,9 +185,9 @@ namespace brw {
 
  case GRF:
  case MRF:
-assert(reg.width == 16);
-reg.width = 8;
-return horiz_offset(reg, 8 * idx);
+assert(dispatch_width() == 16);
+reg.width = dispatch_width() / 2;
+return horiz_offset(reg, (dispatch_width() / 2) * idx);
 
  case ATTR:
  case HW_REG:
-- 
2.4.3
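
The arithmetic behind the two hunks above can be sketched in isolation. This is only an illustration, assuming plain unsigned inputs; component_byte_offset() and half_first_channel() are hypothetical names and are not part of brw_fs_builder.h.

```cpp
#include <cassert>

// Byte distance covered by `delta` logical components, mirroring the
// GRF/MRF/ATTR case of offset() after this patch: the channel count comes
// from the builder's dispatch width rather than from reg.width.
// (Hypothetical helper, for illustration only.)
inline unsigned component_byte_offset(unsigned delta, unsigned dispatch_width,
                                      unsigned stride, unsigned type_size)
{
   return delta * dispatch_width * stride * type_size;
}

// First channel covered by half `idx` of a SIMD16 value, mirroring the
// horizontal offset used by half() in the GRF/MRF case.
// (Hypothetical helper, for illustration only.)
inline unsigned half_first_channel(unsigned dispatch_width, unsigned idx)
{
   assert(dispatch_width == 16);
   assert(idx < 2);
   return (dispatch_width / 2) * idx;
}
```

With a 4-byte type and unit stride, one component is 32 bytes (one register) in SIMD8 and 64 bytes (two registers) in SIMD16.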



[Mesa-dev] [PATCH 15/17] i965/fs: Use exec_size instead of dst.width for computing component size

2015-06-18 Thread Jason Ekstrand
There are a variety of places where we use dst.width / 8 to compute the
size of a single logical channel.  Instead, we should be using exec_size.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp| 6 +++---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp| 2 +-
 src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp  | 2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp| 2 +-
 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 4 ++--
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b889432..b30463a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2317,12 +2317,12 @@ fs_visitor::opt_register_renaming()
 
   if (depth == 0 &&
   inst->dst.file == GRF &&
-  alloc.sizes[inst->dst.reg] == inst->dst.width / 8 &&
+  alloc.sizes[inst->dst.reg] == inst->exec_size / 8 &&
   !inst->is_partial_write()) {
  if (remap[dst] == -1) {
 remap[dst] = dst;
  } else {
-remap[dst] = alloc.allocate(inst->dst.width / 8);
+remap[dst] = alloc.allocate(inst->exec_size / 8);
 inst->dst.reg = remap[dst];
 progress = true;
  }
@@ -2453,7 +2453,7 @@ fs_visitor::compute_to_mrf()
 /* Things returning more than one register would need us to
  * understand coalescing out more than one MOV at a time.
  */
-if (scan_inst->regs_written > scan_inst->dst.width / 8)
+if (scan_inst->regs_written > scan_inst->exec_size / 8)
break;
 
/* SEND instructions can't have MRF as a destination. */
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 5ea66c5..55ed352 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -179,7 +179,7 @@ static void
 create_copy_instr(const fs_builder &bld, fs_inst *inst, fs_reg src, bool 
negate)
 {
int written = inst->regs_written;
-   int dst_width = inst->dst.width / 8;
+   int dst_width = inst->exec_size / 8;
const fs_builder ubld = bld.group(inst->exec_size, inst->force_sechalf)
   .exec_all(inst->force_writemask_all);
fs_inst *copy;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
index 2ad7079..149c0f0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
@@ -196,7 +196,7 @@ fs_visitor::register_coalesce()
 continue;
  }
  reg_to_offset[offset] = inst->dst.reg_offset;
- if (inst->src[0].width == 16)
+ if (inst->exec_size == 16)
 reg_to_offset[offset + 1] = inst->dst.reg_offset + 1;
  mov[offset] = inst;
  channels_remaining -= inst->regs_written;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 68cd454..9291d39 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -912,7 +912,7 @@ fs_visitor::emit_texture(ir_texture_opcode op,
   bld.emit(SHADER_OPCODE_INT_QUOTIENT, fixed_depth, depth, fs_reg(6));
 
   fs_reg *fixed_payload = ralloc_array(mem_ctx, fs_reg, 
inst->regs_written);
-  int components = inst->regs_written / (dst.width / 8);
+  int components = inst->regs_written / (inst->exec_size / 8);
   for (int i = 0; i < components; i++) {
  if (i == 2) {
 fixed_payload[i] = fixed_depth;
diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index ee0add5..b49961f 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -1314,8 +1314,8 @@ fs_instruction_scheduler::choose_instruction_to_schedule()
 * single-result send is probably actually reducing register
 * pressure.
 */
-   if (inst->regs_written <= inst->dst.width / 8 &&
-   chosen_inst->regs_written > chosen_inst->dst.width / 8) {
+   if (inst->regs_written <= inst->exec_size / 8 &&
+   chosen_inst->regs_written > chosen_inst->exec_size / 8) {
   chosen = n;
   continue;
} else if (inst->regs_written > chosen_inst->regs_written) {
-- 
2.4.3
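
The "exec_size / 8" idiom this patch switches to can be illustrated on its own. A minimal sketch, assuming 32-bit channels and an exec_size of at least 8; the helper names below are hypothetical and do not exist in the driver.

```cpp
#include <cassert>

// Registers occupied by one logical component of 32-bit channels.
// Assumes exec_size >= 8 so that the result is never zero.
// (Hypothetical helper, for illustration only.)
inline unsigned regs_per_component(unsigned exec_size)
{
   assert(exec_size >= 8);
   return exec_size / 8;
}

// Number of logical components in a multi-register result, as in the
// emit_texture() hunk: regs_written / (exec_size / 8).
inline unsigned component_count(unsigned regs_written, unsigned exec_size)
{
   return regs_written / regs_per_component(exec_size);
}

// The check used by compute_to_mrf() and the scheduler: does the
// instruction write more than a single component?
inline bool writes_multiple_components(unsigned regs_written,
                                       unsigned exec_size)
{
   return regs_written > regs_per_component(exec_size);
}
```

A SIMD16 instruction writing 8 registers and a SIMD8 instruction writing 4 registers both produce four components under this model.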



[Mesa-dev] [PATCH 10/17] i965/fs: Use exec_size for determining regs read/written and partial writes

2015-06-18 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 61235d7..cff27e7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -101,7 +101,7 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const 
fs_reg &dst,
case MRF:
case ATTR:
   this->regs_written =
- DIV_ROUND_UP(MAX2(dst.width * dst.stride, 1) * type_sz(dst.type), 32);
+ DIV_ROUND_UP(MAX2(exec_size * dst.stride, 1) * type_sz(dst.type), 32);
   break;
case BAD_FILE:
   this->regs_written = 0;
@@ -718,7 +718,7 @@ bool
 fs_inst::is_partial_write() const
 {
return ((this->predicate && this->opcode != BRW_OPCODE_SEL) ||
-   (this->dst.width * type_sz(this->dst.type)) < 32 ||
+   (this->exec_size * type_sz(this->dst.type)) < 32 ||
!this->dst.is_contiguous());
 }
 
@@ -772,8 +772,8 @@ fs_inst::regs_read(int arg) const
   if (src[arg].stride == 0) {
  return 1;
   } else {
- int size = src[arg].width * src[arg].stride * type_sz(src[arg].type);
- return (size + 31) / 32;
+ int size = this->exec_size * src[arg].stride * type_sz(src[arg].type);
+ return DIV_ROUND_UP(size, 32);
   }
case MRF:
   unreachable("MRF registers are not allowed as sources");
-- 
2.4.3
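
As a worked example of the register-count formula in the first hunk, here is a small stand-alone sketch. It assumes a 32-byte register; kRegSizeBytes, div_round_up(), regs_touched() and writes_less_than_one_reg() are illustrative names, not driver code.

```cpp
#include <algorithm>
#include <cassert>

// Illustrative constant: one GRF register holds 32 bytes.
constexpr unsigned kRegSizeBytes = 32;

// Round-up integer division, in the spirit of the DIV_ROUND_UP macro
// used in the patch.
constexpr unsigned div_round_up(unsigned a, unsigned b)
{
   return (a + b - 1) / b;
}

// Registers covered by an operand whose channel count is the instruction's
// exec_size. A stride of 0 (a scalar) still covers one element, which is
// what the MAX2(..., 1) in the patch guards against.
inline unsigned regs_touched(unsigned exec_size, unsigned stride,
                             unsigned type_size)
{
   return div_round_up(std::max(exec_size * stride, 1u) * type_size,
                       kRegSizeBytes);
}

// The size half of the is_partial_write() test: fewer than 32 bytes written.
inline bool writes_less_than_one_reg(unsigned exec_size, unsigned type_size)
{
   return exec_size * type_size < kRegSizeBytes;
}
```

For 4-byte types this gives one register in SIMD8 and two in SIMD16, while a SIMD16 operand of 2-byte type with unit stride fits in a single register.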



[Mesa-dev] [PATCH 13/17] i965/fs: Use the builder dispatch width instead of dst.width for pull constants

2015-06-18 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d9b7f75..b889432 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -188,7 +188,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
bld.ADD(vec4_offset, varying_offset, fs_reg(const_offset & ~3));
 
int scale = 1;
-   if (devinfo->gen == 4 && dst.width == 8) {
+   if (devinfo->gen == 4 && bld.dispatch_width() == 8) {
   /* Pre-gen5, we can either use a SIMD8 message that requires (header,
* u, v, r) as parameters, or we can just use the SIMD16 message
* consisting of (header, u).  We choose the second, at the cost of a
@@ -204,9 +204,9 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
   op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD;
 
assert(dst.width % 8 == 0);
-   int regs_written = 4 * (dst.width / 8) * scale;
+   int regs_written = 4 * (bld.dispatch_width() / 8) * scale;
fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written),
-   dst.type, dst.width);
+   dst.type, bld.dispatch_width());
fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset);
inst->regs_written = regs_written;
 
@@ -216,7 +216,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
   if (devinfo->gen == 4)
  inst->mlen = 3;
   else
- inst->mlen = 1 + dispatch_width / 8;
+ inst->mlen = 1 + bld.dispatch_width() / 8;
}
 
bld.MOV(dst, bld.offset(vec4_result, (const_offset & 3) * scale));
-- 
2.4.3



[Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-18 Thread Jason Ekstrand
We want to move these into the builder so that they know the current
builder's dispatch width.  This will be needed by a later commit.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |  52 ++
 src/mesa/drivers/dri/i965/brw_fs_builder.h   |  46 +
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  60 +--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 149 ++-
 src/mesa/drivers/dri/i965/brw_ir_fs.h|  51 -
 6 files changed, 182 insertions(+), 178 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4f98d63..c13ac7d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -267,7 +267,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
  inst->mlen = 1 + dispatch_width / 8;
}
 
-   bld.MOV(dst, offset(vec4_result, (const_offset & 3) * scale));
+   bld.MOV(dst, bld.offset(vec4_result, (const_offset & 3) * scale));
 }
 
 /**
@@ -361,7 +361,12 @@ fs_inst::is_copy_payload(const brw::simple_allocator 
&grf_alloc) const
   reg.width = this->src[i].width;
   if (!this->src[i].equals(reg))
  return false;
-  reg = ::offset(reg, 1);
+
+  if (i < this->header_size) {
+ reg.reg_offset += 1;
+  } else {
+ reg.reg_offset += this->exec_size / 8;
+  }
}
 
return true;
@@ -963,7 +968,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
pixel_center_integer,
} else {
   bld.ADD(wpos, this->pixel_x, fs_reg(0.5f));
}
-   wpos = offset(wpos, 1);
+   wpos = bld.offset(wpos, 1);
 
/* gl_FragCoord.y */
if (!flip && pixel_center_integer) {
@@ -979,7 +984,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
pixel_center_integer,
 
   bld.ADD(wpos, pixel_y, fs_reg(offset));
}
-   wpos = offset(wpos, 1);
+   wpos = bld.offset(wpos, 1);
 
/* gl_FragCoord.z */
if (devinfo->gen >= 6) {
@@ -989,7 +994,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
pixel_center_integer,
this->delta_xy[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
interp_reg(VARYING_SLOT_POS, 2));
}
-   wpos = offset(wpos, 1);
+   wpos = bld.offset(wpos, 1);
 
/* gl_FragCoord.w: Already set up in emit_interpolation */
bld.MOV(wpos, this->wpos_w);
@@ -1072,7 +1077,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const 
char *name,
/* If there's no incoming setup data for this slot, don't
 * emit interpolation for it.
 */
-   attr = offset(attr, type->vector_elements);
+   attr = bld.offset(attr, type->vector_elements);
location++;
continue;
 }
@@ -1087,7 +1092,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const 
char *name,
   interp = suboffset(interp, 3);
interp.type = attr.type;
bld.emit(FS_OPCODE_CINTERP, attr, fs_reg(interp));
-  attr = offset(attr, 1);
+  attr = bld.offset(attr, 1);
}
 } else {
/* Smooth/noperspective interpolation case. */
@@ -1125,7 +1130,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const 
char *name,
if (devinfo->gen < 6 && interpolation_mode == 
INTERP_QUALIFIER_SMOOTH) {
   bld.MUL(attr, attr, this->pixel_w);
}
-  attr = offset(attr, 1);
+  attr = bld.offset(attr, 1);
}
 
 }
@@ -1227,19 +1232,19 @@ fs_visitor::emit_samplepos_setup()
if (dispatch_width == 8) {
   abld.MOV(int_sample_x, fs_reg(sample_pos_reg));
} else {
-  abld.half(0).MOV(half(int_sample_x, 0), fs_reg(sample_pos_reg));
-  abld.half(1).MOV(half(int_sample_x, 1),
+  abld.half(0).MOV(abld.half(int_sample_x, 0), fs_reg(sample_pos_reg));
+  abld.half(1).MOV(abld.half(int_sample_x, 1),
fs_reg(suboffset(sample_pos_reg, 16)));
}
/* Compute gl_SamplePosition.x */
compute_sample_position(pos, int_sample_x);
-   pos = offset(pos, 1);
+   pos = abld.offset(pos, 1);
if (dispatch_width == 8) {
   abld.MOV(int_sample_y, fs_reg(suboffset(sample_pos_reg, 1)));
} else {
-  abld.half(0).MOV(half(int_sample_y, 0),
+  abld.half(0).MOV(abld.half(int_sample_y, 0),
fs_reg(suboffset(sample_pos_reg, 1)));
-  abld.half(1).MOV(half(int_sample_y, 1),
+  abld.half(1).MOV(abld.half(int_sample_y, 1),
fs_reg(suboffset(sample_pos_reg, 17)));
}
/* Compute gl_SamplePosition.y */
@@ -3018,10 +3023,6 @@ fs_visitor::lower_load_payload()
 
   assert(inst->dst.file == MRF || inst->dst.file == GRF);
   assert(inst->saturate == false);
-
-  const fs_builder ibld = bld.group(inst->exec_size, inst->force_sechalf)
- .exec_all(inst->force_writemask_all)
-

[Mesa-dev] [PATCH 11/17] i965/fs_builder: Use the dispatch width for setting exec sizes

2015-06-18 Thread Jason Ekstrand
Previously we used dst.width but the two *should* be the same.
---
 src/mesa/drivers/dri/i965/brw_fs_builder.h | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h 
b/src/mesa/drivers/dri/i965/brw_fs_builder.h
index 74fd2c9..7d3c8ab 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
@@ -281,7 +281,7 @@ namespace brw {
   instruction *
   emit(enum opcode opcode, const dst_reg &dst) const
   {
- return emit(instruction(opcode, dst.width, dst));
+ return emit(instruction(opcode, dispatch_width(), dst));
   }
 
   /**
@@ -299,11 +299,11 @@ namespace brw {
  case SHADER_OPCODE_SIN:
  case SHADER_OPCODE_COS:
 return fix_math_instruction(
-   emit(instruction(opcode, dst.width, dst,
+   emit(instruction(opcode, dispatch_width(), dst,
 fix_math_operand(src0))));
 
  default:
-return emit(instruction(opcode, dst.width, dst, src0));
+return emit(instruction(opcode, dispatch_width(), dst, src0));
  }
   }
 
@@ -319,12 +319,12 @@ namespace brw {
  case SHADER_OPCODE_INT_QUOTIENT:
  case SHADER_OPCODE_INT_REMAINDER:
 return fix_math_instruction(
-   emit(instruction(opcode, dst.width, dst,
+   emit(instruction(opcode, dispatch_width(), dst,
 fix_math_operand(src0),
 fix_math_operand(src1))));
 
  default:
-return emit(instruction(opcode, dst.width, dst, src0, src1));
+return emit(instruction(opcode, dispatch_width(), dst, src0, 
src1));
 
  }
   }
@@ -341,13 +341,14 @@ namespace brw {
  case BRW_OPCODE_BFI2:
  case BRW_OPCODE_MAD:
  case BRW_OPCODE_LRP:
-return emit(instruction(opcode, dst.width, dst,
+return emit(instruction(opcode, dispatch_width(), dst,
 fix_3src_operand(src0),
 fix_3src_operand(src1),
 fix_3src_operand(src2)));
 
  default:
-return emit(instruction(opcode, dst.width, dst, src0, src1, src2));
+return emit(instruction(opcode, dispatch_width(), dst,
+src0, src1, src2));
  }
   }
 
@@ -563,7 +564,8 @@ namespace brw {
   {
  assert(dst.width % 8 == 0);
  instruction *inst = emit(instruction(SHADER_OPCODE_LOAD_PAYLOAD,
-  dst.width, dst, src, sources));
+  dispatch_width(), dst,
+  src, sources));
  inst->header_size = header_size;
 
  for (unsigned i = 0; i < header_size; i++)
@@ -574,7 +576,7 @@ namespace brw {
  for (unsigned i = header_size; i < sources; ++i)
 assert(src[i].file != GRF ||
src[i].width == dst.width);
- inst->regs_written += (sources - header_size) * (dst.width / 8);
+ inst->regs_written += (sources - header_size) * (dispatch_width() / 
8);
 
  return inst;
   }
-- 
2.4.3



[Mesa-dev] [PATCH 06/17] i965/blorp: Explicitly set execution sizes for new'd instructions

2015-06-18 Thread Jason Ekstrand
This doesn't affect instructions allocated using the builder.
---
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
index c1b7609..f655a0c 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
@@ -72,7 +72,7 @@ brw_blorp_eu_emitter::emit_kill_if_outside_rect(const struct 
brw_reg &x,
emit_cmp(BRW_CONDITIONAL_L, x, dst_x1)->predicate = BRW_PREDICATE_NORMAL;
emit_cmp(BRW_CONDITIONAL_L, y, dst_y1)->predicate = BRW_PREDICATE_NORMAL;
 
-   fs_inst *inst = new (mem_ctx) fs_inst(BRW_OPCODE_AND, g1, f0, g1);
+   fs_inst *inst = new (mem_ctx) fs_inst(BRW_OPCODE_AND, 16, g1, f0, g1);
inst->force_writemask_all = true;
insts.push_tail(inst);
 }
@@ -83,7 +83,7 @@ brw_blorp_eu_emitter::emit_texture_lookup(const struct 
brw_reg &dst,
   unsigned base_mrf,
   unsigned msg_length)
 {
-   fs_inst *inst = new (mem_ctx) fs_inst(op, dst, brw_message_reg(base_mrf),
+   fs_inst *inst = new (mem_ctx) fs_inst(op, 16, dst, 
brw_message_reg(base_mrf),
  fs_reg(0u));
 
inst->base_mrf = base_mrf;
@@ -118,7 +118,8 @@ brw_blorp_eu_emitter::emit_combine(enum opcode 
combine_opcode,
 {
assert(combine_opcode == BRW_OPCODE_ADD || combine_opcode == 
BRW_OPCODE_AVG);
 
-   insts.push_tail(new (mem_ctx) fs_inst(combine_opcode, dst, src_1, src_2));
+   insts.push_tail(new (mem_ctx) fs_inst(combine_opcode, 16, dst,
+ src_1, src_2));
 }
 
 fs_inst *
@@ -126,7 +127,7 @@ brw_blorp_eu_emitter::emit_cmp(enum brw_conditional_mod op,
const struct brw_reg &x,
const struct brw_reg &y)
 {
-   fs_inst *cmp = new (mem_ctx) fs_inst(BRW_OPCODE_CMP,
+   fs_inst *cmp = new (mem_ctx) fs_inst(BRW_OPCODE_CMP, 16,
 vec16(brw_null_reg()), x, y);
cmp->conditional_mod = op;
insts.push_tail(cmp);
-- 
2.4.3



[Mesa-dev] [PATCH 17/17] i965/fs: Remove the width field from fs_reg

2015-06-18 Thread Jason Ekstrand
As of now, the width field is no longer used for anything.  The width field
"seemed like a good idea at the time" but is actually entirely redundant
with the instruction's execution size.  Initially, it gave us the ability
to easily set the instruction's execution size based entirely on register
widths.  With the builder, we can easily set the sizes explicitly and the
width field doesn't have as much purpose.  At this point, it's just
redundant information that can get out of sync so it really needs to go.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 62 --
 src/mesa/drivers/dri/i965/brw_fs_builder.h | 21 ++--
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  4 --
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  6 +--
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |  4 +-
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  |  1 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 26 -
 src/mesa/drivers/dri/i965/brw_ir_fs.h  | 13 +
 8 files changed, 30 insertions(+), 107 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b30463a..3be81ca 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -203,10 +203,8 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
else
   op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD;
 
-   assert(dst.width % 8 == 0);
int regs_written = 4 * (bld.dispatch_width() / 8) * scale;
-   fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written),
-   dst.type, bld.dispatch_width());
+   fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written), dst.type);
fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset);
inst->regs_written = regs_written;
 
@@ -310,7 +308,6 @@ fs_inst::is_copy_payload(const brw::simple_allocator 
&grf_alloc) const
 
for (int i = 0; i < this->sources; i++) {
   reg.type = this->src[i].type;
-  reg.width = this->src[i].width;
   if (!this->src[i].equals(reg))
  return false;
 
@@ -366,7 +363,6 @@ fs_reg::fs_reg(float f)
this->file = IMM;
this->type = BRW_REGISTER_TYPE_F;
this->fixed_hw_reg.dw1.f = f;
-   this->width = 1;
 }
 
 /** Immediate value constructor. */
@@ -376,7 +372,6 @@ fs_reg::fs_reg(int32_t i)
this->file = IMM;
this->type = BRW_REGISTER_TYPE_D;
this->fixed_hw_reg.dw1.d = i;
-   this->width = 1;
 }
 
 /** Immediate value constructor. */
@@ -386,7 +381,6 @@ fs_reg::fs_reg(uint32_t u)
this->file = IMM;
this->type = BRW_REGISTER_TYPE_UD;
this->fixed_hw_reg.dw1.ud = u;
-   this->width = 1;
 }
 
 /** Vector float immediate value constructor. */
@@ -417,7 +411,6 @@ fs_reg::fs_reg(struct brw_reg fixed_hw_reg)
this->file = HW_REG;
this->fixed_hw_reg = fixed_hw_reg;
this->type = fixed_hw_reg.type;
-   this->width = 1 << fixed_hw_reg.width;
 }
 
 bool
@@ -432,7 +425,6 @@ fs_reg::equals(const fs_reg &r) const
abs == r.abs &&
!reladdr && !r.reladdr &&
memcmp(&fixed_hw_reg, &r.fixed_hw_reg, sizeof(fixed_hw_reg)) == 0 &&
-   width == r.width &&
stride == r.stride);
 }
 
@@ -504,7 +496,7 @@ fs_visitor::get_timestamp(const fs_builder &bld)
   0),
  BRW_REGISTER_TYPE_UD));
 
-   fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 4);
+   fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
 
/* We want to read the 3 fields we care about even if it's not enabled in
 * the dispatch.
@@ -587,7 +579,7 @@ fs_visitor::emit_shader_time_end()
 
fs_reg start = shader_start_time;
start.negate = true;
-   fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1);
+   fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
diff.set_smear(0);
 
const fs_builder cbld = ibld.group(1, 0);
@@ -846,7 +838,7 @@ fs_visitor::vgrf(const glsl_type *const type)
 {
int reg_width = dispatch_width / 8;
return fs_reg(GRF, alloc.allocate(type_size(type) * reg_width),
- brw_type_for_base_type(type), dispatch_width);
+ brw_type_for_base_type(type));
 }
 
 /** Fixed HW reg constructor. */
@@ -856,14 +848,6 @@ fs_reg::fs_reg(enum register_file file, int reg)
this->file = file;
this->reg = reg;
this->type = BRW_REGISTER_TYPE_F;
-
-   switch (file) {
-   case UNIFORM:
-  this->width = 1;
-  break;
-   default:
-  this->width = 8;
-   }
 }
 
 /** Fixed HW reg constructor. */
@@ -873,25 +857,6 @@ fs_reg::fs_reg(enum register_file file, int reg, enum 
brw_reg_type type)
this->file = file;
this->reg = reg;
this->type = type;
-
-   switch (file) {
-   case UNIFORM:
-  this->width = 1;
-  break;
-   default:
-  this->width = 8;
-   }
-}
-
-/** Fixed HW reg constructor. */
-fs_reg::fs_reg(enum registe

[Mesa-dev] [PATCH 16/17] i965/fs_generator: Use inst->exec_size for determining hardware reg widths

2015-06-18 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 8eb3ace..2b66acf 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -48,7 +48,7 @@ static uint32_t brw_file_from_reg(fs_reg *reg)
 }
 
 static struct brw_reg
-brw_reg_from_fs_reg(fs_reg *reg)
+brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg)
 {
struct brw_reg brw_reg;
 
@@ -57,10 +57,10 @@ brw_reg_from_fs_reg(fs_reg *reg)
case MRF:
   if (reg->stride == 0) {
  brw_reg = brw_vec1_reg(brw_file_from_reg(reg), reg->reg, 0);
-  } else if (reg->width < 8) {
+  } else if (inst->exec_size < 8) {
  brw_reg = brw_vec8_reg(brw_file_from_reg(reg), reg->reg, 0);
- brw_reg = stride(brw_reg, reg->width * reg->stride,
-  reg->width, reg->stride);
+ brw_reg = stride(brw_reg, inst->exec_size * reg->stride,
+  inst->exec_size, reg->stride);
   } else {
  /* From the Haswell PRM:
   *
@@ -413,7 +413,7 @@ fs_generator::generate_blorp_fb_write(fs_inst *inst)
brw_fb_WRITE(p,
 16 /* dispatch_width */,
 brw_message_reg(inst->base_mrf),
-brw_reg_from_fs_reg(&inst->src[0]),
+brw_reg_from_fs_reg(inst, &inst->src[0]),
 BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE,
 inst->target,
 inst->mlen,
@@ -1562,7 +1562,7 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
  annotate(p->devinfo, &annotation, cfg, inst, p->next_insn_offset);
 
   for (unsigned int i = 0; i < inst->sources; i++) {
-src[i] = brw_reg_from_fs_reg(&inst->src[i]);
+src[i] = brw_reg_from_fs_reg(inst, &inst->src[i]);
 
 /* The accumulator result appears to get used for the
  * conditional modifier generation.  When negating a UD
@@ -1574,7 +1574,7 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
inst->src[i].type != BRW_REGISTER_TYPE_UD ||
!inst->src[i].negate);
   }
-  dst = brw_reg_from_fs_reg(&inst->dst);
+  dst = brw_reg_from_fs_reg(inst, &inst->dst);
 
   brw_set_default_predicate_control(p, inst->predicate);
   brw_set_default_predicate_inverse(p, inst->predicate_inverse);
-- 
2.4.3



Re: [Mesa-dev] [PATCH 04/17] i965/fs: Explicitly set the exec_size on the add(32) in interpolation setup

2015-06-19 Thread Jason Ekstrand
On Jun 19, 2015 5:09 AM, "Iago Toral"  wrote:
>
> On Thu, 2015-06-18 at 17:50 -0700, Jason Ekstrand wrote:
> > Soon we will start using the builder to explicitly set all the execution
> > sizes.  We could make a 32-wide builder, but the builder asserts that we
> > never grow it which is usually a reasonable assumption.  Since this one
> > instruction is a bit of an odd-ball, we just set the exec_size explicitly.
>
> So if I understand it right, the only point of this change is making
> explicit that this instruction has a different execution size to ensure
> that we notice it when we rewrite the code to set explicit execution
> sizes with the new builder, right?

No, it's more that there is no good way to set it to SIMD32 with the
builder because changing dispatch width in the builder can only go down and
not up.  In retrospect, I should have explicitly created the fs_inst rather
than using the builder to emit it 16-wide and changing it later.
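
To make the "can only go down and not up" point concrete, here is a toy model of a builder whose group() may narrow the dispatch width but never widen it. ToyBuilder and can_group() are hypothetical and only approximate the behaviour described here; they are not the real fs_builder.

```cpp
#include <cassert>

// Toy stand-in for a builder that tracks a dispatch width and the first
// channel it covers. Hypothetical type, for illustration only.
class ToyBuilder {
public:
   explicit ToyBuilder(unsigned dispatch_width, unsigned first_channel = 0)
      : width_(dispatch_width), first_(first_channel) {}

   unsigned dispatch_width() const { return width_; }
   unsigned first_channel() const { return first_; }

   // True when group(n, i) would be accepted: the new width may not exceed
   // the current one, and the i-th group of n channels must fit inside it.
   bool can_group(unsigned n, unsigned i) const
   {
      return n != 0 && n <= width_ && i < width_ / n;
   }

   // Narrow the builder to the i-th group of n channels.
   ToyBuilder group(unsigned n, unsigned i) const
   {
      assert(can_group(n, i));
      return ToyBuilder(n, first_ + i * n);
   }

private:
   unsigned width_;
   unsigned first_;
};
```

Under this model a 16-wide builder can hand out 8-wide or 1-wide groups, but asking it for a 32-wide group trips the assertion, which is why the add(32) needs special handling.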

The reason this patch can stand on its own is that, at this point in
the series, the builder still uses the exec size guessing based on register
widths.

> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 9 +
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > index 4770838..b00825e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > @@ -1357,10 +1357,11 @@ fs_visitor::emit_interpolation_setup_gen6()
> > */
> >fs_reg int_pixel_xy(GRF, alloc.allocate(dispatch_width / 8),
> >BRW_REGISTER_TYPE_UW, dispatch_width * 2);
> > -  abld.exec_all()
> > -  .ADD(int_pixel_xy,
> > -   fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)),
> > -   fs_reg(brw_imm_v(0x11001010)));
> > +  fs_inst *add = abld.exec_all()
> > + .ADD(int_pixel_xy,
> > +  fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)),
> > +  fs_reg(brw_imm_v(0x11001010)));
> > +  add->exec_size = dispatch_width * 2;
> >
> >this->pixel_x = vgrf(glsl_type::float_type);
> >this->pixel_y = vgrf(glsl_type::float_type);
>
>


[Mesa-dev] [PATCH v2 04/17] i965/fs: Explicitly set the exec_size on the add(32) in interpolation setup

2015-06-19 Thread Jason Ekstrand
Soon we will start using the builder to explicitly set all the execution
sizes.  We could make a 32-wide builder, but the builder asserts that we
never grow it which is usually a reasonable assumption.  Since this one
instruction is a bit of an odd-ball, we just set the exec_size explicitly.

v2: Explicitly new the fs_inst instead of using the builder and setting
exec_size after the fact.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4770838..33464c5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1357,10 +1357,13 @@ fs_visitor::emit_interpolation_setup_gen6()
*/
   fs_reg int_pixel_xy(GRF, alloc.allocate(dispatch_width / 8),
   BRW_REGISTER_TYPE_UW, dispatch_width * 2);
-  abld.exec_all()
-  .ADD(int_pixel_xy,
-   fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)),
-   fs_reg(brw_imm_v(0x11001010)));
+  fs_inst *add =
+ new (mem_ctx) fs_inst(BRW_OPCODE_ADD, dispatch_width * 2,
+   int_pixel_xy,
+   fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)),
+   fs_reg(brw_imm_v(0x11001010)));
+  add->force_writemask_all = true;
+  abld.emit(add);
 
   this->pixel_x = vgrf(glsl_type::float_type);
   this->pixel_y = vgrf(glsl_type::float_type);
-- 
2.4.3



[Mesa-dev] [PATCH v2 02/18] i965/fs: Fix fs_inst::regs_read() for uniform pull constant loads

2015-06-19 Thread Jason Ekstrand
Previously, fs_inst::regs_read() fell back to depending on the register
width for the second source.  This isn't really correct since it isn't a
SIMD8 value at all, but a SIMD4x2 value.  This commit changes it to
explicitly be always one register.

Reviewed-by: Iago Toral Quiroga 

v2: Use mlen for determining the number of registers written
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 17a940b..6d25ba4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -758,6 +758,12 @@ fs_inst::regs_read(int arg) const
  return mlen;
   break;
 
+   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7:
+  /* The payload is actually stored in src1 */
+  if (arg == 1)
+ return mlen;
+  break;
+
case FS_OPCODE_LINTERP:
   if (arg == 0)
  return exec_size / 4;
-- 
2.4.3



[Mesa-dev] [PATCH 01.9/18] i965/fs: Actually set/use the mlen for gen7 uniform pull constant loads

2015-06-19 Thread Jason Ekstrand
Previously, we were allocating the payload with different sizes per gen and
then figuring out the mlen in the generator based on gen.  This meant,
among other things, that the higher level passes knew nothing about it.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 14 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  6 ++
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 37b6d0d..17a940b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2952,14 +2952,18 @@ fs_visitor::lower_uniform_pull_constant_loads()
  assert(const_offset_reg.file == IMM &&
 const_offset_reg.type == BRW_REGISTER_TYPE_UD);
  const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
- fs_reg payload = fs_reg(GRF, alloc.allocate(1));
 
- /* We have to use a message header on Skylake to get SIMD4x2 mode.
-  * Reserve space for the register.
-  */
+ fs_reg payload;
  if (devinfo->gen >= 9) {
+/* We have to use a message header on Skylake to get SIMD4x2
+ * mode.  Reserve space for the register.
+*/
+payload = fs_reg(GRF, alloc.allocate(2));
 payload.reg_offset++;
-alloc.sizes[payload.reg] = 2;
+inst->mlen = 2;
+ } else {
+payload = fs_reg(GRF, alloc.allocate(1));
+inst->mlen = 1;
  }
 
  /* This is actually going to be a MOV, but since only the first dword
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 8eb3ace..7a79b39 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1068,12 +1068,10 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
 
struct brw_reg src = offset;
bool header_present = false;
-   int mlen = 1;
 
if (devinfo->gen >= 9) {
   /* Skylake requires a message header in order to use SIMD4x2 mode. */
   src = retype(brw_vec4_grf(offset.nr - 1, 0), BRW_REGISTER_TYPE_UD);
-  mlen = 2;
   header_present = true;
 
   brw_push_insn_state(p);
@@ -1104,7 +1102,7 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
   0, /* LD message ignores sampler unit */
   GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
   1, /* rlen */
-  mlen,
+  inst->mlen,
   header_present,
   BRW_SAMPLER_SIMD_MODE_SIMD4X2,
   0);
@@ -1134,7 +1132,7 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
   0, /* LD message ignores sampler unit */
   GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
   1, /* rlen */
-  mlen,
+  inst->mlen,
   header_present,
   BRW_SAMPLER_SIMD_MODE_SIMD4X2,
   0);
-- 
2.4.3
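
To illustrate the change described in the commit message, here is a rough sketch in which the lowering pass decides the message length once and the generator only consumes it. The types and function names are hypothetical simplifications, not the driver's real interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced instruction record. */
struct pull_load {
   int payload_regs;  /* registers allocated for the payload */
   int mlen;          /* message length, decided by the lowering pass */
};

/* Lowering: allocate the payload and record mlen in the same place. */
static void
lower_pull_load(struct pull_load *inst, int gen)
{
   if (gen >= 9) {
      /* A message header is needed for SIMD4x2 mode, so two registers. */
      inst->payload_regs = 2;
      inst->mlen = 2;
   } else {
      inst->payload_regs = 1;
      inst->mlen = 1;
   }
}

struct message {
   int mlen;
   bool header_present;
};

/* Generation: no per-gen recomputation of mlen, just read it back. */
static struct message
generate_pull_load(const struct pull_load *inst, int gen)
{
   struct message msg = { inst->mlen, gen >= 9 };
   return msg;
}
```

Because mlen now lives on the instruction, any pass that runs between lowering and code generation can see how many registers the message really uses.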



Re: [Mesa-dev] [PATCH 05/14] meta: Abort meta pbo path if readpixels need signed-unsigned conversion

2015-06-19 Thread Jason Ekstrand
On Fri, Jun 19, 2015 at 1:40 PM, Anuj Phogat  wrote:
> On Tue, Jun 16, 2015 at 9:21 PM, Jason Ekstrand  wrote:
>>
>> On Jun 16, 2015 11:15, "Anuj Phogat"  wrote:
>>>
>>> Without this patch, piglit test fbo_integer_readpixels_sint_uint fails,
>>> when
>>> forced to use the meta pbo path.
>>>
>>> Signed-off-by: Anuj Phogat 
>>> Cc: 
>>> ---
>>>  src/mesa/drivers/common/meta_tex_subimage.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/common/meta_tex_subimage.c
>>> b/src/mesa/drivers/common/meta_tex_subimage.c
>>> index 00364f8..84cbc50 100644
>>> --- a/src/mesa/drivers/common/meta_tex_subimage.c
>>> +++ b/src/mesa/drivers/common/meta_tex_subimage.c
>>> @@ -283,6 +283,9 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx,
>>> GLuint dims,
>>>
>>>if (_mesa_need_rgb_to_luminance_conversion(rb->Format, format))
>>>   return false;
>>> +
>>> +  if (_mesa_need_signed_unsigned_int_conversion(rb->Format, format,
>>> type))
>>> + return false;
>>
>> Hrm... This seems fishy.  Isn't glBlitFramebuffer supposed to handle format
>> conversion with integers?  If so we should probably fix it rather than just
>> skip it for the meta pbo path.
>>
> As discussed offline, here is the relevant text for glBlitFramebuffer() from
> the OpenGL 4.5 spec, section 18.3.1:
> "An INVALID_OPERATION error is generated if format conversions are not
> supported, which occurs under any of the following conditions:
> - The read buffer contains fixed-point or floating-point values and any draw
>   buffer contains neither fixed-point nor floating-point values.
> - The read buffer contains unsigned integer values and any draw buffer does
>   not contain unsigned integer values.
> - The read buffer contains signed integer values and any draw buffer does
>   not contain signed integer values."
>
> I'll add a comment here explaining the reason to avoid meta path.

Thanks!  With that added,

Reviewed-by: Jason Ekstrand 

>>> }
>>>
>>> /* For arrays, use a tall (height * depth) 2D texture but taking into
>>> --
>>> 1.9.3
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
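
The three conditions quoted from the spec amount to "the read and draw buffers must be in the same value class". A small sketch of that rule follows; the enum and function names are made up for illustration and are not Mesa helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a color buffer's component values. */
enum buffer_class {
   CLASS_FIXED_OR_FLOAT,
   CLASS_UNSIGNED_INT,
   CLASS_SIGNED_INT
};

/* Returns false in exactly the cases where the quoted spec text says
 * INVALID_OPERATION is generated: the draw buffer is not in the same
 * class as the read buffer.
 */
static bool
blit_conversion_supported(enum buffer_class read_buf, enum buffer_class draw_buf)
{
   return read_buf == draw_buf;
}
```

A blit-based path therefore cannot be used for a signed to unsigned integer conversion, which is why the meta pbo path has to bail out in that case.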


Re: [Mesa-dev] [PATCH v2 02/18] i965/fs: Fix fs_inst::regs_read() for uniform pull constant loads

2015-06-19 Thread Jason Ekstrand
On Fri, Jun 19, 2015 at 1:51 PM, Matt Turner  wrote:
> On Fri, Jun 19, 2015 at 1:18 PM, Jason Ekstrand  wrote:
>> Previously, fs_inst::regs_read() fell back to depending on the register
>> width for the second source.  This isn't really correct since it isn't a
>> SIMD8 value at all, but a SIMD4x2 value.  This commit changes it to
>> explicitly be always one register.
>>
>> Reviewed-by: Iago Toral Quiroga 
>>
>> v2: Use mlen for determining the number of registers written
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 17a940b..6d25ba4 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -758,6 +758,12 @@ fs_inst::regs_read(int arg) const
>>   return mlen;
>>break;
>>
>> +   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7:
>> +  /* The payload is actually storred in src1 */
>
> stored just has one r
Thanks!  Fixed locally.
--Jason


[Mesa-dev] [PATCH 00/16] i965: Finish removing brw_context from the compiler

2015-06-22 Thread Jason Ekstrand
I started working on this project some time ago to remove brw_context from
the backend compiler.  I got a bunch of refactoring done but eventually got
stuck on shader_time and some debug logging stuff.  I've finally gotten
around to finishing it and here it is.

Jason Ekstrand (15):
  i965: Replace some instances of brw->gen with devinfo->gen
  i965: Plumb compiler debug logging through a function pointer in
brw_compiler
  i965: Remove the dependence on brw_context from the generators
  i965: Move INTEL_DEBUG variable parsing to screen creation time
  i965/fs: Make no16 non-variadic
  i965/fs: Do the no16 perf logging directly in fs_visitor::no16()
  i965/fs: Plumb compiler debug logging through brw_compiler
  i965: Add compiler options to brw_compiler
  i965: Use a single index per shader for shader_time.
  i965: Pull calls to get_shader_time_index out of the visitor
  i965/fs: Add a do_rep_send flag to run_fs
  i965/vs: Pass the current set of clip planes through run() and
run_vs()
  i965/vec4: Turn some _mesa_problem calls into asserts
  i965/vec4_vs: Add an explicit use_legacy_snorm_formula flag
  i965: Remove the brw_context from the visitors

Kenneth Graunke (1):
  mesa: Add a va_args variant of _mesa_gl_debug().

 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp|   3 +-
 src/mesa/drivers/dri/i965/brw_context.c|  54 ++---
 src/mesa/drivers/dri/i965/brw_context.h|  15 +--
 src/mesa/drivers/dri/i965/brw_cs.cpp   |  17 ++-
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 127 -
 src/mesa/drivers/dri/i965/brw_fs.h |  28 +++--
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  21 ++--
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   1 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  30 ++---
 src/mesa/drivers/dri/i965/brw_program.c|  67 ---
 src/mesa/drivers/dri/i965/brw_shader.cpp   | 100 +++-
 src/mesa/drivers/dri/i965/brw_shader.h |  13 ++-
 src/mesa/drivers/dri/i965/brw_vec4.cpp |  49 
 src/mesa/drivers/dri/i965/brw_vec4.h   |  23 ++--
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  22 ++--
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp  |  32 --
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h|   5 +-
 .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |   1 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  16 +--
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp  |   9 +-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp  |  16 +--
 src/mesa/drivers/dri/i965/brw_vs.h |   8 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h|   7 +-
 src/mesa/drivers/dri/i965/intel_debug.c|  13 +--
 src/mesa/drivers/dri/i965/intel_debug.h|   4 +-
 src/mesa/drivers/dri/i965/intel_screen.c   |   3 +
 src/mesa/main/errors.c |  29 +++--
 src/mesa/main/errors.h |   9 ++
 28 files changed, 379 insertions(+), 343 deletions(-)

-- 
2.4.3
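
For readers unfamiliar with the pattern behind "plumb compiler debug logging through a function pointer", here is a rough sketch of it: the compiler holds a callback, and each generator carries an opaque pointer that is handed back to that callback. All names below are hypothetical and only illustrate the shape, not the series' actual structs.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* The compiler knows only about a callback, not about the driver context. */
struct compiler {
   void (*debug_log)(void *log_data, const char *fmt, ...);
};

/* The generator carries an opaque pointer to hand back to the callback. */
struct generator {
   const struct compiler *compiler;
   void *log_data;
};

/* Driver side: the only place that knows what log_data really is. */
struct driver_context {
   char last_msg[64];
   int last_len;
   int count;
};

static void
driver_debug_log(void *log_data, const char *fmt, ...)
{
   struct driver_context *ctx = log_data;
   va_list args;

   va_start(args, fmt);
   ctx->last_len = vsnprintf(ctx->last_msg, sizeof(ctx->last_msg), fmt, args);
   va_end(args);
   ctx->count++;
}

/* Compiler side: logs without ever seeing struct driver_context. */
static void
generator_report(const struct generator *g, int inst_count)
{
   g->compiler->debug_log(g->log_data, "%d instructions", inst_count);
}
```

The design choice is that the compiler side compiles without the driver's context header; only the code that installs the callback needs it.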



[Mesa-dev] [PATCH 02/16] mesa: Add a va_args variant of _mesa_gl_debug().

2015-06-22 Thread Jason Ekstrand
From: Kenneth Graunke 

This will be useful for wrapper functions.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/errors.c | 29 +
 src/mesa/main/errors.h |  9 +
 2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 16f10dd..b340666 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -1413,6 +1413,26 @@ should_output(struct gl_context *ctx, GLenum error, const char *fmtString)
 
 
 void
+_mesa_gl_vdebug(struct gl_context *ctx,
+GLuint *id,
+enum mesa_debug_source source,
+enum mesa_debug_type type,
+enum mesa_debug_severity severity,
+const char *fmtString,
+va_list args)
+{
+   char s[MAX_DEBUG_MESSAGE_LENGTH];
+   int len;
+
+   debug_get_id(id);
+
+   len = _mesa_vsnprintf(s, MAX_DEBUG_MESSAGE_LENGTH, fmtString, args);
+
+   log_msg(ctx, source, type, *id, severity, len, s);
+}
+
+
+void
 _mesa_gl_debug(struct gl_context *ctx,
GLuint *id,
enum mesa_debug_source source,
@@ -1420,17 +1440,10 @@ _mesa_gl_debug(struct gl_context *ctx,
enum mesa_debug_severity severity,
const char *fmtString, ...)
 {
-   char s[MAX_DEBUG_MESSAGE_LENGTH];
-   int len;
va_list args;
-
-   debug_get_id(id);
-
va_start(args, fmtString);
-   len = _mesa_vsnprintf(s, MAX_DEBUG_MESSAGE_LENGTH, fmtString, args);
+   _mesa_gl_vdebug(ctx, id, source, type, severity, fmtString, args);
va_end(args);
-
-   log_msg(ctx, source, type, *id, severity, len, s);
 }
 
 
diff --git a/src/mesa/main/errors.h b/src/mesa/main/errors.h
index e6dc9b5..24f234f 100644
--- a/src/mesa/main/errors.h
+++ b/src/mesa/main/errors.h
@@ -76,6 +76,15 @@ extern FILE *
 _mesa_get_log_file(void);
 
 extern void
+_mesa_gl_vdebug(struct gl_context *ctx,
+GLuint *id,
+enum mesa_debug_source source,
+enum mesa_debug_type type,
+enum mesa_debug_severity severity,
+const char *fmtString,
+va_list args);
+
+extern void
 _mesa_gl_debug(struct gl_context *ctx,
GLuint *id,
enum mesa_debug_source source,
-- 
2.4.3
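
As a reminder of why a va_list variant helps wrapper functions: a function that takes "..." cannot pass those arguments on to another "..." function, but it can pass a va_list. The sketch below uses plain vsnprintf and made-up function names rather than the Mesa helpers touched by the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define MAX_MSG 64

/* The va_list variant does the real work. */
static int
vformat_msg(char *buf, const char *fmt, va_list args)
{
   return vsnprintf(buf, MAX_MSG, fmt, args);
}

/* The variadic entry point becomes a thin shim over the va_list variant. */
static int
format_msg(char *buf, const char *fmt, ...)
{
   va_list args;
   va_start(args, fmt);
   int len = vformat_msg(buf, fmt, args);
   va_end(args);
   return len;
}

/* A wrapper with its own "..." can forward only through the va_list variant. */
static int
format_warning(char *buf, int *warning_count, const char *fmt, ...)
{
   va_list args;
   (*warning_count)++;
   va_start(args, fmt);
   int len = vformat_msg(buf, fmt, args);
   va_end(args);
   return len;
}
```

Here format_warning() plays the role of a driver-side wrapper: it does its own bookkeeping and then reuses the shared formatting path.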



[Mesa-dev] [PATCH 04/16] i965: Remove the dependence on brw_context from the generators

2015-06-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp   | 2 +-
 src/mesa/drivers/dri/i965/brw_cs.cpp  | 2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 2 +-
 src/mesa/drivers/dri/i965/brw_fs.h| 4 +++-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 5 +++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 4 ++--
 src/mesa/drivers/dri/i965/brw_vec4.h  | 4 +++-
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 3 ++-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 2 +-
 9 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
index 9c04137..789520c 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
@@ -29,7 +29,7 @@
 brw_blorp_eu_emitter::brw_blorp_eu_emitter(struct brw_context *brw,
bool debug_flag)
: mem_ctx(ralloc_context(NULL)),
- generator(brw->intelScreen->compiler,
+ generator(brw->intelScreen->compiler, brw,
mem_ctx, (void *) rzalloc(mem_ctx, struct brw_wm_prog_key),
(struct brw_stage_prog_data *) rzalloc(mem_ctx, struct brw_wm_prog_data),
NULL, 0, false, "BLORP")
diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp b/src/mesa/drivers/dri/i965/brw_cs.cpp
index f93ca2f..0833404 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -128,7 +128,7 @@ brw_cs_emit(struct brw_context *brw,
   return NULL;
}
 
-   fs_generator g(brw->intelScreen->compiler,
+   fs_generator g(brw->intelScreen->compiler, brw,
   mem_ctx, (void*) key, &prog_data->base, &cp->Base,
   v8.promoted_constants, v8.runtime_check_aads_emit, "CS");
if (INTEL_DEBUG & DEBUG_CS) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2b892f0..615c2f1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4069,7 +4069,7 @@ brw_wm_fs_emit(struct brw_context *brw,
   prog_data->no_8 = false;
}
 
-   fs_generator g(brw->intelScreen->compiler,
+   fs_generator g(brw->intelScreen->compiler, brw,
   mem_ctx, (void *) key, &prog_data->base,
   &fp->Base, v.promoted_constants, v.runtime_check_aads_emit, 
"FS");
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 7414b65..1d52ff0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -398,7 +398,7 @@ public:
 class fs_generator
 {
 public:
-   fs_generator(const struct brw_compiler *compiler,
+   fs_generator(const struct brw_compiler *compiler, void *log_data,
 void *mem_ctx,
 const void *key,
 struct brw_stage_prog_data *prog_data,
@@ -494,6 +494,8 @@ private:
bool patch_discard_jumps_to_fb_writes();
 
const struct brw_compiler *compiler;
+   void *log_data; /* Passed to compiler->*_log functions */
+
const struct brw_device_info *devinfo;
 
struct brw_codegen *p;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index d98a40d..2ed0bac 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -121,7 +121,7 @@ brw_reg_from_fs_reg(fs_reg *reg)
return brw_reg;
 }
 
-fs_generator::fs_generator(const struct brw_compiler *compiler,
+fs_generator::fs_generator(const struct brw_compiler *compiler, void *log_data,
void *mem_ctx,
const void *key,
struct brw_stage_prog_data *prog_data,
@@ -130,7 +130,8 @@ fs_generator::fs_generator(const struct brw_compiler *compiler,
bool runtime_check_aads_emit,
const char *stage_abbrev)
 
-   : compiler(compiler), devinfo(compiler->devinfo), key(key),
+   : compiler(compiler), log_data(log_data),
+ devinfo(compiler->devinfo), key(key),
  prog_data(prog_data),
  prog(prog), promoted_constants(promoted_constants),
  runtime_check_aads_emit(runtime_check_aads_emit), debug_flag(false),
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 5e549c4..572bc17 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1910,7 +1910,7 @@ brw_vs_emit(struct brw_context *brw,
  return NULL;
   }
 
-  fs_generator g(brw->intelScreen->compiler,
+  fs_generator g(brw->intelScreen->compiler, brw,
  mem_ctx, (void *) &c->key, &prog_data->base.base,
  &c->vp->program.Base, v.promoted_constants,
  v.runtime_check_aads_emit, "VS");
@@ -1948,7 +1948,7 @@ brw_vs_emit(struct brw_

[Mesa-dev] [PATCH 10/16] i965: Use a single index per shader for shader_time.

2015-06-22 Thread Jason Ekstrand
Previously, each shader took 3 shader time indices which were potentially
at arbitrary points in the shader time buffer.  Now, each shader gets a
single index which refers to 3 consecutive locations in the buffer.  This
simplifies some of the logic at the cost of having a magic 3 in a few places.
---
 src/mesa/drivers/dri/i965/brw_context.h   | 14 +
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 28 --
 src/mesa/drivers/dri/i965/brw_fs.h|  3 +-
 src/mesa/drivers/dri/i965/brw_program.c   | 67 +++
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 18 +++---
 src/mesa/drivers/dri/i965/brw_vec4.h  | 10 +---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  3 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  8 +--
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  2 +-
 9 files changed, 53 insertions(+), 100 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index d8fcfff..a7d83f8 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -821,20 +821,10 @@ struct brw_tracked_state {
 enum shader_time_shader_type {
ST_NONE,
ST_VS,
-   ST_VS_WRITTEN,
-   ST_VS_RESET,
ST_GS,
-   ST_GS_WRITTEN,
-   ST_GS_RESET,
ST_FS8,
-   ST_FS8_WRITTEN,
-   ST_FS8_RESET,
ST_FS16,
-   ST_FS16_WRITTEN,
-   ST_FS16_RESET,
ST_CS,
-   ST_CS_WRITTEN,
-   ST_CS_RESET,
 };
 
 struct brw_vertex_buffer {
@@ -979,6 +969,8 @@ enum brw_predicate_state {
BRW_PREDICATE_STATE_USE_BIT
 };
 
+struct shader_times;
+
 /**
  * brw_context is derived from gl_context.
  */
@@ -1503,7 +1495,7 @@ struct brw_context
   const char **names;
   int *ids;
   enum shader_time_shader_type *types;
-  uint64_t *cumulative;
+  struct shader_times *cumulative;
   int num_entries;
   int max_entries;
   double report_time;
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 460120d..c1bfe86 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -578,38 +578,30 @@ fs_visitor::emit_shader_time_begin()
 void
 fs_visitor::emit_shader_time_end()
 {
-   enum shader_time_shader_type type, written_type, reset_type;
+   enum shader_time_shader_type type;
switch (stage) {
case MESA_SHADER_VERTEX:
   type = ST_VS;
-  written_type = ST_VS_WRITTEN;
-  reset_type = ST_VS_RESET;
   break;
case MESA_SHADER_GEOMETRY:
   type = ST_GS;
-  written_type = ST_GS_WRITTEN;
-  reset_type = ST_GS_RESET;
   break;
case MESA_SHADER_FRAGMENT:
   if (dispatch_width == 8) {
  type = ST_FS8;
- written_type = ST_FS8_WRITTEN;
- reset_type = ST_FS8_RESET;
   } else {
  assert(dispatch_width == 16);
  type = ST_FS16;
- written_type = ST_FS16_WRITTEN;
- reset_type = ST_FS16_RESET;
   }
   break;
case MESA_SHADER_COMPUTE:
   type = ST_CS;
-  written_type = ST_CS_WRITTEN;
-  reset_type = ST_CS_RESET;
   break;
default:
   unreachable("fs_visitor::emit_shader_time_end missing code");
}
+   int shader_time_index = brw_get_shader_time_index(brw, shader_prog, prog,
+ type);
 
/* Insert our code just before the final SEND with EOT. */
exec_node *end = this->instructions.get_tail();
@@ -639,20 +631,20 @@ fs_visitor::emit_shader_time_end()
 * trying to determine the time taken for single instructions.
 */
ibld.ADD(diff, diff, fs_reg(-2u));
-   SHADER_TIME_ADD(ibld, type, diff);
-   SHADER_TIME_ADD(ibld, written_type, fs_reg(1u));
+   SHADER_TIME_ADD(ibld, shader_time_index, 0, diff);
+   SHADER_TIME_ADD(ibld, shader_time_index, 1, fs_reg(1u));
ibld.emit(BRW_OPCODE_ELSE);
-   SHADER_TIME_ADD(ibld, reset_type, fs_reg(1u));
+   SHADER_TIME_ADD(ibld, shader_time_index, 2, fs_reg(1u));
ibld.emit(BRW_OPCODE_ENDIF);
 }
 
 void
 fs_visitor::SHADER_TIME_ADD(const fs_builder &bld,
-enum shader_time_shader_type type, fs_reg value)
+int shader_time_index, int shader_time_subindex,
+fs_reg value)
 {
-   int shader_time_index =
-  brw_get_shader_time_index(brw, shader_prog, prog, type);
-   fs_reg offset = fs_reg(shader_time_index * SHADER_TIME_STRIDE);
+   int index = shader_time_index * 3 + shader_time_subindex;
+   fs_reg offset = fs_reg(index * SHADER_TIME_STRIDE);
 
fs_reg payload;
if (dispatch_width == 8)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index cffedc0..55a9722 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -278,7 +278,8 @@ public:
void emit_shader_time_begin();
void emit_shader_time_end();
void SHADER_TIME_ADD(const brw::fs_builder &bld,
-
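
The indexing scheme from the commit message can be sketched as follows: one index per shader, three consecutive slots per index. The subindex names are invented for readability (the patch passes bare 0, 1 and 2), and the stride value is a placeholder assumption, since its definition does not appear in this patch.

```c
#include <assert.h>

/* Placeholder value; the real SHADER_TIME_STRIDE is defined elsewhere. */
#define SHADER_TIME_STRIDE 64

/* Invented names for the three per-shader slots. */
enum {
   ST_SUB_TIME    = 0,  /* accumulated time */
   ST_SUB_WRITTEN = 1,  /* count of valid measurements */
   ST_SUB_RESET   = 2   /* count of measurements discarded by a reset */
};

/* Buffer offset of one slot belonging to one shader. */
static int
shader_time_offset(int shader_time_index, int shader_time_subindex)
{
   int index = shader_time_index * 3 + shader_time_subindex;
   return index * SHADER_TIME_STRIDE;
}
```

The "magic 3" mentioned above is the multiplier in shader_time_offset(); it has to match the number of slots reserved per shader when the buffer is read back.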

[Mesa-dev] [PATCH 14/16] i965/vec4: Turn some _mesa_problem calls into asserts

2015-06-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
index 92d1085..dcbd240 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
@@ -381,8 +381,7 @@ vec4_vs_visitor::emit_program_code()
  break;
 
   default:
- _mesa_problem(ctx, "Unsupported opcode %s in vertex program\n",
-   _mesa_opcode_string(vpi->Opcode));
+ assert(!"Unsupported opcode in vertex program");
   }
 
   /* Copy the temporary back into the actual destination register. */
@@ -574,15 +573,13 @@ vec4_vs_visitor::get_vp_src_reg(const prog_src_register &src)
  break;
 
   default:
- _mesa_problem(ctx, "bad uniform src register file: %s\n",
-   _mesa_register_file_name((gl_register_file)src.File));
+ assert(!"Bad uniform in src register file");
  return src_reg(this, glsl_type::vec4_type);
   }
   break;
 
default:
-  _mesa_problem(ctx, "bad src register file: %s\n",
-_mesa_register_file_name((gl_register_file)src.File));
+  assert(!"Bad src register file");
   return src_reg(this, glsl_type::vec4_type);
}
 
-- 
2.4.3
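
A short note on the assert(!"...") idiom used in this patch, with a made-up example: a string literal is a non-null pointer, so negating it gives 0 and the assertion always fails when reached, with the message visible in the failure output. When assertions are compiled out (NDEBUG), execution falls through to the fallback return, which is why the patch keeps those returns.

```c
#include <assert.h>

/* Hypothetical register files, for illustration only. */
enum reg_file {
   FILE_TEMP,
   FILE_INPUT,
   FILE_UNKNOWN
};

static int
file_to_hw(enum reg_file file)
{
   switch (file) {
   case FILE_TEMP:
      return 1;
   case FILE_INPUT:
      return 2;
   default:
      /* Always fails in debug builds; the text shows up in the message. */
      assert(!"Bad src register file");
      /* Reached only when assertions are compiled out. */
      return 0;
   }
}
```

Unlike a runtime problem report, this needs no context pointer, which fits the goal of dropping brw_context from the visitors.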



[Mesa-dev] [PATCH 11/16] i965: Pull calls to get_shader_time_index out of the visitor

2015-06-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_cs.cpp  |  8 +++-
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 55 ---
 src/mesa/drivers/dri/i965/brw_fs.h|  7 ++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |  7 ++-
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 25 ++-
 src/mesa/drivers/dri/i965/brw_vec4.h  |  7 ++-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 18 +---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |  3 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  4 +-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  5 ++-
 src/mesa/drivers/dri/i965/brw_vs.h|  3 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h   |  5 ++-
 12 files changed, 75 insertions(+), 72 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp b/src/mesa/drivers/dri/i965/brw_cs.cpp
index 0833404..fa8b5c8 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -88,10 +88,14 @@ brw_cs_emit(struct brw_context *brw,
cfg_t *cfg = NULL;
const char *fail_msg = NULL;
 
+   int st_index = -1;
+   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+  st_index = brw_get_shader_time_index(brw, prog, &cp->Base, ST_CS);
+
/* Now the main event: Visit the shader IR and generate our CS IR for it.
 */
fs_visitor v8(brw, mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, prog,
- &cp->Base, 8);
+ &cp->Base, 8, st_index);
if (!v8.run_cs()) {
   fail_msg = v8.fail_msg;
} else if (local_workgroup_size <= 8 * brw->max_cs_threads) {
@@ -100,7 +104,7 @@ brw_cs_emit(struct brw_context *brw,
}
 
fs_visitor v16(brw, mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, prog,
-  &cp->Base, 16);
+  &cp->Base, 16, st_index);
if (likely(!(INTEL_DEBUG & DEBUG_NO16)) &&
!fail_msg && !v8.simd16_unsupported &&
local_workgroup_size <= 16 * brw->max_cs_threads) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c1bfe86..252196a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -578,31 +578,6 @@ fs_visitor::emit_shader_time_begin()
 void
 fs_visitor::emit_shader_time_end()
 {
-   enum shader_time_shader_type type;
-   switch (stage) {
-   case MESA_SHADER_VERTEX:
-  type = ST_VS;
-  break;
-   case MESA_SHADER_GEOMETRY:
-  type = ST_GS;
-  break;
-   case MESA_SHADER_FRAGMENT:
-  if (dispatch_width == 8) {
- type = ST_FS8;
-  } else {
- assert(dispatch_width == 16);
- type = ST_FS16;
-  }
-  break;
-   case MESA_SHADER_COMPUTE:
-  type = ST_CS;
-  break;
-   default:
-  unreachable("fs_visitor::emit_shader_time_end missing code");
-   }
-   int shader_time_index = brw_get_shader_time_index(brw, shader_prog, prog,
- type);
-
/* Insert our code just before the final SEND with EOT. */
exec_node *end = this->instructions.get_tail();
assert(end && ((fs_inst *) end)->eot);
@@ -631,16 +606,16 @@ fs_visitor::emit_shader_time_end()
 * trying to determine the time taken for single instructions.
 */
ibld.ADD(diff, diff, fs_reg(-2u));
-   SHADER_TIME_ADD(ibld, shader_time_index, 0, diff);
-   SHADER_TIME_ADD(ibld, shader_time_index, 1, fs_reg(1u));
+   SHADER_TIME_ADD(ibld, 0, diff);
+   SHADER_TIME_ADD(ibld, 1, fs_reg(1u));
ibld.emit(BRW_OPCODE_ELSE);
-   SHADER_TIME_ADD(ibld, shader_time_index, 2, fs_reg(1u));
+   SHADER_TIME_ADD(ibld, 2, fs_reg(1u));
ibld.emit(BRW_OPCODE_ENDIF);
 }
 
 void
 fs_visitor::SHADER_TIME_ADD(const fs_builder &bld,
-int shader_time_index, int shader_time_subindex,
+int shader_time_subindex,
 fs_reg value)
 {
int index = shader_time_index * 3 + shader_time_subindex;
@@ -3823,7 +3798,7 @@ fs_visitor::run_vs()
assign_common_binding_table_offsets(0);
setup_vs_payload();
 
-   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+   if (shader_time_index >= 0)
   emit_shader_time_begin();
 
emit_nir_code();
@@ -3833,7 +3808,7 @@ fs_visitor::run_vs()
 
emit_urb_writes();
 
-   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+   if (shader_time_index >= 0)
   emit_shader_time_end();
 
calculate_cfg();
@@ -3871,7 +3846,7 @@ fs_visitor::run_fs()
} else if (brw->use_rep_send && dispatch_width == 16) {
   emit_repclear_shader();
} else {
-  if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+  if (shader_time_index >= 0)
  emit_shader_time_begin();
 
   calculate_urb_setup();
@@ -3906,7 +3881,7 @@ fs_visitor::run_fs()
 
   emit_fb_writes();
 
-  if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+  if (shader_time_index >= 0)
  emit_shader_time_end();
 
   calculate_cfg();
@@ -3950,7 +3925,7 @@ fs_visitor::run_cs()
 
setu

[Mesa-dev] [PATCH 01/16] i965: Replace some instances of brw->gen with devinfo->gen

2015-06-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5563c5a..ac65202 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3187,7 +3187,7 @@ fs_visitor::lower_integer_multiplication()
  fs_reg high(GRF, alloc.allocate(dispatch_width / 8),
  inst->dst.type, dispatch_width);
 
- if (brw->gen >= 7) {
+ if (devinfo->gen >= 7) {
 fs_reg src1_0_w = inst->src[1];
 fs_reg src1_1_w = inst->src[1];
 
@@ -3616,7 +3616,7 @@ fs_visitor::setup_vs_payload()
 void
 fs_visitor::setup_cs_payload()
 {
-   assert(brw->gen >= 7);
+   assert(devinfo->gen >= 7);
 
payload.num_regs = 1;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4770838..cafe64a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1344,7 +1344,7 @@ fs_visitor::emit_interpolation_setup_gen6()
struct brw_reg g1_uw = retype(brw_vec1_grf(1, 0), BRW_REGISTER_TYPE_UW);
 
fs_builder abld = bld.annotate("compute pixel centers");
-   if (brw->gen >= 8 || dispatch_width == 8) {
+   if (devinfo->gen >= 8 || dispatch_width == 8) {
   /* The "Register Region Restrictions" page says for BDW (and newer,
* presumably):
*
@@ -1623,7 +1623,7 @@ fs_visitor::emit_single_fb_write(const fs_builder &bld,
   /* On pre-SNB, we have to interlace the color values.  LOAD_PAYLOAD
* will do this for us if we just give it a COMPR4 destination.
*/
-  if (brw->gen < 6 && exec_size == 16)
+  if (devinfo->gen < 6 && exec_size == 16)
  load->dst.reg |= BRW_MRF_COMPR4;
 
   write = ubld.emit(FS_OPCODE_FB_WRITE);
@@ -1934,7 +1934,7 @@ fs_visitor::emit_urb_writes()
 void
 fs_visitor::emit_cs_terminate()
 {
-   assert(brw->gen >= 7);
+   assert(devinfo->gen >= 7);
 
/* We are getting the thread ID from the compute shader header */
assert(stage == MESA_SHADER_COMPUTE);
@@ -1956,7 +1956,7 @@ fs_visitor::emit_cs_terminate()
 void
 fs_visitor::emit_barrier()
 {
-   assert(brw->gen >= 7);
+   assert(devinfo->gen >= 7);
 
/* We are getting the barrier ID from the compute shader header */
assert(stage == MESA_SHADER_COMPUTE);
-- 
2.4.3



[Mesa-dev] [PATCH 13/16] i965/vs: Pass the current set of clip planes through run() and run_vs()

2015-06-22 Thread Jason Ekstrand
Previously, these were pulled out of the GL context conditionally based on
whether we were running ff/ARB or a GLSL program.  Now, we just pass them
in so that the visitor doesn't have to grab them itself.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  4 ++--
 src/mesa/drivers/dri/i965/brw_fs.h|  8 
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 11 +--
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  8 
 src/mesa/drivers/dri/i965/brw_vec4.h  |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  4 +---
 7 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index bf04e26..23f60c2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3791,7 +3791,7 @@ fs_visitor::allocate_registers()
 }
 
 bool
-fs_visitor::run_vs()
+fs_visitor::run_vs(gl_clip_plane *clip_planes)
 {
assert(stage == MESA_SHADER_VERTEX);
 
@@ -3806,7 +3806,7 @@ fs_visitor::run_vs()
if (failed)
   return false;
 
-   emit_urb_writes();
+   emit_urb_writes(clip_planes);
 
if (shader_time_index >= 0)
   emit_shader_time_end();
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 4db5a91..e0a8984 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -84,8 +84,8 @@ public:
 
fs_reg vgrf(const glsl_type *const type);
void import_uniforms(fs_visitor *v);
-   void setup_uniform_clipplane_values();
-   void compute_clip_distance();
+   void setup_uniform_clipplane_values(gl_clip_plane *clip_planes);
+   void compute_clip_distance(gl_clip_plane *clip_planes);
 
uint32_t gather_channel(int orig_chan, uint32_t sampler);
void swizzle_result(ir_texture_opcode op, int dest_components,
@@ -104,7 +104,7 @@ public:
void DEP_RESOLVE_MOV(const brw::fs_builder &bld, int grf);
 
bool run_fs(bool do_rep_send);
-   bool run_vs();
+   bool run_vs(gl_clip_plane *clip_planes);
bool run_cs();
void optimize();
void allocate_registers();
@@ -271,7 +271,7 @@ public:
  fs_reg src0_alpha, unsigned components,
  unsigned exec_size, bool use_2nd_half = false);
void emit_fb_writes();
-   void emit_urb_writes();
+   void emit_urb_writes(gl_clip_plane *clip_planes);
void emit_cs_terminate();
 
void emit_barrier();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9ce8491..395394c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1715,9 +1715,8 @@ fs_visitor::emit_fb_writes()
 }
 
 void
-fs_visitor::setup_uniform_clipplane_values()
+fs_visitor::setup_uniform_clipplane_values(gl_clip_plane *clip_planes)
 {
-   gl_clip_plane *clip_planes = brw_select_clip_planes(ctx);
const struct brw_vue_prog_key *key =
   (const struct brw_vue_prog_key *) this->key;
 
@@ -1731,7 +1730,7 @@ fs_visitor::setup_uniform_clipplane_values()
}
 }
 
-void fs_visitor::compute_clip_distance()
+void fs_visitor::compute_clip_distance(gl_clip_plane *clip_planes)
 {
struct brw_vue_prog_data *vue_prog_data =
   (struct brw_vue_prog_data *) prog_data;
@@ -1760,7 +1759,7 @@ void fs_visitor::compute_clip_distance()
if (outputs[clip_vertex].file == BAD_FILE)
   return;
 
-   setup_uniform_clipplane_values();
+   setup_uniform_clipplane_values(clip_planes);
 
const fs_builder abld = bld.annotate("user clip distances");
 
@@ -1781,7 +1780,7 @@ void fs_visitor::compute_clip_distance()
 }
 
 void
-fs_visitor::emit_urb_writes()
+fs_visitor::emit_urb_writes(gl_clip_plane *clip_planes)
 {
int slot, urb_offset, length;
struct brw_vs_prog_data *vs_prog_data =
@@ -1796,7 +1795,7 @@ fs_visitor::emit_urb_writes()
 
/* Lower legacy ff and ClipVertex clipping to clip distances */
if (key->base.userclip_active && !prog->UsesClipDistanceOut)
-  compute_clip_distance();
+  compute_clip_distance(clip_planes);
 
/* If we don't have any valid slots to write, just do a minimal urb write
 * send to terminate the shader. */
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 093802c..9c45034 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1706,7 +1706,7 @@ vec4_visitor::emit_shader_time_write(int shader_time_subindex, src_reg value)
 }
 
 bool
-vec4_visitor::run()
+vec4_visitor::run(gl_clip_plane *clip_planes)
 {
sanity_param_count = prog->Parameters->NumParameters;
 
@@ -1728,7 +1728,7 @@ vec4_visitor::run()
base_ir = NULL;
 
if (key->userclip_active && !prog->UsesClipDistanceOut)
-  setup_uniform_clipplane_values();
+  setup_uniform_clipplane_values(clip_pla

[Mesa-dev] [PATCH 09/16] i965: Add compiler options to brw_compiler

2015-06-22 Thread Jason Ekstrand
This creates the options at screen creation time and then we just copy them
into the context at context creation time.  We also move is_scalar to the
brw_compiler structure.

We also end up manually setting some values that the core would have set by
default for us.  Fortunately, there are only two non-zero shader compiler
option defaults that we aren't overriding anyway so this isn't a big deal.
---
 src/mesa/drivers/dri/i965/brw_context.c  | 46 ++
 src/mesa/drivers/dri/i965/brw_context.h  |  1 -
 src/mesa/drivers/dri/i965/brw_shader.cpp | 49 +++-
 src/mesa/drivers/dri/i965/brw_shader.h   |  3 ++
 src/mesa/drivers/dri/i965/brw_vec4.cpp   |  2 +-
 src/mesa/drivers/dri/i965/intel_screen.c |  1 +
 6 files changed, 56 insertions(+), 46 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 327a668..33cdbd2 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -50,6 +50,7 @@
 
 #include "brw_context.h"
 #include "brw_defines.h"
+#include "brw_shader.h"
 #include "brw_draw.h"
 #include "brw_state.h"
 
@@ -68,8 +69,6 @@
 #include "tnl/t_pipeline.h"
 #include "util/ralloc.h"
 
-#include "glsl/nir/nir.h"
-
 /***
  * Mesa's Driver Functions
  ***/
@@ -558,48 +557,12 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 128;
}
 
-   static const nir_shader_compiler_options nir_options = {
-  .native_integers = true,
-  /* In order to help allow for better CSE at the NIR level we tell NIR
-   * to split all ffma instructions during opt_algebraic and we then
-   * re-combine them as a later step.
-   */
-  .lower_ffma = true,
-  .lower_sub = true,
-   };
-
/* We want the GLSL compiler to emit code that uses condition codes */
for (int i = 0; i < MESA_SHADER_STAGES; i++) {
-  ctx->Const.ShaderCompilerOptions[i].MaxIfDepth = brw->gen < 6 ? 16 : UINT_MAX;
-  ctx->Const.ShaderCompilerOptions[i].EmitCondCodes = true;
-  ctx->Const.ShaderCompilerOptions[i].EmitNoNoise = true;
-  ctx->Const.ShaderCompilerOptions[i].EmitNoMainReturn = true;
-  ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectInput = true;
-  ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectOutput =
-(i == MESA_SHADER_FRAGMENT);
-  ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectTemp =
-(i == MESA_SHADER_FRAGMENT);
-  ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectUniform = false;
-  ctx->Const.ShaderCompilerOptions[i].LowerClipDistance = true;
+  ctx->Const.ShaderCompilerOptions[i] =
+ brw->intelScreen->compiler->glsl_compiler_options[i];
}
 
-   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = true;
-   ctx->Const.ShaderCompilerOptions[MESA_SHADER_GEOMETRY].OptimizeForAOS = true;
-
-   if (brw->scalar_vs) {
-  /* If we're using the scalar backend for vertex shaders, we need to
-   * configure these accordingly.
-   */
-  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectOutput = true;
-  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
-  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
-
-  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = &nir_options;
-   }
-
-   ctx->Const.ShaderCompilerOptions[MESA_SHADER_FRAGMENT].NirOptions = &nir_options;
-   ctx->Const.ShaderCompilerOptions[MESA_SHADER_COMPUTE].NirOptions = &nir_options;
-
/* ARB_viewport_array */
if (brw->gen >= 6 && ctx->API == API_OPENGL_CORE) {
   ctx->Const.MaxViewports = GEN6_NUM_VIEWPORTS;
@@ -832,9 +795,6 @@ brwCreateContext(gl_api api,
if (INTEL_DEBUG & DEBUG_AUB)
   drm_intel_bufmgr_gem_set_aub_dump(brw->bufmgr, true);
 
-   if (brw->gen >= 8 && !(INTEL_DEBUG & DEBUG_VEC4VS))
-  brw->scalar_vs = true;
-
brw_initialize_context_constants(brw);
 
ctx->Const.ResetStrategy = notify_reset
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 58119ee..d8fcfff 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1137,7 +1137,6 @@ struct brw_context
bool has_pln;
bool no_simd8;
bool use_rep_send;
-   bool scalar_vs;
 
/**
 * Some versions of Gen hardware don't do centroid interpolation correctly
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 3ac5ef1..683946b 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -84,6 +84,53 @@ brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
brw_fs_alloc_reg_sets(compiler);
brw_vec4_alloc_reg_set(compiler);
 
+   

[Mesa-dev] [PATCH 07/16] i965/fs: Do the no16 perf logging directly in fs_visitor::no16()

2015-06-22 Thread Jason Ekstrand
While we're at it, we'll drop the note about 10-20% performance loss.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a9d9f37..40e2c44 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -710,12 +710,7 @@ fs_visitor::no16(const char *msg)
} else {
   simd16_unsupported = true;
 
-  if (brw->perf_debug) {
- if (no16_msg)
-ralloc_strcat(&no16_msg, msg);
- else
-no16_msg = ralloc_strdup(mem_ctx, msg);
-  }
+  perf_debug("SIMD16 shader failed to compile: %s", msg);
}
 }
 
@@ -4042,14 +4037,10 @@ brw_wm_fs_emit(struct brw_context *brw,
  /* Try a SIMD16 compile */
  v2.import_uniforms(&v);
  if (!v2.run_fs()) {
-perf_debug("SIMD16 shader failed to compile, falling back to "
-   "SIMD8 at a 10-20%% performance cost: %s", v2.fail_msg);
+perf_debug("SIMD16 shader failed to compile: %s", v2.fail_msg);
  } else {
 simd16_cfg = v2.cfg;
  }
-  } else {
- perf_debug("SIMD16 shader unsupported, falling back to "
-"SIMD8 at a 10-20%% performance cost: %s", v.no16_msg);
   }
}
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 15/16] i965/vec4_vs: Add an explicit use_legacy_snorm_formula flag

2015-06-22 Thread Jason Ekstrand
This way we can stop doing is_gles3 checks inside of the compiler.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 4 +++-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 9 +
 src/mesa/drivers/dri/i965/brw_vs.h| 5 -
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 9c45034..f51aa1a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -35,6 +35,7 @@ extern "C" {
 #include "program/prog_print.h"
 #include "program/prog_parameter.h"
 }
+#include "main/context.h"
 
 #define MAX_INSTRUCTION (1 << 30)
 
@@ -1938,7 +1939,8 @@ brw_vs_emit(struct brw_context *brw,
if (!assembly) {
   prog_data->base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT;
 
-  vec4_vs_visitor v(brw, c, prog_data, prog, mem_ctx, st_index);
+  vec4_vs_visitor v(brw, c, prog_data, prog, mem_ctx, st_index,
+!_mesa_is_gles3(&brw->ctx));
   if (!v.run(brw_select_clip_planes(&brw->ctx))) {
  if (prog) {
 prog->LinkStatus = false;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
index dc17755..26e3057 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
@@ -23,7 +23,6 @@
 
 
 #include "brw_vs.h"
-#include "main/context.h"
 
 
 namespace brw {
@@ -78,7 +77,7 @@ vec4_vs_visitor::emit_prolog()
 /* ES 3.0 has different rules for converting signed normalized
  * fixed-point numbers than desktop GL.
  */
-if (_mesa_is_gles3(ctx) && (wa_flags & BRW_ATTRIB_WA_SIGN)) {
+if ((wa_flags & BRW_ATTRIB_WA_SIGN) && !use_legacy_snorm_formula) {
/* According to equation 2.2 of the ES 3.0 specification,
 * signed normalization conversion is done by:
 *
@@ -217,14 +216,16 @@ vec4_vs_visitor::vec4_vs_visitor(struct brw_context *brw,
  struct brw_vs_prog_data *vs_prog_data,
  struct gl_shader_program *prog,
  void *mem_ctx,
- int shader_time_index)
+ int shader_time_index,
+ bool use_legacy_snorm_formula)
: vec4_visitor(brw, &vs_compile->base, &vs_compile->vp->program.Base,
   &vs_compile->key.base, &vs_prog_data->base, prog,
   MESA_SHADER_VERTEX,
   mem_ctx, false /* no_spills */,
   shader_time_index),
  vs_compile(vs_compile),
- vs_prog_data(vs_prog_data)
+ vs_prog_data(vs_prog_data),
+ use_legacy_snorm_formula(use_legacy_snorm_formula)
 {
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_vs.h 
b/src/mesa/drivers/dri/i965/brw_vs.h
index 6f84179..0511ab5 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.h
+++ b/src/mesa/drivers/dri/i965/brw_vs.h
@@ -95,7 +95,8 @@ public:
struct brw_vs_prog_data *vs_prog_data,
struct gl_shader_program *prog,
void *mem_ctx,
-   int shader_time_index);
+   int shader_time_index,
+   bool use_legacy_snorm_formula);
 
 protected:
virtual dst_reg *make_reg_for_system_value(ir_variable *ir);
@@ -116,6 +117,8 @@ private:
struct brw_vs_prog_data * const vs_prog_data;
src_reg *vp_temp_regs;
src_reg vp_addr_reg;
+
+   bool use_legacy_snorm_formula;
 };
 
 } /* namespace brw */
-- 
2.4.3
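
For readers who do not have the two conversion rules in their head, here is a minimal standalone sketch of what the flag selects between. The function names are hypothetical and are not part of the patch. The first formula is the one the ES 3.0 comment in emit_prolog() refers to; the second is, as far as I recall, the rule from older desktop GL specifications, so treat it as an assumption rather than a quote from the spec.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// ES 3.0 style rule: f = max(c / (2^(b-1) - 1), -1.0)
// (hypothetical helper, not a Mesa function)
static double
snorm_to_float_es3(int32_t c, unsigned bits)
{
   const double max_val = double((int64_t(1) << (bits - 1)) - 1);
   return std::max(double(c) / max_val, -1.0);
}

// Assumed legacy desktop GL rule: f = (2c + 1) / (2^b - 1)
// (hypothetical helper, not a Mesa function)
static double
snorm_to_float_legacy(int32_t c, unsigned bits)
{
   const double denom = double((int64_t(1) << bits) - 1);
   return (2.0 * double(c) + 1.0) / denom;
}
```

Under these assumptions both rules map the 8-bit extremes to -1.0 and 1.0, but only the ES 3.0 rule maps 0 exactly to 0.0, which is why the compiler needs to know which one to emit.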



[Mesa-dev] [PATCH 03/16] i965: Plumb compiler debug logging through a function pointer in brw_compiler

2015-06-22 Thread Jason Ekstrand
v2 (Ken): Make shader_debug_log a printf-like function.
v3 (Jason): Add a void * to pass the brw_context through
---
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp   |  3 ++-
 src/mesa/drivers/dri/i965/brw_cs.cpp  |  3 ++-
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  3 ++-
 src/mesa/drivers/dri/i965/brw_fs.h|  4 ++--
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 20 +---
 src/mesa/drivers/dri/i965/brw_shader.cpp  | 16 
 src/mesa/drivers/dri/i965/brw_shader.h|  2 ++
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  6 --
 src/mesa/drivers/dri/i965/brw_vec4.h  |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 21 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  3 ++-
 11 files changed, 51 insertions(+), 34 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
index c1b7609..9c04137 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
@@ -29,7 +29,8 @@
 brw_blorp_eu_emitter::brw_blorp_eu_emitter(struct brw_context *brw,
bool debug_flag)
: mem_ctx(ralloc_context(NULL)),
- generator(brw, mem_ctx, (void *) rzalloc(mem_ctx, struct brw_wm_prog_key),
+ generator(brw->intelScreen->compiler,
+   mem_ctx, (void *) rzalloc(mem_ctx, struct brw_wm_prog_key),
(struct brw_stage_prog_data *) rzalloc(mem_ctx, struct brw_wm_prog_data),
NULL, 0, false, "BLORP")
 {
diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp 
b/src/mesa/drivers/dri/i965/brw_cs.cpp
index 1f2a9d2..f93ca2f 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -128,7 +128,8 @@ brw_cs_emit(struct brw_context *brw,
   return NULL;
}
 
-   fs_generator g(brw, mem_ctx, (void*) key, &prog_data->base, &cp->Base,
+   fs_generator g(brw->intelScreen->compiler,
+  mem_ctx, (void*) key, &prog_data->base, &cp->Base,
   v8.promoted_constants, v8.runtime_check_aads_emit, "CS");
if (INTEL_DEBUG & DEBUG_CS) {
   char *name = ralloc_asprintf(mem_ctx, "%s compute shader %d",
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ac65202..2b892f0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4069,7 +4069,8 @@ brw_wm_fs_emit(struct brw_context *brw,
   prog_data->no_8 = false;
}
 
-   fs_generator g(brw, mem_ctx, (void *) key, &prog_data->base,
+   fs_generator g(brw->intelScreen->compiler,
+  mem_ctx, (void *) key, &prog_data->base,
   &fp->Base, v.promoted_constants, v.runtime_check_aads_emit, "FS");
 
if (unlikely(INTEL_DEBUG & DEBUG_WM)) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index cdeea6d..7414b65 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -398,7 +398,7 @@ public:
 class fs_generator
 {
 public:
-   fs_generator(struct brw_context *brw,
+   fs_generator(const struct brw_compiler *compiler,
 void *mem_ctx,
 const void *key,
 struct brw_stage_prog_data *prog_data,
@@ -493,7 +493,7 @@ private:
 
bool patch_discard_jumps_to_fb_writes();
 
-   struct brw_context *brw;
+   const struct brw_compiler *compiler;
const struct brw_device_info *devinfo;
 
struct brw_codegen *p;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 8eb3ace..d98a40d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -121,7 +121,7 @@ brw_reg_from_fs_reg(fs_reg *reg)
return brw_reg;
 }
 
-fs_generator::fs_generator(struct brw_context *brw,
+fs_generator::fs_generator(const struct brw_compiler *compiler,
void *mem_ctx,
const void *key,
struct brw_stage_prog_data *prog_data,
@@ -130,7 +130,7 @@ fs_generator::fs_generator(struct brw_context *brw,
bool runtime_check_aads_emit,
const char *stage_abbrev)
 
-   : brw(brw), devinfo(brw->intelScreen->devinfo), key(key),
+   : compiler(compiler), devinfo(compiler->devinfo), key(key),
  prog_data(prog_data),
  prog(prog), promoted_constants(promoted_constants),
  runtime_check_aads_emit(runtime_check_aads_emit), debug_flag(false),
@@ -2173,15 +2173,13 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
   ralloc_free(annotation.ann);
}
 
-   static GLuint msg_id = 0;
-   _mesa_gl_debug(&brw->ctx, &msg_id,
-  MESA_DEBUG_SOURCE_SHADER_COMPILER,
-  MESA_DEBUG_TYPE_OTHER,
-  

[Mesa-dev] [PATCH 08/16] i965/fs: Plumb compiler debug logging through brw_compiler

2015-06-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 13 +
 src/mesa/drivers/dri/i965/brw_shader.cpp | 26 ++
 src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
 3 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 40e2c44..460120d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -710,7 +710,9 @@ fs_visitor::no16(const char *msg)
} else {
   simd16_unsupported = true;
 
-  perf_debug("SIMD16 shader failed to compile: %s", msg);
+  struct brw_compiler *compiler = brw->intelScreen->compiler;
+  compiler->shader_perf_log(brw,
+"SIMD16 shader failed to compile: %s", msg);
}
 }
 
@@ -3788,9 +3790,12 @@ fs_visitor::allocate_registers()
  fail("Failure to register allocate.  Reduce number of "
   "live scalar values to avoid this.");
   } else {
- perf_debug("%s shader triggered register spilling.  "
-"Try reducing the number of live scalar values to "
-"improve performance.\n", stage_name);
+ struct brw_compiler *compiler = brw->intelScreen->compiler;
+ compiler->shader_perf_log(brw,
+   "%s shader triggered register spilling.  "
+   "Try reducing the number of live scalar "
+   "values to improve performance.\n",
+   stage_name);
   }
 
   /* Since we're out of heuristics, just go spill registers until we
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 18a6470..3ac5ef1 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -47,6 +47,31 @@ shader_debug_log_mesa(void *data, const char *fmt, ...)
va_end(args);
 }
 
+static void
+shader_perf_log_mesa(void *data, const char *fmt, ...)
+{
+   struct brw_context *brw = (struct brw_context *)data;
+
+   va_list args;
+   va_start(args, fmt);
+
+   if (unlikely(INTEL_DEBUG & DEBUG_PERF)) {
+  va_list args_copy;
+  va_copy(args_copy, args);
+  vfprintf(stderr, fmt, args_copy);
+  va_end(args_copy);
+   }
+
+   if (brw->perf_debug) {
+  GLuint msg_id = 0;
+  _mesa_gl_vdebug(&brw->ctx, &msg_id,
+  MESA_DEBUG_SOURCE_SHADER_COMPILER,
+  MESA_DEBUG_TYPE_PERFORMANCE,
+  MESA_DEBUG_SEVERITY_MEDIUM, fmt, args);
+   }
+   va_end(args);
+}
+
 struct brw_compiler *
 brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
 {
@@ -54,6 +79,7 @@ brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
 
compiler->devinfo = devinfo;
compiler->shader_debug_log = shader_debug_log_mesa;
+   compiler->shader_perf_log = shader_perf_log_mesa;
 
brw_fs_alloc_reg_sets(compiler);
brw_vec4_alloc_reg_set(compiler);
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index f89c4f5..79cea6e 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -88,6 +88,7 @@ struct brw_compiler {
} fs_reg_sets[2];
 
void (*shader_debug_log)(void *, const char *str, ...) PRINTFLIKE(2, 3);
+   void (*shader_perf_log)(void *, const char *str, ...) PRINTFLIKE(2, 3);
 };
 
 enum PACKED register_file {
-- 
2.4.3
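
The shape of shader_perf_log_mesa() (an opaque void * plus printf-style arguments, with one va_list feeding two consumers) can be tried outside the driver. The sketch below is only an illustration: `toy_compiler`, `log_sink`, `vformat` and `perf_log_to_sink` are hypothetical names, and messages are collected in memory rather than sent to stderr or _mesa_gl_vdebug().

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver context that receives messages.
struct log_sink {
   bool to_console = false;
   bool to_debug_output = false;
   std::vector<std::string> console;
   std::vector<std::string> debug_output;
};

// Hypothetical stand-in for brw_compiler: only the logging hook is modelled.
struct toy_compiler {
   void (*shader_perf_log)(void *data, const char *fmt, ...);
};

// Format into a fixed buffer; long messages are truncated by vsnprintf.
static std::string
vformat(const char *fmt, va_list args)
{
   char buf[256];
   vsnprintf(buf, sizeof(buf), fmt, args);
   return std::string(buf);
}

static void
perf_log_to_sink(void *data, const char *fmt, ...)
{
   log_sink *sink = static_cast<log_sink *>(data);

   va_list args;
   va_start(args, fmt);

   if (sink->to_console) {
      // A va_list can only be walked once, so the first consumer gets a copy.
      va_list args_copy;
      va_copy(args_copy, args);
      sink->console.push_back(vformat(fmt, args_copy));
      va_end(args_copy);
   }

   if (sink->to_debug_output)
      sink->debug_output.push_back(vformat(fmt, args));

   va_end(args);
}
```

A caller would store `perf_log_to_sink` in `toy_compiler::shader_perf_log` and pass a `log_sink *` as the first argument, in the same way the visitors pass `brw` here. The point of the void * is that the compiler never has to know what type sits behind it.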



[Mesa-dev] [PATCH 05/16] i965: Move INTEL_DEBUG variable parsing to screen creation time

2015-06-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_context.c  | 10 +-
 src/mesa/drivers/dri/i965/intel_debug.c  | 13 ++---
 src/mesa/drivers/dri/i965/intel_debug.h  |  4 ++--
 src/mesa/drivers/dri/i965/intel_screen.c |  2 ++
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index c629f39..327a668 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -822,7 +822,15 @@ brwCreateContext(gl_api api,
_mesa_meta_init(ctx);
 
brw_process_driconf_options(brw);
-   brw_process_intel_debug_variable(brw);
+
+   if (INTEL_DEBUG & DEBUG_BUFMGR)
+  dri_bufmgr_set_debug(brw->bufmgr, true);
+
+   if (INTEL_DEBUG & DEBUG_PERF)
+  brw->perf_debug = true;
+
+   if (INTEL_DEBUG & DEBUG_AUB)
+  drm_intel_bufmgr_gem_set_aub_dump(brw->bufmgr, true);
 
if (brw->gen >= 8 && !(INTEL_DEBUG & DEBUG_VEC4VS))
   brw->scalar_vs = true;
diff --git a/src/mesa/drivers/dri/i965/intel_debug.c 
b/src/mesa/drivers/dri/i965/intel_debug.c
index 53f575a..0f4e556 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.c
+++ b/src/mesa/drivers/dri/i965/intel_debug.c
@@ -88,25 +88,16 @@ intel_debug_flag_for_shader_stage(gl_shader_stage stage)
 }
 
 void
-brw_process_intel_debug_variable(struct brw_context *brw)
+brw_process_intel_debug_variable(const struct brw_device_info *devinfo)
 {
uint64_t intel_debug = driParseDebugString(getenv("INTEL_DEBUG"), debug_control);
(void) p_atomic_cmpxchg(&INTEL_DEBUG, 0, intel_debug);
 
-   if (INTEL_DEBUG & DEBUG_BUFMGR)
-  dri_bufmgr_set_debug(brw->bufmgr, true);
-
-   if ((INTEL_DEBUG & DEBUG_SHADER_TIME) && brw->gen < 7) {
+   if ((INTEL_DEBUG & DEBUG_SHADER_TIME) && devinfo->gen < 7) {
   fprintf(stderr,
   "shader_time debugging requires gen7 (Ivybridge) or better.\n");
   INTEL_DEBUG &= ~DEBUG_SHADER_TIME;
}
-
-   if (INTEL_DEBUG & DEBUG_PERF)
-  brw->perf_debug = true;
-
-   if (INTEL_DEBUG & DEBUG_AUB)
-  drm_intel_bufmgr_gem_set_aub_dump(brw->bufmgr, true);
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
b/src/mesa/drivers/dri/i965/intel_debug.h
index f754be2..96212df 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.h
+++ b/src/mesa/drivers/dri/i965/intel_debug.h
@@ -114,8 +114,8 @@ extern uint64_t INTEL_DEBUG;
 
 extern uint64_t intel_debug_flag_for_shader_stage(gl_shader_stage stage);
 
-struct brw_context;
+struct brw_device_info;
 
-extern void brw_process_intel_debug_variable(struct brw_context *brw);
+extern void brw_process_intel_debug_variable(const struct brw_device_info *);
 
 extern bool brw_env_var_as_boolean(const char *var_name, bool default_value);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 896a125..38475b9 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1372,6 +1372,8 @@ __DRIconfig **intelInitScreen2(__DRIscreen *psp)
if (!intelScreen->devinfo)
   return false;
 
+   brw_process_intel_debug_variable(intelScreen->devinfo);
+
intelScreen->hw_must_use_separate_stencil = intelScreen->devinfo->gen >= 7;
 
intelScreen->hw_has_swizzling = intel_detect_swizzling(intelScreen);
-- 
2.4.3
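
Since the parse now runs once per screen rather than once per context, the compare-and-swap on INTEL_DEBUG is what keeps a later call from overwriting flags that were already published. A minimal sketch of that publish-once idea, using std::atomic in place of p_atomic_cmpxchg (`debug_flags` and `publish_debug_flags` are hypothetical names, not driver symbols):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical analogue of the global INTEL_DEBUG word.
static std::atomic<uint64_t> debug_flags{0};

// Store the parsed flags only if the global is still zero, then return
// whatever value ended up being visible.
static uint64_t
publish_debug_flags(uint64_t parsed)
{
   uint64_t expected = 0;
   debug_flags.compare_exchange_strong(expected, parsed);
   return debug_flags.load();
}
```

In this sketch the first non-zero value wins and later calls leave it alone. A parse result of zero leaves the global at zero, so the next caller simply tries again.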



[Mesa-dev] [PATCH 06/16] i965/fs: Make no16 non-variadic

2015-06-22 Thread Jason Ekstrand
We never used the fact that it was variadic anyway.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 14 --
 src/mesa/drivers/dri/i965/brw_fs.h   |  2 +-
 2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 615c2f1..a9d9f37 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -703,26 +703,20 @@ fs_visitor::fail(const char *format, ...)
  * During a SIMD16 compile (if one happens anyway), this just calls fail().
  */
 void
-fs_visitor::no16(const char *format, ...)
+fs_visitor::no16(const char *msg)
 {
-   va_list va;
-
-   va_start(va, format);
-
if (dispatch_width == 16) {
-  vfail(format, va);
+  fail("%s", msg);
} else {
   simd16_unsupported = true;
 
   if (brw->perf_debug) {
  if (no16_msg)
-ralloc_vasprintf_append(&no16_msg, format, va);
+ralloc_strcat(&no16_msg, msg);
  else
-no16_msg = ralloc_vasprintf(mem_ctx, format, va);
+no16_msg = ralloc_strdup(mem_ctx, msg);
   }
}
-
-   va_end(va);
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 1d52ff0..cffedc0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -157,7 +157,7 @@ public:
  fs_inst *inst);
void vfail(const char *msg, va_list args);
void fail(const char *msg, ...);
-   void no16(const char *msg, ...);
+   void no16(const char *msg);
void lower_uniform_pull_constant_loads();
bool lower_load_payload();
bool lower_integer_multiplication();
-- 
2.4.3
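
One detail worth spelling out is why the SIMD16 branch becomes fail("%s", msg) rather than fail(msg): once the message is plain data, it must not be reinterpreted as a format string. A small standalone sketch of the same pattern, with hypothetical names (`format_failure` stands in for a printf-like reporter such as fail(), and returns the text so it can be inspected):

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical printf-like reporter.
static std::string
format_failure(const char *format, ...)
{
   char buf[256];
   va_list va;
   va_start(va, format);
   vsnprintf(buf, sizeof(buf), format, va);
   va_end(va);
   return std::string(buf);
}

// Non-variadic wrapper: msg is passed as an argument to "%s", so any '%'
// characters inside it are copied through untouched.
static std::string
report_no16(const char *msg)
{
   return format_failure("%s", msg);
}
```

With this wrapper, a message such as "SIMD16 costs 100%" comes out unchanged, whereas handing it directly to the variadic function would treat the trailing '%' as the start of a conversion.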



[Mesa-dev] [PATCH 16/16] i965: Remove the brw_context from the visitors

2015-06-22 Thread Jason Ekstrand
As of this commit, nothing actually needs the brw_context.
---
 src/mesa/drivers/dri/i965/brw_cs.cpp|  6 --
 src/mesa/drivers/dri/i965/brw_fs.cpp| 12 ++--
 src/mesa/drivers/dri/i965/brw_fs.h  |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp   |  1 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp|  4 ++--
 src/mesa/drivers/dri/i965/brw_shader.cpp|  9 +
 src/mesa/drivers/dri/i965/brw_shader.h  |  7 ---
 src/mesa/drivers/dri/i965/brw_vec4.cpp  |  6 --
 src/mesa/drivers/dri/i965/brw_vec4.h|  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp   | 14 --
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp |  1 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp  |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp   |  4 ++--
 src/mesa/drivers/dri/i965/brw_vs.h  |  2 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h |  4 ++--
 16 files changed, 43 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp 
b/src/mesa/drivers/dri/i965/brw_cs.cpp
index fa8b5c8..4c5082c 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -94,7 +94,8 @@ brw_cs_emit(struct brw_context *brw,
 
/* Now the main event: Visit the shader IR and generate our CS IR for it.
 */
-   fs_visitor v8(brw, mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, prog,
+   fs_visitor v8(brw->intelScreen->compiler, brw,
+ mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, prog,
  &cp->Base, 8, st_index);
if (!v8.run_cs()) {
   fail_msg = v8.fail_msg;
@@ -103,7 +104,8 @@ brw_cs_emit(struct brw_context *brw,
   prog_data->simd_size = 8;
}
 
-   fs_visitor v16(brw, mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, prog,
+   fs_visitor v16(brw->intelScreen->compiler, brw,
+  mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, prog,
   &cp->Base, 16, st_index);
if (likely(!(INTEL_DEBUG & DEBUG_NO16)) &&
!fail_msg && !v8.simd16_unsupported &&
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 23f60c2..f7f05af 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -677,8 +677,7 @@ fs_visitor::no16(const char *msg)
} else {
   simd16_unsupported = true;
 
-  struct brw_compiler *compiler = brw->intelScreen->compiler;
-  compiler->shader_perf_log(brw,
+  compiler->shader_perf_log(log_data,
 "SIMD16 shader failed to compile: %s", msg);
}
 }
@@ -3757,8 +3756,7 @@ fs_visitor::allocate_registers()
  fail("Failure to register allocate.  Reduce number of "
   "live scalar values to avoid this.");
   } else {
- struct brw_compiler *compiler = brw->intelScreen->compiler;
- compiler->shader_perf_log(brw,
+ compiler->shader_perf_log(log_data,
"%s shader triggered register spilling.  "
"Try reducing the number of live scalar "
"values to improve performance.\n",
@@ -3994,7 +3992,8 @@ brw_wm_fs_emit(struct brw_context *brw,
 
/* Now the main event: Visit the shader IR and generate our FS IR for it.
 */
-   fs_visitor v(brw, mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
+   fs_visitor v(brw->intelScreen->compiler, brw,
+mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
 prog, &fp->Base, 8, st_index8);
if (!v.run_fs(false /* do_rep_send */)) {
   if (prog) {
@@ -4009,7 +4008,8 @@ brw_wm_fs_emit(struct brw_context *brw,
}
 
cfg_t *simd16_cfg = NULL;
-   fs_visitor v2(brw, mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
+   fs_visitor v2(brw->intelScreen->compiler, brw,
+ mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
  prog, &fp->Base, 16, st_index16);
if (likely(!(INTEL_DEBUG & DEBUG_NO16) || brw->use_rep_send)) {
   if (!v.simd16_unsupported) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index e0a8984..243baf6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -70,7 +70,7 @@ namespace brw {
 class fs_visitor : public backend_shader
 {
 public:
-   fs_visitor(struct brw_context *brw,
+   fs_visitor(const struct brw_compiler *compiler, void *log_data,
   void *mem_ctx,
   gl_shader_stage stage,
   const void *key,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
index cd78816..364fc4a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
+++ b/src/mesa/d

[Mesa-dev] [PATCH 12/16] i965/fs: Add a do_rep_send flag to run_fs

2015-06-22 Thread Jason Ekstrand
Previously, we were pulling it from brw->use_rep_send.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +
 src/mesa/drivers/dri/i965/brw_fs.h   | 2 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 252196a..bf04e26 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3825,7 +3825,7 @@ fs_visitor::run_vs()
 }
 
 bool
-fs_visitor::run_fs()
+fs_visitor::run_fs(bool do_rep_send)
 {
brw_wm_prog_data *wm_prog_data = (brw_wm_prog_data *) this->prog_data;
brw_wm_prog_key *wm_key = (brw_wm_prog_key *) this->key;
@@ -3843,7 +3843,8 @@ fs_visitor::run_fs()
 
if (0) {
   emit_dummy_fs();
-   } else if (brw->use_rep_send && dispatch_width == 16) {
+   } else if (do_rep_send) {
+  assert(dispatch_width == 16);
   emit_repclear_shader();
} else {
   if (shader_time_index >= 0)
@@ -3995,7 +3996,7 @@ brw_wm_fs_emit(struct brw_context *brw,
 */
fs_visitor v(brw, mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
 prog, &fp->Base, 8, st_index8);
-   if (!v.run_fs()) {
+   if (!v.run_fs(false /* do_rep_send */)) {
   if (prog) {
  prog->LinkStatus = false;
  ralloc_strcat(&prog->InfoLog, v.fail_msg);
@@ -4014,7 +4015,7 @@ brw_wm_fs_emit(struct brw_context *brw,
   if (!v.simd16_unsupported) {
  /* Try a SIMD16 compile */
  v2.import_uniforms(&v);
- if (!v2.run_fs()) {
+ if (!v2.run_fs(brw->use_rep_send)) {
 perf_debug("SIMD16 shader failed to compile: %s", v2.fail_msg);
  } else {
 simd16_cfg = v2.cfg;
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 525be3a..4db5a91 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -103,7 +103,7 @@ public:
uint32_t const_offset);
void DEP_RESOLVE_MOV(const brw::fs_builder &bld, int grf);
 
-   bool run_fs();
+   bool run_fs(bool do_rep_send);
bool run_vs();
bool run_cs();
void optimize();
-- 
2.4.3



Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-23 Thread Jason Ekstrand
On Tue, Jun 23, 2015 at 9:22 AM, Francisco Jerez  wrote:
> Jason Ekstrand  writes:
>
>> We want to move these into the builder so that they know the current
>> builder's dispatch width.  This will be needed by a later commit.
>
> I very much like the idea of this series, but, why do you need to move
> these register manipulators into the builder?  The builder is an object
> you can use to:
>  - Manipulate and query parameters affecting code generation.
>  - Create instructions into the program (::emit and friends).
>  - Allocate virtual registers from the program (::vgrf and friends).
>
> offset() and half() logically perform an action on a given register
> object (or rather, compute a function of a given register object), not
> on a builder object, the builder is only required as an auxiliary
> parameter -- Any reason you didn't just pass it as a third parameter?

What's required as a third parameter is the current execution size.  I
could have passed that directly, but I figured that, especially for
half(), it would get messed up.  I could pass the builder in but I
don't see a whole lot of difference between that and what I'm doing
right now.  As is, it's not entirely obvious whether you should call
half(reg) on the half-width or full-width builder.  I'm not 100% sure
what to do about that.

> As offset() and half() don't require access to any private details of
> the builder, that would actually improve encapsulation, and would avoid
> the dubious overloading of fs_builder::half() with two methods with
> completely different semantics.

Yeah, I don't really like that either.  I just couldn't come up with
anything better at the time.

Suggestions are very much welcome.  But I would like to settle on
whatever we do fairly quickly so as to limit the amount of
refactoring.
--Jason

> Thanks.
>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp |  52 ++
>>  src/mesa/drivers/dri/i965/brw_fs_builder.h   |  46 +
>>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp |   2 +-
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  60 +--
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 149 
>> ++-
>>  src/mesa/drivers/dri/i965/brw_ir_fs.h|  51 -
>>  6 files changed, 182 insertions(+), 178 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 4f98d63..c13ac7d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -267,7 +267,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
>> &bld,
>>   inst->mlen = 1 + dispatch_width / 8;
>> }
>>
>> -   bld.MOV(dst, offset(vec4_result, (const_offset & 3) * scale));
>> +   bld.MOV(dst, bld.offset(vec4_result, (const_offset & 3) * scale));
>>  }
>>
>>  /**
>> @@ -361,7 +361,12 @@ fs_inst::is_copy_payload(const brw::simple_allocator 
>> &grf_alloc) const
>>reg.width = this->src[i].width;
>>if (!this->src[i].equals(reg))
>>   return false;
>> -  reg = ::offset(reg, 1);
>> +
>> +  if (i < this->header_size) {
>> + reg.reg_offset += 1;
>> +  } else {
>> + reg.reg_offset += this->exec_size / 8;
>> +  }
>> }
>>
>> return true;
>> @@ -963,7 +968,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
>> pixel_center_integer,
>> } else {
>>bld.ADD(wpos, this->pixel_x, fs_reg(0.5f));
>> }
>> -   wpos = offset(wpos, 1);
>> +   wpos = bld.offset(wpos, 1);
>>
>> /* gl_FragCoord.y */
>> if (!flip && pixel_center_integer) {
>> @@ -979,7 +984,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
>> pixel_center_integer,
>>
>>bld.ADD(wpos, pixel_y, fs_reg(offset));
>> }
>> -   wpos = offset(wpos, 1);
>> +   wpos = bld.offset(wpos, 1);
>>
>> /* gl_FragCoord.z */
>> if (devinfo->gen >= 6) {
>> @@ -989,7 +994,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
>> pixel_center_integer,
>> this->delta_xy[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
>> interp_reg(VARYING_SLOT_POS, 2));
>> }
>> -   wpos = offset(wpos, 1);
>> +   wpos = bld.offset(wpos, 1);
>>
>> /* gl_FragCoord.w: Already set up in emit_interpolation */
>> bld.MOV(wpos, this->wpos_w);
>> @@ -1072,7 +1077,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, 
>> const char *name,
>>  

Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-23 Thread Jason Ekstrand
On Tue, Jun 23, 2015 at 1:39 AM, Pohjolainen, Topi  wrote:
> On Thu, Jun 18, 2015 at 05:51:36PM -0700, Jason Ekstrand wrote:
>> We want to move these into the builder so that they know the current
>> builder's dispatch width.  This will be needed by a later commit.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp |  52 ++
>>  src/mesa/drivers/dri/i965/brw_fs_builder.h   |  46 +
>>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp |   2 +-
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  60 +--
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 149 
>> ++-
>>  src/mesa/drivers/dri/i965/brw_ir_fs.h|  51 -
>>  6 files changed, 182 insertions(+), 178 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 4f98d63..c13ac7d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -267,7 +267,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
>> &bld,
>>   inst->mlen = 1 + dispatch_width / 8;
>> }
>>
>> -   bld.MOV(dst, offset(vec4_result, (const_offset & 3) * scale));
>> +   bld.MOV(dst, bld.offset(vec4_result, (const_offset & 3) * scale));
>>  }
>>
>>  /**
>> @@ -361,7 +361,12 @@ fs_inst::is_copy_payload(const brw::simple_allocator 
>> &grf_alloc) const
>>reg.width = this->src[i].width;
>>if (!this->src[i].equals(reg))
>>   return false;
>> -  reg = ::offset(reg, 1);
>> +
>> +  if (i < this->header_size) {
>> + reg.reg_offset += 1;
>> +  } else {
>> + reg.reg_offset += this->exec_size / 8;
>> +  }
>
> The latter branch is new functionality, isn't it? There is no consideration
> for header_size in the offset() utility.

Not quite.  We don't have a builder in this context, so I had to
mangle the reg_offset itself.  I'll freely admit that's kind of ugly.
This might be a good argument for Curro's suggestion of just adding a
width or maybe a pair of widths to the half() function.

The reason for the if statement is that, if it's a header, it deals in
actual physical registers while, if it's not a header, it deals in
something relative to the width.  This was all magically handled by
offset() before because the header registers had a width of 8 and the
others had a width of exec_size.
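
The stepping rule described above can be sketched roughly as follows. This is only an illustration, assuming one physical register holds 8 channels of 32-bit data; `next_reg_offset` and its parameters are hypothetical names, not anything in Mesa.

```cpp
#include <cassert>

// Hypothetical helper (not part of Mesa): returns the reg_offset of the
// source that follows source `i` in a copy payload.
// Assumes one physical register holds 8 channels of 32-bit data.
static unsigned
next_reg_offset(unsigned reg_offset, unsigned i,
                unsigned header_size, unsigned exec_size)
{
   assert(exec_size % 8 == 0);

   if (i < header_size) {
      // Header sources occupy exactly one physical register each.
      return reg_offset + 1;
   } else {
      // Payload sources span exec_size channels, i.e. exec_size / 8 registers.
      return reg_offset + exec_size / 8;
   }
}
```

With a one-register header and exec_size 16, the first payload source would start at offset 1 and the next one at offset 3.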

>> }
>>
>> return true;
>> @@ -963,7 +968,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
>> pixel_center_integer,
>> } else {
>>bld.ADD(wpos, this->pixel_x, fs_reg(0.5f));
>> }
>> -   wpos = offset(wpos, 1);
>> +   wpos = bld.offset(wpos, 1);
>>
>> /* gl_FragCoord.y */
>> if (!flip && pixel_center_integer) {
>> @@ -979,7 +984,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
>> pixel_center_integer,
>>
>>bld.ADD(wpos, pixel_y, fs_reg(offset));
>> }
>> -   wpos = offset(wpos, 1);
>> +   wpos = bld.offset(wpos, 1);
>>
>> /* gl_FragCoord.z */
>> if (devinfo->gen >= 6) {
>> @@ -989,7 +994,7 @@ fs_visitor::emit_fragcoord_interpolation(bool 
>> pixel_center_integer,
>> this->delta_xy[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
>> interp_reg(VARYING_SLOT_POS, 2));
>> }
>> -   wpos = offset(wpos, 1);
>> +   wpos = bld.offset(wpos, 1);
>>
>> /* gl_FragCoord.w: Already set up in emit_interpolation */
>> bld.MOV(wpos, this->wpos_w);
>> @@ -1072,7 +1077,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, 
>> const char *name,
>>   /* If there's no incoming setup data for this slot, don't
>>* emit interpolation for it.
>>*/
>> - attr = offset(attr, type->vector_elements);
>> + attr = bld.offset(attr, type->vector_elements);
>>   location++;
>>   continue;
>>}
>> @@ -1087,7 +1092,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, 
>> const char *name,
>>  interp = suboffset(interp, 3);
>> interp.type = attr.type;
>> bld.emit(FS_OPCODE_CINTERP, attr, fs_reg(interp));
>> -attr = offset(attr, 1);
>> +attr = bld.offset(attr, 1);
>>   }
>>} else {
>>   /* Smooth/noperspective interpolation case. */
>> @@ -1125,7 +1130,7 @@ fs_visitor::emit_general_interpolati

[Mesa-dev] [PATCH] i965/fs: Get rid of an unused variable in emit_barrier()

2015-06-23 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index ea29341..9a4bad6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1963,11 +1963,11 @@ fs_visitor::emit_barrier()
fs_reg payload = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
 
/* Clear the message payload */
-   fs_inst *inst = bld.exec_all().MOV(payload, fs_reg(0u));
+   bld.exec_all().MOV(payload, fs_reg(0u));
 
/* Copy bits 27:24 of r0.2 (barrier id) to the message payload reg.2 */
fs_reg r0_2 = fs_reg(retype(brw_vec1_grf(0, 2), BRW_REGISTER_TYPE_UD));
-   inst = bld.exec_all().AND(component(payload, 2), r0_2, fs_reg(0x0f000000u));
+   bld.exec_all().AND(component(payload, 2), r0_2, fs_reg(0x0f000000u));
 
/* Emit a gateway "barrier" message using the payload we set up, followed
 * by a wait instruction.
-- 
2.4.3



Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-24 Thread Jason Ekstrand
On Wed, Jun 24, 2015 at 6:44 AM, Francisco Jerez  wrote:
> Jason Ekstrand  writes:
>
>> On Jun 24, 2015 6:29 AM, "Francisco Jerez"  wrote:
>>>
>>> Jason Ekstrand  writes:
>>>
>>> > On Jun 24, 2015 4:29 AM, "Francisco Jerez" 
>> wrote:
>>> >>
>>> >> Jason Ekstrand  writes:
>>> >>
>>> >> > On Tue, Jun 23, 2015 at 9:22 AM, Francisco Jerez <
>> curroje...@riseup.net>
>>> > wrote:
>>> >> >> Jason Ekstrand  writes:
>>> >> >>
>>> >> >>> We want to move these into the builder so that they know the
>> current
>>> >> >>> builder's dispatch width.  This will be needed by a later commit.
>>> >> >>
>>> >> >> I very much like the idea of this series, but, why do you need to
>> move
>>> >> >> these register manipulators into the builder?  The builder is an
>> object
>>> >> >> you can use to:
>>> >> >>  - Manipulate and query parameters affecting code generation.
>>> >> >>  - Create instructions into the program (::emit and friends).
>>> >> >>  - Allocate virtual registers from the program (::vgrf and friends).
>>> >> >>
>>> >> >> offset() and half() logically perform an action on a given register
>>> >> >> object (or rather, compute a function of a given register object),
>> not
>>> >> >> on a builder object, the builder is only required as an auxiliary
>>> >> >> parameter -- Any reason you didn't just pass it as a third
>> parameter?
>>> >> >
>>> >> > What's required as a third parameter is the current execution size.
>> I
>>> >> > could have passed that directly, but I figured that, especially for
>>> >> > half(), it would get messed up.  I could pass the builder in but I
>>> >> > don't see a whole lot of difference between that and what I'm doing
>>> >> > right now.
>>> >>
>>> >> Assembly-wise there's no difference, but it seems inconsistent with
>> both
>>> >> the remaining register manipulators and remaining builder methods, and
>>> >> IMHO it's kind of an anti-pattern to make something a method that
>>> >> doesn't need access to any internal details of the object.
>>> >>
>>> >> > As is, it's not entirely obvious whether you should call
>>> >> > half(reg) on the half-width or full-width builder.  I'm not 100% sure
>>> >> > what to do about that.
>>> >> >
>>> >> Actually, does half() really need to know about the builder?  AFAICT it
>>> >> only needs it because of dispatch_width(), and before doing anything
>>> >> useful with it it asserts that it's equal to 16, which points at the
>>> >> parameter being redundant.  By convention a "half" is a group of 8
>>> >> channels (we may want to revise this convention when we implement
>> SIMD32
>>> >> -- E.g. make half a group of 16 channels and quarter a group of 8
>>> >> channels), so 'half(reg)' could simply be implemented as
>>> >> "horiz_offset(reg, 8 * i)" without any dependency on the builder.  As
>>> >> additional paranoia to catch half() being called on a non-16-aligned
>>> >> register you could assert that either 'stride == 0' or 16 divides
>>> >> '(REG_SIZE * reg_offset + subreg_offset) / (stride * type_size)' (why
>>> >> don't we have a reg_offset already in bytes again?) -- That would also
>>> >> catch cases in which the register and builder "widths" get out of sync,
>>> >> e.g. if half is called in an already halved register but the builder
>>> >> used happens to be of the correct exec_size.
>>> >
>>> > OK, fine, we can pull half() back out.  Should offset() stay in the
>>> > builder? If not, where should it get its dispatch width?
>>> >
>>> I'm for leaving it as a stand-alone function (like all other register
>>> manipulators) and adding a third argument to pass the 'fs_builder' it can
>>> take the dispatch width from.
>>
>> I'm not a big fan.  However, in the interest of keeping the builder clean,

Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-24 Thread Jason Ekstrand
On Wed, Jun 24, 2015 at 7:56 AM, Francisco Jerez  wrote:
> Jason Ekstrand  writes:
>
>> On Wed, Jun 24, 2015 at 6:44 AM, Francisco Jerez  
>> wrote:
>>> Jason Ekstrand  writes:
>>>
>>>> On Jun 24, 2015 6:29 AM, "Francisco Jerez"  wrote:
>>>>>
>>>>> Jason Ekstrand  writes:
>>>>>
>>>>> > On Jun 24, 2015 4:29 AM, "Francisco Jerez" 
>>>> wrote:
>>>>> >>
>>>>> >> Jason Ekstrand  writes:
>>>>> >>
>>>>> >> > On Tue, Jun 23, 2015 at 9:22 AM, Francisco Jerez <
>>>> curroje...@riseup.net>
>>>>> > wrote:
>>>>> >> >> Jason Ekstrand  writes:
>>>>> >> >>
>>>>> >> >>> We want to move these into the builder so that they know the
>>>> current
>>>>> >> >>> builder's dispatch width.  This will be needed by a later commit.
>>>>> >> >>
>>>>> >> >> I very much like the idea of this series, but, why do you need to
>>>> move
>>>>> >> >> these register manipulators into the builder?  The builder is an
>>>> object
>>>>> >> >> you can use to:
>>>>> >> >>  - Manipulate and query parameters affecting code generation.
>>>>> >> >>  - Create instructions into the program (::emit and friends).
>>>>> >> >>  - Allocate virtual registers from the program (::vgrf and friends).
>>>>> >> >>
>>>>> >> >> offset() and half() logically perform an action on a given register
>>>>> >> >> object (or rather, compute a function of a given register object),
>>>> not
>>>>> >> >> on a builder object, the builder is only required as an auxiliary
>>>>> >> >> parameter -- Any reason you didn't just pass it as a third
>>>> parameter?
>>>>> >> >
>>>>> >> > What's required as a third parameter is the current execution size.
>>>> I
>>>>> >> > could have passed that directly, but I figured that, especially for
>>>>> >> > half(), it would get messed up.  I could pass the builder in but I
>>>>> >> > don't see a whole lot of difference between that and what I'm doing
>>>>> >> > right now.
>>>>> >>
>>>>> >> Assembly-wise there's no difference, but it seems inconsistent with
>>>> both
>>>>> >> the remaining register manipulators and remaining builder methods, and
>>>>> >> IMHO it's kind of an anti-pattern to make something a method that
>>>>> >> doesn't need access to any internal details of the object.
>>>>> >>
>>>>> >> > As is, it's not entirely obvious whether you should call
>>>>> >> > half(reg) on the half-width or full-width builder.  I'm not 100% sure
>>>>> >> > what to do about that.
>>>>> >> >
>>>>> >> Actually, does half() really need to know about the builder?  AFAICT it
>>>>> >> only needs it because of dispatch_width(), and before doing anything
>>>>> >> useful with it it asserts that it's equal to 16, which points at the
>>>>> >> parameter being redundant.  By convention a "half" is a group of 8
>>>>> >> channels (we may want to revise this convention when we implement
>>>> SIMD32
>>>>> >> -- E.g. make half a group of 16 channels and quarter a group of 8
>>>>> >> channels), so 'half(reg)' could simply be implemented as
>>>>> >> "horiz_offset(reg, 8 * i)" without any dependency on the builder.  As
>>>>> >> additional paranoia to catch half() being called on a non-16-aligned
>>>>> >> register you could assert that either 'stride == 0' or 16 divides
>>>>> >> '(REG_SIZE * reg_offset + subreg_offset) / (stride * type_size)' (why
>>>>> >> don't we have a reg_offset already in bytes again?) -- That would also
>>>>> >> catch cases in which the register and builder "widths" get out of sync,
>>>>> >> e.g. if hal

Re: [Mesa-dev] ARB_arrays_of_arrays GLSL ES

2015-06-24 Thread Jason Ekstrand
On Sat, Jun 20, 2015 at 5:32 AM, Timothy Arceri  wrote:
> Hi all,
>
> The restrictions in ES make the extension easier to implement so
> I thought I'd try to get this stuff reviewed and committed before finishing
> up the full extension.
> The bits that I'm still working on for the desktop version are AoA inputs,
> outputs, and interface blocks.
>
> The only thing I know is definitely missing in this series for ES is
> support for indirect indexing of samplers, but that didn't seem like
> something that should hold up the series.
>
> Once the SSBO series lands (with a patch that restricts unsized arrays)
> then all the AoA ES conformance tests will pass.
>
> There are already a bunch of piglit tests in git but I've just sent a
> series with all the patches still waiting review here:
> http://lists.freedesktop.org/archives/piglit/2015-June/016312.html
>
> I haven't made a patch marking this as done yet because currently
> the i965 backend takes a very long time trying to optimise some of the
> conformance tests. They still pass but they are taking 15+ minutes just
> to compile so this really needs to be sorted out first. If someone with
> more knowledge in this area than me wants to take a look at this I would
> be grateful for being pointed in the right direction.

Can you try it with this patch:

http://lists.freedesktop.org/archives/mesa-dev/2015-June/086011.html

> If useful the series is in the 'gles4' branch of the repo here:
> https://github.com/tarceri/Mesa_arrays_of_arrays.git
>
> Thanks,
> Tim


Re: [Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder

2015-06-24 Thread Jason Ekstrand
On Jun 24, 2015 4:29 AM, "Francisco Jerez"  wrote:
>
> Jason Ekstrand  writes:
>
> > On Tue, Jun 23, 2015 at 9:22 AM, Francisco Jerez  wrote:
> >> Jason Ekstrand  writes:
> >>
> >>> We want to move these into the builder so that they know the current
> >>> builder's dispatch width.  This will be needed by a later commit.
> >>
> >> I very much like the idea of this series, but, why do you need to move
> >> these register manipulators into the builder?  The builder is an object
> >> you can use to:
> >>  - Manipulate and query parameters affecting code generation.
> >>  - Create instructions into the program (::emit and friends).
> >>  - Allocate virtual registers from the program (::vgrf and friends).
> >>
> >> offset() and half() logically perform an action on a given register
> >> object (or rather, compute a function of a given register object), not
> >> on a builder object, the builder is only required as an auxiliary
> >> parameter -- Any reason you didn't just pass it as a third parameter?
> >
> > What's required as a third parameter is the current execution size.  I
> > could have passed that directly, but I figured that, especially for
> > half(), it would get messed up.  I could pass the builder in but I
> > don't see a whole lot of difference between that and what I'm doing
> > right now.
>
> Assembly-wise there's no difference, but it seems inconsistent with both
> the remaining register manipulators and remaining builder methods, and
> IMHO it's kind of an anti-pattern to make something a method that
> doesn't need access to any internal details of the object.
>
> > As is, it's not entirely obvious whether you should call
> > half(reg) on the half-width or full-width builder.  I'm not 100% sure
> > what to do about that.
> >
> Actually, does half() really need to know about the builder?  AFAICT it
> only needs it because of dispatch_width(), and before doing anything
> useful with it it asserts that it's equal to 16, which points at the
> parameter being redundant.  By convention a "half" is a group of 8
> channels (we may want to revise this convention when we implement SIMD32
> -- E.g. make half a group of 16 channels and quarter a group of 8
> channels), so 'half(reg)' could simply be implemented as
> "horiz_offset(reg, 8 * i)" without any dependency on the builder.  As
> additional paranoia to catch half() being called on a non-16-aligned
> register you could assert that either 'stride == 0' or 16 divides
> '(REG_SIZE * reg_offset + subreg_offset) / (stride * type_size)' (why
> don't we have a reg_offset already in bytes again?) -- That would also
> catch cases in which the register and builder "widths" get out of sync,
> e.g. if half is called in an already halved register but the builder
> used happens to be of the correct exec_size.
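
For illustration, the builder-free half() suggested in the quoted text might look something like the sketch below. It uses a toy register model: `toy_reg`, `first_channel` and the 32-byte `REG_SIZE` are assumptions made for this example, and `horiz_offset` here is a simplified stand-in rather than the real Mesa helper.

```cpp
#include <cassert>

// Toy register model; field names follow the discussion, but this is not
// Mesa's fs_reg.
constexpr unsigned REG_SIZE = 32; // bytes per physical register (assumed)

struct toy_reg {
   unsigned reg_offset;    // offset in whole registers
   unsigned subreg_offset; // additional offset in bytes
   unsigned stride;        // channel stride in elements (0 = scalar)
   unsigned type_size;     // bytes per element
};

// Index of the first channel the register points at (stride must be != 0).
static unsigned
first_channel(const toy_reg &reg)
{
   return (REG_SIZE * reg.reg_offset + reg.subreg_offset) /
          (reg.stride * reg.type_size);
}

// Advance the register by `delta` channels.
static toy_reg
horiz_offset(toy_reg reg, unsigned delta)
{
   const unsigned bytes = REG_SIZE * reg.reg_offset + reg.subreg_offset +
                          delta * reg.stride * reg.type_size;
   reg.reg_offset = bytes / REG_SIZE;
   reg.subreg_offset = bytes % REG_SIZE;
   return reg;
}

// Select the i-th group of 8 channels without consulting a builder.
static toy_reg
half(const toy_reg &reg, unsigned i)
{
   assert(i < 2);
   // Paranoia: catch half() being applied to an already halved register.
   assert(reg.stride == 0 || first_channel(reg) % 16 == 0);
   return horiz_offset(reg, 8 * i);
}
```

The alignment assert is what would catch the "already halved register" case without any reference to the builder's exec_size.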

OK, fine, we can pull half() back out.  Should offset() stay in the
builder? If not, where should it get its dispatch width?

> >> As offset() and half() don't require access to any private details of
> >> the builder, that would actually improve encapsulation, and would avoid
> >> the dubious overloading of fs_builder::half() with two methods with
> >> completely different semantics.
> >
> > Yeah, I don't really like that either.  I just couldn't come up with
> > anything better at the time.
> >
> > Suggestions are very much welcome.  But I would like to settle on
> > whatever we do fairly quickly so as to limit the amount of
> > refactoring.
> > --Jason
> >
> >> Thanks.
> >>
> >>> ---
> >>>  src/mesa/drivers/dri/i965/brw_fs.cpp |  52 ++
> >>>  src/mesa/drivers/dri/i965/brw_fs_builder.h   |  46 +
> >>>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp |   2 +-
> >>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  60 +--
> >>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 149 ++-
> >>>  src/mesa/drivers/dri/i965/brw_ir_fs.h|  51 -
> >>>  6 files changed, 182 insertions(+), 178 deletions(-)
> >>>
> >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >>> index 4f98d63..c13ac7d 100644
> >>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >>> @@ -267,7 +267,7 @@ fs_visitor::VARY

Re: [Mesa-dev] [PATCH v2 05/14] meta: Abort meta pbo path if readpixels need signed-unsigned conversion

2015-06-24 Thread Jason Ekstrand
On Wed, Jun 24, 2015 at 6:39 PM, Anuj Phogat  wrote:
> The meta PBO path for ReadPixels relies on BlitFramebuffer, which doesn't
> support signed to unsigned integer conversions and vice versa.
>
> Without this patch, piglit test fbo_integer_readpixels_sint_uint fails, when
> forced to use the meta pbo path.
>
> v2: Make need_rgb_to_luminance_conversion() a static function. (Iago)
> Bump up the comment and the commit message. (Jason)
>
> Signed-off-by: Anuj Phogat 
> Reviewed-by: Jason Ekstrand 
> Cc: Iago Toral 
> Cc: 
> ---
>  src/mesa/drivers/common/meta_tex_subimage.c | 25 +
>  1 file changed, 25 insertions(+)
>
> diff --git a/src/mesa/drivers/common/meta_tex_subimage.c b/src/mesa/drivers/common/meta_tex_subimage.c
> index 00364f8..a617b77 100644
> --- a/src/mesa/drivers/common/meta_tex_subimage.c
> +++ b/src/mesa/drivers/common/meta_tex_subimage.c
> @@ -248,6 +248,23 @@ fail:
> return success;
>  }
>
> +static bool
> +need_signed_unsigned_int_conversion(mesa_format rbFormat,
> +GLenum format, GLenum type)
> +{
> +   const GLenum srcType = _mesa_get_format_datatype(rbFormat);
> +   return (srcType == GL_INT &&
> +   _mesa_is_enum_format_integer(format) &&
> +   (type == GL_UNSIGNED_INT ||
> +type == GL_UNSIGNED_SHORT ||
> +type == GL_UNSIGNED_BYTE)) ||
> +  (srcType == GL_UNSIGNED_INT &&
> +   _mesa_is_enum_format_integer(format) &&
> +   (type == GL_INT ||
> +type == GL_SHORT ||
> +type == GL_BYTE));
> +}
> +
>  bool
>  _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims,
>struct gl_texture_image *tex_image,
> @@ -283,6 +300,14 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
> GLuint dims,
>
>if (_mesa_need_rgb_to_luminance_conversion(rb->Format, format))
>   return false;
> +
> +  /* This function rely on BlitFramebuffer to fill in the pixel data for
> +   * ReadPixels. But, BlitFrameBuffer doesn't support signed to unsigned
> +   * or unsigned to signed integer conversions. OpenGL spec expects an
> +   * invalid operation in that case.
> +   */
> +  if (need_signed_unsigned_int_conversion(rb->Format, format, type))
> + return false;

We should add this to TexSubImage as well.  Other than that, R-B still applies.
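
One way the check could be shared between the ReadPixels and TexSubImage paths is to phrase it in terms of a source and a destination type, so the direction of the copy doesn't matter. The sketch below is only that: `pixel_type` is a hypothetical, simplified stand-in for the GL enums the patch uses, and the function is not the one in the patch.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the GL type enums in the patch.
enum class pixel_type { sint8, sint16, sint32, uint8, uint16, uint32, other };

static bool
is_signed_int(pixel_type t)
{
   return t == pixel_type::sint8 || t == pixel_type::sint16 ||
          t == pixel_type::sint32;
}

static bool
is_unsigned_int(pixel_type t)
{
   return t == pixel_type::uint8 || t == pixel_type::uint16 ||
          t == pixel_type::uint32;
}

// True when copying integer data from `src` to `dst` would have to change
// signedness, which a blit-based path cannot do.
static bool
need_signed_unsigned_int_conversion(bool is_integer_format,
                                    pixel_type src, pixel_type dst)
{
   if (!is_integer_format)
      return false;

   return (is_signed_int(src) && is_unsigned_int(dst)) ||
          (is_unsigned_int(src) && is_signed_int(dst));
}
```

For a download, `src` would describe the renderbuffer and `dst` the user-supplied type; for an upload the roles swap.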

> }
>
> /* For arrays, use a tall (height * depth) 2D texture but taking into
> --
> 1.9.3
>


[Mesa-dev] [PATCH] i965/vec4_live_variables: Do liveness analysis bottom-to-top

2015-06-25 Thread Jason Ekstrand
From Muchnick's Advanced Compiler Design and Implementation:

"To determine which variables are live at each point in a flowgraph, we
perform a backward data-flow analysis"

Previously, we were walking the blocks forwards and updating the livein and
then the liveout.  However, the livein calculation depends on the liveout
and the liveout depends on the successor blocks.  The net result is that it
takes one full iteration to go from liveout to livein and then another
full iteration to propagate to the predecessors.  This works out to an
O(n^2) computation where n is the number of blocks.  If we run things in
the other order, it's O(nl) where l is the maximum loop depth which is
practically bounded by 3.

In b2c6ba0c4b21391dc35018e1c8c4f7f7d8952bea, we made this same change in
the FS backend to great effect.  Might as well keep it consistent and make
the same change for vec4.
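
To make the ordering argument concrete, here is a toy model of the two traversal orders. It is a sketch only: `toy_block`, `compute_liveness` and `make_chain` are made-up names, a single 32-bit word stands in for the real bitsets, and flag registers are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy basic block: one 32-bit word of use/def bits plus successor indices.
struct toy_block {
   uint32_t use = 0, def = 0;
   uint32_t livein = 0, liveout = 0;
   std::vector<int> succs;
};

// liveout(b) = union of livein(s) over all successors s of b.
static bool
update_liveout(std::vector<toy_block> &blocks, toy_block &b)
{
   bool changed = false;
   for (int s : b.succs) {
      const uint32_t new_liveout = blocks[s].livein & ~b.liveout;
      if (new_liveout) {
         b.liveout |= new_liveout;
         changed = true;
      }
   }
   return changed;
}

// livein(b) = use(b) | (liveout(b) & ~def(b)).
static bool
update_livein(toy_block &b)
{
   const uint32_t new_livein = b.use | (b.liveout & ~b.def);
   if (new_livein & ~b.livein) {
      b.livein |= new_livein;
      return true;
   }
   return false;
}

// Iterates to a fixed point and returns the number of passes over the CFG,
// counting the final pass that detects no change.
static int
compute_liveness(std::vector<toy_block> &blocks, bool bottom_to_top)
{
   const int n = static_cast<int>(blocks.size());
   int passes = 0;
   bool cont = true;

   while (cont) {
      cont = false;
      passes++;

      for (int k = 0; k < n; k++) {
         toy_block &b = blocks[bottom_to_top ? n - 1 - k : k];

         if (bottom_to_top) {
            // Successors were already visited in this pass, so their
            // livein is fresh when we read it.
            cont |= update_liveout(blocks, b);
            cont |= update_livein(b);
         } else {
            cont |= update_livein(b);
            cont |= update_liveout(blocks, b);
         }
      }
   }
   return passes;
}

// Straight-line CFG 0 -> 1 -> ... -> n-1; bit 0 is defined in the first
// block and used in the last.
static std::vector<toy_block>
make_chain(int n)
{
   std::vector<toy_block> blocks(n);
   for (int i = 0; i + 1 < n; i++)
      blocks[i].succs.push_back(i + 1);
   blocks[0].def = 1;
   blocks[n - 1].use = 1;
   return blocks;
}
```

On a four-block chain built with `make_chain(4)`, the top-to-bottom order needs 7 passes to converge while the bottom-to-top order needs 2, and both reach the same livein/liveout sets.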
---
 .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index 95b9d90..29b4a53 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -133,27 +133,9 @@ vec4_live_variables::compute_live_variables()
while (cont) {
   cont = false;
 
-  foreach_block (block, cfg) {
+  foreach_block_reverse (block, cfg) {
  struct block_data *bd = &block_data[block->num];
 
-/* Update livein */
-for (int i = 0; i < bitset_words; i++) {
-BITSET_WORD new_livein = (bd->use[i] |
-  (bd->liveout[i] &
-   ~bd->def[i]));
-if (new_livein & ~bd->livein[i]) {
-   bd->livein[i] |= new_livein;
-   cont = true;
-   }
-}
- BITSET_WORD new_livein = (bd->flag_use[0] |
-   (bd->flag_liveout[0] &
-~bd->flag_def[0]));
- if (new_livein & ~bd->flag_livein[0]) {
-bd->flag_livein[0] |= new_livein;
-cont = true;
- }
-
 /* Update liveout */
 foreach_list_typed(bblock_link, child_link, link, &block->children) {
 struct block_data *child_bd = &block_data[child_link->block->num];
@@ -173,6 +155,24 @@ vec4_live_variables::compute_live_variables()
cont = true;
 }
 }
+
+ /* Update livein */
+ for (int i = 0; i < bitset_words; i++) {
+BITSET_WORD new_livein = (bd->use[i] |
+  (bd->liveout[i] &
+   ~bd->def[i]));
+if (new_livein & ~bd->livein[i]) {
+   bd->livein[i] |= new_livein;
+   cont = true;
+}
+ }
+ BITSET_WORD new_livein = (bd->flag_use[0] |
+   (bd->flag_liveout[0] &
+~bd->flag_def[0]));
+ if (new_livein & ~bd->flag_livein[0]) {
+bd->flag_livein[0] |= new_livein;
+cont = true;
+ }
   }
}
 }
-- 
2.4.3



Re: [Mesa-dev] ARB_arrays_of_arrays GLSL ES

2015-06-25 Thread Jason Ekstrand
On Thu, Jun 25, 2015 at 1:19 AM, Timothy Arceri  wrote:
> On Wed, 2015-06-24 at 11:17 -0700, Jason Ekstrand wrote:
>> On Sat, Jun 20, 2015 at 5:32 AM, Timothy Arceri <
>> t_arc...@yahoo.com.au> wrote:
>> > Hi all,
>> >
>> > The restrictions in ES make the extension easier to implement so
>> > I thought I'd try to get this stuff reviewed and committed before
>> > finishing up the full extension.
>> > The bits that I'm still working on for the desktop version are AoA
>> > inputs, outputs, and interface blocks.
>> >
>> > The only thing I know is definitely missing in this series for ES is
>> > support for indirect indexing of samplers, but that didn't seem
>> > like something that should hold up the series.
>> >
>> > Once the SSBO series lands (with a patch that restricts unsized
>> > arrays) then all the AoA ES conformance tests will pass.
>> >
>> > There are already a bunch of piglit tests in git but I've just sent
>> > a series with all the patches still waiting review here:
>> > http://lists.freedesktop.org/archives/piglit/2015-June/016312.html
>> >
>> > I haven't made a patch marking this as done yet because currently
>> > the i965 backend takes a very long time trying to optimise some of
>> > the conformance tests. They still pass but they are taking 15+
>> > minutes just to compile so this really needs to be sorted out first.
>> > If someone with more knowledge in this area than me wants to take a
>> > look at this I would be grateful for being pointed in the right
>> > direction.
>>
>> Can you try it with this patch:
>>
>> http://lists.freedesktop.org/archives/mesa-dev/2015-June/086011.html
>
> Hi Jason,
>
> I tried that patch and it didn't seem to make any difference; then I
> noticed it's for fs. The slowdown is currently happening in
> vec4_live_variables. I've also noticed other large slowdowns with some
> piglit tests I've been working on, this time in the register allocation
> code.

I just sent an equivalent patch for vec4:

http://patchwork.freedesktop.org/patch/52801/

I would love to know if it helps something so that I can put a good
justification in the commit message beyond "the same as for FS".
--Jason

> Once I finish fixing up some bugs with my current patchset I'll start
> digging deeper. I just thought that, since it's so noticeably slow, it
> might be easy for someone with good knowledge of the backend to point out
> where I should start looking, or to say "you're doing this wrong".
>
> One thing the slow shaders have in common is the use of arrays of
> arrays in nested loops, so maybe something funny is going on when they
> go through the optimisation paths. I haven't spent much time looking at
> any of these passes yet, so it's highly likely they don't work as
> expected for arrays of arrays.
>
> Thanks anyway,
> Tim
>
>
>>
>> > If useful the series is in the 'gles4' branch of the repo here:
>> > https://github.com/tarceri/Mesa_arrays_of_arrays.git
>> >
>> > Thanks,
>> > Tim


Re: [Mesa-dev] [PATCH 08/17] i965/fs: Make better use of the builder in shader_time

2015-06-25 Thread Jason Ekstrand
On Tue, Jun 23, 2015 at 2:09 AM, Pohjolainen, Topi  wrote:
> On Thu, Jun 18, 2015 at 05:51:37PM -0700, Jason Ekstrand wrote:
>> Previously, we were just depending on register widths to ensure that
>> various things were exec_size of 1 etc.  Now, we do so explicitly using the
>> builder.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 14 --
>>  1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index c13ac7d..740b51d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -557,7 +557,7 @@ fs_visitor::get_timestamp(const fs_builder &bld)
>> /* We want to read the 3 fields we care about even if it's not enabled in
>>  * the dispatch.
>>  */
>> -   bld.exec_all().MOV(dst, ts);
>> +   bld.group(4, 0).exec_all().MOV(dst, ts);
>
> Just to make sure I understand correctly, we want SIMD4 in order to read wide
> enough to get all the mentioned 3 fields?

I believe so, yes.

>>
>> /* The caller wants the low 32 bits of the timestamp.  Since it's running
>>  * at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds,
>> @@ -637,17 +637,19 @@ fs_visitor::emit_shader_time_end()
>> start.negate = true;
>> fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1);
>> diff.set_smear(0);
>> -   ibld.ADD(diff, start, shader_end_time);
>> +
>> +   const fs_builder cbld = ibld.group(1, 0);
>> +   cbld.group(1, 0).ADD(diff, start, shader_end_time);
>>
>> /* If there were no instructions between the two timestamp gets, the diff
>>  * is 2 cycles.  Remove that overhead, so I can forget about that when
>>  * trying to determine the time taken for single instructions.
>>  */
>> -   ibld.ADD(diff, diff, fs_reg(-2u));
>> -   SHADER_TIME_ADD(ibld, type, diff);
>> -   SHADER_TIME_ADD(ibld, written_type, fs_reg(1u));
>> +   cbld.ADD(diff, diff, fs_reg(-2u));
>> +   SHADER_TIME_ADD(cbld, type, diff);
>> +   SHADER_TIME_ADD(cbld, written_type, fs_reg(1u));
>> ibld.emit(BRW_OPCODE_ELSE);
>> -   SHADER_TIME_ADD(ibld, reset_type, fs_reg(1u));
>> +   SHADER_TIME_ADD(cbld, reset_type, fs_reg(1u));
>> ibld.emit(BRW_OPCODE_ENDIF);
>>  }
>>
>> --
>> 2.4.3
>>


[Mesa-dev] [PATCH v2 03/19] i965/fs: Fix fs_inst::regs_read() for uniform pull constant loads

2015-06-25 Thread Jason Ekstrand
Previously, fs_inst::regs_read() fell back to depending on the register
width for the second source.  This isn't really correct since it isn't a
SIMD8 value at all, but a SIMD4x2 value.  This commit changes it to
explicitly be always one register.

Reviewed-by: Iago Toral Quiroga 

v2: Use mlen for determining the number of registers read
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 31dfb24..589b74c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -715,6 +715,12 @@ fs_inst::regs_read(int arg) const
  return mlen;
   break;
 
+   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7:
+  /* The payload is actually stored in src1 */
+  if (arg == 1)
+ return mlen;
+  break;
+
case FS_OPCODE_LINTERP:
   if (arg == 0)
  return exec_size / 4;
-- 
2.4.3
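
For readers skimming the archive, the rule this patch adds can be sketched on its own.  This is a minimal illustration, not the Mesa code: toy_opcode, toy_inst and toy_regs_read are hypothetical stand-ins, and the generic fallback assumes 32-bit, stride-1 sources sized by exec_size.

```cpp
#include <cassert>

// Hypothetical, cut-down stand-ins for the i965 types (not the Mesa definitions).
enum toy_opcode {
   TOY_OPCODE_MOV,
   TOY_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7,
};

struct toy_inst {
   toy_opcode opcode;
   int exec_size;   // channels executed by the instruction
   int mlen;        // message length, in registers
};

// Registers read by source `arg`.
static int
toy_regs_read(const toy_inst &inst, int arg)
{
   switch (inst.opcode) {
   case TOY_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7:
      // The payload is stored in src1; its size is the message length,
      // not something derived from the SIMD width.
      if (arg == 1)
         return inst.mlen;
      break;
   default:
      break;
   }

   // Assumed generic case: exec_size dwords, rounded up to 32-byte registers.
   return (inst.exec_size * 4 + 31) / 32;
}
```

With this shape, a SIMD16 pull constant load with mlen == 1 reports one register for src1 rather than the two that a width-based guess would give.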



[Mesa-dev] [PATCH v2 07/19] i965/blorp: Explicitly set execution sizes for new'd instructions

2015-06-25 Thread Jason Ekstrand
This doesn't affect instructions allocated using the builder.

Reviewed-by: Iago Toral Quiroga 
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
index 789520c..d458ad8 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
@@ -73,7 +73,7 @@ brw_blorp_eu_emitter::emit_kill_if_outside_rect(const struct brw_reg &x,
emit_cmp(BRW_CONDITIONAL_L, x, dst_x1)->predicate = BRW_PREDICATE_NORMAL;
emit_cmp(BRW_CONDITIONAL_L, y, dst_y1)->predicate = BRW_PREDICATE_NORMAL;
 
-   fs_inst *inst = new (mem_ctx) fs_inst(BRW_OPCODE_AND, g1, f0, g1);
+   fs_inst *inst = new (mem_ctx) fs_inst(BRW_OPCODE_AND, 16, g1, f0, g1);
inst->force_writemask_all = true;
insts.push_tail(inst);
 }
@@ -84,7 +84,7 @@ brw_blorp_eu_emitter::emit_texture_lookup(const struct brw_reg &dst,
   unsigned base_mrf,
   unsigned msg_length)
 {
-   fs_inst *inst = new (mem_ctx) fs_inst(op, dst, brw_message_reg(base_mrf),
+   fs_inst *inst = new (mem_ctx) fs_inst(op, 16, dst, brw_message_reg(base_mrf),
  fs_reg(0u));
 
inst->base_mrf = base_mrf;
@@ -119,7 +119,8 @@ brw_blorp_eu_emitter::emit_combine(enum opcode combine_opcode,
 {
assert(combine_opcode == BRW_OPCODE_ADD || combine_opcode == BRW_OPCODE_AVG);
 
-   insts.push_tail(new (mem_ctx) fs_inst(combine_opcode, dst, src_1, src_2));
+   insts.push_tail(new (mem_ctx) fs_inst(combine_opcode, 16, dst,
+ src_1, src_2));
 }
 
 fs_inst *
@@ -127,7 +128,7 @@ brw_blorp_eu_emitter::emit_cmp(enum brw_conditional_mod op,
const struct brw_reg &x,
const struct brw_reg &y)
 {
-   fs_inst *cmp = new (mem_ctx) fs_inst(BRW_OPCODE_CMP,
+   fs_inst *cmp = new (mem_ctx) fs_inst(BRW_OPCODE_CMP, 16,
 vec16(brw_null_reg()), x, y);
cmp->conditional_mod = op;
insts.push_tail(cmp);
-- 
2.4.3



[Mesa-dev] [PATCH v2 06/19] i965/fs: Set the builder group for emitting FB-write stencil/AA alpha

2015-06-25 Thread Jason Ekstrand
Reviewed-by: Iago Toral Quiroga 
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 8976c25..2341d02 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1529,7 +1529,7 @@ fs_visitor::emit_single_fb_write(const fs_builder &bld,
 
if (payload.aa_dest_stencil_reg) {
   sources[length] = fs_reg(GRF, alloc.allocate(1));
-  bld.exec_all().annotate("FB write stencil/AA alpha")
+  bld.group(8, 0).exec_all().annotate("FB write stencil/AA alpha")
  .MOV(sources[length],
   fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0)));
   length++;
-- 
2.4.3



[Mesa-dev] [PATCH v2 00/19] i965/fs: Remove the width field from fs_reg

2015-06-25 Thread Jason Ekstrand
This is a re-send of the series I did a week or two ago to remove the width
field from the fs_reg class.  I really didn't want to do a re-send but
there have been enough fixes since then that I thought it was worth
re-sending.  Most of these patches have already been reviewed but not all.

02: New.  Needs to be reviewed by someone familiar with SKL

03: Needs re-review.  This one is affected by 02.

05: Needs re-review.  This one went through a lot of changes to actually
get it right.  It should be the way we want now.

08: New.  It's just moving code around so it should be trivial.

09: New.  This is a complete replacement of patch 07 from the previous
series.

Cc: Topi Pohjolainen 
Cc: Iago Toral Quiroga 
Cc: Francisco Jerez 
Cc: Neil Roberts 

Jason Ekstrand (19):
  i965/fs: Use a switch statement in fs_inst::regs_read()
  i965/fs: Actually set/use the mlen for gen7 uniform pull constant loads
  i965/fs: Fix fs_inst::regs_read() for uniform pull constant loads
  i965/fs: Report the right value in fs_inst::regs_read() for PIXEL_X/Y
  i965/fs: Explicitly set the exec_size on the add(32) in interpolation setup
  i965/fs: Set the builder group for emitting FB-write stencil/AA alpha
  i965/blorp: Explicitly set execution sizes for new'd instructions
  i965/fs: Move offset(fs_reg, unsigned) to brw_fs.h
  i965/fs: Add a builder argument to offset()
  i965/fs: Make better use of the builder in shader_time
  i965/fs: Remove fs_inst constructors that don't take an explicit exec_size
  i965/fs: Use exec_size for determining regs read/written and partial writes
  i965/fs_builder: Use the dispatch width for setting exec sizes
  i965/fs: Remove exec_size guessing from fs_inst::init()
  i965/fs: Use the builder dispatch width instead of dst.width for pull constants
  i965/fs: Use the builder dispatch_width for computing register offsets
  i965/fs: Use exec_size instead of dst.width for computing component size
  i965/fs_generator: Use inst->exec_size for determining hardware reg widths
  i965/fs: Remove the width field from fs_reg

 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp|   9 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 266 -
 src/mesa/drivers/dri/i965/brw_fs.h |  21 ++
 src/mesa/drivers/dri/i965/brw_fs_builder.h |  37 ++-
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |   4 -
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  10 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  23 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  64 ++---
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   4 +-
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  |   3 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 183 +++---
 src/mesa/drivers/dri/i965/brw_ir_fs.h  |  45 +---
 .../drivers/dri/i965/brw_schedule_instructions.cpp |   4 +-
 13 files changed, 287 insertions(+), 386 deletions(-)

-- 
2.4.3



[Mesa-dev] [PATCH v2 11/19] i965/fs: Remove fs_inst constructors that don't take an explicit exec_size

2015-06-25 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 30 ++
 src/mesa/drivers/dri/i965/brw_fs_builder.h |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  6 --
 src/mesa/drivers/dri/i965/brw_ir_fs.h  |  9 +
 4 files changed, 8 insertions(+), 39 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8024fae..d1e253a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -126,9 +126,9 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size)
init(opcode, exec_size, reg_undef, NULL, 0);
 }
 
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst)
+fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst)
 {
-   init(opcode, 0, dst, NULL, 0);
+   init(opcode, exec_size, dst, NULL, 0);
 }
 
 fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
@@ -138,12 +138,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
init(opcode, exec_size, dst, src, 1);
 }
 
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0)
-{
-   const fs_reg src[1] = { src0 };
-   init(opcode, 0, dst, src, 1);
-}
-
 fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
  const fs_reg &src0, const fs_reg &src1)
 {
@@ -151,13 +145,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
init(opcode, exec_size, dst, src, 2);
 }
 
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
- const fs_reg &src1)
-{
-   const fs_reg src[2] = { src0, src1 };
-   init(opcode, 0, dst, src, 2);
-}
-
 fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
  const fs_reg &src0, const fs_reg &src1, const fs_reg &src2)
 {
@@ -165,19 +152,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
init(opcode, exec_size, dst, src, 3);
 }
 
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
- const fs_reg &src1, const fs_reg &src2)
-{
-   const fs_reg src[3] = { src0, src1, src2 };
-   init(opcode, 0, dst, src, 3);
-}
-
-fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst,
- const fs_reg src[], unsigned sources)
-{
-   init(opcode, 0, dst, src, sources);
-}
-
 fs_inst::fs_inst(enum opcode opcode, uint8_t exec_width, const fs_reg &dst,
  const fs_reg src[], unsigned sources)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h b/src/mesa/drivers/dri/i965/brw_fs_builder.h
index 58ac598..c823190 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
@@ -235,7 +235,7 @@ namespace brw {
   instruction *
   emit(enum opcode opcode, const dst_reg &dst) const
   {
- return emit(instruction(opcode, dst));
+ return emit(instruction(opcode, dst.width, dst));
   }
 
   /**
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 61eb904..50d6014 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -109,7 +109,8 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
  if (var->data.location == VARYING_SLOT_POS) {
 reg = *emit_fragcoord_interpolation(var->data.pixel_center_integer,
 var->data.origin_upper_left);
-emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, input, reg), 0xF);
+emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),
+  input, reg), 0xF);
  } else {
 emit_general_interpolation(input, var->name, var->type,
(glsl_interp_qualifier) var->data.interpolation,
@@ -1743,7 +1744,8 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr)
fs_reg dest = get_nir_dest(instr->dest);
dest.type = this->result.type;
unsigned num_components = nir_tex_instr_dest_size(instr);
-   emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, dest, this->result),
+   emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),
+ dest, this->result),
 (1 << num_components) - 1);
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 16b20be..d6b617a 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -189,20 +189,13 @@ public:
 
fs_inst();
fs_inst(enum opcode opcode, uint8_t exec_size);
-   fs_inst(enum opcode opcode, const fs_reg &dst);
+   fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst);
fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
const fs_reg &src0);
-   fs_inst(enum opcode opcode, const fs_re

[Mesa-dev] [PATCH v2 09/19] i965/fs: Add a builder argument to offset()

2015-06-25 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |  42 
 src/mesa/drivers/dri/i965/brw_fs.h   |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  58 +--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 143 ++-
 5 files changed, 128 insertions(+), 119 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6cf9e96..9855bfb 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -267,7 +267,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld,
  inst->mlen = 1 + dispatch_width / 8;
}
 
-   bld.MOV(dst, offset(vec4_result, (const_offset & 3) * scale));
+   bld.MOV(dst, offset(vec4_result, bld, (const_offset & 3) * scale));
 }
 
 /**
@@ -361,7 +361,12 @@ fs_inst::is_copy_payload(const brw::simple_allocator &grf_alloc) const
   reg.width = this->src[i].width;
   if (!this->src[i].equals(reg))
  return false;
-  reg = ::offset(reg, 1);
+
+  if (i < this->header_size) {
+ reg.reg_offset += 1;
+  } else {
+ reg.reg_offset += this->exec_size / 8;
+  }
}
 
return true;
@@ -920,7 +925,7 @@ fs_visitor::emit_fragcoord_interpolation(bool pixel_center_integer,
} else {
   bld.ADD(wpos, this->pixel_x, fs_reg(0.5f));
}
-   wpos = offset(wpos, 1);
+   wpos = offset(wpos, bld, 1);
 
/* gl_FragCoord.y */
if (!flip && pixel_center_integer) {
@@ -936,7 +941,7 @@ fs_visitor::emit_fragcoord_interpolation(bool pixel_center_integer,
 
   bld.ADD(wpos, pixel_y, fs_reg(offset));
}
-   wpos = offset(wpos, 1);
+   wpos = offset(wpos, bld, 1);
 
/* gl_FragCoord.z */
if (devinfo->gen >= 6) {
@@ -946,7 +951,7 @@ fs_visitor::emit_fragcoord_interpolation(bool pixel_center_integer,
this->delta_xy[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
interp_reg(VARYING_SLOT_POS, 2));
}
-   wpos = offset(wpos, 1);
+   wpos = offset(wpos, bld, 1);
 
/* gl_FragCoord.w: Already set up in emit_interpolation */
bld.MOV(wpos, this->wpos_w);
@@ -1029,7 +1034,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const char *name,
/* If there's no incoming setup data for this slot, don't
 * emit interpolation for it.
 */
-   attr = offset(attr, type->vector_elements);
+   attr = offset(attr, bld, type->vector_elements);
location++;
continue;
 }
@@ -1044,7 +1049,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const char *name,
   interp = suboffset(interp, 3);
interp.type = attr.type;
bld.emit(FS_OPCODE_CINTERP, attr, fs_reg(interp));
-  attr = offset(attr, 1);
+  attr = offset(attr, bld, 1);
}
 } else {
/* Smooth/noperspective interpolation case. */
@@ -1082,7 +1087,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const char *name,
if (devinfo->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) {
   bld.MUL(attr, attr, this->pixel_w);
}
-  attr = offset(attr, 1);
+  attr = offset(attr, bld, 1);
}
 
 }
@@ -1190,7 +1195,7 @@ fs_visitor::emit_samplepos_setup()
}
/* Compute gl_SamplePosition.x */
compute_sample_position(pos, int_sample_x);
-   pos = offset(pos, 1);
+   pos = offset(pos, abld, 1);
if (dispatch_width == 8) {
   abld.MOV(int_sample_y, fs_reg(suboffset(sample_pos_reg, 1)));
} else {
@@ -2980,10 +2985,6 @@ fs_visitor::lower_load_payload()
 
   assert(inst->dst.file == MRF || inst->dst.file == GRF);
   assert(inst->saturate == false);
-
-  const fs_builder ibld = bld.group(inst->exec_size, inst->force_sechalf)
- .exec_all(inst->force_writemask_all)
- .at(block, inst);
   fs_reg dst = inst->dst;
 
   /* Get rid of COMPR4.  We'll add it back in if we need it */
@@ -2991,17 +2992,23 @@ fs_visitor::lower_load_payload()
  dst.reg = dst.reg & ~BRW_MRF_COMPR4;
 
   dst.width = 8;
+  const fs_builder hbld = bld.group(8, 0).exec_all().at(block, inst);
+
   for (uint8_t i = 0; i < inst->header_size; i++) {
  if (inst->src[i].file != BAD_FILE) {
 fs_reg mov_dst = retype(dst, BRW_REGISTER_TYPE_UD);
 fs_reg mov_src = retype(inst->src[i], BRW_REGISTER_TYPE_UD);
 mov_src.width = 8;
-ibld.exec_all().MOV(mov_dst, mov_src);
+hbld.MOV(mov_dst, mov_src);
  }
- dst = offset(dst, 1);
+ dst = offset(dst, hbld, 1);
   }
 
   dst.width = inst->exec_size;
+  const fs_builder ibld = bld.group(inst->exec_size, inst->force_sechalf)
+ .exec_all(inst->force_writemask_all)
+   

[Mesa-dev] [PATCH v2 08/19] i965/fs: Move offset(fs_reg, unsigned) to brw_fs.h

2015-06-25 Thread Jason Ekstrand
Shortly, offset() will depend on the builder so we need it moved to some
place where it has access to that.
---
 src/mesa/drivers/dri/i965/brw_fs.h| 21 +
 src/mesa/drivers/dri/i965/brw_ir_fs.h | 21 -
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 243baf6..c1819cc 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -62,6 +62,27 @@ namespace brw {
class fs_live_variables;
 }
 
+static inline fs_reg
+offset(fs_reg reg, unsigned delta)
+{
+   switch (reg.file) {
+   case BAD_FILE:
+  break;
+   case GRF:
+   case MRF:
+   case ATTR:
+  return byte_offset(reg,
+ delta * MAX2(reg.width * reg.stride, 1) *
+ type_sz(reg.type));
+   case UNIFORM:
+  reg.reg_offset += delta;
+  break;
+   default:
+  assert(delta == 0);
+   }
+   return reg;
+}
+
 /**
  * The fragment shader front-end.
  *
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 96dc20d..16b20be 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -129,27 +129,6 @@ horiz_offset(fs_reg reg, unsigned delta)
 }
 
 static inline fs_reg
-offset(fs_reg reg, unsigned delta)
-{
-   switch (reg.file) {
-   case BAD_FILE:
-  break;
-   case GRF:
-   case MRF:
-   case ATTR:
-  return byte_offset(reg,
- delta * MAX2(reg.width * reg.stride, 1) *
- type_sz(reg.type));
-   case UNIFORM:
-  reg.reg_offset += delta;
-  break;
-   default:
-  assert(delta == 0);
-   }
-   return reg;
-}
-
-static inline fs_reg
 component(fs_reg reg, unsigned idx)
 {
assert(reg.subreg_offset == 0);
-- 
2.4.3
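
As a quick check of the arithmetic in the GRF/MRF/ATTR case of offset() being moved here, the byte distance it advances can be worked out in isolation.  This is a sketch only: toy_offset_bytes is a hypothetical helper and std::max stands in for the MAX2 macro.

```cpp
#include <algorithm>
#include <cassert>

// Bytes advanced by offset(reg, delta) for a GRF/MRF/ATTR register, following
// the expression in the patch:
//    delta * MAX2(width * stride, 1) * type_sz(type)
// `type_size` is the size of one element in bytes (4 for a float or dword).
static unsigned
toy_offset_bytes(unsigned delta, unsigned width, unsigned stride,
                 unsigned type_size)
{
   return delta * std::max(width * stride, 1u) * type_size;
}
```

For a SIMD8 float register one step is 8 * 4 = 32 bytes, i.e. one full hardware register, while a SIMD16 float steps by 64 bytes.  A stride of 0 (a scalar) steps by a single element because of the clamp to 1.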



[Mesa-dev] [PATCH v2 02/19] i965/fs: Actually set/use the mlen for gen7 uniform pull constant loads

2015-06-25 Thread Jason Ekstrand
Previously, we were allocating the payload with different sizes per gen and
then figuring out the mlen in the generator based on gen.  This meant,
among other things, that the higher level passes knew nothing about it.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 19 ---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  9 +++--
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3ec8e6a..31dfb24 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2909,14 +2909,18 @@ fs_visitor::lower_uniform_pull_constant_loads()
  assert(const_offset_reg.file == IMM &&
 const_offset_reg.type == BRW_REGISTER_TYPE_UD);
  const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
- fs_reg payload = fs_reg(GRF, alloc.allocate(1));
 
- /* We have to use a message header on Skylake to get SIMD4x2 mode.
-  * Reserve space for the register.
-  */
+ fs_reg payload, offset;
  if (devinfo->gen >= 9) {
-payload.reg_offset++;
-alloc.sizes[payload.reg] = 2;
+/* We have to use a message header on Skylake to get SIMD4x2
+ * mode.  Reserve space for the register.
+*/
+offset = payload = fs_reg(GRF, alloc.allocate(2));
+offset.reg_offset++;
+inst->mlen = 2;
+ } else {
+offset = payload = fs_reg(GRF, alloc.allocate(1));
+inst->mlen = 1;
  }
 
  /* This is actually going to be a MOV, but since only the first dword
@@ -2925,7 +2929,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
   * by live variable analysis, or register allocation will explode.
   */
  fs_inst *setup = new(mem_ctx) fs_inst(FS_OPCODE_SET_SIMD4X2_OFFSET,
-   8, payload, const_offset_reg);
+   8, offset, const_offset_reg);
  setup->force_writemask_all = true;
 
  setup->ir = inst->ir;
@@ -2938,6 +2942,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
   */
  inst->opcode = FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7;
  inst->src[1] = payload;
+ inst->base_mrf = -1;
 
  invalidate_live_intervals();
   } else {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 2ed0bac..8d821ab 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1054,7 +1054,6 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
struct brw_reg index,
struct brw_reg offset)
 {
-   assert(inst->mlen == 0);
assert(index.type == BRW_REGISTER_TYPE_UD);
 
assert(offset.file == BRW_GENERAL_REGISTER_FILE);
@@ -1069,12 +1068,10 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
 
struct brw_reg src = offset;
bool header_present = false;
-   int mlen = 1;
 
if (devinfo->gen >= 9) {
   /* Skylake requires a message header in order to use SIMD4x2 mode. */
-  src = retype(brw_vec4_grf(offset.nr - 1, 0), BRW_REGISTER_TYPE_UD);
-  mlen = 2;
+  src = retype(brw_vec4_grf(offset.nr, 0), BRW_REGISTER_TYPE_UD);
   header_present = true;
 
   brw_push_insn_state(p);
@@ -1105,7 +1102,7 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
   0, /* LD message ignores sampler unit */
   GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
   1, /* rlen */
-  mlen,
+  inst->mlen,
   header_present,
   BRW_SAMPLER_SIMD_MODE_SIMD4X2,
   0);
@@ -1135,7 +1132,7 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
   0, /* LD message ignores sampler unit */
   GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
   1, /* rlen */
-  mlen,
+  inst->mlen,
   header_present,
   BRW_SAMPLER_SIMD_MODE_SIMD4X2,
   0);
-- 
2.4.3
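
The per-generation decision this patch moves out of the generator and into the visitor can be summarised in a few lines.  A minimal sketch, assuming only what the diff shows: toy_payload and toy_pull_constant_payload are hypothetical names, and `gen` plays the role of devinfo->gen.

```cpp
#include <cassert>

// Hypothetical summary of what the visitor now records up front.
struct toy_payload {
   int regs;        // registers allocated for the payload
   int offset_reg;  // register within the payload that receives the offset
   int mlen;        // message length stored on the instruction
};

static toy_payload
toy_pull_constant_payload(int gen)
{
   if (gen >= 9) {
      // Skylake needs a message header to get SIMD4x2 mode, so the payload
      // is header + offset and the offset lands in the second register.
      return toy_payload{2, 1, 2};
   }

   // Earlier gens send the offset alone.
   return toy_payload{1, 0, 1};
}
```

Because mlen is now part of the instruction, passes such as regs_read() can see the real payload size instead of the generator deriving it late from the gen.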



[Mesa-dev] [PATCH v2 05/19] i965/fs: Explicitly set the exec_size on the add(32) in interpolation setup

2015-06-25 Thread Jason Ekstrand
Soon we will start using the builder to explicitly set all the execution
sizes.  We could make a 32-wide builder, but the builder asserts that we
never grow it, which is usually a reasonable assumption.  Since this one
instruction is a bit of an oddball, we just set the exec_size explicitly.

Reviewed-by: Iago Toral Quiroga 

v2: Explicitly new the fs_inst instead of using the builder and setting
exec_size after the fact.

v3: Set force_writemask_all with the builder instead of directly.  The
builder over-writes it if we set it manually.  Also, if we don't have
force_writemask_all in the builder it will assert-fail on SIMD32.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9a4bad6..8976c25 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1357,10 +1357,12 @@ fs_visitor::emit_interpolation_setup_gen6()
*/
   fs_reg int_pixel_xy(GRF, alloc.allocate(dispatch_width / 8),
   BRW_REGISTER_TYPE_UW, dispatch_width * 2);
-  abld.exec_all()
-  .ADD(int_pixel_xy,
-   fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)),
-   fs_reg(brw_imm_v(0x11001010)));
+  fs_inst *add =
+ new (mem_ctx) fs_inst(BRW_OPCODE_ADD, dispatch_width * 2,
+   int_pixel_xy,
+   fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)),
+   fs_reg(brw_imm_v(0x11001010)));
+  abld.exec_all().emit(add);
 
   this->pixel_x = vgrf(glsl_type::float_type);
   this->pixel_y = vgrf(glsl_type::float_type);
-- 
2.4.3
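
To illustrate why a 32-wide builder is awkward, here is a toy model of a builder that can only be narrowed.  It is a sketch under stated assumptions, not brw::fs_builder: toy_builder and can_group() are hypothetical, and the real assertion in group() may be phrased differently (for instance it may be relaxed when force_writemask_all is set).

```cpp
#include <cassert>

class toy_builder {
public:
   explicit toy_builder(unsigned dispatch_width, bool all = false)
      : width_(dispatch_width), exec_all_(all) {}

   unsigned dispatch_width() const { return width_; }
   bool force_writemask_all() const { return exec_all_; }

   // A group of n channels at index i must fit inside the current width.
   bool can_group(unsigned n, unsigned i) const
   {
      return n != 0 && n <= width_ && i < width_ / n;
   }

   // Narrow the builder to the i-th group of n channels; never widen it.
   toy_builder group(unsigned n, unsigned i) const
   {
      assert(can_group(n, i));
      return toy_builder(n, exec_all_);
   }

   // Same width, but instructions ignore the execution mask.
   toy_builder exec_all() const { return toy_builder(width_, true); }

private:
   unsigned width_;
   bool exec_all_;
};
```

In this model a SIMD16 builder can hand out group(8, 0), group(8, 1) or group(1, 0), but asking it for 32 channels fails the check, which is the situation the add(32) sidesteps by constructing the instruction by hand.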



[Mesa-dev] [PATCH v2 10/19] i965/fs: Make better use of the builder in shader_time

2015-06-25 Thread Jason Ekstrand
Previously, we were just depending on register widths to ensure that
various things were exec_size of 1 etc.  Now, we do so explicitly using the
builder.

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 9855bfb..8024fae 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -557,7 +557,7 @@ fs_visitor::get_timestamp(const fs_builder &bld)
/* We want to read the 3 fields we care about even if it's not enabled in
 * the dispatch.
 */
-   bld.exec_all().MOV(dst, ts);
+   bld.group(4, 0).exec_all().MOV(dst, ts);
 
/* The caller wants the low 32 bits of the timestamp.  Since it's running
 * at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds,
@@ -604,17 +604,19 @@ fs_visitor::emit_shader_time_end()
start.negate = true;
fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1);
diff.set_smear(0);
-   ibld.ADD(diff, start, shader_end_time);
+
+   const fs_builder cbld = ibld.group(1, 0);
+   cbld.group(1, 0).ADD(diff, start, shader_end_time);
 
/* If there were no instructions between the two timestamp gets, the diff
 * is 2 cycles.  Remove that overhead, so I can forget about that when
 * trying to determine the time taken for single instructions.
 */
-   ibld.ADD(diff, diff, fs_reg(-2u));
-   SHADER_TIME_ADD(ibld, 0, diff);
-   SHADER_TIME_ADD(ibld, 1, fs_reg(1u));
+   cbld.ADD(diff, diff, fs_reg(-2u));
+   SHADER_TIME_ADD(cbld, 0, diff);
+   SHADER_TIME_ADD(cbld, 1, fs_reg(1u));
ibld.emit(BRW_OPCODE_ELSE);
-   SHADER_TIME_ADD(ibld, 2, fs_reg(1u));
+   SHADER_TIME_ADD(cbld, 2, fs_reg(1u));
ibld.emit(BRW_OPCODE_ENDIF);
 }
 
-- 
2.4.3
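
The two numbers in the comments touched by this patch (the ~3 second rollover and the 2-cycle overhead) can be checked with plain unsigned arithmetic.  This is a worked example only: the helper names are hypothetical and the 1.2 GHz clock is the approximate figure quoted in the comment, not a measured value.

```cpp
#include <cassert>
#include <cstdint>

// Seconds until a 32-bit cycle counter wraps at the given clock rate.
static double
toy_rollover_seconds(double clock_hz)
{
   return 4294967296.0 / clock_hz;   // 2^32 cycles
}

// Elapsed cycles between two 32-bit timestamps, minus the 2-cycle overhead.
// Mirrors the two ADDs in the patch: one with the start value negated, one
// adding the unsigned constant -2u.
static uint32_t
toy_elapsed_cycles(uint32_t start, uint32_t end)
{
   uint32_t diff = end - start;      // wraps modulo 2^32, like the hardware ADD
   return diff + uint32_t(-2);       // adding 0xfffffffe subtracts 2
}
```

At 1.2 GHz the counter wraps after roughly 3.6 seconds, which matches the "~3 seconds" in the comment.  The subtraction stays correct across a single wrap because both operations are modulo 2^32.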



[Mesa-dev] [PATCH v2 12/19] i965/fs: Use exec_size for determining regs read/written and partial writes

2015-06-25 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d1e253a..4f56865 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -101,7 +101,7 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
case MRF:
case ATTR:
   this->regs_written =
- DIV_ROUND_UP(MAX2(dst.width * dst.stride, 1) * type_sz(dst.type), 32);
+ DIV_ROUND_UP(MAX2(exec_size * dst.stride, 1) * type_sz(dst.type), 32);
   break;
case BAD_FILE:
   this->regs_written = 0;
@@ -675,7 +675,7 @@ bool
 fs_inst::is_partial_write() const
 {
return ((this->predicate && this->opcode != BRW_OPCODE_SEL) ||
-   (this->dst.width * type_sz(this->dst.type)) < 32 ||
+   (this->exec_size * type_sz(this->dst.type)) < 32 ||
!this->dst.is_contiguous());
 }
 
@@ -729,8 +729,8 @@ fs_inst::regs_read(int arg) const
   if (src[arg].stride == 0) {
  return 1;
   } else {
- int size = src[arg].width * src[arg].stride * type_sz(src[arg].type);
- return (size + 31) / 32;
+ int size = this->exec_size * src[arg].stride * type_sz(src[arg].type);
+ return DIV_ROUND_UP(size, 32);
   }
case MRF:
   unreachable("MRF registers are not allowed as sources");
-- 
2.4.3
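
The size computations that now use exec_size can be tried out standalone.  A minimal sketch with hypothetical helper names: std::max stands in for MAX2, toy_div_round_up for DIV_ROUND_UP, and a hardware register is taken to be 32 bytes as in the expressions above.

```cpp
#include <algorithm>
#include <cassert>

static unsigned
toy_div_round_up(unsigned a, unsigned b)
{
   return (a + b - 1) / b;
}

// Registers written by an instruction, following
//    DIV_ROUND_UP(MAX2(exec_size * dst.stride, 1) * type_sz(dst.type), 32)
static unsigned
toy_regs_written(unsigned exec_size, unsigned stride, unsigned type_size)
{
   return toy_div_round_up(std::max(exec_size * stride, 1u) * type_size, 32);
}

// The size term of is_partial_write(): less than one full register is written.
static bool
toy_writes_less_than_a_register(unsigned exec_size, unsigned type_size)
{
   return exec_size * type_size < 32;
}
```

A SIMD8 float write covers exactly one register and a SIMD16 float write covers two.  A SIMD4 dword write is only 16 bytes, so it counts as partial regardless of what width the destination register used to claim.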



[Mesa-dev] [PATCH v2 04/19] i965/fs: Report the right value in fs_inst::regs_read() for PIXEL_X/Y

2015-06-25 Thread Jason Ekstrand
Reviewed-by: Iago Toral Quiroga 
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 589b74c..6cf9e96 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -726,6 +726,12 @@ fs_inst::regs_read(int arg) const
  return exec_size / 4;
   break;
 
+   case FS_OPCODE_PIXEL_X:
+   case FS_OPCODE_PIXEL_Y:
+  if (arg == 0)
+ return 2;
+  break;
+
default:
   if (is_tex() && arg == 0 && src[0].file == GRF)
  return mlen;
-- 
2.4.3



[Mesa-dev] [PATCH v2 01/19] i965/fs: Use a switch statement in fs_inst::regs_read()

2015-06-25 Thread Jason Ekstrand
This makes things a little simpler, more efficient, and quite a bit more
readable.

Reviewed-by: Iago Toral Quiroga 
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 45 ++--
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4292aa6..3ec8e6a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -701,28 +701,29 @@ fs_inst::is_partial_write() const
 int
 fs_inst::regs_read(int arg) const
 {
-   if (is_tex() && arg == 0 && src[0].file == GRF) {
-  return mlen;
-   } else if (opcode == FS_OPCODE_FB_WRITE && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_URB_WRITE_SIMD8 && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_UNTYPED_ATOMIC && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_UNTYPED_SURFACE_READ && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_UNTYPED_SURFACE_WRITE && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_TYPED_ATOMIC && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_TYPED_SURFACE_READ && arg == 0) {
-  return mlen;
-   } else if (opcode == SHADER_OPCODE_TYPED_SURFACE_WRITE && arg == 0) {
-  return mlen;
-   } else if (opcode == FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET && arg == 0) {
-  return mlen;
-   } else if (opcode == FS_OPCODE_LINTERP && arg == 0) {
-  return exec_size / 4;
+   switch (opcode) {
+   case FS_OPCODE_FB_WRITE:
+   case SHADER_OPCODE_URB_WRITE_SIMD8:
+   case SHADER_OPCODE_UNTYPED_ATOMIC:
+   case SHADER_OPCODE_UNTYPED_SURFACE_READ:
+   case SHADER_OPCODE_UNTYPED_SURFACE_WRITE:
+   case SHADER_OPCODE_TYPED_ATOMIC:
+   case SHADER_OPCODE_TYPED_SURFACE_READ:
+   case SHADER_OPCODE_TYPED_SURFACE_WRITE:
+   case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
+  if (arg == 0)
+ return mlen;
+  break;
+
+   case FS_OPCODE_LINTERP:
+  if (arg == 0)
+ return exec_size / 4;
+  break;
+
+   default:
+  if (is_tex() && arg == 0 && src[0].file == GRF)
+ return mlen;
+  break;
}
 
switch (src[arg].file) {
-- 
2.4.3



[Mesa-dev] [PATCH v2 17/19] i965/fs: Use exec_size instead of dst.width for computing component size

2015-06-25 Thread Jason Ekstrand
There are a variety of places where we use dst.width / 8 to compute the
size of a single logical channel.  Instead, we should be using exec_size.

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp| 6 +++---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp| 2 +-
 src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp  | 2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp| 2 +-
 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 4 ++--
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index aeaa1c4..6e8d9e8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2274,12 +2274,12 @@ fs_visitor::opt_register_renaming()
 
   if (depth == 0 &&
   inst->dst.file == GRF &&
-  alloc.sizes[inst->dst.reg] == inst->dst.width / 8 &&
+  alloc.sizes[inst->dst.reg] == inst->exec_size / 8 &&
   !inst->is_partial_write()) {
  if (remap[dst] == -1) {
 remap[dst] = dst;
  } else {
-remap[dst] = alloc.allocate(inst->dst.width / 8);
+remap[dst] = alloc.allocate(inst->exec_size / 8);
 inst->dst.reg = remap[dst];
 progress = true;
  }
@@ -2410,7 +2410,7 @@ fs_visitor::compute_to_mrf()
 /* Things returning more than one register would need us to
  * understand coalescing out more than one MOV at a time.
  */
-if (scan_inst->regs_written > scan_inst->dst.width / 8)
+if (scan_inst->regs_written > scan_inst->exec_size / 8)
break;
 
/* SEND instructions can't have MRF as a destination. */
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 29d1f2a..29b46b9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -179,7 +179,7 @@ static void
 create_copy_instr(const fs_builder &bld, fs_inst *inst, fs_reg src, bool 
negate)
 {
int written = inst->regs_written;
-   int dst_width = inst->dst.width / 8;
+   int dst_width = inst->exec_size / 8;
const fs_builder ubld = bld.group(inst->exec_size, inst->force_sechalf)
   .exec_all(inst->force_writemask_all);
fs_inst *copy;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
index 2ad7079..149c0f0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
@@ -196,7 +196,7 @@ fs_visitor::register_coalesce()
 continue;
  }
  reg_to_offset[offset] = inst->dst.reg_offset;
- if (inst->src[0].width == 16)
+ if (inst->exec_size == 16)
 reg_to_offset[offset + 1] = inst->dst.reg_offset + 1;
  mov[offset] = inst;
  channels_remaining -= inst->regs_written;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 7651e96..df24fb9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -912,7 +912,7 @@ fs_visitor::emit_texture(ir_texture_opcode op,
   bld.emit(SHADER_OPCODE_INT_QUOTIENT, fixed_depth, depth, fs_reg(6));
 
   fs_reg *fixed_payload = ralloc_array(mem_ctx, fs_reg, 
inst->regs_written);
-  int components = inst->regs_written / (dst.width / 8);
+  int components = inst->regs_written / (inst->exec_size / 8);
   for (int i = 0; i < components; i++) {
  if (i == 2) {
 fixed_payload[i] = fixed_depth;
diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index ee0add5..b49961f 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -1314,8 +1314,8 @@ fs_instruction_scheduler::choose_instruction_to_schedule()
 * single-result send is probably actually reducing register
 * pressure.
 */
-   if (inst->regs_written <= inst->dst.width / 8 &&
-   chosen_inst->regs_written > chosen_inst->dst.width / 8) {
+   if (inst->regs_written <= inst->exec_size / 8 &&
+   chosen_inst->regs_written > chosen_inst->exec_size / 8) {
   chosen = n;
   continue;
} else if (inst->regs_written > chosen_inst->regs_written) {
-- 
2.4.3



[Mesa-dev] [PATCH v2 13/19] i965/fs_builder: Use the dispatch width for setting exec sizes

2015-06-25 Thread Jason Ekstrand
Previously we used dst.width but the two *should* be the same.

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs_builder.h | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h 
b/src/mesa/drivers/dri/i965/brw_fs_builder.h
index c823190..8af16a0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
@@ -235,7 +235,7 @@ namespace brw {
   instruction *
   emit(enum opcode opcode, const dst_reg &dst) const
   {
- return emit(instruction(opcode, dst.width, dst));
+ return emit(instruction(opcode, dispatch_width(), dst));
   }
 
   /**
@@ -253,11 +253,11 @@ namespace brw {
  case SHADER_OPCODE_SIN:
  case SHADER_OPCODE_COS:
 return fix_math_instruction(
-   emit(instruction(opcode, dst.width, dst,
+   emit(instruction(opcode, dispatch_width(), dst,
 fix_math_operand(src0))));
 
  default:
-return emit(instruction(opcode, dst.width, dst, src0));
+return emit(instruction(opcode, dispatch_width(), dst, src0));
  }
   }
 
@@ -273,12 +273,12 @@ namespace brw {
  case SHADER_OPCODE_INT_QUOTIENT:
  case SHADER_OPCODE_INT_REMAINDER:
 return fix_math_instruction(
-   emit(instruction(opcode, dst.width, dst,
+   emit(instruction(opcode, dispatch_width(), dst,
 fix_math_operand(src0),
 fix_math_operand(src1))));
 
  default:
-return emit(instruction(opcode, dst.width, dst, src0, src1));
+return emit(instruction(opcode, dispatch_width(), dst, src0, 
src1));
 
  }
   }
@@ -295,13 +295,14 @@ namespace brw {
  case BRW_OPCODE_BFI2:
  case BRW_OPCODE_MAD:
  case BRW_OPCODE_LRP:
-return emit(instruction(opcode, dst.width, dst,
+return emit(instruction(opcode, dispatch_width(), dst,
 fix_3src_operand(src0),
 fix_3src_operand(src1),
 fix_3src_operand(src2)));
 
  default:
-return emit(instruction(opcode, dst.width, dst, src0, src1, src2));
+return emit(instruction(opcode, dispatch_width(), dst,
+src0, src1, src2));
  }
   }
 
@@ -517,7 +518,8 @@ namespace brw {
   {
  assert(dst.width % 8 == 0);
  instruction *inst = emit(instruction(SHADER_OPCODE_LOAD_PAYLOAD,
-  dst.width, dst, src, sources));
+  dispatch_width(), dst,
+  src, sources));
  inst->header_size = header_size;
 
  for (unsigned i = 0; i < header_size; i++)
@@ -528,7 +530,7 @@ namespace brw {
  for (unsigned i = header_size; i < sources; ++i)
 assert(src[i].file != GRF ||
src[i].width == dst.width);
- inst->regs_written += (sources - header_size) * (dst.width / 8);
+ inst->regs_written += (sources - header_size) * (dispatch_width() / 
8);
 
  return inst;
   }
-- 
2.4.3



[Mesa-dev] [PATCH v2 15/19] i965/fs: Use the builder dispatch width instead of dst.width for pull constants

2015-06-25 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6e45fa7..aeaa1c4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -188,7 +188,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
bld.ADD(vec4_offset, varying_offset, fs_reg(const_offset & ~3));
 
int scale = 1;
-   if (devinfo->gen == 4 && dst.width == 8) {
+   if (devinfo->gen == 4 && bld.dispatch_width() == 8) {
   /* Pre-gen5, we can either use a SIMD8 message that requires (header,
* u, v, r) as parameters, or we can just use the SIMD16 message
* consisting of (header, u).  We choose the second, at the cost of a
@@ -204,9 +204,9 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
   op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD;
 
assert(dst.width % 8 == 0);
-   int regs_written = 4 * (dst.width / 8) * scale;
+   int regs_written = 4 * (bld.dispatch_width() / 8) * scale;
fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written),
-   dst.type, dst.width);
+   dst.type, bld.dispatch_width());
fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset);
inst->regs_written = regs_written;
 
@@ -216,7 +216,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
   if (devinfo->gen == 4)
  inst->mlen = 3;
   else
- inst->mlen = 1 + dispatch_width / 8;
+ inst->mlen = 1 + bld.dispatch_width() / 8;
}
 
bld.MOV(dst, offset(vec4_result, bld, (const_offset & 3) * scale));
-- 
2.4.3



[Mesa-dev] [PATCH v2 19/19] i965/fs: Remove the width field from fs_reg

2015-06-25 Thread Jason Ekstrand
As of now, the width field is no longer used for anything.  The width field
"seemed like a good idea at the time" but is actually entirely redundant
with the instruction's execution size.  Initially, it gave us the ability
to easily set the instruction's execution size based entirely on register
widths.  With the builder, we can easily set the sizes explicitly and the
width field doesn't have as much purpose.  At this point, it's just
redundant information that can get out of sync so it really needs to go.
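
A rough sketch of the resulting arrangement, using made-up types rather
than the real fs_reg, fs_inst and fs_builder classes: the register carries
no width of its own, and the execution size is taken explicitly from the
builder when the instruction is emitted, so there is only one copy of the
value to keep correct.

```cpp
// Hypothetical register: note that there is no width member.
struct sketch_reg {
   int file;
   int nr;
};

// Hypothetical instruction: the only place the execution size is stored.
struct sketch_inst {
   unsigned exec_size;
   sketch_reg dst;
};

// Hypothetical builder that knows the dispatch width it emits for.
struct sketch_builder {
   unsigned width;

   unsigned dispatch_width() const { return width; }

   // The execution size comes from the builder, never from the register.
   sketch_inst emit(const sketch_reg &dst) const
   {
      sketch_inst inst;
      inst.exec_size = dispatch_width();
      inst.dst = dst;
      return inst;
   }
};
```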

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 62 --
 src/mesa/drivers/dri/i965/brw_fs_builder.h | 19 ++-
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  4 --
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  6 +--
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |  4 +-
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  |  1 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 26 -
 src/mesa/drivers/dri/i965/brw_ir_fs.h  | 15 +-
 8 files changed, 30 insertions(+), 107 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6e8d9e8..a96424e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -203,10 +203,8 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
else
   op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD;
 
-   assert(dst.width % 8 == 0);
int regs_written = 4 * (bld.dispatch_width() / 8) * scale;
-   fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written),
-   dst.type, bld.dispatch_width());
+   fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written), dst.type);
fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset);
inst->regs_written = regs_written;
 
@@ -310,7 +308,6 @@ fs_inst::is_copy_payload(const brw::simple_allocator 
&grf_alloc) const
 
for (int i = 0; i < this->sources; i++) {
   reg.type = this->src[i].type;
-  reg.width = this->src[i].width;
   if (!this->src[i].equals(reg))
  return false;
 
@@ -366,7 +363,6 @@ fs_reg::fs_reg(float f)
this->file = IMM;
this->type = BRW_REGISTER_TYPE_F;
this->fixed_hw_reg.dw1.f = f;
-   this->width = 1;
 }
 
 /** Immediate value constructor. */
@@ -376,7 +372,6 @@ fs_reg::fs_reg(int32_t i)
this->file = IMM;
this->type = BRW_REGISTER_TYPE_D;
this->fixed_hw_reg.dw1.d = i;
-   this->width = 1;
 }
 
 /** Immediate value constructor. */
@@ -386,7 +381,6 @@ fs_reg::fs_reg(uint32_t u)
this->file = IMM;
this->type = BRW_REGISTER_TYPE_UD;
this->fixed_hw_reg.dw1.ud = u;
-   this->width = 1;
 }
 
 /** Vector float immediate value constructor. */
@@ -417,7 +411,6 @@ fs_reg::fs_reg(struct brw_reg fixed_hw_reg)
this->file = HW_REG;
this->fixed_hw_reg = fixed_hw_reg;
this->type = fixed_hw_reg.type;
-   this->width = 1 << fixed_hw_reg.width;
 }
 
 bool
@@ -432,7 +425,6 @@ fs_reg::equals(const fs_reg &r) const
abs == r.abs &&
!reladdr && !r.reladdr &&
memcmp(&fixed_hw_reg, &r.fixed_hw_reg, sizeof(fixed_hw_reg)) == 0 &&
-   width == r.width &&
stride == r.stride);
 }
 
@@ -504,7 +496,7 @@ fs_visitor::get_timestamp(const fs_builder &bld)
   0),
  BRW_REGISTER_TYPE_UD));
 
-   fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 4);
+   fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
 
/* We want to read the 3 fields we care about even if it's not enabled in
 * the dispatch.
@@ -554,7 +546,7 @@ fs_visitor::emit_shader_time_end()
 
fs_reg start = shader_start_time;
start.negate = true;
-   fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1);
+   fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
diff.set_smear(0);
 
const fs_builder cbld = ibld.group(1, 0);
@@ -803,7 +795,7 @@ fs_visitor::vgrf(const glsl_type *const type)
 {
int reg_width = dispatch_width / 8;
return fs_reg(GRF, alloc.allocate(type_size(type) * reg_width),
- brw_type_for_base_type(type), dispatch_width);
+ brw_type_for_base_type(type));
 }
 
 /** Fixed HW reg constructor. */
@@ -813,14 +805,6 @@ fs_reg::fs_reg(enum register_file file, int reg)
this->file = file;
this->reg = reg;
this->type = BRW_REGISTER_TYPE_F;
-
-   switch (file) {
-   case UNIFORM:
-  this->width = 1;
-  break;
-   default:
-  this->width = 8;
-   }
 }
 
 /** Fixed HW reg constructor. */
@@ -830,25 +814,6 @@ fs_reg::fs_reg(enum register_file file, int reg, enum 
brw_reg_type type)
this->file = file;
this->reg = reg;
this->type = type;
-
-   switch (file) {
-   case UNIFORM:
-  this->width = 1;
-  break;
-   default:
-  this->width = 8;
-   }
-}
-
-/** Fixed HW reg constructor.

[Mesa-dev] [PATCH v2 14/19] i965/fs: Remove exec_size guessing from fs_inst::init()

2015-06-25 Thread Jason Ekstrand
Now that all of the non-explicit constructors are gone, we don't need to
guess anymore.

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 22 --
 1 file changed, 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4f56865..6e45fa7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -68,28 +68,6 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const 
fs_reg &dst,
 
assert(dst.file != IMM && dst.file != UNIFORM);
 
-   /* If exec_size == 0, try to guess it from the registers.  Since all
-* manner of things may use hardware registers, we first try to guess
-* based on GRF registers.  If this fails, we will go ahead and take the
-* width from the destination register.
-*/
-   if (this->exec_size == 0) {
-  if (dst.file == GRF) {
- this->exec_size = dst.width;
-  } else {
- for (unsigned i = 0; i < sources; ++i) {
-if (src[i].file != GRF && src[i].file != ATTR)
-   continue;
-
-if (this->exec_size <= 1)
-   this->exec_size = src[i].width;
-assert(src[i].width == 1 || src[i].width == this->exec_size);
- }
-  }
-
-  if (this->exec_size == 0 && dst.file != BAD_FILE)
- this->exec_size = dst.width;
-   }
assert(this->exec_size != 0);
 
this->conditional_mod = BRW_CONDITIONAL_NONE;
-- 
2.4.3



[Mesa-dev] [PATCH v2 18/19] i965/fs_generator: Use inst->exec_size for determining hardware reg widths

2015-06-25 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 8d821ab..0a70bdc 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -48,7 +48,7 @@ static uint32_t brw_file_from_reg(fs_reg *reg)
 }
 
 static struct brw_reg
-brw_reg_from_fs_reg(fs_reg *reg)
+brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg)
 {
struct brw_reg brw_reg;
 
@@ -57,10 +57,10 @@ brw_reg_from_fs_reg(fs_reg *reg)
case MRF:
   if (reg->stride == 0) {
  brw_reg = brw_vec1_reg(brw_file_from_reg(reg), reg->reg, 0);
-  } else if (reg->width < 8) {
+  } else if (inst->exec_size < 8) {
  brw_reg = brw_vec8_reg(brw_file_from_reg(reg), reg->reg, 0);
- brw_reg = stride(brw_reg, reg->width * reg->stride,
-  reg->width, reg->stride);
+ brw_reg = stride(brw_reg, inst->exec_size * reg->stride,
+  inst->exec_size, reg->stride);
   } else {
  /* From the Haswell PRM:
   *
@@ -414,7 +414,7 @@ fs_generator::generate_blorp_fb_write(fs_inst *inst)
brw_fb_WRITE(p,
 16 /* dispatch_width */,
 brw_message_reg(inst->base_mrf),
-brw_reg_from_fs_reg(&inst->src[0]),
+brw_reg_from_fs_reg(inst, &inst->src[0]),
 BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE,
 inst->target,
 inst->mlen,
@@ -1560,7 +1560,7 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
  annotate(p->devinfo, &annotation, cfg, inst, p->next_insn_offset);
 
   for (unsigned int i = 0; i < inst->sources; i++) {
-src[i] = brw_reg_from_fs_reg(&inst->src[i]);
+src[i] = brw_reg_from_fs_reg(inst, &inst->src[i]);
 
 /* The accumulator result appears to get used for the
  * conditional modifier generation.  When negating a UD
@@ -1572,7 +1572,7 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
inst->src[i].type != BRW_REGISTER_TYPE_UD ||
!inst->src[i].negate);
   }
-  dst = brw_reg_from_fs_reg(&inst->dst);
+  dst = brw_reg_from_fs_reg(inst, &inst->dst);
 
   brw_set_default_predicate_control(p, inst->predicate);
   brw_set_default_predicate_inverse(p, inst->predicate_inverse);
-- 
2.4.3



[Mesa-dev] [PATCH v2 16/19] i965/fs: Use the builder dispatch_width for computing register offsets

2015-06-25 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index d4cc43d..d94a842 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -72,7 +72,7 @@ offset(fs_reg reg, const brw::fs_builder& bld, unsigned delta)
case MRF:
case ATTR:
   return byte_offset(reg,
- delta * MAX2(reg.width * reg.stride, 1) *
+ delta * bld.dispatch_width() * reg.stride *
  type_sz(reg.type));
case UNIFORM:
   reg.reg_offset += delta;
-- 
2.4.3



Re: [Mesa-dev] [PATCH 1/4] nir/from_ssa: add a flag to not convert everything to SSA

2015-06-25 Thread Jason Ekstrand
On Thu, Jun 25, 2015 at 12:29 PM, Connor Abbott  wrote:
> We already don't convert constants out of SSA, and in our backend we'd
> like to have only one way of saying something is still in SSA.
>
> The one tricky part about this is that we may now leave some undef
> instructions around if they aren't part of a phi-web, so we have to be
> more careful about deleting them.
>
> Signed-off-by: Connor Abbott 
> ---
>  src/gallium/drivers/vc4/vc4_program.c |  2 +-
>  src/glsl/nir/nir.h|  7 ++-
>  src/glsl/nir/nir_from_ssa.c   | 25 ++---
>  src/mesa/drivers/dri/i965/brw_nir.c   |  2 +-
>  4 files changed, 26 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/drivers/vc4/vc4_program.c 
> b/src/gallium/drivers/vc4/vc4_program.c
> index 2061631..1a550e1 100644
> --- a/src/gallium/drivers/vc4/vc4_program.c
> +++ b/src/gallium/drivers/vc4/vc4_program.c
> @@ -2102,7 +2102,7 @@ vc4_shader_ntq(struct vc4_context *vc4, enum qstage 
> stage,
>
>  nir_remove_dead_variables(c->s);
>
> -nir_convert_from_ssa(c->s);
> +nir_convert_from_ssa(c->s, true);
>
>  if (vc4_debug & VC4_DEBUG_SHADERDB) {
>  fprintf(stderr, "SHADER-DB: %s prog %d/%d: %d NIR 
> instructions\n",
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 697d37e..2116f60 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1676,7 +1676,12 @@ bool nir_ssa_defs_interfere(nir_ssa_def *a, 
> nir_ssa_def *b);
>
>  void nir_convert_to_ssa_impl(nir_function_impl *impl);
>  void nir_convert_to_ssa(nir_shader *shader);
> -void nir_convert_from_ssa(nir_shader *shader);
> +
> +/* If convert_everything is true, convert all values (even those not involved
> + * in a phi node) to registers. If false, only convert SSA values involved in
> + * phi nodes to registers.
> + */
> +void nir_convert_from_ssa(nir_shader *shader, bool convert_everything);

I don't think "convert everything" is really what we want to call it.
A better idea might be to flip the bool and call it phi_webs_only.

With that changed,
Reviewed-by: Jason Ekstrand 
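
A rough sketch of the semantics the flipped flag would have, using made-up
types rather than the real NIR structures. It deliberately ignores the
load_const case, which the patch leaves in SSA form either way:

```cpp
#include <vector>

// Hypothetical, simplified model of an SSA value.
struct sketch_value {
   bool in_phi_web;   // value is a source or destination of some phi
   bool is_register;  // value has been converted out of SSA
};

// With phi_webs_only set, only values involved in phi webs become
// registers; otherwise every value is converted.
static inline void
sketch_convert_from_ssa(std::vector<sketch_value> &values, bool phi_webs_only)
{
   for (auto &v : values) {
      if (v.in_phi_web || !phi_webs_only)
         v.is_register = true;
   }
}
```

In this model, passing phi_webs_only = true corresponds to
convert_everything = false in the patch as posted.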

>
>  bool nir_opt_algebraic(nir_shader *shader);
>  bool nir_opt_algebraic_late(nir_shader *shader);
> diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c
> index 67733e6..966c2fe 100644
> --- a/src/glsl/nir/nir_from_ssa.c
> +++ b/src/glsl/nir/nir_from_ssa.c
> @@ -37,6 +37,7 @@
>  struct from_ssa_state {
> void *mem_ctx;
> void *dead_ctx;
> +   bool convert_everything;
> struct hash_table *merge_node_table;
> nir_instr *instr;
> nir_function_impl *impl;
> @@ -482,6 +483,9 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
>
>reg = node->set->reg;
> } else {
> +  if (!state->convert_everything)
> + return true;
> +
>/* We leave load_const SSA values alone.  They act as immediates to
> * the backend.  If it got coalesced into a phi, that's ok.
> */
> @@ -505,8 +509,15 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
> nir_ssa_def_rewrite_uses(def, nir_src_for_reg(reg), state->mem_ctx);
> assert(list_empty(&def->uses) && list_empty(&def->if_uses));
>
> -   if (def->parent_instr->type == nir_instr_type_ssa_undef)
> +   if (def->parent_instr->type == nir_instr_type_ssa_undef) {
> +  /* If it's an ssa_undef instruction, remove it since we know we just 
> got
> +   * rid of all its uses.
> +   */
> +  nir_instr *parent_instr = def->parent_instr;
> +  nir_instr_remove(parent_instr);
> +  ralloc_steal(state->dead_ctx, parent_instr);
>return true;
> +   }
>
> assert(def->parent_instr->type != nir_instr_type_load_const);
>
> @@ -523,7 +534,7 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
>  }
>
>  /* Resolves ssa definitions to registers.  While we're at it, we also
> - * remove phi nodes and ssa_undef instructions
> + * remove phi nodes.
>   */
>  static bool
>  resolve_registers_block(nir_block *block, void *void_state)
> @@ -534,8 +545,7 @@ resolve_registers_block(nir_block *block, void 
> *void_state)
>state->instr = instr;
>nir_foreach_ssa_def(instr, rewrite_ssa_def, state);
>
> -  if (instr->type == nir_instr_type_ssa_undef ||
> -  instr->type == nir_instr_type_phi) {
> +  if (instr->type == nir_instr_type_phi) {
>   nir_instr_remove(instr);
>   ralloc_steal(state->dead_ctx, instr);
>}
> @@ -765,13 +775,14 @@ resolve_parallel_copies_block(nir_blo

Re: [Mesa-dev] [PATCH 3/4] nir: remove nir_src_get_parent_instr()

2015-06-25 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Thu, Jun 25, 2015 at 12:29 PM, Connor Abbott  wrote:
> It's now unused.
>
> Signed-off-by: Connor Abbott 
> ---
>  src/glsl/nir/nir.h | 10 --
>  1 file changed, 10 deletions(-)
>
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 2116f60..b33c9c5 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -565,16 +565,6 @@ nir_src_for_reg(nir_register *reg)
> return src;
>  }
>
> -static inline nir_instr *
> -nir_src_get_parent_instr(const nir_src *src)
> -{
> -   if (src->is_ssa) {
> -  return src->ssa->parent_instr;
> -   } else {
> -  return src->reg.reg->parent_instr;
> -   }
> -}
> -
>  static inline nir_dest
>  nir_dest_for_reg(nir_register *reg)
>  {
> --
> 2.4.3
>


Re: [Mesa-dev] [PATCH 4/4] nir: remove parent_instr from nir_register

2015-06-25 Thread Jason Ekstrand
Yes, please!  It was nice at the time, but it was always a hack.

Reviewed-by: Jason Ekstrand 

On Thu, Jun 25, 2015 at 12:29 PM, Connor Abbott  wrote:
> It's no longer used
>
> Signed-off-by: Connor Abbott 
> ---
>  src/glsl/nir/nir.c  | 1 -
>  src/glsl/nir/nir.h  | 8 
>  src/glsl/nir/nir_from_ssa.c | 8 
>  3 files changed, 17 deletions(-)
>
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index f03e80a..f661249 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -57,7 +57,6 @@ reg_create(void *mem_ctx, struct exec_list *list)
>  {
> nir_register *reg = ralloc(mem_ctx, nir_register);
>
> -   reg->parent_instr = NULL;
> list_inithead(&reg->uses);
> list_inithead(&reg->defs);
> list_inithead(&reg->if_uses);
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index b33c9c5..e818acc 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -389,14 +389,6 @@ typedef struct {
>  */
> bool is_packed;
>
> -   /**
> -* If this pointer is non-NULL then this register has exactly one
> -* definition and that definition dominates all of its uses.  This is
> -* set by the out-of-SSA pass so that backends can get SSA-like
> -* information even once they have gone out of SSA.
> -*/
> -   struct nir_instr *parent_instr;
> -
> /** set of nir_instr's where this register is used (read from) */
> struct list_head uses;
>
> diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c
> index 966c2fe..57bbdde 100644
> --- a/src/glsl/nir/nir_from_ssa.c
> +++ b/src/glsl/nir/nir_from_ssa.c
> @@ -496,14 +496,6 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
>reg->name = def->name;
>reg->num_components = def->num_components;
>reg->num_array_elems = 0;
> -
> -  /* This register comes from an SSA definition that is defined and not
> -   * part of a phi-web.  Therefore, we know it has a single unique
> -   * definition that dominates all of its uses; we can copy the
> -   * parent_instr from the SSA def safely.
> -   */
> -  if (def->parent_instr->type != nir_instr_type_ssa_undef)
> - reg->parent_instr = def->parent_instr;
> }
>
> nir_ssa_def_rewrite_uses(def, nir_src_for_reg(reg), state->mem_ctx);
> --
> 2.4.3
>


Re: [Mesa-dev] [PATCH 2/4] i965/fs: use SSA values directly

2015-06-25 Thread Jason Ekstrand
> +}
> +
>  static fs_reg
>  fs_reg_for_nir_reg(fs_visitor *v, nir_register *nir_reg,
> unsigned base_offset, nir_src *indirect)
> @@ -1171,30 +1190,30 @@ fs_reg_for_nir_reg(fs_visitor *v, nir_register 
> *nir_reg,
>  fs_reg
>  fs_visitor::get_nir_src(nir_src src)
>  {
> +   fs_reg reg;
> if (src.is_ssa) {
> -  assert(src.ssa->parent_instr->type == nir_instr_type_load_const);
> -  nir_load_const_instr *load = 
> nir_instr_as_load_const(src.ssa->parent_instr);
> -  fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_D, src.ssa->num_components);
> -
> -  for (unsigned i = 0; i < src.ssa->num_components; ++i)
> - bld.MOV(offset(reg, i), fs_reg(load->value.i[i]));
> -
> -  return reg;

I understand that moving this stuff to emit_load_const has a very nice
unifying effect on the visitor.  However, it also has a subtle effect
on generated code that's worth at least documenting in the commit
message.  In particular, the MOV(dst, imm) is now in a different basic
block than the instruction that was using it.  I don't think it's a
big deal, but you should put it in the commit message.

Reviewed-by: Jason Ekstrand 

> +  reg = nir_ssa_values[src.ssa->index];
> } else {
> -  fs_reg reg = fs_reg_for_nir_reg(this, src.reg.reg, src.reg.base_offset,
> -  src.reg.indirect);
> -
> -  /* to avoid floating-point denorm flushing problems, set the type by
> -   * default to D - instructions that need floating point semantics will 
> set
> -   * this to F if they need to
> -   */
> -  return retype(reg, BRW_REGISTER_TYPE_D);
> +  reg = fs_reg_for_nir_reg(this, src.reg.reg, src.reg.base_offset,
> +   src.reg.indirect);
> }
> +
> +   /* to avoid floating-point denorm flushing problems, set the type by
> +* default to D - instructions that need floating point semantics will set
> +* this to F if they need to
> +*/
> +   return retype(reg, BRW_REGISTER_TYPE_D);
>  }
>
>  fs_reg
>  fs_visitor::get_nir_dest(nir_dest dest)
>  {
> +   if (dest.is_ssa) {
> +  nir_ssa_values[dest.ssa.index] = bld.vgrf(BRW_REGISTER_TYPE_F,
> +dest.ssa.num_components);
> +  return nir_ssa_values[dest.ssa.index];
> +   }
> +
> return fs_reg_for_nir_reg(this, dest.reg.reg, dest.reg.base_offset,
>   dest.reg.indirect);
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 9a4bad6..90f6219 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -2012,6 +2012,7 @@ fs_visitor::fs_visitor(const struct brw_compiler 
> *compiler, void *log_data,
> this->no16_msg = NULL;
>
> this->nir_locals = NULL;
> +   this->nir_ssa_values = NULL;
> this->nir_globals = NULL;
>
> memset(&this->payload, 0, sizeof(this->payload));
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
> b/src/mesa/drivers/dri/i965/brw_nir.c
> index 3e154c1..d87e783 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -156,7 +156,7 @@ brw_create_nir(struct brw_context *brw,
>nir_print_shader(nir, stderr);
> }
>
> -   nir_convert_from_ssa(nir, true);
> +   nir_convert_from_ssa(nir, false);
> nir_validate_shader(nir);
>
> /* This is the last pass we run before we start emitting stuff.  It
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c 
> b/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
> index f0b018c..9eb0ed9 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
> @@ -43,8 +43,8 @@
>  static uint8_t
>  get_resolve_status_for_src(nir_src *src)
>  {
> -   nir_instr *src_instr = nir_src_get_parent_instr(src);
> -   if (src_instr) {
> +   if (src->is_ssa) {
> +  nir_instr *src_instr = src->ssa->parent_instr;
>uint8_t resolve_status = src_instr->pass_flags & BRW_NIR_BOOLEAN_MASK;
>
>/* If the source instruction needs resolve, then from the perspective
> @@ -66,8 +66,8 @@ get_resolve_status_for_src(nir_src *src)
>  static bool
>  src_mark_needs_resolve(nir_src *src, void *void_state)
>  {
> -   nir_instr *src_instr = nir_src_get_parent_instr(src);
> -   if (src_instr) {
> +   if (src->is_ssa) {
> +  nir_instr *src_instr = src->ssa->parent_instr;
>uint8_t resolve_status = src_instr->pass_flags &

Re: [Mesa-dev] [PATCH 0/4] i965: use SSA values when we can

2015-06-25 Thread Jason Ekstrand
And, you got some shader-db stats:

total instructions in shared programs: 6078991 -> 6073118 (-0.10%)
instructions in affected programs: 402221 -> 396348 (-1.46%)
helped:1527
HURT:  0
GAINED:8
LOST:  2
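
The percentages above follow directly from the raw instruction counts. As a
quick check, a small helper (hypothetical, not part of shader-db) that
computes the relative change:

```cpp
// Relative change from `before` to `after`, in percent.
// Negative values mean the instruction count went down.
static inline double
percent_change(long before, long after)
{
   return 100.0 * static_cast<double>(after - before) /
          static_cast<double>(before);
}
```

Both rows differ by the same 5873 instructions; relative to all shared
programs that is roughly -0.10%, and relative to the affected programs
roughly -1.46%.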

I'm not sure which commit it was that helped.  I'm guessing one of our
on-the-fly peepholes started working better.


On Thu, Jun 25, 2015 at 12:29 PM, Connor Abbott  wrote:
> Before, we were using a hack where when we converted out of SSA, we set
> a "parent_instr" field of the nir_register to indicate that the register
> was actually an SSA value. But in the future, we want to handle SSA
> values directly, and right now we're creating an extra nir_register for
> everything, even if it's not involved in a phi node. This series removes
> that hack for i965 and gets us using SSA values directly in most cases.
>
> The only other user of nir_convert_from_ssa() is vc4, which I believed I
> changed correctly, and it doesn't seem to use
> nir_register::parent_instr, based on my grepping. I tried to
> compile-test it, but it assumed I was using the simulator and died, so
> it would be nice to at least compile-test it.
>
> The changes are also available at:
>
> git://people.freedesktop.org/~cwabbott0/mesa i965-use-ssa
>
> Connor Abbott (4):
>   nir/from_ssa: add a flag to not convert everything to SSA
>   i965/fs: use SSA values directly
>   nir: remove nir_src_get_parent_instr()
>   nir: remove parent_instr from nir_register
>
>  src/gallium/drivers/vc4/vc4_program.c  |  2 +-
>  src/glsl/nir/nir.c |  1 -
>  src/glsl/nir/nir.h | 25 ++--
>  src/glsl/nir/nir_from_ssa.c| 33 +-
>  src/mesa/drivers/dri/i965/brw_fs.h |  5 ++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 73 
> ++
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  1 +
>  src/mesa/drivers/dri/i965/brw_nir.c|  2 +-
>  .../dri/i965/brw_nir_analyze_boolean_resolves.c| 12 ++--
>  9 files changed, 84 insertions(+), 70 deletions(-)
>
> --
> 2.4.3
>
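
The percentages in the shader-db summary at the top of this reply can be
re-derived from the instruction counts. A minimal sketch, assuming the
usual (after - before) / before definition of relative change;
percent_change() is a hypothetical helper and not part of shader-db:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative change in percent, negative when the
// instruction count went down.
static inline double
percent_change(long before, long after)
{
   assert(before != 0);
   return 100.0 * static_cast<double>(after - before) /
          static_cast<double>(before);
}
```

Both rows lose the same 5873 instructions (6078991 - 6073118 and
402221 - 396348), so only the denominator differs between the "shared
programs" and "affected programs" figures.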


[Mesa-dev] [PATCH] nir: Make C++ more happy with NIR_SRC_INIT and NIR_DEST_INIT

2015-06-26 Thread Jason Ekstrand
In C, if you partially initialize a structure, the rest of the struct gets
set to 0.  C++, however, does not have this rule so GCC throws warnings
whenever NIR_SRC_INIT or NIR_DEST_INIT is used in C++.  Since nir.h contains
a static inline that uses NIR_SRC_INIT, every C++ file that includes nir.h
complains about this.

This patch adds a small static inline function that makes a struct,
memsets it to 0, and returns it.  NIR_SRC_INIT and NIR_DEST_INIT are then
wrappers around this function.
---
 src/glsl/nir/nir.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index c666d93..3634f30 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -511,7 +511,18 @@ typedef struct nir_src {
bool is_ssa;
 } nir_src;
 
+#ifdef __cplusplus
+static inline nir_src
+__nir_src_init(void)
+{
+   nir_src src;
+   memset(&src, 0, sizeof(src));
+   return src;
+}
+#define NIR_SRC_INIT (__nir_src_init())
+#else
 #define NIR_SRC_INIT (nir_src) { { NULL } }
+#endif
 
 #define nir_foreach_use(reg_or_ssa_def, src) \
list_for_each_entry(nir_src, src, &(reg_or_ssa_def)->uses, use_link)
@@ -534,7 +545,18 @@ typedef struct {
bool is_ssa;
 } nir_dest;
 
+#ifdef __cplusplus
+static inline nir_dest
+__nir_dest_init(void)
+{
+   nir_dest dest;
+   memset(&dest, 0, sizeof(dest));
+   return dest;
+}
+#define NIR_DEST_INIT (__nir_dest_init())
+#else
 #define NIR_DEST_INIT (nir_dest) { { { NULL } } }
+#endif
 
 #define nir_foreach_def(reg, dest) \
list_for_each_entry(nir_dest, dest, &(reg)->defs, reg.def_link)
-- 
2.4.3



Re: [Mesa-dev] [PATCH v2 16/19] i965/fs: Use the builder dispatch_width for computing register offsets

2015-06-26 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 8:52 AM, Francisco Jerez  wrote:
> Jason Ekstrand  writes:
>
>> Reviewed-by: Topi Pohjolainen 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
>> b/src/mesa/drivers/dri/i965/brw_fs.h
>> index d4cc43d..d94a842 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -72,7 +72,7 @@ offset(fs_reg reg, const brw::fs_builder& bld, unsigned 
>> delta)
>> case MRF:
>> case ATTR:
>>return byte_offset(reg,
>> - delta * MAX2(reg.width * reg.stride, 1) *
>> + delta * bld.dispatch_width() * reg.stride *
>
> Er...  This doesn't look right for stride == 0.  If you keep the
> MAX2(.., 1) expression this patch is:

I don't think offset() even makes sense for something with stride ==
0.  I added "assert(stride != 0)" right above the byte_offset() call
and it passed Jenkins.  Would that be an acceptable alternative?
--Jason

> Reviewed-by: Francisco Jerez 
>
>>   type_sz(reg.type));
>> case UNIFORM:
>>reg.reg_offset += delta;
>> --
>> 2.4.3
>>
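
To make the stride == 0 concern concrete, here is a minimal sketch of
just the arithmetic in the hunk above. The function names are
hypothetical and the real offset() works on an fs_reg and calls
byte_offset(); only the multiplication is modelled here:

```cpp
#include <algorithm>
#include <cassert>

// Old expression: delta * MAX2(reg.width * reg.stride, 1) * type_sz(type)
static inline unsigned
old_offset_bytes(unsigned delta, unsigned width, unsigned stride,
                 unsigned type_size)
{
   return delta * std::max(width * stride, 1u) * type_size;
}

// New expression: delta * bld.dispatch_width() * reg.stride * type_sz(type)
static inline unsigned
new_offset_bytes(unsigned delta, unsigned dispatch_width, unsigned stride,
                 unsigned type_size)
{
   return delta * dispatch_width * stride * type_size;
}

// The alternative proposed in the reply: reject stride == 0 outright
// instead of clamping the per-step size to one element.
static inline unsigned
asserted_offset_bytes(unsigned delta, unsigned dispatch_width,
                      unsigned stride, unsigned type_size)
{
   assert(stride != 0);
   return delta * dispatch_width * stride * type_size;
}
```

For a SIMD8 register with stride 1 and a 4-byte type, both expressions
step by 32 bytes per delta. With stride 0 the old expression still steps
by one element (4 bytes), while the new one yields offset 0 for every
delta, which is why either the MAX2() or an assertion is needed.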


Re: [Mesa-dev] [PATCH] nir: Make C++ more happy with NIR_SRC_INIT and NIR_DEST_INIT

2015-06-26 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 12:08 PM, Francisco Jerez  wrote:
> Jason Ekstrand  writes:
>
>> In C, if you partially initialize a structure, the rest of the struct gets
>> set to 0.  C++, however, does not have this rule so GCC throws warnings
>> whenever NIR_SRC_INIT or NIR_DEST_INIT is used in C++.
>
> I don't think that's right, in C++ initializers missing from an
> aggregate initializer list are also defined to be initialized
> (value-initialized to be more precise, what would set them to zero in
> this case just like in C).

Yes, that is correct.  I just did a second attempt that, instead,
defines a static const variable named NIR_SRC_INIT with a partial
initializer.  C++ still gets grumpy and gives me a pile of "missing
initializer" warnings.

>> Since nir.h contains a static inline that uses NIR_SRC_INIT, every C++
>> file that includes nir.h complains about this.
>>
> I suspect the reason why this causes a warning may be that you're using
> compound literals? (which are a C99-specific feature and not part of C++)
>
>> This patch adds a small static inline function that makes a struct,
>> memsets it to 0, and returns it.  NIR_SRC_INIT and NIR_DEST_INIT are then
>> wrappers around this function.
>
> In C++ you could just call the implicitly defined default constructor
> for nir_src or nir_dest, like 'nir_src()'.

The implicitly defined default constructor does nothing to POD types,
so doing so would explicitly *not* perform the desired action of
zeroing out the data.

>> ---
>>  src/glsl/nir/nir.h | 22 ++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index c666d93..3634f30 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -511,7 +511,18 @@ typedef struct nir_src {
>> bool is_ssa;
>>  } nir_src;
>>
>> +#ifdef __cplusplus
>> +static inline nir_src
>> +__nir_src_init(void)
>> +{
>> +   nir_src src;
>> +   memset(&src, 0, sizeof(src));
>> +   return src;
>> +}
>> +#define NIR_SRC_INIT (__nir_src_init())
>> +#else
>>  #define NIR_SRC_INIT (nir_src) { { NULL } }
>> +#endif
>>
>>  #define nir_foreach_use(reg_or_ssa_def, src) \
>> list_for_each_entry(nir_src, src, &(reg_or_ssa_def)->uses, use_link)
>> @@ -534,7 +545,18 @@ typedef struct {
>> bool is_ssa;
>>  } nir_dest;
>>
>> +#ifdef __cplusplus
>> +static inline nir_dest
>> +__nir_dest_init(void)
>> +{
>> +   nir_dest dest;
>> +   memset(&dest, 0, sizeof(dest));
>> +   return dest;
>> +}
>> +#define NIR_DEST_INIT (__nir_dest_init())
>> +#else
>>  #define NIR_DEST_INIT (nir_dest) { { { NULL } } }
>> +#endif
>>
>>  #define nir_foreach_def(reg, dest) \
>> list_for_each_entry(nir_dest, dest, &(reg)->defs, reg.def_link)
>> --
>> 2.4.3
>>


Re: [Mesa-dev] [PATCH] nir: Make C++ more happy with NIR_SRC_INIT and NIR_DEST_INIT

2015-06-26 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 3:03 PM, Francisco Jerez  wrote:
> Jason Ekstrand  writes:
>
>> On Fri, Jun 26, 2015 at 12:08 PM, Francisco Jerez  
>> wrote:
>>> Jason Ekstrand  writes:
>>>
>>>> In C, if you partially initialize a structure, the rest of the struct gets
>>>> set to 0.  C++, however, does not have this rule so GCC throws warnings
>>>> whenever NIR_SRC_INIT or NIR_DEST_INIT is used in C++.
>>>
>>> I don't think that's right, in C++ initializers missing from an
>>> aggregate initializer list are also defined to be initialized
>>> (value-initialized to be more precise, what would set them to zero in
>>> this case just like in C).
>>
>> Yes, that is correct.  I just did a second attempt that, instead,
>> defines a static const variable named NIR_SRC_INIT with a partial
>> initializer.  C++ still gets grumpy and gives me a pile of "missing
>> initializer" warnings.
>>
> That's likely related to the warning flags you have enabled in CXXFLAGS,
> not to C++ itself.  Maybe you have -Wmissing-field-initializers enabled
> for C++ only?
>
>>>> Since nir.h contains a static inline that uses NIR_SRC_INIT, every C++
>>>> file that includes nir.h complains about this.
>>>>
>>> I suspect the reason why this causes a warning may be that you're using
>>> compound literals? (which are a C99-specific feature and not part of C++)
>>>
>>>> This patch adds a small static inline function that makes a struct,
>>>> memsets it to 0, and returns it.  NIR_SRC_INIT and NIR_DEST_INIT are then
>>>> wrappers around this function.
>>>
>>> In C++ you could just call the implicitly defined default constructor
>>> for nir_src or nir_dest, like 'nir_src()'.
>>
>> The implicitly defined default constructor does nothing to POD types,
>> so doing so would explicitly *not* perform the desired action of
>> zeroing out the data.
>>
>
> Indeed, but 'nir_src()' doesn't only call the implicitly-defined trivial
> default constructor, it value-initializes the object (See section 8.5/8
> of the C++14 spec) what for POD types causes all members to be
> zero-initialized.

It looks like this greatly depends on your C++ version.  If it's C++11
or above, I believe it does get zero-initialized.  If it's earlier
than C++11, it doesn't.  At least that's the way I read this:

http://en.cppreference.com/w/cpp/language/value_initialization


Re: [Mesa-dev] [PATCH] i965/vs: Move compute_clip_distance() out of emit_urb_writes().

2015-06-26 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 3:56 PM, Kenneth Graunke  wrote:
> Legacy user clipping (using gl_Position or gl_ClipVertex) is handled by
> turning those into the modern gl_ClipDistance equivalents.
>
> This is unnecessary in Core Profile: if user clipping is enabled, but
> the shader doesn't write the corresponding gl_ClipDistance entry,
> results are undefined.  Hence, it is also unnecessary for geometry
> shaders.
>
> This patch moves the call up to run_vs().  This is equivalent for VS,
> but removes the need to pass clip distances into emit_urb_writes().
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp |4 +++-
>  src/mesa/drivers/dri/i965/brw_fs.h   |2 +-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   16 +++-
>  3 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4292aa6..8658554 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3816,7 +3816,9 @@ fs_visitor::run_vs(gl_clip_plane *clip_planes)
> if (failed)
>return false;
>
> -   emit_urb_writes(clip_planes);
> +   compute_clip_distance(clip_planes);
> +
> +   emit_urb_writes();
>
> if (shader_time_index >= 0)
>emit_shader_time_end();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 243baf6..d08d438 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -271,7 +271,7 @@ public:
>   fs_reg src0_alpha, unsigned components,
>   unsigned exec_size, bool use_2nd_half = 
> false);
> void emit_fb_writes();
> -   void emit_urb_writes(gl_clip_plane *clip_planes);
> +   void emit_urb_writes();
> void emit_cs_terminate();
>
> void emit_barrier();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 7074b5c..854e49b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1730,6 +1730,12 @@ 
> fs_visitor::setup_uniform_clipplane_values(gl_clip_plane *clip_planes)
> }
>  }
>
> +/**
> + * Lower legacy fixed-function and gl_ClipVertex clipping to clip distances.
> + *
> + * This does nothing if the shader uses gl_ClipDistance or user clipping is
> + * disabled altogether.
> + */
>  void fs_visitor::compute_clip_distance(gl_clip_plane *clip_planes)
>  {
> struct brw_vue_prog_data *vue_prog_data =
> @@ -1737,6 +1743,10 @@ void fs_visitor::compute_clip_distance(gl_clip_plane 
> *clip_planes)
> const struct brw_vue_prog_key *key =
>(const struct brw_vue_prog_key *) this->key;
>
> +   /* Bail unless some sort of legacy clipping is enabled */
> +   if (!key->userclip_active || prog->UsesClipDistanceOut)
> +  return;
> +

Any reason why you changed this from a conditional call to
compute_clip_distance to an early return?  I don't know that I care
much either way.

Thanks for making this less gross.

Reviewed-by: Jason Ekstrand 

> /* From the GLSL 1.30 spec, section 7.1 (Vertex Shader Special Variables):
>  *
>  * "If a linked set of shaders forming the vertex stage contains no
> @@ -1780,7 +1790,7 @@ void fs_visitor::compute_clip_distance(gl_clip_plane 
> *clip_planes)
>  }
>
>  void
> -fs_visitor::emit_urb_writes(gl_clip_plane *clip_planes)
> +fs_visitor::emit_urb_writes()
>  {
> int slot, urb_offset, length;
> struct brw_vs_prog_data *vs_prog_data =
> @@ -1793,10 +1803,6 @@ fs_visitor::emit_urb_writes(gl_clip_plane *clip_planes)
> bool flush;
> fs_reg sources[8];
>
> -   /* Lower legacy ff and ClipVertex clipping to clip distances */
> -   if (key->base.userclip_active && !prog->UsesClipDistanceOut)
> -  compute_clip_distance(clip_planes);
> -
> /* If we don't have any valid slots to write, just do a minimal urb write
>  * send to terminate the shader. */
> if (vue_map->slots_valid == 0) {
> --
> 1.7.10.4
>
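
On the early-return question: the two shapes should be equivalent by
De Morgan's law. A minimal sketch, with hypothetical predicate names and
only the two flags from the hunk modelled (the real code reads them from
the program key and the gl_program):

```cpp
#include <cassert>

// Old shape, guard at the call site:
//    if (userclip_active && !UsesClipDistanceOut)
//       compute_clip_distance(clip_planes);
static inline bool
runs_with_caller_guard(bool userclip_active, bool uses_clip_distance_out)
{
   if (userclip_active && !uses_clip_distance_out)
      return true;    // the lowering would be called here
   return false;
}

// New shape, guard inside the callee:
//    if (!userclip_active || UsesClipDistanceOut)
//       return;
static inline bool
runs_with_early_return(bool userclip_active, bool uses_clip_distance_out)
{
   if (!userclip_active || uses_clip_distance_out)
      return false;   // early return, nothing is lowered
   return true;       // the lowering proceeds
}
```

The practical difference is only where the condition lives: with the
early return, every caller of compute_clip_distance() gets the check
without having to repeat it.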


Re: [Mesa-dev] [PATCH] nir: Make C++ more happy with NIR_SRC_INIT and NIR_DEST_INIT

2015-06-26 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 3:34 PM, Francisco Jerez  wrote:
> Jason Ekstrand  writes:
>
>> On Fri, Jun 26, 2015 at 3:03 PM, Francisco Jerez  
>> wrote:
>>> Jason Ekstrand  writes:
>>>
>>>> On Fri, Jun 26, 2015 at 12:08 PM, Francisco Jerez  
>>>> wrote:
>>>>> Jason Ekstrand  writes:
>>>>>
>>>>>> In C, if you partially initialize a structure, the rest of the struct 
>>>>>> gets
>>>>>> set to 0.  C++, however, does not have this rule so GCC throws warnings
>>>>>> whenever NIR_SRC_INIT or NIR_DEST_INIT is used in C++.
>>>>>
>>>>> I don't think that's right, in C++ initializers missing from an
>>>>> aggregate initializer list are also defined to be initialized
>>>>> (value-initialized to be more precise, what would set them to zero in
>>>>> this case just like in C).
>>>>
>>>> Yes, that is correct.  I just did a second attempt that, instead,
>>>> defines a static const variable named NIR_SRC_INIT with a partial
>>>> initializer.  C++ still gets grumpy and gives me a pile of "missing
>>>> initializer" warnings.
>>>>
>>> That's likely related to the warning flags you have enabled in CXXFLAGS,
>>> not to C++ itself.  Maybe you have -Wmissing-field-initializers enabled
>>> for C++ only?
>>>
>>>>>> Since nir.h contains a static inline that uses NIR_SRC_INIT, every C++
>>>>>> file that includes nir.h complains about this.
>>>>>>
>>>>> I suspect the reason why this causes a warning may be that you're using
>>>>> compound literals? (which are a C99-specific feature and not part of C++)
>>>>>
>>>>>> This patch adds a small static inline function that makes a struct,
>>>>>> memsets it to 0, and returns it.  NIR_SRC_INIT and NIR_DEST_INIT are then
>>>>>> wrappers around this function.
>>>>>
>>>>> In C++ you could just call the implicitly defined default constructor
>>>>> for nir_src or nir_dest, like 'nir_src()'.
>>>>
>>>> The implicitly defined default constructor does nothing to POD types,
>>>> so doing so would explicitly *not* perform the desired action of
>>>> zeroing out the data.
>>>>
>>>
>>> Indeed, but 'nir_src()' doesn't only call the implicitly-defined trivial
>>> default constructor, it value-initializes the object (See section 8.5/8
>>> of the C++14 spec) what for POD types causes all members to be
>>> zero-initialized.
>>
>> It looks like this greatly depends on your C++ version.  If it's C++11
>> or above, I believe it does get zero-initialized.  If it's earlier
>> than C++11, it doesn't.  At least that's the way I read this:
>>
>> http://en.cppreference.com/w/cpp/language/value_initialization
>
> Not really, it will get zero-initialized back to C++98.  AFAICT what the
> article is trying to say is that in C++98 what is now referred to as
> value-initialization used to be called default-initialization in the
> spec, but still it had the effect of zero-initializing the structure.

Ok, I did some more reading and I think I'm convinced now.  Figuring
out what "nir_src src = nir_src()" actually does should *not* take
this much research.  I'll send an updated patch on Monday.
--Jason
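
For reference, a minimal sketch of the behaviour this thread converged
on. example_src is a hypothetical stand-in for a POD struct like nir_src
(the real definition in nir.h has different members); the point is only
that 'T()' value-initializes the object, which zero-initializes every
member when there is no user-provided constructor:

```cpp
#include <cassert>

// Hypothetical POD struct; not the real nir_src layout.
struct example_src {
   void *parent;
   struct {
      void *def;
      unsigned base_offset;
   } reg;
   bool is_ssa;
};

// 'example_src()' value-initializes the temporary, so all members are
// zero without any memset().
static inline example_src
example_src_init()
{
   return example_src();
}

// An empty aggregate initializer has the same effect: members without
// an explicit initializer are value-initialized.
static inline example_src
example_src_init_braces()
{
   example_src src = {};
   return src;
}
```

Under that reading, a C++ definition of NIR_SRC_INIT could simply expand
to a value-initialized temporary. That is a guess at the shape of the
follow-up, not the patch that was actually sent.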


Re: [Mesa-dev] [PATCH 7/7] nir: cleanup open-coded instruction casts

2015-06-27 Thread Jason Ekstrand
Thanks!

R-B me
On Jun 27, 2015 7:57 AM, "Rob Clark"  wrote:

> From: Rob Clark 
>
> Signed-off-by: Rob Clark 
> ---
>  src/glsl/nir/nir_lower_alu_to_scalar.c | 2 +-
>  src/glsl/nir/nir_lower_vec_to_movs.c   | 2 +-
>  src/glsl/nir/nir_search.c  | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/glsl/nir/nir_lower_alu_to_scalar.c
> b/src/glsl/nir/nir_lower_alu_to_scalar.c
> index 25bba4e..5d15fb2 100644
> --- a/src/glsl/nir/nir_lower_alu_to_scalar.c
> +++ b/src/glsl/nir/nir_lower_alu_to_scalar.c
> @@ -164,7 +164,7 @@ lower_alu_to_scalar_block(nir_block *block, void *data)
>  {
> nir_foreach_instr_safe(block, instr) {
>if (instr->type == nir_instr_type_alu)
> - lower_alu_instr_scalar((nir_alu_instr *)instr, data);
> + lower_alu_instr_scalar(nir_instr_as_alu(instr), data);
> }
>
> return true;
> diff --git a/src/glsl/nir/nir_lower_vec_to_movs.c
> b/src/glsl/nir/nir_lower_vec_to_movs.c
> index 602853e..e6d522f 100644
> --- a/src/glsl/nir/nir_lower_vec_to_movs.c
> +++ b/src/glsl/nir/nir_lower_vec_to_movs.c
> @@ -90,7 +90,7 @@ lower_vec_to_movs_block(nir_block *block, void *mem_ctx)
>if (instr->type != nir_instr_type_alu)
>   continue;
>
> -  nir_alu_instr *vec = (nir_alu_instr *)instr;
> +  nir_alu_instr *vec = nir_instr_as_alu(instr);
>
>switch (vec->op) {
>case nir_op_vec2:
> diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
> index 0c4e48c..c33d6c3 100644
> --- a/src/glsl/nir/nir_search.c
> +++ b/src/glsl/nir/nir_search.c
> @@ -48,7 +48,7 @@ src_is_bool(nir_src src)
>return false;
> if (src.ssa->parent_instr->type != nir_instr_type_alu)
>return false;
> -   return alu_instr_is_bool((nir_alu_instr *)src.ssa->parent_instr);
> +   return alu_instr_is_bool(nir_instr_as_alu(src.ssa->parent_instr));
>  }
>
>  static bool
> --
> 2.4.3
>
>
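
For readers unfamiliar with the pattern, a minimal sketch of why a cast
helper is preferable to an open-coded pointer cast. All names are
hypothetical miniatures, not the NIR definitions, and the type assertion
is an addition in this sketch rather than a claim about what
nir_instr_as_alu() checks:

```cpp
#include <cassert>
#include <cstddef>

enum example_instr_type { example_instr_type_alu, example_instr_type_jump };

// Base "class", embedded by value in each derived instruction struct.
struct example_instr {
   example_instr_type type;
};

struct example_alu_instr {
   example_instr instr;
   int op;
};

// Downcast by recovering the containing struct from the embedded base.
// An open-coded '(example_alu_instr *)instr' silently relies on 'instr'
// sitting at offset 0; this helper keeps working if the field moves.
static inline example_alu_instr *
example_instr_as_alu(example_instr *instr)
{
   assert(instr->type == example_instr_type_alu);
   return reinterpret_cast<example_alu_instr *>(
      reinterpret_cast<char *>(instr) - offsetof(example_alu_instr, instr));
}
```

Call sites then read as example_instr_as_alu(instr)->op, with the layout
assumption kept in one place.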


Re: [Mesa-dev] [PATCH 02/78] i965/nir/vec4: Select between new nir_vec4 or current vec4_visitor code-paths

2015-06-29 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> The NIR->vec4 pass will be activated if ALL the following conditions are met:
>
> * INTEL_USE_NIR environment variable is defined and is positive (1 or true)
> * The stage is vertex shader
> * The HW generation is either SandyBridge (gen6), IvyBridge or Haswell (gen7)

I'm not sure about this last one.  When we did this for FS, it was
well-known that HSW and IVB were the only ones that were working; if
you used INTEL_USE_NIR on Iron Lake, you got what you got.  This makes
it easier to develop/test on older platforms because it doesn't
involve hacking things up.

Only using it for vertex shaders is perfectly reasonable because there
are whole chunks of stuff for geometry that simply don't exist yet.

>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
> ---
>  src/mesa/drivers/dri/i965/brw_program.c  |  5 +
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 14 --
>  src/mesa/drivers/dri/i965/brw_vec4.cpp   | 32 
> ++--
>  src/mesa/drivers/dri/i965/brw_vec4.h |  2 ++
>  4 files changed, 45 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
> b/src/mesa/drivers/dri/i965/brw_program.c
> index 2327af7..7e5d23d 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -574,6 +574,11 @@ brw_dump_ir(const char *stage, struct gl_shader_program 
> *shader_prog,
>  struct gl_shader *shader, struct gl_program *prog)
>  {
> if (shader_prog) {
> +  /* Since git~104c8fc, shader->ir can be NULL if NIR is used.
> +   * That must have been checked prior to calling this function, but
> +   * we double-check here just in case.
> +   */
> +  assert(shader->ir != NULL);
>fprintf(stderr,
>"GLSL IR for native %s shader %d:\n", stage, 
> shader_prog->Name);
>_mesa_print_ir(stderr, shader->ir, NULL);
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 5653d6b..0b53647 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -118,12 +118,14 @@ brw_compiler_create(void *mem_ctx, const struct 
> brw_device_info *devinfo)
> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = true;
> compiler->glsl_compiler_options[MESA_SHADER_GEOMETRY].OptimizeForAOS = 
> true;
>
> -   if (compiler->scalar_vs) {
> -  /* If we're using the scalar backend for vertex shaders, we need to
> -   * configure these accordingly.
> -   */
> -  
> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = 
> true;
> -  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp 
> = true;
> +   if (compiler->scalar_vs || brw_env_var_as_boolean("INTEL_USE_NIR", 
> false)) {
> +  if (compiler->scalar_vs) {
> + /* If we're using the scalar backend for vertex shaders, we need to
> + * configure these accordingly.
> + */
> + 
> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = 
> true;
> + 
> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;

This seems wrong.  The vec4 backend can certainly handle indirect
temporaries and indirect outputs are a must for geometry shaders.  I
really think we only want to turn these on for scalar shaders.

> +  }
>compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = 
> false;
>
>compiler->glsl_compiler_options[MESA_SHADER_VERTEX].NirOptions = 
> nir_options;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index a5c686c..dcffa04 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1707,6 +1707,21 @@ vec4_visitor::emit_shader_time_write(int 
> shader_time_subindex, src_reg value)
>  }
>
>  bool
> +vec4_visitor::should_use_vec4_nir()
> +{
> +   /* NIR->vec4 pass is activated when all these conditions meet:
> +*
> +* 1) it is a vertex shader
> +* 2) INTEL_USE_NIR env-var set to true, so NirOptions are defined for VS
> +* 3) hardware gen is SNB, IVB or HSW
> +*/
> +   return
> +  stage == MESA_SHADER_VERTEX &&
> +  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].NirOptions != NULL 
> &&
> +  devinfo->gen >= 6 && devinfo->gen < 8;
> +}
> +
> +bool
>  vec4_visitor::run(gl_clip_plane *clip_planes)
>  {
> sanity_param_count = prog->Parameters->NumParameters;
> @@ -1722,7 +1737,17 @@ vec4_visitor::run(gl_clip_plane *clip_planes)
>  * functions called "main").
>  */
> if (shader) {
> -  visit_instructions(shader->base.ir);
> +  if (should_use_vec4_nir()) {
> + assert(prog->nir != NULL);
> + emit_nir_code();
> + if (failed)
> +return false;
> +  } else {
> + 

Re: [Mesa-dev] [PATCH 02/78] i965/nir/vec4: Select between new nir_vec4 or current vec4_visitor code-paths

2015-06-29 Thread Jason Ekstrand
On Mon, Jun 29, 2015 at 2:49 PM, Eduardo Lima Mitev  wrote:
> On 06/29/2015 11:22 PM, Jason Ekstrand wrote:
>> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
>>> The NIR->vec4 pass will be activated if ALL the following conditions are 
>>> met:
>>>
>>> * INTEL_USE_NIR environment variable is defined and is positive (1 or true)
>>> * The stage is vertex shader
>>> * The HW generation is either SandyBridge (gen6), IvyBridge or Haswell 
>>> (gen7)
>>
>> I'm not sure about this last one.  When we did this for FS, it was
>> well-known that HSW and IVB were the only ones that were working; if
>> you used INTEL_USE_NIR on Iron Lake, you got what you got.  This makes
>> it easier to develop/test on older platforms because it doesn't
>> involve hacking things up.
>>
>> Only using it for vertex shaders is perfectly reasonable because there
>> are whole chunks of stuff for geometry that simply don't exist yet.
>>
>
> Ok, I will drop that condition then, and perhaps spit a warning if gen<6
> to alert that NIR->vec4 doesn't yet support that gen, so crashes might
> happen.
>
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_program.c  |  5 +
>>>  src/mesa/drivers/dri/i965/brw_shader.cpp | 14 --
>>>  src/mesa/drivers/dri/i965/brw_vec4.cpp   | 32 
>>> ++--
>>>  src/mesa/drivers/dri/i965/brw_vec4.h |  2 ++
>>>  4 files changed, 45 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
>>> b/src/mesa/drivers/dri/i965/brw_program.c
>>> index 2327af7..7e5d23d 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_program.c
>>> +++ b/src/mesa/drivers/dri/i965/brw_program.c
>>> @@ -574,6 +574,11 @@ brw_dump_ir(const char *stage, struct 
>>> gl_shader_program *shader_prog,
>>>  struct gl_shader *shader, struct gl_program *prog)
>>>  {
>>> if (shader_prog) {
>>> +  /* Since git~104c8fc, shader->ir can be NULL if NIR is used.
>>> +   * That must have been checked prior to calling this function, but
>>> +   * we double-check here just in case.
>>> +   */
>>> +  assert(shader->ir != NULL);
>>>fprintf(stderr,
>>>"GLSL IR for native %s shader %d:\n", stage, 
>>> shader_prog->Name);
>>>_mesa_print_ir(stderr, shader->ir, NULL);
>>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> index 5653d6b..0b53647 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> @@ -118,12 +118,14 @@ brw_compiler_create(void *mem_ctx, const struct 
>>> brw_device_info *devinfo)
>>> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = 
>>> true;
>>> compiler->glsl_compiler_options[MESA_SHADER_GEOMETRY].OptimizeForAOS = 
>>> true;
>>>
>>> -   if (compiler->scalar_vs) {
>>> -  /* If we're using the scalar backend for vertex shaders, we need to
>>> -   * configure these accordingly.
>>> -   */
>>> -  
>>> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = 
>>> true;
>>> -  
>>> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = 
>>> true;
>>> +   if (compiler->scalar_vs || brw_env_var_as_boolean("INTEL_USE_NIR", 
>>> false)) {
>>> +  if (compiler->scalar_vs) {
>>> + /* If we're using the scalar backend for vertex shaders, we need 
>>> to
>>> + * configure these accordingly.
>>> + */
>>> + 
>>> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = 
>>> true;
>>> + 
>>> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = 
>>> true;
>>
>> This seems wrong.  The vec4 backend can certainly handle indirect
>> temporaries and indirect outputs are a must for geometry shaders.  I
>> really think we only want to turn these on for scalar shaders.
>>
>
> Actually, these two settings do not get executed for our pass, because
> compiler->scalar_vs is still false. Notice the OR in the outer-most if.
>
> What this pat

Re: [Mesa-dev] [PATCH 04/78] i965/nir/vec4: Add setup of input variables in NIR->vec4 pass

2015-06-29 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> This implementation sets up a map of input variable offsets to source 
> registers
> that are already initialized with the corresponding register offset.
>
> This map will then be queried when processing load_input intrinsic operations,
> to obtain the correct register source from which the input data will be 
> loaded.
>
> This pattern of initializing an array map at setup time and then consuming it
> during instruction emission is common in fs_nir, while the actual offset
> calculations are taken from vec4_visitor.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  2 ++
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 13 -
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 7f78e7f..be47c82 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -411,6 +411,8 @@ public:
> virtual void nir_emit_jump(nir_jump_instr *instr);
> virtual void nir_emit_texture(nir_tex_instr *instr);
>
> +   src_reg *nir_inputs;
> +
>  protected:
> void emit_vertex();
> void lower_attributes_to_hw_regs(const int *attribute_map,
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index ae3b962..c2342b6 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -71,7 +71,18 @@ vec4_visitor::nir_setup_system_values(nir_shader *shader)
>  void
>  vec4_visitor::nir_setup_inputs(nir_shader *shader)
>  {
> -   /* @TODO: Not yet implemented */
> +   nir_inputs = ralloc_array(mem_ctx, src_reg, shader->num_inputs);
> +
> +   foreach_list_typed(nir_variable, var, node, &shader->inputs) {
> +  int offset = var->data.driver_location;
> +  unsigned size = type_size(var->type);
> +  for (unsigned i = 0; i < size; i++) {
> + src_reg src = src_reg(ATTR, var->data.location + i, var->type);
> + src = retype(src, brw_type_for_base_type(var->type));

I looked at the src_reg constructor being called here, and it turns out
it doesn't bother to set the actual type.  We should do that in the
src_reg constructor instead of inserting a retype.

> + nir_inputs[offset] = src;
> + offset++;
> +  }
> +   }
>  }
>
>  void
> --
> 2.1.4
>


Re: [Mesa-dev] [PATCH 06/78] i965/nir/vec4: Add setup of uniform variables

2015-06-29 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> From: Iago Toral Quiroga 
>
> This is based on similar code existing in vec4_visitor. It builds the
> uniform register file iterating through each uniform variable. It
> also stores the index of each register at the corresponding offset
> in a map. This map will later be used by load_uniform intrinsic
> instructions to build the correct UNIFORM source register.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.h   |   2 +
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 115 
> -
>  2 files changed, 114 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 673df4e..6535f19 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -414,6 +414,8 @@ public:
> src_reg *nir_inputs;
> int *nir_outputs;
> brw_reg_type *nir_output_types;
> +   unsigned *nir_uniform_offset;
> +   unsigned *nir_uniform_driver_location;
>
>  protected:
> void emit_vertex();
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 2d457a6..40ec66f 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -106,19 +106,128 @@ vec4_visitor::nir_setup_outputs(nir_shader *shader)
>  void
>  vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>  {
> -   /* @TODO: Not yet implemented */
> +   uniforms = 0;
> +
> +   nir_uniform_offset =
> +  rzalloc_array(mem_ctx, unsigned, this->uniform_array_size);
> +   memset(nir_uniform_offset, 0, this->uniform_array_size * sizeof(unsigned));

rzalloc_array already zero-fills the allocation for you, so this memset
is redundant.
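
For illustration, the same redundancy shows up with any zeroing
allocator. The sketch below uses std::calloc purely as a stand-in for
rzalloc_array (an assumption made so the example is self-contained;
ralloc itself is not used here), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Stand-in for rzalloc_array(): returns `count` zero-filled unsigned values.
// std::calloc is used only to keep the example self-contained; the caller
// releases the memory with std::free().
inline unsigned *zeroed_unsigned_array(std::size_t count)
{
   return static_cast<unsigned *>(std::calloc(count, sizeof(unsigned)));
}

// Returns true if every element is zero, without any memset() having run.
inline bool all_zero(const unsigned *values, std::size_t count)
{
   for (std::size_t i = 0; i < count; i++) {
      if (values[i] != 0)
         return false;
   }
   return true;
}
```

The array comes back zeroed from the allocator, so a second pass over
it with memset() does no useful work.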

> +
> +   nir_uniform_driver_location =
> +  rzalloc_array(mem_ctx, unsigned, this->uniform_array_size);
> +   memset(nir_uniform_driver_location, 0,
> +  this->uniform_array_size * sizeof(unsigned));

Same here.

> +
> +   if (shader_prog) {
> +  foreach_list_typed(nir_variable, var, node, &shader->uniforms) {
> + /* UBO's, atomics and samplers don't take up space in the
> +uniform file */
> + if (var->interface_type != NULL || var->type->contains_atomic() ||
> + type_size(var->type) == 0) {

I'm curious as to why you have this extra type_size() == 0 condition.
We don't have that in the FS NIR code.  What caused you to add it?

> +continue;
> + }
> +
> + assert(uniforms < uniform_array_size);
> + this->uniform_size[uniforms] = type_size(var->type);
> +
> + if (strncmp(var->name, "gl_", 3) == 0)
> +nir_setup_builtin_uniform(var);
> + else
> +nir_setup_uniform(var);
> +  }
> +   } else {
> +  /* ARB_vertex_program is not supported yet */
> +  assert("Not implemented");
> +   }
>  }
>
>  void
>  vec4_visitor::nir_setup_uniform(nir_variable *var)
>  {
> -   /* @TODO: Not yet implemented */
> +   int namelen = strlen(var->name);
> +
> +   /* The data for our (non-builtin) uniforms is stored in a series of
> +* gl_uniform_driver_storage structs for each subcomponent that
> +* glGetUniformLocation() could name.  We know it's been set up in the same
> +* order we'd walk the type, so walk the list of storage and find anything
> +* with our name, or the prefix of a component that starts with our name.
> +*/
> +unsigned offset = 0;
> +for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
> +   struct gl_uniform_storage *storage = &shader_prog->UniformStorage[u];
> +
> +   if (storage->builtin)
> +  continue;
> +
> +   if (strncmp(var->name, storage->name, namelen) != 0 ||
> +   (storage->name[namelen] != 0 &&
> +storage->name[namelen] != '.' &&
> +storage->name[namelen] != '[')) {
> +  continue;
> +   }
> +
> +   gl_constant_value *components = storage->storage;
> +   unsigned vector_count = (MAX2(storage->array_elements, 1) *
> +storage->type->matrix_columns);

In the FS backend, we simply use storage->type->component_slots().
Why can't we do that here?  It seems to be performing the same
calculation.

> +
> +   for (unsigned s = 0; s < vector_count; s++) {
> +  assert(uniforms < uniform_array_size);
> +  uniform_vector_size[uniforms] = storage->type->vector_elements;
> +
> +  int i;
> +  for (i = 0; i < uniform_vector_size[uniforms]; i++) {
> + stage_prog_data->param[uniforms * 4 + i] = components;
> + components++;
> +  }
> +  for (; i < 4; i++) {
> + static gl_constant_value zero = { 0.0 };

This should probably be const.

> + stage_prog_data->param[uniforms * 4 + i] = &zero;
> +  }
> +
> +  int u

Re: [Mesa-dev] [PATCH 07/78] i965/vec4: Overload make_reg_for_system_value() to allow reuse in NIR->vec4 pass

2015-06-29 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> From: Alejandro Piñeiro 
>
> The new virtual method is more flexible, it has a signature:
>
> dst_reg *make_reg_for_system_value(int location, const glsl_type *type);
>
> so the current method will be chained through this one.

I just grepped the code.  This function is used exactly once in the
current fs_visitor code.  Let's just replace that use with a use of
the new one and avoid the overload.

>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580

Also, let's not reference the same "make a vec4 visitor for NIR" bug
in every patch.  It's not really a bugfix.
--Jason

> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp| 6 ++
>  src/mesa/drivers/dri/i965/brw_vec4.h  | 5 -
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 7 ---
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   | 3 ++-
>  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 5 +++--
>  src/mesa/drivers/dri/i965/brw_vs.h| 3 ++-
>  6 files changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index dcffa04..ff1ef75 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1684,6 +1684,12 @@ vec4_visitor::emit_shader_time_end()
> emit(BRW_OPCODE_ENDIF);
>  }
>
> +dst_reg *
> +vec4_visitor::make_reg_for_system_value(ir_variable *ir)
> +{
> +   return make_reg_for_system_value(ir->data.location, ir->type);
> +}
> +
>  void
>  vec4_visitor::emit_shader_time_write(int shader_time_subindex, src_reg value)
>  {
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 6535f19..2a53d9a 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -411,6 +411,9 @@ public:
> virtual void nir_emit_jump(nir_jump_instr *instr);
> virtual void nir_emit_texture(nir_tex_instr *instr);
>
> +   virtual dst_reg *make_reg_for_system_value(int location,
> +  const glsl_type *type) = 0;
> +
> src_reg *nir_inputs;
> int *nir_outputs;
> brw_reg_type *nir_output_types;
> @@ -423,7 +426,7 @@ protected:
>  bool interleaved);
> void setup_payload_interference(struct ra_graph *g, int 
> first_payload_node,
> int reg_node_count);
> -   virtual dst_reg *make_reg_for_system_value(ir_variable *ir) = 0;
> +   virtual dst_reg *make_reg_for_system_value(ir_variable *ir);
> virtual void assign_binding_table_offsets();
> virtual void setup_payload() = 0;
> virtual void emit_prolog() = 0;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> index 69bcf5a..91bc849 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> @@ -49,11 +49,12 @@ vec4_gs_visitor::vec4_gs_visitor(const struct 
> brw_compiler *compiler,
>
>
>  dst_reg *
> -vec4_gs_visitor::make_reg_for_system_value(ir_variable *ir)
> +vec4_gs_visitor::make_reg_for_system_value(int location,
> +   const glsl_type *type)
>  {
> -   dst_reg *reg = new(mem_ctx) dst_reg(this, ir->type);
> +   dst_reg *reg = new(mem_ctx) dst_reg(this, type);
>
> -   switch (ir->data.location) {
> +   switch (location) {
> case SYSTEM_VALUE_INVOCATION_ID:
>this->current_annotation = "initialize gl_InvocationID";
>emit(GS_OPCODE_GET_INSTANCE_ID, *reg);
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h 
> b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
> index e693c56..0f1c705 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
> @@ -76,7 +76,8 @@ public:
> int shader_time_index);
>
>  protected:
> -   virtual dst_reg *make_reg_for_system_value(ir_variable *ir);
> +   virtual dst_reg *make_reg_for_system_value(int location,
> +  const glsl_type *type);
> virtual void setup_payload();
> virtual void emit_prolog();
> virtual void emit_program_code();
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
> index f93062b..1fe23ba 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
> @@ -143,7 +143,8 @@ vec4_vs_visitor::emit_prolog()
>
>
>  dst_reg *
> -vec4_vs_visitor::make_reg_for_system_value(ir_variable *ir)
> +vec4_vs_visitor::make_reg_for_system_value(int location,
> +   const glsl_type *type)
>  {
> /* VertexID is stored by the VF as the last vertex element, but
>  * we don't represent it with a flag in inputs_read, so we

Re: [Mesa-dev] [PATCH 00/78] i965: A new vec4 backend based on NIR

2015-06-29 Thread Jason Ekstrand
Good work guys!  I've started reviewing but review will probably take
a few days so please be patient.


On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> Hello,
>
> This series adds a new vec4 backend for i965 based on NIR. It is the
> result of working on
> https://bugs.freedesktop.org/show_bug.cgi?id=89580.
>
> This backend is activated if all the following conditions are met:
>
> * INTEL_USE_NIR environment variable is set to 1 (or true)
> * The stage is a GLSL vertex shader (the pass does not support geometry
>   shaders or ARB_vertex_program yet)
> * Hardware is gen6 or gen7 (specifically, we tested on SNB, IVB and HSW)
>
> Otherwise the backend is disabled and the usual vec4_visitor is used.
>
> The backend implementation is heavily based on vec4_visitor, and it is in
> fact part of that class. However, we have taken care to make the new
> backend as self-contained as possible, to ease an eventual removal of the
> vec4_visitor path. For dealing with NIR data structures, we borrowed
> heavily from the ideas and patterns in the fs_nir backend.
>
> At the moment, the backend shows no piglit regressions on SNB, IVB and
> HSW. There is one piglit test that fails in master but crashes on our
> backend. The test uses multiple indirect indexings on an expression with
> sampler arrays, which is expected to hit an assertion in NIR when the
> --enable-debug autoconf flag is set.
>
> The backend shows no functional dEQP regressions. On HSW and IVB, however,
> there are some particularly heavy tests (~80) that fail at link time due
> to register spilling, which should be fixed once optimization work on the
> backend is done.
>
> People interested in trying the backend can use this git tree (and
> remember to set INTEL_USE_NIR=1):
>
> $ git clone -b nir-vec4-v1 https://github.com/Igalia/mesa.git
>
> The structure of the patch set is:
>
> The first patch (0001) adds the main structure of the backend, with
> placeholders for each main functionality.
> The second patch (0002) adds logic to select between the current
> vec4_visitor pass and the new NIR->vec4 pass.
> The rest of the patches incrementally fill placeholders with atomic
> functionality. In some cases, the division might seem arbitrary (e.g.,
> nir_emit_texture(), which is basically one method), but we decided to
> favor having more atomic patches to facilitate review, and maybe squash
> the patches just before merging.
>
> cheers,
> Eduardo
>
> Alejandro Piñeiro (12):
>   i965/vec4: Overload make_reg_for_system_value() to allow reuse in
> NIR->vec4 pass
>   i965/nir/vec4: Add setup for system values
>   i965/nir/vec4: Implement intrinsics that load system values
>   i965/nir/vec4: Implement atomic counter intrinsics (read, inc and dec)
>   i965/nir: Disable alu_to_scalar pass on non-scalar shaders
>   i965/nir/vec4: Add skeleton implementation of nir_emit_texture()
>   i965/vec4: Add a new dst_reg constructor accepting a brw_reg_type
>   i965/nir/vec4: Implement loading of nir_tex_src_comparitor
>   i965/nir/vec4: Implement loading of nir_tex_src_coord
>   i965/nir/vec4: Setup LOD source register
>   i965/nir/vec4: Implement nir_texop_tex and nir_texop_txl texture ops
>   i965/nir/vec4: Implement nir_texop_txf texture op
>
> Antia Puentes (33):
>   i965/nir/vec4: Implement loading values from an UBO
>   i965/nir/vec4: Prepare source and destination registers for ALU
> operations
>   i965/nir/vec4: Implement single-element "mov" operations
>   i965/nir/vec4: Lower "vecN" instructions and mark them unreachable
>   i965/nir/vec4: Implement int<->float format conversion ops
>   i965/nir/vec4: Implement the addition operation
>   i965/nir/vec4: Implement multiplication
>   i965/vec4: Return the last emitted instruction in emit_math()
>   i965/nir/vec4: Implement more math operations
>   i965/nir/vec4: Implement carry/borrow for addition/subtraction
>   i965/nir/vec4: Implement float-related functions
>   i965/vec4: Return the emitted instruction in emit_minmax()
>   i965/nir/vec4: Implement min/max operations
>   i965/nir/vec4: Derivatives are not allowed in VS
>   i965/nir: Add utility method for comparisons
>   i965/nir/vec4: Implement non-vector comparison ops
>   i965/nir/vec4: Add swizzle utility method for vector ops
>   i965/nir/vec4: Implement equality ops on vectors
>   i965/nir/vec4: Implement non-equality ops on vectors
>   i965/nir/vec4: Implement logical operators
>   i965/nir/vec4: Implement "bool<->int,float" format conversion
>   i965/nir/vec4: "noise" ops should already be lowered
>   i965/nir/vec4: Implement pack/unpack operations
>   i965/nir/vec4: Implement bit operations
>   i965/nir/vec4: Implement the "sign" operation
>   i965/nir/vec4: Implement "shift" operations
>   i965/nir/vec4: Implement floating-point fused multiply-add
>   i965/vec4: Return the emitted instruction in emit_lrp()
>   i965/nir/vec4: Implement linear interpolation
>   i965/nir/vec4: Implement conditional select
>   i965/nir/vec4: Impl

Re: [Mesa-dev] [PATCH 08/78] i965/nir/vec4: Add setup for system values

2015-06-29 Thread Jason Ekstrand
On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> From: Alejandro Piñeiro 
>
> Similar to other variable setups, system values will initialize the
> corresponding register inside a 'nir_system_values' map, which will then
> be queried later when processing the different system value intrinsics
> for the appropriate register.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 43 
> +-
>  2 files changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 2a53d9a..e531d60 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -419,6 +419,7 @@ public:
> brw_reg_type *nir_output_types;
> unsigned *nir_uniform_offset;
> unsigned *nir_uniform_driver_location;
> +   dst_reg *nir_system_values;
>
>  protected:
> void emit_vertex();
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 40ec66f..6c2a046 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -54,13 +54,54 @@ vec4_visitor::emit_nir_code()
>  static bool
>  setup_system_values_block(nir_block *block, void *void_visitor)
>  {
> -   /* @TODO: Not yet implemented */
> +   vec4_visitor *v = (vec4_visitor *)void_visitor;
> +   dst_reg *reg;
> +
> +   nir_foreach_instr(block, instr) {
> +  if (instr->type != nir_instr_type_intrinsic)
> + continue;
> +
> +  nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
> +
> +  switch (intrin->intrinsic) {
> +  case nir_intrinsic_load_vertex_id:
> + unreachable("should be lowered by lower_vertex_id().");
> +
> +  case nir_intrinsic_load_vertex_id_zero_base:
> + reg = &v->nir_system_values[SYSTEM_VALUE_VERTEX_ID_ZERO_BASE];
> + if (reg->file == BAD_FILE)
> +*reg =
> +   *v->make_reg_for_system_value(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE,
> + NULL);

I know the type isn't actually used for VS, but you know what it is so
you might as well pass it in here.
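
Roughly what I have in mind, as a stand-alone sketch. Everything here
(sketch_reg, sketch_visitor, the SYSVAL_* and TYPE_* names) is a
simplified stand-in invented for illustration, not the real Mesa types,
and the integer type is my assumption for these system values. It shows
the set-up-on-first-use pattern with the known type passed through
instead of NULL.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; not the real Mesa definitions.
enum reg_file { BAD_FILE = 0, GRF };
enum value_type { TYPE_UNKNOWN = 0, TYPE_INT };
enum {
   SYSVAL_VERTEX_ID_ZERO_BASE,
   SYSVAL_BASE_VERTEX,
   SYSVAL_INSTANCE_ID,
   SYSVAL_MAX
};

struct sketch_reg {
   reg_file file = BAD_FILE;
   value_type type = TYPE_UNKNOWN;
};

struct sketch_visitor {
   sketch_reg system_values[SYSVAL_MAX];
   int regs_made = 0;

   // Stand-in for make_reg_for_system_value(location, type).
   sketch_reg make_reg(int location, value_type type)
   {
      (void)location;
      regs_made++;
      sketch_reg r;
      r.file = GRF;
      r.type = type;
      return r;
   }

   // Set up the register on first use only, passing the known type.
   sketch_reg &setup(int location)
   {
      sketch_reg &reg = system_values[location];
      if (reg.file == BAD_FILE)
         reg = make_reg(location, TYPE_INT);
      return reg;
   }
};
```

Calling setup() twice for the same location creates the register once,
and the stored register carries a real type rather than whatever a NULL
type would have implied.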

> + break;
> +
> +  case nir_intrinsic_load_base_vertex:
> + reg = &v->nir_system_values[SYSTEM_VALUE_BASE_VERTEX];
> + if (reg->file == BAD_FILE)
> +*reg = *v->make_reg_for_system_value(SYSTEM_VALUE_BASE_VERTEX,
> + NULL);

Same here.

> + break;
> +
> +  case nir_intrinsic_load_instance_id:
> + reg = &v->nir_system_values[SYSTEM_VALUE_INSTANCE_ID];
> + if (reg->file == BAD_FILE)
> +*reg = *v->make_reg_for_system_value(SYSTEM_VALUE_INSTANCE_ID,
> + NULL);

And here.
--Jason

> + break;
> +
> +  default:
> + break;
> +  }
> +   }
> +
> return true;
>  }
>
>  void
>  vec4_visitor::nir_setup_system_values(nir_shader *shader)
>  {
> +   nir_system_values = ralloc_array(mem_ctx, dst_reg, SYSTEM_VALUE_MAX);
> +
> nir_foreach_overload(shader, overload) {
>assert(strcmp(overload->function->name, "main") == 0);
>assert(overload->impl);
> --
> 2.1.4
>