Re: [Mesa-dev] VulkanCTS only supporting robustBufferAccess == true?

2019-11-08 Thread Chris Forbes
This is enforcing a hard requirement in the spec:

30.1. Feature Requirements

All Vulkan graphics implementations must support the following features:

* robustBufferAccess
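
To make the requirement concrete, here is a minimal sketch of the kind of check involved. It deliberately avoids the Vulkan headers so it builds anywhere: `sketch_features` and `sketch_has_required_features` are hypothetical names for illustration only, not CTS or Vulkan API. A real driver reports this bit in the `robustBufferAccess` member of `VkPhysicalDeviceFeatures`, as filled in by `vkGetPhysicalDeviceFeatures`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the single VkPhysicalDeviceFeatures member
 * that matters here; the real struct has many more fields. */
typedef struct {
   bool robustBufferAccess;
} sketch_features;

/* Returns true when the features marked as mandatory are all set.
 * Only robustBufferAccess is modelled in this sketch. */
static bool
sketch_has_required_features(const sketch_features *f)
{
   return f->robustBufferAccess;
}
```

A driver that leaves the bit unset fails this check regardless of what else it supports, which matches the behaviour described in the question.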


On Fri, Nov 8, 2019 at 1:49 PM  wrote:
>
> Testing my Vulkan driver against the Vulkan CTS, I am a bit surprised that it
> seems the CTS enforces robustBufferAccess (e.g.
> https://github.com/KhronosGroup/VK-GL-CTS/blob/master/external/vulkancts/modules/vulkan/api/vktApiFeatureInfo.cpp#L1045).
>
> Am I mistaken, or can someone explain the logic behind this?
>
> Bonus points for pointing me to any hidden Vulkan CTS documentation that
> I haven't found yet.
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Fix alpha to one with dual color blending.

2017-06-05 Thread Chris Forbes
Sigh, I should have just ignored the docs when I was poking at this years
ago.

Reviewed-by: Chris Forbes <chrisfor...@google.com>
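
For readers following along, the workaround in the quoted patch rests on simple arithmetic: with alpha to one in effect, the src1 alpha is 1, so a SRC1_ALPHA factor evaluates to 1 (the same as ONE) and ONE_MINUS_SRC1_ALPHA evaluates to 0 (the same as ZERO). The sketch below is a standalone model of that reasoning, not the driver code. The `SKETCH_` names are local copies of the GL token values so that no GL headers are needed, and `sketch_factor_value` is a hypothetical helper that only models the factors listed.

```c
/* Token values as published in the OpenGL headers, repeated under
 * local names so this sketch has no external dependencies. */
enum {
   SKETCH_GL_ZERO = 0,
   SKETCH_GL_ONE = 1,
   SKETCH_GL_SRC_ALPHA = 0x0302,
   SKETCH_GL_SRC1_ALPHA = 0x8589,
   SKETCH_GL_ONE_MINUS_SRC1_ALPHA = 0x88FB
};

/* Same remapping as fix_dual_blend_alpha_to_one() in the patch:
 * only the two src1 alpha factors change, everything else passes through. */
static unsigned
sketch_fix_dual_blend_alpha_to_one(unsigned factor)
{
   switch (factor) {
   case SKETCH_GL_SRC1_ALPHA:
      return SKETCH_GL_ONE;
   case SKETCH_GL_ONE_MINUS_SRC1_ALPHA:
      return SKETCH_GL_ZERO;
   default:
      return factor;
   }
}

/* Hypothetical helper: numeric value of a blend factor for a given
 * src1 alpha. Factors not listed are treated as 0 for simplicity. */
static float
sketch_factor_value(unsigned factor, float src1_alpha)
{
   switch (factor) {
   case SKETCH_GL_ONE:
      return 1.0f;
   case SKETCH_GL_SRC1_ALPHA:
      return src1_alpha;
   case SKETCH_GL_ONE_MINUS_SRC1_ALPHA:
      return 1.0f - src1_alpha;
   default:
      return 0.0f;
   }
}
```

Evaluating an original factor with src1 alpha forced to 1.0 gives the same number as evaluating the remapped factor with any src1 alpha, which is why the override can stand in for the hardware failing to apply alpha to one on src1.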

On Mon, May 29, 2017 at 10:49 PM, Kenneth Graunke <kenn...@whitecape.org>
wrote:

> The BLEND_STATE documentation says that alpha to one must be disabled
> when dual color blending is enabled.  However, it appears that it simply
> fails to override src1 alpha to one.
>
> We can work around this by leaving alpha to one enabled, but overriding
> SRC1_ALPHA to ONE and ONE_MINUS_SRC1_ALPHA to ZERO.  This appears to be
> what the other driver does, and it looks like it works despite the
> documentation saying not to do it.
>
> Fixes spec/ext_framebuffer_multisample/alpha-to-one-dual-src-blend *
> Piglit tests.
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 57
> +--
>  1 file changed, 44 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 76d2ea887b1..145173c2fc1 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2435,6 +2435,20 @@ static const struct brw_tracked_state
> genX(gs_state) = {
>
>  /* --
> */
>
> +static GLenum
> +fix_dual_blend_alpha_to_one(GLenum function)
> +{
> +   switch (function) {
> +   case GL_SRC1_ALPHA:
> +  return GL_ONE;
> +
> +   case GL_ONE_MINUS_SRC1_ALPHA:
> +  return GL_ZERO;
> +   }
> +
> +   return function;
> +}
> +
>  #define blend_factor(x) brw_translate_blend_factor(x)
>  #define blend_eqn(x) brw_translate_blend_equation(x)
>
> @@ -2562,6 +2576,19 @@ genX(upload_blend_state)(struct brw_context *brw)
> dstA = brw_fix_xRGB_alpha(dstA);
>  }
>
> +/* From the BLEND_STATE docs, DWord 0, Bit 29 (AlphaToOne Enable):
> + * "If Dual Source Blending is enabled, this bit must be disabled."
> + *
> + * We override SRC1_ALPHA to ONE and ONE_MINUS_SRC1_ALPHA to ZERO,
> + * and leave it enabled anyway.
> + */
> +if (ctx->Color.Blend[i]._UsesDualSrc && blend.AlphaToOneEnable) {
> +   srcRGB = fix_dual_blend_alpha_to_one(srcRGB);
> +   srcA = fix_dual_blend_alpha_to_one(srcA);
> +   dstRGB = fix_dual_blend_alpha_to_one(dstRGB);
> +   dstA = fix_dual_blend_alpha_to_one(dstA);
> +}
> +
>  entry.ColorBufferBlendEnable = true;
>  entry.DestinationBlendFactor = blend_factor(dstRGB);
>  entry.SourceBlendFactor = blend_factor(srcRGB);
> @@ -2600,16 +2627,6 @@ genX(upload_blend_state)(struct brw_context *brw)
>   entry.WriteDisableBlue  = !ctx->Color.ColorMask[i][2];
>   entry.WriteDisableAlpha = !ctx->Color.ColorMask[i][3];
>
> - /* From the BLEND_STATE docs, DWord 0, Bit 29 (AlphaToOne Enable):
> -  * "If Dual Source Blending is enabled, this bit must be disabled."
> -  */
> - WARN_ONCE(ctx->Color.Blend[i]._UsesDualSrc &&
> -   _mesa_is_multisample_enabled(ctx) &&
> -   ctx->Multisample.SampleAlphaToOne,
> -   "HW workaround: disabling alpha to one with dual src "
> -   "blending\n");
> - if (ctx->Color.Blend[i]._UsesDualSrc)
> -blend.AlphaToOneEnable = false;
>  #if GEN_GEN >= 8
>   GENX(BLEND_STATE_ENTRY_pack)(NULL, &blend_map[1 + i * 2], &entry);
>  #else
> @@ -4049,11 +4066,15 @@ genX(upload_ps_blend)(struct brw_context *brw)
>/* BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS | _NEW_COLOR */
>pb.HasWriteableRT = brw_color_buffer_write_enabled(brw);
>
> +  bool alpha_to_one = false;
> +
>if (!buffer0_is_integer) {
>   /* _NEW_MULTISAMPLE */
> - pb.AlphaToCoverageEnable =
> -_mesa_is_multisample_enabled(ctx) &&
> -ctx->Multisample.SampleAlphaToCoverage;
> +
> + if (_mesa_is_multisample_enabled(ctx)) {
> +pb.AlphaToCoverageEnable = ctx->Multisample.SampleAlphaToCoverage;
> +alpha_to_one = ctx->Multisample.SampleAlphaToOne;
> + }
>
>   pb.AlphaTestEnable = color->AlphaEnabled;
>}
> @@ -4098,6 +4119,16 @@ genX(upload_ps_blend)(struct brw_context *brw)
>  dstA = brw_fix_xRGB_alpha(dstA);
>   }
>
> + /* Alpha to One doesn't work with Dual Color Blending.

Re: [Mesa-dev] [PATCH 1/2] i965: Drop unused STATE_TEXRECT_SCALE code.

2017-02-28 Thread Chris Forbes
Nice to see the last remnants of this go.

For the series:

Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Wed, Mar 1, 2017 at 9:53 AM, Kenneth Graunke <kenn...@whitecape.org>
wrote:

> In the past, we used this on Gen4-5 to transform non-normalized texture
> coordinates (for sampler2DRect) to normalized ones.  We also used it on
> Gen6-7.5 for sampler2DRect with GL_CLAMP.
>
> Jason dropped this code in 6c8ba59cff14a1a86273f4008ff2a8e68335ab25
> in favor of using nir_lower_tex(), which just does a textureSize()
> call.  But we were still setting up these state references for
> useless uniform data.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_link.cpp  |  2 --
>  src/mesa/drivers/dri/i965/brw_program.c | 23 ---
>  src/mesa/drivers/dri/i965/brw_program.h |  2 --
>  3 files changed, 27 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp
> b/src/mesa/drivers/dri/i965/brw_link.cpp
> index 977feb37fc2..261d8861c35 100644
> --- a/src/mesa/drivers/dri/i965/brw_link.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
> @@ -224,8 +224,6 @@ brw_link_shader(struct gl_context *ctx, struct
> gl_shader_program *shProg)
>prog->ShadowSamplers = shader->shadow_samplers;
>_mesa_update_shader_textures_used(shProg, prog);
>
> -  brw_add_texrect_params(prog);
> -
>bool debug_enabled =
>   (INTEL_DEBUG & intel_debug_flag_for_shader_stage(shader->Stage));
>
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c
> b/src/mesa/drivers/dri/i965/brw_program.c
> index 673dc502ad4..1d36b4b8938 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -244,8 +244,6 @@ brwProgramStringNotify(struct gl_context *ctx,
>  brw->ctx.NewDriverState |= BRW_NEW_FRAGMENT_PROGRAM;
>newFP->id = get_new_program_id(brw->screen);
>
> -  brw_add_texrect_params(prog);
> -
>prog->nir = brw_create_nir(brw, NULL, prog, MESA_SHADER_FRAGMENT,
> true);
>
>brw_fs_precompile(ctx, prog);
> @@ -267,8 +265,6 @@ brwProgramStringNotify(struct gl_context *ctx,
> */
>_tnl_program_string(ctx, target, prog);
>
> -  brw_add_texrect_params(prog);
> -
>prog->nir = brw_create_nir(brw, NULL, prog, MESA_SHADER_VERTEX,
>   compiler->scalar_stage[MESA_SHADER_VERTEX]);
>
> @@ -346,25 +342,6 @@ brw_blend_barrier(struct gl_context *ctx)
>  }
>
>  void
> -brw_add_texrect_params(struct gl_program *prog)
> -{
> -   for (int texunit = 0; texunit < BRW_MAX_TEX_UNIT; texunit++) {
> -  if (!(prog->TexturesUsed[texunit] & (1 << TEXTURE_RECT_INDEX)))
> - continue;
> -
> -  int tokens[STATE_LENGTH] = {
> - STATE_INTERNAL,
> - STATE_TEXRECT_SCALE,
> - texunit,
> - 0,
> - 0
> -  };
> -
> -  _mesa_add_state_reference(prog->Parameters, (gl_state_index *)tokens);
> -   }
> -}
> -
> -void
>  brw_get_scratch_bo(struct brw_context *brw,
>drm_intel_bo **scratch_bo, int size)
>  {
> diff --git a/src/mesa/drivers/dri/i965/brw_program.h
> b/src/mesa/drivers/dri/i965/brw_program.h
> index 6eda165e875..55b9e5441d7 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.h
> +++ b/src/mesa/drivers/dri/i965/brw_program.h
> @@ -48,8 +48,6 @@ void brw_populate_sampler_prog_key_data(struct
> gl_context *ctx,
>  bool brw_debug_recompile_sampler_key(struct brw_context *brw,
>   const struct brw_sampler_prog_key_data *old_key,
>   const struct brw_sampler_prog_key_data *key);
> -void brw_add_texrect_params(struct gl_program *prog);
> -
>  void
>  brw_mark_surface_used(struct brw_stage_prog_data *prog_data,
>unsigned surf_index);
> --
> 2.11.1


Re: [Mesa-dev] [PATCH] mesa: Implement ARB_texture_filter_minmax for i965/gen9+

2017-01-31 Thread Chris Forbes
This looks like it misses the interactions with texture completeness.

- Chris

On Wed, Feb 1, 2017 at 7:53 AM, Plamena Manolova  wrote:

> This extension provides a new texture and sampler parameter
> (TEXTURE_REDUCTION_MODE_ARB) which allows applications to produce a
> filtered texel value by computing a component-wise minimum (MIN) or
> maximum (MAX) of the texels that would normally be averaged.  The
> reduction mode is orthogonal to the minification and magnification filter
> parameters.  The filter parameters are used to identify the set of texels
> used to produce a final filtered value; the reduction mode identifies how
> these texels are combined.
> 
>
> Signed-off-by: Plamena Manolova 
> ---
>  docs/features.txt |  2 +-
>  docs/relnotes/17.0.0.html |  1 +
>  src/mesa/drivers/dri/i965/brw_defines.h   |  8 +++
>  src/mesa/drivers/dri/i965/brw_sampler_state.c | 31 ++--
>  src/mesa/drivers/dri/i965/brw_state.h |  3 +-
>  src/mesa/drivers/dri/i965/intel_extensions.c  |  1 +
>  src/mesa/main/extensions_table.h  |  1 +
>  src/mesa/main/mtypes.h|  2 +
>  src/mesa/main/samplerobj.c| 71
> +++
>  src/mesa/main/texobj.c|  1 +
>  src/mesa/main/texparam.c  | 35 +
>  11 files changed, 151 insertions(+), 5 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index aff0016..da9d77a 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -302,7 +302,7 @@ Khronos, ARB, and OES extensions that are not part of
> any OpenGL or OpenGL ES ve
>GL_ARB_sparse_texture not started
>GL_ARB_sparse_texture2not started
>GL_ARB_sparse_texture_clamp   not started
> -  GL_ARB_texture_filter_minmax  not started
> +  GL_ARB_texture_filter_minmax  DONE (i965/gen9+)
>GL_ARB_transform_feedback_overflow_query  not started
>GL_KHR_blend_equation_advanced_coherent   DONE (i965/gen9+)
>GL_KHR_no_error   not started
> diff --git a/docs/relnotes/17.0.0.html b/docs/relnotes/17.0.0.html
> index 71fb4c3..eb6341b 100644
> --- a/docs/relnotes/17.0.0.html
> +++ b/docs/relnotes/17.0.0.html
> @@ -44,6 +44,7 @@ Note: some of the new features are only available with
> certain drivers.
>  
>
>  
> +GL_ARB_texture_filter_minmax on i965/gen9+
>  GL_ARB_post_depth_coverage on i965/gen9+
>  GL_KHR_blend_equation_advanced on nvc0
>  GL_INTEL_conservative_rasterization on i965/gen9+
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 3c5c6c4..c671bb0 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -259,6 +259,11 @@
>  #define BRW_RASTRULE_LOWER_LEFT  2
>  #define BRW_RASTRULE_LOWER_RIGHT 3
>
> +#define BRW_REDUCTION_TYPE_STD_FILTER 0
> +#define BRW_REDUCTION_TYPE_COMPARISON 1
> +#define BRW_REDUCTION_TYPE_MINIMUM 2
> +#define BRW_REDUCTION_TYPE_MAXIMUM 3
> +
>  #define BRW_RENDERTARGET_CLAMPRANGE_UNORM0
>  #define BRW_RENDERTARGET_CLAMPRANGE_SNORM1
>  #define BRW_RENDERTARGET_CLAMPRANGE_FORMAT   2
> @@ -725,6 +730,8 @@
>  /* SAMPLER_STATE DW2 - border color pointer */
>
>  /* SAMPLER_STATE DW3 */
> +#define BRW_SAMPLER_REDUCTION_TYPE_MASK INTEL_MASK(23, 22)
> +#define BRW_SAMPLER_REDUCTION_TYPE_SHIFT22
>  #define BRW_SAMPLER_MAX_ANISOTROPY_MASK INTEL_MASK(21, 19)
>  #define BRW_SAMPLER_MAX_ANISOTROPY_SHIFT19
>  #define BRW_SAMPLER_ADDRESS_ROUNDING_MASK   INTEL_MASK(18, 13)
> @@ -732,6 +739,7 @@
>  #define GEN7_SAMPLER_NON_NORMALIZED_COORDINATES (1 << 10)
>  /* Gen7+ wrap modes reuse the same BRW_SAMPLER_TC*_WRAP_MODE enums. */
>  #define GEN6_SAMPLER_NON_NORMALIZED_COORDINATES (1 << 0)
> +#define BRW_SAMPLER_REDUCTION_TYPE_ENABLE   (1 << 9)
>
>  enum brw_wrap_mode {
> BRW_TEXCOORDMODE_WRAP = 0,
> diff --git a/src/mesa/drivers/dri/i965/brw_sampler_state.c
> b/src/mesa/drivers/dri/i965/brw_sampler_state.c
> index 412efb9..3a04283 100644
> --- a/src/mesa/drivers/dri/i965/brw_sampler_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_sampler_state.c
> @@ -93,7 +93,8 @@ brw_emit_sampler_state(struct brw_context *brw,
> int lod_bias,
> unsigned shadow_function,
> bool non_normalized_coordinates,
> -   uint32_t border_color_offset)
> +   uint32_t border_color_offset,
> +   unsigned reduction_type)
>  {
> ss[0] = BRW_SAMPLER_LOD_PRECLAMP_ENABLE |
> SET_FIELD(mip_filter, 

Re: [Mesa-dev] [PATCH] mesa: Clamp ValueMask to [0, 255].

2016-12-17 Thread Chris Forbes
I don't see any spec justification for masking this. dEQP is broken here.
Implementations have the flexibility to retain more bits in the mask (and
have more bits set in the initial state) than the depth of the deepest
stencil buffer supported. From the ES3 spec, 4.1.4, second to last para:

   "In the initial state, ... , and the front and back stencil mask are
both set to the value 2^s - 1, where s is greater than or equal to the
number of bits in the deepest stencil buffer supported by the GL
implementation"

/ref/ is specified as being clamped on use and on query, which ought to be
indistinguishable from clamping it upfront.
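
The two rules cited above can be modelled in a few lines. This is only a sketch of the arithmetic, with hypothetical helper names and not Mesa code: `s` is the bit count from the spec text, the initial mask is 2^s - 1, and /ref/ is clamped to [0, 2^s - 1].

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a stencil reference value to [0, 2^s - 1]. Whether this runs when
 * the value is stored or when it is used and queried, a caller sees the
 * same result, because the function is applied to the same input. */
static int32_t
sketch_clamp_stencil_ref(int32_t ref, unsigned s)
{
   assert(s >= 1 && s <= 30);
   const int32_t max = (int32_t)((1u << s) - 1u);
   if (ref < 0)
      return 0;
   if (ref > max)
      return max;
   return ref;
}

/* Initial stencil value mask, 2^s - 1. The spec only requires s to be at
 * least the depth of the deepest stencil buffer, so s = 32 is as valid
 * as s = 8 on an implementation with 8-bit stencil. */
static uint32_t
sketch_initial_stencil_mask(unsigned s)
{
   assert(s >= 1 && s <= 32);
   return s == 32 ? UINT32_MAX : (1u << s) - 1u;
}
```

With s = 32 the initial mask is 0xFFFFFFFF and with s = 8 it is 0xFF. Both agree on the low 8 bits, which are the only ones an 8-bit stencil buffer can observe.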

- Chris

On Sun, Dec 18, 2016 at 4:44 AM, Jason Ekstrand 
wrote:

> Should ref also get clamped?
>
> On Dec 17, 2016 1:03 AM, "Kenneth Graunke"  wrote:
>
>> Commit b8b1d83c71fd148d2fd84afdc20c0aa367114f92 partially fixed
>> dEQP-GLES3.functional.state_query.integers.stencil*value*mask*getfloat
>> by changing the initial value masks from 32-bit ~0 (0xFFFFFFFF) to 0xFF.
>>
>> However, the application can call glStencilFunc and related functions
>> to set a new value mask, which is a 32-bit quantity.  The application
>> might specify 0xFFFFFFFF, bringing us back to the original problem.
>>
>> In particular, dEQP's state reset code seems to do this, so the tests
>> still fail when running the entire suite from one process, rather than
>> running the tests individually.
>>
>> This patch clamps the value masks to 0xFF when setting them.  Higher
>> bits have no effect on an 8-bit stencil buffer anyway.
>>
>> This might break apps that set a value mask then try to query it back
>> with glGet and expect to get the same value.  I'm unclear whether apps
>> can reasonably expect that anyway.
>>
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/mesa/main/stencil.c | 10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/main/stencil.c b/src/mesa/main/stencil.c
>> index b303bb7..608c564 100644
>> --- a/src/mesa/main/stencil.c
>> +++ b/src/mesa/main/stencil.c
>> @@ -161,7 +161,7 @@ _mesa_StencilFuncSeparateATI( GLenum frontfunc,
>> GLenum backfunc, GLint ref, GLui
>> ctx->Stencil.Function[0]  = frontfunc;
>> ctx->Stencil.Function[1]  = backfunc;
>> ctx->Stencil.Ref[0]   = ctx->Stencil.Ref[1]   = ref;
>> -   ctx->Stencil.ValueMask[0] = ctx->Stencil.ValueMask[1] = mask;
>> +   ctx->Stencil.ValueMask[0] = ctx->Stencil.ValueMask[1] = mask & 0xFF;
>> if (ctx->Driver.StencilFuncSeparate) {
>>ctx->Driver.StencilFuncSeparate(ctx, GL_FRONT,
>>frontfunc, ref, mask);
>> @@ -206,7 +206,7 @@ _mesa_StencilFunc( GLenum func, GLint ref, GLuint
>> mask )
>>FLUSH_VERTICES(ctx, _NEW_STENCIL);
>>ctx->Stencil.Function[face] = func;
>>ctx->Stencil.Ref[face] = ref;
>> -  ctx->Stencil.ValueMask[face] = mask;
>> +  ctx->Stencil.ValueMask[face] = mask & 0xFF;
>>
>>/* Only propagate the change to the driver if EXT_stencil_two_side
>> * is enabled.
>> @@ -227,7 +227,7 @@ _mesa_StencilFunc( GLenum func, GLint ref, GLuint
>> mask )
>>FLUSH_VERTICES(ctx, _NEW_STENCIL);
>>ctx->Stencil.Function[0]  = ctx->Stencil.Function[1]  = func;
>>ctx->Stencil.Ref[0]   = ctx->Stencil.Ref[1]   = ref;
>> -  ctx->Stencil.ValueMask[0] = ctx->Stencil.ValueMask[1] = mask;
>> +  ctx->Stencil.ValueMask[0] = ctx->Stencil.ValueMask[1] = mask &
>> 0xFF;
>>if (ctx->Driver.StencilFuncSeparate) {
>>   ctx->Driver.StencilFuncSeparate(ctx,
>>  ((ctx->Stencil.TestTwoSide)
>> @@ -472,13 +472,13 @@ _mesa_StencilFuncSeparate(GLenum face, GLenum
>> func, GLint ref, GLuint mask)
>>/* set front */
>>ctx->Stencil.Function[0] = func;
>>ctx->Stencil.Ref[0] = ref;
>> -  ctx->Stencil.ValueMask[0] = mask;
>> +  ctx->Stencil.ValueMask[0] = mask & 0xFF;
>> }
>> if (face != GL_FRONT) {
>>/* set back */
>>ctx->Stencil.Function[1] = func;
>>ctx->Stencil.Ref[1] = ref;
>> -  ctx->Stencil.ValueMask[1] = mask;
>> +  ctx->Stencil.ValueMask[1] = mask & 0xFF;
>> }
>> if (ctx->Driver.StencilFuncSeparate) {
>>ctx->Driver.StencilFuncSeparate(ctx, face, func, ref, mask);
>> --
>> 2.10.2


Re: [Mesa-dev] [PATCH 2/2] i965: Add i965 plumbing for ARB_post_depth_coverage for i965 (gen9+).

2016-11-30 Thread Chris Forbes
A couple of notes on existing weirdness here:
- Naming of GEN9_PSX_SHADER_NORMAL_COVERAGE_MASK_SHIFT is bizarre (not your
fault)
- Is BRW_PSICMS_INNER really the right thing for the normal mode? Why not
BRW_PSICMS_NORMAL? Perhaps whoever added this stuff can shed some light
here?

Actual change here looks good, so:

Reviewed-by: Chris Forbes <chrisfor...@google.com>


On Thu, Dec 1, 2016 at 9:00 AM, Plamena Manolova <plamena.manol...@intel.com> wrote:

> This extension allows the fragment shader to control whether values in
> gl_SampleMaskIn[] reflect the coverage after application of the early
> depth and stencil tests.
>
> Signed-off-by: Plamena Manolova <plamena.manol...@intel.com>
> ---
>  docs/relnotes/13.1.0.html|  1 +
>  src/mesa/drivers/dri/i965/brw_compiler.h |  1 +
>  src/mesa/drivers/dri/i965/brw_fs.cpp |  1 +
>  src/mesa/drivers/dri/i965/gen8_ps_state.c| 13 ++---
>  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
>  5 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/docs/relnotes/13.1.0.html b/docs/relnotes/13.1.0.html
> index 4f76cc2..a160cda 100644
> --- a/docs/relnotes/13.1.0.html
> +++ b/docs/relnotes/13.1.0.html
> @@ -45,6 +45,7 @@ Note: some of the new features are only available with
> certain drivers.
>
>  
>  GL_NV_image_formats on any driver supporting
> GL_ARB_shader_image_load_store (i965, nvc0, radeonsi, softpipe)
> +GL_ARB_post_depth_coverage on i965/gen9+
>  
>
>  Bug fixes
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 65a7478..410641f 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -397,6 +397,7 @@ struct brw_wm_prog_data {
> bool computed_stencil;
>
> bool early_fragment_tests;
> +   bool post_depth_coverage;
> bool dispatch_8;
> bool dispatch_16;
> bool dual_src_blend;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index c218f56..ce0c07e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -6454,6 +6454,7 @@ brw_compile_fs(const struct brw_compiler *compiler,
> void *log_data,
> shader->info->outputs_read);
>
> prog_data->early_fragment_tests = shader->info->fs.early_fragment_tests;
> +   prog_data->post_depth_coverage = shader->info->fs.post_depth_coverage;
>
> prog_data->barycentric_interp_modes =
>brw_compute_barycentric_interp_modes(compiler->devinfo, shader);
> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c
> b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> index a4eb962..33ef023 100644
> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> @@ -53,10 +53,17 @@ gen8_upload_ps_extra(struct brw_context *brw,
>dw1 |= GEN8_PSX_SHADER_IS_PER_SAMPLE;
>
> if (prog_data->uses_sample_mask) {
> -  if (brw->gen >= 9)
> - dw1 |= BRW_PSICMS_INNER << GEN9_PSX_SHADER_NORMAL_COVERAGE_MASK_SHIFT;
> -  else
> +  if (brw->gen >= 9) {
> + if (prog_data->post_depth_coverage) {
> +dw1 |= BRW_PCICMS_DEPTH << GEN9_PSX_SHADER_NORMAL_COVERAGE_MASK_SHIFT;
> + }
> + else {
> +dw1 |= BRW_PSICMS_INNER << GEN9_PSX_SHADER_NORMAL_COVERAGE_MASK_SHIFT;
> + }
> +  }
> +  else {
>   dw1 |= GEN8_PSX_SHADER_USES_INPUT_COVERAGE_MASK;
> +  }
> }
>
> if (prog_data->uses_omask)
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index 66079b5..19f4684 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -415,6 +415,7 @@ intelInitExtensions(struct gl_context *ctx)
>ctx->Extensions.KHR_texture_compression_astc_ldr = true;
>ctx->Extensions.KHR_texture_compression_astc_sliced_3d = true;
>ctx->Extensions.MESA_shader_framebuffer_fetch = true;
> +  ctx->Extensions.ARB_post_depth_coverage = true;
> }
>
> if (ctx->API == API_OPENGL_CORE)
> --
> 2.7.4


Re: [Mesa-dev] [PATCH 1/2] mesa: Add GL and GLSL plumbing for ARB_post_depth_coverage for i965 (gen9+).

2016-11-30 Thread Chris Forbes
Excellent, disregard that. Patch looks good.

On Thu, Dec 1, 2016 at 3:10 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

> On Wed, Nov 30, 2016 at 9:10 PM, Chris Forbes <chr...@ijw.co.nz> wrote:
> > This patch misses adding the #define to the GLSL preprocessor. Other than
>
> The future is today. That's no longer necessary :)
>
> > that it looks good though, so with that fixed:
> >
> > Reviewed-by: Chris Forbes <chrisfor...@google.com>
> >
> > On Thu, Dec 1, 2016 at 8:53 AM, Plamena Manolova
> > <plamena.manol...@intel.com> wrote:
> >>
> >> This extension allows the fragment shader to control whether values in
> >> gl_SampleMaskIn[] reflect the coverage after application of the early
> >> depth and stencil tests.
> >>
> >> Signed-off-by: Plamena Manolova <plamena.manol...@intel.com>
> >> ---
> >>  src/compiler/glsl/ast.h  |  5 +
> >>  src/compiler/glsl/ast_to_hir.cpp |  5 +
> >>  src/compiler/glsl/ast_type.cpp   |  9 -
> >>  src/compiler/glsl/glsl_parser.yy | 18 ++
> >>  src/compiler/glsl/glsl_parser_extras.cpp |  4 
> >>  src/compiler/glsl/glsl_parser_extras.h   |  4 
> >>  src/compiler/glsl/linker.cpp |  4 
> >>  src/compiler/shader_info.h   |  1 +
> >>  src/mesa/main/extensions_table.h |  1 +
> >>  src/mesa/main/mtypes.h   |  2 ++
> >>  src/mesa/main/shaderapi.c|  1 +
> >>  11 files changed, 53 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
> >> index afe91ea..df3a744 100644
> >> --- a/src/compiler/glsl/ast.h
> >> +++ b/src/compiler/glsl/ast.h
> >> @@ -605,6 +605,11 @@ struct ast_type_qualifier {
> >>   /** \{ */
> >>   unsigned blend_support:1; /**< Are there any blend_support_qualifiers */
> >>   /** \} */
> >> +
> >> + /**
> >> +  * Flag set if GL_ARB_post_depth_coverage layout qualifier is
> >> used.
> >> +  */
> >> + unsigned post_depth_coverage:1;
> >>}
> >>/** \brief Set of flags, accessed by name. */
> >>q;
> >> diff --git a/src/compiler/glsl/ast_to_hir.cpp
> >> b/src/compiler/glsl/ast_to_hir.cpp
> >> index c2ce389..2434ce5 100644
> >> --- a/src/compiler/glsl/ast_to_hir.cpp
> >> +++ b/src/compiler/glsl/ast_to_hir.cpp
> >> @@ -3632,6 +3632,11 @@ apply_layout_qualifier_to_variable(const struct
> >> ast_type_qualifier *qual,
> >>_mesa_glsl_error(loc, state, "early_fragment_tests layout
> qualifier
> >> only "
> >> "valid in fragment shader input layout
> >> declaration.");
> >> }
> >> +
> >> +   if (qual->flags.q.post_depth_coverage) {
> >> +  _mesa_glsl_error(loc, state, "post_depth_coverage layout
> qualifier
> >> only "
> >> +   "valid in fragment shader input layout
> >> declaration.");
> >> +   }
> >>  }
> >>
> >>  static void
> >> diff --git a/src/compiler/glsl/ast_type.cpp
> >> b/src/compiler/glsl/ast_type.cpp
> >> index 3431e24..aa1ae7e 100644
> >> --- a/src/compiler/glsl/ast_type.cpp
> >> +++ b/src/compiler/glsl/ast_type.cpp
> >> @@ -579,6 +579,7 @@ ast_type_qualifier::validate_in_qualifier(YYLTYPE
> >> *loc,
> >>break;
> >> case MESA_SHADER_FRAGMENT:
> >>valid_in_mask.flags.q.early_fragment_tests = 1;
> >> +  valid_in_mask.flags.q.post_depth_coverage = 1;
> >>break;
> >> case MESA_SHADER_COMPUTE:
> >>valid_in_mask.flags.q.local_size = 7;
> >> @@ -633,6 +634,11 @@ ast_type_qualifier::merge_into_in_qualifier(YYLTYPE *loc,
> >>state->in_qualifier->flags.q.early_fragment_tests = false;
> >> }
> >>
> >> +   if (state->in_qualifier->flags.q.post_depth_coverage) {
> >> +  state->fs_post_depth_coverage = true;
> >> +  state->in_qualifier->flags.q.post_depth_coverage = false;
> >> +   }
> >> +
> >> /* We allow the creation of multiple cs_input_layout nodes.
> Coherence
> >> among
> >>  * all existing nodes is checked later, when the AST

Re: [Mesa-dev] [PATCH 1/2] mesa: Add GL and GLSL plumbing for ARB_post_depth_coverage for i965 (gen9+).

2016-11-30 Thread Chris Forbes
This patch misses adding the #define to the GLSL preprocessor. Other than
that it looks good though, so with that fixed:

Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Thu, Dec 1, 2016 at 8:53 AM, Plamena Manolova <plamena.manol...@intel.com> wrote:

> This extension allows the fragment shader to control whether values in
> gl_SampleMaskIn[] reflect the coverage after application of the early
> depth and stencil tests.
>
> Signed-off-by: Plamena Manolova <plamena.manol...@intel.com>
> ---
>  src/compiler/glsl/ast.h  |  5 +
>  src/compiler/glsl/ast_to_hir.cpp |  5 +
>  src/compiler/glsl/ast_type.cpp   |  9 -
>  src/compiler/glsl/glsl_parser.yy | 18 ++
>  src/compiler/glsl/glsl_parser_extras.cpp |  4 
>  src/compiler/glsl/glsl_parser_extras.h   |  4 
>  src/compiler/glsl/linker.cpp |  4 
>  src/compiler/shader_info.h   |  1 +
>  src/mesa/main/extensions_table.h |  1 +
>  src/mesa/main/mtypes.h   |  2 ++
>  src/mesa/main/shaderapi.c|  1 +
>  11 files changed, 53 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
> index afe91ea..df3a744 100644
> --- a/src/compiler/glsl/ast.h
> +++ b/src/compiler/glsl/ast.h
> @@ -605,6 +605,11 @@ struct ast_type_qualifier {
>   /** \{ */
>   unsigned blend_support:1; /**< Are there any blend_support_qualifiers */
>   /** \} */
> +
> + /**
> +  * Flag set if GL_ARB_post_depth_coverage layout qualifier is
> used.
> +  */
> + unsigned post_depth_coverage:1;
>}
>/** \brief Set of flags, accessed by name. */
>q;
> diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
> index c2ce389..2434ce5 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -3632,6 +3632,11 @@ apply_layout_qualifier_to_variable(const struct
> ast_type_qualifier *qual,
>_mesa_glsl_error(loc, state, "early_fragment_tests layout qualifier
> only "
> "valid in fragment shader input layout
> declaration.");
> }
> +
> +   if (qual->flags.q.post_depth_coverage) {
> +  _mesa_glsl_error(loc, state, "post_depth_coverage layout qualifier
> only "
> +   "valid in fragment shader input layout
> declaration.");
> +   }
>  }
>
>  static void
> diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
> index 3431e24..aa1ae7e 100644
> --- a/src/compiler/glsl/ast_type.cpp
> +++ b/src/compiler/glsl/ast_type.cpp
> @@ -579,6 +579,7 @@ ast_type_qualifier::validate_in_qualifier(YYLTYPE
> *loc,
>break;
> case MESA_SHADER_FRAGMENT:
>valid_in_mask.flags.q.early_fragment_tests = 1;
> +  valid_in_mask.flags.q.post_depth_coverage = 1;
>break;
> case MESA_SHADER_COMPUTE:
>valid_in_mask.flags.q.local_size = 7;
> @@ -633,6 +634,11 @@ ast_type_qualifier::merge_into_in_qualifier(YYLTYPE
> *loc,
>state->in_qualifier->flags.q.early_fragment_tests = false;
> }
>
> +   if (state->in_qualifier->flags.q.post_depth_coverage) {
> +  state->fs_post_depth_coverage = true;
> +  state->in_qualifier->flags.q.post_depth_coverage = false;
> +   }
> +
> /* We allow the creation of multiple cs_input_layout nodes. Coherence
> among
>  * all existing nodes is checked later, when the AST node is
> transformed
>  * into HIR.
> @@ -761,7 +767,8 @@ ast_type_qualifier::validate_flags(YYLTYPE *loc,
>  bad.flags.q.point_mode ? " point_mode" : "",
>  bad.flags.q.vertices ? " vertices" : "",
>  bad.flags.q.subroutine ? " subroutine" : "",
> -bad.flags.q.subroutine_def ? " subroutine_def" : "");
> +bad.flags.q.subroutine_def ? " subroutine_def" : "",
> +bad.flags.q.post_depth_coverage ? " post_depth_coverage" : "");
> return false;
>  }
>
> diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
> index 0c3781c..09b7e79 100644
> --- a/src/compiler/glsl/glsl_parser.yy
> +++ b/src/compiler/glsl/glsl_parser.yy
> @@ -1392,6 +1392,24 @@ layout_qualifier_id:
>
>  $$.flags.q.early_fragment_tests = 1;
>   }
> +
> + if (!$$.flags.i &&

Re: [Mesa-dev] [PATCH] anv: bump the texture gather offset limits

2016-11-27 Thread Chris Forbes
The HW limits here are -8/7 when using the gather4 message. [gather4_po
allows -32/31, with the offsets specified per channel.]
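
As a rough illustration of how those two ranges relate, the sketch below picks a message type from an offset pair using only the limits quoted above. The enum and function names are hypothetical, and this is not how anv or the i965 compiler actually selects messages; it only encodes the -8/7 and -32/31 ranges.

```c
#include <stdbool.h>

/* Hypothetical labels for the two sampler messages mentioned above. */
enum sketch_gather_msg {
   SKETCH_GATHER4,            /* offsets in [-8, 7] */
   SKETCH_GATHER4_PO,         /* offsets in [-32, 31] */
   SKETCH_GATHER_UNSUPPORTED  /* outside both ranges */
};

static bool
sketch_in_range(int v, int lo, int hi)
{
   return v >= lo && v <= hi;
}

/* Pick the narrowest message whose offset range covers both components. */
static enum sketch_gather_msg
sketch_pick_gather_message(int off_x, int off_y)
{
   if (sketch_in_range(off_x, -8, 7) && sketch_in_range(off_y, -8, 7))
      return SKETCH_GATHER4;
   if (sketch_in_range(off_x, -32, 31) && sketch_in_range(off_y, -32, 31))
      return SKETCH_GATHER4_PO;
   return SKETCH_GATHER_UNSUPPORTED;
}
```

Under this model, advertising -32/31 is only sound if every offset outside -8/7 can be routed through gather4_po.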

On Mon, Nov 28, 2016 at 10:49 AM, Ilia Mirkin  wrote:

> This matches what NVIDIA and AMD hardware expose.
>
> Signed-off-by: Ilia Mirkin 
> ---
>
> Not sure what the true HW limit is here. On NVIDIA, the true HW limit
> really
> is -32/31 though. As an aside, according to vulkan.gpuinfo.org, the Intel
> Windows driver also exposes -32/31.
>
> With the updated limits on SKL, everything still passes:
>
> ./deqp-vk --deqp-visibility=hidden --deqp-case='*texture_gather*'
> Test run totals:
>   Passed:762/1524 (50.0%)
>   Failed:0/1524 (0.0%)
>   Not supported: 762/1524 (50.0%)
>   Warnings:  0/1524 (0.0%)
>
>  src/intel/vulkan/anv_device.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 16aba59..d20dc0f 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -555,8 +555,8 @@ void anv_GetPhysicalDeviceProperties(
>.minStorageBufferOffsetAlignment  = 1,
>.minTexelOffset   = -8,
>.maxTexelOffset   = 7,
> -  .minTexelGatherOffset = -8,
> -  .maxTexelGatherOffset = 7,
> +  .minTexelGatherOffset = -32,
> +  .maxTexelGatherOffset = 31,
>.minInterpolationOffset   = -0.5,
>.maxInterpolationOffset   = 0.4375,
>.subPixelInterpolationOffsetBits  = 4,
> --
> 2.7.3


[Mesa-dev] [PATCH 2/2] i965: Advertise 8 subpixel bits always.

2016-11-06 Thread Chris Forbes
The Mesa default is 4, but we program the hardware for 8 on all
generations.

Signed-off-by: Chris Forbes <chrisfor...@google.com>
---
 src/mesa/drivers/dri/i965/brw_context.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 3085a98..d8174c6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -538,6 +538,7 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.MaxProgramTextureGatherComponents = 1;
 
ctx->Const.MaxUniformBlockSize = 65536;
+   ctx->Const.SubPixelBits = 8;
 
for (int i = 0; i < MESA_SHADER_STAGES; i++) {
   struct gl_program_constants *prog = &ctx->Const.Program[i];
-- 
2.10.2



[Mesa-dev] [PATCH 1/2] mesa: Remove EXTRA_EXT declaration for ARB_viewport_array

2016-11-06 Thread Chris Forbes
Now that we also have to consider OES_viewport_array & friends, nothing uses 
this.

Signed-off-by: Chris Forbes <chrisfor...@google.com>
---
 src/mesa/main/get.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 5f5e76a..854f8ab 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -472,7 +472,6 @@ EXTRA_EXT(ARB_texture_gather);
 EXTRA_EXT(ARB_shader_atomic_counters);
 EXTRA_EXT(ARB_draw_indirect);
 EXTRA_EXT(ARB_shader_image_load_store);
-EXTRA_EXT(ARB_viewport_array);
 EXTRA_EXT(ARB_query_buffer_object);
 EXTRA_EXT2(ARB_transform_feedback3, ARB_gpu_shader5);
 EXTRA_EXT(INTEL_performance_query);
-- 
2.10.2



Re: [Mesa-dev] [PATCH 2/3] mesa: Handle OES_texture_view tokens

2016-08-28 Thread Chris Forbes
This patch isn't right. These enum values are the same as the desktop
version, so your new cases will never actually be used.
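
A small sketch of what I mean, in C. The token values are the ones published in the Khronos headers, copied here so the example stands alone; the enum and the function are hypothetical stand-ins for the switch in get_tex_parameteriv(), not Mesa code.

```c
#include <assert.h>

/* Token values copied from the Khronos headers for illustration. */
#define GL_TEXTURE_VIEW_MIN_LEVEL      0x82DB
#define GL_TEXTURE_VIEW_MIN_LEVEL_OES  0x82DB

/* Hypothetical result type standing in for "which field gets returned". */
enum view_field {
   VIEW_FIELD_NONE,
   VIEW_FIELD_MIN_LEVEL,
};

static enum view_field
view_field_for_pname(unsigned pname)
{
   switch (pname) {
   case GL_TEXTURE_VIEW_MIN_LEVEL:
      return VIEW_FIELD_MIN_LEVEL;
   /* A second label, `case GL_TEXTURE_VIEW_MIN_LEVEL_OES:`, would carry the
    * same value as the label above, so it could never select a different
    * path. Standard C does not allow two case labels with the same value
    * in one switch anyway.
    */
   default:
      return VIEW_FIELD_NONE;
   }
}
```

Passing the _OES token already lands on the existing desktop case, so any ES-specific gating would have to happen inside that case rather than in a new one.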

On Mon, Aug 29, 2016 at 2:24 AM, Francesco Ansanelli 
wrote:

> Signed-off-by: Francesco Ansanelli 
> ---
>  src/mesa/main/texparam.c |   48 ++
> 
>  1 file changed, 48 insertions(+)
>
> diff --git a/src/mesa/main/texparam.c b/src/mesa/main/texparam.c
> index bdd3fcb..4dd97b1 100644
> --- a/src/mesa/main/texparam.c
> +++ b/src/mesa/main/texparam.c
> @@ -1960,6 +1960,30 @@ get_tex_parameterfv(struct gl_context *ctx,
>   *params = (GLfloat) obj->NumLayers;
>   break;
>
> +  case GL_TEXTURE_VIEW_MIN_LEVEL_OES:
> + if (!ctx->Extensions.OES_texture_view)
> +goto invalid_pname;
> + *params = (GLfloat) obj->MinLevel;
> + break;
> +
> +  case GL_TEXTURE_VIEW_NUM_LEVELS_OES:
> + if (!ctx->Extensions.OES_texture_view)
> +goto invalid_pname;
> + *params = (GLfloat) obj->NumLevels;
> + break;
> +
> +  case GL_TEXTURE_VIEW_MIN_LAYER_OES:
> + if (!ctx->Extensions.OES_texture_view)
> +goto invalid_pname;
> + *params = (GLfloat) obj->MinLayer;
> + break;
> +
> +  case GL_TEXTURE_VIEW_NUM_LAYERS_OES:
> + if (!ctx->Extensions.OES_texture_view)
> +goto invalid_pname;
> + *params = (GLfloat) obj->NumLayers;
> + break;
> +
>case GL_REQUIRED_TEXTURE_IMAGE_UNITS_OES:
>   if (!_mesa_is_gles(ctx) || !ctx->Extensions.OES_EGL_
> image_external)
>  goto invalid_pname;
> @@ -2192,6 +2216,30 @@ get_tex_parameteriv(struct gl_context *ctx,
>   *params = (GLint) obj->NumLayers;
>   break;
>
> +  case GL_TEXTURE_VIEW_MIN_LEVEL_OES:
> + if (!ctx->Extensions.OES_texture_view)
> +goto invalid_pname;
> + *params = (GLint) obj->MinLevel;
> + break;
> +
> +  case GL_TEXTURE_VIEW_NUM_LEVELS_OES:
> + if (!ctx->Extensions.OES_texture_view)
> +goto invalid_pname;
> + *params = (GLint) obj->NumLevels;
> + break;
> +
> +  case GL_TEXTURE_VIEW_MIN_LAYER_OES:
> + if (!ctx->Extensions.OES_texture_view)
> +goto invalid_pname;
> + *params = (GLint) obj->MinLayer;
> + break;
> +
> +  case GL_TEXTURE_VIEW_NUM_LAYERS_OES:
> + if (!ctx->Extensions.OES_texture_view)
> +goto invalid_pname;
> + *params = (GLint) obj->NumLayers;
> + break;
> +
>case GL_REQUIRED_TEXTURE_IMAGE_UNITS_OES:
>   if (!_mesa_is_gles(ctx) || !ctx->Extensions.OES_EGL_
> image_external)
>  goto invalid_pname;
> --
> 1.7.9.5
>


Re: [Mesa-dev] [PATCH 7/7] i965: Delete the FS_OPCODE_INTERPOLATE_AT_CENTROID virtual opcode.

2016-07-18 Thread Chris Forbes
I remember arguing about this when it got added. The tradeoff was payload
size/register pressure vs. needing to call out to the pixel interpolator if
centroid barycentric coords weren't required for anything else. It does seem
fairly pointless, though.

For the series:

Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Tue, Jul 19, 2016 at 8:26 AM, Kenneth Graunke <kenn...@whitecape.org>
wrote:

> We no longer use this message.  As far as I can tell, it's fairly
> useless - the equivalent information is provided in the payload.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h| 1 -
>  src/mesa/drivers/dri/i965/brw_fs.cpp   | 2 --
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 5 -
>  src/mesa/drivers/dri/i965/brw_shader.cpp   | 2 --
>  4 files changed, 10 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index b5a259e..2814fa7 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1120,7 +1120,6 @@ enum opcode {
> FS_OPCODE_UNPACK_HALF_2x16_SPLIT_X,
> FS_OPCODE_UNPACK_HALF_2x16_SPLIT_Y,
> FS_OPCODE_PLACEHOLDER_HALT,
> -   FS_OPCODE_INTERPOLATE_AT_CENTROID,
> FS_OPCODE_INTERPOLATE_AT_SAMPLE,
> FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET,
> FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET,
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 06007fe..120d6dd 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -250,7 +250,6 @@ fs_inst::is_send_from_grf() const
> switch (opcode) {
> case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7:
> case SHADER_OPCODE_SHADER_TIME_ADD:
> -   case FS_OPCODE_INTERPOLATE_AT_CENTROID:
> case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
> case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
> case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
> @@ -4785,7 +4784,6 @@ get_lowered_simd_width(const struct brw_device_info
> *devinfo,
> case FS_OPCODE_PACK_HALF_2x16_SPLIT:
> case FS_OPCODE_UNPACK_HALF_2x16_SPLIT_X:
> case FS_OPCODE_UNPACK_HALF_2x16_SPLIT_Y:
> -   case FS_OPCODE_INTERPOLATE_AT_CENTROID:
> case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
> case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
> case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 1e9c7da..a390184 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -2054,11 +2054,6 @@ fs_generator::generate_code(const cfg_t *cfg, int
> dispatch_width)
>   }
>   break;
>
> -  case FS_OPCODE_INTERPOLATE_AT_CENTROID:
> - generate_pixel_interpolator_query(inst, dst, src[0], src[1],
> -
>  GEN7_PIXEL_INTERPOLATOR_LOC_CENTROID);
> - break;
> -
>case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
>   generate_pixel_interpolator_query(inst, dst, src[0], src[1],
>
> GEN7_PIXEL_INTERPOLATOR_LOC_SAMPLE);
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index f3b5487..559e44c 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -367,8 +367,6 @@ brw_instruction_name(const struct brw_device_info
> *devinfo, enum opcode op)
> case FS_OPCODE_PLACEHOLDER_HALT:
>return "placeholder_halt";
>
> -   case FS_OPCODE_INTERPOLATE_AT_CENTROID:
> -  return "interp_centroid";
> case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
>return "interp_sample";
> case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
> --
> 2.9.0
>


Re: [Mesa-dev] [PATCH 2/7] nir: Add a nir_lower_io flag for using load_interpolated_input intrins.

2016-07-18 Thread Chris Forbes
Seems a little unfortunate to add a random bool to this interface which is
otherwise fairly descriptive, but OK.
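
For what it's worth, one way to keep call sites descriptive would be an options bitmask instead of a bare bool. This is only a sketch in C of that idea; every name below is hypothetical and none of it is part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option flags; these names are not part of the patch. */
enum lower_io_options {
   LOWER_IO_USE_INTERPOLATED_INPUT = 1 << 0,
};

/* Hypothetical, reduced stand-in for the shader stage enum. */
enum shader_stage {
   STAGE_VERTEX,
   STAGE_FRAGMENT,
};

/* Mirrors the condition in the patch: the new intrinsics are used only
 * when the caller asks for them and the shader is a fragment shader.
 */
static bool
use_interpolated_input(enum shader_stage stage, unsigned options)
{
   return (options & LOWER_IO_USE_INTERPOLATED_INPUT) &&
          stage == STAGE_FRAGMENT;
}
```

A caller would then pass LOWER_IO_USE_INTERPOLATED_INPUT or 0 rather than true or false, which reads better at the call site and leaves room for more options later.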

On Tue, Jul 19, 2016 at 8:26 AM, Kenneth Graunke 
wrote:

> While my intention is that the new intrinsics should be usable by all
> drivers, we need to make them optional until all drivers switch.
>
> This doesn't do anything yet, but I added it as a separate patch to
> keep the interface churn separate for easier review.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/nir/nir.h  |  3 ++-
>  src/compiler/nir/nir_lower_io.c | 15 +++
>  src/gallium/drivers/freedreno/ir3/ir3_cmdline.c |  2 +-
>  src/mesa/drivers/dri/i965/brw_blorp.c   |  2 +-
>  src/mesa/drivers/dri/i965/brw_nir.c | 18 +-
>  src/mesa/drivers/dri/i965/brw_program.c |  4 ++--
>  src/mesa/state_tracker/st_glsl_to_nir.cpp   |  2 +-
>  7 files changed, 27 insertions(+), 19 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index ac11998..e996e0e 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2324,7 +2324,8 @@ void nir_assign_var_locations(struct exec_list
> *var_list, unsigned *size,
>
>  void nir_lower_io(nir_shader *shader,
>nir_variable_mode modes,
> -  int (*type_size)(const struct glsl_type *));
> +  int (*type_size)(const struct glsl_type *),
> +  bool use_load_interpolated_input_intrinsics);
>  nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
>  nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
>
> diff --git a/src/compiler/nir/nir_lower_io.c
> b/src/compiler/nir/nir_lower_io.c
> index b05a73f..aa8a517 100644
> --- a/src/compiler/nir/nir_lower_io.c
> +++ b/src/compiler/nir/nir_lower_io.c
> @@ -39,6 +39,7 @@ struct lower_io_state {
> void *mem_ctx;
> int (*type_size)(const struct glsl_type *type);
> nir_variable_mode modes;
> +   bool use_interpolated_input;
>  };
>
>  void
> @@ -394,7 +395,8 @@ nir_lower_io_block(nir_block *block,
>  static void
>  nir_lower_io_impl(nir_function_impl *impl,
>nir_variable_mode modes,
> -  int (*type_size)(const struct glsl_type *))
> +  int (*type_size)(const struct glsl_type *),
> +  bool use_interpolated_input)
>  {
> struct lower_io_state state;
>
> @@ -402,6 +404,7 @@ nir_lower_io_impl(nir_function_impl *impl,
> state.mem_ctx = ralloc_parent(impl);
> state.modes = modes;
> state.type_size = type_size;
> +   state.use_interpolated_input = use_interpolated_input;
>
> nir_foreach_block(block, impl) {
>nir_lower_io_block(block, &state);
> @@ -413,11 +416,15 @@ nir_lower_io_impl(nir_function_impl *impl,
>
>  void
>  nir_lower_io(nir_shader *shader, nir_variable_mode modes,
> - int (*type_size)(const struct glsl_type *))
> + int (*type_size)(const struct glsl_type *),
> + bool use_interpolated_input)
>  {
> nir_foreach_function(function, shader) {
> -  if (function->impl)
> - nir_lower_io_impl(function->impl, modes, type_size);
> +  if (function->impl) {
> + nir_lower_io_impl(function->impl, modes, type_size,
> +   use_interpolated_input &&
> +   shader->stage == MESA_SHADER_FRAGMENT);
> +  }
> }
>  }
>
> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
> b/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
> index 41532fc..a8a8c1b 100644
> --- a/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
> +++ b/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
> @@ -93,7 +93,7 @@ load_glsl(unsigned num_files, char* const* files,
> gl_shader_stage stage)
> // TODO nir_assign_var_locations??
>
> NIR_PASS_V(nir, nir_lower_system_values);
> -   NIR_PASS_V(nir, nir_lower_io, nir_var_all, st_glsl_type_size);
> +   NIR_PASS_V(nir, nir_lower_io, nir_var_all, st_glsl_type_size,
> false);
> NIR_PASS_V(nir, nir_lower_samplers, prog);
>
> return nir;
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 282a5b2..0473cfe 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -209,7 +209,7 @@ brw_blorp_compile_nir_shader(struct brw_context *brw,
> struct nir_shader *nir,
>unsigned end = var->data.location +
> nir_uniform_type_size(var->type);
>nir->num_uniforms = MAX2(nir->num_uniforms, end);
> }
> -   nir_lower_io(nir, nir_var_uniform, nir_uniform_type_size);
> +   nir_lower_io(nir, nir_var_uniform, nir_uniform_type_size, false);
>
> const unsigned *program =
>brw_compile_fs(compiler, brw, mem_ctx, wm_key, _prog_data, nir,
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> b/src/mesa/drivers/dri/i965/brw_nir.c
> index 

Re: [Mesa-dev] [PATCH 6/7] i965: Rewrite FS input handling to use the new NIR intrinsics.

2016-07-18 Thread Chris Forbes
On Tue, Jul 19, 2016 at 8:26 AM, Kenneth Graunke 
wrote:

> +   default:
> +  assert(!"invalid intrinsic");
>

unreachable() ?
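
The reason for asking: with NDEBUG defined, assert() compiles to nothing and control simply leaves the switch, whereas an unreachable-style macro keeps the assertion in debug builds and also tells the compiler the path cannot happen. A minimal sketch in C, assuming a GCC- or Clang-compatible compiler; MY_UNREACHABLE and the enum are hypothetical names for illustration, not the Mesa definitions.

```c
#include <assert.h>

/* Hypothetical helper in the spirit of an unreachable() macro. Assumes a
 * compiler that provides __builtin_unreachable() (GCC, Clang).
 */
#define MY_UNREACHABLE(msg)     \
   do {                         \
      assert(!msg);             \
      __builtin_unreachable();  \
   } while (0)

/* Hypothetical enum standing in for the intrinsic being switched on. */
enum interp_mode {
   INTERP_PIXEL,
   INTERP_CENTROID,
   INTERP_SAMPLE,
};

static int
interp_mode_index(enum interp_mode mode)
{
   switch (mode) {
   case INTERP_PIXEL:
      return 0;
   case INTERP_CENTROID:
      return 1;
   case INTERP_SAMPLE:
      return 2;
   default:
      /* Debug builds abort here; release builds treat it as impossible. */
      MY_UNREACHABLE("invalid interp mode");
   }
}
```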


Re: [Mesa-dev] [PATCH] r600g: add support for B5G6R5 PBO uploads via texture buffers

2016-07-12 Thread Chris Forbes
On Tue, Jul 12, 2016 at 9:59 PM, Marek Olšák  wrote:
> +   *endian = r600_endian_swap(32);

I don't fully understand r600, but this 32 seems dubious?

- Chris


Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Don't leak scratch BOs for TCS/TES.

2016-06-12 Thread Chris Forbes
Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Mon, Jun 13, 2016 at 12:03 PM, Kenneth Graunke <kenn...@whitecape.org>
wrote:

> These need to be freed too.
>
> Cc: "12.0" <mesa-sta...@lists.freedesktop.org>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 7bbc128..a5c6581 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -1100,6 +1100,10 @@ intelDestroyContext(__DRIcontext * driContextPriv)
> drm_intel_bo_unreference(brw->curbe.curbe_bo);
> if (brw->vs.base.scratch_bo)
>drm_intel_bo_unreference(brw->vs.base.scratch_bo);
> +   if (brw->tcs.base.scratch_bo)
> +  drm_intel_bo_unreference(brw->tcs.base.scratch_bo);
> +   if (brw->tes.base.scratch_bo)
> +  drm_intel_bo_unreference(brw->tes.base.scratch_bo);
> if (brw->gs.base.scratch_bo)
>drm_intel_bo_unreference(brw->gs.base.scratch_bo);
> if (brw->wm.base.scratch_bo)
> --
> 2.8.3
>


Re: [Mesa-dev] [PATCH] mesa/get: return correct value for layer provoking vertex.

2016-06-02 Thread Chris Forbes
Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Fri, Jun 3, 2016 at 2:27 PM, Dave Airlie <airl...@gmail.com> wrote:

> From: Dave Airlie <airl...@redhat.com>
>
> This fixes:
> GL45-CTS.geometry_shader.layered_rendering.layered_rendering
>
> on Skylake.
>
> Signed-off-by: Dave Airlie <airl...@redhat.com>
> ---
>  src/mesa/main/get_hash_params.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/get_hash_params.py
> b/src/mesa/main/get_hash_params.py
> index 2124072..bfcbfd6 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -542,7 +542,7 @@ descriptor=[
>[ "MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS",
> "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxCombinedUniformComponents),
> extra_ARB_uniform_buffer_object_and_geometry_shader" ],
>
>  # GL_ARB_viewport_array / GL_OES_geometry_shader
> -  [ "LAYER_PROVOKING_VERTEX", "CONTEXT_ENUM(Light.ProvokingVertex),
> extra_ARB_viewport_array_or_oes_geometry_shader" ],
> +  [ "LAYER_PROVOKING_VERTEX",
> "CONTEXT_ENUM(Const.LayerAndVPIndexProvokingVertex),
> extra_ARB_viewport_array_or_oes_geometry_shader" ],
>
>  # GL_ARB_gpu_shader5 / GL_OES_geometry_shader
>[ "MAX_GEOMETRY_SHADER_INVOCATIONS",
> "CONST(MAX_GEOMETRY_SHADER_INVOCATIONS),
> extra_ARB_gpu_shader5_or_oes_geometry_shader" ],
> --
> 2.5.5
>


[Mesa-dev] [PATCH 3/4] mesa: Relax various desktop-only checks for cube arrays

2016-05-30 Thread Chris Forbes
Signed-off-by: Chris Forbes <chrisfor...@google.com>
---
 src/mesa/main/get.c  | 2 +-
 src/mesa/main/get_hash_params.py | 6 +++---
 src/mesa/main/teximage.c | 3 ++-
 src/mesa/main/texobj.c   | 2 +-
 src/mesa/main/texparam.c | 3 ++-
 src/mesa/main/texstorage.c   | 3 ++-
 6 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 9f70749..4f46572 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -1914,7 +1914,7 @@ tex_binding_to_index(const struct gl_context *ctx, GLenum 
binding)
   _mesa_has_OES_texture_buffer(ctx)) ?
  TEXTURE_BUFFER_INDEX : -1;
case GL_TEXTURE_BINDING_CUBE_MAP_ARRAY:
-  return _mesa_is_desktop_gl(ctx) && 
ctx->Extensions.ARB_texture_cube_map_array
+  return ctx->Extensions.ARB_texture_cube_map_array
  ? TEXTURE_CUBE_ARRAY_INDEX : -1;
case GL_TEXTURE_BINDING_2D_MULTISAMPLE:
   return _mesa_is_desktop_gl(ctx) && 
ctx->Extensions.ARB_texture_multisample
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 2124072..7193296 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -458,6 +458,9 @@ descriptor=[
   [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", 
"CONTEXT_INT(Const.MinProgramTextureGatherOffset), extra_ARB_texture_gather"],
   [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", 
"CONTEXT_INT(Const.MaxProgramTextureGatherOffset), extra_ARB_texture_gather"],
 
+# GL_ARB_texture_cube_map_array / ES3.1 with GL_OES_texture_cube_map_array
+  [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ],
+
 # GL_ARB_compute_shader / GLES 3.1
   [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", 
"CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), 
extra_ARB_compute_shader_es31" ],
   [ "MAX_COMPUTE_UNIFORM_BLOCKS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_COMPUTE].MaxUniformBlocks), 
extra_ARB_compute_shader_es31" ],
@@ -851,9 +854,6 @@ descriptor=[
 # GL_ARB_map_buffer_alignment
   [ "MIN_MAP_BUFFER_ALIGNMENT", "CONTEXT_INT(Const.MinMapBufferAlignment), 
NO_EXTRA" ],
 
-# GL_ARB_texture_cube_map_array
-  [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ],
-
 # GL_ARB_texture_gather
   [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", 
"CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
extra_ARB_texture_gather"],
 
diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 58b7f27..bfe0b18 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -1474,8 +1474,9 @@ legal_teximage_target(struct gl_context *ctx, GLuint 
dims, GLenum target)
   case GL_PROXY_TEXTURE_2D_ARRAY_EXT:
  return _mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_array;
   case GL_TEXTURE_CUBE_MAP_ARRAY:
-  case GL_PROXY_TEXTURE_CUBE_MAP_ARRAY:
  return ctx->Extensions.ARB_texture_cube_map_array;
+  case GL_PROXY_TEXTURE_CUBE_MAP_ARRAY:
+ return _mesa_is_desktop_gl(ctx) && 
ctx->Extensions.ARB_texture_cube_map_array;
   default:
  return GL_FALSE;
   }
diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index ed630bd..2e9d9e3 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -1579,7 +1579,7 @@ _mesa_tex_target_to_index(const struct gl_context *ctx, 
GLenum target)
   return _mesa_is_gles(ctx) && ctx->Extensions.OES_EGL_image_external
  ? TEXTURE_EXTERNAL_INDEX : -1;
case GL_TEXTURE_CUBE_MAP_ARRAY:
-  return _mesa_is_desktop_gl(ctx) && 
ctx->Extensions.ARB_texture_cube_map_array
+  return ctx->Extensions.ARB_texture_cube_map_array
  ? TEXTURE_CUBE_ARRAY_INDEX : -1;
case GL_TEXTURE_2D_MULTISAMPLE:
   return ((_mesa_is_desktop_gl(ctx) && 
ctx->Extensions.ARB_texture_multisample) ||
diff --git a/src/mesa/main/texparam.c b/src/mesa/main/texparam.c
index ba83f8f..d701b87 100644
--- a/src/mesa/main/texparam.c
+++ b/src/mesa/main/texparam.c
@@ -1243,6 +1243,8 @@ _mesa_legal_get_tex_level_parameter_target(struct 
gl_context *ctx, GLenum target
*/
   return (ctx->API == API_OPENGL_CORE && ctx->Version >= 31) ||
  _mesa_has_OES_texture_buffer(ctx);
+   case GL_TEXTURE_CUBE_MAP_ARRAY_ARB:
+  return ctx->Extensions.ARB_texture_cube_map_array;
}
 
if (!_mesa_is_desktop_gl(ctx))
@@ -1257,7 +1259,6 @@ _mesa_legal_get_tex_level_parameter_target(struct 
gl_context *ctx, GLenum target
   return GL_TRUE;
case GL_PROXY_TEXTURE_CUBE_MAP:
   return ctx->Extensions.ARB_texture_cube_map;
-   case GL_TEXTURE_CUBE_MAP_ARRAY_ARB:
case GL_PR

[Mesa-dev] [PATCH 4/4] docs: Note that OES_texture_cube_map_array is done.

2016-05-30 Thread Chris Forbes
Signed-off-by: Chris Forbes <chrisfor...@google.com>
---
 docs/GL3.txt  | 2 +-
 docs/relnotes/12.1.0.html | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index e8d401d..eeaed52 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -269,7 +269,7 @@ GLES3.2, GLSL ES 3.2
   GL_OES_tessellation_shader                    started (Ken)
   GL_OES_texture_border_clamp                   DONE (all drivers)
   GL_OES_texture_buffer                         DONE (i965, nvc0, radeonsi)
-  GL_OES_texture_cube_map_array                 not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
+  GL_OES_texture_cube_map_array                 DONE (all drivers that support GL_ARB_texture_cube_map_array)
   GL_OES_texture_stencil8                       DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array   DONE (all drivers that support GL_ARB_texture_multisample)
 
diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
index 50eee17..3cb0f41 100644
--- a/docs/relnotes/12.1.0.html
+++ b/docs/relnotes/12.1.0.html
@@ -44,7 +44,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 
-TBD
+   GL_OES_texture_cube_map_array on all drivers that support ES3.1 and 
ARB_texture_cube_map_array
 
 
 Bug fixes
-- 
2.8.3



[Mesa-dev] [PATCH 2/4] glsl: Add support for cube arrays in ES.

2016-05-30 Thread Chris Forbes
Signed-off-by: Chris Forbes <chrisfor...@google.com>
---
 src/compiler/glsl/builtin_functions.cpp | 12 
 src/compiler/glsl/builtin_types.cpp | 23 ---
 src/compiler/glsl/glsl_lexer.ll | 14 +++---
 3 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index edd02bb..46c8150 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -327,15 +327,19 @@ static bool
 fs_texture_cube_map_array(const _mesa_glsl_parse_state *state)
 {
return state->stage == MESA_SHADER_FRAGMENT &&
-  (state->is_version(400, 0) ||
-   state->ARB_texture_cube_map_array_enable);
+  (state->is_version(400, 320) ||
+   state->ARB_texture_cube_map_array_enable ||
+   state->EXT_texture_cube_map_array_enable ||
+   state->OES_texture_cube_map_array_enable);
 }
 
 static bool
 texture_cube_map_array(const _mesa_glsl_parse_state *state)
 {
-   return state->is_version(400, 0) ||
-  state->ARB_texture_cube_map_array_enable;
+   return state->is_version(400, 320) ||
+  state->ARB_texture_cube_map_array_enable ||
+  state->EXT_texture_cube_map_array_enable ||
+  state->OES_texture_cube_map_array_enable;
 }
 
 static bool
diff --git a/src/compiler/glsl/builtin_types.cpp 
b/src/compiler/glsl/builtin_types.cpp
index 5f208f8..2d1dc03 100644
--- a/src/compiler/glsl/builtin_types.cpp
+++ b/src/compiler/glsl/builtin_types.cpp
@@ -189,7 +189,7 @@ static const struct builtin_type_versions {
T(isamplerCube,130, 300)
T(isampler1DArray, 130, 999)
T(isampler2DArray, 130, 300)
-   T(isamplerCubeArray,   400, 999)
+   T(isamplerCubeArray,   400, 320)
T(isampler2DRect,  140, 999)
T(isamplerBuffer,  140, 320)
T(isampler2DMS,150, 310)
@@ -201,7 +201,7 @@ static const struct builtin_type_versions {
T(usamplerCube,130, 300)
T(usampler1DArray, 130, 999)
T(usampler2DArray, 130, 300)
-   T(usamplerCubeArray,   400, 999)
+   T(usamplerCubeArray,   400, 320)
T(usampler2DRect,  140, 999)
T(usamplerBuffer,  140, 320)
T(usampler2DMS,150, 310)
@@ -212,7 +212,7 @@ static const struct builtin_type_versions {
T(samplerCubeShadow,   130, 300)
T(sampler1DArrayShadow,130, 999)
T(sampler2DArrayShadow,130, 300)
-   T(samplerCubeArrayShadow,  400, 999)
+   T(samplerCubeArrayShadow,  400, 320)
T(sampler2DRectShadow, 140, 999)
 
T(struct_gl_DepthRangeParameters,  110, 100)
@@ -225,7 +225,7 @@ static const struct builtin_type_versions {
T(imageBuffer, 420, 320)
T(image1DArray,420, 999)
T(image2DArray,420, 310)
-   T(imageCubeArray,  420, 999)
+   T(imageCubeArray,  420, 320)
T(image2DMS,   420, 999)
T(image2DMSArray,  420, 999)
T(iimage1D,420, 999)
@@ -236,7 +236,7 @@ static const struct builtin_type_versions {
T(iimageBuffer,420, 320)
T(iimage1DArray,   420, 999)
T(iimage2DArray,   420, 310)
-   T(iimageCubeArray, 420, 999)
+   T(iimageCubeArray, 420, 320)
T(iimage2DMS,  420, 999)
T(iimage2DMSArray, 420, 999)
T(uimage1D,420, 999)
@@ -247,7 +247,7 @@ static const struct builtin_type_versions {
T(uimageBuffer,420, 320)
T(uimage1DArray,   420, 999)
T(uimage2DArray,   420, 310)
-   T(uimageCubeArray, 420, 999)
+   T(uimageCubeArray, 420, 320)
T(uimage2DMS,  420, 999)
T(uimage2DMSArray, 420, 999)
 
@@ -298,13 +298,22 @@ _mesa_glsl_initialize_types(struct _mesa_glsl_parse_state 
*state)
 * by the version-based loop, but attempting to add them a second time
 * is harmless.
 */
-   if (state->ARB_texture_cube_map_array_enable) {
+   if (state->ARB_texture_cube_map_array_enable ||
+   state->EXT_texture_cube_map_array_enable ||
+   state->OES_texture_cube_map_array_enable) {
   add_type(symbols, glsl_type::samplerCubeArray_type);
   add_type(symbols, glsl_type::samplerCubeArrayShadow_type);
   add_type(symbols, glsl_type::isamplerCubeArray_type);
   add_type(symbols, glsl_type::usamplerCubeArray_type);
}
 
+   if (state->EXT_texture_cube_map_array_enable ||

[Mesa-dev] [PATCH 1/4] mesa: Add scaffolding for OES_texture_cube_map_array

2016-05-30 Thread Chris Forbes
This is the same as ARB_texture_cube_map_array plus some image
interactions.

Signed-off-by: Chris Forbes <chrisfor...@google.com>
---
 src/compiler/glsl/glcpp/glcpp-parse.y|  5 -
 src/compiler/glsl/glsl_parser_extras.cpp |  2 ++
 src/compiler/glsl/glsl_parser_extras.h   |  4 
 src/mapi/glapi/gen/es_EXT.xml| 24 
 src/mesa/main/extensions_table.h |  2 ++
 5 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y 
b/src/compiler/glsl/glcpp/glcpp-parse.y
index 4022727..ffc309a 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -2329,7 +2329,10 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t 
*parser, intmax_t versio
add_builtin_define(parser, "GL_EXT_texture_buffer", 1);
add_builtin_define(parser, "GL_OES_texture_buffer", 1);
 }
-
+if (extensions->ARB_texture_cube_map_array) {
+   add_builtin_define(parser, "GL_EXT_texture_cube_map_array", 1);
+   add_builtin_define(parser, "GL_OES_texture_cube_map_array", 1);
+}
 if (extensions->OES_shader_io_blocks) {
add_builtin_define(parser, "GL_EXT_shader_io_blocks", 1);
add_builtin_define(parser, "GL_OES_shader_io_blocks", 1);
diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 843998d..96c021a 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -631,6 +631,7 @@ static const _mesa_glsl_extension 
_mesa_glsl_supported_extensions[] = {
EXT(OES_standard_derivatives,   false, true,  
OES_standard_derivatives),
EXT(OES_texture_3D, false, true,  dummy_true),
EXT(OES_texture_buffer, false, true,  OES_texture_buffer),
+   EXT(OES_texture_cube_map_array, false, true,  
ARB_texture_cube_map_array),
EXT(OES_texture_storage_multisample_2d_array, false, true, 
ARB_texture_multisample),
 
/* All other extensions go here, sorted alphabetically.
@@ -650,6 +651,7 @@ static const _mesa_glsl_extension 
_mesa_glsl_supported_extensions[] = {
EXT(EXT_shader_samples_identical,   true,  true,  
EXT_shader_samples_identical),
EXT(EXT_texture_array,  true,  false, EXT_texture_array),
EXT(EXT_texture_buffer, false, true,  OES_texture_buffer),
+   EXT(EXT_texture_cube_map_array, false, true,  
ARB_texture_cube_map_array),
 };
 
 #undef EXT
diff --git a/src/compiler/glsl/glsl_parser_extras.h 
b/src/compiler/glsl/glsl_parser_extras.h
index a0c1903..eee949e 100644
--- a/src/compiler/glsl/glsl_parser_extras.h
+++ b/src/compiler/glsl/glsl_parser_extras.h
@@ -643,6 +643,8 @@ struct _mesa_glsl_parse_state {
bool OES_texture_3D_warn;
bool OES_texture_buffer_enable;
bool OES_texture_buffer_warn;
+   bool OES_texture_cube_map_array_enable;
+   bool OES_texture_cube_map_array_warn;
bool OES_texture_storage_multisample_2d_array_enable;
bool OES_texture_storage_multisample_2d_array_warn;
 
@@ -678,6 +680,8 @@ struct _mesa_glsl_parse_state {
bool EXT_texture_array_warn;
bool EXT_texture_buffer_enable;
bool EXT_texture_buffer_warn;
+   bool EXT_texture_cube_map_array_enable;
+   bool EXT_texture_cube_map_array_warn;
/*@}*/
 
/** Extensions supported by the OpenGL implementation. */
diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
index 6886dab..b21e85a 100644
--- a/src/mapi/glapi/gen/es_EXT.xml
+++ b/src/mapi/glapi/gen/es_EXT.xml
@@ -924,6 +924,18 @@
 
 
 
+
+
+
+
+
+
+
+
+
+
+
+
 http://www.w3.org/2001/XInclude"/>
 
 
@@ -1297,4 +1309,16 @@
 
 
 
+
+
+
+
+
+
+
+
+
+
+
+
 
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index b715f7c..be1aeb9 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -243,6 +243,7 @@ EXT(EXT_texture_compression_latc, 
EXT_texture_compression_latc
 EXT(EXT_texture_compression_rgtc, ARB_texture_compression_rgtc 
  , GLL, GLC,  x ,  x , 2004)
 EXT(EXT_texture_compression_s3tc, EXT_texture_compression_s3tc 
  , GLL, GLC,  x ,  x , 2000)
 EXT(EXT_texture_cube_map, ARB_texture_cube_map 
  , GLL,  x ,  x ,  x , 2001)
+EXT(EXT_texture_cube_map_array  , ARB_texture_cube_map_array   
  ,  x,   x ,  x ,  31, 2014)
 EXT(EXT_texture_edge_clamp  , dummy_true   
  , GLL,  x ,  x ,  x , 1997)
 EXT(EXT_texture_env_add , dummy_true   
  , GLL,  x ,  x ,  x , 1999)
 EXT(EXT_texture_env_combine , dumm

Re: [Mesa-dev] [PATCH] glsl/parser: handle multiple layout sections with AST nodes.

2016-05-23 Thread Chris Forbes
Eek, that would do it.

Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Mon, May 23, 2016 at 5:55 PM, Dave Airlie <airl...@gmail.com> wrote:

> From: Dave Airlie <airl...@redhat.com>
>
> For geometry/compute inputs and tess control outputs, we create
> an AST node to keep track of some things. However if we have
> multiple layout sections, we don't ever link the node into the AST.
>
> This is because we create the node on the rightmost layout declaration
> and don't pass it back in so it gets linked at the end of the parsing
> of the rightmost.
>
> Signed-off-by: Dave Airlie <airl...@redhat.com>
> ---
>  src/compiler/glsl/glsl_parser.yy | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/compiler/glsl/glsl_parser.yy
> b/src/compiler/glsl/glsl_parser.yy
> index 09e346d..3885688 100644
> --- a/src/compiler/glsl/glsl_parser.yy
> +++ b/src/compiler/glsl/glsl_parser.yy
> @@ -2859,6 +2859,7 @@ layout_in_defaults:
>  merge_in_qualifier(& @1, state, $1, $$, false)) {
>  YYERROR;
>   }
> + $$ = $2;
>}
> }
> | layout_qualifier IN_TOK ';'
> @@ -2883,6 +2884,7 @@ layout_out_defaults:
>  merge_out_qualifier(& @1, state, $1, $$, false)) {
>  YYERROR;
>   }
> + $$ = $2;
>}
> }
> | layout_qualifier OUT_TOK ';'
> --
> 2.5.5
>


Re: [Mesa-dev] [PATCH] glsl/ast: subroutineTypes can't be returned from functions.

2016-05-22 Thread Chris Forbes
Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Mon, May 23, 2016 at 2:15 PM, Dave Airlie <airl...@gmail.com> wrote:

> From: Dave Airlie <airl...@redhat.com>
>
> These types can't be returned.
>
> This fixes:
>
> GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types
> for the return type case.
>
> Signed-off-by: Dave Airlie <airl...@redhat.com>
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/compiler/glsl/ast_to_hir.cpp
> b/src/compiler/glsl/ast_to_hir.cpp
> index aa8e810..0ec5b70 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -5402,6 +5402,15 @@ ast_function::hir(exec_list *instructions,
> name);
> }
>
> +   /**/
> +   if (return_type->is_subroutine()) {
> +  YYLTYPE loc = this->get_location();
> +  _mesa_glsl_error(&loc, state,
> +   "function `%s' return type can't be a subroutine type",
> +   name);
> +   }
> +
> +
> /* Create an ir_function if one doesn't already exist. */
> f = state->symbols->get_function(name);
> if (f == NULL) {
> --
> 2.5.5
>


Re: [Mesa-dev] arb_shader_subroutine CTS fixes

2016-05-22 Thread Chris Forbes
1, 3-11 inclusive are:

Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Mon, May 23, 2016 at 12:52 PM, Dave Airlie <airl...@gmail.com> wrote:

> Since I wrote ARB_shader_subroutine as mostly a hack to enable GL4.0,
> I felt a bit guilty and looked at CTS issues with it.
>
> There are a bunch of CTS tests that do explicit location/index with
> subroutines that were broken, along with a fair few of the subroutine
> tests.
>
> This is my first pass set of patches, there is still work to be done,
> but these should all fix things now.
>
> The big remaining ugly is that subroutines assignments are meant to
> be per context not per program, and I haven't gotten to fixing that yet.
>
> Dave.
>


Re: [Mesa-dev] [PATCH 06/12] glsl: fix subroutine uniform .length().

2016-05-22 Thread Chris Forbes
On Mon, May 23, 2016 at 12:52 PM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> This fixes .length() on subroutine uniform arrays, if
> we don't find the identifier normally, we look up the corresponding
> subroutine identifier instead.
>
> Fixes:
> GL45-CTS.shader_subroutine.arrays_of_arrays_of_uniforms
> GL45-CTS.shader_subroutine.arrayed_subroutine_uniforms
>
> Signed-off-by: Dave Airlie 
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/compiler/glsl/ast_to_hir.cpp
> b/src/compiler/glsl/ast_to_hir.cpp
> index 434734d..ecd1327 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -1917,6 +1917,14 @@ ast_expression::do_hir(exec_list *instructions,
>ir_variable *var =
>
> state->symbols->get_variable(this->primary_expression.identifier);
>
> +  if (var == NULL) {
> + /* the identifier might be a subroutine name */
>

Being pedantic, but `subroutine uniform name`, right?


> + char *sub_name;
> + sub_name = ralloc_asprintf(ctx, "%s_%s",
> _mesa_shader_stage_to_subroutine_prefix(state->stage),
> this->primary_expression.identifier);
> + var = state->symbols->get_variable(sub_name);
> + ralloc_free(sub_name);
> +  }
> +
>if (var != NULL) {
>   var->data.used = true;
>   result = new(ctx) ir_dereference_variable(var);
> --
> 2.5.5
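
The fallback lookup described in the patch above can be sketched in isolation. This is only an illustration under stated assumptions: `struct symbol`, `find_symbol` and `find_with_prefix_fallback` are hypothetical stand-ins, not Mesa's symbol table API, and the prefix string is whatever the caller passes in.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a symbol table entry; not Mesa's real type. */
struct symbol {
   const char *name;
   int array_len;
};

/* Linear search by exact name; returns NULL when nothing matches. */
static const struct symbol *
find_symbol(const struct symbol *tab, size_t n, const char *name)
{
   for (size_t i = 0; i < n; i++) {
      if (strcmp(tab[i].name, name) == 0)
         return &tab[i];
   }
   return NULL;
}

/* Look up `ident`; if that fails, retry with "<prefix>_<ident>". */
static const struct symbol *
find_with_prefix_fallback(const struct symbol *tab, size_t n,
                          const char *prefix, const char *ident)
{
   const struct symbol *sym = find_symbol(tab, n, ident);

   if (sym == NULL) {
      char buf[128];
      int len = snprintf(buf, sizeof(buf), "%s_%s", prefix, ident);

      /* Only retry if the prefixed name fit in the buffer. */
      if (len > 0 && (size_t)len < sizeof(buf))
         sym = find_symbol(tab, n, buf);
   }
   return sym;
}
```

The ordinary name is tried first, so a regular variable is never shadowed by a prefixed subroutine uniform entry.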


Re: [Mesa-dev] [PATCH] glsl: be more strict when validating shader inputs

2016-05-12 Thread Chris Forbes
With the version cutoff fixed, this and the patch it builds on are
(squashed together or not):

Reviewed-by: Chris Forbes <chrisfor...@google.com>

On Fri, May 13, 2016 at 4:58 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

> On Fri, May 13, 2016 at 12:51 AM, Dave Airlie <airl...@gmail.com> wrote:
> >>> second argument is for ES... 0 means "never").
> >>
> >> I see.  (You can tell how much of this sort of code I've written...).
> >>
> >> I don't know that I'd trust me but it looks fine as far as I can see.
> >> Thanks for taking care of 4.50 while you were in the neighborhood.  For
> what
> >> it's worth,
> >
> > Do we know if the addition of swizzles is just correcting an oversight?
> >
> > Should we just enable it unconditionally?
>
> Good question.
>
> Looks like the cut-off is actually 4.40, not 4.50 (oops). I don't see
> it explicitly listed in changes of the 4.40 spec.
>
> In 4.30: "For all of the interpolation functions, interpolant must be
> an input variable or an element of an input
> variable declared as an array. Component selection operators (e.g.,
> .xy) may not be used when specifying
> interpolant."
>
> In 4.40: "For all of the interpolation functions, interpolant must be
> an input variable or an element of an input
> variable declared as an array. Component selection operators (e.g.,
> .xy) may be used when specifying
> interpolant."
>
> I'd say it's pretty clear that it's not allowed in 4.30 and allowed in
> 4.40. I'm going to fix the version cut-off locally, but if you want me
> to enable it everywhere, let me know.
>
>   -ilia
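
A version cutoff of the kind discussed above, including the "0 means never" convention for the ES argument, can be sketched as follows. The struct, its fields and both function names are assumptions for illustration, not a copy of Mesa's parser state.

```c
#include <stdbool.h>

/* Hypothetical parser state; the field names are assumptions. */
struct parse_state {
   bool es_shader;              /* true for GLSL ES shaders */
   unsigned language_version;   /* e.g. 430, 440, 310 */
};

/* True if the shader's version meets the requirement for its API.
 * A requirement of 0 means "never available in this API". */
static bool
is_version(const struct parse_state *s,
           unsigned required_glsl, unsigned required_glsl_es)
{
   unsigned required = s->es_shader ? required_glsl_es : required_glsl;
   return required != 0 && s->language_version >= required;
}

/* Swizzled interpolants: desktop GLSL 4.40 and later, never in ES. */
static bool
interpolant_swizzle_allowed(const struct parse_state *s)
{
   return is_version(s, 440, 0);
}
```

With this shape, moving the cutoff from 4.50 to 4.40 is a one-constant change, and enabling the feature unconditionally would mean dropping the check altogether.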


Re: [Mesa-dev] [PATCH 4/4] i965: Enable ARB_texture_stencil8 and OES_texture_stencil8 on Gen8+.

2016-04-26 Thread Chris Forbes
Series is:

Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Wed, Apr 27, 2016 at 3:33 AM, Thomas Helland <thomashellan...@gmail.com>
wrote:

> I guess you should also update GL4.4 section in GL3.txt.
> And add the extension to the release notes.
> Either a follow up patch or squashed into this one is fine with me.
>
> Regards,
> Thomas
>
> On Apr 26, 2016 12:25, "Kenneth Graunke" <kenn...@whitecape.org> wrote:
> >
> > Stencil texturing is required by ES 3.1.  Apparently we never actually
> > turned it on.  Do that now.  Also turn on the desktop extension.
> >
> > Fixes nine dEQP-GLES31.functional tests:
> >
> > stencil_texturing.format.stencil_index8_2d
> > texture.border_clamp.formats.stencil_index8.nearest_size_pot
> > texture.border_clamp.formats.stencil_index8.nearest_size_npot
> > texture.border_clamp.formats.stencil_index8.gather_size_pot
> > texture.border_clamp.formats.stencil_index8.gather_size_npot
> > texture.border_clamp.unused_channels.stencil_index8
> > state_query.internal_format.renderbuffer.stencil_index8_samples
> > state_query.internal_format.texture_2d_multisample.stencil_index8_samples
> >
> state_query.internal_format.texture_2d_multisample_array.stencil_index8_samples
> >
> > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> > ---
> >  src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c | 7 ---
> >  src/mesa/drivers/dri/i965/brw_surface_formats.c   | 1 +
> >  src/mesa/drivers/dri/i965/intel_extensions.c  | 1 +
> >  3 files changed, 2 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c
> b/src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c
> > index 7e04248..71ab7be 100644
> > --- a/src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c
> > @@ -436,12 +436,6 @@ brw_meta_stencil_blit(struct brw_context *brw,
> > GLenum target;
> >
> > _mesa_meta_fb_tex_blit_begin(ctx, );
> > -   /* XXX: Pretend to support stencil textures so
> _mesa_base_tex_format()
> > -* returns a valid format.  When we properly support the extension,
> we
> > -* should remove this.
> > -*/
> > -   assert(ctx->Extensions.ARB_texture_stencil8 == false);
> > -   ctx->Extensions.ARB_texture_stencil8 = true;
> >
> > drawFb = ctx->Driver.NewFramebuffer(ctx, 0xDEADBEEF);
> > if (drawFb == NULL) {
> > @@ -484,7 +478,6 @@ brw_meta_stencil_blit(struct brw_context *brw,
> > _mesa_DrawArrays(GL_TRIANGLE_FAN, 0, 4);
> >
> >  error:
> > -   ctx->Extensions.ARB_texture_stencil8 = false;
> > _mesa_meta_fb_tex_blit_end(ctx, target, );
> > _mesa_meta_end(ctx);
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > index c65f0d3..16667b9 100644
> > --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > @@ -704,6 +704,7 @@ brw_init_surface_formats(struct brw_context *brw)
> > ctx->TextureFormatSupported[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
> > ctx->TextureFormatSupported[MESA_FORMAT_Z_FLOAT32] = true;
> > ctx->TextureFormatSupported[MESA_FORMAT_Z32_FLOAT_S8X24_UINT] = true;
> > +   ctx->TextureFormatSupported[MESA_FORMAT_S_UINT8] = true;
> >
> > /* Benchmarking shows that Z16 is slower than Z24, so there's no
> reason to
> >  * use it unless you're under memory (not memory bandwidth) pressure.
> > diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> > index 907f24f..820d573 100644
> > --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> > +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> > @@ -368,6 +368,7 @@ intelInitExtensions(struct gl_context *ctx)
> >
> > if (brw->gen >= 8) {
> >ctx->Extensions.ARB_stencil_texturing = true;
> > +  ctx->Extensions.ARB_texture_stencil8 = true;
> > }
> >
> > if (brw->gen >= 9) {
> > --
> > 2.8.0
> >


Re: [Mesa-dev] [PATCH] mesa: default FixedSampleLocations to true when using a dummy image

2016-02-13 Thread Chris Forbes
Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Fri, Feb 12, 2016 at 9:31 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

> GL_ARB_texture_multisample and GLES 3.1 expect the initial value to be
> GL_TRUE. This fixes
>
>
> dEQP-GLES31.functional.state_query.texture_level.texture_2d_multisample_array.fixed_sample_locations_integer
>
> and a few related tests.
>
> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
> ---
>  src/mesa/main/texparam.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/main/texparam.c b/src/mesa/main/texparam.c
> index ed83830..c5f493f 100644
> --- a/src/mesa/main/texparam.c
> +++ b/src/mesa/main/texparam.c
> @@ -1310,6 +1310,7 @@ get_tex_level_parameter_image(struct gl_context *ctx,
>dummy_image.TexFormat = MESA_FORMAT_NONE;
>dummy_image.InternalFormat = GL_RGBA;
>dummy_image._BaseFormat = GL_NONE;
> +  dummy_image.FixedSampleLocations = GL_TRUE;
>
>img = &dummy_image;
> }
> --
> 2.4.10
>


[Mesa-dev] [PATCH] i965: ir: dump floats as %-g rather than %f, so we can see denormals

2016-02-10 Thread Chris Forbes
Signed-off-by: Chris Forbes <chr...@ijw.co.nz>
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 41a3f81..8734560 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4726,7 +4726,7 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file)
   case IMM:
  switch (inst->src[i].type) {
  case BRW_REGISTER_TYPE_F:
-fprintf(file, "%ff", inst->src[i].f);
+fprintf(file, "%-gf", inst->src[i].f);
 break;
  case BRW_REGISTER_TYPE_W:
  case BRW_REGISTER_TYPE_D:
-- 
2.7.0
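
To see why the format change matters: "%f" prints six digits after the decimal point, so a denormal float shows up as 0.000000, while "%g" switches to exponent notation for very small values. A minimal standalone sketch, where `format_float_imm` is a hypothetical helper and not part of the driver:

```c
#include <stdio.h>

/* Format a float immediate with either the old ("%ff") or the new
 * ("%-gf") format string; the trailing 'f' is a literal suffix. */
static int
format_float_imm(char *buf, size_t size, float value, int use_g)
{
   if (use_g)
      return snprintf(buf, size, "%-gf", (double)value);
   return snprintf(buf, size, "%ff", (double)value);
}
```

Formatting 1e-40f (a denormal in single precision) with the old string yields "0.000000f"; the new string keeps the exponent, so the value stays distinguishable from zero in the dump.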



Re: [Mesa-dev] [PATCH] i965/skl: Utilize new 5th bit for gateway messages

2016-01-26 Thread Chris Forbes
Might be a good idea to update the comment above the second hunk. It's very
precise about which bits, and so now wrong.

- Chris

On Wed, Jan 27, 2016 at 12:44 PM, Ben Widawsky 
wrote:

> Cc: Jordan Justen 
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index aad512f..820c1d4 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -924,6 +924,8 @@ void
>  fs_visitor::emit_barrier()
>  {
> assert(devinfo->gen >= 7);
> +   const uint32_t barrier_id_mask =
> +  devinfo->gen >= 9 ? 0x8f00u : 0x0f00u;
>
> /* We are getting the barrier ID from the compute shader header */
> assert(stage == MESA_SHADER_COMPUTE);
> @@ -937,7 +939,7 @@ fs_visitor::emit_barrier()
>
> /* Copy bits 27:24 of r0.2 (barrier id) to the message payload reg.2 */
> fs_reg r0_2 = fs_reg(retype(brw_vec1_grf(0, 2), BRW_REGISTER_TYPE_UD));
> -   pbld.AND(component(payload, 2), r0_2, brw_imm_ud(0x0f00u));
> +   pbld.AND(component(payload, 2), r0_2, brw_imm_ud(barrier_id_mask));
>
> /* Emit a gateway "barrier" message using the payload we set up,
> followed
>  * by a wait instruction.
> --
> 2.7.0
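
As a cross-check on the masks: the comment in the patch refers to bits 27:24, and a mask covering exactly those bits is 0x0f000000, so the shorter constants in the archived copy may have lost digits. The helper below is a hypothetical sketch for deriving such masks; it makes no claim about which bit the new Gen9 field uses.

```c
#include <stdint.h>

/* Mask with bits hi..lo (inclusive) set; requires 31 >= hi >= lo. */
static uint32_t
bit_range_mask(unsigned hi, unsigned lo)
{
   unsigned width = hi - lo + 1;
   /* Shifting a 32-bit value by 32 is undefined, so handle it apart. */
   uint32_t ones = (width == 32) ? UINT32_MAX
                                 : ((UINT32_C(1) << width) - 1);
   return ones << lo;
}
```

Here `bit_range_mask(27, 24)` gives 0x0f000000; OR-ing in another single-bit mask extends the field without touching the existing bits.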


Re: [Mesa-dev] [PATCH] glsl: remove old FINISHME

2016-01-25 Thread Chris Forbes
Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Tue, Jan 26, 2016 at 6:22 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> This should have been removed long ago.
> ---
>  src/glsl/linker.cpp | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index 4e63698..7925709 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -4679,8 +4679,6 @@ link_shaders(struct gl_context *ctx, struct
> gl_shader_program *prog)
>  &prog->NumShaderStorageBlocks,
>  &prog->SsboInterfaceBlockIndex);
>
> -   /* FINISHME: Assign fragment shader output locations. */
> -
> for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
>if (prog->_LinkedShaders[i] == NULL)
>  continue;
> --
> 2.5.0
>


Re: [Mesa-dev] [PATCH] i965: Mark TCS URB writes as having side effects.

2016-01-11 Thread Chris Forbes
Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Tue, Jan 12, 2016 at 12:04 PM, Kenneth Graunke <kenn...@whitecape.org>
wrote:

> This adds barrier dependencies around TCS_OPCODE_URB_WRITE, preventing
> reads and writes from being incorrectly scheduled.
>
> Fixes rendering in GFXBench 4.0's tessellation demo.
>
> For some reason, we haven't ever listed URB writes as having
> side-effects.  This hasn't been a problem because in most stages, we
> never read from the URB, and only write to each location once.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93526
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index efc24f9..0ac3f4a 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -1022,6 +1022,7 @@ backend_instruction::has_side_effects() const
> case SHADER_OPCODE_URB_WRITE_SIMD8_MASKED_PER_SLOT:
> case FS_OPCODE_FB_WRITE:
> case SHADER_OPCODE_BARRIER:
> +   case TCS_OPCODE_URB_WRITE:
> case TCS_OPCODE_RELEASE_INPUT:
>return true;
> default:
> --
> 2.7.0
>


Re: [Mesa-dev] [PATCH v2 7/7] vbo: cache/memoize the result of vbo_get_minmax_indices

2016-01-08 Thread Chris Forbes
Reviewed-by: Chris Forbes <chr...@ijw.co.nz>
On 8 Jan 2016 9:03 AM, "Nicolai Hähnle" <nhaeh...@gmail.com> wrote:

> From: Nicolai Hähnle <nicolai.haeh...@amd.com>
>
> Some games developers are unaware that an index buffer in a VBO still needs
> to be read by the CPU if some varying data comes from a user pointer
> (unless
> glDrawRangeElements and friends are used). This is particularly bad when
> they tell us that the index buffer should live in VRAM.
>
> This cache helps, e.g. lifting This War Of Mine (a particularly bad
> offender) from under 10fps to slightly over 20fps on a Carrizo.
>
> Note that there is nothing prohibiting a user from rendering from multiple
> threads simultaneously with the same index buffer, hence the locking. (The
> internal buffer map taken for the buffer still leads to a race, but at
> least
> the locks are a move in the right direction.)
>
> v2: disable the cache on USAGE_TEXTURE_BUFFER as well (Chris Forbes)
> ---
> This should be correct if a bit conservative if stores aren't used
> (ARB_texture_buffer is older than ARB_shader_image_load_store), but that's
> not worth losing sleep over.
>
>  src/mesa/main/bufferobj.c   |  10 +++
>  src/mesa/main/mtypes.h  |   4 +
>  src/mesa/vbo/vbo.h  |   3 +
>  src/mesa/vbo/vbo_minmax_index.c | 164
> 
>  4 files changed, 181 insertions(+)
>
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index d88d9e3..a113cac 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -458,6 +458,7 @@ _mesa_delete_buffer_object(struct gl_context *ctx,
>  {
> (void) ctx;
>
> +   vbo_delete_minmax_cache(bufObj);
> _mesa_align_free(bufObj->Data);
>
> /* assign strange values here to help w/ debugging */
> @@ -1528,6 +1529,7 @@ _mesa_buffer_storage(struct gl_context *ctx, struct
> gl_buffer_object *bufObj,
>
> bufObj->Written = GL_TRUE;
> bufObj->Immutable = GL_TRUE;
> +   bufObj->MinMaxCacheDirty = GL_TRUE;
>
> assert(ctx->Driver.BufferData);
> if (!ctx->Driver.BufferData(ctx, target, size, data, GL_DYNAMIC_DRAW,
> @@ -1641,6 +1643,7 @@ _mesa_buffer_data(struct gl_context *ctx, struct
> gl_buffer_object *bufObj,
> FLUSH_VERTICES(ctx, _NEW_BUFFER_OBJECT);
>
> bufObj->Written = GL_TRUE;
> +   bufObj->MinMaxCacheDirty = GL_TRUE;
>
>  #ifdef VBO_DEBUG
> printf("glBufferDataARB(%u, sz %ld, from %p, usage 0x%x)\n",
> @@ -1753,6 +1756,7 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct
> gl_buffer_object *bufObj,
> }
>
> bufObj->Written = GL_TRUE;
> +   bufObj->MinMaxCacheDirty = GL_TRUE;
>
> assert(ctx->Driver.BufferSubData);
> ctx->Driver.BufferSubData(ctx, offset, size, data, bufObj);
> @@ -1872,6 +1876,8 @@ _mesa_clear_buffer_sub_data(struct gl_context *ctx,
> if (size == 0)
>return;
>
> +   bufObj->MinMaxCacheDirty = GL_TRUE;
> +
> if (data == NULL) {
>/* clear to zeros, per the spec */
>ctx->Driver.ClearBufferSubData(ctx, offset, size,
> @@ -2285,6 +2291,8 @@ _mesa_copy_buffer_sub_data(struct gl_context *ctx,
>}
> }
>
> +   dst->MinMaxCacheDirty = GL_TRUE;
> +
> ctx->Driver.CopyBufferSubData(ctx, src, dst, readOffset, writeOffset,
> size);
>  }
>
> @@ -2494,6 +2502,8 @@ _mesa_map_buffer_range(struct gl_context *ctx,
>
>if (access & GL_MAP_PERSISTENT_BIT)
>   bufObj->UsageHistory |= USAGE_PERSISTENT_WRITE_MAP;
> +
> +  bufObj->MinMaxCacheDirty = GL_TRUE;
> }
>
>  #ifdef VBO_DEBUG
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 5b87c7c..37a088b 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1283,6 +1283,10 @@ struct gl_buffer_object
> GLuint NumMapBufferWriteCalls;
>
> struct gl_buffer_mapping Mappings[MAP_COUNT];
> +
> +   /** Memoization of min/max index computations for static index buffers
> */
> +   struct hash_table *MinMaxCache;
> +   GLboolean MinMaxCacheDirty;
>  };
>
>
> diff --git a/src/mesa/vbo/vbo.h b/src/mesa/vbo/vbo.h
> index 0b8b6a9..6494aa5 100644
> --- a/src/mesa/vbo/vbo.h
> +++ b/src/mesa/vbo/vbo.h
> @@ -181,6 +181,9 @@ vbo_sizeof_ib_type(GLenum type)
>  }
>
>  void
> +vbo_delete_minmax_cache(struct gl_buffer_object *bufferObj);
> +
> +void
>  vbo_get_minmax_indices(struct gl_context *ctx, const struct _mesa_prim
> *prim,
> const struct _mesa_index_buffer *ib,
> GLuint *min_index, GLuint *max_index, GLuint
> nr_prims);

Re: [Mesa-dev] [PATCH 7/7] vbo: cache/memoize the result of vbo_get_minmax_indices

2016-01-07 Thread Chris Forbes
I think this misses the image load/store case. (*samplerBuffer)

- Chris
From: Nicolai Hähnle 

Some games developers are unaware that an index buffer in a VBO still needs
to be read by the CPU if some varying data comes from a user pointer (unless
glDrawRangeElements and friends are used). This is particularly bad when
they tell us that the index buffer should live in VRAM.

This cache helps, e.g. lifting This War Of Mine (a particularly bad
offender) from under 10fps to slightly over 20fps on a Carrizo.

Note that there is nothing prohibiting a user from rendering from multiple
threads simultaneously with the same index buffer, hence the locking. (The
internal buffer map taken for the buffer still leads to a race, but at least
the locks are a move in the right direction.)
---
 src/mesa/main/bufferobj.c   |  10 +++
 src/mesa/main/mtypes.h  |   4 +
 src/mesa/vbo/vbo.h  |   3 +
 src/mesa/vbo/vbo_minmax_index.c | 163

 4 files changed, 180 insertions(+)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index b06f528..f431bb8 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -453,6 +453,7 @@ _mesa_delete_buffer_object(struct gl_context *ctx,
 {
(void) ctx;

+   vbo_delete_minmax_cache(bufObj);
_mesa_align_free(bufObj->Data);

/* assign strange values here to help w/ debugging */
@@ -1513,6 +1514,7 @@ _mesa_buffer_storage(struct gl_context *ctx, struct
gl_buffer_object *bufObj,

bufObj->Written = GL_TRUE;
bufObj->Immutable = GL_TRUE;
+   bufObj->MinMaxCacheDirty = GL_TRUE;

assert(ctx->Driver.BufferData);
if (!ctx->Driver.BufferData(ctx, target, size, data, GL_DYNAMIC_DRAW,
@@ -1626,6 +1628,7 @@ _mesa_buffer_data(struct gl_context *ctx, struct
gl_buffer_object *bufObj,
FLUSH_VERTICES(ctx, _NEW_BUFFER_OBJECT);

bufObj->Written = GL_TRUE;
+   bufObj->MinMaxCacheDirty = GL_TRUE;

 #ifdef VBO_DEBUG
printf("glBufferDataARB(%u, sz %ld, from %p, usage 0x%x)\n",
@@ -1738,6 +1741,7 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct
gl_buffer_object *bufObj,
}

bufObj->Written = GL_TRUE;
+   bufObj->MinMaxCacheDirty = GL_TRUE;

assert(ctx->Driver.BufferSubData);
ctx->Driver.BufferSubData(ctx, offset, size, data, bufObj);
@@ -1857,6 +1861,8 @@ _mesa_clear_buffer_sub_data(struct gl_context *ctx,
if (size == 0)
   return;

+   bufObj->MinMaxCacheDirty = GL_TRUE;
+
if (data == NULL) {
   /* clear to zeros, per the spec */
   ctx->Driver.ClearBufferSubData(ctx, offset, size,
@@ -2270,6 +2276,8 @@ _mesa_copy_buffer_sub_data(struct gl_context *ctx,
   }
}

+   dst->MinMaxCacheDirty = GL_TRUE;
+
ctx->Driver.CopyBufferSubData(ctx, src, dst, readOffset, writeOffset,
size);
 }

@@ -2479,6 +2487,8 @@ _mesa_map_buffer_range(struct gl_context *ctx,

   if (access & GL_MAP_PERSISTENT_BIT)
  bufObj->UsageHistory |= USAGE_PERSISTENT_WRITE_MAP;
+
+  bufObj->MinMaxCacheDirty = GL_TRUE;
}

 #ifdef VBO_DEBUG
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4d625da..d4c41a7 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1283,6 +1283,10 @@ struct gl_buffer_object
GLuint NumMapBufferWriteCalls;

struct gl_buffer_mapping Mappings[MAP_COUNT];
+
+   /** Memoization of min/max index computations for static index buffers
*/
+   struct hash_table *MinMaxCache;
+   GLboolean MinMaxCacheDirty;
 };


diff --git a/src/mesa/vbo/vbo.h b/src/mesa/vbo/vbo.h
index dd9b428..59c7351 100644
--- a/src/mesa/vbo/vbo.h
+++ b/src/mesa/vbo/vbo.h
@@ -169,6 +169,9 @@ vbo_sizeof_ib_type(GLenum type)
 }

 void
+vbo_delete_minmax_cache(struct gl_buffer_object *bufferObj);
+
+void
 vbo_get_minmax_indices(struct gl_context *ctx, const struct _mesa_prim
*prim,
const struct _mesa_index_buffer *ib,
GLuint *min_index, GLuint *max_index, GLuint
nr_prims);
diff --git a/src/mesa/vbo/vbo_minmax_index.c
b/src/mesa/vbo/vbo_minmax_index.c
index b43ed98..9ac0168 100644
--- a/src/mesa/vbo/vbo_minmax_index.c
+++ b/src/mesa/vbo/vbo_minmax_index.c
@@ -32,6 +32,162 @@
 #include "main/macros.h"
 #include "main/sse_minmax.h"
 #include "x86/common_x86_asm.h"
+#include "util/hash_table.h"
+
+
+struct minmax_cache_key {
+   GLintptr offset;
+   GLuint count;
+   GLenum type;
+};
+
+
+struct minmax_cache_entry {
+   struct minmax_cache_key key;
+   GLuint min;
+   GLuint max;
+};
+
+
+static uint32_t
+vbo_minmax_cache_hash(const struct minmax_cache_key *key)
+{
+   return _mesa_hash_data(key, sizeof(*key));
+}
+
+
+static bool
+vbo_minmax_cache_key_equal(const struct minmax_cache_key *a,
+   const struct minmax_cache_key *b)
+{
+   return (a->offset == b->offset) && (a->count == b->count) && (a->type
== b->type);
+}
+
+
+static void
+vbo_minmax_cache_delete_entry(struct hash_entry *entry)
+{
+   free(entry->data);
+}
+
+
+static GLboolean
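
The idea behind the cache can be sketched on its own: memoize the min/max of a scanned index range, and throw the results away whenever the buffer is written. Everything below is a simplified assumption. It uses a small fixed array instead of a hash table, has no locking, and how the real patch consumes MinMaxCacheDirty is not visible in the excerpt above.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8

/* One memoized result, keyed by the range that was scanned. */
struct minmax_entry {
   size_t first;
   size_t count;
   unsigned min;
   unsigned max;
   bool valid;
};

/* Hypothetical index buffer; far simpler than gl_buffer_object. */
struct index_buffer {
   unsigned *indices;
   struct minmax_entry cache[CACHE_SLOTS];
   unsigned next_slot;   /* round-robin replacement */
   bool cache_dirty;     /* set by every write to the buffer */
   unsigned scans;       /* number of full scans, for illustration */
};

/* Every write marks the cache dirty, mirroring the bufferobj.c hunks. */
static void
write_index(struct index_buffer *ib, size_t pos, unsigned value)
{
   ib->indices[pos] = value;
   ib->cache_dirty = true;
}

/* Return the min/max of indices[first .. first+count-1]; count > 0. */
static void
get_minmax(struct index_buffer *ib, size_t first, size_t count,
           unsigned *min_out, unsigned *max_out)
{
   if (ib->cache_dirty) {
      /* Contents changed: drop every memoized result. */
      memset(ib->cache, 0, sizeof(ib->cache));
      ib->cache_dirty = false;
   }

   for (unsigned i = 0; i < CACHE_SLOTS; i++) {
      const struct minmax_entry *e = &ib->cache[i];
      if (e->valid && e->first == first && e->count == count) {
         *min_out = e->min;
         *max_out = e->max;
         return;
      }
   }

   /* Cache miss: scan the range on the CPU. */
   unsigned min = UINT_MAX, max = 0;
   for (size_t i = first; i < first + count; i++) {
      if (ib->indices[i] < min)
         min = ib->indices[i];
      if (ib->indices[i] > max)
         max = ib->indices[i];
   }
   ib->scans++;

   struct minmax_entry *slot = &ib->cache[ib->next_slot];
   ib->next_slot = (ib->next_slot + 1) % CACHE_SLOTS;
   *slot = (struct minmax_entry){ first, count, min, max, true };

   *min_out = min;
   *max_out = max;
}
```

Repeated draws over the same range of a static index buffer then cost one scan in total, which is the case the commit message is targeting.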

Re: [Mesa-dev] [PATCH 0/10] Tessellation shaders for Gen7/7.5.

2015-12-24 Thread Chris Forbes
Ken,

That's great news, that hang was quite the head scratcher.

- Chris
On 25 Dec 2015 14:34, "Kenneth Graunke"  wrote:

> This morning, I woke up and somehow "knew" what was causing my HS GPU hangs
> on Gen7/7.5.  It turns out I was (completely) wrong, but through some
> miraculous series of illogical leaps, I arrived at a solution anyway.
>
> I don't honestly know how I got it working on Christmas Eve after
> failing to figure it out for months on end.  After exhausting every bit
> of documentation and every tool available, and finding zero information,
> somehow randomly flailing in the dark resulted in a solution, today of
> all days.  Honestly, I had pretty much no hope for figuring this out,
> so I'm relieved to have it working at last...
>
> It turns out that setting interleave on the EOT URB write does bad things.
> Fixing this fixed all the GPU hangs when releasing inputs one at a time.
> I then added back the ability to release inputs in pairs, which caused
> more GPU hangs.  It turned out I needed to be more careful and enable
> both halves.
>
> Everything seems to be working just fine now, so let's turn it on.
>
> --Ken
>


Re: [Mesa-dev] [PATCH 0/8] Implement EXT_shader_samples_identical

2015-11-19 Thread Chris Forbes
Series (with the v2 changes) is:

Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Thu, Nov 19, 2015 at 12:46 PM, Ian Romanick <i...@freedesktop.org> wrote:

> This patch series implements a new GL extension,
> EXT_shader_samples_identical.  This extension allows shaders to
> determine when all of the samples in a particular texel are the same.
> This takes advantage of the way compressed multisample surfaces are
> stored on modern Intel and AMD hardware.  This enables optimizations in
> application multisample resolve filters, etc.
>
> I really wanted to get this in the next Mesa release.  For some reason,
> I thought the branch point was after Thanksgiving (which is next
> Thursday).  Ken reminded me yesterday that the branch point is actually
> this Friday. :( As a result, I'm sending it out today to get review as
> soon as possible.
>
> I also wanted to get as much time as possible for other drivers to get
> implementations.  I worked with Graham Sellers on this extension, and he
> assures me that the implementation on modern Radeons is trivial.  My
> expectation is that it should be about the same as the Intel
> implementation.
>
> There will be some extra TGSI bits needed, but that should also be
> trivial.  For the NIR and i965 backend bits, I mostly copied and blended
> the implementations of txf_ms and query_samples.
>
> There are currently only trivial piglit tests, but I am working on more.
> I basically hacked up tests/spec/arb_texture_multisample/texelfetch.c to
> use the extension to render different colors based on whether
> textureSamplesIdenticalEXT returned true or false.  The resulting image
> and the generated assembly look good.  My plan is to get a set of real
> tests out by midday tomorrow.
>
> As soon as we're confident that the spec is good, I'll submit it to
> Khronos for publication in the registry.  I'm still waiting on feedback
> from another closed-source driver writer.
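
The resolve-filter optimization mentioned in the cover letter can be illustrated with a CPU-side sketch. This is not shader code and not the extension's API: `resolve_texel` is a hypothetical function, and the `samples_identical` flag stands in for whatever the shader query would return.

```c
#include <stdbool.h>

#define NUM_SAMPLES 4

/* Resolve one multisampled texel to a single value. When the caller
 * knows all samples are equal, one fetch is enough; otherwise fall
 * back to averaging every sample. */
static float
resolve_texel(const float samples[NUM_SAMPLES], bool samples_identical)
{
   if (samples_identical)
      return samples[0];

   float sum = 0.0f;
   for (int i = 0; i < NUM_SAMPLES; i++)
      sum += samples[i];
   return sum / NUM_SAMPLES;
}
```

The slow path gives the same answer when the samples happen to be equal, so a conservative "false" from the query only costs time, not correctness.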


Re: [Mesa-dev] [PATCH 0/8] Implement EXT_shader_samples_identical

2015-11-18 Thread Chris Forbes
It lives! Thanks for picking this up, Ian.

Had a very brief look at the series as it arrived, looks good; will try to
do a real review later today.

- Chris
On Nov 19, 2015 12:47 PM, "Ian Romanick"  wrote:

> This patch series implements a new GL extension,
> EXT_shader_samples_identical.  This extension allows shaders to
> determine when all of the samples in a particular texel are the same.
> This takes advantage of the way compressed multisample surfaces are
> stored on modern Intel and AMD hardware.  This enables optimizations in
> application multisample resolve filters, etc.
>
> I really wanted to get this in the next Mesa release.  For some reason,
> I thought the branch point was after Thanksgiving (which is next
> Thursday).  Ken reminded me yesterday that the branch point is actually
> this Friday. :( As a result, I'm sending it out today to get review as
> soon as possible.
>
> I also wanted to get as much time as possible for other drivers to get
> implementations.  I worked with Graham Sellers on this extension, and he
> assures me that the implementation on modern Radeons is trivial.  My
> expectation is that it should be about the same as the Intel
> implementation.
>
> There will be some extra TGSI bits needed, but that should also be
> trivial.  For the NIR and i965 backend bits, I mostly copied and blended
> the implementations of txf_ms and query_samples.
>
> There are currently only trivial piglit tests, but I am working on more.
> I basically hacked up tests/spec/arb_texture_multisample/texelfetch.c to
> use the extension to render different colors based on whether
> textureSamplesIdenticalEXT returned true or false.  The resulting image
> and the generated assembly look good.  My plan is to get a set of real
> tests out by midday tomorrow.
>
> As soon as we're confident that the spec is good, I'll submit it to
> Khronos for publication in the registry.  I'm still waiting on feedback
> from another closed-source driver writer.
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
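
A rough illustration of the kind of resolve-filter shortcut the cover letter describes, written in C rather than GLSL so it can be compiled on its own. Everything here is hypothetical: `struct ms_texel`, its `identical_hint` flag (standing in for whatever the hardware's compressed-multisample metadata reports), and `resolve_texel` are not Mesa or GL names. As I understand the extension, the query may conservatively report false, so the slow path has to stay correct on its own.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_SAMPLES 4

/* Hypothetical stand-in for one multisampled texel. */
struct ms_texel {
    float sample[NUM_SAMPLES];
    /* Plays the role of textureSamplesIdenticalEXT(): true only when the
     * (simulated) compression metadata says every sample is the same. */
    bool identical_hint;
};

/* Resolve one texel to a single value. */
static float resolve_texel(const struct ms_texel *t)
{
    /* Fast path: one fetch is enough when all samples are known equal. */
    if (t->identical_hint)
        return t->sample[0];

    /* Slow path: fetch and average every sample (a plain box filter). */
    float sum = 0.0f;
    for (size_t i = 0; i < NUM_SAMPLES; i++)
        sum += t->sample[i];
    return sum / NUM_SAMPLES;
}
```

In a real shader the saving comes from skipping the extra per-sample fetches on interior texels, which are typically the vast majority.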


Re: [Mesa-dev] Intent to work on support for EXT_internalformat_query2

2015-10-27 Thread Chris Forbes
Presumably ARB_internalformat_query2?

On Tue, Oct 27, 2015 at 9:31 PM, Eduardo Lima Mitev 
wrote:

> Hello,
>
> This is an announcement that a few folks at Igalia team are planning to
> work on adding support for EXT_internalformat_query2 extension to Mesa.
>
> If somebody had started work on this already, or has any input that is
> relevant to the implementation, we would be very thankful to hear about it.
>
> I just filed a bug [1] to track progress. Feel free to add your comments
> there too.
>
> That's all!
>
> cheers,
> Eduardo
>
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=92687
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 5/5] i965: Implement ARB_fragment_layer_viewport.

2015-10-26 Thread Chris Forbes
For the series

Reviewed-by: Chris Forbes <chr...@ijw.co.nz>
On Oct 27, 2015 7:03 AM, "Kenneth Graunke" <kenn...@whitecape.org> wrote:

> Normally, we could read gl_Layer from bits 26:16 of R0.0.  However, the
> specification requires that bogus out-of-range 32-bit values written by
> previous stages need to appear in the fragment shader as-written.
>
> Instead, we pass in the full 32-bit value from the VUE header as an
> extra flat-shaded varying.  We have the SF override the value to 0
> when the previous stage didn't actually write a value (it's actually
> defined to return 0).
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> Cc: Chris Forbes <chr...@ijw.co.nz>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp |  7 ++-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  8 +++
>  src/mesa/drivers/dri/i965/gen6_sf_state.c| 31
> 
>  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
>  4 files changed, 46 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 2eef7af..5a82b04 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1442,6 +1442,9 @@ fs_visitor::calculate_urb_setup()
>  }
>   }
>} else {
> + bool include_vue_header =
> +nir->info.inputs_read & (VARYING_BIT_LAYER |
> VARYING_BIT_VIEWPORT);
> +
>   /* We have enough input varyings that the SF/SBE pipeline stage
> can't
>* arbitrarily rearrange them to suit our whim; we have to put
> them
>* in an order that matches the output of the previous pipeline
> stage
> @@ -1451,7 +1454,9 @@ fs_visitor::calculate_urb_setup()
>   brw_compute_vue_map(devinfo, _stage_vue_map,
>   key->input_slots_valid,
>   nir->info.separate_shader);
> - int first_slot = 2 * BRW_SF_URB_ENTRY_READ_OFFSET;
> + int first_slot =
> +include_vue_header ? 0 : 2 * BRW_SF_URB_ENTRY_READ_OFFSET;
> +
>   assert(prev_stage_vue_map.num_slots <= first_slot + 32);
>   for (int slot = first_slot; slot < prev_stage_vue_map.num_slots;
>slot++) {
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 4950ba4..9c1f95c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -71,6 +71,14 @@ fs_visitor::nir_setup_inputs()
>   var->data.origin_upper_left);
>   emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),
> input, reg), 0xF);
> +  } else if (var->data.location == VARYING_SLOT_LAYER) {
> + struct brw_reg reg = suboffset(interp_reg(VARYING_SLOT_LAYER,
> 1), 3);
> + reg.type = BRW_REGISTER_TYPE_D;
> + bld.emit(FS_OPCODE_CINTERP, retype(input, BRW_REGISTER_TYPE_D),
> reg);
> +  } else if (var->data.location == VARYING_SLOT_VIEWPORT) {
> + struct brw_reg reg = suboffset(interp_reg(VARYING_SLOT_VIEWPORT,
> 2), 3);
> + reg.type = BRW_REGISTER_TYPE_D;
> + bld.emit(FS_OPCODE_CINTERP, retype(input, BRW_REGISTER_TYPE_D),
> reg);
>} else {
>   emit_general_interpolation(input, var->name, var->type,
>  (glsl_interp_qualifier)
> var->data.interpolation,
> diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c
> b/src/mesa/drivers/dri/i965/gen6_sf_state.c
> index 0c8c053..2634e6b 100644
> --- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
> @@ -60,6 +60,23 @@ get_attr_override(const struct brw_vue_map *vue_map,
> int urb_entry_read_offset,
> /* Find the VUE slot for this attribute. */
> int slot = vue_map->varying_to_slot[fs_attr];
>
> +   /* Viewport and Layer are stored in the VUE header.  We need to
> override
> +* them to zero if earlier stages didn't write them, as GL requires
> that
> +* they read back as zero when not explicitly set.
> +*/
> +   if (fs_attr == VARYING_SLOT_VIEWPORT || fs_attr == VARYING_SLOT_LAYER)
> {
> +  unsigned override =
> + ATTRIBUTE_0_OVERRIDE_X | ATTRIBUTE_0_OVERRIDE_W |
> + ATTRIBUTE_CONST_ << ATTRIBUTE_0_CONST_SOURCE_SHIFT;
> +
> +  if (!(vue_map->slots_valid & VARYING_BIT_LAYER))
> + override |= ATTRIBUTE_0_OVERRIDE_Y;
> +  if (!(vue_map->slots_valid & VARYING_BIT_VIEWPORT))
>
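
To make the commit message's point concrete: a field at bits 26:16 is only 11 bits wide, so it cannot carry an arbitrary 32-bit gl_Layer value through unchanged. The sketch below is an assumption-laden illustration, not driver code. `pack_layer_field` and `layer_from_r0` are hypothetical helpers, and simple truncation is assumed purely for demonstration; the patch does not say how the hardware actually fills that field.

```c
#include <stdint.h>

/* Hypothetical: place the low 11 bits of a layer value at bits 26:16. */
static uint32_t pack_layer_field(uint32_t layer)
{
    return (layer & 0x7ffu) << 16;
}

/* Read the 11-bit field back out of a 32-bit payload dword. */
static uint32_t layer_from_r0(uint32_t r0_0)
{
    return (r0_0 >> 16) & 0x7ffu;
}
```

A small value such as 5 survives the round trip, but an out-of-range value such as 0x12345 comes back as 0x345. That loss is why the patch passes the full 32-bit value from the VUE header as a flat-shaded varying instead.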

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/9] ff_fragment_shader: Use binding to set the sampler unit

2015-10-09 Thread Chris Forbes
The comment above this about the cast to int can probably go away?

- Chris

On Sat, Oct 10, 2015 at 2:52 PM, Ian Romanick  wrote:

> From: Ian Romanick 
>
> This is the way layout(binding=xxx) works from GLSL.  The old method
> just happened to work (and significantly predated support for
> layout(binding=xxx)), but future changes will break this.
>
> Signed-off-by: Ian Romanick 
> Cc: "10.6 11.0" 
> ---
>  src/mesa/main/ff_fragment_shader.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/ff_fragment_shader.cpp
> b/src/mesa/main/ff_fragment_shader.cpp
> index e4e2a18..f5a4fa5 100644
> --- a/src/mesa/main/ff_fragment_shader.cpp
> +++ b/src/mesa/main/ff_fragment_shader.cpp
> @@ -981,7 +981,8 @@ static void load_texture( texenv_fragment_program *p,
> GLuint unit )
>  * NOTE: The cast to int is important.  Without it, the constant will
> have
>  * type uint, and things later on may get confused.
>  */
> -   sampler->constant_value = new(p->mem_ctx) ir_constant(int(unit));
> +   sampler->data.explicit_binding = true;
> +   sampler->data.binding = unit;
>
> deref = new(p->mem_ctx) ir_dereference_variable(sampler);
> tex->set_sampler(deref, glsl_type::vec4_type);
> --
> 2.1.0
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-stable
>


Re: [Mesa-dev] [PATCH 6/6] i965: Simplify handling of VUE map changes.

2015-09-26 Thread Chris Forbes
For the v2 series:

Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Sat, Sep 12, 2015 at 6:58 PM, Kenneth Graunke <kenn...@whitecape.org>
wrote:

> The old code was disastrously complex - spread across multiple atoms
> which may not even run, inspecting the dirty bits to try and decide
> whether it was necessary to do checks...storing VS information in
> brw_context...extra flagging...
>
> This code tripped me and Carl up very badly when working on the
> shader cache code.  It's very fragile and hard to maintain.
>
> Now that geometry shaders only depend on their inputs and don't have
> to worry about the VS VUE map, we can dramatically simplify this:
> just compute the VUE map coming out of the geometry shader stage
> in brw_upload_programs.  If it changes, flag it.  Done.
>
> v2: Also check vue_map.separable.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> Reviewed-by: Chris Forbes <chr...@ijw.co.nz> [v1]
> ---
>  src/mesa/drivers/dri/i965/brw_context.h  | 12 +---
>  src/mesa/drivers/dri/i965/brw_gs.c   | 16 +---
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 16 +++-
>  src/mesa/drivers/dri/i965/brw_vs.c   | 15 ---
>  4 files changed, 17 insertions(+), 42 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 772a9fd..4cac30c 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -194,7 +194,6 @@ enum brw_state_id {
> BRW_STATE_GS_CONSTBUF,
> BRW_STATE_PROGRAM_CACHE,
> BRW_STATE_STATE_BASE_ADDRESS,
> -   BRW_STATE_VUE_MAP_VS,
> BRW_STATE_VUE_MAP_GEOM_OUT,
> BRW_STATE_TRANSFORM_FEEDBACK,
> BRW_STATE_RASTERIZER_DISCARD,
> @@ -276,7 +275,6 @@ enum brw_state_id {
>  #define BRW_NEW_GS_CONSTBUF (1ull << BRW_STATE_GS_CONSTBUF)
>  #define BRW_NEW_PROGRAM_CACHE   (1ull << BRW_STATE_PROGRAM_CACHE)
>  #define BRW_NEW_STATE_BASE_ADDRESS  (1ull <<
> BRW_STATE_STATE_BASE_ADDRESS)
> -#define BRW_NEW_VUE_MAP_VS  (1ull << BRW_STATE_VUE_MAP_VS)
>  #define BRW_NEW_VUE_MAP_GEOM_OUT(1ull <<
> BRW_STATE_VUE_MAP_GEOM_OUT)
>  #define BRW_NEW_TRANSFORM_FEEDBACK  (1ull <<
> BRW_STATE_TRANSFORM_FEEDBACK)
>  #define BRW_NEW_RASTERIZER_DISCARD  (1ull <<
> BRW_STATE_RASTERIZER_DISCARD)
> @@ -1375,16 +1373,8 @@ struct brw_context
> } curbe;
>
> /**
> -* Layout of vertex data exiting the vertex shader.
> -*
> -* BRW_NEW_VUE_MAP_VS is flagged when this VUE map changes.
> -*/
> -   struct brw_vue_map vue_map_vs;
> -
> -   /**
>  * Layout of vertex data exiting the geometry portion of the pipleine.
> -* This comes from the geometry shader if one exists, otherwise from
> the
> -* vertex shader.
> +* This comes from the last enabled shader stage (GS, DS, or VS).
>  *
>  * BRW_NEW_VUE_MAP_GEOM_OUT is flagged when the VUE map changes.
>  */
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c
> b/src/mesa/drivers/dri/i965/brw_gs.c
> index 77be9d9..1f219c0 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -297,8 +297,7 @@ brw_gs_state_dirty(struct brw_context *brw)
> return brw_state_dirty(brw,
>_NEW_TEXTURE,
>BRW_NEW_GEOMETRY_PROGRAM |
> -  BRW_NEW_TRANSFORM_FEEDBACK |
> -  BRW_NEW_VUE_MAP_VS);
> +  BRW_NEW_TRANSFORM_FEEDBACK);
>  }
>
>  static void
> @@ -336,11 +335,6 @@ brw_upload_gs_prog(struct brw_context *brw)
>
> if (gp == NULL) {
>/* No geometry shader.  Vertex data just passes straight through. */
> -  if (brw->ctx.NewDriverState & BRW_NEW_VUE_MAP_VS) {
> - brw->vue_map_geom_out = brw->vue_map_vs;
> - brw->ctx.NewDriverState |= BRW_NEW_VUE_MAP_GEOM_OUT;
> -  }
> -
>if (brw->gen == 6 &&
>(brw->ctx.NewDriverState & BRW_NEW_TRANSFORM_FEEDBACK)) {
>   gen6_brw_upload_ff_gs_prog(brw);
> @@ -367,14 +361,6 @@ brw_upload_gs_prog(struct brw_context *brw)
>(void)success;
> }
> brw->gs.base.prog_data = >gs.prog_data->base.base;
> -
> -   if (brw->gs.prog_data->base.vue_map.slots_valid !=
> -   brw->vue_map_geom_out.slots_valid ||
> -   brw->gs.prog_data->base.vue_map.separate !=
> -   brw->vue_map_geom_out.separate) {
> -  brw->vue_map_geom_out = brw->gs.prog_data->base.vue_map;
> -  brw->ctx
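
The "compute it in one place; if it changes, flag it" idea from the commit message can be sketched as follows. This is a simplified stand-in, not the i965 code: `struct driver_ctx`, `NEW_VUE_MAP_GEOM_OUT` and `update_vue_map_geom_out` are hypothetical names, and the only fields compared are the two that the removed hunk compares (`slots_valid` and `separate`).

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced version of a VUE map: only the fields used for change detection. */
struct vue_map {
    uint64_t slots_valid;
    bool separate;
};

#define NEW_VUE_MAP_GEOM_OUT (1ull << 0)   /* hypothetical dirty bit */

struct driver_ctx {
    struct vue_map vue_map_geom_out;   /* layout leaving the last geometry stage */
    uint64_t new_driver_state;         /* accumulated dirty bits */
};

/* Called once per program upload with the VUE map of the last enabled
 * stage. Stores the map and raises the dirty bit only when it changed. */
static void update_vue_map_geom_out(struct driver_ctx *ctx,
                                    const struct vue_map *last_stage)
{
    if (ctx->vue_map_geom_out.slots_valid != last_stage->slots_valid ||
        ctx->vue_map_geom_out.separate != last_stage->separate) {
        ctx->vue_map_geom_out = *last_stage;
        ctx->new_driver_state |= NEW_VUE_MAP_GEOM_OUT;
    }
}
```

Keeping the comparison in a single function means no stage has to inspect another stage's dirty bits to decide whether the check is needed.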

Re: [Mesa-dev] [PATCH 05/12] i965/vec4/skl+: Use lcd2dms_w instead of lcd2dms

2015-09-20 Thread Chris Forbes
s/lcd2dms/ld2dms/g in various places in this patch and others.

On Fri, Sep 18, 2015 at 4:00 AM, Neil Roberts  wrote:

> In order to support 16x MSAA, skl+ has a wider version of lcd2dms that
> takes two parameters for the MCS data. The MCS data in the response
> still fits in a single register so we just need to ensure we copy both
> values rather than just the lower one.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp   |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  5 +
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 14 --
>  3 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index ed49cd3..22c8d01 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -327,6 +327,7 @@ vec4_visitor::implied_mrf_writes(vec4_instruction
> *inst)
> case SHADER_OPCODE_TXD:
> case SHADER_OPCODE_TXF:
> case SHADER_OPCODE_TXF_CMS:
> +   case SHADER_OPCODE_TXF_CMS_W:
> case SHADER_OPCODE_TXF_MCS:
> case SHADER_OPCODE_TXS:
> case SHADER_OPCODE_TG4:
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index 1950333..c4ddff9 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -259,6 +259,10 @@ vec4_generator::generate_tex(vec4_instruction *inst,
>case SHADER_OPCODE_TXF:
>  msg_type = GEN5_SAMPLER_MESSAGE_SAMPLE_LD;
>  break;
> +  case SHADER_OPCODE_TXF_CMS_W:
> + assert(devinfo->gen >= 9);
> + msg_type = GEN9_SAMPLER_MESSAGE_SAMPLE_LD2DMS_W;
> + break;
>case SHADER_OPCODE_TXF_CMS:
>   if (devinfo->gen >= 7)
>  msg_type = GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DMS;
> @@ -1372,6 +1376,7 @@ vec4_generator::generate_code(const cfg_t *cfg)
>case SHADER_OPCODE_TXD:
>case SHADER_OPCODE_TXF:
>case SHADER_OPCODE_TXF_CMS:
> +  case SHADER_OPCODE_TXF_CMS_W:
>case SHADER_OPCODE_TXF_MCS:
>case SHADER_OPCODE_TXL:
>case SHADER_OPCODE_TXS:
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 0465770..79f92d8 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -2545,7 +2545,8 @@ vec4_visitor::emit_texture(ir_texture_opcode op,
> case ir_txl: opcode = SHADER_OPCODE_TXL; break;
> case ir_txd: opcode = SHADER_OPCODE_TXD; break;
> case ir_txf: opcode = SHADER_OPCODE_TXF; break;
> -   case ir_txf_ms: opcode = SHADER_OPCODE_TXF_CMS; break;
> +   case ir_txf_ms: opcode = (devinfo->gen >= 9 ? SHADER_OPCODE_TXF_CMS_W :
> + SHADER_OPCODE_TXF_CMS); break;
> case ir_txs: opcode = SHADER_OPCODE_TXS; break;
> case ir_tg4: opcode = offset_value.file != BAD_FILE
>   ? SHADER_OPCODE_TG4_OFFSET : SHADER_OPCODE_TG4;
> break;
> @@ -2637,7 +2638,16 @@ vec4_visitor::emit_texture(ir_texture_opcode op,
>} else if (op == ir_txf_ms) {
>   emit(MOV(dst_reg(MRF, param_base + 1, sample_index.type,
> WRITEMASK_X),
>sample_index));
> - if (devinfo->gen >= 7) {
> + if (opcode == SHADER_OPCODE_TXF_CMS_W) {
> +/* MCS data is stored in the first two channels of ‘mcs’, but
> we
> + * need to get it into the .y and .z channels of the second
> vec4
> + * of params.
> + */
> +mcs.swizzle = BRW_SWIZZLE4(0, 0, 1, 1);
> +emit(MOV(dst_reg(MRF, param_base + 1,
> + glsl_type::uint_type, WRITEMASK_YZ),
> + mcs));
> + } else if (devinfo->gen >= 7) {
>  /* MCS data is in the first channel of `mcs`, but we need to
> get it into
>   * the .y channel of the second vec4 of params, so replicate
> .x across
>   * the whole vec4 and then mask off everything except .y
> --
> 1.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
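
For readers unfamiliar with vec4 swizzles and writemasks, here is a small model of the MOV in the last hunk. It assumes the usual semantics (each destination channel c enabled in the writemask receives source channel swizzle[c]); `mov_swizzled` and the channel enum are hypothetical helpers, not i965 functions.

```c
#include <stdint.h>

enum { CH_X, CH_Y, CH_Z, CH_W };

#define MASK_YZ ((1u << CH_Y) | (1u << CH_Z))

/* Model of a swizzled, write-masked vec4 MOV. */
static void mov_swizzled(uint32_t dst[4], const uint32_t src[4],
                         const unsigned swizzle[4], unsigned writemask)
{
    for (unsigned c = 0; c < 4; c++) {
        if (writemask & (1u << c))
            dst[c] = src[swizzle[c]];
    }
}
```

With a swizzle of (0, 0, 1, 1) and a YZ writemask, dst.y receives src.x and dst.z receives src.y, while dst.x (already holding the sample index) and dst.w are left untouched. That matches the comment in the patch about moving the two MCS words into .y and .z of the second parameter vec4.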


Re: [Mesa-dev] [PATCH 1/2] i965: Fix value of _3DPRIM_TRIFAN_NOSTIPPLE.

2015-09-07 Thread Chris Forbes
These two are:

Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Mon, Sep 7, 2015 at 7:03 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> TRIFAN_NOSTIPPLE has always been 0x16 - 0x15 is marked "Reserved" on all
> platforms.  See the 965 PRM, Volume 2, Table 3-1, "3D Primitive Topology
> Type Encoding" for a list.
>
> We don't currently use this, and I don't expect we will, but we may as
> well not leave the bogus value around.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 0f7feb3..411a97d 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -77,7 +77,7 @@
>  #define _3DPRIM_LINESTRIP_CONT0x12
>  #define _3DPRIM_LINESTRIP_BF  0x13
>  #define _3DPRIM_LINESTRIP_CONT_BF 0x14
> -#define _3DPRIM_TRIFAN_NOSTIPPLE  0x15
> +#define _3DPRIM_TRIFAN_NOSTIPPLE  0x16
>
>  /* We use this offset to be able to pass native primitive types in struct
>   * _mesa_prim::mode.  Native primitive types are BRW_PRIM_OFFSET +
> --
> 2.5.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/12] i965: Simplify handling of VUE map changes.

2015-09-03 Thread Chris Forbes
This had got pretty tangled.

For the series:

Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Sat, Aug 29, 2015 at 9:24 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> The old code was disastrously complex - spread across multiple atoms
> which may not even run, inspecting the dirty bits to try and decide
> whether it was necessary to do checks...storing VS information in
> brw_context...extra flagging...
>
> This code tripped me and Carl up very badly when working on the
> shader cache code.  It's very fragile and hard to maintain.
>
> Now that geometry shaders only depend on their inputs and don't have
> to worry about the VS VUE map, we can dramatically simplify this:
> just compute the VUE map coming out of the geometry shader stage
> in brw_upload_programs.  If it changes, flag it.  Done.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_context.h  | 12 +---
>  src/mesa/drivers/dri/i965/brw_gs.c   | 14 +-
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 14 +-
>  src/mesa/drivers/dri/i965/brw_vs.c   | 13 -
>  4 files changed, 15 insertions(+), 38 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index c9c47dd..91258be 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -194,7 +194,6 @@ enum brw_state_id {
> BRW_STATE_GS_CONSTBUF,
> BRW_STATE_PROGRAM_CACHE,
> BRW_STATE_STATE_BASE_ADDRESS,
> -   BRW_STATE_VUE_MAP_VS,
> BRW_STATE_VUE_MAP_GEOM_OUT,
> BRW_STATE_TRANSFORM_FEEDBACK,
> BRW_STATE_RASTERIZER_DISCARD,
> @@ -276,7 +275,6 @@ enum brw_state_id {
>  #define BRW_NEW_GS_CONSTBUF (1ull << BRW_STATE_GS_CONSTBUF)
>  #define BRW_NEW_PROGRAM_CACHE   (1ull << BRW_STATE_PROGRAM_CACHE)
>  #define BRW_NEW_STATE_BASE_ADDRESS  (1ull << 
> BRW_STATE_STATE_BASE_ADDRESS)
> -#define BRW_NEW_VUE_MAP_VS  (1ull << BRW_STATE_VUE_MAP_VS)
>  #define BRW_NEW_VUE_MAP_GEOM_OUT(1ull << BRW_STATE_VUE_MAP_GEOM_OUT)
>  #define BRW_NEW_TRANSFORM_FEEDBACK  (1ull << 
> BRW_STATE_TRANSFORM_FEEDBACK)
>  #define BRW_NEW_RASTERIZER_DISCARD  (1ull << 
> BRW_STATE_RASTERIZER_DISCARD)
> @@ -1362,16 +1360,8 @@ struct brw_context
> } curbe;
>
> /**
> -* Layout of vertex data exiting the vertex shader.
> -*
> -* BRW_NEW_VUE_MAP_VS is flagged when this VUE map changes.
> -*/
> -   struct brw_vue_map vue_map_vs;
> -
> -   /**
>  * Layout of vertex data exiting the geometry portion of the pipleine.
> -* This comes from the geometry shader if one exists, otherwise from the
> -* vertex shader.
> +* This comes from the last enabled shader stage (GS, DS, or VS).
>  *
>  * BRW_NEW_VUE_MAP_GEOM_OUT is flagged when the VUE map changes.
>  */
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> b/src/mesa/drivers/dri/i965/brw_gs.c
> index da8af88..6b25150 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -292,8 +292,7 @@ brw_gs_state_dirty(struct brw_context *brw)
> return brw_state_dirty(brw,
>_NEW_TEXTURE,
>BRW_NEW_GEOMETRY_PROGRAM |
> -  BRW_NEW_TRANSFORM_FEEDBACK |
> -  BRW_NEW_VUE_MAP_VS);
> +  BRW_NEW_TRANSFORM_FEEDBACK);
>  }
>
>  static void
> @@ -331,11 +330,6 @@ brw_upload_gs_prog(struct brw_context *brw)
>
> if (gp == NULL) {
>/* No geometry shader.  Vertex data just passes straight through. */
> -  if (brw->ctx.NewDriverState & BRW_NEW_VUE_MAP_VS) {
> - brw->vue_map_geom_out = brw->vue_map_vs;
> - brw->ctx.NewDriverState |= BRW_NEW_VUE_MAP_GEOM_OUT;
> -  }
> -
>if (brw->gen == 6 &&
>(brw->ctx.NewDriverState & BRW_NEW_TRANSFORM_FEEDBACK)) {
>   gen6_brw_upload_ff_gs_prog(brw);
> @@ -362,12 +356,6 @@ brw_upload_gs_prog(struct brw_context *brw)
>(void)success;
> }
> brw->gs.base.prog_data = >gs.prog_data->base.base;
> -
> -   if (brw->gs.prog_data->base.vue_map.slots_valid !=
> -   brw->vue_map_geom_out.slots_valid) {
> -  brw->vue_map_geom_out = brw->gs.prog_data->base.vue_map;
> -  brw->ctx.NewDriverState |= BRW_NEW_VUE_MAP_GEOM_OUT;
> -   }
>  }
>
>  bool
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
> b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index 9de4

Re: [Mesa-dev] [PATCH] i965: Improve disassembly of data port read messages.

2015-08-14 Thread Chris Forbes
Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Fri, Aug 14, 2015 at 9:52 AM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> We now print out the name of the message instead of its numerical
> value, and label the message control and surface numbers.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_disasm.c | 31 +++
>  1 file changed, 27 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
> b/src/mesa/drivers/dri/i965/brw_disasm.c
> index 1075c5a..61be2b0 100644
> --- a/src/mesa/drivers/dri/i965/brw_disasm.c
> +++ b/src/mesa/drivers/dri/i965/brw_disasm.c
> @@ -412,6 +412,22 @@ static const char *const gen7_gateway_subfuncid[8] = {
> [BRW_MESSAGE_GATEWAY_SFID_MMIO_READ_WRITE] = "mmio read/write",
>  };
>
> +static const char *const gen4_dp_read_port_msg_type[4] = {
> +   [0b00] = "OWord Block Read",
> +   [0b01] = "OWord Dual Block Read",
> +   [0b10] = "Media Block Read",
> +   [0b11] = "DWord Scattered Read",
> +};
> +
> +static const char *const g45_dp_read_port_msg_type[8] = {
> +   [0b000] = "OWord Block Read",
> +   [0b010] = "OWord Dual Block Read",
> +   [0b100] = "Media Block Read",
> +   [0b110] = "DWord Scattered Read",
> +   [0b001] = "Render Target UNORM Read",
> +   [0b011] = "AVC Loop Filter Read",
> +};
> +
>  static const char *const dp_write_port_msg_type[8] = {
> [0b000] = "OWord block write",
> [0b001] = "OWord dual block write",
> @@ -1444,10 +1460,17 @@ brw_disassemble_inst(FILE *file, const struct
> brw_device_info *devinfo,
>        brw_inst_dp_msg_type(devinfo, inst),
>        devinfo->gen >= 7 ? 0 :
> brw_inst_dp_write_commit(devinfo, inst));
>  } else {
> -   format(file, " (%ld, %ld, %ld)",
> -  brw_inst_binding_table_index(devinfo, inst),
> -  brw_inst_dp_read_msg_control(devinfo, inst),
> -  brw_inst_dp_read_msg_type(devinfo, inst));
> +   bool is_965 = devinfo->gen == 4 && !devinfo->is_g4x;
> +   err |= control(file, "DP read message type",
> +  is_965 ? gen4_dp_read_port_msg_type :
> +   g45_dp_read_port_msg_type,
> +  brw_inst_dp_read_msg_type(devinfo, inst),
> +  &space);
> +
> +   format(file, " MsgCtrl = 0x%lx",
> +  brw_inst_dp_read_msg_control(devinfo, inst));
> +
> +   format(file, " Surface = %ld",
> brw_inst_binding_table_index(devinfo, inst));
>  }
>  break;
>
> --
> 2.4.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
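
The table-driven naming used by the patch can be tried in isolation with a sketch like the one here. The message names are the ones from the patch; the wrapper `dp_read_msg_name`, its bounds check and the short table names are my own hypothetical additions, and plain integer indices are used instead of `0b` literals to stay within standard C.

```c
#include <stdbool.h>
#include <stddef.h>

/* Original 965: 2-bit message type. */
static const char *const gen4_read_names[4] = {
    [0] = "OWord Block Read",
    [1] = "OWord Dual Block Read",
    [2] = "Media Block Read",
    [3] = "DWord Scattered Read",
};

/* G45 and later Gen4-class parts: 3-bit message type, sparsely populated. */
static const char *const g45_read_names[8] = {
    [0] = "OWord Block Read",
    [2] = "OWord Dual Block Read",
    [4] = "Media Block Read",
    [6] = "DWord Scattered Read",
    [1] = "Render Target UNORM Read",
    [3] = "AVC Loop Filter Read",
};

/* Returns the message name, or NULL for an out-of-range or unnamed type. */
static const char *dp_read_msg_name(bool is_965, unsigned msg_type)
{
    const char *const *table = is_965 ? gen4_read_names : g45_read_names;
    size_t count = is_965 ? 4 : 8;

    if (msg_type >= count)
        return NULL;
    return table[msg_type];   /* unset designated slots are NULL */
}
```

Designated initializers leave the unlisted slots as NULL, so an encoding with no name is easy to detect rather than printing garbage.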


Re: [Mesa-dev] [PATCH 2/2][RFC] docs: Add the 2015 ARB extensions

2015-08-12 Thread Chris Forbes
I'd just add a 2015 block and a 2014 block.

On Thu, Aug 13, 2015 at 9:36 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Wed, Aug 12, 2015 at 5:23 PM, Thomas Helland
 thomashellan...@gmail.com wrote:
 2015-08-12 18:56 GMT+02:00 Kenneth Graunke kenn...@whitecape.org:
 On Wednesday, August 12, 2015 06:32:50 PM Thomas Helland wrote:
 2015-08-12 17:48 GMT+02:00 Ilia Mirkin imir...@alum.mit.edu:
  On Tue, Aug 11, 2015 at 1:48 PM, Thomas Helland
  thomashellan...@gmail.com wrote:
  Signed-off-by: Thomas Helland thomashellan...@gmail.com
  ---
  This adds a section for the extensions nvidia has chosen to
  call the GL ARB 2015 Extensions unveiled at SIGGRAPH.
 
  There are ARB extensions released every year (or more often, not
  sure)... we don't track all ARB extensions. Why are these so special
  vs e.g. the ones released along with GL 4.5 but that weren't included
  in the spec? Or any of the other ones...
 

 Well. They're not really special I guess. This just follows from the
 discussion that went down on irc between me, glennk, fredrikh, ++.

  Should GL3.txt just become extension-implementation-status.txt and
  list all non-vendor-specific extensions? So far it has stuck to actual
  GL versions (and more recently GLES).
 

 We can keep it GL / GLES versions only. Or we can extend it to a
 extension-implementation-status.txt thing. Or we can split it
 into two different files. I really don't care too much either way.

 If we end up adding these extensions to the file then a rename
 and adding other ARB's is probably the way to go. There are
 positive and negative sides to both approaches, and it's not
 my call to decide how, and if, we want this. It gives a nice overview
 but at the same time it has PR- and needs-to-be-kept-updated-
 implications that we may not want. I'm all ears for suggestions.

 -Thomas

 I like the idea of adding an ARB Extensions section and listing all
 the ARB extensions that aren't part of a particular GL version - simply
 in addition to the existing content, rather than reorganizing it.

 GL3.txt has been a misnomer for a while, but I don't care whether we
 rename it or not; it doesn't bother me.

 --Ken

 I've assembled a list of extensions I *think* are not demanded by
 any current openGL specs, but I may have missed some.
 (I find it weird that I VAO's in any of the specs, for example)
 I could add all of them to a separate section to track them,
 or I can leave it as is and drop this patch. Up to you guys.

 2.  GLX_ARB_get_proc_address
 4.  WGL_ARB_buffer_region
 8.  WGL_ARB_extensions_string
 9.  WGL_ARB_pixel_format
 10. WGL_ARB_make_current_read
 11. WGL_ARB_pbuffer
 15. GL_ARB_vertex_blend
 16. GL_ARB_matrix_palette
 20. WGL_ARB_render_texture
 24. GL_ARB_shadow_ambient
 36. GL_ARB_fragment_program_shadow
 42. GL_ARB_pixel_buffer_object

 GL 2.1

 43. GL_ARB_depth_buffer_float

 GL 3.0

 45. GL_ARB_framebuffer_object

 GL 3.0

 46. GL_ARB_framebuffer_sRGB

 GL 3.0

 GLX_ARB_framebuffer_sRGB
 WGL_ARB_framebuffer_sRGB
 48. GL_ARB_half_float_vertex

 3.0

 50. GL_ARB_map_buffer_range

 3.0

 52. GL_ARB_texture_compression_rgtc

 3.0

 53. GL_ARB_texture_rg

 3.0

 54. GL_ARB_vertex_array_object

 3.0

 55. WGL_ARB_create_context
 56. GLX_ARB_create_context

 3.0... I think.

 58. GL_ARB_compatibility

 3.1+ compat contexts

 60. GL_ARB_shader_texture_lod

 3.0

 74. WGL_ARB_create_context_profile
 75. GLX_ARB_create_context_profile

 3.1

 76. GL_ARB_shading_language_include
 101.GLX_ARB_create_context_robustness
 102.WGL_ARB_create_context_robustness
 103.GL_ARB_cl_event
 104.GL_ARB_debug_output
 105.GL_ARB_robustness
 106.GL_ARB_shader_stencil_export
 118.GL_KHR_texture_compression_astc_hdr
 GL_KHR_texture_compression_astc_ldr
 126.GL_ARB_robustness_isolation
 142.GLX_ARB_robustness_application_isolation
 GLX_ARB_robustness_share_group_isolation
 143.WGL_ARB_robustness_application_isolation
 WGL_ARB_robustness_share_group_isolation
 152.GL_ARB_bindless_texture
 153.GL_ARB_compute_variable_group_size
 154.GL_ARB_indirect_parameters
 155.GL_ARB_seamless_cubemap_per_texture
 156.GL_ARB_shader_draw_parameters
 157.GL_ARB_shader_group_vote
 158.GL_ARB_sparse_texture
 171.GL_ARB_pipeline_statistics_query
 172.GL_ARB_sparse_buffer
 173.GL_ARB_transform_feedback_overflow_query
 174.GL_KHR_blend_equation_advanced
 GL_KHR_blend_equation_advanced_coherent
 175.GL_KHR_no_error
 176.GL_ARB_ES3_2_compatibility
 177.GL_ARB_fragment_shader_interlock
 178.GL_ARB_gpu_shader_int64
 179.

Re: [Mesa-dev] [PATCH] glsl: replace old hash table with new and faster one

2015-08-02 Thread Chris Forbes
Some perf numbers would be nice. How much is this winning?

- Chris

On Mon, Aug 3, 2015 at 11:18 AM, Timothy Arceri <t_arc...@yahoo.com.au> wrote:
> On Sun, 2015-08-02 at 19:50 +0200, Alejandro Seguí wrote:
>
> Maybe just for completeness you could add this to the commit message
>
> The util/hash_table was intended to be a fast hash table
> replacement for the program/hash_table see 35fd61bd99c1 and 72e55bb6888ff.
>
>> ---
>>  src/glsl/ir_print_visitor.cpp | 19 ---
>>  1 file changed, 12 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/glsl/ir_print_visitor.cpp b/src/glsl/ir_print_visitor.cpp
>> index 78475f3..641a996 100644
>> --- a/src/glsl/ir_print_visitor.cpp
>> +++ b/src/glsl/ir_print_visitor.cpp
>> @@ -25,7 +25,7 @@
>>  #include "glsl_types.h"
>>  #include "glsl_parser_extras.h"
>>  #include "main/macros.h"
>> -#include "program/hash_table.h"
>> +#include "util/hash_table.h"
>>
>>  static void print_type(FILE *f, const glsl_type *t);
>>
>> @@ -89,14 +89,14 @@ ir_print_visitor::ir_print_visitor(FILE *f)
>>  {
>> indentation = 0;
>> printable_names =
>> -  hash_table_ctor(32, hash_table_pointer_hash,
>> hash_table_pointer_compare);
>> +  _mesa_hash_table_create(NULL, _mesa_hash_pointer,
>> _mesa_key_pointer_equal);
>> symbols = _mesa_symbol_table_ctor();
>> mem_ctx = ralloc_context(NULL);
>>  }
>>
>>  ir_print_visitor::~ir_print_visitor()
>>  {
>> -   hash_table_dtor(printable_names);
>> +   _mesa_hash_table_destroy(printable_names, NULL);
>> _mesa_symbol_table_dtor(symbols);
>> ralloc_free(mem_ctx);
>>  }
>> @@ -121,9 +121,14 @@ ir_print_visitor::unique_name(ir_variable *var)
>> }
>>
>> /* Do we already have a name for this variable? */
>> -   const char *name = (const char *) hash_table_find(this->printable_names,
>> var);
>> -   if (name != NULL)
>> -  return name;
>> +   struct hash_entry * entry =
>> +_mesa_hash_table_search(this->printable_names, var);
>> +
>> +   if (entry != NULL) {
>> +  return (const char *) entry->data;
>> +   }
>> +
>> +   const char* name = NULL;
>
> The above looks a bit funny just floating here maybe move it
>
>>
>> /* If there's no conflict, just use the original name */
> Here.
>> if (_mesa_symbol_table_find_symbol(this->symbols, -1, var->name) ==
>> NULL) {
>> @@ -132,7 +137,7 @@ ir_print_visitor::unique_name(ir_variable *var)
>>static unsigned i = 1;
>>name = ralloc_asprintf(this->mem_ctx, "%s@%u", var->name, ++i);
>> }
>> -   hash_table_insert(this->printable_names, (void *) name, var);
>> +   _mesa_hash_table_insert(this->printable_names, var, (void *) name);
>> _mesa_symbol_table_add_symbol(this->symbols, -1, name, var);
>> return name;
>>  }
>
> With those couple of small changes you can add to the commit message
> Reviewed-by: Timothy Arceri <t_arc...@yahoo.com.au>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
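
Two API differences are visible in the quoted diff: the new search call hands back an entry (whose data field holds the value) instead of the value itself, and the insert call takes (key, data) where the old one took (data, key). The toy table here mimics just those two shapes so the pattern can be compiled standalone. It is not Mesa's util/hash_table: `struct table`, `table_search`, `table_insert` and `lookup_name` are hypothetical, the capacity is fixed, and NULL keys are not supported.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64   /* power of two; fixed capacity, no resizing */

struct entry {
    const void *key;    /* NULL marks an empty slot */
    void *data;
};

struct table {
    struct entry slots[TABLE_SIZE];
};

static size_t hash_pointer(const void *p)
{
    return (size_t)((uintptr_t)p >> 3) & (TABLE_SIZE - 1);
}

/* Returns the entry for key, or NULL if absent (linear probing). */
static struct entry *table_search(struct table *t, const void *key)
{
    size_t start = hash_pointer(key);
    for (size_t n = 0; n < TABLE_SIZE; n++) {
        struct entry *e = &t->slots[(start + n) & (TABLE_SIZE - 1)];
        if (e->key == NULL)
            return NULL;
        if (e->key == key)
            return e;
    }
    return NULL;
}

/* Note the argument order: key first, then data. Returns NULL when full. */
static struct entry *table_insert(struct table *t, const void *key, void *data)
{
    size_t start = hash_pointer(key);
    for (size_t n = 0; n < TABLE_SIZE; n++) {
        struct entry *e = &t->slots[(start + n) & (TABLE_SIZE - 1)];
        if (e->key == NULL || e->key == key) {
            e->key = key;
            e->data = data;
            return e;
        }
    }
    return NULL;
}

/* Lookup-or-create, shaped like the rewritten unique_name(). */
static const char *lookup_name(struct table *t, const void *var,
                               const char *fresh_name)
{
    struct entry *e = table_search(t, var);
    if (e != NULL)
        return (const char *)e->data;

    table_insert(t, var, (void *)fresh_name);
    return fresh_name;
}
```

Returning the entry rather than the data lets a table distinguish "key absent" from "key present with NULL data", which a value-returning find cannot do.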


Re: [Mesa-dev] [PATCH] glsl: replace old hash table with new and faster one

2015-08-02 Thread Chris Forbes
Oh, fair enough then.
On Aug 3, 2015 12:39 PM, Ilia Mirkin imir...@alum.mit.edu wrote:

 Given that this is a debug-only thing, I doubt perf numbers are that
 interesting.

 I have no clue what the diff is between the two hash tables, but if
 one is allegedly faster than the other, that should be determined, and
 we should just mass-migrate...

   -ilia

 On Sun, Aug 2, 2015 at 8:05 PM, Chris Forbes chr...@ijw.co.nz wrote:
  Some perf numbers would be nice. How much is this winning?
 
  - Chris
 
  On Mon, Aug 3, 2015 at 11:18 AM, Timothy Arceri t_arc...@yahoo.com.au
 wrote:
  On Sun, 2015-08-02 at 19:50 +0200, Alejandro Seguí wrote:
 
  Maybe just for completeness you could add this to the commit message
 
  The util/hash_table was intended to be a fast hash table
  replacement for the program/hash_table see 35fd61bd99c1 and
 72e55bb6888ff.
 
  ---
   src/glsl/ir_print_visitor.cpp | 19 ---
   1 file changed, 12 insertions(+), 7 deletions(-)
 
  diff --git a/src/glsl/ir_print_visitor.cpp
 b/src/glsl/ir_print_visitor.cpp
  index 78475f3..641a996 100644
  --- a/src/glsl/ir_print_visitor.cpp
  +++ b/src/glsl/ir_print_visitor.cpp
  @@ -25,7 +25,7 @@
   #include "glsl_types.h"
   #include "glsl_parser_extras.h"
   #include "main/macros.h"
  -#include "program/hash_table.h"
  +#include "util/hash_table.h"
 
   static void print_type(FILE *f, const glsl_type *t);
 
  @@ -89,14 +89,14 @@ ir_print_visitor::ir_print_visitor(FILE *f)
   {
  indentation = 0;
  printable_names =
  -  hash_table_ctor(32, hash_table_pointer_hash,
  hash_table_pointer_compare);
  +  _mesa_hash_table_create(NULL, _mesa_hash_pointer,
  _mesa_key_pointer_equal);
  symbols = _mesa_symbol_table_ctor();
  mem_ctx = ralloc_context(NULL);
   }
 
   ir_print_visitor::~ir_print_visitor()
   {
  -   hash_table_dtor(printable_names);
  +   _mesa_hash_table_destroy(printable_names, NULL);
  _mesa_symbol_table_dtor(symbols);
  ralloc_free(mem_ctx);
   }
  @@ -121,9 +121,14 @@ ir_print_visitor::unique_name(ir_variable *var)
  }
 
  /* Do we already have a name for this variable? */
  -   const char *name = (const char *) hash_table_find(this->printable_names, var);
  -   if (name != NULL)
  -      return name;
  +   struct hash_entry * entry =
  +      _mesa_hash_table_search(this->printable_names, var);
  +
  +   if (entry != NULL) {
  +      return (const char *) entry->data;
  +   }
  +
  +   const char* name = NULL;
 
  The above looks a bit funny just floating here maybe move it
 
 
  /* If there's no conflict, just use the original name */
  Here.
      if (_mesa_symbol_table_find_symbol(this->symbols, -1, var->name) == NULL) {
  @@ -132,7 +137,7 @@ ir_print_visitor::unique_name(ir_variable *var)
         static unsigned i = 1;
         name = ralloc_asprintf(this->mem_ctx, "%s@%u", var->name, ++i);
      }
  -   hash_table_insert(this->printable_names, (void *) name, var);
  +   _mesa_hash_table_insert(this->printable_names, var, (void *) name);
      _mesa_symbol_table_add_symbol(this->symbols, -1, name, var);
      return name;
   }
 
  With those couple of small changes you can add to the commit message
  Reviewed-by: Timothy Arceri t_arc...@yahoo.com.au
 


[Mesa-dev] [PATCH] glsl: when generating out/inout parameter fixups, do indexing before the call

2015-07-30 Thread Chris Forbes
If the indexing expression involves anything modified during the call,
the fixup code to copy back after the call would use the new values.

This fixes the cases where the first expression node encountered is
ir_binop_vector_extract, fixing the piglits:

* shaders@out-parameter-indexing@vs-inout-index-inout-mat2-row
* shaders@out-parameter-indexing@vs-inout-index-inout-vec4
* shaders@out-parameter-indexing@vs-inout-index-inout-vec4-array-element

Further changes are needed for other expression types.

Signed-off-by: Chris Forbes <chr...@ijw.co.nz>
Cc: Ben Widawsky <b...@bwidawsk.net>
---
 src/glsl/ast_function.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 803edf5..e7147dd 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -299,6 +299,21 @@ fix_parameter(void *mem_ctx, ir_rvalue *actual, const glsl_type *formal_type,
 
    before_instructions->push_tail(tmp);
 
+   if (expr != NULL && expr->operation == ir_binop_vector_extract) {
+      /* We're going to have to emit a matching insert after the call.
+       * evaluate the indexing expression before the call, because it
+       * may reference things that change during the call.
+       */
+      ir_variable *index_tmp = new(mem_ctx) ir_variable(expr->operands[1]->type,
+            "inout_index_tmp", ir_var_temporary);
+      before_instructions->push_tail(index_tmp);
+      before_instructions->push_tail(
+            new(mem_ctx) ir_assignment(
+               new(mem_ctx) ir_dereference_variable(index_tmp),
+               expr->operands[1]->clone(mem_ctx, NULL)));
+      expr->operands[1] = new(mem_ctx) ir_dereference_variable(index_tmp);
+   }
+
    /* If the parameter is an inout parameter, copy the value of the actual
     * parameter to the new temporary.  Note that no type conversion is allowed
     * here because inout parameters must match types exactly.
-- 
2.5.0
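
To illustrate the problem the commit message describes, here is a small sketch in plain C++ rather than GLSL IR. It models copy-in/copy-out parameter passing by hand. The function names and the callee are hypothetical; the only point is the difference between re-evaluating the index after the call and snapshotting it before.

```cpp
#include <array>
#include <cassert>

// Hypothetical callee: writes the element and also changes the index,
// as an inout/inout pair could in a shader.
static void callee(float &elem, int &index)
{
   elem = 42.0f;
   index = 3;
}

// Lowering without the fix: the copy-back re-reads the index after the
// call, so it stores to whatever element the *new* index names.
std::array<float, 4> call_with_late_index(std::array<float, 4> v, int i)
{
   assert(i >= 0 && i < 4);
   float tmp = v[i];          // copy-in
   callee(tmp, i);
   v[i] = tmp;                // copy-back uses the modified i
   return v;
}

// Lowering with the fix: the index is evaluated into a temporary before
// the call, and the same temporary is used for the copy-back.
std::array<float, 4> call_with_early_index(std::array<float, 4> v, int i)
{
   assert(i >= 0 && i < 4);
   const int index_tmp = i;   // evaluated before the call
   float tmp = v[index_tmp];  // copy-in
   callee(tmp, i);
   v[index_tmp] = tmp;        // copy-back to the originally named element
   return v;
}
```

Calling both with i == 1 shows the difference: the early-index version updates element 1, while the late-index version updates element 3.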



Re: [Mesa-dev] [PATCH] glsl: recognize ARB_shading_language_420pack to be enabled with 4.20+

2015-07-24 Thread Chris Forbes
Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Sat, Jul 25, 2015 at 9:07 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 The 420pack extension enables various GLSL rules that need to be applied
 to any GLSL 4.20+ shader even if the extension is not explicitly
 enabled.

 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---
  src/glsl/glsl_parser.yy   | 18 +-
  src/glsl/glsl_parser_extras.h |  5 +
  2 files changed, 14 insertions(+), 9 deletions(-)

 diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
 index 7de31d9..4cce5b8 100644
 --- a/src/glsl/glsl_parser.yy
 +++ b/src/glsl/glsl_parser.yy
 @@ -934,7 +934,7 @@ parameter_qualifier:
        if (($1.flags.q.in || $1.flags.q.out) && ($2.flags.q.in || $2.flags.q.out))
           _mesa_glsl_error(&@1, state, "duplicate in/out/inout qualifier");

 -      if (!state->ARB_shading_language_420pack_enable && $2.flags.q.constant)
 +      if (!state->has_420pack() && $2.flags.q.constant)
           _mesa_glsl_error(&@1, state, "in/out/inout must come after const "
                                        "or precise");

 @@ -946,7 +946,7 @@ parameter_qualifier:
        if ($2.precision != ast_precision_none)
           _mesa_glsl_error(&@1, state, "duplicate precision qualifier");

 -      if (!state->ARB_shading_language_420pack_enable && $2.flags.i != 0)
 +      if (!state->has_420pack() && $2.flags.i != 0)
           _mesa_glsl_error(&@1, state, "precision qualifiers must come last");

        $$ = $2;
 @@ -1458,7 +1458,7 @@ layout_qualifier_id:
           }
        }

 -      if ((state->ARB_shading_language_420pack_enable ||
 +      if ((state->has_420pack() ||
             state->has_atomic_counters() ||
             state->ARB_shader_storage_buffer_object_enable) &&
            match_layout_qualifier("binding", $1, state) == 0) {
 @@ -1729,7 +1729,7 @@ type_qualifier:
        if ($2.flags.q.invariant)
           _mesa_glsl_error(&@1, state, "duplicate \"invariant\" qualifier");

 -      if (!state->ARB_shading_language_420pack_enable && $2.flags.q.precise)
 +      if (!state->has_420pack() && $2.flags.q.precise)
           _mesa_glsl_error(&@1, state,
                            "\"invariant\" must come after \"precise\"");

 @@ -1762,7 +1762,7 @@ type_qualifier:
        if ($2.has_interpolation())
           _mesa_glsl_error(&@1, state, "duplicate interpolation qualifier");

 -      if (!state->ARB_shading_language_420pack_enable &&
 +      if (!state->has_420pack() &&
            ($2.flags.q.precise || $2.flags.q.invariant)) {
           _mesa_glsl_error(&@1, state, "interpolation qualifiers must come "
                            "after \"precise\" or \"invariant\"");
 @@ -1782,7 +1782,7 @@ type_qualifier:
         * precise qualifiers since these are useful in ARB_separate_shader_objects.
         * There is no clear spec guidance on this either.
         */
 -      if (!state->ARB_shading_language_420pack_enable && $2.has_layout())
 +      if (!state->has_420pack() && $2.has_layout())
           _mesa_glsl_error(&@1, state, "duplicate layout(...) qualifiers");

        $$ = $1;
 @@ -1800,7 +1800,7 @@ type_qualifier:
                            "duplicate auxiliary storage qualifier (centroid or sample)");
        }

 -      if (!state->ARB_shading_language_420pack_enable &&
 +      if (!state->has_420pack() &&
            ($2.flags.q.precise || $2.flags.q.invariant ||
             $2.has_interpolation() || $2.has_layout())) {
           _mesa_glsl_error(&@1, state, "auxiliary storage qualifiers must come "
 @@ -1818,7 +1818,7 @@ type_qualifier:
        if ($2.has_storage())
           _mesa_glsl_error(&@1, state, "duplicate storage qualifier");

 -      if (!state->ARB_shading_language_420pack_enable &&
 +      if (!state->has_420pack() &&
            ($2.flags.q.precise || $2.flags.q.invariant || $2.has_interpolation() ||
             $2.has_layout() || $2.has_auxiliary_storage())) {
           _mesa_glsl_error(&@1, state, "storage qualifiers must come after "
 @@ -1834,7 +1834,7 @@ type_qualifier:
        if ($2.precision != ast_precision_none)
           _mesa_glsl_error(&@1, state, "duplicate precision qualifier");

 -      if (!state->ARB_shading_language_420pack_enable && $2.flags.i != 0)
 +      if (!state->has_420pack() && $2.flags.i != 0)
           _mesa_glsl_error(&@1, state, "precision qualifiers must come last");

        $$ = $2;
 diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
 index b65d53d..eb325f0 100644
 --- a/src/glsl/glsl_parser_extras.h
 +++ b/src/glsl/glsl_parser_extras.h
 @@ -231,6 +231,11 @@ struct _mesa_glsl_parse_state {
return ARB_gpu_shader_fp64_enable || is_version(400, 0);
 }

 +   bool has_420pack() const
 +   {
 +  return ARB_shading_language_420pack_enable || is_version(420, 0);
 +   }
 +
 void process_version_directive(YYLTYPE *locp, int version,
const char *ident);

 --
 2.3.6
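
The helper the patch adds follows a common pattern: a feature counts as available if either the extension was enabled explicitly or the shader's #version already includes it. A minimal sketch of that pattern follows. ParseState is a hypothetical, trimmed-down stand-in for _mesa_glsl_parse_state, and is_version() is simplified to desktop GLSL only (the real one also takes an ES version).

```cpp
#include <cassert>

// Hypothetical, trimmed-down stand-in for the parser state.
struct ParseState {
   unsigned language_version = 110;  // e.g. 420 for "#version 420"
   bool es_shader = false;
   bool ARB_shading_language_420pack_enable = false;

   // Simplified assumption: desktop GLSL only, no ES version argument.
   bool is_version(unsigned required_glsl) const
   {
      assert(required_glsl != 0);
      return !es_shader && language_version >= required_glsl;
   }

   // Feature is on if the extension is enabled OR the version implies it.
   bool has_420pack() const
   {
      return ARB_shading_language_420pack_enable || is_version(420);
   }
};
```

With this shape, every call site asks one question (has_420pack()) instead of testing the extension flag directly, which is what lets a "#version 420" shader get the relaxed qualifier ordering without an #extension line.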


Re: [Mesa-dev] [PATCH] i965: Use updated kernel interface for accurate TIMESTAMP reads

2015-07-23 Thread Chris Forbes
This fixes my HSW getting dropped back to 3.2 most of the time, and
seems like the reasonable thing to do.

Tested-and-acked-by: Chris Forbes chr...@ijw.co.nz

On Tue, Jul 21, 2015 at 11:58 PM, Chris Wilson ch...@chris-wilson.co.uk wrote:
 I was mistaken, I thought we already had fixed this in the kernel a
 couple of years ago. We had not, and the broken read (the hardware
 shifts the register output on 64bit kernels, but not on 32bit kernels) is
 now enshrined into the ABI. I also had the buggy architecture reversed,
 believing it to be 32bit that had the shifted results. On the basis of
 those mistakes, I wrote

 commit c8d3ebaffc0d7d915c1c19d54dba61fd1e57b338
 Author: Chris Wilson ch...@chris-wilson.co.uk
 Date:   Wed Apr 29 13:32:38 2015 +0100

 i965: Query whether we have kernel support for the TIMESTAMP register once

 Now that we do have an extended register read interface for always
 reporting the full 36bit TIMESTAMP (irrespective of whether the kernel
 is buggy or not), make use of it and in the process fix my reversed
 detection of the buggy reads for unpatched kernels.

 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Martin Peres martin.pe...@linux.intel.com
 Cc: Kenneth Graunke kenn...@whitecape.org
 Cc: Michał Winiarski michal.winiar...@intel.com
 Cc: Daniel Vetter dan...@ffwll.ch
 ---
  src/mesa/drivers/dri/i965/brw_queryobj.c | 15 --
  src/mesa/drivers/dri/i965/intel_screen.c | 49 
 +++-
  src/mesa/drivers/dri/i965/intel_screen.h |  2 +-
  3 files changed, 49 insertions(+), 17 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c 
 b/src/mesa/drivers/dri/i965/brw_queryobj.c
 index aea4d9b..6f1accd 100644
 --- a/src/mesa/drivers/dri/i965/brw_queryobj.c
 +++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
 @@ -497,13 +497,22 @@ brw_get_timestamp(struct gl_context *ctx)
 struct brw_context *brw = brw_context(ctx);
 uint64_t result = 0;

 -   drm_intel_reg_read(brw->bufmgr, TIMESTAMP, &result);
 +   switch (brw->intelScreen->hw_has_timestamp) {
 +   case 3: /* New kernel, always full 36bit accuracy */
 +      drm_intel_reg_read(brw->bufmgr, TIMESTAMP | 1, &result);
 +      break;
 +   case 2: /* 64bit kernel, result is right-shifted by 32bits, losing 4bits */
 +      drm_intel_reg_read(brw->bufmgr, TIMESTAMP, &result);
 +      result = result >> 32;
 +      break;
 +   case 1: /* 32bit kernel, result is 36bit wide but may be inaccurate! */
 +      drm_intel_reg_read(brw->bufmgr, TIMESTAMP, &result);
 +      break;
 +   }

     /* See logic in brw_queryobj_get_results() */
 -   result = result >> 32;
     result *= 80;
     result &= (1ull << 36) - 1;
 -
     return result;
  }

 diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
 b/src/mesa/drivers/dri/i965/intel_screen.c
 index 1470b05..65a1766 100644
 --- a/src/mesa/drivers/dri/i965/intel_screen.c
 +++ b/src/mesa/drivers/dri/i965/intel_screen.c
 @@ -1123,25 +1123,48 @@ intel_detect_swizzling(struct intel_screen *screen)
return true;
  }

 -static bool
 +static int
  intel_detect_timestamp(struct intel_screen *screen)
  {
 -   uint64_t dummy = 0;
 -   int loop = 10;
 +   uint64_t dummy = 0, last = 0;
 +   int upper, lower, loops;

 -   /*
 -* On 32bit systems, some old kernels trigger a hw bug resulting in the
 -* TIMESTAMP register being shifted and the low 32bits always zero. Detect
 -* this by repeating the read a few times and check the register is
 -* incrementing every 80ns as expected and not stuck on zero (as would be
 -* the case with the buggy kernel/hw.).
 +   /* On 64bit systems, some old kernels trigger a hw bug resulting in the
 +* TIMESTAMP register being shifted and the low 32bits always zero.
 +*
 +* More recent kernels offer an interface to read the full 36bits
 +* everywhere.
  */
 -   do {
 +   if (drm_intel_reg_read(screen->bufmgr, TIMESTAMP | 1, &dummy) == 0)
 +      return 3;
 +
 +   /* Determine if we have a 32bit or 64bit kernel by inspecting the
 +    * upper 32bits for a rapidly changing timestamp.
 +    */
 +   if (drm_intel_reg_read(screen->bufmgr, TIMESTAMP, &last))
 +      return 0;
 +
 +   upper = lower = 0;
 +   for (loops = 0; loops < 10; loops++) {
 +      /* The TIMESTAMP should change every 80ns, so several round trips
 +       * through the kernel should be enough to advance it.
 +       */
        if (drm_intel_reg_read(screen->bufmgr, TIMESTAMP, &dummy))
 -         return false;
 -   } while ((dummy & 0xffffffff) == 0 && --loop);
 +         return 0;
 +
 +      upper += (dummy >> 32) != (last >> 32);
 +      if (upper > 1) /* beware 32bit counter overflow */
 +         return 2; /* upper dword holds the low 32bits of the timestamp */
 +
 +      lower += (dummy & 0xffffffff) != (last & 0xffffffff);
 +      if (lower > 1)
 +         return 1; /* timestamp is unshifted */
 +
 +      last = dummy;
 +   }

 -   return loop > 0;
 +   /* No advancement? No timestamp! */
 +   return 0;
  }

  /**
 diff --git a/src/mesa/drivers
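
The detection and decoding logic in the patch can be exercised without hardware. The sketch below is an illustration only: it replaces the drm_intel_reg_read() calls with an array of already-read values, and the enum names are hypothetical labels for the 0/1/2/3 return codes used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for the detection results described in the patch.
enum TimestampMode {
   TIMESTAMP_NONE = 0,       // no usable timestamp
   TIMESTAMP_UNSHIFTED = 1,  // 32bit kernel: value read as-is
   TIMESTAMP_SHIFTED = 2,    // 64bit kernel: value sits in the upper dword
   TIMESTAMP_FULL = 3,       // new kernel interface: full 36 bits
};

// Classify a series of raw reads by watching which half advances,
// mirroring the upper/lower counters in intel_detect_timestamp().
TimestampMode classify_reads(const uint64_t *reads, int count)
{
   assert(reads != nullptr);
   int upper = 0, lower = 0;
   for (int i = 1; i < count; i++) {
      upper += (reads[i] >> 32) != (reads[i - 1] >> 32);
      if (upper > 1)  // one change could just be a 32bit counter overflow
         return TIMESTAMP_SHIFTED;

      lower += (reads[i] & 0xffffffffull) != (reads[i - 1] & 0xffffffffull);
      if (lower > 1)
         return TIMESTAMP_UNSHIFTED;
   }
   return TIMESTAMP_NONE;  // no advancement, no timestamp
}

// Convert a raw read to nanoseconds, wrapped to 36 bits.
uint64_t decode_timestamp(TimestampMode mode, uint64_t raw)
{
   assert(mode != TIMESTAMP_NONE);
   uint64_t result = raw;
   if (mode == TIMESTAMP_SHIFTED)
      result >>= 32;
   result *= 80;                  // one tick is 80ns
   result &= (1ull << 36) - 1;
   return result;
}
```

Requiring more than one change in a half before deciding is what keeps a single carry into the upper dword from being mistaken for a shifted register.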

Re: [Mesa-dev] [PATCH v2] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-07-13 Thread Chris Forbes
Oh, never mind - I see there's another hunk that my mailer had folded away
for some reason. I'm happy that it's correct now :)
On Jul 13, 2015 23:33, Neil Roberts n...@linux.intel.com wrote:

 Chris Forbes chr...@ijw.co.nz writes:

  Nitpicks aside, I don't think this is a great idea now that you've got
  the SKL PI working.

 Can you explain why you don't think this is a good idea? Is it because
 it is an optimisation for something that is not known to be a big use
 case so carrying around the extra code just adds unnecessary maintenance
 burden? I could agree with that so I'm happy to abandon the patch for
 now if that's the general consensus.

  I also think it's broken -- you need to arrange to have the centroid
  barycentric coords delivered to the FS thread, which won't be
  happening if this is the *only* use of them. Masked in the tests,
  because they compare with a centroid-qualified input. [I'm assuming
  you don't always get these delivered to the FS in SKL, but no docs
  access...]

 The changes to brw_compute_barycentric_interp_modes in the patch ensure
 that the centroid barycentric coords are delivered whenever
 interpolateAtCentroid is used in a shader. I don't think this is a
 problem. At least it seems to work in a simple test without using a
 separate varying with the centroid qualifier.

 Regards,
 - Neil



Re: [Mesa-dev] [PATCH] glsl: use set rather than old hash table for ir_validate

2015-07-10 Thread Chris Forbes
Perf data?

On Fri, Jul 10, 2015 at 6:41 PM, Timothy Arceri t_arc...@yahoo.com.au wrote:
 This implementation should be faster and there was no
 need to store a data field.
 ---
  src/glsl/ir_validate.cpp | 24 
  1 file changed, 12 insertions(+), 12 deletions(-)

 diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
 index cfe0df3..684bef2 100644
 --- a/src/glsl/ir_validate.cpp
 +++ b/src/glsl/ir_validate.cpp
 @@ -35,7 +35,8 @@

  #include "ir.h"
  #include "ir_hierarchical_visitor.h"
 -#include "program/hash_table.h"
 +#include "util/hash_table.h"
 +#include "util/set.h"
  #include "glsl_types.h"

  namespace {
 @@ -44,18 +45,18 @@ class ir_validate : public ir_hierarchical_visitor {
  public:
 ir_validate()
 {
 -      this->ht = hash_table_ctor(0, hash_table_pointer_hash,
 -                                 hash_table_pointer_compare);
 +      this->ir_set = _mesa_set_create(NULL, _mesa_hash_pointer,
 +                                      _mesa_key_pointer_equal);

        this->current_function = NULL;

        this->callback_enter = ir_validate::validate_ir;
 -      this->data_enter = ht;
 +      this->data_enter = ir_set;
     }

     ~ir_validate()
     {
 -      hash_table_dtor(this->ht);
 +      _mesa_set_destroy(this->ir_set, NULL);
 }

 virtual ir_visitor_status visit(ir_variable *v);
 @@ -80,7 +81,7 @@ public:

 ir_function *current_function;

 -   struct hash_table *ht;
 +   struct set *ir_set;
  };

  } /* anonymous namespace */
 @@ -94,7 +95,7 @@ ir_validate::visit(ir_dereference_variable *ir)
abort();
 }

 -   if (hash_table_find(ht, ir->var) == NULL) {
 +   if (_mesa_set_search(ir_set, ir->var) == NULL) {
        printf("ir_dereference_variable @ %p specifies undeclared variable "
               "`%s' @ %p\n",
               (void *) ir, ir->var->name, (void *) ir->var);
 @@ -730,8 +731,7 @@ ir_validate::visit(ir_variable *ir)
     if (ir->name && ir->is_name_ralloced())
        assert(ralloc_parent(ir->name) == ir);

 -   hash_table_insert(ht, ir, ir);
 -
 +   _mesa_set_add(ir_set, ir);

 /* If a variable is an array, verify that the maximum array index is in
  * bounds.  There was once an error in AST-to-HIR conversion that set this
 @@ -885,15 +885,15 @@ dump_ir:
  void
  ir_validate::validate_ir(ir_instruction *ir, void *data)
  {
 -   struct hash_table *ht = (struct hash_table *) data;
 +   struct set *ir_set = (struct set *) data;

 -   if (hash_table_find(ht, ir)) {
 +   if (_mesa_set_search(ir_set, ir)) {
        printf("Instruction node present twice in ir tree:\n");
        ir->print();
        printf("\n");
        abort();
     }
 -   hash_table_insert(ht, ir, ir);
 +   _mesa_set_add(ir_set, ir);
  }

  void
 --
 2.4.3
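
The reason a set suffices here is that the validator only ever asks "have I seen this pointer before?" and never reads back a stored value. A minimal sketch of that idea follows, using std::unordered_set in place of Mesa's util/set; Node and DuplicateChecker are hypothetical names, not Mesa types.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical IR node; only its address is used as identity.
struct Node {
   int id;
};

class DuplicateChecker {
public:
   // Returns true the first time a node is seen, false if it was
   // already present (the "node present twice in ir tree" case).
   bool visit(const Node *node)
   {
      assert(node != nullptr);
      return seen.insert(node).second;
   }

   // Membership test, analogous to checking that a dereferenced
   // variable was declared earlier.
   bool contains(const Node *node) const
   {
      return seen.count(node) != 0;
   }

private:
   std::unordered_set<const Node *> seen;
};
```

Whether this is measurably faster than the key/value table is exactly the question raised in the reply above; the sketch only shows that the data field is not needed.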



Re: [Mesa-dev] [Mesa-stable] [PATCH] i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used

2015-07-10 Thread Chris Forbes
Surely the *right* thing would be to have the correct order expressed
in brw->vb.* instead, so you don't have this workaround in multiple
places.

As a minimal fix for stable though, this seems OK, so -

Reviewed-by: Chris Forbes chr...@ijw.co.nz



On Sat, Jul 11, 2015 at 5:04 AM, Neil Roberts n...@linux.intel.com wrote:
 When the edge flag element is enabled then the elements are slightly
 reordered so that the edge flag is always the last one. This was
 confusing the code to upload the 3DSTATE_VF_INSTANCING state because
 that is uploaded with a separate loop which has an instruction for
 each element. The indices used in these instructions weren't taking
 into account the reordering so the state would be incorrect.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91292
 Cc: 10.6 10.5 mesa-sta...@lists.freedesktop.org
 ---
  src/mesa/drivers/dri/i965/gen8_draw_upload.c | 15 +--
  1 file changed, 13 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
 b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
 index 1af90ec..65b7625 100644
 --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
 +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
 @@ -218,13 +218,24 @@ gen8_emit_vertices(struct brw_context *brw)
 }
 ADVANCE_BATCH();

 -   for (unsigned i = 0; i < brw->vb.nr_enabled; i++) {
 +   for (unsigned i = 0, j = 0; i < brw->vb.nr_enabled; i++) {
        const struct brw_vertex_element *input = brw->vb.enabled[i];
        const struct brw_vertex_buffer *buffer = &brw->vb.buffers[input->buffer];
 +      unsigned element_index;
 +
 +      /* The edge flag element is reordered to be the last one in the code
 +       * above so we need to compensate for that in the element indices used
 +       * below.
 +       */
 +      if (input == gen6_edgeflag_input)
 +         element_index = brw->vb.nr_enabled - 1;
 +      else
 +         element_index = j++;

        BEGIN_BATCH(3);
        OUT_BATCH(_3DSTATE_VF_INSTANCING << 16 | (3 - 2));
 -      OUT_BATCH(i | (buffer->step_rate ? GEN8_VF_INSTANCING_ENABLE : 0));
 +      OUT_BATCH(element_index |
 +                (buffer->step_rate ? GEN8_VF_INSTANCING_ENABLE : 0));
OUT_BATCH(buffer-step_rate);
ADVANCE_BATCH();
 }
 --
 1.9.3
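
The index compensation in the loop above can be checked in isolation. The sketch below is a standalone illustration, not driver code: element_indices and edgeflag_pos are hypothetical names, and the edge flag input is identified by its position in the enabled list rather than by pointer comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For each enabled input, in its original order, compute the hardware
// element index given that the edge flag element is always emitted last.
// edgeflag_pos is the position of the edge flag input, or -1 if none.
std::vector<unsigned> element_indices(std::size_t nr_enabled, int edgeflag_pos)
{
   assert(edgeflag_pos < static_cast<int>(nr_enabled));

   std::vector<unsigned> out(nr_enabled);
   unsigned j = 0;  // running index over the non-edge-flag elements
   for (std::size_t i = 0; i < nr_enabled; i++) {
      if (static_cast<int>(i) == edgeflag_pos)
         out[i] = static_cast<unsigned>(nr_enabled - 1);
      else
         out[i] = j++;
   }
   return out;
}
```

For four enabled inputs with the edge flag at position 1, the inputs map to elements 0, 3, 1, 2; with no edge flag the mapping is the identity, which is why the bug only showed up when edge flags were in use.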



Re: [Mesa-dev] [PATCH v2] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-07-10 Thread Chris Forbes
s/interpolater/interpolator/g

On Fri, Jul 10, 2015 at 1:31 AM, Neil Roberts n...@linux.intel.com wrote:
 For centroid interpolation we can just directly use the values set up
 in the shader payload instead of querying the pixel interpolator. To
 do this we need to modify brw_compute_barycentric_interp_modes to
 detect when interpolateAtCentroid is called.

 v2: Rebase on top of changes to set the pulls bary bit on SKL
 ---

 As an aside, I was deliberating over whether to call the function
 set_up_blah instead of setup_blah because I think the former is more
 correct. The rest of Mesa seems to use setup so maybe it's more
 important to be consistent than correct.

  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 52 +++---
  src/mesa/drivers/dri/i965/brw_wm.c   | 55 
 
  2 files changed, 88 insertions(+), 19 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 index 5d1ea21..fd7f1b8 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 @@ -1238,6 +1238,25 @@ fs_visitor::emit_percomp(const fs_builder bld, const 
 fs_inst inst,
 }
  }

 +/* For most messages, we need one reg of ignored data; the hardware requires
 + * mlen==1 even when there is no payload. in the per-slot offset case, we'll
 + * replace this with the proper source data.
 + */
 +static void
 +setup_pixel_interpolater_instruction(fs_visitor *v,
 + nir_intrinsic_instr *instr,
 + fs_inst *inst,
 + int mlen = 1)
 +{
 +   inst->mlen = mlen;
 +   inst->regs_written = 2 * v->dispatch_width / 8;
 +   inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
 +                            INTERP_QUALIFIER_NOPERSPECTIVE;
 +
 +   assert(v->stage == MESA_SHADER_FRAGMENT);
 +   ((struct brw_wm_prog_data *) v->prog_data)->pulls_bary = true;
 +}
 +
  void
  fs_visitor::nir_emit_intrinsic(const fs_builder bld, nir_intrinsic_instr 
 *instr)
  {
 @@ -1482,25 +1501,23 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr
 case nir_intrinsic_interp_var_at_centroid:
 case nir_intrinsic_interp_var_at_sample:
 case nir_intrinsic_interp_var_at_offset: {
 -  assert(stage == MESA_SHADER_FRAGMENT);
 -
 -      ((struct brw_wm_prog_data *) prog_data)->pulls_bary = true;
 -
fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);

 -  /* For most messages, we need one reg of ignored data; the hardware
 -   * requires mlen==1 even when there is no payload. in the per-slot
 -   * offset case, we'll replace this with the proper source data.
 -   */
fs_reg src = vgrf(glsl_type::float_type);
 -  int mlen = 1; /* one reg unless overriden */
fs_inst *inst;

        switch (instr->intrinsic) {
 -      case nir_intrinsic_interp_var_at_centroid:
 -         inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_CENTROID,
 -                         dst_xy, src, fs_reg(0u));
 +      case nir_intrinsic_interp_var_at_centroid: {
 +         enum brw_wm_barycentric_interp_mode interp_mode;
 +         if (instr->variables[0]->var->data.interpolation ==
 +             INTERP_QUALIFIER_NOPERSPECTIVE)
 +            interp_mode = BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC;
 +         else
 +            interp_mode = BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC;
 +         uint8_t reg = payload.barycentric_coord_reg[interp_mode];
 +         dst_xy = fs_reg(brw_vec16_grf(reg, 0));
           break;
 +      }

case nir_intrinsic_interp_var_at_sample: {
   /* XXX: We should probably handle non-constant sample id's */
 @@ -1509,6 +1526,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr
           unsigned msg_data = const_sample ? const_sample->i[0] << 4 : 0;
   inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SAMPLE, dst_xy, src,
   fs_reg(msg_data));
 + setup_pixel_interpolater_instruction(this, instr, inst);
   break;
}

 @@ -1521,6 +1539,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr

              inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET, dst_xy, src,
                              fs_reg(off_x | (off_y << 4)));
 +            setup_pixel_interpolater_instruction(this, instr, inst);
           } else {
              src = vgrf(glsl_type::ivec2_type);
              fs_reg offset_src = retype(get_nir_src(instr->src[0]),
 @@ -1550,9 +1569,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr
 bld.SEL(offset(src, bld, i), itemp, fs_reg(7)));
  }

 -mlen = 2 * dispatch_width / 8;
  inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, 
 dst_xy, src,
  fs_reg(0u));

Re: [Mesa-dev] [PATCH v2] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-07-10 Thread Chris Forbes
Nitpicks aside, I don't think this is a great idea now that you've got
the SKL PI working.

I also think it's broken -- you need to arrange to have the centroid
barycentric coords delivered to the FS thread, which won't be
happening if this is the *only* use of them. Masked in the tests,
because they compare with a centroid-qualified input. [I'm assuming
you don't always get these delivered to the FS in SKL, but no docs
access...]

- Chris

On Sat, Jul 11, 2015 at 11:18 AM, Chris Forbes chr...@ijw.co.nz wrote:
 s/interpolater/interpolator/g

 On Fri, Jul 10, 2015 at 1:31 AM, Neil Roberts n...@linux.intel.com wrote:
 For centroid interpolation we can just directly use the values set up
 in the shader payload instead of querying the pixel interpolator. To
 do this we need to modify brw_compute_barycentric_interp_modes to
 detect when interpolateAtCentroid is called.

 v2: Rebase on top of changes to set the pulls bary bit on SKL
 ---

 As an aside, I was deliberating over whether to call the function
 set_up_blah instead of setup_blah because I think the former is more
 correct. The rest of Mesa seems to use setup so maybe it's more
 important to be consistent than correct.

  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 52 +++---
  src/mesa/drivers/dri/i965/brw_wm.c   | 55 
 
  2 files changed, 88 insertions(+), 19 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 index 5d1ea21..fd7f1b8 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 @@ -1238,6 +1238,25 @@ fs_visitor::emit_percomp(const fs_builder bld, const 
 fs_inst inst,
 }
  }

 +/* For most messages, we need one reg of ignored data; the hardware requires
 + * mlen==1 even when there is no payload. in the per-slot offset case, we'll
 + * replace this with the proper source data.
 + */
 +static void
 +setup_pixel_interpolater_instruction(fs_visitor *v,
 + nir_intrinsic_instr *instr,
 + fs_inst *inst,
 + int mlen = 1)
 +{
 +   inst->mlen = mlen;
 +   inst->regs_written = 2 * v->dispatch_width / 8;
 +   inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
 +                            INTERP_QUALIFIER_NOPERSPECTIVE;
 +
 +   assert(v->stage == MESA_SHADER_FRAGMENT);
 +   ((struct brw_wm_prog_data *) v->prog_data)->pulls_bary = true;
 +}
 +
  void
  fs_visitor::nir_emit_intrinsic(const fs_builder bld, nir_intrinsic_instr 
 *instr)
  {
 @@ -1482,25 +1501,23 @@ fs_visitor::nir_emit_intrinsic(const fs_builder 
 bld, nir_intrinsic_instr *instr
 case nir_intrinsic_interp_var_at_centroid:
 case nir_intrinsic_interp_var_at_sample:
 case nir_intrinsic_interp_var_at_offset: {
 -  assert(stage == MESA_SHADER_FRAGMENT);
 -
 -      ((struct brw_wm_prog_data *) prog_data)->pulls_bary = true;
 -
fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);

 -  /* For most messages, we need one reg of ignored data; the hardware
 -   * requires mlen==1 even when there is no payload. in the per-slot
 -   * offset case, we'll replace this with the proper source data.
 -   */
fs_reg src = vgrf(glsl_type::float_type);
 -  int mlen = 1; /* one reg unless overriden */
fs_inst *inst;

        switch (instr->intrinsic) {
 -      case nir_intrinsic_interp_var_at_centroid:
 -         inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_CENTROID,
 -                         dst_xy, src, fs_reg(0u));
 +      case nir_intrinsic_interp_var_at_centroid: {
 +         enum brw_wm_barycentric_interp_mode interp_mode;
 +         if (instr->variables[0]->var->data.interpolation ==
 +             INTERP_QUALIFIER_NOPERSPECTIVE)
 +            interp_mode = BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC;
 +         else
 +            interp_mode = BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC;
 +         uint8_t reg = payload.barycentric_coord_reg[interp_mode];
 +         dst_xy = fs_reg(brw_vec16_grf(reg, 0));
           break;
 +      }

case nir_intrinsic_interp_var_at_sample: {
   /* XXX: We should probably handle non-constant sample id's */
 @@ -1509,6 +1526,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr
           unsigned msg_data = const_sample ? const_sample->i[0] << 4 : 0;
   inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SAMPLE, dst_xy, src,
   fs_reg(msg_data));
 + setup_pixel_interpolater_instruction(this, instr, inst);
   break;
}

 @@ -1521,6 +1539,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr

              inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET, dst_xy, src,
                              fs_reg(off_x | (off_y << 4)));
 +setup_pixel_interpolater_instruction(this, instr

Re: [Mesa-dev] [PATCH 07/19] glsl/types: add new subroutine type (v3)

2015-07-09 Thread Chris Forbes
7-12 inclusive are

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Thu, Jul 9, 2015 at 7:17 PM, Dave Airlie airl...@gmail.com wrote:
 From: Dave Airlie airl...@redhat.com

 This type will be used to store the name of subroutine types

 as in subroutine void myfunc(void);
 will store myfunc into a subroutine type.

 This is required so the parser can identify a subroutine
 type in a uniform declaration as a valid type, and also for
 looking up the type later.

 Also add contains_subroutine method.

 v2: handle subroutine to int comparisons, needed
 for lowering pass.
 v3: do subroutine to int with its own IR
 operation to avoid hacking on asserts (Kayden)

 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/glsl/glsl_types.cpp| 63 
 ++
  src/glsl/glsl_types.h  | 19 ++
  src/glsl/ir.cpp|  2 ++
  src/glsl/ir.h  |  1 +
  src/glsl/ir_builder.cpp|  6 
  src/glsl/ir_builder.h  |  1 +
  src/glsl/ir_clone.cpp  |  1 +
  src/glsl/ir_validate.cpp   |  4 +++
  src/glsl/link_uniform_initializers.cpp |  1 +
  9 files changed, 98 insertions(+)

 diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
 index 281ff51..1e3ebb2 100644
 --- a/src/glsl/glsl_types.cpp
 +++ b/src/glsl/glsl_types.cpp
 @@ -32,6 +32,7 @@ mtx_t glsl_type::mutex = _MTX_INITIALIZER_NP;
  hash_table *glsl_type::array_types = NULL;
  hash_table *glsl_type::record_types = NULL;
  hash_table *glsl_type::interface_types = NULL;
 +hash_table *glsl_type::subroutine_types = NULL;
  void *glsl_type::mem_ctx = NULL;

  void
 @@ -159,6 +160,22 @@ glsl_type::glsl_type(const glsl_struct_field *fields, 
 unsigned num_fields,
 mtx_unlock(glsl_type::mutex);
  }

 +glsl_type::glsl_type(const char *subroutine_name) :
 +   gl_type(0),
 +   base_type(GLSL_TYPE_SUBROUTINE),
 +   sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
 +   sampler_type(0), interface_packing(0),
 +   vector_elements(0), matrix_columns(0),
 +   length(0)
 +{
 +   mtx_lock(glsl_type::mutex);
 +
 +   init_ralloc_type_ctx();
 +   assert(subroutine_name != NULL);
 +   this->name = ralloc_strdup(this->mem_ctx, subroutine_name);
 +   this->vector_elements = 1;
 +   mtx_unlock(glsl_type::mutex);
 +}

  bool
  glsl_type::contains_sampler() const
 @@ -229,6 +246,22 @@ glsl_type::contains_opaque() const {
 }
  }

 +bool
 +glsl_type::contains_subroutine() const
 +{
 +   if (this->is_array()) {
 +      return this->fields.array->contains_subroutine();
 +   } else if (this->is_record()) {
 +      for (unsigned int i = 0; i < this->length; i++) {
 +         if (this->fields.structure[i].type->contains_subroutine())
 +            return true;
 +      }
 +      return false;
 +   } else {
 +      return this->is_subroutine();
 +   }
 +}
 +
  gl_texture_index
  glsl_type::sampler_index() const
  {
 @@ -826,6 +859,34 @@ glsl_type::get_interface_instance(const 
 glsl_struct_field *fields,
 return t;
  }

 +const glsl_type *
 +glsl_type::get_subroutine_instance(const char *subroutine_name)
 +{
 +   const glsl_type key(subroutine_name);
 +
 +   mtx_lock(glsl_type::mutex);
 +
 +   if (subroutine_types == NULL) {
 +  subroutine_types = hash_table_ctor(64, record_key_hash, 
 record_key_compare);
 +   }
 +
 +   const glsl_type *t = (glsl_type *) hash_table_find(subroutine_types, & key);
 +   if (t == NULL) {
 +  mtx_unlock(glsl_type::mutex);
 +  t = new glsl_type(subroutine_name);
 +  mtx_lock(glsl_type::mutex);
 +
 +  hash_table_insert(subroutine_types, (void *) t, t);
 +   }
 +
 +   assert(t->base_type == GLSL_TYPE_SUBROUTINE);
 +   assert(strcmp(t->name, subroutine_name) == 0);
 +
 +   mtx_unlock(glsl_type::mutex);
 +
 +   return t;
 +}
 +

  const glsl_type *
  glsl_type::get_mul_type(const glsl_type *type_a, const glsl_type *type_b)
 @@ -958,6 +1019,7 @@ glsl_type::component_slots() const
 case GLSL_TYPE_SAMPLER:
 case GLSL_TYPE_ATOMIC_UINT:
 case GLSL_TYPE_VOID:
 +   case GLSL_TYPE_SUBROUTINE:
 case GLSL_TYPE_ERROR:
break;
 }
 @@ -1331,6 +1393,7 @@ glsl_type::count_attribute_slots() const
 case GLSL_TYPE_IMAGE:
 case GLSL_TYPE_ATOMIC_UINT:
 case GLSL_TYPE_VOID:
 +   case GLSL_TYPE_SUBROUTINE:
 case GLSL_TYPE_ERROR:
break;
 }
 diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h
 index f54a939..0f4dc80 100644
 --- a/src/glsl/glsl_types.h
 +++ b/src/glsl/glsl_types.h
 @@ -59,6 +59,7 @@ enum glsl_base_type {
 GLSL_TYPE_INTERFACE,
 GLSL_TYPE_ARRAY,
 GLSL_TYPE_VOID,
 +   GLSL_TYPE_SUBROUTINE,
 GLSL_TYPE_ERROR
  };

 @@ -264,6 +265,11 @@ struct glsl_type {
   const char *block_name);

 /**
 +* Get the instance of an subroutine type
 +*/
 +   static const glsl_type *get_subroutine_instance(const char 
 *subroutine_name);
 +
 +   /**
  * Get the type resulting

Re: [Mesa-dev] [PATCH 09/19] glsl/ir: add subroutine information storage to ir_function (v1.1)

2015-07-09 Thread Chris Forbes
Do you really need is_subroutine_def? It seems redundant with
num_subroutine_types > 0.
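
A minimal sketch of that suggestion, for illustration only. The struct below is a trimmed-down stand-in for ir_function, and the name function_info and the accessor are made up for this example; they are not Mesa code.

```cpp
struct glsl_type;  // opaque here; only pointers to it are stored

// Hypothetical stand-in for ir_function, reduced to the members discussed.
struct function_info {
   int num_subroutine_types = 0;
   const glsl_type **subroutine_types = nullptr;

   // Derived instead of stored: a function is a subroutine definition
   // exactly when it lists at least one subroutine type.
   bool is_subroutine_def() const { return num_subroutine_types > 0; }
};
```

Deriving the flag means clone() and similar code have one field fewer to keep in sync, at the cost of not being able to express "definition with an empty type list", which the patch does not appear to need.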

On Thu, Jul 9, 2015 at 7:17 PM, Dave Airlie airl...@gmail.com wrote:
 From: Dave Airlie airl...@redhat.com

 We need to store two sets of info into the ir_function,
 if this is a function definition with a subroutine list
 (subroutine_def) or if it is a subroutine prototype.

 v1.1: add some more documentation.

 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/glsl/ir.cpp   |  4 
  src/glsl/ir.h | 16 
  src/glsl/ir_clone.cpp |  7 +++
  src/glsl/ir_print_visitor.cpp |  2 +-
  4 files changed, 28 insertions(+), 1 deletion(-)

 diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
 index 38a5e2a..2fbc631 100644
 --- a/src/glsl/ir.cpp
 +++ b/src/glsl/ir.cpp
 @@ -1853,6 +1853,7 @@ static void
  steal_memory(ir_instruction *ir, void *new_ctx)
  {
     ir_variable *var = ir->as_variable();
 +   ir_function *fn = ir->as_function();
     ir_constant *constant = ir->as_constant();
     if (var != NULL && var->constant_value != NULL)
        steal_memory(var->constant_value, ir);
 @@ -1860,6 +1861,9 @@ steal_memory(ir_instruction *ir, void *new_ctx)
     if (var != NULL && var->constant_initializer != NULL)
        steal_memory(var->constant_initializer, ir);

 +   if (fn != NULL && fn->subroutine_types)
 +      ralloc_steal(new_ctx, fn->subroutine_types);
 +
 /* The components of aggregate constants are not visited by the normal
  * visitor, so steal their values by hand.
  */
 diff --git a/src/glsl/ir.h b/src/glsl/ir.h
 index 092c96b..b5a9e99 100644
 --- a/src/glsl/ir.h
 +++ b/src/glsl/ir.h
 @@ -1121,6 +1121,22 @@ public:
  * List of ir_function_signature for each overloaded function with this 
 name.
  */
 struct exec_list signatures;
 +
 +   /**
 +* is this function a subroutine type declaration
 +* e.g. subroutine void type1(float arg1);
 +*/
 +   bool is_subroutine;
 +
 +   /**
 +* is this function associated to a subroutine type
 +* e.g. subroutine (type1, type2) function_name { function_body };
 +* would have this flag set and num_subroutine_types 2,
 +* and pointers to the type1 and type2 types.
 +*/
 +   bool is_subroutine_def;
 +   int num_subroutine_types;
 +   const struct glsl_type **subroutine_types;
  };

  inline const char *ir_function_signature::function_name() const
 diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
 index 49834ff..bf25d6c 100644
 --- a/src/glsl/ir_clone.cpp
 +++ b/src/glsl/ir_clone.cpp
 @@ -267,6 +267,13 @@ ir_function::clone(void *mem_ctx, struct hash_table *ht) 
 const
  {
     ir_function *copy = new(mem_ctx) ir_function(this->name);

 +   copy->is_subroutine = this->is_subroutine;
 +   copy->is_subroutine_def = this->is_subroutine_def;
 +   copy->num_subroutine_types = this->num_subroutine_types;
 +   copy->subroutine_types = ralloc_array(mem_ctx, const struct glsl_type *, 
 copy->num_subroutine_types);
 +   for (int i = 0; i < copy->num_subroutine_types; i++)
 +      copy->subroutine_types[i] = this->subroutine_types[i];
 +
     foreach_in_list(const ir_function_signature, sig, &this->signatures) {
        ir_function_signature *sig_copy = sig->clone(mem_ctx, ht);
        copy->add_signature(sig_copy);
 diff --git a/src/glsl/ir_print_visitor.cpp b/src/glsl/ir_print_visitor.cpp
 index 4cbcad4..f210175 100644
 --- a/src/glsl/ir_print_visitor.cpp
 +++ b/src/glsl/ir_print_visitor.cpp
 @@ -229,7 +229,7 @@ void ir_print_visitor::visit(ir_function_signature *ir)

  void ir_print_visitor::visit(ir_function *ir)
  {
 -   fprintf(f, "(function %s\n", ir->name);
 +   fprintf(f, "(%s function %s\n", ir->is_subroutine ? "subroutine" : "", 
 ir->name);
     indentation++;
     foreach_in_list(ir_function_signature, sig, &ir->signatures) {
indent();
 --
 2.4.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/vs: Fix matNxM vertex attributes where M != 4.

2015-07-07 Thread Chris Forbes
Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Thu, Jul 2, 2015 at 8:08 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 Matrix vertex attributes have their columns padded out to vec4s, which
 I was failing to account for.  Scalar NIR expects them to be packed,
 however.

 Cc: mesa-sta...@lists.freedesktop.org
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++
  1 file changed, 11 insertions(+), 4 deletions(-)

 I still need to write proper Piglit tests for this.  We have basically a 
 single
 test for matrix vertex attributes, and that's a mat4 (which worked).

 But I figure we probably shouldn't hold up the bugfix on that.

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 index caf1300..37b1ed7 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 @@ -91,12 +91,19 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
           * So, we need to copy from fs_reg(ATTR, var->location) to
           * offset(nir_inputs, var->data.driver_location).
           */
 -         unsigned components = var->type->without_array()->components();
 +         const glsl_type *const t = var->type->without_array();
 +         const unsigned components = t->components();
 +         const unsigned cols = t->matrix_columns;
 +         const unsigned elts = t->vector_elements;
           unsigned array_length = var->type->is_array() ? var->type->length : 
 1;
           for (unsigned i = 0; i < array_length; i++) {
 -            for (unsigned j = 0; j < components; j++) {
 -               bld.MOV(retype(offset(input, bld, components * i + j), type),
 -                       offset(fs_reg(ATTR, var->data.location + i, type), 
 bld, j));
 +            for (unsigned j = 0; j < cols; j++) {
 +               for (unsigned k = 0; k < elts; k++) {
 +                  bld.MOV(offset(retype(input, type), bld,
 +                                 components * i + elts * j + k),
 +                          offset(fs_reg(ATTR, var->data.location + i, type),
 +                                 bld, 4 * j + k));
 +               }
              }
           }
   break;
 --
 2.4.4
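
To make the indexing in the patch above concrete, here is a small standalone sketch of the mapping it implements. This is not driver code: packed_index, padded_index and pack_matrix are names invented for the illustration. The assumption, taken from the commit message, is that each column of the incoming attribute is padded to a vec4 while the scalar side wants the components packed.

```cpp
#include <vector>

// Position of component k of column j in array element i, in the packed
// layout (cols * elts components per element).
unsigned packed_index(unsigned i, unsigned j, unsigned k,
                      unsigned cols, unsigned elts)
{
   return cols * elts * i + elts * j + k;
}

// Position of the same component within one element's attribute data,
// where every column occupies a full vec4.
unsigned padded_index(unsigned j, unsigned k)
{
   return 4 * j + k;
}

// Gather one padded matrix attribute into packed form.
std::vector<float> pack_matrix(const std::vector<float> &padded,
                               unsigned cols, unsigned elts)
{
   std::vector<float> packed(cols * elts);
   for (unsigned j = 0; j < cols; j++)
      for (unsigned k = 0; k < elts; k++)
         packed[packed_index(0, j, k, cols, elts)] = padded[padded_index(j, k)];
   return packed;
}
```

When elts is 4 the two indices coincide, which would explain why the existing mat4 test passed before the fix.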



Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator

2015-07-02 Thread Chris Forbes
Looks OK to me. I didn't think there was going to be much required to
make this work -- it's nice that it turned out to be nothing.

Reviewed-by: Chris Forbes chr...@ijw.co.nz

- Chris

On Fri, Jul 3, 2015 at 6:41 AM, Neil Roberts n...@linux.intel.com wrote:
 There was a comment saying that in SIMD16 mode the pixel interpolator
 returns coords interleaved 8 channels at a time and that this requires
 extra work to support. However, this interleaved format is exactly
 what the PLN instruction requires so I don't think anything needs to
 be done to support it apart from removing the line to disable it and
 to ensure that the message lengths for the send message are correct.

 I am more convinced that this is correct because as it says in the
 comment this interleaved output is identical to what is given in the
 thread payload. The code generated to apply the plane equation to
 these coordinates is identical on SIMD16 and SIMD8 except that the
 dispatch width is larger which implies no special unmangling is
 needed.

 Perhaps the confusion stems from the fact that the description of the
 PLN instruction in the IVB PRM seems to imply that the src1 inputs are
 not interleaved so it wouldn't work. However, in the HSW and BDW PRMs,
 the pseudo-code is different and looks like it expects the interleaved
 format. Mesa doesn't seem to generate different code on IVB to
 uninterleave the payload registers and everything is working so I can
 only assume that the PRM is wrong.

 I tested the interpolateAt tests on HSW and did a full Piglit run on
 IVB and there were no regressions.
 ---

 I've CC'd Chris Forbes because according to git-annotate he wrote the
 original comment so he might know something I don't.

  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++
  1 file changed, 3 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 index 59081ea..717e597 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 @@ -1461,12 +1461,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr
 case nir_intrinsic_interp_var_at_centroid:
 case nir_intrinsic_interp_var_at_sample:
 case nir_intrinsic_interp_var_at_offset: {
 -  /* in SIMD16 mode, the pixel interpolator returns coords interleaved
 -   * 8 channels at a time, same as the barycentric coords presented in
 -   * the FS payload. this requires a bit of extra work to support.
 -   */
 -      no16("interpolate_at_* not yet supported in SIMD16 mode.");
 -
fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);

/* For most messages, we need one reg of ignored data; the hardware
 @@ -1531,7 +1525,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr
 bld.SEL(offset(src, i), itemp, fs_reg(7)));
  }

 -mlen = 2;
 +mlen = 2 * dispatch_width / 8;
  inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, 
 dst_xy, src,
  fs_reg(0u));
   }
 @@ -1543,7 +1537,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr
}

        inst->mlen = mlen;
 -      inst->regs_written = 2; /* 2 floats per slot returned */
 +      /* 2 floats per slot returned */
 +      inst->regs_written = 2 * dispatch_width / 8;
        inst->pi_noperspective = instr->variables[0]->var->data.interpolation 
 ==
                                 INTERP_QUALIFIER_NOPERSPECTIVE;

 --
 1.9.3
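
The register arithmetic in the patch above can be sketched as follows. This is an illustration under stated assumptions, not code from the driver: it assumes one register holds 8 float channels, and it takes the "interleaved 8 channels at a time" layout from the commit message. Both function names are invented here.

```cpp
// Registers needed to hold floats_per_slot values for every channel,
// assuming one register holds 8 float channels. dispatch_width is 8 or 16.
unsigned regs_for(unsigned floats_per_slot, unsigned dispatch_width)
{
   return floats_per_slot * dispatch_width / 8;
}

// Register (relative to the start of the result) holding component
// (0 = x, 1 = y) for a given channel, in the interleaved layout
// x[0..7], y[0..7], x[8..15], y[8..15].
unsigned interleaved_reg(unsigned component, unsigned channel)
{
   return 2 * (channel / 8) + component;
}
```

For SIMD8 this gives the old constant of 2 registers, and for SIMD16 it gives 4, matching the `2 * dispatch_width / 8` expression in the diff.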



Re: [Mesa-dev] [Mesa-stable] [PATCH] mesa: reset the source packing when creating temp transfer image

2015-07-01 Thread Chris Forbes
Seems reasonable to me.

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Thu, Jul 2, 2015 at 7:18 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Commit 4b249d2ee (mesa: Handle transferOps in texstore_rgba) introduced
 proper transferops handling, but in updating the source to the newly
 allocated temporary image neglected to reset the source packing. Set it
 to the default which should be appropriate for the floats used.

 Fixes: 4b249d2ee (mesa: Handle transferOps in texstore_rgba)
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91173
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 Cc: 10.5 10.6 mesa-sta...@lists.freedesktop.org
 ---
  src/mesa/main/texstore.c | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
 index 1525205..37c0569 100644
 --- a/src/mesa/main/texstore.c
 +++ b/src/mesa/main/texstore.c
 @@ -787,6 +787,7 @@ texstore_rgba(TEXSTORE_PARAMS)
srcType = GL_FLOAT;
srcRowStride = srcWidth * 4 * sizeof(float);
srcMesaFormat = RGBA32_FLOAT;
 +      srcPacking = &ctx->DefaultPacking;
 }

 src = (GLubyte *)
 --
 2.3.6



Re: [Mesa-dev] [PATCH] i965: allocate at least 1 BLEND_STATE element

2015-07-01 Thread Chris Forbes
Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Thu, Jul 2, 2015 at 4:16 AM, Mike Stroyan m...@lunarg.com wrote:
 When there are no color buffer render targets, gen6 and gen7 still
 use the first BLEND_STATE element to determine alpha test.
 gen6_upload_blend_state was allocating zero elements when
 ctx->Color.AlphaEnabled was false.
 That left _3DSTATE_CC_STATE_POINTERS or _3DSTATE_BLEND_STATE_POINTERS
 pointing to random data from some previous brw_state_batch().
 That sometimes suppressed depth rendering when those bits
 happened to mean COMPAREFUNC_NEVER.
 This produced flickering shadows for dota2 reborn.
 ---
  src/mesa/drivers/dri/i965/gen6_cc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c 
 b/src/mesa/drivers/dri/i965/gen6_cc.c
 index 2bfa271..2b76e24 100644
 --- a/src/mesa/drivers/dri/i965/gen6_cc.c
 +++ b/src/mesa/drivers/dri/i965/gen6_cc.c
 @@ -51,7 +51,7 @@ gen6_upload_blend_state(struct brw_context *brw)
  * with render target 0, which will reference BLEND_STATE[0] for
  * alpha test enable.
  */
 -   if (nr_draw_buffers == 0 && ctx->Color.AlphaEnabled)
 +   if (nr_draw_buffers == 0)
nr_draw_buffers = 1;

 size = sizeof(*blend) * nr_draw_buffers;
 --
 2.1.0
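
The sizing rule after this fix, as a standalone sketch. The blend_entry struct is only a placeholder for one BLEND_STATE element (its real layout is not reproduced here), and both function names are hypothetical.

```cpp
#include <cstddef>

// Placeholder for one BLEND_STATE element; the real layout differs.
struct blend_entry { unsigned dw[2]; };

// Number of elements to allocate: never zero, because element 0 is still
// consulted for alpha test when there are no color render targets.
unsigned blend_state_count(unsigned nr_draw_buffers)
{
   return nr_draw_buffers == 0 ? 1 : nr_draw_buffers;
}

std::size_t blend_state_size(unsigned nr_draw_buffers)
{
   return sizeof(blend_entry) * blend_state_count(nr_draw_buffers);
}
```

The point of the change is that the count no longer depends on whether alpha test is enabled, so the state pointer always refers to initialized data.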



Re: [Mesa-dev] [PATCH] mesa/prog: relative offsets into constbufs are not constant

2015-07-01 Thread Chris Forbes
Seems fair.

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Thu, Jul 2, 2015 at 10:22 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 The optimization logic relies on being able to read out constbuf values
 from program parameters. However that only works if there's no relative
 addressing involved.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91173
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---
  src/mesa/program/prog_opt_constant_fold.c | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/src/mesa/program/prog_opt_constant_fold.c 
 b/src/mesa/program/prog_opt_constant_fold.c
 index 3811c0d..e2518e6 100644
 --- a/src/mesa/program/prog_opt_constant_fold.c
 +++ b/src/mesa/program/prog_opt_constant_fold.c
 @@ -38,6 +38,8 @@ src_regs_are_constant(const struct prog_instruction *inst, 
 unsigned num_srcs)
     for (i = 0; i < num_srcs; i++) {
        if (inst->SrcReg[i].File != PROGRAM_CONSTANT)
           return false;
 +      if (inst->SrcReg[i].RelAddr)
 +         return false;
 }

 return true;
 --
 2.3.6
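
The check added above, reduced to a self-contained sketch. The types here (reg_file_kind, src_reg_info) are simplified stand-ins invented for the example, not the prog_instruction structures from Mesa.

```cpp
enum reg_file_kind { FILE_TEMPORARY, FILE_CONSTANT };

// Simplified stand-in for a source register description.
struct src_reg_info {
   reg_file_kind file;
   bool rel_addr;   // true if the register is indexed relative to an address
};

// A source can only be folded if it names a constant AND the index is
// known at compile time; a relative offset is resolved at run time.
bool srcs_are_foldable(const src_reg_info *srcs, unsigned num_srcs)
{
   for (unsigned i = 0; i < num_srcs; i++) {
      if (srcs[i].file != FILE_CONSTANT)
         return false;
      if (srcs[i].rel_addr)
         return false;
   }
   return true;
}
```

Without the second test, the folder would read whatever value sits at the base index and treat it as the operand, which is the failure the commit message describes.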



Re: [Mesa-dev] [PATCH 16/16] i965: Remove the brw_context from the visitors

2015-06-23 Thread Chris Forbes
For the series:

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Tue, Jun 23, 2015 at 1:07 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 As of this commit, nothing actually needs the brw_context.
 ---
  src/mesa/drivers/dri/i965/brw_cs.cpp|  6 --
  src/mesa/drivers/dri/i965/brw_fs.cpp| 12 ++--
  src/mesa/drivers/dri/i965/brw_fs.h  |  2 +-
  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp   |  1 -
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp|  4 ++--
  src/mesa/drivers/dri/i965/brw_shader.cpp|  9 +
  src/mesa/drivers/dri/i965/brw_shader.h  |  7 ---
  src/mesa/drivers/dri/i965/brw_vec4.cpp  |  6 --
  src/mesa/drivers/dri/i965/brw_vec4.h|  2 +-
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp   | 14 --
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h |  2 +-
  src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp |  1 -
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp  |  4 ++--
  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp   |  4 ++--
  src/mesa/drivers/dri/i965/brw_vs.h  |  2 +-
  src/mesa/drivers/dri/i965/gen6_gs_visitor.h |  4 ++--
  16 files changed, 43 insertions(+), 37 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp 
 b/src/mesa/drivers/dri/i965/brw_cs.cpp
 index fa8b5c8..4c5082c 100644
 --- a/src/mesa/drivers/dri/i965/brw_cs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
 @@ -94,7 +94,8 @@ brw_cs_emit(struct brw_context *brw,

 /* Now the main event: Visit the shader IR and generate our CS IR for it.
  */
 -   fs_visitor v8(brw, mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, 
 prog,
 +   fs_visitor v8(brw->intelScreen->compiler, brw,
 +                 mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, prog,
                   &cp->Base, 8, st_index);
     if (!v8.run_cs()) {
        fail_msg = v8.fail_msg;
 @@ -103,7 +104,8 @@ brw_cs_emit(struct brw_context *brw,
        prog_data->simd_size = 8;
     }

 -   fs_visitor v16(brw, mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, 
 prog,
 +   fs_visitor v16(brw->intelScreen->compiler, brw,
 +                  mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, prog,
                    &cp->Base, 16, st_index);
     if (likely(!(INTEL_DEBUG & DEBUG_NO16)) &&
         !fail_msg && !v8.simd16_unsupported &&
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 23f60c2..f7f05af 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -677,8 +677,7 @@ fs_visitor::no16(const char *msg)
 } else {
simd16_unsupported = true;

 -      struct brw_compiler *compiler = brw->intelScreen->compiler;
 -      compiler->shader_perf_log(brw,
 +      compiler->shader_perf_log(log_data,
                                  "SIMD16 shader failed to compile: %s", msg);
 }
  }
 @@ -3757,8 +3756,7 @@ fs_visitor::allocate_registers()
           fail("Failure to register allocate.  Reduce number of "
                "live scalar values to avoid this.");
        } else {
 -         struct brw_compiler *compiler = brw->intelScreen->compiler;
 -         compiler->shader_perf_log(brw,
 +         compiler->shader_perf_log(log_data,
                                     "%s shader triggered register spilling.  "
                                     "Try reducing the number of live scalar "
                                     "values to improve performance.\n",
 @@ -3994,7 +3992,8 @@ brw_wm_fs_emit(struct brw_context *brw,

 /* Now the main event: Visit the shader IR and generate our FS IR for it.
  */
 -   fs_visitor v(brw, mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
 +   fs_visitor v(brw->intelScreen->compiler, brw,
 +                mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
                  prog, &fp->Base, 8, st_index8);
     if (!v.run_fs(false /* do_rep_send */)) {
        if (prog) {
 @@ -4009,7 +4008,8 @@ brw_wm_fs_emit(struct brw_context *brw,
     }

     cfg_t *simd16_cfg = NULL;
 -   fs_visitor v2(brw, mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
 +   fs_visitor v2(brw->intelScreen->compiler, brw,
 +                 mem_ctx, MESA_SHADER_FRAGMENT, key, &prog_data->base,
                   prog, &fp->Base, 16, st_index16);
     if (likely(!(INTEL_DEBUG & DEBUG_NO16) || brw->use_rep_send)) {
        if (!v.simd16_unsupported) {
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index e0a8984..243baf6 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -70,7 +70,7 @@ namespace brw {
  class fs_visitor : public backend_shader
  {
  public:
 -   fs_visitor(struct brw_context *brw,
 +   fs_visitor(const struct brw_compiler *compiler, void *log_data,
void *mem_ctx,
gl_shader_stage stage,
const void *key,
 diff --git a/src/mesa/drivers/dri/i965

Re: [Mesa-dev] [PATCH 14/16] i965/vec4: Turn some _mesa_problem calls into asserts

2015-06-22 Thread Chris Forbes
Recent convention has been to use unreachable(str) rather than assert(!str)
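
For reference, a sketch of what the unreachable() spelling buys. The macro below is a local stand-in written for this example (named UNREACHABLE so it cannot clash with anything real) and relies on the GCC/Clang builtin __builtin_unreachable; the helper in the tree may be defined differently. The enum and function are likewise hypothetical.

```cpp
#include <cassert>

// Stand-in for an unreachable(str) helper: asserts in debug builds and
// tells the compiler the path cannot be taken in release builds.
#define UNREACHABLE(str)        \
   do {                         \
      assert(!str);             \
      __builtin_unreachable();  \
   } while (0)

enum src_file { SRC_FILE_INPUT, SRC_FILE_UNIFORM };

int file_to_index(src_file file)
{
   switch (file) {
   case SRC_FILE_INPUT:
      return 0;
   case SRC_FILE_UNIFORM:
      return 1;
   }
   // With a bare assert(!"...") a dummy return would still be needed here.
   UNREACHABLE("bad src register file");
}
```

With this form the fallback returns that follow the asserts in the patch below would no longer be needed, though whether to drop them is a separate call.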

On Tue, Jun 23, 2015 at 1:07 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 ---
  src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 9 +++--
  1 file changed, 3 insertions(+), 6 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
 index 92d1085..dcbd240 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
 @@ -381,8 +381,7 @@ vec4_vs_visitor::emit_program_code()
   break;

default:
 -         _mesa_problem(ctx, "Unsupported opcode %s in vertex program\n",
 -                       _mesa_opcode_string(vpi->Opcode));
 +         assert(!"Unsupported opcode in vertex program");
}

/* Copy the temporary back into the actual destination register. */
 @@ -574,15 +573,13 @@ vec4_vs_visitor::get_vp_src_reg(const prog_src_register 
 src)
   break;

default:
 -         _mesa_problem(ctx, "bad uniform src register file: %s\n",
 -                       _mesa_register_file_name((gl_register_file)src.File));
 +         assert(!"Bad uniform in src register file");
   return src_reg(this, glsl_type::vec4_type);
}
break;

 default:
 -      _mesa_problem(ctx, "bad src register file: %s\n",
 -                    _mesa_register_file_name((gl_register_file)src.File));
 +      assert(!"Bad src register file");
return src_reg(this, glsl_type::vec4_type);
 }

 --
 2.4.3



Re: [Mesa-dev] [PATCH 13/16] i965/vs: Pass the current set of clip planes through run() and run_vs()

2015-06-22 Thread Chris Forbes
It's fairly unpleasant that the clip plane plumbing needs to be so
special in the visitors at all -- but breaking the context dependency
is a win.

- Chris
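
A toy version of the shape this patch moves to, for readers who have not looked at the visitors. Everything here is hypothetical and simplified (clip_plane, mini_visitor and the method are invented names); the only point is that the caller chooses the planes and the visitor never reaches into a context.

```cpp
// Simplified stand-in for a clip plane: four plane equation coefficients.
struct clip_plane { float v[4]; };

// Hypothetical visitor with no context pointer: the planes are an input.
struct mini_visitor {
   // Clip distance is the dot product of the plane with the position.
   float clip_distance(const clip_plane *planes, unsigned plane,
                       const float pos[4]) const
   {
      const float *p = planes[plane].v;
      return p[0] * pos[0] + p[1] * pos[1] + p[2] * pos[2] + p[3] * pos[3];
   }
};
```

Because the planes arrive as a parameter, the same visitor code can be driven from a test or a different front end without constructing a GL context.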

On Tue, Jun 23, 2015 at 1:07 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 Previously, these were pulled out of the GL context conditionally based on
 whether we were running ff/ARB or a GLSL program.  Now, we just pass them
 in so that the visitor doesn't have to grab them itself.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp  |  4 ++--
  src/mesa/drivers/dri/i965/brw_fs.h|  8 
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 11 +--
  src/mesa/drivers/dri/i965/brw_vec4.cpp|  8 
  src/mesa/drivers/dri/i965/brw_vec4.h  |  4 ++--
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  4 ++--
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  4 +---
  7 files changed, 20 insertions(+), 23 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index bf04e26..23f60c2 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3791,7 +3791,7 @@ fs_visitor::allocate_registers()
  }

  bool
 -fs_visitor::run_vs()
 +fs_visitor::run_vs(gl_clip_plane *clip_planes)
  {
 assert(stage == MESA_SHADER_VERTEX);

 @@ -3806,7 +3806,7 @@ fs_visitor::run_vs()
 if (failed)
return false;

 -   emit_urb_writes();
 +   emit_urb_writes(clip_planes);

     if (shader_time_index >= 0)
emit_shader_time_end();
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index 4db5a91..e0a8984 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -84,8 +84,8 @@ public:

 fs_reg vgrf(const glsl_type *const type);
 void import_uniforms(fs_visitor *v);
 -   void setup_uniform_clipplane_values();
 -   void compute_clip_distance();
 +   void setup_uniform_clipplane_values(gl_clip_plane *clip_planes);
 +   void compute_clip_distance(gl_clip_plane *clip_planes);

 uint32_t gather_channel(int orig_chan, uint32_t sampler);
 void swizzle_result(ir_texture_opcode op, int dest_components,
 @@ -104,7 +104,7 @@ public:
     void DEP_RESOLVE_MOV(const brw::fs_builder &bld, int grf);

 bool run_fs(bool do_rep_send);
 -   bool run_vs();
 +   bool run_vs(gl_clip_plane *clip_planes);
 bool run_cs();
 void optimize();
 void allocate_registers();
 @@ -271,7 +271,7 @@ public:
   fs_reg src0_alpha, unsigned components,
   unsigned exec_size, bool use_2nd_half = 
 false);
 void emit_fb_writes();
 -   void emit_urb_writes();
 +   void emit_urb_writes(gl_clip_plane *clip_planes);
 void emit_cs_terminate();

 void emit_barrier();
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 9ce8491..395394c 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1715,9 +1715,8 @@ fs_visitor::emit_fb_writes()
  }

  void
 -fs_visitor::setup_uniform_clipplane_values()
 +fs_visitor::setup_uniform_clipplane_values(gl_clip_plane *clip_planes)
  {
 -   gl_clip_plane *clip_planes = brw_select_clip_planes(ctx);
 const struct brw_vue_prog_key *key =
        (const struct brw_vue_prog_key *) this->key;

 @@ -1731,7 +1730,7 @@ fs_visitor::setup_uniform_clipplane_values()
 }
  }

 -void fs_visitor::compute_clip_distance()
 +void fs_visitor::compute_clip_distance(gl_clip_plane *clip_planes)
  {
 struct brw_vue_prog_data *vue_prog_data =
(struct brw_vue_prog_data *) prog_data;
 @@ -1760,7 +1759,7 @@ void fs_visitor::compute_clip_distance()
 if (outputs[clip_vertex].file == BAD_FILE)
return;

 -   setup_uniform_clipplane_values();
 +   setup_uniform_clipplane_values(clip_planes);

     const fs_builder abld = bld.annotate("user clip distances");

 @@ -1781,7 +1780,7 @@ void fs_visitor::compute_clip_distance()
  }

  void
 -fs_visitor::emit_urb_writes()
 +fs_visitor::emit_urb_writes(gl_clip_plane *clip_planes)
  {
 int slot, urb_offset, length;
 struct brw_vs_prog_data *vs_prog_data =
 @@ -1796,7 +1795,7 @@ fs_visitor::emit_urb_writes()

 /* Lower legacy ff and ClipVertex clipping to clip distances */
     if (key->base.userclip_active && !prog->UsesClipDistanceOut)
 -  compute_clip_distance();
 +  compute_clip_distance(clip_planes);

 /* If we don't have any valid slots to write, just do a minimal urb write
  * send to terminate the shader. */
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 index 093802c..9c45034 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 @@ -1706,7 +1706,7 @@ vec4_visitor::emit_shader_time_write(int 
 shader_time_subindex, src_reg value)
  }

  bool
 -vec4_visitor::run()
 

Re: [Mesa-dev] [PATCH] i965: Add missing braces around if-statement.

2015-06-18 Thread Chris Forbes
Oh, how silly :)

Reviewed-by: Chris Forbes chr...@ijw.co.nz

- Chris
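
For anyone reading the archive without the diff in front of them, the bug has this shape. The two functions are reduced illustrations with invented names and boolean parameters standing in for the real format and hardware checks.

```cpp
// How the compiler reads the brace-less version: the outer `if` guards
// only the inner `if`, so the return is unconditional and every format
// is reported as incompatible.
bool compatible_buggy(bool is_integer_format, bool gen8_or_later)
{
   if (is_integer_format)
      if (gen8_or_later) { /* perf_debug(...) would go here */ }
   return false;
}

// With braces, only integer formats are rejected early.
bool compatible_fixed(bool is_integer_format, bool gen8_or_later)
{
   if (is_integer_format) {
      if (gen8_or_later) { /* perf_debug(...) would go here */ }
      return false;
   }
   return true;   // the real function goes on to check the clear color
}
```

That unconditional `return false` is why the original mistake showed up as a performance problem rather than a rendering bug: fast clears were simply never taken.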

On Fri, Jun 19, 2015 at 11:19 AM, Matt Turner matts...@gmail.com wrote:
 Fixes a performance problem caused by commit b639ed2f.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895
 ---
  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
 b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
 index c0c8dfa..49f2e3e 100644
 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
 +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
 @@ -339,12 +339,13 @@ is_color_fast_clear_compatible(struct brw_context *brw,
 mesa_format format,
 const union gl_color_union *color)
  {
 -   if (_mesa_is_format_integer_color(format))
 +   if (_mesa_is_format_integer_color(format)) {
        if (brw->gen >= 8) {
           perf_debug("Integer fast clear not enabled for (%s)",
                      _mesa_get_format_name(format));
        }
        return false;
 +   }

     for (int i = 0; i < 4; i++) {
        if (color->f[i] != 0.0 && color->f[i] != 1.0 &&
 --
 2.3.6



[Mesa-dev] [PATCH] i965: Set max texture buffer size to hardware limit

2015-06-02 Thread Chris Forbes
Previously we were leaving this at the default of 64K, which meets the
spec but is too small for some real uses. The hardware can handle up to
128M.

User was complaining about this on freenode ##OpenGL today.

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/brw_context.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 673529a..b2b119a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -545,6 +545,7 @@ brw_initialize_context_constants(struct brw_context *brw)
 */
    ctx->Const.UniformBufferOffsetAlignment = 16;
    ctx->Const.TextureBufferOffsetAlignment = 16;
+   ctx->Const.MaxTextureBufferSize = 128 * 1024 * 1024;
 
    if (brw->gen >= 6) {
       ctx->Const.MaxVarying = 32;
-- 
2.4.1



Re: [Mesa-dev] [PATCH] i965/fs: Use UW-typed immediate in multiply inst.

2015-06-02 Thread Chris Forbes
After discussion on IRC, this seems reasonable to me. Unfortunate that
CHV is fussy.

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Wed, Jun 3, 2015 at 1:24 PM, Matt Turner matts...@gmail.com wrote:
 Some hardware reads only the low 16-bits even if the type is UD, but
 other hardware like Cherryview can't handle this.

 Fixes spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple on
 Cherryview.
 ---
  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 index 40a3db3..ff05b2a 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 @@ -788,7 +788,7 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg 
 dst, struct brw_reg src
brw_set_default_access_mode(p, BRW_ALIGN_1);

/* addr = ((sampler * 0x101) + base_binding_table_index) & 0xfff */
 -  brw_MUL(p, addr, sampler_reg, brw_imm_ud(0x101));
 +  brw_MUL(p, addr, sampler_reg, brw_imm_uw(0x101));
if (base_binding_table_index)
   brw_ADD(p, addr, addr, brw_imm_ud(base_binding_table_index));
brw_AND(p, addr, addr, brw_imm_ud(0xfff));
 --
 2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Don't use a temporary when generating an indirect sample

2015-05-29 Thread Chris Forbes
This thing has been trouble since I wrote it. Nice to see it go.

Both patches are:

Reviewed-by: Chris Forbes chr...@ijw.co.nz
On May 30, 2015 6:28 AM, Matt Turner matts...@gmail.com wrote:

 On Fri, May 29, 2015 at 6:53 AM, Neil Roberts n...@linux.intel.com
 wrote:
  Previously when generating the send instruction for a sample
  instruction with an indirect sampler it would use the destination
  register as a temporary store. This breaks when used in combination
  with the opt_sampler_eot optimisation because that forces the
  destination to be null. This patch fixes that by avoiding the temp
  register altogether.
 
  The reason the temporary register was needed was because it was trying
  to ensure the binding table index doesn't overflow a byte by and'ing
  it with 0xff. The result is then or'd with sampler_index<<8. This patch
  instead just and's the whole thing by 0xfff. This will ensure that a
  bogus sampler index won't overflow into the rest of the message
  descriptor but unlike the previous code it won't ensure that the
  binding table index doesn't overflow into the sampler index. It
  doesn't seem like that should matter very much though because if the
  shader is generating a bogus sampler index then it's going to just get
  garbage out either way.
 
  Instead of doing sampler_index<<8|(sampler_index+base_table_index) the
  new code avoids one operation by doing
  sampler_index*0x101+base_table_index which should be equivalent.
  However if we wanted to avoid the multiply for some reason we could do
  this by adding an extra or instruction still without needing the
  temporary register.
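
A standalone sketch of the two address computations described above, to check where they agree. The function names are made up for illustration and the code uses plain integers rather than the brw_* emitters. The two forms match as long as the sampler index fits in 4 bits and sampler + base stays below 256; outside that range they differ, which is the "garbage out either way" case.

```c
#include <assert.h>
#include <stdint.h>

/* Old sequence: mask the binding table index to a byte, then OR in the
 * sampler index shifted into bits 8..11. Needs a temporary. */
static uint32_t addr_shift_or(uint32_t sampler, uint32_t base)
{
   uint32_t addr = (sampler + base) & 0xff;
   uint32_t temp = (sampler << 8) & 0xf00;
   return addr | temp;
}

/* New sequence: one multiply, one add, one mask, no temporary. */
static uint32_t addr_mul_add(uint32_t sampler, uint32_t base)
{
   return (sampler * 0x101 + base) & 0xfff;
}

/* Asserts that both forms agree for every in-range input. */
static void check_forms_agree(void)
{
   for (uint32_t sampler = 0; sampler < 16; sampler++)
      for (uint32_t base = 0; sampler + base < 256; base++)
         assert(addr_shift_or(sampler, base) == addr_mul_add(sampler, base));
}
```

With sampler = 15 and base = 250 the sum overflows a byte and the two forms diverge, so the equivalence only holds for well-formed indices.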
 
  This fixes a number of Piglit tests on Skylake that were using
  indirect samplers such as:
 
   spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple
  ---
   src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 17 -
   src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 16 
   2 files changed, 8 insertions(+), 25 deletions(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  index 0be0f86..ea46b1a 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
  @@ -779,27 +779,18 @@ fs_generator::generate_tex(fs_inst *inst, struct
 brw_reg dst, struct brw_reg src
 brw_mark_surface_used(prog_data, sampler +
 base_binding_table_index);
  } else {
 /* Non-const sampler index */
  -  /* Note: this clobbers `dst` as a temporary before emitting the
 send */
 
 struct brw_reg addr = vec1(retype(brw_address_reg(0),
 BRW_REGISTER_TYPE_UD));
  -  struct brw_reg temp = vec1(retype(dst, BRW_REGISTER_TYPE_UD));
  -
 struct brw_reg sampler_reg = vec1(retype(sampler_index,
 BRW_REGISTER_TYPE_UD));
 
 brw_push_insn_state(p);
 brw_set_default_mask_control(p, BRW_MASK_DISABLE);
 brw_set_default_access_mode(p, BRW_ALIGN_1);
 
  -  /* Some care required: `sampler` and `temp` may alias:
  -   *addr = sampler & 0xff
  -   *temp = (sampler << 8) & 0xf00
  -   *addr = addr | temp
  -   */
  -  brw_ADD(p, addr, sampler_reg,
 brw_imm_ud(base_binding_table_index));
  -  brw_SHL(p, temp, sampler_reg, brw_imm_ud(8u));
  -  brw_AND(p, temp, temp, brw_imm_ud(0x0f00));
  -  brw_AND(p, addr, addr, brw_imm_ud(0x0ff));
  -  brw_OR(p, addr, addr, temp);
  +  /* addr = ((sampler * 0x101) + base_binding_table_index) & 0xfff
 */
  +  brw_MUL(p, addr, sampler_reg, brw_imm_ud(0x101));
  +  brw_ADD(p, addr, addr, brw_imm_ud(base_binding_table_index));
  +  brw_AND(p, addr, addr, brw_imm_ud(0xfff));
 
 brw_pop_insn_state(p);
 
  diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
 b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
  index 20d096c..1d3f5ed 100644
  --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
  @@ -398,10 +398,8 @@ vec4_generator::generate_tex(vec4_instruction *inst,
 brw_mark_surface_used(&prog_data->base, sampler +
 base_binding_table_index);
  } else {
 /* Non-constant sampler index. */
  -  /* Note: this clobbers `dst` as a temporary before emitting the
 send */
 
 struct brw_reg addr = vec1(retype(brw_address_reg(0),
 BRW_REGISTER_TYPE_UD));
  -  struct brw_reg temp = vec1(retype(dst, BRW_REGISTER_TYPE_UD));
 

 I'd delete the blank line here as well to match the fs code.

 Really nice solution. I'd been trying to figure out the best way to
 get an additional temporary in here, but this is clearly better.

 Reviewed-by: Matt Turner matts...@gmail.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http

Re: [Mesa-dev] [PATCH] docs: Mark ARB_shader_storage_buffer_object as in progress

2015-05-25 Thread Chris Forbes
Hardly needed, but:

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Mon, May 25, 2015 at 7:41 PM, Iago Toral Quiroga ito...@igalia.com wrote:
 ---
  docs/GL3.txt | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index 9d56ee5..44a824b 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -164,7 +164,7 @@ GL 4.3, GLSL 4.30:
GL_ARB_program_interface_query   DONE (all drivers)
GL_ARB_robust_buffer_access_behavior not started
GL_ARB_shader_image_size in progress (Martin 
 Peres)
 -  GL_ARB_shader_storage_buffer_object  not started
 +  GL_ARB_shader_storage_buffer_object  in progress (Iago 
 Toral, Samuel Iglesias)
GL_ARB_stencil_texturing DONE (i965/gen8+, 
 nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_buffer_range  DONE (nv50, nvc0, 
 i965, r600, radeonsi, llvmpipe)
GL_ARB_texture_query_levels  DONE (all drivers 
 that support GLSL 1.30)
 @@ -221,7 +221,7 @@ GLES3.1, GLSL ES 3.1
GL_ARB_shader_atomic_countersDONE (i965)
GL_ARB_shader_image_load_store   in progress (curro)
GL_ARB_shader_image_size in progress (Martin 
 Peres)
 -  GL_ARB_shader_storage_buffer_object  not started
 +  GL_ARB_shader_storage_buffer_object  in progress (Iago 
 Toral, Samuel Iglesias)
GL_ARB_shading_language_packing  DONE (all drivers)
GL_ARB_separate_shader_objects   DONE (all drivers)
GL_ARB_stencil_texturing DONE (i965/gen8+, 
 nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
 --
 1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Remove _NEW_MULTISAMPLE dirty bit from 3DSTATE_PS_EXTRA.

2015-05-25 Thread Chris Forbes
Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Mon, May 25, 2015 at 7:38 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 BRW_NEW_NUM_SAMPLES is sufficient.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/gen8_ps_state.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
 b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 index 85ad3b6..6b9489b 100644
 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 @@ -72,7 +72,7 @@ upload_ps_extra(struct brw_context *brw)
brw_fragment_program_const(brw->fragment_program);
 /* BRW_NEW_FS_PROG_DATA */
 const struct brw_wm_prog_data *prog_data = brw->wm.prog_data;
 -   /* BRW_NEW_NUM_SAMPLES | _NEW_MULTISAMPLE */
 +   /* BRW_NEW_NUM_SAMPLES */
 const bool multisampled_fbo = brw->num_samples > 1;

 gen8_upload_ps_extra(brw, &fp->program, prog_data, multisampled_fbo);
 @@ -80,7 +80,7 @@ upload_ps_extra(struct brw_context *brw)

  const struct brw_tracked_state gen8_ps_extra = {
 .dirty = {
 -  .mesa  = _NEW_MULTISAMPLE,
 +  .mesa  = 0,
.brw   = BRW_NEW_CONTEXT |
 BRW_NEW_FRAGMENT_PROGRAM |
 BRW_NEW_FS_PROG_DATA |
 --
 2.4.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs: Mark ARB_shader_storage_buffer_object as in progress

2015-05-25 Thread Chris Forbes
Oh, I meant I would have just pushed something like this :)

- Chris

On Mon, May 25, 2015 at 8:26 PM, Iago Toral ito...@igalia.com wrote:
 On Mon, 2015-05-25 at 20:15 +1200, Chris Forbes wrote:
 Hardly needed, but:

 I know, I should've sent this patch when we started working on this... I
 got some comments asking why this is marked as not started if there are
 patches in the mailing list, so I guess this will help make the current
 state clear while the patches don't land.

 Thanks,
 Iago

 Reviewed-by: Chris Forbes chr...@ijw.co.nz

 On Mon, May 25, 2015 at 7:41 PM, Iago Toral Quiroga ito...@igalia.com 
 wrote:
  ---
   docs/GL3.txt | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)
 
  diff --git a/docs/GL3.txt b/docs/GL3.txt
  index 9d56ee5..44a824b 100644
  --- a/docs/GL3.txt
  +++ b/docs/GL3.txt
  @@ -164,7 +164,7 @@ GL 4.3, GLSL 4.30:
 GL_ARB_program_interface_query   DONE (all drivers)
 GL_ARB_robust_buffer_access_behavior not started
 GL_ARB_shader_image_size in progress 
  (Martin Peres)
  -  GL_ARB_shader_storage_buffer_object  not started
  +  GL_ARB_shader_storage_buffer_object  in progress (Iago 
  Toral, Samuel Iglesias)
 GL_ARB_stencil_texturing DONE (i965/gen8+, 
  nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
 GL_ARB_texture_buffer_range  DONE (nv50, nvc0, 
  i965, r600, radeonsi, llvmpipe)
 GL_ARB_texture_query_levels  DONE (all drivers 
  that support GLSL 1.30)
  @@ -221,7 +221,7 @@ GLES3.1, GLSL ES 3.1
 GL_ARB_shader_atomic_countersDONE (i965)
 GL_ARB_shader_image_load_store   in progress (curro)
 GL_ARB_shader_image_size in progress 
  (Martin Peres)
  -  GL_ARB_shader_storage_buffer_object  not started
  +  GL_ARB_shader_storage_buffer_object  in progress (Iago 
  Toral, Samuel Iglesias)
 GL_ARB_shading_language_packing  DONE (all drivers)
 GL_ARB_separate_shader_objects   DONE (all drivers)
 GL_ARB_stencil_texturing DONE (i965/gen8+, 
  nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
  --
  1.9.1
 



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Disable opt_sampler_eot for textureGather

2015-05-08 Thread Chris Forbes
I don't have CHV or SKL hw or docs to try and confirm this, but this
does what it claims to.

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Sat, May 9, 2015 at 5:10 AM, Neil Roberts n...@linux.intel.com wrote:
 The opt_sampler_eot optimisation seems to break when the last
 instruction is SHADER_OPCODE_TG4. A bunch of Piglit tests end up doing
 this so it causes a lot of regressions. I can't find any documentation
 or known workarounds to indicate that this is expected behaviour, but
 considering that this is probably a pretty unlikely situation in a
 real use case we might as well disable it in order to avoid the
 regressions. In total this fixes 451 tests.

 Reviewed-by: Ben Widawsky b...@bwidawsk.net
 ---

 See here for some more discussion of this:

 http://lists.freedesktop.org/archives/mesa-dev/2015-May/083640.html

 As far as I can tell the Jenkins run mentioned in that email doesn't
 seem to have any tests on Cherryview or Skylake so that probably
 explains why it didn't pick up the regression.

  src/mesa/drivers/dri/i965/brw_fs.cpp | 10 ++
  1 file changed, 10 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 8dd680e..e9528e0 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -2655,6 +2655,16 @@ fs_visitor::opt_sampler_eot()
 if (unlikely(tex_inst->is_head_sentinel()) || !tex_inst->is_tex())
return false;

 +   /* This optimisation doesn't seem to work for textureGather for some
 +* reason. I can't find any documentation or known workarounds to indicate
 +* that this is expected, but considering that it is probably pretty
 +* unlikely that a shader would directly write out the results from
 +* textureGather we might as well just disable it.
 +*/
 +   if (tex_inst->opcode == SHADER_OPCODE_TG4 ||
 +   tex_inst->opcode == SHADER_OPCODE_TG4_OFFSET)
 +  return false;
 +
 /* If there's no header present, we need to munge the LOAD_PAYLOAD as 
 well.
  * It's very likely to be the previous instruction.
  */
 --
 1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/23] glsl/types: add new subroutine type

2015-05-08 Thread Chris Forbes
Patches 11-13 are:

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Fri, Apr 24, 2015 at 1:42 PM, Dave Airlie airl...@gmail.com wrote:
 From: Dave Airlie airl...@redhat.com

 This type will be used to store the name of subroutine types

 as in subroutine void myfunc(void);
 will store myfunc into a subroutine type.

 This is required so the parser can identify a subroutine
 type in a uniform declaration as a valid type, and also for
 looking up the type later.

 Also add contains_subroutine method.
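
For readers skimming the patch, the contains_subroutine check recurses through arrays and struct fields until it reaches a leaf type. A minimal C sketch of that recursion follows; the struct type layout and enum names here are hypothetical stand-ins and not the real glsl_type class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type description for illustration only. */
enum base_type { TYPE_FLOAT, TYPE_ARRAY, TYPE_STRUCT, TYPE_SUBROUTINE };

struct type {
   enum base_type base;
   const struct type *element;        /* used when base == TYPE_ARRAY */
   const struct type *const *fields;  /* used when base == TYPE_STRUCT */
   unsigned length;                   /* array length or field count */
};

/* True if the type is, or transitively contains, a subroutine type. */
static bool contains_subroutine(const struct type *t)
{
   if (t->base == TYPE_ARRAY) {
      return contains_subroutine(t->element);
   } else if (t->base == TYPE_STRUCT) {
      for (unsigned i = 0; i < t->length; i++) {
         if (contains_subroutine(t->fields[i]))
            return true;
      }
      return false;
   } else {
      return t->base == TYPE_SUBROUTINE;
   }
}
```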

 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/glsl/glsl_types.cpp| 63 
 ++
  src/glsl/glsl_types.h  | 19 ++
  src/glsl/ir_clone.cpp  |  1 +
  src/glsl/link_uniform_initializers.cpp |  1 +
  4 files changed, 84 insertions(+)

 diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
 index 9c9b7ef..37b5c62 100644
 --- a/src/glsl/glsl_types.cpp
 +++ b/src/glsl/glsl_types.cpp
 @@ -32,6 +32,7 @@ mtx_t glsl_type::mutex = _MTX_INITIALIZER_NP;
  hash_table *glsl_type::array_types = NULL;
  hash_table *glsl_type::record_types = NULL;
  hash_table *glsl_type::interface_types = NULL;
 +hash_table *glsl_type::subroutine_types = NULL;
  void *glsl_type::mem_ctx = NULL;

  void
 @@ -159,6 +160,22 @@ glsl_type::glsl_type(const glsl_struct_field *fields, 
 unsigned num_fields,
 mtx_unlock(&glsl_type::mutex);
  }

 +glsl_type::glsl_type(const char *subroutine_name) :
 +   gl_type(0),
 +   base_type(GLSL_TYPE_SUBROUTINE),
 +   sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
 +   sampler_type(0), interface_packing(0),
 +   vector_elements(0), matrix_columns(0),
 +   length(0)
 +{
 +   mtx_lock(&glsl_type::mutex);
 +
 +   init_ralloc_type_ctx();
 +   assert(subroutine_name != NULL);
 +   this->name = ralloc_strdup(this->mem_ctx, subroutine_name);
 +   this->vector_elements = 1;
 +   mtx_unlock(&glsl_type::mutex);
 +}

  bool
  glsl_type::contains_sampler() const
 @@ -229,6 +246,22 @@ glsl_type::contains_opaque() const {
 }
  }

 +bool
 +glsl_type::contains_subroutine() const
 +{
 +   if (this->is_array()) {
 +  return this->fields.array->contains_subroutine();
 +   } else if (this->is_record()) {
 +  for (unsigned int i = 0; i < this->length; i++) {
 +if (this->fields.structure[i].type->contains_subroutine())
 +   return true;
 +  }
 +  return false;
 +   } else {
 +  return this->is_subroutine();
 +   }
 +}
 +
  gl_texture_index
  glsl_type::sampler_index() const
  {
 @@ -826,6 +859,34 @@ glsl_type::get_interface_instance(const 
 glsl_struct_field *fields,
 return t;
  }

 +const glsl_type *
 +glsl_type::get_subroutine_instance(const char *subroutine_name)
 +{
 +   const glsl_type key(subroutine_name);
 +
 +   mtx_lock(&glsl_type::mutex);
 +
 +   if (subroutine_types == NULL) {
 +  subroutine_types = hash_table_ctor(64, record_key_hash, record_key_compare);
 +   }
 +
 +   const glsl_type *t = (glsl_type *) hash_table_find(subroutine_types, & key);
 +   if (t == NULL) {
 +  mtx_unlock(&glsl_type::mutex);
 +  t = new glsl_type(subroutine_name);
 +  mtx_lock(&glsl_type::mutex);
 +
 +  hash_table_insert(subroutine_types, (void *) t, t);
 +   }
 +
 +   assert(t->base_type == GLSL_TYPE_SUBROUTINE);
 +   assert(strcmp(t->name, subroutine_name) == 0);
 +
 +   mtx_unlock(&glsl_type::mutex);
 +
 +   return t;
 +}
 +

  const glsl_type *
  glsl_type::get_mul_type(const glsl_type *type_a, const glsl_type *type_b)
 @@ -958,6 +1019,7 @@ glsl_type::component_slots() const
 case GLSL_TYPE_SAMPLER:
 case GLSL_TYPE_ATOMIC_UINT:
 case GLSL_TYPE_VOID:
 +   case GLSL_TYPE_SUBROUTINE:
 case GLSL_TYPE_ERROR:
break;
 }
 @@ -1330,6 +1392,7 @@ glsl_type::count_attribute_slots() const
 case GLSL_TYPE_IMAGE:
 case GLSL_TYPE_ATOMIC_UINT:
 case GLSL_TYPE_VOID:
 +   case GLSL_TYPE_SUBROUTINE:
 case GLSL_TYPE_ERROR:
break;
 }
 diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h
 index d383dd5..078adaf 100644
 --- a/src/glsl/glsl_types.h
 +++ b/src/glsl/glsl_types.h
 @@ -59,6 +59,7 @@ enum glsl_base_type {
 GLSL_TYPE_INTERFACE,
 GLSL_TYPE_ARRAY,
 GLSL_TYPE_VOID,
 +   GLSL_TYPE_SUBROUTINE,
 GLSL_TYPE_ERROR
  };

 @@ -276,6 +277,11 @@ struct glsl_type {
   const char *block_name);

 /**
 +* Get the instance of an subroutine type
 +*/
 +   static const glsl_type *get_subroutine_instance(const char 
 *subroutine_name);
 +
 +   /**
  * Get the type resulting from a multiplication of \p type_a * \p type_b
  */
 static const glsl_type *get_mul_type(const glsl_type *type_a,
 @@ -526,6 +532,13 @@ struct glsl_type {
 /**
  * Query if a type is unnamed/anonymous (named by the parser)
  */
 +
 +   bool is_subroutine() const
 +   {
 +  return base_type == GLSL_TYPE_SUBROUTINE;
 +   }
 +   bool contains_subroutine() const;
 +
 bool

[Mesa-dev] [PATCH 2/4] i965/gen6: Upload all the clip viewports

2015-05-06 Thread Chris Forbes
Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/gen6_viewport_state.c | 40 +
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_viewport_state.c 
b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
index 0c63283..95d204f 100644
--- a/src/mesa/drivers/dri/i965/gen6_viewport_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
@@ -42,27 +42,29 @@ gen6_upload_clip_vp(struct brw_context *brw)
struct brw_clipper_viewport *vp;
 
vp = brw_state_batch(brw, AUB_TRACE_CLIP_VP_STATE,
-   sizeof(*vp), 32, &brw->clip.vp_offset);
+sizeof(*vp) * ctx->Const.MaxViewports, 32, &brw->clip.vp_offset);
 
-   /* According to the Vertex X,Y Clamping and Quantization section of the
-* Strips and Fans documentation, objects must not have a screen-space
-* extents of over 8192 pixels, or they may be mis-rasterized.  The maximum
-* screen space coordinates of a small object may larger, but we have no
-* way to enforce the object size other than through clipping.
-*
-* If you're surprised that we set clip to -gbx to +gbx and it seems like
-* we'll end up with 16384 wide, note that for a 8192-wide render target,
-* we'll end up with a normal (-1, 1) clip volume that just covers the
-* drawable.
-*/
-   const float maximum_post_clamp_delta = 8192;
-   float gbx = maximum_post_clamp_delta / ctx->ViewportArray[0].Width;
-   float gby = maximum_post_clamp_delta / ctx->ViewportArray[0].Height;
+   for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
+  /* According to the Vertex X,Y Clamping and Quantization section of the
+   * Strips and Fans documentation, objects must not have a screen-space
+   * extents of over 8192 pixels, or they may be mis-rasterized.  The 
maximum
+   * screen space coordinates of a small object may larger, but we have no
+   * way to enforce the object size other than through clipping.
+   *
+   * If you're surprised that we set clip to -gbx to +gbx and it seems like
+   * we'll end up with 16384 wide, note that for a 8192-wide render target,
+   * we'll end up with a normal (-1, 1) clip volume that just covers the
+   * drawable.
+   */
+  const float maximum_post_clamp_delta = 8192;
+  float gbx = maximum_post_clamp_delta / ctx->ViewportArray[i].Width;
+  float gby = maximum_post_clamp_delta / ctx->ViewportArray[i].Height;
 
-   vp->xmin = -gbx;
-   vp->xmax = gbx;
-   vp->ymin = -gby;
-   vp->ymax = gby;
+  vp[i].xmin = -gbx;
+  vp[i].xmax = gbx;
+  vp[i].ymin = -gby;
+  vp[i].ymax = gby;
+   }
 
brw->ctx.NewDriverState |= BRW_NEW_CLIP_VP;
 }
-- 
2.3.7
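
For reference, the per-viewport guardband computation in this patch can be tried outside the driver. This is only a sketch: struct clip_viewport and fill_clip_viewports are illustrative names, and the widths and heights are passed in as plain arrays instead of coming from ctx->ViewportArray.

```c
#include <assert.h>

/* Illustrative stand-in for the driver's clipper viewport entry. */
struct clip_viewport {
   float xmin, xmax, ymin, ymax;
};

/* Fill one guardband entry per viewport. Widths and heights must be
 * nonzero; objects may extend at most 8192 pixels in screen space. */
static void fill_clip_viewports(struct clip_viewport *vp,
                                const float *widths, const float *heights,
                                unsigned count)
{
   const float maximum_post_clamp_delta = 8192;

   for (unsigned i = 0; i < count; i++) {
      float gbx = maximum_post_clamp_delta / widths[i];
      float gby = maximum_post_clamp_delta / heights[i];

      vp[i].xmin = -gbx;
      vp[i].xmax = gbx;
      vp[i].ymin = -gby;
      vp[i].ymax = gby;
   }
}
```

An 8192-wide viewport gets the normal (-1, 1) clip volume, while a 1024-wide one gets (-8, 8).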

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] i965/gen6: Upload all the SF viewports

2015-05-06 Thread Chris Forbes
Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/brw_structs.h |  2 ++
 src/mesa/drivers/dri/i965/gen6_viewport_state.c | 29 +++--
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_structs.h 
b/src/mesa/drivers/dri/i965/brw_structs.h
index 7c97a95..55338c0 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -639,6 +639,8 @@ struct gen6_sf_viewport {
float m30;
float m31;
float m32;
+
+   unsigned pad0[2];
 };
 
 struct gen7_sf_clip_viewport {
diff --git a/src/mesa/drivers/dri/i965/gen6_viewport_state.c 
b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
index 95d204f..2fb0182 100644
--- a/src/mesa/drivers/dri/i965/gen6_viewport_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
@@ -81,14 +81,14 @@ static void
 gen6_upload_sf_vp(struct brw_context *brw)
 {
struct gl_context *ctx = &brw->ctx;
-   struct brw_sf_viewport *sfv;
+   struct gen6_sf_viewport *sfv;
GLfloat y_scale, y_bias;
-   double scale[3], translate[3];
const bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
 
sfv = brw_state_batch(brw, AUB_TRACE_SF_VP_STATE,
-sizeof(*sfv), 32, &brw->sf.vp_offset);
-   memset(sfv, 0, sizeof(*sfv));
+ sizeof(*sfv) * ctx->Const.MaxViewports,
+ 32, &brw->sf.vp_offset);
+   memset(sfv, 0, sizeof(*sfv) * ctx->Const.MaxViewports);
 
/* _NEW_BUFFERS */
if (render_to_fbo) {
@@ -99,14 +99,19 @@ gen6_upload_sf_vp(struct brw_context *brw)
   y_bias = ctx->DrawBuffer->Height;
}
 
-   /* _NEW_VIEWPORT */
-   _mesa_get_viewport_xform(ctx, 0, scale, translate);
-   sfv->viewport.m00 = scale[0];
-   sfv->viewport.m11 = scale[1] * y_scale;
-   sfv->viewport.m22 = scale[2];
-   sfv->viewport.m30 = translate[0];
-   sfv->viewport.m31 = translate[1] * y_scale + y_bias;
-   sfv->viewport.m32 = translate[2];
+   for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
+  double scale[3], translate[3];
+
+  /* _NEW_VIEWPORT */
+  _mesa_get_viewport_xform(ctx, i, scale, translate);
+  sfv[i].m00 = scale[0];
+  sfv[i].m11 = scale[1] * y_scale;
+  sfv[i].m22 = scale[2];
+  sfv[i].m30 = translate[0];
+  sfv[i].m31 = translate[1] * y_scale + y_bias;
+  sfv[i].m32 = translate[2];
+
+   }
 
brw->ctx.NewDriverState |= BRW_NEW_SF_VP;
 }
-- 
2.3.7
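
A rough standalone sketch of how each SF viewport entry is filled. Several things here are assumptions not shown in the patch: that the viewport transform yields scale = half the size and translate = origin plus half the size (with depth mapped from the near/far range), and that y_scale/y_bias are 1/0 for user FBOs and -1/height for window-system framebuffers. All type and function names below are illustrative.

```c
#include <assert.h>

/* Illustrative stand-ins, not the driver's structures. */
struct sf_viewport {
   float m00, m11, m22, m30, m31, m32;
};

struct viewport {
   float x, y, width, height, near_z, far_z;
};

/* Build one SF viewport matrix, flipping Y for window-system
 * framebuffers (assumed convention, see the note above). */
static struct sf_viewport make_sf_viewport(const struct viewport *v,
                                           int render_to_fbo,
                                           float fb_height)
{
   const float y_scale = render_to_fbo ? 1.0f : -1.0f;
   const float y_bias = render_to_fbo ? 0.0f : fb_height;

   /* Assumed viewport transform: half extents and center. */
   const float scale[3] = { v->width / 2, v->height / 2,
                            (v->far_z - v->near_z) / 2 };
   const float translate[3] = { v->x + v->width / 2, v->y + v->height / 2,
                                (v->far_z + v->near_z) / 2 };

   struct sf_viewport sfv;
   sfv.m00 = scale[0];
   sfv.m11 = scale[1] * y_scale;
   sfv.m22 = scale[2];
   sfv.m30 = translate[0];
   sfv.m31 = translate[1] * y_scale + y_bias;
   sfv.m32 = translate[2];
   return sfv;
}
```

Calling this once per index in a loop mirrors what the patch does for all MaxViewports entries.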

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] i965/gen6: setup limits for ARB_viewport_array

2015-05-06 Thread Chris Forbes
Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/brw_context.c | 4 ++--
 src/mesa/drivers/dri/i965/brw_defines.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 6c00f6c..fd7420a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -598,8 +598,8 @@ brw_initialize_context_constants(struct brw_context *brw)
ctx->Const.ShaderCompilerOptions[MESA_SHADER_COMPUTE].NirOptions = 
nir_options;
 
/* ARB_viewport_array */
-   if (brw->gen >= 7 && ctx->API == API_OPENGL_CORE) {
-  ctx->Const.MaxViewports = GEN7_NUM_VIEWPORTS;
+   if (brw->gen >= 6 && ctx->API == API_OPENGL_CORE) {
+  ctx->Const.MaxViewports = GEN6_NUM_VIEWPORTS;
   ctx->Const.ViewportSubpixelBits = 0;
 
   /* Cast to float before negating because MaxViewportWidth is unsigned.
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 7b5dd45..83d7a35 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1712,7 +1712,7 @@ enum brw_message_target {
 # define GEN6_CC_VIEWPORT_MODIFY   (1 << 12)
 # define GEN6_SF_VIEWPORT_MODIFY   (1 << 11)
 # define GEN6_CLIP_VIEWPORT_MODIFY (1 << 10)
-# define GEN7_NUM_VIEWPORTS16
+# define GEN6_NUM_VIEWPORTS16
 
 #define _3DSTATE_VIEWPORT_STATE_POINTERS_CC0x7823 /* GEN7+ */
 #define _3DSTATE_VIEWPORT_STATE_POINTERS_SF_CL 0x7821 /* GEN7+ */
-- 
2.3.7

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] i965/gen6: Enable ARB_viewport_array and AMD_vertex_shader_viewport_index

2015-05-06 Thread Chris Forbes
Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index c28c171..3088a1a 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -292,6 +292,14 @@ intelInitExtensions(struct gl_context *ctx)
   /* Test if the kernel has the ioctl. */
   if (drm_intel_reg_read(brw->bufmgr, TIMESTAMP, &dummy) == 0)
  ctx->Extensions.ARB_timer_query = true;
+
+  /* Only enable this in core profile because other parts of Mesa behave
+   * slightly differently when the extension is enabled.
+   */
+  if (ctx->API == API_OPENGL_CORE) {
+ ctx->Extensions.ARB_viewport_array = true;
+ ctx->Extensions.AMD_vertex_shader_viewport_index = true;
+  }
}
 
if (brw->gen >= 5) {
@@ -313,14 +321,6 @@ intelInitExtensions(struct gl_context *ctx)
  ctx->Extensions.ARB_draw_indirect = true;
   }
 
-  /* Only enable this in core profile because other parts of Mesa behave
-   * slightly differently when the extension is enabled.
-   */
-  if (ctx->API == API_OPENGL_CORE) {
- ctx->Extensions.ARB_viewport_array = true;
- ctx->Extensions.AMD_vertex_shader_viewport_index = true;
-  }
-
   ctx->Extensions.ARB_texture_compression_bptc = true;
   ctx->Extensions.ARB_derivative_control = true;
}
-- 
2.3.7

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 20/23] i965: Plumb compiler debug logging through a function pointer in brw_compiler

2015-04-30 Thread Chris Forbes
Looks like everything prior to this patch has landed;

Ken's two patches for the printf-like debug plumbing, and the
remaining patches from this series are:

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Sun, Apr 19, 2015 at 9:02 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
 On Sat, Apr 18, 2015 at 1:55 PM, Kenneth Graunke kenn...@whitecape.org 
 wrote:
 On Friday, April 17, 2015 07:12:00 PM Jason Ekstrand wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 11 ++-
  src/mesa/drivers/dri/i965/brw_shader.cpp | 13 +
  src/mesa/drivers/dri/i965/brw_shader.h   |  2 ++
  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 11 ++-
  4 files changed, 27 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 index 35bc241..123bdf7 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 @@ -2111,15 +2111,16 @@ fs_generator::generate_code(const cfg_t *cfg, int 
 dispatch_width)
ralloc_free(annotation.ann);
 }

 -   static GLuint msg_id = 0;
 -   _mesa_gl_debug(brw-ctx, msg_id,
 -  MESA_DEBUG_SOURCE_SHADER_COMPILER,
 -  MESA_DEBUG_TYPE_OTHER,
 -  MESA_DEBUG_SEVERITY_NOTIFICATION,
 +   const int debug_str_size = 160;
 +   char debug_str[debug_str_size];
 +   int len;
 +   len = snprintf(debug_str, debug_str_size,
%s SIMD%d shader: %d inst, %d loops, %d:%d 
 spills:fills, 
Promoted %u constants, compacted %d to %d bytes.\n,
stage_abbrev, dispatch_width, before_size / 16, 
 loop_count,
spill_count, fill_count, promoted_constants, 
 before_size, after_size);
 +   assert(len  debug_str_size); (void)len;
 +   brw-intelScreen-compiler-shader_debug_log(debug_str);

 I don't like that this requires fixed size buffer logic at every call
 site.  It's kinda gross.

 How about making it printf-like instead?  Specifically:
 http://cgit.freedesktop.org/~kwg/mesa/commit/?h=compiler-divorce&id=1a71535d2de01f8a7ad244d39d801d63493ba5e9
 http://cgit.freedesktop.org/~kwg/mesa/commit/?h=compiler-divorce&id=830a25a1f11367e032d8e6a13fa141ff82c06417

 Yeah, that's better.  I had thought about it but did the easy thing.
 Given that you've got the patches written, I'm 100% with doing that
 instead.
 --Jason
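
To illustrate the printf-like approach agreed on above: a minimal sketch in which the formatting moves behind the function pointer, so call sites no longer need a fixed-size buffer. The struct, the callback signature and the capture buffer below are hypothetical and are not the actual brw_compiler interface.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical compiler object with a printf-like debug hook. */
struct compiler_sketch {
   void (*shader_debug_log)(const char *fmt, ...);
};

/* Example sink: formats into a static buffer so the result can be
 * inspected. A real driver would forward to its debug output instead. */
static char last_message[256];

static void capture_log(const char *fmt, ...)
{
   va_list args;

   va_start(args, fmt);
   vsnprintf(last_message, sizeof(last_message), fmt, args);
   va_end(args);
}

/* Call site: no local buffer, no length assert. */
static void report_stats(const struct compiler_sketch *c,
                         const char *stage, int width, int insts)
{
   c->shader_debug_log("%s SIMD%d shader: %d inst\n", stage, width, insts);
}
```

The buffer sizing and truncation policy then live in one place (the sink) rather than at every caller.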

 (compiler-divorce of my tree has the rebased branch with those patches
 in, if that's useful to you)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/20] glsl/es3.1: Allow textureGather and textureGatherOffset in GLSL ES 3.10

2015-04-30 Thread Chris Forbes
  /* Only ARB_texture_gather but not GLSL 4.0 or ARB_gpu_shader5.
   * used for relaxation of const offset requirements.
   */
  static bool
 -texture_gather_only(const _mesa_glsl_parse_state *state)
 +texture_gather_only_or_es31(const _mesa_glsl_parse_state *state)
  {
 return !state->is_version(400, 0) &&
!state->ARB_gpu_shader5_enable &&
 -  state->ARB_texture_gather_enable;
 +  (state->ARB_texture_gather_enable ||
 +   state->is_version(0, 310));
  }

I don't think this is correct. This is used to enable the restricted
versions of textureGather/textureGatherOffset which require a constant
offset. This restriction doesn't appear to exist in ES3.1 -- it's more
or less the GS5 version.

An example pair, where at most one should be enabled:
_texture(ir_tg4, texture_gather_only, glsl_type::vec4_type,
glsl_type::sampler2D_type, glsl_type::vec2_type, TEX_OFFSET),
_texture(ir_tg4, gpu_shader5, glsl_type::vec4_type,
glsl_type::sampler2D_type, glsl_type::vec2_type, TEX_OFFSET_NONCONST),

- Chris
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/20] glsl/es3.1: Allow textureGather and textureGatherOffset in GLSL ES 3.10

2015-04-30 Thread Chris Forbes
Nevermind, pre-coffee. On re-reading the GLSL ES 3.1 spec, the offset
is required to be constant wherever texture offsets are used.
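
For anyone following along, here is a small sketch of the availability predicate being discussed. The struct is a simplified stand-in for _mesa_glsl_parse_state, and is_version is modelled on the assumption that a required version of 0 means "never available in that API flavour".

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-in for the parser state. */
struct parse_state {
   bool es_shader;
   unsigned language_version;
   bool ARB_gpu_shader5_enable;
   bool ARB_texture_gather_enable;
};

/* Assumed semantics: pick the desktop or ES requirement depending on
 * the shader flavour; a requirement of 0 means "not available". */
static bool is_version(const struct parse_state *s,
                       unsigned required_glsl, unsigned required_es)
{
   unsigned required = s->es_shader ? required_es : required_glsl;
   return required != 0 && s->language_version >= required;
}

/* Constant-offset-only textureGather variants: ARB_texture_gather on
 * desktop without GLSL 4.00 / ARB_gpu_shader5, or GLSL ES 3.10. */
static bool texture_gather_only_or_es31(const struct parse_state *s)
{
   return !is_version(s, 400, 0) &&
          !s->ARB_gpu_shader5_enable &&
          (s->ARB_texture_gather_enable || is_version(s, 0, 310));
}
```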

On Fri, May 1, 2015 at 10:03 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Thu, Apr 30, 2015 at 5:56 PM, Chris Forbes chr...@ijw.co.nz wrote:
  /* Only ARB_texture_gather but not GLSL 4.0 or ARB_gpu_shader5.
   * used for relaxation of const offset requirements.
   */
  static bool
 -texture_gather_only(const _mesa_glsl_parse_state *state)
 +texture_gather_only_or_es31(const _mesa_glsl_parse_state *state)
  {
 return !state->is_version(400, 0) &&
!state->ARB_gpu_shader5_enable &&
 -  state->ARB_texture_gather_enable;
 +  (state->ARB_texture_gather_enable ||
 +   state->is_version(0, 310));
  }

 I don't think this is correct. This is used to enable the restricted
 versions of textureGather/textureGatherOffset which require a constant
 offset. This restriction doesn't appear to exist in ES3.1 -- it's more
 or less the GS5 version.

 An example pair, where at most one should be enabled:
 _texture(ir_tg4, texture_gather_only, glsl_type::vec4_type,
 glsl_type::sampler2D_type, glsl_type::vec2_type, TEX_OFFSET),
 _texture(ir_tg4, gpu_shader5, glsl_type::vec4_type,
 glsl_type::sampler2D_type, glsl_type::vec2_type, TEX_OFFSET_NONCONST),

 From my read of GLSL ES 3.10, it requires const offsets. Non-const
 support is added in EXT_gpu_shader5:

   * extending the textureGather() built-in functions provided by
 OpenGL ES Shading Language 3.10:

 * allowing shaders to use arbitrary offsets computed at run-time to
   select a 2x2 footprint to gather from; and

   -ilia
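
The upshot of this exchange is that the patched predicate lets GLSL ES 3.10 share the constant-offset textureGather built-ins with plain ARB_texture_gather. A minimal C sketch of that logic follows. The struct and the sketch_ names are hypothetical stand-ins for _mesa_glsl_parse_state, and the is_version() behaviour (a required version of 0 is never satisfied for that API) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for _mesa_glsl_parse_state. */
struct sketch_parse_state {
   bool es_shader;
   unsigned language_version;
   bool ARB_gpu_shader5_enable;
   bool ARB_texture_gather_enable;
};

/* Assumed semantics: pick the desktop or ES requirement depending on the
 * shader kind; a requirement of 0 means "never available in this API". */
static bool
sketch_is_version(const struct sketch_parse_state *state,
                  unsigned required_glsl, unsigned required_glsl_es)
{
   unsigned required = state->es_shader ? required_glsl_es : required_glsl;
   return required != 0 && state->language_version >= required;
}

/* Constant-offset gather only: neither GLSL 4.00 nor ARB_gpu_shader5, but
 * either ARB_texture_gather or GLSL ES 3.10. */
static bool
sketch_texture_gather_only_or_es31(const struct sketch_parse_state *state)
{
   assert(state != NULL);
   return !sketch_is_version(state, 400, 0) &&
          !state->ARB_gpu_shader5_enable &&
          (state->ARB_texture_gather_enable ||
           sketch_is_version(state, 0, 310));
}
```

With these assumptions, an ES 3.10 shader and a desktop 3.30 shader with ARB_texture_gather both get the restricted built-ins, while GLSL 4.00 or ARB_gpu_shader5 shaders fall through to the non-constant variants.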


Re: [Mesa-dev] [PATCH 17/18] i965/ps/gen8: Refactor state uploading

2015-04-29 Thread Chris Forbes
It might be better to just prefetch no samplers in this case? -- a
shader that has this many active samplers probably doesn't actually
use them all in a single invocation.

On Thu, Apr 30, 2015 at 5:23 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Wednesday, April 29, 2015 07:47:26 PM Pohjolainen, Topi wrote:
 On Thu, Apr 23, 2015 at 09:58:22PM +0300, Pohjolainen, Topi wrote:
  On Thu, Apr 23, 2015 at 11:53:49AM -0700, Matt Turner wrote:
   On Wed, Apr 22, 2015 at 1:47 PM, Topi Pohjolainen
   topi.pohjolai...@intel.com wrote:
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/dri/i965/brw_state.h | 12 +
 src/mesa/drivers/dri/i965/gen8_ps_state.c | 74 
---
 2 files changed, 59 insertions(+), 27 deletions(-)
   
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 178f039..0c4f65e 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -265,6 +265,18 @@ void gen7_set_surface_mcs_info(struct brw_context 
*brw,
 void gen7_check_surface_setup(uint32_t *surf, bool is_render_target);
 void gen7_init_vtable_surface_functions(struct brw_context *brw);
   
+/* gen8_ps_state.c */
+void gen8_upload_ps_state(struct brw_context *brw,
+  const struct gl_fragment_program *fp,
+  const struct brw_stage_state *stage_state,
+  const struct brw_wm_prog_data *prog_data,
+  uint32_t fast_clear_op);
+
+void gen8_upload_ps_extra(struct brw_context *brw,
+  const struct gl_fragment_program *fp,
+  const struct brw_wm_prog_data *prog_data,
+  bool multisampled_fbo);
+
 /* gen7_sol_state.c */
 void gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
   const struct brw_vue_map 
*vue_map);
diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
b/src/mesa/drivers/dri/i965/gen8_ps_state.c
index 5f39e12..da6136b 100644
--- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
@@ -27,15 +27,13 @@
 #include brw_defines.h
 #include intel_batchbuffer.h
   
-static void
-upload_ps_extra(struct brw_context *brw)
+void
+gen8_upload_ps_extra(struct brw_context *brw,
+ const struct gl_fragment_program *fp,
+ const struct brw_wm_prog_data *prog_data,
+ bool multisampled_fbo)
 {
struct gl_context *ctx = brw-ctx;
-   /* BRW_NEW_FRAGMENT_PROGRAM */
-   const struct brw_fragment_program *fp =
-  brw_fragment_program_const(brw-fragment_program);
-   /* BRW_NEW_FS_PROG_DATA */
-   const struct brw_wm_prog_data *prog_data = brw-wm.prog_data;
uint32_t dw1 = 0;
   
dw1 |= GEN8_PSX_PIXEL_SHADER_VALID;
@@ -47,16 +45,14 @@ upload_ps_extra(struct brw_context *brw)
if (prog_data-num_varying_inputs != 0)
   dw1 |= GEN8_PSX_ATTRIBUTE_ENABLE;
   
-   if (fp-program.Base.InputsRead  VARYING_BIT_POS)
+   if (fp-Base.InputsRead  VARYING_BIT_POS)
   dw1 |= GEN8_PSX_USES_SOURCE_DEPTH | GEN8_PSX_USES_SOURCE_W;
   
-   /* BRW_NEW_NUM_SAMPLES | _NEW_MULTISAMPLE */
-   bool multisampled_fbo = brw-num_samples  1;
if (multisampled_fbo 
-   _mesa_get_min_invocations_per_fragment(ctx, fp-program, 
false)  1)
+   _mesa_get_min_invocations_per_fragment(ctx, fp, false)  1)
   dw1 |= GEN8_PSX_SHADER_IS_PER_SAMPLE;
   
-   if (fp-program.Base.SystemValuesRead  SYSTEM_BIT_SAMPLE_MASK_IN)
+   if (fp-Base.SystemValuesRead  SYSTEM_BIT_SAMPLE_MASK_IN)
   dw1 |= GEN8_PSX_SHADER_USES_INPUT_COVERAGE_MASK;
   
if (prog_data-uses_omask)
@@ -68,6 +64,20 @@ upload_ps_extra(struct brw_context *brw)
ADVANCE_BATCH();
 }
   
+static void
+upload_ps_extra(struct brw_context *brw)
+{
+   /* BRW_NEW_FRAGMENT_PROGRAM */
+   const struct brw_fragment_program *fp =
+  brw_fragment_program_const(brw-fragment_program);
+   /* BRW_NEW_FS_PROG_DATA */
+   const struct brw_wm_prog_data *prog_data = brw-wm.prog_data;
+   /* BRW_NEW_NUM_SAMPLES | _NEW_MULTISAMPLE */
+   const bool multisampled_fbo = brw-num_samples  1;
+
+   gen8_upload_ps_extra(brw, fp-program, prog_data, 
multisampled_fbo);
+}
+
 const struct brw_tracked_state gen8_ps_extra = {
.dirty = {
   .mesa  = _NEW_MULTISAMPLE,
@@ -118,23 +128,24 @@ const struct brw_tracked_state gen8_wm_state = {
.emit = upload_wm_state,
 };
   
-static void
-upload_ps_state(struct brw_context *brw)
+void
+gen8_upload_ps_state(struct 

Re: [Mesa-dev] [PATCH v5] i965/aa: fixing anti-aliasing bug for thinnest width lines - GEN6

2015-04-28 Thread Chris Forbes
Have an:

Acked-by: Chris Forbes chr...@ijw.co.nz

On Fri, Apr 24, 2015 at 3:41 AM, Marius Predut marius.pre...@intel.com wrote:
 On SNB and IVB hw, for 1 pixel line thickness or less,
 the general anti-aliasing algorithm gives up and a garbage line is generated.
 Setting a Line Width of 0.0 specifies the rasterization of
 the “thinnest” (one-pixel-wide), non-antialiased lines.
 Lines rendered with zero Line Width are rasterized using
 Grid Intersection Quantization rules as specified
 by bspec section 6.3.12.1 Zero-Width (Cosmetic) Line Rasterization.

 v2: Daniel Stone: Fix = used instead of == in an if-statement.
 v3: Ian Romanick: Use ._Enabled flag instead of .Enabled.
 Add code comments. Re-word-wrap the commit message.
 Add a complete Bugzilla list.
 Improve the hardcoded values to produce better results.
 v4: Matt Turner: typo fixes and adjust <= 1.49 to become < 1.5

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28832
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=9951
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27007
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60797
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=15006

 Signed-off-by: Marius Predut marius.pre...@intel.com
 ---
  src/mesa/drivers/dri/i965/gen6_sf_state.c | 22 +++---
  1 file changed, 19 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c 
 b/src/mesa/drivers/dri/i965/gen6_sf_state.c
 index ea5c47a..e445ce2 100644
 --- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
 @@ -367,9 +367,25 @@ upload_sf_state(struct brw_context *brw)
       float line_width =
          roundf(CLAMP(ctx->Line.Width, 0.0, ctx->Const.MaxLineWidth));
       uint32_t line_width_u3_7 = U_FIXED(line_width, 7);
 -     /* TODO: line width of 0 is not allowed when MSAA enabled */
 -     if (line_width_u3_7 == 0)
 -        line_width_u3_7 = 1;
 +
 +     /* Line width of 0 is not allowed when MSAA enabled */
 +     if (ctx->Multisample._Enabled) {
 +        if (line_width_u3_7 == 0)
 +           line_width_u3_7 = 1;
 +     } else if (ctx->Line.SmoothFlag && ctx->Line.Width < 1.5) {
 +        /* For 1 pixel line thickness or less, the general
 +         * anti-aliasing algorithm gives up, and a garbage line is
 +         * generated.  Setting a Line Width of 0.0 specifies the
 +         * rasterization of the thinnest (one-pixel-wide),
 +         * non-antialiased lines.
 +         *
 +         * Lines rendered with zero Line Width are rasterized using
 +         * Grid Intersection Quantization rules as specified by
 +         * bspec section 6.3.12.1 Zero-Width (Cosmetic) Line
 +         * Rasterization.
 +         */
 +        line_width_u3_7 = 0;
 +     }
       dw3 |= line_width_u3_7 << GEN6_SF_LINE_WIDTH_SHIFT;
    }
    if (ctx->Line.SmoothFlag) {
 --
 1.9.1
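
The line-width selection in the hunk above can be sketched as a standalone C function. This is only an illustration under stated assumptions: to_ufixed() is a hypothetical stand-in for the driver's U_FIXED(), the MSAA and smooth flags are passed in as plain booleans, and rounding is done by hand for non-negative values instead of calling roundf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for U_FIXED(): non-negative float to unsigned
 * fixed point with frac_bits fractional bits (U3.7 when frac_bits == 7). */
static uint32_t
to_ufixed(float value, unsigned frac_bits)
{
   assert(value >= 0.0f);
   return (uint32_t)(value * (float)(1u << frac_bits));
}

/* Mirrors the patched logic: MSAA never programs a zero width, while
 * smooth (anti-aliased) lines narrower than 1.5 pixels are programmed as
 * zero-width "cosmetic" lines. */
static uint32_t
sketch_line_width_u3_7(float width, float max_width, bool msaa, bool smooth)
{
   /* Clamp to [0, max_width], then round to the nearest integer. */
   float clamped = width < 0.0f ? 0.0f
                                : (width > max_width ? max_width : width);
   float rounded = (float)(uint32_t)(clamped + 0.5f);
   uint32_t fixed = to_ufixed(rounded, 7);

   if (msaa) {
      if (fixed == 0)
         fixed = 1;
   } else if (smooth && width < 1.5f) {
      fixed = 0;
   }
   return fixed;
}
```

Under these assumptions a smooth 1.0-pixel line without MSAA yields 0, while the same line without smoothing yields 128 (1.0 in U3.7).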



Re: [Mesa-dev] [PATCH] i965: Disassemble sampler message names on Gen5+.

2015-04-24 Thread Chris Forbes
On the "this is silly, I should really fix it" list forever...

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Fri, Apr 24, 2015 at 6:02 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 Previously, sampler messages were decoded as

 sampler (1, 0, 2, 2) mlen 6 rlen 8  { align1 1H };

 I don't know how much time we've collectively wasted trying to read this
 format.  I can never recall which number is the surface index, sampler
 index, message type, or...whatever that other number is.  Figuring out
 the message name from the numerical code is also painful.

 Now they decode as:

 sampler sample_l SIMD16 Surface = 1 Sampler = 0 mlen 6 rlen 8 { align1 1H };

 This is easy to read at a glance, and matches the format I used for
 render target formats.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 Cc: matts...@gmail.com
 ---
  src/mesa/drivers/dri/i965/brw_disasm.c | 38 
 ++
  1 file changed, 34 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
 b/src/mesa/drivers/dri/i965/brw_disasm.c
 index d1078c0..95e262a 100644
 --- a/src/mesa/drivers/dri/i965/brw_disasm.c
 +++ b/src/mesa/drivers/dri/i965/brw_disasm.c
 @@ -579,6 +579,34 @@ static const char *const urb_complete[2] = {
     [1] = "complete"
  };

 +static const char *const gen5_sampler_msg_type[] = {
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE]              = "sample",
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE_BIAS]         = "sample_b",
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE_LOD]          = "sample_l",
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE_COMPARE]      = "sample_c",
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE_DERIVS]       = "sample_d",
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE_BIAS_COMPARE] = "sample_b_c",
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE_LOD_COMPARE]  = "sample_l_c",
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE_LD]           = "ld",
 +   [GEN7_SAMPLER_MESSAGE_SAMPLE_GATHER4]      = "gather4",
 +   [GEN5_SAMPLER_MESSAGE_LOD]                 = "lod",
 +   [GEN5_SAMPLER_MESSAGE_SAMPLE_RESINFO]      = "resinfo",
 +   [GEN7_SAMPLER_MESSAGE_SAMPLE_GATHER4_C]    = "gather4_c",
 +   [GEN7_SAMPLER_MESSAGE_SAMPLE_GATHER4_PO]   = "gather4_po",
 +   [GEN7_SAMPLER_MESSAGE_SAMPLE_GATHER4_PO_C] = "gather4_po_c",
 +   [HSW_SAMPLER_MESSAGE_SAMPLE_DERIV_COMPARE] = "sample_d_c",
 +   [GEN7_SAMPLER_MESSAGE_SAMPLE_LD_MCS]       = "ld_mcs",
 +   [GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DMS]       = "ld2dms",
 +   [GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DSS]       = "ld2dss",
 +};
 +
 +static const char *const gen5_sampler_simd_mode[4] = {
 +   [BRW_SAMPLER_SIMD_MODE_SIMD4X2]   = "SIMD4x2",
 +   [BRW_SAMPLER_SIMD_MODE_SIMD8]     = "SIMD8",
 +   [BRW_SAMPLER_SIMD_MODE_SIMD16]    = "SIMD16",
 +   [BRW_SAMPLER_SIMD_MODE_SIMD32_64] = "SIMD32/64",
 +};
 +
  static const char *const sampler_target_format[4] = {
     [0] = "F",
     [2] = "UD",
 @@ -1374,11 +1402,13 @@ brw_disassemble_inst(FILE *file, const struct 
 brw_device_info *devinfo,
              break;
           case BRW_SFID_SAMPLER:
              if (devinfo->gen >= 5) {
 -               format(file, " (%ld, %ld, %ld, %ld)",
 +               err |= control(file, "sampler message", gen5_sampler_msg_type,
 +                              brw_inst_sampler_msg_type(devinfo, inst), 
 &space);
 +               err |= control(file, "sampler simd mode", 
 gen5_sampler_simd_mode,
 +                              brw_inst_sampler_simd_mode(devinfo, inst), 
 &space);
 +               format(file, " Surface = %ld Sampler = %ld",
                        brw_inst_binding_table_index(devinfo, inst),
 -                      brw_inst_sampler(devinfo, inst),
 -                      brw_inst_sampler_msg_type(devinfo, inst),
 -                      brw_inst_sampler_simd_mode(devinfo, inst));
 +                      brw_inst_sampler(devinfo, inst));
              } else {
                 format(file, " (%ld, %ld, %ld, ",
                        brw_inst_binding_table_index(devinfo, inst),
 --
 2.3.5
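
The table-driven decoding this patch introduces can be sketched in isolation. The sketch below is not the driver's code: the sketch_ names are hypothetical, and the numeric codes are assumptions inferred from the commit message's own example, where message type 2 decodes as sample_l and SIMD mode 2 as SIMD16.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed numeric codes, for illustration only. */
static const char *const sketch_msg_type[] = {
   [0] = "sample",
   [1] = "sample_b",
   [2] = "sample_l",
};

static const char *const sketch_simd_mode[] = {
   [0] = "SIMD4x2",
   [1] = "SIMD8",
   [2] = "SIMD16",
   [3] = "SIMD32/64",
};

/* Format the four raw fields of the old "(1, 0, 2, 2)" tuple in the new
 * named style. Returns the snprintf() result. */
static int
sketch_decode_sampler(char *buf, size_t size, unsigned surface,
                      unsigned sampler, unsigned msg_type, unsigned simd_mode)
{
   assert(msg_type < sizeof(sketch_msg_type) / sizeof(sketch_msg_type[0]));
   assert(simd_mode < sizeof(sketch_simd_mode) / sizeof(sketch_simd_mode[0]));
   return snprintf(buf, size, "sampler %s %s Surface = %u Sampler = %u",
                   sketch_msg_type[msg_type], sketch_simd_mode[simd_mode],
                   surface, sampler);
}
```

Feeding it the tuple from the old output (surface 1, sampler 0, message type 2, SIMD mode 2) reproduces the leading part of the new disassembly line quoted in the commit message.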



Re: [Mesa-dev] [Mesa-stable] [PATCH 1/3] i965: Fix instanced geometry shaders on Gen8+.

2015-04-04 Thread Chris Forbes
For the series:

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Sat, Apr 4, 2015 at 11:46 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 Jordan added this in commit 741782b5948bb3d01d699f062a37513c2e73b076 for
 Gen7 platforms.  Embarrassingly, this was missed for well over a year.

 Fixes Piglit's spec/arb_gpu_shader5/invocation-id-{basic,in-separate-gs}
 with MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5 set.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 Cc: mesa-sta...@lists.freedesktop.org
 ---
  src/mesa/drivers/dri/i965/gen8_gs_state.c | 2 ++
  1 file changed, 2 insertions(+)

 Mini-series also available in the 'gen8sol' branch of ~kwg/mesa.

 diff --git a/src/mesa/drivers/dri/i965/gen8_gs_state.c 
 b/src/mesa/drivers/dri/i965/gen8_gs_state.c
 index 95cc123..46b9713 100644
 --- a/src/mesa/drivers/dri/i965/gen8_gs_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_gs_state.c
 @@ -82,6 +82,8 @@ gen8_upload_gs_state(struct brw_context *brw)
    uint32_t dw7 = (brw->gs.prog_data->control_data_header_size_hwords <<
                    GEN7_GS_CONTROL_DATA_HEADER_SIZE_SHIFT) |
                   brw->gs.prog_data->dispatch_mode |
 +                 ((brw->gs.prog_data->invocations - 1) <<
 +                  GEN7_GS_INSTANCE_CONTROL_SHIFT) |
                   GEN6_GS_STATISTICS_ENABLE |
                   (brw->gs.prog_data->include_primitive_id ?
                    GEN7_GS_INCLUDE_PRIMITIVE_ID : 0) |
 --
 2.3.4
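
The two added lines pack the geometry shader invocation count, minus one, into a bit field of the state dword. A minimal sketch of that packing follows. The shift value and the sketch_ names are placeholders chosen for illustration; the real GEN7_GS_INSTANCE_CONTROL_SHIFT is defined in the driver's brw_defines.h and is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit position, assumed for this sketch only. */
#define SKETCH_GS_INSTANCE_CONTROL_SHIFT 15

/* OR the "invocations minus one" field into an existing state dword.
 * A shader with a single invocation leaves the field at zero, which is
 * what the unpatched Gen8 code always programmed. */
static uint32_t
sketch_pack_gs_invocations(uint32_t dw, unsigned invocations)
{
   assert(invocations >= 1);
   return dw | ((uint32_t)(invocations - 1) << SKETCH_GS_INSTANCE_CONTROL_SHIFT);
}
```

This also suggests why the bug went unnoticed: with one invocation the packed field is zero either way, so only instanced geometry shaders see a difference.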

 ___
 mesa-stable mailing list
 mesa-sta...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-stable


Re: [Mesa-dev] [PATCH 14/23] i965: Use BRW_SURFACE_* in place of GL_TEXTURE_*

2015-03-31 Thread Chris Forbes
I'd adjust the write to surf[0] to use surf_type too.

Other than that, this patch is:

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Tue, Mar 31, 2015 at 10:04 AM, Anuj Phogat anuj.pho...@gmail.com wrote:
 Makes no functional changes in the code.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/dri/i965/gen8_surface_state.c | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
 b/src/mesa/drivers/dri/i965/gen8_surface_state.c
 index e9ba938..84fa383 100644
 --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
 @@ -174,6 +174,7 @@ gen8_update_texture_surface(struct gl_context *ctx,
 struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
 struct intel_mipmap_tree *aux_mt = NULL;
 uint32_t aux_mode = 0;
 +   uint32_t surf_type;
 mesa_format format = intelObj-_Format;
     uint32_t mocs_wb = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;

 @@ -203,29 +204,31 @@ gen8_update_texture_surface(struct gl_context *ctx,
aux_mode = GEN8_SURFACE_AUX_MODE_MCS;
 }

 +   surf_type = translate_tex_target(tObj->Target);
 +
     /* If this is a view with restricted NumLayers, then our effective depth
      * is not just the miptree depth.
      */
     uint32_t effective_depth =
 -      (tObj->Immutable && tObj->Target != GL_TEXTURE_3D) ? tObj->NumLayers
 -                                                         : 
 mt->logical_depth0;
 +      (tObj->Immutable && surf_type != BRW_SURFACE_3D) ? tObj->NumLayers
 +                                                       : mt->logical_depth0;

     uint32_t tex_format = translate_tex_format(brw, format, 
 sampler->sRGBDecode);

     uint32_t *surf = allocate_surface_state(brw, surf_offset);

 +
     surf[0] = translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT |
               tex_format << BRW_SURFACE_FORMAT_SHIFT |
               vertical_alignment(mt) |
               horizontal_alignment(mt) |
               tiling_mode;

 -   if (tObj->Target == GL_TEXTURE_CUBE_MAP ||
 -       tObj->Target == GL_TEXTURE_CUBE_MAP_ARRAY) {
 +   if (surf_type == BRW_SURFACE_CUBE) {
        surf[0] |= BRW_SURFACE_CUBEFACE_ENABLES;
     }

 -   if (mt->logical_depth0 > 1 && tObj->Target != GL_TEXTURE_3D)
 +   if (mt->logical_depth0 > 1 && surf_type != BRW_SURFACE_3D)
        surf[0] |= GEN8_SURFACE_IS_ARRAY;

     surf[1] = SET_FIELD(mocs_wb, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
 --
 2.3.4



Re: [Mesa-dev] [PATCH 3/3] glsl: Reassociate multiplication of mat*mat*vec.

2015-03-28 Thread Chris Forbes
For the series:

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Sat, Mar 28, 2015 at 5:22 PM, Matt Turner matts...@gmail.com wrote:
 The typical case of mat4*mat4*vec4 is 80 scalar multiplications, but
 mat4*(mat4*vec4) is only 32.

 On HSW (with vec4 vertex shaders):
 instructions in affected programs: 4420 -> 3194 (-27.74%)

 On BDW (with scalar vertex shaders):
 instructions in affected programs: 12756 -> 6726 (-47.27%)

 Implementing a general matrix chain ordering is harder (or at least
 tedious) because of having to walk the GLSL IR to create a list of
 multiplicands. I'm guessing that this patch handles 90+% of cases, but
 of course to tell definitively you'd have to implement the general
 thing.
 ---
  src/glsl/opt_algebraic.cpp | 14 ++
  1 file changed, 14 insertions(+)

 diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
 index 98c852a..a940d2f 100644
 --- a/src/glsl/opt_algebraic.cpp
 +++ b/src/glsl/opt_algebraic.cpp
 @@ -290,6 +290,20 @@ ir_algebraic_visitor::handle_expression(ir_expression 
 *ir)
 ir_expression *op_expr[4] = {NULL, NULL, NULL, NULL};
 unsigned int i;

 +   if (ir->operation == ir_binop_mul &&
 +       ir->operands[0]->type->is_matrix() &&
 +       ir->operands[1]->type->is_vector()) {
 +      ir_expression *matrix_mul = ir->operands[0]->as_expression();
 +
 +      if (matrix_mul && matrix_mul->operation == ir_binop_mul &&
 +         matrix_mul->operands[0]->type->is_matrix() &&
 +         matrix_mul->operands[1]->type->is_matrix()) {
 +
 +         return mul(matrix_mul->operands[0],
 +                    mul(matrix_mul->operands[1], ir->operands[1]));
 +      }
 +   }
 +
     assert(ir->get_num_operands() <= 4);
     for (i = 0; i < ir->get_num_operands(); i++) {
        if (ir->operands[i]->type->is_matrix())
 --
 2.0.5
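
The 80-versus-32 figure in the commit message follows from counting scalar multiplications: an (r x k) by (k x c) product needs r*k*c of them. The short C sketch below works through that arithmetic; the function names are hypothetical and it counts multiplications only, not the instructions a backend would actually emit.

```c
#include <assert.h>

/* Scalar multiplications in an (r x k) * (k x c) matrix product. */
static unsigned
mul_count(unsigned r, unsigned k, unsigned c)
{
   assert(r > 0 && k > 0 && c > 0);
   return r * k * c;
}

/* (M1 * M2) * v for n x n matrices and an n-component vector:
 * one matrix-matrix product, then one matrix-vector product. */
static unsigned
cost_left_assoc(unsigned n)
{
   return mul_count(n, n, n) + mul_count(n, n, 1);
}

/* M1 * (M2 * v): two matrix-vector products. */
static unsigned
cost_right_assoc(unsigned n)
{
   return mul_count(n, n, 1) + mul_count(n, n, 1);
}
```

For n = 4 this gives 64 + 16 = 80 for the left-associated form and 16 + 16 = 32 for the reassociated one, matching the numbers quoted above.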



Re: [Mesa-dev] [Mesa-stable] [PATCH v2] glsl: fix names in lower_constant_arrays_to_uniforms

2015-03-23 Thread Chris Forbes
Looks good to me. I should have considered this cross-stage case when
I fixed the first part of this bug...

Do you have a piglit test which hits this?

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Mon, Mar 23, 2015 at 8:12 PM, Tapani Pälli tapani.pa...@intel.com wrote:
 Patch changes lowering pass to use unique name for each uniform
 so that arrays from different stages cannot end up having same
 name.

 v2: instead of global counter, use pointer to achieve
 unique name (Kenneth Graunke)

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89590
 Cc: 10.5 10.4 mesa-sta...@lists.freedesktop.org
 ---
  configure.ac| 2 +-
  src/glsl/lower_const_arrays_to_uniforms.cpp | 4 +---
  2 files changed, 2 insertions(+), 4 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 08378f5..19d4c06 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -68,7 +68,7 @@ AC_SUBST([OSMESA_VERSION])
  dnl Versions for external dependencies
  LIBDRM_REQUIRED=2.4.38
  LIBDRM_RADEON_REQUIRED=2.4.56
 -LIBDRM_INTEL_REQUIRED=2.4.60
 +LIBDRM_INTEL_REQUIRED=2.4.59
  LIBDRM_NVVIEUX_REQUIRED=2.4.33
  LIBDRM_NOUVEAU_REQUIRED=2.4.33 libdrm = 2.4.41
  LIBDRM_FREEDRENO_REQUIRED=2.4.57
 diff --git a/src/glsl/lower_const_arrays_to_uniforms.cpp 
 b/src/glsl/lower_const_arrays_to_uniforms.cpp
 index 2243f47..44967dc 100644
 --- a/src/glsl/lower_const_arrays_to_uniforms.cpp
 +++ b/src/glsl/lower_const_arrays_to_uniforms.cpp
 @@ -49,7 +49,6 @@ public:
 {
instructions = insts;
progress = false;
 -  index = 0;
 }

 bool run()
 @@ -63,7 +62,6 @@ public:
  private:
 exec_list *instructions;
 bool progress;
 -   unsigned index;
  };

  void
 @@ -82,7 +80,7 @@ lower_const_array_visitor::handle_rvalue(ir_rvalue **rvalue)

 void *mem_ctx = ralloc_parent(con);

 -   char *uniform_name = ralloc_asprintf(mem_ctx, "constarray__%d", index++);
 +   char *uniform_name = ralloc_asprintf(mem_ctx, "constarray__%p", dra);

     ir_variable *uni =
        new(mem_ctx) ir_variable(con->type, uniform_name, ir_var_uniform);
 --
 2.1.0



Re: [Mesa-dev] [Mesa-stable] [PATCH v2] glsl: fix names in lower_constant_arrays_to_uniforms

2015-03-23 Thread Chris Forbes
 -LIBDRM_INTEL_REQUIRED=2.4.60
 +LIBDRM_INTEL_REQUIRED=2.4.59

Hang on, what's this hunk doing here?


Re: [Mesa-dev] [PATCH 10/11] i965/fs: Implement support for ir_barrier

2015-03-22 Thread Chris Forbes
Jordan,

You also need to set m0.2:15 (Barrier count enable) and m0.2:14-9
(Barrier count) to have the message gateway actually collect the
proper number of threads, right?

- Chris
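
To make the question concrete, here is a rough C sketch of the m0.2 dword with the extra fields filled in. The bit positions come only from this thread: barrier ID in bits 27:24 (copied from r0.2), barrier count enable in bit 15, and barrier count in bits 14:9. The function name is hypothetical, and whether the count field takes the raw thread count or some other encoding is not stated here, so the sketch simply stores the value it is given.

```c
#include <assert.h>
#include <stdint.h>

/* Build message payload dword m0.2 from the thread header dword r0.2 and
 * a barrier count. Field positions are taken from the discussion; the
 * encoding of the count is an assumption. */
static uint32_t
sketch_barrier_m0_2(uint32_t r0_2, unsigned barrier_count)
{
   const uint32_t barrier_id_mask = 0xfu << 24;   /* bits 27:24 */
   const uint32_t count_enable    = 1u << 15;     /* bit 15 */

   assert(barrier_count < (1u << 6));             /* bits 14:9 hold 6 bits */
   return (r0_2 & barrier_id_mask) |
          count_enable |
          ((uint32_t)barrier_count << 9);
}
```

In the patch as posted only the barrier ID term is present, which is the gap the question points at.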



On Mon, Mar 23, 2015 at 2:49 PM, Jordan Justen
jordan.l.jus...@intel.com wrote:
 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
 Reviewed-by: Chris Forbes chr...@ijw.co.nz
 ---
  src/mesa/drivers/dri/i965/brw_defines.h|  5 +
  src/mesa/drivers/dri/i965/brw_fs.h |  3 +++
  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 11 +++
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 27 
 +-
  src/mesa/drivers/dri/i965/brw_shader.cpp   |  3 +++
  5 files changed, 48 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index 98a392a..9b1fd15 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -1102,6 +1102,11 @@ enum opcode {
  *   and number of SO primitives needed.
  */
 GS_OPCODE_FF_SYNC_SET_PRIMITIVES,
 +
 +   /**
 +* GLSL barrier()
 +*/
 +   SHADER_OPCODE_BARRIER,
  };

  enum brw_urb_write_flags {
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index 86a7906..b55c333 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -383,6 +383,8 @@ public:
 void emit_fb_writes();
 void emit_urb_writes();

 +   void emit_barrier();
 +
 void emit_shader_time_begin();
 void emit_shader_time_end();
 fs_inst *SHADER_TIME_ADD(enum shader_time_shader_type type, fs_reg value);
 @@ -551,6 +553,7 @@ private:
GLuint nr);
 void generate_fb_write(fs_inst *inst, struct brw_reg payload);
 void generate_urb_write(fs_inst *inst, struct brw_reg payload);
 +   void generate_barrier(fs_inst *inst, struct brw_reg src);
 void generate_blorp_fb_write(fs_inst *inst);
 void generate_pixel_xy(struct brw_reg dst, bool is_x);
 void generate_linterp(fs_inst *inst, struct brw_reg dst,
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 index bd12147..f817e84 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
 @@ -369,6 +369,13 @@ fs_generator::generate_urb_write(fs_inst *inst, struct 
 brw_reg payload)
  }

  void
 +fs_generator::generate_barrier(fs_inst *inst, struct brw_reg src)
 +{
 +   brw_barrier(p, src);
 +   brw_wait(p);
 +}
 +
 +void
  fs_generator::generate_blorp_fb_write(fs_inst *inst)
  {
 brw_fb_WRITE(p,
 @@ -2060,6 +2067,10 @@ fs_generator::generate_code(const cfg_t *cfg, int 
 dispatch_width)
 
 GEN7_PIXEL_INTERPOLATOR_LOC_PER_SLOT_OFFSET);
   break;

 +  case SHADER_OPCODE_BARRIER:
 +generate_barrier(inst, src[0]);
 +break;
 +
default:
          if (inst->opcode < (int) ARRAY_SIZE(opcode_descs)) {
             _mesa_problem(ctx, "Unsupported opcode `%s' in %s",
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 2b1b72f..5cde8f5 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -3146,7 +3146,32 @@ fs_visitor::visit(ir_end_primitive *)
  void
  fs_visitor::visit(ir_barrier *)
  {
 -   assert(!"Not implemented!");
 +   emit_barrier();
 +}
 +
 +void
 +fs_visitor::emit_barrier()
 +{
 +   assert(brw->gen >= 7);
 +
 +   /* We are getting the barrier ID from the compute shader header */
 +   assert(stage == MESA_SHADER_COMPUTE);
 +
 +   fs_reg payload = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
 +
 +   /* Clear the message payload */
 +   fs_inst *inst = emit(MOV(payload, fs_reg(0u)));
 +   inst->force_writemask_all = true;
 +
 +   /* Copy bits 27:24 of r0.2 (barrier id) to the message payload reg.2 */
 +   struct fs_reg r0_2 = fs_reg(retype(brw_vec1_grf(0, 2), 
 BRW_REGISTER_TYPE_UD));
 +   inst = emit(AND(component(payload, 2), r0_2, fs_reg(0x0f00u)));
 +   inst->force_writemask_all = true;
 +
 +   /* Emit a gateway barrier message using the payload we set up, followed
 +* by a wait instruction.
 +*/
 +   emit(SHADER_OPCODE_BARRIER, reg_undef, payload);
  }

  void
 diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
 b/src/mesa/drivers/dri/i965/brw_shader.cpp
 index 51c965c..d0a7c2a 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
 @@ -572,6 +572,8 @@ brw_instruction_name(enum opcode op)
        return "gs_svb_set_dst_index";
     case GS_OPCODE_FF_SYNC_SET_PRIMITIVES:
        return "gs_ff_sync_set_primitives";
 +   case SHADER_OPCODE_BARRIER:
 +      return "barrier";
     }

     unreachable("not reached");
 @@ -986,6 +988,7 @@ backend_instruction::has_side_effects() const
 case

Re: [Mesa-dev] [PATCH 3/4] i965: Rename do_stage_prog to brw_stage_compile

2015-03-20 Thread Chris Forbes
I think that having both the existing `struct brw_vs_compile` and a
function with the same name is going to cause confusion. (same with
the other non-fs stages)

On Sat, Mar 21, 2015 at 2:04 PM, Ian Romanick i...@freedesktop.org wrote:
 On 03/20/2015 06:02 PM, Ian Romanick wrote:
 On 03/20/2015 05:49 PM, Carl Worth wrote:
 The rename here is in preparation for these functions to be exported
 to other files.

 I think I'd wait on this until you're also going to make them
 non-static.  Otherwise it's just extra churn.

 Which happens in the next patch.  I'd bring that part of patch 4 into
 this patch.  With that and tabs, this patch is

 Reviewed-by: Ian Romanick ian.d.roman...@intel.com

 And tabs. :)

 One other comment below.
 This commit is intended to have no functional change. It exists in
 preparation for some upcoming code movement in preparation for the
 shader cache.
 ---
  src/mesa/drivers/dri/i965/brw_ff_gs.c |  7 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp  |  2 +-
  src/mesa/drivers/dri/i965/brw_gs.c| 14 +++---
  src/mesa/drivers/dri/i965/brw_vs.c| 14 +++---
  src/mesa/drivers/dri/i965/brw_wm.c| 14 --
  src/mesa/drivers/dri/i965/brw_wm.h|  8 
  6 files changed, 31 insertions(+), 28 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_ff_gs.c 
 b/src/mesa/drivers/dri/i965/brw_ff_gs.c
 index c589171..8bc0a1c 100644
 --- a/src/mesa/drivers/dri/i965/brw_ff_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_ff_gs.c
 @@ -45,8 +45,9 @@

  #include util/ralloc.h

 -static void compile_ff_gs_prog(struct brw_context *brw,
 -   struct brw_ff_gs_prog_key *key)
 +static void
 +brw_ff_gs_compile(struct brw_context *brw,
 +  struct brw_ff_gs_prog_key *key)
  {
 struct brw_ff_gs_compile c;
 const GLuint *program;
 @@ -253,7 +254,7 @@ brw_upload_ff_gs_prog(struct brw_context *brw)
if (!brw_search_cache(brw-cache, BRW_CACHE_FF_GS_PROG,
  key, sizeof(key),
  brw-ff_gs.prog_offset, brw-ff_gs.prog_data)) {
 - compile_ff_gs_prog( brw, key );
 + brw_ff_gs_compile(brw, key);
}
 }
  }
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 780be80..24eb076 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -4177,7 +4177,7 @@ brw_fs_precompile(struct gl_context *ctx,
 uint32_t old_prog_offset = brw-wm.base.prog_offset;
 struct brw_wm_prog_data *old_prog_data = brw-wm.prog_data;

 -   bool success = do_wm_prog(brw, shader_prog, bfp, key);
 +   bool success = brw_wm_compile(brw, shader_prog, bfp, key);

 brw-wm.base.prog_offset = old_prog_offset;
 brw-wm.prog_data = old_prog_data;
 diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
 b/src/mesa/drivers/dri/i965/brw_gs.c
 index c45e217..a0da919 100644
 --- a/src/mesa/drivers/dri/i965/brw_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_gs.c
 @@ -35,10 +35,10 @@


  static bool
 -do_gs_prog(struct brw_context *brw,
 -   struct gl_shader_program *prog,
 -   struct brw_geometry_program *gp,
 -   struct brw_gs_prog_key *key)
 +brw_gs_compile(struct brw_context *brw,
 +   struct gl_shader_program *prog,
 +   struct brw_geometry_program *gp,
 +   struct brw_gs_prog_key *key)
  {
 struct brw_stage_state *stage_state = brw-gs.base;
 struct brw_gs_compile c;
 @@ -363,8 +363,8 @@ brw_upload_gs_prog(struct brw_context *brw)
   key, sizeof(key),
   stage_state-prog_offset, brw-gs.prog_data)) {
bool success =
 - do_gs_prog(brw, 
 ctx-_Shader-CurrentProgram[MESA_SHADER_GEOMETRY], gp,
 -key);
 + brw_gs_compile(brw, 
 ctx-_Shader-CurrentProgram[MESA_SHADER_GEOMETRY],
 +gp, key);
assert(success);
(void)success;
 }
 @@ -400,7 +400,7 @@ brw_gs_precompile(struct gl_context *ctx,
  */
 key.input_varyings = gp-Base.InputsRead;

 -   success = do_gs_prog(brw, shader_prog, bgp, key);
 +   success = brw_gs_compile(brw, shader_prog, bgp, key);

 brw-gs.base.prog_offset = old_prog_offset;
 brw-gs.prog_data = old_prog_data;
 diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
 b/src/mesa/drivers/dri/i965/brw_vs.c
 index 2c76d25..e5a55a7 100644
 --- a/src/mesa/drivers/dri/i965/brw_vs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vs.c
 @@ -188,10 +188,10 @@ brw_vs_prog_data_compare(const void *in_a, const void 
 *in_b)
  }

  static bool
 -do_vs_prog(struct brw_context *brw,
 -   struct gl_shader_program *prog,
 -   struct brw_vertex_program *vp,
 -   struct brw_vs_prog_key *key)
 +brw_vs_compile(struct brw_context *brw,
 +   struct gl_shader_program *prog,
 +   struct brw_vertex_program *vp,
 +   struct brw_vs_prog_key *key)
  {
 GLuint program_size;
 const GLuint *program;
 @@ -482,8 +482,8 @@ 

Re: [Mesa-dev] [Mesa-stable] [PATCH] glsl: Generate link error for non-matching gl_FragCoord redeclarations

2015-03-19 Thread Chris Forbes
LGTM.

Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Sat, Mar 7, 2015 at 1:15 PM, Anuj Phogat anuj.pho...@gmail.com wrote:
 in different fragment shaders. This also applies to a case when gl_FragCoord
 is redeclared with no layout qualifiers in one fragment shader and not
 declared but used in other fragment shader.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 Khronos Bug#12957
 Cc: 10.5 mesa-sta...@lists.freedesktop.org
 Cc: Ian Romanick i...@freedesktop.org
 ---
  src/glsl/linker.cpp | 15 ++-
  1 file changed, 2 insertions(+), 13 deletions(-)

 diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
 index e11b6fa..e8bda4f 100644
 --- a/src/glsl/linker.cpp
 +++ b/src/glsl/linker.cpp
 @@ -1365,24 +1365,13 @@ link_fs_input_layout_qualifiers(struct 
 gl_shader_program *prog,
 *   If gl_FragCoord is redeclared in any fragment shader in a 
 program,
 *it must be redeclared in all the fragment shaders in that program
 *that have a static use gl_FragCoord.
 -   *
 -   * Exclude the case when one of the 'linked_shader' or 'shader' 
 redeclares
 -   * gl_FragCoord with no layout qualifiers but the other one doesn't
 -   * redeclare it. If we strictly follow GLSL 1.50 spec's language, it
 -   * should be a link error. But, generating link error for this case 
 will
 -   * be a wrong behaviour which spec didn't intend to do and it could 
 also
 -   * break some applications.
 */
       if ((linked_shader->redeclares_gl_fragcoord
            && !shader->redeclares_gl_fragcoord
 -          && shader->uses_gl_fragcoord
 -          && (linked_shader->origin_upper_left
 -              || linked_shader->pixel_center_integer))
 +          && shader->uses_gl_fragcoord)
           || (shader->redeclares_gl_fragcoord
               && !linked_shader->redeclares_gl_fragcoord
 -             && linked_shader->uses_gl_fragcoord
 -             && (shader->origin_upper_left
 -                 || shader->pixel_center_integer))) {
 +             && linked_shader->uses_gl_fragcoord)) {
          linker_error(prog, "fragment shader defined with conflicting "
                       "layout qualifiers for gl_FragCoord\n");
       }
 --
 1.9.3

 ___
 mesa-stable mailing list
 mesa-sta...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-stable


Re: [Mesa-dev] [PATCH] i965/fs: Print spills:fills and number of promoted constants.

2015-03-17 Thread Chris Forbes
With the fix Jason mentioned:

Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Wed, Mar 18, 2015 at 10:19 AM, Matt Turner <matts...@gmail.com> wrote:
 On Tue, Mar 17, 2015 at 2:15 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 On Tue, Mar 17, 2015 at 2:09 PM, Matt Turner matts...@gmail.com wrote:
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 index 8edb4d0..63dedae 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 @@ -1969,7 +1969,7 @@ brw_vs_emit(struct brw_context *brw,
}

        fs_generator g(brw, mem_ctx, (void *) &c->key, &prog_data->base.base,
 -                     &c->vp->program.Base, v.runtime_check_aads_emit, 
 "VS");
 +                     &c->vp->program.Base, v.runtime_check_aads_emit, 
 v.promoted_constants, "VS");

 Promoted constants and aads_emit need to be flipped around.  You got
 it right for FS.

 Thanks!

 Also, does this require any adaptations to shader-db or does it work as-is?

 Works as is, but shader-db report.py doesn't know about the new
 things. I'd like to add some switches to it like --spills or
 --compaction and have it print stats based on those.

 Other than that,
 Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 Thanks!


[Mesa-dev] [PATCH 2/2] i965/disasm: Fix format strings

2015-03-13 Thread Chris Forbes
Most of the brw_inst_* API returns 64-bit values. This fixes disassembly
of sampler messages, etc.

Signed-off-by: Chris Forbes <chr...@ijw.co.nz>
---
 src/mesa/drivers/dri/i965/brw_disasm.c | 48 +-
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
b/src/mesa/drivers/dri/i965/brw_disasm.c
index c92c534..c41dde2 100644
--- a/src/mesa/drivers/dri/i965/brw_disasm.c
+++ b/src/mesa/drivers/dri/i965/brw_disasm.c
@@ -729,7 +729,7 @@ dest(FILE *file, struct brw_context *brw, brw_inst *inst)
          if (err == -1)
             return 0;
          if (brw_inst_dst_da1_subreg_nr(brw, inst))
-            format(file, ".%d", brw_inst_dst_da1_subreg_nr(brw, inst) /
+            format(file, ".%ld", brw_inst_dst_da1_subreg_nr(brw, inst) /
                    reg_type_size[brw_inst_dst_reg_type(brw, inst)]);
          string(file, "<");
          err |= control(file, "horiz stride", horiz_stride,
@@ -740,7 +740,7 @@ dest(FILE *file, struct brw_context *brw, brw_inst *inst)
       } else {
          string(file, "g[a0");
          if (brw_inst_dst_ia_subreg_nr(brw, inst))
-            format(file, ".%d", brw_inst_dst_ia_subreg_nr(brw, inst) /
+            format(file, ".%ld", brw_inst_dst_ia_subreg_nr(brw, inst) /
                    reg_type_size[brw_inst_dst_reg_type(brw, inst)]);
          if (brw_inst_dst_ia1_addr_imm(brw, inst))
             format(file, " %d", brw_inst_dst_ia1_addr_imm(brw, inst));
@@ -758,7 +758,7 @@ dest(FILE *file, struct brw_context *brw, brw_inst *inst)
          if (err == -1)
             return 0;
          if (brw_inst_dst_da16_subreg_nr(brw, inst))
-            format(file, ".%d", brw_inst_dst_da16_subreg_nr(brw, inst) /
+            format(file, ".%ld", brw_inst_dst_da16_subreg_nr(brw, inst) /
                    reg_type_size[brw_inst_dst_reg_type(brw, inst)]);
          string(file, "<1>");
          err |= control(file, "writemask", writemask,
@@ -789,7 +789,7 @@ dest_3src(FILE *file, struct brw_context *brw, brw_inst 
*inst)
    if (err == -1)
       return 0;
    if (brw_inst_3src_dst_subreg_nr(brw, inst))
-      format(file, ".%d", brw_inst_3src_dst_subreg_nr(brw, inst));
+      format(file, ".%ld", brw_inst_3src_dst_subreg_nr(brw, inst));
    string(file, "<1>");
    err |= control(file, "writemask", writemask,
                   brw_inst_3src_dst_writemask(brw, inst), NULL);
@@ -1225,9 +1225,9 @@ brw_disassemble_inst(FILE *file, struct brw_context *brw, 
brw_inst *inst,
       string(file, "(");
       err |= control(file, "predicate inverse", pred_inv,
                      brw_inst_pred_inv(brw, inst), NULL);
-      format(file, "f%d", brw->gen >= 7 ? brw_inst_flag_reg_nr(brw, inst) : 0);
+      format(file, "f%ld", brw->gen >= 7 ? brw_inst_flag_reg_nr(brw, inst) : 
0);
       if (brw_inst_flag_subreg_nr(brw, inst))
-         format(file, ".%d", brw_inst_flag_subreg_nr(brw, inst));
+         format(file, ".%ld", brw_inst_flag_subreg_nr(brw, inst));
       if (brw_inst_access_mode(brw, inst) == BRW_ALIGN_1) {
          err |= control(file, "predicate control align1", pred_ctrl_align1,
                         brw_inst_pred_control(brw, inst), NULL);
@@ -1261,10 +1261,10 @@ brw_disassemble_inst(FILE *file, struct brw_context 
*brw, brw_inst *inst,
           (brw->gen < 6 || (opcode != BRW_OPCODE_SEL &&
                             opcode != BRW_OPCODE_IF &&
                             opcode != BRW_OPCODE_WHILE))) {
-         format(file, ".f%d",
+         format(file, ".f%ld",
                 brw->gen >= 7 ? brw_inst_flag_reg_nr(brw, inst) : 0);
          if (brw_inst_flag_subreg_nr(brw, inst))
-            format(file, ".%d", brw_inst_flag_subreg_nr(brw, inst));
+            format(file, ".%ld", brw_inst_flag_subreg_nr(brw, inst));
       }
    }
 
@@ -1276,7 +1276,7 @@ brw_disassemble_inst(FILE *file, struct brw_context *brw, 
brw_inst *inst,
    }
 
    if (opcode == BRW_OPCODE_SEND && brw->gen < 6)
-      format(file, " %d", brw_inst_base_mrf(brw, inst));
+      format(file, " %ld", brw_inst_base_mrf(brw, inst));
 
    if (has_uip(brw, opcode)) {
       /* Instructions that have UIP also have JIP. */
@@ -1297,7 +1297,7 @@ brw_disassemble_inst(FILE *file, struct brw_context *brw, 
brw_inst *inst,
       pad(file, 16);
       format(file, "Jump: %d", brw_inst_gen4_jump_count(brw, inst));
       pad(file, 32);
-      format(file, "Pop: %d", brw_inst_gen4_pop_count(brw, inst));
+      format(file, "Pop: %ld", brw_inst_gen4_pop_count(brw, inst));
    } else if (brw->gen < 6 && (opcode == BRW_OPCODE_IF ||
                               opcode == BRW_OPCODE_IFF ||
                               opcode == BRW_OPCODE_HALT)) {
@@ -1305,7 +1305,7 @@ brw_disassemble_inst(FILE *file, struct brw_context *brw, 
brw_inst *inst,
       format(file, "Jump: %d", brw_inst_gen4_jump_count(brw, inst));
    } else if (brw->gen < 6 && opcode == BRW_OPCODE_ENDIF) {
       pad(file, 16);
-      format(file, "Pop: %d", brw_inst_gen4_pop_count(brw, inst));
+      format(file, "Pop: %ld", brw_inst_gen4_pop_count
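
Editor's sketch: the patch above switches the conversion specifiers to %ld, which matches a 64-bit argument only on platforms where long is 64 bits wide. One portable alternative, not what this patch does, is the PRIu64 macro from <inttypes.h>. The accessor below is a hypothetical stand-in for a brw_inst_* getter, assumed here to return an unsigned 64-bit value.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical accessor standing in for a getter that returns 64 bits. */
static uint64_t
example_subreg_nr(void)
{
   return 3;
}

/* Writes ".<n>" into buf using the specifier that matches uint64_t on
 * every platform, and returns the snprintf() result. */
static int
format_subreg(char *buf, size_t size)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, ".%" PRIu64, example_subreg_nr());
}
```

Passing a 64-bit value to a %d conversion is undefined behavior in C, which is consistent with the garbled sampler-message disassembly the commit message mentions.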

[Mesa-dev] [PATCH 1/2] i965/disasm: Mark format() as being printf-style.

2015-03-13 Thread Chris Forbes
This allows us to get warnings from GCC when we mess up the format
strings.

Signed-off-by: Chris Forbes <chr...@ijw.co.nz>
---
 src/mesa/drivers/dri/i965/brw_disasm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
b/src/mesa/drivers/dri/i965/brw_disasm.c
index 863a6b3..c92c534 100644
--- a/src/mesa/drivers/dri/i965/brw_disasm.c
+++ b/src/mesa/drivers/dri/i965/brw_disasm.c
@@ -597,6 +597,9 @@ string(FILE *file, const char *string)
 }
 
 static int
+format(FILE *f, const char *format, ...) PRINTFLIKE(2, 3);
+
+static int
 format(FILE *f, const char *format, ...)
 {
char buf[1024];
-- 
2.2.2

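
Editor's sketch: a printf-style annotation of this kind is commonly implemented with the GCC/Clang format attribute. The macro and function names below are hypothetical and are not Mesa's PRINTFLIKE definition. The attribute sits on a separate prototype, as in the patch, because GCC does not accept an attribute placed after the declarator of a function definition.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macro: expands to the format attribute where supported. */
#if defined(__GNUC__)
#define EXAMPLE_PRINTFLIKE(fmt, args) __attribute__((format(printf, fmt, args)))
#else
#define EXAMPLE_PRINTFLIKE(fmt, args)
#endif

/* Argument 3 is the format string; variadic arguments start at 4. */
static int
format_to(char *out, size_t size, const char *fmt, ...) EXAMPLE_PRINTFLIKE(3, 4);

static int
format_to(char *out, size_t size, const char *fmt, ...)
{
   va_list args;
   int n;

   va_start(args, fmt);
   n = vsnprintf(out, size, fmt, args);
   va_end(args);
   return n;
}
```

With the attribute in place, a call such as format_to(buf, sizeof(buf), "%d", some_64bit_value) draws a -Wformat warning at compile time instead of printing garbage at run time.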


Re: [Mesa-dev] [PATCH] i965/gen6 gs: Convert brw_imm_ud/brw_imm_d to src_reg

2015-03-03 Thread Chris Forbes
Reviewed-by: Chris Forbes <chr...@ijw.co.nz>

On Wed, Mar 4, 2015 at 2:25 PM, Jordan Justen <jordan.l.jus...@intel.com> wrote:
 Same idea as this patch, only for gen6_gs_visitor:

 commit 49a938a265f5959c9b558995cc658f80acb6eb18
 Author: Jordan Justen jordan.l.jus...@intel.com
 Date:   Fri Feb 20 12:12:25 2015 -0800
 i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

 Suggested-by: Matt Turner matts...@gmail.com
 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
 ---
  src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp
 index 564b4cb..782687a 100644
 --- a/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp
 @@ -254,7 +254,7 @@ gen6_gs_visitor::visit(ir_end_primitive *)
         * vertex.
         */
        src_reg offset(this, glsl_type::uint_type);
 -      emit(ADD(dst_reg(offset), this->vertex_output_offset, brw_imm_d(-1)));
 +      emit(ADD(dst_reg(offset), this->vertex_output_offset, src_reg(-1)));

        src_reg dst(this->vertex_output);
        dst.reladdr = ralloc(mem_ctx, src_reg);
 @@ -384,7 +384,7 @@ gen6_gs_visitor::emit_thread_end()
                     dst_reg(this->temp), this->prim_count, this->svbi);
     } else {
        inst = emit(GS_OPCODE_FF_SYNC,
 -                  dst_reg(this->temp), this->prim_count, brw_imm_ud(0u));
 +                  dst_reg(this->temp), this->prim_count, src_reg(0u));
     }
     inst->base_mrf = base_mrf;

 @@ -487,8 +487,8 @@ gen6_gs_visitor::emit_thread_end()
        if (c->prog_data.gen6_xfb_enabled) {
           /* When emitting EOT, set SONumPrimsWritten Increment Value. */
           src_reg data(this, glsl_type::uint_type);
 -         emit(AND(dst_reg(data), this->sol_prim_written, brw_imm_ud(0xu)));
 -         emit(SHL(dst_reg(data), data, brw_imm_ud(16u)));
 +         emit(AND(dst_reg(data), this->sol_prim_written, src_reg(0xu)));
 +         emit(SHL(dst_reg(data), data, src_reg(16u)));
           emit(GS_OPCODE_SET_DWORD_2, dst_reg(MRF, base_mrf), data);
        }

 @@ -624,7 +624,7 @@ gen6_gs_visitor::xfb_write()
      * transform feedback is in interleaved or separate attribs mode.
      */
     src_reg sol_temp(this, glsl_type::uvec4_type);
 -   emit(ADD(dst_reg(sol_temp), this->svbi, brw_imm_ud(num_verts)));
 +   emit(ADD(dst_reg(sol_temp), this->svbi, src_reg(num_verts)));

     /* Compare SVBI calculated number with the maximum value, which is
      * in R1.4 (previously saved in this->max_svbi) for gen6.
 @@ -671,7 +671,7 @@ gen6_gs_visitor::xfb_program(unsigned vertex, unsigned 
 num_verts)
      * (all vertices). Otherwise, avoid writing any vertices for it
      */
     emit(ADD(dst_reg(sol_temp), this->sol_prim_written, 1u));
 -   emit(MUL(dst_reg(sol_temp), sol_temp, brw_imm_ud(num_verts)));
 +   emit(MUL(dst_reg(sol_temp), sol_temp, src_reg(num_verts)));
     emit(ADD(dst_reg(sol_temp), sol_temp, this->svbi));
     emit(CMP(dst_null_d(), sol_temp, this->max_svbi, BRW_CONDITIONAL_LE));
     emit(IF(BRW_PREDICATE_NORMAL));
 @@ -736,7 +736,7 @@ gen6_gs_visitor::xfb_program(unsigned vertex, unsigned 
 num_verts)
            */
           emit(ADD(dst_reg(this->destination_indices),
                    this->destination_indices,
 -                  brw_imm_ud(num_verts)));
 +                  src_reg(num_verts)));
           emit(ADD(dst_reg(this->sol_prim_written),
                    this->sol_prim_written, 1u));
        }
 --
 2.1.4



Re: [Mesa-dev] [PATCH] i965/gs: Check newly-generated GS-out VUE map against correct stage

2015-02-28 Thread Chris Forbes
Thanks Matt -- yes, for 10.5 as well.

I'll also note that this fixes:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=5

On Sat, Feb 28, 2015 at 8:48 PM, Matt Turner <matts...@gmail.com> wrote:
 On Fri, Feb 27, 2015 at 11:03 PM, Chris Forbes chr...@ijw.co.nz wrote:
 Previously, we compared our new GS-out VUE map to the existing *VS*-out
 VUE map, which is bogus.

 This would mostly manifest as redundant dirty flagging where the GS is
 in use but the VS and GS output layouts differ; but there is a scary
 case where we would fail to flag a GS-out layout change if it happened
 to match the VS-out layout.

 Signed-off-by: Chris Forbes chr...@ijw.co.nz
 Cc: 10.4 mesa-sta...@lists.freedesktop.org

 Presumably 10.5 as well?

 Reviewed-by: Matt Turner matts...@gmail.com


[Mesa-dev] [PATCH] i965/gs: Check newly-generated GS-out VUE map against correct stage

2015-02-27 Thread Chris Forbes
Previously, we compared our new GS-out VUE map to the existing *VS*-out
VUE map, which is bogus.

This would mostly manifest as redundant dirty flagging where the GS is
in use but the VS and GS output layouts differ; but there is a scary
case where we would fail to flag a GS-out layout change if it happened
to match the VS-out layout.

Signed-off-by: Chris Forbes <chr...@ijw.co.nz>
Cc: "10.4" <mesa-sta...@lists.freedesktop.org>
---
 src/mesa/drivers/dri/i965/brw_gs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
b/src/mesa/drivers/dri/i965/brw_gs.c
index 1fba76a..efcff09 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_gs.c
@@ -357,7 +357,7 @@ brw_upload_gs_prog(struct brw_context *brw)
    }
    brw->gs.base.prog_data = &brw->gs.prog_data->base.base;
 
-   if (memcmp(&brw->vs.prog_data->base.vue_map, &brw->vue_map_geom_out,
+   if (memcmp(&brw->gs.prog_data->base.vue_map, &brw->vue_map_geom_out,
               sizeof(brw->vue_map_geom_out)) != 0) {
       brw->vue_map_geom_out = brw->gs.prog_data->base.vue_map;
       brw->state.dirty.brw |= BRW_NEW_VUE_MAP_GEOM_OUT;
-- 
2.2.2

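
Editor's sketch: the fix follows the usual compare-then-flag pattern for cached state, where the freshly generated value has to be compared against the cache it is about to replace. The types and names below are hypothetical simplifications, not the driver's structures; the real VUE map carries more fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified output layout description. */
struct example_vue_map {
   int num_slots;
   int slot_to_varying[8];
};

struct example_context {
   struct example_vue_map vs_out;    /* layout produced by the VS */
   struct example_vue_map gs_out;    /* layout produced by the GS */
   struct example_vue_map geom_out;  /* cached layout leaving the geometry stages */
   bool geom_out_dirty;
};

/* Compares the new GS layout with the cached one and flags a change.
 * Comparing vs_out here instead would reintroduce the bug in the patch. */
static void
upload_gs_vue_map(struct example_context *ctx)
{
   assert(ctx != NULL);

   if (memcmp(&ctx->gs_out, &ctx->geom_out, sizeof(ctx->geom_out)) != 0) {
      ctx->geom_out = ctx->gs_out;
      ctx->geom_out_dirty = true;
   }
}
```

If the cache currently holds the VS layout and the GS produces a different one, a comparison against vs_out sees no difference and never sets the dirty flag, while the comparison against gs_out does.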
