Re: [Mesa-dev] [PATCH v2 1/6] i965/fs: add a helper function to create double immediates

2016-07-11 Thread Samuel Iglesias Gonsálvez


On 11/07/16 14:54, Kenneth Graunke wrote:
> On Monday, July 11, 2016 1:37:46 PM PDT Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga 
>>
>> Gen7 hardware does not support double immediates so these need
>> to be moved in 32-bit chunks to a regular vgrf instead. Instead
>> of doing this every time we need to create a DF immediate,
>> create a helper function that does the right thing depending
>> on the hardware generation.
>>
>> v2:
>> - Define setup_imm_df() as an independent function (Curro)
>> - Create a specific builder to get rid of some instruction field
>>   assignments (Curro).
>>
>> Signed-off-by: Samuel Iglesias Gonsálvez 
>> Reviewed-by: Kenneth Graunke 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.h   |  3 +++
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 37 
>> 
>>  2 files changed, 40 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
>> b/src/mesa/drivers/dri/i965/brw_fs.h
>> index 1f88f8f..d034573 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -512,3 +512,6 @@ void shuffle_64bit_data_for_32bit_write(const 
>> brw::fs_builder ,
>>  const fs_reg ,
>>  const fs_reg ,
>>  uint32_t components);
>> +fs_reg setup_imm_df(const struct brw_device_info *devinfo,
>> +const brw::fs_builder &bld,
>> +double v);
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 04ed42e..94c719b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -4547,3 +4547,40 @@ shuffle_64bit_data_for_32bit_write(const fs_builder 
>> ,
>>bld.MOV(offset(dst, bld, 2 * i + 1), subscript(component_i, dst.type, 
>> 1));
>> }
>>  }
>> +
>> +fs_reg
>> +setup_imm_df(const struct brw_device_info *devinfo, const fs_builder &bld, 
>> double v)
>> +{
> 
> If you like, you can just do:
> 
>const struct brw_device_info *devinfo = bld.shader->devinfo;
> 
> and avoid the extra parameter.  Either way is fine.
> 

Right, I am going to do it.

Thanks!

Sam

>> +   assert(devinfo->gen >= 7);
>> +
>> +   if (devinfo->gen >= 8)
>> +  return brw_imm_df(v);
>> +
>> +   /* gen7 does not support DF immediates, so we generate a 64-bit constant 
>> by
>> +* writing the low 32-bit of the constant to suboffset 0 of a VGRF and
>> +* the high 32-bit to suboffset 4 and then applying a stride of 0.
>> +*
>> +* Alternatively, we could also produce a normal VGRF (without stride 0)
>> +* by writing to all the channels in the VGRF, however, that would hit 
>> the
>> +* gen7 bug where we have to split writes that span more than 1 register
>> +* into instructions with a width of 4 (otherwise the write to the second
>> +* register written runs into an execmask hardware bug) which isn't very
>> +* nice.
>> +*/
>> +   union {
>> +  double d;
>> +  struct {
>> + uint32_t i1;
>> + uint32_t i2;
>> +  };
>> +   } di;
>> +
>> +   di.d = v;
>> +
>> +   const fs_builder ubld = bld.exec_all().group(1, 0);
>> +   const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
>> +   ubld.MOV(tmp, brw_imm_ud(di.i1));
>> +   ubld.MOV(horiz_offset(tmp, 1), brw_imm_ud(di.i2));
>> +
>> +   return component(retype(tmp, BRW_REGISTER_TYPE_DF), 0);
>> +}
>>
> 
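
For readers following the gen7 path in the patch above: the two MOVs write the low and high 32-bit words of the double separately. A minimal standalone sketch of that split, assuming an IEEE 754 binary64 `double`; `split_double` and `struct df_halves` are hypothetical names for illustration, not part of Mesa, and `memcpy` is used here instead of the anonymous union in the patch so the low/high assignment does not depend on host byte order:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, not Mesa code: the two 32-bit words of a
 * double, in the order the patch writes them (low word first). */
struct df_halves {
   uint32_t lo;   /* bits 0..31 of the 64-bit pattern  */
   uint32_t hi;   /* bits 32..63 of the 64-bit pattern */
};

static struct df_halves
split_double(double v)
{
   uint64_t bits;
   struct df_halves h;

   /* Copy the object representation; assumes double is 64 bits wide. */
   memcpy(&bits, &v, sizeof(bits));
   h.lo = (uint32_t)(bits & 0xffffffffu);
   h.hi = (uint32_t)(bits >> 32);
   return h;
}
```

For example, 1.0 has the bit pattern 0x3FF0000000000000, so the low word is 0 and the high word is 0x3FF00000.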



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] mesa from git fails to compile

2016-07-11 Thread Pali Rohár
On Monday 11 July 2016 10:11:30 Emil Velikov wrote:
> Sounds similar (the same?) as
> https://bugs.freedesktop.org/show_bug.cgi?id=89347.

The error output looks to be the same.

> Which version of mako do you have, can you give things a try with
> 0.8.0 or later ?

If you mean the mako Python module, then I have version 0.5.0. That build is
for Ubuntu precise, which has only that one version in its repository, see:
http://packages.ubuntu.com/precise/python-mako So I do not have a newer
version of mako...

-- 
Pali Rohár
pali.ro...@gmail.com


Re: [Mesa-dev] [PATCH RFC 1/1] r600, compute: Use vtx #3 for kernel arguments

2016-07-11 Thread Tom Stellard
On Sun, Jun 26, 2016 at 08:40:55PM -0400, Jan Vesely wrote:
> Both explicit and implicit.
> Using vtx 0 (as existing llvm code implies) does not work for dynamic offsets.
> 
> Signed-off-by: Jan Vesely 

I have no idea why vtx#3 works when vtx#0 doesn't; maybe add a comment
explaining why we are using vtx#3.

With that change:

Reviewed-by: Tom Stellard 

> ---
> Hi,
> 
> I ran into a problem where VTX_READ from a constant buffer would work only 
> for index 0. The LLVM code implied that it should work (or maybe they 
> considered constant offsets only), but I could not find confirmation either 
> way in the ISA docs.
> 
> Switching to vtx#3 fixed the problem, though I'm not sure if it's the right 
> solution.
> 
> thanks,
> Jan
> 
> 
>  src/gallium/drivers/r600/evergreen_compute.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
> b/src/gallium/drivers/r600/evergreen_compute.c
> index 7f9580c..b351cee 100644
> --- a/src/gallium/drivers/r600/evergreen_compute.c
> +++ b/src/gallium/drivers/r600/evergreen_compute.c
> @@ -369,6 +369,8 @@ static void evergreen_compute_upload_input(struct 
> pipe_context *ctx,
>   ctx->transfer_unmap(ctx, transfer);
>  
>   /* ID=0 is reserved for the parameters */
> + evergreen_cs_set_vertex_buffer(rctx, 3, 0,
> + (struct pipe_resource*)shader->kernel_param);
>   evergreen_cs_set_constant_buffer(rctx, 0, 0, input_size,
>   (struct pipe_resource*)shader->kernel_param);
>  }
> @@ -614,9 +616,9 @@ static void evergreen_set_compute_resources(struct 
> pipe_context *ctx,
>   start, count);
>  
>   for (unsigned i = 0; i < count; i++) {
> - /* The First three vertex buffers are reserved for parameters 
> and
> + /* The First four vertex buffers are reserved for parameters and
>* global buffers. */
> - unsigned vtx_id = 3 + i;
> + unsigned vtx_id = 4 + i;
>   if (resources[i]) {
>   struct r600_resource_global *buffer =
>   (struct r600_resource_global*)
> -- 
> 2.7.4
> 


Re: [Mesa-dev] [PATCH 24/24] RFC nir/algebraic: Optimize open-coded nir_binop_bfm

2016-07-11 Thread Matt Turner
On Wed, Jun 29, 2016 at 2:04 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> BFM is (((1u << a) - 1) << b).  Recognize a couple patterns that look
> like this, and replace them with BFM.
>
> NOTE: Using lower_bitfield_insert is definitely not the right way to
> flag this optimization... so, I'm looking for some advice as to what the
> right way is.

I guess we'll just need another flag to indicate the presence of BFM?
Maybe add a has_bfm flag to nir_shader_compiler_options. It would be
the first has_*, but I can't think of anything better.

And really, maybe has_flrp makes more sense than lower_flrp. The
inverse of the former certainly indicates you need to lower it.

That would get a Reviewed-by from me.
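
For reference, the open-coded pattern from the commit message, (((1u << a) - 1) << b), can be written as a small C function. This is only a sketch: `bfm` is a hypothetical helper name, and the handling of widths or offsets of 32 and above is an assumption made to keep the C well defined (shifting a 32-bit value by 32 or more is undefined in C); it says nothing about what a given GPU's BFM instruction does for such inputs.

```c
#include <stdint.h>

/* Hypothetical helper: build a mask of `bits` ones starting at bit
 * `offset`, i.e. ((1u << bits) - 1) << offset. */
static uint32_t
bfm(uint32_t bits, uint32_t offset)
{
   /* Out-of-range shift counts are handled explicitly; this choice is an
    * assumption of the sketch, not hardware behavior. */
   const uint32_t mask = bits >= 32 ? 0xffffffffu : (1u << bits) - 1u;

   return offset >= 32 ? 0u : mask << offset;
}
```

With this definition, bfm(4, 8) is 0x00000f00: four set bits starting at bit 8.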


Re: [Mesa-dev] [PATCH v2 16/24] i965: Use LZD to implement nir_op_ifind_msb on Gen < 7

2016-07-11 Thread Matt Turner
On Thu, Jul 7, 2016 at 10:16 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> v2: Retype LZD source as UD to avoid potential problems with 0x80000000.
> Suggested by Matt.  Also update comment about problem values with
> LZD(abs(x)).  Suggested by Curro.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 54 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 57 
> --
>  2 files changed, 90 insertions(+), 21 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 65f6406..93d5e9d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -623,8 +623,36 @@ emit_find_msb_using_lzd(const fs_builder &bld,
>  bool is_signed)
>  {
> fs_inst *inst;
> +   fs_reg temp = src;
>
> -   bld.LZD(retype(result, BRW_REGISTER_TYPE_UD), src);
> +   if (is_signed) {
> +  /* LZD of an absolute value source almost always does the right
> +   * thing.  There are two problem values:
> +   *
> +   * * 0x80000000.  Since abs(0x80000000) == 0x80000000, LZD returns
> +   *   0.  However, findMSB(int(0x80000000)) == 30.
> +   *
> +   * * 0xffffffff.  Since abs(0xffffffff) == 1, LZD returns
> +   *   31.  Section 8.8 (Integer Functions) of the GLSL 4.50 spec says:
> +   *
> +   *For a value of zero or negative one, -1 will be returned.
> +   *
> +   * * Negative powers of two.  LZD(abs(-(1<<x))) [...]
> +   *   findMSB(-(1<<x)) [...]
> +   *

Might be nice to add these cases to the piglit tests.

15-17 are

Reviewed-by: Matt Turner 

That should be the whole series, minus the RFC at the end.
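
The problem values listed in the quoted comment can be checked on the CPU. The sketch below is not the i965 code: `lzd`, `find_msb_signed` and `find_msb_via_abs` are hypothetical helpers. The reference version follows the GLSL rule that, for negative inputs, findMSB reports the most significant 0 bit, and that zero and negative one give -1; inverting negative inputs before counting leading zeros is one way to get that on a CPU and is not claimed to be what the patch emits.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit value; 32 for an input of zero. */
static int
lzd(uint32_t v)
{
   int n = 0;

   if (v == 0)
      return 32;
   while (!(v & 0x80000000u)) {
      v <<= 1;
      n++;
   }
   return n;
}

/* Reference signed findMSB: negative inputs are inverted so that their
 * most significant 0 bit becomes the most significant 1 bit. */
static int
find_msb_signed(int32_t x)
{
   const uint32_t v = x < 0 ? ~(uint32_t)x : (uint32_t)x;

   return v == 0 ? -1 : 31 - lzd(v);
}

/* Naive variant: LZD of the magnitude.  The negation is done in
 * unsigned arithmetic so that INT32_MIN does not overflow. */
static int
find_msb_via_abs(int32_t x)
{
   const uint32_t u = (uint32_t)x;
   const uint32_t mag = x < 0 ? 0u - u : u;

   return mag == 0 ? -1 : 31 - lzd(mag);
}
```

The two functions agree for most inputs (for example 5 and -5 both give 2), but differ for INT32_MIN, for -1, and for negative powers of two such as -16, which are the cases that would be worth covering in piglit.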


Re: [Mesa-dev] [PATCH shader-db 2/3] Remove split-to-files.py

2016-07-11 Thread Nicolai Hähnle

Would you mind updating the README as well? With that, patches 2 & 3 are

Reviewed-by: Nicolai Hähnle 

Patch 1 is

Acked-by: Nicolai Hähnle 

On 11.07.2016 20:10, Marek Olšák wrote:

From: Marek Olšák 

Use MESA_SHADER_CAPTURE_PATH instead.
---
  split-to-files.py | 138 --
  1 file changed, 138 deletions(-)
  delete mode 100755 split-to-files.py

diff --git a/split-to-files.py b/split-to-files.py
deleted file mode 100755
index 721b2da..000
--- a/split-to-files.py
+++ /dev/null
@@ -1,138 +0,0 @@
-#!/usr/bin/env python3
-
-import re
-import os
-import argparse
-
-
-def parse_input(infile):
-shaders = dict()
-programs = dict()
-shadertuple = ("bad", 0)
-prognum = ""
-reading = False
-is_glsl = True
-
-for line in infile.splitlines():
-declmatch = re.match(
-r"GLSL (.*) shader (.*) source for linked program (.*):", line)
-arbmatch = re.match(
-r"ARB_([^_]*)_program source for program (.*):", line)
-if declmatch:
-shadertype = declmatch.group(1)
-shadernum = declmatch.group(2)
-prognum = declmatch.group(3)
-shadertuple = (shadertype, shadernum)
-
-# don't save driver-internal shaders.
-if prognum == "0":
-continue
-
-if prognum not in shaders:
-shaders[prognum] = dict()
-if shadertuple in shaders[prognum]:
-print("Warning: duplicate", shadertype, " shader ", shadernum,
-  "in program", prognum, "...tossing old shader.")
-shaders[prognum][shadertuple] = ''
-reading = True
-is_glsl = True
-print("Reading program {0} {1} shader {2}".format(
-prognum, shadertype, shadernum))
-elif arbmatch:
-shadertype = arbmatch.group(1)
-prognum = arbmatch.group(2)
-if prognum in programs:
-print("dupe!")
-exit(1)
-programs[prognum] = (shadertype, '')
-reading = True
-is_glsl = False
-print("Reading program {0} {1} shader".format(prognum, shadertype))
-elif re.match("GLSL IR for ", line):
-reading = False
-elif re.match("Mesa IR for ", line):
-reading = False
-elif re.match("GLSL source for ", line):
-reading = False
-elif reading:
-if is_glsl:
-shaders[prognum][shadertuple] += line + '\n'
-else:
-type, source = programs[prognum]
-programs[prognum] = (type, ''.join([source, line, '\n']))
-
-return (shaders, programs)
-
-
-def write_shader_test(filename, shaders):
-print("Writing {0}".format(filename))
-out = open(filename, 'w')
-
-min_version = 110
-for stage, num in shaders:
-shader = shaders[(stage, num)]
-m = re.match(r"^#version (\d\d\d)", shader)
-if m:
-version = int(m.group(1), 10)
-if version > min_version:
-min_version = version
-
-out.write("[require]\n")
-out.write("GLSL >= %.2f\n" % (min_version / 100.))
-out.write("\n")
-
-for stage, num in shaders:
-if stage == "vertex":
-out.write("[vertex shader]\n")
-elif stage == "fragment":
-out.write("[fragment shader]\n")
-elif stage == "geometry":
-out.write("[geometry shader]\n")
-elif stage == "tess ctrl" or stage == "tessellation control":
-out.write("[tessellation control shader]\n")
-elif stage == "tess eval" or stage == "tessellation evaluation":
-out.write("[tessellation evaluation shader]\n")
-else:
-assert False, stage
-out.write(shaders[(stage, num)])
-
-out.close()
-
-def write_arb_shader_test(filename, type, source):
-print("Writing {0}".format(filename))
-out = open(filename, 'w')
-out.write("[require]\n")
-out.write("GL_ARB_{0}_program\n".format(type))
-out.write("\n")
-out.write("[{0} program]\n".format(type))
-out.write(source)
-# INTEL_DEBUG won't output anything for ARB programs unless you draw
-out.write("\n[test]\ndraw rect -1 -1 1 2\n");
-out.close()
-
-def write_files(directory, shaders, programs):
-for prog in shaders:
-write_shader_test("{0}/{1}.shader_test".format(directory, prog),
-  shaders[prog])
-for prognum in programs:
-prog = programs[prognum]
-write_arb_shader_test("{0}/{1}p-{2}.shader_test".format(directory,
-prog[0][0], prognum), prog[0], prog[1])
-
-def main():
-parser = argparse.ArgumentParser()
-parser.add_argument('appname', help='Output directory (application name)')
-parser.add_argument('mesadebug', 

Re: [Mesa-dev] [PATCH] nvc0: use a define for the driver constant buffer size

2016-07-11 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Jul 11, 2016 at 4:25 PM, Samuel Pitoiset
 wrote:
> This might avoid mistakes if the size is bumped in the future.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_compute.c| 8 
>  src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 4 ++--
>  src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 2 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 8 
>  src/gallium/drivers/nouveau/nvc0/nvc0_tex.c| 6 +++---
>  src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c| 4 ++--
>  7 files changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
> index 10a4c83..dc4d1b3 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
> @@ -115,7 +115,7 @@ nvc0_screen_compute_setup(struct nvc0_screen *screen,
>
> /* MS sample coordinate offsets */
> BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -   PUSH_DATA (push, 2048);
> +   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
> PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
> PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
> BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 2 * 8);
> @@ -253,7 +253,7 @@ nvc0_compute_validate_driverconst(struct nvc0_context 
> *nvc0)
> struct nvc0_screen *screen = nvc0->screen;
>
> BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -   PUSH_DATA (push, 2048);
> +   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
> PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
> PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
> BEGIN_NVC0(push, NVC0_CP(CB_BIND), 1);
> @@ -271,7 +271,7 @@ nvc0_compute_validate_buffers(struct nvc0_context *nvc0)
> int i;
>
> BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -   PUSH_DATA (push, 2048);
> +   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
> PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s));
> PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s));
> BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 4 * NVC0_MAX_BUFFERS);
> @@ -406,7 +406,7 @@ nvc0_compute_upload_input(struct nvc0_context *nvc0,
> }
>
> BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -   PUSH_DATA (push, 2048);
> +   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
> PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
> PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> index f6d535a..7acd477 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> @@ -104,7 +104,7 @@
>  #define NVC0_CB_USR_SIZE(6 << 16)
>  /* 6 driver constbuts, at 2K each */
>  #define NVC0_CB_AUX_INFO(s) NVC0_CB_USR_SIZE + (s << 11)
> -#define NVC0_CB_AUX_SIZE(6 << 11)
> +#define NVC0_CB_AUX_SIZE(1 << 11)
>  /* XXX: Figure out what this UNK data is. */
>  #define NVC0_CB_AUX_UNK_INFO0x000
>  #define NVC0_CB_AUX_UNK_SIZE(8 * 4)
> @@ -138,7 +138,7 @@
>  #define NVC0_CB_AUX_MP_INFO 0x600
>  #define NVC0_CB_AUX_MP_SIZE 3 * 4
>  /* 4 32-bits floats for the vertex runout, put at the end */
> -#define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + NVC0_CB_AUX_SIZE
> +#define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + (NVC0_CB_AUX_SIZE * 6)
>
>  struct nvc0_blitctx;
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> index 27cbbc4..944349d 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> @@ -1836,7 +1836,7 @@ nvc0_hw_sm_upload_input(struct nvc0_context *nvc0, 
> struct nvc0_hw_query *hq)
>PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 1));
> } else {
>BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -  PUSH_DATA (push, 2048);
> +  PUSH_DATA (push, NVC0_CB_AUX_SIZE);
>PUSH_DATAh(push, address);
>PUSH_DATA (push, address);
>BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 3);
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> index e0bfd3b..d22150a 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> @@ -960,7 +960,7 @@ nvc0_screen_create(struct nouveau_device *dev)
>/* TIC and TSC entries for each unit (nve4+ only) */
>/* auxiliary constants (6 user clip planes, base instance id) */
>BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
> -  

[Mesa-dev] [PATCH] nvc0: use a define for the driver constant buffer size

2016-07-11 Thread Samuel Pitoiset
This might avoid mistakes if the size is bumped in the future.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_compute.c| 8 
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 4 ++--
 src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 8 
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c| 6 +++---
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c| 4 ++--
 7 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
index 10a4c83..dc4d1b3 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
@@ -115,7 +115,7 @@ nvc0_screen_compute_setup(struct nvc0_screen *screen,
 
/* MS sample coordinate offsets */
BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-   PUSH_DATA (push, 2048);
+   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 2 * 8);
@@ -253,7 +253,7 @@ nvc0_compute_validate_driverconst(struct nvc0_context *nvc0)
struct nvc0_screen *screen = nvc0->screen;
 
BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-   PUSH_DATA (push, 2048);
+   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
BEGIN_NVC0(push, NVC0_CP(CB_BIND), 1);
@@ -271,7 +271,7 @@ nvc0_compute_validate_buffers(struct nvc0_context *nvc0)
int i;
 
BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-   PUSH_DATA (push, 2048);
+   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s));
PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s));
BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 4 * NVC0_MAX_BUFFERS);
@@ -406,7 +406,7 @@ nvc0_compute_upload_input(struct nvc0_context *nvc0,
}
 
BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-   PUSH_DATA (push, 2048);
+   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
index f6d535a..7acd477 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -104,7 +104,7 @@
 #define NVC0_CB_USR_SIZE(6 << 16)
 /* 6 driver constbuts, at 2K each */
 #define NVC0_CB_AUX_INFO(s) NVC0_CB_USR_SIZE + (s << 11)
-#define NVC0_CB_AUX_SIZE(6 << 11)
+#define NVC0_CB_AUX_SIZE(1 << 11)
 /* XXX: Figure out what this UNK data is. */
 #define NVC0_CB_AUX_UNK_INFO0x000
 #define NVC0_CB_AUX_UNK_SIZE(8 * 4)
@@ -138,7 +138,7 @@
 #define NVC0_CB_AUX_MP_INFO 0x600
 #define NVC0_CB_AUX_MP_SIZE 3 * 4
 /* 4 32-bits floats for the vertex runout, put at the end */
-#define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + NVC0_CB_AUX_SIZE
+#define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + (NVC0_CB_AUX_SIZE * 6)
 
 struct nvc0_blitctx;
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
index 27cbbc4..944349d 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
@@ -1836,7 +1836,7 @@ nvc0_hw_sm_upload_input(struct nvc0_context *nvc0, struct 
nvc0_hw_query *hq)
   PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 1));
} else {
   BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-  PUSH_DATA (push, 2048);
+  PUSH_DATA (push, NVC0_CB_AUX_SIZE);
   PUSH_DATAh(push, address);
   PUSH_DATA (push, address);
   BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 3);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index e0bfd3b..d22150a 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -960,7 +960,7 @@ nvc0_screen_create(struct nouveau_device *dev)
   /* TIC and TSC entries for each unit (nve4+ only) */
   /* auxiliary constants (6 user clip planes, base instance id) */
   BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
-  PUSH_DATA (push, 2048);
+  PUSH_DATA (push, NVC0_CB_AUX_SIZE);
   PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(i));
   PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(i));
   BEGIN_NVC0(push, NVC0_3D(CB_BIND(i)), 1);
diff --git 
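
The layout arithmetic behind the nvc0_context.h hunk above can be checked in isolation. In the sketch below the macro names drop the NVC0_ prefix to make clear they are illustrative copies rather than the driver's definitions; the values are taken from the hunk, while `CB_AUX_COUNT`, the two helper functions and the extra parentheses are assumptions of the sketch.

```c
#include <stdint.h>

/* Illustrative copies of the values in the patch, not the driver macros. */
#define CB_USR_SIZE        (6u << 16)   /* space for user constant buffers */
#define CB_AUX_SIZE        (1u << 11)   /* one driver constbuf: 2048 bytes */
#define CB_AUX_COUNT       6u           /* "6 driver constbufs, at 2K each" */

/* Offset of driver constbuf slot s; fully parenthesized so the result
 * does not depend on the surrounding expression. */
#define CB_AUX_INFO(s)     (CB_USR_SIZE + ((uint32_t)(s) << 11))

/* Vertex runout data sits right after the last driver constbuf. */
#define CB_AUX_RUNOUT_INFO (CB_USR_SIZE + CB_AUX_SIZE * CB_AUX_COUNT)

static uint32_t
aux_offset(unsigned s)
{
   return CB_AUX_INFO(s);
}

static uint32_t
runout_offset(void)
{
   return CB_AUX_RUNOUT_INFO;
}
```

With the size defined per buffer, the runout offset equals the offset of a hypothetical seventh slot, which is the same value the old definition (6 << 11 added to the user size) produced.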

Re: [Mesa-dev] [PATCH] nvc0: fix the driver cb size when draw parameters are used

2016-07-11 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

A follow-up patch to replace all those 2048's with some #define would
be great :)

On Mon, Jul 11, 2016 at 3:26 PM, Samuel Pitoiset
 wrote:
> The size of the driver constant buffer for each stage should be 2048
> and not 512 because it has been increased recently for buffers/images.
> While we are at it, do the same change for indirect draws.
>
> This fixes all ARB_shader_draw_parameters tests on GM107.
>
> Signed-off-by: Samuel Pitoiset 
> Cc: 12.0 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
> index 4e40ff5..94274bc 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
> @@ -835,7 +835,7 @@ nvc0_draw_indirect(struct nvc0_context *nvc0, const 
> struct pipe_draw_info *info)
>
> /* Queue things up to let the macros write params to the driver constbuf 
> */
> BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
> -   PUSH_DATA (push, 512);
> +   PUSH_DATA (push, 2048);
> PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
> PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
> BEGIN_NVC0(push, NVC0_3D(CB_POS), 1);
> @@ -979,7 +979,7 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct 
> pipe_draw_info *info)
> if (nvc0->vertprog->vp.need_draw_parameters) {
>PUSH_SPACE(push, 9);
>BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
> -  PUSH_DATA (push, 512);
> +  PUSH_DATA (push, 2048);
>PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
>PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
>if (!info->indirect) {
> --
> 2.8.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nvc0: fix the driver cb size when draw parameters are used

2016-07-11 Thread Samuel Pitoiset
The size of the driver constant buffer for each stage should be 2048
and not 512 because it has been increased recently for buffers/images.
While we are at it, do the same change for indirect draws.

This fixes all ARB_shader_draw_parameters tests on GM107.

Signed-off-by: Samuel Pitoiset 
Cc: 12.0 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
index 4e40ff5..94274bc 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
@@ -835,7 +835,7 @@ nvc0_draw_indirect(struct nvc0_context *nvc0, const struct 
pipe_draw_info *info)
 
/* Queue things up to let the macros write params to the driver constbuf */
BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
-   PUSH_DATA (push, 512);
+   PUSH_DATA (push, 2048);
PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
BEGIN_NVC0(push, NVC0_3D(CB_POS), 1);
@@ -979,7 +979,7 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
if (nvc0->vertprog->vp.need_draw_parameters) {
   PUSH_SPACE(push, 9);
   BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
-  PUSH_DATA (push, 512);
+  PUSH_DATA (push, 2048);
   PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
   PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
   if (!info->indirect) {
-- 
2.8.0



Re: [Mesa-dev] [PATCH v2 3/6] i965/fs/gen7: split instructions that run into exec masking bugs

2016-07-11 Thread Francisco Jerez
Francisco Jerez  writes:

> Samuel Iglesias Gonsálvez  writes:
>
>> From: Iago Toral Quiroga 
>>
>> In fp64 we can produce code like this:
>>
>> mov(16) vgrf2<2>:UD, vgrf3<2>:UD
>>
>> That our simd lowering pass would typically split in instructions with a
>> width of 8, writing to two consecutive registers each. Unfortunately, gen7
>> hardware has a bug affecting execution masking and as a result, the
>> second GRF register write won't work properly. Curro verified this:
>>
>> "The problem is that pre-Gen8 EUs are hardwired to use the QtrCtrl+1
>>  (where QtrCtrl is the 8-bit quarter of the execution mask signals
>>  specified in the instruction control fields) for the second
>>  compressed half of any single-precision instruction (for
>>  double-precision instructions it's hardwired to use NibCtrl+1,
>>  at least on HSW), which means that the EU will apply the wrong
>>  execution controls for the second sequential GRF write if the number
>>  of channels per GRF is not exactly eight in single-precision mode (or
>>  four in double-float mode)."
>>
>> In practice, this means that we cannot write more than one
>> consecutive GRF in a single instruction if the number of channels
>> per GRF is not exactly eight in single-precision mode (or four
>> in double-float mode).
>>
>> This patch makes our SIMD lowering pass split this kind of instructions
>> so that the split versions only write to a single register. In the
>> example above this means that we split the write in 4 instructions, each
>> one writing 4 UD elements (width = 4) to a single register.
>>
>> v2 (Curro):
>>  - Make explicit that the thing about hardwiring NibCtrl+1 for the second
>>compressed half is known to happen in Haswell and the issue with IVB
>>might not be exactly the same.
>>  - Assign max_width instead of returning early so that we can handle
>>multiple restrictions affecting the same instruction.
>>  - Avoid division by 0 if the instruction does not write any registers.
>>  - Ignore instructions that have WE_all set.
>>  - Use the instruction execution type size instead of the dst type size.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 28 
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 2f473cc..4d57412 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -4691,6 +4691,34 @@ get_fpu_lowered_simd_width(const struct 
>> brw_device_info *devinfo,
>>  */
>> unsigned reg_count = inst->regs_written;
>>  
>
> You've put this right in the middle of one of my previous workarounds
> ;), can you move it down a bit more next to the "According to the IVB
> PRMs" line, or close to the end of the function?
>
>> +   /* Pre-Gen8 EUs are hardwired to use the QtrCtrl+1 (where QtrCtrl is
>> +* the 8-bit quarter of the execution mask signals specified in the
>> +* instruction control fields) for the second compressed half of any
>> +* single-precision instruction (for double-precision instructions
>> +* it's hardwired to use NibCtrl+1, at least on HSW), which means that
>> +* the EU will apply the wrong execution controls for the second
>> +* sequential GRF write if the number of channels per GRF is not exactly
>> +* eight in single-precision mode (or four in double-float mode).
>> +*
>> +* In this situation we calculate the maximum size of the split
>> +* instructions so they only ever write to a single register.
>> +*/
>> +   if (devinfo->gen < 8 && inst->regs_written > 1 &&
>> +   !inst->force_writemask_all) {
>> +  unsigned channels_per_grf = inst->exec_size / inst->regs_written;
>
> Could be declared const.
>
>> +  unsigned exec_type_size = 0;
>> +  for (int i = 0; i < inst->sources; i++) {
>> + if (inst->src[i].file == BAD_FILE)
>> +break;
>
> It wouldn't be right to break early if the instruction has any valid
> sources after a non-present one.  This should probably be:
>
> |   if (inst->src[i].file != BAD_FILE)
> |  exec_type_size = MAX2(exec_type_size, 
> type_sz(inst->src[i].type));
>
> instead.
>
>> + exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
>> +  }
>> +  assert(exec_type_size);
>> +
>> +  if (channels_per_grf != REG_SIZE / exec_type_size) {
>
> I think you really need to use (exec_type_size == 8 ? 4 : 8) instead of
> the RHS of this expression.  The hardware shifts exactly 8 channels per
> compressed half of the instruction regardless of the execution type,

(for execution types other than DF that is)

> so
> this formula would give you an incorrect answer for exec_type_size < 4.
>
>> + max_width = MIN2(max_width, channels_per_grf);
>> +  }
>
> Redundant braces.
>
>> +   }
>> +
>> for (unsigned i = 0; i < inst->sources; 
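
The width computation discussed in this review can be sketched as a standalone function. This is an illustration under stated assumptions, not the code that landed: `gen7_max_split_width` is a hypothetical name, the instruction is reduced to a few plain integers, and the expected channel count uses the (exec_type_size == 8 ? 4 : 8) form suggested above.

```c
#include <stdbool.h>

/* Hypothetical sketch of the pre-Gen8 restriction: if an instruction
 * writes more than one GRF and the number of channels per GRF is not
 * what the hardware assumes (4 for 64-bit execution types, 8 otherwise),
 * limit the split width so each piece writes a single register. */
static unsigned
gen7_max_split_width(unsigned exec_size, unsigned regs_written,
                     unsigned exec_type_size, bool force_writemask_all,
                     unsigned max_width)
{
   /* regs_written > 1 also guards the division below against zero. */
   if (regs_written > 1 && !force_writemask_all) {
      const unsigned channels_per_grf = exec_size / regs_written;
      const unsigned expected = exec_type_size == 8 ? 4 : 8;

      if (channels_per_grf != expected && channels_per_grf < max_width)
         max_width = channels_per_grf;
   }

   return max_width;
}
```

For the example in the commit message, mov(16) vgrf2<2>:UD writes 16 channels at a stride of two 4-byte elements, which is 128 bytes or four GRFs assuming 32-byte registers. That gives 4 channels per GRF instead of 8, so the width is limited to 4 and the write is split into four instructions, as the commit message describes.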

Re: [Mesa-dev] [PATCH v2 6/6] i965/fs: do d2x lowering before simd splitting

2016-07-11 Thread Francisco Jerez
Samuel Iglesias Gonsálvez  writes:

> So that we can have gen7 split large writes produced by this lowering pass.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 

Reviewed-by: Francisco Jerez 

> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4bf0ca2..d131106 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -5843,6 +5843,11 @@ fs_visitor::optimize()
>OPT(dead_code_eliminate);
> }
>  
> +   if (OPT(lower_d2x)) {
> +  OPT(opt_copy_propagate);
> +  OPT(dead_code_eliminate);
> +   }
> +
> OPT(lower_simd_width);
>  
> /* After SIMD lowering just in case we had to unroll the EOT send. */
> @@ -5879,11 +5884,6 @@ fs_visitor::optimize()
>OPT(dead_code_eliminate);
> }
>  
> -   if (OPT(lower_d2x)) {
> -  OPT(opt_copy_propagate);
> -  OPT(dead_code_eliminate);
> -   }
> -
> OPT(opt_combine_constants);
> OPT(lower_integer_multiplication);
>  
> -- 
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH v2 3/6] i965/fs/gen7: split instructions that run into exec masking bugs

2016-07-11 Thread Francisco Jerez
Samuel Iglesias Gonsálvez  writes:

> From: Iago Toral Quiroga 
>
> In fp64 we can produce code like this:
>
> mov(16) vgrf2<2>:UD, vgrf3<2>:UD
>
> Our SIMD lowering pass would typically split that into instructions with a
> width of 8, writing to two consecutive registers each. Unfortunately, gen7
> hardware has a bug affecting execution masking and as a result, the
> second GRF register write won't work properly. Curro verified this:
>
> "The problem is that pre-Gen8 EUs are hardwired to use the QtrCtrl+1
>  (where QtrCtrl is the 8-bit quarter of the execution mask signals
>  specified in the instruction control fields) for the second
>  compressed half of any single-precision instruction (for
>  double-precision instructions it's hardwired to use NibCtrl+1,
>  at least on HSW), which means that the EU will apply the wrong
>  execution controls for the second sequential GRF write if the number
>  of channels per GRF is not exactly eight in single-precision mode (or
>  four in double-float mode)."
>
> In practice, this means that we cannot write more than one
> consecutive GRF in a single instruction if the number of channels
> per GRF is not exactly eight in single-precision mode (or four
> in double-float mode).
>
> This patch makes our SIMD lowering pass split instructions of this kind
> so that the split versions only write to a single register. In the
> example above this means that we split the write in 4 instructions, each
> one writing 4 UD elements (width = 4) to a single register.
>
> v2 (Curro):
>  - Make explicit that the hardwiring of NibCtrl+1 for the second
>    compressed half is known to happen on Haswell and that the issue on IVB
>    might not be exactly the same.
>  - Assign max_width instead of returning early so that we can handle
>    multiple restrictions affecting the same instruction.
>  - Avoid division by 0 if the instruction does not write any registers.
>  - Ignore instructions that have WE_all set.
>  - Use the instruction execution type size instead of the dst type size.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 28 
>  1 file changed, 28 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 2f473cc..4d57412 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -4691,6 +4691,34 @@ get_fpu_lowered_simd_width(const struct 
> brw_device_info *devinfo,
>  */
> unsigned reg_count = inst->regs_written;
>  

You've put this right in the middle of one of my previous workarounds
;), can you move it down a bit more next to the "According to the IVB
PRMs" line, or close to the end of the function?

> +   /* Pre-Gen8 EUs are hardwired to use the QtrCtrl+1 (where QtrCtrl is
> +* the 8-bit quarter of the execution mask signals specified in the
> +* instruction control fields) for the second compressed half of any
> +* single-precision instruction (for double-precision instructions
> +* it's hardwired to use NibCtrl+1, at least on HSW), which means that
> +* the EU will apply the wrong execution controls for the second
> +* sequential GRF write if the number of channels per GRF is not exactly
> +* eight in single-precision mode (or four in double-float mode).
> +*
> +* In this situation we calculate the maximum size of the split
> +* instructions so they only ever write to a single register.
> +*/
> +   if (devinfo->gen < 8 && inst->regs_written > 1 &&
> +   !inst->force_writemask_all) {
> +  unsigned channels_per_grf = inst->exec_size / inst->regs_written;

Could be declared const.

> +  unsigned exec_type_size = 0;
> +  for (int i = 0; i < inst->sources; i++) {
> + if (inst->src[i].file == BAD_FILE)
> +break;

It wouldn't be right to break early if the instruction has any valid
sources after a non-present one.  This should probably be:

|   if (inst->src[i].file != BAD_FILE)
|  exec_type_size = MAX2(exec_type_size, 
type_sz(inst->src[i].type));

instead.
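
As a rough, standalone illustration of the difference (this is not the
driver code: the enum, struct and function names below are made up for the
example, and type_size stands in for what type_sz() would return), a source
list with a hole in the middle gives two different answers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the i965 types; only what the loop needs. */
enum reg_file { BAD_FILE, VGRF, IMM };

struct src_reg {
   enum reg_file file;
   unsigned type_size;   /* bytes per channel, i.e. what type_sz() would give */
};

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Stops at the first absent source, as the loop in the patch does. */
static unsigned
exec_type_size_break(const struct src_reg *src, size_t n)
{
   unsigned size = 0;
   assert(src != NULL || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (src[i].file == BAD_FILE)
         break;
      size = MAX2(size, src[i].type_size);
   }
   return size;
}

/* Skips absent sources but keeps scanning, as suggested above. */
static unsigned
exec_type_size_skip(const struct src_reg *src, size_t n)
{
   unsigned size = 0;
   assert(src != NULL || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (src[i].file != BAD_FILE)
         size = MAX2(size, src[i].type_size);
   }
   return size;
}
```

With sources { 4-byte, absent, 8-byte } the first loop reports 4 and the
second reports 8, so only the second one would notice the 64-bit source.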

> + exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
> +  }
> +  assert(exec_type_size);
> +
> +  if (channels_per_grf != REG_SIZE / exec_type_size) {

I think you really need to use (exec_type_size == 8 ? 4 : 8) instead of
the RHS of this expression.  The hardware shifts exactly 8 channels per
compressed half of the instruction regardless of the execution type, so
this formula would give you an incorrect answer for exec_type_size < 4.
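
To make the arithmetic concrete, here is a small sketch comparing the two
right-hand sides. It assumes REG_SIZE is 32 bytes; the function names and the
clamp helper are invented for the example and only mirror the shape of the
check in the patch, they are not the driver code:

```c
#include <assert.h>

#define REG_SIZE 32   /* bytes per GRF; assumed to match the driver's value */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Expected channels per GRF as written in the patch. */
static unsigned
expected_channels_patch(unsigned exec_type_size)
{
   assert(exec_type_size);
   return REG_SIZE / exec_type_size;
}

/* Expected channels per GRF as suggested above: 4 for DF, 8 otherwise. */
static unsigned
expected_channels_review(unsigned exec_type_size)
{
   assert(exec_type_size);
   return exec_type_size == 8 ? 4 : 8;
}

/* Hypothetical helper: clamp max_width when an instruction that writes more
 * than one register does not have the expected number of channels per GRF.
 */
static unsigned
clamp_max_width(unsigned max_width, unsigned exec_size,
                unsigned regs_written, unsigned exec_type_size)
{
   if (regs_written > 1) {
      const unsigned channels_per_grf = exec_size / regs_written;
      if (channels_per_grf != expected_channels_review(exec_type_size))
         max_width = MIN2(max_width, channels_per_grf);
   }
   return max_width;
}
```

Both formulas agree for 4-byte and 8-byte types (8 and 4 channels), but for a
2-byte type REG_SIZE / exec_type_size gives 16 where the suggested expression
gives 8. Taking the strided mov(16) from the commit message as exec_size 16
writing 4 registers (an assumption about how it reaches this check),
clamp_max_width(16, 16, 4, 4) returns 4, which matches the width = 4 split
described there.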

> + max_width = MIN2(max_width, channels_per_grf);
> +  }

Redundant braces.

> +   }
> +
> for (unsigned i = 0; i < inst->sources; i++)
>reg_count = MAX2(reg_count, (unsigned)inst->regs_read(i));
>  
> -- 
> 2.7.4
>

Re: [Mesa-dev] [PATCH 1/7] glsl: Separate overlapping sentinel nodes in exec_list.

2016-07-11 Thread Matt Turner
On Fri, Jul 8, 2016 at 3:18 PM, Matt Turner  wrote:
> I do appreciate the cleverness, but unfortunately it prevents a lot more
> cleverness in the form of additional compiler optimizations brought on
> by -fstrict-aliasing.
>
> No difference in OglBatch7 (n=20).
>
> Co-authored-by: Davin McCall 
> ---
> I took Ian's suggestion to add get_head_raw() and get_tail_raw() methods
> and use them in place of head_sentinel.next and tail_sentinel.prev.
>
>  src/compiler/glsl/ast.h|   4 +-
>  src/compiler/glsl/ast_function.cpp |  22 +--
>  src/compiler/glsl/ast_to_hir.cpp   |   6 +-
>  src/compiler/glsl/ast_type.cpp |   2 +-
>  src/compiler/glsl/glsl_parser_extras.cpp   |   6 +-
>  src/compiler/glsl/ir.cpp   |   8 +-
>  src/compiler/glsl/ir_clone.cpp |   2 +-
>  src/compiler/glsl/ir_constant_expression.cpp   |   2 +-
>  src/compiler/glsl/ir_function.cpp  |  14 +-
>  src/compiler/glsl/ir_reader.cpp|   4 +-
>  src/compiler/glsl/ir_validate.cpp  |   4 +-
>  src/compiler/glsl/list.h   | 184 
> -
>  src/compiler/glsl/lower_distance.cpp   |   4 +-
>  src/compiler/glsl/lower_jumps.cpp  |   2 +-
>  src/compiler/glsl/lower_packed_varyings.cpp|   8 +-
>  src/compiler/glsl/lower_tess_level.cpp |   4 +-
>  src/compiler/glsl/opt_conditional_discard.cpp  |   6 +-
>  src/compiler/glsl/opt_dead_builtin_varyings.cpp|   2 +-
>  src/compiler/glsl/opt_dead_code.cpp|   2 +-
>  src/compiler/glsl/opt_flatten_nested_if_blocks.cpp |   2 +-
>  src/compiler/nir/nir.h |   4 +-
>  src/compiler/nir/nir_opt_gcm.c |   2 +-
>  src/mesa/drivers/dri/i965/brw_cfg.h|   2 +-
>  src/mesa/drivers/dri/i965/brw_fs_builder.h |   2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_builder.h   |   2 +-
>  25 files changed, 164 insertions(+), 136 deletions(-)
>
> diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
> index 06c7b03..fa5a731 100644
> --- a/src/compiler/glsl/ast.h
> +++ b/src/compiler/glsl/ast.h
> @@ -346,8 +346,8 @@ public:
>
> bool is_single_dimension() const
> {
> -  return this->array_dimensions.tail_pred->prev != NULL &&
> - this->array_dimensions.tail_pred->prev->is_head_sentinel();
> +  return this->array_dimensions.get_tail_raw()->prev != NULL &&
> + this->array_dimensions.get_tail_raw()->is_head_sentinel();

There's a missing ->prev on this line. Fixed locally, and passes Jenkins.
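
For anyone following along, a minimal C sketch of why the ->prev matters. It
assumes the post-patch layout with two separate sentinel nodes and that
is_head_sentinel() amounts to a prev == NULL test; the helpers below are
simplified stand-ins written for this mail, not the contents of list.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct exec_node {
   struct exec_node *next;
   struct exec_node *prev;
};

/* Assumed post-patch layout: two distinct sentinels instead of overlapping. */
struct exec_list {
   struct exec_node head_sentinel;
   struct exec_node tail_sentinel;
};

static void
exec_list_init(struct exec_list *list)
{
   list->head_sentinel.next = &list->tail_sentinel;
   list->head_sentinel.prev = NULL;
   list->tail_sentinel.next = NULL;
   list->tail_sentinel.prev = &list->head_sentinel;
}

static void
exec_list_push_tail(struct exec_list *list, struct exec_node *n)
{
   assert(n != NULL);
   n->next = &list->tail_sentinel;
   n->prev = list->tail_sentinel.prev;
   n->prev->next = n;
   list->tail_sentinel.prev = n;
}

/* Last node before the tail sentinel (the head sentinel if the list is empty). */
static struct exec_node *
get_tail_raw(struct exec_list *list)
{
   return list->tail_sentinel.prev;
}

static bool
is_head_sentinel(const struct exec_node *n)
{
   return n->prev == NULL;
}

/* With the ->prev: true only when exactly one node is on the list. */
static bool
is_single_fixed(struct exec_list *list)
{
   struct exec_node *tail = get_tail_raw(list);
   return tail->prev != NULL && is_head_sentinel(tail->prev);
}

/* Without the ->prev, as posted: the two conditions contradict each other. */
static bool
is_single_as_posted(struct exec_list *list)
{
   struct exec_node *tail = get_tail_raw(list);
   return tail->prev != NULL && is_head_sentinel(tail);
}
```

In this model the posted version tests tail->prev != NULL and
tail->prev == NULL on the same node, so it can never return true, whereas the
fixed version returns true for a one-element list only.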


Re: [Mesa-dev] [PATCH shader-db 1/3] Remove duplicated shaders across apps

2016-07-11 Thread Kenneth Graunke
On Monday, July 11, 2016 8:10:44 PM PDT Marek Olšák wrote:
> From: Marek Olšák 
> 
> $ fdupes -rdN .
> 
>[+] ./yofrankie/129.shader_test
>[-] ./yofrankie/126.shader_test
> 
>[+] ./yofrankie/123.shader_test
>[-] ./yofrankie/132.shader_test
> 
>[+] ./humus-volumetricfogging2/9.shader_test
>[-] ./humus-celshading/9.shader_test
> 
>[+] ./nexuiz/6.shader_test
>[-] ./humus-volumetricfogging2/12.shader_test
>[-] ./humus-domino/12.shader_test
>[-] ./yofrankie/6.shader_test
>[-] ./humus-celshading/12.shader_test

Series is:
Acked-by: Kenneth Graunke 

Also, thanks for the pointer about fdupes!  I hadn't heard of it and
instead had been using my own script:

http://whitecape.org/stuff/find-duplicates

Being able to point people at a distro-packaged program is a lot nicer.




[Mesa-dev] [PATCH shader-db 1/3] Remove duplicated shaders across apps

2016-07-11 Thread Marek Olšák
From: Marek Olšák 

$ fdupes -rdN .

   [+] ./yofrankie/129.shader_test
   [-] ./yofrankie/126.shader_test

   [+] ./yofrankie/123.shader_test
   [-] ./yofrankie/132.shader_test

   [+] ./humus-volumetricfogging2/9.shader_test
   [-] ./humus-celshading/9.shader_test

   [+] ./nexuiz/6.shader_test
   [-] ./humus-volumetricfogging2/12.shader_test
   [-] ./humus-domino/12.shader_test
   [-] ./yofrankie/6.shader_test
   [-] ./humus-celshading/12.shader_test
---
 shaders/humus-celshading/12.shader_test |   22 -
 shaders/humus-celshading/9.shader_test  |   17 -
 shaders/humus-domino/12.shader_test |   22 -
 shaders/humus-volumetricfogging2/12.shader_test |   22 -
 shaders/yofrankie/126.shader_test   | 1614 --
 shaders/yofrankie/132.shader_test   | 1619 ---
 shaders/yofrankie/6.shader_test |   22 -
 7 files changed, 3338 deletions(-)
 delete mode 100644 shaders/humus-celshading/12.shader_test
 delete mode 100644 shaders/humus-celshading/9.shader_test
 delete mode 100644 shaders/humus-domino/12.shader_test
 delete mode 100644 shaders/humus-volumetricfogging2/12.shader_test
 delete mode 100644 shaders/yofrankie/126.shader_test
 delete mode 100644 shaders/yofrankie/132.shader_test
 delete mode 100644 shaders/yofrankie/6.shader_test

diff --git a/shaders/humus-celshading/12.shader_test 
b/shaders/humus-celshading/12.shader_test
deleted file mode 100644
index 8b913af..000
--- a/shaders/humus-celshading/12.shader_test
+++ /dev/null
@@ -1,22 +0,0 @@
-[require]
-GLSL >= 1.30
-
-[fragment shader]
-#version 130
-uniform ivec4 color;
-out ivec4 out_color;
-
-void main()
-{
-   out_color = color;
-}
-
-[vertex shader]
-#version 130
-in vec4 position;
-void main()
-{
-   gl_Position = position;
-}
-
-
diff --git a/shaders/humus-celshading/9.shader_test 
b/shaders/humus-celshading/9.shader_test
deleted file mode 100644
index 30ba541..000
--- a/shaders/humus-celshading/9.shader_test
+++ /dev/null
@@ -1,17 +0,0 @@
-[require]
-GLSL >= 1.10
-
-[fragment shader]
-uniform vec4 color;
-void main()
-{
-   gl_FragColor = color;
-}
-
-[vertex shader]
-attribute vec4 position;
-void main()
-{
-   gl_Position = position;
-}
-
diff --git a/shaders/humus-domino/12.shader_test 
b/shaders/humus-domino/12.shader_test
deleted file mode 100644
index 8b913af..000
--- a/shaders/humus-domino/12.shader_test
+++ /dev/null
@@ -1,22 +0,0 @@
-[require]
-GLSL >= 1.30
-
-[fragment shader]
-#version 130
-uniform ivec4 color;
-out ivec4 out_color;
-
-void main()
-{
-   out_color = color;
-}
-
-[vertex shader]
-#version 130
-in vec4 position;
-void main()
-{
-   gl_Position = position;
-}
-
-
diff --git a/shaders/humus-volumetricfogging2/12.shader_test 
b/shaders/humus-volumetricfogging2/12.shader_test
deleted file mode 100644
index 8b913af..000
--- a/shaders/humus-volumetricfogging2/12.shader_test
+++ /dev/null
@@ -1,22 +0,0 @@
-[require]
-GLSL >= 1.30
-
-[fragment shader]
-#version 130
-uniform ivec4 color;
-out ivec4 out_color;
-
-void main()
-{
-   out_color = color;
-}
-
-[vertex shader]
-#version 130
-in vec4 position;
-void main()
-{
-   gl_Position = position;
-}
-
-
diff --git a/shaders/yofrankie/126.shader_test 
b/shaders/yofrankie/126.shader_test
deleted file mode 100644
index 236419b..000
--- a/shaders/yofrankie/126.shader_test
+++ /dev/null
@@ -1,1614 +0,0 @@
-[require]
-GLSL >= 1.10
-
-[fragment shader]
-
-float exp_blender(float f)
-{
-   return pow(2.71828182846, f);
-}
-
-void rgb_to_hsv(vec4 rgb, out vec4 outcol)
-{
-   float cmax, cmin, h, s, v, cdelta;
-   vec3 c;
-
-   cmax = max(rgb[0], max(rgb[1], rgb[2]));
-   cmin = min(rgb[0], min(rgb[1], rgb[2]));
-   cdelta = cmax-cmin;
-
-   v = cmax;
-   if (cmax!=0.0)
-   s = cdelta/cmax;
-   else {
-   s = 0.0;
-   h = 0.0;
-   }
-
-   if (s == 0.0) {
-   h = 0.0;
-   }
-   else {
-   c = (vec3(cmax, cmax, cmax) - rgb.xyz)/cdelta;
-
-   if (rgb.x==cmax) h = c[2] - c[1];
-   else if (rgb.y==cmax) h = 2.0 + c[0] -  c[2];
-   else h = 4.0 + c[1] - c[0];
-
-   h /= 6.0;
-
-   if (h<0.0)
-   h += 1.0;
-   }
-
-   outcol = vec4(h, s, v, rgb.w);
-}
-
-void hsv_to_rgb(vec4 hsv, out vec4 outcol)
-{
-   float i, f, p, q, t, h, s, v;
-   vec3 rgb;
-
-   h = hsv[0];
-   s = hsv[1];
-   v = hsv[2];
-
-   if(s==0.0) {
-   rgb = vec3(v, v, v);
-   }
-   else {
-   if(h==1.0)
-   h = 0.0;
-   
-   h *= 6.0;
-   i = floor(h);
-   f = h - i;
-   rgb = vec3(f, f, f);
-   p = v*(1.0-s);
-   q = v*(1.0-(s*f));
-   t = v*(1.0-(s*(1.0-f)));
-   
-   if (i == 0.0) rgb = vec3(v, t, 

[Mesa-dev] [PATCH shader-db 3/3] si-report.py: don't crash if there are no shaders found

2016-07-11 Thread Marek Olšák
From: Marek Olšák 

---
 si-report.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/si-report.py b/si-report.py
index c7fe1b5..69af89e 100755
--- a/si-report.py
+++ b/si-report.py
@@ -366,6 +366,9 @@ def compare_results(before_all_results, after_all_results):
 errors_names.append(name)
 
 print '{} shaders in {} tests'.format(num_shaders, num_tests)
+if num_shaders == 0:
+return
+
 print "Totals:"
 print_before_after_stats(total_before, total_after)
 print "Totals from affected shaders:"
-- 
2.7.4



[Mesa-dev] [PATCH shader-db 2/3] Remove split-to-files.py

2016-07-11 Thread Marek Olšák
From: Marek Olšák 

Use MESA_SHADER_CAPTURE_PATH instead.
---
 split-to-files.py | 138 --
 1 file changed, 138 deletions(-)
 delete mode 100755 split-to-files.py

diff --git a/split-to-files.py b/split-to-files.py
deleted file mode 100755
index 721b2da..000
--- a/split-to-files.py
+++ /dev/null
@@ -1,138 +0,0 @@
-#!/usr/bin/env python3
-
-import re
-import os
-import argparse
-
-
-def parse_input(infile):
-shaders = dict()
-programs = dict()
-shadertuple = ("bad", 0)
-prognum = ""
-reading = False
-is_glsl = True
-
-for line in infile.splitlines():
-declmatch = re.match(
-r"GLSL (.*) shader (.*) source for linked program (.*):", line)
-arbmatch = re.match(
-r"ARB_([^_]*)_program source for program (.*):", line)
-if declmatch:
-shadertype = declmatch.group(1)
-shadernum = declmatch.group(2)
-prognum = declmatch.group(3)
-shadertuple = (shadertype, shadernum)
-
-# don't save driver-internal shaders.
-if prognum == "0":
-continue
-
-if prognum not in shaders:
-shaders[prognum] = dict()
-if shadertuple in shaders[prognum]:
-print("Warning: duplicate", shadertype, " shader ", shadernum,
-  "in program", prognum, "...tossing old shader.")
-shaders[prognum][shadertuple] = ''
-reading = True
-is_glsl = True
-print("Reading program {0} {1} shader {2}".format(
-prognum, shadertype, shadernum))
-elif arbmatch:
-shadertype = arbmatch.group(1)
-prognum = arbmatch.group(2)
-if prognum in programs:
-print("dupe!")
-exit(1)
-programs[prognum] = (shadertype, '')
-reading = True
-is_glsl = False
-print("Reading program {0} {1} shader".format(prognum, shadertype))
-elif re.match("GLSL IR for ", line):
-reading = False
-elif re.match("Mesa IR for ", line):
-reading = False
-elif re.match("GLSL source for ", line):
-reading = False
-elif reading:
-if is_glsl:
-shaders[prognum][shadertuple] += line + '\n'
-else:
-type, source = programs[prognum]
-programs[prognum] = (type, ''.join([source, line, '\n']))
-
-return (shaders, programs)
-
-
-def write_shader_test(filename, shaders):
-print("Writing {0}".format(filename))
-out = open(filename, 'w')
-
-min_version = 110
-for stage, num in shaders:
-shader = shaders[(stage, num)]
-m = re.match(r"^#version (\d\d\d)", shader)
-if m:
-version = int(m.group(1), 10)
-if version > min_version:
-min_version = version
-
-out.write("[require]\n")
-out.write("GLSL >= %.2f\n" % (min_version / 100.))
-out.write("\n")
-
-for stage, num in shaders:
-if stage == "vertex":
-out.write("[vertex shader]\n")
-elif stage == "fragment":
-out.write("[fragment shader]\n")
-elif stage == "geometry":
-out.write("[geometry shader]\n")
-elif stage == "tess ctrl" or stage == "tessellation control":
-out.write("[tessellation control shader]\n")
-elif stage == "tess eval" or stage == "tessellation evaluation":
-out.write("[tessellation evaluation shader]\n")
-else:
-assert False, stage
-out.write(shaders[(stage, num)])
-
-out.close()
-
-def write_arb_shader_test(filename, type, source):
-print("Writing {0}".format(filename))
-out = open(filename, 'w')
-out.write("[require]\n")
-out.write("GL_ARB_{0}_program\n".format(type))
-out.write("\n")
-out.write("[{0} program]\n".format(type))
-out.write(source)
-# INTEL_DEBUG won't output anything for ARB programs unless you draw
-out.write("\n[test]\ndraw rect -1 -1 1 2\n");
-out.close()
-
-def write_files(directory, shaders, programs):
-for prog in shaders:
-write_shader_test("{0}/{1}.shader_test".format(directory, prog),
-  shaders[prog])
-for prognum in programs:
-prog = programs[prognum]
-write_arb_shader_test("{0}/{1}p-{2}.shader_test".format(directory,
-prog[0][0], prognum), prog[0], prog[1])
-
-def main():
-parser = argparse.ArgumentParser()
-parser.add_argument('appname', help='Output directory (application name)')
-parser.add_argument('mesadebug', help='MESA_GLSL=dump output file')
-args = parser.parse_args()
-
-dirname = "shaders/{0}".format(args.appname)
-if not os.path.isdir(dirname):
-os.mkdir(dirname)
-
-with open(args.mesadebug, 'r') as infile:
-

Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast

2016-07-11 Thread Ilia Mirkin
On Mon, Jul 11, 2016 at 2:01 PM, Marek Olšák  wrote:
> On Mon, Jul 11, 2016 at 7:55 PM, Ilia Mirkin  wrote:
>> On Mon, Jul 11, 2016 at 1:48 PM, Marek Olšák  wrote:
>>> On Mon, Jul 11, 2016 at 7:31 PM, Ilia Mirkin  wrote:
>>>> On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák  wrote:
>>>>> From: Marek Olšák 
>>>>>
>>>>> This bug is uncovered by glsl/lower_if_to_cond_assign.
>>>>> I don't know if it can be reproduced in any other way.
>>>>>
>>>>> Cc: 
>>>>> ---
>>>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>>>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
>>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>> index 76656f5..0b7feb7 100644
>>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>> @@ -1958,12 +1958,14 @@ 
>>>>> glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>>>>>   emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>>>>>break;
>>>>> case ir_unop_bitcast_f2i:
>>>>> -  result_src = op[0];
>>>>> -  result_src.type = GLSL_TYPE_INT;
>>>>> -  break;
>>>>> case ir_unop_bitcast_f2u:
>>>>> -  result_src = op[0];
>>>>> -  result_src.type = GLSL_TYPE_UINT;
>>>>> +  /* Make sure we don't propagate the negate modifier to integer 
>>>>> opcodes. */
>>>>> +  if (op[0].negate)
>>>>
>>>> Or abs or saturate, presumably?
>>>
>>> glsl_to_tgsi doesn't use ureg_abs.
>>
>> I'd rather not rely on that... it's pretty cheap to throw in there.
>
> I can't throw abs in there because st_src_reg doesn't have abs. :)

Oh. I see. Then

Reviewed-by: Ilia Mirkin 

as is :) Sorry for being dense, I should have looked at the code but
was relying on my (obviously poor) memory.


Re: [Mesa-dev] [PATCH 2/2] anv/dump: Fix post-blit memory barrier

2016-07-11 Thread Jason Ekstrand
On Thu, Jul 7, 2016 at 8:12 PM, Jason Ekstrand  wrote:

> Drp...
>
> Reviewed-by: Jason Ekstrand 
>
I added my R-B to your two and pushed the lot of them.  Thanks for the
fixups!


> On Jul 7, 2016 4:06 PM, "Chad Versace"  wrote:
>
>> Swap srcAccessMask and dstAccessMask.
>> ---
>>  src/intel/vulkan/anv_dump.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/intel/vulkan/anv_dump.c b/src/intel/vulkan/anv_dump.c
>> index 49a5ae2..4a5a44f 100644
>> --- a/src/intel/vulkan/anv_dump.c
>> +++ b/src/intel/vulkan/anv_dump.c
>> @@ -158,8 +158,8 @@ dump_image_do_blit(struct anv_device *device, struct
>> dump_image *image,
>>0, 0, NULL, 0, NULL, 1,
>>&(VkImageMemoryBarrier) {
>>   .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
>> - .srcAccessMask = VK_ACCESS_HOST_READ_BIT,
>> - .dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
>> + .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
>> + .dstAccessMask = VK_ACCESS_HOST_READ_BIT,
>>   .oldLayout = VK_IMAGE_LAYOUT_GENERAL,
>>   .newLayout = VK_IMAGE_LAYOUT_GENERAL,
>>   .srcQueueFamilyIndex = 0,
>> --
>> 2.9.0.rc2
>>
>>


Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast

2016-07-11 Thread Marek Olšák
On Mon, Jul 11, 2016 at 7:55 PM, Ilia Mirkin  wrote:
> On Mon, Jul 11, 2016 at 1:48 PM, Marek Olšák  wrote:
>> On Mon, Jul 11, 2016 at 7:31 PM, Ilia Mirkin  wrote:
>>> On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák  wrote:
>>>> From: Marek Olšák 
>>>>
>>>> This bug is uncovered by glsl/lower_if_to_cond_assign.
>>>> I don't know if it can be reproduced in any other way.
>>>>
>>>> Cc: 
>>>> ---
>>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> index 76656f5..0b7feb7 100644
>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> @@ -1958,12 +1958,14 @@ 
>>>> glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>>>>   emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>>>>break;
>>>> case ir_unop_bitcast_f2i:
>>>> -  result_src = op[0];
>>>> -  result_src.type = GLSL_TYPE_INT;
>>>> -  break;
>>>> case ir_unop_bitcast_f2u:
>>>> -  result_src = op[0];
>>>> -  result_src.type = GLSL_TYPE_UINT;
>>>> +  /* Make sure we don't propagate the negate modifier to integer 
>>>> opcodes. */
>>>> +  if (op[0].negate)
>>>
>>> Or abs or saturate, presumably?
>>
>> glsl_to_tgsi doesn't use ureg_abs.
>
> I'd rather not rely on that... it's pretty cheap to throw in there.

I can't throw abs in there because st_src_reg doesn't have abs. :)

>
>>
>> saturate is a dst modifier and this patch operates on src operands.
>
> Er, right, of course. Ignore that.
>
> With abs thrown into the condition,
>
> Reviewed-by: Ilia Mirkin 

Thanks.

Marek


Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast

2016-07-11 Thread Ilia Mirkin
On Mon, Jul 11, 2016 at 1:48 PM, Marek Olšák  wrote:
> On Mon, Jul 11, 2016 at 7:31 PM, Ilia Mirkin  wrote:
>> On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák  wrote:
>>> From: Marek Olšák 
>>>
>>> This bug is uncovered by glsl/lower_if_to_cond_assign.
>>> I don't know if it can be reproduced in any other way.
>>>
>>> Cc: 
>>> ---
>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>> index 76656f5..0b7feb7 100644
>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>> @@ -1958,12 +1958,14 @@ 
>>> glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>>>   emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>>>break;
>>> case ir_unop_bitcast_f2i:
>>> -  result_src = op[0];
>>> -  result_src.type = GLSL_TYPE_INT;
>>> -  break;
>>> case ir_unop_bitcast_f2u:
>>> -  result_src = op[0];
>>> -  result_src.type = GLSL_TYPE_UINT;
>>> +  /* Make sure we don't propagate the negate modifier to integer 
>>> opcodes. */
>>> +  if (op[0].negate)
>>
>> Or abs or saturate, presumably?
>
> glsl_to_tgsi doesn't use ureg_abs.

I'd rather not rely on that... it's pretty cheap to throw in there.

>
> saturate is a dst modifier and this patch operates on src operands.

Er, right, of course. Ignore that.

With abs thrown into the condition,

Reviewed-by: Ilia Mirkin 


[Mesa-dev] [PATCH] mapi: Massage code to allow clang to compile.

2016-07-11 Thread Matt Turner
According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code
was violating the spec, resulting in it failing to compile.

Cc: mesa-sta...@lists.freedesktop.org
Co-authored-by: Tomasz Paweł Gajc 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599
---
I've tried for months to reproduce this, and I've still never been
able to on 64-bit builds. I can reproduce it on 32-bit however.

On MSVC, this patch will have the effect of changing the variables
from static to extern. I do not know if this will adversely affect
anything, so this patch would benefit from MSVC testing.
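
For reference, a reduced sketch of the declaration pattern the patch moves
to. My understanding is that a file-scope "static char x[];" is a tentative
definition with internal linkage and an incomplete type, which C does not
allow, while an extern declaration of an incomplete array is fine. The names
and the __GNUC__ check below are placeholders for this example (the patch
keys off the configure result instead), and the plain C definition at the
bottom stands in for the symbol that the __asm__ block provides in mapi:

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder feature test for the example only. */
#if defined(__GNUC__)
#define HIDDEN __attribute__((visibility("hidden")))
#else
#define HIDDEN
#endif

#define ENTRY_SIZE 32

/* "static const char entry_start[];" here is what clang rejects.
 * An extern declaration with hidden visibility keeps the symbol out of the
 * dynamic symbol table while leaving the type incomplete.
 */
extern const char entry_start[] HIDDEN;

/* Address of the stub for a given dispatch slot. */
static const char *
entry_address(int slot)
{
   assert(slot >= 0);
   return entry_start + (size_t)slot * ENTRY_SIZE;
}

/* Stand-in for the assembly-defined stubs: room for four entries. */
const char entry_start[4 * ENTRY_SIZE] = { 0 };
```

The point is only that the array is declared, not tentatively defined, before
its first use; whether hidden visibility is honoured depends on the toolchain.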

 configure.ac|  1 +
 src/mapi/entry_x86-64_tls.h |  9 +++--
 src/mapi/entry_x86_tls.h| 10 --
 src/mapi/entry_x86_tsd.h|  9 +++--
 4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/configure.ac b/configure.ac
index 367e9b5..ddd8e1d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -225,6 +225,7 @@ AX_GCC_FUNC_ATTRIBUTE([packed])
 AX_GCC_FUNC_ATTRIBUTE([pure])
 AX_GCC_FUNC_ATTRIBUTE([returns_nonnull])
 AX_GCC_FUNC_ATTRIBUTE([unused])
+AX_GCC_FUNC_ATTRIBUTE([visibility])
 AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
 AX_GCC_FUNC_ATTRIBUTE([weak])
 
diff --git a/src/mapi/entry_x86-64_tls.h b/src/mapi/entry_x86-64_tls.h
index 38faccc..c5262a1 100644
--- a/src/mapi/entry_x86-64_tls.h
+++ b/src/mapi/entry_x86-64_tls.h
@@ -25,6 +25,11 @@
  *Chia-I Wu 
  */
 
+#ifdef HAVE_FUNC_ATTRIBUTE_VISIBILITY
+#define HIDDEN __attribute__((visibility("hidden")))
+#else
+#define HIDDEN
+#endif
 
 __asm__(".text\n"
 ".balign 32\n"
@@ -54,8 +59,8 @@ entry_patch_public(void)
 {
 }
 
-static char
-x86_64_entry_start[];
+extern char
+x86_64_entry_start[] HIDDEN;
 
 mapi_func
 entry_get_public(int slot)
diff --git a/src/mapi/entry_x86_tls.h b/src/mapi/entry_x86_tls.h
index 46d2ece..231b409 100644
--- a/src/mapi/entry_x86_tls.h
+++ b/src/mapi/entry_x86_tls.h
@@ -27,6 +27,12 @@
 
 #include 
 
+#ifdef HAVE_FUNC_ATTRIBUTE_VISIBILITY
+#define HIDDEN __attribute__((visibility("hidden")))
+#else
+#define HIDDEN
+#endif
+
 __asm__(".text");
 
 __asm__("x86_current_tls:\n\t"
@@ -71,8 +77,8 @@ __asm__(".text");
 extern unsigned long
 x86_current_tls();
 
-static char x86_entry_start[];
-static char x86_entry_end[];
+extern char x86_entry_start[] HIDDEN;
+extern char x86_entry_end[] HIDDEN;
 
 void
 entry_patch_public(void)
diff --git a/src/mapi/entry_x86_tsd.h b/src/mapi/entry_x86_tsd.h
index ea7bacb..03d9735 100644
--- a/src/mapi/entry_x86_tsd.h
+++ b/src/mapi/entry_x86_tsd.h
@@ -25,6 +25,11 @@
  *Chia-I Wu 
  */
 
+#ifdef HAVE_FUNC_ATTRIBUTE_VISIBILITY
+#define HIDDEN __attribute__((visibility("hidden")))
+#else
+#define HIDDEN
+#endif
 
 #define X86_ENTRY_SIZE 32
 
@@ -58,8 +63,8 @@ __asm__(".balign 32\n"
 #include 
 #include "u_execmem.h"
 
-static const char x86_entry_start[];
-static const char x86_entry_end[];
+extern const char x86_entry_start[] HIDDEN;
+extern const char x86_entry_end[] HIDDEN;
 
 void
 entry_patch_public(void)
-- 
2.7.3



Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast

2016-07-11 Thread Marek Olšák
On Mon, Jul 11, 2016 at 7:31 PM, Ilia Mirkin  wrote:
> On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> This bug is uncovered by glsl/lower_if_to_cond_assign.
>> I don't know if it can be reproduced in any other way.
>>
>> Cc: 
>> ---
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> index 76656f5..0b7feb7 100644
>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> @@ -1958,12 +1958,14 @@ 
>> glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>>   emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>>break;
>> case ir_unop_bitcast_f2i:
>> -  result_src = op[0];
>> -  result_src.type = GLSL_TYPE_INT;
>> -  break;
>> case ir_unop_bitcast_f2u:
>> -  result_src = op[0];
>> -  result_src.type = GLSL_TYPE_UINT;
>> +  /* Make sure we don't propagate the negate modifier to integer 
>> opcodes. */
>> +  if (op[0].negate)
>
> Or abs or saturate, presumably?

glsl_to_tgsi doesn't use ureg_abs.

saturate is a dst modifier and this patch operates on src operands.
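
To spell out why the negate modifier must not survive the bitcast, here is a
small host-side sketch. It assumes IEEE 754 single-precision floats and that
an integer opcode would treat the modifier as a two's-complement negation;
the function names are made up for the example:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned integer. */
static uint32_t
bitcast_f2u(float f)
{
   uint32_t u;
   assert(sizeof u == sizeof f);
   memcpy(&u, &f, sizeof u);
   return u;
}

/* Negate applied while the value is still a float, then bitcast. This is
 * what the MOV in the patch materializes.
 */
static uint32_t
negate_then_bitcast(float f)
{
   return bitcast_f2u(-f);
}

/* Negate carried over to the integer side: two's-complement negation of the
 * bit pattern, done in unsigned arithmetic to avoid signed overflow.
 */
static uint32_t
bitcast_then_integer_negate(float f)
{
   return 0u - bitcast_f2u(f);
}
```

For 1.0f the first path yields 0xbf800000 (the bits of -1.0f) and the second
yields 0xc0800000, so letting the modifier ride along on the integer operand
changes the result.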

Marek


Re: [Mesa-dev] [Mesa-stable] [PATCH v2] mesa: etc2 online compression is unsupported, don't attempt it

2016-07-11 Thread Anuj Phogat
On Fri, Jul 8, 2016 at 5:29 PM, Ilia Mirkin  wrote:
> Signed-off-by: Ilia Mirkin 
> Cc: "11.2 12.0" 
> ---
>
> v1 -> v2: also include a mesa_is_etc2_format function which takes a GLenum.
>
>  src/mesa/main/glformats.c | 23 +++
>  src/mesa/main/glformats.h |  3 +++
>  src/mesa/main/teximage.c  |  1 +
>  3 files changed, 27 insertions(+)
>
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index 24ce7b0..90f525c 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -907,6 +907,29 @@ _mesa_is_astc_format(GLenum internalFormat)
>  }
>
>  /**
> + * Test if the given format is an ETC2 format.
> + */
> +GLboolean
> +_mesa_is_etc2_format(GLenum internalFormat)
> +{
> +   switch (internalFormat) {
> +   case GL_COMPRESSED_RGB8_ETC2:
> +   case GL_COMPRESSED_SRGB8_ETC2:
> +   case GL_COMPRESSED_RGBA8_ETC2_EAC:
> +   case GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC:
> +   case GL_COMPRESSED_R11_EAC:
> +   case GL_COMPRESSED_RG11_EAC:
> +   case GL_COMPRESSED_SIGNED_R11_EAC:
> +   case GL_COMPRESSED_SIGNED_RG11_EAC:
> +   case GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2:
> +   case GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2:
> +  return true;
> +   default:
> +  return false;
> +   }
> +}
> +
> +/**
>   * Test if the given format is an integer (non-normalized) format.
>   */
>  GLboolean
> diff --git a/src/mesa/main/glformats.h b/src/mesa/main/glformats.h
> index c73f464..474ede2 100644
> --- a/src/mesa/main/glformats.h
> +++ b/src/mesa/main/glformats.h
> @@ -61,6 +61,9 @@ extern GLboolean
>  _mesa_is_astc_format(GLenum internalFormat);
>
>  extern GLboolean
> +_mesa_is_etc2_format(GLenum internalFormat);
> +
> +extern GLboolean
>  _mesa_is_type_unsigned(GLenum type);
>
>  extern GLboolean
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index 26a6c21..81e46a1 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -1307,6 +1307,7 @@ bool
>  _mesa_format_no_online_compression(const struct gl_context *ctx, GLenum 
> format)
>  {
> return _mesa_is_astc_format(format) ||
> +  _mesa_is_etc2_format(format) ||
>compressedteximage_only_format(ctx, format);
>  }
>
> --
> 2.7.3
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable

Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast

2016-07-11 Thread Ilia Mirkin
On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> This bug is uncovered by glsl/lower_if_to_cond_assign.
> I don't know if it can be reproduced in any other way.
>
> Cc: 
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 76656f5..0b7feb7 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -1958,12 +1958,14 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* 
> ir, st_src_reg *op)
>   emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>break;
> case ir_unop_bitcast_f2i:
> -  result_src = op[0];
> -  result_src.type = GLSL_TYPE_INT;
> -  break;
> case ir_unop_bitcast_f2u:
> -  result_src = op[0];
> -  result_src.type = GLSL_TYPE_UINT;
> +  /* Make sure we don't propagate the negate modifier to integer opcodes. */
> +  if (op[0].negate)

Or abs or saturate, presumably?
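
For illustration, a minimal sketch of why a float source modifier cannot
simply be carried over to an integer opcode after a bitcast. It assumes
32-bit IEEE 754 floats; the helper names are hypothetical and are not part
of glsl_to_tgsi. Negating a float flips only the sign bit, while an integer
negate is a two's complement negation of the whole word, so the two orders
give different bit patterns:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned integer.
 * Hypothetical helper; assumes float is a 32-bit IEEE 754 value. */
static uint32_t float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

/* Intended order: apply the float negate first, then reinterpret. */
static uint32_t negate_then_bitcast(float f)
{
   return float_bits(-f);
}

/* Problematic order: reinterpret first, then let an integer
 * operation apply the negate as two's complement. */
static uint32_t bitcast_then_int_negate(float f)
{
   return 0u - float_bits(f);
}
```

Under that assumption, materializing the modifier with a MOV before
changing the type corresponds to negate_then_bitcast() above. The same
reasoning would presumably apply to an abs modifier.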

> + emit_asm(ir, TGSI_OPCODE_MOV, result_dst, op[0]);
> +  else
> + result_src = op[0];
> +  result_src.type = ir->operation == ir_unop_bitcast_f2i ? GLSL_TYPE_INT :
> +   GLSL_TYPE_UINT;
>break;
> case ir_unop_bitcast_i2f:
> case ir_unop_bitcast_u2f:
> --
> 2.7.4
>


[Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast

2016-07-11 Thread Marek Olšák
From: Marek Olšák 

This bug is uncovered by glsl/lower_if_to_cond_assign.
I don't know if it can be reproduced in any other way.

Cc: 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 76656f5..0b7feb7 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1958,12 +1958,14 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* 
ir, st_src_reg *op)
  emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
   break;
case ir_unop_bitcast_f2i:
-  result_src = op[0];
-  result_src.type = GLSL_TYPE_INT;
-  break;
case ir_unop_bitcast_f2u:
-  result_src = op[0];
-  result_src.type = GLSL_TYPE_UINT;
+  /* Make sure we don't propagate the negate modifier to integer opcodes. */
+  if (op[0].negate)
+ emit_asm(ir, TGSI_OPCODE_MOV, result_dst, op[0]);
+  else
+ result_src = op[0];
+  result_src.type = ir->operation == ir_unop_bitcast_f2i ? GLSL_TYPE_INT :
+   GLSL_TYPE_UINT;
   break;
case ir_unop_bitcast_i2f:
case ir_unop_bitcast_u2f:
-- 
2.7.4



Re: [Mesa-dev] [PATCH 10/14] isl: Add support for HiZ surfaces

2016-07-11 Thread Pohjolainen, Topi
On Sat, Jul 09, 2016 at 12:17:28PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/isl.c | 11 +++
>  src/intel/isl/isl.h | 17 +
>  src/intel/isl/isl_format_layout.csv |  1 +
>  src/intel/isl/isl_gen6.c|  8 
>  src/intel/isl/isl_gen7.c| 10 +-
>  src/intel/isl/isl_gen8.c|  3 ++-
>  6 files changed, 48 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 8c114a2..9ccdea2 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -167,6 +167,16 @@ isl_tiling_get_info(const struct isl_device *dev,
>break;
> }
>  
> +   case ISL_TILING_HIZ:
> +  /* HiZ buffers are required to have ISL_FORMAT_HIZ which is an 8x4
> +   * 128bpb format.  The tiling has the same physical dimensions as
> +   * Y-tiling but actually has two HiZ columns per Y-tiled column.
> +   */
> +  assert(bs == 16);
> +  logical_el = isl_extent2d(16, 16);
> +  phys_B = isl_extent2d(128, 32);
> +  break;
> +
> default:
>unreachable("not reached");
> } /* end switch */
> @@ -221,6 +231,7 @@ isl_surf_choose_tiling(const struct isl_device *dev,
>CHOOSE(ISL_TILING_LINEAR);
> }
>  
> +   CHOOSE(ISL_TILING_HIZ);
> CHOOSE(ISL_TILING_Ys);
> CHOOSE(ISL_TILING_Yf);
> CHOOSE(ISL_TILING_Y0);
> diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> index 85af2d1..9a60bbd 100644
> --- a/src/intel/isl/isl.h
> +++ b/src/intel/isl/isl.h
> @@ -345,6 +345,14 @@ enum isl_format {
> ISL_FORMAT_ASTC_LDR_2D_12X10_FLT16 =638,
> ISL_FORMAT_ASTC_LDR_2D_12X12_FLT16 =639,
>  
> +   /* The formats that follow are internal to ISL and as such don't have an
> +* explicit number.  We'll just let the C compiler assign it for us.  Any
> +* actual hardware formats *must* come before these in the list.
> +*/
> +
> +   /* Formats for color compression surfaces */
> +   ISL_FORMAT_HIZ,
> +
> /* Hardware doesn't understand this out-of-band value */
> ISL_FORMAT_UNSUPPORTED = UINT16_MAX,
>  };
> @@ -392,6 +400,9 @@ enum isl_txc {
> ISL_TXC_ETC1,
> ISL_TXC_ETC2,
> ISL_TXC_ASTC,
> +
> +   /* Used for auxiliary surface formats */
> +   ISL_TXC_HIZ,
>  };
>  
>  /**
> @@ -410,6 +421,7 @@ enum isl_tiling {
> ISL_TILING_Y0, /**< Legacy Y tiling */
> ISL_TILING_Yf, /**< Standard 4K tiling. The 'f' means "four". */
> ISL_TILING_Ys, /**< Standard 64K tiling. The 's' means "sixty-four". */
> +   ISL_TILING_HIZ, /**< Tiling format for HiZ surfaces */
>  };
>  
>  /**
> @@ -423,6 +435,7 @@ typedef uint32_t isl_tiling_flags_t;
>  #define ISL_TILING_Y0_BIT (1u << ISL_TILING_Y0)
>  #define ISL_TILING_Yf_BIT (1u << ISL_TILING_Yf)
>  #define ISL_TILING_Ys_BIT (1u << ISL_TILING_Ys)
> +#define ISL_TILING_HIZ_BIT(1u << ISL_TILING_HIZ)
>  #define ISL_TILING_ANY_MASK   (~0u)
>  #define ISL_TILING_NON_LINEAR_MASK(~ISL_TILING_LINEAR_BIT)
>  
> @@ -505,6 +518,7 @@ typedef uint64_t isl_surf_usage_flags_t;
>  #define ISL_SURF_USAGE_DISPLAY_FLIP_X_BIT  (1u << 10)
>  #define ISL_SURF_USAGE_DISPLAY_FLIP_Y_BIT  (1u << 11)
>  #define ISL_SURF_USAGE_STORAGE_BIT (1u << 12)
> +#define ISL_SURF_USAGE_HIZ_BIT (1u << 13)
>  /** @} */
>  
>  /**
> @@ -966,6 +980,9 @@ isl_format_has_bc_compression(enum isl_format fmt)
> case ISL_TXC_ETC2:
> case ISL_TXC_ASTC:
>return false;
> +
> +   case ISL_TXC_HIZ:
> +  unreachable("Should not be called on an aux surface");
> }
>  
> unreachable("bad texture compression mode");
> diff --git a/src/intel/isl/isl_format_layout.csv 
> b/src/intel/isl/isl_format_layout.csv
> index f90fbe0..3e681e8 100644
> --- a/src/intel/isl/isl_format_layout.csv
> +++ b/src/intel/isl/isl_format_layout.csv
> @@ -314,3 +314,4 @@ ASTC_LDR_2D_10X8_FLT16  , 128, 10,  8,  1, sf16, 
> sf16, sf16, sf16, ,
>  ASTC_LDR_2D_10X10_FLT16 , 128, 10, 10,  1, sf16, sf16, sf16, sf16, , 
> ,, linear,  astc
>  ASTC_LDR_2D_12X10_FLT16 , 128, 12, 10,  1, sf16, sf16, sf16, sf16, , 
> ,, linear,  astc
>  ASTC_LDR_2D_12X12_FLT16 , 128, 12, 12,  1, sf16, sf16, sf16, sf16, , 
> ,, linear,  astc
> +HIZ , 128,  8,  4,  1, , , , , , 
> ,,   ,   hiz
> diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c
> index 699aa41..b5050ed 100644
> --- a/src/intel/isl/isl_gen6.c
> +++ b/src/intel/isl/isl_gen6.c
> @@ -89,6 +89,14 @@ gen6_choose_image_alignment_el(const struct isl_device 
> *dev,
> enum isl_msaa_layout msaa_layout,
> struct isl_extent3d *image_align_el)
>  {
> +   if (info->format == ISL_FORMAT_HIZ) {
> +  /* HiZ surfaces are 

Re: [Mesa-dev] [PATCH 07/14] isl: Use bpb in a few places where it's more natural than bs

2016-07-11 Thread Jason Ekstrand
On Mon, Jul 11, 2016 at 8:37 AM, Pohjolainen, Topi <
topi.pohjolai...@intel.com> wrote:

> On Sat, Jul 09, 2016 at 12:17:24PM -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/isl/isl.c  | 2 +-
> >  src/intel/isl/isl_gen6.c | 2 +-
> >  src/intel/isl/isl_gen7.c | 2 +-
> >  src/intel/isl/isl_storage_image.c| 4 ++--
> >  src/intel/vulkan/anv_formats.c   | 4 ++--
> >  src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 4 ++--
> >  6 files changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index a3a9427..796b4cc 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -996,7 +996,7 @@ isl_apply_surface_padding(const struct isl_device
> *dev,
> >  *  padding requirements.
> >  */
> > if (isl_format_is_yuv(info->format) &&
> > -   (fmtl->bs == 96 || fmtl->bs == 48|| fmtl->bs == 24)) {
> > +   (fmtl->bpb == 96 || fmtl->bpb == 48|| fmtl->bpb == 24)) {
>
> So these values were bits instead of bytes even though stored into 'bs'?
> Or how does this work? In the rest you have multiplied by the eight.
>

We used to use bpb and then we switched to bs and these values were left in
bpb.  Now we're switching back so I'm leaving them alone again. :-)

In other words, the old code had a bug that this is fixing.  Since no one
uses ISL for YUV images yet, I didn't figure it was worth separating into
its own bugfix patch.
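
A minimal sketch of the mismatch, assuming a hypothetical stand-in struct
with just the two fields under discussion (this is not the real
isl_format_layout). With bs holding bytes, a 96 bpb format has bs == 12, so
the old comparison against 96, 48 and 24 never matched the formats it was
meant for:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields discussed here. */
struct fmt_layout {
   uint16_t bpb; /* bits per block */
   uint8_t bs;   /* block size in bytes, rounded towards 0 */
};

static struct fmt_layout make_layout(uint16_t bpb)
{
   return (struct fmt_layout) { .bpb = bpb, .bs = (uint8_t)(bpb / 8) };
}

/* Old check: compares a byte count against values meant as bits. */
static bool needs_yuv_padding_old(const struct fmt_layout *l)
{
   return l->bs == 96 || l->bs == 48 || l->bs == 24;
}

/* New check: compares bits against bits. */
static bool needs_yuv_padding_new(const struct fmt_layout *l)
{
   return l->bpb == 96 || l->bpb == 48 || l->bpb == 24;
}
```

In this sketch a 96 bpb layout is only caught by the new check, while a
192 bpb layout (bs == 24) would have been caught by the old one by accident.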


>
> >*total_h_el += 1;
> >*pad_bytes += 16;
> > }
> > diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c
> > index 24c3939..699aa41 100644
> > --- a/src/intel/isl/isl_gen6.c
> > +++ b/src/intel/isl/isl_gen6.c
> > @@ -51,7 +51,7 @@ gen6_choose_msaa_layout(const struct isl_device *dev,
> >  *   - any compressed texture format (BC*)
> >  *   - any YCRCB* format
> >  */
> > -   if (fmtl->bs > 8)
> > +   if (fmtl->bpb > 64)
> >return false;
> > if (isl_format_is_compressed(info->format))
> >return false;
> > diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> > index 542c137..d9b0c08 100644
> > --- a/src/intel/isl/isl_gen7.c
> > +++ b/src/intel/isl/isl_gen7.c
> > @@ -51,7 +51,7 @@ gen7_choose_msaa_layout(const struct isl_device *dev,
> >  *formats: any format with greater than 64 bits per element, any
> >  *compressed texture format (BC*), and any YCRCB* format.
> >  */
> > -   if (fmtl->bs > 8)
> > +   if (fmtl->bpb > 64)
> >return false;
> > if (isl_format_is_compressed(info->format))
> >return false;
> > diff --git a/src/intel/isl/isl_storage_image.c
> b/src/intel/isl/isl_storage_image.c
> > index 590d2e4..2617eb0e 100644
> > --- a/src/intel/isl/isl_storage_image.c
> > +++ b/src/intel/isl/isl_storage_image.c
> > @@ -194,9 +194,9 @@ isl_has_matching_typed_storage_image_format(const
> struct brw_device_info *devinf
> > if (devinfo->gen >= 9) {
> >return true;
> > } else if (devinfo->gen >= 8 || devinfo->is_haswell) {
> > -  return isl_format_get_layout(fmt)->bs <= 8;
> > +  return isl_format_get_layout(fmt)->bpb <= 64;
> > } else {
> > -  return isl_format_get_layout(fmt)->bs <= 4;
> > +  return isl_format_get_layout(fmt)->bpb <= 32;
> > }
> >  }
> >
> > diff --git a/src/intel/vulkan/anv_formats.c
> b/src/intel/vulkan/anv_formats.c
> > index 457e820..b26e48a 100644
> > --- a/src/intel/vulkan/anv_formats.c
> > +++ b/src/intel/vulkan/anv_formats.c
> > @@ -271,7 +271,7 @@ anv_get_format(const struct brw_device_info
> *devinfo, VkFormat vk_format,
> >isl_format_get_layout(format.isl_format);
> >
> > if (tiling == VK_IMAGE_TILING_OPTIMAL &&
> > -   !util_is_power_of_two(isl_layout->bs)) {
> > +   !util_is_power_of_two(isl_layout->bpb)) {
> >/* Tiled formats *must* be power-of-two because we need up upload
> > * them with the render pipeline.  For 3-channel formats, we fix
> > * this by switching them over to RGBX or RGBA formats under the
> > @@ -409,7 +409,7 @@ anv_physical_device_get_format_properties(struct
> anv_physical_device *physical_d
> > * what most clients will want.
> > */
> >if (linear_fmt.isl_format != ISL_FORMAT_UNSUPPORTED &&
> > -
> !util_is_power_of_two(isl_format_layouts[linear_fmt.isl_format].bs) &&
> > +
> !util_is_power_of_two(isl_format_layouts[linear_fmt.isl_format].bpb) &&
> >isl_format_rgb_to_rgbx(linear_fmt.isl_format) ==
> ISL_FORMAT_UNSUPPORTED) {
> >   tiled &= ~VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT &
> >~VK_FORMAT_FEATURE_BLIT_DST_BIT;
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> > index fc1fc13..a4774e6 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> > +++ 

Re: [Mesa-dev] [PATCH 09/14] isl: Kill off isl_format_layout::bs

2016-07-11 Thread Pohjolainen, Topi
On Sat, Jul 09, 2016 at 12:17:27PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/gen_format_layout.py |  1 -
>  src/intel/isl/isl.c| 11 ++-
>  src/intel/isl/isl.h|  5 ++---
>  src/intel/isl/isl_gen9.c   | 14 +++---
>  src/intel/isl/isl_storage_image.c  |  4 ++--
>  src/intel/vulkan/anv_image.c   |  4 ++--
>  src/intel/vulkan/anv_meta_copy.c   |  4 ++--
>  7 files changed, 21 insertions(+), 22 deletions(-)
> 
> diff --git a/src/intel/isl/gen_format_layout.py 
> b/src/intel/isl/gen_format_layout.py
> index 803967e..c9163fe 100644
> --- a/src/intel/isl/gen_format_layout.py
> +++ b/src/intel/isl/gen_format_layout.py
> @@ -68,7 +68,6 @@ TEMPLATE = template.Template(
>  .format = ISL_FORMAT_${format.name},
>  .name = "ISL_FORMAT_${format.name}",
>  .bpb = ${format.bpb},
> -.bs = ${format.bpb // 8},
>  .bw = ${format.bw},
>  .bh = ${format.bh},
>  .bd = ${format.bd},
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index e0e67e2..8c114a2 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -904,7 +904,8 @@ isl_calc_linear_row_pitch(const struct isl_device *dev,
>  *being used to determine whether additional pages need to be defined.
>  */
> assert(phys_slice0_sa->w % fmtl->bw == 0);
> -   row_pitch = MAX(row_pitch, fmtl->bs * (phys_slice0_sa->w / fmtl->bw));
> +   uint32_t bs = fmtl->bpb / 8;

Could be 'const'.

> +   row_pitch = MAX(row_pitch, bs * (phys_slice0_sa->w / fmtl->bw));
>  
> /* From the Broadwel PRM >> Volume 2d: Command Reference: Structures >>
>  * RENDER_SURFACE_STATE Surface Pitch (p349):
> @@ -922,9 +923,9 @@ isl_calc_linear_row_pitch(const struct isl_device *dev,
>  */
> if (info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) {
>if (isl_format_is_yuv(info->format)) {
> - row_pitch = isl_align_npot(row_pitch, 2 * fmtl->bs);
> + row_pitch = isl_align_npot(row_pitch, 2 * bs);
>} else  {
> - row_pitch = isl_align_npot(row_pitch, fmtl->bs);
> + row_pitch = isl_align_npot(row_pitch, bs);
>}
> }
>  
> @@ -1120,9 +1121,9 @@ isl_surf_init_s(const struct isl_device *dev,
>base_alignment = MAX(1, info->min_alignment);
>if (info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) {
>   if (isl_format_is_yuv(info->format)) {
> -base_alignment = MAX(base_alignment, 2 * fmtl->bs);
> +base_alignment = MAX(base_alignment, fmtl->bpb / 4);
>   } else {
> -base_alignment = MAX(base_alignment, fmtl->bs);
> +base_alignment = MAX(base_alignment, fmtl->bpb / 8);
>   }
>}
> } else {
> diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> index 50c8e80..85af2d1 100644
> --- a/src/intel/isl/isl.h
> +++ b/src/intel/isl/isl.h
> @@ -641,7 +641,6 @@ struct isl_format_layout {
>  
> uint16_t bpb; /**< bits per block */
>  
> -   uint8_t bs; /**< Block size, in bytes, rounded towards 0 */
> uint8_t bw; /**< Block width, in pixels */
> uint8_t bh; /**< Block height, in pixels */
> uint8_t bd; /**< Block depth, in pixels */
> @@ -1201,8 +1200,8 @@ isl_surf_get_row_pitch_el(const struct isl_surf *surf)
>  {
> const struct isl_format_layout *fmtl = 
> isl_format_get_layout(surf->format);
>  
> -   assert(surf->row_pitch % fmtl->bs == 0);
> -   return surf->row_pitch / fmtl->bs;
> +   assert(surf->row_pitch % (fmtl->bpb / 8) == 0);
> +   return surf->row_pitch / (fmtl->bpb / 8);
>  }
>  
>  /**
> diff --git a/src/intel/isl/isl_gen9.c b/src/intel/isl/isl_gen9.c
> index aa290aa..39f4092 100644
> --- a/src/intel/isl/isl_gen9.c
> +++ b/src/intel/isl/isl_gen9.c
> @@ -40,7 +40,7 @@ gen9_calc_std_image_alignment_sa(const struct isl_device 
> *dev,
>  
> assert(isl_tiling_is_std_y(tiling));
>  
> -   const uint32_t bs = fmtl->bs;
> +   const uint32_t bpb = fmtl->bpb;
> const uint32_t is_Ys = tiling == ISL_TILING_Ys;
>  
> switch (info->dim) {
> @@ -49,7 +49,7 @@ gen9_calc_std_image_alignment_sa(const struct isl_device 
> *dev,
> * Layout and Tiling > 1D Surfaces > 1D Alignment Requirements.
> */
>*align_sa = (struct isl_extent3d) {
> - .w = 1 << (12 - (ffs(bs) - 1) + (4 * is_Ys)),
> + .w = 1 << (12 - (ffs(bpb) - 4) + (4 * is_Ys)),
>   .h = 1,
>   .d = 1,
>};
> @@ -60,8 +60,8 @@ gen9_calc_std_image_alignment_sa(const struct isl_device 
> *dev,
> * Requirements.
> */
>*align_sa = (struct isl_extent3d) {
> - .w = 1 << (6 - ((ffs(bs) - 1) / 2) + (4 * is_Ys)),
> - .h = 1 << (6 - ((ffs(bs) - 0) / 2) + (4 * is_Ys)),
> + .w = 1 << (6 - ((ffs(bpb) - 4) / 2) + (4 * is_Ys)),
> + .h = 1 << (6 - ((ffs(bpb) - 3) / 2) + (4 * is_Ys)),
>   .d = 1,
>};
>  
> @@ -86,9 +86,9 @@ gen9_calc_std_image_alignment_sa(const struct isl_device 
> *dev,

Re: [Mesa-dev] [PATCH 07/14] isl: Use bpb in a few places where it's more natural than bs

2016-07-11 Thread Pohjolainen, Topi
On Sat, Jul 09, 2016 at 12:17:24PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/isl.c  | 2 +-
>  src/intel/isl/isl_gen6.c | 2 +-
>  src/intel/isl/isl_gen7.c | 2 +-
>  src/intel/isl/isl_storage_image.c| 4 ++--
>  src/intel/vulkan/anv_formats.c   | 4 ++--
>  src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 4 ++--
>  6 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index a3a9427..796b4cc 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -996,7 +996,7 @@ isl_apply_surface_padding(const struct isl_device *dev,
>  *  padding requirements.
>  */
> if (isl_format_is_yuv(info->format) &&
> -   (fmtl->bs == 96 || fmtl->bs == 48|| fmtl->bs == 24)) {
> +   (fmtl->bpb == 96 || fmtl->bpb == 48|| fmtl->bpb == 24)) {

So these values were bits instead of bytes even though stored into 'bs'?
Or how does this work? In the rest you have multiplied by eight.
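
To make the "multiplied by eight" part concrete, a small sketch with
hypothetical helper names, assuming bs is derived as bpb / 8 rounded towards
zero as in gen_format_layout.py. For block sizes that are a multiple of
8 bits the two forms of the threshold agree; they would only differ for a
size that is not a whole number of bytes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: block size in bytes, rounded towards 0. */
static uint32_t bytes_from_bits(uint32_t bpb)
{
   return bpb / 8;
}

/* Threshold written in bytes, as in the old code (bs > 8). */
static bool exceeds_limit_bs(uint32_t bpb)
{
   return bytes_from_bits(bpb) > 8;
}

/* Threshold written in bits, as in the new code (bpb > 64). */
static bool exceeds_limit_bpb(uint32_t bpb)
{
   return bpb > 64;
}
```

Whether any format with a non-byte-multiple block size actually reaches
these checks is not something this sketch tries to answer.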

>*total_h_el += 1;
>*pad_bytes += 16;
> }
> diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c
> index 24c3939..699aa41 100644
> --- a/src/intel/isl/isl_gen6.c
> +++ b/src/intel/isl/isl_gen6.c
> @@ -51,7 +51,7 @@ gen6_choose_msaa_layout(const struct isl_device *dev,
>  *   - any compressed texture format (BC*)
>  *   - any YCRCB* format
>  */
> -   if (fmtl->bs > 8)
> +   if (fmtl->bpb > 64)
>return false;
> if (isl_format_is_compressed(info->format))
>return false;
> diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> index 542c137..d9b0c08 100644
> --- a/src/intel/isl/isl_gen7.c
> +++ b/src/intel/isl/isl_gen7.c
> @@ -51,7 +51,7 @@ gen7_choose_msaa_layout(const struct isl_device *dev,
>  *formats: any format with greater than 64 bits per element, any
>  *compressed texture format (BC*), and any YCRCB* format.
>  */
> -   if (fmtl->bs > 8)
> +   if (fmtl->bpb > 64)
>return false;
> if (isl_format_is_compressed(info->format))
>return false;
> diff --git a/src/intel/isl/isl_storage_image.c 
> b/src/intel/isl/isl_storage_image.c
> index 590d2e4..2617eb0e 100644
> --- a/src/intel/isl/isl_storage_image.c
> +++ b/src/intel/isl/isl_storage_image.c
> @@ -194,9 +194,9 @@ isl_has_matching_typed_storage_image_format(const struct 
> brw_device_info *devinf
> if (devinfo->gen >= 9) {
>return true;
> } else if (devinfo->gen >= 8 || devinfo->is_haswell) {
> -  return isl_format_get_layout(fmt)->bs <= 8;
> +  return isl_format_get_layout(fmt)->bpb <= 64;
> } else {
> -  return isl_format_get_layout(fmt)->bs <= 4;
> +  return isl_format_get_layout(fmt)->bpb <= 32;
> }
>  }
>  
> diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
> index 457e820..b26e48a 100644
> --- a/src/intel/vulkan/anv_formats.c
> +++ b/src/intel/vulkan/anv_formats.c
> @@ -271,7 +271,7 @@ anv_get_format(const struct brw_device_info *devinfo, 
> VkFormat vk_format,
>isl_format_get_layout(format.isl_format);
>  
> if (tiling == VK_IMAGE_TILING_OPTIMAL &&
> -   !util_is_power_of_two(isl_layout->bs)) {
> +   !util_is_power_of_two(isl_layout->bpb)) {
>/* Tiled formats *must* be power-of-two because we need up upload
> * them with the render pipeline.  For 3-channel formats, we fix
> * this by switching them over to RGBX or RGBA formats under the
> @@ -409,7 +409,7 @@ anv_physical_device_get_format_properties(struct 
> anv_physical_device *physical_d
> * what most clients will want.
> */
>if (linear_fmt.isl_format != ISL_FORMAT_UNSUPPORTED &&
> -  
> !util_is_power_of_two(isl_format_layouts[linear_fmt.isl_format].bs) &&
> +  
> !util_is_power_of_two(isl_format_layouts[linear_fmt.isl_format].bpb) &&
>isl_format_rgb_to_rgbx(linear_fmt.isl_format) == 
> ISL_FORMAT_UNSUPPORTED) {
>   tiled &= ~VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT &
>~VK_FORMAT_FEATURE_BLIT_DST_BIT;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> index fc1fc13..a4774e6 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> @@ -982,7 +982,7 @@ namespace brw {
>  /* Untyped surface reads return 32 bits of the surface per
>   * component, without any sort of unpacking or type conversion,
>   */
> -const unsigned size = isl_format_get_layout(format)->bs / 4;
> +const unsigned size = isl_format_get_layout(format)->bpb / 32;
>  /* they don't properly handle out of bounds access, so we have to
>   * check manually if the coordinates are valid and predicate the
>  

Re: [Mesa-dev] [PATCH 04/14] isl: Rework the way we define tile sizes.

2016-07-11 Thread Pohjolainen, Topi
On Sat, Jul 09, 2016 at 12:17:21PM -0700, Jason Ekstrand wrote:
> This is based on a very long set of discussions between Chad and myself
> about how we should properly represent HiZ and CCS buffers.  The end result
> of that discussion was that a tiling actually has two different sizes, a
> logical size in elements, and a physical size in bytes and rows.  This
> commit reworks ISL's pitch and size calculations to work in terms of these
> two sizes.
> ---
>  src/intel/isl/isl.c | 181 ++--
>  src/intel/isl/isl.h |  32 +-
>  2 files changed, 133 insertions(+), 80 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 6f57ac2..633bfdf 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -111,30 +111,32 @@ isl_tiling_get_info(const struct isl_device *dev,
>  struct isl_tile_info *tile_info)
>  {
> const uint32_t bs = format_block_size;
> -   uint32_t width, height;
> +   struct isl_extent2d logical_el, phys_B;
>  
> assert(bs > 0);
> +   assert(tiling == ISL_TILING_LINEAR || isl_is_pow2(bs));
>  
> switch (tiling) {
> case ISL_TILING_LINEAR:
> -  width = 1;
> -  height = 1;
> +  logical_el = isl_extent2d(1, 1);
> +  phys_B = isl_extent2d(bs, 1);
>break;
>  
> case ISL_TILING_X:
> -  width = 1 << 9;
> -  height = 1 << 3;

Maybe:

 assert(bs < 512);

> +  logical_el = isl_extent2d(512 / bs, 8);
> +  phys_B = isl_extent2d(512, 8);
>break;
>  
> case ISL_TILING_Y0:
> -  width = 1 << 7;
> -  height = 1 << 5;

And:

 assert(bs < 128);
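
As a rough sketch of what such asserts would guard, assuming hypothetical
stand-ins for the extent types (not the real isl structs): the logical width
is the physical width in bytes divided by the block size, so a block size
larger than the tile width would truncate to zero elements. The bound here
is `<=`, a choice of this sketch; the strict `<` suggested above is simply
more conservative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the two extents introduced by the patch. */
struct extent2d {
   uint32_t w;
   uint32_t h;
};

struct tile_extents {
   struct extent2d logical_el; /* size in surface elements */
   struct extent2d phys_B;     /* width in bytes, height in rows */
};

/* Build both extents for a tile that is phys_w bytes wide and rows tall. */
static struct tile_extents
tile_extents(uint32_t phys_w, uint32_t rows, uint32_t bs)
{
   /* bs must be a power of two that fits in one tile row, otherwise
    * the division below would truncate. */
   assert(bs > 0 && bs <= phys_w && (bs & (bs - 1)) == 0);

   return (struct tile_extents) {
      .logical_el = { phys_w / bs, rows },
      .phys_B = { phys_w, rows },
   };
}

/* X tiling: 512 bytes by 8 rows. */
static struct tile_extents x_tile_extents(uint32_t bs)
{
   return tile_extents(512, 8, bs);
}

/* Legacy Y tiling: 128 bytes by 32 rows. */
static struct tile_extents y_tile_extents(uint32_t bs)
{
   return tile_extents(128, 32, bs);
}
```

For example, a 4-byte format would give an X tile of 128 x 8 elements and a
Y tile of 32 x 32 elements in this sketch.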

> +  logical_el = isl_extent2d(128 / bs, 32);
> +  phys_B = isl_extent2d(128, 32);
>break;
>  
> case ISL_TILING_W:
>/* XXX: Should W tile be same as Y? */
> -  width = 1 << 6;
> -  height = 1 << 6;
> +  assert(bs == 1);
> +  logical_el = isl_extent2d(64, 64);
> +  phys_B = isl_extent2d(64, 64);
>break;
>  
> case ISL_TILING_Yf:
> @@ -147,8 +149,11 @@ isl_tiling_get_info(const struct isl_device *dev,
>  
>bool is_Ys = tiling == ISL_TILING_Ys;
>  
> -  width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
> -  height = 1 << (6 - (ffs(bs) / 2) + (2 * is_Ys));
> +  unsigned width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
> +  unsigned height = 1 << (6 - (ffs(bs) / 2) + (2 * is_Ys));

These could be both 'const'.

> +
> +  logical_el = isl_extent2d(width / bs, height);
> +  phys_B = isl_extent2d(width, height);
>break;
> }
>  
> @@ -158,9 +163,8 @@ isl_tiling_get_info(const struct isl_device *dev,
>  
> *tile_info = (struct isl_tile_info) {
>.tiling = tiling,
> -  .width = width,
> -  .height = height,
> -  .size = width * height,
> +  .logical_extent_el = logical_el,
> +  .phys_extent_B = phys_B,
> };
>  
> return true;
> @@ -827,7 +831,7 @@ isl_calc_array_pitch_el_rows(const struct isl_device *dev,
> *Tile Mode != Linear: This field must be set to an integer 
> multiple
> *of the tile height
> */
> -  pitch_el_rows = isl_align(pitch_el_rows, tile_info->height);
> +  pitch_el_rows = isl_align(pitch_el_rows, 
> tile_info->logical_extent_el.height);

Looks like you are overflowing the line here.

> }
>  
> return pitch_el_rows;
> @@ -837,11 +841,9 @@ isl_calc_array_pitch_el_rows(const struct isl_device 
> *dev,
>   * Calculate the pitch of each surface row, in bytes.
>   */
>  static uint32_t
> -isl_calc_row_pitch(const struct isl_device *dev,
> -   const struct isl_surf_init_info *restrict info,
> -   const struct isl_tile_info *tile_info,
> -   const struct isl_extent3d *image_align_sa,
> -   const struct isl_extent2d *phys_slice0_sa)
> +isl_calc_linear_row_pitch(const struct isl_device *dev,
> +  const struct isl_surf_init_info *restrict info,
> +  const struct isl_extent2d *phys_slice0_sa)
>  {
> const struct isl_format_layout *fmtl = 
> isl_format_get_layout(info->format);
>  
> @@ -894,39 +896,26 @@ isl_calc_row_pitch(const struct isl_device *dev,
> assert(phys_slice0_sa->w % fmtl->bw == 0);
> row_pitch = MAX(row_pitch, fmtl->bs * (phys_slice0_sa->w / fmtl->bw));
>  
> -   switch (tile_info->tiling) {
> -   case ISL_TILING_LINEAR:
> -  /* From the Broadwel PRM >> Volume 2d: Command Reference: Structures >>
> -   * RENDER_SURFACE_STATE Surface Pitch (p349):
> -   *
> -   *- For linear render target surfaces and surfaces accessed with 
> the
> -   *  typed data port messages, the pitch must be a multiple of the
> -   *  element size for non-YUV surface formats.  Pitch must be
> -   *  a multiple of 2 * element size for YUV surface formats.
> -   *
> -   *- [Requirements for SURFTYPE_BUFFER and SURFTYPE_STRBUF, 

Re: [Mesa-dev] [PATCH 03/14] isl: Rework the way we handle surface padding

2016-07-11 Thread Jason Ekstrand
On Mon, Jul 11, 2016 at 8:04 AM, Pohjolainen, Topi <
topi.pohjolai...@intel.com> wrote:

> On Sat, Jul 09, 2016 at 12:17:20PM -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/isl/isl.c | 52 +---
> >  1 file changed, 25 insertions(+), 27 deletions(-)
> >
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index decba3d..6f57ac2 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -933,21 +933,21 @@ isl_calc_row_pitch(const struct isl_device *dev,
> >  }
> >
> >  /**
> > - * Calculate the surface's total height, including padding, in units of
> > - * surface elements.
> > + * Calculate and apply any padding required for the surface.
> > + *
> > + * @param[inout] total_h_el is updated with the new height
> > + * @param[out] pad_bytes is overwritten with additional padding
> requirements.
> >   */
> > -static uint32_t
> > -isl_calc_total_height_el(const struct isl_device *dev,
> > - const struct isl_surf_init_info *restrict info,
> > - const struct isl_tile_info *tile_info,
> > - uint32_t phys_array_len,
> > - uint32_t row_pitch,
> > - uint32_t array_pitch_el_rows)
> > +static void
> > +isl_apply_surface_padding(const struct isl_device *dev,
> > +  const struct isl_surf_init_info *restrict
> info,
> > +  const struct isl_tile_info *tile_info,
> > +  uint32_t *total_h_el,
> > +  uint32_t *pad_bytes)
> >  {
> > const struct isl_format_layout *fmtl =
> isl_format_get_layout(info->format);
> >
> > -   uint32_t total_h_el = phys_array_len * array_pitch_el_rows;
> > -   uint32_t pad_bytes = 0;
> > +   *pad_bytes = 0;
> >
> > /* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface
> >  * Formats >> Surface Padding Requirements >> Render Target and Media
> > @@ -981,14 +981,14 @@ isl_calc_total_height_el(const struct isl_device
> *dev,
> >  *  is to an even compressed row.
> >  */
> > if (isl_format_is_compressed(info->format))
> > -  total_h_el = isl_align(total_h_el, 2);
> > +  *total_h_el = isl_align(*total_h_el, 2);
> >
> > /*
> >  *- For cube surfaces, an additional two rows of padding are
> required
> >  *  at the bottom of the surface.
> >  */
> > if (info->usage & ISL_SURF_USAGE_CUBE_BIT)
> > -  total_h_el += 2;
> > +  *total_h_el += 2;
> >
> > /*
> >  *- For packed YUV, 96 bpt, 48 bpt, and 24 bpt surface formats,
> > @@ -998,8 +998,8 @@ isl_calc_total_height_el(const struct isl_device
> *dev,
> >  */
> > if (isl_format_is_yuv(info->format) &&
> > (fmtl->bs == 96 || fmtl->bs == 48|| fmtl->bs == 24)) {
> > -  total_h_el += 1;
> > -  pad_bytes += 16;
> > +  *total_h_el += 1;
> > +  *pad_bytes += 16;
> > }
> >
> > /*
> > @@ -1008,7 +1008,7 @@ isl_calc_total_height_el(const struct isl_device
> *dev,
> >  *  required above.
> >  */
> > if (tile_info->tiling == ISL_TILING_LINEAR)
> > -  pad_bytes += 64;
> > +  *pad_bytes += 64;
> >
> > /* The below text weakens, not strengthens, the padding requirements
> for
> >  * linear surfaces. Therefore we can safely ignore it.
> > @@ -1028,7 +1028,7 @@ isl_calc_total_height_el(const struct isl_device
> *dev,
> > if (ISL_DEV_GEN(dev) >= 9 &&
> > tile_info->tiling == ISL_TILING_LINEAR &&
> > (info->dim == ISL_SURF_DIM_2D || info->dim == ISL_SURF_DIM_3D)) {
> > -  total_h_el = isl_align(total_h_el, 4);
> > +  *total_h_el = isl_align(*total_h_el, 4);
> > }
> >
> > /*
> > @@ -1038,13 +1038,8 @@ isl_calc_total_height_el(const struct isl_device
> *dev,
> > if (ISL_DEV_GEN(dev) >= 9 &&
> > tile_info->tiling == ISL_TILING_LINEAR &&
> > info->dim == ISL_SURF_DIM_1D) {
> > -  total_h_el += 4;
> > +  *total_h_el += 4;
> > }
> > -
> > -   /* Be sloppy. Align any leftover padding to a row boundary. */
> > -   total_h_el += isl_align_div_npot(pad_bytes, row_pitch);
> > -
> > -   return total_h_el;
> >  }
> >
> >  bool
> > @@ -1108,10 +1103,13 @@ isl_surf_init_s(const struct isl_device *dev,
> > array_pitch_span, _align_sa,
> > _level0_sa, _slice0_sa);
> >
> > -   const uint32_t total_h_el =
> > -  isl_calc_total_height_el(dev, info, _info,
> > -   phys_level0_sa.array_len, row_pitch,
> > -   array_pitch_el_rows);
> > +   uint32_t total_h_el = phys_level0_sa.array_len * array_pitch_el_rows;
> > +
> > +   uint32_t pad_bytes;
> > +   isl_apply_surface_padding(dev, info, _info, _h_el,
> _bytes);
> > +
> > +   /* Be sloppy. Align any leftover padding to a row boundary. */
> > +   total_h_el += isl_align_div_npot(pad_bytes, row_pitch);
>

Re: [Mesa-dev] [PATCH 03/14] isl: Rework the way we handle surface padding

2016-07-11 Thread Pohjolainen, Topi
On Sat, Jul 09, 2016 at 12:17:20PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/isl.c | 52 +---
>  1 file changed, 25 insertions(+), 27 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index decba3d..6f57ac2 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -933,21 +933,21 @@ isl_calc_row_pitch(const struct isl_device *dev,
>  }
>  
>  /**
> - * Calculate the surface's total height, including padding, in units of
> - * surface elements.
> + * Calculate and apply any padding required for the surface.
> + *
> + * @param[inout] total_h_el is updated with the new height
> + * @param[out] pad_bytes is overwritten with additional padding requirements.
>   */
> -static uint32_t
> -isl_calc_total_height_el(const struct isl_device *dev,
> - const struct isl_surf_init_info *restrict info,
> - const struct isl_tile_info *tile_info,
> - uint32_t phys_array_len,
> - uint32_t row_pitch,
> - uint32_t array_pitch_el_rows)
> +static void
> +isl_apply_surface_padding(const struct isl_device *dev,
> +  const struct isl_surf_init_info *restrict info,
> +  const struct isl_tile_info *tile_info,
> +  uint32_t *total_h_el,
> +  uint32_t *pad_bytes)
>  {
> const struct isl_format_layout *fmtl = 
> isl_format_get_layout(info->format);
>  
> -   uint32_t total_h_el = phys_array_len * array_pitch_el_rows;
> -   uint32_t pad_bytes = 0;
> +   *pad_bytes = 0;
>  
> /* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface
>  * Formats >> Surface Padding Requirements >> Render Target and Media
> @@ -981,14 +981,14 @@ isl_calc_total_height_el(const struct isl_device *dev,
>  *  is to an even compressed row.
>  */
> if (isl_format_is_compressed(info->format))
> -  total_h_el = isl_align(total_h_el, 2);
> +  *total_h_el = isl_align(*total_h_el, 2);
>  
> /*
>  *- For cube surfaces, an additional two rows of padding are required
>  *  at the bottom of the surface.
>  */
> if (info->usage & ISL_SURF_USAGE_CUBE_BIT)
> -  total_h_el += 2;
> +  *total_h_el += 2;
>  
> /*
>  *- For packed YUV, 96 bpt, 48 bpt, and 24 bpt surface formats,
> @@ -998,8 +998,8 @@ isl_calc_total_height_el(const struct isl_device *dev,
>  */
> if (isl_format_is_yuv(info->format) &&
> (fmtl->bs == 96 || fmtl->bs == 48|| fmtl->bs == 24)) {
> -  total_h_el += 1;
> -  pad_bytes += 16;
> +  *total_h_el += 1;
> +  *pad_bytes += 16;
> }
>  
> /*
> @@ -1008,7 +1008,7 @@ isl_calc_total_height_el(const struct isl_device *dev,
>  *  required above.
>  */
> if (tile_info->tiling == ISL_TILING_LINEAR)
> -  pad_bytes += 64;
> +  *pad_bytes += 64;
>  
> /* The below text weakens, not strengthens, the padding requirements for
>  * linear surfaces. Therefore we can safely ignore it.
> @@ -1028,7 +1028,7 @@ isl_calc_total_height_el(const struct isl_device *dev,
> if (ISL_DEV_GEN(dev) >= 9 &&
> tile_info->tiling == ISL_TILING_LINEAR &&
> (info->dim == ISL_SURF_DIM_2D || info->dim == ISL_SURF_DIM_3D)) {
> -  total_h_el = isl_align(total_h_el, 4);
> +  *total_h_el = isl_align(*total_h_el, 4);
> }
>  
> /*
> @@ -1038,13 +1038,8 @@ isl_calc_total_height_el(const struct isl_device *dev,
> if (ISL_DEV_GEN(dev) >= 9 &&
> tile_info->tiling == ISL_TILING_LINEAR &&
> info->dim == ISL_SURF_DIM_1D) {
> -  total_h_el += 4;
> +  *total_h_el += 4;
> }
> -
> -   /* Be sloppy. Align any leftover padding to a row boundary. */
> -   total_h_el += isl_align_div_npot(pad_bytes, row_pitch);
> -
> -   return total_h_el;
>  }
>  
>  bool
> @@ -1108,10 +1103,13 @@ isl_surf_init_s(const struct isl_device *dev,
> array_pitch_span, _align_sa,
> _level0_sa, _slice0_sa);
>  
> -   const uint32_t total_h_el =
> -  isl_calc_total_height_el(dev, info, &tile_info,
> -   phys_level0_sa.array_len, row_pitch,
> -   array_pitch_el_rows);
> +   uint32_t total_h_el = phys_level0_sa.array_len * array_pitch_el_rows;
> +
> +   uint32_t pad_bytes;
> +   isl_apply_surface_padding(dev, info, &tile_info, &total_h_el, &pad_bytes);
> +
> +   /* Be sloppy. Align any leftover padding to a row boundary. */
> +   total_h_el += isl_align_div_npot(pad_bytes, row_pitch);

Am I reading this correctly that this is a non-functional change?

>  
> const uint32_t size =
>row_pitch * isl_align(total_h_el, tile_info.height);
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> 
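One way to check the "non-functional change" reading is to compare the two shapes of the code side by side. The following is a minimal sketch, not the ISL code: `PaddingInputs`, `div_round_up`, and both `calc_total_height_*` functions are hypothetical stand-ins, and only two of the padding rules from the patch are modelled. It only illustrates that moving the padding into a helper with out-parameters, and doing the row alignment in the caller, should give the same height.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the surface description.
struct PaddingInputs {
   bool is_cube;    // stands in for ISL_SURF_USAGE_CUBE_BIT
   bool is_linear;  // stands in for ISL_TILING_LINEAR
};

// Round-up division, used here for the "align to a row boundary" step.
static uint32_t div_round_up(uint32_t n, uint32_t d)
{
   return (n + d - 1) / d;
}

// Old shape: one function computes the base height, the padding and the
// final row alignment, and returns the result.
static uint32_t calc_total_height_old(const PaddingInputs &in,
                                      uint32_t array_len, uint32_t row_pitch,
                                      uint32_t array_pitch_el_rows)
{
   uint32_t total_h_el = array_len * array_pitch_el_rows;
   uint32_t pad_bytes = 0;
   if (in.is_cube)
      total_h_el += 2;
   if (in.is_linear)
      pad_bytes += 64;
   total_h_el += div_round_up(pad_bytes, row_pitch);
   return total_h_el;
}

// New shape: the helper only applies padding through its out-parameters.
static void apply_surface_padding(const PaddingInputs &in,
                                  uint32_t *total_h_el, uint32_t *pad_bytes)
{
   *pad_bytes = 0;
   if (in.is_cube)
      *total_h_el += 2;
   if (in.is_linear)
      *pad_bytes += 64;
}

// The caller now owns the base height and the row alignment.
static uint32_t calc_total_height_new(const PaddingInputs &in,
                                      uint32_t array_len, uint32_t row_pitch,
                                      uint32_t array_pitch_el_rows)
{
   uint32_t total_h_el = array_len * array_pitch_el_rows;
   uint32_t pad_bytes;
   apply_surface_padding(in, &total_h_el, &pad_bytes);
   total_h_el += div_round_up(pad_bytes, row_pitch);
   return total_h_el;
}
```

Under these assumptions both functions return the same value for the same inputs, which is consistent with the patch being a pure restructuring.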

Re: [Mesa-dev] Current status of the i965 dri3 bug #71759?

2016-07-11 Thread Martin Peres

On 11/07/16 17:40, i...@iirolaiho.net wrote:
> Hi,
>
> I am asking about freedesktop bug #71759.
>
> The bug has been open since 2013, but the problems are beginning to
> surface now: on Fedora+[RPMFusion|UnitedRPMs], it breaks h.264 playback
> on Totem with default settings (1). Fabrice Bellet has done some
> debugging on the problem and even submitted a patch, which however is
> not working as of now (2).
>
> 1) https://bugzilla.redhat.com/show_bug.cgi?id=1309446
> 2) https://github.com/UnitedRPMs/libva-intel-driver/issues/1
>
> Is there any progress on the issue? Is there anything a non-programmer
> would be able to do to help?
>
> Have I understood right that these drivers are mainly developed by Intel
> employees on work time? If so, there certainly should be enough
> resources to fix it.
>
> The bug is currently assigned to Ian Romanick, who seems to be some
> kind of default assignee of these bugs.


Hey,

Sorry, I must have missed it. I will have a look at it tomorrow!

Thanks,
Martin
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Current status of the i965 dri3 bug #71759?

2016-07-11 Thread iiro

Hi,

I am asking about freedesktop bug #71759.

The bug has been open since 2013, but the problems are beginning to
surface now: on Fedora+[RPMFusion|UnitedRPMs], it breaks h.264
playback on Totem with default settings (1). Fabrice Bellet has done
some debugging on the problem and even submitted a patch, which however
is not working as of now (2).


1) https://bugzilla.redhat.com/show_bug.cgi?id=1309446
2) https://github.com/UnitedRPMs/libva-intel-driver/issues/1

Is there any progress on the issue? Is there anything a non-programmer  
would be able to do to help?


Have I understood right that these drivers are mainly developed by  
Intel employees on work time? If so, there certainly should be enough  
resources to fix it.


The bug is currently assigned to Ian Romanick, who seems to be some
kind of default assignee of these bugs.


– Iiro Laiho



Re: [Mesa-dev] [PATCH] i915: fix implicit truncation from 'int' to bitfield

2016-07-11 Thread Matt Turner
On Mon, Jul 11, 2016 at 6:31 AM, Francesco Ansanelli
 wrote:
> ---
>  src/gallium/drivers/i915/i915_context.c |6 +++---
>  src/gallium/drivers/i915/i915_flush.c   |6 +++---

Please prefix patches to this directory "i915g: "

>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/i915/i915_context.c 
> b/src/gallium/drivers/i915/i915_context.c
> index 82798bb..d7cdfd9 100644
> --- a/src/gallium/drivers/i915/i915_context.c
> +++ b/src/gallium/drivers/i915/i915_context.c
> @@ -216,9 +216,9 @@ i915_create_context(struct pipe_screen *screen, void 
> *priv, unsigned flags)
>
> i915->dirty = ~0;
> i915->hardware_dirty = ~0;
> -   i915->immediate_dirty = ~0;
> -   i915->dynamic_dirty = ~0;
> -   i915->static_dirty = ~0;
> +   i915->immediate_dirty |= ~0;
> +   i915->dynamic_dirty |= ~0;
> +   i915->static_dirty |= ~0;

What exactly is the warning you see? I'm having a difficult time
understanding how this could possibly help anything.
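To make the question concrete, here is a small sketch of what the two forms store. It assumes the dirty flags are unsigned bit-fields; the struct `DirtyFlags` and the width `DIRTY_BITS` are hypothetical, since the real field widths are not shown in this thread. Both `= ~0` and `|= ~0` convert the `int` value -1 to the bit-field and end up with all bits set, so the stored value is the same. An explicit mask of the field's width is one way to avoid relying on the implicit truncation at all.

```cpp
#include <cassert>

// Hypothetical field width; the real widths are not shown in the thread.
constexpr unsigned DIRTY_BITS = 9;

struct DirtyFlags {
   unsigned immediate_dirty : DIRTY_BITS;
};

// Original form: the int value -1 is truncated to the field width.
static unsigned after_plain_assign()
{
   DirtyFlags f = {};
   f.immediate_dirty = ~0;
   return f.immediate_dirty;
}

// Form used by the patch: OR with -1, then the same truncation happens.
static unsigned after_or_assign()
{
   DirtyFlags f = {};
   f.immediate_dirty |= ~0;
   return f.immediate_dirty;
}

// Alternative: build a mask that already fits the field.
static unsigned after_explicit_mask()
{
   DirtyFlags f = {};
   f.immediate_dirty = (1u << DIRTY_BITS) - 1u;
   return f.immediate_dirty;
}
```

All three functions return the same all-ones value for the field, so any difference between the first two is limited to which diagnostics a given compiler chooses to emit.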


[Mesa-dev] [PATCH] i915: fix implicit truncation from 'int' to bitfield

2016-07-11 Thread Francesco Ansanelli
---
 src/gallium/drivers/i915/i915_context.c |6 +++---
 src/gallium/drivers/i915/i915_flush.c   |6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/i915/i915_context.c 
b/src/gallium/drivers/i915/i915_context.c
index 82798bb..d7cdfd9 100644
--- a/src/gallium/drivers/i915/i915_context.c
+++ b/src/gallium/drivers/i915/i915_context.c
@@ -216,9 +216,9 @@ i915_create_context(struct pipe_screen *screen, void *priv, 
unsigned flags)
 
i915->dirty = ~0;
i915->hardware_dirty = ~0;
-   i915->immediate_dirty = ~0;
-   i915->dynamic_dirty = ~0;
-   i915->static_dirty = ~0;
+   i915->immediate_dirty |= ~0;
+   i915->dynamic_dirty |= ~0;
+   i915->static_dirty |= ~0;
i915->flush_dirty = 0;
 
return &i915->base;
diff --git a/src/gallium/drivers/i915/i915_flush.c 
b/src/gallium/drivers/i915/i915_flush.c
index 6311c00..db05f97 100644
--- a/src/gallium/drivers/i915/i915_flush.c
+++ b/src/gallium/drivers/i915/i915_flush.c
@@ -81,9 +81,9 @@ void i915_flush(struct i915_context *i915,
batch->iws->batchbuffer_flush(batch, fence, flags);
i915->vbo_flushed = 1;
i915->hardware_dirty = ~0;
-   i915->immediate_dirty = ~0;
-   i915->dynamic_dirty = ~0;
-   i915->static_dirty = ~0;
+   i915->immediate_dirty |= ~0;
+   i915->dynamic_dirty |= ~0;
+   i915->static_dirty |= ~0;
/* kernel emits flushes in between batchbuffers */
i915->flush_dirty = 0;
i915->fired_vertices += i915->queued_vertices;
-- 
1.7.9.5



Re: [Mesa-dev] [PATCH] i965: fix ignored qualifiers warning

2016-07-11 Thread Kenneth Graunke
On Saturday, July 9, 2016 10:16:29 AM PDT Francesco Ansanelli wrote:
> ---
>  src/mesa/drivers/dri/i965/gen6_queryobj.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
> b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> index 96db5e9..95a5c56 100644
> --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
> +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> @@ -99,7 +99,7 @@ write_xfb_primitives_written(struct brw_context *brw,
> }
>  }
>  
> -static inline const int
> +static inline int
>  pipeline_target_to_index(int target)
>  {
> if (target == GL_GEOMETRY_SHADER_INVOCATIONS)
> 

Pushed, thanks!
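For readers wondering why the qualifier is reported as ignored: a top-level `const` on a scalar returned by value has no effect on the type of the call expression. The sketch below uses two hypothetical functions, `with_const` and `without_const`, and checks at compile time that their call expressions have the same type. It is written in C++ for the `static_assert`; the patched file itself is C.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example functions; only the return type differs.
static inline const int with_const(int target) { return target + 1; }
static inline int without_const(int target) { return target + 1; }

// The call expression has type int in both cases: the const on the
// by-value scalar return type is dropped.
static_assert(std::is_same<decltype(with_const(0)),
                           decltype(without_const(0))>::value,
              "top-level const on a scalar return value is ignored");
static_assert(std::is_same<decltype(with_const(0)), int>::value,
              "the call expression is a plain int");
```

Dropping the qualifier, as the patch does, therefore changes nothing for callers.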




Re: [Mesa-dev] [PATCH v2 1/6] i965/fs: add a helper function to create double immediates

2016-07-11 Thread Kenneth Graunke
On Monday, July 11, 2016 1:37:46 PM PDT Samuel Iglesias Gonsálvez wrote:
> From: Iago Toral Quiroga 
> 
> Gen7 hardware does not support double immediates so these need
> to be moved in 32-bit chunks to a regular vgrf instead. Instead
> of doing this every time we need to create a DF immediate,
> create a helper function that does the right thing depending
> on the hardware generation.
> 
> v2:
> - Define setup_imm_df() as an independent function (Curro)
> - Create a specific builder to get rid of some instruction field
>   assignments (Curro).
> 
> Signed-off-by: Samuel Iglesias Gonsálvez 
> Reviewed-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h   |  3 +++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 37 
> 
>  2 files changed, 40 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 1f88f8f..d034573 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -512,3 +512,6 @@ void shuffle_64bit_data_for_32bit_write(const 
> brw::fs_builder ,
>  const fs_reg ,
>  const fs_reg ,
>  uint32_t components);
> +fs_reg setup_imm_df(const struct brw_device_info *devinfo,
> +const brw::fs_builder &bld,
> +double v);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 04ed42e..94c719b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -4547,3 +4547,40 @@ shuffle_64bit_data_for_32bit_write(const fs_builder 
> ,
>bld.MOV(offset(dst, bld, 2 * i + 1), subscript(component_i, dst.type, 
> 1));
> }
>  }
> +
> +fs_reg
> +setup_imm_df(const struct brw_device_info *devinfo, const fs_builder &bld, 
> double v)
> +{

If you like, you can just do:

   const struct brw_device_info *devinfo = bld.shader->devinfo;

and avoid the extra parameter.  Either way is fine.

> +   assert(devinfo->gen >= 7);
> +
> +   if (devinfo->gen >= 8)
> +  return brw_imm_df(v);
> +
> +   /* gen7 does not support DF immediates, so we generate a 64-bit constant 
> by
> +* writing the low 32-bit of the constant to suboffset 0 of a VGRF and
> +* the high 32-bit to suboffset 4 and then applying a stride of 0.
> +*
> +* Alternatively, we could also produce a normal VGRF (without stride 0)
> +* by writing to all the channels in the VGRF, however, that would hit the
> +* gen7 bug where we have to split writes that span more than 1 register
> +* into instructions with a width of 4 (otherwise the write to the second
> +* register written runs into an execmask hardware bug) which isn't very
> +* nice.
> +*/
> +   union {
> +  double d;
> +  struct {
> + uint32_t i1;
> + uint32_t i2;
> +  };
> +   } di;
> +
> +   di.d = v;
> +
> +   const fs_builder ubld = bld.exec_all().group(1, 0);
> +   const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
> +   ubld.MOV(tmp, brw_imm_ud(di.i1));
> +   ubld.MOV(horiz_offset(tmp, 1), brw_imm_ud(di.i2));
> +
> +   return component(retype(tmp, BRW_REGISTER_TYPE_DF), 0);
> +}
> 
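The gen7 path boils down to splitting a double into two 32-bit words and writing each with its own MOV. A standalone sketch of just that split follows. `DoubleHalves`, `split_double`, and `join_double` are hypothetical names, and the sketch uses `memcpy` instead of the union from the patch so that it stays well-defined in C++. It assumes a 64-bit IEEE-754 `double`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper type: the two 32-bit words of a double.
struct DoubleHalves {
   uint32_t lo;  // bits 0..31, the word the patch writes to suboffset 0
   uint32_t hi;  // bits 32..63, the word the patch writes to suboffset 4
};

// Split a double into its low and high 32-bit words.
static DoubleHalves split_double(double v)
{
   static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit double");
   uint64_t bits;
   std::memcpy(&bits, &v, sizeof(bits));
   DoubleHalves h;
   h.lo = static_cast<uint32_t>(bits & 0xffffffffu);
   h.hi = static_cast<uint32_t>(bits >> 32);
   return h;
}

// Reassemble the double, mirroring what reading the two words back as a
// single DF value does.
static double join_double(DoubleHalves h)
{
   const uint64_t bits = (static_cast<uint64_t>(h.hi) << 32) | h.lo;
   double v;
   std::memcpy(&v, &bits, sizeof(v));
   return v;
}
```

Working on the 64-bit integer value rather than on memory layout keeps the low/high assignment independent of host byte order.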





Re: [Mesa-dev] [PATCH 2/6] i965/fs: use the new helper function to create double immediates

2016-07-11 Thread Kenneth Graunke
On Monday, July 11, 2016 1:19:34 PM PDT Samuel Iglesias Gonsálvez wrote:
> 
> On 06/07/16 22:32, Kenneth Graunke wrote:
> > On Wednesday, July 6, 2016 12:09:58 PM PDT Samuel Iglesias Gonsálvez wrote:
> >> From: Iago Toral Quiroga 
> >>
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> index 268c847..d805d95 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> @@ -832,7 +832,7 @@ fs_visitor::nir_emit_alu(const fs_builder , 
> >> nir_alu_instr *instr)
> >>* a register and compare with that.
> >>*/
> >>   fs_reg tmp = vgrf(glsl_type::double_type);
> >> - bld.MOV(tmp, brw_imm_df(0.0));
> >> + bld.MOV(tmp, setup_imm_df(0.0));
> > 
> > Does this need to be splatted out to a full SIMD-width?
> > Why not just do:
> > 
> >fs_reg tmp = setup_imm_df(0.0);
> > 
> > and let the CMP compare against the stride 0 register?
> > 
> 
> I have just noticed this is not right.
> 
> CMP will use the 64-bit immediate as one of the sources of the CMP,
> which is not valid in gen8+. According to BDW+'s PRMs, a 64-bit
> immediate is only valid in source values for instructions with single
> source operands.
> 
> I am going to keep the original patch.
> 
> Sam

Ah, right, I missed that on Gen8+ setup_imm_df returns an actual
immediate, rather than a GRF register with the immediate loaded into it.

Feel free to ignore that suggestion.
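The restriction discussed above can be written down as a small predicate. This is only a sketch of the rule as stated in this thread (no DF immediates on gen7, and on gen8+ only for single-source instructions); `df_immediate_allowed` is a hypothetical name, not a function in the driver.

```cpp
#include <cassert>

// Hypothetical predicate for the rule described in the thread:
// gen7 has no DF immediates at all, and gen8+ accepts a 64-bit immediate
// only when the instruction has a single source operand.
static bool df_immediate_allowed(int gen, unsigned num_sources)
{
   return gen >= 8 && num_sources == 1;
}
```

With this rule a MOV (one source) can take the immediate on gen8+, while a CMP (two sources) cannot, which is why the value is first moved into a register.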




Re: [Mesa-dev] [PATCH 1/4] vl/compositor: move weave shader out from rgb weaving

2016-07-11 Thread Christian König

On 06.07.2016 at 20:03, Leo Liu wrote:

We'll use the weave shader in a later patch.

Signed-off-by: Leo Liu 


I think I would prefer having a separate component for format conversion 
of video buffers instead of pushing that into the compositor as well. We 
could still share the weave shader in a header file.


But that's only a minor cleanup we can do later on as well.

The patches are Acked-by: Christian König  
either way.


Regards,
Christian.


---
  src/gallium/auxiliary/vl/vl_compositor.c | 157 ---
  src/gallium/auxiliary/vl/vl_compositor.h |   2 +-
  2 files changed, 83 insertions(+), 76 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c
index 77fc92e..275022b 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -126,6 +126,77 @@ create_vert_shader(struct vl_compositor *c)
  }
  
  static void

+create_frag_shader_weave(struct ureg_program *shader, struct ureg_dst fragment)
+{
+   struct ureg_src i_tc[2];
+   struct ureg_src sampler[3];
+   struct ureg_dst t_tc[2];
+   struct ureg_dst t_texel[2];
+   unsigned i, j;
+
+   i_tc[0] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTOP, 
TGSI_INTERPOLATE_LINEAR);
+   i_tc[1] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VBOTTOM, 
TGSI_INTERPOLATE_LINEAR);
+
+   for (i = 0; i < 3; ++i)
+  sampler[i] = ureg_DECL_sampler(shader, i);
+
+   for (i = 0; i < 2; ++i) {
+  t_tc[i] = ureg_DECL_temporary(shader);
+  t_texel[i] = ureg_DECL_temporary(shader);
+   }
+
+   /* calculate the texture offsets
+* t_tc.x = i_tc.x
+* t_tc.y = (round(i_tc.y - 0.5) + 0.5) / height * 2
+*/
+   for (i = 0; i < 2; ++i) {
+  ureg_MOV(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_X), i_tc[i]);
+  ureg_SUB(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_YZ),
+   i_tc[i], ureg_imm1f(shader, 0.5f));
+  ureg_ROUND(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_YZ), 
ureg_src(t_tc[i]));
+  ureg_MOV(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_W),
+   ureg_imm1f(shader, i ? 1.0f : 0.0f));
+  ureg_ADD(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_YZ),
+   ureg_src(t_tc[i]), ureg_imm1f(shader, 0.5f));
+  ureg_MUL(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_Y),
+   ureg_src(t_tc[i]), ureg_scalar(i_tc[0], TGSI_SWIZZLE_W));
+  ureg_MUL(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_Z),
+   ureg_src(t_tc[i]), ureg_scalar(i_tc[1], TGSI_SWIZZLE_W));
+   }
+
+   /* fetch the texels
+* texel[0..1].x = tex(t_tc[0..1][0])
+* texel[0..1].y = tex(t_tc[0..1][1])
+* texel[0..1].z = tex(t_tc[0..1][2])
+*/
+   for (i = 0; i < 2; ++i)
+  for (j = 0; j < 3; ++j) {
+ struct ureg_src src = ureg_swizzle(ureg_src(t_tc[i]),
+TGSI_SWIZZLE_X, j ? TGSI_SWIZZLE_Z : TGSI_SWIZZLE_Y, 
TGSI_SWIZZLE_W, TGSI_SWIZZLE_W);
+
+ ureg_TEX(shader, ureg_writemask(t_texel[i], TGSI_WRITEMASK_X << j),
+  TGSI_TEXTURE_2D_ARRAY, src, sampler[j]);
+  }
+
+   /* calculate linear interpolation factor
+* factor = |round(i_tc.y) - i_tc.y| * 2
+*/
+   ureg_ROUND(shader, ureg_writemask(t_tc[0], TGSI_WRITEMASK_YZ), i_tc[0]);
+   ureg_ADD(shader, ureg_writemask(t_tc[0], TGSI_WRITEMASK_YZ),
+ureg_src(t_tc[0]), ureg_negate(i_tc[0]));
+   ureg_MUL(shader, ureg_writemask(t_tc[0], TGSI_WRITEMASK_YZ),
+ureg_abs(ureg_src(t_tc[0])), ureg_imm1f(shader, 2.0f));
+   ureg_LRP(shader, fragment, ureg_swizzle(ureg_src(t_tc[0]),
+TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z),
+ureg_src(t_texel[0]), ureg_src(t_texel[1]));
+
+   for (i = 0; i < 2; ++i) {
+  ureg_release_temporary(shader, t_texel[i]);
+  ureg_release_temporary(shader, t_tc[i]);
+   }
+}
+
+static void
  create_frag_shader_csc(struct ureg_program *shader, struct ureg_dst texel,
   struct ureg_dst fragment)
  {
@@ -199,86 +270,22 @@ create_frag_shader_video_buffer(struct vl_compositor *c)
  }
  
  static void *

-create_frag_shader_weave(struct vl_compositor *c)
+create_frag_shader_weave_rgb(struct vl_compositor *c)
  {
 struct ureg_program *shader;
-   struct ureg_src i_tc[2];
-   struct ureg_src sampler[3];
-   struct ureg_dst t_tc[2];
-   struct ureg_dst t_texel[2];
-   struct ureg_dst o_fragment;
-   unsigned i, j;
+   struct ureg_dst texel, fragment;
  
 shader = ureg_create(PIPE_SHADER_FRAGMENT);

 if (!shader)
return false;
  
-   i_tc[0] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTOP, TGSI_INTERPOLATE_LINEAR);

-   i_tc[1] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VBOTTOM, 
TGSI_INTERPOLATE_LINEAR);
-
-   for (i = 0; i < 3; ++i)
-  sampler[i] = ureg_DECL_sampler(shader, i);
-
-   for (i = 0; i < 2; ++i) {
-  t_tc[i] = ureg_DECL_temporary(shader);
- 
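The interpolation step of the weave shader can be followed on the CPU with scalar arithmetic. The sketch below covers only the "linear interpolation factor" comment and the final LRP. `weave_factor` and `lrp` are hypothetical names. It assumes that LRP computes `src0 * src1 + (1 - src0) * src2` and that rounding is to nearest; `std::nearbyint` in the default rounding mode is used as an approximation of the shader's ROUND.

```cpp
#include <cassert>
#include <cmath>

// factor = |round(y) - y| * 2, as in the shader comment.
static float weave_factor(float y)
{
   return std::fabs(std::nearbyint(y) - y) * 2.0f;
}

// Scalar version of the assumed LRP semantics:
// result = factor * a + (1 - factor) * b.
static float lrp(float factor, float a, float b)
{
   return factor * a + (1.0f - factor) * b;
}
```

For a coordinate that sits exactly on an integer the factor is 0 and only the second texel contributes; a quarter of a line away the two texels are mixed equally.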

Re: [Mesa-dev] [PATCH] i965: fix ignored qualifiers warning

2016-07-11 Thread Eric Engestrom
On Sat, Jul 09, 2016 at 10:16:29AM +0200, Francesco Ansanelli wrote:
> ---
>  src/mesa/drivers/dri/i965/gen6_queryobj.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
> b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> index 96db5e9..95a5c56 100644
> --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
> +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> @@ -99,7 +99,7 @@ write_xfb_primitives_written(struct brw_context *brw,
> }
>  }
>  
> -static inline const int
> +static inline int
>  pipeline_target_to_index(int target)
>  {
> if (target == GL_GEOMETRY_SHADER_INVOCATIONS)
> -- 
> 1.7.9.5

Reviewed-by: Eric Engestrom 


[Mesa-dev] [PATCH v2 3/6] i965/fs/gen7: split instructions that run into exec masking bugs

2016-07-11 Thread Samuel Iglesias Gonsálvez
From: Iago Toral Quiroga 

In fp64 we can produce code like this:

mov(16) vgrf2<2>:UD, vgrf3<2>:UD

Our SIMD lowering pass would typically split this into instructions with a
width of 8, each writing to two consecutive registers. Unfortunately, gen7
hardware has a bug affecting execution masking and, as a result, the
second GRF register write won't work properly. Curro verified this:

"The problem is that pre-Gen8 EUs are hardwired to use the QtrCtrl+1
 (where QtrCtrl is the 8-bit quarter of the execution mask signals
 specified in the instruction control fields) for the second
 compressed half of any single-precision instruction (for
 double-precision instructions it's hardwired to use NibCtrl+1,
 at least on HSW), which means that the EU will apply the wrong
 execution controls for the second sequential GRF write if the number
 of channels per GRF is not exactly eight in single-precision mode (or
 four in double-float mode)."

In practice, this means that we cannot write more than one
consecutive GRF in a single instruction if the number of channels
per GRF is not exactly eight in single-precision mode (or four
in double-float mode).

This patch makes our SIMD lowering pass split instructions of this kind
so that the split versions only write to a single register. In the
example above this means that we split the write into 4 instructions, each
one writing 4 UD elements (width = 4) to a single register.

v2 (Curro):
 - Make explicit that the thing about hardwiring NibCtrl+1 for the second
   compressed half is known to happen in Haswell and the issue with IVB
   might not be exactly the same.
 - Assign max_width instead of returning early so that we can handle
   multiple restrictions affecting the same instruction.
 - Avoid division by 0 if the instruction does not write any registers.
 - Ignore instructions that have WE_all set.
 - Use the instruction execution type size instead of the dst type size.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 28 
 1 file changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2f473cc..4d57412 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4691,6 +4691,34 @@ get_fpu_lowered_simd_width(const struct brw_device_info 
*devinfo,
 */
unsigned reg_count = inst->regs_written;
 
+   /* Pre-Gen8 EUs are hardwired to use the QtrCtrl+1 (where QtrCtrl is
+* the 8-bit quarter of the execution mask signals specified in the
+* instruction control fields) for the second compressed half of any
+* single-precision instruction (for double-precision instructions
+* it's hardwired to use NibCtrl+1, at least on HSW), which means that
+* the EU will apply the wrong execution controls for the second
+* sequential GRF write if the number of channels per GRF is not exactly
+* eight in single-precision mode (or four in double-float mode).
+*
+* In this situation we calculate the maximum size of the split
+* instructions so they only ever write to a single register.
+*/
+   if (devinfo->gen < 8 && inst->regs_written > 1 &&
+   !inst->force_writemask_all) {
+  unsigned channels_per_grf = inst->exec_size / inst->regs_written;
+  unsigned exec_type_size = 0;
+  for (int i = 0; i < inst->sources; i++) {
+ if (inst->src[i].file == BAD_FILE)
+break;
+ exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
+  }
+  assert(exec_type_size);
+
+  if (channels_per_grf != REG_SIZE / exec_type_size) {
+ max_width = MIN2(max_width, channels_per_grf);
+  }
+   }
+
for (unsigned i = 0; i < inst->sources; i++)
   reg_count = MAX2(reg_count, (unsigned)inst->regs_read(i));
 
-- 
2.7.4
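The width calculation in this patch can be tried out in isolation. The sketch below is a standalone restatement of the added check, not the driver function: `gen7_limit_width` is a hypothetical name, the instruction fields are passed as plain parameters, and `REG_SIZE` is assumed to be 32 bytes (one GRF).

```cpp
#include <algorithm>
#include <cassert>

// Assumed size of one GRF in bytes.
constexpr unsigned REG_SIZE = 32;

// Returns the (possibly reduced) maximum execution width.
//   exec_type_size: size in bytes of the execution type (4 for UD/F, 8 for DF)
static unsigned gen7_limit_width(int gen, unsigned exec_size,
                                 unsigned regs_written,
                                 unsigned exec_type_size,
                                 bool force_writemask_all,
                                 unsigned max_width)
{
   if (gen < 8 && regs_written > 1 && !force_writemask_all) {
      const unsigned channels_per_grf = exec_size / regs_written;
      assert(exec_type_size);

      // Only a "natural" layout (8 channels per GRF for 32-bit types,
      // 4 for 64-bit types) is safe to write across two registers.
      if (channels_per_grf != REG_SIZE / exec_type_size)
         max_width = std::min(max_width, channels_per_grf);
   }
   return max_width;
}
```

For the example from the commit message, `mov(16) vgrf2<2>:UD` writes 16 channels with a stride of 2 and therefore spans 4 registers, giving 4 channels per GRF. That differs from the natural 8, so the width is limited to 4 and the instruction is split into four pieces.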



[Mesa-dev] [PATCH v2 6/6] i965/fs: do d2x lowering before simd splitting

2016-07-11 Thread Samuel Iglesias Gonsálvez
So that we can have gen7 split large writes produced by this lowering pass.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4bf0ca2..d131106 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5843,6 +5843,11 @@ fs_visitor::optimize()
   OPT(dead_code_eliminate);
}
 
+   if (OPT(lower_d2x)) {
+  OPT(opt_copy_propagate);
+  OPT(dead_code_eliminate);
+   }
+
OPT(lower_simd_width);
 
/* After SIMD lowering just in case we had to unroll the EOT send. */
@@ -5879,11 +5884,6 @@ fs_visitor::optimize()
   OPT(dead_code_eliminate);
}
 
-   if (OPT(lower_d2x)) {
-  OPT(opt_copy_propagate);
-  OPT(dead_code_eliminate);
-   }
-
OPT(opt_combine_constants);
OPT(lower_integer_multiplication);
 
-- 
2.7.4



[Mesa-dev] [PATCH v2 5/6] i965/fs: do pack lowering before simd splitting

2016-07-11 Thread Samuel Iglesias Gonsálvez
From: Iago Toral Quiroga 

So that we can have gen7 split large writes produced by the pack lowering.

Reviewed-by: Francisco Jerez 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4d57412..4bf0ca2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5838,6 +5838,11 @@ fs_visitor::optimize()
progress = false;
pass_num = 0;
 
+   if (OPT(lower_pack)) {
+  OPT(register_coalesce);
+  OPT(dead_code_eliminate);
+   }
+
OPT(lower_simd_width);
 
/* After SIMD lowering just in case we had to unroll the EOT send. */
@@ -5874,11 +5879,6 @@ fs_visitor::optimize()
   OPT(dead_code_eliminate);
}
 
-   if (OPT(lower_pack)) {
-  OPT(register_coalesce);
-  OPT(dead_code_eliminate);
-   }
-
if (OPT(lower_d2x)) {
   OPT(opt_copy_propagate);
   OPT(dead_code_eliminate);
-- 
2.7.4
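Why the order of the passes matters can be shown with a toy model. Nothing below is driver code: a program is reduced to a list of write widths, `toy_lower_pack` and `toy_lower_simd_width` are hypothetical stand-ins for the real passes, and the limit of 4 is just an example value. The point is only that a splitter cannot split writes that a later lowering pass has not produced yet.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model: each element is the write width of one instruction.
// PACK_PLACEHOLDER marks a pack operation that has not been lowered yet.
using Program = std::vector<unsigned>;
constexpr unsigned PACK_PLACEHOLDER = 0;

// Toy lowering pass: every placeholder becomes one 16-wide write.
static void toy_lower_pack(Program &p)
{
   for (unsigned &w : p)
      if (w == PACK_PLACEHOLDER)
         w = 16;
}

// Toy splitter: a write wider than `limit` becomes several `limit`-wide
// writes. Placeholders pass through untouched, they are not writes yet.
static void toy_lower_simd_width(Program &p, unsigned limit)
{
   Program out;
   for (unsigned w : p) {
      if (w > limit) {
         for (unsigned i = 0; i < w / limit; i++)
            out.push_back(limit);
      } else {
         out.push_back(w);
      }
   }
   p = out;
}

// Runs both passes in the chosen order and reports the widest write left.
static unsigned widest_after(bool lower_first)
{
   Program p = {PACK_PLACEHOLDER, 8};
   if (lower_first) {
      toy_lower_pack(p);
      toy_lower_simd_width(p, 4);
   } else {
      toy_lower_simd_width(p, 4);
      toy_lower_pack(p);
   }
   return *std::max_element(p.begin(), p.end());
}
```

Lowering first leaves only 4-wide writes, while splitting first lets a 16-wide write survive, which is the situation the reordering is meant to avoid on gen7.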



[Mesa-dev] [PATCH v2 4/6] i965/fs: do not require force_writemask_all with exec_size 4

2016-07-11 Thread Samuel Iglesias Gonsálvez
So far we only used instructions with this size in situations where we
did not operate per-channel and we wanted to ignore the execution mask,
but gen7 fp64 will need to emit code with a width of 4 that needs
normal execution masking.

v2:
- Modify the assert instead of deleting it (Curro)

Reviewed-by: Francisco Jerez 
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index d25d26a..ce1ec0a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1649,7 +1649,7 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
   brw_set_default_acc_write_control(p, inst->writes_accumulator);
   brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
 
-  assert(inst->force_writemask_all || inst->exec_size >= 8);
+  assert(inst->force_writemask_all || inst->exec_size >= 4);
   assert(inst->force_writemask_all || inst->group % inst->exec_size == 0);
   assert(inst->base_mrf + inst->mlen <= BRW_MAX_MRF(devinfo->gen));
   assert(inst->mlen <= BRW_MAX_MSG_LENGTH);
-- 
2.7.4
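The two execution-control asserts touched by this patch can be read as one predicate. The sketch below restates them outside the generator; `exec_controls_ok` is a hypothetical name and the parameters stand in for the corresponding instruction fields.

```cpp
#include <cassert>

// Restates the two asserts after the patch: unless the instruction ignores
// the execution mask, it must be at least 4 wide and its channel group
// must start on a multiple of its width.
static bool exec_controls_ok(bool force_writemask_all,
                             unsigned exec_size, unsigned group)
{
   if (force_writemask_all)
      return true;
   return exec_size >= 4 && group % exec_size == 0;
}
```

A 4-wide instruction with normal masking is now accepted when it starts at channel 0, 4, 8 and so on, while narrower instructions still need the execution mask to be ignored.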



[Mesa-dev] [PATCH v2 2/6] i965/fs: use the new helper function to create double immediates

2016-07-11 Thread Samuel Iglesias Gonsálvez
From: Iago Toral Quiroga 

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 94c719b..acc8c1e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -789,7 +789,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, 
nir_alu_instr *instr)
   * a register and compare with that.
   */
  fs_reg tmp = vgrf(glsl_type::double_type);
- bld.MOV(tmp, brw_imm_df(0.0));
+ bld.MOV(tmp, setup_imm_df(devinfo, bld, 0.0));
 
  /* A direct DF CMP using the flag register (null dst) won't work in
   * SIMD16 because the CMP will be split in two by lower_simd_width,
@@ -1128,7 +1128,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, 
nir_alu_instr *instr)
case nir_op_d2b: {
   /* two-argument instructions can't take 64-bit immediates */
   fs_reg zero = vgrf(glsl_type::double_type);
-  bld.MOV(zero, brw_imm_df(0.0));
+  bld.MOV(zero, setup_imm_df(devinfo, bld, 0.0));
   /* A SIMD16 execution needs to be split in two instructions, so use
* a vgrf instead of the flag register as dst so instruction splitting
* works
@@ -1440,7 +1440,8 @@ fs_visitor::nir_emit_load_const(const fs_builder &bld,
 
case 64:
   for (unsigned i = 0; i < instr->def.num_components; i++)
- bld.MOV(offset(reg, bld, i), brw_imm_df(instr->value.f64[i]));
+ bld.MOV(offset(reg, bld, i),
+ setup_imm_df(devinfo, bld, instr->value.f64[i]));
   break;
 
default:
-- 
2.7.4



[Mesa-dev] [PATCH v2 0/6] i965/fs: fix Haswell support for doubles

2016-07-11 Thread Samuel Iglesias Gonsálvez
Hello,

This is the second version of this patch series [0].

The use of the DIM instruction on HSW to set up a 64-bit immediate reg
(suggested by Kenneth here [1]) will be sent in a separate patch series.

Thanks,

Sam

[0] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122416.html
[1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122473.html

Iago Toral Quiroga (4):
  i965/fs: add a helper function to create double immediates
  i965/fs: use the new helper function to create double immediates
  i965/fs/gen7: split instructions that run into exec masking bugs
  i965/fs: do pack lowering before simd splitting

Samuel Iglesias Gonsálvez (2):
  i965/fs: do not require force_writemask_all with exec_size 4
  i965/fs: do d2x lowering before simd splitting

 src/mesa/drivers/dri/i965/brw_fs.cpp   | 48 --
 src/mesa/drivers/dri/i965/brw_fs.h |  3 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 44 +--
 4 files changed, 83 insertions(+), 14 deletions(-)

-- 
2.7.4



[Mesa-dev] [PATCH v2 1/6] i965/fs: add a helper function to create double immediates

2016-07-11 Thread Samuel Iglesias Gonsálvez
From: Iago Toral Quiroga 

Gen7 hardware does not support double immediates so these need
to be moved in 32-bit chunks to a regular vgrf instead. Instead
of doing this every time we need to create a DF immediate,
create a helper function that does the right thing depending
on the hardware generation.

v2:
- Define setup_imm_df() as an independent function (Curro)
- Create a specific builder to get rid of some instruction field
  assignments (Curro).

Signed-off-by: Samuel Iglesias Gonsálvez 
Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  3 +++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 37 
 2 files changed, 40 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 1f88f8f..d034573 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -512,3 +512,6 @@ void shuffle_64bit_data_for_32bit_write(const 
brw::fs_builder ,
 const fs_reg ,
 const fs_reg ,
 uint32_t components);
+fs_reg setup_imm_df(const struct brw_device_info *devinfo,
+                    const brw::fs_builder &bld,
+                    double v);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 04ed42e..94c719b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -4547,3 +4547,40 @@ shuffle_64bit_data_for_32bit_write(const fs_builder &bld,
       bld.MOV(offset(dst, bld, 2 * i + 1), subscript(component_i, dst.type, 1));
    }
 }
+
+fs_reg
+setup_imm_df(const struct brw_device_info *devinfo, const fs_builder &bld, double v)
+{
+   assert(devinfo->gen >= 7);
+
+   if (devinfo->gen >= 8)
+      return brw_imm_df(v);
+
+   /* gen7 does not support DF immediates, so we generate a 64-bit constant by
+    * writing the low 32-bit of the constant to suboffset 0 of a VGRF and
+    * the high 32-bit to suboffset 4 and then applying a stride of 0.
+    *
+    * Alternatively, we could also produce a normal VGRF (without stride 0)
+    * by writing to all the channels in the VGRF, however, that would hit the
+    * gen7 bug where we have to split writes that span more than 1 register
+    * into instructions with a width of 4 (otherwise the write to the second
+    * register written runs into an execmask hardware bug) which isn't very
+    * nice.
+    */
+   union {
+      double d;
+      struct {
+         uint32_t i1;
+         uint32_t i2;
+      };
+   } di;
+
+   di.d = v;
+
+   const fs_builder ubld = bld.exec_all().group(1, 0);
+   const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
+   ubld.MOV(tmp, brw_imm_ud(di.i1));
+   ubld.MOV(horiz_offset(tmp, 1), brw_imm_ud(di.i2));
+
+   return component(retype(tmp, BRW_REGISTER_TYPE_DF), 0);
+}
-- 
2.7.4
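
For readers who want to see the 32-bit split in isolation, here is a
minimal standalone sketch of the idea behind the gen7 path. It is not
Mesa code: df_halves, split_df, join_df and self_check are hypothetical
names used only for illustration, and the sketch assumes IEEE 754
binary64 doubles. It uses memcpy and shifts instead of the union in the
patch, so the low/high assignment does not depend on host endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper type, not part of Mesa: the two 32-bit halves of a
// double, in the order the patch writes them into the VGRF.
struct df_halves {
   uint32_t lo; // bits 0..31, would go to suboffset 0
   uint32_t hi; // bits 32..63, would go to suboffset 4
};

// Split a double into its low and high 32-bit words.
df_halves
split_df(double v)
{
   static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit double");
   uint64_t bits;
   std::memcpy(&bits, &v, sizeof(bits)); // well-defined way to read the bits
   return { static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32) };
}

// Reassemble the double, which is what reading the two dwords back as one
// DF value amounts to.
double
join_df(df_halves h)
{
   const uint64_t bits = (static_cast<uint64_t>(h.hi) << 32) | h.lo;
   double v;
   std::memcpy(&v, &bits, sizeof(v));
   return v;
}

// Small self-check, assuming IEEE 754 binary64 doubles.
void
self_check()
{
   const df_halves one = split_df(1.0); // 1.0 is 0x3ff0000000000000
   assert(one.lo == 0x00000000u && one.hi == 0x3ff00000u);
   assert(join_df(split_df(-2.5)) == -2.5);
}
```

In the patch itself the two halves are the i1 and i2 members of the
union, and the reassembly happens implicitly when the UD temporary is
retyped to DF.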



Re: [Mesa-dev] [PATCH 2/6] i965/fs: use the new helper function to create double immediates

2016-07-11 Thread Samuel Iglesias Gonsálvez


On 06/07/16 22:32, Kenneth Graunke wrote:
> On Wednesday, July 6, 2016 12:09:58 PM PDT Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga 
>>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 268c847..d805d95 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -832,7 +832,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
>>* a register and compare with that.
>>*/
>>   fs_reg tmp = vgrf(glsl_type::double_type);
>> - bld.MOV(tmp, brw_imm_df(0.0));
>> + bld.MOV(tmp, setup_imm_df(0.0));
> 
> Does this need to be splatted out to a full SIMD-width?
> Why not just do:
> 
>fs_reg tmp = setup_imm_df(0.0);
> 
> and let the CMP compare against the stride 0 register?
> 

I have just noticed that this is not right.

The CMP would then take the 64-bit immediate as one of its two sources,
which is not valid on gen8+. According to the BDW+ PRMs, a 64-bit
immediate is only valid as the source of an instruction with a single
source operand.

I am going to keep the original patch.

Sam

>>  
>>   /* A direct DF CMP using the flag register (null dst) won't work in
>>* SIMD16 because the CMP will be split in two by lower_simd_width,
>> @@ -1171,7 +1171,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
>> case nir_op_d2b: {
>>/* two-argument instructions can't take 64-bit immediates */
>>fs_reg zero = vgrf(glsl_type::double_type);
>> -  bld.MOV(zero, brw_imm_df(0.0));
>> +  bld.MOV(zero, setup_imm_df(0.0));
>>/* A SIMD16 execution needs to be split in two instructions, so use
>> * a vgrf instead of the flag register as dst so instruction splitting
>> * works
> 
> Likewise, I don't think you need to splat here.
> 
>> @@ -1483,7 +1483,7 @@ fs_visitor::nir_emit_load_const(const fs_builder &bld,
>>  
>> case 64:
>>for (unsigned i = 0; i < instr->def.num_components; i++)
>> - bld.MOV(offset(reg, bld, i), brw_imm_df(instr->value.f64[i]));
>> + bld.MOV(offset(reg, bld, i), setup_imm_df(instr->value.f64[i]));
>>break;
>>  
>> default:
>>
> 
> This hunk looks good.
> 
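
The operand rule from the reply above can be modelled in a few lines.
This is only a sketch of the reasoning in this thread, not Mesa code:
df_operand and choose_df_operand are hypothetical names, and the rule
encoded is just what the thread states (no DF immediates on gen7, and
on gen8+ a 64-bit immediate only on single-source instructions).

```cpp
#include <cassert>

// Hypothetical model, not part of Mesa: how a double constant can be fed
// to an instruction.
enum class df_operand {
   immediate, // encode the double directly in the instruction
   vgrf_copy, // first materialize the double in a register
};

df_operand
choose_df_operand(int gen, unsigned num_sources)
{
   // Same lower bound as the assert in setup_imm_df().
   assert(gen >= 7 && num_sources >= 1);

   // gen7 has no DF immediates at all.
   if (gen < 8)
      return df_operand::vgrf_copy;

   // gen8+: a 64-bit immediate is only valid on single-source instructions
   // such as MOV, so a two-source CMP needs a register.
   return num_sources == 1 ? df_operand::immediate : df_operand::vgrf_copy;
}
```

Under this model a MOV on gen8 (one source) may take the immediate,
while the CMP discussed here (two sources) needs the value in a
register on every generation, which is why the MOV into a temporary
stays in the patch.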





[Mesa-dev] [Bug 96853] gl_PrimitiveID is zero when rendering points of size > 1

2016-07-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96853

--- Comment #5 from denis.fisse...@tu-dortmund.de ---
(In reply to Roland Scheidegger from comment #3) 
> In theory if there already is a user-provided gs (which doesn't output
> points) then the emulation code doesn't really apply. But I don't really
> know this code (other than knowing this is pretty tricky stuff...).

According to my testing, geometry shaders with Mesa3d show a strange kind of
mixed behavior. I have a GS that fakes wide lines by constructing screen-aligned
quads from lines. If I use it in VMware with Ubuntu 16.04 LTS and Mesa3d, it
does not construct wide lines, only lines of width 1. If I then abort the GS
with a return statement right at the beginning of the main method, so that it
outputs nothing, lines of width 1 are still shown in the viewport. For those
lines to be colored correctly, however, I have to output the color from my GS.
So it seems the geometry output part of the GS is somehow overridden, while
other parts of it are not.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] mesa from git fails to compile

2016-07-11 Thread Emil Velikov
On 10 July 2016 at 22:11, Pali Rohár  wrote:
> Hello, compiling mesa from git is failing on this error:
>
> Making all in isl
> make[5]: Entering directory `/«PKGBUILDDIR»/build/dri/src/intel/isl'
> python2.7  ../../../../../src/intel/isl/gen_format_layout.py \
> --csv ../../../../../src/intel/isl/isl_format_layout.csv --out 
> isl_format_layout.c
> Traceback (most recent call last):
>   File "../../../../../src/intel/isl/gen_format_layout.py", line 92, in <module>
>     output_encoding='utf-8')
> TypeError: __init__() got an unexpected keyword argument 'future_imports'
> make[5]: *** [isl_format_layout.c] Error 1
> make[5]: Leaving directory `/«PKGBUILDDIR»/build/dri/src/intel/isl'
> make[4]: *** [all-recursive] Error 1
> make[4]: Leaving directory `/«PKGBUILDDIR»/build/dri/src/intel'
> make[3]: *** [all-recursive] Error 1
> make[3]: Leaving directory `/«PKGBUILDDIR»/build/dri/src'
> make[2]: *** [all] Error 2
> make[2]: Leaving directory `/«PKGBUILDDIR»/build/dri/src'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/«PKGBUILDDIR»/build/dri'
> make: *** [debian/stamp/x86_64-linux-gnu-build-dri] Error 2
>
> Any idea where is problem and how to fix it?
>
> Full build log is available at:
>
> https://launchpad.net/~pali/+archive/ubuntu/graphics-drivers/+build/10446196/+files/buildlog_ubuntu-precise-amd64.mesa-lts-trusty_11.3.0-git201607100358.5c17fb2~ubuntu12.04.1_BUILDING.txt.gz
>

Sounds similar to (or the same as?)
https://bugs.freedesktop.org/show_bug.cgi?id=89347.
Which version of mako do you have? Can you give things a try with
0.8.0 or later?

-Emil