Re: [Mesa-dev] [PATCH v2 1/6] i965/fs: add a helper function to create double immediates
On 11/07/16 14:54, Kenneth Graunke wrote:
> On Monday, July 11, 2016 1:37:46 PM PDT Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga
>>
>> Gen7 hardware does not support double immediates so these need
>> to be moved in 32-bit chunks to a regular vgrf instead. Instead
>> of doing this every time we need to create a DF immediate,
>> create a helper function that does the right thing depending
>> on the hardware generation.
>>
>> v2:
>> - Define setup_imm_df() as an independent function (Curro)
>> - Create a specific builder to get rid of some instruction field
>>   assignments (Curro).
>>
>> Signed-off-by: Samuel Iglesias Gonsálvez
>> Reviewed-by: Kenneth Graunke
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.h       |  3 +++
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 37
>>  2 files changed, 40 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
>> index 1f88f8f..d034573 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -512,3 +512,6 @@ void shuffle_64bit_data_for_32bit_write(const brw::fs_builder &bld,
>>                                          const fs_reg &dst,
>>                                          const fs_reg &src,
>>                                          uint32_t components);
>> +fs_reg setup_imm_df(const struct brw_device_info *devinfo,
>> +                    const brw::fs_builder &bld,
>> +                    double v);
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 04ed42e..94c719b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -4547,3 +4547,40 @@ shuffle_64bit_data_for_32bit_write(const fs_builder &bld,
>>           bld.MOV(offset(dst, bld, 2 * i + 1), subscript(component_i, dst.type, 1));
>>     }
>>  }
>> +
>> +fs_reg
>> +setup_imm_df(const struct brw_device_info *devinfo, const fs_builder &bld, double v)
>> +{
>
> If you like, you can just do:
>
>    const struct brw_device_info *devinfo = bld.shader->devinfo;
>
> and avoid the extra parameter.  Either way is fine.
>

Right, I am going to do it.

Thanks!

Sam

>> +   assert(devinfo->gen >= 7);
>> +
>> +   if (devinfo->gen >= 8)
>> +      return brw_imm_df(v);
>> +
>> +   /* gen7 does not support DF immediates, so we generate a 64-bit constant by
>> +    * writing the low 32-bit of the constant to suboffset 0 of a VGRF and
>> +    * the high 32-bit to suboffset 4 and then applying a stride of 0.
>> +    *
>> +    * Alternatively, we could also produce a normal VGRF (without stride 0)
>> +    * by writing to all the channels in the VGRF, however, that would hit the
>> +    * gen7 bug where we have to split writes that span more than 1 register
>> +    * into instructions with a width of 4 (otherwise the write to the second
>> +    * register written runs into an execmask hardware bug) which isn't very
>> +    * nice.
>> +    */
>> +   union {
>> +      double d;
>> +      struct {
>> +         uint32_t i1;
>> +         uint32_t i2;
>> +      };
>> +   } di;
>> +
>> +   di.d = v;
>> +
>> +   const fs_builder ubld = bld.exec_all().group(1, 0);
>> +   const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
>> +   ubld.MOV(tmp, brw_imm_ud(di.i1));
>> +   ubld.MOV(horiz_offset(tmp, 1), brw_imm_ud(di.i2));
>> +
>> +   return component(retype(tmp, BRW_REGISTER_TYPE_DF), 0);
>> +}
>>
>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
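The gen7 path in the quoted patch hinges on cutting a double into its low and high 32-bit words before emitting two UD MOVs. A minimal standalone sketch of just that split is below; it is not Mesa code, and the names df_halves and split_double are made up for illustration. It assumes a 64-bit IEEE-754 double and uses memcpy plus shifts, so the result does not depend on host byte order (the anonymous-struct union in the patch yields the low word in i1 only on a little-endian host, which is the case for the hardware this driver targets).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type, not part of Mesa: the two 32-bit words of a
 * double, in the order the patch writes them (low word at suboffset 0,
 * high word at suboffset 4).
 */
struct df_halves {
   uint32_t lo;   /* bits 31:0  */
   uint32_t hi;   /* bits 63:32 */
};

struct df_halves
split_double(double v)
{
   uint64_t bits;
   struct df_halves h;

   /* Assumption: double is a 64-bit IEEE-754 value on this host. */
   assert(sizeof(v) == sizeof(bits));

   /* memcpy is the portable way to reinterpret the bytes of v. */
   memcpy(&bits, &v, sizeof(bits));

   /* Shifting the 64-bit pattern is independent of host endianness. */
   h.lo = (uint32_t)(bits & 0xffffffffu);
   h.hi = (uint32_t)(bits >> 32);
   return h;
}
```

For 1.0 (bit pattern 0x3ff0000000000000) this gives lo = 0 and hi = 0x3ff00000, which are the two immediates the MOVs in the patch would carry.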
Re: [Mesa-dev] mesa from git fails to compile
On Monday 11 July 2016 10:11:30 Emil Velikov wrote:
> Sounds similar (the same?) as
> https://bugs.freedesktop.org/show_bug.cgi?id=89347.

The error output looks to be the same.

> Which version of mako do you have, can you give things a try with
> 0.8.0 or later ?

If you mean the mako Python module, then I have version 0.5.0. That
build is for Ubuntu precise, which has only that one version in its
repository, see: http://packages.ubuntu.com/precise/python-mako

So I do not have a newer version of mako...

--
Pali Rohár
pali.ro...@gmail.com
Re: [Mesa-dev] [PATCH RFC 1/1] r600, compute: Use vtx #3 for kernel arguments
On Sun, Jun 26, 2016 at 08:40:55PM -0400, Jan Vesely wrote:
> Both explicit and implicit.
> Using vtx 0 (as existing llvm code implies) does not work for dynamic offsets.
>
> Signed-off-by: Jan Vesely

I have no idea why vtx#3 works when vtx#0 doesn't; maybe add a comment
explaining why we are using vtx#3.

With that change:
Reviewed-by: Tom Stellard

> ---
> Hi,
>
> I ran into problem when using VTX_READ from constant buffer would work only
> for 0 index. The LLVM code implied that it should work (or maybe they
> considered constant offsets only), but I could not find one way or the other
> in ISA docs.
>
> Switching to vtx#3 fixed the problem, though I'm not sure if it's the right
> solution.
>
> thanks,
> Jan
>
>
>  src/gallium/drivers/r600/evergreen_compute.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
> index 7f9580c..b351cee 100644
> --- a/src/gallium/drivers/r600/evergreen_compute.c
> +++ b/src/gallium/drivers/r600/evergreen_compute.c
> @@ -369,6 +369,8 @@ static void evergreen_compute_upload_input(struct pipe_context *ctx,
>      ctx->transfer_unmap(ctx, transfer);
>
>      /* ID=0 is reserved for the parameters */
> +    evergreen_cs_set_vertex_buffer(rctx, 3, 0,
> +            (struct pipe_resource*)shader->kernel_param);
>      evergreen_cs_set_constant_buffer(rctx, 0, 0, input_size,
>              (struct pipe_resource*)shader->kernel_param);
>  }
> @@ -614,9 +616,9 @@ static void evergreen_set_compute_resources(struct pipe_context *ctx,
>              start, count);
>
>      for (unsigned i = 0; i < count; i++) {
> -        /* The First three vertex buffers are reserved for parameters and
> +        /* The First four vertex buffers are reserved for parameters and
>           * global buffers. */
> -        unsigned vtx_id = 3 + i;
> +        unsigned vtx_id = 4 + i;
>          if (resources[i]) {
>              struct r600_resource_global *buffer =
>                  (struct r600_resource_global*)
> --
> 2.7.4
>
Re: [Mesa-dev] [PATCH 24/24] RFC nir/algebraic: Optimize open-coded nir_binop_bfm
On Wed, Jun 29, 2016 at 2:04 PM, Ian Romanick wrote:
> From: Ian Romanick
>
> BFM is (((1u << a) - 1) << b).  Recognize a couple patterns that look
> like this, and replace them with BFM.
>
> NOTE: Using lower_bitfield_insert is definitely not the right way to
> flag this optimization... so, I'm looking for some advice as to what the
> right way is.

I guess we'll just need another flag to indicate the presence of BFM?

Maybe add a has_bfm flag to nir_shader_compiler_options. It would be
the first has_*, but I can't think of anything better. And really,
maybe has_flrp makes more sense than lower_flrp. The inverse of the
former certainly indicates you need to lower it.

That would get a Reviewed-by from me.
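For reference, the formula quoted above can be written out as a small C function. This is only a sketch of the arithmetic stated in the commit message, with `bits` and `offset` standing in for the `a` and `b` of the formula; it is not the NIR implementation, and the restriction to arguments below 32 is an assumption made here to keep the C shifts well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Bitfield mask as described in the quoted commit message:
 * ((1u << bits) - 1) << offset, i.e. `bits` consecutive ones starting at
 * bit position `offset`.
 */
uint32_t
bfm(uint32_t bits, uint32_t offset)
{
   /* Assumption of this sketch: shift counts of 32 or more are rejected,
    * because shifting a 32-bit value by its full width is undefined in C.
    */
   assert(bits < 32 && offset < 32);

   return ((UINT32_C(1) << bits) - 1) << offset;
}
```

For example, bfm(4, 8) produces 0x00000f00: four set bits starting at bit 8.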
Re: [Mesa-dev] [PATCH v2 16/24] i965: Use LZD to implement nir_op_ifind_msb on Gen < 7
On Thu, Jul 7, 2016 at 10:16 AM, Ian Romanick wrote:
> From: Ian Romanick
>
> v2: Retype LZD source as UD to avoid potential problems with 0x80000000.
> Suggested by Matt.  Also update comment about problem values with
> LZD(abs(x)).  Suggested by Curro.
>
> Signed-off-by: Ian Romanick
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 54 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 57 --
>  2 files changed, 90 insertions(+), 21 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 65f6406..93d5e9d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -623,8 +623,36 @@ emit_find_msb_using_lzd(const fs_builder &bld,
>                          bool is_signed)
>  {
>     fs_inst *inst;
> +   fs_reg temp = src;
>
> -   bld.LZD(retype(result, BRW_REGISTER_TYPE_UD), src);
> +   if (is_signed) {
> +      /* LZD of an absolute value source almost always does the right
> +       * thing.  There are two problem values:
> +       *
> +       * * 0x80000000.  Since abs(0x80000000) == 0x80000000, LZD returns
> +       *   0.  However, findMSB(int(0x80000000)) == 30.
> +       *
> +       * * 0xffffffff.  Since abs(0xffffffff) == 1, LZD returns
> +       *   31.  Section 8.8 (Integer Functions) of the GLSL 4.50 spec says:
> +       *
> +       *    For a value of zero or negative one, -1 will be returned.
> +       *
> +       * * Negative powers of two.  LZD(abs(-(1<
> +       *   findMSB(-(1<
> +       *

Might be nice to add these cases to the piglit tests.

15-17 are

Reviewed-by: Matt Turner

That should be the whole series, minus the RFC at the end.
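The problem values listed in the quoted comment can be checked with a small software model. The sketch below is not the i965 code: lzd() only imitates a leading-zero count that returns 32 for zero, ifind_msb_abs() is the naive LZD(abs(x)) approach the comment warns about, and ifind_msb() is a reference that follows the GLSL rule for negative inputs (position of the most significant 0 bit) by inverting the bits first. All function names are made up for illustration, and whether the patch uses the same inversion is not shown in the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* Software stand-in for a leading-zero count; returns 32 for zero. */
uint32_t
lzd(uint32_t v)
{
   uint32_t n = 0;

   if (v == 0)
      return 32;

   while (!(v & 0x80000000u)) {
      v <<= 1;
      n++;
   }

   assert(n < 32);
   return n;
}

/* Unsigned findMSB: 31 - lzd(v); zero naturally maps to -1. */
int32_t
ufind_msb(uint32_t v)
{
   return 31 - (int32_t)lzd(v);
}

/* Naive signed version: take the absolute value, then count zeros.
 * 0u - u is the two's complement negation without signed overflow.
 */
int32_t
ifind_msb_abs(int32_t x)
{
   const uint32_t u = (uint32_t)x;

   return ufind_msb(x < 0 ? 0u - u : u);
}

/* Reference signed version: for negative inputs look for the most
 * significant 0 bit, which is the most significant 1 bit of ~x.
 */
int32_t
ifind_msb(int32_t x)
{
   const uint32_t u = (uint32_t)x;

   return ufind_msb(x < 0 ? ~u : u);
}
```

Running the two signed versions side by side reproduces the cases from the comment: for 0x80000000 the naive version gives 31 where the reference gives 30, for -1 it gives 0 instead of -1, and for a negative power of two such as -16 it gives 4 instead of 3. For inputs like -3 or any non-negative value the two agree.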
Re: [Mesa-dev] [PATCH shader-db 2/3] Remove split-to-files.py
Would you mind updating the README as well? With that, patches 2 & 3 are

Reviewed-by: Nicolai Hähnle

Patch 1 is

Acked-by: Nicolai Hähnle

On 11.07.2016 20:10, Marek Olšák wrote:

From: Marek Olšák

Use MESA_SHADER_CAPTURE_PATH instead.
---
 split-to-files.py | 138 --
 1 file changed, 138 deletions(-)
 delete mode 100755 split-to-files.py

diff --git a/split-to-files.py b/split-to-files.py
deleted file mode 100755
index 721b2da..000
--- a/split-to-files.py
+++ /dev/null
@@ -1,138 +0,0 @@
-#!/usr/bin/env python3
-
-import re
-import os
-import argparse
-
-
-def parse_input(infile):
-    shaders = dict()
-    programs = dict()
-    shadertuple = ("bad", 0)
-    prognum = ""
-    reading = False
-    is_glsl = True
-
-    for line in infile.splitlines():
-        declmatch = re.match(
-            r"GLSL (.*) shader (.*) source for linked program (.*):", line)
-        arbmatch = re.match(
-            r"ARB_([^_]*)_program source for program (.*):", line)
-        if declmatch:
-            shadertype = declmatch.group(1)
-            shadernum = declmatch.group(2)
-            prognum = declmatch.group(3)
-            shadertuple = (shadertype, shadernum)
-
-            # don't save driver-internal shaders.
-            if prognum == "0":
-                continue
-
-            if prognum not in shaders:
-                shaders[prognum] = dict()
-            if shadertuple in shaders[prognum]:
-                print("Warning: duplicate", shadertype, " shader ", shadernum,
-                      "in program", prognum, "...tossing old shader.")
-            shaders[prognum][shadertuple] = ''
-            reading = True
-            is_glsl = True
-            print("Reading program {0} {1} shader {2}".format(
-                prognum, shadertype, shadernum))
-        elif arbmatch:
-            shadertype = arbmatch.group(1)
-            prognum = arbmatch.group(2)
-            if prognum in programs:
-                print("dupe!")
-                exit(1)
-            programs[prognum] = (shadertype, '')
-            reading = True
-            is_glsl = False
-            print("Reading program {0} {1} shader".format(prognum, shadertype))
-        elif re.match("GLSL IR for ", line):
-            reading = False
-        elif re.match("Mesa IR for ", line):
-            reading = False
-        elif re.match("GLSL source for ", line):
-            reading = False
-        elif reading:
-            if is_glsl:
-                shaders[prognum][shadertuple] += line + '\n'
-            else:
-                type, source = programs[prognum]
-                programs[prognum] = (type, ''.join([source, line, '\n']))
-
-    return (shaders, programs)
-
-
-def write_shader_test(filename, shaders):
-    print("Writing {0}".format(filename))
-    out = open(filename, 'w')
-
-    min_version = 110
-    for stage, num in shaders:
-        shader = shaders[(stage, num)]
-        m = re.match(r"^#version (\d\d\d)", shader)
-        if m:
-            version = int(m.group(1), 10)
-            if version > min_version:
-                min_version = version
-
-    out.write("[require]\n")
-    out.write("GLSL >= %.2f\n" % (min_version / 100.))
-    out.write("\n")
-
-    for stage, num in shaders:
-        if stage == "vertex":
-            out.write("[vertex shader]\n")
-        elif stage == "fragment":
-            out.write("[fragment shader]\n")
-        elif stage == "geometry":
-            out.write("[geometry shader]\n")
-        elif stage == "tess ctrl" or stage == "tessellation control":
-            out.write("[tessellation control shader]\n")
-        elif stage == "tess eval" or stage == "tessellation evaluation":
-            out.write("[tessellation evaluation shader]\n")
-        else:
-            assert False, stage
-        out.write(shaders[(stage, num)])
-
-    out.close()
-
-def write_arb_shader_test(filename, type, source):
-    print("Writing {0}".format(filename))
-    out = open(filename, 'w')
-    out.write("[require]\n")
-    out.write("GL_ARB_{0}_program\n".format(type))
-    out.write("\n")
-    out.write("[{0} program]\n".format(type))
-    out.write(source)
-    # INTEL_DEBUG won't output anything for ARB programs unless you draw
-    out.write("\n[test]\ndraw rect -1 -1 1 2\n");
-    out.close()
-
-def write_files(directory, shaders, programs):
-    for prog in shaders:
-        write_shader_test("{0}/{1}.shader_test".format(directory, prog),
-                          shaders[prog])
-    for prognum in programs:
-        prog = programs[prognum]
-        write_arb_shader_test("{0}/{1}p-{2}.shader_test".format(directory,
-            prog[0][0], prognum), prog[0], prog[1])
-
-def main():
-    parser = argparse.ArgumentParser()
-    parser.add_argument('appname', help='Output directory (application name)')
-    parser.add_argument('mesadebug',
Re: [Mesa-dev] [PATCH] nvc0: use a define for the driver constant buffer size
Reviewed-by: Ilia Mirkin

On Mon, Jul 11, 2016 at 4:25 PM, Samuel Pitoiset wrote:
> This might avoid mistakes if the size is bumped in the future.
>
> Signed-off-by: Samuel Pitoiset
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_compute.c        | 8
>  src/gallium/drivers/nouveau/nvc0/nvc0_context.h        | 4 ++--
>  src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c    | 2 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c         | 2 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 8
>  src/gallium/drivers/nouveau/nvc0/nvc0_tex.c            | 6 +++---
>  src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c            | 4 ++--
>  7 files changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
> index 10a4c83..dc4d1b3 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
> @@ -115,7 +115,7 @@ nvc0_screen_compute_setup(struct nvc0_screen *screen,
>
>     /* MS sample coordinate offsets */
>     BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -   PUSH_DATA (push, 2048);
> +   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
>     PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
>     PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
>     BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 2 * 8);
> @@ -253,7 +253,7 @@ nvc0_compute_validate_driverconst(struct nvc0_context *nvc0)
>     struct nvc0_screen *screen = nvc0->screen;
>
>     BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -   PUSH_DATA (push, 2048);
> +   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
>     PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
>     PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
>     BEGIN_NVC0(push, NVC0_CP(CB_BIND), 1);
> @@ -271,7 +271,7 @@ nvc0_compute_validate_buffers(struct nvc0_context *nvc0)
>     int i;
>
>     BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -   PUSH_DATA (push, 2048);
> +   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
>     PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s));
>     PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s));
>     BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 4 * NVC0_MAX_BUFFERS);
> @@ -406,7 +406,7 @@ nvc0_compute_upload_input(struct nvc0_context *nvc0,
>     }
>
>     BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -   PUSH_DATA (push, 2048);
> +   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
>     PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
>     PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> index f6d535a..7acd477 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> @@ -104,7 +104,7 @@
>  #define NVC0_CB_USR_SIZE (6 << 16)
>  /* 6 driver constbuts, at 2K each */
>  #define NVC0_CB_AUX_INFO(s) NVC0_CB_USR_SIZE + (s << 11)
> -#define NVC0_CB_AUX_SIZE (6 << 11)
> +#define NVC0_CB_AUX_SIZE (1 << 11)
>  /* XXX: Figure out what this UNK data is. */
>  #define NVC0_CB_AUX_UNK_INFO 0x000
>  #define NVC0_CB_AUX_UNK_SIZE (8 * 4)
> @@ -138,7 +138,7 @@
>  #define NVC0_CB_AUX_MP_INFO 0x600
>  #define NVC0_CB_AUX_MP_SIZE 3 * 4
>  /* 4 32-bits floats for the vertex runout, put at the end */
> -#define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + NVC0_CB_AUX_SIZE
> +#define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + (NVC0_CB_AUX_SIZE * 6)
>
>  struct nvc0_blitctx;
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> index 27cbbc4..944349d 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> @@ -1836,7 +1836,7 @@ nvc0_hw_sm_upload_input(struct nvc0_context *nvc0, struct nvc0_hw_query *hq)
>        PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 1));
>     } else {
>        BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
> -      PUSH_DATA (push, 2048);
> +      PUSH_DATA (push, NVC0_CB_AUX_SIZE);
>        PUSH_DATAh(push, address);
>        PUSH_DATA (push, address);
>        BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 3);
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> index e0bfd3b..d22150a 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> @@ -960,7 +960,7 @@ nvc0_screen_create(struct nouveau_device *dev)
>        /* TIC and TSC entries for each unit (nve4+ only) */
>        /* auxiliary constants (6 user clip planes, base instance id) */
>        BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
> -
[Mesa-dev] [PATCH] nvc0: use a define for the driver constant buffer size
This might avoid mistakes if the size is bumped in the future.

Signed-off-by: Samuel Pitoiset
---
 src/gallium/drivers/nouveau/nvc0/nvc0_compute.c        | 8
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h        | 4 ++--
 src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c    | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c         | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 8
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c            | 6 +++---
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c            | 4 ++--
 7 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
index 10a4c83..dc4d1b3 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
@@ -115,7 +115,7 @@ nvc0_screen_compute_setup(struct nvc0_screen *screen,

    /* MS sample coordinate offsets */
    BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-   PUSH_DATA (push, 2048);
+   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
    PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
    PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
    BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 2 * 8);
@@ -253,7 +253,7 @@ nvc0_compute_validate_driverconst(struct nvc0_context *nvc0)
    struct nvc0_screen *screen = nvc0->screen;

    BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-   PUSH_DATA (push, 2048);
+   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
    PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
    PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
    BEGIN_NVC0(push, NVC0_CP(CB_BIND), 1);
@@ -271,7 +271,7 @@ nvc0_compute_validate_buffers(struct nvc0_context *nvc0)
    int i;

    BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-   PUSH_DATA (push, 2048);
+   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
    PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s));
    PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s));
    BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 4 * NVC0_MAX_BUFFERS);
@@ -406,7 +406,7 @@ nvc0_compute_upload_input(struct nvc0_context *nvc0,
    }

    BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-   PUSH_DATA (push, 2048);
+   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
    PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));
    PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(5));

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
index f6d535a..7acd477 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -104,7 +104,7 @@
 #define NVC0_CB_USR_SIZE (6 << 16)
 /* 6 driver constbuts, at 2K each */
 #define NVC0_CB_AUX_INFO(s) NVC0_CB_USR_SIZE + (s << 11)
-#define NVC0_CB_AUX_SIZE (6 << 11)
+#define NVC0_CB_AUX_SIZE (1 << 11)
 /* XXX: Figure out what this UNK data is. */
 #define NVC0_CB_AUX_UNK_INFO 0x000
 #define NVC0_CB_AUX_UNK_SIZE (8 * 4)
@@ -138,7 +138,7 @@
 #define NVC0_CB_AUX_MP_INFO 0x600
 #define NVC0_CB_AUX_MP_SIZE 3 * 4
 /* 4 32-bits floats for the vertex runout, put at the end */
-#define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + NVC0_CB_AUX_SIZE
+#define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + (NVC0_CB_AUX_SIZE * 6)

 struct nvc0_blitctx;

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
index 27cbbc4..944349d 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
@@ -1836,7 +1836,7 @@ nvc0_hw_sm_upload_input(struct nvc0_context *nvc0, struct nvc0_hw_query *hq)
       PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 1));
    } else {
       BEGIN_NVC0(push, NVC0_CP(CB_SIZE), 3);
-      PUSH_DATA (push, 2048);
+      PUSH_DATA (push, NVC0_CB_AUX_SIZE);
       PUSH_DATAh(push, address);
       PUSH_DATA (push, address);
       BEGIN_1IC0(push, NVC0_CP(CB_POS), 1 + 3);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index e0bfd3b..d22150a 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -960,7 +960,7 @@ nvc0_screen_create(struct nouveau_device *dev)
       /* TIC and TSC entries for each unit (nve4+ only) */
       /* auxiliary constants (6 user clip planes, base instance id) */
       BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
-      PUSH_DATA (push, 2048);
+      PUSH_DATA (push, NVC0_CB_AUX_SIZE);
       PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(i));
       PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(i));
       BEGIN_NVC0(push, NVC0_3D(CB_BIND(i)), 1);
diff --git
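The header change in this patch redefines NVC0_CB_AUX_SIZE from the total size of the six driver constant buffers (6 << 11) to the size of one of them (1 << 11), and adjusts NVC0_CB_AUX_RUNOUT_INFO so that it still points just past all six. The arithmetic can be checked with a standalone sketch like the one below. The CB_* names are shortened stand-ins rather than the driver's macros, CB_AUX_COUNT and aux_offset() are hypothetical additions, and the extra parentheses around the macro bodies are a choice made for this sketch (the macros in the quoted header are written without them).

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the values in the quoted nvc0_context.h hunk. */
#define CB_USR_SIZE         (6 << 16)
#define CB_AUX_SIZE         (1 << 11)                 /* one buffer: 2048 bytes */
#define CB_AUX_COUNT        6                         /* hypothetical name */
#define CB_AUX_INFO(s)      (CB_USR_SIZE + ((s) << 11))
#define CB_AUX_RUNOUT_INFO  (CB_USR_SIZE + (CB_AUX_SIZE * CB_AUX_COUNT))

/* Value before the patch, kept only to show the offset did not move. */
#define OLD_CB_AUX_SIZE         (6 << 11)
#define OLD_CB_AUX_RUNOUT_INFO  (CB_USR_SIZE + OLD_CB_AUX_SIZE)

/* Hypothetical helper: byte offset of driver constant buffer `s`. */
uint32_t
aux_offset(unsigned s)
{
   assert(s < CB_AUX_COUNT);
   return CB_AUX_INFO(s);
}
```

With these definitions CB_AUX_SIZE equals the literal 2048 that the patch replaces, the runout offset is the same before and after the change, and the last buffer (index 5) ends exactly where the runout area begins.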
Re: [Mesa-dev] [PATCH] nvc0: fix the driver cb size when draw parameters are used
Reviewed-by: Ilia Mirkin

A follow-up patch to replace all those 2048's with some #define would be great :)

On Mon, Jul 11, 2016 at 3:26 PM, Samuel Pitoiset wrote:
> The size of the driver constant buffer for each stage should be 2048
> and not 512 because it has been increased recently for buffers/images.
> While we are at it, do the same change for indirect draws.
>
> This fixes all ARB_shader_draw_parameters tests on GM107.
>
> Signed-off-by: Samuel Pitoiset
> Cc: 12.0
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
> index 4e40ff5..94274bc 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
> @@ -835,7 +835,7 @@ nvc0_draw_indirect(struct nvc0_context *nvc0, const struct pipe_draw_info *info)
>
>     /* Queue things up to let the macros write params to the driver constbuf */
>     BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
> -   PUSH_DATA (push, 512);
> +   PUSH_DATA (push, 2048);
>     PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
>     PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
>     BEGIN_NVC0(push, NVC0_3D(CB_POS), 1);
> @@ -979,7 +979,7 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
>     if (nvc0->vertprog->vp.need_draw_parameters) {
>        PUSH_SPACE(push, 9);
>        BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
> -      PUSH_DATA (push, 512);
> +      PUSH_DATA (push, 2048);
>        PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
>        PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
>        if (!info->indirect) {
> --
> 2.8.0
>
[Mesa-dev] [PATCH] nvc0: fix the driver cb size when draw parameters are used
The size of the driver constant buffer for each stage should be 2048
and not 512 because it has been increased recently for buffers/images.
While we are at it, do the same change for indirect draws.

This fixes all ARB_shader_draw_parameters tests on GM107.

Signed-off-by: Samuel Pitoiset
Cc: 12.0
---
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
index 4e40ff5..94274bc 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
@@ -835,7 +835,7 @@ nvc0_draw_indirect(struct nvc0_context *nvc0, const struct pipe_draw_info *info)

    /* Queue things up to let the macros write params to the driver constbuf */
    BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
-   PUSH_DATA (push, 512);
+   PUSH_DATA (push, 2048);
    PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
    PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
    BEGIN_NVC0(push, NVC0_3D(CB_POS), 1);
@@ -979,7 +979,7 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
    if (nvc0->vertprog->vp.need_draw_parameters) {
       PUSH_SPACE(push, 9);
       BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
-      PUSH_DATA (push, 512);
+      PUSH_DATA (push, 2048);
       PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
       PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
       if (!info->indirect) {
--
2.8.0
Re: [Mesa-dev] [PATCH v2 3/6] i965/fs/gen7: split instructions that run into exec masking bugs
Francisco Jerez writes:

> Samuel Iglesias Gonsálvez writes:
>
>> From: Iago Toral Quiroga
>>
>> In fp64 we can produce code like this:
>>
>> mov(16) vgrf2<2>:UD, vgrf3<2>:UD
>>
>> That our simd lowering pass would typically split in instructions with a
>> width of 8, writing to two consecutive registers each. Unfortunately, gen7
>> hardware has a bug affecting execution masking and as a result, the
>> second GRF register write won't work properly. Curro verified this:
>>
>> "The problem is that pre-Gen8 EUs are hardwired to use the QtrCtrl+1
>> (where QtrCtrl is the 8-bit quarter of the execution mask signals
>> specified in the instruction control fields) for the second
>> compressed half of any single-precision instruction (for
>> double-precision instructions it's hardwired to use NibCtrl+1,
>> at least on HSW), which means that the EU will apply the wrong
>> execution controls for the second sequential GRF write if the number
>> of channels per GRF is not exactly eight in single-precision mode (or
>> four in double-float mode)."
>>
>> In practice, this means that we cannot write more than one
>> consecutive GRF in a single instruction if the number of channels
>> per GRF is not exactly eight in single-precision mode (or four
>> in double-float mode).
>>
>> This patch makes our SIMD lowering pass split this kind of instructions
>> so that the split versions only write to a single register. In the
>> example above this means that we split the write in 4 instructions, each
>> one writing 4 UD elements (width = 4) to a single register.
>>
>> v2 (Curro):
>> - Make explicit that the thing about hardwiring NibCtrl+1 for the second
>>   compressed half is known to happen in Haswell and the issue with IVB
>>   might not be exactly the same.
>> - Assign max_width instead of returning early so that we can handle
>>   multiple restrictions affecting to the same instruction.
>> - Avoid division by 0 if the instruction does not write any registers.
>> - Ignore instructions what have WE_all set.
>> - Use the instruction execution type size instead of the dst type size.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 28
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 2f473cc..4d57412 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -4691,6 +4691,34 @@ get_fpu_lowered_simd_width(const struct brw_device_info *devinfo,
>>      */
>>     unsigned reg_count = inst->regs_written;
>>
>
> You've put this right in the middle of one of my previous workarounds
> ;), can you move it down a bit more next to the "According to the IVB
> PRMs" line, or close to the end of the function?
>
>> +   /* Pre-Gen8 EUs are hardwired to use the QtrCtrl+1 (where QtrCtrl is
>> +    * the 8-bit quarter of the execution mask signals specified in the
>> +    * instruction control fields) for the second compressed half of any
>> +    * single-precision instruction (for double-precision instructions
>> +    * it's hardwired to use NibCtrl+1, at least on HSW), which means that
>> +    * the EU will apply the wrong execution controls for the second
>> +    * sequential GRF write if the number of channels per GRF is not exactly
>> +    * eight in single-precision mode (or four in double-float mode).
>> +    *
>> +    * In this situation we calculate the maximum size of the split
>> +    * instructions so they only ever write to a single register.
>> +    */
>> +   if (devinfo->gen < 8 && inst->regs_written > 1 &&
>> +       !inst->force_writemask_all) {
>> +      unsigned channels_per_grf = inst->exec_size / inst->regs_written;
>
> Could be declared const.
>
>> +      unsigned exec_type_size = 0;
>> +      for (int i = 0; i < inst->sources; i++) {
>> +         if (inst->src[i].file == BAD_FILE)
>> +            break;
>
> It wouldn't be right to break early if the instruction has any valid
> sources after a non-present one.  This should probably be:
>
> |          if (inst->src[i].file != BAD_FILE)
> |             exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
>
> instead.
>
>> +         exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
>> +      }
>> +      assert(exec_type_size);
>> +
>> +      if (channels_per_grf != REG_SIZE / exec_type_size) {
>
> I think you really need to use (exec_type_size == 8 ? 4 : 8) instead of
> the RHS of this expression.  The hardware shifts exactly 8 channels per
> compressed half of the instruction regardless of the execution type,

(for execution types other than DF that is)

> so
> this formula would give you an incorrect answer for exec_type_size < 4.
>
>> +         max_width = MIN2(max_width, channels_per_grf);
>> +      }
>
> Redundant braces.
>
>> +   }
>> +
>>     for (unsigned i = 0; i < inst->sources;
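The review comments above amount to a small change in how the width limit is computed: skip absent sources instead of stopping at the first one, and compare the channels per GRF against 8 (or 4 for a 64-bit execution type) rather than REG_SIZE / exec_type_size. The standalone model below puts those suggestions together so the arithmetic can be tried in isolation. It is a sketch, not the Mesa function: inst_model, src_model, exec_type_size() and gen7_max_width() are hypothetical stand-ins for the fs_inst fields and helpers named in the quoted diff, and the generation check is left out.

```c
#include <assert.h>

#define MAX_SOURCES 3

/* Reduced stand-ins for the driver types; only what the check needs. */
enum reg_file { BAD_FILE, VGRF };

struct src_model {
   enum reg_file file;
   unsigned type_size;          /* bytes per element of the source type */
};

struct inst_model {
   unsigned exec_size;          /* channels executed by the instruction */
   unsigned regs_written;       /* consecutive GRFs written */
   int force_writemask_all;     /* WE_all */
   unsigned sources;
   struct src_model src[MAX_SOURCES];
};

/* Largest type size over the sources that are present.  Absent sources
 * are skipped rather than ending the scan, as the review asks.
 */
unsigned
exec_type_size(const struct inst_model *inst)
{
   unsigned size = 0;

   for (unsigned i = 0; i < inst->sources; i++) {
      if (inst->src[i].file != BAD_FILE && inst->src[i].type_size > size)
         size = inst->src[i].type_size;
   }

   return size;
}

/* Clamp max_width so that each split instruction writes a single GRF
 * whenever the channels-per-GRF count is not the one the hardware
 * assumes (8, or 4 for a 64-bit execution type).
 */
unsigned
gen7_max_width(const struct inst_model *inst, unsigned max_width)
{
   if (inst->regs_written > 1 && !inst->force_writemask_all) {
      const unsigned channels_per_grf = inst->exec_size / inst->regs_written;
      const unsigned type_size = exec_type_size(inst);

      assert(type_size);

      const unsigned expected = type_size == 8 ? 4 : 8;

      if (channels_per_grf != expected && channels_per_grf < max_width)
         max_width = channels_per_grf;
   }

   return max_width;
}
```

Modelling the mov(16) from the commit message as 16 channels written across 4 GRFs with a 4-byte execution type gives 4 channels per GRF, which differs from 8, so the limit drops to 4, matching the "4 instructions, each one writing 4 UD elements" described there. An ordinary SIMD16 float MOV writing 2 GRFs keeps its width.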
Re: [Mesa-dev] [PATCH v2 6/6] i965/fs: do d2x lowering before simd splitting
Samuel Iglesias Gonsálvez writes:

> So that we can have gen7 split large writes produced by this lowering pass.
>
> Signed-off-by: Samuel Iglesias Gonsálvez

Reviewed-by: Francisco Jerez

> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4bf0ca2..d131106 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -5843,6 +5843,11 @@ fs_visitor::optimize()
>        OPT(dead_code_eliminate);
>     }
>
> +   if (OPT(lower_d2x)) {
> +      OPT(opt_copy_propagate);
> +      OPT(dead_code_eliminate);
> +   }
> +
>     OPT(lower_simd_width);
>
>     /* After SIMD lowering just in case we had to unroll the EOT send. */
> @@ -5879,11 +5884,6 @@ fs_visitor::optimize()
>        OPT(dead_code_eliminate);
>     }
>
> -   if (OPT(lower_d2x)) {
> -      OPT(opt_copy_propagate);
> -      OPT(dead_code_eliminate);
> -   }
> -
>     OPT(opt_combine_constants);
>     OPT(lower_integer_multiplication);
>
> --
> 2.7.4
>
Re: [Mesa-dev] [PATCH v2 3/6] i965/fs/gen7: split instructions that run into exec masking bugs
Samuel Iglesias Gonsálvez writes:

> From: Iago Toral Quiroga
>
> In fp64 we can produce code like this:
>
> mov(16) vgrf2<2>:UD, vgrf3<2>:UD
>
> That our simd lowering pass would typically split into instructions with a
> width of 8, writing to two consecutive registers each. Unfortunately, gen7
> hardware has a bug affecting execution masking and as a result, the
> second GRF register write won't work properly. Curro verified this:
>
> "The problem is that pre-Gen8 EUs are hardwired to use the QtrCtrl+1
>  (where QtrCtrl is the 8-bit quarter of the execution mask signals
>  specified in the instruction control fields) for the second
>  compressed half of any single-precision instruction (for
>  double-precision instructions it's hardwired to use NibCtrl+1,
>  at least on HSW), which means that the EU will apply the wrong
>  execution controls for the second sequential GRF write if the number
>  of channels per GRF is not exactly eight in single-precision mode (or
>  four in double-float mode)."
>
> In practice, this means that we cannot write more than one
> consecutive GRF in a single instruction if the number of channels
> per GRF is not exactly eight in single-precision mode (or four
> in double-float mode).
>
> This patch makes our SIMD lowering pass split this kind of instruction
> so that the split versions only write to a single register. In the
> example above this means that we split the write into 4 instructions, each
> one writing 4 UD elements (width = 4) to a single register.
>
> v2 (Curro):
> - Make explicit that the thing about hardwiring NibCtrl+1 for the second
>   compressed half is known to happen in Haswell and the issue with IVB
>   might not be exactly the same.
> - Assign max_width instead of returning early so that we can handle
>   multiple restrictions affecting the same instruction.
> - Avoid division by 0 if the instruction does not write any registers.
> - Ignore instructions that have WE_all set.
> - Use the instruction execution type size instead of the dst type size.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 28
>  1 file changed, 28 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 2f473cc..4d57412 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -4691,6 +4691,34 @@ get_fpu_lowered_simd_width(const struct
> brw_device_info *devinfo,
>      */
>     unsigned reg_count = inst->regs_written;
>

You've put this right in the middle of one of my previous workarounds
;), can you move it down a bit more next to the "According to the IVB
PRMs" line, or close to the end of the function?

> +   /* Pre-Gen8 EUs are hardwired to use the QtrCtrl+1 (where QtrCtrl is
> +    * the 8-bit quarter of the execution mask signals specified in the
> +    * instruction control fields) for the second compressed half of any
> +    * single-precision instruction (for double-precision instructions
> +    * it's hardwired to use NibCtrl+1, at least on HSW), which means that
> +    * the EU will apply the wrong execution controls for the second
> +    * sequential GRF write if the number of channels per GRF is not exactly
> +    * eight in single-precision mode (or four in double-float mode).
> +    *
> +    * In this situation we calculate the maximum size of the split
> +    * instructions so they only ever write to a single register.
> +    */
> +   if (devinfo->gen < 8 && inst->regs_written > 1 &&
> +       !inst->force_writemask_all) {
> +      unsigned channels_per_grf = inst->exec_size / inst->regs_written;

Could be declared const.

> +      unsigned exec_type_size = 0;
> +      for (int i = 0; i < inst->sources; i++) {
> +         if (inst->src[i].file == BAD_FILE)
> +            break;

It wouldn't be right to break early if the instruction has any valid
sources after a non-present one. This should probably be:

| if (inst->src[i].file != BAD_FILE)
|    exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));

instead.

> +         exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
> +      }
> +      assert(exec_type_size);
> +
> +      if (channels_per_grf != REG_SIZE / exec_type_size) {

I think you really need to use (exec_type_size == 8 ? 4 : 8) instead of
the RHS of this expression. The hardware shifts exactly 8 channels per
compressed half of the instruction regardless of the execution type, so
this formula would give you an incorrect answer for exec_type_size < 4.

> +         max_width = MIN2(max_width, channels_per_grf);
> +      }

Redundant braces.

> +   }
> +
>     for (unsigned i = 0; i < inst->sources; i++)
>        reg_count = MAX2(reg_count, (unsigned)inst->regs_read(i));
>
> --
> 2.7.4
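For readers who want to play with the numbers, the review comments in this message can be folded into a small model. This is only a sketch in Python, not the Mesa code: the helper name `gen7_exec_mask_max_width`, its flat argument list and the default `max_width` of 32 are made up for illustration, `None` stands in for a `BAD_FILE` source, and type sizes are in bytes. It skips non-present sources instead of breaking at the first one, and it compares against (exec_type_size == 8 ? 4 : 8) as suggested in the review.

```python
def gen7_exec_mask_max_width(gen, exec_size, regs_written, src_type_sizes,
                             force_writemask_all=False, max_width=32):
    """Toy model of the reviewed workaround (hypothetical helper).

    src_type_sizes holds the byte size of each source type; None stands
    in for a source that is not present (BAD_FILE in the patch).
    """
    # The restriction only applies to pre-Gen8 instructions that write
    # more than one register and do not have WE_all set.
    if gen >= 8 or regs_written <= 1 or force_writemask_all:
        return max_width

    channels_per_grf = exec_size // regs_written

    # Skip missing sources rather than stopping at the first one.
    exec_type_size = max((s for s in src_type_sizes if s is not None),
                         default=0)
    assert exec_type_size, "instruction has no valid sources"

    # Per the review: four channels per GRF for 8-byte (DF) execution
    # types, eight for everything else.
    expected_channels = 4 if exec_type_size == 8 else 8
    if channels_per_grf != expected_channels:
        max_width = min(max_width, channels_per_grf)
    return max_width


# mov(16) vgrf2<2>:UD, vgrf3<2>:UD, assuming 32-byte GRFs: 16 channels of
# 4 bytes at stride 2 cover 4 registers, so the write is limited to width 4.
strided_ud_width = gen7_exec_mask_max_width(7, 16, 4, [4])

# A SIMD16 instruction with 2-byte sources writing 2 registers keeps its
# width; REG_SIZE / exec_type_size would have expected 16 channels per GRF.
word_source_width = gen7_exec_mask_max_width(7, 16, 2, [None, 2])
```

With the strided UD example from the commit message the model returns 4, which matches the "4 instructions, each one writing 4 UD elements" described there.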
Re: [Mesa-dev] [PATCH 1/7] glsl: Separate overlapping sentinel nodes in exec_list.
On Fri, Jul 8, 2016 at 3:18 PM, Matt Turner wrote:
> I do appreciate the cleverness, but unfortunately it prevents a lot more
> cleverness in the form of additional compiler optimizations brought on
> by -fstrict-aliasing.
>
> No difference in OglBatch7 (n=20).
>
> Co-authored-by: Davin McCall
> ---
> I took Ian's suggestion to add get_head_raw() and get_tail_raw() methods
> and use them in place of head_sentinel.next and tail_sentinel.prev.
>
>  src/compiler/glsl/ast.h                            |   4 +-
>  src/compiler/glsl/ast_function.cpp                 |  22 +--
>  src/compiler/glsl/ast_to_hir.cpp                   |   6 +-
>  src/compiler/glsl/ast_type.cpp                     |   2 +-
>  src/compiler/glsl/glsl_parser_extras.cpp           |   6 +-
>  src/compiler/glsl/ir.cpp                           |   8 +-
>  src/compiler/glsl/ir_clone.cpp                     |   2 +-
>  src/compiler/glsl/ir_constant_expression.cpp       |   2 +-
>  src/compiler/glsl/ir_function.cpp                  |  14 +-
>  src/compiler/glsl/ir_reader.cpp                    |   4 +-
>  src/compiler/glsl/ir_validate.cpp                  |   4 +-
>  src/compiler/glsl/list.h                           | 184 -
>  src/compiler/glsl/lower_distance.cpp               |   4 +-
>  src/compiler/glsl/lower_jumps.cpp                  |   2 +-
>  src/compiler/glsl/lower_packed_varyings.cpp        |   8 +-
>  src/compiler/glsl/lower_tess_level.cpp             |   4 +-
>  src/compiler/glsl/opt_conditional_discard.cpp      |   6 +-
>  src/compiler/glsl/opt_dead_builtin_varyings.cpp    |   2 +-
>  src/compiler/glsl/opt_dead_code.cpp                |   2 +-
>  src/compiler/glsl/opt_flatten_nested_if_blocks.cpp |   2 +-
>  src/compiler/nir/nir.h                             |   4 +-
>  src/compiler/nir/nir_opt_gcm.c                     |   2 +-
>  src/mesa/drivers/dri/i965/brw_cfg.h                |   2 +-
>  src/mesa/drivers/dri/i965/brw_fs_builder.h         |   2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_builder.h       |   2 +-
>  25 files changed, 164 insertions(+), 136 deletions(-)
>
> diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
> index 06c7b03..fa5a731 100644
> --- a/src/compiler/glsl/ast.h
> +++ b/src/compiler/glsl/ast.h
> @@ -346,8 +346,8 @@ public:
>
>     bool is_single_dimension() const
>     {
> -      return this->array_dimensions.tail_pred->prev != NULL &&
> -             this->array_dimensions.tail_pred->prev->is_head_sentinel();
> +      return this->array_dimensions.get_tail_raw()->prev != NULL &&
> +             this->array_dimensions.get_tail_raw()->is_head_sentinel();

There's a missing ->prev on this line. Fixed locally, and passes Jenkins.
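To see why the ->prev step matters, here is a toy model of a list with two separate sentinel nodes, written in Python. It is a sketch for illustration only: the class and method names below (`ExecList`, `push_tail`, `has_single_element`) are my own and are not the exec_list API, apart from `get_tail_raw`, which is modelled on the description in the patch as a replacement for tail_sentinel.prev.

```python
class Node:
    """A list node; sentinels are plain nodes that carry no payload."""

    def __init__(self, value=None):
        self.value = value
        self.prev = None
        self.next = None


class ExecList:
    """Doubly linked list with separate head and tail sentinels (toy)."""

    def __init__(self):
        self.head_sentinel = Node()
        self.tail_sentinel = Node()
        self.head_sentinel.next = self.tail_sentinel
        self.tail_sentinel.prev = self.head_sentinel

    def is_head_sentinel(self, node):
        return node is self.head_sentinel

    def get_tail_raw(self):
        # Last node before the tail sentinel; the head sentinel if empty.
        return self.tail_sentinel.prev

    def push_tail(self, value):
        node = Node(value)
        last = self.tail_sentinel.prev
        last.next = node
        node.prev = last
        node.next = self.tail_sentinel
        self.tail_sentinel.prev = node

    def has_single_element(self):
        tail = self.get_tail_raw()
        # The predecessor of the last node must be the head sentinel,
        # so the check has to go through tail.prev, not tail itself.
        return tail.prev is not None and self.is_head_sentinel(tail.prev)

    def has_single_element_missing_prev(self):
        tail = self.get_tail_raw()
        # Same check with the ->prev step dropped, as in the quoted hunk.
        return tail.prev is not None and self.is_head_sentinel(tail)
```

In this model the variant without the extra step can never be true: the only node that is the head sentinel has no predecessor, so the two conditions exclude each other.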
Re: [Mesa-dev] [PATCH shader-db 1/3] Remove duplicated shaders across apps
On Monday, July 11, 2016 8:10:44 PM PDT Marek Olšák wrote:
> From: Marek Olšák
>
> $ fdupes -rdN .
>
>    [+] ./yofrankie/129.shader_test
>    [-] ./yofrankie/126.shader_test
>
>    [+] ./yofrankie/123.shader_test
>    [-] ./yofrankie/132.shader_test
>
>    [+] ./humus-volumetricfogging2/9.shader_test
>    [-] ./humus-celshading/9.shader_test
>
>    [+] ./nexuiz/6.shader_test
>    [-] ./humus-volumetricfogging2/12.shader_test
>    [-] ./humus-domino/12.shader_test
>    [-] ./yofrankie/6.shader_test
>    [-] ./humus-celshading/12.shader_test

Series is:
Acked-by: Kenneth Graunke

Also, thanks for the pointer about fdupes! I hadn't heard of it and
instead had been using my own script:

http://whitecape.org/stuff/find-duplicates

Being able to point people at a distro-packaged program is a lot nicer.
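For anyone curious what such a duplicate finder boils down to, here is a minimal Python sketch. It is neither fdupes nor the script linked above; `find_duplicates` is a hypothetical helper that simply groups files by a hash of their contents, and unlike `fdupes -rdN` it only reports the groups and deletes nothing.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Return groups of paths under `root` whose contents are identical.

    Files are grouped by the SHA-256 digest of their contents; only
    groups with more than one member are returned.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            by_digest[digest].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates("shaders")` from a shader-db checkout would list each set of identical .shader_test files, leaving the choice of which copy to keep to the caller.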
[Mesa-dev] [PATCH shader-db 1/3] Remove duplicated shaders across apps
From: Marek Olšák$ fdupes -rdN . [+] ./yofrankie/129.shader_test [-] ./yofrankie/126.shader_test [+] ./yofrankie/123.shader_test [-] ./yofrankie/132.shader_test [+] ./humus-volumetricfogging2/9.shader_test [-] ./humus-celshading/9.shader_test [+] ./nexuiz/6.shader_test [-] ./humus-volumetricfogging2/12.shader_test [-] ./humus-domino/12.shader_test [-] ./yofrankie/6.shader_test [-] ./humus-celshading/12.shader_test --- shaders/humus-celshading/12.shader_test | 22 - shaders/humus-celshading/9.shader_test | 17 - shaders/humus-domino/12.shader_test | 22 - shaders/humus-volumetricfogging2/12.shader_test | 22 - shaders/yofrankie/126.shader_test | 1614 -- shaders/yofrankie/132.shader_test | 1619 --- shaders/yofrankie/6.shader_test | 22 - 7 files changed, 3338 deletions(-) delete mode 100644 shaders/humus-celshading/12.shader_test delete mode 100644 shaders/humus-celshading/9.shader_test delete mode 100644 shaders/humus-domino/12.shader_test delete mode 100644 shaders/humus-volumetricfogging2/12.shader_test delete mode 100644 shaders/yofrankie/126.shader_test delete mode 100644 shaders/yofrankie/132.shader_test delete mode 100644 shaders/yofrankie/6.shader_test diff --git a/shaders/humus-celshading/12.shader_test b/shaders/humus-celshading/12.shader_test deleted file mode 100644 index 8b913af..000 --- a/shaders/humus-celshading/12.shader_test +++ /dev/null @@ -1,22 +0,0 @@ -[require] -GLSL >= 1.30 - -[fragment shader] -#version 130 -uniform ivec4 color; -out ivec4 out_color; - -void main() -{ - out_color = color; -} - -[vertex shader] -#version 130 -in vec4 position; -void main() -{ - gl_Position = position; -} - - diff --git a/shaders/humus-celshading/9.shader_test b/shaders/humus-celshading/9.shader_test deleted file mode 100644 index 30ba541..000 --- a/shaders/humus-celshading/9.shader_test +++ /dev/null @@ -1,17 +0,0 @@ -[require] -GLSL >= 1.10 - -[fragment shader] -uniform vec4 color; -void main() -{ - gl_FragColor = color; -} - -[vertex shader] -attribute vec4 
position; -void main() -{ - gl_Position = position; -} - diff --git a/shaders/humus-domino/12.shader_test b/shaders/humus-domino/12.shader_test deleted file mode 100644 index 8b913af..000 --- a/shaders/humus-domino/12.shader_test +++ /dev/null @@ -1,22 +0,0 @@ -[require] -GLSL >= 1.30 - -[fragment shader] -#version 130 -uniform ivec4 color; -out ivec4 out_color; - -void main() -{ - out_color = color; -} - -[vertex shader] -#version 130 -in vec4 position; -void main() -{ - gl_Position = position; -} - - diff --git a/shaders/humus-volumetricfogging2/12.shader_test b/shaders/humus-volumetricfogging2/12.shader_test deleted file mode 100644 index 8b913af..000 --- a/shaders/humus-volumetricfogging2/12.shader_test +++ /dev/null @@ -1,22 +0,0 @@ -[require] -GLSL >= 1.30 - -[fragment shader] -#version 130 -uniform ivec4 color; -out ivec4 out_color; - -void main() -{ - out_color = color; -} - -[vertex shader] -#version 130 -in vec4 position; -void main() -{ - gl_Position = position; -} - - diff --git a/shaders/yofrankie/126.shader_test b/shaders/yofrankie/126.shader_test deleted file mode 100644 index 236419b..000 --- a/shaders/yofrankie/126.shader_test +++ /dev/null @@ -1,1614 +0,0 @@ -[require] -GLSL >= 1.10 - -[fragment shader] - -float exp_blender(float f) -{ - return pow(2.71828182846, f); -} - -void rgb_to_hsv(vec4 rgb, out vec4 outcol) -{ - float cmax, cmin, h, s, v, cdelta; - vec3 c; - - cmax = max(rgb[0], max(rgb[1], rgb[2])); - cmin = min(rgb[0], min(rgb[1], rgb[2])); - cdelta = cmax-cmin; - - v = cmax; - if (cmax!=0.0) - s = cdelta/cmax; - else { - s = 0.0; - h = 0.0; - } - - if (s == 0.0) { - h = 0.0; - } - else { - c = (vec3(cmax, cmax, cmax) - rgb.xyz)/cdelta; - - if (rgb.x==cmax) h = c[2] - c[1]; - else if (rgb.y==cmax) h = 2.0 + c[0] - c[2]; - else h = 4.0 + c[1] - c[0]; - - h /= 6.0; - - if (h<0.0) - h += 1.0; - } - - outcol = vec4(h, s, v, rgb.w); -} - -void hsv_to_rgb(vec4 hsv, out vec4 outcol) -{ - float i, f, p, q, t, h, s, v; - vec3 rgb; - - h = hsv[0]; 
- s = hsv[1]; - v = hsv[2]; - - if(s==0.0) { - rgb = vec3(v, v, v); - } - else { - if(h==1.0) - h = 0.0; - - h *= 6.0; - i = floor(h); - f = h - i; - rgb = vec3(f, f, f); - p = v*(1.0-s); - q = v*(1.0-(s*f)); - t = v*(1.0-(s*(1.0-f))); - - if (i == 0.0) rgb = vec3(v, t,
[Mesa-dev] [PATCH shader-db 3/3] si-report.py: don't crash if there are no shaders found
From: Marek Olšák

---
 si-report.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/si-report.py b/si-report.py
index c7fe1b5..69af89e 100755
--- a/si-report.py
+++ b/si-report.py
@@ -366,6 +366,9 @@ def compare_results(before_all_results, after_all_results):
             errors_names.append(name)

     print '{} shaders in {} tests'.format(num_shaders, num_tests)
+    if num_shaders == 0:
+        return
+
     print "Totals:"
     print_before_after_stats(total_before, total_after)
     print "Totals from affected shaders:"
--
2.7.4
[Mesa-dev] [PATCH shader-db 2/3] Remove split-to-files.py
From: Marek OlšákUse MESA_SHADER_CAPTURE_PATH instead. --- split-to-files.py | 138 -- 1 file changed, 138 deletions(-) delete mode 100755 split-to-files.py diff --git a/split-to-files.py b/split-to-files.py deleted file mode 100755 index 721b2da..000 --- a/split-to-files.py +++ /dev/null @@ -1,138 +0,0 @@ -#!/usr/bin/env python3 - -import re -import os -import argparse - - -def parse_input(infile): -shaders = dict() -programs = dict() -shadertuple = ("bad", 0) -prognum = "" -reading = False -is_glsl = True - -for line in infile.splitlines(): -declmatch = re.match( -r"GLSL (.*) shader (.*) source for linked program (.*):", line) -arbmatch = re.match( -r"ARB_([^_]*)_program source for program (.*):", line) -if declmatch: -shadertype = declmatch.group(1) -shadernum = declmatch.group(2) -prognum = declmatch.group(3) -shadertuple = (shadertype, shadernum) - -# don't save driver-internal shaders. -if prognum == "0": -continue - -if prognum not in shaders: -shaders[prognum] = dict() -if shadertuple in shaders[prognum]: -print("Warning: duplicate", shadertype, " shader ", shadernum, - "in program", prognum, "...tossing old shader.") -shaders[prognum][shadertuple] = '' -reading = True -is_glsl = True -print("Reading program {0} {1} shader {2}".format( -prognum, shadertype, shadernum)) -elif arbmatch: -shadertype = arbmatch.group(1) -prognum = arbmatch.group(2) -if prognum in programs: -print("dupe!") -exit(1) -programs[prognum] = (shadertype, '') -reading = True -is_glsl = False -print("Reading program {0} {1} shader".format(prognum, shadertype)) -elif re.match("GLSL IR for ", line): -reading = False -elif re.match("Mesa IR for ", line): -reading = False -elif re.match("GLSL source for ", line): -reading = False -elif reading: -if is_glsl: -shaders[prognum][shadertuple] += line + '\n' -else: -type, source = programs[prognum] -programs[prognum] = (type, ''.join([source, line, '\n'])) - -return (shaders, programs) - - -def write_shader_test(filename, shaders): -print("Writing 
{0}".format(filename)) -out = open(filename, 'w') - -min_version = 110 -for stage, num in shaders: -shader = shaders[(stage, num)] -m = re.match(r"^#version (\d\d\d)", shader) -if m: -version = int(m.group(1), 10) -if version > min_version: -min_version = version - -out.write("[require]\n") -out.write("GLSL >= %.2f\n" % (min_version / 100.)) -out.write("\n") - -for stage, num in shaders: -if stage == "vertex": -out.write("[vertex shader]\n") -elif stage == "fragment": -out.write("[fragment shader]\n") -elif stage == "geometry": -out.write("[geometry shader]\n") -elif stage == "tess ctrl" or stage == "tessellation control": -out.write("[tessellation control shader]\n") -elif stage == "tess eval" or stage == "tessellation evaluation": -out.write("[tessellation evaluation shader]\n") -else: -assert False, stage -out.write(shaders[(stage, num)]) - -out.close() - -def write_arb_shader_test(filename, type, source): -print("Writing {0}".format(filename)) -out = open(filename, 'w') -out.write("[require]\n") -out.write("GL_ARB_{0}_program\n".format(type)) -out.write("\n") -out.write("[{0} program]\n".format(type)) -out.write(source) -# INTEL_DEBUG won't output anything for ARB programs unless you draw -out.write("\n[test]\ndraw rect -1 -1 1 2\n"); -out.close() - -def write_files(directory, shaders, programs): -for prog in shaders: -write_shader_test("{0}/{1}.shader_test".format(directory, prog), - shaders[prog]) -for prognum in programs: -prog = programs[prognum] -write_arb_shader_test("{0}/{1}p-{2}.shader_test".format(directory, -prog[0][0], prognum), prog[0], prog[1]) - -def main(): -parser = argparse.ArgumentParser() -parser.add_argument('appname', help='Output directory (application name)') -parser.add_argument('mesadebug', help='MESA_GLSL=dump output file') -args = parser.parse_args() - -dirname = "shaders/{0}".format(args.appname) -if not os.path.isdir(dirname): -os.mkdir(dirname) - -with open(args.mesadebug, 'r') as infile: -
Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast
On Mon, Jul 11, 2016 at 2:01 PM, Marek Olšák wrote:
> On Mon, Jul 11, 2016 at 7:55 PM, Ilia Mirkin wrote:
>> On Mon, Jul 11, 2016 at 1:48 PM, Marek Olšák wrote:
>>> On Mon, Jul 11, 2016 at 7:31 PM, Ilia Mirkin wrote:
>>>> On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák wrote:
>>>>> From: Marek Olšák
>>>>>
>>>>> This bug is uncovered by glsl/lower_if_to_cond_assign.
>>>>> I don't know if it can be reproduced in any other way.
>>>>>
>>>>> Cc:
>>>>> ---
>>>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>>>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>> index 76656f5..0b7feb7 100644
>>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>> @@ -1958,12 +1958,14 @@
>>>>> glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>>>>>           emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>>>>>        break;
>>>>>     case ir_unop_bitcast_f2i:
>>>>> -      result_src = op[0];
>>>>> -      result_src.type = GLSL_TYPE_INT;
>>>>> -      break;
>>>>>     case ir_unop_bitcast_f2u:
>>>>> -      result_src = op[0];
>>>>> -      result_src.type = GLSL_TYPE_UINT;
>>>>> +      /* Make sure we don't propagate the negate modifier to integer
>>>>> opcodes. */
>>>>> +      if (op[0].negate)
>>>>
>>>> Or abs or saturate, presumably?
>>>
>>> glsl_to_tgsi doesn't use ureg_abs.
>>
>> I'd rather not rely on that... it's pretty cheap to throw in there.
>
> I can't throw abs in there because st_src_reg doesn't have abs. :)

Oh. I see. Then

Reviewed-by: Ilia Mirkin

as is :)

Sorry for being dense, I should have looked at the code but was relying
on my (obviously poor) memory.
Re: [Mesa-dev] [PATCH 2/2] anv/dump: Fix post-blit memory barrier
On Thu, Jul 7, 2016 at 8:12 PM, Jason Ekstrand wrote:

> Drp...
>
> Reviewed-by: Jason Ekstrand
>

I added my R-B to your two and pushed the lot of them. Thanks for the
fixups!

> On Jul 7, 2016 4:06 PM, "Chad Versace" wrote:
>
>> Swap srcAccessMask and dstAccessMask.
>> ---
>>  src/intel/vulkan/anv_dump.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/intel/vulkan/anv_dump.c b/src/intel/vulkan/anv_dump.c
>> index 49a5ae2..4a5a44f 100644
>> --- a/src/intel/vulkan/anv_dump.c
>> +++ b/src/intel/vulkan/anv_dump.c
>> @@ -158,8 +158,8 @@ dump_image_do_blit(struct anv_device *device, struct
>> dump_image *image,
>>        0, 0, NULL, 0, NULL, 1,
>>        &(VkImageMemoryBarrier) {
>>           .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
>> -         .srcAccessMask = VK_ACCESS_HOST_READ_BIT,
>> -         .dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
>> +         .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
>> +         .dstAccessMask = VK_ACCESS_HOST_READ_BIT,
>>           .oldLayout = VK_IMAGE_LAYOUT_GENERAL,
>>           .newLayout = VK_IMAGE_LAYOUT_GENERAL,
>>           .srcQueueFamilyIndex = 0,
>> --
>> 2.9.0.rc2
Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast
On Mon, Jul 11, 2016 at 7:55 PM, Ilia Mirkin wrote:
> On Mon, Jul 11, 2016 at 1:48 PM, Marek Olšák wrote:
>> On Mon, Jul 11, 2016 at 7:31 PM, Ilia Mirkin wrote:
>>> On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák wrote:
>>>> From: Marek Olšák
>>>>
>>>> This bug is uncovered by glsl/lower_if_to_cond_assign.
>>>> I don't know if it can be reproduced in any other way.
>>>>
>>>> Cc:
>>>> ---
>>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> index 76656f5..0b7feb7 100644
>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> @@ -1958,12 +1958,14 @@
>>>> glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>>>>           emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>>>>        break;
>>>>     case ir_unop_bitcast_f2i:
>>>> -      result_src = op[0];
>>>> -      result_src.type = GLSL_TYPE_INT;
>>>> -      break;
>>>>     case ir_unop_bitcast_f2u:
>>>> -      result_src = op[0];
>>>> -      result_src.type = GLSL_TYPE_UINT;
>>>> +      /* Make sure we don't propagate the negate modifier to integer
>>>> opcodes. */
>>>> +      if (op[0].negate)
>>>
>>> Or abs or saturate, presumably?
>>
>> glsl_to_tgsi doesn't use ureg_abs.
>
> I'd rather not rely on that... it's pretty cheap to throw in there.

I can't throw abs in there because st_src_reg doesn't have abs. :)

>
>>
>> saturate is a dst modifier and this patch operates on src operands.
>
> Er, right, of course. Ignore that.
>
> With abs thrown into the condition,
>
> Reviewed-by: Ilia Mirkin

Thanks.

Marek
Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast
On Mon, Jul 11, 2016 at 1:48 PM, Marek Olšák wrote:
> On Mon, Jul 11, 2016 at 7:31 PM, Ilia Mirkin wrote:
>> On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák wrote:
>>> From: Marek Olšák
>>>
>>> This bug is uncovered by glsl/lower_if_to_cond_assign.
>>> I don't know if it can be reproduced in any other way.
>>>
>>> Cc:
>>> ---
>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>> index 76656f5..0b7feb7 100644
>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>> @@ -1958,12 +1958,14 @@
>>> glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>>>           emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>>>        break;
>>>     case ir_unop_bitcast_f2i:
>>> -      result_src = op[0];
>>> -      result_src.type = GLSL_TYPE_INT;
>>> -      break;
>>>     case ir_unop_bitcast_f2u:
>>> -      result_src = op[0];
>>> -      result_src.type = GLSL_TYPE_UINT;
>>> +      /* Make sure we don't propagate the negate modifier to integer
>>> opcodes. */
>>> +      if (op[0].negate)
>>
>> Or abs or saturate, presumably?
>
> glsl_to_tgsi doesn't use ureg_abs.

I'd rather not rely on that... it's pretty cheap to throw in there.

>
> saturate is a dst modifier and this patch operates on src operands.

Er, right, of course. Ignore that.

With abs thrown into the condition,

Reviewed-by: Ilia Mirkin
[Mesa-dev] [PATCH] mapi: Massage code to allow clang to compile.
According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code was
violating the spec, resulting in it failing to compile.

Cc: mesa-sta...@lists.freedesktop.org
Co-authored-by: Tomasz Paweł Gajc
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599
---
I've tried for months to reproduce this, and I've still never been able
to on 64-bit builds. I can reproduce it on 32-bit however.

On MSVC, this patch will have the effect of changing the variables from
static to extern. I do not know if this will adversely affect anything,
so this patch would benefit from MSVC testing.

 configure.ac                |  1 +
 src/mapi/entry_x86-64_tls.h |  9 +++--
 src/mapi/entry_x86_tls.h    | 10 --
 src/mapi/entry_x86_tsd.h    |  9 +++--
 4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/configure.ac b/configure.ac
index 367e9b5..ddd8e1d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -225,6 +225,7 @@ AX_GCC_FUNC_ATTRIBUTE([packed])
 AX_GCC_FUNC_ATTRIBUTE([pure])
 AX_GCC_FUNC_ATTRIBUTE([returns_nonnull])
 AX_GCC_FUNC_ATTRIBUTE([unused])
+AX_GCC_FUNC_ATTRIBUTE([visibility])
 AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
 AX_GCC_FUNC_ATTRIBUTE([weak])

diff --git a/src/mapi/entry_x86-64_tls.h b/src/mapi/entry_x86-64_tls.h
index 38faccc..c5262a1 100644
--- a/src/mapi/entry_x86-64_tls.h
+++ b/src/mapi/entry_x86-64_tls.h
@@ -25,6 +25,11 @@
  *    Chia-I Wu
  */

+#ifdef HAVE_FUNC_ATTRIBUTE_VISIBILITY
+#define HIDDEN __attribute__((visibility("hidden")))
+#else
+#define HIDDEN
+#endif

 __asm__(".text\n"
         ".balign 32\n"
@@ -54,8 +59,8 @@ entry_patch_public(void)
 {
 }

-static char
-x86_64_entry_start[];
+extern char
+x86_64_entry_start[] HIDDEN;

 mapi_func
 entry_get_public(int slot)
diff --git a/src/mapi/entry_x86_tls.h b/src/mapi/entry_x86_tls.h
index 46d2ece..231b409 100644
--- a/src/mapi/entry_x86_tls.h
+++ b/src/mapi/entry_x86_tls.h
@@ -27,6 +27,12 @@

 #include

+#ifdef HAVE_FUNC_ATTRIBUTE_VISIBILITY
+#define HIDDEN __attribute__((visibility("hidden")))
+#else
+#define HIDDEN
+#endif
+
 __asm__(".text");

 __asm__("x86_current_tls:\n\t"
@@ -71,8 +77,8 @@ __asm__(".text");
 extern unsigned long
 x86_current_tls();

-static char x86_entry_start[];
-static char x86_entry_end[];
+extern char x86_entry_start[] HIDDEN;
+extern char x86_entry_end[] HIDDEN;

 void
 entry_patch_public(void)
diff --git a/src/mapi/entry_x86_tsd.h b/src/mapi/entry_x86_tsd.h
index ea7bacb..03d9735 100644
--- a/src/mapi/entry_x86_tsd.h
+++ b/src/mapi/entry_x86_tsd.h
@@ -25,6 +25,11 @@
  *    Chia-I Wu
  */

+#ifdef HAVE_FUNC_ATTRIBUTE_VISIBILITY
+#define HIDDEN __attribute__((visibility("hidden")))
+#else
+#define HIDDEN
+#endif

 #define X86_ENTRY_SIZE 32

@@ -58,8 +63,8 @@ __asm__(".balign 32\n"
 #include
 #include "u_execmem.h"

-static const char x86_entry_start[];
-static const char x86_entry_end[];
+extern const char x86_entry_start[] HIDDEN;
+extern const char x86_entry_end[] HIDDEN;

 void
 entry_patch_public(void)
--
2.7.3
Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast
On Mon, Jul 11, 2016 at 7:31 PM, Ilia Mirkin wrote:
> On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák wrote:
>> From: Marek Olšák
>>
>> This bug is uncovered by glsl/lower_if_to_cond_assign.
>> I don't know if it can be reproduced in any other way.
>>
>> Cc:
>> ---
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> index 76656f5..0b7feb7 100644
>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> @@ -1958,12 +1958,14 @@
>> glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>>           emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>>        break;
>>     case ir_unop_bitcast_f2i:
>> -      result_src = op[0];
>> -      result_src.type = GLSL_TYPE_INT;
>> -      break;
>>     case ir_unop_bitcast_f2u:
>> -      result_src = op[0];
>> -      result_src.type = GLSL_TYPE_UINT;
>> +      /* Make sure we don't propagate the negate modifier to integer
>> opcodes. */
>> +      if (op[0].negate)
>
> Or abs or saturate, presumably?

glsl_to_tgsi doesn't use ureg_abs.

saturate is a dst modifier and this patch operates on src operands.

Marek
Re: [Mesa-dev] [Mesa-stable] [PATCH v2] mesa: etc2 online compression is unsupported, don't attempt it
On Fri, Jul 8, 2016 at 5:29 PM, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin
> Cc: "11.2 12.0"
> ---
>
> v1 -> v2: also include a mesa_is_etc2_format function which takes a GLenum.
>
>  src/mesa/main/glformats.c | 23 +++
>  src/mesa/main/glformats.h |  3 +++
>  src/mesa/main/teximage.c  |  1 +
>  3 files changed, 27 insertions(+)
>
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index 24ce7b0..90f525c 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -907,6 +907,29 @@ _mesa_is_astc_format(GLenum internalFormat)
>  }
>
>  /**
> + * Test if the given format is an ETC2 format.
> + */
> +GLboolean
> +_mesa_is_etc2_format(GLenum internalFormat)
> +{
> +   switch (internalFormat) {
> +   case GL_COMPRESSED_RGB8_ETC2:
> +   case GL_COMPRESSED_SRGB8_ETC2:
> +   case GL_COMPRESSED_RGBA8_ETC2_EAC:
> +   case GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC:
> +   case GL_COMPRESSED_R11_EAC:
> +   case GL_COMPRESSED_RG11_EAC:
> +   case GL_COMPRESSED_SIGNED_R11_EAC:
> +   case GL_COMPRESSED_SIGNED_RG11_EAC:
> +   case GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2:
> +   case GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2:
> +      return true;
> +   default:
> +      return false;
> +   }
> +}
> +
> +/**
>   * Test if the given format is an integer (non-normalized) format.
>   */
>  GLboolean
> diff --git a/src/mesa/main/glformats.h b/src/mesa/main/glformats.h
> index c73f464..474ede2 100644
> --- a/src/mesa/main/glformats.h
> +++ b/src/mesa/main/glformats.h
> @@ -61,6 +61,9 @@ extern GLboolean
>  _mesa_is_astc_format(GLenum internalFormat);
>
>  extern GLboolean
> +_mesa_is_etc2_format(GLenum internalFormat);
> +
> +extern GLboolean
>  _mesa_is_type_unsigned(GLenum type);
>
>  extern GLboolean
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index 26a6c21..81e46a1 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -1307,6 +1307,7 @@ bool
>  _mesa_format_no_online_compression(const struct gl_context *ctx, GLenum
> format)
>  {
>     return _mesa_is_astc_format(format) ||
> +          _mesa_is_etc2_format(format) ||
>            compressedteximage_only_format(ctx, format);
>  }
>
> --
> 2.7.3
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable

Reviewed-by: Anuj Phogat
Re: [Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast
On Mon, Jul 11, 2016 at 1:28 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> This bug is uncovered by glsl/lower_if_to_cond_assign.
> I don't know if it can be reproduced in any other way.
>
> Cc:
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 76656f5..0b7feb7 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -1958,12 +1958,14 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression*
> ir, st_src_reg *op)
>           emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
>        break;
>     case ir_unop_bitcast_f2i:
> -      result_src = op[0];
> -      result_src.type = GLSL_TYPE_INT;
> -      break;
>     case ir_unop_bitcast_f2u:
> -      result_src = op[0];
> -      result_src.type = GLSL_TYPE_UINT;
> +      /* Make sure we don't propagate the negate modifier to integer
> opcodes. */
> +      if (op[0].negate)

Or abs or saturate, presumably?

> +         emit_asm(ir, TGSI_OPCODE_MOV, result_dst, op[0]);
> +      else
> +         result_src = op[0];
> +      result_src.type = ir->operation == ir_unop_bitcast_f2i ? GLSL_TYPE_INT
> :
> +
> GLSL_TYPE_UINT;
>        break;
>     case ir_unop_bitcast_i2f:
>     case ir_unop_bitcast_u2f:
> --
> 2.7.4
[Mesa-dev] [PATCH] glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast
From: Marek Olšák

This bug is uncovered by glsl/lower_if_to_cond_assign.
I don't know if it can be reproduced in any other way.

Cc:
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 76656f5..0b7feb7 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1958,12 +1958,14 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
          emit_asm(ir, TGSI_OPCODE_TRUNC, result_dst, op[0]);
       break;
    case ir_unop_bitcast_f2i:
-      result_src = op[0];
-      result_src.type = GLSL_TYPE_INT;
-      break;
    case ir_unop_bitcast_f2u:
-      result_src = op[0];
-      result_src.type = GLSL_TYPE_UINT;
+      /* Make sure we don't propagate the negate modifier to integer opcodes. */
+      if (op[0].negate)
+         emit_asm(ir, TGSI_OPCODE_MOV, result_dst, op[0]);
+      else
+         result_src = op[0];
+      result_src.type = ir->operation == ir_unop_bitcast_f2i ? GLSL_TYPE_INT :
+                                                               GLSL_TYPE_UINT;
       break;
    case ir_unop_bitcast_i2f:
    case ir_unop_bitcast_u2f:
-- 
2.7.4
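For readers wondering why a float negate modifier must not leak into an integer opcode after a bitcast, here is a small standalone C sketch. It is not TGSI or Mesa code, and all helper names (float_bits, negate_as_float, negate_as_int) are made up for illustration. Float negation only flips the sign bit, while integer negation is two's complement, so for most values the two produce different bit patterns from the same source register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer (a bitcast). */
uint32_t float_bits(float f)
{
   uint32_t u;
   assert(sizeof(u) == sizeof(f));
   memcpy(&u, &f, sizeof(u));
   return u;
}

/* What a float negate source modifier does: flip the sign bit. */
uint32_t negate_as_float(uint32_t bits)
{
   return bits ^ 0x80000000u;
}

/* What an integer opcode would make of the same modifier: two's complement. */
uint32_t negate_as_int(uint32_t bits)
{
   return ~bits + 1u;
}

/* Returns 1 when the two interpretations disagree for the given value. */
int negate_interpretations_differ(float f)
{
   const uint32_t bits = float_bits(f);
   return negate_as_float(bits) != negate_as_int(bits);
}
```

For 1.0f (bit pattern 0x3f800000) the float negate yields 0xbf800000 while the integer negate yields 0xc0800000, which is why the patch resolves the modifier with a MOV before the value is retyped.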
Re: [Mesa-dev] [PATCH 10/14] isl: Add support for HiZ surfaces
On Sat, Jul 09, 2016 at 12:17:28PM -0700, Jason Ekstrand wrote: > --- > src/intel/isl/isl.c | 11 +++ > src/intel/isl/isl.h | 17 + > src/intel/isl/isl_format_layout.csv | 1 + > src/intel/isl/isl_gen6.c| 8 > src/intel/isl/isl_gen7.c| 10 +- > src/intel/isl/isl_gen8.c| 3 ++- > 6 files changed, 48 insertions(+), 2 deletions(-) > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c > index 8c114a2..9ccdea2 100644 > --- a/src/intel/isl/isl.c > +++ b/src/intel/isl/isl.c > @@ -167,6 +167,16 @@ isl_tiling_get_info(const struct isl_device *dev, >break; > } > > + case ISL_TILING_HIZ: > + /* HiZ buffers are required to have ISL_FORMAT_HIZ which is an 8x4 > + * 128bpb format. The tiling has the same physical dimensions as > + * Y-tiling but actually has two HiZ columns per Y-tiled column. > + */ > + assert(bs == 16); > + logical_el = isl_extent2d(16, 16); > + phys_B = isl_extent2d(128, 32); > + break; > + > default: >unreachable("not reached"); > } /* end switch */ > @@ -221,6 +231,7 @@ isl_surf_choose_tiling(const struct isl_device *dev, >CHOOSE(ISL_TILING_LINEAR); > } > > + CHOOSE(ISL_TILING_HIZ); > CHOOSE(ISL_TILING_Ys); > CHOOSE(ISL_TILING_Yf); > CHOOSE(ISL_TILING_Y0); > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h > index 85af2d1..9a60bbd 100644 > --- a/src/intel/isl/isl.h > +++ b/src/intel/isl/isl.h > @@ -345,6 +345,14 @@ enum isl_format { > ISL_FORMAT_ASTC_LDR_2D_12X10_FLT16 =638, > ISL_FORMAT_ASTC_LDR_2D_12X12_FLT16 =639, > > + /* The formats that follow are internal to ISL and as such don't have an > +* explicit number. We'll just let the C compiler assign it for us. Any > +* actual hardware formats *must* come before these in the list. 
> +*/ > + > + /* Formats for color compression surfaces */ > + ISL_FORMAT_HIZ, > + > /* Hardware doesn't understand this out-of-band value */ > ISL_FORMAT_UNSUPPORTED = UINT16_MAX, > }; > @@ -392,6 +400,9 @@ enum isl_txc { > ISL_TXC_ETC1, > ISL_TXC_ETC2, > ISL_TXC_ASTC, > + > + /* Used for auxiliary surface formats */ > + ISL_TXC_HIZ, > }; > > /** > @@ -410,6 +421,7 @@ enum isl_tiling { > ISL_TILING_Y0, /**< Legacy Y tiling */ > ISL_TILING_Yf, /**< Standard 4K tiling. The 'f' means "four". */ > ISL_TILING_Ys, /**< Standard 64K tiling. The 's' means "sixty-four". */ > + ISL_TILING_HIZ, /**< Tiling format for HiZ surfaces */ > }; > > /** > @@ -423,6 +435,7 @@ typedef uint32_t isl_tiling_flags_t; > #define ISL_TILING_Y0_BIT (1u << ISL_TILING_Y0) > #define ISL_TILING_Yf_BIT (1u << ISL_TILING_Yf) > #define ISL_TILING_Ys_BIT (1u << ISL_TILING_Ys) > +#define ISL_TILING_HIZ_BIT(1u << ISL_TILING_HIZ) > #define ISL_TILING_ANY_MASK (~0u) > #define ISL_TILING_NON_LINEAR_MASK(~ISL_TILING_LINEAR_BIT) > > @@ -505,6 +518,7 @@ typedef uint64_t isl_surf_usage_flags_t; > #define ISL_SURF_USAGE_DISPLAY_FLIP_X_BIT (1u << 10) > #define ISL_SURF_USAGE_DISPLAY_FLIP_Y_BIT (1u << 11) > #define ISL_SURF_USAGE_STORAGE_BIT (1u << 12) > +#define ISL_SURF_USAGE_HIZ_BIT (1u << 13) > /** @} */ > > /** > @@ -966,6 +980,9 @@ isl_format_has_bc_compression(enum isl_format fmt) > case ISL_TXC_ETC2: > case ISL_TXC_ASTC: >return false; > + > + case ISL_TXC_HIZ: > + unreachable("Should not be called on an aux surface"); > } > > unreachable("bad texture compression mode"); > diff --git a/src/intel/isl/isl_format_layout.csv > b/src/intel/isl/isl_format_layout.csv > index f90fbe0..3e681e8 100644 > --- a/src/intel/isl/isl_format_layout.csv > +++ b/src/intel/isl/isl_format_layout.csv > @@ -314,3 +314,4 @@ ASTC_LDR_2D_10X8_FLT16 , 128, 10, 8, 1, sf16, > sf16, sf16, sf16, , > ASTC_LDR_2D_10X10_FLT16 , 128, 10, 10, 1, sf16, sf16, sf16, sf16, , > ,, linear, astc > ASTC_LDR_2D_12X10_FLT16 , 128, 12, 10, 1, sf16, 
sf16, sf16, sf16, , > ,, linear, astc > ASTC_LDR_2D_12X12_FLT16 , 128, 12, 12, 1, sf16, sf16, sf16, sf16, , > ,, linear, astc > +HIZ , 128, 8, 4, 1, , , , , , > ,, , hiz > diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c > index 699aa41..b5050ed 100644 > --- a/src/intel/isl/isl_gen6.c > +++ b/src/intel/isl/isl_gen6.c > @@ -89,6 +89,14 @@ gen6_choose_image_alignment_el(const struct isl_device > *dev, > enum isl_msaa_layout msaa_layout, > struct isl_extent3d *image_align_el) > { > + if (info->format == ISL_FORMAT_HIZ) { > + /* HiZ surfaces are
Re: [Mesa-dev] [PATCH 07/14] isl: Use bpb in a few places where it's more natural than bs
On Mon, Jul 11, 2016 at 8:37 AM, Pohjolainen, Topi <topi.pohjolai...@intel.com> wrote:
> On Sat, Jul 09, 2016 at 12:17:24PM -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/isl/isl.c                                  | 2 +-
> >  src/intel/isl/isl_gen6.c                             | 2 +-
> >  src/intel/isl/isl_gen7.c                             | 2 +-
> >  src/intel/isl/isl_storage_image.c                    | 4 ++--
> >  src/intel/vulkan/anv_formats.c                       | 4 ++--
> >  src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 4 ++--
> >  6 files changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index a3a9427..796b4cc 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -996,7 +996,7 @@ isl_apply_surface_padding(const struct isl_device *dev,
> >      * padding requirements.
> >      */
> >     if (isl_format_is_yuv(info->format) &&
> > -       (fmtl->bs == 96 || fmtl->bs == 48|| fmtl->bs == 24)) {
> > +       (fmtl->bpb == 96 || fmtl->bpb == 48|| fmtl->bpb == 24)) {
>
> So these values were bits instead of bytes even though stored into 'bs'?
> Or how does this work? In the rest you have multiplied by eight.

We used to use bpb and then we switched to bs and these values were left in
bpb. Now we're switching back so I'm leaving them alone again. :-) In other
words, the old code had a bug that this is fixing. Since no one uses ISL for
YUV images yet, I didn't figure it was worth separating into its own bugfix
patch.

>
> >        *total_h_el += 1;
> >        *pad_bytes += 16;
> >     }
> > diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c
> > index 24c3939..699aa41 100644
> > --- a/src/intel/isl/isl_gen6.c
> > +++ b/src/intel/isl/isl_gen6.c
> > @@ -51,7 +51,7 @@ gen6_choose_msaa_layout(const struct isl_device *dev,
> >      *  - any compressed texture format (BC*)
> >      *  - any YCRCB* format
> >      */
> > -   if (fmtl->bs > 8)
> > +   if (fmtl->bpb > 64)
> >        return false;
> >     if (isl_format_is_compressed(info->format))
> >        return false;
> > diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> > index 542c137..d9b0c08 100644
> > --- a/src/intel/isl/isl_gen7.c
> > +++ b/src/intel/isl/isl_gen7.c
> > @@ -51,7 +51,7 @@ gen7_choose_msaa_layout(const struct isl_device *dev,
> >      *    formats: any format with greater than 64 bits per element, any
> >      *    compressed texture format (BC*), and any YCRCB* format.
> >      */
> > -   if (fmtl->bs > 8)
> > +   if (fmtl->bpb > 64)
> >        return false;
> >     if (isl_format_is_compressed(info->format))
> >        return false;
> > diff --git a/src/intel/isl/isl_storage_image.c b/src/intel/isl/isl_storage_image.c
> > index 590d2e4..2617eb0e 100644
> > --- a/src/intel/isl/isl_storage_image.c
> > +++ b/src/intel/isl/isl_storage_image.c
> > @@ -194,9 +194,9 @@ isl_has_matching_typed_storage_image_format(const struct brw_device_info *devinf
> >     if (devinfo->gen >= 9) {
> >        return true;
> >     } else if (devinfo->gen >= 8 || devinfo->is_haswell) {
> > -      return isl_format_get_layout(fmt)->bs <= 8;
> > +      return isl_format_get_layout(fmt)->bpb <= 64;
> >     } else {
> > -      return isl_format_get_layout(fmt)->bs <= 4;
> > +      return isl_format_get_layout(fmt)->bpb <= 32;
> >     }
> >  }
> >
> > diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
> > index 457e820..b26e48a 100644
> > --- a/src/intel/vulkan/anv_formats.c
> > +++ b/src/intel/vulkan/anv_formats.c
> > @@ -271,7 +271,7 @@ anv_get_format(const struct brw_device_info *devinfo, VkFormat vk_format,
> >           isl_format_get_layout(format.isl_format);
> >
> >        if (tiling == VK_IMAGE_TILING_OPTIMAL &&
> > -          !util_is_power_of_two(isl_layout->bs)) {
> > +          !util_is_power_of_two(isl_layout->bpb)) {
> >           /* Tiled formats *must* be power-of-two because we need up upload
> >            * them with the render pipeline. For 3-channel formats, we fix
> >            * this by switching them over to RGBX or RGBA formats under the
> > @@ -409,7 +409,7 @@ anv_physical_device_get_format_properties(struct anv_physical_device *physical_d
> >         * what most clients will want.
> >         */
> >        if (linear_fmt.isl_format != ISL_FORMAT_UNSUPPORTED &&
> > -          !util_is_power_of_two(isl_format_layouts[linear_fmt.isl_format].bs) &&
> > +          !util_is_power_of_two(isl_format_layouts[linear_fmt.isl_format].bpb) &&
> >            isl_format_rgb_to_rgbx(linear_fmt.isl_format) == ISL_FORMAT_UNSUPPORTED) {
> >           tiled &= ~VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT &
> >                    ~VK_FORMAT_FEATURE_BLIT_DST_BIT;
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> > index fc1fc13..a4774e6 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> > +++
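To make the bs/bpb mix-up from this exchange concrete, here is a minimal C sketch. The struct and function names (fmt_layout, needs_extra_row_old, needs_extra_row_new) are hypothetical stand-ins rather than ISL's real API, and the isl_format_is_yuv() part of the real condition is left out. It assumes, as the removed field's comment says, that bs was bpb / 8 rounded towards zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct isl_format_layout. */
struct fmt_layout {
   uint16_t bpb; /* bits per block */
};

/* Block size in bytes, rounded towards zero, like the old 'bs' field. */
uint32_t block_size_bytes(const struct fmt_layout *fmtl)
{
   return fmtl->bpb / 8;
}

/* Old check: compares a byte count against values that are really bits. */
bool needs_extra_row_old(const struct fmt_layout *fmtl)
{
   const uint32_t bs = block_size_bytes(fmtl);
   return bs == 96 || bs == 48 || bs == 24;
}

/* New check: compares bits per block against the same values. */
bool needs_extra_row_new(const struct fmt_layout *fmtl)
{
   assert(fmtl->bpb > 0);
   return fmtl->bpb == 96 || fmtl->bpb == 48 || fmtl->bpb == 24;
}
```

With a 24 bpb format the old check sees bs == 3 and never fires, while the new one does; conversely, a 192 bpb format has bs == 24 and would have matched the old check by accident.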
Re: [Mesa-dev] [PATCH 09/14] isl: Kill off isl_format_layout::bs
On Sat, Jul 09, 2016 at 12:17:27PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/gen_format_layout.py |  1 -
>  src/intel/isl/isl.c                | 11 ++-
>  src/intel/isl/isl.h                |  5 ++---
>  src/intel/isl/isl_gen9.c           | 14 +++---
>  src/intel/isl/isl_storage_image.c  |  4 ++--
>  src/intel/vulkan/anv_image.c       |  4 ++--
>  src/intel/vulkan/anv_meta_copy.c   |  4 ++--
>  7 files changed, 21 insertions(+), 22 deletions(-)
>
> diff --git a/src/intel/isl/gen_format_layout.py b/src/intel/isl/gen_format_layout.py
> index 803967e..c9163fe 100644
> --- a/src/intel/isl/gen_format_layout.py
> +++ b/src/intel/isl/gen_format_layout.py
> @@ -68,7 +68,6 @@ TEMPLATE = template.Template(
>           .format = ISL_FORMAT_${format.name},
>           .name = "ISL_FORMAT_${format.name}",
>           .bpb = ${format.bpb},
> -         .bs = ${format.bpb // 8},
>           .bw = ${format.bw},
>           .bh = ${format.bh},
>           .bd = ${format.bd},
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index e0e67e2..8c114a2 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -904,7 +904,8 @@ isl_calc_linear_row_pitch(const struct isl_device *dev,
>      *    being used to determine whether additional pages need to be defined.
>      */
>     assert(phys_slice0_sa->w % fmtl->bw == 0);
> -   row_pitch = MAX(row_pitch, fmtl->bs * (phys_slice0_sa->w / fmtl->bw));
> +   uint32_t bs = fmtl->bpb / 8;

Could be 'const'.

> +   row_pitch = MAX(row_pitch, bs * (phys_slice0_sa->w / fmtl->bw));
>
>     /* From the Broadwel PRM >> Volume 2d: Command Reference: Structures >>
>      * RENDER_SURFACE_STATE Surface Pitch (p349):
> @@ -922,9 +923,9 @@ isl_calc_linear_row_pitch(const struct isl_device *dev,
>      */
>     if (info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) {
>        if (isl_format_is_yuv(info->format)) {
> -         row_pitch = isl_align_npot(row_pitch, 2 * fmtl->bs);
> +         row_pitch = isl_align_npot(row_pitch, 2 * bs);
>        } else {
> -         row_pitch = isl_align_npot(row_pitch, fmtl->bs);
> +         row_pitch = isl_align_npot(row_pitch, bs);
>        }
>     }
>
> @@ -1120,9 +1121,9 @@ isl_surf_init_s(const struct isl_device *dev,
>        base_alignment = MAX(1, info->min_alignment);
>        if (info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) {
>           if (isl_format_is_yuv(info->format)) {
> -            base_alignment = MAX(base_alignment, 2 * fmtl->bs);
> +            base_alignment = MAX(base_alignment, fmtl->bpb / 4);
>           } else {
> -            base_alignment = MAX(base_alignment, fmtl->bs);
> +            base_alignment = MAX(base_alignment, fmtl->bpb / 8);
>           }
>        }
>     } else {
> diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> index 50c8e80..85af2d1 100644
> --- a/src/intel/isl/isl.h
> +++ b/src/intel/isl/isl.h
> @@ -641,7 +641,6 @@ struct isl_format_layout {
>
>     uint16_t bpb; /**< bits per block */
>
> -   uint8_t bs; /**< Block size, in bytes, rounded towards 0 */
>     uint8_t bw; /**< Block width, in pixels */
>     uint8_t bh; /**< Block height, in pixels */
>     uint8_t bd; /**< Block depth, in pixels */
> @@ -1201,8 +1200,8 @@ isl_surf_get_row_pitch_el(const struct isl_surf *surf)
>  {
>     const struct isl_format_layout *fmtl = isl_format_get_layout(surf->format);
>
> -   assert(surf->row_pitch % fmtl->bs == 0);
> -   return surf->row_pitch / fmtl->bs;
> +   assert(surf->row_pitch % (fmtl->bpb / 8) == 0);
> +   return surf->row_pitch / (fmtl->bpb / 8);
>  }
>
>  /**
> diff --git a/src/intel/isl/isl_gen9.c b/src/intel/isl/isl_gen9.c
> index aa290aa..39f4092 100644
> --- a/src/intel/isl/isl_gen9.c
> +++ b/src/intel/isl/isl_gen9.c
> @@ -40,7 +40,7 @@ gen9_calc_std_image_alignment_sa(const struct isl_device *dev,
>
>     assert(isl_tiling_is_std_y(tiling));
>
> -   const uint32_t bs = fmtl->bs;
> +   const uint32_t bpb = fmtl->bpb;
>     const uint32_t is_Ys = tiling == ISL_TILING_Ys;
>
>     switch (info->dim) {
> @@ -49,7 +49,7 @@ gen9_calc_std_image_alignment_sa(const struct isl_device *dev,
>         * Layout and Tiling > 1D Surfaces > 1D Alignment Requirements.
>         */
>        *align_sa = (struct isl_extent3d) {
> -         .w = 1 << (12 - (ffs(bs) - 1) + (4 * is_Ys)),
> +         .w = 1 << (12 - (ffs(bpb) - 4) + (4 * is_Ys)),
>           .h = 1,
>           .d = 1,
>        };
> @@ -60,8 +60,8 @@ gen9_calc_std_image_alignment_sa(const struct isl_device *dev,
>         * Requirements.
>         */
>        *align_sa = (struct isl_extent3d) {
> -         .w = 1 << (6 - ((ffs(bs) - 1) / 2) + (4 * is_Ys)),
> -         .h = 1 << (6 - ((ffs(bs) - 0) / 2) + (4 * is_Ys)),
> +         .w = 1 << (6 - ((ffs(bpb) - 4) / 2) + (4 * is_Ys)),
> +         .h = 1 << (6 - ((ffs(bpb) - 3) / 2) + (4 * is_Ys)),
>           .d = 1,
>        };
>
> @@ -86,9 +86,9 @@ gen9_calc_std_image_alignment_sa(const struct isl_device *dev,
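The gen9 hunks above replace ffs(bs) - 1 with ffs(bpb) - 4 (and ffs(bs) - 0 with ffs(bpb) - 3). A short C sketch of why that is the same value: since bpb = 8 * bs, the lowest set bit moves up by three positions, so ffs(bpb) == ffs(bs) + 3. The function names below are invented for this illustration and are not part of ISL.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h> /* ffs() (POSIX) */

/* Exponent term as written before the patch, in bytes per block. */
int log2_term_from_bs(uint32_t bs)
{
   return ffs((int)bs) - 1;
}

/* The same term after the patch, in bits per block. */
int log2_term_from_bpb(uint32_t bpb)
{
   return ffs((int)bpb) - 4;
}

/* Returns 1 if both forms agree for block sizes of 1, 2, 4, 8 and 16 bytes. */
int terms_agree(void)
{
   for (uint32_t bs = 1; bs <= 16; bs *= 2) {
      const uint32_t bpb = bs * 8;
      assert(bpb % 8 == 0);
      if (log2_term_from_bs(bs) != log2_term_from_bpb(bpb))
         return 0;
   }
   return 1;
}
```

The equivalence only holds when bpb is an exact multiple of 8, which is the case for the standard Y tilings this function asserts on.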
Re: [Mesa-dev] [PATCH 07/14] isl: Use bpb in a few places where it's more natural than bs
On Sat, Jul 09, 2016 at 12:17:24PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/isl.c                                  | 2 +-
>  src/intel/isl/isl_gen6.c                             | 2 +-
>  src/intel/isl/isl_gen7.c                             | 2 +-
>  src/intel/isl/isl_storage_image.c                    | 4 ++--
>  src/intel/vulkan/anv_formats.c                       | 4 ++--
>  src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp | 4 ++--
>  6 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index a3a9427..796b4cc 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -996,7 +996,7 @@ isl_apply_surface_padding(const struct isl_device *dev,
>      * padding requirements.
>      */
>     if (isl_format_is_yuv(info->format) &&
> -       (fmtl->bs == 96 || fmtl->bs == 48|| fmtl->bs == 24)) {
> +       (fmtl->bpb == 96 || fmtl->bpb == 48|| fmtl->bpb == 24)) {

So these values were bits instead of bytes even though stored into 'bs'?
Or how does this work? In the rest you have multiplied by eight.

>        *total_h_el += 1;
>        *pad_bytes += 16;
>     }
> diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c
> index 24c3939..699aa41 100644
> --- a/src/intel/isl/isl_gen6.c
> +++ b/src/intel/isl/isl_gen6.c
> @@ -51,7 +51,7 @@ gen6_choose_msaa_layout(const struct isl_device *dev,
>      *  - any compressed texture format (BC*)
>      *  - any YCRCB* format
>      */
> -   if (fmtl->bs > 8)
> +   if (fmtl->bpb > 64)
>        return false;
>     if (isl_format_is_compressed(info->format))
>        return false;
> diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
> index 542c137..d9b0c08 100644
> --- a/src/intel/isl/isl_gen7.c
> +++ b/src/intel/isl/isl_gen7.c
> @@ -51,7 +51,7 @@ gen7_choose_msaa_layout(const struct isl_device *dev,
>      *    formats: any format with greater than 64 bits per element, any
>      *    compressed texture format (BC*), and any YCRCB* format.
>      */
> -   if (fmtl->bs > 8)
> +   if (fmtl->bpb > 64)
>        return false;
>     if (isl_format_is_compressed(info->format))
>        return false;
> diff --git a/src/intel/isl/isl_storage_image.c b/src/intel/isl/isl_storage_image.c
> index 590d2e4..2617eb0e 100644
> --- a/src/intel/isl/isl_storage_image.c
> +++ b/src/intel/isl/isl_storage_image.c
> @@ -194,9 +194,9 @@ isl_has_matching_typed_storage_image_format(const struct brw_device_info *devinf
>     if (devinfo->gen >= 9) {
>        return true;
>     } else if (devinfo->gen >= 8 || devinfo->is_haswell) {
> -      return isl_format_get_layout(fmt)->bs <= 8;
> +      return isl_format_get_layout(fmt)->bpb <= 64;
>     } else {
> -      return isl_format_get_layout(fmt)->bs <= 4;
> +      return isl_format_get_layout(fmt)->bpb <= 32;
>     }
>  }
>
> diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
> index 457e820..b26e48a 100644
> --- a/src/intel/vulkan/anv_formats.c
> +++ b/src/intel/vulkan/anv_formats.c
> @@ -271,7 +271,7 @@ anv_get_format(const struct brw_device_info *devinfo, VkFormat vk_format,
>           isl_format_get_layout(format.isl_format);
>
>        if (tiling == VK_IMAGE_TILING_OPTIMAL &&
> -          !util_is_power_of_two(isl_layout->bs)) {
> +          !util_is_power_of_two(isl_layout->bpb)) {
>           /* Tiled formats *must* be power-of-two because we need up upload
>            * them with the render pipeline. For 3-channel formats, we fix
>            * this by switching them over to RGBX or RGBA formats under the
> @@ -409,7 +409,7 @@ anv_physical_device_get_format_properties(struct anv_physical_device *physical_d
>         * what most clients will want.
>         */
>        if (linear_fmt.isl_format != ISL_FORMAT_UNSUPPORTED &&
> -          !util_is_power_of_two(isl_format_layouts[linear_fmt.isl_format].bs) &&
> +          !util_is_power_of_two(isl_format_layouts[linear_fmt.isl_format].bpb) &&
>            isl_format_rgb_to_rgbx(linear_fmt.isl_format) == ISL_FORMAT_UNSUPPORTED) {
>           tiled &= ~VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT &
>                    ~VK_FORMAT_FEATURE_BLIT_DST_BIT;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> index fc1fc13..a4774e6 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
> @@ -982,7 +982,7 @@ namespace brw {
>              /* Untyped surface reads return 32 bits of the surface per
>               * component, without any sort of unpacking or type conversion,
>               */
> -            const unsigned size = isl_format_get_layout(format)->bs / 4;
> +            const unsigned size = isl_format_get_layout(format)->bpb / 32;
>              /* they don't properly handle out of bounds access, so we have to
>               * check manually if the coordinates are valid and predicate the
>
Re: [Mesa-dev] [PATCH 04/14] isl: Rework the way we define tile sizes.
On Sat, Jul 09, 2016 at 12:17:21PM -0700, Jason Ekstrand wrote:
> This is based on a very long set of discussions between Chad and myself
> about how we should properly represent HiZ and CCS buffers. The end result
> of that discussion was that a tiling actually has two different sizes, a
> logical size in elements, and a physical size in bytes and rows. This
> commit reworks ISL's pitch and size calculations to work in terms of these
> two sizes.
> ---
>  src/intel/isl/isl.c | 181 ++--
>  src/intel/isl/isl.h |  32 +-
>  2 files changed, 133 insertions(+), 80 deletions(-)
>
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 6f57ac2..633bfdf 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -111,30 +111,32 @@ isl_tiling_get_info(const struct isl_device *dev,
>                      struct isl_tile_info *tile_info)
>  {
>     const uint32_t bs = format_block_size;
> -   uint32_t width, height;
> +   struct isl_extent2d logical_el, phys_B;
>
>     assert(bs > 0);
> +   assert(tiling == ISL_TILING_LINEAR || isl_is_pow2(bs));
>
>     switch (tiling) {
>     case ISL_TILING_LINEAR:
> -      width = 1;
> -      height = 1;
> +      logical_el = isl_extent2d(1, 1);
> +      phys_B = isl_extent2d(bs, 1);
>        break;
>
>     case ISL_TILING_X:
> -      width = 1 << 9;
> -      height = 1 << 3;

Maybe:

         assert(bs < 512);

> +      logical_el = isl_extent2d(512 / bs, 8);
> +      phys_B = isl_extent2d(512, 8);
>        break;
>
>     case ISL_TILING_Y0:
> -      width = 1 << 7;
> -      height = 1 << 5;

And:

         assert(bs < 128);

> +      logical_el = isl_extent2d(128 / bs, 32);
> +      phys_B = isl_extent2d(128, 32);
>        break;
>
>     case ISL_TILING_W:
>        /* XXX: Should W tile be same as Y? */
> -      width = 1 << 6;
> -      height = 1 << 6;
> +      assert(bs == 1);
> +      logical_el = isl_extent2d(64, 64);
> +      phys_B = isl_extent2d(64, 64);
>        break;
>
>     case ISL_TILING_Yf:
> @@ -147,8 +149,11 @@ isl_tiling_get_info(const struct isl_device *dev,
>
>        bool is_Ys = tiling == ISL_TILING_Ys;
>
> -      width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
> -      height = 1 << (6 - (ffs(bs) / 2) + (2 * is_Ys));
> +      unsigned width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
> +      unsigned height = 1 << (6 - (ffs(bs) / 2) + (2 * is_Ys));

These could both be 'const'.

> +
> +      logical_el = isl_extent2d(width / bs, height);
> +      phys_B = isl_extent2d(width, height);
>        break;
>     }
>
> @@ -158,9 +163,8 @@ isl_tiling_get_info(const struct isl_device *dev,
>
>     *tile_info = (struct isl_tile_info) {
>        .tiling = tiling,
> -      .width = width,
> -      .height = height,
> -      .size = width * height,
> +      .logical_extent_el = logical_el,
> +      .phys_extent_B = phys_B,
>     };
>
>     return true;
> @@ -827,7 +831,7 @@ isl_calc_array_pitch_el_rows(const struct isl_device *dev,
>         *    Tile Mode != Linear: This field must be set to an integer multiple
>         *    of the tile height
>         */
> -      pitch_el_rows = isl_align(pitch_el_rows, tile_info->height);
> +      pitch_el_rows = isl_align(pitch_el_rows, tile_info->logical_extent_el.height);

Looks like you are overflowing the line here.

>     }
>
>     return pitch_el_rows;
> @@ -837,11 +841,9 @@ isl_calc_array_pitch_el_rows(const struct isl_device *dev,
>   * Calculate the pitch of each surface row, in bytes.
>   */
>  static uint32_t
> -isl_calc_row_pitch(const struct isl_device *dev,
> -                   const struct isl_surf_init_info *restrict info,
> -                   const struct isl_tile_info *tile_info,
> -                   const struct isl_extent3d *image_align_sa,
> -                   const struct isl_extent2d *phys_slice0_sa)
> +isl_calc_linear_row_pitch(const struct isl_device *dev,
> +                          const struct isl_surf_init_info *restrict info,
> +                          const struct isl_extent2d *phys_slice0_sa)
>  {
>     const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
>
> @@ -894,39 +896,26 @@ isl_calc_row_pitch(const struct isl_device *dev,
>     assert(phys_slice0_sa->w % fmtl->bw == 0);
>     row_pitch = MAX(row_pitch, fmtl->bs * (phys_slice0_sa->w / fmtl->bw));
>
> -   switch (tile_info->tiling) {
> -   case ISL_TILING_LINEAR:
> -      /* From the Broadwel PRM >> Volume 2d: Command Reference: Structures >>
> -       * RENDER_SURFACE_STATE Surface Pitch (p349):
> -       *
> -       *    - For linear render target surfaces and surfaces accessed with the
> -       *      typed data port messages, the pitch must be a multiple of the
> -       *      element size for non-YUV surface formats. Pitch must be
> -       *      a multiple of 2 * element size for YUV surface formats.
> -       *
> -       *    - [Requirements for SURFTYPE_BUFFER and SURFTYPE_STRBUF,
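As a rough illustration of the "two sizes per tiling" idea from the commit message, here is a C sketch covering only the X and legacy Y cases. The types extent2d and tile_info are simplified stand-ins for isl_extent2d and isl_tile_info, the function names are hypothetical, and the asserts follow the bounds suggested in the review rather than anything in the committed code.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct isl_extent2d. */
struct extent2d {
   uint32_t w;
   uint32_t h;
};

/* Simplified stand-in for the reworked struct isl_tile_info. */
struct tile_info {
   struct extent2d logical_extent_el; /* tile size in surface elements */
   struct extent2d phys_extent_B;     /* tile size in bytes by rows */
};

/* X tiling: 512 bytes wide and 8 rows tall, as in the quoted patch. */
struct tile_info x_tile_info(uint32_t bs)
{
   assert(bs > 0 && bs < 512);
   struct tile_info info = {
      .logical_extent_el = { 512 / bs, 8 },
      .phys_extent_B = { 512, 8 },
   };
   return info;
}

/* Legacy Y tiling: 128 bytes wide and 32 rows tall. */
struct tile_info y0_tile_info(uint32_t bs)
{
   assert(bs > 0 && bs < 128);
   struct tile_info info = {
      .logical_extent_el = { 128 / bs, 32 },
      .phys_extent_B = { 128, 32 },
   };
   return info;
}

/* Total number of bytes covered by one tile. */
uint32_t tile_size_B(const struct tile_info *info)
{
   return info->phys_extent_B.w * info->phys_extent_B.h;
}
```

The physical extent stays fixed per tiling (4096 bytes for both X and Y here), while the logical width shrinks as the block size grows; for example, a 4-byte format gives an X tile that is 128 elements wide.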
Re: [Mesa-dev] [PATCH 03/14] isl: Rework the way we handle surface padding
On Mon, Jul 11, 2016 at 8:04 AM, Pohjolainen, Topi < topi.pohjolai...@intel.com> wrote: > On Sat, Jul 09, 2016 at 12:17:20PM -0700, Jason Ekstrand wrote: > > --- > > src/intel/isl/isl.c | 52 > +--- > > 1 file changed, 25 insertions(+), 27 deletions(-) > > > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c > > index decba3d..6f57ac2 100644 > > --- a/src/intel/isl/isl.c > > +++ b/src/intel/isl/isl.c > > @@ -933,21 +933,21 @@ isl_calc_row_pitch(const struct isl_device *dev, > > } > > > > /** > > - * Calculate the surface's total height, including padding, in units of > > - * surface elements. > > + * Calculate and apply any padding required for the surface. > > + * > > + * @param[inout] total_h_el is updated with the new height > > + * @param[out] pad_bytes is overwritten with additional padding > requirements. > > */ > > -static uint32_t > > -isl_calc_total_height_el(const struct isl_device *dev, > > - const struct isl_surf_init_info *restrict info, > > - const struct isl_tile_info *tile_info, > > - uint32_t phys_array_len, > > - uint32_t row_pitch, > > - uint32_t array_pitch_el_rows) > > +static void > > +isl_apply_surface_padding(const struct isl_device *dev, > > + const struct isl_surf_init_info *restrict > info, > > + const struct isl_tile_info *tile_info, > > + uint32_t *total_h_el, > > + uint32_t *pad_bytes) > > { > > const struct isl_format_layout *fmtl = > isl_format_get_layout(info->format); > > > > - uint32_t total_h_el = phys_array_len * array_pitch_el_rows; > > - uint32_t pad_bytes = 0; > > + *pad_bytes = 0; > > > > /* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface > > * Formats >> Surface Padding Requirements >> Render Target and Media > > @@ -981,14 +981,14 @@ isl_calc_total_height_el(const struct isl_device > *dev, > > * is to an even compressed row. 
> > */ > > if (isl_format_is_compressed(info->format)) > > - total_h_el = isl_align(total_h_el, 2); > > + *total_h_el = isl_align(*total_h_el, 2); > > > > /* > > *- For cube surfaces, an additional two rows of padding are > required > > * at the bottom of the surface. > > */ > > if (info->usage & ISL_SURF_USAGE_CUBE_BIT) > > - total_h_el += 2; > > + *total_h_el += 2; > > > > /* > > *- For packed YUV, 96 bpt, 48 bpt, and 24 bpt surface formats, > > @@ -998,8 +998,8 @@ isl_calc_total_height_el(const struct isl_device > *dev, > > */ > > if (isl_format_is_yuv(info->format) && > > (fmtl->bs == 96 || fmtl->bs == 48|| fmtl->bs == 24)) { > > - total_h_el += 1; > > - pad_bytes += 16; > > + *total_h_el += 1; > > + *pad_bytes += 16; > > } > > > > /* > > @@ -1008,7 +1008,7 @@ isl_calc_total_height_el(const struct isl_device > *dev, > > * required above. > > */ > > if (tile_info->tiling == ISL_TILING_LINEAR) > > - pad_bytes += 64; > > + *pad_bytes += 64; > > > > /* The below text weakens, not strengthens, the padding requirements > for > > * linear surfaces. Therefore we can safely ignore it. > > @@ -1028,7 +1028,7 @@ isl_calc_total_height_el(const struct isl_device > *dev, > > if (ISL_DEV_GEN(dev) >= 9 && > > tile_info->tiling == ISL_TILING_LINEAR && > > (info->dim == ISL_SURF_DIM_2D || info->dim == ISL_SURF_DIM_3D)) { > > - total_h_el = isl_align(total_h_el, 4); > > + *total_h_el = isl_align(*total_h_el, 4); > > } > > > > /* > > @@ -1038,13 +1038,8 @@ isl_calc_total_height_el(const struct isl_device > *dev, > > if (ISL_DEV_GEN(dev) >= 9 && > > tile_info->tiling == ISL_TILING_LINEAR && > > info->dim == ISL_SURF_DIM_1D) { > > - total_h_el += 4; > > + *total_h_el += 4; > > } > > - > > - /* Be sloppy. Align any leftover padding to a row boundary. 
*/ > > - total_h_el += isl_align_div_npot(pad_bytes, row_pitch); > > - > > - return total_h_el; > > } > > > > bool > > @@ -1108,10 +1103,13 @@ isl_surf_init_s(const struct isl_device *dev, > > array_pitch_span, _align_sa, > > _level0_sa, _slice0_sa); > > > > - const uint32_t total_h_el = > > - isl_calc_total_height_el(dev, info, _info, > > - phys_level0_sa.array_len, row_pitch, > > - array_pitch_el_rows); > > + uint32_t total_h_el = phys_level0_sa.array_len * array_pitch_el_rows; > > + > > + uint32_t pad_bytes; > > + isl_apply_surface_padding(dev, info, _info, _h_el, > _bytes); > > + > > + /* Be sloppy. Align any leftover padding to a row boundary. */ > > + total_h_el += isl_align_div_npot(pad_bytes, row_pitch); >
Re: [Mesa-dev] [PATCH 03/14] isl: Rework the way we handle surface padding
On Sat, Jul 09, 2016 at 12:17:20PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/isl.c | 52 +---
>  1 file changed, 25 insertions(+), 27 deletions(-)
>
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index decba3d..6f57ac2 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -933,21 +933,21 @@ isl_calc_row_pitch(const struct isl_device *dev,
>  }
>
>  /**
> - * Calculate the surface's total height, including padding, in units of
> - * surface elements.
> + * Calculate and apply any padding required for the surface.
> + *
> + * @param[inout] total_h_el is updated with the new height
> + * @param[out] pad_bytes is overwritten with additional padding requirements.
>   */
> -static uint32_t
> -isl_calc_total_height_el(const struct isl_device *dev,
> -                         const struct isl_surf_init_info *restrict info,
> -                         const struct isl_tile_info *tile_info,
> -                         uint32_t phys_array_len,
> -                         uint32_t row_pitch,
> -                         uint32_t array_pitch_el_rows)
> +static void
> +isl_apply_surface_padding(const struct isl_device *dev,
> +                          const struct isl_surf_init_info *restrict info,
> +                          const struct isl_tile_info *tile_info,
> +                          uint32_t *total_h_el,
> +                          uint32_t *pad_bytes)
>  {
>     const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
>
> -   uint32_t total_h_el = phys_array_len * array_pitch_el_rows;
> -   uint32_t pad_bytes = 0;
> +   *pad_bytes = 0;
>
>     /* From the Broadwell PRM >> Volume 5: Memory Views >> Common Surface
>      * Formats >> Surface Padding Requirements >> Render Target and Media
> @@ -981,14 +981,14 @@ isl_calc_total_height_el(const struct isl_device *dev,
>      *      is to an even compressed row.
>      */
>     if (isl_format_is_compressed(info->format))
> -      total_h_el = isl_align(total_h_el, 2);
> +      *total_h_el = isl_align(*total_h_el, 2);
>
>     /*
>      *    - For cube surfaces, an additional two rows of padding are required
>      *      at the bottom of the surface.
>      */
>     if (info->usage & ISL_SURF_USAGE_CUBE_BIT)
> -      total_h_el += 2;
> +      *total_h_el += 2;
>
>     /*
>      *    - For packed YUV, 96 bpt, 48 bpt, and 24 bpt surface formats,
> @@ -998,8 +998,8 @@ isl_calc_total_height_el(const struct isl_device *dev,
>      */
>     if (isl_format_is_yuv(info->format) &&
>         (fmtl->bs == 96 || fmtl->bs == 48|| fmtl->bs == 24)) {
> -      total_h_el += 1;
> -      pad_bytes += 16;
> +      *total_h_el += 1;
> +      *pad_bytes += 16;
>     }
>
>     /*
> @@ -1008,7 +1008,7 @@ isl_calc_total_height_el(const struct isl_device *dev,
>      *      required above.
>      */
>     if (tile_info->tiling == ISL_TILING_LINEAR)
> -      pad_bytes += 64;
> +      *pad_bytes += 64;
>
>     /* The below text weakens, not strengthens, the padding requirements for
>      * linear surfaces. Therefore we can safely ignore it.
> @@ -1028,7 +1028,7 @@ isl_calc_total_height_el(const struct isl_device *dev,
>     if (ISL_DEV_GEN(dev) >= 9 &&
>         tile_info->tiling == ISL_TILING_LINEAR &&
>         (info->dim == ISL_SURF_DIM_2D || info->dim == ISL_SURF_DIM_3D)) {
> -      total_h_el = isl_align(total_h_el, 4);
> +      *total_h_el = isl_align(*total_h_el, 4);
>     }
>
>     /*
> @@ -1038,13 +1038,8 @@ isl_calc_total_height_el(const struct isl_device *dev,
>     if (ISL_DEV_GEN(dev) >= 9 &&
>         tile_info->tiling == ISL_TILING_LINEAR &&
>         info->dim == ISL_SURF_DIM_1D) {
> -      total_h_el += 4;
> +      *total_h_el += 4;
>     }
> -
> -   /* Be sloppy. Align any leftover padding to a row boundary. */
> -   total_h_el += isl_align_div_npot(pad_bytes, row_pitch);
> -
> -   return total_h_el;
>  }
>
>  bool
> @@ -1108,10 +1103,13 @@ isl_surf_init_s(const struct isl_device *dev,
>                                   array_pitch_span, &image_align_sa,
>                                   &phys_level0_sa, &phys_slice0_sa);
>
> -   const uint32_t total_h_el =
> -      isl_calc_total_height_el(dev, info, &tile_info,
> -                               phys_level0_sa.array_len, row_pitch,
> -                               array_pitch_el_rows);
> +   uint32_t total_h_el = phys_level0_sa.array_len * array_pitch_el_rows;
> +
> +   uint32_t pad_bytes;
> +   isl_apply_surface_padding(dev, info, &tile_info, &total_h_el, &pad_bytes);
> +
> +   /* Be sloppy. Align any leftover padding to a row boundary. */
> +   total_h_el += isl_align_div_npot(pad_bytes, row_pitch);

Am I reading this correctly that this is a non-functional change?

>
>     const uint32_t size =
>        row_pitch * isl_align(total_h_el, tile_info.height);
> --
> 2.5.0.400.gff86faf
>
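For context on the "Be sloppy" step that the patch moves into the caller, here is a hedged C sketch of how leftover byte padding can be folded into whole rows. It assumes isl_align_div_npot(n, a) behaves like a round-up division (align n up to a multiple of a, then divide by a); that is my reading of the name, not something confirmed in this thread. The function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed behaviour of isl_align_div_npot(): divide n by a, rounding up.
 * Works for alignments that are not powers of two.
 */
uint32_t align_div_npot(uint32_t n, uint32_t a)
{
   assert(a > 0);
   return (n + a - 1) / a;
}

/* Fold leftover padding bytes into the surface height as whole rows. */
uint32_t total_height_with_padding(uint32_t total_h_el,
                                   uint32_t pad_bytes,
                                   uint32_t row_pitch)
{
   return total_h_el + align_div_npot(pad_bytes, row_pitch);
}
```

Under that assumption, 64 bytes of padding on a 256-byte row pitch costs one extra row, and no padding costs none, which matches the intent of over-allocating slightly rather than tracking sub-row padding.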
Re: [Mesa-dev] Current status of the i965 dri3 bug #71759?
On 11/07/16 17:40, i...@iirolaiho.net wrote:
> Hi,
>
> I am asking about freedesktop bug #71759. The bug has been open since 2013, but the problems are only now beginning to surface: on Fedora+[RPMFusion|UnitedRPMs], it breaks H.264 playback in Totem with default settings (1). Fabrice Bellet has done some debugging on the problem and even submitted a patch, which however does not work as of now (2).
>
> 1) https://bugzilla.redhat.com/show_bug.cgi?id=1309446
> 2) https://github.com/UnitedRPMs/libva-intel-driver/issues/1
>
> Is there any progress on the issue? Is there anything a non-programmer could do to help?
>
> Have I understood correctly that these drivers are mainly developed by Intel employees on work time? If so, there certainly should be enough resources to fix it. The bug is currently assigned to Ian Romanick, who seems to be some kind of default assignee for these bugs.

Hey,

Sorry, I must have missed it. I will have a look at it tomorrow!

Thanks,
Martin
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Current status of the i965 dri3 bug #71759?
Hi,

I am asking about freedesktop bug #71759. The bug has been open since 2013, but the problems are only now beginning to surface: on Fedora+[RPMFusion|UnitedRPMs], it breaks H.264 playback in Totem with default settings (1). Fabrice Bellet has done some debugging on the problem and even submitted a patch, which however does not work as of now (2).

1) https://bugzilla.redhat.com/show_bug.cgi?id=1309446
2) https://github.com/UnitedRPMs/libva-intel-driver/issues/1

Is there any progress on the issue? Is there anything a non-programmer could do to help?

Have I understood correctly that these drivers are mainly developed by Intel employees on work time? If so, there certainly should be enough resources to fix it. The bug is currently assigned to Ian Romanick, who seems to be some kind of default assignee for these bugs.

– Iiro Laiho
Re: [Mesa-dev] [PATCH] i915: fix implicit truncation from 'int' to bitfield
On Mon, Jul 11, 2016 at 6:31 AM, Francesco Ansanelli wrote:
> ---
>  src/gallium/drivers/i915/i915_context.c |    6 +++---
>  src/gallium/drivers/i915/i915_flush.c   |    6 +++---

Please prefix patches to this directory "i915g: "

>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/i915/i915_context.c b/src/gallium/drivers/i915/i915_context.c
> index 82798bb..d7cdfd9 100644
> --- a/src/gallium/drivers/i915/i915_context.c
> +++ b/src/gallium/drivers/i915/i915_context.c
> @@ -216,9 +216,9 @@ i915_create_context(struct pipe_screen *screen, void *priv, unsigned flags)
>
>     i915->dirty = ~0;
>     i915->hardware_dirty = ~0;
> -   i915->immediate_dirty = ~0;
> -   i915->dynamic_dirty = ~0;
> -   i915->static_dirty = ~0;
> +   i915->immediate_dirty |= ~0;
> +   i915->dynamic_dirty |= ~0;
> +   i915->static_dirty |= ~0;

What exactly is the warning you see? I'm having a difficult time understanding how this could possibly help anything.
[Mesa-dev] [PATCH] i915: fix implicit truncation from 'int' to bitfield
---
 src/gallium/drivers/i915/i915_context.c |    6 +++---
 src/gallium/drivers/i915/i915_flush.c   |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/i915/i915_context.c b/src/gallium/drivers/i915/i915_context.c
index 82798bb..d7cdfd9 100644
--- a/src/gallium/drivers/i915/i915_context.c
+++ b/src/gallium/drivers/i915/i915_context.c
@@ -216,9 +216,9 @@ i915_create_context(struct pipe_screen *screen, void *priv, unsigned flags)
 
    i915->dirty = ~0;
    i915->hardware_dirty = ~0;
-   i915->immediate_dirty = ~0;
-   i915->dynamic_dirty = ~0;
-   i915->static_dirty = ~0;
+   i915->immediate_dirty |= ~0;
+   i915->dynamic_dirty |= ~0;
+   i915->static_dirty |= ~0;
    i915->flush_dirty = 0;
 
    return &i915->base;
diff --git a/src/gallium/drivers/i915/i915_flush.c b/src/gallium/drivers/i915/i915_flush.c
index 6311c00..db05f97 100644
--- a/src/gallium/drivers/i915/i915_flush.c
+++ b/src/gallium/drivers/i915/i915_flush.c
@@ -81,9 +81,9 @@ void i915_flush(struct i915_context *i915,
    batch->iws->batchbuffer_flush(batch, fence, flags);
    i915->vbo_flushed = 1;
    i915->hardware_dirty = ~0;
-   i915->immediate_dirty = ~0;
-   i915->dynamic_dirty = ~0;
-   i915->static_dirty = ~0;
+   i915->immediate_dirty |= ~0;
+   i915->dynamic_dirty |= ~0;
+   i915->static_dirty |= ~0;
    /* kernel emits flushes in between batchbuffers */
    i915->flush_dirty = 0;
    i915->fired_vertices += i915->queued_vertices;
-- 
1.7.9.5
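As a sketch of what the two forms in this patch do to the stored value: for an unsigned bitfield, both `= ~0` and `|= ~0` leave every bit of the field set, so the change can only affect how a compiler diagnoses the statement. The struct and field widths below are hypothetical (the real i915_context declarations are not shown in the patch), and it is an assumption that the warning in the subject comes from a constant-conversion check on the plain assignment.

```c
/* Hypothetical struct: field names follow the patch, widths are invented. */
struct dirty_bits {
   unsigned immediate_dirty : 11;
   unsigned dynamic_dirty : 4;
};

/* Plain assignment: the int constant ~0 (-1) is converted to the
 * 4-bit unsigned field, which some compilers may warn about. */
static unsigned set_by_assignment(void)
{
   struct dirty_bits d = { 0, 0 };
   d.dynamic_dirty = ~0;
   return d.dynamic_dirty;
}

/* Compound OR: computed in int, then truncated back into the field. */
static unsigned set_by_or(void)
{
   struct dirty_bits d = { 0, 0 };
   d.dynamic_dirty |= ~0;
   return d.dynamic_dirty;
}
```

Both helpers return 15 for the 4-bit field, so the two spellings agree on the result.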
Re: [Mesa-dev] [PATCH] i965: fix ignored qualifiers warning
On Saturday, July 9, 2016 10:16:29 AM PDT Francesco Ansanelli wrote:
> ---
>  src/mesa/drivers/dri/i965/gen6_queryobj.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> index 96db5e9..95a5c56 100644
> --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
> +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> @@ -99,7 +99,7 @@ write_xfb_primitives_written(struct brw_context *brw,
>     }
>  }
>
> -static inline const int
> +static inline int
>  pipeline_target_to_index(int target)
>  {
>     if (target == GL_GEOMETRY_SHADER_INVOCATIONS)
>

Pushed, thanks!
Re: [Mesa-dev] [PATCH v2 1/6] i965/fs: add a helper function to create double immediates
On Monday, July 11, 2016 1:37:46 PM PDT Samuel Iglesias Gonsálvez wrote:
> From: Iago Toral Quiroga
>
> Gen7 hardware does not support double immediates so these need
> to be moved in 32-bit chunks to a regular vgrf instead. Instead
> of doing this every time we need to create a DF immediate,
> create a helper function that does the right thing depending
> on the hardware generation.
>
> v2:
> - Define setup_imm_df() as an independent function (Curro)
> - Create a specific builder to get rid of some instruction field
>   assignments (Curro).
>
> Signed-off-by: Samuel Iglesias Gonsálvez
> Reviewed-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h       |  3 +++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 37
>  2 files changed, 40 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 1f88f8f..d034573 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -512,3 +512,6 @@ void shuffle_64bit_data_for_32bit_write(const brw::fs_builder &bld,
>                                          const fs_reg &dst,
>                                          const fs_reg &src,
>                                          uint32_t components);
> +fs_reg setup_imm_df(const struct brw_device_info *devinfo,
> +                    const brw::fs_builder &bld,
> +                    double v);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 04ed42e..94c719b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -4547,3 +4547,40 @@ shuffle_64bit_data_for_32bit_write(const fs_builder &bld,
>        bld.MOV(offset(dst, bld, 2 * i + 1), subscript(component_i, dst.type, 1));
>     }
>  }
> +
> +fs_reg
> +setup_imm_df(const struct brw_device_info *devinfo, const fs_builder &bld, double v)
> +{

If you like, you can just do:

   const struct brw_device_info *devinfo = bld.shader->devinfo;

and avoid the extra parameter. Either way is fine.

> +   assert(devinfo->gen >= 7);
> +
> +   if (devinfo->gen >= 8)
> +      return brw_imm_df(v);
> +
> +   /* gen7 does not support DF immediates, so we generate a 64-bit constant by
> +    * writing the low 32-bit of the constant to suboffset 0 of a VGRF and
> +    * the high 32-bit to suboffset 4 and then applying a stride of 0.
> +    *
> +    * Alternatively, we could also produce a normal VGRF (without stride 0)
> +    * by writing to all the channels in the VGRF, however, that would hit the
> +    * gen7 bug where we have to split writes that span more than 1 register
> +    * into instructions with a width of 4 (otherwise the write to the second
> +    * register written runs into an execmask hardware bug) which isn't very
> +    * nice.
> +    */
> +   union {
> +      double d;
> +      struct {
> +         uint32_t i1;
> +         uint32_t i2;
> +      };
> +   } di;
> +
> +   di.d = v;
> +
> +   const fs_builder ubld = bld.exec_all().group(1, 0);
> +   const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
> +   ubld.MOV(tmp, brw_imm_ud(di.i1));
> +   ubld.MOV(horiz_offset(tmp, 1), brw_imm_ud(di.i2));
> +
> +   return component(retype(tmp, BRW_REGISTER_TYPE_DF), 0);
> +}
>
Re: [Mesa-dev] [PATCH 2/6] i965/fs: use the new helper function to create double immediates
On Monday, July 11, 2016 1:19:34 PM PDT Samuel Iglesias Gonsálvez wrote:
>
> On 06/07/16 22:32, Kenneth Graunke wrote:
> > On Wednesday, July 6, 2016 12:09:58 PM PDT Samuel Iglesias Gonsálvez wrote:
> >> From: Iago Toral Quiroga
> >>
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> index 268c847..d805d95 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> @@ -832,7 +832,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
> >>            * a register and compare with that.
> >>            */
> >>           fs_reg tmp = vgrf(glsl_type::double_type);
> >> -         bld.MOV(tmp, brw_imm_df(0.0));
> >> +         bld.MOV(tmp, setup_imm_df(0.0));
> >
> > Does this need to be splatted out to a full SIMD-width?
> > Why not just do:
> >
> >    fs_reg tmp = setup_imm_df(0.0);
> >
> > and let the CMP compare against the stride 0 register?
> >
>
> I have just noticed this is not right.
>
> CMP will use the 64-bit immediate as one of the sources of the CMP,
> which is not valid in gen8+. According to BDW+'s PRMs, a 64-bit
> immediate is only valid in source values for instructions with single
> source operands.
>
> I am going to keep the original patch.
>
> Sam

Ah, right, I missed that on Gen8+ setup_imm_df returns an actual immediate, rather than a GRF register with the immediate loaded into it. Feel free to ignore that suggestion.
Re: [Mesa-dev] [PATCH 1/4] vl/compositor: move weave shader out from rgb weaving
On 06.07.2016 at 20:03, Leo Liu wrote:
> We'll use weave shader in the later patch.
>
> Signed-off-by: Leo Liu

I think I would prefer having a separate component for format conversion of video buffers instead of pushing that into the compositor as well. We could still share the weave shader in a header file.

But that's only a minor cleanup we can do later on as well. The patches are Acked-by: Christian König either way.

Regards,
Christian.

> ---
>  src/gallium/auxiliary/vl/vl_compositor.c | 157
>  src/gallium/auxiliary/vl/vl_compositor.h |   2 +-
>  2 files changed, 83 insertions(+), 76 deletions(-)
>
> diff --git a/src/gallium/auxiliary/vl/vl_compositor.c b/src/gallium/auxiliary/vl/vl_compositor.c
> index 77fc92e..275022b 100644
> --- a/src/gallium/auxiliary/vl/vl_compositor.c
> +++ b/src/gallium/auxiliary/vl/vl_compositor.c
> @@ -126,6 +126,77 @@ create_vert_shader(struct vl_compositor *c)
>  }
>
>  static void
> +create_frag_shader_weave(struct ureg_program *shader, struct ureg_dst fragment)
> +{
> +   struct ureg_src i_tc[2];
> +   struct ureg_src sampler[3];
> +   struct ureg_dst t_tc[2];
> +   struct ureg_dst t_texel[2];
> +   unsigned i, j;
> +
> +   i_tc[0] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTOP, TGSI_INTERPOLATE_LINEAR);
> +   i_tc[1] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VBOTTOM, TGSI_INTERPOLATE_LINEAR);
> +
> +   for (i = 0; i < 3; ++i)
> +      sampler[i] = ureg_DECL_sampler(shader, i);
> +
> +   for (i = 0; i < 2; ++i) {
> +      t_tc[i] = ureg_DECL_temporary(shader);
> +      t_texel[i] = ureg_DECL_temporary(shader);
> +   }
> +
> +   /* calculate the texture offsets
> +    * t_tc.x = i_tc.x
> +    * t_tc.y = (round(i_tc.y - 0.5) + 0.5) / height * 2
> +    */
> +   for (i = 0; i < 2; ++i) {
> +      ureg_MOV(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_X), i_tc[i]);
> +      ureg_SUB(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_YZ),
> +               i_tc[i], ureg_imm1f(shader, 0.5f));
> +      ureg_ROUND(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_YZ), ureg_src(t_tc[i]));
> +      ureg_MOV(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_W),
> +               ureg_imm1f(shader, i ? 1.0f : 0.0f));
> +      ureg_ADD(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_YZ),
> +               ureg_src(t_tc[i]), ureg_imm1f(shader, 0.5f));
> +      ureg_MUL(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_Y),
> +               ureg_src(t_tc[i]), ureg_scalar(i_tc[0], TGSI_SWIZZLE_W));
> +      ureg_MUL(shader, ureg_writemask(t_tc[i], TGSI_WRITEMASK_Z),
> +               ureg_src(t_tc[i]), ureg_scalar(i_tc[1], TGSI_SWIZZLE_W));
> +   }
> +
> +   /* fetch the texels
> +    * texel[0..1].x = tex(t_tc[0..1][0])
> +    * texel[0..1].y = tex(t_tc[0..1][1])
> +    * texel[0..1].z = tex(t_tc[0..1][2])
> +    */
> +   for (i = 0; i < 2; ++i)
> +      for (j = 0; j < 3; ++j) {
> +         struct ureg_src src = ureg_swizzle(ureg_src(t_tc[i]),
> +            TGSI_SWIZZLE_X, j ? TGSI_SWIZZLE_Z : TGSI_SWIZZLE_Y, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W);
> +
> +         ureg_TEX(shader, ureg_writemask(t_texel[i], TGSI_WRITEMASK_X << j),
> +                  TGSI_TEXTURE_2D_ARRAY, src, sampler[j]);
> +      }
> +
> +   /* calculate linear interpolation factor
> +    * factor = |round(i_tc.y) - i_tc.y| * 2
> +    */
> +   ureg_ROUND(shader, ureg_writemask(t_tc[0], TGSI_WRITEMASK_YZ), i_tc[0]);
> +   ureg_ADD(shader, ureg_writemask(t_tc[0], TGSI_WRITEMASK_YZ),
> +            ureg_src(t_tc[0]), ureg_negate(i_tc[0]));
> +   ureg_MUL(shader, ureg_writemask(t_tc[0], TGSI_WRITEMASK_YZ),
> +            ureg_abs(ureg_src(t_tc[0])), ureg_imm1f(shader, 2.0f));
> +   ureg_LRP(shader, fragment, ureg_swizzle(ureg_src(t_tc[0]),
> +            TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z),
> +            ureg_src(t_texel[0]), ureg_src(t_texel[1]));
> +
> +   for (i = 0; i < 2; ++i) {
> +      ureg_release_temporary(shader, t_texel[i]);
> +      ureg_release_temporary(shader, t_tc[i]);
> +   }
> +}
> +
> +static void
>  create_frag_shader_csc(struct ureg_program *shader, struct ureg_dst texel,
>                         struct ureg_dst fragment)
>  {
> @@ -199,86 +270,22 @@ create_frag_shader_video_buffer(struct vl_compositor *c)
>  }
>
>  static void *
> -create_frag_shader_weave(struct vl_compositor *c)
> +create_frag_shader_weave_rgb(struct vl_compositor *c)
>  {
>     struct ureg_program *shader;
> -   struct ureg_src i_tc[2];
> -   struct ureg_src sampler[3];
> -   struct ureg_dst t_tc[2];
> -   struct ureg_dst t_texel[2];
> -   struct ureg_dst o_fragment;
> -   unsigned i, j;
> +   struct ureg_dst texel, fragment;
>
>     shader = ureg_create(PIPE_SHADER_FRAGMENT);
>     if (!shader)
>        return false;
>
> -   i_tc[0] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTOP, TGSI_INTERPOLATE_LINEAR);
> -   i_tc[1] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VBOTTOM, TGSI_INTERPOLATE_LINEAR);
> -
> -   for (i = 0; i < 3; ++i)
> -      sampler[i] = ureg_DECL_sampler(shader, i);
> -
> -   for (i = 0; i < 2; ++i) {
> -      t_tc[i] = ureg_DECL_temporary(shader);
Re: [Mesa-dev] [PATCH] i965: fix ignored qualifiers warning
On Sat, Jul 09, 2016 at 10:16:29AM +0200, Francesco Ansanelli wrote:
> ---
>  src/mesa/drivers/dri/i965/gen6_queryobj.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> index 96db5e9..95a5c56 100644
> --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
> +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> @@ -99,7 +99,7 @@ write_xfb_primitives_written(struct brw_context *brw,
>     }
>  }
>
> -static inline const int
> +static inline int
>  pipeline_target_to_index(int target)
>  {
>     if (target == GL_GEOMETRY_SHADER_INVOCATIONS)
> --
> 1.7.9.5

Reviewed-by: Eric Engestrom
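For context on why the one-word change is safe: a function returns a plain value, so a const qualifier on a scalar return type changes nothing for the caller. GCC reports such qualifiers under -Wignored-qualifiers. A minimal C sketch with hypothetical function names (not taken from gen6_queryobj.c):

```c
/* Hypothetical example: the const on the return type has no effect,
 * and a compiler may warn that the qualifier is ignored. */
static inline const int index_with_const(int target)
{
   return target + 1;
}

/* Same function without the qualifier, as in the patch. */
static inline int index_without_const(int target)
{
   return target + 1;
}
```

Callers see identical behaviour from both functions, so dropping the qualifier only silences the warning.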
[Mesa-dev] [PATCH v2 3/6] i965/fs/gen7: split instructions that run into exec masking bugs
From: Iago Toral Quiroga

In fp64 we can produce code like this:

mov(16) vgrf2<2>:UD, vgrf3<2>:UD

Our SIMD lowering pass would typically split this into instructions with
a width of 8, each writing to two consecutive registers. Unfortunately,
gen7 hardware has a bug affecting execution masking and, as a result, the
second GRF register write won't work properly. Curro verified this:

"The problem is that pre-Gen8 EUs are hardwired to use the QtrCtrl+1
(where QtrCtrl is the 8-bit quarter of the execution mask signals
specified in the instruction control fields) for the second compressed
half of any single-precision instruction (for double-precision
instructions it's hardwired to use NibCtrl+1, at least on HSW), which
means that the EU will apply the wrong execution controls for the second
sequential GRF write if the number of channels per GRF is not exactly
eight in single-precision mode (or four in double-float mode)."

In practice, this means that we cannot write more than one consecutive
GRF in a single instruction if the number of channels per GRF is not
exactly eight in single-precision mode (or four in double-float mode).

This patch makes our SIMD lowering pass split this kind of instruction so
that the split versions only write to a single register. In the example
above this means that we split the write into 4 instructions, each one
writing 4 UD elements (width = 4) to a single register.

v2 (Curro):
- Make explicit that the hardwiring of NibCtrl+1 for the second
  compressed half is known to happen on Haswell and that the issue on
  IVB might not be exactly the same.
- Assign max_width instead of returning early so that we can handle
  multiple restrictions affecting the same instruction.
- Avoid division by 0 if the instruction does not write any registers.
- Ignore instructions that have WE_all set.
- Use the instruction execution type size instead of the dst type size.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2f473cc..4d57412 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4691,6 +4691,34 @@ get_fpu_lowered_simd_width(const struct brw_device_info *devinfo,
     */
    unsigned reg_count = inst->regs_written;
 
+   /* Pre-Gen8 EUs are hardwired to use the QtrCtrl+1 (where QtrCtrl is
+    * the 8-bit quarter of the execution mask signals specified in the
+    * instruction control fields) for the second compressed half of any
+    * single-precision instruction (for double-precision instructions
+    * it's hardwired to use NibCtrl+1, at least on HSW), which means that
+    * the EU will apply the wrong execution controls for the second
+    * sequential GRF write if the number of channels per GRF is not exactly
+    * eight in single-precision mode (or four in double-float mode).
+    *
+    * In this situation we calculate the maximum size of the split
+    * instructions so they only ever write to a single register.
+    */
+   if (devinfo->gen < 8 && inst->regs_written > 1 &&
+       !inst->force_writemask_all) {
+      unsigned channels_per_grf = inst->exec_size / inst->regs_written;
+      unsigned exec_type_size = 0;
+      for (int i = 0; i < inst->sources; i++) {
+         if (inst->src[i].file == BAD_FILE)
+            break;
+         exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
+      }
+      assert(exec_type_size);
+
+      if (channels_per_grf != REG_SIZE / exec_type_size) {
+         max_width = MIN2(max_width, channels_per_grf);
+      }
+   }
+
    for (unsigned i = 0; i < inst->sources; i++)
       reg_count = MAX2(reg_count, (unsigned)inst->regs_read(i));
 
-- 
2.7.4
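The width calculation in this patch can be tried out in isolation. The following is a rough C sketch, not the driver code: gen7_split_width() is a hypothetical name, the GRF size of 32 bytes is an assumption written as GRF_BYTES, and the generation and force_writemask_all checks from the patch are left out for brevity.

```c
/* Assumed GRF size in bytes (REG_SIZE in the patch). */
#define GRF_BYTES 32u

/* Hypothetical helper mirroring the patch: if the number of channels that
 * land in each written GRF differs from what the execution type implies
 * (8 for 4-byte types, 4 for 8-byte types), clamp the SIMD width so each
 * split instruction writes a single register. */
static unsigned gen7_split_width(unsigned exec_size, unsigned regs_written,
                                 unsigned exec_type_size, unsigned max_width)
{
   if (regs_written > 1) {
      unsigned channels_per_grf = exec_size / regs_written;

      if (channels_per_grf != GRF_BYTES / exec_type_size &&
          channels_per_grf < max_width)
         max_width = channels_per_grf;
   }
   return max_width;
}
```

For the mov(16) example from the commit message, the strided UD destination spans 4 registers, so gen7_split_width(16, 4, 4, 16) gives a width of 4, matching the "4 instructions, each one writing 4 UD elements" described above. An ordinary SIMD16 float write (2 registers, 8 channels each) is left at 16.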
[Mesa-dev] [PATCH v2 6/6] i965/fs: do d2x lowering before simd splitting
So that we can have gen7 split large writes produced by this lowering pass.

Signed-off-by: Samuel Iglesias Gonsálvez
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4bf0ca2..d131106 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5843,6 +5843,11 @@ fs_visitor::optimize()
       OPT(dead_code_eliminate);
    }
 
+   if (OPT(lower_d2x)) {
+      OPT(opt_copy_propagate);
+      OPT(dead_code_eliminate);
+   }
+
    OPT(lower_simd_width);
 
    /* After SIMD lowering just in case we had to unroll the EOT send. */
@@ -5879,11 +5884,6 @@ fs_visitor::optimize()
       OPT(dead_code_eliminate);
    }
 
-   if (OPT(lower_d2x)) {
-      OPT(opt_copy_propagate);
-      OPT(dead_code_eliminate);
-   }
-
    OPT(opt_combine_constants);
    OPT(lower_integer_multiplication);
 
-- 
2.7.4
[Mesa-dev] [PATCH v2 5/6] i965/fs: do pack lowering before simd splitting
From: Iago Toral Quiroga

So that we can have gen7 split large writes produced by the pack lowering.

Reviewed-by: Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4d57412..4bf0ca2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5838,6 +5838,11 @@ fs_visitor::optimize()
    progress = false;
    pass_num = 0;
 
+   if (OPT(lower_pack)) {
+      OPT(register_coalesce);
+      OPT(dead_code_eliminate);
+   }
+
    OPT(lower_simd_width);
 
    /* After SIMD lowering just in case we had to unroll the EOT send. */
@@ -5874,11 +5879,6 @@ fs_visitor::optimize()
       OPT(dead_code_eliminate);
    }
 
-   if (OPT(lower_pack)) {
-      OPT(register_coalesce);
-      OPT(dead_code_eliminate);
-   }
-
    if (OPT(lower_d2x)) {
       OPT(opt_copy_propagate);
       OPT(dead_code_eliminate);
-- 
2.7.4
[Mesa-dev] [PATCH v2 4/6] i965/fs: do not require force_writemask_all with exec_size 4
So far we only used instructions with this size in situations where we
did not operate per-channel and we wanted to ignore the execution mask,
but gen7 fp64 will need to emit code with a width of 4 that needs normal
execution masking.

v2:
- Modify the assert instead of deleting it (Curro)

Reviewed-by: Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index d25d26a..ce1ec0a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1649,7 +1649,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
       brw_set_default_acc_write_control(p, inst->writes_accumulator);
       brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
 
-      assert(inst->force_writemask_all || inst->exec_size >= 8);
+      assert(inst->force_writemask_all || inst->exec_size >= 4);
       assert(inst->force_writemask_all || inst->group % inst->exec_size == 0);
       assert(inst->base_mrf + inst->mlen <= BRW_MAX_MRF(devinfo->gen));
       assert(inst->mlen <= BRW_MAX_MSG_LENGTH);
-- 
2.7.4
[Mesa-dev] [PATCH v2 2/6] i965/fs: use the new helper function to create double immediates
From: Iago Toral Quiroga

Reviewed-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 94c719b..acc8c1e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -789,7 +789,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
           * a register and compare with that.
           */
          fs_reg tmp = vgrf(glsl_type::double_type);
-         bld.MOV(tmp, brw_imm_df(0.0));
+         bld.MOV(tmp, setup_imm_df(devinfo, bld, 0.0));
 
          /* A direct DF CMP using the flag register (null dst) won't work in
           * SIMD16 because the CMP will be split in two by lower_simd_width,
@@ -1128,7 +1128,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
    case nir_op_d2b: {
       /* two-argument instructions can't take 64-bit immediates */
       fs_reg zero = vgrf(glsl_type::double_type);
-      bld.MOV(zero, brw_imm_df(0.0));
+      bld.MOV(zero, setup_imm_df(devinfo, bld, 0.0));
       /* A SIMD16 execution needs to be split in two instructions, so use
        * a vgrf instead of the flag register as dst so instruction splitting
        * works
@@ -1440,7 +1440,8 @@ fs_visitor::nir_emit_load_const(const fs_builder &bld,
 
    case 64:
       for (unsigned i = 0; i < instr->def.num_components; i++)
-         bld.MOV(offset(reg, bld, i), brw_imm_df(instr->value.f64[i]));
+         bld.MOV(offset(reg, bld, i),
+                 setup_imm_df(devinfo, bld, instr->value.f64[i]));
       break;
 
    default:
-- 
2.7.4
[Mesa-dev] [PATCH v2 0/6] i965/fs: fix Haswell support for doubles
Hello,

This is the second version of this patch series [0].

The use of the DIM instruction on HSW to set up a 64-bit immediate reg (suggested by Kenneth here [1]) will be sent in a separate patch series.

Thanks,

Sam

[0] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122416.html
[1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122473.html

Iago Toral Quiroga (4):
  i965/fs: add a helper function to create double immediates
  i965/fs: use the new helper function to create double immediates
  i965/fs/gen7: split instructions that run into exec masking bugs
  i965/fs: do pack lowering before simd splitting

Samuel Iglesias Gonsálvez (2):
  i965/fs: do not require force_writemask_all with exec_size 4
  i965/fs: do d2x lowering before simd splitting

 src/mesa/drivers/dri/i965/brw_fs.cpp           | 48
 src/mesa/drivers/dri/i965/brw_fs.h             |  3 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp       | 44
 4 files changed, 83 insertions(+), 14 deletions(-)

-- 
2.7.4
[Mesa-dev] [PATCH v2 1/6] i965/fs: add a helper function to create double immediates
From: Iago Toral Quiroga

Gen7 hardware does not support double immediates so these need
to be moved in 32-bit chunks to a regular vgrf instead. Instead
of doing this every time we need to create a DF immediate,
create a helper function that does the right thing depending
on the hardware generation.

v2:
- Define setup_imm_df() as an independent function (Curro)
- Create a specific builder to get rid of some instruction field
  assignments (Curro).

Signed-off-by: Samuel Iglesias Gonsálvez
Reviewed-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_fs.h       |  3 +++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 37
 2 files changed, 40 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 1f88f8f..d034573 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -512,3 +512,6 @@ void shuffle_64bit_data_for_32bit_write(const brw::fs_builder &bld,
                                         const fs_reg &dst,
                                         const fs_reg &src,
                                         uint32_t components);
+fs_reg setup_imm_df(const struct brw_device_info *devinfo,
+                    const brw::fs_builder &bld,
+                    double v);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 04ed42e..94c719b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -4547,3 +4547,40 @@ shuffle_64bit_data_for_32bit_write(const fs_builder &bld,
       bld.MOV(offset(dst, bld, 2 * i + 1), subscript(component_i, dst.type, 1));
    }
 }
+
+fs_reg
+setup_imm_df(const struct brw_device_info *devinfo, const fs_builder &bld, double v)
+{
+   assert(devinfo->gen >= 7);
+
+   if (devinfo->gen >= 8)
+      return brw_imm_df(v);
+
+   /* gen7 does not support DF immediates, so we generate a 64-bit constant by
+    * writing the low 32-bit of the constant to suboffset 0 of a VGRF and
+    * the high 32-bit to suboffset 4 and then applying a stride of 0.
+    *
+    * Alternatively, we could also produce a normal VGRF (without stride 0)
+    * by writing to all the channels in the VGRF, however, that would hit the
+    * gen7 bug where we have to split writes that span more than 1 register
+    * into instructions with a width of 4 (otherwise the write to the second
+    * register written runs into an execmask hardware bug) which isn't very
+    * nice.
+    */
+   union {
+      double d;
+      struct {
+         uint32_t i1;
+         uint32_t i2;
+      };
+   } di;
+
+   di.d = v;
+
+   const fs_builder ubld = bld.exec_all().group(1, 0);
+   const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
+   ubld.MOV(tmp, brw_imm_ud(di.i1));
+   ubld.MOV(horiz_offset(tmp, 1), brw_imm_ud(di.i2));
+
+   return component(retype(tmp, BRW_REGISTER_TYPE_DF), 0);
+}
-- 
2.7.4
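The 32-bit split that the gen7 path performs before its two UD MOVs can be shown on its own. The following is a minimal C sketch, not the driver code: split_double() and join_double() are hypothetical helpers, and instead of the anonymous-struct union from the patch (whose i1/i2 order depends on a little-endian host) it uses memcpy plus shifts, which is an assumption about what a portable variant could look like.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split the bit pattern of a double into its low and
 * high 32-bit words, the two values the gen7 path would MOV as UD. */
static void split_double(double v, uint32_t *lo, uint32_t *hi)
{
   uint64_t bits;

   memcpy(&bits, &v, sizeof(bits));
   *lo = (uint32_t)(bits & 0xffffffffu);
   *hi = (uint32_t)(bits >> 32);
}

/* Inverse operation, corresponding to reading the two UD words back as DF. */
static double join_double(uint32_t lo, uint32_t hi)
{
   uint64_t bits = ((uint64_t)hi << 32) | lo;
   double v;

   memcpy(&v, &bits, sizeof(v));
   return v;
}
```

For 1.0 the low word is 0 and the high word is 0x3ff00000, and joining the two words gives back the original value.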
Re: [Mesa-dev] [PATCH 2/6] i965/fs: use the new helper function to create double immediates
On 06/07/16 22:32, Kenneth Graunke wrote:
> On Wednesday, July 6, 2016 12:09:58 PM PDT Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga
>>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 268c847..d805d95 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -832,7 +832,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
>>            * a register and compare with that.
>>            */
>>           fs_reg tmp = vgrf(glsl_type::double_type);
>> -         bld.MOV(tmp, brw_imm_df(0.0));
>> +         bld.MOV(tmp, setup_imm_df(0.0));
>
> Does this need to be splatted out to a full SIMD-width?
> Why not just do:
>
>    fs_reg tmp = setup_imm_df(0.0);
>
> and let the CMP compare against the stride 0 register?
>

I have just noticed this is not right.

CMP will use the 64-bit immediate as one of the sources of the CMP, which is not valid in gen8+. According to BDW+'s PRMs, a 64-bit immediate is only valid in source values for instructions with single source operands.

I am going to keep the original patch.

Sam

>>
>>           /* A direct DF CMP using the flag register (null dst) won't work in
>>            * SIMD16 because the CMP will be split in two by lower_simd_width,
>> @@ -1171,7 +1171,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
>>     case nir_op_d2b: {
>>        /* two-argument instructions can't take 64-bit immediates */
>>        fs_reg zero = vgrf(glsl_type::double_type);
>> -      bld.MOV(zero, brw_imm_df(0.0));
>> +      bld.MOV(zero, setup_imm_df(0.0));
>>        /* A SIMD16 execution needs to be split in two instructions, so use
>>         * a vgrf instead of the flag register as dst so instruction splitting
>>         * works
>
> Likewise, I don't think you need to splat here.
>
>> @@ -1483,7 +1483,7 @@ fs_visitor::nir_emit_load_const(const fs_builder &bld,
>>
>>     case 64:
>>        for (unsigned i = 0; i < instr->def.num_components; i++)
>> -         bld.MOV(offset(reg, bld, i), brw_imm_df(instr->value.f64[i]));
>> +         bld.MOV(offset(reg, bld, i), setup_imm_df(instr->value.f64[i]));
>>        break;
>>
>>     default:
>>
>
> This hunk looks good.
>
[Mesa-dev] [Bug 96853] gl_PrimitiveID is zero when rendering points of size > 1
https://bugs.freedesktop.org/show_bug.cgi?id=96853

--- Comment #5 from denis.fisse...@tu-dortmund.de ---
(In reply to Roland Scheidegger from comment #3)
> In theory if there already is a user-provided gs (which doesn't output
> points) then the emulation code doesn't really apply. But I don't really
> know this code (other than knowing this is pretty tricky stuff...).

According to my testing, geometry shaders (GS) with Mesa3D show a strange mixed behavior. I have a GS that fakes wide lines by constructing screen-aligned quads from lines. If I use it in VMware with Ubuntu 16.04 LTS and Mesa3D, it does not construct wide lines, but only lines of width 1.

If I now abort the GS with a return statement right at the beginning of the main method, so that it outputs nothing, lines of width 1 are still shown in the viewport. For those lines to be colored correctly, however, I have to output that color from my GS. So it seems the geometry output part of the GS is somehow overridden, while other parts of it are not.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] mesa from git fails to compile
On 10 July 2016 at 22:11, Pali Rohár wrote:
> Hello, compiling mesa from git is failing on this error:
>
> Making all in isl
> make[5]: Entering directory `/«PKGBUILDDIR»/build/dri/src/intel/isl'
> python2.7 ../../../../../src/intel/isl/gen_format_layout.py \
>     --csv ../../../../../src/intel/isl/isl_format_layout.csv --out isl_format_layout.c
> Traceback (most recent call last):
>   File "../../../../../src/intel/isl/gen_format_layout.py", line 92, in <module>
>     output_encoding='utf-8')
> TypeError: __init__() got an unexpected keyword argument 'future_imports'
> make[5]: *** [isl_format_layout.c] Error 1
> make[5]: Leaving directory `/«PKGBUILDDIR»/build/dri/src/intel/isl'
> make[4]: *** [all-recursive] Error 1
> make[4]: Leaving directory `/«PKGBUILDDIR»/build/dri/src/intel'
> make[3]: *** [all-recursive] Error 1
> make[3]: Leaving directory `/«PKGBUILDDIR»/build/dri/src'
> make[2]: *** [all] Error 2
> make[2]: Leaving directory `/«PKGBUILDDIR»/build/dri/src'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/«PKGBUILDDIR»/build/dri'
> make: *** [debian/stamp/x86_64-linux-gnu-build-dri] Error 2
>
> Any idea where the problem is and how to fix it?
>
> Full build log is available at:
>
> https://launchpad.net/~pali/+archive/ubuntu/graphics-drivers/+build/10446196/+files/buildlog_ubuntu-precise-amd64.mesa-lts-trusty_11.3.0-git201607100358.5c17fb2~ubuntu12.04.1_BUILDING.txt.gz
>

Sounds similar to (or the same as) https://bugs.freedesktop.org/show_bug.cgi?id=89347.

Which version of mako do you have? Can you give things a try with 0.8.0 or later?

-Emil