[Mesa-dev] [PATCH 1/2] i965/fs: Allow spilling for SIMD16 compute shaders
For fragment shaders, we can always use a SIMD8 program. Therefore, if we detect spilling with a SIMD16 program, then it is better to skip generating a SIMD16 program to only rely on a SIMD8 program. Unfortunately, this doesn't work for compute shaders. For a compute shader, we may be required to use SIMD16 if the local workgroup size is bigger than a certain size. For example, on gen7, if the local workgroup size is larger than 512, then a SIMD16 program is required. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840 Signed-off-by: Jordan JustenCc: "11.2" --- src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- src/mesa/drivers/dri/i965/brw_fs.h | 1 + src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 +++ 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index b506040..3f063a9 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -5228,7 +5228,7 @@ fs_visitor::allocate_registers() * SIMD8. There's probably actually some intermediate point where * SIMD16 with a couple of spills is still better. */ - if (dispatch_width == 16) { + if (dispatch_width == 16 && min_dispatch_width <= 8) { fail("Failure to register allocate. Reduce number of " "live scalar values to avoid this."); } else { diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 7446ca1..43d8a9d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -407,6 +407,7 @@ public: bool spilled_any_registers; const unsigned dispatch_width; /**< 8 or 16 */ + unsigned min_dispatch_width; int shader_time_index; diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 88b1896..753d97f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1021,6 +1021,17 @@ fs_visitor::init() unreachable("unhandled shader stage"); } + if (stage == MESA_SHADER_COMPUTE) { + const brw_cs_prog_data *cs_prog_data = + (const brw_cs_prog_data*) prog_data; + unsigned size = cs_prog_data->local_size[0] * + cs_prog_data->local_size[1] * cs_prog_data->local_size[2]; + size = DIV_ROUND_UP(size, devinfo->max_cs_threads); + min_dispatch_width = size > 16 ? 32 : (size > 8 ? 16 : 8); + } else { + min_dispatch_width = 8; + } + this->prog_data = this->stage_prog_data; this->failed = false; -- 2.7.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965/compute: Skip SIMD8 generation if it can't be used
If the local workgroup size is sufficiently large, then the SIMD8 program can't be used. In this case we can skip generating the SIMD8 program. For complex programs this can save a significant amount of time. Signed-off-by: Jordan Justen--- src/mesa/drivers/dri/i965/brw_fs.cpp | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 3f063a9..07c9c01 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1933,8 +1933,8 @@ fs_visitor::compact_virtual_grfs() void fs_visitor::assign_constant_locations() { - /* Only the first compile (SIMD8 mode) gets to decide on locations. */ - if (dispatch_width != 8) + /* Only the first compile gets to decide on locations. */ + if (dispatch_width != min_dispatch_width) return; unsigned int num_pull_constants = 0; @@ -5731,6 +5731,7 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data, shader->info.cs.local_size[2]; unsigned max_cs_threads = compiler->devinfo->max_cs_threads; + unsigned simd_required = DIV_ROUND_UP(local_workgroup_size, max_cs_threads); cfg_t *cfg = NULL; const char *fail_msg = NULL; @@ -5740,11 +5741,13 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data, fs_visitor v8(compiler, log_data, mem_ctx, key, _data->base, NULL, /* Never used in core profile */ shader, 8, shader_time_index); - if (!v8.run_cs()) { - fail_msg = v8.fail_msg; - } else if (local_workgroup_size <= 8 * max_cs_threads) { - cfg = v8.cfg; - prog_data->simd_size = 8; + if (simd_required <= 8) { + if (!v8.run_cs()) { + fail_msg = v8.fail_msg; + } else { + cfg = v8.cfg; + prog_data->simd_size = 8; + } } fs_visitor v16(compiler, log_data, mem_ctx, key, _data->base, @@ -5754,7 +5757,8 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data, !fail_msg && !v8.simd16_unsupported && local_workgroup_size <= 16 * max_cs_threads) { /* Try a SIMD16 compile */ - v16.import_uniforms(); + if (simd_required <= 8) + v16.import_uniforms(); if (!v16.run_cs()) { compiler->shader_perf_log(log_data, "SIMD16 shader failed to compile: %s", -- 2.7.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2.5/3] glsl: only apply default stream to output blocks
On Fri, 2016-02-26 at 07:41 +0100, Samuel Iglesias Gonsálvez wrote: > > On Fri, Feb 26, 2016 at 11:51:16AM +1100, Timothy Arceri wrote: > > This is needed to allow invalid qualifier checks on inputs. > > > > Cc: Samuel Iglesias Gonsálvez> > --- > > I missed this in the first series as no tests hit this, I guess > > that means > > we have no gs tests that have an input block with a layout > > qualifier :( > > > > Yes, you are right. Would you mind adding one test for this case? Sure. > > > Transform feedback qualifiers I'm adding do a similar thing and I > > was > > hitting this problem with them. > > > > src/compiler/glsl/glsl_parser_extras.cpp | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/src/compiler/glsl/glsl_parser_extras.cpp > > b/src/compiler/glsl/glsl_parser_extras.cpp > > index ec180c0..2b1cc0d 100644 > > --- a/src/compiler/glsl/glsl_parser_extras.cpp > > +++ b/src/compiler/glsl/glsl_parser_extras.cpp > > @@ -922,7 +922,8 @@ _mesa_ast_process_interface_block(YYLTYPE > > *locp, > > block->layout.flags.i |= block_interface_qualifier; > > > > if (state->stage == MESA_SHADER_GEOMETRY && > > - state->has_explicit_attrib_stream()) { > > + state->has_explicit_attrib_stream() && > > + block->layout.flags.q.out) { > > Reviewed-by: Samuel Iglesias Gonsálvez > > I am thinking that we need to return a compiler error when setting > stream qualifier to an input block as glslangValidator does but in > a different patch... If you are busy, I can write it later today. > Just > let me know. Right that should happen once patch 3 lands. It should catch all invalid input qualifiers. The only limitation is using it for function params too at this point, but its a good first step IMO. I hope to fix up some small issues with my component qualifier series in the next couple of days so I can finally land this and some other fixes. Thanks for the reviews. > > Sam > > > /* Assign global layout's stream value. */ > > block->layout.flags.q.stream = 1; > > block->layout.flags.q.explicit_stream = 0; > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: set VIEWPORT_BOUNDS_RANGE value depending of the supported OpenGL version
From ARB_viewport_array spec: " * On GL3-capable hardware the VIEWPORT_BOUNDS_RANGE should be at least [-16384, 16383]. * On GL4-capable hardware the VIEWPORT_BOUNDS_RANGE should be at least [-32768, 32767]." Signed-off-by: Samuel Iglesias Gonsálvez--- src/mesa/drivers/dri/i965/brw_context.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 31b6b2a..1569992 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -687,10 +687,15 @@ brw_initialize_context_constants(struct brw_context *brw) ctx->Const.MaxViewports = GEN6_NUM_VIEWPORTS; ctx->Const.ViewportSubpixelBits = 0; - /* Cast to float before negating because MaxViewportWidth is unsigned. - */ - ctx->Const.ViewportBounds.Min = -(float)ctx->Const.MaxViewportWidth; - ctx->Const.ViewportBounds.Max = ctx->Const.MaxViewportWidth; + if (brw->intelScreen->driScrnPriv->max_gl_core_version >= 40) { + ctx->Const.ViewportBounds.Min = -32768; + ctx->Const.ViewportBounds.Max = 32767; + } else { + /* Cast to float before negating because MaxViewportWidth is unsigned. + */ + ctx->Const.ViewportBounds.Min = -(float)ctx->Const.MaxViewportWidth; + ctx->Const.ViewportBounds.Max = ctx->Const.MaxViewportWidth; + } } /* ARB_gpu_shader5 */ -- 2.7.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
On Fri, Feb 26, 2016 at 9:32 AM, Michel Dänzerwrote: > On 26.02.2016 16:14, Oded Gabbay wrote: >> On Fri, Feb 26, 2016 at 5:01 AM, Michel Dänzer wrote: >>> >>> [ Dropping mesa-stable list from Cc, since sending patches there by >>> e-mail before they've landed on master is basically noise ] >> >> Problem is that I sometimes later forget to add stable :) > > Note that I'm only referring to sending patches to the mesa-stable list > by e-mail, which isn't necessary for them to be backported to stable > branches. The stable branch maintainer will pick patches for backporting > using the bin/get-pick-list.sh script. > > Adding the mesa-stable tag to the commit log is of course fine per se. > Yeah, I understand, but git send-email is configured to automatically adds the cc: tag. Maybe I should disable it... > >>> On 26.02.2016 06:09, Oded Gabbay wrote: Since the rework on gallium pipe formats, there is no more need to do endian swap of the colorformat in the h/w, because the conversion between mesa format and gallium (pipe) format takes endianess into account (see the big #if in p_format.h). >>> >>> That may be true for (some?) formats with 4 components of 8 bits, but >>> I'd be surprised if it was true for all formats handled by this >>> function. Just as one example, consider formats with 32 bits per component. >>> >> >> I first wanted to get these 3 patches out of the gate so people could >> have a working desktop in the most default form they are working (4 >> components of 8 bits). I promise I will continu to work on this and >> will aspire to reach parity with LE, but I'm doing this on my free >> time so it could take some time. >> >> I will definitely want to check all formats. > > Then you can just add the return ENDIAN_NONE in the > V_0280A0_COLOR_8_8_8_8 case instead of at the beginning of the function. > That should address Matt's concern as well. > Hmm, maybe I should. I will check to see if this doesn't cause regressions from what I have arrived to and will update here. Oded > > -- > Earthling Michel Dänzer | http://www.amd.com > Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
On 26.02.2016 16:14, Oded Gabbay wrote: > On Fri, Feb 26, 2016 at 5:01 AM, Michel Dänzerwrote: >> >> [ Dropping mesa-stable list from Cc, since sending patches there by >> e-mail before they've landed on master is basically noise ] > > Problem is that I sometimes later forget to add stable :) Note that I'm only referring to sending patches to the mesa-stable list by e-mail, which isn't necessary for them to be backported to stable branches. The stable branch maintainer will pick patches for backporting using the bin/get-pick-list.sh script. Adding the mesa-stable tag to the commit log is of course fine per se. >> On 26.02.2016 06:09, Oded Gabbay wrote: >>> Since the rework on gallium pipe formats, there is no more need to do >>> endian swap of the colorformat in the h/w, because the conversion between >>> mesa format and gallium (pipe) format takes endianess into account (see >>> the big #if in p_format.h). >> >> That may be true for (some?) formats with 4 components of 8 bits, but >> I'd be surprised if it was true for all formats handled by this >> function. Just as one example, consider formats with 32 bits per component. >> > > I first wanted to get these 3 patches out of the gate so people could > have a working desktop in the most default form they are working (4 > components of 8 bits). I promise I will continu to work on this and > will aspire to reach parity with LE, but I'm doing this on my free > time so it could take some time. > > I will definitely want to check all formats. Then you can just add the return ENDIAN_NONE in the V_0280A0_COLOR_8_8_8_8 case instead of at the beginning of the function. That should address Matt's concern as well. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 94295] [swrast] piglit shader_runner fast_color_clear/all-colors regression
https://bugs.freedesktop.org/show_bug.cgi?id=94295 Plamena Manolovachanged: What|Removed |Added Assignee|mesa-dev@lists.freedesktop. |plamena.manol...@intel.com |org | Status|NEW |ASSIGNED -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 29192] Kwin crashed after checking an advanced setting in Desktop Effects
https://bugs.freedesktop.org/show_bug.cgi?id=29192 Christopher M. Penalverchanged: What|Removed |Added CC||christopher.m.penalver@gmai ||l.com Status|NEW |RESOLVED Resolution|--- |INVALID --- Comment #3 from Christopher M. Penalver --- Leonardo La Malfa, Ubuntu Maverick reached EOL on April 10, 2012. For more on this, please see https://wiki.ubuntu.com/Releases. If this is reproducible with a supported release, it will help immensely if you filed a new report with Ubuntu by ensuring you have the package xdiagnose installed, and that you click the Yes button for attaching additional debugging information running the following from a terminal: ubuntu-bug xorg Also, please feel free to subscribe me to it. For more on why this is helpful, please see https://wiki.ubuntu.com/ReportingBugs. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
On Fri, Feb 26, 2016 at 5:01 AM, Michel Dänzerwrote: > > [ Dropping mesa-stable list from Cc, since sending patches there by > e-mail before they've landed on master is basically noise ] Problem is that I sometimes later forget to add stable :) > > On 26.02.2016 06:09, Oded Gabbay wrote: >> Since the rework on gallium pipe formats, there is no more need to do >> endian swap of the colorformat in the h/w, because the conversion between >> mesa format and gallium (pipe) format takes endianess into account (see >> the big #if in p_format.h). > > That may be true for (some?) formats with 4 components of 8 bits, but > I'd be surprised if it was true for all formats handled by this > function. Just as one example, consider formats with 32 bits per component. > I first wanted to get these 3 patches out of the gate so people could have a working desktop in the most default form they are working (4 components of 8 bits). I promise I will continu to work on this and will aspire to reach parity with LE, but I'm doing this on my free time so it could take some time. I will definitely want to check all formats. Oded > > -- > Earthling Michel Dänzer | http://www.amd.com > Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 29148] KWin segfaults when OpenGL desktop effects are enabled
https://bugs.freedesktop.org/show_bug.cgi?id=29148 Christopher M. Penalverchanged: What|Removed |Added Status|NEW |RESOLVED CC||christopher.m.penalver@gmai ||l.com Resolution|--- |INVALID --- Comment #7 from Christopher M. Penalver --- Alain Perrot, Kubuntu Maverick reached EOL on April 10, 2012. For more on this, please see https://wiki.ubuntu.com/Releases. If this is reproducible with a supported release, it will help immensely if you filed a new report with Ubuntu by ensuring you have the package xdiagnose installed, and that you click the Yes button for attaching additional debugging information running the following from a terminal: ubuntu-bug xorg Also, please feel free to subscribe me to it. For more on why this is helpful, please see https://wiki.ubuntu.com/ReportingBugs. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 28799] Shared textures not working in 7.7.1
https://bugs.freedesktop.org/show_bug.cgi?id=28799 Christopher M. Penalverchanged: What|Removed |Added Resolution|--- |INVALID CC||christopher.m.penalver@gmai ||l.com Status|NEW |RESOLVED --- Comment #2 from Christopher M. Penalver --- steve.cor...@virtualcomputer.com, Ubuntu 10.04 reached EOL on May 9, 2013. For more on this, please see https://wiki.ubuntu.com/Releases. If this is reproducible with a supported release, it will help immensely if you filed a new report with Ubuntu by ensuring you have the package xdiagnose installed, and that you click the Yes button for attaching additional debugging information running the following from a terminal: ubuntu-bug xorg Also, please feel free to subscribe me to it. For more on why this is helpful, please see https://wiki.ubuntu.com/ReportingBugs. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 12895] 3D on ATI 9200SE video card?
https://bugs.freedesktop.org/show_bug.cgi?id=12895 Christopher M. Penalverchanged: What|Removed |Added Resolution|--- |INVALID CC||christopher.m.penalver@gmai ||l.com Status|NEW |RESOLVED --- Comment #2 from Christopher M. Penalver --- carlo, Kubuntu 7.04 reached EOL on October 19, 2008. For more on this, please see https://wiki.ubuntu.com/Releases. If this is reproducible with a supported release, it will help immensely if you filed a new report with Ubuntu by ensuring you have the package xdiagnose installed, and that you click the Yes button for attaching additional debugging information running the following from a terminal: ubuntu-bug xorg Also, please feel free to subscribe me to it. For more on why this is helpful, please see https://wiki.ubuntu.com/ReportingBugs. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] android: re-generate git_sha1.h if git HEAD updated
The git_sha1.h has to depend on the git HEAD otherwise it will never be updated. Signed-off-by: Chih-Wei Huang--- src/mesa/Android.gen.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/Android.gen.mk b/src/mesa/Android.gen.mk index a985f0a..e567102 100644 --- a/src/mesa/Android.gen.mk +++ b/src/mesa/Android.gen.mk @@ -69,7 +69,7 @@ define es-gen $(hide) $(PRIVATE_SCRIPT) $(1) $(PRIVATE_XML) > $@ endef -$(intermediates)/main/git_sha1.h: +$(intermediates)/main/git_sha1.h: $(wildcard $(MESA_TOP)/.git/HEAD) @mkdir -p $(dir $@) @echo "GIT-SHA1: $(PRIVATE_MODULE) <= git" $(hide) touch $@ -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] anv: remove stray ; after if
On Feb 25, 2016 5:33 PM, "Matt Turner"wrote: > > Indeed, that looks like a mistake. Yes, yes it is. Good catch. Reviewed-by: Jason Ekstrand > Reviewed-by: Matt Turner > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2.5/3] glsl: only apply default stream to output blocks
On Fri, Feb 26, 2016 at 11:51:16AM +1100, Timothy Arceri wrote: > This is needed to allow invalid qualifier checks on inputs. > > Cc: Samuel Iglesias Gonsálvez> --- > I missed this in the first series as no tests hit this, I guess that means > we have no gs tests that have an input block with a layout qualifier :( > Yes, you are right. Would you mind adding one test for this case? > Transform feedback qualifiers I'm adding do a similar thing and I was > hitting this problem with them. > > src/compiler/glsl/glsl_parser_extras.cpp | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/compiler/glsl/glsl_parser_extras.cpp > b/src/compiler/glsl/glsl_parser_extras.cpp > index ec180c0..2b1cc0d 100644 > --- a/src/compiler/glsl/glsl_parser_extras.cpp > +++ b/src/compiler/glsl/glsl_parser_extras.cpp > @@ -922,7 +922,8 @@ _mesa_ast_process_interface_block(YYLTYPE *locp, > block->layout.flags.i |= block_interface_qualifier; > > if (state->stage == MESA_SHADER_GEOMETRY && > - state->has_explicit_attrib_stream()) { > + state->has_explicit_attrib_stream() && > + block->layout.flags.q.out) { Reviewed-by: Samuel Iglesias Gonsálvez I am thinking that we need to return a compiler error when setting stream qualifier to an input block as glslangValidator does but in a different patch... If you are busy, I can write it later today. Just let me know. Sam >/* Assign global layout's stream value. */ >block->layout.flags.q.stream = 1; >block->layout.flags.q.explicit_stream = 0; > -- > 2.5.0 > > signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] winsys/radeon: drop support for DRM 2.12.0
On 25.02.2016 08:09, Marek Olšák wrote: > From: Marek Olšák> > in order to make some winsys interface changes easier Reviewed-by: Michel Dänzer -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] radeonsi: dump full shader disassemblies into ddebug logs
On 26.02.2016 01:42, Marek Olšák wrote: > From: Marek OlšákReviewed-by: Michel Dänzer -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radeonsi: allow dumping shader disassemblies to a file
What's the purpose of this change? Unless I'm missing something, only stderr is ever passed in. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 26/28] glsl: lower tessellation varyings packed with component layout qualifier
On Thu, 2016-02-25 at 18:32 -0800, Kenneth Graunke wrote: > On Tuesday, December 29, 2015 4:00:26 PM PST Timothy Arceri wrote: > > For tessellation shaders we cannot just copy everything to the > > packed > > varyings like we do in other stages as tessellation uses shared > > memory for > > varyings, therefore it is only safe to copy array elements that the > > shader > > actually uses. > > Also, you can only copy the exact *components* written by the shader. > For example, one nasty thing a valid TCS might do is: > > patch out ivec4 foo; > foo[gl_InvocationID] = gl_InvocationID; > > which, given four threads, will write <0, 1, 2, 3> to the > vector. But > if each thread writes the whole vec4 by accident, you may end up with > garbage in 3/4 of the components. > > It would be worth verifying that you handle this correctly. Ok I'll write some more piglit tests. > > (Such indirecting will probably get lowered to if-ladders, because > anything else is fairly crazy...) > > > This class searches the IR for uses of varyings and then creates > > instructions that copy those vars to a packed varying. This means > > it is > > easy to end up with duplicate copies if the varying is used more > > than once, > > also arrays of arrays create a duplicate copy for each dimension > > that > > exists. These issues are not easily resolved without breaking > > various > > corner cases so we leave it to a later IR stage to clean up the > > mess. > > > > Note that neither GLSL IR nor NIR can currently can't clean up the > > duplicates when and indirect is used as an array index. This patch > > assumes that NIR will eventually be able to clean this up. > > --- > > src/glsl/lower_packed_varyings.cpp | 421 > > ++ > +++ > > 1 file changed, 421 insertions(+) > > I'm finding this code to be basically impossible to read. I wish I > had > some kind of concrete suggestion. This is a hard problem. Walking > dereference chains and emitting new ones with reswizzling is probably > going to be awful no matter what. This may be as good as it gets. > > Ian, do you have any suggestions by chance? > > > > > diff --git a/src/glsl/lower_packed_varyings.cpp b/src/glsl/ > lower_packed_varyings.cpp > > index b606cc8..9522969 100644 > > --- a/src/glsl/lower_packed_varyings.cpp > > +++ b/src/glsl/lower_packed_varyings.cpp > > @@ -148,10 +148,28 @@ > > #include "ir.h" > > #include "ir_builder.h" > > #include "ir_optimization.h" > > +#include "ir_rvalue_visitor.h" > > #include "program/prog_instruction.h" > > +#include "util/hash_table.h" > > > > using namespace ir_builder; > > > > +/** > > + * Creates new type for and array when the base type changes. > > + */ > > +static const glsl_type * > > +update_packed_array_type(const glsl_type *type, const glsl_type > *packed_type) > > +{ > > + const glsl_type *element_type = type->fields.array; > > + if (element_type->is_array()) { > > + const glsl_type *new_array_type = > > +update_packed_array_type(element_type, packed_type); > > + return glsl_type::get_array_instance(new_array_type, type- > > >length); > > + } else { > > + return glsl_type::get_array_instance(packed_type, type- > > >length); > > + } > > +} > > + > > static bool > > needs_lowering(ir_variable *var, bool has_enhanced_layouts, > > bool disable_varying_packing) > > @@ -205,6 +223,51 @@ create_packed_var(void * const mem_ctx, const > > char > *packed_name, > > return packed_var; > > } > > > > +/** > > + * Creates a packed varying for the tessellation packing. > > + */ > > +static ir_variable * > > +create_tess_packed_var(void *mem_ctx, ir_variable *unpacked_var) > > +{ > > + /* create packed varying name using location */ > > + char location_str[11]; > > + snprintf(location_str, 11, "%d", unpacked_var->data.location); > > + char *packed_name; > > + if ((ir_variable_mode) unpacked_var->data.mode == > > ir_var_shader_out) > > + packed_name = ralloc_asprintf(mem_ctx, "packed_out:%s", > location_str); > > + else > > + packed_name = ralloc_asprintf(mem_ctx, "packed_in:%s", > > location_str); > > + > > + const glsl_type *packed_type; > > + switch (unpacked_var->type->without_array()->base_type) { > > + case GLSL_TYPE_UINT: > > + packed_type = glsl_type::uvec4_type; > > + break; > > + case GLSL_TYPE_INT: > > + packed_type = glsl_type::ivec4_type; > > + break; > > + case GLSL_TYPE_FLOAT: > > + packed_type = glsl_type::vec4_type; > > + break; > > + case GLSL_TYPE_DOUBLE: > > + packed_type = glsl_type::dvec4_type; > > + break; > > + default: > > + assert(!"Unexpected type in tess varying packing"); > > + return NULL; > > + } > > + > > + /* Create array new array type */ > > Just "new array type", probably? > > > + if (unpacked_var->type->is_array()) { > > + packed_type = update_packed_array_type(unpacked_var->type, > packed_type); > >
Re: [Mesa-dev] [PATCH 1/3] gallium/radeon: remove separate BE path in r600_translate_colorswap
On 26.02.2016 06:09, Oded Gabbay wrote: > After further testing, it appears there is no need for > separate BE path in r600_translate_colorswap() > > The only fix remaining is the change of the last if statement, in the 4 > channels case. Originally, it contained an invalid swizzle configuration > that never got hit, in LE or BE. So the fix is relevant for both systems. > > This patch adds an additional 120 available visuals for LE and BE, > as seen in glxinfo Did you test that this doesn't cause any regressions in piglit gpu.py on x86? (Ideally with both r600g and radeonsi) -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
[ Dropping mesa-stable list from Cc, since sending patches there by e-mail before they've landed on master is basically noise ] On 26.02.2016 06:09, Oded Gabbay wrote: > Since the rework on gallium pipe formats, there is no more need to do > endian swap of the colorformat in the h/w, because the conversion between > mesa format and gallium (pipe) format takes endianess into account (see > the big #if in p_format.h). That may be true for (some?) formats with 4 components of 8 bits, but I'd be surprised if it was true for all formats handled by this function. Just as one example, consider formats with 32 bits per component. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 26/28] glsl: lower tessellation varyings packed with component layout qualifier
On Tuesday, December 29, 2015 4:00:26 PM PST Timothy Arceri wrote: > For tessellation shaders we cannot just copy everything to the packed > varyings like we do in other stages as tessellation uses shared memory for > varyings, therefore it is only safe to copy array elements that the shader > actually uses. Also, you can only copy the exact *components* written by the shader. For example, one nasty thing a valid TCS might do is: patch out ivec4 foo; foo[gl_InvocationID] = gl_InvocationID; which, given four threads, will write <0, 1, 2, 3> to the vector. But if each thread writes the whole vec4 by accident, you may end up with garbage in 3/4 of the components. It would be worth verifying that you handle this correctly. (Such indirecting will probably get lowered to if-ladders, because anything else is fairly crazy...) > This class searches the IR for uses of varyings and then creates > instructions that copy those vars to a packed varying. This means it is > easy to end up with duplicate copies if the varying is used more than once, > also arrays of arrays create a duplicate copy for each dimension that > exists. These issues are not easily resolved without breaking various > corner cases so we leave it to a later IR stage to clean up the mess. > > Note that neither GLSL IR nor NIR can currently can't clean up the > duplicates when and indirect is used as an array index. This patch > assumes that NIR will eventually be able to clean this up. > --- > src/glsl/lower_packed_varyings.cpp | 421 ++ +++ > 1 file changed, 421 insertions(+) I'm finding this code to be basically impossible to read. I wish I had some kind of concrete suggestion. This is a hard problem. Walking dereference chains and emitting new ones with reswizzling is probably going to be awful no matter what. This may be as good as it gets. Ian, do you have any suggestions by chance? > > diff --git a/src/glsl/lower_packed_varyings.cpp b/src/glsl/ lower_packed_varyings.cpp > index b606cc8..9522969 100644 > --- a/src/glsl/lower_packed_varyings.cpp > +++ b/src/glsl/lower_packed_varyings.cpp > @@ -148,10 +148,28 @@ > #include "ir.h" > #include "ir_builder.h" > #include "ir_optimization.h" > +#include "ir_rvalue_visitor.h" > #include "program/prog_instruction.h" > +#include "util/hash_table.h" > > using namespace ir_builder; > > +/** > + * Creates new type for and array when the base type changes. > + */ > +static const glsl_type * > +update_packed_array_type(const glsl_type *type, const glsl_type *packed_type) > +{ > + const glsl_type *element_type = type->fields.array; > + if (element_type->is_array()) { > + const glsl_type *new_array_type = > +update_packed_array_type(element_type, packed_type); > + return glsl_type::get_array_instance(new_array_type, type->length); > + } else { > + return glsl_type::get_array_instance(packed_type, type->length); > + } > +} > + > static bool > needs_lowering(ir_variable *var, bool has_enhanced_layouts, > bool disable_varying_packing) > @@ -205,6 +223,51 @@ create_packed_var(void * const mem_ctx, const char *packed_name, > return packed_var; > } > > +/** > + * Creates a packed varying for the tessellation packing. > + */ > +static ir_variable * > +create_tess_packed_var(void *mem_ctx, ir_variable *unpacked_var) > +{ > + /* create packed varying name using location */ > + char location_str[11]; > + snprintf(location_str, 11, "%d", unpacked_var->data.location); > + char *packed_name; > + if ((ir_variable_mode) unpacked_var->data.mode == ir_var_shader_out) > + packed_name = ralloc_asprintf(mem_ctx, "packed_out:%s", location_str); > + else > + packed_name = ralloc_asprintf(mem_ctx, "packed_in:%s", location_str); > + > + const glsl_type *packed_type; > + switch (unpacked_var->type->without_array()->base_type) { > + case GLSL_TYPE_UINT: > + packed_type = glsl_type::uvec4_type; > + break; > + case GLSL_TYPE_INT: > + packed_type = glsl_type::ivec4_type; > + break; > + case GLSL_TYPE_FLOAT: > + packed_type = glsl_type::vec4_type; > + break; > + case GLSL_TYPE_DOUBLE: > + packed_type = glsl_type::dvec4_type; > + break; > + default: > + assert(!"Unexpected type in tess varying packing"); > + return NULL; > + } > + > + /* Create array new array type */ Just "new array type", probably? > + if (unpacked_var->type->is_array()) { > + packed_type = update_packed_array_type(unpacked_var->type, packed_type); > + } You need to preserve unpacked_var->data.patch here, or else per-patch varyings will become per-vertex varyings (which have malformed types). Note that per-patch varyings don't have to be arrays, but they can be. It might be worth verifying both cases. > + > + return create_packed_var(mem_ctx, packed_name, packed_type, unpacked_var, > +(ir_variable_mode)
Re: [Mesa-dev] [PATCH 3/5] i965/cfg: Eliminate an empty then-branch of an if/else/endif
Ian Romanickwrites: > From: Ian Romanick > > On BDW, > > total instructions in shared programs: 8448571 -> 8448367 (-0.00%) > instructions in affected programs: 21000 -> 20796 (-0.97%) > helped: 116 > HURT: 0 > > Signed-off-by: Ian Romanick > --- > src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp | 18 ++ > 1 file changed, 18 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp > b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp > index 7aa72b1..149596f 100644 > --- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp > +++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp > @@ -34,6 +34,7 @@ > * - if/endif > * . else in else/endif > * - if/else/endif > + * - then in if/else/endif > */ > bool > dead_control_flow_eliminate(backend_shader *s) > @@ -114,6 +115,23 @@ dead_control_flow_eliminate(backend_shader *s) > > progress = true; > } > + } else if (inst->opcode == BRW_OPCODE_ELSE && > + prev_inst->opcode == BRW_OPCODE_IF) { > + bblock_t *const else_block = block; > + bblock_t *const if_block = prev_block; > + backend_instruction *const if_inst = prev_inst; > + backend_instruction *const else_inst = inst; > + > + /* Since the else-branch is becoming the new then-branch, the > + * condition has to be inverted. > + */ > + if_inst->predicate_inverse = !if_inst->predicate_inverse; > + else_inst->remove(else_block); > + > + if (if_block->can_combine_with(else_block)) > +if_block->combine_with(else_block); Ugh, IIRC backend_instruction::remove(block) will remove the block behind your back when it becomes empty (and it will here because ELSE can only be the only instruction left inside 'block' whenever you hit this path), so you're passing a pointer to a no-longer-existing block to (can_)combine_with(). I believe this will never let you combine the blocks anyway because the previous block ends with an IF instruction which you haven't removed, so this is effectively a no-op [assuming it doesn't crash ;)]. If you drop the two lines above you can put my Reviewed-by: Francisco Jerez on this patch and the rest of the series. > + > + progress = true; >} > } > > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] i965/cfg: Split out dead control flow paths to simplify both paths
Ian Romanickwrites: > From: Ian Romanick > > Signed-off-by: Ian Romanick > --- > .../drivers/dri/i965/brw_dead_control_flow.cpp | 93 > +- > 1 file changed, 38 insertions(+), 55 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp > b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp > index dadcff8..64f406d 100644 > --- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp > +++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp > @@ -48,67 +48,50 @@ dead_control_flow_eliminate(backend_shader *s) >/* ENDIF instructions, by definition, can only be found at the start of > * basic blocks. > */ > - if (inst->opcode == BRW_OPCODE_ENDIF) { > - bool found = false; > - bblock_t *if_block = NULL, *else_block = NULL, *endif_block = block; > - backend_instruction *endif_inst = inst; > - > - backend_instruction *if_inst = NULL, *else_inst = NULL; > - if (prev_inst->opcode == BRW_OPCODE_ELSE) { > -else_inst = prev_inst; > -else_block = endif_block->prev(); > -found = true; > - > -/* Don't remove the ENDIF if we didn't find a dead IF. */ > -endif_inst = NULL; > - } > - > - if (prev_inst->opcode == BRW_OPCODE_IF) { > -if_inst = prev_inst; > -if_block = prev_block; > -found = true; > - } > - > - if (found) { > -bblock_t *earlier_block = NULL, *later_block = NULL; > + if (inst->opcode == BRW_OPCODE_ENDIF && > + prev_inst->opcode == BRW_OPCODE_ELSE) { > + bblock_t *const else_block = prev_block; > + backend_instruction *const else_inst = prev_inst; > > -if (if_inst) { > - if (if_block->start_ip == if_block->end_ip) { > - earlier_block = if_block->prev(); > - } else { > - earlier_block = if_block; > - } > - if_inst->remove(if_block); > -} > + else_inst->remove(else_block); > + progress = true; > + } else if (inst->opcode == BRW_OPCODE_ENDIF && > + prev_inst->opcode == BRW_OPCODE_IF) { Just one nitpick: Can you align this to the parenthesis on the previous line? With that fixed: Reviewed-by: Francisco Jerez > + bblock_t *const endif_block = block; > + bblock_t *const if_block = prev_block; > + backend_instruction *const endif_inst = inst; > + backend_instruction *const if_inst = prev_inst; > > -if (else_inst) { > - else_inst->remove(else_block); > -} > + bblock_t *earlier_block = NULL, *later_block = NULL; > > -if (endif_inst) { > - if (endif_block->start_ip == endif_block->end_ip) { > - later_block = endif_block->next(); > - } else { > - later_block = endif_block; > - } > - endif_inst->remove(endif_block); > -} > + if (if_block->start_ip == if_block->end_ip) { > +earlier_block = if_block->prev(); > + } else { > +earlier_block = if_block; > + } > + if_inst->remove(if_block); > > -assert((earlier_block == NULL) == (later_block == NULL)); > -if (earlier_block && > earlier_block->can_combine_with(later_block)) { > - earlier_block->combine_with(later_block); > - > - /* If ENDIF was in its own block, then we've now deleted it > and > -* merged the two surrounding blocks, the latter of which the > -* __next block pointer was pointing to. > -*/ > - if (endif_block != later_block) { > - __next = earlier_block->next(); > - } > + if (endif_block->start_ip == endif_block->end_ip) { > +later_block = endif_block->next(); > + } else { > +later_block = endif_block; > + } > + endif_inst->remove(endif_block); > + > + assert((earlier_block == NULL) == (later_block == NULL)); > + if (earlier_block && earlier_block->can_combine_with(later_block)) { > +earlier_block->combine_with(later_block); > + > +/* If ENDIF was in its own block, then we've now deleted it and > + * merged the two surrounding blocks, the latter of which the > + * __next block pointer was pointing to. > + */ > +if (endif_block != later_block) { > + __next = earlier_block->next(); > } > - > -progress = true; > } > + > + progress = true; >} else if (inst->opcode == BRW_OPCODE_ELSE && > prev_inst->opcode ==
Re: [Mesa-dev] [PATCH] anv: remove stray ; after if
Indeed, that looks like a mistake. Reviewed-by: Matt Turner___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/28] glsl: update explicit location matching to support component qualifier
On Tuesday, December 29, 2015 4:00:12 PM PST Timothy Arceri wrote: > This is needed so we don't optimise away the varying when more than > one shares the same location. > --- > src/glsl/linker.cpp | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp > index 31d55e3..f709922 100644 > --- a/src/glsl/linker.cpp > +++ b/src/glsl/linker.cpp > @@ -2684,7 +2684,7 @@ match_explicit_outputs_to_inputs(struct gl_shader_program *prog, > gl_shader *consumer) > { > glsl_symbol_table parameters; > - ir_variable *explicit_locations[MAX_VARYING] = { NULL }; > + ir_variable *explicit_locations[MAX_VARYING][4] = { {NULL, NULL} }; > > /* Find all shader outputs in the "producer" stage. > */ > @@ -2697,8 +2697,8 @@ match_explicit_outputs_to_inputs(struct gl_shader_program *prog, >if (var->data.explicit_location && >var->data.location >= VARYING_SLOT_VAR0) { > const unsigned idx = var->data.location - VARYING_SLOT_VAR0; > - if (explicit_locations[idx] == NULL) > -explicit_locations[idx] = var; > + if (explicit_locations[idx][var->data.location_frac] == NULL) > +explicit_locations[idx][var->data.location_frac] = var; >} > } > > @@ -2712,7 +2712,8 @@ match_explicit_outputs_to_inputs(struct gl_shader_program *prog, >ir_variable *output = NULL; >if (input->data.explicit_location >&& input->data.location >= VARYING_SLOT_VAR0) { > - output = explicit_locations[input->data.location - VARYING_SLOT_VAR0]; > + output = explicit_locations[input->data.location - VARYING_SLOT_VAR0] > +[input->data.location_frac]; > > if (output != NULL){ > input->data.is_unmatched_generic_inout = 0; > Patches 12-15 are: Reviewed-by: Kenneth Graunkesignature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V3 11/28] glsl: cross validate varyings with a component qualifier
On Friday, January 8, 2016 10:15:58 AM PST Timothy Arceri wrote: > This change checks for component overlap, including handling overlap of > locations and components by doubles. Previously there was no validation > for assigning explicit locations to a location used by the second half > of a double. > > V3: simplify handling of doubles and fix double component aliasing > detection > > V2: fix component matching for matricies > > Cc: Anuj Phogat> --- > src/glsl/link_varyings.cpp | 63 + + > 1 file changed, 52 insertions(+), 11 deletions(-) > > diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp > index 6a9ee94..03c131a 100644 > --- a/src/glsl/link_varyings.cpp > +++ b/src/glsl/link_varyings.cpp > @@ -222,7 +222,7 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog, >gl_shader *producer, gl_shader *consumer) > { > glsl_symbol_table parameters; > - ir_variable *explicit_locations[MAX_VARYING] = { NULL, }; > + ir_variable *explicit_locations[MAX_VARYING][4] = { {NULL, NULL} }; > > /* Find all shader outputs in the "producer" stage. > */ > @@ -243,18 +243,59 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog, > unsigned num_elements = type->count_attribute_slots(false); > unsigned idx = var->data.location - VARYING_SLOT_VAR0; > unsigned slot_limit = idx + num_elements; > + unsigned last_comp; > + > + if (var->type->without_array()->is_record()) { > +/* The componet qualifier can't be used on structs so just treat ^^^ component Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/28] glsl: fix cross validation for explicit locations on structs and arrays
On Tuesday, December 29, 2015 4:00:10 PM PST Timothy Arceri wrote: > --- > src/glsl/link_varyings.cpp | 43 ++- > 1 file changed, 30 insertions(+), 13 deletions(-) > > diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp > index ee7cae0..dea8741 100644 > --- a/src/glsl/link_varyings.cpp > +++ b/src/glsl/link_varyings.cpp > @@ -239,18 +239,24 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog, > /* User-defined varyings with explicit locations are handled >* differently because they do not need to have matching names. >*/ > - const unsigned idx = var->data.location - VARYING_SLOT_VAR0; > + const glsl_type *type = get_varying_type(var, producer->Stage); > + unsigned num_elements = type->count_attribute_slots(false); > + unsigned idx = var->data.location - VARYING_SLOT_VAR0; > + unsigned slot_limit = idx + num_elements; > > - if (explicit_locations[idx] != NULL) { > -linker_error(prog, > + while(idx < slot_limit) { while (idx < slot_limit) { [same comment below] Patches 8-10 are: Reviewed-by: Kenneth Graunke> +if (explicit_locations[idx] != NULL) { > + linker_error(prog, > "%s shader has multiple outputs explicitly " > "assigned to location %d\n", > _mesa_shader_stage_to_string(producer->Stage), > idx); > -return; > - } > + return; > +} > > - explicit_locations[idx] = var; > +explicit_locations[idx] = var; > +idx++; > + } >} > } > > @@ -298,14 +304,25 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog, > ir_variable *output = NULL; > if (input->data.explicit_location > && input->data.location >= VARYING_SLOT_VAR0) { > -output = explicit_locations[input->data.location - VARYING_SLOT_VAR0]; > > -if (output == NULL) { > - linker_error(prog, > -"%s shader input `%s' with explicit location " > -"has no matching output\n", > -_mesa_shader_stage_to_string(consumer->Stage), > -input->name); > +const glsl_type *type = get_varying_type(input, consumer- >Stage); > +unsigned num_elements = type->count_attribute_slots(false); > +unsigned idx = input->data.location - VARYING_SLOT_VAR0; > +unsigned slot_limit = idx + num_elements; > + > +while(idx < slot_limit) { > + output = explicit_locations[idx]; > + > + if (output == NULL || > + input->data.location != output->data.location) { > + linker_error(prog, > + "%s shader input `%s' with explicit location " > + "has no matching output\n", > + _mesa_shader_stage_to_string(consumer- >Stage), > + input->name); > + break; > + } > + idx++; > } > } else { > output = parameters.get_variable(input->name); > signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91556] Struct / union sizes being calculated incorrectly
https://bugs.freedesktop.org/show_bug.cgi?id=91556 Pavan Yalamanchilichanged: What|Removed |Added Summary|clSetKernelArg from OpenCL |Struct / union sizes being |is erroring out incorrectly |calculated incorrectly --- Comment #6 from Pavan Yalamanchili --- @fernando, looks like you are right. The code paths do not even consider that users might be sending in custom data types (aka structs / unions). -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91556] clSetKernelArg from OpenCL is erroring out incorrectly
https://bugs.freedesktop.org/show_bug.cgi?id=91556 --- Comment #5 from Pavan Yalamanchili--- @Fernando, the updated code sample only includes native types inside a struct. But I will look at the file you mention to see if anything fishy is happening. -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2.5/3] glsl: only apply default stream to output blocks
This is needed to allow invalid qualifier checks on inputs. Cc: Samuel Iglesias Gonsálvez--- I missed this in the first series as no tests hit this, I guess that means we have no gs tests that have an input block with a layout qualifier :( Transform feedback qualifiers I'm adding do a similar thing and I was hitting this problem with them. src/compiler/glsl/glsl_parser_extras.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp index ec180c0..2b1cc0d 100644 --- a/src/compiler/glsl/glsl_parser_extras.cpp +++ b/src/compiler/glsl/glsl_parser_extras.cpp @@ -922,7 +922,8 @@ _mesa_ast_process_interface_block(YYLTYPE *locp, block->layout.flags.i |= block_interface_qualifier; if (state->stage == MESA_SHADER_GEOMETRY && - state->has_explicit_attrib_stream()) { + state->has_explicit_attrib_stream() && + block->layout.flags.q.out) { /* Assign global layout's stream value. */ block->layout.flags.q.stream = 1; block->layout.flags.q.explicit_stream = 0; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] anv: remove stray ; after if
Both logic and indentation suggests that the ; were not intended here. --- src/intel/vulkan/anv_cmd_buffer.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c index b060828..827c3ed 100644 --- a/src/intel/vulkan/anv_cmd_buffer.c +++ b/src/intel/vulkan/anv_cmd_buffer.c @@ -465,7 +465,7 @@ void anv_CmdSetViewport( ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer); const uint32_t total_count = firstViewport + viewportCount; - if (cmd_buffer->state.dynamic.viewport.count < total_count); + if (cmd_buffer->state.dynamic.viewport.count < total_count) cmd_buffer->state.dynamic.viewport.count = total_count; memcpy(cmd_buffer->state.dynamic.viewport.viewports + firstViewport, @@ -483,7 +483,7 @@ void anv_CmdSetScissor( ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer); const uint32_t total_count = firstScissor + scissorCount; - if (cmd_buffer->state.dynamic.scissor.count < total_count); + if (cmd_buffer->state.dynamic.scissor.count < total_count) cmd_buffer->state.dynamic.scissor.count = total_count; memcpy(cmd_buffer->state.dynamic.scissor.scissors + firstScissor, -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91556] clSetKernelArg from OpenCL is erroring out incorrectly
https://bugs.freedesktop.org/show_bug.cgi?id=91556 --- Comment #4 from Francisco Jerez--- I haven't run the program myself either, but it's likely that the module::argument::size value is calculated incorrectly in the compiler glue code, check out the definition of arg_api_size in llvm/invocation.cpp:454, which is likely to be wrong for non-builtin types. -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91556] clSetKernelArg from OpenCL is erroring out incorrectly
https://bugs.freedesktop.org/show_bug.cgi?id=91556 Pavan Yalamanchilichanged: What|Removed |Added Priority|medium |high -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91556] clSetKernelArg from OpenCL is erroring out incorrectly
https://bugs.freedesktop.org/show_bug.cgi?id=91556 Pavan Yalamanchilichanged: What|Removed |Added Version|10.6|11.1 -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91556] clSetKernelArg from OpenCL is erroring out incorrectly
https://bugs.freedesktop.org/show_bug.cgi?id=91556 Pavan Yalamanchilichanged: What|Removed |Added Attachment #117515|0 |1 is obsolete|| --- Comment #3 from Pavan Yalamanchili --- Created attachment 121969 --> https://bugs.freedesktop.org/attachment.cgi?id=121969=edit file to reproduce the problem This is still a problem in 11.1.2 -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
On Thu, Feb 25, 2016 at 6:16 PM, Francisco Jerezwrote: > Ian Romanick writes: > >> On 02/25/2016 12:13 PM, Francisco Jerez wrote: >>> Ian Romanick writes: >>> On 02/25/2016 08:46 AM, Roland Scheidegger wrote: > Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: >> From the OpenGL 4.2 spec: >> >> "When a constructor is used to convert any integer or floating-point >> type to a >> bool, 0 and 0.0 are converted to false, and non-zero values are >> converted to >> true." >> >> Thus, even the smallest non-zero floating value should be translated to >> true. >> This behavior has been verified also with proprietary NVIDIA drivers. >> >> Currently, we implement this conversion as a cmp.nz operation with >> floats, >> subject to floating-point precision limitations, and as a result, >> relatively >> small non-zero floating point numbers return false instead of true. >> >> This patch fixes the problem by getting rid of the sign bit (to cover >> the case >> of -0.0) and testing the result against 0u using an integer comparison >> instead. >> --- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- >> 1 file changed, 12 insertions(+), 3 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> index db20c71..7d62d7e 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , >> nir_alu_instr *instr) >>bld.MOV(result, negate(op[0])); >>break; >> >> - case nir_op_f2b: >> - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); >> - break; >> + case nir_op_f2b: { >> + /* Because comparing to 0.0f is subject to precision limitations, >> do the >> + * comparison using integers (we need to get rid of the sign bit >> for that) >> + */ >> + if (devinfo->gen >= 8) >> + op[0] = resolve_source_modifiers(op[0]); >> + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); >> + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); >> + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); >> + break; >> + } >> + >> case nir_op_i2b: >>bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); >>break; >> > > Does that fix anything? I don't really see a problem with the existing > logic. Yes any "non-zero value" should be converted to true. But surely > that definition cannot include denorms, which you are always allowed to > flush to zero. > (Albeit I can't tell what the result would be with NaNs with the float > compare, nor what the result actually should be in this case since glsl > doesn't require NaNs neither.) Based on this and Jason's comments, I think we need a bunch of new tests. - smallest positive normal number - abs of smallest positive normal number - neg of " "" " - largest positive subnormal number - abs of largest positive subnormal number - neg of"""" - all of the above with negative numbers - NaN - abs of NaN - neg of " Perhaps others? +/-Inf just for kicks? >>> >>> What's the point? The result of most of the above (except possibly >>> bool(NaN)) is undefined by the spec: >>> >>> "Any denormalized value input into a shader or potentially generated by >>> any operation in a shader can be flushed to 0. [...] NaNs are not >>> required to be generated. [...] Operations and built-in functions that >>> operate on a NaN are not required to return a NaN as the result." >> >> Except that apparently one major OpenGL vendor does something well >> defined that's different than what we do. > > I'm skeptical that nVidia would treat single-precision denorms > inconsistently between datatype constructors and other floating-point > arithmetic, but assuming that's the case it would be an argument for > proposing the spec change to Khronos rather than introducing a dubiously > compliant change into the back-end. I think I would argue against > making such a change in the spec in any case, because even though it's > implementation-defined whether denorms are flushed or not, the following > is guaranteed by the spec AFAIUI: > > | if (bool(f)) > |random_arithmetic_on(f /* Guaranteed not to be zero here even if > | denorms are flushed */); > > With this change in, bool(f) would evaluate to true even if f is a > denorm which is flushed to zero for all subsequent arithmetic in the > block. For that reason this seems more likely to
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
Ian Romanickwrites: > On 02/25/2016 12:13 PM, Francisco Jerez wrote: >> Ian Romanick writes: >> >>> On 02/25/2016 08:46 AM, Roland Scheidegger wrote: Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: > From the OpenGL 4.2 spec: > > "When a constructor is used to convert any integer or floating-point type > to a > bool, 0 and 0.0 are converted to false, and non-zero values are > converted to > true." > > Thus, even the smallest non-zero floating value should be translated to > true. > This behavior has been verified also with proprietary NVIDIA drivers. > > Currently, we implement this conversion as a cmp.nz operation with floats, > subject to floating-point precision limitations, and as a result, > relatively > small non-zero floating point numbers return false instead of true. > > This patch fixes the problem by getting rid of the sign bit (to cover the > case > of -0.0) and testing the result against 0u using an integer comparison > instead. > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index db20c71..7d62d7e 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , > nir_alu_instr *instr) >bld.MOV(result, negate(op[0])); >break; > > - case nir_op_f2b: > - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); > - break; > + case nir_op_f2b: { > + /* Because comparing to 0.0f is subject to precision limitations, > do the > + * comparison using integers (we need to get rid of the sign bit > for that) > + */ > + if (devinfo->gen >= 8) > + op[0] = resolve_source_modifiers(op[0]); > + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); > + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); > + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); > + break; > + } > + > case nir_op_i2b: >bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); >break; > Does that fix anything? I don't really see a problem with the existing logic. Yes any "non-zero value" should be converted to true. But surely that definition cannot include denorms, which you are always allowed to flush to zero. (Albeit I can't tell what the result would be with NaNs with the float compare, nor what the result actually should be in this case since glsl doesn't require NaNs neither.) >>> >>> Based on this and Jason's comments, I think we need a bunch of new tests. >>> >>> - smallest positive normal number >>> - abs of smallest positive normal number >>> - neg of " "" " >>> - largest positive subnormal number >>> - abs of largest positive subnormal number >>> - neg of"""" >>> - all of the above with negative numbers >>> - NaN >>> - abs of NaN >>> - neg of " >>> >>> Perhaps others? +/-Inf just for kicks? >> >> What's the point? The result of most of the above (except possibly >> bool(NaN)) is undefined by the spec: >> >> "Any denormalized value input into a shader or potentially generated by >> any operation in a shader can be flushed to 0. [...] NaNs are not >> required to be generated. [...] Operations and built-in functions that >> operate on a NaN are not required to return a NaN as the result." > > Except that apparently one major OpenGL vendor does something well > defined that's different than what we do. I'm skeptical that nVidia would treat single-precision denorms inconsistently between datatype constructors and other floating-point arithmetic, but assuming that's the case it would be an argument for proposing the spec change to Khronos rather than introducing a dubiously compliant change into the back-end. I think I would argue against making such a change in the spec in any case, because even though it's implementation-defined whether denorms are flushed or not, the following is guaranteed by the spec AFAIUI: | if (bool(f)) |random_arithmetic_on(f /* Guaranteed not to be zero here even if | denorms are flushed */); With this change in, bool(f) would evaluate to true even if f is a denorm which is flushed to zero for all subsequent arithmetic in the block. For that reason this seems more likely to break stuff than to fix stuff, it's not like applications can expect denorms to be representable at all regardless of what nVidia does, but they can expect the above GLSL source to work. > If we can validate what
[Mesa-dev] [PATCH 2/5] i965/cfg: Track prev_block and prev_inst explicitly in the whole function
From: Ian RomanickThis provides a trivial simplification now, and it makes some future changes more straight forward. Signed-off-by: Ian Romanick --- src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp index 716e2bc..7aa72b1 100644 --- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp +++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp @@ -41,7 +41,9 @@ dead_control_flow_eliminate(backend_shader *s) bool progress = false; foreach_block_safe (block, s->cfg) { + bblock_t *prev_block = block->prev(); backend_instruction *const inst = block->start(); + backend_instruction *prev_inst = prev_block->end(); /* ENDIF instructions, by definition, can only be found at the start of * basic blocks. @@ -52,20 +54,20 @@ dead_control_flow_eliminate(backend_shader *s) backend_instruction *endif_inst = inst; backend_instruction *if_inst = NULL, *else_inst = NULL; - backend_instruction *prev_inst = endif_block->prev()->end(); if (prev_inst->opcode == BRW_OPCODE_ELSE) { else_inst = prev_inst; else_block = endif_block->prev(); found = true; -if (else_block->start_ip == else_block->end_ip) - prev_inst = else_block->prev()->end(); +if (else_block->start_ip == else_block->end_ip) { + prev_block = prev_block->prev(); + prev_inst = prev_block->end(); +} } if (prev_inst->opcode == BRW_OPCODE_IF) { if_inst = prev_inst; -if_block = else_block != NULL ? else_block->prev() - : endif_block->prev(); +if_block = prev_block; found = true; } else { /* Don't remove the ENDIF if we didn't find a dead IF. */ -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] i965/cfg: Split out dead control flow paths to simplify both paths
From: Ian RomanickSigned-off-by: Ian Romanick --- .../drivers/dri/i965/brw_dead_control_flow.cpp | 93 +- 1 file changed, 38 insertions(+), 55 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp index dadcff8..64f406d 100644 --- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp +++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp @@ -48,67 +48,50 @@ dead_control_flow_eliminate(backend_shader *s) /* ENDIF instructions, by definition, can only be found at the start of * basic blocks. */ - if (inst->opcode == BRW_OPCODE_ENDIF) { - bool found = false; - bblock_t *if_block = NULL, *else_block = NULL, *endif_block = block; - backend_instruction *endif_inst = inst; - - backend_instruction *if_inst = NULL, *else_inst = NULL; - if (prev_inst->opcode == BRW_OPCODE_ELSE) { -else_inst = prev_inst; -else_block = endif_block->prev(); -found = true; - -/* Don't remove the ENDIF if we didn't find a dead IF. */ -endif_inst = NULL; - } - - if (prev_inst->opcode == BRW_OPCODE_IF) { -if_inst = prev_inst; -if_block = prev_block; -found = true; - } - - if (found) { -bblock_t *earlier_block = NULL, *later_block = NULL; + if (inst->opcode == BRW_OPCODE_ENDIF && + prev_inst->opcode == BRW_OPCODE_ELSE) { + bblock_t *const else_block = prev_block; + backend_instruction *const else_inst = prev_inst; -if (if_inst) { - if (if_block->start_ip == if_block->end_ip) { - earlier_block = if_block->prev(); - } else { - earlier_block = if_block; - } - if_inst->remove(if_block); -} + else_inst->remove(else_block); + progress = true; + } else if (inst->opcode == BRW_OPCODE_ENDIF && + prev_inst->opcode == BRW_OPCODE_IF) { + bblock_t *const endif_block = block; + bblock_t *const if_block = prev_block; + backend_instruction *const endif_inst = inst; + backend_instruction *const if_inst = prev_inst; -if (else_inst) { - else_inst->remove(else_block); -} + bblock_t *earlier_block = NULL, *later_block = NULL; -if (endif_inst) { - if (endif_block->start_ip == endif_block->end_ip) { - later_block = endif_block->next(); - } else { - later_block = endif_block; - } - endif_inst->remove(endif_block); -} + if (if_block->start_ip == if_block->end_ip) { +earlier_block = if_block->prev(); + } else { +earlier_block = if_block; + } + if_inst->remove(if_block); -assert((earlier_block == NULL) == (later_block == NULL)); -if (earlier_block && earlier_block->can_combine_with(later_block)) { - earlier_block->combine_with(later_block); - - /* If ENDIF was in its own block, then we've now deleted it and -* merged the two surrounding blocks, the latter of which the -* __next block pointer was pointing to. -*/ - if (endif_block != later_block) { - __next = earlier_block->next(); - } + if (endif_block->start_ip == endif_block->end_ip) { +later_block = endif_block->next(); + } else { +later_block = endif_block; + } + endif_inst->remove(endif_block); + + assert((earlier_block == NULL) == (later_block == NULL)); + if (earlier_block && earlier_block->can_combine_with(later_block)) { +earlier_block->combine_with(later_block); + +/* If ENDIF was in its own block, then we've now deleted it and + * merged the two surrounding blocks, the latter of which the + * __next block pointer was pointing to. + */ +if (endif_block != later_block) { + __next = earlier_block->next(); } - -progress = true; } + + progress = true; } else if (inst->opcode == BRW_OPCODE_ELSE && prev_inst->opcode == BRW_OPCODE_IF) { bblock_t *const else_block = block; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/5] Make dead control follow elimination handle empty then-branches
This series replaces the previous single patch. Doing it in dead control flow elimination helps 4 additional shaders versus doing during translation from NIR. As Curro mentioned in his review of the earlier patch, this also has the potential to help vec4 shaders in addition to scalar shaders. I have not measured that, however. .../drivers/dri/i965/brw_dead_control_flow.cpp | 88 ++-- 1 file changed, 43 insertions(+), 45 deletions(-) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/5] i965/cfg: Don't handle fully empty if/else/endif
From: Ian RomanickThis will now never occur. The empty if-else part would have already been removed leaving an empty if-endif part. No shader-db changes. Signed-off-by: Ian Romanick --- src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp index 149596f..dadcff8 100644 --- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp +++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp @@ -33,7 +33,6 @@ * * - if/endif * . else in else/endif - * - if/else/endif * - then in if/else/endif */ bool @@ -44,7 +43,7 @@ dead_control_flow_eliminate(backend_shader *s) foreach_block_safe (block, s->cfg) { bblock_t *prev_block = block->prev(); backend_instruction *const inst = block->start(); - backend_instruction *prev_inst = prev_block->end(); + backend_instruction *const prev_inst = prev_block->end(); /* ENDIF instructions, by definition, can only be found at the start of * basic blocks. @@ -60,19 +59,14 @@ dead_control_flow_eliminate(backend_shader *s) else_block = endif_block->prev(); found = true; -if (else_block->start_ip == else_block->end_ip) { - prev_block = prev_block->prev(); - prev_inst = prev_block->end(); -} +/* Don't remove the ENDIF if we didn't find a dead IF. */ +endif_inst = NULL; } if (prev_inst->opcode == BRW_OPCODE_IF) { if_inst = prev_inst; if_block = prev_block; found = true; - } else { -/* Don't remove the ENDIF if we didn't find a dead IF. */ -endif_inst = NULL; } if (found) { -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/5] i965/cfg: Eliminate an empty then-branch of an if/else/endif
From: Ian RomanickOn BDW, total instructions in shared programs: 8448571 -> 8448367 (-0.00%) instructions in affected programs: 21000 -> 20796 (-0.97%) helped: 116 HURT: 0 Signed-off-by: Ian Romanick --- src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp | 18 ++ 1 file changed, 18 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp index 7aa72b1..149596f 100644 --- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp +++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp @@ -34,6 +34,7 @@ * - if/endif * . else in else/endif * - if/else/endif + * - then in if/else/endif */ bool dead_control_flow_eliminate(backend_shader *s) @@ -114,6 +115,23 @@ dead_control_flow_eliminate(backend_shader *s) progress = true; } + } else if (inst->opcode == BRW_OPCODE_ELSE && + prev_inst->opcode == BRW_OPCODE_IF) { + bblock_t *const else_block = block; + bblock_t *const if_block = prev_block; + backend_instruction *const if_inst = prev_inst; + backend_instruction *const else_inst = inst; + + /* Since the else-branch is becoming the new then-branch, the + * condition has to be inverted. + */ + if_inst->predicate_inverse = !if_inst->predicate_inverse; + else_inst->remove(else_block); + + if (if_block->can_combine_with(else_block)) +if_block->combine_with(else_block); + + progress = true; } } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/5] i965/cfg: Slightly rearrange dead_control_flow_eliminate
From: Ian Romanick'git diff -w' is a bit more illustrative. A couple declarations were moved, the continue was removed, and the code was reindented. This will simplify future changes. Signed-off-by: Ian Romanick --- .../drivers/dri/i965/brw_dead_control_flow.cpp | 113 +++-- 1 file changed, 57 insertions(+), 56 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp index 61f2581..716e2bc 100644 --- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp +++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp @@ -41,76 +41,77 @@ dead_control_flow_eliminate(backend_shader *s) bool progress = false; foreach_block_safe (block, s->cfg) { - bblock_t *if_block = NULL, *else_block = NULL, *endif_block = block; - bool found = false; + backend_instruction *const inst = block->start(); /* ENDIF instructions, by definition, can only be found at the start of * basic blocks. */ - backend_instruction *endif_inst = endif_block->start(); - if (endif_inst->opcode != BRW_OPCODE_ENDIF) - continue; - - backend_instruction *if_inst = NULL, *else_inst = NULL; - backend_instruction *prev_inst = endif_block->prev()->end(); - if (prev_inst->opcode == BRW_OPCODE_ELSE) { - else_inst = prev_inst; - else_block = endif_block->prev(); - found = true; - - if (else_block->start_ip == else_block->end_ip) -prev_inst = else_block->prev()->end(); - } + if (inst->opcode == BRW_OPCODE_ENDIF) { + bool found = false; + bblock_t *if_block = NULL, *else_block = NULL, *endif_block = block; + backend_instruction *endif_inst = inst; + + backend_instruction *if_inst = NULL, *else_inst = NULL; + backend_instruction *prev_inst = endif_block->prev()->end(); + if (prev_inst->opcode == BRW_OPCODE_ELSE) { +else_inst = prev_inst; +else_block = endif_block->prev(); +found = true; + +if (else_block->start_ip == else_block->end_ip) + prev_inst = else_block->prev()->end(); + } - if (prev_inst->opcode == BRW_OPCODE_IF) { - if_inst = prev_inst; - if_block = else_block != NULL ? else_block->prev() - : endif_block->prev(); - found = true; - } else { - /* Don't remove the ENDIF if we didn't find a dead IF. */ - endif_inst = NULL; - } + if (prev_inst->opcode == BRW_OPCODE_IF) { +if_inst = prev_inst; +if_block = else_block != NULL ? else_block->prev() + : endif_block->prev(); +found = true; + } else { +/* Don't remove the ENDIF if we didn't find a dead IF. */ +endif_inst = NULL; + } - if (found) { - bblock_t *earlier_block = NULL, *later_block = NULL; + if (found) { +bblock_t *earlier_block = NULL, *later_block = NULL; - if (if_inst) { -if (if_block->start_ip == if_block->end_ip) { - earlier_block = if_block->prev(); -} else { - earlier_block = if_block; +if (if_inst) { + if (if_block->start_ip == if_block->end_ip) { + earlier_block = if_block->prev(); + } else { + earlier_block = if_block; + } + if_inst->remove(if_block); } -if_inst->remove(if_block); - } - if (else_inst) { -else_inst->remove(else_block); - } - - if (endif_inst) { -if (endif_block->start_ip == endif_block->end_ip) { - later_block = endif_block->next(); -} else { - later_block = endif_block; +if (else_inst) { + else_inst->remove(else_block); } -endif_inst->remove(endif_block); - } - assert((earlier_block == NULL) == (later_block == NULL)); - if (earlier_block && earlier_block->can_combine_with(later_block)) { -earlier_block->combine_with(later_block); +if (endif_inst) { + if (endif_block->start_ip == endif_block->end_ip) { + later_block = endif_block->next(); + } else { + later_block = endif_block; + } + endif_inst->remove(endif_block); +} -/* If ENDIF was in its own block, then we've now deleted it and - * merged the two surrounding blocks, the latter of which the - * __next block pointer was pointing to. - */ -if (endif_block != later_block) { - __next = earlier_block->next(); +assert((earlier_block ==
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
On 02/25/2016 12:13 PM, Francisco Jerez wrote: > Ian Romanickwrites: > >> On 02/25/2016 08:46 AM, Roland Scheidegger wrote: >>> Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: From the OpenGL 4.2 spec: "When a constructor is used to convert any integer or floating-point type to a bool, 0 and 0.0 are converted to false, and non-zero values are converted to true." Thus, even the smallest non-zero floating value should be translated to true. This behavior has been verified also with proprietary NVIDIA drivers. Currently, we implement this conversion as a cmp.nz operation with floats, subject to floating-point precision limitations, and as a result, relatively small non-zero floating point numbers return false instead of true. This patch fixes the problem by getting rid of the sign bit (to cover the case of -0.0) and testing the result against 0u using an integer comparison instead. --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index db20c71..7d62d7e 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , nir_alu_instr *instr) bld.MOV(result, negate(op[0])); break; - case nir_op_f2b: - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); - break; + case nir_op_f2b: { + /* Because comparing to 0.0f is subject to precision limitations, do the + * comparison using integers (we need to get rid of the sign bit for that) + */ + if (devinfo->gen >= 8) + op[0] = resolve_source_modifiers(op[0]); + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); + break; + } + case nir_op_i2b: bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); break; >>> >>> Does that fix anything? I don't really see a problem with the existing >>> logic. Yes any "non-zero value" should be converted to true. But surely >>> that definition cannot include denorms, which you are always allowed to >>> flush to zero. >>> (Albeit I can't tell what the result would be with NaNs with the float >>> compare, nor what the result actually should be in this case since glsl >>> doesn't require NaNs neither.) >> >> Based on this and Jason's comments, I think we need a bunch of new tests. >> >> - smallest positive normal number >> - abs of smallest positive normal number >> - neg of " "" " >> - largest positive subnormal number >> - abs of largest positive subnormal number >> - neg of"""" >> - all of the above with negative numbers >> - NaN >> - abs of NaN >> - neg of " >> >> Perhaps others? +/-Inf just for kicks? > > What's the point? The result of most of the above (except possibly > bool(NaN)) is undefined by the spec: > > "Any denormalized value input into a shader or potentially generated by > any operation in a shader can be flushed to 0. [...] NaNs are not > required to be generated. [...] Operations and built-in functions that > operate on a NaN are not required to return a NaN as the result." Except that apparently one major OpenGL vendor does something well defined that's different than what we do. If we can validate what the behavior is across multiple implementations, we may find a set of common behavior that differs from ours. If other people commonly do a thing that's different than what we do, there's a very good chance that some application depends on that behavior. This certainly would not be the first time. >>> Roland >>> >>> ___ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
On Thu, Feb 25, 2016 at 11:23 PM, Matt Turnerwrote: > On Thu, Feb 25, 2016 at 1:19 PM, Oded Gabbay wrote: >> On Thu, Feb 25, 2016 at 11:16 PM, Matt Turner wrote: >>> On Thu, Feb 25, 2016 at 1:09 PM, Oded Gabbay wrote: Since the rework on gallium pipe formats, there is no more need to do endian swap of the colorformat in the h/w, because the conversion between mesa format and gallium (pipe) format takes endianess into account (see the big #if in p_format.h). Signed-off-by: Oded Gabbay Cc: "11.1 11.2" --- src/gallium/drivers/r600/r600_state_common.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c index c3346f2..614b0fb 100644 --- a/src/gallium/drivers/r600/r600_state_common.c +++ b/src/gallium/drivers/r600/r600_state_common.c @@ -2704,6 +2704,12 @@ uint32_t r600_translate_colorformat(enum chip_class chip, enum pipe_format forma uint32_t r600_colorformat_endian_swap(uint32_t colorformat) { + /* +* No need to do endian swaps on colors, as mesa<-->pipe formats +* conversion take into account the endian issue +*/ + return ENDIAN_NONE; >>> >>> Surely you didn't mean to leave the now-unreachable 50 line switch >>> statement below? >>> >> >> I actually didn't know if to delete it for good, or leave it there in >> case we will need it back. Of course, we can always get it back from >> the git log, but... >> >> If you guys feel strongly about it, I can delete it. > > There's no point in leaving in code that's commented out, and I think > this is worse than that. > > I don't work on this driver, but if this function is really supposed > to do nothing... maybe you can just delete it? As I said, I have no problem either way. If the AMD guys will have no problem as well, than I'll just delete it. Oded ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
On Thu, Feb 25, 2016 at 1:19 PM, Oded Gabbaywrote: > On Thu, Feb 25, 2016 at 11:16 PM, Matt Turner wrote: >> On Thu, Feb 25, 2016 at 1:09 PM, Oded Gabbay wrote: >>> Since the rework on gallium pipe formats, there is no more need to do >>> endian swap of the colorformat in the h/w, because the conversion between >>> mesa format and gallium (pipe) format takes endianess into account (see >>> the big #if in p_format.h). >>> >>> Signed-off-by: Oded Gabbay >>> Cc: "11.1 11.2" >>> --- >>> src/gallium/drivers/r600/r600_state_common.c | 6 ++ >>> 1 file changed, 6 insertions(+) >>> >>> diff --git a/src/gallium/drivers/r600/r600_state_common.c >>> b/src/gallium/drivers/r600/r600_state_common.c >>> index c3346f2..614b0fb 100644 >>> --- a/src/gallium/drivers/r600/r600_state_common.c >>> +++ b/src/gallium/drivers/r600/r600_state_common.c >>> @@ -2704,6 +2704,12 @@ uint32_t r600_translate_colorformat(enum chip_class >>> chip, enum pipe_format forma >>> >>> uint32_t r600_colorformat_endian_swap(uint32_t colorformat) >>> { >>> + /* >>> +* No need to do endian swaps on colors, as mesa<-->pipe formats >>> +* conversion take into account the endian issue >>> +*/ >>> + return ENDIAN_NONE; >> >> Surely you didn't mean to leave the now-unreachable 50 line switch >> statement below? >> > > I actually didn't know if to delete it for good, or leave it there in > case we will need it back. Of course, we can always get it back from > the git log, but... > > If you guys feel strongly about it, I can delete it. There's no point in leaving in code that's commented out, and I think this is worse than that. I don't work on this driver, but if this function is really supposed to do nothing... maybe you can just delete it? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/11] program: Remove extra reference_program()
On 02/25/2016 10:01 PM, Marek Olšák wrote: On Thu, Feb 25, 2016 at 9:31 PM, Miklós Mátéwrote: I noticed that this has been reviewed, but has not been committed. Does it require further action from me, or was it just forgotten? Ideally you would ask somebody to push the patch for you. I'll do it. Marek Thank you. MM ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
On Thu, Feb 25, 2016 at 11:16 PM, Matt Turnerwrote: > On Thu, Feb 25, 2016 at 1:09 PM, Oded Gabbay wrote: >> Since the rework on gallium pipe formats, there is no more need to do >> endian swap of the colorformat in the h/w, because the conversion between >> mesa format and gallium (pipe) format takes endianess into account (see >> the big #if in p_format.h). >> >> Signed-off-by: Oded Gabbay >> Cc: "11.1 11.2" >> --- >> src/gallium/drivers/r600/r600_state_common.c | 6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/src/gallium/drivers/r600/r600_state_common.c >> b/src/gallium/drivers/r600/r600_state_common.c >> index c3346f2..614b0fb 100644 >> --- a/src/gallium/drivers/r600/r600_state_common.c >> +++ b/src/gallium/drivers/r600/r600_state_common.c >> @@ -2704,6 +2704,12 @@ uint32_t r600_translate_colorformat(enum chip_class >> chip, enum pipe_format forma >> >> uint32_t r600_colorformat_endian_swap(uint32_t colorformat) >> { >> + /* >> +* No need to do endian swaps on colors, as mesa<-->pipe formats >> +* conversion take into account the endian issue >> +*/ >> + return ENDIAN_NONE; > > Surely you didn't mean to leave the now-unreachable 50 line switch > statement below? > I actually didn't know if to delete it for good, or leave it there in case we will need it back. Of course, we can always get it back from the git log, but... If you guys feel strongly about it, I can delete it. Oded >> + >> if (R600_BIG_ENDIAN) { >> switch(colorformat) { >> /* 8-bit buffers. */ >> -- >> 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
On Thu, Feb 25, 2016 at 1:09 PM, Oded Gabbaywrote: > Since the rework on gallium pipe formats, there is no more need to do > endian swap of the colorformat in the h/w, because the conversion between > mesa format and gallium (pipe) format takes endianess into account (see > the big #if in p_format.h). > > Signed-off-by: Oded Gabbay > Cc: "11.1 11.2" > --- > src/gallium/drivers/r600/r600_state_common.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/src/gallium/drivers/r600/r600_state_common.c > b/src/gallium/drivers/r600/r600_state_common.c > index c3346f2..614b0fb 100644 > --- a/src/gallium/drivers/r600/r600_state_common.c > +++ b/src/gallium/drivers/r600/r600_state_common.c > @@ -2704,6 +2704,12 @@ uint32_t r600_translate_colorformat(enum chip_class > chip, enum pipe_format forma > > uint32_t r600_colorformat_endian_swap(uint32_t colorformat) > { > + /* > +* No need to do endian swaps on colors, as mesa<-->pipe formats > +* conversion take into account the endian issue > +*/ > + return ENDIAN_NONE; Surely you didn't mean to leave the now-unreachable 50 line switch statement below? > + > if (R600_BIG_ENDIAN) { > switch(colorformat) { > /* 8-bit buffers. */ > -- > 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] gallium/r600: Don't let h/w do endian swap for colorformat
Since the rework on gallium pipe formats, there is no more need to do endian swap of the colorformat in the h/w, because the conversion between mesa format and gallium (pipe) format takes endianess into account (see the big #if in p_format.h). Signed-off-by: Oded GabbayCc: "11.1 11.2" --- src/gallium/drivers/r600/r600_state_common.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c index c3346f2..614b0fb 100644 --- a/src/gallium/drivers/r600/r600_state_common.c +++ b/src/gallium/drivers/r600/r600_state_common.c @@ -2704,6 +2704,12 @@ uint32_t r600_translate_colorformat(enum chip_class chip, enum pipe_format forma uint32_t r600_colorformat_endian_swap(uint32_t colorformat) { + /* +* No need to do endian swaps on colors, as mesa<-->pipe formats +* conversion take into account the endian issue +*/ + return ENDIAN_NONE; + if (R600_BIG_ENDIAN) { switch(colorformat) { /* 8-bit buffers. */ -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] gallium/radeon: disable evergreen_do_fast_color_clear for BE
This function is currently broken for BE. I assume it's because of util_pack_color(). Until I fix this path, I prefer to disable it so users would be able to see correct colors on their desktop and applications. Together with the two following patches: - gallium/r600: Don't let h/w do endian swap for colorformat - gallium/radeon: remove separate BE path in r600_translate_colorswap it fixes BZ#72877 and BZ#92039 Signed-off-by: Oded GabbayCc: "11.1 11.2" --- src/gallium/drivers/radeon/r600_texture.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index 454d0f1..0b31d0a 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -1408,6 +1408,11 @@ void evergreen_do_fast_color_clear(struct r600_common_context *rctx, { int i; + /* This function is broken in BE, so just disable this path for now */ +#ifdef PIPE_ARCH_BIG_ENDIAN + return; +#endif + if (rctx->render_cond) return; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] gallium/radeon: remove separate BE path in r600_translate_colorswap
After further testing, it appears there is no need for separate BE path in r600_translate_colorswap() The only fix remaining is the change of the last if statement, in the 4 channels case. Originally, it contained an invalid swizzle configuration that never got hit, in LE or BE. So the fix is relevant for both systems. This patch adds an additional 120 available visuals for LE and BE, as seen in glxinfo Signed-off-by: Oded GabbayCc: "11.1 11.2" --- src/gallium/drivers/radeon/r600_texture.c | 13 + 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index 9a3ccb5..454d0f1 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -1293,25 +1293,14 @@ unsigned r600_translate_colorswap(enum pipe_format format) break; case 4: /* check the middle channels, the 1st and 4th channel can be NONE */ -#ifdef PIPE_ARCH_LITTLE_ENDIAN if (HAS_SWIZZLE(1,Y) && HAS_SWIZZLE(2,Z)) return V_0280A0_SWAP_STD; /* XYZW */ else if (HAS_SWIZZLE(1,Z) && HAS_SWIZZLE(2,Y)) return V_0280A0_SWAP_STD_REV; /* WZYX */ else if (HAS_SWIZZLE(1,Y) && HAS_SWIZZLE(2,X)) return V_0280A0_SWAP_ALT; /* ZYXW */ - else if (HAS_SWIZZLE(1,X) && HAS_SWIZZLE(2,Y)) - return V_0280A0_SWAP_ALT_REV; /* WXYZ */ -#else - if (HAS_SWIZZLE(1,W) && HAS_SWIZZLE(2,X)) - return V_0280A0_SWAP_STD_REV; /* ZWXY */ - else if (HAS_SWIZZLE(1,X) && HAS_SWIZZLE(2,W)) - return V_0280A0_SWAP_STD; /* YXWZ */ - else if (HAS_SWIZZLE(1,W) && HAS_SWIZZLE(2,Z)) - return V_0280A0_SWAP_ALT_REV; /* XWZY */ else if (HAS_SWIZZLE(1,Z) && HAS_SWIZZLE(2,W)) - return V_0280A0_SWAP_ALT; /* YZWX */ -#endif + return V_0280A0_SWAP_ALT_REV; /* YZWX */ break; } return ~0U; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/3] Fix desktop colors for r600g on Big-Endian machines
So I finally managed to get the desktop colors to work correctly. Apparently, my previous fixes were partially wrong (but also partially correct). There are two major points: 1. Because the mesa <--> pipe format conversion takes into account endianess (see p_format.h), there is no need to do a colorformat endian swap in the H/w 2. evergreen_do_fast_color_clear is broken on BE, probably because of the packing. I need to fix that, but it may take a bit more time, so in the meantime, I would like to disable this path for BE. No real harm is done but now the colors on the desktop are correct. Thanks, Oded Oded Gabbay (3): gallium/radeon: remove separate BE path in r600_translate_colorswap gallium/r600: Don't let h/w do endian swap for colorformat gallium/radeon: disable evergreen_do_fast_color_clear for BE src/gallium/drivers/r600/r600_state_common.c | 6 ++ src/gallium/drivers/radeon/r600_texture.c| 18 ++ 2 files changed, 12 insertions(+), 12 deletions(-) -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/11] program: Remove extra reference_program()
On Thu, Feb 25, 2016 at 9:31 PM, Miklós Mátéwrote: > I noticed that this has been reviewed, but has not been committed. Does it > require further action from me, or was it just forgotten? Ideally you would ask somebody to push the patch for you. I'll do it. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] mesa: optionally associate a gl_program to ati_fragment_shader
On 02/25/2016 07:38 PM, Ian Romanick wrote: On 02/24/2016 05:37 PM, Brian Paul wrote: On 02/24/2016 04:35 PM, Miklós Máté wrote: the state tracker will use it Signed-off-by: Miklós Máté--- src/mesa/drivers/common/driverfuncs.c | 3 +++ src/mesa/main/atifragshader.c | 13 - src/mesa/main/dd.h| 7 ++- src/mesa/main/mtypes.h| 1 + src/mesa/main/state.c | 14 +- 5 files changed, 35 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/common/driverfuncs.c b/src/mesa/drivers/common/driverfuncs.c index 752aaf6..65a0cf8 100644 --- a/src/mesa/drivers/common/driverfuncs.c +++ b/src/mesa/drivers/common/driverfuncs.c @@ -117,6 +117,9 @@ _mesa_init_driver_functions(struct dd_function_table *driver) driver->NewProgram = _mesa_new_program; driver->DeleteProgram = _mesa_delete_program; + /* ATI_fragment_shader */ + driver->NewATIfs = NULL; + /* simple state commands */ driver->AlphaFunc = NULL; driver->BlendColor = NULL; diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c index 8fcbff6..34f45c6 100644 --- a/src/mesa/main/atifragshader.c +++ b/src/mesa/main/atifragshader.c @@ -30,6 +30,7 @@ #include "main/mtypes.h" #include "main/dispatch.h" #include "main/atifragshader.h" +#include "program/program.h" #define MESA_DEBUG_ATI_FS 0 @@ -63,6 +64,7 @@ _mesa_delete_ati_fragment_shader(struct gl_context *ctx, struct ati_fragment_sha free(s->Instructions[i]); free(s->SetupInst[i]); } + _mesa_reference_program(ctx, >Program, NULL); free(s); } @@ -321,6 +323,8 @@ _mesa_BeginFragmentShaderATI(void) free(ctx->ATIFragmentShader.Current->SetupInst[i]); } + _mesa_reference_program(ctx, >ATIFragmentShader.Current->Program, NULL); + /* malloc the instructions here - not sure if the best place but its a start */ for (i = 0; i < MAX_NUM_PASSES_ATI; i++) { @@ -405,7 +409,14 @@ _mesa_EndFragmentShaderATI(void) } #endif - if (!ctx->Driver.ProgramStringNotify(ctx, GL_FRAGMENT_SHADER_ATI, NULL)) { + if (ctx->Driver.NewATIfs) { + struct gl_program *prog = ctx->Driver.NewATIfs(ctx, + ctx->ATIFragmentShader.Current); + _mesa_reference_program(ctx, >ATIFragmentShader.Current->Program, prog); + } + + if (!ctx->Driver.ProgramStringNotify(ctx, GL_FRAGMENT_SHADER_ATI, +curProg->Program)) { ctx->ATIFragmentShader.Current->isValid = GL_FALSE; /* XXX is this the right error? */ _mesa_error(ctx, GL_INVALID_OPERATION, diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h index 3f5aa5d..8410a15 100644 --- a/src/mesa/main/dd.h +++ b/src/mesa/main/dd.h @@ -473,7 +473,12 @@ struct dd_function_table { struct gl_program * (*NewProgram)(struct gl_context *ctx, GLenum target, GLuint id); /** Delete a program */ - void (*DeleteProgram)(struct gl_context *ctx, struct gl_program *prog); + void (*DeleteProgram)(struct gl_context *ctx, struct gl_program *prog); + /** +* Allocate a program to associate with the new ATI fragment shader (optional) +*/ + struct gl_program * (*NewATIfs)(struct gl_context *ctx, + struct ati_fragment_shader *curProg); The second line of the function decl should be indented more. See other nearby functions for examples. Also... what changed in the DeleteProgram line? I've been staring at it, but I can't see the sailboat. I accidentally included the removal of the trailing whitespace. My editor automatically removes them, so I have to pick changes line-by-line. Patch looks OK otherwise. Acked-by: Brian Paul With the various whitespace issues fixed (and I think the DeleteProgram change is a whitespace issue of some sort), this patch is Reviewed-by: Ian Romanick Miklós, I assume you need someone to commit this for you? I can fix the minor whitespace problems and commit it. Yes, thank you. I thought I got rid of all formatting issues since the last time, but it seems some of them eluded my attention. MM /** * Notify driver that a program string (and GPU code) has been specified * or modified. Return GL_TRUE or GL_FALSE to indicate if the program is diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 12d3863..22e8a21 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2197,6 +2197,7 @@ struct ati_fragment_shader GLboolean interpinp1; GLboolean isValid; GLuint swizzlerq; + struct gl_program *Program; }; /** diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c index 57f1341..f4e8288 100644 --- a/src/mesa/main/state.c +++ b/src/mesa/main/state.c @@ -124,7 +124,8 @@ update_program(struct gl_context *ctx) * follows: * 1. OpenGL 2.0/ARB
Re: [Mesa-dev] [PATCH 11/11] program: Remove extra reference_program()
I noticed that this has been reviewed, but has not been committed. Does it require further action from me, or was it just forgotten? MM On 02/03/2016 10:06 AM, Marek Olšák wrote: Reviewed-by: Marek OlšákMarek On Wed, Dec 16, 2015 at 12:05 AM, Miklós Máté wrote: It was already done in get_mesa_program() --- src/mesa/program/ir_to_mesa.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp index 8f58f3e..a28cf97 100644 --- a/src/mesa/program/ir_to_mesa.cpp +++ b/src/mesa/program/ir_to_mesa.cpp @@ -2938,8 +2938,6 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) if (linked_prog) { _mesa_copy_linked_program_data((gl_shader_stage) i, prog, linked_prog); -_mesa_reference_program(ctx, >_LinkedShaders[i]->Program, -linked_prog); if (!ctx->Driver.ProgramStringNotify(ctx, _mesa_shader_stage_to_program(i), linked_prog)) { -- 2.6.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] build/nir: Remove unused Makefile.sources from nir folder
Redoing the Android/SCons got extra messy, so I opted for this 'hack'. There was even plan B of doing a "Open vSwitch", where we use it (and makes all the the whole build parallel). As plan B is off the table for now (quite evasive) I'll beat things in shape to remove this file. -Emil On 25 February 2016 at 17:19, Jason Ekstrandwrote: > I would really like to see this happen. However, I seem to recall Emil > having some reason for keeping it. Emil? > --Jason > > On Thu, Feb 25, 2016 at 4:50 AM, Eduardo Lima Mitev > wrote: >> >> NIR sources are added in src/compiler/Makefile.sources. >> --- >> src/compiler/nir/Makefile.sources | 71 >> --- >> 1 file changed, 71 deletions(-) >> delete mode 100644 src/compiler/nir/Makefile.sources >> >> diff --git a/src/compiler/nir/Makefile.sources >> b/src/compiler/nir/Makefile.sources >> deleted file mode 100644 >> index 0755a10..000 >> --- a/src/compiler/nir/Makefile.sources >> +++ /dev/null >> @@ -1,71 +0,0 @@ >> -NIR_GENERATED_FILES = \ >> - nir_builder_opcodes.h \ >> - nir_constant_expressions.c \ >> - nir_opcodes.c \ >> - nir_opcodes.h \ >> - nir_opt_algebraic.c >> - >> -NIR_FILES = \ >> - glsl_to_nir.cpp \ >> - glsl_to_nir.h \ >> - nir.c \ >> - nir.h \ >> - nir_array.h \ >> - nir_builder.h \ >> - nir_clone.c \ >> - nir_constant_expressions.h \ >> - nir_control_flow.c \ >> - nir_control_flow.h \ >> - nir_control_flow_private.h \ >> - nir_dominance.c \ >> - nir_from_ssa.c \ >> - nir_gs_count_vertices.c \ >> - nir_intrinsics.c \ >> - nir_intrinsics.h \ >> - nir_instr_set.c \ >> - nir_instr_set.h \ >> - nir_liveness.c \ >> - nir_lower_alu_to_scalar.c \ >> - nir_lower_atomics.c \ >> - nir_lower_clip.c \ >> - nir_lower_global_vars_to_local.c \ >> - nir_lower_gs_intrinsics.c \ >> - nir_lower_load_const_to_scalar.c \ >> - nir_lower_locals_to_regs.c \ >> - nir_lower_idiv.c \ >> - nir_lower_io.c \ >> - nir_lower_outputs_to_temporaries.c \ >> - nir_lower_phis_to_scalar.c \ >> - nir_lower_samplers.c \ >> - nir_lower_system_values.c \ >> - nir_lower_tex.c \ >> - nir_lower_to_source_mods.c \ >> - nir_lower_two_sided_color.c \ >> - nir_lower_vars_to_ssa.c \ >> - nir_lower_var_copies.c \ >> - nir_lower_vec_to_movs.c \ >> - nir_metadata.c \ >> - nir_move_vec_src_uses_to_dest.c \ >> - nir_normalize_cubemap_coords.c \ >> - nir_opt_constant_folding.c \ >> - nir_opt_copy_propagate.c \ >> - nir_opt_cse.c \ >> - nir_opt_dce.c \ >> - nir_opt_dead_cf.c \ >> - nir_opt_gcm.c \ >> - nir_opt_global_to_local.c \ >> - nir_opt_peephole_select.c \ >> - nir_opt_remove_phis.c \ >> - nir_opt_undef.c \ >> - nir_print.c \ >> - nir_remove_dead_variables.c \ >> - nir_search.c \ >> - nir_search.h \ >> - nir_split_var_copies.c \ >> - nir_sweep.c \ >> - nir_to_ssa.c \ >> - nir_validate.c \ >> - nir_vla.h \ >> - nir_worklist.c \ >> - nir_worklist.h >> - >> -- >> 2.5.3 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
Ian Romanickwrites: > On 02/25/2016 08:46 AM, Roland Scheidegger wrote: >> Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: >>> From the OpenGL 4.2 spec: >>> >>> "When a constructor is used to convert any integer or floating-point type >>> to a >>> bool, 0 and 0.0 are converted to false, and non-zero values are converted >>> to >>> true." >>> >>> Thus, even the smallest non-zero floating value should be translated to >>> true. >>> This behavior has been verified also with proprietary NVIDIA drivers. >>> >>> Currently, we implement this conversion as a cmp.nz operation with floats, >>> subject to floating-point precision limitations, and as a result, relatively >>> small non-zero floating point numbers return false instead of true. >>> >>> This patch fixes the problem by getting rid of the sign bit (to cover the >>> case >>> of -0.0) and testing the result against 0u using an integer comparison >>> instead. >>> --- >>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- >>> 1 file changed, 12 insertions(+), 3 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >>> index db20c71..7d62d7e 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >>> @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , >>> nir_alu_instr *instr) >>>bld.MOV(result, negate(op[0])); >>>break; >>> >>> - case nir_op_f2b: >>> - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); >>> - break; >>> + case nir_op_f2b: { >>> + /* Because comparing to 0.0f is subject to precision limitations, do >>> the >>> + * comparison using integers (we need to get rid of the sign bit for >>> that) >>> + */ >>> + if (devinfo->gen >= 8) >>> + op[0] = resolve_source_modifiers(op[0]); >>> + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); >>> + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); >>> + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); >>> + break; >>> + } >>> + >>> case nir_op_i2b: >>>bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); >>>break; >>> >> >> Does that fix anything? I don't really see a problem with the existing >> logic. Yes any "non-zero value" should be converted to true. But surely >> that definition cannot include denorms, which you are always allowed to >> flush to zero. >> (Albeit I can't tell what the result would be with NaNs with the float >> compare, nor what the result actually should be in this case since glsl >> doesn't require NaNs neither.) > > Based on this and Jason's comments, I think we need a bunch of new tests. > > - smallest positive normal number > - abs of smallest positive normal number > - neg of " "" " > - largest positive subnormal number > - abs of largest positive subnormal number > - neg of"""" > - all of the above with negative numbers > - NaN > - abs of NaN > - neg of " > > Perhaps others? +/-Inf just for kicks? > What's the point? The result of most of the above (except possibly bool(NaN)) is undefined by the spec: "Any denormalized value input into a shader or potentially generated by any operation in a shader can be flushed to 0. [...] NaNs are not required to be generated. [...] Operations and built-in functions that operate on a NaN are not required to return a NaN as the result." >> Roland >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 94295] [swrast] piglit shader_runner fast_color_clear/all-colors regression
https://bugs.freedesktop.org/show_bug.cgi?id=94295 Bug ID: 94295 Summary: [swrast] piglit shader_runner fast_color_clear/all-colors regression Product: Mesa Version: 11.2 Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Keywords: bisected, regression Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: v...@freedesktop.org QA Contact: mesa-dev@lists.freedesktop.org CC: imir...@alum.mit.edu, lem...@gmail.com, plamena.manol...@intel.com mesa: d1509a5848dee57b933139ad2610e99ae09cb5ec (master 11.3.0-devel) $ ./bin/shader_runner tests/fast_color_clear/all-colors.shader_test -auto Segmentation fault (core dumped) (gdb) bt #0 find_empty_block (prog=0xf2ae10, uniform=0xf2f030) at glsl/link_uniforms.cpp:1051 #1 link_assign_uniform_locations (prog=prog@entry=0xf2ae10, boolean_true=1065353216, num_explicit_uniform_locs=num_explicit_uniform_locs@entry=4294967295, max_uniform_locs=98304) at glsl/link_uniforms.cpp:1238 #2 0x7fd99ef73db9 in link_shaders (ctx=ctx@entry=0x7fd9a4a99010, prog=prog@entry=0xf2ae10) at glsl/linker.cpp:4566 #3 0x7fd99eecb3fb in _mesa_glsl_link_shader (ctx=ctx@entry=0x7fd9a4a99010, prog=prog@entry=0xf2ae10) at program/ir_to_mesa.cpp:3036 #4 0x7fd99edd1b8a in link_program (ctx=0x7fd9a4a99010, program=) at main/shaderapi.c:1048 #5 0x7fd9a45dafec in stub_glLinkProgram (program=3) at piglit/tests/util/piglit-dispatch-gen.c:32599 #6 0x0040776a in link_and_use_shaders () at piglit/tests/shaders/shader_runner.c:1042 #7 0x0040e02c in piglit_init (argc=2, argv=0x7ffd6815d008) at piglit/tests/shaders/shader_runner.c:3292 #8 0x7fd9a464b7fb in run_test (gl_fw=0xd34c20, argc=2, argv=0x7ffd6815d008) at piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:73 #9 0x7fd9a462ff6a in piglit_gl_test_run (argc=2, argv=0x7ffd6815d008, config=0x7ffd6815cec0) at piglit/tests/util/piglit-framework-gl.c:199 #10 0x00405b50 in main (argc=2, argv=0x7ffd6815d008) at piglit/tests/shaders/shader_runner.c:54 (gdb) l 1046find_empty_block(struct gl_shader_program *prog, 1047 struct gl_uniform_storage *uniform) 1048{ 1049 const unsigned entries = MAX2(1, uniform->array_elements); 1050 1051 foreach_list_typed(struct empty_uniform_block, block, link, 1052 >EmptyUniformLocations) { 1053 /* Found a block with enough slots to fit the uniform */ 1054 if (block->slots == entries) { 1055 unsigned start = block->start; (gdb) print block $1 = (empty_uniform_block *) 0x0 (gdb) print prog->EmptyUniformLocations $2 = {head = 0x0, tail = 0x0, tail_pred = 0x0} 65dfb3048e8291675ca33581aeff8921f7ea509d is the first bad commit commit 65dfb3048e8291675ca33581aeff8921f7ea509d Author: Plamena ManolovaDate: Thu Feb 11 15:00:02 2016 +0200 compiler/glsl: Fix uniform location counting. This patch moves the calculation of current uniforms to link_uniforms, which makes use of UniformRemapTable which stores all the reserved uniform locations. Location assignment for implicit uniforms now tries to use any gaps left in the table after the location assignment for explicit uniforms. This gives us more space to store more uniforms. Patch is based on earlier patch with following changes/additions: 1: Move the counting of explicit locations to check_explicit_uniform_locations and then pass the number to link_assign_uniform_locations. 2: Count the number of empty slots in UniformRemapTable and store them in a list_head. 3: Try to find an empty slot for implicit locations from the list, if that fails resize UniformRemapTable. Fixes following CTS tests: ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array Signed-off-by: Tapani Pälli Signed-off-by: Plamena Manolova Reviewed-by: Ilia Mirkin Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93696 :04 04 5848c556c369c2c798c1c1e036c70c740b56a97a 25915fac71a54954aafd0139a55045ba394969e6 Msrc bisect run success -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
Roland Scheideggerwrites: > Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: >> From the OpenGL 4.2 spec: >> >> "When a constructor is used to convert any integer or floating-point type to >> a >> bool, 0 and 0.0 are converted to false, and non-zero values are converted to >> true." >> >> Thus, even the smallest non-zero floating value should be translated to true. >> This behavior has been verified also with proprietary NVIDIA drivers. >> >> Currently, we implement this conversion as a cmp.nz operation with floats, >> subject to floating-point precision limitations, and as a result, relatively The bool constructor *is* subject to floating-point precision limitations AFAIK, just like anything else dealing with floating-point numbers. >> small non-zero floating point numbers return false instead of true. >> >> This patch fixes the problem by getting rid of the sign bit (to cover the >> case >> of -0.0) and testing the result against 0u using an integer comparison >> instead. >> --- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- >> 1 file changed, 12 insertions(+), 3 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> index db20c71..7d62d7e 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , >> nir_alu_instr *instr) >>bld.MOV(result, negate(op[0])); >>break; >> >> - case nir_op_f2b: >> - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); >> - break; >> + case nir_op_f2b: { >> + /* Because comparing to 0.0f is subject to precision limitations, do >> the >> + * comparison using integers (we need to get rid of the sign bit for >> that) >> + */ >> + if (devinfo->gen >= 8) >> + op[0] = resolve_source_modifiers(op[0]); >> + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); >> + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); >> + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); >> + break; >> + } >> + >> case nir_op_i2b: >>bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); >>break; >> > > Does that fix anything? I don't really see a problem with the existing > logic. Yes any "non-zero value" should be converted to true. But surely > that definition cannot include denorms, which you are always allowed to > flush to zero. Yeah, one is allowed to flush denorms to zero on input to any operation, including bool(), I don't see any reason in the above why bool() shouldn't be equivalent to "!= 0". > (Albeit I can't tell what the result would be with NaNs with the float > compare, nor what the result actually should be in this case since glsl > doesn't require NaNs neither.) > The hardware CMP instruction considers NaNs to be different from zero, and AFAICT the implementation in this patch does the same. > Roland > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
On 02/25/2016 08:46 AM, Roland Scheidegger wrote: > Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: >> From the OpenGL 4.2 spec: >> >> "When a constructor is used to convert any integer or floating-point type to >> a >> bool, 0 and 0.0 are converted to false, and non-zero values are converted to >> true." >> >> Thus, even the smallest non-zero floating value should be translated to true. >> This behavior has been verified also with proprietary NVIDIA drivers. >> >> Currently, we implement this conversion as a cmp.nz operation with floats, >> subject to floating-point precision limitations, and as a result, relatively >> small non-zero floating point numbers return false instead of true. >> >> This patch fixes the problem by getting rid of the sign bit (to cover the >> case >> of -0.0) and testing the result against 0u using an integer comparison >> instead. >> --- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- >> 1 file changed, 12 insertions(+), 3 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> index db20c71..7d62d7e 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , >> nir_alu_instr *instr) >>bld.MOV(result, negate(op[0])); >>break; >> >> - case nir_op_f2b: >> - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); >> - break; >> + case nir_op_f2b: { >> + /* Because comparing to 0.0f is subject to precision limitations, do >> the >> + * comparison using integers (we need to get rid of the sign bit for >> that) >> + */ >> + if (devinfo->gen >= 8) >> + op[0] = resolve_source_modifiers(op[0]); >> + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); >> + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); >> + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); >> + break; >> + } >> + >> case nir_op_i2b: >>bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); >>break; >> > > Does that fix anything? I don't really see a problem with the existing > logic. Yes any "non-zero value" should be converted to true. But surely > that definition cannot include denorms, which you are always allowed to > flush to zero. > (Albeit I can't tell what the result would be with NaNs with the float > compare, nor what the result actually should be in this case since glsl > doesn't require NaNs neither.) Based on this and Jason's comments, I think we need a bunch of new tests. - smallest positive normal number - abs of smallest positive normal number - neg of " "" " - largest positive subnormal number - abs of largest positive subnormal number - neg of"""" - all of the above with negative numbers - NaN - abs of NaN - neg of " Perhaps others? +/-Inf just for kicks? > Roland > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] install-gallium-links: port changes from install-lib-links
From: Emil VelikovNamely: b662d5282f7 mesa: Add clean-local rule to remove .lib links. 5c1aac17adf install-lib-links: don't depend on .libs directory fece147be53 install-lib-links: remove the .install-lib-links file With these in place, make distcheck now passes and a race condition has been avoided. Cc: "11.1 11.2" Signed-off-by: Emil Velikov --- install-gallium-links.mk | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/install-gallium-links.mk b/install-gallium-links.mk index f45f1b4..4010cad 100644 --- a/install-gallium-links.mk +++ b/install-gallium-links.mk @@ -3,9 +3,9 @@ if BUILD_SHARED if HAVE_COMPAT_SYMLINKS -all-local : .libs/install-gallium-links +all-local : .install-gallium-links -.libs/install-gallium-links : $(dri_LTLIBRARIES) $(egl_LTLIBRARIES) $(lib_LTLIBRARIES) +.install-gallium-links : $(dri_LTLIBRARIES) $(egl_LTLIBRARIES) $(lib_LTLIBRARIES) $(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \ link_dir=$(top_builddir)/$(LIB_DIR)/gallium;\ if test x$(egl_LTLIBRARIES) != x; then \ @@ -23,4 +23,15 @@ all-local : .libs/install-gallium-links fi; \ done && touch $@ endif + +clean-local: + for f in $(notdir $(dri_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)) \ +$(notdir $(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)) \ +$(notdir $(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)); do \ + echo $$f; \ + $(RM) $(top_builddir)/$(LIB_DIR)/gallium/$$f; \ + done; + rmdir $(top_builddir)/$(LIB_DIR)/gallium || true + $(RM) .install-gallium-links + endif -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] automake: add more missing options for make distcheck
From: Emil VelikovNamely - opencl, osmesa (only the gallium flavour as it conflicts with the classic one), surfaceless egl platform and a couple gallium drivers (virgl and vc4). Cc: "11.1 11.2" Signed-off-by: Emil Velikov --- Makefile.am | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/Makefile.am b/Makefile.am index 5df8bc3..2c06e3a 100644 --- a/Makefile.am +++ b/Makefile.am @@ -24,19 +24,21 @@ SUBDIRS = src AM_DISTCHECK_CONFIGURE_FLAGS = \ --enable-dri3 \ --enable-gallium-tests \ + --enable-gallium-osmesa \ --enable-gbm \ --enable-gles1 \ --enable-gles2 \ --enable-glx-tls \ --enable-nine \ + --enable-opencl \ --enable-va \ --enable-vdpau \ --enable-xa \ --enable-xvmc \ --disable-llvm-shared-libs \ - --with-egl-platforms=x11,wayland,drm \ + --with-egl-platforms=x11,wayland,drm,surfaceless \ --with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \ - --with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast + --with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl ACLOCAL_AMFLAGS = -I m4 -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] automake: explicitly set distcheck configure flags
From: Emil VelikovPretty much all of these are enabled by default. Considering the recent updates (see previous commits) one might as well list most/all of these here. Signed-off-by: Emil Velikov --- Makefile.am | 5 + 1 file changed, 5 insertions(+) diff --git a/Makefile.am b/Makefile.am index 2c06e3a..f9bad14 100644 --- a/Makefile.am +++ b/Makefile.am @@ -22,15 +22,20 @@ SUBDIRS = src AM_DISTCHECK_CONFIGURE_FLAGS = \ + --enable-dri \ --enable-dri3 \ + --enable-egl \ --enable-gallium-tests \ --enable-gallium-osmesa \ + --enable-gallium-llvm \ --enable-gbm \ --enable-gles1 \ --enable-gles2 \ + --enable-glx \ --enable-glx-tls \ --enable-nine \ --enable-opencl \ + --enable-opengl \ --enable-va \ --enable-vdpau \ --enable-xa \ -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/10] i965: Always do NIR IO lowering at specialization time.
First 3 are Reviewed-by: Jason EkstrandIt'll take real work to review the others. On Thu, Feb 25, 2016 at 11:01 AM, Kenneth Graunke wrote: > We've now hit literally every case other than geometry shaders (and > compute shaders, but those are a no-op). So, let's just move geometry > shaders over too and be done with it. > > The only advantage to doing this at link time was to save the expense > of running the pass on recompiles. But we're already running a lot of > passes, and the extra code complexity isn't worth it. > > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_nir.c | 8 > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 1 + > 2 files changed, 1 insertion(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c > b/src/mesa/drivers/dri/i965/brw_nir.c > index 61acf38..efa4c48 100644 > --- a/src/mesa/drivers/dri/i965/brw_nir.c > +++ b/src/mesa/drivers/dri/i965/brw_nir.c > @@ -598,7 +598,6 @@ brw_create_nir(struct brw_context *brw, > bool is_scalar) > { > struct gl_context *ctx = >ctx; > - const struct brw_device_info *devinfo = brw->intelScreen->devinfo; > const nir_shader_compiler_options *options = >ctx->Const.ShaderCompilerOptions[stage].NirOptions; > bool progress; > @@ -625,13 +624,6 @@ brw_create_nir(struct brw_context *brw, >OPT_V(nir_lower_atomics, shader_prog); > } > > - if (nir->stage != MESA_SHADER_VERTEX && > - nir->stage != MESA_SHADER_TESS_CTRL && > - nir->stage != MESA_SHADER_TESS_EVAL && > - nir->stage != MESA_SHADER_FRAGMENT) { > - nir = brw_nir_lower_io(nir, devinfo, is_scalar, false, NULL); > - } > - > return nir; > } > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp > index 183fe35..40966c6 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp > @@ -598,6 +598,7 @@ brw_compile_gs(const struct brw_compiler *compiler, > void *log_data, > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); > shader = brw_nir_apply_sampler_key(shader, compiler->devinfo, > >tex, >is_scalar); > + shader = brw_nir_lower_io(shader, compiler->devinfo, is_scalar, false, > NULL); > shader = brw_postprocess_nir(shader, compiler->devinfo, is_scalar); > > prog_data->include_primitive_id = > -- > 2.7.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] main: rework the compatibility check of visuals in glXMakeCurrent
On 02/25/2016 07:48 AM, Brian Paul wrote: > On 02/25/2016 08:26 AM, Miklós Máté wrote: >> On 02/25/2016 02:37 AM, Brian Paul wrote: >>> On 02/24/2016 04:35 PM, Miklós Máté wrote: Now it follows the GLX 1.4 specification. >>> >>> Can you elaborate on that a bit? >> Section 2.1 of the GLX spec lists a few criteria for a context and a >> drawable to be compatible. >> >>> >>> This fixes post-processing in SW:KotOR. Signed-off-by: Miklós Máté--- src/mesa/main/context.c | 42 -- 1 file changed, 12 insertions(+), 30 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 26eee28..6c16229 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1525,10 +1525,6 @@ _mesa_copy_context( const struct gl_context *src, struct gl_context *dst, * Check if the given context can render into the given framebuffer * by checking visual attributes. * - * Most of these tests could go away because Mesa is now pretty flexible - * in terms of mixing rendering contexts with framebuffers. As long - * as RGB vs. CI mode agree, we're probably good. - * * \return GL_TRUE if compatible, GL_FALSE otherwise. */ static GLboolean @@ -1541,32 +1537,18 @@ check_compatible(const struct gl_context *ctx, if (buffer == _mesa_get_incomplete_framebuffer()) return GL_TRUE; -#if 0 - /* disabling this fixes the fgl_glxgears pbuffer demo */ - if (ctxvis->doubleBufferMode && !bufvis->doubleBufferMode) - return GL_FALSE; -#endif - if (ctxvis->stereoMode && !bufvis->stereoMode) - return GL_FALSE; - if (ctxvis->haveAccumBuffer && !bufvis->haveAccumBuffer) - return GL_FALSE; - if (ctxvis->haveDepthBuffer && !bufvis->haveDepthBuffer) - return GL_FALSE; - if (ctxvis->haveStencilBuffer && !bufvis->haveStencilBuffer) - return GL_FALSE; - if (ctxvis->redMask && ctxvis->redMask != bufvis->redMask) - return GL_FALSE; - if (ctxvis->greenMask && ctxvis->greenMask != bufvis->greenMask) - return GL_FALSE; - if (ctxvis->blueMask && ctxvis->blueMask != bufvis->blueMask) - return GL_FALSE; -#if 0 - /* disabled (see bug 11161) */ - if (ctxvis->depthBits && ctxvis->depthBits != bufvis->depthBits) - return GL_FALSE; -#endif - if (ctxvis->stencilBits && ctxvis->stencilBits != bufvis->stencilBits) - return GL_FALSE; +#define check_component(foo) \ + if (ctxvis->foo && bufvis->foo && \ + ctxvis->foo != bufvis->foo) \ + return GL_FALSE + + check_component(redMask); + check_component(greenMask); + check_component(blueMask); + check_component(depthBits); + check_component(stencilBits); + +#undef check_component >>> >>> IIRC, Ian had some comments on this so he should re-review. But since >>> Mesa doesn't actually used the red/green/blueMask fields (AFAIK), I'm >>> not sure what those checks are good for. >> Yes, my original intention was to remove this function entirely, but Ian >> convinced me that GLX mandates at least these checks. > > I wonder if those checks could be moved into the GLX code. Maybe? That would mean that the checks would need to be replicated in the GLX code and EGL code. Let me poke around in that code a little, and let me double check both the specs. > For Windows, the wglMakeCurrent docs say "The hdc parameter must refer > to a drawing surface supported by OpenGL. It need not be the same hdc > that was passed to wglCreateContext when hglrc was created, but it must > be on the same device and have the same pixel format." We check for > that in our stw_make_current() in the WGL code. > > -Brian > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/10] i965: Split brw_nir_lower_inputs/outputs into per-stage functions.
These functions are both giant switch statements where most cases don't overlap at all. Let's put the bulk of the work in per-stage helpers. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_nir.c | 304 +--- 1 file changed, 174 insertions(+), 130 deletions(-) This is easier to view with git diff -b; most of the churn is unindenting the code a level. You can grab this from the 'vueclean' branch of my tree. diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index f21e676..ed836bf 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -203,108 +203,84 @@ remap_patch_urb_offsets(nir_block *block, void *closure) } static void -brw_nir_lower_inputs(nir_shader *nir, - const struct brw_device_info *devinfo, - bool is_scalar, - bool use_legacy_snorm_formula, - const uint8_t *vs_attrib_wa_flags) +brw_nir_lower_vs_inputs(nir_shader *nir, +const struct brw_device_info *devinfo, +bool is_scalar, +bool use_legacy_snorm_formula, +const uint8_t *vs_attrib_wa_flags) { - switch (nir->stage) { - case MESA_SHADER_VERTEX: - /* Start with the location of the variable's base. */ - foreach_list_typed(nir_variable, var, node, >inputs) { - var->data.driver_location = var->data.location; - } + /* Start with the location of the variable's base. */ + foreach_list_typed(nir_variable, var, node, >inputs) { + var->data.driver_location = var->data.location; + } - /* Now use nir_lower_io to walk dereference chains. Attribute arrays - * are loaded as one vec4 per element (or matrix column), so we use - * type_size_vec4 here. - */ - nir_lower_io(nir, nir_var_shader_in, type_size_vec4); + /* Now use nir_lower_io to walk dereference chains. Attribute arrays +* are loaded as one vec4 per element (or matrix column), so we use +* type_size_vec4 here. +*/ + nir_lower_io(nir, nir_var_shader_in, type_size_vec4); - /* This pass needs actual constants */ - nir_opt_constant_folding(nir); + /* This pass needs actual constants */ + nir_opt_constant_folding(nir); - add_const_offset_to_base(nir, nir_var_shader_in); + add_const_offset_to_base(nir, nir_var_shader_in); - brw_nir_apply_attribute_workarounds(nir, use_legacy_snorm_formula, - vs_attrib_wa_flags); + brw_nir_apply_attribute_workarounds(nir, use_legacy_snorm_formula, + vs_attrib_wa_flags); - if (is_scalar) { - /* Finally, translate VERT_ATTRIB_* values into the actual registers. - * - * Note that we can use nir->info.inputs_read instead of - * key->inputs_read since the two are identical aside from Gen4-5 - * edge flag differences. - */ - GLbitfield64 inputs_read = nir->info.inputs_read; + if (is_scalar) { + /* Finally, translate VERT_ATTRIB_* values into the actual registers. + * + * Note that we can use nir->info.inputs_read instead of + * key->inputs_read since the two are identical aside from Gen4-5 + * edge flag differences. + */ + GLbitfield64 inputs_read = nir->info.inputs_read; - nir_foreach_function(nir, function) { -if (function->impl) { - nir_foreach_block(function->impl, remap_vs_attrs, _read); -} + nir_foreach_function(nir, function) { + if (function->impl) { +nir_foreach_block(function->impl, remap_vs_attrs, _read); } } - break; - case MESA_SHADER_TESS_CTRL: - case MESA_SHADER_GEOMETRY: { - if (!is_scalar && nir->stage == MESA_SHADER_GEOMETRY) { - foreach_list_typed(nir_variable, var, node, >inputs) { -var->data.driver_location = var->data.location; - } - nir_lower_io(nir, nir_var_shader_in, type_size_vec4); - } else { - /* The GLSL linker will have already matched up GS inputs and - * the outputs of prior stages. The driver does extend VS outputs - * in some cases, but only for legacy OpenGL or Gen4-5 hardware, - * neither of which offer geometry shader support. So we can - * safely ignore that. - * - * For SSO pipelines, we use a fixed VUE map layout based on variable - * locations, so we can rely on rendezvous-by-location to make this - * work. - * - * However, we need to ignore VARYING_SLOT_PRIMITIVE_ID, as it's not - * written by previous stages and shows up via payload magic. - */ - struct brw_vue_map input_vue_map; - GLbitfield64 inputs_read = -
[Mesa-dev] [PATCH 02/10] i965: Make an is_scalar boolean in brw_compile_gs().
Shorter than compiler->scalar_stage[MESA_SHADER_GEOMETRY], which can help with line-wrapping. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 3f30f5b..183fe35 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -594,11 +594,11 @@ brw_compile_gs(const struct brw_compiler *compiler, void *log_data, memset(, 0, sizeof(c)); c.key = *key; + const bool is_scalar = compiler->scalar_stage[MESA_SHADER_GEOMETRY]; nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); shader = brw_nir_apply_sampler_key(shader, compiler->devinfo, >tex, - compiler->scalar_stage[MESA_SHADER_GEOMETRY]); - shader = brw_postprocess_nir(shader, compiler->devinfo, -compiler->scalar_stage[MESA_SHADER_GEOMETRY]); + is_scalar); + shader = brw_postprocess_nir(shader, compiler->devinfo, is_scalar); prog_data->include_primitive_id = (shader->info.inputs_read & VARYING_BIT_PRIMITIVE_ID) != 0; @@ -807,7 +807,7 @@ brw_compile_gs(const struct brw_compiler *compiler, void *log_data, brw_print_vue_map(stderr, _data->base.vue_map); } - if (compiler->scalar_stage[MESA_SHADER_GEOMETRY]) { + if (is_scalar) { /* TODO: Support instanced GS. We have basically no tests... */ assert(prog_data->invocations == 1); -- 2.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/10] i965: Always do NIR IO lowering at specialization time.
We've now hit literally every case other than geometry shaders (and compute shaders, but those are a no-op). So, let's just move geometry shaders over too and be done with it. The only advantage to doing this at link time was to save the expense of running the pass on recompiles. But we're already running a lot of passes, and the extra code complexity isn't worth it. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_nir.c | 8 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 1 + 2 files changed, 1 insertion(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 61acf38..efa4c48 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -598,7 +598,6 @@ brw_create_nir(struct brw_context *brw, bool is_scalar) { struct gl_context *ctx = >ctx; - const struct brw_device_info *devinfo = brw->intelScreen->devinfo; const nir_shader_compiler_options *options = ctx->Const.ShaderCompilerOptions[stage].NirOptions; bool progress; @@ -625,13 +624,6 @@ brw_create_nir(struct brw_context *brw, OPT_V(nir_lower_atomics, shader_prog); } - if (nir->stage != MESA_SHADER_VERTEX && - nir->stage != MESA_SHADER_TESS_CTRL && - nir->stage != MESA_SHADER_TESS_EVAL && - nir->stage != MESA_SHADER_FRAGMENT) { - nir = brw_nir_lower_io(nir, devinfo, is_scalar, false, NULL); - } - return nir; } diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 183fe35..40966c6 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -598,6 +598,7 @@ brw_compile_gs(const struct brw_compiler *compiler, void *log_data, nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); shader = brw_nir_apply_sampler_key(shader, compiler->devinfo, >tex, is_scalar); + shader = brw_nir_lower_io(shader, compiler->devinfo, is_scalar, false, NULL); shader = brw_postprocess_nir(shader, compiler->devinfo, is_scalar); prog_data->include_primitive_id = -- 2.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/10] i965: Simplify brw_nir_lower_vue_inputs() slightly.
The same code appeared in both branches; pull it above the if statement. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_nir.c | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 883603e..a5949d5 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -249,19 +249,14 @@ void brw_nir_lower_vue_inputs(nir_shader *nir, bool is_scalar, const struct brw_vue_map *vue_map) { - if (!is_scalar && nir->stage == MESA_SHADER_GEOMETRY) { - foreach_list_typed(nir_variable, var, node, >inputs) { - var->data.driver_location = var->data.location; - } - nir_lower_io(nir, nir_var_shader_in, type_size_vec4); - } else { - foreach_list_typed(nir_variable, var, node, >inputs) { - var->data.driver_location = var->data.location; - } + foreach_list_typed(nir_variable, var, node, >inputs) { + var->data.driver_location = var->data.location; + } - /* Inputs are stored in vec4 slots, so use type_size_vec4(). */ - nir_lower_io(nir, nir_var_shader_in, type_size_vec4); + /* Inputs are stored in vec4 slots, so use type_size_vec4(). */ + nir_lower_io(nir, nir_var_shader_in, type_size_vec4); + if (is_scalar || nir->stage != MESA_SHADER_GEOMETRY) { /* This pass needs actual constants */ nir_opt_constant_folding(nir); -- 2.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/10] i965: Avoid recalculating the normal VUE map for IO lowering.
The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_nir.c | 27 ++--- src/mesa/drivers/dri/i965/brw_nir.h | 5 ++- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 37 --- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp| 12 4 files changed, 30 insertions(+), 51 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 90c4f66..883603e 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -246,9 +246,8 @@ brw_nir_lower_vs_inputs(nir_shader *nir, } void -brw_nir_lower_vue_inputs(nir_shader *nir, - const struct brw_device_info *devinfo, - bool is_scalar) +brw_nir_lower_vue_inputs(nir_shader *nir, bool is_scalar, + const struct brw_vue_map *vue_map) { if (!is_scalar && nir->stage == MESA_SHADER_GEOMETRY) { foreach_list_typed(nir_variable, var, node, >inputs) { @@ -256,26 +255,6 @@ brw_nir_lower_vue_inputs(nir_shader *nir, } nir_lower_io(nir, nir_var_shader_in, type_size_vec4); } else { - /* The GLSL linker will have already matched up GS inputs and - * the outputs of prior stages. The driver does extend VS outputs - * in some cases, but only for legacy OpenGL or Gen4-5 hardware, - * neither of which offer geometry shader support. So we can - * safely ignore that. - * - * For SSO pipelines, we use a fixed VUE map layout based on variable - * locations, so we can rely on rendezvous-by-location to make this - * work. - * - * However, we need to ignore VARYING_SLOT_PRIMITIVE_ID, as it's not - * written by previous stages and shows up via payload magic. - */ - struct brw_vue_map input_vue_map; - GLbitfield64 inputs_read = - nir->info.inputs_read & ~VARYING_BIT_PRIMITIVE_ID; - brw_compute_vue_map(devinfo, _vue_map, inputs_read, - nir->info.separate_shader || - nir->stage == MESA_SHADER_TESS_CTRL); - foreach_list_typed(nir_variable, var, node, >inputs) { var->data.driver_location = var->data.location; } @@ -291,7 +270,7 @@ brw_nir_lower_vue_inputs(nir_shader *nir, nir_foreach_function(nir, function) { if (function->impl) { nir_foreach_block(function->impl, remap_inputs_with_vue_map, - _vue_map); + (void *) vue_map); } } } diff --git a/src/mesa/drivers/dri/i965/brw_nir.h b/src/mesa/drivers/dri/i965/brw_nir.h index 0fbdc5f..2d8341f 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.h +++ b/src/mesa/drivers/dri/i965/brw_nir.h @@ -88,9 +88,8 @@ void brw_nir_lower_vs_inputs(nir_shader *nir, bool is_scalar, bool use_legacy_snorm_formula, const uint8_t *vs_attrib_wa_flags); -void brw_nir_lower_vue_inputs(nir_shader *nir, - const struct brw_device_info *devinfo, - bool is_scalar); +void brw_nir_lower_vue_inputs(nir_shader *nir, bool is_scalar, + const struct brw_vue_map *vue_map); void brw_nir_lower_tes_inputs(nir_shader *nir, const struct brw_vue_map *vue); void brw_nir_lower_fs_inputs(nir_shader *nir); void brw_nir_lower_vue_outputs(nir_shader *nir, bool is_scalar); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 7f59db4..7df6c72 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -596,9 +596,27 @@ brw_compile_gs(const struct brw_compiler *compiler, void *log_data, const bool is_scalar = compiler->scalar_stage[MESA_SHADER_GEOMETRY]; nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); + + /* The GLSL linker will have already matched up GS inputs and the outputs +* of prior stages. The driver does extend VS outputs in some cases, but +* only for legacy OpenGL or Gen4-5 hardware, neither of which offer +* geometry shader support. So we can safely ignore that. +* +* For SSO pipelines, we use a fixed VUE map layout based on variable +* locations, so we can rely on rendezvous-by-location making this work. +* +* However, we need to ignore VARYING_SLOT_PRIMITIVE_ID, as it's not +* written by previous stages and shows up via payload magic. +*/ + GLbitfield64 inputs_read = + shader->info.inputs_read & ~VARYING_BIT_PRIMITIVE_ID; + brw_compute_vue_map(compiler->devinfo, + _vue_map, inputs_read, +
[Mesa-dev] [PATCH 01/10] i965/nir: Do lower_io late for fragment shaders
From: Jason EkstrandThe Vulkan driver wants to be able to delete fragment outputs that are beyond key.nr_color_regions; this is a lot easier if we lower outputs at specialization time rather than link time. (Rationale added to commit message by Ken) --- src/mesa/drivers/dri/i965/brw_fs.cpp | 1 + src/mesa/drivers/dri/i965/brw_nir.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index b506040..6c9ba36 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -5594,6 +5594,7 @@ brw_compile_fs(const struct brw_compiler *compiler, void *log_data, nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); shader = brw_nir_apply_sampler_key(shader, compiler->devinfo, >tex, true); + shader = brw_nir_lower_io(shader, compiler->devinfo, true, false, NULL); shader = brw_postprocess_nir(shader, compiler->devinfo, true); /* key->alpha_test_func means simulating alpha testing via discards, diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 41059b3..61acf38 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -627,7 +627,8 @@ brw_create_nir(struct brw_context *brw, if (nir->stage != MESA_SHADER_VERTEX && nir->stage != MESA_SHADER_TESS_CTRL && - nir->stage != MESA_SHADER_TESS_EVAL) { + nir->stage != MESA_SHADER_TESS_EVAL && + nir->stage != MESA_SHADER_FRAGMENT) { nir = brw_nir_lower_io(nir, devinfo, is_scalar, false, NULL); } -- 2.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 08/10] i965: Avoid recalculating the tessellation VUE map for IO lowering.
The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_nir.c| 19 --- src/mesa/drivers/dri/i965/brw_nir.h| 4 ++-- src/mesa/drivers/dri/i965/brw_shader.cpp | 15 --- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 13 +++-- 4 files changed, 25 insertions(+), 26 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 2bd6c4e..90c4f66 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -149,7 +149,7 @@ remap_inputs_with_vue_map(nir_block *block, void *closure) struct remap_patch_urb_offsets_state { nir_builder b; - struct brw_vue_map vue_map; + const struct brw_vue_map *vue_map; }; static bool @@ -167,7 +167,7 @@ remap_patch_urb_offsets(nir_block *block, void *closure) if ((stage == MESA_SHADER_TESS_CTRL && is_output(intrin)) || (stage == MESA_SHADER_TESS_EVAL && is_input(intrin))) { - int vue_slot = state->vue_map.varying_to_slot[intrin->const_index[0]]; + int vue_slot = state->vue_map->varying_to_slot[intrin->const_index[0]]; assert(vue_slot != -1); intrin->const_index[0] = vue_slot; @@ -176,7 +176,7 @@ remap_patch_urb_offsets(nir_block *block, void *closure) nir_const_value *const_vertex = nir_src_as_const_value(*vertex); if (const_vertex) { intrin->const_index[0] += const_vertex->u[0] * - state->vue_map.num_per_vertex_slots; + state->vue_map->num_per_vertex_slots; } else { state->b.cursor = nir_before_instr(>instr); @@ -185,7 +185,7 @@ remap_patch_urb_offsets(nir_block *block, void *closure) nir_imul(>b, nir_ssa_for_src(>b, *vertex, 1), nir_imm_int(>b, - state->vue_map.num_per_vertex_slots)); + state->vue_map->num_per_vertex_slots)); /* Add it to the existing offset */ nir_src *offset = nir_get_io_offset_src(intrin); @@ -298,12 +298,10 @@ brw_nir_lower_vue_inputs(nir_shader *nir, } void -brw_nir_lower_tes_inputs(nir_shader *nir) +brw_nir_lower_tes_inputs(nir_shader *nir, const struct brw_vue_map *vue_map) { struct remap_patch_urb_offsets_state state; - brw_compute_tess_vue_map(_map, -nir->info.inputs_read & ~VARYING_BIT_PRIMITIVE_ID, -nir->info.patch_inputs_read); + state.vue_map = vue_map; foreach_list_typed(nir_variable, var, node, >inputs) { var->data.driver_location = var->data.location; @@ -347,11 +345,10 @@ brw_nir_lower_vue_outputs(nir_shader *nir, } void -brw_nir_lower_tcs_outputs(nir_shader *nir) +brw_nir_lower_tcs_outputs(nir_shader *nir, const struct brw_vue_map *vue_map) { struct remap_patch_urb_offsets_state state; - brw_compute_tess_vue_map(_map, nir->info.outputs_written, -nir->info.patch_outputs_written); + state.vue_map = vue_map; nir_foreach_variable(var, >outputs) { var->data.driver_location = var->data.location; diff --git a/src/mesa/drivers/dri/i965/brw_nir.h b/src/mesa/drivers/dri/i965/brw_nir.h index 0140f3a..0fbdc5f 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.h +++ b/src/mesa/drivers/dri/i965/brw_nir.h @@ -91,10 +91,10 @@ void brw_nir_lower_vs_inputs(nir_shader *nir, void brw_nir_lower_vue_inputs(nir_shader *nir, const struct brw_device_info *devinfo, bool is_scalar); -void brw_nir_lower_tes_inputs(nir_shader *nir); +void brw_nir_lower_tes_inputs(nir_shader *nir, const struct brw_vue_map *vue); void brw_nir_lower_fs_inputs(nir_shader *nir); void brw_nir_lower_vue_outputs(nir_shader *nir, bool is_scalar); -void brw_nir_lower_tcs_outputs(nir_shader *nir); +void brw_nir_lower_tcs_outputs(nir_shader *nir, const struct brw_vue_map *vue); void brw_nir_lower_fs_outputs(nir_shader *nir); nir_shader *brw_postprocess_nir(nir_shader *nir, diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 857a079..dfe6afc 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -1227,10 +1227,16 @@ brw_compile_tes(const struct brw_compiler *compiler, const bool is_scalar = compiler->scalar_stage[MESA_SHADER_TESS_EVAL]; nir_shader *nir = nir_shader_clone(mem_ctx, src_shader); - nir = brw_nir_apply_sampler_key(nir, devinfo, >tex, is_scalar); nir->info.inputs_read = key->inputs_read; nir->info.patch_inputs_read = key->patch_inputs_read; - brw_nir_lower_tes_inputs(nir); + + struct brw_vue_map
[Mesa-dev] [PATCH 04/10] i965: Move optimizations from brw_nir_lower_io to brw_postprocess_nir.
This simplifies things. Every caller of brw_nir_lower_io() immediately calls brw_postprocess_nir(). The only real change this will have is that we get an extra brw_nir_optimize() call when compiling compute shaders, but that seems fine. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_nir.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index efa4c48..6996630 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -518,7 +518,7 @@ brw_nir_lower_io(nir_shader *nir, OPT_V(brw_nir_lower_outputs, devinfo, is_scalar); OPT_V(nir_lower_io, nir_var_all, is_scalar ? type_size_scalar : type_size_vec4); - return nir_optimize(nir, is_scalar); + return nir; } /* Prepare the given shader for codegen @@ -539,6 +539,8 @@ brw_postprocess_nir(nir_shader *nir, bool progress; /* Written by OPT and OPT_V */ (void)progress; + nir = nir_optimize(nir, is_scalar); + if (devinfo->gen >= 6) { /* Try and fuse multiply-adds */ OPT(brw_nir_opt_peephole_ffma); -- 2.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/10] i965: Eliminate brw_nir_lower_{inputs, outputs, io} functions.
Now that each stage is directly calling brw_nir_lower_io(), and we have per-stage helper functions, it makes sense to just call the relevant one directly, rather than going through multiple switch statements. This also eliminates stupid function parameters, such as the two that only apply to vertex attributes. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +- src/mesa/drivers/dri/i965/brw_nir.c | 88 ++- src/mesa/drivers/dri/i965/brw_nir.h | 20 -- src/mesa/drivers/dri/i965/brw_shader.cpp | 3 +- src/mesa/drivers/dri/i965/brw_vec4.cpp| 6 +- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 3 +- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp| 3 +- 7 files changed, 33 insertions(+), 93 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 6c9ba36..261dff6 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -5594,7 +5594,8 @@ brw_compile_fs(const struct brw_compiler *compiler, void *log_data, nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); shader = brw_nir_apply_sampler_key(shader, compiler->devinfo, >tex, true); - shader = brw_nir_lower_io(shader, compiler->devinfo, true, false, NULL); + brw_nir_lower_fs_inputs(shader); + brw_nir_lower_fs_outputs(shader); shader = brw_postprocess_nir(shader, compiler->devinfo, true); /* key->alpha_test_func means simulating alpha testing via discards, diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index ed836bf..2bd6c4e 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -202,7 +202,7 @@ remap_patch_urb_offsets(nir_block *block, void *closure) return true; } -static void +void brw_nir_lower_vs_inputs(nir_shader *nir, const struct brw_device_info *devinfo, bool is_scalar, @@ -245,7 +245,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, } } -static void +void brw_nir_lower_vue_inputs(nir_shader *nir, const struct brw_device_info *devinfo, bool is_scalar) @@ -297,7 +297,7 @@ brw_nir_lower_vue_inputs(nir_shader *nir, } } -static void +void brw_nir_lower_tes_inputs(nir_shader *nir) { struct remap_patch_urb_offsets_state state; @@ -324,46 +324,14 @@ brw_nir_lower_tes_inputs(nir_shader *nir) } } -static void +void brw_nir_lower_fs_inputs(nir_shader *nir) { nir_assign_var_locations(>inputs, >num_inputs, type_size_scalar); nir_lower_io(nir, nir_var_shader_in, type_size_scalar); } -static void -brw_nir_lower_inputs(nir_shader *nir, - const struct brw_device_info *devinfo, - bool is_scalar, - bool use_legacy_snorm_formula, - const uint8_t *vs_attrib_wa_flags) -{ - switch (nir->stage) { - case MESA_SHADER_VERTEX: - brw_nir_lower_vs_inputs(nir, devinfo, is_scalar, use_legacy_snorm_formula, - vs_attrib_wa_flags); - break; - case MESA_SHADER_TESS_CTRL: - case MESA_SHADER_GEOMETRY: - brw_nir_lower_vue_inputs(nir, devinfo, is_scalar); - break; - case MESA_SHADER_TESS_EVAL: - brw_nir_lower_tes_inputs(nir); - break; - case MESA_SHADER_FRAGMENT: - assert(is_scalar); - brw_nir_lower_fs_inputs(nir); - break; - case MESA_SHADER_COMPUTE: - /* Compute shaders have no inputs. */ - assert(exec_list_is_empty(>inputs)); - break; - default: - unreachable("unsupported shader stage"); - } -} - -static void +void brw_nir_lower_vue_outputs(nir_shader *nir, bool is_scalar) { @@ -378,7 +346,7 @@ brw_nir_lower_vue_outputs(nir_shader *nir, } } -static void +void brw_nir_lower_tcs_outputs(nir_shader *nir) { struct remap_patch_urb_offsets_state state; @@ -404,7 +372,7 @@ brw_nir_lower_tcs_outputs(nir_shader *nir) } } -static void +void brw_nir_lower_fs_outputs(nir_shader *nir) { nir_assign_var_locations(>outputs, >num_outputs, @@ -412,30 +380,6 @@ brw_nir_lower_fs_outputs(nir_shader *nir) nir_lower_io(nir, nir_var_shader_out, type_size_scalar); } -static void -brw_nir_lower_outputs(nir_shader *nir, bool is_scalar) -{ - switch (nir->stage) { - case MESA_SHADER_VERTEX: - case MESA_SHADER_TESS_EVAL: - case MESA_SHADER_GEOMETRY: - brw_nir_lower_vue_outputs(nir, is_scalar); - break; - case MESA_SHADER_TESS_CTRL: - brw_nir_lower_tcs_outputs(nir); - break; - case MESA_SHADER_FRAGMENT: - brw_nir_lower_fs_outputs(nir); - break; - case MESA_SHADER_COMPUTE: - /* Compute shaders have no outputs. */ - assert(exec_list_is_empty(>outputs)); - break; - default: -
[Mesa-dev] [PATCH 05/10] i965: Remove catch-all nir_lower_io call with specific cases.
Most cases already call nir_lower_io explicitly for input and output lowering. This catch all isn't very useful anymore - we can just add it to the remaining cases. Signed-off-by: Kenneth Graunke--- src/mesa/drivers/dri/i965/brw_nir.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 6996630..f21e676 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -252,6 +252,7 @@ brw_nir_lower_inputs(nir_shader *nir, foreach_list_typed(nir_variable, var, node, >inputs) { var->data.driver_location = var->data.location; } + nir_lower_io(nir, nir_var_shader_in, type_size_vec4); } else { /* The GLSL linker will have already matched up GS inputs and * the outputs of prior stages. The driver does extend VS outputs @@ -323,6 +324,7 @@ brw_nir_lower_inputs(nir_shader *nir, assert(is_scalar); nir_assign_var_locations(>inputs, >num_inputs, type_size_scalar); + nir_lower_io(nir, nir_var_shader_in, type_size_scalar); break; case MESA_SHADER_COMPUTE: /* Compute shaders have no inputs. */ @@ -349,6 +351,7 @@ brw_nir_lower_outputs(nir_shader *nir, } else { nir_foreach_variable(var, >outputs) var->data.driver_location = var->data.location; + nir_lower_io(nir, nir_var_shader_out, type_size_vec4); } break; case MESA_SHADER_TESS_CTRL: { @@ -378,6 +381,7 @@ brw_nir_lower_outputs(nir_shader *nir, case MESA_SHADER_FRAGMENT: nir_assign_var_locations(>outputs, >num_outputs, type_size_scalar); + nir_lower_io(nir, nir_var_shader_out, type_size_scalar); break; case MESA_SHADER_COMPUTE: /* Compute shaders have no outputs. */ @@ -516,7 +520,6 @@ brw_nir_lower_io(nir_shader *nir, OPT_V(brw_nir_lower_inputs, devinfo, is_scalar, use_legacy_snorm_formula, vs_attrib_wa_flags); OPT_V(brw_nir_lower_outputs, devinfo, is_scalar); - OPT_V(nir_lower_io, nir_var_all, is_scalar ? type_size_scalar : type_size_vec4); return nir; } -- 2.7.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/7] mesa: optimize out the realloc from glCopyTexImagexD()
On 02/24/2016 03:35 PM, Miklós Máté wrote: > v2: comment about the purpose of the code > v3: also compare texFormat, > add a perf debug message, > formatting fixes > > Signed-off-by: Miklós Máté> --- > src/mesa/main/teximage.c | 35 +++ > 1 file changed, 35 insertions(+) > > diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c > index 8a4c628..a906de3 100644 > --- a/src/mesa/main/teximage.c > +++ b/src/mesa/main/teximage.c > @@ -3474,6 +3474,23 @@ formats_differ_in_component_sizes(mesa_format f1, > mesa_format f2) > return GL_FALSE; > } > > +static bool > +can_avoid_reallocation(struct gl_texture_image *texImage, GLenum > internalFormat, > + mesa_format texFormat, GLint x, GLint y, GLsizei width, GLsizei > height, GLint border) The second line should be indented to line up with the ( of the previous line. With that fixed, this patch is Reviewed-by: Ian Romanick Assuming there is no other review feedback, I can fix that whitespace issue for you when I commit the patch. > +{ > + if (texImage->InternalFormat != internalFormat) > + return false; > + if (texImage->TexFormat != texFormat) > + return false; > + if (texImage->Border != border) > + return false; > + if (texImage->Width2 != width) > + return false; > + if (texImage->Height2 != height) > + return false; > + return true; > +} > + > /** > * Implement the glCopyTexImage1/2D() functions. > */ > @@ -3517,6 +3534,24 @@ copyteximage(struct gl_context *ctx, GLuint dims, > texFormat = _mesa_choose_texture_format(ctx, texObj, target, level, > internalFormat, GL_NONE, GL_NONE); > > + /* First check if reallocating the texture buffer can be avoided. > +* Without the realloc the copy can be 20x faster. > +*/ > + _mesa_lock_texture(ctx, texObj); > + { > + texImage = _mesa_select_tex_image(texObj, target, level); > + if (texImage && can_avoid_reallocation(texImage, internalFormat, > texFormat, > + x, y, width, height, border)) { > + _mesa_unlock_texture(ctx, texObj); > + return _mesa_copy_texture_sub_image(ctx, dims, texObj, target, > level, > + 0, 0, 0, x, y, width, height, > + "CopyTexImage"); > + } > + } > + _mesa_unlock_texture(ctx, texObj); > + _mesa_perf_debug(ctx, MESA_DEBUG_SEVERITY_LOW, "glCopyTexImage " > +"can't avoid reallocating texture storage\n"); > + > rb = _mesa_get_read_renderbuffer_for_format(ctx, internalFormat); > > if (_mesa_is_gles3(ctx)) { > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] mesa: optionally associate a gl_program to ati_fragment_shader
On 02/24/2016 05:37 PM, Brian Paul wrote: > On 02/24/2016 04:35 PM, Miklós Máté wrote: >> the state tracker will use it >> >> Signed-off-by: Miklós Máté>> --- >> src/mesa/drivers/common/driverfuncs.c | 3 +++ >> src/mesa/main/atifragshader.c | 13 - >> src/mesa/main/dd.h| 7 ++- >> src/mesa/main/mtypes.h| 1 + >> src/mesa/main/state.c | 14 +- >> 5 files changed, 35 insertions(+), 3 deletions(-) >> >> diff --git a/src/mesa/drivers/common/driverfuncs.c >> b/src/mesa/drivers/common/driverfuncs.c >> index 752aaf6..65a0cf8 100644 >> --- a/src/mesa/drivers/common/driverfuncs.c >> +++ b/src/mesa/drivers/common/driverfuncs.c >> @@ -117,6 +117,9 @@ _mesa_init_driver_functions(struct >> dd_function_table *driver) >> driver->NewProgram = _mesa_new_program; >> driver->DeleteProgram = _mesa_delete_program; >> >> + /* ATI_fragment_shader */ >> + driver->NewATIfs = NULL; >> + >> /* simple state commands */ >> driver->AlphaFunc = NULL; >> driver->BlendColor = NULL; >> diff --git a/src/mesa/main/atifragshader.c >> b/src/mesa/main/atifragshader.c >> index 8fcbff6..34f45c6 100644 >> --- a/src/mesa/main/atifragshader.c >> +++ b/src/mesa/main/atifragshader.c >> @@ -30,6 +30,7 @@ >> #include "main/mtypes.h" >> #include "main/dispatch.h" >> #include "main/atifragshader.h" >> +#include "program/program.h" >> >> #define MESA_DEBUG_ATI_FS 0 >> >> @@ -63,6 +64,7 @@ _mesa_delete_ati_fragment_shader(struct gl_context >> *ctx, struct ati_fragment_sha >> free(s->Instructions[i]); >> free(s->SetupInst[i]); >> } >> + _mesa_reference_program(ctx, >Program, NULL); >> free(s); >> } >> >> @@ -321,6 +323,8 @@ _mesa_BeginFragmentShaderATI(void) >>free(ctx->ATIFragmentShader.Current->SetupInst[i]); >> } >> >> + _mesa_reference_program(ctx, >> >ATIFragmentShader.Current->Program, NULL); >> + >> /* malloc the instructions here - not sure if the best place but its >> a start */ >> for (i = 0; i < MAX_NUM_PASSES_ATI; i++) { >> @@ -405,7 +409,14 @@ _mesa_EndFragmentShaderATI(void) >> } >> #endif >> >> - if (!ctx->Driver.ProgramStringNotify(ctx, GL_FRAGMENT_SHADER_ATI, >> NULL)) { >> + if (ctx->Driver.NewATIfs) { >> + struct gl_program *prog = ctx->Driver.NewATIfs(ctx, >> + >> ctx->ATIFragmentShader.Current); >> + _mesa_reference_program(ctx, >> >ATIFragmentShader.Current->Program, prog); >> + } >> + >> + if (!ctx->Driver.ProgramStringNotify(ctx, GL_FRAGMENT_SHADER_ATI, >> +curProg->Program)) { >> ctx->ATIFragmentShader.Current->isValid = GL_FALSE; >> /* XXX is this the right error? */ >> _mesa_error(ctx, GL_INVALID_OPERATION, >> diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h >> index 3f5aa5d..8410a15 100644 >> --- a/src/mesa/main/dd.h >> +++ b/src/mesa/main/dd.h >> @@ -473,7 +473,12 @@ struct dd_function_table { >> struct gl_program * (*NewProgram)(struct gl_context *ctx, GLenum >> target, >>GLuint id); >> /** Delete a program */ >> - void (*DeleteProgram)(struct gl_context *ctx, struct gl_program >> *prog); >> + void (*DeleteProgram)(struct gl_context *ctx, struct gl_program >> *prog); >> + /** >> +* Allocate a program to associate with the new ATI fragment >> shader (optional) >> +*/ >> + struct gl_program * (*NewATIfs)(struct gl_context *ctx, >> + struct ati_fragment_shader *curProg); > > The second line of the function decl should be indented more. See other > nearby functions for examples. Also... what changed in the DeleteProgram line? I've been staring at it, but I can't see the sailboat. > Patch looks OK otherwise. > > Acked-by: Brian Paul With the various whitespace issues fixed (and I think the DeleteProgram change is a whitespace issue of some sort), this patch is Reviewed-by: Ian Romanick Miklós, I assume you need someone to commit this for you? I can fix the minor whitespace problems and commit it. >> /** >> * Notify driver that a program string (and GPU code) has been >> specified >> * or modified. Return GL_TRUE or GL_FALSE to indicate if the >> program is >> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h >> index 12d3863..22e8a21 100644 >> --- a/src/mesa/main/mtypes.h >> +++ b/src/mesa/main/mtypes.h >> @@ -2197,6 +2197,7 @@ struct ati_fragment_shader >> GLboolean interpinp1; >> GLboolean isValid; >> GLuint swizzlerq; >> + struct gl_program *Program; >> }; >> >> /** >> diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c >> index 57f1341..f4e8288 100644 >> --- a/src/mesa/main/state.c >> +++ b/src/mesa/main/state.c >> @@ -124,7 +124,8 @@ update_program(struct gl_context *ctx) >>
Re: [Mesa-dev] [PATCH] radeonsi: also dump shaders on a VM fault
Am 25.02.2016 um 17:54 schrieb Marek Olšák: From: Marek OlšákClearly a good idea. Patch is Reviewed-by: Christian König --- src/gallium/drivers/radeonsi/si_debug.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_debug.c b/src/gallium/drivers/radeonsi/si_debug.c index 7c2b745..eb0cabb 100644 --- a/src/gallium/drivers/radeonsi/si_debug.c +++ b/src/gallium/drivers/radeonsi/si_debug.c @@ -781,8 +781,7 @@ void si_check_vm_faults(struct si_context *sctx) fprintf(f, "Device name: %s\n\n", screen->get_name(screen)); fprintf(f, "Failing VM page: 0x%08x\n\n", addr); - si_dump_last_bo_list(sctx, f); - si_dump_last_ib(sctx, f); + si_dump_debug_state(>b.b, f, 0); fclose(f); fprintf(stderr, "Detected a VM fault, exiting...\n"); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/8] Fixes for building AOSP master
Hi, On 25 February 2016 at 01:47, Emil Velikovwrote: > On 24 February 2016 at 18:56, Rob Herring wrote: >> AOSP master branch has switched to clang from gcc and has major build >> system changes moving away from GNU make. > > Out of curiosity: what are they moving to ? I can see "blueprint" > (ninja?), kati (gnu make clone) and soong(?). Is there a > comparison/documentation about them ? Ninja replaces Make. Kati converts Makefiles into Ninja build definitions. Blueprint (Android), GN (Chrome, new), and Gyp (Chrome, deprecated) generate Ninja files from higher-level build descriptions. Soong also appears to translate Blueprint to Ninja, though I'd thought Blueprint did that itself ... Cheers, Daniel ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] nvc0: rework nvc0_compute_validate_program()
On Thu, Feb 25, 2016 at 12:42 PM, Samuel Pitoisetwrote: > It would be easy to make the validate functions return a boolean to handle > errors. I will think more about that stuff. But currently, I prefer to > follow the existing design and drop this boolean. This would need to be done as part of a larger error-handling strategy. Right now we don't handle errors very well at all :( ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/8] Fixes for building AOSP master
2016-02-26 1:27 GMT+08:00 Emil Velikov: > On 25 February 2016 at 17:22, Chih-Wei Huang wrote: >> 2016-02-25 9:47 GMT+08:00 Emil Velikov : >>> On 24 February 2016 at 18:56, Rob Herring wrote: AOSP master branch has switched to clang from gcc and has major build system changes moving away from GNU make. >>> >>> Out of curiosity: what are they moving to ? I can see "blueprint" >>> (ninja?), kati (gnu make clone) and soong(?). Is there a >>> comparison/documentation about them ? >> >> Google has announced the change to all partners. >> The reasons include it’s time to have just one compiler >> for Android, and the positive impact on security >> of sanitizers like AddressSanitizer, etc. >> > I was wondering about GNU make move (to ...?), while I think you're > talking about gcc vs clang. Ah, I didn't read your question clearly. Sorry. Actually I only see the move of gcc to clang in the announcement, not mentioned about the make. >> The full post and faq can be found in the >> android-gms-announcements list. >> > Searching for android-gms-announcements does list anything. Is there a > typo in the name or the discussion group is closed to partners only ? Oh, it's partners only. Sorry. -- Chih-Wei Android-x86 project http://www.android-x86.org ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] nvc0: rework nvc0_compute_validate_program()
On 02/25/2016 06:44 PM, Ilia Mirkin wrote: On Thu, Feb 25, 2016 at 12:42 PM, Samuel Pitoisetwrote: It would be easy to make the validate functions return a boolean to handle errors. I will think more about that stuff. But currently, I prefer to follow the existing design and drop this boolean. This would need to be done as part of a larger error-handling strategy. Right now we don't handle errors very well at all :( Yeah, it's a long time effort. :-) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] nvc0: rework nvc0_compute_validate_program()
On 02/25/2016 06:35 PM, Ilia Mirkin wrote: On Wed, Feb 24, 2016 at 12:44 PM, Samuel Pitoisetwrote: Reduce the amount of duplicated code by re-using nvc0_program_validate(). While we are at it, change the prototype to return void and remove nvc0_compute.h which is now useless. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/Makefile.sources | 1 - src/gallium/drivers/nouveau/nvc0/nvc0_compute.c| 34 ++ src/gallium/drivers/nouveau/nvc0/nvc0_compute.h| 9 -- src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 1 + .../drivers/nouveau/nvc0/nvc0_shader_state.c | 15 ++ src/gallium/drivers/nouveau/nvc0/nve4_compute.c| 4 +-- 6 files changed, 20 insertions(+), 44 deletions(-) delete mode 100644 src/gallium/drivers/nouveau/nvc0/nvc0_compute.h diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources index 43ffce6..65f08c7 100644 --- a/src/gallium/drivers/nouveau/Makefile.sources +++ b/src/gallium/drivers/nouveau/Makefile.sources @@ -150,7 +150,6 @@ NVC0_C_SOURCES := \ nvc0/gm107_texture.xml.h \ nvc0/nvc0_3d.xml.h \ nvc0/nvc0_compute.c \ - nvc0/nvc0_compute.h \ nvc0/nvc0_compute.xml.h \ nvc0/nvc0_context.c \ nvc0/nvc0_context.h \ diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c index a664aaf..060f59d 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c @@ -23,7 +23,8 @@ */ #include "nvc0/nvc0_context.h" -#include "nvc0/nvc0_compute.h" + +#include "nvc0/nvc0_compute.xml.h" int nvc0_screen_compute_setup(struct nvc0_screen *screen, @@ -120,34 +121,6 @@ nvc0_screen_compute_setup(struct nvc0_screen *screen, return 0; } -bool -nvc0_compute_validate_program(struct nvc0_context *nvc0) -{ - struct nvc0_program *prog = nvc0->compprog; - - if (prog->mem) - return true; - - if (!prog->translated) { - prog->translated = nvc0_program_translate( - prog, nvc0->screen->base.device->chipset, >base.debug); - if (!prog->translated) - return false; - } - if (unlikely(!prog->code_size)) - return false; - - if (likely(prog->code_size)) { - if (nvc0_program_upload_code(nvc0, prog)) { - struct nouveau_pushbuf *push = nvc0->base.pushbuf; - BEGIN_NVC0(push, NVC0_CP(FLUSH), 1); - PUSH_DATA (push, NVC0_COMPUTE_FLUSH_CODE); - return true; - } - } - return false; -} - static void nvc0_compute_validate_samplers(struct nvc0_context *nvc0) { @@ -292,8 +265,7 @@ nvc0_compute_validate_globals(struct nvc0_context *nvc0) static bool nvc0_compute_state_validate(struct nvc0_context *nvc0) { - if (!nvc0_compute_validate_program(nvc0)) - return false; + nvc0_compprog_validate(nvc0); if (nvc0->dirty_cp & NVC0_NEW_CP_CONSTBUF) nvc0_compute_validate_constbufs(nvc0); if (nvc0->dirty_cp & NVC0_NEW_CP_DRIVERCONST) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.h b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.h deleted file mode 100644 index a23f7f3..000 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.h +++ /dev/null @@ -1,9 +0,0 @@ -#ifndef NVC0_COMPUTE_H -#define NVC0_COMPUTE_H - -#include "nvc0/nvc0_compute.xml.h" - -bool -nvc0_compute_validate_program(struct nvc0_context *nvc0); - -#endif /* NVC0_COMPUTE_H */ diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h index 7aa4b62..0f1ebb0 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h @@ -254,6 +254,7 @@ void nvc0_tctlprog_validate(struct nvc0_context *); void nvc0_tevlprog_validate(struct nvc0_context *); void nvc0_gmtyprog_validate(struct nvc0_context *); void nvc0_fragprog_validate(struct nvc0_context *); +void nvc0_compprog_validate(struct nvc0_context *); void nvc0_tfb_validate(struct nvc0_context *); diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c index 2f46c43..6b02ed5 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c @@ -28,6 +28,8 @@ #include "nvc0/nvc0_context.h" #include "nvc0/nvc0_query_hw.h" +#include "nvc0/nvc0_compute.xml.h" + static inline void nvc0_program_update_context_state(struct nvc0_context *nvc0, struct nvc0_program *prog, int stage) @@ -257,6 +259,19 @@ nvc0_gmtyprog_validate(struct nvc0_context *nvc0) } void +nvc0_compprog_validate(struct nvc0_context *nvc0) +{ + struct nouveau_pushbuf *push = nvc0->base.pushbuf; + struct nvc0_program *cp = nvc0->compprog; + + if (cp && !nvc0_program_validate(nvc0, cp)) +
Re: [Mesa-dev] [PATCH 3/3] nvc0: rework nvc0_compute_validate_program()
On Wed, Feb 24, 2016 at 12:44 PM, Samuel Pitoisetwrote: > Reduce the amount of duplicated code by re-using > nvc0_program_validate(). While we are at it, change the prototype > to return void and remove nvc0_compute.h which is now useless. > > Signed-off-by: Samuel Pitoiset > --- > src/gallium/drivers/nouveau/Makefile.sources | 1 - > src/gallium/drivers/nouveau/nvc0/nvc0_compute.c| 34 > ++ > src/gallium/drivers/nouveau/nvc0/nvc0_compute.h| 9 -- > src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 1 + > .../drivers/nouveau/nvc0/nvc0_shader_state.c | 15 ++ > src/gallium/drivers/nouveau/nvc0/nve4_compute.c| 4 +-- > 6 files changed, 20 insertions(+), 44 deletions(-) > delete mode 100644 src/gallium/drivers/nouveau/nvc0/nvc0_compute.h > > diff --git a/src/gallium/drivers/nouveau/Makefile.sources > b/src/gallium/drivers/nouveau/Makefile.sources > index 43ffce6..65f08c7 100644 > --- a/src/gallium/drivers/nouveau/Makefile.sources > +++ b/src/gallium/drivers/nouveau/Makefile.sources > @@ -150,7 +150,6 @@ NVC0_C_SOURCES := \ > nvc0/gm107_texture.xml.h \ > nvc0/nvc0_3d.xml.h \ > nvc0/nvc0_compute.c \ > - nvc0/nvc0_compute.h \ > nvc0/nvc0_compute.xml.h \ > nvc0/nvc0_context.c \ > nvc0/nvc0_context.h \ > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c > b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c > index a664aaf..060f59d 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c > @@ -23,7 +23,8 @@ > */ > > #include "nvc0/nvc0_context.h" > -#include "nvc0/nvc0_compute.h" > + > +#include "nvc0/nvc0_compute.xml.h" > > int > nvc0_screen_compute_setup(struct nvc0_screen *screen, > @@ -120,34 +121,6 @@ nvc0_screen_compute_setup(struct nvc0_screen *screen, > return 0; > } > > -bool > -nvc0_compute_validate_program(struct nvc0_context *nvc0) > -{ > - struct nvc0_program *prog = nvc0->compprog; > - > - if (prog->mem) > - return true; > - > - if (!prog->translated) { > - prog->translated = nvc0_program_translate( > - prog, nvc0->screen->base.device->chipset, >base.debug); > - if (!prog->translated) > - return false; > - } > - if (unlikely(!prog->code_size)) > - return false; > - > - if (likely(prog->code_size)) { > - if (nvc0_program_upload_code(nvc0, prog)) { > - struct nouveau_pushbuf *push = nvc0->base.pushbuf; > - BEGIN_NVC0(push, NVC0_CP(FLUSH), 1); > - PUSH_DATA (push, NVC0_COMPUTE_FLUSH_CODE); > - return true; > - } > - } > - return false; > -} > - > static void > nvc0_compute_validate_samplers(struct nvc0_context *nvc0) > { > @@ -292,8 +265,7 @@ nvc0_compute_validate_globals(struct nvc0_context *nvc0) > static bool > nvc0_compute_state_validate(struct nvc0_context *nvc0) > { > - if (!nvc0_compute_validate_program(nvc0)) > - return false; > + nvc0_compprog_validate(nvc0); > if (nvc0->dirty_cp & NVC0_NEW_CP_CONSTBUF) >nvc0_compute_validate_constbufs(nvc0); > if (nvc0->dirty_cp & NVC0_NEW_CP_DRIVERCONST) > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.h > b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.h > deleted file mode 100644 > index a23f7f3..000 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.h > +++ /dev/null > @@ -1,9 +0,0 @@ > -#ifndef NVC0_COMPUTE_H > -#define NVC0_COMPUTE_H > - > -#include "nvc0/nvc0_compute.xml.h" > - > -bool > -nvc0_compute_validate_program(struct nvc0_context *nvc0); > - > -#endif /* NVC0_COMPUTE_H */ > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h > b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h > index 7aa4b62..0f1ebb0 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h > @@ -254,6 +254,7 @@ void nvc0_tctlprog_validate(struct nvc0_context *); > void nvc0_tevlprog_validate(struct nvc0_context *); > void nvc0_gmtyprog_validate(struct nvc0_context *); > void nvc0_fragprog_validate(struct nvc0_context *); > +void nvc0_compprog_validate(struct nvc0_context *); > > void nvc0_tfb_validate(struct nvc0_context *); > > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c > b/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c > index 2f46c43..6b02ed5 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c > @@ -28,6 +28,8 @@ > #include "nvc0/nvc0_context.h" > #include "nvc0/nvc0_query_hw.h" > > +#include "nvc0/nvc0_compute.xml.h" > + > static inline void > nvc0_program_update_context_state(struct nvc0_context *nvc0, >struct nvc0_program *prog, int stage) > @@ -257,6 +259,19 @@ nvc0_gmtyprog_validate(struct nvc0_context *nvc0) > } > > void >
Re: [Mesa-dev] [PATCH 0/8] Fixes for building AOSP master
On 25 February 2016 at 17:22, Chih-Wei Huangwrote: > 2016-02-25 9:47 GMT+08:00 Emil Velikov : >> On 24 February 2016 at 18:56, Rob Herring wrote: >>> AOSP master branch has switched to clang from gcc and has major build >>> system changes moving away from GNU make. >> >> Out of curiosity: what are they moving to ? I can see "blueprint" >> (ninja?), kati (gnu make clone) and soong(?). Is there a >> comparison/documentation about them ? > > Google has announced the change to all partners. > The reasons include it’s time to have just one compiler > for Android, and the positive impact on security > of sanitizers like AddressSanitizer, etc. > I was wondering about GNU make move (to ...?), while I think you're talking about gcc vs clang. > The full post and faq can be found in the > android-gms-announcements list. > Searching for android-gms-announcements does list anything. Is there a typo in the name or the discussion group is closed to partners only ? Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/8] Fixes for building AOSP master
2016-02-25 9:47 GMT+08:00 Emil Velikov: > On 24 February 2016 at 18:56, Rob Herring wrote: >> AOSP master branch has switched to clang from gcc and has major build >> system changes moving away from GNU make. > > Out of curiosity: what are they moving to ? I can see "blueprint" > (ninja?), kati (gnu make clone) and soong(?). Is there a > comparison/documentation about them ? Google has announced the change to all partners. The reasons include it’s time to have just one compiler for Android, and the positive impact on security of sanitizers like AddressSanitizer, etc. The full post and faq can be found in the android-gms-announcements list. -- Chih-Wei Android-x86 project http://www.android-x86.org ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] build/nir: Remove unused Makefile.sources from nir folder
I would really like to see this happen. However, I seem to recall Emil having some reason for keeping it. Emil? --Jason On Thu, Feb 25, 2016 at 4:50 AM, Eduardo Lima Mitevwrote: > NIR sources are added in src/compiler/Makefile.sources. > --- > src/compiler/nir/Makefile.sources | 71 > --- > 1 file changed, 71 deletions(-) > delete mode 100644 src/compiler/nir/Makefile.sources > > diff --git a/src/compiler/nir/Makefile.sources > b/src/compiler/nir/Makefile.sources > deleted file mode 100644 > index 0755a10..000 > --- a/src/compiler/nir/Makefile.sources > +++ /dev/null > @@ -1,71 +0,0 @@ > -NIR_GENERATED_FILES = \ > - nir_builder_opcodes.h \ > - nir_constant_expressions.c \ > - nir_opcodes.c \ > - nir_opcodes.h \ > - nir_opt_algebraic.c > - > -NIR_FILES = \ > - glsl_to_nir.cpp \ > - glsl_to_nir.h \ > - nir.c \ > - nir.h \ > - nir_array.h \ > - nir_builder.h \ > - nir_clone.c \ > - nir_constant_expressions.h \ > - nir_control_flow.c \ > - nir_control_flow.h \ > - nir_control_flow_private.h \ > - nir_dominance.c \ > - nir_from_ssa.c \ > - nir_gs_count_vertices.c \ > - nir_intrinsics.c \ > - nir_intrinsics.h \ > - nir_instr_set.c \ > - nir_instr_set.h \ > - nir_liveness.c \ > - nir_lower_alu_to_scalar.c \ > - nir_lower_atomics.c \ > - nir_lower_clip.c \ > - nir_lower_global_vars_to_local.c \ > - nir_lower_gs_intrinsics.c \ > - nir_lower_load_const_to_scalar.c \ > - nir_lower_locals_to_regs.c \ > - nir_lower_idiv.c \ > - nir_lower_io.c \ > - nir_lower_outputs_to_temporaries.c \ > - nir_lower_phis_to_scalar.c \ > - nir_lower_samplers.c \ > - nir_lower_system_values.c \ > - nir_lower_tex.c \ > - nir_lower_to_source_mods.c \ > - nir_lower_two_sided_color.c \ > - nir_lower_vars_to_ssa.c \ > - nir_lower_var_copies.c \ > - nir_lower_vec_to_movs.c \ > - nir_metadata.c \ > - nir_move_vec_src_uses_to_dest.c \ > - nir_normalize_cubemap_coords.c \ > - nir_opt_constant_folding.c \ > - nir_opt_copy_propagate.c \ > - nir_opt_cse.c \ > - nir_opt_dce.c \ > - nir_opt_dead_cf.c \ > - nir_opt_gcm.c \ > - nir_opt_global_to_local.c \ > - nir_opt_peephole_select.c \ > - nir_opt_remove_phis.c \ > - nir_opt_undef.c \ > - nir_print.c \ > - nir_remove_dead_variables.c \ > - nir_search.c \ > - nir_search.h \ > - nir_split_var_copies.c \ > - nir_sweep.c \ > - nir_to_ssa.c \ > - nir_validate.c \ > - nir_vla.h \ > - nir_worklist.c \ > - nir_worklist.h > - > -- > 2.5.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
On Thu, Feb 25, 2016 at 8:19 AM, Matt Turnerwrote: > On Thu, Feb 25, 2016 at 2:15 AM, Iago Toral Quiroga > wrote: > > From the OpenGL 4.2 spec: > > > > "When a constructor is used to convert any integer or floating-point > type to a > > bool, 0 and 0.0 are converted to false, and non-zero values are > converted to > > true." > > > > Thus, even the smallest non-zero floating value should be translated to > true. > > This behavior has been verified also with proprietary NVIDIA drivers. > > Ooh, interesting. > > > Currently, we implement this conversion as a cmp.nz operation with > floats, > > subject to floating-point precision limitations, and as a result, > relatively > > small non-zero floating point numbers return false instead of true. > > > > This patch fixes the problem by getting rid of the sign bit (to cover > the case > > of -0.0) and testing the result against 0u using an integer comparison > instead. > > --- > > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > index db20c71..7d62d7e 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , > nir_alu_instr *instr) > >bld.MOV(result, negate(op[0])); > >break; > > > > - case nir_op_f2b: > > - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); > > - break; > > + case nir_op_f2b: { > > + /* Because comparing to 0.0f is subject to precision limitations, > do the > > + * comparison using integers (we need to get rid of the sign bit > for that) > > + */ > > + if (devinfo->gen >= 8) > > + op[0] = resolve_source_modifiers(op[0]); > Hrm... I'm not sure what I think about this. Neither abs nor neg *should* affect f2b since zero/non-zero should just pass through. However, resolving the source modifiers could affect if they, for instance, flushed dnorms. Incidentally, this means we probably want a NIR optimization to get rid of neg and abs right before != 0 if we can. > > Oh, good thinking. > > > + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); > > + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); > > + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); > > The cmod_propagation is going to turn this into an AND.NZ, which we > may as well just do here: > >set_condmod(BRW_CONDITIONAL_NZ, > bld.AND(result, op[0], brw_imm_ud(0x7FFFu)); > > > + break; > > Bad indentation. > > With those two things changed in both patches, they are > > Reviewed-by: Matt Turner > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radeonsi: also dump shaders on a VM fault
From: Marek Olšák--- src/gallium/drivers/radeonsi/si_debug.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_debug.c b/src/gallium/drivers/radeonsi/si_debug.c index 7c2b745..eb0cabb 100644 --- a/src/gallium/drivers/radeonsi/si_debug.c +++ b/src/gallium/drivers/radeonsi/si_debug.c @@ -781,8 +781,7 @@ void si_check_vm_faults(struct si_context *sctx) fprintf(f, "Device name: %s\n\n", screen->get_name(screen)); fprintf(f, "Failing VM page: 0x%08x\n\n", addr); - si_dump_last_bo_list(sctx, f); - si_dump_last_ib(sctx, f); + si_dump_debug_state(>b.b, f, 0); fclose(f); fprintf(stderr, "Detected a VM fault, exiting...\n"); -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: > From the OpenGL 4.2 spec: > > "When a constructor is used to convert any integer or floating-point type to a > bool, 0 and 0.0 are converted to false, and non-zero values are converted to > true." > > Thus, even the smallest non-zero floating value should be translated to true. > This behavior has been verified also with proprietary NVIDIA drivers. > > Currently, we implement this conversion as a cmp.nz operation with floats, > subject to floating-point precision limitations, and as a result, relatively > small non-zero floating point numbers return false instead of true. > > This patch fixes the problem by getting rid of the sign bit (to cover the case > of -0.0) and testing the result against 0u using an integer comparison > instead. > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index db20c71..7d62d7e 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , > nir_alu_instr *instr) >bld.MOV(result, negate(op[0])); >break; > > - case nir_op_f2b: > - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); > - break; > + case nir_op_f2b: { > + /* Because comparing to 0.0f is subject to precision limitations, do > the > + * comparison using integers (we need to get rid of the sign bit for > that) > + */ > + if (devinfo->gen >= 8) > + op[0] = resolve_source_modifiers(op[0]); > + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); > + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); > + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); > + break; > + } > + > case nir_op_i2b: >bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); >break; > Does that fix anything? I don't really see a problem with the existing logic. Yes any "non-zero value" should be converted to true. But surely that definition cannot include denorms, which you are always allowed to flush to zero. (Albeit I can't tell what the result would be with NaNs with the float compare, nor what the result actually should be in this case since glsl doesn't require NaNs neither.) Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] radeonsi: allow dumping shader disassemblies to a file
From: Marek Olšák--- src/gallium/drivers/radeonsi/si_compute.c | 2 +- src/gallium/drivers/radeonsi/si_shader.c | 46 +-- src/gallium/drivers/radeonsi/si_shader.h | 3 +- 3 files changed, 29 insertions(+), 22 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c index 9f5f4c6..1ec695e 100644 --- a/src/gallium/drivers/radeonsi/si_compute.c +++ b/src/gallium/drivers/radeonsi/si_compute.c @@ -115,7 +115,7 @@ static void *si_create_compute_state( si_shader_binary_read_config(>shader.binary, >shader.config, 0); si_shader_dump(sctx->screen, >shader, >b.debug, - TGSI_PROCESSOR_COMPUTE); + TGSI_PROCESSOR_COMPUTE, stderr); si_shader_binary_upload(sctx->screen, >shader); program->input_buffer = si_resource_create_custom(sctx->b.b.screen, diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 57458ae..8c1151a 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -4406,14 +4406,14 @@ int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader *shader) static void si_shader_dump_disassembly(const struct radeon_shader_binary *binary, struct pipe_debug_callback *debug, - const char *name) + const char *name, FILE *file) { char *line, *p; unsigned i, count; if (binary->disasm_string) { - fprintf(stderr, "Shader %s disassembly:\n", name); - fprintf(stderr, "%s", binary->disasm_string); + fprintf(file, "Shader %s disassembly:\n", name); + fprintf(file, "%s", binary->disasm_string); if (debug && debug->debug_message) { /* Very long debug messages are cut off, so send the @@ -4443,9 +4443,9 @@ static void si_shader_dump_disassembly(const struct radeon_shader_binary *binary "Shader Disassembly End"); } } else { - fprintf(stderr, "Shader %s binary:\n", name); + fprintf(file, "Shader %s binary:\n", name); for (i = 0; i < binary->code_size; i += 4) { - fprintf(stderr, "@0x%x: %02x%02x%02x%02x\n", i, + fprintf(file, "@0x%x: %02x%02x%02x%02x\n", i, binary->code[i + 3], binary->code[i + 2], binary->code[i + 1], binary->code[i]); } @@ -4457,7 +4457,8 @@ static void si_shader_dump_stats(struct si_screen *sscreen, unsigned num_inputs, unsigned code_size, struct pipe_debug_callback *debug, -unsigned processor) +unsigned processor, +FILE *file) { unsigned lds_increment = sscreen->b.chip_class >= CIK ? 512 : 256; unsigned lds_per_wave = 0; @@ -4493,15 +4494,16 @@ static void si_shader_dump_stats(struct si_screen *sscreen, if (lds_per_wave) max_simd_waves = MIN2(max_simd_waves, 16384 / lds_per_wave); - if (r600_can_dump_shader(>b, processor)) { + if (file != stderr || + r600_can_dump_shader(>b, processor)) { if (processor == TGSI_PROCESSOR_FRAGMENT) { - fprintf(stderr, "*** SHADER CONFIG ***\n" + fprintf(file, "*** SHADER CONFIG ***\n" "SPI_PS_INPUT_ADDR = 0x%04x\n" "SPI_PS_INPUT_ENA = 0x%04x\n", conf->spi_ps_input_addr, conf->spi_ps_input_ena); } - fprintf(stderr, "*** SHADER STATS ***\n" + fprintf(file, "*** SHADER STATS ***\n" "SGPRS: %d\n" "VGPRS: %d\n" "Code Size: %d bytes\n" @@ -4555,27 +4557,30 @@ static const char *si_get_shader_name(struct si_shader *shader, } void si_shader_dump(struct si_screen *sscreen, struct si_shader *shader, - struct pipe_debug_callback *debug, unsigned processor) + struct pipe_debug_callback *debug, unsigned processor, + FILE *file) { - if (r600_can_dump_shader(>b, processor) && - !(sscreen->b.debug_flags & DBG_NO_ASM)) { - fprintf(stderr, "\n%s:\n", si_get_shader_name(shader, processor)); + if (file != stderr || + (r600_can_dump_shader(>b, processor) && +!(sscreen->b.debug_flags & DBG_NO_ASM))) { + fprintf(file, "\n%s:\n",
[Mesa-dev] [PATCH 2/2] radeonsi: dump full shader disassemblies into ddebug logs
From: Marek Olšákincluding prolog and epilog disassemblies --- src/gallium/drivers/radeonsi/si_debug.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_debug.c b/src/gallium/drivers/radeonsi/si_debug.c index e16ebbd..7c2b745 100644 --- a/src/gallium/drivers/radeonsi/si_debug.c +++ b/src/gallium/drivers/radeonsi/si_debug.c @@ -34,15 +34,15 @@ DEBUG_GET_ONCE_OPTION(replace_shaders, "RADEON_REPLACE_SHADERS", NULL) -static void si_dump_shader(struct si_shader_ctx_state *state, const char *name, - FILE *f) +static void si_dump_shader(struct si_screen *sscreen, + struct si_shader_ctx_state *state, FILE *f) { if (!state->cso || !state->current) return; - fprintf(f, "%s shader disassembly:\n", name); si_dump_shader_key(state->cso->type, >current->key, f); - fprintf(f, "%s\n\n", state->current->binary.disasm_string); + si_shader_dump(sscreen, state->current, NULL, + state->cso->info.processor, f); } /** @@ -670,11 +670,11 @@ static void si_dump_debug_state(struct pipe_context *ctx, FILE *f, si_dump_debug_registers(sctx, f); si_dump_framebuffer(sctx, f); - si_dump_shader(>vs_shader, "Vertex", f); - si_dump_shader(>tcs_shader, "Tessellation control", f); - si_dump_shader(>tes_shader, "Tessellation evaluation", f); - si_dump_shader(>gs_shader, "Geometry", f); - si_dump_shader(>ps_shader, "Fragment", f); + si_dump_shader(sctx->screen, >vs_shader, f); + si_dump_shader(sctx->screen, >tcs_shader, f); + si_dump_shader(sctx->screen, >tes_shader, f); + si_dump_shader(sctx->screen, >gs_shader, f); + si_dump_shader(sctx->screen, >ps_shader, f); si_dump_last_bo_list(sctx, f); si_dump_last_ib(sctx, f); -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: fix precision of f2b
On Thu, Feb 25, 2016 at 2:15 AM, Iago Toral Quirogawrote: > From the OpenGL 4.2 spec: > > "When a constructor is used to convert any integer or floating-point type to a > bool, 0 and 0.0 are converted to false, and non-zero values are converted to > true." > > Thus, even the smallest non-zero floating value should be translated to true. > This behavior has been verified also with proprietary NVIDIA drivers. Ooh, interesting. > Currently, we implement this conversion as a cmp.nz operation with floats, > subject to floating-point precision limitations, and as a result, relatively > small non-zero floating point numbers return false instead of true. > > This patch fixes the problem by getting rid of the sign bit (to cover the case > of -0.0) and testing the result against 0u using an integer comparison > instead. > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 --- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index db20c71..7d62d7e 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder , > nir_alu_instr *instr) >bld.MOV(result, negate(op[0])); >break; > > - case nir_op_f2b: > - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); > - break; > + case nir_op_f2b: { > + /* Because comparing to 0.0f is subject to precision limitations, do > the > + * comparison using integers (we need to get rid of the sign bit for > that) > + */ > + if (devinfo->gen >= 8) > + op[0] = resolve_source_modifiers(op[0]); Oh, good thinking. > + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); > + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFu)); > + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); The cmod_propagation is going to turn this into an AND.NZ, which we may as well just do here: set_condmod(BRW_CONDITIONAL_NZ, bld.AND(result, op[0], brw_imm_ud(0x7FFFu)); > + break; Bad indentation. With those two things changed in both patches, they are Reviewed-by: Matt Turner ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/7] st/mesa: implement GL_ATI_fragment_shader
On Thu, Feb 25, 2016 at 4:18 PM, Miklós Mátéwrote: > On 02/25/2016 11:40 AM, Marek Olšák wrote: >> >> On Thu, Feb 25, 2016 at 12:35 AM, Miklós Máté wrote: >>> >>> v2: fix arithmetic for special opcodes, >>> fix fog state, cleanup >>> v3: simplify handling of special opcodes, >>> fix rebinding with different textargets or fog equation, >>> lots of formatting fixes >>> >>> Signed-off-by: Miklós Máté >>> --- >>> src/mesa/Makefile.sources | 1 + >>> src/mesa/main/atifragshader.h | 1 + >>> src/mesa/main/texstate.c | 18 + >>> src/mesa/main/texstate.h | 3 + >>> src/mesa/program/program.h| 2 + >>> src/mesa/state_tracker/st_atifs_to_tgsi.c | 726 >>> ++ >>> src/mesa/state_tracker/st_atifs_to_tgsi.h | 65 +++ >>> src/mesa/state_tracker/st_atom_constbuf.c | 16 + >>> src/mesa/state_tracker/st_atom_shader.c | 27 +- >>> src/mesa/state_tracker/st_cb_drawpixels.c | 1 + >>> src/mesa/state_tracker/st_cb_program.c| 36 +- >>> src/mesa/state_tracker/st_program.c | 30 +- >>> src/mesa/state_tracker/st_program.h | 7 + >>> 13 files changed, 930 insertions(+), 3 deletions(-) >>> create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c >>> create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h > > [snip] >>> >>> + if (texinst->Opcode == ATI_FRAGMENT_SHADER_SAMPLE_OP) { >>> + /* use the current texture target for the sample operation >>> + * note: this implementation doesn't support re-using an ATI_fs >>> + *with different texture targets >>> + */ >>> + gl_texture_index index = _mesa_get_texture_target_index(t->ctx, >>> r); >> >> Please use value from the shader key here, not the context function. >> >>> + unsigned target = translate_texture_target(index); >>> + >>> + /* by default texture and sampler indexes are the same */ >>> + src[1] = t->samplers[r]; >>> + ureg_tex_insn(t->ureg, TGSI_OPCODE_TEX, dst, 1, target, >>> +NULL, 0, src, 2); >>> + } else if (texinst->Opcode == ATI_FRAGMENT_SHADER_PASS_OP) { >>> + ureg_insn(t->ureg, TGSI_OPCODE_MOV, dst, 1, src, 1); >>> + } >>> + > > [snip] > >>> +/** >>> + * Called when a new variant is needed, we need to translate >>> + * the ATI fragment shader to TGSI >>> + */ >>> +enum pipe_error >>> +st_translate_atifs_program( >>> + struct gl_context *ctx, >>> + struct ureg_program *ureg, >>> + struct ati_fragment_shader *atifs, >>> + struct gl_program *program, >>> + GLuint numInputs, >>> + const GLuint inputMapping[], >>> + const ubyte inputSemanticName[], >>> + const ubyte inputSemanticIndex[], >>> + const GLuint interpMode[], >>> + GLuint numOutputs, >>> + const GLuint outputMapping[], >>> + const ubyte outputSemanticName[], >>> + const ubyte outputSemanticIndex[]) >>> +{ >>> + enum pipe_error ret = PIPE_OK; >>> + >>> + unsigned pass, i, r; >>> + >>> + struct st_translate translate, *t; >>> + t = >>> + memset(t, 0, sizeof *t); >>> + >>> + t->inputMapping = inputMapping; >>> + t->outputMapping = outputMapping; >>> + t->ureg = ureg; >>> + t->ctx = ctx; >>> + t->atifs = atifs; >>> + >>> + /* >>> +* Declare input attributes. >>> +*/ >>> + for (i = 0; i < numInputs; i++) { >>> + t->inputs[i] = ureg_DECL_fs_input(ureg, >>> +inputSemanticName[i], >>> +inputSemanticIndex[i], >>> +interpMode[i]); >>> + } >>> + >>> + /* >>> +* Declare output attributes: >>> +* we always have numOutputs=1 and it's FRAG_RESULT_COLOR >>> +*/ >>> + t->outputs[0] = ureg_DECL_output( ureg, >>> + TGSI_SEMANTIC_COLOR, >>> + outputSemanticIndex[0] ); >>> + >>> + /* Emit constants and immediates. Mesa uses a single index space >>> +* for these, so we put all the translated regs in t->constants. >>> +*/ >>> + if (program->Parameters) { >>> + t->constants = calloc( program->Parameters->NumParameters, >>> +sizeof t->constants[0] ); >>> + if (t->constants == NULL) { >>> + ret = PIPE_ERROR_OUT_OF_MEMORY; >>> + goto out; >>> + } >>> + >>> + for (i = 0; i < program->Parameters->NumParameters; i++) { >>> + switch (program->Parameters->Parameters[i].Type) { >>> + case PROGRAM_STATE_VAR: >>> + case PROGRAM_UNIFORM: >>> +t->constants[i] = ureg_DECL_constant( ureg, i ); >>> +break; >>> + >>> + case PROGRAM_CONSTANT: >>> +t->constants[i] = >>> + ureg_DECL_immediate( ureg, >>> +(const >>> float*)program->Parameters->ParameterValues[i], >>> +4 ); >>> +break; >>> + default: >>> +break; >>> + } >>> + } >>> + } >>> + >>> + /* texture
Re: [Mesa-dev] [PATCH 4/7] main: rework the compatibility check of visuals in glXMakeCurrent
On 02/25/2016 08:26 AM, Miklós Máté wrote: On 02/25/2016 02:37 AM, Brian Paul wrote: On 02/24/2016 04:35 PM, Miklós Máté wrote: Now it follows the GLX 1.4 specification. Can you elaborate on that a bit? Section 2.1 of the GLX spec lists a few criteria for a context and a drawable to be compatible. This fixes post-processing in SW:KotOR. Signed-off-by: Miklós Máté--- src/mesa/main/context.c | 42 -- 1 file changed, 12 insertions(+), 30 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 26eee28..6c16229 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1525,10 +1525,6 @@ _mesa_copy_context( const struct gl_context *src, struct gl_context *dst, * Check if the given context can render into the given framebuffer * by checking visual attributes. * - * Most of these tests could go away because Mesa is now pretty flexible - * in terms of mixing rendering contexts with framebuffers. As long - * as RGB vs. CI mode agree, we're probably good. - * * \return GL_TRUE if compatible, GL_FALSE otherwise. */ static GLboolean @@ -1541,32 +1537,18 @@ check_compatible(const struct gl_context *ctx, if (buffer == _mesa_get_incomplete_framebuffer()) return GL_TRUE; -#if 0 - /* disabling this fixes the fgl_glxgears pbuffer demo */ - if (ctxvis->doubleBufferMode && !bufvis->doubleBufferMode) - return GL_FALSE; -#endif - if (ctxvis->stereoMode && !bufvis->stereoMode) - return GL_FALSE; - if (ctxvis->haveAccumBuffer && !bufvis->haveAccumBuffer) - return GL_FALSE; - if (ctxvis->haveDepthBuffer && !bufvis->haveDepthBuffer) - return GL_FALSE; - if (ctxvis->haveStencilBuffer && !bufvis->haveStencilBuffer) - return GL_FALSE; - if (ctxvis->redMask && ctxvis->redMask != bufvis->redMask) - return GL_FALSE; - if (ctxvis->greenMask && ctxvis->greenMask != bufvis->greenMask) - return GL_FALSE; - if (ctxvis->blueMask && ctxvis->blueMask != bufvis->blueMask) - return GL_FALSE; -#if 0 - /* disabled (see bug 11161) */ - if (ctxvis->depthBits && ctxvis->depthBits != bufvis->depthBits) - return GL_FALSE; -#endif - if (ctxvis->stencilBits && ctxvis->stencilBits != bufvis->stencilBits) - return GL_FALSE; +#define check_component(foo) \ + if (ctxvis->foo && bufvis->foo && \ + ctxvis->foo != bufvis->foo) \ + return GL_FALSE + + check_component(redMask); + check_component(greenMask); + check_component(blueMask); + check_component(depthBits); + check_component(stencilBits); + +#undef check_component IIRC, Ian had some comments on this so he should re-review. But since Mesa doesn't actually used the red/green/blueMask fields (AFAIK), I'm not sure what those checks are good for. Yes, my original intention was to remove this function entirely, but Ian convinced me that GLX mandates at least these checks. I wonder if those checks could be moved into the GLX code. For Windows, the wglMakeCurrent docs say "The hdc parameter must refer to a drawing surface supported by OpenGL. It need not be the same hdc that was passed to wglCreateContext when hglrc was created, but it must be on the same device and have the same pixel format." We check for that in our stw_make_current() in the WGL code. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/7] st/mesa: implement GL_ATI_fragment_shader
On 02/25/2016 08:20 AM, Miklós Máté wrote: On 02/25/2016 02:37 AM, Brian Paul wrote: + if (texinst->Opcode == ATI_FRAGMENT_SHADER_SAMPLE_OP) { + /* use the current texture target for the sample operation + * note: this implementation doesn't support re-using an ATI_fs + *with different texture targets + */ + gl_texture_index index = _mesa_get_texture_target_index(t->ctx, r); + unsigned target = translate_texture_target(index); So, the result of compiling the shader happens to depend upon the currently active texture for unit 'r'? That seems funny/fragile. I've never really looked too closely at ATI_fragment_shader so I don't know. Yes, the shader code doesn't supply the texture target, it has to be deduced in the draw call, and a separate variant has to be created if the shader is re-used with different texture targets. AFAICT the r200 driver avoids this by translating the shader on every draw call. That would be good info to have in a comment somewhere, if it's not already stated somewhere else. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/7] st/mesa: fix handling the fallback texture
On 02/25/2016 02:40 AM, Brian Paul wrote: On 02/24/2016 04:35 PM, Miklós Máté wrote: This fixes post-processing in SW:KotOR. Can you elaborate on exactly what's happening and why this change fixes things? Sometimes no texture is bound during the post-processing, which results in msamp=0, and segfault a few lines later. v2: fix const-ness Signed-off-by: Miklós Máté--- src/mesa/state_tracker/st_atom_sampler.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/mesa/state_tracker/st_atom_sampler.c b/src/mesa/state_tracker/st_atom_sampler.c index 82dcf5e..52187d0 100644 --- a/src/mesa/state_tracker/st_atom_sampler.c +++ b/src/mesa/state_tracker/st_atom_sampler.c @@ -133,7 +133,7 @@ convert_sampler(struct st_context *st, { const struct gl_texture_object *texobj; struct gl_context *ctx = st->ctx; - struct gl_sampler_object *msamp; + const struct gl_sampler_object *msamp; GLenum texBaseFormat; texobj = ctx->Texture.Unit[texUnit]._Current; @@ -144,6 +144,10 @@ convert_sampler(struct st_context *st, texBaseFormat = _mesa_texture_base_format(texobj); msamp = _mesa_get_samplerobj(ctx, texUnit); + if (!msamp) { + /* handle the fallback texture */ + msamp = >Sampler; + } memset(sampler, 0, sizeof(*sampler)); sampler->wrap_s = gl_wrap_xlate(msamp->WrapS); I'm guessing that _mesa_get_samplerobj() returns NULL only if there's no currently active texture for the given unit. If so, maybe the msamp assignment should get moved into the earlier "if (!texobj)" test. Yes, you're right. I now moved the assignment. MM ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] main: rework the compatibility check of visuals in glXMakeCurrent
On 02/25/2016 02:37 AM, Brian Paul wrote: On 02/24/2016 04:35 PM, Miklós Máté wrote: Now it follows the GLX 1.4 specification. Can you elaborate on that a bit? Section 2.1 of the GLX spec lists a few criteria for a context and a drawable to be compatible. This fixes post-processing in SW:KotOR. Signed-off-by: Miklós Máté--- src/mesa/main/context.c | 42 -- 1 file changed, 12 insertions(+), 30 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 26eee28..6c16229 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1525,10 +1525,6 @@ _mesa_copy_context( const struct gl_context *src, struct gl_context *dst, * Check if the given context can render into the given framebuffer * by checking visual attributes. * - * Most of these tests could go away because Mesa is now pretty flexible - * in terms of mixing rendering contexts with framebuffers. As long - * as RGB vs. CI mode agree, we're probably good. - * * \return GL_TRUE if compatible, GL_FALSE otherwise. */ static GLboolean @@ -1541,32 +1537,18 @@ check_compatible(const struct gl_context *ctx, if (buffer == _mesa_get_incomplete_framebuffer()) return GL_TRUE; -#if 0 - /* disabling this fixes the fgl_glxgears pbuffer demo */ - if (ctxvis->doubleBufferMode && !bufvis->doubleBufferMode) - return GL_FALSE; -#endif - if (ctxvis->stereoMode && !bufvis->stereoMode) - return GL_FALSE; - if (ctxvis->haveAccumBuffer && !bufvis->haveAccumBuffer) - return GL_FALSE; - if (ctxvis->haveDepthBuffer && !bufvis->haveDepthBuffer) - return GL_FALSE; - if (ctxvis->haveStencilBuffer && !bufvis->haveStencilBuffer) - return GL_FALSE; - if (ctxvis->redMask && ctxvis->redMask != bufvis->redMask) - return GL_FALSE; - if (ctxvis->greenMask && ctxvis->greenMask != bufvis->greenMask) - return GL_FALSE; - if (ctxvis->blueMask && ctxvis->blueMask != bufvis->blueMask) - return GL_FALSE; -#if 0 - /* disabled (see bug 11161) */ - if (ctxvis->depthBits && ctxvis->depthBits != bufvis->depthBits) - return GL_FALSE; -#endif - if (ctxvis->stencilBits && ctxvis->stencilBits != bufvis->stencilBits) - return GL_FALSE; +#define check_component(foo) \ + if (ctxvis->foo && bufvis->foo && \ + ctxvis->foo != bufvis->foo) \ + return GL_FALSE + + check_component(redMask); + check_component(greenMask); + check_component(blueMask); + check_component(depthBits); + check_component(stencilBits); + +#undef check_component IIRC, Ian had some comments on this so he should re-review. But since Mesa doesn't actually used the red/green/blueMask fields (AFAIK), I'm not sure what those checks are good for. Yes, my original intention was to remove this function entirely, but Ian convinced me that GLX mandates at least these checks. MM Nowadays, we should be flexible enough that a single context can render to any variety of 16 or 24 or 32bpp surfaces. If we don't care about the bits/channel, we shouldn't care about the masks either. -Brian return GL_TRUE; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev