On Mon, Jun 11, 2018 at 6:34 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> On Mon, Jun 11, 2018 at 3:32 PM, Jason Ekstrand <ja...@jlekstrand.net>
> wrote:
>>
>> On Wed, Jun 6, 2018 at 7:43 AM, Rob Clark <robdcl...@gmail.com> wrote:
>>>
>>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>>> ---
>>> I can't say for sure that this will work on all drivers, but it is
>>> what the blob driver does, and it seems to make deqp happy.  I could
>>> move this to it's own pass inside ir3, but that seemed like overkill
>>>
>>>  src/compiler/nir/nir.h                      | 10 ++++++++++
>>>  src/compiler/nir/nir_lower_system_values.c  | 17 +++++++++++++++++
>>>  src/gallium/drivers/freedreno/ir3/ir3_nir.c |  1 +
>>>  3 files changed, 28 insertions(+)
>>>
>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>> index 073ab4e82ea..de3d55d83af 100644
>>> --- a/src/compiler/nir/nir.h
>>> +++ b/src/compiler/nir/nir.h
>>> @@ -1963,6 +1963,16 @@ typedef struct nir_shader_compiler_options {
>>>      */
>>>     bool lower_base_vertex;
>>>
>>> +   /**
>>> +    * If enabled, gl_HelperInvocation will be lowered as:
>>> +    *
>>> +    *   !((1 << gl_SampleID) & gl_SampleMaskIN[0]))
>>
>>
>> This only works for multi-sampling.  What about the single-sampled case?
>
>
> Actually, I'm not even sure that it would work for multisampling for us.
> What about 2x MSAA?  There you are probably going to have two pixels
> involved in order to get derivatives.
>

So it definitely works in the single-sampled case, at least on adreno.
That is really the only case I've tested so far, but afaict the blob does
the same thing in the various MSAA cases, based on what I see in
cmdstream traces.

Maybe it is relying on something arguably hw specific, i.e. the hw isn't
going to schedule a thread unless it is (a) covered or (b) a helper.
That may not be true on other hw, but then I guess in those cases the
driver wouldn't need this lowering, since there would have to be some
other way to implement gl_HelperInvocation.

If this seems too driver specific, I'll just roll it into my own private
ir3 pass, but that seemed like overkill.

BR,
-R

> --Jason
>
>
>>>
>>> +    *
>>> +    * TODO any hw w/ more than 32 samples?  For them (if they
>>> +    * used this option), a bit more math would be involved.
>>> +    */
>>> +   bool lower_helper_invocation;
>>> +
>>>     bool lower_cs_local_index_from_id;
>>>
>>>     bool lower_device_index_to_zero;
>>> diff --git a/src/compiler/nir/nir_lower_system_values.c
>>> b/src/compiler/nir/nir_lower_system_values.c
>>> index 487da042620..6668cbb5dcd 100644
>>> --- a/src/compiler/nir/nir_lower_system_values.c
>>> +++ b/src/compiler/nir/nir_lower_system_values.c
>>> @@ -136,6 +136,23 @@ convert_block(nir_block *block, nir_builder *b)
>>>                             nir_load_first_vertex(b));
>>>           break;
>>>
>>> +      case SYSTEM_VALUE_HELPER_INVOCATION:
>>> +         if (b->shader->options->lower_helper_invocation) {
>>> +            nir_ssa_def *tmp;
>>> +
>>> +            tmp = nir_ushr(b,
>>> +                           nir_imm_int(b, 1),
>>> +                           nir_load_sample_id(b));
>>> +
>>> +            tmp = nir_iand(b,
>>> +                           nir_load_sample_mask_in(b),
>>> +                           tmp);
>>> +
>>> +            sysval = nir_inot(b, nir_i2b(b, tmp));
>>> +         }
>>> +
>>> +         break;
>>> +
>>>        case SYSTEM_VALUE_INSTANCE_INDEX:
>>>           sysval = nir_iadd(b,
>>>                             nir_load_instance_id(b),
>>> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_nir.c
>>> b/src/gallium/drivers/freedreno/ir3/ir3_nir.c
>>> index cd1f9c526f2..341d990b269 100644
>>> --- a/src/gallium/drivers/freedreno/ir3/ir3_nir.c
>>> +++ b/src/gallium/drivers/freedreno/ir3/ir3_nir.c
>>> @@ -51,6 +51,7 @@ static const nir_shader_compiler_options options = {
>>>                 .lower_extract_byte = true,
>>>                 .lower_extract_word = true,
>>>                 .lower_all_io_to_temps = true,
>>> +               .lower_helper_invocation = true,
>>>  };
>>>
>>>  struct nir_shader *
>>> --
>>> 2.17.0
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>>
>
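
For reference, the lowering discussed in this thread can be modelled on the CPU as a plain C function. This is only a sketch: the function name and parameters are hypothetical stand-ins (sample_mask_in for gl_SampleMaskIn[0], sample_id for gl_SampleID) and none of it is NIR or Mesa API. It follows the formula in the nir.h comment, which uses a left shift. The quoted hunk builds the bit with nir_ushr instead, and the two only agree for sample 0, which may be worth double-checking before relying on the MSAA cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU-side model of the proposed lowering; not Mesa code.
 * sample_mask_in stands in for gl_SampleMaskIn[0],
 * sample_id stands in for gl_SampleID.
 */
static bool
model_helper_invocation(uint32_t sample_mask_in, unsigned sample_id)
{
   /* Only 32 mask bits are modelled, matching the TODO in the patch. */
   assert(sample_id < 32);

   /* Left shift, as written in the comment added to nir.h. */
   uint32_t sample_bit = UINT32_C(1) << sample_id;

   /* The invocation counts as a helper if its own sample is not covered. */
   return (sample_mask_in & sample_bit) == 0;
}
```

In the single-sampled case sample_id is 0, so the result reduces to "bit 0 of the input coverage mask is clear". That only identifies helpers under the assumption stated above, namely that the hw schedules a thread only when it is covered or a helper.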