Re: [Mesa-dev] [PATCH 04/10] radeonsi: use ac_shader_config
On Fri, May 3, 2019 at 7:19 AM Nicolai Hähnle wrote: > From: Nicolai Hähnle > > --- > src/amd/common/ac_binary.c| 2 + > src/gallium/drivers/radeonsi/si_compute.c | 14 +-- > src/gallium/drivers/radeonsi/si_shader.c | 112 +++--- > src/gallium/drivers/radeonsi/si_shader.h | 25 + > 4 files changed, 27 insertions(+), 126 deletions(-) > > diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c > index 44251886b5f..d0ca55e0e0d 100644 > --- a/src/amd/common/ac_binary.c > +++ b/src/amd/common/ac_binary.c > @@ -218,26 +218,28 @@ void ac_parse_shader_binary_config(const char *data, > size_t nbytes, > unsigned value = util_le32_to_cpu(*(uint32_t*)(data + i + > 4)); > switch (reg) { > case R_00B028_SPI_SHADER_PGM_RSRC1_PS: > case R_00B128_SPI_SHADER_PGM_RSRC1_VS: > case R_00B228_SPI_SHADER_PGM_RSRC1_GS: > case R_00B848_COMPUTE_PGM_RSRC1: > case R_00B428_SPI_SHADER_PGM_RSRC1_HS: > conf->num_sgprs = MAX2(conf->num_sgprs, > (G_00B028_SGPRS(value) + 1) * 8); > conf->num_vgprs = MAX2(conf->num_vgprs, > (G_00B028_VGPRS(value) + 1) * 4); > conf->float_mode = G_00B028_FLOAT_MODE(value); > + conf->rsrc1 = value; > break; > case R_00B02C_SPI_SHADER_PGM_RSRC2_PS: > conf->lds_size = MAX2(conf->lds_size, > G_00B02C_EXTRA_LDS_SIZE(value)); > break; > case R_00B84C_COMPUTE_PGM_RSRC2: > conf->lds_size = MAX2(conf->lds_size, > G_00B84C_LDS_SIZE(value)); > + conf->rsrc2 = value; > break; > case R_0286CC_SPI_PS_INPUT_ENA: > conf->spi_ps_input_ena = value; > break; > case R_0286D0_SPI_PS_INPUT_ADDR: > conf->spi_ps_input_addr = value; > break; > case R_0286E8_SPI_TMPRING_SIZE: > case R_00B860_COMPUTE_TMPRING_SIZE: > /* WAVESIZE is in units of 256 dwords. 
*/ > diff --git a/src/gallium/drivers/radeonsi/si_compute.c > b/src/gallium/drivers/radeonsi/si_compute.c > index 541d7e6f118..02d7bac406a 100644 > --- a/src/gallium/drivers/radeonsi/si_compute.c > +++ b/src/gallium/drivers/radeonsi/si_compute.c > @@ -59,21 +59,21 @@ static const amd_kernel_code_t > *si_compute_get_code_object( > uint64_t symbol_offset) > { > if (!program->use_code_object_v2) { > return NULL; > } > return (const amd_kernel_code_t*) > (program->shader.binary.code + symbol_offset); > } > > static void code_object_to_config(const amd_kernel_code_t *code_object, > - struct si_shader_config *out_config) { > + struct ac_shader_config *out_config) { > > uint32_t rsrc1 = code_object->compute_pgm_resource_registers; > uint32_t rsrc2 = code_object->compute_pgm_resource_registers >> 32; > out_config->num_sgprs = code_object->wavefront_sgpr_count; > out_config->num_vgprs = code_object->workitem_vgpr_count; > out_config->float_mode = G_00B028_FLOAT_MODE(rsrc1); > out_config->rsrc1 = rsrc1; > out_config->lds_size = MAX2(out_config->lds_size, > G_00B84C_LDS_SIZE(rsrc2)); > out_config->rsrc2 = rsrc2; > out_config->scratch_bytes_per_wave = > @@ -241,22 +241,22 @@ static void *si_create_compute_state( > const amd_kernel_code_t *code_object = > si_compute_get_code_object(program, 0); > code_object_to_config(code_object, > >shader.config); > if (program->shader.binary.reloc_count != 0) { > fprintf(stderr, "Error: %d unsupported > relocations\n", > > program->shader.binary.reloc_count); > FREE(program); > return NULL; > } > } else { > - > si_shader_binary_read_config(>shader.binary, > ->shader.config, 0); > + > ac_shader_binary_read_config(>shader.binary, > +>shader.config, 0, false); > } > si_shader_dump(sctx->screen, >shader, > >debug, >PIPE_SHADER_COMPUTE, stderr, true); > if (si_shader_binary_upload(sctx->screen, > >shader) < 0) { > fprintf(stderr, "LLVM failed to upload shader\n"); > FREE(program); > return NULL; > } > } > > @@ -362,21 +362,21 @@ static void 
si_initialize_compute(struct
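Editor's sketch: the ac_binary.c hunk quoted above walks a buffer of (register, value) dword pairs and, with this patch, also keeps the raw RSRC1/RSRC2 words. The following is a minimal, self-contained illustration of that pattern, not Mesa's code. Every `sketch_`/`SKETCH_` name is hypothetical, the 8-byte stride is inferred from the `data + i + 4` read in the diff, and the bit-field positions are assumptions rather than values copied from sid.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets read off the names R_00B848_COMPUTE_PGM_RSRC1 and
 * R_00B84C_COMPUTE_PGM_RSRC2 that appear in the quoted diff. */
#define SKETCH_REG_COMPUTE_PGM_RSRC1 0x00B848u
#define SKETCH_REG_COMPUTE_PGM_RSRC2 0x00B84Cu

/* ASSUMPTION: illustrative field positions, not taken from sid.h. */
#define SKETCH_VGPRS(v) (((v) >> 0) & 0x3Fu)
#define SKETCH_SGPRS(v) (((v) >> 6) & 0x0Fu)

#define SKETCH_MAX2(a, b) ((a) > (b) ? (a) : (b))

struct sketch_config {
   unsigned num_sgprs;
   unsigned num_vgprs;
   uint32_t rsrc1;
   uint32_t rsrc2;
};

/* Read one little-endian 32-bit word without alignment requirements. */
static uint32_t sketch_read_le32(const unsigned char *p)
{
   return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
          ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk (register, value) pairs; assumed layout is 4 bytes each. */
static void sketch_parse_config(const unsigned char *data, size_t nbytes,
                                struct sketch_config *conf)
{
   for (size_t i = 0; i + 8 <= nbytes; i += 8) {
      uint32_t reg = sketch_read_le32(data + i);
      uint32_t value = sketch_read_le32(data + i + 4);

      switch (reg) {
      case SKETCH_REG_COMPUTE_PGM_RSRC1:
         conf->num_sgprs = SKETCH_MAX2(conf->num_sgprs,
                                       (SKETCH_SGPRS(value) + 1) * 8);
         conf->num_vgprs = SKETCH_MAX2(conf->num_vgprs,
                                       (SKETCH_VGPRS(value) + 1) * 4);
         conf->rsrc1 = value; /* the raw word the patch starts storing */
         break;
      case SKETCH_REG_COMPUTE_PGM_RSRC2:
         conf->rsrc2 = value;
         break;
      default:
         break; /* unknown registers are skipped in this sketch */
      }
   }
}

/* Usage example: two pairs, RSRC1 = 0x83 and RSRC2 = 0x10. */
static void sketch_parse_config_example(void)
{
   const unsigned char buf[16] = {
      0x48, 0xB8, 0x00, 0x00, 0x83, 0x00, 0x00, 0x00,
      0x4C, 0xB8, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
   };
   struct sketch_config conf = {0};

   sketch_parse_config(buf, sizeof(buf), &conf);
   assert(conf.num_vgprs == 16); /* (3 + 1) * 4 */
   assert(conf.num_sgprs == 24); /* (2 + 1) * 8 */
   assert(conf.rsrc1 == 0x83u);
   assert(conf.rsrc2 == 0x10u);
}
```

Keeping the raw words next to the decoded counts is what lets a driver program the registers later without re-deriving them from the individual fields.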
Re: [Mesa-dev] [PATCH 3/3] radeonsi: overhaul the vertex fetch fixup mechanism
For the series:

Reviewed-by: Marek Olšák

Marek

On Fri, May 3, 2019 at 7:06 AM Haehnle, Nicolai wrote:

> On 03.05.19 12:36, Nicolai Hähnle wrote:
> > On 25.04.19 13:18, Nicolai Hähnle wrote:
> >> @@ -4618,21 +4648,27 @@ static void si_bind_vertex_elements(struct pipe_context *ctx, void *state)
> >>       struct si_vertex_elements *old = sctx->vertex_elements;
> >>       struct si_vertex_elements *v = (struct si_vertex_elements*)state;
> >>       sctx->vertex_elements = v;
> >>       sctx->vertex_buffers_dirty = true;
> >>       if (v &&
> >>           (!old ||
> >>            old->count != v->count ||
> >>            old->uses_instance_divisors != v->uses_instance_divisors ||
> >> -          v->uses_instance_divisors || /* we don't check which divisors changed */
> >> +          /* we don't check which divisors changed */
> >> +          v->uses_instance_divisors ||
> >> +          /* fix_fetch_{always,opencode,unaligned} and hw_load_is_dword are
> >> +           * functions of fix_fetch and the src_offset alignment.
> >> +           * If they change and fix_fetch doesn't, it must be due to different
> >> +           * src_offset alignment, which is reflected in fix_fetch_opencode. */
> >> +          old->fix_fetch_opencode != v->fix_fetch_opencode ||
> >>            memcmp(old->fix_fetch, v->fix_fetch, sizeof(v->fix_fetch[0]) * v->count)))
> >
> > The following condition got dropped in a late cleanup that I was doing:
> >
> >> (old->vb_alignment_check_mask ^ v->vb_alignment_check_mask) &
> >> sctx->vertex_buffer_unaligned ||
>
> ... and also this:
>
> >    ((v->vb_alignment_check_mask & sctx->vertex_buffer_unaligned) &&
> >     memcmp(old->vertex_buffer_index, v->vertex_buffer_index,
> >            sizeof(v->vertex_buffer_index[0]) * v->count)) ||
>
> Cheers,
> Nicolai
>
> >
> > I've fixed that locally.
> >
> > Cheers,
> > Nicolai
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
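Editor's sketch: the first dropped condition, `(old->vb_alignment_check_mask ^ v->vb_alignment_check_mask) & sctx->vertex_buffer_unaligned`, is a bitmask idiom that can be read in isolation. The helper below is hypothetical and the meaning given to each mask is this note's reading of the field names, not something the thread spells out: XOR finds the slots whose "needs alignment check" status differs between the old and new state, and the AND keeps only those slots that are currently unaligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper; one bit per vertex buffer slot.
 * old_check / new_check: slots whose alignment the old / new vertex
 *                        element state cares about (assumed meaning).
 * unaligned:             slots currently bound at an unaligned address. */
static bool sketch_alignment_state_changed(uint32_t old_check,
                                           uint32_t new_check,
                                           uint32_t unaligned)
{
   uint32_t differs = old_check ^ new_check; /* status flipped for these */
   return (differs & unaligned) != 0;        /* ...and it matters for these */
}

/* Usage example with three small cases. */
static void sketch_alignment_example(void)
{
   /* Slot 1 is checked only by the new state and is unaligned. */
   assert(sketch_alignment_state_changed(0x1, 0x3, 0x2));
   /* Slot 1 differs, but only slot 2 is unaligned. */
   assert(!sketch_alignment_state_changed(0x1, 0x3, 0x4));
   /* Identical masks never trigger, whatever is unaligned. */
   assert(!sketch_alignment_state_changed(0x5, 0x5, 0xF));
}
```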
Re: [Mesa-dev] [PATCH] glsl_to_nir: remove unused type_is_int()
On Wed, May 08, 2019 at 01:55:53PM +1000, Timothy Arceri wrote:
> This was missed in e00fa99b08b3.
>
> Cc: Christian Gmeiner
> ---
>  src/compiler/glsl/glsl_to_nir.cpp | 9 ---------
>  1 file changed, 9 deletions(-)

Reviewed-by: Caio Marcelo de Oliveira Filho
[Mesa-dev] [Bug 91687] Crash when creating new context after destroying the old one using indirect rendering
https://bugs.freedesktop.org/show_bug.cgi?id=91687

Timothy Arceri changed:

           What    |Removed                     |Added
             Status|NEW                         |NEEDINFO

--- Comment #6 from Timothy Arceri ---
Is this working for you with recent Mesa? I was unable to figure out the example's dependencies on my distro, so I couldn't test.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
[Mesa-dev] [PATCH] glsl_to_nir: remove unused type_is_int()
This was missed in e00fa99b08b3.

Cc: Christian Gmeiner
---
 src/compiler/glsl/glsl_to_nir.cpp | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp
index 47159ebd5e8..a51d39c4753 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -1728,15 +1728,6 @@ type_is_signed(glsl_base_type type)
       type == GLSL_TYPE_INT16;
 }
 
-static bool
-type_is_int(glsl_base_type type)
-{
-   return type == GLSL_TYPE_UINT || type == GLSL_TYPE_INT ||
-      type == GLSL_TYPE_UINT8 || type == GLSL_TYPE_INT8 ||
-      type == GLSL_TYPE_UINT16 || type == GLSL_TYPE_INT16 ||
-      type == GLSL_TYPE_UINT64 || type == GLSL_TYPE_INT64;
-}
-
 void
 nir_visitor::visit(ir_expression *ir)
 {
--
2.20.1
[Mesa-dev] [Bug 99781] Some Unity games fail assertion on startup in glXCreateContextAttribsARB
https://bugs.freedesktop.org/show_bug.cgi?id=99781

Timothy Arceri changed:

           What    |Removed                     |Added
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #19 from Timothy Arceri ---
I've reverted this for now, as it was causing regressions (bug #110632 and bug #110590).

commit a01b393c397c846345f03f76f1167dd667e0ee96
Author: Timothy Arceri
Date:   Tue May 7 13:55:32 2019 +1000

    Revert "glx: Fix synthetic error generation in __glXSendError"

    This reverts commit e91ee763c378d03883eb88cf0eadd8aa916f7878.

    This seems to have broken a number of wine games. Lets revert
    everything for now and try again later.

    Acked-by: Adam Jackson
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590

Looks like we need better piglit tests before trying again.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
[Mesa-dev] [Bug 110632] "glx: Fix synthetic error generation in __glXSendError" broke wine games on 32-bit
https://bugs.freedesktop.org/show_bug.cgi?id=110632

Timothy Arceri changed:

           What    |Removed                     |Added
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #7 from Timothy Arceri ---
I've reverted this for now. We can try again later once this regression is figured out.

commit a01b393c397c846345f03f76f1167dd667e0ee96
Author: Timothy Arceri
Date:   Tue May 7 13:55:32 2019 +1000

    Revert "glx: Fix synthetic error generation in __glXSendError"

    This reverts commit e91ee763c378d03883eb88cf0eadd8aa916f7878.

    This seems to have broken a number of wine games. Lets revert
    everything for now and try again later.

    Acked-by: Adam Jackson
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [PATCH] kmsro: add _dri.so to two of the kmsro drivers.
From: Dave Airlie

---
 src/gallium/targets/dri/meson.build | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/targets/dri/meson.build b/src/gallium/targets/dri/meson.build
index dd40969a166..45daf647960 100644
--- a/src/gallium/targets/dri/meson.build
+++ b/src/gallium/targets/dri/meson.build
@@ -78,8 +78,8 @@ foreach d : [[with_gallium_kmsro, [
                'pl111_dri.so',
                'repaper_dri.so',
                'rockchip_dri.so',
-               'st7586.so',
-               'st7735r.so',
+               'st7586_dri.so',
+               'st7735r_dri.so',
                'sun4i-drm_dri.so',
              ]],
              [with_gallium_radeonsi, 'radeonsi_dri.so'],
--
2.20.1
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
Makes sense, thank you for the clarification.
Re: [Mesa-dev] [PATCH 8/9] egl: add EGL_platform_device support
Acked-by: Marek Olšák Marek On Mon, May 6, 2019 at 11:02 AM Emil Velikov wrote: > This new 'platform' is added by default with no guards. > > It is effectively a copy of the surfaceless one, with updated function > names and brand new probe function. > > Due to the reuse, some of the ifdef HAVE_SURFACELESS_PLATFORM guards > have been dropped. > > A worthy mention are the changes in _egFindDisplay, since the original > and dup'd fd are required, we make use of the plat_opt argument. > > Note that no hacks for eglGetDisplay are added - the API works only with > the eglGetPlatformDisplay* API. > > v2: > - s/_eglCompareDeviceDisplay/_eglSameDeviceDisplay/ (Eric) > - let ^^ return bool (Eric) > - fixup meson build, move files() further up (Eric) > - copy from plat. surfaceless w/o the visual cleanups > - close and free when destroying the dpy > - sprinkle a few _eglDeviceSupports > - split fd handling into separate function > - use directly the render node if no FD is given (Mathias) > > v3: > - s/dpy/disp/g > - drop swap_buffers* callbacks > - drop loader_set_logger() > - drop local define > - re-introduce _eglGetDRMDeviceRenderNode() > - EGL_WARN on ForceSoftware with HW device - continue using the HW device > - bail out for "EGL_MESA_device_software" until it's fixed > - wire-up the Android build > > v4: > - use new style _eglFindDisplay() > - split hw vs sw code paths > - don't close the internal fd (already handled in FiniDisplay()) > - make swrast work (bit hacky bit will do for now) > - Android for real, drop autotools > - Correct HW + LIBGL_ALWAYS_SOFTWARE check > - use the dri2_create_drawable() helper > > Cc: Mathias Fröhlich > Cc: Marek Olšák > Signed-off-by: Emil Velikov > --- > src/egl/Android.mk | 1 + > src/egl/drivers/dri2/egl_dri2.c| 3 + > src/egl/drivers/dri2/egl_dri2.h| 13 +- > src/egl/drivers/dri2/platform_device.c | 432 + > src/egl/main/eglapi.c | 13 +- > src/egl/main/egldevice.c | 16 + > src/egl/main/egldevice.h | 3 + > src/egl/main/egldisplay.c | 
64 > src/egl/main/egldisplay.h | 7 +- > src/egl/main/eglglobals.c | 1 + > src/egl/meson.build| 1 + > 11 files changed, 543 insertions(+), 11 deletions(-) > create mode 100644 src/egl/drivers/dri2/platform_device.c > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: check for libdrm when using VL with X11
Quoting Alyssa Ross (2019-05-07 06:17:15)
> On Mon, May 06, 2019 at 04:38:20PM -0700, Alyssa Rosenzweig wrote:
> > Wrong Alyssa, cc'ing the right one :)
>
> Thank you for the CC, fellow Alyssa! :)
>
> > On Mon, May 06, 2019 at 04:32:38PM +0100, Emil Velikov wrote:
> > > Alyssa this should resolve the failure with minimal churn. Please let
> > > me know if it works on your end or not.
>
> Emil, this works for me. Thank you.
>
> Tested-by: Alyssa Ross
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable

Staged for 19.0, thanks.
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
They aren't. It's just a function pointer. We could add support for it but it doesn't seem worth the effort.

On May 7, 2019 17:42:23 Alyssa Rosenzweig wrote:

Gotcha. I wasn't sure negations in the NIR search rule were possible...?
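Editor's sketch: the point that a rule's condition is "just a function pointer" explains why a negated condition needs its own helper. The types and names below are hypothetical stand-ins, not NIR's actual search structures; they only show that a bare predicate pointer has nowhere to carry a "not" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rule condition: a bare predicate, no negation flag. */
typedef bool (*sketch_cond_fn)(unsigned num_users);

static bool sketch_is_used_once(unsigned num_users)
{
   return num_users == 1;
}

/* The negated form has to be written out as a second function. */
static bool sketch_is_not_used_once(unsigned num_users)
{
   return !sketch_is_used_once(num_users);
}

struct sketch_rule {
   const char *name;
   sketch_cond_fn cond; /* NULL means the rule is unconditional */
};

static bool sketch_rule_applies(const struct sketch_rule *rule,
                                unsigned num_users)
{
   return rule->cond == NULL || rule->cond(num_users);
}

/* Usage example: the same value passes one rule and fails its negation. */
static void sketch_rule_example(void)
{
   const struct sketch_rule once = { "once", sketch_is_used_once };
   const struct sketch_rule not_once = { "not-once", sketch_is_not_used_once };
   const struct sketch_rule always = { "always", NULL };

   assert(sketch_rule_applies(&once, 1));
   assert(!sketch_rule_applies(&not_once, 1));
   assert(sketch_rule_applies(&not_once, 3));
   assert(sketch_rule_applies(&always, 0));
}
```

Supporting negation in the rule itself would mean adding a flag (or a wrapper) to every rule entry, which is the extra machinery the reply judges not worth the effort.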
Re: [Mesa-dev] [PATCH] radv: call constant folding before opt algebraic
On 8/5/19 1:51 am, Samuel Pitoiset wrote:

What games are affected btw?

PERCENTAGE DELTAS       Shaders   SGPRs     VGPRs     SpillSGPR  CodeSize  MaxWaves
batman-arkham-city         2581   .         .         .          .         .
dawn-of-war-3               244   .         .         .          .         .
f1-2017                    5627   .         .         .          .         .
fallout4-vr                 196   .         .         .          .         .
nier                       1905   .         .         .          .         .
no-mans-sky                4054   .         .         .          .         .
prey                       2182   .         .         .          .         .
rot-tomb-raider            8391   .         .         .          .         .
skyrim-vr                   494   .         .         .          .         .
sot-tomb-raider             613   .         .         .          0.01 %    .
the_witcher_3-medium        803   .         .         .          .         .
the_wither_3-ultra         1040   0.05 %    -0.03 %   .          .         0.02 %
valve-vr-pref-trace         323   .         .         .          .         .
wolfenstein-2              1056   .         .         -0.20 %    -0.02 %   .
------------------------------------------------------------------------------------
All affected                 69   0.50 %    -0.22 %   -7.69 %    -0.28 %   0.57 %
------------------------------------------------------------------------------------
Total                     29509   .         .         -0.03 %    .         .

Can you please double check before pushing because of the flrp changes that landed around?

No change.

On 5/7/19 7:14 AM, Timothy Arceri wrote:

ping!

On 2/5/19 1:38 pm, Timothy Arceri wrote:

The pattern of calling opt algebraic first seems to have originated in i965. The order in OpenGL drivers generally doesn't matter because the GLSL IR optimisations do constant folding before opt algebraic. However in Vulkan drivers calling opt algebraic first can result in missed constant folding opportunities.

vkpipeline-db results (VEGA64):

Totals from affected shaders:
SGPRS: 3160 -> 3176 (0.51 %)
VGPRS: 3588 -> 3580 (-0.22 %)
Spilled SGPRs: 52 -> 44 (-15.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 12 -> 12 (0.00 %) dwords per thread
Code Size: 261812 -> 261036 (-0.30 %) bytes
LDS: 7 -> 7 (0.00 %) blocks
Max Waves: 346 -> 348 (0.58 %)
Wait states: 0 -> 0 (0.00 %)
---
 src/amd/vulkan/radv_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index cd5a9f2afb4..ad7b2439735 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -162,8 +162,8 @@ radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively,
                 NIR_PASS(progress, shader, nir_opt_dead_cf);
                 NIR_PASS(progress, shader, nir_opt_cse);
                 NIR_PASS(progress, shader, nir_opt_peephole_select, 8, true, true);
-                NIR_PASS(progress, shader, nir_opt_algebraic);
                 NIR_PASS(progress, shader, nir_opt_constant_folding);
+                NIR_PASS(progress, shader, nir_opt_algebraic);
                 NIR_PASS(progress, shader, nir_opt_undef);
                 NIR_PASS(progress, shader, nir_opt_conditional_discard);
                 if (shader->options->max_unroll_iterations) {
Re: [Mesa-dev] [PATCH v3] ac, radv: remove the vec3 restriction with LLVM 9+
Reviewed-by: Marek Olšák Marek On Thu, May 2, 2019 at 10:12 AM Samuel Pitoiset wrote: > This changes requires LLVM r356755. > > 32706 shaders in 16744 tests > Totals: > SGPRS: 1448848 -> 1455984 (0.49 %) > VGPRS: 1016684 -> 1016220 (-0.05 %) > Spilled SGPRs: 25871 -> 25815 (-0.22 %) > Spilled VGPRs: 122 -> 122 (0.00 %) > Scratch size: 11964 -> 11956 (-0.07 %) dwords per thread > Code Size: 55324500 -> 55301152 (-0.04 %) bytes > Max Waves: 235660 -> 235586 (-0.03 %) > > Totals from affected shaders: > SGPRS: 293704 -> 300840 (2.43 %) > VGPRS: 246716 -> 246252 (-0.19 %) > Spilled SGPRs: 159 -> 103 (-35.22 %) > Scratch size: 188 -> 180 (-4.26 %) dwords per thread > Code Size: 8653664 -> 8630316 (-0.27 %) bytes > Max Waves: 60811 -> 60737 (-0.12 %) > > v3: - rebase on top of master > - remove the restriction for SSBO stores as well > v2: - fix llvm 8 > > Signed-off-by: Samuel Pitoiset > --- > > I plan to run benchmarks with that change. > > src/amd/common/ac_llvm_build.c| 15 --- > src/amd/common/ac_llvm_build.h| 1 + > src/amd/common/ac_nir_to_llvm.c | 9 ++--- > src/amd/vulkan/radv_nir_to_llvm.c | 4 +++- > 4 files changed, 18 insertions(+), 11 deletions(-) > > diff --git a/src/amd/common/ac_llvm_build.c > b/src/amd/common/ac_llvm_build.c > index 22b771db774..e191a64310f 100644 > --- a/src/amd/common/ac_llvm_build.c > +++ b/src/amd/common/ac_llvm_build.c > @@ -84,6 +84,7 @@ ac_llvm_context_init(struct ac_llvm_context *ctx, > ctx->v3i32 = LLVMVectorType(ctx->i32, 3); > ctx->v4i32 = LLVMVectorType(ctx->i32, 4); > ctx->v2f32 = LLVMVectorType(ctx->f32, 2); > + ctx->v3f32 = LLVMVectorType(ctx->f32, 3); > ctx->v4f32 = LLVMVectorType(ctx->f32, 4); > ctx->v8i32 = LLVMVectorType(ctx->i32, 8); > > @@ -1167,7 +1168,7 @@ ac_build_llvm8_buffer_store_common(struct > ac_llvm_context *ctx, > args[idx++] = voffset ? voffset : ctx->i32_0; > args[idx++] = soffset ? soffset : ctx->i32_0; > args[idx++] = LLVMConstInt(ctx->i32, (glc ? 1 : 0) + (slc ? 
2 : > 0), 0); > - unsigned func = num_channels == 3 ? 4 : num_channels; > + unsigned func = HAVE_LLVM < 0x900 && num_channels == 3 ? 4 : > num_channels; > const char *indexing_kind = structurized ? "struct" : "raw"; > char name[256], type_name[8]; > > @@ -1225,9 +1226,9 @@ ac_build_buffer_store_dword(struct ac_llvm_context > *ctx, > bool writeonly_memory, > bool swizzle_enable_hint) > { > - /* Split 3 channel stores, becase LLVM doesn't support 3-channel > + /* Split 3 channel stores, because only LLVM 9+ support 3-channel > * intrinsics. */ > - if (num_channels == 3) { > + if (num_channels == 3 && HAVE_LLVM < 0x900) { > LLVMValueRef v[3], v01; > > for (int i = 0; i < 3; i++) { > @@ -1354,7 +1355,7 @@ ac_build_llvm8_buffer_load_common(struct > ac_llvm_context *ctx, > args[idx++] = voffset ? voffset : ctx->i32_0; > args[idx++] = soffset ? soffset : ctx->i32_0; > args[idx++] = LLVMConstInt(ctx->i32, (glc ? 1 : 0) + (slc ? 2 : > 0), 0); > - unsigned func = num_channels == 3 ? 4 : num_channels; > + unsigned func = HAVE_LLVM < 0x900 && num_channels == 3 ? 4 : > num_channels; > const char *indexing_kind = structurized ? "struct" : "raw"; > char name[256], type_name[8]; > > @@ -1420,7 +1421,7 @@ ac_build_buffer_load(struct ac_llvm_context *ctx, > if (num_channels == 1) > return result[0]; > > - if (num_channels == 3) > + if (num_channels == 3 && HAVE_LLVM < 0x900) > result[num_channels++] = LLVMGetUndef(ctx->f32); > return ac_build_gather_values(ctx, result, num_channels); > } > @@ -1512,7 +1513,7 @@ ac_build_llvm8_tbuffer_load(struct ac_llvm_context > *ctx, > args[idx++] = soffset ? soffset : ctx->i32_0; > args[idx++] = LLVMConstInt(ctx->i32, dfmt | (nfmt << 4), 0); > args[idx++] = LLVMConstInt(ctx->i32, (glc ? 1 : 0) + (slc ? 2 : > 0), 0); > - unsigned func = num_channels == 3 ? 4 : num_channels; > + unsigned func = HAVE_LLVM < 0x900 && num_channels == 3 ? 4 : > num_channels; > const char *indexing_kind = structurized ? 
"struct" : "raw"; > char name[256], type_name[8]; > > @@ -1698,7 +1699,7 @@ ac_build_llvm8_tbuffer_store(struct ac_llvm_context > *ctx, > args[idx++] = soffset ? soffset : ctx->i32_0; > args[idx++] = LLVMConstInt(ctx->i32, dfmt | (nfmt << 4), 0); > args[idx++] = LLVMConstInt(ctx->i32, (glc ? 1 : 0) + (slc ? 2 : > 0), 0); > - unsigned func = num_channels == 3 ? 4 : num_channels; > + unsigned func = HAVE_LLVM < 0x900 && num_channels == 3 ? 4 : > num_channels; > const char *indexing_kind = structurized ? "struct" : "raw"; > char name[256], type_name[8];
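Editor's sketch: the recurring one-line change in this diff, `HAVE_LLVM < 0x900 && num_channels == 3 ? 4 : num_channels`, picks how many channels the buffer intrinsic is asked for. The hypothetical helper below isolates that choice; `llvm_version` stands in for the compile-time `HAVE_LLVM` macro and uses the same encoding as the comparison in the diff (0x900 for LLVM 9).

```c
#include <assert.h>

/* Hypothetical helper: channel count to request from the intrinsic.
 * Before LLVM 9 there is no 3-channel variant, so 3 is widened to 4;
 * from LLVM 9 on, the requested count is used unchanged. */
static unsigned sketch_intrinsic_channels(unsigned llvm_version,
                                          unsigned num_channels)
{
   assert(num_channels >= 1 && num_channels <= 4);

   if (llvm_version < 0x900 && num_channels == 3)
      return 4;
   return num_channels;
}

/* Usage example covering both sides of the version check. */
static void sketch_intrinsic_channels_example(void)
{
   assert(sketch_intrinsic_channels(0x800, 3) == 4); /* LLVM 8: widen */
   assert(sketch_intrinsic_channels(0x900, 3) == 3); /* LLVM 9: native vec3 */
   assert(sketch_intrinsic_channels(0x800, 2) == 2); /* other sizes untouched */
}
```

The same version test guards the matching fix-ups in the diff, such as splitting 3-channel stores and padding loads with an undefined fourth channel, so the old path stays intact for LLVM 8.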
Re: [Mesa-dev] GitLab Merge Request stable workflow question
There is a document about this in docs/, but I think you just need to add the Cc: stable tag or the Fixes: tag to the commit message of all commits you wanna nominate.

Marek

On Mon, May 6, 2019 at 12:12 PM Chuck Atkins wrote:

> When doing an MR via GitLab, is adding the Cc: mesa-stable item enough to
> nominate it for inclusion in stable or does it still need to be separately
> sent to the stable mailing list?
>
> The question is specifically wrt
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/806 since I'd
> like to see the fix in the next stable release, but it's really a broader
> workflow question of how stable is dealt with while both the MR and ML
> processes are active.
>
> --
> Chuck Atkins
> Staff R&D Engineer, Scientific Computing
> Kitware, Inc.
Re: [Mesa-dev] [PATCH] radeonsi: add an AMD_TEX_ANISO environment variable
Reviewed-by: Marek Olšák

Marek

On Mon, May 6, 2019 at 8:19 PM Timothy Arceri wrote:

> This brings it inline with the recently added AMD_DEBUG.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109619
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
> index b0e0ca7af05..4d36fd46a9b 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -950,6 +950,10 @@ struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws,
>                        sizeof(struct si_transfer), 64);
>
>         sscreen->force_aniso = MIN2(16, debug_get_num_option("R600_TEX_ANISO", -1));
> +       if (sscreen->force_aniso == -1) {
> +               sscreen->force_aniso = MIN2(16, debug_get_num_option("AMD_TEX_ANISO", -1));
> +       }
> +
>         if (sscreen->force_aniso >= 0) {
>                 printf("radeonsi: Forcing anisotropy filter to %ix\n",
>                        /* round down to a power of two */
> --
> 2.20.1
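Editor's sketch: the patch keeps the legacy `R600_TEX_ANISO` variable authoritative and consults `AMD_TEX_ANISO` only when the legacy one yields -1, clamping either value to 16. The standalone version below mirrors that order. `sketch_num_option()` is a hypothetical stand-in for Mesa's `debug_get_num_option()` and takes the variable's text directly so the logic can be exercised without touching the environment.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Stand-in for debug_get_num_option(): parse a decimal string, or
 * return the fallback when the variable is unset or not a number. */
static long sketch_num_option(const char *text, long fallback)
{
   char *end;
   long value;

   if (text == NULL)
      return fallback;
   value = strtol(text, &end, 10);
   return end == text ? fallback : value;
}

/* Same order as the patch: legacy name first, new name as fallback. */
static long sketch_force_aniso(const char *r600_value, const char *amd_value)
{
   long aniso = SKETCH_MIN2(16, sketch_num_option(r600_value, -1));

   if (aniso == -1)
      aniso = SKETCH_MIN2(16, sketch_num_option(amd_value, -1));
   return aniso;
}

/* Usage example. */
static void sketch_force_aniso_example(void)
{
   assert(sketch_force_aniso(NULL, NULL) == -1); /* neither set: no override */
   assert(sketch_force_aniso("4", "8") == 4);    /* legacy variable wins */
   assert(sketch_force_aniso(NULL, "8") == 8);   /* new variable as fallback */
   assert(sketch_force_aniso(NULL, "64") == 16); /* clamped to 16 */
}
```

In a real program the two arguments would come from `getenv("R600_TEX_ANISO")` and `getenv("AMD_TEX_ANISO")`.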
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
D'oh.
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
Sigh ... given that there's both "is_used_by_if" and "is_not_used_by_if" ... gonna go with "no".

On Tue, May 7, 2019 at 6:42 PM Alyssa Rosenzweig wrote:
>
> Gotcha. I wasn't sure negations in the NIR search rule were possible...?
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
Gotcha. I wasn't sure negations in the NIR search rule were possible...?
Re: [Mesa-dev] [PATCH 2/2] egl: add EGL_platform_device support
On Mon, May 6, 2019 at 11:19 AM Emil Velikov wrote: > On Sat, 4 May 2019 at 04:18, Marek Olšák wrote: > > > > On Fri, May 3, 2019 at 1:58 AM Mathias Fröhlich < > mathias.froehl...@gmx.net> wrote: > >> > >> Good Morning, > >> > >> On Wednesday, 1 May 2019 21:43:08 CEST Marek Olšák wrote: > >> > BTW, swrast doesn't have to exist on the system. It's not uncommon > for me > >> > to have no swrast on my development system. > >> > >> Ok. I see. I use swrast regularly to test changes with different > backend drivers. > >> Also especially classic swrast as something that is close to the good > old swtnl > >> drivers - to catch bad interactions with those. > >> > >> Anyhow, with a very old swrast I think you will get test failures. > >> But else if the system swrast is found in the hopefully not so distant > future > >> the tests should even pass - well depends on what Emil now does to get a > >> better overall swrast behavior. > >> On a production system with a full set of driver packages I do expect to > >> find swrast, right? At least on a workstation grade linux distribution. > >> > >> I start to see the actual problem for AMD there. > >> Not your test system at home, but the pro driver that needs to ship > >> and QA swrast then. > >> > >> Anyhow, I do not actually understand the way how we walk all > >> installed egl driver implementations - including closed drivers - > finally > >> and present all those devices. In a perfect world *for the customer* > >> I could enumerate all devices - including oss i965 and the closed nvidia > >> bumblebee device - on my laptop for example. > >> > >> Means - if that works fine AMD could hook into that mechanism and > >> provide further devices. Well - in the long term. > > > > > > We include libGL and libEGL along with radeonsi in our binary driver > installer. We probably don't include swrast, but I'm not 100% sure. > > > The series I just sent out covers everything but the "don't expose the > software device". 
It does include a hack which can be toggled to > achieve that though ;-) > > My line of thinking is as follows: > > Preamble: > A software device is only listed when the user requests the full > device list via QueryDevices and even then, it's the last one in the > list. > Thus it's close to impossible to get it "by mistake". > > Case A - average Joe: > Getting Mesa from their distribution - swrast is build and shipped. > > Case B - tailored solution like AMDGPU-PRO, Yocto builders or others: > People doing the platform integration know if swrast will be > built/available. If listing the software device is not something > they're interested, the trivial hack can be applied locally. > > This seems like a perfectly good middle-ground, don't you agree? > Yes, it's OK. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/9] egl: keep the software device at the end of the list
On Mon, May 6, 2019 at 11:02 AM Emil Velikov wrote:

> From: Emil Velikov
>
> By default, the user is likely to pick the first device so it should
> not be the least performant (aka software) one.
>
> Suggested-by: Marek Olšák
> Signed-off-by: Emil Velikov
> ---
>  src/egl/main/egldevice.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/egl/main/egldevice.c b/src/egl/main/egldevice.c
> index c5c9a21273a..328d9ea08c5 100644
> --- a/src/egl/main/egldevice.c
> +++ b/src/egl/main/egldevice.c
> @@ -293,13 +293,26 @@ _eglQueryDevicesEXT(EGLint max_devices,
>        goto out;
>     }
>
> +   /* Push the first device (the software one) to the end of the list.
> +    * Sending it to the user only if they've requested the full list.
> +    *
> +    * By default, the user is likely to pick the first device so having the
> +    * software (aka least performant) one is not a good idea.
> +    */
>     *num_devices = MIN2(num_devs, max_devices);
>
> -   for (i = 0, dev = devs; i < *num_devices; i++) {
> +   for (i = 0, dev = devs->Next; dev && i < max_devices; i++) {
>        devices[i] = dev;
>        dev = dev->Next;
>     }
>
> +   /* User requested the full device list, add the sofware device. */
> +   if (max_devices >= num_devs) {
> +      /* The first device is always software */
>

Oh BTW, the comment here is not congruent with the code.

Marek

> +      assert(_eglDeviceSupports(devs, _EGL_DEVICE_SOFTWARE));
> +      devices[num_devs - 1] = devs;
> +   }
> +
>  out:
>     mtx_unlock(_eglGlobal.Mutex);
>
> --
> 2.21.0
Re: [Mesa-dev] [PATCH 7/9] egl: keep the software device at the end of the list
For patches 1-7:

Reviewed-by: Marek Olšák

Marek

On Mon, May 6, 2019 at 11:02 AM Emil Velikov wrote:

> From: Emil Velikov
>
> By default, the user is likely to pick the first device so it should
> not be the least performant (aka software) one.
>
> Suggested-by: Marek Olšák
> Signed-off-by: Emil Velikov
> ---
>  src/egl/main/egldevice.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/egl/main/egldevice.c b/src/egl/main/egldevice.c
> index c5c9a21273a..328d9ea08c5 100644
> --- a/src/egl/main/egldevice.c
> +++ b/src/egl/main/egldevice.c
> @@ -293,13 +293,26 @@ _eglQueryDevicesEXT(EGLint max_devices,
>        goto out;
>     }
>
> +   /* Push the first device (the software one) to the end of the list.
> +    * Sending it to the user only if they've requested the full list.
> +    *
> +    * By default, the user is likely to pick the first device so having the
> +    * software (aka least performant) one is not a good idea.
> +    */
>     *num_devices = MIN2(num_devs, max_devices);
>
> -   for (i = 0, dev = devs; i < *num_devices; i++) {
> +   for (i = 0, dev = devs->Next; dev && i < max_devices; i++) {
>        devices[i] = dev;
>        dev = dev->Next;
>     }
>
> +   /* User requested the full device list, add the sofware device. */
> +   if (max_devices >= num_devs) {
> +      /* The first device is always software */
> +      assert(_eglDeviceSupports(devs, _EGL_DEVICE_SOFTWARE));
> +      devices[num_devs - 1] = devs;
> +   }
> +
>  out:
>     mtx_unlock(_eglGlobal.Mutex);
>
> --
> 2.21.0
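Editor's sketch: the ordering this patch implements can be tried on a plain linked list. The code below is a simplified model, not the EGL implementation: `struct sketch_device` and `sketch_query_devices()` are hypothetical, locking and error handling are left out, and the list head is assumed to be the software device, as the patch's assert expects.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_device {
   int is_software;
   struct sketch_device *next;
};

/* Copy up to max_devices entries into out[] and return how many the
 * caller receives. Hardware devices come first; the software device
 * (the list head) is appended only when the full list was requested. */
static int sketch_query_devices(struct sketch_device *devs, int num_devs,
                                int max_devices, struct sketch_device **out)
{
   int i = 0;

   assert(devs != NULL && devs->is_software);

   /* Skip the head and hand out the hardware devices in list order. */
   for (struct sketch_device *dev = devs->next;
        dev != NULL && i < max_devices; dev = dev->next)
      out[i++] = dev;

   /* Room for everything: the software device takes the last slot. */
   if (max_devices >= num_devs)
      out[num_devs - 1] = devs;

   return num_devs < max_devices ? num_devs : max_devices;
}

/* Usage example: one software and two hardware devices. */
static void sketch_query_devices_example(void)
{
   struct sketch_device hw2 = { 0, NULL };
   struct sketch_device hw1 = { 0, &hw2 };
   struct sketch_device sw = { 1, &hw1 };
   struct sketch_device *out[3] = { NULL, NULL, NULL };

   /* Full list requested: hardware first, software last. */
   assert(sketch_query_devices(&sw, 3, 3, out) == 3);
   assert(out[0] == &hw1 && out[1] == &hw2 && out[2] == &sw);

   /* Short list requested: the software device is never returned. */
   out[0] = out[1] = out[2] = NULL;
   assert(sketch_query_devices(&sw, 3, 2, out) == 2);
   assert(out[0] == &hw1 && out[1] == &hw2 && out[2] == NULL);
}
```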
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
On Tue, May 7, 2019 at 5:45 PM Alyssa Rosenzweig wrote:
>
> > IMO better names might be is_scalar_swizzle or something.
>
> Ah, yes, that would be a better name! is_not_scalar_swizzle in this case
> (logic is flipped).
>
> > Can num_components be 1? If so, then this will return false, whereas
> > you probably wanted it to return true.
>
> I think that's the correct behaviour...? It should return true if
> multiple distinct channels are accessed. For nr_components=1, that's
> false by definition.

Right, that's my bad -- I was thinking is_scalar, but this is is_non_scalar, so you're good.

(My personal preference would be to make it is_scalar and then stick a negation into the nir rule, but I don't have anything solid to back up that preference. Negations in names are confusing to me, I guess.)

Cheers,

  -ilia
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
> IMO better names might be is_scalar_swizzle or something.

Ah, yes, that would be a better name! is_not_scalar_swizzle in this case
(logic is flipped).

> Can num_components be 1? If so, then this will return false, whereas
> you probably wanted it to return true.

I think that's the correct behaviour...? It should return true if
multiple distinct channels are accessed. For nr_components=1, that's
false by definition.
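The semantics discussed in this exchange can be sketched as a small standalone C function. This is not the helper from the patch (which is not quoted here); the signature, taking a plain swizzle array and a component count, is an assumption made for illustration. It returns true only when at least two distinct channels are read, so a single-component access is never "non-scalar".

```c
#include <assert.h>   /* only needed for the checks described below */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper illustrating the agreed semantics; not Mesa code. */
static bool
is_not_scalar_swizzle(const uint8_t *swizzle, unsigned num_components)
{
   /* Compare every channel against the first one; any mismatch means
    * more than one distinct channel is accessed. */
   for (unsigned i = 1; i < num_components; i++) {
      if (swizzle[i] != swizzle[0])
         return true;
   }

   /* One component, or all components reading the same channel. */
   return false;
}
```

Under these assumptions an `xxxx` swizzle is reported as scalar, `xyzw` as non-scalar, and any one-component access as scalar, which matches the "false by definition" case above.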
Re: [Mesa-dev] [PATCH] panfrost: Fix two uninitialized accesses in compiler
Tentative R-b, but I'm baffled what the flip-flops would be about. Could you link the list of failures introduced (we're maybe relying on buggy behaviour anyway)?
Re: [Mesa-dev] [PATCH 4/7] nir: Add nir_lower_blend pass
> Logic ops seem... challenging to emulate in the shader. That shader
> would need the destination colors in the framebuffer storage format, and
> I'm not sure that's always possible (maybe?).

Alright, that's good to know. I will note that in Midgard, the native
hardware ops are to load/store the actual framebuffer storage format. As
seen later in the series, I lower load/store_output to a series of ops
converting to/from float and the actual format. It's an open question
whether we want actual nir_intrinsics for this so the lowering happens
in common NIR code and then we get fun logic ops and such.

> It might also be fun to add support for GL_AMD_blend_minmax_factor and
> GL_SGIX_blend_alpha_minmax. Looking at this lowering pass, it seems
> like most of the work would be adding tests.

Probably!

> Having all of the lowering related to blending in one place seems like a
> good idea.

Probably, yes. Also since GLSL IR is, well ;)

> > +/* These structs encapsulates the blend state such that it can be lowered
>
> encapsulate
>
> > + * cleanly */
>
> */ on its own line. There's at least one more instance of this below.

Is there, uh, a thing I can stick in my vimrc to get this right?

> Alas, case should be at the same indentation level as switch.

Good to know, thank you.

> Since factor is an enum, I think it's better to not have the default
> case. If there's no default case, the compiler will issue a warning if
> a new enum is added but not handled.
>
> Either way, the break is definitely not necessary.

+1

> { goes on its own line.

--Wait, that's a rule I actually follow, how'd I mess it up here? :P

> Why? Without a vectorizer (or even with a vectorizer), it seems like
> this will generate much worse code. I guess it's only a few
> instructions once per shader, so it may not matter... but it's a little
> surprising especially after going to extra effort to get the blend color
> as a vector.

Honestly? Since that's how vc4 (scalar) did it and for v1 of this series
I was more eager to get something passing tests than particularly good.

In my original implementation (before joining upstream), I did it like
this and then used nir_opt_vectorize to get it back to something
reasonable. That's one strategy still.

The other option is to try to generate vector code out of the gate, but
that's hard to get right when RGB/alpha can be separate (but don't have
to be), etc. The logic needed would approach just, well, having ALU
vectorization... might as well merge the vectorize pass at that point...

Suggestions welcome :)

> FWIW, since someone will probably comment, I like this organization
> better than the method other passes use of splitting things into
> multiple functions.

Hehe, thank you :)
Re: [Mesa-dev] [PATCH] intel/compiler: Unset flag reg when FB write is not predicated
That's a bit gross that we have to do that. Oh, well.

Reviewed-by: Jason Ekstrand

On Mon, Apr 29, 2019 at 6:01 PM Matt Turner wrote:

> In the FS IR we pretend that the instruction is predicated with (+f0.1)
> just for flag dependency tracking purposes. Since the instruction
> doesn't support predication before Haswell, we unset the predicate so we
> should also unset the flag register so that we can round-trip the
> disassembly.
> ---
>  src/intel/compiler/brw_fs_generator.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/intel/compiler/brw_fs_generator.cpp
> b/src/intel/compiler/brw_fs_generator.cpp
> index af8350aed6c..84909f83fec 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -363,6 +363,7 @@ fs_generator::generate_fb_write(fs_inst *inst, struct
> brw_reg payload)
>  {
>     if (devinfo->gen < 8 && !devinfo->is_haswell) {
>        brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
> +      brw_set_default_flag_reg(p, 0, 0);
>     }
>
>     const struct brw_reg implied_header =
> --
> 2.21.0
[Mesa-dev] [Bug 107511] KHR/khrplatform.h not always installed when needed
https://bugs.freedesktop.org/show_bug.cgi?id=107511

Madhurkiran changed:

           What    |Removed                     |Added
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #2 from Madhurkiran ---
Hi all,

I see that certain GL headers such as "glcorearb.h" need the
"KHR/khrplatform.h" header. This works seamlessly when both the GL and
EGL providers are Mesa. When the GL provider is Mesa and the EGL provider
is a GPU vendor, say ARM, the khrplatform header comes from two different
providers, and that results in a conflict. In scenarios like this, who
should provide the khrplatform.h header? Some users may want to use the
EGL/GLES2 binaries exclusively and don't need GL support. How can we
resolve this conflict?

Thanks,
Mads

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
[Mesa-dev] [Bug 110625] [TRACKER] Mesa 19.1 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=110625

El jinete sin cabeza changed:

           What    |Removed                     |Added
                 CC|                            |romanescu.2...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 1/7] compiler: Add enums for blend state
This commit is Reviewed-by: Ian Romanick On 5/5/19 7:26 PM, Alyssa Rosenzweig wrote: > We add enums corresponding to (GLES) blend state to shader_enums.h, > complementing the existing advanced blending enums in the file. This > allows us to represent blending state in a driver-agnostic, API-agnostic > way to permit lowering. > > Signed-off-by: Alyssa Rosenzweig > Cc: Eric Anholt > Cc: Kenneth Graunke > --- > src/compiler/shader_enums.h | 21 + > 1 file changed, 21 insertions(+) > > diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h > index ac293af4519..47b1ca01dd6 100644 > --- a/src/compiler/shader_enums.h > +++ b/src/compiler/shader_enums.h > @@ -753,6 +753,27 @@ enum gl_advanced_blend_mode > BLEND_ALL= 0x7fff, > }; > > +enum blend_func > +{ > + BLEND_FUNC_ADD, > + BLEND_FUNC_SUBTRACT, > + BLEND_FUNC_REVERSE_SUBTRACT, > + BLEND_FUNC_MIN, > + BLEND_FUNC_MAX, > +}; > + > +enum blend_factor > +{ > + BLEND_FACTOR_ZERO, > + BLEND_FACTOR_SRC_COLOR, > + BLEND_FACTOR_DST_COLOR, > + BLEND_FACTOR_SRC_ALPHA, > + BLEND_FACTOR_DST_ALPHA, > + BLEND_FACTOR_CONSTANT_COLOR, > + BLEND_FACTOR_CONSTANT_ALPHA, > + BLEND_FACTOR_SRC_ALPHA_SATURATE, > +}; > + > enum gl_tess_spacing > { > TESS_SPACING_UNSPECIFIED, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
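To make the role of these enums concrete, below is a rough CPU-side sketch of how one channel of a blend equation could be evaluated from them. The two enums are copied from the quoted patch; `blend_factor_value()` and `blend_channel()` are hypothetical helpers written for this note and are not part of the series. The `invert` flag is modeled on the `invert_src_factor`/`invert_dst_factor` fields of patch 4/7, and the per-function arithmetic follows the usual OpenGL ES blend equations. Framebuffer clamping and format conversion are deliberately not modeled.

```c
#include <assert.h>   /* only needed for the checks described below */
#include <stdbool.h>

/* Enums copied from the quoted patch. */
enum blend_func {
   BLEND_FUNC_ADD,
   BLEND_FUNC_SUBTRACT,
   BLEND_FUNC_REVERSE_SUBTRACT,
   BLEND_FUNC_MIN,
   BLEND_FUNC_MAX,
};

enum blend_factor {
   BLEND_FACTOR_ZERO,
   BLEND_FACTOR_SRC_COLOR,
   BLEND_FACTOR_DST_COLOR,
   BLEND_FACTOR_SRC_ALPHA,
   BLEND_FACTOR_DST_ALPHA,
   BLEND_FACTOR_CONSTANT_COLOR,
   BLEND_FACTOR_CONSTANT_ALPHA,
   BLEND_FACTOR_SRC_ALPHA_SATURATE,
};

/* Hypothetical helper: value of a blend factor for channel `chan`
 * (0..2 = RGB, 3 = alpha). `invert` selects the ONE_MINUS_* variant,
 * so an inverted ZERO acts as ONE. */
static float
blend_factor_value(enum blend_factor factor, bool invert, unsigned chan,
                   const float src[4], const float dst[4],
                   const float constant[4])
{
   float f = 0.0f;

   /* No default case, so the compiler can warn about unhandled enums. */
   switch (factor) {
   case BLEND_FACTOR_ZERO:
      f = 0.0f;
      break;
   case BLEND_FACTOR_SRC_COLOR:
      f = src[chan];
      break;
   case BLEND_FACTOR_DST_COLOR:
      f = dst[chan];
      break;
   case BLEND_FACTOR_SRC_ALPHA:
      f = src[3];
      break;
   case BLEND_FACTOR_DST_ALPHA:
      f = dst[3];
      break;
   case BLEND_FACTOR_CONSTANT_COLOR:
      f = constant[chan];
      break;
   case BLEND_FACTOR_CONSTANT_ALPHA:
      f = constant[3];
      break;
   case BLEND_FACTOR_SRC_ALPHA_SATURATE:
      if (chan == 3) {
         f = 1.0f;
      } else {
         /* min(As, 1 - Ad) for the colour channels */
         float one_minus_dst_a = 1.0f - dst[3];
         f = src[3] < one_minus_dst_a ? src[3] : one_minus_dst_a;
      }
      break;
   }

   return invert ? 1.0f - f : f;
}

/* Hypothetical helper: combine one source and destination channel.
 * MIN and MAX ignore the factors. */
static float
blend_channel(enum blend_func func, float src, float src_factor,
              float dst, float dst_factor)
{
   float result = 0.0f;

   switch (func) {
   case BLEND_FUNC_ADD:
      result = src * src_factor + dst * dst_factor;
      break;
   case BLEND_FUNC_SUBTRACT:
      result = src * src_factor - dst * dst_factor;
      break;
   case BLEND_FUNC_REVERSE_SUBTRACT:
      result = dst * dst_factor - src * src_factor;
      break;
   case BLEND_FUNC_MIN:
      result = src < dst ? src : dst;
      break;
   case BLEND_FUNC_MAX:
      result = src > dst ? src : dst;
      break;
   }

   return result;
}
```

As a usage example, classic alpha blending would pair `BLEND_FACTOR_SRC_ALPHA` (not inverted) for the source with `BLEND_FACTOR_SRC_ALPHA` (inverted) for the destination under `BLEND_FUNC_ADD`. A real lowering pass would emit the equivalent NIR ALU instructions rather than compute values on the CPU.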
Re: [Mesa-dev] [PATCH 3/7] nir: Add blend_const_color_rgba sysval
This commit is Reviewed-by: Ian Romanick On 5/5/19 7:26 PM, Alyssa Rosenzweig wrote: > This represents a float vec4 constant color, as passed to glBlendColor. > While the existing 4 shader sysvals are retained to minimize code churn, > a single vectorized intrinsic is required for efficient blending on > vector architectures. (This may also apply to archictectures like > Bifrost where ALU is scalar but load/store is vector; it largely depends > on how blending is implemented per-driver.) > > Signed-off-by: Alyssa Rosenzweig > Cc: Eric Anholt > Cc: Kenneth Graunke > --- > src/compiler/nir/nir_intrinsics.py | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/src/compiler/nir/nir_intrinsics.py > b/src/compiler/nir/nir_intrinsics.py > index 3a0470c2ca1..df459a3cdec 100644 > --- a/src/compiler/nir/nir_intrinsics.py > +++ b/src/compiler/nir/nir_intrinsics.py > @@ -568,11 +568,14 @@ system_value("viewport_z_offset", 1) > system_value("viewport_scale", 3) > system_value("viewport_offset", 3) > > -# Blend constant color values. Float values are clamped.# > +# Blend constant color values. Float values are clamped. Vectored versions > are > +# provided as well for driver convenience > + > system_value("blend_const_color_r_float", 1) > system_value("blend_const_color_g_float", 1) > system_value("blend_const_color_b_float", 1) > system_value("blend_const_color_a_float", 1) > +system_value("blend_const_color_rgba", 4) > system_value("blend_const_color_rgba_unorm", 1) > system_value("blend_const_color__unorm", 1) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] nir: Add nir_lower_blend pass
On 5/5/19 7:26 PM, Alyssa Rosenzweig wrote: > This new lowering pass implements the OpenGL ES blend pipeline in > shaders, applicable to hardware lacking full-featured blending hardware > (including Midgard/Bifrost and vc4). This pass is run on a fragment > shader, rewriting the store to a blended version, loading in the > framebuffer destination color and constant color via intrinsics as > necessary. This pass is sufficient for OpenGL ES 2.0 and is verified to > pass dEQP's blend tests. That said, at present it has the following > limitations: > > - MRT is not supported. > - Logic ops are not supported. Logic ops seem... challenging to emulate in the shader. That shader would need the destination colors in the framebuffer storage format, and I'm not sure that's always possible (maybe?). We (and I think everyone else) removed GL_EXT_blend_logic_op because nobody's hardware could handle the interactions GL_EXT_blend_equation_separate. It would be cool to add it back. :) It might also be fun to add support for GL_AMD_blend_minmax_factor and GL_SGIX_blend_alpha_minmax. Looking at this lowering pass, it seems like most of the work would be adding tests. > MRT support is on my TODO list but paused until MRT is implemented in > the rest of the driver. Both changes should be fairly trivial. > > It also includes MIN/MAX modes, so in conjunction with the advanced > blend mode lowering it should be sufficient for ES3, though this has not > been thoroughly tested. It is an open question whether the current GLSL > IR based advanced blend lowering should be NIRified and merged into this > pass. Having all of the lowering related to blending in one place seems like a good idea. > ...Dual-source blending is not supported, Ryan. 
> > Signed-off-by: Alyssa Rosenzweig > Cc: Eric Anholt > Cc: Kenneth Graunke > --- > src/compiler/Makefile.sources | 1 + > src/compiler/nir/meson.build | 1 + > src/compiler/nir/nir.h | 22 +++ > src/compiler/nir/nir_lower_blend.c | 214 + > 4 files changed, 238 insertions(+) > create mode 100644 src/compiler/nir/nir_lower_blend.c > > diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources > index 9bebc3d8867..d68b9550b02 100644 > --- a/src/compiler/Makefile.sources > +++ b/src/compiler/Makefile.sources > @@ -238,6 +238,7 @@ NIR_FILES = \ > nir/nir_lower_bit_size.c \ > nir/nir_lower_bool_to_float.c \ > nir/nir_lower_bool_to_int32.c \ > + nir/nir_lower_blend.c \ > nir/nir_lower_clamp_color_outputs.c \ > nir/nir_lower_clip.c \ > nir/nir_lower_clip_cull_distance_arrays.c \ > diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build > index a8faeb9c018..73ab62d4b46 100644 > --- a/src/compiler/nir/meson.build > +++ b/src/compiler/nir/meson.build > @@ -116,6 +116,7 @@ files_libnir = files( >'nir_lower_array_deref_of_vec.c', >'nir_lower_atomics_to_ssbo.c', >'nir_lower_bitmap.c', > + 'nir_lower_blend.c', >'nir_lower_bool_to_float.c', >'nir_lower_bool_to_int32.c', >'nir_lower_clamp_color_outputs.c', > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h > index 37161e83e4d..8b68faed819 100644 > --- a/src/compiler/nir/nir.h > +++ b/src/compiler/nir/nir.h > @@ -3447,6 +3447,28 @@ typedef enum { > > bool nir_lower_to_source_mods(nir_shader *shader, > nir_lower_to_source_mods_flags options); > > +/* These structs encapsulates the blend state such that it can be lowered encapsulate > + * cleanly */ */ on its own line. There's at least one more instance of this below. 
> + > +typedef struct { > + enum blend_func func; > + > + enum blend_factor src_factor; > + bool invert_src_factor; > + > + enum blend_factor dst_factor; > + bool invert_dst_factor; > +} nir_lower_blend_channel; > + > +typedef struct { > + struct { > + nir_lower_blend_channel rgb; > + nir_lower_blend_channel alpha; > + } rt[8]; > +} nir_lower_blend_options; > + > +void nir_lower_blend(nir_shader *shader, nir_lower_blend_options options); > + > bool nir_lower_gs_intrinsics(nir_shader *shader); > > typedef unsigned (*nir_lower_bit_size_callback)(const nir_alu_instr *, void > *); > diff --git a/src/compiler/nir/nir_lower_blend.c > b/src/compiler/nir/nir_lower_blend.c > new file mode 100644 > index 000..5a874f08834 > --- /dev/null > +++ b/src/compiler/nir/nir_lower_blend.c > @@ -0,0 +1,214 @@ > +/* > + * Copyright (C) 2019 Alyssa Rosenzweig > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the
Re: [Mesa-dev] [PATCH] radv: apply the indexing workaround for atomic buffer operations on GFX9
Hi Samuel, This doesn't apply cleanly on 19.0, and I'm not sure how to resolve the diff. Could you provide a backport please? Thanks, Dylan Quoting Samuel Pitoiset (2019-05-03 02:45:34) > Because the new raw/struct intrinsics are buggy with LLVM 8 > (they weren't marked as source of divergence), we fallback to the > old intrinsics for atomic buffer operations. This means we need > to apply the indexing workaround for GFX9. > > The fact that we need another workaround is painful but we should > be able to clean up that a bit once LLVM 7 support will be dropped. > > This fixes a GPU hang with AC Odyssey and some rendering problems > with Nioh. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573 > Fixes: 31164cf5f70 ("ac/nir: only use the new raw/struct image atomic > intrinsics with LLVM 9+") > Signed-off-by: Samuel Pitoiset > --- > src/amd/common/ac_nir_to_llvm.c | 12 +++- > src/amd/common/ac_shader_abi.h| 1 + > src/amd/vulkan/radv_nir_to_llvm.c | 6 ++ > 3 files changed, 14 insertions(+), 5 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c > index c92eaaca31d..151e0d0f961 100644 > --- a/src/amd/common/ac_nir_to_llvm.c > +++ b/src/amd/common/ac_nir_to_llvm.c > @@ -2417,10 +2417,12 @@ static void get_image_coords(struct ac_nir_context > *ctx, > } > > static LLVMValueRef get_image_buffer_descriptor(struct ac_nir_context *ctx, > -const nir_intrinsic_instr > *instr, bool write) > +const nir_intrinsic_instr > *instr, > + bool write, bool atomic) > { > LLVMValueRef rsrc = get_image_descriptor(ctx, instr, AC_DESC_BUFFER, > write); > - if (ctx->abi->gfx9_stride_size_workaround) { > + if (ctx->abi->gfx9_stride_size_workaround || > + (ctx->abi->gfx9_stride_size_workaround_for_atomic && atomic)) { > LLVMValueRef elem_count = > LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 2, > 0), ""); > LLVMValueRef stride = > LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 1, > 0), 
""); > stride = LLVMBuildLShr(ctx->ac.builder, stride, > LLVMConstInt(ctx->ac.i32, 16, 0), ""); > @@ -2466,7 +2468,7 @@ static LLVMValueRef visit_image_load(struct > ac_nir_context *ctx, > unsigned num_channels = util_last_bit(mask); > LLVMValueRef rsrc, vindex; > > - rsrc = get_image_buffer_descriptor(ctx, instr, false); > + rsrc = get_image_buffer_descriptor(ctx, instr, false, false); > vindex = LLVMBuildExtractElement(ctx->ac.builder, > get_src(ctx, instr->src[1]), > ctx->ac.i32_0, ""); > > @@ -2520,7 +2522,7 @@ static void visit_image_store(struct ac_nir_context > *ctx, > args.cache_policy = get_cache_policy(ctx, access, true, > writeonly_memory); > > if (dim == GLSL_SAMPLER_DIM_BUF) { > - LLVMValueRef rsrc = get_image_buffer_descriptor(ctx, instr, > true); > + LLVMValueRef rsrc = get_image_buffer_descriptor(ctx, instr, > true, false); > LLVMValueRef src = ac_to_float(>ac, get_src(ctx, > instr->src[3])); > unsigned src_channels = ac_get_llvm_num_components(src); > LLVMValueRef vindex; > @@ -2632,7 +2634,7 @@ static LLVMValueRef visit_image_atomic(struct > ac_nir_context *ctx, > params[param_count++] = get_src(ctx, instr->src[3]); > > if (dim == GLSL_SAMPLER_DIM_BUF) { > - params[param_count++] = get_image_buffer_descriptor(ctx, > instr, true); > + params[param_count++] = get_image_buffer_descriptor(ctx, > instr, true, true); > params[param_count++] = > LLVMBuildExtractElement(ctx->ac.builder, get_src(ctx, instr->src[1]), > > ctx->ac.i32_0, ""); /* vindex */ > params[param_count++] = ctx->ac.i32_0; /* voffset */ > diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h > index 108fe58ce57..8debb1ff986 100644 > --- a/src/amd/common/ac_shader_abi.h > +++ b/src/amd/common/ac_shader_abi.h > @@ -203,6 +203,7 @@ struct ac_shader_abi { > /* Whether to workaround GFX9 ignoring the stride for the buffer size > if IDXEN=0 > * and LLVM optimizes an indexed load with constant index to IDXEN=0. 
> */ > bool gfx9_stride_size_workaround; > + bool gfx9_stride_size_workaround_for_atomic; > }; > > #endif /* AC_SHADER_ABI_H */ > diff --git a/src/amd/vulkan/radv_nir_to_llvm.c > b/src/amd/vulkan/radv_nir_to_llvm.c > index 796d78e34f4..d83f0bd547f 100644 > --- a/src/amd/vulkan/radv_nir_to_llvm.c > +++ b/src/amd/vulkan/radv_nir_to_llvm.c > @@ -3687,6 +3687,12 @@ LLVMModuleRef
[Mesa-dev] [ANNOUNCE] mesa 19.1.0-rc1
The first release candidate for Mesa 19.1.0 is now available, one week
behind schedule.

The plan is to have one release candidate every Tuesday until the
anticipated final release on 28th May 2019 (one week later than the date
in the calendar, due to the delay).

The expectation is that the 19.0 branch will remain alive with bi-weekly
releases until the 19.1.1 release.

On the path to the 19.1.0 release, there is a tracker bug for the
regressions found since 19.0:
https://bugs.freedesktop.org/show_bug.cgi?id=110625

Here are the people who helped shape the current release.

     1 Adam Jackson
     1 Albert Pal
    12 Alejandro Piñeiro
     1 Alexander von Gluck IV
     1 Alexandros Frantzis
    32 Alok Hota
   192 Alyssa Rosenzweig
     1 Alyssa Ross
     1 Amit Pundir
     5 Andre Heider
     2 Andreas Baierl
    12 Andres Gomez
     5 Andrii Simiklit
     1 Antia Puentes
     7 Anuj Phogat
    48 Axel Davy
     1 Bart Oldeman
   102 Bas Nieuwenhuizen
     1 Benjamin Gordon
     1 Benjamin Tissoires
     3 Boyan Ding
     1 Boyuan Zhang
    51 Brian Paul
    58 Caio Marcelo de Oliveira Filho
     1 Carlos Garnacho
    17 Chad Versace
     2 Charmaine Lee
    78 Chia-I Wu
     3 Chris Forbes
    19 Chris Wilson
    11 Christian Gmeiner
     1 Chuck Atkins
     6 Connor Abbott
     2 Daniel Schürmann
     2 Daniel Stone
    12 Danylo Piliaiev
    60 Dave Airlie
     3 David Riley
     1 David Shao
     1 Dominik Drees
     1 Drew Davenport
    39 Dylan Baker
     5 Eduardo Lima Mitev
     1 El Christianito
     6 Eleni Maria Stea
     3 Elie Tournier
    27 Emil Velikov
     1 Emmanuel Gil Peyrot
   121 Eric Anholt
   138 Eric Engestrom
     5 Erico Nunes
    79 Erik Faye-Lund
     2 Ernestas Kulik
     6 Francisco Jerez
     4 Fritz Koenig
    33 Gert Wollny
     2 Greg V
     1 Grigori Goronzy
     4 Guido Günther
    44 Gurchetan Singh
     1 Guttula, Suresh
     1 Hal Gentz
     1 Heinrich
    39 Iago Toral Quiroga
    54 Ian Romanick
     5 Icenowy Zheng
    14 Ilia Mirkin
     1 Illia Iorin
    12 James Zhu
     2 Jan Vesely
   202 Jason Ekstrand
     1 Jian-Hong Pan
     1 Jiang, Sonny
     3 John Stultz
     1 Jon Turney
    21 Jonathan Marek
    22 Jordan Justen
     4 Jose Maria Casanova Crespo
     1 José Fonseca
    16 Juan A. Suarez Romero
     5 Julien Isorce
     2 Józef Kucia
    82 Karol Herbst
     3 Kasireddy, Vivek
   866 Kenneth Graunke
     1 Kevin Strasser
     1 Khaled Emara
     1 Khem Raj
     1 Kishore Kadiyala
     1 Konstantin Kharlamov
    49 Kristian Høgsberg
     5 Leo Liu
     2 Lepton Wu
   113 Lionel Landwerlin
     3 Lubomir Rintel
     3 Lucas Stach
   115 Marek Olšák
     1 Mario Kleiner
     5 Mark Janes
     2 Mateusz Krzak
    22 Mathias Fröhlich
     7 Matt Turner
     1 Matthias Lorenz
     6 Mauro Rossi
     1 Maya Rashish
    19 Michel Dänzer
     6 Mike Blumenkrantz
     1 Nanley Chery
     1 Neha Bhende
     9 Nicolai Hähnle
     3 Oscar Blumberg
     1 Patrick Lerda
     1 Patrick Rudolph
    12 Pierre Moreau
     3 Plamena Manolova
     7 Qiang Yu
    45 Rafael Antognolli
     1 Ray Zhang
     1 Rhys Kidd
    27 Rhys Perry
   130 Rob Clark
     2 Rob Herring
     1 Rodrigo Vivi
     2 Roland Scheidegger
     1 Romain Failliot
     1 Ross Burton
     1 Ryan Houdek
     9 Sagar Ghuge
     4 Samuel Iglesias Gonsálvez
   141 Samuel Pitoiset
     4 Sergii Romantsov
     1 Sonny Jiang
    42 Tapani Pälli
     5 Thomas Hellstrom
     1 Timo Aaltonen
    69 Timothy Arceri
    19 Timur Kristóf
     1 Tobias Klausmann
     1 Tomasz Figa
    17 Tomeu Vizoso
     8 Toni Lönnberg
     2 Topi Pohjolainen
     2 Vasily Khoruzhick
     4 Vinson Lee
     1 Vivek Kasireddy
     1 Xavier Bouchoux
     1 Yevhenii Kolesnikov
     1 coypu
     1 davidbepo
     1 grmat
     1 pal1000
     3 suresh guttula

git tag: mesa-19.1.0-rc1

https://mesa.freedesktop.org/archive/mesa-19.1.0-rc1.tar.xz
MD5: 3d94c1cebf9d607f361f312906efdad5 mesa-19.1.0-rc1.tar.xz
SHA1: a0abdc582beaba6f31f24ef622b32f9b8391cb76 mesa-19.1.0-rc1.tar.xz
SHA256: a09d0319e18783e2416d6dfce401bb872675874b332a91bfe880841eb6156e73 mesa-19.1.0-rc1.tar.xz
SHA512: a56215882a7c22b7b8fe57d5703914d674841e4045676e2cc2e7834d17f4d5a765516bec4f01eea6772c50e1d979cc430e032302f38c6e7a4274bc43a4d647b1 mesa-19.1.0-rc1.tar.xz
PGP: https://mesa.freedesktop.org/archive/mesa-19.1.0-rc1.tar.xz.sig
Re: [Mesa-dev] [PATCH] Revert "glx: Fix synthetic error generation in __glXSendError"
On Tue, 2019-05-07 at 20:17 +1000, Timothy Arceri wrote:

> I don't know enough about this code to take responsibility for such
> changes. I was just trying to revert to the status quo until this could
> be investigated again.
>
> My suggestion is we roll back the recent change. Then someone needs to
> create a piglit test for both scenarios before trying to move forward again.
>
> If you want to try something different then go for it :)

Don't feel bad about not understanding this code. Honestly what libGL
is doing here - fabricating an X11 protocol error out of thin air - is
completely unique to libGL among the X client libraries, and it's not
surprising that attempting to do so causes issues.

I'm fine with any amount of revert here. I can't promise I'll have the
time to investigate this in any detail promptly though.

- ajax
Re: [Mesa-dev] [PATCH 5/5] gallium/util: fix two MSVC compiler warnings
For the series:

Reviewed-by: Roland Scheidegger

On 04.05.19 at 18:07, Brian Paul wrote:
> Remove stray const qualifier.
> s/unsigned/enum tgsi_semantic/
> ---
>  src/gallium/auxiliary/util/u_format_zs.h      | 2 +-
>  src/gallium/auxiliary/util/u_simple_shaders.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_format_zs.h
> b/src/gallium/auxiliary/util/u_format_zs.h
> index 160919d..bed3c51 100644
> --- a/src/gallium/auxiliary/util/u_format_zs.h
> +++ b/src/gallium/auxiliary/util/u_format_zs.h
> @@ -113,7 +113,7 @@ void
>  util_format_z24_unorm_s8_uint_pack_s_8uint(uint8_t *dst_row, unsigned
> dst_stride, const uint8_t *src_row, unsigned src_stride, unsigned width,
> unsigned height);
>
>  void
> -util_format_z24_unorm_s8_uint_pack_separate(uint8_t *dst_row, unsigned
> dst_stride, const uint32_t *z_src_row, unsigned z_src_stride, const uint8_t
> *s_src_row, unsigned s_src_stride, const unsigned width, unsigned height);
> +util_format_z24_unorm_s8_uint_pack_separate(uint8_t *dst_row, unsigned
> dst_stride, const uint32_t *z_src_row, unsigned z_src_stride, const uint8_t
> *s_src_row, unsigned s_src_stride, unsigned width, unsigned height);
>
>  void
>  util_format_s8_uint_z24_unorm_unpack_z_float(float *dst_row, unsigned
> dst_stride, const uint8_t *src_row, unsigned src_stride, unsigned width,
> unsigned height);
> diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c
> b/src/gallium/auxiliary/util/u_simple_shaders.c
> index 4046ab1..d62a655 100644
> --- a/src/gallium/auxiliary/util/u_simple_shaders.c
> +++ b/src/gallium/auxiliary/util/u_simple_shaders.c
> @@ -117,8 +117,8 @@ util_make_vertex_passthrough_shader_with_so(struct
> pipe_context *pipe,
>
>  void *util_make_layered_clear_vertex_shader(struct pipe_context *pipe)
>  {
> -   const unsigned semantic_names[] = {TGSI_SEMANTIC_POSITION,
> -                                      TGSI_SEMANTIC_GENERIC};
> +   const enum tgsi_semantic semantic_names[] = {TGSI_SEMANTIC_POSITION,
> +                                                TGSI_SEMANTIC_GENERIC};
>     const unsigned semantic_indices[] = {0, 0};
>
>     return util_make_vertex_passthrough_shader_with_so(pipe, 2,
> semantic_names,
>
Re: [Mesa-dev] [PATCH] radv: call constant folding before opt algebraic
What games are affected btw? Can you please double check before pushing because of the flrp changes that landed around? On 5/7/19 7:14 AM, Timothy Arceri wrote: ping! On 2/5/19 1:38 pm, Timothy Arceri wrote: The pattern of calling opt algebraic first seems to have originated in i965. The order in OpenGL drivers generally doesn't matter because the GLSL IR optimisations do constant folding before opt algebraic. However in Vulkan drivers calling opt algebraic first can result in missed constant folding opportunities. vkpipeline-db results (VEGA64): Totals from affected shaders: SGPRS: 3160 -> 3176 (0.51 %) VGPRS: 3588 -> 3580 (-0.22 %) Spilled SGPRs: 52 -> 44 (-15.38 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 12 -> 12 (0.00 %) dwords per thread Code Size: 261812 -> 261036 (-0.30 %) bytes LDS: 7 -> 7 (0.00 %) blocks Max Waves: 346 -> 348 (0.58 %) Wait states: 0 -> 0 (0.00 %) --- src/amd/vulkan/radv_shader.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c index cd5a9f2afb4..ad7b2439735 100644 --- a/src/amd/vulkan/radv_shader.c +++ b/src/amd/vulkan/radv_shader.c @@ -162,8 +162,8 @@ radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively, NIR_PASS(progress, shader, nir_opt_dead_cf); NIR_PASS(progress, shader, nir_opt_cse); NIR_PASS(progress, shader, nir_opt_peephole_select, 8, true, true); - NIR_PASS(progress, shader, nir_opt_algebraic); NIR_PASS(progress, shader, nir_opt_constant_folding); + NIR_PASS(progress, shader, nir_opt_algebraic); NIR_PASS(progress, shader, nir_opt_undef); NIR_PASS(progress, shader, nir_opt_conditional_discard); if (shader->options->max_unroll_iterations) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 102522] [radeonsi, bisected] commit 147d7fb772 causes full-window map to flash green in Crea
https://bugs.freedesktop.org/show_bug.cgi?id=102522

Kaelyn T changed:

           What    |Removed                     |Added
             Status|NEEDINFO                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Kaelyn T ---
I just tried replaying my trace using apitrace 8.0 and the flashing issue
didn't occur. I suspect the underlying issue may have been fixed since I
previously replayed the trace last June, as I'm currently running Mesa
19.0.3, LLVM 8.0.0, Xorg 1.20.4, and kernel 5.0.10.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] radv: call constant folding before opt algebraic
Nope, r-b On Tue, May 7, 2019 at 8:36 AM Samuel Pitoiset wrote: > > Seems fine to, > > Reviewed-by: Samuel Pitoiset > > Bas, any comments? > > On 5/7/19 7:14 AM, Timothy Arceri wrote: > > ping! > > > > On 2/5/19 1:38 pm, Timothy Arceri wrote: > >> The pattern of calling opt algebraic first seems to have originated > >> in i965. The order in OpenGL drivers generally doesn't matter > >> because the GLSL IR optimisations do constant folding before > >> opt algebraic. > >> > >> However in Vulkan drivers calling opt algebraic first can result > >> in missed constant folding opportunities. > >> > >> vkpipeline-db results (VEGA64): > >> > >> Totals from affected shaders: > >> SGPRS: 3160 -> 3176 (0.51 %) > >> VGPRS: 3588 -> 3580 (-0.22 %) > >> Spilled SGPRs: 52 -> 44 (-15.38 %) > >> Spilled VGPRs: 0 -> 0 (0.00 %) > >> Private memory VGPRs: 0 -> 0 (0.00 %) > >> Scratch size: 12 -> 12 (0.00 %) dwords per thread > >> Code Size: 261812 -> 261036 (-0.30 %) bytes > >> LDS: 7 -> 7 (0.00 %) blocks > >> Max Waves: 346 -> 348 (0.58 %) > >> Wait states: 0 -> 0 (0.00 %) > >> --- > >> src/amd/vulkan/radv_shader.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c > >> index cd5a9f2afb4..ad7b2439735 100644 > >> --- a/src/amd/vulkan/radv_shader.c > >> +++ b/src/amd/vulkan/radv_shader.c > >> @@ -162,8 +162,8 @@ radv_optimize_nir(struct nir_shader *shader, bool > >> optimize_conservatively, > >> NIR_PASS(progress, shader, nir_opt_dead_cf); > >> NIR_PASS(progress, shader, nir_opt_cse); > >> NIR_PASS(progress, shader, nir_opt_peephole_select, > >> 8, true, true); > >> -NIR_PASS(progress, shader, nir_opt_algebraic); > >> NIR_PASS(progress, shader, nir_opt_constant_folding); > >> +NIR_PASS(progress, shader, nir_opt_algebraic); > >> NIR_PASS(progress, shader, nir_opt_undef); > >> NIR_PASS(progress, shader, > >> nir_opt_conditional_discard); > >> if (shader->options->max_unroll_iterations) { > >> > > ___ > 
> mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] panfrost: Fix two uninitialized accesses in compiler
Valgrind was complaining about these.

NIR_PASS only sets progress to TRUE if there was progress.

nir_const_load_to_arr() only sets as many constants as the instruction
has components.

This was causing some dEQP tests to flip-flop.

Signed-off-by: Tomeu Vizoso
---
 src/gallium/drivers/panfrost/midgard/midgard_compile.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/panfrost/midgard/midgard_compile.c b/src/gallium/drivers/panfrost/midgard/midgard_compile.c
index 3a1f805702e2..0cdde46028fc 100644
--- a/src/gallium/drivers/panfrost/midgard/midgard_compile.c
+++ b/src/gallium/drivers/panfrost/midgard/midgard_compile.c
@@ -915,7 +915,7 @@ optimise_nir(nir_shader *nir)
                 NIR_PASS(progress, nir, nir_opt_constant_folding);

                 if (lower_flrp != 0) {
-                        bool lower_flrp_progress;
+                        bool lower_flrp_progress = false;
                         NIR_PASS(lower_flrp_progress,
                                  nir,
                                  nir_lower_flrp,
@@ -1020,7 +1020,7 @@ emit_load_const(compiler_context *ctx, nir_load_const_instr *instr)
 {
         nir_ssa_def def = instr->def;

-        float *v = ralloc_array(NULL, float, 4);
+        float *v = rzalloc_array(NULL, float, 4);
         nir_const_load_to_arr(v, instr, f32);
         _mesa_hash_table_u64_insert(ctx->ssa_constants, def.index + 1, v);
 }
-- 
2.20.1
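The first fix in this patch is easier to see in isolation. The sketch below uses a simplified stand-in macro, `RUN_PASS`, that mimics the behaviour the commit message attributes to NIR_PASS: it writes to the progress flag only when the pass reports progress and never clears it. The macro, the two toy passes, and `run_once()` are all hypothetical and written for this note; only the "assign on success, otherwise leave untouched" behaviour is taken from the message above.

```c
#include <assert.h>   /* only needed for the checks described below */
#include <stdbool.h>

/* Simplified stand-in for NIR_PASS: the flag is set on progress and is
 * otherwise left untouched, so it is never written with `false`. */
#define RUN_PASS(progress, pass, arg) \
   do {                               \
      if (pass(arg))                  \
         (progress) = true;           \
   } while (0)

/* Toy passes operating on an int instead of a shader. */
static bool
pass_without_progress(int *shader)
{
   (void)shader;
   return false;
}

static bool
pass_with_progress(int *shader)
{
   (*shader)++;
   return true;
}

static bool
run_once(int *shader, bool (*pass)(int *))
{
   /* Without this initializer, a pass that makes no progress would leave
    * `progress` indeterminate, which is the kind of read Valgrind flags. */
   bool progress = false;

   RUN_PASS(progress, pass, shader);
   return progress;
}
```

The design choice of never writing `false` lets several passes accumulate into one flag inside an optimisation loop, at the cost of requiring every flag to be initialised by the caller.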
Re: [Mesa-dev] Mesa (master): nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently
On 5/7/19 5:30 PM, Ian Romanick wrote: On 5/7/19 8:20 AM, Samuel Pitoiset wrote: This introduces glitches with Talos and Serious Sam 2017 with RADV... Are you able to reproduce the problem with ANV? Probably not very easily. If you can figure out which shader it is, it should be easy to figure out the problem from before / after NIR. The NIR before/after is *very* different, not easy to find the differences actually. On 5/7/19 8:01 AM, GitLab Mirror wrote: Module: Mesa Branch: master Commit: 5b908db604b2f47bb8382047533e556db8d5f52b URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=5b908db604b2f47bb8382047533e556db8d5f52b Author: Ian Romanick Date: Tue Aug 21 17:17:24 2018 -0700 nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently No changes on any other Intel platforms. v2: Rebase on 424372e5dd5 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8189888 -> 8153912 (-0.44%) instructions in affected programs: 1199037 -> 1163061 (-3.00%) helped: 4124 HURT: 10 helped stats (abs) min: 1 max: 40 x̄: 8.73 x̃: 9 helped stats (rel) min: 0.20% max: 86.96% x̄: 4.96% x̃: 3.02% HURT stats (abs) min: 1 max: 2 x̄: 1.20 x̃: 1 HURT stats (rel) min: 1.06% max: 3.92% x̄: 1.62% x̃: 1.06% 95% mean confidence interval for instructions value: -8.84 -8.56 95% mean confidence interval for instructions %-change: -5.12% -4.77% Instructions are helped. total cycles in shared programs: 188606710 -> 188426964 (-0.10%) cycles in affected programs: 27505596 -> 27325850 (-0.65%) helped: 4026 HURT: 77 helped stats (abs) min: 2 max: 646 x̄: 44.99 x̃: 46 helped stats (rel) min: <.01% max: 94.58% x̄: 2.35% x̃: 0.85% HURT stats (abs) min: 2 max: 376 x̄: 17.79 x̃: 6 HURT stats (rel) min: <.01% max: 2.60% x̄: 0.22% x̃: 0.04% 95% mean confidence interval for cycles value: -44.75 -42.87 95% mean confidence interval for cycles %-change: -2.44% -2.17% Cycles are helped. 
LOST: 3 GAINED: 35 Reviewed-by: Matt Turner --- src/compiler/nir/nir_lower_flrp.c | 134 ++ 1 file changed, 134 insertions(+) diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c index 952068ec9cc..c041fefc52b 100644 --- a/src/compiler/nir/nir_lower_flrp.c +++ b/src/compiler/nir/nir_lower_flrp.c @@ -137,6 +137,89 @@ replace_with_fast(struct nir_builder *bld, struct u_vector *dead_flrp, append_flrp_to_dead_list(dead_flrp, alu); } +/** + * Replace flrp(a, b, c) with (b*c ± c) + a + */ +static void +replace_with_expanded_ffma_and_add(struct nir_builder *bld, + struct u_vector *dead_flrp, + struct nir_alu_instr *alu, bool subtract_c) +{ + nir_ssa_def *const a = nir_ssa_for_alu_src(bld, alu, 0); + nir_ssa_def *const b = nir_ssa_for_alu_src(bld, alu, 1); + nir_ssa_def *const c = nir_ssa_for_alu_src(bld, alu, 2); + + nir_ssa_def *const b_times_c = nir_fadd(bld, b, c); + nir_instr_as_alu(b_times_c->parent_instr)->exact = alu->exact; + + nir_ssa_def *inner_sum; + + if (subtract_c) { + nir_ssa_def *const neg_c = nir_fneg(bld, c); + nir_instr_as_alu(neg_c->parent_instr)->exact = alu->exact; + + inner_sum = nir_fadd(bld, b_times_c, neg_c); + } else { + inner_sum = nir_fadd(bld, b_times_c, c); + } + + nir_instr_as_alu(inner_sum->parent_instr)->exact = alu->exact; + + nir_ssa_def *const outer_sum = nir_fadd(bld, inner_sum, a); + nir_instr_as_alu(outer_sum->parent_instr)->exact = alu->exact; + + nir_ssa_def_rewrite_uses(>dest.dest.ssa, nir_src_for_ssa(outer_sum)); + + /* DO NOT REMOVE the original flrp yet. Many of the lowering choices are + * based on other uses of the sources. Removing the flrp may cause the + * last flrp in a sequence to make a different, incorrect choice. + */ + append_flrp_to_dead_list(dead_flrp, alu); +} + +/** + * Determines whether a swizzled source is constant w/ all components the same. + * + * The value of the constant is stored in \c result. 
+ * + * \return + * True if all components of the swizzled source are the same constant. + * Otherwise false is returned. + */ +static bool +all_same_constant(const nir_alu_instr *instr, unsigned src, double *result) +{ + nir_const_value *val = nir_src_as_const_value(instr->src[src].src); + + if (!val) + return false; + + const uint8_t *const swizzle = instr->src[src].swizzle; + const unsigned num_components = nir_dest_num_components(instr->dest.dest); + + if (instr->dest.dest.ssa.bit_size == 32) { + const float first = val[swizzle[0]].f32; + + for (unsigned i = 1; i < num_components; i++) { + if (val[swizzle[i]].f32 != first) + return false; + } + + *result = first; + } else { + const double first = val[swizzle[0]].f64; + + for (unsigned i = 1; i < num_components; i++) { + if
Re: [Mesa-dev] Mesa (master): nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently
On 5/7/19 8:20 AM, Samuel Pitoiset wrote: > This introduces glitches with Talos and Serious Sam 2017 with RADV... > > Are you able to reproduce the problem with ANV? Probably not very easily. If you can figure out which shader it is, it should be easy to figure out the problem from before / after NIR. > On 5/7/19 8:01 AM, GitLab Mirror wrote: >> Module: Mesa >> Branch: master >> Commit: 5b908db604b2f47bb8382047533e556db8d5f52b >> URL: >> http://cgit.freedesktop.org/mesa/mesa/commit/?id=5b908db604b2f47bb8382047533e556db8d5f52b >> >> >> Author: Ian Romanick >> Date: Tue Aug 21 17:17:24 2018 -0700 >> >> nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently >> >> No changes on any other Intel platforms. >> >> v2: Rebase on 424372e5dd5 ("nir: Use the flrp lowering pass instead of >> nir_opt_algebraic") >> >> Iron Lake and GM45 had similar results. (Iron Lake shown) >> total instructions in shared programs: 8189888 -> 8153912 (-0.44%) >> instructions in affected programs: 1199037 -> 1163061 (-3.00%) >> helped: 4124 >> HURT: 10 >> helped stats (abs) min: 1 max: 40 x̄: 8.73 x̃: 9 >> helped stats (rel) min: 0.20% max: 86.96% x̄: 4.96% x̃: 3.02% >> HURT stats (abs) min: 1 max: 2 x̄: 1.20 x̃: 1 >> HURT stats (rel) min: 1.06% max: 3.92% x̄: 1.62% x̃: 1.06% >> 95% mean confidence interval for instructions value: -8.84 -8.56 >> 95% mean confidence interval for instructions %-change: -5.12% -4.77% >> Instructions are helped. >> >> total cycles in shared programs: 188606710 -> 188426964 (-0.10%) >> cycles in affected programs: 27505596 -> 27325850 (-0.65%) >> helped: 4026 >> HURT: 77 >> helped stats (abs) min: 2 max: 646 x̄: 44.99 x̃: 46 >> helped stats (rel) min: <.01% max: 94.58% x̄: 2.35% x̃: 0.85% >> HURT stats (abs) min: 2 max: 376 x̄: 17.79 x̃: 6 >> HURT stats (rel) min: <.01% max: 2.60% x̄: 0.22% x̃: 0.04% >> 95% mean confidence interval for cycles value: -44.75 -42.87 >> 95% mean confidence interval for cycles %-change: -2.44% -2.17% >> Cycles are helped. 
>> >> LOST: 3 >> GAINED: 35 >> >> Reviewed-by: Matt Turner >> >> --- >> >> src/compiler/nir/nir_lower_flrp.c | 134 >> ++ >> 1 file changed, 134 insertions(+) >> >> diff --git a/src/compiler/nir/nir_lower_flrp.c >> b/src/compiler/nir/nir_lower_flrp.c >> index 952068ec9cc..c041fefc52b 100644 >> --- a/src/compiler/nir/nir_lower_flrp.c >> +++ b/src/compiler/nir/nir_lower_flrp.c >> @@ -137,6 +137,89 @@ replace_with_fast(struct nir_builder *bld, struct >> u_vector *dead_flrp, >> append_flrp_to_dead_list(dead_flrp, alu); >> } >> +/** >> + * Replace flrp(a, b, c) with (b*c ± c) + a >> + */ >> +static void >> +replace_with_expanded_ffma_and_add(struct nir_builder *bld, >> + struct u_vector *dead_flrp, >> + struct nir_alu_instr *alu, bool >> subtract_c) >> +{ >> + nir_ssa_def *const a = nir_ssa_for_alu_src(bld, alu, 0); >> + nir_ssa_def *const b = nir_ssa_for_alu_src(bld, alu, 1); >> + nir_ssa_def *const c = nir_ssa_for_alu_src(bld, alu, 2); >> + >> + nir_ssa_def *const b_times_c = nir_fadd(bld, b, c); >> + nir_instr_as_alu(b_times_c->parent_instr)->exact = alu->exact; >> + >> + nir_ssa_def *inner_sum; >> + >> + if (subtract_c) { >> + nir_ssa_def *const neg_c = nir_fneg(bld, c); >> + nir_instr_as_alu(neg_c->parent_instr)->exact = alu->exact; >> + >> + inner_sum = nir_fadd(bld, b_times_c, neg_c); >> + } else { >> + inner_sum = nir_fadd(bld, b_times_c, c); >> + } >> + >> + nir_instr_as_alu(inner_sum->parent_instr)->exact = alu->exact; >> + >> + nir_ssa_def *const outer_sum = nir_fadd(bld, inner_sum, a); >> + nir_instr_as_alu(outer_sum->parent_instr)->exact = alu->exact; >> + >> + nir_ssa_def_rewrite_uses(>dest.dest.ssa, >> nir_src_for_ssa(outer_sum)); >> + >> + /* DO NOT REMOVE the original flrp yet. Many of the lowering >> choices are >> + * based on other uses of the sources. Removing the flrp may >> cause the >> + * last flrp in a sequence to make a different, incorrect choice. 
>> + */ >> + append_flrp_to_dead_list(dead_flrp, alu); >> +} >> + >> +/** >> + * Determines whether a swizzled source is constant w/ all components >> the same. >> + * >> + * The value of the constant is stored in \c result. >> + * >> + * \return >> + * True if all components of the swizzled source are the same constant. >> + * Otherwise false is returned. >> + */ >> +static bool >> +all_same_constant(const nir_alu_instr *instr, unsigned src, double >> *result) >> +{ >> + nir_const_value *val = nir_src_as_const_value(instr->src[src].src); >> + >> + if (!val) >> + return false; >> + >> + const uint8_t *const swizzle = instr->src[src].swizzle; >> + const unsigned num_components = >> nir_dest_num_components(instr->dest.dest); >> + >> + if (instr->dest.dest.ssa.bit_size == 32) { >> + const float first = val[swizzle[0]].f32; >> + >> + for (unsigned i = 1; i < num_components;
Re: [Mesa-dev] Mesa (master): nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently
This introduces glitches with Talos and Serious Sam 2017 with RADV... Are you able to reproduce the problem with ANV? On 5/7/19 8:01 AM, GitLab Mirror wrote: Module: Mesa Branch: master Commit: 5b908db604b2f47bb8382047533e556db8d5f52b URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=5b908db604b2f47bb8382047533e556db8d5f52b Author: Ian Romanick Date: Tue Aug 21 17:17:24 2018 -0700 nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently No changes on any other Intel platforms. v2: Rebase on 424372e5dd5 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8189888 -> 8153912 (-0.44%) instructions in affected programs: 1199037 -> 1163061 (-3.00%) helped: 4124 HURT: 10 helped stats (abs) min: 1 max: 40 x̄: 8.73 x̃: 9 helped stats (rel) min: 0.20% max: 86.96% x̄: 4.96% x̃: 3.02% HURT stats (abs) min: 1 max: 2 x̄: 1.20 x̃: 1 HURT stats (rel) min: 1.06% max: 3.92% x̄: 1.62% x̃: 1.06% 95% mean confidence interval for instructions value: -8.84 -8.56 95% mean confidence interval for instructions %-change: -5.12% -4.77% Instructions are helped. total cycles in shared programs: 188606710 -> 188426964 (-0.10%) cycles in affected programs: 27505596 -> 27325850 (-0.65%) helped: 4026 HURT: 77 helped stats (abs) min: 2 max: 646 x̄: 44.99 x̃: 46 helped stats (rel) min: <.01% max: 94.58% x̄: 2.35% x̃: 0.85% HURT stats (abs) min: 2 max: 376 x̄: 17.79 x̃: 6 HURT stats (rel) min: <.01% max: 2.60% x̄: 0.22% x̃: 0.04% 95% mean confidence interval for cycles value: -44.75 -42.87 95% mean confidence interval for cycles %-change: -2.44% -2.17% Cycles are helped. 
LOST: 3 GAINED: 35 Reviewed-by: Matt Turner --- src/compiler/nir/nir_lower_flrp.c | 134 ++ 1 file changed, 134 insertions(+) diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c index 952068ec9cc..c041fefc52b 100644 --- a/src/compiler/nir/nir_lower_flrp.c +++ b/src/compiler/nir/nir_lower_flrp.c @@ -137,6 +137,89 @@ replace_with_fast(struct nir_builder *bld, struct u_vector *dead_flrp, append_flrp_to_dead_list(dead_flrp, alu); } +/** + * Replace flrp(a, b, c) with (b*c ± c) + a + */ +static void +replace_with_expanded_ffma_and_add(struct nir_builder *bld, + struct u_vector *dead_flrp, + struct nir_alu_instr *alu, bool subtract_c) +{ + nir_ssa_def *const a = nir_ssa_for_alu_src(bld, alu, 0); + nir_ssa_def *const b = nir_ssa_for_alu_src(bld, alu, 1); + nir_ssa_def *const c = nir_ssa_for_alu_src(bld, alu, 2); + + nir_ssa_def *const b_times_c = nir_fadd(bld, b, c); + nir_instr_as_alu(b_times_c->parent_instr)->exact = alu->exact; + + nir_ssa_def *inner_sum; + + if (subtract_c) { + nir_ssa_def *const neg_c = nir_fneg(bld, c); + nir_instr_as_alu(neg_c->parent_instr)->exact = alu->exact; + + inner_sum = nir_fadd(bld, b_times_c, neg_c); + } else { + inner_sum = nir_fadd(bld, b_times_c, c); + } + + nir_instr_as_alu(inner_sum->parent_instr)->exact = alu->exact; + + nir_ssa_def *const outer_sum = nir_fadd(bld, inner_sum, a); + nir_instr_as_alu(outer_sum->parent_instr)->exact = alu->exact; + + nir_ssa_def_rewrite_uses(>dest.dest.ssa, nir_src_for_ssa(outer_sum)); + + /* DO NOT REMOVE the original flrp yet. Many of the lowering choices are +* based on other uses of the sources. Removing the flrp may cause the +* last flrp in a sequence to make a different, incorrect choice. +*/ + append_flrp_to_dead_list(dead_flrp, alu); +} + +/** + * Determines whether a swizzled source is constant w/ all components the same. + * + * The value of the constant is stored in \c result. 
+ * + * \return + * True if all components of the swizzled source are the same constant. + * Otherwise false is returned. + */ +static bool +all_same_constant(const nir_alu_instr *instr, unsigned src, double *result) +{ + nir_const_value *val = nir_src_as_const_value(instr->src[src].src); + + if (!val) + return false; + + const uint8_t *const swizzle = instr->src[src].swizzle; + const unsigned num_components = nir_dest_num_components(instr->dest.dest); + + if (instr->dest.dest.ssa.bit_size == 32) { + const float first = val[swizzle[0]].f32; + + for (unsigned i = 1; i < num_components; i++) { + if (val[swizzle[i]].f32 != first) +return false; + } + + *result = first; + } else { + const double first = val[swizzle[0]].f64; + + for (unsigned i = 1; i < num_components; i++) { + if (val[swizzle[i]].f64 != first) +return false; + } + + *result = first; + } + + return true; +} + static bool sources_are_constants_with_similar_magnitudes(const nir_alu_instr *instr) { @@ -265,6 +348,57 @@ convert_flrp_instruction(nir_builder *bld, return; } + /* +* -
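The comment on replace_with_expanded_ffma_and_add in the quoted patch can be checked with a small scalar sketch. It assumes the usual definition flrp(a, b, c) = a*(1 - c) + b*c; the function names are hypothetical and this is not the NIR builder code. For a = 1 the expression becomes (1 - c) + b*c = (b*c - c) + a, and for a = -1 it becomes -(1 - c) + b*c = (b*c + c) + a. Note that the quoted hunk builds b_times_c with nir_fadd, which does not match the variable name or the comment; the sketch follows the comment and multiplies.

```c
#include <assert.h>
#include <stdbool.h>

/* Reference linear interpolation: flrp(a, b, c) = a*(1 - c) + b*c. */
static double flrp_ref(double a, double b, double c)
{
   return a * (1.0 - c) + b * c;
}

/* (b*c +/- c) + a. Only equivalent to flrp_ref when a is +1 (subtract c)
 * or -1 (add c). */
static double flrp_expanded(double a, double b, double c, bool subtract_c)
{
   const double b_times_c = b * c;
   const double inner_sum = subtract_c ? b_times_c - c : b_times_c + c;
   return inner_sum + a;
}
```

The inputs used to compare the two forms should be exactly representable (for example b = 3, c = 0.25) if the results are compared with ==, since the two expressions round differently in general.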
Re: [Mesa-dev] [PATCH] gallivm: fix broken 8-wide s3tc decoding
LGTM.

Reviewed-by: Jose Fonseca

From: srol...@vmware.com
Sent: Tuesday, May 7, 2019 03:12
To: Jose Fonseca; Brian Paul; mesa-dev@lists.freedesktop.org
Cc: Roland Scheidegger
Subject: [PATCH] gallivm: fix broken 8-wide s3tc decoding

From: Roland Scheidegger

Brian noticed there was an uninitialized var for the 8-wide case and 128 bit
blocks, which made it always crash. Likewise, the 64bit block case had
another crash bug due to type mismatch.
Color decode (used for all s3tc formats) also had a bogus shuffle for this
case, leading to decode artifacts.
Fix these all up, which makes the code actually work 8-wide. Note that it's
still not used - I've verified it works, and the generated assembly does look
quite a bit simpler actually (20-30% fewer instructions for the s3tc decode
part with avx2), however in practice it still seems to be slightly slower for
some unknown reason (tested with openarena) on my haswell box, so for now
continue to split things into 4-wide vectors before decoding.
---
 .../auxiliary/gallivm/lp_bld_format_s3tc.c    | 33 +--
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_s3tc.c b/src/gallium/auxiliary/gallivm/lp_bld_format_s3tc.c
index 9561c349dad..8f6e9bec18a 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format_s3tc.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format_s3tc.c
@@ -77,24 +77,17 @@ lp_build_uninterleave2_half(struct gallivm_state *gallivm,
                             unsigned lo_hi)
 {
    LLVMValueRef shuffle, elems[LP_MAX_VECTOR_LENGTH];
-   unsigned i, j;
+   unsigned i;

    assert(type.length <= LP_MAX_VECTOR_LENGTH);
    assert(lo_hi < 2);

    if (type.length * type.width == 256) {
-      assert(type.length >= 4);
-      for (i = 0, j = 0; i < type.length; ++i) {
-         if (i == type.length / 4) {
-            j = type.length;
-         } else if (i == type.length / 2) {
-            j = type.length / 2;
-         } else if (i == 3 * type.length / 4) {
-            j = 3 * type.length / 4;
-         } else {
-            j += 2;
-         }
-         elems[i] = lp_build_const_int32(gallivm, j + lo_hi);
+      assert(type.length == 8);
+      assert(type.width == 32);
+      const unsigned shufvals[8] = {0, 2, 8, 10, 4, 6, 12, 14};
+      for (i = 0; i < type.length; ++i) {
+         elems[i] = lp_build_const_int32(gallivm, shufvals[i] + lo_hi);
       }
    } else {
       for (i = 0; i < type.length; ++i) {
@@ -277,7 +270,7 @@ lp_build_gather_s3tc(struct gallivm_state *gallivm,
    }
    else {
       LLVMValueRef tmp[4], cc01, cc23;
-      struct lp_type lp_type32, lp_type64, lp_type32dxt;
+      struct lp_type lp_type32, lp_type64;
       memset(&lp_type32, 0, sizeof lp_type32);
       lp_type32.width = 32;
       lp_type32.length = length;
@@ -309,10 +302,14 @@ lp_build_gather_s3tc(struct gallivm_state *gallivm,
                                                  lp_build_const_extend_shuffle(gallivm, 2, 4), "");
          }
          if (length == 8) {
+            struct lp_type lp_type32_4;
+            memset(&lp_type32_4, 0, sizeof lp_type32_4);
+            lp_type32_4.width = 32;
+            lp_type32_4.length = 4;
             for (i = 0; i < 4; ++i) {
                tmp[0] = elems[i];
                tmp[1] = elems[i+4];
-               elems[i] = lp_build_concat(gallivm, tmp, lp_type32, 2);
+               elems[i] = lp_build_concat(gallivm, tmp, lp_type32_4, 2);
             }
          }
          cc01 = lp_build_interleave2_half(gallivm, lp_type32, elems[0], elems[1], 0);
@@ -811,7 +808,7 @@ s3tc_dxt3_to_rgba_aos(struct gallivm_state *gallivm,
       tmp = lp_build_select(, sel_mask, alpha_low, alpha_hi);
       bit_pos = LLVMBuildAnd(builder, bit_pos,
                              lp_build_const_int_vec(gallivm, type, 0xffdf), "");
-      /* Warning: slow shift with per element count */
+      /* Warning: slow shift with per element count (without avx2) */
       /*
        * Could do pshufb here as well - just use appropriate 2 bits in bit_pos
        * to select the right byte with pshufb. Then for the remaining one bit
@@ -1640,7 +1637,6 @@ s3tc_decode_block_dxt5(struct gallivm_state *gallivm,
                         lp_build_const_int_vec(gallivm, type16, 8), "");
    alpha = LLVMBuildBitCast(builder, alpha, i64t, "");
    shuffle1 = lp_build_const_shuffle1(gallivm, 0, 8);
-   /* XXX this shuffle broken with LLVM 2.8 */
    alpha0 = LLVMBuildShuffleVector(builder, alpha0, alpha0, shuffle1, "");
    alpha1 = LLVMBuildShuffleVector(builder, alpha1, alpha1, shuffle1, "");

@@ -2176,6 +2172,9 @@ lp_build_fetch_s3tc_rgba_aos(struct gallivm_state *gallivm,
       return rgba;
    }

+   /*
+    * Could use n > 8 here with avx2, but doesn't seem faster.
+    */
    if (n > 4) {
       unsigned count;
       LLVMTypeRef i8_vectype = LLVMVectorType(i8t, 4 * n);
--
2.17.1
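The "bogus shuffle" fix in lp_build_uninterleave2_half can be reproduced on plain arrays. The sketch below is an illustration under assumptions, not the gallivm code: old_indices replays the removed loop, shufvals is the table from the patch, and shuffle2 (a hypothetical helper) applies the indices with the LLVM shufflevector convention, where 0..7 select from the first source and 8..15 from the second.

```c
#include <assert.h>

/* Shuffle table introduced by the patch for 8 x 32-bit vectors. */
static const unsigned shufvals[8] = {0, 2, 8, 10, 4, 6, 12, 14};

/* Replays the index computation of the removed loop. */
static void old_indices(unsigned length, unsigned lo_hi, unsigned out[])
{
   unsigned i, j;

   for (i = 0, j = 0; i < length; ++i) {
      if (i == length / 4)
         j = length;
      else if (i == length / 2)
         j = length / 2;
      else if (i == 3 * length / 4)
         j = 3 * length / 4;
      else
         j += 2;
      out[i] = j + lo_hi;
   }
}

/* Two-source shuffle on int arrays: idx < 8 reads a, idx >= 8 reads b. */
static void shuffle2(const int a[8], const int b[8],
                     const unsigned idx[8], int out[8])
{
   for (unsigned i = 0; i < 8; ++i)
      out[i] = idx[i] < 8 ? a[idx[i]] : b[idx[i] - 8];
}
```

For length 8 and lo_hi 0 the old loop yields 2, 4, 8, 10, 4, 6, 6, 8: it skips element 0 and repeats others, whereas the table selects the even elements of each 128-bit half of both sources.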
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
On Tue, May 7, 2019 at 9:28 AM Ilia Mirkin wrote:

> Divergence is generally when multiple parallel "lanes" go in different
> directions -- a jump that some lanes take and others don't, which
> requires the GPU to execute some lanes first, and then the rest,
> separately.
>
> IMO better names might be is_scalar_swizzle or something.
>

I was going to make roughly the same comment but couldn't come up with a
good name suggestion. I like is_scalar_swizzle. :-D

> On Mon, May 6, 2019 at 11:00 PM Alyssa Rosenzweig wrote:
> >
> > This allows algebraic optimizations to check if the argument accesses
> > multiple distinct components of a vector. So a swizzle like "xyz" will
> > return true, but "yyy" will return false, as will a scalar. This can be
> > useful for optimizations on vector processors, where a convergent
> > swizzle can be done in one clock (replicating as if a scalar) but a
> > divergent one must be scalarized. In these cases, it is useful to
> > optimize differently based on whether the swizzle diverges. (Use case is
> > the "csel" condition on Midgard).
> >
> > Signed-off-by: Alyssa Rosenzweig
> > Cc: Jason Ekstrand
> > ---
> >  src/compiler/nir/nir_search_helpers.h | 16
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/src/compiler/nir/nir_search_helpers.h
> b/src/compiler/nir/nir_search_helpers.h
> > index 1624508993d..46d7c300643 100644
> > --- a/src/compiler/nir/nir_search_helpers.h
> > +++ b/src/compiler/nir/nir_search_helpers.h
> > @@ -143,6 +143,22 @@ is_not_const(nir_alu_instr *instr, unsigned src,
> UNUSED unsigned num_components,
> >     return !nir_src_is_const(instr->src[src].src);
> >  }
> >
> > +/* I.e. a vector that actually accesses multiple channels */
> > +
> > +static inline bool
> > +is_divergent_vector(nir_alu_instr *instr, UNUSED unsigned src, unsigned
> num_components,
> > +                    const uint8_t *swizzle)
> > +{
> > +   unsigned first_component = swizzle[0];
> > +
> > +   for (unsigned i = 1; i < num_components; ++i) {
> > +      if (swizzle[i] != first_component)
> > +         return true;
> > +   }
>
> Can num_components be 1? If so, then this will return false, whereas
> you probably wanted it to return true.
>

Yes, it can. Good catch!

> > +
> > +   return false;
> > +}
> > +
> >  static inline bool
> >  is_used_more_than_once(nir_alu_instr *instr)
> >  {
> > --
> > 2.20.1
Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper
Divergence is generally when multiple parallel "lanes" go in different
directions -- a jump that some lanes take and others don't, which
requires the GPU to execute some lanes first, and then the rest,
separately.

IMO better names might be is_scalar_swizzle or something.

On Mon, May 6, 2019 at 11:00 PM Alyssa Rosenzweig wrote:
>
> This allows algebraic optimizations to check if the argument accesses
> multiple distinct components of a vector. So a swizzle like "xyz" will
> return true, but "yyy" will return false, as will a scalar. This can be
> useful for optimizations on vector processors, where a convergent
> swizzle can be done in one clock (replicating as if a scalar) but a
> divergent one must be scalarized. In these cases, it is useful to
> optimize differently based on whether the swizzle diverges. (Use case is
> the "csel" condition on Midgard).
>
> Signed-off-by: Alyssa Rosenzweig
> Cc: Jason Ekstrand
> ---
>  src/compiler/nir/nir_search_helpers.h | 16
>  1 file changed, 16 insertions(+)
>
> diff --git a/src/compiler/nir/nir_search_helpers.h b/src/compiler/nir/nir_search_helpers.h
> index 1624508993d..46d7c300643 100644
> --- a/src/compiler/nir/nir_search_helpers.h
> +++ b/src/compiler/nir/nir_search_helpers.h
> @@ -143,6 +143,22 @@ is_not_const(nir_alu_instr *instr, unsigned src, UNUSED unsigned num_components,
>     return !nir_src_is_const(instr->src[src].src);
>  }
>
> +/* I.e. a vector that actually accesses multiple channels */
> +
> +static inline bool
> +is_divergent_vector(nir_alu_instr *instr, UNUSED unsigned src, unsigned num_components,
> +                    const uint8_t *swizzle)
> +{
> +   unsigned first_component = swizzle[0];
> +
> +   for (unsigned i = 1; i < num_components; ++i) {
> +      if (swizzle[i] != first_component)
> +         return true;
> +   }

Can num_components be 1? If so, then this will return false, whereas
you probably wanted it to return true.

> +
> +   return false;
> +}
> +
>  static inline bool
>  is_used_more_than_once(nir_alu_instr *instr)
>  {
> --
> 2.20.1
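To make the behaviour under discussion concrete, here is a standalone version of the helper's loop. It is a sketch only: the name swizzle_reads_multiple_channels is hypothetical, and the instr and src parameters are dropped because the quoted loop never reads them. Channel indices follow the usual x=0, y=1, z=2, w=3 numbering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the swizzle reads more than one distinct channel. */
static bool
swizzle_reads_multiple_channels(const uint8_t *swizzle, unsigned num_components)
{
   for (unsigned i = 1; i < num_components; ++i) {
      if (swizzle[i] != swizzle[0])
         return true;
   }

   /* Also reached when num_components == 1: the loop body never runs. */
   return false;
}
```

With num_components == 1 the function returns false, which is what the commit message describes for a scalar and is the case the review question is about.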
[Mesa-dev] [PATCH] radv: add a workaround for Monster Hunter World and LLVM 7&8
The load/store optimizer pass doesn't handle WaW hazards correctly
and this is the root cause of the reflection issue with
Monster Hunter World. AFAIK, it's the only game that is affected
by this issue.

LLVM 9 will be fixed but we need a workaround for older LLVM
versions, see https://reviews.llvm.org/D61313.

Cc: "19.0" "19.1"
Signed-off-by: Samuel Pitoiset
---
 src/amd/common/ac_llvm_util.c | 7 ---
 src/amd/common/ac_llvm_util.h | 1 +
 src/amd/vulkan/radv_debug.h   | 1 +
 src/amd/vulkan/radv_device.c  | 8
 src/amd/vulkan/radv_shader.c  | 2 ++
 5 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_llvm_util.c b/src/amd/common/ac_llvm_util.c
index 69446863b95..6063411310b 100644
--- a/src/amd/common/ac_llvm_util.c
+++ b/src/amd/common/ac_llvm_util.c
@@ -151,13 +151,14 @@ static LLVMTargetMachineRef ac_create_target_machine(enum radeon_family family,
 	LLVMTargetRef target = ac_get_llvm_target(triple);

 	snprintf(features, sizeof(features),
-		 "+DumpCode,-fp32-denormals,+fp64-denormals%s%s%s%s%s",
+		 "+DumpCode,-fp32-denormals,+fp64-denormals%s%s%s%s%s%s",
 		 HAVE_LLVM >= 0x0800 ? "" : ",+vgpr-spilling",
 		 tm_options & AC_TM_SISCHED ? ",+si-scheduler" : "",
 		 tm_options & AC_TM_FORCE_ENABLE_XNACK ? ",+xnack" : "",
 		 tm_options & AC_TM_FORCE_DISABLE_XNACK ? ",-xnack" : "",
-		 tm_options & AC_TM_PROMOTE_ALLOCA_TO_SCRATCH ? ",-promote-alloca" : "");
-
+		 tm_options & AC_TM_PROMOTE_ALLOCA_TO_SCRATCH ? ",-promote-alloca" : "",
+		 tm_options & AC_TM_NO_LOAD_STORE_OPT ? ",-load-store-opt" : "");
+
 	LLVMTargetMachineRef tm = LLVMCreateTargetMachine(
 	                             target,
 	                             triple,
diff --git a/src/amd/common/ac_llvm_util.h b/src/amd/common/ac_llvm_util.h
index 6d961c06f8a..ca00540da80 100644
--- a/src/amd/common/ac_llvm_util.h
+++ b/src/amd/common/ac_llvm_util.h
@@ -65,6 +65,7 @@ enum ac_target_machine_options {
 	AC_TM_CHECK_IR = (1 << 5),
 	AC_TM_ENABLE_GLOBAL_ISEL = (1 << 6),
 	AC_TM_CREATE_LOW_OPT = (1 << 7),
+	AC_TM_NO_LOAD_STORE_OPT = (1 << 8),
 };

 enum ac_float_mode {
diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
index 17a2f3370c0..652a3b677d2 100644
--- a/src/amd/vulkan/radv_debug.h
+++ b/src/amd/vulkan/radv_debug.h
@@ -51,6 +51,7 @@ enum {
 	RADV_DEBUG_CHECKIR = 0x20,
 	RADV_DEBUG_NOTHREADLLVM = 0x40,
 	RADV_DEBUG_NOBINNING = 0x80,
+	RADV_DEBUG_NO_LOAD_STORE_OPT = 0x100,
 };

 enum {
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 10956ded66f..caf1316096e 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -464,6 +464,7 @@ static const struct debug_control radv_debug_options[] = {
 	{"checkir", RADV_DEBUG_CHECKIR},
 	{"nothreadllvm", RADV_DEBUG_NOTHREADLLVM},
 	{"nobinning", RADV_DEBUG_NOBINNING},
+	{"noloadstoreopt", RADV_DEBUG_NO_LOAD_STORE_OPT},
 	{NULL, 0}
 };

@@ -510,6 +511,13 @@ radv_handle_per_app_options(struct radv_instance *instance,
 	} else if (!strcmp(name, "DOOM_VFR")) {
 		/* Work around a Doom VFR game bug */
 		instance->debug_flags |= RADV_DEBUG_NO_DYNAMIC_BOUNDS;
+	} else if (!strcmp(name, "MonsterHunterWorld.exe")) {
+		/* Workaround for a WaW hazard when LLVM moves/merges
+		 * load/store memory operations.
+		 * See https://reviews.llvm.org/D61313
+		 */
+		if (HAVE_LLVM < 0x900)
+			instance->debug_flags |= RADV_DEBUG_NO_LOAD_STORE_OPT;
 	}
 }

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 7568d59056c..b5ee8ac5c51 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -649,6 +649,8 @@ shader_variant_create(struct radv_device *device,
 		tm_options |= AC_TM_SISCHED;
 	if (options->check_ir)
 		tm_options |= AC_TM_CHECK_IR;
+	if (device->instance->debug_flags & RADV_DEBUG_NO_LOAD_STORE_OPT)
+		tm_options |= AC_TM_NO_LOAD_STORE_OPT;

 	thread_compiler = !(device->instance->debug_flags & RADV_DEBUG_NOTHREADLLVM);
 	radv_init_llvm_once();
--
2.21.0
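The ac_llvm_util.c hunk extends a pattern where each option bit contributes either a feature substring or an empty string to one snprintf call, which is why the format string gains one more %s. A reduced sketch of that pattern follows; build_features and the OPT_* values are hypothetical and only the two feature strings are taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag values, used by this sketch only. */
enum {
   OPT_SISCHED           = 1 << 0,
   OPT_NO_LOAD_STORE_OPT = 1 << 1,
};

/* Builds a comma-separated feature string. Every optional feature needs
 * its own %s in the format and its own conditional argument. */
static void build_features(char *out, size_t size, unsigned options)
{
   snprintf(out, size, "-fp32-denormals,+fp64-denormals%s%s",
            options & OPT_SISCHED ? ",+si-scheduler" : "",
            options & OPT_NO_LOAD_STORE_OPT ? ",-load-store-opt" : "");
}
```

Adding an argument without the matching %s would silently drop the feature, so the two changes in the hunk have to be made together.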
Re: [Mesa-dev] [PATCH] configure.ac: check for libdrm when using VL with X11
On Mon, May 06, 2019 at 04:38:20PM -0700, Alyssa Rosenzweig wrote:
> Wrong Alyssa, cc'ing the right one :)

Thank you for the CC, fellow Alyssa! :)

> On Mon, May 06, 2019 at 04:32:38PM +0100, Emil Velikov wrote:
> > Alyssa this should resolve the failure with minimal churn. Please let
> > me know if it works on your end or not.

Emil, this works for me. Thank you.

Tested-by: Alyssa Ross
Re: [Mesa-dev] [PATCH] Revert "glx: Fix synthetic error generation in __glXSendError"
On 7/5/19 6:27 pm, Michel Dänzer wrote:
> On 2019-05-07 5:55 a.m., Timothy Arceri wrote:
>> This reverts commit e91ee763c378d03883eb88cf0eadd8aa916f7878.
>>
>> This seems to have broken a number of wine games.
>>
>> Cc: Adam Jackson
>> Cc: Ian Romanick
>> Cc: Hal Gentz
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590
>> ---
>>  src/glx/glx_error.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/glx/glx_error.c b/src/glx/glx_error.c
>> index 712ecf8213d..653cbeb2d2a 100644
>> --- a/src/glx/glx_error.c
>> +++ b/src/glx/glx_error.c
>> @@ -54,7 +54,7 @@ __glXSendError(Display * dpy, int_fast8_t errorCode, uint_fast32_t resourceID,
>>        error.errorCode = glx_dpy->codes->first_error + errorCode;
>>     }
>>
>> -   error.sequenceNumber = dpy->last_request_read;
>> +   error.sequenceNumber = dpy->request;
>>     error.resourceID = resourceID;
>>     error.minorCode = minorCode;
>>     error.majorCode = glx_dpy->majorOpcode;
>> @@ -73,7 +73,7 @@ __glXSendErrorForXcb(Display * dpy, const xcb_generic_error_t *err)
>>
>>     error.type = X_Error;
>>     error.errorCode = err->error_code;
>> -   error.sequenceNumber = dpy->last_request_read;
>> +   error.sequenceNumber = err->sequence;
>>     error.resourceID = err->resource_id;
>>     error.minorCode = err->minor_code;
>>     error.majorCode = err->major_code;
>
> As-is, this will re-introduce
> https://bugs.freedesktop.org/show_bug.cgi?id=99781 . That one was about
> __glXSendErrorForXcb, while the regressions are about __glXSendError, so
> maybe only revert the __glXSendError hunk for now?

I don't know enough about this code to take responsibility for such
changes. I was just trying to revert to the status quo until this could
be investigated again.

My suggestion is we roll back the recent change. Then someone needs to
create a piglit test for both scenarios before trying to move forward again.

If you want to try something different then go for it :)
[Mesa-dev] [Bug 110606] [lib32] [vulkan-overlay-layer] build failure
https://bugs.freedesktop.org/show_bug.cgi?id=110606

Eric Engestrom changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Eric Engestrom ---
Fixed by:

commit 2d2927938f074f402cab28aa5322567a76cbde58
Author: Lionel Landwerlin
Date:   Fri May 3 16:42:55 2019 +0100

    vulkan/overlay-layer: fix cast errors

    Not quite sure what version of GCC/Clang produces errors (8.3.0
    locally was fine).

    v2: also fix an integer literal issue (Karol)

    Signed-off-by: Lionel Landwerlin
    Reviewed-by: Tapani Pälli (v1)
    Reviewed-by: Eric Engestrom

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 110606] [lib32] [vulkan-overlay-layer] build failure
https://bugs.freedesktop.org/show_bug.cgi?id=110606

Eric Engestrom changed:

           What    |Removed                     |Added
                 CC|                            |tehfr...@gmail.com

--- Comment #1 from Eric Engestrom ---
*** Bug 110607 has been marked as a duplicate of this bug. ***

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 110607] Vulkan overlay build broken on IA-32
https://bugs.freedesktop.org/show_bug.cgi?id=110607

Eric Engestrom changed:

           What    |Removed                     |Added
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #1 from Eric Engestrom ---
*** This bug has been marked as a duplicate of bug 110606 ***

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] Revert "glx: Fix synthetic error generation in __glXSendError"
On Tue, 7 May 2019 at 09:27, Michel Dänzer wrote:
>
> On 2019-05-07 5:55 a.m., Timothy Arceri wrote:
> > This reverts commit e91ee763c378d03883eb88cf0eadd8aa916f7878.
> >
> > This seems to have broken a number of wine games.
> >
> > Cc: Adam Jackson
> > Cc: Ian Romanick
> > Cc: Hal Gentz
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590
> > ---
> >  src/glx/glx_error.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/glx/glx_error.c b/src/glx/glx_error.c
> > index 712ecf8213d..653cbeb2d2a 100644
> > --- a/src/glx/glx_error.c
> > +++ b/src/glx/glx_error.c
> > @@ -54,7 +54,7 @@ __glXSendError(Display * dpy, int_fast8_t errorCode, uint_fast32_t resourceID,
> >        error.errorCode = glx_dpy->codes->first_error + errorCode;
> >     }
> >
> > -   error.sequenceNumber = dpy->last_request_read;
> > +   error.sequenceNumber = dpy->request;
> >     error.resourceID = resourceID;
> >     error.minorCode = minorCode;
> >     error.majorCode = glx_dpy->majorOpcode;
> > @@ -73,7 +73,7 @@ __glXSendErrorForXcb(Display * dpy, const xcb_generic_error_t *err)
> >
> >     error.type = X_Error;
> >     error.errorCode = err->error_code;
> > -   error.sequenceNumber = dpy->last_request_read;
> > +   error.sequenceNumber = err->sequence;
> >     error.resourceID = err->resource_id;
> >     error.minorCode = err->minor_code;
> >     error.majorCode = err->major_code;
> >
>
> As-is, this will re-introduce
> https://bugs.freedesktop.org/show_bug.cgi?id=99781 . That one was about
> __glXSendErrorForXcb, while the regressions are about __glXSendError, so
> maybe only revert the __glXSendError hunk for now?
>

Could not agree more. Can I suggest adding an inline comment, and even a
bugzilla link? Otherwise we're bound to copy/paste and break it again.

HTH
-Emil
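For readers following the thread, here is a rough sketch of what the partial revert plus Emil's inline comments might look like. It is only an illustration: `struct mock_display` and both helper names are hypothetical stand-ins, not Mesa or Xlib code. The only things taken from the thread are the two counter names (`request`, `last_request_read`) and the two bug links.

```c
#include <assert.h>

/* Hypothetical stand-in for the two Xlib Display counters the patch touches. */
struct mock_display {
   unsigned long request;           /* sequence number of the last request sent */
   unsigned long last_request_read; /* sequence number of the last event read */
};

/* Sequence number for errors synthesised by __glXSendError.
 *
 * Uses dpy->request, i.e. the value restored by the proposed revert.
 * Switching to last_request_read was reported to break wine games:
 * https://bugs.freedesktop.org/show_bug.cgi?id=110632
 */
static unsigned long
mock_seq_for_send_error(const struct mock_display *dpy)
{
   return dpy->request;
}

/* Sequence number for errors forwarded by __glXSendErrorForXcb.
 *
 * Kept at dpy->last_request_read (NOT reverted), because going back to
 * err->sequence would re-introduce:
 * https://bugs.freedesktop.org/show_bug.cgi?id=99781
 */
static unsigned long
mock_seq_for_xcb_error(const struct mock_display *dpy)
{
   return dpy->last_request_read;
}
```

The point of the comments is simply that each function records which bug its current choice guards against, so a later copy/paste between the two is easier to catch in review.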
[Mesa-dev] [Bug 110632] "glx: Fix synthetic error generation in __glXSendError" broke wine games on 32-bit
https://bugs.freedesktop.org/show_bug.cgi?id=110632

--- Comment #6 from Hi-Angel ---
(In reply to Ian Romanick from comment #1)
> Is it possible to get a backtrace from __glXSendError? I don't understand
> why this particular commit would change behavior from "not error" to "error".

Sure. Can you tell me offhand, though, on which line I should put an
"abort()" to cause a core dump on the error? I tried to get a stacktrace
both from gdb and winedbg, but for various reasons neither works.
Re: [Mesa-dev] [PATCH] Revert "glx: Fix synthetic error generation in __glXSendError"
On 2019-05-07 5:55 a.m., Timothy Arceri wrote:
> This reverts commit e91ee763c378d03883eb88cf0eadd8aa916f7878.
>
> This seems to have broken a number of wine games.
>
> Cc: Adam Jackson
> Cc: Ian Romanick
> Cc: Hal Gentz
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590
> ---
>  src/glx/glx_error.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/glx/glx_error.c b/src/glx/glx_error.c
> index 712ecf8213d..653cbeb2d2a 100644
> --- a/src/glx/glx_error.c
> +++ b/src/glx/glx_error.c
> @@ -54,7 +54,7 @@ __glXSendError(Display * dpy, int_fast8_t errorCode, uint_fast32_t resourceID,
>        error.errorCode = glx_dpy->codes->first_error + errorCode;
>     }
>
> -   error.sequenceNumber = dpy->last_request_read;
> +   error.sequenceNumber = dpy->request;
>     error.resourceID = resourceID;
>     error.minorCode = minorCode;
>     error.majorCode = glx_dpy->majorOpcode;
> @@ -73,7 +73,7 @@ __glXSendErrorForXcb(Display * dpy, const xcb_generic_error_t *err)
>
>     error.type = X_Error;
>     error.errorCode = err->error_code;
> -   error.sequenceNumber = dpy->last_request_read;
> +   error.sequenceNumber = err->sequence;
>     error.resourceID = err->resource_id;
>     error.minorCode = err->minor_code;
>     error.majorCode = err->major_code;
>

As-is, this will re-introduce
https://bugs.freedesktop.org/show_bug.cgi?id=99781 . That one was about
__glXSendErrorForXcb, while the regressions are about __glXSendError, so
maybe only revert the __glXSendError hunk for now?

--
Earthling Michel Dänzer               |              https://www.amd.com
Libre software enthusiast             |             Mesa and X developer
[Mesa-dev] [Bug 110345] Unrecoverable GPU crash with DiRT 4
https://bugs.freedesktop.org/show_bug.cgi?id=110345

--- Comment #16 from Samuel Pitoiset ---
Are you still able to reproduce with latest mesa/llvm git? We fixed some
issues that might or might not help.

Since last time, I tried (a bunch of times) to reproduce the GPU hang with
Dirt 4 without success, unfortunately.
[Mesa-dev] [Bug 110632] "glx: Fix synthetic error generation in __glXSendError" broke wine games on 32-bit
https://bugs.freedesktop.org/show_bug.cgi?id=110632

Michel Dänzer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zegen...@protonmail.com
[Mesa-dev] [Bug 110632] "glx: Fix synthetic error generation in __glXSendError" broke wine games on 32-bit
https://bugs.freedesktop.org/show_bug.cgi?id=110632

Michel Dänzer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hi-an...@yandex.ru

--- Comment #5 from Michel Dänzer ---
*** Bug 110590 has been marked as a duplicate of this bug. ***
[Mesa-dev] [Bug 110590] [Regression][Bisected] GTAⅣ under wine fails with GLXBadFBConfig
https://bugs.freedesktop.org/show_bug.cgi?id=110590

Michel Dänzer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #1 from Michel Dänzer ---
*** This bug has been marked as a duplicate of bug 110632 ***
[Mesa-dev] [Bug 106351] Freezes with plasmashell and steam client
https://bugs.freedesktop.org/show_bug.cgi?id=106351

Michel Dänzer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #18 from Michel Dänzer ---
Right, fixed by
https://gitlab.freedesktop.org/mesa/mesa/commit/fe2edb25dd5628c395a65b60998f11e839d2b458
, thanks for the reminder Timothy.
[Mesa-dev] [Bug 110603] Blocky and black opacity/alpha using RADV on some games
https://bugs.freedesktop.org/show_bug.cgi?id=110603

--- Comment #5 from Samuel Pitoiset ---
This is indeed weird.
Re: [Mesa-dev] [PATCH v4] anv: fix alphaToCoverage when there is no color attachment
On Mon, 2019-05-06 at 14:32 -0500, Jason Ekstrand wrote:
> On Mon, May 6, 2019 at 9:01 AM Iago Toral Quiroga wrote:
> > From: Samuel Iglesias Gonsálvez
> >
> > There are tests in CTS for alpha to coverage without a color attachment
> > that are failing. This happens because we remove the shader color
> > outputs when we don't have a valid color attachment for them, but when
> > alpha to coverage is enabled we still want to preserve the output
> > at location 0 since we need the alpha component. In that case we will
> > also need to create a null render target for RT 0.
> >
> > v2:
> > - We already create a null rt when we don't have any, so reuse that
> >   for this case (Jason)
> > - Simplify the code a bit (Iago)
> >
> > v3:
> > - Take alpha to coverage from the key and don't tie this to depth-only
> >   rendering only, we want the same behavior if we have multiple render
> >   targets but the one at location 0 is not used. (Jason).
> > - Rewrite commit message (Iago)
> >
> > v4:
> > - Make sure we take into account the array length of the shader outputs,
> >   which we were not handling correctly either and make sure we also
> >   create null render targets for any invalid array entries too. (Jason)
> >
> > Fixes the following CTS tests:
> > dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.*
> >
> > Signed-off-by: Samuel Iglesias Gonsálvez
> > Signed-off-by: Iago Toral Quiroga
> > ---
> >  src/intel/vulkan/anv_pipeline.c | 56 -
> >  1 file changed, 42 insertions(+), 14 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> > index 20eab548fb2..f15f0896266 100644
> > --- a/src/intel/vulkan/anv_pipeline.c
> > +++ b/src/intel/vulkan/anv_pipeline.c
> > @@ -823,14 +823,24 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler,
> >           continue;
> >
> >        const unsigned rt = var->data.location - FRAG_RESULT_DATA0;
> > -      /* Unused or out-of-bounds */
> > -      if (rt >= MAX_RTS || !(stage->key.wm.color_outputs_valid & (1 << rt)))
> > +      /* Out-of-bounds */
> > +      if (rt >= MAX_RTS)
> >           continue;
> >
> >        const unsigned array_len =
> >           glsl_type_is_array(var->type) ? glsl_get_length(var->type) : 1;
> >        assert(rt + array_len <= max_rt);
> >
> > +      /* Unused */
> > +      if (!(stage->key.wm.color_outputs_valid & BITFIELD_RANGE(rt, array_len))) {
> > +         /* If this is the RT at location 0 and we have alpha to coverage
> > +          * enabled we will have to create a null RT for it, so mark it as
> > +          * used.
> > +          */
> > +         if (rt > 0 || !stage->key.wm.alpha_to_coverage)
> > +            continue;
> > +      }
> > +
> >        for (unsigned i = 0; i < array_len; i++)
> >           rt_used[rt + i] = true;
> >     }
> > @@ -841,11 +851,22 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler,
> >           continue;
> >
> >        rt_to_bindings[i] = num_rts;
> > -      rt_bindings[rt_to_bindings[i]] = (struct anv_pipeline_binding) {
> > -         .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
> > -         .binding = 0,
> > -         .index = i,
> > -      };
> > +
> > +      if (stage->key.wm.color_outputs_valid & (1 << i)) {
> > +         rt_bindings[rt_to_bindings[i]] = (struct anv_pipeline_binding) {
> > +            .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
> > +            .binding = 0,
> > +            .index = i,
> > +         };
> > +      } else {
> > +         /* Setup a null render target */
> > +         rt_bindings[rt_to_bindings[i]] = (struct anv_pipeline_binding) {
> > +            .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
> > +            .binding = 0,
> > +            .index = UINT32_MAX,
> > +         };
> > +      }
> > +
> >        num_rts++;
> >     }
> >
> > @@ -855,14 +876,21 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler,
> >           continue;
> >
> >        const unsigned rt = var->data.location - FRAG_RESULT_DATA0;
> > +      const unsigned array_len =
> > +         glsl_type_is_array(var->type) ? glsl_get_length(var->type) : 1;
> > +
> >        if (rt >= MAX_RTS ||
> > -          !(stage->key.wm.color_outputs_valid & (1 << rt))) {
> > -         /* Unused or out-of-bounds, throw it away */
> > -         deleted_output =
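The first hunk of the quoted patch decides which render targets count as "used". A minimal standalone sketch of that decision follows, assuming a limit of 8 render targets. `SKETCH_MAX_RTS`, `sketch_bit_range` and `sketch_mark_output` are made-up names standing in for `MAX_RTS`, `BITFIELD_RANGE` and the loop body; this is not the driver code itself, and the extra bounds check on `rt + array_len` replaces the patch's `assert`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_RTS 8 /* assumed limit, stands in for MAX_RTS */

/* Mask with bits [start, start + count) set; stands in for BITFIELD_RANGE. */
static uint32_t
sketch_bit_range(unsigned start, unsigned count)
{
   assert(count > 0 && start + count <= SKETCH_MAX_RTS);
   return ((1u << count) - 1u) << start;
}

/* Mark the render targets covered by one fragment output as used.
 * Returns false when the output is skipped (out of bounds or unused).
 */
static bool
sketch_mark_output(uint32_t color_outputs_valid, bool alpha_to_coverage,
                   unsigned rt, unsigned array_len,
                   bool rt_used[SKETCH_MAX_RTS])
{
   /* Out-of-bounds */
   if (array_len == 0 || rt >= SKETCH_MAX_RTS ||
       rt + array_len > SKETCH_MAX_RTS)
      return false;

   /* Unused: no element of the output array has a valid attachment */
   if (!(color_outputs_valid & sketch_bit_range(rt, array_len))) {
      /* Location 0 is kept anyway when alpha to coverage is enabled,
       * because its alpha component is still needed.
       */
      if (rt > 0 || !alpha_to_coverage)
         return false;
   }

   for (unsigned i = 0; i < array_len; i++)
      rt_used[rt + i] = true;
   return true;
}
```

With `color_outputs_valid == 0`, an output at location 0 is marked as used only when `alpha_to_coverage` is true, which is the case that later gets a null render target (`.index = UINT32_MAX`) in the second hunk.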
Re: [Mesa-dev] [PATCH] radv: call constant folding before opt algebraic
Seems fine to me,

Reviewed-by: Samuel Pitoiset

Bas, any comments?

On 5/7/19 7:14 AM, Timothy Arceri wrote:
> ping!
>
> On 2/5/19 1:38 pm, Timothy Arceri wrote:
> > The pattern of calling opt algebraic first seems to have originated
> > in i965. The order in OpenGL drivers generally doesn't matter because
> > the GLSL IR optimisations do constant folding before opt algebraic.
> > However in Vulkan drivers calling opt algebraic first can result
> > in missed constant folding opportunities.
> >
> > vkpipeline-db results (VEGA64):
> >
> > Totals from affected shaders:
> > SGPRS: 3160 -> 3176 (0.51 %)
> > VGPRS: 3588 -> 3580 (-0.22 %)
> > Spilled SGPRs: 52 -> 44 (-15.38 %)
> > Spilled VGPRs: 0 -> 0 (0.00 %)
> > Private memory VGPRs: 0 -> 0 (0.00 %)
> > Scratch size: 12 -> 12 (0.00 %) dwords per thread
> > Code Size: 261812 -> 261036 (-0.30 %) bytes
> > LDS: 7 -> 7 (0.00 %) blocks
> > Max Waves: 346 -> 348 (0.58 %)
> > Wait states: 0 -> 0 (0.00 %)
> > ---
> >  src/amd/vulkan/radv_shader.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> > index cd5a9f2afb4..ad7b2439735 100644
> > --- a/src/amd/vulkan/radv_shader.c
> > +++ b/src/amd/vulkan/radv_shader.c
> > @@ -162,8 +162,8 @@ radv_optimize_nir(struct nir_shader *shader, bool optimize_conservatively,
> >                  NIR_PASS(progress, shader, nir_opt_dead_cf);
> >                  NIR_PASS(progress, shader, nir_opt_cse);
> >                  NIR_PASS(progress, shader, nir_opt_peephole_select, 8, true, true);
> > -                NIR_PASS(progress, shader, nir_opt_algebraic);
> >                  NIR_PASS(progress, shader, nir_opt_constant_folding);
> > +                NIR_PASS(progress, shader, nir_opt_algebraic);
> >                  NIR_PASS(progress, shader, nir_opt_undef);
> >                  NIR_PASS(progress, shader, nir_opt_conditional_discard);
> >                  if (shader->options->max_unroll_iterations) {
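As a generic illustration of why the order of two such passes can matter within a single iteration, here is a toy sketch. It is not NIR and does not reproduce the radv shaders measured above: the `toy_*` types and passes are invented for this example, with one folding rule (`const - const`) and one algebraic rule (`x * 1 -> x`) applied to the expression `x * (3 - 2)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR, unrelated to NIR's real data structures. */
enum toy_kind { TOY_CONST, TOY_VAR, TOY_SUB, TOY_MUL };

struct toy_expr {
   enum toy_kind kind;
   int value;                  /* only meaningful for TOY_CONST */
   struct toy_expr *lhs, *rhs; /* only meaningful for TOY_SUB / TOY_MUL */
};

static bool
toy_is_binary(const struct toy_expr *e)
{
   return e->kind == TOY_SUB || e->kind == TOY_MUL;
}

/* Toy constant folding: rewrite "const - const" into a single constant. */
static bool
toy_constant_folding(struct toy_expr *e)
{
   assert(e != NULL);
   if (!toy_is_binary(e))
      return false;

   bool progress = toy_constant_folding(e->lhs);
   progress |= toy_constant_folding(e->rhs);

   if (e->kind == TOY_SUB &&
       e->lhs->kind == TOY_CONST && e->rhs->kind == TOY_CONST) {
      e->value = e->lhs->value - e->rhs->value;
      e->kind = TOY_CONST;
      progress = true;
   }
   return progress;
}

/* Toy algebraic pass: rewrite "x * 1" into "x". It only matches a literal
 * constant operand, so it cannot see through an unfolded "3 - 2".
 */
static bool
toy_algebraic(struct toy_expr *e)
{
   assert(e != NULL);
   if (!toy_is_binary(e))
      return false;

   bool progress = toy_algebraic(e->lhs);
   progress |= toy_algebraic(e->rhs);

   if (e->kind == TOY_MUL &&
       e->rhs->kind == TOY_CONST && e->rhs->value == 1) {
      *e = *e->lhs; /* replace the multiply by its left operand */
      progress = true;
   }
   return progress;
}

/* Build x * (3 - 2) in caller-provided storage; n[0] is the root. */
static void
toy_build(struct toy_expr n[5])
{
   n[0] = (struct toy_expr){ TOY_MUL, 0, &n[1], &n[2] };
   n[1] = (struct toy_expr){ TOY_VAR, 0, NULL, NULL };
   n[2] = (struct toy_expr){ TOY_SUB, 0, &n[3], &n[4] };
   n[3] = (struct toy_expr){ TOY_CONST, 3, NULL, NULL };
   n[4] = (struct toy_expr){ TOY_CONST, 2, NULL, NULL };
}

/* Run both passes exactly once in the given order; return the root kind. */
static enum toy_kind
toy_run_once(bool fold_first)
{
   struct toy_expr n[5];
   toy_build(n);
   if (fold_first) {
      toy_constant_folding(&n[0]);
      toy_algebraic(&n[0]);
   } else {
      toy_algebraic(&n[0]);
      toy_constant_folding(&n[0]);
   }
   return n[0].kind;
}
```

In this toy, `toy_run_once(false)` leaves a multiply at the root (`x * 1`), while `toy_run_once(true)` reduces the whole expression to `x`. An optimisation loop that repeats its passes until none reports progress would reach the same result either way here, so the sketch only shows the ordering effect inside one iteration.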