Jason Ekstrand <ja...@jlekstrand.net> writes: > On Sep 16, 2016 3:04 PM, "Francisco Jerez" <curroje...@riseup.net> wrote: >> >> Not intended for upstream. Should cause a GPU hang if some thread is >> executed with a non-contiguous dispatch mask breaking assumptions of >> brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or >> Piglit regressions, while replacing brw_stage_has_packed_dispatch() >> with a dummy implementation that unconditionally returns true on top >> of this patch causes multiple GPU hangs. >> --- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 17 +++++++++++++++++ >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 +++++++++++++++++++++ >> 2 files changed, 38 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> index 042203d..b3eec49 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> @@ -33,6 +33,23 @@ using namespace brw::surface_access; >> void >> fs_visitor::emit_nir_code() >> { >> + if (brw_stage_has_packed_dispatch(stage, prog_data)) { > > Mind adding "0 &&" and merging this patch so we remain aware of the issue, > keep it building, and can easily test future hardware. > I guess that would work -- An alternative that would keep the NIR-to-i965 pass tidier would be to assert(devinfo->gen <= 9) in brw_stage_has_packed_dispatch() to make sure we don't forget to do a full Piglit/DEQP/CTS run with this patch applied when a new generation is powered on. I can also add a comment with a link to this patch.
>> + const fs_builder ubld = bld.exec_all().group(1, 0); >> + const fs_reg tmp = component(bld.vgrf(BRW_REGISTER_TYPE_UD), 0); >> + const fs_reg mask = (stage == MESA_SHADER_FRAGMENT ? > brw_vmask_reg() : >> + brw_dmask_reg()); >> + >> + ubld.ADD(tmp, mask, brw_imm_ud(1)); >> + ubld.AND(tmp, mask, tmp); >> + >> + /* This will loop forever if the dispatch mask doesn't have the > expected >> + * form '2^n-1', in which case tmp will be non-zero. >> + */ >> + bld.emit(BRW_OPCODE_DO); >> + bld.CMP(bld.null_reg_ud(), tmp, brw_imm_ud(0), BRW_CONDITIONAL_NZ); >> + set_predicate(BRW_PREDICATE_NORMAL, bld.emit(BRW_OPCODE_WHILE)); >> + } >> + >> /* emit the arrays used for inputs and outputs - load/store > intrinsics will >> * be converted to reads/writes of these arrays >> */ >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> index ba3bbdf..9f7a1f0 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> @@ -35,6 +35,27 @@ namespace brw { >> void >> vec4_visitor::emit_nir_code() >> { >> + if (brw_stage_has_packed_dispatch(stage, &prog_data->base)) { >> + const dst_reg tmp = writemask(dst_reg(this, glsl_type::uint_type), >> + WRITEMASK_X); >> + const src_reg mask = >> + > brw_swizzle(retype(stride(brw_vec4_reg(BRW_ARCHITECTURE_REGISTER_FILE, > BRW_ARF_STATE, 0), >> + 0, 4, 1), >> + BRW_REGISTER_TYPE_UD), >> + BRW_SWIZZLE_ZZZZ); > > Can we just do vec4_reg(brw_vmask_reg)? > That didn't seem to work, because both brw_dmask_reg() and brw_vmask_reg() return a register region that is equivalent to sr0.0.xxxx in Align16 addressing mode (maybe a bug of PATCH 1? It's unlikely to matter a lot in practice though), so the Z component where the DMask is stored in is not even part of the returned Align16 region. >> + >> + emit(ADD(tmp, mask, brw_imm_ud(1))); >> + emit(AND(tmp, mask, src_reg(tmp))); >> + >> + /* This will loop forever if the dispatch mask doesn't have the > expected >> + * form '2^n-1', in which case tmp will be non-zero. >> + */ >> + emit(BRW_OPCODE_DO); >> + emit(CMP(dst_null_ud(), src_reg(tmp), brw_imm_ud(0), >> + BRW_CONDITIONAL_NZ)); >> + emit(BRW_OPCODE_WHILE)->predicate = BRW_PREDICATE_NORMAL; >> + } >> + >> if (nir->num_uniforms > 0) >> nir_setup_uniforms(); >> >> -- >> 2.9.0 >>
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev