Jason Ekstrand <[email protected]> writes: > On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez <[email protected]> > wrote: > >> --- >> src/mesa/drivers/dri/i965/brw_fs.cpp | 26 ++++++++++++++++++++++++++ >> 1 file changed, 26 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >> b/src/mesa/drivers/dri/i965/brw_fs.cpp >> index 1f3b23b..7002346 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >> @@ -6547,6 +6547,32 @@ brw_compile_cs(const struct brw_compiler *compiler, >> void *log_data, >> } >> } >> >> + fs_visitor v32(compiler, log_data, mem_ctx, key, &prog_data->base, >> + NULL, /* Never used in core profile */ >> + shader, 32, shader_time_index); >> + if (!fail_msg && v8.max_dispatch_width >= 32 && >> + simd_required == 32) { >> > > I don't see where simd_required is getting aligned up to a power-of-two so > how are we expecting to hit this? Also, I took a look at the SIMD16 case > above, and we're hand-rolling simd_required there (which we shouldn't be). > Here's what I would suggest: >
Yeah, I had noticed that last night shortly after sending, the version of this patch I pushed in the branch shouldn't have this bug: https://cgit.freedesktop.org/~currojerez/mesa/commit/?h=i965-simd32-cs&id=3d7f22ffd567901c9561b93b0d6945a50a095997 > simd_required = DIV_ROUND_UP(...) > min_simd = 32 > > then, in each one we do > > if ((INTEL_DEBUG & DEBUG_NO16) && simd_required <= 16 && min_simd >= 16) { > if (min_simd == 8) > v16.import_uniforms(v8) > if (!v16.run_cs()) { > /* fail */ > } else { > /* success */ > min_simd = MIN2(min_simd, 16); > } > } > > with the obvious adjustments for 8 and 32. That way no8 and no16 both work > fine and we properly guarantee that we compile exactly one shader. > So if I'm understanding correctly you're asking me to rework the SIMD16 and SIMD8 back-end invocation so the no8 and no16 debugging options are taken into account for compute shaders? That would definitely make PATCH 23 unnecessary but will be somewhat more churn (not saying that it wouldn't make sense to rewrite brw_compile_cs() eventually). > Seem reasonable? > > --Jason > > >> + /* Try a SIMD32 compile */ >> + if (simd_required <= 8) >> + v32.import_uniforms(&v8); >> + else if (simd_required <= 16) >> + v32.import_uniforms(&v16); >> + >> + if (!v32.run_cs()) { >> + compiler->shader_perf_log(log_data, >> + "SIMD32 shader failed to compile: %s", >> + v16.fail_msg); >> + if (!cfg) { >> + fail_msg = >> + "Couldn't generate SIMD32 program and not " >> + "enough threads for SIMD16"; >> + } >> + } else { >> + cfg = v32.cfg; >> + prog_data->simd_size = 32; >> + } >> + } >> + >> if (unlikely(cfg == NULL)) { >> assert(fail_msg); >> if (error_str) >> -- >> 2.7.3 >> >> _______________________________________________ >> mesa-dev mailing list >> [email protected] >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >>
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
