This looks like the right idea to me too. It may sound a bit weird to do that per instruction, but d3d11 does that as well. (Some d3d versions just have a global flag basically forbidding or allowing any such fast math optimizations in the assembly, but I'm not actually sure everybody honors that without tesselation...)
For 1/9: Reviewed-by: Roland Scheidegger <srol...@vmware.com> 2/9 has a typo in the commit short log ("Instrutions"). FWIW surely on nv50 you could keep a single mad instruction for umad (sad maybe too?). (I'm actually wondering if the hw really can't do unfused float multiply+add as a single instruction but I know next to nothing about nvidia hw...) Roland Am 12.06.2017 um 12:42 schrieb Nicolai Hähnle: > On 11.06.2017 20:42, Karol Herbst wrote: >> Running Tomb Raider on Nouveau I found some flicker caused by ignoring >> precise >> modifiers on variables inside Nouveau. >> >> This series add precise/invariant handling to TGSI, which can be then >> used by >> drivers to disable certain unsafe optimisations which may otherwise alter >> calculations, which depend on having the same result across shaders. > > It's kind of amazing that we got this far without doing this. On the > radeonsi side, it's probably related to how conservative LLVM is. > > But this series is a good idea, since it might allow us to become more > aggressive with optimizations in radeonsi as well. > > >> This series fixes this bug in Tomb Raider and one CTS test for 4.4 and >> 4.5 >> >> Note on Patch 3: I really dislike how I tell glsl_to_tgsi_visitor to >> apply the >> precise flag on instruction emited in ir_assignment->rhs->accept(); >> but I found >> no other easy way to handle this. Maybe somebody of you has a better >> idea? > > Sent a suggestion, as well as comments on patches 4 & 5. Patches 1 & 2: > > Reviewed-by: Nicolai Hähnle <nicolai.haeh...@amd.com> > > >> >> Karol Herbst (9): >> tgsi: add precise flag to tgsi_instruction >> tgsi/dump: print _PRECISE modifier on Instrutions >> st/glsl_to_tgsi: handle precise modifier >> tgsi: populate precise >> tgsi/text: parse _PRECISE modifier >> nv50/ir: add precise field to Instruction >> nv50/ir/tgsi: handle precise for most ALU instructions >> nv50/ir: disable mul+add to mad for precise instructions >> nv50/ir/tgsi: split mad to mul+add >> >> src/gallium/auxiliary/tgsi/tgsi_build.c | 4 + >> src/gallium/auxiliary/tgsi/tgsi_dump.c | 4 + >> src/gallium/auxiliary/tgsi/tgsi_text.c | 15 +++- >> src/gallium/auxiliary/tgsi/tgsi_ureg.c | 14 +++- >> src/gallium/auxiliary/tgsi/tgsi_ureg.h | 20 ++++- >> src/gallium/auxiliary/util/u_simple_shaders.c | 2 +- >> src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 + >> .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 16 ++++ >> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +- >> src/gallium/include/pipe/p_shader_tokens.h | 3 +- >> src/gallium/state_trackers/nine/nine_shader.c | 6 +- >> src/mesa/state_tracker/st_atifs_to_tgsi.c | 38 ++++----- >> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 92 >> +++++++++++++++++----- >> src/mesa/state_tracker/st_mesa_to_tgsi.c | 8 +- >> src/mesa/state_tracker/st_pbo.c | 2 +- >> 15 files changed, 172 insertions(+), 59 deletions(-) >> > > _______________________________________________ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau