On 13.06.2017 at 01:57, Roland Scheidegger wrote:
> This looks like the right idea to me too. It may sound a bit weird to do
> that per instruction, but d3d11 does that as well. (Some d3d versions
> just have a global flag basically forbidding or allowing any such fast
> math optimizations in the assembly, but I'm not actually sure everybody
> honors that without tessellation...)
>
> For 1/9:
> Reviewed-by: Roland Scheidegger <srol...@vmware.com>

I forgot to mention, could you add some bits to the gallium docs
(source/tgsi.rst) for this? I'm not sure where, maybe under Modifiers or
some such.

Roland

>
> 2/9 has a typo in the commit short log ("Instrutions").
>
> FWIW surely on nv50 you could keep a single mad instruction for umad
> (sad maybe too?). (I'm actually wondering if the hw really can't do
> unfused float multiply+add as a single instruction but I know next to
> nothing about nvidia hw...)
>
> Roland
>
> On 12.06.2017 at 12:42, Nicolai Hähnle wrote:
>> On 11.06.2017 20:42, Karol Herbst wrote:
>>> Running Tomb Raider on Nouveau I found some flicker caused by ignoring
>>> precise modifiers on variables inside Nouveau.
>>>
>>> This series adds precise/invariant handling to TGSI, which drivers can
>>> then use to disable certain unsafe optimisations that might otherwise
>>> alter calculations which depend on having the same result across
>>> shaders.
>>
>> It's kind of amazing that we got this far without doing this. On the
>> radeonsi side, it's probably related to how conservative LLVM is.
>>
>> But this series is a good idea, since it might allow us to become more
>> aggressive with optimizations in radeonsi as well.
>>
>>
>>> This series fixes this bug in Tomb Raider and one CTS test for 4.4 and
>>> 4.5
>>>
>>> Note on Patch 3: I really dislike how I tell glsl_to_tgsi_visitor to
>>> apply the precise flag on instructions emitted in
>>> ir_assignment->rhs->accept(); but I found no other easy way to handle
>>> this. Maybe one of you has a better idea?
>>
>> Sent a suggestion, as well as comments on patches 4 & 5.
>> Patches 1 & 2:
>>
>> Reviewed-by: Nicolai Hähnle <nicolai.haeh...@amd.com>
>>
>>
>>>
>>> Karol Herbst (9):
>>>   tgsi: add precise flag to tgsi_instruction
>>>   tgsi/dump: print _PRECISE modifier on Instrutions
>>>   st/glsl_to_tgsi: handle precise modifier
>>>   tgsi: populate precise
>>>   tgsi/text: parse _PRECISE modifier
>>>   nv50/ir: add precise field to Instruction
>>>   nv50/ir/tgsi: handle precise for most ALU instructions
>>>   nv50/ir: disable mul+add to mad for precise instructions
>>>   nv50/ir/tgsi: split mad to mul+add
>>>
>>>  src/gallium/auxiliary/tgsi/tgsi_build.c            |  4 +
>>>  src/gallium/auxiliary/tgsi/tgsi_dump.c             |  4 +
>>>  src/gallium/auxiliary/tgsi/tgsi_text.c             | 15 +++-
>>>  src/gallium/auxiliary/tgsi/tgsi_ureg.c             | 14 +++-
>>>  src/gallium/auxiliary/tgsi/tgsi_ureg.h             | 20 ++++-
>>>  src/gallium/auxiliary/util/u_simple_shaders.c      |  2 +-
>>>  src/gallium/drivers/nouveau/codegen/nv50_ir.h      |  1 +
>>>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 16 ++++
>>>  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |  6 +-
>>>  src/gallium/include/pipe/p_shader_tokens.h         |  3 +-
>>>  src/gallium/state_trackers/nine/nine_shader.c      |  6 +-
>>>  src/mesa/state_tracker/st_atifs_to_tgsi.c          | 38 ++++-----
>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp         | 92 +++++++++++++++++-----
>>>  src/mesa/state_tracker/st_mesa_to_tgsi.c           |  8 +-
>>>  src/mesa/state_tracker/st_pbo.c                    |  2 +-
>>>  15 files changed, 172 insertions(+), 59 deletions(-)
>>>
>>
>>
>
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau