No. From the LLVM langref:

"The ‘llvm.fmuladd.*’ intrinsic functions represent multiply-add expressions that can be fused if the code generator determines that (a) the target instruction set has support for a fused operation, and (b) that the fused operation is more efficient than the equivalent, separate pair of mul and add instructions."
The (b) part is especially important -- it says that LLVM can pick and choose which fmuladd intrinsics to turn into FMA instructions, or unfused MULADD instructions, or just a sequence of mul+add. For example, if many instructions call fmuladd with the first two arguments the same, it can break it up into a mul followed by a bunch of adds. That wouldn't be ok under the GLSL precise semantics (assuming the target would've used FMA otherwise, which I think some GCN cards will do).

Also, and maybe more importantly, if an app developer explicitly asks for fma() with a precise modifier, it's probably not a great idea to then give them an unfused mul+add -- it's legal, thanks to GLSL's weasel-wording, but probably not what you really want, on HW which actually does have an FMA instruction :)

Connor

On Wed, Oct 4, 2017 at 11:25 AM, Ilia Mirkin <[email protected]> wrote:
> Wouldn't this guarantee that nothing is fused (and thus fine)?
> Presumably fmuladd always does mul+add either as 1 or 2 instructions?
>
> On Wed, Oct 4, 2017 at 10:57 AM, Connor Abbott <[email protected]> wrote:
>> If the fma has the exact flag, then we need to use the llvm.fma
>> intrinsic. These come from fma() calls with the precise or invariant
>> qualifiers in GLSL, where you basically have to fuse everything or
>> fuse nothing consistently, and llvm.fmuladd doesn't guarantee that.
>>
>> On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie <[email protected]> wrote:
>>> From: Dave Airlie <[email protected]>
>>>
>>> For Vulkan SPIR-V the spec states
>>> fma() Inherited from OpFMul followed by OpFAdd.
>>>
>>> Matt says the backend will do the right thing depending on the
>>> hardware being compiled for, if you use the fmuladd intrinsic.
>>>
>>> Using the Mad Max pts test, on high settings at 4K:
>>> CHP: 55->60
>>> HGDD: 46->50
>>> LM: 55->60
>>> No change on Stronghold.
>>>
>>> Thanks to Feral for spending the time to track this down.
>>>
>>> Signed-off-by: Dave Airlie <[email protected]>
>>> ---
>>>  src/amd/common/ac_nir_to_llvm.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/amd/common/ac_nir_to_llvm.c
>>> b/src/amd/common/ac_nir_to_llvm.c
>>> index d7b6259..11ba487 100644
>>> --- a/src/amd/common/ac_nir_to_llvm.c
>>> +++ b/src/amd/common/ac_nir_to_llvm.c
>>> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx,
>>> const nir_alu_instr *instr)
>>>                 result);
>>>                 break;
>>>         case nir_op_ffma:
>>> -               result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
>>> +               result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
>>>                                               ac_to_float_type(&ctx->ac,
>>> def_type), src[0], src[1], src[2]);
>>>                 break;
>>>         case nir_op_ibitfield_extract:
>>> --
>>> 2.9.4
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> [email protected]
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
