Re: [Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

Connor Abbott Wed, 04 Oct 2017 10:02:06 -0700

No. From the LLVM langref:

The ‘llvm.fmuladd.*’ intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that
(a) the target instruction set has support for a fused operation, and
(b) that the fused operation is more efficient than the equivalent,
separate pair of mul and add instructions.


The (b) part is especially important: it says that LLVM can pick and
choose which fmuladd intrinsics to turn into FMA instructions, which
into unfused MULADD instructions, and which into a plain mul+add
sequence. For example, if many fmuladd calls share the same first two
arguments, LLVM can split them into a single mul followed by a bunch
of adds. That wouldn't be ok under the GLSL precise semantics
(assuming the target would've used FMA otherwise, which I think some
GCN cards will do).

Also, and maybe more importantly, if an app developer explicitly asks
for fma() with a precise modifier, it's probably not a great idea to
then give them an unfused mul+add -- it's legal, thanks to GLSL's
weasel-wording, but probably not what you really want, on HW which
actually does have an FMA instruction :)
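
As a rough sketch of the rule quoted below (an ffma marked exact gets
llvm.fma, anything else may use llvm.fmuladd), the selection could
look something like this. The struct and function names are
hypothetical stand-ins, not the actual Mesa code; only the two
intrinsic names come from the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a NIR ALU instruction; only the flag
 * that matters for this decision is modelled. */
struct alu_instr_sketch {
    bool exact;
};

/* Pick the LLVM intrinsic name for an ffma: exact operations must be
 * fused consistently, so they use llvm.fma; the rest can let the
 * backend decide via llvm.fmuladd. */
static const char *ffma_intrinsic_name(const struct alu_instr_sketch *instr)
{
    return instr->exact ? "llvm.fma" : "llvm.fmuladd";
}
```

This keeps the speedup from the patch for ordinary ffma while leaving
precise/invariant fma() calls on the always-fused path.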

Connor


On Wed, Oct 4, 2017 at 11:25 AM, Ilia Mirkin <[email protected]> wrote:
> Wouldn't this guarantee that nothing is fused (and thus fine)?
> Presumably fmuladd always does mul+add either as 1 or 2 instructions?
>
> On Wed, Oct 4, 2017 at 10:57 AM, Connor Abbott <[email protected]> wrote:
>> If the fma has the exact flag, then we need to use the llvm.fma
>> intrinsic. These come from fma() calls with the precise or invariant
>> qualifiers in GLSL, where you basically have to fuse everything or
>> fuse nothing consistently, and llvm.fmuladd doesn't guarantee that.
>>
>> On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie <[email protected]> wrote:
>>> From: Dave Airlie <[email protected]>
>>>
>>> For Vulkan SPIR-V the spec states
>>> fma() Inherited from OpFMul followed by OpFAdd.
>>>
>>> Matt says the backend will do the right thing depending on the
>>> hardware being compiled for, if you use the fmuladd intrinsic.
>>>
>>> Using the Mad Max pts test, on high settings at 4K:
>>> CHP: 55->60
>>> HGDD: 46->50
>>> LM: 55->60
>>> No change on Stronghold.
>>>
>>> Thanks to Feral for spending the time to track this down.
>>>
>>> Signed-off-by: Dave Airlie <[email protected]>
>>> ---
>>>  src/amd/common/ac_nir_to_llvm.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
>>> index d7b6259..11ba487 100644
>>> --- a/src/amd/common/ac_nir_to_llvm.c
>>> +++ b/src/amd/common/ac_nir_to_llvm.c
>>> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
>>>                                                       result);
>>>                 break;
>>>         case nir_op_ffma:
>>> -               result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
>>> +               result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
>>>                                               ac_to_float_type(&ctx->ac, def_type), src[0], src[1], src[2]);
>>>                 break;
>>>         case nir_op_ibitfield_extract:
>>> --
>>> 2.9.4
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> [email protected]
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev