Re: [Mesa-dev] [PATCH] ac/nir: use llvm fma intrinsic if nir instruction is exact.

Marek Olšák Fri, 06 Oct 2017 09:40:28 -0700

On Fri, Oct 6, 2017 at 4:39 AM, Dave Airlie <airl...@gmail.com> wrote:
> On 6 October 2017 at 12:31, Marek Olšák <mar...@gmail.com> wrote:
>> On Fri, Oct 6, 2017 at 4:10 AM, Connor Abbott <cwabbo...@gmail.com> wrote:
>>> On Thu, Oct 5, 2017 at 10:08 PM, Marek Olšák <mar...@gmail.com> wrote:
>>>> On Fri, Oct 6, 2017 at 3:50 AM, Connor Abbott <cwabbo...@gmail.com> wrote:
>>>>> Why? While it might technically be legal, always generating an unfused
>>>>> mul+add when the user explicitly requested fma() seems harsh...
>>>>
>>>> It's slow on some chips. It doesn't need any other reason.
>>>>
>>>> Marek
>>>
>>> Presumably, if the developer asked for fma, then they don't care how
>>> fast or slow it is...
>>
>> Feral asked for fma. They care. This debate is pointless. We just
>> won't use fma by default. Period.
>
> They didn't ask for it with precise precision. I'm assuming if someone wants
> fma with precise precision we should give it to them. Like at least
> the fma manpage states.
>
> https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/fma.xhtml


Oh please Dave, the page says the exact opposite of what you are
saying. The only thing the manpage says is: If fma and mul+add have
different precision, fma can't be split and mul+add can't be combined.
It doesn't say anything about precision of the result of fma itself.
Search for the word "can". It's not the same as "must".

That said, RADV can use as many slow opcodes as you want if you
insist. I'm only saying that the opcode selection of radeonsi is
non-negotiable on my side, and nir_to_llvm might get radeonsi-specific
opcode selection.

Marek
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
