[PATCH] D99675: RFC [llvm][clang] Create new intrinsic llvm.arith.fence to control FP optimization at expression level

Melanie Blower via Phabricator via cfe-commits Tue, 06 Apr 2021 11:49:22 -0700

mibintc added a comment.

In D99675#2671924 <https://reviews.llvm.org/D99675#2671924>, @efriedma wrote:


>> The expression “llvm.arith.fence(a * b) + c” means that “a * b” must happen 
>> before “+ c” and FMA guarantees that, but to prevent later optimizations 
>> from unpacking the FMA the correct transformation needs to be:
>>
>> llvm.arith.fence(a * b) + c  →  llvm.arith.fence(FMA(a, b, c))
>
> Does this actually block later transforms from unpacking the FMA?  Maybe if 
> the FMA isn't marked "fast"...

I'd like @pengfei to reply to this question. I think the overall idea is that 
many of the optimizations are pattern based, and the existing pattern wouldn't 
match the new intrinsic.

> ----
>
> How is llvm.arith.fence() different from using "freeze" on a floating-point 
> value?  The goal isn't really the same, sure, but the effects seem similar at 
> first glance.

Initially we thought the intrinsic "ssa.copy" could serve. However ssa.copy is 
for a different purpose and it gets optimized away.  We want arith.fence to 
survive through codegen, that's one reason why we think a new intrinsic is 
needed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99675/new/

https://reviews.llvm.org/D99675

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D99675: RFC [llvm][clang] Create new intrinsic llvm.arith.fence to control FP optimization at expression level

Reply via email to