[clang] [llvm] [SelectionDAG][NVPTX] support expanding target intrinsics; implement for `nvvm.{fmax/fmin}` (PR #194783)

Matt Arsenault via cfe-commits Mon, 29 Jun 2026 13:05:01 -0700

arsenm wrote:

> Even though `nvvm.fmax/fmin` are very close to `maximumnum/minimumnum`, their 
> semantics slightly differ, since the LLVM intrinsics depend on the global FTZ 
> settings, but the nvvm versions encode on a per-instruction level whether FTZ 
> is desired. As you can see in NVPTXTargetTransformInfo, we do translate to 
> the `llvm.maximumnum` whenever the FTZ semantics make this valid.


I maintain that there is no reason to have these intrinsics, and having them is 
unnecessary complexity. Trying to surface every knob of every PTX instruction 
into an intrinsic should not be a goal. You can express the same semantics with 
an explicit flush sequence, which you can select into your instruction 
modifier. 
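
As a rough scalar illustration of what an "explicit flush sequence" could mean here, the sketch below models an FTZ maximum as "flush subnormal inputs to a sign-preserving zero, then take a generic NaN-ignoring maximum". The names `flush_subnormal` and `fmax_ftz_model` are hypothetical and do not come from the PR, and `std::fmax` stands in only loosely for `llvm.maximumnum` (for example, it does not guarantee the ordering of -0.0 and +0.0). It assumes the code is built without fast-math and without a hardware flush-to-zero mode.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: flush a subnormal float to a zero of the same sign.
// Normal values, zeros, infinities and NaNs pass through unchanged.
inline float flush_subnormal(float x) {
    if (std::fpclassify(x) == FP_SUBNORMAL)
        return std::copysign(0.0f, x);
    return x;
}

// Hypothetical scalar model of a per-instruction FTZ maximum:
// flush both inputs explicitly, then apply a generic maximum that
// returns the non-NaN operand when exactly one operand is NaN.
// Because the result is always one of the (flushed) inputs,
// no separate flush of the result is needed in this model.
inline float fmax_ftz_model(float a, float b) {
    return std::fmax(flush_subnormal(a), flush_subnormal(b));
}
```

In this picture a backend would pattern-match the flush-then-maximum sequence and select it into the instruction's FTZ modifier, rather than carrying a dedicated intrinsic through the optimizer.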

> 
> In addition to cases like `fmin/fmax` with instruction-level FTZ 
> semantics, other cases where scalarizable vector forms of target-specific 
> intrinsics would be useful are:
> 
> * Instructions requiring instruction-level rounding modes (e.g. 
> `nvvm_add_rn_f` etc.)

If you want this kind of legalization, we really ought to have generic 
intrinsics for fixed rounding mode operations. Trying to treat target 
intrinsics like an abstract operation is backwards (we're also talking about 
just making the regular instructions always have rounding controls 
[here](https://discourse.llvm.org/t/rfc-yet-another-strict-fp/)).

> * Target-specific math function approximations (e.g. `nvvm_sin_approx_f` , 
> `amdgcn_cos` etc.)

This is the kind of case that definitely shouldn't be legalized. It's adding 
complexity and hidden legalization costs (e.g., the cost model assumes all 
intrinsics are cheap by default).


> * Other target-specific instructions like `nvvm_fmin_ftz_xorsign_abs_f` , 
> where the `xorsign_abs` pattern would require a chain of several generic  
> LLVM instructions instead of using a single target-specific intrinsic.
> 

Which is fine and normal. The cost of fully supporting an intrinsic across all 
optimizations is very high. Pattern matching is a backend responsibility that 
should be done anyway.
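
For reference, the "chain of several generic instructions" for the `xorsign_abs` form might look like the scalar sketch below. It assumes the semantics are "magnitude is the minimum of the absolute values of the inputs, sign is the XOR of the input sign bits"; the function name `fmin_xorsign_abs_model` is hypothetical, and NaN and FTZ behavior are deliberately not modeled.

```cpp
#include <cmath>

// Hypothetical scalar model of a min with xorsign.abs semantics,
// written as a chain of generic operations (fabs, fmin, sign XOR).
inline float fmin_xorsign_abs_model(float a, float b) {
    // Magnitude: minimum of the absolute values of both inputs.
    float mag = std::fmin(std::fabs(a), std::fabs(b));
    // Sign: XOR of the two input sign bits.
    bool negative = std::signbit(a) != std::signbit(b);
    return negative ? -mag : mag;
}
```

A backend that wants the single instruction would match this shape during instruction selection.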



https://github.com/llvm/llvm-project/pull/194783
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
