================
@@ -1705,16 +1705,21 @@ class CIRGenFunction : public CIRGenTypeCache {
void instantiateIndirectGotoBlock();
- /// Emit a simple LLVM intrinsic that takes N scalar arguments and whose
- /// return type matches the type of the first argument. The intrinsic name is
- /// used verbatim; any overload mangling (e.g. `.f32`, `.p1`) must be baked
- /// into \p intrinName by the caller.
+ /// Emit a simple LLVM intrinsic that takes N scalar arguments. The intrinsic
+ /// name is used verbatim; any overload mangling (e.g. `.f32`, `.p1`) must be
+ /// baked into \p intrinName by the caller. The result type defaults to the
+ /// type of the first argument; pass \p resultType for intrinsics whose result
+ /// differs from the operand, such as a vector reduction that returns the
+ /// element type. Unlike classic CodeGen, CIR has no intrinsic registry to
+ /// derive the result type from the operand, so it must be supplied here.
template <unsigned N>
[[maybe_unused]] RValue
emitBuiltinWithOneOverloadedType(const CallExpr *e,
- llvm::StringRef intrinName) {
+ llvm::StringRef intrinName,
+ mlir::Type resultType = {}) {
----------------
adams381 wrote:
> calls to this function above?
---
The AMDGPU callers above rely on the default result type because those
intrinsics return the same type as their first operand (`readfirstlane`,
`ds.swizzle`, `readlane`, and `div.fmas` all have result type == arg0 type), so
`convertType(arg0)` is correct for them. `vector.reduce.*` is the case where it
isn't: the result is the element type, not the vector operand type.
Classic CodeGen doesn't pass a result type because
`getIntrinsic(Intrinsic::vector_reduce_xor, {VecTy})` derives the return type
from the intrinsic signature. CIR's `LLVMIntrinsicCallOp` lowering uses the op's
declared result type directly (`LowerToLLVM.cpp`,
`convertType(op->getResultTypes()[0])`) and has no registry to derive it from,
so the element type has to be supplied at the call site.
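To make the defaulting rule concrete, here is a minimal standalone sketch of how the result type is chosen. It is a toy model, not the CIR code: `ToyType`, `pickResultType`, and `elementOf` are hypothetical names invented for illustration, and `std::optional` stands in for the defaulted `mlir::Type resultType = {}` parameter.

```cpp
#include <optional>
#include <string>

// Toy stand-in for a type (NOT mlir::Type): an element name plus a lane count.
struct ToyType {
  std::string elem;   // e.g. "i32"
  unsigned lanes = 1; // 1 means scalar, >1 means vector
  bool operator==(const ToyType &o) const {
    return elem == o.elem && lanes == o.lanes;
  }
};

// Hypothetical model of the helper's choice: use the explicit result type
// when the caller supplies one, otherwise fall back to the type of arg0.
ToyType pickResultType(const ToyType &arg0,
                       std::optional<ToyType> resultType = std::nullopt) {
  return resultType ? *resultType : arg0;
}

// Element type of a vector; this is what a reduction caller would pass.
ToyType elementOf(const ToyType &vec) { return ToyType{vec.elem, 1}; }
```

In this model a caller whose result matches arg0 calls `pickResultType(arg0)` and is unaffected by the new parameter, while a reduction caller passes `elementOf(arg0)` explicitly.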
Defaulting the reduction to `convertType(arg0)` (the vector type) emits
`<4 x i32> @llvm.vector.reduce.xor.v4i32(<4 x i32>)`, and lowering aborts:
```
error: call intrinsic signature <4 x i32> (<4 x i32>) to overloaded intrinsic
"llvm.vector.reduce.xor" does not match any of the overloads: intrinsic return
type (vector element of overload type 0) expected i32 (overload type 0 is <4 x
i32>), but got <4 x i32>
fatal error: error in backend: Lowering from LLVMIR dialect to llvm IR failed!
```
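The rule the error message describes can be sketched in isolation. This is only a toy model of the check quoted above (return type must be the vector element of overload type 0), not LLVM's actual verifier; `SketchType` and `reduceSignatureMatches` are hypothetical names.

```cpp
#include <string>

// Toy stand-in for an LLVM type: element name plus lane count (1 = scalar).
struct SketchType {
  std::string elem;
  unsigned lanes = 1;
};

// Models the quoted rule for vector.reduce.*: the operand (overload type 0)
// is a vector, and the result must be that vector's scalar element type.
bool reduceSignatureMatches(const SketchType &result,
                            const SketchType &operand) {
  return operand.lanes > 1 && result.lanes == 1 &&
         result.elem == operand.elem;
}
```

Under this model the signature `<4 x i32> (<4 x i32>)` is rejected and `i32 (<4 x i32>)` is accepted, which matches the diagnostic above.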
If you'd rather not give the shared helper an optional result type, the
alternative is to emit these three intrinsic calls inline with the element type
and leave the helper unchanged. @andykaylor asked to route through the helper,
which is why I extended it instead; either path has to specify the element
result type.
https://github.com/llvm/llvm-project/pull/201164
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits