================
@@ -842,6 +845,34 @@ bool SystemZTargetLowering::useSoftFloat() const {
   return Subtarget.hasSoftFloat();
 }
 
+unsigned
+SystemZTargetLowering::getNumRegisters(LLVMContext &Context, EVT VT,
+                                       std::optional<MVT> RegisterVT) const {
+  // i128 inline assembly operand.
+  if (VT == MVT::i128 && RegisterVT && *RegisterVT == MVT::Untyped)
+    return 1;
+  // Pass narrow fp16 vectors per the ABI even though they are generally
+  // expanded.
+  if (Subtarget.hasVector() && VT.isVector() && VT.getScalarType() == MVT::f16)
+    return divideCeil(VT.getVectorNumElements(), SystemZ::VectorBytes / 2);
----------------
uweigand wrote:

The case I was more concerned about is, e.g., a 48-byte vector (24 half
elements). Does this take 3 registers, or is it implicitly extended to the
next power of two and thus take 4 registers?
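
To make the two readings concrete, here is a minimal standalone sketch of the arithmetic. It does not call into LLVM: `VectorBytes` is assumed to be 16 (the size of a 128-bit vector register), `divideCeil` is a local stand-in for the LLVM helper, and `nextPowerOfTwo`, `regsAsWritten`, and `regsIfWidened` are hypothetical names used only for illustration. Whether type legalization actually widens the vector before this hook is consulted is exactly the open question; the sketch only shows what each answer would yield.

```cpp
// Assumption: mirrors SystemZ::VectorBytes (128-bit vector registers).
constexpr unsigned VectorBytes = 16;

// Local stand-in for llvm::divideCeil: integer division rounding up.
constexpr unsigned divideCeil(unsigned Num, unsigned Den) {
  return (Num + Den - 1) / Den;
}

// Hypothetical helper: smallest power of two that is >= N (for N >= 1).
constexpr unsigned nextPowerOfTwo(unsigned N) {
  unsigned P = 1;
  while (P < N)
    P <<= 1;
  return P;
}

// Reading 1: the formula in the patch applied directly to the element count.
// Each register holds VectorBytes / 2 half-precision elements.
constexpr unsigned regsAsWritten(unsigned NumElts) {
  return divideCeil(NumElts, VectorBytes / 2);
}

// Reading 2: the element count is first rounded up to a power of two,
// then the same formula is applied.
constexpr unsigned regsIfWidened(unsigned NumElts) {
  return divideCeil(nextPowerOfTwo(NumElts), VectorBytes / 2);
}
```

For 24 half elements the first reading gives 3 registers and the second gives 4, so the two only agree when the element count is already a power of two (or rounds up within the same register).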

https://github.com/llvm/llvm-project/pull/171066
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits