[clang] [RISCV] Improve casting between i1 scalable vectors and i8 fixed vect… (PR #139190)

2025-05-08 Thread Craig Topper via cfe-commits

https://github.com/topperc created 
https://github.com/llvm/llvm-project/pull/139190

…ors for -mrvv-vector-bits

For i1 vectors, we use an i8 fixed vector as the storage type.

If the known minimum number of elements of the scalable vector type is less
than 8, we were doing the cast through memory, using a load or store on a
fixed vector alloca. For a scalable type <vscale x X x i1> with X less than 8,
DataLayout indicates that the load/store reads/writes vscale bytes, even if
vscale is known and vscale*X is less than or equal to 8. As far as DataLayout
is concerned, the load or store is therefore outside the bounds of the
fixed-size alloca, which is undefined behavior.
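
To make the size mismatch concrete, here is a minimal standalone sketch of the arithmetic. It does not use LLVM; the function names are hypothetical. It assumes the store size of <vscale x X x i1> is ceil(X/8) bytes scaled by vscale, and that the fixed i8 storage vector holds the vscale*X mask bits rounded up to whole bytes. The vscale = 4 case assumes -mrvv-vector-bits=256 with 64 bits per vscale unit.

```cpp
#include <cstdint>

// Bytes a load/store of <vscale x X x i1> covers under the assumed
// DataLayout rule: ceil(X / 8) bytes, multiplied by vscale.
constexpr uint64_t scalableStoreBytes(uint64_t minElts, uint64_t vscale) {
  return ((minElts + 7) / 8) * vscale;
}

// Bytes in the fixed i8 storage vector (assumption: the vscale * X mask
// bits rounded up to whole bytes).
constexpr uint64_t fixedStorageBytes(uint64_t minElts, uint64_t vscale) {
  return (minElts * vscale + 7) / 8;
}

// True when the scalable access is larger than the fixed alloca.
constexpr bool accessExceedsAlloca(uint64_t minElts, uint64_t vscale) {
  return scalableStoreBytes(minElts, vscale) >
         fixedStorageBytes(minElts, vscale);
}

// <vscale x 1 x i1> at vscale = 4: only 4 mask bits, so 1 byte of storage,
// but the scalable access is modelled as 4 bytes.
static_assert(scalableStoreBytes(1, 4) == 4, "scalable access size");
static_assert(fixedStorageBytes(1, 4) == 1, "fixed storage size");
static_assert(accessExceedsAlloca(1, 4), "access is out of bounds");

// <vscale x 8 x i1> at vscale = 4: both sides are 4 bytes, no mismatch.
static_assert(!accessExceedsAlloca(8, 4), "sizes agree for X == 8");
```

Under this model the mismatch only appears when X is below 8 and vscale is above 1, which matches the case described above.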

This patch avoids this by widening the i1 scalable vector type with zero
elements until its element count is a multiple of 8. This allows it to be
bitcast to/from an i8 scalable vector. We then insert the i8 fixed vector
into, or extract it from, this type.
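
The element-count arithmetic behind the widening can be sketched as follows. This is a standalone illustration, not the patch itself: `divideCeil` and `alignTo8` are local stand-ins for the `llvm::divideCeil` and `llvm::alignTo<8>` calls that appear in the diff, and `WidenedTypes` and `widenMask` are hypothetical names.

```cpp
#include <cstdint>

// Local stand-ins for llvm::divideCeil and llvm::alignTo<8>.
constexpr uint64_t divideCeil(uint64_t n, uint64_t d) {
  return (n + d - 1) / d;
}
constexpr uint64_t alignTo8(uint64_t n) { return divideCeil(n, 8) * 8; }

// Known-minimum element counts of the two intermediate scalable types.
struct WidenedTypes {
  uint64_t widenedI1MinElts; // <vscale x N x i1> after padding to 8
  uint64_t i8MinElts;        // <vscale x N/8 x i8> it is bitcast to/from
};

constexpr WidenedTypes widenMask(uint64_t minElts) {
  return {alignTo8(minElts), divideCeil(minElts, 8)};
}

// <vscale x 2 x i1> widens to <vscale x 8 x i1>, which has the same bit
// width as <vscale x 1 x i8>.
static_assert(widenMask(2).widenedI1MinElts == 8, "padded i1 count");
static_assert(widenMask(2).i8MinElts == 1, "i8 count");

// Types that are already a multiple of 8 are unchanged.
static_assert(widenMask(16).widenedI1MinElts == 16, "no padding needed");
static_assert(widenMask(16).i8MinElts == 2, "i8 count");
```

In every case the widened i1 count equals 8 times the i8 count, so the two scalable types have the same size and the bitcast between them is valid.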

Hopefully this enables #130973 to be accepted.

>From 86692b0229da44dce5321b00c8409e50de86efaf Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Thu, 8 May 2025 15:13:47 -0700
Subject: [PATCH] [RISCV] Improve casting between i1 scalable vectors and i8
 fixed vectors for -mrvv-vector-bits

For i1 vectors, we use an i8 fixed vector as the storage type.

If the known minimum number of elements of the scalable vector type
is less than 8, we were doing the cast through memory, using a load
or store on a fixed vector alloca. For a scalable type
<vscale x X x i1> with X less than 8, DataLayout indicates that the
load/store reads/writes vscale bytes, even if vscale is known and
vscale*X is less than or equal to 8. As far as DataLayout is
concerned, the load or store is therefore outside the bounds of the
fixed-size alloca, which is undefined behavior.

This patch avoids this by widening the i1 scalable vector type with
zero elements until its element count is a multiple of 8. This allows
it to be bitcast to/from an i8 scalable vector. We then insert the i8
fixed vector into, or extract it from, this type.

Hopefully this enables #130973 to be accepted.
---
 clang/lib/CodeGen/CGCall.cpp  |  26 -
 clang/lib/CodeGen/CGExprScalar.cpp|  27 -
 .../attr-riscv-rvv-vector-bits-less-8-call.c  | 104 +++---
 .../attr-riscv-rvv-vector-bits-less-8-cast.c  |  56 ++
 .../attr-rvv-vector-bits-bitcast-less-8.c |  32 +++---
 .../CodeGen/RISCV/attr-rvv-vector-bits-cast.c |  18 +--
 .../RISCV/attr-rvv-vector-bits-codegen.c  |  37 ---
 .../RISCV/attr-rvv-vector-bits-globals.c  |  16 +--
 8 files changed, 119 insertions(+), 197 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 9dfd25f9a8d43..81dfc3884f1af 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1366,19 +1366,29 @@ static llvm::Value *CreateCoercedLoad(Address Src, llvm::Type *Ty,
       // If we are casting a fixed i8 vector to a scalable i1 predicate
       // vector, use a vector insert and bitcast the result.
       if (ScalableDstTy->getElementType()->isIntegerTy(1) &&
-          ScalableDstTy->getElementCount().isKnownMultipleOf(8) &&
           FixedSrcTy->getElementType()->isIntegerTy(8)) {
         ScalableDstTy = llvm::ScalableVectorType::get(
             FixedSrcTy->getElementType(),
-            ScalableDstTy->getElementCount().getKnownMinValue() / 8);
+            llvm::divideCeil(
+                ScalableDstTy->getElementCount().getKnownMinValue(), 8));
       }
       if (ScalableDstTy->getElementType() == FixedSrcTy->getElementType()) {
         auto *Load = CGF.Builder.CreateLoad(Src);
         auto *PoisonVec = llvm::PoisonValue::get(ScalableDstTy);
         llvm::Value *Result = CGF.Builder.CreateInsertVector(
             ScalableDstTy, PoisonVec, Load, uint64_t(0), "cast.scalable");
-        if (ScalableDstTy != Ty)
-          Result = CGF.Builder.CreateBitCast(Result, Ty);
+        ScalableDstTy = cast<llvm::ScalableVectorType>(Ty);
+        if (ScalableDstTy->getElementType()->isIntegerTy(1) &&
+            !ScalableDstTy->getElementCount().isKnownMultipleOf(8) &&
+            FixedSrcTy->getElementType()->isIntegerTy(8))
+          ScalableDstTy = llvm::ScalableVectorType::get(
+              ScalableDstTy->getElementType(),
+              llvm::alignTo<8>(
+                  ScalableDstTy->getElementCount().getKnownMinValue()));
+        if (Result->getType() != ScalableDstTy)
+          Result = CGF.Builder.CreateBitCast(Result, ScalableDstTy);
+        if (Result->getType() != Ty)
+          Result = CGF.Builder.CreateExtractVector(Ty, Result, uint64_t(0));
         return Result;
       }
     }
@@ -1476,8 +1486,14 @@ CoerceScalableToFixed(CodeGenFunction &CGF, llvm::FixedVectorType *ToTy,
   // If we are casting a scalable i1 predicate vector to a fixed i8
   // vector, first bitcast the source.
   if (FromTy->getElementType()->isIntegerTy(1) &&
-      FromTy->getElementCount().isKnownMultipleOf(8) &&
       ToTy->getElementType() == CGF.Builder.getInt8Ty()) {
+    if (!FromTy->getElementCount().isKnownMulti

[clang] [RISCV] Improve casting between i1 scalable vectors and i8 fixed vect… (PR #139190)

2025-05-08 Thread Craig Topper via cfe-commits

https://github.com/topperc edited 
https://github.com/llvm/llvm-project/pull/139190