https://github.com/kumarak created https://github.com/llvm/llvm-project/pull/204185

PointerType and VPtrType hard-code their size and ABI alignment to 64 bits and
8 bytes. On targets with 32-bit pointers (e.g., nvptx, spirv32), this trips the
record layout builder: any record containing a pointer hits the insertPadding
assertion (offset >= size) because the pointer is sized at 8 bytes while the
following field is placed at the AST-mandated 4-byte offset.
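
To make the failure concrete, here is a minimal standalone sketch of the offset arithmetic. It is not the actual record layout builder: `Field`, `paddingBefore`, and `firstOverlap` are hypothetical names used only for illustration, and the model simply compares each field's AST offset against the bytes laid out so far.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one record field: the byte offset mandated by the AST
// record layout and the byte size reported by the IR type system.
struct Field {
  uint64_t astOffset;
  uint64_t typeSize;
};

// Mirrors the shape of the failing check: padding is only well defined when
// the next field starts at or after the current end of the record.
uint64_t paddingBefore(uint64_t offset, uint64_t size) {
  assert(offset >= size && "field would overlap the previous one");
  return offset - size;
}

// Returns the index of the first field that starts before the end of the
// previous one, or std::nullopt when the layout is consistent.
std::optional<std::size_t> firstOverlap(const std::vector<Field> &fields) {
  uint64_t size = 0; // bytes laid out so far
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (fields[i].astOffset < size)
      return i;
    size = fields[i].astOffset + fields[i].typeSize;
  }
  return std::nullopt;
}
```

For `struct S { int *p; int x; }` on a 32-bit-pointer target, the AST offsets are 0 and 4. Modeling the pointer as 8 bytes, `firstOverlap({{0, 8}, {4, 4}})` flags field 1, which is the situation the assertion rejects; modeling it as 4 bytes, `firstOverlap({{0, 4}, {4, 4}})` finds no overlap.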

### Changes:
- CIRGenerator: attaches a CIR-native cir.ptr data-layout entry at module
  setup, storing {size-in-bits, abi-align-in-bits} read from the target
  DataLayout (only for the default address space).
- CIRTypes: PointerType reads its size/alignment from that entry (falling back
  to 64 bits / 8 bytes when absent); VPtrType routes through a cir.ptr so it
  picks up the same width.
- LowerToLLVM: strips the cir.ptr entry during CIR→LLVM lowering, since
  cir.ptr has no meaning in LLVM IR.
- Unit test: checks the 4-byte pointer/vptr layout on nvptx, verified across
  CIR and CIR→LLVM output.
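
The encode/decode convention described in the list above can be sketched without any MLIR dependency. This is only an illustration under stated assumptions: `PtrSpec`, `makePtrSpec`, `pointerSizeInBits`, and `pointerABIAlignBytes` are hypothetical stand-ins, with a `std::optional` playing the role of "the module may or may not carry a cir.ptr entry".

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the value of the cir.ptr data-layout entry:
// {size-in-bits, abi-align-in-bits}.
using PtrSpec = std::array<int32_t, 2>;

constexpr uint64_t kBitsInByte = 8;
// Fallbacks used when no entry is present (64-bit size, 8-byte alignment).
constexpr uint64_t kDefaultPointerSizeBits = 64;
constexpr uint64_t kDefaultPointerAlignBytes = 8;

// Builds the entry value from a size in bits and an ABI alignment in bytes;
// the alignment is stored in bits, matching the convention in the patch.
PtrSpec makePtrSpec(uint32_t sizeBits, uint32_t abiAlignBytes) {
  assert(abiAlignBytes != 0 && "ABI alignment must be non-zero");
  return {static_cast<int32_t>(sizeBits),
          static_cast<int32_t>(abiAlignBytes * kBitsInByte)};
}

// Reads the size in bits, falling back to the default when the entry is absent.
uint64_t pointerSizeInBits(const std::optional<PtrSpec> &spec) {
  return spec ? static_cast<uint64_t>((*spec)[0]) : kDefaultPointerSizeBits;
}

// Reads the ABI alignment and converts it back from bits to bytes.
uint64_t pointerABIAlignBytes(const std::optional<PtrSpec> &spec) {
  return spec ? static_cast<uint64_t>((*spec)[1]) / kBitsInByte
              : kDefaultPointerAlignBytes;
}
```

With a 32-bit pointer and 4-byte alignment, `makePtrSpec(32, 4)` yields `{32, 32}`, which matches the `array<i32: 32, 32>` value the test checks for; passing `std::nullopt` to either reader returns the 64-bit / 8-byte fallback.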

> _Note: This is a proposed fix for the 32-bit-pointer record-layout crash, and 
> I'm not certain it aligns with the CIR team's longer-term plan to resolve it. 
> If the team prefers a different design, I'm happy to rework it. Feedback on 
> the overall direction is very welcome._

From 393e9bf45ad1e95efc55fe84d36c8672e19fe631 Mon Sep 17 00:00:00 2001
From: AkshayK <[email protected]>
Date: Tue, 16 Jun 2026 11:14:06 -0400
Subject: [PATCH] [CIR] Drive pointer and vptr width from a CIR-native
 data-layout entry

PointerType and VPtrType hardcoded size/alignment to 64/8, aborting record
layout on 32-bit-pointer targets such as nvptx/spirv32. Attach a CIR-native
cir.ptr data-layout entry at module setup and read the pointer width from it,
so the CIR type system needs no LLVM-dialect dependency; the entry is stripped
during CIR->LLVM lowering. 64-bit targets are unchanged.
---
 clang/lib/CIR/CodeGen/CIRGenerator.cpp        | 24 +++++++-
 clang/lib/CIR/Dialect/IR/CIRTypes.cpp         | 56 +++++++++++++++++--
 .../CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp | 17 ++++++
 .../test/CIR/CodeGen/pointer-width-32bit.cpp  | 44 +++++++++++++++
 4 files changed, 134 insertions(+), 7 deletions(-)
 create mode 100644 clang/test/CIR/CodeGen/pointer-width-32bit.cpp

diff --git a/clang/lib/CIR/CodeGen/CIRGenerator.cpp b/clang/lib/CIR/CodeGen/CIRGenerator.cpp
index d4fcbb6e42f3e..a9ffece66b3ab 100644
--- a/clang/lib/CIR/CodeGen/CIRGenerator.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenerator.cpp
@@ -21,6 +21,7 @@
 #include "clang/AST/DeclGroup.h"
 #include "clang/CIR/CIRGenerator.h"
 #include "clang/CIR/InitAllDialects.h"
+#include "clang/CIR/MissingFeatures.h"
 #include "llvm/IR/DataLayout.h"
 
 using namespace cir;
@@ -42,7 +43,28 @@ static void setMLIRDataLayout(mlir::ModuleOp &mod, const llvm::DataLayout &dl) {
   mlir::MLIRContext *mlirContext = mod.getContext();
   mlir::DataLayoutSpecInterface dlSpec =
       mlir::translateDataLayout(dl, mlirContext);
-  mod->setAttr(mlir::DLTIDialect::kDataLayoutAttrName, dlSpec);
+
+  // Add a CIR-native pointer data-layout entry so cir.ptr / cir.vptr size and
+  // alignment are driven by the data layout rather than hardcoded. 
+  // The value stores {size-in-bits, abi-align-in-bits} keyed on cir.ptr.
+  //
+  // TODO(cir): Only the default address space is recorded and address-space-dependent
+  // pointer sizes are not modeled yet. Emit per-address-space entries.
+  assert(!cir::MissingFeatures::dataLayoutPtrHandlingBasedOnLangAS());
+  constexpr unsigned kBitsInByte = 8;
+  unsigned ptrSizeBits = dl.getPointerSizeInBits(/*AS=*/0);
+  unsigned ptrAlignBits =
+      dl.getPointerABIAlignment(/*AS=*/0).value() * kBitsInByte;
+  auto ptrKey = cir::PointerType::get(cir::VoidType::get(mlirContext));
+  auto ptrVal = mlir::DenseI32ArrayAttr::get(
+      mlirContext,
+      {static_cast<int32_t>(ptrSizeBits), static_cast<int32_t>(ptrAlignBits)});
+  llvm::SmallVector<mlir::DataLayoutEntryInterface> entries(
+      dlSpec.getEntries().begin(), dlSpec.getEntries().end());
+  entries.push_back(mlir::DataLayoutEntryAttr::get(ptrKey, ptrVal));
+
+  mod->setAttr(mlir::DLTIDialect::kDataLayoutAttrName,
+               mlir::DataLayoutSpecAttr::get(mlirContext, entries));
 }
 
 void CIRGenerator::Initialize(ASTContext &astContext) {
diff --git a/clang/lib/CIR/Dialect/IR/CIRTypes.cpp b/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
index e2e754479c654..041d5e1881596 100644
--- a/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
@@ -580,20 +580,62 @@ void RecordType::removeABIConversionNamePrefix() {
 // Data Layout information for types
 //===----------------------------------------------------------------------===//
 
+// A CIR-native pointer data-layout entry stores {size-in-bits,
+// abi-align-in-bits} as a dense i32 array keyed on a cir.ptr type (see
+// setMLIRDataLayout in CIRGenerator).
+namespace {
+constexpr static uint64_t kBitsInByte = 8;
+
+// Defaults used only when the module carries no cir.ptr data-layout entry
+// (e.g. CIR parsed from text without a data layout). These mirror the MLIR LLVM
+// dialect's pointer defaults.
+constexpr static uint64_t kDefaultPointerSizeBits = 64;
+constexpr static uint64_t kDefaultPointerAlignment = 8;
+
+enum class CIRPtrDLPos { Size = 0, AbiAlign = 1 };
+
+// Returns the requested field of the cir.ptr data-layout entry.
+std::optional<uint64_t> getPointerSpecValue(mlir::DataLayoutEntryListRef params,
+                                            CIRPtrDLPos pos) {
+  for (mlir::DataLayoutEntryInterface entry : params) {
+    if (!entry.isTypeEntry())
+      continue;
+    auto spec = mlir::dyn_cast<mlir::DenseI32ArrayAttr>(entry.getValue());
+    assert(spec && spec.size() == 2 &&
+           "malformed cir.ptr data layout entry: expected a pair of i32 "
+           "{size-in-bits, abi-align-in-bits}");
+    return static_cast<uint64_t>(spec[static_cast<int>(pos)]);
+  }
+  return std::nullopt;
+}
+} // namespace
+
 llvm::TypeSize
 PointerType::getTypeSizeInBits(const ::mlir::DataLayout &dataLayout,
                                ::mlir::DataLayoutEntryListRef params) const {
+  // The pointer width comes from the CIR-native data-layout entry keyed on
+  // cir.ptr, which records the width for the default address space; fall back
+  // to 64 bits if the module carries no such entry.
   // FIXME: improve this in face of address spaces
   assert(!cir::MissingFeatures::dataLayoutPtrHandlingBasedOnLangAS());
-  return llvm::TypeSize::getFixed(64);
+  if (std::optional<uint64_t> size =
+          getPointerSpecValue(params, CIRPtrDLPos::Size))
+    return llvm::TypeSize::getFixed(*size);
+  return llvm::TypeSize::getFixed(kDefaultPointerSizeBits);
 }
 
 uint64_t
 PointerType::getABIAlignment(const ::mlir::DataLayout &dataLayout,
                              ::mlir::DataLayoutEntryListRef params) const {
+  // As with the size, the alignment is taken from the default-address-space
+  // cir.ptr data-layout entry. Address-space-dependent alignments are not yet
+  // modeled.
   // FIXME: improve this in face of address spaces
   assert(!cir::MissingFeatures::dataLayoutPtrHandlingBasedOnLangAS());
-  return 8;
+  if (std::optional<uint64_t> align =
+          getPointerSpecValue(params, CIRPtrDLPos::AbiAlign))
+    return *align / kBitsInByte;
+  return kDefaultPointerAlignment;
 }
 
 llvm::TypeSize
@@ -1112,14 +1154,16 @@ DataMemberType::getABIAlignment(const ::mlir::DataLayout &dataLayout,
 llvm::TypeSize
 VPtrType::getTypeSizeInBits(const mlir::DataLayout &dataLayout,
                             mlir::DataLayoutEntryListRef params) const {
-  // FIXME: consider size differences under different ABIs
-  return llvm::TypeSize::getFixed(64);
+  // The vtable pointer is an ordinary data pointer; route the query through a
+  // cir.ptr so it picks up the same data-layout-driven width.
+  return dataLayout.getTypeSizeInBits(
+      cir::PointerType::get(cir::VoidType::get(getContext())));
 }
 
 uint64_t VPtrType::getABIAlignment(const mlir::DataLayout &dataLayout,
                                    mlir::DataLayoutEntryListRef params) const {
-  // FIXME: consider alignment differences under different ABIs
-  return 8;
+  return dataLayout.getTypeABIAlignment(
+      cir::PointerType::get(cir::VoidType::get(getContext())));
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp b/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
index 1579e967885d8..dd244cc4ef641 100644
--- a/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+++ b/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
@@ -3689,6 +3689,23 @@ void ConvertCIRToLLVMPass::runOnOperation() {
   if (failed(applyPartialConversion(ops, target, std::move(patterns))))
     signalPassFailure();
 
+  // The CIR-native pointer data-layout entry (keyed on cir.ptr) drives pointer
+  // widths during CIR codegen and lowering, but cir.ptr has no meaning once the
+  // module is translated to LLVM IR. Drop it so the resulting data layout only
+  // references LLVM types.
+  if (auto dlSpec = mlir::dyn_cast_or_null<mlir::DataLayoutSpecAttr>(
+          module->getAttr(mlir::DLTIDialect::kDataLayoutAttrName))) {
+    llvm::SmallVector<mlir::DataLayoutEntryInterface> kept;
+    for (mlir::DataLayoutEntryInterface entry : dlSpec.getEntries()) {
+      if (entry.isTypeEntry() &&
+          mlir::isa<cir::PointerType>(mlir::cast<mlir::Type>(entry.getKey())))
+        continue;
+      kept.push_back(entry);
+    }
+    module->setAttr(mlir::DLTIDialect::kDataLayoutAttrName,
+                    mlir::DataLayoutSpecAttr::get(module.getContext(), kept));
+  }
+
   // Emit the llvm.global_ctors array.
   buildCtorDtorList(module, cir::CIRDialect::getGlobalCtorsAttrName(),
                     "llvm.global_ctors", [](mlir::Attribute attr) {
diff --git a/clang/test/CIR/CodeGen/pointer-width-32bit.cpp b/clang/test/CIR/CodeGen/pointer-width-32bit.cpp
new file mode 100644
index 0000000000000..15f3ec92d7e16
--- /dev/null
+++ b/clang/test/CIR/CodeGen/pointer-width-32bit.cpp
@@ -0,0 +1,44 @@
+// RUN: %clang_cc1 -std=c++20 -triple nvptx-nvidia-cuda -fclangir -emit-cir %s -o %t.cir
+// RUN: FileCheck --check-prefix=CIR --input-file=%t.cir %s
+// RUN: %clang_cc1 -std=c++20 -triple nvptx-nvidia-cuda -fclangir -emit-llvm %s -o %t-cir.ll
+// RUN: FileCheck --check-prefix=LLVM --input-file=%t-cir.ll %s
+// RUN: %clang_cc1 -std=c++20 -triple nvptx-nvidia-cuda -emit-llvm %s -o %t.ll
+// RUN: FileCheck --check-prefix=OGCG --input-file=%t.ll %s
+
+// On a target with 32-bit pointers (e.g. nvptx) both a data pointer (!cir.ptr)
+// and the vtable pointer (!cir.vptr) are 4 bytes wide. The pointer width is
+// carried by a CIR-native data-layout entry keyed on cir.ptr, so the field
+// following a pointer lands at the AST-mandated offset. Sizing pointers as a
+// hardcoded 64 bits previously tripped the record layout builder (insertPadding:
+// assertion `offset >= size`) on every record containing a pointer.
+
+struct S {
+  int *p;
+  int x;
+};
+
+S s;
+
+class A {
+public:
+  virtual void f();
+  int x;
+};
+
+void A::f() {}
+
+// The module carries a CIR-native pointer data-layout entry ({size, abi-align}
+// in bits) that drives both cir.ptr and cir.vptr widths. The 4-byte pointer is
+// immediately followed by 'x' at offset 4 with no padding, and each record is
+// 4-byte aligned.
+// CIR-DAG: !rec_S = !cir.struct<"S" {!cir.ptr<!s32i>, !s32i}>
+// CIR-DAG: !rec_A = !cir.struct<class "A" {!cir.vptr, !s32i}>
+// CIR-DAG: !cir.ptr<!cir.void> = array<i32: 32, 32>
+// CIR: cir.global external @s = #cir.zero : !rec_S {alignment = 4 : i64}
+// CIR: cir.global{{.*}}@_ZTV1A = #cir.vtable<{{.*}}{alignment = 4 : i64}
+
+// LLVM: @s = global %struct.S zeroinitializer, align 4
+// LLVM: @_ZTV1A = global { [3 x ptr] } {{.*}}, align 4
+
+// OGCG: @s = global %struct.S zeroinitializer, align 4
+// OGCG: @_ZTV1A = {{.*}}constant { [3 x ptr] } {{.*}}, align 4

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
