https://github.com/davidstone created
https://github.com/llvm/llvm-project/pull/169126
`OwningArrayRef` has several problems.
The naming is strange: `ArrayRef` is specifically a non-owning view, so the
name means "owning non-owning view".
It has a const-correctness bug that is inherent to the interface.
`OwningArrayRef<T>` publicly derives from `MutableArrayRef<T>`. This means that
the following code compiles:
```c++
void const_incorrect(llvm::OwningArrayRef<int> const a) {
a[0] = 5;
}
```
It's surprising for a non-reference type to allow modification of its elements
even when it's declared `const`. However, the problems from this inheritance
(which ultimately stem from the same issue as the weird name) are even worse.
The following function compiles without warning but corrupts memory when called:
```c++
void memory_corruption(llvm::OwningArrayRef<int> a) {
a.consume_front();
}
```
This happens because `MutableArrayRef::consume_front` modifies the internal
data pointer to advance the referenced array forward. That's not an issue for
`MutableArrayRef` because it's just a view. It is an issue for `OwningArrayRef`
because that pointer is passed as the argument to `delete[]`, so when it's
modified by advancing it forward it ceases to be valid to `delete[]`. From
there, undefined behavior occurs.
It is mostly less convenient than `std::vector` for construction. By combining
the `size` and the `capacity` together without going through `std::allocator`
to get memory, it's not possible to fill in data with the correct value to
begin with. Instead, the user must construct an `OwningArrayRef` of the
appropriate size, then fill in the data. This has one of two consequences:
1. If `T` is a class type, we have to first default construct all of the
elements when we construct `OwningArrayRef` and then in a second pass we can
assign to those elements to give what we want. This wastes time and for some
classes is not possible.
2. If `T` is a built-in type, the data starts out uninitialized. This easily
forgotten step means we access uninitialized memory.
Using `std::vector`, by constrast, has well-known constructors that can fill in
the data that we actually want on construction.
`OwningArrayRef` has slightly different performance characteristics than
`std::vector`, but the difference is minimal.
The first difference is a theoretical negative for `OwningArrayRef`: by
implementing in terms of `new[]` and `delete[]`, the implementation has less
room to optimize these calls. However, I say this is theoretical because for
clang, at least, the extra freedom of optimization given to `std::allocator` is
not yet taken advantage of (see
https://github.com/llvm/llvm-project/issues/68365)
The second difference is slightly in favor of `OwningArrayRef`:
`sizeof(std::vector<T>) == sizeof(void *) 3` on pretty much any implementation,
whereas `sizeof(OwningArrayRef) == sizeof(void *) * 2` which seems like a win.
However, this is just a misdirection of the accounting costs: array-new sticks
bookkeeping information in the allocated storage. There are some cases where
this is beneficial to reduce stack usage, but that minor benefit doesn't seem
worth the costs. If we actually need that optimization, we'd be better served
by writing a `DynamicArray` type that implements a full vector-like feature set
(except for operations that change the size of the container) while allocating
through `std::allocator` to avoid the pitfalls outlined earlier.
Depends on https://github.com/llvm/llvm-project/pull/168768 and
https://github.com/llvm/llvm-project/pull/169124, so ignore the first two
commits in this PR. The other PRs are for changes that are useful regardless of
whether we decide to delete this type and that require a little more thought
than just find + replace.
>From 1653d14099cd722c00e575a4b5a9a9673f20e813 Mon Sep 17 00:00:00 2001
From: David Stone <[email protected]>
Date: Wed, 19 Nov 2025 12:58:07 -0700
Subject: [PATCH 1/3] Use `llvm::SmallVector` instead of `OwningArrayRef` in
`VTableLayout`.
This simplifies the code by removing the manual optimization for size == 1, and
also gives us an optimization for other small sizes.
Accept a `llvm::SmallVector` by value for the constructor and move it into the
destination, rather than accepting `ArrayRef` that we copy from. This also lets
us not have to construct a reference to the elements of a
`std::initializer_list`, which requires reading the implementation of the
constructor to know whether it's safe.
Also explicitly document that the constructor requires the input indexes to
have a size of at least 1.
---
clang/include/clang/AST/VTableBuilder.h | 32 +++++++------------------
clang/lib/AST/VTableBuilder.cpp | 23 +++++++++---------
2 files changed, 20 insertions(+), 35 deletions(-)
diff --git a/clang/include/clang/AST/VTableBuilder.h
b/clang/include/clang/AST/VTableBuilder.h
index e1efe8cddcc5e..58f6fcbc1270e 100644
--- a/clang/include/clang/AST/VTableBuilder.h
+++ b/clang/include/clang/AST/VTableBuilder.h
@@ -246,12 +246,12 @@ class VTableLayout {
// point for a given vtable index.
typedef llvm::SmallVector<unsigned, 4> AddressPointsIndexMapTy;
+ using VTableIndicesTy = llvm::SmallVector<std::size_t>;
+
private:
- // Stores the component indices of the first component of each virtual table
in
- // the virtual table group. To save a little memory in the common case where
- // the vtable group contains a single vtable, an empty vector here represents
- // the vector {0}.
- OwningArrayRef<size_t> VTableIndices;
+ // Stores the component indices of the first component of each virtual table
+ // in the virtual table group.
+ VTableIndicesTy VTableIndices;
OwningArrayRef<VTableComponent> VTableComponents;
@@ -265,7 +265,8 @@ class VTableLayout {
AddressPointsIndexMapTy AddressPointIndices;
public:
- VTableLayout(ArrayRef<size_t> VTableIndices,
+ // Requires `!VTableIndicies.empty()`
+ VTableLayout(VTableIndicesTy VTableIndices,
ArrayRef<VTableComponent> VTableComponents,
ArrayRef<VTableThunkTy> VTableThunks,
const AddressPointsMapTy &AddressPoints);
@@ -292,26 +293,11 @@ class VTableLayout {
return AddressPointIndices;
}
- size_t getNumVTables() const {
- if (VTableIndices.empty())
- return 1;
- return VTableIndices.size();
- }
+ size_t getNumVTables() const { return VTableIndices.size(); }
- size_t getVTableOffset(size_t i) const {
- if (VTableIndices.empty()) {
- assert(i == 0);
- return 0;
- }
- return VTableIndices[i];
- }
+ size_t getVTableOffset(size_t i) const { return VTableIndices[i]; }
size_t getVTableSize(size_t i) const {
- if (VTableIndices.empty()) {
- assert(i == 0);
- return vtable_components().size();
- }
-
size_t thisIndex = VTableIndices[i];
size_t nextIndex = (i + 1 == VTableIndices.size())
? vtable_components().size()
diff --git a/clang/lib/AST/VTableBuilder.cpp b/clang/lib/AST/VTableBuilder.cpp
index 9951126c2c3a3..922d43bb3f239 100644
--- a/clang/lib/AST/VTableBuilder.cpp
+++ b/clang/lib/AST/VTableBuilder.cpp
@@ -999,7 +999,7 @@ class ItaniumVTableBuilder {
public:
/// Component indices of the first component of each of the vtables in the
/// vtable group.
- SmallVector<size_t, 4> VTableIndices;
+ VTableLayout::VTableIndicesTy VTableIndices;
ItaniumVTableBuilder(ItaniumVTableContext &VTables,
const CXXRecordDecl *MostDerivedClass,
@@ -2306,18 +2306,17 @@ MakeAddressPointIndices(const
VTableLayout::AddressPointsMapTy &addressPoints,
return indexMap;
}
-VTableLayout::VTableLayout(ArrayRef<size_t> VTableIndices,
+VTableLayout::VTableLayout(VTableIndicesTy VTableIndices,
ArrayRef<VTableComponent> VTableComponents,
ArrayRef<VTableThunkTy> VTableThunks,
const AddressPointsMapTy &AddressPoints)
- : VTableComponents(VTableComponents), VTableThunks(VTableThunks),
- AddressPoints(AddressPoints),
AddressPointIndices(MakeAddressPointIndices(
- AddressPoints, VTableIndices.size())) {
- if (VTableIndices.size() <= 1)
- assert(VTableIndices.size() == 1 && VTableIndices[0] == 0);
- else
- this->VTableIndices = OwningArrayRef<size_t>(VTableIndices);
-
+ : VTableIndices(std::move(VTableIndices)),
+ VTableComponents(VTableComponents), VTableThunks(VTableThunks),
+ AddressPoints(AddressPoints),
+ AddressPointIndices(
+ MakeAddressPointIndices(AddressPoints, this->VTableIndices.size())) {
+ assert(!this->VTableIndices.empty() &&
+ "VTableLayout requires at least one index.");
llvm::sort(this->VTableThunks, [](const VTableLayout::VTableThunkTy &LHS,
const VTableLayout::VTableThunkTy &RHS) {
assert((LHS.first != RHS.first || LHS.second == RHS.second) &&
@@ -3730,8 +3729,8 @@ void
MicrosoftVTableContext::computeVTableRelatedInformation(
SmallVector<VTableLayout::VTableThunkTy, 1> VTableThunks(
Builder.vtable_thunks_begin(), Builder.vtable_thunks_end());
VFTableLayouts[id] = std::make_unique<VTableLayout>(
- ArrayRef<size_t>{0}, Builder.vtable_components(), VTableThunks,
- EmptyAddressPointsMap);
+ VTableLayout::VTableIndicesTy{0}, Builder.vtable_components(),
+ VTableThunks, EmptyAddressPointsMap);
Thunks.insert(Builder.thunks_begin(), Builder.thunks_end());
const ASTRecordLayout &Layout = Context.getASTRecordLayout(RD);
>From c3bc565c959ee690db386241d2c16d808b961faf Mon Sep 17 00:00:00 2001
From: David Stone <[email protected]>
Date: Fri, 21 Nov 2025 15:43:27 -0700
Subject: [PATCH 2/3] [llvm] Replace `OwningArrayRef` with `std::vector` in
`BTFParser`
`OwningArrayRef` requires that the size and the capacity are the same. This
prevents reusing memory allocations unless the size happens to be exactly the
same (which is rare enough we don't even try). Switch to `std::vector` instead
so that we're not repeatedly calling `new[]` and `delete[]`.
---
llvm/include/llvm/DebugInfo/BTF/BTFParser.h | 2 +-
llvm/lib/DebugInfo/BTF/BTFParser.cpp | 5 +++--
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
b/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
index f8b5b29738b3f..32644e6700e78 100644
--- a/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
+++ b/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
@@ -46,7 +46,7 @@ class BTFParser {
// A copy of types table from the object file but using native byte
// order. Should not be too big in practice, e.g. for ~250MiB vmlinux
// image it is ~4MiB.
- OwningArrayRef<uint8_t> TypesBuffer;
+ std::vector<uint8_t> TypesBuffer;
// Maps ELF section number to instruction line number information.
// Each BTFLinesVector is sorted by `InsnOffset` to allow fast lookups.
diff --git a/llvm/lib/DebugInfo/BTF/BTFParser.cpp
b/llvm/lib/DebugInfo/BTF/BTFParser.cpp
index 4fc31a4456031..971a0d9f01769 100644
--- a/llvm/lib/DebugInfo/BTF/BTFParser.cpp
+++ b/llvm/lib/DebugInfo/BTF/BTFParser.cpp
@@ -206,7 +206,8 @@ Error BTFParser::parseTypesInfo(ParseContext &Ctx, uint64_t
TypesInfoStart,
StringRef RawData) {
using support::endian::byte_swap;
- TypesBuffer = OwningArrayRef<uint8_t>(arrayRefFromStringRef(RawData));
+ auto RawDataAsBytes = arrayRefFromStringRef(RawData);
+ TypesBuffer.assign(RawDataAsBytes.begin(), RawDataAsBytes.end());
// Switch endianness if necessary.
endianness Endianness = Ctx.Obj.isLittleEndian() ? llvm::endianness::little
: llvm::endianness::big;
@@ -380,7 +381,7 @@ Error BTFParser::parse(const ObjectFile &Obj, const
ParseOptions &Opts) {
SectionLines.clear();
SectionRelocs.clear();
Types.clear();
- TypesBuffer = OwningArrayRef<uint8_t>();
+ TypesBuffer.clear();
ParseContext Ctx(Obj, Opts);
std::optional<SectionRef> BTF;
>From 01164dbcc1dcd5b8a1f09af225ed5582fbb0f3cb Mon Sep 17 00:00:00 2001
From: David Stone <[email protected]>
Date: Fri, 21 Nov 2025 16:12:54 -0700
Subject: [PATCH 3/3] [llvm][clang] Remove `llvm::OwningArrayRef`
`OwningArrayRef` has several problems.
The naming is strange: `ArrayRef` is specifically a non-owning view, so the
name means "owning non-owning view".
It has a const-correctness bug that is inherent to the interface.
`OwningArrayRef<T>` publicly derives from `MutableArrayRef<T>`. This means that
the following code compiles:
```c++
void const_incorrect(llvm::OwningArrayRef<int> const a) {
a[0] = 5;
}
```
It's surprising for a non-reference type to allow modification of its elements
even when it's declared `const`. However, the problems from this inheritance
(which ultimately stem from the same issue as the weird name) are even worse.
The following function compiles without warning but corrupts memory when called:
```c++
void memory_corruption(llvm::OwningArrayRef<int> a) {
a.consume_front();
}
```
This happens because `MutableArrayRef::consume_front` modifies the internal
data pointer to advance the referenced array forward. That's not an issue for
`MutableArrayRef` because it's just a view. It is an issue for `OwningArrayRef`
because that pointer is passed as the argument to `delete[]`, so when it's
modified by advancing it forward it ceases to be valid to `delete[]`. From
there, undefined behavior occurs.
It is mostly less convenient than `std::vector` for construction. By combining
the `size` and the `capacity` together without going through `std::allocator`
to get memory, it's not possible to fill in data with the correct value to
begin with. Instead, the user must construct an `OwningArrayRef` of the
appropriate size, then fill in the data. This has one of two consequences:
1. If `T` is a class type, we have to first default construct all of the
elements when we construct `OwningArrayRef` and then in a second pass we can
assign to those elements to give what we want. This wastes time and for some
classes is not possible.
2. If `T` is a built-in type, the data starts out uninitialized. This easily
forgotten step means we access uninitialized memory.
Using `std::vector`, by constrast, has well-known constructors that can fill in
the data that we actually want on construction.
`OwningArrayRef` has slightly different performance characteristics than
`std::vector`, but the difference is minimal.
The first difference is a theoretical negative for `OwningArrayRef`: by
implementing in terms of `new[]` and `delete[]`, the implementation has less
room to optimize these calls. However, I say this is theoretical because for
clang, at least, the extra freedom of optimization given to `std::allocator` is
not yet taken advantage of (see
https://github.com/llvm/llvm-project/issues/68365)
The second difference is slightly in favor of `OwningArrayRef`:
`sizeof(std::vector<T>) == sizeof(void *) 3` on pretty much any implementation,
whereas `sizeof(OwningArrayRef) == sizeof(void *) * 2` which seems like a win.
However, this is just a misdirection of the accounting costs: array-new sticks
bookkeeping information in the allocated storage. There are some cases where
this is beneficial to reduce stack usage, but that minor benefit doesn't seem
worth the costs. If we actually need that optimization, we'd be better served
by writing a `DynamicArray` type that implements a full vector-like feature set
(except for operations that change the size of the container) while allocating
through `std::allocator` to avoid the pitfalls outlined earlier.
---
clang/include/clang/AST/VTableBuilder.h | 4 ++--
clang/include/clang/Basic/LLVM.h | 4 +---
llvm/include/llvm/ADT/ArrayRef.h | 23 -------------------
llvm/include/llvm/CGData/CGDataPatchItem.h | 4 ++--
llvm/include/llvm/CodeGen/PBQP/Math.h | 10 ++++----
.../Transforms/Vectorize/SLPVectorizer.cpp | 7 +++---
llvm/unittests/ADT/ArrayRefTest.cpp | 7 ------
7 files changed, 14 insertions(+), 45 deletions(-)
diff --git a/clang/include/clang/AST/VTableBuilder.h
b/clang/include/clang/AST/VTableBuilder.h
index 58f6fcbc1270e..25aa8ef7ba4a3 100644
--- a/clang/include/clang/AST/VTableBuilder.h
+++ b/clang/include/clang/AST/VTableBuilder.h
@@ -253,10 +253,10 @@ class VTableLayout {
// in the virtual table group.
VTableIndicesTy VTableIndices;
- OwningArrayRef<VTableComponent> VTableComponents;
+ std::vector<VTableComponent> VTableComponents;
/// Contains thunks needed by vtables, sorted by indices.
- OwningArrayRef<VTableThunkTy> VTableThunks;
+ std::vector<VTableThunkTy> VTableThunks;
/// Address points for all vtables.
AddressPointsMapTy AddressPoints;
diff --git a/clang/include/clang/Basic/LLVM.h b/clang/include/clang/Basic/LLVM.h
index f4956cd16cbcf..fed23f0c44f38 100644
--- a/clang/include/clang/Basic/LLVM.h
+++ b/clang/include/clang/Basic/LLVM.h
@@ -29,8 +29,7 @@ namespace llvm {
class Twine;
class VersionTuple;
template<typename T> class ArrayRef;
- template<typename T> class MutableArrayRef;
- template<typename T> class OwningArrayRef;
+ template <typename T> class MutableArrayRef;
template<unsigned InternalLen> class SmallString;
template<typename T, unsigned N> class SmallVector;
template<typename T> class SmallVectorImpl;
@@ -65,7 +64,6 @@ namespace clang {
// ADT's.
using llvm::ArrayRef;
using llvm::MutableArrayRef;
- using llvm::OwningArrayRef;
using llvm::SaveAndRestore;
using llvm::SmallString;
using llvm::SmallVector;
diff --git a/llvm/include/llvm/ADT/ArrayRef.h b/llvm/include/llvm/ADT/ArrayRef.h
index d7ed2c78749f0..00b5534469d65 100644
--- a/llvm/include/llvm/ADT/ArrayRef.h
+++ b/llvm/include/llvm/ADT/ArrayRef.h
@@ -445,29 +445,6 @@ namespace llvm {
}
};
- /// This is a MutableArrayRef that owns its array.
- template <typename T> class OwningArrayRef : public MutableArrayRef<T> {
- public:
- OwningArrayRef() = default;
- OwningArrayRef(size_t Size) : MutableArrayRef<T>(new T[Size], Size) {}
-
- OwningArrayRef(ArrayRef<T> Data)
- : MutableArrayRef<T>(new T[Data.size()], Data.size()) {
- std::copy(Data.begin(), Data.end(), this->begin());
- }
-
- OwningArrayRef(OwningArrayRef &&Other) { *this = std::move(Other); }
-
- OwningArrayRef &operator=(OwningArrayRef &&Other) {
- delete[] this->data();
- this->MutableArrayRef<T>::operator=(Other);
- Other.MutableArrayRef<T>::operator=(MutableArrayRef<T>());
- return *this;
- }
-
- ~OwningArrayRef() { delete[] this->data(); }
- };
-
/// @name ArrayRef Deduction guides
/// @{
/// Deduction guide to construct an ArrayRef from a single element.
diff --git a/llvm/include/llvm/CGData/CGDataPatchItem.h
b/llvm/include/llvm/CGData/CGDataPatchItem.h
index d13f89b032542..e9aad8c23fefc 100644
--- a/llvm/include/llvm/CGData/CGDataPatchItem.h
+++ b/llvm/include/llvm/CGData/CGDataPatchItem.h
@@ -22,10 +22,10 @@ struct CGDataPatchItem {
// Where to patch.
uint64_t Pos;
// Source data.
- OwningArrayRef<uint64_t> D;
+ std::vector<uint64_t> D;
CGDataPatchItem(uint64_t Pos, const uint64_t *D, int N)
- : Pos(Pos), D(ArrayRef<uint64_t>(D, N)) {}
+ : Pos(Pos), D(D, D + N) {}
};
} // namespace llvm
diff --git a/llvm/include/llvm/CodeGen/PBQP/Math.h
b/llvm/include/llvm/CodeGen/PBQP/Math.h
index 1cbbeeba3f32b..ba673dd9380a8 100644
--- a/llvm/include/llvm/CodeGen/PBQP/Math.h
+++ b/llvm/include/llvm/CodeGen/PBQP/Math.h
@@ -41,10 +41,10 @@ class Vector {
Vector(Vector &&V) : Data(std::move(V.Data)) {}
// Iterator-based access.
- const PBQPNum *begin() const { return Data.begin(); }
- const PBQPNum *end() const { return Data.end(); }
- PBQPNum *begin() { return Data.begin(); }
- PBQPNum *end() { return Data.end(); }
+ const PBQPNum *begin() const { return Data.data(); }
+ const PBQPNum *end() const { return Data.data() + Data.size(); }
+ PBQPNum *begin() { return Data.data(); }
+ PBQPNum *end() { return Data.data() + Data.size(); }
/// Comparison operator.
bool operator==(const Vector &V) const {
@@ -87,7 +87,7 @@ class Vector {
}
private:
- OwningArrayRef<PBQPNum> Data;
+ std::vector<PBQPNum> Data;
};
/// Return a hash_value for the given vector.
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 3b36ccbd677dc..1a7200201d1c0 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -23390,9 +23390,10 @@ bool SLPVectorizerPass::vectorizeStores(
unsigned End = Operands.size();
unsigned Repeat = 0;
constexpr unsigned MaxAttempts = 4;
- OwningArrayRef<std::pair<unsigned, unsigned>>
RangeSizes(Operands.size());
- for (std::pair<unsigned, unsigned> &P : RangeSizes)
- P.first = P.second = 1;
+ std::vector<std::pair<unsigned, unsigned>> RangeSizesStorage(
+ Operands.size(), {1, 1});
+ // The `slice` and `drop_front` interfaces are convenient
+ const auto RangeSizes = MutableArrayRef(RangeSizesStorage);
DenseMap<Value *, std::pair<unsigned, unsigned>> NonSchedulable;
auto IsNotVectorized = [](bool First,
const std::pair<unsigned, unsigned> &P) {
diff --git a/llvm/unittests/ADT/ArrayRefTest.cpp
b/llvm/unittests/ADT/ArrayRefTest.cpp
index 736c8fbb26b38..b1a86f0214b74 100644
--- a/llvm/unittests/ADT/ArrayRefTest.cpp
+++ b/llvm/unittests/ADT/ArrayRefTest.cpp
@@ -307,13 +307,6 @@ TEST(ArrayRefTest, ArrayRef) {
EXPECT_TRUE(AR2.equals(AR2Ref));
}
-TEST(ArrayRefTest, OwningArrayRef) {
- static const int A1[] = {0, 1};
- OwningArrayRef<int> A{ArrayRef(A1)};
- OwningArrayRef<int> B(std::move(A));
- EXPECT_EQ(A.data(), nullptr);
-}
-
TEST(ArrayRefTest, ArrayRefFromStdArray) {
std::array<int, 5> A1{{42, -5, 0, 1000000, -1000000}};
ArrayRef<int> A2 = ArrayRef(A1);
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits