[clang] Introduce paged vector (PR #66430)

2023-10-02 Thread David Blaikie via cfe-commits

dwblaikie wrote:

> This only affects builds with GCC.

My understanding was that we basically only cared about performance of clang on 
clang - that we expect people who want an efficient clang to bootstrap?

Could `always_inline` the function, but those sort of annotations would get out 
of date/bitrot.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-10-02 Thread Nikita Popov via cfe-commits

nikic wrote:

> I also noticed that LoadedSLocEntryTable uses 42 elements tables, which 
> probably introduces some extra divisions to do the indexing. I can try 
> switching to 32 elements.

I've tested the following change that forces power of two pages sizes, but it 
did not have any impact: 
https://github.com/llvm/llvm-project/commit/fa54294b359c2934a34f3bf24a94b2172c5fc760

I've done some profiling and found that the regression seems to be related to 
inlining. In particular getAndAdvanceChar() inside LexTokenInternal() no longer 
gets inlined, and that's hot code. This only affects builds with GCC.

It's not totally obvious to me why that happens, as getAndAdvanceChar() doesn't 
appear to use PagedVector at all.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-10-02 Thread Giulio Eulisse via cfe-commits

ktf wrote:

@mikaelholmen see also https://github.com/llvm/llvm-project/pull/67958. I can't 
seem to create a version that avoids triggering the warning and is also 
compatible across both macOS and Windows. Some good idea would be appreciated, 
especially because I do not have a windows setup and godbolt does not seem to 
like LLVM as an external package.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-10-02 Thread via cfe-commits

mikaelholmen wrote:

Hi @vgvassilev !
Compiling this with gcc we see lots of warnings like
`
../include/llvm/ADT/PagedVector.h:213:59: warning: friend declaration 'bool 
llvm::operator==(const llvm::PagedVector::MaterializedIterator&, 
const llvm::PagedVector::MaterializedIterator&)' declares a 
non-template function [-Wnon-template-friend]
  213 |MaterializedIterator const );
  |   ^
../include/llvm/ADT/PagedVector.h:213:59: note: (if this is not what you 
intended, make sure the function template has already been declared and add <> 
after the function name here) 
../include/llvm/ADT/PagedVector.h:215:59: warning: friend declaration 'bool 
llvm::operator!=(const llvm::PagedVector::MaterializedIterator&, 
const llvm::PagedVector::MaterializedIterator&)' declares a 
non-template function [-Wnon-template-friend]
  215 |MaterializedIterator const );
  |
`
Anything that could/should be fixed even if clang seems to not bother?

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-10-02 Thread Giulio Eulisse via cfe-commits

ktf wrote:

I also noticed that LoadedSLocEntryTable uses 42 elements tables, which 
probably introduces some extra divisions to do the indexing. I can try 
switching to 32 elements.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-10-02 Thread Giulio Eulisse via cfe-commits

ktf wrote:

> @ktf could you take a look at this?

Sure, I did https://github.com/llvm/llvm-project/pull/67972 with your 
suggestion, however I am not sure how I can run the benchmark myself.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-10-02 Thread Vassil Vassilev via cfe-commits

vgvassilev wrote:

> > Looks like this causes a compile-time regression: 
> > https://llvm-compile-time-tracker.com/compare.php?from=abcaebfe3aacb13d46be5e949fd6ed9b4321e2f6=4ae51570806ba5c5fcabe6d6dcbe52e3a5d5453b=instructions%3Au
> >  About 0.5% at `O0`.
> 
> It is probably the change in the `SourceManager.cpp` in non-modules mode that 
> caused it. IIUC, this is a ~memory regression and probably comes by the 
> larger page/slab that we allocate~. Looks like the instruction count 
> increased and the branching increased. Could it be the indexing operation:
> 
> https://github.com/llvm/llvm-project/blob/f99bd296108756e9824b179761fb5a40807af66f/llvm/include/llvm/ADT/PagedVector.h#L82-L95
> 
> Would adding an `LLVM_UNLIKELY` in the if-stmt condition help us?

@ktf could you take a look at this?

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Jakub Kuderski via cfe-commits

kuhar wrote:

@vgvassilev 
> I find the use of the macro I proposed cleaner though.

Ideally, I think we could have an llvm-specific macro that combines the guard 
and the check (say `LLVM_EXPECT_DEATH`), so that it's less of a footgun. In any 
case, the failures should be resolved now, and we can iterate it further in the 
future.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Vassil Vassilev via cfe-commits

vgvassilev wrote:

> @vgvassilev I've seen other tests use the pattern from my fix. Feel free to 
> overwrite with your version if that's preferred.

Too late in the night in my end and I closed my laptop already. I am fine with 
your fix - I find the use of the macro I proposed cleaner though. 

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Jakub Kuderski via cfe-commits

kuhar wrote:

Also, don't use have to guard that with `#ifdef EXPECT_DEBUG_DEATH`?

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Jakub Kuderski via cfe-commits

kuhar wrote:

@vgvassilev I've seen other tests use the pattern from my fix. Feel free to 
overwrite with your version if that's preferred.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Vassil Vassilev via cfe-commits

vgvassilev wrote:

@kuhar, thanks, you overtook me. Here is my diff:
```diff
diff --git a/llvm/unittests/ADT/PagedVectorTest.cpp 
b/llvm/unittests/ADT/PagedVectorTest.cpp
index e1b0c62d3395..c45df6d9e7e8 100644
--- a/llvm/unittests/ADT/PagedVectorTest.cpp
+++ b/llvm/unittests/ADT/PagedVectorTest.cpp
@@ -24,8 +24,8 @@ TEST(PagedVectorTest, EmptyTest) {
   EXPECT_EQ(V.materialized_end().getIndex(), 0ULL);
   EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
 
-  EXPECT_DEATH(V[0], "Index < Size");
-  EXPECT_DEATH(PagedVector(nullptr), "Allocator cannot be null");
+  EXPECT_DEBUG_DEATH(V[0], "Index < Size");
+  EXPECT_DEBUG_DEATH(PagedVector(nullptr), "Allocator cannot be null");
 }
 
 TEST(PagedVectorTest, ExpandTest) {
@@ -69,7 +69,7 @@ TEST(PagedVectorTest, HalfPageFillingTest) {
   for (int I = 0; I < 5; ++I)
 EXPECT_EQ(V[I], I);
   for (int I = 5; I < 10; ++I)
-EXPECT_DEATH(V[I], "Index < Size");
+EXPECT_DEBUG_DEATH(V[I], "Index < Size");
 }
 
 TEST(PagedVectorTest, FillFullMultiPageTest) {
@@ -244,7 +244,7 @@ TEST(PagedVectorTest, ShrinkTest) {
   EXPECT_EQ(V.size(), 0ULL);
   EXPECT_EQ(V.capacity(), 0ULL);
   EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
-  EXPECT_DEATH(V[0], "Index < Size");
+  EXPECT_DEBUG_DEATH(V[0], "Index < Size");
 }
 
 TEST(PagedVectorTest, FunctionalityTest) {
 ```
 
 Could you use `EXPECT_DEBUG_DEATH` in your fix?

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Jakub Kuderski via cfe-commits

kuhar wrote:

@vvereschaka I submitted a fix for death tests in 
https://github.com/llvm/llvm-project/commit/8580010672e9ff37b0744927296ca00dbcbef5be

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Vladimir Vereschaka via cfe-commits

vvereschaka wrote:

@vgvassilev ,
got failed tests: https://lab.llvm.org/buildbot/#/builders/86/builds/66095

* LLVM-Unit :: ADT/./ADTTests.exe/PagedVectorTest/EmptyTest
* LLVM-Unit :: ADT/./ADTTests.exe/PagedVectorTest/HalfPageFillingTest
* LLVM-Unit :: ADT/./ADTTests.exe/PagedVectorTest/ShrinkTest

would you take care of it?

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Vassil Vassilev via cfe-commits

vgvassilev wrote:

> Looks like this causes a compile-time regression: 
> https://llvm-compile-time-tracker.com/compare.php?from=abcaebfe3aacb13d46be5e949fd6ed9b4321e2f6=4ae51570806ba5c5fcabe6d6dcbe52e3a5d5453b=instructions%3Au
>  About 0.5% at `O0`.

It is probably the change in the `SourceManager.cpp` in non-modules mode that 
caused it. IIUC, this is a memory regression and probably comes by the larger 
page/slab that we allocate. What would be the best way to profile this?

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Nikita Popov via cfe-commits

nikic wrote:

Looks like this causes a compile-time regression: 
https://llvm-compile-time-tracker.com/compare.php?from=abcaebfe3aacb13d46be5e949fd6ed9b4321e2f6=4ae51570806ba5c5fcabe6d6dcbe52e3a5d5453b=instructions%3Au
 About 0.5% at `O0`.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-30 Thread Giulio Eulisse via cfe-commits

ktf wrote:

Thanks! If I were younger I would probably know how to post a dancing banana 
emoji at this point... ;-)

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits

vgvassilev wrote:

@ktf, congrats for your first patch to llvm. Well done!

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits

https://github.com/vgvassilev closed 
https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits

vgvassilev wrote:

I believe we have beaten this PR to death and would propose to merge it when 
the builds succeed. 

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/20] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/19] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/18] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/17] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/16] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Jakub Kuderski via cfe-commits


@@ -0,0 +1,266 @@
+//===- llvm/ADT/PagedVector.h - 'Lazily allocated' vectors --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  explicit PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (!PagePtr) {
+  PagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(PagePtr, PageSize);
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  /// If the new size is smaller than the current size, the elements of the
+  /// pages that are not needed anymore will be destroyed, however, elements of
+  /// the last page will not be destroyed.
+  ///
+  /// For these reason the usage of this vector is discouraged if you rely
+  /// on the construction / destructor of the 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Jakub Kuderski via cfe-commits


@@ -0,0 +1,266 @@
+//===- llvm/ADT/PagedVector.h - 'Lazily allocated' vectors --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  explicit PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (!PagePtr) {
+  PagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(PagePtr, PageSize);
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  /// If the new size is smaller than the current size, the elements of the
+  /// pages that are not needed anymore will be destroyed, however, elements of
+  /// the last page will not be destroyed.
+  ///
+  /// For these reason the usage of this vector is discouraged if you rely
+  /// on the construction / destructor of the 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Jakub Kuderski via cfe-commits


@@ -0,0 +1,266 @@
+//===- llvm/ADT/PagedVector.h - 'Lazily allocated' vectors --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  explicit PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (!PagePtr) {
+  PagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(PagePtr, PageSize);
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  /// If the new size is smaller than the current size, the elements of the
+  /// pages that are not needed anymore will be destroyed, however, elements of
+  /// the last page will not be destroyed.
+  ///
+  /// For these reason the usage of this vector is discouraged if you rely
+  /// on the construction / destructor of the 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Jakub Kuderski via cfe-commits


@@ -0,0 +1,266 @@
+//===- llvm/ADT/PagedVector.h - 'Lazily allocated' vectors --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  explicit PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (!PagePtr) {
+  PagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(PagePtr, PageSize);
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  /// If the new size is smaller than the current size, the elements of the
+  /// pages that are not needed anymore will be destroyed, however, elements of
+  /// the last page will not be destroyed.
+  ///
+  /// For these reason the usage of this vector is discouraged if you rely
+  /// on the construction / destructor of the 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Jakub Kuderski via cfe-commits


@@ -0,0 +1,266 @@
+//===- llvm/ADT/PagedVector.h - 'Lazily allocated' vectors --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  explicit PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (!PagePtr) {
+  PagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(PagePtr, PageSize);
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  /// If the new size is smaller than the current size, the elements of the
+  /// pages that are not needed anymore will be destroyed, however, elements of
+  /// the last page will not be destroyed.
+  ///
+  /// For these reason the usage of this vector is discouraged if you rely
+  /// on the construction / destructor of the 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

ktf wrote:

@vgvassilev Thanks, I have pushed an update with the remaining comments 
addressed. I also verified that ROOT peaks at 52 MB rather than 82.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits


@@ -0,0 +1,342 @@
+//===- llvm/unittest/ADT/PagedVectorTest.cpp 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// PagedVector unit tests.
+//
+//===--===//
+
+#include "llvm/ADT/PagedVector.h"
+#include "gtest/gtest.h"
+#include 
+
+namespace llvm {
+TEST(PagedVectorTest, EmptyTest) {
+  PagedVector V;
+  EXPECT_EQ(V.empty(), true);
+  EXPECT_EQ(V.size(), 0ULL);
+  EXPECT_EQ(V.capacity(), 0ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 0ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 0ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
+
+  EXPECT_DEATH(V[0], "Index < Size");
+  EXPECT_DEATH(PagedVector(nullptr), "Allocator cannot be null");
+}
+
+TEST(PagedVectorTest, ExpandTest) {
+  PagedVector V;
+  V.resize(2);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 2ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 2ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 2ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
+}
+
+TEST(PagedVectorTest, FullPageFillingTest) {
+  PagedVector V;
+  V.resize(10);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 10ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  for (int I = 0; I < 10; ++I) {
+V[I] = I;
+  }

ktf wrote:

Done. Including some ifs...

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits


@@ -0,0 +1,267 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  explicit PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be nullptr");

ktf wrote:

This of course should have been `assert(A`. I fixed your commit in a subsequent 
one.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/15] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/14] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/13] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/12] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/11] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,342 @@
+//===- llvm/unittest/ADT/PagedVectorTest.cpp 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// PagedVector unit tests.
+//
+//===--===//
+
+#include "llvm/ADT/PagedVector.h"
+#include "gtest/gtest.h"
+#include 
+
+namespace llvm {
+TEST(PagedVectorTest, EmptyTest) {
+  PagedVector V;
+  EXPECT_EQ(V.empty(), true);
+  EXPECT_EQ(V.size(), 0ULL);
+  EXPECT_EQ(V.capacity(), 0ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 0ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 0ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
+
+  EXPECT_DEATH(V[0], "Index < Size");
+  EXPECT_DEATH(PagedVector(nullptr), "Allocator cannot be null");
+}
+
+TEST(PagedVectorTest, ExpandTest) {
+  PagedVector V;
+  V.resize(2);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 2ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 2ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 2ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
+}
+
+TEST(PagedVectorTest, FullPageFillingTest) {
+  PagedVector V;
+  V.resize(10);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 10ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  for (int I = 0; I < 10; ++I) {
+V[I] = I;
+  }
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 10ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 0ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 10ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 10LL);
+  for (int I = 0; I < 10; ++I) {
+EXPECT_EQ(V[I], I);
+  }
+}
+
+TEST(PagedVectorTest, HalfPageFillingTest) {
+  PagedVector V;
+  V.resize(5);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 5ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  for (int I = 0; I < 5; ++I) {
+V[I] = I;
+  }
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 5LL);
+  for (int I = 0; I < 5; ++I) {
+EXPECT_EQ(V[I], I);
+  }
+  for (int I = 5; I < 10; ++I) {
+EXPECT_DEATH(V[I], "Index < Size");
+  }
+}
+
+TEST(PagedVectorTest, FillFullMultiPageTest) {
+  PagedVector V;
+  V.resize(20);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 20ULL);
+  EXPECT_EQ(V.capacity(), 20ULL);
+  for (int I = 0; I < 20; ++I) {
+V[I] = I;
+  }
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 20LL);
+  for (auto MI = V.materialized_begin(), ME = V.materialized_end(); MI != ME;
+   ++MI) {
+EXPECT_EQ(*MI, std::distance(V.materialized_begin(), MI));
+  }
+}
+
+TEST(PagedVectorTest, FillHalfMultiPageTest) {
+  PagedVector V;
+  V.resize(20);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 20ULL);
+  EXPECT_EQ(V.capacity(), 20ULL);
+  for (int I = 0; I < 5; ++I) {
+V[I] = I;
+  }
+  for (int I = 10; I < 15; ++I) {
+V[I] = I;
+  }
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 20LL);
+  for (int I = 0; I < 5; ++I) {
+EXPECT_EQ(V[I], I);
+  }
+  for (int I = 10; I < 15; ++I) {
+EXPECT_EQ(V[I], I);
+  }
+}
+
+TEST(PagedVectorTest, FillLastMultiPageTest) {
+  PagedVector V;
+  V.resize(20);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 20ULL);
+  EXPECT_EQ(V.capacity(), 20ULL);
+  for (int I = 10; I < 15; ++I) {
+V[I] = I;
+  }
+  for (int I = 10; I < 15; ++I) {
+EXPECT_EQ(V[I], I);
+  }
+
+  // Since we fill the last page only, the materialized vector
+  // should contain only the last page.
+  int J = 10;
+  for (auto MI = V.materialized_begin(), ME = V.materialized_end(); MI != ME;
+   ++MI) {
+if (J < 15) {
+  EXPECT_EQ(*MI, J);
+} else {
+  EXPECT_EQ(*MI, 0);
+}
+++J;
+  }
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 10LL);
+}
+
+// Filling the first element of all the pages
+// will allocate all of them
+TEST(PagedVectorTest, FillSparseMultiPageTest) {
+  PagedVector V;
+  V.resize(100);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 100ULL);
+  EXPECT_EQ(V.capacity(), 100ULL);
+  for (int I = 0; I < 10; ++I) {
+V[I * 10] = I;
+  }
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 
100LL);
+  for (int I = 0; I < 100; ++I) {
+if (I % 10 == 0) {
+  EXPECT_EQ(V[I], I / 10);
+} else {
+  EXPECT_EQ(V[I], 0);
+}
+  }
+}
+
+struct TestHelper {
+  int A = -1;
+};
+
+// Use this to count how many times the constructor / destructor are called
+struct TestHelper2 {
+  int A = -1;
+  static int constructed;
+  static int destroyed;
+
+  TestHelper2() { constructed++; }
+  

[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,267 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//

vgvassilev wrote:

```suggestion
//===- llvm/ADT/PagedVector.h - 'Lazily allocated' vectors --*- C++ -*-===//
```

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,342 @@
+//===- llvm/unittest/ADT/PagedVectorTest.cpp 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// PagedVector unit tests.
+//
+//===--===//
+
+#include "llvm/ADT/PagedVector.h"
+#include "gtest/gtest.h"
+#include 
+
+namespace llvm {
+TEST(PagedVectorTest, EmptyTest) {
+  PagedVector V;
+  EXPECT_EQ(V.empty(), true);
+  EXPECT_EQ(V.size(), 0ULL);
+  EXPECT_EQ(V.capacity(), 0ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 0ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 0ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
+
+  EXPECT_DEATH(V[0], "Index < Size");
+  EXPECT_DEATH(PagedVector(nullptr), "Allocator cannot be null");
+}
+
+TEST(PagedVectorTest, ExpandTest) {
+  PagedVector V;
+  V.resize(2);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 2ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 2ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 2ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
+}
+
+TEST(PagedVectorTest, FullPageFillingTest) {
+  PagedVector V;
+  V.resize(10);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 10ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  for (int I = 0; I < 10; ++I) {
+V[I] = I;
+  }

vgvassilev wrote:

And for the rest of the loops in the test.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits

https://github.com/vgvassilev edited 
https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,342 @@
+//===- llvm/unittest/ADT/PagedVectorTest.cpp 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// PagedVector unit tests.
+//
+//===--===//
+
+#include "llvm/ADT/PagedVector.h"
+#include "gtest/gtest.h"
+#include 
+
+namespace llvm {
+TEST(PagedVectorTest, EmptyTest) {
+  PagedVector V;
+  EXPECT_EQ(V.empty(), true);
+  EXPECT_EQ(V.size(), 0ULL);
+  EXPECT_EQ(V.capacity(), 0ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 0ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 0ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
+
+  EXPECT_DEATH(V[0], "Index < Size");
+  EXPECT_DEATH(PagedVector(nullptr), "Allocator cannot be null");
+}
+
+TEST(PagedVectorTest, ExpandTest) {
+  PagedVector V;
+  V.resize(2);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 2ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  EXPECT_EQ(V.materialized_begin().getIndex(), 2ULL);
+  EXPECT_EQ(V.materialized_end().getIndex(), 2ULL);
+  EXPECT_EQ(std::distance(V.materialized_begin(), V.materialized_end()), 0LL);
+}
+
+TEST(PagedVectorTest, FullPageFillingTest) {
+  PagedVector V;
+  V.resize(10);
+  EXPECT_EQ(V.empty(), false);
+  EXPECT_EQ(V.size(), 10ULL);
+  EXPECT_EQ(V.capacity(), 10ULL);
+  for (int I = 0; I < 10; ++I) {
+V[I] = I;
+  }

vgvassilev wrote:

```suggestion
  for (int I = 0; I < 10; ++I)
V[I] = I;
```

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits

https://github.com/vgvassilev approved this pull request.

LGTM modulo my comments. Before merging could you verify that we still get the 
expected performance gains on the ROOT side?

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,267 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  explicit PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be nullptr");

vgvassilev wrote:

```suggestion
assert(!A && "Allocator cannot be nullptr");
```

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

ktf wrote:

Ping @zygoloid. I have addressed the last 6 comments. I hope I did not miss any 
this time.
Ping @vgvassilev. Could you do one last pass as well and commit? I do not think 
I have commit rights. I also checked with my ROOT build and everything works as 
before and the memory gain is still there.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-29 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 01/10] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits


@@ -0,0 +1,301 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialised elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously. The order 
of
+  /// the elements however depends on the order of access of the pages.
+  PointerIntPair Allocator;
+
+  constexpr static T *InvalidPage = nullptr;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be null");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (PagePtr == InvalidPage) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector. I.e. the maximum index that can be
+  /// accessed, i.e. the maximum value which was used as argument of the
+  /// resize method.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf unresolved 
https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/8] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/7] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/6] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits


@@ -0,0 +1,282 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (!PagePtr) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector. I.e. the maximum index that can be
+  /// accessed, i.e. the maximum value which was used as argument of the
+  /// resize method.

zygoloid wrote:

```suggestion
  /// Return the size of the vector.
```
I don't think the additional explanation is necessary here; every LLVM 
developer should know what `size()` on a container does. If you'd prefer to 
keep the comments, then they're both inaccurate (it's the maximum index that 
can be accessed *plus 1*, and it's the most recent value passed to `resize` not 
the maximum).


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits


@@ -0,0 +1,301 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialised elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously. The order 
of
+  /// the elements however depends on the order of access of the pages.
+  PointerIntPair Allocator;
+
+  constexpr static T *InvalidPage = nullptr;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be null");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (PagePtr == InvalidPage) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector. I.e. the maximum index that can be
+  /// accessed, i.e. the maximum value which was used as argument of the
+  /// resize method.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits

https://github.com/zygoloid approved this pull request.

It looks like there were a couple of unaddressed comments from my earlier 
reviews; I think github may have been "helpfully" hiding them from you by 
default. I've added comments on them so you should hopefully be able to find 
them.

Thanks for your perseverance, this is looking good. Only really trivial nits; 
for my part, I'm happy for you to go ahead and commit once you've addressed 
them.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits


@@ -0,0 +1,282 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {

zygoloid wrote:

```suggestion
  explicit PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
```

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits


@@ -0,0 +1,301 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialised elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously. The order 
of
+  /// the elements however depends on the order of access of the pages.
+  PointerIntPair Allocator;
+
+  constexpr static T *InvalidPage = nullptr;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be null");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (PagePtr == InvalidPage) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector. I.e. the maximum index that can be
+  /// accessed, i.e. the maximum value which was used as argument of the
+  /// resize method.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits


@@ -0,0 +1,282 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (!PagePtr) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;

zygoloid wrote:

```suggestion
  PagePtr = Allocator.getPointer()->template Allocate(PageSize);
  // We need to invoke the default constructor on all the elements of the
  // page.
  std::uninitialized_value_construct_n(PagePtr, PageSize);
```
Minor simplification; feel free to apply or ignore this suggestion.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits


@@ -0,0 +1,301 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialised elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously. The order 
of
+  /// the elements however depends on the order of access of the pages.
+  PointerIntPair Allocator;
+
+  constexpr static T *InvalidPage = nullptr;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be null");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (PagePtr == InvalidPage) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector. I.e. the maximum index that can be
+  /// accessed, i.e. the maximum value which was used as argument of the
+  /// resize method.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits

https://github.com/zygoloid edited 
https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits

https://github.com/zygoloid resolved 
https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits

https://github.com/zygoloid unresolved 
https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Richard Smith via cfe-commits

https://github.com/zygoloid resolved 
https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/5] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/4] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/3] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/2] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread David Blaikie via cfe-commits

dwblaikie wrote:

Oh, no worries at all. LLVM 's still figuring out this whole GitHub transition 
anyway. I mention it only for future reference and because it might help 
reviewers/you if comments are getting missplaced/there's trouble tracking 
unaddressed feedback, etc.

Thanks for sticking with this! I know it's a lot to work through.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,282 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (PagePtr == nullptr) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector. I.e. the maximum index that can be
+  /// accessed, i.e. the maximum value which was used as argument of the
+  /// resize method.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  /// If the new size is smaller than the current size, the elements of the
+  /// pages that are not needed anymore will be destroyed, 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,282 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (PagePtr == nullptr) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector. I.e. the maximum index that can be
+  /// accessed, i.e. the maximum value which was used as argument of the
+  /// resize method.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  /// If the new size is smaller than the current size, the elements of the
+  /// pages that are not needed anymore will be destroyed, 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,282 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (PagePtr == nullptr) {
+  T *NewPagePtr = Allocator.getPointer()->template Allocate(PageSize);
+  // We need to invoke the default constructor on all the elements of the
+  // page.
+  std::uninitialized_value_construct_n(NewPagePtr, PageSize);
+
+  PagePtr = NewPagePtr;
+}
+// Dereference the element in the page.
+return PagePtr[Index % PageSize];
+  }
+
+  /// Return the capacity of the vector. I.e. the maximum size it can be
+  /// expanded to with the resize method without allocating more pages.
+  [[nodiscard]] size_t capacity() const {
+return PageToDataPtrs.size() * PageSize;
+  }
+
+  /// Return the size of the vector. I.e. the maximum index that can be
+  /// accessed, i.e. the maximum value which was used as argument of the
+  /// resize method.
+  [[nodiscard]] size_t size() const { return Size; }
+
+  /// Resize the vector. Notice that the constructor of the elements will not
+  /// be invoked until an element of a given page is accessed, at which point
+  /// all the elements of the page will be constructed.
+  ///
+  /// If the new size is smaller than the current size, the elements of the
+  /// pages that are not needed anymore will be destroyed, 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Vassil Vassilev via cfe-commits


@@ -0,0 +1,282 @@
+//===- llvm/ADT/PagedVector.h - 'Lazyly allocated' vectors *- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the PagedVector class.
+//
+//===--===//
+#ifndef LLVM_ADT_PAGEDVECTOR_H
+#define LLVM_ADT_PAGEDVECTOR_H
+
+#include "llvm/ADT/PointerIntPair.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Allocator.h"
+#include 
+#include 
+
+namespace llvm {
+/// A vector that allocates memory in pages.
+///
+/// Order is kept, but memory is allocated only when one element of the page is
+/// accessed. This introduces a level of indirection, but it is useful when you
+/// have a sparsely initialised vector where the full size is allocated 
upfront.
+///
+/// As a side effect the elements are initialised later than in a normal 
vector.
+/// On the first access to one of the elements of a given page, all the 
elements
+/// of the page are initialised. This also means that the elements of the page
+/// are initialised beyond the size of the vector.
+///
+/// Similarly on destruction the elements are destroyed only when the page is
+/// not needed anymore, delaying invoking the destructor of the elements.
+///
+/// Notice that this has iterators only on materialized elements. This
+/// is deliberately done under the assumption you would dereference the 
elements
+/// while iterating, therefore materialising them and losing the gains in terms
+/// of memory usage this container provides. If you have such a use case, you
+/// probably want to use a normal std::vector or a llvm::SmallVector.
+template  class PagedVector {
+  static_assert(PageSize > 1, "PageSize must be greater than 0. Most likely "
+  "you want it to be greater than 16.");
+  /// The actual number of elements in the vector which can be accessed.
+  size_t Size = 0;
+
+  /// The position of the initial element of the page in the Data vector.
+  /// Pages are allocated contiguously in the Data vector.
+  mutable SmallVector PageToDataPtrs;
+  /// Actual page data. All the page elements are allocated on the
+  /// first access of any of the elements of the page. Elements are default
+  /// constructed and elements of the page are stored contiguously.
+  PointerIntPair Allocator;
+
+public:
+  using value_type = T;
+
+  /// Default constructor. We build our own allocator and mark it as such with
+  /// `true` in the second pair element.
+  PagedVector() : Allocator(new BumpPtrAllocator, true) {}
+  PagedVector(BumpPtrAllocator *A) : Allocator(A, false) {
+assert(A != nullptr && "Allocator cannot be nullptr");
+  }
+
+  ~PagedVector() {
+clear();
+// If we own the allocator, delete it.
+if (Allocator.getInt())
+  delete Allocator.getPointer();
+  }
+
+  // Forbid copy and move as we do not need them for the current use case.
+  PagedVector(const PagedVector &) = delete;
+  PagedVector(PagedVector &&) = delete;
+  PagedVector =(const PagedVector &) = delete;
+  PagedVector =(PagedVector &&) = delete;
+
+  /// Look up an element at position `Index`.
+  /// If the associated page is not filled, it will be filled with default
+  /// constructed elements.
+  T [](size_t Index) const {
+assert(Index < Size);
+assert(Index / PageSize < PageToDataPtrs.size());
+T * = PageToDataPtrs[Index / PageSize];
+// If the page was not yet allocated, allocate it.
+if (PagePtr == nullptr) {

vgvassilev wrote:

```suggestion
if (!PagePtr) {
```

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

ktf wrote:

@dwblaikie apologies. This is my first PR to LLVM and given the effects of it 
for my use case, I guess I have been a bit to enthusiastic.

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread David Blaikie via cfe-commits

dwblaikie wrote:

(aside: I was confused why there was only one commit in this PR, since there'd 
been so many updates to it - but I see they've been force pushed, which my 
vague understanding is that force pushing can complicate tracking previous 
comments and the LLVM convention is not to do so: "When updating a pull 
request, you should push additional “fix up” commits to your branch instead of 
force pushing. This makes it easier for GitHub to track the context of previous 
review comments. " - https://llvm.org/docs/GitHub.html#updating-pull-requests )

https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From 07da9379065daab62d43ba453ba6b71f3880a089 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo *)nullptr);

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From c14aaf14cbcd292165aece38ebdae74ce92a4d32 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/5] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From c14aaf14cbcd292165aece38ebdae74ce92a4d32 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/4] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf updated https://github.com/llvm/llvm-project/pull/66430

>From c14aaf14cbcd292165aece38ebdae74ce92a4d32 Mon Sep 17 00:00:00 2001
From: Giulio Eulisse <10544+...@users.noreply.github.com>
Date: Thu, 14 Sep 2023 21:58:21 +0200
Subject: [PATCH 1/3] Introduce PagedVector class

The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
---
 clang/include/clang/Basic/SourceManager.h |   3 +-
 clang/include/clang/Serialization/ASTReader.h |   5 +-
 clang/lib/Basic/SourceManager.cpp |  10 +-
 clang/lib/Serialization/ASTReader.cpp |   5 +-
 llvm/docs/ProgrammersManual.rst   |  34 ++
 llvm/include/llvm/ADT/PagedVector.h   | 282 +++
 llvm/unittests/ADT/CMakeLists.txt |   1 +
 llvm/unittests/ADT/PagedVectorTest.cpp| 342 ++
 8 files changed, 672 insertions(+), 10 deletions(-)
 create mode 100644 llvm/include/llvm/ADT/PagedVector.h
 create mode 100644 llvm/unittests/ADT/PagedVectorTest.cpp

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index 2f846502d6f3327..e37caa2252532f9 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -43,6 +43,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -699,7 +700,7 @@ class SourceManager : public RefCountedBase {
   ///
   /// Negative FileIDs are indexes into this table. To get from ID to an index,
   /// use (-ID - 2).
-  SmallVector LoadedSLocEntryTable;
+  llvm::PagedVector LoadedSLocEntryTable;
 
   /// The starting offset of the next local SLocEntry.
   ///
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dc1eb21c27801fe..65e19c6e44cf571 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -38,6 +38,7 @@
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/PagedVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -487,7 +488,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the type with
   /// ID = (I + 1) << FastQual::Width has already been loaded
-  std::vector TypesLoaded;
+  llvm::PagedVector TypesLoaded;
 
   using GlobalTypeMapType =
   ContinuousRangeMap;
@@ -501,7 +502,7 @@ class ASTReader
   ///
   /// When the pointer at index I is non-NULL, the declaration with ID
   /// = I + 1 has already been loaded.
-  std::vector DeclsLoaded;
+  llvm::PagedVector DeclsLoaded;
 
   using GlobalDeclMapType =
   ContinuousRangeMap;
diff --git a/clang/lib/Basic/SourceManager.cpp 
b/clang/lib/Basic/SourceManager.cpp
index 0521ac7b30339ab..7fa8b8096ac4931 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -2344,11 +2344,11 @@ SourceManager::MemoryBufferSizes 
SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-+ llvm::capacity_in_bytes(LocalSLocEntryTable)
-+ llvm::capacity_in_bytes(LoadedSLocEntryTable)
-+ llvm::capacity_in_bytes(SLocEntryLoaded)
-+ llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+llvm::capacity_in_bytes(LocalSLocEntryTable) +
+llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+llvm::capacity_in_bytes(SLocEntryLoaded) +
+llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
 size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 0952244d037a77c..bb2a32ade0a0d7c 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -7944,9 +7944,10 @@ void ASTReader::PrintStats() {
   std::fprintf(stderr, "*** AST File Statistics:\n");
 
   unsigned NumTypesLoaded =
-  TypesLoaded.size() - llvm::count(TypesLoaded, QualType());
+  TypesLoaded.size() - llvm::count(TypesLoaded.materialized(), QualType());
   unsigned NumDeclsLoaded =
-  DeclsLoaded.size() - llvm::count(DeclsLoaded, (Decl *)nullptr);
+  DeclsLoaded.size() -
+  llvm::count(DeclsLoaded.materialized(), (Decl *)nullptr);
   unsigned NumIdentifiersLoaded =
   IdentifiersLoaded.size() -
   llvm::count(IdentifiersLoaded, (IdentifierInfo 

[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Introduce paged vector (PR #66430)

2023-09-28 Thread Giulio Eulisse via cfe-commits

https://github.com/ktf resolved https://github.com/llvm/llvm-project/pull/66430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   5   6   >