[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-30 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith added a comment.

In D77621#2002430 , @dblaikie wrote:

> (seems a bit questionable to rely on uintptr_t being necessarily the same 
> type as either uint32_t or uint64_t - but maybe that's guaranteed/written 
> down somewhere)?


I think in practice uintptr_t will match range and size with one of uint32_t 
and uint64_t, but as you suggest it might not be equivalent to either (e.g., 
could use all three of unsigned, unsigned long, and unsigned long long).  We 
need to be consistent between the std::conditional in SmallVectorSizeType and 
the explicit instantiations (and skimming the current implementation it looks 
like it is consistent, doesn’t rely on uintptr_t).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-30 Thread Andrew via Phabricator via cfe-commits
browneee added a comment.

https://reviews.llvm.org/D79214


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-30 Thread Andrew via Phabricator via cfe-commits
browneee added a comment.

In D77621#2013546 , @nikic wrote:

> @browneee Looks like LLVM already defines `LLVM_PTR_SIZE` as a more portable 
> version of `__SIZEOF_POINTER__`.


I saw LLVM_PTR_SIZE, but its definition may be based on sizeof() 
,
 so I don't think it should be used in a preprocessor condition.

SIZE_MAX looks like a good option.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-30 Thread Nikita Popov via Phabricator via cfe-commits
nikic added a comment.

@browneee Looks like LLVM already defines `LLVM_PTR_SIZE` as a more portable 
version of `__SIZEOF_POINTER__`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-30 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added a comment.

In D77621#2013400 , @browneee wrote:

> Thanks for the tips, MaskRay.
>
> Yes, I expect this would fix that issue.
>
> smeenai, SizeTypeMax() is intended to return size_t.
>
>  ---
>
> I see a couple of options for fixing the truncation warning on 32-bit 
> platforms:
>
> 1. Add an explicit cast to remove the warning.
>   - Disadvantage: the second instantiation still exists even though it is 
> unused.
>
> ``` static constexpr size_t SizeTypeMax() { return 
> static_cast(std::numeric_limits::max()); } ```
> 2. Use a std::conditional to swap the type of one instantiation to avoid 
> conflicts.
>
>   In this case I'd probably swap back to using uintptr_t and disable the 
> uint32_t on 32bit.
>   - Disadvantage: the second instantiation still exists even though it is 
> unused.
>
> ``` // Will be unused when instantiated with char. // This is to avoid 
> instantiation for uint32_t conflicting with uintptr_t on 32-bit systems. 
> template class llvm::SmallVectorBase uint32_t, char>::type>; template class llvm::SmallVectorBase; ```
> 3. Use preprocessor to disable one of the instantiations on 32-bit platforms.
>
>   In this case I'd probably swap back to using uintptr_t and disable the 
> uint32_t on 32bit.
>   - Disadvantage: uses preprocessor
>   - Disadvantage: potential for portability issues with different platforms 
> lacking certain macros
>
> ``` #if __SIZEOF_POINTER__ != 4  && !defined(_WIN32) && !defined(__ILP32) 
> template class llvm::SmallVectorBase; #endif template class 
> llvm::SmallVectorBase; ```
>
> My order of preference would be 1, 2, 3.
>
> Is there another solution I've missed? Thoughts on which is best? 
> @dblaikie


I don't think (3) needs to be non-portable - it could use SIZE_MAX to test if 
it's bigger than 2^32? (or == 2^64)? That seems like it'd be a pretty good 
direct test? & ensure the types written in the explicit specializations are the 
same types written in the header/std::conditional (seems a bit questionable to 
rely on uintptr_t being necessarily the same type as either uint32_t or 
uint64_t - but maybe that's guaranteed/written down somewhere)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-30 Thread Andrew via Phabricator via cfe-commits
browneee added a comment.

Thanks for the tips, MaskRay.

Yes, I expect this would fix that issue.

smeenai, SizeTypeMax() is intended to return size_t.

---

I see a couple of options for fixing the truncation warning on 32-bit platforms:

1. Add an explicit cast to remove the warning.
  - Disadvantage: the second instantiation still exists even though it is 
unused.

  static constexpr size_t SizeTypeMax() {
return static_cast(std::numeric_limits::max());
  }



2. Use a std::conditional to swap the type of one instantiation to avoid 
conflicts.

  In this case I'd probably swap back to using uintptr_t and disable the 
uint32_t on 32bit.
  - Disadvantage: the second instantiation still exists even though it is 
unused.

  // Will be unused when instantiated with char.
  // This is to avoid instantiation for uint32_t conflicting with uintptr_t on 
32-bit systems.
  template class llvm::SmallVectorBase::type>; 
  template class llvm::SmallVectorBase;  



3. Use preprocessor to disable one of the instantiations on 32-bit platforms.

  In this case I'd probably swap back to using uintptr_t and disable the 
uint32_t on 32bit.
  - Disadvantage: uses preprocessor
  - Disadvantage: potential for portability issues with different platforms 
lacking certain macros

  #if __SIZEOF_POINTER__ != 4  && !defined(_WIN32) && !defined(__ILP32)
  template class llvm::SmallVectorBase; 
  #endif
  template class llvm::SmallVectorBase;  

My order of preference would be 1, 2, 3.

Is there another solution I've missed? Thoughts on which is best? @dblaikie


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-30 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added inline comments.



Comment at: llvm/include/llvm/ADT/SmallVector.h:52
+  // The maximum size depends on size_type used.
+  static constexpr size_t SizeMax() {
+return std::numeric_limits::max();

smeenai wrote:
> browneee wrote:
> > dexonsmith wrote:
> > > STL data structures have a name for this called `max_size()`.  Should we 
> > > be consistent with that?
> > Good question.
> > 
> > This brought my attention to the existing 
> > SmallVectorTemplateCommon::max_size() which also needed to be updated.
> > I'm going to name this new function SizeTypeMax to best describe what it 
> > provides, and leave it separate from max_size().
> Was it intentional to make this return a `size_t` rather than a `Size_T`? 
> Clang gives a truncation warning on 32-bit platforms when you try to 
> instantiate the template with `uint64_t` as a result.
I think returning size_t is the correct thing here & the fix is not to use a 64 
bit size on a 32 bit machine - that was initially intended to be solved by 
using uintptr_t, but got lost when that turned out to cause issues on 32 bit 
machines.

@browneee could you fix this differently, so that when sizeof(uintptr_t) == 4 
there's only one instantiation/only uint32_t is used?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-30 Thread Shoaib Meenai via Phabricator via cfe-commits
smeenai added inline comments.



Comment at: llvm/include/llvm/ADT/SmallVector.h:52
+  // The maximum size depends on size_type used.
+  static constexpr size_t SizeMax() {
+return std::numeric_limits::max();

browneee wrote:
> dexonsmith wrote:
> > STL data structures have a name for this called `max_size()`.  Should we be 
> > consistent with that?
> Good question.
> 
> This brought my attention to the existing 
> SmallVectorTemplateCommon::max_size() which also needed to be updated.
> I'm going to name this new function SizeTypeMax to best describe what it 
> provides, and leave it separate from max_size().
Was it intentional to make this return a `size_t` rather than a `Size_T`? Clang 
gives a truncation warning on 32-bit platforms when you try to instantiate the 
template with `uint64_t` as a result.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-27 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment.

I guess this fixes https://bugs.llvm.org/show_bug.cgi?id=45289 as well


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-27 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay closed this revision.
MaskRay added a comment.

@browneee Hi, it seems that you did not attach `Differential Revision: ` to the 
commit dda3c19a3618dce9492687f8e880e7a73486ee98 
 so the 
differential was not closed automatically.

In previous llvm-dev discussions, people agreed that `Reviewed-by:` and 
`Differential Revision: ` are useful and should be retained.

I usually do this when committing: `arcfilter; git fetch origin master && git 
rebase origin/master; last-minute-testing && git push origin HEAD:master`

where `arcfilter` is a shell function which drops unneeded Phabricator tags

  arcfilter () {
  arc amend
  git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} 
/Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: 
/,"");print}' | git commit --amend -F -
  }


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-27 Thread David Blaikie via Phabricator via cfe-commits
dblaikie accepted this revision.
dblaikie added a comment.

Seems good to me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread Andrew via Phabricator via cfe-commits
browneee marked an inline comment as done.
browneee added inline comments.



Comment at: llvm/include/llvm/ADT/SmallVector.h:19
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"

nikic wrote:
> Is this include still needed?
SmallVectorTemplateBase::grow() still remains in the 
header and uses report_bad_alloc_error().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 260064.
browneee added a comment.

Change uintptr_t to uint64_t to ensure this does not instantiate the same 
template twice on platforms where uintptr_t is equivalent to uint32_t.

Also considered using the preprocessor to disable the uintptr_t instantiation, 
but chose to avoid preprocessor use.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  llvm/include/llvm/ADT/SmallVector.h
  llvm/lib/Support/SmallVector.cpp

Index: llvm/lib/Support/SmallVector.cpp
===
--- llvm/lib/Support/SmallVector.cpp
+++ llvm/lib/Support/SmallVector.cpp
@@ -37,24 +37,30 @@
   sizeof(unsigned) * 2 + sizeof(void *) * 2,
   "wasted space in SmallVector size 1");
 
-/// grow_pod - This is an implementation of the grow() method which only works
-/// on POD-like datatypes and is out of line to reduce code duplication.
-/// This function will report a fatal error if it cannot increase capacity.
-void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
-   size_t TSize) {
-  // Ensure we can fit the new capacity in 32 bits.
-  if (MinCapacity > UINT32_MAX)
+static_assert(sizeof(SmallVector) ==
+  sizeof(void *) * 2 + sizeof(void *),
+  "1 byte elements have word-sized type for size and capacity");
+
+// Note: Moving this function into the header may cause performance regression.
+template 
+void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
+   size_t TSize) {
+  // Ensure we can fit the new capacity.
+  // This is only going to be applicable when the capacity is 32 bit.
+  if (MinCapacity > SizeTypeMax())
 report_bad_alloc_error("SmallVector capacity overflow during allocation");
 
   // Ensure we can meet the guarantee of space for at least one more element.
   // The above check alone will not catch the case where grow is called with a
   // default MinCapacity of 0, but the current capacity cannot be increased.
-  if (capacity() == size_t(UINT32_MAX))
+  // This is only going to be applicable when the capacity is 32 bit.
+  if (capacity() == SizeTypeMax())
 report_bad_alloc_error("SmallVector capacity unable to grow");
 
+  // In theory 2*capacity can overflow if the capacity is 64 bit, but the
+  // original capacity would never be large enough for this to be a problem.
   size_t NewCapacity = 2 * capacity() + 1; // Always grow.
-  NewCapacity =
-  std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
+  NewCapacity = std::min(std::max(NewCapacity, MinCapacity), SizeTypeMax());
 
   void *NewElts;
   if (BeginX == FirstEl) {
@@ -70,3 +76,6 @@
   this->BeginX = NewElts;
   this->Capacity = NewCapacity;
 }
+
+template class llvm::SmallVectorBase;
+template class llvm::SmallVectorBase;
Index: llvm/include/llvm/ADT/SmallVector.h
===
--- llvm/include/llvm/ADT/SmallVector.h
+++ llvm/include/llvm/ADT/SmallVector.h
@@ -16,10 +16,10 @@
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/AlignOf.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/MemAlloc.h"
 #include "llvm/Support/type_traits.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 
 #include 
 #include 
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,11 +35,23 @@
 
 namespace llvm {
 
-/// This is all the non-templated stuff common to all SmallVectors.
-class SmallVectorBase {
+/// This is all the stuff common to all SmallVectors.
+///
+/// The template parameter specifies the type which should be used to hold the
+/// Size and Capacity of the SmallVector, so it can be adjusted.
+/// Using 32 bit size is desirable to shink the size of the SmallVector.
+/// Using 64 bit size is desirable for cases like SmallVector, where a
+/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
+/// buffering bitcode output - which can exceed 4GB.
+template  class SmallVectorBase {
 protected:
   void *BeginX;
-  unsigned Size = 0, Capacity;
+  Size_T Size = 0, Capacity;
+
+  /// The maximum value of the Size_T used.
+  static constexpr size_t SizeTypeMax() {
+return std::numeric_limits::max();
+  }
 
   SmallVectorBase() = delete;
   SmallVectorBase(void *FirstEl, size_t TotalCapacity)
@@ -70,9 +83,14 @@
   }
 };
 
+template 
+using SmallVectorSizeType =
+typename std::conditional= 8, uint64_t,
+  uint32_t>::type;
+
 /// Figure out the offset of the first element.
 template  struct SmallVectorAlignmentAndSize {
-  AlignedCharArrayUnion Base;
+  AlignedCharArrayUnion>> Base;
   AlignedCharArrayUnion FirstEl;
 };
 
@@ -80,7 +98,10 @@
 /// the type T is a POD. 

[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread Andrew via Phabricator via cfe-commits
browneee added a comment.

Reverted in 5cb4c3776a34d48e43d9118921d2191aee0e3d21 


Fails on plaforms where uintptr_t is the same type as uint32_t.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread Andrew via Phabricator via cfe-commits
browneee closed this revision.
browneee added a comment.

Comitted: b5f0eae1dc3c09c020cdf9d07238dec9acdacf5f 



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 260006.
browneee marked 4 inline comments as done.
browneee added a comment.

- Change SizeTypeMax to a static constexpr function.
- Fix comment typos.
- Add comment to alert others to possible performance loss if that function is 
moved to the header.

@nikic: Thank you for detecting, analyzing, and solving the performance 
regression!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  llvm/include/llvm/ADT/SmallVector.h
  llvm/lib/Support/SmallVector.cpp

Index: llvm/lib/Support/SmallVector.cpp
===
--- llvm/lib/Support/SmallVector.cpp
+++ llvm/lib/Support/SmallVector.cpp
@@ -37,24 +37,30 @@
   sizeof(unsigned) * 2 + sizeof(void *) * 2,
   "wasted space in SmallVector size 1");
 
-/// grow_pod - This is an implementation of the grow() method which only works
-/// on POD-like datatypes and is out of line to reduce code duplication.
-/// This function will report a fatal error if it cannot increase capacity.
-void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
-   size_t TSize) {
-  // Ensure we can fit the new capacity in 32 bits.
-  if (MinCapacity > UINT32_MAX)
+static_assert(sizeof(SmallVector) ==
+  sizeof(void *) * 2 + sizeof(void *),
+  "1 byte elements have word-sized type for size and capacity");
+
+// Note: Moving this function into the header may cause performance regression.
+template 
+void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
+   size_t TSize) {
+  // Ensure we can fit the new capacity.
+  // This is only going to be applicable when the capacity is 32 bit.
+  if (MinCapacity > SizeTypeMax())
 report_bad_alloc_error("SmallVector capacity overflow during allocation");
 
   // Ensure we can meet the guarantee of space for at least one more element.
   // The above check alone will not catch the case where grow is called with a
   // default MinCapacity of 0, but the current capacity cannot be increased.
-  if (capacity() == size_t(UINT32_MAX))
+  // This is only going to be applicable when the capacity is 32 bit.
+  if (capacity() == SizeTypeMax())
 report_bad_alloc_error("SmallVector capacity unable to grow");
 
+  // In theory 2*capacity can overflow if the capacity is 64 bit, but the
+  // original capacity would never be large enough for this to be a problem.
   size_t NewCapacity = 2 * capacity() + 1; // Always grow.
-  NewCapacity =
-  std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
+  NewCapacity = std::min(std::max(NewCapacity, MinCapacity), SizeTypeMax());
 
   void *NewElts;
   if (BeginX == FirstEl) {
@@ -70,3 +76,6 @@
   this->BeginX = NewElts;
   this->Capacity = NewCapacity;
 }
+
+template class llvm::SmallVectorBase;
+template class llvm::SmallVectorBase;
Index: llvm/include/llvm/ADT/SmallVector.h
===
--- llvm/include/llvm/ADT/SmallVector.h
+++ llvm/include/llvm/ADT/SmallVector.h
@@ -16,10 +16,10 @@
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/AlignOf.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/MemAlloc.h"
 #include "llvm/Support/type_traits.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 
 #include 
 #include 
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,11 +35,23 @@
 
 namespace llvm {
 
-/// This is all the non-templated stuff common to all SmallVectors.
-class SmallVectorBase {
+/// This is all the stuff common to all SmallVectors.
+///
+/// The template parameter specifies the type which should be used to hold the
+/// Size and Capacity of the SmallVector, so it can be adjusted.
+/// Using 32 bit size is desirable to shink the size of the SmallVector.
+/// Using 64 bit size is desirable for cases like SmallVector, where a
+/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
+/// buffering bitcode output - which can exceed 4GB.
+template  class SmallVectorBase {
 protected:
   void *BeginX;
-  unsigned Size = 0, Capacity;
+  Size_T Size = 0, Capacity;
+
+  /// The maximum value of the Size_T used.
+  static constexpr size_t SizeTypeMax() {
+return std::numeric_limits::max();
+  }
 
   SmallVectorBase() = delete;
   SmallVectorBase(void *FirstEl, size_t TotalCapacity)
@@ -70,9 +83,13 @@
   }
 };
 
+template 
+using SmallVectorSizeType =
+typename std::conditional::type;
+
 /// Figure out the offset of the first element.
 template  struct SmallVectorAlignmentAndSize {
-  AlignedCharArrayUnion Base;
+  AlignedCharArrayUnion>> Base;
   AlignedCharArrayUnion FirstEl;
 };
 
@@ -80,7 +97,10 @@
 /// the type T is a POD. The extra dummy 

[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread Nikita Popov via Phabricator via cfe-commits
nikic added inline comments.



Comment at: llvm/include/llvm/ADT/SmallVector.h:84
 
+template  const size_t SmallVectorBase::SizeTypeMax;
+

dblaikie wrote:
> nikic wrote:
> > Is this needed? I don't think it makes a lot of sense to allow odr-use of 
> > `SizeTypeMax`. As it's a protected member, it's only used in the 
> > SmallVector implementation, where we control how it is used.
> It's used as a parameter to std::min, so it's already odr used & I'd rather 
> not leave it as a trap to walk around even if we addressed that issue.
> 
> I assume if it were a constexpr local in a protected inline function it 
> wouldn't hinder optimizations in any real way?
> It's used as a parameter to std::min, so it's already odr used & I'd rather 
> not leave it as a trap to walk around even if we addressed that issue.

Oh, right you are! In that case this seems fine :)

> I assume if it were a constexpr local in a protected inline function it 
> wouldn't hinder optimizations in any real way?

The change from constexpr function to constexpr static didn't change anything 
performance-wise, so either way works for me.

Another option is:

```
enum : size_t { SizeTypeMax = std::numeric_limits::max() };
```

Kind of sad that in C++14, using an enum is still the only "no nonsense" way to 
declare a constant.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added inline comments.



Comment at: llvm/include/llvm/ADT/SmallVector.h:84
 
+template  const size_t SmallVectorBase::SizeTypeMax;
+

nikic wrote:
> Is this needed? I don't think it makes a lot of sense to allow odr-use of 
> `SizeTypeMax`. As it's a protected member, it's only used in the SmallVector 
> implementation, where we control how it is used.
It's used as a parameter to std::min, so it's already odr used & I'd rather not 
leave it as a trap to walk around even if we addressed that issue.

I assume if it were a constexpr local in a protected inline function it 
wouldn't hinder optimizations in any real way?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread Nikita Popov via Phabricator via cfe-commits
nikic accepted this revision.
nikic added inline comments.



Comment at: llvm/include/llvm/ADT/SmallVector.h:19
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"

Is this include still needed?



Comment at: llvm/include/llvm/ADT/SmallVector.h:84
 
+template  const size_t SmallVectorBase::SizeTypeMax;
+

Is this needed? I don't think it makes a lot of sense to allow odr-use of 
`SizeTypeMax`. As it's a protected member, it's only used in the SmallVector 
implementation, where we control how it is used.



Comment at: llvm/lib/Support/SmallVector.cpp:48
+  // Ensure we can fit the new capacity.
+  // This is only going to be applicable if the when the capacity is 32 bit.
+  if (MinCapacity > SizeTypeMax)

Nit: `if the when` => `if the` or `when`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread David Blaikie via Phabricator via cfe-commits
dblaikie accepted this revision.
dblaikie added a comment.

In D77621#2001099 , @dexonsmith wrote:

> In D77621#2000237 , @nikic wrote:
>
> > Okay, I'm basically fine with that, if it is our stance that SmallVector 
> > should always be preferred over std::vector. Not really related to this 
> > revision, but it would probably help to do some renaming/aliasing to 
> > facilitate that view. Right now, the number of `SmallVector` uses in 
> > LLVM is really small compared to the `std::vector` uses (100 vs 6000 
> > based on a not very accurate grep). I think part of that is in the name, 
> > and calling it `using Vector = SmallVector` and `using 
> > VectorImpl = SmallVectorImpl` would make it a lot more obvious that 
> > this is our preferred general purpose vector type, even if the stored data 
> > is not small.
>
>
> Those aliases SGTM.


I'd be slightly against, just because having a name that differs from the 
standard name only in case seems pretty subtle - that and momentum, we've had 
SmallVector around for a while & I think it's OK. I don't mind some places 
using std::vector either, though. Don't feel strongly enough that I'd outright 
stand against such an alias/change, but just expressing this amount of disfavor.

In D77621#2001378 , @nikic wrote:

> So tl;dr looks like as long as we keep `grow_pod` outside the header file, 
> this change seems to be approximately free in terms of compile-time and 
> memory usage both.


Awesome - thanks for looking into it!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size/capacity use word-size integers when elements are small

2020-04-24 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 259949.
browneee added a comment.

Switch back to size and capacity type conditionally larger approach (appologies 
for the noise here).

Apply performance regression solutions from @nikic


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  llvm/include/llvm/ADT/SmallVector.h
  llvm/lib/Support/SmallVector.cpp

Index: llvm/lib/Support/SmallVector.cpp
===
--- llvm/lib/Support/SmallVector.cpp
+++ llvm/lib/Support/SmallVector.cpp
@@ -37,24 +37,29 @@
   sizeof(unsigned) * 2 + sizeof(void *) * 2,
   "wasted space in SmallVector size 1");
 
-/// grow_pod - This is an implementation of the grow() method which only works
-/// on POD-like datatypes and is out of line to reduce code duplication.
-/// This function will report a fatal error if it cannot increase capacity.
-void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
-   size_t TSize) {
-  // Ensure we can fit the new capacity in 32 bits.
-  if (MinCapacity > UINT32_MAX)
+static_assert(sizeof(SmallVector) ==
+  sizeof(void *) * 2 + sizeof(void *),
+  "1 byte elements have word-sized type for size and capacity");
+
+template 
+void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
+   size_t TSize) {
+  // Ensure we can fit the new capacity.
+  // This is only going to be applicable if the when the capacity is 32 bit.
+  if (MinCapacity > SizeTypeMax)
 report_bad_alloc_error("SmallVector capacity overflow during allocation");
 
   // Ensure we can meet the guarantee of space for at least one more element.
   // The above check alone will not catch the case where grow is called with a
   // default MinCapacity of 0, but the current capacity cannot be increased.
-  if (capacity() == size_t(UINT32_MAX))
+  // This is only going to be applicable if the when the capacity is 32 bit.
+  if (capacity() == SizeTypeMax)
 report_bad_alloc_error("SmallVector capacity unable to grow");
 
+  // In theory 2*capacity can overflow if the capacity is 64 bit, but the
+  // original capacity would never be large enough for this to be a problem.
   size_t NewCapacity = 2 * capacity() + 1; // Always grow.
-  NewCapacity =
-  std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
+  NewCapacity = std::min(std::max(NewCapacity, MinCapacity), SizeTypeMax);
 
   void *NewElts;
   if (BeginX == FirstEl) {
@@ -70,3 +75,6 @@
   this->BeginX = NewElts;
   this->Capacity = NewCapacity;
 }
+
+template class llvm::SmallVectorBase;
+template class llvm::SmallVectorBase;
Index: llvm/include/llvm/ADT/SmallVector.h
===
--- llvm/include/llvm/ADT/SmallVector.h
+++ llvm/include/llvm/ADT/SmallVector.h
@@ -16,10 +16,10 @@
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/AlignOf.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/MemAlloc.h"
 #include "llvm/Support/type_traits.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 
 #include 
 #include 
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,11 +35,21 @@
 
 namespace llvm {
 
-/// This is all the non-templated stuff common to all SmallVectors.
-class SmallVectorBase {
+/// This is all the stuff common to all SmallVectors.
+///
+/// The template parameter specifies the type which should be used to hold the
+/// Size and Capacity of the SmallVector, so it can be adjusted.
+/// Using 32 bit size is desirable to shink the size of the SmallVector.
+/// Using 64 bit size is desirable for cases like SmallVector, where a
+/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
+/// buffering bitcode output - which can exceed 4GB.
+template  class SmallVectorBase {
 protected:
   void *BeginX;
-  unsigned Size = 0, Capacity;
+  Size_T Size = 0, Capacity;
+
+  /// The maximum value of the Size_T used.
+  static constexpr size_t SizeTypeMax = std::numeric_limits::max();
 
   SmallVectorBase() = delete;
   SmallVectorBase(void *FirstEl, size_t TotalCapacity)
@@ -70,9 +81,15 @@
   }
 };
 
+template  const size_t SmallVectorBase::SizeTypeMax;
+
+template 
+using SmallVectorSizeType =
+typename std::conditional::type;
+
 /// Figure out the offset of the first element.
 template  struct SmallVectorAlignmentAndSize {
-  AlignedCharArrayUnion Base;
+  AlignedCharArrayUnion>> Base;
   AlignedCharArrayUnion FirstEl;
 };
 
@@ -80,7 +97,10 @@
 /// the type T is a POD. The extra dummy template argument is used by ArrayRef
 /// to avoid unnecessarily requiring T to be complete.
 template 
-class SmallVectorTemplateCommon : public SmallVectorBase {
+class 

[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-22 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 259480.
browneee added a comment.

Switch approach back to std::vector change.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  clang-tools-extra/clang-doc/Serialize.cpp
  clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp
  clang/include/clang/Serialization/ASTWriter.h
  clang/include/clang/Serialization/PCHContainerOperations.h
  clang/lib/CodeGen/ObjectFilePCHContainerOperations.cpp
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/PrecompiledPreamble.cpp
  clang/lib/Frontend/SerializedDiagnosticPrinter.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/GlobalModuleIndex.cpp
  clang/lib/Serialization/PCHContainerOperations.cpp
  llvm/include/llvm/Bitcode/BitcodeWriter.h
  llvm/include/llvm/Bitstream/BitstreamWriter.h
  llvm/include/llvm/Remarks/BitstreamRemarkSerializer.h
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/ExecutionEngine/Orc/ThreadSafeModule.cpp
  llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
  llvm/tools/llvm-cat/llvm-cat.cpp
  llvm/tools/llvm-modextract/llvm-modextract.cpp
  llvm/unittests/Bitstream/BitstreamReaderTest.cpp
  llvm/unittests/Bitstream/BitstreamWriterTest.cpp

Index: llvm/unittests/Bitstream/BitstreamWriterTest.cpp
===
--- llvm/unittests/Bitstream/BitstreamWriterTest.cpp
+++ llvm/unittests/Bitstream/BitstreamWriterTest.cpp
@@ -8,27 +8,31 @@
 
 #include "llvm/Bitstream/BitstreamWriter.h"
 #include "llvm/ADT/STLExtras.h"
-#include "llvm/ADT/SmallString.h"
+#include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
+#include 
+
 using namespace llvm;
 
+using ::testing::IsEmpty;
+
 namespace {
 
 TEST(BitstreamWriterTest, emitBlob) {
-  SmallString<64> Buffer;
+  std::vector Buffer;
   BitstreamWriter W(Buffer);
   W.emitBlob("str", /* ShouldEmitSize */ false);
-  EXPECT_EQ(StringRef("str\0", 4), Buffer);
+  EXPECT_EQ(StringRef("str\0", 4), StringRef(Buffer.data(), Buffer.size()));
 }
 
 TEST(BitstreamWriterTest, emitBlobWithSize) {
-  SmallString<64> Buffer;
+  std::vector Buffer;
   {
 BitstreamWriter W(Buffer);
 W.emitBlob("str");
   }
-  SmallString<64> Expected;
+  std::vector Expected;
   {
 BitstreamWriter W(Expected);
 W.EmitVBR(3, 6);
@@ -38,21 +42,21 @@
 W.Emit('r', 8);
 W.Emit(0, 8);
   }
-  EXPECT_EQ(StringRef(Expected), Buffer);
+  EXPECT_EQ(Expected, Buffer);
 }
 
 TEST(BitstreamWriterTest, emitBlobEmpty) {
-  SmallString<64> Buffer;
+  std::vector Buffer;
   BitstreamWriter W(Buffer);
   W.emitBlob("", /* ShouldEmitSize */ false);
-  EXPECT_EQ(StringRef(""), Buffer);
+  EXPECT_THAT(Buffer, IsEmpty());
 }
 
 TEST(BitstreamWriterTest, emitBlob4ByteAligned) {
-  SmallString<64> Buffer;
+  std::vector Buffer;
   BitstreamWriter W(Buffer);
   W.emitBlob("str0", /* ShouldEmitSize */ false);
-  EXPECT_EQ(StringRef("str0"), Buffer);
+  EXPECT_EQ(StringRef("str0"), StringRef(Buffer.data(), Buffer.size()));
 }
 
 } // end namespace
Index: llvm/unittests/Bitstream/BitstreamReaderTest.cpp
===
--- llvm/unittests/Bitstream/BitstreamReaderTest.cpp
+++ llvm/unittests/Bitstream/BitstreamReaderTest.cpp
@@ -11,6 +11,8 @@
 #include "llvm/Bitstream/BitstreamWriter.h"
 #include "gtest/gtest.h"
 
+#include 
+
 using namespace llvm;
 
 namespace {
@@ -96,7 +98,7 @@
 StringRef BlobIn((const char *)BlobData.begin(), BlobSize);
 
 // Write the bitcode.
-SmallVector Buffer;
+std::vector Buffer;
 unsigned AbbrevID;
 {
   BitstreamWriter Stream(Buffer);
@@ -115,7 +117,7 @@
 
 // Stream the buffer into the reader.
 BitstreamCursor Stream(
-ArrayRef((const uint8_t *)Buffer.begin(), Buffer.size()));
+ArrayRef((const uint8_t *)Buffer.data(), Buffer.size()));
 
 // Header.  Included in test so that we can run llvm-bcanalyzer to debug
 // when there are problems.
Index: llvm/tools/llvm-modextract/llvm-modextract.cpp
===
--- llvm/tools/llvm-modextract/llvm-modextract.cpp
+++ llvm/tools/llvm-modextract/llvm-modextract.cpp
@@ -58,12 +58,12 @@
   ExitOnErr(errorCodeToError(EC));
 
   if (BinaryExtract) {
-SmallVector Result;
+std::vector Result;
 BitcodeWriter Writer(Result);
-Result.append(Ms[ModuleIndex].getBuffer().begin(),
+Result.insert(Result.end(), Ms[ModuleIndex].getBuffer().begin(),
   Ms[ModuleIndex].getBuffer().end());
 Writer.copyStrtab(Ms[ModuleIndex].getStrtab());
-Out->os() << Result;
+Out->os().write(Result.data(), Result.size());
 Out->keep();
 return 0;
   }
Index: llvm/tools/llvm-cat/llvm-cat.cpp
===
--- llvm/tools/llvm-cat/llvm-cat.cpp
+++ llvm/tools/llvm-cat/llvm-cat.cpp
@@ -54,7 +54,7 @@
   ExitOnError ExitOnErr("llvm-cat: 

[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-21 Thread Andrew via Phabricator via cfe-commits
browneee added a comment.

Thanks for the revert explanation and notes, nikic.

@dexonsmith what is your current thinking on going back to the original 
std::vector approach?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-18 Thread Nikita Popov via Phabricator via cfe-commits
nikic reopened this revision.
nikic added a comment.
This revision is now accepted and ready to land.

I have reverted this change, because it causes a 1% compile-time 

 and memory usage 

 regression. The memory usage regression is probably fine given what this 
change does, but the compile-time regression is not. (For context, this pretty 
much undoes the wins that the recent removal of waymarking gave us.)

Some notes:

- Can you please split out the fix to grow() into a separate revision? It does 
not seem related to the main change, and reduces surface area.
- I don't think the automatic switch of the size/capacity field has been 
justified well. We have plenty of SmallVectors in LLVM that are, indeed, small. 
There is no way an MCRelaxableFragment will ever end up storing a single 
instruction that is 4G large.
- Similarly, I'm not really convinced about handling this in SmallVector at 
all. The original change here just used an `std::vector` in the one place where 
this has become an issue. That seems like a pretty good solution until there is 
evidence that this is really a more widespread problem.

But in any case, my primary concern here is the compile-time regression, and 
it's not immediately clear which part of the change it comes from.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-17 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith closed this revision.
dexonsmith added a comment.

In D77621#1979769 , @browneee wrote:

> GIT_COMMITTER_NAME=Andrew Browne
>  GIT_COMMITTER_EMAIL=brown...@google.com
>
> This would be my second commit. I will request access next time - thanks 
> @dexonsmith!


I should have said `GIT_AUTHOR*` info, since I'm the committer :).  Just landed 
it as b8d08e961df1d229872c785ebdbc8367432e9752 
, thanks 
for waiting!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-17 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 258444.
browneee added a comment.

Rebase to latest HEAD.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  llvm/include/llvm/ADT/SmallVector.h
  llvm/lib/Support/SmallVector.cpp

Index: llvm/lib/Support/SmallVector.cpp
===
--- llvm/lib/Support/SmallVector.cpp
+++ llvm/lib/Support/SmallVector.cpp
@@ -36,30 +36,6 @@
 static_assert(sizeof(SmallVector) ==
   sizeof(unsigned) * 2 + sizeof(void *) * 2,
   "wasted space in SmallVector size 1");
-
-/// grow_pod - This is an implementation of the grow() method which only works
-/// on POD-like datatypes and is out of line to reduce code duplication.
-void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
-   size_t TSize) {
-  // Ensure we can fit the new capacity in 32 bits.
-  if (MinCapacity > UINT32_MAX)
-report_bad_alloc_error("SmallVector capacity overflow during allocation");
-
-  size_t NewCapacity = 2 * capacity() + 1; // Always grow.
-  NewCapacity =
-  std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
-
-  void *NewElts;
-  if (BeginX == FirstEl) {
-NewElts = safe_malloc(NewCapacity * TSize);
-
-// Copy the elements over.  No need to run dtors on PODs.
-memcpy(NewElts, this->BeginX, size() * TSize);
-  } else {
-// If this wasn't grown from the inline copy, grow the allocated space.
-NewElts = safe_realloc(this->BeginX, NewCapacity * TSize);
-  }
-
-  this->BeginX = NewElts;
-  this->Capacity = NewCapacity;
-}
+static_assert(sizeof(SmallVector) ==
+  sizeof(void *) * 2 + sizeof(void *),
+  "1 byte elements have word-sized type for size and capacity");
Index: llvm/include/llvm/ADT/SmallVector.h
===
--- llvm/include/llvm/ADT/SmallVector.h
+++ llvm/include/llvm/ADT/SmallVector.h
@@ -16,10 +16,10 @@
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/AlignOf.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/MemAlloc.h"
 #include "llvm/Support/type_traits.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 
 #include 
 #include 
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,11 +35,23 @@
 
 namespace llvm {
 
-/// This is all the non-templated stuff common to all SmallVectors.
-class SmallVectorBase {
+/// This is all the stuff common to all SmallVectors.
+///
+/// The template parameter specifies the type which should be used to hold the
+/// Size and Capacity of the SmallVector, so it can be adjusted.
+/// Using 32 bit size is desirable to shink the size of the SmallVector.
+/// Using 64 bit size is desirable for cases like SmallVector, where a
+/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
+/// buffering bitcode output - which can exceed 4GB.
+template  class SmallVectorBase {
 protected:
   void *BeginX;
-  unsigned Size = 0, Capacity;
+  Size_T Size = 0, Capacity;
+
+  /// The maximum value of the Size_T used.
+  static constexpr size_t SizeTypeMax() {
+return std::numeric_limits::max();
+  }
 
   SmallVectorBase() = delete;
   SmallVectorBase(void *FirstEl, size_t TotalCapacity)
@@ -46,7 +59,39 @@
 
   /// This is an implementation of the grow() method which only works
   /// on POD-like data types and is out of line to reduce code duplication.
-  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize);
+  /// This function will report a fatal error if it cannot increase capacity.
+  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize) {
+// Ensure we can fit the new capacity.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (MinCapacity > SizeTypeMax())
+  report_bad_alloc_error("SmallVector capacity overflow during allocation");
+
+// Ensure we can meet the guarantee of space for at least one more element.
+// The above check alone will not catch the case where grow is called with a
+// default MinCapacity of 0, but the current capacity cannot be increased.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (capacity() == SizeTypeMax())
+  report_bad_alloc_error("SmallVector capacity unable to grow");
+
+// In theory 2*capacity can overflow if the capacity is 64 bit, but the
+// original capacity would never be large enough for this to be a problem.
+size_t NewCapacity = 2 * capacity() + 1; // Always grow.
+NewCapacity = std::min(std::max(NewCapacity, MinCapacity), SizeTypeMax());
+
+void *NewElts;
+if (BeginX == FirstEl) {
+  NewElts = safe_malloc(NewCapacity * TSize);
+
+  // Copy the elements over.  No need to run 

[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-15 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 257861.
browneee added a comment.

Rebase to latest HEAD.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  llvm/include/llvm/ADT/SmallVector.h
  llvm/lib/Support/SmallVector.cpp

Index: llvm/lib/Support/SmallVector.cpp
===
--- llvm/lib/Support/SmallVector.cpp
+++ llvm/lib/Support/SmallVector.cpp
@@ -36,30 +36,6 @@
 static_assert(sizeof(SmallVector) ==
   sizeof(unsigned) * 2 + sizeof(void *) * 2,
   "wasted space in SmallVector size 1");
-
-/// grow_pod - This is an implementation of the grow() method which only works
-/// on POD-like datatypes and is out of line to reduce code duplication.
-void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
-   size_t TSize) {
-  // Ensure we can fit the new capacity in 32 bits.
-  if (MinCapacity > UINT32_MAX)
-report_bad_alloc_error("SmallVector capacity overflow during allocation");
-
-  size_t NewCapacity = 2 * capacity() + 1; // Always grow.
-  NewCapacity =
-  std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
-
-  void *NewElts;
-  if (BeginX == FirstEl) {
-NewElts = safe_malloc(NewCapacity * TSize);
-
-// Copy the elements over.  No need to run dtors on PODs.
-memcpy(NewElts, this->BeginX, size() * TSize);
-  } else {
-// If this wasn't grown from the inline copy, grow the allocated space.
-NewElts = safe_realloc(this->BeginX, NewCapacity * TSize);
-  }
-
-  this->BeginX = NewElts;
-  this->Capacity = NewCapacity;
-}
+static_assert(sizeof(SmallVector) ==
+  sizeof(void *) * 2 + sizeof(void *),
+  "1 byte elements have word-sized type for size and capacity");
Index: llvm/include/llvm/ADT/SmallVector.h
===
--- llvm/include/llvm/ADT/SmallVector.h
+++ llvm/include/llvm/ADT/SmallVector.h
@@ -16,10 +16,10 @@
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/AlignOf.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/MemAlloc.h"
 #include "llvm/Support/type_traits.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 
 #include 
 #include 
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,11 +35,23 @@
 
 namespace llvm {
 
-/// This is all the non-templated stuff common to all SmallVectors.
-class SmallVectorBase {
+/// This is all the stuff common to all SmallVectors.
+///
+/// The template parameter specifies the type which should be used to hold the
+/// Size and Capacity of the SmallVector, so it can be adjusted.
+/// Using 32 bit size is desirable to shink the size of the SmallVector.
+/// Using 64 bit size is desirable for cases like SmallVector, where a
+/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
+/// buffering bitcode output - which can exceed 4GB.
+template  class SmallVectorBase {
 protected:
   void *BeginX;
-  unsigned Size = 0, Capacity;
+  Size_T Size = 0, Capacity;
+
+  /// The maximum value of the Size_T used.
+  static constexpr size_t SizeTypeMax() {
+return std::numeric_limits::max();
+  }
 
   SmallVectorBase() = delete;
   SmallVectorBase(void *FirstEl, size_t TotalCapacity)
@@ -46,7 +59,39 @@
 
   /// This is an implementation of the grow() method which only works
   /// on POD-like data types and is out of line to reduce code duplication.
-  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize);
+  /// This function will report a fatal error if it cannot increase capacity.
+  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize) {
+// Ensure we can fit the new capacity.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (MinCapacity > SizeTypeMax())
+  report_bad_alloc_error("SmallVector capacity overflow during allocation");
+
+// Ensure we can meet the guarantee of space for at least one more element.
+// The above check alone will not catch the case where grow is called with a
+// default MinCapacity of 0, but the current capacity cannot be increased.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (capacity() == SizeTypeMax())
+  report_bad_alloc_error("SmallVector capacity unable to grow");
+
+// In theory 2*capacity can overflow if the capacity is 64 bit, but the
+// original capacity would never be large enough for this to be a problem.
+size_t NewCapacity = 2 * capacity() + 1; // Always grow.
+NewCapacity = std::min(std::max(NewCapacity, MinCapacity), SizeTypeMax());
+
+void *NewElts;
+if (BeginX == FirstEl) {
+  NewElts = safe_malloc(NewCapacity * TSize);
+
+  // Copy the elements over.  No need to run 

[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-13 Thread Andrew via Phabricator via cfe-commits
browneee added a comment.

GIT_COMMITTER_NAME=Andrew Browne
GIT_COMMITTER_EMAIL=brown...@google.com

This would be my second commit. I will request access next time - thanks 
@dexonsmith!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-13 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith added a comment.

In D77621#1979183 , @browneee wrote:

> @dexonsmith I am not a committer, if the last changes looks good please 
> submit for me. Thanks!


You've had a few patches in the past, I suggest you get yourself access.
https://llvm.org/docs/DeveloperPolicy.html#new-contributors

Or, let me know what you want for your `GIT_COMMITTER*` info.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-13 Thread Andrew via Phabricator via cfe-commits
browneee added a comment.

I'm open to suggestions to resolve the clang tidy naming warnings. I would 
prefer to leave grow_pod the same, to minimize changes.

@dexonsmith I am not a committer, if the last changes looks good please submit 
for me. Thanks!




Comment at: llvm/include/llvm/ADT/SmallVector.h:52
+  // The maximum size depends on size_type used.
+  static constexpr size_t SizeMax() {
+return std::numeric_limits::max();

dexonsmith wrote:
> STL data structures have a name for this called `max_size()`.  Should we be 
> consistent with that?
Good question.

This brought my attention to the existing SmallVectorTemplateCommon::max_size() 
which also needed to be updated.
I'm going to name this new function SizeTypeMax to best describe what it 
provides, and leave it separate from max_size().



Comment at: llvm/include/llvm/ADT/SmallVector.h:179
+  using Base::empty;
+  using Base::size;
+

dexonsmith wrote:
> Optionally we could expose `max_size()` as well.
Not done. Updated existing max_size() instead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-13 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 257130.
browneee marked 3 inline comments as done.
browneee added a comment.

Rename SizeMax() to SizeTypeMax(). Fix max_size().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  llvm/include/llvm/ADT/SmallVector.h
  llvm/lib/Support/SmallVector.cpp

Index: llvm/lib/Support/SmallVector.cpp
===
--- llvm/lib/Support/SmallVector.cpp
+++ llvm/lib/Support/SmallVector.cpp
@@ -36,30 +36,6 @@
 static_assert(sizeof(SmallVector) ==
   sizeof(unsigned) * 2 + sizeof(void *) * 2,
   "wasted space in SmallVector size 1");
-
-/// grow_pod - This is an implementation of the grow() method which only works
-/// on POD-like datatypes and is out of line to reduce code duplication.
-void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
-   size_t TSize) {
-  // Ensure we can fit the new capacity in 32 bits.
-  if (MinCapacity > UINT32_MAX)
-report_bad_alloc_error("SmallVector capacity overflow during allocation");
-
-  size_t NewCapacity = 2 * capacity() + 1; // Always grow.
-  NewCapacity =
-  std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
-
-  void *NewElts;
-  if (BeginX == FirstEl) {
-NewElts = safe_malloc(NewCapacity * TSize);
-
-// Copy the elements over.  No need to run dtors on PODs.
-memcpy(NewElts, this->BeginX, size() * TSize);
-  } else {
-// If this wasn't grown from the inline copy, grow the allocated space.
-NewElts = safe_realloc(this->BeginX, NewCapacity * TSize);
-  }
-
-  this->BeginX = NewElts;
-  this->Capacity = NewCapacity;
-}
+static_assert(sizeof(SmallVector) ==
+  sizeof(void *) * 2 + sizeof(void *),
+  "1 byte elements have word-sized type for size and capacity");
Index: llvm/include/llvm/ADT/SmallVector.h
===
--- llvm/include/llvm/ADT/SmallVector.h
+++ llvm/include/llvm/ADT/SmallVector.h
@@ -16,10 +16,10 @@
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/AlignOf.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/MemAlloc.h"
 #include "llvm/Support/type_traits.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 
 #include 
 #include 
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,11 +35,23 @@
 
 namespace llvm {
 
-/// This is all the non-templated stuff common to all SmallVectors.
-class SmallVectorBase {
+/// This is all the stuff common to all SmallVectors.
+///
+/// The template parameter specifies the type which should be used to hold the
+/// Size and Capacity of the SmallVector, so it can be adjusted.
+/// Using 32 bit size is desirable to shink the size of the SmallVector.
+/// Using 64 bit size is desirable for cases like SmallVector, where a
+/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
+/// buffering bitcode output - which can exceed 4GB.
+template  class SmallVectorBase {
 protected:
   void *BeginX;
-  unsigned Size = 0, Capacity;
+  Size_T Size = 0, Capacity;
+
+  /// The maximum value of the Size_T used.
+  static constexpr size_t SizeTypeMax() {
+return std::numeric_limits::max();
+  }
 
   SmallVectorBase() = delete;
   SmallVectorBase(void *FirstEl, size_t TotalCapacity)
@@ -46,7 +59,39 @@
 
   /// This is an implementation of the grow() method which only works
   /// on POD-like data types and is out of line to reduce code duplication.
-  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize);
+  /// This function will report a fatal error if it cannot increase capacity.
+  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize) {
+// Ensure we can fit the new capacity.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (MinCapacity > SizeTypeMax())
+  report_bad_alloc_error("SmallVector capacity overflow during allocation");
+
+// Ensure we can meet the guarantee of space for at least one more element.
+// The above check alone will not catch the case where grow is called with a
+// default MinCapacity of 0, but the current capacity cannot be increased.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (capacity() == SizeTypeMax())
+  report_bad_alloc_error("SmallVector capacity unable to grow");
+
+// In theory 2*capacity can overflow if the capacity is 64 bit, but the
+// original capacity would never be large enough for this to be a problem.
+size_t NewCapacity = 2 * capacity() + 1; // Always grow.
+NewCapacity = std::min(std::max(NewCapacity, MinCapacity), SizeTypeMax());
+
+void *NewElts;
+if (BeginX == FirstEl) {
+  NewElts = 

[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-13 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith accepted this revision.
dexonsmith added a comment.
This revision is now accepted and ready to land.

Thanks for your patience, I missed the updates on Friday.

I have a couple of optional comments inline that I don't feel strongly about.  
LGTM either way.

In D77621#1972764 , @dblaikie wrote:

> Fair enough - that complexity seems reasonably acceptable to me if you reckon 
> the memory size benefits are still worthwhile (did you measure them on any 
> particular workloads? Do we have lots of fairly empty SmallVectors, etc?) if 
> they don't apply to smaller types like this?


I haven't measured anything recently.  Last I looked there were a number of 
SmallVectors inside other data structures (sometimes, sadly, SmallVector) on 
the heap (or stack).  In some cases the main reason not to use `std::vector` is 
the exception pessimizations.  It's nice to keep them small if it's reasonable 
to.




Comment at: llvm/include/llvm/ADT/SmallVector.h:52
+  // The maximum size depends on size_type used.
+  static constexpr size_t SizeMax() {
+return std::numeric_limits::max();

STL data structures have a name for this called `max_size()`.  Should we be 
consistent with that?



Comment at: llvm/include/llvm/ADT/SmallVector.h:179
+  using Base::empty;
+  using Base::size;
+

Optionally we could expose `max_size()` as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-13 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added a comment.

Looks good to me at this point (I have some vague quandries about whether the 
report_fatal_error stuff could be improved/made more clear, but couldn't come 
up with an actionable suggestion so far) - @dexonsmith could you check this 
over and offer final approval?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-13 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 257044.
browneee added a comment.

Changed SizeMax to static constexpr function.
Changed static asserts.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  llvm/include/llvm/ADT/SmallVector.h
  llvm/lib/Support/SmallVector.cpp

Index: llvm/lib/Support/SmallVector.cpp
===
--- llvm/lib/Support/SmallVector.cpp
+++ llvm/lib/Support/SmallVector.cpp
@@ -36,30 +36,6 @@
 static_assert(sizeof(SmallVector) ==
   sizeof(unsigned) * 2 + sizeof(void *) * 2,
   "wasted space in SmallVector size 1");
-
-/// grow_pod - This is an implementation of the grow() method which only works
-/// on POD-like datatypes and is out of line to reduce code duplication.
-void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
-   size_t TSize) {
-  // Ensure we can fit the new capacity in 32 bits.
-  if (MinCapacity > UINT32_MAX)
-report_bad_alloc_error("SmallVector capacity overflow during allocation");
-
-  size_t NewCapacity = 2 * capacity() + 1; // Always grow.
-  NewCapacity =
-  std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
-
-  void *NewElts;
-  if (BeginX == FirstEl) {
-NewElts = safe_malloc(NewCapacity * TSize);
-
-// Copy the elements over.  No need to run dtors on PODs.
-memcpy(NewElts, this->BeginX, size() * TSize);
-  } else {
-// If this wasn't grown from the inline copy, grow the allocated space.
-NewElts = safe_realloc(this->BeginX, NewCapacity * TSize);
-  }
-
-  this->BeginX = NewElts;
-  this->Capacity = NewCapacity;
-}
+static_assert(sizeof(SmallVector) ==
+  sizeof(void *) * 2 + sizeof(void *),
+  "1 byte elements have word-sized type for size and capacity");
Index: llvm/include/llvm/ADT/SmallVector.h
===
--- llvm/include/llvm/ADT/SmallVector.h
+++ llvm/include/llvm/ADT/SmallVector.h
@@ -16,10 +16,10 @@
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/AlignOf.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/MemAlloc.h"
 #include "llvm/Support/type_traits.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 
 #include 
 #include 
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,11 +35,23 @@
 
 namespace llvm {
 
-/// This is all the non-templated stuff common to all SmallVectors.
-class SmallVectorBase {
+/// This is all the stuff common to all SmallVectors.
+///
+/// The template parameter specifies the type which should be used to hold the
+/// Size and Capacity of the SmallVector, so it can be adjusted.
+/// Using 32 bit size is desirable to shink the size of the SmallVector.
+/// Using 64 bit size is desirable for cases like SmallVector, where a
+/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
+/// buffering bitcode output - which can exceed 4GB.
+template  class SmallVectorBase {
 protected:
   void *BeginX;
-  unsigned Size = 0, Capacity;
+  Size_T Size = 0, Capacity;
+
+  // The maximum size depends on size_type used.
+  static constexpr size_t SizeMax() {
+return std::numeric_limits::max();
+  }
 
   SmallVectorBase() = delete;
   SmallVectorBase(void *FirstEl, size_t TotalCapacity)
@@ -46,7 +59,39 @@
 
   /// This is an implementation of the grow() method which only works
   /// on POD-like data types and is out of line to reduce code duplication.
-  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize);
+  /// This function will report a fatal error if it cannot increase capacity.
+  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize) {
+// Ensure we can fit the new capacity.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (MinCapacity > SizeMax())
+  report_bad_alloc_error("SmallVector capacity overflow during allocation");
+
+// Ensure we can meet the guarantee of space for at least one more element.
+// The above check alone will not catch the case where grow is called with a
+// default MinCapacity of 0, but the current capacity cannot be increased.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (capacity() == SizeMax())
+  report_bad_alloc_error("SmallVector capacity unable to grow");
+
+// In theory 2*capacity can overflow if the capacity is 64 bit, but the
+// original capacity would never be large enough for this to be a problem.
+size_t NewCapacity = 2 * capacity() + 1; // Always grow.
+NewCapacity = std::min(std::max(NewCapacity, MinCapacity), SizeMax());
+
+void *NewElts;
+if (BeginX == FirstEl) {
+  NewElts = safe_malloc(NewCapacity * TSize);
+
+  // Copy 

[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-11 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added inline comments.



Comment at: llvm/include/llvm/ADT/SmallVector.h:53
+  // The maximum size depends on size_type used.
+  size_t SizeMax() { return size_type(-1ULL); }
 

dblaikie wrote:
> I'd probably use numeric_limits here & make this static constexpr
Making it a constexpr /variable/ gets a bit more complicated - do you happen to 
know off-hand whether this requires a separate/out-of-line definition if it's 
ODR used (ah, seems it does - in C++14 which LLVM Uses, in 17 ("A constexpr 
specifier used in a function or static member variable (since C++17) 
declaration implies inline.") it would be OK)

I think it's best not to leave a trap that might catch someone up in the future 
if SizeMax ends up getting ODR used (eg: in a call to std::less, rather than an 
explicit op<, etc) & then the lack of definition would lead to linking failure, 
etc. So best to leave it as a static constexpr function rather than static 
constexpr variable.



Comment at: llvm/lib/Support/SmallVector.cpp:40-47
+// Check that SmallVector with 1 byte elements takes extra space due to using
+// uintptr_r instead of uint32_t for size and capacity.
+static_assert(sizeof(SmallVector) > sizeof(SmallVector) 
||
+  sizeof(uintptr_t) == sizeof(uint32_t),
+  "1 byte elements should cause larger size and capacity type");
+static_assert(sizeof(SmallVector) ==
+  sizeof(SmallVector),

I think it's probably best to assert the size more like the above (but using 
zero-sized inline buffer), rather than using relative sizes - seems like it'd 
be more readable to me at least. Maybe @dexonsmith has a perspective here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-11 Thread Andrew via Phabricator via cfe-commits
browneee updated this revision to Diff 256803.
browneee marked 3 inline comments as done.
browneee added a comment.

Address comments from dblaikie.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621

Files:
  llvm/include/llvm/ADT/SmallVector.h
  llvm/lib/Support/SmallVector.cpp

Index: llvm/lib/Support/SmallVector.cpp
===
--- llvm/lib/Support/SmallVector.cpp
+++ llvm/lib/Support/SmallVector.cpp
@@ -37,29 +37,11 @@
   sizeof(unsigned) * 2 + sizeof(void *) * 2,
   "wasted space in SmallVector size 1");
 
-/// grow_pod - This is an implementation of the grow() method which only works
-/// on POD-like datatypes and is out of line to reduce code duplication.
-void SmallVectorBase::grow_pod(void *FirstEl, size_t MinCapacity,
-   size_t TSize) {
-  // Ensure we can fit the new capacity in 32 bits.
-  if (MinCapacity > UINT32_MAX)
-report_bad_alloc_error("SmallVector capacity overflow during allocation");
-
-  size_t NewCapacity = 2 * capacity() + 1; // Always grow.
-  NewCapacity =
-  std::min(std::max(NewCapacity, MinCapacity), size_t(UINT32_MAX));
-
-  void *NewElts;
-  if (BeginX == FirstEl) {
-NewElts = safe_malloc(NewCapacity * TSize);
-
-// Copy the elements over.  No need to run dtors on PODs.
-memcpy(NewElts, this->BeginX, size() * TSize);
-  } else {
-// If this wasn't grown from the inline copy, grow the allocated space.
-NewElts = safe_realloc(this->BeginX, NewCapacity * TSize);
-  }
-
-  this->BeginX = NewElts;
-  this->Capacity = NewCapacity;
-}
+// Check that SmallVector with 1 byte elements takes extra space due to using
+// uintptr_r instead of uint32_t for size and capacity.
+static_assert(sizeof(SmallVector) > sizeof(SmallVector) ||
+  sizeof(uintptr_t) == sizeof(uint32_t),
+  "1 byte elements should cause larger size and capacity type");
+static_assert(sizeof(SmallVector) ==
+  sizeof(SmallVector),
+  "larger elements should keep the same size and capacity type");
Index: llvm/include/llvm/ADT/SmallVector.h
===
--- llvm/include/llvm/ADT/SmallVector.h
+++ llvm/include/llvm/ADT/SmallVector.h
@@ -16,10 +16,10 @@
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/AlignOf.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/MemAlloc.h"
 #include "llvm/Support/type_traits.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 
 #include 
 #include 
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,11 +35,21 @@
 
 namespace llvm {
 
-/// This is all the non-templated stuff common to all SmallVectors.
-class SmallVectorBase {
+/// This is all the stuff common to all SmallVectors.
+///
+/// The template parameter specifies the type which should be used to hold the
+/// Size and Capacity of the SmallVector, so it can be adjusted.
+/// Using 32 bit size is desirable to shink the size of the SmallVector.
+/// Using 64 bit size is desirable for cases like SmallVector, where a
+/// 32 bit size would limit the vector to ~4GB. SmallVectors are used for
+/// buffering bitcode output - which can exceed 4GB.
+template  class SmallVectorBase {
 protected:
   void *BeginX;
-  unsigned Size = 0, Capacity;
+  Size_T Size = 0, Capacity;
+
+  // The maximum size depends on size_type used.
+  static constexpr size_t SizeMax = std::numeric_limits::max();
 
   SmallVectorBase() = delete;
   SmallVectorBase(void *FirstEl, size_t TotalCapacity)
@@ -46,7 +57,39 @@
 
   /// This is an implementation of the grow() method which only works
   /// on POD-like data types and is out of line to reduce code duplication.
-  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize);
+  /// This function will report a fatal error if it cannot increase capacity.
+  void grow_pod(void *FirstEl, size_t MinCapacity, size_t TSize) {
+// Ensure we can fit the new capacity.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (MinCapacity > SizeMax)
+  report_bad_alloc_error("SmallVector capacity overflow during allocation");
+
+// Ensure we can meet the guarantee of space for at least one more element.
+// The above check alone will not catch the case where grow is called with a
+// default MinCapacity of 0, but the current capacity cannot be increased.
+// This is only going to be applicable if the when the capacity is 32 bit.
+if (capacity() == SizeMax)
+  report_bad_alloc_error("SmallVector capacity unable to grow");
+
+// In theory 2*capacity can overflow if the capacity is 64 bit, but the
+// original capacity would never be large enough for this to be a 

[PATCH] D77621: ADT: SmallVector size & capacity use word-size integers when elements are small.

2020-04-10 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added a comment.

Please update the patch description/subject line.

@dexonsmith I'll leave this to you for final approval, since it was your 
idea/you've been touching things here. But looks like about the right direction.




Comment at: llvm/include/llvm/ADT/SmallVector.h:47
 protected:
+  typedef Size_T size_type;
+

Don't think this typedef is really pulling its weight - probably just refer to 
the template type parameter directly?



Comment at: llvm/include/llvm/ADT/SmallVector.h:53
+  // The maximum size depends on size_type used.
+  size_t SizeMax() { return size_type(-1ULL); }
 

I'd probably use numeric_limits here & make this static constexpr



Comment at: llvm/include/llvm/ADT/SmallVector.h:132
+: public SmallVectorBase> {
   /// Find the address of the first element.  For this pointer math to be valid
   /// with small-size of 0 for T with lots of alignment, it's important that

I'd probably add a "using Base = SmallVectorBase>" here, 
and then use that in the ctor and grow_pod.

Also down by the other using decls maybe add "using 
Base::size/Base::capacity/Base::empty" so you don't have to "this->" everything.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77621/new/

https://reviews.llvm.org/D77621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits