[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-09 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG13a351e862ba: [clang][deps] Use MemoryBuffer in minimizing 
FS (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp


Index: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -9,6 +9,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h"
 #include "clang/Lex/DependencyDirectivesSourceMinimizer.h"
 #include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
 #include "llvm/Support/Threading.h"
 
 using namespace clang;
@@ -43,11 +44,7 @@
 // FIXME: Propage the diagnostic if desired by the client.
 CachedFileSystemEntry Result;
 Result.MaybeStat = std::move(*Stat);
-Result.Contents.reserve(Buffer->getBufferSize() + 1);
-Result.Contents.append(Buffer->getBufferStart(), Buffer->getBufferEnd());
-// Implicitly null terminate the contents for Clang's lexer.
-Result.Contents.push_back('\0');
-Result.Contents.pop_back();
+Result.Contents = std::move(*MaybeBuffer);
 return Result;
   }
 
@@ -60,15 +57,8 @@
   // The contents produced by the minimizer must be null terminated.
   assert(MinimizedFileContents.data()[MinimizedFileContents.size()] == '\0' &&
  "not null terminated contents");
-  // Even though there's an implicit null terminator in the minimized contents,
-  // we want to temporarily make it explicit. This will ensure that the
-  // std::move will preserve it even if it needs to do a copy if the
-  // SmallString still has the small capacity.
-  MinimizedFileContents.push_back('\0');
-  Result.Contents = std::move(MinimizedFileContents);
-  // Now make the null terminator implicit again, so that Clang's lexer can 
find
-  // it right where the buffer ends.
-  Result.Contents.pop_back();
+  Result.Contents = std::make_unique(
+  std::move(MinimizedFileContents));
 
   // Compute the skipped PP ranges that speedup skipping over inactive
   // preprocessor blocks.
Index: 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
===
--- 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
+++ 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
@@ -43,9 +43,7 @@
   ///
   /// The filesystem opens the file even for `stat` calls open to avoid the
   /// issues with stat + open of minimized files that might lead to a
-  /// mismatching size of the file. If file is not minimized, the full file is
-  /// read and copied into memory to ensure that it's not memory mapped to 
avoid
-  /// running out of file descriptors.
+  /// mismatching size of the file.
   static CachedFileSystemEntry createFileEntry(StringRef Filename,
llvm::vfs::FileSystem ,
bool Minimize = true);
@@ -65,7 +63,7 @@
   return MaybeStat.getError();
 assert(!MaybeStat->isDirectory() && "not a file");
 assert(isValid() && "not initialized");
-return Contents.str();
+return Contents->getBuffer();
   }
 
   /// \returns The error or the status of the entry.
@@ -94,11 +92,7 @@
 
 private:
   llvm::ErrorOr MaybeStat;
-  // Store the contents in a small string to allow a
-  // move from the small string for the minimized contents.
-  // Note: small size of 1 allows us to store an empty string with an implicit
-  // null terminator without any allocations.
-  llvm::SmallString<1> Contents;
+  std::unique_ptr Contents;
   PreprocessorSkippedRangeMapping PPSkippedRangeMapping;
 };
 


Index: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -9,6 +9,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h"
 #include "clang/Lex/DependencyDirectivesSourceMinimizer.h"
 #include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
 #include "llvm/Support/Threading.h"
 
 using namespace clang;
@@ -43,11 +44,7 @@
 // FIXME: Propage the diagnostic if desired by the client.
 CachedFileSystemEntry Result;
 Result.MaybeStat = std::move(*Stat);
-Result.Contents.reserve(Buffer->getBufferSize() + 1);
-

[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-08 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith accepted this revision.
dexonsmith added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-08 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h:49
-  /// running out of file descriptors.
+  /// mismatching size of the file.
   static CachedFileSystemEntry createFileEntry(StringRef Filename,
llvm::vfs::FileSystem ,

In the implementation of this function we get 
`std::unique_ptr` from the underlying filesystem. When that 
goes out of scope, the file descriptor will be closed immediately. How did 
copying contents into memory prevent running out of file descriptors? @arphaman


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-08 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 392745.
jansvoboda11 marked an inline comment as done.
jansvoboda11 added a comment.

Update comment


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp


Index: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -9,6 +9,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h"
 #include "clang/Lex/DependencyDirectivesSourceMinimizer.h"
 #include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
 #include "llvm/Support/Threading.h"
 
 using namespace clang;
@@ -43,11 +44,7 @@
 // FIXME: Propage the diagnostic if desired by the client.
 CachedFileSystemEntry Result;
 Result.MaybeStat = std::move(*Stat);
-Result.Contents.reserve(Buffer->getBufferSize() + 1);
-Result.Contents.append(Buffer->getBufferStart(), Buffer->getBufferEnd());
-// Implicitly null terminate the contents for Clang's lexer.
-Result.Contents.push_back('\0');
-Result.Contents.pop_back();
+Result.Contents = std::move(*MaybeBuffer);
 return Result;
   }
 
@@ -60,15 +57,8 @@
   // The contents produced by the minimizer must be null terminated.
   assert(MinimizedFileContents.data()[MinimizedFileContents.size()] == '\0' &&
  "not null terminated contents");
-  // Even though there's an implicit null terminator in the minimized contents,
-  // we want to temporarily make it explicit. This will ensure that the
-  // std::move will preserve it even if it needs to do a copy if the
-  // SmallString still has the small capacity.
-  MinimizedFileContents.push_back('\0');
-  Result.Contents = std::move(MinimizedFileContents);
-  // Now make the null terminator implicit again, so that Clang's lexer can 
find
-  // it right where the buffer ends.
-  Result.Contents.pop_back();
+  Result.Contents = std::make_unique(
+  std::move(MinimizedFileContents));
 
   // Compute the skipped PP ranges that speedup skipping over inactive
   // preprocessor blocks.
Index: 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
===
--- 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
+++ 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
@@ -43,9 +43,7 @@
   ///
   /// The filesystem opens the file even for `stat` calls open to avoid the
   /// issues with stat + open of minimized files that might lead to a
-  /// mismatching size of the file. If file is not minimized, the full file is
-  /// read and copied into memory to ensure that it's not memory mapped to 
avoid
-  /// running out of file descriptors.
+  /// mismatching size of the file.
   static CachedFileSystemEntry createFileEntry(StringRef Filename,
llvm::vfs::FileSystem ,
bool Minimize = true);
@@ -65,7 +63,7 @@
   return MaybeStat.getError();
 assert(!MaybeStat->isDirectory() && "not a file");
 assert(isValid() && "not initialized");
-return Contents.str();
+return Contents->getBuffer();
   }
 
   /// \returns The error or the status of the entry.
@@ -94,11 +92,7 @@
 
 private:
   llvm::ErrorOr MaybeStat;
-  // Store the contents in a small string to allow a
-  // move from the small string for the minimized contents.
-  // Note: small size of 1 allows us to store an empty string with an implicit
-  // null terminator without any allocations.
-  llvm::SmallString<1> Contents;
+  std::unique_ptr Contents;
   PreprocessorSkippedRangeMapping PPSkippedRangeMapping;
 };
 


Index: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -9,6 +9,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h"
 #include "clang/Lex/DependencyDirectivesSourceMinimizer.h"
 #include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
 #include "llvm/Support/Threading.h"
 
 using namespace clang;
@@ -43,11 +44,7 @@
 // FIXME: Propage the diagnostic if desired by the client.
 CachedFileSystemEntry Result;
 Result.MaybeStat = std::move(*Stat);
-Result.Contents.reserve(Buffer->getBufferSize() + 1);
-Result.Contents.append(Buffer->getBufferStart(), 

[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-08 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 marked 2 inline comments as done.
jansvoboda11 added inline comments.



Comment at: llvm/include/llvm/Support/SmallVectorMemoryBuffer.h:54
+  /// and invoke the given function right after the move.
+  SmallVectorMemoryBuffer(
+  SmallVectorImpl &,

dexonsmith wrote:
> jansvoboda11 wrote:
> > I'm not happy with introducing new (hacky) constructor.
> > 
> > But, the best alternative I could come up with is to avoid using 
> > `SmallVectorMemoryBuffer`, store the `SmallString` with minimized contents 
> > somewhere in the filesystem and refer to it via regular `MemoryBuffer`. 
> > This seems needlessly complicated, so hacky constructor it is...
> > 
> > Side question: is the hassle with implicit null-termination (expected by 
> > Clang's lexer) worth the complications? What are the benefits anyway?
> It'd be better to add a `RequiresNullTerminator` argument to the constructor, 
> matching the MemoryBuffer factory functions, which (if set) does the 
> `push_back`/`pop_back` dance, and in either case passes it through to 
> `init()`.
> ```
> lang=c++
>   SmallVectorMemoryBuffer(SmallVectorImpl &, bool 
> RequiresNullTerminator = true)
>   : SV(std::move(SV)), BufferName("") {
> if (RequiresNullTerminator) {
>   this->SV.push_back(0);
>   this->SV.pop_back(0);
> }
> init(this->SV.begin(), this->SV.end(), RequiresNullTerminator);
>   }
> ```
> 
> If you do that, you should update the (few) existing callers to explicitly 
> pass `false`. Note that `RequiresNullTerminator` defaults to true in all the 
> other `MemoryBuffer` APIs so it'd be confusing to default it to `false` here.
> 
> > Side question: is the hassle with implicit null-termination (expected by 
> > Clang's lexer) worth the complications? What are the benefits anyway?
> 
> Not sure whether it's a micro-optimization for getting better codegen or if 
> it just predates `StringRef`, which made it much easier to reason about a 
> byte range as a single unit (rather than tracking two separate pointers, 
> which is more error-prone). I somewhat doubt it's worth it, which is why when 
> I wrote the minimizer I used `StringRef`, but I don't have data and it'd be a 
> lot of work to change. (The minimizer is fast enough as-is, but it does less 
> work than the full Clang lexer. It's plausible that the optimizations matter 
> there?)
You're right that exposing the `RequiresNullTerminator` parameter is much 
clearer than passing in a random lambda. I've extracted the change into D115331.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-08 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 392717.
jansvoboda11 added a comment.

Rebase on top of D115331 .


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp


Index: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -9,6 +9,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h"
 #include "clang/Lex/DependencyDirectivesSourceMinimizer.h"
 #include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
 #include "llvm/Support/Threading.h"
 
 using namespace clang;
@@ -43,11 +44,7 @@
 // FIXME: Propage the diagnostic if desired by the client.
 CachedFileSystemEntry Result;
 Result.MaybeStat = std::move(*Stat);
-Result.Contents.reserve(Buffer->getBufferSize() + 1);
-Result.Contents.append(Buffer->getBufferStart(), Buffer->getBufferEnd());
-// Implicitly null terminate the contents for Clang's lexer.
-Result.Contents.push_back('\0');
-Result.Contents.pop_back();
+Result.Contents = std::move(*MaybeBuffer);
 return Result;
   }
 
@@ -60,15 +57,8 @@
   // The contents produced by the minimizer must be null terminated.
   assert(MinimizedFileContents.data()[MinimizedFileContents.size()] == '\0' &&
  "not null terminated contents");
-  // Even though there's an implicit null terminator in the minimized contents,
-  // we want to temporarily make it explicit. This will ensure that the
-  // std::move will preserve it even if it needs to do a copy if the
-  // SmallString still has the small capacity.
-  MinimizedFileContents.push_back('\0');
-  Result.Contents = std::move(MinimizedFileContents);
-  // Now make the null terminator implicit again, so that Clang's lexer can 
find
-  // it right where the buffer ends.
-  Result.Contents.pop_back();
+  Result.Contents = std::make_unique(
+  std::move(MinimizedFileContents));
 
   // Compute the skipped PP ranges that speedup skipping over inactive
   // preprocessor blocks.
Index: 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
===
--- 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
+++ 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
@@ -65,7 +65,7 @@
   return MaybeStat.getError();
 assert(!MaybeStat->isDirectory() && "not a file");
 assert(isValid() && "not initialized");
-return Contents.str();
+return Contents->getBuffer();
   }
 
   /// \returns The error or the status of the entry.
@@ -94,11 +94,7 @@
 
 private:
   llvm::ErrorOr MaybeStat;
-  // Store the contents in a small string to allow a
-  // move from the small string for the minimized contents.
-  // Note: small size of 1 allows us to store an empty string with an implicit
-  // null terminator without any allocations.
-  llvm::SmallString<1> Contents;
+  std::unique_ptr Contents;
   PreprocessorSkippedRangeMapping PPSkippedRangeMapping;
 };
 


Index: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -9,6 +9,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h"
 #include "clang/Lex/DependencyDirectivesSourceMinimizer.h"
 #include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
 #include "llvm/Support/Threading.h"
 
 using namespace clang;
@@ -43,11 +44,7 @@
 // FIXME: Propage the diagnostic if desired by the client.
 CachedFileSystemEntry Result;
 Result.MaybeStat = std::move(*Stat);
-Result.Contents.reserve(Buffer->getBufferSize() + 1);
-Result.Contents.append(Buffer->getBufferStart(), Buffer->getBufferEnd());
-// Implicitly null terminate the contents for Clang's lexer.
-Result.Contents.push_back('\0');
-Result.Contents.pop_back();
+Result.Contents = std::move(*MaybeBuffer);
 return Result;
   }
 
@@ -60,15 +57,8 @@
   // The contents produced by the minimizer must be null terminated.
   assert(MinimizedFileContents.data()[MinimizedFileContents.size()] == '\0' &&
  "not null terminated contents");
-  // Even though there's an implicit null terminator in the minimized contents,
-  // we want to temporarily make it explicit. This will ensure that the
-  // std::move will 

[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-06 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith added inline comments.



Comment at: 
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp:64-70
   MinimizedFileContents.push_back('\0');
-  Result.Contents = std::move(MinimizedFileContents);
-  // Now make the null terminator implicit again, so that Clang's lexer can 
find
-  // it right where the buffer ends.
-  Result.Contents.pop_back();
+  Result.Contents = std::make_unique(
+  std::move(MinimizedFileContents), [](SmallVectorImpl ) {
+// Now make the null terminator implicit again, so that Clang's lexer
+// can find it right where the buffer ends.
+SV.pop_back();
+  });

This is a bit simpler:
```
lang=c++
Result.Contents = MinimizedFileContents.isSmall()
? MemoryBuffer::getMemBufferCopy(...)
: std::make_unique();
```
and doesn't require changing SmallVectorMemoryBuffer.

Another option is:
```
lang=c++
{
  // Move to the heap and ensure null-terminated to ensure the MemoryBuffer 
works.
  SmallVector Heap(std::move(MinimizedFileContents));
  Heap.push_back(0);
  Heap.pop_back();
  Result.Contents = 
std::make_unique(std::move(Heap));
}
```
The latter seems like reasonable logic for the SmallVectorMemoryBuffer 
constructor though.



Comment at: llvm/include/llvm/Support/SmallVectorMemoryBuffer.h:54
+  /// and invoke the given function right after the move.
+  SmallVectorMemoryBuffer(
+  SmallVectorImpl &,

jansvoboda11 wrote:
> I'm not happy with introducing new (hacky) constructor.
> 
> But, the best alternative I could come up with is to avoid using 
> `SmallVectorMemoryBuffer`, store the `SmallString` with minimized contents 
> somewhere in the filesystem and refer to it via regular `MemoryBuffer`. This 
> seems needlessly complicated, so hacky constructor it is...
> 
> Side question: is the hassle with implicit null-termination (expected by 
> Clang's lexer) worth the complications? What are the benefits anyway?
It'd be better to add a `RequiresNullTerminator` argument to the constructor, 
matching the MemoryBuffer factory functions, which (if set) does the 
`push_back`/`pop_back` dance, and in either case passes it through to `init()`.
```
lang=c++
  SmallVectorMemoryBuffer(SmallVectorImpl &, bool 
RequiresNullTerminator = true)
  : SV(std::move(SV)), BufferName("") {
if (RequiresNullTerminator) {
  this->SV.push_back(0);
  this->SV.pop_back(0);
}
init(this->SV.begin(), this->SV.end(), RequiresNullTerminator);
  }
```

If you do that, you should update the (few) existing callers to explicitly pass 
`false`. Note that `RequiresNullTerminator` defaults to true in all the other 
`MemoryBuffer` APIs so it'd be confusing to default it to `false` here.

> Side question: is the hassle with implicit null-termination (expected by 
> Clang's lexer) worth the complications? What are the benefits anyway?

Not sure whether it's a micro-optimization for getting better codegen or if it 
just predates `StringRef`, which made it much easier to reason about a byte 
range as a single unit (rather than tracking two separate pointers, which is 
more error-prone). I somewhat doubt it's worth it, which is why when I wrote 
the minimizer I used `StringRef`, but I don't have data and it'd be a lot of 
work to change. (The minimizer is fast enough as-is, but it does less work than 
the full Clang lexer. It's plausible that the optimizations matter there?)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: llvm/include/llvm/Support/SmallVectorMemoryBuffer.h:54
+  /// and invoke the given function right after the move.
+  SmallVectorMemoryBuffer(
+  SmallVectorImpl &,

I'm not happy with introducing new (hacky) constructor.

But, the best alternative I could come up with is to avoid using 
`SmallVectorMemoryBuffer`, store the `SmallString` with minimized contents 
somewhere in the filesystem and refer to it via regular `MemoryBuffer`. This 
seems needlessly complicated, so hacky constructor it is...

Side question: is the hassle with implicit null-termination (expected by 
Clang's lexer) worth the complications? What are the benefits anyway?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115043/new/

https://reviews.llvm.org/D115043

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115043: [clang][deps] Use MemoryBuffer in minimizing FS

2021-12-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added reviewers: Bigcheese, dexonsmith, arphaman.
jansvoboda11 requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

This patch avoids unnecessarily copying `mmap`-ed memory into 
`CachedFileSystemEntry` by storing `MemoryBuffer` instead. The change leads to 
~50% reduction of peak memory usage when scanning LLVM+Clang via 
`clang-scan-deps`.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D115043

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
  llvm/include/llvm/Support/SmallVectorMemoryBuffer.h


Index: llvm/include/llvm/Support/SmallVectorMemoryBuffer.h
===
--- llvm/include/llvm/Support/SmallVectorMemoryBuffer.h
+++ llvm/include/llvm/Support/SmallVectorMemoryBuffer.h
@@ -14,6 +14,7 @@
 #ifndef LLVM_SUPPORT_SMALLVECTORMEMORYBUFFER_H
 #define LLVM_SUPPORT_SMALLVECTORMEMORYBUFFER_H
 
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/raw_ostream.h"
@@ -48,6 +49,16 @@
 init(this->SV.begin(), this->SV.end(), false);
   }
 
+  /// Construct a SmallVectorMemoryBuffer from the given SmallVector r-value,
+  /// and invoke the given function right after the move.
+  SmallVectorMemoryBuffer(
+  SmallVectorImpl &,
+  llvm::function_ref &)> AfterMove)
+  : SV(std::move(SV)), BufferName("") {
+AfterMove(this->SV);
+init(this->SV.begin(), this->SV.end(), false);
+  }
+
   // Key function.
   ~SmallVectorMemoryBuffer() override;
 
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -9,6 +9,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h"
 #include "clang/Lex/DependencyDirectivesSourceMinimizer.h"
 #include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
 #include "llvm/Support/Threading.h"
 
 using namespace clang;
@@ -43,11 +44,7 @@
 // FIXME: Propage the diagnostic if desired by the client.
 CachedFileSystemEntry Result;
 Result.MaybeStat = std::move(*Stat);
-Result.Contents.reserve(Buffer->getBufferSize() + 1);
-Result.Contents.append(Buffer->getBufferStart(), Buffer->getBufferEnd());
-// Implicitly null terminate the contents for Clang's lexer.
-Result.Contents.push_back('\0');
-Result.Contents.pop_back();
+Result.Contents = std::move(*MaybeBuffer);
 return Result;
   }
 
@@ -65,10 +62,12 @@
   // std::move will preserve it even if it needs to do a copy if the
   // SmallString still has the small capacity.
   MinimizedFileContents.push_back('\0');
-  Result.Contents = std::move(MinimizedFileContents);
-  // Now make the null terminator implicit again, so that Clang's lexer can 
find
-  // it right where the buffer ends.
-  Result.Contents.pop_back();
+  Result.Contents = std::make_unique(
+  std::move(MinimizedFileContents), [](SmallVectorImpl ) {
+// Now make the null terminator implicit again, so that Clang's lexer
+// can find it right where the buffer ends.
+SV.pop_back();
+  });
 
   // Compute the skipped PP ranges that speedup skipping over inactive
   // preprocessor blocks.
Index: 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
===
--- 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
+++ 
clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
@@ -65,7 +65,7 @@
   return MaybeStat.getError();
 assert(!MaybeStat->isDirectory() && "not a file");
 assert(isValid() && "not initialized");
-return Contents.str();
+return Contents->getBuffer();
   }
 
   /// \returns The error or the status of the entry.
@@ -94,11 +94,7 @@
 
 private:
   llvm::ErrorOr MaybeStat;
-  // Store the contents in a small string to allow a
-  // move from the small string for the minimized contents.
-  // Note: small size of 1 allows us to store an empty string with an implicit
-  // null terminator without any allocations.
-  llvm::SmallString<1> Contents;
+  std::unique_ptr Contents;
   PreprocessorSkippedRangeMapping PPSkippedRangeMapping;
 };
 


Index: llvm/include/llvm/Support/SmallVectorMemoryBuffer.h
===
--- llvm/include/llvm/Support/SmallVectorMemoryBuffer.h
+++ llvm/include/llvm/Support/SmallVectorMemoryBuffer.h
@@ -14,6 +14,7 @@
 #ifndef LLVM_SUPPORT_SMALLVECTORMEMORYBUFFER_H
 #define