[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-19 Thread Nico Weber via Phabricator via cfe-commits
thakis added a comment.

If it's not too much trouble, maybe you could cherry-pick and amend 
8d498e08deaf6e06a578cfedb4eb259b722ac7f6 
 into the 
commit that relands this.

(If not, no worries.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-19 Thread Frederic Riss via Phabricator via cfe-commits
friss added a comment.

In D136651#4067012 , @glandium wrote:

> Oh yes, we're using `-DLLVM_LINK_LLVM_DYLIB=ON` too, so that would be the 
> main trigger.

This was indeed the main trigger. Thanks for the help narrowing this down. I 
reverted for now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-19 Thread Mike Hommey via Phabricator via cfe-commits
glandium added a comment.

Oh yes, we're using `-DLLVM_LINK_LLVM_DYLIB=ON` too, so that would be the main 
trigger.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-19 Thread Frederic Riss via Phabricator via cfe-commits
friss added a comment.

In D136651#4066960 , @mstorsjo wrote:

> In D136651#4066949 , @friss wrote:
>
>> Thanks for the report, unfortunately I have no way of testing this setup.
>
> Can you try a build with `-DLLVM_LINK_LLVM_DYLIB=ON`?

Trying. If that's indeed the issue (and not the cross-compile environment), 
I'll revert while looking at it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-19 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added a comment.

In D136651#4066949 , @friss wrote:

> Thanks for the report, unfortunately I have no way of testing this setup.

Can you try a build with `-DLLVM_LINK_LLVM_DYLIB=ON`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-19 Thread Frederic Riss via Phabricator via cfe-commits
friss added a comment.

In D136651#4064474 , @glandium wrote:

> In D136651#4064260 , @glandium 
> wrote:
>
>> This broke our mac builds with errors like:
>>
>>   ld64.lld: error: undefined symbol: CFRunLoopRun
>>   >>> referenced by 
>> tools/clang/tools/clang-stat-cache/CMakeFiles/clang-stat-cache.dir/clang-stat-cache.cpp.o:(symbol
>>  main+0x11be)
>>
>> (and many more symbols)
>
> Additional information: those are cross-compiled, and for some reason the 
> `-framework CoreServices` in the CMakeLists.txt doesn't seem to make it to 
> the command line, which is surprising because something similar works fine 
> for e.g. dsymutil.

Thanks for the report, unfortunately I have no way of testing this setup. Any 
chance you can experiment on your side? DirectoryWatcher uses the came decency, 
I'm wondering why it's not an issue there. If it helps to revert the change 
while someone investigates, that's perfectly fine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-19 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added a comment.

In D136651#4064474 , @glandium wrote:

> In D136651#4064260 , @glandium 
> wrote:
>
>> This broke our mac builds with errors like:
>>
>>   ld64.lld: error: undefined symbol: CFRunLoopRun
>>   >>> referenced by 
>> tools/clang/tools/clang-stat-cache/CMakeFiles/clang-stat-cache.dir/clang-stat-cache.cpp.o:(symbol
>>  main+0x11be)
>>
>> (and many more symbols)
>
> Additional information: those are cross-compiled, and for some reason the 
> `-framework CoreServices` in the CMakeLists.txt doesn't seem to make it to 
> the command line, which is surprising because something similar works fine 
> for e.g. dsymutil.

I ran into similar issues, not when cross compiling, but when compiling 
natively. In my case, I'm building with `-DLLVM_LINK_LLVM_DYLIB=ON`. Without 
that, it builds correctly for me.

The new tool needs to be linked with `-framework CoreServices`. This does get 
set in `clang/tools/clang-stat-cache/CMakeLists.txt` like this:

  if(APPLE)
  set(CLANG_STAT_CACHE_LIB_DEPS
"-framework CoreServices"
)
  endif()
  
  clang_target_link_libraries(clang-stat-cache
PRIVATE
${CLANG_STAT_CACHE_LIB_DEPS}
)

However, `clang_target_link_libraries` ignores extra dependencies when linking 
against the dylib: 
https://github.com/llvm/llvm-project/blob/a033dbbe5c43247b60869b008e67ed86ed230eaa/clang/cmake/modules/AddClang.cmake#L209-L213

  if (CLANG_LINK_CLANG_DYLIB)
target_link_libraries(${target} ${type} clang-cpp)
  else()
target_link_libraries(${target} ${type} ${ARGN})
  endif()

I guess `clang_target_link_libraries` needs a mechanism to disambiguate between 
generic clang library dependencies (which need to be dropped when linking 
against `clang-cpp` instead) and other dependencies which always are needed. I 
wonder if there's prior art for such disambiguation in other places - e.g. 
`llvm_add_library` does have similar logic for linking against either other 
llvm libraries or the dylib (and handles lots of other options). (I'm a bit out 
of time for digging further into this right now...)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-19 Thread Mike Hommey via Phabricator via cfe-commits
glandium added a comment.

In D136651#4064260 , @glandium wrote:

> This broke our mac builds with errors like:
>
>   ld64.lld: error: undefined symbol: CFRunLoopRun
>   >>> referenced by 
> tools/clang/tools/clang-stat-cache/CMakeFiles/clang-stat-cache.dir/clang-stat-cache.cpp.o:(symbol
>  main+0x11be)
>
> (and many more symbols)

Additional information: those are cross-compiled, and for some reason the 
`-framework CoreServices` in the CMakeLists.txt doesn't seem to make it to the 
command line, which is surprising because something similar works fine for e.g. 
dsymutil.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-18 Thread Mike Hommey via Phabricator via cfe-commits
glandium added a comment.

This broke our mac builds with errors like:

  ld64.lld: error: undefined symbol: CFRunLoopRun
  >>> referenced by 
tools/clang/tools/clang-stat-cache/CMakeFiles/clang-stat-cache.dir/clang-stat-cache.cpp.o:(symbol
 main+0x11be)

(and many more symbols)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-18 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

In D136651#4063818 , @friss wrote:

> In D136651#4063632 , @lebedev.ri 
> wrote:
>
>> In D136651#4063597 , @friss wrote:
>>
>>> In D136651#4060071 , @lebedev.ri 
>>> wrote:
>>>
 Docs still missing.
>>>
>>> Sorry @lebedev.ri I missed your earlier comment about this. What format of 
>>> doc would you like to see? A more elaborate comment in the tool's source or 
>>> something else?
>>
>> Right now it's impossible to discover that tool unless you already know it 
>> exists (or it's mentioned in some xcode doc, i guess)
>> It should have it's own description in it's `--help`, and ideally an `.rst` 
>> documentation page in `docs`.
>
> I have no issue adding a more detailed help text. The documentation page 
> seems somewhat overkill, but I guess I could be convinced otherwise. You need 
> a fairly special build environment for this to make a difference and I'm not 
> sure it's broadly applicable. You also need good build system integration to 
> make it practical.

... and all of that knowledge seems to be missing from the current docs, which 
is why i'm asking in the first place.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-18 Thread Frederic Riss via Phabricator via cfe-commits
friss added a comment.

In D136651#4063632 , @lebedev.ri 
wrote:

> In D136651#4063597 , @friss wrote:
>
>> In D136651#4060071 , @lebedev.ri 
>> wrote:
>>
>>> Docs still missing.
>>
>> Sorry @lebedev.ri I missed your earlier comment about this. What format of 
>> doc would you like to see? A more elaborate comment in the tool's source or 
>> something else?
>
> Right now it's impossible to discover that tool unless you already know it 
> exists (or it's mentioned in some xcode doc, i guess)
> It should have it's own description in it's `--help`, and ideally an `.rst` 
> documentation page in `docs`.

I have no issue adding a more detailed help text. The documentation page seems 
somewhat overkill, but I guess I could be convinced otherwise. You need a 
fairly special build environment for this to make a difference and I'm not sure 
it's broadly applicable. You also need good build system integration to make it 
practical.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-18 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

In D136651#4063597 , @friss wrote:

> In D136651#4060071 , @lebedev.ri 
> wrote:
>
>> Docs still missing.
>
> Sorry @lebedev.ri I missed your earlier comment about this. What format of 
> doc would you like to see? A more elaborate comment in the tool's source or 
> something else?

Right now it's impossible to discover that tool unless you already know it 
exists (or it's mentioned in some xcode doc, i guess)
It should have it's own description in it's `--help`, and ideally an `.rst` 
documentation page in `docs`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-18 Thread Frederic Riss via Phabricator via cfe-commits
friss added a subscriber: lebedev.ri.
friss added a comment.

In D136651#4060071 , @lebedev.ri 
wrote:

> Docs still missing.

Sorry @lebedev.ri I missed your earlier comment about this. What format of doc 
would you like to see? A more elaborate comment in the tool's source or 
something else?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-18 Thread Frederic Riss via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa033dbbe5c43: [Clang] Give Clang the ability to use a shared 
stat cache (authored by friss).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Frontend/CompilerInvocation.h
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  llvm/include/llvm/Support/StatCacheFileSystem.h
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/StatCacheFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -14,9 +14,11 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/StatCacheFileSystem.h"
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3228,3 +3230,306 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+sys::fs::file_status s;
+status(BaseDir, s);
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, s, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+// Base path should be present in the stat cache.
+Filenames.push_back(std::string(BaseDir));
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+VersionMatch, FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache =
+vfs::StatCacheFileSystem::create(std::move(*ErrorOrBuffer), Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-17 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 489968.
friss added a comment.

More Windows fixes


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Frontend/CompilerInvocation.h
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  llvm/include/llvm/Support/StatCacheFileSystem.h
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/StatCacheFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -14,9 +14,11 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/StatCacheFileSystem.h"
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3228,3 +3230,306 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+sys::fs::file_status s;
+status(BaseDir, s);
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, s, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+// Base path should be present in the stat cache.
+Filenames.push_back(std::string(BaseDir));
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+VersionMatch, FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache =
+vfs::StatCacheFileSystem::create(std::move(*ErrorOrBuffer), Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile _acd(TestDirectory.path("a/c/d"), "", "Dummy contents");
+  TempFile _ace(TestDirectory.path("a/c/e"));
+  

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-17 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Docs still missing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-17 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 489914.
friss added a comment.

The pre-commit CI showed some test failures on Windows. Try to address these.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Frontend/CompilerInvocation.h
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  llvm/include/llvm/Support/StatCacheFileSystem.h
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/StatCacheFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -14,9 +14,11 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/StatCacheFileSystem.h"
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3228,3 +3230,306 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+sys::fs::file_status s;
+status(BaseDir, s);
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, s, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+// Base path should be present in the stat cache.
+Filenames.push_back(std::string(BaseDir));
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+VersionMatch, FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache =
+vfs::StatCacheFileSystem::create(std::move(*ErrorOrBuffer), Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile _acd(TestDirectory.path("a/c/d"), "", "Dummy 

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-06 Thread Ben Langmuir via Phabricator via cfe-commits
benlangmuir added a comment.

> Sorry for missing these. I think I addressed all of them now.

Thanks, LGTM!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-05 Thread Frederic Riss via Phabricator via cfe-commits
friss added a comment.

In D136651#4030008 , @benlangmuir 
wrote:

> Re-added a few of my minor comments that I think got lost on a specific 
> version of the diff. Otherwise LGTM.

Sorry for missing these. I think I addressed all of them now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-05 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 486711.
friss added a comment.

Fix Windows build (Thanks @ravi-ramaseshan)
Address @benlangmuir comments


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Frontend/CompilerInvocation.h
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  llvm/include/llvm/Support/StatCacheFileSystem.h
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/StatCacheFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -14,9 +14,11 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/StatCacheFileSystem.h"
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3228,3 +3230,306 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+sys::fs::file_status s;
+status(BaseDir, s);
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, s, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+// Base path should be present in the stat cache.
+Filenames.push_back(std::string(BaseDir));
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+VersionMatch, FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache =
+vfs::StatCacheFileSystem::create(std::move(*ErrorOrBuffer), Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile _acd(TestDirectory.path("a/c/d"), "", "Dummy 

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-05 Thread Ben Langmuir via Phabricator via cfe-commits
benlangmuir accepted this revision.
benlangmuir added a comment.

Re-added a few of my minor comments that I think got lost on a specific version 
of the diff. Otherwise LGTM.




Comment at: llvm/include/llvm/Support/StatCacheFileSystem.h:23
+/// A ProxyFileSystem using cached information for status() rather than going 
to
+/// the real filesystem.
+///

s/real/underlying/ - it could be wrapping any filesystem in principle.



Comment at: llvm/include/llvm/Support/StatCacheFileSystem.h:26
+/// When dealing with a huge tree of (mostly) immutable filesystem content
+/// like and SDK, it can be very costly to ask the underlying filesystem for
+/// `stat` data. Even when caching the `stat`s internally, having many

typo: "and SDK" -> "an SDK"



Comment at: llvm/include/llvm/Support/StatCacheFileSystem.h:51
+  static Expected>
+  create(std::unique_ptr &,
+ StringRef CacheFilename, IntrusiveRefCntPtr FS);

Nit: It's unusal to see unique_ptr && instead of just unique_ptr. Is this 
needed? Same for the constructor.



Comment at: llvm/include/llvm/Support/StatCacheFileSystem.h:52
+  create(std::unique_ptr &,
+ StringRef CacheFilename, IntrusiveRefCntPtr FS);
+

Is CacheFilename different from CacheBuffer->getBufferIdentifier()?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-04 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 486466.
friss added a comment.

Rebase correctly


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Frontend/CompilerInvocation.h
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  llvm/include/llvm/Support/StatCacheFileSystem.h
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/StatCacheFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -14,9 +14,11 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/StatCacheFileSystem.h"
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3228,3 +3230,306 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+sys::fs::file_status s;
+status(BaseDir, s);
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, s, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+// Base path should be present in the stat cache.
+Filenames.push_back(std::string(BaseDir));
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+VersionMatch, FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache = vfs::StatCacheFileSystem::create(
+std::move(*ErrorOrBuffer), OutputFile, Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile _acd(TestDirectory.path("a/c/d"), "", "Dummy contents");
+  TempFile 

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-04 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

As someone who has no idea what this does, this seems to be missing docs.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-04 Thread Frederic Riss via Phabricator via cfe-commits
friss added inline comments.



Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:291
+  IsCaseSensitive =
+  ::pathconf(TargetDirectory.c_str(), _PC_CASE_SENSITIVE) == 1;
+#endif

benlangmuir wrote:
> Is this pathconf extension Darwin-only?
I don't think so. It's a BSD extension.



Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:300
+  auto endTime = llvm::TimeRecord::getCurrentTime();
+  endTime -= startTime;
+

benlangmuir wrote:
> Nit: move subtraction to the print or create a new "duration" variable 
> instead of mutating endTime. Same below for writeStatCache.
I would have done so originally if there was a TimeRecord overload that allows 
simple subtraction. I don't think that exists. I renamed the variable to 
`duration`, maybe it make the code a little more readable.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-04 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 486439.
friss added a comment.

Address review feedback
Move StatCacheFileSystem to its own file.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Frontend/CompilerInvocation.h
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  clang/tools/driver/features.json
  llvm/include/llvm/Support/StatCacheFileSystem.h
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/StatCacheFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -14,9 +14,11 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/StatCacheFileSystem.h"
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3388,3 +3390,306 @@
   EXPECT_FALSE(Redirecting->exists("/b"));
   EXPECT_TRUE(Redirecting->exists("/vfile"));
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+sys::fs::file_status s;
+status(BaseDir, s);
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, s, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+// Base path should be present in the stat cache.
+Filenames.push_back(std::string(BaseDir));
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+VersionMatch, FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache = vfs::StatCacheFileSystem::create(
+std::move(*ErrorOrBuffer), OutputFile, Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile 

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2023-01-03 Thread Ravi Ramaseshan via Phabricator via cfe-commits
ravi-ramaseshan requested changes to this revision.
ravi-ramaseshan added inline comments.



Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:28
+#include 
+#include 
+

Some of these includes may not be available on a platform like Windows. Can you 
please guard them appropriately so that the build does not break?



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:3019
+
+  __attribute__((unused)) bool Consumed = Path.consume_front(BaseDir);
+  assert(Consumed && "Path does not start with expected prefix.");

Please use `LLVM_ATTRIBUTE_UNUSED`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-11-16 Thread Ben Langmuir via Phabricator via cfe-commits
benlangmuir requested changes to this revision.
benlangmuir added inline comments.
This revision now requires changes to proceed.



Comment at: clang/include/clang/Driver/Options.td:3374
+def ivfsstatcache : JoinedOrSeparate<["-"], "ivfsstatcache">, 
Group, Flags<[CC1Option]>,
+  HelpText<"Use the stat data cached in file instead of doing 
filesystemsyscalls. See clang-stat-cache utility.">;
 def ivfsoverlay : JoinedOrSeparate<["-"], "ivfsoverlay">, 
Group, Flags<[CC1Option]>,

"filesystemsyscalls" -> "filesystem syscalls"



Comment at: clang/lib/Frontend/CompilerInvocation.cpp:5117
+llvm::ErrorOr> Buffer =
+llvm::MemoryBuffer::getFile(File);
+if (!Buffer) {

We should use `BaseFS->getBufferForFile` here in case the BaseFS is not the 
real filesystem. This matches how vfsoverlay works below and could avoid 
re-reading this file repeatedly during scanning where we have a caching 
filesystem at the base.



Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:176
+  if (EC) {
+llvm::errs() << "fstat failed\n";
+return false;

This could use more description of the error.



Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:291
+  IsCaseSensitive =
+  ::pathconf(TargetDirectory.c_str(), _PC_CASE_SENSITIVE) == 1;
+#endif

Is this pathconf extension Darwin-only?



Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:300
+  auto endTime = llvm::TimeRecord::getCurrentTime();
+  endTime -= startTime;
+

Nit: move subtraction to the print or create a new "duration" variable instead 
of mutating endTime. Same below for writeStatCache.



Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1113
+/// A ProxyFileSystem using cached information for status() rather than going 
to
+/// the real filesystem.
+///

s/real/underlying/ - it could be wrapping any filesystem in principle.



Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1116
+/// When dealing with a huge tree of (mostly) immutable filesystem content
+/// like and SDK, it can be very costly to ask the underlying filesystem for
+/// `stat` data. Even when caching the `stat`s internally, having many

typo: "and SDK" -> "an SDK"



Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1141
+  static Expected>
+  create(std::unique_ptr &,
+ StringRef CacheFilename, IntrusiveRefCntPtr FS);

Nit: It's unusal to see `unique_ptr &&` instead of just `unique_ptr`.  Is this 
needed?  Same for the constructor.



Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1142
+  create(std::unique_ptr &,
+ StringRef CacheFilename, IntrusiveRefCntPtr FS);
+

Is `CacheFilename` different from `CacheBuffer->getBufferIdentifier()`?



Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1167
+StatCacheWriter(StringRef BaseDir, bool IsCaseSensitive,
+uint64_t ValidityToken = 0);
+~StatCacheWriter();

Currently we are not adding the SDK base directory itself to the cache, which 
causes `status(BaseDir)` to fail.  I think we should add a status parameter to 
the constructor of `StatCacheWriter` to force it to be included.  This case 
also needs a test.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2816
+
+class StatCacheFileSystem::StatCacheLookupInfo {
+public:

Does this depend on other non-public code from this file, or could we split it 
into StatCacheFileSystem.cpp?  Obviously there is a lot of code here already, 
but would be nice to break the trend if it's easy to do so.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2907
+// // whether the cache is up-to-date.
+//char BaseDir[N]; // Zero terminated path to the base directory
+//< OnDiskHashtable Data > // Data for the has table. The keys are the

I think this comment is missing 'Version', which was added later.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:3050
+  // Move back to right after the Magic to insert BucketOffset
+  Out.seek(4);
+  Out.write((const char *), sizeof(BucketOffset));

Nit: could we use pwrite here and then change this method to use the 
more-general `raw_pwrite_stream` instead of being fd_ostream-specific?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-11-01 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 472379.
friss added a comment.

Fix unit test build


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  clang/tools/driver/features.json
  llvm/include/llvm/Support/VirtualFileSystem.h
  llvm/lib/Support/VirtualFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -17,6 +17,7 @@
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3159,3 +3160,302 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  sys::fs::file_status s;
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+VersionMatch, FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache = vfs::StatCacheFileSystem::create(
+std::move(*ErrorOrBuffer), OutputFile, Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile _acd(TestDirectory.path("a/c/d"), "", "Dummy contents");
+  TempFile _ace(TestDirectory.path("a/c/e"));
+  TempFile _acf(TestDirectory.path("a/c/f"), "", "More dummy contents");
+  TempDir _ag(TestDirectory.path("a/g"));
+  TempFile _agh(TestDirectory.path("a/g/h"));
+
+  StringRef BaseDir(_a.path());
+
+  SmallVector Filenames;
+  IntrusiveRefCntPtr StatCacheFS;
+  createStatCacheFileSystem(TestDirectory.path("stat.cache"), BaseDir,
+/* 

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-11-01 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 472355.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  clang/tools/driver/features.json
  llvm/include/llvm/Support/VirtualFileSystem.h
  llvm/lib/Support/VirtualFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -17,6 +17,7 @@
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3159,3 +3160,302 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  sys::fs::file_status s;
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache = vfs::StatCacheFileSystem::create(
+std::move(*ErrorOrBuffer), OutputFile, Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile _acd(TestDirectory.path("a/c/d"), "", "Dummy contents");
+  TempFile _ace(TestDirectory.path("a/c/e"));
+  TempFile _acf(TestDirectory.path("a/c/f"), "", "More dummy contents");
+  TempDir _ag(TestDirectory.path("a/g"));
+  TempFile _agh(TestDirectory.path("a/g/h"));
+
+  StringRef BaseDir(_a.path());
+
+  SmallVector Filenames;
+  IntrusiveRefCntPtr StatCacheFS;
+  createStatCacheFileSystem(TestDirectory.path("stat.cache"), BaseDir,
+/* IsCaseSensitive= */ true, StatCacheFS,
+ 

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-11-01 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 472321.
friss added a comment.

Add version number in cache file.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  llvm/include/llvm/Support/VirtualFileSystem.h
  llvm/lib/Support/VirtualFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -17,6 +17,7 @@
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3159,3 +3160,302 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  sys::fs::file_status s;
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+bool VersionMatch;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_TRUE(VersionMatch);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache = vfs::StatCacheFileSystem::create(
+std::move(*ErrorOrBuffer), OutputFile, Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile _acd(TestDirectory.path("a/c/d"), "", "Dummy contents");
+  TempFile _ace(TestDirectory.path("a/c/e"));
+  TempFile _acf(TestDirectory.path("a/c/f"), "", "More dummy contents");
+  TempDir _ag(TestDirectory.path("a/g"));
+  TempFile _agh(TestDirectory.path("a/g/h"));
+
+  StringRef BaseDir(_a.path());
+
+  SmallVector Filenames;
+  IntrusiveRefCntPtr StatCacheFS;
+  createStatCacheFileSystem(TestDirectory.path("stat.cache"), BaseDir,
+/* IsCaseSensitive= */ true, 

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-26 Thread Steven Wu via Phabricator via cfe-commits
steven_wu added a comment.

> Yes, we can. I pondered doing it in the original patch but didn't see a 
> reason this would evolve in the future. Of course, I can't predict the 
> future, so maybe a version field is the safe approach.

I think a version number should be good enough for now. For the portability, it 
is probably not too big a concern. We can add that in future versions if it is 
needed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-26 Thread Frederic Riss via Phabricator via cfe-commits
friss added inline comments.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:3002
+// The format of the stat cache is (pseudo-code):
+//  struct stat_cache {
+//char Magic[4];   // "STAT" or "Stat"

steven_wu wrote:
> Few comments about stats representation.
> 1. Can we version the stat cache file so we can evolve it in the future if 
> needed?
> 2. I wonder if we need to have a more flexible representation for DataType 
> other than `sys::fs::file_status`. Current entry is locked with the endian of 
> the host and can't be used to encode more information than file_status. 
> 1. Can we version the stat cache file so we can evolve it in the future if 
> needed?

Yes, we can. I pondered doing it in the original patch but didn't see a reason 
this would evolve in the future. Of course, I can't predict the future, so 
maybe a version field is the safe approach.

> 2. I wonder if we need to have a more flexible representation for DataType 
> other than `sys::fs::file_status`. Current entry is locked with the endian of 
> the host and can't be used to encode more information than file_status. 

The non-portability of the current cache doesn't bother me as it's meant to 
mirror filesystem aspects that are local to a machine. Doesn't adding a version 
number alleviate some of these concerns? If we wanted to add something else in 
the future, or make it portable, we could just bump that.

I think your comment hints at a higher-level question though. If we were to add 
something else to the cached data, it wouldn't be a stat cache anymore. Should 
the feature be a more general "filesystem cache" which starts out with just 
stat data? 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-25 Thread Steven Wu via Phabricator via cfe-commits
steven_wu added inline comments.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:3002
+// The format of the stat cache is (pseudo-code):
+//  struct stat_cache {
+//char Magic[4];   // "STAT" or "Stat"

Few comments about stats representation.
1. Can we version the stat cache file so we can evolve it in the future if 
needed?
2. I wonder if we need to have a more flexible representation for DataType 
other than `sys::fs::file_status`. Current entry is locked with the endian of 
the host and can't be used to encode more information than file_status. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-25 Thread Frederic Riss via Phabricator via cfe-commits
friss updated this revision to Diff 470644.
friss added a comment.

Address some review feedback


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CMakeLists.txt
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  llvm/include/llvm/Support/VirtualFileSystem.h
  llvm/lib/Support/VirtualFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -17,6 +17,7 @@
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3159,3 +3160,300 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  sys::fs::file_status s;
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache = vfs::StatCacheFileSystem::create(
+std::move(*ErrorOrBuffer), OutputFile, Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+  EXPECT_EQ(s1.getLastModificationTime(), s2.getLastModificationTime());
+  EXPECT_EQ(s1.getUniqueID(), s2.getUniqueID());
+  EXPECT_EQ(s1.getUser(), s2.getUser());
+  EXPECT_EQ(s1.getGroup(), s2.getGroup());
+  EXPECT_EQ(s1.getSize(), s2.getSize());
+}
+  }
+};
+
+TEST_F(StatCacheFileSystemTest, Basic) {
+  TempDir TestDirectory("virtual-file-system-test", /*Unique*/ true);
+  TempDir _a(TestDirectory.path("a"));
+  TempFile _ab(TestDirectory.path("a/b"));
+  TempDir _ac(TestDirectory.path("a/c"));
+  TempFile _acd(TestDirectory.path("a/c/d"), "", "Dummy contents");
+  TempFile _ace(TestDirectory.path("a/c/e"));
+  TempFile _acf(TestDirectory.path("a/c/f"), "", "More dummy contents");
+  TempDir _ag(TestDirectory.path("a/g"));
+  TempFile _agh(TestDirectory.path("a/g/h"));
+
+  StringRef BaseDir(_a.path());
+
+  SmallVector Filenames;
+  IntrusiveRefCntPtr StatCacheFS;
+  createStatCacheFileSystem(TestDirectory.path("stat.cache"), BaseDir,
+/* IsCaseSensitive= */ true, StatCacheFS,
+Filenames);
+  

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-24 Thread Ben Barham via Phabricator via cfe-commits
bnbarham added a comment.

> You're correct that this overhead has been measured on implicit module 
> builds. As I mentioned in the commit message this saves over 20% of the 
> overall built time in some cases. This time is split between module 
> validation (which could be skipped) and HeaderSearch (which cannot be 
> skipped).

Heh, I read the commit as saying we've seen a 20% overhead, but not that this 
completely removed that. Not sure why I didn't make that connection in my head, 
it's a reasonable implication. Very nice!




Comment at: clang/lib/Frontend/CompilerInvocation.cpp:5115-5130
+  for (const auto  : CI.getHeaderSearchOpts().VFSStatCacheFiles) {
+llvm::ErrorOr> Buffer =
+llvm::MemoryBuffer::getFile(File);
+if (!Buffer) {
+  Diags.Report(diag::err_missing_vfs_stat_cache_file) << File;
+  continue;
+}

friss wrote:
> bnbarham wrote:
> > IMO VFS overlays being modeled like this is a mistake and makes 
> > reasoning/testing about them fairly difficult. I have a PR up 
> > https://reviews.llvm.org/D121423 to change `OverlayFileSystem` to make more 
> > sense and be a real overlay rather than what it is now. If I finish that 
> > one off, how would you feel about changing the behavior of 
> > `StatCacheFileSystem` to just immediately error if it doesn't contain the 
> > file, rather than proxy (and thus chain) as it does now?
> > 
> > So for multiple files we'd then have:
> >   - OverlayFileSystem
> > - StatCacheFileSystem
> > - StatCacheFileSystem
> > - RealFileSystem
> > 
> > Then any non-stat or exists call would return 
> > `llvm::errc::no_such_file_or_directory` and then the next FS would be used 
> > instead.
> > 
> > I don't think this *really* matters for `StatCacheFileSystem`, so I'm fine 
> > if you'd rather not wait for me to change `OverlayFileSystem`. I can make 
> > the changes myself after getting my PR in.
> I don't think that's really doable if you want to keep the ability to cache 
> negative hits. If a miss always results in a query to the real filesystem, 
> then you're not saving the `stat` call. A lot of the time is spent in 
> HeaderSearch which queries a similar number of non-existing and existing 
> files.
> But I'm not dead set on this. I also haven't spent a lot of time thinking 
> about your proposal.
That's a great point. It'd be easy enough to avoid by eg. passing down an error 
to use for either "this is a path I handle and it doesn't exist" or "this is 
not a path I handle" and handling that in `OverlayFileSystem`, which could 
default to `llvm:errc::no_such_file_or_directory`. Generally I just think this 
is easier to reason about, but it's more  problem for `RedirectingFileSystem` 
then it is for this (since this isn't *really* an overlay at all).

This is definitely fine as is either way though. If I find 
`RedirectingFileSystem` needs the same sort of idea then we can revisit it then.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2959-2960
+// the pathis different. The canonicalization that the call to 
remove_dots()
+// does leaves only '..' with symlinks as a source of confusion. If the 
path
+// does not contain '..' we can safely say it doesn't exist.
+if (std::find(sys::path::begin(SuffixPath), sys::path::end(SuffixPath),

friss wrote:
> bnbarham wrote:
> > This sentence is a little confusing to me. `remove_dots` just leaves `..` 
> > unless you pass `remove_dot_dot` (but it doesn't do any checking). IMO just 
> > the `If the path does not contain '..' we can safely say it doesn't exist.` 
> > is enough.
> `remove_dots` does more than remove dots. It is confusing, but it also 
> removes excess separators (see the `Canonical` unit test). This means that 
> the cache will work for /path/to/cache/file/a as well as 
> /path/to/cache/file///a and /path/to/cache/file/././a. There are basically 
> infinite spellings of a path just by adding `.` and additional separators.
> `..` is interesting because it's not semantically preserving to remove them 
> in the presence of symlinks.
> I'm fine with simplifying the description, but that is the nuance I tried to 
> convey.
I suppose it could be confusing if you didn't know what `remove_dots` did, but 
in that case I'd prefer something like:
> `remove_dots` canonicalizes the path by removing `.` and excess separators, 
> but leaves `..` since it isn't semantically preserving to remove them.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2961
+// does not contain '..' we can safely say it doesn't exist.
+if (std::find(sys::path::begin(SuffixPath), sys::path::end(SuffixPath),
+  "..") == sys::path::end(SuffixPath)) {

friss wrote:
> bnbarham wrote:
> > FWIW `StringRef` has a `contains`
> But that wouldn't be correct. Here we are looking for a path component which 
> is `..`. A simple text search would fire on a filename containing 

[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-24 Thread Frederic Riss via Phabricator via cfe-commits
friss added a comment.

Thanks for the initial feedback!

> Mostly just skimmed so far, will hopefully have some time to look at this 
> more tomorrow. Out of interest, do you have any performance numbers using 
> this change? I assume this mainly impacts implicit modules (since I suspect 
> we'd also be opening the file as well anyway), is that true?

You're correct that this overhead has been measured on implicit module builds. 
As I mentioned in the commit message this saves over 20% of the overall built 
time in some cases. This time is split between module validation (which could 
be skipped) and HeaderSearch (which cannot be skipped).




Comment at: clang/lib/Frontend/CompilerInvocation.cpp:5115-5130
+  for (const auto  : CI.getHeaderSearchOpts().VFSStatCacheFiles) {
+llvm::ErrorOr> Buffer =
+llvm::MemoryBuffer::getFile(File);
+if (!Buffer) {
+  Diags.Report(diag::err_missing_vfs_stat_cache_file) << File;
+  continue;
+}

bnbarham wrote:
> IMO VFS overlays being modeled like this is a mistake and makes 
> reasoning/testing about them fairly difficult. I have a PR up 
> https://reviews.llvm.org/D121423 to change `OverlayFileSystem` to make more 
> sense and be a real overlay rather than what it is now. If I finish that one 
> off, how would you feel about changing the behavior of `StatCacheFileSystem` 
> to just immediately error if it doesn't contain the file, rather than proxy 
> (and thus chain) as it does now?
> 
> So for multiple files we'd then have:
>   - OverlayFileSystem
> - StatCacheFileSystem
> - StatCacheFileSystem
> - RealFileSystem
> 
> Then any non-stat or exists call would return 
> `llvm::errc::no_such_file_or_directory` and then the next FS would be used 
> instead.
> 
> I don't think this *really* matters for `StatCacheFileSystem`, so I'm fine if 
> you'd rather not wait for me to change `OverlayFileSystem`. I can make the 
> changes myself after getting my PR in.
I don't think that's really doable if you want to keep the ability to cache 
negative hits. If a miss always results in a query to the real filesystem, then 
you're not saving the `stat` call. A lot of the time is spent in HeaderSearch 
which queries a similar number of non-existing and existing files.
But I'm not dead set on this. I also haven't spent a lot of time thinking about 
your proposal.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2959-2960
+// the pathis different. The canonicalization that the call to 
remove_dots()
+// does leaves only '..' with symlinks as a source of confusion. If the 
path
+// does not contain '..' we can safely say it doesn't exist.
+if (std::find(sys::path::begin(SuffixPath), sys::path::end(SuffixPath),

bnbarham wrote:
> This sentence is a little confusing to me. `remove_dots` just leaves `..` 
> unless you pass `remove_dot_dot` (but it doesn't do any checking). IMO just 
> the `If the path does not contain '..' we can safely say it doesn't exist.` 
> is enough.
`remove_dots` does more than remove dots. It is confusing, but it also removes 
excess separators (see the `Canonical` unit test). This means that the cache 
will work for /path/to/cache/file/a as well as /path/to/cache/file///a and 
/path/to/cache/file/././a. There are basically infinite spellings of a path 
just by adding `.` and additional separators.
`..` is interesting because it's not semantically preserving to remove them in 
the presence of symlinks.
I'm fine with simplifying the description, but that is the nuance I tried to 
convey.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2961
+// does not contain '..' we can safely say it doesn't exist.
+if (std::find(sys::path::begin(SuffixPath), sys::path::end(SuffixPath),
+  "..") == sys::path::end(SuffixPath)) {

bnbarham wrote:
> FWIW `StringRef` has a `contains`
But that wouldn't be correct. Here we are looking for a path component which is 
`..`. A simple text search would fire on a filename containing `..`. I think 
this search on components is the only correct way to do this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-24 Thread Ben Barham via Phabricator via cfe-commits
bnbarham added a comment.

Mostly just skimmed so far, will hopefully have some time to look at this more 
tomorrow. Out of interest, do you have any performance numbers using this 
change? I assume this mainly impacts implicit modules (since I suspect we'd 
also be opening the file as well anyway), is that true?




Comment at: clang/lib/Frontend/CompilerInvocation.cpp:5115-5130
+  for (const auto  : CI.getHeaderSearchOpts().VFSStatCacheFiles) {
+llvm::ErrorOr> Buffer =
+llvm::MemoryBuffer::getFile(File);
+if (!Buffer) {
+  Diags.Report(diag::err_missing_vfs_stat_cache_file) << File;
+  continue;
+}

IMO VFS overlays being modeled like this is a mistake and makes 
reasoning/testing about them fairly difficult. I have a PR up 
https://reviews.llvm.org/D121423 to change `OverlayFileSystem` to make more 
sense and be a real overlay rather than what it is now. If I finish that one 
off, how would you feel about changing the behavior of `StatCacheFileSystem` to 
just immediately error if it doesn't contain the file, rather than proxy (and 
thus chain) as it does now?

So for multiple files we'd then have:
  - OverlayFileSystem
- StatCacheFileSystem
- StatCacheFileSystem
- RealFileSystem

Then any non-stat or exists call would return 
`llvm::errc::no_such_file_or_directory` and then the next FS would be used 
instead.

I don't think this *really* matters for `StatCacheFileSystem`, so I'm fine if 
you'd rather not wait for me to change `OverlayFileSystem`. I can make the 
changes myself after getting my PR in.



Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:143
+// There is no cross-platform way to implement a validity check. If this
+// platofrm doesn't support it, just consider the cache constents always
+// valid. When that's the case, the tool running cache generation needs





Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2958
+// cache prefix. It could be that the file doesn't exist, or the spelling
+// the pathis different. The canonicalization that the call to 
remove_dots()
+// does leaves only '..' with symlinks as a source of confusion. If the 
path





Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2959-2960
+// the pathis different. The canonicalization that the call to 
remove_dots()
+// does leaves only '..' with symlinks as a source of confusion. If the 
path
+// does not contain '..' we can safely say it doesn't exist.
+if (std::find(sys::path::begin(SuffixPath), sys::path::end(SuffixPath),

This sentence is a little confusing to me. `remove_dots` just leaves `..` 
unless you pass `remove_dot_dot` (but it doesn't do any checking). IMO just the 
`If the path does not contain '..' we can safely say it doesn't exist.` is 
enough.



Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2961
+// does not contain '..' we can safely say it doesn't exist.
+if (std::find(sys::path::begin(SuffixPath), sys::path::end(SuffixPath),
+  "..") == sys::path::end(SuffixPath)) {

FWIW `StringRef` has a `contains`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-24 Thread Thorsten via Phabricator via cfe-commits
tschuett added inline comments.



Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:61
+
+namespace {
+

Sorry, but you misuse anonymous namespaces. You want static instead.
https://llvm.org/docs/CodingStandards.html#anonymous-namespaces


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

2022-10-24 Thread Frederic Riss via Phabricator via cfe-commits
friss created this revision.
friss added reviewers: jansvoboda11, bnbarham, arphaman.
Herald added a subscriber: hiraditya.
Herald added a project: All.
friss requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

  Every Clang instance uses an internal FileSystemStatCache to avoid
  stating the same content multiple times. However, different instances
  of Clang will contend for filesystem access for their initial stats
  during HeaderSearch or module validation.
  
  On some workloads, the time spent in the kernel in these concurrent
  stat calls has been measured to be over 20% of the overall compilation
  time. This is extremly wassteful when most of the stat calls target
  mostly immutable content like a SDK.
  
  This commit introduces a new tool `clang-stat-cache` able to generate
  an OnDiskHashmap containing the stat data for a given filesystem
  hierarchy.
  
  The driver part of this has been modeled after -ivfsoverlay given
  the similarities with what it influences. It introduces a new
  -ivfsstatcache driver option to instruct Clang to use a stat cache
  generated by `clang-stat-cache`. These stat caches are inserted at
  the bottom of the VFS stack (right above the real filesystem).


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D136651

Files:
  clang/include/clang/Basic/DiagnosticFrontendKinds.td
  clang/include/clang/Driver/Options.td
  clang/include/clang/Lex/HeaderSearchOptions.h
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/Driver/vfsstatcache.c
  clang/test/clang-stat-cache/cache-effects.c
  clang/test/clang-stat-cache/errors.test
  clang/tools/CMakeLists.txt
  clang/tools/clang-stat-cache/CMakeLists.txt
  clang/tools/clang-stat-cache/clang-stat-cache.cpp
  llvm/include/llvm/Support/VirtualFileSystem.h
  llvm/lib/Support/VirtualFileSystem.cpp
  llvm/unittests/Support/VirtualFileSystemTest.cpp

Index: llvm/unittests/Support/VirtualFileSystemTest.cpp
===
--- llvm/unittests/Support/VirtualFileSystemTest.cpp
+++ llvm/unittests/Support/VirtualFileSystemTest.cpp
@@ -17,6 +17,7 @@
 #include "llvm/Testing/Support/SupportHelpers.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 #include 
 
@@ -3159,3 +3160,300 @@
 "  DummyFileSystem (RecursiveContents)\n",
 Output);
 }
+
+class StatCacheFileSystemTest : public ::testing::Test {
+public:
+  void SetUp() override {}
+
+  template 
+  void createStatCacheFileSystem(
+  StringRef OutputFile, StringRef BaseDir, bool IsCaseSensitive,
+  IntrusiveRefCntPtr ,
+  StringCollection ,
+  IntrusiveRefCntPtr Lower = new ErrorDummyFileSystem(),
+  uint64_t ValidityToken = 0) {
+vfs::StatCacheFileSystem::StatCacheWriter Generator(
+BaseDir, IsCaseSensitive, ValidityToken);
+std::error_code ErrorCode;
+
+Result.reset();
+
+for (sys::fs::recursive_directory_iterator I(BaseDir, ErrorCode), E;
+ I != E && !ErrorCode; I.increment(ErrorCode)) {
+  Filenames.push_back(I->path());
+  StringRef Path(Filenames.back().c_str());
+  sys::fs::file_status s;
+  status(Path, s);
+  Generator.addEntry(Path, s);
+}
+
+{
+  raw_fd_ostream StatCacheFile(OutputFile, ErrorCode);
+  ASSERT_FALSE(ErrorCode);
+  Generator.writeStatCache(StatCacheFile);
+}
+
+loadCacheFile(OutputFile, ValidityToken, Lower, Result);
+  }
+
+  void loadCacheFile(StringRef OutputFile, uint64_t ExpectedValidityToken,
+ IntrusiveRefCntPtr Lower,
+ IntrusiveRefCntPtr ) {
+auto ErrorOrBuffer = MemoryBuffer::getFile(OutputFile);
+EXPECT_TRUE(ErrorOrBuffer);
+StringRef CacheBaseDir;
+bool IsCaseSensitive;
+uint64_t FileValidityToken;
+auto E = vfs::StatCacheFileSystem::validateCacheFile(
+(*ErrorOrBuffer)->getMemBufferRef(), CacheBaseDir, IsCaseSensitive,
+FileValidityToken);
+ASSERT_FALSE(E);
+EXPECT_EQ(FileValidityToken, ExpectedValidityToken);
+auto ExpectedCache = vfs::StatCacheFileSystem::create(
+std::move(*ErrorOrBuffer), OutputFile, Lower);
+ASSERT_FALSE(ExpectedCache.takeError());
+Result = *ExpectedCache;
+  }
+
+  template 
+  void
+  compareStatCacheToRealFS(IntrusiveRefCntPtr CacheFS,
+   const StringCollection ) {
+IntrusiveRefCntPtr RealFS = vfs::getRealFileSystem();
+
+for (auto  : Files) {
+  auto ErrorOrStatus1 = RealFS->status(File);
+  auto ErrorOrStatus2 = CacheFS->status(File);
+
+  EXPECT_EQ((bool)ErrorOrStatus1, (bool)ErrorOrStatus2);
+  if (!ErrorOrStatus1 || !ErrorOrStatus2)
+continue;
+
+  vfs::Status s1 = *ErrorOrStatus1, s2 = *ErrorOrStatus2;
+  EXPECT_EQ(s1.getName(), s2.getName());
+  EXPECT_EQ(s1.getType(), s2.getType());
+  EXPECT_EQ(s1.getPermissions(), s2.getPermissions());
+