Author: Michael Park Date: 2025-12-15T23:33:22-08:00 New Revision: 1928c1ea9b57e9c44325d436bc7bb2f4585031f3
URL: https://github.com/llvm/llvm-project/commit/1928c1ea9b57e9c44325d436bc7bb2f4585031f3 DIFF: https://github.com/llvm/llvm-project/commit/1928c1ea9b57e9c44325d436bc7bb2f4585031f3.diff LOG: [C++20][Modules] Improve namespace look-up performance for modules. (#171769) ## Problem Given code such as `N::foo();`, we perform name look-up on `N`. In the case where `N` is a namespace declared in imported modules, one namespace decl (the "key declaration") for each module that declares a namespace `foo` is loaded and stored. In large scales where there are many such modules, (e.g., 1,500) and many uses (e.g., 500,000), this becomes extremely inefficient because every look-up (500,000 of them) return 1,500 results. The following synthetic script demonstrates the problem: ```bash #/usr/bin/env bash CLANG=${CLANG:-clang++} NUM_MODULES=${NUM_MODULES:-1500} NUM_USES=${NUM_USES:-500000} USE_MODULES=${USE_MODULES:-true} TMPDIR=$(mktemp -d) echo "Working in temp directory: $TMPDIR" cd $TMPDIR trap "rm -rf \"$TMPDIR\"" EXIT echo "namespace N { inline void foo() {} }" > m1.h for i in $(seq 2 $NUM_MODULES); do echo "namespace N {}" > m${i}.h; done if $USE_MODULES; then seq 1 $NUM_MODULES | xargs -I {} -P $(nproc) bash -c "$CLANG -std=c++20 -fmodule-header m{}.h" fi > a.cpp if $USE_MODULES; then for i in $(seq 1 $NUM_MODULES); do echo "import \"m${i}.h\";" >> a.cpp; done else for i in $(seq 1 $NUM_MODULES); do echo "#include \"m${i}.h\"" >> a.cpp; done fi echo "int main() {" >> a.cpp for i in $(seq 1 $NUM_USES); do echo " N::foo();" >> a.cpp; done echo "}" >> a.cpp if $USE_MODULES; then time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null \ $(for i in $(seq 1 $NUM_MODULES); do echo -n "-fmodule-file=m${i}.pcm "; done) else time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null fi ``` As of 575d6892bcc5cef926cfc1b95225148262c96a15, without modules (`USE_MODULES=false`) this takes about **4.5s**, whereas with modules (`USE_MODULES=true`), this takes about **37s**. With this PR, without modules there's no change (as expected) at 4.5s, but with modules it improves to about **5.2s**. ## Approach The approach taken here aims to maintain status-quo with respect to the input and output of modules. That is, the `ASTReader` and `ASTWriter` both read and write the same declarations as it did before. The difference is in the middle part: the [`StoredDeclsMap` in `DeclContext`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/AST/DeclBase.h#L2024-L2030). The `StoredDeclsMap` is roughly a `map<DeclarationName, StoredDeclsList>`. Currently, we read all of the external namespace decls from `ASTReader`, they all get stored into the `StoredDeclsList`, and the `ASTWriter` iterates through that list and writes out the results. This PR continues to read all of the external namespace decls from `ASTReader`, but only stores one namespace decl in the `StoredDeclsList`. This is okay since the reading of the decls handles all of the merging and chaining of the namespace decls, and as long as they're loaded and chained, returning one for look-up purposes is sufficient. The other half of the problem is to write out all of the external namespaces that we used to store in `StoredDeclsList` but no longer. For this, we take advantage of the [`KeyDecls`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/Serialization/ASTReader.h#L1342-L1347) data structure in `ASTReader`. `KeyDecls` is roughly a `map<Decl *, vector<GlobalDeclID>>`, and it stores a mapping from the canonical decl of a redeclarable decl to a list of `GlobalDeclID`s where each ID represents a "key declaration" from each imported module. More to the point, if we read external namespaces `N1`, `N2`, `N3` in `ASTReader`, we'll either have `N1` mapped to `[N2, N3]`, or some newly local canonical decl mapped to `[N1, N2, N3]`. Either way, we can visit `N1`, `N2`, and `N3` by doing `ASTReader::forEachImportedKeyDecls(N1, Visitor)`, and we leverage this to maintain the current behavior of writing out all of the imported namespace decls in `ASTWriter`. ## Alternatives Attempted - Tried reading fewer declarations on the `ASTReader` side, and writing out fewer declarations on the `ASTWriter` side, and neither options worked at all. - Tried trying to split `StoredDeclsList` into two pieces, one with non-namespace decls and one with only namespace decls, but that didn't work well... I think because the order of the declarations matter sometimes, and maybe also because the declaration replacement logic gets more complicated. - Tried to deduplicate at the `SemaLookup` level. Basically, retrieve all the stored decls but deduplicate populating the `LookupResult` [here](https://github.com/llvm/llvm-project/blob/release/21.x/clang/lib/Sema/SemaLookup.cpp#L1137-L1144). This did improve things slightly, but not quite enough, and this solution seemed cleaner in the end anyway. Added: clang/unittests/Serialization/NamespaceLookupTest.cpp Modified: clang/lib/Serialization/ASTReader.cpp clang/lib/Serialization/ASTWriter.cpp clang/unittests/Serialization/CMakeLists.txt Removed: ################################################################################ diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp index aec61322fb8be..e8600d1e914cd 100644 --- a/clang/lib/Serialization/ASTReader.cpp +++ b/clang/lib/Serialization/ASTReader.cpp @@ -555,7 +555,25 @@ namespace { using MacroDefinitionsMap = llvm::StringMap<std::pair<StringRef, bool /*IsUndef*/>>; -using DeclsMap = llvm::DenseMap<DeclarationName, SmallVector<NamedDecl *, 8>>; + +class DeclsSet { + SmallVector<NamedDecl *, 64> Decls; + llvm::SmallPtrSet<NamedDecl *, 8> Found; + +public: + operator ArrayRef<NamedDecl *>() const { return Decls; } + + bool empty() const { return Decls.empty(); } + + bool insert(NamedDecl *ND) { + auto [_, Inserted] = Found.insert(ND); + if (Inserted) + Decls.push_back(ND); + return Inserted; + } +}; + +using DeclsMap = llvm::DenseMap<DeclarationName, DeclsSet>; } // namespace @@ -8702,14 +8720,23 @@ bool ASTReader::FindExternalVisibleDeclsByName(const DeclContext *DC, return false; // Load the list of declarations. - SmallVector<NamedDecl *, 64> Decls; - llvm::SmallPtrSet<NamedDecl *, 8> Found; + DeclsSet DS; auto Find = [&, this](auto &&Table, auto &&Key) { for (GlobalDeclID ID : Table.find(Key)) { NamedDecl *ND = cast<NamedDecl>(GetDecl(ID)); - if (ND->getDeclName() == Name && Found.insert(ND).second) - Decls.push_back(ND); + if (ND->getDeclName() != Name) + continue; + // Special case for namespaces: There can be a lot of redeclarations of + // some namespaces, and we import a "key declaration" per imported module. + // Since all declarations of a namespace are essentially interchangeable, + // we can optimize namespace look-up by only storing the key declaration + // of the current TU, rather than storing N key declarations where N is + // the # of imported modules that declare that namespace. + // TODO: Try to generalize this optimization to other redeclarable decls. + if (isa<NamespaceDecl>(ND)) + ND = cast<NamedDecl>(getKeyDeclaration(ND)); + DS.insert(ND); } }; @@ -8744,8 +8771,8 @@ bool ASTReader::FindExternalVisibleDeclsByName(const DeclContext *DC, Find(It->second.Table, Name); } - SetExternalVisibleDeclsForName(DC, Name, Decls); - return !Decls.empty(); + SetExternalVisibleDeclsForName(DC, Name, DS); + return !DS.empty(); } void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) { @@ -8763,7 +8790,16 @@ void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) { for (GlobalDeclID ID : It->second.Table.findAll()) { NamedDecl *ND = cast<NamedDecl>(GetDecl(ID)); - Decls[ND->getDeclName()].push_back(ND); + // Special case for namespaces: There can be a lot of redeclarations of + // some namespaces, and we import a "key declaration" per imported module. + // Since all declarations of a namespace are essentially interchangeable, + // we can optimize namespace look-up by only storing the key declaration + // of the current TU, rather than storing N key declarations where N is + // the # of imported modules that declare that namespace. + // TODO: Try to generalize this optimization to other redeclarable decls. + if (isa<NamespaceDecl>(ND)) + ND = cast<NamedDecl>(getKeyDeclaration(ND)); + Decls[ND->getDeclName()].insert(ND); } // FIXME: Why a PCH test is failing if we remove the iterator after findAll? @@ -8773,9 +8809,9 @@ void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) { findAll(ModuleLocalLookups, NumModuleLocalVisibleDeclContexts); findAll(TULocalLookups, NumTULocalVisibleDeclContexts); - for (DeclsMap::iterator I = Decls.begin(), E = Decls.end(); I != E; ++I) { - SetExternalVisibleDeclsForName(DC, I->first, I->second); - } + for (auto &[Name, DS] : Decls) + SetExternalVisibleDeclsForName(DC, Name, DS); + const_cast<DeclContext *>(DC)->setHasExternalVisibleStorage(false); } diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp index dc7f93e0483e4..899fd69c2045e 100644 --- a/clang/lib/Serialization/ASTWriter.cpp +++ b/clang/lib/Serialization/ASTWriter.cpp @@ -4396,20 +4396,20 @@ class ASTDeclContextNameLookupTrait template <typename Coll> data_type getData(const Coll &Decls) { unsigned Start = DeclIDs.size(); - for (NamedDecl *D : Decls) { + auto AddDecl = [this](NamedDecl *D) { NamedDecl *DeclForLocalLookup = getDeclForLocalLookup(Writer.getLangOpts(), D); if (Writer.getDoneWritingDeclsAndTypes() && !Writer.wasDeclEmitted(DeclForLocalLookup)) - continue; + return; // Try to avoid writing internal decls to reduced BMI. // See comments in ASTWriter::WriteDeclContextLexicalBlock for details. if (Writer.isGeneratingReducedBMI() && !DeclForLocalLookup->isFromExplicitGlobalModule() && IsInternalDeclFromFileContext(DeclForLocalLookup)) - continue; + return; auto ID = Writer.GetDeclRef(DeclForLocalLookup); @@ -4423,7 +4423,7 @@ class ASTDeclContextNameLookupTrait ModuleLocalDeclsMap.insert({Key, DeclIDsTy{ID}}); else Iter->second.push_back(ID); - continue; + return; } break; case LookupVisibility::TULocal: { @@ -4432,7 +4432,7 @@ class ASTDeclContextNameLookupTrait TULocalDeclsMap.insert({D->getDeclName(), DeclIDsTy{ID}}); else Iter->second.push_back(ID); - continue; + return; } case LookupVisibility::GenerallyVisibile: // Generally visible decls go into the general lookup table. @@ -4440,6 +4440,24 @@ class ASTDeclContextNameLookupTrait } DeclIDs.push_back(ID); + }; + for (NamedDecl *D : Decls) { + if (ASTReader *Chain = Writer.getChain(); + Chain && isa<NamespaceDecl>(D) && D->isFromASTFile() && + D == Chain->getKeyDeclaration(D)) { + // In ASTReader, we stored only the key declaration of a namespace decl + // for this TU rather than storing all of the key declarations from each + // imported module. If we have an external namespace decl, this is that + // key declaration and we need to re-expand it to write out all of the + // key declarations from each imported module again. + // + // See comment 'ASTReader::FindExternalVisibleDeclsByName' for details. + Chain->forEachImportedKeyDecl(D, [&AddDecl](const Decl *D) { + AddDecl(cast<NamedDecl>(const_cast<Decl *>(D))); + }); + } else { + AddDecl(D); + } } return std::make_pair(Start, DeclIDs.size()); } diff --git a/clang/unittests/Serialization/CMakeLists.txt b/clang/unittests/Serialization/CMakeLists.txt index 6782e6b4d7330..a5cc1ed83af49 100644 --- a/clang/unittests/Serialization/CMakeLists.txt +++ b/clang/unittests/Serialization/CMakeLists.txt @@ -2,6 +2,7 @@ add_clang_unittest(SerializationTests ForceCheckFileInputTest.cpp InMemoryModuleCacheTest.cpp ModuleCacheTest.cpp + NamespaceLookupTest.cpp NoCommentsTest.cpp PreambleInNamedModulesTest.cpp LoadSpecLazilyTest.cpp diff --git a/clang/unittests/Serialization/NamespaceLookupTest.cpp b/clang/unittests/Serialization/NamespaceLookupTest.cpp new file mode 100644 index 0000000000000..eefa4be9fbee5 --- /dev/null +++ b/clang/unittests/Serialization/NamespaceLookupTest.cpp @@ -0,0 +1,247 @@ +//== unittests/Serialization/NamespaceLookupOptimizationTest.cpp =======// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "clang/Driver/CreateInvocationFromArgs.h" +#include "clang/Frontend/CompilerInstance.h" +#include "clang/Frontend/FrontendAction.h" +#include "clang/Frontend/FrontendActions.h" +#include "clang/Parse/ParseAST.h" +#include "clang/Serialization/ASTReader.h" +#include "clang/Tooling/Tooling.h" +#include "gtest/gtest.h" + +using namespace llvm; +using namespace clang; +using namespace clang::tooling; + +namespace { + +class NamespaceLookupTest : public ::testing::Test { + void SetUp() override { + ASSERT_FALSE( + sys::fs::createUniqueDirectory("namespace-lookup-test", TestDir)); + } + + void TearDown() override { sys::fs::remove_directories(TestDir); } + +public: + SmallString<256> TestDir; + + void addFile(StringRef Path, StringRef Contents) { + ASSERT_FALSE(sys::path::is_absolute(Path)); + + SmallString<256> AbsPath(TestDir); + sys::path::append(AbsPath, Path); + + ASSERT_FALSE( + sys::fs::create_directories(llvm::sys::path::parent_path(AbsPath))); + + std::error_code EC; + llvm::raw_fd_ostream OS(AbsPath, EC); + ASSERT_FALSE(EC); + OS << Contents; + } + + std::string GenerateModuleInterface(StringRef ModuleName, + StringRef Contents) { + std::string FileName = llvm::Twine(ModuleName + ".cppm").str(); + addFile(FileName, Contents); + + IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS = + llvm::vfs::createPhysicalFileSystem(); + DiagnosticOptions DiagOpts; + IntrusiveRefCntPtr<DiagnosticsEngine> Diags = + CompilerInstance::createDiagnostics(*VFS, DiagOpts); + CreateInvocationOptions CIOpts; + CIOpts.Diags = Diags; + CIOpts.VFS = VFS; + + std::string CacheBMIPath = + llvm::Twine(TestDir + "/" + ModuleName + ".pcm").str(); + std::string PrebuiltModulePath = + "-fprebuilt-module-path=" + TestDir.str().str(); + const char *Args[] = {"clang++", + "-std=c++20", + "--precompile", + PrebuiltModulePath.c_str(), + "-working-directory", + TestDir.c_str(), + "-I", + TestDir.c_str(), + FileName.c_str(), + "-o", + CacheBMIPath.c_str()}; + std::shared_ptr<CompilerInvocation> Invocation = + createInvocation(Args, CIOpts); + EXPECT_TRUE(Invocation); + + CompilerInstance Instance(std::move(Invocation)); + Instance.setDiagnostics(Diags); + Instance.getFrontendOpts().OutputFile = CacheBMIPath; + // Avoid memory leaks. + Instance.getFrontendOpts().DisableFree = false; + GenerateModuleInterfaceAction Action; + EXPECT_TRUE(Instance.ExecuteAction(Action)); + EXPECT_FALSE(Diags->hasErrorOccurred()); + + return CacheBMIPath; + } +}; + +struct NamespaceLookupResult { + int NumLocalNamespaces = 0; + int NumExternalNamespaces = 0; +}; + +class NamespaceLookupConsumer : public ASTConsumer { + NamespaceLookupResult &Result; + +public: + explicit NamespaceLookupConsumer(NamespaceLookupResult &Result) + : Result(Result) {} + + void HandleTranslationUnit(ASTContext &Context) override { + TranslationUnitDecl *TU = Context.getTranslationUnitDecl(); + ASSERT_TRUE(TU); + ASTReader *Chain = dyn_cast_or_null<ASTReader>(Context.getExternalSource()); + ASSERT_TRUE(Chain); + for (const Decl *D : + TU->lookup(DeclarationName(&Context.Idents.get("N")))) { + if (!isa<NamespaceDecl>(D)) + continue; + if (!D->isFromASTFile()) { + ++Result.NumLocalNamespaces; + } else { + ++Result.NumExternalNamespaces; + EXPECT_EQ(D, Chain->getKeyDeclaration(D)); + } + } + } +}; + +class NamespaceLookupAction : public ASTFrontendAction { + NamespaceLookupResult &Result; + +public: + explicit NamespaceLookupAction(NamespaceLookupResult &Result) + : Result(Result) {} + + std::unique_ptr<ASTConsumer> + CreateASTConsumer(CompilerInstance &CI, StringRef /*Unused*/) override { + return std::make_unique<NamespaceLookupConsumer>(Result); + } +}; + +TEST_F(NamespaceLookupTest, ExternalNamespacesOnly) { + GenerateModuleInterface("M1", R"cpp( +export module M1; +namespace N {} + )cpp"); + GenerateModuleInterface("M2", R"cpp( +export module M2; +namespace N {} + )cpp"); + GenerateModuleInterface("M3", R"cpp( +export module M3; +namespace N {} + )cpp"); + const char *test_file_contents = R"cpp( +import M1; +import M2; +import M3; + )cpp"; + std::string DepArg = "-fprebuilt-module-path=" + TestDir.str().str(); + NamespaceLookupResult Result; + EXPECT_TRUE(runToolOnCodeWithArgs( + std::make_unique<NamespaceLookupAction>(Result), test_file_contents, + { + "-std=c++20", + DepArg.c_str(), + "-I", + TestDir.c_str(), + }, + "main.cpp")); + + EXPECT_EQ(0, Result.NumLocalNamespaces); + EXPECT_EQ(1, Result.NumExternalNamespaces); +} + +TEST_F(NamespaceLookupTest, ExternalReplacedByLocal) { + GenerateModuleInterface("M1", R"cpp( +export module M1; +namespace N {} + )cpp"); + GenerateModuleInterface("M2", R"cpp( +export module M2; +namespace N {} + )cpp"); + GenerateModuleInterface("M3", R"cpp( +export module M3; +namespace N {} + )cpp"); + const char *test_file_contents = R"cpp( +import M1; +import M2; +import M3; + +namespace N {} + )cpp"; + std::string DepArg = "-fprebuilt-module-path=" + TestDir.str().str(); + NamespaceLookupResult Result; + EXPECT_TRUE(runToolOnCodeWithArgs( + std::make_unique<NamespaceLookupAction>(Result), test_file_contents, + { + "-std=c++20", + DepArg.c_str(), + "-I", + TestDir.c_str(), + }, + "main.cpp")); + + EXPECT_EQ(1, Result.NumLocalNamespaces); + EXPECT_EQ(0, Result.NumExternalNamespaces); +} + +TEST_F(NamespaceLookupTest, LocalAndExternalInterleaved) { + GenerateModuleInterface("M1", R"cpp( +export module M1; +namespace N {} + )cpp"); + GenerateModuleInterface("M2", R"cpp( +export module M2; +namespace N {} + )cpp"); + GenerateModuleInterface("M3", R"cpp( +export module M3; +namespace N {} + )cpp"); + const char *test_file_contents = R"cpp( +import M1; + +namespace N {} + +import M2; +import M3; + )cpp"; + std::string DepArg = "-fprebuilt-module-path=" + TestDir.str().str(); + NamespaceLookupResult Result; + EXPECT_TRUE(runToolOnCodeWithArgs( + std::make_unique<NamespaceLookupAction>(Result), test_file_contents, + { + "-std=c++20", + DepArg.c_str(), + "-I", + TestDir.c_str(), + }, + "main.cpp")); + + EXPECT_EQ(1, Result.NumLocalNamespaces); + EXPECT_EQ(1, Result.NumExternalNamespaces); +} + +} // namespace _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
