On Mon, Mar 23, 2015 at 7:36 PM, Ben Langmuir <[email protected]> wrote:
> > On Mar 23, 2015, at 5:06 PM, Richard Smith <[email protected]> wrote: > > On Mon, Mar 23, 2015 at 5:01 PM, Ben Langmuir <[email protected]> wrote: > >> >> On Mar 23, 2015, at 4:18 PM, Richard Smith <[email protected]> wrote: >> >> On Mon, Mar 23, 2015 at 3:59 PM, Ben Langmuir <[email protected]> >> wrote: >> >>> >>> On Mar 23, 2015, at 3:43 PM, Richard Smith <[email protected]> >>> wrote: >>> >>> On Mon, Mar 23, 2015 at 2:41 PM, Ben Langmuir <[email protected]> >>> wrote: >>> >>>> >>>> On Mar 23, 2015, at 1:05 PM, Richard Smith <[email protected]> >>>> wrote: >>>> >>>> On Sat, Mar 14, 2015 at 9:25 AM, Ben Langmuir <[email protected]> >>>> wrote: >>>> >>>>> Hi Richard, >>>>> >>>>> This commit looks like it caused a regression in PCH-loading >>>>> performance when the PCH imports modules. It doesn’t affect performance >>>>> of >>>>> loading modules on their own, or of a PCH that doesn’t use modules. If >>>>> you >>>>> have access to a Darwin system, it’s easy to reproduce with: >>>>> >>>>> $ cat t.h >>>>> @import Cocoa; >>>>> >>>>> $ cat t.m >>>>> // empty >>>>> >>>>> $ clang -fmodules -fmodules-cache-path=mcp -x objective-c-header t.h >>>>> -o t.pch >>>>> $ time clang -fmodules -fmodules-cache-path=mcp -Xclang -include-pch >>>>> -Xclang t.pch t.m -fsyntax-only >>>>> >>>>> I’m seeing ~8x difference between r228233 and r228234. In real world >>>>> code (where the .m file isn’t empty) we’ve seen up to a ~90% difference. >>>>> >>>>> Do you know what’s going on here? >>>> >>>> >>>> Hmm. I suspect this change causes the lookup table for the translation >>>> unit to be built, which in turn pulls in all the lexical contents of the >>>> translation unit (because outside of C++ we don't serialize a DeclContext >>>> lookup table for the TU, nor even build one usually). >>>> >>>> When a PCH loads I think we do it without Sema, but (at least >>>>> previously) queued up the interesting Decls we saw for when Sema was added >>>>> later. So I’m surprised this had such a big effect. >>>> >>>> >>>> If we were loading PCHs without Sema, then I think before this change >>>> we would not have merged declarations of the same entity from two >>>> independent modules imported into the same PCH. >>>> >>>> >>>> Wouldn’t that happen in InitializeSema() when we >>>> called pushExternalDeclIntoScope() on the contents of PreloadedDeclIDs? >>>> >>> >>> No; we're talking about the case where a Decl is deserialized before we >>> have a Sema instance. If we never deserialized the Decl from its DeclID >>> before we had a Sema (which I think is what you're suggesting) we'd never >>> have reached the modified code. >>> >>> >>> If we want to support loading Decls without Sema, we need to be able to >>> merge them without Sema. If we don't, then we should find whichever >>> codepath is leading us to do so and fix it. Being able to load a serialized >>> AST without a Sema seems useful to me, in the absence of a good reason why >>> it shouldn't work. >>> >>> >>> I thought PCH intentionally wouldn’t deserialize decls until after we >>> initialized Sema. Maybe we have a bug (which wouldn’t surprise me), or >>> maybe that wasn’t the intent (which would scare me a bit since then we have >>> some decls showing up immediately and some waiting for sema with no clear >>> split between them). >>> >>> Perhaps we should use a separate side-table for merging in this case >>>> outside C++, rather than falling back to DeclContext lookup? >>>> >>>> >>>> I don’t know this code path well enough to say. What would go into the >>>> table (or perhaps more importantly what wouldn’t)? >>>> >>> >>> Any deserialized declarations that should be visible by name lookup in >>> the TU would be in the table. The only difference would be that we'd be >>> more careful to ensure that we never trigger the import of all lexical >>> declarations in the TU in order to build the table. >>> >>> Are you still seeing this performance problem with Clang trunk? Towards >>> the end of last week I fixed a bug where we would unnecessarily pull in all >>> lexical declarations within the TU when doing certain lookups, which might >>> have fixed this slowdown. (If not, it'd be useful to find out whether >>> that's the problem, and if so, what is causing us to load these >>> declarations.) >>> >>> >>> Cool, I’ll give ToT a shot. I last tried with r232229. >>> >> >> r232928 onwards, or a revision in [232793, 232905), have the change I'm >> referring to. >> >> >> ToT got faster, but not even close to pre-228234. >> >> r228233: 0.0574 s <== “before” >> r228234: 0.4433 s <== “after" >> r232229: 0.4028 s >> r233028: 0.3097 s <== ToT >> > > Where's ToT spending its time? > > > Suspiciously it’s coming from LoadLexicalDeclsFromExternalStorage inside > **noload_lookup**… > Aha, thanks! Should be fixed in r233046, let me know if there are still problems. > are we setting hasExternalVisibleStorage incorrectly? > > Running Time Self Symbol Name > 341.0ms 97.1% 0.0 cc1_main(llvm::ArrayRef<char const*>, char const*, > void*) > 340.0ms 96.8% 0.0 > clang::ExecuteCompilerInvocation(clang::CompilerInstance*) > 340.0ms 96.8% 0.0 > clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) > 336.0ms 95.7% 0.0 > clang::FrontendAction::BeginSourceFile(clang::CompilerInstance&, > clang::FrontendInputFile const&) > 335.0ms 95.4% 0.0 > clang::CompilerInstance::createPCHExternalASTSource(llvm::StringRef, bool, > bool, void*, bool) > 335.0ms 95.4% 0.0 > clang::CompilerInstance::createPCHExternalASTSource(llvm::StringRef, > std::__1::basic_string<char, std::__1::char_traits<char>, > std::__1::allocator<char> > const&, bool, bool, clang::Preprocessor&, > clang::ASTContext&, void*, bool, bool, bool) > 335.0ms 95.4% 1.0 > clang::ASTReader::ReadAST(std::__1::basic_string<char, > std::__1::char_traits<char>, std::__1::allocator<char> > const&, > clang::serialization::ModuleKind, clang::SourceLocation, unsigned int) > 248.0ms 70.6% 0.0 clang::ASTReader::InitializeContext() > 247.0ms 70.3% 0.0 clang::ASTReader::GetType(unsigned int) > 247.0ms 70.3% 0.0 clang::ASTReader::readTypeRecord(unsigned > int) > 247.0ms 70.3% 0.0 > clang::ASTReader::ReadDeclRecord(unsigned int) > 247.0ms 70.3% 0.0 > clang::ASTDeclReader::Visit(clang::Decl*) > 247.0ms 70.3% 0.0 > clang::declvisitor::Base<clang::declvisitor::make_ptr, > clang::ASTDeclReader, void>::Visit(clang::Decl*) > 247.0ms 70.3% 0.0 > clang::ASTDeclReader::VisitTypedefNameDecl(clang::TypedefNameDecl*) > 246.0ms 70.0% 0.0 > clang::ASTReader::GetTypeSourceInfo(clang::serialization::ModuleFile&, > llvm::SmallVector<unsigned long long, 64u> const&, unsigned int&) > 246.0ms 70.0% 0.0 clang::ASTReader::GetType(unsigned > int) > 246.0ms 70.0% 0.0 > clang::ASTReader::readTypeRecord(unsigned int) > 246.0ms 70.0% 0.0 > clang::ASTReader::GetType(unsigned int) > 246.0ms 70.0% 0.0 > clang::ASTReader::readTypeRecord(unsigned int) > 246.0ms 70.0% 0.0 > clang::ASTReader::ReadDeclRecord(unsigned int) > 246.0ms 70.0% 0.0 > clang::ASTDeclReader::Visit(clang::Decl*) > 246.0ms 70.0% 0.0 > clang::declvisitor::Base<clang::declvisitor::make_ptr, > clang::ASTDeclReader, void>::Visit(clang::Decl*) > 246.0ms 70.0% 0.0 > clang::ASTDeclReader::VisitTagDecl(clang::TagDecl*) > 246.0ms 70.0% 0.0 void > clang::ASTDeclReader::mergeRedeclarable<clang::TagDecl>(clang::Redeclarable<clang::TagDecl>*, > clang::ASTDeclReader::RedeclarableResult&, unsigned int) > 246.0ms 70.0% 0.0 > clang::ASTDeclReader::findExisting(clang::NamedDecl*) > 246.0ms 70.0% 0.0 > clang::DeclContext::noload_lookup(clang::DeclarationName) > 246.0ms 70.0% 0.0 > clang::DeclContext::lookup(clang::DeclarationName) const > 246.0ms 70.0% 0.0 > clang::DeclContext::buildLookup() > 242.0ms 68.9% 1.0 > clang::DeclContext::LoadLexicalDeclsFromExternalStorage() const > > > > > >> Ben >> >> Ben >>>>> >>>>> > On Feb 4, 2015, at 3:37 PM, Richard Smith < >>>>> [email protected]> wrote: >>>>> > >>>>> > Author: rsmith >>>>> > Date: Wed Feb 4 17:37:59 2015 >>>>> > New Revision: 228234 >>>>> > >>>>> > URL: http://llvm.org/viewvc/llvm-project?rev=228234&view=rev >>>>> > Log: >>>>> > [modules] When using -E, we may try to merge decls despite having no >>>>> Sema >>>>> > object. In such a case, use the TU's DC for merging global decls >>>>> rather than >>>>> > giving up when we find there is no TU scope. >>>>> > >>>>> > Ultimately, we should probably avoid all loading of decls when >>>>> preprocessing, >>>>> > but there are other reasonable use cases for loading an AST file >>>>> with no Sema >>>>> > object for which this is the right thing. >>>>> > >>>>> > Added: >>>>> > cfe/trunk/test/Modules/Inputs/preprocess/ >>>>> > cfe/trunk/test/Modules/Inputs/preprocess/file.h >>>>> > cfe/trunk/test/Modules/Inputs/preprocess/fwd.h >>>>> > cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap >>>>> > Modified: >>>>> > cfe/trunk/include/clang/Frontend/CompilerInstance.h >>>>> > cfe/trunk/lib/Frontend/CompilerInstance.cpp >>>>> > cfe/trunk/lib/Serialization/ASTReaderDecl.cpp >>>>> > cfe/trunk/test/Modules/preprocess.m >>>>> > >>>>> > Modified: cfe/trunk/include/clang/Frontend/CompilerInstance.h >>>>> > URL: >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Frontend/CompilerInstance.h?rev=228234&r1=228233&r2=228234&view=diff >>>>> > >>>>> ============================================================================== >>>>> > --- cfe/trunk/include/clang/Frontend/CompilerInstance.h (original) >>>>> > +++ cfe/trunk/include/clang/Frontend/CompilerInstance.h Wed Feb 4 >>>>> 17:37:59 2015 >>>>> > @@ -588,7 +588,7 @@ public: >>>>> > /// Create an external AST source to read a PCH file. >>>>> > /// >>>>> > /// \return - The new object on success, or null on failure. >>>>> > - static ExternalASTSource *createPCHExternalASTSource( >>>>> > + static IntrusiveRefCntPtr<ASTReader> createPCHExternalASTSource( >>>>> > StringRef Path, const std::string &Sysroot, bool >>>>> DisablePCHValidation, >>>>> > bool AllowPCHWithCompilerErrors, Preprocessor &PP, ASTContext >>>>> &Context, >>>>> > void *DeserializationListener, bool OwnDeserializationListener, >>>>> > >>>>> > Modified: cfe/trunk/lib/Frontend/CompilerInstance.cpp >>>>> > URL: >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInstance.cpp?rev=228234&r1=228233&r2=228234&view=diff >>>>> > >>>>> ============================================================================== >>>>> > --- cfe/trunk/lib/Frontend/CompilerInstance.cpp (original) >>>>> > +++ cfe/trunk/lib/Frontend/CompilerInstance.cpp Wed Feb 4 17:37:59 >>>>> 2015 >>>>> > @@ -388,32 +388,30 @@ void CompilerInstance::createASTContext( >>>>> > void CompilerInstance::createPCHExternalASTSource( >>>>> > StringRef Path, bool DisablePCHValidation, bool >>>>> AllowPCHWithCompilerErrors, >>>>> > void *DeserializationListener, bool OwnDeserializationListener) { >>>>> > - IntrusiveRefCntPtr<ExternalASTSource> Source; >>>>> > bool Preamble = >>>>> getPreprocessorOpts().PrecompiledPreambleBytes.first != 0; >>>>> > - Source = createPCHExternalASTSource( >>>>> > + ModuleManager = createPCHExternalASTSource( >>>>> > Path, getHeaderSearchOpts().Sysroot, DisablePCHValidation, >>>>> > AllowPCHWithCompilerErrors, getPreprocessor(), getASTContext(), >>>>> > DeserializationListener, OwnDeserializationListener, Preamble, >>>>> > getFrontendOpts().UseGlobalModuleIndex); >>>>> > - ModuleManager = static_cast<ASTReader*>(Source.get()); >>>>> > - getASTContext().setExternalSource(Source); >>>>> > } >>>>> > >>>>> > -ExternalASTSource *CompilerInstance::createPCHExternalASTSource( >>>>> > +IntrusiveRefCntPtr<ASTReader> >>>>> CompilerInstance::createPCHExternalASTSource( >>>>> > StringRef Path, const std::string &Sysroot, bool >>>>> DisablePCHValidation, >>>>> > bool AllowPCHWithCompilerErrors, Preprocessor &PP, ASTContext >>>>> &Context, >>>>> > void *DeserializationListener, bool OwnDeserializationListener, >>>>> > bool Preamble, bool UseGlobalModuleIndex) { >>>>> > HeaderSearchOptions &HSOpts = >>>>> PP.getHeaderSearchInfo().getHeaderSearchOpts(); >>>>> > >>>>> > - std::unique_ptr<ASTReader> Reader; >>>>> > - Reader.reset(new ASTReader(PP, Context, >>>>> > - Sysroot.empty() ? "" : Sysroot.c_str(), >>>>> > - DisablePCHValidation, >>>>> > - AllowPCHWithCompilerErrors, >>>>> > - /*AllowConfigurationMismatch*/false, >>>>> > - HSOpts.ModulesValidateSystemHeaders, >>>>> > - UseGlobalModuleIndex)); >>>>> > + IntrusiveRefCntPtr<ASTReader> Reader( >>>>> > + new ASTReader(PP, Context, Sysroot.empty() ? "" : >>>>> Sysroot.c_str(), >>>>> > + DisablePCHValidation, >>>>> AllowPCHWithCompilerErrors, >>>>> > + /*AllowConfigurationMismatch*/ false, >>>>> > + HSOpts.ModulesValidateSystemHeaders, >>>>> UseGlobalModuleIndex)); >>>>> > + >>>>> > + // We need the external source to be set up before we read the >>>>> AST, because >>>>> > + // eagerly-deserialized declarations may use it. >>>>> > + Context.setExternalSource(Reader.get()); >>>>> > >>>>> > Reader->setDeserializationListener( >>>>> > static_cast<ASTDeserializationListener >>>>> *>(DeserializationListener), >>>>> > @@ -427,7 +425,7 @@ ExternalASTSource *CompilerInstance::cre >>>>> > // Set the predefines buffer as suggested by the PCH reader. >>>>> Typically, the >>>>> > // predefines buffer will be empty. >>>>> > PP.setPredefines(Reader->getSuggestedPredefines()); >>>>> > - return Reader.release(); >>>>> > + return Reader; >>>>> > >>>>> > case ASTReader::Failure: >>>>> > // Unrecoverable failure: don't even try to process the input >>>>> file. >>>>> > @@ -442,6 +440,7 @@ ExternalASTSource *CompilerInstance::cre >>>>> > break; >>>>> > } >>>>> > >>>>> > + Context.setExternalSource(nullptr); >>>>> > return nullptr; >>>>> > } >>>>> > >>>>> > >>>>> > Modified: cfe/trunk/lib/Serialization/ASTReaderDecl.cpp >>>>> > URL: >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Serialization/ASTReaderDecl.cpp?rev=228234&r1=228233&r2=228234&view=diff >>>>> > >>>>> ============================================================================== >>>>> > --- cfe/trunk/lib/Serialization/ASTReaderDecl.cpp (original) >>>>> > +++ cfe/trunk/lib/Serialization/ASTReaderDecl.cpp Wed Feb 4 >>>>> 17:37:59 2015 >>>>> > @@ -2591,6 +2591,11 @@ DeclContext *ASTDeclReader::getPrimaryCo >>>>> > return ED->getASTContext().getLangOpts().CPlusPlus? >>>>> ED->getDefinition() >>>>> > : nullptr; >>>>> > >>>>> > + // We can see the TU here only if we have no Sema object. In that >>>>> case, >>>>> > + // there's no TU scope to look in, so using the DC alone is >>>>> sufficient. >>>>> > + if (auto *TU = dyn_cast<TranslationUnitDecl>(DC)) >>>>> > + return TU; >>>>> > + >>>>> > return nullptr; >>>>> > } >>>>> > >>>>> > >>>>> > Added: cfe/trunk/test/Modules/Inputs/preprocess/file.h >>>>> > URL: >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Modules/Inputs/preprocess/file.h?rev=228234&view=auto >>>>> > >>>>> ============================================================================== >>>>> > --- cfe/trunk/test/Modules/Inputs/preprocess/file.h (added) >>>>> > +++ cfe/trunk/test/Modules/Inputs/preprocess/file.h Wed Feb 4 >>>>> 17:37:59 2015 >>>>> > @@ -0,0 +1,3 @@ >>>>> > +struct __FILE; >>>>> > +#include "fwd.h" >>>>> > +typedef struct __FILE FILE; >>>>> > >>>>> > Added: cfe/trunk/test/Modules/Inputs/preprocess/fwd.h >>>>> > URL: >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Modules/Inputs/preprocess/fwd.h?rev=228234&view=auto >>>>> > >>>>> ============================================================================== >>>>> > --- cfe/trunk/test/Modules/Inputs/preprocess/fwd.h (added) >>>>> > +++ cfe/trunk/test/Modules/Inputs/preprocess/fwd.h Wed Feb 4 >>>>> 17:37:59 2015 >>>>> > @@ -0,0 +1 @@ >>>>> > +struct __FILE; >>>>> > >>>>> > Added: cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap >>>>> > URL: >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap?rev=228234&view=auto >>>>> > >>>>> ============================================================================== >>>>> > --- cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap (added) >>>>> > +++ cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap Wed >>>>> Feb 4 17:37:59 2015 >>>>> > @@ -0,0 +1,2 @@ >>>>> > +module fwd { header "fwd.h" export * } >>>>> > +module file { header "file.h" export * } >>>>> > >>>>> > Modified: cfe/trunk/test/Modules/preprocess.m >>>>> > URL: >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Modules/preprocess.m?rev=228234&r1=228233&r2=228234&view=diff >>>>> > >>>>> ============================================================================== >>>>> > --- cfe/trunk/test/Modules/preprocess.m (original) >>>>> > +++ cfe/trunk/test/Modules/preprocess.m Wed Feb 4 17:37:59 2015 >>>>> > @@ -1,9 +1,14 @@ >>>>> > // RUN: rm -rf %t >>>>> > -// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -include %S/Inputs/preprocess-prefix.h -E %s | FileCheck >>>>> -strict-whitespace >>>>> %s >>>>> > -// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -x objective-c-header -emit-pch %S/Inputs/preprocess-prefix.h -o %t.pch >>>>> > -// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -include-pch %t.pch -E %s | FileCheck -strict-whitespace %s >>>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -I %S/Inputs/preprocess -include %S/Inputs/preprocess-prefix.h -E %s | >>>>> FileCheck -strict-whitespace %s >>>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -I %S/Inputs/preprocess -x objective-c-header -emit-pch >>>>> %S/Inputs/preprocess-prefix.h -o %t.pch >>>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -I %S/Inputs/preprocess -include-pch %t.pch -E %s | FileCheck >>>>> -strict-whitespace %s >>>>> > +// >>>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -I %S/Inputs/preprocess -x objective-c++ -include >>>>> %S/Inputs/preprocess-prefix.h -E %s | FileCheck -strict-whitespace %s >>>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -I %S/Inputs/preprocess -x objective-c++-header -emit-pch >>>>> %S/Inputs/preprocess-prefix.h -o %t.pch >>>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>>> -I %S/Inputs/preprocess -x objective-c++ -include-pch %t.pch -E %s | >>>>> FileCheck -strict-whitespace %s >>>>> > #import "diamond_right.h" >>>>> > #import "diamond_right.h" // to check that imports get their own line >>>>> > +#include "file.h" >>>>> > void test() { >>>>> > top_left_before(); >>>>> > left_and_right(); >>>>> > @@ -15,6 +20,7 @@ void test() { >>>>> > >>>>> > // CHECK: @import diamond_right; /* clang -E: implicit import for >>>>> "{{.*}}diamond_right.h" */{{$}} >>>>> > // CHECK: @import diamond_right; /* clang -E: implicit import for >>>>> "{{.*}}diamond_right.h" */{{$}} >>>>> > +// CHECK: @import file; /* clang -E: implicit import for >>>>> "{{.*}}file.h" */{{$}} >>>>> > // CHECK-NEXT: void test() {{{$}} >>>>> > // CHECK-NEXT: top_left_before();{{$}} >>>>> > // CHECK-NEXT: left_and_right();{{$}} >>>>> > >>>>> > >>>>> > _______________________________________________ >>>>> > cfe-commits mailing list >>>>> > [email protected] >>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >>>>> >>>> >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
