On Mon, Mar 23, 2015 at 5:01 PM, Ben Langmuir <[email protected]> wrote:
> > On Mar 23, 2015, at 4:18 PM, Richard Smith <[email protected]> wrote: > > On Mon, Mar 23, 2015 at 3:59 PM, Ben Langmuir <[email protected]> wrote: > >> >> On Mar 23, 2015, at 3:43 PM, Richard Smith <[email protected]> wrote: >> >> On Mon, Mar 23, 2015 at 2:41 PM, Ben Langmuir <[email protected]> >> wrote: >> >>> >>> On Mar 23, 2015, at 1:05 PM, Richard Smith <[email protected]> >>> wrote: >>> >>> On Sat, Mar 14, 2015 at 9:25 AM, Ben Langmuir <[email protected]> >>> wrote: >>> >>>> Hi Richard, >>>> >>>> This commit looks like it caused a regression in PCH-loading >>>> performance when the PCH imports modules. It doesn’t affect performance of >>>> loading modules on their own, or of a PCH that doesn’t use modules. If you >>>> have access to a Darwin system, it’s easy to reproduce with: >>>> >>>> $ cat t.h >>>> @import Cocoa; >>>> >>>> $ cat t.m >>>> // empty >>>> >>>> $ clang -fmodules -fmodules-cache-path=mcp -x objective-c-header t.h -o >>>> t.pch >>>> $ time clang -fmodules -fmodules-cache-path=mcp -Xclang -include-pch >>>> -Xclang t.pch t.m -fsyntax-only >>>> >>>> I’m seeing ~8x difference between r228233 and r228234. In real world >>>> code (where the .m file isn’t empty) we’ve seen up to a ~90% difference. >>>> >>>> Do you know what’s going on here? >>> >>> >>> Hmm. I suspect this change causes the lookup table for the translation >>> unit to be built, which in turn pulls in all the lexical contents of the >>> translation unit (because outside of C++ we don't serialize a DeclContext >>> lookup table for the TU, nor even build one usually). >>> >>> When a PCH loads I think we do it without Sema, but (at least >>>> previously) queued up the interesting Decls we saw for when Sema was added >>>> later. So I’m surprised this had such a big effect. >>> >>> >>> If we were loading PCHs without Sema, then I think before this change we >>> would not have merged declarations of the same entity from two independent >>> modules imported into the same PCH. >>> >>> >>> Wouldn’t that happen in InitializeSema() when we >>> called pushExternalDeclIntoScope() on the contents of PreloadedDeclIDs? >>> >> >> No; we're talking about the case where a Decl is deserialized before we >> have a Sema instance. If we never deserialized the Decl from its DeclID >> before we had a Sema (which I think is what you're suggesting) we'd never >> have reached the modified code. >> >> >> If we want to support loading Decls without Sema, we need to be able to >> merge them without Sema. If we don't, then we should find whichever >> codepath is leading us to do so and fix it. Being able to load a serialized >> AST without a Sema seems useful to me, in the absence of a good reason why >> it shouldn't work. >> >> >> I thought PCH intentionally wouldn’t deserialize decls until after we >> initialized Sema. Maybe we have a bug (which wouldn’t surprise me), or >> maybe that wasn’t the intent (which would scare me a bit since then we have >> some decls showing up immediately and some waiting for sema with no clear >> split between them). >> >> Perhaps we should use a separate side-table for merging in this case >>> outside C++, rather than falling back to DeclContext lookup? >>> >>> >>> I don’t know this code path well enough to say. What would go into the >>> table (or perhaps more importantly what wouldn’t)? >>> >> >> Any deserialized declarations that should be visible by name lookup in >> the TU would be in the table. The only difference would be that we'd be >> more careful to ensure that we never trigger the import of all lexical >> declarations in the TU in order to build the table. >> >> Are you still seeing this performance problem with Clang trunk? Towards >> the end of last week I fixed a bug where we would unnecessarily pull in all >> lexical declarations within the TU when doing certain lookups, which might >> have fixed this slowdown. (If not, it'd be useful to find out whether >> that's the problem, and if so, what is causing us to load these >> declarations.) >> >> >> Cool, I’ll give ToT a shot. I last tried with r232229. >> > > r232928 onwards, or a revision in [232793, 232905), have the change I'm > referring to. > > > ToT got faster, but not even close to pre-228234. > > r228233: 0.0574 s <== “before” > r228234: 0.4433 s <== “after" > r232229: 0.4028 s > r233028: 0.3097 s <== ToT > Where's ToT spending its time? > Ben > > Ben >>>> >>>> > On Feb 4, 2015, at 3:37 PM, Richard Smith <[email protected]> >>>> wrote: >>>> > >>>> > Author: rsmith >>>> > Date: Wed Feb 4 17:37:59 2015 >>>> > New Revision: 228234 >>>> > >>>> > URL: http://llvm.org/viewvc/llvm-project?rev=228234&view=rev >>>> > Log: >>>> > [modules] When using -E, we may try to merge decls despite having no >>>> Sema >>>> > object. In such a case, use the TU's DC for merging global decls >>>> rather than >>>> > giving up when we find there is no TU scope. >>>> > >>>> > Ultimately, we should probably avoid all loading of decls when >>>> preprocessing, >>>> > but there are other reasonable use cases for loading an AST file with >>>> no Sema >>>> > object for which this is the right thing. >>>> > >>>> > Added: >>>> > cfe/trunk/test/Modules/Inputs/preprocess/ >>>> > cfe/trunk/test/Modules/Inputs/preprocess/file.h >>>> > cfe/trunk/test/Modules/Inputs/preprocess/fwd.h >>>> > cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap >>>> > Modified: >>>> > cfe/trunk/include/clang/Frontend/CompilerInstance.h >>>> > cfe/trunk/lib/Frontend/CompilerInstance.cpp >>>> > cfe/trunk/lib/Serialization/ASTReaderDecl.cpp >>>> > cfe/trunk/test/Modules/preprocess.m >>>> > >>>> > Modified: cfe/trunk/include/clang/Frontend/CompilerInstance.h >>>> > URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Frontend/CompilerInstance.h?rev=228234&r1=228233&r2=228234&view=diff >>>> > >>>> ============================================================================== >>>> > --- cfe/trunk/include/clang/Frontend/CompilerInstance.h (original) >>>> > +++ cfe/trunk/include/clang/Frontend/CompilerInstance.h Wed Feb 4 >>>> 17:37:59 2015 >>>> > @@ -588,7 +588,7 @@ public: >>>> > /// Create an external AST source to read a PCH file. >>>> > /// >>>> > /// \return - The new object on success, or null on failure. >>>> > - static ExternalASTSource *createPCHExternalASTSource( >>>> > + static IntrusiveRefCntPtr<ASTReader> createPCHExternalASTSource( >>>> > StringRef Path, const std::string &Sysroot, bool >>>> DisablePCHValidation, >>>> > bool AllowPCHWithCompilerErrors, Preprocessor &PP, ASTContext >>>> &Context, >>>> > void *DeserializationListener, bool OwnDeserializationListener, >>>> > >>>> > Modified: cfe/trunk/lib/Frontend/CompilerInstance.cpp >>>> > URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInstance.cpp?rev=228234&r1=228233&r2=228234&view=diff >>>> > >>>> ============================================================================== >>>> > --- cfe/trunk/lib/Frontend/CompilerInstance.cpp (original) >>>> > +++ cfe/trunk/lib/Frontend/CompilerInstance.cpp Wed Feb 4 17:37:59 >>>> 2015 >>>> > @@ -388,32 +388,30 @@ void CompilerInstance::createASTContext( >>>> > void CompilerInstance::createPCHExternalASTSource( >>>> > StringRef Path, bool DisablePCHValidation, bool >>>> AllowPCHWithCompilerErrors, >>>> > void *DeserializationListener, bool OwnDeserializationListener) { >>>> > - IntrusiveRefCntPtr<ExternalASTSource> Source; >>>> > bool Preamble = >>>> getPreprocessorOpts().PrecompiledPreambleBytes.first != 0; >>>> > - Source = createPCHExternalASTSource( >>>> > + ModuleManager = createPCHExternalASTSource( >>>> > Path, getHeaderSearchOpts().Sysroot, DisablePCHValidation, >>>> > AllowPCHWithCompilerErrors, getPreprocessor(), getASTContext(), >>>> > DeserializationListener, OwnDeserializationListener, Preamble, >>>> > getFrontendOpts().UseGlobalModuleIndex); >>>> > - ModuleManager = static_cast<ASTReader*>(Source.get()); >>>> > - getASTContext().setExternalSource(Source); >>>> > } >>>> > >>>> > -ExternalASTSource *CompilerInstance::createPCHExternalASTSource( >>>> > +IntrusiveRefCntPtr<ASTReader> >>>> CompilerInstance::createPCHExternalASTSource( >>>> > StringRef Path, const std::string &Sysroot, bool >>>> DisablePCHValidation, >>>> > bool AllowPCHWithCompilerErrors, Preprocessor &PP, ASTContext >>>> &Context, >>>> > void *DeserializationListener, bool OwnDeserializationListener, >>>> > bool Preamble, bool UseGlobalModuleIndex) { >>>> > HeaderSearchOptions &HSOpts = >>>> PP.getHeaderSearchInfo().getHeaderSearchOpts(); >>>> > >>>> > - std::unique_ptr<ASTReader> Reader; >>>> > - Reader.reset(new ASTReader(PP, Context, >>>> > - Sysroot.empty() ? "" : Sysroot.c_str(), >>>> > - DisablePCHValidation, >>>> > - AllowPCHWithCompilerErrors, >>>> > - /*AllowConfigurationMismatch*/false, >>>> > - HSOpts.ModulesValidateSystemHeaders, >>>> > - UseGlobalModuleIndex)); >>>> > + IntrusiveRefCntPtr<ASTReader> Reader( >>>> > + new ASTReader(PP, Context, Sysroot.empty() ? "" : >>>> Sysroot.c_str(), >>>> > + DisablePCHValidation, AllowPCHWithCompilerErrors, >>>> > + /*AllowConfigurationMismatch*/ false, >>>> > + HSOpts.ModulesValidateSystemHeaders, >>>> UseGlobalModuleIndex)); >>>> > + >>>> > + // We need the external source to be set up before we read the >>>> AST, because >>>> > + // eagerly-deserialized declarations may use it. >>>> > + Context.setExternalSource(Reader.get()); >>>> > >>>> > Reader->setDeserializationListener( >>>> > static_cast<ASTDeserializationListener >>>> *>(DeserializationListener), >>>> > @@ -427,7 +425,7 @@ ExternalASTSource *CompilerInstance::cre >>>> > // Set the predefines buffer as suggested by the PCH reader. >>>> Typically, the >>>> > // predefines buffer will be empty. >>>> > PP.setPredefines(Reader->getSuggestedPredefines()); >>>> > - return Reader.release(); >>>> > + return Reader; >>>> > >>>> > case ASTReader::Failure: >>>> > // Unrecoverable failure: don't even try to process the input >>>> file. >>>> > @@ -442,6 +440,7 @@ ExternalASTSource *CompilerInstance::cre >>>> > break; >>>> > } >>>> > >>>> > + Context.setExternalSource(nullptr); >>>> > return nullptr; >>>> > } >>>> > >>>> > >>>> > Modified: cfe/trunk/lib/Serialization/ASTReaderDecl.cpp >>>> > URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Serialization/ASTReaderDecl.cpp?rev=228234&r1=228233&r2=228234&view=diff >>>> > >>>> ============================================================================== >>>> > --- cfe/trunk/lib/Serialization/ASTReaderDecl.cpp (original) >>>> > +++ cfe/trunk/lib/Serialization/ASTReaderDecl.cpp Wed Feb 4 17:37:59 >>>> 2015 >>>> > @@ -2591,6 +2591,11 @@ DeclContext *ASTDeclReader::getPrimaryCo >>>> > return ED->getASTContext().getLangOpts().CPlusPlus? >>>> ED->getDefinition() >>>> > : nullptr; >>>> > >>>> > + // We can see the TU here only if we have no Sema object. In that >>>> case, >>>> > + // there's no TU scope to look in, so using the DC alone is >>>> sufficient. >>>> > + if (auto *TU = dyn_cast<TranslationUnitDecl>(DC)) >>>> > + return TU; >>>> > + >>>> > return nullptr; >>>> > } >>>> > >>>> > >>>> > Added: cfe/trunk/test/Modules/Inputs/preprocess/file.h >>>> > URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Modules/Inputs/preprocess/file.h?rev=228234&view=auto >>>> > >>>> ============================================================================== >>>> > --- cfe/trunk/test/Modules/Inputs/preprocess/file.h (added) >>>> > +++ cfe/trunk/test/Modules/Inputs/preprocess/file.h Wed Feb 4 >>>> 17:37:59 2015 >>>> > @@ -0,0 +1,3 @@ >>>> > +struct __FILE; >>>> > +#include "fwd.h" >>>> > +typedef struct __FILE FILE; >>>> > >>>> > Added: cfe/trunk/test/Modules/Inputs/preprocess/fwd.h >>>> > URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Modules/Inputs/preprocess/fwd.h?rev=228234&view=auto >>>> > >>>> ============================================================================== >>>> > --- cfe/trunk/test/Modules/Inputs/preprocess/fwd.h (added) >>>> > +++ cfe/trunk/test/Modules/Inputs/preprocess/fwd.h Wed Feb 4 >>>> 17:37:59 2015 >>>> > @@ -0,0 +1 @@ >>>> > +struct __FILE; >>>> > >>>> > Added: cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap >>>> > URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap?rev=228234&view=auto >>>> > >>>> ============================================================================== >>>> > --- cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap (added) >>>> > +++ cfe/trunk/test/Modules/Inputs/preprocess/module.modulemap Wed >>>> Feb 4 17:37:59 2015 >>>> > @@ -0,0 +1,2 @@ >>>> > +module fwd { header "fwd.h" export * } >>>> > +module file { header "file.h" export * } >>>> > >>>> > Modified: cfe/trunk/test/Modules/preprocess.m >>>> > URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Modules/preprocess.m?rev=228234&r1=228233&r2=228234&view=diff >>>> > >>>> ============================================================================== >>>> > --- cfe/trunk/test/Modules/preprocess.m (original) >>>> > +++ cfe/trunk/test/Modules/preprocess.m Wed Feb 4 17:37:59 2015 >>>> > @@ -1,9 +1,14 @@ >>>> > // RUN: rm -rf %t >>>> > -// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>> -include %S/Inputs/preprocess-prefix.h -E %s | FileCheck -strict-whitespace >>>> %s >>>> > -// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs -x >>>> objective-c-header -emit-pch %S/Inputs/preprocess-prefix.h -o %t.pch >>>> > -// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs >>>> -include-pch %t.pch -E %s | FileCheck -strict-whitespace %s >>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs -I >>>> %S/Inputs/preprocess -include %S/Inputs/preprocess-prefix.h -E %s | >>>> FileCheck -strict-whitespace %s >>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs -I >>>> %S/Inputs/preprocess -x objective-c-header -emit-pch >>>> %S/Inputs/preprocess-prefix.h -o %t.pch >>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs -I >>>> %S/Inputs/preprocess -include-pch %t.pch -E %s | FileCheck >>>> -strict-whitespace %s >>>> > +// >>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs -I >>>> %S/Inputs/preprocess -x objective-c++ -include >>>> %S/Inputs/preprocess-prefix.h -E %s | FileCheck -strict-whitespace %s >>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs -I >>>> %S/Inputs/preprocess -x objective-c++-header -emit-pch >>>> %S/Inputs/preprocess-prefix.h -o %t.pch >>>> > +// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t -I %S/Inputs -I >>>> %S/Inputs/preprocess -x objective-c++ -include-pch %t.pch -E %s | FileCheck >>>> -strict-whitespace %s >>>> > #import "diamond_right.h" >>>> > #import "diamond_right.h" // to check that imports get their own line >>>> > +#include "file.h" >>>> > void test() { >>>> > top_left_before(); >>>> > left_and_right(); >>>> > @@ -15,6 +20,7 @@ void test() { >>>> > >>>> > // CHECK: @import diamond_right; /* clang -E: implicit import for >>>> "{{.*}}diamond_right.h" */{{$}} >>>> > // CHECK: @import diamond_right; /* clang -E: implicit import for >>>> "{{.*}}diamond_right.h" */{{$}} >>>> > +// CHECK: @import file; /* clang -E: implicit import for >>>> "{{.*}}file.h" */{{$}} >>>> > // CHECK-NEXT: void test() {{{$}} >>>> > // CHECK-NEXT: top_left_before();{{$}} >>>> > // CHECK-NEXT: left_and_right();{{$}} >>>> > >>>> > >>>> > _______________________________________________ >>>> > cfe-commits mailing list >>>> > [email protected] >>>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >>>> >>> >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
