[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
NoQ added a comment.

Aha, cool, so it's probably just virtual functions.

Repository:
  rC Clang

https://reviews.llvm.org/D30691

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
martong added a comment.

Just added a new entry to our roadmap: https://github.com/Ericsson/clang/issues/435
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
martong added a comment.

@NoQ, @dkrupp: CallEvent::getRuntimeDefinition is overridden in

- AnyFunctionCall
- CXXInstanceCall
- CXXMemberCall
- CXXDestructorCall
- ObjCMethodCall

AnyFunctionCall handles the CTU logic. CXXInstanceCall calls into AnyFunctionCall if the function is not virtual. If the function is virtual, we try to find its Decl via the dynamic type info. At that point we could indeed get the definition of that function via the CTU logic; this is something we should implement in the future. However, this seems to be the only such case (not considering ObjC). CXXMemberCall falls back to AnyFunctionCall or CXXInstanceCall, and CXXDestructorCall does the same.
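The dispatch described above can be modeled as a small decision function. This is a toy sketch, not the actual CallEvent API; the kind strings and return labels are invented for illustration:

```python
def runtime_definition(call_kind, is_virtual, has_local_body):
    """Toy model of which CallEvent::getRuntimeDefinition path a call takes."""
    # Virtual instance/member calls resolve through dynamic type info and
    # currently never reach the CTU lookup.
    if call_kind in ('CXXInstanceCall', 'CXXMemberCall') and is_virtual:
        return 'dynamic-type-lookup'
    # Everything else ends up in AnyFunctionCall::getRuntimeDefinition,
    # which falls back to the CTU lookup when no local body exists.
    if has_local_body:
        return 'local-definition'
    return 'ctu-lookup'
```

For example, a non-virtual method with no body in the current TU takes the `ctu-lookup` path, while a virtual call does not, which is exactly the gap discussed here.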
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
dkrupp added a comment.

> Which means that for some calls we aren't even trying to make a CTU lookup.

Thanks @NoQ, we will take a look at it!
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
NoQ added a comment.
Herald added a subscriber: mikhail.ramalho.

Just noticed: `getRuntimeDefinition()` has a lot of overrides in `CallEvent` sub-classes, and there are paths that don't defer to `AnyFunctionCall::getRuntimeDefinition()`, e.g., `CXXInstanceCall::getRuntimeDefinition()` => `if (MD->isVirtual()) ...`. Which means that for some calls we aren't even trying to make a CTU lookup.
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun abandoned this revision.
xazax.hun added a comment.

Resubmitted in https://reviews.llvm.org/rL326439

Phabricator did not display the mailing list conversation. The point is that the circular dependency does not exist in the upstream version of clang, because CMake does not track header includes as dependencies. There might be some layering violations among the header includes, but those are independent of this patch and need to be fixed separately.
Re: [PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
Resubmitted as r326439. Sorry for all the trouble. We need to hack around Analyses.def being required by Frontend, but it would be nice to remove this dependency upstream.

On Thu, Mar 1, 2018 at 3:34 PM Benjamin Kramer wrote:
> Frontend depends on StaticAnalyzerCore by including
> "clang/StaticAnalyzer/Core/Analyses.def". That's a clear layering
> violation, but cmake doesn't model header dependencies. Maybe Analyses.def
> should move into its own library.
Re: [PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
Frontend depends on StaticAnalyzerCore by including "clang/StaticAnalyzer/Core/Analyses.def". That's a clear layering violation, but cmake doesn't model header dependencies. Maybe Analyses.def should move into its own library.
Re: [PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
I am away from my workstation so I would really appreciate it if you could recommit.

Thanks in advance,
Gábor

On 1 Mar 2018 at 15:28, "Ilya Biryukov" wrote:
> You're right. We have this extra dependency in our internal build files
> for some reason and I missed that. It's totally my fault.
>
> Should I resubmit the patch that I reverted, or would you rather do it
> yourself?
Re: [PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
You're right. We have this extra dependency in our internal build files for some reason and I missed that. It's totally my fault.

Should I resubmit the patch that I reverted, or would you rather do it yourself?

--
Regards,
Ilya Biryukov
Re: [PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
On 1 Mar 2018 at 14:58, "Ilya Biryukov" wrote:
> I replied to a commit in the wrong thread
> (https://reviews.llvm.org/rL326323), sorry. Here are the important bits:
>
> This change introduced the following cyclic dependency in the build
> system: StaticAnalyzerCore -> CrossTU -> Frontend -> StaticAnalyzerCore.

I do not see the Frontend -> StaticAnalyzerCore dependency upstream. See:
https://github.com/llvm-mirror/clang/blob/master/lib/Frontend/CMakeLists.txt

Am I missing something here?

Thanks in advance,
Gábor
Re: [PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
I replied to a commit in the wrong thread (https://reviews.llvm.org/rL326323), sorry. Here are the important bits:

This change introduced the following cyclic dependency in the build system: StaticAnalyzerCore -> CrossTU -> Frontend -> StaticAnalyzerCore.

I'm sorry, but I had to revert the commit in r326432. The cyclic dependency broke our internal integrate (we use a different build system, which breaks on cyclic deps), and it's really messy to work around it since CrossTU depends on both Index and Frontend. Moving the code that uses CrossTU from StaticAnalyzerCore to StaticAnalyzerFrontend should probably fix it, but I don't have enough context to come up with a fix.

--
Regards,
Ilya Biryukov
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added a reviewer: ilya-biryukov.
xazax.hun added a subscriber: ilya-biryukov.
xazax.hun added a comment.

@ilya-biryukov Could you please provide some more details on where the cyclic dependency is? I cannot reproduce the problem, and cmake usually fails when there is a cyclic dependency and you are building shared libraries.
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
a.sidorin reopened this revision.
a.sidorin added a comment.

The changes were reverted: http://llvm.org/viewvc/llvm-project?rev=326432&view=rev
Gabor, could you take a look?
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
This revision was not accepted when it landed; it landed in state "Needs Review".
This revision was automatically updated to reflect the committed changes.
Closed by commit rL326323: [analyzer] Support for naive cross translation unit analysis (authored by xazax, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D30691?vs=134431&id=136282#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D30691

Files:
  cfe/trunk/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/SubEngine.h
  cfe/trunk/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  cfe/trunk/lib/StaticAnalyzer/Core/CMakeLists.txt
  cfe/trunk/lib/StaticAnalyzer/Core/CallEvent.cpp
  cfe/trunk/lib/StaticAnalyzer/Core/ExprEngine.cpp
  cfe/trunk/lib/StaticAnalyzer/Core/PathDiagnostic.cpp
  cfe/trunk/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  cfe/trunk/lib/StaticAnalyzer/Frontend/CMakeLists.txt
  cfe/trunk/test/Analysis/Inputs/ctu-chain.cpp
  cfe/trunk/test/Analysis/Inputs/ctu-other.cpp
  cfe/trunk/test/Analysis/Inputs/externalFnMap.txt
  cfe/trunk/test/Analysis/analyzer-config.cpp
  cfe/trunk/test/Analysis/ctu-main.cpp
  cfe/trunk/tools/scan-build-py/README.md
  cfe/trunk/tools/scan-build-py/libscanbuild/__init__.py
  cfe/trunk/tools/scan-build-py/libscanbuild/analyze.py
  cfe/trunk/tools/scan-build-py/libscanbuild/arguments.py
  cfe/trunk/tools/scan-build-py/libscanbuild/clang.py
  cfe/trunk/tools/scan-build-py/libscanbuild/report.py
  cfe/trunk/tools/scan-build-py/tests/unit/test_analyze.py
  cfe/trunk/tools/scan-build-py/tests/unit/test_clang.py

Index: cfe/trunk/tools/scan-build-py/README.md
===================================================================
--- cfe/trunk/tools/scan-build-py/README.md
+++ cfe/trunk/tools/scan-build-py/README.md
@@ -41,6 +41,32 @@
 Use `--help` to know more about the commands.
 
+How to use the experimental Cross Translation Unit analysis
+-----------------------------------------------------------
+
+To run the CTU analysis, a compilation database file has to be created:
+
+    $ intercept-build <your build command>
+
+To run the Clang Static Analyzer against a compilation database
+with CTU analysis enabled, execute:
+
+    $ analyze-build --ctu
+
+For CTU analysis an additional (function-definition) collection-phase is required.
+For debugging purposes, it is possible to separately execute the collection
+and the analysis phase. By doing this, the intermediate files used for
+the analysis are kept on the disk in `./ctu-dir`.
+
+    # Collect and store the data required by the CTU analysis
+    $ analyze-build --ctu-collect-only
+
+    # Analyze using the previously collected data
+    $ analyze-build --ctu-analyze-only
+
+Use `--help` to get more information about the commands.
+
+
 Limitations
 -----------

Index: cfe/trunk/tools/scan-build-py/libscanbuild/arguments.py
===================================================================
--- cfe/trunk/tools/scan-build-py/libscanbuild/arguments.py
+++ cfe/trunk/tools/scan-build-py/libscanbuild/arguments.py
@@ -18,8 +18,8 @@
 import argparse
 import logging
 import tempfile
-from libscanbuild import reconfigure_logging
-from libscanbuild.clang import get_checkers
+from libscanbuild import reconfigure_logging, CtuConfig
+from libscanbuild.clang import get_checkers, is_ctu_capable
 
 __all__ = ['parse_args_for_intercept_build', 'parse_args_for_analyze_build',
            'parse_args_for_scan_build']
@@ -98,6 +98,11 @@
     # add cdb parameter invisibly to make report module working.
     args.cdb = 'compile_commands.json'
 
+    # Make ctu_dir an abspath as it is needed inside clang
+    if not from_build_command and hasattr(args, 'ctu_phases') \
+            and hasattr(args.ctu_phases, 'dir'):
+        args.ctu_dir = os.path.abspath(args.ctu_dir)
+
 
 def validate_args_for_analyze(parser, args, from_build_command):
     """ Command line parsing is done by the argparse module, but semantic
@@ -122,6 +127,18 @@
     elif not from_build_command and not os.path.exists(args.cdb):
         parser.error(message='compilation database is missing')
 
+    # If the user wants CTU mode
+    if not from_build_command and hasattr(args, 'ctu_phases') \
+            and hasattr(args.ctu_phases, 'dir'):
+        # If CTU analyze_only, the input directory should exist
+        if args.ctu_phases.analyze and not args.ctu_phases.collect \
+                and not os.path.exists(args.ctu_dir):
+            parser.error(message='missing CTU directory')
+        # Check CTU capability via checking clang-func-mapping
+        if not is_ctu_capable(args.func_map_cmd):
+            parser.error(message="""This version of clang does not support CTU
+            functionality or clang-func-mapping command not found.""")
+
 
 def create_intercept_parser():
     """ Creates a parser for
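The argument checks in the hunk above can be exercised in isolation. The sketch below mirrors the CTU-specific validation; the standalone helper and its return-a-message convention are my assumptions (the real code reports through `parser.error`):

```python
import os
from types import SimpleNamespace

def validate_ctu_args(args):
    """Sketch of the CTU checks added to validate_args_for_analyze."""
    if not hasattr(args, 'ctu_phases'):
        return None
    # clang needs ctu-dir as an absolute path.
    args.ctu_dir = os.path.abspath(args.ctu_dir)
    # An analyze-only run needs a previously collected CTU directory.
    if args.ctu_phases.analyze and not args.ctu_phases.collect \
            and not os.path.isdir(args.ctu_dir):
        return 'missing CTU directory'
    return None

# A combined collect+analyze run never needs a pre-existing ctu-dir:
args = SimpleNamespace(ctu_phases=SimpleNamespace(collect=True, analyze=True),
                       ctu_dir='ctu-dir')
```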
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
This revision was not accepted when it landed; it landed in state "Needs Review".
This revision was automatically updated to reflect the committed changes.
Closed by commit rC326323: [analyzer] Support for naive cross translation unit analysis (authored by xazax, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D30691?vs=134431&id=136283#toc

Repository:
  rC Clang

https://reviews.llvm.org/D30691

Files:
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  include/clang/StaticAnalyzer/Core/PathSensitive/SubEngine.h
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Core/PathDiagnostic.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  lib/StaticAnalyzer/Frontend/CMakeLists.txt
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/analyzer-config.cpp
  test/Analysis/ctu-main.cpp
  tools/scan-build-py/README.md
  tools/scan-build-py/libscanbuild/__init__.py
  tools/scan-build-py/libscanbuild/analyze.py
  tools/scan-build-py/libscanbuild/arguments.py
  tools/scan-build-py/libscanbuild/clang.py
  tools/scan-build-py/libscanbuild/report.py
  tools/scan-build-py/tests/unit/test_analyze.py
  tools/scan-build-py/tests/unit/test_clang.py

Index: lib/StaticAnalyzer/Core/PathDiagnostic.cpp
===================================================================
--- lib/StaticAnalyzer/Core/PathDiagnostic.cpp
+++ lib/StaticAnalyzer/Core/PathDiagnostic.cpp
@@ -379,11 +379,25 @@
   return None;
 }
 
+static bool compareCrossTUSourceLocs(FullSourceLoc XL, FullSourceLoc YL) {
+  std::pair<FileID, unsigned> XOffs = XL.getDecomposedLoc();
+  std::pair<FileID, unsigned> YOffs = YL.getDecomposedLoc();
+  const SourceManager &SM = XL.getManager();
+  std::pair<bool, bool> InSameTU = SM.isInTheSameTranslationUnit(XOffs, YOffs);
+  if (InSameTU.first)
+    return XL.isBeforeInTranslationUnitThan(YL);
+  const FileEntry *XFE = SM.getFileEntryForID(XL.getFileID());
+  const FileEntry *YFE = SM.getFileEntryForID(YL.getFileID());
+  if (!XFE || !YFE)
+    return XFE && !YFE;
+  return XFE->getName() < YFE->getName();
+}
+
 static bool compare(const PathDiagnostic &X, const PathDiagnostic &Y) {
   FullSourceLoc XL = X.getLocation().asLocation();
   FullSourceLoc YL = Y.getLocation().asLocation();
   if (XL != YL)
-    return XL.isBeforeInTranslationUnitThan(YL);
+    return compareCrossTUSourceLocs(XL, YL);
   if (X.getBugType() != Y.getBugType())
     return X.getBugType() < Y.getBugType();
   if (X.getCategory() != Y.getCategory())
@@ -403,7 +417,8 @@
     SourceLocation YDL = YD->getLocation();
     if (XDL != YDL) {
       const SourceManager &SM = XL.getManager();
-      return SM.isBeforeInTranslationUnit(XDL, YDL);
+      return compareCrossTUSourceLocs(FullSourceLoc(XDL, SM),
+                                      FullSourceLoc(YDL, SM));
     }
   }
   PathDiagnostic::meta_iterator XI = X.meta_begin(), XE = X.meta_end();

Index: lib/StaticAnalyzer/Core/CMakeLists.txt
===================================================================
--- lib/StaticAnalyzer/Core/CMakeLists.txt
+++ lib/StaticAnalyzer/Core/CMakeLists.txt
@@ -58,6 +58,7 @@
   clangASTMatchers
   clangAnalysis
   clangBasic
+  clangCrossTU
   clangLex
   clangRewrite
   ${Z3_LINK_FILES}

Index: lib/StaticAnalyzer/Core/CallEvent.cpp
===================================================================
--- lib/StaticAnalyzer/Core/CallEvent.cpp
+++ lib/StaticAnalyzer/Core/CallEvent.cpp
@@ -28,6 +28,7 @@
 #include "clang/Analysis/AnalysisDeclContext.h"
 #include "clang/Analysis/CFG.h"
 #include "clang/Analysis/ProgramPoint.h"
+#include "clang/CrossTU/CrossTranslationUnit.h"
 #include "clang/Basic/IdentifierTable.h"
 #include "clang/Basic/LLVM.h"
 #include "clang/Basic/SourceLocation.h"
@@ -405,7 +406,27 @@
     }
   }
 
-  return {};
+  SubEngine *Engine = getState()->getStateManager().getOwningEngine();
+  AnalyzerOptions &Opts = Engine->getAnalysisManager().options;
+
+  // Try to get CTU definition only if CTUDir is provided.
+  if (!Opts.naiveCTUEnabled())
+    return RuntimeDefinition();
+
+  cross_tu::CrossTranslationUnitContext &CTUCtx =
+      *Engine->getCrossTranslationUnitContext();
+  llvm::Expected<const FunctionDecl *> CTUDeclOrError =
+      CTUCtx.getCrossTUDefinition(FD, Opts.getCTUDir(), Opts.getCTUIndexName());
+
+  if (!CTUDeclOrError) {
+    handleAllErrors(CTUDeclOrError.takeError(),
+                    [&](const cross_tu::IndexError &IE) {
+                      CTUCtx.emitCrossTUDiagnostics(IE);
+                    });
+    return {};
+  }
+
+  return RuntimeDefinition(*CTUDeclOrError);
 }
 
 void AnyFunctionCall::getInitialStackFrameContents(

Index: lib/StaticAnalyzer/Core/ExprEngine.cpp
===================================================================
---
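The `compareCrossTUSourceLocs` ordering above can be sketched in Python. This is a simplified model in which a location is a `(tu, offset, file_name)` tuple rather than a real `FullSourceLoc`, with `file_name` set to `None` when there is no file entry:

```python
def compare_cross_tu_locs(x, y):
    """Model of compareCrossTUSourceLocs: True if x sorts before y."""
    x_tu, x_offset, x_file = x
    y_tu, y_offset, y_file = y
    if x_tu == y_tu:
        # Same translation unit: plain source order by offset.
        return x_offset < y_offset
    if x_file is None or y_file is None:
        # A location with a file entry sorts before one without.
        return x_file is not None and y_file is None
    # Cross-TU: fall back to ordering by file name.
    return x_file < y_file
```

Note the weakness raised later in the review: two distinct files with the same name compare as equal here, so the relation is not a strict total order over all locations.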
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
dcoughlin accepted this revision.
dcoughlin added a comment.

Thanks Gabor, this looks good to me. Please commit!
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added a comment.

In https://reviews.llvm.org/D30691#1003514, @george.karpenkov wrote:

> Python code looks OK to me, I have one last request: could we have a small
> documentation how the whole thing is supposed work in integration, preferably
> on an available open-source project any reader could check out?
> I am asking because I have actually tried and failed to launch this CTU
> patch on a project I was analyzing.

We added the documentation. Could you recheck? Thanks in advance!

Comment at: lib/StaticAnalyzer/Core/PathDiagnostic.cpp:395
+    return XFE && !YFE;
+  return XFE->getName() < YFE->getName();
+}

> nikhgupt wrote:
> > getName could yield incorrect results if two files in the project have the
> > same name. This might break the assert for PathDiagnostics 'total ordering'
> > and 'uniqueness'. Maybe replacing FileEntry's getName with FullSourceLoc's
> > getFileID could resolve this.

Thank you, this is a known problem that we plan to address in a follow-up patch.
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 134431.
xazax.hun added a comment.

- Rebased to current ToT
- Fixed a problem where scan-build-py used an old version of the CTU configuration option
- Added a short guide on how to use CTU

https://reviews.llvm.org/D30691

Files:
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  include/clang/StaticAnalyzer/Core/PathSensitive/SubEngine.h
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Core/PathDiagnostic.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  lib/StaticAnalyzer/Frontend/CMakeLists.txt
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/ctu-main.cpp
  tools/scan-build-py/README.md
  tools/scan-build-py/libscanbuild/__init__.py
  tools/scan-build-py/libscanbuild/analyze.py
  tools/scan-build-py/libscanbuild/arguments.py
  tools/scan-build-py/libscanbuild/clang.py
  tools/scan-build-py/libscanbuild/report.py
  tools/scan-build-py/tests/unit/test_analyze.py
  tools/scan-build-py/tests/unit/test_clang.py

Index: tools/scan-build-py/tests/unit/test_clang.py
===================================================================
--- tools/scan-build-py/tests/unit/test_clang.py
+++ tools/scan-build-py/tests/unit/test_clang.py
@@ -92,3 +92,15 @@
         self.assertEqual('Checker One description', result.get('checker.one'))
         self.assertTrue('checker.two' in result)
         self.assertEqual('Checker Two description', result.get('checker.two'))
+
+
+class ClangIsCtuCapableTest(unittest.TestCase):
+    def test_ctu_not_found(self):
+        is_ctu = sut.is_ctu_capable('not-found-clang-func-mapping')
+        self.assertFalse(is_ctu)
+
+
+class ClangGetTripleArchTest(unittest.TestCase):
+    def test_arch_is_not_empty(self):
+        arch = sut.get_triple_arch(['clang', '-E', '-'], '.')
+        self.assertTrue(len(arch) > 0)

Index: tools/scan-build-py/tests/unit/test_analyze.py
===================================================================
--- tools/scan-build-py/tests/unit/test_analyze.py
+++ tools/scan-build-py/tests/unit/test_analyze.py
@@ -4,12 +4,12 @@
 # This file is distributed under the University of Illinois Open Source
 # License. See LICENSE.TXT for details.
 
-import libear
-import libscanbuild.analyze as sut
 import unittest
 import re
 import os
 import os.path
+import libear
+import libscanbuild.analyze as sut
 
 class ReportDirectoryTest(unittest.TestCase):
@@ -333,3 +333,83 @@
     def test_method_exception_not_caught(self):
         self.assertRaises(Exception, method_exception_from_inside, dict())
+
+
+class PrefixWithTest(unittest.TestCase):
+
+    def test_gives_empty_on_empty(self):
+        res = sut.prefix_with(0, [])
+        self.assertFalse(res)
+
+    def test_interleaves_prefix(self):
+        res = sut.prefix_with(0, [1, 2, 3])
+        self.assertListEqual([0, 1, 0, 2, 0, 3], res)
+
+
+class MergeCtuMapTest(unittest.TestCase):
+
+    def test_no_map_gives_empty(self):
+        pairs = sut.create_global_ctu_function_map([])
+        self.assertFalse(pairs)
+
+    def test_multiple_maps_merged(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun3#I# ast/fun3.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun3#I#', 'ast/fun3.c.ast') in pairs)
+        self.assertEqual(3, len(pairs))
+
+    def test_not_unique_func_left_out(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun7.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun7.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+    def test_duplicates_are_kept(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun1.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(2, len(pairs))
+
+    def test_space_handled_in_source(self):
+        concat_map = ['c:@F@fun1#I# ast/f un.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#',
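The merge semantics these tests pin down fit in a few lines. The sketch below is consistent with the tests but is not the actual scan-build-py implementation:

```python
def create_global_ctu_function_map(concat_map):
    """Merge 'USR<space>ast-file' lines into (USR, ast-file) pairs.

    A USR that maps to more than one distinct AST file is ambiguous
    and is left out entirely; identical duplicates collapse to one pair.
    """
    mappings = {}
    for line in concat_map:
        # Split on the first space only, so AST file paths may contain spaces.
        usr, _, ast_file = line.partition(' ')
        mappings.setdefault(usr, set()).add(ast_file)
    return [(usr, files.pop())
            for usr, files in mappings.items() if len(files) == 1]
```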
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
gerazo added inline comments.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:702
+    # To have good results from static analyzer certain compiler options shall be

> george.karpenkov wrote:
> > This blank line should not be in this PR.

Scheduled to be done.

Comment at: tools/scan-build-py/libscanbuild/report.py:270
+    # Protect parsers from bogus report files coming from clang crashes
+    for filename in glob.iglob(os.path.join(output_dir, pattern)):

> george.karpenkov wrote:
> > I understand the intent here, but it seems it should be handled at a
> > different level: would it be hard to change Clang to only write the report
> > file at the very end, when no crash should be encountered? Or make parsers
> > not choke on empty fields?

After careful investigation, I have to say it would be very hard to get this right in the current setup. Unlike the rest of clang, ASTImporter.cpp is not a strong part of the system: it has a lot of ongoing fixes and probably more problems (missing C++ functionality) on the way. This is the part that makes CTU less reliable than clang generally. To survive a problem in this unit gracefully, we need to leave the other things intact and fix ASTImporter instead (this is an ongoing effort). Until then, this fix seems good enough.
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
george.karpenkov added a comment. Python code looks OK to me. I have one last request: could we have some small documentation on how the whole thing is supposed to work in integration, preferably on an available open-source project any reader could check out? I am asking because I have actually tried, and failed, to launch this CTU patch on a project I was analyzing. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
nikhgupt added inline comments. Herald added a subscriber: hintonda.

Comment at: lib/StaticAnalyzer/Core/PathDiagnostic.cpp:395

+    return XFE && !YFE;
+  return XFE->getName() < YFE->getName();
+}

getName could yield incorrect results if two files in the project have the same name. This might break the assert for PathDiagnostics 'total ordering' and 'uniqueness'. Maybe replacing FileEntry's getName with FullSourceLoc's getFileID could resolve this.

https://reviews.llvm.org/D30691
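The hazard described above can be modeled in a few lines. This is a toy Python sketch, not clang code: when the comparison key is only the file *name*, two distinct files that share a name are mutually indistinguishable, so the ordering cannot be total over diagnostics; a unique per-file ID (what a FileID would provide) restores the distinction.

```python
def less_by_name(x, y):
    # Ordering keyed on the file name alone, as in the snippet above.
    return x['name'] < y['name']


def less_by_file_id(x, y):
    # Ordering keyed on a unique per-file ID instead.
    return x['id'] < y['id']


# Two *different* files that happen to share a base name:
x = {'name': 'util.c', 'id': 1}
y = {'name': 'util.c', 'id': 2}

# By name, neither precedes the other: distinct diagnostics compare
# as equivalent, which is what would break the uniqueness assert.
name_distinguishes = less_by_name(x, y) or less_by_name(y, x)
# By ID, exactly one precedes the other: a strict total order.
id_distinguishes = less_by_file_id(x, y) != less_by_file_id(y, x)
```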
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 129509.
xazax.hun marked 5 inline comments as done.
xazax.hun added a comment.

- Fixed review comments

https://reviews.llvm.org/D30691

Files:
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  include/clang/StaticAnalyzer/Core/PathSensitive/SubEngine.h
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Core/PathDiagnostic.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  lib/StaticAnalyzer/Frontend/CMakeLists.txt
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/ctu-main.cpp
  tools/scan-build-py/libscanbuild/__init__.py
  tools/scan-build-py/libscanbuild/analyze.py
  tools/scan-build-py/libscanbuild/arguments.py
  tools/scan-build-py/libscanbuild/clang.py
  tools/scan-build-py/libscanbuild/report.py
  tools/scan-build-py/tests/unit/test_analyze.py
  tools/scan-build-py/tests/unit/test_clang.py

Index: tools/scan-build-py/tests/unit/test_clang.py
===================================================================
--- tools/scan-build-py/tests/unit/test_clang.py
+++ tools/scan-build-py/tests/unit/test_clang.py
@@ -92,3 +92,15 @@
         self.assertEqual('Checker One description', result.get('checker.one'))
         self.assertTrue('checker.two' in result)
         self.assertEqual('Checker Two description', result.get('checker.two'))
+
+
+class ClangIsCtuCapableTest(unittest.TestCase):
+    def test_ctu_not_found(self):
+        is_ctu = sut.is_ctu_capable('not-found-clang-func-mapping')
+        self.assertFalse(is_ctu)
+
+
+class ClangGetTripleArchTest(unittest.TestCase):
+    def test_arch_is_not_empty(self):
+        arch = sut.get_triple_arch(['clang', '-E', '-'], '.')
+        self.assertTrue(len(arch) > 0)

Index: tools/scan-build-py/tests/unit/test_analyze.py
===================================================================
--- tools/scan-build-py/tests/unit/test_analyze.py
+++ tools/scan-build-py/tests/unit/test_analyze.py
@@ -4,12 +4,12 @@
 # This file is distributed under the University of Illinois Open Source
 # License. See LICENSE.TXT for details.
 
-import libear
-import libscanbuild.analyze as sut
 import unittest
 import re
 import os
 import os.path
+import libear
+import libscanbuild.analyze as sut
 
 
 class ReportDirectoryTest(unittest.TestCase):
@@ -333,3 +333,83 @@
     def test_method_exception_not_caught(self):
         self.assertRaises(Exception, method_exception_from_inside, dict())
+
+
+class PrefixWithTest(unittest.TestCase):
+
+    def test_gives_empty_on_empty(self):
+        res = sut.prefix_with(0, [])
+        self.assertFalse(res)
+
+    def test_interleaves_prefix(self):
+        res = sut.prefix_with(0, [1, 2, 3])
+        self.assertListEqual([0, 1, 0, 2, 0, 3], res)
+
+
+class MergeCtuMapTest(unittest.TestCase):
+
+    def test_no_map_gives_empty(self):
+        pairs = sut.create_global_ctu_function_map([])
+        self.assertFalse(pairs)
+
+    def test_multiple_maps_merged(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun3#I# ast/fun3.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun3#I#', 'ast/fun3.c.ast') in pairs)
+        self.assertEqual(3, len(pairs))
+
+    def test_not_unique_func_left_out(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun7.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun7.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+    def test_duplicates_are_kept(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun1.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(2, len(pairs))
+
+    def test_space_handled_in_source(self):
+        concat_map = ['c:@F@fun1#I# ast/f un.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/f un.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+
+class FuncMapSrcToAstTest(unittest.TestCase):
+
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added inline comments.

Comment at: lib/StaticAnalyzer/Core/PathDiagnostic.cpp:418-423

   SourceLocation XDL = XD->getLocation();
   SourceLocation YDL = YD->getLocation();
   if (XDL != YDL) {
     const SourceManager &SM = XL.getManager();
-    return SM.isBeforeInTranslationUnit(XDL, YDL);
+    return compareCrossTUSourceLocs(FullSourceLoc(XDL, SM),
+                                    FullSourceLoc(YDL, SM));

NoQ wrote:
> xazax.hun wrote:
> > NoQ wrote:
> > > It seems to me that `XDL` and `YDL` are exactly the same as `XL` and `YL` we've seen at the beginning of the function.
> > >
> > > ...we still have only one `SourceManager`, right?
> > Is this true? One is the location associated with the PathDiagnostic; the other is the location of the Decl associated with the issue. I do not have a deep understanding of this part of the code, so I am not sure these are guaranteed to be the same.
> Whoops, you're totally right, never mind.
>
> Comments might have probably helped me understand that faster.
Sure, comments might help, but I want to keep the changes in this patch focused and minimal, so I prefer to do such modifications in a separate patch.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
NoQ added inline comments.

Comment at: lib/StaticAnalyzer/Core/PathDiagnostic.cpp:418-423

   SourceLocation XDL = XD->getLocation();
   SourceLocation YDL = YD->getLocation();
   if (XDL != YDL) {
     const SourceManager &SM = XL.getManager();
-    return SM.isBeforeInTranslationUnit(XDL, YDL);
+    return compareCrossTUSourceLocs(FullSourceLoc(XDL, SM),
+                                    FullSourceLoc(YDL, SM));

xazax.hun wrote:
> NoQ wrote:
> > It seems to me that `XDL` and `YDL` are exactly the same as `XL` and `YL` we've seen at the beginning of the function.
> >
> > ...we still have only one `SourceManager`, right?
> Is this true? One is the location associated with the PathDiagnostic; the other is the location of the Decl associated with the issue. I do not have a deep understanding of this part of the code, so I am not sure these are guaranteed to be the same.
Whoops, you're totally right, never mind.

Comments might have probably helped me understand that faster.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added inline comments.

Comment at: lib/StaticAnalyzer/Core/PathDiagnostic.cpp:418-423

   SourceLocation XDL = XD->getLocation();
   SourceLocation YDL = YD->getLocation();
   if (XDL != YDL) {
     const SourceManager &SM = XL.getManager();
-    return SM.isBeforeInTranslationUnit(XDL, YDL);
+    return compareCrossTUSourceLocs(FullSourceLoc(XDL, SM),
+                                    FullSourceLoc(YDL, SM));

NoQ wrote:
> It seems to me that `XDL` and `YDL` are exactly the same as `XL` and `YL` we've seen at the beginning of the function.
>
> ...we still have only one `SourceManager`, right?
Is this true? One is the location associated with the PathDiagnostic; the other is the location of the Decl associated with the issue. I do not have a deep understanding of this part of the code, so I am not sure these are guaranteed to be the same.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
NoQ added a comment.

Had a fresh look on the C++ part, it is super clean now, i'm very impressed :)

Comment at: lib/StaticAnalyzer/Core/CallEvent.cpp:373-374

-  return RuntimeDefinition();
+  auto Engine = static_cast<ExprEngine *>(
+      getState()->getStateManager().getOwningEngine());

Humbly suggests to refactor whatever we need here into `SubEngine`'s virtual method(s). `getAnalysisManager()` is already there, so i guess we only need to expose `getCrossTranslationUnitContext()`.

Comment at: lib/StaticAnalyzer/Core/CallEvent.cpp:385

+      FD, Engine->getAnalysisManager().options.getCTUDir(),
+      "externalFnMap.txt");

I think `CallEvent` is the last place where i'd prefer to hardcode this filename. Maybe hardcode it in `CrossTranslationUnitContext` or get it from `AnalyzerOptions`? (the former, i guess, because i doubt anybody would ever want to change it).

Comment at: lib/StaticAnalyzer/Core/PathDiagnostic.cpp:418-423

   SourceLocation XDL = XD->getLocation();
   SourceLocation YDL = YD->getLocation();
   if (XDL != YDL) {
     const SourceManager &SM = XL.getManager();
-    return SM.isBeforeInTranslationUnit(XDL, YDL);
+    return compareCrossTUSourceLocs(FullSourceLoc(XDL, SM),
+                                    FullSourceLoc(YDL, SM));

It seems to me that `XDL` and `YDL` are exactly the same as `XL` and `YL` we've seen at the beginning of the function.

...we still have only one `SourceManager`, right?

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added inline comments.

Comment at: lib/StaticAnalyzer/Core/AnalyzerOptions.cpp:407

+  if (!NaiveCTU.hasValue()) {
+    NaiveCTU = getBooleanOption("experimental-enable-naive-ctu-analysis",
+                                /*Default=*/false);

This option might not be the most readable, but this way "experimental" is a prefix. Do you prefer it this way, or something like `enable-experimental-naive-ctu-analysis`?

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 127525.
xazax.hun marked an inline comment as not done.
xazax.hun added a comment.

- Address some review comments
- Rebased on ToT

https://reviews.llvm.org/D30691

Files:
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Core/PathDiagnostic.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  lib/StaticAnalyzer/Frontend/CMakeLists.txt
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/ctu-main.cpp
  tools/scan-build-py/libscanbuild/__init__.py
  tools/scan-build-py/libscanbuild/analyze.py
  tools/scan-build-py/libscanbuild/arguments.py
  tools/scan-build-py/libscanbuild/clang.py
  tools/scan-build-py/libscanbuild/report.py
  tools/scan-build-py/tests/unit/test_analyze.py
  tools/scan-build-py/tests/unit/test_clang.py

Index: tools/scan-build-py/tests/unit/test_clang.py
===================================================================
--- tools/scan-build-py/tests/unit/test_clang.py
+++ tools/scan-build-py/tests/unit/test_clang.py
@@ -92,3 +92,15 @@
         self.assertEqual('Checker One description', result.get('checker.one'))
         self.assertTrue('checker.two' in result)
         self.assertEqual('Checker Two description', result.get('checker.two'))
+
+
+class ClangIsCtuCapableTest(unittest.TestCase):
+    def test_ctu_not_found(self):
+        is_ctu = sut.is_ctu_capable('not-found-clang-func-mapping')
+        self.assertFalse(is_ctu)
+
+
+class ClangGetTripleArchTest(unittest.TestCase):
+    def test_arch_is_not_empty(self):
+        arch = sut.get_triple_arch(['clang', '-E', '-'], '.')
+        self.assertTrue(len(arch) > 0)

Index: tools/scan-build-py/tests/unit/test_analyze.py
===================================================================
--- tools/scan-build-py/tests/unit/test_analyze.py
+++ tools/scan-build-py/tests/unit/test_analyze.py
@@ -4,12 +4,12 @@
 # This file is distributed under the University of Illinois Open Source
 # License. See LICENSE.TXT for details.
 
-import libear
-import libscanbuild.analyze as sut
 import unittest
 import re
 import os
 import os.path
+import libear
+import libscanbuild.analyze as sut
 
 
 class ReportDirectoryTest(unittest.TestCase):
@@ -333,3 +333,83 @@
     def test_method_exception_not_caught(self):
         self.assertRaises(Exception, method_exception_from_inside, dict())
+
+
+class PrefixWithTest(unittest.TestCase):
+
+    def test_gives_empty_on_empty(self):
+        res = sut.prefix_with(0, [])
+        self.assertFalse(res)
+
+    def test_interleaves_prefix(self):
+        res = sut.prefix_with(0, [1, 2, 3])
+        self.assertListEqual([0, 1, 0, 2, 0, 3], res)
+
+
+class MergeCtuMapTest(unittest.TestCase):
+
+    def test_no_map_gives_empty(self):
+        pairs = sut.create_global_ctu_function_map([])
+        self.assertFalse(pairs)
+
+    def test_multiple_maps_merged(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun3#I# ast/fun3.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun3#I#', 'ast/fun3.c.ast') in pairs)
+        self.assertEqual(3, len(pairs))
+
+    def test_not_unique_func_left_out(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun7.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun7.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+    def test_duplicates_are_kept(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun1.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(2, len(pairs))
+
+    def test_space_handled_in_source(self):
+        concat_map = ['c:@F@fun1#I# ast/f un.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/f un.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+
+class FuncMapSrcToAstTest(unittest.TestCase):
+
+    def
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun marked 6 inline comments as done.
xazax.hun added inline comments.

Comment at: lib/StaticAnalyzer/Core/CallEvent.cpp:372

+
+  cross_tu::CrossTranslationUnitContext &CTUCtx =
+      Engine->getCrossTranslationUnitContext();

dcoughlin wrote:
> Can this logic be moved to AnalysisDeclContext->getBody()?
>
> CallEvent::getRuntimeDefinition() is really all about modeling function dispatch at run time. It seems odd to have the cross-translation-unit loading (which is about compiler book-keeping rather than semantics) here.
I just tried that and unfortunately, it introduces cyclic dependencies. CrossTU depends on Frontend. Frontend depends on Sema. Sema depends on Analysis. Making Analysis depend on CrossTU introduces the cycle.

Comment at: lib/StaticAnalyzer/Core/CallEvent.cpp:382

+        [&](const cross_tu::IndexError &IE) {
+          CTUCtx.emitCrossTUDiagnostics(IE);
+        });

dcoughlin wrote:
> I don't think it makes sense to diagnose index errors here.
>
> Doing it during analysis means that, for example, the parse error could be emitted or not emitted depending on whether the analyzer thinks a particular call site is reached.
>
> It would be better to validate/parse the index before starting analysis rather than during analysis itself.
While I agree, right now this validation is not part of the analyzer but the responsibility of the "driver" script, for example CodeChecker. It is useful to have this diagnostic to catch bugs in the driver.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
gerazo added a comment.

In https://reviews.llvm.org/D30691#954740, @george.karpenkov wrote:

> I've tried using the patch, and I got blocked at the following: CTU options are only exposed when one goes through the `analyze-build` frontend, which requires `compile_commands.json` to be present.

> So could you write a short documentation somewhere on how all the pieces fit together? What entry point should be used, what should people do who don't have a build-system-generated `compile_commands.json`, etc.

Basically this patch does not change the way scan-build-py works. You can either create a compile_commands.json using intercept-build and then use analyze-build on the result, or you can do the two together using scan-build. We only offer the CTU functionality in analyze-build, because CTU needs multiple runs anyway, so on-the-spot build-action capture and analysis is not possible with CTU.

The general usage of scan-build-py is written in clang/tools/scan-build-py/README.md

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
dcoughlin added a comment.

Some comments on the C++ inline.

Comment at: include/clang/AST/ASTContext.h:82

 class APValue;
+class ASTImporter;
 class ASTMutationListener;

Is this forward declaration needed?

Comment at: include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h:63

 private:
+  cross_tu::CrossTranslationUnitContext

I don't think CrossTranslationUnitContext belongs in ExprEngine (it doesn't really have much to do with transfer functions and graph construction). Can you move it to AnalysisDeclContextManager instead?

Also, when you move it to AnalysisManager can you make it a pointer that is NULL when naive cross-translation support is not enabled? This will make it more clear that the cross-translation unit support will not always be available.

Comment at: lib/StaticAnalyzer/Core/AnalyzerOptions.cpp:400

+  return CTUDir.getValue();
+}

Can you also add an analyzer option that is something like 'enable-naive-cross-translation-unit-analysis' and defaults to false? I'd like to avoid using the presence of 'ctu-dir' as an indication that the analyzer should use the naive CTU analysis. This way, when we add a less naive CTU analysis, we'll be able to use the CTUDir for analysis artifacts (such as summaries) of the less naive CTU analysis as well.

Comment at: lib/StaticAnalyzer/Core/CallEvent.cpp:365

+  auto Engine = static_cast<ExprEngine *>(
+      getState()->getStateManager().getOwningEngine());

This downcast is an indication that the CTUContext is living in the wrong class.

Comment at: lib/StaticAnalyzer/Core/CallEvent.cpp:372

+
+  cross_tu::CrossTranslationUnitContext &CTUCtx =
+      Engine->getCrossTranslationUnitContext();

Can this logic be moved to AnalysisDeclContext->getBody()?

CallEvent::getRuntimeDefinition() is really all about modeling function dispatch at run time. It seems odd to have the cross-translation-unit loading (which is about compiler book-keeping rather than semantics) here.
Comment at: lib/StaticAnalyzer/Core/CallEvent.cpp:382

+        [&](const cross_tu::IndexError &IE) {
+          CTUCtx.emitCrossTUDiagnostics(IE);
+        });

I don't think it makes sense to diagnose index errors here.

Doing it during analysis means that, for example, the parse error could be emitted or not emitted depending on whether the analyzer thinks a particular call site is reached.

It would be better to validate/parse the index before starting analysis rather than during analysis itself.

Comment at: lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:197

-  AnalysisConsumer(const Preprocessor , const std::string ,
-                   AnalyzerOptionsRef opts, ArrayRef plugins,
-                   CodeInjector *injector)
+  AnalysisConsumer(CompilerInstance , const Preprocessor ,
+                   const std::string , AnalyzerOptionsRef opts,

There is no need for AnalysisConsumer to take both a CompilerInstance and a Preprocessor. You can just call `getPreprocessor()` to get the preprocessor from the CompilerInstance.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
george.karpenkov requested changes to this revision.
george.karpenkov added a comment.
This revision now requires changes to proceed.

I've tried using the patch, and I got blocked at the following: CTU options are only exposed when one goes through the `analyze-build` frontend, which requires `compile_commands.json` to be present. I've used `libear` to generate `compile_commands.json`, but the generated JSON does not contain the `command` field, which causes `@require` before `run` to die (also, due to the passing style this error was unnecessarily difficult to debug).

So could you write a short documentation somewhere on how all the pieces fit together? What entry point should be used, what should people do who don't have a build-system-generated `compile_commands.json`, etc.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
george.karpenkov added inline comments.

Comment at: tools/scan-build-py/libscanbuild/report.py:257

 def read_bugs(output_dir, html):
+    # type: (str, bool) -> Generator[Dict[str, Any], None, None]
     """ Generate a unique sequence of bugs from given output directory.

Minor nitpicking: type comments are semi-standardized with Sphinx-style auto-generated documentation, and should be a part of the docstring.

https://reviews.llvm.org/D30691
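To illustrate the reviewer's point: instead of a `# type:` comment, the parameter and return types can live in the docstring in Sphinx field-list style, where autodoc tooling picks them up. A minimal sketch (the body is elided and the return annotation is simplified; only the documentation style is the point here):

```python
def read_bugs(output_dir, html):
    """Generate a unique sequence of bugs from a given output directory.

    :param output_dir: path to the analyzer's output directory
    :type output_dir: str
    :param html: whether HTML reports should be parsed as well
    :type html: bool
    :return: an iterable of bug dictionaries
    :rtype: collections.abc.Iterable
    """
    # Body elided; this sketch only demonstrates Sphinx-style docstring types.
    return iter(())
```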
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
gerazo added inline comments.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:44

+CTU_FUNCTION_MAP_FILENAME = 'externalFnMap.txt'
+CTU_TEMP_FNMAP_FOLDER = 'tmpExternalFnMaps'

george.karpenkov wrote:
> gerazo wrote:
> > george.karpenkov wrote:
> > > What would happen when multiple analyzer runs are launched concurrently in the same directory? Would they race on this file?
> > Yes and no. The 1st, collect part of CTU creates this file by aggregating all data from the build system, while the 2nd part, which does the analysis itself, only reads it. Multiple analyses can use this file simultaneously without problem. However, multiple collect phases launched on a system do not make sense. In this case, the later one would write over the previous results with the same data.
> > However, multiple collect phases launched on a system does not make sense
> Why not? What about a system with multiple users, where code is in the shared directory? Or even simpler: I've launched a background process, forgot about it, and then launched it again?
> > In this case, the later one would write over the previous results with the same data.
> That is probably fine, I am more worried about racing, where process B would be reading a partially overwritten file (not completely sure whether it is possible)
I see your point. In order to create the multiple-user scenario you've mentioned, those users would need to give the exact same output folders for their jobs. Our original bet was that our users are not doing this, as the scan-build tool itself also cannot be used this way. Still, the forgotten background process you mention is something to consider. I will plan some kind of locking mechanism to avoid situations like this.
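One possible shape for the race fix promised above is to make the publish step atomic: write the merged map to a temporary file in the same directory, then rename it over the destination, so a reader never observes a partially written file. This is a hedged sketch; `write_function_map_atomically` is a hypothetical name, not part of scan-build-py, and a rename is not a full lock (it does not serialize two concurrent collect phases, it only keeps readers consistent).

```python
import os
import tempfile


def write_function_map_atomically(path, lines):
    """Publish a function map without readers seeing a partial file.

    Write to a temp file in the destination's directory (so the rename
    stays on one filesystem), then atomically replace the destination.
    Illustrative sketch only; the name is an assumption.
    """
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write('\n'.join(lines) + '\n')
        os.replace(tmp_path, path)  # atomic rename (Python 3.3+)
    except BaseException:
        os.unlink(tmp_path)  # clean up the temp file on failure
        raise
```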
Comment at: tools/scan-build-py/libscanbuild/analyze.py:609

+# Recover namedtuple from json when coming from analyze_cc
+if not hasattr(ctu_config, 'collect'):
+    ctu_config = CtuConfig(collect=ctu_config[0],

george.karpenkov wrote:
> gerazo wrote:
> > george.karpenkov wrote:
> > > In which case is this branch hit? Isn't an improperly formed input argument indicative of an internal error at this stage?
> > Another part of scan-build-py, analyze_cc, uses the namedtuple-to-json format to communicate. However, the names do not come back from json, so this code restores them. This is the case when someone uses the whole toolset with compiler wrapping. All the environment variable hassle is also happening because of this, so these env vars are not for user modification (as you've suggested earlier).
> OK, so `opts['ctu']` is a tuple or a named tuple depending on how this function is entered? BTW, could you point me to the `analyze_cc` entry point?
>
> For the purpose of having more uniform code with fewer cases to care about, do you think we could just use ordinary tuples instead of constructing a named one, since we have to deconstruct an ordinary tuple in any case?
Using a NamedTuple improves the readability of the code a lot, with fewer comments. It is unfortunate that serializing it is not solved by Python. I think moving this code to the entry point would make the whole thing much nicer. The entry point is at analyze_compiler_wrapper

Comment at: tools/scan-build-py/libscanbuild/arguments.py:376

+        metavar='',
+        default='ctu-dir',
+        help="""Defines the temporary directory used between ctu

george.karpenkov wrote:
> BTW, can we also explicitly add `dest='ctu_dir'` here, as otherwise I was initially very confused as to where the variable is set.
Yes, of course.

https://reviews.llvm.org/D30691
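The serialization problem discussed in this thread, that a namedtuple degrades to a plain JSON array so the field names must be re-attached on the way back in, can be sketched as below. The recovery helper's name is hypothetical; it only illustrates the "recover at a single entry point" idea suggested in review.

```python
import json
from collections import namedtuple

CtuConfig = namedtuple('CtuConfig',
                       ['collect', 'analyze', 'dir', 'func_map_cmd'])


def ctu_config_from_json(value):
    """Rebuild a CtuConfig from whatever JSON decoding produced.

    json.dumps() serializes a namedtuple as a plain array, so after a
    round trip the field names are gone; reconstruct the namedtuple
    from the positional values. (Illustrative sketch, not scan-build-py.)
    """
    if isinstance(value, CtuConfig):
        return value  # already typed: nothing to recover
    return CtuConfig(*value)
```

Keeping the namedtuple preserves readable attribute access (`ctu_config.collect`) everywhere downstream, at the cost of this one recovery step at the boundary.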
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
george.karpenkov added a comment. Python part looks good to me. I don't know whether @dcoughlin or @NoQ would want to insert additional comments on C++ parts. https://reviews.llvm.org/D30691 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 125986.
xazax.hun added a comment.
Herald added a subscriber: rnkovacs.

- @gerazo addressed some review comments in scan-build-py.

https://reviews.llvm.org/D30691

Files:
  include/clang/AST/ASTContext.h
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Core/PathDiagnostic.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  lib/StaticAnalyzer/Frontend/CMakeLists.txt
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/ctu-main.cpp
  tools/scan-build-py/libscanbuild/__init__.py
  tools/scan-build-py/libscanbuild/analyze.py
  tools/scan-build-py/libscanbuild/arguments.py
  tools/scan-build-py/libscanbuild/clang.py
  tools/scan-build-py/libscanbuild/report.py
  tools/scan-build-py/tests/unit/test_analyze.py
  tools/scan-build-py/tests/unit/test_clang.py

Index: tools/scan-build-py/tests/unit/test_clang.py
===================================================================
--- tools/scan-build-py/tests/unit/test_clang.py
+++ tools/scan-build-py/tests/unit/test_clang.py
@@ -92,3 +92,15 @@
         self.assertEqual('Checker One description', result.get('checker.one'))
         self.assertTrue('checker.two' in result)
         self.assertEqual('Checker Two description', result.get('checker.two'))
+
+
+class ClangIsCtuCapableTest(unittest.TestCase):
+    def test_ctu_not_found(self):
+        is_ctu = sut.is_ctu_capable('not-found-clang-func-mapping')
+        self.assertFalse(is_ctu)
+
+
+class ClangGetTripleArchTest(unittest.TestCase):
+    def test_arch_is_not_empty(self):
+        arch = sut.get_triple_arch(['clang', '-E', '-'], '.')
+        self.assertTrue(len(arch) > 0)

Index: tools/scan-build-py/tests/unit/test_analyze.py
===================================================================
--- tools/scan-build-py/tests/unit/test_analyze.py
+++ tools/scan-build-py/tests/unit/test_analyze.py
@@ -4,12 +4,12 @@
 # This file is distributed under the University of Illinois Open Source
 # License. See LICENSE.TXT for details.
 
-import libear
-import libscanbuild.analyze as sut
 import unittest
 import re
 import os
 import os.path
+import libear
+import libscanbuild.analyze as sut
 
 
 class ReportDirectoryTest(unittest.TestCase):
@@ -333,3 +333,83 @@
     def test_method_exception_not_caught(self):
         self.assertRaises(Exception, method_exception_from_inside, dict())
+
+
+class PrefixWithTest(unittest.TestCase):
+
+    def test_gives_empty_on_empty(self):
+        res = sut.prefix_with(0, [])
+        self.assertFalse(res)
+
+    def test_interleaves_prefix(self):
+        res = sut.prefix_with(0, [1, 2, 3])
+        self.assertListEqual([0, 1, 0, 2, 0, 3], res)
+
+
+class MergeCtuMapTest(unittest.TestCase):
+
+    def test_no_map_gives_empty(self):
+        pairs = sut.create_global_ctu_function_map([])
+        self.assertFalse(pairs)
+
+    def test_multiple_maps_merged(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun3#I# ast/fun3.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun3#I#', 'ast/fun3.c.ast') in pairs)
+        self.assertEqual(3, len(pairs))
+
+    def test_not_unique_func_left_out(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun7.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun7.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+    def test_duplicates_are_kept(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun1.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(2, len(pairs))
+
+    def test_space_handled_in_source(self):
+        concat_map = ['c:@F@fun1#I# ast/f un.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/f un.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+
+class FuncMapSrcToAstTest(unittest.TestCase):
+
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
george.karpenkov added inline comments.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:44
+CTU_FUNCTION_MAP_FILENAME = 'externalFnMap.txt'
+CTU_TEMP_FNMAP_FOLDER = 'tmpExternalFnMaps'

gerazo wrote:
> george.karpenkov wrote:
> > What would happen when multiple analyzer runs are launched concurrently in
> > the same directory? Would they race on this file?
> Yes and no. The 1st, collect part of CTU creates this file by aggregating all
> data from the build system, while the 2nd part, which does the analysis itself,
> only reads it. Multiple analyses can use this file simultaneously without
> problem. However, multiple collect phases launched on a system do not make
> sense. In this case, the later one would write over the previous results with
> the same data.

> However, multiple collect phases launched on a system do not make sense

Why not? What about a system with multiple users, where the code is in a shared directory? Or even simpler: I've launched a background process, forgotten about it, and then launched it again?

> In this case, the later one would write over the previous results with the
> same data.

That is probably fine; I am more worried about racing, where process B would be reading a partially overwritten file (not completely sure whether that is possible).

Comment at: tools/scan-build-py/libscanbuild/analyze.py:103
+def prefix_with(constant, pieces):
+    """ From a sequence create another sequence where every second element

gerazo wrote:
> george.karpenkov wrote:
> > Actually I would prefer a separate NFC PR just moving this function. This
> > PR is already too complicated as it is.
> Yes. The only reason for the move was to make it testable. However, we need
> more tests, as you wrote down below.

Sure, but that separate PR can also include tests.
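For what it's worth, the partial-read race George worries about can be ruled out on the writer side with the usual write-to-temp-then-rename pattern. This is only a sketch under the assumption that the map lives in a single file per directory; it is not part of the patch, and the function name is illustrative:

```python
import os
import tempfile


def write_map_atomically(ctu_dir, lines, filename='externalFnMap.txt'):
    """Write the function map so a concurrent reader never sees a partial file.

    The final rename is atomic on POSIX filesystems (temp file and target are
    in the same directory), so an analyze phase reading the map observes
    either the old or the new complete file, never a half-written one.
    """
    fd, tmp_path = tempfile.mkstemp(dir=ctu_dir, prefix=filename + '.')
    with os.fdopen(fd, 'w') as handle:
        handle.write('\n'.join(lines) + '\n')
    os.replace(tmp_path, os.path.join(ctu_dir, filename))  # atomic swap
```

This does not help with two concurrent collect phases interleaving whole runs, but it does guarantee a reader never observes a torn file.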
Comment at: tools/scan-build-py/libscanbuild/analyze.py:120 + func_map_cmd=args.func_map_cmd) +if hasattr(args, 'ctu_phases') and hasattr(args.ctu_phases, 'dir') +else CtuConfig(collect=False, analyze=False, dir='', func_map_cmd='')) gerazo wrote: > george.karpenkov wrote: > > Extensive `hasattr` usage is often a codesmell, and it hinders > > understanding. > > Firstly, shouldn't `args.ctu_phases.dir` be always available, judging from > > the code in `arguments.py`? > > Secondly, in what cases `args.ctu_phases` is not available when we are > > already going down the CTU code path? Shouldn't we throw an error at that > > point instead of creating a default configuration? > > (default-configurations-instead-of-crashing is also another way to > > introduce very subtle and hard-to-catch error modes) > It definitely needs more comments, so thanks, I will put them here. > There are two separate possibilities here for this code part: > 1. The user does not want CTU at all. In this case, no CTU phases are > switched on (collect and analyze), so no CTU code will run. This is why dir > has no importance in this case. > 2. CTU capabilities were not even detected, so the help and available > arguments about CTU are also missing. In this case we create a dummy config > telling that everything is switched off. > > The reason for using hasattr was that not having CTU capabilities is not an > exceptional condition, rather a matter of configuration. I felt that handling > this with an exception would be misleading. Right, but instead of doing (1) and (2) can we have a separate (maybe hidden from user) param called e.g. `ctu_enabled` which would explicitly communicate that? Comment at: tools/scan-build-py/libscanbuild/analyze.py:144 +if len(ast_files) == 1: +mangled_ast_pairs.append((mangled_name, ast_files.pop())) + gerazo wrote: > george.karpenkov wrote: > > Firstly, no need to modify the set in order to get the first element, just > > use `next(iter(ast_files))`. 
> > Secondly, when exactly do conflicting names happen? Is it a bug? Shouldn't > > the tool log such cases? > Nice catch, thanks. > For your second question: Unfortunately, it is not a bug, rather a misfeature > which we can't handle yet. There can be tricky build-systems where a single > file with different configurations is built multiple times, so the same > function signature will be present in multiple link units. The other option > is when two separate compile units have an exact same function signature, but > they are never linked together, so it is not a problem in the build itself. > In this case, the cross compilation unit functionality cannot tell exactly > which compilation unit to turn to for the function, because there are > multiple valid choices. In this case, we deliberately leave such functions > out to avoid potential problems. It deserves a comment I think. > In this case, we deliberately leave such functions out to avoid potential > problems
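The filtering described here — dropping any mangled name that maps to more than one AST file — can be sketched roughly in the shape George suggests, with a `defaultdict` and `next(iter(...))` instead of `pop()`. Function and variable names are illustrative, not the patch's exact code:

```python
from collections import defaultdict


def create_global_map(func_map_lines):
    """Keep only (mangled name, ast file) pairs with exactly one candidate.

    A name seen with two different AST files is ambiguous (e.g. the same
    signature built into several link units), so it is left out entirely;
    identical duplicate lines collapse into one entry and are kept.
    """
    mangled_to_asts = defaultdict(set)
    for line in func_map_lines:
        # Split only on the first space: AST paths may themselves contain spaces.
        mangled_name, ast_file = line.strip().split(' ', 1)
        mangled_to_asts[mangled_name].add(ast_file)
    return [(name, next(iter(files)))  # no need to pop() from the set
            for name, files in mangled_to_asts.items()
            if len(files) == 1]
```

Note this preserves the behavior the unit tests expect: exact duplicates are kept, conflicting mappings are left out, and a space in the source path survives the split.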
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
gerazo added a comment. The code modifications are coming soon (after doing some extensive testing) for the scan-build part.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:223
+    ctu_config = get_ctu_config(args)
+    if ctu_config.collect:
+        shutil.rmtree(ctu_config.dir, ignore_errors=True)

danielmarjamaki wrote:
> danielmarjamaki wrote:
> > not a big deal but I would use early exits in this function
> with "not a big deal" I mean: feel free to ignore my comment if you want to
> have it this way.

I've checked it through. The only place for an early exit now would be before the else. The 1st and 2nd ifs are in fact non-orthogonal.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:145
+            mangled_ast_pairs.append((mangled_name, ast_files.pop()))
+
+    return mangled_ast_pairs

george.karpenkov wrote:
> Overall, instead of creating a dictionary with multiple elements, and then
> converting to a list, it's much simpler to only add an element to
> `mangled_to_asts` when it is not already mapped to something (probably
> logging a message otherwise), and then just return `mangled_to_asts.items()`

The reason for the previous approach is that we need to count the number of occurrences of the different mappings and only let those pass through which don't have multiple variations.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:189
+    # Remove all temporary files
+    shutil.rmtree(fnmap_dir, ignore_errors=True)
+

gerazo wrote:
> george.karpenkov wrote:
> > Having an analysis tool remove files is scary, what if (maybe not in this
> > revision, but in a future iteration) a bug is introduced, and the tool
> > removes user code instead?
> > Why not just create a temporary directory with `tempfile.mkdtemp`, put all
> > temporary files there, and then simply iterate through them?
> > Then you would be able to get rid of the constant `CPU_TEMP_FNMAP_FOLDER`
> > entirely, and OS would be responsible for cleanup.
> Yes, you are right. We are essentially using a temp dir. Because of its size
> we first had to put it next to the project (not on the tmp drive, for
> instance), and for debugging purposes we gave a name to it. Still, it can be
> done with mkdtemp as well.

Finally, I came to the conclusion that mkdtemp would not be better than the current solution. In order for other threads to find the directory we created, we need a designated name. Suffixing it with a generated name would further complicate things, as we must not allow multiple concurrent runs here. The current solution is more robust from this point of view.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:241
+        run_analyzer_parallel(args)
+        shutil.rmtree(ctu_config.dir, ignore_errors=True)
+    else:

george.karpenkov wrote:
> Same as the comment above about removing folders. Also it seems like there
> should be a way to remove redundancy in the `if collect / remove tree` block
> repeated twice.

The previous call for data removal happens because the user asked for a collect run, so we clean the data to do a recollection. This second one happens because the user asked for a full recollection and analysis run all in one; here we destroy the temp data for the user's convenience. This happens after, not before as previously. The default behavior is to do this when the user uses the tool the easy way (collect and analyze all in one), and we intentionally keep collection data if the user only asks for a collect or an analyze run. With this, the user can use a collect run's results for multiple analyze runs. This is written in the command line help. I will definitely put comments here to explain.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:296
+        'command': [execution.cmd[0], '-c'] + compilation.flags,
+        'ctu': json.loads(os.getenv('ANALYZE_BUILD_CTU')) }

george.karpenkov wrote:
> Again, is it possible to avoid JSON-over-environment-variables?

There is another thing against changing this. The env-variable interface here is currently used by the intercept-build, analyze-build and scan-build tools as well. In order to drop JSON, we would need to change those tools too; that would definitely be a separate patch.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:561
+    except OSError:
+        pass
+    ast_command = [opts['clang'], '-emit-ast']

gerazo wrote:
> george.karpenkov wrote:
> > `try/except/pass` is almost always bad.
> > When can the error occur? Why are we ignoring it?
> I think this code is redundant with the if above.

Here the folders are created on demand. Because they are created in parallel by multiple processes, there is a small chance that another process has already created the folder between the isdir check and the makedirs call.
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
gerazo added a comment. Thanks George for the review. I will start working on the code right away. I've tried to answer the simpler cases.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:44
+CTU_FUNCTION_MAP_FILENAME = 'externalFnMap.txt'
+CTU_TEMP_FNMAP_FOLDER = 'tmpExternalFnMaps'

george.karpenkov wrote:
> What would happen when multiple analyzer runs are launched concurrently in
> the same directory? Would they race on this file?

Yes and no. The 1st, collect part of CTU creates this file by aggregating all data from the build system, while the 2nd part, which does the analysis itself, only reads it. Multiple analyses can use this file simultaneously without problem. However, multiple collect phases launched on a system do not make sense. In this case, the later one would write over the previous results with the same data.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:64
     if need_analyzer(args.build):
-        run_analyzer_parallel(args)
+        run_analyzer_with_ctu(args)
     else:

george.karpenkov wrote:
> `run_analyzer_with_ctu` is an unfortunate function name, since it may or may
> not launch CTU depending on the passed arguments. Could we just call it
> `run_analyzer`?

We have another run_analyzer method already, but still, it is a good idea; I will come up with something better.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:103
+def prefix_with(constant, pieces):
+    """ From a sequence create another sequence where every second element

george.karpenkov wrote:
> Actually I would prefer a separate NFC PR just moving this function. This PR
> is already too complicated as it is.

Yes. The only reason for the move was to make it testable. However, we need more tests, as you wrote down below.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:120
+                      func_map_cmd=args.func_map_cmd)
+        if hasattr(args, 'ctu_phases') and hasattr(args.ctu_phases, 'dir')
+        else CtuConfig(collect=False, analyze=False, dir='', func_map_cmd=''))

george.karpenkov wrote:
> Extensive `hasattr` usage is often a code smell, and it hinders understanding.
> Firstly, shouldn't `args.ctu_phases.dir` be always available, judging from
> the code in `arguments.py`?
> Secondly, in what cases is `args.ctu_phases` not available when we are
> already going down the CTU code path? Shouldn't we throw an error at that
> point instead of creating a default configuration?
> (default-configurations-instead-of-crashing is also another way to introduce
> very subtle and hard-to-catch error modes)

It definitely needs more comments, so thanks, I will put them here. There are two separate possibilities for this code part:

1. The user does not want CTU at all. In this case, no CTU phases are switched on (collect and analyze), so no CTU code will run. This is why dir has no importance in this case.
2. CTU capabilities were not even detected, so the help and available arguments about CTU are also missing. In this case we create a dummy config telling that everything is switched off.

The reason for using hasattr was that not having CTU capabilities is not an exceptional condition, but rather a matter of configuration. I felt that handling this with an exception would be misleading.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:128
+    A function map contains the id of a function (mangled name) and the
+    originating source (the corresponding AST file) name."""
+

george.karpenkov wrote:
> Could you also specify what `func_map_lines` is (file? list? of strings?) in
> the docstring? There's specific Sphinx syntax for that. Otherwise it is hard
> to follow.

Thanks, I'll do it.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:144
+        if len(ast_files) == 1:
+            mangled_ast_pairs.append((mangled_name, ast_files.pop()))
+

george.karpenkov wrote:
> Firstly, no need to modify the set in order to get the first element, just
> use `next(iter(ast_files))`.
> Secondly, when exactly do conflicting names happen? Is it a bug? Shouldn't
> the tool log such cases?

Nice catch, thanks. For your second question: unfortunately, it is not a bug, but rather a misfeature which we can't handle yet. There can be tricky build systems where a single file is built multiple times with different configurations, so the same function signature will be present in multiple link units. The other option is when two separate compile units have an exactly identical function signature, but they are never linked together, so it is not a problem in the build itself. In this case, the cross compilation unit functionality cannot tell exactly which compilation unit to turn to for the function, because there are multiple valid choices. In this case, we deliberately leave such functions out to avoid potential problems.
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
george.karpenkov requested changes to this revision. george.karpenkov added inline comments. This revision now requires changes to proceed. Comment at: tools/scan-build-py/libscanbuild/analyze.py:44 +CTU_FUNCTION_MAP_FILENAME = 'externalFnMap.txt' +CTU_TEMP_FNMAP_FOLDER = 'tmpExternalFnMaps' What would happen when multiple analyzer runs are launched concurrently in the same directory? Would they race on this file? Comment at: tools/scan-build-py/libscanbuild/analyze.py:64 if need_analyzer(args.build): -run_analyzer_parallel(args) +run_analyzer_with_ctu(args) else: `run_analyzer_with_ctu` is an unfortunate function name, since it may or may not launch CTU depending on the passed arguments. Could we just call it `run_analyzer`? Comment at: tools/scan-build-py/libscanbuild/analyze.py:103 +def prefix_with(constant, pieces): +""" From a sequence create another sequence where every second element Actually I would prefer a separate NFC PR just moving this function. This PR is already too complicated as it is. Comment at: tools/scan-build-py/libscanbuild/analyze.py:120 + func_map_cmd=args.func_map_cmd) +if hasattr(args, 'ctu_phases') and hasattr(args.ctu_phases, 'dir') +else CtuConfig(collect=False, analyze=False, dir='', func_map_cmd='')) Extensive `hasattr` usage is often a codesmell, and it hinders understanding. Firstly, shouldn't `args.ctu_phases.dir` be always available, judging from the code in `arguments.py`? Secondly, in what cases `args.ctu_phases` is not available when we are already going down the CTU code path? Shouldn't we throw an error at that point instead of creating a default configuration? (default-configurations-instead-of-crashing is also another way to introduce very subtle and hard-to-catch error modes) Comment at: tools/scan-build-py/libscanbuild/analyze.py:128 +A function map contains the id of a function (mangled name) and the +originating source (the corresponding AST file) name.""" + Could you also specify what `func_map_lines` is (file? list? 
of strings?) in the docstring? There's specific Sphinx syntax for that. Otherwise it is hard to follow.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:138
+        else:
+            mangled_to_asts[mangled_name].add(ast_file)
+

Could be improved with `defaultdict`:

```
mangled_to_asts = defaultdict(set)
...
mangled_to_asts[mangled_name].add(ast_file)
```

Comment at: tools/scan-build-py/libscanbuild/analyze.py:144
+        if len(ast_files) == 1:
+            mangled_ast_pairs.append((mangled_name, ast_files.pop()))
+

Firstly, no need to modify the set in order to get the first element, just use `next(iter(ast_files))`. Secondly, when exactly do conflicting names happen? Is it a bug? Shouldn't the tool log such cases?

Comment at: tools/scan-build-py/libscanbuild/analyze.py:145
+            mangled_ast_pairs.append((mangled_name, ast_files.pop()))
+
+    return mangled_ast_pairs

Overall, instead of creating a dictionary with multiple elements, and then converting to a list, it's much simpler to only add an element to `mangled_to_asts` when it is not already mapped to something (probably logging a message otherwise), and then just return `mangled_to_asts.items()`

Comment at: tools/scan-build-py/libscanbuild/analyze.py:160
+def generate_func_map_lines(fnmap_dir):
+    """ Iterate over all lines of input files in random order. """
+

Firstly, is `glob.glob` actually random, as the docstring is saying? If yes, can we make it not be random (e.g. with `sorted`), as dealing with randomized input makes tracking down bugs so much harder?

Comment at: tools/scan-build-py/libscanbuild/analyze.py:168
+
+def write_global_map(ctudir, arch, mangled_ast_pairs):
+    """ Write (mangled function name, ast file) pairs into final file. """

If `write_global_map` is a closure, please use the fact that it would capture `ctudir` from the outer scope, and remove it from the argument list. That would make it more obvious that it does not change between invocations.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:189
+    # Remove all temporary files
+    shutil.rmtree(fnmap_dir, ignore_errors=True)
+

Having an analysis tool remove files is scary: what if (maybe not in this revision, but in a future iteration) a bug is introduced, and the tool removes user code instead? Why not just create a temporary directory with `tempfile.mkdtemp`, put all temporary files there, and then simply iterate through them? Then you would be able to get rid of the constant `CPU_TEMP_FNMAP_FOLDER` entirely, and the OS would be responsible for cleanup.
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 121700. xazax.hun added a comment. - Rebased on ToT https://reviews.llvm.org/D30691 Files: include/clang/StaticAnalyzer/Core/AnalyzerOptions.h include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h lib/StaticAnalyzer/Core/AnalyzerOptions.cpp lib/StaticAnalyzer/Core/CMakeLists.txt lib/StaticAnalyzer/Core/CallEvent.cpp lib/StaticAnalyzer/Core/ExprEngine.cpp lib/StaticAnalyzer/Core/PathDiagnostic.cpp lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp lib/StaticAnalyzer/Frontend/CMakeLists.txt test/Analysis/Inputs/ctu-chain.cpp test/Analysis/Inputs/ctu-other.cpp test/Analysis/Inputs/externalFnMap.txt test/Analysis/ctu-main.cpp tools/scan-build-py/libscanbuild/__init__.py tools/scan-build-py/libscanbuild/analyze.py tools/scan-build-py/libscanbuild/arguments.py tools/scan-build-py/libscanbuild/clang.py tools/scan-build-py/libscanbuild/report.py tools/scan-build-py/tests/unit/test_analyze.py tools/scan-build-py/tests/unit/test_clang.py Index: tools/scan-build-py/tests/unit/test_clang.py === --- tools/scan-build-py/tests/unit/test_clang.py +++ tools/scan-build-py/tests/unit/test_clang.py @@ -92,3 +92,16 @@ self.assertEqual('Checker One description', result.get('checker.one')) self.assertTrue('checker.two' in result) self.assertEqual('Checker Two description', result.get('checker.two')) + + +class ClangIsCtuCapableTest(unittest.TestCase): +def test_ctu_not_found(self): +is_ctu = sut.is_ctu_capable('not-found-clang', +'not-found-clang-func-mapping') +self.assertFalse(is_ctu) + + +class ClangGetTripleArchTest(unittest.TestCase): +def test_arch_is_not_empty(self): +arch = sut.get_triple_arch(['clang', '-E', '-'], '.') +self.assertTrue(len(arch) > 0) Index: tools/scan-build-py/tests/unit/test_analyze.py === --- tools/scan-build-py/tests/unit/test_analyze.py +++ tools/scan-build-py/tests/unit/test_analyze.py @@ -4,12 +4,12 @@ # This file is distributed under the University of Illinois Open Source # License. See LICENSE.TXT for details. 
-import libear
-import libscanbuild.analyze as sut
 import unittest
 import re
 import os
 import os.path
+import libear
+import libscanbuild.analyze as sut


 class ReportDirectoryTest(unittest.TestCase):
@@ -333,3 +333,83 @@

     def test_method_exception_not_caught(self):
         self.assertRaises(Exception, method_exception_from_inside, dict())
+
+
+class PrefixWithTest(unittest.TestCase):
+
+    def test_gives_empty_on_empty(self):
+        res = sut.prefix_with(0, [])
+        self.assertFalse(res)
+
+    def test_interleaves_prefix(self):
+        res = sut.prefix_with(0, [1, 2, 3])
+        self.assertListEqual([0, 1, 0, 2, 0, 3], res)
+
+
+class MergeCtuMapTest(unittest.TestCase):
+
+    def test_no_map_gives_empty(self):
+        pairs = sut.create_global_ctu_function_map([])
+        self.assertFalse(pairs)
+
+    def test_multiple_maps_merged(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun3#I# ast/fun3.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun3#I#', 'ast/fun3.c.ast') in pairs)
+        self.assertEqual(3, len(pairs))
+
+    def test_not_unique_func_left_out(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun7.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun7.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+    def test_duplicates_are_kept(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun1.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(2, len(pairs))
+
+    def test_space_handled_in_source(self):
+        concat_map = ['c:@F@fun1#I# ast/f un.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/f un.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+
+class FuncMapSrcToAstTest(unittest.TestCase):
+
+    def test_empty_gives_empty(self):
+
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 119458. xazax.hun marked an inline comment as done. xazax.hun added a comment. - Update the scan-build part to work correctly with the accepted version of libCrossTU https://reviews.llvm.org/D30691 Files: include/clang/StaticAnalyzer/Core/AnalyzerOptions.h include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h lib/StaticAnalyzer/Core/AnalyzerOptions.cpp lib/StaticAnalyzer/Core/CMakeLists.txt lib/StaticAnalyzer/Core/CallEvent.cpp lib/StaticAnalyzer/Core/ExprEngine.cpp lib/StaticAnalyzer/Core/PathDiagnostic.cpp lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp lib/StaticAnalyzer/Frontend/CMakeLists.txt test/Analysis/Inputs/ctu-chain.cpp test/Analysis/Inputs/ctu-other.cpp test/Analysis/Inputs/externalFnMap.txt test/Analysis/ctu-main.cpp tools/scan-build-py/libscanbuild/__init__.py tools/scan-build-py/libscanbuild/analyze.py tools/scan-build-py/libscanbuild/arguments.py tools/scan-build-py/libscanbuild/clang.py tools/scan-build-py/libscanbuild/report.py tools/scan-build-py/tests/unit/test_analyze.py tools/scan-build-py/tests/unit/test_clang.py Index: tools/scan-build-py/tests/unit/test_clang.py === --- tools/scan-build-py/tests/unit/test_clang.py +++ tools/scan-build-py/tests/unit/test_clang.py @@ -92,3 +92,16 @@ self.assertEqual('Checker One description', result.get('checker.one')) self.assertTrue('checker.two' in result) self.assertEqual('Checker Two description', result.get('checker.two')) + + +class ClangIsCtuCapableTest(unittest.TestCase): +def test_ctu_not_found(self): +is_ctu = sut.is_ctu_capable('not-found-clang', +'not-found-clang-func-mapping') +self.assertFalse(is_ctu) + + +class ClangGetTripleArchTest(unittest.TestCase): +def test_arch_is_not_empty(self): +arch = sut.get_triple_arch(['clang', '-E', '-'], '.') +self.assertTrue(len(arch) > 0) Index: tools/scan-build-py/tests/unit/test_analyze.py === --- tools/scan-build-py/tests/unit/test_analyze.py +++ tools/scan-build-py/tests/unit/test_analyze.py @@ -4,12 +4,12 
@@
 # This file is distributed under the University of Illinois Open Source
 # License. See LICENSE.TXT for details.

-import libear
-import libscanbuild.analyze as sut
 import unittest
 import re
 import os
 import os.path
+import libear
+import libscanbuild.analyze as sut


 class ReportDirectoryTest(unittest.TestCase):
@@ -333,3 +333,83 @@

     def test_method_exception_not_caught(self):
         self.assertRaises(Exception, method_exception_from_inside, dict())
+
+
+class PrefixWithTest(unittest.TestCase):
+
+    def test_gives_empty_on_empty(self):
+        res = sut.prefix_with(0, [])
+        self.assertFalse(res)
+
+    def test_interleaves_prefix(self):
+        res = sut.prefix_with(0, [1, 2, 3])
+        self.assertListEqual([0, 1, 0, 2, 0, 3], res)
+
+
+class MergeCtuMapTest(unittest.TestCase):
+
+    def test_no_map_gives_empty(self):
+        pairs = sut.create_global_ctu_function_map([])
+        self.assertFalse(pairs)
+
+    def test_multiple_maps_merged(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun3#I# ast/fun3.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun3#I#', 'ast/fun3.c.ast') in pairs)
+        self.assertEqual(3, len(pairs))
+
+    def test_not_unique_func_left_out(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun7.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertFalse(('c:@F@fun1#I#', 'ast/fun7.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(1, len(pairs))
+
+    def test_duplicates_are_kept(self):
+        concat_map = ['c:@F@fun1#I# ast/fun1.c.ast',
+                      'c:@F@fun2#I# ast/fun2.c.ast',
+                      'c:@F@fun1#I# ast/fun1.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/fun1.c.ast') in pairs)
+        self.assertTrue(('c:@F@fun2#I#', 'ast/fun2.c.ast') in pairs)
+        self.assertEqual(2, len(pairs))
+
+    def test_space_handled_in_source(self):
+        concat_map = ['c:@F@fun1#I# ast/f un.c.ast']
+        pairs = sut.create_global_ctu_function_map(concat_map)
+        self.assertTrue(('c:@F@fun1#I#', 'ast/f un.c.ast') in pairs)
+        self.assertEqual(1,
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
dkrupp requested changes to this revision. dkrupp added a comment. This revision now requires changes to proceed. Please fix the incompatibility between analyze-build and lib/CrossTU in the format of the externalFnMap.txt mapping file.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:535
+        ast_path = os.path.join("ast", triple_arch, path + ".ast")
+        func_ast_list.append(mangled_name + "@" + triple_arch + " " + ast_path)
+    return func_ast_list

There is an incompatibility between this scan-build (analyze-build, actually) implementation and the new lib/CrossTU library. CrossTranslationUnitContext::loadExternalAST(StringRef LookupName, StringRef CrossTUDir, StringRef IndexName) expects externalFnMap.txt to be in "functionName astFilename" format; however, we currently generate "functionName@arch astFilename" lines here.

One possible fix could be to create one externalFnMap.txt index per arch:

/ast/x86_64/externalFnMap.txt
/ast/ppc64/externalFnMap.txt
etc.

and invoke the clang analysis with the architecture-specific map directory, e.g. ctu-dir=/ast/x86_64. This would then work if the to-be-analyzed source code is cross-compiled for multiple architectures.

It would also be useful to add a test case to check that the map file and ctu-dir content generated by analyze-build are compatible.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:585
+        + [opts['file']]
+    triple_arch = get_triple_arch(cmd, cwd)
+    generate_ast(triple_arch)

Maybe we could use the full target triple for distinguishing the AST binaries, not only the architecture part. The sys part, for example, is probably important too, and a "win32" AST may not be compatible with a "linux" AST.

https://reviews.llvm.org/D30691
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added a comment. In https://reviews.llvm.org/D30691#878830, @r.stahl wrote:

> For my purposes I replaced the return statement of the
> compareCrossTUSourceLocs function with:
>
>   return XL.getFileID() < YL.getFileID();
>
> A more correct fix would create only one unique diagnostic for both cases.

Thank you for the report; I could reproduce this after modifying the null dereference checker. My fear about using file IDs is that I don't know whether they are stable: in subsequent runs, different paths might be chosen by the analyzer, and this could be confusing to the user. I will think about a workaround that is both stable and solves this assertion.

I see two possible ways to do the proper fix. One is to check explicitly for this case, when the same function appears in multiple translation units. A better approach would be to have the ASTImporter handle this case. I think the proper fix is better addressed in a separate patch.

https://reviews.llvm.org/D30691
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 116502. xazax.hun added a comment. Herald added a subscriber: baloghadamsoftware. - Fixed an issue with source locations - Updated to latest trunk https://reviews.llvm.org/D30691 Files: include/clang/StaticAnalyzer/Core/AnalyzerOptions.h include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h lib/StaticAnalyzer/Core/AnalyzerOptions.cpp lib/StaticAnalyzer/Core/CMakeLists.txt lib/StaticAnalyzer/Core/CallEvent.cpp lib/StaticAnalyzer/Core/ExprEngine.cpp lib/StaticAnalyzer/Core/PathDiagnostic.cpp lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp lib/StaticAnalyzer/Frontend/CMakeLists.txt test/Analysis/Inputs/ctu-chain.cpp test/Analysis/Inputs/ctu-other.cpp test/Analysis/Inputs/externalFnMap.txt test/Analysis/ctu-main.cpp tools/scan-build-py/libscanbuild/__init__.py tools/scan-build-py/libscanbuild/analyze.py tools/scan-build-py/libscanbuild/arguments.py tools/scan-build-py/libscanbuild/clang.py tools/scan-build-py/libscanbuild/report.py tools/scan-build-py/tests/unit/test_analyze.py tools/scan-build-py/tests/unit/test_clang.py Index: tools/scan-build-py/tests/unit/test_clang.py === --- tools/scan-build-py/tests/unit/test_clang.py +++ tools/scan-build-py/tests/unit/test_clang.py @@ -92,3 +92,16 @@ self.assertEqual('Checker One description', result.get('checker.one')) self.assertTrue('checker.two' in result) self.assertEqual('Checker Two description', result.get('checker.two')) + + +class ClangIsCtuCapableTest(unittest.TestCase): +def test_ctu_not_found(self): +is_ctu = sut.is_ctu_capable('not-found-clang', +'not-found-clang-func-mapping') +self.assertFalse(is_ctu) + + +class ClangGetTripleArchTest(unittest.TestCase): +def test_arch_is_not_empty(self): +arch = sut.get_triple_arch(['clang', '-E', '-'], '.') +self.assertTrue(len(arch) > 0) Index: tools/scan-build-py/tests/unit/test_analyze.py === --- tools/scan-build-py/tests/unit/test_analyze.py +++ tools/scan-build-py/tests/unit/test_analyze.py @@ -4,12 +4,12 @@ # This file is 
distributed under the University of Illinois Open Source # License. See LICENSE.TXT for details. -import libear -import libscanbuild.analyze as sut import unittest import re import os import os.path +import libear +import libscanbuild.analyze as sut class ReportDirectoryTest(unittest.TestCase): @@ -333,3 +333,83 @@ def test_method_exception_not_caught(self): self.assertRaises(Exception, method_exception_from_inside, dict()) + + +class PrefixWithTest(unittest.TestCase): + +def test_gives_empty_on_empty(self): +res = sut.prefix_with(0, []) +self.assertFalse(res) + +def test_interleaves_prefix(self): +res = sut.prefix_with(0, [1, 2, 3]) +self.assertListEqual([0, 1, 0, 2, 0, 3], res) + + +class MergeCtuMapTest(unittest.TestCase): + +def test_no_map_gives_empty(self): +pairs = sut.create_global_ctu_function_map([]) +self.assertFalse(pairs) + +def test_multiple_maps_merged(self): +concat_map = ['_Z1fun1i@x86_64 ast/x86_64/fun1.c.ast', + '_Z1fun2i@x86_64 ast/x86_64/fun2.c.ast', + '_Z1fun3i@x86_64 ast/x86_64/fun3.c.ast'] +pairs = sut.create_global_ctu_function_map(concat_map) +self.assertTrue(('_Z1fun1i@x86_64', 'ast/x86_64/fun1.c.ast') in pairs) +self.assertTrue(('_Z1fun2i@x86_64', 'ast/x86_64/fun2.c.ast') in pairs) +self.assertTrue(('_Z1fun3i@x86_64', 'ast/x86_64/fun3.c.ast') in pairs) +self.assertEqual(3, len(pairs)) + +def test_not_unique_func_left_out(self): +concat_map = ['_Z1fun1i@x86_64 ast/x86_64/fun1.c.ast', + '_Z1fun2i@x86_64 ast/x86_64/fun2.c.ast', + '_Z1fun1i@x86_64 ast/x86_64/fun7.c.ast'] +pairs = sut.create_global_ctu_function_map(concat_map) +self.assertFalse(('_Z1fun1i@x86_64', 'ast/x86_64/fun1.c.ast') in pairs) +self.assertFalse(('_Z1fun1i@x86_64', 'ast/x86_64/fun7.c.ast') in pairs) +self.assertTrue(('_Z1fun2i@x86_64', 'ast/x86_64/fun2.c.ast') in pairs) +self.assertEqual(1, len(pairs)) + +def test_duplicates_are_kept(self): +concat_map = ['_Z1fun1i@x86_64 ast/x86_64/fun1.c.ast', + '_Z1fun2i@x86_64 ast/x86_64/fun2.c.ast', + '_Z1fun1i@x86_64 
ast/x86_64/fun1.c.ast'] +pairs = sut.create_global_ctu_function_map(concat_map) +self.assertTrue(('_Z1fun1i@x86_64', 'ast/x86_64/fun1.c.ast') in pairs) +self.assertTrue(('_Z1fun2i@x86_64', 'ast/x86_64/fun2.c.ast') in pairs) +self.assertEqual(2, len(pairs)) + +def test_space_handled_in_source(self): +concat_map = ['_Z1fun1i@x86_64 ast/x86_64/f un.c.ast'] +
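The merge semantics exercised by the tests above (exact duplicate lines collapse to one entry; a name mapped to two different AST files is dropped entirely; paths containing spaces survive) can be sketched in Python. This `create_global_ctu_function_map` is a hypothetical reconstruction from the tests alone, not the in-tree implementation:

```python
from collections import Counter


def create_global_ctu_function_map(concat_map):
    """Merge per-TU 'name path' map lines into a global list of
    (mangled_name, ast_file) pairs, per the test expectations above."""
    pairs = set()
    for line in concat_map:
        # Split on the first space only, so AST paths containing
        # spaces (e.g. 'ast/x86_64/f un.c.ast') stay intact.
        name, _, ast_file = line.strip().partition(' ')
        pairs.add((name, ast_file))
    # A name mapped to more than one distinct AST file is ambiguous
    # and is left out entirely; exact duplicates were already merged
    # by the set above.
    counts = Counter(name for name, _ in pairs)
    return sorted(p for p in pairs if counts[p[0]] == 1)
```

Under this reading, `test_not_unique_func_left_out` passes because `_Z1fun1i@x86_64` maps to two different files and is discarded, while `test_duplicates_are_kept` passes because identical lines collapse into a single surviving pair.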
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added a comment. In https://reviews.llvm.org/D30691#878711, @r.stahl wrote: > If either of the FullSourceLocs is a MacroID, the call > SM.getFileEntryForID(XL.getFileID()) will return a null pointer. The null > pointer will crash the program when attempting to call ->getName() on it. Thank you for the report and the detailed analysis! I did not want to get the expansion location since the original code did not query that either, so for now I just handle the case when I can not query a file entry. This case can occur even when no macros are in the code, for example when a PCH is referring to a nonexistent file. https://reviews.llvm.org/D30691 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
r.stahl added a comment. In a similar case, static inline functions are an issue.

inc.h:
    void foo();
    static inline void wow() { int a = *(int*)0; }

main.c:
    #include "inc.h"
    void moo() { foo(); wow(); }

other.c:
    #include "inc.h"
    void foo() { wow(); }

The inline function is inlined into each calling AST as different AST objects. This causes the PathDiagnostics to be distinct, while pointing to the exact same source location. When the compareCrossTUSourceLocs function tries to compare the FullSourceLocs, it cannot find a difference and FlushDiagnostics will assert on an erroneous compare function. For my purposes I replaced the return statement of the compareCrossTUSourceLocs function with:

    return XL.getFileID() < YL.getFileID();

A more correct fix would create only one unique diagnostic for both cases. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
r.stahl added a comment. While testing this I stumbled upon a crash with the following test case:

inc.h:
    #define BASE ((int*)0)
    void foo();

main.c:
    #include "inc.h"
    void moo() { int a = BASE[0]; foo(); }

other.c:
    #include "inc.h"
    void foo() { int a = BASE[0]; }

Note that I used a custom checker that did not stop on the path like the DerefChecker would here. I did not know how to reproduce it with official checkers, but the issue should be understandable without reproduction. With the given test, a checker may produce two results for the null dereference in moo() and foo(). When analyzing main.c they will both be found and therefore sorted with compareCrossTUSourceLocs in PathDiagnostic.cpp. If either of the FullSourceLocs is a MacroID, the call SM.getFileEntryForID(XL.getFileID()) will return a null pointer. The null pointer will crash the program when attempting to call ->getName() on it. My solution was to add the following lines before the .getFileID() calls:

    XL = XL.getExpansionLoc();
    YL = YL.getExpansionLoc();

Comment at: lib/StaticAnalyzer/Core/PathDiagnostic.cpp:391

    +    return XL.isBeforeInTranslationUnitThan(YL);
    +  return SM.getFileEntryForID(XL.getFileID())->getName() <
    +         SM.getFileEntryForID(YL.getFileID())->getName();

see comment https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
danielmarjamaki accepted this revision. danielmarjamaki added inline comments. This revision is now accepted and ready to land.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:165

    +with open(filename, 'r') as in_file:
    +    for line in in_file:
    +        yield line

whisperity wrote:
> danielmarjamaki wrote:
> > I believe you can write:
> >
> >     for line in open(filename, 'r'):
> Do we want to rely on the interpreter implementation for when the file is closed?
> If
>
> ```
> for line in open(filename, 'r'):
>     something()
> ```
>
> is used, the file handle will be closed based on garbage collection rules. Having this handle disposed after the iteration is true for the stock CPython implementation, but it is nonetheless an implementation-specific approach.
> Whereas using `with` will explicitly close the file handle on the spot, no matter what.

ok, I did not know that. feel free to ignore my comment.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:172

    +extern_fns_map_file = os.path.join(ctudir, CTU_FUNCTION_MAP_FILENAME)
    +with open(extern_fns_map_file, 'w') as out_file:
    +    for mangled_name, ast_file in mangled_ast_pairs:

whisperity wrote:
> danielmarjamaki wrote:
> > this 'with' seems redundant. I suggest an assignment and then less
> > indentation will be needed below
> I don't seem to understand what you want to assign to what.

I did not consider the garbage collection.
I assumed that out_file would always be closed when it went out of scope, and then this would require less indentation:

    out_file = open(extern_fns_map_file, 'w')
    for mangled_name, ast_file in mangled_ast_pairs:
        out_file.write('%s %s\n' % (mangled_name, ast_file))

Comment at: tools/scan-build-py/libscanbuild/analyze.py:223

    +ctu_config = get_ctu_config(args)
    +if ctu_config.collect:
    +    shutil.rmtree(ctu_config.dir, ignore_errors=True)

danielmarjamaki wrote:
> not a big deal but I would use early exits in this function

With "not a big deal" I mean: feel free to ignore my comment if you want to have it this way. https://reviews.llvm.org/D30691
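The point being debated above — that `with` deterministically closes the handle, while a bare `open()` leaves the close to the interpreter's garbage collector — can be demonstrated directly. This is a standalone illustration, not code from the patch:

```python
import os
import tempfile

# Create a small file to iterate over.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('line one\nline two\n')

# Style 1: bare open(). Whether the handle is closed after the loop is
# an interpreter detail: CPython closes it once the refcount drops to
# zero, but other implementations may defer it to a later GC cycle.
handle = open(path, 'r')
lines_gc = [line.rstrip('\n') for line in handle]
handle.close()  # an explicit close is needed to be portable

# Style 2: `with`. The handle is guaranteed to be closed when the
# block exits, on every conforming Python implementation.
with open(path, 'r') as in_file:
    lines_ctx = [line.rstrip('\n') for line in in_file]
assert in_file.closed  # always true here, by the context-manager protocol

os.remove(path)
print(lines_ctx)  # → ['line one', 'line two']
```

Both styles read the same lines; only the close-timing guarantee differs, which is exactly why the reviewers settled on `with`.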
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
whisperity added a comment. The Python code here still uses `mangled name` in its wording. Does this mean this patch is yet to be updated with the USR management in the parent patch?

Comment at: tools/scan-build-py/libscanbuild/analyze.py:165

    +with open(filename, 'r') as in_file:
    +    for line in in_file:
    +        yield line

danielmarjamaki wrote:
> I believe you can write:
>
>     for line in open(filename, 'r'):

Do we want to rely on the interpreter implementation for when the file is closed? If

```
for line in open(filename, 'r'):
    something()
```

is used, the file handle will be closed based on garbage collection rules. Having this handle disposed after the iteration is true for the stock CPython implementation, but it is nonetheless an implementation-specific approach. Whereas using `with` will explicitly close the file handle on the spot, no matter what.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:172

    +extern_fns_map_file = os.path.join(ctudir, CTU_FUNCTION_MAP_FILENAME)
    +with open(extern_fns_map_file, 'w') as out_file:
    +    for mangled_name, ast_file in mangled_ast_pairs:

danielmarjamaki wrote:
> this 'with' seems redundant. I suggest an assignment and then less
> indentation will be needed below

I don't seem to understand what you want to assign to what. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
danielmarjamaki added inline comments.

Comment at: tools/scan-build-py/libscanbuild/analyze.py:165

    +with open(filename, 'r') as in_file:
    +    for line in in_file:
    +        yield line

I believe you can write:

    for line in open(filename, 'r'):

Comment at: tools/scan-build-py/libscanbuild/analyze.py:172

    +extern_fns_map_file = os.path.join(ctudir, CTU_FUNCTION_MAP_FILENAME)
    +with open(extern_fns_map_file, 'w') as out_file:
    +    for mangled_name, ast_file in mangled_ast_pairs:

this 'with' seems redundant. I suggest an assignment and then less indentation will be needed below

Comment at: tools/scan-build-py/libscanbuild/analyze.py:223

    +ctu_config = get_ctu_config(args)
    +if ctu_config.collect:
    +    shutil.rmtree(ctu_config.dir, ignore_errors=True)

not a big deal but I would use early exits in this function

Comment at: tools/scan-build-py/libscanbuild/clang.py:177

    +arch = ""
    +i = 0
    +while i < len(cmd) and cmd[i] != "-triple":

I am guessing that you can use cmd.find() instead of the loop https://reviews.llvm.org/D30691
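One wrinkle with the `cmd.find()` suggestion: `cmd` is a list of argv tokens, and lists have no `find()` method; the closest equivalent is `list.index()`, which raises `ValueError` when the flag is absent. A sketch of the loop rewritten that way — `get_triple_arch_from_cmd` is an illustrative name, not the patch's actual API:

```python
def get_triple_arch_from_cmd(cmd):
    """Extract the architecture component of the '-triple' argument
    from a -cc1 command line, e.g.
    ['clang', '-cc1', '-triple', 'x86_64-unknown-linux-gnu', ...].
    Returns '' when no triple is present, matching the original
    loop's empty-string default."""
    try:
        pos = cmd.index('-triple')  # ValueError if the flag is absent
        # The triple value follows the flag; its first '-'-separated
        # field is the architecture (e.g. 'x86_64').
        return cmd[pos + 1].split('-')[0]
    except (ValueError, IndexError):
        # IndexError covers a malformed command line that ends
        # with a bare '-triple'.
        return ''
```

The `try`/`except` replaces the explicit bounds check of the `while` loop while preserving the same fallback behaviour.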
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 105053. xazax.hun added a comment. - Patch scan-build instead of using custom scripts - Rebase patch based on the proposed LibTooling CTU code https://reviews.llvm.org/D30691 Files: include/clang/StaticAnalyzer/Core/AnalyzerOptions.h include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h lib/StaticAnalyzer/Core/AnalyzerOptions.cpp lib/StaticAnalyzer/Core/CMakeLists.txt lib/StaticAnalyzer/Core/CallEvent.cpp lib/StaticAnalyzer/Core/ExprEngine.cpp lib/StaticAnalyzer/Core/PathDiagnostic.cpp lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp lib/StaticAnalyzer/Frontend/CMakeLists.txt test/Analysis/Inputs/ctu-chain.cpp test/Analysis/Inputs/ctu-other.cpp test/Analysis/Inputs/externalFnMap.txt test/Analysis/ctu-main.cpp tools/scan-build-py/libscanbuild/__init__.py tools/scan-build-py/libscanbuild/analyze.py tools/scan-build-py/libscanbuild/arguments.py tools/scan-build-py/libscanbuild/clang.py tools/scan-build-py/libscanbuild/report.py tools/scan-build-py/tests/unit/test_analyze.py tools/scan-build-py/tests/unit/test_clang.py Index: tools/scan-build-py/tests/unit/test_clang.py === --- tools/scan-build-py/tests/unit/test_clang.py +++ tools/scan-build-py/tests/unit/test_clang.py @@ -92,3 +92,16 @@ self.assertEqual('Checker One description', result.get('checker.one')) self.assertTrue('checker.two' in result) self.assertEqual('Checker Two description', result.get('checker.two')) + + +class ClangIsCtuCapableTest(unittest.TestCase): +def test_ctu_not_found(self): +is_ctu = sut.is_ctu_capable('not-found-clang', +'not-found-clang-func-mapping') +self.assertFalse(is_ctu) + + +class ClangGetTripleArchTest(unittest.TestCase): +def test_arch_is_not_empty(self): +arch = sut.get_triple_arch(['clang', '-E', '-'], '.') +self.assertTrue(len(arch) > 0) Index: tools/scan-build-py/tests/unit/test_analyze.py === --- tools/scan-build-py/tests/unit/test_analyze.py +++ 
tools/scan-build-py/tests/unit/test_analyze.py @@ -4,12 +4,12 @@ # This file is distributed under the University of Illinois Open Source # License. See LICENSE.TXT for details. -import libear -import libscanbuild.analyze as sut import unittest import re import os import os.path +import libear +import libscanbuild.analyze as sut class ReportDirectoryTest(unittest.TestCase): @@ -333,3 +333,83 @@ def test_method_exception_not_caught(self): self.assertRaises(Exception, method_exception_from_inside, dict()) + + +class PrefixWithTest(unittest.TestCase): + +def test_gives_empty_on_empty(self): +res = sut.prefix_with(0, []) +self.assertFalse(res) + +def test_interleaves_prefix(self): +res = sut.prefix_with(0, [1, 2, 3]) +self.assertListEqual([0, 1, 0, 2, 0, 3], res) + + +class MergeCtuMapTest(unittest.TestCase): + +def test_no_map_gives_empty(self): +pairs = sut.create_global_ctu_function_map([]) +self.assertFalse(pairs) + +def test_multiple_maps_merged(self): +concat_map = ['_Z1fun1i@x86_64 ast/x86_64/fun1.c.ast', + '_Z1fun2i@x86_64 ast/x86_64/fun2.c.ast', + '_Z1fun3i@x86_64 ast/x86_64/fun3.c.ast'] +pairs = sut.create_global_ctu_function_map(concat_map) +self.assertTrue(('_Z1fun1i@x86_64', 'ast/x86_64/fun1.c.ast') in pairs) +self.assertTrue(('_Z1fun2i@x86_64', 'ast/x86_64/fun2.c.ast') in pairs) +self.assertTrue(('_Z1fun3i@x86_64', 'ast/x86_64/fun3.c.ast') in pairs) +self.assertEqual(3, len(pairs)) + +def test_not_unique_func_left_out(self): +concat_map = ['_Z1fun1i@x86_64 ast/x86_64/fun1.c.ast', + '_Z1fun2i@x86_64 ast/x86_64/fun2.c.ast', + '_Z1fun1i@x86_64 ast/x86_64/fun7.c.ast'] +pairs = sut.create_global_ctu_function_map(concat_map) +self.assertFalse(('_Z1fun1i@x86_64', 'ast/x86_64/fun1.c.ast') in pairs) +self.assertFalse(('_Z1fun1i@x86_64', 'ast/x86_64/fun7.c.ast') in pairs) +self.assertTrue(('_Z1fun2i@x86_64', 'ast/x86_64/fun2.c.ast') in pairs) +self.assertEqual(1, len(pairs)) + +def test_duplicates_are_kept(self): +concat_map = ['_Z1fun1i@x86_64 
ast/x86_64/fun1.c.ast', + '_Z1fun2i@x86_64 ast/x86_64/fun2.c.ast', + '_Z1fun1i@x86_64 ast/x86_64/fun1.c.ast'] +pairs = sut.create_global_ctu_function_map(concat_map) +self.assertTrue(('_Z1fun1i@x86_64', 'ast/x86_64/fun1.c.ast') in pairs) +self.assertTrue(('_Z1fun2i@x86_64', 'ast/x86_64/fun2.c.ast') in pairs) +self.assertEqual(2, len(pairs)) + +def test_space_handled_in_source(self): +
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
klimek added a reviewer: rsmith. klimek added a comment. Richard (added as reviewer) usually owns decisions around clang itself. Writing an email to cfe-dev with the numbers and waiting to see whether others have concerns would probably also be good. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
NoQ added a comment. Regarding serializing vs not serializing and now vs later.

1. I think we eventually need to provide a reasonable default approach presented to the user. This approach shouldn't be hurting the user dramatically in any sense. Because //serializing// hurts the user's disk space dramatically, and //not-serializing// may be slower and a bit more memory-intensive but isn't too bad in all senses, out of these two options //not-serializing// is definitely preferable as a default approach.
2. Later we should definitely consider the alternative approaches that serialize only some ASTs, with the hope that one of them would turn out to be a better default approach.
3. From 2. it follows that for now it's better to keep both approaches around, as we believe that the ideal approach may be a combination of the two. Therefore it doesn't really matter in what order they land.

tl;dr: I propose to land the serialization-based approach first, then land the non-serialization-based approach later and make it default, then consider taking the best of the two and making it a new default. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added a comment. Some of the CTU related analyzer independent parts are being factored out. The review is ongoing here: https://reviews.llvm.org/D34512 Another small and independent part is under review here: https://reviews.llvm.org/D34506 https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
dkrupp added a comment. Thanks.

> It would be best to just add the scan-build-py support to the tree, especially since the new scripts are not tested.

OK. We will update this patch with the scan-build-py changes and remove the ctu-build.py and ctu-analyze.py scripts.

> I am curious which optimization strategies you considered. An idea that @NoQ came up with was to serialize only the most used translation units. Another idea is to choose the TUs that a particular file has most dependencies on and only inline functions from those.

Both of these strategies could work in practice; we did not try them. We implemented the two extremes: serialize all TUs / don't serialize any of the TUs. Both of them can be useful in practice as is (depending on whether the user is CPU-, memory-, or disk-space-bound). I think we could try the above suggested optimizations as an incremental improvement (and keep this initial patch as simple as possible).

> What mode would you prefer? Would you pay the 20%-30% in speed but reduce the huge disk consumption? That might be a good option, especially if you have not exhausted the ideas on how to optimize.

In this initial version I think we should keep the serializing mode. We just measured the heap consumption of the non-serializing mode and it seems to be ~50% larger, probably because the serializing mode only loads those AST fragments from the disk that are imported. But I can imagine that some users still want to use the non-serializing version, which does not use extra disk space. So we will add the non-serializing mode in a next patch as an analyzer option first (which we can turn into the default behaviour later on). OK?

> I see some changes to the compiler proper, such as ASTImporter, ASTContext, SourceManager. We should split those changes into a separate patch and ask for review from people working on those components.
> You can point back to this patch, which would contain the analyzer related changes, and state that the other patch is blocking this work.

All right, we will do that. So is it OK to proceed like this? https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
zaks.anna added a comment.

> - (Anna) Scan-build-py integration of the functionality is nearly finished (see https://github.com/rizsotto/scan-build/issues/83) (--ctu switch performs both analysis phases at once). This I think could go in a different patch, but until then we could keep the ctu-build.py and ctu-analyze.py scripts. Do you agree?

It's important to bring this patch into the LLVM repo so that it becomes part of the clang/llvm project and is used. The whole point of adding CTU integration to scan-build-py is to make sure that there is a single tool that all/most users could use; adding the patch to a fork does not accomplish that goal. Also, I am not a fan of developing on downstream branches, and that is against the LLVM Developer Policy due to all the reasons described here: http://www.llvm.org/docs/DeveloperPolicy.html#incremental-development. This development style leads to fragmentation of the community and the project. Unfortunately, we often see cases where large patches developed out of tree never make it in as a result of not following this policy, and it would be great to avoid this in the future.

> This I think could go in a different patch, but until then we could keep the ctu-build.py and ctu-analyze.py scripts. Do you agree?

It would be best to just add the scan-build-py support to the tree, especially since the new scripts are not tested.

> - (Anna) Dumping the ASTs to the disk. We tried a version where ASTs are not dumped in the 1st phase but are recreated each time a function definition is needed from an external TU. It works fine, but the analysis time went up by 20-30% on open source C projects.

I am curious which optimization strategies you considered. An idea that @NoQ came up with was to serialize only the most used translation units. Another idea is to choose the TUs that a particular file has most dependencies on and only inline functions from those. What mode would you prefer?
Would you pay the 20%-30% in speed but reduce the huge disk consumption? That might be a good option, especially if you have not exhausted the ideas on how to optimize.

> Is it OK to add this functionality in a next patch? Or should we add it as an optional feature right now?

This depends on what the plan for going forward is. Specifically, if we do not need the serialization mode, you could remove that from this patch and add the new mode. If you think the serialization mode is essential going forward, we could have the other mode in a separate patch. (It would be useful to split out the serialization mode work into a separate patch so that we could revert it later on if we see that the other mode is better.) I see some changes to the compiler proper, such as ASTImporter, ASTContext, SourceManager. We should split those changes into a separate patch and ask for review from people working on those components. You can point back to this patch, which would contain the analyzer related changes, and state that the other patch is blocking this work. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
a.sidorin accepted this revision. a.sidorin added a comment. Hello Daniel & Gabor, Thank you very much for your work. This patch looks good to me but I think such a change should also be approved by maintainers. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
dkrupp added a comment. Thanks for the reviews so far. I think we have addressed all major concerns regarding this patch:

- (Anna) Scan-build-py integration of the functionality is nearly finished (see https://github.com/rizsotto/scan-build/issues/83) (the --ctu switch performs both analysis phases at once). This I think could go in a different patch, but until then we could keep the ctu-build.py and ctu-analyze.py scripts. Do you agree?
- (Devin, NoQ) In externalFnMap.txt (which records where each function definition is located), Unified Symbol Resolution (USR) is used to identify functions instead of mangled names, which seems to work equally well for C and C++.
- (Anna) Dumping the ASTs to the disk. We tried a version where ASTs are not dumped in the 1st phase but are recreated each time a function definition is needed from an external TU. It works fine, but the analysis time went up by 20-30% on open source C projects. Is it OK to add this functionality in a next patch? Or should we add it as an optional feature right now?

Do you see anything else that would prevent this patch from getting in? https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun marked 2 inline comments as done. xazax.hun added a comment. In https://reviews.llvm.org/D30691#731617, @zaks.anna wrote:

> I agree that scan-build or scan-build-py integration is an important issue to resolve here. What I envision is that users will just call scan-build and pass -whole-project as an option to it. Everything else will happen automagically:)

We contacted Laszlo and we have a pull request into scan-build that is under review. He is very helpful and supports the idea of scan-build-py supporting CTU analysis.

> I do not quite understand why AST serialization is needed at all. Can we instead recompile the translation units on demand into a separate ASTContext and then ASTImport?

We did a prototype implementation of on-demand reparsing. On the C projects we tested, the runtime increased by 10-30% compared to dumping the ASTs. Note that it is relatively fast to parse C; I would expect a much bigger delta in case of C++ projects. Unfortunately, we weren't able to test that setting due to the ASTImporter limitations.

Comment at: include/clang/AST/Mangle.h:59

    +  // the static analyzer.
    +  bool ShouldForceMangleProto;

xazax.hun wrote:
> dcoughlin wrote:
> > I'm pretty worried about using C++ mangling for C functions. It doesn't ever seem appropriate to do so and it sounds like it is papering over problems with the design.
> > Some questions:
> > - How do you handle when two translation units have different C functions with the same type signatures? Is there a collision? This can arise because of two-level namespacing or when building multiple targets with the same CTU directory.
> > - How do you handle when a C function has the same signature as a C++ function. Is there a collision when you mangle the C function?
> I agree that using C++ mangling for C is not the nicest solution, and I had to circumvent a problem in the name mangler for C prototypes.
> In case a mangled name is found in multiple source files, it will not be imported. This is the way collisions are handled, regardless of being C or C++ functions.
> The target arch is appended to the mangled name to support the cross compilation scenario. Currently we do not add the full triple, but this could be done.
>
> An alternative solution would be to use USRs instead of mangled names, but we did not explore this option in depth yet.

Note that the newest version of this patch does not use name mangling; it uses USRs instead. This turned out to be a perfectly viable alternative and we did not see any behavioral changes on the project we tested after the transition. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 101695. xazax.hun edited the summary of this revision. xazax.hun added a comment. - Migrate to use USR instead of mangled names. Name mangling related changes are reverted. - Better error handling in some cases. - File paths containing spaces are now handled correctly. - Fixes to support scripts. https://reviews.llvm.org/D30691 Files: include/clang/AST/ASTContext.h include/clang/StaticAnalyzer/Core/AnalyzerOptions.h include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h lib/AST/ASTContext.cpp lib/AST/ASTImporter.cpp lib/Basic/SourceManager.cpp lib/StaticAnalyzer/Core/AnalyzerOptions.cpp lib/StaticAnalyzer/Core/CMakeLists.txt lib/StaticAnalyzer/Core/CallEvent.cpp lib/StaticAnalyzer/Core/ExprEngine.cpp lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp test/Analysis/Inputs/ctu-chain.cpp test/Analysis/Inputs/ctu-other.cpp test/Analysis/Inputs/externalFnMap.txt test/Analysis/ctu-main.cpp test/Analysis/func-mapping-test.cpp test/CMakeLists.txt test/lit.cfg tools/CMakeLists.txt tools/clang-func-mapping/CMakeLists.txt tools/clang-func-mapping/ClangFnMapGen.cpp tools/ctu-analysis/ctu-analyze.py tools/ctu-analysis/ctu-build.py tools/scan-build-py/libscanbuild/analyze.py Index: tools/scan-build-py/libscanbuild/analyze.py === --- tools/scan-build-py/libscanbuild/analyze.py +++ tools/scan-build-py/libscanbuild/analyze.py @@ -383,7 +383,8 @@ def target(): """ Creates output file name for reports. 
""" -if opts['output_format'] in {'plist', 'plist-html'}: +if opts['output_format'] in {'plist', 'plist-html', + 'plist-multi-file'}: (handle, name) = tempfile.mkstemp(prefix='report-', suffix='.plist', dir=opts['output_dir']) Index: tools/ctu-analysis/ctu-build.py === --- /dev/null +++ tools/ctu-analysis/ctu-build.py @@ -0,0 +1,243 @@ +#!/usr/bin/env python + +import argparse +import json +import glob +import logging +import multiprocessing +import os +import re +import signal +import subprocess +import shlex +import shutil +import tempfile + +SOURCE_PATTERN = re.compile('.*\.(C|c|cc|cpp|cxx|ii|m|mm)$', re.IGNORECASE) +TIMEOUT = 86400 +EXTERNAL_FUNCTION_MAP_FILENAME = 'externalFnMap.txt' +TEMP_EXTERNAL_FNMAP_FOLDER = 'tmpExternalFnMaps' + + +def get_args(): +parser = argparse.ArgumentParser( +description='Executes 1st pass of CTU analysis where we preprocess ' +'all files in the compilation database and generate ' +'AST dumps and other necessary information from those ' +'to be used later by the 2nd pass of ' +'Cross Translation Unit analysis', +formatter_class=argparse.ArgumentDefaultsHelpFormatter) +parser.add_argument('-b', required=True, dest='buildlog', +metavar='build.json', +help='JSON Compilation Database to be used') +parser.add_argument('-p', metavar='preanalyze-dir', dest='ctuindir', +help='Target directory for preanalyzation data', +default='.ctu') +parser.add_argument('-j', metavar='threads', dest='threads', type=int, +help='Number of threads to be used', +default=int(multiprocessing.cpu_count() * 1.0)) +parser.add_argument('-v', dest='verbose', action='store_true', +help='Verbose output') +parser.add_argument('--clang-path', metavar='clang-path', +dest='clang_path', +help='Set path to directory of clang binaries used ' + '(default taken from CLANG_PATH envvar)', +default=os.environ.get('CLANG_PATH')) +mainargs = parser.parse_args() + +if mainargs.verbose: +logging.getLogger().setLevel(logging.INFO) + +if mainargs.clang_path is None: +clang_path = '' 
+else: +clang_path = os.path.abspath(mainargs.clang_path) +logging.info('CTU uses clang dir: ' + + (clang_path if clang_path != '' else '')) + +return mainargs, clang_path + + +def process_buildlog(buildlog_filename, src_2_dir, src_2_cmd, src_order, + cmd_2_src, cmd_order): +with open(buildlog_filename, 'r') as buildlog_file: +buildlog = json.load(buildlog_file) +for step in buildlog: +if SOURCE_PATTERN.match(step['file']): +if step['file'] not in src_2_dir: +src_2_dir[step['file']] = step['directory'] +src_2_cmd[step['file']] = step['command'] +
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
a.sidorin added inline comments.

Comment at: lib/AST/ASTContext.cpp:1481

    +  assert(!FD->hasBody() && "FD has a definition in current translation unit!");
    +  if (!FD->getType()->getAs<FunctionProtoType>())
    +    return nullptr; // Cannot even mangle that.

xazax.hun wrote:
> a.sidorin wrote:
> > This needs a FIXME because currently many functions cannot be analyzed because of it.
> What do you mean here?

This comment was about `if (!FD->getType()->getAs<FunctionProtoType>()) return nullptr`. While testing, we have found that many C and C-style functions are FunctionNoProtoTypes, so they cannot be mangled, but they still have a body in another TU. There definitely should be a way to fix this. https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 96727.
xazax.hun marked 5 inline comments as done.
xazax.hun added a comment.

- Updates according to review comments
- Improvements to the python scripts

https://reviews.llvm.org/D30691

Files:
  include/clang/AST/ASTContext.h
  include/clang/AST/Mangle.h
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  lib/AST/ASTContext.cpp
  lib/AST/ASTImporter.cpp
  lib/AST/ItaniumMangle.cpp
  lib/Basic/SourceManager.cpp
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/ctu-main.cpp
  test/Analysis/func-mapping-test.cpp
  test/CMakeLists.txt
  test/lit.cfg
  tools/CMakeLists.txt
  tools/clang-func-mapping/CMakeLists.txt
  tools/clang-func-mapping/ClangFnMapGen.cpp
  tools/ctu-analysis/ctu-analyze.py
  tools/ctu-analysis/ctu-build.py
  tools/scan-build-py/libscanbuild/runner.py

Index: tools/scan-build-py/libscanbuild/runner.py
===================================================================
--- tools/scan-build-py/libscanbuild/runner.py
+++ tools/scan-build-py/libscanbuild/runner.py
@@ -162,7 +162,8 @@
     def target():
         """ Creates output file name for reports.
        """
-        if opts['output_format'] in {'plist', 'plist-html'}:
+        if opts['output_format'] in {'plist', 'plist-html',
+                                     'plist-multi-file'}:
             (handle, name) = tempfile.mkstemp(prefix='report-',
                                               suffix='.plist',
                                               dir=opts['output_dir'])

Index: tools/ctu-analysis/ctu-build.py
===================================================================
--- /dev/null
+++ tools/ctu-analysis/ctu-build.py
@@ -0,0 +1,239 @@
+#!/usr/bin/env python
+
+import argparse
+import json
+import glob
+import logging
+import multiprocessing
+import os
+import re
+import signal
+import subprocess
+import shlex
+import shutil
+import tempfile
+
+SOURCE_PATTERN = re.compile('.*\.(C|c|cc|cpp|cxx|ii|m|mm)$', re.IGNORECASE)
+TIMEOUT = 86400
+EXTERNAL_FUNCTION_MAP_FILENAME = 'externalFnMap.txt'
+TEMP_EXTERNAL_FNMAP_FOLDER = 'tmpExternalFnMaps'
+
+
+def get_args():
+    parser = argparse.ArgumentParser(
+        description='Executes 1st pass of CTU analysis where we preprocess '
+                    'all files in the compilation database and generate '
+                    'AST dumps and other necessary information from those '
+                    'to be used later by the 2nd pass of '
+                    'Cross Translation Unit analysis',
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
+    parser.add_argument('-b', required=True, dest='buildlog',
+                        metavar='build.json',
+                        help='JSON Compilation Database to be used')
+    parser.add_argument('-p', metavar='preanalyze-dir', dest='ctuindir',
+                        help='Target directory for preanalyzation data',
+                        default='.ctu')
+    parser.add_argument('-j', metavar='threads', dest='threads', type=int,
+                        help='Number of threads to be used',
+                        default=int(multiprocessing.cpu_count() * 1.0))
+    parser.add_argument('-v', dest='verbose', action='store_true',
+                        help='Verbose output')
+    parser.add_argument('--clang-path', metavar='clang-path',
+                        dest='clang_path',
+                        help='Set path to directory of clang binaries used '
+                             '(default taken from CLANG_PATH envvar)',
+                        default=os.environ.get('CLANG_PATH'))
+    mainargs = parser.parse_args()
+
+    if mainargs.verbose:
+        logging.getLogger().setLevel(logging.INFO)
+
+    if mainargs.clang_path is None:
+        clang_path = ''
+    else:
+        clang_path = os.path.abspath(mainargs.clang_path)
+    logging.info('CTU uses clang dir: ' +
+                 (clang_path if clang_path != '' else ''))
+
+    return mainargs, clang_path
+
+
+def process_buildlog(buildlog_filename, src_2_dir, src_2_cmd, src_order,
+                     cmd_2_src, cmd_order):
+    with open(buildlog_filename, 'r') as buildlog_file:
+        buildlog = json.load(buildlog_file)
+    for step in buildlog:
+        if SOURCE_PATTERN.match(step['file']):
+            if step['file'] not in src_2_dir:
+                src_2_dir[step['file']] = step['directory']
+                src_2_cmd[step['file']] = step['command']
+                src_order.append(step['file'])
+            if step['command'] not in cmd_2_src:
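The one-line change to scan-build-py's `runner.py` repeated in each revision just adds `plist-multi-file` to the set of output formats that get a fresh temporary `.plist` report file. Sketched standalone (the `opts` dict and the helper name `report_file_for` are illustrative, not the real scan-build-py API):

```python
import os
import tempfile


def report_file_for(opts):
    """Return a fresh report file path for plist-style output formats, or
    None for formats that need no per-run plist (mirrors the patched check)."""
    if opts['output_format'] in {'plist', 'plist-html', 'plist-multi-file'}:
        handle, name = tempfile.mkstemp(prefix='report-', suffix='.plist',
                                        dir=opts['output_dir'])
        os.close(handle)  # scan-build-py keeps the name, not the handle
        return name
    return None


out_dir = tempfile.mkdtemp()
path = report_file_for({'output_format': 'plist-multi-file',
                        'output_dir': out_dir})
print(path)  # e.g. <out_dir>/report-XXXXXXXX.plist
```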
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
whisperity added inline comments.

Comment at: lib/AST/ASTContext.cpp:1497
+  auto FnUnitCacheEntry = FunctionAstUnitMap.find(MangledFnName);
+  if (FnUnitCacheEntry == FunctionAstUnitMap.end()) {
+    if (FunctionFileMap.empty()) {

danielmarjamaki wrote:
> How about replacing FunctionAstUnitMap.find() with
> FunctionAstUnitMap.count()?

`FnUnitCacheEntry` is used in line `1525`.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
danielmarjamaki added inline comments.

Comment at: lib/AST/ASTContext.cpp:1495
+  ASTUnit *Unit = nullptr;
+  StringRef ASTFileName;

As far as I see you can move this ASTFileName declaration down.

Comment at: lib/AST/ASTContext.cpp:1497
+  auto FnUnitCacheEntry = FunctionAstUnitMap.find(MangledFnName);
+  if (FnUnitCacheEntry == FunctionAstUnitMap.end()) {
+    if (FunctionFileMap.empty()) {

How about replacing FunctionAstUnitMap.find() with FunctionAstUnitMap.count()?

Comment at: lib/AST/ASTContext.cpp:1511
+    auto It = FunctionFileMap.find(MangledFnName);
+    if (It != FunctionFileMap.end())
+      ASTFileName = It->second;

As far as I see this can be changed to:

```
if (It == FunctionFileMap.end())
  return nullptr;
const StringRef ASTFileName = It->second;
```

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:48
+      : Ctx(Context), ItaniumCtx(MangleCtx) {}
+  std::string CurrentFileName;

As far as I see CurrentFileName is only used in 1 method and should therefore
be private.

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:86
+        SM.getFileEntryForID(SM.getMainFileID())->getName();
+    char *Path = realpath(SMgrName.str().c_str(), nullptr);
+    CurrentFileName = Path;

I believe realpath is a POSIX function.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 96377.
xazax.hun marked an inline comment as done.
xazax.hun added a comment.

- Removed more unused code.
- Remove the usages of posix API.
- Changes according to some review comments.
- Add a test to the clang-func-mapping tool.

https://reviews.llvm.org/D30691

Files:
  include/clang/AST/ASTContext.h
  include/clang/AST/Mangle.h
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  lib/AST/ASTContext.cpp
  lib/AST/ASTImporter.cpp
  lib/AST/ItaniumMangle.cpp
  lib/Basic/SourceManager.cpp
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/ctu-main.cpp
  test/Analysis/func-mapping-test.cpp
  test/CMakeLists.txt
  test/lit.cfg
  tools/CMakeLists.txt
  tools/clang-func-mapping/CMakeLists.txt
  tools/clang-func-mapping/ClangFnMapGen.cpp
  tools/ctu-analysis/ctu-analyze.py
  tools/ctu-analysis/ctu-build.py
  tools/scan-build-py/libscanbuild/runner.py

Index: tools/scan-build-py/libscanbuild/runner.py
===================================================================
--- tools/scan-build-py/libscanbuild/runner.py
+++ tools/scan-build-py/libscanbuild/runner.py
@@ -162,7 +162,8 @@
     def target():
         """ Creates output file name for reports.
        """
-        if opts['output_format'] in {'plist', 'plist-html'}:
+        if opts['output_format'] in {'plist', 'plist-html',
+                                     'plist-multi-file'}:
             (handle, name) = tempfile.mkstemp(prefix='report-',
                                               suffix='.plist',
                                               dir=opts['output_dir'])

Index: tools/ctu-analysis/ctu-build.py
===================================================================
--- /dev/null
+++ tools/ctu-analysis/ctu-build.py
@@ -0,0 +1,214 @@
+#!/usr/bin/env python
+
+import argparse
+import json
+import logging
+import multiprocessing
+import os
+import re
+import signal
+import subprocess
+import shlex
+
+SOURCE_PATTERN = re.compile('.*\.(C|c|cc|cpp|cxx|ii|m|mm)$', re.IGNORECASE)
+TIMEOUT = 86400
+DEFINED_FUNCTIONS_FILENAME = 'definedFns.txt'
+EXTERNAL_FUNCTIONS_FILENAME = 'externalFns.txt'
+EXTERNAL_FUNCTION_MAP_FILENAME = 'externalFnMap.txt'
+
+
+def get_args():
+    parser = argparse.ArgumentParser(
+        description='Executes 1st pass of CTU analysis where we preprocess '
+                    'all files in the compilation database and generate '
+                    'AST dumps and other necessary information from those '
+                    'to be used later by the 2nd pass of '
+                    'Cross Translation Unit analysis',
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
+    parser.add_argument('-b', required=True, dest='buildlog',
+                        metavar='build.json',
+                        help='JSON Compilation Database to be used')
+    parser.add_argument('-p', metavar='preanalyze-dir', dest='ctuindir',
+                        help='Target directory for preanalyzation data',
+                        default='.ctu')
+    parser.add_argument('-j', metavar='threads', dest='threads', type=int,
+                        help='Number of threads to be used',
+                        default=int(multiprocessing.cpu_count() * 1.0))
+    parser.add_argument('-v', dest='verbose', action='store_true',
+                        help='Verbose output')
+    parser.add_argument('--clang-path', metavar='clang-path',
+                        dest='clang_path',
+                        help='Set path to directory of clang binaries used '
+                             '(default taken from CLANG_PATH envvar)',
+                        default=os.environ.get('CLANG_PATH'))
+    mainargs = parser.parse_args()
+
+    if mainargs.verbose:
+        logging.getLogger().setLevel(logging.INFO)
+
+    if mainargs.clang_path is None:
+        clang_path = ''
+    else:
+        clang_path = os.path.abspath(mainargs.clang_path)
+    logging.info('CTU uses clang dir: ' +
+                 (clang_path if clang_path != '' else ''))
+
+    return mainargs, clang_path
+
+
+def process_buildlog(buildlog_filename, src_2_dir, src_2_cmd, src_order,
+                     cmd_2_src, cmd_order):
+    with open(buildlog_filename, 'r') as buildlog_file:
+        buildlog = json.load(buildlog_file)
+    for step in buildlog:
+        if SOURCE_PATTERN.match(step['file']):
+            if step['file'] not in src_2_dir:
+                src_2_dir[step['file']] = step['directory']
+                src_2_cmd[step['file']] = step['command']
+                src_order.append(step['file'])
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun marked 6 inline comments as done and an inline comment as not done.
xazax.hun added a comment.

In https://reviews.llvm.org/D30691#731617, @zaks.anna wrote:

> I agree that scan-build or scan-build-py integration is an important issue
> to resolve here. What I envision is that users will just call scan-build
> and pass -whole-project as an option to it. Everything else will happen
> automagically :)

We had a Skype meeting with Laszlo. He had no objections to adding this to
scan-build-py.

> Another concern is dumping the ASTs to the disk. There are really 2
> concerns here. First one is the high disk usage, which is a blocker for
> having higher adoption. Second is that I am not sure how complete and
> reliable AST serialization and deserialization are. Are those components
> used for something else that is running in production or are we just
> utilizing -ast-dump, which is used for debugging? I do not quite
> understand why AST serialization is needed at all. Can we instead
> recompile the translation units on demand into a separate ASTContext and
> then ASTImport?

According to our measurements, there are some optimization possibilities in
the serialized AST format. Even though it might not get us an order of
magnitude improvement, it can still be very significant. The main reason the
AST dumps are so large is duplicated AST parts from the headers. Once modules
are supported and utilized, the size could be reduced significantly once
again.

The serialization/deserialization of AST nodes should be very reliable. The
same mechanisms are used for precompiled headers and modules.

AST serialization is there to avoid the time overhead of reparsing
translation units. This can be a big win for translation units that contain
metaprograms. Technically, I cannot see any obstacles to reparsing a
translation unit on demand, but we did not measure how much slower that
would be.

Having a serialized format could also make it possible to load only
fragments of the AST into memory instead of the whole translation unit.
(This feature is currently not utilized by this patch.)

Comment at: include/clang/AST/Decl.h:53
 class VarTemplateDecl;
+class CompilerInstance;

dcoughlin wrote:
> Is this needed? It seems like a layering violation.

Good catch, this is redundant. I will remove it in the next patch.

Comment at: include/clang/AST/Mangle.h:59
+  // the static analyzer.
+  bool ShouldForceMangleProto;

dcoughlin wrote:
> I'm pretty worried about using C++ mangling for C functions. It doesn't
> ever seem appropriate to do so and it sounds like it is papering over
> problems with the design.
>
> Some questions:
> - How do you handle when two translation units have different C functions
>   with the same type signatures? Is there a collision? This can arise
>   because of two-level namespacing or when building multiple targets with
>   the same CTU directory.
> - How do you handle when a C function has the same signature as a C++
>   function. Is there a collision when you mangle the C function?

I agree that using C++ mangling for C is not the nicest solution, and I had
to circumvent a problem in the name mangler for C prototypes.

In case a mangled name is found in multiple source files, it will not be
imported. This is how collisions are handled, regardless of whether the
functions involved are C or C++ functions.

The target architecture is appended to the mangled name to support the
cross-compilation scenario. Currently we do not add the full triple, but
this could be done.

An alternative solution would be to use USRs instead of mangled names, but
we have not explored this option in depth yet.

Comment at: lib/AST/ASTContext.cpp:1481
+  assert(!FD->hasBody() && "FD has a definition in current translation unit!");
+  if (!FD->getType()->getAs<FunctionProtoType>())
+    return nullptr; // Cannot even mangle that.

a.sidorin wrote:
> This needs a FIXME because currently many functions cannot be analyzed
> because of it.

What do you mean here?

https://reviews.llvm.org/D30691
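The collision policy described above (a mangled name defined in more than one TU is simply not imported, and the target architecture is appended to the name) can be sketched as follows. This is an illustrative reconstruction, not the patch's actual C++ implementation; the input pairs and function names are hypothetical:

```python
def build_function_map(defined_fns_per_tu):
    """Map '<mangled name>@<arch>' keys to the single AST file defining them.
    Names defined in more than one TU are dropped entirely, mirroring the
    'do not import on collision' policy described in the review.
    Input: iterable of (ast_file, [mangled names]) pairs."""
    fn_map, collisions = {}, set()
    for ast_file, names in defined_fns_per_tu:
        for name in names:
            if name in fn_map or name in collisions:
                collisions.add(name)
                fn_map.pop(name, None)  # defined in multiple TUs: skip it
            else:
                fn_map[name] = ast_file
    return fn_map


fn_map = build_function_map([
    ('a.cpp.ast', ['_Z3fooi@x86_64', '_Z3bari@x86_64']),
    ('b.cpp.ast', ['_Z3fooi@x86_64']),  # collides with a.cpp.ast
])
print(fn_map)  # {'_Z3bari@x86_64': 'a.cpp.ast'}
```

The key format mirrors the externalFnMap.txt entries used by the tests (e.g. `_Z7h_chaini@x86_64 xtu-chain.cpp.ast`).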
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
dcoughlin added inline comments.

Comment at: include/clang/AST/Decl.h:53
 class VarTemplateDecl;
+class CompilerInstance;

Is this needed? It seems like a layering violation.

Comment at: include/clang/AST/Mangle.h:59
+  // the static analyzer.
+  bool ShouldForceMangleProto;

I'm pretty worried about using C++ mangling for C functions. It doesn't ever
seem appropriate to do so, and it sounds like it is papering over problems
with the design.

Some questions:

- How do you handle when two translation units have different C functions
  with the same type signatures? Is there a collision? This can arise
  because of two-level namespacing or when building multiple targets with
  the same CTU directory.
- How do you handle when a C function has the same signature as a C++
  function? Is there a collision when you mangle the C function?

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
zaks.anna added a comment.

I agree that scan-build or scan-build-py integration is an important issue to
resolve here. What I envision is that users will just call scan-build and
pass -whole-project as an option to it. Everything else will happen
automagically :)

Another concern is dumping the ASTs to the disk. There are really 2 concerns
here. First one is the high disk usage, which is a blocker for having higher
adoption. Second is that I am not sure how complete and reliable AST
serialization and deserialization are. Are those components used for
something else that is running in production, or are we just utilizing
-ast-dump, which is used for debugging? I do not quite understand why AST
serialization is needed at all. Can we instead recompile the translation
units on demand into a separate ASTContext and then ASTImport?

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 94814.
xazax.hun edited the summary of this revision.
xazax.hun added a comment.

- Removed a clang tool, replaced with python tool functionality.

https://reviews.llvm.org/D30691

Files:
  include/clang/AST/ASTContext.h
  include/clang/AST/Decl.h
  include/clang/AST/Mangle.h
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  lib/AST/ASTContext.cpp
  lib/AST/ASTImporter.cpp
  lib/AST/ItaniumMangle.cpp
  lib/Basic/SourceManager.cpp
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/ctu-main.cpp
  tools/CMakeLists.txt
  tools/clang-func-mapping/CMakeLists.txt
  tools/clang-func-mapping/ClangFnMapGen.cpp
  tools/ctu-analysis/ctu-analyze.py
  tools/ctu-analysis/ctu-build.py
  tools/scan-build-py/libscanbuild/runner.py

Index: tools/scan-build-py/libscanbuild/runner.py
===================================================================
--- tools/scan-build-py/libscanbuild/runner.py
+++ tools/scan-build-py/libscanbuild/runner.py
@@ -162,7 +162,8 @@
     def target():
         """ Creates output file name for reports.
        """
-        if opts['output_format'] in {'plist', 'plist-html'}:
+        if opts['output_format'] in {'plist', 'plist-html',
+                                     'plist-multi-file'}:
             (handle, name) = tempfile.mkstemp(prefix='report-',
                                               suffix='.plist',
                                               dir=opts['output_dir'])

Index: tools/ctu-analysis/ctu-build.py
===================================================================
--- /dev/null
+++ tools/ctu-analysis/ctu-build.py
@@ -0,0 +1,230 @@
+#!/usr/bin/env python
+
+import argparse
+import json
+import logging
+import multiprocessing
+import os
+import re
+import signal
+import subprocess
+import shlex
+
+SOURCE_PATTERN = re.compile('.*\.(C|c|cc|cpp|cxx|ii|m|mm)$', re.IGNORECASE)
+TIMEOUT = 86400
+DEFINED_FUNCTIONS_FILENAME = 'definedFns.txt'
+EXTERNAL_FUNCTIONS_FILENAME = 'externalFns.txt'
+EXTERNAL_FUNCTION_MAP_FILENAME = 'externalFnMap.txt'
+
+
+def get_args():
+    parser = argparse.ArgumentParser(
+        description='Executes 1st pass of CTU analysis where we preprocess '
+                    'all files in the compilation database and generate '
+                    'AST dumps and other necessary information from those '
+                    'to be used later by the 2nd pass of '
+                    'Cross Translation Unit analysis',
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
+    parser.add_argument('-b', required=True, dest='buildlog',
+                        metavar='build.json',
+                        help='JSON Compilation Database to be used')
+    parser.add_argument('-p', metavar='preanalyze-dir', dest='ctuindir',
+                        help='Target directory for preanalyzation data',
+                        default='.ctu')
+    parser.add_argument('-j', metavar='threads', dest='threads', type=int,
+                        help='Number of threads to be used',
+                        default=int(multiprocessing.cpu_count() * 1.0))
+    parser.add_argument('-v', dest='verbose', action='store_true',
+                        help='Verbose output')
+    parser.add_argument('--clang-path', metavar='clang-path',
+                        dest='clang_path',
+                        help='Set path to directory of clang binaries used '
+                             '(default taken from CLANG_PATH envvar)',
+                        default=os.environ.get('CLANG_PATH'))
+    mainargs = parser.parse_args()
+
+    if mainargs.verbose:
+        logging.getLogger().setLevel(logging.INFO)
+
+    if mainargs.clang_path is None:
+        clang_path = ''
+    else:
+        clang_path = os.path.abspath(mainargs.clang_path)
+    logging.info('CTU uses clang dir: ' +
+                 (clang_path if clang_path != '' else ''))
+
+    return mainargs, clang_path
+
+
+def process_buildlog(buildlog_filename, src_2_dir, src_2_cmd, src_order,
+                     cmd_2_src, cmd_order):
+    with open(buildlog_filename, 'r') as buildlog_file:
+        buildlog = json.load(buildlog_file)
+    for step in buildlog:
+        if SOURCE_PATTERN.match(step['file']):
+            if step['file'] not in src_2_dir:
+                src_2_dir[step['file']] = step['directory']
+                src_2_cmd[step['file']] = step['command']
+                src_order.append(step['file'])
+            if step['command'] not in cmd_2_src:
+                cmd_2_src[step['command']] = [step['file']]
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun marked 6 inline comments as done.
xazax.hun added inline comments.

Comment at: test/Analysis/Inputs/externalFnMap.txt:1
+_Z7h_chaini@x86_64 xtu-chain.cpp.ast
+_ZN4chns4chf2Ei@x86_64 xtu-chain.cpp.ast

NoQ wrote:
> These tests use a pre-made external function map.
>
> Are you willing to add tests for generating function maps?
>
> That'd be useful, in my opinion, because it'd actually tell people how to
> run the whole thing.

Good idea! We will add a test for that.

Comment at: tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp:1
+//===- ClangCmdlineArchExtractor.cpp ===//
+//

NoQ wrote:
> Do we really intend to keep this as a tool? I suspect obtaining the
> architecture could be done much easier by parsing the run-line in python
> scripts, we were just in too much rush to consider this possibility, and
> calling a whole tool for just that sounds like an overkill. I think it
> would be great to simplify this out.
>
> Additionally, this whole architecture hassle kicks in only rarely. It is
> only important to know the architecture because some projects have the
> same file compiled for different architectures (we used to have it in
> Android which has binaries that are compiled for both host machine and
> target machine, but for most projects just having a mangled name is
> already good enough). So we can delay this discussion by splitting this
> out of the initial patch, if you want, but that's extra work, of course,
> so please take the above as more of a mumble than of a request.

We had the same idea recently. The next version of this patch will do this by
parsing the clang cc1 command line (gathered by -###).

Comment at: tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp:72
+
+  return 0;
+}

danielmarjamaki wrote:
> EXIT_SUCCESS is also possible however I guess that is 0 on all
> implementations.

Other clang tools return 0.

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:192
+  lockedWrite(BuildDir + "/definedFns.txt", DefinedFuncsStr.str());
+  std::ostringstream CFGStr;
+  for (auto : CG) {

a.sidorin wrote:
> It is not 'CFG'Str, it should be 'CallGraph'Str.

Sorry for this issue. That part is removed from this version of the patch. It
wasn't used anywhere.

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:257
+  Tool.run(newFrontendActionFactory().get());
+}

danielmarjamaki wrote:
> no return.

That is not a problem for C++ programs, but I added it.

https://reviews.llvm.org/D30691
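The externalFnMap.txt format exercised by these tests is a simple two-column text file: a `<mangled name>@<arch>` key and the AST dump file that defines it. A loader for it can be sketched in a few lines (the real consumer of this file is the C++ CTU lookup in ASTContext; this Python reader is only for illustration):

```python
def load_external_fn_map(text):
    """Parse '<mangled>@<arch> <ast file>' lines into a lookup dict,
    skipping blank lines."""
    fn_map = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        mangled, ast_file = line.split(' ', 1)
        fn_map[mangled] = ast_file.strip()
    return fn_map


# The two entries quoted from test/Analysis/Inputs/externalFnMap.txt above.
sample = """_Z7h_chaini@x86_64 xtu-chain.cpp.ast
_ZN4chns4chf2Ei@x86_64 xtu-chain.cpp.ast"""
m = load_external_fn_map(sample)
print(m['_Z7h_chaini@x86_64'])  # xtu-chain.cpp.ast
```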
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 94709.
xazax.hun edited the summary of this revision.
xazax.hun added a comment.

- Fixed some memory leaks.
- Removed some unrelated style changes.
- Simplifications to the python scripts.
- Removed debug prints.

https://reviews.llvm.org/D30691

Files:
  include/clang/AST/ASTContext.h
  include/clang/AST/Decl.h
  include/clang/AST/Mangle.h
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  lib/AST/ASTContext.cpp
  lib/AST/ASTImporter.cpp
  lib/AST/ItaniumMangle.cpp
  lib/Basic/SourceManager.cpp
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  test/Analysis/Inputs/ctu-chain.cpp
  test/Analysis/Inputs/ctu-other.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/ctu-main.cpp
  tools/CMakeLists.txt
  tools/clang-cmdline-arch-extractor/CMakeLists.txt
  tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp
  tools/clang-func-mapping/CMakeLists.txt
  tools/clang-func-mapping/ClangFnMapGen.cpp
  tools/ctu-analysis/ctu-analyze.py
  tools/ctu-analysis/ctu-build.py
  tools/scan-build-py/libscanbuild/runner.py

Index: tools/scan-build-py/libscanbuild/runner.py
===================================================================
--- tools/scan-build-py/libscanbuild/runner.py
+++ tools/scan-build-py/libscanbuild/runner.py
@@ -162,7 +162,8 @@
     def target():
         """ Creates output file name for reports.
        """
-        if opts['output_format'] in {'plist', 'plist-html'}:
+        if opts['output_format'] in {'plist', 'plist-html',
+                                     'plist-multi-file'}:
             (handle, name) = tempfile.mkstemp(prefix='report-',
                                               suffix='.plist',
                                               dir=opts['output_dir'])

Index: tools/ctu-analysis/ctu-build.py
===================================================================
--- /dev/null
+++ tools/ctu-analysis/ctu-build.py
@@ -0,0 +1,219 @@
+#!/usr/bin/env python
+
+import argparse
+import json
+import logging
+import multiprocessing
+import os
+import re
+import signal
+import subprocess
+
+
+SOURCE_PATTERN = re.compile('.*\.(C|c|cc|cpp|cxx|ii|m|mm)$', re.IGNORECASE)
+TIMEOUT = 86400
+DEFINED_FUNCTIONS_FILENAME = 'definedFns.txt'
+EXTERNAL_FUNCTIONS_FILENAME = 'externalFns.txt'
+EXTERNAL_FUNCTION_MAP_FILENAME = 'externalFnMap.txt'
+
+
+def get_args():
+    parser = argparse.ArgumentParser(
+        description='Executes 1st pass of CTU analysis where we preprocess '
+                    'all files in the compilation database and generate '
+                    'AST dumps and other necessary information from those '
+                    'to be used later by the 2nd pass of '
+                    'Cross Translation Unit analysis',
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
+    parser.add_argument('-b', required=True, dest='buildlog',
+                        metavar='build.json',
+                        help='JSON Compilation Database to be used')
+    parser.add_argument('-p', metavar='preanalyze-dir', dest='ctuindir',
+                        help='Target directory for preanalyzation data',
+                        default='.ctu')
+    parser.add_argument('-j', metavar='threads', dest='threads', type=int,
+                        help='Number of threads to be used',
+                        default=int(multiprocessing.cpu_count() * 1.0))
+    parser.add_argument('-v', dest='verbose', action='store_true',
+                        help='Verbose output')
+    parser.add_argument('--clang-path', metavar='clang-path',
+                        dest='clang_path',
+                        help='Set path to directory of clang binaries used '
+                             '(default taken from CLANG_PATH envvar)',
+                        default=os.environ.get('CLANG_PATH'))
+    mainargs = parser.parse_args()
+
+    if mainargs.verbose:
+        logging.getLogger().setLevel(logging.INFO)
+
+    if mainargs.clang_path is None:
+        clang_path = ''
+    else:
+        clang_path = os.path.abspath(mainargs.clang_path)
+    logging.info('CTU uses clang dir: ' +
+                 (clang_path if clang_path != '' else ''))
+
+    return mainargs, clang_path
+
+
+def process_buildlog(buildlog_filename, src_2_dir, src_2_cmd, src_order,
+                     cmd_2_src, cmd_order):
+    with open(buildlog_filename, 'r') as buildlog_file:
+        buildlog = json.load(buildlog_file)
+    for step in buildlog:
+        if SOURCE_PATTERN.match(step['file']):
+            if step['file'] not in src_2_dir:
+                src_2_dir[step['file']] = step['directory']
+                src_2_cmd[step['file']] = step['command']
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
NoQ added a comment.

Yeah, of course, ideally sticking this into scan-build, one way or another,
would be great. I understand that it'd require quite a bit of communication
with Laszlo. Otherwise we're just duplicating a whole lot of things.

Thanks for digging into this, guys. Really.

Comment at: lib/AST/ASTContext.cpp:1529-1530
+           iterateContextDecls(TU, MangledFnName, MangleCtx)) {
+    llvm::errs() << "Importing function " << MangledFnName << " from "
+                 << ASTFileName << "\n";
+    // FIXME: Refactor const_cast

These debug prints should be removed, I guess.

Comment at: lib/AST/ASTImporter.cpp:3222
-  if (FunctionDecl *FoundFunction = dyn_cast<FunctionDecl>(FoundDecls[I])) {
+  if (auto *FoundFunction = dyn_cast<FunctionDecl>(FoundDecls[I])) {
     if (FoundFunction->hasExternalFormalLinkage() &&

We could probably commit the `auto`-related changes separately in order to
keep this patch clean.

Comment at: test/Analysis/Inputs/externalFnMap.txt:1
+_Z7h_chaini@x86_64 xtu-chain.cpp.ast
+_ZN4chns4chf2Ei@x86_64 xtu-chain.cpp.ast

These tests use a pre-made external function map.

Are you willing to add tests for generating function maps?

That'd be useful, in my opinion, because it'd actually tell people how to run
the whole thing.

Comment at: tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp:1
+//===- ClangCmdlineArchExtractor.cpp ===//
+//

Do we really intend to keep this as a tool? I suspect obtaining the
architecture could be done much easier by parsing the run-line in python
scripts, we were just in too much rush to consider this possibility, and
calling a whole tool for just that sounds like an overkill. I think it would
be great to simplify this out.

Additionally, this whole architecture hassle kicks in only rarely. It is only
important to know the architecture because some projects have the same file
compiled for different architectures (we used to have it in Android, which
has binaries that are compiled for both the host machine and the target
machine, but for most projects just having a mangled name is already good
enough). So we can delay this discussion by splitting this out of the initial
patch, if you want, but that's extra work, of course, so please take the
above as more of a mumble than of a request.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
NoQ added a comment.

One more obvious observation regarding scan-build: because you are already
reading a compilation database, the whole tool is essentially usable in
combination with the current scan-build-py (which can create compilation
databases). So it's already quite usable, but you're forced to do regular
analysis before cross-translation-unit analysis.

So all we need is an extra mode in scan-build-py that does the interception
and leaves the rest of the work to us, either by piping commands to us
directly, or by providing us with a compilation database (if
`--intercept-first` is passed). Having some kind of `--intercept-only` would
solve half of the problems.

Of course, ideally the logic that adds the `-analyzer*` options should also
be re-used, but for usability it isn't immediately necessary.

https://reviews.llvm.org/D30691
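The workflow NoQ describes above (produce a compilation database once, then run the two CTU passes over it) can be sketched as a tiny driver. The orchestration below is hypothetical glue, not part of the patch; only the script names and their `-b`/`-p` flags come from the tools under review:

```python
def ctu_commands(compile_db='compile_commands.json', ctu_dir='.ctu'):
    """Commands for the two CTU passes over an existing compilation
    database: pass 1 (ctu-build.py) dumps ASTs and the function map,
    pass 2 (ctu-analyze.py) runs the analyzer with CTU enabled."""
    return [['python', script, '-b', compile_db, '-p', ctu_dir]
            for script in ('ctu-build.py', 'ctu-analyze.py')]


for cmd in ctu_commands():
    print(' '.join(cmd))
    # subprocess.check_call(cmd)  # uncomment with the tools on PATH
```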
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
danielmarjamaki added inline comments.

Comment at: tools/xtu-analysis/xtu-analyze.py:29
+
+threading_factor = int(multiprocessing.cpu_count() * 1.5)
+analyser_output_formats = ['plist-multi-file', 'plist', 'plist-html',

gerazo wrote:
> danielmarjamaki wrote:
> > does this mean that if there are 4 cores this script will by default use
> > 6 threads? isn't that too aggressive?
> Yes, it does mean that. You are right, it is aggressive. To be honest, the
> xtu-build step is more io intensive where it really makes sense. In the
> xtu-analyze step, it is marginal when big files are compiled (more cpu,
> less io). We will put this one back to 1.0 instead.

I see. Feel free to use such a ratio.

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
gerazo added inline comments.

Comment at: tools/xtu-analysis/xtu-analyze.py:29

  +
  +threading_factor = int(multiprocessing.cpu_count() * 1.5)
  +analyser_output_formats = ['plist-multi-file', 'plist', 'plist-html',

danielmarjamaki wrote:
> does this mean that if there are 4 cores this script will by default use 6 threads? isn't that too aggressive?

Yes, it does mean that. You are right, it is aggressive. To be honest, the xtu-build step is more I/O-intensive, which is where it really makes sense. In the xtu-analyze step the gain is marginal when big files are compiled (more CPU, less I/O). We will put this one back to 1.0 instead.

https://reviews.llvm.org/D30691
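The thread-count policy being discussed can be factored out into a small helper. This is an illustrative Python 3 sketch, not part of the patch; `worker_count` is a hypothetical name, and the default factor is 1.0 as agreed above:

```python
import multiprocessing


def worker_count(factor=1.0):
    """Return the number of analysis workers to spawn.

    A factor above 1.0 oversubscribes the CPUs, which can pay off for
    I/O-bound phases such as xtu-build; 1.0 is a safer default for the
    CPU-bound xtu-analyze phase.
    """
    cpus = multiprocessing.cpu_count()
    # Always spawn at least one worker, even for tiny factors.
    return max(1, int(cpus * factor))
```

On a 4-core machine, `worker_count(1.5)` gives the 6 threads questioned in the review, while the default gives 4.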
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun updated this revision to Diff 93658.
xazax.hun marked 11 inline comments as done.
xazax.hun added a comment.

- Fixed some of the review comments and some other cleanups.

https://reviews.llvm.org/D30691

Files:
  include/clang/AST/ASTContext.h
  include/clang/AST/Decl.h
  include/clang/AST/Mangle.h
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  lib/AST/ASTContext.cpp
  lib/AST/ASTImporter.cpp
  lib/AST/ItaniumMangle.cpp
  lib/Basic/SourceManager.cpp
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/Inputs/xtu-chain.cpp
  test/Analysis/Inputs/xtu-other.cpp
  test/Analysis/xtu-main.cpp
  tools/CMakeLists.txt
  tools/clang-cmdline-arch-extractor/CMakeLists.txt
  tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp
  tools/clang-func-mapping/CMakeLists.txt
  tools/clang-func-mapping/ClangFnMapGen.cpp
  tools/scan-build-py/libscanbuild/runner.py
  tools/xtu-analysis/xtu-analyze.py
  tools/xtu-analysis/xtu-build.py

Index: tools/xtu-analysis/xtu-build.py
===================================================================
--- /dev/null
+++ tools/xtu-analysis/xtu-build.py
@@ -0,0 +1,195 @@
+#!/usr/bin/env python
+
+import argparse
+import io
+import json
+import multiprocessing
+import os
+import re
+import signal
+import subprocess
+import string
+
+threading_factor = int(multiprocessing.cpu_count() * 1.5)
+timeout = 86400
+
+parser = argparse.ArgumentParser(
+    description='Executes 1st pass of XTU analysis')
+parser.add_argument('-b', required=True, dest='buildlog',
+                    metavar='build.json',
+                    help='Use a JSON Compilation Database')
+parser.add_argument('-p', metavar='preanalyze-dir', dest='xtuindir',
+                    help='Use directory for generating preanalyzation data '
+                         '(default=".xtu")',
+                    default='.xtu')
+parser.add_argument('-j', metavar='threads', dest='threads',
+                    help='Number of threads used (default=' +
+                    str(threading_factor) + ')',
+                    default=threading_factor)
+parser.add_argument('-v', dest='verbose', action='store_true',
+                    help='Verbose output of every command executed')
+parser.add_argument('--clang-path', metavar='clang-path', dest='clang_path',
+                    help='Set path of clang binaries to be used '
+                         '(default taken from CLANG_PATH envvar)',
+                    default=os.environ.get('CLANG_PATH'))
+parser.add_argument('--timeout', metavar='N',
+                    help='Timeout for build in seconds (default: %d)' %
+                    timeout,
+                    default=timeout)
+mainargs = parser.parse_args()
+
+if mainargs.clang_path is None:
+    clang_path = ''
+else:
+    clang_path = os.path.abspath(mainargs.clang_path)
+if mainargs.verbose:
+    print 'XTU uses clang dir: ' + \
+        (clang_path if clang_path != '' else '')
+
+buildlog_file = open(mainargs.buildlog, 'r')
+buildlog = json.load(buildlog_file)
+buildlog_file.close()
+
+src_pattern = re.compile('.*\.(C|c|cc|cpp|cxx|ii|m|mm)$', re.IGNORECASE)
+src_2_dir = {}
+src_2_cmd = {}
+src_order = []
+cmd_2_src = {}
+cmd_order = []
+for step in buildlog:
+    if src_pattern.match(step['file']):
+        if step['file'] not in src_2_dir:
+            src_2_dir[step['file']] = step['directory']
+        if step['file'] not in src_2_cmd:
+            src_2_cmd[step['file']] = step['command']
+            src_order.append(step['file'])
+        if step['command'] not in cmd_2_src:
+            cmd_2_src[step['command']] = [step['file']]
+            cmd_order.append(step['command'])
+        else:
+            cmd_2_src[step['command']].append(step['file'])
+
+
+def clear_file(filename):
+    try:
+        os.remove(filename)
+    except OSError:
+        pass
+
+
+def get_command_arguments(cmd):
+    had_command = False
+    args = []
+    for arg in cmd.split():
+        if had_command and not src_pattern.match(arg):
+            args.append(arg)
+        if not had_command and arg.find('=') == -1:
+            had_command = True
+    return args
+
+
+def generate_ast(source):
+    cmd = src_2_cmd[source]
+    args = get_command_arguments(cmd)
+    arch_command = os.path.join(clang_path, 'clang-cmdline-arch-extractor') + \
+        ' ' + string.join(args, ' ') + ' ' + source
+    if mainargs.verbose:
+        print arch_command
+    arch_output = subprocess.check_output(arch_command, shell=True)
+    arch = arch_output[arch_output.rfind('@')+1:].strip()
+    ast_joined_path = os.path.join(mainargs.xtuindir,
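For reference, the flag-filtering and '@'-marker slicing used by `generate_ast` in the patch above can be reproduced in isolation. This is a Python 3 sketch, not part of the patch: `extract_arch` is a hypothetical name, and the "take everything after the last '@'" convention is an assumption based on the slicing in the script:

```python
import re

# Mirrors the source-file extensions the build-log scanner matches.
SRC_PATTERN = re.compile(r'.*\.(C|c|cc|cpp|cxx|ii|m|mm)$', re.IGNORECASE)


def get_command_arguments(cmd):
    """Drop the compiler executable, leading env-var assignments, and
    source files from a compile command, keeping only the flags."""
    had_command = False
    args = []
    for arg in cmd.split():
        if had_command and not SRC_PATTERN.match(arg):
            args.append(arg)
        if not had_command and '=' not in arg:
            had_command = True
    return args


def extract_arch(tool_output):
    """The arch-extractor tool prints the target after a '@' marker;
    take everything after the last '@' and strip whitespace."""
    return tool_output[tool_output.rfind('@') + 1:].strip()
```

For example, `get_command_arguments('CC=clang clang -c -O2 foo.cpp')` keeps only `['-c', '-O2']`.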
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
danielmarjamaki added inline comments.

Comment at: include/clang/AST/ASTContext.h:42

  #include "clang/Basic/Specifiers.h"
  +#include "clang/Basic/VersionTuple.h"
  #include "llvm/ADT/APSInt.h"

I don't see why this is included here.

Comment at: include/clang/AST/Mangle.h:55

  const ManglerKind Kind;
  + // Used for cross tranlsation unit analysis.
  + // To reduce the risk of function name collision in C projects, we force

tranlsation => translation

Comment at: lib/AST/ASTContext.cpp:1457

  +return nullptr;
  + for (Decl *D : DC->decls()) {
  +const auto *SubDC = dyn_cast(D);

could add a "const" perhaps

Comment at: lib/AST/ASTContext.cpp:1491

  + std::string MangledFnName = getMangledName(FD, MangleCtx.get());
  + std::string ExternalFunctionMap = (XTUDir + "/externalFnMap.txt").str();
  + ASTUnit *Unit = nullptr;

as far as I see, ExternalFunctionMap can be moved into the subscope.

Comment at: lib/AST/ASTContext.cpp:1500

  + std::string FunctionName, FileName;
  + while (ExternalFnMapFile >> FunctionName >> FileName)
  +FunctionFileMap[FunctionName] = (XTUDir + "/" + FileName).str();

I would recommend adding a parser that makes sure the file data is correct/sane. If there is some load/save mismatch now or in the future, or if the user chooses to tweak the file for some reason, the behaviour could be wrong.

Comment at: lib/AST/ASTContext.cpp:1502

  +FunctionFileMap[FunctionName] = (XTUDir + "/" + FileName).str();
  + ExternalFnMapFile.close();
  +}

well.. I believe it's redundant to close this explicitly since the std::ifstream destructor should do that. But it doesn't hurt doing it.

Comment at: lib/Basic/SourceManager.cpp:2028

  /// \brief Determines the order of 2 source locations in the translation unit.
  +/// FIXME: It also works when two locations are from different translation unit.
  +///In that case it will return *some* order.

I am not good at English. But I believe unit => units.

Comment at: lib/Basic/SourceManager.cpp:2126

  }
  - llvm_unreachable("Unsortable locations found");
  + // FIXME: Source locations from different translation unit.
  + return LOffs.first < ROffs.first;

I am not good at English. But I believe unit => units.

Comment at: lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:48

  +#include
  +#include
  +#include

I believe sys/file.h and unistd.h are POSIX includes.

Comment at: lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:433

  +return;
  + int fd = open(fileName.c_str(), O_CREAT|O_WRONLY|O_APPEND, 0666);
  + flock(fd, LOCK_EX);

this is POSIX file I/O

Comment at: tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp:56

  + if (Sources.empty())
  +return 1;

Maybe use EXIT_FAILURE instead.

Comment at: tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp:66

  + for (StringRef SourceFile : Sources) {
  +char *Path = realpath(SourceFile.data(), nullptr);
  +if (Path)

This is a POSIX function as far as I know.

Comment at: tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp:72

  + return 0;
  +}

EXIT_SUCCESS is also possible; however, I guess that is 0 on all implementations.

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:32

  +#include
  +#include
  +#include

POSIX includes

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:153

  + if (!FileName.empty())
  +switch (FD->getLinkageInternal()) {

I do not see how FileName could be empty. It always starts with "/ast/", right?

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:193

  + lockedWrite(BuildDir + "/definedFns.txt", DefinedFuncsStr.str());
  + std::stringstream CFGStr;
  + for (auto : CG) {

if an STL stream will be used, I recommend std::ostringstream instead

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:244

  +errs() << "Exactly one XTU dir should be provided\n";
  +return 1;
  + }

maybe use EXIT_FAILURE

Comment at: tools/clang-func-mapping/ClangFnMapGen.cpp:257

  + Tool.run(newFrontendActionFactory().get());
  +}

no return.

Comment at: tools/xtu-analysis/xtu-analyze.py:29

  +
  +threading_factor = int(multiprocessing.cpu_count() * 1.5)
  +analyser_output_formats = ['plist-multi-file', 'plist', 'plist-html',

does this mean that if there are 4 cores this script will by default use 6 threads? isn't that too aggressive?

Repository: rL LLVM
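The sanity-checking parser suggested for the on-disk function map could look roughly like this. This is a Python 3 sketch, not part of the patch: `parse_fn_map` is a hypothetical name, and the two-token "function file" line format is inferred from the `ExternalFnMapFile >> FunctionName >> FileName` loading loop quoted above:

```python
import os


def parse_fn_map(path, xtu_dir):
    """Parse an externalFnMap.txt-style index with sanity checks.

    Each non-empty line must contain exactly two whitespace-separated
    tokens (mangled function name, relative AST file name), and a
    duplicate function entry is rejected instead of being silently
    overwritten, so a load/save mismatch or a hand-edited file fails
    loudly rather than misbehaving.
    """
    fn_map = {}
    with open(path) as f:
        for lineno, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            fields = line.split()
            if len(fields) != 2:
                raise ValueError('%s:%d: expected "function file", got %r'
                                 % (path, lineno, line))
            fn, filename = fields
            if fn in fn_map:
                raise ValueError('%s:%d: duplicate entry for %s'
                                 % (path, lineno, fn))
            fn_map[fn] = os.path.join(xtu_dir, filename)
    return fn_map
```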
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun added a comment.

Guide to run the two-pass analysis:

Process
-------

These are the steps of XTU analysis:

1. The `xtu-build.py` script uses your compilation database and extracts all necessary information from the files compiled. It puts all its generated data into a folder (`.xtu` by default).
2. The `xtu-analyze.py` script uses all the previously generated data and executes the analysis. It needs the clang binary and scan-build-py's analyze-cc in order to do that. The output is put into a folder (`.xtu-out` by default) where both the analysis reports and the generated output of the invoked commands are stored.

Usage example
-------------

1. Generate a compilation database and change into your project's build directory.
2. `xtu-build.py -b build.json -v --clang-path <clang-path>`
3. `xtu-analyze.py -b build.json -v --clang-path <clang-path> --analyze-cc-path <analyze-cc-path>`

Repository: rL LLVM

https://reviews.llvm.org/D30691
[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis
xazax.hun created this revision.
Herald added a subscriber: mgorny.

This patch adds support for naive cross translational unit analysis. The aim of this patch is to be minimal, to enable development of the feature on top of the tree. This patch should be an NFC in case XTUDir is not provided by the user. When XTUDir is provided:

- In case a function definition is not available, it will be looked up in a textual index to see whether it is available in another TU.
- The AST dump of the other TU will be loaded and the function definition will be inlined.

One of the main limitations is that the coverage pattern of the analysis will change when this feature is enabled. For this reason, in the future it might be better to include some heuristics to prefer examining shorter execution paths over longer ones. Until then, this feature is not recommended to be turned on by users unless they have already fixed the important issues with it turned off.

We will cover a more detailed analysis of this patch soon in our EuroLLVM talk: http://llvm.org/devmtg/2017-03//2017/02/20/accepted-sessions.html#7

We will talk about how this works, as well as the memory usage, analysis time, coverage pattern change, limitations of the ASTImporter, how the length of the bug paths changed, and a lot more. Feel free to skip the review until after the talk, but we wanted to make the code available in an easy-to-review format before the conference.
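The lookup sequence described above (consult the textual index when a definition is missing, load the defining TU's AST dump once, then inline the definition) can be sketched as follows. This is illustrative Python 3 only; all names are hypothetical stand-ins for the C++ implementation in ASTContext/CallEvent:

```python
def find_external_definition(mangled_name, fn_map, loaded_units, load_ast):
    """Resolve a function with no local definition via the CTU index.

    fn_map maps mangled names to AST-dump paths (the externalFnMap.txt
    contents); loaded_units caches already-loaded TUs so each AST dump
    is parsed at most once; load_ast is a callable standing in for the
    real ASTUnit loader.  Returns None when the index has no entry, in
    which case the analyzer keeps its conservative non-CTU behaviour.
    """
    ast_path = fn_map.get(mangled_name)
    if ast_path is None:
        return None  # unknown function: evaluate the call as before
    if ast_path not in loaded_units:
        loaded_units[ast_path] = load_ast(ast_path)
    # In the real implementation the FunctionDecl would now be imported
    # into the analyzed TU with the ASTImporter; here we just return
    # the loaded unit.
    return loaded_units[ast_path]
```

The cache matters for both memory and analysis time, since the same external TU is typically consulted for many call sites.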
Repository: rL LLVM

https://reviews.llvm.org/D30691

Files:
  include/clang/AST/ASTContext.h
  include/clang/AST/Decl.h
  include/clang/AST/Mangle.h
  include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
  include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
  include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
  lib/AST/ASTContext.cpp
  lib/AST/ASTImporter.cpp
  lib/AST/ItaniumMangle.cpp
  lib/Basic/SourceManager.cpp
  lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
  lib/StaticAnalyzer/Core/CMakeLists.txt
  lib/StaticAnalyzer/Core/CallEvent.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
  test/Analysis/Inputs/externalFnMap.txt
  test/Analysis/Inputs/xtu-chain.cpp
  test/Analysis/Inputs/xtu-other.cpp
  test/Analysis/xtu-main.cpp
  tools/CMakeLists.txt
  tools/clang-cmdline-arch-extractor/CMakeLists.txt
  tools/clang-cmdline-arch-extractor/ClangCmdlineArchExtractor.cpp
  tools/clang-func-mapping/CMakeLists.txt
  tools/clang-func-mapping/ClangFnMapGen.cpp
  tools/scan-build-py/libscanbuild/runner.py
  tools/xtu-analysis/xtu-analyze.py
  tools/xtu-analysis/xtu-build.py