Thanks Gábor. This looks great.
One last thing that occurred to me is that we should not create a ModelInjector
at all unless the model-path is specified. This allows us to preserve the
existing behavior in the analyzer while we continue to evolve this new
functionality.
Specifically:
std::unique_ptr<AnalysisASTConsumer>
-ento::CreateAnalysisConsumer(const Preprocessor &pp, const std::string &outDir,
- AnalyzerOptionsRef opts,
- ArrayRef<std::string> plugins) {
+ento::CreateAnalysisConsumer(CompilerInstance &CI) {
// Disable the effects of '-Werror' when using the AnalysisConsumer.
- pp.getDiagnostics().setWarningsAsErrors(false);
+ CI.getPreprocessor().getDiagnostics().setWarningsAsErrors(false);
- return llvm::make_unique<AnalysisConsumer>(pp, outDir, opts, plugins);
+ return llvm::make_unique<AnalysisConsumer>(
+ CI.getPreprocessor(), CI.getFrontendOpts().OutputFile,
+ CI.getAnalyzerOpts(), CI.getFrontendOpts().Plugins,
+ new ModelInjector(CI));
}
We can query 'opts' to see if model-path is empty; if it is we can pass nullptr
instead of 'new ModelInjector(CI)'.
> On Aug 12, 2014, at 1:22 AM, Gábor Horváth <[email protected]> wrote:
>
> Hi Ted,
>
> Thank you for the review, I have altered the patch accordingly, and also
> added the patch to follow up the API change in clang tidy.
>
>
> On 12 August 2014 08:44, Ted Kremenek <[email protected]
> <mailto:[email protected]>> wrote:
>
>> On Aug 1, 2014, at 1:44 AM, Gábor Horváth <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> Hi Ted,
>>
>> Thank you for the review.
>>
>> On 1 August 2014 07:25, Ted Kremenek <[email protected]
>> <mailto:[email protected]>> wrote:
>> Hi Gábor,
>>
>> This is looking good to me. Some minor nits/comments:
>>
>> - Please add doxygen comments for the CodeInjector class.
>>
>> Done.
>>
>> - For the BugReporter patch, are there tests for that functionality change?
>> I saw tests in the other patch, but not that one. It's fine to separate the
>> review of that change before the primary change goes in, but I was curious.
>>
>> Well, it may be a bit complicated. I deleted some code in BugReporter to not
>> to discard bug reports that are in a model file, and the plist part of the
>> test case only pass if that patch is applied (if the patch is not applied
>> the nullpointer dereference warning that has the position in the modelfile
>> will be discarded. In the long term it would be better to report these
>> errors elsewhere but it is not supported yet by the bugreporter patch). I
>> can move the plist check into a separate testcase and add that case to the
>> BugReporter patch instead. The division by zero test should work without the
>> BugReporter patch.
>
> Ok, this make sense. Can you clarify what you mean by "better to report
> these errors elsewhere"?
>
>
> It might be confusing for the user, if the execution path of the bug report
> contains a location that is inside a model file so the report contains codes
> and files that are not present in the analysed project. So it might be more
> user friendly if the locations that are inside a model file would be excluded
> from the reported execution path, however if this is not an issue I am ok
> with reporting those locations as well.
>
>>
>> - As for breaking code in the 'extra' repository, LLVM-internal API is not
>> sacrosanct. If we break the 'extra' projects we just need to update them,
>> but I'm not certain if that is possible in this case.
>>
>> As far as I can remember it would be a straightforward fix in the extra
>> repository. Clang-tidy calls CreateAnalysisConsumer.
>
> Sounds good. Let's get the right API and just fix up clang-tidy.
>
>>
>> - For comments, please consistently use sentence casing and end with
>> periods, and for type names use the appropriate casing. For example:
>>
>> + // modules create a separate compilerinstance for parsing modules, maybe
>> it is
>> + // for reason so I mimic this behavior
>> + CompilerInstance Instance;
>> ...
>>
>> This comment looks a bit suspect, since it seems like a question to
>> yourself. Here you use the word "I"; who is "I" in the context of this
>> code? The comment also seems like an unanswered question. Is this a stale
>> comment?
>>
>>
>> Done, the comment was improved.
>>
>> Another example is this comment:
>>
>> + // FIXME: double memoization is redundant. Here and in bodyfarm.
>> + llvm::StringMap<Stmt *> Bodies;
>>
>> This can be made slightly cleaner. For example:
>>
>> + // FIXME: Double memorization is redundant, with
>> + /// memoization both here and in BodyFarm.
>> + llvm::StringMap<Stmt *> Bodies;
>>
>> Done.
>>
>> - Only use doxygen comments for documentation. For example:
>>
>> + if (notzero_notmodeled(p)) {
>> + /// There is no information about the value of p, because
>> + /// notzero_notmodeled is not modeled and the function definition
>> + /// is not available.
>> + int j = 5 / p; // expected-warning {{Division by zero}}
>> + }
>>
>> In this case we should use '//', not '///'. The former are true comments,
>> and the latter are candidates to be extracted for documentation.
>>
>>
>> Done.
>>
>> Overall, however, this is getting really close.
>>
>>
>> It is great.
>>
>> Thanks,
>> Gábor
>
> Wonderful. The rest of my comments are minor:
>
>> +/// \brief CodeInjector is an interface which is responsible forinjecting
>> AST of
>> +/// function definitions that may not be available in the original source.
>> +///
>> +/// The getBody function will be called each time the static analyzer
>> examines a
>> +/// function call that has no definition available in the current
>> translation
>> +/// unit. If the returned statement is not a nullpointer, it is assumed to
>> be
>> +/// the body of a function which will be used for the analysis. The source
>> of
>> +/// the body can be arbitrary, but it is advised to use memoization to avoid
>> +/// unnecessary reparsing of the external source that provides the body of
>> the
>> +/// functions.
>
>
> "forinjecting" -> "for injecting"
> "nullpointer" -> "null pointer"
>
>> +++ include/clang/StaticAnalyzer/Frontend/FrontendActions.h (working copy)
>> @@ -10,10 +10,16 @@
>> #ifndef LLVM_CLANG_GR_FRONTENDACTIONS_H
>> #define LLVM_CLANG_GR_FRONTENDACTIONS_H
>>
>> +#include <map>
>> +
>
>
> This "#include" of <map> doesn't seem needed. Neither is the one in
> ModelConsumer.h
>
>> +++ lib/StaticAnalyzer/Frontend/ModelConsumer.cpp (working copy)
>> @@ -0,0 +1,42 @@
>> +//===--- ModelConsumer.cpp - ASTConsumer for consuming model files
>> --------===//
>> +//
>> +// The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +///
>> +/// \file
>> +/// \brief This file implements an ASTConsumer for consuming model files.
>> +///
>> +/// This ASTConsumer handles the AST of a parsed model file. All top level
>> +/// function definitions will be collected from that model file for later
>> +/// retrieval during the static analyzis. The body of these functions will
>> not
>> +/// be injected into the ASTUnit of the analyzed translation unit. It will
>> be
>> +/// available through the BodyFarm which is utilized by the
>> AnalysisDeclContext
>> +/// class.
>> +///
>
> "analyzis" -> "analysis"
>
>> + // The instance wants to take ownership, however disablefree frontend
>> option
>> + // is set to true to avoid double free issues
>
> Use the actual casing for the option for technical precision:
>
> DisableFree
>
>> + /// \brief Synthetize a body for a declaration
>> + ///
>> + /// This method first looks up the appropriate model file based on the
>> + /// model-path configuration option and the name of the declaration that
>> is
>> + /// looked up. If no model were synthetized yet for a function with that
>> name
>> + /// it will create a new compiler instance to parse the model file using
>> the
>> + /// ASTContext, Preprocessor, SourceManager of the original compiler
>> instance.
>> + /// The former resources are shared between the two compiler instance, so
>> the
>> + /// newly created instance have to "leak" these objects, since they are
>> owned
>> + /// by the original instance.
>
> Synthetize -> Synthesize
> synthetized -> synthesized
>
>> + std::vector<std::unique_ptr<ASTUnit> > ModelAsts;
>
> I'd prefer this to be "ModelASTs", as 'AST' is an acronym.
>
>
> It was an unused member (from an earlier implementation) that I forgot to
> remove, but done now.
>
> Otherwise, this all looks great to me.
>
>>
>> Cheers,
>> Ted
>>
>> On Jul 30, 2014, at 3:29 AM, Gábor Horváth <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>> Hi Ted,
>>>
>>> Thank you for the review.
>>>
>>>
>>> On 30 July 2014 08:18, Ted Kremenek <[email protected]
>>> <mailto:[email protected]>> wrote:
>>> Hi Gábor,
>>>
>>> Thanks for making progress on this very promising enhancement to the
>>> analyzer. I have an assortment of comments, in no particular order:
>>>
>>> - ModelInjector.h and ModelConsumer.h
>>>
>>> There is a comment at the top of these files, but I think a bit more
>>> explanation is needed. For example:
>>>
>>> MetaConsumer.cpp:
>>>
>>> +// "Meta" ASTConsumer for consuming model files.
>>>
>>> That doesn't really explain anything. What does "Meta" in quotes mean? I
>>> think an explanation here on what this does is helpful when someone
>>> discovers this code for the first time.
>>>
>>> Similarly, we should add some high-level comments for CodeInjector.h and
>>> ModelInjector.h. We have a good start in ModelInjector.h:
>>>
>>> +/// \file
>>> +/// \brief Defines the clang::ento::ModelInjector class which implements
>>> the
>>> +/// clang::CodeInjector interface. This class is responsible for injecting
>>> +/// function definitions that were synthetized from model files.
>>> +///
>>>
>>> Let's consider expanding it:
>>>
>>> /// \brief This file defines the clang::ento::ModelInjector class which
>>> implements the
>>> /// clang::CodeInjector interface. This class is responsible for injecting
>>> /// function definitions that were synthesized from model files.
>>>
>>> /// Model files allow definitions of functions to be lazily constituted
>>> for functions
>>> /// which lack bodies in the original source code. This allows the
>>> analyzer
>>> /// to more precisely analyze code that calls such functions, analyzing the
>>> /// artificial definitions (which typically approximate the semantics of
>>> the
>>> /// called function) when called by client code. These definitions are
>>> /// reconstituted lazily, on-demand, by the static analyzer engine.
>>>
>>> CodeInjector.h provides some information, but it is a bit vague:
>>>
>>> +///
>>> +/// \file
>>> +/// \brief Defines the clang::CodeInjector interface which is responsible
>>> for
>>> +/// injecting AST of function definitions from external source.
>>> +///
>>>
>>> It's a bit unclear how this gets used. I think a bit of prose here would
>>> help clarify its role in the static analyzer. I also think the
>>> CodeInjector interface is also more abstract than the prose describes.
>>> There's nothing about CodeInjector's interface that requires the injected
>>> definitions to come from an external source. That's an implementation
>>> detail of a concrete subclass. Instead, all CodeInjector does is provide
>>> an interface that lazily provides definitions for functions and methods
>>> that may not be present in the original source.
>>>
>>> I have added some further documentation to address these issues.
>>>
>>>
>>>
>>> I'm also looking at the change to
>>> StaticAnalyzer/Frontend/FrontendActions.cpp, and wonder if we can simplify
>>> things:
>>>
>>>> +++ lib/StaticAnalyzer/Frontend/FrontendActions.cpp (working copy)
>>>> @@ -7,9 +7,11 @@
>>>> //
>>>>
>>>> //===----------------------------------------------------------------------===//
>>>>
>>>> +#include "clang/Frontend/CompilerInstance.h"
>>>> #include "clang/StaticAnalyzer/Frontend/FrontendActions.h"
>>>> -#include "clang/Frontend/CompilerInstance.h"
>>>> #include "clang/StaticAnalyzer/Frontend/AnalysisConsumer.h"
>>>> +#include "clang/StaticAnalyzer/Frontend/ModelConsumer.h"
>>>> +#include "ModelInjector.h"
>>>> using namespace clang;
>>>> using namespace ento;
>>>>
>>>> @@ -18,6 +20,14 @@
>>>> return CreateAnalysisConsumer(CI.getPreprocessor(),
>>>> CI.getFrontendOpts().OutputFile,
>>>> CI.getAnalyzerOpts(),
>>>> - CI.getFrontendOpts().Plugins);
>>>> + CI.getFrontendOpts().Plugins,
>>>> + new ModelInjector(CI));
>>>> }
>>>>
>>>
>>>
>>> It looks like CreateAnalysisConsumer just continues to grow more arguments,
>>> all which derive from using 'CI'. This seems silly, since this function is
>>> called in one place. Instead of intro ducting a dependency on
>>> ModelInjector.h in this file, we can just sink these arguments into
>>> CreateAnalysisConsumer() itself, resulting in:
>>>
>>> return CreateAnalysisConsumer(CI);
>>>
>>> and let CreateAnalysisConsumer() do all that boilerplate.
>>>
>>> That was my original idea as well but it broke the compilation of some code
>>> in extra repository and I wasn't sure if it is ok to break the API with
>>> this patch. But I find it cleaner this way so I modified it in this
>>> iteration.
>>>
>>>
>>> Next, let's look at the change to FrontendAction:
>>>
>>>> class FrontendAction {
>>>> + /// Is this action invoked on a model file? Model files are incomplete
>>>> + /// translation units that relies on type information from another
>>>> translation
>>>> + /// unit. Check ParseModelFileAction for details.
>>>> + bool ModelFile;
>>>
>>> Perhaps "IsModelFile"? "ModelFile" sounds like it should be a reference to
>>> the file itself.
>>>
>>>> FrontendInputFile CurrentInput;
>>>> std::unique_ptr<ASTUnit> CurrentASTUnit;
>>>> CompilerInstance *Instance;
>>>> @@ -105,7 +109,11 @@
>>>> /// @}
>>>>
>>>> public:
>>>> - FrontendAction();
>>>> + /// \brief Constructor
>>>> + ///
>>>> + /// \param modelFile determines whether the source files this action
>>>> invoked
>>>> + /// on should be treated as a model file. Defaults to false.
>>>> + FrontendAction(bool modelFile = false);
>>>
>>> It seems suboptimal to modify the interface of FrontendAction just for this
>>> one edge case. Instead of modifying the constructor arguments, we could
>>> default initialize "IsModelFile" to false, and have a setter to change it.
>>> For example:
>>>
>>> ParseModelFileAction::ParseModelFileAction(llvm::StringMap<Stmt *>
>>> &Bodies)
>>> : ASTFrontendAction(/*ModelFile=*/true), Bodies(Bodies) {}
>>>
>>> becomes:
>>>
>>> ParseModelFileAction::ParseModelFileAction(llvm::StringMap<Stmt *>
>>> &Bodies)
>>> : Bodies(Bodies) {
>>> IsModelFile = true;
>>> }
>>>
>>> Looking at this more, I wonder if we should modify FrontendAction at all.
>>> The only place where isModelParsingAction() is called is in one spot in
>>> CompilerInstance.cpp:
>>>
>>> if (hasSourceManager() && !Act.isModelParsingAction())
>>>
>>> It *might* be cleaner to just have a virtual member function in
>>> FrontendAction, which defaults to returning false, but is generic for all
>>> subclasses to override. Then we don't need the "IsModelFile" field in
>>> FrontendAction at all, and we just have ParseModelFileAction override that
>>> single member function. We could then name that method to be something a
>>> bit more generic. That would allow us to not touch FrontendAction at all
>>> except for providing that single virtual method that can be overridden in
>>> subclasses. I somewhat prefer this approach because it provides a cleaner
>>> separation of concerns between FrontendAction (which is defined
>>> libFrontend) and the static analyzer. That would also allow you to get rid
>>> of isModelParsingAction() entirely (replacing it with something more
>>> generic).
>>>
>>>
>>> You are right, it is much cleaner to use a virtual function, so I modified
>>> the patch to use that approach. The new virtual function has the same name
>>> because I have yet to find any better and more general name yet. Do you
>>> have an idea for a better name?
>>>
>>> As for the test case:
>>>
>>>> +typedef int* intptr;
>>>> +
>>>> +void modelled(intptr p);
>>>> +
>>>> +int main() {
>>>> + modelled(0);
>>>> + return 0;
>>>> +}
>>>
>>> Please add some comments in this test file explaining what is happening.
>>> Also, it would be great if this both used FileCheck (which it does now) but
>>> also verified the diagnostics so we get cross-checking of the output (we
>>> see this in some analyzer tests). It also makes it easier to understand
>>> the test.
>>>
>>> Also, is there a reason to break up the tests between
>>> model-suppress-falsepos.cpp and model-file.cpp? It seems like one test
>>> file will do fine; just clearly comment on what is happening for each test.
>>> I also recommend called the modeled function "modeledFunction" instead of
>>> "modelled" (which according to my spell checker has an additional 'l').
>>>
>>> I have merged the test files and also added some commets to explain what is
>>> going on. I have fixed the misspelling as well. The nullpointer dereference
>>> is only checked through plist because the point where the comment with the
>>> expected warning should be added is inside the model file and it did not
>>> work for me if the comment was in a separate file. If there is a different
>>> way to verify the warnings that are in a separate file and I did not find
>>> it, please let me know.
>>>
>>>
>>> As for the model files themselves:
>>>
>>>> Index: test/Analysis/modelled.model
>>>> ===================================================================
>>>> --- test/Analysis/modelled.model (revision 0)
>>>> +++ test/Analysis/modelled.model (working copy)
>>>> @@ -0,0 +1,3 @@
>>>> +void modelled(intptr p) {
>>>> + ++*p;
>>>> +}
>>>> \ No newline at end of file
>>>> Index: test/Analysis/notzero.model
>>>> ===================================================================
>>>> --- test/Analysis/notzero.model (revision 0)
>>>> +++ test/Analysis/notzero.model (working copy)
>>>
>>> Let's put these in a separate subdirectory, for example, "models", instead
>>> of mixing them with the tests. This way they really serve as "inputs" to
>>> the analyzer.
>>>
>>> I have moved the model files to tests/Inputs/Models.
>>>
>>>
>>> Overall this is looking good. I think the explanatory comments will really
>>> help people understand what this is doing, and I think changing how we
>>> thread the information through FrontendAction will help not introduce an
>>> artificial tainting of FrontendAction with concepts specific to the static
>>> analyzer.
>>>
>>> Cheers,
>>> Ted
>>>
>>>
>>> Thanks,
>>> Gábor
>>>
>>>
>>> On Jul 16, 2014, at 2:45 AM, Gábor Horváth <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On 14 July 2014 19:32, Anna Zaks <[email protected]
>>>> <mailto:[email protected]>> wrote:
>>>>
>>>>> On Jul 13, 2014, at 6:11 AM, Gábor Horváth <[email protected]
>>>>> <mailto:[email protected]>> wrote:
>>>>>
>>>>> Hi Anna,
>>>>>
>>>>> Thank you for the review. I have tweaked the test, so it no longer
>>>>> requires the error reporting tweak that is not done yet to pass. I have
>>>>> also added some high level comments to some files, if you think some
>>>>> information is lacking I will add them in the next iteration as well. The
>>>>> BugReporter patch is now separated into a different patch.
>>>>>
>>>>>
>>>>> On 11 July 2014 18:02, Anna Zaks <[email protected]
>>>>> <mailto:[email protected]>> wrote:
>>>>>
>>>>> For example, modeling functions should allow you to find bugs and
>>>>> suppress false positives outside of those functions. I would suggest
>>>>> adding a few of those tests first.
>>>>>
>>>>>
>>>>> How are the false positives suppressed? I did not find any resource on
>>>>> that. Found some analyzer attributes but I did not find them suitable for
>>>>> this purpuse at the first glance. But I think once the locations that are
>>>>> in a model file are omitted from the report path, the regular methods for
>>>>> suppressing false positives should work (and I will definitely add test
>>>>> case to ensure this once it is done).
>>>>>
>>>>
>>>> What I meant is that it is possible to construct a test where ability to
>>>> model a function would eliminate a false positive. This would be another
>>>> way to test your patch without worrying about BugReporter.
>>>>
>>>> I got it now, thansk. I have updated the patch with a test case where a
>>>> false positive case is eliminated by a model file.
>>>>
>>>> Thanks,
>>>> Gábor
>>>>
>>>>> Thanks,
>>>>> Gábor
>>>>> <api_modeling.patch><bugreporter.patch>
>>>>
>>>>
>>>> <api_modeling.patch><bugreporter.patch>
>>>
>>>
>>> <api_modeling.patch><bugreporter.patch>
>>
>>
>> <api_modeling.patch><bugreporter.patch>
>
>
>
> Thanks,
> Gábor
> <api_modeling.patch><bugreporter.patch><clangTidy.patch>
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits