Re: [PATCH] Modeling APIs in the Static Analyzer

Ted Kremenek Thu, 31 Jul 2014 22:27:07 -0700

Hi Gábor,

This is looking good to me.  Some minor nits/comments:


- Please add doxygen comments for the CodeInjector class.
- For the BugReporter patch, are there tests for that functionality change?  I 
saw tests in the other patch, but not that one.  It's fine to separate the 
review of that change before the primary change goes in, but I was curious.
- As for breaking code in the 'extra' repository, LLVM-internal API is not 
sacrosanct.  If we break the 'extra' projects we just need to update them, but 
I'm not certain if that is possible in this case.
- For comments, please consistently use sentence casing and end with periods, 
and for type names use the appropriate casing.  For example:

+  // modules create a separate compilerinstance for parsing modules, maybe it 
is
+  // for reason so I mimic this behavior
+  CompilerInstance Instance;
...

This comment looks a bit suspect, since it seems like a question to yourself.  
Here you use the word "I"; who is "I" in the context of this code?  The comment 
also seems like an unanswered question.  Is this a stale comment?

Another example is this comment:

+  // FIXME: double memoization is redundant. Here and in bodyfarm.
+  llvm::StringMap<Stmt *> Bodies;

This can be made slightly cleaner.  For example:

+  // FIXME: Double memorization is redundant, with
+  /// memoization both here and in BodyFarm.
+  llvm::StringMap<Stmt *> Bodies;


- Only use doxygen comments for documentation.  For example:

+  if (notzero_notmodeled(p)) {
+   /// There is no information about the value of p, because
+   /// notzero_notmodeled is not modeled and the function definition
+   /// is not available.
+    int j = 5 / p; // expected-warning {{Division by zero}}
+  }

In this case we should use '//', not '///'.  The former are true comments, and 
the latter are candidates to be extracted for documentation.

Overall, however, this is getting really close.

Cheers,
Ted

On Jul 30, 2014, at 3:29 AM, Gábor Horváth <[email protected]> wrote:

> Hi Ted,
> 
> Thank you for the review.
> 
> 
> On 30 July 2014 08:18, Ted Kremenek <[email protected]> wrote:
> Hi Gábor,
> 
> Thanks for making progress on this very promising enhancement to the 
> analyzer.  I have an assortment of comments, in no particular order:
> 
> - ModelInjector.h and ModelConsumer.h
> 
> There is a comment at the top of these files, but I think a bit more 
> explanation is needed.  For example:
> 
>   MetaConsumer.cpp:
> 
>     +// "Meta" ASTConsumer for consuming model files.
> 
> That doesn't really explain anything.  What does "Meta" in quotes mean?  I 
> think an explanation here on what this does is helpful when someone discovers 
> this code for the first time.
> 
> Similarly, we should add some high-level comments for CodeInjector.h and 
> ModelInjector.h.  We have a good start in ModelInjector.h:
> 
> +/// \file
> +/// \brief Defines the clang::ento::ModelInjector class which implements the
> +/// clang::CodeInjector interface. This class is responsible for injecting
> +/// function definitions that were synthetized from model files.
> +///
> 
> Let's consider expanding it:
> 
>  /// \brief This file defines the clang::ento::ModelInjector class which 
> implements the
>  /// clang::CodeInjector interface. This class is responsible for injecting
>  /// function definitions that were synthesized from model files.
> 
>  /// Model files allow definitions of functions to be lazily constituted for 
> functions
>  /// which lack bodies in the original source code.  This allows the analyzer
>  /// to more precisely analyze code that calls such functions, analyzing the
>  /// artificial definitions (which typically approximate the semantics of the
>  /// called function) when called by client code.  These definitions are
>  /// reconstituted lazily, on-demand, by the static analyzer engine.
> 
> CodeInjector.h provides some information, but it is a bit vague:
> 
> +///
> +/// \file
> +/// \brief Defines the clang::CodeInjector interface which is responsible for
> +/// injecting AST of function definitions from external source.
> +///
> 
> It's a bit unclear how this gets used.  I think a bit of prose here would 
> help clarify its role in the static analyzer.  I also think the CodeInjector 
> interface is also more abstract than the prose describes.  There's nothing 
> about CodeInjector's interface that requires the injected definitions to come 
> from an external source.  That's an implementation detail of a concrete 
> subclass.  Instead, all CodeInjector does is provide an interface that lazily 
> provides definitions for functions and methods that may not be present in the 
> original source.
> 
> I have added some further documentation to address these issues.
> 
>  
> 
> I'm also looking at the change to 
> StaticAnalyzer/Frontend/FrontendActions.cpp, and wonder if we can simplify 
> things:
> 
>> +++ lib/StaticAnalyzer/Frontend/FrontendActions.cpp (working copy)
>> @@ -7,9 +7,11 @@
>>  //
>>  
>> //===----------------------------------------------------------------------===//
>>  
>> +#include "clang/Frontend/CompilerInstance.h"
>>  #include "clang/StaticAnalyzer/Frontend/FrontendActions.h"
>> -#include "clang/Frontend/CompilerInstance.h"
>>  #include "clang/StaticAnalyzer/Frontend/AnalysisConsumer.h"
>> +#include "clang/StaticAnalyzer/Frontend/ModelConsumer.h"
>> +#include "ModelInjector.h"
>>  using namespace clang;
>>  using namespace ento;
>>  
>> @@ -18,6 +20,14 @@
>>    return CreateAnalysisConsumer(CI.getPreprocessor(),
>>                                  CI.getFrontendOpts().OutputFile,
>>                                  CI.getAnalyzerOpts(),
>> -                                CI.getFrontendOpts().Plugins);
>> +                                CI.getFrontendOpts().Plugins,
>> +                                new ModelInjector(CI));
>>  }
>>  
> 
> 
> It looks like CreateAnalysisConsumer just continues to grow more arguments, 
> all which derive from using 'CI'.  This seems silly, since this function is 
> called in one place.  Instead of intro ducting a dependency on 
> ModelInjector.h in this file, we can just sink these arguments into 
> CreateAnalysisConsumer() itself, resulting in:
> 
>   return CreateAnalysisConsumer(CI);
> 
> and let CreateAnalysisConsumer() do all that boilerplate.
> 
> That was my original idea as well but it broke the compilation of some code 
> in extra repository and I wasn't sure if it is ok to break the API with this 
> patch. But I find it cleaner this way so I modified it in this iteration.
> 
> 
> Next, let's look at the change to FrontendAction:
> 
>>  class FrontendAction {
>> +  /// Is this action invoked on a model file? Model files are incomplete
>> +  /// translation units that relies on type information from another 
>> translation
>> +  /// unit. Check ParseModelFileAction for details.
>> +  bool ModelFile;
> 
> Perhaps "IsModelFile"?  "ModelFile" sounds like it should be a reference to 
> the file itself.
> 
>>    FrontendInputFile CurrentInput;
>>    std::unique_ptr<ASTUnit> CurrentASTUnit;
>>    CompilerInstance *Instance;
>> @@ -105,7 +109,11 @@
>>    /// @}
>>  
>>  public:
>> -  FrontendAction();
>> +  /// \brief Constructor
>> +  ///
>> +  /// \param modelFile determines whether the source files this action 
>> invoked
>> +  /// on should be treated as a model file. Defaults to false.
>> +  FrontendAction(bool modelFile = false);
> 
> It seems suboptimal to modify the interface of FrontendAction just for this 
> one edge case.  Instead of modifying the constructor arguments, we could 
> default initialize "IsModelFile" to false, and have a setter to change it.  
> For example:
> 
>   ParseModelFileAction::ParseModelFileAction(llvm::StringMap<Stmt *> &Bodies)
>     : ASTFrontendAction(/*ModelFile=*/true), Bodies(Bodies) {}
> 
> becomes:
>  
>   ParseModelFileAction::ParseModelFileAction(llvm::StringMap<Stmt *> &Bodies)
>     : Bodies(Bodies)  {
>     IsModelFile = true;
>   }
> 
> Looking at this more, I wonder if we should modify FrontendAction at all.  
> The only place where isModelParsingAction() is called is in one spot in 
> CompilerInstance.cpp:
> 
>    if (hasSourceManager() && !Act.isModelParsingAction())
> 
> It *might* be cleaner to just have a virtual member function in 
> FrontendAction, which defaults to returning false, but is generic for all 
> subclasses to override.  Then we don't need the "IsModelFile" field in 
> FrontendAction at all, and we just have ParseModelFileAction override that 
> single member function.  We could then name that method to be something a bit 
> more generic.  That would allow us to not touch FrontendAction at all except 
> for providing that single virtual method that can be overridden in 
> subclasses.  I somewhat prefer this approach because it provides a cleaner 
> separation of concerns between FrontendAction (which is defined libFrontend) 
> and the static analyzer.  That would also allow you to get rid of 
> isModelParsingAction() entirely (replacing it with something more generic).
> 
> 
> You are right, it is much cleaner to use a virtual function, so I modified 
> the patch to use that approach. The new virtual function has the same name 
> because I have yet to find any better and more general name yet. Do you have 
> an idea for a better name?
>  
> As for the test case:
> 
>> +typedef int* intptr;
>> +
>> +void modelled(intptr p);
>> +
>> +int main() {
>> + modelled(0);
>> + return 0;
>> +}
> 
> Please add some comments in this test file explaining what is happening.  
> Also, it would be great if this both used FileCheck (which it does now) but 
> also verified the diagnostics so we get cross-checking of the output (we see 
> this in some analyzer tests).  It also makes it easier to understand the test.
> 
> Also, is there a reason to break up the tests between 
> model-suppress-falsepos.cpp and model-file.cpp?  It seems like one test file 
> will do fine; just clearly comment on what is happening for each test.  I 
> also recommend called the modeled function "modeledFunction" instead of 
> "modelled" (which according to my spell checker has an additional 'l'). 
> 
> I have merged the test files and also added some commets to explain what is 
> going on. I have fixed the misspelling as well. The nullpointer dereference 
> is only checked through plist because the point where the comment with the 
> expected warning should be added is inside the model file and it did not work 
> for me if the comment was in a separate file. If there is a different way to 
> verify the warnings that are in a separate file and I did not find it, please 
> let me know.
>  
> 
> As for the model files themselves:
> 
>> Index: test/Analysis/modelled.model
>> ===================================================================
>> --- test/Analysis/modelled.model  (revision 0)
>> +++ test/Analysis/modelled.model  (working copy)
>> @@ -0,0 +1,3 @@
>> +void modelled(intptr p) {
>> + ++*p;
>> +}
>> \ No newline at end of file
>> Index: test/Analysis/notzero.model
>> ===================================================================
>> --- test/Analysis/notzero.model (revision 0)
>> +++ test/Analysis/notzero.model (working copy)
> 
> Let's put these in a separate subdirectory, for example, "models", instead of 
> mixing them with the tests.  This way they really serve as "inputs" to the 
> analyzer.
> 
> I have moved the model files to tests/Inputs/Models.
>  
> 
> Overall this is looking good.  I think the explanatory comments will really 
> help people understand what this is doing, and I think changing how we thread 
> the information through FrontendAction will help not introduce an artificial 
> tainting of FrontendAction with concepts specific to the static analyzer.
> 
> Cheers,
> Ted
> 
> 
> Thanks,
> Gábor
>  
> 
> On Jul 16, 2014, at 2:45 AM, Gábor Horváth <[email protected]> wrote:
> 
>> 
>> 
>> 
>> On 14 July 2014 19:32, Anna Zaks <[email protected]> wrote:
>> 
>>> On Jul 13, 2014, at 6:11 AM, Gábor Horváth <[email protected]> wrote:
>>> 
>>> Hi Anna,
>>> 
>>> Thank you for the review. I have tweaked the test, so it no longer requires 
>>> the error reporting tweak that is not done yet to pass. I have also added 
>>> some high level comments to some files, if you think some information is 
>>> lacking I will add them in the next iteration as well. The BugReporter 
>>> patch is now separated into a different patch. 
>>> 
>>> 
>>> On 11 July 2014 18:02, Anna Zaks <[email protected]> wrote:
>>> 
>>> For example, modeling functions should allow you to find bugs and suppress 
>>> false positives outside of those functions. I would suggest adding a few of 
>>> those tests first.
>>> 
>>> 
>>> How are the false positives suppressed? I did not find any resource on 
>>> that. Found some analyzer attributes but I did not find them suitable for 
>>> this purpuse at the first glance. But I think once the locations that are 
>>> in a model file are omitted from the report path, the regular methods for 
>>> suppressing false positives should work (and I will definitely add test 
>>> case to ensure this once it is done).
>>> 
>> 
>> What I meant is that it is possible to construct a test where ability to 
>> model a function would eliminate a false positive. This would be another way 
>> to test your patch without worrying about BugReporter.
>> 
>> I got it now, thansk. I have updated the patch with a test case where a 
>> false positive case is eliminated by a model file.
>> 
>> Thanks,
>> Gábor
>>  
>>> Thanks,
>>> Gábor
>>> <api_modeling.patch><bugreporter.patch>
>> 
>> 
>> <api_modeling.patch><bugreporter.patch>
> 
> 
> <api_modeling.patch><bugreporter.patch>

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [PATCH] Modeling APIs in the Static Analyzer

Reply via email to