[PATCH] D54149: [Analyzer] [WIP] Standard C++ library functions checker for the std::find() family

2019-11-29 Thread Balogh, Ádám via Phabricator via cfe-commits
baloghadamsoftware added a comment.

The modeling framework in its current state cannot be used for detailed 
modeling of complex functions. Even detailed modeling (including handling of 
`GDM` data for checkers) of standard C functions are done manually: see `C 
string` and `stream` checkers. Template functions of the `C++ STL` are more 
complex, creating a generic modeling framework would be huge multi-year project 
for which we currently lack the resources. So we decided to model these 
functions manually: Model STL Algoirthms to improve the iterator checkers 



Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54149/new/

https://reviews.llvm.org/D54149



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D54149: [Analyzer] [WIP] Standard C++ library functions checker for the std::find() family

2018-11-30 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.
Herald added a subscriber: gamesh411.

Such checker is useful. It'd be great to add a lot of "pure" functions into it, 
simply for the sake of pointing out that they have no side effects, i.e. avoid 
unnecessary invalidation.

The bound-to-argument constraint is useful. It avoids the alpha-renaming 
problem by assigning the correct value right away instead of stuffing an 
aliasing condition into the solver.

I still believe that the idea to split the state on `std::find` is 
questionable, no matter how exactly it's implemented. `std::find` does more 
than tells whether the element is in the container, so calling it just for its 
other effect (to obtain the iterator to the element) when you already know that 
the element is in the container is not an indication of a dead code, even if it 
is a bad code smell in most cases. I think it is very likely that people will 
come to us with false positives of that kind and we'll have to revert this 
behavior. Would you be OK to keep support for `std::find()` under an option, 
off by default but available when the user opts into lint checks? This could be 
implemented by making a new check kind (eg., 
`CK_OptimisticStdCXXLibraryFunctionsChecker`).

Random thought: i still think that implementing the "object value id" thing (a 
symbol-like id that is associated with the memory region of the object and gets 
invalidated when the object is logically mutated but gets copied as-is to the 
new object upon copy/move-constructors or assignments) is a good way to 
refactor all these things. But this work seems orthogonal to what you did here 
(i.e., we'll have to perform the copy of the "id" here, but that's just a 
separate task). So this patch shouldn't be blocked on that.

Another random thought: it'd be great to make a bug reporter visitor that 
highlights state splits produced by this checker. Eg., in your case it would 
produce an event note that says something like "Assuming object is not present 
in iterator range". Otherwise the user would be unable to understand why do you 
think that the iterator is bad. Or for, say, `ispunct()` it would say "Assuming 
character is punctuation symbol".




Comment at: lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp:159-161
+ProgramStateRef
+applyAsBoundToArgument(ProgramStateRef State, const CallEvent ,
+   const FunctionSummaryTy ) const;

I think it's important to document that bound-to-argument should always be used 
instead of compares-to-argument when the comparison operator is == and the 
constraint is applied to return values of fully evaluated functions.



Comment at: lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp:375-376
+  OtherV = SVB.evalCast(OtherV, T, OtherT);
+  State = State->BindExpr(Call.getOriginExpr(), Call.getLocationContext(),
+  OtherV);
+  return State;

H.

`std::find` is a function that returns a C++ object by value.

For such functions it is not enough to bind the returned prvalue of the object. 
It is important to realize that while it is an equal iterator, it's not the 
same object. Which is why it is necessary to foresee the storage for the newly 
constructed iterator, so that destruction/materialization/copy elision worked 
correctly. See the respective logic in `ExprEngine::bindReturnValue()`:

```
  lang=c++
   611   if (auto RTC = getCurrentCFGElement().getAs()) {
   612 // Conjure a temporary if the function returns an object by value.
   613 SVal Target;
   614 assert(RTC->getStmt() == Call.getOriginExpr());
   615 EvalCallOptions CallOpts; // FIXME: We won't really need those.
   616 std::tie(State, Target) =
   617 prepareForObjectConstruction(Call.getOriginExpr(), State, LCtx,
   618  RTC->getConstructionContext(), 
CallOpts);
   // ...
   627   } // ...
```

It's bad that checkers currently need to do that manually when they implement 
`evalCall()` for C++ functions that return objects. We need to come up with a 
better API, eg. `State->BindReturnValue(same arguments)` that prepares for 
object construction (i.e., move the part of the functionality of 
`ExprEngine::bindReturnValue()` that's responsible for binding into a 
`ProgramState` method, and only keep the functionality that's responsible for 
conjuring the value).



Comment at: test/Analysis/std-cxx-library-functions.cpp:7-11
+void test_find(std::vector V, int n) {
+  const std::vector::iterator b = V.begin(), e = V.end();
+  clang_analyzer_eval(std::find(b, e, n) == e); // expected-warning{{TRUE}}
+  // expected-warning@-1{{FALSE}}
+}

I think that's a great standalone test.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54149/new/

https://reviews.llvm.org/D54149



___
cfe-commits mailing 

[PATCH] D54149: [Analyzer] [WIP] Standard C++ library functions checker for the std::find() family

2018-11-06 Thread Balogh , Ádám via Phabricator via cfe-commits
baloghadamsoftware created this revision.
baloghadamsoftware added a reviewer: NoQ.
baloghadamsoftware added a project: clang.
Herald added subscribers: donat.nagy, Szelethus, mikhail.ramalho, a.sidorin, 
rnkovacs, szepet, whisperity.
Herald added a reviewer: george.karpenkov.

Repository:
  rC Clang

https://reviews.llvm.org/D54149

Files:
  include/clang/StaticAnalyzer/Checkers/Checkers.td
  lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
  test/Analysis/Inputs/system-header-simulator-cxx.h
  test/Analysis/diagnostics/explicit-suppression.cpp
  test/Analysis/iterator-range.cpp
  test/Analysis/std-c-library-functions.cpp
  test/Analysis/std-cxx-library-functions.cpp

Index: test/Analysis/std-cxx-library-functions.cpp
===
--- /dev/null
+++ test/Analysis/std-cxx-library-functions.cpp
@@ -0,0 +1,11 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=apiModeling.StdCXXLibraryFunctions,debug.ExprInspection -verify %s
+
+#include "Inputs/system-header-simulator-cxx.h"
+
+void clang_analyzer_eval(bool);
+
+void test_find(std::vector V, int n) {
+  const std::vector::iterator b = V.begin(), e = V.end();
+  clang_analyzer_eval(std::find(b, e, n) == e); // expected-warning{{TRUE}}
+  // expected-warning@-1{{FALSE}}
+}
Index: test/Analysis/std-c-library-functions.cpp
===
--- test/Analysis/std-c-library-functions.cpp
+++ test/Analysis/std-c-library-functions.cpp
@@ -12,3 +12,20 @@
 void test() {
   clang_analyzer_eval(isalpha('A')); // no-crash // expected-warning{{UNKNOWN}}
 }
+
+namespace std {
+int isalnum(int);
+}
+
+namespace {
+// Non std!!!
+
+int isalnum(int) {
+  return 0;
+}
+}
+
+void test_non_std() {
+  clang_analyzer_eval(std::isalnum('A')); // expected-warning{{TRUE}}
+  clang_analyzer_eval(isalnum('A')); // expected-warning{{FALSE}}
+}
Index: test/Analysis/iterator-range.cpp
===
--- test/Analysis/iterator-range.cpp
+++ test/Analysis/iterator-range.cpp
@@ -1,5 +1,5 @@
-// RUN: %clang_analyze_cc1 -std=c++11 -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorRange -analyzer-config aggressive-binary-operation-simplification=true -analyzer-config c++-container-inlining=false %s -verify
-// RUN: %clang_analyze_cc1 -std=c++11 -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorRange -analyzer-config aggressive-binary-operation-simplification=true -analyzer-config c++-container-inlining=true -DINLINE=1 %s -verify
+// RUN: %clang_analyze_cc1 -std=c++11 -analyzer-checker=core,apiModeling.StdCXXLibraryFunctions,cplusplus,alpha.cplusplus.IteratorRange -analyzer-config aggressive-binary-operation-simplification=true -analyzer-config c++-container-inlining=false %s -verify
+// RUN: %clang_analyze_cc1 -std=c++11 -analyzer-checker=core,apiModeling.StdCXXLibraryFunctions,cplusplus,alpha.cplusplus.IteratorRange -analyzer-config aggressive-binary-operation-simplification=true -analyzer-config c++-container-inlining=true -DINLINE=1 %s -verify
 
 #include "Inputs/system-header-simulator-cxx.h"
 
@@ -97,6 +97,108 @@
 *i2; // expected-warning{{Iterator accessed outside of its range}}
 }
 
+void good_find(std::vector , int e) {
+  auto first = std::find(V.begin(), V.end(), e);
+  if (V.end() != first)
+*first; // no-warning
+}
+
+void bad_find(std::vector , int e) {
+  auto first = std::find(V.begin(), V.end(), e);
+  *first; // expected-warning{{Iterator accessed outside of its range}}
+}
+
+void good_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(V.begin(), V.end(), seq.begin(), seq.end());
+  if (V.end() != last)
+*last; // no-warning
+}
+
+void bad_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(V.begin(), V.end(), seq.begin(), seq.end());
+  *last; // expected-warning{{Iterator accessed outside of its range}}
+}
+
+void good_find_first_of(std::vector , std::vector ) {
+  auto first =
+  std::find_first_of(V.begin(), V.end(), seq.begin(), seq.end());
+  if (V.end() != first)
+*first; // no-warning
+}
+
+void bad_find_first_of(std::vector , std::vector ) {
+  auto first = std::find_end(V.begin(), V.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed outside of its range}}
+}
+
+bool odd(int i) { return i % 2; }
+
+void good_find_if(std::vector ) {
+  auto first = std::find_if(V.begin(), V.end(), odd);
+  if (V.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if(std::vector , int e) {
+  auto first = std::find_if(V.begin(), V.end(), odd);
+  *first; // expected-warning{{Iterator accessed outside of its range}}
+}
+
+void good_find_if_not(std::vector ) {
+  auto first = std::find_if_not(V.begin(), V.end(), odd);
+  if (V.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if_not(std::vector , int e) {
+  auto first = std::find_if_not(V.begin(), V.end(), odd);
+  *first; // expected-warning{{Iterator