[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2017-01-09 Thread Phabricator via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL291430: [analyzer] Add checker for iterators dereferenced 
beyond their range. (authored by xazax).

Changed prior to commit:
  https://reviews.llvm.org/D25660?vs=81224=83596#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25660

Files:
  cfe/trunk/include/clang/StaticAnalyzer/Checkers/Checkers.td
  cfe/trunk/lib/StaticAnalyzer/Checkers/CMakeLists.txt
  cfe/trunk/lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp
  cfe/trunk/lib/StaticAnalyzer/Core/ExprEngine.cpp
  cfe/trunk/test/Analysis/Inputs/system-header-simulator-cxx.h
  cfe/trunk/test/Analysis/diagnostics/explicit-suppression.cpp
  cfe/trunk/test/Analysis/inlining/stl.cpp
  cfe/trunk/test/Analysis/iterator-past-end.cpp

Index: cfe/trunk/include/clang/StaticAnalyzer/Checkers/Checkers.td
===
--- cfe/trunk/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ cfe/trunk/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -278,6 +278,14 @@
 
 } // end: "optin.cplusplus"
 
+let ParentPackage = CplusplusAlpha in {
+
+def IteratorPastEndChecker : Checker<"IteratorPastEnd">,
+  HelpText<"Check iterators used past end">,
+  DescFile<"IteratorPastEndChecker.cpp">;
+
+} // end: "alpha.cplusplus"
+
 
 //===--===//
 // Valist checkers.
Index: cfe/trunk/test/Analysis/inlining/stl.cpp
===
--- cfe/trunk/test/Analysis/inlining/stl.cpp
+++ cfe/trunk/test/Analysis/inlining/stl.cpp
@@ -6,8 +6,7 @@
 void clang_analyzer_eval(bool);
 
 void testVector(std::vector ) {
-  if (nums.begin()) return;
-  if (nums.end()) return;
+  if (nums.begin() != nums.end()) return;
   
   clang_analyzer_eval(nums.size() == 0);
 #if INLINE
Index: cfe/trunk/test/Analysis/iterator-past-end.cpp
===
--- cfe/trunk/test/Analysis/iterator-past-end.cpp
+++ cfe/trunk/test/Analysis/iterator-past-end.cpp
@@ -0,0 +1,205 @@
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume -analyzer-config c++-container-inlining=false %s -verify
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume -analyzer-config c++-container-inlining=true -DINLINE=1 %s -verify
+
+#include "Inputs/system-header-simulator-cxx.h"
+
+void simple_good(const std::vector ) {
+  auto i = v.end();
+  if (i != v.end())
+*i; // no-warning
+}
+
+void simple_good_negated(const std::vector ) {
+  auto i = v.end();
+  if (!(i == v.end()))
+*i; // no-warning
+}
+
+void simple_bad(const std::vector ) {
+  auto i = v.end();
+  *i; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void decrease(const std::vector ) {
+  auto i = v.end();
+  --i;
+  *i; // no-warning
+}
+
+void copy_and_decrease1(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i1; // no-warning
+}
+
+void copy_and_decrease2(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy_and_increase1(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i1 == v.end())
+*i2; // no-warning
+}
+
+void copy_and_increase2(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i2 == v.end())
+*i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_first_of(std::vector , std::vector ) {
+  auto first =
+  std::find_first_of(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_first_of(std::vector , std::vector ) {
+  auto first = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+bool odd(int i) { return i % 2; }
+
+void good_find_if(std::vector ) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-16 Thread Anna Zaks via Phabricator via cfe-commits
zaks.anna added a comment.

And thank you for the awesome work and addressing the review comments!!!


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-16 Thread Anna Zaks via Phabricator via cfe-commits
zaks.anna added a comment.

> I am doing it right now. Unfortunately I found a crash which I fixed,

Is it fixed in this patch?

> but then it turned out that overwrites of the iterator variable are not 
> handled. I am working on this 
>  problem.

My suggestion is to commit this patch and address the iterator variable 
overwrites separately, so that it would be more incremental and easier to 
review. Does this sound good to you?


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-16 Thread Balogh , Ádám via Phabricator via cfe-commits
baloghadamsoftware added a comment.

In https://reviews.llvm.org/D25660#613519, @zaks.anna wrote:

> Also, have you evaluated this on real codebases? What results do you see? Are 
> there any false positives found? Are there any true positives found?


I am doing it right now. Unfortunately I found a crash which I fixed, but then 
it turned out that overwrites of the iterator variable are not handled. I am 
working on this problem.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-13 Thread Balogh , Ádám via Phabricator via cfe-commits
baloghadamsoftware updated this revision to Diff 81224.
baloghadamsoftware added a comment.

Now isInStdNamespace is used. Hack is back so https://reviews.llvm.org/D27202 
is not a dependency now.


https://reviews.llvm.org/D25660

Files:
  include/clang/StaticAnalyzer/Checkers/Checkers.td
  lib/StaticAnalyzer/Checkers/CMakeLists.txt
  lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  test/Analysis/Inputs/system-header-simulator-cxx.h
  test/Analysis/diagnostics/explicit-suppression.cpp
  test/Analysis/inlining/stl.cpp
  test/Analysis/iterator-past-end.cpp

Index: test/Analysis/iterator-past-end.cpp
===
--- /dev/null
+++ test/Analysis/iterator-past-end.cpp
@@ -0,0 +1,205 @@
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume -analyzer-config c++-container-inlining=false %s -verify
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume -analyzer-config c++-container-inlining=true -DINLINE=1 %s -verify
+
+#include "Inputs/system-header-simulator-cxx.h"
+
+void simple_good(const std::vector ) {
+  auto i = v.end();
+  if (i != v.end())
+*i; // no-warning
+}
+
+void simple_good_negated(const std::vector ) {
+  auto i = v.end();
+  if (!(i == v.end()))
+*i; // no-warning
+}
+
+void simple_bad(const std::vector ) {
+  auto i = v.end();
+  *i; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void decrease(const std::vector ) {
+  auto i = v.end();
+  --i;
+  *i; // no-warning
+}
+
+void copy_and_decrease1(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i1; // no-warning
+}
+
+void copy_and_decrease2(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy_and_increase1(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i1 == v.end())
+*i2; // no-warning
+}
+
+void copy_and_increase2(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i2 == v.end())
+*i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_first_of(std::vector , std::vector ) {
+  auto first =
+  std::find_first_of(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_first_of(std::vector , std::vector ) {
+  auto first = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+bool odd(int i) { return i % 2; }
+
+void good_find_if(std::vector ) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if(std::vector , int e) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_if_not(std::vector ) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if_not(std::vector , int e) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search(std::vector , std::vector ) {
+  auto first = 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-09 Thread Anna Zaks via Phabricator via cfe-commits
zaks.anna added a comment.

Thanks Artem!

Just to be clear, I think this patch should be committed once 
"inTopLevelNamespace" issue is addressed. That is the only issue pending as far 
as I can see.

The visitor should be a separate patch.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-09 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

A quick example of how a bug reporter visitor for this checker may look like - 
it needs to be expanded significantly but here's a start.

F2675703: report-11.html  <== example of 
how it looks.

See, for example, `MallocChecker` to understand the rest of the bureaucracy 
around bug reporter visitors.

  PathDiagnosticPiece *IteratorPastEndChecker::IteratorVisitor::VisitNode(
  const ExplodedNode *N, const ExplodedNode *PrevN,
  BugReporterContext , BugReport ) {
if (!TrackedRegion)
  return nullptr;
  
const Stmt *S = PathDiagnosticLocation::getStmt(N);
if (!S)
  return nullptr;
  
const LocationContext *LC = N->getLocationContext();
  
PathDiagnosticLocation L = PathDiagnosticLocation::createBegin(
S, BRC.getSourceManager(), N->getLocationContext());
  
ProgramStateRef State = N->getState(),
PrevState = PrevN->getState();
  
const IteratorPosition *Pos =
   getIteratorPosition(State, TrackedRegion),
   *PrevPos =
   getIteratorPosition(PrevState, TrackedRegion);
  
// Because we're tracking this region, it must be an interesting region.
assert(Pos);
  
if (Pos && !PrevPos) {
  // We didn't know the state of this iterator before, why do we know it 
now?
  // Maybe it was constructed from another iterator (state of which we 
knew).
  // FIXME: Handle more carryover cases.
  if (const auto *CE = dyn_cast(S)) {
if (CE->getConstructor()->isCopyOrMoveConstructor()) {
  if (State->getSVal(CE, LC).getAsRegion() == TrackedRegion) {
if (const MemRegion *FromRegion =
State->getSVal(CE->getArg(0), LC).getAsRegion()) {
  assert(getIteratorPosition(State, FromRegion));
  // Because we traverse the ExplodedGraph from bottom to top,
  // this affects how we handle the events before copying.
  TrackedRegion = FromRegion;
  // FIXME: More sensible text.
  return new PathDiagnosticEventPiece(L, "Iterator state copied");
}
  }
}
  }
  
  // Or maybe it's a brand-new iterator.
  TrackedRegion = nullptr;
  // FIXME: More sensible text.
  return new PathDiagnosticEventPiece(L, "Begin tracking the iterator");
}
return nullptr;
  }


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-08 Thread Anna Zaks via Phabricator via cfe-commits
zaks.anna added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:721
+
+static bool inTopLevelNamespace(const Decl *D, IdentifierInfo *II) {
+  const auto *ND = dyn_cast(D->getDeclContext());

zaks.anna wrote:
> Would isInStdNamespace() from BugReporterVisitor.cpp be useful here? It would 
> be fine to add this API to the CheckerContext or some other place accessible 
> from here and the BugReporter.
Is there a reason not to use isInStdNamespace() instead of the 
inTopLevelNamespace()? We can add the API to Checker Context.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-08 Thread Balogh , Ádám via Phabricator via cfe-commits
baloghadamsoftware added a comment.

https://reviews.llvm.org/D27202 is now a dependency, but cannot add it.




Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:166
+IteratorPastEndChecker::IteratorPastEndChecker() {
+  PastEndBugType.reset(new BugType(this, "Iterator Past End", "C++ STL 
Error"));
+  PastEndBugType->setSuppressOnSink(true);

zaks.anna wrote:
> How about: "C++ STL Error" -> "Misuse of STL APIs"
OK, I copied it from another checker :-)


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-08 Thread Balogh , Ádám via Phabricator via cfe-commits
baloghadamsoftware updated this revision to Diff 80728.
baloghadamsoftware marked 4 inline comments as done.
baloghadamsoftware added a comment.

Minor corrections, comments, some new tests, test input headers merged.


https://reviews.llvm.org/D25660

Files:
  include/clang/StaticAnalyzer/Checkers/Checkers.td
  lib/StaticAnalyzer/Checkers/CMakeLists.txt
  lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  test/Analysis/Inputs/system-header-simulator-cxx.h
  test/Analysis/diagnostics/explicit-suppression.cpp
  test/Analysis/inlining/stl.cpp
  test/Analysis/iterator-past-end.cpp

Index: test/Analysis/iterator-past-end.cpp
===
--- /dev/null
+++ test/Analysis/iterator-past-end.cpp
@@ -0,0 +1,205 @@
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume -analyzer-config c++-container-inlining=false %s -verify
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume -analyzer-config c++-container-inlining=true -DINLINE=1 %s -verify
+
+#include "Inputs/system-header-simulator-cxx.h"
+
+void simple_good(const std::vector ) {
+  auto i = v.end();
+  if (i != v.end())
+*i; // no-warning
+}
+
+void simple_good_negated(const std::vector ) {
+  auto i = v.end();
+  if (!(i == v.end()))
+*i; // no-warning
+}
+
+void simple_bad(const std::vector ) {
+  auto i = v.end();
+  *i; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void decrease(const std::vector ) {
+  auto i = v.end();
+  --i;
+  *i; // no-warning
+}
+
+void copy_and_decrease1(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i1; // no-warning
+}
+
+void copy_and_decrease2(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy_and_increase1(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i1 == v.end())
+*i2; // no-warning
+}
+
+void copy_and_increase2(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i2 == v.end())
+*i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_first_of(std::vector , std::vector ) {
+  auto first =
+  std::find_first_of(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_first_of(std::vector , std::vector ) {
+  auto first = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+bool odd(int i) { return i % 2; }
+
+void good_find_if(std::vector ) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if(std::vector , int e) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_if_not(std::vector ) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if_not(std::vector , int e) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search(std::vector , std::vector ) {
+  auto 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-05 Thread Anna Zaks via Phabricator via cfe-commits
zaks.anna added a comment.

Also, have you evaluated this on real codebases? What results do you see? Are 
there any false positives found? Are there any true positives found?


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-12-02 Thread Anna Zaks via Phabricator via cfe-commits
zaks.anna added a comment.

It's awesome to see that all the major issues have been addressed. Thank you 
for working on this and diligently working through the code review!!!

I have a few minor comments below.

Could you add this example yours as a "no-warning" test case:
const auto start = v.begin();
while(true) {

  const auto item = find(start, v.end());
  if(item==v.end())
 break;
  doSomething(*item);
  start = ++item;

}




Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:166
+IteratorPastEndChecker::IteratorPastEndChecker() {
+  PastEndBugType.reset(new BugType(this, "Iterator Past End", "C++ STL 
Error"));
+  PastEndBugType->setSuppressOnSink(true);

How about: "C++ STL Error" -> "Misuse of STL APIs"



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:386
+// FIXME: Evaluation of these STL calls should be moved to StdCLibraryFunctions
+//   checker (see patch D20811) or another similar checker for C++ STL
+//   functions (e.g. StdCXXLibraryFunctions or StdCppLibraryFunctions).

Please, quote svn revision number instead of phabricator number.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:395
+  ASTContext  = C.getASTContext();
+  if (!II_std)
+II_std = ("std");

You could simplify the code a bit by moving all these identifier lookups into a 
subrutine and/or just have a single statement guard checking f they have been 
initialized.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:445
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

I agree with Artem that the future readers and maintainers of this code would 
greatly benefit if there were higher level comments explaining how this checker 
works. For example, here, we are saving the information about the comparison 
because iterators are value types...



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:721
+
+static bool inTopLevelNamespace(const Decl *D, IdentifierInfo *II) {
+  const auto *ND = dyn_cast(D->getDeclContext());

Would isInStdNamespace() from BugReporterVisitor.cpp be useful here? It would 
be fine to add this API to the CheckerContext or some other place accessible 
from here and the BugReporter.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:732
+
+static const RegionOrSymbol getRegionOrSymbol(const SVal ) {
+  if (const auto Reg = Val.getAsRegion()) {

This could be useful for other checkers as well. Maybe refactor this out as 
part of a subsequent commit?



Comment at: test/Analysis/Inputs/system-header-simulator-for-iterators.h:62
+  ForwardIterator2 first2, ForwardIterator2 last2);
+}

baloghadamsoftware wrote:
> Maybe we should merge this file with the system-header-simulator-cxx.h? It 
> already contains a vector type but no iterators.
Yes, we the headers are supposed to be reusable for different checkers!



Comment at: test/Analysis/iterator-past-end.cpp:73
+  auto first = std::find(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}

The error message is not very good for the find API cases. There is only a 
possibility of access past end. Also its much better to be explicit about what 
went wrong here - the user forgot to check the return value of find. We could 
say something like "The value returned from 'find' needs to be checked before 
it's accessed". 

We'd  need to implement a custom BugReporterVisitor that detects if the 
iterator is a return value from some method that needs checking. This can be & 
should be a separate patch.
 


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-24 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:204
+  CheckerContext ) const {
+  const auto *ThisExpr = COCE->getArg(0);
+

baloghadamsoftware wrote:
> NoQ wrote:
> > This code definitely deserves comments. I managed to understand that this 
> > is a workaround for completely replacing the conjured symbol with a lazy 
> > value upon calling a method over temporary, which the core does from time 
> > to time, and i suspect that this code may break whenever more than one 
> > checker starts doing this (i.e. you'd have to skip more than one 
> > predecessor node in this case).
> > 
> > I still think that the root cause here is conjured structural symbols which 
> > i'd probably prefer to get rid of completely, and then this hack wouldn't 
> > be necessary.
> I think I do not fully understand you here: do you mean some fix in the core?
I am not sure why I am handleing CXXOperatorCall here. Instead, I should handle 
every call, but only instance calls. For final solution would it not be better 
to make the checker explicitely metrialize a temporary object here instead of 
just creating it silently? Then my existing checker function would catch it.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-21 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware added inline comments.



Comment at: test/Analysis/Inputs/system-header-simulator-for-iterators.h:62
+  ForwardIterator2 first2, ForwardIterator2 last2);
+}

Maybe we should merge this file with the system-header-simulator-cxx.h? It 
already contains a vector type but no iterators.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-18 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware updated this revision to Diff 78527.
baloghadamsoftware added a comment.

Test updated to include test case where system headers are inlined.


https://reviews.llvm.org/D25660

Files:
  include/clang/StaticAnalyzer/Checkers/Checkers.td
  lib/StaticAnalyzer/Checkers/CMakeLists.txt
  lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  test/Analysis/Inputs/system-header-simulator-for-iterators.h
  test/Analysis/iterator-past-end.cpp

Index: test/Analysis/iterator-past-end.cpp
===
--- /dev/null
+++ test/Analysis/iterator-past-end.cpp
@@ -0,0 +1,194 @@
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume -analyzer-config c++-container-inlining=false %s -verify
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume -analyzer-config c++-container-inlining=true -DINLINE=1 %s -verify
+
+#include "Inputs/system-header-simulator-for-iterators.h"
+
+void simple_good(const std::vector ) {
+  auto i = v.end();
+  if (i != v.end())
+*i; // no-warning
+}
+
+void simple_good_negated(const std::vector ) {
+  auto i = v.end();
+  if (!(i == v.end()))
+*i; // no-warning
+}
+
+void simple_bad(const std::vector ) {
+  auto i = v.end();
+  *i; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void decrease(const std::vector ) {
+  auto i = v.end();
+  --i;
+  *i; // no-warning
+}
+
+void copy_and_decrease1(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i1; // no-warning
+}
+
+void copy_and_decrease2(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy_and_increase1(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i1 == v.end())
+*i2; // no-warning
+}
+
+void copy_and_increase2(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i2 == v.end())
+*i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_first_of(std::vector , std::vector ) {
+  auto first =
+  std::find_first_of(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_first_of(std::vector , std::vector ) {
+  auto first = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+bool odd(int i) { return i % 2; }
+
+void good_find_if(std::vector ) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if(std::vector , int e) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_if_not(std::vector ) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if_not(std::vector , int e) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search(std::vector , std::vector ) {
+  auto first = std::search(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-17 Thread Anna Zaks via cfe-commits
zaks.anna added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

baloghadamsoftware wrote:
> zaks.anna wrote:
> > baloghadamsoftware wrote:
> > > baloghadamsoftware wrote:
> > > > NoQ wrote:
> > > > > baloghadamsoftware wrote:
> > > > > > NoQ wrote:
> > > > > > > a.sidorin wrote:
> > > > > > > > What will happen if we write something like this:
> > > > > > > > ```
> > > > > > > > bool Eq1 = it1 == it2;
> > > > > > > > bool Eq2 = it3 == it4;
> > > > > > > > if (Eq1) {...}?
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > As I understand, we'll assume the second condition instead of 
> > > > > > > > first.
> > > > > > > Had a look. So the problem is, we obtain the result of the 
> > > > > > > comparison as a symbol, from which it is too hard to recover the 
> > > > > > > operands in order to move iterator position data from one value 
> > > > > > > to another.
> > > > > > > 
> > > > > > > Normally we obtain a simple SymbolConjured for the return value 
> > > > > > > of the `operator==()` call (or, similarly, `operator!=()`). For 
> > > > > > > plain-value iterators (eg. `typedef T *iterator`) we might be 
> > > > > > > obtaining an actual binary symbolic expression, but even then 
> > > > > > > it's not quite clear how to obtain operands (the structure of the 
> > > > > > > expression might have changed due to algebraic simplifications). 
> > > > > > > Additionally, LHS and RHS aren't necessarily symbols (they might 
> > > > > > > be semi-concrete), so composing symbolic expressions from them in 
> > > > > > > general is problematic with our symbol hierarchy, which is rarely 
> > > > > > > a problem for numbers but for structural symbols it'd be a mess.
> > > > > > > 
> > > > > > > For now i suggest, instead of storing only the last LHS and RHS, 
> > > > > > > to save a map from symbols (which are results of comparisons) to 
> > > > > > > (LHS value, RHS value) pairs. This map should, apart from the 
> > > > > > > obvious, be cleaned up whenever one of the iterators in the pair 
> > > > > > > gets mutated (incremented or decremented). This should take care 
> > > > > > > of the problem Alexey points out, and should work with 
> > > > > > > semi-concrete stuff.
> > > > > > > 
> > > > > > > For the future i suggest to let users construct their own symbols 
> > > > > > > and symbolic expressions more easily. In fact, if only we had all 
> > > > > > > iterators as regions, we should have probably used SymbolMetadata 
> > > > > > > for this purpose: it's easy to both recover the parent region 
> > > > > > > from it and use it in symbolic expressions. We could also 
> > > > > > > deprecate the confusing structural symbols (provide default-bound 
> > > > > > > lazy compound values for conjured structures instead), and then 
> > > > > > > it'd be possible to transition to SymbolMetadata entirely.
> > > > > > Thank you for the suggestion. I am not sure if I fully understand 
> > > > > > you. If I create a map where the key is the resulting symbol of the 
> > > > > > comparison, it will not work because evalAssume is called for the 
> > > > > > innermost comparison. So if the body of operator== (or operator!=) 
> > > > > > is inlined, then I get a binary symbolic expression in evalAssume, 
> > > > > > not the SymbolConjured. This binary Symbolic expression is a 
> > > > > > comparison of the internals of the iterators, e.g. the internal 
> > > > > > pointer. So the key will not match any LHS and RHS value pair in 
> > > > > > the map. I also thought on such solution earlier but I dismissed it 
> > > > > > because of this.
> > > > > Well, even if the body of the comparison operator is inlined, 
> > > > > PreStmt/PostStmt callbacks should still work, and it doesn't really 
> > > > > matter if there's a `SymbolConjured` or not, we can still add the 
> > > > > symbolic expression to our map as a key.
> > > > > 
> > > > > Essentially, you ignore whatever happens in the iterator's 
> > > > > operator==() when it's inlined (including any evalAssume events), 
> > > > > then in PostStmt of operator==() you map the return-value symbol of 
> > > > > the operator to the operator's arguments (operands), then whenever an 
> > > > > assumption is being made against the return-value symbol, you carry 
> > > > > over this assumption to the operands. I think it shouldn't really 
> > > > > matter if the operator call was inlined.
> > > > > 
> > > > > The only unexpected thing that may happen due to inlining is if the 
> > > > > inlined operator returns a concrete value (plain true or plain false) 
> > > > > instead of the symbol, but in this case what we need to do is to just 
> > > > > carry over the assumption to the operands instantly.
> > > > Maybe if I evaluate the operator==() call for iterators using 
> > > > evalCall()?
> > > 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-17 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware marked 2 inline comments as done.
baloghadamsoftware added a comment.

In https://reviews.llvm.org/D25660#590778, @NoQ wrote:

> - Agree on the `evalAssume()` implementation (i'm still not quite 
> understanding what the problem is here, see the new inline comments);


I think it will be settled soon.

> - We should probably not warn by default on unchecked std::find (see comments 
> following the `push_back(2016)` example), unless some strong arguments 
> against such code patterns are provided;

It is automatic. The role of evalCall is only to reduce the exploded graph. If 
I remove it, we get the same result (that is why we have a nonStdFind there, to 
check this case). but with far more states. Especially in case of vector, where 
the GNU implementation is quite complicated because of optimizations.

> - A BugReporterVisitor should be added to report iterator state changes to 
> the user across the diagnostic path;

I also thought of this. The question is where to start the chain.

> - Our code owners often have strong opinions regarding warning message 
> wording.

I need suggestions here.

> Then there are a few ideas on finding more bugs, which you shouldn't 
> necessarily implement, that came up during the review, eg.:
> 
> - Alexey suspects that iterators implemented as plain pointers (commonly used 
> in LLVM itself) might be unsupported;

I think it is supported now.

> - Alexey points out that ++/-- may be handled in a less conservative manner;

That is a future plan, but then it also results in a new name for the checker, 
e.g. IteratorRange.

> - More checks could be implemented in this checker, eg. passing `end()` as 
> first argument of `std::find()` is an instant bug (somebody accidentally 
> swapped begin and end?).

Good idea, but what if it is intentional? I do not mean that we pass end() 
directly, but if we do a loop of find() functions where the beginning of the 
next range is always the successor of the last found element, we may result in 
a range of [end(), end()[, which I think is a valid empty range:

const auto start = v.begin();
while(true) {

  const auto item = find(start, v.end());
  if(item==v.end())
 break;
  doSomething(*item);
  start = ++item;

}

> A list of ideas on improving core experience, mostly for myself because i 
> seem to be the only one with strong opinions on this:
> 
> - Provide a checker callback for structure copies, which would unify the 
> multitude of similar callbacks in this checker;

A callback? Or just move the copy into a simple (or template?) function?

> - Consider removing the conjured structure symbol hack.

Which hack do you mean here? In evalCall() of the various std functions? As I 
mentioned, they can be removed, but then we will get more states in the 
exploded graph.

> Did i forget anything?




https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-17 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware marked 10 inline comments as done.
baloghadamsoftware added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:209
+   CheckerContext ) const {
+  const auto *Func = Call.getDecl()->getAsFunction();
+  if (Func->isOverloadedOperator()) {

a.sidorin wrote:
> As I remember, `PostCall` is also called for ObjC calls like `ObjCMethodCall` 
> which may not have `FunctionDecl` as their callee. So, `Func` may be a 
> nullptr and needs a check.
You are right, and the same is true for PreCall.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:219
+
+  assert(LCtx->getKind() == LocationContext::StackFrame &&
+ "Function does not begin with stack frame context");

zaks.anna wrote:
> a.sidorin wrote:
> > `isa(LCtx)`?
> > And `cast<>` below already does the same check with an assertion.
> At least one advantage of the assert is that it provides an error message. 
> I'd not try to minimize the number of asserts.
I agree, but I think Alexei is rigt here: cast<> already has the assert we need 
here.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:322
+   bool Assumption) const {
+  const auto *SE = Cond.getAsSymExpr();
+  if (!SE)

a.sidorin wrote:
> What will happen if we compare two iterators related to different containers? 
> I guess the result will be undefined but I'm not sure if  we can track it in 
> this checker without referencing the owning container. Let's leave this code 
> as-is but I think this choice deserves a comment.
That will be part of another checker, but where exactly to put the comment you 
suggest?



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

zaks.anna wrote:
> baloghadamsoftware wrote:
> > baloghadamsoftware wrote:
> > > NoQ wrote:
> > > > baloghadamsoftware wrote:
> > > > > NoQ wrote:
> > > > > > a.sidorin wrote:
> > > > > > > What will happen if we write something like this:
> > > > > > > ```
> > > > > > > bool Eq1 = it1 == it2;
> > > > > > > bool Eq2 = it3 == it4;
> > > > > > > if (Eq1) {...}?
> > > > > > > ```
> > > > > > > 
> > > > > > > As I understand, we'll assume the second condition instead of 
> > > > > > > first.
> > > > > > Had a look. So the problem is, we obtain the result of the 
> > > > > > comparison as a symbol, from which it is too hard to recover the 
> > > > > > operands in order to move iterator position data from one value to 
> > > > > > another.
> > > > > > 
> > > > > > Normally we obtain a simple SymbolConjured for the return value of 
> > > > > > the `operator==()` call (or, similarly, `operator!=()`). For 
> > > > > > plain-value iterators (eg. `typedef T *iterator`) we might be 
> > > > > > obtaining an actual binary symbolic expression, but even then it's 
> > > > > > not quite clear how to obtain operands (the structure of the 
> > > > > > expression might have changed due to algebraic simplifications). 
> > > > > > Additionally, LHS and RHS aren't necessarily symbols (they might be 
> > > > > > semi-concrete), so composing symbolic expressions from them in 
> > > > > > general is problematic with our symbol hierarchy, which is rarely a 
> > > > > > problem for numbers but for structural symbols it'd be a mess.
> > > > > > 
> > > > > > For now i suggest, instead of storing only the last LHS and RHS, to 
> > > > > > save a map from symbols (which are results of comparisons) to (LHS 
> > > > > > value, RHS value) pairs. This map should, apart from the obvious, 
> > > > > > be cleaned up whenever one of the iterators in the pair gets 
> > > > > > mutated (incremented or decremented). This should take care of the 
> > > > > > problem Alexey points out, and should work with semi-concrete stuff.
> > > > > > 
> > > > > > For the future i suggest to let users construct their own symbols 
> > > > > > and symbolic expressions more easily. In fact, if only we had all 
> > > > > > iterators as regions, we should have probably used SymbolMetadata 
> > > > > > for this purpose: it's easy to both recover the parent region from 
> > > > > > it and use it in symbolic expressions. We could also deprecate the 
> > > > > > confusing structural symbols (provide default-bound lazy compound 
> > > > > > values for conjured structures instead), and then it'd be possible 
> > > > > > to transition to SymbolMetadata entirely.
> > > > > Thank you for the suggestion. I am not sure if I fully understand 
> > > > > you. If I create a map where the key is the resulting symbol of the 
> > > > > comparison, it will not work because evalAssume is called for the 
> > > > > innermost comparison. So if the body of operator== (or 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-17 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware updated this revision to Diff 78352.
baloghadamsoftware added a comment.

Updated according to comments.


https://reviews.llvm.org/D25660

Files:
  include/clang/StaticAnalyzer/Checkers/Checkers.td
  lib/StaticAnalyzer/Checkers/CMakeLists.txt
  lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  test/Analysis/Inputs/system-header-simulator-for-iterators.h
  test/Analysis/iterator-past-end.cpp

Index: test/Analysis/iterator-past-end.cpp
===
--- /dev/null
+++ test/Analysis/iterator-past-end.cpp
@@ -0,0 +1,193 @@
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd -analyzer-eagerly-assume %s -verify
+
+#include "Inputs/system-header-simulator-for-iterators.h"
+
+void simple_good(const std::vector ) {
+  auto i = v.end();
+  if (i != v.end())
+*i; // no-warning
+}
+
+void simple_good_negated(const std::vector ) {
+  auto i = v.end();
+  if (!(i == v.end()))
+*i; // no-warning
+}
+
+void simple_bad(const std::vector ) {
+  auto i = v.end();
+  *i; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void decrease(const std::vector ) {
+  auto i = v.end();
+  --i;
+  *i; // no-warning
+}
+
+void copy_and_decrease1(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i1; // no-warning
+}
+
+void copy_and_decrease2(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy_and_increase1(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i1 == v.end())
+*i2; // no-warning
+}
+
+void copy_and_increase2(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i2 == v.end())
+*i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_first_of(std::vector , std::vector ) {
+  auto first =
+  std::find_first_of(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_first_of(std::vector , std::vector ) {
+  auto first = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+bool odd(int i) { return i % 2; }
+
+void good_find_if(std::vector ) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if(std::vector , int e) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_if_not(std::vector ) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if_not(std::vector , int e) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search(std::vector , std::vector ) {
+  auto first = std::search(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_search(std::vector , std::vector ) {
+  auto first = std::search(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search_n(std::vector , std::vector ) {
+  auto nth = 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-10 Thread Anna Zaks via cfe-commits
zaks.anna added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:219
+
+  assert(LCtx->getKind() == LocationContext::StackFrame &&
+ "Function does not begin with stack frame context");

a.sidorin wrote:
> `isa(LCtx)`?
> And `cast<>` below already does the same check with an assertion.
At least one advantage of the assert is that it provides an error message. I'd 
not try to minimize the number of asserts.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

baloghadamsoftware wrote:
> baloghadamsoftware wrote:
> > NoQ wrote:
> > > baloghadamsoftware wrote:
> > > > NoQ wrote:
> > > > > a.sidorin wrote:
> > > > > > What will happen if we write something like this:
> > > > > > ```
> > > > > > bool Eq1 = it1 == it2;
> > > > > > bool Eq2 = it3 == it4;
> > > > > > if (Eq1) {...}?
> > > > > > ```
> > > > > > 
> > > > > > As I understand, we'll assume the second condition instead of first.
> > > > > Had a look. So the problem is, we obtain the result of the comparison 
> > > > > as a symbol, from which it is too hard to recover the operands in 
> > > > > order to move iterator position data from one value to another.
> > > > > 
> > > > > Normally we obtain a simple SymbolConjured for the return value of 
> > > > > the `operator==()` call (or, similarly, `operator!=()`). For 
> > > > > plain-value iterators (eg. `typedef T *iterator`) we might be 
> > > > > obtaining an actual binary symbolic expression, but even then it's 
> > > > > not quite clear how to obtain operands (the structure of the 
> > > > > expression might have changed due to algebraic simplifications). 
> > > > > Additionally, LHS and RHS aren't necessarily symbols (they might be 
> > > > > semi-concrete), so composing symbolic expressions from them in 
> > > > > general is problematic with our symbol hierarchy, which is rarely a 
> > > > > problem for numbers but for structural symbols it'd be a mess.
> > > > > 
> > > > > For now i suggest, instead of storing only the last LHS and RHS, to 
> > > > > save a map from symbols (which are results of comparisons) to (LHS 
> > > > > value, RHS value) pairs. This map should, apart from the obvious, be 
> > > > > cleaned up whenever one of the iterators in the pair gets mutated 
> > > > > (incremented or decremented). This should take care of the problem 
> > > > > Alexey points out, and should work with semi-concrete stuff.
> > > > > 
> > > > > For the future i suggest to let users construct their own symbols and 
> > > > > symbolic expressions more easily. In fact, if only we had all 
> > > > > iterators as regions, we should have probably used SymbolMetadata for 
> > > > > this purpose: it's easy to both recover the parent region from it and 
> > > > > use it in symbolic expressions. We could also deprecate the confusing 
> > > > > structural symbols (provide default-bound lazy compound values for 
> > > > > conjured structures instead), and then it'd be possible to transition 
> > > > > to SymbolMetadata entirely.
> > > > Thank you for the suggestion. I am not sure if I fully understand you. 
> > > > If I create a map where the key is the resulting symbol of the 
> > > > comparison, it will not work because evalAssume is called for the 
> > > > innermost comparison. So if the body of operator== (or operator!=) is 
> > > > inlined, then I get a binary symbolic expression in evalAssume, not the 
> > > > SymbolConjured. This binary Symbolic expression is a comparison of the 
> > > > internals of the iterators, e.g. the internal pointer. So the key will 
> > > > not match any LHS and RHS value pair in the map. I also thought on such 
> > > > solution earlier but I dismissed it because of this.
> > > Well, even if the body of the comparison operator is inlined, 
> > > PreStmt/PostStmt callbacks should still work, and it doesn't really 
> > > matter if there's a `SymbolConjured` or not, we can still add the 
> > > symbolic expression to our map as a key.
> > > 
> > > Essentially, you ignore whatever happens in the iterator's operator==() 
> > > when it's inlined (including any evalAssume events), then in PostStmt of 
> > > operator==() you map the return-value symbol of the operator to the 
> > > operator's arguments (operands), then whenever an assumption is being 
> > > made against the return-value symbol, you carry over this assumption to 
> > > the operands. I think it shouldn't really matter if the operator call was 
> > > inlined.
> > > 
> > > The only unexpected thing that may happen due to inlining is if the 
> > > inlined operator returns a concrete value (plain true or plain false) 
> > > instead of the symbol, but in this case what we need to do is to just 
> > > carry over the assumption to the operands instantly.
> > Maybe if I evaluate the 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-10 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:204
+  CheckerContext ) const {
+  const auto *ThisExpr = COCE->getArg(0);
+

NoQ wrote:
> This code definitely deserves comments. I managed to understand that this is 
> a workaround for completely replacing the conjured symbol with a lazy value 
> upon calling a method over temporary, which the core does from time to time, 
> and i suspect that this code may break whenever more than one checker starts 
> doing this (i.e. you'd have to skip more than one predecessor node in this 
> case).
> 
> I still think that the root cause here is conjured structural symbols which 
> i'd probably prefer to get rid of completely, and then this hack wouldn't be 
> necessary.
I think I do not fully understand you here: do you mean some fix in the core?


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-10 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

baloghadamsoftware wrote:
> NoQ wrote:
> > baloghadamsoftware wrote:
> > > NoQ wrote:
> > > > a.sidorin wrote:
> > > > > What will happen if we write something like this:
> > > > > ```
> > > > > bool Eq1 = it1 == it2;
> > > > > bool Eq2 = it3 == it4;
> > > > > if (Eq1) {...}?
> > > > > ```
> > > > > 
> > > > > As I understand, we'll assume the second condition instead of first.
> > > > Had a look. So the problem is, we obtain the result of the comparison 
> > > > as a symbol, from which it is too hard to recover the operands in order 
> > > > to move iterator position data from one value to another.
> > > > 
> > > > Normally we obtain a simple SymbolConjured for the return value of the 
> > > > `operator==()` call (or, similarly, `operator!=()`). For plain-value 
> > > > iterators (eg. `typedef T *iterator`) we might be obtaining an actual 
> > > > binary symbolic expression, but even then it's not quite clear how to 
> > > > obtain operands (the structure of the expression might have changed due 
> > > > to algebraic simplifications). Additionally, LHS and RHS aren't 
> > > > necessarily symbols (they might be semi-concrete), so composing 
> > > > symbolic expressions from them in general is problematic with our 
> > > > symbol hierarchy, which is rarely a problem for numbers but for 
> > > > structural symbols it'd be a mess.
> > > > 
> > > > For now i suggest, instead of storing only the last LHS and RHS, to 
> > > > save a map from symbols (which are results of comparisons) to (LHS 
> > > > value, RHS value) pairs. This map should, apart from the obvious, be 
> > > > cleaned up whenever one of the iterators in the pair gets mutated 
> > > > (incremented or decremented). This should take care of the problem 
> > > > Alexey points out, and should work with semi-concrete stuff.
> > > > 
> > > > For the future i suggest to let users construct their own symbols and 
> > > > symbolic expressions more easily. In fact, if only we had all iterators 
> > > > as regions, we should have probably used SymbolMetadata for this 
> > > > purpose: it's easy to both recover the parent region from it and use it 
> > > > in symbolic expressions. We could also deprecate the confusing 
> > > > structural symbols (provide default-bound lazy compound values for 
> > > > conjured structures instead), and then it'd be possible to transition 
> > > > to SymbolMetadata entirely.
> > > Thank you for the suggestion. I am not sure if I fully understand you. If 
> > > I create a map where the key is the resulting symbol of the comparison, 
> > > it will not work because evalAssume is called for the innermost 
> > > comparison. So if the body of operator== (or operator!=) is inlined, then 
> > > I get a binary symbolic expression in evalAssume, not the SymbolConjured. 
> > > This binary Symbolic expression is a comparison of the internals of the 
> > > iterators, e.g. the internal pointer. So the key will not match any LHS 
> > > and RHS value pair in the map. I also thought on such solution earlier 
> > > but I dismissed it because of this.
> > Well, even if the body of the comparison operator is inlined, 
> > PreStmt/PostStmt callbacks should still work, and it doesn't really matter 
> > if there's a `SymbolConjured` or not, we can still add the symbolic 
> > expression to our map as a key.
> > 
> > Essentially, you ignore whatever happens in the iterator's operator==() 
> > when it's inlined (including any evalAssume events), then in PostStmt of 
> > operator==() you map the return-value symbol of the operator to the 
> > operator's arguments (operands), then whenever an assumption is being made 
> > against the return-value symbol, you carry over this assumption to the 
> > operands. I think it shouldn't really matter if the operator call was 
> > inlined.
> > 
> > The only unexpected thing that may happen due to inlining is if the inlined 
> > operator returns a concrete value (plain true or plain false) instead of 
> > the symbol, but in this case what we need to do is to just carry over the 
> > assumption to the operands instantly.
> Maybe if I evaluate the operator==() call for iterators using evalCall()?
Sorry, maybe my phrasing was not accurate enough. The problem is that the 
assumption is not being made against the return-value symbol of the 
operator==(), but if inlined, against the internal == operator. So I do not 
have the same key in evalAssume() thus I cannot access the operands from the 
map I stored in checkPostCall(). The only solution I can imagine is that I 
evalCall() the operator==() but then we lose the opportunity to check anything 
inside the body of the operator.


https://reviews.llvm.org/D25660




[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-09 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware marked an inline comment as done.
baloghadamsoftware added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

baloghadamsoftware wrote:
> NoQ wrote:
> > a.sidorin wrote:
> > > What will happen if we write something like this:
> > > ```
> > > bool Eq1 = it1 == it2;
> > > bool Eq2 = it3 == it4;
> > > if (Eq1) {...}?
> > > ```
> > > 
> > > As I understand, we'll assume the second condition instead of first.
> > Had a look. So the problem is, we obtain the result of the comparison as a 
> > symbol, from which it is too hard to recover the operands in order to move 
> > iterator position data from one value to another.
> > 
> > Normally we obtain a simple SymbolConjured for the return value of the 
> > `operator==()` call (or, similarly, `operator!=()`). For plain-value 
> > iterators (eg. `typedef T *iterator`) we might be obtaining an actual 
> > binary symbolic expression, but even then it's not quite clear how to 
> > obtain operands (the structure of the expression might have changed due to 
> > algebraic simplifications). Additionally, LHS and RHS aren't necessarily 
> > symbols (they might be semi-concrete), so composing symbolic expressions 
> > from them in general is problematic with our symbol hierarchy, which is 
> > rarely a problem for numbers but for structural symbols it'd be a mess.
> > 
> > For now i suggest, instead of storing only the last LHS and RHS, to save a 
> > map from symbols (which are results of comparisons) to (LHS value, RHS 
> > value) pairs. This map should, apart from the obvious, be cleaned up 
> > whenever one of the iterators in the pair gets mutated (incremented or 
> > decremented). This should take care of the problem Alexey points out, and 
> > should work with semi-concrete stuff.
> > 
> > For the future i suggest to let users construct their own symbols and 
> > symbolic expressions more easily. In fact, if only we had all iterators as 
> > regions, we should have probably used SymbolMetadata for this purpose: it's 
> > easy to both recover the parent region from it and use it in symbolic 
> > expressions. We could also deprecate the confusing structural symbols 
> > (provide default-bound lazy compound values for conjured structures 
> > instead), and then it'd be possible to transition to SymbolMetadata 
> > entirely.
> Thank you for the suggestion. I am not sure if I fully understand you. If I 
> create a map where the key is the resulting symbol of the comparison, it will 
> not work because evalAssume is called for the innermost comparison. So if the 
> body of operator== (or operator!=) is inlined, then I get a binary symbolic 
> expression in evalAssume, not the SymbolConjured. This binary Symbolic 
> expression is a comparison of the internals of the iterators, e.g. the 
> internal pointer. So the key will not match any LHS and RHS value pair in the 
> map. I also thought on such solution earlier but I dismissed it because of 
> this.
Maybe if I evaluate the operator==() call for iterators using evalCall()?


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-07 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware updated this revision to Diff 77033.
baloghadamsoftware added a comment.

Interim version, updated according to some of the comments.


https://reviews.llvm.org/D25660

Files:
  include/clang/StaticAnalyzer/Checkers/Checkers.td
  lib/StaticAnalyzer/Checkers/CMakeLists.txt
  lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  test/Analysis/Inputs/system-header-simulator-for-iterators.h
  test/Analysis/iterator-past-end.cpp

Index: test/Analysis/iterator-past-end.cpp
===
--- /dev/null
+++ test/Analysis/iterator-past-end.cpp
@@ -0,0 +1,180 @@
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd %s -verify
+
+#include "Inputs/system-header-simulator-for-iterators.h"
+
+void simple_good(const std::vector ) {
+  auto i = v.end();
+  if (i != v.end())
+*i; // no-warning
+}
+
+void simple_bad(const std::vector ) {
+  auto i = v.end();
+  *i; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void decrease(const std::vector ) {
+  auto i = v.end();
+  --i;
+  *i; // no-warning
+}
+
+void copy_and_decrease1(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i1; // no-warning
+}
+
+void copy_and_decrease2(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy_and_increase1(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i1 == v.end())
+*i2; // no-warning
+}
+
+void copy_and_increase2(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i2 == v.end())
+*i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_first_of(std::vector , std::vector ) {
+  auto first =
+  std::find_first_of(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_first_of(std::vector , std::vector ) {
+  auto first = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+bool odd(int i) { return i % 2; }
+
+void good_find_if(std::vector ) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if(std::vector , int e) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_if_not(std::vector ) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if_not(std::vector , int e) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search(std::vector , std::vector ) {
+  auto first = std::search(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_search(std::vector , std::vector ) {
+  auto first = std::search(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search_n(std::vector , std::vector ) {
+  auto nth = std::search_n(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != nth)
+*nth; // no-warning
+}
+
+void 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-07 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

NoQ wrote:
> a.sidorin wrote:
> > What will happen if we write something like this:
> > ```
> > bool Eq1 = it1 == it2;
> > bool Eq2 = it3 == it4;
> > if (Eq1) {...}?
> > ```
> > 
> > As I understand, we'll assume the second condition instead of first.
> Had a look. So the problem is, we obtain the result of the comparison as a 
> symbol, from which it is too hard to recover the operands in order to move 
> iterator position data from one value to another.
> 
> Normally we obtain a simple SymbolConjured for the return value of the 
> `operator==()` call (or, similarly, `operator!=()`). For plain-value 
> iterators (eg. `typedef T *iterator`) we might be obtaining an actual binary 
> symbolic expression, but even then it's not quite clear how to obtain 
> operands (the structure of the expression might have changed due to algebraic 
> simplifications). Additionally, LHS and RHS aren't necessarily symbols (they 
> might be semi-concrete), so composing symbolic expressions from them in 
> general is problematic with our symbol hierarchy, which is rarely a problem 
> for numbers but for structural symbols it'd be a mess.
> 
> For now i suggest, instead of storing only the last LHS and RHS, to save a 
> map from symbols (which are results of comparisons) to (LHS value, RHS value) 
> pairs. This map should, apart from the obvious, be cleaned up whenever one of 
> the iterators in the pair gets mutated (incremented or decremented). This 
> should take care of the problem Alexey points out, and should work with 
> semi-concrete stuff.
> 
> For the future i suggest to let users construct their own symbols and 
> symbolic expressions more easily. In fact, if only we had all iterators as 
> regions, we should have probably used SymbolMetadata for this purpose: it's 
> easy to both recover the parent region from it and use it in symbolic 
> expressions. We could also deprecate the confusing structural symbols 
> (provide default-bound lazy compound values for conjured structures instead), 
> and then it'd be possible to transition to SymbolMetadata entirely.
Thank you for the suggestion. I am not sure if I fully understand you. If I 
create a map where the key is the resulting symbol of the comparison, it will 
not work because evalAssume is called for the innermost comparison. So if the 
body of operator== (or operator!=) is inlined, then I get a binary symbolic 
expression in evalAssume, not the SymbolConjured. This binary Symbolic 
expression is a comparison of the internals of the iterators, e.g. the internal 
pointer. So the key will not match any LHS and RHS value pair in the map. I 
also thought on such solution earlier but I dismissed it because of this.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:580
+  C.addTransition(stateFound);
+  C.addTransition(stateNotFound);
+}

NoQ wrote:
> NoQ wrote:
> > Ouch, i have one more concern, which can be expressed with the following 
> > false-positive test which currently fails:
> > 
> > ```
> > void foo() {
> >   std::vector vec;
> >   vec.push_back(2016);
> >   auto i = vec.find(vec.begin(), vec.end(), 2016);
> >   *i; // no-warning
> > }
> > ```
> > 
> > Not instantly sure what to do with this. You can avoid state splits until 
> > you are actually sure if both branches are possible, but that'd suppress a 
> > lot of useful positives. Such positives could be suppressed with 
> > assertions, of course, but i'd still hope there aren't too many of those.
> I mean, `std::find(...` ><
False positives can occur whenever we are sure that we will find the element so 
we do not check for the result to be equal with end().


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-01 Thread Artem Dergachev via cfe-commits
NoQ added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:580
+  C.addTransition(stateFound);
+  C.addTransition(stateNotFound);
+}

NoQ wrote:
> Ouch, i have one more concern, which can be expressed with the following 
> false-positive test which currently fails:
> 
> ```
> void foo() {
>   std::vector vec;
>   vec.push_back(2016);
>   auto i = vec.find(vec.begin(), vec.end(), 2016);
>   *i; // no-warning
> }
> ```
> 
> Not instantly sure what to do with this. You can avoid state splits until you 
> are actually sure if both branches are possible, but that'd suppress a lot of 
> useful positives. Such positives could be suppressed with assertions, of 
> course, but i'd still hope there aren't too many of those.
I mean, `std::find(...` ><


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-01 Thread Artem Dergachev via cfe-commits
NoQ added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:580
+  C.addTransition(stateFound);
+  C.addTransition(stateNotFound);
+}

Ouch, i have one more concern, which can be expressed with the following 
false-positive test which currently fails:

```
void foo() {
  std::vector vec;
  vec.push_back(2016);
  auto i = vec.find(vec.begin(), vec.end(), 2016);
  *i; // no-warning
}
```

Not instantly sure what to do with this. You can avoid state splits until you 
are actually sure if both branches are possible, but that'd suppress a lot of 
useful positives. Such positives could be suppressed with assertions, of 
course, but i'd still hope there aren't too many of those.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-11-01 Thread Artem Dergachev via cfe-commits
NoQ added a comment.

I think i managed to understand the reasoning behind your solutions! Right now 
i definitely approve all the high-level logic apart from the handling of 
left/right `SVal`s for `evalAssume`, which i think could be easily improved 
upon without significant drawbacks. See the inline comment.

As for inlining STL headers - ouch. It was supposed to be working (i.e. never 
inlining), it'd probably be great to know why it gets inlined. STL headers are 
confusing much more often than helpful to the analyzer in most cases. That 
said, if we're going to ever revert this decision, i think it's great to have 
more stuff already working, so i'd not worry about that. If moving stuff to a 
header defeats the purpose of some of your tests (eg. tests that specifically 
test what happens if the function is inlined), then probably it'd be a good 
idea to duplicate the tests, eg:

  // RUN: ... -DUSE_HEADER=0 ...
  // RUN: ... -DUSE_HEADER=1 ...
  
  #if USE_HEADER
  #include "Inputs/..."
  #else
  // Paste header here.
  #endif




Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:195
+auto Param = State->getLValue(P, LCtx);
+auto Arg = State->getSVal(CE->getArg(idx++), LCtx->getParent());
+const auto *Pos = getIteratorPosition(State, Arg);

baloghadamsoftware wrote:
> NoQ wrote:
> > I think this trick needs more comments/explaining. It is very unusual. Are 
> > you trying to model effects of passing an iterator by value into a 
> > function? What part of these effects are not modeled magically by the core?
> If I pass an iterator by value (the most usual case) I have to assign its 
> position (in or out of range) to the formal parameter from the actual one.
Had a look. So, essentially, the core copies argument values to parameter 
regions in `enterStackFrame()` without ever notifying checkers about it in any 
way. Okaay. Yep, let's stick to that for now, as i've no better approach in 
mind.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

a.sidorin wrote:
> What will happen if we write something like this:
> ```
> bool Eq1 = it1 == it2;
> bool Eq2 = it3 == it4;
> if (Eq1) {...}?
> ```
> 
> As I understand, we'll assume the second condition instead of first.
Had a look. So the problem is, we obtain the result of the comparison as a 
symbol, from which it is too hard to recover the operands in order to move 
iterator position data from one value to another.

Normally we obtain a simple SymbolConjured for the return value of the 
`operator==()` call (or, similarly, `operator!=()`). For plain-value iterators 
(eg. `typedef T *iterator`) we might be obtaining an actual binary symbolic 
expression, but even then it's not quite clear how to obtain operands (the 
structure of the expression might have changed due to algebraic 
simplifications). Additionally, LHS and RHS aren't necessarily symbols (they 
might be semi-concrete), so composing symbolic expressions from them in general 
is problematic with our symbol hierarchy, which is rarely a problem for numbers 
but for structural symbols it'd be a mess.

For now i suggest, instead of storing only the last LHS and RHS, to save a map 
from symbols (which are results of comparisons) to (LHS value, RHS value) 
pairs. This map should, apart from the obvious, be cleaned up whenever one of 
the iterators in the pair gets mutated (incremented or decremented). This 
should take care of the problem Alexey points out, and should work with 
semi-concrete stuff.

For the future i suggest to let users construct their own symbols and symbolic 
expressions more easily. In fact, if only we had all iterators as regions, we 
should have probably used SymbolMetadata for this purpose: it's easy to both 
recover the parent region from it and use it in symbolic expressions. We could 
also deprecate the confusing structural symbols (provide default-bound lazy 
compound values for conjured structures instead), and then it'd be possible to 
transition to SymbolMetadata entirely.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-10-27 Thread Anna Zaks via cfe-commits
zaks.anna added a comment.

>> Actually, I always test first on real code, and it seemed to be inlined. But 
>> now, even if I 
>>  removed the pragma it was not inlined.

Looks like this patch is interfering with this inlining suppression. We had 
many false positives without it. Mainly, the analyzer would not understand the 
invariants of the container data structures.

`ExprEngine::defaultEvalCall` calls `mayInlineCallKind `which contains this:
`// Conditionally control the inlining of methods on objects that look

  // like C++ containers.
  if (!Opts.mayInlineCXXContainerMethods())
if (!Ctx.getSourceManager().isInMainFile(FD->getLocation()))
  if (isContainerMethod(Ctx, FD))
return false;`




Comment at: test/Analysis/iterator-past-end.cpp:3
+
+template  struct __iterator {
+  typedef __iterator iterator;

NoQ wrote:
> We should probably separate this into an #include-able header in 
> `test/Analysis/Inputs/`.
> 
> Also, there's always a bit of concern that it wasn't copy-pasted from a 
> standard library implementation with an incompatible license such as (L)GPL. 
> Which often happens when you do your best to emulate the normal way of 
> defining things as closely as possible.
We often do forward declare in the implementation file as it is done here. We 
mainly use the Inputs directory to simulate system headers.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-10-27 Thread Aleksei Sidorin via cfe-commits
a.sidorin added a comment.

Thank you for this patch! I like some solutions used in it but I also have some 
comments (inline).


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-10-27 Thread Aleksei Sidorin via cfe-commits
a.sidorin added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:209
+   CheckerContext ) const {
+  const auto *Func = Call.getDecl()->getAsFunction();
+  if (Func->isOverloadedOperator()) {

As I remember, `PostCall` is also called for ObjC calls like `ObjCMethodCall` 
which may not have `FunctionDecl` as their callee. So, `Func` may be a nullptr 
and needs a check.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:156
+if (isAccessOperator(Func->getOverloadedOperator())) {
+  if (Func->isCXXInstanceMember()) {
+const auto  = static_cast(Call);

This can be written some shorter: `if (const auto *InstCall = 
dyn_cast()`



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:219
+
+  assert(LCtx->getKind() == LocationContext::StackFrame &&
+ "Function does not begin with stack frame context");

`isa(LCtx)`?
And `cast<>` below already does the same check with an assertion.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:258
+  auto State = C.getState();
+  const auto *LCtx = C.getPredecessor()->getLocationContext();
+

Just `C.getLocationContext()`?



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:305
+  auto RegionMap = State->get();
+  for (auto I = RegionMap.begin(), E = RegionMap.end(); I != E; ++I) {
+if (!SR.isLiveRegion(I->first)) {

This loop may be C++11-fied.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:322
+   bool Assumption) const {
+  const auto *SE = Cond.getAsSymExpr();
+  if (!SE)

What will happen if we compare two iterators related to different containers? I 
guess the result will be undefined but I'm not sure if  we can track it in this 
checker without referencing the owning container. Let's leave this code as-is 
but I think this choice deserves a comment.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:337
+  const auto *RPos = getIteratorPosition(State, Right);
+  if (LPos && !RPos) {
+if ((LPos->isInRange() && ((Opc == BO_EQ) == Assumption)) ||

Maybe we should just swap Rhs and Lhs if LPos is null? So, we can avoid code 
duplication.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:423
+
+void IteratorPastEndChecker::handleComparison(CheckerContext ,
+  const SVal ,

What will happen if we write something like this:
```
bool Eq1 = it1 == it2;
bool Eq2 = it3 == it4;
if (Eq1) {...}?
```

As I understand, we'll assume the second condition instead of first.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:454
+  if (Pos && Pos->isOutofRange()) {
+State = setIteratorPosition(State, Val, IteratorPosition::getInRange());
+C.addTransition(State);

I'm not sure it's totally correct. `--` for `begin()` will give us out-of-range 
iterator. According to header description, we're catching just "past-end" 
iterators, but this is confusing a bit for me.
Moreover, if we're out of end() in multiple positions, a single `--` will not 
make the iterator valid again. You use a good conservative approach, but could 
you please add a comment describing it?



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:571
+  auto  = C.getSValBuilder();
+  const auto *LCtx = C.getPredecessor()->getLocationContext();
+

Just `C.getLocationContext()`.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:573
+
+  auto RetVal = svalBuilder.conjureSymbolVal(nullptr, CE, LCtx, 
C.blockCount());
+  auto SecondParam = state->getSVal(CE->getArg(1), C.getLocationContext());

You can use overload which does not require the tag.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:574
+  auto RetVal = svalBuilder.conjureSymbolVal(nullptr, CE, LCtx, 
C.blockCount());
+  auto SecondParam = state->getSVal(CE->getArg(1), C.getLocationContext());
+

getLocationContext => LCtx 



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:605
+static bool isIteratorType(const QualType ) {
+  const auto *CRD = Type->getUnqualifiedDesugaredType()->getAsCXXRecordDecl();
+  return isIterator(CRD);

A common way of defining iterator types is just their declaration as pointers. 
I'm not sure this code will work well in such cases.
You can see some example in LLVM containers like SmallVector, where iterators 
are declared in the following way:

```
  typedef T *iterator;
  typedef const T 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-10-27 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware added inline comments.



Comment at: test/Analysis/iterator-past-end.cpp:3
+
+template  struct __iterator {
+  typedef __iterator iterator;

NoQ wrote:
> baloghadamsoftware wrote:
> > NoQ wrote:
> > > We should probably separate this into an #include-able header in 
> > > `test/Analysis/Inputs/`.
> > > 
> > > Also, there's always a bit of concern that it wasn't copy-pasted from a 
> > > standard library implementation with an incompatible license such as 
> > > (L)GPL. Which often happens when you do your best to emulate the normal 
> > > way of defining things as closely as possible.
> > I did it now, but first one of my tests failed. I fixed the bug, but it 
> > turned out that if I include these types and functions, no method or 
> > function is checked, just conjured symbols are generated. Should including 
> > not behave the same as copying the contents? This happened even if I 
> > removed the pragma.
> Aha, i guess that's because we don't inline STL headers. See 
> `mayInlineCXXStandardLibrary()` / `-analyzer-config c++-stdlib-inlining`.
> 
> The lesson to learn here is that it's a good idea to make tests as similar to 
> real code as possible. Because on real code, it would probably also not be 
> inlined.
Actually, I always test first on real code, and it seemed to be inlined. But 
now, even if I removed the pragma it was not inlined.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-10-27 Thread Artem Dergachev via cfe-commits
NoQ added a comment.

Thanks!! Will try to look at the rest of the stuff as soon as possible><




Comment at: test/Analysis/iterator-past-end.cpp:3
+
+template  struct __iterator {
+  typedef __iterator iterator;

baloghadamsoftware wrote:
> NoQ wrote:
> > We should probably separate this into an #include-able header in 
> > `test/Analysis/Inputs/`.
> > 
> > Also, there's always a bit of concern that it wasn't copy-pasted from a 
> > standard library implementation with an incompatible license such as 
> > (L)GPL. Which often happens when you do your best to emulate the normal way 
> > of defining things as closely as possible.
> I did it now, but first one of my tests failed. I fixed the bug, but it 
> turned out that if I include these types and functions, no method or function 
> is checked, just conjured symbols are generated. Should including not behave 
> the same as copying the contents? This happened even if I removed the pragma.
Aha, i guess that's because we don't inline STL headers. See 
`mayInlineCXXStandardLibrary()` / `-analyzer-config c++-stdlib-inlining`.

The lesson to learn here is that it's a good idea to make tests as similar to 
real code as possible. Because on real code, it would probably also not be 
inlined.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-10-26 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:195
+auto Param = State->getLValue(P, LCtx);
+auto Arg = State->getSVal(CE->getArg(idx++), LCtx->getParent());
+const auto *Pos = getIteratorPosition(State, Arg);

NoQ wrote:
> I think this trick needs more comments/explaining. It is very unusual. Are 
> you trying to model effects of passing an iterator by value into a function? 
> What part of these effects are not modeled magically by the core?
If I pass an iterator by value (the most usual case) I have to assign its 
position (in or out of range) to the formal parameter from the actual one.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:568
+  // We also should check for copy assignment operator, but for some reason
+  // it is not provided implicitly in Clang Static Analyzer
+  for (const auto *M : CRD->methods()) {

NoQ wrote:
> It's not analyzer's fault :) We're inspecting the AST here.
> 
> Anyway, does `CXXRecordDecl::needsImplicitCopyAssignment()` look useful?
No, it does not. I need to check whether the type is copiable, since that is a 
criteria for being an operator (copiable via constructor and assignment, 
deleteable, incrementable and dereferencable). It seems that while copy 
constructor and destructor is generated automatically, copy assignment not, at 
least not in this simple case. So I defaulted it to true, and I set it to false 
if I find a deleted or a non-public copy assignment.



Comment at: test/Analysis/iterator-past-end.cpp:3
+
+template  struct __iterator {
+  typedef __iterator iterator;

NoQ wrote:
> We should probably separate this into an #include-able header in 
> `test/Analysis/Inputs/`.
> 
> Also, there's always a bit of concern that it wasn't copy-pasted from a 
> standard library implementation with an incompatible license such as (L)GPL. 
> Which often happens when you do your best to emulate the normal way of 
> defining things as closely as possible.
I did it now, but first one of my tests failed. I fixed the bug, but it turned 
out that if I include these types and functions, no method or function is 
checked, just conjured symbols are generated. Should including not behave the 
same as copying the contents? This happened even if I removed the pragma.


https://reviews.llvm.org/D25660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-10-26 Thread Balogh , Ádám via cfe-commits
baloghadamsoftware updated this revision to Diff 75875.
baloghadamsoftware added a comment.

Updated according to the comments. Also fixed a bug and moved access check to 
pre-call instead of post-call.


https://reviews.llvm.org/D25660

Files:
  include/clang/StaticAnalyzer/Checkers/Checkers.td
  lib/StaticAnalyzer/Checkers/CMakeLists.txt
  lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  test/Analysis/Inputs/system-header-simulator-for-iterators.h
  test/Analysis/iterator-past-end.cpp

Index: test/Analysis/iterator-past-end.cpp
===
--- /dev/null
+++ test/Analysis/iterator-past-end.cpp
@@ -0,0 +1,180 @@
+// RUN: %clang_cc1 -std=c++11 -analyze -analyzer-checker=core,cplusplus,alpha.cplusplus.IteratorPastEnd %s -verify
+
+#include "Inputs/system-header-simulator-for-iterators.h"
+
+void simple_good(const std::vector ) {
+  auto i = v.end();
+  if (i != v.end())
+*i; // no-warning
+}
+
+void simple_bad(const std::vector ) {
+  auto i = v.end();
+  *i; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void decrease(const std::vector ) {
+  auto i = v.end();
+  --i;
+  *i; // no-warning
+}
+
+void copy_and_decrease1(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i1; // no-warning
+}
+
+void copy_and_decrease2(const std::vector ) {
+  auto i1 = v.end();
+  auto i2 = i1;
+  --i1;
+  *i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void copy_and_increase1(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i1 == v.end())
+*i2; // no-warning
+}
+
+void copy_and_increase2(const std::vector ) {
+  auto i1 = v.begin();
+  auto i2 = i1;
+  ++i1;
+  if (i2 == v.end())
+*i2; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find(std::vector , int e) {
+  auto first = std::find(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_find_end(std::vector , std::vector ) {
+  auto last = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_first_of(std::vector , std::vector ) {
+  auto first =
+  std::find_first_of(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_first_of(std::vector , std::vector ) {
+  auto first = std::find_end(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+bool odd(int i) { return i % 2; }
+
+void good_find_if(std::vector ) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if(std::vector , int e) {
+  auto first = std::find_if(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_find_if_not(std::vector ) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_find_if_not(std::vector , int e) {
+  auto first = std::find_if_not(vec.begin(), vec.end(), odd);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_lower_bound(std::vector , int e) {
+  auto first = std::lower_bound(vec.begin(), vec.end(), e);
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  if (vec.end() != last)
+*last; // no-warning
+}
+
+void bad_upper_bound(std::vector , int e) {
+  auto last = std::lower_bound(vec.begin(), vec.end(), e);
+  *last; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search(std::vector , std::vector ) {
+  auto first = std::search(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != first)
+*first; // no-warning
+}
+
+void bad_search(std::vector , std::vector ) {
+  auto first = std::search(vec.begin(), vec.end(), seq.begin(), seq.end());
+  *first; // expected-warning{{Iterator accessed past its end}}
+}
+
+void good_search_n(std::vector , std::vector ) {
+  auto nth = std::search_n(vec.begin(), vec.end(), seq.begin(), seq.end());
+  if (vec.end() != 

[PATCH] D25660: [Analyzer] Checker for iterators dereferenced beyond their range.

2016-10-18 Thread Artem Dergachev via cfe-commits
NoQ added a subscriber: a.sidorin.
NoQ added a comment.

Wow, you managed to check something that could be checked without going through 
a hell of modeling dozens of STL methods, and probably even without stepping on 
poor C++ temporary object modeling in the analyzer, which sounds great.

These comments are incomplete because i didn't yet take my time to understand 
how your program state traits work; hope to come back to this a bit later.

Adding Alexey because he's fond of iterators.




Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:115
+
+typedef std::pair RegionOrSymbol;
+

Maybe `llvm::PointerUnion`?



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:175
+  const auto *LCtx = C.getLocationContext();
+  if (LCtx->getKind() != LocationContext::StackFrame)
+return;

I think functions should always begin with a stack frame context, not sure, 
does this ever get violated? Do we have `checkBeginBlock`? Sorry if i'm wrong.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:183
+  const auto *Site =
+  static_cast(LCtx)->getCallSite();
+  if (!Site)

LLVM `cast<>` should be used, because it asserts cast correctness through 
LLVM's custom RTTI (and `LocationContext` child classes do support that).



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:195
+auto Param = State->getLValue(P, LCtx);
+auto Arg = State->getSVal(CE->getArg(idx++), LCtx->getParent());
+const auto *Pos = getIteratorPosition(State, Arg);

I think this trick needs more comments/explaining. It is very unusual. Are you 
trying to model effects of passing an iterator by value into a function? What 
part of these effects are not modeled magically by the core?



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:358
+
+bool IteratorPastEndChecker::evalCall(const CallExpr *CE,
+  CheckerContext ) const {

So the thing about `evalCall` is that every call can only be eval'ed by only 
one checker. So if you're doing this, you should be sure that your checker is 
modelling //all// effects of the call on //everything// in the program state, 
//manually//, and any checker that relies on that modelling should make sure 
that your checker is turned on.

Because the functions you are modelling are pure, i think it's, in general, a 
good idea to `evalCall()` them. Other checkers should be able to rely on 
PreCall/PostCall events to model their state changes.

So the question is, in what checker do we want this modelling to happen. 
Because your checker is looking for very specific errors, it might be a good 
idea to eventually split it into a separate checker. I think, at least, a FIXME 
for this task should be left around. I'm also currently tackling with a single 
checker to model all standard library functions (D20811), maybe i'd come up 
with a way to merge it there.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:443
+  if (Pos && Pos->isOutofRange()) {
+auto *N = C.generateNonFatalErrorNode(State);
+if (!N)

Accessing end() is a UB, we should probably generate a fatal error node here.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:446
+  return;
+reportPastEndBug("Iterator possibly accessed past its end.", Val, C, N);
+  }

I think path-sensitive checkers should present their findings proudly. After 
all, they did their best to find a single execution path on which the problem 
//certainly// manifests.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:521
+  auto RetVal = svalBuilder.conjureSymbolVal(nullptr, CE, LCtx, 
C.blockCount());
+  auto SecondParam = state->getSVal(CE->getArg(1), C.getLocationContext());
+

Number of arguments of `CE` should be checked beforehand. Yes, it is UB to 
modify namespace `std::` to introduce functions with same names but less 
arguments, but we still should not crash when we see such code.



Comment at: lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp:568
+  // We also should check for copy assignment operator, but for some reason
+  // it is not provided implicitly in Clang Static Analyzer
+  for (const auto *M : CRD->methods()) {

It's not analyzer's fault :) We're inspecting the AST here.

Anyway, does `CXXRecordDecl::needsImplicitCopyAssignment()` look useful?



Comment at: test/Analysis/iterator-past-end.cpp:3
+
+template  struct __iterator {
+  typedef __iterator iterator;

We should probably separate this into an #include-able header in 
`test/Analysis/Inputs/`.

Also, there's always a bit of concern that it wasn't copy-pasted from a 
standard