[PATCH] D105436: [analyzer][solver] Use all sources of constraints

2021-07-06 Thread Valeriy Savchenko via Phabricator via cfe-commits
vsavchenko added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:908
+// into subexpressions of Sym.
+Visit(Sym));
   }

martong wrote:
> vsavchenko wrote:
> > martong wrote:
> > > Alright. So, this is correct because `Visit` boils down finally to either 
> > > `infer(Sym->getType)` or to `VisitBinaryOperator`. And both of them do a 
> > > correct over-approximation of the ranges. Please confirm. 
> > > 
> > > First, I was a bit concerned b/c it is not immediate and not documented 
> > > here. And it is easy to think by the first look that this might be faulty 
> > > if we take the approximation of one operand of a binop that might not be 
> > > true for the whole binop expression. Again, that is not the case because 
> > > we approximate only in case of such ops where we can do a correct 
> > > over-approximation (i.e. `|`, `&` and `%`). 
> > > 
> > > My point is, I'd like to see more explanatory comments here.
> > I'm sorry, but I don't really understand your point here.
> > 
> > Everything that this solver provides is conservative ranges, from whatever 
> > source it comes.  If you intersect two conservative ranges, you get a 
> > conservative range.
> > It doesn't matter what we do in `Visit` as long as it is correct.  If 
> > `Visit` is incorrect then the previous version of this code that gave 
> > preference to some sources over the other ones was also incorrect.
> Thanks for your reply.  So, with other words, I didn't see why it is 
> immediate that a range for a sub-expression is a good approximation for the 
> whole expression. Maybe it's just me, but that's not obvious until one checks 
> that what exactly happens in `Visit`.
Oh, I mean, it's not correct.  Symbolic expressions are N-ary operators, and if 
we know constraints for at least some of these N operands, we can provide a 
conservative range for the whole symbol using some knowledge of the operator.  
It doesn't say anywhere that we use a range for a sub-expression as an 
approximation for the whole range.

Actually I want to move some of these other sources inside of `Visit` as well 
because they trigger only to very specific kinds of symbolic expressions (e.g. 
binary minus, equality/disequality, comparisons).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105436/new/

https://reviews.llvm.org/D105436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105436: [analyzer][solver] Use all sources of constraints

2021-07-06 Thread Gabor Marton via Phabricator via cfe-commits
martong added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:908
+// into subexpressions of Sym.
+Visit(Sym));
   }

vsavchenko wrote:
> martong wrote:
> > Alright. So, this is correct because `Visit` boils down finally to either 
> > `infer(Sym->getType)` or to `VisitBinaryOperator`. And both of them do a 
> > correct over-approximation of the ranges. Please confirm. 
> > 
> > First, I was a bit concerned b/c it is not immediate and not documented 
> > here. And it is easy to think by the first look that this might be faulty 
> > if we take the approximation of one operand of a binop that might not be 
> > true for the whole binop expression. Again, that is not the case because we 
> > approximate only in case of such ops where we can do a correct 
> > over-approximation (i.e. `|`, `&` and `%`). 
> > 
> > My point is, I'd like to see more explanatory comments here.
> I'm sorry, but I don't really understand your point here.
> 
> Everything that this solver provides is conservative ranges, from whatever 
> source it comes.  If you intersect two conservative ranges, you get a 
> conservative range.
> It doesn't matter what we do in `Visit` as long as it is correct.  If `Visit` 
> is incorrect then the previous version of this code that gave preference to 
> some sources over the other ones was also incorrect.
Thanks for your reply.  So, with other words, I didn't see why it is immediate 
that a range for a sub-expression is a good approximation for the whole 
expression. Maybe it's just me, but that's not obvious until one checks that 
what exactly happens in `Visit`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105436/new/

https://reviews.llvm.org/D105436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105436: [analyzer][solver] Use all sources of constraints

2021-07-06 Thread Valeriy Savchenko via Phabricator via cfe-commits
vsavchenko added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:908
+// into subexpressions of Sym.
+Visit(Sym));
   }

martong wrote:
> Alright. So, this is correct because `Visit` boils down finally to either 
> `infer(Sym->getType)` or to `VisitBinaryOperator`. And both of them do a 
> correct over-approximation of the ranges. Please confirm. 
> 
> First, I was a bit concerned b/c it is not immediate and not documented here. 
> And it is easy to think by the first look that this might be faulty if we 
> take the approximation of one operand of a binop that might not be true for 
> the whole binop expression. Again, that is not the case because we 
> approximate only in case of such ops where we can do a correct 
> over-approximation (i.e. `|`, `&` and `%`). 
> 
> My point is, I'd like to see more explanatory comments here.
I'm sorry, but I don't really understand your point here.

Everything that this solver provides is conservative ranges, from whatever 
source it comes.  If you intersect two conservative ranges, you get a 
conservative range.
It doesn't matter what we do in `Visit` as long as it is correct.  If `Visit` 
is incorrect then the previous version of this code that gave preference to 
some sources over the other ones was also incorrect.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105436/new/

https://reviews.llvm.org/D105436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105436: [analyzer][solver] Use all sources of constraints

2021-07-06 Thread Gabor Marton via Phabricator via cfe-commits
martong added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:908
+// into subexpressions of Sym.
+Visit(Sym));
   }

Alright. So, this is correct because `Visit` boils down finally to either 
`infer(Sym->getType)` or to `VisitBinaryOperator`. And both of them do a 
correct over-approximation of the ranges. Please confirm. 

First, I was a bit concerned b/c it is not immediate and not documented here. 
And it is easy to think by the first look that this might be faulty if we take 
the approximation of one operand of a binop that might not be true for the 
whole binop expression. Again, that is not the case because we approximate only 
in case of such ops where we can do a correct over-approximation (i.e. `|`, `&` 
and `%`). 

My point is, I'd like to see more explanatory comments here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105436/new/

https://reviews.llvm.org/D105436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105436: [analyzer][solver] Use all sources of constraints

2021-07-06 Thread Valeriy Savchenko via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG6017cb31bb35: [analyzer][solver] Use all sources of 
constraints (authored by vsavchenko).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105436/new/

https://reviews.llvm.org/D105436

Files:
  clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
  clang/test/Analysis/constant-folding.c


Index: clang/test/Analysis/constant-folding.c
===
--- clang/test/Analysis/constant-folding.c
+++ clang/test/Analysis/constant-folding.c
@@ -179,6 +179,36 @@
   }
 }
 
+unsigned reset();
+
+void testCombinedSources(unsigned a, unsigned b) {
+  if (b >= 10 && (a | b) <= 30) {
+// Check that we can merge constraints from (a | b), a, and b.
+// Because of the order of assumptions, we already know that (a | b) is 
[10, 30].
+clang_analyzer_eval((a | b) >= 10 && (a | b) <= 30); // 
expected-warning{{TRUE}}
+  }
+
+  a = reset();
+  b = reset();
+
+  if ((a | b) <= 30 && b >= 10) {
+// Check that we can merge constraints from (a | b), a, and b.
+// At this point, we know that (a | b) is [0, 30], but the knowledge
+// of b >= 10 added later can help us to refine it and change it to [10, 
30].
+clang_analyzer_eval(10 <= (a | b) && (a | b) <= 30); // 
expected-warning{{TRUE}}
+  }
+
+  a = reset();
+  b = reset();
+
+  unsigned c = (a | b) & (a != b);
+  if (c <= 40 && a == b) {
+// Even though we have a directo constraint for c [0, 40],
+// we can get a more precise range by looking at the expression itself.
+clang_analyzer_eval(c == 0); // expected-warning{{TRUE}}
+  }
+}
+
 void testRemainderRules(unsigned int a, unsigned int b, int c, int d) {
   // Check that we know that remainder of zero divided by any number is still 
0.
   clang_analyzer_eval((0 % c) == 0); // expected-warning{{TRUE}}
Index: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
===
--- clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
+++ clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
@@ -884,26 +884,28 @@
   }
 
   RangeSet infer(SymbolRef Sym) {
-if (Optional ConstraintBasedRange = intersect(
-RangeFactory, getConstraint(State, Sym),
-// If Sym is a difference of symbols A - B, then maybe we have 
range
-// set stored for B - A.
-//
-// If we have range set stored for both A - B and B - A then
-// calculate the effective range set by intersecting the range set
-// for A - B and the negated range set of B - A.
-getRangeForNegatedSub(Sym), getRangeForEqualities(Sym))) {
-  return *ConstraintBasedRange;
-}
-
-// If Sym is a comparison expression (except <=>),
-// find any other comparisons with the same operands.
-// See function description.
-if (Optional CmpRangeSet = getRangeForComparisonSymbol(Sym)) {
-  return *CmpRangeSet;
-}
-
-return Visit(Sym);
+return intersect(
+RangeFactory,
+// Of course, we should take the constraint directly associated with
+// this symbol into consideration.
+getConstraint(State, Sym),
+// If Sym is a difference of symbols A - B, then maybe we have range
+// set stored for B - A.
+//
+// If we have range set stored for both A - B and B - A then
+// calculate the effective range set by intersecting the range set
+// for A - B and the negated range set of B - A.
+getRangeForNegatedSub(Sym),
+// If Sym is (dis)equality, we might have some information on that
+// in our equality classes data structure.
+getRangeForEqualities(Sym),
+// If Sym is a comparison expression (except <=>),
+// find any other comparisons with the same operands.
+// See function description.
+getRangeForComparisonSymbol(Sym),
+// Apart from the Sym itself, we can infer quite a lot if we look
+// into subexpressions of Sym.
+Visit(Sym));
   }
 
   RangeSet infer(EquivalenceClass Class) {


Index: clang/test/Analysis/constant-folding.c
===
--- clang/test/Analysis/constant-folding.c
+++ clang/test/Analysis/constant-folding.c
@@ -179,6 +179,36 @@
   }
 }
 
+unsigned reset();
+
+void testCombinedSources(unsigned a, unsigned b) {
+  if (b >= 10 && (a | b) <= 30) {
+// Check that we can merge constraints from (a | b), a, and b.
+// Because of the order of assumptions, we already know that (a | b) is [10, 30].
+clang_analyzer_eval((a | b) >= 10 && (a | b) <= 30); // expected-warning{{TRUE}}
+  }
+
+  a = reset();
+  b = reset();
+
+  if ((a | b) <= 30 && b >= 10) {
+// Check that we can merge constraints from (a | b), a, and b.
+// At this point, we kno

[PATCH] D105436: [analyzer][solver] Use all sources of constraints

2021-07-05 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ accepted this revision.
NoQ added a comment.
This revision is now accepted and ready to land.

such passes
@
much LLVM

> Performance measurements also show the we are within the same margins.

Great! I'd expect massive constraint solver improvements to actually make 
performance //better// because they cut infeasible paths. This one's probably 
not that massive but it's still amazing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105436/new/

https://reviews.llvm.org/D105436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105436: [analyzer][solver] Use all sources of constraints

2021-07-05 Thread Valeriy Savchenko via Phabricator via cfe-commits
vsavchenko added a comment.

I compared issues produced by this patch to the issues produced before that on 
all projects from `clang/utils/analyzer/projects`, and didn't find any 
difference.

Performance measurements also show the we are within the same margins.
F17774659: photo_2021-07-05 19.39.43.jpeg 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105436/new/

https://reviews.llvm.org/D105436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105436: [analyzer][solver] Use all sources of constraints

2021-07-05 Thread Valeriy Savchenko via Phabricator via cfe-commits
vsavchenko created this revision.
vsavchenko added reviewers: NoQ, xazax.hun, martong, steakhal, Szelethus, 
ASDenysPetrov, manas, RedDocMD.
Herald added subscribers: dkrupp, donat.nagy, mikhail.ramalho, a.sidorin, 
rnkovacs, szepet, baloghadamsoftware.
vsavchenko requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Prior to this patch, we always gave priority to constraints that we
actually know about symbols in question.  However, these can get
outdated and we can get better results if we look at all possible
sources of knowledge, including sub-expressions.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D105436

Files:
  clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
  clang/test/Analysis/constant-folding.c


Index: clang/test/Analysis/constant-folding.c
===
--- clang/test/Analysis/constant-folding.c
+++ clang/test/Analysis/constant-folding.c
@@ -179,6 +179,36 @@
   }
 }
 
+unsigned reset();
+
+void testCombinedSources(unsigned a, unsigned b) {
+  if (b >= 10 && (a | b) <= 30) {
+// Check that we can merge constraints from (a | b), a, and b.
+// Because of the order of assumptions, we already know that (a | b) is 
[10, 30].
+clang_analyzer_eval((a | b) >= 10 && (a | b) <= 30); // 
expected-warning{{TRUE}}
+  }
+
+  a = reset();
+  b = reset();
+
+  if ((a | b) <= 30 && b >= 10) {
+// Check that we can merge constraints from (a | b), a, and b.
+// At this point, we know that (a | b) is [0, 30], but the knowledge
+// of b >= 10 added later can help us to refine it and change it to [10, 
30].
+clang_analyzer_eval(10 <= (a | b) && (a | b) <= 30); // 
expected-warning{{TRUE}}
+  }
+
+  a = reset();
+  b = reset();
+
+  unsigned c = (a | b) & (a != b);
+  if (c <= 40 && a == b) {
+// Even though we have a directo constraint for c [0, 40],
+// we can get a more precise range by looking at the expression itself.
+clang_analyzer_eval(c == 0); // expected-warning{{TRUE}}
+  }
+}
+
 void testRemainderRules(unsigned int a, unsigned int b, int c, int d) {
   // Check that we know that remainder of zero divided by any number is still 
0.
   clang_analyzer_eval((0 % c) == 0); // expected-warning{{TRUE}}
Index: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
===
--- clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
+++ clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
@@ -884,26 +884,28 @@
   }
 
   RangeSet infer(SymbolRef Sym) {
-if (Optional ConstraintBasedRange = intersect(
-RangeFactory, getConstraint(State, Sym),
-// If Sym is a difference of symbols A - B, then maybe we have 
range
-// set stored for B - A.
-//
-// If we have range set stored for both A - B and B - A then
-// calculate the effective range set by intersecting the range set
-// for A - B and the negated range set of B - A.
-getRangeForNegatedSub(Sym), getRangeForEqualities(Sym))) {
-  return *ConstraintBasedRange;
-}
-
-// If Sym is a comparison expression (except <=>),
-// find any other comparisons with the same operands.
-// See function description.
-if (Optional CmpRangeSet = getRangeForComparisonSymbol(Sym)) {
-  return *CmpRangeSet;
-}
-
-return Visit(Sym);
+return intersect(
+RangeFactory,
+// Of course, we should take the constraint directly associated with
+// this symbol into consideration.
+getConstraint(State, Sym),
+// If Sym is a difference of symbols A - B, then maybe we have range
+// set stored for B - A.
+//
+// If we have range set stored for both A - B and B - A then
+// calculate the effective range set by intersecting the range set
+// for A - B and the negated range set of B - A.
+getRangeForNegatedSub(Sym),
+// If Sym is (dis)equality, we might have some information on that
+// in our equality classes data structure.
+getRangeForEqualities(Sym),
+// If Sym is a comparison expression (except <=>),
+// find any other comparisons with the same operands.
+// See function description.
+getRangeForComparisonSymbol(Sym),
+// Apart from the Sym itself, we can infer quite a lot if we look
+// into subexpressions of Sym.
+Visit(Sym));
   }
 
   RangeSet infer(EquivalenceClass Class) {


Index: clang/test/Analysis/constant-folding.c
===
--- clang/test/Analysis/constant-folding.c
+++ clang/test/Analysis/constant-folding.c
@@ -179,6 +179,36 @@
   }
 }
 
+unsigned reset();
+
+void testCombinedSources(unsigned a, unsigned b) {
+  if (b >= 10 && (a | b) <= 30) {
+// Check that we can merge constraints