[PATCH] D106136: [Analyzer][solver] Fix equivalence class invariant violation in removeDeadBindings

2021-07-20 Thread Gabor Marton via Phabricator via cfe-commits
martong added a comment.

Here it is: https://reviews.llvm.org/D106370


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106136/new/

https://reviews.llvm.org/D106136

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106136: [Analyzer][solver] Fix equivalence class invariant violation in removeDeadBindings

2021-07-20 Thread Valeriy Savchenko via Phabricator via cfe-commits
vsavchenko added a comment.

In D106136#2889610 , @martong wrote:

> In D106136#2883707 , @vsavchenko 
> wrote:
>
>> In D106136#2883100 , @martong 
>> wrote:
>>
>>> Thanks for taking your time to take a look. And I accept your statements. 
>>> Nevertheless, let me explain my motivation.
>>>
>>> I thought that a class is trivial if it has only one member. And it seemed 
>>> perfectly logical to not store equivalence classes with one member. I.e `a` 
>>> equals to `a` does not hold any meaningful information it just takes 
>>> precious memory. When we add a new constraint we take careful steps to 
>>> avoid adding a new class with one member. However, when remove dead kicks 
>>> in, suddenly we end up having classes stored with one member, which is a 
>>> somewhat inconsistent design IMHO. Perhaps some better documentation could 
>>> have helped.
>>
>> Representative symbol gives its equivalence class an ID.  We use this ID for 
>> everything, for comparing and for mapping.  Since we live in a persistent 
>> world, we can't just change this ID at some point, it will basically mean 
>> that we want to replace a class with another class.  So, all maps where this 
>> class participated (constraints, class, members, disequality) should remove 
>> the old class and add the new class.  This is a huge burden.  You need to 
>> carefully do all this, and bloat existing data structures.
>>
>> Trivial class is an optimization for the vast majority of symbols that never 
>> get into any classes.  We still need to associate constraints with those.  
>> This is where implicit Symbol to Class conversion comes in handy.
>>
>> Now let's imagine that something else is merged into it.  We are obliged to 
>> officially map the symbol to its class, and the class to its members.  When 
>> this happens, it goes into the data structure FOREVER.  And when in the 
>> future, with only one member, instead of keeping something that we already 
>> have from other versions of the data structure, you decide to remove the 
>> item from the class map, you actually use more memory when it was there!  
>> This is how persistent data structures work, different versions of the same 
>> data structure share data.  And I'm not even talking here about the fact 
>> that you now need to replace the class with one ID with the class with 
>> another ID in every relationship.
>>
>> You can probably measure memory footprint in your example and see it for 
>> yourself.
>
> Yeah, makes sense. Thanks for taking more time for further explanations. 
> Still, IMHO, the definition of a **trivial class** needs a clear written 
> documentation in the code itself, so other developers can easier understand 
> and maintain this code. I am going to create an NFC patch that summarizes 
> this discussion in a comment attached to the `isTrivial` function.

That's a great idea!  Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106136/new/

https://reviews.llvm.org/D106136

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106136: [Analyzer][solver] Fix equivalence class invariant violation in removeDeadBindings

2021-07-20 Thread Gabor Marton via Phabricator via cfe-commits
martong abandoned this revision.
martong added a comment.

In D106136#2883707 , @vsavchenko 
wrote:

> In D106136#2883100 , @martong wrote:
>
>> Thanks for taking your time to take a look. And I accept your statements. 
>> Nevertheless, let me explain my motivation.
>>
>> I thought that a class is trivial if it has only one member. And it seemed 
>> perfectly logical to not store equivalence classes with one member. I.e `a` 
>> equals to `a` does not hold any meaningful information it just takes 
>> precious memory. When we add a new constraint we take careful steps to avoid 
>> adding a new class with one member. However, when remove dead kicks in, 
>> suddenly we end up having classes stored with one member, which is a 
>> somewhat inconsistent design IMHO. Perhaps some better documentation could 
>> have helped.
>
> Representative symbol gives its equivalence class an ID.  We use this ID for 
> everything, for comparing and for mapping.  Since we live in a persistent 
> world, we can't just change this ID at some point, it will basically mean 
> that we want to replace a class with another class.  So, all maps where this 
> class participated (constraints, class, members, disequality) should remove 
> the old class and add the new class.  This is a huge burden.  You need to 
> carefully do all this, and bloat existing data structures.
>
> Trivial class is an optimization for the vast majority of symbols that never 
> get into any classes.  We still need to associate constraints with those.  
> This is where implicit Symbol to Class conversion comes in handy.
>
> Now let's imagine that something else is merged into it.  We are obliged to 
> officially map the symbol to its class, and the class to its members.  When 
> this happens, it goes into the data structure FOREVER.  And when in the 
> future, with only one member, instead of keeping something that we already 
> have from other versions of the data structure, you decide to remove the item 
> from the class map, you actually use more memory when it was there!  This is 
> how persistent data structures work, different versions of the same data 
> structure share data.  And I'm not even talking here about the fact that you 
> now need to replace the class with one ID with the class with another ID in 
> every relationship.
>
> You can probably measure memory footprint in your example and see it for 
> yourself.

Yeah, makes sense. Thanks for taking more time for further explanations. Still, 
IMHO, the definition of a **trivial class** needs a clear written documentation 
in the code itself, so other developers can easier understand and maintain this 
code. I am going to create an NFC patch that summarizes this discussion in a 
comment attached to the `isTrivial` function.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106136/new/

https://reviews.llvm.org/D106136

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106136: [Analyzer][solver] Fix equivalence class invariant violation in removeDeadBindings

2021-07-16 Thread Valeriy Savchenko via Phabricator via cfe-commits
vsavchenko added a comment.

In D106136#2883100 , @martong wrote:

> Thanks for taking your time to take a look. And I accept your statements. 
> Nevertheless, let me explain my motivation.
>
> I thought that a class is trivial if it has only one member. And it seemed 
> perfectly logical to not store equivalence classes with one member. I.e `a` 
> equals to `a` does not hold any meaningful information it just takes precious 
> memory. When we add a new constraint we take careful steps to avoid adding a 
> new class with one member. However, when remove dead kicks in, suddenly we 
> end up having classes stored with one member, which is a somewhat 
> inconsistent design IMHO. Perhaps some better documentation could have helped.

Representative symbol gives its equivalence class an ID.  We use this ID for 
everything, for comparing and for mapping.  Since we live in a persistent 
world, we can't just change this ID at some point, it will basically mean that 
we want to replace a class with another class.  So, all maps where this class 
participated (constraints, class, members, disequality) should remove the old 
class and add the new class.  This is a huge burden.  You need to carefully do 
all this, and bloat existing data structures.

Trivial class is an optimization for the vast majority of symbols that never 
get into any classes.  We still need to associate constraints with those.  This 
is where implicit Symbol to Class conversion comes in handy.

Now let's imagine that something else is merged into it.  We are obliged to 
officially map the symbol to its class, and the class to its members.  When 
this happens, it goes into the data structure FOREVER.  And when in the future, 
with only one member, instead of keeping something that we already have from 
other versions of the data structure, you decide to remove the item from the 
class map, you actually use more memory when it was there!  This is how 
persistent data structures work, different versions of the same data structure 
share data.  And I'm not even talking here about the fact that you now need to 
replace the class with one ID with the class with another ID in every 
relationship.

You can probably measure memory footprint in your example and see it for 
yourself.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106136/new/

https://reviews.llvm.org/D106136

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106136: [Analyzer][solver] Fix equivalence class invariant violation in removeDeadBindings

2021-07-16 Thread Gabor Marton via Phabricator via cfe-commits
martong added a comment.

Thanks for taking your time to take a look. And I accept your statements. 
Nevertheless, let me explain my motivation.

I thought that a class is trivial if it has only one member. And it seemed 
perfectly logical to not store equivalence classes with one member. I.e `a` 
equals to `a` does not hold any meaningful information it just takes precious 
memory. When we add a new constraint we take careful steps to avoid adding a 
new class with one member. However, when remove dead kicks in, suddenly we end 
up having classes stored with one member, which is a somewhat inconsistent 
design IMHO. Perhaps some better documentation could have helped.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106136/new/

https://reviews.llvm.org/D106136

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106136: [Analyzer][solver] Fix equivalence class invariant violation in removeDeadBindings

2021-07-16 Thread Valeriy Savchenko via Phabricator via cfe-commits
vsavchenko added a comment.

Thanks for working on it, but it is a quite large change that I don't get the 
motivation for (it doesn't even fix the recently found bug).

Summary explains what the patch does, but for why it is done, it talks about an 
invariant that is not specified anywhere in the code.

When I was implementing it, trivial class is a symbol that was never ever equal 
to anything else.  The moment we make merge something into it, it stops being 
trivial, forever!  Representative is an implementation detail and shouldn't be 
required to represent the actual information.  What I'm saying is that 
representative can be long gone, but the class can still hold.  If we use 
representative symbol for something other than its pointer or type after it's 
gone, the mistake is using representative in the first place, not the 
motivation to make it more complex and change representatives.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106136/new/

https://reviews.llvm.org/D106136

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106136: [Analyzer][solver] Fix equivalence class invariant violation in removeDeadBindings

2021-07-16 Thread Gabor Marton via Phabricator via cfe-commits
martong added a subscriber: vabridgers.
martong added a comment.

Note1: This issue is not related to and does not fix 
https://bugs.llvm.org/show_bug.cgi?id=51109
Note2: I am about to measure any performance penalty induced by the extra 
bookkeeping we must do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106136/new/

https://reviews.llvm.org/D106136

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106136: [Analyzer][solver] Fix equivalence class invariant violation in removeDeadBindings

2021-07-16 Thread Gabor Marton via Phabricator via cfe-commits
martong created this revision.
martong added reviewers: vsavchenko, NoQ, steakhal.
Herald added subscribers: manas, ASDenysPetrov, gamesh411, dkrupp, donat.nagy, 
Szelethus, mikhail.ramalho, a.sidorin, rnkovacs, szepet, baloghadamsoftware, 
xazax.hun, whisperity.
Herald added a reviewer: Szelethus.
martong requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

There is an invariant in the range based solver for equivalence classes.
We don't store class->member associations for trivial classes in the
state (`ClassMembers`). This means any SymbolSet stored in ClassMembers
must have at least two members. This invariant is violated in
removeDeadBindings, because we remove a class from `ClassMembers` only
once it became empty.

Fixing this invariant violation implies that we must prepare for element
removals from an equivalence class. An element removal might result in
downgrading a non-trivial class to trivial, also the representative
symbol might be changed. If the representative symbol has changed then
we have to re-key constraints and the disequality info.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106136

Files:
  clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
  clang/test/Analysis/equivalence-classes-and-remove-dead.c

Index: clang/test/Analysis/equivalence-classes-and-remove-dead.c
===
--- /dev/null
+++ clang/test/Analysis/equivalence-classes-and-remove-dead.c
@@ -0,0 +1,64 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -analyzer-checker=core,debug.ExprInspection \
+// RUN:   -analyzer-config eagerly-assume=false
+
+// RUN: %clang_analyze_cc1 %s \
+// RUN:   -analyzer-checker=core,debug.ExprInspection \
+// RUN:   -analyzer-config eagerly-assume=false \
+// RUN:   2>&1 | FileCheck %s
+
+void clang_analyzer_eval(int);
+void clang_analyzer_warnIfReached();
+void clang_analyzer_printState();
+
+void testIntersection(int a, int b, int c) {
+  if (a < 42 && b > 15 && c >= 25 && c <= 30) {
+if (a != b)
+  return;
+
+clang_analyzer_eval(a > 15);  // expected-warning{{TRUE}}
+clang_analyzer_eval(b < 42);  // expected-warning{{TRUE}}
+clang_analyzer_eval(a <= 30); // expected-warning{{UNKNOWN}}
+
+clang_analyzer_printState(); // At this point a, b, c are alive
+// CHECK:  "constraints": [
+// CHECK-NEXT:   { "symbol": "reg_$0", "range": "{ [16, 41] }" },
+// CHECK-NEXT:   { "symbol": "reg_$1", "range": "{ [16, 41] }" },
+// CHECK-NEXT:   { "symbol": "reg_$2", "range": "{ [25, 30] }" },
+// CHECK-NEXT:   { "symbol": "(reg_$0) != (reg_$1)", "range": "{ [0, 0] }" }
+// CHECK-NEXT: ],
+// CHECK-NEXT: "equivalence_classes": [
+// CHECK-NEXT:   [ "reg_$0", "reg_$1" ]
+// CHECK-NEXT: ],
+// CHECK-NEXT: "disequality_info": null,
+
+if (c == b) {
+  // Also, it should be noted that c is dead at this point, but the
+  // constraint initially associated with c is still around.
+  clang_analyzer_printState(); // a, b are alive
+  // CHECK:  "constraints": [
+  // CHECK-NEXT:   { "symbol": "reg_$0", "range": "{ [25, 30] }" },
+  // CHECK-NEXT:   { "symbol": "reg_$1", "range": "{ [25, 30] }" },
+  // CHECK-NEXT:   { "symbol": "(reg_$0) != (reg_$1)", "range": "{ [0, 0] }" }
+  // CHECK-NEXT: ],
+  // CHECK-NEXT: "equivalence_classes": [
+  // CHECK-NEXT:   [ "reg_$0", "reg_$1" ]
+  // CHECK-NEXT: ],
+  // CHECK-NEXT: "disequality_info": null,
+  clang_analyzer_eval(a >= 25 && a <= 30); // expected-warning{{TRUE}}
+
+  clang_analyzer_printState(); // only b is alive
+  // CHECK:   "constraints": [
+  // CHECK-NEXT:{ "symbol": "reg_$1", "range": "{ [25, 30] }" }
+  // CHECK-NEXT:  ],
+  // CHECK-NEXT:  "equivalence_classes": null,
+  // CHECK-NEXT:  "disequality_info": null,
+  clang_analyzer_eval(b >= 25 && b <= 30); // expected-warning{{TRUE}}
+
+  clang_analyzer_printState(); // all died
+  // CHECK:   "constraints": null,
+  // CHECK-NEXT:  "equivalence_classes": null,
+  // CHECK-NEXT:  "disequality_info": null,
+}
+  }
+}
Index: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
===
--- clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
+++ clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
@@ -592,6 +592,16 @@
   RangeSet::Factory ,
   ProgramStateRef State);
 
+  /// Associate new members to the class. This may result in changing the
+  /// representative symbol or downgrading from non-trivial to trivial.
+  LLVM_NODISCARD ProgramStateRef setNewMembers(SymbolSet NewMembers,
+   ProgramStateRef State);
+
+  // Remove an already empty class amongs the constraints and the