[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-08-26 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D81254#2218196 , @ASDenysPetrov 
wrote:

>> We know where it points to (aka. the pointer's value is conjured), but we 
>> don't know what the value if there.
>
> Absolutely right. I looked at this patch after a while and don't currently 
> see a proper solution.
>
>> I feel like this problem should be handled by some sort of invalidation 
>> mechanism.
>
> Definitely it should be some criteria/mark/flag about the region 
> invalidation. Maybe someone more confident could prompt us how to.

How about using the mistical `SymbolDerived`?
The clang-analyzer-guide 

 says:

> Another atomic symbol, closely related to SymbolRegionValue, is 
> **SymbolDerived**. **It represents a value of a region after another symbol 
> was written into a direct or indirect super-region.** SymbolDerived contains 
> a reference to both the parent symbol and the parent region. This symbol is 
> mostly a technical hack. Usually SymbolDerived appears after invalidation: 
> the whole structure of a certain type gets smashed with a single 
> SymbolConjured, and then values of its fields become represented with the 
> help of SymbolDerived of that conjured symbol and the region of the field. In 
> any case, SymbolDerived is similar to SymbolRegionValue, just refers to a 
> value after a certain event during analysis rather than at start of analysis.

Could we use that here @NoQ?

---

In D81254#2218613 , @ASDenysPetrov 
wrote:

> Should this revision be //closed //or //rejected //somehow?

I'm not sure about that.
I think it still has some value. Especially these two test-cases:

  int index_sym(const int *a, int index) {
int var;
int ret = 0;
if (a[index] < 2)
  var = 1;
  
// Since we did not write through a pointer, we can be sure that the values 
bound to 'a' are still valid.
// (We should also take strict-aliasing into account if 
-fno-strict-aliasing is not enabled.)
//
// We should take this branch if it was taken previously.
if (a[index] < 2) {
  // FIXME: We should not warn here.
  ret = var; // expected-warning{{Assigned value is garbage or undefined 
[core.uninitialized.Assign]}}
}
return ret;
  }
  
  int index_sym_invalidated(const int *a, int index, int index2) {
int var;
int ret = 0;
if (a[index] < 2)
  var = 1;
  
a[index2] = 42; // Store a value to //somewhere// in 'a'.
// A store happens through a pointer which //may// alias with the region 
'a'.
// So this store might overwrite the value of 'a[index]'
// Thus, we should invalidate all the bindings to the region 'a'.
// Except the binding to 'a[index2]' which is known to be 42.
if (a[index] < 2) {
  // This warning is reasonable.
  ret = var; // expected-warning{{Assigned value is garbage or undefined 
[core.uninitialized.Assign]}}
}
return ret;
  }


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-08-14 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov added a comment.

Hey, folk, welcome to https://reviews.llvm.org/D85984
I've moved the logic of this checker in `PthreadLockChecker`

Should this revision be //closed //or //rejected //somehow?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-08-14 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov added a comment.

@steakhal

> If index1 and index2 has the same value, we should not be confident that the 
> x == y holds.

Thanks! Now I see. Shame on me =)

> We know where it points to (aka. the pointer's value is conjured), but we 
> don't know what the value if there.

Absolutely right. I looked at this patch after a while and don't currently see 
a proper solution.

> I feel like this problem should be handled by some sort of invalidation 
> mechanism.

Definitely it should be some criteria/mark/flag about the region invalidation. 
Maybe someone more confident could prompt us how to.




Comment at: clang/test/Analysis/PR9289.cpp:13-21
+int index_int(const int *a, int index) {
+  int var;
+  int ret = 0;
+  if (a[42] < 2)
+var = 1;
+  if (a[42] < 2)
+ret = var; // no warning about garbage value

steakhal wrote:
> Why do we test this?
> Here we can //reason// about the index.
> This patch preserves the current behavior relating to this.
This is a root case of the bug which this patch came from. I decided to vary it 
by using an explicit index.



Comment at: clang/test/Analysis/PR9289.cpp:36-40
+  if (a[index] < 2)
+var = 1;
+  index = index2;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined 
[core.uninitialized.Assign]}}

steakhal wrote:
> I'm not sure why do you overwrite the `index` if you could simply index with 
> `index2` instead in the following expression. That would make no difference, 
> only make the test more readable IMO.
> Same comment for all the`*_change` test cases.
My intention is to retain `index` inside brackets. Just to check that `index` 
changed indeed.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-08-10 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D81254#2122449 , @ASDenysPetrov 
wrote:

> @NoQ, thanks for the examples.
>
> I didn't get the first one. How do you get to the "//In reality we don't 
> know//", if we don't touch `a[index1]`:

If `index1` and `index2` has the **same** value, we should not be confident 
that the `x == y` holds.

>   void foo(int *a); // Potentially modifies elements of 'a'.
>   void fooRef(int *); // Potentially modifies elements of 'a'.
>   
>   void test2(int *a) {
> // a - {reg_$6}
> foo(a); // after this, CSA doesn't change a.
> // a - {reg_$6}
> fooRef(a); // after this, CSA changes a.
> // a - {conj_$10{int *, LC1, S925, #1}}
>   }

I don't think this is the right way expressing this.
AFAIK this means that we //don't// know where the resulting pointer points to - 
which is not true.
We know where it points to (aka. the pointer's value is conjured), but we don't 
know what the value if there.

Unfortunately, I don't know the solution to your problem, but I'm pretty 
confident that this is something else.
However, I'm really interested in where this discussion goes.

---

I feel like this problem should be handled by some sort of invalidation 
mechanism.
Take my comments on these with a pinch of salt though - I'm really not 
experienced in this :|




Comment at: clang/test/Analysis/PR9289.cpp:13-21
+int index_int(const int *a, int index) {
+  int var;
+  int ret = 0;
+  if (a[42] < 2)
+var = 1;
+  if (a[42] < 2)
+ret = var; // no warning about garbage value

Why do we test this?
Here we can //reason// about the index.
This patch preserves the current behavior relating to this.



Comment at: clang/test/Analysis/PR9289.cpp:36-40
+  if (a[index] < 2)
+var = 1;
+  index = index2;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined 
[core.uninitialized.Assign]}}

I'm not sure why do you overwrite the `index` if you could simply index with 
`index2` instead in the following expression. That would make no difference, 
only make the test more readable IMO.
Same comment for all the`*_change` test cases.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-06-30 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov added a comment.

@NoQ, thanks for the examples.

I didn't get the first one. How do you get to the "//In reality we don't 
know//", if we don't touch `a[index1]`:

  void test1(int *a, int index1, int index2) {
int x = a[index1];
a[index2] = 0;
int y = a[index1];
  
// In reality we don't know but after your patch
// we're confident that this is "TRUE".
clang_analyzer_eval(x == y);
  }

I worked on the second case. I found some possible way to resovle it:

  void foo(int *a); // Potentially modifies elements of 'a'.
  void fooRef(int *); // Potentially modifies elements of 'a'.
  
  void test2(int *a) {
// a - {reg_$6}
foo(a); // after this, CSA doesn't change a.
// a - {reg_$6}
fooRef(a); // after this, CSA changes a.
// a - {conj_$10{int *, LC1, S925, #1}}
  }

Here we need to make `foo` behave the same as `fooRef`. What do you think?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-06-18 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Here are some examples:

  void test1(int *a, int index1, int index2) {
int x = a[index1];
a[index2] = 0;
int y = a[index1];
  
// In reality we don't know but after your patch
// we're confident that this is "TRUE".
clang_analyzer_eval(x == y);
  }
  
  void foo(int *a); // Potentially modifies elements of 'a'.
  
  void test2(int *a, int index) {
int x = a[index];
foo(a);
int y = a[index];
  
// In reality we don't know but after your patch
// we're confident that this is "TRUE".
clang_analyzer_eval(x == y);
  }


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-06-09 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov marked an inline comment as done.
ASDenysPetrov added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/RegionStore.cpp:1709
   if (!O.getRegion())
-return UnknownVal();
+return svalBuilder.getRegionValueSymbolVal(R);
 

NoQ wrote:
> As soon as contents of the array change during analysis, this becomes 
> incorrect: we cannot keep denoting a new value with the same old symbol.
I've updated the patch, adding more tests with the case you mentioned. Please, 
look at them, especially at functions  last four functions. Is this your case? 
If so, then I tested them and they passed successfully. And if I'm wrong, 
please, explain what I missed more detaily.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-06-09 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov updated this revision to Diff 269481.
ASDenysPetrov added a comment.

Added more tests changing an array element.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254

Files:
  clang/lib/StaticAnalyzer/Core/RegionStore.cpp
  clang/test/Analysis/PR9289.cpp

Index: clang/test/Analysis/PR9289.cpp
===
--- /dev/null
+++ clang/test/Analysis/PR9289.cpp
@@ -0,0 +1,97 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core -verify %s
+
+int index_sym(const int *a, int index) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  if (a[index] < 2)
+ret = var; // no warning about garbage value
+  return ret;
+}
+
+int index_int(const int *a, int index) {
+  int var;
+  int ret = 0;
+  if (a[42] < 2)
+var = 1;
+  if (a[42] < 2)
+ret = var; // no warning about garbage value
+  return ret;
+}
+
+int multi_ext_arr(const int **a, int index, int index2) {
+  int var;
+  int ret = 0;
+  if (a[index][index2] < 2)
+var = 1;
+  if (a[index][index2] < 2)
+ret = var; // no warning about garbage value
+  return ret;
+}
+
+int index_sym_change(const int *a, int index, int index2) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  index = index2;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined [core.uninitialized.Assign]}}
+  return ret;
+}
+
+int index_int_change(const int *a, int index, int index2) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  index = 42;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined [core.uninitialized.Assign]}}
+  return ret;
+}
+
+int element_sym_change(int *a, int index, int newValue) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  a[index] = newValue;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined [core.uninitialized.Assign]}}
+  return ret;
+}
+
+int element_int_change(int *a, int index) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  a[index] = 42;
+  if (a[index] < 2)
+ret = var; // no warning, branch does not execute
+  return ret;
+}
+
+int element_int_change2(int *a, int index) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  a[index] = 1;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined [core.uninitialized.Assign]}}
+  return ret;
+}
+
+int ptr_sym_change(int *a, int index, int *a2) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  a = a2;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined [core.uninitialized.Assign]}}
+  return ret;
+}
\ No newline at end of file
Index: clang/lib/StaticAnalyzer/Core/RegionStore.cpp
===
--- clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -1704,9 +1704,9 @@
   // FIXME: This is a hack, and doesn't do anything really intelligent yet.
   const RegionRawOffset  = R->getAsArrayOffset();
 
-  // If we cannot reason about the offset, return an unknown value.
+  // If we cannot reason about the offset, return a symbolic value.
   if (!O.getRegion())
-return UnknownVal();
+return svalBuilder.getRegionValueSymbolVal(R);
 
   if (const TypedValueRegion *baseR =
 dyn_cast_or_null(O.getRegion())) {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-06-06 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/RegionStore.cpp:1709
   if (!O.getRegion())
-return UnknownVal();
+return svalBuilder.getRegionValueSymbolVal(R);
 

As soon as contents of the array change during analysis, this becomes 
incorrect: we cannot keep denoting a new value with the same old symbol.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81254/new/

https://reviews.llvm.org/D81254



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81254: [analyzer] Produce symbolic values for C-array elements

2020-06-05 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov created this revision.
ASDenysPetrov added reviewers: dcoughlin, NoQ, alexfh.
ASDenysPetrov added a project: clang.
Herald added subscribers: cfe-commits, martong, Charusso, dkrupp, donat.nagy, 
Szelethus, mikhail.ramalho, a.sidorin, szepet, baloghadamsoftware, xazax.hun.

Problem:
The issue is that UnknownVal is produced for an array element when it is used 
in expressions with unknown bounds and unknown index. Thus it doesn't bind in 
the list of Expressions and never be used twice then.

Solution:
Produce symbolic values for array elements instead of UnknownVal. This also 
enables to bind these values and use them later in the next expressions.

This fixes https://bugs.llvm.org/show_bug.cgi?id=9289


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D81254

Files:
  clang/lib/StaticAnalyzer/Core/RegionStore.cpp
  clang/test/Analysis/PR9289.cpp


Index: clang/test/Analysis/PR9289.cpp
===
--- /dev/null
+++ clang/test/Analysis/PR9289.cpp
@@ -0,0 +1,32 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core -verify %s
+
+int fun(const int *a, int index) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  if (a[index] < 2)
+ret = var; // no warning about garbage value
+  return ret;
+}
+
+int fun2(const int **a, int index, int index2) {
+  int var;
+  int ret = 0;
+  if (a[index][index2] < 2)
+var = 1;
+  if (a[index][index2] < 2)
+ret = var; // no warning about garbage value
+  return ret;
+}
+
+int fun3(const int *a, int index, int index2) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  index = index2;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined 
[core.uninitialized.Assign]}}
+  return ret;
+}
Index: clang/lib/StaticAnalyzer/Core/RegionStore.cpp
===
--- clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -1704,9 +1704,9 @@
   // FIXME: This is a hack, and doesn't do anything really intelligent yet.
   const RegionRawOffset  = R->getAsArrayOffset();
 
-  // If we cannot reason about the offset, return an unknown value.
+  // If we cannot reason about the offset, return a symbolic value.
   if (!O.getRegion())
-return UnknownVal();
+return svalBuilder.getRegionValueSymbolVal(R);
 
   if (const TypedValueRegion *baseR =
 dyn_cast_or_null(O.getRegion())) {


Index: clang/test/Analysis/PR9289.cpp
===
--- /dev/null
+++ clang/test/Analysis/PR9289.cpp
@@ -0,0 +1,32 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core -verify %s
+
+int fun(const int *a, int index) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  if (a[index] < 2)
+ret = var; // no warning about garbage value
+  return ret;
+}
+
+int fun2(const int **a, int index, int index2) {
+  int var;
+  int ret = 0;
+  if (a[index][index2] < 2)
+var = 1;
+  if (a[index][index2] < 2)
+ret = var; // no warning about garbage value
+  return ret;
+}
+
+int fun3(const int *a, int index, int index2) {
+  int var;
+  int ret = 0;
+  if (a[index] < 2)
+var = 1;
+  index = index2;
+  if (a[index] < 2)
+ret = var; // expected-warning{{Assigned value is garbage or undefined [core.uninitialized.Assign]}}
+  return ret;
+}
Index: clang/lib/StaticAnalyzer/Core/RegionStore.cpp
===
--- clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -1704,9 +1704,9 @@
   // FIXME: This is a hack, and doesn't do anything really intelligent yet.
   const RegionRawOffset  = R->getAsArrayOffset();
 
-  // If we cannot reason about the offset, return an unknown value.
+  // If we cannot reason about the offset, return a symbolic value.
   if (!O.getRegion())
-return UnknownVal();
+return svalBuilder.getRegionValueSymbolVal(R);
 
   if (const TypedValueRegion *baseR =
 dyn_cast_or_null(O.getRegion())) {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits