[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2022-01-13 Thread Gabor Marton via Phabricator via cfe-commits
martong added a comment.

> I took Denys' patch D105340  and created a 
> prototype on top of that to create the SymbolCast in 
> SValBuilder::evalCastSubKind(loc::MemRegionVal V,

Here is the new (alternative) patch: https://reviews.llvm.org/D117229


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2022-01-12 Thread Gabor Marton via Phabricator via cfe-commits
martong added a comment.

Ah, I forgot to mention one last point:

5. Revert D115149 . We should never reach 
that problematic assertion once we produce the `SymbolCast`s.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2022-01-12 Thread Gabor Marton via Phabricator via cfe-commits
martong added a comment.

Thanks for the review guys.

Artem, I agree, that we should remove `LocAsInteger`. `LocaAsInteger` is a 
primitive handling of a specific cast operation (when we cast a pointer to an 
integer). The integration of  `LocaAsInteger` into the `SymExpr` hierarchy is 
problematic as both this patch and its parent shows.

For precision, we should produce a `SymbolCast` when we evaluate a cast 
operation of a pointer to an integral type. I agree with Denys, that 
`makeSymExprValNN` is probably not the best place to do that. I took Denys' 
patch D105340  and created a prototype on top 
of that to create the `SymbolCast` in 
`SValBuilder::evalCastSubKind(loc::MemRegionVal V,`.

  void test_param(int *p) {
long p_as_integer = (long)p;
clang_analyzer_dump(p);// 
expected-warning{{{reg_$0}}}
clang_analyzer_dump(p_as_integer); // expected-warning{{(long) (reg_$0)}}

I'd like to upload that soon.

Once we produce the `SymbolCast`s, then we can start handling them 
semantically. In my opinion, we should follow these steps:

1. Produce `SymbolCast`s. This is done in D105340 
, plus I am going to create another patch for 
pointer-to-int casts.
2. Handle `SymbolCast`s for the simplest cases. E.g. when we cast a (64bit 
width) pointer to a (64bit width unsigned) integer. This requires the 
foundation for an additional mapping in the `ProgramState` that can map a base 
symbol to cast symbols. This would be used in the constraint manager and this 
mapping must be super efficient.
3. Gradually handle more complex `SymbolCast` semantics. E.g. a widening cast.
4. Remove `LocAsInteger`. Produce the `SymbolCast`s by default and remove the 
option of `ShouldSupportSymbolicIntegerCasts`.

Point 2) and 3) are already started in D103096 
. But again, I think first we have to get the 
solid foundations with the `State` and with the `ConstraintManager` before 
getting into the more complex implementations of widening.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2022-01-10 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:420
+  auto FromTy = symRHS->getType();
+  auto ToTy = RLocAsInt->getType(this->Context);
+  symRHS = this->getSymbolManager().getCastSymbol(symRHS, FromTy, ToTy);

ASDenysPetrov wrote:
> Please take into account that `SVal::getType` may return NULL `QualType`. See 
> function's description.
Yup, `SVal::getType()` is relatively speculative. It arguably shouldn't be but 
as of now it is.

In this case though it'll probably always work correctly. Despite the relevant 
code making relatively little sense:
```lang=c++
   QualType VisitNonLocLocAsInteger(nonloc::LocAsInteger LI) {
 // Who cares about the pointer type? It's turned into an integer anyway.
 QualType NestedType = Visit(LI.getLoc());

 // Ok, now we suddenly have a failure condition when there's no pointer 
type
 // even though the pointer type is completely immaterial.
 // It probably never fails in practice though, because there's usually at 
least some
 // pointer type (void pointer in the worst case, but it's still something).
 if (NestedType.isNull())
   return NestedType;
  
 // See? We can get bit width without pointer type.
 return Context.getIntTypeForBitwidth(LI.getNumBits(),
  // Oh I know this one! It's a pointer 
type. It isn't an integer type at all.
  // Whelp I guess "always unsigned" 
works for us as well as anything else.
  // We don't have any information 
about signedness one way or the other.
  NestedType->isSignedIntegerType());
   }
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2022-01-10 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:420
+  auto FromTy = symRHS->getType();
+  auto ToTy = RLocAsInt->getType(this->Context);
+  symRHS = this->getSymbolManager().getCastSymbol(symRHS, FromTy, ToTy);

Please take into account that `SVal::getType` may return NULL `QualType`. See 
function's description.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2022-01-10 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:415-423
+  // FIXME This should be done in getAsSymbol. But then getAsSymbol should be
+  // the member function of SValBuilder (?)
+  if (symRHS)
+if (auto RLocAsInt = RHS.getAs()) {
+  auto FromTy = symRHS->getType();
+  auto ToTy = RLocAsInt->getType(this->Context);
+  symRHS = this->getSymbolManager().getCastSymbol(symRHS, FromTy, ToTy);

IMO this is not the right place for producing `SymbolCast` because we can meet 
`(type)x`-like semantics in a different contexts and expressions.



Comment at: clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:419-420
+if (auto RLocAsInt = RHS.getAs()) {
+  auto FromTy = symRHS->getType();
+  auto ToTy = RLocAsInt->getType(this->Context);
+  symRHS = this->getSymbolManager().getCastSymbol(symRHS, FromTy, ToTy);

According to https://llvm.org/docs/CodingStandards.html#id28



Comment at: clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:471-478
+  if (Optional RV = rhs.getAs()) {
+// Support pointer arithmetic where the addend is on the left
+// and the pointer on the right.
 
-if (IsCommutative(op)) {
-  // Swap operands.
-  return evalBinOpLN(state, op, *RV, lhs.castAs(), type);
-}
+assert(op == BO_Add);
 
+// Commute the operands.

This can be carried out to the separate revert revision as it is not related to 
this one directly


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2022-01-10 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov added a comment.

Starting producing `SymbolCast` you should be careful. Please pay attention on 
my revision D105340  before doing this. I 
don't want you walk through the thing I fought.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2021-12-17 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Oh nice, it's great to see these things moving.

I have a couple observations for your specific approach:

(1) Sounds like you're introducing a symbol `(long p)` that represents the 
exact same value as (and is used interchangeably with) `{$p} (as 
long)`. Ambiguity is generally bad because it means a lot of duplication and 
missed cornercases.

(2) I think you can run into the same problem with a non-symbolic region which 
you can't unwrap the same way because there's no symbol to cast:

  void foo() {
int p;
12 - (long)
  }

Maybe we should stick to your approach as strictly superior and eliminate 
`LocAsInteger` entirely? In this case the example (2) will work through 
introducing a new custom symbol `SymbolRegionAddress` which would look like 
`$addr`. Then add a hard canonicalization rule `SymRegion{$addr} = R` 
(symregions around address symbols are ill-formed). Ideally we'd also have to 
unwrap offsets, i.e. `$addr{Element{x, char, $i}} = $addr + $i`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2021-12-17 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Many thanks for digging into this @martong. I really enjoyed it!
I also believe that this is the fix for the underlying issue.

I also think the `getAsSymbol()` should be somewhere where we can create new 
symbols.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2021-12-17 Thread Gabor Marton via Phabricator via cfe-commits
martong added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:417
+  // the member function of SValBuilder (?)
+  if (symRHS)
+if (auto RLocAsInt = RHS.getAs()) {

We should handle LHS as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115932/new/

https://reviews.llvm.org/D115932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115932: [Analyzer] Create and handle SymbolCast for pointer to integral conversion

2021-12-17 Thread Gabor Marton via Phabricator via cfe-commits
martong created this revision.
martong added reviewers: steakhal, NoQ, ASDenysPetrov.
Herald added subscribers: manas, gamesh411, dkrupp, donat.nagy, Szelethus, 
mikhail.ramalho, a.sidorin, rnkovacs, szepet, baloghadamsoftware, xazax.hun.
Herald added a reviewer: Szelethus.
martong requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Currently, pointer to integral type casts are not modeled properly in
the symbol tree.
Below

  void foo(int *p) {
12 - (long)p;
  }

the resulting symbol for the BinOP is an IntSymExpr. LHS is `12` and the RHS is
`{reg_$0}`.

Even though, the `SVal` that is created for `(long)p` is correctly an
`LocAsInteger`, we loose this information when we build up the symbol tree for
`12 - (long)p`. To preserve the cast, we have to create a new node in
the symbol tree that represents it: the `SymbolCast`. This
is what this patch does.

This reverts D115149 , except the test cases.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D115932

Files:
  clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValVisitor.h
  clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
  clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp


Index: clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
===
--- clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
+++ clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
@@ -1204,7 +1204,23 @@
   return SVB.makeSymbolVal(S);
 }
 
-// TODO: Support SymbolCast.
+SVal VisitSymbolCast(const SymbolCast *Cast) {
+  SVal Castee = Visit(Cast->getOperand());
+  const auto  = SVB.getContext();
+  QualType CastTy = Cast->getType();
+
+  // Cast a pointer to an integral type.
+  if (Castee.getAs() && !Loc::isLocType(CastTy)) {
+const unsigned BitWidth = Context.getIntWidth(CastTy);
+if (Castee.getAsSymbol())
+  return SVB.makeLocAsInteger(Castee.castAs(), BitWidth);
+if (auto X = Castee.getAs())
+  return SVB.makeIntVal(X->getValue());
+// FIXME other cases?
+  }
+
+  return Base::VisitSymbolCast(Cast);
+}
 
 SVal VisitSymIntExpr(const SymIntExpr *S) {
   auto I = Cached.find(S);
Index: clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
===
--- clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
+++ clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
@@ -412,6 +412,15 @@
   SymbolRef symLHS = LHS.getAsSymbol();
   SymbolRef symRHS = RHS.getAsSymbol();
 
+  // FIXME This should be done in getAsSymbol. But then getAsSymbol should be
+  // the member function of SValBuilder (?)
+  if (symRHS)
+if (auto RLocAsInt = RHS.getAs()) {
+  auto FromTy = symRHS->getType();
+  auto ToTy = RLocAsInt->getType(this->Context);
+  symRHS = this->getSymbolManager().getCastSymbol(symRHS, FromTy, ToTy);
+}
+
   // TODO: When the Max Complexity is reached, we should conjure a symbol
   // instead of generating an Unknown value and propagate the taint info to it.
   const unsigned MaxComp = AnOpts.MaxSymbolComplexity;
@@ -459,23 +468,14 @@
 return evalBinOpLN(state, op, *LV, rhs.castAs(), type);
   }
 
-  if (const Optional RV = rhs.getAs()) {
-const auto IsCommutative = [](BinaryOperatorKind Op) {
-  return Op == BO_Mul || Op == BO_Add || Op == BO_And || Op == BO_Xor ||
- Op == BO_Or;
-};
+  if (Optional RV = rhs.getAs()) {
+// Support pointer arithmetic where the addend is on the left
+// and the pointer on the right.
 
-if (IsCommutative(op)) {
-  // Swap operands.
-  return evalBinOpLN(state, op, *RV, lhs.castAs(), type);
-}
+assert(op == BO_Add);
 
-// If the right operand is a concrete int location then we have nothing
-// better but to treat it as a simple nonloc.
-if (auto RV = rhs.getAs()) {
-  const nonloc::ConcreteInt RhsAsLoc = makeIntVal(RV->getValue());
-  return evalBinOpNN(state, op, lhs.castAs(), RhsAsLoc, type);
-}
+// Commute the operands.
+return evalBinOpLN(state, op, *RV, lhs.castAs(), type);
   }
 
   return evalBinOpNN(state, op, lhs.castAs(), rhs.castAs(),
Index: clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValVisitor.h
===
--- clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValVisitor.h
+++ clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValVisitor.h
@@ -141,6 +141,7 @@
   using SValVisitor::Visit;
   using SymExprVisitor::Visit;
   using MemRegionVisitor::Visit;
+  using Base = FullSValVisitor;
 };
 
 } // end namespace ento


Index: clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
===
--- clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
+++ clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
@@