[PATCH] D60112: [analyzer] Treat write into a top-level parameter variable with destructor as escape.

2019-04-12 Thread Phabricator via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rC358321: [analyzer] Escape pointers stored into top-level 
parameters with destructors. (authored by dergachev, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D60112?vs=193229=194993#toc

Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60112/new/

https://reviews.llvm.org/D60112

Files:
  lib/StaticAnalyzer/Core/ExprEngine.cpp
  test/Analysis/malloc.cpp

Index: test/Analysis/malloc.cpp
===
--- test/Analysis/malloc.cpp
+++ test/Analysis/malloc.cpp
@@ -141,3 +141,26 @@
   }
   return funcname; // no-warning
 }
+
+namespace argument_leak {
+class A {
+  char *name;
+
+public:
+  char *getName() {
+if (!name) {
+  name = static_cast(malloc(10));
+}
+return name;
+  }
+  ~A() {
+if (name) {
+  delete[] name;
+}
+  }
+};
+
+void test(A a) {
+  (void)a.getName();
+}
+} // namespace argument_leak
Index: lib/StaticAnalyzer/Core/ExprEngine.cpp
===
--- lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -2621,43 +2621,39 @@
   getCheckerManager().runCheckersForPostStmt(Dst, AfterInvalidateSet, AE, *this);
 }
 
-// A value escapes in three possible cases:
+// A value escapes in four possible cases:
 // (1) We are binding to something that is not a memory region.
-// (2) We are binding to a MemrRegion that does not have stack storage.
-// (3) We are binding to a MemRegion with stack storage that the store
+// (2) We are binding to a MemRegion that does not have stack storage.
+// (3) We are binding to a top-level parameter region with a non-trivial
+// destructor. We won't see the destructor during analysis, but it's there.
+// (4) We are binding to a MemRegion with stack storage that the store
 // does not understand.
-ProgramStateRef ExprEngine::processPointerEscapedOnBind(ProgramStateRef State,
-SVal Loc,
-SVal Val,
-const LocationContext *LCtx) {
-  // Are we storing to something that causes the value to "escape"?
-  bool escapes = true;
-
-  // TODO: Move to StoreManager.
-  if (Optional regionLoc = Loc.getAs()) {
-escapes = !regionLoc->getRegion()->hasStackStorage();
-
-if (!escapes) {
-  // To test (3), generate a new state with the binding added.  If it is
-  // the same state, then it escapes (since the store cannot represent
-  // the binding).
-  // Do this only if we know that the store is not supposed to generate the
-  // same state.
-  SVal StoredVal = State->getSVal(regionLoc->getRegion());
-  if (StoredVal != Val)
-escapes = (State == (State->bindLoc(*regionLoc, Val, LCtx)));
-}
-  }
+ProgramStateRef
+ExprEngine::processPointerEscapedOnBind(ProgramStateRef State, SVal Loc,
+SVal Val, const LocationContext *LCtx) {
 
-  // If our store can represent the binding and we aren't storing to something
-  // that doesn't have local storage then just return and have the simulation
-  // state continue as is.
-  if (!escapes)
-return State;
+  // Cases (1) and (2).
+  const MemRegion *MR = Loc.getAsRegion();
+  if (!MR || !MR->hasStackStorage())
+return escapeValue(State, Val, PSK_EscapeOnBind);
+
+  // Case (3).
+  if (const auto *VR = dyn_cast(MR->getBaseRegion()))
+if (VR->hasStackParametersStorage() && VR->getStackFrame()->inTopFrame())
+  if (const auto *RD = VR->getValueType()->getAsCXXRecordDecl())
+if (!RD->hasTrivialDestructor())
+  return escapeValue(State, Val, PSK_EscapeOnBind);
+
+  // Case (4): in order to test that, generate a new state with the binding
+  // added. If it is the same state, then it escapes (since the store cannot
+  // represent the binding).
+  // Do this only if we know that the store is not supposed to generate the
+  // same state.
+  SVal StoredVal = State->getSVal(MR);
+  if (StoredVal != Val)
+if (State == (State->bindLoc(loc::MemRegionVal(MR), Val, LCtx)))
+  return escapeValue(State, Val, PSK_EscapeOnBind);
 
-  // Otherwise, find all symbols referenced by 'val' that we are tracking
-  // and stop tracking them.
-  State = escapeValue(State, Val, PSK_EscapeOnBind);
   return State;
 }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60112: [analyzer] Treat write into a top-level parameter variable with destructor as escape.

2019-04-10 Thread Kristóf Umann via Phabricator via cfe-commits
Szelethus accepted this revision.
Szelethus added a comment.
This revision is now accepted and ready to land.

Okay, I played around with this patch, I see now where this is going! LGTM!

> Do you think i should document it somehow?

Aye, the description you gave was enlightening, thanks! If you can squeeze it 
somewhere in the code where it isn't out of place, it's all the better! :)




Comment at: clang/test/Analysis/malloc.cpp:151
+  char *getName() {
+if (!name) {
+  name = static_cast(malloc(10));

NoQ wrote:
> Szelethus wrote:
> > Is this relevant? `name` will never be null.
> Not really, just makes the code look a bit more sensible and idiomatic and 
> less warning-worthy-anyway, to make it as clear as possible that the positive 
> here is indeed false. We don't really have a constructor in this class, but 
> we can imagine that it zero-initializes name. Without this check calling 
> `getName()` multiple times would immediately result in a leak.
Convinced ;)


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60112/new/

https://reviews.llvm.org/D60112



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60112: [analyzer] Treat write into a top-level parameter variable with destructor as escape.

2019-04-02 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ marked an inline comment as done.
NoQ added a comment.

In D60112#1451198 , @Szelethus wrote:

> Woah, the code looks amazing, cheers on the refactoring! I'll be honest, I'm 
> struggling a bit with the sentence "we're now in the top frame". In order, I 
> don't understand what does
>
> - we
> - now
> - in the top frame mean. "Top-level argument" is another one -- Do we have 
> **precise** definitions for there terms?


Cf. `LocationContext::inTopFrame()`.

The top frame is the `StackFrameContext` whose parent context is `nullptr`. 
There is only one such context and it is the root of the location context tree. 
It corresponds to the stack frame from which the current Analysis begins. That 
is, each such context corresponds to a path-sensitive analysis line in the 
`-analyzer-display-progress` dump. Other stack frame contexts (with non-null 
parents) correspond to function calls *during* analysis. They usually 
correspond to stack frames of inlined calls (since D49443 
 we sometimes create stack frame contexts for 
calls that will never be inlined, but the rough idea is still the same).

So this patch talks about the situation when we start analysis from function, 
say, `test(A a) { ... }`, and its argument `a` exists throughout the whole 
analysis: its constructor was called before the analysis has started, and its 
destructor will be called after the analysis has ended. Having no parent 
context means that we don't know anything about the caller or the call site of 
`test(a)` - the information we usually do have for inlined calls.

The phrase "We are now in the top frame" therefore roughly translates to "The 
`CoreEngine` worklist item that the `ExprEngine` is currently processing 
corresponds to an `ExplodedNode` that has a `ProgramPoint` with a 
`LocationContext` whose nearest `StackFrameContext` is the top frame". When i'm 
talking about top-level variables, i mean "A `VarRegion` whose parent region is 
a `StackArgumentsSpaceRegion` whose identity is a top-frame 
`StackFrameContext`".

Do you think i should document it somehow?




Comment at: clang/test/Analysis/malloc.cpp:151
+  char *getName() {
+if (!name) {
+  name = static_cast(malloc(10));

Szelethus wrote:
> Is this relevant? `name` will never be null.
Not really, just makes the code look a bit more sensible and idiomatic and less 
warning-worthy-anyway, to make it as clear as possible that the positive here 
is indeed false. We don't really have a constructor in this class, but we can 
imagine that it zero-initializes name. Without this check calling `getName()` 
multiple times would immediately result in a leak.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60112/new/

https://reviews.llvm.org/D60112



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60112: [analyzer] Treat write into a top-level parameter variable with destructor as escape.

2019-04-02 Thread Kristóf Umann via Phabricator via cfe-commits
Szelethus added a comment.

Woah, the code looks amazing, cheers on the refactoring! I'll be honest, I'm 
struggling a bit with the sentence "we're now in the top frame". In order, I 
don't understand what does

- we
- now
- in the top frame

mean. "Top-level argument" is another one -- Do we have **precise** definitions 
for there terms?




Comment at: clang/test/Analysis/malloc.cpp:151
+  char *getName() {
+if (!name) {
+  name = static_cast(malloc(10));

Is this relevant? `name` will never be null.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60112/new/

https://reviews.llvm.org/D60112



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60112: [analyzer] Treat write into a top-level parameter variable with destructor as escape.

2019-04-01 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ created this revision.
NoQ added reviewers: dcoughlin, xazax.hun, a_sidorin, rnkovacs, 
mikhail.ramalho, Szelethus, baloghadamsoftware, Charusso.
Herald added subscribers: cfe-commits, jdoerfert, dkrupp, donat.nagy, 
a.sidorin, szepet.
Herald added a project: clang.

Writing stuff into an argument variable is usually equivalent to writing stuff 
to a local variable: it will have no effect outside of the function. There's an 
important exception from this rule: if the argument variable has a non-trivial 
destructor, the destructor would be invoked on the parent stack frame, exposing 
contents of the otherwise dead argument variable to the caller.

We've had this problem in https://bugs.llvm.org/show_bug.cgi?id=37459#c3 where 
we weren't invalidating argument regions after the call. Such invalidation is 
completely unnecessary when the argument doesn't have a destructor, but it's 
vital when it does.

The newly added test case demonstrates the same problem but "inside out": when 
we're receiving an object with a non-trivial destructor as a top-level 
argument, we're exposing ourselves to the destructor call of this variable 
which we won't ever encounter during the current analysis because it'll only 
happen in the parent stack frame. Such destructor may do various stuff with 
values we put into the variable, such as deallocating memory owned by the 
object, but we won't see it and report spurious leaks.

Note that the parameter variable is dead after it's referenced for the last 
time within the function regardless of whether it has a destructor or not. The 
variable is dead because we can guarantee that we'll never be able to access it 
throughout the rest of the analysis. It indicates that all our knowledge about 
the variable is final. For example, if there's a pointer stored in this 
variable that's allocated, and it's not stored anywhere else, it won't be 
deallocated until the end of the analysis. This is why it is incorrect to 
simply make top-level parameter variables with destructors live forever: it 
contradicts the performance-related purpose of dead symbol collection, even if 
it does play nicely with the leak-finding purpose of dead symbols collection.

Therefore i believe that the right solution is to treat any writes into 
top-level parameters with destructors as //escapes//. The value is still stored 
there and something that's beyond our control (in this case, a destructor call) 
will happen to it and we cannot predict what exactly will happen. It's a 
typical escape scenario.

Well, in fact, we *can* predict what happens. After all, it happens immediately 
after the end of the analysis and we don't need to know anything about the 
caller stack frame in order to evaluate these destructors. So the right right 
solution is to just append destructor modeling to the end of the analysis. 
This, however, is going to be very hard to implement because you'll have to 
teach the analyzer how to behave correctly with a null location context - 
that's going to be a lot of crashes to sort out.


Repository:
  rC Clang

https://reviews.llvm.org/D60112

Files:
  clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
  clang/test/Analysis/malloc.cpp

Index: clang/test/Analysis/malloc.cpp
===
--- clang/test/Analysis/malloc.cpp
+++ clang/test/Analysis/malloc.cpp
@@ -141,3 +141,26 @@
   }
   return funcname; // no-warning
 }
+
+namespace argument_leak {
+class A {
+  char *name;
+
+public:
+  char *getName() {
+if (!name) {
+  name = static_cast(malloc(10));
+}
+return name;
+  }
+  ~A() {
+if (name) {
+  delete[] name;
+}
+  }
+};
+
+void test(A a) {
+  (void)a.getName();
+}
+} // namespace argument_leak
Index: clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
===
--- clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -2623,43 +2623,39 @@
   getCheckerManager().runCheckersForPostStmt(Dst, AfterInvalidateSet, AE, *this);
 }
 
-// A value escapes in three possible cases:
+// A value escapes in four possible cases:
 // (1) We are binding to something that is not a memory region.
-// (2) We are binding to a MemrRegion that does not have stack storage.
-// (3) We are binding to a MemRegion with stack storage that the store
+// (2) We are binding to a MemRegion that does not have stack storage.
+// (3) We are binding to a top-level parameter region with a non-trivial
+// destructor. We won't see the destructor during analysis, but it's there.
+// (4) We are binding to a MemRegion with stack storage that the store
 // does not understand.
-ProgramStateRef ExprEngine::processPointerEscapedOnBind(ProgramStateRef State,
-SVal Loc,
-SVal Val,
-const