[PATCH] D71041: [analyzer][discussion] Talk about escapes

Artem Dergachev via Phabricator via cfe-commits Thu, 05 Dec 2019 13:13:56 -0800

NoQ added inline comments.


================
Comment at: clang/test/Analysis/fuchsia_handle.cpp:210
+  // Because of arrays, structs, the suggestion is to escape when whe no longer
+  // have any pointer to that symbolic region.
+  if (zx_channel_create(0, get_handle_address(), &sb))
----------------
xazax.hun wrote:
> xazax.hun wrote:
> > xazax.hun wrote:
> > > NoQ wrote:
> > > > xazax.hun wrote:
> > > > > NoQ wrote:
> > > > > > NoQ wrote:
> > > > > > > This has nothing to do with symbolic regions. We can run into 
> > > > > > > this problem even if it's a local variable in the current stack 
> > > > > > > frame:
> > > > > > > ```lang=c++
> > > > > > > void foo() {
> > > > > > >   zx_handle_t sa, sb;
> > > > > > >   escape(&sb); // Escape *before* create!!
> > > > > > > 
> > > > > > >   zx_channel_create(0, &sa, &sb);
> > > > > > >   zx_handle_close(sa);
> > > > > > >   close_escaped();
> > > > > > > }
> > > > > > > ```
> > > > > > > 
> > > > > > > The solution that'll obviously work would be to keep track of all 
> > > > > > > regions that escaped at least once, and then not even start 
> > > > > > > tracking the handle if it's getting placed into a region that 
> > > > > > > causes an escape when written into or has itself escaped before, 
> > > > > > > but that sounds like a huge overkill.
> > > > > > > 
> > > > > > > Lemme think. This sounds vaguely familiar but i can't immediately 
> > > > > > > recall what my thoughts were last time i thought about it.
> > > > > > `$ cat test.c`
> > > > > > ```lang=c++
> > > > > > void manage(void **x);
> > > > > > void free_managed();
> > > > > > 
> > > > > > void foo() {
> > > > > >   void *x;
> > > > > >   manage(&x);
> > > > > >   x = malloc(1);
> > > > > >   free_managed();
> > > > > > }
> > > > > > ```
> > > > > > `$ clang --analyze test.c`
> > > > > > ```lang=c++
> > > > > > test.c:8:3: warning: Potential leak of memory pointed to by 'x'
> > > > > >   free_managed();
> > > > > >   ^~~~~~~~~~~~~~
> > > > > > 1 warning generated.
> > > > > > ```
> > > > > > Sigh.
> > > > > Oh, I see. Yeah this one will be fun to deal with 
> > > > The rules are pretty easy though, right?
> > > > ```lang=c++
> > > >   2680 // A value escapes in four possible cases:
> > > >   2681 // (1) We are binding to something that is not a memory region.
> > > >   2682 // (2) We are binding to a MemRegion that does not have stack 
> > > > storage.
> > > >   2683 // (3) We are binding to a top-level parameter region with a 
> > > > non-trivial
> > > >   2684 //     destructor. We won't see the destructor during analysis, 
> > > > but it's there.
> > > >   2685 // (4) We are binding to a MemRegion with stack storage that the 
> > > > store
> > > >   2686 //     does not understand.
> > > >   2687 ProgramStateRef
> > > >   2688 ExprEngine::processPointerEscapedOnBind(ProgramStateRef State, 
> > > > SVal Loc,
> > > >   2689                                         SVal Val, const 
> > > > LocationContext *LCtx) {
> > > > ```
> > > > Basically, locals are the only special case; writing into anything else 
> > > > should be an immediate escape.
> > > > 
> > > > We could easily update this procedure to additionally keep track of all 
> > > > escaped locals in the program state, and then escape all binds to 
> > > > previously escaped locals as well.
> > > > 
> > > > The checker would then have to follow the same rules.
> > > > 
> > > > In the worst case, manually.
> > > > 
> > > > But i think we should instead update the `checkPointerEscape()` 
> > > > callback to escape the values of out-parameters upon evaluating the 
> > > > call conservatively (if it doesn't already) and then teach the checker 
> > > > to mark escaped regions explicitly as escaped (as opposed to removing 
> > > > them from the program state), and then teach it to never transition 
> > > > from escaped to opened. That would be cleaner because that'll only 
> > > > require us to hardcode the escaping procedure once.
> > > > 
> > > > Or we could just make the "bool 
> > > > doesWriteIntoThisRegionCauseAnEscape?(Region, State)" function 
> > > > re-usable.
> > > Yeah.  I wonder if this procedure is the right place though. We do not 
> > > actually see a bind in the code above.
> > Hmm, in the stack example we do see the point of invalidation (which 
> > results in an escape). That make things easier, checkers could even work 
> > that problem around if they wanted to. In the original example, however, 
> > there is no point of invalidation, the region we get is already escaped 
> > without seeing any bind later on. And there is no store into the escaped 
> > region, since `zx_channel_create` is evaluated conservatively, we just 
> > attach the state to the conjured symbol retroactively.
> > 
> > So at this point I wonder if the checker should ever track any symbols that 
> > are is bound to a non-stack region (even if we do not see the bind itself). 
> > This might circumvent most of our problems?
> > 
> > And for the stack example we can just make the escaped state explicit again 
> > and never transition from an escaped state to a non-escaped one. 
> > 
> > WDYT?
> Actually, thinking a bit more, this is more of a workaround. It does not 
> matter if the region is on the stack or on the heap. What does matter if we 
> did see the inception of the region (i.e.: the allocation) or not. If we 
> store a handle on the heap but we did see the it from the very beginning we 
> should still be able to track it properly.
> Hmm, in the stack example we do see the point of invalidation (which results 
> in an escape). That make things easier, checkers could even work that problem 
> around if they wanted to. In the original example, however, there is no point 
> of invalidation, the region we get is already escaped without seeing any bind 
> later on.

When in doubt, think what would have happened during concrete execution. There 
*is* a point for escape-on-bind in both cases *during* the evaluation of 
`zx_channel_create()`, we're just not invoking `checkPointerEscape` on this 
point, but we should be.

Indeed, the mental model of behavior of any function that returns a value 
through an out-parameter would be:

1. Come up with the value of the parameter.
2. Write it into the provided region.

Step 2 is basically a store bind and so it should be triggering escape-on-bind 
- either because we're writing into an unknown memory space, or because we're 
writing into a "pre-escaped local" (a notion i just came up with in order to 
explain my example with the local). However, because the call is evaluated 
conservatively, these two steps are merged together and happen simultaneously. 
To make things worse, the first checker callback in which the value becomes 
available (`checkPostCall`) is invoked *after* the actual step 2. This leads to 
the paradox that we are already past step 2 when the checker is just trying to 
evaluate step 1.

So, our options are either to model step 2 manually in the checker, or to teach 
`ExprEngine` to trigger `checkPointerEscape` immediately after `checkPostCall` 
with all out-parameter values as roots.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71041/new/

https://reviews.llvm.org/D71041



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D71041: [analyzer][discussion] Talk about escapes

Reply via email to