On 5/30/19 10:15 AM, Jeff Law wrote:
On 5/30/19 9:34 AM, Martin Sebor wrote:

If the alias oracle can be used to give the same results without
excessive false positives then I think it would be fine to make
use of it.  Is that something you consider a prerequisite for
this change or should I look into it as a followup?
I think we should explore it a bit before making a final decision.  It
may guide us for other work in this space (like detecting escaping
locals).   I think a dirty prototype to see if it's even in the right
ballpark would make sense.

Okay, let me look into it.
Sounds good.  Again, go with a quick prototype to see if it's likely
feasible.  The tests you've added should dramatically help evaluating if
the oracle is up to the task.

So to expand on what I said on the phone when we spoke: the problem
I quickly ran into with the prototype is that I wasn't able to find
a way to identify pointers to alloca/VLA storage.

In the the points-to solution for the pointer being returned they
both have the vars_contains_escaped_heap flag set.  That seems like
an omission that shouldn't be hard to fix, but on its own, I don't
think it would be sufficient.

In the IL a VLA is represented as a pointer to an array, but when
returning a pointer into a VLA (at some offset so it's an SSA_NAME),
the pointer's point-to solution doesn't include the VLA pointer or
(AFAICS) make it possible to tell even that it is a VLA.  For example
here:

  f (int n)
  {
    int * p;
    int[0:D.1912] * a.1;
    sizetype _1;
    void * saved_stack.3_3;
    sizetype _6;

    <bb 2> [local count: 1073741824]:
    saved_stack.3_3 = __builtin_stack_save ();
    _1 = (sizetype) n_2(D);
    _6 = _1 * 4;
    a.1_8 = __builtin_alloca_with_align (_6, 32);
    p_9 = a.1_8 + _6;
    __builtin_stack_restore (saved_stack.3_3);
    return p_9;
  }

p_9's solution's is:

  p_9, points-to vars: { D.1925 } (escaped, escaped heap)

I couldn't find out how to determine that D.1925 is a VLA (or even
what it is).  It's not among the function's local variables that
FOR_EACH_LOCAL_DECL iterates over.

By replacing the VLA in the test case with a call to malloc I get
this for the returned p_7:

  p_7, points-to NULL, points-to vars: { D.1916 } (escaped, escaped heap)

Again, I see no way to quickly tell that this pointer points into
the block returned from malloc.

I'm sure Richard will correct me if there is a simple way to do it
but I couldn't find one.

If there is a way to identify that the returned pointer is for
a VLA (or alloca), for parity with the current patch we also
need to find the location of the VLA declaration or the alloca
call so that the warning could point to it.  Unless there's
a straight path from the mysterious D.1925 above to that
decl/call statement, finding it would likely require some
sor of traversal.  At that point I wouldn't be surprised if
the complexity of the solution didn't approach or even exceed
the current approach.

Martin

FWIW, I'm working on enhancing this to detect returning freed
pointers (under a different option).  That seems like a good
opportunity to also look into making use of the alias oracle.
Yes.  I think there's two interesting cases here to ponder.  If we free
a pointer that must point to a local, then we can warn & trap.  If we
free a pointer that may point to a local, then we can only warn (unless
we can isolate the path).

I wasn't actually thinking of freeing locals but it sounds like
a useful enhancement as well.  Thanks for the suggestion! :)

To be clear, what I'm working on is detecting:

   void* f (void *p)
   {
     free (p);   // note: pointer was freed here
     // ...
     return p;   // warning: returning a freed pointer
   }
Ah, we were talking about different things. Though what you're doing
might be better modeled in a true global static analyzer as a
use-after-free problem.  My sense is that translation-unit local version
of that problem really isn't that useful in practice.  THough I guess
there isn't anything bad with having a TU local version.



Besides these enhancements, there's also a request to diagnose
dereferencing pointers to compound literals whose lifetime has
ended (PR 89990), or more generally, those to any such local
object.  It's about preventing essentially the same problem
(accessing bad data or corrupting others) but it seems that
both the analysis and the handling will be sufficiently
different to consider implementing it somewhere else.  What
are your thoughts on that?
I think the tough problem here is we lose binding scopes as we lower
into gimple, so solving it in the general case may be tough.  We've
started adding some clobbers into the IL to denote object death points,
but I'm not sure if they're sufficient to implement this kind of warning.

I was afraid that might be a problem.
Way back in the early days of tree-ssa we kept the binding scopes.  But
that proved problematical in various ways.  Mostly they just got in the
way of analysis an optimization and we spent far too much time in the
optimizers working around them or removing those which were empty.

They'd be helpful in this kind of analysis, stack slot sharing vs the
inliner and a couple other problems.  I don't know if the pendulum has
moved far enough to revisit :-)

jeff


Reply via email to