On May 5, 2014, at 5:55 , Manuel Klimek <[email protected]> wrote:
> On Thu, May 1, 2014 at 6:27 PM, Jordan Rose <[email protected]> wrote:
>
> On Apr 30, 2014, at 10:23 , Manuel Klimek <[email protected]> wrote:
>
>> On Wed, Apr 30, 2014 at 7:16 PM, Jordan Rose <[email protected]> wrote:
>>
>> On Apr 30, 2014, at 1:35 , Manuel Klimek <[email protected]> wrote:
>>
>>> On Wed, Apr 30, 2014 at 5:34 AM, Jordan Rose <[email protected]> wrote:
>>> What this is basically saying is that we will never be in a situation where
>>> the condition (or the rightmost branch in the condition) was not evaluated
>>> in the current CFG block. That seems a bit questionable in general but
>>> reasonable given the way the CFG is currently constructed.
>>>
>>> Well, all I've been trying to do here is to a) document and b) encode in
>>> the assert the actual current CFG invariant the analyzer relies on for
>>> correctness.
>>> The problem is that if what is returned from ResolveCondition is *not*
>>> evaluated in the block, we have no guarantee that it was evaluated at all -
>>> but the correctness of all the path analyses relies on it being in the SVal
>>> cache of the state. Otherwise we will assume branches are reachable that
>>> are clearly unreachable because of which parts of the logical operator tree
>>> we *assumed* to be true (so we know the condition must be statically
>>> knowable in the current state).
>>>
>>> I have a hard time putting that into words though, so any help for how to
>>> wordsmith this to be easier to understand would be highly appreciated...
>>
>> "in the SVal cache of the state" is "in the Environment" using analyzer
>> terminology (it's not really a cache, since it's not reproducible)
>>
>> exactly :) I learned that today while doing more digging through the code...
>>
>> , but since things regularly get cleaned out of the Environment there aren't
>> that many cases where a condition would be present but wouldn't have been
>> evaluated in the previous block.
>>
>> I don't yet understand how exactly things get cleaned out of the environment
>> (other than losing scope).
>> It seems to me like a lot of the warnings rely on branch conditions always
>> being in the environment at the point of the terminator, otherwise it's easy
>> to produce false positives in unreachable code paths.
>
> The presence of bindings in the Environment is controlled by the
> LiveVariables analysis. Essentially, an expression is live from the point of
> its evaluation to the point it is consumed.
>
> Can you elaborate on what "consumed" means in this context?
"conusmed" = "used to evaluate another statement". So in "foo(a(), b())", the
expressions 'a()' and 'b()' (and 'foo') are consumed by the CallExpr, because
their values must be used to evaluate the next statement. Until temporary
destructors, an expression was never consumed more than once within a CFG.
(It's close to "is a child of", but not quite the same. The LHS of the comma
operator is not consumed by the full BinaryOperator expression, because the
LHS's value is not needed to get the final result.)
>
> Unfortunately, it can be a little more complicated than that
> sometimes...especially with LiveVariables being a flow-sensitive analysis but
> not a path-sensitive one.
>
> What is the difference of flow vs. path sensitive analysis here?
>
> (This is one of the reasons for the liveness bug that Pavel uncovered.)
>
> I'm not sure yet where there was a liveness bug - every instance of bug I've
> seen was due to expression not having yet been evaluated (due to the logical
> operator optimization that ResolveCondition relies on). I've not seen a case
> yet where an expression was removed from the Environment after it had been
> in. Do you have an example for that?
A flow-sensitive algorithm is basically a graph search through the CFG: all
possible paths through the function assuming that all branches are feasible.
(Or nearly all. Our CFG builder prunes out trivially unreachable branches, like
"if (false) { ... }" or "do { ... } while (false);".) A path-sensitive
algorithm, on the other hand, attempts to only consider valid paths through the
CFG; if you see the same condition repeated twice, the results will be
consistent.
The issue in PR18159 illustrates this difference for liveness. Consider the
expression "A || (B || C)". A fairly simple liveness algorithm would say that
if you're trying to get the value of the outer || expression, both "A" and "B
|| C" need to be live. But that's not actually true at runtime—if "A" is always
true, "B || C" will never get evaluated. So saying that the value for "B || C"
should be considered live is a valid flow-sensitive statement (because there
exists a path where it is needed), but not a path-sensitive one (because it
doesn't tell you on which paths it is needed, and it isn't "all of them").
That's not exactly a bug; it's just more conservative than we need to be.
I'm not going to explain this next bit well because it's been a long time since
we investigated this, but the (current) temporary destructors branches attempt
to keep "A", "B", and "C" live a bit longer than they otherwise would be
(because we try to evaluate the "A || B || ..." branch more than once). When
Pavel put this in, though, we eventually found that on paths where A was false,
the value for "B" and "C" would stick around too long—all the way to the next
iteration of a loop. Then, even when A was true, we'd look at the value for "B
|| C", which would of course be the value from last time through the loop.
Somewhere in there is a bug—does it make sense to use flow-sensitive liveness
analysis for a path-sensitive algorithm? Should we have made sure to
recursively kill (mark non-live) the leaf descendents of binary operators,
which are treated specially, after the top-level operator has been evaluated?
Should we kill everything that is syntactically within a loop (not good enough;
doesn't handle goto)?
Or, should we not try to change how BinaryOperators are evaluated (i.e. revert
Pavel's change), and continue on as before, and come up with another solution?
That's what we ended up with.
For the actual question, "when are things removed from the Environment", it
happens pretty consistently: roughly at the next full-statement, or at the end
of a function (inlined or top-level). Things don't just stick around. But it
shouldn't ever hit the cases you care about, since those are within a single
expression and everything important has been marked live.
Anyway, I hope that helps somewhat.
Jordan
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits