On Mon, May 5, 2014 at 7:38 PM, Jordan Rose <[email protected]> wrote:
> > On May 5, 2014, at 5:55 , Manuel Klimek <[email protected]> wrote: > > On Thu, May 1, 2014 at 6:27 PM, Jordan Rose <[email protected]> wrote: > >> >> On Apr 30, 2014, at 10:23 , Manuel Klimek <[email protected]> wrote: >> >> On Wed, Apr 30, 2014 at 7:16 PM, Jordan Rose <[email protected]>wrote: >> >>> >>> On Apr 30, 2014, at 1:35 , Manuel Klimek <[email protected]> wrote: >>> >>> On Wed, Apr 30, 2014 at 5:34 AM, Jordan Rose <[email protected]>wrote: >>> >>>> What this is basically saying is that we will never be in a situation >>>> where the condition (or the rightmost branch in the condition) was not >>>> evaluated in the current CFG block. That seems a bit questionable in >>>> general but reasonable given the way the CFG is currently constructed. >>> >>> >>> Well, all I've been trying to do here is to a) document and b) encode in >>> the assert the actual current CFG invariant the analyzer relies on for >>> correctness. >>> The problem is that if what is returned from ResolveCondition is *not* >>> evaluated in the block, we have no guarantee that it was evaluated at all - >>> but the correctness of all the path analyses relies on it being in the SVal >>> cache of the state. Otherwise we will assume branches are reachable that >>> are clearly unreachable because of which parts of the logical operator tree >>> we *assumed* to be true (so we know the condition must be statically >>> knowable in the current state). >>> >>> I have a hard time putting that into words though, so any help for how >>> to wordsmith this to be easier to understand would be highly appreciated... >>> >>> >>> "in the SVal cache of the state" is "in the Environment" using analyzer >>> terminology (it's not really a cache, since it's not reproducible) >>> >> >> exactly :) I learned that today while doing more digging through the >> code... >> >> >>> , but since things regularly get cleaned *out* of the Environment there >>> aren't that many cases where a condition would be present but wouldn't have >>> been evaluated in the previous block. >>> >> >> I don't yet understand how exactly things get cleaned out of the >> environment (other than losing scope). >> It seems to me like a lot of the warnings rely on branch conditions >> always being in the environment at the point of the terminator, otherwise >> it's easy to produce false positives in unreachable code paths. >> >> >> The presence of bindings in the Environment is controlled by the >> LiveVariables analysis. Essentially, an expression is live from the point >> of its evaluation to the point it is consumed. >> > > Can you elaborate on what "consumed" means in this context? > > > "conusmed" = "used to evaluate another statement". So in "foo(a(), b())", > the expressions 'a()' and 'b()' (and 'foo') are consumed by the CallExpr, > because their values must be used to evaluate the next statement. Until > temporary destructors, an expression was never consumed more than once > within a CFG. (It's close to "is a child of", but not quite the same. The > LHS of the comma operator is not consumed by the full BinaryOperator > expression, because the LHS's value is not needed to get the final result.) > That definition of consumed makes sense, but seems to contradict the last paragraph of your email, where you say: """For the actual question, "when are things removed from the Environment", it happens pretty consistently: roughly at the next full-statement, or at the end of a function (inlined or top-level). Things don't just stick around. But it shouldn't ever hit the cases you care about, since those are within a single expression and everything important has been marked live.""" That sounds like the point where something is consumed and where it is removed from the Environment are not the same? /me is thoroughly confused now :) > Unfortunately, it can be a little more complicated than that >> sometimes...especially with LiveVariables being a flow-sensitive analysis >> but not a path-sensitive one. >> > > What is the difference of flow vs. path sensitive analysis here? > > >> (This is one of the reasons for the liveness bug that Pavel uncovered.) >> > > I'm not sure yet where there was a liveness bug - every instance of bug > I've seen was due to expression not having yet been evaluated (due to the > logical operator optimization that ResolveCondition relies on). I've not > seen a case yet where an expression was removed from the Environment after > it had been in. Do you have an example for that? > > > A flow-sensitive algorithm is basically a graph search through the CFG: > all possible paths through the function assuming that all branches are > feasible. (Or nearly all. Our CFG builder prunes out trivially unreachable > branches, like "if (false) { ... }" or "do { ... } while (false);".) A > path-sensitive algorithm, on the other hand, attempts to only consider > valid paths through the CFG; if you see the same condition repeated twice, > the results will be consistent. > That makes sense. The issue in PR18159 <http://llvm.org/bugs/show_bug.cgi?id=18159> > Is that bug with Pavel's patch or without? > illustrates this difference for liveness. Consider the expression "A || (B > || C)". A fairly simple liveness algorithm would say that if you're trying > to get the value of the outer || expression, both "A" and "B || C" need to > be live. But that's not actually true at runtime—if "A" is always true, "B > || C" will never get evaluated. So saying that the value for "B || C" > should be considered live is a valid flow-sensitive statement (because > there exists a path where it is needed), but not a path-sensitive one > (because it doesn't tell you on *which* paths it is needed, and it isn't > "all of them"). That's not exactly a bug; it's just more conservative than > we need to be. > Makes sense. > I'm not going to explain this next bit well because it's been a long time > since we investigated this, but the (current) temporary destructors > branches attempt to keep "A", "B", and "C" live a bit longer than they > otherwise would be (because we try to evaluate the "A || B || ..." branch > more than once). When Pavel put this in, though, we eventually found that > on paths where A was false, the value for "B" and "C" would stick around > *too* long—all the way to the next iteration of a loop. Then, *even when > A was true,* we'd look at the value for "B || C", which would of course > be the value from last time through the loop. > > Somewhere in there is a bug—does it make sense to use flow-sensitive > liveness analysis for a path-sensitive algorithm? Should we have made sure > to *recursively* kill (mark non-live) the leaf descendents of binary > operators, which are treated specially, after the top-level operator has > been evaluated? Should we kill everything that is syntactically within a > loop (not good enough; doesn't handle goto)? > > Or, should we not try to change how BinaryOperators are evaluated (i.e. > revert Pavel's change), and continue on as before, and come up with another > solution? That's what we ended up with. > So, as I mentioned, I'm currently running all my tests without Pavel's patch, and have yet to find an example where the temporary destructor handling is incorrect because the variables are not live long enough... /me craves an example here :) > For the actual question, "when are things removed from the Environment", > it happens pretty consistently: roughly at the next full-statement, or at > the end of a function (inlined or top-level). Things don't just stick > around. But it shouldn't ever hit the cases *you* care about, since those > are within a single expression and everything important has been marked > live. > See above for why this statement confuses me. Thank you so much for answering all those crazy questions! :) Cheers, /Manuel > > Anyway, I hope that helps somewhat. > Jordan > >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
