On 13/02/18 05:09, John Fastabend wrote:
> I have some code that is getting close to working on bounded loops. Here
> is how I think it should work, (with some annotations on what I have).
Thanks for this! For comparison, since my code is also 'getting close to
working' here's how my approach works.
First, we get rid of (most of) check_cfg(). Replace it with a function
that just puts the state-list marks on jump targets (including following-
insns of conditional jumps and calls), then does a quick check that every
insn following any JMP-class insn has a state-list mark (this suffices to
detect statically-unreachable code). 2*O(n) in insn_cnt, and much easier
to understand than check_cfg (I only recently figured out that it's also
a topological sort algorithm).
Now of course at this point there is nothing to prevent loops at all, so in
is_state_visited we now add a check where we search backwards through the
parentage chain for a state with the same insn_idx as us. (We have to
change a couple of things for this to work: store the insn_idx of the
current frame instead of the callsite (insn_idx of the calling frame), and
thanks to function calls it turns out that the chain we want doesn't match
that for any of the registers. So I gave registers their own parentage
chains (that's a nice simplification and worth doing in itself) and added
a new chain to func_states that does what I want.) If we find such a state,
we have looped, so we reject the prog.
So far the diffstat has 159 insertions, 353 deletions. The downside of
course is that walking the parentage chain all the time to look for loops
is going to hurt performance. I haven't yet come up with a way around that
but some time spent meditating on data structures might turn something up ;)
Next we want to start allowing the loop as long as it's bounded. So when we
detect the loop in is_state_visited, we look at the last jump we did. (With
a slight tweak we can make this always available in prev_insn_idx, even if
we elided the push/pop_stack in check_cond_jmp_op.) We then look at the
jump insn, the old state (conveniently returned from our loop detection
routine!) and the new state, to decide whether the loop is moving towards
termination. The test I'm using for now shares some "it's just one case"
limits as your point 3 :-) For now it's:
a) Is the jump a JLT with a constant? (Same reason as yours.)
b) Are we in the 'jump true' branch? (I haven't quite figured out how to
make sure of this in general, checking insn_idx < prev_insn_idx works
but rules out back-edges with forward jumps. Fortunately those only
occur in unnatural loops so we don't have to support them all.)
c) Does the jump reg hold a SCALAR_VALUE in both old and new states?
d) Did the jump reg->umin_value increase from old to new states?
e) How _fast_ did it increase? Rule for now is to take the delta, multiply
it by 16, and if that exceeds the jump compare constant, we're happy.
This doesn't actually limit loops to 16 times, because the delta could
decay exponentially - the worst case comes out as 687 iterations. But
we could do something cleverer with recording the delta so we can insist
on linear induction variables (i = i + c) so that we can give those more
iterations without giving too many to the worst case exponential.
If all of these checks pass, then the loop is (so far) behaving itself. So
we mark the old state with a "this is being looped on" flag (which prevents
it from being used for pruning. I haven't yet come up with a way to clear
the flag when all the state's descendants have reached an exit, so this
will unfortunately reduce pruning effectiveness) and then we carry on the
walk. do_check() will ultimately walk around the loop as many times as the
Of course, there's one problem here: when do_check walks the loop for the
last time, and knows that the loop test has to be false... it will still
follow the branch anyway and keep walking forever. So I also had to add a
patch to not follow 'impossible' branches, determined by the jump register
getting inconsistent bounds. This is similar to what check_cond_jmp_op()
already does with JEQ/JNE of const reg against const imm.
> The worst case loop count would
> be, `c - min_value` then `(c - min_value) * max(loop_cfg(i))` would be the
> the worst case instruction that might be executed. Where loop_cfg(i) is
> max insns that can be run through the loop with worst case CFG path.
Currently I don't have any equivalent of that loop_cfg(i) factor, so I'm
applying the same max iteration count to all loops regardless of the size of
the loop body. But I guess I could do something like counting insns since
parent state and recording that in the explored_states, so that when I have
my detected loop I can count how long it is.
Anyway, that's my design, I hope it wasn't _too_ horrifying, and I hope to
have some RFC patches in the next week or so.