Hi all,

Here's an idea that came up today on irc (thanks the_drow_) while discussing, once again, saving the generated assembler into files and reloading it on the next run. As you know, we categorize this idea as "can never work", but now I think that there is a variant that could work, unless I'm missing something.
What *cannot* work is storing in the files any information about Python-level data, like "this piece of assembler assumes that there is a module X with a class Y with a method Z". I'm not going to repeat again the multiple problems with that (see http://pypy.readthedocs.org/en/latest/faq.html#couldn-t-the-jit-dump-and-reload-already-compiled-machine-code). However, it might be possible to save enough lower-level information to avoid the problem completely.

The idea would be that when we're about to enter tracing mode and there is a saved file, we use a "fast-path" (rough sketches of the steps follow below):

* we find a *likely* candidate from the saved file, based on some explicitly-provided way to hash the Python code objects, for example (it doesn't have to be a guaranteed match)

* this likely saved trace comes with a recording of all the *green* guards that we originally did (both promotions and things that just depend on purely green inputs). (This means extra work for making sure we save this information in the first place.)

* we run it like in the blackhole interpreter, but checking the result of the green guards against the recorded ones

* we also take all green values that we get this way and pass them as constants to the next step (see below, "**").

This means that we generalize (and lower-level-ize) the vague idea "a module X with a class Y with a method Z" to be instead a series of guards that need to give the same results as before. They would automatically check some subset of the new interpreter's state by comparing it against the old one's --- but only as much as the actual loop happens to need.

For example, if we had in the (old) normal trace a guard_value(p12, <constant pointer>), then of course it makes no sense to record the old interpreter's constant pointer, which will change. But it makes sense to record *what* was really deduced from this constant pointer, i.e. all the green getfields and getarrayitems we did. And for example, if it was a PyCode object, we would record the green switch that we did on the integer value that we got in the old interpreter (which is the next opcode), even though that's all green. That's the real condition: that we would follow the same path by constant-folding the decisions on the green variables.

So, to finish the new interpreter's reloading: if the checking done above passes, the next step is a fast-path through the assembler. We "just" need to reload the saved assembler as a sequence of bytes, and fix all constants there. To continue the example above, if a piece of assembler was generated from the instruction guard_value(p12, <constant pointer>), then the saved file must contain enough information so that we know we must replace this old constant pointer's value in the assembler with the new constant pointer's value recorded above (at "**").

Overall, this would result in a much faster warm-up: faster tracing, and no optimization nor regular assembler generation at all --- only a very quick assembler reloading and fixing step.

There are of course complications from the fact that we don't simply record loops, but bridges. They might be seen in a different order in the new process, so that when we are in the checking mode, we might run the start of the loop, but then jump into the bridge --- even though the loop was not fully seen so far. This is not impossible to implement by reloading the complete loop+bridge, but making the tail of the loop invalid until we really run into it (with an extra temporary guard).
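To make the lookup-and-check steps above concrete, here is a minimal Python sketch. Everything in it is hypothetical --- code_key(), RecordedGreen, find_candidate() and evaluate_green are made-up names to illustrate the idea, not actual PyPy APIs, and the real thing would of course live at the RPython level:

    # Hypothetical sketch -- none of these names are real PyPy APIs.
    import hashlib

    def code_key(code):
        # A *likely* (not guaranteed) identity for a code object: hash the
        # bytecode plus the reprs of the constants and names.
        h = hashlib.sha1()
        h.update(code.co_code)
        h.update(repr(code.co_consts).encode('utf-8'))
        h.update(repr(code.co_names).encode('utf-8'))
        return h.hexdigest()

    class RecordedGreen(object):
        # One recorded green operation.  For a pure decision (e.g. the
        # switch on the next opcode) we check that the new run gives the
        # same answer.  For a promoted runtime pointer (e.g. the PyCode
        # object itself) we do not compare the address: we only collect
        # the new value, to be patched into the assembler later ("**").
        def __init__(self, describe, expected=None, collect_only=False):
            self.describe = describe      # e.g. "getfield pycode.co_code[i]"
            self.expected = expected
            self.collect_only = collect_only

    def find_candidate(saved_traces, code):
        # Step 1: pick a *likely* candidate from the saved file.
        return saved_traces.get(code_key(code))

    def replay_green_guards(recorded, evaluate_green):
        # Steps 2-4: run in a blackhole-like checking mode.
        # 'evaluate_green' re-executes one recorded green operation in the
        # *new* process and returns its value.
        new_constants = {}
        for rec in recorded:
            value = evaluate_green(rec.describe)
            if rec.collect_only:
                new_constants[rec.describe] = value   # kept for step "**"
            elif value != rec.expected:
                return None   # different path: fall back to normal tracing
        return new_constants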
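And a similarly hypothetical sketch of the reload-and-fix step: the saved file would have to record, for every constant embedded in the machine code, the byte offsets where it sits and which recorded green value it corresponds to; fixing the assembler is then just overwriting those offsets with the new values collected above (made-up names and a made-up on-disk layout again):

    import struct

    def fix_constants(assembler_bytes, const_offsets, new_constants):
        # 'const_offsets' maps a green description (e.g. "promote p12") to
        # the byte offsets where its old value was embedded as an
        # immediate; 'new_constants' is what replay_green_guards() above
        # collected.  We assume 64-bit little-endian immediates here.
        code = bytearray(assembler_bytes)
        for describe, offsets in const_offsets.items():
            packed = struct.pack('<Q', new_constants[describe])
            for offset in offsets:
                code[offset:offset + 8] = packed
        return bytes(code)
        # The real thing would then copy this into executable memory and
        # register the loop with the usual JIT bookkeeping.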
And I'm sure that unrolling will somehow come with its share of funniness, as usual. However, does it sound reasonable at all, or am I missing something else?

A bientôt,

Armin.