WRT to PIL and compilation and all that, I think it's time to think about how the linker might look.
As I see it, the compilation chain, with the user typing this at the prompt:

    perl6 foo.pl

perl6 is a compiled perl 6 script that takes an input file, compiles it, and then passes the compiled unit to the default runtime (e.g. parrot).

perl6 creates a new instance of the perl compiler (presumably an object). The compiler will only compile the actual file 'foo.pl', and disregard any 'require', 'use', or 'eval' statements. The compiler produces an object representing a linkable unit, which will be discussed later.

At this point the runtime kicks in. The runtime really runs compiled byte code for the runtime linker, which takes the compiled unit that the compiler emitted and prepares it for execution.

The runtime linker checks whether any inclusions of outside code have been made, and if so, invokes a search routine with the foreign module plugin responsible. For example, 'use python:Numerical' will use the python module plugin to produce a linkable unit. A given module plugin will normally traverse a search path, find some source code, check whether there is a valid cached version of that source code, and if needed, recompile the source code into another linkable unit.

As the linker gets new linkable units it checks whether they have any dependencies of their own, and eventually resolves all the data and code that modules take from one another. The resulting mess has only one requirement: that it can be run by the runtime - that is, byte code can be extracted out of it.

If the modules expose more than just byte code with resolved dependencies - for example type annotations, serialized PIL, serialized perl 6 code, and so forth - the linker may, at this point, do any amount of static analysis as it pleases: recompiling, relinking, optimizing, inlining, performing early resolution (especially of MMD), and otherwise modifying code (provided it was asked to do this by the user). The optimization premise is: by the time it's linked it probably won't change too much.
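To make the link step above concrete, here is a rough sketch (in Python, since the linker itself could be written in anything; all names here are hypothetical, not a proposed API) of the runtime linker's resolution loop - take the unit the compiler emitted, and keep pulling in linkable units for outside inclusions until every dependency is resolved:

```python
# Hypothetical sketch of the runtime linker's resolution loop.

class LinkableUnit:
    def __init__(self, name, dependencies, bytecode):
        self.name = name
        self.dependencies = dependencies  # names of required units
        self.bytecode = bytecode

def load_unit(name):
    # Stand-in for the search routine / foreign module plugins; a real
    # linker would traverse a search path, check caches, recompile, etc.
    return LinkableUnit(name, [], "<bytecode for %s>" % name)

def link(root):
    """Resolve all dependencies, returning every reachable unit by name."""
    resolved = {root.name: root}
    pending = list(root.dependencies)
    while pending:
        name = pending.pop()
        if name in resolved:
            continue
        unit = load_unit(name)
        resolved[unit.name] = unit
        # new units may have dependencies of their own
        pending.extend(unit.dependencies)
    return resolved

foo = LinkableUnit("foo.pl", ["python:Numerical"], "<bytecode for foo.pl>")
units = link(foo)
print(sorted(units))  # ['foo.pl', 'python:Numerical']
```

The interesting part is that load_unit is pluggable - that is where 'use python:Numerical' would hand off to the python module plugin instead of the perl 6 one.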
Link time is a magic time for resolving calls, inlining values, folding newly discovered constants, and so forth. Furthermore, a linker may cache the link between two modules, regardless of the calling script, so that optimization does not have to be repeated. The result is still the same: code that can be executed by the runtime. It just might be more efficient. The linker must also always be able to get the original version of the linked byte code back, either by reversing some changes, or by keeping the original.

At this point the runtime's runloop kicks in, starting at the start point in the byte code, and doing its thing.

Runtime loading of code (e.g. eval 'sub foo { }') simply reiterates the above procedure: 'sub foo { }' is compiled by the compiler, creating a linkable unit (that can give 'sub foo { }' to the world). The runtime linker gets a fault saying "byte code state is changing, $compiled_code_from_eval is being amended to $linked_byte_code_in_runtime_loop". The linker must then link the running code to the result of eval. To do this it may need to undo optimizations of its own that assumed there was no sub foo. For example, if there is a call to 'foo()' somewhere in foo.pl, the linker may have just inlined a 'die "no such sub foo()"' error instead of the call. Another linker may have put in code to do a runtime search for the '&foo' symbol and apply it. The linker that did a runtime search that may fail doesn't need to change anything, but the linker which inlined a fatal error must undo that optimization now that things have changed.

The behavior of the linker WRT such things depends on the deployment setting. In a long running mod_perl application there may even be a linker that optimizes code as time goes by, slowly changing things to be more and more static. As the process progresses through time, the probability of new code being introduced is lower, so the CPU time is invested better.
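The undoable-optimization idea can be sketched like so (again in Python, all names invented for illustration): the linker inlines a fatal error for an unresolvable call, but keeps the original instruction in an undo log so it can restore it when eval later defines the sub:

```python
# Hypothetical sketch: a linker whose optimizations are reversible.

class Linker:
    def __init__(self, code, symbols):
        self.code = code          # list of instructions (the "byte code")
        self.symbols = symbols    # set of known sub names
        self.undo_log = {}        # (call site, name) -> original instruction

    def optimize_call(self, site, name):
        if name not in self.symbols:
            # optimize: inline a fatal error instead of the unresolvable
            # call, remembering what used to be there
            self.undo_log[(site, name)] = self.code[site]
            self.code[site] = 'die "no such sub %s()"' % name

    def define(self, name):
        # eval introduced a new sub: byte code state is changing, so undo
        # every optimization that assumed this sub was missing
        self.symbols.add(name)
        for (site, sym) in list(self.undo_log):
            if sym == name:
                self.code[site] = self.undo_log.pop((site, sym))

code = ['call foo']
linker = Linker(code, symbols=set())
linker.optimize_call(0, "foo")
print(code[0])        # die "no such sub foo()"
linker.define("foo")  # eval 'sub foo { }' happened
print(code[0])        # call foo
```

A linker that instead emitted a runtime symbol search at the call site would have an empty undo log here - which is exactly the trade-off between the two linkers described above.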
Furthermore, latency is not hindered, and startup is fast, because the linker doesn't do any optimizations in the beginning. This is part of the proposed optimizer chain, as brought up on p6l a month or so ago.

Anyway, back to runtime linking. Once the code is consistent again, e.g. calls to foo() will now work as expected, eval gets the compiled code, and runs it. It just happens that 'sub foo { }' has no runtime effects, so eval returns, and normal execution is resumed.

To get the semantics of 'perl -c' you force the linker to resolve everything, but don't actually go to the runloop.

Linkable units are first class objects, and may be of different classes. This has merits when, for example, a linkable unit is implemented by an FFI wrapper. The FFI wrapper determines at link time what the foreign interface looks like, and then behaves like the linkable unit one might expect if it were a native interface. It can generate bytecode to call the foreign functions on demand at link time. This should simplify the link process.

Linkable units have an implementation class that determines their behavior for producing byte code. Linkable unit classes do any number of roles the implementor chose to add. The linker searches for roles it is interested in. For example, LinkableUnit::WithPIL is the interface that linkable units that have PIL code expose.

Furthermore, different types of bytecode formats are also roles. For example, here is a wrapper linkable unit that exposes PBC for various versions of parrot PIR linkable units:

    class LinkableUnit::PBC {
        has LinkableUnit $.linkable_unit handles <*>;

        submethod BUILD ($.linkable_unit) {
            die "i wrap around PIR linkables"
                unless $.linkable_unit ~~ LinkableUnit::Emits::PIR;

            given $.linkable_unit.pir_version {
                when ... {
                    # use correct version of parrot to compile PIR
                    # to bytecode
                }
            }
        }
    }

    # when the Perl6::Compiler emits PIR
    LinkableUnit::PBC.new(
        :linkable_unit(Perl6::Compiler.new(:string<sub foo { }>))
    );

This approach should encourage each runtime to have a tightly coupled linker that looks for a specific bytecode role, but allow this linker to share the maximum amount of code with a linker for another system.

Furthermore, link level translations of interfaces with compatible bytecode underneath - for example runtime loading of x86/ELF vs. x86/Mach-O - can be implemented on top of the same runtime engine for x86 machine code, with possible reuse for the two different binary formats. Furthermore, ELF and Mach-O for other archs can be reused. The way this is done is that the link format LinkableUnit classes expose a consistent symbol interface, and the x86<->native runtime translation link layer wraps over that. Either of the two (the native code translator and the link format reader) can be exchanged.

Some possible linkable unit roles:

* Embedded source (knows to map bytecode back to source code)
* Source reference annotations (line numbers and such, but no complete code)
* Emit::PIL (PIL version of the code can be extracted)
* Emit::PBC (parrot byte code version of the code can be extracted)
* Emit::...
* Rich type/value annotations (full partially resolved type/value inferencing trees for symbols, including constantness, return values, and so forth)

PHEW, THAT WAS LONG. Sorry!

-- 
 ()  Yuval Kogman <[EMAIL PROTECTED]> 0xEBD27418  perl hacker &
 /\  kung foo master: /me sushi-spin-kicks : neeyah!!!!!!!!!!!!!!!!!!!!