Hi Marcus,

On Fri, Jan 18, 2019 at 5:42 AM Marcus Denker <[email protected]> wrote:
> > On 18 Jan 2019, at 14:26, ducasse <[email protected]> wrote:
> >
> > I simply love the dynamic rewriting, this is just too cool. We should
> > systematically use it.
> > I will continue to use it in any deprecation.
>
> On my TODO is to make it stand-alone and provide it as a
> “compatibility transform”, too.
>
> So we can add it to methods that we want to keep for compatibility, but
> they will nevertheless transform the code automatically.
> (This might then be disabled in production so that it does not
> transform.)
>
> > Now I have a simple question (you can explain it to me over lunch one
> > of these days).
> >
> > I do not get why RBAST would not be a good representation for the
> > compiler?
> > I would like to know what is the difference.
>
> I think it is a good one. I have not yet seen a reason why not. But
> remember, Roel left Squeak because his visitor pattern for the compiler
> was rejected as a dumb idea… so there are definitely different views on
> core questions.
>
> E.g. the RB AST is annotated and the whole thing for sure uses a bit
> more memory than the compiler designed for a machine from 1978.
>
> > You mean that before going from BC to AST was difficult?
>
> You need to do the mapping somehow: the compiler needs to remember the
> BC offset in the code generation phase, and the AST (somehow) needs to
> store that information (either in every node or in some table).
>
> > How does Opal perform it? It does not use the source of the method to
> > recreate the AST, but it can do it from the BC?
>
> It uses the IR (which I am still not 100% sure about). It came from the
> old “ClosureCompiler” design and it turned out to be quite useful, for
> example for the mapping: every IR node retains the offset of the BC it
> creates, then the IR nodes retain the AST node that created them.
>
> -> so we just do a query: “IRMethod, give me the IRInstruction that
> created BC offset X”, then “IR, which AST node did create you?”, then
> the AST node: “what is your highlight interval in the source?”
>
> The devil is in the detail, as one IR can produce multiple byte code
> offsets (and byte codes) and one byte code might be created by two IR
> nodes, but it does seem to work with some tricks.
> Which I want to remove by improving the mapping and even the IR more…
> there is even the question: do we need the IR? Could we not do it
> simpler?
>
> The IR was quite nice back when we tried to do things with byte code
> manipulation (Bytesurgeon), now it feels a bit of an overkill. But it
> simplifies e.g. the bc mapping.

I find Bytesurgeon's functionality, specifically a bytecode
dis/assembler, very useful, but I wouldn't use it for the back end of
the bytecode compiler. There it adds overhead that has no benefit. But I
use my version, MethodMassage, for lots of things:

- transporting compiled methods from one dialect to another, e.g. to do
  in-image JIT compilation of a method from Pharo in Squeak
- generating JIT test cases
- generating methods that can't be generated from Smalltalk source, e.g.
  an accessor for a JavaScript implementation above Smalltalk where inst
  var 0 is the prototype slot and nil is unbound. In a loop one wants to
  fetch the Nth inst var from a temporary initialized to self and, if
  the value is non-nil, return it, otherwise set the temporary to the
  prototype slot, hence walking up the prototype chain until an
  initialized inst var is found.

I based mine around the messages that InstructionStream sends to the
client in the interpretFooInstructionFor: methods; a client that catches
doesNotUnderstand: then forms the basis of the disassembler. Simple and
light-weight.

> Marcus

_,,,^..^,,,_
best, Eliot
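
The three-step query quoted above (bytecode offset -> IR instruction ->
AST node -> highlight interval) could be sketched roughly as follows.
This is an illustration in Python, not Opal's actual code: the class
names AstNode, IrInstruction and IrMethod, their fields, and the
offsets in the example are all made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AstNode:
    # Hypothetical: half-open (start, stop) character range in the source.
    source_interval: Tuple[int, int]


@dataclass
class IrInstruction:
    # Each IR instruction remembers the AST node that created it ...
    ast_node: AstNode
    # ... and every bytecode offset it emitted (one IR may emit several).
    bytecode_offsets: List[int] = field(default_factory=list)


@dataclass
class IrMethod:
    instructions: List[IrInstruction]

    def instruction_for_offset(self, offset: int) -> Optional[IrInstruction]:
        # "IRMethod, give me the IRInstruction that created BC offset X."
        # If two IR instructions claim the same offset, the first one wins;
        # a real mapping would need a deliberate rule for that case.
        for instruction in self.instructions:
            if offset in instruction.bytecode_offsets:
                return instruction
        return None


def highlight_interval(method: IrMethod, offset: int) -> Optional[Tuple[int, int]]:
    """Map a bytecode offset back to a source range via IR and AST."""
    instruction = method.instruction_for_offset(offset)
    if instruction is None:
        return None
    return instruction.ast_node.source_interval


# Made-up example for the source text "x := 3 + 4".
source = "x := 3 + 4"
three = AstNode((5, 6))
four = AstNode((9, 10))
send = AstNode((5, 10))
assign = AstNode((0, 10))
method = IrMethod([
    IrInstruction(three, [0]),
    IrInstruction(four, [1]),
    IrInstruction(send, [2]),
    IrInstruction(assign, [3, 4]),  # one IR instruction, two bytecodes
])

start, stop = highlight_interval(method, 2)
highlighted = source[start:stop]
```

Here the lookup is a linear scan, which is enough to show the idea; a
table keyed by offset would be the obvious alternative if lookups were
frequent.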

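
The disassembler idea described above (a decoder sends one message per
instruction to a client, and a client that catches unknown messages
simply records them) could be sketched like this. It is a Python
analogue, with __getattr__ standing in for doesNotUnderstand:. The
three-opcode bytecode set and all class and method names here are
invented for the sketch and are not the Squeak/Pharo bytecode set or
the MethodMassage API.

```python
class ToyInstructionStream:
    """Decodes a made-up bytecode set, sending one message per instruction."""

    def __init__(self, code: bytes):
        self.code = code
        self.pc = 0

    def at_end(self) -> bool:
        return self.pc >= len(self.code)

    def interpret_next_instruction_for(self, client):
        opcode = self.code[self.pc]
        self.pc += 1
        if opcode == 0x00:
            return client.push_receiver()
        if opcode == 0x01:
            # This toy opcode carries a one-byte operand.
            index = self.code[self.pc]
            self.pc += 1
            return client.push_receiver_variable(index)
        if opcode == 0x02:
            return client.method_return_top()
        raise ValueError(f"unknown opcode {opcode:#x}")


class RecordingClient:
    """Implements no instruction messages; records whatever it is sent."""

    def __init__(self):
        self.instructions = []

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, which makes it
        # a rough Python counterpart of doesNotUnderstand:.
        if name.startswith("__"):
            raise AttributeError(name)

        def record(*args):
            self.instructions.append((name, args))

        return record


def disassemble(code: bytes):
    stream = ToyInstructionStream(code)
    client = RecordingClient()
    while not stream.at_end():
        stream.interpret_next_instruction_for(client)
    return client.instructions


listing = disassemble(bytes([0x01, 3, 0x00, 0x02]))
```

The decoder stays the single place that knows the encoding, and the
recording client needs no per-instruction code at all, which is what
keeps the approach light-weight.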