On 7/16/06, Nick Sieger <[EMAIL PROTECTED]> wrote:
On 7/15/06, Charles O Nutter <[EMAIL PROTECTED]> wrote:
- Local vars in Ruby code compiled to Java *must* be maintained in a separate data structure.
- Currently, they are maintained in a Scope object, where there's one Scope per call Frame. This is a bit inefficient at the moment, requiring multiple structures when one would do.
- As a simple alternative, we might maintain local vars as an array per method invocation (which is basically the same thing we do in Scope, but lighter-weight and not bound to the existing runtime). If there are ten local vars in a method's lifetime, we have ten slots in that array. evals could cause new arrays to come into being, and we'd need to grow the array. However, we could easily pass a final reference to this array of values to an inner class "block" when passing it off. Later invocations could then directly access and modify those values.
Since you've already characterized Ruby classes as method bags, why not have each Ruby method correspond 1-1 to a Java class from the compiler's point of view? Local vars in the Ruby method would be stored in a table held as a field of the Java class. Creating a block would be a matter of compiling a new method class and giving it a reference to the containing method's local variable table.
To avoid threading issues you'd either have to create a new instance of the method class before invoking or maintain the local variable table as a thread local. Not sure which is more expensive. The instantiation approach seems more flexible, but given how much of Ruby is method invocation, maybe the latter would be preferable.
This exact option has been discussed (ok, Kelly and I discussed it). Each method becomes a class, instances of the class are call frames on the method, and locally-scoped variables and whatnot are stored in that instance. Blocks created can read and modify their instantiator's state since they can access their containing "method class". It also works for more complex scenarios: blocks within blocks have to be able to see their containing blocks' dynamic vars as well as the eventual local vars at the top of the pile, so blocks would be inner classes in "method classes", blocks within blocks would be inner classes within those "block classes", and so on. This approach would certainly work.
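To make the "method class" idea concrete, here is a minimal sketch in plain Java. The names (SumUpToMethod, AddBlock, invoke) and the Object[] layout are assumptions made up for illustration, not anything that exists in JRuby. It only shows a block, compiled as an inner class, reading and writing the locals of the method instance that created it.

```java
// Hypothetical sketch: one Java class per Ruby method, one instance per call.
// Roughly models: def sum_up_to(n); total = 0; 1.upto(n) { |i| total += i }; total; end
final class SumUpToMethod {
    // Local variable table for this invocation (slot 0 holds "total").
    private final Object[] locals = new Object[1];

    // A block compiled as an inner class; it can see the enclosing
    // instance's locals because inner classes hold a reference to it.
    final class AddBlock {
        void call(int i) {
            locals[0] = (Integer) locals[0] + i;
        }
    }

    int call(int n) {
        locals[0] = 0;
        AddBlock block = new AddBlock();
        for (int i = 1; i <= n; i++) {
            block.call(i); // stands in for yielding to the block
        }
        return (Integer) locals[0];
    }

    // A fresh instance per call keeps invocations (and threads) isolated,
    // at the cost of one allocation per method call.
    static int invoke(int n) {
        return new SumUpToMethod().call(n);
    }
}
```

Calling SumUpToMethod.invoke(4) allocates one "frame" object and one block object. That per-call allocation is the cost being weighed here.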
Unfortunately, I believe it's also pretty heavy. With all those method calls instantiating all those method classes, with their corresponding frame information, scope vars, passed-in blocks, and so on, we don't save a lot per-call even with Java compilation. That model is essentially what's done now, though a bit more inefficiently: a method call instantiates a frame and a scope and this and that; those are passed on to blocks instantiated in the same scope; and there's a nasty spaghetti nightmare mapping all those blocks and methods to all those frames and scopes.
So there's the latter option you mention above, which Kelly and I discussed a bit this past week. I put forward the idea that simulating a stack machine with single stacks-per-thread would allow mapping blocks, arguments, and local and dynamic variables in a much lighter-weight way. Where now we instantiate an explicit Frame instance for every method call, we would instead push relevant bits of the frame onto a true stack. Where now we instantiate a Scope per method call, we would just "allocate" a range out of the local var stack for use during the new call. The eventual model is that every method invocation has available to it a range of indices out of multiple per-thread stacks. Handling arguments, manipulating local variables, passing blocks and maintaining their context...all become a simple matter of specifying ranges on those stacks. And there's no more spaghetti; call frames become simple stack frames, and everyone just points at their stack range in memory.
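A rough sketch of what "allocating a range out of the local var stack" could look like, assuming a single growable Object[] per thread. LocalVarStack and all of its method names are hypothetical, invented for this example. A real version would need matching stacks for frames, blocks and dynamic vars.

```java
import java.util.Arrays;

// Hypothetical per-thread stack of local variable slots.
final class LocalVarStack {
    private static final ThreadLocal<LocalVarStack> CURRENT =
            ThreadLocal.withInitial(LocalVarStack::new);

    private Object[] slots = new Object[16];
    private int top = 0; // index of the first free slot

    static LocalVarStack current() {
        return CURRENT.get();
    }

    // "Allocate" a range for a new call; returns the base index of the range.
    int push(int size) {
        int base = top;
        if (base + size > slots.length) {
            // Grow only when the call stack deepens past current capacity.
            slots = Arrays.copyOf(slots, Math.max(slots.length * 2, base + size));
        }
        top = base + size;
        return base;
    }

    // Release the range when the call returns; null out slots so the
    // garbage collector can reclaim whatever the locals referenced.
    void pop(int base) {
        Arrays.fill(slots, base, top, null);
        top = base;
    }

    Object get(int base, int offset) {
        return slots[base + offset];
    }

    void set(int base, int offset, Object value) {
        slots[base + offset] = value;
    }

    int depth() {
        return top;
    }
}
```

A compiled method would call push with its local count on entry, address variables as (base, offset) pairs, and call pop on exit, so no per-call Scope object is created.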
This mimicking of a stack machine ends up simplifying a great many things. It's also far less object and memory intensive, since we will just expand those arrays as needed when the call stack deepens...and it will only deepen to a point. When we start to mix in things like tail-call recursion...well, it just seems to work.
So then the rough way this might work, under this model:
- Methods are compiled to minimize call overhead and the number of classes; rather than a class per method, there are stubbing and interface tricks that can be pulled
- Methods would ideally be direct calls
- Local variables, including arguments, would be maintained as frames on a thread-local stack; variable reads and writes would manipulate array offsets into that frame stack
- Blocks would hold references back to the frames on that stack, delegating appropriate variables to each level. A hierarchical data structure could be built to provide O(1) access times to all variables in a block's scope hierarchy
- Tail calls would simply reuse the top frame on the stack, expanding or shrinking it as necessary, setting the new call's args as appropriate and clearing the local var frame before execution
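The tail-call bullet above can be sketched like this. FrameStack, tailCall and sumTo are hypothetical names, and this is only one possible reading of "reuse the top frame": the frame is cleared, resized in place and given the new arguments, so the frame count never grows.

```java
import java.util.Arrays;

// Hypothetical frame stack: local variable slots plus the base index of each frame.
final class FrameStack {
    private Object[] slots = new Object[8];
    private int[] bases = new int[8];
    private int frames = 0;
    private int top = 0;

    // Normal call: push a new frame of `size` slots, args in the first slots.
    void pushFrame(int size, Object... args) {
        if (frames == bases.length) {
            bases = Arrays.copyOf(bases, frames * 2);
        }
        bases[frames++] = top;
        layout(top, size, args);
    }

    // Tail call: clear the top frame and reuse it, expanding or shrinking it.
    // args are already evaluated by the caller, so clearing first is safe.
    void tailCall(int size, Object... args) {
        int base = bases[frames - 1];
        Arrays.fill(slots, base, top, null);
        layout(base, size, args);
    }

    void popFrame() {
        int base = bases[--frames];
        Arrays.fill(slots, base, top, null);
        top = base;
    }

    private void layout(int base, int size, Object[] args) {
        if (args.length > size) {
            throw new IllegalArgumentException("more args than slots");
        }
        if (base + size > slots.length) {
            slots = Arrays.copyOf(slots, Math.max(slots.length * 2, base + size));
        }
        System.arraycopy(args, 0, slots, base, args.length);
        top = base + size;
    }

    Object local(int offset) {
        return slots[bases[frames - 1] + offset];
    }

    int frameCount() {
        return frames;
    }

    // A tail-recursive sum, sum(n, acc) = sum(n - 1, acc + n), run in one frame.
    static int sumTo(int n) {
        FrameStack st = new FrameStack();
        st.pushFrame(2, n, 0); // slot 0 = n, slot 1 = acc
        while ((Integer) st.local(0) > 0) {
            int k = (Integer) st.local(0);
            int acc = (Integer) st.local(1);
            st.tailCall(2, k - 1, acc + k);
        }
        int result = (Integer) st.local(1);
        st.popFrame();
        return result;
    }
}
```

In sumTo the stack depth stays at one frame no matter how large n is, which is the property the tail-call bullet is after.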
I can see how to make all of this work...it's mostly just a matter of time and patience refactoring the system to use completely new call/arg/var mechanisms without breaking it :) But hey, it oughta be a lot of fun, right?
--
Charles Oliver Nutter @ headius.blogspot.com
JRuby Developer @ www.jruby.org
Application Architect @ www.ventera.com
_______________________________________________ Jruby-devel mailing list Jruby-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/jruby-devel