Inline...

On Thu, May 3, 2012 at 3:52 AM, Subramanya Sastry <sss.li...@gmail.com> wrote:
> So, Tom and I have wondered on and off about how to represent/handle $-vars
> better. And last week, when Charlie, Tom, and I met, we talked about it
> some more, where it occurred to us that $-vars are not that different from
> local variables. Tom had also indicated earlier that $1 .. $n are just a
> function of the match-data ($~), and so they are. So are $`, $', $+.
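
For anyone following along, here is a small plain-Ruby illustration of the quoted point: each of these variables can be recomputed from $~, and $~ itself is local to the method frame. The method names (inner_match, outer) are made up for the example.

```ruby
# Each match-related $-var is a view onto $~ (the last MatchData).
"hello world" =~ /(o) (w)/
m = $~

derived = {
  "$1" => [$1, m[1]],
  "$2" => [$2, m[2]],
  "$`" => [$`, m.pre_match],
  "$'" => [$', m.post_match],
  "$+" => [$+, m[-1]]   # last group; equals $+ here because every group matched
}
derived.each do |name, (var, from_match)|
  puts "#{name}: #{var.inspect} == #{from_match.inspect}"
end

# $~ is frame-local: a match inside a callee does not touch the caller's $~.
def inner_match
  "abc" =~ /b/
  $~[0]
end

def outer
  "xyz" =~ /y/
  inner = inner_match
  [inner, $~[0]]
end

p outer
```

The last line shows the callee seeing its own match while the caller's match survives the call.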
Let's be clear here first of all... we're talking about the method-local (frame-local) $-vars $_, $~, and derivatives of the latter like you list. Other globals have different scoping.

> We create a special operand for $~, say LastMatch. Once we figure out more
> details, LastMatch could derive off LocalVariable, and in all analyses will
> behave as if the programmer declared it, and will get a fixed slot (say
> zero, or -1 which could represent the last slot, or n+1 where n is #
> incoming arg slots) in the binding (instead of a special field in
> DynamicScope). One significant difference from a regular local-var is that
> there won't be any scope-visible definitions into this local variable. But,
> clearly, reg-exp calls will set this var (updateBackref in RubyRegexp and
> RubyString) -- but this could be pretty much any call (in the absence of any
> other info). So, that information has to be somehow surfaced in the IR
> representation so it can be subject to regular analysis like any other var.
> In any case, I don't know how it will work as an explicit local-var yet, but till
> then, we could treat LastMatch as a special operand type.
>
> So, we create LastMatch, and get rid of Backref and NthRef operands and
> convert them into special calls on the LastMatch operand.
>
> And, here are the two significant changes. If there are no uses of
> LastMatch in a method (and all descendant scopes -- blocks passed into
> calls), then there is no reason for RubyString/RubyRegexp to do anything
> with updateBackref. This also means that scopes can have reg-exp calls and
> can still get by without allocating a heap binding for those scopes. The
> question is how to pass this information into RubyString/RubyRegexp. One is
> by way of a special flag set somewhere on the call stack ... or by using
> special-purpose calls in Regexp. The existing AST implementations might
> also be able to take advantage of it, I think ...
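
One case to keep in mind when reading the "no uses of LastMatch" condition: a method can both write and read the backref without a single $-variable appearing in its source. A minimal plain-Ruby sketch (find_digits is a made-up name):

```ruby
def find_digits(text)
  digits = text[/\d+/]          # String#[] with a Regexp writes this frame's backref
  [digits, Regexp.last_match]   # ...and this reads it back, with no $-var in the source
end

digits, match = find_digits("order 42 shipped")
puts digits
puts match.pre_match.inspect
```

A scan that only looks for $~, $1 and friends in the method body would presumably classify find_digits as not using the backref, even though it depends on it.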

I don't see here how two methods -- one that writes backref and one that reads it -- would be satisfied. Not all uses of $~ (for example) read $~ in the body of the Ruby code; it's possible to use it across calls without ever using any of the specially-named globals.

> Anyway, this is just a broad outline, and not all details are worked out,
> but insofar as $~ is effectively just a regular local-var, it has the exact
> same behavior as a method-level local var used by a piece of Ruby code. So
> far, in the current implementation of backRef in DynamicScope as a special
> field, I see that it behaves this way as well -- so why not just make it
> explicit and take advantage of it?
>
> Am I missing anything?

Maybe not? Our goal with all these "extra" frame-local pieces of data is obviously to make them free when not used (and as cheap as possible when used). Backref/lastline violate that perhaps more than any other feature right now, in that if there's any hint that they might be used we construct a full heap-based scope for the method *and* force all locals into that scope too. That sounds bad enough... and then realize that "hint of use" includes any method *names* in core that are known to access backref/lastline, like #[]. So every method that calls #[] on any object deoptimizes to use a full heap scope every time. *Awful*.

I've had several ideas for eliminating this gross over-compensation, and some of them will be helped by IR treating backref/lastline as though they're "always heap-based" locals:

* When a method *may* access backref/lastline, increment/decrement an index into a per-thread array of values. This avoids all allocation but has the cost of read+modify of a field twice and try/finally logic (both of which would be *drastically* cheaper than a heap scope). It's not free when not used, but it has the same drastically reduced cost whether it is used or not.

* Add an additional call path for methods that *may* access backref/lastline that preserves a single thread-local value for each. Only one would be set at a time, and the call stack would preserve context. In this case, the dynamic call logic looks up a target method to inspect whether it *actually* accesses backref/lastline, and only does the thread-local logic in that case. It would make the non-used case nearly free (on invokedynamic) and the used case cheaper (but more expensive than the index and pre-allocated array).

The former I can do now even with the current compiler by adding another "CallConfiguration" and compiler logic around it (currently CallConfiguration can only reflect frame:yes|no, scope:full|dummy|none), and I may even do that for JRuby 1.7. The latter really needs IR and invokedynamic to be efficient, since I would want to bind the proper logic into each call site.

- Charlie
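
To make the first idea (index into a per-thread array) a bit more concrete, here is a rough sketch in plain Ruby. It is only an illustration of the shape of the technique, not JRuby code: BackrefStack, with_backref_frame, and the fixed capacity are all hypothetical names and choices, and a real version would live in the runtime's call path rather than in user code.

```ruby
# Hypothetical sketch: one preallocated slot array per thread, plus an index
# that is bumped on entry and restored on exit of any method that *may*
# touch backref. No per-call allocation is needed.
class BackrefStack
  def initialize(capacity = 1024)   # capacity is an arbitrary assumption
    @slots = Array.new(capacity)
    @index = -1
  end

  attr_reader :index

  def enter
    @index += 1
    @slots[@index] = nil            # a new frame starts with no backref
  end

  def leave
    @slots[@index] = nil            # drop the reference so it can be collected
    @index -= 1
  end

  def backref
    @slots[@index]
  end

  def backref=(value)
    @slots[@index] = value
  end

  # One stack per thread, created lazily.
  def self.current
    stack = Thread.current.thread_variable_get(:backref_stack)
    unless stack
      stack = new
      Thread.current.thread_variable_set(:backref_stack, stack)
    end
    stack
  end
end

# The increment/decrement pair wrapped in ensure (Ruby's try/finally).
def with_backref_frame
  stack = BackrefStack.current
  stack.enter
  begin
    yield stack
  ensure
    stack.leave
  end
end

result = with_backref_frame do |outer|
  outer.backref = "outer match"
  inner_seen = with_backref_frame do |inner|
    before = inner.backref          # the callee starts with an empty slot
    inner.backref = "inner match"
    before
  end
  [inner_seen, outer.backref]       # the caller's value survives the nested call
end

p result
```

The cost per call is the two index updates and the ensure block, whether or not the slot is ever written, which matches the "same reduced cost whether it is used or not" trade-off described above.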