Inline...

On Thu, May 3, 2012 at 3:52 AM, Subramanya Sastry <sss.li...@gmail.com> wrote:
> So, Tom and I have wondered on and off about how to represent/handle $-vars
> better.  And last week, when Charlie, Tom, and I met, we talked about it
> some more, where it occurred to us that $-vars are not that different from
> local variables.  Tom had also indicated earlier that $1 .. $n are just a
> function of the match-data ($~), and so they are.  So are $`, $', $+.

Let's be clear here first of all...we're talking about the
method-local (frame-local) $-vars $_, $~, and derivatives of the
latter like you list. Other globals have different scoping.
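
To make the scoping concrete, here is a small Ruby sketch of what "frame-local" means for $~. The method names are made up for illustration, and the behavior described is what I'd expect from standard Ruby semantics rather than anything specific to the IR:

```ruby
# Each method body gets its own $~; a match in a callee does not
# overwrite the caller's value.
def inner_match
  "hello" =~ /ll/   # sets $~ only in inner_match's frame
  $~[0]
end

def outer_match
  "abc" =~ /b/      # sets $~ in outer_match's frame
  inner = inner_match
  [inner, $~[0]]    # outer's $~ still refers to the /b/ match
end

def no_match_here
  $~                # fresh frame, nothing matched yet
end

result = outer_match   # => ["ll", "b"]
```

A frame that never ran a match sees nil, which is why the value can be treated like an implicitly declared local.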

> We create a special operand for $~, say LastMatch.  Once we figure out more
> details, LastMatch could derive off LocalVariable, and in all analyses will
> behave as if the programmer declared it, and will get a fixed slot (say
> zero, or -1 which could represent the last slot, or n+1 where n is #
> incoming arg slots) in the binding (instead of a special field in
> DynamicScope).  One significant difference from a regular local-var is that
> there won't be any scope-visible definitions into this local variable.  But,
> clearly, reg-exp calls will set this var (updateBackref in RubyRegexp and
> RubyString) -- but this could be pretty much any call (in the absence of any
> other info).  So, that information has to be somehow surfaced in the IR
> representation so it can be subject to regular analysis like any other var.
> In any case, I don't know how it will work as an explicit local-var yet, but till
> then, we could treat LastMatch as a special operand type.
>
> So, we create LastMatch, and get rid of Backref and NthRef operands and
> convert them into special calls on the LastMatch operand.
>
> And, here are the two significant changes.  If there are no uses of
> LastMatch in a method (and all descendent scopes -- blocks passed into
> calls), then there is no reason for RubyString/RubyRegexp to do anything
> with updateBackref.  This also means that scopes can have reg-exp calls and
> can still get by without allocating a heap binding for those scopes.  The
> question is how to pass this information into RubyString/RubyRegexp.  One is
> by way of a special flag set somewhere on the call stack ... or by using
> special-purpose calls in Regexp.  The existing AST implementations might
> also be able to take advantage of it, I think ...

I don't see here how two methods -- one that writes backref and one
that reads it -- would be satisfied. Not all uses of $~ (for example)
read $~ in the body of the Ruby code; it's possible to use it across
calls without ever using any of the specially-named globals.
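
A minimal Ruby sketch of the case I mean, assuming standard String#[] and Regexp.last_match behavior (the method name is hypothetical). Neither the write nor the read names a $-var, so scanning the method body for $~, $1, etc. finds nothing:

```ruby
def write_then_read(text)
  text[/(\d+)-(\d+)/]    # String#[] with a Regexp updates this frame's backref
  Regexp.last_match      # a plain method call that reads the same value
end

md = write_then_read("range 10-20")
md[1]   # => "10"
md[2]   # => "20"
```

Both calls look like ordinary dispatches to the compiler, which is why the absence of $-vars in the source is not enough to prove the backref is unused.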

> Anyway, this is just a broad outline, and not all details are worked out,
> but insofar as $~ is effectively just a regular local-var, it has the exact
> same behavior as a method-level local var used by a piece of Ruby code.  So
> far, in the current implementation of backRef in DynamicScope as a special
> field, I see that it behaves this way as well -- so why not just make it
> explicit and take advantage of it?
>
> Am I missing anything?

Maybe not?

Our goal with all these "extra" frame-local pieces of data is
obviously to make them free when not used (and as cheap as possible
when used). Backref/lastline violate that perhaps more than any other
feature right now in that if there's any hint that they might be used
we construct a full heap-based scope for the method *and* force all
locals into that scope too. That sounds bad enough...and then realize
that "hint of use" includes any method *names* in core that are known
to access backref/lastline, like #[]. So every method that calls #[]
on any object deoptimizes to use a full heap scope every time. *Awful*

I've had several ideas for eliminating this gross over-compensation,
and some of them will be helped by IR treating backref/lastline as
though they're "always heap-based" locals:

* When a method *may* access backref/lastline, increment/decrement an
index into a per-thread array of values. This avoids all allocation
but has the cost of read+modify of a field twice and try/finally logic
(both of which would be *drastically* cheaper than a heap scope). It's
not free when not used, but it has the same drastically reduced cost
whether it is used or not.

* Add an additional call path for methods that *may* access
backref/lastline that preserves a single thread-local value for each.
Only one would be set at a time, and the call stack would preserve
context. In this case, the dynamic call logic looks up a target method
to inspect whether it *actually* accesses backref/lastline, and only
does the thread-local logic in that case. It would make the non-used
case nearly free (on invokedynamic) and the used case cheaper (but
more expensive than the index and pre-allocated array).
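
A rough model of the first idea, written in Ruby only for readability. The real thing would live in the Java runtime, and BackrefSlots, with_slot, and the rest are hypothetical names, not existing JRuby APIs. One instance would exist per thread:

```ruby
# Hypothetical model; none of these names exist in JRuby.
class BackrefSlots
  attr_reader :index

  def initialize
    @slots = []    # stands in for a pre-allocated per-thread array
    @index = -1    # slot of the currently running method, -1 when none
  end

  # Wraps a method body that *may* touch backref/lastline.
  def with_slot
    @index += 1            # first read+modify of the index field
    @slots[@index] = nil   # a fresh frame starts with no match data
    yield
  ensure
    @slots[@index] = nil   # drop the reference so it can be collected
    @index -= 1            # second read+modify; always runs (the try/finally)
  end

  def backref
    @slots[@index]
  end

  def backref=(value)
    @slots[@index] = value
  end
end

slots = BackrefSlots.new
seen = slots.with_slot do
  slots.backref = "outer"
  inner = slots.with_slot do
    slots.backref = "inner"   # callee writes its own slot
    slots.backref
  end
  [inner, slots.backref]      # caller's slot is untouched
end
# seen => ["inner", "outer"]
```

The cost per call is the two index updates plus the ensure, with no allocation, and it is the same whether or not the body ever touches the slot.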

The former I can do now even with the current compiler by adding
another "CallConfiguration" and compiler logic around it (currently
CallConfiguration can only reflect frame:yes|no,
scope:full|dummy|none), and I may even do that for JRuby 1.7. The
latter really needs IR and invokedynamic to be efficient, since I
would want to bind the proper logic into each call site.

- Charlie
