So, Tom and I have wondered on and off about how to represent/handle $-vars
better.  And last week, when Charlie, Tom, and I met, we talked about it
some more, and it occurred to us that $-vars are not that different from
local variables.  Tom had also indicated earlier that $1 .. $n are just a
function of the match-data ($~), and so they are.  So are $`, $', $+.
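
To make that concrete, here is a small Ruby sketch (illustration only, with
made-up variable names) checking that each of these $-vars can be recomputed
from the MatchData object held in $~:

```ruby
# Illustration only: each special variable is compared against a value
# derived from the MatchData in $~.
"hello world" =~ /(o) (w)/
m = $~   # the last MatchData for this frame

pairs = [
  ["$1", $1, m[1]],
  ["$2", $2, m[2]],
  ["$`", $`, m.pre_match],
  ["$'", $', m.post_match],
  ["$+", $+, m.captures.compact.last],  # highest-numbered group that matched
]

pairs.each do |name, special, derived|
  puts "#{name}: #{special.inspect} == #{derived.inspect}"
end

all_equal = pairs.all? { |_, special, derived| special == derived }
```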

Today, on my flight to San Francisco, I started poking at this some more
and realized that we could encode $-vars better.

Here is a proposal for doing that, which also has some ramifications (I think
for the good) on the JRuby implementation of RegExp.

We create a special operand for $~, say LastMatch.  Once we figure out more
details, LastMatch could derive off LocalVariable, and in all analyses will
behave as if the programmer declared it, and will get a fixed slot (say
zero, or -1 which could represent the last slot, or n+1 where n is the
number of incoming arg slots) in the binding (instead of a special field in
DynamicScope).  One significant difference from a regular local-var is that
there won't be any explicit definitions of this variable visible in the
scope's own code.
But, clearly, reg-exp calls will set this var (updateBackref in RubyRegexp
and RubyString) -- but this could be pretty much any call (in the absence
of any other info).  So, that information has to be surfaced somehow in the
IR so it can be subject to regular analysis like any other var.  In any
case, I don't know yet how it will work as an explicit local-var, but till
then, we could treat LastMatch as a special operand type.

So, we create LastMatch, and get rid of Backref and NthRef operands and
convert them into special calls on the LastMatch operand.

And, here are the two significant changes.  If there are no uses of
LastMatch in a method (and all descendent scopes -- blocks passed into
calls), then there is no reason for RubyString/RubyRegexp to do anything
with updateBackref.  This also means that scopes can have reg-exp calls and
can still get by without allocating a heap binding for those scopes.  The
question is how to pass this information into RubyString/RubyRegexp.  One
option is a special flag set somewhere on the call stack; another is to use
special-purpose calls in Regexp.  The existing AST implementations might
also be able to take advantage of it, I think ...
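
As a rough illustration of what a "special-purpose call" could look like from
the Ruby side: later Ruby versions (2.4 onward) provide Regexp#match?, which
reports whether a match exists without updating $~.  The sketch below only
shows that observable difference; it is not a claim about how JRuby would
wire this up internally.

```ruby
# Requires Ruby 2.4+ for Regexp#match?.
re = /(\d+)/

"seed" =~ /seed/                  # put a known value into $~
before = $~[0]                    # "seed"

found = re.match?("abc 123")      # true, and $~ is left alone
after_match_p = $~[0]             # still "seed"

re.match("abc 123")               # ordinary match: $~ is replaced
after_match = $~[0]               # "123"

puts [before, found, after_match_p, after_match].inspect
```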

Anyway, this is just a broad outline, and not all details are worked out,
but insofar as $~ is effectively just a regular local-var, it has the exact
same behavior as a method-level local var used by a piece of Ruby code.  So
far, in the current implementation of backRef in DynamicScope as a special
field, I see that it behaves this way as well -- so why not just make it
explicit and take advantage of it?
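
A small Ruby sketch of that behavior (method names here are made up for the
example): a callee's match does not disturb the caller's $~, while a block
shares $~ with its enclosing method, just as it would share a local variable.

```ruby
def inner_match
  "abc" =~ /b/          # sets $~ in inner_match's frame only
  $~[0]
end

def outer
  "xyz" =~ /y/          # sets $~ in outer's frame
  inner = inner_match
  after_call = $~[0]    # still "y": the callee did not clobber our $~

  [1].each { "pqr" =~ /q/ }   # a block shares the enclosing method's $~
  after_block = $~[0]   # now "q"

  [inner, after_call, after_block]
end

result = outer
puts result.inspect
```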

Am I missing anything?

Subbu.
