   From: "Patrick R. Michaud" <[EMAIL PROTECTED]>
   Date: Thu, 10 Jul 2008 00:31:53 -0500

   On Thu, Jul 10, 2008 at 12:29:57AM -0400, Bob Rogers wrote:
   . . .
   > Shouldn't
   >     for 1..10 -> $x {
   >         sub foo() { say $x; }
   >         push(@foos, \&foo);
   >     }
   > produce the same result as
   >     for 1..10 -> $x {
   >         my $foo = sub { say $x; };
   >         push(@foos, $foo);
   >     }
   > modulo global namespace mangling (and broken Perl 6 syntax)?  And if
   > not, why not?

   Yes, the two examples above would be effectively the same, but
   neither of them is the same as my original.  Perl 6 recognizes
   taking a closure at the point where a sub is referenced (used as an
   rvalue), so in the first example we would do the newclosure at the
   &foo reference inside the push() call, and in the second example we
   would do the newclosure as part of the "my $foo" assignment.

Hmm.  I have been assuming that when you say "taking a closure", that is
equivalent to "using *the* closure" for a given :outer context.  It's
beginning to dawn on me that this might not be what you mean; more
below.

   But according to Synopsis 4 [1], if we simply define foo and call it
   directly (as opposed to taking a reference), we don't take that
   snapshot of its lexical scope . . .

Why on earth not?  How can foo() ever operate correctly without the
right :outer scope, regardless of how we call it?  I assume I must be
misunderstanding (again) . . .

   > Also, we can't do this exactly at the point of definition, because
   > sub calls can lexically occur before the definition of the thing
   > they're calling.  So it has to be moved higher in the block somehow.
   >
   > Eh?  Can't do what at the point of definition?  Do you have an example?

   By "this" I meant "take a newclosure and store it in the symbol
   table".  The following is valid Perl 6:

       my $x = 'hello';
       foo();
       sub foo() { say $x; }

   At the point of the foo() call, we have to have already stored an
   entry in the symbol table for 'foo', thus if we wait until the sub
   declaration to do that, it's too late.
   So, we have to move the newclosure+store sequence to a point earlier
   in the block than where the sub declaration itself occurs.

Yes; I do see what you mean now.  But I think the compiler needs to do
this generally; foo must be defined at BEGIN time, so the compiler has
to save up the other code (including the initialization of $x if not its
creation as a lexical) in order to ensure that it is evaluated after
BEGIN time.  The attached patrick-example-1.pir is how I would translate
this (though one could argue that the BEGIN code should go into an :init
sub).

But I don't think this is out of the ordinary.  As an example of another
nontrivial case of BEGIN-time processing, consider the following Perl 5
code:

    sub foo {
        print "This is the original foo.\n";
    }

    my $old_foo;
    BEGIN { $old_foo = \&foo; }

    sub foo {
        print "Out with the old, in with the new.\n";
        $old_foo->();
    }

    foo();

I admit this is a weird example, but if the first "foo" definition were
in another file, the BEGIN block and second definition would amount to a
reasonable way to wrap the original "foo".  The problem, of course, is
that all PIR sub definitions appear to happen simultaneously, so the
straightforward translation fails because you can't name them both "foo"
in the PIR source.  So the compiler must emit some kind of explicit
namespace mangling, and arrange for it to be run at BEGIN time.

   >    From: "Patrick R. Michaud" <[EMAIL PROTECTED]>
   >    ...would we also have to generate code to restore the previous
   >    value of 'foo' upon exiting the sub?  (Think recursive calls here.)
   >
   >        sub bar($x) {
   >            sub foo() { say $x; }
   >            if ($x > 0) { bar($x-1); }
   >            foo();
   >        }
   >
   > No.  AFAIK, "sub foo() { ... }" always mangles the global definition of
   > foo.  So in Perl 5, the first call wins; in Perl 6, it might be the last
   > call, but I don't really know.

   Okay, I wasn't aware of this for Perl 5, so perhaps Perl 6 has
   similar semantics.
   But in my head it doesn't seem to match what I read in Synopsis 4 at
   [1], particularly where S04 says

       my sub bar { print $x }         # not cloned yet

Hmm.  The "my" makes "&bar" lexical, so at this point, the existence of
&bar does not expand the potential lifetime of $x.  But in fact, &bar
*does* need newclosure, and this is a lexical property of the containing
block, known to the compiler before it emits any of the block code, so I
don't understand what time "yet" refers to.

       It is also free to turn referenced closures into mere anonymous
       subroutines if the block does not refer to any external lexicals
       that should themselves be cloned.

   (Again I'm making the leap that 'cloning' in the synopsis is what we
   mean by 'newclosure' in Parrot.)

I think that's a reasonable guess, and is what I had been assuming.  The
literature (e.g. [2]) says that a closure "captures variable bindings";
that is the terminology I am used to.  I find the term "snapshot"
confusing, since it seems to imply that the variable values are copied
immediately into the closure, rather than referencing the bindings so
that they are shared with the :outer sub and any other closures that
reference those bindings.

However, it just occurred to me that Larry might be referring to the
need to close over the variables at all.  In the S04 example, the
compiler ought to realize that &bar is "not a downward function" [3],
and arrange to do "newclosure" immediately.  (Indeed; I think the
comments in the example are incorrect; it's the "return &baz" that
disqualifies &baz, &bar, and $x from being downward.)  However, if some
or all of these references were "downward" (i.e. not referenced after
the containing block returns), then the compiler need not implement
those subs via closures.  In Parrot, the two available technologies for
doing this are inlining, and generating bsr/ret subroutines.

But, by talking about "cloning status", this section seems to imply that
the decision to clone might need to happen at runtime.
I am not aware of any Perl 6 constructs that require such a thing,
except perhaps for the following:

    my $code1 = &bar;        # "now bar is cloned"
    my $code2 = &bar;        # ???

Is $code1 a different closure from $code2?  If so, different *how*?  If
they are different objects that both refer to the identical bindings,
what is the advantage of having two separate objects that behave
indistinguishably?  And if one really has a reason for separate objects,
why not just say so explicitly?

    my $code2 = $code1.clone;

That seems clearer, and may even be easier to implement.

However, there may be other such constructs, so I must assume my
understanding of what Perl 6 requires is incomplete.  John Dlugosz's
spec-in-progress [4] may shed some light in this area, but I cannot find
anything (after a casual search, which I'm sure is the wrong tactic for
such a document).

   > To summarize, I am not arguing that relying on "autoclose" is
   > necessarily wrong, and that we should therefore get rid of it.
   > [...]

   I don't exactly understand what "autoclose" is (even after reading
   the other thread) -- my impression is that it refers to invoking or
   taking a snapshot of an inner sub's lexical environment even when
   its outer context hasn't been invoked yet.  If this limited
   understanding of "autoclose" is correct, then I don't think my
   examples are relying on it; but if the examples I'm giving do make
   use of the "autoclose" feature you keep referring to, then I guess I
   need to learn more about "autoclose" to be able to talk
   intelligently about it.

   Pm

Since at least October 2006, and up through r28762, "autoclose" meant
the ability of subs marked with :outer to turn into working closures
without the benefit of "newclosure".  This would happen when a closure
was invoked without an outer_ctx (normally filled in by newclosure), in
which case Closure:invoke worked backward through the call chain to find
the innermost context of the outer_sub, and installed that as the
outer_ctx.
So if you called the symbol table entry for the lexical sub from within
the :outer sub (the normal case), it would work as if the :outer sub had
done newclosure and stuffed the sub back in the namespace.  In the case
where the outer sub had never been invoked, or wasn't in the active call
chain, there is another hack to create an uninitialized context, which
is why this case typically gets "null PMC" errors.  Either way, this
implicit context capture happened only the first time an uninitialized
closure was invoked.  (chromatic says this is documented in "the lexical
spec", or at least some of it, but I can't find it myself.)

So you are definitely using "autoclose."  In fact, your RT#56398 ticket
effectively asks to extend this feature so that it updates the outer_ctx
on *every* call, which (as implemented) breaks newclosure.

                                        -- Bob

[1]  http://dev.perl.org/perl6/doc/design/syn/S04.html#When_is_a_closure_not_a_closure

[2]  http://en.wikipedia.org/wiki/Closure_(computer_science)

[3]  Sorry, can't find a good reference for this.  The terminology is
     (uh) common in Lisp.  In many Lisp implementations, if you say
     "(declare (downward-function foo))", the compiler tries to
     stack-allocate the variables foo closes over.  In order to
     implement the equivalent optimization in Parrot, one would have to
     have a way to tell "newclosure" not to do RetContinuation
     promotion.  But I digress (as usual).

[4]  http://www.dlugosz.com/Perl6/specdoc.pdf

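As a rough illustration of the "captures variable bindings" versus
"snapshot" distinction discussed above, here is a small sketch.  It is
a Python analogy only, not Perl 6 or Parrot semantics, and the names
make_counter_pair, bump, and peek are made up for the example.  Two
closures created by one call of the outer function share a single
binding, so a write through one is visible through the other; a second
call of the outer function gets a fresh binding.

```python
def make_counter_pair():
    """Return two closures that share one binding of `count`."""
    count = 0

    def bump():
        nonlocal count      # refers to the enclosing binding, not a copy
        count += 1
        return count

    def peek():
        return count        # sees whatever bump() last stored

    return bump, peek


# Both closures come from the same outer call, so they share `count`.
bump, peek = make_counter_pair()
bump()
bump()
shared_value = peek()       # reflects both bump() calls

# A second outer call creates an independent binding.
bump2, peek2 = make_counter_pair()
fresh_value = peek2()       # unaffected by the first pair
```

If closures took a "snapshot" of values at creation time, peek() would
keep returning the initial value no matter how often bump() ran; with
shared bindings it does not.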
## Bob's translation of Patrick's Perl 6 example:
##
##      my $x = 'hello';
##      foo();
##      sub foo() { say $x; }

.sub main :main
        ## BEGIN time.
        .local pmc x
        .lex '$x', x
        .const .Sub foo_sub = 'foo_sub'
        .local pmc foo_closure
        foo_closure = newclosure foo_sub
        set_hll_global 'foo', foo_closure
        ## Run time.
        x = new 'String'
        x = 'hello'
        foo()
.end

.sub foo_sub :anon :outer('main')
        .local pmc x
        x = find_lex '$x'
        print x
        print "\n"
.end
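
For readers who do not know PIR, the same ordering can be sketched in
Python.  This is only an analogy under stated assumptions: symbol_table
is a hypothetical stand-in for the HLL global namespace, output stands
in for what "say" would print, and Python's nested functions play the
role of newclosure.  The point it tries to show is that the closure is
created and installed before $x is initialized, yet still sees the
later value because it refers to the binding rather than to a copy.

```python
symbol_table = {}          # hypothetical stand-in for the HLL global namespace
output = []                # collects what "say" would have printed


def main():
    x = None               # the lexical exists but is not yet initialized

    # "BEGIN time": create the closure over main's frame and install it
    # in the symbol table before any run-time code executes.
    def foo_sub():
        output.append(x)   # looks up x in main's frame at call time

    symbol_table['foo'] = foo_sub

    # "Run time": initialize x, then call foo through the symbol table.
    x = 'hello'
    symbol_table['foo']()


main()
```

Calling symbol_table['foo'] after main() has returned still works in
this sketch, since Python keeps the captured binding alive.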