Re: [perl #56398] [BUG] lexical inner scope always keeps first lexpad (or something)

Bob Rogers Thu, 10 Jul 2008 18:36:14 -0700

   From: "Patrick R. Michaud" <[EMAIL PROTECTED]>
   Date: Thu, 10 Jul 2008 00:31:53 -0500

   On Thu, Jul 10, 2008 at 12:29:57AM -0400, Bob Rogers wrote:
   . . .
   > Shouldn't
   >    for 1..10 -> $x {
   >       sub foo() { say $x; }
   >       push(@foos, \&foo);
   >    }
   > produce the same result as
   >    for 1..10 -> $x {
   >       my $foo = sub { say $x; };
   >       push(@foos, $foo);
   >    }
   > modulo global namespace mangling (and broken Perl 6 syntax)?  And if
   > not, why not?

   Yes, the two examples above would be effectively the same, but 
   neither of them is the same as my original.  Perl 6 recognizes 
   taking a closure at the point where a sub is referenced (used as an rvalue), 
   so in the first example we would do the newclosure at the &foo 
   reference inside the push() call, and in the second example we would 
   do the newclosure as part of the "my $foo" assignment.

Hmm.  I have been assuming that when you say "taking a closure", that is
equivalent to "using *the* closure" for a given :outer context.  It's
beginning to dawn on me that this might not be what you mean; more
below.

   But according to Synopsis 4 [1], if we simply define foo and call it
   directly (as opposed to taking a reference), we don't take that
   snapshot of its lexical scope . . .

Why on earth not?  How can foo() ever operate correctly without the
right :outer scope, regardless of how we call it?  I assume I must be
misunderstanding (again) . . .

   >    Also, we can't do this exactly at the point of definition, because
   >    sub calls can lexically occur before the definition of the thing
   >    they're calling.  So it has to be moved higher in the block somehow.
   > 
   > Eh?  Can't do what at the point of definition?  Do you have an example?

   By "this" I meant "take a newclosure and store it in the 
   symbol table".  The following is valid Perl 6:

       my $x = 'hello';
       foo();
       sub foo() { say $x; }

   At the point of the foo() call, we have to have already stored
   an entry in the symbol table for 'foo', thus if we wait until
   the sub declaration to do that, it's too late.  So, we have to
   move the newclosure+store sequence to a point earlier in the block 
   than where the sub declaration itself occurs.

Yes; I do see what you mean now.  But I think the compiler needs to do
this generally; foo must be defined at BEGIN time, so the compiler has
to save up the other code (including the initialization of $x if not its
creation as a lexical) in order to ensure that they are evaluated after
BEGIN time.  The attached patrick-example-1.pir is how I would translate
this (though one could argue that the BEGIN code should go into an :init
sub).

   But I don't think this is out of the ordinary.  As an example of
another nontrivial case of BEGIN-time processing, consider the following
Perl 5 code:

        sub foo {
            print "This is the original foo.\n";
        }

        my $old_foo;
        BEGIN {
            $old_foo = \&foo;
        }

        sub foo {
            print "Out with the old, in with the new.\n";
            $old_foo->();
        }

        foo();

I admit this is a weird example, but if the first "foo" definition were
in another file, the BEGIN block and second definition would amount to a
reasonable way to wrap the original "foo".  The problem, of course, is
that all PIR sub definitions appear to happen simultaneously, so the
straightforward translation fails because you can't name them both "foo"
in the PIR source.  So the compiler must emit some kind of explicit
namespace mangling, and arrange for it to be run at BEGIN time.
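
   For what it's worth, here is a minimal Perl 5 sketch of what such
explicit mangling could look like at the source level, assuming the
wrapper is installed by assigning to the glob at BEGIN time.  The
strings, and the use of return values instead of print, are made up
purely for illustration:

```perl
use strict;
use warnings;

sub foo { return "original foo" }

my $old_foo;
BEGIN {
    # Capture the first definition, then install the wrapper under the
    # same name by assigning to the glob (the explicit "mangling").
    $old_foo = \&foo;
    no warnings 'redefine';
    *foo = sub { return "new foo, then " . $old_foo->() };
}

my $result = foo();
print "$result\n";
```

Because the glob assignment is spelled out, the two definitions never
have to share a name in the source.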

   >    From: "Patrick R. Michaud" <[EMAIL PROTECTED]>
   >    ...would we also have to generate code to restore the previous
   >    value of 'foo' upon exiting the sub?  (Think recursive calls here.)
   > 
   >        sub bar($x) {
   >       sub foo() { say $x; }
   >       if ($x > 0) { bar($x-1); }
   >       foo();
   >        }
   > 
   > No.  AFAIK, "sub foo() { ... }" always mangles the global definition of
   > foo.  So in Perl 5, the first call wins; in Perl 6, it might be the last
   > call, but I don't really know.  

   Okay, I wasn't aware of this for Perl 5, so perhaps Perl 6 has similar
   semantics.  But in my head it doesn't seem to match what I read
   in Synopsis 4 at [1], particularly where S04 says

       my sub bar { print $x }         # not cloned yet

Hmm.  The "my" makes "&bar" lexical, so at this point, the existence of
&bar does not expand the potential lifetime of $x.  But in fact, &bar
*does* need newclosure, and this is a lexical property of the containing
block, known to the compiler before it emits any of the block code, so I
don't understand what time "yet" refers to.

        It is also free to turn referenced closures into mere anonymous
        subroutines if the block does not refer to any external lexicals
        that should themselves be cloned.

   (Again I'm making the leap that 'cloning' in the synopsis is what 
   we mean by 'newclosure' in Parrot.)

I think that's a reasonable guess, and is what I had been assuming.  The
literature (e.g. [2]) says that a closure "captures variable bindings";
that is the terminology I am used to.  I find the term "snapshot"
confusing, since it seems to imply that the variable values are copied
immediately into the closure, rather than referencing the bindings so
that they are shared with the :outer sub and any other closures that
reference those bindings.
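
   The difference is easy to see in Perl 5, which captures bindings.
This is only a sketch (make_pair, $inc, and $get are names invented
for the example), and it says nothing about what S04 intends:

```perl
use strict;
use warnings;

sub make_pair {
    my $count = 0;                       # one binding, created per call
    my $inc = sub { return ++$count };   # both closures capture the
    my $get = sub { return $count };     # binding itself, not a copy
    return ($inc, $get);
}

my ($inc, $get) = make_pair();
$inc->();
$inc->();
my $seen = $get->();    # 2 under binding capture; a snapshot would give 0
print "count seen by the second closure: $seen\n";
```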

   However, it just occurred to me that Larry might be referring to the
need to close over the variables at all.  In the S04 example, the
compiler ought to realize that &bar is "not a downward function" [3],
and arrange to do "newclosure" immediately.  (Indeed; I think the
comments in the example are incorrect; it's the "return &baz" that
disqualifies &baz, &bar, and $x from being downward.)  However, if some
or all of these references were "downward" (i.e. not referenced after
the containing block returns), then the compiler need not implement
those subs via closures.  In Parrot, the two available technologies for
doing this are inlining, and generating bsr/ret subroutines.

   But, by talking about "cloning status", this section seems to imply
that the decision to clone might need to happen at runtime.  I am not
aware of any Perl 6 constructs that require such a thing, except perhaps
for the following:

        my $code1 = &bar;                # "now bar is cloned"
        my $code2 = &bar;                # ???

Is $code1 a different closure from $code2?  If so, different *how*?  If
they are different objects that both refer to the identical bindings,
what is the advantage of having two separate objects that behave
indistinguishably?

   And if one really has a reason for separate objects, why not just say
so explicitly?

        my $code2 = $code1.clone;

That seems clearer, and may even be easier to implement.
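
   To make the question concrete, Perl 5 can produce exactly this
situation: two distinct code objects closed over the same binding.
The sketch below assumes nothing about Perl 6; it only shows that,
apart from their addresses, the two objects cannot be told apart by
calling them:

```perl
use strict;
use warnings;

my $n = 0;
my @code;
# Evaluating the same closure-creating expression twice yields two
# distinct code objects that both capture the one binding of $n.
push @code, sub { return ++$n } for 1 .. 2;

my $distinct = ($code[0] != $code[1]) ? 1 : 0;   # compares reference addresses
my @calls    = ($code[0]->(), $code[1]->(), $code[0]->());
print "distinct objects: $distinct; results: @calls\n";
```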

   However, there may be other such constructs, so I must assume my
understanding of what Perl 6 requires is incomplete.  John Dlugosz's
spec-in-progress [4] may shed some light in this area, but I cannot find
anything (after a casual search, which I'm sure is the wrong tactic for
such a document).

   >    To summarize, I am not arguing that relying on "autoclose" is
   > necessarily wrong, and that we should therefore get rid of it.  
   > [...]

   I don't exactly understand what "autoclose" is (even after reading
   the other thread) -- my impression is that it refers to invoking
   or taking a snapshot of an inner sub's lexical environment
   even when its outer context hasn't been invoked yet.  If this 
   limited understanding of "autoclose" is correct, then I don't think 
   my examples are relying on it; but if the examples I'm giving
   do make use of the "autoclose" feature you keep referring to,
   then I guess I need to learn more about "autoclose" to be able
   to talk intelligently about it.

   Pm

Since at least October 2006, and up through r28762, "autoclose" meant
the ability of subs marked with :outer to turn into working closures
without the benefit of "newclosure".  This would happen when a closure
was invoked without an outer_ctx (normally filled in by newclosure), in
which case Closure:invoke worked backward through the call chain to find
the innermost context of the outer_sub, and installed that as the
outer_ctx.  So if you called the symbol table entry for the lexical sub
from within the :outer sub (the normal case), it would work as if the
:outer sub had done newclosure and stuffed the sub back in the
namespace.  In the case where the outer sub had never been invoked, or
wasn't in the active call chain, there is another hack to create an
uninitialized context, which is why this case typically gets "null PMC"
errors.  Either way, this implicit context capture happened only the
first time an uninitialized closure was invoked.  (chromatic says this is
documented in "the lexical spec", or at least some of it, but I can't
find it myself.)
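
   As a toy model of that lookup (this is not Parrot code; the hash
layout and the names find_outer_ctx and invoke_closure are invented
for the illustration), the behavior described above amounts to
something like:

```perl
use strict;
use warnings;

# Toy model only: a "context" is a hash naming its sub and holding a
# lexpad.  The call chain is ordered from outermost to innermost.
sub find_outer_ctx {
    my ($call_chain, $outer_name) = @_;
    for my $ctx (reverse @$call_chain) {            # innermost caller first
        return $ctx if $ctx->{sub} eq $outer_name;
    }
    return { sub => $outer_name, lexpad => {} };    # fallback: empty context
}

sub invoke_closure {
    my ($closure, $call_chain) = @_;
    # The capture happens only if outer_ctx was never filled in.
    $closure->{outer_ctx} ||= find_outer_ctx($call_chain, $closure->{outer_sub});
    return $closure->{outer_ctx}{lexpad}{'$x'};
}

my @chain = (
    { sub => 'main',  lexpad => { '$x' => 'first' } },
    { sub => 'other', lexpad => {} },
    { sub => 'main',  lexpad => { '$x' => 'second' } },
);
my %foo = (outer_sub => 'main', outer_ctx => undef);

my $first_call = invoke_closure(\%foo, \@chain);        # finds innermost 'main'
my $later_call = invoke_closure(\%foo, [ $chain[0] ]);  # keeps the captured context
print "$first_call, $later_call\n";
```

The second call still sees the context captured by the first, which is
the "always keeps first lexpad" symptom in the ticket title.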

   So you are definitely using "autoclose."  In fact, your RT#56398
ticket effectively asks to extend this feature so that it updates the
outer_ctx on *every* call, which (as implemented) breaks newclosure.

                                        -- Bob

[1]  http://dev.perl.org/perl6/doc/design/syn/S04.html#When_is_a_closure_not_a_closure

[2]  http://en.wikipedia.org/wiki/Closure_(computer_science)

[3]  Sorry, can't find a good reference for this.  The terminology is
     (uh) common in Lisp.  In many Lisp implementations, if you say
     "(declare (downward-function foo))", the compiler tries to
     stack-allocate the variables foo closes over.  In order to
     implement the equivalent optimization in Parrot, one would have to
     have a way to tell "newclosure" not to do RetContinuation
     promotion.  But I digress (as usual).

[4]  http://www.dlugosz.com/Perl6/specdoc.pdf

## Bob's translation of Patrick's Perl 6 example:
##
##        my $x = 'hello';
##        foo();
##        sub foo() { say $x; }

.sub main :main

        ## BEGIN time.
        .local pmc x
        .lex '$x', x
        .const .Sub foo_sub = 'foo_sub'
        .local pmc foo_closure
        foo_closure = newclosure foo_sub
        set_hll_global 'foo', foo_closure

        ## Run time.
        x = new 'String'
        x = 'hello'
        foo()
.end

.sub foo_sub :anon :outer('main')
        .local pmc x
        x = find_lex '$x'
        print x
        print "\n"
.end
