advanced: closures (was Re: test for real number)

Jeff Pinyan Thu, 31 May 2001 10:10:55 -0700
On May 31, Randal L. Schwartz said:

>Paul> Be careful with this, though -- if you call some other function
>Paul> from this scope, the value of $SIG{__WARN__} will still be the
>Paul> routine that increments $bad, but it won't be able to see the
>Paul> $bad we made with my()!
>
>Wrong.  It'll still work!  It's a closure!
>
>(now explain that... :)

What's a closure?  To understand this, first you have to understand a bit
about Perl's scoping.

There are two types of variables in Perl: package (global) and private
(lexical).  Package variables belong to a specific package:

  #!/usr/bin/perl

  $foo = 10;  # this is $foo in package main
  print $main::foo;  # this is how you explicitly access $foo in main

Private variables are declared with 'my', and belong to the block they are
defined in.  Because of this, they are called lexical -- you can see, in
the code, the exact scope of the variable.  Lexical variables do not
belong to any package.

  $foo = 10;
  {
    my $foo = 20;
    print $foo;        # 20
    print $main::foo;  # 10
  }
  print $foo;          # 10
  print $main::foo;    # 10

The 'my $foo' is ONLY visible inside the block of code we defined it in.
That means that if a function is looking for a variable named $foo, it
might not see the 'my $foo' we created:

  # the 'my $foo' DOES NOT EXIST OUT HERE!
  sub bar { print $foo }

  $foo = 10;
  {
    # 'my $foo' exists in here!
    my $foo = 20;
    bar();
  }

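This program prints 10, not 20.  bar() was compiled outside the block,
where the only $foo in sight is the package variable $main::foo, so that
is the one it prints.  (This is a reason why it's good to send the
variables that a function will need to the function explicitly.)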
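
Here's a minimal sketch of what "sending the variables explicitly" looks
like.  This version of bar() is my own variation for illustration; it takes
its value from @_ instead of going hunting for a variable named $foo:

```perl
sub bar {
    my ($value) = @_;    # bar() only sees what it is handed
    return $value;
}

$foo = 10;               # package variable $main::foo
{
    my $foo = 20;        # lexical $foo, visible only in this block
    print bar($foo), "\n";    # prints 20 -- we passed the lexical in
}
print bar($foo), "\n";        # prints 10 -- out here $foo is $main::foo
```

Now bar() doesn't care which $foo is in scope where it was defined; it
works with whatever the caller gives it.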

[This is a short introduction to scope.  Please read the documentation
about my()[1], and Mark-Jason Dominus's "Coping with Scoping"[2] article.]

So, japhy, what's a closure?

A closure is a function (usually an ANONYMOUS one, constructed via
$x = sub { ... }) that uses LEXICAL variables which were defined in a
scope visible to the function at the point where it was created.

"Guh?"  Yeah, that's what I thought you'd say.

Here's a simple example:

  sub make_a_counter {
    my $start = shift;
    return sub { return $start++ }
  }

  my $start_at_5 = make_a_counter(5);

  print $start_at_5->() for 1 .. 10;

Ok, let's step back and see what this is doing:

  sub make_a_counter {
    my $start = shift;
    return sub { return $start++ }
  }

We're creating a function called make_a_counter(), and it takes some
number as its argument (we store that in the LEXICAL variable
$start).  Then, the function returns a reference to an anonymous function.

  $code_ref = sub { ... };

That code creates a reference to an anonymous function, and stores it in
$code_ref.  You can then call that function via

  $code_ref->(@args);
  # or
  &$code_ref(@args);

Now, the anonymous function we create is:

  {
    return $start++;
  }

Where did this $start variable get defined?  It was defined in the same
scope as this anonymous function... so that means $start is visible to
this anonymous function.

Now, we return a reference to the anonymous function.

  my $start_at_5 = make_a_counter(5);

Now we store that code reference in $start_at_5.

  print $start_at_5->() for 1 .. 10;

Here, we call the code reference 10 times and print each return value.
The output of this program is 5 6 7 8 9 10 11 12 13 14 (spaces added for
clarity).

But how did the function see $start?  We called the function OUTSIDE of
the scope where the code reference was created!

That is the magical effect of closures.  The code reference we created
uses the lexical $start variable.  You see, the NAME $start goes out of
scope at the end of make_a_counter(), but the variable behind that name
only disappears when its reference count drops to 0 -- that means, when
nothing else depends on it to exist.  We've created a code reference that
depends on $start to stick around, so lovingly, it complies. :)
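
You can watch a lexical outlive its block without any closures involved.
This is only a small sketch; $value and $ref are names I made up for the
illustration:

```perl
my $ref;
{
    my $value = 42;
    $ref = \$value;    # now something outside the block depends on $value
}
# The name $value is not visible out here, but the scalar it named is
# still alive, because $ref holds a reference to it.
print $$ref, "\n";     # prints 42
```

A closure does the same kind of thing: the code reference holds on to the
lexicals it uses, the way $ref holds on to $value here.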

But, if $start stays around, what happens when we make TWO counters?!

  $x = make_a_counter(10);
  $y = make_a_counter(20);
  print $x->();
  print $y->();
  print $x->();

This prints (thank goodness!) 10 20 11.  But why?  First, let's try that
exercise WITHOUT using a lexical $start, but rather, using a global
variable.

  sub make_a_bad_counter {
    $start = shift;
    return sub { return $start++ }
  }

  $x = make_a_bad_counter(10);
  $y = make_a_bad_counter(20);
  print $x->();
  print $y->();
  print $x->();

This code prints 20 21 22.  The reason is that the code reference
returned by make_a_bad_counter() is just accessing the global $start
variable.  By the time we call $x->(), $start has been set to 20 by the
call of make_a_bad_counter(20).  Oh well. :(

So now we know that we have to use my() variables.  But why does the
my() variable hold different values for different counters?  Because each
time my() is executed, you get a fresh variable.  Perl will reuse the same
chunk of memory if the old variable has been freed, and will use a new
chunk if it has not.  How does that memory get freed?  By the variable's
reference count dropping to 0 -- which isn't the case here, since we have
a code reference depending on that variable hanging around.
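
If you'd like to check that each counter really has its own $start, here's
a rough sketch.  make_counter_and_ref() is a hypothetical variation on
make_a_counter() that also hands back a reference to its $start, purely so
we can peek at it:

```perl
sub make_counter_and_ref {
    my $start = shift;
    # return the closure AND a reference to the variable it closes over
    return (sub { return $start++ }, \$start);
}

my ($x, $x_ref) = make_counter_and_ref(10);
my ($y, $y_ref) = make_counter_and_ref(20);

# references compare by address in numeric context
print "separate variables\n" if $x_ref != $y_ref;

$x->();
$x->();
print "$$x_ref $$y_ref\n";    # prints 12 20
```

Calling $x->() twice bumps the first $start to 12 and leaves the second
one alone at 20, because the two closures were built around two different
variables.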

Here's a quick example:

  for (1..3) {
    my $x = $_;
    print \$x;
  }
  SCALAR(0xc62c8)
  SCALAR(0xc62c8)
  SCALAR(0xc62c8)

Notice how the place in memory doesn't change?

  friday:~ $ perl -l
  for (1..3) {
    my $x = $_;
    push @y, \$x;
    print \$x;
  }
  SCALAR(0xc62c8)
  SCALAR(0xc6280)
  SCALAR(0xc6328)

Now it's changed.  Why?  Because the @y array holds a reference to each
$x, and that keeps its reference count above 0 -- so every trip through
the loop, my() has to use a new chunk of memory!
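
And that brings us back to the $SIG{__WARN__} question at the top.  I
don't have the original code in front of me, so take this as a sketch of
the idea only: helper() and count_warnings() are hypothetical names, and
the warn() calls just stand in for whatever triggered the warnings.

```perl
use strict;
use warnings;

# a function defined in some other scope; it cannot see $bad by name
sub helper { warn "warning from another scope\n" }

sub count_warnings {
    my $bad = 0;
    # the handler is a closure over this $bad
    local $SIG{__WARN__} = sub { $bad++ };

    helper();                       # handler runs, $bad becomes 1
    warn "warning from here\n";     # handler runs, $bad becomes 2
    return $bad;
}

print count_warnings(), "\n";       # prints 2
```

helper() has no idea that $bad exists, but it doesn't need to.  The
handler carries its own $bad around with it, no matter where the warning
comes from.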

That's all for now.  I hope this has been helpful -- it might be a bit too
much for beginners who don't know much about scoping and references[3].


[1] perldoc -f my  and  perldoc -q lexical

[2] "Coping with Scoping", by Mark-Jason Dominus, Winter 1998:
    http://perl.plover.com/FAQs/Namespaces.html

[3] perldoc perlreftut  and  perldoc perlref  and "Using References", by
    Jeff Pinyan, January 2000: http://www.pobox.com/~japhy/docs/using_refs

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
Are you a Monk?  http://www.perlmonks.com/     http://forums.perlguru.com/
Perl Programmer at RiskMetrics Group, Inc.     http://www.riskmetrics.com/
Acacia Fraternity, Rensselaer Chapter.         Brother #734
**        I no longer need a publisher for my Perl Regex book :)        **