PATCH perl_reference.pod "Remedies for Inner Subroutines"

Brian McCauley Fri, 31 Oct 2003 13:01:36 -0800

Stas Bekman <[EMAIL PROTECTED]> writes:

> Brian McCauley wrote:
> 
> > I think porting.pod is done.
> 
> Indeed.
> 
> > Now I have to attack perl_reference.pod,
> > and I assume from what you said before you don't want to release the
> > one without the other.
> 
> Yes. Let's commit them together.


Here's a _very_ rough first cut at perl_reference.pod.  I haven't even
proof-read it yet, so it probably has spelling and grammar errors,
but I just want to be sure I'm going in the right direction.

--- perl_reference.pod.orig     Thu Aug 14 18:11:11 2003
+++ perl_reference.pod  Fri Oct 31 19:46:56 2003
@@ -863,16 +863,17 @@
 problem, Perl will always alert you.
 
 Given that you have a script that has this problem, what are the ways
-to solve it? There are many of them and we will discuss some of them
-here.
+to solve it? Many have been suggested in the past, and we
+will discuss some of them here.
 
 We will use the following code to show the different solutions.
 
   multirun.pl
   -----------
-  #!/usr/bin/perl -w
+  #!/usr/bin/perl
   
   use strict;
+  use warnings;
   
   for (1..3){
     print "run: [time $_]\n";
@@ -925,20 +926,26 @@
   Counter is equal to 5 !
   Counter is equal to 6 !
 
-Obviously, the C<$counter> variable is not reinitialized on each
-execution of run(). It retains its value from the previous execution,
-and sub increment_counter() increments that.
-
-One of the workarounds is to use globally declared variables, with the
-C<vars> pragma.
+Apparently, the C<$counter> variable is not reinitialized on each
+execution of run(); it retains its value from the previous execution,
+and increment_counter() increments that.  Actually, that's not quite
+what happens.  On each execution of run() a new C<$counter> variable
+is initialized to zero, but increment_counter() remains bound to the
+C<$counter> variable from the first call to run().
+
+The simplest of the workarounds is to use package-scoped variables,
+declared using C<our> or, on older versions of Perl, the C<vars>
+pragma.  Note that whereas a C<my> declaration implicitly initializes
+variables to undefined, an C<our> declaration does not, and so you
+may need to add explicit initialization.
 
   multirun1.pl
-  -----------
-  #!/usr/bin/perl -w
+  ------------
+  #!/usr/bin/perl
   
   use strict;
-  use vars qw($counter);
-  
+  use warnings;  
+
   for (1..3){
     print "run: [time $_]\n";
     run();
@@ -946,7 +953,7 @@
   
   sub run {
   
-    $counter = 0;
+    our $counter = 0;
   
     increment_counter();
     increment_counter();
@@ -977,11 +984,34 @@
 problem, since there is no C<my()> (lexically defined) variable used
 in the nested subroutine.
 
-Another approach is to use fully qualified variables. This is better,
-since less memory will be used, but it adds a typing overhead:
+In the above example we know C<$counter> is just a simple small
+scalar.  In the general case variables could reference external
+resource handles or large data structures.  In that situation the fact
+that the variable would not be released immediately when run()
+completes could be a problem.  To avoid this you should put C<local>
+in front of all your C<our> declarations for all variables other than
+simple scalars.  This has the effect of restoring the variable to its
+previous value (usually undefined) upon exit from the current scope.
+As a side-effect C<local> also initializes the variables to C<undef>.
+So the earlier advice about remembering to add explicit
+initialization when you replace C<my> by C<our> does not apply if you
+replace C<my> with C<local our>.
+
+Be warned that C<local> will not release circular data structures and
+if the original CGI script relied on process termination to clean up
+after it then it will leak memory as a registry script.
+
+A variant of the package variable approach is to use explicitly
+package-qualified variables.  This has the advantage on old versions of
+Perl that there is no need to load the C<vars> module, but it adds a
+significant typing overhead.  This approach is not suitable for
+registry scripts because they would all be stomping on the C<main::>
+namespace rather than each staying within the namespace allocated to
+them.  And, besides, the overhead of loading the C<vars> module would
+only have to be paid once per Perl interpreter.
 
   multirun2.pl
-  -----------
+  ------------
   #!/usr/bin/perl -w
   
   use strict;
@@ -993,14 +1023,14 @@
   
   sub run {
   
-    $main::counter = 0;
+    $::counter = 0;
   
     increment_counter();
     increment_counter();
   
     sub increment_counter{
-      $main::counter++;
-      print "Counter is equal to $main::counter !\n";
+      $::counter++;
+      print "Counter is equal to $::counter !\n";
     }
   
   } # end of sub run
@@ -1019,10 +1049,11 @@
 and then submit it for your script to process.
 
   multirun3.pl
-  -----------
-  #!/usr/bin/perl -w
+  ------------
+  #!/usr/bin/perl
   
   use strict;
+  use warnings;
   
   for (1..3){
     print "run: [time $_]\n";
@@ -1056,10 +1087,11 @@
 variables in a calling function.
 
   multirun4.pl
-  -----------
-  #!/usr/bin/perl -w
+  ------------
+  #!/usr/bin/perl
   
   use strict;
+  use warnings;
   
   for (1..3){
     print "run: [time $_]\n";
@@ -1092,10 +1124,11 @@
 a literal, e.g. I<increment_counter(5)>).
 
   multirun5.pl
-  -----------
-  #!/usr/bin/perl -w
+  ------------
+  #!/usr/bin/perl
   
   use strict;
+  use warnings;
   
   for (1..3){
     print "run: [time $_]\n";
@@ -1120,14 +1153,27 @@
 
 Here is a solution that avoids the problem entirely by splitting the
 code into two files; the first is really just a wrapper and loader,
-the second file contains the heart of the code.
+the second file contains the heart of the code.  This second file must
+go into a directory in your C<@INC>.  Some people like to put the
+library in the same directory as the script, but this assumes that the
+current working directory will be the directory where the script is
+located and also that C<@INC> will contain C<'.'>, neither of which
+is an assumption you should expect to hold in all cases.
+
+Note that the name chosen for the library must be unique throughout the
+entire server and indeed every server on which you may ever install
+the script.  This solution is probably more trouble than it is worth -
+it is only included here because it was mentioned in previous versions
+of this guide.
 
   multirun6.pl
-  -----------
-  #!/usr/bin/perl -w
+  ------------
+  #!/usr/bin/perl
   
   use strict;
-  require 'multirun6-lib.pl' ;
+  use warnings;
+
+  require 'multirun6-lib.pl';
   
   for (1..3){
     print "run: [time $_]\n";
@@ -1138,7 +1184,8 @@
 
   multirun6-lib.pl
   ----------------
-  use strict ;
+  use strict;
+  use warnings;
   
   my $counter;
 
@@ -1156,7 +1203,53 @@
   
   1 ;
 
-Now you have at least six workarounds to choose from.
+An alternative version of the above, one that mitigates some of the
+disadvantages, is to use a Perl5-style Exporter module rather than a
+Perl4-style library.  The requirement of global uniqueness of the
+module name still applies but at least this is a problem we're already
+familiar with.
+
+  multirun7.pl
+  ------------
+  #!/usr/bin/perl
+  
+  use strict;
+  use warnings;
+  use My::Multirun7;
+  
+  for (1..3){
+    print "run: [time $_]\n";
+    run();
+  }
+
+Separate file:
+
+  My/Multirun7.pm
+  ---------------
+  package My::Multirun7;
+  use strict;
+  use warnings;
+  use base qw( Exporter );
+  our @EXPORT = qw( run );
+
+  my $counter;
+
+  sub run {
+    $counter = 0;
+
+    increment_counter();
+    increment_counter();
+  }
+  
+  sub increment_counter{
+    $counter++;
+    print "Counter is equal to $counter !\n";
+  }
+  
+  1 ;
+
+Now you have at least five workarounds to choose from (not counting
+numbers 2 and 6).
 
 For more information please refer to perlref and perlsub manpages.
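
The two behaviours the patched text describes can be checked with a short standalone script. This is only a sketch, not part of the patch: the names @seen_by_run, run_local(), add_one() and $total are made up for illustration, and it assumes a Perl recent enough to have C<our> and the C<warnings> pragma (5.6 or later). Part 1 records what run() itself sees in its own C<$counter> after the two calls, to show that increment_counter() is working on a different variable from the second call onwards. Part 2 shows C<local our> resetting a package variable on each call and restoring it on exit.

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'closure';    # silence "Variable "$counter" will not stay shared"

# Part 1: a named inner sub stays bound to the first $counter.
my @seen_by_run;          # hypothetical helper, not in the patch

sub run {
    my $counter = 0;

    increment_counter();
    increment_counter();

    push @seen_by_run, $counter;    # the value run() itself sees

    sub increment_counter {
        $counter++;
        print "Counter is equal to $counter !\n";
    }
}

run() for 1 .. 3;
print "run() saw: @seen_by_run\n";

# Part 2: "local our" gives a package variable that is reset on every
# call and restored to its previous value when the call returns.
sub run_local {                     # hypothetical name
    local our $total = 0;

    add_one();
    add_one();

    return $total;

    sub add_one {                   # hypothetical name
        $total++;
    }
}

my @returned = map { run_local() } 1 .. 3;
my $after = defined $main::total ? 'still set' : 'undef';
print "run_local() returned: @returned; afterwards \$main::total is $after\n";
```

Only the first call to run() shares its C<$counter> with increment_counter(); on later calls run() sees its own fresh variable while the inner sub keeps counting on the original one. In Part 2 every call starts from zero and nothing lingers in the package variable once the call has returned.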
