Thank you, AW, for a well-written summary of the aspect of
mod_perl that causes the most difficult/nasty bugs for people.
But you left out one important caveat, and leaving it out could
scare away more potential users than it saves.
The retention of values from previous executions applies
only to global variables. Specifically, with my emphasis added
to your wording:
- when the script terminates, whatever is in the *global*
  variables stays there.
- it remembers your compiled script, and the state of its *global*
  variables from the last execution.
- The big danger is that your *global* variables start with the
  state in which the previous run of the same script in the same
  Apache child left them.
Programming pundits have been discouraging the use of global
variables for years now, perhaps a little more strongly than
is good for the state of the art. However anyone feels
about that, it's useful to write out some guidelines:
* Use global variables for information that you specifically
want to save across executions. The best case is for items
that are defined during the first execution, then used in
later executions.
* If you want to use a global variable for some other reason,
be very careful and aware that it may start an execution
with a value from a previous execution.
* Put most variables inside of subs (including handlers).
These will be initialized for each execution just as in
other perl contexts. Most people have come to accept a bit
more parameter-passing to subs than would be necessary with
global variables, in exchange for this initialization and for
avoiding inadvertent bugs when one forgets a global variable name.
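
A rough sketch of the first and third guidelines together, runnable as a plain perl script. The names load_config, handle_request and %config_cache are invented for illustration, and handle_request is an ordinary sub standing in for a real handler:

```perl
use strict;
use warnings;

# File-scoped variables: under a persistent interpreter they keep their
# values between executions, so they suit data computed once and reused.
my %config_cache;        # hypothetical cache, filled on first use
my $load_count = 0;      # counts how often the expensive load really ran

sub load_config {
    # Stand-in for something expensive (reading a file, querying a DB).
    $load_count++;
    return (greeting => 'hello');
}

sub handle_request {
    my ($name) = @_;

    # Intentional persistence: initialise only on the first execution.
    %config_cache = load_config() unless %config_cache;

    # Per-request state lives in lexicals inside the sub, so it starts
    # fresh on every call.
    my @parts = ($config_cache{greeting}, $name);
    return join ' ', @parts;
}

print handle_request('alice'), "\n";
print handle_request('bob'), "\n";
print "config loaded $load_count time(s)\n";
```

The second call reuses the cached configuration, while @parts is rebuilt from scratch each time.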
A technique that I've used to avoid problems (particularly
when converting old CGI scripts) is, given a list of global
variables at the start of the module:
    my ($var1, $var2, $other_var);
    my $inited_var = 123;
write directly thereafter:
    sub init_vars {
        # reset every global to the state it has at first load
        undef $var1; undef $var2; undef $other_var;
        $inited_var = 123;
    }
and call init_vars just before each exit from each handler.
Even easier, though to me less satisfying, is to call init_vars
right after entry to each handler.
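
Here is a minimal sketch of the reset-on-entry variant, runnable outside Apache. The handler below is a plain sub with an invented signature, not a real mod_perl handler, and what it does with the variables is made up just to show the reset working:

```perl
use strict;
use warnings;

my ($var1, $var2, $other_var);
my $inited_var = 123;

sub init_vars {
    undef $var1; undef $var2; undef $other_var;
    $inited_var = 123;
}

# Hypothetical handler: takes an optional input value and returns a
# string describing the state it ended up with.
sub handler {
    my ($input) = @_;
    init_vars();                        # reset right after entry

    $var1 = $input if defined $input;   # set only on some requests
    $inited_var++;

    return (defined $var1 ? $var1 : 'undef') . ":$inited_var";
}

print handler('first'), "\n";
print handler(undef), "\n";
```

Without the init_vars call, the second invocation would still see 'first' in $var1 and would start counting from 124 instead of 123.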
Regards,
cmac
On Apr 26, 2010, at 4:23 AM, André Warnier wrote:
Chris Bennett wrote:
...
I have not regretted it. I have learned many details that I could
have overlooked with regular perl. Mod_perl is more unforgiving of
not knowing exactly what my variables are doing and what values
they hold
Perl and mod_perl are very deterministic, and there is no mystery
in what they do with variables. The trick is to understand exactly
how mod_perl works, and how this plays along with the way Apache
(in its different MPM variations) works.
This particular issue has taught me two things.
I am never going to use no warnings 'uninitialized' again. It is
too dangerous to be overlooking possible problems.
I agree.
Maybe even do as Michael said, and make all warnings fatal, if we
are talking about user/web oriented applications.
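
As an illustration of what fatal warnings look like in practice, here is a small sketch. The greeting sub is invented for the example; `use warnings FATAL => 'all'` is the standard pragma form, and a narrower `FATAL => 'uninitialized'` is also possible if promoting every warning feels too drastic:

```perl
use strict;
use warnings FATAL => 'all';

# Hypothetical helper: interpolating an undefined $name would normally
# only warn; with fatal warnings it dies instead.
sub greeting {
    my ($name) = @_;
    return "Hello, $name";
}

my $ok = eval { greeting('world') };
print "$ok\n";

my $bad = eval { greeting(undef) };
my $err = $@;                    # keep the error text for inspection
print "caught: $err" unless defined $bad;
```

The eval here only serves to show the error; in a web application the die would surface in the error log instead of silently producing a half-empty page.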
It has also taught me that perl itself may leave values in
variables such as $1, even after a server stop and start and first
running of a program.
That, however, is definitely not the case.
If you stop and start the server, you have a totally new
environment, and there is nothing left from the previous one.
If you are using a "prefork" version of the Apache server, then
that is also true each time an Apache child ends and a new one is
started : it gets a "new perl" and a new set of variables.
(but the thing is, you mostly cannot predict when Apache will start
a new child, nor which child will handle which request).
For other Apache configurations, the situation may be a bit more
complicated.
The main aspect to understand with mod_perl (as opposed to running
a perl program without it) :
- when you run a perl script without mod_perl, the sequence is :
  - a new perl interpreter is started, clean
  - your script gets compiled, and gets a brand-new set of variables
  - your script gets run, starting with this new set of variables
  - your script "exits"
  - perl exits, and returns all its memory to the OS
  .. and the next time you run your script, the same steps happen.
- under mod_perl :
  - an Apache child starts, and it gets a new perl
  - your script gets compiled, the first time around. That time, it
    gets a brand-new set of variables.
  - your script gets run, the first time, with these brand-new
    variables.
  - when the script terminates, whatever is in these variables
    stays there.
  - the perl interpreter inside that same Apache child stays alive,
    and it "remembers" the compiled code of your script.
Now here is the difference : when a new request comes into Apache, and
(as may happen) it is sent to the *same* Apache child, it is
processed by the same perl interpreter. And that one is not
"clean" : it remembers your compiled script, and the state of its
variables from the last execution. And that is where it starts from.
The big gain is that perl does not have to compile the script
again, it can run the compiled code right away.
The big danger is that your variables start with the state in which
the previous run of the same script in the same Apache child left
them.
That can sometimes be put to good use, but it is also deadly if you
are not very careful.
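
To make the danger concrete, here is a minimal sketch that imitates a persistent interpreter by calling the same sub twice within one process. The names run_script and @seen are made up for illustration; under mod_perl the script would be invoked by Apache, not by hand:

```perl
use strict;
use warnings;

my @seen;    # file-scoped: survives between "requests" in one interpreter

# Stand-in for one execution of the script. The bug is that @seen is
# never cleared, so each run starts with the leftovers of the last one.
sub run_script {
    my ($item) = @_;
    push @seen, $item;
    return scalar @seen;    # number of items this "request" believes it has
}

print "first request sees:  ", run_script('a'), " item(s)\n";
print "second request sees: ", run_script('b'), " item(s)\n";
```

Run as a fresh process for each request, both calls would report one item. Within the same interpreter, the second one reports two.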
The above is only a very approximate explanation, and the reality
is somewhat subtler. But if you stick to that basic explanation,
you will avoid much trouble.
The fact that your $1 variable retained a value from an earlier
comparison, however, has nothing to do with the above. That would be
true even if your script was not running under mod_perl.
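
A short sketch of that behaviour in plain perl, with no mod_perl involved. The strings and variable names are invented for the example:

```perl
use strict;
use warnings;

my $first  = 'id=42';
my $second = 'no digits here';

$first =~ /(\d+)/;             # succeeds, sets $1 to '42'
my $after_first = $1;

$second =~ /(\d+)/;            # fails, so $1 is left untouched
my $after_second = $1;         # still '42', a stale value

# Safer: take the capture from the match's return value in list context.
# A failed match returns an empty list, so $safe ends up undef.
my ($safe) = $second =~ /(\d+)/;

print "after first match:    $after_first\n";
print "after failed match:   $after_second\n";
print "list-context capture: ", (defined $safe ? $safe : 'undef'), "\n";
```

The usual precaution is to use $1 only after checking that the match succeeded, or to capture in list context as in the last statement.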
Sounds like an early lesson out of C. Never assume anything is
in fact defined without defining it yourself.
That is a good principle in general (and not only in perl).
A final observation : at the beginning, I think what most perl/web
programmers find the most interesting aspect of mod_perl is that
scripts/modules run much faster (because they do not need to be
recompiled each time).
But I find that the real benefit is more in terms of how closely it
is integrated into the Apache "insides", and the incredible power
it gives you to create "handlers" and "filters" to let you
intervene at just about every stage of the processing of a request,
and use all the power and flexibility of perl (and of the CPAN
library) to do all kinds of stuff you could not even dream of
otherwise.
Reading the on-line mod_perl documentation is also a unique way to
learn how Apache itself works.